Radial basis function network

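In the field of mathematical modeling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters. Radial basis function networks have many uses, including function approximation, time series prediction, classification, and system control. They were first formulated in 1988 by Broomhead and Lowe, both researchers at the Royal Signals and Radar Establishment.[1][2][3]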

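Network architecture

Figure: Architecture of a radial basis function network. An input vector $\mathbf{x}$ is used as input to all radial basis functions, each with different parameters. The output of the network is a linear combination of the outputs from the radial basis functions.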

Radial basis function (RBF) networks typically have three layers: an input layer, a hidden layer with a non-linear RBF activation function, and a linear output layer. The input can be modeled as a vector of real numbers $\mathbf{x} \in \mathbb{R}^{n}$. The output of the network is then a scalar function of the input vector, $\varphi : \mathbb{R}^{n} \to \mathbb{R}$, and is given by

\varphi(\mathbf{x}) = \sum_{i=1}^{N} a_{i}\,\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)

where $N$ is the number of neurons in the hidden layer, $\mathbf{c}_{i}$ is the center vector for neuron $i$, and $a_{i}$ is the weight of neuron $i$ in the linear output neuron. Functions that depend only on the distance from a center vector are radially symmetric about that vector, hence the name radial basis function. In the basic form, all inputs are connected to each hidden neuron. The norm is typically taken to be the Euclidean distance (although the Mahalanobis distance appears to perform better with pattern recognition[4][5]) and the radial basis function is commonly taken to be Gaussian:

\rho {\big (}\left\Vert \mathbf {x} -\mathbf {c} _{i}\right\Vert {\big )}=\exp \left[-\beta _{i}\left\Vert \mathbf {x} -\mathbf {c} _{i}\right\Vert ^{2}\right]

The Gaussian basis functions are local to the center vector in the sense that

\lim_{\left\Vert \mathbf{x}\right\Vert \to \infty}\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big) = 0

that is, changing the parameters of one neuron has only a small effect for input values that are far away from the center of that neuron.

Given certain mild conditions on the shape of the activation function, RBF networks are universal approximators on a compact subset of $\mathbb{R}^{n}$.[6] This means that an RBF network with enough hidden neurons can approximate any continuous function on a closed, bounded set with arbitrary precision.

The parameters $a_{i}$, $\mathbf{c}_{i}$, and $\beta_{i}$ are determined in a manner that optimizes the fit between $\varphi$ and the data.
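
As a minimal sketch of the mapping defined above, the following evaluates a Gaussian RBF network with the Euclidean norm. The function names and the parameter values are illustrative assumptions, not taken from the original text.

```python
import math

def gaussian_rbf(x, c, beta):
    """rho(||x - c||) = exp(-beta * ||x - c||^2) for vectors x and c."""
    sq_dist = sum((xj - cj) ** 2 for xj, cj in zip(x, c))
    return math.exp(-beta * sq_dist)

def rbf_network(x, centers, betas, weights):
    """phi(x) = sum_i a_i * rho(||x - c_i||)."""
    return sum(a * gaussian_rbf(x, c, b)
               for a, c, b in zip(weights, centers, betas))

# Hypothetical parameters: two hidden neurons in a 2-D input space.
centers = [[0.0, 0.0], [1.0, 1.0]]
betas = [1.0, 1.0]
weights = [2.0, -1.0]

print(rbf_network([0.0, 0.0], centers, betas, weights))      # 2 - exp(-2), about 1.8647
print(rbf_network([100.0, 100.0], centers, betas, weights))  # far from both centers: 0.0
```

The second call illustrates the locality property: an input far from every center produces an output that is effectively zero, whatever the weights are.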

Figure: Two unnormalized radial basis functions in one input dimension. The basis function centers are located at $c_{1}=0.75$ and $c_{2}=3.25$.

Normalized

Figure: Two normalized radial basis functions in one input dimension (sigmoids). The basis function centers are located at $c_{1}=0.75$ and $c_{2}=3.25$.

Figure: Three normalized radial basis functions in one input dimension. The additional basis function has center at $c_{3}=2.75$.

Figure: Four normalized radial basis functions in one input dimension. The fourth basis function has center at $c_{4}=0$. Note that the first basis function (dark blue) has become localized.

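Normalized architecture

In addition to the above unnormalized architecture, RBF networks can be normalized. In this case the mapping is

\varphi(\mathbf{x}) \ \stackrel{\mathrm{def}}{=}\ \frac{\sum_{i=1}^{N} a_{i}\,\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)}{\sum_{i=1}^{N}\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)} = \sum_{i=1}^{N} a_{i}\,u\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)

where

u\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big) \ \stackrel{\mathrm{def}}{=}\ \frac{\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)}{\sum_{j=1}^{N}\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{j}\right\Vert\big)}

is known as a normalized radial basis function.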
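
The normalization can be sketched in one input dimension as follows. The centers are the ones quoted in the figure captions; the value of `beta` and the function names are assumptions made for illustration.

```python
import math

def rho(x, c, beta):
    """Gaussian basis function in one dimension."""
    return math.exp(-beta * (x - c) ** 2)

def normalized_basis(x, centers, beta):
    """u_i(x) = rho_i(x) / sum_j rho_j(x); the u_i sum to one."""
    values = [rho(x, c, beta) for c in centers]
    total = sum(values)   # can underflow to 0 for x very far from every center
    return [v / total for v in values]

def normalized_rbf_network(x, centers, beta, weights):
    """phi(x) = sum_i a_i * u_i(x)."""
    return sum(a * u for a, u in zip(weights, normalized_basis(x, centers, beta)))

centers = [0.75, 3.25]   # centers from the figure captions
beta = 1.0               # assumed width parameter
u = normalized_basis(2.0, centers, beta)
print(u, sum(u))         # at the midpoint both basis functions contribute equally
```

Because the normalized basis functions sum to one, a network whose weights are all equal to a constant outputs that constant everywhere, which is one way to see why normalized networks tend to extrapolate more gracefully between and beyond the centers.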

Theoretical motivation for normalization

There is theoretical justification for this architecture in the case of stochastic data flow. Assume a stochastic kernel approximation for the joint probability density

P\left(\mathbf{x}\land y\right) = {1 \over N}\sum_{i=1}^{N}\,\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)\,\sigma\big(\left\vert y-e_{i}\right\vert\big)

where the weights $\mathbf{c}_{i}$ and $e_{i}$ are exemplars from the data and we require the kernels to be normalized

\int \rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)\,d^{n}\mathbf{x} = 1

and

\int \sigma {\big (}\left\vert y-e_{i}\right\vert {\big )}\,dy=1

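The probability densities in the input and output spaces are

P\left(\mathbf{x}\right) = \int P\left(\mathbf{x}\land y\right)\,dy = {1 \over N}\sum_{i=1}^{N}\,\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)

and

P\left(y\right) = \int P\left(\mathbf{x}\land y\right)\,d^{n}\mathbf{x} = {1 \over N}\sum_{i=1}^{N}\,\sigma\big(\left\vert y-e_{i}\right\vert\big)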

The expectation of $y$ given an input $\mathbf{x}$ is

\varphi\left(\mathbf{x}\right) \ \stackrel{\mathrm{def}}{=}\ E\left(y\mid \mathbf{x}\right) = \int y\,P\left(y\mid \mathbf{x}\right)\,dy

where

P\left(y\mid \mathbf{x}\right)

is the conditional probability of $y$ given $\mathbf{x}$. The conditional probability is related to the joint probability through Bayes' theorem

P\left(y\mid \mathbf{x}\right) = \frac{P\left(\mathbf{x}\land y\right)}{P\left(\mathbf{x}\right)}

which yields

\varphi \left(\mathbf {x} \right)=\int y\,{\frac {P\left(\mathbf {x} \land y\right)}{P\left(\mathbf {x} \right)}}\,dy

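This becomes

\varphi\left(\mathbf{x}\right) = \frac{\sum_{i=1}^{N} e_{i}\,\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)}{\sum_{i=1}^{N}\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)} = \sum_{i=1}^{N} e_{i}\,u\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)

when the integrations are performed.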
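
The integration step can be spelled out as follows. This is a sketch that uses only the kernel form of the joint density, the normalization of $\sigma$, and the fact that $\sigma$ depends on $y$ only through $\left\vert y-e_{i}\right\vert$, so it is symmetric about $e_{i}$; it also assumes that the first moment of $\sigma$ exists.

```latex
\begin{aligned}
\int y\,\sigma\big(\left\vert y-e_{i}\right\vert\big)\,dy
  &= \int (s+e_{i})\,\sigma\big(\left\vert s\right\vert\big)\,ds
   = 0 + e_{i}\cdot 1 = e_{i}, \\[4pt]
\varphi(\mathbf{x})
  &= \frac{\int y\,P(\mathbf{x}\land y)\,dy}{P(\mathbf{x})}
   = \frac{\frac{1}{N}\sum_{i=1}^{N}\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)
           \int y\,\sigma\big(\left\vert y-e_{i}\right\vert\big)\,dy}
          {\frac{1}{N}\sum_{i=1}^{N}\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)}
   = \frac{\sum_{i=1}^{N} e_{i}\,\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)}
          {\sum_{i=1}^{N}\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)} .
\end{aligned}
```

The output exemplars $e_{i}$ therefore play the role of the linear weights $a_{i}$ in the normalized architecture.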

Local linear models

It is sometimes convenient to expand the architecture to include local linear models. In that case the architectures become, to first order,

\varphi\left(\mathbf{x}\right) = \sum_{i=1}^{N}\left(a_{i}+\mathbf{b}_{i}\cdot\left(\mathbf{x}-\mathbf{c}_{i}\right)\right)\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)

and

\varphi\left(\mathbf{x}\right) = \sum_{i=1}^{N}\left(a_{i}+\mathbf{b}_{i}\cdot\left(\mathbf{x}-\mathbf{c}_{i}\right)\right)u\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)

in the unnormalized and normalized cases, respectively. Here $\mathbf{b}_{i}$ are weights to be determined. Higher order linear terms are also possible.
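
A minimal sketch of the unnormalized local linear model with a Gaussian basis function and a single shared width follows. The function name, the shared `beta`, and the sample values are assumptions for illustration only.

```python
import math

def local_linear_rbf(x, centers, beta, a, b):
    """phi(x) = sum_i (a_i + b_i . (x - c_i)) * rho(||x - c_i||)."""
    out = 0.0
    for c_i, a_i, b_i in zip(centers, a, b):
        diff = [xj - cj for xj, cj in zip(x, c_i)]
        rho = math.exp(-beta * sum(d * d for d in diff))
        slope_term = sum(bj * dj for bj, dj in zip(b_i, diff))
        out += (a_i + slope_term) * rho
    return out

# Hypothetical single neuron centered at the origin of a 2-D input space.
centers = [[0.0, 0.0]]
a = [1.0]             # constant term of the local model
b = [[2.0, 0.0]]      # slope of the local model
print(local_linear_rbf([0.0, 0.0], centers, 1.0, a, b))  # 1.0: only the constant term at the center
print(local_linear_rbf([1.0, 0.0], centers, 1.0, a, b))  # (1 + 2) * exp(-1)
```

At each center the model reduces to the constant $a_{i}$ times the basis function; the slope term only contributes away from the center.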

This result can be written

\varphi\left(\mathbf{x}\right) = \sum_{i=1}^{2N}\sum_{j=1}^{n} e_{ij}\,v_{ij}\big(\mathbf{x}-\mathbf{c}_{i}\big)

where

e_{ij}={\begin{cases}a_{i},&{\mbox{if }}i\in [1,N]\\b_{ij},&{\mbox{if }}i\in [N+1,2N]\end{cases}}

and

v_{ij}\big(\mathbf{x}-\mathbf{c}_{i}\big) \ \stackrel{\mathrm{def}}{=}\ {\begin{cases}\delta_{ij}\,\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big),&{\mbox{if }}i\in [1,N]\\\left(x_{ij}-c_{ij}\right)\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big),&{\mbox{if }}i\in [N+1,2N]\end{cases}}

in the unnormalized case and

v_{ij}\big(\mathbf{x}-\mathbf{c}_{i}\big) \ \stackrel{\mathrm{def}}{=}\ {\begin{cases}\delta_{ij}\,u\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big),&{\mbox{if }}i\in [1,N]\\\left(x_{ij}-c_{ij}\right)u\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big),&{\mbox{if }}i\in [N+1,2N]\end{cases}}

in the normalized case.

Here $\delta_{ij}$ is a Kronecker delta function defined as

\delta _{ij}={\begin{cases}1,&{\mbox{if }}i=j\\0,&{\mbox{if }}i\neq j\end{cases}}

Training

RBF networks are typically trained from pairs of input and target values $\mathbf{x}(t), y(t)$, $t=1,\dots,T$ by a two-step algorithm.

In the first step, the center vectors $\mathbf{c}_{i}$ of the RBF functions in the hidden layer are chosen. This step can be performed in several ways; centers can be randomly sampled from some set of examples, or they can be determined using k-means clustering. Note that this step is unsupervised.

The second step simply fits a linear model with coefficients $w_{i}$ to the hidden layer's outputs with respect to some objective function. A common objective function, at least for regression and function estimation, is the least squares function:

K(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ \sum_{t=1}^{T} K_{t}(\mathbf{w})

where

K_{t}(\mathbf {w} )\ {\stackrel {\mathrm {def} }{=}}\ {\big [}y(t)-\varphi {\big (}\mathbf {x} (t),\mathbf {w} {\big )}{\big ]}^{2}

We have explicitly included the dependence on the weights. Minimization of the least squares objective function by optimal choice of weights optimizes accuracy of fit.

There are occasions in which multiple objectives, such as smoothness as well as accuracy, must be optimized. In that case it is useful to optimize a regularized objective function such as

H(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ K(\mathbf{w}) + \lambda S(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ \sum_{t=1}^{T} H_{t}(\mathbf{w})

where

S(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ \sum_{t=1}^{T} S_{t}(\mathbf{w})

and

H_{t}(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ K_{t}(\mathbf{w}) + \lambda S_{t}(\mathbf{w})

where optimization of $S$ maximizes smoothness and $\lambda$ is known as a regularization parameter.

A third, optional backpropagation step can be performed to fine-tune all of the RBF network's parameters.[3]

Interpolation

RBF networks can be used to interpolate a function $y:\mathbb{R}^{n}\to \mathbb{R}$ when the values of that function are known on a finite number of points: $y(\mathbf{x}_{i})=b_{i}, i=1,\ldots ,N$. Taking the known points $\mathbf{x}_{i}$ to be the centers of the radial basis functions and evaluating the values of the basis functions at the same points, $g_{ij}=\rho (\Vert \mathbf{x}_{j}-\mathbf{x}_{i}\Vert)$, the weights can be solved from the equation

\left[{\begin{matrix}g_{11}&g_{12}&\cdots &g_{1N}\\g_{21}&g_{22}&\cdots &g_{2N}\\\vdots &&\ddots &\vdots \\g_{N1}&g_{N2}&\cdots &g_{NN}\end{matrix}}\right]\left[{\begin{matrix}w_{1}\\w_{2}\\\vdots \\w_{N}\end{matrix}}\right]=\left[{\begin{matrix}b_{1}\\b_{2}\\\vdots \\b_{N}\end{matrix}}\right]

It can be shown that the interpolation matrix in the above equation is non-singular if the points $\mathbf{x}_{i}$ are distinct, and thus the weights $\mathbf{w}$ can be solved by simple linear algebra:

\mathbf {w} =\mathbf {G} ^{-1}\mathbf {b}

where $G=(g_{ij})$.
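
The interpolation procedure can be sketched with a small linear solver. The data (four samples of a sine), the Gaussian width, and the helper names are assumptions chosen only to make the example self-contained.

```python
import math

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(M[r][k] * w[k] for k in range(r + 1, n))
        w[r] = (M[r][n] - s) / M[r][r]
    return w

def rho(r, beta=1.0):
    """Gaussian radial basis function of the distance r (beta is assumed)."""
    return math.exp(-beta * r * r)

# Hypothetical 1-D data: known values b_i of y = sin(x) at four distinct points.
xs = [0.0, 1.0, 2.0, 3.0]
bs = [math.sin(x) for x in xs]
G = [[rho(abs(xj - xi)) for xi in xs] for xj in xs]  # g_ij = rho(|x_j - x_i|)
w = solve(G, bs)

def interpolant(x):
    """phi(x) = sum_i w_i * rho(|x - x_i|), with the data points as centers."""
    return sum(wi * rho(abs(x - xi)) for wi, xi in zip(w, xs))

print([round(interpolant(x) - b, 12) for x, b in zip(xs, bs)])  # residuals at the data points
```

The interpolant reproduces the known values at the data points up to rounding error; between the points its quality depends on the choice of basis function and width.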

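Function approximation

If the purpose is not to perform strict interpolation but instead more general function approximation or classification, the optimization is somewhat more complex because there is no obvious choice for the centers. The training is typically done in two phases, first fixing the widths and centers and then the weights. This can be justified by considering the different nature of the non-linear hidden neurons versus the linear output neuron.

Training the basis function centers

Basis function centers can be randomly sampled among the input instances, obtained by the orthogonal least squares learning algorithm, or found by clustering the samples and choosing the cluster means as the centers.

The RBF widths are usually all fixed to the same value, which is proportional to the maximum distance between the chosen centers.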
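
The random-sampling option and the shared width can be sketched as below. The text only says the width is proportional to the maximum distance between centers; the specific constant used here, $\sigma = d_{\max}/\sqrt{2m}$ for $m$ centers, is one common heuristic and is an assumption, as are the function names.

```python
import math
import random

def choose_centers(samples, m, seed=0):
    """Draw m distinct training inputs at random to serve as centers."""
    return random.Random(seed).sample(samples, m)

def shared_beta(centers):
    """Return beta = 1 / (2 sigma^2) with sigma = d_max / sqrt(2 m).

    The proportionality constant is an assumed heuristic; the centers
    must not all coincide, otherwise d_max is zero.
    """
    m = len(centers)
    d_max = max(math.dist(p, q) for p in centers for q in centers)
    sigma = d_max / math.sqrt(2 * m)
    return 1.0 / (2.0 * sigma ** 2)

# Hypothetical 1-D training inputs, stored as 1-tuples.
samples = [(i / 10,) for i in range(11)]
centers = choose_centers(samples, m=3)
print(centers, shared_beta(centers))
```

With this convention the Gaussian becomes $\exp(-\beta\Vert \mathbf{x}-\mathbf{c}_{i}\Vert^{2})$ with $\beta = m/d_{\max}^{2}$, so adding more centers over the same region automatically narrows each basis function.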

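Pseudoinverse solution for the linear weights

After the centers $c_{i}$ have been fixed, the weights that minimize the error at the output can be computed with a linear pseudoinverse solution: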

\mathbf {w} =\mathbf {G} ^{+}\mathbf {b},

where the entries of $G$ are the values of the radial basis functions evaluated at the points $x_{j}$: $g_{ji}=\rho (\Vert x_{j}-c_{i}\Vert)$.

The existence of this linear solution means that, unlike multi-layer perceptron (MLP) networks, RBF networks have an explicit minimizer (when the centers are fixed).
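
For concreteness, when $\mathbf{G}$ has one row per training point and one column per center, and its columns are linearly independent (an assumption that is not stated in the text), the Moore-Penrose pseudoinverse takes the familiar normal-equations form:

```latex
\mathbf{G}^{+} = \left(\mathbf{G}^{\mathsf{T}}\mathbf{G}\right)^{-1}\mathbf{G}^{\mathsf{T}},
\qquad
\mathbf{w} = \mathbf{G}^{+}\mathbf{b}
           = \arg\min_{\mathbf{w}}\ \left\Vert \mathbf{G}\mathbf{w}-\mathbf{b}\right\Vert^{2}.
```

When the number of centers equals the number of training points and $\mathbf{G}$ is invertible, $\mathbf{G}^{+}=\mathbf{G}^{-1}$ and the strict interpolation solution is recovered.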

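Gradient descent training of the linear weights

Another possible training algorithm is gradient descent. In gradient descent training, the weights are adjusted at each time step by moving them in a direction opposite to the gradient of the objective function (thus allowing the minimum of the objective function to be found),

\mathbf{w}(t+1) = \mathbf{w}(t) - \nu\,\frac{d}{d\mathbf{w}} H_{t}(\mathbf{w})

where $\nu$ is a "learning parameter."

For the case of training the linear weights, $a_{i}$, the algorithm becomes

a_{i}(t+1) = a_{i}(t) + \nu\,\big[y(t)-\varphi\big(\mathbf{x}(t),\mathbf{w}\big)\big]\,\rho\big(\left\Vert \mathbf{x}(t)-\mathbf{c}_{i}\right\Vert\big)

in the unnormalized case and

a_{i}(t+1) = a_{i}(t) + \nu\,\big[y(t)-\varphi\big(\mathbf{x}(t),\mathbf{w}\big)\big]\,u\big(\left\Vert \mathbf{x}(t)-\mathbf{c}_{i}\right\Vert\big)

in the normalized case.

For local-linear architectures, gradient descent training is

e_{ij}(t+1) = e_{ij}(t) + \nu\,\big[y(t)-\varphi\big(\mathbf{x}(t),\mathbf{w}\big)\big]\,v_{ij}\big(\mathbf{x}(t)-\mathbf{c}_{i}\big)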
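
The unnormalized update for the linear weights can be sketched as an online loop. The toy target (itself an RBF expansion, so that it is exactly representable), the centers, the width, and the learning parameter are all assumptions made for illustration.

```python
import math

def basis(x, centers, beta):
    """Values rho_i(x) of the Gaussian basis functions in one dimension."""
    return [math.exp(-beta * (x - c) ** 2) for c in centers]

def gd_step(a, x, y, centers, beta, nu):
    """One update a_i <- a_i + nu * [y - phi(x)] * rho_i(x)."""
    r = basis(x, centers, beta)
    err = y - sum(ai * ri for ai, ri in zip(a, r))
    return [ai + nu * err * ri for ai, ri in zip(a, r)]

# Hypothetical toy problem: the target is an RBF expansion with known weights.
centers, beta, nu = [0.0, 0.5, 1.0], 4.0, 0.2
true_a = [1.0, -2.0, 0.5]

def target(x):
    return sum(t * r for t, r in zip(true_a, basis(x, centers, beta)))

a = [0.0, 0.0, 0.0]                       # assumed initial weights
xs = [i / 20 for i in range(21)]          # training inputs on [0, 1]
for epoch in range(200):
    for x in xs:
        a = gd_step(a, x, target(x), centers, beta, nu)
print(a)   # moves toward true_a as training proceeds
```

The factor of two from differentiating the squared error is absorbed into $\nu$, as in the update equations above. The step stays stable here because $\nu$ times the squared norm of the basis vector is well below two for every input.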

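Projection operator training of the linear weights

For the case of training the linear weights, $a_{i}$ and $e_{ij}$, the algorithm becomes

a_{i}(t+1) = a_{i}(t) + \nu\,\big[y(t)-\varphi\big(\mathbf{x}(t),\mathbf{w}\big)\big]\,\frac{\rho\big(\left\Vert \mathbf{x}(t)-\mathbf{c}_{i}\right\Vert\big)}{\sum_{j=1}^{N}\rho^{2}\big(\left\Vert \mathbf{x}(t)-\mathbf{c}_{j}\right\Vert\big)}

in the unnormalized case,

a_{i}(t+1) = a_{i}(t) + \nu\,\big[y(t)-\varphi\big(\mathbf{x}(t),\mathbf{w}\big)\big]\,\frac{u\big(\left\Vert \mathbf{x}(t)-\mathbf{c}_{i}\right\Vert\big)}{\sum_{j=1}^{N}u^{2}\big(\left\Vert \mathbf{x}(t)-\mathbf{c}_{j}\right\Vert\big)}

in the normalized case, and

e_{ij}(t+1) = e_{ij}(t) + \nu\,\big[y(t)-\varphi\big(\mathbf{x}(t),\mathbf{w}\big)\big]\,\frac{v_{ij}\big(\mathbf{x}(t)-\mathbf{c}_{i}\big)}{\sum_{k=1}^{2N}\sum_{l=1}^{n}v_{kl}^{2}\big(\mathbf{x}(t)-\mathbf{c}_{k}\big)}

in the local-linear case.

For one basis function, projection operator training reduces to Newton's method.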
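
A minimal sketch of the unnormalized projection operator update follows; the centers, width, and sample values are hypothetical. It also illustrates a property that follows directly from the update rule: with $\nu = 1$ the network reproduces the current training sample exactly after a single step, and with $0 < \nu < 1$ the error on that sample shrinks by the factor $1-\nu$.

```python
import math

def basis(x, centers, beta):
    """Values rho_i(x) of the Gaussian basis functions in one dimension."""
    return [math.exp(-beta * (x - c) ** 2) for c in centers]

def predict(a, x, centers, beta):
    """phi(x) = sum_i a_i * rho_i(x)."""
    return sum(ai * ri for ai, ri in zip(a, basis(x, centers, beta)))

def projection_step(a, x, y, centers, beta, nu):
    """a_i <- a_i + nu * [y - phi(x)] * rho_i(x) / sum_j rho_j(x)^2."""
    r = basis(x, centers, beta)
    err = y - sum(ai * ri for ai, ri in zip(a, r))
    norm = sum(ri * ri for ri in r)
    return [ai + nu * err * ri / norm for ai, ri in zip(a, r)]

centers, beta = [0.2, 0.8], 5.0           # hypothetical values
a = projection_step([0.0, 0.0], x=0.5, y=1.0, centers=centers, beta=beta, nu=1.0)
print(predict(a, 0.5, centers, beta))     # approximately 1.0: the sample is fit in one step
```

With a single basis function the update with $\nu = 1$ solves $a\,\rho = y$ directly, which is the sense in which the method reduces to a Newton step for that one-parameter problem.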

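Figure 6: Logistic map time series. Repeated iteration of the logistic map generates a chaotic time series. The values lie between zero and one. Displayed here are the 100 training points used to train the examples in this section. The weights c are the first five points from this time series.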

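Examples

Logistic map

The basic properties of radial basis functions can be illustrated with a simple mathematical map, the logistic map, which maps the unit interval onto itself. It can be used to generate a convenient prototype data stream. The logistic map can be used to explore function approximation, time series prediction, and control theory. The map originated in the field of population dynamics and became the prototype for chaotic time series. The map, in the fully chaotic regime, is given by

x(t+1) \ \stackrel{\mathrm{def}}{=}\ f\left[x(t)\right] = 4x(t)\left[1-x(t)\right]

where $t$ is a time index. The value of $x$ at time $t+1$ is a parabolic function of $x$ at time $t$. This equation represents the underlying geometry of the chaotic time series generated by the logistic map.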
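
Generating such a time series takes only a few lines. The starting value `x0 = 0.3` below is an arbitrary assumption; the text does not state which initial condition was used.

```python
def logistic_map(x):
    """f(x) = 4 x (1 - x), the logistic map in the fully chaotic regime."""
    return 4.0 * x * (1.0 - x)

def logistic_series(x0, length):
    """Return [x(0), x(1), ..., x(length - 1)] obtained by iterating the map."""
    series = [x0]
    for _ in range(length - 1):
        series.append(logistic_map(series[-1]))
    return series

series = logistic_series(x0=0.3, length=100)   # x0 is an assumed starting value
print(series[:3])                              # x(1) = 4 * 0.3 * 0.7 = 0.84 up to rounding
```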

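Generation of the time series from this equation is the forward problem. The examples here illustrate the inverse problem: identification of the underlying dynamics, or fundamental equation, of the logistic map from exemplars of the time series. The goal is to find an estimate

x(t+1) = f\left[x(t)\right] \approx \varphi(t) = \varphi\left[x(t)\right]

for $f$.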

Function approximation

Unnormalized radial basis functions

Figure 7: Unnormalized basis functions. The logistic map (blue) and the approximation to the logistic map (red) after one pass through the training set.

The architecture is

\varphi(\mathbf{x}) \ \stackrel{\mathrm{def}}{=}\ \sum_{i=1}^{N} a_{i}\,\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)

where

\rho {\big (}\left\Vert \mathbf {x} -\mathbf {c} _{i}\right\Vert {\big )}=\exp \left[-\beta _{i}\left\Vert \mathbf {x} -\mathbf {c} _{i}\right\Vert ^{2}\right]=\exp \left[-\beta _{i}\left(x(t)-c_{i}\right)^{2}\right]

Since the input is a scalar rather than a vector, the input dimension is one. We choose the number of basis functions as $N=5$ and the size of the training set to be 100 exemplars generated by the chaotic time series. The weight $\beta$ is taken to be a constant equal to 5. The weights $c_{i}$ are five exemplars from the time series. The weights $a_{i}$ are trained with projection operator training:

a_{i}(t+1) = a_{i}(t) + \nu\,\big[x(t+1)-\varphi\big(\mathbf{x}(t),\mathbf{w}\big)\big]\,\frac{\rho\big(\left\Vert \mathbf{x}(t)-\mathbf{c}_{i}\right\Vert\big)}{\sum_{j=1}^{N}\rho^{2}\big(\left\Vert \mathbf{x}(t)-\mathbf{c}_{j}\right\Vert\big)}

where the learning rate $\nu$ is taken to be 0.3. The training is performed with one pass through the 100 training points. The rms error is 0.15.

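Figure 8: Normalized basis functions. The logistic map (blue) and the approximation to the logistic map (red) after one pass through the training set. Note the improvement over the unnormalized case.

Normalized radial basis functions

The normalized RBF architecture is

\varphi(\mathbf{x}) \ \stackrel{\mathrm{def}}{=}\ \frac{\sum_{i=1}^{N} a_{i}\,\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)}{\sum_{i=1}^{N}\rho\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)} = \sum_{i=1}^{N} a_{i}\,u\big(\left\Vert \mathbf{x}-\mathbf{c}_{i}\right\Vert\big)

where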

u{\big (}\left\Vert \mathbf {x} -\mathbf {c} _{i}\right\Vert {\big )}\ {\stackrel {\mathrm {def} }{=}}\ {\frac {\rho {\big (}\left\Vert \mathbf {x} -\mathbf {c} _{i}\right\Vert {\big )}}{\sum _{j=1}^{N}\rho {\big (}\left\Vert \mathbf {x} -\mathbf {c} _{j}\right\Vert {\big )}}}

Again:

\rho {\big (}\left\Vert \mathbf {x} -\mathbf {c} _{i}\right\Vert {\big )}=\exp \left[-\beta \left\Vert \mathbf {x} -\mathbf {c} _{i}\right\Vert ^{2}\right]=\exp \left[-\beta \left(x(t)-c_{i}\right)^{2}\right]

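Again, we choose the number of basis functions as five and the size of the training set to be 100 exemplars generated by the chaotic time series. The weight $\beta$ is taken to be a constant equal to 6. The weights $c_{i}$ are five exemplars from the time series. The weights $a_{i}$ are trained with projection operator training:

a_{i}(t+1) = a_{i}(t) + \nu\,\big[x(t+1)-\varphi\big(\mathbf{x}(t),\mathbf{w}\big)\big]\,\frac{u\big(\left\Vert \mathbf{x}(t)-\mathbf{c}_{i}\right\Vert\big)}{\sum_{j=1}^{N}u^{2}\big(\left\Vert \mathbf{x}(t)-\mathbf{c}_{j}\right\Vert\big)}

where the learning rate $\nu$ is again taken to be 0.3. The training is performed with one pass through the 100 training points. The rms error on a test set of 100 exemplars is 0.084, smaller than the unnormalized error. Normalization yields an improvement in accuracy. Typically, the accuracy advantage of normalized basis functions over unnormalized functions grows further as the input dimensionality increases.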
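
The two experiments described in this section can be sketched as follows, using five basis functions, one pass over 100 training pairs, $\nu = 0.3$, and $\beta = 5$ (unnormalized) or $\beta = 6$ (normalized). The initial condition, the zero initial weights, and the way the test set is split off are assumptions, so the printed errors need not match the figures quoted in the text.

```python
import math

def logistic_series(x0, length):
    xs = [x0]
    for _ in range(length - 1):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs

def features(x, centers, beta, normalized):
    """rho_i(x), or u_i(x) = rho_i(x) / sum_j rho_j(x) when normalized."""
    r = [math.exp(-beta * (x - c) ** 2) for c in centers]
    if normalized:
        total = sum(r)
        r = [ri / total for ri in r]
    return r

def train_one_pass(series, centers, beta, nu, normalized):
    """Projection operator training with input x(t) and target x(t+1)."""
    a = [0.0] * len(centers)                      # assumed initial weights
    for x_t, x_next in zip(series, series[1:]):
        v = features(x_t, centers, beta, normalized)
        err = x_next - sum(ai * vi for ai, vi in zip(a, v))
        norm = sum(vi * vi for vi in v)
        a = [ai + nu * err * vi / norm for ai, vi in zip(a, v)]
    return a

def rms_error(a, series, centers, beta, normalized):
    """Root-mean-square one-step prediction error over a series."""
    sq = []
    for x_t, x_next in zip(series, series[1:]):
        v = features(x_t, centers, beta, normalized)
        sq.append((x_next - sum(ai * vi for ai, vi in zip(a, v))) ** 2)
    return math.sqrt(sum(sq) / len(sq))

data = logistic_series(x0=0.3, length=202)        # x0 is an assumption
train, test = data[:101], data[101:]              # 100 training pairs, 100 test pairs
centers = train[:5]                               # first five points of the series

for normalized, beta in ((False, 5.0), (True, 6.0)):
    a = train_one_pass(train, centers, beta, nu=0.3, normalized=normalized)
    label = "normalized" if normalized else "unnormalized"
    print(label, rms_error(a, test, centers, beta, normalized))
```

Sharing one `features` function between both architectures keeps the training loop identical, so any difference in the printed errors comes from the normalization and the slightly different width alone.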

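Figure 9: Normalized basis functions. The logistic map (blue) and the approximation to the logistic map (red) as a function of time. Note that the approximation is good for only a few time steps. This is a general characteristic of chaotic time series.

Time series prediction

Once the underlying geometry of the time series is estimated as in the previous examples, a prediction for the time series can be made by iteration:

\varphi(0) = x(1)

x(t) \approx \varphi(t-1)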

{x}(t+1)\approx \varphi (t)=\varphi [\varphi (t-1)]

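A comparison of the actual and estimated time series is displayed in the figure. The estimated time series starts out at time zero with an exact knowledge of $x(0)$. It then uses the estimate of the dynamics to update the time series estimate for several time steps.

Note that the estimate is accurate for only a few time steps. This is a general characteristic of chaotic time series, and a consequence of the sensitive dependence on initial conditions common to such series: a small initial error is amplified with time. A measure of the divergence of time series with nearly identical initial conditions is known as the Lyapunov exponent.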
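
The growth of the prediction error under iteration can be sketched without training a network at all. Here `approx_map` is a stand-in for a trained estimate $\varphi$: it is the true map with a small, hypothetical model error, which is an assumption made purely for illustration.

```python
def logistic_map(x):
    """True dynamics f(x) = 4 x (1 - x)."""
    return 4.0 * x * (1.0 - x)

def approx_map(x):
    """Stand-in for a trained network phi: the true map with a small assumed error."""
    return 3.999999 * x * (1.0 - x)

def iterate(model, x_start, steps):
    """Return [x_start, model(x_start), model(model(x_start)), ...]."""
    out = [x_start]
    for _ in range(steps):
        out.append(model(out[-1]))
    return out

steps = 100
actual = iterate(logistic_map, 0.3, steps)
predicted = iterate(approx_map, 0.3, steps)   # starts from exact knowledge of x(0)
errors = [abs(p - a) for p, a in zip(predicted, actual)]
print(errors[1], errors[5], max(errors))      # tiny at first, order one later on
```

Even though the stand-in model differs from the true map by less than one part in a million, the iterated prediction tracks the true series for only a limited number of steps before the two trajectories become unrelated.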

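Control of a chaotic time series

Figure 10: Control of the logistic map. The system is allowed to evolve naturally for 49 time steps. At time 50 control is turned on. The desired trajectory for the time series is red. The system under control learns the underlying dynamics and drives the time series to the desired output. The architecture is the same as for the time series prediction example.

We assume the output of the logistic map can be manipulated through a control parameter $c[x(t),t]$ such that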

x(t+1)=4x(t)[1-x(t)]+c[x(t),t]

The goal is to choose the control parameter in such a way as to drive the time series to a desired output $d(t)$. This can be done if we choose the control parameter to be

c[x(t),t] \ \stackrel{\mathrm{def}}{=}\ -\varphi[x(t)] + d(t+1)

where

\varphi[x(t)] \approx f[x(t)] = x(t+1) - c[x(t),t]

is an approximation to the underlying natural dynamics of the system.

The learning algorithm is given by

a_{i}(t+1) = a_{i}(t) + \nu\,\varepsilon\,\frac{u\big(\left\Vert \mathbf{x}(t)-\mathbf{c}_{i}\right\Vert\big)}{\sum_{j=1}^{N}u^{2}\big(\left\Vert \mathbf{x}(t)-\mathbf{c}_{j}\right\Vert\big)}

where

\varepsilon \ {\stackrel {\mathrm {def} }{=}}\ f[x(t)]-\varphi [x(t)]=x(t+1)-c[x(t),t]-\varphi [x(t)]=x(t+1)-d(t+1)
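
The control law can be sketched as below. For illustration the model `phi` is taken to be a perfect copy of the natural dynamics, which is an assumption; in the setting described above it would be the network being trained online, and the tracking error would equal the model error $\varepsilon$.

```python
def logistic_map(x):
    """Natural dynamics f(x) = 4 x (1 - x)."""
    return 4.0 * x * (1.0 - x)

def controlled_step(x, desired_next, phi):
    """Apply c[x(t), t] = -phi[x(t)] + d(t+1) and advance the system one step."""
    control = -phi(x) + desired_next
    return logistic_map(x) + control

# Assumption for illustration: phi models the natural dynamics perfectly.
phi = logistic_map
x = 0.3
for t in range(5):
    x = controlled_step(x, desired_next=0.5, phi=phi)
print(x)   # approximately 0.5, the desired output

# With an imperfect model the tracking error equals the model error f - phi.
biased_phi = lambda s: logistic_map(s) - 0.01
print(controlled_step(0.3, 0.5, biased_phi) - 0.5)   # approximately 0.01
```

The second call shows why $x(t+1)-d(t+1)$ can serve as the training signal: it is exactly the discrepancy between the natural dynamics and the current model at the visited state.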


References

[1] Broomhead, D. S.; Lowe, David (1988). Radial basis functions, multi-variable functional interpolation and adaptive networks (Technical report). RSRE. 4148. Archived from the original on April 9, 2013.
[2] Broomhead, D. S.; Lowe, David (1988). "Multivariable functional interpolation and adaptive networks" (PDF). Complex Systems. 2: 321–355.
[3] Schwenker, Friedhelm; Kestler, Hans A.; Palm, Günther (2001). "Three learning phases for radial-basis-function networks". Neural Networks. 14 (4–5): 439–458. CiteSeerX 10.1.1.109.312. doi:10.1016/s0893-6080(01)00027-2. PMID 11411631.
[4] Beheim, Larbi; Zitouni, Adel; Belloir, Fabien (January 2004). "New RBF neural network classifier with optimized hidden neurons number". CiteSeerX 10.1.1.497.5646.
[5] Ibrikci, Turgay; Brandt, M.E.; Wang, Guanyu; Acikkar, Mustafa (23–26 October 2002). Mahalanobis distance with radial basis function network on protein secondary structures. Proceedings of the Second Joint 24th Annual Conference and the Annual Fall Meeting of the Biomedical Engineering Society. Engineering in Medicine and Biology Society, Proceedings of the Annual International Conference of the IEEE. Vol. 3. Houston, TX, USA (published 6 January 2003). pp. 2184–2185. doi:10.1109/IEMBS.2002.1053230. ISBN 0-7803-7612-9. ISSN 1094-687X.
[6] Park, J.; I. W. Sandberg (Summer 1991). "Universal Approximation Using Radial-Basis-Function Networks". Neural Computation. 3 (2): 246–257. doi:10.1162/neco.1991.3.2.246. PMID 31167308. S2CID 34868087.

Further reading

J. Moody and C. J. Darken, "Fast learning in networks of locally tuned processing units," Neural Computation, 1, 281-294 (1989). Also see Radial basis function networks according to Moody and Darken.
T. Poggio and F. Girosi, "Networks for approximation and learning," Proc. IEEE 78(9), 1484-1487 (1990).
Roger D. Jones, Y. C. Lee, C. W. Barnes, G. W. Flake, K. Lee, P. S. Lewis, and S. Qian, "Function approximation and time series prediction with neural networks," Proceedings of the International Joint Conference on Neural Networks, June 17-21, p. I-649 (1990).
Martin D. Buhmann (2003). Radial Basis Functions: Theory and Implementations. Cambridge University. ISBN 0-521-63338-9.
Yee, Paul V. & Haykin, Simon (2001). Regularized Radial Basis Function Networks: Theory and Applications. John Wiley. ISBN 0-471-35349-3.
John R. Davies, Stephen V. Coggeshall, Roger D. Jones, and Daniel Schutzer, "Intelligent Security Systems".
Simon Haykin (1999). Neural Networks: A Comprehensive Foundation (2nd ed.). Upper Saddle River, NJ: Prentice Hall. ISBN 0-13-908385-5.
S. Chen, C. F. N. Cowan, and P. M. Grant, "Orthogonal Least Squares Learning Algorithm for Radial Basis Function Networks", IEEE Transactions on Neural Networks, Vol 2, No 2 (Mar) 1991.
