Deep belief network

Schematic overview of a deep belief net. Arrows represent directed connections in the graphical model that the net represents.

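In machine learning, a deep belief network (DBN) is a generative graphical model, or alternatively a class of deep neural network, composed of multiple layers of latent variables ("hidden units"), with connections between the layers but not between units within each layer.^[1]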

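When trained on a set of examples without supervision, a DBN can learn to probabilistically reconstruct its inputs. The layers then act as feature detectors.^[1] After this learning step, a DBN can be further trained with supervision to perform classification.^[2]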

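DBNs can be viewed as a composition of simple, unsupervised networks such as restricted Boltzmann machines (RBMs)^[1] or autoencoders,^[3] where each sub-network's hidden layer serves as the visible layer for the next. An RBM is an undirected, generative energy-based model with a "visible" input layer and a hidden layer, with connections between the layers but not within a layer. This composition leads to a fast, layer-by-layer unsupervised training procedure, in which contrastive divergence is applied to each sub-network in turn, starting from the "lowest" pair of layers (the lowest visible layer is a training set).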
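To make the "between layers but not within a layer" structure concrete, the following minimal sketch evaluates the energy of one joint configuration of a small binary RBM. It assumes the standard binary RBM energy $E(v,h)=-\sum _{i}a_{i}v_{i}-\sum _{j}b_{j}h_{j}-\sum _{i,j}v_{i}w_{ij}h_{j}$, which the text above does not spell out; the function name `rbm_energy` and the example numbers are illustrative only.

```python
from typing import Sequence


def rbm_energy(
    v: Sequence[float],
    h: Sequence[float],
    W: Sequence[Sequence[float]],
    a: Sequence[float],
    b: Sequence[float],
) -> float:
    """Energy of a joint state (v, h) under the assumed binary RBM form.

    W[i][j] couples visible unit i to hidden unit j. There are no
    visible-visible or hidden-hidden terms, which is the "restricted" part.
    """
    visible_term = sum(a_i * v_i for a_i, v_i in zip(a, v))
    hidden_term = sum(b_j * h_j for b_j, h_j in zip(b, h))
    interaction = sum(
        v[i] * W[i][j] * h[j] for i in range(len(v)) for j in range(len(h))
    )
    return -(visible_term + hidden_term + interaction)


# Illustrative parameters: 3 visible units, 2 hidden units.
W = [[1.0, -2.0], [0.5, 0.0], [0.0, 3.0]]
a = [0.1, 0.2, 0.3]   # visible biases
b = [-0.5, 0.5]       # hidden biases
energy = rbm_energy([1, 0, 1], [1, 1], W, a, b)
```

Because the only cross terms are of the form $v_{i}w_{ij}h_{j}$, the hidden units are conditionally independent given the visible units and vice versa, which is what allows each layer to be updated in parallel.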

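The observation^[2] that DBNs can be trained greedily, one layer at a time, led to one of the first effective deep learning algorithms.^[4] (p. 6) DBNs have since been implemented and used in many real-world applications and scenarios (e.g., electroencephalography,^[5] drug discovery^[6]^[7]^[8]).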

Training

A restricted Boltzmann machine (RBM) with fully connected visible and hidden units. Note that there are no hidden-hidden or visible-visible connections.

The training method for RBMs proposed by Geoffrey Hinton for use with training "Product of Experts" models is called contrastive divergence (CD).^[9] CD provides an approximation to the maximum likelihood method that would ideally be applied for learning the weights.^[10]^[11] In training a single RBM, weight updates are performed by gradient ascent on the log-likelihood via the following equation: $w_{ij}(t+1)=w_{ij}(t)+\eta {\frac {\partial \log(p(v))}{\partial w_{ij}}}$

where $p(v)$ is the probability of a visible vector, given by $p(v)={\frac {1}{Z}}\sum _{h}e^{-E(v,h)}$. $Z$ is the partition function (used for normalizing) and $E(v,h)$ is the energy function assigned to the state of the network. A lower energy indicates that the network is in a more "desirable" configuration. The gradient ${\frac {\partial \log(p(v))}{\partial w_{ij}}}$ has the simple form $\langle v_{i}h_{j}\rangle _{\text{data}}-\langle v_{i}h_{j}\rangle _{\text{model}}$, where $\langle \cdots \rangle _{p}$ denotes an average with respect to the distribution $p$. The difficulty lies in sampling $\langle v_{i}h_{j}\rangle _{\text{model}}$, because this requires extended alternating Gibbs sampling. CD replaces this step by running alternating Gibbs sampling for $n$ steps (a value of $n=1$ performs well). After $n$ steps, the data are sampled and that sample is used in place of $\langle v_{i}h_{j}\rangle _{\text{model}}$. The CD procedure works as follows:^[10]

1. Initialize the visible units to a training vector.
2. Update the hidden units in parallel given the visible units: $p(h_{j}=1\mid {\textbf {V}})=\sigma (b_{j}+\sum _{i}v_{i}w_{ij})$. $\sigma$ is the sigmoid function and $b_{j}$ is the bias of $h_{j}$.
3. Update the visible units in parallel given the hidden units: $p(v_{i}=1\mid {\textbf {H}})=\sigma (a_{i}+\sum _{j}h_{j}w_{ij})$. $a_{i}$ is the bias of $v_{i}$. This is called the "reconstruction" step.
4. Re-update the hidden units in parallel given the reconstructed visible units, using the same equation as in step 2.
5. Perform the weight update: $\Delta w_{ij}\propto \langle v_{i}h_{j}\rangle _{\text{data}}-\langle v_{i}h_{j}\rangle _{\text{reconstruction}}$.
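The procedure above can be sketched for a single training vector as follows. This is a minimal illustration with $n=1$, not a reference implementation: the function names, the learning rate `lr`, and the bias updates are assumptions that the steps above do not specify, and using hidden probabilities rather than sampled states in the update statistics is one common choice among several.

```python
import math
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v: Vector, W: Matrix, b: Vector) -> Vector:
    """p(h_j = 1 | v) = sigmoid(b_j + sum_i v_i w_ij)."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]


def visible_probs(h: Vector, W: Matrix, a: Vector) -> Vector:
    """p(v_i = 1 | h) = sigmoid(a_i + sum_j h_j w_ij)."""
    return [sigmoid(a[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(a))]


def sample(probs: Vector, rng: random.Random) -> Vector:
    """Draw independent binary states from the given probabilities."""
    return [1.0 if rng.random() < p else 0.0 for p in probs]


def cd1_step(v0: Vector, W: Matrix, a: Vector, b: Vector,
             lr: float, rng: random.Random) -> None:
    """One CD-1 update in place; v0 is the training vector (step 1)."""
    ph0 = hidden_probs(v0, W, b)                 # step 2: hidden given data
    h0 = sample(ph0, rng)
    v1 = sample(visible_probs(h0, W, a), rng)    # step 3: reconstruction
    ph1 = hidden_probs(v1, W, b)                 # step 4: hidden given reconstruction
    for i in range(len(a)):                      # step 5: weight update
        for j in range(len(b)):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
    # Bias updates follow the same data-minus-reconstruction pattern (assumed).
    for i in range(len(a)):
        a[i] += lr * (v0[i] - v1[i])
    for j in range(len(b)):
        b[j] += lr * (ph0[j] - ph1[j])


rng = random.Random(0)
W = [[rng.gauss(0.0, 0.01) for _ in range(2)] for _ in range(3)]
a = [0.0, 0.0, 0.0]
b = [0.0, 0.0]
for _ in range(100):
    cd1_step([1.0, 1.0, 0.0], W, a, b, lr=0.1, rng=rng)
```

In practice the update is usually averaged over a mini-batch of training vectors rather than applied one vector at a time; the single-vector form is kept here only to mirror the numbered steps.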

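Once an RBM is trained, another RBM is "stacked" atop it, taking its input from the final trained layer. The new visible layer is initialized to a training vector, and values for the units in the already-trained layers are assigned using the current weights and biases. The new RBM is then trained with the procedure above. This whole process is repeated until the desired stopping criterion is met.^[12]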
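The stacking procedure can be sketched as a loop that trains one RBM, pushes the training data up through it, and uses the result as the input of the next RBM. The sketch below is self-contained and makes several assumptions not stated in the text: the names `train_rbm` and `train_dbn` are hypothetical, each RBM is trained with CD-1 for a fixed number of epochs, the stopping criterion is simply a fixed list of hidden-layer sizes, and hidden probabilities (rather than sampled states) are passed upward.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Layer = Tuple[Matrix, Vector, Vector]  # (weights, visible biases, hidden biases)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def up(v: Vector, W: Matrix, b: Vector) -> Vector:
    """Hidden probabilities given a visible vector."""
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]


def down(h: Vector, W: Matrix, a: Vector) -> Vector:
    """Visible probabilities given a hidden vector."""
    return [sigmoid(a[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(a))]


def sample(probs: Vector, rng: random.Random) -> Vector:
    return [1.0 if rng.random() < p else 0.0 for p in probs]


def train_rbm(data: List[Vector], n_hidden: int, epochs: int,
              lr: float, rng: random.Random) -> Layer:
    """Train one RBM with CD-1 on the given data."""
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
    a, b = [0.0] * n_visible, [0.0] * n_hidden
    for _ in range(epochs):
        for v0 in data:
            ph0 = up(v0, W, b)
            v1 = sample(down(sample(ph0, rng), W, a), rng)
            ph1 = up(v1, W, b)
            for i in range(n_visible):
                a[i] += lr * (v0[i] - v1[i])
                for j in range(n_hidden):
                    W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
            for j in range(n_hidden):
                b[j] += lr * (ph0[j] - ph1[j])
    return W, a, b


def train_dbn(data: List[Vector], hidden_sizes: List[int],
              epochs: int = 20, lr: float = 0.1, seed: int = 0) -> List[Layer]:
    """Greedy layer-wise training: each RBM's hidden layer feeds the next RBM."""
    rng = random.Random(seed)
    layers: List[Layer] = []
    inputs = data
    for n_hidden in hidden_sizes:
        W, a, b = train_rbm(inputs, n_hidden, epochs, lr, rng)
        layers.append((W, a, b))
        # Already-trained weights and biases assign values to the next visible layer.
        inputs = [up(v, W, b) for v in inputs]
    return layers


data = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
layers = train_dbn(data, hidden_sizes=[3, 2])
```

Lower layers are frozen once trained in this sketch; any supervised fine-tuning of the whole stack would be a separate step.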

Although CD's approximation to maximum likelihood is crude (it does not follow the gradient of any function), it is empirically effective.^[10]
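For an RBM small enough to enumerate, the exact maximum-likelihood gradient that CD approximates can be computed by brute force, which makes the two expectation terms tangible. The sketch below assumes the standard binary RBM energy $E(v,h)=-\sum _{i}a_{i}v_{i}-\sum _{j}b_{j}h_{j}-\sum _{i,j}v_{i}w_{ij}h_{j}$; the function names and parameter values are illustrative, and the enumeration is exponential in the number of units, so it is only usable for toy models.

```python
import itertools
import math
from typing import List, Sequence, Tuple


def energy(v: Sequence[int], h: Sequence[int], W: List[List[float]],
           a: List[float], b: List[float]) -> float:
    """Assumed binary RBM energy E(v, h)."""
    return -(sum(a[i] * v[i] for i in range(len(v)))
             + sum(b[j] * h[j] for j in range(len(h)))
             + sum(v[i] * W[i][j] * h[j]
                   for i in range(len(v)) for j in range(len(h))))


def all_states(n: int) -> List[Tuple[int, ...]]:
    """All 2**n binary vectors of length n."""
    return list(itertools.product((0, 1), repeat=n))


def log_p_visible(v: Sequence[int], W: List[List[float]],
                  a: List[float], b: List[float]) -> float:
    """log p(v) = log sum_h exp(-E(v, h)) - log Z, by full enumeration."""
    hs = all_states(len(b))
    numerator = sum(math.exp(-energy(v, h, W, a, b)) for h in hs)
    Z = sum(math.exp(-energy(u, h, W, a, b))
            for u in all_states(len(a)) for h in hs)
    return math.log(numerator) - math.log(Z)


def exact_gradient(v: Sequence[int], W: List[List[float]], a: List[float],
                   b: List[float], i: int, j: int) -> float:
    """<v_i h_j>_data - <v_i h_j>_model for one visible vector v."""
    hs = all_states(len(b))
    # Data term: average of v_i h_j over p(h | v), with v clamped.
    clamped = sum(math.exp(-energy(v, h, W, a, b)) for h in hs)
    data_term = sum(math.exp(-energy(v, h, W, a, b)) * v[i] * h[j]
                    for h in hs) / clamped
    # Model term: average of v_i h_j over the joint p(v, h).
    us = all_states(len(a))
    Z = sum(math.exp(-energy(u, h, W, a, b)) for u in us for h in hs)
    model_term = sum(math.exp(-energy(u, h, W, a, b)) * u[i] * h[j]
                     for u in us for h in hs) / Z
    return data_term - model_term


W = [[0.5, -0.3], [0.2, 0.8], [-0.6, 0.1]]
a = [0.1, -0.2, 0.3]
b = [0.05, -0.1]
v = (1, 0, 1)
grad_01 = exact_gradient(v, W, a, b, i=0, j=1)
```

The model term is the expensive part: it sums over every joint configuration, which is why CD substitutes a short Gibbs chain started at the data.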

References

[1] Hinton G (2009). "Deep belief networks". Scholarpedia. 4 (5): 5947. Bibcode:2009SchpJ...4.5947H. doi:10.4249/scholarpedia.5947.
[2] Hinton GE, Osindero S, Teh YW (July 2006). "A fast learning algorithm for deep belief nets" (PDF). Neural Computation. 18 (7): 1527–54. CiteSeerX 10.1.1.76.1541. doi:10.1162/neco.2006.18.7.1527. PMID 16764513. S2CID 2309950.
[3] Bengio Y, Lamblin P, Popovici D, Larochelle H (2007). Greedy Layer-Wise Training of Deep Networks (PDF). NIPS.
[4] Bengio Y (2009). "Learning Deep Architectures for AI" (PDF). Foundations and Trends in Machine Learning. 2 (1): 1–127. CiteSeerX 10.1.1.701.9550. doi:10.1561/2200000006.
[5] Movahedi F, Coyle JL, Sejdic E (May 2018). "Deep Belief Networks for Electroencephalography: A Review of Recent Contributions and Future Outlooks". IEEE Journal of Biomedical and Health Informatics. 22 (3): 642–652. doi:10.1109/jbhi.2017.2727218. PMC 5967386. PMID 28715343.
[6] Ghasemi, Pérez-Sánchez; Mehri, Pérez-Garrido (2018). "Neural network and deep-learning algorithms used in QSAR studies: merits and drawbacks". Drug Discovery Today. 23 (10): 1784–1790. doi:10.1016/j.drudis.2018.06.016. PMID 29936244.
[7] Ghasemi, Pérez-Sánchez; Mehri, Fassihi (2016). "The Role of Different Sampling Methods in Improving Biological Activity Prediction Using Deep Belief Network". Journal of Computational Chemistry. 38 (10): 1–8. doi:10.1002/jcc.24671. PMID 27862046. S2CID 12077015.
[8] Gawehn E, Hiss JA, Schneider G (January 2016). "Deep Learning in Drug Discovery". Molecular Informatics. 35 (1): 3–14. doi:10.1002/minf.201501008. PMID 27491648. S2CID 10574953.
[9] Hinton GE (2002). "Training Product of Experts by Minimizing Contrastive Divergence" (PDF). Neural Computation. 14 (8): 1771–1800. CiteSeerX 10.1.1.35.8613. doi:10.1162/089976602760128018. PMID 12180402. S2CID 207596505.
[10] Hinton GE (2010). "A Practical Guide to Training Restricted Boltzmann Machines". Tech. Rep. UTML TR 2010-003.
[11] Fischer A, Igel C (2014). "Training Restricted Boltzmann Machines: An Introduction" (PDF). Pattern Recognition. 47: 25–39. CiteSeerX 10.1.1.716.8647. doi:10.1016/j.patcog.2013.05.025. Archived from the original (PDF) on 2015-06-10. Retrieved 2017-07-02.
[12] Bengio Y (2009). "Learning Deep Architectures for AI" (PDF). Foundations and Trends in Machine Learning. 2 (1): 1–127. CiteSeerX 10.1.1.701.9550. doi:10.1561/2200000006. Archived from the original (PDF) on 2016-03-04. Retrieved 2017-07-02.

External links

"Deep Belief Networks". Deep Learning Tutorials.
"Deep Belief Network Example". Deeplearning4j Tutorials. Archived from the original on 2016-10-03. Retrieved 2015-02-22.
