Pachinko allocation

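In machine learning and natural language processing, the pachinko allocation model (PAM) is a topic model. Topic models are a suite of algorithms for uncovering the hidden thematic structure of a collection of documents.^[1] The algorithm improves on earlier topic models such as latent Dirichlet allocation (LDA) by modeling correlations between topics in addition to the word correlations that constitute topics. PAM provides more flexibility and greater expressive power than latent Dirichlet allocation.^[2] Although it was first described and implemented in the context of natural language processing, the algorithm may have applications in other fields such as bioinformatics. The model is named after pachinko machines, a game popular in Japan in which metal balls bounce down around a complex collection of pins until they land in various bins at the bottom.^[3]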

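History

Pachinko allocation was first described by Wei Li and Andrew McCallum in 2006.^[3] The idea was extended to hierarchical pachinko allocation by Li, McCallum, and David Mimno in 2007.^[4] Also in 2007, McCallum and his colleagues proposed a nonparametric Bayesian prior for PAM based on a variant of the hierarchical Dirichlet process (HDP).^[2] The algorithm has been implemented in the MALLET software package published by McCallum's group at the University of Massachusetts Amherst.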

Model

PAM connects the words in V and the topics in T with an arbitrary directed acyclic graph (DAG), in which topic nodes occupy the interior levels and the leaves are words.
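The DAG structure can be illustrated with a minimal Python sketch that draws a word by walking from the root topic down to a leaf. The graph, node names, and edge weights here are hypothetical and chosen only for illustration. As a simplifying assumption, the weights are fixed, whereas in PAM as described by Li and McCallum each topic's distribution over its children is drawn per document from a Dirichlet distribution.

```python
import random

# Hypothetical toy DAG: interior nodes are topics, leaves are words.
# All names and weights are invented for illustration; in PAM itself the
# child distributions would be sampled per document from Dirichlet priors.
DAG = {
    "root": {"sports": 0.5, "politics": 0.5},
    "sports": {"ball": 0.6, "team": 0.3, "vote": 0.1},
    "politics": {"vote": 0.7, "team": 0.2, "ball": 0.1},
}

def sample_word(dag, rng, start="root"):
    """Walk from the root to a leaf, choosing one child at each topic node."""
    node = start
    path = [node]
    while node in dag:  # only interior (topic) nodes have outgoing edges
        children = list(dag[node])
        weights = [dag[node][child] for child in children]
        node = rng.choices(children, weights=weights, k=1)[0]
        path.append(node)
    return node, path  # the leaf reached is the generated word

rng = random.Random(0)
samples = [sample_word(DAG, rng) for _ in range(10)]
words = [word for word, _ in samples]
```

Because the words "ball", "team", and "vote" are reachable from more than one topic, the graph is a DAG rather than a tree, which is the property that lets upper-level nodes capture correlations between topics.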

The probability of generating a whole corpus is the product of the probabilities of its documents:^[3]

$\displaystyle P(\mathbf{D} \mid \alpha) = \prod_{d} P(d \mid \alpha)$
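In practice a product over many documents is usually evaluated as a sum of logarithms, since multiplying many small probabilities can underflow floating-point arithmetic. The sketch below assumes the per-document probabilities are already available. The values in `doc_probs` and the helper name `corpus_log_prob` are hypothetical and not part of any PAM implementation.

```python
import math

# Hypothetical per-document probabilities P(d | alpha); values are made up.
doc_probs = [1e-40, 3e-55, 2e-48]

def corpus_log_prob(probs):
    """Return log P(D | alpha) as the sum of per-document log-probabilities."""
    return sum(math.log(p) for p in probs)

log_p = corpus_log_prob(doc_probs)
```

Exponentiating the result recovers the product whenever it is large enough to be represented as a float.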

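See also

Probabilistic latent semantic indexing (PLSI), an early topic model proposed by Thomas Hofmann in 1999.^[5]
Latent Dirichlet allocation, a generalization of PLSI developed by David Blei, Andrew Ng, and Michael Jordan in 2002, which allows documents to have a mixture of topics.^[6]
MALLET, an open-source Java library that implements pachinko allocation.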

References

[1] Blei, David. "Topic modeling". Archived from the original on 2 October 2012. Retrieved 4 October 2012.
[2] Li, Wei; Blei, David; McCallum, Andrew (2007). "Nonparametric Bayes Pachinko Allocation". arXiv:1206.5270.
[3] Li, Wei; McCallum, Andrew (2006). "Pachinko Allocation: DAG-Structured Mixture Models of Topic Correlations" (PDF). Proceedings of the 23rd International Conference on Machine Learning. doi:10.1145/1143844.1143917. S2CID 13160178.
[4] Mimno, David; Li, Wei; McCallum, Andrew (2007). "Mixtures of Hierarchical Topics with Pachinko Allocation" (PDF). Proceedings of the 24th International Conference on Machine Learning: 633–640. doi:10.1145/1273496.1273576. ISBN 9781595937933. S2CID 6045658.
[5] Hofmann, Thomas (1999). "Probabilistic Latent Semantic Indexing" (PDF). Proceedings of the Twenty-Second Annual International SIGIR Conference on Research and Development in Information Retrieval. Archived from the original (PDF) on 14 December 2010.
[6] Blei, David M.; Ng, Andrew Y.; Jordan, Michael I; Lafferty, John (January 2003). "Latent Dirichlet allocation". Journal of Machine Learning Research. 3: 993–1022. Archived from the original on 1 May 2012. Retrieved 19 July 2010.

External links

Mixtures of Hierarchical Topics with Pachinko Allocation, a video recording of David Mimno presenting HPAM in 2007.


