XGBoost

XGBoost
Developer(s)	The XGBoost Contributors
Initial release	March 27, 2014
Stable release	1.6.1 / May 9, 2022^[1]
Repository	github.com/dmlc/xgboost
Written in	C++
Operating system	Linux, macOS, Windows
Type	Machine learning
License	Apache License 2.0
Website	xgboost.ai

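XGBoost^[2] (eXtreme Gradient Boosting) is an open-source software library that provides a regularizing gradient boosting framework for C++, Java, Python,^[3] R,^[4] Julia,^[5] Perl,^[6] and Scala. It works on Linux, Windows,^[7] and macOS.^[8] According to the project description, it aims to provide a "Scalable, Portable and Distributed Gradient Boosting (GBM, GBRT, GBDT)" library. It runs on a single machine as well as on the distributed processing frameworks Apache Hadoop, Apache Spark, Apache Flink, and Dask.^[9]^[10]

It has gained much popularity and attention as the algorithm of choice for many winning teams in machine learning competitions.^[11]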

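History

XGBoost started as a research project by Tianqi Chen^[12] as part of the Distributed (Deep) Machine Learning Community (DMLC) group. It began as a terminal application that could be configured using a libsvm configuration file. It became well known in ML competition circles after its use in the winning solution of the Higgs Machine Learning Challenge. Soon after, the Python and R packages were built, and XGBoost now has package implementations for Java, Scala, Julia, Perl, and other languages. This brought the library to more developers and contributed to its popularity in the Kaggle community, where it has been used in a large number of competitions.^[11]

It was soon integrated with a number of other packages, making it easier to use in their respective communities. It has been integrated with scikit-learn for Python users and with the caret package for R users. It can also be integrated into data flow frameworks such as Apache Spark, Apache Hadoop, and Apache Flink using the abstracted Rabit^[13] and XGBoost4J.^[14] XGBoost is also available on OpenCL for FPGAs.^[15] An efficient, scalable implementation of XGBoost has been published by Tianqi Chen and Carlos Guestrin.^[16]

While an XGBoost model often achieves higher accuracy than a single decision tree, it sacrifices the intrinsic interpretability of decision trees. For example, following the path that a decision tree takes to reach its decision is trivial and self-explanatory, but following the paths of hundreds or thousands of trees is much harder. To achieve both performance and interpretability, some model compression techniques allow an XGBoost model to be transformed into a single "born-again" decision tree that approximates the same decision function.^[17]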

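Features

Salient features of XGBoost that distinguish it from other gradient boosting algorithms include:^[18]^[19]^[20]

Clever penalization of trees
A proportional shrinking of leaf nodes
Newton boosting
Extra randomization parameter
Implementation on single and distributed systems, and out-of-core computation
Automatic feature selection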
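
The tree penalization and leaf shrinkage items can be illustrated with the regularized objective from the Chen and Guestrin paper,^[16] which adds a per-leaf penalty (written γ there) and an L2 penalty on leaf values (written λ). The sketch below is only an illustration of those two formulas, not the library's code; the function names `leaf_weight` and `split_gain` and the default values are assumptions made for this example. `G` and `H` stand for the sums of first and second derivatives of the loss over the examples in a leaf.

```python
def leaf_weight(G, H, lam=1.0):
    """Optimal leaf value -G / (H + lam).

    With lam = 0 this is the plain Newton step -G / H; a positive lam
    shrinks the leaf value toward zero.
    """
    return -G / (H + lam)


def split_gain(GL, HL, GR, HR, lam=1.0, gamma=0.0):
    """Reduction in the regularized objective from splitting one leaf in two.

    gamma is the cost charged for the extra leaf, so a split is only
    worthwhile when the returned value is positive.
    """
    def score(G, H):
        return G * G / (H + lam)

    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR)) - gamma
```

A split whose gain is negative after subtracting `gamma` would not be kept, which is one way the penalty limits tree size.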

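Algorithm

XGBoost works as Newton-Raphson in function space, unlike gradient boosting, which works as gradient descent in function space. A second-order Taylor approximation of the loss function is used to make the connection to the Newton-Raphson method.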
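
The connection can be written out as a short derivation. This is a sketch under the assumption that the second derivative is positive; here $\hat{f}_{(m-1)}$ denotes the current model, $\phi$ the increment to be fitted, and $\hat{g}_{m}(x_{i})$, $\hat{h}_{m}(x_{i})$ the first and second derivatives of $L$ with respect to the prediction, evaluated at $\hat{f}_{(m-1)}(x_{i})$.

$L(y_{i}, \hat{f}_{(m-1)}(x_{i}) + \phi(x_{i})) \approx L(y_{i}, \hat{f}_{(m-1)}(x_{i})) + \hat{g}_{m}(x_{i})\,\phi(x_{i}) + \frac{1}{2}\hat{h}_{m}(x_{i})\,\phi(x_{i})^{2}$

$\hat{g}_{m}(x_{i})\,\phi(x_{i}) + \frac{1}{2}\hat{h}_{m}(x_{i})\,\phi(x_{i})^{2} = \frac{1}{2}\hat{h}_{m}(x_{i})\left[-\frac{\hat{g}_{m}(x_{i})}{\hat{h}_{m}(x_{i})} - \phi(x_{i})\right]^{2} - \frac{\hat{g}_{m}(x_{i})^{2}}{2\hat{h}_{m}(x_{i})}$

The last term does not depend on $\phi$, so minimizing the approximation amounts to a weighted least-squares fit of $\phi(x_{i})$ to the target $-\hat{g}_{m}(x_{i})/\hat{h}_{m}(x_{i})$ with weights $\hat{h}_{m}(x_{i})$. That target is the Newton-Raphson step for a single prediction, whereas gradient boosting would fit $\phi$ to $-\hat{g}_{m}(x_{i})$ alone.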

A generic unregularized XGBoost algorithm is:

Input: training set $\{(x_{i},y_{i})\}_{i=1}^{N}$, a differentiable loss function $L(y,F(x))$, a number of weak learners $M$, and a learning rate $\alpha$.

Algorithm:

Initialize the model with a constant value:
$\hat{f}_{(0)}(x) = \underset{\theta}{\arg\min} \sum_{i=1}^{N} L(y_{i},\theta).$
For m = 1 to $M$:
1. Compute the 'gradients' and 'hessians':
  $\hat{g}_{m}(x_{i}) = \left[\frac{\partial L(y_{i},f(x_{i}))}{\partial f(x_{i})}\right]_{f(x)=\hat{f}_{(m-1)}(x)}.$
  
  $\hat{h}_{m}(x_{i}) = \left[\frac{\partial^{2} L(y_{i},f(x_{i}))}{\partial f(x_{i})^{2}}\right]_{f(x)=\hat{f}_{(m-1)}(x)}.$
2. Fit a base learner (or weak learner, e.g. a tree) using the training set $\left\{x_{i},-\frac{\hat{g}_{m}(x_{i})}{\hat{h}_{m}(x_{i})}\right\}_{i=1}^{N}$ by solving the optimization problem below:
  $\hat{\phi}_{m} = \underset{\phi \in \mathbf{\Phi}}{\arg\min} \sum_{i=1}^{N} \frac{1}{2}\hat{h}_{m}(x_{i})\left[-\frac{\hat{g}_{m}(x_{i})}{\hat{h}_{m}(x_{i})} - \phi(x_{i})\right]^{2}.$
  
  $\hat{f}_{m}(x) = \alpha \hat{\phi}_{m}(x).$
3. Update the model:
  $\hat{f}_{(m)}(x) = \hat{f}_{(m-1)}(x) + \hat{f}_{m}(x).$
Output $\hat{f}(x) = \hat{f}_{(M)}(x) = \sum_{m=0}^{M} \hat{f}_{m}(x)$, where $\hat{f}_{0} = \hat{f}_{(0)}$.
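
The steps above can be sketched in plain Python. This is a minimal illustration, not XGBoost's implementation: it assumes a one-dimensional feature, binary labels in {0, 1} with both classes present, the logistic loss, and a one-split tree ("stump") as the base learner. All function names and default values here are chosen for this example only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grad_hess(y, f):
    """First and second derivatives of the logistic loss w.r.t. the raw score f."""
    p = sigmoid(f)
    return p - y, p * (1.0 - p)


def fit_stump(xs, gs, hs):
    """Fit a one-split tree minimizing sum 0.5 * h * (-g/h - phi(x))**2.

    For a fixed partition the best leaf value is -sum(g) / sum(h); the
    threshold is found by exhaustive search over midpoints.
    """
    def leaf(idx):
        G = sum(gs[i] for i in idx)
        H = sum(hs[i] for i in idx)
        return -G / H, -0.5 * G * G / H  # (leaf value, objective up to a constant)

    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = xs[order[k - 1]], xs[order[k]]
        if lo == hi:
            continue  # no threshold separates equal feature values
        (wl, ol), (wr, orr) = leaf(order[:k]), leaf(order[k:])
        if best is None or ol + orr < best[0]:
            best = (ol + orr, 0.5 * (lo + hi), wl, wr)
    if best is None:  # all feature values equal: fall back to a single leaf
        w, _ = leaf(order)
        return lambda x: w
    _, thr, wl, wr = best
    return lambda x: wl if x < thr else wr


def newton_boost(xs, ys, M=20, alpha=0.3):
    """Return a function giving the raw score after M Newton boosting rounds."""
    p = sum(ys) / len(ys)
    f0 = math.log(p / (1.0 - p))  # constant minimizing the summed logistic loss
    learners = []
    scores = [f0] * len(xs)
    for _ in range(M):
        gh = [grad_hess(y, f) for y, f in zip(ys, scores)]  # step 1
        gs = [g for g, _ in gh]
        hs = [h for _, h in gh]
        phi = fit_stump(xs, gs, hs)                          # step 2
        learners.append(phi)
        scores = [f + alpha * phi(x) for f, x in zip(scores, xs)]  # step 3

    def predict(x):
        return f0 + alpha * sum(phi(x) for phi in learners)

    return predict
```

Calling `newton_boost(xs, ys)` on a small labelled list returns a scoring function; passing its result through `sigmoid` gives a probability estimate. Because the sketch is unregularized, leaf values on perfectly separable data keep growing with more rounds.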

Awards

John Chambers Award (2016)^[21]
High Energy Physics meets Machine Learning award (HEP meets ML) (2016)^[22]

See also

LightGBM

References

^ "1.6.1 Patch Release". Retrieved 8 July 2022.
^ "GitHub project webpage". GitHub. June 2022.
^ "Python Package Index PYPI: xgboost". Retrieved 2016-08-01.
^ "CRAN package xgboost". Retrieved 2016-08-01.
^ "Julia package listing xgboost". Retrieved 2016-08-01.
^ "CPAN module AI::XGBoost". Retrieved 2020-02-09.
^ "Installing XGBoost for Anaconda in Windows". IBM. Retrieved 2016-08-01.
^ "Installing XGBoost on Mac OSX". IBM. Retrieved 2016-08-01.
^ "Dask Homepage".
^ "Distributed XGBoost with Dask — xgboost 1.5.0-dev documentation". xgboost.readthedocs.io. Retrieved 2021-07-15.
^ "XGBoost - ML winning solutions (incomplete list)". GitHub. Retrieved 2016-08-01.
^ "Story and Lessons behind the evolution of XGBoost". Retrieved 2016-08-01.
^ "Rabit - Reliable Allreduce and Broadcast Interface". GitHub. Retrieved 2016-08-01.
^ "XGBoost4J". Retrieved 2016-08-01.
^ "XGBoost on FPGAs". GitHub. Retrieved 2019-08-01.
^ Chen, Tianqi; Guestrin, Carlos (2016). "XGBoost: A Scalable Tree Boosting System". In Krishnapuram, Balaji; Shah, Mohak; Smola, Alexander J.; Aggarwal, Charu C.; Shen, Dou; Rastogi, Rajeev (eds.). Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, August 13-17, 2016. ACM. pp. 785–794. arXiv:1603.02754. doi:10.1145/2939672.2939785.
^ Sagi, Omer; Rokach, Lior (2021). "Approximating XGBoost with an interpretable decision tree". Information Sciences. 572 (2021): 522-542. doi:10.1016/j.ins.2021.05.055.
^ Gandhi, Rohith (2019-05-24). "Gradient Boosting and XGBoost". Medium. Retrieved 2020-01-04.
^ "Boosting algorithm: XGBoost". Towards Data Science. 2017-05-14. Retrieved 2020-01-04.
^ "Tree Boosting With XGBoost – Why Does XGBoost Win "Every" Machine Learning Competition?". Synced. 2017-10-22. Retrieved 2020-01-04.
^ "John Chambers Award Previous Winners". Retrieved 2016-08-01.
^ "HEP meets ML Award". Retrieved 2016-08-01.
