XGBoost is an open-source library implementing gradient boosting, an ensemble method. The algorithm runs on a single notebook computer or scales out across multiple cores, GPUs, threads, and compute clusters. It provides its own native interface as well as a scikit-learn-compatible one.

Project Background

  • Library: XGBoost
  • Author: Tianqi Chen and DMLC group
  • Initial Release: March 2014
  • Type: Gradient boosting library
  • License: Apache 2.0
  • Language: Developed in C++
  • GitHub: dmlc/xgboost
  • Runs On: Single machine and supports Hadoop, Spark, and Flink
  • Twitter: XGBoost


Features

  • Boosting is an ensemble method
  • Frequently appears in winning Kaggle competition entries
  • Supports regularized boosting, which penalizes model complexity (L1 and L2 terms)
  • Handles missing values automatically
  • Takes advantage of multiple cores, threads, and clusters of compute
  • Incremental training: stop and save a model mid-training, then resume it later
  • Tree pruning: grows trees to full depth and then prunes backward, producing deeper, better-optimized trees
  • Types of boosting supported: gradient, stochastic gradient, and regularized boosting