Scikit-learn is one of the most widely used libraries in machine learning. It is a Python module built on top of NumPy, SciPy, and Matplotlib. David Cournapeau started it as a Google Summer of Code project under the name scikits.learn. Today, a large community of volunteers contributes to the project.

It is also one of the most popular machine learning libraries on GitHub. It integrates well with many other Python libraries, such as Plotly, Matplotlib, pandas, and NumPy. Organizations that use it include J.P. Morgan, Spotify, Evernote, AWeber, and Dataiku.

Project Background

  • Library: Scikit-learn
  • Author: David Cournapeau
  • Initial Release: June 2007
  • Type: Library for machine learning
  • License: New BSD License
  • Contains: Various classification, regression, and clustering algorithms, including support vector machines, random forests, gradient boosting, k-means, and DBSCAN.
  • Language: Python, Cython, C, and C++
  • GitHub: scikit-learn
  • Runs On: Linux, macOS, Windows
  • GitHub Discussions: scikit-learn/discussions
  • Twitter: scikit_learn
  • Stack Overflow: scikit-learn
  • Samples: Examples for download
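
As a rough illustration of two of the clustering algorithms named in the list above, the sketch below runs k-means and DBSCAN on the same data. The data is synthetic, and the parameter values (three blob centers, eps=1.0, min_samples=5) are assumptions chosen for this example, not recommended defaults.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs

# Synthetic data: three compact, well-separated 2-D blobs (illustrative only).
X, _ = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=0.4,
    random_state=0,
)

# k-means must be told how many clusters to look for.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
kmeans_labels = kmeans.fit_predict(X)

# DBSCAN groups points by density instead; the label -1 marks noise points.
dbscan = DBSCAN(eps=1.0, min_samples=5)
dbscan_labels = dbscan.fit_predict(X)

n_kmeans = len(set(kmeans_labels))
n_dbscan = len(set(dbscan_labels) - {-1})  # exclude the noise label
print("k-means clusters:", n_kmeans)
print("DBSCAN clusters:", n_dbscan)
```

Both estimators expose the same fit_predict call, so swapping one clustering algorithm for another usually means changing only the constructor line.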


Features

  • Classification
  • Regression
  • Clustering
  • Preprocessing
  • Model selection
  • Dimensionality reduction
  • Advanced probability modeling
  • Decision tree pruning and induction
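
Several of the items above can be combined in a few lines. The following is a minimal sketch, assuming the bundled Iris dataset and an arbitrary choice of estimators (StandardScaler for preprocessing, PCA for dimensionality reduction, SVC for classification); it is one possible arrangement, not a prescribed workflow.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load a small built-in dataset and hold out a test split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Chain preprocessing, dimensionality reduction, and classification.
model = make_pipeline(StandardScaler(), PCA(n_components=2), SVC(kernel="rbf"))

# Model selection: estimate accuracy with 5-fold cross-validation on the training split.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

# Fit on the full training split and score on the held-out data.
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

print("Mean cross-validation accuracy:", cv_scores.mean())
print("Held-out test accuracy:", test_accuracy)
```

Wrapping the steps in a pipeline means the scaler and PCA are refit inside each cross-validation fold, which keeps information from the validation data out of the preprocessing.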

Use Cases

  • Financial cybersecurity analytics
  • Product development
  • Neuroimaging
  • Barcode scanner development
  • Medical modeling
  • Handling Shopify inventory issues