Scikit-learn is an open-source machine learning library for the Python programming language.
- Scikit-learn was initially developed by David Cournapeau as a Google Summer of Code project in 2007.
- Scikit-learn is written mainly in Python, but some core algorithms are written in Cython for performance. Cython is a programming language that is a superset of Python and compiles to C, giving C-like performance.
- Scikit-learn is built on NumPy and SciPy (Scientific Python), which must be installed before you can use scikit-learn.
- Scikit-learn can be used to build a variety of regression, classification, and clustering models.
- Scikit-learn implements a range of machine learning, preprocessing, cross-validation, and visualization tools.
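- Scikit-learn estimators generally expect input data to be numeric and stored as NumPy arrays or SciPy sparse matrices. Other array-like inputs, such as pandas DataFrames, are accepted when they can be converted to numeric arrays.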
- Scikit-learn provides regression metrics, including mean absolute error, mean squared error, and the R² score.
- Scikit-learn provides classification metrics and reports, including the accuracy score, the classification report, and the confusion matrix.
- Scikit-learn provides clustering metrics, including the adjusted Rand index, homogeneity, and the V-measure.
- Scikit-learn provides tools for tuning model hyperparameters using grid search and randomized parameter optimization.
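
The points above about numeric array input, evaluation metrics, and grid search can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data is synthetic (generated with scikit-learn's own dataset helpers), and the choice of models (logistic regression, linear regression, k-means) and of the parameter grid is arbitrary rather than a recommendation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs, make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    adjusted_rand_score,
    classification_report,
    confusion_matrix,
    homogeneity_score,
    mean_absolute_error,
    mean_squared_error,
    r2_score,
    v_measure_score,
)
from sklearn.model_selection import GridSearchCV, train_test_split

# --- Classification, tuned with grid search ---
# Synthetic numeric data returned as NumPy arrays (illustrative only).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Try a few values of the regularization strength C with 5-fold cross-validation.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    cv=5,
)
search.fit(X_train, y_train)
y_pred = search.predict(X_test)  # uses the best estimator found by the search

acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print("Best parameters:", search.best_params_)
print("Accuracy:", acc)
print("Confusion matrix:\n", cm)
print(classification_report(y_test, y_pred))

# --- Regression metrics ---
Xr, yr = make_regression(n_samples=200, n_features=3, noise=5.0, random_state=0)
Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    Xr, yr, test_size=0.25, random_state=0
)
reg = LinearRegression().fit(Xr_train, yr_train)
yr_pred = reg.predict(Xr_test)

mae = mean_absolute_error(yr_test, yr_pred)
mse = mean_squared_error(yr_test, yr_pred)
r2 = r2_score(yr_test, yr_pred)
print("MAE:", mae, "MSE:", mse, "R2:", r2)

# --- Clustering metrics (compare predicted clusters with known labels) ---
Xc, yc = make_blobs(n_samples=150, centers=3, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Xc)

ari = adjusted_rand_score(yc, labels)
hom = homogeneity_score(yc, labels)
vm = v_measure_score(yc, labels)
print("ARI:", ari, "Homogeneity:", hom, "V-measure:", vm)
```

Randomized parameter optimization follows the same pattern: under the same assumptions, `RandomizedSearchCV` could replace `GridSearchCV`, taking parameter distributions or lists to sample from instead of an exhaustive grid.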