Scaling :
In machine learning, scaling refers to preparing data and models so that they can be used in larger or more complex settings.
It covers a variety of techniques that make it possible to process data and models effectively and efficiently.
Feature scaling :
There are two common ways to scale features :
- Normalization : rescales the feature values to be between 0 and 1.
- Standardization : rescales the feature values to have a mean of 0 and a standard deviation of 1.
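As a quick illustration of these two approaches, here is a minimal sketch using Scikit-Learn's MinMaxScaler and StandardScaler on a small made-up array; the sample values are purely illustrative.

import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# A tiny made-up feature matrix: 4 samples, 2 features on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0],
              [4.0, 500.0]])

# Normalization: each feature is rescaled into the range [0, 1].
print(MinMaxScaler().fit_transform(X))

# Standardization: each feature ends up with mean 0 and standard deviation 1.
print(StandardScaler().fit_transform(X))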
Standardization is carried out using the Scikit-Learn library's StandardScaler class.
This class scales the data by subtracting the mean of each feature and dividing by its standard deviation.
Codeblock E.1. StandardScaler demonstration.
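A minimal sketch of this step, assuming the data lives in a local file named diabetes.csv with numeric feature columns and an Outcome label column (the filename is an assumption):

import pandas as pd
from sklearn.preprocessing import StandardScaler

# Load the dataset; the filename is an assumption made for this sketch.
df = pd.read_csv("diabetes.csv")

# Keep the label aside and scale only the feature columns.
features = df.drop(columns=["Outcome"])

# StandardScaler subtracts each feature's mean and divides by its standard deviation.
scaler = StandardScaler()
scaled = scaler.fit_transform(features)

# Rebuild a DataFrame with the scaled features plus the original Outcome column.
scaled_df = pd.DataFrame(scaled, columns=features.columns)
scaled_df["Outcome"] = df["Outcome"].values

print(scaled_df.head())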
After scaling, a new Pandas DataFrame is constructed containing the scaled features along with the original Outcome column, which indicates whether or not the subject has diabetes.
Feature scaling ensures that each feature receives the same weight during model training, which can lead to better performance and more accurate predictions.
Standardisation :
Each data point is standardized by subtracting the mean of the dataset and then dividing by its standard deviation. This method is helpful when the data has a Gaussian distribution.
Normalisation :
Each data point is normalized by subtracting the dataset's minimum value and dividing by the range (maximum minus minimum), so that the results stay within a specific range (usually 0 to 1). This strategy is helpful when the scale of the data varies significantly.
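The two formulas can also be written out directly; the sketch below applies them to a small made-up NumPy array.

import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # made-up sample values

# Standardization: subtract the mean, then divide by the standard deviation.
standardized = (x - x.mean()) / x.std()

# Normalization (min-max): subtract the minimum, then divide by the range,
# which maps every value into the interval [0, 1].
normalized = (x - x.min()) / (x.max() - x.min())

print(standardized)  # mean ~0, standard deviation ~1
print(normalized)    # values between 0 and 1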
Both normalization and standardization are used to transform the data before machine learning techniques are applied. Which of the two to use depends on the specific requirements of the problem at hand and the characteristics of the data.
---- Summary ----
By now you know all the basics of scaling.
Scaling is the process of transforming data so that it fits within a specific range.
Standardization scaling, which scales the data to have a mean of 0 and a standard deviation of 1, is a common method used in many machine learning algorithms.
Scaling can improve the performance of machine learning models and help prevent overfitting.
It is important to scale the training and test data using the same method and parameters (see the sketch after this summary).
Scaling is just one of many preprocessing steps that can be applied to data before machine learning.
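As a minimal sketch of the point above about using the same method and parameters for training and test data: the scaler is fitted on the training split only, and the same learned mean and standard deviation are reused on the test split (the data here is randomly generated purely for illustration).

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Made-up data, used only to illustrate the fit/transform pattern.
X = np.random.rand(100, 3)
y = np.random.randint(0, 2, size=100)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std from the training set only
X_test_scaled = scaler.transform(X_test)        # reuse the same parameters on the test set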
Copyright © 2022-2023. Anoop Johny. All Rights Reserved.