Good morning to all! Here is my eighth blog for you, on the topic of Supervised Learning, one of the types of algorithms used in Machine Learning. Before this we covered machine learning and its types, and the real-life opportunities and applications of Machine Learning. Today we will look at Supervised Learning in detail, along with some of the types of algorithms it uses. So here we start:
What is Supervised Learning?
In supervised learning, sample labelled data is given to the machine learning system as training material, and then it uses that information to predict the outcome.
The system builds a model using labelled data to comprehend the datasets and learn about each one. After training and processing, the model is tested by utilising sample data to see if it accurately predicts the desired outcome.
In supervised learning, the main objective is to map input data to output data. The foundation of supervised learning is supervision, just as a pupil studies under a teacher's supervision. Spam filtering is a prime example of supervised learning. Or you can say:
Supervised learning is the process of training an algorithm to learn how to map an input to a specific output, using the labelled datasets you have gathered. If the mapping is accurate, the algorithm has trained effectively; if not, you modify the algorithm as needed so that it can learn properly. Supervised learning algorithms can then help make predictions about upcoming, previously unseen data.
Examples of Supervised Learning
Example 1: Supervised learning may be used to forecast housing prices. Data is required with specifics on the house's dimensions, cost, number of rooms, yard, and other amenities. To train the model, we need information on these attributes for thousands of homes. The trained supervised machine learning model can then be used to estimate the cost of a new home.
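The housing example above can be sketched in a few lines of code. This is a minimal illustration, not a real model: it fits a simple straight line (price against size only) using the closed-form least-squares solution, and all the sizes and prices are made-up numbers.

```python
# A minimal sketch of the housing-price idea: fit a simple linear
# regression (price vs. size) by least squares, then predict the
# price of an unseen house. All numbers are invented for illustration.

def fit_line(xs, ys):
    """Return (intercept a, slope b) minimising squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b

# Labelled training data: house size (sq ft) -> price (thousands)
sizes = [1000, 1500, 2000, 2500, 3000]
prices = [200, 250, 300, 350, 400]  # follows price = 100 + 0.1 * size

a, b = fit_line(sizes, prices)
estimate = a + b * 1800  # estimate an unseen 1800 sq ft house
print(round(estimate, 2))  # -> 280.0 for this made-up data
```

A real housing model would of course use many more features (rooms, yard, location) and far more data, but the supervised pattern is the same: labelled examples in, a predictive mapping out.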
Example 2: Most businesses also utilise supervised machine learning algorithms for spam detection. Data scientists label several variables to distinguish between official mail and spam mail. These labels are used to train the model so that it can quickly distinguish between spam and non-spam communication and recognise patterns in new data.
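The spam example can also be sketched. The code below is a toy, hedged illustration of the idea (a heavily simplified, add-one-smoothed word-count classifier, not a production filter), and the four training messages are invented:

```python
# A toy sketch of spam filtering as supervised learning: count how
# often each word appears in labelled spam vs. ham messages, then
# score a new message by which class its words fit better.
import math
from collections import Counter

train = [
    ("win money now", "spam"),
    ("free prize win", "spam"),
    ("meeting schedule today", "ham"),
    ("project report attached", "ham"),
]

counts = {"spam": Counter(), "ham": Counter()}
for text, label in train:
    counts[label].update(text.split())

vocab = set(w for c in counts.values() for w in c)

def score(text, label):
    # Log-probability of the words under the label, with add-one smoothing
    total = sum(counts[label].values())
    return sum(
        math.log((counts[label][w] + 1) / (total + len(vocab)))
        for w in text.split()
    )

def classify(text):
    return max(("spam", "ham"), key=lambda lbl: score(text, lbl))

print(classify("win a free prize"))      # words seen mostly in spam
print(classify("schedule the meeting"))  # words seen mostly in ham
```

Real spam filters use far richer features and much more data, but the supervised recipe is identical: labelled examples teach the model what each class looks like.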
Why is it Important?
Its importance is as follows:
- Learning provides the algorithm with experience, which it can use to produce predictions for brand-new, unforeseen data.
- Experience also aids in enhancing the system’s performance.
- The methods for supervised learning are also capable of handling real-world computational problems.
Now that we are aware of how important it is, let’s look at the different kinds of this learning and their corresponding algorithms.
Types of Supervised Learning
Supervised learning has been broadly classified into two types: Regression and Classification. In this blog we will focus on Regression.
Regression is a type of learning that takes information from labelled datasets and applies it to fresh data fed into the algorithm in order to predict a continuous-valued output. It is employed when a numerical output, such as money or height, is required.
Regression analysis describes the relationship between a dependent (target) variable and one or more independent (predictor) variables. More specifically, it enables us to understand how the value of the dependent variable changes in relation to an independent variable while the other independent variables are held constant. It forecasts real, continuous values such as temperature, age, salary, and cost.
Regression is a supervised learning method that enables us to predict the continuous output variable based on one or more predictor variables and aids in determining the correlation between variables. It is mostly used for forecasting, time series modelling, prediction, and establishing the causal connection between variables.
Regression involves creating a graph that connects the variables that best fit the given data points. The machine learning model can then make predictions about the data using this plot.
In simple words, “Regression shows a line or curve that passes through the datapoints on the target-predictor graph in such a way that the vertical distance between the datapoints and the regression line is minimal.” The distance between the datapoints and the line tells us whether the model has captured a strong relationship or not.
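That "minimum vertical distance" idea can be made concrete with a few lines of code. The sketch below (with invented datapoints) compares the sum of squared vertical distances for two candidate lines; regression is essentially the search for the line with the lowest such score:

```python
# Compare the sum of squared vertical distances (residuals) for two
# candidate lines on the same made-up datapoints; the better-fitting
# line scores lower, which is exactly what regression minimises.

points = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 8.0)]  # roughly y = 2x

def sse(slope, intercept):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)

good_fit = sse(2.0, 0.0)  # the line y = 2x, close to the data
bad_fit = sse(0.5, 1.0)   # a poorly chosen line
print(good_fit < bad_fit)  # regression picks the lower-scoring line
```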
Some instances of regression include:
- Use temperature and other variables to forecast rain
- Identification of Market Trends
- Predicting traffic accidents caused by reckless driving.
Terminologies Related to the Regression Analysis:
- Dependent Variable: The dependent variable in a regression analysis is the key element that we wish to forecast or comprehend. It also goes by the name target variable.
- Independent Variable: The independent variable, sometimes known as a “predictor”, refers to the factors that influence the dependent variable or that are used to forecast its values.
- Outliers: An outlier is an observation with a very low or very high value compared to the other observed values. Outliers should be handled carefully, as they can distort the results.
- Multicollinearity: Multicollinearity is a situation in which the independent variables are highly correlated with one another. Highly correlated predictors cause problems when determining which variable has the greatest impact, so they should not all be kept in the dataset.
- Underfitting and Overfitting: If our algorithm works well on the training dataset but not on the test dataset, the problem is called overfitting. If our algorithm does not perform well even on the training dataset, the problem is called underfitting.
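The overfitting/underfitting bullet can be illustrated with a deliberately silly pair of models (all data invented): a "model" that simply memorises the training set has zero training error but fails on unseen inputs, while a model that always predicts the overall mean does poorly even on the training data.

```python
# Toy illustration of overfitting vs. underfitting.
train = {1: 2, 2: 4, 3: 6, 4: 8}  # underlying rule: y = 2x
test = {5: 10, 6: 12}             # unseen data from the same rule

def mse(model, data):
    """Mean squared error of a model on a labelled dataset."""
    return sum((model(x) - y) ** 2 for x, y in data.items()) / len(data)

memoriser = lambda x: train.get(x, 0)           # overfit: a lookup table
mean_model = lambda x: sum(train.values()) / 4  # underfit: constant 5.0

print(mse(memoriser, train))   # 0.0 on the training data...
print(mse(memoriser, test))    # ...but a large error on unseen data
print(mse(mean_model, train))  # high error even on the training data
```

A well-fitted regression line sits between these extremes: small error on the training data and, crucially, small error on new data too.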
Why do we use Regression Analysis?
Regression is supervised learning that aids in the prediction of a continuous variable, as was already mentioned. In the real world, there are many situations where we need to make predictions about the future, including those involving the weather, sales, marketing trends, and other factors. In these situations, we need technology that can make forecasts more precisely.
Regression supervised learning, a statistical technique used in machine learning and data science, is therefore necessary in this situation. Additional justifications for adopting regression analysis are listed below:
- Regression estimates the relationship between the target and the independent variable.
- It is used to find the trends in data.
- It helps to predict real/continuous values.
- By performing the regression, we can confidently determine the most important factor, the least important factor, and how each factor is affecting the other factors.
Types of Regression
Regressions come in a variety of forms, and they are employed in data science and machine learning. The significance of each type varies depending on the situation, but fundamentally, all regression techniques examine the impact of the independent variable on the dependent variables. Here, we’ll talk about a few significant types of regression, which are listed below:
- Linear Regression
- Logistic Regression
- Polynomial Regression
Linear Regression
- Linear regression is a statistical regression technique used for predictive analysis. It is a type of supervised learning.
- One of the most basic and straightforward algorithms, it uses regression to illustrate the relationship between continuous variables.
- It is used for regression problems in machine learning.
- The term “linear regression” refers to a statistical method that models a linear relationship between the independent variable (X-axis) and the dependent variable (Y-axis).
- If there is only one input variable (x), it is called simple linear regression; if there are multiple input variables, it is known as multiple linear regression.
Below is the mathematical equation for linear regression:
Y = a + bX
Here, Y = dependent variable (target variable),
X = independent variable (predictor variable),
a and b are the linear coefficients (a is the intercept and b is the slope).
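For the multiple-linear-regression case mentioned above, the model extends to Y = a + b1·X1 + b2·X2, and the coefficients can again be found by least squares. Here is a brief sketch with invented numbers (the output is built exactly from a = 10, b1 = 2, b2 = 3, so the solver should recover those values):

```python
# Multiple linear regression with two inputs, solved by least squares.
import numpy as np

# Two input variables (e.g. house size and number of rooms) and an
# output that follows Y = 10 + 2*X1 + 3*X2 exactly (made-up numbers).
X1 = np.array([1.0, 2.0, 3.0, 4.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0])
Y = 10 + 2 * X1 + 3 * X2

# Design matrix: a column of ones (for the intercept a), then X1, X2.
A = np.column_stack([np.ones_like(X1), X1, X2])
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
a, b1, b2 = coef
print(round(a, 3), round(b1, 3), round(b2, 3))  # recovers 10, 2, 3
```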
Some popular applications of linear regression are:
- Analyzing trends and sales estimates
- Salary forecasting
- Real estate prediction
- Arriving at ETAs in traffic.
Logistic Regression
- Logistic regression is another supervised learning algorithm, used to solve classification problems. In classification problems, the dependent variable is in a binary or discrete format, such as 0 or 1.
- The logistic regression algorithm works with categorical variables such as 0 or 1, Yes or No, True or False, Spam or Not Spam, etc.
- It is a predictive analysis algorithm that works on the concept of probability.
- Logistic regression is a type of regression, but it differs from the linear regression algorithm in terms of how it is used.
- Logistic regression uses the sigmoid function (also called the logistic function) to model the data. The function can be represented as:
f(x) = 1 / (1 + e^(-x))
where:
- f(x) = output, a value between 0 and 1
- x = input to the function
- e = base of the natural logarithm
When we provide input values (data) to the function, it produces an S-shaped curve (the sigmoid curve).
- The idea of threshold levels is used: values above the threshold are rounded up to 1, while values below it are rounded down to 0.
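The sigmoid-plus-threshold mechanics fit in a few lines. This small sketch just evaluates the formula above and applies a 0.5 threshold:

```python
# Squash any real input into (0, 1) with the logistic function,
# then round against a threshold to get a class label.
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def predict(x, threshold=0.5):
    return 1 if sigmoid(x) >= threshold else 0

print(sigmoid(0))  # 0.5, the centre of the S-curve
print(predict(3))  # large positive input -> class 1
print(predict(-3)) # large negative input -> class 0
```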
There are three types of logistic regression:
- Binary (0/1, pass/fail)
- Multinomial (cats, dogs, lions)
- Ordinal (low, medium, high)
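To tie the pieces together, here is a hedged, minimal binary logistic regression trained by gradient descent on one feature. The data is invented (small x values labelled 0, large ones labelled 1), and this is the bare mechanics rather than a production implementation:

```python
# Fit binary logistic regression with plain gradient descent.
import math

xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]  # made-up feature values
ys = [0, 0, 0, 1, 1, 1]              # labels: small -> 0, large -> 1

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

w, b = 0.0, 0.0  # weight and bias, learned below
lr = 0.5         # learning rate

for _ in range(2000):  # gradient descent on the log-loss
    grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys))
    grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys))
    w -= lr * grad_w / len(xs)
    b -= lr * grad_b / len(xs)

predict = lambda x: 1 if sigmoid(w * x + b) >= 0.5 else 0
print(predict(1.2), predict(3.8))  # a small and a large unseen input
```

After training, the learned S-curve separates the two groups, so an unseen small input is classified as 0 and a large one as 1.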
Polynomial Regression
- Polynomial regression is a regression technique that uses a linear model to represent a non-linear dataset.
- It is comparable to multiple linear regression, but it fits a non-linear curve between the value of x and the corresponding conditional values of y.
- Consider a dataset containing datapoints that are not linearly distributed. In this scenario, linear regression will not provide the best fit; such datapoints require polynomial regression.
- In polynomial regression, the original features are transformed into polynomial features of a given degree and then modelled using a linear model, which means the datapoints are best fitted by a polynomial curve.
- The formula for polynomial regression is derived from the formula for linear regression: the linear regression equation Y = b0 + b1x is extended to the polynomial regression equation Y = b0 + b1x + b2x² + b3x³ + … + bnxⁿ.
- Here Y is the predicted/target output, b0, b1, …, bn are the regression coefficients, and x is our independent/input variable.
- The model is still linear because it is linear in the coefficients b0, b1, …, bn, even though the features themselves (x², x³, and so on) are non-linear.
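A short sketch of this idea, with invented data: the points below follow an exact quadratic rule Y = 1 + 3x + 2x², so a straight line fits poorly but a degree-2 polynomial (still linear in its coefficients b0, b1, b2) recovers the rule exactly.

```python
# Fit a degree-2 polynomial to quadratic data; the fit is a linear
# least-squares problem in the coefficients, as the text explains.
import numpy as np

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = 1 + 3 * x + 2 * x ** 2  # true rule: Y = 1 + 3x + 2x^2

b2, b1, b0 = np.polyfit(x, y, deg=2)  # highest power first
print(round(b0, 3), round(b1, 3), round(b2, 3))  # recovers 1, 3, 2
```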
That covers the Regression part of Supervised Learning. We have seen what it is, why it is important, what its types are, and the key ideas of regression analysis. In the next blog we will cover the remaining parts of Supervised Learning, so stay tuned for further updates and check out the blogs on other topics on our site.