Sai Geetha M N

Regression Algorithms

What is Regression?

Regression is a statistical model/method used to determine the strength and character of the relationship between one dependent variable and one or more independent variables.


Suppose you have data on the electricity consumption of an area along with the number of houses in that area, and you are asked to estimate the consumption for a larger area with more houses. You could simply take the average consumption per house and extrapolate it, and you would not be completely wrong. Regression goes a step further and improves this guess by finding a relationship between the number of houses and the electricity consumption. When that relationship is modelled as a straight line, the method is called Linear Regression.


Mathematically, it is represented by the equation y = mx + c, where y is the electricity consumed, x is the number of houses, m is the slope of the line, and c is the intercept. If we find the right m and c, we have found the relation between y and x. It is that simple.
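To make "finding the right m and c" concrete, here is a minimal sketch using the standard least-squares formulas. The function name fit_line and the data points are hypothetical, made up purely for illustration; they are not measurements from any real area.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (m, c) for the least-squares line y = m*x + c."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope: how x and y vary together, divided by how much x varies.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    c = y_bar - m * x_bar
    return m, c


# Hypothetical data: number of houses and electricity consumed (kWh).
houses = [20, 40, 60, 80]
consumption = [5.0, 9.0, 13.0, 17.0]

m, c = fit_line(houses, consumption)

# Use the fitted line to estimate consumption for an area with 100 houses.
predicted = m * 100 + c
print(f"m = {m:.3f}, c = {c:.3f}, prediction for 100 houses = {predicted:.2f}")
```

Because the made-up points lie exactly on a line, the fit recovers that line. Real data would scatter around the line, and m and c would be the values that minimise the squared errors.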


If you observe the hypothetical graph here, you will see that given 100 houses in an area, the line lets you predict that the consumption may be around 20 to 21 kWh.


Here, x is called the independent or predictor variable.

y is called the dependent or target variable.


In the above example, the number of houses in an area is the independent or predictor variable and the electricity consumed by that area is the dependent or target variable.



What are the Types of Regression Algorithms?

Conceptually, there are 2 types:

  • Simple Linear Regression

  • Multi-Linear Regression

Simple Linear Regression (SLR):

If there is only ONE independent variable involved, then it is a simple linear regression. In the above example, if only the 'Number of houses' determines the electricity consumption, then it is an SLR problem.


Multi-Linear Regression (MLR):

If more than one predictor determines the target variable, it becomes an MLR problem.

In the above example, in reality, the size of the houses, the number of electrical appliances, the number of residents, the location of the houses, the external temperature, and many more factors influence the electricity consumption. Hence, this becomes an MLR problem.
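As a rough sketch of how this extends to several predictors, the code below fits y = c + m1*x1 + m2*x2 by solving the normal equations with plain Python. The function name fit_mlr, the choice of two predictors (number of houses and average residents per house), and all the numbers are assumptions made for illustration only.

```python
def fit_mlr(rows, ys):
    """Least-squares coefficients [c, m1, m2, ...] via the normal equations."""
    # Prepend a constant 1 to each row so the first coefficient is the intercept.
    X = [[1.0] + list(r) for r in rows]
    n = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(n)]
        + [sum(row[i] * y for row, y in zip(X, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


# Hypothetical data: (number of houses, average residents per house).
predictors = [(20, 2), (40, 3), (60, 2), (80, 4), (100, 3)]
consumption = [6.0, 10.5, 14.0, 19.0, 22.5]

intercept, per_house, per_resident = fit_mlr(predictors, consumption)
print(f"c = {intercept:.3f}, m1 = {per_house:.3f}, m2 = {per_resident:.3f}")
```

Each coefficient describes how the target changes when that predictor changes and the others are held fixed. In practice a library routine would normally be used instead of hand-written elimination; this version only shows the idea.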


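Beyond these two, several other algorithms can be applied to regression problems or carry the regression name:

  • Polynomial Regression

  • Logistic Regression (despite the name, it is typically used for classification rather than for predicting a continuous value)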

  • Support Vector Regression

  • Decision Tree Regression

  • Random Forest Regression

  • Ridge Regression

  • Lasso Regression

  • Generalized Linear Regression

When do you use Regression Algorithms?


There are mainly three uses of regression analysis:

  1. Predictions - of a continuous dependent variable based on the predictors

  2. Trend Forecasting - the best fit line also helps in understanding the trend

  3. Determining the strength of the predictors - if there is no strong relationship between the target and the predictors, it becomes visible through regression analysis
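One common way to put a number on the strength of the relationship in the third use is the R-Squared statistic: the share of the variation in the target that the model's predictions explain. The sketch below is a minimal illustration; the function name r_squared and both sets of predictions are hypothetical.

```python
def r_squared(actual, predicted):
    """Fraction of the variance in `actual` explained by `predicted`."""
    y_bar = sum(actual) / len(actual)
    # Variation left unexplained by the model.
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    # Total variation around the mean.
    ss_tot = sum((y - y_bar) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


# Hypothetical observed consumption values.
actual = [5.0, 9.0, 13.0, 17.0]

# Predictions from a line that tracks the data closely.
close_fit = [5.5, 8.5, 13.5, 16.5]
# Predictions that ignore the predictor and always return the mean.
mean_only = [11.0, 11.0, 11.0, 11.0]

strong = r_squared(actual, close_fit)
weak = r_squared(actual, mean_only)
print(f"close fit R-squared = {strong:.4f}, mean-only R-squared = {weak:.4f}")
```

A value near 1 suggests the predictors explain most of the variation in the target, while a value near 0 suggests they explain very little.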

Associated Concepts to go deeper into Linear Regression:

  • Data Preparation and cleansing

  • Concept of Cost Functions

  • Optimization of Cost Functions

  • Assumptions of Simple Linear Regression

  • Hypothesis Testing

  • p-values of coefficients

  • Residual Analysis

  • Various statistics like R-Squared, Adjusted R-Squared, F-Statistic
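As a small taste of the cost function and optimization concepts listed above, the sketch below uses mean squared error as the cost and plain gradient descent to minimise it. Both are assumptions chosen for illustration (they are common choices, not the only ones), and the function names, learning rate, step count, and data are all hypothetical.

```python
def mse_cost(m, c, xs, ys):
    """Mean squared error of the line y = m*x + c on the data."""
    return sum((m * x + c - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Minimise the MSE cost by repeatedly stepping against its gradient."""
    m, c = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [m * x + c - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to m and c.
        grad_m = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_c = 2 * sum(errors) / n
        m -= learning_rate * grad_m
        c -= learning_rate * grad_c
    return m, c


# Hypothetical data lying on the line y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

start_cost = mse_cost(0.0, 0.0, xs, ys)
m, c = gradient_descent(xs, ys)
final_cost = mse_cost(m, c, xs, ys)
print(f"m = {m:.4f}, c = {c:.4f}, cost {start_cost:.2f} -> {final_cost:.2e}")
```

The cost measures how wrong a given m and c are, and optimization is the search for the pair that makes it smallest. The learning rate here was picked to suit this tiny data set; other data would likely need a different value or scaled inputs.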

Conclusion

Regression is one of the most fundamental models, and it works very well in a large number of use cases for prediction and trend forecasting. Multi-Linear Regression (MLR) in particular is used in innumerable scenarios and would come in handy for any Data Science Manager or ML developer.

It also mathematically provides the strength of the relationship between the target and predictors.


