A perceptron regressing the revenue of an ice cream stand based on temperature.
If we observe a significant linear correlation between two variables, we can say that one can be predicted from the other. There may well be such a correlation between the temperature and the revenue of an ice cream stand, and we aim to verify this using our dataset. The dataset has only two variables: "Temperature" in degrees Celsius, the independent variable, and "Revenue" in dollars, the dependent variable. There are 500 instances with no missing or corrupt values. When the two variables are plotted against each other on a scatter plot, the points form a shape close to a diagonal line, making their linear correlation apparent:
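A minimal sketch of how such a scatter plot could be produced is shown below; the CSV file name and column names are placeholders, not necessarily those of the actual project files.

```python
# Sketch: scatter plot of revenue vs. temperature.
# "temperature_revenue.csv", "Temperature" and "Revenue" are assumed names.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("temperature_revenue.csv")

plt.scatter(df["Temperature"], df["Revenue"], s=10, alpha=0.6)
plt.xlabel("Temperature (°C)")
plt.ylabel("Revenue ($)")
plt.title("Ice cream revenue vs. temperature")
plt.show()
```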
A multitude of algorithms can model the linear correlation between two variables, such as Ordinary Least Squares (OLS) linear regression and perceptron linear regression. With only one explanatory variable, any such model falls into the setting of simple linear regression.
Simple linear regression uses a single independent variable to model a dependent variable. It is applicable to datasets where a single variable has an almost perfect linear correlation with the dependent variable, or where only one independent variable is available. These linear models are easy to train and very efficient.
Simple linear regressors can be described by the following equation:

y = a·x + b

where a is the coefficient of the independent variable x and b is the intercept term (or bias).
Perceptrons, neural networks consisting of a single neuron, have been used as regressors since their invention. Sometimes referred to as Adaline models, these perceptrons have no activation function, so their output is taken directly as the regression result for a given input. Note that the equation expressed by this perceptron, given its weight and bias, is exactly the equation for simple linear regressors described above.
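As a sketch of the idea, such a single neuron can be trained with plain gradient descent on the mean squared error. The data, learning rate, and epoch count below are illustrative only, not the project's actual settings.

```python
# Minimal single-neuron perceptron regressor (no activation function),
# trained by gradient descent on the mean squared error.
# Synthetic data and hyperparameters are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 40, size=500)            # temperatures in °C
y = 25 * x + 20 + rng.normal(0, 20, 500)    # synthetic "revenue"

# Standardize the input so a plain learning rate behaves well.
x_std = (x - x.mean()) / x.std()

w, b = 0.0, 0.0          # weight (slope) and bias (intercept)
lr = 0.1                 # learning rate
for _ in range(200):     # epochs
    y_hat = w * x_std + b             # forward pass: y = w·x + b
    error = y_hat - y
    w -= lr * (error * x_std).mean()  # gradient step for w
    b -= lr * error.mean()            # gradient step for b

print(f"learned weight={w:.2f}, bias={b:.2f}")
```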
In this project, we train a perceptron regressor to model the observed (and plotted) linear correlation between temperature and ice cream revenue. We make sure that six main linear regression assumptions (out of the many that exist) are respected:
- There is a linear correlation between the independent and the dependent variables;
- Multivariate normality, i.e., the set of all variables must follow a multivariate normal distribution;
- No multicollinearity, meaning that independent variables should not be highly correlated with each other;
- No autocorrelation, which means that the prediction error for any instance must not be correlated with the prediction errors of other instances;
- Normality of residuals, as prediction errors are assumed to follow a normal distribution;
- Homoscedasticity, so that prediction errors have a constant variance.
We verify that the assumptions are met using visual diagnostics of the data. For example, the chi-squared Q-Q (probability) plot shows whether the squared Mahalanobis distances between the data instances and the dataset centroid follow a chi-squared distribution. If they do, the set of all variables in our dataset is multivariate normal, satisfying linear regression's assumption 2.
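A minimal sketch of this diagnostic is given below, assuming the data is in the pandas DataFrame `df` from the earlier sketch with the two columns named above.

```python
# Sketch: chi-squared Q-Q plot for squared Mahalanobis distances.
# If the points fall on the reference line, the data is consistent with
# a multivariate normal distribution (assumption 2).
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

X = df[["Temperature", "Revenue"]].to_numpy()   # df from the earlier sketch
mean = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))

diff = X - mean
d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)  # squared Mahalanobis distances
d2.sort()

# Theoretical chi-squared quantiles with k = 2 degrees of freedom (2 variables).
n = len(d2)
quantiles = stats.chi2.ppf((np.arange(1, n + 1) - 0.5) / n, df=X.shape[1])

plt.scatter(quantiles, d2, s=10)
plt.plot(quantiles, quantiles, color="red")   # reference line y = x
plt.xlabel("Chi-squared quantiles")
plt.ylabel("Squared Mahalanobis distances")
plt.title("Q-Q plot for multivariate normality")
plt.show()
```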
In the end, our perceptron regressor achieves an almost perfect R² score of 0.98. Its mean absolute error is around 18, less than 2% of the revenue range in the dataset: max(revenue) - min(revenue) = 1000 - 10 = 990. The perceptron regressor is also compared with Scikit-learn's OLS regressor and achieves roughly the same performance.
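A sketch of how such a comparison could look is shown below, reusing the `x` and `y` arrays from the perceptron sketch; the reported metric values will of course depend on the real data, not on this synthetic example.

```python
# Sketch: fitting Scikit-learn's OLS regressor on the same data and
# computing the same metrics used to evaluate the perceptron.
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_absolute_error

ols = LinearRegression().fit(x.reshape(-1, 1), y)
y_pred = ols.predict(x.reshape(-1, 1))

print(f"OLS R2:  {r2_score(y, y_pred):.3f}")
print(f"OLS MAE: {mean_absolute_error(y, y_pred):.2f}")
```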