Simple Linear Regression

Simple linear regression can be used to model a linear relationship between one response variable (the output variable) and one feature representing an explanatory variable (the input variable).

For instance, suppose you want to know the price of a pizza. You might simply look at a menu. Instead, we will use simple linear regression to predict the price of a pizza from an attribute of the pizza that we can observe, that is, an explanatory variable. Let's model the relationship between the size of a pizza and its price. First, we will write a program with scikit-learn that can predict the price of a pizza given its size.

Let's assume you have recorded the diameters and prices of pizzas that you have previously eaten in your pizza journal. These observations comprise our training data:

Each instance is one pizza.

Training instance    Diameter in inches    Price in dollars
1                    6                     7
2                    8                     9
3                    10                    13
4                    14                    17.5
5                    18                    18
       
import numpy as np
import matplotlib.pyplot as plt

# X represents the features of our training data, the diameters of the pizzas
# A scikit-learn convention is to name the matrix of feature vectors X.
# Uppercase letters indicate matrices, and lowercase letters indicate vectors.

X = np.array([[6], [8], [10], [14], [18]])  # already 2-D: one row per pizza, one column per feature
y = [7, 9, 13, 17.5, 18]  # y is a vector representing the prices of the pizzas.

plt.figure()
plt.title('Pizza price plotted against diameter')
plt.xlabel('Diameter in inches')
plt.ylabel('Price in Dollars')
# Draw markers only; the original mixed the 'k.' format string with marker/color keywords, which is redundant
plt.plot(X, y, 'o', markerfacecolor='red', color='black', markersize=20)
plt.axis([0,25,0,25])
plt.grid(True)
plt.show()

 
Figure: Pizza price plotted against diameter (simple linear regression, @1marufbillah)
We can see from the plot of the training data that there is a positive relationship between the diameter of a pizza and its price, which should be corroborated by our own pizza-eating experience. As the diameter of a pizza increases, its price generally increases. The following pizza price predictor program models this relationship using simple linear regression. Let's review the program and discuss how simple linear regression works:
       
from sklearn.linear_model import LinearRegression
model = LinearRegression() # create an instance of the estimator
model.fit(X,y) # Fit the model on the training data

# Predict the price of a pizza with a diameter that has never been seen before
test_pizza = np.array([[12]])
predicted_price = model.predict(test_pizza)[0]
print('A 12" pizza should cost: $%.2f' % predicted_price)

 A 12" pizza should cost: $13.68 
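To make "how simple linear regression works" concrete, here is a minimal sketch of the ordinary least squares fit for a single feature. It assumes the model is the line price = intercept + slope * diameter, with the slope computed as the covariance of diameter and price divided by the variance of diameter. The names x, slope, and intercept are illustrative and not part of the original program; the result is compared with the parameters scikit-learn exposes as coef_ and intercept_.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Training data from the pizza journal
X = np.array([[6], [8], [10], [14], [18]])
y = np.array([7, 9, 13, 17.5, 18])

# Closed-form least squares for one feature (illustrative names):
#   slope     = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
#   intercept = mean(y) - slope * mean(x)
x = X.ravel()
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

# Compare with the parameters estimated by scikit-learn
model = LinearRegression().fit(X, y)
print('manual:  slope=%.4f intercept=%.4f' % (slope, intercept))
print('sklearn: slope=%.4f intercept=%.4f' % (model.coef_[0], model.intercept_))

# Prediction for a 12" pizza using the hand-computed line
print('A 12" pizza should cost: $%.2f' % (intercept + slope * 12))
```

Both approaches should give the same line, so the hand-computed prediction for a 12" pizza matches the $13.68 printed above.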

Evaluating The Model

Test instance    Diameter in inches    Observed price in dollars    Predicted price in dollars
1                8                     11                           9.7759
2                9                     8.5                          10.7522
3                11                    15                           12.7048
4                16                    18                           17.5863
5                12                    11                           13.6811
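The predicted prices for the test instances can be reproduced with a short sketch like the following. It assumes the model is trained on the five pizzas from the training data and then asked to predict the five test diameters; the residual column (observed minus predicted) is an addition for illustration and does not appear in the original table.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Training data (diameters in inches, prices in dollars)
X_train = np.array([[6], [8], [10], [14], [18]])
y_train = [7, 9, 13, 17.5, 18]

# Test data: diameters and the prices actually observed
X_test = np.array([[8], [9], [11], [16], [12]])
y_test = [11, 8.5, 15, 18, 11]

model = LinearRegression().fit(X_train, y_train)
predictions = model.predict(X_test)

# Print one row per test instance, with the residual (observed - predicted)
for diameter, observed, predicted in zip(X_test.ravel(), y_test, predictions):
    print('%2d in: observed $%5.2f, predicted $%5.2f, residual %+.2f'
          % (diameter, observed, predicted, observed - predicted))
```

Large residuals, such as the one for the 12-inch test pizza, are what pull the model's evaluation score down.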

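The score method of LinearRegression returns the model's R-squared value. R-squared measures the proportion of the variance in the response variable that the model explains; here it is computed on the test set: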

       
import numpy as np
from sklearn.linear_model import LinearRegression

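# Training data: pizza diameters (features) and prices (responses)
X_train = np.array([6, 8, 10, 14, 18]).reshape(-1, 1)
y_train = [7, 9, 13, 17.5, 18]

# Held-out test data, used only for evaluation
X_test = np.array([8, 9, 11, 16, 12]).reshape(-1, 1)
y_test = [11, 8.5, 15, 18, 11]

model = LinearRegression()
model.fit(X_train, y_train)
# score returns R-squared for the data passed to it
r_squared = model.score(X_test, y_test)
print(r_squared)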

 
0.6620052929422553
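To see where this number comes from, the following sketch computes R-squared by hand as 1 - SS_res / SS_tot, where SS_res is the sum of squared differences between observed and predicted test prices and SS_tot is the sum of squared differences between observed test prices and their mean. The variable names ss_res, ss_tot, and r_squared_manual are illustrative; the result is compared against the score method.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X_train = np.array([6, 8, 10, 14, 18]).reshape(-1, 1)
y_train = np.array([7, 9, 13, 17.5, 18])
X_test = np.array([8, 9, 11, 16, 12]).reshape(-1, 1)
y_test = np.array([11, 8.5, 15, 18, 11])

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Residual sum of squares: error left over after using the model
ss_res = np.sum((y_test - y_pred) ** 2)
# Total sum of squares: error of always predicting the mean test price
ss_tot = np.sum((y_test - y_test.mean()) ** 2)

r_squared_manual = 1 - ss_res / ss_tot
print(r_squared_manual)
print(model.score(X_test, y_test))  # should agree with the manual value
```

An R-squared of about 0.66 means the fitted line accounts for roughly two thirds of the variance in the test prices.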
