Forecasting in Excel Using Simple Linear Regression

scmprofrutgers
14 Feb 2012 07:59

Summary

TLDR: This video tutorial demonstrates how to perform a regression forecast using simple linear regression in Excel. It guides viewers through calculating the intercept and slope from the first year of data, then applying the regression equation to forecast the full data set. The process includes typing the regression equation into the forecast column, copying the formula down, and evaluating the forecast's accuracy using several error measures and the U statistic. The tutorial emphasizes using absolute cell references so that the intercept and slope stay fixed when the formula is copied.

Takeaways

  • 📈 The video demonstrates how to perform a regression forecast using simple linear regression for data with a growth component.
  • 🔢 It focuses on using only the first year of data (12 months) to estimate the intercept and slope for the regression model.
  • 💼 Excel's Intercept and Slope functions are introduced as an easy method to calculate the regression coefficients.
  • 📋 The video walks through highlighting the correct data ranges for the known y's (demand) and known x's (time periods).
  • 💡 The intercept is calculated first, followed by the slope; the values are 80.667 and 7.577 respectively.
  • 📊 A forecast is created by applying the regression equation to the entire dataset, ensuring the formula is correctly copied down.
  • 📝 The video wraps the regression equation in the ROUND function so that the forecast is expressed in whole units rather than fractions of a unit.
  • 📉 The video explains how to calculate forecast accuracy measures such as error, absolute error, percentage error, and squared error.
  • 📊 It introduces the U statistic as a measure that compares the forecast's accuracy against a naive forecast.
  • 📊 The video concludes by calculating overall accuracy measures: mean error, mean absolute error, mean absolute percentage error, mean squared error, and the U statistic.

Q & A

  • What is the main purpose of the video?

    -The main purpose of the video is to demonstrate how to perform a regression forecast using simple linear regression in Excel, focusing on estimating an intercept and a slope to create a forecast based on data with a growth component.

  • Why is only the first year of data used for creating the regression coefficients?

    -The video does not give a reason. It fits the intercept and slope on the first 12 months and then applies the resulting equation to the full two years of data, so the second year shows how the fitted line performs on data that was not used to estimate it.

  • How does the video suggest calculating the intercept in Excel?

    -The video suggests using the INTERCEPT function in Excel, which requires the known y's (demand) and known x's (time periods) as inputs to calculate the intercept.

  • What function is recommended for calculating the slope in the regression?

    -The video recommends using the SLOPE function in Excel, which similarly requires the known y's (demand) and known x's (time periods) to calculate the slope of the regression line.

  • How is the forecast created in the video?

    -The forecast is created by typing the regression equation into the forecast column in Excel, which involves the intercept, the slope, and the period number, and using the ROUND function to ensure whole units are forecasted.

  • Why is the ROUND function used in the forecast equation?

    -The ROUND function is used to ensure that the forecast does not result in half units, which would not be practical for the data being analyzed.

  • What is the significance of the F4 key mentioned in the video?

    -The F4 key is used to apply absolute cell referencing ($ signs) to the intercept and slope in the formula, ensuring that these values remain constant when the formula is copied down the column.

  • How does the video address the change in slope in the second year of data?

    -The video acknowledges that the slope changes slightly in the second year, indicating that the simple linear regression model fits better for the first year of data, and this is a limitation of the model.

  • What is the purpose of calculating forecast accuracy measures in the video?

    -Calculating forecast accuracy measures, such as forecast error, absolute error, percentage error, and squared error, helps to evaluate the performance of the regression forecast and identify areas where the forecast may be less accurate.

  • How is the Mean Squared Error (MSE) calculated in the video?

    -The Mean Squared Error (MSE) is calculated by taking the average of the squared errors, which is the sum of the squared forecast errors divided by the number of observations.

  • What is the role of the U statistic in the video?

    -The U statistic (often referred to as Theil's U) is used to compare the accuracy of the regression forecast to a naive forecast. It is calculated by taking the square root of the ratio of the sum of squared forecast errors to the sum of squared errors that would have been obtained using the naive method. Because the naive method needs the previous period's demand, all accuracy measures start in period 2.
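
The calculation described in the last answer can be written out as a formula. This is a sketch in notation that is not in the video: $y_t$ is the demand in period $t$, $\hat{y}_t$ is the regression forecast for period $t$, and $n$ is the number of periods. The sums start at $t = 2$ because the naive forecast for a period is the previous period's demand.

```latex
U = \sqrt{\frac{\sum_{t=2}^{n} \left(y_t - \hat{y}_t\right)^2}{\sum_{t=2}^{n} \left(y_t - y_{t-1}\right)^2}}
```

Under this definition, a value below 1 means the regression forecast has a smaller sum of squared errors than the naive forecast over the same periods.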

Outlines

00:00

📊 Introduction to Running a Regression Forecast

The speaker begins by introducing the process of creating a regression forecast using simple linear regression. The focus is on data with an upward growth component, and the aim is to estimate an intercept and a slope. The first year of data, consisting of 12 months, is used to estimate these regression coefficients with Excel's INTERCEPT and SLOPE functions. The known y's (demand) and known x's (time periods) are passed to each function, giving an intercept of 80.667 and a slope of 7.577. The regression equation (intercept plus slope times the period number) is then applied to the entire data set to create the forecast, with absolute references on the intercept and slope cells and the ROUND function to avoid forecasting fractions of a unit.
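
The steps in this part of the video can be sketched outside Excel as well. The following is a minimal Python illustration, not the video's workbook: the demand figures are made up, and the helper names `fit_slope` and `fit_intercept` are hypothetical stand-ins for what Excel's SLOPE and INTERCEPT functions compute.

```python
import math

# Hypothetical 24 months of demand; the video's actual data is not reproduced here.
demand = [88, 97, 102, 113, 118, 127, 134, 140, 151, 155, 166, 170,
          175, 181, 186, 193, 197, 204, 209, 214, 221, 226, 230, 237]
periods = list(range(1, len(demand) + 1))


def fit_slope(ys, xs):
    """Least-squares slope of ys on xs (what Excel's SLOPE returns)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def fit_intercept(ys, xs):
    """Least-squares intercept of ys on xs (what Excel's INTERCEPT returns)."""
    return sum(ys) / len(ys) - fit_slope(ys, xs) * sum(xs) / len(xs)


# Estimate the coefficients from the first 12 months only.
b = fit_slope(demand[:12], periods[:12])
a = fit_intercept(demand[:12], periods[:12])

# Apply the equation to all 24 periods. floor(x + 0.5) rounds halves up,
# which matches Excel's ROUND for the positive values used here.
forecast = [math.floor(a + b * t + 0.5) for t in periods]
```

Because `a` and `b` are computed once and reused for every period, they play the same role as the absolute references (`$` signs) on the intercept and slope cells in the spreadsheet.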

05:02

📈 Evaluating Forecast Accuracy with Error Metrics

The speaker then evaluates the forecast's accuracy using several error metrics, starting in period 2 because the U statistic needs the previous period's demand. The forecast error is calculated as actual demand minus the forecast. The absolute error, the absolute percentage error (absolute error divided by demand), and the squared error are derived from it. A further column holds the squared error that the naive forecast would have produced, which serves as the denominator of the U statistic. The error, absolute error, percentage error, and squared error columns use conditional formatting to show where the largest errors occur. Averaging these columns gives the Mean Error, Mean Absolute Error, Mean Absolute Percentage Error, and Mean Squared Error (MSE). Finally, the U statistic is calculated as the square root of the sum of squared errors divided by the sum of squared naive errors, which measures the forecast's performance relative to the naive method.
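
The per-period columns and their averages can be illustrated with a short Python sketch. The demand and forecast values below are invented for the example, and the variable names are assumptions rather than anything shown in the video.

```python
# Hypothetical actual demand and forecast values for periods 1 to 6.
demand = [100, 108, 115, 121, 130, 136]
forecast = [98, 106, 113, 121, 128, 136]

# Skip period 1, as the video does, so every measure covers the same periods.
pairs = list(zip(demand, forecast))[1:]

errors = [d - f for d, f in pairs]                  # demand minus forecast
abs_errors = [abs(e) for e in errors]               # absolute error
pct_errors = [abs(d - f) / d for d, f in pairs]     # absolute percentage error
sq_errors = [e ** 2 for e in errors]                # squared error


def mean(values):
    """Arithmetic mean, the counterpart of Excel's AVERAGE."""
    return sum(values) / len(values)


me = mean(errors)          # mean error (shows bias)
mae = mean(abs_errors)     # mean absolute error
mape = mean(pct_errors)    # mean absolute percentage error, as a fraction
mse = mean(sq_errors)      # mean squared error
```

The mean error keeps the sign of each error, so positive and negative errors can cancel; the other three measures cannot cancel in this way, which is why they are reported alongside it.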

Keywords

💡Linear Regression

Linear regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables by fitting a linear equation to observed data. In the video, linear regression is used to forecast data that has a growth component, with the goal of estimating an intercept and a slope based on the first year of data. The script mentions using the intercept and slope functions in Excel to perform this regression, which are essential for creating the forecast.

💡Intercept

The intercept in a linear regression model is the value of the dependent variable when all the independent variables are equal to zero. It represents the starting point of the regression line on the y-axis. In the context of the video, the intercept is calculated using the first 12 months of data, which is then used as a part of the regression equation to forecast future values.

💡Slope

The slope in a linear regression model represents the rate of change of the dependent variable with respect to the independent variable. It indicates how much the dependent variable is expected to change when the independent variable increases by one unit. In the video, the slope is calculated using the first 12 months of demand data and the corresponding time periods, and it is used alongside the intercept to predict future values.
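
For reference, the standard least-squares formulas behind the slope and intercept are given below. The notation is an assumption introduced here, not taken from the video: $x_t$ is the period number, $y_t$ is the demand in period $t$, $\bar{x}$ and $\bar{y}$ are their averages over the $n = 12$ fitting periods, $b$ is the slope, and $a$ is the intercept.

```latex
b = \frac{\sum_{t=1}^{n} (x_t - \bar{x})(y_t - \bar{y})}{\sum_{t=1}^{n} (x_t - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x},
\qquad
\hat{y}_t = a + b\,x_t
```

The last expression is the regression equation typed into the forecast column: the intercept plus the slope times the period number.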

💡Forecast

A forecast in the context of the video refers to the prediction of future values based on historical data using a statistical model, such as linear regression. The video demonstrates how to create a forecast by applying the regression equation to the entire dataset, which includes using the calculated intercept and slope to estimate future demand values.

💡Time Series

A time series is a sequence of data points ordered in time. In the video, the time series data represents the demand over a period of months, and the script describes how to use the first 12 months of this time series to create a linear regression model for forecasting future demand.

💡Excel Functions

Excel functions are pre-built formulas that perform calculations on data within a spreadsheet. The video script specifically mentions using the 'INTERCEPT' and 'SLOPE' functions in Excel to calculate the necessary parameters for the linear regression model. These functions simplify the process of performing regression analysis within the Excel environment.

💡Round Function

The 'ROUND' function in Excel is used to round a number to a specified number of decimal places. In the video, the round function is applied to the regression equation to ensure that the forecast values are presented in whole units, avoiding fractions which might not be meaningful in the context of the data.
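
One detail matters when reproducing this step outside Excel: Excel's ROUND rounds halves away from zero, while Python's built-in `round` rounds halves to the nearest even number. The sketch below shows one way to mimic the Excel behaviour; the helper name `excel_style_round` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def excel_style_round(value, digits=0):
    """Round halves away from zero, as Excel's ROUND does."""
    quantum = Decimal(10) ** -digits  # e.g. 1 for digits=0, 0.01 for digits=2
    # Converting through str() avoids carrying binary float noise into Decimal.
    return float(Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP))


# Forecast for period 1 using the intercept and slope reported in the video.
period_1_forecast = excel_style_round(80.667 + 7.577 * 1)

# The two rounding rules differ exactly at the halfway point.
half_case_excel_style = excel_style_round(2.5)
half_case_builtin = round(2.5)
```

For most forecast values the two rules agree; they differ only when the unrounded value ends exactly in .5.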

💡Forecast Error

Forecast error refers to the difference between the actual observed value and the forecasted value. In the video, the script describes calculating forecast error by subtracting the predicted demand from the actual demand. This error is then used to evaluate the accuracy of the regression model.

💡Mean Absolute Error (MAE)

Mean Absolute Error (MAE) is a measure of accuracy for a forecast method, calculated as the average of the absolute differences between the forecast and the actual values. The video script includes calculating MAE as part of the process to evaluate the forecast's accuracy, which helps in understanding how well the model is performing.

💡Mean Squared Error (MSE)

Mean Squared Error (MSE) is a risk metric corresponding to the expected value of the squared (quadratic) error or loss. In the video, MSE is calculated by squaring the forecast errors and then averaging them. It is used as a measure of the quality of the linear regression model, with lower values indicating better fit.

💡Naive Forecast and the U Statistic

The naive forecast method assumes that the next period's value will be the same as the last observed value. In the video, the squared error of the naive forecast (current demand minus the previous period's demand, squared) is calculated for each period starting in period 2. The sum of these values is the denominator of the U statistic, whose numerator is the sum of the regression forecast's squared errors. This comparison shows how the regression model performs relative to a simpler forecasting method.
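
The comparison with the naive method can be sketched in Python as follows. The numbers are invented for illustration and the variable names are assumptions; only the structure of the calculation (sum, not average, then square root of the ratio) follows the video.

```python
import math

# Hypothetical actual demand and regression forecast for periods 1 to 6.
demand = [100, 108, 115, 121, 130, 136]
forecast = [98, 106, 113, 121, 128, 136]

# Start at index 1 (period 2): the naive forecast for a period is the
# previous period's demand, so period 1 has no naive forecast.
sq_errors = [(demand[t] - forecast[t]) ** 2 for t in range(1, len(demand))]
naive_sq_errors = [(demand[t] - demand[t - 1]) ** 2 for t in range(1, len(demand))]

# Sum both columns (do not average them), divide, then take the square root.
u_statistic = math.sqrt(sum(sq_errors) / sum(naive_sq_errors))
```

With these made-up numbers the result is well below 1, meaning the regression forecast has smaller squared errors than simply repeating the previous month's demand.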

Highlights

Introduction to running a regression forecast based on simple linear regression.

Explanation of using the first year of data to create regression coefficients for the intercept and slope.

Demonstration of using Excel's Intercept function to calculate the intercept.

Guidance on selecting known y's (demand) and known x's (time periods) for the Intercept function.

Calculation of the slope using Excel's SLOPE function.

Instructions on applying the regression equation to the entire dataset.

Use of the ROUND function to avoid forecasting in half units.

Detail on setting up the regression equation in the forecast column.

Ensuring the regression formula remains static when copied down.

Visual representation of the forecast as a straight line on the graph.

Discussion on the fit of the forecast for the first and second year of data.

Introduction to calculating forecast accuracy measures like error, absolute error, and percentage error.

Explanation of the squared error and its role in the forecast accuracy.

Setting up the denominator column for the U statistic, which compares the forecast with a naive forecast.

Methodology for calculating the mean error and mean absolute error.

Process of determining the Mean Squared Error (MSE) and U statistic for overall forecast accuracy.

Final thoughts on the effectiveness of simple linear regression for forecasting in Excel.

Transcripts

play00:00

okay in this video I'm going to show you

play00:03

how to run a regression forecast based on a

play00:07

linear regression a simple linear

play00:10

regression and the idea is basically

play00:13

that we have data with uh some kind of a

play00:16

growth component in it and uh it's uh

play00:19

moving

play00:21

upwards so we are going to estimate an

play00:24

intercept and a slope that will then be

play00:27

used to create the forecast we're only

play00:30

going to use the first year of data

play00:33

which is the first 12 months that we

play00:36

have here to create um our uh regression

play00:41

uh coefficients to estimate the

play00:43

intercept and the slope and then we're

play00:46

going to apply the regression to the

play00:48

entire set of

play00:51

data now let's get started so first of

play00:54

all we have to calculate our intercept

play00:57

and our slope there are several ways to

play00:59

do uh to run a regression in in Excel

play01:02

but the easiest one for this case is to

play01:04

just use the INTERCEPT and the SLOPE

play01:06

function so to use the intercept

play01:08

function we type in

play01:12

intercept and then we open the

play01:13

parentheses and then it asks us for two

play01:16

things our known y's and our known X's

play01:18

our known y's are basically the uh uh

play01:24

output that we want the forecast to

play01:27

achieve so in this case that would be

play01:29

our demand so we are going to

play01:31

highlight the first 12 months of demand

play01:35

then our known X's are going to tell us

play01:37

where in the time series we are so for

play01:41

that I've created this column here

play01:43

that's called period I'm going to have

play01:46

highlight there again the first 12

play01:47

periods going close the parentheses and

play01:49

then it will give us an intercept of 80.667

play01:55

now to calculate the slope it works

play01:57

very similarly we're going to type slope

play02:00

of parentheses and then the function uh

play02:03

uses the same logic we highlight our

play02:06

first 12 months of

play02:09

demand and then we highlight our first

play02:11

12 periods that are numbered

play02:13

sequentially and close our parentheses

play02:16

hit enter and here we go our slope is

play02:19

7.577

play02:21

all right so now in order to uh create

play02:25

the forecast we have to you uh basically

play02:29

type in the

play02:30

regression equation into uh our forecast

play02:35

column

play02:36

so before we type in the actual

play02:39

regression equation we want to make sure

play02:41

that we don't forecast half units so

play02:43

we're going to use the round

play02:46

function and we're going to say round

play02:50

and then the general way of regression

play02:53

is uh is set up is the forecast equals

play02:57

The

play02:58

Intercept Plus the slope times how many

play03:02

periods we are

play03:04

in so we are going to say

play03:08

intercept we want to make sure that when

play03:10

we copy this formula down it stays put

play03:13

so I'm going to hit F4 it's going to

play03:15

place two dollar signs in there and that

play03:18

will ensure that when we copy the

play03:20

formula will stay

play03:23

put

play03:24

plus slope

play03:28

times the period number again for slope

play03:32

we have to go back and also hit F4

play03:33

and make sure that it stays put when

play03:37

we um copy the formula down and then I

play03:40

am just finishing up this formula here

play03:44

by making sure that the round function

play03:46

is complete we close the parentheses and

play03:49

there we go that's our regression

play03:51

equation now all I need to do is I need

play03:54

to copy this down and here we go we see

play03:59

on our graph we just created a straight

play04:01

line

play04:01

forecast for 2 years worth of data it

play04:05

fits a little better for the first year

play04:07

than for the second year because for

play04:09

some reason the slope changes slightly

play04:12

in the second year however that is

play04:15

basically uh how a simple linear

play04:18

regression would work okay now just to

play04:21

ensure that we uh um created a good

play04:26

forecast I'm going to leave out the

play04:28

first period for

play04:30

because in order to calculate the U

play04:32

statistic we uh we need uh one period

play04:35

before so I'm going to just consistently

play04:38

calculate all my uh forecast accuracy

play04:41

measures starting in period two if you

play04:44

really wanted to you could even start in

play04:46

year two because that's when you when it

play04:48

really matters but we're we're going to

play04:50

start in Period two in this case so our

play04:53

forecast error is basically demand minus

play04:55

forecast our absolute error is the

play04:58

absolute value of that forecast

play05:02

error our percentage error we're going

play05:05

to do the uh absolute percentage error

play05:07

in this

play05:09

case equals our absolute error divided

play05:12

by

play05:14

demand our squared error is basically

play05:17

our error term that we raise to the

play05:20

second

play05:22

power and then finally the denominator

play05:25

for our U statistic which the U

play05:27

statistic basically compares how good

play05:30

this forecast is to the naive

play05:32

forecast so we the the numerator of that

play05:36

is the squared error we just need the

play05:38

denominator which would basically be the

play05:40

squared error we would have obtained if

play05:42

we would have used uh the naive method so

play05:46

that is to open up

play05:49

parentheses demand of the current period

play05:53

minus then one period earlier because

play05:56

that's what the naive method would use as a

play05:59

forecast and we raise that to the second

play06:01

power okay so now all we need to do is

play06:06

we need to copy all of these formulas

play06:08

down here and I set up uh the error the

play06:12

absolute error the percentage error and

play06:14

the squared error columns with

play06:16

conditional formatting so we can see

play06:18

a little bit um where most of the errors

play06:21

occur and it uh just visually makes it a

play06:25

lot easier to to see how good our

play06:27

forecast is okay okay now to to

play06:30

calculate overall accuracy measures we

play06:33

have to calculate the mean error which

play06:35

basically means we take the

play06:37

average of column e which is we take the

play06:41

average of our

play06:43

errors then we have to calculate our

play06:45

mean absolute error which basically the

play06:48

average

play06:49

of that column here which just call

play06:52

F our mean absolute percent

play06:56

error again the average

play07:00

of column

play07:03

G our MSE is basically the average of

play07:10

column H which is all of our squared errors

play07:13

and then finally the U statistic is

play07:15

the square root of the sum of the

play07:17

squared errors divided by the

play07:19

sum of the squared errors we would have

play07:22

obtained with if we would have performed

play07:24

the naive method we say SQRT

play07:30

parentheses and then we want to sum be

play07:33

careful to actually sum them and not

play07:35

average them like for the other

play07:37

ones sum the squared errors and then

play07:40

divide them that by the

play07:43

sum of the squares for the U

play07:47

statistic and close two parentheses and

play07:50

there we

play07:51

go so this was a forecast using simple

play07:55

linear regression in

play07:57

Excel

Related Tags
Linear Regression, Excel Forecasting, Data Analysis, Statistical Modeling, Growth Trends, Demand Forecast, Time Series Data, Excel Tutorial, Business Analytics, Predictive Modeling