Smoothing 4: Simple exponential smoothing (SES)

Galit Shmueli
30 Nov 2016 · 12:20

Summary

TL;DR: In this video, Galit Shmueli introduces Simple Exponential Smoothing (SES), a forecasting method used for series without trend or seasonality. SES, favored for its simplicity and computational efficiency, employs a smoothing constant (alpha) to weigh past values exponentially, allowing the model to adapt by learning from the most recent data. The video explains how to initialize SES, update the level, and calculate forecasts, emphasizing the importance of selecting an appropriate alpha. It contrasts SES with moving averages and highlights its limitations in capturing trends or seasonality, suggesting its use for series with only a level component.

Takeaways

  • 📊 Simple Exponential Smoothing (SES) is a forecasting method that uses a weighted average of all previous values to predict future values, suitable for series without trend or seasonality.
  • 🔍 The Smoothing Constant (α) is a key component in SES, determining the weight given to more recent data points, and it ranges between 0 and 1.
  • 🌟 SES is popular due to its simplicity, adaptability, and computational efficiency, making it a cost-effective choice for forecasting.
  • 📉 SES assumes that the series contains only a level component, implying that the level remains constant over time.
  • 🔧 The Level Updating Equation estimates the level of the series by blending the most recent data point with the previous level estimate (a minimal sketch of the recursion appears right after this list).
  • 📈 The initialization of SES typically starts with setting the first level estimate (L_1) equal to the first data point in the series.
  • 🔄 The term 'Exponential Smoothing' comes from the exponential decay of weights as we move backward through the data series.
  • 📉 At the extremes of α: with α = 1 only the most recent observation matters (no smoothing of past data), while with α at or very near 0 all data points receive essentially equal weight, with no extra emphasis on recent information.
  • 🔄 A moving average with window width w and SES give similar results when the smoothing constant is set to α = 2/(w + 1).
  • 📝 SES can be viewed as an adaptive learning algorithm, where forecasts are updated based on the error from the previous forecast.
  • 🚫 SES is not effective for series with trend or seasonality unless they are first differenced or more advanced methods are used.
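
To make the recursion concrete, here is a minimal hand-rolled Python sketch (not code from the video). The function name ses_forecast and the toy series are invented for illustration, and the sketch assumes the conventions described above: the level update L_t = alpha*Y_t + (1 - alpha)*L_{t-1}, initialization L_1 = Y_1, and a flat k-step-ahead forecast equal to the last level.

    # Minimal sketch of Simple Exponential Smoothing (SES).
    def ses_forecast(y, alpha=0.2, k=1):
        level = y[0]                      # initialization: L_1 = Y_1
        for value in y[1:]:               # level updating equation
            level = alpha * value + (1 - alpha) * level
        return [level] * k                # the k-step-ahead forecast is flat at the last level

    # Hypothetical series with only a level component (no trend, no seasonality).
    series = [102, 98, 101, 97, 103, 99, 100, 104]
    print(ses_forecast(series, alpha=0.2, k=3))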

Q & A

  • What is the main concept behind Simple Exponential Smoothing (SES)?

    -Simple Exponential Smoothing (SES) is a forecasting method that uses a weighted average of all previous values in a series to forecast future values. It is suitable for series without trend or seasonality.

  • Why is Simple Exponential Smoothing popular?

    -Simple Exponential Smoothing is popular because it is simple, adaptive, and computationally inexpensive. It only requires the most recent forecast and forecast error to be stored, making it efficient.

  • What is the role of the Smoothing Constant (alpha) in SES?

    -The Smoothing Constant, denoted by alpha, determines the rate at which the algorithm learns from new data. It is a number between 0 and 1, where values closer to 1 give more weight to recent data, and values closer to 0 spread the weight almost equally across all data points.

  • How is the initial level (L1) for SES typically set?

    -The initial level (L1) can be set in different ways, but a common method is to set it equal to the first data point in the series (Y1).

  • What happens when alpha is set to 1 or 0 in SES?

    -When alpha is set to 1, the level simply equals the most recent data point, so no smoothing of past data takes place. When alpha is 0, or very close to it, all data points receive essentially equal weight, and the algorithm does not give more importance to recent information.

  • How does the choice of alpha affect the weights assigned to past observations in SES?

    -A larger alpha results in faster decay of weights as we go further into the past, giving more weight to recent observations. A smaller alpha results in slower decay, giving almost equal weight to all observations.

  • What is the relationship between Moving Average and Simple Exponential Smoothing?

    -Although the two methods differ, a moving average with window width w and SES produce similar forecasts when the smoothing constant alpha is set to 2/(w + 1).

  • How does SES update forecasts based on forecast errors?

    -In SES, the next period's forecast is updated by adding the previous forecast error, multiplied by the smoothing constant alpha, to the previous forecast.

  • Why might SES not perform well on certain data series?

    -SES might not perform well on data series that exhibit trend or seasonality because it does not account for these patterns in the data.

  • How can software tools like XLMiner or R's forecast package be used for SES?

    -Software tools like XLMiner and R's forecast package provide built-in SES functions, making it easy to apply the method without manual calculations; a rough Python analogue is sketched below.
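
For readers working in Python rather than XLMiner or R, a rough analogue is sketched below using statsmodels' SimpleExpSmoothing. This is an illustration only: the argument names (initialization_method, smoothing_level, optimized) assume a recent statsmodels release, and the data are made up.

    # Hedged sketch: SES with a fixed alpha in Python's statsmodels,
    # as an analogue to XLMiner's smoothing menu or R's forecast package.
    import numpy as np
    from statsmodels.tsa.holtwinters import SimpleExpSmoothing

    y = np.array([102.0, 98.0, 101.0, 97.0, 103.0, 99.0, 100.0, 104.0])  # toy level-only series

    model = SimpleExpSmoothing(y, initialization_method="known", initial_level=y[0])
    fit = model.fit(smoothing_level=0.2, optimized=False)  # fix alpha = 0.2 instead of estimating it

    print(fit.fittedvalues)   # one-step-ahead forecasts within the sample
    print(fit.forecast(4))    # flat forecasts for the next 4 periods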

Outlines

00:00

📊 Introduction to Simple Exponential Smoothing

In this segment, Galit Shmueli introduces the concept of Simple Exponential Smoothing (SES), a forecasting method used for time series data that lacks trends and seasonality. SES operates on the principle of calculating a weighted average of past observations, with the weights exponentially decreasing over time. The key component of SES is the 'smoothing constant' (alpha), which dictates the rate at which the algorithm adapts to new data. The video explains that SES is favored for its simplicity, adaptability, and computational efficiency. It contrasts SES with the Moving Average Forecaster and sets the stage for discussions on more complex exponential smoothing methods like Holt's and Holt-Winter's, which are designed for series with trends and seasonality. The video also touches on the initialization of the SES algorithm and the concept of exponential decay in weights, emphasizing the method's reliance on recent data points over older ones.

05:01

🔍 The Role of Alpha in Exponential Smoothing

This paragraph delves into the critical role of the smoothing constant, alpha, in the Simple Exponential Smoothing process. It explains how alpha determines the rate at which the algorithm learns from new data, with higher alpha values leading to quicker adaptation and lower values resulting in slower, more gradual learning. The video warns against overfitting when selecting alpha by minimizing error metrics like RMSE or MAPE during the training period. It visually demonstrates the effect of different alpha values on the weighting of past and recent observations through a chart and a table. The relationship between moving averages and SES is also explored, with a formula provided to equate a moving average window to an equivalent alpha value in SES. The adaptive nature of SES is highlighted, showing how forecasts are updated based on forecast errors, and the benefits of SES over moving averages in terms of computational efficiency and emphasis on recent observations.
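
The moving-average connection can be made concrete with a short hand-rolled sketch (toy data and invented function names): a trailing moving average of width w is compared with SES run at alpha = 2/(w + 1), the rule of thumb mentioned in the video. The two forecasts are typically close but not identical.

    # Compare a trailing moving-average forecast of window w with an SES
    # forecast that uses the rule of thumb alpha = 2 / (w + 1).
    def moving_average_forecast(y, w):
        return sum(y[-w:]) / w            # average of the last w observations

    def ses_forecast(y, alpha):
        level = y[0]                      # L_1 = Y_1
        for value in y[1:]:
            level = alpha * value + (1 - alpha) * level
        return level

    series = [20, 22, 19, 21, 23, 20, 22, 21, 19, 22, 20, 21]   # level-only toy data
    w = 4
    alpha = 2 / (w + 1)                   # = 0.4 for w = 4

    print("MA(4) forecast:", moving_average_forecast(series, w))
    print("SES forecast with alpha = 2/(w+1):", ses_forecast(series, alpha))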

10:03

📈 Practical Application and Limitations of Simple Exponential Smoothing

The final paragraph discusses the practical application of Simple Exponential Smoothing using software tools like XLMiner and R's forecast package. It provides a step-by-step example of how SES is applied to quarterly sales data of soft drinks, illustrating the process of forecast adjustment based on forecast errors. The video then contrasts this with an example of monthly Amtrak ridership data, which includes both trend and seasonality, to demonstrate the limitations of SES in such scenarios. The summary underscores that while SES is computationally inexpensive and simple to understand, it is not suitable for series with trends or seasonality unless they are first differenced. The video concludes by reinforcing the importance of selecting the appropriate forecasting method based on the characteristics of the data series.
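
The forecast-adjustment process described for the soft drink example can be mimicked in a few lines. The sketch below uses invented quarterly figures (not the video's sales data) and the error-correction form F_{t+1} = F_t + alpha * E_t, printing whether each new forecast is adjusted up or down.

    # Walk SES through hypothetical quarterly sales in error-correction form.
    alpha = 0.2
    sales = [1700, 2200, 1900, 1750, 2100, 1950]   # made-up quarterly sales

    forecast = sales[0]                 # initialize the first forecast to the first observation
    for t in range(1, len(sales)):
        error = sales[t] - forecast     # E_t = Y_t - F_t
        direction = "up" if error > 0 else "down"
        next_forecast = forecast + alpha * error    # F_{t+1} = F_t + alpha * E_t
        print(f"period {t + 1}: forecast={forecast:.2f}, actual={sales[t]}, "
              f"error={error:.2f}, next forecast adjusted {direction} to {next_forecast:.2f}")
        forecast = next_forecast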

Keywords

💡Exponential Smoothing

Exponential smoothing is a forecasting method for time series data that gives more weight to recent observations. It's a type of weighted moving average that 'smooths' out business data by applying increasingly smaller weights to older data points. In the video, it's introduced as a method for forecasting future values by using a weighted average of all the previous values in a series, particularly suitable for series without trend or seasonality.
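
One way to see the "exponentially decaying weights" interpretation is to unroll the recursion and recompute the level as a weighted sum of all past values. The hand-rolled sketch below (toy data, invented function names) checks that the unrolled form matches the recursive level-updating form.

    # Unroll SES: L_t = alpha*Y_t + alpha*(1-alpha)*Y_{t-1} + ...
    #             + alpha*(1-alpha)**(t-2)*Y_2 + (1-alpha)**(t-1)*Y_1  (with L_1 = Y_1),
    # and confirm it matches the recursive form.
    def ses_recursive(y, alpha):
        level = y[0]
        for value in y[1:]:
            level = alpha * value + (1 - alpha) * level
        return level

    def ses_unrolled(y, alpha):
        t = len(y)
        total = (1 - alpha) ** (t - 1) * y[0]          # residual weight left on the initial level Y_1
        for k, value in enumerate(reversed(y[1:])):    # k = 0 is the newest value Y_t
            total += alpha * (1 - alpha) ** k * value
        return total

    y = [10.0, 12.0, 11.0, 13.0, 12.5]
    print(ses_recursive(y, 0.3), ses_unrolled(y, 0.3))  # the two numbers should match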

💡Simple Exponential Smoothing (SES)

Simple Exponential Smoothing is a specific type of exponential smoothing used for time series data that does not exhibit trend or seasonality. It's called 'simple' because it only considers the level of the series, not the trend or seasonality components. The video explains that SES is popular due to its simplicity, adaptability, and computational efficiency.

💡Smoothing Constant (Alpha)

The smoothing constant, denoted by alpha (α), is the parameter in exponential smoothing methods that determines how quickly the weights on past data decay. It is a number between 0 and 1, where a higher alpha gives more weight to recent observations. The video emphasizes that alpha is crucial as it controls the algorithm's responsiveness to new data, with typical industry values around 0.1 or 0.2.
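
The weight that SES places on an observation k periods in the past is alpha * (1 - alpha)**k. The short sketch below prints these weights for a few arbitrarily chosen alphas, mirroring the kind of chart and table shown in the video: large alphas decay fast, small alphas decay slowly.

    # Print the exponentially decaying weight alpha*(1-alpha)**k that SES puts
    # on the observation k periods in the past, for several candidate alphas.
    alphas = [0.1, 0.2, 0.5, 0.9]       # illustrative values only
    lags = range(6)                     # k = 0 is the most recent observation

    print("lag  " + "  ".join(f"a={a:<4}" for a in alphas))
    for k in lags:
        weights = [a * (1 - a) ** k for a in alphas]
        print(f"{k:<4} " + "  ".join(f"{w:6.3f}" for w in weights))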

💡Level Updating Equation

The level updating equation is a formula used in exponential smoothing to update the level of a time series based on the most recent data point. It's a weighted average that includes the smoothing constant and the previous level. In the video, it's used to illustrate how the SES algorithm integrates new information to update the forecast.

💡Forecast Error

Forecast error refers to the difference between the actual value and the forecasted value in a time series. It's a measure of the accuracy of the forecasting model. The video script uses forecast error in the context of updating forecasts, where the SES method adjusts the forecast based on the error from the previous period.
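
The identity used in the video, that updating by F_{t+1} = F_t + alpha * E_t reproduces exactly the forecasts of the level-updating form, can be verified numerically. Below is a small hand-rolled check on made-up data.

    # Check that the error-correction form F_{t+1} = F_t + alpha*E_t gives the
    # same one-step-ahead forecasts as L_t = alpha*Y_t + (1-alpha)*L_{t-1} with F_{t+1} = L_t.
    def forecasts_level_form(y, alpha):
        level = y[0]
        out = []
        for value in y[1:]:
            out.append(level)                          # F_t = L_{t-1}
            level = alpha * value + (1 - alpha) * level
        out.append(level)                              # forecast for the next, unseen period
        return out

    def forecasts_error_correction(y, alpha):
        forecast = y[0]                                # first forecast initialized to Y_1
        out = []
        for value in y[1:]:
            out.append(forecast)
            forecast = forecast + alpha * (value - forecast)   # F_{t+1} = F_t + alpha*E_t
        out.append(forecast)
        return out

    y = [50.0, 53.0, 49.0, 51.0, 52.0]
    print(forecasts_level_form(y, 0.2))
    print(forecasts_error_correction(y, 0.2))          # identical to the line above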

💡Trend

In time series analysis, a trend is a long-term movement in the data, either upward or downward. The video explains that simple exponential smoothing is not suitable for series with a trend because it assumes a constant level and does not account for such movements.

💡Seasonality

Seasonality refers to regular and predictable fluctuations in a time series that recur every fixed period of time. The video mentions that SES is not appropriate for series with seasonality, as it does not model these periodic changes.

💡Holt's Exponential Smoothing

Holt's Exponential Smoothing is an extension of simple exponential smoothing that can handle time series data with trends. It's mentioned in the video as a method suitable for series that have a trend but no seasonality, in contrast to simple exponential smoothing.

💡Holt-Winters Exponential Smoothing

Holt-Winters Exponential Smoothing is an advanced method that can handle time series data with both trends and seasonality. The video script introduces it as a method that will be discussed in future videos, indicating its complexity and applicability for more complex series.

💡Initialisation

Initialisation in the context of exponential smoothing refers to the starting point for the forecasting model, typically set as the first data point in the series or using another statistical method to estimate the initial level. The video script explains that one common approach is to set the initial level (L1) equal to the first data point.

💡Adaptive

Adaptive in the video refers to the ability of the forecasting model to adjust or 'learn' from new data as it becomes available. SES is described as an adaptive method because it updates the forecast based on the most recent data, making it responsive to changes in the series.
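
This adaptivity is also why SES is so cheap to run: between periods, only the current forecast has to be stored. The sketch below illustrates the idea as a tiny streaming updater (hypothetical class name and data).

    # Streaming view of SES: store nothing but the current forecast and update it
    # by alpha times the forecast error whenever a new observation arrives.
    class SESUpdater:
        def __init__(self, first_value, alpha=0.2):
            self.alpha = alpha
            self.forecast = first_value     # first forecast initialized to the first observation

        def update(self, new_value):
            error = new_value - self.forecast          # E_t
            self.forecast += self.alpha * error        # F_{t+1} = F_t + alpha * E_t
            return self.forecast

    # Hypothetical usage: observations arriving one at a time.
    updater = SESUpdater(first_value=100.0, alpha=0.2)
    for observation in [104.0, 98.0, 101.0, 103.0]:
        print("next-period forecast:", updater.update(observation))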

Highlights

Introduction to exponential smoothing as a forecasting method.

Explanation of Simple Exponential Smoothing (SES) for series without trend or seasonality.

Advantages of SES: simplicity, adaptability, and computational efficiency.

The concept of the Smoothing Constant, alpha, in SES.

Introduction to three types of Exponential Smoothing: Simple, Holt's, and Holt-Winter's.

Assumption of SES: the series contains only level, no trend or seasonality.

The k-step-ahead SES forecast is the most recent level estimate.

Level Updating Equation for SES, incorporating the smoothing constant alpha.

Initialization of SES with L1, the first data point in the series.

The rationale behind the name 'Exponential Smoothing' due to the decaying weights.

Impact of the smoothing constant alpha on the algorithm's learning from new data.

Extremes of alpha: alpha equals one (only the most recent value matters; no smoothing) and alpha equals zero (essentially equal weight to all values).

Typical industry values for alpha and the caution against overfitting.

Visual representation of alpha's effect on the weights assigned over time.

Comparison between Moving Average and Simple Exponential Smoothing.

SES as an adaptive learning algorithm, updating forecasts based on forecast errors.

Practical application of SES on quarterly soft drink sales data.

Demonstration of SES in software like XLMiner and R's forecast package.

Evaluation of SES performance on the soft drink sales data.

Application of SES on Amtrak's monthly ridership data with trend and seasonality.

Limitations of SES in forecasting series with trend and seasonality.

Upcoming discussion on Holt's and Holt-Winter's Exponential Smoothing methods.

Transcripts

play00:00

[MUSIC PLAYING]

play00:02

play00:09

Welcome to Business Analytics Using Forecasting.

play00:12

I'm Galit Shmueli.

play00:13

In the next few videos, we're going

play00:15

to talk about a smoothing method called exponential smoothing.

play00:19

In this video we'll start by talking about Simple

play00:22

Exponential Smoothing.

play00:25

The idea behind Simple Exponential Smoothing

play00:28

is to forecast future values by using

play00:31

a weighted average of all the previous values in our series.

play00:36

We can use this for forecasting a series that

play00:39

doesn't have trend and doesn't have seasonality.

play00:42

If you remember, this was also the case with the Moving

play00:44

Average Forecaster.

play00:46

Simple Exponential Smoothing is very popular

play00:49

because it's simple, it's adaptive,

play00:51

and it's cheap to compute.

play00:53

The key concept to remember here is

play00:55

going to be something called The Smoothing Constant.

play00:58

play01:01

We're going to talk about three types of Exponential

play01:03

Smoothing in this course.

play01:05

In this video we're talking about Simple Exponential

play01:07

Smoothing, which is suitable for a series with no trend

play01:11

and no seasonality.

play01:13

In future videos we'll talk about Holt's Exponential

play01:16

Smoothing, which is suitable for a series that has

play01:19

a trend, but no seasonality.

play01:21

And then we'll talk about Holt Winter's or Winter's

play01:24

Exponential Smoothing, which is suitable for series that have

play01:27

both trend and seasonality.

play01:28

play01:31

Simple Exponential Smoothing, or simply SES,

play01:35

makes the assumption that our series only

play01:37

contains level, no trend or seasonality,

play01:41

and it does include error.

play01:44

So if we only have level the assumption

play01:46

of the Exponential Smoother is that this level

play01:49

will stay put and not move.

play01:52

Therefore, the k-step-ahead SES forecast is simply

play01:57

the most recent estimate of our level at time, t.

play02:03

To do this, we're going to have to estimate the level.

play02:06

To do that we're going to use something

play02:08

called a Level Updating Equation.

play02:11

In this equation we're taking the level at time, t

play02:15

and updating the previous level at time,

play02:18

t minus 1 by integrating information from our most

play02:22

recent data point, Yt.

play02:25

You can see that it's a weighted average where we have alpha

play02:28

and 1 minus alpha as our weights.

play02:31

Alpha is called the smoothing constant

play02:34

and it's a number somewhere between 0 and 1.

play02:38

What this tells us is that the algorithm

play02:40

is learning the new level from the newest data

play02:46

that it's seeing.

play02:47

How do you start this whole system?

play02:49

Well, you have to initialise it with L1 at some point.

play02:52

There are different ways of doing it.

play02:54

One option is simply setting L1 equal to the first record

play02:58

in your series, Y1.

play03:02

So why is the algorithm called Exponential Smoothing?

play03:05

To see this let's rewrite the updating equation a little bit

play03:10

differently.

play03:11

We have the level updating equation

play03:13

as we wrote it before, but now let's substitute L sub

play03:17

t minus 1 with its own formula.

play03:19

play03:22

So we start with the normal formula

play03:24

and then substitute L sub t minus 1

play03:27

with its own formula that is based on L sub t minus 2.

play03:32

We can then do the same thing again and substitute Lt minus 2

play03:36

with its predecessors.

play03:38

And if you write this out all the way down,

play03:41

you'll end up with something that

play03:43

looks like this: alpha times Yt, plus alpha times (1

play03:48

minus alpha) times Yt minus 1, and so on and so forth.

play03:52

We see that we end up with an average of all our values

play03:55

in the series, but they have weights that

play03:58

are decaying exponentially.

play04:01

Because these weights are decaying exponentially

play04:04

we call this method Exponential Smoothing.

play04:07

The smoothing constant, alpha, determines

play04:09

how much smoothing we do.

play04:11

We can take the two extremes.

play04:14

When alpha is equal to one the past values have no effect

play04:19

on the algorithm; the level simply becomes the newest data point.

play04:22

So there is no smoothing of the older values at all.

play04:25

The other extreme is alpha equal zero

play04:28

where all the values in our series

play04:30

have equal weight in our average.

play04:33

In that case, we're not giving any more weight

play04:36

to more recent information.

play04:38

That's why we typically choose something between 0 and 1

play04:42

and usually closer to 0 than to 1.

play04:45

Typical values that are used in industry are around 0.1 or 0.2,

play04:49

and you might see that also in software defaults.

play04:53

Another way to set alpha is by trial and error.

play04:56

You can try a few different values,

play04:59

compare the predictive performance,

play05:00

and see which ones work better.

play05:02

If you do that be really careful.

play05:05

For example, if you're looking to minimize

play05:08

the RMSE or the MAPE of the training period

play05:12

and choosing the alpha that gives you the smallest value,

play05:15

you might be overfitting the training period.

play05:18

So be very careful if you're going

play05:19

to choose alpha in that way.

play05:22

To feel the effect of alpha let's look

play05:24

at a chart that shows how alpha affects the weights on the most

play05:29

recent period and older periods.

play05:32

We can see that in all cases all these lines are

play05:35

different alphas, but they all decay

play05:38

as we go further into the past.

play05:41

We can look at the actual values in the table below

play05:45

and you can see again the exponential decay.

play05:48

With alphas that are very large, like alpha of 0.9,

play05:52

the decay is fast.

play05:54

Whereas with smaller alphas the decay is slower.

play05:58

You might be wondering about the relationship between a moving

play06:01

average and simple exponential smoothing.

play06:04

They are a little bit different,

play06:06

but we can actually achieve results

play06:08

that are pretty similar if we choose

play06:11

the smoothing constant to be similar in a way to the moving

play06:15

average window.

play06:16

For example, if we choose a moving average window

play06:20

width of w then it would be almost

play06:23

equivalent to using simple exponential smoothing

play06:26

with alpha equal to 2 over (w plus 1).

play06:32

Another way to think of simple exponential smoothing

play06:35

is as an adaptive learning algorithm.

play06:38

Again, let's rewrite the level updating equation a little bit

play06:42

differently.

play06:43

By opening the parentheses and reordering the components,

play06:47

I can write L sub t is equal to alpha times Yt plus L sub

play06:53

t minus 1 minus alpha times L sub t minus 1.

play06:59

So our one-step-ahead forecast, F sub t plus 1,

play07:04

can be written as L sub t minus 1

play07:09

plus alpha times (Yt minus L sub t minus 1).

play07:13

From here we can substitute Lt minus 1

play07:16

with Ft because that's exactly the forecast for time, t.

play07:21

And the next step is to notice that in the parentheses

play07:24

Y sub t minus F sub t is actually the forecast error.

play07:28

So we can write that as E sub t.

play07:33

This last form is very simple and useful.

play07:35

What it means is that to forecast the next time

play07:39

period we update the previous forecast by an amount that

play07:43

depends on the error in the previous forecast.

play07:48

This last formulation really shows

play07:50

the beauty of simple exponential smoothing.

play07:53

Forecasts use all the previous values,

play07:56

but we only need to store the most recent forecast

play07:59

and the most recent forecast error.

play08:02

Unlike moving average, the simple exponential smoothing

play08:06

does give more weight to more recent observations.

play08:09

It's also relatively simple to understand,

play08:12

depending on which of the explanations you choose.

play08:15

Let's return to our example of sales of soft drinks

play08:18

in order to see how the simple exponential smoother works.

play08:22

Remember that we have quarterly sales of soft drinks

play08:25

over a pretty long period.

play08:28

And now we're going to apply the simple exponential smoothing.

play08:32

Using the formula F sub t plus 1 is

play08:35

Ft plus alpha times the error, E sub t,

play08:39

we need the forecast and the forecast error

play08:41

from the previous period in order

play08:43

to compute the next forecast.

play08:46

Let's use alpha equals 0.2 for illustration.

play08:50

We therefore start with the second period in the data,

play08:53

quarter two of 1986.

play08:56

The forecast for this period is initialised

play09:00

to be equal to the sales in the previous quarter.

play09:03

So it's $1734 million US dollars.

play09:07

We see that this forecast ended up

play09:09

being too low compared to the actual sales in that quarter,

play09:12

which were $2244 million.

play09:16

The SES forecast for the next quarter, quarter three of 1986,

play09:21

takes the previous forecast 1734 and adjusts it up

play09:26

by adding alpha times the previous error.

play09:29

In other words, 0.2 times 510.13.

play09:34

This gives the forecast of 1,836.86.

play09:42

The same process is repeated for the next quarter,

play09:45

we take our previous forecast and adjust it

play09:47

by adding alpha times the error.

play09:50

You'll notice that some of the forecast errors

play09:52

are negative, which indicates an over forecast.

play09:56

In such cases, the next simple exponential smoothing forecast

play10:00

will be adjusted down.

play10:02

These computations can be done manually,

play10:04

but of course in practice we use software to compute them.

play10:08

In XLMiner for example, simple exponential smoothing

play10:12

is in the time series smoothing menu.

play10:15

In R we can use the SES function in the forecast package.

play10:20

Looking at the performance of a simple exponential

play10:22

smoother for the soft drink sales, does it perform well?

play10:28

Why?

play10:29

Let's look at another example.

play10:31

Remember the monthly ridership on Amtrak?

play10:33

This series exhibited both a trend and seasonality.

play10:36

What's going to happen when we apply

play10:38

simple exponential smoothing?

play10:40

The SES forecasts are computed in the exact same way

play10:43

as before.

play10:43

We take the previous forecast and add its forecast error

play10:47

times alpha.

play10:49

For example the forecast for March of 1991

play10:52

is equal to the February 1991 forecast plus 0.2 times

play10:58

the February 1991 forecast error.

play11:02

Looking at the performance chart,

play11:05

do you think this forecaster is performing well?

play11:09

Why do you think it's working the way it is?

play11:13

The bottom line for simple exponential smoothing

play11:16

is that it simply takes a weighted average of all

play11:19

the values in our series.

play11:21

The weights decay exponentially into the past

play11:24

so that the most recent values get higher weight.

play11:28

The key concept of the smoothing constant

play11:31

is what controls how fast this algorithm learns from new data.

play11:36

The SES is a simple algorithm.

play11:38

It's very cheap to compute because we only

play11:40

need to keep and use the most recent forecast and the most

play11:44

recent forecast error.

play11:46

But realize that it does not capture trend

play11:48

and it does not capture seasonality.

play11:51

Therefore, it won't work very well

play11:53

if you're trying to forecast a series that

play11:55

has trend and has seasonality.

play11:57

That's what we saw in the two examples

play11:59

that we showed earlier.

play12:01

Make sure that when you're applying this method

play12:04

your series does not contain any trend or seasonality,

play12:07

or if it does, you might want to use differencing first.

play12:11

In the next videos we'll look at two other types

play12:14

of exponential smoothing methods.

play12:15

[MUSIC PLAYING]

play12:18

Related Tags

Business Analytics, Forecasting, Exponential Smoothing, Simple Exponential Smoothing, Time Series, Data Forecasting, Smoothing Constant, Holt's Exponential, Holt Winter's Exponential, Forecasting Methods