Forecasting and big data: Interview with Prof. Rob Hyndman

Galit Shmueli

30 Nov 2016 07:26

Summary

TLDR: Professor Rob Hyndman of Monash University discusses big data in time series forecasting, which he frames as analyzing a large number of time series rather than a few very long ones. As the number of series grows, forecasting has to shift from manual to automated methods, and he points to software such as R's 'forecast' package and Tableau for this. Hyndman warns that complex models risk overfitting, and he advocates simple methods and testing on holdout sets to check accuracy.
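
To make the idea of automated forecasting over many series concrete, here is a minimal Python sketch. It is not the algorithm used by any of the packages named above. It assumes simple exponential smoothing, with the smoothing parameter chosen by a grid search on in-sample one-step squared error; the function names, the grid, and the sample sales data are all hypothetical.

```python
def ses_forecast(series, alpha):
    """One-step-ahead forecast from simple exponential smoothing."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def ses_sse(series, alpha):
    """Sum of squared one-step-ahead in-sample errors for a given alpha."""
    level = series[0]
    sse = 0.0
    for y in series[1:]:
        sse += (y - level) ** 2
        level = alpha * y + (1 - alpha) * level
    return sse


def auto_ses(series, grid=None):
    """Pick alpha by grid search, then return (alpha, forecast)."""
    grid = grid or [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    best_alpha = min(grid, key=lambda a: ses_sse(series, a))
    return best_alpha, ses_forecast(series, best_alpha)


# Hypothetical collection: many short series keyed by (store, product).
sales = {
    ("store_1", "widget"): [12, 14, 13, 15, 16, 15, 17],
    ("store_1", "gadget"): [5, 5, 6, 5, 5, 6, 5],
    ("store_2", "widget"): [30, 28, 31, 29, 30, 32, 31],
}

# The same routine runs unattended over every series in the collection.
forecasts = {key: auto_ses(ys) for key, ys in sales.items()}
for key, (alpha, fc) in forecasts.items():
    print(key, f"alpha={alpha:.2f}", f"forecast={fc:.2f}")
```

The point of the sketch is the loop at the end: once model selection is a function, adding another thousand series costs compute time rather than analyst time.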

Takeaways

📈 Big data in time series forecasting refers to a large collection of time series: each one may not be particularly long, but together they form a vast dataset.
🏬 Examples of big data in time series include daily sales data for multiple products across various stores and countries, or security streaming data from hundreds of sensors.
🔍 When forecasting a few time series, it's feasible to manually tweak models for individual series, but automation becomes crucial when dealing with many series.
🛠️ Automated forecasting algorithms are essential for handling large numbers of time series, as manual analysis is impractical.
💻 Software like R's 'forecast' package, Forecast Pro, and Tableau offer automated forecasting solutions, some with algorithms written by the interviewee.
🚀 Thrive Technologies offers an automatic forecasting algorithm that is noted for its speed.
⏰ The benefits of automation in forecasting include time and cost savings, while the danger lies in the potential for poor performance on certain series due to the limitations of automated algorithms.
🔧 A recommended strategy is to let the automatic algorithm handle the bulk of the forecasting while analysts focus on series that are not forecast well.
🏁 In forecasting competitions, simple methods often outperform complex ones, as large, complicated models can overfit the data, especially when time series are not very long.
📚 It's important to test different forecasting methods on holdout sets to determine what works best for the specific type of data being analyzed.
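
The holdout testing in the last takeaway can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three candidate methods (naive, seasonal naive, and historical mean) are simple benchmarks chosen for brevity, mean absolute error is an assumed accuracy measure, and the two example series are made up.

```python
def naive(train, h):
    """Repeat the last observed value h times."""
    return [train[-1]] * h


def seasonal_naive(train, h, period):
    """Repeat the last full season of observations."""
    return [train[-period + (i % period)] for i in range(h)]


def mean_forecast(train, h):
    """Repeat the historical mean h times."""
    m = sum(train) / len(train)
    return [m] * h


def mae(actual, predicted):
    """Mean absolute error over the holdout period."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def best_method(series, h, period):
    """Hold out the last h points, score each method, return the winner."""
    train, holdout = series[:-h], series[-h:]
    candidates = {
        "naive": naive(train, h),
        "seasonal_naive": seasonal_naive(train, h, period),
        "mean": mean_forecast(train, h),
    }
    scores = {name: mae(holdout, fc) for name, fc in candidates.items()}
    return min(scores, key=scores.get), scores


# Two hypothetical quarterly series with different behaviour.
seasonal = [10, 20, 30, 40, 11, 21, 31, 41, 12, 22, 32, 42]
flat = [7, 9, 8, 10, 6, 9, 8, 7, 8, 8, 8, 8]

for name, ys in [("seasonal", seasonal), ("flat", flat)]:
    winner, scores = best_method(ys, h=4, period=4)
    print(name, winner, scores)
```

Different series can favour different methods, which is why the comparison is run per series (or per type of data) rather than once for the whole collection.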

Q & A

What is the definition of big data in the context of time series forecasting according to Rob Hyndman?
-Big data in time series forecasting refers to a large collection of time series, where each individual series may not be particularly long but the volume of series is substantial. Examples include daily sales data for multiple products in various stores and countries or security streaming data from hundreds of sensors.
How does Rob Hyndman describe the difference between handling a few time series versus many?
-With a few time series, one can manually analyze and tweak forecasting methods for each series to account for peculiarities. However, with many series, manual analysis becomes impractical, necessitating automated algorithms to generate forecasts efficiently.
What software does Rob Hyndman mention for automatic forecasting of many time series?
-Rob Hyndman mentions several software options for automatic forecasting, including the R package 'forecast', Forecast Pro for Windows, Tableau, and an algorithm by Thrive Technologies known for its speed.
What are the benefits of using automated forecasting algorithms according to Rob Hyndman?
-Automated forecasting algorithms save time and money by quickly generating forecasts for large numbers of time series without the need for manual intervention.
What are the potential dangers of relying on automated forecasting algorithms as highlighted by Rob Hyndman?
-The danger lies in the fact that no automatic algorithm will work well for every time series. There may be edge cases where the algorithm performs poorly, and improvements in one area might inadvertently degrade performance in others.
What strategy does Rob Hyndman suggest for dealing with time series that are not forecasted well by automated algorithms?
-Rob Hyndman suggests identifying the poorly forecasted series and focusing analyst time on these cases, while allowing the automatic algorithm to handle the majority of the series where it performs adequately.
What insights did Rob Hyndman gain from forecasting competitions involving hundreds or thousands of time series?
-From forecasting competitions, Rob Hyndman learned that simple models often outperform complex ones due to the limited length of individual time series, and that methods like exponential smoothing tend to do well, especially with data showing trends and seasonality.
Why do large, complicated models not always perform well in time series forecasting competitions?
-Large, complicated models are prone to overfitting when applied to individual time series that are not long enough to support the complexity of such models, which is often the case in forecasting competitions.
What is the importance of testing forecasting methods on holdout sets according to Rob Hyndman?
-Testing forecasting methods on holdout sets is crucial for evaluating their effectiveness and identifying which methods work well with the specific data at hand. This practice helps in selecting the most appropriate forecasting approach.
What is the significance of the ETS algorithm mentioned by Rob Hyndman?
-The ETS (exponential smoothing) algorithm is significant because it automates the process of fitting exponential smoothing models to time series data, making it easier to forecast without manual intervention.
How does Rob Hyndman's involvement in developing the algorithm for Tableau reflect his expertise in forecasting?
-Rob Hyndman's involvement in developing Tableau's forecasting algorithm demonstrates his expertise in creating efficient and effective automated forecasting solutions, contributing to the accessibility of advanced forecasting methods in widely used software.
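
The strategy of letting the automatic algorithm handle most series while analysts focus on the poorly forecast ones can be sketched as a triage step. The Python below is an assumption-laden illustration, not a method described in the interview: it scores each series with the mean absolute scaled error (holdout error divided by the in-sample error of a one-step naive forecast) and flags anything above a hypothetical threshold of 1.0. The sensor names and numbers are invented.

```python
def mase(train, holdout, forecast):
    """Holdout MAE scaled by the in-sample MAE of a one-step naive forecast."""
    scale = sum(abs(b - a) for a, b in zip(train, train[1:])) / (len(train) - 1)
    err = sum(abs(y - f) for y, f in zip(holdout, forecast)) / len(holdout)
    if scale == 0:
        # Constant training data: any nonzero holdout error is unbounded.
        return 0.0 if err == 0 else float("inf")
    return err / scale


def flag_for_review(results, threshold=1.0):
    """Return the keys whose score exceeds the threshold, worst first."""
    scores = {key: mase(*parts) for key, parts in results.items()}
    flagged = [key for key, s in scores.items() if s > threshold]
    return sorted(flagged, key=scores.get, reverse=True), scores


# Hypothetical results: key -> (training data, holdout data, automatic forecast).
results = {
    "sensor_a": ([10, 11, 10, 11, 10], [11, 10], [10.5, 10.5]),
    "sensor_b": ([10, 11, 10, 11, 10], [20, 21], [10.5, 10.5]),
    "sensor_c": ([5, 5, 5, 5], [5, 6], [5, 5]),
}

flagged, scores = flag_for_review(results)
print("needs analyst attention:", flagged)
```

The threshold is a tuning choice: a lower value sends more series to analysts, a higher value trusts the automatic forecasts more.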

Related Tags

Time Series, Forecasting, Big Data, Automation, Competitions, R Software, ARIMA, Exponential Smoothing, Data Analysis, Predictive Modeling