SAS Tutorial | How to Choose a Machine Learning Algorithm

SAS Users
7 Dec 2020 | 14:09

Summary

TL;DR: In this tutorial, Aurora Peddycord-Liu discusses how to select the right machine learning algorithms by focusing on data, goals, and metrics. She compares the pros and cons of decision trees, neural networks, and deep learning models, offering guidance on when to use each method. The session also includes a demo in SAS Viya Model Studio, showcasing how to assess and compare model performance systematically. Aurora emphasizes the importance of matching machine learning techniques to project needs and concludes with an introduction to SAS Model Manager for model deployment and monitoring.

Takeaways

  • 😀 Aurora Peddycord-Liu introduces her approach to selecting machine learning algorithms based on data, goals, and metrics.
  • 🤖 The first consideration when selecting algorithms is the type of data, whether it includes well-defined features or signals like time-series and image data.
  • 🎯 Understanding the goal of the project is essential: is it prediction, exploration, or description?
  • ⚖️ Metrics like accuracy, speed, interpretability, and implementation difficulty are key factors in selecting a model.
  • 🌳 Decision trees are great for messy data, highly interpretable, and actionable, but their topology is unstable when the data changes.
  • 🧠 Neural networks are universal approximators and resistant to the curse of dimensionality, but they require large datasets and substantial training time, and they act as a 'black box' that is not interpretable.
  • 💡 Deep learning excels at handling specific data like images or time-series by leveraging layers of abstraction, but it requires large amounts of data and computational power and has low explainability.
  • 🔧 Machine learning algorithms are tools, and matching the method to the project's data characteristics and goals is critical.
  • 🛠 SAS Viya Model Studio is demonstrated as a tool for pre-processing data, selecting models, and systematically comparing them to find the best algorithm.
  • 📊 Gradient boosting models performed best in the example using SAS Viya, and the platform also supports comparing pipelines and monitoring model performance over time.
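
The takeaway that a decision tree is a readable set of rules that copes with messy data can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the feature names (`tenure_months`, `support_calls`), the thresholds, and the choice to send missing values down a fixed branch are hypothetical and are not taken from the tutorial. Real tree implementations differ in how they treat missing values.

```python
from typing import Optional

# Hypothetical customer record: either field may be missing (None).
Record = dict[str, Optional[float]]


def predict_churn(record: Record) -> str:
    """Hand-written two-level tree; thresholds are illustrative, not learned."""
    months = record.get("tenure_months")
    # A missing tenure is routed to its own branch instead of being imputed.
    if months is None:
        return "churn"
    if months < 12:
        calls = record.get("support_calls")
        # A missing call count falls through to the default branch below.
        if calls is not None and calls < 3:
            return "stay"
        return "churn"
    # Any tenure of 12 or more lands here, including extreme outliers.
    return "stay"


customers = [
    {"tenure_months": 30.0, "support_calls": 1.0},
    {"tenure_months": 4.0, "support_calls": 5.0},
    {"tenure_months": 4.0, "support_calls": None},    # missing feature
    {"tenure_months": None, "support_calls": 0.0},    # missing feature
    {"tenure_months": 9000.0, "support_calls": 2.0},  # outlier
]
predictions = [predict_churn(c) for c in customers]
print(predictions)
```

Because each rule only asks which side of a threshold a value falls on, the outlier is scored like any other long-tenure customer, and every prediction can be explained by reading the path taken through the rules.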

Q & A

  • What is the primary focus of Aurora Peddycord-Liu's tutorial?

    -The primary focus of Aurora Peddycord-Liu's tutorial is to share her approach to picking the right machine learning algorithms by discussing the pros and cons of commonly used models and demonstrating how to assess and compare models in SAS Viya Model Studio.

  • What are the three main components Aurora suggests considering before picking machine learning models?

    -Aurora suggests considering data, goal, and metrics as the three main components before picking machine learning models.

  • How does Aurora define 'data' in the context of selecting machine learning algorithms?

    -In the context of selecting machine learning algorithms, 'data' refers to understanding how the data is collected, whether it contains well-defined features or comes as a set of signals such as time-series sensor readings or image pixels, and whether there are any relationships or distributions in the data that should be taken advantage of.

  • What does Aurora mean by 'goal' when discussing machine learning projects?

    -By 'goal', Aurora refers to the objective of the machine learning project, such as whether the project aims to make predictions, describe, or explore the data.

  • What are some of the metrics Aurora considers important when selecting machine learning models?

    -Aurora considers metrics such as accuracy, speed during the scoring process on new data, interpretability, ease of implementation, and low maintenance as important factors when selecting machine learning models.

  • Why does Aurora recommend decision trees for messy data?

    -Aurora recommends decision trees for messy data because they are essentially a set of rules that partition the data, which makes them flexible in handling outliers and missing values without first replacing the missing values.

  • What are the advantages of using neural networks according to the tutorial?

    -Neural networks are advantageous because they are universal approximators, resistant to the curse of dimensionality, and fast at scoring once trained. They can model any non-linear relationship without needing a predefined formula.

  • What is the main drawback of using neural networks mentioned in the tutorial?

    -The main drawback of using neural networks is that they act as a black box, meaning they are not interpretable and do not offer explainability.

  • How does Aurora differentiate deep learning from neural networks?

    -Aurora differentiates deep learning from neural networks by explaining that deep learning involves more complex network structures designed for specific tasks and data types, such as time series data or image data, and it is not just about having more layers in a neural network.

  • What is the role of SAS Viya Model Studio in the machine learning model selection process as described in the tutorial?

    -SAS Viya Model Studio provides a systematic way to assess results: it lets users compare different models and their performance metrics easily and visually.

  • What is the significance of the telecommunication company data example used in SAS Viya Model Studio?

    -The telecommunication company data example is significant as it demonstrates how to apply machine learning to predict customer churn, using clearly defined features, and how SAS Viya Model Studio can be used to compare different models to find the best one for the task.
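
The idea of comparing candidate models systematically on the same data can be sketched in a few lines of plain Python. This is not how SAS Viya Model Studio works internally, and it does not reproduce the demo's results: the labels, the predictions, the model names, and the choice of misclassification rate as the single comparison metric are all made up for illustration.

```python
# Toy validation labels: 1 = churned, 0 = stayed (made-up data).
actual = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

# Made-up predictions from three hypothetical candidate models.
candidates = {
    "decision_tree":     [1, 0, 1, 1, 0, 0, 0, 0, 1, 0],
    "neural_network":    [1, 0, 0, 0, 0, 1, 1, 0, 0, 0],
    "gradient_boosting": [1, 0, 0, 1, 0, 1, 0, 1, 1, 0],
}


def misclassification_rate(y_true, y_pred):
    """Fraction of validation rows where the prediction differs from the label."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


# Score every candidate on the same validation set with the same metric.
scores = {
    name: misclassification_rate(actual, preds)
    for name, preds in candidates.items()
}

# The champion is the candidate with the lowest error on that metric.
champion = min(scores, key=scores.get)

for name, rate in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {rate:.2f}")
print("champion:", champion)
```

Holding the validation set and the metric fixed is what makes the comparison fair. In practice, the other metrics discussed in the tutorial, such as scoring speed and interpretability, would be weighed alongside the error rate before a model is chosen.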


Related tags
Machine Learning, Algorithm Selection, Decision Tree, Neural Network, Deep Learning, SAS Viya, Model Comparison, Data Science, AI Techniques, Model Performance