Get Started with Azure Machine Learning

Data Science Dojo
18 Jul 2017 · 12:14

TLDR: In this tutorial series, Phuc Duong from Data Science Dojo introduces viewers to Azure Machine Learning Studio, a cloud-based tool for data mining and machine learning that doesn't require coding. The series is aimed at anyone interested in learning Azure ML and is a prerequisite for Data Science Dojo's five-day boot camp. Duong covers the basics, including importing and exporting data, exploring datasets, and building predictive models. He also discusses the benefits of cloud computing, such as scalable storage and collaborative workspaces, as well as potential drawbacks like the need for a reliable internet connection and compliance issues. The video provides an overview of Azure ML, its integration with other technologies, and the costs associated with using the platform, including subscription fees and usage charges.

Takeaways

  • 📚 This tutorial series is aimed at anyone interested in learning Azure Machine Learning Studio, including those who will attend Data Science Dojo's five-day boot camp.
  • 💻 Azure Machine Learning Studio is a cloud-based tool that allows for drag-and-drop machine learning without the need for coding, making it accessible for beginners and advanced users alike.
  • 🔗 Azure ML integrates seamlessly with SQL, R, and Python, allowing users to mix and match these languages within the platform.
  • ☁️ As a cloud-based service, Azure ML offers the benefits of cloud computing, such as scalability, collaborative workspaces, and freedom from hardware maintenance.
  • 💰 The cost of using Azure ML includes a monthly subscription fee and charges based on usage, which covers runtime, deployment calls, and data storage.
  • 🚀 Azure ML allows for the deployment of models as web services, which can be consumed via REST APIs and auto-generated code in languages like C#, R, or Python.
  • 📈 The platform is particularly powerful for handling large datasets, as it is backed by robust data infrastructure such as Azure SQL databases, HDInsight, and Blob storage.
  • 🌐 Data governance and compliance are important considerations when using cloud-based tools like Azure ML, as not all industries or companies may allow for cloud storage of data.
  • 🔒 A reliable internet connection is a necessity when using cloud-based tools, as they require constant access to the internet to function.
  • 📈 Azure ML Studio is part of the Azure ecosystem, which is a cloud platform by Microsoft, comparable to Amazon Web Services and Google Cloud Services.
  • 📝 Phuc Duong, the presenter, has extensive experience in data science and data engineering, having authored books and lab manuals on Azure Machine Learning and related topics.
  • 📱 The cloud-based nature of Azure ML means that users can run the service on various devices, including less powerful machines like iPads, as the processing is done on the cloud servers.

Q & A

  • What is the purpose of the video tutorial series presented by Phuc Duong?

    - The purpose of the video tutorial series is to expand the viewer's data science toolkit by teaching how to data mine using Azure Machine Learning Studio.

  • Who is the intended audience for this video series?

    - The video series is intended for anyone who wants to learn about Azure Machine Learning Studio, and it is also one of the five modules students are expected to learn before attending Data Science Dojo's five-day boot camp.

  • What are some of the skills that students are expected to acquire by the end of the video series?

    - By the end of the series, students should be comfortable with importing and exporting data, exploring and manipulating datasets, building models and making predictions in Azure ML, exposing models as web services, and coding within Azure ML itself.

  • What is Azure Machine Learning Studio and how does it differ from traditional data science tools?

    - Azure Machine Learning Studio (Azure ML) is a cloud-based data mining and machine learning tool that does not require coding. It uses a drag-and-drop approach with a visual interface, which makes it more accessible for beginners, while its integration with SQL, R, and Python keeps it useful for advanced users.

  • What are the benefits of using a cloud-based machine learning tool like Azure ML?

    - Benefits include not needing powerful hardware, the ability to run on any device with a browser, collaborative workspaces, scalability, and robust IT infrastructure support for big data.

  • What are the potential downsides or considerations when using cloud computing for machine learning?

    - Downsides include the need for a reliable internet connection, compliance with industry and company data governance policies, and potential costs associated with data storage and usage.

  • How does one obtain Azure Machine Learning Studio?

    - One can obtain Azure Machine Learning Studio through a free trial, which provides limited access, or through a full workspace, which requires an Azure subscription.

  • What are the costs associated with using Azure Machine Learning Studio?

    - There is a monthly subscription fee of $9.99 per seat, plus additional usage charges for runtime, deployment calls, and data storage.

  • What are the different ways to obtain an Azure ML workspace?

    - You can obtain an Azure ML workspace through a free trial by signing up with your email, or by creating a workspace within an Azure subscription for a fully functional workspace.

  • What is the cost for storage in Azure Blob storage?

    - Azure Blob storage charges approximately $0.02 per gigabyte per month for data storage.

  • What is the estimated cost for using Azure ML for the entire video series?

    - The estimated cost for using Azure ML for the entire video series is around $20 for the month, which includes the seat and usage fees.

  • What is the role of Azure ML in the context of big data and IT infrastructure?

    - Azure ML provides a cloud-based platform that supports big data through integration with various Azure services like Azure SQL databases, HDInsight for Spark and Apache Hadoop, Blob storage, Data Lake storage, and Azure Data Factory for data pipeline orchestration.

  • How does Azure ML facilitate collaboration among team members?

    - Azure ML is collaborative: users can invite others into their cloud workspace, so team members can work together on the same machine learning projects.

Outlines

00:00

📚 Introduction to Azure Machine Learning Studio

Phuc Duong introduces Data Science Dojo and the upcoming multi-part video series on Azure Machine Learning Studio. He explains that the series is aimed at anyone interested in learning about Azure ML and mentions the five-day boot camp on data science and data engineering. The boot camp is a 50-hour, in-person training program covering various modules, including R and Azure ML Studio. Duong emphasizes that students should be comfortable with these tools before attending. Even for those not attending the course, the video series will provide comprehensive knowledge of Azure ML, covering data import/export, exploration, manipulation, transformation, cleaning, model building, prediction, and API consumption. The series assumes prior knowledge of data mining, and for those without it, a video series link is provided. Duong shares his background in teaching data science and data engineering, his authorship of a lab manual on Azure ML, and his other works. He also discusses the evolution of Azure ML and the need for an updated tutorial series. The segment concludes with an overview of what Azure ML is, its cloud-based nature, and the benefits of its drag-and-drop interface for machine learning.

05:02

💻 Azure ML and Cloud Computing Benefits

The second segment covers the benefits of cloud computing and how they apply to Azure Machine Learning. Duong discusses the cost-effectiveness of cloud storage, the importance of large datasets for machine learning, and the IT infrastructure required to support them. He highlights Azure's comprehensive data infrastructure, including SQL databases, HDInsight, Blob storage, Data Lake storage, and Azure Data Factory. The segment also covers the freedom from hardware maintenance, the ability to run on less powerful devices, and the collaborative nature of cloud-based tools. Duong emphasizes the scalability of cloud services and their ability to distribute workloads efficiently. However, he also addresses potential drawbacks, such as the need for a reliable internet connection and compliance issues related to data governance and industry regulations. The segment concludes with an explanation of how to obtain Azure ML Studio, either through a free trial or a full workspace subscription, and touches on the costs associated with using Azure ML.

10:03

💼 Pricing and Subscription Details for Azure ML Studio

In the final segment, Duong provides a detailed breakdown of the costs associated with using Azure Machine Learning Studio. He explains that there are two main components to the pricing: a monthly subscription fee and charges for usage. The monthly subscription is $9.99 per seat. Usage charges are divided into three parts: runtime, deployment calls, and data storage. Runtime is charged at $1 per hour of experiment run time, deployment calls are charged based on the number of API calls to the deployed web services, and data storage is charged at $0.02 per gigabyte per month. Duong mentions that the first 1,000 API calls are free and estimates that the entire video series might cost around $20 for the month. He also notes that charges are prorated if the workspace is deleted. The segment ends with an invitation for viewers to like, subscribe, and share their favorite data mining tools in the comments, and a teaser for the next video, which will demonstrate how to create an Azure ML workspace.
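
As a rough check on the figures quoted in this segment, the pricing can be sketched as a small calculation. The rates are the ones quoted in the video; the usage numbers (10 hours of run time, 1 GB of storage, 500 API calls) and the `price_per_extra_call` parameter are illustrative assumptions, since the summary gives neither a per-call price nor usage totals.

```python
# Rates as quoted in the video summary (USD).
SEAT_FEE_PER_MONTH = 9.99     # per seat, per month
RUNTIME_RATE_PER_HOUR = 1.00  # per hour of experiment run time
STORAGE_RATE_PER_GB = 0.02    # per gigabyte, per month
FREE_API_CALLS = 1_000        # first 1,000 calls to deployed services are free


def estimate_monthly_cost(runtime_hours, storage_gb, api_calls=0,
                          price_per_extra_call=0.0, seats=1):
    """Estimate one month of Azure ML Studio charges in USD.

    price_per_extra_call is a hypothetical parameter: the summary does not
    state what calls beyond the free allowance cost, so it defaults to 0.
    """
    billable_calls = max(0, api_calls - FREE_API_CALLS)
    total = (seats * SEAT_FEE_PER_MONTH
             + runtime_hours * RUNTIME_RATE_PER_HOUR
             + storage_gb * STORAGE_RATE_PER_GB
             + billable_calls * price_per_extra_call)
    return round(total, 2)


if __name__ == "__main__":
    # Assumed usage for the tutorial series: 10 hours, 1 GB, 500 API calls.
    print(estimate_monthly_cost(runtime_hours=10, storage_gb=1, api_calls=500))
```

Under these assumed usage numbers the estimate comes to $20.01, which is in line with the roughly $20 quoted for the month.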

Keywords

Azure Machine Learning Studio

Azure Machine Learning Studio, often abbreviated as Azure ML, is a cloud-based data mining and machine learning tool that allows users to build predictive models without the need for traditional coding. It is a visual interface that uses a drag-and-drop approach, which is particularly beneficial for beginners as they can focus on learning data mining and machine learning concepts without worrying about programming syntax. In the video, it is mentioned as a core tool for the Data Science Dojo's boot camp and is used to teach students how to import, export, explore, manipulate, and clean data, as well as build and deploy predictive models.

Data Science Dojo

Data Science Dojo is an organization that hosts a five-day boot camp on data science and data engineering. The boot camp is an intensive, in-person training program that lasts for 50 hours over a week, covering various modules including the use of Azure ML Studio. The video's presenter, Phuc Duong, is associated with Data Science Dojo, indicating that the content is part of a broader educational curriculum aimed at equipping students with practical data science skills.

Data Mining

Data mining is the process of discovering patterns and extracting useful information from large datasets. It is a core concept in the field of data science and is integral to the application of machine learning. In the context of the video, data mining is the primary focus of the tutorial series, with Azure ML Studio being used as the tool to teach participants how to perform this task effectively.

Cloud Computing

Cloud computing refers to the delivery of computing services, including storage, processing power, and databases, over the internet. It offers benefits such as scalability, collaborative capabilities, and reduced reliance on physical hardware. The video discusses the advantages and disadvantages of using cloud-based tools like Azure ML, emphasizing the flexibility and cost-effectiveness of cloud storage and the need for a reliable internet connection.

Drag and Drop Interface

A drag and drop interface is a user-friendly design feature that allows users to move items, such as modules or data elements, within a software application by dragging them with a mouse or touch input and dropping them into a desired location. Azure ML Studio uses this approach, making it accessible to users who may not have a strong programming background, as highlighted in the video.

Predictive Models

Predictive models are algorithms or statistical frameworks that are used to predict outcomes based on historical data. They are a fundamental part of machine learning and are used in various applications, from forecasting to decision-making. In the video, the presenter discusses how Azure ML Studio can be used to build and deploy predictive models as web services, which can then be consumed via APIs.
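
To make the idea of predicting from historical data concrete, here is a minimal sketch in plain Python rather than in Azure ML Studio's drag-and-drop interface. It fits a straight line by ordinary least squares and uses it to predict a new value. The data points are invented for illustration, and `fit_line` and `predict` are hypothetical helper names, not part of any Azure ML API.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Invented "historical" data that happens to follow y = 2x + 1.
history_x = [1, 2, 3, 4]
history_y = [3, 5, 7, 9]

model = fit_line(history_x, history_y)
forecast = predict(model, 5)  # prediction for an input not seen in the history
```

In Azure ML Studio the same two steps, training on historical data and then scoring new inputs, are expressed by connecting modules visually instead of writing functions.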

Web Services

Web services are applications that provide specific functionality over the internet using HTTP protocols. They can be consumed by other applications or users to perform tasks without needing to understand the underlying code. In the context of Azure ML, predictive models can be exposed as web services, allowing users to access the models' predictive capabilities through REST APIs.

APIs

API stands for Application Programming Interface, which is a set of protocols, routines, and tools for building software and applications. In the video, it is mentioned that after deploying a predictive model in Azure ML, it can be accessed via REST APIs. This allows for integration with other applications and services, such as those written in C#, R, or Python.
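
As a rough illustration of consuming a deployed model over REST from Python, the sketch below builds (but does not send) an HTTP POST request using only the standard library. The URL, the API key, the column names, and the JSON payload layout are all assumptions made for illustration; the sample code that Azure ML generates for a deployed web service is the authoritative reference for the real request format.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_names, rows):
    """Build a POST request that sends rows of input data to a scoring endpoint.

    The payload layout (Inputs / input1 / ColumnNames / Values) is an assumed
    shape for illustration, not a confirmed schema.
    """
    payload = {
        "Inputs": {
            "input1": {
                "ColumnNames": list(column_names),
                # Values are sent as strings, one inner list per input row.
                "Values": [[str(value) for value in row] for row in rows],
            }
        },
        "GlobalParameters": {},
    }
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,  # the key is a placeholder
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder endpoint and key; replace with the values for a real service.
request = build_scoring_request(
    "https://example.invalid/score",
    "YOUR_API_KEY",
    ["age", "income"],
    [[42, 55000]],
)

# Sending the request would look like this (left commented out because the
# endpoint above is not real):
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```

Separating request construction from sending keeps the example runnable offline and makes the headers and body easy to inspect before any call is billed.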

Azure Stack

Azure Stack is a Microsoft product that allows organizations to deliver Azure services from their own data centers. It enables hybrid cloud computing, a model in which cloud services are extended to on-premises infrastructure. In the video, however, the phrase "Azure stack" appears to refer to Microsoft's Azure cloud platform as a whole, which is where Azure ML resides alongside the broader set of Azure services.

Data Governance

Data governance is the process of managing the availability, usability, integrity, and security of data used in an organization. It is a critical aspect of cloud computing, as it ensures that data is handled in compliance with legal and regulatory requirements. The video discusses the importance of data governance when considering the use of cloud-based tools like Azure ML, particularly in terms of compliance with industry standards and company policies.

Pricing Model

The pricing model for a service outlines how customers are charged for using that service. For Azure ML Studio, the video explains that there is a monthly subscription fee per seat plus additional usage charges for runtime, deployment calls, and data storage. Understanding the pricing model is crucial for users to manage costs and make informed decisions about using the platform.

Highlights

This tutorial series is designed to expand your data science toolkit by teaching data mining using Azure Machine Learning Studio.

Data Science Dojo offers a five-day boot camp on data science and data engineering, which includes learning Azure ML Studio.

The boot camp is an intensive, 50-hour, in-person training that prepares students for tackling data science problems.

Azure ML Studio is a cloud-based tool that enables data mining and machine learning without the need for coding.

The platform features a drag-and-drop interface, making it accessible for beginners and advanced users alike.

Azure ML Studio seamlessly integrates with SQL, R, or Python, allowing users to mix and match these languages in their data mining processes.

Models created in Azure ML can be automatically deployed as web services and accessed via REST APIs.

Azure ML is part of the Azure stack, Microsoft's cloud platform, which offers robust IT services for building software platforms.

Cloud computing offers the advantage of not being limited by data size due to the plummeting cost of cloud storage.

Azure provides a comprehensive data infrastructure to support machine learning, including SQL databases, HDInsight, Blob storage, and Data Lake storage.

The cloud-based nature of Azure ML Studio allows for collaboration and scalability, distributing workloads across multiple nodes.

One of the cons of cloud computing is the requirement for a stable internet connection to access your work.

Compliance and data governance are critical considerations when using cloud services for industry-specific data.

Azure Machine Learning Studio can be accessed through a free trial or a full workspace with an Azure subscription.

The pricing for Azure ML Studio includes a monthly subscription fee and charges based on usage, such as runtime, deployment calls, and data storage.

The first 1,000 API calls to deployed web services are typically free, reducing costs for small-scale deployments.

For the duration of this tutorial series, the estimated cost for using Azure ML Studio on a pay-as-you-go basis is around $20 per month.

The presenter, Phuc Duong, has extensive experience in data science and data engineering and has authored several books and manuals on the subject.

Azure ML Studio's visual interface is compared to Visio or PowerPoint, making it more user-friendly than traditional data science tools.