Five Steps to Create a New AI Model

IBM Technology
13 Sept 2023 · 06:56

Summary

TLDR: Deep learning has revolutionized AI model development with foundation models, streamlining the process from data preparation to deployment. These models, adaptable through fine-tuning, accelerate specialized AI creation. The workflow involves data preparation, model training, validation, tuning, and deployment. IBM's Watsonx platform facilitates this, encompassing data management, governance, and AI interaction, promoting efficient AI lifecycle management.

Takeaways

  • šŸ¤– **Deep Learning Specialization**: Deep learning allows for the creation of specialized AI models such as chatbots and fraud detection systems.
  • šŸ”„ **Foundation Models**: Foundation models provide a base that can be fine-tuned for specific applications, streamlining the AI development process.
  • šŸ’¾ **Data Preparation**: Stage 1 involves preparing large amounts of data from various domains, including categorization and filtering.
  • šŸ—ļø **Model Training**: Stage 2 is about training the model on the prepared data piles, which can involve various types of foundation models.
  • šŸ” **Validation**: After training, models are benchmarked in Stage 3 to assess their performance and quality.
  • šŸ› ļø **Fine-Tuning**: In Stage 4, non-AI experts can fine-tune the model with local data to improve its performance.
  • šŸš€ **Deployment**: Stage 5 covers deploying the model either as a service in the cloud or embedded in an application.
  • šŸŒ **IBM's Watsonx**: IBM's Watsonx platform supports all stages of the AI model development workflow.
  • šŸ’§ **Watsonx.data**: Watsonx.data is a data lakehouse that connects to data repositories for Stage 1.
  • šŸ“‹ **Watsonx.governance**: Watsonx.governance manages data and model cards for governance and lifecycle management.
  • šŸ¤ **Engagement**: Watsonx.ai enables application developers to engage with the model in Stage 4.

Q & A

  • What is the significance of deep learning in building specialized AI models?

    -Deep learning allows for the creation of detailed and specialized AI models, such as customer service chatbots or fraud detection systems in banking, by training them with large amounts of labeled data.

  • What is the traditional process of building a new AI model for a specific specialization?

    -The traditional process involves starting from scratch with data selection and curation, labeling, model development, training, and validation for each new specialization.

  • How do foundation models change the traditional AI model development paradigm?

    -Foundation models provide a centralized base model that can be fine-tuned and adapted to specialized models, speeding up the development process.

  • What is the purpose of fine-tuning a foundation model?

    -Fine-tuning adjusts a foundation model to a specific use case by training it with relevant data, which can significantly reduce the time and computational resources required.

  • What are the five stages of the workflow to create an AI model as described in the script?

    -The five stages are: 1) Prepare the data, 2) Train the model, 3) Validate the model, 4) Tune the model, and 5) Deploy the model.

  • What types of data are used in Stage 1 of the AI model creation workflow?

    -Stage 1 uses a combination of open source data and proprietary data across various domains, which may include petabytes of data.

  • What data processing tasks are performed during the preparation of data in Stage 1?

    -Data processing tasks include categorization, filtering for inappropriate content, and removal of duplicates, resulting in a base data pile.

  • How does the selection of a foundational model affect the training process in Stage 2?

    -The choice of foundational model influences the training process by determining the type of data it will work with and the complexity of the model, which can affect training duration and resource requirements.

  • What is the role of the application developer in Stage 4 of the AI model workflow?

    -In Stage 4, the application developer engages with the model to generate prompts and may provide additional local data to fine-tune the model for better performance.

  • How does the IBM Watsonx platform support the AI model creation workflow?

    -IBM Watsonx supports the workflow with three elements: Watsonx.data for data management, Watsonx.governance for data and model governance, and Watsonx.ai for application developer engagement.

  • What benefits do foundation models offer in terms of AI model development?

    -Foundation models enable the creation of sophisticated AI applications more quickly by providing a base model that can be adapted to various specializations through fine-tuning.

Outlines

00:00

šŸ¤– Deep Learning and Foundation Models

This paragraph discusses the evolution of AI model development with deep learning, emphasizing the importance of data gathering, labeling, and training. It introduces the concept of foundation models as a base for creating specialized AI models through fine-tuning. The paragraph outlines the five stages of AI model development: data preparation, model training, validation, tuning, and deployment. Data preparation involves categorization, filtering out unwanted content, and creating a base data pile. Model training involves selecting a foundational model, tokenizing data, and training the model, which can be computationally intensive. Validation assesses the model's performance against benchmarks. Tuning allows developers to improve model performance with local data and prompts. Deployment can be either as a cloud service or embedded in an application.

05:03

šŸš€ Streamlining AI Model Development with IBM's Watsonx

The second paragraph focuses on the practical application of the five-stage workflow for AI model development, as facilitated by IBM's Watsonx platform. Watsonx streamlines the process by providing tools for each stage: Watsonx.data for data management, Watsonx.governance for overseeing data and model cards, and Watsonx.ai for application developers to engage with the model. The paragraph highlights how foundation models are revolutionizing AI model development, allowing for more sophisticated and rapid creation of AI applications. The platform is built on IBM's hybrid cloud platform, Red Hat OpenShift, indicating a robust infrastructure for AI development.

Keywords

šŸ’”Deep learning

Deep learning is a subset of machine learning that focuses on artificial neural networks with multiple layers, or 'deep' layers, to model and understand complex patterns in data. In the context of the video, deep learning enables the creation of specialized AI models that can perform tasks like customer service chatbots or fraud detection in banking. It's foundational to the process of building AI models, as it allows for the training of these models on large datasets.

šŸ’”Data labeling

Data labeling is the process of categorizing and describing data to make it useful for machine learning models. It's a critical step in training AI, as it helps the model understand what the data represents. In the script, data labeling is mentioned as a necessary step in preparing data for training AI models, ensuring that the model can accurately recognize and respond to different types of data.

šŸ’”Foundation model

A foundation model is a pre-trained AI model that serves as a starting point for developing specialized AI applications. It's designed to be adaptable through fine-tuning with specific data. The video discusses how foundation models can be fine-tuned for various applications, such as programming language translation, which speeds up the development process compared to starting from scratch.

šŸ’”Fine-tuning

Fine-tuning is the process of adjusting a pre-trained model to better suit a specific task or dataset. It's a key concept in the video, as it illustrates how a foundational model can be customized for a particular application by using relevant data to tweak its performance. This method is highlighted as a way to rapidly develop specialized AI models.

šŸ’”Data processing tasks

Data processing tasks refer to a series of operations performed on data to prepare it for use in AI models. These tasks, as mentioned in the script, include categorization, filtering, and removal of duplicates. They are essential for creating a clean and organized dataset, which is the 'base data pile' used for training AI models.

šŸ’”Model validation

Model validation is the process of assessing an AI model's performance against a set of benchmarks to ensure its quality and effectiveness. In the video, this step is crucial for determining how well the trained model performs before it is fine-tuned or deployed. It helps in creating a model card that documents the model's capabilities and performance metrics.

šŸ’”Application developer

An application developer, in the context of the video, is someone who works on integrating AI models into applications. They engage with the model during the 'tune' stage to generate prompts and provide local data for fine-tuning. This persona is crucial for adapting the AI model to specific use cases and does not necessarily need to be an AI expert.

šŸ’”Deployment

Deployment refers to the process of making an AI model operational in a real-world environment, either as a service in the cloud or embedded in an application at the network's edge. The video discusses deployment as the final stage of the AI model workflow, where the model is put to use and can continue to be iterated and improved upon.

šŸ’”Watsonx

Watsonx is the platform announced by IBM that enables all five stages of the AI model workflow discussed in the video. It is composed of three elements: watsonx.data, watsonx.governance, and watsonx.ai. Watsonx represents IBM's approach to providing a comprehensive solution for AI model development, from data preparation to deployment.

šŸ’”Hybrid cloud platform

A hybrid cloud platform, as mentioned in the context of IBM's Watsonx, is a computing environment that combines public and private cloud services. This platform allows for flexibility and can cater to various deployment needs, whether an AI model is run as a service in the public cloud or embedded in an application closer to the network's edge.

šŸ’”Data governance

Data governance is the process of managing and controlling the use of data in an organization. In the video, watsonx.governance is highlighted for managing data cards and model cards, ensuring that the AI development process is well-governed and follows established standards. This is crucial for maintaining the integrity and compliance of AI models.

Highlights

Deep learning enables building detailed specialized AI models with sufficient data.

Foundation models change the paradigm by providing a base model adaptable to specializations.

Foundation models can be fine-tuned with specialized data for rapid AI model development.

Stage 1 of AI model creation involves preparing data, which may include petabytes of data across domains.

Data processing in Stage 1 includes categorization, filtering for inappropriate content, and removing duplicates.

The output of Stage 1 is a base data pile, which is versioned and tagged for governance.

Stage 2 involves training the model on the base data piles using various types of foundation models.

Tokenization is a key step in preparing data for training foundation models.

Training foundation models can be computationally intensive, taking months and requiring thousands of GPUs.

Stage 3 is validation, where the model's performance is benchmarked and a model card is created.

Stage 4 introduces the application developer who fine-tunes the model with local data and prompts.

Fine-tuning can be done quickly, in hours or days, compared to building a model from scratch.

Stage 5 is deployment, where the model can be offered as a service or embedded into applications.

IBM's Watsonx platform supports all five stages of the AI model creation workflow.

Watsonx.data connects with data repositories, Watsonx.governance manages data and model cards, and Watsonx.ai engages developers.

The 5-stage workflow allows for the creation of sophisticated AI applications more rapidly.

Transcripts

play00:00

Deep learning has enabled us to build detailed specialized AI models,

play00:07

and we can do that provided we gather enough data,

play00:10

label it, and use that to train and deploy those models.

play00:14

Models like customer service chatbots or fraud detection in banking.

play00:18

Now, in the past if you wanted to build a new model for your specialization -

play00:22

so, say a model for predictive maintenance in manufacturing -

play00:26

well, you'd need to start again with data selection and curation,

play00:30

labeling, model development, training, and validation.

play00:33

But foundation models are changing that paradigm.

play00:37

So what is a foundation model?

play00:42

A foundation model is a more focused, centralized effort to create a base model.

play00:50

And, through fine tuning, that base foundation model can be adapted to a specialized model.

play00:55

Need an AI model for programming language translation?

play00:59

Well, start with a foundational model

play01:01

and then fine tune it with programming language data.

play01:04

Fine tuning and adapting base foundation models rapidly speeds up AI model development.

play01:10

So, how do we do that?

play01:12

Let's look at the five stages of the workflow to create an AI model.

play01:17

Stage 1 is to prepare the data.

play01:24

Now in this stage we need to train our AI model with the data we're going to use,

play01:30

and we're going to need a lot of data.

play01:32

Potentially petabytes of data across dozens of domains.

play01:36

The data can combine both available open source data and proprietary data.

play01:41

Now this stage performs a series of data processing tasks.

play01:48

Those include categorization, which describes what the data is.

play01:53

So which data is English, which is German?

play01:55

Which is Ansible, which is Java? That sort of thing.

play01:58

Then filters are also applied to the data.

play02:03

So filtering allows us to, for example, apply filters for hate speech,

play02:08

and profanity and abuse, and that sort of thing.

play02:10

Stuff we want to filter out of the system so that we don't train the model on it.

play02:15

Other filters may flag copyrighted material, private or sensitive information.

play02:20

Something else we're going to take out is duplicate data as well.

play02:25

So we're going to remove that from there.

play02:28

And then that leaves us with something called a base data pile.

play02:35

So that's really the output of stage one.

play02:39

And this base data pile can be versioned and tagged.

play02:43

And that allows us to say, "This is what I'm training the AI model on, and here are the filters I used".

play02:50

It's perfect for governance.
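
As an aside for readers who want something concrete: here is a minimal sketch, in Python, of what this stage's categorize-filter-deduplicate-version flow could look like. The category rules, filter terms, and tagging scheme are illustrative assumptions, not the actual pipeline used for foundation models.

```python
# A minimal Stage 1 sketch: categorize, filter, de-duplicate, then version
# the result as a "base data pile". All category rules and filter terms
# below are illustrative placeholders.
import hashlib
import json

BLOCKED_TERMS = {"<hate-speech-term>", "<profanity-term>"}  # placeholders

def categorize(doc: str) -> str:
    """Toy categorizer: real pipelines use trained classifiers per language/domain."""
    return "code" if "def " in doc or "import " in doc else "english"

def is_clean(doc: str) -> bool:
    """Filter out documents containing blocked content."""
    text = doc.lower()
    return not any(term in text for term in BLOCKED_TERMS)

def build_base_data_pile(docs: list[str]) -> dict:
    seen, pile = set(), []
    for doc in docs:
        digest = hashlib.sha256(doc.encode("utf-8")).hexdigest()
        if digest in seen or not is_clean(doc):
            continue  # drop duplicates and filtered content
        seen.add(digest)
        pile.append({"category": categorize(doc), "text": doc})
    # Version and tag the pile: "this is what I'm training on, and these are the filters I used".
    return {"version": "1.0", "filters": sorted(BLOCKED_TERMS), "documents": pile}

print(json.dumps(build_base_data_pile(["Hello world.", "Hello world.", "def f(): pass"]), indent=2))
```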

play02:52

Now, Stage 2 is to train the model.

play02:58

And we're going to train the model on those base data piles.

play03:01

So we start this stage by picking the foundational model we want to use.

play03:06

So we will select our model.

play03:10

Now, there are many types of foundation models.

play03:14

There are generative foundation models, encoder-only models, lightweight models, high parameter models.

play03:19

Are you looking to build an AI model to use as a chatbot, or as a classifier?

play03:24

So pick the foundational model that matches your use case,

play03:26

then match the data pile with that model.

play03:30

Next we take the data pile and we tokenize it.

play03:37

Foundation models work with tokens rather than words, and a data pile could result in potentially trillions of tokens.
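
For readers who want to see tokenization concretely, here is a small sketch using the open source Hugging Face transformers library; the choice of the gpt2 tokenizer is an arbitrary assumption for demonstration, since the video does not name one.

```python
# Tokenization sketch: foundation models consume integer token IDs, not words.
# Requires: pip install transformers
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # assumed tokenizer choice

text = "Foundation models work with tokens rather than words."
token_ids = tokenizer.encode(text)

print(token_ids)                                   # a list of integer token IDs
print(tokenizer.convert_ids_to_tokens(token_ids))  # the corresponding token strings
```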

play03:45

And now we can engage the process of training using all of those tokens.

play03:51

This process can take a long time, depending on the size of the model.

play03:55

Large scale foundation models can take months with many thousands of GPUs.

play04:00

But, once it's done, the longest and highest computational costs are behind us.

play04:06

Stage 3 is "validate".

play04:10

When training is finished we benchmark the model.

play04:13

And this involves running the model

play04:15

and assessing its performance against a set of benchmarks

play04:18

that help define the quality of the model.

play04:20

And then from here we can create a model card

play04:26

that says this is the model I've trained

play04:28

and these are the benchmark scores it has achieved.
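
A model card can be as simple as a structured record of the model, its training data version, and its scores. A hypothetical sketch follows; every field value is a placeholder.

```python
import json

# Hypothetical model card: all names and scores below are placeholders.
model_card = {
    "model_name": "<example-foundation-model>",
    "base_data_pile_version": "1.0",
    "benchmarks": {
        "<benchmark-A>": None,  # placeholder score
        "<benchmark-B>": None,  # placeholder score
    },
    "intended_use": "<chatbot | classifier | ...>",
}

with open("model_card.json", "w") as f:
    json.dump(model_card, f, indent=2)
```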

play04:32

Now up until this point the main persona that has performed these tasks

play04:37

is the data scientist.

play04:39

Now Stage 4 is "tune",

play04:42

and this is where we bring in the persona of the application developer.

play04:46

This persona does not need to be an AI expert.

play04:49

They engage with the model, generating - for example - prompts that elicit good performance from the model.

play04:55

They can provide additional local data to fine tune the model

play05:02

to improve its performance.
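
In practice, that engagement can be as simple as building a few-shot prompt from local examples and sending it to the model. A hedged sketch with the Hugging Face transformers pipeline follows; the gpt2 model and the ticket examples are assumptions for illustration only.

```python
# Prompting sketch: an application developer (not an AI expert) elicits good
# behavior by building a few-shot prompt from hypothetical local data.
# Requires: pip install transformers
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # assumed model choice

prompt = (
    "Classify each support ticket as BILLING or TECHNICAL.\n"
    "Ticket: 'I was charged twice.' -> BILLING\n"
    "Ticket: 'The app crashes on login.' -> TECHNICAL\n"
    "Ticket: 'My invoice is wrong.' ->"
)

result = generator(prompt, max_new_tokens=3)
print(result[0]["generated_text"])
```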

play05:04

And this stage is something that you can do in hours or days -

play05:08

much quicker than building a model from scratch.

play05:12

And now we're ready for Stage 5, which is to deploy the model.

play05:19

Now this model could run as a service offering deployed to a public cloud.

play05:24

Or we could, alternatively, embed the model into an application that runs much closer to the edge of the network.

play05:33

Either way we can continue to iterate and improve the model over time.
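
Running the model "as a service" typically means putting it behind an HTTP endpoint. Below is a minimal sketch using FastAPI; the framework choice and endpoint shape are assumptions, not something the video specifies.

```python
# Deployment sketch: expose the tuned model as a service endpoint.
# Requires: pip install fastapi uvicorn transformers
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
generator = pipeline("text-generation", model="gpt2")  # assumed model choice

class Query(BaseModel):
    prompt: str

@app.post("/generate")
def generate(query: Query):
    result = generator(query.prompt, max_new_tokens=50)
    return {"completion": result[0]["generated_text"]}

# Run locally with: uvicorn service:app --port 8000  (assuming this file is service.py)
```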

play05:38

Now here at IBM we've announced a platform that enables all 5 of the stages of this workflow.

play05:45

And it's called watsonx and it's composed of three elements.

play05:49

So we have: watsonx.data, watsonx.governance, and watsonx.ai,

play05:59

and this is all built on IBM's hybrid cloud platform, which is Red Hat OpenShift.

play06:10

Now Watsonx.data is a modern data lakehouse

play06:15

and establishes connections with the data repositories that make up the data in Stage 1.

play06:20

Watsonx.governance manages the data cards from Stage 1 and model cards from Stage 3,

play06:26

enabling a collection of fact sheets that ensure a well-governed AI process and lifecycle.

play06:31

And watsonx.ai provides a means for the application developer persona to engage with the model in Stage 4.

play06:40

Overall, foundation models are changing the way we build specialized AI models

play06:45

and this 5-stage workflow allows teams to create AI and AI-derived applications

play06:51

with greater sophistication while rapidly speeding up AI model development.


Related Tags
AI Models, Deep Learning, Data Science, Fraud Detection, Chatbots, Predictive Maintenance, Model Training, Data Processing, IBM Watson, AI Development