How to Use Llama 3 with PandasAI and Ollama Locally

Tirendaz AI
3 May 2024 · 13:55

Summary

TLDR: This video script introduces viewers to the integration of Llama 3, a large language model, with PandasAI and Ollama for local data analysis. It walks through setting up a virtual environment, installing the necessary open-source tools, and building an interactive app with Streamlit. The app lets users explore and analyze the Titanic dataset using natural language prompts, demonstrating the power of generative AI in data manipulation and visualization.

Takeaways

  • 🔍 Large models have huge potential for data analysis, and this tutorial covers using Llama 3 with PandasAI and Ollama.
  • 🛠️ PandasAI is a Python tool that allows you to explore, clean, and analyze data using generative AI, making data interaction easier.
  • 💻 Ollama helps you run large language models (LLMs) like Llama 3 locally, without needing an API key.
  • 📂 The app created in this tutorial uses the Titanic dataset to demonstrate how to interact with data using PandasAI.
  • 🛠️ The tutorial walks through setting up a virtual environment with conda, installing necessary tools like pandas, pandasai, and streamlit, and initializing Llama 3 with Ollama.
  • 🔗 The app is built using Streamlit, a popular tool for creating web apps with Python.
  • 📊 The app allows users to upload a CSV file, view the first rows of data, and interact with the dataset using natural language prompts.
  • 🤖 The SmartDataframe class in PandasAI is used to convert the dataset into a format that can be queried with natural language prompts.
  • 📈 The tutorial demonstrates various data queries and visualizations, including bar charts, pie charts, histograms, and heatmaps, generated by interacting with the dataset using Llama 3.
  • 🔍 The key message is that good prompts lead to good outputs, and the combination of PandasAI, Llama 3, and Streamlit makes data exploration intuitive and powerful.

Q & A

  • What are the tools mentioned in the script for data analysis with large models?

    -The tools mentioned are PandasAI, Ollama, and Streamlit. PandasAI is a smart version of pandas for data exploration, cleaning, and analysis using generative AI. Ollama helps run large models like Llama 3 locally. Streamlit is used for building the app interface.

  • Why is PandasAI considered a smart version of pandas?

    -PandasAI is considered a smart version of pandas because it allows users to explore, clean, and analyze data using generative AI, enabling conversational interaction with the data.

  • How does Ollama assist in working with large models locally?

    -Ollama assists by allowing users to run open-source Large Language Models (LLMs) like Llama 3 locally on their computers, which would otherwise be difficult to manage.

  • What is the purpose of the app created in the script?

    -The app is created to demonstrate the power of Llama-3 by allowing users to chat with a dataset, specifically the Titanic dataset, and get responses based on their prompts.

  • What is the first step in setting up the environment for the app as described in the script?

    -The first step is to create a virtual environment using conda, naming it 'genai', and then activating it.

  • How are the required tools for the app installed in the script?

    -The required tools are installed by creating a 'requirements.txt' file listing the necessary libraries such as pandas, pandasai, and streamlit, and then using pip to install them with the command 'pip install -r requirements.txt'.

  • What is the role of the LocalLLM class in the script?

    -The LocalLLM class is used to instantiate an LLM object that connects to Ollama and specifies the model to be used, facilitating the interaction with the Llama 3 model for data analysis.

  • How is the user input for chatting with the dataset collected in the app?

    -User input is collected through a text area created in the Streamlit app, where users can enter their prompts to interact with the dataset.

  • What is the significance of the 'spinner' method used in the app?

    -The 'spinner' method is used to display a loading animation with the message 'Generating response...' to indicate that the app is processing the user's prompt before displaying the results.

  • How can users visualize data using the app created in the script?

    -Users can visualize data by entering prompts to plot various types of charts such as bar charts, pie charts, histograms, and heatmaps based on the dataset's columns.

  • What is the importance of a good prompt when using the app?

    -A good prompt is crucial because it directly affects the quality of the output. 'Garbage in, garbage out' applies here; clear and specific prompts will yield more accurate and useful responses.

Outlines

00:00

🤖 Introduction to Generative AI Tools

This paragraph introduces the potential of large models in data analysis and the tools that will be used in the video: Llama 3, PandasAI, and Ollama. The video aims to demonstrate the use of these open-source tools for local data analysis without the need for an API key. PandasAI is described as an intelligent version of the popular pandas library for data manipulation, allowing users to interact with their data using generative AI. Ollama is highlighted as a tool that facilitates the local running of large models, making it easier to work with them without cloud-based services. The video will guide viewers through creating an app that uses these tools to interact with the Titanic dataset, showcasing the power of Llama-3.

05:03

πŸ› οΈ Setting Up the Development Environment

The second paragraph details the process of setting up the development environment for the app. It starts with creating a virtual environment using conda and installing the necessary tools: pandas, pandasai, and streamlit, which are specified in a requirements.txt file. The tools are installed using pip. The paragraph also covers the initialization of Llama 3 with Ollama, which involves downloading and installing Ollama, starting the Ollama server, and installing the Llama 3 model. The video script then proceeds to show how to initialize the model in the app using the LocalLLM class from pandasai, connecting it to the Ollama server and specifying the model to be used.

10:05

📊 Building the Data Analysis App with Streamlit

In this paragraph, the focus shifts to building the actual data analysis app using Streamlit. The app is initialized with a title, and a file uploader widget is created to load CSV datasets. The uploaded file is then converted into a pandas dataframe, and the first few rows are displayed to verify the upload. The dataframe is further converted into a SmartDataframe using the SmartDataframe class from pandasai, which is configured to use the previously initialized Llama 3 model. The app includes a text area for user prompts and a 'Generate' button that triggers the chat method of the SmartDataframe to interact with the dataset based on the user's input. The paragraph concludes with examples of how to use the app to ask questions about the dataset and generate responses, demonstrating the ease of exploring data with the help of generative AI.

📈 Visualizing Data and Making Inferences with Llama 3

The final paragraph demonstrates how to use the app to make inferences and visualize data from the Titanic dataset. It shows how to ask the app questions about the dataset, such as the number of rows and columns, average age, and the number of people with more than three siblings. The paragraph also covers how to calculate the percentage of surviving passengers by gender. Furthermore, it illustrates the process of plotting various charts, including bar charts, pie charts, histograms, and heatmaps, to gain insights into the dataset's distribution and relationships. The video emphasizes the importance of using clear and effective prompts to achieve accurate and meaningful results, concluding with an invitation to subscribe, like, and comment on the video.
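The gender-percentage query described above boils down to a short groupby in plain pandas, which is roughly the code PandasAI generates behind the scenes. A small sketch with made-up rows (the real dataset's percentages are reported in the transcript):

```python
import pandas as pd

# Illustrative Titanic-style rows (not the real dataset).
data = pd.DataFrame({
    "Sex": ["male", "male", "female", "male", "female"],
    "Survived": [0, 1, 1, 0, 1],
})

# "Return the percentage of passengers by gender"
gender_pct = data["Sex"].value_counts(normalize=True) * 100

# Survival rate within each gender, as a percentage.
survival_by_gender = data.groupby("Sex")["Survived"].mean() * 100

print(gender_pct)
print(survival_by_gender)
```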

Keywords

💡Data Analysis

Data analysis is the process of inspecting, cleaning, transforming, and modeling data to extract useful information, inform decision-making, and generate new knowledge. In the context of the video, data analysis is the main theme, focusing on how to utilize tools like PandasAI and Ollama to perform this process with large datasets like the Titanic dataset, showcasing the power of AI in data exploration and manipulation.

💡Llama 3

Llama 3 is a large language model that is part of the video's discussion. It represents the third version of a generative AI model capable of understanding and generating human-like text. The video script mentions using Llama 3 with Ollama to run the model locally, emphasizing the ease of working with large models without the need for external APIs like OpenAI.

💡PandasAI

PandasAI is introduced as a smart version of the popular Python library pandas, which is widely used for data manipulation and analysis. The script describes PandasAI as a tool that allows users to explore, clean, and analyze data using generative AI, essentially enabling conversational interaction with the data, which is a key aspect of the video's demonstration.

💡Ollama

Ollama is a tool highlighted in the video that facilitates the local running of large language models (LLMs). It is presented as a solution for the difficulty of working with large models locally, making it easier for users to leverage the capabilities of models like Llama 3 directly from their computers, as illustrated by the video's step-by-step guide.

💡Virtual Environment

A virtual environment in the context of the video refers to an isolated space where developers can install and use specific versions of libraries and tools without affecting the system's global state. The script details the creation and activation of a virtual environment named 'genai' using conda, which is essential for setting up the project's dependencies.

💡Requirements.txt

In software development, a 'requirements.txt' file lists the dependencies required for a Python project. The video script includes creating this file to specify libraries such as pandas, pandasai, and streamlit, which are necessary for the app's functionality, and then installing them using pip, a package installer for Python.
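For reference, the requirements.txt described here is just three lines (left unpinned as in the video; in practice you may want to pin versions for reproducibility):

```text
pandas
pandasai
streamlit
```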

💡Streamlit

Streamlit is an open-source Python library used for quickly creating and sharing custom web applications for data science. The video describes using Streamlit to build the app interface, allowing users to upload datasets, interact with the data through prompts, and visualize the results, demonstrating the ease of creating interactive apps with this tool.

💡SmartDataframe

SmartDataframe, as mentioned in the script, is a class from the PandasAI library that enhances the functionality of a traditional pandas DataFrame by integrating AI capabilities. It allows the dataset to be interacted with using natural language prompts, making complex data analysis tasks more accessible, as shown when the video's app uses it to chat with the dataset.

💡LocalLLM

LocalLLM is a class from the PandasAI library that represents a locally running large language model. The script explains how to instantiate a LocalLLM object and connect it to Ollama, specifying the model name and API base to ensure compatibility with the locally served Llama 3 model, which is crucial for the app's operation.

💡Data Visualization

Data visualization involves creating visual representations of data to analyze and communicate patterns and insights effectively. The video script includes using the app to generate various charts and plots, such as bar charts, pie charts, histograms, and heatmaps, to illustrate different aspects of the Titanic dataset, demonstrating the power of combining AI with visualization for data exploration.
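Under the hood, PandasAI generates plotting code much like the following hand-written sketch using pandas and matplotlib (toy data; the Agg backend lets it run headless):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative Titanic-style rows (not the real dataset).
data = pd.DataFrame({
    "Sex": ["male", "male", "female", "male", "female"],
    "Age": [22.0, 38.0, 26.0, 35.0, 29.0],
})

# "Draw a bar chart of the sex column"
fig, ax = plt.subplots()
data["Sex"].value_counts().plot(kind="bar", ax=ax)
ax.set_ylabel("count")
fig.savefig("sex_bar.png")

# "Draw the histogram of the age column"
fig2, ax2 = plt.subplots()
data["Age"].plot(kind="hist", bins=5, ax=ax2)
fig2.savefig("age_hist.png")
```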

💡Garbage In, Garbage Out (GIGO)

GIGO is a principle in data processing that states the output quality is dependent on the input quality. The script mentions this principle to emphasize the importance of crafting good prompts when using AI tools like PandasAI and Ollama, as poor input can lead to uninformative or incorrect outputs, highlighting the need for clear and thoughtful interaction with AI systems.

Highlights

Large models have significant potential for data analysis.

Introduction to using Llama 3 with PandasAI and Ollama for local model deployment without the need for an API key.

PandasAI is a smart version of pandas that allows for generative AI-based data exploration, cleaning, and analysis.

Ollama is a tool that facilitates the local running of large models, making it easier to work with them without cloud-based services.

Demonstration of an app creation process using Llama-3, showcasing its capabilities in data interaction and analysis.

Step-by-step guide on setting up a virtual environment for the project using conda.

Instructions on installing necessary tools like pandas, pandasai, and streamlit using a requirements.txt file.

Explanation of how to initialize Llama 3 with Ollama for local model usage.

Creating a Python file for the app and setting up the environment for coding with VS Code.

Using Streamlit to build and run the app, with automatic browser opening for easy access.

Creating a file uploader widget in Streamlit for dataset loading.

Conversion of uploaded CSV file into a pandas dataframe for data manipulation.

Utilization of the SmartDataframe class from PandasAI for advanced data interaction.

Incorporation of user input through a text area and a generate button for dynamic data queries.

Implementation of a spinner for user feedback during data processing.

Interactive chat with the dataset using the chat method of the SmartDataframe class.

Inference examples demonstrating how to ask the model questions about the dataset and receive answers.

Visualization capabilities of the app, including bar charts, pie charts, histograms, and heatmaps for data analysis.

Emphasis on the importance of good prompts for accurate and meaningful data analysis outcomes.

Conclusion summarizing the tutorial on using PandasAI with Llama-3 and building an interactive data analysis app with Streamlit.

Transcripts

00:00 Large models have huge potential for data analysis.
00:03 Today, we'll cover how to use Llama 3 with PandasAI and Ollama.
00:07 We'll run this model locally, so we won't need any API key like OpenAI's.
00:13 The tools we'll use in this video are completely open-source.
00:17 Let me briefly explain these tools.
00:18 When it comes to data manipulation, as you know, pandas is king.
00:23 You can think of PandasAI as a smart version of pandas.
00:26 PandasAI is nothing but a Python tool that allows you to explore, clean,
00:31 and analyze your data using generative AI.
00:34 This means that you can talk to your data using this tool.
00:38 Cool, right?
00:38 Another amazing tool in generative AI is Ollama.
00:42 As you know, it is difficult to work with large models locally.
00:46 This is where Ollama comes into play.
00:48 Ollama helps you run LLMs locally.
00:50 Awesome, right?
00:50 Here is the app we'll create.
00:52 First, we'll load our dataset.
00:55 The dataset we'll use is the Titanic dataset.
00:57 Here, we'll enter our prompt to chat with our dataset.
01:01 And then the app will return a response.
01:04 This is a simple app but great to see the power of Llama 3.
01:07 Let's take a look at the topics we'll handle.
01:10 First, we'll create a virtual environment and then install the tools we'll use.
01:14 Next, we'll initialize Llama 3 with Ollama.
01:17 After that, we'll load the dataset.
01:19 And then we'll build the app.
01:21 Lastly, we'll chat with the dataset using PandasAI.
01:24 Nice, we've seen the topics we'll cover.
01:27 Let's go ahead and start with the setup.

01:29 To write our code, we're going to use the VS Code editor.
01:32 Now, let's create a virtual environment.
01:34 To do this, we're going to use conda.
01:36 Let's write,
01:37 conda create -n genai
01:42 Let me press enter.
01:44 I already created this.
01:46 After that, we're going to activate this environment.
01:49 conda activate genai
01:53 Here you go.
01:54 Our environment is ready to use.
01:56 The next step is to install the tools we'll use.
02:00 To do this, let's create a requirements.txt file.
02:02 We're going to click on the new file and then name it.
02:09 requirements.txt
02:11 Okay.
02:12 Let's write the libraries we'll use here.
02:14 To work with PandasAI, let's write pandas and pandasai.
02:20 Next, to build the app, we'll use streamlit.
02:24 Let's write streamlit.
02:25 Okay, these tools are enough to build the app.
02:28 We're going to now install these.
02:30 It's simple to do this with pip.
02:32 Let's type,
02:33 pip install -r requirements.txt
02:38 Let me press enter.
02:40 Here you go.
02:41 Our tools are ready to use.
02:44 To write our code, we're going to create a Python file.
02:47 Let me click on the new file and then give it a name.
02:50 Let's say, app.py
02:52 Yeah, our file is ready.
02:54 So far, we created a virtual environment and installed the tools.
02:58 Let's go ahead and initialize the model with Ollama.

03:01 When it comes to working with LLMs locally, Ollama is king.
03:05 This tool allows you to run open-source LLMs, such as Llama 3 and Mistral.
03:09 Trust me, it's very easy to use Ollama.
03:12 All you need to do is download it from the website and then install it on your computer.
03:17 After installing, you can use Ollama in the terminal.
03:20 First, let's start it. Let's go to the terminal and then
03:25 write, ollama serve
03:28 I already started Ollama.
03:30 Ollama is ready to use.
03:31 You can use many large models with Ollama.
03:34 Let me show you these models.
03:35 Let's click the model.
03:37 There you go.
03:38 Here, you can choose any model you want.
03:40 It is a piece of cake to load a model with Ollama.
03:43 To install a model, we're going to use the terminal.
03:46 Let me show you.
03:47 Let's say we want to install Llama 3.
03:50 ollama pull llama3
03:55 Let me press enter.
03:56 There you go.
03:57 As you can see, Llama 3 is loaded.
03:59 You can see the installed models with the list command.
04:04 ollama list
04:05 There you go.
04:06 As you can see, I loaded many models.
04:09 The model we'll use is the Llama 3 8B version.
04:12 Awesome, Ollama is ready to go.
04:14 What we're going to do is initialize the model with Ollama.
04:17 To do this, we're going to use the LocalLLM class.
04:20 First, let's import this class.
04:23 from pandasai.llm.local_llm import LocalLLM
04:33 Okay.
04:33 What we're going to do now is instantiate an LLM object by specifying the model name.
04:39 To do this, we're going to use the OpenAI-compatible API.
04:41 This helps the app connect to Ollama.
04:44 Let's write, model = LocalLLM(
04:48 Let's connect to Ollama.
04:50 api_base="http://localhost:11434/v1",
05:03 Okay, next, let's specify the model.
05:06 model="llama3")
05:10 Nice, our model is ready.
05:12 Let's go ahead and initialize the app with Streamlit.

05:16 First, let's import this tool.
05:19 import streamlit as st
05:23 Next, let's give the app a title.
05:26 st.title("Data analysis with PandasAI")
05:35 After that, let's run this app.
05:37 To do this, in the terminal let's write,
05:41 streamlit run app.py
05:45 Let me press enter.
05:46 Yeah, the app opens automatically in the browser.
05:49 This is simple, right?
05:50 You can see the title here.
05:52 Let's go ahead and create a widget to load the dataset.
05:55 To do this, we can use the file_uploader method.
05:58 Let's say, uploaded_file = st.file_uploader(
06:06 Let's give it a text, "Upload a CSV file",
06:12 Next, let's specify the file type, type=['csv'])
06:19 Okay.
06:19 Nice, we created a widget.
06:22 Let's test this widget.
06:23 Go to the app and click rerun.
06:26 There you go.
06:27 To load the dataset, let's click over here.
06:29 And then select our dataset.
06:34 Yeah, the file is uploaded.
06:36 Easy peasy, right?
06:37 Let's go ahead and take a look at the first rows of the dataset.
06:41 To do this, let's use an if statement.
06:43 if uploaded_file is not None:
06:49 After that, let's convert the data into a pandas dataframe.
06:52 First, we're going to import pandas.
06:56 import pandas as pd
07:00 Next, let's read the dataset.
07:02 data = pd.read_csv(uploaded_file)
07:11 Now, let's write the first rows.
07:14 st.write(data.head(3))
07:21 Let's test the app.
07:22 To do this, let's go to the app and click rerun.
07:26 There you go.
07:27 You can see the first rows of the Titanic dataset.
07:30 What we need to do now is convert the dataset into a SmartDataframe.
07:34 To do this, we're going to use the SmartDataframe class.
07:37 Let's go back and write,
07:39 from pandasai import SmartDataframe
07:47 Next, let's get an object from this class.
07:50 df = SmartDataframe(
07:54 Let's pass our data,
07:57 Let's set the config parameter, let's pass config={"llm": model})
08:04 Nice, our smart dataframe is ready.
08:06 Next, to get the input from the user, let's create a text area.
08:10 prompt = st.text_area("Enter your prompt:")
08:20 Nice, our text area is ready.
08:22 What we're going to do now is create a button.
08:25 For this, let's use the button method with an if statement.
08:28 if st.button("Generate"):
08:35 When pressing this button, we want to run the prompt.
08:38 if prompt:
08:40 If the prompt is true, let's display a message while running the code.
08:45 To do this, let's use the spinner method with the with keyword.
08:49 with st.spinner("Generating response..."):
08:59 Now all we need to do is return the response.
09:03 To do this, let's write, st.write(
09:07 Let's invoke the chat method for a response.
09:09 df.chat(prompt))
09:13 Awesome, our app is ready.

09:15 Now it's time to make inferences.
09:18 Let me go to the app and then click rerun.
09:21 There you go.
09:22 All we need to do is write a prompt to chat with the dataset.
09:25 Let's write,
09:27 How many rows and columns are in the dataset?
09:35 Let me click on the generate button.
09:38 Yeah, the app is up and running.
09:40 Voila!
09:41 We got the output by talking to the dataset.
09:43 Amazing, right?
09:44 Now, we can explore our dataset using this app.
09:48 Say we want to learn the average age.
09:50 What is the average age?
09:54 Let me click generate.
09:55 There you go.
09:56 The average age is about twenty-nine.
09:58 Let's go ahead and find out how many people have more than 3 siblings.
10:05 How many people have more than 3 siblings?
10:12 Let me press generate.
10:13 There you go.
10:14 There are 30 people with more than 3 siblings.
10:18 Let's move on and see how many people died and how many survived.
10:25 How many people died and how many survived?
10:32 Let me click generate.
10:34 There you go.
10:35 Here are the numbers of people who survived and died.
10:38 This was very easy to find out with Llama 3, right?
10:42 Let's go ahead and calculate the percentage of surviving passengers by gender.
10:47 Return the percentage of passengers by gender
10:57 Let me click generate.
10:59 Yeah, it's done.
11:00 The percentage of male passengers is 64.76%,
11:04 and for female passengers, the percentage is 35.24%.
11:08 As you can see, it's easy peasy to explore the dataset with Llama 3.
11:14 Let's go ahead and plot charts to understand the dataset.
11:17 Let's write,
11:19 Draw a bar chart of the sex column
11:26 Let me click generate.
11:28 There you go.
11:29 As you can see, there are more men than women. Okay.
11:33 Let's go ahead and plot a pie chart.
11:38 Plot a pie chart of the pclass column
11:45 Let me click generate.
11:48 It's done.
11:49 Here is the pie chart.
11:50 You can see the percentages here.
11:52 Let's go ahead and visualize the distribution of the fare column.
11:58 Visualize the distribution of the fare column
12:06 Let me click generate.
12:09 There you go.
12:10 Here is the distribution of the fare column.
12:12 Now, we want to plot a histogram of the age column.
12:16 Draw the histogram of the age column.
12:23 Let me click generate.
12:25 There you go.
12:26 Here is the distribution of the age column.
12:29 Let's go ahead and plot a histogram of the fare by sex.
12:33 Draw the histogram of the fare column separated by the sex column.
12:46 Let me click generate.
12:47 There you go.
12:49 Here is the distribution of the fare column by sex.
12:52 You can plot any chart you want.
12:54 Lastly, let's plot a heatmap of the numerical variables.
12:58 Draw the heatmap of numerical variables.
13:05 Let me click generate.
13:06 There you go.
13:07 Here is the heatmap chart.
13:08 Note that garbage in, garbage out.
13:11 So a good prompt means good output.
13:14 You may need to try a few prompts to find a good one.
13:17 Yeah, that's it.
13:18 In this video, we covered how to use PandasAI with Llama 3.
13:23 To show this, we built an app step-by-step with Streamlit.
13:26 Hope you enjoyed it.
13:27 Thanks for watching.
13:28 Don't forget to subscribe, like the video, and leave a comment.
13:32 See you in the next video.
13:33 Bye for now.
