RUN LLMs Locally On ANDROID: Llama3, Gemma & More

Ksk Royal
16 May 2024 · 06:55

Summary

TLDR: This video tutorial demonstrates how to run large language models (LLMs) on Android devices without root access, using open-source models such as Gemma and Llama3. After installing the Termux app and running a short series of commands, users can set up Ollama, a tool for running these models locally. The video walks through downloading and running models, shares tips for navigating the command-line interface, and notes that newer Android devices deliver noticeably faster performance.

Takeaways

  • 📱 Generative AI models can be run on Android devices without root access.
  • 🔗 The demonstration uses a Pixel 7a, but the process is applicable to any Android device.
  • 💻 Download the Termux app from its GitHub releases page, specifically the arm64-v8a build.
  • 📂 Grant file storage permission and update the installed packages with 'pkg upgrade'.
  • 🛠️ Install dependencies like git, cmake, and golang for setting up the environment.
  • 🔗 Follow a provided link to copy and paste code into Termux for further setup.
  • 🔧 Utilize the Ollama tool to run open-source models locally on Android.
  • ⏳ The setup and model download times vary based on device speed and internet connection.
  • 📚 Models like Gemma, Llama3, and Tiny Llama can be installed and run.
  • 🚀 The Gemma 2B model runs noticeably faster than the 8B model on-device while still performing well.
  • ❌ To stop model output, use CTRL + C, and to exit the model, use CTRL + D.
  • 🗑️ Remove Ollama and its models by clearing Termux's storage and cache.

Q & A

  • What is the main topic of the video?

    -The video shows how to run large language models (LLMs) on an Android device with the open-source tool Ollama, without root access or a powerful computer.

  • Which Android device is used for the demonstration in the video?

    -A Pixel 7a is used for the demonstration, and the procedure is said to be the same for any Android device.

  • What is the first step to install on the Android device as per the video?

    -The first step is to download the Termux app from its GitHub releases page, specifically the APK labeled arm64-v8a.

  • What command is used to grant file storage permission in Termux?

    -The transcript does not name the exact command; it only says to type a command right after installing Termux.
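
    For reference, the standard Termux command that grants shared-storage access (named here from Termux's own documentation, not from the transcript) is:

        termux-setup-storage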

  • What does 'Termux-change-repo' command do in the setup process?

    -The 'termux-change-repo' command opens Termux's mirror-selection menu; selecting the default mirror and pressing enter lets package installation proceed.

  • What are the dependencies that need to be installed before running LLMs on Android?

    -The dependencies that need to be installed are git, cmake, and golang.
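
    On a fresh Termux install, the update and dependency step amounts to the following (package names as given in the video):

        pkg update && pkg upgrade     # refresh mirrors and upgrade installed packages
        pkg install git cmake golang  # build tools needed to compile Ollama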

  • What is the purpose of the Ollama tool mentioned in the video?

    -Ollama is a tool for running open-source models locally; in this video it is built inside Termux so the models run directly on the Android phone.
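
    The video has viewers paste commands from a linked page rather than typing them out; a typical from-source build of Ollama inside Termux (the repository URL and exact steps are assumptions, not confirmed by the transcript) looks like:

        git clone https://github.com/ollama/ollama.git  # fetch the Ollama source (assumed URL)
        cd ollama
        go generate ./...  # prepare native llama.cpp dependencies (used by 2024-era Ollama builds)
        go build .         # produces the ./ollama binary used in the commands below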

  • How long does it take to set up Ollama and run an LLM?

    -The transcript gives no exact duration; the time depends on the Android device's speed and the internet connection.

  • What is the command used to install a model using Ollama?

    -The command used to install a model is './ollama run [model name] --verbose', for example './ollama run gemma --verbose'; the --verbose flag prints timing statistics for each response.
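
    Because the binary sits in the build directory, the server is usually started in the background first and a model run against it (a common pattern; the transcript does not show the serve step explicitly):

        ./ollama serve &                 # start the Ollama server in the background
        ./ollama run gemma:2b --verbose  # chat with Gemma 2B; --verbose prints timing stats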

  • How can one stop a model from generating output while using Ollama?

    -To stop a model from generating output, press CTRL + C and then press enter to get back to the prompt.

  • What is the recommended way to exit the model and return to the Ollama directory?

    -To exit the model and return to the Ollama directory, press CTRL + D.

  • How can Ollama and the installed models be removed from an Android device?

    -To remove Ollama and the models, press and hold the Termux icon, choose app info, select storage & cache, and choose clear storage.
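
    As an alternative the video does not cover, a single model can be deleted without wiping Termux:

        ./ollama rm gemma  # removes only that model's weights from disk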

  • What are some of the models mentioned in the video that can be run on Android using Ollama?

    -Some of the models mentioned are Gemma, Llama3, and Tiny Llama.
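
    The matching Ollama library tags would be pulled like this (the exact tags are an assumption based on the public Ollama library, not shown in the transcript):

        ./ollama pull gemma:2b   # the small Gemma variant demonstrated in the video
        ./ollama pull llama3     # defaults to the 8B instruct model
        ./ollama pull tinyllama  # compact 1.1B model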

  • How does the performance of the Gemma 2B model compare to the 8B model?

    -The Gemma 2B model runs much faster than the 8B model and still delivers impressive output on the phone.

  • What is the URL provided for more information about the models?

    -The transcript does not give the exact URL; it only mentions a URL to check for more information about the models.


Related Tags
Android, LLMs, Termux, Gemma, Llama3, Ollama, Model Installation, Local AI, Mobile Computing, AI Models, Git, CMake