I Analyzed My Finance With Local LLMs

Thu Vu data analytics
31 Jan 2024 · 17:51

TLDR: In this video, the creator walks through their annual financial review, which involves analyzing bank transactions to better understand income and expenses. Inspired by someone else's income and expense breakdown, they use a local, open-source large language model (LLM) to categorize the expenses in their bank statements. They explore frameworks such as llama.cpp and GPT4All for running LLMs locally on a laptop, an approach that keeps the data private and costs nothing. After experimenting with several models, they find Llama 2 effective for categorizing expenses. The creator then customizes Llama 2 with a model file for better task performance and uses Python to automate the classification of transactions. The result is a personal finance dashboard built with Plotly Express and Panel, showing income and expense breakdowns for 2022 and 2023 along with monthly earnings and expenditures. The video closes with a note on the importance of considering assets in personal finance and encourages viewers to experiment with open-source LLMs in their own projects.

Takeaways

  • 💰 Money is not everything, but it's important to review your financial transactions regularly.
  • 🚀 The speaker was inspired to create an income and expense breakdown, leading to a personal finance analysis project.
  • 🔒 Privacy is a concern when uploading bank statements to online services, so the speaker chose to use a local large language model (LLM).
  • 📚 The use of open-source LLMs like Llama 2 was explored to classify expenses without relying on internet APIs or third-party services.
  • 💻 Frameworks like llama.cpp and Ollama facilitate running large language models locally, offering security and cost benefits.
  • 📈 The project involved installing and running an LLM, classifying expenses, analyzing data, and creating visualizations using Python.
  • 🛠️ The speaker encountered issues with basic arithmetic tasks using LLMs, highlighting the need for caution in their use for quantitative tasks.
  • 🏦 The LLM was used to categorize bank statement transactions, with varying success, and a custom model was created for better performance.
  • 🔗 The speaker shared the code on GitHub for others to replicate or learn from the project.
  • 📊 A personal finance dashboard was created using Plotly Express and Panel to visualize income and expense breakdowns for 2022 and 2023.
  • 🤔 The importance of considering assets alongside expenses and income for a complete financial overview was discussed.

Q & A

  • Why does the speaker review their bank transactions periodically?

    -The speaker reviews their bank transactions to analyze their income and expenses, which they find important for managing their finances effectively, despite acknowledging that money is not everything.

  • What inspired the speaker to create an income and expense breakdown?

    -The speaker was inspired to create an income and expense breakdown after coming across someone else's breakdown, which motivated them to do the same.

  • Why did the speaker decide not to use an online API service to process their financial data?

    -The speaker was concerned about the privacy and security of their sensitive financial information, which could be stored by the API service for up to 30 days, hence they chose to use a local, open-source large language model.

  • What are the two main benefits of using frameworks like Ollama and llama.cpp for running large language models locally?

    -The two main benefits are quantization, which reduces the memory footprint of the model weights, and improved efficiency for the user, making it easier to utilize the models.

  • How does the speaker plan to use the local large language model for their bank transactions?

    -The speaker plans to use the local large language model to classify their expenses from bank transactions into appropriate categories and analyze the data to gain insights into their spending habits.

  • What is the speaker's experience with the arithmetic capabilities of the large language model?

    -The speaker found that while the large language model provided an elaborate response to a basic arithmetic question, it did not accurately calculate the product of two large numbers, indicating that out-of-the-box, these models may not be the best for basic arithmetic.

  • How did the speaker address the issue of the large language model providing different answers to the same question?

    -The speaker acknowledges the randomness in the model's responses and suggests that for specific use cases, one can customize the language model by specifying a model file with desired parameters and a custom system message.

  • What is the process the speaker uses to classify their bank transactions using the large language model?

    -The speaker uses a for loop to process their transactions in batches of 30 to avoid exceeding the token limit of the model. They then use a custom function to validate and format the model's output into categories for each transaction.

  • How does the speaker handle the task of creating a personal finance dashboard?

    -The speaker creates a personal finance dashboard using Plotly Express for visualizations and Panel for organizing the dashboard. They include pie charts for income and expense breakdowns and bar charts for monthly income and expenses for the years 2022 and 2023.

  • What is the speaker's final step in processing their financial data?

    -The final step is to clean up the dataframe, merge it with the main transaction dataframe, and ensure all transactions are categorized correctly to use in the personal finance dashboard.

  • What is the speaker's perspective on the future use of large language models?

    -The speaker believes that in the future, it will become a norm to use and run large language models locally on personal devices like laptops for various applications, including personal finance management.

Outlines

00:00

💼 Personal Finance Management with AI

The speaker discusses the importance of managing personal finances and explains how they review their bank transactions annually. Inspired by someone else's income and expense breakdown, they decide to use an open-source large language model (LLM) to categorize expenses from their bank statements. They mention the privacy concerns of uploading sensitive financial data to a third-party service and choose to run the LLM locally on their laptop. The video also introduces the process of installing and running an LLM, such as Llama 2, and using it to classify expenses into categories like groceries, rent, and travel. The speaker provides a GitHub link for the code used in the project and discusses the frameworks available for running LLMs locally, such as llama.cpp and GPT4All.

05:01

🧮 Testing LLMs for Expense Classification

The speaker tests the ability of LLMs to perform basic arithmetic and to classify expenses from bank statements. They find that while LLMs can give elaborate answers, they may not be the best tool for basic math. The speaker then focuses on classifying expenses using the Mistral and Llama 2 models, and finds that Llama 2 categorizes expenses closer to what they expected. The speaker also discusses creating a custom model file to tailor the LLM to a specific use case, such as a financial assistant.
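The custom model file mentioned above can be sketched as follows. This is a minimal illustration, assuming the model file follows Ollama's Modelfile format with its `FROM`, `PARAMETER`, and `SYSTEM` instructions. The helper `render_modelfile`, the temperature value, and the wording of the system message are hypothetical and are not taken from the video.

```python
def render_modelfile(base_model: str, temperature: float, system_message: str) -> str:
    """Return the text of an Ollama-style Modelfile (hypothetical helper)."""
    return (
        f"FROM {base_model}\n"                      # base model to customize
        f"PARAMETER temperature {temperature}\n"    # lower values give less random output
        f'SYSTEM """{system_message}"""\n'          # custom system message
    )


# Assumed values for illustration only.
modelfile_text = render_modelfile(
    base_model="llama2",
    temperature=0.2,
    system_message=(
        "You are a financial assistant. "
        "You help classify expenses and income from bank transactions."
    ),
)

print(modelfile_text)
# To try it, save the text to a file named Modelfile and run, for example:
#   ollama create expense_analyzer -f Modelfile
#   ollama run expense_analyzer
```

A lower temperature is one plausible way to reduce the run-to-run variation in answers that the speaker observed, though the exact parameters used in the video may differ.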

10:03

📊 Analyzing Bank Transactions with Python

The speaker outlines the process of using Python to analyze bank transaction data. They mention installing the langchain-community library to access language models through Ollama, and discuss reading transaction data, handling token limits, and creating a loop to process transactions in batches. The speaker uses a custom function to format the output from the LLM and a validation check to ensure the output has the correct format. They also mention using the pandas library for data manipulation and creating a DataFrame to store the categorized transactions.
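The batching and formatting steps described above might look roughly like the sketch below. It is not the speaker's code: the prompt wording, the `expense - category` line format, and all function names are assumptions made for illustration. The batch size of 30 comes from the summary. `ollama_ask` assumes that `langchain-community` is installed and that a local Ollama server has the `llama2` model available. A stand-in model is used so the sketch runs without one.

```python
from typing import Callable, Dict, Iterator, List

BATCH_SIZE = 30  # batch size mentioned in the summary, used to stay under the token limit


def batches(items: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_prompt(names: List[str]) -> str:
    """Build a one-line instruction followed by one transaction per line (assumed wording)."""
    return (
        "Add an appropriate category to each of the following expenses. "
        "Answer with one line per expense in the form 'expense - category'.\n"
        + "\n".join(names)
    )


def parse_response(text: str) -> Dict[str, str]:
    """Keep only lines shaped like 'expense - category'; silently skip anything else."""
    pairs: Dict[str, str] = {}
    for line in text.splitlines():
        name, sep, category = line.partition(" - ")
        if sep and name.strip() and category.strip():
            pairs[name.strip()] = category.strip()
    return pairs


def categorize(names: List[str], ask_model: Callable[[str], str]) -> Dict[str, str]:
    """Send the transactions to the model batch by batch and merge the parsed answers."""
    result: Dict[str, str] = {}
    for batch in batches(names):
        result.update(parse_response(ask_model(build_prompt(batch))))
    return result


def ollama_ask(prompt: str) -> str:
    """Query a local model via LangChain (assumes a running Ollama server with llama2)."""
    from langchain_community.llms import Ollama
    return Ollama(model="llama2").invoke(prompt)


def fake_model(prompt: str) -> str:
    """Stand-in for a real model: labels every transaction as 'Other'."""
    names = prompt.splitlines()[1:]  # skip the instruction line
    return "\n".join(f"{name} - Other" for name in names)


demo = categorize([f"Shop {i}" for i in range(65)], fake_model)
print(len(demo), "transactions categorized")
```

Passing `ollama_ask` instead of `fake_model` would route the same loop through a local model. Because real model output is not guaranteed to follow the requested format, any transaction missing from the result would need to be retried or labeled by hand.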

15:04

📈 Creating a Personal Finance Dashboard

The speaker describes the process of creating a personal finance dashboard using Plotly Express and Panel. They read in the transaction data and create functions that generate pie charts for the income and expense breakdowns and bar charts of monthly income and expenses. The speaker then combines these charts into a dashboard using a Panel template, providing an overview of income and expenses for the years 2022 and 2023. They conclude by noting that while they cannot retire soon, they hope the project inspires viewers to experiment with open-source LLMs and manage their personal finances.
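As a rough illustration of the dashboard steps, the sketch below aggregates categorized transactions and turns them into one pie chart and one bar chart. The data layout (tuples of ISO date, category, amount), the sample values, and all function names are assumptions. The source mentions a Panel template, while this sketch uses a plain `pn.Row` to stay short. `build_dashboard` needs `plotly` and `panel` installed, so those imports are kept inside the function.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Transaction = Tuple[str, str, float]  # (ISO date "YYYY-MM-DD", category, amount)


def totals_by_category(transactions: Iterable[Transaction], year: int) -> Dict[str, float]:
    """Sum amounts per category for one year (data for the pie chart)."""
    totals: Dict[str, float] = defaultdict(float)
    for date, category, amount in transactions:
        if date.startswith(f"{year}-"):
            totals[category] += amount
    return dict(totals)


def totals_by_month(transactions: Iterable[Transaction], year: int) -> Dict[str, float]:
    """Sum amounts per "YYYY-MM" month for one year (data for the bar chart)."""
    totals: Dict[str, float] = defaultdict(float)
    for date, _category, amount in transactions:
        if date.startswith(f"{year}-"):
            totals[date[:7]] += amount
    return dict(sorted(totals.items()))


def build_dashboard(expenses: Iterable[Transaction], year: int):
    """Return a Panel layout with a pie and a bar chart (requires plotly and panel)."""
    import panel as pn
    import plotly.express as px

    expenses = list(expenses)
    by_category = totals_by_category(expenses, year)
    by_month = totals_by_month(expenses, year)

    pie = px.pie(names=list(by_category), values=list(by_category.values()),
                 title=f"Expense breakdown {year}")
    bar = px.bar(x=list(by_month), y=list(by_month.values()),
                 title=f"Monthly expenses {year}",
                 labels={"x": "Month", "y": "Amount"})

    pn.extension("plotly")  # load Panel's Plotly support
    return pn.Row(pn.pane.Plotly(pie), pn.pane.Plotly(bar))


# Made-up sample data for illustration only.
sample = [
    ("2023-01-05", "Groceries", 50.0),
    ("2023-01-31", "Rent", 1000.0),
    ("2023-02-10", "Groceries", 25.5),
    ("2022-12-24", "Travel", 300.0),
]

print(totals_by_category(sample, 2023))
print(totals_by_month(sample, 2023))
```

Calling `build_dashboard(sample, 2023).show()` would open the layout in a browser when both libraries are installed. The same two aggregation functions can be reused for income by passing income transactions instead of expenses.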

Keywords

💡Bank Transactions

Bank transactions refer to the various activities recorded by a bank regarding the deposit, withdrawal, or transfer of funds. In the video, the speaker downloads and reviews their bank transactions annually to analyze their financial status. Bank transactions are the foundation for the personal finance analysis conducted in the video.

💡Expense Classification

Expense classification is the process of categorizing expenses into specific groups or categories such as groceries, rent, or travel. This is a crucial step in the video as it helps the speaker organize and understand their spending habits. The speaker uses a large language model to automate this process, which is a central theme in the video.

💡Large Language Model (LLM)

A large language model (LLM) is an artificial intelligence system designed to process and predict language. In the context of the video, the speaker uses an open-source LLM locally to classify expenses from their bank statements. The use of LLMs is a key technological aspect of the video, showcasing their application in personal finance management.

💡Local Installation

Local installation refers to the process of downloading and running software, such as an LLM, on an individual's personal computer rather than using an online service. The speaker chooses to run an LLM locally to ensure the privacy and security of their financial data, which is a significant decision in the video.

💡Data Analytics

Data analytics is the qualitative and quantitative process of analyzing raw data to extract useful information, primarily for decision-making purposes. The video demonstrates the use of data analytics through the analysis of bank transactions and the creation of a personal finance dashboard. This is a core component of the video, as it enables the speaker to visualize and understand their financial data.

💡Visualization

Visualization in the context of the video refers to the graphical representation of data, which helps in understanding complex information more easily. The speaker creates visualizations using Python and Plotly Express to show the main insights from their financial data. Visualization is a key method for presenting the analyzed financial information in an accessible format.

💡GitHub

GitHub is a web-based platform for version control and collaboration used by programmers to manage and track code development. The speaker mentions sharing all the code used in the project on GitHub, which allows others to view, use, and contribute to the project. GitHub serves as a platform for open-source collaboration and is a key resource in the video.

💡Personal Finance Dashboard

A personal finance dashboard is a tool that provides an overview of an individual's financial situation, typically including income, expenses, and assets. In the video, the speaker creates a dashboard to visualize their income and expense breakdown for 2022 and 2023. The dashboard is a culmination of the video's analysis, providing a clear and interactive summary of the speaker's financial data.

💡Plotly Express

Plotly Express is a Python library used for creating interactive and visually appealing statistical graphics. The speaker uses Plotly Express to generate charts and graphs for their personal finance dashboard. It is an essential tool in the video for visualizing the categorized financial data.

💡Panel

Panel is a high-level app and dashboarding solution for Python. It is used in the video to organize and display the interactive visualizations created with Plotly Express in a coherent and user-friendly manner. Panel helps in creating a comprehensive dashboard that consolidates all the financial insights.

💡Open-Source

Open-source refers to software where the source code is available to the public, allowing anyone to view, use, modify, and distribute it. The video emphasizes the use of open-source LLMs, which are free and can be run locally, highlighting the accessibility and collaborative nature of open-source software in personal projects.

Highlights

The importance of reviewing personal finances regularly is emphasized, with the author remarking that money is not everything, but 'almost everything'.

The author shares their method of downloading bank transactions to analyze income and expenses.

Inspired by someone else's income and expense breakdown, the author decides to create their own.

Classifying expenses into appropriate categories is identified as the most challenging part of the process.

Due to privacy concerns, the author opts to use a local large language model (LLM) instead of uploading bank statements to a website.

The author discusses the limitations of using an online API such as OpenAI's, including concerns about how long submitted data is stored.

The decision to run an open-source LLM locally on a laptop for privacy and cost benefits is detailed.

Installing and running an LLM like Llama 2 locally is shown as a secure and free alternative to third-party services.

Frameworks like llama.cpp and GPT4All are used to run open-source language models locally, with quantization and efficiency benefits.

The author guides viewers on how to download and install the LLM framework 'Ollama' for Mac or Linux.

Windows users are shown how to run LLMs through Docker Desktop.

The process of installing a language model locally through the command line is demonstrated.

The author tests the LLM's ability to perform basic arithmetic and categorize expenses from bank statements.

Llama 2 is found to be more effective at classifying expenses than the Mistral model.

Customizing LLMs with a model file allows for a more tailored approach to specific use cases.

The author demonstrates creating a custom model file named 'expense analyzer' with specific parameters.

Interacting with LLMs through Python and Jupyter Notebook is shown to be more convenient than using the terminal.

The author's method for handling large amounts of transaction data to avoid token limit issues is explained.

A for loop is used to process transactions in batches, optimizing the interaction with the LLM.

The author discusses the use of the 'pandas' library for data validation and handling different output formats from the LLM.

A personal finance dashboard is created using Plotly Express and Panel to visualize income and expense data.

The dashboard provides an income and expense breakdown for 2022 and 2023, as well as monthly earnings and expenditures.

The author acknowledges that the financial overview provided is not complete without considering assets.

The project serves as an inspiration for others to experiment with open-source language models for personal finance management.