Gemini API and Flutter: Practical, AI-driven apps with Google AI tools
Summary
TL;DR: In this video, Eric and Ander from the Dart and Flutter teams explore Generative AI, demonstrating how it can transform app development. They share their journey of creating a cooking app using Google AI Studio and the Gemini API, which generates recipes from photos of ingredients. The talk covers prompt design, integrating the API with Flutter, and enhancing the app's user experience with AI. They showcase how developers can leverage AI to build functional apps without extensive server-side coding.
Takeaways
- 🧑💻 Eric Windmill is an engineer and Ander Dobo is a product manager on the Dart and Flutter teams.
- 🤖 Large Language Models (LLMs) are sophisticated AI systems that power generative AI, capable of creating content like text, images, code, and music.
- 🛠️ Generative AI is rapidly becoming a practical tool for developers, with new products being released and improved frequently.
- 🧐 It can be challenging for developers to identify the right AI tools and understand their practical applications in app development.
- 🚀 The speakers built a cooking app using the Gemini API, showcasing the ease of integrating AI into a Flutter app with the help of Google AI SDK for Dart.
- 🔍 Google AI Studio is a browser-based IDE for experimenting with Google's generative models and is instrumental in the development process.
- 📝 Prompt design is a critical process in AI, involving creating and refining prompts to guide the AI model to produce the desired output.
- 🍲 The cooking app allows users to take a photo of ingredients, and the app generates a recipe, bypassing the need for a pre-existing database.
- 🔑 To utilize the Gemini API, developers need to obtain an API key from Google AI Studio and integrate it into their app projects.
- 🔄 The app's user interface allows for dynamic input, which is interpolated into the prompt sent to the Gemini API to generate personalized recipes.
- 📚 The speakers emphasize the importance of safety considerations and adherence to Google's safety guidance when working with AI models.
- 📈 They also highlight the potential for continuous improvement of the app, including making the AI character more interactive through chat features.
Q & A
What are Large Language Models (LLMs)?
-Large Language Models (LLMs) are sophisticated artificial intelligence systems trained on large sets of data, capable of generating new content such as text, images, code, or music.
How can generative AI transform application creation and interaction?
-Generative AI, powered by LLMs, has the potential to transform how we create and interact with applications by enabling the creation of new content and providing more dynamic, personalized user experiences.
What challenges do developers face when starting with generative AI tools?
-Developers may find it challenging to identify the right tools, understand how to get started with them, and determine the practical applications of AI in app development.
How did Eric and Ander overcome their lack of experience with AI in app development?
-Eric and Ander overcame their lack of experience by using tools like the Google AI SDK for Dart, which allowed them to quickly get started and build an app using the Gemini API.
What is the purpose of Google AI Studio?
-Google AI Studio is a browser-based IDE for prototyping with Google's generative models, useful for experimenting with different prompts when building features that use the Gemini API.
What is the main functionality of the cooking app built by Eric and Ander?
-The cooking app allows users to take a photo of ingredients they have on hand, and the app uses generative AI to generate a recipe based on those ingredients, eliminating the need for manual entry and a pre-existing recipe database.
What is prompt design in the context of using the Gemini API?
-Prompt design is the process of creating and tweaking prompts given to a large language model like Gemini to achieve the desired type and quality of output.
Why is it important to consider safety when working with large language models like Gemini?
-It is crucial to consider safety to ensure the app provides appropriate and safe content, following guidelines such as avoiding harmful or dangerous information and adhering to food safety practices.
How did Ander address unexpected results from the Gemini model in the cooking app?
-Ander addressed unexpected results by refining the prompt, adding instructions for the model to avoid returning recipes when the image doesn't contain edible items, and incorporating safety measures as per Google's guidelines.
What steps are involved in setting up the Gemini API for a Flutter app?
-The steps include obtaining an API key from Google AI Studio, adding the Google generative AI package to the Flutter app, setting up the API with the necessary code, and making requests to the Gemini API using a properly formatted prompt.
How did Eric enhance the user experience of the cooking app?
-Eric enhanced the user experience by adding a more interesting personality to Chef Noodle, the app's character, and by structuring the data returned by the Gemini API to make it more reliable and easier to parse.
What are the future plans for the cooking app according to Ander?
-Ander plans to make Chef Noodle more interactive by incorporating the Gemini API's chat feature, allowing for a more conversational user experience in future versions of the app.
Outlines
🤖 Introduction to Generative AI and the Cooking App Project
In the first paragraph, Eric Windmill, an engineer on the Dart and Flutter teams, and Ander Dobo, a product manager on the Flutter team, introduce themselves. They explain the concept of Large Language Models (LLMs) and generative AI, highlighting its potential to revolutionize application creation and interaction. Eric discusses the rapid development of AI tools for developers and the challenges of selecting the right ones. Ander emphasizes the difficulty in identifying practical AI applications for app developers. The speakers share their experience with the Google AI SDK for Dart and the process of building a cooking app using the Gemini API, which leverages generative AI to create recipes from photos of ingredients, addressing the 'cold start problem' and eliminating the need for manual ingredient entry or a pre-existing recipe database.
🔧 Experimentation with Google AI Studio and Prompt Design
The second paragraph delves into the initial steps taken by the team to explore the capabilities of generative AI using Google AI Studio, a browser-based IDE for prototyping with Google's generative models. The team experimented with different prompts to understand the potential applications of AI, such as creating smart chatbots or inspiring users with images. Eric expresses his interest in cooking apps that suggest recipes based on available ingredients but points out their limitations. The team then discusses the proof of concept for their cooking app, which involved using the Gemini model to generate recipes from images of ingredients. Ander explains the process of prompt design, which involves creating and refining prompts to achieve the desired output from the LLM. He also addresses safety considerations and the incorporation of Google's safety guidance into their prompts.
📝 Setting Up the Gemini API and Enhancing the Cooking App
In the third paragraph, the focus shifts to the technical setup and integration of the Gemini API into the cooking app. Ander outlines the process of obtaining an API key and setting up the Google generative AI package in the Flutter app. Eric demonstrates the app's functionality, which includes capturing a photo of ingredients and personalizing the recipe request through additional inputs. The app then sends a formatted prompt to the Gemini API, which generates a recipe in response. The paragraph also covers the steps for setting up the API key, adding the necessary package to the Flutter app, and writing the code to communicate with the Gemini API. Additionally, the team discusses enhancing the app's user experience by giving the app's character, Chef Noodle, a more engaging personality through updates to the prompt. Eric also addresses the challenges of structuring data returned by the Gemini API and how they were overcome by specifying the expected format in the prompt.
Keywords
💡Large Language Models (LLMs)
💡Generative AI
💡Google AI SDK for Dart
💡Gemini API
💡Prompt Design
💡Flutter
💡Multimodal
💡Safety Parameters
💡API Key
💡Environment Variable
💡JSON
Highlights
Introduction to Large Language Models (LLMs) and generative AI, emphasizing their potential to transform application creation and interaction.
Generative AI's capability to create new content such as text, images, code, and music.
The rapid development of AI tools for developers and the challenges in choosing the right ones.
The use of Google AI SDK for Dart to quickly build an app using the Gemini API.
A walkthrough of building a cooking app with generative AI as the backend.
Google AI Studio as a browser-based IDE for experimenting with Google's generative models.
The potential of generative AI in solving problems like building smart chatbots and inspiring users with images.
The innovative cooking app that generates recipes from photos of ingredients, eliminating the need for manual input and databases.
The proof of concept process using the Gemini models to ensure they can generate reasonable and delicious recipes from images.
The importance of prompt design in getting the desired output from a large language model.
The use of free-form prompts in Google AI Studio for open-ended instructions to the Gemini model.
The process of tweaking prompts to achieve consistently reasonable recommendations and useful information.
Incorporating safety considerations and following Google's safety guidance in the app development.
The setup process for the Gemini API in a Flutter app, including obtaining an API key and adding the Google generative AI package.
The implementation of dynamic input from the form into the prompt for the Gemini API in the Flutter app.
The idea of giving Chef Noodle a more interesting personality by updating the prompt.
The challenge of structuring data when working with the Gemini API and the solution of adding explicit formatting to the prompt.
The personalization of the cooking app experience with the ability to switch devices for hands-free recipe following.
The potential of the Gemini API for building AI-driven apps and the invitation to explore its use in Flutter or Dart apps.
Transcripts
[MUSIC PLAYING]
ERIC WINDMILL: Hi.
I'm Eric.
I'm an engineer on the Dart and Flutter teams.
ANDER DOBO: And I'm Ander, and I'm a product
manager on the Flutter team.
Large Language Models, or LLMs, are
sophisticated artificial intelligence systems
trained on large sets of data.
And generative AI is powered by LLMs.
These are artificial intelligence models
that can create new content such as text, images, code,
or even music.
Generative AI has the potential to transform how we create
and interact with applications.
ERIC WINDMILL: As a developer, you've probably seen and heard
news about how quickly generative AI is
becoming a tool you could use to build software.
New AI products for developers are being released all the time,
and those products are changing and improving fast.
ANDER DOBO: It can be hard to know what the right tools are
and how to get started with them.
It's even hard to know what the practical applications of AI
might be as an app developer.
ERIC WINDMILL: We didn't have much experience building
apps that use AI before preparing for this talk.
But using tools like the Google AI SDK for Dart,
we were able to get up and running
and build an app that uses the Gemini API in no time.
In this talk, we're going to walk you
through our journey of building a cooking app that uses
generative AI as the back end.
First, we'll talk about how we got started with generative
AI using Google AI Studio.
ANDER DOBO: Then I'll walk you through how
you can get the most out of the Gemini API
through a process called prompt design.
ERIC WINDMILL: Finally, I'll show you
how I use the Gemini API to enhance
a real-world application.
ANDER DOBO: When we started this project,
we didn't really know what was possible
when it came to generative AI.
So our first step was to learn and experiment
in Google AI Studio.
Google AI Studio is a browser-based IDE
for prototyping with Google's generative models.
It's useful for experimenting with different prompts
as you build a feature that uses the Gemini API.
While experimenting in Google AI Studio,
we started to realize how many problems
we could solve by building Flutter
apps that use generative AI, such as building
a smart chatbot for users to have
a natural conversation about a topic
or using an image to inspire a user to make something.
ERIC WINDMILL: And I'm a big fan of the cooking apps that tell me
what recipes I can make based on the ingredients I already
have on hand.
But these types of apps can be cumbersome to use.
It's time-consuming to use apps that
require the user to manually type all the food
items in their pantry every time they want to find a new recipe.
And these apps can be difficult to build because they
have the cold start problem.
They rely on having a large database of recipes
to be useful.
But with generative AI, both of those problems go away.
Using our new app, the user can just
take a photo of some ingredients they want to use,
and the app generates a recipe using that photo, which
means there's no need to type each of the ingredients
and there's no need for a pre-existing database.
ANDER DOBO: We needed to do a proof of concept
to make sure that the Gemini models are
capable of taking an image of ingredients
and returning a recipe that is both reasonable to make
and delicious.
This required a process of trial and error called prompt design.
Prompt design is the process of creating
and tweaking prompts given to a large language model
to get the desired type and quality of output.
The first decision I had to make is what type of prompt
would fit our use case--
free form, which is open-ended text;
structured, which has a predefined format
and often where you provide examples of requests
and responses; or chat, which enables a user
to have a natural ongoing conversation with a language
model.
I started with the most basic type of prompt
in Google AI Studio, a free-form prompt.
And I used the Gemini 1.5 Pro model.
A free-form prompt is an open-ended instruction
or a question you provide to a large language
model like Gemini.
It has no predefined structure and doesn't
require you to give specific examples of requests
and responses.
The Gemini 1.5 Pro model is multimodal
and takes text and images as inputs and outputs text.
To start, I entered, "What recipe can I
make using the items in this photo?"
along with a photo of some food items that I took.
And here's a result I got back in Google AI Studio.
After a bit more experimentation,
I found that for version one of our app,
a free-form prompt was perfect.
It let us quickly experiment, and it gave us
good results for our use case.
Next, I focused on tweaking the prompt
to get consistently reasonable recommendations.
I also instructed the Gemini model to provide useful information
with the recipe, such as the number of people
it will serve and nutritional information per serving.
An example of something unexpected that I addressed:
the Gemini model returned a recipe
even if the image didn't contain any edible items.
So I added a line to the prompt instructing the Gemini model
not to return a recipe in this scenario.
It's crucial to be mindful of safety considerations
when building your app and working with large language
models like Gemini.
Following Google's safety guidance,
we incorporated several safety measures in our prompt.
For example, in our case, we instruct
the model to only provide recipes
that contain real edible ingredients
and to follow food safety practices like ensuring poultry
is fully cooked.
I updated the prompt by adding that the Gemini model should
list ingredients that are potential allergens.
Additionally, there are adjustable safety parameters
for the Gemini API, such as for harassment or dangerous content.
After reading up on each, I found
that the default settings for these safety parameters
were suitable for our app.
We will continue to test and monitor for safety problems
throughout the lifecycle of the app.
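The adjustable safety parameters described here can also be set in code. Below is a minimal sketch using the Google AI Dart package; the `SafetySetting`, `HarmCategory`, and `HarmBlockThreshold` names come from that package, but the stricter thresholds are purely illustrative, since the talk kept the defaults.

```dart
import 'package:google_generative_ai/google_generative_ai.dart';

// Illustrative only: the talk found the default safety settings
// suitable, so you would only pass these to tighten the thresholds.
final model = GenerativeModel(
  model: 'gemini-1.5-pro',
  apiKey: const String.fromEnvironment('API_KEY'),
  safetySettings: [
    // Block more harassment and dangerous content than the default.
    SafetySetting(HarmCategory.harassment, HarmBlockThreshold.low),
    SafetySetting(HarmCategory.dangerousContent, HarmBlockThreshold.low),
  ],
);
```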
This is the initial prompt I came up with for the app.
Now I can save it and share it with Eric using the share
feature in Google AI Studio so he can add
the prompt to the Flutter app.
ERIC WINDMILL: To start, let me show you the app that we built.
When you open the app, the first thing the user sees
is the chef, Chef Noodle, asking them which ingredients
they want to use in a recipe.
They provide this list of ingredients by taking a picture.
Then the app has additional inputs that
allow them to personalize their recipe request,
such as buttons for common ingredients they may have,
dietary restrictions, and cuisines they're in the mood
for.
When the form is filled out, the user
presses Submit to request a new recipe from Chef Noodle.
Behind the scenes, this form data
is being interpolated into the prompt.
So in the Flutter app, conceptually, the prompt
looks more like this.
The inputs from the form are inserted into the text prompt,
and the images are attached.
The prompt is then sent to the Gemini
API, which generates a new recipe
and returns it to the app.
And that's the main functionality of the app.
Now let's go through the steps taken
to set up and start making requests to the Gemini API
with our prompt.
ANDER DOBO: First, you need to get an API key for the Gemini
API.
In Google AI Studio, click Get API Key
in the left-hand navigation bar.
Let's create the API key in a new project.
This automatically creates the API key for you in Google Cloud
and restricts the key to only be able to call the Gemini API.
Alternatively, you can select an existing Google Cloud project
if you already have one that you would like
to associate your API key with.
Now you can copy the key to use as you develop your app.
If you don't set up billing, you can use the API free of charge
up to specified rate limits.
ERIC WINDMILL: Once you're set up with the API key,
the next step is to add the Google generative AI
package to your Flutter app.
To do so, open your terminal and navigate
to the directory of your Flutter project,
then add the package with the pub add command.
Next, add the code required to set up the Gemini
API in a Flutter project.
You can find the code needed to do this in the Getting Started
docs at ai.google.dev.
The setup code looks like this.
I added this code in the initState method
of the app's top-level widget.
This code is creating a new instance of a generative model
object, which knows how to communicate with the Gemini API.
The constructor for the generative model class
expects the name of the Google LLM you're passing in,
such as Gemini 1.5 Pro, as well as your API key.
Finally, this code attempts to get your Gemini API key using
the String.fromEnvironment method, which is
part of the Dart core library.
This method expects that an environment variable
called API_KEY will be passed in when the app starts running.
The simplest way to pass in the API key as an argument
is to use the --dart-define flag when you run the flutter run
command.
And this works great for development.
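The on-screen setup code isn't captured in the transcript; here is a minimal sketch of what it plausibly looks like, assuming the google_generative_ai package, the Gemini 1.5 Pro model name, and an API_KEY define name:

```dart
import 'package:google_generative_ai/google_generative_ai.dart';

// Compile-time value supplied at launch, e.g.:
//   flutter run --dart-define=API_KEY=your-key-here
const apiKey = String.fromEnvironment('API_KEY');

// One model instance, created once (the talk creates it in the
// initState method of the app's top-level widget).
final model = GenerativeModel(
  model: 'gemini-1.5-pro',
  apiKey: apiKey,
);
```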
Now that the Gemini API is set up and the app is running,
we can focus on adding the logic to the app that
will make the request to the Gemini API with the prompt.
To start, I looked at the documentation at ai.google.dev
and found an example of the code I needed to add to my app.
That example code looks like this.
The most important part of this code
is the generateContent method on the GenerativeModel class
from the Google generative AI package.
The generateContent method is where you build
the prompt for the Gemini API.
It expects a list of Content objects,
which are built from the Part subtypes TextPart and DataPart.
TextParts are used to pass in strings, and DataParts
are objects you can use to pass in files, such as images.
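The example code from the docs isn't reproduced in the transcript; a hedged sketch of a multimodal request using those types might look like the following (the function name and the hard-coded MIME type are illustrative assumptions):

```dart
import 'dart:io';
import 'package:google_generative_ai/google_generative_ai.dart';

// Sends the text prompt plus the ingredient photo in one request.
Future<String?> requestRecipe(
    GenerativeModel model, String promptText, File photo) async {
  final imageBytes = await photo.readAsBytes();
  final response = await model.generateContent([
    Content.multi([
      TextPart(promptText),
      // The MIME type is an assumption; match your image format.
      DataPart('image/jpeg', imageBytes),
    ]),
  ]);
  return response.text; // The generated recipe, or null if blocked.
}
```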
Let's get back to our prompt, which currently looks like this.
But, of course, our app has dynamic input from the form,
so we need to update the prompt in the app
to look more like this.
To add this to the Flutter app, I
copied the prompt text from AI Studio into a TextPart object
and then replaced the specifics, like dietary restrictions,
with values from the form the user fills out.
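The interpolation step can be sketched in plain Dart. The prompt wording and parameter names below are illustrative, not the exact prompt shown in the video:

```dart
// Build the prompt text from the form's values. The final instruction
// mirrors the safety tweak described earlier in the talk.
String buildPrompt({
  required List<String> basics,
  required List<String> dietaryRestrictions,
  required String cuisine,
}) {
  return 'Recommend a recipe based on the ingredients in the attached photo. '
      'Assume I also have these basics: ${basics.join(', ')}. '
      'Dietary restrictions: ${dietaryRestrictions.join(', ')}. '
      'Preferred cuisine: $cuisine. '
      'If the photo contains no edible items, do not return a recipe.';
}
```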
Now, back on the main page of the app, when a user presses
Submit Prompts, the app will generate
a recipe that can be saved.
Let's see it in action.
[CAMERA CLICK]
Great, it all works as expected.
But I think we can do more with AI.
Namely, I want to give Chef Noodle
a more interesting personality.
ANDER DOBO: Let's see what happens
if we update the prompt by adding,
you are a cat chef who travels around the world,
and your travels inspire recipes.
With this update to the prompt, let's reload
and see what happens.
And now Chef Noodle tells us something
interesting with each recipe.
ERIC WINDMILL: Lastly, I want to talk about structuring data when
working with the Gemini API.
By default, when we started, the Gemini API
returned the recipe and all the accompanying data as Markdown.
In the beginning, this was great,
and parsing out a title, a list of ingredients,
and a list of instructions was simple.
But as our prompt became more complex
and we were requesting more information like nutritional
information and allergens, it became
impossible to parse out the information reliably.
But then I realized I was thinking with my pre-LLM brain.
This isn't a problem that I need to solve with code.
It's a problem I can solve in the prompt itself.
So I added explicit formatting to the prompt.
This mostly worked right away, but occasionally, the Gemini API
would return different properties as different types.
For example, sometimes ingredients
would be a list of strings, and sometimes it
would just be a long string.
To solve this, I added the expected types to the prompt.
And since that update, the responses
have been in the expected format,
and I've been able to reliably deserialize the response as JSON
without throwing exceptions.
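A defensive parsing step along these lines keeps deserialization from throwing even when a field's type drifts. The field names match the ones the talk mentions parsing out; the helper name and the tolerant typing are assumptions:

```dart
import 'dart:convert';

// Accept a field that may come back as either a single string
// or a list of strings, and normalize it to a list.
List<String> asStringList(dynamic value) => value is List
    ? value.map((e) => e.toString()).toList()
    : [value.toString()];

// Parse the model's JSON response into a predictable shape.
Map<String, dynamic> parseRecipe(String responseText) {
  final json = jsonDecode(responseText) as Map<String, dynamic>;
  return {
    'title': json['title'].toString(),
    'ingredients': asStringList(json['ingredients']),
    'instructions': asStringList(json['instructions']),
  };
}
```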
And that's the app.
Now I have a personal chef in my pocket,
and I didn't have to build a database of recipes to get it.
And at the end of a long day, I don't
have to type a list of ingredients
into my phone to find a recipe to make.
I just snap a picture, and Chef Noodle figures it out for me.
And, of course, this is a Flutter app.
So, when it's time to start cooking,
I can switch to my Pixel Tablet so it's easier
to follow the recipe hands-free.
As a Flutter developer, I like building interesting
UIs. I find it fun to focus on the user experience
and animations, and I don't find it
fun to tinker with server-side logic.
It's pretty incredible that using the Gemini API in Flutter,
I was able to build an app that has real-world functionality,
and I spent almost none of my coding time
on building out a server and writing database queries.
ANDER DOBO: As a product manager,
I like continuously improving the product and experience
for users.
In the future, I'd like to make Chef Noodle more
interactive with chat.
I'm looking forward to using the Gemini API's chat feature
to build that version of the app.
Right now, you can check out this video's description
for links to the GitHub repo for this app
and some other useful resources.
I hope you have seen the potential
of building AI-driven apps with the Gemini API.
And if you'd like to use the Gemini API in your Flutter
or Dart apps, head to the Quickstart to get
started with the Google AI SDK.
We can't wait to see what you will build.
[MUSIC PLAYING]