Build generative AI agents with Vertex AI Agent Builder and Flutter
Summary
TL;DR: At Google I/O, the Dart and Flutter team introduced a generative AI-powered Flutter app that helps users learn more about their photographs. The app uses the Gemini API to identify the subject in a photo and provide a detailed description, and it adds an AI agent with a chat interface for follow-up questions. The talk covered building the app with the Vertex AI for Firebase Dart SDK, building an AI agent with Vertex AI Agent Builder to ground responses in external data, and optimizing the Flutter app for multiple platforms. The result is an adaptive, cross-platform app that enriches the user experience through generative AI.
Takeaways
- 📸 Khan, a developer relations engineer at Google, introduced a generative AI-powered Flutter app that enhances photos by providing detailed information about the subjects within them.
- 🌐 The app leverages the Gemini API, Google's advanced AI model, which supports multimodal understanding, including text, images, videos, and audio.
- 🔍 Users can interact with the app by selecting a photo, which the app uses to identify the subject and provide a description, as well as answer follow-up questions through an AI chat interface.
- 🛍️ The app can also identify Google merchandise, providing information on pricing and offering direct links for purchase.
- 🛠️ The talk covered the technical aspects of building the app, including using the Vertex AI for Firebase Dart SDK to send prompts and photos to the Gemini API.
- 🤖 The importance of AI agents was discussed, with the introduction of Vertex AI Agent Builder for creating chat agents that can reason and orchestrate tasks with external tools.
- 🔗 The concept of RAG (retrieval-augmented generation) was explained, highlighting how it connects language models with external data sources to provide up-to-date and accurate information.
- 📚 The app development process included creating a data model, implementing camera and image selection features, and using various Flutter packages to enhance functionality.
- 🎨 Flutter's capabilities for building adaptive and responsive apps across multiple platforms were demonstrated, ensuring a consistent user experience on different devices.
- 🔄 The talk emphasized the efficiency of Flutter's hot reload feature, which allows for rapid development cycles and a great developer experience.
- 🌟 The combination of Flutter and Google Cloud provides developers with powerful tools to build, scale, and reach more users with their applications.
Q & A
What is the main theme of the talk at Google I/O?
-The main theme of the talk is about building a generative AI-powered Flutter app that helps users learn more about their photographs using the Gemini API and Google Cloud's Vertex AI.
Who are the speakers at the Google I/O talk?
-The speakers are Khan, a developer relations engineer on the Dart and Flutter team at Google, and Cass, a developer advocate for the Cloud AI team.
What does the app developed in the talk allow users to do?
-The app allows users to select a photo, identify the subject within the photo, get a description of the subject, and chat with an AI agent to ask questions and learn more about the subject.
How does the app identify the subject in a photo?
-The app uses the Gemini API to identify the subject in the photo. It sends a prompt along with the user's photo to the API, which then returns the subject and a description.
What is the role of the Vertex AI for Firebase Dart SDK in the app?
-The Vertex AI for Firebase Dart SDK is used to call the Gemini API. It handles the configuration and setup needed to communicate with the API, making the process easier for the developers.
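As a rough illustration of that flow, here is a minimal Dart sketch of calling the Gemini API through the Vertex AI for Firebase Dart SDK. The prompt string and image bytes are placeholders, and the firebase_vertexai API shown is a best-effort outline rather than the talk's exact code:

```dart
import 'dart:typed_data';

import 'package:firebase_vertexai/firebase_vertexai.dart';

// Sketch: send a text prompt plus the user's photo to Gemini and return the
// raw response text. Assumes Firebase has already been initialized in main().
Future<String?> describePhoto(String prompt, Uint8List imageBytes) async {
  final model = FirebaseVertexAI.instance.generativeModel(
    model: 'gemini-1.5-pro', // or 'gemini-1.5-flash' for lower latency
    generationConfig: GenerationConfig(
      temperature: 0, // minimize randomness for consistent output
      responseMimeType: 'application/json', // ask Gemini for JSON back
    ),
  );

  final response = await model.generateContent([
    Content.multi([
      TextPart(prompt),
      DataPart('image/jpeg', imageBytes),
    ]),
  ]);
  return response.text;
}
```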
Why is the Gemini 1.5 Pro model significant according to the talk?
-The Gemini 1.5 Pro model is significant because it features a mixture-of-experts (MoE) architecture for efficiency and lower latency, supports a large context window of up to 2 million tokens, and has multimodal understanding, capable of processing text, images, videos, and audio.
What is the purpose of the AI agent in the generative AI app?
-The AI agent serves to provide a chat interface for users to interact with, ask questions, and receive more information about the subject of their photos. It uses reasoning and orchestration to dynamically integrate with various external tools and data sources.
How does the app handle user requests for information not available in Wikipedia?
-The app uses an AI agent that can access multiple data sources. For example, if the user asks about a Google merchandise item, the agent can use the Google Shop data set to provide information such as pricing and a purchase link.
What is the significance of the 'tell me more' feature in the app?
-The 'tell me more' feature enhances user engagement by providing a chat interface where users can ask questions about the subject in their photo. It allows for a more interactive and informative experience.
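A minimal sketch of how the 'tell me more' chat screen could be assembled with the flutter_chat_ui package; the widget name, user id, and wiring are illustrative, and the agent call itself is left to the caller:

```dart
import 'package:flutter/material.dart';
import 'package:flutter_chat_types/flutter_chat_types.dart' as types;
import 'package:flutter_chat_ui/flutter_chat_ui.dart';

// Sketch: a bare-bones chat screen backed by flutter_chat_ui. The messages
// list and onSendPressed callback would be wired up to the AI agent.
class TellMeMoreChat extends StatelessWidget {
  const TellMeMoreChat({
    super.key,
    required this.messages,
    required this.onSendPressed,
  });

  final List<types.Message> messages;
  final void Function(types.PartialText) onSendPressed;

  static final _user = types.User(id: 'local-user');

  @override
  Widget build(BuildContext context) {
    return Chat(
      messages: messages,
      onSendPressed: onSendPressed,
      user: _user,
      showUserAvatars: true,
      showUserNames: true,
    );
  }
}
```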
How does the app ensure it works well across different platforms and devices?
-The app uses Flutter, which allows for a single code base to run on multiple platforms. It also implements responsive and adaptive design principles to optimize the user interface and experience for different devices like mobile phones, tablets, and desktops.
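As a small example of the responsive half of that idea, the sketch below measures the available width and branches between a column and a row. The 600-pixel breakpoint is an arbitrary illustration, not a value from the talk:

```dart
import 'package:flutter/material.dart';

// Sketch: branch between a mobile-style column and a wide-screen row based on
// the measured width. The 600px breakpoint is an arbitrary example value.
class ResponsiveIdLayout extends StatelessWidget {
  const ResponsiveIdLayout({super.key, required this.photo, required this.card});

  final Widget photo;
  final Widget card;

  @override
  Widget build(BuildContext context) {
    return LayoutBuilder(
      builder: (context, constraints) {
        final isWide = constraints.maxWidth >= 600;
        final children = [Expanded(child: photo), Expanded(child: card)];
        return isWide ? Row(children: children) : Column(children: children);
      },
    );
  }
}
```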
What are some of the Flutter packages used in the app development?
-Some of the Flutter packages used include 'permission_handler' for device permissions, 'image_picker' for selecting images and launching the camera, 'image_gallery_saver' for saving images, and 'flutter_chat_ui' for the chat interface.
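To make the division of labor concrete, here is a hedged sketch of how permission_handler and image_picker can work together; the helper function is illustrative, not the app's actual code:

```dart
import 'dart:io';

import 'package:image_picker/image_picker.dart';
import 'package:permission_handler/permission_handler.dart';

// Sketch: request camera permission when needed, then either launch the
// camera or open the gallery picker. Returns null if the user backs out.
Future<File?> pickPhoto({required bool fromCamera}) async {
  if (fromCamera) {
    final status = await Permission.camera.request();
    if (!status.isGranted) return null;
  }
  final picked = await ImagePicker().pickImage(
    source: fromCamera ? ImageSource.camera : ImageSource.gallery,
  );
  return picked == null ? null : File(picked.path);
}
```

As described in the talk, the only thing that changes between taking a new photo and choosing an existing one is the ImageSource value, which is what keeps the two paths on a single code path.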
How does the app handle the potential issue of outdated or 'hallucinated' information from the AI?
-The app addresses this by using an AI agent that can connect to external data sources, ensuring that the information provided is up-to-date and accurate. This approach, known as retrieval-augmented generation (RAG), combines the AI model with a retrieval system to fetch the latest information.
What is the role of Vertex AI Agent Builder in building the app?
-Vertex AI Agent Builder provides a suite of tools for building AI agents with orchestration and grounding capabilities. It allows developers to easily design, deploy, and manage AI agents that can reason and interact with various external tools and data sources.
How does the app ensure a consistent and less random output from the Gemini API?
-The app sets the temperature to zero when calling the Gemini API. This minimizes randomness and ensures more consistent output from the API for each request.
What is the importance of the JSON format support in Gemini 1.5 Pro model?
-The JSON format support in the Gemini 1.5 Pro model allows for easier extraction of information by the Flutter app. It enables developers to use Dart's pattern matching to efficiently parse and utilize the data returned by the API.
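A short sketch of that destructuring step; the JSON key names follow the prompt described in the talk but should be treated as illustrative:

```dart
import 'dart:convert';

// Sketch: pull fields out of Gemini's JSON response with Dart 3 pattern
// matching. The key names ('name', 'description', 'suggestedQuestions') are
// assumed from the prompt described in the talk.
({String name, String description, List<String> questions})? parseResponse(
    String jsonText) {
  final decoded = jsonDecode(jsonText);
  if (decoded
      case {
        'name': String name,
        'description': String description,
        'suggestedQuestions': List<dynamic> questions,
      }) {
    return (
      name: name,
      description: description,
      questions: questions.cast<String>(),
    );
  }
  return null;
}
```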
What are some considerations for deploying a generative AI app in a production environment?
-When deploying a generative AI app in production, considerations include ensuring the information provided is up-to-date, grounding the model with external data sources for accuracy, and handling user requests in a way that maps to the appropriate functions and tools.
How does the app provide a good developer experience while building with Flutter?
-The app provides a good developer experience by leveraging Flutter's capabilities for writing a single code base that runs on multiple platforms, using official packages for common functionalities, and adhering to best practices for responsive and adaptive design.
Outlines
📸 Building a Generative AI-Powered Flutter App
The speaker, Khan, introduces the concept of a generative AI-powered Flutter app that enhances the understanding of photographs. The app allows users to take a photo, which the system identifies and provides information about. It also offers a chat interface with an AI agent to ask questions about the subject. The talk will cover the development process, including building a prompt for the Gemini API, using the Vertex AI for Firebase Dart SDK, and integrating AI agents with the app. The app is designed to be adaptive across all Flutter's target platforms.
🌟 Introducing the Gemini Family of Models
Cass, a developer advocate for the Cloud AI team, introduces the Gemini models, emphasizing the capabilities of Gemini 1.5 Pro. The model features a mixture-of-experts (MoE) architecture for efficiency, a large context window supporting up to 2 million tokens, and multimodal understanding across text, images, video, and audio. The talk demonstrates how to use the Gemini model with Google Cloud's Vertex AI, including uploading images and receiving responses in JSON format for easy app integration.
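The exact prompt wording from the session is not reproduced in this summary; the constant below is an illustrative stand-in for the kind of structured request described:

```dart
// Illustrative stand-in for the kind of prompt the talk describes: ask for
// the subject's name, a description, suggested questions, and JSON output.
const identifyPrompt = '''
You will be shown a photo. Respond only with JSON containing:
  "name": the name of the main subject in the photo,
  "description": a short description of that subject,
  "suggestedQuestions": a list of questions a user might ask about it.
''';
```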
🛠️ Constructing the Flutter App for Photo Identification
The session focuses on building a Flutter app that integrates with the Gemini API. The app's data model maps the JSON response from the API, and a quick ID feature allows users to take or select a photo, which is then sent to the Gemini API for identification. The app utilizes various Flutter packages for permissions, image selection, and camera functionality. The Firebase Vertex AI package simplifies the process of calling the Gemini API and handling responses.
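A data model along the lines sketched here might look like the following; the class, field, and JSON key names are assumptions based on this outline rather than the talk's actual code:

```dart
// Sketch of a metadata model that maps the JSON returned by the Gemini API.
// Class, field, and key names are assumptions based on the outline above.
class PhotoMetadata {
  const PhotoMetadata({
    required this.name,
    required this.description,
    required this.suggestedQuestions,
  });

  final String name;
  final String description;
  final List<String> suggestedQuestions;

  factory PhotoMetadata.fromJson(Map<String, Object?> json) => PhotoMetadata(
        name: json['name'] as String? ?? '',
        description: json['description'] as String? ?? '',
        suggestedQuestions:
            (json['suggestedQuestions'] as List? ?? const []).cast<String>(),
      );
}
```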
🤖 Enhancing the App with AI Agents and External Data Sources
Cass discusses considerations for deploying a generative AI app, such as ensuring information accuracy and connecting to external data sources. The concept of AI agents is introduced, which combines reasoning and orchestration with external tools to provide up-to-date and relevant information. The talk outlines the benefits of using AI agents and how they can be integrated into the app to enhance user experience and information accuracy.
🔧 Building and Integrating AI Agents with Vertex AI Agent Builder
The talk delves into building AI agents using Vertex AI Agent Builder, focusing on the reasoning engine, which is built on the popular open-source framework LangChain. The process involves defining tools with Python functions, creating an agent with these tools, and deploying the agent to the reasoning engine runtime. The agent can then be accessed via the Vertex AI SDK or REST API, providing a scalable, fully managed platform for AI agents.
💬 Integrating AI Agents with Flutter for a Rich User Experience
The session covers integrating the AI agent into the Flutter app using a chat interface, allowing users to interact with the agent for more information about the photo subject. The app uses the HTTP package to make network calls to a Cloud Run endpoint, which communicates with the reasoning engine runtime. The Flutter Chat UI package is utilized to create a chat interface, and the app is designed to be adaptive to different devices, ensuring a consistent user experience.
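A hedged sketch of that network call; the Cloud Run host, path, and query parameter names are placeholders, since the real endpoint is not given in this summary:

```dart
import 'package:http/http.dart' as http;

// Sketch: send the subject name, description, and the user's question to a
// Cloud Run endpoint that fronts the reasoning engine runtime. Host, path,
// and parameter names are placeholders.
Future<String> askAgent({
  required String name,
  required String description,
  required String question,
}) async {
  final uri = Uri.https(
    'your-agent-endpoint.a.run.app', // placeholder Cloud Run host
    '/ask',
    {'name': name, 'description': description, 'question': question},
  );
  final response = await http.get(uri);
  if (response.statusCode != 200) {
    throw Exception('Agent request failed: ${response.statusCode}');
  }
  return response.body;
}
```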
📱 Making the App Responsive and Adaptive Across Devices
The talk addresses making the Flutter app responsive and adaptive for various devices. The app layout is adjusted for larger displays, and a navigation rail replaces the navigation bar on larger screens. The app uses an 'abstract, measure, branch' approach for responsive design and a capabilities class to determine device features like camera and keyboard presence. The app also introduces a keyboard shortcut for the chat UI and implements dark mode for visual consistency.
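The capabilities-and-policy idea can be sketched roughly as below; the class and member names are illustrative, and the actual capability checks would be backed by platform plugins rather than hard-coded values:

```dart
// Sketch of the capabilities/policy split: describe what the device can do,
// then derive which app features should be enabled. Names are illustrative.
abstract class DeviceCapabilities {
  Future<bool> get hasCamera;
  Future<bool> get hasPhysicalKeyboard;
  bool get hasTouchScreen;
  bool get isWeb;
}

class DevicePolicy {
  DevicePolicy(this.capabilities);

  final DeviceCapabilities capabilities;

  Future<bool> get supportsTakingPhotos => capabilities.hasCamera;

  Future<bool> get supportsKeyboardShortcuts =>
      capabilities.hasPhysicalKeyboard;

  // Stacked check: pull-to-refresh only on small touch screens, and not web.
  bool supportsPullToRefresh({required bool isSmallDisplay}) =>
      capabilities.hasTouchScreen && isSmallDisplay && !capabilities.isWeb;
}
```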
🚀 Recap and Conclusion: Building with Gemini and Flutter
The speakers recap the talk, highlighting the introduction of the Gemini 1.5 Pro model, the concept of AI agents, and the use of Vertex AI Agent Builder. They also emphasize the benefits of building with Flutter for multi-platform availability and the scalability offered by Google Cloud. The talk concludes with resources for learning more about Gemini and Flutter, and an invitation to explore the possibilities of building projects with these technologies.
Keywords
💡Google IO
💡Dart and Flutter team
💡Gemini API
💡Vertex AI
💡AI Agents
💡Multimodal Understanding
💡Firebase
💡Responsive and Adaptive Design
💡Flutter
💡JSON Format
💡Cloud Run
Highlights
Introduction of a generative AI-powered Flutter app that enables users to learn more about their photographs by identifying and providing information on the subjects within them.
Demonstration of the app's functionality to identify a photo's subject, such as the Lone Cypress tree in Monterey, California, and provide a description through an AI agent.
The app's ability to recognize Google merchandise, provide pricing information, and offer a direct link for purchase, enhancing the shopping experience for users.
Explanation of the four-part talk structure covering the building of a prompt for the Gemini API, Flutter app development, AI agents, and chat UI integration.
Introduction of the Gemini family of models, highlighting their capabilities and the latest version's features, such as the mixture-of-experts (MoE) architecture and large context window support.
Showcasing the multimodal understanding of the Gemini model, which supports text, images, videos, and audio, allowing for complex tasks like summarizing videos or suggesting code improvements.
The ease of getting started with the Gemini model through Google Cloud's Vertex AI, where users can build prompts and receive responses in JSON format for app integration.
Flutter's role as an open-source, multiplatform app development framework allowing code reuse across Android, iOS, web, macOS, Windows, and Linux.
Building the quick ID feature in the Flutter app that sends a user's photo and prompt to the Gemini API, utilizing the Firebase Vertex AI package for simplified API calls.
The importance of considering the limitations of generative AI, such as potential outdated information or 'hallucinations', and the need to ground models with external data sources.
Introduction of the concept of AI agents and their role in improving generative AI apps by reasoning and orchestrating with external tools and data sources.
Google's Vertex AI Agent Builder and its capabilities for building AI agents, including a reasoning engine built on LangChain that integrates with Google Cloud services for easy deployment.
The process of defining tools for the AI agent using Python functions with proper signatures and comments, allowing the agent to understand how to use these tools to answer user requests.
Integration of the AI agent with the Flutter app using a chat interface, enabling users to interact with the agent and receive more information about their photos' subjects.
Adaptive design considerations for Flutter apps to ensure optimal user experiences across different devices, including responsive layouts and device capability checks.
The use of Flutter's widgets and packages to help optimize apps for both responsive and adaptive designs, allowing for code reuse and efficient development.
Final app presentation showcasing its ability to help users learn more about their photographs through Flutter and Vertex AI, accessible on various devices.
Recap of the talk's content, emphasizing the capabilities of Gemini 1.5 Pro, AI agents with Vertex AI Agent Builder, and the flexibility of Flutter for multiplatform app development.
Transcripts
morning everyone we are so happy to be
here at Google IO with y'all today I'm
Khan I'm a developer relations engineer
on the dart and flutter team at
[Applause]
Google thank you so there's the saying
that a picture is worth a thousand words
as an amateur photographer myself I tend
to agree with the statement whether I
whenever I travel somewhere new I love
to take photos to capture the moment and
remember that place sort of like a
souvenir in fact I took all the photos
that you see on screen right now so over
the years I found myself in this habit
of taking these photos coming home going
through them one by one editing them and
then somehow I end up in this Wikipedia
Rabbit Hole reading page after page
learning about the location that I just
captured on my camera quite a bit more
than a thousand words I'd say as Cass
and I were talking about this project we
wondered what if we had an app for this
an app that lets you dive deeper and
understand more about the photos that
you've just
taken so in this talk we're going to
walk you through our journey building a
gen AI powered flutter app that lets users
dive deeper and learn more about their
photographs let's take a
look so you can select a photo uh here
I'll pick a photo that I took in
Monterey California
give it a moment and it will identify
the subject in this case the lone
cypress tree and gets a description of
that subject as well you want to learn
more tap tell me more and here you can
chat with con your AI agent and ask her
questions about the subject of that
photo and if you happen to be wandering
around Mountain View and come across
some cool Google merch like our favorite
Chrome Dino pin you can take a photo
with the app and it'll identify the
product whether it's official Google
merch and how much it costs and if
you're an impulse shopper like me it'll
even give you a direct link to purchase
the product as
well now that you've seen the app we'll
show you how it was built this talk is
broken down into four
parts first we'll walk you through the
Journey building a prompt for the Gemini
API then we'll show you how we built a
flutter app that sends a prompt along
with the user-selected photo to the Gemini API
using the vertex AI for Firebase Dart
SDK after that we'll tell you why you
should be using AI agents and show you
how you can build one with vertex AI
agent Builder and finally we'll show you
how we added the chat UI to our app that
lets users chat with the agent and how
we can make the app adaptive across all
six of flutters Target platforms as you
can see we cover a little bit of
everything in this talk if you're
interested in learning about gen AI we've
got you covered if you're a cloud
engineer you'll get to see some of the
cool new tooling that we have at Google
cloud and if you're an app developer
we've got some flutter code for you as
well ultimately we're bringing all these
skill sets together to build something
awesome whether you're interested in
flutter Cloud app development or
generative AI by the end of this talk
you're going to have a good idea of what
the process looks like for building a
generative AI agent powered app with
vertex AI agent Builder and flutter from
beginning to end Cass would you like to
kick us off thank you
Khan hi uh I'm Cass I'm a developer advocate for the Cloud AI team and I'd like to first introduce the Gemini models the Gemini family of models the most capable models we have
ever built and that is a result of the
large scale collaboration across Google
including Google Deep Mind and Google
research
the latest version Gemini 1.5 Pro has
three major features the first one is
the mixture of experts or MoE architecture that means we are not using the entire large model for every prediction request rather for each specific task we use only part of the model the so-called expert networks so that you can get much higher efficiency and much shorter latency especially you can experience that with the latest model Gemini 1.5 Flash and the second it supports a
very large context window yesterday we
announced that this model now supports up to 2 million tokens how large is that you could pass a document twice the size of the entire Lord of the Rings series as a single prompt and ask about any event or any person anything in the story and also you can upload multiple documents multiple videos or many tens or hundreds of images in a single prompt and ask anything about them the third is multimodal understanding and reasoning as I mentioned it doesn't support only text it supports videos and images and audio as well
so for for example you can let the
the Gemini uh taking the all the the
tens of minutes of the video and
summarize what happened inside the video
or you can ask the uh the Gemini to take
the very long complex intricate
programming code and ask uh make some
any suggestions on the improvements on
the
code in this example you are seeing on the right we have uploaded 44 minutes of a Buster Keaton movie along with an illustration of a water tower and a person underneath and asked when this happens then Gemini replies that this happens at 15 minutes 32 seconds and you can get started with the Gemini model by using Vertex AI Studio in the Google Cloud console it's so easy as you can see in the video right now you can just open the console and the studio choose the model you can choose Gemini 1.5 Pro or maybe you can try out 1.5 Flash that was announced yesterday and then build a prompt you can write the text prompt and also you can upload an image like an image of the Chrome Dino pin and press submit and you'll be
seeing the result within the
seconds that's so easy so we have tried
uploading the photos Khan took like this one this is a very beautiful photo taken by Khan of Fallingwater so you can upload this image to the console and ask what is this then you get a response like this this is Fallingwater a house designed by Frank Lloyd Wright the famous architect in 1935 and so on and the response already looks great the model understands deeply about
the house but as we are building a gen AI app we have to ask some other things of Gemini especially we want to read the response in a machine readable format so we need to tweak this prompt a bit here we have added a few requests in the prompt first the name of the object second a description third suggested questions for the object so the user can explore more about the object and as a final request we asked Gemini to output the response in JSON format then you will get a response like this the Gemini 1.5 Pro model supports JSON format output so that the Flutter app can easily process it and extract any element from it
now we have the prompt to get the
information about the building or the object in the photo now I'll directly pass the stage back to Khan so she can share how we can incorporate this prompt in her Flutter app thank you
Cass cool let's talk about building a
flutter app before we jump in for the
folks who aren't familiar what is
flutter well flutter is a an open source
multiplatform app development framework
let you write one code base that runs on
Android iOS web Mac OS windows and
Linux so once we had the structure of
the data that was coming back from the
Gemini API we could jump into building
the app the first thing that we did was
create a data model that Maps the Json
that we'd be getting back from the
Gemini API you can see the metadata
object has an image name description and
a list of suggested
questions from there we started building
the quick ID feature this feature
presents the user with the option to
take a photo or select an existing image sends it along with a prompt to the
Gemini API and identifies the subject of
that image you'll see throughout this
talk we lean heavily on pub.dev Flutter and Dart's official package repository to
bring this app to
life to implement the camera and image
selection we use three packages
permission Handler image picker and
image gallery saver permission handler
was used to request the necessary device
permissions for accessing the camera and
camera roll image picker provided
functionality for selecting the image
from the camera roll now there's an
added perk here if users are on Android
it'll trigger the default Android photo
picker which has built-in Google photos
integration so you can actually pick from photos that have been shared with you through Google Photos but aren't necessarily directly on your camera roll image picker also provides
functionality to launch the camera and
take a new photo as well the only thing
that changed in our code between picking
a photo from the camera roll and taking
a new photo was the image picker Source
parameter regardless of the source image
picker gives us the image file and we
can decide what to do with it if it's a
new photo we'll use image gallery saver
to save it to the camera roll and then
call the send vertex message
method speaking of to call the Gemini
API in order to identify the subject in
the photo we use the Firebase vertex AI
package also known as the vertex AI for
Firebase Dart SDK which is now officially
in public
preview calling the vertex AI Gemini API
is essentially making a rest call
however trust me when I say that the SDK
makes the process much easier since it
handles most of the configuration and
setup that you need to
do so to start we initialized the model
instance and specified the specific
Gemini model that we want to use Gemini
1.5 Pro we also added configurations
like setting the temperature to zero so
that we can minimize the randomness and
get more consistent output from the
Gemini API each time and since we're
using gem 1.5 Pro we also get to set the
response type to Json so no more
checking for and removing those back
ticks uh in the
response then we Define the method to
send the prompt to the API called send
vertex message this method packages up the prompt that Cass shared earlier
along with the user selected image and
it sends it to the Gemini API if you
want to optimize for faster response
times you can resize the image to be
smaller as well when response comes back
it gets decoded and stored as a metadata
State object and here's a pro Tip since
the Gemini API is returning the data in
Json format we use darts pattern
matching for a less verbose way of pulling
out all the information that we needed
here it's taking the Json map and
grabbing the values for the keys name
description suggested questions and
assigns it to local variables as someone
who came from a language without pattern
matching I love it and by the way we
just introduced experimental support
for macros in Dart so I'm looking
forward to a future where we don't even
have to write these
methods
W fantastic job by the dart team by the
way so once we have the metadata objects
representing the response we pass it to
our custom card widget to display it on
the screen you can see we also provided
a loading State variable which triggers
that shimmer placeholder that you see
there and if you're familiar with the
Google Generative AI package you're probably thinking that this code looks really really familiar that's because Vertex AI for Firebase Dart SDK is built on top of the Google Generative AI package different endpoints but similar structure the best part is if you're already familiar with the Google Generative AI package you'll be right at home with the Vertex AI for Firebase Dart
SDK and there you have it a mobile app
with a fully built quick ID feature that
that identifies the subject of the photo
now Cass what are some of the
considerations that we have to think
about when we're building a gen AI app
like this one
okay so let's think about how you can improve this app when you want to deploy it as a production app for corporations and enterprises or use it for a wide variety of use cases first we have to realize that the information any LLM generates may not be the latest information because the LLM is trained with the data available at the time the model was trained so it may not reflect the latest information you have right now there could be cases where the app shows some outdated information or so-called hallucinations you could get hallucinated answers that are not based on the
facts so you may want to connect the
model to the external data source if a
users are expecting for example some the
Enterprise data source uh that is
proprietary to the a specific company uh
then you need to have a way to connect
the model to those data sources or
so-called grounding the models but how
would you do that if you write a prompt
uh you want to write a promp to extract
what would be the U most important
keyword from the entire user request or
prompt to build an SQL query or search
query for the uh the uh search engines
or the vector
database and what if you have the
multiple data sources not only a single
data source but a multiple like a five
servers or the the documents or anything
or maybe the internet the the web pages
then how you can choose the right data
source for for solving a problem
requested by the specific user
request for example in our apps use case
we have two different the images one is
the falling water images that could be
maybe found in the Wikipedia The another
one is the Chrome din pin and Wikipedia
doesn't have the information how much
does it cost so how would you solve this
problem to solve these problems we'd
like to introduce the concept of the AI
agents to our flatter
up why do we need AI agent
in the early early days of the
generative AI many people have been
using the LM models alone without any
external data data store or anything so
they could possibly get outdated information and don't have any access to corporate proprietary data or maybe
you could be seeing any hallucinated
answers so that's the reason why we are
starting to hear the new buzzword called
rag or retrieval augmented generation
that is all about connecting or
grounding the the LM model with the
external data set with r you can combine
the L models with the back and retrial
system this could be anything you have
right it could be a vector database
somehow the vector database is the most
popular choice right now but you can
even use the SQL database or usual
search engine like elastic search as the
retriever back end so every user request
you can take some important query from
that and make the query against those
the the search engines or databases to
find most relevant information and embed
that documentation or informations as a
part of the prompt for the the Gemini or
LM models so that you can feed the
latest information or the company's proprietary information as a part of the prompt to reduce the risk of hallucination
and get the latest informations as the
response from the llm
so that was Rag and uh now we have ai
agent this is a natural Evolution from
rag in this architecture you you have
you would deploy an application called
Agent to the runtime environment this
agent has the capability of reasoning and orchestration on top of the RAG system orchestration with an agent works in
two ways first it takes an user request
and determines which external data
sources which we call tools could provide the best way to find
relevant information for example if you
get your query on the falling water
maybe the model thinks Wikipedia could
be the right place to find
it and or using request can involve some
actions rather than the query for
example the users may want to reserve a
ticket or make a some purchase on the
the product or items then you can also
access the other apis to make those
transactions if POS if you
want that's the first one the second
aspect of the orchestration is that the
agent determines how to convert the user
request to the function code of the tool
with AI agent you don't have to take
care of the any converting the user
request taking out the any important
keywords and put into the uh SQL or the
vector database
query in later section I will explain
how this reasoning and orchestration
work so AI agent works just like a human
agent for the user but it running inside
runtime and who knows what does the user
request mean and also that knows the
best tool to answer your question and it
knows also knows how to map the user
request to the function CS with those
multiple
tools that is AI agent so in short the
AI agent is an application that uses the
power of LM for reasoning and
orchestration with external tools to
achieve the goal it inherits all the
benefits of R such as gring with the
external data source to get the latest
informations and the lower the risk of
the HM plus it add the flexibility like
a human agent for choosing the right
tool or trying the multiple tools to
satisfy the
request AI agent is a new way of
dynamically and semantically integrating loosely coupled distributed systems with the power of
LLMs last April Google announced a new suite of products called Vertex AI Agent Builder which is a suite that contains multiple products and tools for building AI agents with orchestration tools and grounding with search engines with Vertex AI Agent Builder there are different approaches and products for building AI agents in this session we'd like to focus on the reasoning engine also called LangChain on Vertex AI as its name suggests it's a product based on the most popular open source tool called LangChain that is used for building chatbots and RAG systems with the flexibility and ecosystem of LangChain the reasoning engine tightly integrates it with Google Cloud services with a fully managed runtime for the AI agents the reasoning engine lets developers transparently access the benefit of another service called function calling which works like a magic wand to easily map the LLM output to your Python function also we will use Vertex AI Search for retrieving data from an external data source so let's see how to build an AI agent with this reasoning engine first you start defining your tool by writing Python code this function uses the Wikipedia API to find the relevant Wikipedia page for a specified object and returns the full
text of the page this tool could be used
to feed the facts and the latest
information on the any popular object on
your on your
photos and now the most interesting part is here you don't have to write any other metadata based on the OpenAPI specification no more writing metadata for the function instead you can just write yet another usual Python function with the proper function signature and comment this is the most important part so that function calling can read those comments and the function signature and find out how and when this function should be used to answer the user request and how to map the user request to the parameters of this function in this
example the comment explains that this
is this function uh will search for a
Wikipedia page for a specified object so
if the agent received an an user request
that is likely to be solved by Wikipedia
Pages then the agent CA this
function then what about the Google shop
items like a chrome dyo pin you saw area
Wikipedia page cannot solve this problem
so we need another
tool to show the flexibility of AI agent
we have added another tool for searching
with Google merchandise shop data set
this tool can feed the product name
description pricing and the page link on
any Google shop
products this function uses a Vertex AI Search endpoint that runs a query on the Google Shop data set Vertex AI Search is a search engine that has the cutting edge search technology of Google Search packaged as an out-of-the-box fully managed service if you have any enterprise data in BigQuery Cloud Storage CSV or JSON files or any other popular data format then you can easily import it into Vertex AI Search build an index and run a query on it in my case when I was building this demo it took only a couple of hours for importing the entire data set from the Google Shop data set into Vertex AI Search building the index and defining this function as a tool for the reasoning engine it was so
quick now we have two tools Wikipedia
search tool and Google shop Search
tool so next thing we have to do is
create an agent with the two tools when
this this agent receives as user request
function calling checks what kind of
functions are available with these tools
and then pick a tool that should provide
the best way to satisfy the user request
and for the request to its
function this is how the reing and
orchestration work inside the reing
engine lastly you can deploy the agent
to the reasoning engine runtime it
creates a container image that
encapsulate the agent and tools the run
time provides a scalable fully managed
platform for the agent without effort of
designing building and operating your
own infrastructure for providing AI
agents and that's it you can start using
the agent with the verx AI SDK or rest
API from your app by specifying the
which agent you want to use and passing
the
request with the Flutter app we are building now we used Cloud Run as a simple endpoint for receiving requests from the app and passing them to the reasoning engine runtime this agent running on the runtime does the reasoning and orchestration with the two tools for example if the request is related to any popular objects and landmarks then the agent calls the Wikipedia tool that calls the Wikipedia API or if the user request is related to any Google Shop items like the Chrome Dino pin then the agent calls the Google Shop tool that uses Vertex AI Search to look for the Chrome Dino pin in the Google Shop data set so that's how the back end works now it's ready to integrate this agent with the Flutter app back to Khan thanks Cass all right let's talk about integrating
the AI agent with flutter we'll
integrate the AI agent into our app
using a chat interface this tell me more
feature presents a user with a chat UI
where they can ask their AI agent con
for more information about the subject
in the photo we'll surface Gemini's
suggested questions here as well to
start off we use the HTTP package to
make our Network calls so we defined the
ask agent method to reach out and hit
that cloud run endpoint that Cass have
put together we construct a URI
including a query string which includes
the subject name description and the
user's question depending on the tools
available in the vertex agent you could
specify a particular data source or tool
but the agent has been good at deciding
the right tool for us so we left that up
to the agent to decide send the reest
the endpoint and finally decode the
response as and return it as a
string one of my first projects as a
flutter developer was building a chat UI
I was just learning so I thought I had
to build everything from scratch I mean
flutter is meant for putting pixels on
the screen it's been a few years now and
I know that there's many devs out there
who've already built fantastic chat uis
and chat apps with flutter so instead of
Reinventing the wheel I reached for the
flutter chat UI package
out of the box the package gives you
essentially everything that you need to
create a chat it handles most of the
chat boiler plate and you can configure
it how you want it you can see I
constructed the chat object and then
configured a theme determined whether it
should display avatars and
names specify the current user so that
it knows which messages came from you
the current user on the device some more
interesting configurations the typing
indicator the last thing that you want
is for the user to send a question and
then wonder is anything really happening
on the back end right now so the typing
indicator shows the three bouncing dots
indicating that the agent is currently
typing uh based on loading State the
list bottom widget is a convenient place
to surface Gemini suggested questions so
that when the chat is first shown the
user can pick from a list of suggested
questions or just use it as
inspiration and last but not least we
also provided a list of messages and an
on send call
back the handle send pressed a method
constructs a new
message adds it to the messages list and
sends it off to the agent by calling
well send message to agent send message
to agent sets the state to loading calls
our ask agent method from earlier to hit
the cloud run API gets the result back
and constructs a new message adds that
to list of messages and sets the loading
State back to false to clear that agent
is typing status and there you have it
with the tell me more feature users can
now ask for more information about the
subject in their photo that's great
we've completed our app since we built
with flutter we can just package it up
and ship it to all the platforms
right not yet sorry to burst your bubble
there so because we know better than
that we know that different devices can
provide very different user experiences
using a mobile app on a desktop doesn't
feel too great so if you're a developer
building multiplatform apps you want to
optimize your apps so that users can
benefit from the user experience for
their particular device a phone a tablet
a laptop and desktop they all have very
different configurations and user
experiences touchcreen or keyboard how
big is a display is there a camera so
there's a lot to think about as a
developer especially when you start
considering all the various capabilities
for every single device so I like to
break down these user device
considerations into two categories
responsive and adaptive responsive is is
a change in how your app looks based on
screen or window size and adaptive is
based on your app's features uh based on
the device configuration flutter has
widgets and packages to help with
optimizing for both so let's look into a
few of the modifications that we made to
the mobile app so that it's responsive
and adaptive for the best experience
across different
devices let's start by looking at how we
made our app
responsive first we adjusted the layout
to be horizontal on wider displays this
lets users get full use out of their
nice large monitors and minimizes the
need to scroll we didn't have to change
anything about the card containing
metadata we only rearranged how the
widgets were laid out it's a column on
mobile But a row on wider screens Plus
instead of pushing the new chat window
as a new screen on larger displays we
can make it pop up as an overlay window
in the main screen again we use the same
exact chat widget regardless of device
with flutter we can reuse a lot of the
components like buttons cards and even
in this case the chat widget across
devices so that minimizes the amount of
new code that we have to write as we
expand on our Target uh devices while
we're on the topic of reusing code
here's a little fun fact Google's first
party flutter apps have an average of
97% code share as said by one of my
favorite TV show hosts you don't have to
take my word for it the classroom team
for example saw a 3X increase in
velocity from a mixture of of only
writing each feature once or at most one
and a half times in scenarios where
there were heavy native components among
other benefits talk about
efficiency so then when we're moving
from Mobile to larger devices again like
tablet desktop or web another change you
often see is the navigation bar don't
get me wrong I love a navigation bar on
mobile but have you ever seen a
navigation bar on a large screen like a
tablet or a desktop it looks like an XL
version of the app that wasn't exactly
designed with that particular device in
mind so while we have a navigation bar
on mobile we change that to a navigation
rail on large
screens for all these responsive designs
we used an abstract measure and Branch
framework the idea is that you first
find shared data between the components
decide how you want to measure the
available space and then conditionally
return UI components it's a really quick
and succinct way to summarize a process
that personally I found could otherwise
be very overwhelming so my teammate Reed
and Tyler actually came up with this
concept and I'm going to steal it from
now on speaking of if you're wondering
that's really vague where is the code
well it's in Tyler and read uh session
where they'll be going into far more
detail on how to build adaptive uis in
flutter definitely check it out and they
have some fantastic tips in their
talk so that was responsive now let's
talk about adaptive how we change the
functionality of our app based on device
configuration a common way to determine
what platform the app is running on is
to use code that looks like this the
dart platform class exposes constant
values that tell you if the app is
compiled for Android iOS Linux Mac or
Windows and the flutter Foundation
libraries KS web constant will tell you
if you're running on web this is great
for something like styling your app if
you want to make sure everything fits in
the native platform design system but
you don't want to assume things like
input modality based on platform instead
of checking the platform and assuming
certain behaviors we can define a
capabilities class to determine whether
a device has well certain
capabilities here we Define methods to
determine whether device has a camera or
a keyboard then you can write a plugin
to check the native platform's apis for
the exact Hardware that you need now I
know this piece is a little hand wavy
here please check out flutterdev
adaptive um for more in-depth guidance
on this piece so from there we can
define a policy class to determine
whether certain features are supported
in our app based on on the device's
capabilities so here we've written code
that determines if the taking a picture
is supported depending on whether device
has a camera and as long as the device
has a physical keyboard the app supports
keyboard shortcuts keep in mind just
because the device can do something
doesn't mean that it should especially
if it's an anti- pattern for that
particular platform something like pull
to refresh doesn't really feel right on
your web browser so in that case you can
stack device capabilities for more
complex configurations so for example
support pull to refresh only if the device has a touchscreen a small display and is not web defining this kind of a device policy
makes our UI code more readable and
understandable but it also makes
developers lives easier when it comes to
mocking device configurations for
testing as well plus if device
capabilities change in the future or
flutter ads support for more platforms
you only have to update your code in one
place in the policy instead of updating
it in every single widget that uses
keyboard shortcuts for
example with that background squared away the first adaptive change
that we made was the showing a take a
photo button we should make sure to
check the device policy before
displaying the button make sure that you
can actually take a photo on that
device the other adaptive feature is for
the folks using a physical keyboard we
introduce a keyboard shortcut control T
to bring up the chat UI we built a
shortcut helper widget to wrap our UI
the widget is built using the Callback
shortcuts widget which takes a map of
Key bindings to callbacks and a child
widget flutter has a more powerful
shortcuts actions intense keyboard
shortcut system but for simple use case
the Callback shortcuts widget gets the
job done so here you can see we passed
the map the shortcut helper a map of
keyboard shortcut bindings and a child
our content whatever is being displayed
on the
screen and finally for that last bit of
Polish
of course we had to implement dark mode
since we use materials color scheme from
seed method to generate our color scheme
we passed the brightness State variable
that was controlled by switch on the
settings
page and after all that work we have an
app that helps you dive deeper and learn
more about your photographs powered by
flutter and vertex agent accessible on
every device whether that be a regular
old mobile
phone foldable
device tablet and or
desktop okay that was a lot of
information in 40 minutes let's recap
everything we've talked about Cass you want to start sure for Gemini we have introduced Gemini 1.5 Pro the model provides multimodal AI so our Flutter app can easily pass the photo to the model and get information about the object also with the large context
window you could even pass tens or
hundreds of the photos at once or long
video as the single prompt to the model
and have it explain about the content secondly we have explained the concept of AI agents and how to use Vertex AI Agent Builder to design and deploy them easily it provides a scalable fully managed runtime for the AI agents the agent provides the reasoning and orchestration capabilities that dynamically integrate the gen AI app with a variety of external tools and data sources not only the Wikipedia pages or the Google Shop data set you could also add your own external data set and API as tools what about Flutter well as you saw a single Flutter
app can be made available across
multiple platforms for users on desktop
mobile and web you build for every
screen while optimizing for the best
experience on those devices reach more
users on more platforms the bonus of
scalable faster development Cycles with
one code base flutter is flexible and
the goals to help you build and ship
awesome apps now Cass if anyone's
watching our talk and wants to learn
more about building with Gemini how can they do that sure if you're interested in the Gemini models on Vertex AI please go to Google Search and search for Vertex AI Gemini API and Vertex AI Agent Builder to find documents and tutorials what about Flutter well if you're interested in building with Flutter try it out head to flutter.dev/get-started to experience what we mean
when we talk about a great developer
experience and for those of you who are
already a part of the dart and flutter
Community I know there's a lot of you
[Music]
here check out the new vertex AI for
Firebase Dart SDK and let us know what
you
think ultimately Google Cloud and Flutter are both great tools for developers to scale their productivity individually but they're even better together they're a valuable
multiplier that can help developers be
more productive as developers we want to
have the right tool for the right job
and ultimately we believe flutter and
Google Cloud offers you the flexibility
to build what you want and the option to
scale and reach more users if and when
you need it vertex AI agent Builder
makes building chat agents accessible to
developers and flutter makes the agent
accessible to your end
users for more on all things flutter
please check out the what's new in
flutter talk my teammates and John will
be covering all the coolest new flutter
updates it's been a privilege to be here
with yall today thank you so much for
taking the time out of your day and
giving us the opportunity to share our
project with you if you'd like to watch
a replay of this talk it'll be on
YouTube we hope that we've inspired you
to think about all the cool projects
that are possible with flutter and
Google cloud and we can't wait to see
what you'll build happy Google I/O and
we'll see you around thank you
[Applause]
oh