How to build an IVR with Custom AI Voices (in Dialogflow)
Summary
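TLDR: This video tutorial shows how to build an interactive voice response (IVR) system in 30 minutes. The presenter first introduces Resemble, a company that provides custom AI voices and can quickly create realistic, highly expressive voice models across multiple languages and accents. The presenter then shows how to build the IVR with Dialogflow, a natural language understanding (NLU) engine that is trained to recognize the intent behind a user's input and map it to a response. The video also demonstrates how to integrate Resemble's speech synthesis with Dialogflow to get a real-time conversational IVR without writing any code. Finally, a live phone call shows the integrated IVR at work, covering account inquiries, money transfers, and similar operations.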
Takeaways
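- 😀 The video tutorial explains how to build an interactive voice response (IVR) system in 30 minutes and demonstrates the process live.
- 🛠️ The whole process requires no code, which simplifies building the IVR.
- 🗣️ It introduces Resemble, which provides custom AI voices and can quickly build realistic, expressive voice models.
- 🌐 Resemble's voice models support multiple languages and can translate between them.
- 🎭 Resemble Fill inserts synthetic speech segments into real recorded speech, making an IVR more dynamic and personalized.
- 🔌 Dialogflow is a natural language understanding (NLU) engine that interprets what the user says and maps it to the matching intent.
- 🔑 The key components in Dialogflow are Intents, Entities, and Fulfillment.
- 🔗 It shows how to integrate Resemble with Dialogflow by supplying an API endpoint and an API key.
- 📞 Dialogflow Phone Gateway makes it easy to connect a Dialogflow agent to the phone system for real-time conversation.
- 📈 The video ends with a phone call to a Dialogflow agent that speaks with a Resemble voice, showing the full IVR flow.
- 💬 Viewers with questions can ask them in the chat, or email the Resemble team after the video.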
Q & A
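How do you build an IVR system in 30 minutes?
-According to the script, you can build an IVR quickly by combining Dialogflow and Resemble. Dialogflow is a natural language understanding engine, and Resemble provides custom AI voices. No code is required; you only need to integrate the two.
What does Resemble do?
-Resemble is a platform for creating custom AI voices. Its neural voice engine quickly builds realistic, expressive voice models that support multiple languages and accents and suit use cases such as IVR, video game narration, and marketing overviews.
What characterizes Resemble's voice models?
-Resemble's voice models are highly expressive, can handle any accent, and work at high sample rates, so they can be used in many settings, such as IVR and video game narration.
What is Resemble Fill?
-Resemble Fill is a feature that lets you insert synthetic parts, such as variables like names, addresses, or account balances, into real recorded speech, so that audio can be generated dynamically without pre-recording or stitching clips together.
What is Dialogflow?
-Dialogflow is a natural language understanding (NLU) engine. It lets you train a model that maps the user's utterances to specific intents, and it is widely used in mobile apps, web applications, chatbots, IVR, and similar settings.
What do Intents, Entities, and Fulfillment mean in Dialogflow?
-Intents are what the user wants to do, Entities are the variables or parameters in the conversation, and Fulfillment is how Dialogflow (or the agent) responds to a query. Fulfillment can call Resemble's API to reply with speech.
How do you integrate Resemble with Dialogflow?
-In Dialogflow, enable webhook calls and enter the endpoint URL provided by Resemble, your API key, and the agent's token in the Fulfillment settings.
Resemble supports real-time APIs and streaming. What is the benefit?
-With real-time APIs and streaming, the time to first sound is always around 300 milliseconds regardless of input length, which is very useful for conversational use cases that need fast replies.
How do you create and use a pre-built agent in Dialogflow?
-Dialogflow offers pre-built agents, such as the banking agent, which comes preloaded with intents like checking an account balance or opening a new account. You can use it as is or customize it to your needs.
How do you test the IVR through Dialogflow Phone Gateway?
-Dialogflow Phone Gateway quickly assigns a phone number that you can call to test the IVR. The script gives an example phone number that can be dialed to try the IVR with the Resemble voice.
What is the difference between the IVR and IVA mentioned in the script?
-IVR stands for interactive voice response and IVA for interactive voice assistant. The two terms are used interchangeably, but an IVA is usually seen as an enhancement over IVR that allows more complex transactional conversations with an intelligent system.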
Outlines
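😀 Building an IVR and an introduction to Resemble
This part explains how to build an interactive voice response (IVR) system quickly and stresses how convenient the two tools, Dialogflow and Resemble, are. Resemble is a platform for creating custom AI voices. Its neural voice engine quickly builds realistic, multilingual voice models from the audio data you provide. The models are expressive, can handle a variety of accents, and work at high sample rates, so they suit anything from IVR to video game narration. Resemble also offers Resemble Fill, which inserts synthetic dynamic elements, such as names, addresses, and account balances, into real recorded speech for a more natural conversation.
😉 Dialogflow and the Resemble integration
This part covers the basics of Dialogflow and how it integrates with Resemble. Dialogflow is a natural language understanding (NLU) engine that recognizes what the user means and maps it to a predefined intent. Once trained, the model can interpret the user's phrases or prompts and match them to a specific intent. The part shows how to create intents and entities in Dialogflow and how to set up a webhook to respond to user queries. Through the API endpoint that Resemble provides, Resemble's synthetic voice can be connected to Dialogflow to produce automatic replies.
🎙️ Setting up the dialog flow and the Resemble integration in detail
This part goes deeper into setting up the dialog flow in Dialogflow and integrating it with Resemble. It first explains how to create intents and use training phrases to recognize what the user is asking for. It then introduces entities, the variables or parameters in a user query, and shows how a webhook produces the automatic replies. It highlights how Resemble's API endpoint plugs into the Dialogflow webhook and how to create and configure an agent on the Resemble platform so that replies use a specific voice.
📞 Live IVR demo and Q&A
This part shows how to combine a Dialogflow agent with a Resemble synthetic voice to create a real-time IVR. Dialogflow Phone Gateway connects the Dialogflow agent to the phone system to provide an automated phone service. The demo is a phone call to the IVR that covers opening an account, asking when a payment is due, and transferring money. It also shows how the IVR understands and responds to the caller's requests and how it handles small talk. Finally, viewers are encouraged to ask questions in the chat or Q&A, or to contact the Resemble team by email or through the website.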
Keywords
💡IVR
💡Dialogflow
💡Resemble
💡Voice model
💡Intent
💡Entity
💡Fulfillment
💡API
💡Real-time APIs
💡Multilingual support
💡Synthetic speech
Highlights
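Explains how to build an IVR system in 30 minutes; the actual process may take less.
No coding is involved; the talk covers what Dialogflow does and its key terms.
Resemble creates custom AI voices, using a neural voice engine to build realistic voice models quickly.
Resemble's voice models are highly expressive, can handle any accent, and support high sample rates.
Resemble supports multiple languages and can translate between them.
Resemble Fill inserts synthetic elements, such as variables like names, addresses, or account balances, into real recorded speech.
Plays several voice samples generated by Resemble, including an IVR voice, a digital character, and speech in different languages.
Dialogflow is a natural language understanding engine used to train a model that recognizes the user's phrases or prompts and maps them to intents.
Dialogflow is widely used in mobile apps, web applications, chatbots, IVR, and similar settings.
Introduces Dialogflow's key components: intents, entities, and fulfillment.
Shows how to use Dialogflow to create a banking agent and set up its intents and entities.
Shows how to integrate Resemble with Dialogflow, using an API endpoint and an API key for speech synthesis.
Resemble provides real-time APIs and streaming, which significantly reduce response time.
Dialogflow Phone Gateway makes it easy to connect a Dialogflow agent to the phone system.
Demonstrates a live phone call to a Dialogflow agent that speaks with a Resemble voice.
Shows the IVR flow set up in Dialogflow, including opening an account, asking for a due date, and transferring money.
Gives the contact address [email protected] so that users can reach the Resemble team with follow-up questions.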
Transcripts
uh today i'm really excited to talk
about um how to build an ivr in 30
minutes and
you'll see that the slide title here is
slightly different than the title of the
webinar itself
um because we're going to do it in less
than 30 minutes
most likely here there will be no code
involved in the entire process and we'll
we'll even go through what dialogflow is
and does and how that works and some uh
an overview of uh some keywords that
dialogflow has
so with that we'll just jump in to
resemble first so
if you're
unaware resemble creates
custom ai voices we have a neural voice
engine which means that you give us some
sort of audio data it could be your
voice it could be someone else's voice
that you have permission
and we really quickly build realistic uh
voice models
across various languages
that are extremely versatile
the interesting thing about our voice
models are that they are extremely
expressive
um they can handle any accent
um and they also work at really high
sample rates
um so if you're doing anything from ivr
to
narration for a video game
uh product overview marketing overview
it all works because of the flexibility
the engine provides
so just to
illustrate what that might sound like is
i have a few voices here that are
completely generated so this one is
is ivr
please hold while i connect you with an
agent
this one's a digital character
i have no desire to be your friend on
this quest
and these are all both both of them just
the input is just text and the output is
this audio you'll see like how they how
different they sound
um
hmm i'm not seeing jerry smith on a
device can i get you someone else from
their department so even things like hmm
and other non-english or
just sounds that you're making
will follow through
it works across different languages but
one of the interesting things is that we
can translate between different
languages so you'll see here hola
hello there this is a test
so she's able to switch between spanish
and english
or if you're narrating something longer
this series will take you to the last
wildernesses and show you the planet and
its wildlife as you have never seen them
before
and yeah something narration there
we also have something called resemble
fill which is uh very interesting to us
that a lot of our customers the general
idea is you don't want to transition
from
a complete voice over to complete text
to speech
sometimes all you really want to do is
drop in synthetic bits
into realistic speech so a lot of the
ivr uh and iva components kind of follow
this kind of pattern where we still have
an 80 or 70 percent static conversation that's
occurring uh but it's sprinkled in with
dynamic elements so you have variables
like names or addresses account balances
um credit card information four digits
etc
and you just kind of want to generate
those on the fly and you don't have
those pre-recorded and you don't want to
stitch them together either so if you
have an original sentence that sounds
like this
what is your current employment status
that's a real person that spoke exactly
like that and all we want to do is
replace the word employment with the
word marital
what is your current marital status
or if you want to change a couple of
words say what was your last so changing
the tense and then changing the word
last
what was your last employment status
so you'll notice that all three of them
um or two of them here that we've
replaced that we've synthetically
generated it sounds just like the
original and we're able to sprinkle in
some synthetic bits in there um
kind of seamlessly to make it seem like
uh to edit the speech and create some
sort of uh dynamicness uh with variables
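To make the "mostly static, sprinkled with variables" pattern concrete, here is a minimal Python sketch that splits a prompt into static parts (which could come from a real recording) and dynamic parts (which would be synthesized on the fly). It only illustrates the idea; it is not Resemble Fill's API, and the function names, the template, and the values are hypothetical.

```python
from string import Formatter


def split_segments(template):
    """Split a prompt template into ("static", text) and ("dynamic", name) parts."""
    segments = []
    for literal, field, _spec, _conversion in Formatter().parse(template):
        if literal:
            segments.append(("static", literal))
        if field:
            segments.append(("dynamic", field))
    return segments


def synthesis_plan(template, values):
    """Mark static parts as recorded and fill dynamic parts for synthesis."""
    plan = []
    for kind, text in split_segments(template):
        if kind == "static":
            plan.append(("recorded", text))
        else:
            # Only this short piece would need to be generated at call time.
            plan.append(("synthesized", str(values[text])))
    return plan


if __name__ == "__main__":
    template = "Hello {name}, your balance is {balance}."
    for source, text in synthesis_plan(template, {"name": "Sam", "balance": "42 dollars"}):
        print(f"{source:12} {text!r}")
```

The point of the split is that only the short dynamic pieces change from call to call, which matches the speaker's description of a largely static conversation with a few variables.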
now the other interesting thing that i
just showcased before um dubbing between
different languages is also very
interesting especially in cases where
you are a company that serves in
multiple locales or regions
in some cases you might have like
restaurants for example
that might be serving in french but
the the name of the item is in english
it's fairly common
so we're able to do that really easily
as well so we might have a voice that we
generate in french
we could take that voice that only spoke
french um you know this she never spoke
english in this data set or any other
language but we can get it to speak a
different language here and computer
hackney
kickboxing
a computer once beat me at chess but it
was no match for me at kickboxing
and this is this is really easy to do in
within the application you kind of just
write in the native language um
in other languages and you kind of
highlight these words click on the
language tag here and on the right side
it'll show you what languages that voice
speaks and you can kind of toggle
between them so overall there's a bunch
of things that
resemble does out of the box we have
real-time apis
we've also introduced streaming
which is basically regardless of the
input length that you're sending in the
time to first sound is always going to
be around 300 milliseconds
which is extremely exciting for
conversational cases because now you can
reply with a chapter of harry potter
within 300 milliseconds um
and that's that's really cool
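A toy Python model can show why streaming keeps the time to first sound flat. It counts how many input characters must be processed before playback can start, with and without streaming. This is a sketch under assumptions: the chunk size and function names are made up, text slices stand in for audio chunks, and nothing here calls Resemble's actual API or reproduces its 300 millisecond figure.

```python
from typing import Iterator


def synthesize_streaming(text: str, chunk_chars: int = 20) -> Iterator[str]:
    """Lazily yield fixed-size pieces of the input, standing in for audio chunks."""
    for start in range(0, len(text), chunk_chars):
        yield text[start:start + chunk_chars]


def chars_before_first_audio(text: str, chunk_chars: int, streaming: bool) -> int:
    """Count the characters that must be processed before playback can begin."""
    if not streaming:
        # Without streaming, the whole input is processed before any audio returns.
        return len(text)
    # With streaming, only the first chunk has to be ready.
    first_chunk = next(synthesize_streaming(text, chunk_chars), "")
    return len(first_chunk)


if __name__ == "__main__":
    chapter = "word " * 2000  # a long input, 10,000 characters
    print(chars_before_first_audio(chapter, 20, streaming=False))
    print(chars_before_first_audio(chapter, 20, streaming=True))
```

In this model the non-streaming cost grows with the input length while the streaming cost is capped at one chunk, which is the property the speaker describes.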
so i'll jump into ivr and iba really
quickly here and dialogflow so quick
introduction
ivr
is
interactive voice response um there's
another keyword called iva which is
interactive voice assistant
and they're kind of used interchangeably
but iva is really like an enhancement over
ivr so typically it's an automated
system that allows
transactional conversations to occur
with some sort of intelligent system
so you pick up the phone call your
telco
hopefully sometimes you have an okay
experience sometimes it's not that great
uh but that entire system in that
conversation is is ivr or iva um
a lot of stuff happening in this space
there's a lot of different components
one of them that's widely used is called
dialogflow and dialogflow is just an nlu
engine or a natural language
understanding engine so dialogflow
basically allows you to train this model
of sorts that is able to
take in some sort of phrases or prompts
that your user might say and map them to
some sort of intent so if you go and
walk up to a restaurant and say i want
to order a hamburger the intent there is
to order some item
and the item there is the hamburger so
dialogflow basically tries to understand
that uh that sentence and try to figure
out what what the intent was or is
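The mapping from a phrase to an intent plus an item can be illustrated with a deliberately simple Python sketch. It scores an utterance against each training phrase by word overlap and picks the best intent. This is a stand-in for illustration only: Dialogflow uses a trained model, not this overlap score, and the intent names, phrases, and item list are hypothetical.

```python
import re

# Hypothetical training phrases per intent.
TRAINING_PHRASES = {
    "order.item": [
        "i want to order a hamburger",
        "can i get a pizza",
        "i'd like to order fries",
    ],
    "check.balance": [
        "what is my balance",
        "check how much money i have",
        "how much is in my savings account",
    ],
}

# Hypothetical values for the "item" entity.
ITEMS = {"hamburger", "pizza", "fries"}


def tokens(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def match_intent(utterance):
    """Return the intent whose training phrase shares the most words (Jaccard score)."""
    words = tokens(utterance)
    best_intent, best_score = None, 0.0
    for intent, phrases in TRAINING_PHRASES.items():
        for phrase in phrases:
            phrase_words = tokens(phrase)
            score = len(words & phrase_words) / len(words | phrase_words)
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent


def extract_item(utterance):
    """Return the first known item mentioned in the utterance, or None."""
    found = tokens(utterance) & ITEMS
    return next(iter(found), None)


if __name__ == "__main__":
    said = "I want to order a pizza"
    print(match_intent(said), extract_item(said))
```

Even in this toy version, adding more training phrases gives the matcher more chances to find a close phrase, which mirrors the speaker's point that more phrases improve recognition.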
this is used in a variety of places
mobile apps web applications chat bots
uh ivr etc anywhere with this
conversation you kind of need one of
these nlu platforms like dialogflow
uh obviously towards the end of
dialogflow you always have something
that replies back and hopefully that's
where you've understood that's where
resemble comes in so in 30 seconds we're
going to get a quick intro to dialogflow
here um when you log into dialogflow it
can be quite overwhelming but i'm going
to try to just get you to understand the
key components um and there's only
really three things you need to
understand for this tutorial intents
entities and fulfillment
so if you jump into intents you have
this ability to create training phrases
label some sort of parameters and then
have some sort of responses
so again in this in this particular case
it's checking some sort of balance um so
you load it in with training phrases and
it's able to figure out what other
phrases may sound like that
and try to map it to this intent so the
more phrases you give it the better it
gets
entities are basically
um
uh variables or parameters so if you
have accounts then you have well saving
account checking account credit card
there's different kinds of accounts but
you basically create this entity of
sorts
uh and then you have fulfillments so
fulfillments are basically uh how does
dialogflow
or your agent respond
to whatever query is coming in
and this is where resemble's
magic comes in we basically have an
endpoint that we provide you which is
this this url right here it's the same
for everybody you hook in your api key
you paste in your agent's token and
you're good to go and we'll explain
exactly how to do this in a couple of
minutes or maybe just 30 seconds
so
let's do that now
and i'll jump into
uh how this all works so we'll first go
into dialogflow and i'll quickly just
demonstrate exactly what i showed you in
uh in the real setting so i've created
an agent here called banking you can
have many dialogflow agents um we're
just going to deal with banking one for
now um you have intents
uh entities fulfillments et cetera uh
and we just use a pre-built agent here
so dialogflow has a bunch of these
pre-built agents that you can kind of
get started with but we just took the
banking one here as an example
so it pre-loaded with a bunch of
different intents so you can check your
account balance you can open a new
account
you can
check the the due date
uh transfer money et cetera so let's go
into checking a balance here and you'll
see very similar to what we talked about
here
you have training phrases um you have
words that are highlighted here that
indicate uh what kind of parameter
is being asked for here so savings maps
to a particular type of account checking
credit card they all map to a particular
type of account here um now if the user
says something like check how much money
i have well they'll basically go ahead
and say well i'm missing this parameter
here called accounts and what do i fill
this in with
it'll ask for these prompts whether it's
checking or savings
et cetera down here you have responses
so you could have multiple responses
here
multiple variants that you can respond
with but in this case we just have
here's your related balance
um you can add more responses and the
most important thing in this tutorial is
fulfillment so
there's two options here one to enable
webhook calls for this particular intent
and the other one to enable webhook
calls for slot filling so this intent
just meaning like when this is said what
is the reply with um for slot filling
it's basically when it's asking for
what account do you want bound to which
account
it should also fulfill that through
resemble as well
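The tutorial itself needs no code because Resemble hosts the endpoint, but it can help to see what a fulfillment webhook receives and returns. The Python sketch below handles one request in the Dialogflow ES webhook JSON format (`queryResult` in, `fulfillmentText` out). It is not Resemble's implementation and returns text only. The intent name `account.balance.check`, the balances, and the fallback prompt are assumptions, as is the expectation that `fulfillmentText` carries the agent's prompt during slot filling.

```python
import json

# Made-up demo data; a real agent would query a banking backend.
BALANCES = {"savings": "1250.00", "checking": "310.42"}


def handle_webhook(body: str) -> str:
    """Turn one Dialogflow ES webhook request (JSON text) into a JSON reply."""
    result = json.loads(body).get("queryResult", {})
    intent = result.get("intent", {}).get("displayName", "")
    account = result.get("parameters", {}).get("account")
    default_text = result.get("fulfillmentText", "")

    if not result.get("allRequiredParamsPresent", False):
        # Slot filling: a required parameter is still missing, so pass the
        # agent's own prompt through (assumed to arrive in fulfillmentText).
        reply = default_text or "Which account do you mean?"
    elif (intent == "account.balance.check"  # hypothetical intent name
          and isinstance(account, str) and account in BALANCES):
        reply = f"Your {account} balance is {BALANCES[account]} dollars."
    else:
        # Any other intent: keep the response configured in the agent.
        reply = default_text
    return json.dumps({"fulfillmentText": reply})


if __name__ == "__main__":
    request = {
        "queryResult": {
            "intent": {"displayName": "account.balance.check"},
            "parameters": {"account": "savings"},
            "allRequiredParamsPresent": True,
        }
    }
    print(handle_webhook(json.dumps(request)))
```

The first branch corresponds to the "enable webhook calls for slot filling" toggle and the other two to the per-intent toggle, so both kinds of reply pass through the same webhook.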
so
you can go into
a few others and they all look about the
same so here you have transferring money
so sending two bucks
to savings from checking transfer 100
or just transfer money and you can see
there's many more uh parameters here
that you can fill
account from account to and an amount
and each one of them
will have prompts if it's missing it'll
ask for um
the the prompt or the text response here
is basically just you're transferring
something from something to something is
that right and then again we have the
fulfillment set up here as as we expect
awesome
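The prompting behaviour for the transfer intent can be sketched in a few lines of Python: ask for the first missing parameter, and confirm once all three are known. This is a minimal illustration of slot filling, not Dialogflow's code. The parameter names are hypothetical and the prompt wording is borrowed from the phone call later in the video.

```python
# Required parameters in the order they are asked for, each with its prompt.
REQUIRED = [
    ("account_from", "Transfer from which account?"),
    ("account_to", "To which account?"),
    ("amount", "And how much do you want to transfer?"),
]

CONFIRMATION = (
    "All right, so you're transferring {amount} from your {account_from} "
    "account to a {account_to} account. Is that right?"
)


def next_turn(params):
    """Return the prompt for the first missing parameter, or the confirmation."""
    for name, prompt in REQUIRED:
        if not params.get(name):
            return prompt
    return CONFIRMATION.format(**params)


if __name__ == "__main__":
    collected = {}
    answers = [("account_from", "checking"), ("account_to", "savings"), ("amount", "1000 USD")]
    for name, answer in answers:
        print(next_turn(collected))
        collected[name] = answer
    print(next_turn(collected))
```

A caller who says "send two bucks to savings from checking" would fill all three slots in one go and skip straight to the confirmation, while "transfer money" would trigger every prompt in turn.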
so
i'll jump into entities really quickly i
mean i think we have a pretty good grasp
of this you kind of saw
uh saw this earlier in the presentation
um but we have
a transfer type whether it's credit
deposit eft uh you create as many
entities as you want
um and we'll hop back into fulfillment
here so again you have this endpoint you
have this agent id and this
authorization so the question is well
where do you find this authorization and
where do you find this agent id so we
can really quickly jump into resemble so
if you go into resemble's dashboard on
the top right here
you will see that there is api under
your name there's api when you click on
that it lets you actually see your api token
up here so you basically just want to
take your api token and put it inside a
dialogflow
right there
and then you have agents so the way that
our integration is set up is it allows
you to create many dialogflow
integrations because we understand
different voices might want to reply to
different agents here
um or different dialogflow agents that
is so essentially we have one that's set
up already um it's it maps to a voice
called vienna
it has a particular name that we gave it
uh and it has that agent uh id that we
can copy over to dialogflow so it knows
how to route uh when it hits our api
you can create a new one really
easily so say here we'll do like uh an
agent name call it webinar demo um we'll
keep the project uh to banking um and
then out of all the voices that you
build on our platform you can pick any
to fulfill that particular agent here so
in this case we might pick someone like
tarkos
and create fulfillment and there you
have it they created an agent id so if
you just copy this over to dialogflow
you would just have tarkos responding
uh instead of deanna or
instead of any other voice that dialogflow
has by default
so
let's get to the fun part here
um
we'll go back into dialogflow and you
wonder we want to see how this all
comes into action how it all kind of
fits together and
um how do we actually get that agent to
be real so dialogflow has built-in
agents or integrations um so they make
it really easy for you to hook into a
telephony system so
uh doesn't matter if you're using avaya
signalwire voximplant twilio audiocodes
and there's a few more underneath
um there's also dialogflow phone gateway
and basically these are like one click
setups so if you set up with avaya
it'll basically use your dialogflow
agent and use a fulfillment that your
agent has on dialogflow
for this case we'll just use the phone
gateway here so i've already had this
set up
all you have to all the setup process is
fairly simple there's really nothing to
it it just assigns you a phone number uh
so in this case at this phone number
voice 727-233-5979 actually dial this
number so i'll copy it from here i have
it pasted inside of here so you can see
that's the phone number
and when we make this phone call
hello thanks for choosing acme bank
how do i open account
to open your account you should come to
one of our banks in person don't forget
to bring your id when is the payment due
sorry can you tell me again
when is the payment due
the due date is next friday
i want to transfer some money
sure
transfer from which account
checking
to which account saving
and how much do you want to transfer
one thousand dollars
all right so you're transferring 1000
usd from your checking account and
checking account to a savings account is
that right
that's right so you can see here it goes
through an entire flow
um exactly what we have set up it's able
to understand and respond in that
synthetic voice here are your deposit
transactions
and it responds back pretty accurately
there with whatever dialogflow is
routing
um it also does small talk if you notice
sorry say that again
like sorry say that again
um so it's able to do that as well
and there you have it um
that is in
less than 30 minutes how you can
take an agent on dialogflow
hook it up to resemble
um a synthetic voice on resemble and get
uh kind of a real-time conversation
going without writing any code so if you
have any questions feel free to ask them
right now in the chat or q a
if you have questions later on you can
always reach us at team@resemble.ai
or you can always go on our website and
there's this annoying chat widget that
pops up on your right hand side
if you ask questions there as well