How to Create Voiceover Using Google Cloud Text to Speech

LearnWoo
28 May 2022, 03:27

TLDR

In this video, Nian Roby from LearnWoo demonstrates two methods for creating text-to-speech voiceovers with Google Cloud. The first method requires a Google Cloud account, enabling the Cloud Text-to-Speech API, and obtaining an API key. The Wavenet extension for Chrome then converts text into speech: the user selects text and picks the download option from the extension's context menu. The second method is simpler: install an audio capture extension, then use the demo on Google Cloud's website to enter text, select a language and voice, and record the spoken output. Both methods offer a quick way to generate professional-sounding voiceovers.

Takeaways

  • 📈 Create a Google Cloud account to access the Cloud Text-to-Speech API.
  • 🔍 Search for 'Text-to-Speech' on the Google Cloud homepage to find and enable the API.
  • 💳 Provide credit card details for account verification; the card is not charged automatically while the complimentary credit is in use.
  • 🚫 Restrict your API key to prevent unauthorized access to your account resources.
  • 🔗 Copy your API key for later use in the process.
  • 🌐 Install the Wavenet extension from the browser's webstore and paste your API key into it.
  • 📑 Use the Wavenet extension to convert text into speech and download it as an .mp3 file.
  • 🔄 Repeat the process section by section until the whole article is converted.
  • 🎧 For an alternative method, install an audio capture extension from the browser's webstore.
  • 🎤 Use the Google Cloud website to input text, choose language and voice, and adjust speed and pitch.
  • 📝 Record the speech using the audio capture extension and save the audio file.
  • 📢 Subscribe to the channel for more helpful tutorials and support.

Q & A

  • What are the two methods demonstrated in the video for creating text to speech voiceover using Google Cloud?

    -The video demonstrates two methods: Method one involves using the Wavenet Chrome extension after setting up a Google Cloud account and obtaining an API key. Method two uses the Audio Capture extension to record speech generated from Google Cloud's Text-to-Speech service without needing to sign in.

  • How many voices and languages are available in Google Cloud's Text-to-Speech service?

    -Google Cloud's Text-to-Speech service offers over 200 voices in more than 40 languages.

  • What is the complimentary credit amount provided by Google Cloud for new users?

    -Google Cloud provides a complimentary credit of $300 for new users.

  • What is the cost for processing more than one million characters with WaveNet voices?

    -The first one million characters are free. A user who needs to process more than that has to pay a fee of $16.

  • How can one restrict the use of their API key in Google Cloud?

    -To restrict the use of the API key, users can set restrictions when creating the key in the Google Cloud console. This is important as the API key acts like a password and should not be shared with others to prevent unauthorized use of the account resources.

  • What is the purpose of installing the Wavenet extension in the first method?

    -The Wavenet extension is used to facilitate the text-to-speech process directly from the browser, allowing users to convert selected text into speech using their Google Cloud API key.

  • How does one obtain their API key in Google Cloud?

    -After enabling the Cloud Text-to-Speech API in the Google Cloud console, users navigate to the 'Credentials' section, select 'Create credentials', and choose 'API key' to generate and retrieve their unique API key.

  • What is the process to download an audio file using the Wavenet extension?

    -After pasting the article text into the character counter, users select a portion of it, right-click the selection, and choose the 'Download as MP3' option from the Wavenet extension's context menu. This process is repeated until the end of the article.

  • What does the Audio Capture extension allow users to do in the second method?

    -The Audio Capture extension allows users to record the speech generated by Google Cloud's Text-to-Speech service. Users start recording, click the 'speak' button so the page reads their text aloud, and then save the audio file once the recording is complete.

  • How can users customize the voice and language in Google Cloud's Text-to-Speech service?

    -Users can customize the voice and language by selecting their preferred options from the available dropdown menus in the Text-to-Speech interface. They can also adjust the speed and pitch of the speech to their liking.

  • What is the prerequisite for using the second method demonstrated in the video?

    -The second method requires the Audio Capture extension to be installed in the browser and access to the Google Cloud website, where users can input text and select voice settings without signing in.

  • How can users ensure their API key remains secure?

    -Users should restrict the API key to specific applications or IP addresses in the Google Cloud console and save the changes so the restrictions take effect. They should also never share the copied key with anyone.
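
The pricing figures quoted in the video (one million free WaveNet characters, then a $16 fee) can be turned into a rough cost estimate. The sketch below assumes that the $16 applies per additional million characters, billed pro rata, and that the free allowance resets monthly. Neither detail is stated in the video, so check the current pricing page before relying on the numbers. The function and constant names are illustrative.

```python
# Figures taken from the video; the per-million, pro-rata billing model
# and the monthly reset are assumptions.
FREE_CHARS_PER_MONTH = 1_000_000
PRICE_PER_MILLION_USD = 16.0


def estimate_monthly_cost(total_chars):
    """Estimate the WaveNet bill in USD for total_chars processed in one month."""
    if total_chars < 0:
        raise ValueError("total_chars must not be negative")
    # Only characters beyond the free allowance are billable.
    billable = max(0, total_chars - FREE_CHARS_PER_MONTH)
    return billable / 1_000_000 * PRICE_PER_MILLION_USD


if __name__ == "__main__":
    for chars in (500_000, 1_500_000, 3_000_000):
        print(chars, estimate_monthly_cost(chars))
```

Under these assumptions, a typical article of a few thousand characters stays well inside the free allowance.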

Outlines

00:00

🎓 Introduction to Google Cloud Text-to-Speech

The video begins with an introduction by Nian Roby from LearnWoo, who outlines two methods for creating text-to-speech voiceovers using Google Cloud. The tool offers over 200 voices in more than 40 languages. The presenter explains that a Google Cloud account is needed and guides viewers through the sign-up process: filling out personal details, verifying a phone number, and providing credit card information for account verification. It's noted that new accounts receive a complimentary $300 credit and that the card is not charged automatically. After signing up, viewers are instructed to search for and enable the Cloud Text-to-Speech API, which allows processing up to one million characters for free. The video then explains how to create and restrict an API key for security purposes and how to install the Wavenet extension in the browser for the first method.

🔑 Using the Wavenet Extension for Text-to-Speech

The first method uses the Wavenet extension, installed from the browser's web store. After installing the extension, the user pastes their API key into it. The video then shows how to paste the article text into a character counter, select a portion, right-click it, and use the Wavenet extension to download the speech as an MP3 file. This process is repeated until the entire article is converted into voiceover. The presenter stresses that the API key must not be shared, because it functions like a password and grants access to the user's account resources.
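
For readers who prefer a script to a browser extension, a similar request can be sent to the API directly. The sketch below is not taken from the video and does not claim to show what the extension does internally. It posts to the public `text:synthesize` REST endpoint with an API key and decodes the base64 `audioContent` field of the reply. The voice name `en-US-Wavenet-D` and all helper names are illustrative choices.

```python
import base64
import json
import urllib.request

# Public REST endpoint of Cloud Text-to-Speech (v1).
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_request_body(text, language_code="en-US", voice_name="en-US-Wavenet-D"):
    """Return the JSON body for a text:synthesize call that asks for MP3 output."""
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code, "name": voice_name},
        "audioConfig": {"audioEncoding": "MP3"},
    }


def decode_audio(response_json):
    """Decode the base64 audio carried in the response's 'audioContent' field."""
    return base64.b64decode(response_json["audioContent"])


def synthesize_to_file(text, api_key, out_path):
    """Send one request and write the MP3 bytes to out_path (needs network access)."""
    request = urllib.request.Request(
        f"{SYNTHESIZE_URL}?key={api_key}",
        data=json.dumps(build_request_body(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        audio = decode_audio(json.load(response))
    with open(out_path, "wb") as handle:
        handle.write(audio)
```

A call such as `synthesize_to_file("Hello there", api_key, "hello.mp3")` would produce one audio file per request, which mirrors the section-by-section workflow described above.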

🎙️ Method Two: Audio Capture Extension for Text-to-Speech

The second method presented is described as being very simple. It involves installing an audio capture extension from the browser's web store. With the extension installed, the user navigates to the Google Cloud website, where they can use the 'Put text to speech into action' feature without signing in. The user can input or paste text into a provided box, select their preferred language and voice, and adjust the speed and pitch of the voiceover. To record the speech, the user activates the audio capture extension and clicks the 'speak' button. Once the speech is recorded, the user can finalize the recording through the audio extension and save the audio file. The video concludes with a prompt for viewers to subscribe for more content and to leave comments with any questions or doubts for further assistance.

Keywords

Google Cloud

Google Cloud is a suite of cloud computing services offered by Google. It includes a range of hosted services for computing, data storage, database, machine learning, and more. In the video, Google Cloud is used to access the Text-to-Speech API, which is the main focus of the tutorial.

Text-to-Speech (TTS)

Text-to-Speech is a technology that converts written text into audible speech. The video demonstrates how to use Google Cloud's Text-to-Speech API to create voiceovers. It is a key concept as it is the core functionality being explored.

Voiceover

A voiceover is recorded narration in which the speaker is heard but not seen, typically used in films, radio broadcasting, and presentations. In the context of the video, creating a voiceover involves using the TTS technology to generate spoken audio from a text script.

API Key

An API key is a unique string that identifies and authenticates the calling user or program to an API. In the video, the presenter instructs viewers to create and restrict an API key for the Google Cloud Text-to-Speech service to ensure secure access.
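
Because the key behaves like a password, a script that uses it should avoid hard-coding it. The sketch below shows one common pattern, reading the key from an environment variable and masking it before it appears in any log output. The variable name `GOOGLE_TTS_API_KEY` and both helper names are hypothetical, not something Google Cloud or the video prescribes.

```python
import os


def load_api_key(env_var="GOOGLE_TTS_API_KEY", environ=None):
    """Read the API key from the environment instead of the source code."""
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def mask_key(key, visible=4):
    """Return a log-safe form of the key that shows only the last few characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

This complements, rather than replaces, the restrictions set in the Google Cloud console.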

WaveNet

WaveNet is a neural network model for generating speech that powers a family of voices in Google's Text-to-Speech API, known for natural-sounding output. The video uses a Chrome extension named Wavenet to run the TTS process from the browser.

Complimentary Credit

Complimentary credit is a promotional offer given by a service provider, in this case, Google Cloud, to allow new users to try out their services for free. The video mentions a complimentary credit of $300 for new users to explore Google Cloud services.

Character Counter

A character counter is a tool used to count the number of characters in a given text. In the video, the character counter is used to ensure that the text to be converted into speech does not exceed the free usage limits of the Google Cloud Text-to-Speech API.
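
Counting characters also matters when a long article has to be converted in several passes. The sketch below splits text at sentence boundaries into chunks that stay under a chosen size. The default of 4500 characters is an assumption chosen to leave headroom under the per-request input limit, which to my knowledge is 5000 bytes. Characters and bytes differ for non-ASCII text, so treat the default as a starting point.

```python
import re


def split_into_chunks(text, max_chars=4500):
    """Split text into chunks of at most max_chars, preferring sentence breaks."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit, so close the current chunk first.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be converted separately and the resulting audio files joined in an editor.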

Webstore

A webstore, specifically referring to the Chrome Web Store in this context, is an online marketplace for web apps and extensions for the Google Chrome browser. The video guides viewers to install specific extensions from the Chrome Web Store to assist with the TTS process.

Audio Capture Extension

An audio capture extension is a browser extension that allows users to record audio directly from the browser. In the video, it is used to record the voiceover generated by the Google Cloud Text-to-Speech service.

Language Selection

Language selection refers to the process of choosing the language in which the text will be converted to speech. The video demonstrates how to select different languages for the TTS conversion, offering versatility in voiceover creation.

Voice Selection

Voice selection is the process of choosing the specific voice to be used for the text-to-speech conversion. The video highlights the ability to select from over 200 voices, emphasizing the customization options available in Google Cloud's TTS service.
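
The same language and voice choice can be made programmatically. The API's voice listing returns entries with a `name` and a list of `languageCodes`, and the sketch below filters such a listing. The sample data is hand-written in that shape for illustration and is not real API output. The function name is hypothetical.

```python
def pick_voices(voices_response, language_code, name_contains="Wavenet"):
    """Return sorted names of voices that support language_code and match name_contains."""
    matches = []
    for voice in voices_response.get("voices", []):
        supports_language = language_code in voice.get("languageCodes", [])
        if supports_language and name_contains in voice.get("name", ""):
            matches.append(voice["name"])
    return sorted(matches)


# Hand-written sample in the shape of a voice listing, not real API output.
SAMPLE_VOICES = {
    "voices": [
        {"name": "en-US-Wavenet-D", "languageCodes": ["en-US"], "ssmlGender": "MALE"},
        {"name": "en-US-Standard-B", "languageCodes": ["en-US"], "ssmlGender": "MALE"},
        {"name": "en-US-Wavenet-A", "languageCodes": ["en-US"], "ssmlGender": "MALE"},
        {"name": "de-DE-Wavenet-A", "languageCodes": ["de-DE"], "ssmlGender": "FEMALE"},
    ]
}

if __name__ == "__main__":
    print(pick_voices(SAMPLE_VOICES, "en-US"))
```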

Speed and Pitch

Speed and pitch refer to the rate of speech and the frequency of the voice's tone, respectively. The video shows how users can adjust the speed and pitch of the generated voiceover to suit their preferences, which is an important feature for creating professional-sounding voiceovers.
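
In an API request, these two sliders correspond to the `speakingRate` and `pitch` fields of the audio configuration. The sketch below builds that fragment and clamps both values. The ranges used (0.25 to 4.0 for rate, -20.0 to 20.0 semitones for pitch) are my reading of the API reference and should be treated as assumptions to verify. The function name is illustrative.

```python
# Assumed valid ranges; confirm them against the current API reference.
RATE_RANGE = (0.25, 4.0)
PITCH_RANGE = (-20.0, 20.0)


def build_audio_config(speaking_rate=1.0, pitch=0.0):
    """Return an MP3 audio configuration with rate and pitch clamped to the assumed ranges."""

    def clamp(value, bounds):
        low, high = bounds
        return max(low, min(high, value))

    return {
        "audioEncoding": "MP3",
        "speakingRate": clamp(speaking_rate, RATE_RANGE),
        "pitch": clamp(pitch, PITCH_RANGE),
    }
```

A rate of 1.0 and a pitch of 0.0 leave the voice at its defaults, which is a sensible baseline before adjusting either value.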

Highlights

The video demonstrates two methods to create text-to-speech voiceovers using Google Cloud.

Over 200 voices in more than 40 languages are available through the Google Cloud Text-to-Speech tool.

To get started, create a Google Cloud account and enable the Cloud Text-to-Speech API.

The free version of the API allows processing up to one million characters for WaveNet voices.

Processing beyond the free one million characters incurs a fee of $16.

Create credentials and restrict the API key to prevent unauthorized use.

Install the WaveNet Chrome extension and paste your API key to use the first method.

Copy text into the character counter and use the WaveNet extension to download the audio file.

The second method involves installing an audio capture extension.

Use the Google Cloud website to input text and select language, voice, speed, and pitch.

Record the speech using the audio capture extension and the speak function.

Save the recorded audio through the audio extension for the final output.

Both methods are quick and simple to implement.

Subscription to the channel is encouraged for further assistance and updates.

Questions and doubts can be addressed by leaving a comment for support.

The presenter, Nian Roby from LearnWoo, offers help for any issues viewers may have.

The video concludes with an invitation to the next video in the series.