How to Use Generative Audio | Runway Academy

Runway
8 May 2024 · 03:07

TLDR: Runway Academy's tutorial introduces generative audio in Runway, covering text-to-speech, custom voice models, and lip-sync videos. Users type text, select a voice, and generate an audio file quickly. The video also shows how to train a custom voice model from clean audio and how to create lip-sync videos from images or videos, noting that if the audio is longer than the video, the video reverses and loops for the rest of the audio. It closes with a tip for making that loop less noticeable in video workflows and an invitation to join the community for further exploration.

Takeaways

  • 🎙️ Generative audio in Runway includes text-to-speech, custom voice models, and lip sync videos.
  • 🔍 Access the generative audio tool from the Runway dashboard to convert text into spoken audio files.
  • 📝 Preview and select from a list of default voices, such as James, to generate audio from your written text.
  • ⏱ Generation times vary based on script length, but are generally quick.
  • 📁 Audio files are automatically saved in the 'generative audio' folder within the main assets folder.
  • 🎧 Train a custom voice model with a few minutes of clean audio, which can be imported or recorded within the tool.
  • 📌 Ensure the audio for custom voice models is as clear as possible for optimal results.
  • 🖼️ Create lip sync videos using an image or video of a person with a full face visible in the frame.
  • 🎥 Lip sync can be applied to generated, recorded, or uploaded audio, and can be combined with text-to-speech.
  • 🔁 If the audio is longer than the video, the video will reverse and loop back to the beginning for the duration of the audio.
  • 📹 For video workflows, avoid camera motion parameters and use subject motion with a motion brush to minimize the reversing effect.
  • 🗣️ Join the Runway community on Discord for more resources and to find specific answers to questions.

Q & A

  • What is the main focus of the Runway Academy video?

    -The main focus of the Runway Academy video is to demonstrate how to use generative audio, including text to speech, custom voice models, and creating lip sync videos in Runway.

  • How can I access the generative audio tool in Runway?

    -You can access the generative audio tool from your Runway dashboard by clicking on it at the top.

  • What is the first step after typing in the text for generative audio?

    -The first step after typing in the text is to preview it and choose a voice from the default voice list.

  • How long does it usually take for the audio generation to complete?

    -Audio generation times depend on the script length, but they usually complete quite quickly.

  • Where are the audio generations saved by default in Runway?

    -Audio generations are automatically saved to the generative audio folder inside your main assets folder in Runway.

  • What is required to train a custom voice model in Runway?

    -To train a custom voice model, you need a few minutes of clean audio which can be imported or recorded directly in the generative audio tool.

  • What should be ensured while recording the audio for a custom voice model?

    -The audio should be as clean as possible to ensure the best quality for the custom voice model.

  • How can I create a lip sync video in Runway?

    -To create a lip sync video, start with an image or video of a person whose full face is visible, then apply the lip sync feature with generated, recorded, or uploaded audio.

  • What happens if the audio is longer than the video in a lip sync project?

    -If the audio is longer than the video, once the video ends, it will reverse and go back to the beginning for the duration of the audio.

  • What is a pro tip for using the video workflow in Runway's generative audio tool?

    -A pro tip is to avoid using camera motion parameters and instead add subject motion with a motion brush to make the reversing effect less noticeable.

  • Where can I find more information and join the Runway community for further help and resources?

    -You can join the Runway community on Discord for more information, experimentation, and to find specific answers to your questions.

Outlines

00:00

🎙️ Introduction to Generative Audio

This paragraph introduces the topic of the video: generative audio in Runway. It covers text-to-speech, custom voice models, and lip-sync videos. The process begins with opening the generative audio tool from the dashboard, typing text, and converting it into spoken audio. Users can preview and select a voice from a list of default voices, such as James. Generation time varies with script length but is typically quick. Generated audio files are saved automatically to the generative audio folder inside the main assets folder, though users can choose to save them elsewhere.

🔍 Training a Custom Voice Model

The second paragraph explains how to train a custom voice model in Runway using a few minutes of clean audio. This audio can be imported or recorded directly within the generative audio tool. It's important that the audio is as clear as possible. After recording, the user names their voice model, and it's ready for use with text-to-speech in just a few seconds. This feature allows for a personalized audio experience tailored to the user's needs.
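A rough pre-flight check on a recording before uploading it can be sketched as below. This is a minimal illustration, not part of Runway: the function name `check_training_clip` and both thresholds are assumptions, since the video only asks for "a few minutes" of clean audio without giving exact numbers. Clipping is used here as one simple, measurable proxy for "not clean"; it does not detect background noise. The sketch handles 16-bit PCM WAV files only.

```python
import sys
import wave
from array import array

# Assumed thresholds: the tutorial only says "a few minutes of clean audio".
MIN_SECONDS = 120.0           # hypothetical lower bound for "a few minutes"
MAX_CLIPPED_FRACTION = 0.001  # hypothetical tolerance for full-scale samples


def check_training_clip(path: str) -> dict:
    """Report duration and clipping for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        rate = wav.getframerate()
        n_frames = wav.getnframes()
        raw = wav.readframes(n_frames)

    # WAV samples are little-endian signed 16-bit integers.
    samples = array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()

    duration = n_frames / rate
    # Count samples sitting at the extremes of the 16-bit range.
    clipped = sum(1 for s in samples if s >= 32767 or s <= -32768)
    clipped_fraction = clipped / len(samples) if samples else 0.0

    return {
        "duration_seconds": duration,
        "long_enough": duration >= MIN_SECONDS,
        "clipped_fraction": clipped_fraction,
        "clipping_ok": clipped_fraction <= MAX_CLIPPED_FRACTION,
    }
```

A clip that fails either check is probably worth re-recording before training, but passing both does not guarantee a good voice model.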

🎥 Creating Lip-Sync Videos

The third paragraph discusses the process of creating lip-sync videos. This requires an image or video of a person with their full face visible. Users can upload their own media or select from preset characters. The lip-sync feature can be applied to generated, recorded, or uploaded audio. The paragraph also includes a demonstration of adding text-to-speech and choosing a voice for the lip-sync effect. It closes with a tip for video workflows: avoid camera motion parameters and instead add subject motion with a motion brush, which makes the reversing effect less noticeable when the audio is longer than the video.
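The reverse-and-loop behavior described for audio that outlasts the video can be pictured as a "ping-pong" mapping from output frames to source frames. The sketch below only illustrates that idea and is not Runway's implementation: the function names are hypothetical, and it assumes the first and last frames are not repeated at each turnaround, which the video does not specify.

```python
import math


def pingpong_frame(t: int, n_frames: int) -> int:
    """Map output frame index t to a source frame index.

    The clip plays forward, then in reverse, and repeats.
    """
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    if n_frames == 1:
        return 0
    period = 2 * (n_frames - 1)  # one forward pass plus one reverse pass
    pos = t % period
    return pos if pos < n_frames else period - pos


def frames_for_audio(audio_seconds: float, fps: float, n_frames: int) -> list[int]:
    """Source frame indices needed to cover the full audio duration."""
    total = math.ceil(audio_seconds * fps)
    return [pingpong_frame(t, n_frames) for t in range(total)]


# A 4-frame clip at 4 fps stretched over 2 seconds of audio:
print(frames_for_audio(2.0, 4, 4))  # [0, 1, 2, 3, 2, 1, 0, 1]
```

The mapping also shows why camera motion makes the loop obvious: a pan or zoom visibly plays backwards on every reverse pass, whereas small subject motion looks similar in both directions.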

📚 Conclusion and Additional Resources

The final paragraph wraps up the video with a conclusion, expressing appreciation for the viewer's time and encouraging them to get started with their work. It also provides information on where to find more helpful resources, such as joining the community on Discord for further information and experimentation with Runway. Additionally, it mentions the availability of a button on the dashboard for finding specific answers to questions related to the use of Runway.

Keywords

💡Generative Audio

Generative audio refers to the use of artificial intelligence to create or manipulate sound. In the context of the video, it includes generating speech from text, creating custom voice models, and synchronizing audio with video. The tool in Runway allows users to type text and turn it into spoken audio, which can be used for various multimedia applications.

💡Text to Speech

Text to Speech (TTS) technology converts written text into spoken voice. In the video, the generative audio tool in Runway uses TTS to allow users to input any text and generate an audio file. This feature is useful for creating narrations or voiceovers for videos without needing a human voice actor.

💡Custom Voice Models

Custom voice models are personalized voice profiles created from a few minutes of clean audio. These models can be trained within the Runway generative audio tool by importing or recording audio. Once trained, these models can be used to generate speech that sounds like the person in the original recording, adding a unique touch to TTS applications.

💡Lip Sync

Lip sync involves matching the movement of a character's mouth with spoken audio. The video describes using Runway to create lip sync videos where an image or video of a person is animated to match the generated or recorded audio. This technique is crucial for making animations or videos appear more realistic and engaging.

💡Runway Dashboard

The Runway Dashboard is the central interface from which users can access various tools and features in Runway. For generative audio, users navigate to this dashboard to start creating audio files, train custom voice models, and sync audio with videos. It serves as the starting point for all the creative processes described in the video.

💡Generate Button

The Generate Button is a key feature in the Runway tool, used to initiate the creation of audio from text. Once the user selects a voice and inputs their text, clicking this button produces the spoken audio file. This action is crucial for transforming written scripts into audible content efficiently.

💡Assets Folder

The Assets Folder in Runway is a storage area where generated audio files are automatically saved. Users can find their created audio under the generative audio folder within this main directory. This organizational system helps users manage their multimedia files effectively.

💡Motion Brush

Motion Brush is a tool mentioned for enhancing the natural look of videos, especially when using the lip sync feature. By adding subject motion without camera movement, it reduces noticeable artifacts in the video, such as the reversing effect when audio outlasts video. This technique helps in creating smoother animations.

💡Gen 2

Gen-2 is Runway's generative video model. In the video, it is mentioned as the way to turn a static image into a moving video, which can then be lip-synced with generated audio. This capability is valuable for creating dynamic content from still images.

💡Discord Community

The Discord Community refers to the user community on the Discord platform where Runway users can join to find additional resources, share their work, and get answers to their questions. It is highlighted as a place for learning and experimentation with Runway's tools, fostering a collaborative environment for users.

Highlights

Runway Academy's introduction to generative audio in Runway.

Exploring text to speech, custom voice models, and lip sync videos in Runway.

Accessing the generative audio tool from the Runway dashboard.

Converting typed text into spoken audio files.

Previewing and selecting a voice from the default list.

Audio generation time depends on script length.

Automatic saving of audio generations to the assets folder.

Customizing voice models with clean audio recordings.

Training a custom voice model within Runway.

Creating lip sync videos using images or videos of a person.

Using preset characters for lip sync examples.

Adding text to speech for lip sync generation.

Generating audio with custom voice selection.

Handling audio longer than video duration in lip sync.

Pro tip for video workflows: avoid camera motion parameters.

Using motion brush for subject motion to reduce reversing effect.

Invitation to join the Runway community on Discord for more resources.

Accessing specific answers through the dashboard at any time.