Loki - Live Portrait - NEW TALKING FACES in ComfyUI !

FiveBelowFiveUK
7 Jul 2024 · 11:23

TLDR: In this video, the creator introduces a new update for Loki, a face swap application with batch modes, allowing users to generate face models and integrate them into animations. The workflow now includes Live Portrait, which syncs audio and facial expressions for a more realistic talking head effect. The video demonstrates the process, from creating a face model to animating with ComfyUI, and highlights the efficiency and control offered by the updated system, promising further exploration in upcoming videos.

Takeaways

  • 😀 The video introduces the latest edition of Loki, focusing on face swapping and animation capabilities.
  • 🔄 Loki's face swap feature includes batch modes suitable for creating animations and GIFs, with the ability to save and create face models.
  • 📸 The Trio T pose workflow allows loading of created face models into images, ensuring consistency in facial features.
  • 🎨 Hedra is used to animate the images, creating a workflow that combines face models with animation.
  • 👤 The video demonstrates creating a face model using the provided workflow, highlighting the ease of adding more inputs.
  • 📹 Live Portrait KJ is introduced as a tool that can be used out of the box, with an included Hedra video for immediate use.
  • 🔧 ComfyUI is updated to include Live Portrait, with slight changes to fix frame rate issues for better synchronization with source videos.
  • 📚 Additional models are required for Live Portrait, which are small in size and can be easily integrated into the workflow.
  • 🎭 The updated Live Portrait workflow simplifies the process of syncing audio with the animation, reducing the need for manual adjustments.
  • 🤖 The video showcases the video loader and video info nodes from the ComfyUI Video Helper Suite for frame rate and audio synchronization.
  • 🎨 Live Portrait models are used to animate a T-pose created with ControlNet, allowing for more detailed and accurate facial animations.
  • 📈 The video discusses potential improvements and adjustments, such as using dewarp stabilizers to reduce facial distortion during animation.
  • 📝 The workflow is available for download, and the video concludes with a call to action for further exploration in future deep dives.

Q & A

  • What is the main topic of the video script?

    -The main topic of the video script is the introduction and explanation of the latest edition of Loki, a face swap and animation tool, with a focus on creating and animating talking faces using a new feature called Live Portrait in ComfyUI.

  • What is the purpose of the Loki face swap tool with batch modes?

    -The Loki face swap tool with batch modes is designed for creating animations and face swapping, allowing users to save and create face models that can be loaded into the Trio T pose workflow for consistent face rendering in images.

  • How does the Live Portrait feature in ComfyUI work?

    -The Live Portrait feature in ComfyUI allows users to animate images by syncing the frame rate and audio from a source video or audio track, creating a more realistic and dynamic talking head animation.

  • What changes were made to the ComfyUI to accommodate the Live Portrait feature?

    -Slight changes were made to the ComfyUI to fix the frame rate, ensuring that the animation matches the frame rate of the source video or audio, and to allow for the syncing of audio directly within the workflow.

  • What additional models are required for the Live Portrait workflow?

    -Six additional models are required for the Live Portrait workflow; they are small in size and can be found through the description and workflow links provided in the script (see the download sketch at the end of this Q&A section).

  • How does the Live Portrait workflow handle audio synchronization?

    -The Live Portrait workflow uses the video info node to extract the source FPS and audio, which are then used to drive the frame rate and sync the audio with the animation, eliminating the need for manual syncing.

  • What is the significance of the 'Trio T pose' workflow mentioned in the script?

    -The 'Trio T pose' workflow is a method for creating a base pose for the face models, which can then be used to create 2D puppets that can be animated using a webcam or other video sources.

  • How does the Live Portrait feature handle facial distortions during animation?

    -The Live Portrait feature may cause some facial distortions, but these can be mitigated using dewarp stabilizers and other post-processing techniques to improve the final animation quality.

  • What is the role of the 'video driver' in the Live Portrait workflow?

    -The 'video driver' in the Live Portrait workflow is responsible for the heavy lifting, processing the input video or audio and using the six models to create a realistic and animated talking head.

  • How can users improve the realism of the talking head animation?

    -Users can improve the realism of the talking head animation by ensuring good lighting and the correct angle for the source video, as well as by using overdubbing techniques for specific scenes where lip sync may not be necessary.

  • Where can users find the workflow for the Live Portrait feature?

    -Users can find the workflow for the Live Portrait feature on Civitai, with a link provided in the description of the video.
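
The six extra models mentioned above are small downloads. As a minimal sketch, assuming they are hosted on Hugging Face, they could be fetched straight into ComfyUI's models folder with huggingface_hub; the repo id and filenames below are placeholders, not confirmed values, so check the links in the video description for the real sources.

```python
from huggingface_hub import hf_hub_download

# Placeholder repo id and filenames -- the actual six Live Portrait models
# are linked in the video description and workflow page.
REPO_ID = "someuser/liveportrait-models"          # hypothetical repo
FILES = [
    "appearance_feature_extractor.safetensors",   # hypothetical names
    "motion_extractor.safetensors",
    "warping_module.safetensors",
    "spade_generator.safetensors",
    "stitching_retargeting_module.safetensors",
    "landmark_model.safetensors",
]

for name in FILES:
    # local_dir drops the files where the Live Portrait nodes can find them
    # (adjust the path to your own ComfyUI install).
    path = hf_hub_download(repo_id=REPO_ID, filename=name,
                           local_dir="ComfyUI/models/liveportrait")
    print("downloaded", path)
```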

Outlines

00:00

😀 Introduction to Loki's Face Swap and Animation Workflow

The script opens with a warm welcome and dives into a discussion about the latest updates to Loki's face swap feature, which was recently introduced with batch modes. The feature allows for creating and saving face models that can be loaded into the Trio T pose workflow for consistent facial animations. The speaker mentions a previous tutorial on creating face models and highlights the inclusion of a Hedra video to facilitate the learning process. The overall workflow is largely the same, but slight modifications have been made to fix the frame rate, so the animation runs at the source video's frame rate and carries its audio, which is crucial for keeping speech in sync.
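
The frame-rate fix described here boils down to reading the source clip's FPS and reusing it (together with the source audio) when the animation is written out. As a hedged illustration of that idea outside ComfyUI, the sketch below probes a driving video with ffprobe, then encodes rendered frames at that rate and muxes the original audio back in with ffmpeg; the file paths are placeholders.

```python
import subprocess

SRC = "driving_video.mp4"      # placeholder: source talking-head clip
FRAMES = "frames/%05d.png"     # placeholder: rendered animation frames
OUT = "talking_head.mp4"

# Read the source frame rate (returned as a fraction such as "25/1").
probe = subprocess.run(
    ["ffprobe", "-v", "error", "-select_streams", "v:0",
     "-show_entries", "stream=r_frame_rate",
     "-of", "default=noprint_wrappers=1:nokey=1", SRC],
    capture_output=True, text=True, check=True)
num, den = probe.stdout.strip().split("/")
fps = float(num) / float(den)

# Encode the frames at the source FPS and copy the source audio stream,
# so the speech stays in sync without manual adjustment.
subprocess.run(
    ["ffmpeg", "-y", "-framerate", str(fps), "-i", FRAMES,
     "-i", SRC, "-map", "0:v", "-map", "1:a?",
     "-c:v", "libx264", "-pix_fmt", "yuv420p", "-shortest", OUT],
    check=True)
```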

05:00

🎭 Enhancing Live Portraits with Improved Synchronization and Control

This paragraph delves into the advancements made to the live portrait feature, which now includes a removable head and a tracking stick for better synchronization between the face and body models. The speaker demonstrates the process of using the live portrait models to animate a talking head, emphasizing the improved speed and controllability of the system. They note the importance of good lighting and camera angle for accurate facial capture and discuss the potential for character expression and emotion in the animations. The speaker also touches on the possibility of using text-to-speech for video creation and the integration of audio and video synchronization for a seamless animation experience.
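
Conceptually, the live portrait step walks the driving video frame by frame and re-renders the still portrait with each frame's expression and head pose, at the driving clip's frame rate. The sketch below shows that loop in plain Python with OpenCV; transfer_motion is a hypothetical stand-in for the Live Portrait model, not its real API.

```python
import cv2

def transfer_motion(portrait, driving_frame):
    # Hypothetical stand-in: in the real workflow this step extracts
    # expression/pose from the driving frame and re-renders the portrait.
    return portrait

portrait = cv2.imread("portrait.png")            # still image to animate
driver = cv2.VideoCapture("driving_video.mp4")   # webcam or talking-head clip

fps = driver.get(cv2.CAP_PROP_FPS)               # reuse the source frame rate
height, width = portrait.shape[:2]
writer = cv2.VideoWriter("animated.mp4",
                         cv2.VideoWriter_fourcc(*"mp4v"), fps, (width, height))

while True:
    ok, frame = driver.read()
    if not ok:
        break                                    # end of the driving video
    writer.write(transfer_motion(portrait, frame))

driver.release()
writer.release()
```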

10:02

📝 Conclusion and Future Directions for the Workflow

In the concluding paragraph, the speaker wraps up the discussion by summarizing the workflow and expressing gratitude to the developers of the nodes that make the process possible. They encourage viewers to check out the workflow on Civitai and provide a link in the description for easy access. The speaker also hints at future improvements and the potential for using one's own speech in animations, which will be explored in more detail in an upcoming 'Deep Dive Part Three'. They end with a light-hearted reference to Sir Humphrey Davey, signaling the end of the video and looking forward to the next session.

Keywords

💡Loki

Loki is the name of the software or workflow discussed in the video, related to face manipulation and animation technology. The name comes from the Norse mythology figure known for shape-shifting, which metaphorically suits the video's theme of changing faces in digital media. In the script, 'Loki' is mentioned as the subject of the latest edition and its face swap technology.

💡Face Swap

Face Swap is a technology that allows the user to replace the face of a person in a video or image with another face. It is a core concept in the video, as it discusses the release of a new feature that enhances the process of face swapping with batch modes for animations and GIFs. The script describes the process of using the 'Loki fastest face swap' to create animations and save face models.

💡Batch Modes

Batch Modes refer to the ability to process multiple files or tasks simultaneously, which is a feature of the Loki software mentioned in the script. This feature is particularly useful for creating animations and face swapping on a larger scale, as it allows for efficiency and speed in generating content.

💡Face Models

Face Models in the context of this video are digital representations of faces that can be manipulated and used in animations or other visual media. The script explains how users can save and create these models using the Loki software and then load them into the Trio T pose workflow for further use in image creation.

💡Trio T Pose

Trio T Pose is a workflow mentioned in the script that involves using a specific pose as a starting point for creating face models. It seems to be a method for preparing the face models for further manipulation and animation within the Loki software.

💡Hedra

Hedra is an external animation tool mentioned in the script. It is used to animate the faces created with the Loki workflow, suggesting that Hedra works in conjunction with Loki to bring the face models to life before they are brought back into ComfyUI.

💡Live Portrait

Live Portrait (referred to as Live Portrait KJ in the script) is a set of nodes installed into ComfyUI and used alongside the Loki workflow. It is discussed extensively in the script and is used to create animated face models that mimic the movements and speech of a driving video, enhancing the realism of the animations.

💡ComfyUI

ComfyUI is the node-based interface in which the whole workflow is built and run. It is mentioned in the context of updating and installing new features like Live Portrait, indicating that it is the platform where users interact with the workflow's functionalities.

💡Frame Rate

Frame Rate refers to the number of frames displayed per second in a video, which is an important aspect of video quality and synchronization. The script mentions fixing the frame rate to match the source video, ensuring that the animation created by Live Portrait is smooth and in sync with the audio.
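
Matching the animation to an audio track is simple arithmetic: the number of frames to generate is the audio duration multiplied by the source FPS. A minimal check, assuming a WAV dialogue track and a frame rate already read from the driving video:

```python
import wave

SOURCE_FPS = 25.0  # assumed frame rate taken from the source video

with wave.open("dialogue.wav", "rb") as wf:            # placeholder file
    duration_s = wf.getnframes() / wf.getframerate()   # audio length in seconds

frame_count = round(duration_s * SOURCE_FPS)
print(f"{duration_s:.2f} s of audio at {SOURCE_FPS} fps -> {frame_count} frames")
```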

💡Text to Speech

Text to Speech (TTS) is a technology that converts written text into spoken words. In the script, it is mentioned in the context of creating videos where the head model speaks using TTS, allowing talking head animations to be produced without a recorded voice track.
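
The video does not name a specific TTS engine, but as one hedged example of the idea, the offline pyttsx3 library can turn a line of dialogue into a WAV file that could then be fed into the workflow as the driving audio:

```python
import pyttsx3  # offline text-to-speech; any TTS engine would work here

engine = pyttsx3.init()
engine.setProperty("rate", 160)  # speaking speed in words per minute
engine.save_to_file(
    "Hello, this line will drive the talking head.",  # placeholder dialogue
    "dialogue.wav")
engine.runAndWait()  # blocks until the WAV file has been written
```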

💡2D Puppets

2D Puppets are two-dimensional representations of characters that can be animated. The script discusses creating these puppets from the face models and T poses, which can then be manipulated for animation using a webcam or other input methods.

💡Tracking Stick

A Tracking Stick is a tool or feature mentioned in the script that seems to be used for aligning or tracking the movement of the face model within the animation. It is particularly useful for ensuring that the face model's movements match the body's movements in the animation.

💡Dewarp Stabilizers

Dewarp Stabilizers are tools used to correct distortions in images or videos, often caused by perspective or motion. The script mentions using these to fix issues where the face model appears to 'liquefy' or distort during animation, thus improving the final output's quality.
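
The script does not explain how the dewarp stabilization works internally; as a generic illustration of the underlying idea, a temporal moving average over tracked landmark positions can damp the frame-to-frame jitter that makes a face appear to 'liquefy':

```python
import numpy as np

def smooth_landmarks(landmarks, window=5):
    """Moving-average smoothing over time.

    landmarks: array of shape (frames, points, 2) holding x/y positions
    per frame -- a hypothetical layout used only for this illustration.
    """
    kernel = np.ones(window) / window
    smoothed = np.empty_like(landmarks, dtype=float)
    for point in range(landmarks.shape[1]):
        for axis in range(2):
            smoothed[:, point, axis] = np.convolve(
                landmarks[:, point, axis], kernel, mode="same")
    return smoothed

# Example: 120 frames of 68 jittery face landmarks.
noisy = np.random.rand(120, 68, 2)
stable = smooth_landmarks(noisy)
```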

Highlights

Introduction of Loki's latest edition with new features for face swapping and animation.

Release of Loki's fastest face swap with batch modes for creating animations and GIFs.

Ability to save and create face models and load them using the Trio T pose workflow.

Use of Hedra to animate the created faces in images.

Demonstration of creating a face model with the given images.

Explanation of adding more inputs to create additional face models.

Inclusion of a Hedra video for users to experiment with the workflow.

Installation of Live Portrait and slight changes made to fix the frame rate.

Need for extra models and their download links provided in the description.

Details on how to integrate the live portrait models into the workflow.

Update on the live portrait workflow with new features for audio synchronization.

Use of the ComfyUI Video Helper Suite nodes for better audio-video synchronization.

Demonstration of the live portrait's ability to animate with various input sources.

Discussion on the use of text-to-speech videos and their integration with the workflow.

Introduction of a removable head and tracking stick for better face-body resolution match.

Observation of face distortion issues and solutions with dewarp stabilizers.

Efficiency improvements and generation cost reduction tips for users.

Final thoughts on the workflow's potential and future developments.

Link to the workflow provided in the description for interested users.

Acknowledgment of node developers and the community's contributions to the project.