Today's featured cutting-edge AI news. Read on 👇
💡 TANGO: Capable of generating full-body dialogue videos based on audio and reference videos, potentially becoming an open-source version of HeyGen.
🎬 Animate-X: Brings static image characters to life, supporting various character types while maintaining character consistency.
🖼️ Rectified Flows: An efficient and accurate technique for image inversion and editing, capable of modifying images based on reference images or text descriptions.
🎙️ AsrTools: An open-source intelligent speech-to-subtitle tool, supporting multiple formats, batch processing, and generating srt and txt subtitle files.
📚 Awesome_Math_Books: A collection of classic mathematics books, available for download or online reading and suited to different levels of learning.
Cutting-edge Technology
1. Is an open-source version of HeyGen coming?
Given an audio clip and a reference video of body movements, the TANGO project generates a full-body conversational video that matches the audio.
Detailed introduction: https://pantomatrix.github.io/TANGO/
Try it online: https://huggingface.co/spaces/H-Liu1997/TANGO
Judging from the generated videos, the lip sync with the audio is not very accurate. Results might improve if TANGO were combined with an open-source lip-sync project, such as Kuaishou's LivePortrait.
2. Bringing image characters to life: Animate-X.
Similar to AnimateAnyone, it takes a character image plus a reference action and animates the character to follow that action while maintaining good character consistency.
Detailed introduction: https://lucaria-academy.github.io/Animate-X/
It performs well with various character types including real people, game characters, cartoons, and animations.
AI Drawing
1. An efficient and accurate technique for image inversion and editing: Rectified Flows (RFs).
It can generate images in a similar style based on reference images, and also supports editing images based on text descriptions, such as adding glasses to a person or changing a person's age and gender.
Detailed introduction: https://rf-inversion.github.io/
Judging from the demonstration images provided, the results look quite impressive, but the project code has not yet been open-sourced.
Open Source Projects
1. An open-source intelligent speech-to-subtitle tool: AsrTools.
It integrates the official interfaces of Jianying, Kuaishou, and Bijian, supports the flac, m4a, mp3, and wav audio formats, handles batch processing efficiently, and can generate .srt and .txt subtitle files.
GitHub: https://github.com/WEIFENG2333/AsrTools
It provides a simple and easy-to-use interface, requiring no GPU or complicated local configuration, making it easy for beginners to use.
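For readers unfamiliar with the two output formats, here is a minimal Python sketch of what writing .srt and .txt subtitles involves. It is not AsrTools' own code: the `Segment` type and the `is_supported`, `format_timestamp`, `to_srt`, and `to_txt` helpers are hypothetical names made up for illustration, and the sketch assumes recognition has already produced timed text segments.

```python
from dataclasses import dataclass
from pathlib import Path

# Audio extensions listed in the item above; the helper itself is hypothetical.
SUPPORTED_EXTENSIONS = {".flac", ".m4a", ".mp3", ".wav"}


@dataclass
class Segment:
    """One recognized piece of speech (hypothetical structure)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


def is_supported(path: str) -> bool:
    """Return True if the file extension is one of the listed audio formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as SRT: index, time range, text, blank separator."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}"
        blocks.append(f"{index}\n{time_range}\n{seg.text.strip()}\n")
    return "\n".join(blocks)


def to_txt(segments: list[Segment]) -> str:
    """Render segments as plain text, one line per segment, no timing."""
    return "\n".join(seg.text.strip() for seg in segments) + "\n"


if __name__ == "__main__":
    demo = [Segment(0.0, 1.5, "Hello"), Segment(1.5, 3.0, "world")]
    print(to_srt(demo))
    print(to_txt(demo))
```

The .srt output keeps the timing so that video players can display each line at the right moment, while the .txt output keeps only the transcript text.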
2. A curated collection of mathematics books: Awesome_Math_Books.
It gathers many classic books in the field of mathematics, such as "Probability Theory and Mathematical Statistics", "Advanced Algebra", and "Calculus", and provides download links or online reading.
GitHub: https://github.com/valeman/Awesome_Math_Books
In addition, there are some mathematics problem books for high school students, as well as some books on basic physics knowledge.