Today's featured cutting-edge AI news 👇
📹 AnchorCrafter: Generates natural, realistic e-commerce product videos from a person image, a product image, and an action video, with convincing interaction between the person and the product.
👗 TryOffDiff: A clothing extraction technique that recovers the shape and texture of clothing from photos of people and generates standardized clothing images, which can be used with other virtual try-on technologies.
🔊 Auralis: A newly released open-source, high-performance text-to-speech framework that supports multiple languages and voice cloning, with fast generation and low memory usage. In official tests it converted an entire "Harry Potter" novel to speech in 10 minutes.
Cutting-edge Technology
1. AnchorCrafter: A tool for generating realistic and interactive e-commerce product videos.
Given just a person image, a product image, and a pose/action video, it generates realistic videos of the person presenting the product.
Detailed introduction: https://cangcz.github.io/Anchor-Crafter/
The demo videos look impressive: the interaction between person and object is natural and realistic, much like a real presenter introducing a product.
The code has not been open-sourced yet, but the project is worth keeping an eye on.
2. TryOffDiff: A high-fidelity clothing reconstruction technology for virtual try-on.
Unlike previous virtual try-on approaches, this technique captures the shape, texture, and complex patterns of clothing worn in a photo and converts them into a standardized product image.
Detailed introduction: https://rizavelioglu.github.io/tryoffdiff/
Online demo: https://huggingface.co/spaces/rizavelioglu/tryoffdiff
It can be combined with existing virtual try-on technologies: first use TryOffDiff to extract the clothing from one person, then place it on another person.
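The two-stage combination described above can be sketched roughly as follows. This is only an illustration of the data flow: `try_off` and `try_on` are hypothetical callables standing in for a TryOffDiff-style extractor and any virtual try-on model, not the actual API of either, and images are represented as raw bytes purely for simplicity.

```python
from typing import Callable

# Assumption: images are passed around as raw bytes; a real pipeline
# would more likely use PIL images or tensors.
Image = bytes


def swap_garment(
    source_person: Image,
    target_person: Image,
    try_off: Callable[[Image], Image],
    try_on: Callable[[Image, Image], Image],
) -> Image:
    """Extract the garment worn by source_person, then dress target_person in it."""
    # Stage 1 (hypothetical extractor): person photo -> standardized garment image
    garment = try_off(source_person)
    # Stage 2 (hypothetical try-on model): target person + garment -> result image
    return try_on(target_person, garment)


if __name__ == "__main__":
    # Stand-in stages so the sketch runs without any model.
    fake_try_off = lambda person: b"garment-of:" + person
    fake_try_on = lambda person, garment: person + b"|wearing|" + garment
    print(swap_garment(b"alice", b"bob", fake_try_off, fake_try_on))
```

Passing the two stages in as parameters keeps the sketch independent of any particular model; either stage could be replaced without touching the other.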
Open Source Projects
1. Auralis: A high-performance open-source text-to-speech framework.
It converts text to natural-sounding speech at very high speed, supports voice cloning, uses little memory, and can handle multiple requests concurrently.
GitHub: https://github.com/astramind-ai/Auralis
It supports multiple languages, including English, Spanish, French, and Chinese, and also offers audio quality enhancement and noise reduction.
In official tests on a 3090 GPU, it converted an entire "Harry Potter" novel to speech in 10 minutes. Worth trying if you need this kind of functionality.