Spend one minute a day on a curated digest of cutting-edge AI information.
Coverage includes, but is not limited to, AI news, AI tools, AI painting, open-source projects, and learning tutorials.
Follow AI Daily to stay up to date with AI trends. We hope you find it helpful. Important items will get their own detailed write-ups.
Here is the latest AI news for July 30.
Cutting-edge News
1. Shengshu Technology released the end-to-end text-to-video large model Vidu.
Based on the U-ViT architecture, it can generate up to 16 seconds of 1080p HD video in a single pass, with high consistency, multi-shot generation, strong coherence, and rich imagination.
Detailed introduction: https://www.shengshu-ai.com/vidu
Try it online: https://www.vidu.studio/create
Currently, you can log in directly with a Google account and receive 80 trial points. Each generation costs 4 points and produces a video of up to 4 seconds.
I tried it out: generation is fast, the video is stable, and the content matches the description quite well. However, ordinary users can only generate 4-second videos, so I could not test coherence on longer clips.
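As a quick check on what the trial quota allows, the arithmetic can be sketched in Python. The numbers (80 points, 4 points per use, up to 4 seconds per video) come from the text above; the constant and function names are made up for illustration.

```python
# Figures taken from the text above; all names here are illustrative only.
TRIAL_POINTS = 80           # points granted on first login
POINTS_PER_VIDEO = 4        # points consumed per generation
MAX_SECONDS_PER_VIDEO = 4   # longest clip available to ordinary users


def trial_budget(points=TRIAL_POINTS, cost=POINTS_PER_VIDEO,
                 seconds=MAX_SECONDS_PER_VIDEO):
    """Return (number of generations, maximum total seconds of video)."""
    generations = points // cost
    return generations, generations * seconds


if __name__ == "__main__":
    videos, total_seconds = trial_budget()
    print(f"{videos} generations, up to {total_seconds} seconds of video in total")
```

In other words, the trial covers 20 generations, which is enough to compare several prompts but not to evaluate the 16-second mode.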
2. Apple begins rolling out Apple Intelligence!
It is officially available in the iOS 18.1 beta, on iPhone 15 Pro, iPhone 15 Pro Max, and iPads with Apple M-series chips.
Demo video: https://www.youtube.com/watch?v=OHU20Ygypy0
Note: You need to set the device language to English (US) to join the waitlist.
3. Runway released the Gen-3 Alpha image-to-video function.
It lets you use any image as the first frame of a video, either on its own or guided by a text prompt.
Try it here: https://app.runwayml.com/video-tools
Cutting-edge Technology
1. The image super-resolution model AuraSR has released version 2!
It is an open-source 4x super-resolution model based on GigaGAN. It is fast, preserves detail well when upscaling, and can be used commercially.
Model download: https://huggingface.co/fal/AuraSR-v2
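For readers who want to try the model locally, here is a minimal sketch. It assumes the `aura-sr` and `Pillow` packages are installed (`pip install aura-sr pillow`) and follows the usage shown on the model card at the time of writing (`AuraSR.from_pretrained` and `upscale_4x_overlapped`). Check the model card before relying on it, since the API may change. The helper names and file paths are hypothetical.

```python
def upscaled_size(width, height, factor=4):
    """Output dimensions for a fixed-factor upscaler (AuraSR is 4x)."""
    return width * factor, height * factor


def upscale_image(input_path, output_path):
    """Upscale one image 4x with AuraSR-v2 and save the result.

    Assumption: the aura_sr API matches the model card
    (AuraSR.from_pretrained / upscale_4x_overlapped).
    """
    # Imported lazily so upscaled_size() works without these packages.
    from PIL import Image
    from aura_sr import AuraSR

    # Downloads the weights from Hugging Face on first use.
    model = AuraSR.from_pretrained("fal/AuraSR-v2")
    image = Image.open(input_path).convert("RGB")
    upscaled = model.upscale_4x_overlapped(image)
    upscaled.save(output_path)
    return upscaled.size


if __name__ == "__main__":
    # A 256x256 input becomes 1024x1024 at 4x.
    print(upscaled_size(256, 256))
```

Calling `upscale_image("input.png", "output.png")` (placeholder file names) runs the actual model. A GPU is strongly recommended for reasonable speed.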
AI Painting
- Fotographer.ai: a very effective tool for generating product advertising images.
It can seamlessly blend the foreground elements of a product with any background, while maintaining the shape and style consistency of the product.
Detailed introduction: https://t.zsxq.com/OQoKe
I tested it with an image of the Xiaomi SU7 car. First, fill in the background prompt, then the product prompts (texture, lighting, style), and finally adjust the fusion strength as needed.
Open-source Projects
1. An open-source real-time interactive streaming digital human project: metahuman-stream.
It achieves synchronized audio and video dialogue and is close to commercial quality.
GitHub: https://github.com/lipku/metahuman-stream
Features:
- Supports multiple digital human models: ernerf, musetalk, wav2lip;
- Supports voice cloning;
- Supports digital human speech interruption;
- Supports full-body video splicing;
- Supports rtmp and webrtc;
- Supports video orchestration: plays custom videos while the digital human is not speaking;
- Supports digital human dialogue using large language models like ChatGPT, Qwen, and Gemini.
2. An open-source best-practice guide to civil service exams: developer2gwy.
Written by a group of friends who moved from programming into the civil service, it shares their experiences, what life inside the system is like, and their personal journeys.
GitHub: https://github.com/miss-mumu/developer2gwy
It covers the basics of the civil service exam, best practices for preparation, common problems, an interview handbook, and answers to frequently asked questions.
3. A series of study materials to take you deep into Spring source code: Spring-Reading.
It covers core concepts and key features of the Spring framework, such as resource loading and access, the Spring Expression Language, Bean definitions, the Aware interfaces, core annotations, and Spring AOP.
GitHub: https://github.com/xuchengsheng/spring-reading
We hope this helps you better understand the internal workings of Spring and apply it more effectively in practical work.