AI Avatar / Talking Head
One photo + one audio = a talking digital human
- Just a photo and audio — that's it
- OmniHuman v1.5 / Hedra and other top models
- Auto lip-sync, expressions, head motion
- 9:16 vertical and 16:9 landscape
What is it
AI Avatar combines a static face photo with an audio track to generate a video of a digital human speaking. The system automatically aligns lip movement with the audio and adds natural blinks, nods, and expression changes. Common uses include e-commerce voiceover, educational videos, and virtual hosts. A 30-second video takes 2-5 minutes to generate.
How to use it
Get started in 5 steps
1. Upload a face photo
A clear, front-facing photo works best. 5-10MB PNG/JPG/WEBP. The face is auto-detected.
2. Upload audio
The line you want spoken. MP3/WAV/M4A, up to 20MB. English, Mandarin, and other languages are supported. You can also generate the audio first with the Text to Audio node.
3. Optional: scene prompt
Describe shot framing, action, and expression hints (e.g. "medium shot, natural smile, occasional nods").
4. Pick model + aspect ratio
OmniHuman v1.5 is recommended. Use 9:16 for short-form and 16:9 for long-form platforms.
5. Generate + download
Hit Generate and wait 2-5 minutes. Then download, save, or send the result to the Workflow Editor for post-processing.
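The upload limits in the steps above can be checked locally before submitting. The sketch below is only an illustration, not part of the product: the function name `check_inputs` is hypothetical, the 10 MB photo cap is an assumed reading of "5-10MB", and accepting `.jpeg` alongside `.jpg` is also an assumption.

```python
from pathlib import Path

MB = 1024 * 1024

# Formats and sizes taken from the steps above.
# Assumptions: "5-10MB" is read as a 10 MB cap; ".jpeg" is treated as JPG.
PHOTO_EXTS = {".png", ".jpg", ".jpeg", ".webp"}
AUDIO_EXTS = {".mp3", ".wav", ".m4a"}
PHOTO_MAX_BYTES = 10 * MB
AUDIO_MAX_BYTES = 20 * MB
ASPECT_RATIOS = {"9:16", "16:9"}


def check_inputs(photo_name, photo_bytes, audio_name, audio_bytes, aspect_ratio):
    """Return a list of problems; an empty list means the inputs look fine."""
    problems = []
    if Path(photo_name).suffix.lower() not in PHOTO_EXTS:
        problems.append("photo must be PNG, JPG, or WEBP")
    if photo_bytes > PHOTO_MAX_BYTES:
        problems.append("photo exceeds 10 MB")
    if Path(audio_name).suffix.lower() not in AUDIO_EXTS:
        problems.append("audio must be MP3, WAV, or M4A")
    if audio_bytes > AUDIO_MAX_BYTES:
        problems.append("audio exceeds 20 MB")
    if aspect_ratio not in ASPECT_RATIOS:
        problems.append("aspect ratio must be 9:16 or 16:9")
    return problems


# A valid pair of files passes with no problems reported.
print(check_inputs("host.png", 6 * MB, "script.mp3", 3 * MB, "9:16"))
# An unsupported audio format, an oversized file, and an unknown ratio are all flagged.
print(check_inputs("host.png", 6 * MB, "script.flac", 25 * MB, "1:1"))
```

Returning every problem at once, rather than stopping at the first, lets a user fix the photo, audio, and aspect ratio in a single pass.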
Use cases
What other users build with it
E-commerce voiceover
Host photo + product script audio → short video. 90% time savings vs live recording.
Educational content
Historical figure photo + lecture audio = "the ancients" teaching history.
Virtual hosts
Same character across episodes for consistent brand persona.
Multilingual marketing
One photo plus audio in multiple languages → one shoot, all languages.
Why Pixify
Just two required inputs
Upload a photo and an audio file. Submitting takes about 30 seconds.
Frame-accurate lip-sync
OmniHuman v1.5 is the current industry state of the art for lip alignment.
Workflow chainable
Chain with Text to Audio (to synthesize lines) or Audio Video Merge (to add background music).
Frequently asked questions
What kind of photo works best?
How long can the audio be?
Can I do two-person dialogue?
Who owns the output?
Ready to start?
Sign up to get starter credits. No card required.
Generate your first avatar