AI Studio
Choose natural voice output and control variable timing.
Voice and lip sync settings determine how personalized words sound, how the presenter's mouth movement is handled, and how variables fit into the original video timing.
Model
What the voice and lip sync settings control
- Voice sample: A generated audio option for a variable phrase.
- Voice config: The saved per-variable selection used during render.
- Lip sync: A render option that updates mouth movement when variable audio changes.
- Variable start time: Manual timing used when a variable should play at a specific time.
- Timing window: A start/end range with anchor and fit behavior for variable audio.
Voice picker
Pick the most natural voice for each variable
The voice picker is designed around variable phrases. For each variable, compare generated options and choose the one that sounds most natural in context.

- 01 Open the variable voice settings: Review variables that need generated voice output.
- 02 Listen in context: Choose the sample that matches surrounding words and pacing.
- 03 Save the selection: The selected voice configuration is saved with the template segment.
- 04 Preview before launching: Generate a small test row to confirm the selected voice works with real values.
Lip sync
When to enable lip sync
- Enable lip sync when the presenter's face is visible and the variable audio changes the mouth movement.
- Disable lip sync when the variable is off-camera, covered by B-roll, or already fits a fixed audio slot.
- Use shorter variables for more predictable facial and audio results.
- Test with different values before using the template in a large campaign.
Manual timing can override lip sync needs
If every mapped variable has a manual start time, the campaign can use manual timing instead of requesting lip sync for those variables.
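The rule above can be sketched as a small check. This is only an illustration under stated assumptions, not the product's implementation: the `VariableMapping` class, its `manual_start_seconds` field, and the `can_use_manual_timing` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class VariableMapping:
    """Hypothetical record for one mapped template variable."""
    name: str
    # Manual start time in seconds; None means no manual timing was set.
    manual_start_seconds: Optional[float] = None


def can_use_manual_timing(mappings: Sequence[VariableMapping]) -> bool:
    """Return True only if every mapped variable has a manual start time.

    A start time of 0.0 counts as set, so the check compares against None
    rather than relying on truthiness.
    """
    return all(m.manual_start_seconds is not None for m in mappings)


# Example: one variable has no manual start time, so manual timing
# alone is not enough for this campaign.
mappings = [
    VariableMapping("first_name", manual_start_seconds=2.4),
    VariableMapping("company"),
]
print(can_use_manual_timing(mappings))  # False
```

In this sketch a single variable without a start time is enough to return `False`, which matches the "every mapped variable" wording above.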
Timing
Manual timing windows
Manual timing windows are useful when the source recording has a predictable pause and the personalized word should fit into that window. The timing model supports start time, end time, anchor, and fit behavior.
- anchor: Controls whether the timing window is anchored to the left or right side.
- fit mode: Controls whether the renderer can adjust audio speed, video timing, or both.
- start/end seconds: The exact window where the personalized audio should fit.
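As a rough sketch of how these fields could interact, the example below models a timing window and places a generated clip inside it. Everything here is an assumption made for illustration: the `TimingWindow` class, its field names, the fit mode values `"audio"`, `"video"`, and `"both"`, and the reading of the anchor as "left aligns the clip to the window start, right aligns it to the window end". The actual renderer may behave differently.

```python
from dataclasses import dataclass
from typing import Tuple

ANCHORS = ("left", "right")
FIT_MODES = ("audio", "video", "both")  # assumed value names


@dataclass
class TimingWindow:
    """Hypothetical model of a manual timing window."""
    start_seconds: float
    end_seconds: float
    anchor: str = "left"
    fit_mode: str = "audio"

    def __post_init__(self) -> None:
        if self.end_seconds <= self.start_seconds:
            raise ValueError("end_seconds must be greater than start_seconds")
        if self.anchor not in ANCHORS:
            raise ValueError(f"anchor must be one of {ANCHORS}")
        if self.fit_mode not in FIT_MODES:
            raise ValueError(f"fit_mode must be one of {FIT_MODES}")

    @property
    def duration(self) -> float:
        return self.end_seconds - self.start_seconds


def place_clip(window: TimingWindow, clip_seconds: float) -> Tuple[float, float]:
    """Return (start_time, audio_speed) for a clip under the assumed semantics.

    If the clip is longer than the window and the fit mode allows audio
    adjustment, the audio is sped up just enough to fit. Video retiming
    is not modeled: with fit_mode "video" the speed stays at 1.0.
    """
    if clip_seconds <= 0:
        raise ValueError("clip_seconds must be positive")

    speed = 1.0
    played_seconds = clip_seconds
    if clip_seconds > window.duration and window.fit_mode in ("audio", "both"):
        speed = clip_seconds / window.duration
        played_seconds = window.duration

    if window.anchor == "left":
        start = window.start_seconds
    else:
        start = window.end_seconds - played_seconds
    return start, speed
```

Under these assumptions, `place_clip(TimingWindow(2.0, 3.0, anchor="right"), 0.5)` returns `(2.5, 1.0)`: the half-second clip ends exactly at the window end and plays at normal speed. A 1.5-second clip in the same one-second window with the default `"audio"` fit mode would instead be sped up by a factor of 1.5.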
Checks
Quality checks
- Use real sample values, not only placeholders.
- Check pronunciation for names, companies, and product terms.
- Verify mouth movement if the face is close to camera.
- Confirm subtitles still match the rendered words when subtitles are enabled.
- Keep template variables short and leave a pause around them in the source recording.