Upload your audio
Drop a voiceover or narration file. We run speech-to-text and split it into sentences with start and end times.
Drop audio here, or click to browse
WAV · MP3 · M4A · AAC · FLAC — up to 200 MB
Auto-detect language · timestamped sentences
Review the transcript
Every sentence becomes a shot. Hover or click a row to highlight it — these timings drive image durations on the timeline.
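As a rough illustration of how a timestamped sentence could map to a shot length, here is a minimal sketch. The `Sentence` class, its field names, and the sample lines are assumptions made for this example, not the app's actual data model.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    """One transcript row (hypothetical shape, for illustration only)."""
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio

    @property
    def duration(self) -> float:
        # A shot lasts as long as its sentence is spoken.
        return self.end - self.start


# Made-up sample transcript.
transcript = [
    Sentence("The city wakes before sunrise.", 0.0, 2.5),
    Sentence("A train hums across the bridge.", 2.5, 4.0),
]

# One duration per sentence, in transcript order.
durations = [s.duration for s in transcript]
```

Under these assumptions, each image simply inherits the duration of the sentence it illustrates.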
AI prompt assist
One click drafts your visual style, a character consistency bible, and a negative prompt. Everything stays editable.
Generate one image per sentence
We render a 16:9 frame for each timestamped line with Nano Banana. Reference image, style template, and character cast previews shape the final result.
- Cinematic - photoreal film still
- Anime - cel-shaded illustration
- Pixar - glossy 3D animation
- Watercolor - hand-painted
- Children's Book - gouache storybook
- Photorealistic - true-to-life photo
- Documentary - candid realism
- Stick figure - simple line drawing
- Pixel art - retro 8-bit
- Comic book - bold ink & halftone
- Claymation - stop-motion clay
- Oil painting - classical brushwork
- Low poly 3D - faceted geometry
- Line art - minimal black lines
- Type your own
Results gallery
Each generated 16:9 frame appears with its index, filename, and the prompt that produced it. Export the whole job as a ZIP with a manifest.
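For a sense of what such an export might contain, the sketch below bundles frames and a manifest into a ZIP using Python's standard library. The `manifest.json` name, the `build_export_zip` function, and the frame dictionary keys are assumptions for illustration; the real export layout may differ.

```python
import io
import json
import zipfile


def build_export_zip(frames):
    """Pack image bytes plus a JSON manifest into an in-memory ZIP.

    `frames` is a list of dicts with hypothetical keys:
    index, filename, prompt, data (raw image bytes).
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        manifest = []
        for frame in frames:
            # Store the image under its own filename.
            zf.writestr(frame["filename"], frame["data"])
            # Record the metadata shown in the gallery.
            manifest.append({
                "index": frame["index"],
                "filename": frame["filename"],
                "prompt": frame["prompt"],
            })
        # Assumed manifest name; the actual export may use another.
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buf.getvalue()
```

The manifest keeps each prompt next to its filename, so a frame can be traced back to the text that produced it even after the images leave the app.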
Video preview
Every frame is laid end to end, each clip stretched to its sentence duration over the narration waveform. Scrub, preview, and export to MP4.
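One way to picture the end-to-end layout is as a running total of clip lengths. This is a minimal sketch, assuming clips are placed back to back with no gaps; `lay_out_clips` is a hypothetical helper, not part of the product.

```python
from itertools import accumulate


def lay_out_clips(durations):
    """Return (start, end) times in seconds for clips placed back to back."""
    # Each clip ends at the running total of all durations so far.
    ends = list(accumulate(durations))
    # Each clip starts where the previous one ended; the first starts at 0.
    starts = [0.0] + ends[:-1]
    return list(zip(starts, ends))


# Three clips lasting 2.5 s, 1.5 s, and 3.0 s.
clips = lay_out_clips([2.5, 1.5, 3.0])
```

Here the last clip's end time equals the total length of the video, which is why the frames line up with the narration when every sentence has a clip.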
Credits and usage
Transcript and ZIP exports are free. Image generation and video export use credits.