StoryAnimator.app · illustrated story video studio

Turn any voiceover into an illustrated story video

Make faceless YouTube videos from a voiceover: no camera and no face needed. Drop in an audio narration or record your own. StoryAnimator transcribes it into timestamped sentences, generates one consistent image per sentence, and lays the images onto a real editing timeline synced to the waveform, ready to export as MP4.

Timestamped lines: 0 · Narration length: 0:00 · Image model: 1 · Export ratio: 16:9

1. Upload your audio

Drop a voiceover or narration file. We run speech-to-text and split it into sentences with start and end times.
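
As a rough illustration of the sentence-splitting step described above, the sketch below groups word-level timestamps into sentences. It assumes the speech-to-text stage returns one timed entry per word and that sentences end on `.`, `!`, or `?`. The `Word` and `Sentence` types and the `split_into_sentences` helper are hypothetical names for this example, not StoryAnimator's actual implementation.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


@dataclass
class Sentence:
    text: str
    start: float
    end: float


# Assumption: punctuation from the transcript marks sentence boundaries.
SENTENCE_END = (".", "!", "?")


def _close(words: list[Word]) -> Sentence:
    """Collapse a run of words into one sentence spanning first start to last end."""
    return Sentence(
        text=" ".join(w.text for w in words),
        start=words[0].start,
        end=words[-1].end,
    )


def split_into_sentences(words: list[Word]) -> list[Sentence]:
    """Group timed words into sentences with start and end times."""
    sentences: list[Sentence] = []
    current: list[Word] = []
    for word in words:
        current.append(word)
        if word.text.endswith(SENTENCE_END):
            sentences.append(_close(current))
            current = []
    if current:  # trailing words with no closing punctuation
        sentences.append(_close(current))
    return sentences
```

A real transcript would also need handling for abbreviations and decimals ("Dr.", "3.5"), which this sketch treats as sentence endings.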

Drop audio here, or click to browse

WAV · MP3 · M4A · AAC · FLAC — up to 200 MB

Auto-detect language · timestamped sentences

2. Review the transcript

Every sentence becomes a shot. Hover or click a row to highlight it — these timings drive image durations on the timeline.
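
One plausible way to turn sentence timings into image durations is sketched below. It assumes each image is held from its sentence's start until the next sentence starts, that the first image begins at 0:00, and that the last image runs to the end of the narration, so pauses between sentences never leave a gap. The `shot_durations` helper is a hypothetical name and the hold rule is an assumption, not the app's documented behavior.

```python
from __future__ import annotations


def shot_durations(sentence_starts: list[float], narration_length: float) -> list[float]:
    """Return one duration (seconds) per sentence, covering the whole narration.

    Assumption: a shot stays on screen until the next sentence begins.
    """
    if not sentence_starts:
        return []
    if any(b <= a for a, b in zip(sentence_starts, sentence_starts[1:])):
        raise ValueError("sentence starts must be strictly increasing")
    if sentence_starts[-1] >= narration_length:
        raise ValueError("last sentence must start before the narration ends")

    # The first shot is pinned to 0.0 so any leading silence is covered too.
    boundaries = [0.0] + sentence_starts[1:] + [narration_length]
    return [round(end - start, 3) for start, end in zip(boundaries, boundaries[1:])]
```

For sentence starts of 0.4 s, 2.0 s, and 5.5 s over an 8-second narration, this yields durations of 2.0, 3.5, and 2.5 seconds, which sum to the narration length.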

transcript.txt · read-only
shots.list · 0 shots

3. AI prompt assist

One click drafts your visual style, a character consistency bible, and a negative prompt. Everything stays editable.

prompt.cfg · injected into every image
Transcribe audio first, then auto-fill these fields with AI.
Visual style: mood, medium, palette, lighting.
Character bible: fixed designs reused in every shot.
Negative prompt: what to keep out of frame.
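
To make "injected into every image" concrete, here is a minimal sketch of how the three editable fields might be combined with one transcript sentence into a per-frame prompt. The `PromptConfig` type, the `build_frame_prompt` helper, and the `Scene:` / `Style:` / `Characters:` / `Avoid:` layout are all assumptions for illustration. The page does not document the real prompt format or the image model's API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PromptConfig:
    visual_style: str      # mood, medium, palette, lighting
    character_bible: str   # fixed character designs reused in every shot
    negative_prompt: str   # what to keep out of frame


def build_frame_prompt(cfg: PromptConfig, sentence: str) -> str:
    """Combine the shared config with one sentence; empty fields are skipped."""
    fields = [
        ("Scene", sentence),
        ("Style", cfg.visual_style),
        ("Characters", cfg.character_bible),
        ("Avoid", cfg.negative_prompt),
    ]
    return "\n".join(
        f"{label}: {value.strip()}" for label, value in fields if value.strip()
    )
```

Because the config is shared and only the scene line changes, every frame receives the same style and character text, which is one simple way to keep shots visually consistent.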

4. Generate one image per sentence

We render a 16:9 frame for each timestamped line with Nano Banana. Reference image, style template, and character cast previews shape the final result.

generate.cfg
Upload character ref
Upload voice first to unlock this step.
Character cast preview
Generate a portrait of each character first. Refresh any you don't like, then generate the story. The approved cast locks in character consistency.
Applies to the cast and every generated frame.
estimate
Cost / frame: $0.0000
Quality score
Nano Banana keeps characters consistent, accepts a reference image, and follows per-sentence prompts.
Estimated total: $0.0000 (0 frames @ $0.0000)
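
The estimate is presumably the frame count multiplied by the per-frame cost, shown to four decimal places. A small sketch under that assumption follows. The `estimate_total` helper is hypothetical, and the per-frame price in the usage note is a made-up figure, not a real rate.

```python
from decimal import Decimal, ROUND_HALF_UP

FOUR_PLACES = Decimal("0.0001")


def estimate_total(frame_count: int, cost_per_frame: str) -> Decimal:
    """Return frame_count x cost_per_frame, rounded to four decimal places.

    Decimal avoids the binary floating-point drift that can appear when
    small per-frame prices are multiplied and summed.
    """
    if frame_count < 0:
        raise ValueError("frame_count must not be negative")
    total = Decimal(frame_count) * Decimal(cost_per_frame)
    return total.quantize(FOUR_PLACES, rounding=ROUND_HALF_UP)
```

With no transcript there are 0 frames and the estimate is 0.0000. With 12 frames at a hypothetical 0.039 per frame, `estimate_total(12, "0.039")` gives 0.4680.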
Transcribe audio before generating.

5. Results gallery

Each generated 16:9 frame with its index, filename, and the prompt that produced it. Export the whole job as a ZIP with manifest.
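
A ZIP export with a manifest could look roughly like the sketch below, built with Python's standard `zipfile` and `json` modules. The manifest filename (`manifest.json`), its fields, and the `build_job_zip` helper are assumptions for illustration. The page only states that each frame has an index, a filename, and a prompt.

```python
from __future__ import annotations

import io
import json
import zipfile


def build_job_zip(frames: list[dict]) -> bytes:
    """Pack frames plus a JSON manifest into an in-memory ZIP archive.

    Each frame dict is assumed to hold: 'index' (int), 'filename' (str),
    'prompt' (str), and 'data' (the encoded image bytes).
    """
    manifest = [
        {"index": f["index"], "filename": f["filename"], "prompt": f["prompt"]}
        for f in frames
    ]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for frame in frames:
            archive.writestr(frame["filename"], frame["data"])
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buffer.getvalue()
```

Keeping the prompt next to each filename in the manifest means a frame can be regenerated later without the original session.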

render_out/ · no job yet
Generate frames to populate the gallery.

6. Video preview

Every frame laid end to end, each clip stretched to its sentence duration over the narration waveform. Scrub, preview, and export to MP4.
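
Scrubbing a timeline like this comes down to finding which clip covers the playhead. Below is a minimal sketch using a binary search over clip start times. It assumes the clips are laid end to end from 0:00 with no gaps, and the `clip_starts_from` and `shot_at` helpers are hypothetical names, not part of the app.

```python
from __future__ import annotations

from bisect import bisect_right
from itertools import accumulate


def clip_starts_from(durations: list[float]) -> list[float]:
    """Start time of each clip when clips are laid end to end from 0.0."""
    if not durations:
        return []
    return [0.0] + list(accumulate(durations))[:-1]


def shot_at(playhead: float, clip_starts: list[float]) -> int:
    """Return the index of the clip on screen at `playhead` seconds.

    `clip_starts` must be sorted ascending. A playhead before the first
    clip clamps to index 0; one past the end clamps to the last clip.
    """
    if not clip_starts:
        raise ValueError("the sequence has no clips")
    index = bisect_right(clip_starts, playhead) - 1
    return max(index, 0)
```

For clip durations of 2.0, 3.5, and 2.5 seconds, the clips start at 0.0, 2.0, and 5.5, so a playhead at 1.99 s shows the first shot and a playhead at exactly 2.0 s shows the second.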

sequence.edit · no sequence
Sequence preview

Timeline

Background music
No music selected
Upload music you own, licensed royalty-free music, or Creative Commons music with the required attribution.
Ready · records in real time with audio
$

Credits and usage

Transcript and ZIP exports are free. Image generation and video export use credits.

total.bill
Transcription: $0.0000
Character portraits
Image generation: $0.0000
Video render: $0.0000
Total spent this session: $0.0000
Regular users see credits only. Admins see internal USD cost for margin tracking.