Feature · Available on Creator and Studio plans

Auto-Subtitles

Word-level karaoke captions burned into every clip — TikTok Bold, Minimal, Neon, or White Bar — synced to the exact moment each word is spoken.

≈ 50ms

Word-to-caption sync (Whisper verbose_json precision)

4

Visual styles — TikTok Bold, Minimal, Neon, White Bar

0

Manual timing edits required — captions snap to word boundaries

Four caption styles · word-level sync

TikTok Bold: "STOP"
Minimal: "clean · simple"
Neon Yellow: "GLOW"
White Bar: "clear"

The problem

Captions are table-stakes, but synced captions are rare.

Most auto-caption tools slice by silence or into fixed 5-second blocks. The result: 3–4 words appear at once, vanish as soon as the next phrase starts, and leave viewers with half-read, choppy captions. Viewers scroll past in 1.2 seconds.

How we solve it

Word-level Whisper + Shotstack burn-in.

Clipflow runs Whisper with `timestamp_granularities=["word"]` so every word comes back with its own start and end time in seconds. Captions render in 2–3 word chunks that flip at a natural reading pace: never mid-word, never held too long. The rules: at most 3 words per chunk, at most 2s on screen per chunk, a forced break on sentence-ending punctuation, and a forced break when the gap between words is ≥ 0.6s.

  • 2–3 word chunks for TikTok-pace reading
  • Break on sentence-ending punctuation, not on guessed boundaries
  • Break on natural pauses (≥0.6s gap)
  • Max 2s per chunk so nothing lingers past the voice
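
The chunking rules above can be sketched in a few lines of Python. This is an illustrative sketch, not Clipflow's actual implementation: the `Word` and `Chunk` types, the `chunk_words` function, and the constant names are all hypothetical. It assumes each word arrives with trailing punctuation attached and with start and end times in seconds, and it reads the 2s cap as "start a new chunk rather than let one span more than 2s, and clamp a single over-long word to 2s".

```python
from dataclasses import dataclass

# Thresholds taken from the rules above; the names are illustrative.
MAX_WORDS = 3        # at most 3 words per chunk
MAX_GAP_S = 0.6      # a pause this long forces a break
MAX_CHUNK_S = 2.0    # a chunk never stays on screen longer than this
SENTENCE_END = (".", "!", "?")


@dataclass
class Word:
    text: str     # assumed to carry trailing punctuation, e.g. "synced."
    start: float  # seconds
    end: float    # seconds


@dataclass
class Chunk:
    text: str
    start: float
    end: float


def chunk_words(words: list[Word]) -> list[Chunk]:
    """Group word-level timestamps into short caption chunks."""
    chunks: list[Chunk] = []
    current: list[Word] = []

    def flush() -> None:
        # Emit the words collected so far as one chunk, capped at MAX_CHUNK_S.
        if current:
            start = current[0].start
            end = min(current[-1].end, start + MAX_CHUNK_S)
            chunks.append(Chunk(" ".join(w.text for w in current), start, end))
            current.clear()

    for word in words:
        if current:
            gap = word.start - current[-1].end
            too_long = word.end - current[0].start > MAX_CHUNK_S
            # Break before this word on a natural pause or a duration overrun.
            if gap >= MAX_GAP_S or too_long:
                flush()
        current.append(word)
        # Break after this word on the word limit or sentence-ending punctuation.
        if len(current) >= MAX_WORDS or word.text.rstrip().endswith(SENTENCE_END):
            flush()

    flush()  # emit any trailing words
    return chunks


demo = [
    Word("Captions", 0.00, 0.40),
    Word("are", 0.45, 0.55),
    Word("table-stakes,", 0.60, 1.10),
    Word("but", 1.15, 1.30),
    Word("synced.", 1.35, 1.80),
]
for chunk in chunk_words(demo):
    print(f"{chunk.start:.2f}-{chunk.end:.2f}  {chunk.text}")
```

Because every break happens between words, a chunk can never split mid-word. If the transcription output strips punctuation from word entries, the punctuation rule would need the sentence text as a second input.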

The four styles

Match your visual system — no design time required.

TikTok Bold uses Arial Black with a fat black stroke for that recognizable "TikTok caption" look. Minimal is clean Arial on white with no stroke. Neon Yellow glows for high contrast against dark backgrounds. White Bar drops captions onto a semi-transparent pill — Instagram Reels style. Switch between them in the Preview editor before rendering.

Where it fits

Built into every render, including Viral Moments.

Every clip rendered via the Viral Moments pipeline already ships with captions. For one-off subtitle generation on videos that don't go through clipping (full-length podcast uploads, long-form YouTube uploads), the standalone /subtitles page runs Whisper and outputs SRT and VTT files you can attach to any platform upload.
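
For readers curious how SRT and VTT differ, here is a minimal Python sketch that serializes timed captions into both formats. It is not the /subtitles page's code: the function names and the `(start, end, text)` cue tuples are assumptions made for illustration. The main format differences are that SRT numbers each cue and uses a comma before the milliseconds, while VTT opens with a `WEBVTT` header and uses a period.

```python
Cue = tuple[float, float, str]  # (start seconds, end seconds, caption text)


def format_timestamp(seconds: float, ms_sep: str) -> str:
    """Render seconds as HH:MM:SS<sep>mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{ms_sep}{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """SRT: numbered cues, comma before milliseconds, blank line between cues."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(cues: list[Cue]) -> str:
    """WebVTT: 'WEBVTT' header, period before milliseconds, no cue numbers needed."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        lines += [timing, text, ""]
    return "\n".join(lines)


cues = [(0.0, 1.1, "Captions are table-stakes,"), (1.15, 1.8, "but synced.")]
print(to_srt(cues))
print(to_vtt(cues))
```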

Try Auto-Subtitles on your next recording.

Free tier, no credit card. Your first draft lands in about two minutes.

Start free — no card