MioSub Docs

Timeline Forced Alignment

Get millisecond-precise timestamps with CTC forced alignment

v3.0 New Feature

v3.0 includes a new built-in CTC aligner, so no separate alignment tool needs to be installed. The only download required is an alignment model (see Quick Setup).

Forced alignment models produce higher-precision, character-level timestamps, which is useful when subtitles must track speech closely.


⚡ Quick Setup (v3.0+)

To enable the built-in CTC aligner:

  1. Open Settings > Enhancement > Timeline Alignment
  2. Set Alignment Mode to "CTC"
  3. Download the alignment model
  4. In the CTC aligner configuration, set the model directory to the folder that contains the downloaded model files

Model Download Links

Omnilingual ASR CTC 300M (Recommended, 1600+ languages, Apache 2.0 license)

Hugging Face

HF Mirror (for users in China)

Please download model.int8.onnx and tokens.txt, place them in the same folder, then select that folder in settings.
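Before selecting the folder in settings, it can help to confirm that both files are actually there. The following is a minimal sketch, not part of MioSub; the function name `missing_model_files` is hypothetical, and only the two file names come from the instructions above.

```python
from pathlib import Path

# File names taken from the download instructions for the Omnilingual model.
REQUIRED_FILES = ("model.int8.onnx", "tokens.txt")


def missing_model_files(model_dir):
    """Return the required file names that are absent from model_dir."""
    folder = Path(model_dir)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]
```

An empty list means the folder is ready to be selected; otherwise the list names the files that still need to be downloaded.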

⚠️ The Omnilingual model requires aligner v0.2.0 or later. Check your current version in Settings > About, and update via "Check for Updates".

Legacy model: MMS-300M (not recommended)

Hugging Face

HF Mirror (for users in China)

Please download all four files, place them in the same folder, then select that folder in settings. This model has no aligner version requirement.


🎯 How It Works

High-precision timeline alignment based on CTC (Connectionist Temporal Classification) technology:

  • Millisecond Precision: Supports character-level timestamp alignment
  • Auto Correction: Fixes timing drift in Whisper transcription
  • Multi-Language Support: Supports Chinese, English, Japanese, and more
  • GPU Acceleration: Supports ONNX Runtime GPU acceleration (if available)
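To make the idea concrete, here is a minimal sketch of textbook CTC forced alignment (a Viterbi search over the text interleaved with blank tokens). It is an illustration under stated assumptions, not MioSub's actual implementation: the function names, the blank index of 0, and the per-frame duration passed to `spans_to_ms` are all hypothetical, and a real aligner would take `log_probs` from the acoustic model's output.

```python
import math


def ctc_forced_align(log_probs, tokens, blank=0):
    """Viterbi-align `tokens` to frames.

    log_probs: list of frames, each a list of per-token log probabilities.
    Returns one (first_frame, last_frame) pair per token.
    """
    if not log_probs:
        raise ValueError("no frames to align against")
    # Interleave blanks: [blank, t1, blank, t2, ..., blank]
    ext = [blank]
    for tok in tokens:
        ext += [tok, blank]
    T, S = len(log_probs), len(ext)
    score = [[-math.inf] * S for _ in range(T)]
    back = [[0] * S for _ in range(T)]
    # A path may start in the leading blank or in the first token.
    score[0][0] = log_probs[0][ext[0]]
    if S > 1:
        score[0][1] = log_probs[0][ext[1]]
    for t in range(1, T):
        for s in range(S):
            # Allowed predecessors: stay, advance by one, or skip the
            # blank between two different tokens.
            cands = [s]
            if s >= 1:
                cands.append(s - 1)
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                cands.append(s - 2)
            best = max(cands, key=lambda p: score[t - 1][p])
            score[t][s] = score[t - 1][best] + log_probs[t][ext[s]]
            back[t][s] = best
    # A valid path ends in the last token or the trailing blank.
    s = S - 1
    if S > 1 and score[T - 1][S - 2] > score[T - 1][S - 1]:
        s = S - 2
    if score[T - 1][s] == -math.inf:
        raise ValueError("too few frames for this text")
    # Walk the best path backwards and record each token's frame span.
    spans = [None] * len(tokens)
    for t in range(T - 1, -1, -1):
        if ext[s] != blank:
            i = (s - 1) // 2
            spans[i] = (t, spans[i][1]) if spans[i] else (t, t)
        s = back[t][s]
    return spans


def spans_to_ms(spans, frame_ms):
    """Convert frame spans to (start_ms, end_ms) given the frame duration."""
    return [(first * frame_ms, (last + 1) * frame_ms) for first, last in spans]
```

For five frames whose most likely tokens are blank, a, a, blank, b, aligning the text "ab" yields the frame spans (1, 2) and (4, 4). The timestamp resolution is therefore bounded by the model's frame duration, which is why the frame length is left as a caller-supplied parameter here.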

Alignment Mode Comparison

Mode    Precision      Speed      Use Case
Off     Original       Fastest    Quick preview
CTC     Millisecond    Medium     Professional subtitle production

❓ FAQ

Alignment is slow?

CTC alignment is compute-intensive. Things to check:

  1. Ensure sufficient memory (16GB+ recommended)
  2. Long videos are aligned segment by segment, so longer videos take more time
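As a rough illustration of segment-wise processing, the sketch below splits a total duration into consecutive windows. The function name and the idea of a fixed window length are assumptions for illustration only; the segment length MioSub actually uses is not documented here.

```python
def split_into_segments(total_ms, segment_ms):
    """Split [0, total_ms) into consecutive (start_ms, end_ms) windows."""
    if segment_ms <= 0:
        raise ValueError("segment_ms must be positive")
    return [
        (start, min(start + segment_ms, total_ms))
        for start in range(0, total_ms, segment_ms)
    ]
```

Each window can then be aligned on its own, which keeps memory use bounded regardless of video length.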

Alignment makes timing worse?

This can happen if the source video has poor audio quality, or the alignment model isn't optimized for the specific language/accent. Suggestions:

  1. Check the source video audio quality
  2. Temporarily disable alignment and use the original timestamps

Aligner version too old?

The Omnilingual ASR CTC 300M model requires aligner v0.2.0 or later. To fix:

  1. Open Settings > About
  2. Click "Check for Updates" to update the aligner component
  3. If the update is unavailable, you can temporarily switch back to the legacy MMS-300M model
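The version requirement amounts to a simple comparison against v0.2.0. A minimal sketch, assuming the version shown in Settings > About is a dotted numeric string such as "v0.2.0" (the helper names are hypothetical, not part of MioSub):

```python
# Minimum aligner version stated for the Omnilingual ASR CTC 300M model.
MIN_OMNILINGUAL_ALIGNER = (0, 2, 0)


def parse_version(text):
    """Turn 'v0.2.0' or '0.2.0' into a tuple of ints, padded to three parts."""
    parts = tuple(int(part) for part in text.strip().lstrip("vV").split("."))
    return parts + (0,) * (3 - len(parts))


def supports_omnilingual(aligner_version):
    """True if the aligner version is v0.2.0 or later."""
    return parse_version(aligner_version) >= MIN_OMNILINGUAL_ALIGNER
```

Comparing integer tuples rather than raw strings avoids the usual pitfall where "0.10.1" sorts before "0.2.0".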
