0.3.0 — 2026-03-22

AI fork — transcription, voice, and beyond

OpenCut AI is born. This release forks OpenCut and wraps a full AI suite around it — transcription, text-based editing, voice cloning, filters, fact-checking, overlays, and more. Everything runs locally.

Features:

Forked from OpenCut and added a complete AI backend with FastAPI, Docker Compose, and 6 microservices (Whisper, TTS, Ollama, Image, AI Backend, Web).
Backend Whisper transcription with word-level timestamps. Video auto-splits at segment boundaries after transcription.
Text-based video editing — delete transcript segments to cut video, drag to reorder scenes, remove individual words. Timeline compacts automatically to close gaps.
Live word highlighting in the transcript panel. Auto-scrolls to the active segment during playback.
Filler word detection with dotted underline styling. Mark words for removal and apply cuts in bulk.
Voice cloning and TTS via Coqui XTTS v2. Upload a voice sample to clone any voice. Male/female voice selection. Multilingual voiceover with auto-translation.
Per-segment voiceover generation — each transcript segment generates its own audio clip on a new voice track, aligned to the original timing.
Subtitle system — add/remove subtitle text track from transcript. Positioned at the bottom of the screen with semi-transparent background.
12 color filter presets (Grayscale, Sepia, Vintage, Warm, Cool, Vivid, Muted, Dramatic, High Key, Low Key, Noir, Golden Hour) via WebGL color-adjust shader.
Adjustment panel with brightness, contrast, saturation, temperature, and vignette sliders. Filters and adjustments applied as timeline effect tracks scoped to selected clips.
Overlays panel with picture-in-picture presets (4 corner positions), split screen (left/right/top/bottom), and compositing presets (ghost, dark, light leak). Per-element opacity and 17 blend modes.
Fact-check panel — extracts claims from transcript, verifies them via local LLM, and shows verdicts (True/False/Partial/Unverifiable). Add fact-check overlays to the video.
AI Studio script editing mode — when transcript exists, AI can rewrite, simplify, make more professional, or summarize the transcript text.
Quick actions bar with one-click filler detection, silence detection, subtitle toggle, and fact-check. Actions derive state from the actual timeline.
Audio separation — auto-detach audio from video into its own track after transcription. Per-clip volume control with draggable dB line on audio elements.
Per-clip volume applied to playback via Web Audio GainNode. Split audio clips to set different volumes per section.
Voiceover panel with 6 open-source TTS model options listed (XTTS v2, Bark, StyleTTS 2, Piper, Fish Speech, Kokoro). Voice generation runs as background tasks with progress in the task widget.
Background tasks widget — shows running voiceover/transcription tasks with live progress, elapsed time, completion status, and error handling.
Service health monitoring via proxied backend endpoint. RAM/GPU memory status in the header. All health checks go through the AI backend to avoid CORS issues.
API keys management in Settings — Freesound, AI Backend URL, OpenAI, ElevenLabs. Shows existing env values, info popovers explaining each key, hide/show/clear controls.
Self-hosting pricing page on the website — 4 tiers from $0 (local) to $150/mo (GPU server), comparison table vs Descript/Kapwing/Runway.

Improvements:

Renamed Captions tab to Transcript. Transcription now uses backend Whisper instead of browser-based ONNX Whisper for better accuracy.
Transcript segments filtered to actual video duration — hallucinated segments beyond the video length are dropped.
Transcript store syncs segment times to compacted timeline after deletion, so subsequent edits target the correct time ranges.
Transcript auto-restores from existing subtitle text elements when reopening a project. Clears automatically when all video is deleted.
Timeline track labels panel now scrolls vertically with synced scroll. Playhead line extends through all tracks including scrolled-off ones.
Tab bar scroll arrows — up/down buttons appear when there are more tabs than visible, with smooth scrolling.
Track icon tooltips — hover over any track icon to see the track name. Track labels are editable inline.
Quick actions tooltips moved to top to avoid being cut off by the timeline.
Landing page redesigned — cinematic hero with parallax scroll, SVG icons for features, animated accent lines, gradient text, stats row, terminal-style CTA.
Footer updated with correct page links (Editor, Models, Roadmap, Changelog, Blog, Contributors, Brand, Privacy, Terms).
Privacy and Terms pages updated to reflect local-first, self-hosted architecture. No cloud uploads, no analytics, no tracking.
Roadmap rewritten with 15 items covering upstream foundation, fork, and all AI features added.
Contributors page now includes thank-you sections for OpenCut upstream project and the open-source community.

Fixes:

Hydration mismatch in ThemeToggle — added mounted state to avoid server/client text mismatch.
TTS service Dockerfile — added build-essential and cargo for compiling native dependencies. Added COQUI_TOS_AGREED env var. Pinned torch<2.6 for weights_only compatibility.
Image service — changed rembg to rembg[cpu] to include onnxruntime dependency.
TTS generation now runs in a thread pool so health checks remain responsive during long generation.
Audio InputDisposedError — wrapped audio iterator in try-catch so deleting tracks during playback doesn't crash.
React hooks violation in QuickActionsBar — moved early return after all hook declarations.
File extension fix for transcription — creates a proper File with extension when mediaAsset has no filename.