All releases
0.3.02026-03-22

AI fork — transcription, voice, and beyond

OpenCut AI is born. This release forks OpenCut and wraps a full AI suite around it — transcription, text-based editing, voice cloning, filters, fact-checking, overlays, and more. Everything runs locally.

Features:

  • Forked from OpenCut and added a complete AI backend with FastAPI, Docker Compose, and 6 microservices (Whisper, TTS, Ollama, Image, AI Backend, Web).
  • Backend Whisper transcription with word-level timestamps. Video auto-splits at segment boundaries after transcription.
  • Text-based video editing — delete transcript segments to cut video, drag to reorder scenes, remove individual words. Timeline compacts automatically to close gaps.
  • Live word highlighting in the transcript panel. Auto-scrolls to the active segment during playback.
  • Filler word detection with dotted underline styling. Mark words for removal and apply cuts in bulk.
  • Voice cloning and TTS via Coqui XTTS v2. Upload a voice sample to clone any voice. Male/female voice selection. Multilingual voiceover with auto-translation.
  • Per-segment voiceover generation — each transcript segment generates its own audio clip on a new voice track, aligned to the original timing.
  • Subtitle system — add/remove subtitle text track from transcript. Positioned at the bottom of the screen with semi-transparent background.
  • 12 color filter presets (Grayscale, Sepia, Vintage, Warm, Cool, Vivid, Muted, Dramatic, High Key, Low Key, Noir, Golden Hour) via WebGL color-adjust shader.
  • Adjustment panel with brightness, contrast, saturation, temperature, and vignette sliders. Filters and adjustments applied as timeline effect tracks scoped to selected clips.
  • Overlays panel with picture-in-picture presets (4 corner positions), split screen (left/right/top/bottom), and compositing presets (ghost, dark, light leak). Per-element opacity and 17 blend modes.
  • Fact-check panel — extracts claims from transcript, verifies them via local LLM, and shows verdicts (True/False/Partial/Unverifiable). Add fact-check overlays to the video.
  • AI Studio script editing mode — when transcript exists, AI can rewrite, simplify, make more professional, or summarize the transcript text.
  • Quick actions bar with one-click filler detection, silence detection, subtitle toggle, and fact-check. Actions derive state from the actual timeline.
  • Audio separation — auto-detach audio from video into its own track after transcription. Per-clip volume control with draggable dB line on audio elements.
  • Per-clip volume applied to playback via Web Audio GainNode. Split audio clips to set different volumes per section.
  • Voiceover panel with 6 open-source TTS model options listed (XTTS v2, Bark, StyleTTS 2, Piper, Fish Speech, Kokoro). Voice generation runs as background tasks with progress in the task widget.
  • Background tasks widget — shows running voiceover/transcription tasks with live progress, elapsed time, completion status, and error handling.
  • Service health monitoring via proxied backend endpoint. RAM/GPU memory status in the header. All health checks go through the AI backend to avoid CORS issues.
  • API keys management in Settings — Freesound, AI Backend URL, OpenAI, ElevenLabs. Shows existing env values, info popovers explaining each key, hide/show/clear controls.
  • Self-hosting pricing page on the website — 4 tiers from $0 (local) to $150/mo (GPU server), comparison table vs Descript/Kapwing/Runway.

Improvements:

  • Renamed Captions tab to Transcript. Transcription now uses backend Whisper instead of browser-based ONNX Whisper for better accuracy.
  • Transcript segments filtered to actual video duration — hallucinated segments beyond the video length are dropped.
  • Transcript store syncs segment times to compacted timeline after deletion, so subsequent edits target the correct time ranges.
  • Transcript auto-restores from existing subtitle text elements when reopening a project. Clears automatically when all video is deleted.
  • Timeline track labels panel now scrolls vertically with synced scroll. Playhead line extends through all tracks including scrolled-off ones.
  • Tab bar scroll arrows — up/down buttons appear when there are more tabs than visible, with smooth scrolling.
  • Track icon tooltips — hover over any track icon to see the track name. Track labels are editable inline.
  • Quick actions tooltips moved to top to avoid being cut off by the timeline.
  • Landing page redesigned — cinematic hero with parallax scroll, SVG icons for features, animated accent lines, gradient text, stats row, terminal-style CTA.
  • Footer updated with correct page links (Editor, Models, Roadmap, Changelog, Blog, Contributors, Brand, Privacy, Terms).
  • Privacy and Terms pages updated to reflect local-first, self-hosted architecture. No cloud uploads, no analytics, no tracking.
  • Roadmap rewritten with 15 items covering upstream foundation, fork, and all AI features added.
  • Contributors page now includes thank-you sections for OpenCut upstream project and the open-source community.

Fixes:

  • Hydration mismatch in ThemeToggle — added mounted state to avoid server/client text mismatch.
  • TTS service Dockerfile — added build-essential and cargo for compiling native dependencies. Added COQUI_TOS_AGREED env var. Pinned torch<2.6 for weights_only compatibility.
  • Image service — changed rembg to rembg[cpu] to include onnxruntime dependency.
  • TTS generation now runs in a thread pool so health checks remain responsive during long generation.
  • Audio InputDisposedError — wrapped audio iterator in try-catch so deleting tracks during playback doesn't crash.
  • React hooks violation in QuickActionsBar — moved early return after all hook declarations.
  • File extension fix for transcription — creates a proper File with extension when mediaAsset has no filename.