Roadmap
OpenCut AI is a fork of OpenCut with AI wrapped around it. Here's what's been built and what's next. (last updated: March 22, 2026)
OpenCut — the foundation
OpenCut is the open-source video editor this project is forked from. It provides the core editor — multi-track timeline, real-time preview, text/sticker/effect tracks, keyboard shortcuts, and browser-based storage. Huge thanks to the OpenCut team and all upstream contributors.
Fork & AI integration
OpenCut AI is a fork that wraps AI capabilities around the core editor. The goal: make video editing accessible to non-editors by letting them edit videos through text, voice, and AI commands — all running locally.
AI transcription & text-based editing
Backend Whisper service for speech-to-text with word-level timestamps. Transcript panel with live word highlighting, auto-scroll, segment deletion, word-level cuts, and drag-to-reorder. Video auto-splits at segment boundaries. Delete text to cut the video; reorder text to rearrange scenes.
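To illustrate how deleting transcript text can translate into video cuts, here is a minimal sketch. It assumes each word carries start and end times in seconds; the `Word` class and `keep_ranges` function are hypothetical names for illustration, not the project's actual code.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds into the source video
    end: float    # seconds into the source video


def keep_ranges(words, deleted, duration):
    """Return the (start, end) spans of the source that remain after
    the words at the `deleted` indices are cut out."""
    cuts = sorted((words[i].start, words[i].end) for i in deleted)
    ranges = []
    cursor = 0.0
    for start, end in cuts:
        if start > cursor:
            # Keep everything between the previous cut and this one.
            ranges.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < duration:
        ranges.append((cursor, duration))
    return ranges


words = [Word("so", 0.0, 0.4), Word("um", 0.4, 0.9), Word("hello", 0.9, 1.5)]
print(keep_ranges(words, {1}, duration=2.0))
```

Deleting the middle word leaves two spans to keep, one on each side of the cut. Adjacent deleted words merge into a single cut because the cursor only ever moves forward.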
Voice cloning & TTS
Coqui XTTS v2 for multilingual voice generation with voice cloning. Upload a voice sample to clone any voice. Generate voiceovers per-segment from the transcript. Male/female voice selection. Auto-translation for multilingual voiceovers. Background task tracking in the UI.
Subtitles & multilingual support
One-click subtitle generation from the transcript, with subtitles positioned at the bottom of the screen. Toggle to add or remove subtitles. Auto-translation of the transcript to 12+ languages for multilingual subtitle tracks. Subtitle text elements sit on their own timeline track.
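As a rough sketch of turning transcript segments into subtitle cues, the example below formats segments as SubRip (SRT) text. The SRT format is chosen here only for illustration; the editor itself stores subtitles as text elements on a timeline track, and the `(start, end, text)` segment shape and function names are assumptions.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def to_srt(segments):
    """Build SRT text from (start, end, text) tuples, numbering cues from 1."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)


print(to_srt([(0.0, 1.5, "Hello"), (1.5, 3.0, "World")]))
```

A translated subtitle track would reuse the same timings and swap only the text of each segment.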
Filters, adjustments & effects
WebGL color-adjust shader with 12 filter presets (Grayscale, Sepia, Vintage, Warm, Cool, Vivid, etc.). Adjustment panel with brightness, contrast, saturation, temperature, and vignette sliders. Effects applied as timeline tracks scoped to selected clips.
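The per-pixel math behind brightness, contrast, and saturation sliders can be sketched as below. This is a common formulation, not necessarily what the project's WebGL shader does; the function name, parameter ranges, and the use of Rec. 709 luma weights are assumptions.

```python
def adjust(rgb, brightness=0.0, contrast=1.0, saturation=1.0):
    """Apply brightness, contrast, and saturation to one RGB pixel.

    Channels are floats in [0, 1]. With the default arguments the pixel
    is returned unchanged (up to floating-point rounding).
    """
    # Brightness: add a constant offset to every channel.
    r, g, b = (c + brightness for c in rgb)
    # Contrast: scale each channel around mid-gray (0.5).
    r, g, b = ((c - 0.5) * contrast + 0.5 for c in (r, g, b))
    # Saturation: blend between the pixel's luma and its color.
    luma = 0.2126 * r + 0.7152 * g + 0.0722 * b
    return tuple(
        min(1.0, max(0.0, luma + (c - luma) * saturation)) for c in (r, g, b)
    )


# A Grayscale-style preset is saturation 0: all channels collapse to the luma.
print(adjust((1.0, 0.0, 0.0), saturation=0.0))
```

Under this formulation a filter preset is just a named set of slider values, which is one way presets and manual adjustments could share a single shader.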
Audio separation & volume control
Auto-separate audio from video into its own track. Per-clip volume control with a draggable dB line on audio elements. Volume changes apply to playback in real time via a Web Audio GainNode. Split audio clips to set different volumes per section.
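A dB value from the volume line has to be converted to a linear multiplier before it can be used as a gain value. The standard amplitude conversion is gain = 10^(dB/20); the helper names below are hypothetical and the sketch is independent of the editor's code.

```python
import math


def db_to_gain(db):
    """Convert a volume in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def gain_to_db(gain):
    """Convert a linear amplitude multiplier back to decibels."""
    # A gain of zero is silence, which has no finite dB value.
    return 20 * math.log10(gain) if gain > 0 else float("-inf")


print(db_to_gain(0))    # unity gain
print(db_to_gain(-6))   # roughly half amplitude
```

In the browser, the resulting multiplier is the kind of value that would be assigned to a GainNode's `gain` parameter.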
Overlays & compositing
Picture-in-picture presets (corner positions, center). Split screen (left/right, top/bottom). Compositing presets (ghost overlay, dark overlay, light leak). Per-element opacity, 17 blend modes, and transform controls.
AI Studio & fact-checking
AI chat for brainstorming, script writing, and content planning. Script editing mode — rewrite transcript via AI prompts. Fact-check panel — extracts claims from transcript and verifies them via LLM. Fact-check overlays on the video timeline.
Quick actions & filler removal
Quick actions bar with one-click filler word detection, silence detection, subtitle toggle, and fact-check. Actions derive their state from the actual timeline, so they stay in sync across all panels. Filler words are shown with a dotted underline in the transcript.
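Filler and silence detection can both be derived from word-level timestamps. The sketch below is one simple approach under stated assumptions: words are `(text, start, end)` tuples, the `FILLERS` set is a made-up example list, and the 0.5 second silence threshold is arbitrary. The project's actual detection logic may differ.

```python
# Hypothetical filler list; a real list would likely be longer and per-language.
FILLERS = {"um", "uh", "erm", "hmm"}


def find_fillers(words):
    """Return indices of words that match the filler list,
    ignoring case and trailing punctuation."""
    return [
        i
        for i, (text, _start, _end) in enumerate(words)
        if text.lower().strip(".,!?") in FILLERS
    ]


def find_silences(words, min_gap=0.5):
    """Return (start, end) gaps between consecutive words
    that last at least `min_gap` seconds."""
    return [
        (prev_end, next_start)
        for (_, _, prev_end), (_, next_start, _) in zip(words, words[1:])
        if next_start - prev_end >= min_gap
    ]


words = [("So", 0.0, 0.3), ("um,", 0.3, 0.6), ("welcome", 1.4, 2.0)]
print(find_fillers(words))
print(find_silences(words))
```

Both functions return positions rather than modifying anything, so a one-click action could preview the matches in the transcript before any cut is applied.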
Self-hosting & infrastructure
Full Docker Compose setup with 9 services, including web, AI backend, Whisper, TTS, image, Ollama, Postgres, and Redis. Health monitoring via a proxied endpoint. RAM/GPU status in the header. API key management in Settings. Self-hosting cost documentation.
Export
MP4 (H.264) and WebM (VP9) export with quality presets (Low to Very High). Platform presets for YouTube, TikTok, Instagram, etc. Audio mixing with per-element volume. Progress bar with cancel support. Direct file download.
Advanced export & rendering
Subtitle burn-in during export. Background export that doesn't block the UI. Batch export for multiple formats. Custom resolution and bitrate controls. GPU-accelerated rendering.
More AI models
Model download manager in the UI. Support for StyleTTS 2, Bark, Piper, Fish Speech, Kokoro for TTS. Switchable Whisper model sizes (tiny to large-v3). Cloud API fallbacks (OpenAI, ElevenLabs) as optional alternatives.
Native app (mobile/desktop)
Native OpenCut AI apps for Mac, Windows, Linux, and iOS/Android.