Open source · Runs locally · No cloud

Video editing, reimagined by AI.

Turn long podcasts into viral clips with multi-speaker detection, word-pop subtitles, auto-reframe for Shorts, and brand kits. 22 Indian languages are available via Sarvam AI. The core models run locally.

10 AI Services
30+ Languages
$0 API Cost
100% Local

AI-powered editing for creators who ship

Transcription in 30+ languages, multi-speaker detection, viral clip finding, voice cloning, word-pop subtitles, auto-reframe, brand kits — with 22 Indian languages powered by Sarvam AI.

Core AI Engine

Edit by Text

Transcribe your video, then edit it like a document. Delete a sentence and the video cuts itself. Drag to reorder scenes.
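A minimal sketch of how text-based cutting could work, assuming each transcribed word carries start and end timestamps. The `Word` type and `keep_ranges` function are hypothetical names for illustration, not the project's actual API. Deleting words in the transcript leaves a set of time ranges to keep, and runs of consecutive kept words collapse into one segment.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


def keep_ranges(words, deleted):
    """Return (start, end) ranges covering every word whose index is not in `deleted`.

    Consecutive kept words are merged into one range, so natural pauses
    between them survive; a deleted word always splits the range.
    """
    ranges = []
    prev_kept = False
    for i, w in enumerate(words):
        if i in deleted:
            prev_kept = False
            continue
        if prev_kept:
            # Extend the current segment to the end of this word.
            ranges[-1] = (ranges[-1][0], w.end)
        else:
            ranges.append((w.start, w.end))
        prev_kept = True
    return ranges


words = [
    Word("hello", 0.0, 0.4),
    Word("um", 0.5, 0.7),
    Word("world", 0.8, 1.2),
    Word("again", 1.3, 1.8),
]
segments = keep_ranges(words, deleted={1})  # remove the filler word "um"
```

Each resulting range would then become one clip on the timeline, played back to back.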

Multi-Speaker Detection (New)

Auto-detect who's talking with pyannote AI. Each speaker gets a label you can rename. Video auto-cuts at speaker boundaries.
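As an illustration of cutting at speaker boundaries, the sketch below assumes the diarization step yields a time-sorted list of `(speaker, start, end)` turns. That tuple layout and the `speaker_cuts` name are assumptions made for this example, not pyannote's output format or the editor's API.

```python
def speaker_cuts(turns):
    """Return the timestamps where the active speaker changes.

    `turns` is a list of (speaker, start, end) tuples sorted by start time.
    Back-to-back turns by the same speaker produce no cut.
    """
    cuts = []
    for prev, cur in zip(turns, turns[1:]):
        if cur[0] != prev[0]:
            cuts.append(cur[1])  # cut where the new speaker starts
    return cuts


turns = [
    ("SPEAKER_00", 0.0, 5.0),
    ("SPEAKER_00", 5.0, 9.0),
    ("SPEAKER_01", 9.5, 14.0),
    ("SPEAKER_00", 14.0, 20.0),
]
cuts = speaker_cuts(turns)
```

Renaming a speaker only changes the label, so the cut points stay the same.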

Voice Cloning

Clone any voice from a 6-second sample. Generate voiceovers in the cloned voice in any language XTTS v2 supports.

22 Indian Languages (New)

Hindi, Tamil, Telugu, Kannada, Bengali, Malayalam, and 16 more Indian languages. Transcribe, translate, and generate TTS via Sarvam AI with your API key.

100% Local

The core models run on your machine. No cloud uploads, no subscriptions, and no API keys unless you opt in to Sarvam AI. Your footage stays private.

Podcast Clip Studio

Podcast Clip Generator (New)

AI finds the most viral-worthy 30-60 second moments from long podcasts. Scored by engagement potential, emotional peaks, and shareability.
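One way to picture the ranking step is a weighted sum over the three signals named above, applied only to candidates inside the 30 to 60 second window. The weights, the dictionary keys, and both function names are hypothetical, since the page does not say how the real scorer combines its signals.

```python
def clip_score(engagement, emotion, shareability, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of three sub-scores, each expected in the range 0..1.

    The default weights are illustrative assumptions, not the tool's values.
    """
    w_eng, w_emo, w_share = weights
    return w_eng * engagement + w_emo * emotion + w_share * shareability


def rank_clips(candidates, min_len=30.0, max_len=60.0):
    """Drop candidates outside the length window, then sort best-first."""
    in_window = [c for c in candidates if min_len <= c["end"] - c["start"] <= max_len]
    return sorted(
        in_window,
        key=lambda c: clip_score(c["engagement"], c["emotion"], c["shareability"]),
        reverse=True,
    )


candidates = [
    {"start": 120.0, "end": 165.0, "engagement": 0.9, "emotion": 0.7, "shareability": 0.8},
    # Only 20 seconds long, so it is filtered out despite perfect sub-scores.
    {"start": 300.0, "end": 320.0, "engagement": 1.0, "emotion": 1.0, "shareability": 1.0},
    {"start": 610.0, "end": 655.0, "engagement": 0.4, "emotion": 0.9, "shareability": 0.5},
]
ranked = rank_clips(candidates)
```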

Word-Pop Karaoke Subs (New)

Hormozi-style subtitles where each word pops up when spoken and scales larger. Keyword highlighting with accent colors. 4 preset styles.
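Word-pop timing comes down to knowing which word is being spoken at the current playback time. The sketch below assumes word-level timestamps are available as two parallel sorted lists. The `active_word` helper is a hypothetical name, and the real renderer may work differently.

```python
from bisect import bisect_right


def active_word(starts, ends, t):
    """Return the index of the word spoken at time `t`, or None during silence.

    `starts` must be sorted ascending; `ends[i]` is the end time of word i.
    """
    i = bisect_right(starts, t) - 1  # last word that started at or before t
    if i >= 0 and t < ends[i]:
        return i
    return None


starts = [0.0, 0.5, 0.9]
ends = [0.4, 0.8, 1.3]
current = active_word(starts, ends, 0.6)   # inside the second word
silence = active_word(starts, ends, 0.45)  # gap between first and second word
```

The renderer would scale up and highlight the word at the returned index, and draw nothing extra when the result is `None`.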

Auto-Reframe 9:16 (New)

Face-tracking crop that converts 16:9 to 9:16 for TikTok, Reels, and Shorts. Smooth pan transitions between speakers using MediaPipe.
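The geometry behind a face-tracking crop can be sketched in a few lines. This assumes a face detector has already produced the horizontal center of the active speaker's face in pixels. `crop_9x16` and `smooth_x` are hypothetical helpers, and the exponential smoothing is one common way to get a gentle pan, not necessarily the one used here.

```python
def crop_9x16(frame_w, frame_h, face_cx):
    """Return (x, y, w, h) for a full-height 9:16 crop centered on `face_cx`.

    The crop width is rounded down to an even number (many video encoders
    require even dimensions) and the window is clamped to the frame.
    """
    crop_w = min((frame_h * 9 // 16) // 2 * 2, frame_w)
    x = int(face_cx) - crop_w // 2
    x = max(0, min(x, frame_w - crop_w))
    return x, 0, crop_w, frame_h


def smooth_x(prev_x, target_x, alpha=0.2):
    """Move a fraction `alpha` of the way toward the target on each frame."""
    return prev_x + alpha * (target_x - prev_x)


centered = crop_9x16(1920, 1080, face_cx=960)
left_edge = crop_9x16(1920, 1080, face_cx=100)
right_edge = crop_9x16(1920, 1080, face_cx=1900)
```

Feeding each frame's target `x` through `smooth_x` keeps the window from jumping when the speaker changes.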

Brand Kit (New)

Define your brand once — colors, fonts, logo, social handles, CTA. One-click intro/outro cards, lower thirds, and persistent watermarks.

And Even More

AI Question Cards (New)

AI generates topic question slides from your transcript. Overlaid on video with transparent or themed backgrounds. Subtitles auto-hide during cards.

Emotion Detection (New)

SpeechBrain AI detects emotional peaks in your audio — excitement, calm, neutral. Used to boost clip scores and find the most impactful moments.

Speed Control (New)

0.1x slow-mo to 4x fast-forward on any clip. 8 preset speeds plus a continuous slider. Audio pitch adjusts automatically.
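For a sense of what the speed range means on the timeline, a clip's new length is its source length divided by the speed factor. The helper below is a hypothetical sketch that clamps to the stated 0.1x to 4x range. It does not model the audio processing.

```python
MIN_SPEED, MAX_SPEED = 0.1, 4.0  # range stated above


def retimed_duration(duration_s, speed):
    """Return the timeline length of a clip played at `speed`, clamped to the allowed range."""
    speed = max(MIN_SPEED, min(speed, MAX_SPEED))
    return duration_s / speed


fast = retimed_duration(10.0, 4.0)  # a 10 s clip at 4x
slow = retimed_duration(10.0, 0.1)  # the same clip at 0.1x
```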

AI Transcription

Whisper + Sarvam AI speech-to-text with word-level timestamps. 30+ languages including 22 Indian regional languages. Runs locally or via Sarvam cloud.

From raw footage to final cut

A fork of OpenCut with AI added on top. Import, transcribe, translate, enhance, and export. All running locally on your machine.

Step 01

Drop your footage

Drag any video or audio file into the editor. Pick 16:9 for YouTube, 9:16 for TikTok/Reels, or 1:1 for Instagram. Start editing instantly.

Multi-format · Local-first · No upload

Step 02

Transcribe + detect speakers

One click to transcribe with Whisper and auto-detect speakers with pyannote. Each speaker gets a name you can edit. Emotion peaks are detected simultaneously.

Multi-speaker · Word-level · Emotion AI

Step 03

Find the best clips

AI scores every moment for viral potential — hot takes, surprising facts, emotional peaks, humor. Get ranked clip candidates with one-click apply.

Clip scoring · Smart finder · LLM-powered

Step 04

Add word-pop subtitles

Hormozi-style karaoke subtitles where each word pops in when spoken. AI highlights keywords in accent colors. 4 preset styles. No subtitles during question cards.

Word-pop · Keyword colors · Card-aware

Step 05

Brand it and reframe

Apply your brand kit — logo, colors, intro/outro, CTA. Auto-reframe 16:9 to 9:16 with face tracking. Speed up or slow down any clip from 0.1x to 4x.

Brand kit · Auto-reframe · Speed control

Step 06

Export for any platform

One-click export presets for YouTube, TikTok, Reels, Instagram, and more. Everything composited — subs, cards, brand overlays, speed changes. No watermark.

Platform presets · Background export · Auto-save

No per-seat pricing

Self-host for the cost of a VPS

No usage credits. No cloud lock-in. Pay only for the server and use it as much as you want, with as many users as you need.

Local Machine

$0 forever

Run on your own computer

Any laptop with 8+ GB RAM

  • Full video editor
  • AI transcription (CPU)
  • Voice cloning & TTS
  • Text-based editing
  • Filler word removal
  • All filters & effects

No GPU = no image generation

Starter VPS

$20/month

Light editing & transcription

4 vCPU, 8 GB RAM, CPU-only

  • Everything in Local
  • Remote access from any device
  • Always-on availability
  • Shareable with team
  • Transcription + TTS
  • LLM commands

Hetzner, DigitalOcean, Vultr

Most Popular

Standard VPS

$50/month

Full workflow, no GPU

4 vCPU, 16 GB RAM

  • Everything in Starter
  • Faster transcription
  • Larger LLM models (7B+)
  • Higher-quality voice cloning
  • Multiple concurrent users
  • Full TTS generation

Best value for most teams

GPU Server

$150/month

All AI features at speed

8 vCPU, 32 GB RAM, NVIDIA T4

  • Everything in Standard
  • 10x faster transcription
  • AI image generation
  • Real-time voice cloning
  • Large LLM models
  • Production-ready speed

AWS g4dn, RunPod, Lambda

Your footage.
Your machine.
Your rules.

Fork it. Install it. Self-host it. No cloud, no subscriptions, no limits.

$ git clone https://github.com/Ekaanth/OpenCut-AI.git
$ cd OpenCut-AI
$ docker compose up -d