Burn subtitles into video: the 2026 reference guide
Everything you need to decide how to add permanent captions to your video — formats, methods, quality, platform requirements. Vendor-neutral, fact-checked, no tool ads.
This is a reference guide for one specific question: how do you burn subtitles into a video so they show up everywhere, on every device, without depending on the player to support a subtitle track?
I am writing this because most of the guides that rank for this query online are either (a) thinly disguised promotions for a single tool or (b) outdated by two or three years. I want a guide that explains the actual decisions you have to make — format, method, quality, platform — and trusts you to pick the right tool yourself.
I do build a subtitle tool (BurnSub). I will mention it where it is genuinely relevant. I will also point at other tools, including direct competitors, where they are the better choice for a given job. Where I cite numbers or specs, I link the source.
What “burning subtitles” actually means
“Burning” or “hardcoding” subtitles means rendering the captions directly into the pixels of the video file. After burning, the captions are part of the image. They cannot be turned off. They cannot be edited. They cannot be replaced without re-encoding the entire video.
This is opposed to soft subtitles (also called closed captions or sidecar subtitles), where the captions live in a separate track inside the video container — or even in a separate file alongside the video. A media player reads the track and overlays the captions during playback. The viewer can usually toggle them on and off.
The two approaches have different use cases:
| | Hardcoded | Soft |
|---|---|---|
| Always visible | ✅ | Depends on player |
| Stripped by upload (TikTok, Reels) | Never | Often |
| Editable later | ❌ | ✅ |
| Searchable in player | ❌ | ✅ |
| Translatable post-hoc | ❌ | ✅ |
| File size impact | Slight increase | Negligible |
If you are uploading to a social platform that ignores, strips, or hides soft subtitle tracks by default (TikTok, Instagram Reels, X/Twitter, YouTube Shorts), hardcoding is the only way to guarantee captions show up. If you are publishing to a platform that supports soft tracks well (YouTube long-form, Vimeo, Netflix), soft is usually better.
A practical rule: short-form social = hardcode, long-form distribution = soft, archival = both.
Subtitle file formats: SRT, VTT, ASS
If you already have a subtitle file (rather than auto-generating from speech), it almost certainly comes in one of three formats. Pick the right one before you start.
SRT (SubRip Subtitle)
The simplest and oldest format. Plain text with timecodes:
1
00:00:01,200 --> 00:00:04,800
Hello and welcome.

2
00:00:05,000 --> 00:00:08,500
This is the second caption.
That is essentially the whole format: a cue number, a timing line, the text, and a blank line between cues. No positioning, no language metadata, no reliable styling. SRT is supported by virtually every player and platform. If you are not sure what format to use, use SRT.
VTT (WebVTT)
The modern web-native successor to SRT. It supports cue positioning, styling hints, and CSS:
WEBVTT

00:00:01.200 --> 00:00:04.800 line:80% align:center
Hello and welcome.

00:00:05.000 --> 00:00:08.500
This is the second caption.
VTT is the format HTML5 video players expect natively (<track kind="subtitles" src="captions.vtt">). YouTube, Vimeo, and most streaming platforms accept it. The cue positioning syntax is useful if you want captions to appear in specific places.
Reference: WebVTT specification at W3C.
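Because SRT and VTT are so close, converting between them is mostly mechanical. Here is a minimal sketch in Python, assuming well-formed SRT input. The name `srt_to_vtt` is my own, chosen for illustration. It only adds the header and swaps the millisecond separator, so styling tags and positioning are left untouched.

```python
import re

# Matches an SRT timing line such as "00:00:01,200 --> 00:00:04,800".
_TIMING = re.compile(
    r"^(\d{2,}:\d{2}:\d{2}),(\d{3}) --> (\d{2,}:\d{2}:\d{2}),(\d{3})\s*$"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT text to WebVTT (hypothetical helper, not a full parser)."""
    out = ["WEBVTT", ""]  # VTT needs the header followed by a blank line
    # Drop a leading byte-order mark and surrounding whitespace, if any.
    for line in srt_text.lstrip("\ufeff").strip().splitlines():
        match = _TIMING.match(line)
        if match:
            # Only timing lines change: comma becomes dot before the millis.
            start, start_ms, end, end_ms = match.groups()
            line = f"{start}.{start_ms} --> {end}.{end_ms}"
        out.append(line)
    return "\n".join(out) + "\n"
```

The numeric cue counters from the SRT file are kept as they are, since WebVTT allows an identifier line before each timing line. Commas inside caption text are not affected, because only lines that match the timing pattern are rewritten.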
ASS / SSA (Advanced SubStation Alpha)
A heavyweight format originally from the anime fansub community. It supports fonts, colors, drop shadows, karaoke timing, sprites, and motion. If you have seen a fan-translated anime episode with elaborate caption animations, that is ASS.
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
Dialogue: 0,0:00:01.20,0:00:04.80,Default,,0,0,0,,Hello and welcome.
ASS has features no other subtitle format has, at the cost of being significantly more complex. Most online burning tools support reading ASS but only some preserve the styling. If your source is ASS and you want the styling preserved, check the documentation of whichever tool you use.
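One practical difference that trips people up when converting to ASS is the timestamp: ASS uses a single-digit hour and hundredths of a second, where SRT uses milliseconds. The following Python sketch shows that conversion under simple assumptions. The helper names `srt_time_to_ass` and `dialogue_line` are hypothetical, and the output uses the same field order as the Format line in the example above.

```python
def srt_time_to_ass(ts: str) -> str:
    """Turn '00:00:01,200' into '0:00:01.20' (illustrative helper)."""
    hms, millis = ts.split(",")
    hours, minutes, seconds = hms.split(":")
    centis = int(millis) // 10  # ASS keeps hundredths, so the last digit is dropped
    return f"{int(hours)}:{minutes}:{seconds}.{centis:02d}"


def dialogue_line(start: str, end: str, text: str, style: str = "Default") -> str:
    """Build one ASS Dialogue event from SRT-style timestamps."""
    # ASS marks a hard line break inside a cue with \N.
    ass_text = text.replace("\n", r"\N")
    return (
        f"Dialogue: 0,{srt_time_to_ass(start)},{srt_time_to_ass(end)},"
        f"{style},,0,0,0,,{ass_text}"
    )
```

Calling `dialogue_line("00:00:01,200", "00:00:04,800", "Hello and welcome.")` reproduces the Dialogue line shown in the example. A complete ASS file also needs its script info and style sections, which this sketch does not generate.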
Which format should you use?
| Source | Output target | Use |
|---|---|---|
| Auto-generated from speech | Any | The tool that generates them usually outputs SRT or VTT. Either works. |
| Existing soft-subbed video | TikTok / Reels / Shorts | Extract the existing track (any format), burn it in. |
| Karaoke or anime fansub | Anime fansub distribution | ASS — anything else loses information. |
| Translated subs from a service | Most platforms | SRT is the safe default. |
| YouTube auto-captions export | YouTube hardcoded re-upload | YouTube exports VTT. Use it directly. |
The four methods to burn subtitles
There are four real ways to burn subtitles into a video in 2026. Each has a different trade-off profile.
Method 1: Desktop FFmpeg (free, technical)
The gold standard for command-line video work. FFmpeg has a subtitles filter that burns SRT, VTT, or ASS into a video.
ffmpeg -i input.mp4 -vf subtitles=captions.srt output.mp4
This works on every major operating system, handles all the common text subtitle formats, and produces output on par with professional video software. It does require an FFmpeg build that includes libass, which the subtitles filter depends on. The downside is the learning curve. If you do not already use the terminal, this is a non-starter.
Reference: FFmpeg subtitles filter documentation.
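If you burn subtitles regularly, it can help to wrap the command in a small script. Below is a minimal sketch in Python, assuming `ffmpeg` is on your PATH and was built with libass. The function names are my own. The subtitle path is passed to the filter as-is, so this assumes a simple file name without colons, quotes, or backslashes, which would need filter-level escaping.

```python
import subprocess


def burn_command(video: str, subs: str, output: str) -> list:
    """Build the ffmpeg argument list for a basic subtitle burn."""
    return [
        "ffmpeg",
        "-i", video,
        "-vf", f"subtitles={subs}",  # render the captions into the frames
        "-c:a", "copy",              # audio is unchanged, so copy it as-is
        output,
    ]


def burn(video: str, subs: str, output: str) -> None:
    """Run the burn and raise CalledProcessError if ffmpeg fails."""
    subprocess.run(burn_command(video, subs, output), check=True)
```

Passing the arguments as a list avoids shell quoting problems with spaces in file names. Copying the audio stream means only the video is re-encoded, which is the part that actually changes.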
Method 2: Desktop GUI app (free, easy, install required)
Tools like HandBrake (free, open source) or commercial editors like DaVinci Resolve and Adobe Premiere can burn subtitles via a user interface. HandBrake is the most accessible: open the source file, add a subtitle track, check “Burned In,” start the encode.
Pros: free, GUI, no upload, identical quality to FFmpeg. Cons: requires installation (a few hundred MB), may not run on Chromebooks or restricted work machines.
Method 3: Server-based online tools (free with limits, no install)
Tools like Veed.io, Kapwing, Clideo, HappyScribe, and dozens of smaller services let you upload a video to their server, add subtitles in a browser UI, and download the result.
Pros: no install, often integrated auto-captioning. Cons (consistent across this category):
- Upload required — files of any meaningful size take minutes to upload.
- Watermarks on the free tier of most.
- File size caps (250 MB is common on free tiers).
- Privacy concern — your video sits on their server, sometimes indefinitely. For confidential client work, this is a non-starter.
- Subscription pressure — many features behind a paywall.
If you are uploading a short, non-sensitive clip and the tool’s free tier covers your needs, this category is fine.
Method 4: Browser-local tools (free, no install, no upload)
A newer category, enabled by browser APIs that did not exist a few years ago (WebCodecs, WebGPU, transformers.js for in-browser AI). The video stays on your device throughout. The browser uses your machine’s hardware to decode, render, and re-encode. There is no upload step.
Examples in this category include VideoToFrames, the BurnSub tool I built (burnsub.com), and a small but growing number of others.
Pros: instant (no upload), private (file never leaves your device), no watermark on most, no file size cap. Cons: requires a modern browser (Chrome 94+, Edge 94+, Safari 17+, Firefox 147+). Older browsers cannot use the underlying APIs.
For the technical reason browser-local tools differ in speed from each other, see the WebCodecs vs ffmpeg.wasm post — tools built on hardware-accelerated WebCodecs run roughly an order of magnitude faster than tools built on software-only ffmpeg.wasm, on the same hardware.
Quality considerations
Burning subtitles always involves re-encoding the video. Every re-encode loses some quality. You can minimize that loss but not eliminate it. Here is what affects output quality.
Codec choice
Most online tools default to H.264 in an MP4 container. This is the right default for compatibility — every device since approximately 2008 plays it, and every social platform accepts it.
H.265 (HEVC) is more efficient but has patchy support: Safari plays it, Chrome’s support depends on hardware decoding being available, and many older Android devices fall back to software decoding. Use H.265 only if you control playback (your own player, your own site).
AV1 is the future-facing choice for streaming services but its encoder support in browsers is limited as of 2026. For client-side burning, AV1 is currently rarely the right call.
Bitrate
The single biggest decision. A higher bitrate means a larger file and better quality. A lower bitrate means a smaller file and visible compression artifacts.
Rough guidelines for H.264 output (1080p unless the row says 4K):
| Use case | Bitrate target |
|---|---|
| Social media re-upload (will be re-compressed anyway) | 5–8 Mbps |
| Personal archive / direct playback | 8–15 Mbps |
| Source-quality preservation | Match input bitrate |
| 4K social media | 25–40 Mbps |
| 4K archive | 50–80 Mbps |
These are starting points, not laws. Every encoder is different.
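To see what a bitrate target means for file size, the arithmetic is simply bitrate times duration, divided by eight to go from bits to bytes. Here is a rough sketch in Python. The function name and the 128 kbps audio default are my assumptions, and container overhead is ignored.

```python
def estimated_size_mb(video_mbps: float, duration_s: float,
                      audio_kbps: float = 128.0) -> float:
    """Approximate output size in megabytes (1 MB = 1,000,000 bytes)."""
    total_bits = (video_mbps * 1_000_000 + audio_kbps * 1_000) * duration_s
    return total_bits / 8 / 1_000_000


def max_video_mbps(size_limit_mb: float, duration_s: float,
                   audio_kbps: float = 128.0) -> float:
    """Highest video bitrate that still fits under a size limit."""
    total_mbps = size_limit_mb * 8 / duration_s
    return total_mbps - audio_kbps / 1_000
```

For example, a 60-second clip at 8 Mbps with no audio comes to about 60 MB. The second function runs the same arithmetic in reverse, which is useful when a tool or platform caps file size.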
Color and dynamic range
This is where most browser tools fall short of professional desktop tools. Your input may be HDR (high dynamic range), wide color gamut, or 10-bit. Most browser pipelines (including BurnSub) currently re-encode to standard dynamic range, 8-bit, BT.709 color. The result is fine for social media but is a downgrade from a true source.
If you are working with HDR or 10-bit source and you care about preserving it, use desktop FFmpeg or a professional editor.
Subtitle anti-aliasing
A small detail that matters more than people realize. Captions are text. Text rendered on top of a video should be anti-aliased so the edges blend with the background. Cheap implementations skip this and produce captions with pixelated edges. Quality implementations render the text at 2x or 4x the output resolution and downsample, which gives clean edges.
If you can preview the output before downloading, look at the caption edges at 100% zoom. Pixelated edges are a sign the tool cut corners.
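The effect of rendering at a higher resolution and downsampling can be shown with a toy example. The Python sketch below is a one-dimensional illustration of supersampling, not how any particular tool renders text. It measures how much of each output pixel is covered by a shape whose edge falls in the middle of a pixel. The function name and parameters are hypothetical.

```python
def edge_coverage(width_px: int, edge_x: float, factor: int) -> list:
    """Coverage per output pixel for a shape filling [0, edge_x).

    Each pixel is sampled `factor` times at evenly spaced positions and the
    hits are averaged, which is what render-large-then-downsample amounts to.
    """
    row = []
    for px in range(width_px):
        hits = 0
        for k in range(factor):
            sample_x = px + (k + 0.5) / factor  # centre of the k-th sub-sample
            if sample_x < edge_x:
                hits += 1
        row.append(hits / factor)
    return row
```

With one sample per pixel, an edge at 1.5 gives `[1.0, 0.0, 0.0]`: every pixel is fully on or fully off, which is the pixelated look. With two samples per pixel, the same edge gives `[1.0, 0.5, 0.0]`, and that half-covered pixel is what makes the edge look smooth.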
Platform-specific requirements
Each social platform has different ingestion rules. Hardcoding into your output is the safest way to comply, because soft subtitle tracks are stripped or ignored on most short-form platforms.
TikTok
- Aspect: 9:16 vertical (1080×1920 recommended)
- Max duration: 10 minutes
- Caption recommendation: large, high-contrast, bottom-positioned with safe area above the like/share UI
- Soft subtitle tracks: not supported. Hardcode.
Reference: TikTok video specifications.
YouTube Shorts
- Aspect: 9:16 vertical
- Max duration: 3 minutes (raised from 60 seconds in October 2024)
- Caption recommendation: bold, often with a thick black outline for readability against any background
- Soft subtitles: YouTube supports them on Shorts, but they are off by default. For Shorts specifically, hardcoding is usually better.
Instagram Reels
- Aspect: 9:16 vertical
- Max duration: 90 seconds (some accounts have higher limits)
- Caption recommendation: minimal, often centered, designer-aesthetic style. The audience skews design-conscious.
- Soft subtitles: not supported. Hardcode.
YouTube long-form
- Aspect: 16:9 horizontal (or 9:16 for vertical content)
- Max duration: 12 hours
- Caption recommendation: depends on content. Long-form tutorials benefit from soft (so users can disable) plus optional hardcoded titles for emphasis.
- Soft subtitles: fully supported. Often the better choice.
X / Twitter
- Aspect: 16:9 or 1:1 most common; 9:16 supported
- Max duration: 2:20 standard, up to 10 minutes for premium
- Caption recommendation: high-contrast, often centered, sized for autoplay in feed
- Soft subtitles: not supported. Hardcode.
LinkedIn
- Aspect: 16:9 or 1:1 most common
- Max duration: 10 minutes
- Caption recommendation: professional, brand-safe styling. Heavy stroke / drop shadow does not fit the platform feel.
- Soft subtitles: supported but uncommon. Hardcoding is the most common pattern.
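If you export for several platforms, a pre-upload check can catch the obvious mismatches. The Python sketch below covers three of the platforms, using aspect and duration values copied from the lists above. Treat those values as illustrative and verify them against each platform's current documentation. `PlatformSpec`, `SPECS`, and `check_clip` are names I made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformSpec:
    aspects: tuple    # accepted (width, height) ratios
    max_seconds: int  # longest clip, per the lists in this guide


# Illustrative values only; platforms change these limits regularly.
SPECS = {
    "tiktok": PlatformSpec(((9, 16),), 10 * 60),
    "reels": PlatformSpec(((9, 16),), 90),
    "x": PlatformSpec(((16, 9), (1, 1), (9, 16)), 2 * 60 + 20),
}


def check_clip(platform: str, width: int, height: int, duration_s: float) -> list:
    """Return a list of problems; an empty list means the clip fits."""
    spec = SPECS[platform]
    problems = []
    # Cross-multiply so 1080x1920 matches 9:16 without float rounding.
    if not any(width * h == height * w for w, h in spec.aspects):
        problems.append(f"aspect {width}x{height} not in {spec.aspects}")
    if duration_s > spec.max_seconds:
        problems.append(f"duration {duration_s}s exceeds {spec.max_seconds}s")
    return problems
```

A 1080×1920 clip of 45 seconds passes the TikTok check, while a 1920×1080 clip of two minutes fails the Reels check on both aspect and duration.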
Common mistakes that hurt your output
These are the patterns that consistently show up in low-quality burned-subtitle output:
- Caption font too small for the platform. Mobile users see Reels and Shorts on screens that show your caption at a real-world size of about 30 mm. If your font is sized for a desktop preview, it will be unreadable on the actual viewing device. Test on your phone, not your monitor.
- No safe area considered. TikTok and Reels overlay their own UI (likes, share button, username) in fixed positions. If your caption sits where the UI sits, viewers either cannot read it or cannot tap the UI.
- Captions positioned at the absolute bottom edge. Many players show a progress bar in the bottom 5%. Captions should start at least 8–10% above the bottom of the frame.
- No outline or background on the caption. A caption with no stroke and no background becomes invisible the moment the video background matches the caption color. Even subtle backgrounds (semi-transparent pill) prevent this.
- Burning in subtitles in two languages at once. If you genuinely need two languages, use both languages on the same line, or alternate cues. Burning two separate caption tracks on top of each other almost always looks like a mistake.
- Output file too large to upload. Most platforms have ingestion limits around 4 GB (sometimes higher). If your hardcoded output exceeds the limit, the upload may fail. Even below the limit the platform re-compresses the video, so bitrate far beyond the targets above is mostly wasted.
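The safe-area advice above is easy to turn into pixel numbers. Here is a small Python sketch that assumes a 10% bottom margin (the upper end of the range mentioned above) and a 5% side margin. The side margin is my own assumption and is not taken from any platform specification. The function name is hypothetical.

```python
def caption_box(frame_w: int, frame_h: int,
                bottom_margin: float = 0.10,
                side_margin: float = 0.05) -> tuple:
    """Return (left, right, bottom) pixel limits for caption placement.

    Captions should stay between `left` and `right` horizontally, and their
    lowest line should not extend below `bottom`.
    """
    left = round(frame_w * side_margin)
    right = frame_w - left
    bottom = frame_h - round(frame_h * bottom_margin)
    return left, right, bottom
```

For a 1080×1920 vertical frame this gives `(54, 1026, 1728)`, so the lowest caption line should end at or above pixel row 1728. Platform UI overlays differ, so previewing on a real phone is still the real test.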
How to verify a tool’s claims
The category of “free, no upload, no watermark” online subtitle tools attracts misleading marketing. Here is how to actually verify a tool’s claim before trusting it with confidential video.
Test for upload: Open the tool, then open your browser’s DevTools → Network tab. Drop your video in. Watch the network panel. If you see any outgoing request that contains video bytes (look for large POST requests or chunked uploads to the tool’s domain), the video is being uploaded. A truly local tool will show only static asset requests (CSS, JS, the AI model file) and no large video upload.
Test for watermark: Run the tool on a short test clip with the free tier. Download the output. View the entire clip. Look for any logo, text overlay, or color tint that did not exist in your source.
Test for signup wall: Note the exact moment in the workflow when an account is required. Some tools delay the signup ask until the download step — meaning you do all the work, then they hold the file hostage.
Test the actual privacy claim: For tools that claim local-only processing, the test above (DevTools network monitoring) is conclusive. Verify, do not trust.
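The network check described above can be partly automated. Browser DevTools can export the Network panel as a HAR file, which is JSON. The Python sketch below scans such a file for large outgoing request bodies. The function names and the 1 MB threshold are my assumptions. A browser may record a body size of -1 when it is unknown, so an empty result is a good sign rather than proof.

```python
import json


def load_har(path: str) -> dict:
    """Read a HAR file exported from the browser's Network panel."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def large_uploads(har: dict, min_bytes: int = 1_000_000) -> list:
    """List (method, url, body_size) for requests with a large body."""
    suspects = []
    for entry in har["log"]["entries"]:
        req = entry["request"]
        size = req.get("bodySize", -1)  # -1 means the size was not recorded
        if req["method"] in ("POST", "PUT", "PATCH") and size >= min_bytes:
            suspects.append((req["method"], req["url"], size))
    return suspects
```

Run it as `large_uploads(load_har("session.har"))` after recording a full session with the tool, from dropping the video in to downloading the result. Any entry it returns is worth inspecting by hand in the Network panel.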
Frequently asked questions
Can I burn subtitles into a video on my phone? Yes, in some cases. Mobile Safari on iOS 17+ and recent Chrome on Android support most of the same browser APIs as desktop. Performance is slower because of the smaller GPU, but a short clip is feasible. Desktop is still the better choice for anything longer than a couple of minutes.
Will burning subtitles reduce my video’s quality? Yes, slightly. Every re-encode loses some quality. Choosing a higher output bitrate minimizes the loss. There is no way to burn subtitles without re-encoding.
Can I remove burned-in subtitles later? No, not cleanly. They are part of the image. Tools that claim to “remove” hardcoded subtitles use AI inpainting to fill in the pixels behind the text, but the result is always softer / blurrier than the original. Always keep your source file as a backup before burning.
What’s the difference between “burning” and “hardcoding” subtitles? Nothing. The two terms mean the same thing — rendering subtitles permanently into the video frames.
Do I need to know what codec or bitrate to use? For social media, sensible defaults work fine: H.264, 5–8 Mbps for 1080p. Most tools default to these. Only customize if you have a specific reason.
Why are some online subtitle tools so slow even for short clips? Two reasons: (1) the upload to their server is the bottleneck, not the encoding itself, and (2) tools built on ffmpeg.wasm run the encoder in WebAssembly, which is several times slower than the same encoder running natively. See the WebCodecs vs ffmpeg.wasm post for the technical details.
How long do hardcoded captions take to render? Depends on the tool and your machine. On a modern desktop with hardware encoding, real-time or faster is typical (a 60-second clip renders in 30–60 seconds). On a server-based tool, total time is dominated by upload, which can be many minutes for 4K content.
When to choose what — a decision tree
If you want to skip the analysis and just have a decision, here is a working tree:
- Is your video confidential or client work? → Use a desktop tool (FFmpeg, HandBrake) or a verified browser-local tool. Avoid uploaders.
- Is your output going to TikTok, Reels, Shorts, or X? → Hardcode. Soft tracks are stripped or ignored on these platforms.
- Is your output going to YouTube long-form, Vimeo, or your own player? → Use soft subtitles (SRT or VTT track). Better accessibility, better user control.
- Do you have FFmpeg / HandBrake installed and are comfortable with them? → Use them. They produce the best quality output and you control every parameter.
- Do you want a one-click solution with no install? → A browser-local tool (BurnSub, VideoToFrames) avoids the upload and the watermark. A server-based tool (Veed, Kapwing, Clideo) has more polish but the upload and watermark trade-offs.
That is the entire decision. The category is not magic. Pick the method that fits your specific job, ignore the marketing, and verify the privacy claims yourself.
A note on the bias of this guide. I run BurnSub, which is in the browser-local category. That bias is visible in places. Where I can name a category that is genuinely a better fit than BurnSub — desktop FFmpeg for long-form or color-graded work, soft subtitle tracks for YouTube long-form — I have done so. The goal of this guide is to give you a working decision, not to funnel you toward any single tool.