Changelog

v2.5.1

Track all updates, improvements, and fixes to Based Subtitles.Building in public, one release at a time.

v2.5.1

Latest
UX Polish, Double-Transcription Fix & Cleanup
fixed

Double audio extraction / transcription triggered by clicking Transcribe twice — added ref guard in useTranscription to prevent concurrent calls

fixed

1-2 second idle gap between language modal closing and processing overlay appearing — status now set synchronously in the same render batch as modal close

changed

Aspect ratio tabs now use rectangle icons (horizontal/vertical) instead of plain text, with labels on desktop

changed

Mobile Styling/Edit tabs restyled with primary color active state and bolder font to establish clear visual hierarchy above the Word/Phrase sub-tabs

added

Bottom fade shadow on mobile scroll area to hint that more content is available below

removed

Removed drag-to-pan and fit/crop toggle for portrait-in-landscape — face tracking handles intelligent cropping; cleaned up all related code from video-upload, main-app, and export hook

fixed

Font subset error from fonttrio install — removed invalid 'menu' subset from Jost, Rubik, and Roboto_Mono declarations in layout.tsx

changed

Disabled React Strict Mode to prevent dev-only double-mount side effects (double worker init, double audio extraction)

v2.5.0

Mobile-First Redesign & shadcn Theming
changed

Full mobile-first layout: viewport-locked page with no page scroll, inline Styling/Edit tabs below the download button with internal scrolling, and an expand button for full-screen editing

changed

Converted entire codebase from hardcoded Tailwind colors to semantic theme variables (bg-primary, text-foreground, bg-muted, etc.) so shadcn presets apply correctly

changed

Moved custom utility functions out of lib/utils.ts into lib/transcript-utils.ts so shadcn preset installs no longer overwrite app code

changed

All mobile controls now use shadcn Button component with proper variants and sizes for consistent touch targets and interaction states

changed

Regenerate Subs button (formerly Change Language) now opens the language modal without auto-starting processing, letting users pick a different language first

added

Confirmation dialog (shadcn AlertDialog) when uploading a new video to prevent accidental loss of current subtitles and styling

added

Confirmation dialog for background removal on videos longer than 1 minute warning about processing time

added

Cancel button with live progress shown inline during background removal processing

added

Slim progress bar below controls during background removal on mobile

fixed

Mobile bottom sheets (Styling/Edit) no longer blink and fail to open — replaced dual-SheetContent pattern with separate Sheet components

fixed

Language selection modal now properly resets state when reopened, preventing stale Processing state from previous transcription

v2.4.0

Formal Subtitle Styling & Faster Export
added

New Formal subtitle preset using Playfair Display with white text, a subtle dark outline/shadow treatment, and stronger preset button contrast for light styles

added

Separate active-word recolor emphasis control with a configurable highlight color, supported in both preview and export rendering

changed

Default subtitle styling now starts with the Formal look, 3 max words per line, display-on-spoken disabled, active-word recolor enabled, and a -10px vertical offset

changed

Active-word emphasis is now split into independent resize and recolor controls so each effect can be toggled separately

added

Export quality selector — choose between Medium (smaller file, closer to source bitrate) and High Quality before downloading

changed

Export pipeline now uses OffscreenCanvas, alpha-free contexts, and StreamTarget for reduced memory usage and faster frame rendering

changed

Default export quality lowered from High to Medium so output file sizes match the original video more closely

changed

Face position analysis before 9:16 export now runs at 2fps instead of 5fps (2.5x faster) and shows a live progress percentage

changed

Mobile MP4 export now uses more conservative compatibility settings during the playback-freeze investigation, including explicit AVC settings on mobile, lighter mobile export caps, and on-screen export diagnostics for codec, MIME type, bitrate, and source-track details

changed

HEVC-source export startup now streams audio samples alongside video rendering and resolves the output MIME type asynchronously so the ongoing 0% stall investigation is no longer front-loaded on whole-file audio preparation

fixed

Canvas context state (transforms, composite ops) no longer resets every frame during compositing export — blur, foreground, and mask canvases are sized once and cleared instead of recreated

fixed

Subtitle chunk lookup during export uses binary search instead of linear scan, eliminating per-frame overhead on long videos with many subtitles

fixed

Subtitle preview text on mobile now scales relative to the video container width instead of the viewport, with a readability boost on small screens so text stays legible even on compact previews

fixed

Export subtitle positioning now matches the preview baseline, so vertical offset and default bottom placement render at the expected height in downloaded videos

fixed

Export frame extraction now clears the canvas every frame, falls back to the source video when a decoded sample is unavailable, and detects HEVC phone-camera sources so they can use sequential decoded-frame reads instead of sparse timestamp lookups during export

v2.3.0

Installable PWA & Home Screen Support
added

Installable Progressive Web App support — Based Subtitles can now be added to the home screen and launched in standalone app mode

added

Generated branded app icons using the existing BS monogram for Android, desktop install prompts, and Apple touch devices

added

Install banner — shows a native install prompt on supported browsers and Add to Home Screen instructions on iPhone/iPad Safari

changed

App Router metadata now includes a web app manifest, Apple web app settings, and service worker wiring required for installation

changed

Split subtitle controls are now always visible in the styling panel, so top/bottom and left/right layouts can be selected without depending on 3D depth mode visibility

fixed

Display-on-spoken subtitles in dynamic 3D preview and export now keep a stable layout — future words stay reserved in place and no longer cause earlier words to shift when they appear

fixed

Per-word editing chips now remain available for single-word phrase subtitles, so brief isolated words can still be selected and styled from the bottom editor bar

fixed

Split subtitle mode now preserves per-word emoji replacement and emoji overlay styling in both preview and exported video instead of dropping those overrides on split lines

v2.2.0

Vertical Offset, Sidebar Cleanup & Open Source
added

Vertical offset slider (±50 px) to fine-tune subtitle position above or below the default — applies to preview, canvas compositing, and exported video

fixed

Vertical offset now correctly applied during video export — previously the setting was ignored and subtitles always rendered at the default position

added

"Hide inactive" toggle in the transcript sidebar — hides both skipped segments (no audio) and hidden-subtitle segments (audio plays, no text) to keep the sidebar clean

changed

Project is now open source — source code available on GitHub at github.com/deifos/basedsubtitles

v2.1.0

Streaming Transcription & Long Video Support
added

Streaming transcription — partial results appear chunk by chunk as Whisper processes the audio instead of waiting for the full video to finish

added

Live subtitle preview updates in the transcript sidebar while transcription is still running

added

Non-blocking UI after the first chunk arrives — blocking overlay replaced by a slim in-header progress indicator so you can see partial subtitles in real time

added

Transcribing banner in the transcript sidebar (amber, with spinner and progress bar) pinned below the scroll area so it stays visible while reading results

added

Transcribing banner now shows the current timestamp being processed (e.g. Transcribing 0:42 · 73%) so you can track progress without scrolling

changed

Long video support is now reliable — transcription uses the same 30s / 5s stride chunking that the Whisper reference implementation uses, with stride-aware multi-chunk merging via _decode_asr

changed

Whisper worker now replicates the pipeline's internal _call_whisper loop (model.generate() per chunk + tokenizer._decode_asr() for merging) instead of calling the high-level pipeline function, giving identical accuracy with streaming output

fixed

Transcription was silently skipping sections of audio in videos longer than ~30 seconds — caused by sending the full audio in one call without stride-aware chunking

fixed

Multi-chunk merging was incorrectly concatenating chunks instead of filtering stride overlap regions — fixed by enabling return_timestamps so timestamp tokens are present for _decode_asr's stride filtering logic

fixed

Subtitle file downloads used a generic filename (subtitles.srt) regardless of the source video — now uses the video filename as the base (e.g. my-clip.srt)

removed

Removed 'Add Subtitle' manual entry form from the transcript sidebar — no longer needed now that transcription accuracy covers the full video without gaps

v2.0.0

Split Subtitles & Display on Spoken
added

Split subtitle mode — phrase text is divided in two and placed around the person's head. Two modes: Above/Below (works on any ratio) and Left/Right (widescreen only)

added

Left/Right split positions text at eye level (~38% from top) with a gap on each side of the face so text has room to move as the person shifts

added

Per-phrase frozen face position for split subtitles — text is anchored to where the face was at the start of each phrase and stays fixed for its duration, avoiding distracting mid-phrase drift

added

Face tracking auto-starts when a split subtitle mode is selected and stops when set back to Off — no manual toggle required

added

Display on Spoken — all phrase words appear dim at phrase start and each word lights up to full brightness when spoken. Layout is stable from the first word (no shifting as words appear)

added

Display on Spoken works in all render paths: DOM preview, canvas compositing, and video export

added

Display on Spoken combined with Split Subtitle mode — each half renders words individually with dim/bright per word

changed

Split Subtitle section is always visible in the styling panel — no longer gated behind face tracking being active

changed

Applying a style preset no longer resets the font size — presets change color, border, and font family while preserving the user's chosen size

changed

Default style updated to green (#00FF41), medium font size, no background, and Display on Spoken enabled

v1.9.0

Face Tracking Export, Stroke Slider & Fixes
fixed

Face tracking was not applied during 9:16 video export — the buildExportTimeline function was missing from the useCallback dependency array, causing the export to always use a stale closure where it was undefined

fixed

Exported face tracking had a slight delay compared to the preview — added a 150ms look-ahead offset when sampling the face timeline during export to compensate for EMA smoothing phase lag

fixed

Export failed with odd-dimension error on 4K sources — 9:16 crop of 2160p height produces 1215px width (odd), which H.264 rejects. Dimensions are now rounded down to even numbers

fixed

Font stroke/border was filling inside the text at higher widths — canvas strokeText draws centered on the glyph outline so half went inward. Now draws stroke at 2x width before fill, and CSS preview uses paint-order: stroke fill

changed

Border width control replaced with a 0–20px slider instead of a dropdown with only 4 options (None/Thin/Medium/Thick)

changed

Default subtitle style now uses 10px white border stroke for better contrast and readability on any background

removed

Removed text fade-in effect (letter-by-letter reveal) — timing was unreliable, words would disappear before fully revealing on short chunks

v1.8.2

Mobile Export Quality & Color Picker Performance
fixed

Exported video was low quality on mobile — the export pipeline creates multiple full-resolution offscreen canvases that exceeded mobile browser memory limits, causing silent quality degradation. Export resolution is now capped at 1080p on mobile to stay within canvas memory budgets

fixed

Black video export on mobile browsers without WebCodecs decoding support — when canDecode() returned false, no video frames were drawn. Added a fallback that seeks the video element and draws each frame directly

fixed

Color pickers caused heavy lag while dragging — all 5 color inputs (per-word color, text color, background color, border color, solid background) fired onChange on every pixel movement, triggering full state updates and canvas re-renders 10–50 times per second. Now debounced with immediate visual feedback via local state

v1.8.1

Mobile Playback & Camera Fixes
fixed

Camera recording showed black screen on mobile — video element didn't exist in the DOM when the stream was attached; now re-attaches via useEffect when the preview mounts

fixed

Video not playing after scrubbing on mobile — pausing the video on drag start then calling play() on release failed because pointerup on range inputs isn't a trusted gesture on mobile; removed pause/resume entirely

fixed

Video freezing when scrubbing on mobile — was setting videoEl.currentTime on every drag frame, overwhelming the decoder; now only seeks once on pointer release (mediabunny pattern)

fixed

Subtitles showing stale position during seek drag — now calls onTimeUpdate during drag so subtitles and transcript stay in sync with the scrub position

fixed

Video not buffering fully on mobile — added preload="auto" so mobile browsers load the complete video instead of lazy-loading

changed

Replaced <input type="range"> seek bar with custom div-based progress bar using pointer events and setPointerCapture — eliminates all mobile range input quirks and provides reliable touch tracking even when finger drifts off the bar

changed

Progress bar fill and time display update via refs during drag (zero re-renders while scrubbing, Vercel React best practice: useRef for transient values)

changed

Added keyboard support (left/right arrows ±5s) and ARIA slider attributes to custom progress bar for accessibility

changed

Video timeupdate events are ignored during seek drag to prevent the video's stale position from overriding the user's scrub position

v1.8.0

Unified Word Editing & Min Words
added

Per-word emoji size slider — scale each word's emoji independently from 50% to 200% (appears in word style popover when an emoji is set)

added

Compact word style popover on mobile — collapsible icon tabs (Font, Size, Color, FX, Emoji) overlay the video without blocking subtitles or requiring scroll

changed

Word editing standardized to chip bar below video — works in all modes (3D on/off), replacing direct click-on-word in the preview

changed

Word chips restyled with dark background for better visibility

changed

Min words per line lowered from 3 to 1 — allows single-word-at-a-time subtitle display

changed

Emoji size is now per-word instead of global — moved from styling panel to the word style popover

fixed

Emoji overlay size was not applied during video export — now correctly uses per-word emojiScale in all render paths

v1.7.0

Per-Word Emoji Replace & Overlay
added

Per-word emoji replace — click any word and pick an emoji to replace the word text entirely with that emoji

added

Per-word emoji overlay — place an emoji above any word while keeping the text visible underneath

added

Emoji picker integrated into the per-word style popover with search support

added

Emoji replace and overlay render in DOM preview, canvas compositing (3D depth mode), and video export

added

Emoji size scales with the word's font size — enlarging a word also enlarges its emoji

fixed

Word style popover in 3D mode did not update when clicking a different word badge — popover now resets and shows the correct word's settings

added

Branding watermark — toggleable "basedsubs.getbasedapps.com" text in the bottom-left corner, rendered in preview and baked into exported videos

v1.6.0

Letter-by-Letter Text Fade In
added

Letter-by-letter text fade in — characters reveal left-to-right within each word with staggered timing, similar to VEED-style animated captions

added

"Text fade in" toggle in the styling panel — when off, text appears and disappears instantly with no fade effects

added

Letter-by-letter reveal works in DOM preview, canvas compositing (3D depth), and video export

changed

Word and chunk fade effects are now gated behind the text fade in toggle instead of always on

changed

Active word emphasis is now off by default

fixed

Mobile camera starting with black screen — autoPlay not reliable on mobile, now explicitly calls play() after attaching the stream

fixed

Flipping camera showed "camera in use by another app" — the useEffect re-fired openCamera on facingMode change, causing two getUserMedia calls to race for the same device

fixed

Camera defaulting to back camera on some Android devices — now uses exact facingMode constraint instead of a preference hint the browser can ignore

fixed

Mobile styling and edit panels could not be switched directly — had to close one before opening the other. Bottom bar buttons are now toggles with mutual exclusion

fixed

Mobile drawer panels cut off at the bottom — content now fills the full available height down to the navigation bar

fixed

Mobile styling panel couldn't scroll to the last items — scroll area now accounts for the fixed bottom navigation bar

fixed

Letter-by-letter text fade in now works in 3D depth mode — both behind and front text layers support per-character reveal and chunk fade-out

fixed

Mobile transcript panel height was capped at 384px (max-h-96) — removed the cap on mobile so the list fills the drawer

removed

Removed redundant "Subtitles behind person" toggle — its functionality is fully covered by the "Dynamic depth 3D" toggle

v1.5.0

Knockout Text Effect & New Font
added

Knockout text effect — per-word style option that makes the text shape reveal an inverted/negative version of the video behind it

added

Knockout toggle in the word style popover under a new 'Effect' section

added

Knockout renders in DOM preview (mix-blend-mode: difference), canvas compositing (3D depth mode), and video export

added

Lilita One font — thick, bold display font with a magazine-cover aesthetic, great for knockout effects

v1.4.1

Mobile Bug Fixes
fixed

Footer 'Powered by' row no longer overflows on small screens — items wrap and bullet separators hide on mobile

fixed

Video freezing on mobile after transcription — unhandled play() promise rejections from autoplay policy now caught on all play/pause controls

added

Toast notifications for background removal errors — shows a clear message when WebGPU/WASM is unavailable instead of silently failing

v1.4.0

Camera Recording, Custom Player & Bug Fixes
added

Camera recording — record video directly from your camera (front or back) with a 'Record video' button on the landing page

added

Records straight to MP4 using MediaBunny (H.264 + AAC) — no WebM conversion needed

added

Camera flip button to switch between front and back cameras on mobile

added

Review screen after recording with re-record and 'Use this video' options

added

Custom player controls (play/pause, seek bar, time, mute) shown consistently in all modes

added

Click anywhere on the video to play/pause

added

Umami analytics integration

added

Powered by Transformers.js and MediaBunny attribution with logos in the footer

changed

Removed native browser video controls — custom controls prevent subtitles from covering playback buttons

changed

Landing hero tagline updated from 'Free forever' to 'Free'

fixed

Selecting the 500MB model caused a blank screen — base model preload raced with the user's model choice, resolving the wrong download promise

fixed

Added load ID tracking so stale model 'ready' messages from previous downloads can't resolve the wrong promise

fixed

Cancelling the language modal on a fresh upload now returns to the landing page instead of showing a dead-end video screen

fixed

Subtitle text appearing/disappearing no longer causes layout jumps — placeholder space is always reserved

fixed

Word chip bar in compositing mode no longer causes layout shifts between phrases

v1.3.1

3D Depth Export Performance Fix
fixed

Video export with 3D depth enabled was re-running MODNet AI inference on every frame at 30fps (~1,800 calls per minute of video) — now uses pre-computed 5fps mask cache, reducing export time from ~15 minutes to under a minute

fixed

Export with 3D depth consumed ~2GB+ RAM from per-frame 8MB getImageData allocations and data copies to the AI worker — eliminated entirely by using cached masks

v1.3.0

Word Chip Bar & Auto 3D Depth
added

Word chip bar — clickable word buttons appear below the video when 3D depth mode is active, letting you select any word for per-word styling

added

Words with custom style overrides are highlighted with an amber tint in the chip bar

changed

3D depth mode now auto-enables when background removal starts — no extra toggle needed

changed

Per-word selection in 3D/bg-removal mode uses the reliable chip bar instead of canvas click coordinates

fixed

Clicking anywhere on the canvas in 3D mode no longer incorrectly triggers word editing

fixed

Hidden and disabled words can no longer be selected for editing

fixed

Major memory leak in video export — frame pixel data (~8MB) was allocated per-frame instead of reusing an offscreen canvas via GPU blit

fixed

AudioContext not closed on error during export, leaking audio memory

fixed

Reusable canvases and buffers from export were never released after completion, staying in memory indefinitely

changed

Font mapping table (24 entries) extracted to a single module-level constant — was duplicated in 5 functions and recreated per-frame during export

removed

Removed dead hitTestOnly prop from VideoCaption and unused renderDynamicWordToCanvas function

v1.2.0

Per-Word Styling & Preset Preview
added

Per-word custom styling — click any word in the video preview to override its font, size, and color independently

added

Word style popover with font family picker, size multiplier (50%–200%), color picker, and reset button

added

Selected word highlight with yellow outline in the video preview

added

Per-word overrides render correctly in both the live preview and exported video

changed

Preset buttons now render with their actual font — Bangers for Green, Permanent Marker for Gold, Outfit for Subtitle, Bebas Neue for Gamer

changed

Preset buttons show the correct font weight so you can preview the style before applying

v1.1.0

Landing Page Redesign & Performance
added

New landing page with hero section, feature highlights, how-it-works flow, local processing showcase, and call-to-action

added

Full-page drag-and-drop — drop a video anywhere on the landing page to get started

added

Drag-over visual feedback on the dropzone with amber highlight

added

Changelog page at /changelog with version history

added

"Buy me a coffee" floating button

added

Site footer matching getbasedapps design with version link

changed

Redesigned feature cards with playful visual representations — waveform bars, layered text, font samples, language globe

changed

Subtitle style presets now use dynamic fonts (Bangers, Permanent Marker, Bebas Neue, Outfit) instead of generic system fonts

changed

Default subtitle font size changed to Small for a cleaner look

changed

Subtitle styling panel now uses Outfit font with consistent rounded-lg buttons and amber accent colors

changed

Style presets now visible in dynamic (3D) mode for consistent styling

changed

Subtitle preview pinned above the scroll area so it's always visible while adjusting settings

changed

Editor buttons and panels updated to match the new landing page design

fixed

Major memory leak — Whisper model (~1GB) now freed after transcription completes instead of staying in memory

fixed

Major memory leak — background removal model (~500MB-1GB) now freed after processing completes

fixed

Mask data was duplicated in both a ref and React state (~2x memory usage) — removed unused state copy

fixed

Canvas render loop now skips redundant draws when video is paused, saving significant CPU

fixed

ImageData allocation (~8MB) was created every frame at 60fps — now reused across frames

fixed

Temporary canvases in video export now reused across frames instead of recreated per-frame

removed

Removed purple color accents — replaced with amber tones throughout

v1.0.0

Initial Release
added

AI-powered subtitle generation using Whisper.js — runs 100% locally in the browser via WebGPU/WASM

added

Multi-language transcription supporting 100+ languages with selectable Whisper model sizes (tiny, base, small)

added

Background removal with AI person segmentation — processes video frames locally without any uploads

added

Dynamic 3D subtitles — place text behind or in front of people with depth effects

added

Follow-word mode for dynamic subtitles — text tracks the position of the spoken word

added

25+ Google Fonts including Bangers, Montserrat, Bebas Neue, Poppins, Oswald, Anton, Fredoka, Permanent Marker, Pacifico, and more

added

Full subtitle styling controls — font, size, weight, color, background, border, drop shadow, word emphasis

added

Word-by-word and phrase display modes with configurable max words per line

added

Subtitle position control (top, middle, bottom) with adjustable Y positioning for dynamic mode

added

Video export with baked-in subtitles using Mediabunny — downloads MP4 with subtitles permanently rendered

added

Landscape (16:9) and portrait (9:16) aspect ratio modes with zoom toggle for portrait

added

Editable transcript sidebar — click any segment to modify the transcribed text

added

Language selection modal with model size picker before transcription starts

added

Mobile-responsive design with drawer navigation for styling and transcript editing

added

Works offline after first load — all AI models are cached in the browser

added

Free to use — no sign-up, no watermarks, no data collection

The beginning