Audio Lab

Drop in an audio clip and the Lab runs real Whisper speech recognition in your browser via onnxruntime-web. This is the same Whisper model family that the native kapi-asr plugin runs through whisper.cpp. The recognized speech becomes timing-anchored subtitle cues: play the audio and the active cue highlights. These cues have the same shape that the audio-to-subtitles flow translates and writes to .vtt/.srt files. The model (~40 MB) loads on first use. Nothing is mocked; only the runtime differs from the native plugin. For an instant, no-download tour, see the Multimodal Showcase.
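
To make "timing-anchored subtitle cues" concrete, here is a minimal TypeScript sketch of what such a cue could look like, how the active cue might be picked during playback, and how a cue list could be serialized to .vtt and .srt. The `Cue` type and the function names (`formatTimestamp`, `toVtt`, `toSrt`, `activeCueIndex`) are hypothetical and chosen for illustration; the Lab's actual data structures and the kapi-asr plugin's API are not shown on this page and may differ.

```typescript
// Hypothetical cue shape: start/end in seconds plus the recognized text.
interface Cue {
  start: number;
  end: number;
  text: string;
}

// Format seconds as HH:MM:SS<sep>mmm. WebVTT uses "." before the
// milliseconds, SubRip (.srt) uses ",".
function formatTimestamp(seconds: number, msSeparator: "." | ","): string {
  const pad = (n: number, width: number): string => {
    let s = String(n);
    while (s.length < width) s = "0" + s;
    return s;
  };
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + msSeparator + pad(ms, 3);
}

// Serialize cues as a WebVTT document (header line, blank line, cue blocks).
function toVtt(cues: Cue[]): string {
  const blocks = cues.map(
    (c) => formatTimestamp(c.start, ".") + " --> " + formatTimestamp(c.end, ".") + "\n" + c.text
  );
  return "WEBVTT\n\n" + blocks.join("\n\n") + "\n";
}

// Serialize cues as SubRip: each block starts with a 1-based counter.
function toSrt(cues: Cue[]): string {
  const blocks = cues.map(
    (c, i) =>
      String(i + 1) + "\n" +
      formatTimestamp(c.start, ",") + " --> " + formatTimestamp(c.end, ",") + "\n" +
      c.text
  );
  return blocks.join("\n\n") + "\n";
}

// Return the index of the cue covering playback time t (start inclusive,
// end exclusive), or -1 when no cue is active.
function activeCueIndex(cues: Cue[], t: number): number {
  for (let i = 0; i < cues.length; i++) {
    if (t >= cues[i].start && t < cues[i].end) return i;
  }
  return -1;
}
```

In a browser, `activeCueIndex(cues, audio.currentTime)` could be called from the audio element's `timeupdate` event to decide which cue to highlight. A linear scan is enough for short clips; a binary search over sorted cues would be a reasonable swap for long recordings.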
