# STT

Charivo's STT layer combines `@charivo/stt` with a concrete transcriber. For production browser apps, use the remote transcriber with a server route backed by `@charivo/server/openai`.
## Recommended Stack

- `@charivo/stt`
- `@charivo/stt/remote`
- your `/api/stt` route
- `@charivo/server/openai`

The browser records locally; the backend handles transcription.
## Basic Setup

```ts
import { Charivo } from "@charivo/core";
import { createSTTManager } from "@charivo/stt";
import { createRemoteSTTTranscriber } from "@charivo/stt/remote";

const charivo = new Charivo();
charivo.attachSTT(
  createSTTManager(createRemoteSTTTranscriber({ apiEndpoint: "/api/stt" })),
);

await charivo.getSTTManager()?.start({ language: "ko" });
const text = await charivo.getSTTManager()?.stop();
```
## Transcriber Choices

### Remote

`@charivo/stt/remote`

- records in the browser and sends audio to your route as multipart form data
- best default for production browser apps
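The exact payload shape is not spelled out here, but a remote transcriber of this kind typically packages the recording as a multipart body along these lines. This is a hypothetical sketch: the `audio` and `language` field names and the `buildSTTFormData` helper are illustrative assumptions, not the package's documented contract.

```typescript
// Hypothetical sketch of the multipart body a remote transcriber
// might POST to your /api/stt route. Field names are assumptions.
function buildSTTFormData(chunks: BlobPart[], language?: string): FormData {
  // Package the recorded chunks as a single audio blob.
  const audio = new Blob(chunks, { type: "audio/webm" });

  const form = new FormData();
  form.append("audio", audio, "recording.webm");
  if (language) form.append("language", language);
  return form;
}

// The browser would then send it, e.g.:
// await fetch("/api/stt", { method: "POST", body: form });
```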
### Direct OpenAI

`@charivo/stt/openai`

- useful for local development and testing
- exposes credentials to the browser

### Browser-Native

`@charivo/stt/web`

- built on the Web Speech API
- useful for prototypes and zero-server flows
- browser support varies
## What `@charivo/stt` Owns

- the recording lifecycle
- interaction with the transcriber implementation
- forwarding STT lifecycle and error events back into core

`STTManager` intentionally uses `setEventEmitter(...)` rather than the full event bus.
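To make that scoping concrete, here is a minimal, hypothetical sketch of the pattern: the manager holds a single emit function handed to it via `setEventEmitter(...)` and reports lifecycle and error events through it, without ever seeing the rest of the bus. The class and event names below are illustrative, not Charivo's actual internals.

```typescript
// Illustration of the setEventEmitter(...) pattern: the manager only
// ever holds one narrow emit function, not the whole event bus.
type STTEvent =
  | { type: "stt:start" }
  | { type: "stt:stop"; text: string }
  | { type: "stt:error"; error: Error };

class ScopedSTTManager {
  // Default to a no-op so the manager is safe before core attaches it.
  private emit: (event: STTEvent) => void = () => {};

  // Core hands the manager an emit function instead of the bus itself.
  setEventEmitter(emit: (event: STTEvent) => void): void {
    this.emit = emit;
  }

  start(): void {
    this.emit({ type: "stt:start" });
  }

  stop(text: string): string {
    this.emit({ type: "stt:stop", text });
    return text;
  }
}
```

The narrow interface keeps the STT layer decoupled: core decides how events fan out, and the manager cannot listen to or emit anything outside its own namespace.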
## Provider Route

The remote transcriber usually pairs with `@charivo/server/openai` on the server:

```ts
import { createOpenAISTTProvider } from "@charivo/server/openai";

const provider = createOpenAISTTProvider({
  apiKey: process.env.OPENAI_API_KEY!,
  defaultModel: "whisper-1",
});

const text = await provider.transcribe(audioBlob, {
  language: "ko",
});
```
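What the surrounding `/api/stt` route looks like depends on your framework. As a hedged sketch using Web-standard `Request`/`Response`, a handler might parse the multipart body and delegate to any provider exposing a `transcribe(blob, opts)` method, such as the one above. The `audio` and `language` field names and the `createSTTRoute` helper are assumptions for illustration.

```typescript
// Hypothetical /api/stt handler built on Web-standard Request/Response.
interface STTProvider {
  transcribe(audio: Blob, opts?: { language?: string }): Promise<string>;
}

export function createSTTRoute(provider: STTProvider) {
  return async (req: Request): Promise<Response> => {
    const form = await req.formData();
    const audio = form.get("audio");
    if (!(audio instanceof Blob)) {
      return new Response("missing audio field", { status: 400 });
    }
    const language = form.get("language")?.toString();
    const text = await provider.transcribe(audio, { language });
    return Response.json({ text });
  };
}
```

Because the handler only depends on the small `STTProvider` interface, you can swap the OpenAI-backed provider for a stub in tests.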
## Alternatives

- Use `@charivo/stt/web` when you want the fewest moving parts and browser support is good enough.
- Use `@charivo/stt/openai` when you are testing direct vendor behavior.
- Move to Realtime when you want continuous, session-based voice interaction instead of turn-based transcription.