Realtime
Use Charivo's realtime stack when you want session-based voice interaction, streaming assistant output, or tool-enabled voice workflows.
Recommended Stack
- @charivo/realtime
- @charivo/realtime/remote
- your /api/realtime route
- @charivo/server/openai
This is the current production-oriented browser path. The browser calls your route, receives an adapter-aware bootstrap, and connects through the default remote adapter registry.
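For orientation, the route can be thought of as a thin JSON endpoint: the browser POSTs the desired session, the server mints an adapter-aware bootstrap through the provider, and returns it. A minimal sketch with an injected provider (the handler name, request shape, and provider interface here are illustrative, not the library's actual API):

```typescript
type RealtimeProvider = {
  createSession(input: Record<string, unknown>): Promise<unknown>;
};

// Hypothetical route body: forwards the requested session to the provider
// and returns the resulting bootstrap as JSON for the browser client.
async function handleRealtimeRequest(
  provider: RealtimeProvider,
  body: { session: Record<string, unknown> },
): Promise<Response> {
  const bootstrap = await provider.createSession({
    adapter: "openai-agents-webrtc",
    transport: "webrtc",
    session: body.session,
  });
  return new Response(JSON.stringify(bootstrap), {
    headers: { "content-type": "application/json" },
  });
}
```

In a real app this body would sit inside your framework's route handler, with the provider constructed from @charivo/server/openai as shown in the Provider Route section.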
Basic Setup
import {
  createRealtimeManager,
  type RealtimeToolRegistration,
} from "@charivo/realtime";
import {
  createAvatarControlTools,
  createAvatarResultProjector,
} from "@charivo/realtime-avatar";
import { createRemoteRealtimeClient } from "@charivo/realtime/remote";

const client = createRemoteRealtimeClient({ apiEndpoint: "/api/realtime" });

const avatarTools = createAvatarControlTools({
  expressions: ["Smile", "Sad"],
  motions: { Idle: 2, TapBody: 3 },
});

const tools: RealtimeToolRegistration[] = [
  ...avatarTools,
  {
    definition: {
      type: "function",
      name: "describeCharacterProfile",
      description: "Return the active character profile.",
      parameters: {
        type: "object",
        properties: {},
      },
    },
    async handler(_args, context) {
      return {
        success: true,
        name: context.character?.name ?? null,
      };
    },
  },
];

const manager = createRealtimeManager(client, {
  tools,
  resultProjectors: [createAvatarResultProjector()],
});
If your app also renders lipsync locally, prepare audio from a user gesture before the first realtime session:
await renderManager.prepareAudio?.();

await manager.startSession({
  provider: "openai",
  model: "gpt-realtime-mini",
});
If you need stronger product-specific acting guidance, append it in the app
layer on top of the library-generated base instead of making
@charivo/realtime own product persona rules:
import { buildRealtimeSessionConfig } from "@charivo/realtime";
import { buildAvatarControlInstructions } from "@charivo/realtime-avatar";

const base = buildRealtimeSessionConfig({ character });

await manager.startSession({
  provider: "openai",
  model: "gpt-realtime-mini",
  instructions: [
    base.instructions,
    buildAvatarControlInstructions(avatarCatalog),
    "Keep replies short and natural for this product.",
  ].join("\n"),
});
buildRealtimeSessionConfig(...) already includes character identity,
description, personality, the generic realtime/tooling rules, and
character.voice.voiceId when available. It does not supply provider/model
defaults. OpenAI-specific model and voice fallbacks live in the OpenAI
transport/provider packages, not in the provider-agnostic manager helper.
Why @charivo/realtime/remote Is The Default
- it is the recommended production path
- it works through your own server route
- it resolves a browser transport adapter from its registry
- the built-in resolver maps OpenAI WebRTC traffic to the current adapter defaults
Today, that usually means the OpenAI Agents WebRTC bootstrap flow.
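Conceptually, the registry is a lookup from a provider/transport pair to an adapter id. A simplified sketch of that resolution (the registry contents, key format, and function name are illustrative, not the actual registry API):

```typescript
// Illustrative registry: OpenAI WebRTC traffic resolves to the
// Agents-based WebRTC adapter, matching the current defaults.
const adapterRegistry = new Map<string, string>([
  ["openai:webrtc", "openai-agents-webrtc"],
]);

// Resolves an adapter id for a provider/transport pair, or throws
// when nothing is registered for that combination.
function resolveAdapter(provider: string, transport: string): string {
  const adapter = adapterRegistry.get(`${provider}:${transport}`);
  if (!adapter) {
    throw new Error(`no adapter registered for ${provider}/${transport}`);
  }
  return adapter;
}
```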
Client Choices
Remote
@charivo/realtime/remote
- best default for production browser apps
- adapter-aware and server-mediated
OpenAI Agents SDK Transport
@charivo/realtime/openai-agents
- current OpenAI Agents SDK transport client and adapter
- useful when you need to own the underlying browser client directly
Legacy Low-Level OpenAI Transport
@charivo/realtime/openai
- older low-level OpenAI WebRTC path
- mainly useful for legacy compatibility and debugging
What @charivo/realtime Owns
- session state
- tool registry
- typed session config helpers
- in-place updateSession(...) session patching
- reconnect orchestration and reconnect observability events
- relaying realtime output into the Charivo event stream
RealtimeManager intentionally uses setEventEmitter(...), not the full event
bus. It emits realtime, tool, text, and lip-sync related events back into core.
Local tool calls are checked against the registered tool definition before the
handler runs. The built-in validator covers required, enum, and basic JSON
Schema type fields; invalid arguments emit realtime:tool:error and return a
failure tool result. Nested object/array schemas, additionalProperties, and
union type arrays are not enforced.
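The scope of that built-in validation can be approximated with a small check like the following. This is a simplified sketch, not the library's actual implementation; the function name and error messages are illustrative:

```typescript
type JsonSchemaParam = {
  type?: string;
  enum?: unknown[];
};

type ToolParameters = {
  type: "object";
  properties: Record<string, JsonSchemaParam>;
  required?: string[];
};

// Approximates the documented checks: required fields, enum membership,
// and basic JSON Schema type tags. Nested schemas, additionalProperties,
// and union type arrays are deliberately ignored, mirroring the docs.
function validateToolArgs(
  params: ToolParameters,
  args: Record<string, unknown>,
): string[] {
  const errors: string[] = [];
  for (const name of params.required ?? []) {
    if (!(name in args)) errors.push(`missing required argument: ${name}`);
  }
  for (const [name, value] of Object.entries(args)) {
    const schema = params.properties[name];
    if (!schema) continue; // additionalProperties is not enforced
    if (schema.enum && !schema.enum.includes(value)) {
      errors.push(`argument ${name} is not an allowed enum value`);
    }
    if (schema.type === "string" && typeof value !== "string") {
      errors.push(`argument ${name} must be a string`);
    } else if (schema.type === "number" && typeof value !== "number") {
      errors.push(`argument ${name} must be a number`);
    } else if (schema.type === "boolean" && typeof value !== "boolean") {
      errors.push(`argument ${name} must be a boolean`);
    }
  }
  return errors;
}
```

In the real manager, a non-empty error set at this stage emits realtime:tool:error and returns a failure tool result instead of invoking the handler.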
Avatar expression/motion/gaze tools are optional and now live in
@charivo/realtime-avatar. Use a result projector when you want those tool
results bridged back into realtime:expression, realtime:motion, and
realtime:gaze.
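A result projector in this sense is a small mapping from avatar tool results back to realtime events. A hedged sketch of the idea (the tool names, payload shapes, and function name here are assumptions for illustration, not the @charivo/realtime-avatar API):

```typescript
type ProjectedEvent =
  | { type: "realtime:expression"; expression: string }
  | { type: "realtime:motion"; group: string; index: number }
  | { type: "realtime:gaze"; x: number; y: number };

// Maps a successful avatar tool result into the corresponding realtime
// event, mirroring what a result projector does conceptually.
function projectAvatarResult(
  toolName: string,
  result: Record<string, unknown>,
): ProjectedEvent | null {
  if (toolName === "setExpression" && typeof result.expression === "string") {
    return { type: "realtime:expression", expression: result.expression };
  }
  if (
    toolName === "playMotion" &&
    typeof result.group === "string" &&
    typeof result.index === "number"
  ) {
    return { type: "realtime:motion", group: result.group, index: result.index };
  }
  if (
    toolName === "setGaze" &&
    typeof result.x === "number" &&
    typeof result.y === "number"
  ) {
    return { type: "realtime:gaze", x: result.x, y: result.y };
  }
  return null; // non-avatar tool results are not projected
}
```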
Reconnect Behavior
When the browser transport drops temporarily, the manager keeps the realtime session active and attempts recovery with the latest effective config.
- state.session.status stays "active" during recovery
- state.connection switches back to "connecting" until recovery succeeds
- successful reconnects do not emit synthetic realtime:session:start/end
- updateSession(...) still updates the cached base config while reconnecting
- sendMessage(...), sendAudioChunk(...), and interrupt() reject while the connection is recovering
- realtime:reconnect:attempt, realtime:reconnect:success, and realtime:reconnect:exhausted are emitted for observability
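The recovery loop behind those events can be sketched as follows. This is a hypothetical helper, not the manager's internals; the retry count and backoff values are assumptions, while the event names match the documented ones:

```typescript
type ReconnectEvent =
  | "realtime:reconnect:attempt"
  | "realtime:reconnect:success"
  | "realtime:reconnect:exhausted";

// Tries to re-establish the transport with the latest effective config,
// emitting the documented observability events along the way.
async function reconnectWithBackoff(
  connect: () => Promise<void>,
  emit: (event: ReconnectEvent, detail?: unknown) => void,
  maxAttempts = 3,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    emit("realtime:reconnect:attempt", { attempt });
    try {
      await connect();
      emit("realtime:reconnect:success", { attempt });
      return true; // session stays "active"; connection leaves "connecting"
    } catch {
      // simple linear backoff between attempts (illustrative policy)
      await new Promise((resolve) => setTimeout(resolve, attempt * 100));
    }
  }
  emit("realtime:reconnect:exhausted", { attempts: maxAttempts });
  return false;
}
```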
Provider Route
The server route typically uses @charivo/server/openai:
import { createOpenAIRealtimeProvider } from "@charivo/server/openai";

const provider = createOpenAIRealtimeProvider({
  apiKey: process.env.OPENAI_API_KEY!,
});

const bootstrap = await provider.createSession({
  adapter: "openai-agents-webrtc",
  transport: "webrtc",
  session: {
    provider: "openai",
    model: "gpt-realtime-mini",
    voice: "marin",
  },
});
If model or voice are omitted from an OpenAI realtime session, the OpenAI
provider applies its OpenAI-specific defaults before calling OpenAI. Apps can
still pass those fields explicitly when they need deterministic provider
configuration.
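That defaulting step can be illustrated with a small pure helper. The function name is hypothetical, and the fallback values shown are assumptions taken from the example above; check the OpenAI provider package for the real defaults:

```typescript
type OpenAISessionConfig = {
  provider: "openai";
  model?: string;
  voice?: string;
};

// Fills in provider-specific fallbacks before the request reaches OpenAI.
// Explicit values from the app always win over the defaults.
function withOpenAIDefaults(
  session: OpenAISessionConfig,
): Required<OpenAISessionConfig> {
  return {
    provider: "openai",
    model: session.model ?? "gpt-realtime-mini", // assumed fallback
    voice: session.voice ?? "marin", // assumed fallback
  };
}
```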
Alternatives
- Use the direct Agents transport package when you need to own the realtime transport client directly in the browser.
- Use the legacy low-level package only when you intentionally depend on the older openai-webrtc flow.
- Use turn-based STT and TTS when you do not need continuous live sessions.