iFrame Implementation
How it works
The JavaScript library provides a simple, secure, and flexible way to embed and control a fully interactive 3D AI avatar on your website using a single function call.
When you include the provided <script> snippet, it loads the required library hosted by our CDN. This handles authentication, iframe creation, and postMessage-based communication between your site and the avatar engine running remotely inside a sandboxed iframe.
Both Type 1 and Type 2 avatars can be initialized using the same method. However, because iframe-based rendering introduces additional communication latency, Type 2 avatars are also available through our JavaScript SDK. The SDK initializes the avatar engine directly within the client environment, bypassing iframe postMessage communication to deliver low-latency, real-time facial tracking, animation control, and direct access to camera or avatar media streams. These streams can then be integrated into any WebRTC-based service, such as LiveKit, Twilio Video, or Agora, allowing you to replace a user’s live video feed with their animated avatar.
Initialization & Authentication
The primary entry point is the loadAvatar(params) function.
You pass it a configuration object containing your API key, desired avatar options, and setup preferences (such as width and height).
<!-- Avatar container -->
<div id="avatar-container" style="width:600px; height:500px;"></div>
<!-- Load the library -->
<script src="https://interactiveavatar.co.uk/loadAvatar_v1.0.min.js"></script>
<script>
window.addEventListener('load', () => {
if (typeof loadAvatar !== "function") {
console.error("Avatar library not loaded yet.");
return;
}
loadAvatar({
setup: {
key: "YOUR API KEY",
elementId: "avatar-container",
debug: false
},
avatar: {
glb: "https://example.com/hero.glb",
type: 1
},
chat: {
showChat: true,
showChatInput: true
}
})
.then((interactiveAvatar) => {
console.log("API ready, methods available", interactiveAvatar);
})
.catch(err => console.error("Avatar failed to load:", err));
});
</script>
Before your avatar is rendered, your API key is validated along with any HTTPS referrer options you may have set in your profile area; the iframe is then created and initialized.
To see all available options, refer here.
The Control Object (interactiveAvatar)
Once the iframe loads successfully, the loadAvatar() Promise resolves with an interactive control object.
This object serves as your API bridge to the avatar inside the iframe.
loadAvatar(config).then(interactiveAvatar => {
interactiveAvatar.speak({
text: "Hello, world!"
});
});
Methods are executed via window.postMessage() messages sent to the iframe, using a simple { type, options } pattern. Such methods include:
loadAvatar – loads a new avatar
setView / setMood – set the view or mood of the avatar
sendMessage – communicate with the avatar by typing a message instead of speaking
speak – allows the avatar to speak some text that you define
To see all of the available methods, refer here.
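For example, once the Promise resolves you can chain a few of these calls using the signatures documented in the Methods section (a minimal sketch; the view, mood, and message values are illustrative):
loadAvatar(config).then((interactiveAvatar) => {
  // Switch to an upper-body view
  interactiveAvatar.setView({ view: "upper" });
  // Apply a mood and let it revert back to neutral automatically
  interactiveAvatar.setMood({ mood: "confusion", emoji: "😊", revert: true });
  // Send a typed message instead of speaking
  interactiveAvatar.sendMessage({ text: "hello, how are you today?" });
});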
Event System
The library includes a lightweight event listener system for receiving asynchronous updates from the iframe.
Use .on(eventType, callback) to subscribe:
interactiveAvatar.on("avatarReady", (event) => {
console.log("Avatar is ready!", event.data);
});
Events are triggered via window.postMessage() from the iframe and include updates such as:
avatarReady – the avatar has initialized
speechStarted / speechEnded – speech state changes
animationCompleted – when a triggered animation finishes
conversationUpdated – AI response or chat update events
To see all of the available events, refer here.
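For instance, you could subscribe to the speech events listed above to show a "speaking" indicator in your UI (a minimal sketch):
interactiveAvatar.on("speechStarted", () => {
  console.log("avatar started speaking");
});
interactiveAvatar.on("speechEnded", () => {
  console.log("avatar finished speaking");
});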
Debugging and Logging
If setup.debug is set to true, the library will log:
- Initialization parameters (with key redacted)
- The generated iframe URL
- Element attachment information
- Other logging information when you interact
This helps verify that your configuration and environment are correct during integration.
iFrame Implementation
How it works
The JavaScript library provides a simple, secure, and flexible way to embed and control a fully interactive 3D AI avatar on your website using a single function call.
When you include the provided <script> snippet, it loads the required library hosted by our CDN. This handles authentication, iframe creation, and postMessage-based communication between your site and the avatar engine running remotely inside a sandboxed iframe.
Both Type 1 and Type 2 avatars can be initialized using the same method. However, because iframe-based rendering introduces additional communication latency, Type 2 avatars are also available through our JavaScript SDK. The SDK initializes the avatar engine directly within the client environment, bypassing iframe postMessage communication to deliver low-latency, real-time facial tracking, animation control, and direct access to camera or avatar media streams. These streams can then be integrated into any WebRTC-based service, such as LiveKit, Twilio Video, or Agora, allowing you to replace a user’s live video feed with their animated avatar.
Initialization & Authentication
The primary entry point is the loadAvatar(params) function.
You pass it a configuration object containing your API key, desired avatar options, and setup preferences (such as width and height).
<!-- Avatar container -->
<div id="avatar-container" style="width:600px; height:500px;"></div>
<!-- Load the library -->
<script src="https://interactiveavatar.co.uk/loadAvatar_v1.0.min.js"></script>
<script>
window.addEventListener('load', () => {
if (typeof loadAvatar !== "function") {
console.error("Avatar library not loaded yet.");
return;
}
loadAvatar({
setup: {
key: "YOUR API KEY",
elementId: "avatar-container",
debug: false
},
avatar: {
glb: "https://example.com/hero.glb",
type: 2
}
})
.then((interactiveAvatar) => {
console.log("API ready, methods available", interactiveAvatar);
})
.catch(err => console.error("Avatar failed to load:", err));
});
</script>
Before your avatar is rendered, your API key is validated along with any HTTPS referrer options you may have set in your profile area; the iframe is then created and initialized.
To see all available options, refer here.
The Control Object (interactiveAvatar)
Once the iframe loads successfully, the loadAvatar() Promise resolves with an interactive control object.
This object serves as your API bridge to the avatar inside the iframe.
loadAvatar(config).then(interactiveAvatar => {
interactiveAvatar.startCamera();
});
Methods are executed via window.postMessage() messages sent to the iframe, using a simple { type, options } pattern. Such methods include:
loadAvatar – loads a new avatar
startCamera – starts the camera and facial tracking
stopCamera – stops the camera
setCamera – changes the camera device
To see all of the available methods, refer here.
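As a quick sketch based on the method signatures documented further below, a Type 2 avatar's media streams can be obtained once the control object resolves:
loadAvatar(config).then((interactiveAvatar) => {
  // For Type 2 avatars, startCamera() resolves with both media streams
  interactiveAvatar.startCamera().then(({ cameraStream, avatarStream }) => {
    console.log("camera stream", cameraStream);
    console.log("avatar stream", avatarStream);
  });
});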
Event System
The library includes a lightweight event listener system for receiving asynchronous updates from the iframe.
Use .on(eventType, callback) to subscribe:
interactiveAvatar.on("avatarReady", (event) => {
console.log("Avatar is ready!", event.data);
});
Events are triggered via window.postMessage() from the iframe and include updates such as:
avatarReady – the avatar has initialized
avatarLoaded – the avatar has loaded
cameraStarted / cameraStopped – camera state changes
errorOccurred – when an error occurs
To see all of the available events, refer here.
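For example, you can listen for camera state changes and errors using the event names above (a minimal sketch):
interactiveAvatar.on("cameraStarted", () => {
  console.log("camera started");
});
interactiveAvatar.on("errorOccurred", (event) => {
  console.error("avatar error:", event.data);
});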
Debugging and Logging
If setup.debug is set to true, the library will log:
- Initialization parameters (with key redacted)
- The generated iframe URL
- Element attachment information
- Other logging information when you interact
This helps verify that your configuration and environment are correct during integration.
SDK Implementation
How it works
The SDK is available for Type 2 avatars only. It runs the avatar engine directly in the client, removing the additional communication latency introduced by iframe postMessage. This allows for low-latency, real-time facial tracking and animation, along with direct access to the raw camera stream and avatar-rendered canvas stream.
These streams can be integrated into any WebRTC-based service, such as LiveKit, Twilio Video, Agora, etc., enabling you to replace a user’s live video feed with their animated avatar.
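As a minimal sketch of that integration, assuming an SDK control object named faceTracker (see below), a paid plan for stream access, and an existing RTCPeerConnection from your chosen WebRTC service, the avatar's canvas track can replace the outgoing camera track using the standard replaceTrack() API:
// Replace the outgoing camera track with the avatar's rendered track
const avatarStream = faceTracker.getAvatarStream(); // paid plan only
const avatarTrack = avatarStream.getVideoTracks()[0];
const sender = peerConnection // an existing RTCPeerConnection (illustrative)
  .getSenders()
  .find((s) => s.track && s.track.kind === "video");
if (sender) {
  sender.replaceTrack(avatarTrack);
}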
The SDK is initialized by passing your API key and a container element to attach the avatar canvas to.
Initialization & Authentication
The primary entry point is the MimicSDK.init(params) function.
You pass it a configuration object containing your API key and the container element ID.
<!-- Avatar container -->
<div id="avatar-container" style="width:600px; height:500px;"></div>
<!-- Load the SDK -->
<script src="https://interactiveavatar.co.uk/mimicSDK_v1.0.min.js"></script>
<script>
window.addEventListener('load', () => {
if (typeof MimicSDK === "undefined") {
console.error("SDK not loaded yet.");
return;
}
const faceTracker = MimicSDK.init({
key: "YOUR API KEY",
container: "avatar-container"
});
console.log("SDK initialized", faceTracker);
});
</script>
Your API key is validated before initializing the avatar engine.
The Control Object (faceTracker)
Once initialized, MimicSDK.init() returns a control object with methods to interact with the avatar.
faceTracker.loadAvatar("https://example.com/hero.glb");
Methods include:
loadAvatar – loads a new avatar
startCamera – starts the camera and facial tracking
stopCamera – stops the camera
setCamera – changes the camera device
getAvatarStream – gets the avatar canvas stream
getCameraStream – gets the raw camera stream
getVideoDevices – lists available video devices
To see all of the available methods, refer here.
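Putting a few of these together, a typical flow is to initialize, load an avatar, start the camera, and then grab the rendered stream (a sketch; the GLB URL is the placeholder used elsewhere in these docs and stream access requires a paid plan):
const faceTracker = MimicSDK.init({
  key: "YOUR API KEY",
  container: "avatar-container"
});
faceTracker.loadAvatar("https://example.com/hero.glb").then(() => {
  // Start facial tracking from the default camera
  return faceTracker.startCamera();
}).then(() => {
  // Retrieve the avatar-rendered canvas stream (paid plan only)
  console.log("avatar stream", faceTracker.getAvatarStream());
});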
Event System
The SDK includes an event listener system for receiving updates.
Use MimicSDK.on(eventType, callback) to subscribe:
MimicSDK.on("avatarLoaded", () => {
console.log("Avatar loaded");
});
Events include:
avatarLoaded – avatar has loaded
avatarProgress – loading progress
cameraStarted / cameraStopped – camera state
retrievedDevices – video devices list
errorOccurred – errors
To see all of the available events, refer here.
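For example, progress and error handling can be wired up before loading an avatar (a minimal sketch using the event names above):
MimicSDK.on("avatarProgress", (progress) => {
  console.log("loading progress:", progress);
});
MimicSDK.on("errorOccurred", (err) => {
  console.error("error:", err);
});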
Options
Setup
This is your API key which is generated automatically upon signing up. If you need to regenerate this key for any reason, you can do so within your profile area.
Setting this value to true allows various debugging messages to be output to the console. You should not have this set to true whilst in a live environment. It can be useful to see any errors or to help in diagnosing faults.
Controls the iframe width that is produced and can be set to any supported CSS value.
Controls the iframe height that is produced and can be set to any supported CSS value.
When an elementId is supplied, the iframe will be appended as a child of the node which has this ID. If you don't supply a value here, the iframe is instead appended to the document body.
Agent
Choose an inference provider here. Accepted values: groq, gemini.
Provide the model ID that is supported with the selected provider. This will be either a gemini or groq model.
This controls randomness and should be set to a value between 0 and 1. The higher the temperature, the more random (and usually creative) the responses will be. For most factual use cases such as data extraction and truthful questions & answers, a temperature closer to 0 will be better suited.
Specify the maximum number of tokens that the model accepts. Depending on the model in use, you may need to consult the model's documentation to see the max token limit.
This sets the behaviour of the avatar and can be used to provide specific instructions for how it should behave throughout the conversation. Try to keep this message written in the second person, addressing the avatar directly, as in the example provided.
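As an illustration only, these agent settings would sit alongside the setup and avatar groups in the loadAvatar() configuration. The group and property names below (agent, provider, model, temperature, maxTokens, systemMessage) are assumptions for this sketch; consult the options reference for the exact keys:
loadAvatar({
  setup: { key: "YOUR API KEY", elementId: "avatar-container" },
  avatar: { glb: "https://example.com/hero.glb", type: 1 },
  agent: { // hypothetical group and property names
    provider: "groq", // accepted values: groq, gemini
    model: "llama-3.1-8b-instant", // any model ID supported by the chosen provider
    temperature: 0.2, // 0 to 1; lower suits factual Q&A
    maxTokens: 1024, // stay within the selected model's limit
    systemMessage: "You are a friendly assistant. Keep answers short."
  }
});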
Avatar
Specify the URL to a GLB file. The model must contain the required blend shapes (morph targets) for facial animation and lip-sync.
Specify the avatar type here. Type 1 avatars use our AI Agent that you can hold conversations with, whilst Type 2 avatars are designed to mirror facial movements and return a media stream for your own use. Accepted values: 1, 2.
Specify the gender of the avatar, this helps with tailoring certain body animations. Accepted values: male, female.
Set the default view. This can be changed afterwards by running the 'setView()' method. Accepted values: full, upper, head, mid.
Sets a name for the avatar; it will refer to itself using this name.
Allows the AI to determine the mood based upon the context; the face will then animate to a specific emotional mood/expression. Moods can also be set manually by running the 'setMood()' method.
When set to Yes, the AI will determine whether an emoji would be appropriate based upon the context; the face will then animate to the emoji if it is applied. Some emojis will animate the mouth movements which may look a little strange as the avatar is speaking. Emojis can also be applied by running the 'setMood()' method.
Allows the mood to automatically revert to neutral 3 seconds after it has changed.
Allows the AI to decide whether to use a specific gesture based upon the context. For example, it may wave or show a thumbs-up under certain circumstances.
This option is only available on a paid plan.
These are the available gestures, remove any that you do not want to use. Values must be separated using a comma.
This option is only available on a paid plan.
Allows the AI to decide whether to use a specific pose based upon the context. For example, it may turn to the side or sit down under certain circumstances.
This option is only available on a paid plan.
These are the available poses, remove any that you do not want to use. Keep in mind that some poses might briefly move the avatar outside the viewable area, depending on the view selected. If you're using a head view for example, you might want to exclude oneknee, kneel and sitting. Values must be separated using a comma.
This option is only available on a paid plan.
Enables the ability to pan the camera.
Enables the ability to zoom the camera.
Enables the ability to rotate the camera angle.
The default distance from the camera. A negative value is closer to the camera. For best results, use whole number values (e.g. -2, -1, 0, 1, 2 etc.) as this setting can be quite sensitive.
Moves the camera left and right. A negative value moves the camera left. For best results, use precise values (e.g. -0.4, -0.2, 0, 0.2, 0.4 etc.) as this setting can be quite sensitive.
Moves the camera up and down. A negative value moves the camera down. For best results, use precise values (e.g. -0.4, -0.2, 0, 0.2, 0.4 etc.) as this setting can be quite sensitive.
Rotates the camera left and right (yaw). A negative value rotates the camera to the left, making the avatar appear to be looking towards the right. For best results, use precise values (e.g. -0.4, -0.2, 0, 0.2, 0.4 etc.) as this setting can be quite sensitive.
Rotates the camera up and down (pitch). A negative value rotates the camera downwards, giving more of an upwards look at the avatar. For best results, use precise values (e.g. -0.4, -0.2, 0, 0.2, 0.4 etc.) as this setting can be quite sensitive.
Applies a background color, can be set to any supported CSS value. Must not have a backgroundImage applied to see this.
Applies a background image. Overrides the backgroundColor.
Applies a blur (in pixels) to any background image supplied.
Applies a color to the circular loading indicator which is seen when the avatar is loading. Can be set to any supported CSS value.
Speech
Choose a provider to handle the speech-to-text (STT) transcriptions. If you choose Cartesia, transcriptions are handled through a real-time streaming WebSocket connection. If you choose Gemini, the 'gemini-2.5-flash-lite' model will be used. If you choose Groq, the 'whisper-large-v3-turbo' model will be used. Note: When using Gemini or Groq, you do not have to use the same value that you supplied for the provider option above. Accepted values: cartesia, gemini, groq.
Provide the model ID for an available voice. For further info on the models available, refer to: https://docs.cartesia.ai/build-with-cartesia/models/tts
Select from one of the available voices. Accepted values: Jordan, Nathan, Pleasant Man, Sarah Curious, Sarah, Helpful Woman, Southern Woman, Friendly Sidekick, Madame Mischief, American Voiceover Man, Brandon (emotive), Tessa (emotive).
You can specify your own custom voice ID which will override the voice selection. Cartesia have a voice library available where you may take any voice ID and use it here. For further information, refer to: https://play.cartesia.ai/voices
You can also navigate to the voice cloning area where you can create clones of your own voice and assign to any of your avatars.
Sets a wake word that allows the avatar to respond only when the word is detected. This can be any single word or short phrase, for example "hey Jude" or "hey Bert". You should not use any punctuation, numbers or special characters. You need to be using the microphone for this option to function.
When a wake word is configured, this option will allow the avatar to respond when it's detected at the beginning of a sentence. This enables detection when using phrases such as "Ella, how are you today?".
The language that the given voice should speak the transcript in. Supports: English (en), French (fr), German (de), Spanish (es), Portuguese (pt), Chinese (zh), Japanese (ja), Hindi (hi), Italian (it), Korean (ko), Dutch (nl), Polish (pl), Russian (ru), Swedish (sv) and Turkish (tr).
Only use the 2-letter language code in brackets, so the actual accepted values: en, fr, de, es, pt, zh, ja, hi, it, ko, nl, pl, ru, sv, tr.
Adjusts the amount of audio buffered before playback. Lower values start playback sooner but may cause choppiness, whilst higher values can reduce choppiness but increase delay.
Cartesia’s text-to-speech (TTS) and speech-to-text (STT) features use continuous WebSocket connections for streaming. This setting defines how long (in minutes) an idle connection should remain open. Each time a message is sent, the timer resets. If no messages are sent within the specified period, the WebSocket connections will close automatically and the relevant events will be triggered.
VAD (voice activity detector)
This option is only available on a paid plan.
Determines the threshold over which a probability is considered to indicate the presence of speech.
Determines the threshold under which a probability is considered to indicate the absence of speech.
The number of milliseconds of speech-negative frames to wait before ending a speech segment.
The number of milliseconds of audio to prepend to a speech segment.
The minimum duration in milliseconds for a speech segment.
Chat
Shows an overlay of the chat element within the avatar iframe.
Controls the maximum width of the chat element and can be set to any supported CSS value.
Controls the maximum height of the chat element and can be set to any supported CSS value.
Shows an overlay of the chat input element within the avatar iframe.
Sets the placeholder text in the chat input.
Shows an input type field within the chat element that contains either text or voice depending on the type of interaction with the avatar (per message).
Swaps the positions of the chat and chat input elements.
Sets a background color for the avatar's response inside the chat element. Can be set to any supported CSS value.
Sets a background color for the user's prompt inside the chat element. Can be set to any supported CSS value.
Shows a small label inside the chat element, containing the time at which messages were sent.
General
Automatically starts the microphone.
Automatically starts the camera.
When enabled, the avatar can be interrupted while speaking. If a new prompt is detected, the avatar will stop its current response and provide a new one. When disabled, interruptions will not be allowed and you'll need to wait for the avatar to finish speaking (or be stopped manually) before it will respond to anything new. This setting must be used with the microphone and does not apply to the 'sendMessage()' method.
Allows the avatar to become aware of your location and surrounding areas.
Controls the number of interactions within the avatar's conversation history that should be retained. If left blank, the entire conversation will be remembered, giving the avatar access to all context. Whilst this can improve the user experience, it could have an impact on any token limits enforced by the selected model. This value has no impact on the data returned through the 'conversationUpdated' event. It is only used for the internal messages that are sent to and from the selected provider.
Shows a draggable video element that will appear within the avatar iframe. If the camera isn't started automatically, then this will appear after the camera has been manually started using the 'startCamera()' method.
Set the width of the video element. The height will be adjusted based on the aspect ratio of the device in use.
Puts a margin (in pixels) around all 4 sides of the video element.
Setup
This is your API key which is generated automatically upon signing up. If you need to regenerate this key for any reason, you can do so within your profile area.
Setting this value to true allows various debugging messages to be output to the console. You should not have this set to true whilst in a live environment. It can be useful to see any errors or to help in diagnosing faults.
Controls the iframe width that is produced and can be set to any supported CSS value.
Controls the iframe height that is produced and can be set to any supported CSS value.
When an elementId is supplied, the iframe will be appended as a child of the node which has this ID. If you don't supply a value here, the iframe is instead appended to the document body.
Avatar
Specify the URL to a GLB file. The model must contain the required blend shapes (morph targets) for facial animation and lip-sync.
Specify the avatar type here. Type 1 avatars use our AI Agent that you can hold conversations with, whilst Type 2 avatars are designed to mirror facial movements and return a media stream for your own use. Accepted values: 1, 2.
Sets a name for the avatar; it will refer to itself using this name.
Enables the ability to pan the camera.
Enables the ability to zoom the camera.
Enables the ability to rotate the camera angle.
The default distance from the camera. A negative value is closer to the camera. For best results, use whole number values (e.g. -2, -1, 0, 1, 2 etc.) as this setting can be quite sensitive.
Moves the camera left and right. A negative value moves the camera left. For best results, use precise values (e.g. -0.4, -0.2, 0, 0.2, 0.4 etc.) as this setting can be quite sensitive.
Moves the camera up and down. A negative value moves the camera down. For best results, use precise values (e.g. -0.4, -0.2, 0, 0.2, 0.4 etc.) as this setting can be quite sensitive.
Rotates the camera left and right (yaw). A negative value rotates the camera to the left, making the avatar appear to be looking towards the right. For best results, increment in whole number steps such as in -4, -2, 0, 2, 4 or use decimals for finer control.
Rotates the camera up and down (pitch). A negative value rotates the camera downwards, giving more of an upwards look at the avatar. For best results, increment in whole number steps such as in -4, -2, 0, 2, 4 or use decimals for finer control.
Allows the avatar to lean and turn with you. This option may consume more resources.
Allows the avatar to track your arm movements, including hands. This option may consume more resources.
Allows the avatar to track your finger movements. This option may consume more resources.
Applies a background color, can be set to any supported CSS value. Must not have a backgroundImage applied to see this.
Applies a background image. Overrides the backgroundColor.
Applies a blur (in pixels) to any background image supplied.
Applies a color to the circular loading indicator which is seen when the avatar is loading. Can be set to any supported CSS value.
General
Shows a draggable video element that will appear within the avatar iframe. If the camera isn't started automatically, then this will appear after the camera has been manually started using the 'startCamera()' method.
Set the width of the video element. The height will be adjusted based on the aspect ratio of the device in use.
Puts a margin (in pixels) around all 4 sides of the video element.
SDK Options (Type 2 only)
The authentication token obtained from the /generate_token request. This must be a valid ephemeral or persistent token before initialization.
Setting this value to true allows various debugging messages to be output to the console. You should not have this set to true whilst in a live environment. It can be useful to see any errors or to help in diagnosing faults.
Enables the camera for testing; this shows a video element on the screen.
Determines the position of the video element within the avatar display area. Accepted values: top-left, top-right.
When the camera is showing, this value sets a margin (pixels). This value is applied to all 4 sides of the video element.
When the camera is showing, this value sets a border radius (pixels). This value is applied to all 4 corners of the video element.
Controls the camera’s field of view. Positive values will show more of the scene, making the avatar appear further away from the camera.
Rotates the camera left and right (yaw). A negative value rotates the camera to the left, making the avatar appear to be looking towards the right. For best results, increment in whole number steps such as in -4, -2, 0, 2, 4 or use decimals for finer control.
Rotates the camera up and down (pitch). A negative value rotates the camera downwards, giving more of an upwards look at the avatar. For best results, increment in whole number steps such as in -4, -2, 0, 2, 4 or use decimals for finer control.
Moves the avatar horizontally along the X-axis (left/right offset). A negative value shifts the avatar to the left, whilst a positive value shifts it to the right. Increment in 0.1 steps for fine control as this is quite sensitive.
Moves the avatar vertically along the Y-axis (up/down offset). A negative value shifts the avatar upwards, whilst a positive value shifts it downwards. Increment in 0.1 steps for fine control as this is quite sensitive.
Moves the avatar forwards or backwards along the Z-axis (depth offset). Positive values will make the avatar appear further away from the camera.
Enables the ability to pan the camera.
Enables the ability to zoom the camera.
Enables the ability to rotate the camera angle.
Allows the avatar to lean and turn with you. This option may consume more resources.
Allows the avatar to track your arm movements, including hands. This option may consume more resources.
Methods
loadAvatar 1 / 2
This loads a new avatar.
interactiveAvatar.loadAvatar({
glbfile: "https://example/test.glb",
gender: "female"
});
setView 1
This allows you to set the view of the avatar.
interactiveAvatar.setView({
view: "upper"
});
setMood 1
This allows you to set the mood of the avatar (based on a set of emotions or emojis); the face animates when the mood changes. Moods and emojis can be run simultaneously.
😏 smirk
🙂 slightsmile
😊 warmsmile
😇 angel
😀 grin
😃 bigsmile
😄 beaming
😁 cheesysmile
😆 laughing
😝 playful
😋 yummy
😂 tears
🤣 rofl
😉 wink
😭 crying
🥺 pleading
😞 disappointed
😔 sad
😳 flushed
☹️ frown
😚 kiss
😘 blowingKiss
🥰 loved
🤩 starstruck
😡 angry
😠 mad
🤬 cursing
😒 unamused
😱 screaming
😬 grimace
🙄 eyeroll
🤔 thinking
😴 sleeping
interactiveAvatar.setMood({
mood: "confusion",
emoji: "😒",
revert: true
});
sendMessage 1
This allows you to communicate with the avatar by sending a message instead of speaking. The avatar will then reply.
interactiveAvatar.sendMessage({
text: `hello, how are you today?`
});
speak 1
This allows the avatar to speak some text that you define. Upon doing so, an emotion will be determined from the context supplied and the face will animate slightly to match that emotion.
interactiveAvatar.speak({
text: `hello, how are you today?`
});
animate 1
This allows you to define a gesture or pose that the avatar will then animate. Gestures and Poses can be run simultaneously.
interactiveAvatar.animate({
duration: 3000,
gesture: "thumbup",
pose: "bend"
});
playAnimation 1
This allows you to apply an animation to the avatar. You can either select from one of the preset animations, or you may load a Mixamo FBX file. You do not need to include skins for this to work; files will load more quickly without a skin if you are uploading your own.
interactiveAvatar.playAnimation({
animation: 'chicken dance',
fbxfile: 'https://example/anim.fbx',
duration: 10000,
scale: 0.005
});
stopAnimation 1
This allows you to stop an animation that was started using the play animation method.
interactiveAvatar.stopAnimation();
startMicrophone 1
This allows you to manually start the microphone. Alternatively, you can automatically start it by configuring the "autoStartMic" property within the options.
interactiveAvatar.startMicrophone();
stopMicrophone 1
This allows you to stop the microphone.
interactiveAvatar.stopMicrophone();
startCamera 1 / 2
This allows you to manually start the camera. For Type 1 avatars, you can optionally enable automatic camera start-up by configuring the "autoStartCam" property within the options. For Type 2 avatars, the camera will always start automatically, and this method will also return both the raw camera stream and the avatar-rendered canvas stream.
interactiveAvatar.startCamera().then(({ cameraStream, avatarStream }) => {
if (cameraStream) {
console.log("camera stream", cameraStream);
} else {
console.log("no camera stream available, is null for type 1 avatars");
}
if (avatarStream) {
console.log("avatar stream", avatarStream);
} else {
console.log("no avatar stream available, is null for type 1 avatars");
}
});
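If you want to preview either stream, attach it to a standard video element (a small sketch assuming a <video id="avatar-preview" autoplay playsinline> element exists on the page):
interactiveAvatar.startCamera().then(({ avatarStream }) => {
  if (avatarStream) {
    // Show the avatar-rendered canvas stream in a video element
    document.getElementById("avatar-preview").srcObject = avatarStream;
  }
});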
stopCamera 1 / 2
This allows you to stop the camera.
interactiveAvatar.stopCamera();
stopSpeech 1
This allows you to manually stop any speech from playing. Please note, because speech is streamed and tied into the phonemes produced by Cartesia, it's a little tricky to implement pause/resume functionality without breaking the lip-sync. When this method runs, you won't be able to resume speech.
interactiveAvatar.stopSpeech();
setMorphTarget 1
This allows you to override any of the supported morph targets that control various aspects of the animations.
The viseme names represent the mouth shapes used to produce different sounds. The table below lists each viseme along with example sounds and words.
Viseme | Typical Sounds | Example Words
viseme_CH | ch, sh, zh, j | church, ship, vision, jump
viseme_DD | d, t, l, n | dog, top, love, no
viseme_E | e, ey | bet, face
viseme_FF | f, v | fun, voice
viseme_I | ih, ee | sit, see
viseme_O | o, aw | no, tall
viseme_PP | p, b, m | pat, bat, mat
viseme_RR | rr | run
viseme_SS | s, z | sit, zoo
viseme_TH | th | thin, this
viseme_U | oo, u | food, book
viseme_aa | ah, a | father
viseme_kk | k, g | cat, go
viseme_nn | n, ng | no, song
viseme_sil | silence (pause / no sound) | (none)
When you use the 'morphs' property, anything in the name and value fields will be ignored.
interactiveAvatar.setMorphTarget({
name: 'jawOpen',
value: 1,
revert: true,
morphs: null
});
Animation examples
Toggles the opening/closing of the eyes.
async function toggleEyes() {
// Get current values
const eyeBlinkLeft = await interactiveAvatar.getMorphTarget({ name: "eyeBlinkLeft" });
const eyeBlinkRight = await interactiveAvatar.getMorphTarget({ name: "eyeBlinkRight" });
// Define morph target sequences
let eyesClosed = [ { "name": "eyeBlinkLeft", "value": 1, "delay": 0 }, { "name": "eyeBlinkRight", "value": 1, "delay": 0 } ];
let eyesOpened = [ { "name": "eyeBlinkLeft", "value": 0, "delay": 0 }, { "name": "eyeBlinkRight", "value": 0, "delay": 0 } ];
// If both eyes are closed, then open them
if (eyeBlinkLeft === 1 && eyeBlinkRight === 1) {
interactiveAvatar.setMorphTarget({
name: null,
value: null,
revert: false,
morphs: JSON.stringify(eyesOpened)
});
// Otherwise close them
} else {
interactiveAvatar.setMorphTarget({
name: null,
value: null,
revert: false,
morphs: JSON.stringify(eyesClosed)
});
}
}
A “goofy face burst” animation
function goofyFaceBurst() {
// Define morph sequences for animation
let sequence = [
// Step 1: Eyes wide + jaw drop
{ "name": "eyeWideLeft", "value": 1, "delay": 0 },
{ "name": "eyeWideRight", "value": 1, "delay": 0 },
{ "name": "jawOpen", "value": 1, "delay": 500 },
// Step 2: Tongue out
{ "name": "tongueOut", "value": 1, "delay": 800 },
// Step 3: Puff cheeks + cross eyes
{ "name": "cheekPuff", "value": 1, "delay": 800 },
{ "name": "eyeLookInLeft", "value": 1, "delay": 0 },
{ "name": "eyeLookInRight", "value": 1, "delay": 0 },
// Step 4: Huge smile, tongue back in, cheeks relax
{ "name": "mouthSmileLeft", "value": 1, "delay": 1000 },
{ "name": "mouthSmileRight", "value": 1, "delay": 0 },
{ "name": "tongueOut", "value": 0, "delay": 0 },
{ "name": "cheekPuff", "value": 0, "delay": 0 },
{ "name": "eyeLookInLeft", "value": 0, "delay": 0 },
{ "name": "eyeLookInRight", "value": 0, "delay": 0 },
// Step 5: Reset to neutral (or you can set the revert option to true)
{ "name": "eyeWideLeft", "value": 0, "delay": 800 },
{ "name": "eyeWideRight", "value": 0, "delay": 0 },
{ "name": "jawOpen", "value": 0, "delay": 0 },
{ "name": "mouthSmileLeft", "value": 0, "delay": 500 },
{ "name": "mouthSmileRight", "value": 0, "delay": 0 }
];
// Send animation sequence
interactiveAvatar.setMorphTarget({
name: null,
value: null,
revert: false,
morphs: JSON.stringify(sequence)
});
}
getMorphTarget 1
This allows you to get the current value of a morph target.
interactiveAvatar.getMorphTarget({
name: 'jawOpen'
}).then((value) => {
console.log(value);
});
loadAvatar 1 / 2
This loads a new avatar.
interactiveAvatar.loadAvatar({
glbfile: "https://example/test.glb"
});
startCamera 1 / 2
This allows you to manually start the camera. For Type 1 avatars, you can optionally enable automatic camera start-up by configuring the "autoStartCam" property within the options. For Type 2 avatars, the camera will always start automatically, and this method will also return both the raw camera stream and the avatar-rendered canvas stream.
interactiveAvatar.startCamera().then(({ cameraStream, avatarStream }) => {
if (cameraStream) {
console.log("camera stream", cameraStream);
} else {
console.log("no camera stream available, is null for type 1 avatars");
}
if (avatarStream) {
console.log("avatar stream", avatarStream);
} else {
console.log("no avatar stream available, is null for type 1 avatars");
}
});
stopCamera 1 / 2
This allows you to stop the camera.
interactiveAvatar.stopCamera();
setCamera 2
This feature is only available for Type 2 avatars and lets you change the camera using a video input deviceId. A list of devices can be retrieved by running the 'getVideoDevices' method.
interactiveAvatar.setCamera({
deviceId: "deviceId" // a video input deviceId obtained from the 'getVideoDevices' method
});
getStreams 2
This feature is only available for Type 2 avatars and lets you retrieve either the raw camera stream, the avatar-rendered canvas stream, or both.
interactiveAvatar.getStreams({
camera: true,
avatar: true
}).then(({ camera, avatar }) => {
console.log("camera stream", camera);
console.log("avatar stream", avatar);
});
getVideoDevices 2
This feature is only available for Type 2 avatars and lets you retrieve a list of video devices. You can then pass the deviceId to the 'setCamera' method to swap cameras.
interactiveAvatar.getVideoDevices().then((devices) => {
console.log("video devices", devices);
});
loadAvatar 2
This loads a new avatar.
faceTracker.loadAvatar("https://example.com/hero.glb", (progress) => {
console.log("progress", Number(progress.toFixed(2)));
}).then(() => {
console.log("loaded");
}).catch(error => {
console.log(error);
});
startCamera 2
Allows you to manually start the camera and returns the camera's media stream if on a paid plan.
faceTracker.startCamera().then(stream => {
console.log("media stream", stream);
}).catch(error => {
console.log(error);
});
stopCamera 2
This allows you to stop the camera.
faceTracker.stopCamera();
setCamera 2
Allows you to change/swap cameras by passing a video input deviceId and returns the camera's media stream if on a paid plan. The available list of video devices can be retrieved by running getVideoDevices.
faceTracker.setCamera("deviceId").then(stream => {
console.log("media stream", stream);
}).catch(error => {
console.log(error);
});
getAvatarStream 2
Allows you to retrieve the avatar-rendered canvas stream. Only available when on a paid plan.
const avatarStream = faceTracker.getAvatarStream();
console.log(avatarStream);
getCameraStream 2
Allows you to retrieve the raw camera stream. Only available when on a paid plan.
const cameraStream = faceTracker.getCameraStream();
console.log(cameraStream);
getVideoDevices 2
Allows you to enumerate available video input devices. You can then use a specific deviceId with the setCamera method.
faceTracker.getVideoDevices().then(devices => {
console.log(devices);
}).catch(error => {
console.log(error);
});
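A common follow-on pattern is to enumerate the devices and then switch to one of them (a sketch; this assumes each entry exposes a deviceId, as standard MediaDeviceInfo objects do):
faceTracker.getVideoDevices().then(devices => {
  if (devices.length > 1) {
    // Swap to the second available camera
    return faceTracker.setCamera(devices[1].deviceId);
  }
}).then(stream => {
  console.log("media stream", stream);
}).catch(error => {
  console.log(error);
});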
Events
avatarReady 1 / 2
Triggered when the avatar is ready to be interacted with.
interactiveAvatar.on("avatarReady", (event) => {
console.log("avatar is ready");
});
avatarLoaded 1 / 2
Triggers whenever an avatar has fully loaded.
interactiveAvatar.on("avatarLoaded", (event) => {
console.log("avatar has loaded");
});
speechStarted 1
Triggered whenever speech starts.
interactiveAvatar.on("speechStarted", (event) => {
console.log("speech started");
});
speechEnded 1
Triggered whenever speech ends.
interactiveAvatar.on("speechEnded", (event) => {
console.log("speech ended");
});
animationCompleted 1
Triggered after an animation has completed.
interactiveAvatar.on("animationCompleted", (event) => {
console.log("animation completed");
});
conversationUpdated 1
Triggered whenever the avatar has responded and the conversation has updated.
This event is only available on a paid plan.
interactiveAvatar.on("conversationUpdated", (event) => {
// event.data is an array of JSON strings, so convert it into a JSON array.
const jsonString = "[" + event.data.join(",") + "]";
const parsedJson = JSON.parse(jsonString);
console.log(parsedJson);
});
Example of what event.data entries look like:
{
"identifier": "651866",
"type": "text",
"user": "hi there",
"avatar": "Hello! How can I help you today?",
"timestamp": "Jun 24, 2025 9:35 pm"
}
avatarReady 1 / 2
Triggered when the avatar is ready to be interacted with.
interactiveAvatar.on("avatarReady", (event) => {
console.log("avatar is ready");
});
avatarLoaded 1 / 2
Triggers whenever an avatar has fully loaded.
interactiveAvatar.on("avatarLoaded", (event) => {
console.log("avatar has loaded");
});
cameraStarted 1 / 2
Triggered whenever the camera starts.
interactiveAvatar.on("cameraStarted", (event) => {
console.log("camera started");
});
cameraStopped 1 / 2
Triggered whenever the camera stops.
interactiveAvatar.on("cameraStopped", (event) => {
console.log("camera stopped");
});
errorOccurred 1 / 2
Triggered whenever an error occurs. The event.data will contain the error message.
interactiveAvatar.on("errorOccurred", (event) => {
console.log(event.data);
});
avatarLoaded 2
Triggers whenever an avatar has fully loaded.
MimicSDK.on("avatarLoaded", () => {
console.log("avatar has loaded");
});
avatarProgress 2
Triggered constantly as an avatar is loading, returning the current load progress as a percentage.
MimicSDK.on("avatarProgress", (progress) => {
console.log("progress:", progress);
});
cameraStarted 2
Triggered whenever the camera starts, returning the raw camera stream if on a paid plan.
MimicSDK.on("cameraStarted", (stream) => {
console.log("camera started", stream);
});
cameraStopped 2
Triggered whenever the camera stops.
MimicSDK.on("cameraStopped", () => {
console.log("camera stopped");
});
retrievedDevices 2
Triggered after the getVideoDevices method has run, returning an array of video input devices.
MimicSDK.on("retrievedDevices", (devices) => {
console.log("video devices:", devices);
});
errorOccurred 2
Triggered whenever an error occurs, returning details of the error.
MimicSDK.on("errorOccurred", (err) => {
console.error("error:", err);
});