End-to-end encrypted video calls where the signaling itself is encrypted via XMTP's decentralized messaging protocol.
- XMTP MLS — All signaling messages (SDP offers, answers, ICE candidates) are encrypted end-to-end using XMTP's Messaging Layer Security protocol. No server can read them.
- DTLS-SRTP — WebRTC's mandatory transport encryption protects all media (audio/video) in transit. Once the peer connection is established, audio and video flow directly between peers.
- E2E Media Encryption (Encoded Transform) — On top of DTLS-SRTP, every audio and video frame is encrypted with AES-128-GCM using the WebRTC Encoded Transform API (RTCRtpScriptTransform). The symmetric key is generated by the caller and shared with the callee over the already-encrypted XMTP signaling channel. This means that even if a TURN relay or any other intermediary were present, it could not decrypt the actual media content.
- No central signaling server — Unlike typical WebRTC apps, there is no WebSocket server that could be compromised. XMTP's decentralized network replaces it entirely.
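
The per-frame AES-128-GCM step described above can be illustrated with a minimal sketch. It uses Node's built-in `crypto` module so that it runs outside a browser; in the app itself the equivalent logic would sit in a worker attached through `RTCRtpScriptTransform`. The function names, the random per-frame IV, and the `IV || ciphertext || tag` layout are assumptions made for illustration, not the app's actual wire format.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const IV_LENGTH = 12;  // 96-bit nonce, the size recommended for GCM
const TAG_LENGTH = 16; // 128-bit authentication tag

// Encrypts one encoded frame payload with a 16-byte (AES-128) key.
// Assumed output layout for this sketch: IV || ciphertext || tag.
function encryptFrame(key: Uint8Array, payload: Uint8Array): Uint8Array {
  const iv = randomBytes(IV_LENGTH); // fresh nonce for every frame
  const cipher = createCipheriv("aes-128-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(payload), cipher.final()]);
  const tag = cipher.getAuthTag();
  return new Uint8Array(Buffer.concat([iv, ciphertext, tag]));
}

// Reverses encryptFrame. Throws if the key is wrong or the frame was modified,
// because GCM verifies the authentication tag in final().
function decryptFrame(key: Uint8Array, frame: Uint8Array): Uint8Array {
  const iv = frame.subarray(0, IV_LENGTH);
  const tag = frame.subarray(frame.length - TAG_LENGTH);
  const ciphertext = frame.subarray(IV_LENGTH, frame.length - TAG_LENGTH);
  const decipher = createDecipheriv("aes-128-gcm", key, iv);
  decipher.setAuthTag(tag);
  return new Uint8Array(Buffer.concat([decipher.update(ciphertext), decipher.final()]));
}
```

A relay that forwards such frames sees only the nonce, the ciphertext, and the tag. Without the key shared over XMTP it can neither read the payload nor alter it undetected.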
This app uses Google's public STUN servers (stun.l.google.com) for NAT traversal. STUN servers do not compromise E2EE — their only role is helping peers discover their public IP addresses so they can establish a direct connection. They never see any media content, signaling data, or message payloads.
| Component | What it knows | What it can't see |
|---|---|---|
| STUN server | The public IP address and port of each peer that queries it | Who the users are, what they're saying, any media content |
| XMTP network | Encrypted signaling blobs between two inbox IDs | SDP/ICE content (encrypted via MLS) |
| No one else | — | Everything is E2EE |
Even if a TURN relay were added (for networks where direct connections fail), both DTLS-SRTP and the Encoded Transform E2E layer encrypt all media — the relay would only forward opaque encrypted bytes that it cannot decrypt.
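
As a rough sketch of the ICE setup discussed above, the configuration below lists a single STUN server and shows how a TURN relay could be appended. The local `IceServer` and `PeerConfig` types mirror the browser's `RTCIceServer` and `RTCConfiguration` so the snippet compiles without DOM typings. Port 19302, the `withTurnRelay` helper, and the TURN URL in the usage note are assumptions for illustration; the app's `RTCConfiguration.ts` may look different.

```typescript
// Local stand-ins for the browser's RTCIceServer / RTCConfiguration types.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}
interface PeerConfig {
  iceServers: IceServer[];
}

// STUN only: peers learn their public address; no media passes through the server.
// Port 19302 is assumed here; the source names only the host.
const stunOnly: PeerConfig = {
  iceServers: [{ urls: "stun:stun.l.google.com:19302" }],
};

// Hypothetical helper: returns a new config with a TURN relay appended,
// for networks where a direct connection cannot be established.
function withTurnRelay(
  base: PeerConfig,
  urls: string,
  username: string,
  credential: string,
): PeerConfig {
  return { iceServers: [...base.iceServers, { urls, username, credential }] };
}
```

For example, `withTurnRelay(stunOnly, "turn:turn.example.com:3478", "user", "secret")` yields a two-entry configuration that could be passed to `new RTCPeerConnection(...)` in a browser. The relay would still only forward bytes that are already encrypted.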
# Install dependencies
npm install
# Start the dev server
npm run dev

Important: The Vite dev server is configured with the required `Cross-Origin-Embedder-Policy` and `Cross-Origin-Opener-Policy` headers that the XMTP Browser SDK needs for SharedArrayBuffer/WASM support.
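
For reference, the header setup mentioned above usually looks something like the sketch below. The values `require-corp` and `same-origin` are the standard ones for cross-origin isolation and are assumed here; the repository's actual `vite.config.ts` is not shown in this README and may differ. The object names are hypothetical.

```typescript
// Headers that make a page cross-origin isolated, which browsers require
// before exposing SharedArrayBuffer. Values assumed from the standard recipe.
const crossOriginIsolationHeaders = {
  "Cross-Origin-Embedder-Policy": "require-corp",
  "Cross-Origin-Opener-Policy": "same-origin",
};

// Shape of the relevant part of a Vite config. In a real vite.config.ts this
// object would be passed to defineConfig() imported from "vite".
const viteConfigSketch = {
  server: {
    headers: crossOriginIsolationHeaders,
  },
};
```

A production host would need to send the same two headers, since the dev-server setting does not carry over to a deployed build.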
- Open the app
- Click Start Call — a secure identity is created automatically
- Click Start Camera (or skip for audio-only)
- Click Copy invite link and send it to the person you want to call
- Open the invite link
- Click Start Call
- Click Start Camera (or skip for audio-only)
- Click Call — the connection is established automatically
- Mute / Unmute — toggle your microphone
- Hide Video / Show Video — toggle your camera for audio-only mode
- End Call — hang up
- User clicks Start Call
- An ephemeral Ethereum wallet is created and used to connect an XMTP client
- The invite link contains the user's XMTP inbox ID as a `?partner=` query parameter
- The caller generates an AES-128-GCM key and sends it to the peer over XMTP
- The caller creates a WebRTC offer (SDP) and sends it over XMTP
- The answerer receives the encryption key and offer, sets up matching encryption transforms, and sends an answer back over XMTP
- ICE candidates are exchanged via XMTP until a direct peer connection is established
- Audio/video frames flow peer-to-peer, encrypted by DTLS-SRTP and by the AES-128-GCM Encoded Transform layer; MLS protects the signaling that carried the key
- Chat messages are sent separately through XMTP's messaging service
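
The invite link and the signaling messages in the steps above can be sketched as follows. The message shapes, the JSON encoding, and all function names are hypothetical and chosen for illustration. The app's own `SignalingMessage.ts` and `SignalingCodec.ts` may define them differently, and the XMTP transport itself is left out.

```typescript
// Hypothetical message shapes covering the key, offer, answer, and ICE steps.
type SignalingMessage =
  | { type: "key"; keyBase64: string }
  | { type: "offer"; sdp: string }
  | { type: "answer"; sdp: string }
  | { type: "ice"; candidate: string };

const KNOWN_TYPES = ["key", "offer", "answer", "ice"];

// Builds an invite link that carries the caller's inbox ID in ?partner=.
function buildInviteLink(appUrl: string, inboxId: string): string {
  const url = new URL(appUrl);
  url.searchParams.set("partner", inboxId);
  return url.toString();
}

// Reads the partner inbox ID back out of an invite link, or null if absent.
function readPartner(link: string): string | null {
  return new URL(link).searchParams.get("partner");
}

// Serializes a message to bytes (JSON is an assumption of this sketch).
function encodeSignal(msg: SignalingMessage): Uint8Array {
  return new TextEncoder().encode(JSON.stringify(msg));
}

// Parses bytes back into a message and rejects unknown message types.
function decodeSignal(bytes: Uint8Array): SignalingMessage {
  const parsed = JSON.parse(new TextDecoder().decode(bytes));
  if (parsed === null || typeof parsed !== "object" || !KNOWN_TYPES.includes(parsed.type)) {
    throw new Error("unknown signaling message");
  }
  return parsed as SignalingMessage;
}
```

With these helpers, `buildInviteLink("https://example.com/", "abc123")` produces `https://example.com/?partner=abc123`, and the callee can recover `abc123` with `readPartner` before opening an XMTP conversation with that inbox.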
src/
├── main.tsx # React entry point
├── App.tsx # Main UI component
├── styles.css # Styles
├── LogLevel.ts # Shared log level type
├── SignalingMessage.ts # Signaling message types
├── SignalingCodec.ts # Custom XMTP content type codec
├── createXmtpSigner.ts # Ephemeral wallet → XMTP signer
├── XmtpSignaling.ts # XMTP signaling layer
├── ConnectionState.ts # WebRTC connection state types
├── RTCConfiguration.ts # STUN server configuration
├── E2EEncryption.ts # E2E media encryption via Encoded Transform API
└── WebRTCManager.ts # WebRTC peer connection manager
By default, this app connects to the XMTP dev network. To use production:
// In App.tsx, change:
const id = await signaling.connect(xmtpSigner, "production");

- STUN servers (Google) are used for NAT traversal — they see IP addresses but not media content
MIT