September 05, 2025

Last updated: September 05, 2025

WebRTC Architecture: Signaling, ICE, and Peer Connections

WebRTC (Web Real-Time Communication) enables direct peer-to-peer audio, video, and data transfer between browsers without requiring plugins or native applications. However, establishing that direct connection is far more complex than most developers realize. Before any media flows between peers, a sophisticated dance of signaling, network discovery, and capability negotiation must occur. This guide breaks down each layer of the WebRTC architecture so you can build reliable real-time applications.

TL;DR

WebRTC uses three core mechanisms to establish peer-to-peer connections: a signaling server for exchanging session descriptions (SDP) and connection candidates, the ICE framework (with STUN/TURN servers) for NAT traversal, and the RTCPeerConnection API for managing media streams and data channels. Understanding how these pieces interact is essential for building production-grade real-time communication systems.

Why This Matters

Real-time communication has become a baseline expectation across industries. Video conferencing, telehealth, live customer support, collaborative editing, and multiplayer gaming all depend on low-latency peer-to-peer connections. WebRTC is the technology that makes this possible natively in browsers, but its architecture has enough moving parts that misunderstanding any single component can result in connections that fail silently for a subset of users—particularly those behind corporate firewalls or symmetric NATs.

If you are building anything that requires real-time media or data transfer in the browser, understanding WebRTC architecture is not optional—it is foundational.

How It Works

The Signaling Server

WebRTC itself does not define a signaling protocol. It leaves that choice to the developer. The signaling server is responsible for two critical tasks: helping peers discover each other and relaying the metadata required to establish a connection.

The metadata exchanged during signaling includes:

›SDP (Session Description Protocol) offers and answers — describing media capabilities, codecs, and connection parameters
›ICE candidates — potential network paths for the connection

You can implement signaling over WebSockets, HTTP long polling, Server-Sent Events, or even manual copy-paste for debugging. Here is a basic signaling flow using WebSockets:

typescript

// Signaling server (Node.js with ws)
import { WebSocketServer, WebSocket } from 'ws';
 
const wss = new WebSocketServer({ port: 8080 });
const rooms = new Map<string, Set<WebSocket>>();
 
wss.on('connection', (ws) => {
  let currentRoom: string | null = null;
 
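  ws.on('message', (data) => {
    let message: any;
    try {
      message = JSON.parse(data.toString());
    } catch {
      return; // Ignore malformed JSON rather than letting the exception crash the server
    }
 
    switch (message.type) {
      case 'join': {
        // Only accept string room names
        if (typeof message.room !== 'string') break;
        const room: string = message.room;
        currentRoom = room;
        if (!rooms.has(room)) {
          rooms.set(room, new Set());
        }
        rooms.get(room)!.add(ws);
        break;
      }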
 
      case 'offer':
      case 'answer':
      case 'ice-candidate':
        // Relay to all other peers in the room
        if (currentRoom && rooms.has(currentRoom)) {
          rooms.get(currentRoom)!.forEach((peer) => {
            if (peer !== ws && peer.readyState === WebSocket.OPEN) {
              peer.send(JSON.stringify(message));
            }
          });
        }
        break;
    }
  });
 
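  ws.on('close', () => {
    if (currentRoom && rooms.has(currentRoom)) {
      const peers = rooms.get(currentRoom)!;
      peers.delete(ws);
      // Drop empty rooms so the map does not grow without bound
      if (peers.size === 0) rooms.delete(currentRoom);
    }
  });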
});

The signaling server never sees the actual media. It is purely a coordination mechanism.
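Because the server relays whatever clients send, it can help to check the shape of each message before passing it on. The following is a minimal sketch, assuming the four message types used in the relay example above. The `SignalMessage` type and `parseSignal` helper are hypothetical names introduced here for illustration, not part of any WebRTC API.

```typescript
// Message shapes assumed from the relay example; names are illustrative.
type SignalMessage =
  | { type: 'join'; room: string }
  | { type: 'offer'; sdp: string }
  | { type: 'answer'; sdp: string }
  | { type: 'ice-candidate'; candidate: Record<string, unknown> };

// Returns a typed message, or null when the payload is malformed.
function parseSignal(raw: string): SignalMessage | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof value !== 'object' || value === null) return null;

  const { type, room, sdp, candidate } = value as Record<string, unknown>;

  switch (type) {
    case 'join':
      return typeof room === 'string' ? { type: 'join', room } : null;
    case 'offer':
      return typeof sdp === 'string' ? { type: 'offer', sdp } : null;
    case 'answer':
      return typeof sdp === 'string' ? { type: 'answer', sdp } : null;
    case 'ice-candidate':
      return typeof candidate === 'object' && candidate !== null
        ? { type: 'ice-candidate', candidate: candidate as Record<string, unknown> }
        : null;
    default:
      return null; // unknown message type
  }
}
```

A server could call `parseSignal(data.toString())` in its message handler and skip anything that comes back as `null`.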

The SDP Offer/Answer Model

Before peers can exchange media, they must agree on what media to send and how to encode it. This negotiation happens through SDP. The initiating peer creates an offer, and the receiving peer responds with an answer.

An SDP message contains information about:

›Media types (audio, video, data)
›Codec preferences and parameters
›DTLS certificate fingerprints (SRTP keys are derived from the DTLS handshake and never appear in the SDP)
›ICE credentials (username fragment and password)
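
To make those fields concrete, here is a rough sketch that pulls a few of them out of an SDP string. The `summarizeSdp` helper and the abbreviated sample SDP are made up for illustration. Real session descriptions are much longer, and production code would more likely rely on a dedicated SDP parser.

```typescript
interface SdpSummary {
  mediaTypes: string[];        // from m= lines, e.g. "audio", "video", "application"
  codecs: string[];            // from a=rtpmap lines
  iceUfrag: string | null;     // from the first a=ice-ufrag line
  fingerprint: string | null;  // from the first a=fingerprint line (DTLS certificate hash)
}

function summarizeSdp(sdp: string): SdpSummary {
  const summary: SdpSummary = { mediaTypes: [], codecs: [], iceUfrag: null, fingerprint: null };

  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith('m=')) {
      // Format: m=<media> <port> <proto> <formats...>
      const [kind] = line.slice(2).split(' ');
      if (kind) summary.mediaTypes.push(kind);
    } else if (line.startsWith('a=rtpmap:')) {
      // Format: a=rtpmap:<payload type> <encoding name>/<clock rate>[/<channels>]
      const encoding = line.split(' ')[1];
      if (encoding) summary.codecs.push(encoding.split('/')[0]);
    } else if (line.startsWith('a=ice-ufrag:') && summary.iceUfrag === null) {
      summary.iceUfrag = line.slice('a=ice-ufrag:'.length);
    } else if (line.startsWith('a=fingerprint:') && summary.fingerprint === null) {
      summary.fingerprint = line.slice('a=fingerprint:'.length);
    }
  }
  return summary;
}

// Abbreviated, invented SDP purely for demonstration.
const sampleSdp = [
  'v=0',
  's=-',
  't=0 0',
  'a=fingerprint:sha-256 AA:BB:CC',
  'm=audio 9 UDP/TLS/RTP/SAVPF 111',
  'a=ice-ufrag:F7gI',
  'a=rtpmap:111 opus/48000/2',
  'm=video 9 UDP/TLS/RTP/SAVPF 96',
  'a=rtpmap:96 VP8/90000',
].join('\r\n');

console.log(summarizeSdp(sampleSdp));
```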

typescript

// Initiating peer creates an offer
const peerConnection = new RTCPeerConnection(configuration);
 
// Add local media tracks
const stream = await navigator.mediaDevices.getUserMedia({
  video: true,
  audio: true,
});
 
stream.getTracks().forEach((track) => {
  peerConnection.addTrack(track, stream);
});
 
// Create and set local description
const offer = await peerConnection.createOffer();
await peerConnection.setLocalDescription(offer);
 
// Send offer through signaling server
signalingChannel.send(JSON.stringify({
  type: 'offer',
  sdp: offer.sdp,
}));

On the receiving side:

typescript

// Receiving peer handles the offer
signalingChannel.onmessage = async (event) => {
  const message = JSON.parse(event.data);
 
  if (message.type === 'offer') {
    await peerConnection.setRemoteDescription(
      new RTCSessionDescription({ type: 'offer', sdp: message.sdp })
    );
 
    const answer = await peerConnection.createAnswer();
    await peerConnection.setLocalDescription(answer);
 
    signalingChannel.send(JSON.stringify({
      type: 'answer',
      sdp: answer.sdp,
    }));
  }
};

The ICE Framework and NAT Traversal

Most devices sit behind NATs (Network Address Translators), which means their local IP addresses are not directly reachable from the internet. The ICE (Interactive Connectivity Establishment) framework solves this by discovering all possible network paths between peers and selecting the best one.

ICE gathers three types of candidates:

›Host candidates — the device's local IP addresses (works when both peers are on the same network)
›Server-reflexive candidates — the public IP and port as discovered by a STUN server (works when the NAT is not too restrictive)
›Relay candidates — a TURN server that relays traffic between peers (works when direct connectivity fails entirely)

typescript

const configuration: RTCConfiguration = {
  iceServers: [
    { urls: 'stun:stun.l.google.com:19302' },
    {
      urls: 'turn:turn.example.com:3478',
      username: 'user',
      credential: 'password',
    },
  ],
};
 
const peerConnection = new RTCPeerConnection(configuration);
 
// ICE candidates are gathered asynchronously
peerConnection.onicecandidate = (event) => {
  if (event.candidate) {
    signalingChannel.send(JSON.stringify({
      type: 'ice-candidate',
      candidate: event.candidate.toJSON(),
    }));
  }
};
 
// Receive ICE candidates from the remote peer
signalingChannel.onmessage = async (event) => {
  const message = JSON.parse(event.data);
 
  if (message.type === 'ice-candidate') {
    await peerConnection.addIceCandidate(
      new RTCIceCandidate(message.candidate)
    );
  }
};
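
When debugging connectivity, it is often useful to know which kind of candidate you are looking at. The type appears in the candidate string after the `typ` token: `host`, `srflx` (server-reflexive), or `relay`, plus `prflx` (peer-reflexive), which is discovered during connectivity checks rather than gathered up front. Browsers also expose this as the `type` property of `RTCIceCandidate`. The sketch below works on the raw string so it can run anywhere; the `candidateType` helper and the sample candidate line are illustrative.

```typescript
type CandidateType = 'host' | 'srflx' | 'prflx' | 'relay';

// Extracts the candidate type from an ICE candidate string, or null if absent.
function candidateType(candidate: string): CandidateType | null {
  const match = /\btyp (host|srflx|prflx|relay)\b/.exec(candidate);
  return match ? (match[1] as CandidateType) : null;
}

// Invented example using a documentation-range IP address.
const sampleCandidate =
  'candidate:842163049 1 udp 1677729535 203.0.113.7 46154 typ srflx raddr 192.168.1.10 rport 46154';

console.log(candidateType(sampleCandidate)); // prints "srflx"
```

If only `relay` candidates ever succeed for a given user, that user is depending on your TURN server.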

STUN and TURN Servers

STUN (Session Traversal Utilities for NAT) servers are lightweight. A peer sends a binding request to the STUN server, and the server responds with the public IP address and port that the request appeared to come from. This allows the peer to share its "server-reflexive" address with the other peer. STUN servers handle no media traffic and are inexpensive to operate.

TURN (Traversal Using Relays around NAT) servers are the fallback. When no direct path can be established, for example because a peer sits behind a symmetric NAT or a firewall that blocks UDP entirely, the TURN server relays all media traffic between the peers. This means the TURN server must handle the full bandwidth of the media stream, making it the most expensive component in a WebRTC deployment.

In production, you should always configure both STUN and TURN servers. Without TURN, a meaningful percentage of your users will be unable to connect.

Peer Connection Lifecycle

The RTCPeerConnection goes through a well-defined lifecycle:

›new — the connection object is created
›connecting — ICE and DTLS negotiation is in progress
›connected — at least one ICE candidate pair is active and DTLS handshake is complete
›disconnected — connectivity checks indicate a temporary loss of connectivity
›failed — ICE has exhausted all candidate pairs without finding a working connection
›closed — the connection is shut down

typescript

peerConnection.onconnectionstatechange = () => {
  console.log('Connection state:', peerConnection.connectionState);
 
  switch (peerConnection.connectionState) {
    case 'connected':
      console.log('Peers connected successfully');
      break;
    case 'disconnected':
      console.log('Peer disconnected — may reconnect');
      // Implement reconnection logic
      break;
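    case 'failed':
      console.log('Connection failed: requesting ICE restart');
      // restartIce() only marks the next offer as an ICE restart and fires
      // negotiationneeded; a new offer/answer exchange must still follow
      peerConnection.restartIce();
      break;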
    case 'closed':
      console.log('Connection closed');
      cleanup();
      break;
  }
};

Media Streams and Data Channels

WebRTC supports two types of real-time communication:

Media tracks handle audio and video using addTrack() and the ontrack event:

typescript

// Receiving remote media
peerConnection.ontrack = (event) => {
  const remoteVideo = document.getElementById('remoteVideo') as HTMLVideoElement;
  if (remoteVideo.srcObject !== event.streams[0]) {
    remoteVideo.srcObject = event.streams[0];
  }
};

Data channels provide arbitrary bidirectional data transfer with configurable reliability:

typescript

// Creating a data channel
const dataChannel = peerConnection.createDataChannel('chat', {
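  ordered: true,         // deliver messages in order
  // Partially reliable: give up after 3 retransmissions. Omit for fully
  // reliable delivery, or set maxPacketLifeTime (ms) instead, not both.
  maxRetransmits: 3,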
});
 
dataChannel.onopen = () => {
  dataChannel.send('Hello from peer A');
};
 
dataChannel.onmessage = (event) => {
  console.log('Received:', event.data);
};
 
// Receiving peer listens for data channels
peerConnection.ondatachannel = (event) => {
  const channel = event.channel;
  channel.onmessage = (e) => {
    console.log('Received:', e.data);
  };
};

Data channels use SCTP over DTLS, providing encryption by default. You can configure them as ordered or unordered, reliable or unreliable, making them suitable for everything from chat messages to game state updates.
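
As a rough illustration of how those reliability settings map to use cases, here is a sketch of a few presets. The profile names and the `channelOptions` helper are hypothetical, and the 500 ms lifetime is an arbitrary example value. `ChannelOptions` mirrors the subset of `RTCDataChannelInit` used here so the snippet type-checks outside a browser. Note that `maxRetransmits` and `maxPacketLifeTime` are mutually exclusive, so no preset sets both.

```typescript
// Subset of RTCDataChannelInit, redeclared so this runs without DOM typings.
interface ChannelOptions {
  ordered: boolean;
  maxRetransmits?: number;
  maxPacketLifeTime?: number; // milliseconds
}

type ChannelProfile = 'chat' | 'game-state' | 'telemetry';

function channelOptions(profile: ChannelProfile): ChannelOptions {
  switch (profile) {
    case 'chat':
      // Reliable and ordered: leave both limits unset.
      return { ordered: true };
    case 'game-state':
      // Unordered with no retransmissions: a lost update is dropped, not resent.
      return { ordered: false, maxRetransmits: 0 };
    case 'telemetry':
      // Unordered, retransmit for at most 500 ms, then give up.
      return { ordered: false, maxPacketLifeTime: 500 };
  }
}
```

In a browser, the result could be passed straight through, for example `peerConnection.createDataChannel('state', channelOptions('game-state'))`.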

Practical Implementation

A complete peer connection setup brings all these pieces together. Here is a consolidated example:

typescript

class WebRTCConnection {
  private pc: RTCPeerConnection;
  private signalingChannel: WebSocket;
 
  constructor(signalingUrl: string, iceServers: RTCIceServer[]) {
    this.signalingChannel = new WebSocket(signalingUrl);
    this.pc = new RTCPeerConnection({ iceServers });
 
    this.pc.onicecandidate = ({ candidate }) => {
      if (candidate) {
        this.signal({ type: 'ice-candidate', candidate: candidate.toJSON() });
      }
    };
 
    this.pc.ontrack = (event) => {
      this.onRemoteStream(event.streams[0]);
    };
 
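    this.pc.onconnectionstatechange = () => {
      if (this.pc.connectionState === 'failed') {
        // Takes effect only once a new offer is created and signaled,
        // e.g. from an onnegotiationneeded handler (not wired up in this example)
        this.pc.restartIce();
      }
    };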
 
    this.signalingChannel.onmessage = (event) => {
      this.handleSignal(JSON.parse(event.data));
    };
  }
 
  async startCall(): Promise<void> {
    const stream = await navigator.mediaDevices.getUserMedia({
      video: { width: 1280, height: 720 },
      audio: { echoCancellation: true, noiseSuppression: true },
    });
 
    stream.getTracks().forEach((track) => this.pc.addTrack(track, stream));
 
    const offer = await this.pc.createOffer();
    await this.pc.setLocalDescription(offer);
    this.signal({ type: 'offer', sdp: offer.sdp });
  }
 
  private async handleSignal(message: any): Promise<void> {
    switch (message.type) {
      case 'offer': {
        await this.pc.setRemoteDescription(new RTCSessionDescription(message));
        // Braces scope `answer` to this case
        const answer = await this.pc.createAnswer();
        await this.pc.setLocalDescription(answer);
        this.signal({ type: 'answer', sdp: answer.sdp });
        break;
      }
      case 'answer':
        await this.pc.setRemoteDescription(new RTCSessionDescription(message));
        break;
      case 'ice-candidate':
        await this.pc.addIceCandidate(new RTCIceCandidate(message.candidate));
        break;
    }
  }
 
  private signal(data: object): void {
    this.signalingChannel.send(JSON.stringify(data));
  }
 
  private onRemoteStream(stream: MediaStream): void {
    const video = document.getElementById('remoteVideo') as HTMLVideoElement;
    video.srcObject = stream;
  }
 
  close(): void {
    this.pc.close();
    this.signalingChannel.close();
  }
}

Common Pitfalls

Not including TURN servers. The most frequent production issue. Developers test on the same network or with permissive NATs and skip TURN configuration. When real users behind corporate firewalls try to connect, they get silent failures.

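Ignoring ICE restart. When a connection enters the "failed" state, many implementations simply give up. Calling restartIce() can recover the connection without tearing down the RTCPeerConnection, but it only flags the next offer as an ICE restart and fires negotiationneeded, so a new offer/answer exchange still has to go through signaling.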

Race conditions in signaling. If ICE candidates arrive before the remote description is set, addIceCandidate rejects with an InvalidStateError. Buffer incoming candidates until setRemoteDescription has resolved, then apply them.
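
One way to do that buffering is a small queue in front of addIceCandidate. This is a minimal sketch rather than a definitive implementation: the `CandidateBuffer` class is a hypothetical helper, and the `apply` callback stands in for whatever actually adds the candidate, such as `(c) => peerConnection.addIceCandidate(c)` in a browser.

```typescript
// Queues ICE candidates until the remote description is in place.
class CandidateBuffer<T> {
  private pending: T[] = [];
  private ready = false;
  private readonly apply: (candidate: T) => Promise<void>;

  constructor(apply: (candidate: T) => Promise<void>) {
    this.apply = apply;
  }

  // Call for every candidate received over signaling.
  async add(candidate: T): Promise<void> {
    if (this.ready) {
      await this.apply(candidate);
    } else {
      this.pending.push(candidate);
    }
  }

  // Call once setRemoteDescription has resolved; applies queued candidates in arrival order.
  async markReady(): Promise<void> {
    this.ready = true;
    const queued = this.pending;
    this.pending = [];
    for (const candidate of queued) {
      await this.apply(candidate);
    }
  }

  get pendingCount(): number {
    return this.pending.length;
  }
}
```

In a signal handler, the 'ice-candidate' branch would call `buffer.add(message.candidate)`, and the 'offer' and 'answer' branches would call `buffer.markReady()` right after setRemoteDescription.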

Not handling renegotiation. Adding or removing tracks after the initial connection requires renegotiation via the negotiationneeded event. Ignoring this event leads to tracks that never appear on the remote side.
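
Both ICE restarts and track changes surface through negotiationneeded, so one handler can cover them. The sketch below is written against a pared-down `PeerLike` interface, a hypothetical stand-in for the parts of `RTCPeerConnection` it touches, so that it can run outside a browser. `enableIceRestart` and `sendOffer` are illustrative names. It deliberately ignores glare (both peers sending offers at once), which the "perfect negotiation" pattern addresses.

```typescript
// Hypothetical stand-in for the parts of RTCPeerConnection used below.
interface PeerLike<D> {
  readonly connectionState: string;
  addEventListener(type: string, listener: () => void): void;
  restartIce(): void;
  createOffer(): Promise<D>;
  setLocalDescription(description: D): Promise<void>;
}

function enableIceRestart<D>(pc: PeerLike<D>, sendOffer: (offer: D) => void): void {
  let negotiating = false;

  pc.addEventListener('connectionstatechange', () => {
    if (pc.connectionState === 'failed') {
      // Flags the next offer as an ICE restart and triggers negotiationneeded.
      pc.restartIce();
    }
  });

  pc.addEventListener('negotiationneeded', async () => {
    if (negotiating) return; // simplification: skip while an offer is in flight
    negotiating = true;
    try {
      const offer = await pc.createOffer();
      await pc.setLocalDescription(offer);
      sendOffer(offer); // deliver through your signaling channel
    } catch (err) {
      console.error('Renegotiation failed:', err);
    } finally {
      negotiating = false;
    }
  });
}
```

Since addTrack also triggers negotiationneeded, installing a handler like this makes a separate, explicit createOffer call after adding tracks redundant.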

Assuming UDP availability. Some enterprise networks block all UDP traffic. Without TURN over TCP (or TURN over TLS on port 443), these users cannot connect at all.
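
A configuration that covers those networks lists several TURN transports for the same server. This is a sketch under assumptions: the `buildIceServers` helper is hypothetical, the ports are the conventional ones, and your TURN server has to be set up to listen on each of them. `IceServer` mirrors the fields of `RTCIceServer` used here so the snippet runs without DOM typings.

```typescript
// Subset of RTCIceServer, redeclared so this runs outside a browser.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

function buildIceServers(turnHost: string, username: string, credential: string): IceServer[] {
  return [
    { urls: 'stun:stun.l.google.com:19302' },
    {
      urls: [
        `turn:${turnHost}:3478?transport=udp`,  // preferred: lowest latency
        `turn:${turnHost}:3478?transport=tcp`,  // for networks that block UDP
        `turns:${turnHost}:443?transport=tcp`,  // TURN over TLS, resembles HTTPS traffic
      ],
      username,
      credential,
    },
  ];
}

// Placeholder host and credentials, as in the earlier configuration example.
const iceServers = buildIceServers('turn.example.com', 'user', 'password');
```

To confirm the relay path works at all, you can temporarily create the connection with `iceTransportPolicy: 'relay'`, which restricts ICE to relay candidates.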

When to Use (and When Not To)

Use WebRTC when:

›You need low-latency, real-time media streaming between browsers
›Privacy matters and you want end-to-end encrypted media without server-side processing
›You are building video calls, screen sharing, file transfer, or real-time gaming
›You need data channels for low-latency bidirectional communication

Do not use peer-to-peer WebRTC on its own when:

›You need to broadcast to thousands of viewers (use HLS/DASH or a media server like Janus/mediasoup for SFU architecture)
›You need server-side recording or processing of media (you will need a media server intermediary)
›Your use case is simple request-response (standard HTTP or WebSockets are simpler)
›You need guaranteed delivery of large files (standard file transfer protocols are more appropriate)

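For large-scale video applications, consider a Selective Forwarding Unit (SFU) architecture, where a media server receives one stream from each participant and selectively forwards it to the others. This avoids the cost of a full-mesh topology, in which every one of n participants uploads a separate copy of its stream to each of the other n - 1 peers and the total number of connections grows quadratically, as n(n - 1)/2.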
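
A quick back-of-the-envelope comparison makes the difference visible. The sketch assumes every participant sends exactly one stream and receives everyone else's, and it ignores simulcast and audio/video being separate tracks. The function names are illustrative.

```typescript
interface TopologyCost {
  uploadsPerPeer: number;    // outgoing streams each participant must encode and send
  downloadsPerPeer: number;  // incoming streams each participant receives
  totalConnections: number;  // peer connections across the whole call
}

// Full mesh: every participant connects directly to every other participant.
function fullMeshCost(participants: number): TopologyCost {
  const others = Math.max(participants - 1, 0);
  return {
    uploadsPerPeer: others,
    downloadsPerPeer: others,
    totalConnections: (participants * others) / 2,
  };
}

// SFU: every participant holds a single connection to the media server.
function sfuCost(participants: number): TopologyCost {
  const others = Math.max(participants - 1, 0);
  return {
    uploadsPerPeer: participants > 0 ? 1 : 0,
    downloadsPerPeer: others,
    totalConnections: participants,
  };
}

console.log(fullMeshCost(6)); // { uploadsPerPeer: 5, downloadsPerPeer: 5, totalConnections: 15 }
console.log(sfuCost(6));      // { uploadsPerPeer: 1, downloadsPerPeer: 5, totalConnections: 6 }
```

Upload bandwidth is usually the scarcer resource on consumer connections, which is why the drop from n - 1 uploads to one matters most.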

FAQ

What is WebRTC and how does it work?

WebRTC is a set of browser APIs that enables peer-to-peer audio, video, and data transfer without plugins. It works by using a signaling server to exchange connection metadata, ICE candidates for NAT traversal, and STUN/TURN servers to discover public IP addresses and relay media when direct connections fail.

Why does WebRTC need a signaling server if it is peer-to-peer?

WebRTC is peer-to-peer for media transfer, but peers need a way to discover each other and exchange connection details first. The signaling server handles this coordination—exchanging SDP offers/answers and ICE candidates—before the direct connection is established.

What is the difference between STUN and TURN servers?

STUN servers help peers discover their public IP address for NAT traversal, enabling direct connections. TURN servers relay traffic when direct connections fail due to restrictive firewalls or symmetric NATs. STUN is lightweight; TURN consumes significant bandwidth.

When does WebRTC fall back to TURN relay?

WebRTC falls back to TURN when direct peer-to-peer connectivity fails, typically due to symmetric NATs, strict corporate firewalls, or VPN configurations that block UDP traffic.

Can WebRTC be used for more than video calls?

Yes. WebRTC data channels support arbitrary data transfer, enabling file sharing, real-time gaming, collaborative editing, and IoT device communication with the same low-latency peer-to-peer architecture.

Article Author

Sadam Hussain

Senior Full Stack Developer

Senior Full Stack Developer with over 7 years of experience building React, Next.js, Node.js, TypeScript, and AI-powered web platforms.
