Reference: Real-Time Connection Patterns

What it is

Patterns for getting data from server to client (or peer to peer) when request/response isn’t enough. The choice shapes latency, server cost, mobile behavior, and how the system fails.

When you care

Any question involving chat, notifications, live dashboards, collaborative editing, presence, or streaming. The trap is defaulting to WebSocket because it sounds real-time. Different patterns fit different constraints; naming the right one with a specific reason is the signal.

The options

Pattern	Direction	Transport	Connection	Good for
Short polling	Client → server on interval	HTTP	One request per poll	Low-frequency updates; simple clients.
Long polling	Client → server, server holds	HTTP	Held until data or timeout	Legacy real-time, HTTP-only clients.
Server-Sent Events (SSE)	Server → client	HTTP	Single long-lived connection	Server-push streams, one-way updates.
WebSocket	Bidirectional	TCP (upgraded from HTTP)	Single long-lived connection	Chat, collaborative editing, presence.
WebRTC data channel	Peer-to-peer	SCTP over DTLS, usually on UDP (via ICE/STUN/TURN)	Peer connection	Low-latency P2P, gaming, A/V sidecars.
Push notifications	Server → device OS	APNs / FCM	OS-managed	Background delivery to mobile when app is closed.

Short and long polling

Short polling is a client timer that hits an endpoint every N seconds. Most requests come back empty while nothing is changing, and an update can be up to N seconds stale, but the server keeps zero per-client state. Fine for minute-scale updates (dashboard refreshes, weak SLAs).
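
As a minimal sketch of the timer loop described above: `short_poll`, `fetch`, and the injectable `sleep` are hypothetical names chosen for illustration, and `fetch` stands in for whatever HTTP call the client would make.

```python
import time
from typing import Callable, List, Optional


def short_poll(fetch: Callable[[], Optional[str]],
               interval_s: float,
               max_polls: int,
               sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Call `fetch` on a fixed interval and collect whatever it returns."""
    updates: List[str] = []
    for i in range(max_polls):
        result = fetch()            # one full request per poll, even when idle
        if result is not None:
            updates.append(result)
        if i < max_polls - 1:
            sleep(interval_s)       # an update may wait up to one interval
    return updates
```

The server side of this is an ordinary stateless endpoint, which is the main reason the pattern stays attractive for low-frequency data.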

Long polling holds the request open server-side until data arrives or a timeout hits; the client then re-issues the request immediately. Better latency than short polling, but higher server cost, because every waiting client occupies an open connection (and a thread, on thread-per-request servers). Historically used before WebSocket was universal; today mostly a fallback.
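
The server-side hold can be sketched with `asyncio`, assuming each client has a per-client `asyncio.Queue` of pending messages (the queue and the function name `long_poll` are illustrative assumptions, not part of any particular framework):

```python
import asyncio
from typing import Optional


async def long_poll(queue: asyncio.Queue, timeout_s: float) -> Optional[str]:
    """Hold the request until a message arrives or the timeout expires."""
    try:
        # Parks this coroutine without blocking other clients.
        return await asyncio.wait_for(queue.get(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Empty response; the client is expected to re-issue the request.
        return None
```

Even with an async server the cost does not disappear: each held request still occupies a socket and some memory for as long as it waits.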

Server-Sent Events (SSE)

One-way: server streams text events over a persistent HTTP connection the client opens. Runs over HTTP/1.1 or HTTP/2, works through most proxies, auto-reconnects in browsers.

Good for: live feeds, notifications, progress updates, anywhere the server talks and the client just listens. LLM token streaming to a browser is a common modern use.

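Limits: one-way only. Client-to-server messages still go over a separate HTTP request. Over HTTP/1.1, browsers cap concurrent connections per origin (typically six), so several open tabs can exhaust the budget. Some middleboxes buffer the response, which delays events and can be hard to debug.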
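
For reference, the text format on the wire is simple enough to sketch in a few lines. The helper name `format_sse` is hypothetical; the field names (`id`, `event`, `data`) and the blank-line terminator come from the event stream format browsers parse.

```python
from typing import Optional


def format_sse(data: str,
               event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Serialize one event for a `text/event-stream` response body."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload needs one `data:` field per line.
    for chunk in data.split("\n"):
        lines.append(f"data: {chunk}")
    return "\n".join(lines) + "\n\n"   # the blank line ends the event
```

Setting `id` is what makes browser auto-reconnect useful: on reconnect the browser sends the last seen value in the `Last-Event-ID` request header, so the server can resume from that point.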

WebSocket

Full-duplex TCP connection negotiated via an HTTP upgrade handshake. Once upgraded, both sides can send frames at any time.
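
One concrete piece of the upgrade handshake is the `Sec-WebSocket-Accept` response header, which the server derives from the client's `Sec-WebSocket-Key`. A small sketch using only the standard library (the function name is illustrative; the GUID is the fixed value from RFC 6455):

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

With the sample key from the RFC, `websocket_accept("dGhlIHNhbXBsZSBub25jZQ==")` returns `"s3pPLMBiTxaQ9kYGzzhZRbK+xOo="`. In practice a WebSocket library handles this; the point is that the handshake is plain HTTP until the server answers with `101 Switching Protocols`.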

Good for: chat, collaborative editing, multiplayer state, any system where both sides need to initiate messages.

Limits: stateful — each connection pins a server. Scaling requires sticky routing or a shared connection layer. Reconnect logic is the app’s problem. Some corporate proxies block WebSocket entirely (mostly solved in 2026 but still a real concern for enterprise products).
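
Since reconnect logic falls to the app, a common approach is exponential backoff with jitter, so that a server restart does not bring every client back at the same instant. A minimal sketch, where the defaults (`base_s`, `cap_s`, `attempts`) are illustrative assumptions rather than recommended values:

```python
import random
from typing import List, Optional


def backoff_delays(base_s: float = 0.5,
                   cap_s: float = 30.0,
                   attempts: int = 6,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return one randomized wait (in seconds) per reconnect attempt."""
    if rng is None:
        rng = random.Random()
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap_s, base_s * (2 ** attempt))   # 0.5, 1, 2, ... capped
        delays.append(rng.uniform(0, ceiling))          # "full jitter"
    return delays
```

A client would sleep for each delay in turn before retrying, and reset to the first attempt once a connection succeeds.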

WebRTC data channel

Peer-to-peer over UDP (with fallback to TCP via TURN relay). Requires a signaling server to set up the connection, then peers talk directly.

Good for: real-time gaming, collaborative apps where the latency of server relay is unacceptable, A/V systems that need a data sidecar.

Limits: complex setup (ICE, STUN, TURN), firewall-sensitive, not a good fit for server-authoritative data.

Push notifications

Not a connection pattern in the same sense — the device OS maintains a persistent connection to Apple’s APNs or Google’s FCM, and your server hands messages to those services for delivery. The app doesn’t need to be running.

Good for: notifications when the app is backgrounded or closed. On iOS, where the OS aggressively suspends background networking, it is the only option.

Limits: delivery is best-effort, not real-time. Payloads are small. Requires per-platform integration (APNs tokens vs FCM tokens).
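
To make the payload limit concrete, here is a hedged sketch that builds an APNs-style JSON payload and rejects it if it is too large. The function name and the `extra` parameter are hypothetical, and the 4096-byte ceiling is an assumption to confirm against current APNs and FCM documentation.

```python
import json
from typing import Optional

MAX_PAYLOAD_BYTES = 4096  # assumed limit; verify against current platform docs


def build_apns_payload(title: str, body: str,
                       extra: Optional[dict] = None) -> bytes:
    """Encode an alert payload and enforce the size ceiling."""
    payload = {"aps": {"alert": {"title": title, "body": body}}}
    payload.update(extra or {})          # app-specific keys sit beside "aps"
    encoded = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(encoded) > MAX_PAYLOAD_BYTES:
        raise ValueError(f"payload is {len(encoded)} bytes, over the limit")
    return encoded
```

The usual consequence is that the push carries only an identifier and a short preview, and the app fetches the full content over HTTP once it wakes.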

Scaling considerations

Stateful connections (SSE, WebSocket) don’t scale like stateless HTTP. Key concerns:

Sticky routing. A message addressed to a client must reach the server holding that client’s connection. L7 load balancers can do this by hashing on a stable client or session key; alternatively, route through a shared layer (Redis pub/sub, Kafka) so any server can reach any connection.
Connection server fleet. Dedicated “connection servers” do nothing but hold connections and relay messages; business logic runs elsewhere. Common at scale (Slack, Discord, Meta chat).
Horizontal scaling. Number of concurrent connections, not QPS, is the dimension to size against. A single box typically handles 10K–100K idle WebSocket connections.
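
The shared-layer option can be sketched in memory. Here `Bus` is a stand-in for Redis pub/sub or Kafka, and `ConnectionServer`, `send_to_user`, and the `user:<id>` channel naming are hypothetical choices for illustration, not a real client API.

```python
from collections import defaultdict


class Bus:
    """In-memory stand-in for a pub/sub layer: channel -> callbacks."""

    def __init__(self):
        self._subs = defaultdict(list)

    def subscribe(self, channel, callback):
        self._subs[channel].append(callback)

    def publish(self, channel, message):
        for callback in list(self._subs[channel]):
            callback(message)
        return len(self._subs[channel])   # number of receivers


class ConnectionServer:
    """Holds sockets only; `delivered` records what it would write to them."""

    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.delivered = []

    def on_connect(self, user_id):
        # Subscribe to the user's channel when their socket lands on this box.
        self.bus.subscribe(
            f"user:{user_id}",
            lambda msg, u=user_id: self.delivered.append((u, msg)),
        )


def send_to_user(bus, user_id, message):
    """Business logic publishes without knowing which box holds the socket."""
    return bus.publish(f"user:{user_id}", message)
```

The design choice this illustrates is that senders address users, not servers, so connections can move between boxes on reconnect without any routing table to update.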

When to pick what

One-way server → client stream: SSE. Simpler than WebSocket; runs on HTTP; auto-reconnects.
Two-way real-time (chat, editing): WebSocket.
Background delivery to a mobile app that may be closed: push notifications, full stop. No other option works on iOS.
Peer-to-peer with latency under 100 ms: WebRTC data channel.
Weak SLA, simple client: short polling. Don’t over-engineer.
In interviews: the answer is rarely “use WebSocket.” It’s “SSE for the notification feed, push notifications for mobile background, WebSocket for the live collaborative layer” — the hybrid answer is the signal.
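
The rules above can be read as an ordered decision, applied per feature rather than per product. A small sketch, with the function and parameter names invented for illustration:

```python
def pick_pattern(*, app_may_be_closed: bool = False,
                 peer_to_peer: bool = False,
                 client_sends: bool = False,
                 server_pushes: bool = False) -> str:
    """Map one feature's requirements to a connection pattern."""
    if app_may_be_closed:
        return "push notifications"    # nothing else reaches a closed app
    if peer_to_peer:
        return "WebRTC data channel"
    if client_sends and server_pushes:
        return "WebSocket"             # both sides initiate messages
    if server_pushes:
        return "SSE"                   # one-way stream over HTTP
    return "short polling"             # weak SLA, simple client
```

Calling it once per feature (notification feed, mobile background delivery, collaborative layer) naturally produces the hybrid answer rather than a single transport for everything.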

Walkthrough: Designing a Chat System — the canonical WebSocket + connection-server design.
Walkthrough: Designing a News Feed — where push notifications and SSE both show up.

Tags: #system-design #concepts #real-time #connections #reference