What to Design When Building a Realtime Log Viewer

Suppose we are building a CI log viewer.

At first, the requirements look simple.

Users should be able to see logs in realtime while a job is running.
Multiple people should be able to view the same run / job.
When a user opens the page, they should also see logs that were already produced.
If the network connection drops, the viewer should recover.

It is tempting to summarize this as: "just stream logs over WebSocket."

But the hard part is not opening a WebSocket connection. The hard parts are the boundary between historical logs and live logs, fan-out to multiple viewers, reconnect / resume behavior, and preserving log order.

To think through those design points, I looked into how the browser version of the GitHub Actions live log view appears to work.

How to Make It Realtime

The first question is how to deliver updates to the client.

The common options are roughly:

polling
WebSocket
SSE

Strictly speaking, polling can be split into short polling and long polling, and HTTP streaming is another possible option. But for this article, I will keep the comparison at this level: polling, WebSocket, and SSE.

Polling

Polling is probably the easiest option to implement.

The client fetches the run / job / log state every few seconds. It is just HTTP request / response, so both the server and client behavior are easy to understand. It also tends to work well with existing authentication, load balancers, and infrastructure.

But it has weaknesses as a log viewer design.

More clients means more requests.
Requests still happen when there are no updates.
To make the UI feel more live, the interval tends to get shorter.
Expressing incremental updates can become awkward.
Re-fetching large logs repeatedly can become expensive.

Polling may be enough if the goal is only to refresh job / step state. If the goal is to live-tail log text at a fine granularity, it can become inefficient.
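
To put a rough number on the first few weaknesses, here is a small back-of-the-envelope sketch. The function name and the viewer counts are made up for illustration; the only point is that request volume scales with viewers and inversely with the interval, regardless of whether anything changed.

```python
def polling_requests_per_minute(viewers: int, interval_seconds: float) -> float:
    """Requests per minute generated by `viewers` clients polling at a fixed interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return viewers * 60.0 / interval_seconds


# 200 viewers polling every 3 seconds, whether or not the job printed anything.
print(polling_requests_per_minute(200, 3))  # 4000.0

# Shortening the interval to 1 second to feel more "live" triples the load.
print(polling_requests_per_minute(200, 1))  # 12000.0
```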

WebSocket

WebSocket lets the server push new log events to the client.

That fits a running log viewer well. The server can send an event when a new chunk of log output appears, and there is no need to make repeated requests when nothing is changing. It also maps naturally to fan-out, where multiple viewers receive the same event.
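
As a minimal sketch of what fan-out means here, the following in-memory hub delivers one published event to every viewer subscribed to a topic. FanOutHub and its methods are hypothetical names for illustration, and the callbacks stand in for real WebSocket connections; a production system would also need backpressure and error handling per connection.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A subscriber is any callable that accepts (topic, offset, data).
Subscriber = Callable[[str, int, str], None]


class FanOutHub:
    """Hypothetical in-memory hub: one publish reaches every viewer of a topic."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Subscriber]] = defaultdict(list)

    def subscribe(self, topic: str, subscriber: Subscriber) -> None:
        self._subscribers[topic].append(subscriber)

    def unsubscribe(self, topic: str, subscriber: Subscriber) -> None:
        if subscriber in self._subscribers.get(topic, []):
            self._subscribers[topic].remove(subscriber)

    def publish(self, topic: str, offset: int, data: str) -> int:
        """Deliver one event to all current subscribers; return how many received it."""
        subscribers = list(self._subscribers.get(topic, []))
        for subscriber in subscribers:
            subscriber(topic, offset, data)
        return len(subscribers)


hub = FanOutHub()
alice: List[str] = []
bob: List[str] = []
hub.subscribe("run:123:job:456:logs", lambda t, o, d: alice.append(d))
hub.subscribe("run:123:job:456:logs", lambda t, o, d: bob.append(d))

# The job produces one chunk; both viewers receive it from a single publish.
delivered = hub.publish("run:123:job:456:logs", 98766, "npm test...")
print(delivered, alice, bob)
```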

However, using WebSocket for everything creates other problems.

The server needs to manage connected clients.
Reconnect and resume behavior must be designed.
If the backlog is large, a huge amount of log text may be sent over WebSocket.
Logs produced before the client connected still need to be delivered somehow.

So WebSocket looks best as the transport for live updates, while historical logs are fetched over HTTP.

backlog: HTTP
live updates: WebSocket
boundary: last seen offset

SSE

SSE can also be used for live-tailing logs, because it sends events from the server to the client over a one-way stream.

It is closer to plain HTTP than WebSocket, and can be easier to handle from the browser side in some cases: EventSource reconnects on its own and reports the ID of the last event it received in a Last-Event-ID header. But as I understand it, if multiple people are watching the same log stream and we want strict retry / resume behavior after a failed connection, the server still needs to manage connected clients and map each reported ID back to a position in the stream.

In other words, both SSE and WebSocket eventually lead to the same core question: which viewer has seen which offset of which stream?

So the transport choice matters, but it is not the whole design. The bigger question is whether the log stream is modeled in a way that can be resumed.
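
To make the "resumable" part concrete for SSE, here is a small sketch that uses the log offset as the SSE event ID. The id: and data: fields and the Last-Event-ID request header are part of the SSE format itself; the function names and the choice to reuse the offset as the ID are my assumptions for illustration.

```python
from typing import Mapping, Optional


def format_sse_event(offset: int, data: str) -> str:
    """Encode one log chunk as an SSE event whose ID is the log offset."""
    lines = [f"id: {offset}"]
    # Each line of a multi-line payload needs its own "data:" field.
    lines.extend(f"data: {line}" for line in data.split("\n"))
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def resume_offset(headers: Mapping[str, str]) -> Optional[int]:
    """Read the offset a reconnecting client reports, or None on a first connect."""
    for name, value in headers.items():
        if name.lower() == "last-event-id":
            try:
                return int(value)
            except ValueError:
                return None
    return None


print(repr(format_sse_event(98766, "npm test\nok")))
print(resume_offset({"Last-Event-ID": "98766"}))
print(resume_offset({}))
```

The server side of the resume is then the same problem as with WebSocket: given an offset, find everything after it.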

What GitHub Seems to Use

For GitHub Actions, the CLI and the browser seem to make different choices.

The CLI leans toward polling. The browser version appears to lean toward WebSocket.

That difference is interesting.

The CLI is mainly a terminal view of which job / step is currently running, so refreshing state every few seconds is often enough. The browser log viewer, on the other hand, shows log text growing live in the page. That is a better fit for a push-based realtime system.

The CLI Uses Polling

First, gh run watch in GitHub CLI does not stream log text.

The source is in the cli/cli repository.

Looking at the implementation, watch periodically fetches workflow run / job / step state and redraws the terminal. The default interval is 3 seconds. It mainly uses APIs like these:

GET /repos/{owner}/{repo}/actions/runs/{run_id}
GET /repos/{owner}/{repo}/actions/runs/{run_id}/jobs
GET /repos/{owner}/{repo}/actions/runs/{run_id}/attempts/{attempt}/jobs
GET /repos/{owner}/{repo}/check-runs/{job_id}/annotations

This is enough to show job / step progress, but it is not a live tail of log text.
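
The general shape of such a watch loop can be sketched as below. This is not the gh implementation (which is written in Go); watch_run and fetch_run are hypothetical names, and fetch_run stands in for a call to the run endpoint above. The only detail taken from the REST API is that a run object carries a status field that ends up as "completed".

```python
import time
from typing import Callable, Dict, List


def watch_run(
    fetch_run: Callable[[], Dict[str, str]],
    interval_seconds: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    max_polls: int = 1000,
) -> List[str]:
    """Poll run state until it completes; return the distinct statuses observed in order."""
    seen: List[str] = []
    for _ in range(max_polls):
        status = fetch_run()["status"]
        if not seen or seen[-1] != status:
            seen.append(status)  # A real CLI would redraw the terminal here.
        if status == "completed":
            break
        sleep(interval_seconds)
    return seen


# Fake responses instead of real HTTP calls, and a no-op sleep so the demo is instant.
responses = iter(
    [{"status": "queued"}, {"status": "in_progress"},
     {"status": "in_progress"}, {"status": "completed"}]
)
statuses = watch_run(lambda: next(responses), sleep=lambda _seconds: None)
print(statuses)  # ['queued', 'in_progress', 'completed']
```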

gh run view --log is different. It does return log text, but it is closer to downloading logs once they are available than to tailing them as they are written.

GET /repos/{owner}/{repo}/actions/runs/{run_id}/logs
GET /repos/{owner}/{repo}/actions/runs/{run_id}/attempts/{attempt}/logs
GET /repos/{owner}/{repo}/actions/jobs/{job_id}/logs

So, from the public Actions REST API alone, I did not find a live log stream equivalent to what the browser UI appears to use.

The Browser Uses WebSocket

The browser version appears to use WebSocket, but it is not immediately obvious from the page context.

When I inspected the GitHub.com Actions page in DevTools, I initially could not find an obvious realtime request.

There was no text/event-stream.
There was no obvious WebSocket created by the page itself.
It did not look like log text was being fetched through fetch streaming.
The most visible pending request was a worker script.

I also analyzed a HAR file, but it did not include the WebSocket frames I needed. This was the part that took a while to untangle.

The request that stood out was this worker:

https://github.com/assets-cdn/worker/socket-worker-xxx.js

That file was a wrapper. The actual code was imported from github.githubassets.com. There was also a source map, and from that I could see source names like these:

packages/socket-worker/socket-worker.ts
packages/alive/session.ts
node_modules/@github/alive-client/dist/alive-session.js
node_modules/@github/stable-socket/dist/index.js
node_modules/@github/alive-client/dist/subscription-set.js

From the names, this looks less like an Actions-specific log implementation and more like part of GitHub.com's general "Alive" realtime client.

The shape appears to be roughly this:

GitHub Actions page
  |
  | postMessage({connect, subscribe, unsubscribe, ...})
  v
socket-worker SharedWorker
  |
  | WebSocket
  v
GitHub Alive backend

This also explains why monkey-patching window.WebSocket or window.fetch from the page did not catch it. The WebSocket is created in the SharedWorker global scope, not in the page's JavaScript context.

Modeling the Log Stream

The worker receives messages such as connect, subscribe, and unsubscribe from the page. It then sends subscription messages as JSON over the WebSocket.

{"subscribe": {"SIGNED_TOPIC": OFFSET}}

Unsubscribe looks like this:

{"unsubscribe": ["SIGNED_TOPIC"]}

Messages from the server include at least ack and msg. The ack updates offsets for subscribed topics. The msg uses the channel / topic to find subscribers and forward the event.
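
As a rough sketch of that flow, the code below builds subscribe / unsubscribe frames in the shapes shown above and keeps per-topic offsets up to date as server messages arrive. The frame shapes come from what I observed; the layout of the server messages ("type", "offsets", "topic", "offset", "data") is a placeholder I made up for illustration and is not GitHub's wire format.

```python
import json
from typing import Callable, Dict, List


def subscribe_frame(offsets: Dict[str, int]) -> str:
    """Build a subscribe frame in the shape shown above: topic -> offset."""
    return json.dumps({"subscribe": offsets})


def unsubscribe_frame(topics: List[str]) -> str:
    """Build an unsubscribe frame in the shape shown above: a list of topics."""
    return json.dumps({"unsubscribe": topics})


class SubscriptionState:
    """Tracks per-topic offsets and forwards messages to topic handlers."""

    def __init__(self) -> None:
        self.offsets: Dict[str, int] = {}
        self.handlers: Dict[str, List[Callable[[str], None]]] = {}

    def handle(self, raw: str) -> None:
        message = json.loads(raw)
        kind = message.get("type")  # Hypothetical field name.
        if kind == "ack":
            # An ack updates the offsets of subscribed topics.
            self.offsets.update(message["offsets"])
        elif kind == "msg":
            # A msg is routed by topic to whoever subscribed to it.
            topic = message["topic"]
            self.offsets[topic] = message["offset"]
            for handler in self.handlers.get(topic, []):
                handler(message["data"])


state = SubscriptionState()
received: List[str] = []
state.handlers["SIGNED_TOPIC"] = [received.append]
state.handle('{"type": "ack", "offsets": {"SIGNED_TOPIC": 98765}}')
state.handle('{"type": "msg", "topic": "SIGNED_TOPIC", "offset": 98766, "data": "npm test..."}')

# After a disconnect, the stored offsets are exactly what a resubscribe needs.
print(subscribe_frame(state.offsets))
```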

The important detail here is that topics and offsets appear in the protocol.

If the design of a realtime log viewer only says "send log lines over WebSocket", it leaves these questions unanswered:

How do we fetch logs that were produced before the page opened?
What happens if new logs appear while the client is fetching a historical snapshot?
Where should a reconnecting viewer resume from?
How do we dedupe duplicate deliveries?
How do we detect missing events?
How do we avoid excessive backend load when multiple viewers watch the same job?

This suggests modeling logs as an append-only stream where each event or chunk has a monotonically increasing offset.

topic: run:123:job:456:logs
offset: 98766
data: "npm test..."

The client remembers the last offset it rendered. On subscribe or reconnect, it sends that offset to the server.

{
  "subscribe": {
    "run:123:job:456:logs": 98765
  }
}

The meaning is: "subscribe to this topic starting after offset 98765."
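
On the server side, "starting after offset N" needs a store that can answer exactly that question. Here is a minimal in-memory sketch, assuming one offset per chunk and offsets that increase by one; AppendOnlyLog, append, and read_after are hypothetical names, and a real backlog would be durable rather than a Python list.

```python
from typing import List, Tuple


class AppendOnlyLog:
    """Hypothetical per-topic log: chunks are only appended, never edited."""

    def __init__(self, first_offset: int = 1) -> None:
        self._first = first_offset
        self._chunks: List[str] = []

    @property
    def last_offset(self) -> int:
        """Offset of the newest chunk (first_offset - 1 while the log is empty)."""
        return self._first + len(self._chunks) - 1

    def append(self, data: str) -> int:
        """Store a chunk and return the offset assigned to it."""
        self._chunks.append(data)
        return self.last_offset

    def read_after(self, offset: int) -> List[Tuple[int, str]]:
        """Return every (offset, chunk) strictly after the given offset."""
        start = max(offset + 1, self._first)
        return [(o, self._chunks[o - self._first]) for o in range(start, self.last_offset + 1)]


log = AppendOnlyLog(first_offset=98765)
log.append("npm install...")  # offset 98765
log.append("npm test...")     # offset 98766

# A viewer that last rendered 98765 only gets what came after it.
print(log.read_after(98765))  # [(98766, 'npm test...')]
```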

Because GitHub's worker also had per-topic offsets and ack handling, the transport layer appears to be designed with resume / catch-up behavior in mind.

Offsets make several things easier.

The client knows which range of logs to fetch.
HTTP and WebSocket responsibilities can be separated.
Reconnect can resume from the last seen offset.
Different viewers can have different connection states and progress, but still converge to the same view.
Duplicate delivery can be deduped on the client.
Missing events are easier to detect.
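
The last two points can be sketched in a few lines on the client. This assumes offsets count events and increase by exactly one; if offsets were byte positions instead, the gap check would compare against the previous offset plus the chunk length. OffsetTracker is a hypothetical name.

```python
class OffsetTracker:
    """Classifies incoming events by comparing their offset to the last one rendered."""

    def __init__(self, last_seen: int) -> None:
        self.last_seen = last_seen

    def accept(self, offset: int) -> str:
        if offset <= self.last_seen:
            return "duplicate"  # Already rendered: drop it.
        if offset > self.last_seen + 1:
            return "gap"        # Something is missing: refetch from last_seen.
        self.last_seen = offset
        return "ok"             # The next expected event: render it.


tracker = OffsetTracker(last_seen=98765)
print(tracker.accept(98766))  # ok
print(tracker.accept(98766))  # duplicate
print(tracker.accept(98769))  # gap
print(tracker.last_seen)      # 98766
```

Note that a gap does not advance last_seen, so the client still knows where to resume from when it refetches.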

The most important benefit is that HTTP and WebSocket can be joined cleanly.

For example, the client can first fetch offsets 0 through 98765 over HTTP, then subscribe over WebSocket starting after offset 98765. Or it can subscribe first, receive an ack for the current offset, and fetch history up to that offset over HTTP.

With offsets, historical log retrieval and live event subscription can be treated as one continuous view.
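
Here is a minimal sketch of the subscribe-first variant, assuming the same one-offset-per-event model. Live events that arrive while the history request is in flight are buffered, and the offset check makes the overlap between history and live harmless. LogView and fetch_history are hypothetical names, fetch_history stands in for the HTTP call, and the sketch does not handle gaps between history and the buffered events.

```python
from typing import Callable, List, Tuple

Event = Tuple[int, str]  # (offset, data)


class LogView:
    """Joins an HTTP backlog and live events into one ordered, duplicate-free view."""

    def __init__(self) -> None:
        self.lines: List[Event] = []
        self.last_seen = 0
        self._buffer: List[Event] = []
        self._catching_up = True

    def on_live_event(self, offset: int, data: str) -> None:
        if self._catching_up:
            self._buffer.append((offset, data))  # Hold until history is rendered.
        else:
            self._render(offset, data)

    def load_backlog(self, fetch_history: Callable[[int], List[Event]]) -> None:
        for offset, data in fetch_history(self.last_seen):
            self._render(offset, data)
        for offset, data in sorted(self._buffer):
            self._render(offset, data)  # Overlap with history is dropped in _render.
        self._buffer.clear()
        self._catching_up = False

    def _render(self, offset: int, data: str) -> None:
        if offset <= self.last_seen:
            return  # Already shown via the other path.
        self.lines.append((offset, data))
        self.last_seen = offset


view = LogView()
# Live events 4 and 5 arrive while the history request is still in flight.
view.on_live_event(4, "line 4")
view.on_live_event(5, "line 5")
# By the time the server answers, its history already includes offset 4.
view.load_backlog(lambda after: [(1, "line 1"), (2, "line 2"), (3, "line 3"), (4, "line 4")])
view.on_live_event(6, "line 6")
print([offset for offset, _ in view.lines])  # [1, 2, 3, 4, 5, 6]
```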

Summary

The browser version of the GitHub Actions live log view does not appear to be plain polling through the public REST API. It looks like it uses GitHub's internal realtime system through a SharedWorker and a WebSocket.

But the main lesson is not "use WebSocket to make it realtime."

For a realtime log viewer, the important design points are:

Treat logs as an append-only stream.
Subscribe by topic.
Give each event / chunk an offset.
Represent the boundary between backlog and live logs with a last seen offset.
Resume reconnects from the last seen offset.
Fan out to multiple viewers in the backend.
Make UI states such as live / catching up / reconnecting explicit.
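
For the last point, one way to make those states explicit is a small transition table. The state names follow the list above; the event names ("caught_up", "disconnected", "connected") and the rule that a reconnect goes back through catching up before going live are my assumptions for illustration.

```python
from enum import Enum


class ViewerState(Enum):
    CATCHING_UP = "catching up"
    LIVE = "live"
    RECONNECTING = "reconnecting"


# (current state, event) -> next state. Unknown pairs leave the state unchanged.
TRANSITIONS = {
    (ViewerState.CATCHING_UP, "caught_up"): ViewerState.LIVE,
    (ViewerState.CATCHING_UP, "disconnected"): ViewerState.RECONNECTING,
    (ViewerState.LIVE, "disconnected"): ViewerState.RECONNECTING,
    # After a reconnect the viewer must first replay from its last seen offset.
    (ViewerState.RECONNECTING, "connected"): ViewerState.CATCHING_UP,
}


def next_state(state: ViewerState, event: str) -> ViewerState:
    return TRANSITIONS.get((state, event), state)


state = ViewerState.CATCHING_UP
for event in ["caught_up", "disconnected", "connected", "caught_up"]:
    state = next_state(state, event)
    print(event, "->", state.value)
```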

The core shape is something like:

append-only log + durable backlog + resumable offset + live pub/sub

WebSocket is only one transport in that design. The quality of the log viewer depends more on how carefully the boundary between history and live updates is modeled.