streamstack/ARCHITECTURE.md
unkinben 2309e9f43a Initial commit — StreamStack v1
Five-service streaming platform: auth, catalogue, streaming, ingest, thumbnailer.
Includes React frontend served by nginx, NATS JetStream event bus, aiobotocore
async S3, PyAV video metadata + thumbnail extraction, service-to-service JWT auth,
and a full unit + e2e test suite.
2026-05-04 22:16:39 +10:00

StreamStack Architecture

Services

| Service | Replicas | Backing stores | Responsibility |
|---|---|---|---|
| auth | 2 | Postgres, NATS KV | User accounts, JWT issue/refresh/revoke |
| catalogue | 2 | Postgres, NATS pub | Media metadata CRUD, stream token requests |
| streaming | 2 | NATS KV, S3 | Token issuance, byte-range video delivery |
| ingest | 2 | S3, (catalogue HTTP) | Upload video, extract metadata/thumbnail, register in catalogue |
| nginx | 1 | | Reverse proxy + React SPA |

Infrastructure

| Component | Purpose |
|---|---|
| Postgres | Persistent store for user accounts (auth) and media metadata (catalogue) |
| NATS JetStream KV | Short-lived stream tokens (1h TTL); revoked-token list for JWT blacklisting |
| S3 / MinIO | Binary storage: media/ bucket for video files, thumbnails/ bucket for JPEG thumbnails |

Request flows

Login

Browser → nginx → auth
  auth reads Postgres (verify credentials)
  auth writes nothing to NATS
  auth returns access_token (JWT, RS256, 30min) + refresh_token (7 days)
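
The two lifetimes in the login flow can be sketched as plain claim arithmetic. This is a minimal illustration, not the auth service's code: the claim names (`sub`, `role`, `iat`, `exp`) and both helper functions are assumptions, and signing is left out.

```python
from datetime import datetime, timedelta, timezone

ACCESS_TTL = timedelta(minutes=30)  # access_token lifetime from the login flow
REFRESH_TTL = timedelta(days=7)     # refresh_token lifetime from the login flow


def build_access_claims(user_id: str, role: str, now: datetime) -> dict:
    """Build the payload for an access token (claim names are assumptions)."""
    return {
        "sub": user_id,
        "role": role,
        "iat": int(now.timestamp()),
        "exp": int((now + ACCESS_TTL).timestamp()),
    }


def refresh_expiry(now: datetime) -> datetime:
    """Return the moment at which a refresh token issued at `now` expires."""
    return now + REFRESH_TTL


if __name__ == "__main__":
    issued = datetime.now(timezone.utc)
    print(build_access_claims("user-1", "admin", issued))
    print(refresh_expiry(issued).isoformat())
```

The resulting dict would then be signed with the RS256 private key by whatever JWT library the auth service uses.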

Browse catalogue

Browser → nginx → catalogue
  catalogue reads Postgres (published media items)
  returns list of metadata (title, duration, thumbnail_s3_key, etc.)
  no NATS, no S3

Request a stream token

Browser → nginx → catalogue  POST /catalogue/{id}/stream-token
  catalogue reads Postgres → gets s3_key + size_bytes for the item
  catalogue → streaming  POST /stream/token  {media_id, s3_key, size_bytes}
    streaming verifies JWT (public key, local)
    streaming writes NATS KV: token → "media_id|user_id|timestamp|s3_key|size_bytes"
    streaming returns {stream_url: "/api/v1/stream/<token>"}
  catalogue returns stream_url to browser

Token TTL: 1 hour. After that, NATS discards it automatically.
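
The pipe-delimited KV value shown above can be modelled as a small record type. This is a sketch under assumptions: `StreamTokenRecord` and `new_token` are hypothetical names, the timestamp is taken to be Unix seconds, and the token format (a random URL-safe string) is not specified by this document.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamTokenRecord:
    """Value stored under a stream token: media_id|user_id|timestamp|s3_key|size_bytes."""
    media_id: str
    user_id: str
    issued_at: int  # assumed to be Unix seconds
    s3_key: str
    size_bytes: int

    def encode(self) -> str:
        fields = [self.media_id, self.user_id, str(self.issued_at),
                  self.s3_key, str(self.size_bytes)]
        # A "|" inside any field would make the value ambiguous to decode.
        if any("|" in field for field in fields):
            raise ValueError("fields must not contain '|'")
        return "|".join(fields)

    @classmethod
    def decode(cls, raw: str) -> "StreamTokenRecord":
        # Unpacking raises ValueError if the value does not have exactly 5 fields.
        media_id, user_id, issued_at, s3_key, size_bytes = raw.split("|")
        return cls(media_id, user_id, int(issued_at), s3_key, int(size_bytes))


def new_token() -> str:
    """Generate an unguessable token to use as the KV key (format is an assumption)."""
    return secrets.token_urlsafe(32)
```

Because the value carries s3_key and size_bytes, decoding it is all streaming needs to serve a range request, and expiry is left to the 1-hour TTL on the KV entry.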

Play video (each range request)

Browser → nginx → streaming  GET /stream/<token>  Range: bytes=X-Y
  streaming reads NATS KV (resolve token → s3_key + size_bytes)
  streaming → S3  GET object with byte range  (aiobotocore, fully async)
  streams bytes back to browser
  no Postgres, no catalogue HTTP call

The browser sends many range requests for a single video. Each one costs only a NATS lookup + S3 range-get.
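
Since size_bytes comes back with the token, the byte range can be resolved without touching S3 first. The sketch below shows one way to turn a single-range `Range` header into an inclusive byte span; `resolve_range` and `content_range` are hypothetical helpers, and multi-range requests are deliberately rejected.

```python
import re
from typing import Optional, Tuple

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def resolve_range(header: Optional[str], size_bytes: int) -> Tuple[int, int]:
    """Return the inclusive (start, end) span, or raise ValueError if unsatisfiable."""
    if size_bytes <= 0:
        raise ValueError("object is empty")
    if header is None:
        return 0, size_bytes - 1  # no Range header: whole object
    match = _RANGE_RE.match(header.strip())
    if match is None:
        raise ValueError(f"unsupported Range header: {header!r}")
    first, last = match.groups()
    if first == "" and last == "":
        raise ValueError("empty range")
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the object.
        length = int(last)
        if length == 0:
            raise ValueError("unsatisfiable range")
        return max(size_bytes - length, 0), size_bytes - 1
    start = int(first)
    # Open-ended "bytes=X-" or an end past EOF both clamp to the last byte.
    end = size_bytes - 1 if last == "" else min(int(last), size_bytes - 1)
    if start > end:
        raise ValueError("unsatisfiable range")
    return start, end


def content_range(start: int, end: int, size_bytes: int) -> str:
    """Value for the Content-Range header of a 206 response."""
    return f"bytes {start}-{end}/{size_bytes}"
```

The resolved span can be passed on as `Range=f"bytes={start}-{end}"` in the S3 GetObject call, and a `ValueError` would map to a 416 response.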

Ingest a video (admin only)

curl/frontend → nginx → ingest  POST /ingest/upload  (multipart)
  ingest verifies JWT (admin role required)
  ingest → S3  upload file → media/{uuid}.ext
  ingest → S3  head_object → size_bytes
  ingest runs PyAV (in threadpool):
    - reads S3 via range-gets → extracts duration, codec, width, height, fps
    - decodes first video frame → JPEG → S3 thumbnails/{uuid}.jpg
  ingest → catalogue  POST /catalogue/  {s3_key, size_bytes, metadata...}
    catalogue writes Postgres
    catalogue publishes NATS: catalogue.events.media.published
  returns catalogue item JSON
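
The key layout used in the ingest flow (media/{uuid}.ext and thumbnails/{uuid}.jpg) can be derived from the uploaded filename as sketched below. `object_keys` is a hypothetical helper, and lowercasing the extension and rejecting extensionless files are assumptions rather than documented behaviour.

```python
import uuid
from pathlib import PurePosixPath
from typing import Optional, Tuple


def object_keys(original_filename: str,
                media_uuid: Optional[uuid.UUID] = None) -> Tuple[str, str]:
    """Return (video_key, thumbnail_key) that share one UUID."""
    media_uuid = media_uuid or uuid.uuid4()
    ext = PurePosixPath(original_filename).suffix.lower()
    if not ext:
        raise ValueError("uploaded file has no extension")
    return f"media/{media_uuid}{ext}", f"thumbnails/{media_uuid}.jpg"


if __name__ == "__main__":
    video_key, thumb_key = object_keys("holiday.mp4")
    print(video_key, thumb_key)
```

Sharing one UUID between the two keys lets the thumbnail be located from the video key alone.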

JWT flow

Auth uses RS256 (asymmetric). The private key signs tokens; all other services hold only the public key and verify tokens locally, so no request needs an HTTP call to auth.

Revoked tokens are stored as keys in a NATS KV bucket (revoked-tokens). Streaming checks this bucket when it issues a stream token, not on every range request, so a stream token issued before a revocation remains usable until its 1-hour TTL expires.
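
The revocation check at stream-token issue can be sketched as follows. It assumes the JWT signature has already been verified with the public key, that the revoked-tokens bucket is keyed by the token's `jti` claim, and it uses an in-memory stand-in for NATS KV; all names here are hypothetical.

```python
from typing import Dict, Optional, Protocol


class KeyValueStore(Protocol):
    def get(self, key: str) -> Optional[bytes]: ...


class InMemoryKV:
    """Stand-in for the revoked-tokens KV bucket, for illustration only."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


def authorize_token_issue(claims: dict, revoked: KeyValueStore, now: int) -> str:
    """Run once per stream-token request, after signature verification.

    Returns the user id, or raises PermissionError if the JWT is expired
    or its id (assumed to be the `jti` claim) is in the revoked bucket.
    """
    if claims["exp"] <= now:
        raise PermissionError("token expired")
    if revoked.get(claims["jti"]) is not None:
        raise PermissionError("token revoked")
    return claims["sub"]
```

Range requests skip this function entirely, which is what keeps each one down to a token lookup and an S3 range-get.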


Data ownership

Postgres       auth        users, hashed passwords, roles
               catalogue   media items, all metadata fields

NATS KV        streaming   stream tokens (s3_key + size_bytes embedded)
               auth        revoked JWT list

S3             ingest      video files  →  media/
               ingest      thumbnails   →  thumbnails/
               (read)      streaming    reads media/ for range delivery
               (read)      ingest/PyAV  reads media/ for metadata extraction

Inter-service HTTP calls

| Caller | Callee | When |
|---|---|---|
| catalogue | streaming | Stream token request; passes s3_key + size_bytes |
| ingest | catalogue | After upload; registers the media item |

These are the only service-to-service HTTP calls. Everything else is either a service accessing its own database or NATS pub/sub; services never query each other's databases.