Initial commit — StreamStack v1

Five-service streaming platform: auth, catalogue, streaming, ingest, thumbnailer.
Includes React frontend served by nginx, NATS JetStream event bus, aiobotocore
async S3, PyAV video metadata + thumbnail extraction, service-to-service JWT auth,
and a full unit + e2e test suite.
2026-05-04 22:16:39 +10:00
commit 2309e9f43a
80 changed files with 6339 additions and 0 deletions
# StreamStack Architecture
## Services
| Service | Replicas | Backing stores | Responsibility |
|---------|----------|----------------|----------------|
| **auth** | 2 | Postgres, NATS KV | User accounts, JWT issue/refresh/revoke |
| **catalogue** | 2 | Postgres, NATS pub | Media metadata CRUD, stream token requests |
| **streaming** | 2 | NATS KV, S3 | Token issuance, byte-range video delivery |
| **ingest** | 2 | S3, (catalogue HTTP) | Upload video, extract metadata/thumbnail, register in catalogue |
| **nginx** | 1 | — | Reverse proxy + React SPA |
## Infrastructure
| Component | Purpose |
|-----------|---------|
| **Postgres** | Persistent store for user accounts (auth) and media metadata (catalogue) |
| **NATS JetStream KV** | Short-lived stream tokens (1h TTL); revoked-token list for JWT blacklisting |
| **S3 / MinIO** | Binary storage — `media/` bucket for video files, `thumbnails/` bucket for JPEG thumbnails |
---
## Request flows
### Login
```
Browser → nginx → auth
auth reads Postgres (verify credentials)
auth writes nothing to NATS
auth returns access_token (JWT, RS256, 30min) + refresh_token (7 days)
```
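The two lifetimes in the login flow can be made concrete with a small sketch of the claim payloads. This is an illustration only: `build_claims`, the `role` claim, and the use of a random `jti` are assumptions, not taken from the auth service, and signing with the RS256 private key is left out.

```python
from datetime import datetime, timedelta, timezone
import uuid

ACCESS_TTL = timedelta(minutes=30)  # access_token lifetime from the login flow
REFRESH_TTL = timedelta(days=7)     # refresh_token lifetime from the login flow


def build_claims(user_id: str, role: str, now: datetime) -> tuple[dict, dict]:
    """Return (access_claims, refresh_claims), unsigned.

    Hypothetical helper: claim names other than the registered JWT
    claims (sub, iat, exp, jti) are assumptions.
    """
    iat = int(now.timestamp())
    access = {
        "sub": user_id,
        "role": role,
        "iat": iat,
        "exp": iat + int(ACCESS_TTL.total_seconds()),
        "jti": uuid.uuid4().hex,  # unique ID, usable as a revocation key
    }
    refresh = {
        "sub": user_id,
        "iat": iat,
        "exp": iat + int(REFRESH_TTL.total_seconds()),
        "jti": uuid.uuid4().hex,
    }
    return access, refresh


access, refresh = build_claims("user-1", "admin", datetime.now(timezone.utc))
```

Giving each token its own `jti` is one way to let a revocation list refer to a single token without storing the token itself.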
### Browse catalogue
```
Browser → nginx → catalogue
catalogue reads Postgres (published media items)
returns list of metadata (title, duration, thumbnail_s3_key, etc.)
no NATS, no S3
```
### Request a stream token
```
Browser → nginx → catalogue POST /catalogue/{id}/stream-token
catalogue reads Postgres → gets s3_key + size_bytes for the item
catalogue → streaming POST /stream/token {media_id, s3_key, size_bytes}
streaming verifies JWT (public key, local)
streaming writes NATS KV: token → "media_id|user_id|timestamp|s3_key|size_bytes"
streaming returns {stream_url: "/api/v1/stream/<token>"}
catalogue returns stream_url to browser
```
Token TTL: 1 hour. After that, NATS discards it automatically.
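The pipe-delimited value written to NATS KV can be sketched as a small record type. This is a minimal sketch, assuming the timestamp is Unix seconds and that `media_id` and `user_id` never contain `|`; the class name, `new_token`, and the token length are hypothetical.

```python
from dataclasses import dataclass
import secrets


@dataclass(frozen=True)
class StreamTokenRecord:
    media_id: str
    user_id: str
    timestamp: int  # assumed to be Unix seconds
    s3_key: str
    size_bytes: int

    def encode(self) -> str:
        """Serialise as media_id|user_id|timestamp|s3_key|size_bytes."""
        return "|".join(
            [self.media_id, self.user_id, str(self.timestamp),
             self.s3_key, str(self.size_bytes)]
        )

    @classmethod
    def decode(cls, raw: str) -> "StreamTokenRecord":
        # Take three fields from the left and one from the right, so an
        # s3_key that happens to contain "|" still round-trips.
        media_id, user_id, ts, rest = raw.split("|", 3)
        s3_key, size = rest.rsplit("|", 1)
        return cls(media_id, user_id, int(ts), s3_key, int(size))


def new_token() -> str:
    """Hypothetical token generator: an opaque, URL-safe random string."""
    return secrets.token_urlsafe(32)


record = StreamTokenRecord("m1", "u1", 1700000000, "media/abc.mp4", 1024)
assert StreamTokenRecord.decode(record.encode()) == record
```

Embedding `s3_key` and `size_bytes` in the value is what lets the streaming service answer range requests without calling catalogue or Postgres.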
### Play video (each range request)
```
Browser → nginx → streaming GET /stream/<token> Range: bytes=X-Y
streaming reads NATS KV (resolve token → s3_key + size_bytes)
streaming → S3 GET object with byte range (aiobotocore, fully async)
streams bytes back to browser
no Postgres, no catalogue HTTP call
```
The browser sends many range requests for a single video. Each one costs only a NATS lookup + S3 range-get.
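Because `size_bytes` is stored with the token, the `Range` header can be resolved without asking S3 for the object size. The following is a sketch of that resolution for single-range requests; `resolve_range` and `content_range` are hypothetical names, and the streaming service may handle edge cases differently.

```python
from typing import Optional


def resolve_range(header: Optional[str], size_bytes: int) -> tuple[int, int]:
    """Return inclusive (start, end) offsets for a single-range header.

    Raises ValueError for malformed or unsatisfiable ranges, which a
    handler would typically map to HTTP 416.
    """
    if size_bytes <= 0:
        raise ValueError("object is empty")
    if not header:
        return 0, size_bytes - 1  # no Range header: whole object
    unit, _, spec = header.partition("=")
    if unit.strip().lower() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is supported")
    first, sep, last = spec.strip().partition("-")
    if not sep:
        raise ValueError("malformed range")
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the object.
        if not last.isdigit() or int(last) == 0:
            raise ValueError("malformed suffix range")
        return max(size_bytes - int(last), 0), size_bytes - 1
    if not first.isdigit() or (last and not last.isdigit()):
        raise ValueError("malformed range")
    start = int(first)
    # Open-ended or oversized ranges are clamped to the last byte.
    end = min(int(last), size_bytes - 1) if last else size_bytes - 1
    if start > end:
        raise ValueError("range not satisfiable")
    return start, end


def content_range(start: int, end: int, size_bytes: int) -> str:
    """Value for the Content-Range response header."""
    return f"bytes {start}-{end}/{size_bytes}"


start, end = resolve_range("bytes=500-", 1000)
s3_range = f"bytes={start}-{end}"  # same syntax S3 accepts for a ranged GET
```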
### Ingest a video (admin only)
```
curl/frontend → nginx → ingest POST /ingest/upload (multipart)
ingest verifies JWT (admin role required)
ingest → S3 upload file → media/{uuid}.ext
ingest → S3 head_object → size_bytes
ingest runs PyAV (in threadpool):
- reads S3 via range-gets → extracts duration, codec, width, height, fps
- decodes first video frame → JPEG → S3 thumbnails/{uuid}.jpg
ingest → catalogue POST /catalogue/ {s3_key, size_bytes, metadata...}
catalogue writes Postgres
catalogue publishes NATS: catalogue.events.media.published
returns catalogue item JSON
```
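Two details of the ingest flow lend themselves to a short sketch: both S3 keys share one UUID, and the blocking PyAV work runs off the event loop. The code below is a sketch under those assumptions. `probe_blocking` is a hypothetical stand-in that does no real decoding, and `asyncio.to_thread` is one possible way to use a thread pool, not necessarily what the ingest service does.

```python
import asyncio
import os
import uuid


def derive_keys(filename: str) -> tuple[str, str]:
    """Return (media_key, thumbnail_key) sharing one fresh UUID."""
    ext = os.path.splitext(filename)[1].lower()  # includes the leading dot
    item_id = uuid.uuid4()
    return f"media/{item_id}{ext}", f"thumbnails/{item_id}.jpg"


def probe_blocking(media_key: str) -> dict:
    """Hypothetical stand-in for the blocking PyAV metadata extraction."""
    return {"s3_key": media_key, "duration": None, "codec": None}


async def ingest(filename: str) -> dict:
    media_key, thumb_key = derive_keys(filename)
    # Run the blocking call in the default thread pool so the event loop
    # can keep serving other requests while it works.
    metadata = await asyncio.to_thread(probe_blocking, media_key)
    return {**metadata, "thumbnail_s3_key": thumb_key}


item = asyncio.run(ingest("holiday.MP4"))
```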
---
## JWT flow
Auth uses **RS256** (asymmetric). The private key signs tokens; all other services hold only the public key and verify locally — no auth HTTP call on every request.
Revoked tokens are stored as keys in a NATS KV bucket (`revoked-tokens`). Streaming checks this bucket on token issue, not on every range request, so a stream token issued before a revocation stays usable until its 1-hour TTL expires.
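The placement of the revocation check can be sketched with in-memory stand-ins for the two KV buckets. Everything here is illustrative: `InMemoryKV` is a hypothetical substitute for a NATS KV bucket (with no TTL handling), and keying the revoked list by the JWT's `jti` is an assumption.

```python
import secrets
from typing import Optional


class InMemoryKV:
    """Hypothetical stand-in for a NATS KV bucket (no TTL handling)."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def issue_stream_token(jti: str, record: str,
                       revoked: InMemoryKV, tokens: InMemoryKV) -> str:
    """Check the revoked list once, then store the stream token."""
    if revoked.get(jti) is not None:
        raise PermissionError("JWT has been revoked")
    token = secrets.token_urlsafe(32)
    tokens.put(token, record)
    return token


def resolve_stream_token(token: str, tokens: InMemoryKV) -> str:
    """Range-request path: one KV lookup, no revocation check."""
    record = tokens.get(token)
    if record is None:
        raise KeyError("unknown or expired stream token")
    return record


revoked, tokens = InMemoryKV(), InMemoryKV()
token = issue_stream_token("jti-1", "m1|u1|1700000000|media/a.mp4|1024",
                           revoked, tokens)
revoked.put("jti-1", "1")             # the JWT is revoked afterwards
resolve_stream_token(token, tokens)   # the existing stream token still resolves
```

The trade-off is a cheaper hot path in exchange for a bounded window in which an already issued stream token outlives the JWT that requested it.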
---
## Data ownership
```
Postgres auth users, hashed passwords, roles
catalogue media items, all metadata fields
NATS KV streaming stream tokens (s3_key + size_bytes embedded)
auth revoked JWT list
S3 ingest video files → media/
ingest thumbnails → thumbnails/
(read) streaming reads media/ for range delivery
(read) ingest/PyAV reads media/ for metadata extraction
```
---
## Inter-service HTTP calls
| Caller | Callee | When |
|--------|--------|------|
| catalogue | streaming | Stream token request — passes s3_key + size_bytes |
| ingest | catalogue | After upload — registers the media item |
These are the only synchronous service-to-service calls. All other cross-service communication goes through NATS pub/sub. Each service accesses only its own database; services do **not** query each other's databases.