Initial commit — StreamStack v1

Five-service streaming platform: auth, catalogue, streaming, ingest, thumbnailer.
Includes React frontend served by nginx, NATS JetStream event bus, aiobotocore
async S3, PyAV video metadata + thumbnail extraction, service-to-service JWT auth,
and a full unit + e2e test suite.
2026-05-04 22:16:39 +10:00
commit 2309e9f43a
80 changed files with 6339 additions and 0 deletions
@@ -0,0 +1,16 @@
# TODO
- Transcode MKV uploads to MP4 during ingest — browsers (Firefox/Chrome) cannot natively play MKV containers, so Jellyfish-style uploads fail to load in the video player.
- IMDb metadata microservice — subscribe to `catalogue.events.media.published` (durable consumer `"imdb-fetcher"`), look up title/year against the IMDb API, patch catalogue with enriched metadata (rating, genre, plot, cast).
- Subtitle fetcher microservice — subscribe to `catalogue.events.media.published` (durable consumer `"subtitle-fetcher"`), fetch subtitles (e.g. OpenSubtitles API), store as `.vtt` in S3, update catalogue with `subtitle_s3_key`. Frontend `<video>` supports `<track>` elements for native subtitle display.
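
The `<track>` element expects WebVTT, so if the subtitle source returns SubRip (`.srt`) files (an assumption here, not something the services confirm yet), the fetcher would need a conversion step before the S3 upload. A minimal sketch using only the standard library; `srt_to_vtt` is a hypothetical helper name:

```python
import re

# Matches one SRT timestamp such as "00:00:01,000" (comma before milliseconds).
_SRT_TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Convert SubRip text to WebVTT (hypothetical helper, minimal handling).

    Adds the "WEBVTT" header and switches the millisecond separator from
    "," to "." on timing lines. Cue numbers are kept, since WebVTT allows
    them as cue identifiers. Styling tags are passed through untouched.
    """
    # Drop a byte-order mark and normalise line endings to "\n".
    text = srt_text.lstrip("\ufeff").replace("\r\n", "\n").replace("\r", "\n")
    out = ["WEBVTT", ""]
    for line in text.strip("\n").split("\n"):
        if "-->" in line:
            # Only timing lines are rewritten, so commas in dialogue survive.
            line = _SRT_TIMESTAMP.sub(r"\1.\2", line)
        out.append(line)
    return "\n".join(out) + "\n"
```

The returned string could then be encoded as UTF-8 and stored under the key that ends up in `subtitle_s3_key`.
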
## TV show metadata identification
For a file like `Clarkson's.Farm.S01E01.Tractoring.WEBRip-1080p.mp4`, metadata can be identified via:
- **Filename parsing** — extract show name, season, episode number, and episode title from the filename using a regex (e.g. `S(\d+)E(\d+)` pattern). The ingest service or a dedicated parser microservice could do this automatically at upload time, pre-filling `show_name`, `season`, `episode`, `episode_title` fields so the user doesn't have to type them.
- **TheTVDB API** — given `show_name` + `season` + `episode`, look up the canonical title, air date, plot, guest cast, network, and a high-quality episode thumbnail. Free API key available. Subscribe to `catalogue.events.media.published` as a durable consumer `"tvdb-fetcher"`.
- **TMDB (The Movie Database)** — also covers TV series (`/tv/{series_id}/season/{season_number}/episode/{episode_number}`). Has episode stills, show banners, cast photos. Free API key.
- **IMDb / Cinemagoer** — Python library (`cinemagoer`, formerly IMDbPY) that scrapes IMDb data, so it is slower but needs no API key. IMDb series ID can be cross-referenced from TheTVDB.
- **Video container metadata** — MKV/MP4 files sometimes embed title, show name, season/episode in container tags (readable via PyAV `container.metadata`). Worth checking before hitting external APIs, since the file is already open during ingest.
- **Suggested flow**: parse filename → check container tags → query TheTVDB with (show_name, season, episode) → fall back to TMDB → patch catalogue via service JWT.
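
The filename-parsing step and the ordered fall-back between sources could look roughly like the sketch below. `EpisodeInfo`, `parse_episode_filename`, `first_hit`, and the release-tag list are illustrative assumptions, not existing code in the services. The lookups are plain callables standing in for whatever TheTVDB and TMDB clients get written.

```python
import os
import re
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Show name, then an SxxEyy marker, then an optional remainder.
_EPISODE = re.compile(
    r"^(?P<show>.+?)[. _-]+S(?P<season>\d+)E(?P<episode>\d+)"
    r"(?:[. _-]+(?P<rest>.*))?$",
    re.IGNORECASE,
)
# Assumed, incomplete list of release tags that mark the end of the title.
_RELEASE_TAG = re.compile(
    r"^(?:\d{3,4}p|WEB(?:-?DL|Rip)|HDTV|BluRay|BDRip|DVDRip|[xh]26[45]|HEVC)"
    r"(?![A-Za-z])",
    re.IGNORECASE,
)


@dataclass
class EpisodeInfo:
    show_name: str
    season: int
    episode: int
    episode_title: Optional[str]


def parse_episode_filename(filename: str) -> Optional[EpisodeInfo]:
    """Best-effort parse of 'Show.Name.S01E02.Title.Tags.ext' style names."""
    stem, _ext = os.path.splitext(filename)
    m = _EPISODE.match(stem)
    if m is None:
        return None  # no SxxEyy marker, leave the fields for the user
    title_words = []
    for token in re.split(r"[. _]+", m["rest"] or ""):
        if not token or _RELEASE_TAG.match(token):
            break  # the first release tag ends the episode title
        title_words.append(token)
    return EpisodeInfo(
        show_name=re.sub(r"[. _]+", " ", m["show"]).strip(),
        season=int(m["season"]),
        episode=int(m["episode"]),
        episode_title=" ".join(title_words) or None,
    )


Lookup = Callable[[EpisodeInfo], Optional[dict]]


def first_hit(info: EpisodeInfo, lookups: Iterable[Lookup]) -> Optional[dict]:
    """Try each metadata source in order and return the first non-empty result."""
    for lookup in lookups:
        result = lookup(info)
        if result:
            return result
    return None
```

For `Clarkson's.Farm.S01E01.Tractoring.WEBRip-1080p.mp4` this yields show name `Clarkson's Farm`, season 1, episode 1, and episode title `Tractoring`. A call such as `first_hit(info, [tvdb_lookup, tmdb_lookup])` (both names hypothetical) would then mirror the "query TheTVDB, fall back to TMDB" order. The regex is a heuristic: titles that begin with a release tag, or files without an `SxxEyy` marker, still need manual entry.
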