feat: cache parsed member indexes as msgpack to skip YAML re-parse on rebuild (#40)
ci/woodpecker/tag/docker Pipeline was successful
ci/woodpecker/tag/docker Pipeline was successful
Closes #36 ## Summary - After fetching a member's `index.yaml` (from upstream or S3), the handler now parses it and stores a compact msgpack file (`index.msgpack`) alongside the raw YAML in S3 - On subsequent virtual rebuilds (member caches valid, virtual TTL expired), the handler loads the msgpack file instead of re-parsing raw YAML — eliminating the costliest phase - `_entries_to_msgpack_safe()` converts datetime/date objects to ISO strings before packing (msgpack cannot natively serialize Python datetimes) - `_merge_helm_indexes()` accepts `list[dict | None]` as pre-parsed entries; falls back to raw YAML parse when msgpack is unavailable - `_VirtualHandler.merge()` protocol updated to pass pre-parsed entries to all future handler implementations - Broken msgpack is detected and rebuilt from raw YAML automatically ## Performance Phase breakdown (19-member helm-all virtual, 14 MB total): | Phase | Time | % | |---|---|---| | YAML parse (eliminated) | 6314 ms | 60% | | URL rewrite + dedup | 33 ms | 0.3% | | YAML dump | 4124 ms | 39% | | Scenario | Before (CSafeLoader only, #34) | After | |---|---|---| | Cold rebuild (upstream fetch) | ~21s | ~26s (+5s for msgpack build, one-time) | | **Warm rebuild (S3 hit, virtual expired)** | **~9.6s** | **~5.9s (38% faster)** | | Virtual cache hit | ~0.03s | ~0.03s | Log line confirms msgpack hits: `msgpack=19/19` ## Test plan - 297 tests pass - `TestEntriesToMsgpackSafe`: datetime/date serialization, empty input, round-trip - `TestMergeHelmIndexesWithParsed`: pre-parsed path produces identical output to raw-bytes path - `TestGetMemberIndexMsgpack`: msgpack hit, cold-build, broken msgpack fallback, upstream failure - Docker warm-rebuild measured at 5.9s vs 9.6s baseline Reviewed-on: #40
This commit was merged in pull request #40.
This commit is contained in:
@@ -390,7 +390,7 @@ If a member is unreachable and has no cached index, it is skipped and a warning
|
||||
|
||||
**Caching:**
|
||||
|
||||
The merged index is cached using `min(mutable_ttl)` across all members. Each member's raw index is cached in S3 under its own remote key by the normal proxy rules; the virtual handler reuses those copies when available.
|
||||
The merged index is cached using `min(mutable_ttl)` across all members. Each member's raw index is cached in S3 under its own remote key; the virtual handler reuses those copies when available. On rebuild, each member's parsed index is also stored as a compact msgpack file (`index.msgpack`) alongside the raw YAML, eliminating the YAML parse cost on subsequent rebuilds.
|
||||
|
||||
**Helm example:**
|
||||
|
||||
|
||||
Reference in New Issue
Block a user