Virtual helm merge: use yaml.CSafeLoader/CDumper for 10-50x YAML parse/dump speedup #34

Closed
opened 2026-05-01 23:44:12 +10:00 by unkinben · 0 comments
Owner

Performance Issue

The _merge_helm_indexes function in artifact/virtual.py uses yaml.safe_load (pure-Python loader) and yaml.dump (pure-Python dumper). For a 19-member virtual repo merging ~14MB of index data, the merge step took 38.5 seconds in testing.

PyYAML ships optional LibYAML-backed C classes (yaml.CSafeLoader, yaml.CDumper/yaml.CSafeDumper) that are 10–50x faster than their pure-Python equivalents.

Fix

Replace in _merge_helm_indexes:

  • yaml.safe_load(raw_data) → yaml.load(raw_data, Loader=yaml.CSafeLoader)
  • yaml.dump(merged, Dumper=_HelmDumper, ...) → yaml.dump(merged, Dumper=_CHelmDumper, ...), where _CHelmDumper is _HelmDumper rebased on yaml.CDumper instead of yaml.Dumper

Add a graceful fallback to the pure-Python loader and dumper if the C extension is not available (that is, if yaml.CSafeLoader cannot be imported).
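The replacement and fallback described above could look roughly like the sketch below. The real _HelmDumper body is not shown in this issue, so the class here is an empty stand-in, and load_index, dump_index and HAVE_LIBYAML are hypothetical names used only for illustration.

```python
import yaml

# Prefer the LibYAML-backed classes; fall back to the pure-Python ones
# when PyYAML was installed without the C extension.
try:
    from yaml import CSafeLoader as _Loader, CDumper as _BaseDumper
    HAVE_LIBYAML = True
except ImportError:
    from yaml import SafeLoader as _Loader, Dumper as _BaseDumper
    HAVE_LIBYAML = False


class _CHelmDumper(_BaseDumper):
    """Empty stand-in (assumption): the project's dumper customizations
    would be carried over here from _HelmDumper."""


def load_index(raw_data):
    # Hypothetical helper: equivalent to yaml.safe_load, but uses the
    # C loader whenever it is available.
    return yaml.load(raw_data, Loader=_Loader)


def dump_index(merged):
    # Hypothetical helper: dumps with whichever dumper base was selected.
    return yaml.dump(
        merged,
        Dumper=_CHelmDumper,
        default_flow_style=False,
        sort_keys=False,
    )
```

One thing to check when rebasing: custom representers should carry over unchanged, but if _HelmDumper overrides emitter methods such as increase_indent, those overrides are likely not consulted by the C emitter, so the output formatting may need to be compared before and after the change.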

Expected Impact

The 38.5s merge step should drop to under 5s for 19 members. The cache-miss latency for large virtual repos becomes acceptable for the first request.
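To check the expected improvement on a given machine, a small timing harness along these lines may help. It is only a sketch: build_index generates synthetic data (an assumption, not the real 14MB of member indexes), time_roundtrip is a hypothetical helper, and the measured ratio will vary with the data and the PyYAML build.

```python
import time

import yaml


def build_index(n_charts=50, n_versions=20):
    # Synthetic Helm-style index (assumption): real indexes carry more fields.
    entries = {
        f"chart-{c}": [
            {
                "name": f"chart-{c}",
                "version": f"1.0.{v}",
                "urls": [f"charts/chart-{c}-1.0.{v}.tgz"],
            }
            for v in range(n_versions)
        ]
        for c in range(n_charts)
    }
    return {"apiVersion": "v1", "entries": entries}


def time_roundtrip(loader, dumper, text):
    # Time one parse plus one dump with the given loader/dumper classes.
    start = time.perf_counter()
    data = yaml.load(text, Loader=loader)
    yaml.dump(data, Dumper=dumper)
    return time.perf_counter() - start, data


text = yaml.dump(build_index(), Dumper=yaml.SafeDumper)

py_seconds, py_data = time_roundtrip(yaml.SafeLoader, yaml.SafeDumper, text)
print(f"pure Python: {py_seconds:.3f}s")

# yaml.__with_libyaml__ is True only when the C extension was importable.
if yaml.__with_libyaml__:
    c_seconds, c_data = time_roundtrip(yaml.CSafeLoader, yaml.CSafeDumper, text)
    assert c_data == py_data  # both loaders should produce the same structure
    print(f"LibYAML:     {c_seconds:.3f}s")
else:
    print("LibYAML bindings not available; only the pure-Python path was timed")
```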

Reference: unkin/artifactapi#34