Benchmark and Quality

Benchmark scripts and detailed evaluation reports are not yet published in this GitHub repository.

Current Status

Local benchmark and evaluation work is still maintained privately.
The public documentation keeps only a lightweight capability summary.
When the benchmark workflow is ready for open-source release, this page will be updated along with the related scripts and reports.

Notes

The benchmark methodology is based on RAGAS.
Public benchmark numbers in the repository should be treated as temporary snapshots rather than a continuously updated source of truth.