Benchmark and Quality
Benchmark scripts and detailed evaluation reports are not yet published in this GitHub repository.
Current Status
- Benchmark and evaluation work is currently maintained in a private, local setup.
- The public documentation keeps only a lightweight capability summary.
- When the benchmark workflow is ready for open-source release, this page will be updated together with the related scripts and reports.
Notes
- The benchmark methodology is based on RAGAS.
- Any benchmark numbers published in the repository should be treated as temporary snapshots, not as a continuously updated source of truth.
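
For readers unfamiliar with RAGAS-style metrics, the sketch below illustrates the general idea behind one of them, context recall: the share of claims in a reference answer that the retrieved context supports. This is not the project's benchmark code and does not use the `ragas` library. The function names `context_recall` and `naive_is_supported`, and the sample data, are hypothetical. RAGAS uses an LLM to judge whether a claim is supported; the substring check here is only a stand-in so the example runs without external services.

```python
from typing import Callable, Sequence


def naive_is_supported(claim: str, context: str) -> bool:
    # Hypothetical stand-in for an LLM judge: a claim counts as supported
    # only if it appears verbatim (case-insensitive) in the context.
    return claim.lower() in context.lower()


def context_recall(
    reference_claims: Sequence[str],
    retrieved_contexts: Sequence[str],
    is_supported: Callable[[str, str], bool] = naive_is_supported,
) -> float:
    # Fraction of reference claims supported by at least one retrieved context.
    if not reference_claims:
        raise ValueError("reference_claims must not be empty")
    supported = sum(
        1
        for claim in reference_claims
        if any(is_supported(claim, ctx) for ctx in retrieved_contexts)
    )
    return supported / len(reference_claims)


# Illustrative data only; not taken from any real benchmark run.
claims = [
    "paris is the capital of france",
    "the seine flows through paris",
]
contexts = ["Paris is the capital of France and its largest city."]

score = context_recall(claims, contexts)
print(f"context recall: {score:.2f}")
```

Passing the support check as a parameter keeps the scoring logic separate from the judgment step, so a stricter or model-based judge could be swapped in without changing how the ratio is computed.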