Benchmark and Quality

Benchmark scripts and detailed evaluation reports are not yet published in this GitHub repository.

Current Status

  • Benchmark and evaluation work is currently run locally and maintained privately.
  • Public documentation keeps only a lightweight capability summary.
  • When the benchmark workflow is ready for open-source release, this page will be updated together with the related scripts and reports.

Notes

  • The benchmark methodology is based on RAGAS.
  • Public benchmark numbers in the repository should be treated as temporary snapshots rather than a continuously updated source of truth.