
eval-audit

Installs: 299 | GitHub Stars: 1,281 | Updated: May 16, 2026

Description

Audit an LLM eval pipeline and surface problems: missing error analysis, unvalidated judges, vanity metrics, etc. Use when inheriting an eval system, when unsure whether evals are trustworthy, or as a starting point when no eval infrastructure exists. Do NOT use when the goal is to build a new evaluator from scratch (use error-analysis, write-judge-prompt, or validate-evaluator instead).

Security audit

Risk notice before use

Not audited

Rule-based audit

Not audited


AI audit

Not audited


ui, audit, llm, prompt, eval, pipeline, and, surface, problems, missing, error, analysis

GitHub repository