Metrics API

Metrics module.

contextbench.metrics.coverage_precision(pred_size: float, gold_size: float, inter_size: float) → Tuple[float, float]

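Compute the pair (coverage, precision) from the predicted size, the gold size, and the size of their intersection. Edge cases: if gold_size is 0, coverage is 1.0; if pred_size is 0, precision is 1.0.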
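The edge cases above can be illustrated with a minimal sketch. It is not the library's implementation: the name coverage_precision_sketch is hypothetical, and the formulas coverage = inter_size / gold_size and precision = inter_size / pred_size are assumed from the usual meaning of the two terms, since the docstring states only the edge cases.

```python
from typing import Tuple


def coverage_precision_sketch(
    pred_size: float, gold_size: float, inter_size: float
) -> Tuple[float, float]:
    """Hypothetical stand-in for coverage_precision; formulas are assumed."""
    # Assumption: coverage is the fraction of the gold context that was retrieved.
    # Documented edge case: an empty gold set yields coverage 1.0.
    coverage = 1.0 if gold_size == 0 else inter_size / gold_size
    # Assumption: precision is the fraction of the prediction that is gold.
    # Documented edge case: an empty prediction yields precision 1.0.
    precision = 1.0 if pred_size == 0 else inter_size / pred_size
    return coverage, precision


# 2 of 4 gold units retrieved, out of 10 predicted units.
example = coverage_precision_sketch(pred_size=10, gold_size=4, inter_size=2)
```

Under these assumptions, example is (0.5, 0.2), and calling the sketch with all sizes equal to 0 returns (1.0, 1.0).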

contextbench.metrics.compute_granularity_metrics(pred_files: Set[str], pred_defs: Set[Tuple[str, str, int, int]], pred_spans: Dict[str, List[Tuple[int, int]]], gold_files: Set[str], gold_defs: Set[Tuple[str, str, int, int]], gold_spans: Dict[str, List[Tuple[int, int]]], pred_lines: Dict[str, List[Tuple[int, int]]] | None = None, gold_lines: Dict[str, List[Tuple[int, int]]] | None = None) → dict

Compute metrics at all granularities covered by the arguments: files, definitions, spans, and, optionally, lines.
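As an illustration of what a single granularity might look like, the sketch below computes coverage and precision over file sets only. Everything in it is an assumption: the helper name file_level_metrics_sketch is hypothetical, the set-based formulas are inferred from the coverage_precision edge cases, and the keys of the dict actually returned by compute_granularity_metrics are not documented here, so the sketch does not try to reproduce them.

```python
from typing import Set, Tuple


def file_level_metrics_sketch(
    pred_files: Set[str], gold_files: Set[str]
) -> Tuple[float, float]:
    """Hypothetical file-granularity (coverage, precision); not the library code."""
    # Size of the overlap between predicted and gold files.
    inter = len(pred_files & gold_files)
    # Same edge cases as coverage_precision: empty gold -> 1.0, empty pred -> 1.0.
    coverage = 1.0 if not gold_files else inter / len(gold_files)
    precision = 1.0 if not pred_files else inter / len(pred_files)
    return coverage, precision


pred = {"a.py", "b.py", "c.py", "d.py"}
gold = {"a.py", "b.py"}
file_cov, file_prec = file_level_metrics_sketch(pred, gold)
```

Here both gold files are retrieved, so file_cov is 1.0, while only two of the four predicted files are gold, so file_prec is 0.5.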

contextbench.metrics.compute_trajectory_metrics(steps, gold_files: Set[str], gold_symbols: Set[Tuple[str, str, int, int]], gold_spans: Dict[str, List[Tuple[int, int]]], repo_dir: str, gold_lines: Dict[str, List[Tuple[int, int]]] | None = None) → dict

Compute AUC-Coverage, Redundancy, and per-step metrics.

contextbench.metrics.span_total_bytes(spans_by_file: Dict[str, List[Tuple[int, int]]]) → int

Total number of bytes covered by the spans, summed across all files.

contextbench.metrics.span_intersection_bytes(a: Dict[str, List[Tuple[int, int]]], b: Dict[str, List[Tuple[int, int]]]) → int

Total number of bytes in the intersection of a and b, summed across all files.
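The two span helpers can be sketched as follows. This is a hedged illustration, not the library's code: it assumes each span is a half-open (start, end) byte range and that overlapping spans within a file are counted once. The source confirms neither convention, and the names merge_spans, span_total_bytes_sketch, and span_intersection_bytes_sketch are hypothetical.

```python
from typing import Dict, List, Tuple

Spans = Dict[str, List[Tuple[int, int]]]


def merge_spans(spans: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Sort spans and merge overlapping or touching ones (assumed half-open)."""
    merged: List[Tuple[int, int]] = []
    for start, end in sorted(spans):
        if end <= start:
            continue  # skip empty or inverted spans
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def span_total_bytes_sketch(spans_by_file: Spans) -> int:
    """Sum of merged span lengths over every file."""
    return sum(
        end - start
        for spans in spans_by_file.values()
        for start, end in merge_spans(spans)
    )


def span_intersection_bytes_sketch(a: Spans, b: Spans) -> int:
    """Bytes covered by both a and b, summed over files present in both."""
    total = 0
    for path in a.keys() & b.keys():
        xs, ys = merge_spans(a[path]), merge_spans(b[path])
        i = j = 0
        # Two-pointer sweep over the two sorted, disjoint span lists.
        while i < len(xs) and j < len(ys):
            lo = max(xs[i][0], ys[j][0])
            hi = min(xs[i][1], ys[j][1])
            if lo < hi:
                total += hi - lo
            # Advance whichever span ends first.
            if xs[i][1] < ys[j][1]:
                i += 1
            else:
                j += 1
    return total


pred_spans = {"a.py": [(0, 10), (5, 20)], "b.py": [(0, 4)]}
gold_spans = {"a.py": [(15, 30)], "c.py": [(0, 100)]}
pred_total = span_total_bytes_sketch(pred_spans)
overlap = span_intersection_bytes_sketch(pred_spans, gold_spans)
```

With these inputs, the two spans in a.py merge into (0, 20), so pred_total is 24, and the only shared region is bytes 15 to 20 of a.py, so overlap is 5. Feeding such totals into coverage_precision as pred_size, gold_size, and inter_size would be one plausible way the pieces fit together, though the source does not state this.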