Evaluation metrics reference
Evaluation metrics in RAGET provide quantitative measures of RAG system performance across different dimensions.
Correctness
Using an LLM-as-a-judge strategy, the correctness metric checks whether an answer is correct with respect to the reference answer.
RAGAS Metrics
We provide wrappers for some RAGAS metrics. You can implement other RAGAS metrics using the RAGASMetric class.
- giskard.rag.metrics.ragas_metrics.ragas_context_precision(question_sample: dict, answer: AgentAnswer) → dict
- giskard.rag.metrics.ragas_metrics.ragas_faithfulness(question_sample: dict, answer: AgentAnswer) → dict
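To build intuition for what context precision measures, here is a simplified, self-contained illustration. It averages precision@k over the positions of relevant retrieved chunks, following the general shape of the RAGAS definition; the real RAGAS metric uses an LLM judge to produce the per-chunk relevance verdicts, which we replace here with a hand-supplied relevance list. The function name `toy_context_precision` is illustrative, not part of the library.

```python
def toy_context_precision(relevance: list[int]) -> float:
    """Simplified context precision.

    relevance[i] is 1 if the i-th retrieved chunk is judged relevant
    to the question, 0 otherwise. The score averages precision@k over
    the ranks k at which a relevant chunk appears, so relevant chunks
    ranked earlier yield a higher score. This is a sketch of the idea;
    RAGAS obtains the relevance verdicts from an LLM judge.
    """
    numerator = sum(
        rel * (sum(relevance[: k + 1]) / (k + 1))
        for k, rel in enumerate(relevance)
    )
    total_relevant = sum(relevance)
    return numerator / total_relevant if total_relevant else 0.0

# Relevant chunks at ranks 1 and 3: (1/1 + 2/3) / 2 ≈ 0.833
score = toy_context_precision([1, 0, 1])
```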
Base Metric
- class giskard.rag.metrics.Metric(name: str, llm_client: LLMClient = None)
Base class for metrics. All metrics should inherit from this class and implement the __call__ method. Instances of this class can be passed to the evaluate method.
- abstract __call__(question_sample: dict, answer: AgentAnswer)
Compute the metric on a single question and its associated answer.
- Parameters:
  - question_sample (dict) – A question sample from a QATestset.
  - answer (AgentAnswer) – The agent's answer to that question.
- Returns: The result of the metric computation. The keys should be the names of the computed metrics.
- Return type: dict
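As a sketch of the interface, here is a toy metric that follows the __call__ contract described above: it takes a question sample dict and an agent answer and returns a dict keyed by metric name. The `ExactMatchMetric` class and the minimal `AgentAnswer` stand-in (with only a `message` field) are illustrative assumptions so the example runs standalone; in practice you would subclass giskard.rag.metrics.Metric and use the library's own AgentAnswer type.

```python
from dataclasses import dataclass


@dataclass
class AgentAnswer:
    """Minimal stand-in for giskard's AgentAnswer, for illustration only."""
    message: str


class ExactMatchMetric:
    """Toy metric implementing the Metric protocol described above.

    __call__ receives one question sample and the agent's answer, and
    returns a dict whose keys name the computed metrics.
    """

    def __init__(self, name: str):
        self.name = name

    def __call__(self, question_sample: dict, answer: AgentAnswer) -> dict:
        expected = question_sample["reference_answer"]
        is_match = answer.message.strip().lower() == expected.strip().lower()
        return {self.name: float(is_match)}


metric = ExactMatchMetric("exact_match")
result = metric(
    {"question": "What is 2+2?", "reference_answer": "4"},
    AgentAnswer(message="4"),
)
# result == {"exact_match": 1.0}
```

A real subclass would typically also use the llm_client passed to Metric's constructor to obtain a judgment rather than comparing strings directly.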