Entities reference

Entities represent the core data objects in Giskard Hub, such as projects, datasets, models, and evaluations.

class giskard_hub.data.ChatMessage(role: Literal['system', 'assistant', 'user'], content: str)[source]

Bases: BaseData

Message from an LLM, with role & content.

content: str
role: Literal['system', 'assistant', 'user']
class giskard_hub.data.ChatTestCase(messages: List[ChatMessage] = ..., demo_output: ChatMessageWithMetadata | None = None, tags: List[str] = ..., checks: List[CheckConfig] = ...)[source]

Bases: Entity

A Dataset entry representing a chat test case.

messages

List of messages in the chat test case. Each message is an object with a role and content attributes.

Type:

List[ChatMessage]

demo_output

Output of the agent for demonstration purposes.

Type:

Optional[ChatMessageWithMetadata], optional

tags

List of tags for the chat test case.

Type:

List[str], optional

checks

List of checks to be performed on the chat test case.

Type:

List[CheckConfig], optional

checks: List[CheckConfig]
demo_output: ChatMessageWithMetadata | None = None
classmethod from_dict(data: Dict[str, Any], **kwargs) ChatTestCase[source]

Class method factory, allowing to filter from a dict.

Parameters:

data (Dict[str, Any]) – The data to use to initialize the dataclass.

messages: List[ChatMessage]
tags: List[str]
class giskard_hub.data.Check(identifier: str, description: str, name: str, params: Dict[str, Any])[source]

Bases: Entity

description: str
identifier: str
name: str
params: Dict[str, Any]
class giskard_hub.data.Conversation(messages: List[ChatMessage] = ..., demo_output: ChatMessageWithMetadata | None = None, tags: List[str] = ..., checks: List[CheckConfig] = ...)[source]

Bases: ChatTestCase

A Dataset entry representing a conversation.

messages

List of messages in the conversation. Each message is an object with a role and content attributes.

Type:

List[ChatMessage]

demo_output

Output of the agent for demonstration purposes.

Type:

Optional[ChatMessageWithMetadata], optional

tags

List of tags for the conversation.

Type:

List[str], optional

checks

List of checks to be performed on the conversation.

Type:

List[CheckConfig], optional

classmethod from_dict(data: Dict[str, Any], **kwargs) Conversation[source]

Class method factory, allowing to filter from a dict.

Parameters:

data (Dict[str, Any]) – The data to use to initialize the dataclass.

class giskard_hub.data.Dataset(name: str, description: str = '', project_id: str | None = None, tags: List[str] = ...)[source]

Bases: EntityWithTaskProgress

Dataset object, containing the metadata about the dataset.

property chat_test_cases

Return the chat test cases of the dataset.

property conversations

Return the conversations of the dataset.

create_chat_test_case(chat_test_case: ChatTestCase)[source]

Add a chat test case to the dataset.

create_conversation(conversation: Conversation)[source]

Add a conversation to the dataset.

description: str = ''
classmethod from_dict(data: dict, **kwargs) giskard.Dataset[source]

Create a Dataset instance from a dictionary.

name: str
project_id: str | None = None
property resource: str

Abstract property for the resource name used in API calls.

tags: List[str]
class giskard_hub.data.Document(content: 'str', topic_id: "'UUID' | None" = None, embedding: 'list[float]' = ...)[source]

Bases: Entity

content: str
embedding: list[float]
topic_id: 'UUID' | None = None
class giskard_hub.data.EvaluationRun(name: str | None, project_id: str | None, datasets: List[giskard.Dataset] = ..., model: Model | None = None, criteria: List = ..., metrics: List[Metric] = ..., tags: List[Metric] = ..., failure_categories: Dict[str, int] = ..., scheduled_evaluation_id: str | None = None)[source]

Bases: EntityWithTaskProgress

Evaluation run.

criteria: List
datasets: List[Dataset]
failure_categories: Dict[str, int]
classmethod from_dict(data: Dict[str, Any], **kwargs) EvaluationRun[source]

Class method factory, allowing to filter from a dict.

Parameters:

data (Dict[str, Any]) – The data to use to initialize the dataclass.

metrics: List[Metric]
model: Model | None = None
name: str | None
print_metrics()[source]

Print the evaluation metrics.

project_id: str | None
property resource: str

Abstract property for the resource name used in API calls.

scheduled_evaluation_id: str | None = None
tags: List[Metric]
class giskard_hub.data.FrequencyOption(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: str, Enum

Frequency options for scheduled evaluations.

DAILY = 'daily'
MONTHLY = 'monthly'
WEEKLY = 'weekly'
class giskard_hub.data.KnowledgeBase(name: 'str', project_id: 'str', description: 'str | None' = None, n_documents: 'int' = 0, filename: 'str | None' = None, topics: 'list[Topic]' = ...)[source]

Bases: EntityWithTaskProgress

description: str | None = None
filename: str | None = None
classmethod from_dict(data: dict, **kwargs) KnowledgeBase[source]

Create a KnowledgeBase instance from a dictionary.

n_documents: int = 0
name: str
project_id: str
property resource: str

Abstract property for the resource name used in API calls.

topics: list[Topic]
class giskard_hub.data.Metric(name: str, passed: int, failed: int, errored: int, total: int)[source]

Bases: BaseData

Evaluation metric.

name

The name of the metric (e.g. “correctness”).

Type:

str

passed

The number of samples that passed evaluations.

Type:

int

failed

The number of samples that failed evaluations.

Type:

int

skipped

The number of samples that were not evaluated (typically because of missing evaluation annotations).

Type:

int

errored

The number of samples that errored during evaluations.

Type:

int

total

The total number of samples (including the ones skipped).

Type:

int

percentage

The percentage of passed evaluations (not considering the skipped samples).

Type:

float

errored: int
failed: int
name: str
passed: int
property percentage
property skipped
total: int
class giskard_hub.data.Model(name: str, project_id: str | None = None, url: str | None = None, description: str | None = None, supported_languages: List[str] = ..., headers: Dict[str, str] = ...)[source]

Bases: Entity

chat(messages: List[ChatMessage]) ModelOutput[source]

Chat with the model.

Parameters:

messages (List[ChatMessage]) – A list of messages to send to the model.

Returns:

The model response.

Return type:

ModelOutput

description: str | None = None
classmethod from_dict(data: Dict[str, str], **kwargs) Model[source]

Class method factory, allowing to filter from a dict.

Parameters:

data (Dict[str, Any]) – The data to use to initialize the dataclass.

headers: Dict[str, str]
name: str
project_id: str | None = None
supported_languages: List[str]
url: str | None = None
class giskard_hub.data.ModelOutput(message: ChatMessage | None = None, metadata: Dict[str, any] = ..., error: ExecutionError | None = None)[source]

Bases: BaseData

Model output.

error: ExecutionError | None = None
classmethod from_dict(data: Dict[str, any], **kwargs) BaseData[source]

Class method factory.

Parameters:

data (Dict[str, Any]) – The data to use to initialize the dataclass.

Returns:

The dataclass instance.

Return type:

BaseDataclass

message: ChatMessage | None = None
metadata: Dict[str, any]
class giskard_hub.data.Project(name: str, description: str = '')[source]

Bases: Entity

name

The name of the project.

Type:

str

description

The description of the project.

Type:

str, optional

description: str = ''
name: str
class giskard_hub.data.ScheduledEvaluation(project_id: str, name: str, model_id: str, dataset_id: str, tags: list[str] = ..., run_count: int = 1, frequency: FrequencyOption = FrequencyOption.DAILY, time: str = '00:00', day_of_week: int | None = None, day_of_month: int | None = None, paused: bool = False, last_execution_at: datetime | None = None, last_execution_status: SuccessExecutionStatus | ErrorExecutionStatus | None = None)[source]

Bases: Entity

Scheduled evaluation entity.

project_id

The ID of the project this scheduled evaluation belongs to.

Type:

str

name

The name of the scheduled evaluation.

Type:

str

model_id

The ID of the model to evaluate.

Type:

str

dataset_id

The ID of the dataset to evaluate against.

Type:

str

tags

List of tags to filter the conversations that will be evaluated.

Type:

List[str], optional

run_count

The number of times to run each test case (1-5).

Type:

int

frequency

The frequency of the scheduled evaluation (daily, weekly, monthly).

Type:

FrequencyOption

time

The time to run the evaluation (HH:MM format).

Type:

str

day_of_week

The day of the week to run (1-7, 1 is Monday). Required for weekly frequency.

Type:

int, optional

day_of_month

The day of the month to run (1-28). Required for monthly frequency.

Type:

int, optional

paused

Whether the scheduled evaluation is paused.

Type:

bool

last_execution_at

The timestamp of the last execution.

Type:

datetime, optional

last_execution_status

The status of the last execution.

Type:

ExecutionStatus, optional

dataset_id: str
day_of_month: int | None = None
day_of_week: int | None = None
frequency: FrequencyOption = 'daily'
classmethod from_dict(data: dict[str, Any], *, _client=None, **kwargs: Any) ScheduledEvaluation[source]

Class method factory, allowing to filter from a dict.

Parameters:

data (Dict[str, Any]) – The data to use to initialize the dataclass.

last_execution_at: datetime | None = None
last_execution_status: ExecutionStatus | None = None
model_id: str
name: str
paused: bool = False
project_id: str
property resource: str
run_count: int = 1
tags: list[str]
time: str = '00:00'
class giskard_hub.data.Topic(name: 'str', description: 'str | None' = None)[source]

Bases: Entity

description: str | None = None
name: str