Entities reference

Entities represent the core data objects in Giskard Hub, such as projects, datasets, models, and evaluations.

class giskard_hub.data.ChatMessage(role: Literal['system', 'assistant', 'user'], content: str)[source]

Bases: BaseData

A message in an LLM conversation, with a role and content.

content: str

role: Literal['system', 'assistant', 'user']

class giskard_hub.data.ChatTestCase(messages: List[ChatMessage] = ..., demo_output: ChatMessageWithMetadata | None = None, tags: List[str] = ..., checks: List[CheckConfig] = ...)[source]

Bases: Entity

A Dataset entry representing a chat test case.

messages

List of messages in the chat test case. Each message is an object with role and content attributes.

Type:: List[ChatMessage]

demo_output

Output of the agent for demonstration purposes.

Type:: Optional[ChatMessageWithMetadata], optional

tags

List of tags for the chat test case.

Type:: List[str], optional

checks

List of checks to be performed on the chat test case.

Type:: List[CheckConfig], optional
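
As an illustration of the attributes above, a chat test case can be pictured as a list of role/content messages plus optional tags and checks. The sketch below uses stand-in dataclasses (`MessageSketch`, `ChatTestCaseSketch`) rather than the real `giskard_hub.data` classes, and the keys of the check dictionary are hypothetical, since the `CheckConfig` layout is not described in this reference.

```python
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List, Literal

Role = Literal["system", "assistant", "user"]


@dataclass
class MessageSketch:
    """Stand-in mirroring the documented ChatMessage fields."""
    role: Role
    content: str


@dataclass
class ChatTestCaseSketch:
    """Stand-in mirroring the documented ChatTestCase fields (demo_output omitted)."""
    messages: List[MessageSketch] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    # Hypothetical check layout; the real CheckConfig structure may differ.
    checks: List[Dict[str, Any]] = field(default_factory=list)


case = ChatTestCaseSketch(
    messages=[
        MessageSketch(role="system", content="You are a helpful assistant."),
        MessageSketch(role="user", content="What is your refund policy?"),
    ],
    tags=["billing"],
    checks=[{"identifier": "correctness", "params": {"reference": "Refunds within 30 days."}}],
)

# Convert the nested dataclasses into plain dicts and lists.
payload = asdict(case)
print(payload["messages"][1])
```

The stand-ins only show the shape of the data; with the real library the same fields would be passed to `ChatMessage` and `ChatTestCase`.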

checks: List[CheckConfig]

demo_output: ChatMessageWithMetadata | None = None

classmethod from_dict(data: Dict[str, Any], **kwargs) → ChatTestCase[source]

Factory class method that creates an instance from a dict, allowing the data to be filtered.

Parameters:: data (Dict[str, Any]) – The data to use to initialize the dataclass.
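
The description suggests that `from_dict` accepts dictionaries carrying more keys than the dataclass declares. One common way to get that behaviour, shown here as a sketch and not as the library's actual implementation, is to keep only the keys that match declared dataclass fields. `FilteredData` and `NoteSketch` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


@dataclass
class FilteredData:
    """Hypothetical base class: builds instances from dicts, ignoring unknown keys."""

    @classmethod
    def from_dict(cls, data: Dict[str, Any], **kwargs: Any):
        allowed = {f.name for f in fields(cls)}
        # Keep only keys that correspond to declared dataclass fields.
        filtered = {k: v for k, v in {**data, **kwargs}.items() if k in allowed}
        return cls(**filtered)


@dataclass
class NoteSketch(FilteredData):
    name: str
    tags: List[str] = field(default_factory=list)


# The extra key is dropped instead of raising a TypeError in the constructor.
note = NoteSketch.from_dict({"name": "demo", "tags": ["a"], "server_only_field": 123})
print(note)
```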

messages: List[ChatMessage]

tags: List[str]

class giskard_hub.data.Check(identifier: str, description: str, name: str, params: Dict[str, Any])[source]

Bases: Entity

description: str

identifier: str

name: str

params: Dict[str, Any]

class giskard_hub.data.Conversation(messages: List[ChatMessage] = ..., demo_output: ChatMessageWithMetadata | None = None, tags: List[str] = ..., checks: List[CheckConfig] = ...)[source]

Bases: ChatTestCase

A Dataset entry representing a conversation.

messages

List of messages in the conversation. Each message is an object with role and content attributes.

Type:: List[ChatMessage]

demo_output

Output of the agent for demonstration purposes.

Type:: Optional[ChatMessageWithMetadata], optional

tags

List of tags for the conversation.

Type:: List[str], optional

checks

List of checks to be performed on the conversation.

Type:: List[CheckConfig], optional

classmethod from_dict(data: Dict[str, Any], **kwargs) → Conversation[source]

Factory class method that creates an instance from a dict, allowing the data to be filtered.

Parameters:: data (Dict[str, Any]) – The data to use to initialize the dataclass.

class giskard_hub.data.Dataset(name: str, description: str = '', project_id: str | None = None, tags: List[str] = ...)[source]

Bases: EntityWithTaskProgress

Dataset object, containing the metadata about the dataset.

property chat_test_cases: Return the chat test cases of the dataset.

property conversations: Return the conversations of the dataset.

create_chat_test_case(chat_test_case: ChatTestCase)[source]: Add a chat test case to the dataset.

create_conversation(conversation: Conversation)[source]: Add a conversation to the dataset.

description: str = ''

classmethod from_dict(data: dict, **kwargs) → Dataset[source]: Create a Dataset instance from a dictionary.

name: str

project_id: str | None = None

property resource: str: Abstract property for the resource name used in API calls.

tags: List[str]

class giskard_hub.data.Document(content: 'str', topic_id: "'UUID' | None" = None, embedding: 'list[float]' = ...)[source]

Bases: Entity

content: str

embedding: list[float]

topic_id: 'UUID' | None = None

class giskard_hub.data.EvaluationRun(name: str | None, project_id: str | None, datasets: List[Dataset] = ..., model: Model | None = None, criteria: List = ..., metrics: List[Metric] = ..., tags: List[Metric] = ..., failure_categories: Dict[str, int] = ..., scheduled_evaluation_id: str | None = None)[source]

Bases: EntityWithTaskProgress

Evaluation run.

criteria: List

datasets: List[Dataset]

failure_categories: Dict[str, int]

classmethod from_dict(data: Dict[str, Any], **kwargs) → EvaluationRun[source]

Factory class method that creates an instance from a dict, allowing the data to be filtered.

Parameters:: data (Dict[str, Any]) – The data to use to initialize the dataclass.

metrics: List[Metric]

model: Model | None = None

name: str | None

print_metrics()[source]: Print the evaluation metrics.

project_id: str | None

property resource: str: Abstract property for the resource name used in API calls.

scheduled_evaluation_id: str | None = None

tags: List[Metric]

class giskard_hub.data.FrequencyOption(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: str, Enum

Frequency options for scheduled evaluations.

DAILY = 'daily'

MONTHLY = 'monthly'

WEEKLY = 'weekly'
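
Because the enum derives from both `str` and `Enum`, its members can be looked up from raw strings and compared against them directly. The sketch below re-declares the three members in a stand-in class (`FrequencySketch`, a hypothetical name) so that it runs without the library installed.

```python
from enum import Enum


class FrequencySketch(str, Enum):
    """Stand-in with the same members as the documented FrequencyOption."""
    DAILY = "daily"
    MONTHLY = "monthly"
    WEEKLY = "weekly"


# Lookup by value returns the matching member.
parsed = FrequencySketch("weekly")
print(parsed is FrequencySketch.WEEKLY)

# The str mixin makes members compare equal to their raw string values.
print(FrequencySketch.DAILY == "daily")
print(FrequencySketch.MONTHLY.value)
```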

class giskard_hub.data.KnowledgeBase(name: 'str', project_id: 'str', description: 'str | None' = None, n_documents: 'int' = 0, filename: 'str | None' = None, topics: 'list[Topic]' = ...)[source]

Bases: EntityWithTaskProgress

description: str | None = None

filename: str | None = None

classmethod from_dict(data: dict, **kwargs) → KnowledgeBase[source]: Create a KnowledgeBase instance from a dictionary.

n_documents: int = 0

name: str

project_id: str

property resource: str: Abstract property for the resource name used in API calls.

topics: list[Topic]

class giskard_hub.data.Metric(name: str, passed: int, failed: int, errored: int, total: int)[source]

Bases: BaseData

Evaluation metric.

name

The name of the metric (e.g. “correctness”).

Type:: str

passed

The number of samples that passed evaluations.

Type:: int

failed

The number of samples that failed evaluations.

Type:: int

skipped

The number of samples that were not evaluated (typically because of missing evaluation annotations).

Type:: int

errored

The number of samples that errored during evaluations.

Type:: int

total

The total number of samples (including the ones skipped).

Type:: int

percentage

The percentage of passed evaluations (not considering the skipped samples).

Type:: float
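
The relationship between these counters is not spelled out above. One reading consistent with the descriptions is that `skipped` is whatever remains of `total` after the passed, failed, and errored samples, and that `percentage` divides `passed` by the non-skipped count. The sketch below encodes that reading in a stand-in class (`MetricSketch`, a hypothetical name); the actual properties may be computed differently, for example in how errored samples or rounding are treated.

```python
from dataclasses import dataclass


@dataclass
class MetricSketch:
    name: str
    passed: int
    failed: int
    errored: int
    total: int

    @property
    def skipped(self) -> int:
        # Assumption: samples that neither passed, failed, nor errored were skipped.
        return self.total - (self.passed + self.failed + self.errored)

    @property
    def percentage(self) -> float:
        # Assumption: skipped samples are excluded from the denominator.
        evaluated = self.total - self.skipped
        return 100.0 * self.passed / evaluated if evaluated else 0.0


m = MetricSketch(name="correctness", passed=6, failed=2, errored=0, total=10)
print(m.skipped, m.percentage)
```

Under these assumptions, 6 passed and 2 failed out of 10 samples leaves 2 skipped and a pass rate of 75% over the 8 evaluated samples.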

errored: int

failed: int

name: str

passed: int

property percentage

property skipped

total: int

class giskard_hub.data.Model(name: str, project_id: str | None = None, url: str | None = None, description: str | None = None, supported_languages: List[str] = ..., headers: Dict[str, str] = ...)[source]

Bases: Entity

chat(messages: List[ChatMessage]) → ModelOutput[source]

Chat with the model.

Parameters:: messages (List[ChatMessage]) – A list of messages to send to the model.
Returns:: The model response.
Return type:: ModelOutput

description: str | None = None

classmethod from_dict(data: Dict[str, str], **kwargs) → Model[source]

Factory class method that creates an instance from a dict, allowing the data to be filtered.

Parameters:: data (Dict[str, Any]) – The data to use to initialize the dataclass.

headers: Dict[str, str]

name: str

project_id: str | None = None

supported_languages: List[str]

url: str | None = None

class giskard_hub.data.ModelOutput(message: ChatMessage | None = None, metadata: Dict[str, any] = ..., error: ExecutionError | None = None)[source]

Bases: BaseData

Model output.

error: ExecutionError | None = None

classmethod from_dict(data: Dict[str, any], **kwargs) → BaseData[source]

Class method factory.

Parameters:: data (Dict[str, Any]) – The data to use to initialize the dataclass.
Returns:: The dataclass instance.
Return type:: BaseData

message: ChatMessage | None = None

metadata: Dict[str, any]

class giskard_hub.data.Project(name: str, description: str = '')[source]

Bases: Entity

name

The name of the project.

Type:: str

description

The description of the project.

Type:: str, optional

description: str = ''

name: str

class giskard_hub.data.ScheduledEvaluation(project_id: str, name: str, model_id: str, dataset_id: str, tags: list[str] = ..., run_count: int = 1, frequency: FrequencyOption = FrequencyOption.DAILY, time: str = '00:00', day_of_week: int | None = None, day_of_month: int | None = None, paused: bool = False, last_execution_at: datetime | None = None, last_execution_status: SuccessExecutionStatus | ErrorExecutionStatus | None = None)[source]

Bases: Entity

Scheduled evaluation entity.

project_id

The ID of the project this scheduled evaluation belongs to.

Type:: str

name

The name of the scheduled evaluation.

Type:: str

model_id

The ID of the model to evaluate.

Type:: str

dataset_id

The ID of the dataset to evaluate against.

Type:: str

tags

List of tags to filter the conversations that will be evaluated.

Type:: List[str], optional

run_count

The number of times to run each test case (1-5).

Type:: int

frequency

The frequency of the scheduled evaluation (daily, weekly, monthly).

Type:: FrequencyOption

time

The time to run the evaluation (HH:MM format).

Type:: str

day_of_week

The day of the week to run (1-7, 1 is Monday). Required for weekly frequency.

Type:: int, optional

day_of_month

The day of the month to run (1-28). Required for monthly frequency.

Type:: int, optional

paused

Whether the scheduled evaluation is paused.

Type:: bool

last_execution_at

The timestamp of the last execution.

Type:: datetime, optional

last_execution_status

The status of the last execution.

Type:: ExecutionStatus, optional
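
The constraints listed above (run count of 1-5, `HH:MM` time, a weekday for weekly schedules, a day of the month for monthly ones) can be checked before a schedule is created. The helper below is a hypothetical, standalone sketch: `schedule_problems` is not part of the library, it assumes a 24-hour clock for `time`, and it says nothing about where or how the Hub itself validates these values.

```python
import re
from typing import List, Optional

# Assumption: HH:MM means a zero-padded 24-hour time.
_TIME_RE = re.compile(r"^([01]\d|2[0-3]):[0-5]\d$")


def schedule_problems(
    frequency: str,
    time: str = "00:00",
    run_count: int = 1,
    day_of_week: Optional[int] = None,
    day_of_month: Optional[int] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the values look valid."""
    problems: List[str] = []
    if frequency not in ("daily", "weekly", "monthly"):
        problems.append(f"unknown frequency: {frequency!r}")
    if not _TIME_RE.match(time):
        problems.append("time must use HH:MM format")
    if not 1 <= run_count <= 5:
        problems.append("run_count must be between 1 and 5")
    if frequency == "weekly" and (day_of_week is None or not 1 <= day_of_week <= 7):
        problems.append("weekly schedules need day_of_week in 1-7 (1 is Monday)")
    if frequency == "monthly" and (day_of_month is None or not 1 <= day_of_month <= 28):
        problems.append("monthly schedules need day_of_month in 1-28")
    return problems


print(schedule_problems("weekly", time="09:30", day_of_week=1))
print(schedule_problems("monthly", time="9:30", run_count=6))
```

Returning a list of messages rather than raising on the first failure lets a caller report every invalid field at once.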

dataset_id: str

day_of_month: int | None = None

day_of_week: int | None = None

frequency: FrequencyOption = 'daily'

classmethod from_dict(data: dict[str, Any], *, _client=None, **kwargs: Any) → ScheduledEvaluation[source]

Factory class method that creates an instance from a dict, allowing the data to be filtered.

Parameters:: data (Dict[str, Any]) – The data to use to initialize the dataclass.

last_execution_at: datetime | None = None

last_execution_status: ExecutionStatus | None = None

model_id: str

name: str

paused: bool = False

project_id: str

property resource: str

run_count: int = 1

tags: list[str]

time: str = '00:00'

class giskard_hub.data.Topic(name: 'str', description: 'str | None' = None)[source]

Bases: Entity

description: str | None = None

name: str