Entities

class giskard_hub.data.ChatMessage(role: Literal['system', 'assistant', 'user'], content: str)[source]

Bases: BaseData

Message from an LLM, with role & content.

content: str
role: Literal['system', 'assistant', 'user']
class giskard_hub.data.Conversation(messages: List[ChatMessage] = ..., rules: List[str] = ..., tags: List[str] = ..., expected_output: str | None = None, demo_output: ChatMessage | None = None)[source]

Bases: Entity

A Dataset entry representing a conversation.

messages

List of messages in the conversation. Each message is an object with a role and content attributes.

Type:

List[ChatMessage]

tags

List of tags for the conversation.

Type:

List[str], optional

expected_output

Expected output which will be used for correctness evaluation.

Type:

Optional[str], optional

rules

List of rules used for evaluation.

Type:

List[str], optional

demo_output

Output of the agent for demonstration purposes.

Type:

Optional[ChatMessage], optional

demo_output: ChatMessage | None = None
expected_output: str | None = None
classmethod from_dict(data: Dict[str, Any], **kwargs) Conversation[source]

Class method factory, allowing to filter from a dict.

Parameters:

data (Dict[str, Any]) – The data to use to initialize the dataclass.

messages: List[ChatMessage]
rules: List[str]
tags: List[str]
class giskard_hub.data.Dataset(name: str, description: str = '', project_id: str | None = None, tags: List[str] = ...)[source]

Bases: Entity

Dataset object, containing the metadata about the dataset.

property conversations

Return the conversations of the dataset.

create_conversation(conversation: Conversation)[source]

Add a conversation to the dataset.

description: str = ''
name: str
project_id: str | None = None
tags: List[str]
class giskard_hub.data.EvaluationRun(name: str | None = None, project_id: str | None = None, datasets: List[Dataset] = ..., model: Model | None = None, criteria: List = ..., metrics: List[Metric] = ..., progress: TaskProgress | None = None)[source]

Bases: Entity

Evaluation run.

criteria: List
datasets: List[Dataset]
classmethod from_dict(data: Dict[str, Any], **kwargs) EvaluationRun[source]

Class method factory, allowing to filter from a dict.

Parameters:

data (Dict[str, Any]) – The data to use to initialize the dataclass.

is_errored() bool[source]

Check if the evaluation terminated with an error.

is_finished() bool[source]

Check if the evaluation is finished.

is_running() bool[source]

Check if the evaluation is running.

metrics: List[Metric]
model: Model | None = None
name: str | None = None
print_metrics()[source]

Print the evaluation metrics.

progress: TaskProgress | None = None
project_id: str | None = None
refresh() EvaluationRun[source]

Refresh the evaluation run from the Hub.

wait_for_completion(timeout: float = 600, poll_interval: float = 5) EvaluationRun[source]

Wait for the evaluation to complete successfully.

Parameters:
  • timeout (int, optional) – The timeout in seconds, by default 600

  • poll_interval (int, optional) – The polling interval in seconds, by default 5.

Returns:

The updated evaluation run instance. The object will have a valid metrics attribute containing the evaluation results.

Return type:

EvaluationRun

class giskard_hub.data.Metric(name: str, passed: int, failed: int, errored: int, total: int)[source]

Bases: BaseData

Evaluation metric.

name

The name of the metric (e.g. “correctness”).

Type:

str

passed

The number of samples that passed evaluations.

Type:

int

failed

The number of samples that failed evaluations.

Type:

int

skipped

The number of samples that were not evaluated (typically because of missing evaluation annotations).

Type:

int

errored

The number of samples that errored during evaluations.

Type:

int

total

The total number of samples (including the ones skipped).

Type:

int

percentage

The percentage of passed evaluations (not considering the skipped samples).

Type:

float

errored: int
failed: int
name: str
passed: int
property percentage
property skipped
total: int
class giskard_hub.data.Model(name: str, project_id: str | None = None, url: str | None = None, description: str | None = None, supported_languages: List[str] = ..., headers: Dict[str, str] = ...)[source]

Bases: Entity

chat(messages: List[ChatMessage]) ModelOutput[source]

Chat with the model.

Parameters:

messages (List[ChatMessage]) – A list of messages to send to the model.

Returns:

The model response.

Return type:

ModelOutput

description: str | None = None
classmethod from_dict(data: Dict[str, str], **kwargs) Model[source]

Class method factory, allowing to filter from a dict.

Parameters:

data (Dict[str, Any]) – The data to use to initialize the dataclass.

headers: Dict[str, str]
name: str
project_id: str | None = None
supported_languages: List[str]
url: str | None = None
class giskard_hub.data.ModelOutput(message: ChatMessage, metadata: Dict[str, any] = ...)[source]

Bases: BaseData

Model output.

classmethod from_dict(data: Dict[str, any], **kwargs) BaseData[source]

Class method factory.

Parameters:

data (Dict[str, Any]) – The data to use to initialize the dataclass.

Returns:

The dataclass instance.

Return type:

BaseDataclass

message: ChatMessage
metadata: Dict[str, any]
class giskard_hub.data.Project(name: str, description: str = '')[source]

Bases: Entity

name

The name of the project.

Type:

str

description

The description of the project.

Type:

str, optional

description: str = ''
name: str