Entities¶

class giskard_hub.data.ChatMessage(role: Literal['system', 'assistant', 'user'], content: str)[source]¶

Bases: BaseData

Message from an LLM, with role & content.

content: str¶

role: Literal['system', 'assistant', 'user']¶

class giskard_hub.data.Conversation(messages: List[ChatMessage] = ..., rules: List[str] = ..., tags: List[str] = ..., expected_output: str | None = None, demo_output: ChatMessage | None = None)[source]¶

Bases: Entity

A Dataset entry representing a conversation.

messages¶

List of messages in the conversation. Each message is an object with a role and content attributes.

Type:: List[ChatMessage]

tags¶

List of tags for the conversation.

Type:: List[str], optional

expected_output¶

Expected output which will be used for correctness evaluation.

Type:: Optional[str], optional

rules¶

List of rules used for evaluation.

Type:: List[str], optional

demo_output¶

Output of the agent for demonstration purposes.

Type:: Optional[ChatMessage], optional

demo_output: ChatMessage | None = None¶

expected_output: str | None = None¶

classmethod from_dict(data: Dict[str, Any], **kwargs) → Conversation[source]¶

Class method factory, allowing to filter from a dict.

Parameters:: data (Dict[str, Any]) – The data to use to initialize the dataclass.

messages: List[ChatMessage]¶

rules: List[str]¶

tags: List[str]¶

class giskard_hub.data.Dataset(name: str, description: str = '', project_id: str | None = None, tags: List[str] = ...)[source]¶

Bases: Entity

Dataset object, containing the metadata about the dataset.

property conversations¶: Return the conversations of the dataset.

create_conversation(conversation: Conversation)[source]¶: Add a conversation to the dataset.

description: str = ''¶

name: str¶

project_id: str | None = None¶

tags: List[str]¶

class giskard_hub.data.EvaluationRun(name: str | None = None, project_id: str | None = None, datasets: List[Dataset] = ..., model: Model | None = None, criteria: List = ..., metrics: List[Metric] = ..., progress: TaskProgress | None = None)[source]¶

Bases: Entity

Evaluation run.

criteria: List¶

datasets: List[Dataset]¶

classmethod from_dict(data: Dict[str, Any], **kwargs) → EvaluationRun[source]¶

Class method factory, allowing to filter from a dict.

Parameters:: data (Dict[str, Any]) – The data to use to initialize the dataclass.

is_errored() → bool[source]¶: Check if the evaluation terminated with an error.

is_finished() → bool[source]¶: Check if the evaluation is finished.

is_running() → bool[source]¶: Check if the evaluation is running.

metrics: List[Metric]¶

model: Model | None = None¶

name: str | None = None¶

print_metrics()[source]¶: Print the evaluation metrics.

progress: TaskProgress | None = None¶

project_id: str | None = None¶

refresh() → EvaluationRun[source]¶: Refresh the evaluation run from the Hub.

wait_for_completion(timeout: float = 600, poll_interval: float = 5) → EvaluationRun[source]¶

Wait for the evaluation to complete successfully.

Parameters:

timeout (int, optional) – The timeout in seconds, by default 600
poll_interval (int, optional) – The polling interval in seconds, by default 5.

Returns:

The updated evaluation run instance. The object will have a valid metrics attribute containing the evaluation results.

Return type:

EvaluationRun

class giskard_hub.data.Metric(name: str, passed: int, failed: int, errored: int, total: int)[source]¶

Bases: BaseData

Evaluation metric.

name¶

The name of the metric (e.g. “correctness”).

Type:: str

passed¶

The number of samples that passed evaluations.

Type:: int

failed¶

The number of samples that failed evaluations.

Type:: int

skipped¶

The number of samples that were not evaluated (typically because of missing evaluation annotations).

Type:: int

errored¶

The number of samples that errored during evaluations.

Type:: int

total¶

The total number of samples (including the ones skipped).

Type:: int

percentage¶

The percentage of passed evaluations (not considering the skipped samples).

Type:: float

errored: int¶

failed: int¶

name: str¶

passed: int¶

property percentage¶

property skipped¶

total: int¶

class giskard_hub.data.Model(name: str, project_id: str | None = None, url: str | None = None, description: str | None = None, supported_languages: List[str] = ..., headers: Dict[str, str] = ...)[source]¶

Bases: Entity

chat(messages: List[ChatMessage]) → ModelOutput[source]¶

Chat with the model.

Parameters:: messages (List[ChatMessage]) – A list of messages to send to the model.
Returns:: The model response.
Return type:: ModelOutput

description: str | None = None¶

classmethod from_dict(data: Dict[str, str], **kwargs) → Model[source]¶

Class method factory, allowing to filter from a dict.

Parameters:: data (Dict[str, Any]) – The data to use to initialize the dataclass.

headers: Dict[str, str]¶

name: str¶

project_id: str | None = None¶

supported_languages: List[str]¶

url: str | None = None¶

class giskard_hub.data.ModelOutput(message: ChatMessage, metadata: Dict[str, any] = ...)[source]¶

Bases: BaseData

Model output.

classmethod from_dict(data: Dict[str, any], **kwargs) → BaseData[source]¶

Class method factory.

Parameters:: data (Dict[str, Any]) – The data to use to initialize the dataclass.
Returns:: The dataclass instance.
Return type:: BaseDataclass

message: ChatMessage¶

metadata: Dict[str, any]¶

class giskard_hub.data.Project(name: str, description: str = '')[source]¶

Bases: Entity

name¶

The name of the project.

Type:: str

description¶

The description of the project.

Type:: str, optional

description: str = ''¶

name: str¶