Entities reference
Entities represent the core data objects in Giskard Hub, such as projects, datasets, models, and evaluations.
- class giskard_hub.data.ChatMessage(role: Literal['system', 'assistant', 'user'], content: str)[source]
Bases:
BaseData
Message from an LLM, with role & content.
- content: str
- role: Literal['system', 'assistant', 'user']
- class giskard_hub.data.ChatTestCase(messages: List[ChatMessage] = ..., demo_output: ChatMessageWithMetadata | None = None, tags: List[str] = ..., checks: List[CheckConfig] = ...)[source]
Bases:
Entity
A Dataset entry representing a chat test case.
- messages
List of messages in the chat test case. Each message is an object with a role and content attributes.
- Type:
List[ChatMessage]
- demo_output
Output of the agent for demonstration purposes.
- Type:
Optional[ChatMessageWithMetadata], optional
- tags
List of tags for the chat test case.
- Type:
List[str], optional
- checks
List of checks to be performed on the chat test case.
- Type:
List[CheckConfig], optional
- checks: List[CheckConfig]
- demo_output: ChatMessageWithMetadata | None = None
- classmethod from_dict(data: Dict[str, Any], **kwargs) ChatTestCase [source]
Class method factory, allowing to filter from a dict.
- Parameters:
data (Dict[str, Any]) – The data to use to initialize the dataclass.
- messages: List[ChatMessage]
- tags: List[str]
- class giskard_hub.data.Check(identifier: str, description: str, name: str, params: Dict[str, Any])[source]
Bases:
Entity
- description: str
- identifier: str
- name: str
- params: Dict[str, Any]
- class giskard_hub.data.Conversation(messages: List[ChatMessage] = ..., demo_output: ChatMessageWithMetadata | None = None, tags: List[str] = ..., checks: List[CheckConfig] = ...)[source]
Bases:
ChatTestCase
A Dataset entry representing a conversation.
- messages
List of messages in the conversation. Each message is an object with a role and content attributes.
- Type:
List[ChatMessage]
- demo_output
Output of the agent for demonstration purposes.
- Type:
Optional[ChatMessageWithMetadata], optional
- tags
List of tags for the conversation.
- Type:
List[str], optional
- checks
List of checks to be performed on the conversation.
- Type:
List[CheckConfig], optional
- classmethod from_dict(data: Dict[str, Any], **kwargs) Conversation [source]
Class method factory, allowing to filter from a dict.
- Parameters:
data (Dict[str, Any]) – The data to use to initialize the dataclass.
- class giskard_hub.data.Dataset(name: str, description: str = '', project_id: str | None = None, tags: List[str] = ...)[source]
Bases:
EntityWithTaskProgress
Dataset object, containing the metadata about the dataset.
- property chat_test_cases
Return the chat test cases of the dataset.
- property conversations
Return the conversations of the dataset.
- create_chat_test_case(chat_test_case: ChatTestCase)[source]
Add a chat test case to the dataset.
- create_conversation(conversation: Conversation)[source]
Add a conversation to the dataset.
- description: str = ''
- classmethod from_dict(data: dict, **kwargs) giskard.Dataset [source]
Create a Dataset instance from a dictionary.
- name: str
- project_id: str | None = None
- property resource: str
Abstract property for the resource name used in API calls.
- tags: List[str]
- class giskard_hub.data.Document(content: 'str', topic_id: "'UUID' | None" = None, embedding: 'list[float]' = ...)[source]
Bases:
Entity
- content: str
- embedding: list[float]
- topic_id: 'UUID' | None = None
- class giskard_hub.data.EvaluationRun(name: str | None, project_id: str | None, datasets: List[giskard.Dataset] = ..., model: Model | None = None, criteria: List = ..., metrics: List[Metric] = ..., tags: List[Metric] = ..., failure_categories: Dict[str, int] = ..., scheduled_evaluation_id: str | None = None)[source]
Bases:
EntityWithTaskProgress
Evaluation run.
- criteria: List
- datasets: List[Dataset]
- failure_categories: Dict[str, int]
- classmethod from_dict(data: Dict[str, Any], **kwargs) EvaluationRun [source]
Class method factory, allowing to filter from a dict.
- Parameters:
data (Dict[str, Any]) – The data to use to initialize the dataclass.
- metrics: List[Metric]
- model: Model | None = None
- name: str | None
- print_metrics()[source]
Print the evaluation metrics.
- project_id: str | None
- property resource: str
Abstract property for the resource name used in API calls.
- scheduled_evaluation_id: str | None = None
- tags: List[Metric]
- class giskard_hub.data.FrequencyOption(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]
Bases:
str
,Enum
Frequency options for scheduled evaluations.
- DAILY = 'daily'
- MONTHLY = 'monthly'
- WEEKLY = 'weekly'
- class giskard_hub.data.KnowledgeBase(name: 'str', project_id: 'str', description: 'str | None' = None, n_documents: 'int' = 0, filename: 'str | None' = None, topics: 'list[Topic]' = ...)[source]
Bases:
EntityWithTaskProgress
- description: str | None = None
- filename: str | None = None
- classmethod from_dict(data: dict, **kwargs) KnowledgeBase [source]
Create a KnowledgeBase instance from a dictionary.
- n_documents: int = 0
- name: str
- project_id: str
- property resource: str
Abstract property for the resource name used in API calls.
- topics: list[Topic]
- class giskard_hub.data.Metric(name: str, passed: int, failed: int, errored: int, total: int)[source]
Bases:
BaseData
Evaluation metric.
- name
The name of the metric (e.g. “correctness”).
- Type:
str
- passed
The number of samples that passed evaluations.
- Type:
int
- failed
The number of samples that failed evaluations.
- Type:
int
- skipped
The number of samples that were not evaluated (typically because of missing evaluation annotations).
- Type:
int
- errored
The number of samples that errored during evaluations.
- Type:
int
- total
The total number of samples (including the ones skipped).
- Type:
int
- percentage
The percentage of passed evaluations (not considering the skipped samples).
- Type:
float
- errored: int
- failed: int
- name: str
- passed: int
- property percentage
- property skipped
- total: int
- class giskard_hub.data.Model(name: str, project_id: str | None = None, url: str | None = None, description: str | None = None, supported_languages: List[str] = ..., headers: Dict[str, str] = ...)[source]
Bases:
Entity
- chat(messages: List[ChatMessage]) ModelOutput [source]
Chat with the model.
- Parameters:
messages (List[ChatMessage]) – A list of messages to send to the model.
- Returns:
The model response.
- Return type:
ModelOutput
- description: str | None = None
- classmethod from_dict(data: Dict[str, str], **kwargs) Model [source]
Class method factory, allowing to filter from a dict.
- Parameters:
data (Dict[str, Any]) – The data to use to initialize the dataclass.
- headers: Dict[str, str]
- name: str
- project_id: str | None = None
- supported_languages: List[str]
- url: str | None = None
- class giskard_hub.data.ModelOutput(message: ChatMessage | None = None, metadata: Dict[str, any] = ..., error: ExecutionError | None = None)[source]
Bases:
BaseData
Model output.
- error: ExecutionError | None = None
- classmethod from_dict(data: Dict[str, any], **kwargs) BaseData [source]
Class method factory.
- Parameters:
data (Dict[str, Any]) – The data to use to initialize the dataclass.
- Returns:
The dataclass instance.
- Return type:
BaseDataclass
- message: ChatMessage | None = None
- metadata: Dict[str, any]
- class giskard_hub.data.Project(name: str, description: str = '')[source]
Bases:
Entity
- name
The name of the project.
- Type:
str
- description
The description of the project.
- Type:
str, optional
- description: str = ''
- name: str
- class giskard_hub.data.ScheduledEvaluation(project_id: str, name: str, model_id: str, dataset_id: str, tags: list[str] = ..., run_count: int = 1, frequency: FrequencyOption = FrequencyOption.DAILY, time: str = '00:00', day_of_week: int | None = None, day_of_month: int | None = None, paused: bool = False, last_execution_at: datetime | None = None, last_execution_status: SuccessExecutionStatus | ErrorExecutionStatus | None = None)[source]
Bases:
Entity
Scheduled evaluation entity.
- project_id
The ID of the project this scheduled evaluation belongs to.
- Type:
str
- name
The name of the scheduled evaluation.
- Type:
str
- model_id
The ID of the model to evaluate.
- Type:
str
- dataset_id
The ID of the dataset to evaluate against.
- Type:
str
- tags
List of tags to filter the conversations that will be evaluated.
- Type:
List[str], optional
- run_count
The number of times to run each test case (1-5).
- Type:
int
- frequency
The frequency of the scheduled evaluation (daily, weekly, monthly).
- Type:
FrequencyOption
- time
The time to run the evaluation (HH:MM format).
- Type:
str
- day_of_week
The day of the week to run (1-7, 1 is Monday). Required for weekly frequency.
- Type:
int, optional
- day_of_month
The day of the month to run (1-28). Required for monthly frequency.
- Type:
int, optional
- paused
Whether the scheduled evaluation is paused.
- Type:
bool
- last_execution_at
The timestamp of the last execution.
- Type:
datetime, optional
- last_execution_status
The status of the last execution.
- Type:
ExecutionStatus, optional
- dataset_id: str
- day_of_month: int | None = None
- day_of_week: int | None = None
- frequency: FrequencyOption = 'daily'
- classmethod from_dict(data: dict[str, Any], *, _client=None, **kwargs: Any) ScheduledEvaluation [source]
Class method factory, allowing to filter from a dict.
- Parameters:
data (Dict[str, Any]) – The data to use to initialize the dataclass.
- last_execution_at: datetime | None = None
- last_execution_status: ExecutionStatus | None = None
- model_id: str
- name: str
- paused: bool = False
- project_id: str
- property resource: str
- run_count: int = 1
- tags: list[str]
- time: str = '00:00'
- class giskard_hub.data.Topic(name: 'str', description: 'str | None' = None)[source]
Bases:
Entity
- description: str | None = None
- name: str