Quick start¶
The Hub is the user interface from which you can perform LLM evaluations. It can be deployed on-premise or in the cloud, depending on your specific needs.
Note
Throughout this user guide, we’ll use a banking app called Zephyr Bank, designed by data scientists. The app’s chatbot provides customer service support on the bank’s website, offering information about the bank’s products, services, and more.
The Dashboard¶
The Dashboard is the first page you’ll see upon logging in. It provides an overview of your project, displaying the number of models, datasets, evaluations, and knowledge bases.
It also features a graph showing the model’s performance over time, measured by two metrics: Conformity and Correctness. By default, the bar graph displays Conformity; clicking the Correctness block switches the view to show Correctness data. We’ll cover these metrics in more detail in the Evaluations section.
Additionally, the dashboard lists your most recent evaluations and datasets for quick access.
[Image: Dashboard]
Create a project¶
In this section, you will learn how to create a project. Before creating one, ensure you have properly configured the model (see the “Setup the model” section).
Click the “Account” icon in the upper right corner of the screen, then select “Settings”. The Settings page allows you to manage your projects and users (if you have the proper access rights).
In the Projects tab, click the “Create project” button. A modal will appear where you can enter your project’s name and description.
[Image: Create a project]
Once the project is created, you can access its dashboard by clicking on it in the list. Alternatively, use the dropdown menu in the upper left corner of the screen to select the project you want to work on.
Setup the model¶
This section guides you through creating a new model.
Note
Models are conversational agents configured through an API endpoint. They can be evaluated against datasets.
On the Agents page, under the Model tab, click the “New model” button.
[Image: List of models]
The interface below displays the model details that need to be filled out.
[Image: Setup the model]
Name
: The name of the agent.
Description
: Used to refine automatic evaluation and generation for better accuracy in your specific use case.
Supported languages
: Add the languages your agent can handle. Note that this affects data generation.
Connection settings
:
    Model API endpoint
    : The URL of your model’s API endpoint. This is where requests are sent to interact with your model.
    Headers
    : Custom headers to include in requests, for example for authentication.
The endpoint should expect an object shape like the following example:
{
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    },
    {
      "role": "assistant",
      "content": "Hello! How can I help you?"
    },
    {
      "role": "user",
      "content": "What color is an orange?"
    }
  ]
}
The endpoint’s response should be structured as follows:
{
  "response": {
    "role": "assistant",
    "content": "An orange is green"
  },
  "metadata": {
    "some_key": "whatever value"
  }
}
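To make the request and response shapes concrete, here is a minimal sketch of an endpoint written with Python’s standard library only. It is an illustration under assumptions, not the Hub’s reference implementation: the function names `answer`, `handle_chat`, and `serve`, the echo logic, the `message_count` metadata key, and the host and port are all hypothetical. Replace `answer` with a call to your own chatbot.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def answer(messages):
    """Hypothetical stand-in for a real chatbot: echoes the last user message."""
    last_user = next(
        (m["content"] for m in reversed(messages) if m["role"] == "user"), ""
    )
    return f"You said: {last_user}"


def handle_chat(payload):
    """Map a request body with a "messages" list to the documented response shape."""
    messages = payload["messages"]
    return {
        "response": {"role": "assistant", "content": answer(messages)},
        # "metadata" holds free-form key/value pairs; this key is only an example.
        "metadata": {"message_count": len(messages)},
    }


class ChatHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and decode the JSON body sent to the endpoint.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(handle_chat(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host="127.0.0.1", port=8000):
    """Start the sketch server; host and port are arbitrary local values."""
    HTTPServer((host, port), ChatHandler).serve_forever()
```

Calling `serve()` starts the server locally. Any authentication you configure under Headers would have to be checked inside `do_POST`; that check is left out here because the header names depend on your setup.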
Import a knowledge base¶
This section guides you through importing your custom knowledge base.
Note
A Knowledge Base is a domain-specific collection of information. You can have several knowledge bases for different areas of your business.
On the Agents page, under the Model tab, click the “Add knowledge base” button.
[Image: List of knowledge bases]
The interface below displays the knowledge base details that need to be filled out.
[Image: Import a knowledge base]
Name
: The name of the knowledge base.
File
: The document to upload, in CSV format, containing the knowledge base content. The file should have one column named “text” with the document content. If you’re uploading a knowledge base with pre-defined topics, the file should have two columns with the first row labeled “text, topic”. Note the following rules:
    - If the text has a value but the topic is blank, the topic will be set to ‘Others’.
    - If both the text and topic are blank, or if the text is blank but the topic has a value, the row will not be imported.
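The import rules above can be checked locally before uploading a file. The following sketch is only an approximation of those rules, not the Hub’s own import code: the function name `preview_import` and the sample rows are hypothetical, and it assumes that whitespace-only cells count as blank and that the header row is written as `text,topic` with no space.

```python
import csv
import io


def preview_import(csv_text):
    """Return the (text, topic) pairs that the documented rules would keep."""
    reader = csv.DictReader(io.StringIO(csv_text))
    has_topics = "topic" in (reader.fieldnames or [])
    kept = []
    for row in reader:
        text = (row.get("text") or "").strip()
        if not text:
            continue  # blank text: the row is not imported, whatever the topic
        if not has_topics:
            kept.append((text, None))  # no topic column: topics are generated later
            continue
        topic = (row.get("topic") or "").strip()
        kept.append((text, topic or "Others"))  # blank topic falls back to 'Others'
    return kept


sample = (
    "text,topic\n"
    "Zephyr Bank offers savings accounts.,Accounts\n"
    "Cards can be blocked from the mobile app.,\n"
    ",Loans\n"
    ",\n"
)

for text, topic in preview_import(sample):
    print(f"{topic}: {text}")
```

With this sample, only the first two rows survive: the second gets the topic ‘Others’, and the last two are dropped because their text is blank.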
The interface below displays information about the knowledge base and its content with corresponding topics. If no topics were uploaded with the knowledge base, Giskard Hub will identify and generate them for you. In the example below, the knowledge base is ready to be used with over 200 documents and 3 topics.
[Image: Imported knowledge base]