Detect security vulnerabilities by generating synthetic tests
Security testing is a critical component of LLM agent evaluation. It focuses on identifying vulnerabilities that could be exploited by malicious actors or lead to unintended behavior.
Adversarial Security Testing
The generate_adversarial method creates test cases designed to expose security vulnerabilities and robustness issues in your AI agents. It is particularly useful for probing weaknesses such as prompt injection, harmful content generation, and unintended information disclosure.
# Generate adversarial test cases for security testing
# `hub` is assumed to be an authenticated Giskard Hub client, and `model`
# an agent you have already registered on the Hub.
security_dataset = hub.datasets.generate_adversarial(
    model_id=model.id,
    dataset_name="Security Test Cases",
    description="Adversarial test cases for security vulnerability detection",
    categories=[
        {
            "id": "prompt_injection",
            "name": "Prompt Injection",
            "desc": "Tests for prompt injection vulnerabilities"
        },
        {
            "id": "harmful_content",
            "name": "Harmful Content",
            "desc": "Tests for harmful content generation"
        },
        {
            "id": "information_disclosure",
            "name": "Information Disclosure",
            "desc": "Tests for unintended information leakage"
        }
    ],
    n_examples=20  # Optional: number of conversations to generate per category
)
# Wait for the dataset to be created
security_dataset.wait_for_completion()

# List the conversations in the dataset
for conversation in security_dataset.conversations:
    print(conversation.messages[0].content)
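If you want to skim the generated attack prompts before reviewing them in the Hub, you can export the opening message of each conversation to a local file. The snippet below is a minimal sketch that continues from the code above and only relies on the conversations and messages attributes already shown; the output filename is arbitrary.

import json

# Collect the opening message of each generated conversation
generated_prompts = [
    conversation.messages[0].content
    for conversation in security_dataset.conversations
]

# Save the prompts to a local file for offline review
with open("security_test_prompts.json", "w") as f:
    json.dump(generated_prompts, f, indent=2)

print(f"Exported {len(generated_prompts)} adversarial prompts")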
Note
You can also use the Giskard Hub UI to generate security test cases if you prefer a visual interface.
Next steps
Review test cases - Make sure to Review tests with human feedback
Generate business failures - Try Detect business failures by generating synthetic tests
Set up continuous red teaming - Understand exhaustive and proactive detection with Continuous red teaming