Detect security vulnerabilities by generating synthetic tests

Security testing is a critical component of LLM agent evaluation. It focuses on identifying vulnerabilities that could be exploited by malicious actors or lead to unintended behavior.

Adversarial Security Testing

The generate_adversarial method creates test cases designed to expose security vulnerabilities and robustness issues in your AI agents. It is particularly useful for probing categories such as prompt injection, harmful content generation, and unintended information disclosure:

# Generate adversarial test cases for security testing
# Assumes `hub` is an authenticated Giskard Hub client and `model` refers to
# an agent already registered in the Hub.
security_dataset = hub.datasets.generate_adversarial(
    model_id=model.id,
    dataset_name="Security Test Cases",
    description="Adversarial test cases for security vulnerability detection",
    categories=[
        {
            "id": "prompt_injection",
            "name": "Prompt Injection",
            "desc": "Tests for prompt injection vulnerabilities"
        },
        {
            "id": "harmful_content",
            "name": "Harmful Content",
            "desc": "Tests for harmful content generation"
        },
        {
            "id": "information_disclosure",
            "name": "Information Disclosure",
            "desc": "Tests for unintended information leakage"
        }
    ],
    n_examples=20,  # Optional: number of conversations per category to generate
)

# Wait for the dataset to be created
security_dataset.wait_for_completion()

# Print the first message of each conversation in the dataset
for conversation in security_dataset.conversations:
    print(conversation.messages[0].content)
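
If you want to review the generated cases offline, the minimal sketch below dumps each conversation to a local JSON file. It assumes every message exposes a role attribute in addition to the content attribute shown above, and the output file name is only an example; adjust both to match the objects your SDK version returns.

import json

# Export the generated conversations for offline review.
# Assumption: each message exposes a `role` attribute alongside `content`
# (only `content` appears above); the output file name is arbitrary.
export = [
    [
        {"role": getattr(message, "role", None), "content": message.content}
        for message in conversation.messages
    ]
    for conversation in security_dataset.conversations
]

with open("security_test_cases.json", "w") as f:
    json.dump(export, f, indent=2)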

Note

You can also use the Giskard Hub UI to generate security test cases if you prefer a visual interface.

Next steps