🛡️ LLM Vulnerability Scanner: Automatically Test & Secure Your AI Agents (OWASP, NIST, MITRE)

Giskard

5 Nov 2025 · 02:16

Summary

TLDR: The LLM Vulnerability Scanner is an automated tool that evaluates AI agents by generating realistic attack scenarios and classifying them by risk level: critical, major, or minor. Its probes cover both security issues and business concerns such as harmful content generation. Users can review the probes and add selected ones to their evaluation datasets. The tool is available through a web UI and an SDK, so scans can be launched and results analyzed from Python workflows. The result is a vulnerability assessment that can be customized to each agent's use case.

Takeaways

😀 The LLM Vulnerability Scanner generates realistic evaluation scenarios for AI agents, categorizing them as critical, major, or minor risks.
😀 Scenarios are grouped into 11 categories, with most based on OWASP security standards and some focusing on business concerns like harmful content generation.
😀 Each attack scenario is associated with specific probes that simulate potential vulnerabilities.
😀 Probes are evaluated to determine their likelihood of success and are categorized accordingly.
😀 Users can review probes and decide whether to add them to their evaluation dataset or mark them as false positives.
😀 The LLM Vulnerability Scanner interface lets users launch a scan by selecting the agent they want to test.
😀 Including a knowledge base in the scan tailors the attack scenarios to the user's specific use case.
😀 To launch a scan, users can select probes, choose agents, and then initiate the scan for vulnerability evaluation.
😀 Python users can also run vulnerability scans via the SDK by initializing the client, getting the model ID, and executing the scan programmatically.
😀 Users can monitor the progress of scans and view metrics after completion to assess the performance and results of the evaluation.
😀 The documentation provides detailed guidance on how to use the LLM Vulnerability Scanner both through the UI and the SDK.
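
The SDK steps mentioned above (initialize the client, get the model ID, launch the scan, monitor progress) can be sketched roughly as follows. This is only an illustration of the flow: `StubScanClient`, `get_model_id`, `launch_scan`, `poll`, the agent name, and the probe names are all hypothetical stand-ins that run in memory. They are not the real SDK's API, so the actual calls should be taken from the documentation.

```python
import itertools
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ScanStatus(Enum):
    RUNNING = "running"
    FINISHED = "finished"


@dataclass
class Scan:
    scan_id: int
    model_id: str
    knowledge_base_id: Optional[str]
    probes: List[str]
    progress: int = 0  # number of probes executed so far

    @property
    def status(self) -> ScanStatus:
        done = self.progress >= len(self.probes)
        return ScanStatus.FINISHED if done else ScanStatus.RUNNING


class StubScanClient:
    """Hypothetical in-memory stand-in for a scanner SDK client (not a real API)."""

    def __init__(self, api_key: str):
        self.api_key = api_key
        self._models = {"support-bot": "model-001"}  # made-up agent registry
        self._ids = itertools.count(1)

    def get_model_id(self, agent_name: str) -> str:
        return self._models[agent_name]

    def launch_scan(self, model_id, probes, knowledge_base_id=None) -> Scan:
        return Scan(next(self._ids), model_id, knowledge_base_id, list(probes))

    def poll(self, scan: Scan) -> Scan:
        # Each poll simulates one more probe finishing.
        if scan.progress < len(scan.probes):
            scan.progress += 1
        return scan


# 1. Initialize the client, 2. get the model ID, 3. launch the scan.
client = StubScanClient(api_key="dummy")
model_id = client.get_model_id("support-bot")
scan = client.launch_scan(
    model_id,
    probes=["prompt_injection", "harmful_content"],  # hypothetical probe names
    knowledge_base_id="kb-demo",  # optional: tailors attacks to the use case
)

# 4. Monitor progress until the scan completes.
while scan.status is ScanStatus.RUNNING:
    scan = client.poll(scan)

print(scan.status.value, f"{scan.progress}/{len(scan.probes)} probes run")
```

A real client would poll a remote service rather than advancing a counter, but the order of the steps is the same as in the takeaway above.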

Q & A

What is the main purpose of the LLM Vulnerability Scanner?
-The LLM Vulnerability Scanner automatically generates realistic evaluation scenarios for AI agents and categorizes them as critical, major, or minor to identify which scenarios pose the greatest risk to a company.
How does the LLM Vulnerability Scanner classify the scenarios?
-The scanner classifies scenarios into three categories: critical, major, or minor, based on their likelihood to pose a risk to the company.
What are the 11 categories of attack patterns in the scanner?
-The scanner groups its attack patterns into 11 categories. Most are tied to security standards, while others cover business-related issues like harmful content generation.
How does the user interact with the probes in the LLM Vulnerability Scanner?
-The user reviews individual probes, decides whether to add them to an evaluation dataset, or marks them as false positives if they are not relevant for the evaluation.
What can users find in the documentation regarding the vulnerability scanner?
-The documentation provides an overview of attack categories, probes associated with each category, and detailed instructions on how to launch the vulnerability scan and customize probes for specific scenarios.
How does the LLM Vulnerability Scanner allow customization of attacks?
-Users can include a knowledge base that adapts the attacks to their specific usage scenario, ensuring the generated attack patterns are relevant to the context.
What is the process for launching a vulnerability scan in the UI?
-To launch a scan, the user selects the agent to scan, includes a knowledge base for customization, selects relevant probes, and then launches the scan.
Can Python users launch a vulnerability scan using the LLM Vulnerability Scanner?
-Yes, Python users can launch a vulnerability scan through the SDK. As the SDK documentation describes, they initialize the client, obtain a model ID, and execute the scan.
What metrics are provided after completing a vulnerability scan?
-After completing the scan, users can view metrics that detail the success or failure of probes in evaluating the LLM agent.
How are false positives handled in the scanner?
-If a probe is deemed irrelevant or inaccurate, users can mark it as a false positive and exclude it from their evaluation dataset.
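
The review step described in these answers (checking probes, marking false positives, building the evaluation dataset, and reading the resulting metrics) can be illustrated with a small sketch. The `ProbeResult` fields, the `triage` helper, and the probe names are assumptions made for this example. The scanner's actual data model and metrics may look different.

```python
from collections import Counter
from dataclasses import dataclass

# Lower number = higher risk, used only for sorting.
SEVERITY_ORDER = {"critical": 0, "major": 1, "minor": 2}


@dataclass
class ProbeResult:
    name: str
    severity: str                 # "critical", "major", or "minor"
    attack_succeeded: bool        # did the probe expose a vulnerability?
    false_positive: bool = False  # set by a human reviewer


def triage(results):
    """Return (evaluation_dataset, metrics) from reviewed probe results."""
    # Keep only successful attacks that the reviewer did not reject.
    kept = [r for r in results if r.attack_succeeded and not r.false_positive]
    kept.sort(key=lambda r: SEVERITY_ORDER[r.severity])
    metrics = {
        "total": len(results),
        "succeeded": sum(r.attack_succeeded for r in results),
        "false_positives": sum(r.false_positive for r in results),
        "by_severity": dict(Counter(r.severity for r in kept)),
    }
    return kept, metrics


# Hypothetical scan output after a reviewer has flagged one false positive.
results = [
    ProbeResult("harmful_content", "major", True),
    ProbeResult("prompt_injection", "critical", True),
    ProbeResult("off_topic_answer", "minor", True, false_positive=True),
    ProbeResult("data_leak", "critical", False),
]

dataset, metrics = triage(results)
print([r.name for r in dataset])
print(metrics)
```

Sorting the kept probes by severity puts the critical findings first, which mirrors the critical / major / minor ordering the scanner uses to show which scenarios pose the greatest risk.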