Inspect AI
Supervise and evaluate agentic systems using the UK AI Safety Institute’s Inspect AI evaluation framework
Getting Started with Inspect AI
Inspect AI is a powerful open-source framework for large language model evaluations. This guide will help you get started with running examples, editing configurations, and using Asteroid’s approvers together with Inspect.
Note ⚠️: We are in the process of consolidating our functionality under the `asteroid-sdk` package, but the Inspect AI features are currently still available in the `entropy-labs` package. We are actively working on migrating all features to `asteroid-sdk` for a more unified experience.
Running the Inspect AI Example
You can run the Inspect AI example to see how it works and understand how to use it in your own projects. Before running the example, make sure that you have cloned the Asteroid repository and that your frontend and backend servers are running. You can start them using Docker Compose as described in the Quickstart guide.
Install Dependencies
To get started, ensure that `inspect-ai` and `asteroid-sdk` are installed in your Python environment, for example via `pip install inspect-ai asteroid-sdk`.
Navigate to Example Directory
Change to the `examples/inspect_example` directory in the cloned repository.
Run the Example with Multiple Approval Levels
Run the example that uses `approval_escalation.yaml` for its approval configuration, for example by executing `python run.py` from the example directory. This runs the example and triggers approvals, which you can view at http://localhost:3000.
Change Approval Configuration
You can make changes to the `examples/inspect_example/run.py` file to try different models, more samples, or different approval configurations. For example, we provide two more approval configuration files, `approval_human.yaml` and `approval_llm.yaml`. Change the `approval_file_name` in `run.py` to use a different approval configuration file.
Registering Inspect AI Evaluations with Asteroid
To integrate Inspect AI evaluations with Asteroid, it’s necessary to register projects, tasks, and runs. This registration ensures that your evaluation is correctly tracked and managed within the Asteroid platform. Additionally, you can optionally use the Asteroid scorer to score the finished evaluation using the Asteroid UI.
Here are the relevant additions to the Inspect Task definition:
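The additions can be sketched as follows. This is a minimal illustration, assuming a trivial dataset and an assumed import path for the asteroid-sdk helpers; only the `solver` and `scorer` additions come from the explanation below:

```python
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
# import path below is an assumption; adjust to where asteroid-sdk exposes these helpers
from asteroid_sdk import (
    register_inspect_samples_with_asteroid_sdk_solver,
    asteroid_sdk_web_ui_scorer,
)

@task
def my_evaluation():
    return Task(
        dataset=[Sample(input="...")],  # your evaluation samples
        solver=[
            # register this run's samples with the Asteroid backend
            register_inspect_samples_with_asteroid_sdk_solver(),
            # ... your usual solvers (generation, tool use, etc.)
        ],
        # optional: collect scores and feedback via the Asteroid UI
        scorer=asteroid_sdk_web_ui_scorer(),
    )
```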
Explanation of the Additions:
- **Registering samples with Asteroid:** the `register_inspect_samples_with_asteroid_sdk_solver` function is necessary for correctly registering your evaluation with Asteroid. It should be added to the `solver` list in your `Task` definition.
- **Using the Asteroid scorer (optional):** the `asteroid_sdk_web_ui_scorer` function can be used to score the finished evaluation through the Asteroid UI. By setting the `scorer` parameter to `asteroid_sdk_web_ui_scorer()`, you enable scoring and feedback collection through the web interface.
Note: Ensure that you have the Asteroid backend server running and accessible at the URL specified in `asteroid_sdk_backend_url`. This is typically http://localhost:8080 when running locally.
By incorporating these functions into your code, you enable seamless integration with Asteroid, enhancing your ability to monitor, manage, and assess your Inspect AI evaluations.
Approval Configuration
Approvals in Inspect AI are configured using YAML files. Example approval configurations are provided in the `approval_human.yaml`, `approval_llm.yaml`, and `approval_escalation.yaml` files.
Using Approvers
We follow the Inspect AI format for the approval configuration. Our approvers are provided by the `asteroid-sdk` package; install it to make them available. You can then reference the approvers in the approval configuration using names like `asteroid_sdk/human_approver`.
Approval Configuration Examples
Human Approval
The `approval_human.yaml` file configures human approval for all tools:
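A minimal sketch of what this configuration might contain, following the Inspect AI approval-policy YAML format (field names beyond the approver name are assumptions):

```yaml
approvers:
  - name: asteroid_sdk/human_approver
    tools: "*"   # require human approval for every tool call
```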
LLM Approval
The `approval_llm.yaml` file configures approval using a language model. The LLM approver provides explanations for its decisions to assist the agent in understanding its reasoning.
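A sketch of an LLM-based policy in the same format; the approver name `asteroid_sdk/llm_approver` is an assumption based on the naming pattern above:

```yaml
approvers:
  - name: asteroid_sdk/llm_approver   # assumed name, by analogy with human_approver
    tools: "*"                        # let the LLM review every tool call
```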
Escalation with Allowlists
In the `approval_escalation.yaml` example, approvals for the bash and Python tools are granted by checking whether the commands or modules/functions used are on the allowed list. If this check does not approve the action, human approval is triggered.
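Such an escalation chain might be laid out as follows; the bash/python approver names and all field names here are illustrative assumptions, not the actual file contents:

```yaml
approvers:
  # 1. bash commands are checked against an allowlist
  - name: asteroid_sdk/bash_approver        # assumed name
    tools: "bash"
    allowed_commands: ["ls", "cat", "echo"] # illustrative allowlist
    allow_sudo: false

  # 2. Python code is checked for allowed modules and functions
  - name: asteroid_sdk/python_approver      # assumed name
    tools: "python"
    allowed_modules: ["math", "json"]
    allowed_functions: ["print", "len"]

  # 3. anything not approved above escalates to a human
  - name: asteroid_sdk/human_approver
    tools: "*"
```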
This configuration ensures that only approved Python modules and functions, as well as specific bash commands, can be used without triggering human approval. Any attempts to use disallowed or sensitive operations will require human intervention through the specified API endpoint.
Understanding the Approvers
Human Approver
The human approver requires human interaction to approve or reject actions taken by the agent. Approvals are managed through the approval API endpoint specified in the configuration. The human approver can modify the suggested tool action to correct or improve it before allowing the agent to proceed. We also provide an explanation and a risk score for the suggested tool action to help the human approver make an informed decision.
Understanding the `n` Parameter
In the `human_approver`, the `n` parameter specifies the number of action options that the model should generate for each tool call. When `n` is greater than 1, the model generates multiple options for the action, and the human approver can select the most appropriate one. This provides flexibility in overseeing the agent’s actions.
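For example, a human-approver entry that asks the model for two candidate actions per tool call might look like this (a sketch; field names other than the approver name are assumptions):

```yaml
approvers:
  - name: asteroid_sdk/human_approver
    tools: "*"
    n: 2   # generate two action options for the human to choose from
```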
This is the view for the human approver with `n=2`. You can see the multiple options presented to the human approver, along with the explanation and risk scores for each option:
These capabilities provide greater control over the agent’s behavior, ensuring that the human approver can guide the agent effectively.
Bash Approver
The bash approver checks bash commands against an allowlist of permitted commands. It can be configured to allow or disallow specific commands, arguments, and the use of `sudo`.
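As an illustration of the kind of check involved, here is a simplified stand-in in plain Python (not the asteroid-sdk implementation; the allowlist is invented):

```python
import shlex

ALLOWED_COMMANDS = {"ls", "cat", "echo"}  # illustrative allowlist
ALLOW_SUDO = False

def bash_command_allowed(command: str) -> bool:
    """Return True if the bash command's program is on the allowlist."""
    tokens = shlex.split(command)
    if not tokens:
        return False
    # treat "sudo <cmd>" specially: reject unless sudo is explicitly allowed
    if tokens[0] == "sudo":
        if not ALLOW_SUDO:
            return False
        tokens = tokens[1:]
    return bool(tokens) and tokens[0] in ALLOWED_COMMANDS
```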
Python Approver
The python approver checks Python code for the use of allowed modules and functions. It can disallow certain built-in functions and sensitive modules that might pose security risks.
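A simplified stand-in for this kind of static check, using Python’s `ast` module (illustrative only; the real approver’s rules and API will differ):

```python
import ast

ALLOWED_MODULES = {"math", "json"}               # illustrative allowlist
DISALLOWED_BUILTINS = {"eval", "exec", "open"}   # illustrative blocklist

def python_code_allowed(code: str) -> bool:
    """Return True if the code imports only allowlisted modules and
    never calls a disallowed built-in by name."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # check the top-level package of each "import x.y" statement
            if any(alias.name.split(".")[0] not in ALLOWED_MODULES
                   for alias in node.names):
                return False
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] not in ALLOWED_MODULES:
                return False
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name) and func.id in DISALLOWED_BUILTINS:
                return False
    return True
```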
Editing Configurations
You can customize the behavior of your agents by editing the approval YAML files and modifying the example code.
Modifying the Agent Script
You can edit `run.py` to change the tasks, the tools used, and other aspects of the agent’s behavior.
Additional Resources
For more information on Inspect AI and approval configurations, refer to the Inspect AI documentation and the Asteroid package.