Safe Labs AI — AI Agent Red-Teaming & Evaluation

// threat coverage

What it tests

47 adversarial test cases mapped to the OWASP Agentic Security Initiative (ASI01–ASI10), engineered for production agentic systems.

ASI01

💉

Prompt Injection

Direct and indirect injection attacks targeting system prompts, user context, and tool outputs. Tests agent resistance to adversarial instruction hijacking.

ASI02

🔓

Jailbreak Attacks

Roleplay bypasses, persona switching, encoding tricks, and multi-turn manipulation designed to override safety guardrails and alignment constraints.

ASI03

🔍

Data Leakage

Systematic probing for system prompt exposure, training data extraction, PII leakage through tool calls, and cross-session memory contamination.

ASI04

🎭

Hallucination Attacks

Adversarial prompts engineered to maximize confident false outputs. Measures hallucination rate under factual pressure, ambiguity, and authority spoofing.

ASI05

🔧

Tool-Use Safety

Tests for unauthorized tool invocations, privilege escalation through chained tool calls, and boundary violations in multi-agent orchestration pipelines.

ASI06–10

📡

Behavioral Drift + More

Scope violations, long-context drift, adversarial memory poisoning, multi-agent collusion, and autonomous action boundary testing across extended runs.

// quickstart

How it works

Run your first red-team in under five minutes. Point it at any agent endpoint and get a structured vulnerability report.

                from safelabs import RedTeamAgent, TestSuite

# Point at your agent endpoint
agent = RedTeamAgent(
    target_url="http://localhost:8000/chat",
    framework="langchain"  # or "crewai", "custom"
)

# Run the full OWASP ASI suite
suite   = TestSuite.owasp_asi_full()
results = agent.run(suite)

# Export audit-ready reports
results.export_pdf("audit_report.pdf")
results.export_json("findings.json")
results.export_sarif("findings.sarif")

# Quick summary
print(results.summary())
# → 47 tests · 12 PASS · 35 FAIL · Score: 25/100
              

                # Install
$ pip install safelabs-eval

# Run default suite
$ safelabs run \
    --target http://localhost:8000/chat \
    --framework langchain \
    --suite owasp-asi-full

# Choose output format
$ safelabs run --target ... --format pdf
$ safelabs run --target ... --format sarif

# List available test suites
$ safelabs suites list
# → owasp-asi-full
# → prompt-injection-only
# → jailbreak-focused
              

                // findings.json — excerpt
{
  "scan_id":     "sl_20240615_a3f9",
  "framework":  "langchain",
  "score":      25,
  "total_tests":47,
  "critical":   8,
  "high":       14,
  "medium":     13,
  "findings": [
    {
      "id":         "ASI01-003",
      "category":  "prompt_injection",
      "severity":  "CRITICAL",
      "reproduced":true,
      "remediation":"..."
    }
  ]
}
              

Install + Point

pip install and point safelabs-eval at any agent endpoint. Native support for LangChain, CrewAI, AutoGPT, and raw HTTP agents.

Select Test Suite

Choose from pre-built OWASP ASI suites, or compose custom attack sequences from 47 individual adversarial vectors.

Run Adversarial Probes

The framework systematically exercises your agent across all threat categories, logging every interaction and grading responses automatically.

Export Audit Report

Get structured PDF, JSON, or SARIF output with severity rankings, reproduction steps, and remediation guidance — compliance-ready.

// use cases

Built for builders
and security teams

From pre-launch safety checks to enterprise compliance audits — one open framework, every agentic use case.

↑ Primary 🚀

AI Startups

Building agentic products with LangChain, CrewAI, or a custom stack? Catch critical vulnerabilities before your users — or attackers — do.

Pre-launch safety certification
Investor-ready audit reports
CI/CD integration on every deploy
Framework-native test coverage

Security Teams 🛡️

Red Teamers

Extend your offensive security practice to cover AI agent attack surfaces. OWASP ASI aligned, reproducible, and SARIF-compatible for existing toolchains.

47 adversarial test vectors
SARIF + JSON export
Custom attack sequence composer
CVE-ready finding format

Enterprise 🏢

Fortune 500

Regulated industries deploying AI agents need independent third-party assurance. Banking, healthcare, and government compliance reports built in.

SOC 2 evidence artifacts
Regulatory compliance mapping
White-label audit reports
On-prem deployment available

Find vulnerabilities in your AI agents before attackers do

What it tests

How it works

Built for buildersand security teams

Find
vulnerabilities
in your AI agents
before attackers do

Built for builders
and security teams