LLM Security Testing & Research Toolkit
Test LLM prompts against 219 research attacks. Build DSPy defenses (targeting 95-99% based on AegisLLM paper)
•Updated October 16, 2025•1 min read
Overview
Research toolkit for LLM security testing with 219 curated jailbreak attacks validated against production systems. Completed Phase 1: Attack Testing (16.5% baseline success, multi-turn 1.48× stronger, model comparison). Phase 2: Defense validation in progress, targeting 95-99% blocking based on AegisLLM 2024 paper.
Built on 5 major 2024 research papers (Crescendo, AegisLLM, DefensiveTokens) with findings validated through extensive testing. 20+ GitHub stars from security researchers and AI practitioners.
Attack Testing Complete (Phase 1):
- 219 research attacks validated: 16.5% baseline success rate
- Multi-turn architecture: 1.48× improvement over single-turn
- DSPy attack generation tested: 7-11% (underperforms curated attacks)
- Model comparison: Kimi K2 57% better than Llama 3.3
Defense Development Next (Phase 2):
- Pattern-Based: ✅ Validated - 70-80% blocking, ~1ms latency, FREE
- DSPy-Optimized: ⚠️ In development - targeting 95-99% based on AegisLLM paper (99.76% reported)
- Scripts created, large-scale validation pending
Links:
- GitHub Repository →
- Research Findings - Full citations to 5 papers
- Quick Start Guide - 25 min setup, test in 10 seconds
