CVE-SIGMA Auto Generator - CLI Edition
Professional file-based SIGMA rule generation system for cybersecurity workflows
Automated CLI tool that generates SIGMA detection rules from CVE data using AI-enhanced exploit analysis. Now optimized for git workflows and production SIGMA rule management with a file-based architecture.
🌟 Major Architecture Update
🎉 New in v2.0: Transformed from web application to professional CLI tool with file-based SIGMA rule management!
- Git-Friendly: Native YAML files perfect for version control
- Industry Standard: Direct integration with SIGMA ecosystems
- Portable: No database dependency, works anywhere
- Scalable: Process specific years/CVEs as needed
- Multiple Variants: Different generation methods per CVE
✨ Key Features
- Bulk CVE Processing: Complete NVD datasets (2002-2025) with nomi-sec PoC integration
- AI-Powered Rule Generation: Multi-provider LLM support (OpenAI, Anthropic, local Ollama)
- File-Based Storage: Organized directory structure for each CVE and rule variant
- Quality-Based PoC Analysis: 5-tier quality scoring system for exploit reliability
- Advanced Search & Filtering: Find CVEs and rules with complex criteria
- Comprehensive Statistics: Coverage reports and generation analytics
- Export Tools: Multiple output formats for different workflows
🚀 Quick Start
Prerequisites
- Python 3.8+ with pip
- (Optional) Docker for legacy web interface
- (Optional) API keys for enhanced features
Installation
# Clone repository
git clone <repository-url>
cd auto_sigma_rule_generator
# Install CLI dependencies
pip install -r cli/requirements.txt
# Make CLI executable
chmod +x cli/sigma_cli.py
# Initialize configuration
./cli/sigma_cli.py config-init
First Run - Migration from Web App (If Applicable)
# If migrating from previous web version
./cli/sigma_cli.py migrate from-database --database-url "postgresql://user:pass@localhost:5432/db"
# Validate migration
./cli/sigma_cli.py migrate validate
# Or start fresh with new CVE processing
./cli/sigma_cli.py process year 2024
🎯 CLI Usage
Core Commands
# Process CVEs and generate rules
./cli/sigma_cli.py process year 2024 # Process specific year
./cli/sigma_cli.py process cve CVE-2024-0001 # Process specific CVE
./cli/sigma_cli.py process bulk --start-year 2020 # Bulk process multiple years
./cli/sigma_cli.py process incremental --days 7 # Process recent changes
# Generate rules for existing CVEs
./cli/sigma_cli.py generate cve CVE-2024-0001 --method all # All generation methods
./cli/sigma_cli.py generate regenerate --year 2024 --method llm # Regenerate with LLM
# Search CVEs and rules
./cli/sigma_cli.py search cve "buffer overflow" --severity critical --has-poc
./cli/sigma_cli.py search rules "powershell" --method llm
# View statistics and reports
./cli/sigma_cli.py stats overview --year 2024 --output ./reports/2024-stats.json
./cli/sigma_cli.py stats poc --year 2024 # PoC coverage statistics
./cli/sigma_cli.py stats rules --method template # Rule generation statistics
# Export data
./cli/sigma_cli.py export sigma ./output-rules --format yaml --year 2024
./cli/sigma_cli.py export metadata ./reports/cve-data.csv --format csv
Available Generation Methods
- template: Template-based rule generation
- llm: AI/LLM-enhanced generation (OpenAI, Anthropic, Ollama)
- hybrid: Combined template + LLM approach
- all: Generate all variants
📁 File Structure
The CLI organizes everything in a clean, git-friendly structure:
auto_sigma_rule_generator/
├── cves/ # CVE data organized by year
│ ├── 2024/
│ │ ├── CVE-2024-0001/
│ │ │ ├── metadata.json # CVE info & generation metadata
│ │ │ ├── rule_template.sigma # Template-based rule
│ │ │ ├── rule_llm_openai.sigma # OpenAI-generated rule
│ │ │ ├── rule_llm_anthropic.sigma # Anthropic-generated rule
│ │ │ ├── rule_hybrid.sigma # Hybrid-generated rule
│ │ │ └── poc_analysis.json # PoC analysis data
│ │ └── CVE-2024-0002/...
│ └── 2023/...
├── cli/ # CLI tool and commands
│ ├── sigma_cli.py # Main CLI executable
│ ├── commands/ # Command modules
│ └── README.md # Detailed CLI documentation
└── reports/ # Generated reports and exports
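Because every rule lives at a predictable path, the layout above can be inspected with a few lines of Python. The following is a minimal sketch, not part of the CLI: the function name `list_rule_variants` is hypothetical, and it assumes only the `cves/<YEAR>/<CVE-ID>/rule_<variant>.sigma` naming shown in the tree.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches file names such as rule_template.sigma or rule_llm_openai.sigma.
RULE_NAME = re.compile(r"^rule_(?P<variant>[a-z0-9_]+)\.sigma$")


def list_rule_variants(cves_root):
    """Map each CVE ID to the sorted rule variants found in its directory.

    Hypothetical helper: assumes the cves/<YEAR>/<CVE-ID>/ layout shown above.
    """
    variants = defaultdict(list)
    for rule_file in Path(cves_root).glob("*/CVE-*/rule_*.sigma"):
        match = RULE_NAME.match(rule_file.name)
        if match:
            # The parent directory name is the CVE ID, e.g. CVE-2024-0001.
            variants[rule_file.parent.name].append(match.group("variant"))
    return {cve: sorted(names) for cve, names in variants.items()}


if __name__ == "__main__":
    for cve_id, names in sorted(list_rule_variants("cves").items()):
        print(f"{cve_id}: {', '.join(names)}")
```

Run from the project root, this prints one line per CVE directory that contains at least one rule file.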
File Formats
metadata.json - CVE information and processing history
{
  "cve_info": {
    "cve_id": "CVE-2024-0001",
    "description": "Remote code execution vulnerability...",
    "cvss_score": 9.8,
    "severity": "critical",
    "published_date": "2024-01-01T00:00:00Z"
  },
  "poc_data": {
    "poc_count": 3,
    "poc_data": {"nomi_sec": [...], "github": [...]}
  },
  "rule_generation": {
    "template": {"generated_at": "2024-01-01T12:00:00Z"},
    "llm_openai": {"generated_at": "2024-01-01T12:30:00Z"}
  }
}
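Since the metadata is plain JSON, it can be read without the CLI. The sketch below assumes only the keys shown in the example above; the function name `summarize_metadata` and the sample values are illustrative, and the elided PoC lists are replaced with empty lists so the sample parses.

```python
import json


def summarize_metadata(metadata):
    """Build a one-line summary from a parsed metadata.json dictionary.

    Hypothetical helper: relies only on the keys shown in the example above.
    """
    cve = metadata["cve_info"]
    poc_count = metadata.get("poc_data", {}).get("poc_count", 0)
    methods = sorted(metadata.get("rule_generation", {}))
    rules = ", ".join(methods) if methods else "none"
    return (
        f'{cve["cve_id"]} [{cve["severity"]}, CVSS {cve["cvss_score"]}] '
        f"PoCs: {poc_count}, rules: {rules}"
    )


# Sample document mirroring the example; PoC lists are left empty here.
SAMPLE = json.loads("""
{
  "cve_info": {"cve_id": "CVE-2024-0001", "cvss_score": 9.8, "severity": "critical"},
  "poc_data": {"poc_count": 3, "poc_data": {"nomi_sec": [], "github": []}},
  "rule_generation": {
    "template": {"generated_at": "2024-01-01T12:00:00Z"},
    "llm_openai": {"generated_at": "2024-01-01T12:30:00Z"}
  }
}
""")

if __name__ == "__main__":
    print(summarize_metadata(SAMPLE))
```

For a file on disk, pass the result of `json.load()` on the opened `metadata.json` instead of `SAMPLE`.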
SIGMA Rule Files - Ready-to-use detection rules
# rule_llm_openai.sigma
title: CVE-2024-0001 Remote Code Execution Detection
id: 12345678-1234-5678-9abc-123456789012
status: experimental
description: Detects exploitation attempts for CVE-2024-0001
author: CVE-SIGMA Auto Generator (OpenAI Enhanced)
date: 2024/01/01
references:
  - https://nvd.nist.gov/vuln/detail/CVE-2024-0001
tags:
  - attack.t1059.001
  - cve.2024.0001
  - ai.enhanced
logsource:
  category: process_creation
  product: windows
detection:
  selection:
    Image|endswith: '\powershell.exe'
    CommandLine|contains:
      - '-EncodedCommand'
      - 'bypass'
  condition: selection
falsepositives:
  - Legitimate administrative scripts
level: high
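To make the detection logic of the example rule concrete, here is a small Python sketch of what its selection matches. It is an illustration of SIGMA semantics, not how any SIGMA backend is implemented: fields inside a selection are combined with AND, list values with OR, and string matching is case-insensitive by default. The function name and sample events are hypothetical.

```python
def matches_selection(event):
    """Return True if a process-creation event matches the example selection.

    Mirrors the rule above: Image must end with \\powershell.exe AND
    CommandLine must contain at least one of the listed substrings.
    """
    image = event.get("Image", "").lower()
    command_line = event.get("CommandLine", "").lower()

    image_ok = image.endswith("\\powershell.exe")
    # List values under a field are alternatives (logical OR).
    keywords = ["-encodedcommand", "bypass"]
    command_ok = any(keyword in command_line for keyword in keywords)

    return image_ok and command_ok


if __name__ == "__main__":
    suspicious = {
        "Image": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
        "CommandLine": "powershell.exe -ExecutionPolicy Bypass -File run.ps1",
    }
    benign = {
        "Image": "C:\\Windows\\System32\\cmd.exe",
        "CommandLine": "cmd.exe /c dir",
    }
    print(matches_selection(suspicious), matches_selection(benign))
```

Note that the word `bypass` alone is a broad indicator, which is one reason the rule lists legitimate administrative scripts as a false-positive source.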
⚙️ Configuration
CLI Configuration (~/.sigma-cli/config.yaml)
# API Keys for enhanced functionality
api_keys:
  nvd_api_key: "your_nvd_key"              # Optional: 5→50 req/30s rate limit
  github_token: "your_github_token"        # Optional: Enhanced PoC analysis
  openai_api_key: "your_openai_key"        # Optional: AI rule generation
  anthropic_api_key: "your_anthropic_key"  # Optional: AI rule generation
# LLM Settings
llm_settings:
  default_provider: "ollama"               # Default: ollama (local)
  default_model: "llama3.2"                # Provider-specific model
  ollama_base_url: "http://localhost:11434"
# Processing Settings
processing:
  default_batch_size: 50                   # CVEs per batch
  default_methods: ["template"]            # Default generation methods
API Keys Setup
NVD API Key (Recommended)
- Get key: https://nvd.nist.gov/developers/request-an-api-key
- Benefit: 10x rate limit increase (5 → 50 requests/30s)
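The practical effect of the two limits is easiest to see as a minimum spacing between requests. The sketch below is simple arithmetic on the figures stated above (5 or 50 requests per 30 seconds); the function names are hypothetical and this is not how the CLI throttles its calls.

```python
def min_request_interval(has_api_key, window_seconds=30.0):
    """Seconds to wait between NVD requests to stay within the stated limit."""
    requests_per_window = 50 if has_api_key else 5
    return window_seconds / requests_per_window


def estimated_minutes(request_count, has_api_key):
    """Rough lower bound on wall-clock minutes for a number of requests."""
    return request_count * min_request_interval(has_api_key) / 60.0


if __name__ == "__main__":
    for has_key in (False, True):
        label = "with key" if has_key else "without key"
        print(
            f"{label}: {min_request_interval(has_key):.1f}s between requests, "
            f"about {estimated_minutes(1000, has_key):.0f} min per 1000 requests"
        )
```

Under these assumptions, 1000 requests take at least 100 minutes without a key and 10 minutes with one, which is why the key is recommended for bulk processing.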
GitHub Token (Optional)
- Create: https://github.com/settings/tokens (public_repo scope)
- Benefit: Enhanced PoC analysis and exploit indicators
LLM APIs (Optional)
- Local Ollama: No setup required (default) - runs locally
- OpenAI: Get key from https://platform.openai.com/api-keys
- Anthropic: Get key from https://console.anthropic.com/
🧠 AI-Enhanced Rule Generation
How It Works
- CVE Analysis: Extract vulnerability details from NVD data
- PoC Collection: Gather exploit code from nomi-sec, GitHub, ExploitDB
- Quality Assessment: Score PoCs based on stars, recency, completeness
- AI Enhancement: LLM analyzes actual exploit code to create detection logic
- SIGMA Generation: Produce valid, tested SIGMA rules with proper syntax
- Multi-Variant Output: Generate template, LLM, and hybrid versions
Quality Tiers
- Excellent (80+ pts): High-star PoCs with recent updates, detailed analysis
- Good (60-79 pts): Moderate quality with some validation
- Fair (40-59 pts): Basic PoCs with minimal indicators
- Poor (20-39 pts): Low-quality or outdated PoCs
- Very Poor (<20 pts): Minimal or unreliable PoCs
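The tier boundaries above translate directly into a lookup. This is a minimal sketch of the mapping only; the function name `quality_tier` is hypothetical, and how the underlying score is computed from stars, recency, and completeness is not specified here.

```python
def quality_tier(score):
    """Map a PoC quality score to the tier names listed above."""
    if score >= 80:
        return "excellent"
    if score >= 60:
        return "good"
    if score >= 40:
        return "fair"
    if score >= 20:
        return "poor"
    return "very poor"


if __name__ == "__main__":
    for points in (95, 79, 40, 20, 19):
        print(points, quality_tier(points))
```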
Rule Variants Generated
- 🤖 AI-Enhanced (rule_llm_*.sigma): LLM analysis of actual exploit code
- 🔧 Template-Based (rule_template.sigma): Pattern-based generation
- ⚡ Hybrid (rule_hybrid.sigma): Best of both approaches
📊 Advanced Features
Search & Analytics
# Complex CVE searches
./cli/sigma_cli.py search cve "remote code execution" \
--year 2024 --severity critical --has-poc --has-rules --limit 50
# Rule analysis
./cli/sigma_cli.py search rules "powershell" \
--rule-type process --method llm --limit 20
# Comprehensive statistics
./cli/sigma_cli.py stats overview # Overall system stats
./cli/sigma_cli.py stats poc --year 2024 # PoC coverage analysis
./cli/sigma_cli.py stats rules --method llm # AI generation statistics
Export & Integration
# Export for SIEM integration
./cli/sigma_cli.py export sigma ./siem-rules \
--format yaml --year 2024 --method llm
# Metadata for analysis
./cli/sigma_cli.py export metadata ./analysis/cve-data.csv \
--format csv --year 2024
# Consolidated ruleset
./cli/sigma_cli.py export ruleset ./complete-rules.json \
--year 2024 --include-metadata
🛠️ Development & Legacy Support
CLI Development
The new CLI system is built with:
- Click: Professional CLI framework
- Modular Commands: Separate modules for each command group
- Async Processing: Efficient handling of bulk operations
- File-Based Storage: Git-friendly YAML and JSON formats
Legacy Web Interface (Optional)
The original web interface is still available for migration purposes:
# Start legacy web interface (if needed for migration)
docker-compose up -d db redis backend frontend
# Access points:
# - Frontend: http://localhost:3000
# - API: http://localhost:8000
# - Flower (Celery): http://localhost:5555
Migration Path
- Export Data: Use CLI migration tools to export from database
- Validate: Verify all data transferred correctly
- Switch: Use CLI for all new operations
- Cleanup: Optionally remove web components
🔧 Troubleshooting
Common Issues
CLI Import Errors
- Ensure you're running from the project root directory
- Install dependencies: pip install -r cli/requirements.txt
- Check Python version (3.8+ required)
CVE Processing Failures
- Verify NVD API key in configuration
- Check network connectivity and rate limits
- Use the --verbose flag for detailed logging
No Rules Generated
- Ensure the LLM provider is accessible (test with ./cli/sigma_cli.py stats overview)
- Check PoC data availability with the --has-poc filter
- Verify API keys for external LLM providers
File Permission Issues
- Ensure write permissions to the cves/ directory
- Check CLI executable permissions: chmod +x cli/sigma_cli.py
Performance Optimization
- Use the --batch-size parameter for large datasets
- Process recent years first (2020+) for faster initial results
- Use incremental processing for regular updates
- Monitor system resources during bulk operations
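As an illustration of what a batch size means in practice, the following sketch splits a list of CVE IDs into fixed-size chunks. It is a generic chunking helper under assumed names, not the CLI's internal implementation; the default of 50 is taken from the default_batch_size setting in the configuration example.

```python
def batched(items, batch_size=50):
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


if __name__ == "__main__":
    # Hypothetical input: 120 CVE IDs for a single year.
    cve_ids = [f"CVE-2024-{number:04d}" for number in range(1, 121)]
    for index, batch in enumerate(batched(cve_ids), start=1):
        print(f"batch {index}: {len(batch)} CVEs, first {batch[0]}")
```

Smaller batches lower peak memory use and make interrupted runs cheaper to resume, at the cost of more per-batch overhead.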
🛡️ Security Best Practices
- Store API keys in the configuration file (~/.sigma-cli/config.yaml)
- Validate generated rules before production deployment
- Rules marked as "experimental" require analyst review
- Use version control to track rule changes and improvements
- Regularly update PoC data sources for current threat landscape
📈 Monitoring & Maintenance
# System health checks
./cli/sigma_cli.py stats overview # Overall system status
./cli/sigma_cli.py migrate validate # Data integrity check
# Regular maintenance
./cli/sigma_cli.py process incremental --days 7 # Weekly updates
./cli/sigma_cli.py generate regenerate --filter-quality excellent # Refresh high-quality rules
# Performance monitoring
./cli/sigma_cli.py stats rules --year 2024 # Generation statistics
./cli/sigma_cli.py stats poc --year 2024 # Coverage analysis
🗺️ Roadmap
CLI Enhancements
- Rule quality scoring and validation
- Custom template editor
- Integration with popular SIEM platforms
- Advanced MITRE ATT&CK mapping
- Threat intelligence feed integration
Export Features
- Splunk app export format
- Elastic Stack integration
- QRadar rule format
- YARA rule generation
- IOC extraction
📝 License
MIT License - see LICENSE file for details.
🤝 Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Test with both CLI and legacy systems
- Add tests and documentation
- Submit a pull request
📞 Support
CLI Issues
- Check cli/README.md for detailed CLI documentation
- Use the --verbose flag for debugging
- Ensure proper configuration in ~/.sigma-cli/config.yaml
General Support
- Review troubleshooting section above
- Check application logs with --verbose
- Open a GitHub issue with specific error details
🎉 What's New in v2.0
✅ Complete CLI System - Professional command-line interface
✅ File-Based Storage - Git-friendly YAML and JSON files
✅ Multiple Rule Variants - Template, AI, and hybrid generation
✅ Advanced Search - Complex filtering and analytics
✅ Export Tools - Multiple output formats for different workflows
✅ Migration Tools - Seamless transition from web application
✅ Portable Architecture - No database dependency, runs anywhere
Perfect for cybersecurity teams who want production-ready SIGMA rules with version control integration! 🚀