bpmcdevitt/auto_sigma_rule_generator


This project is a proof of concept to see whether a program can create SIGMA rules from the information in newly published CVEs.

- Extracts CVE records from the National Vulnerability Database
- Extracts exploit data from GitHub repositories, ExploitDB, and the CISA Known Exploited Vulnerabilities catalog
- Extracts text data from reference links found on both exploit records and CVE records
- Sends exploit data and reference data to an LLM to create SIGMA rules based on the content

This data is not meant for production use and is considered experimental.

Inspired by: https://blogs.night-wolf.io/sigmagen-ai-powered-attck-mapped-threat-detection-with-sigma-rules


bpmcdevitt 20b3a63c78 add claude client + generic llm client using langchain		2025-07-09 18:02:45 -05:00
backend	add claude client + generic llm client using langchain	2025-07-09 18:02:45 -05:00
frontend	add claude client + generic llm client using langchain	2025-07-09 18:02:45 -05:00
github_poc_collector@5c171fb9a9	added git submodule for more exploits. added template dir for base yaml templates for sigma rules	2025-07-09 11:58:29 -05:00
.env.example	add claude client + generic llm client using langchain	2025-07-09 18:02:45 -05:00
.gitignore	fix build errors	2025-07-08 09:10:25 -05:00
.gitmodules	added git submodule for more exploits. added template dir for base yaml templates for sigma rules	2025-07-09 11:58:29 -05:00
docker-compose.yml	added git submodule for more exploits. added template dir for base yaml templates for sigma rules	2025-07-09 11:58:29 -05:00
init.sql	more updates for bulk	2025-07-08 17:50:01 -05:00
Makefile	fix build errors	2025-07-08 09:10:25 -05:00
README.md	more updates for bulk	2025-07-08 17:50:01 -05:00
start.sh	more updates for bulk	2025-07-08 17:50:01 -05:00

README.md

CVE-SIGMA Auto Generator (Enhanced)

An advanced automated platform that processes comprehensive CVE data and generates enhanced SIGMA rules for threat detection using curated exploit intelligence.

🚀 Enhanced Features

Data Processing

Bulk NVD Processing: Downloads and processes complete NVD JSON datasets (2002-2025)
nomi-sec PoC Integration: Uses curated PoC data from github.com/nomi-sec/PoC-in-GitHub
Incremental Updates: Efficient updates using NVD modified/recent feeds
Quality Assessment: Advanced PoC quality scoring with star count, recency, and relevance analysis

Intelligence Generation

Enhanced SIGMA Rules: Creates rules using real exploit indicators from curated PoCs
Quality Tiers: Excellent, Good, Fair, Poor, Very Poor classification system
Smart Template Selection: AI-driven template matching based on PoC characteristics
Advanced Indicator Extraction: Processes, files, network, registry, and command patterns
MITRE ATT&CK Mapping: Automatic technique identification based on exploit analysis
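
The quality assessment and tier classification described above could be combined along the following lines. This is a minimal sketch, not the project's scoring code: the weights, the thresholds, and the names `score_poc` and `quality_tier` are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical thresholds mapping a 0-100 score to the tiers named above.
TIERS = [(80, "excellent"), (60, "good"), (40, "fair"), (20, "poor")]


def score_poc(stars: int, updated_at: datetime, mentions_cve: bool,
              now: Optional[datetime] = None) -> float:
    """Combine star count, recency, and relevance into a 0-100 score."""
    now = now or datetime.now(timezone.utc)
    age_days = max((now - updated_at).days, 0)
    star_points = min(stars, 100) / 100 * 40            # up to 40 points, capped at 100 stars
    recency_points = max(0.0, 1 - age_days / 365) * 30  # linear decay over one year
    relevance_points = 30.0 if mentions_cve else 0.0    # assumed relevance signal
    return star_points + recency_points + relevance_points


def quality_tier(score: float) -> str:
    """Return the first tier whose threshold the score reaches."""
    for threshold, tier in TIERS:
        if score >= threshold:
            return tier
    return "very_poor"
```

Keeping the score and the tier lookup separate means the thresholds can be tuned without touching how individual signals are weighted.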

User Experience

Modern Web Interface: React-based UI with enhanced bulk processing controls
Real-time Monitoring: Live job tracking and progress monitoring
Comprehensive Statistics: PoC coverage, quality metrics, and processing status
Bulk Operations Dashboard: Centralized control for all data processing operations

Architecture

Backend: FastAPI with SQLAlchemy ORM
Frontend: React with Tailwind CSS
Database: PostgreSQL
Cache: Redis (optional)
Containerization: Docker & Docker Compose

Quick Start

Prerequisites

Docker and Docker Compose
(Optional) NVD API Key for increased rate limits

Setup

Clone the repository:

git clone <repository-url>
cd auto_sigma_rule_generator

Quick Start (Recommended):

chmod +x start.sh
./start.sh

Manual Setup:

# Copy environment file
cp .env.example .env

# (Optional) Edit .env and add your NVD API key
nano .env

# Start the application
docker-compose up -d --build

Wait for services to initialize (about 30-60 seconds)
Access the application:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs

First Run

The application will automatically:

Initialize the database with rule templates
Start fetching recent CVEs from NVD
Generate SIGMA rules for each CVE
Continue polling for new CVEs every hour

Usage

Web Interface

The web interface provides three main sections:

Dashboard: Overview statistics and recent CVEs
CVEs: Complete list of all fetched CVEs with details
SIGMA Rules: Generated detection rules organized by CVE

Manual CVE Fetch

You can trigger a manual CVE fetch using the "Fetch New CVEs" button in the dashboard or via API:

curl -X POST http://localhost:8000/api/fetch-cves

API Endpoints

GET /api/cves - List all CVEs
GET /api/cves/{cve_id} - Get specific CVE details
GET /api/sigma-rules - List all SIGMA rules
GET /api/sigma-rules/{cve_id} - Get SIGMA rules for specific CVE
POST /api/fetch-cves - Manually trigger CVE fetch
GET /api/stats - Get application statistics
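
The endpoints above can also be called from a script. The sketch below uses only the Python standard library and assumes the backend is reachable at the Quick Start address; the helper names `api_url` and `call_api` are hypothetical, and the shape of the JSON responses is not documented here, so the result is returned undecoded beyond JSON parsing.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address from the Quick Start section


def api_url(path: str, base: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path without doubling slashes."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def call_api(path: str, method: str = "GET"):
    """Send a request to the backend and parse the JSON body."""
    request = urllib.request.Request(api_url(path), method=method)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

For example, `call_api("/api/stats")` reads the statistics and `call_api("/api/fetch-cves", method="POST")` triggers a manual fetch, provided the services are running.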

Configuration

Environment Variables

DATABASE_URL: PostgreSQL connection string
NVD_API_KEY: Optional NVD API key for higher rate limits (50 instead of 5 requests per 30 seconds)
GITHUB_TOKEN: Optional GitHub personal access token for exploit analysis
REACT_APP_API_URL: Backend API URL for frontend
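
A backend might read these variables as shown in the sketch below. It is an illustration rather than the project's configuration code: the `Settings` class and `load_settings` function are hypothetical, and treating `DATABASE_URL` as the only required variable is an assumption. `REACT_APP_API_URL` is left out because it is consumed by the frontend.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    database_url: str
    nvd_api_key: Optional[str]
    github_token: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the variables listed above; only DATABASE_URL is assumed required."""
    database_url = env.get("DATABASE_URL")
    if not database_url:
        raise RuntimeError("DATABASE_URL is not set")
    return Settings(
        database_url=database_url,
        nvd_api_key=env.get("NVD_API_KEY") or None,    # empty string counts as unset
        github_token=env.get("GITHUB_TOKEN") or None,  # optional, enables exploit analysis
    )
```

Passing the environment as a parameter keeps the function easy to exercise with a plain dict.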

GitHub Integration (Optional)

For enhanced SIGMA rule generation with exploit analysis:

Create GitHub Token: Visit https://github.com/settings/tokens
Required Permissions: Only needs "public_repo" scope for searching public repositories
Add to Environment: GITHUB_TOKEN=your_token_here in .env file
Benefits:
- Automatically searches for CVE-related exploit code
- Extracts real indicators (processes, files, network connections)
- Generates more accurate and specific SIGMA rules
- Higher confidence ratings for exploit-based rules

Rate Limits: 5000 requests/hour with token, 60/hour without

Rule Templates

The application includes pre-configured rule templates for:

Windows Process Execution
Network Connections
File Modifications

Additional templates can be added to the database via the rule_templates table.

SIGMA Rule Generation Logic

The enhanced rule generation process:

CVE Analysis: Analyzes CVE description and affected products
GitHub Exploit Search: Searches GitHub for exploit code using multiple query strategies
Code Analysis: Extracts specific indicators from exploit code:
- Process names and command lines
- File paths and registry keys
- Network connections and ports
- PowerShell commands and scripts
- Command execution patterns
Template Selection: Chooses appropriate SIGMA rule template based on exploit analysis
Enhanced Rule Population: Fills template with real exploit indicators
MITRE ATT&CK Mapping: Maps to specific MITRE ATT&CK techniques
Confidence Scoring: Higher confidence for exploit-based rules
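
The code analysis step can be pictured as pattern matching over exploit source text. The sketch below is a deliberately small illustration under that assumption: the regular expressions, the category names, and the function `extract_indicators` are hypothetical and cover only a fraction of what a real extractor would need.

```python
import re
from typing import Dict, List

# Illustrative patterns only; each maps an indicator category to a regex.
PATTERNS = {
    "processes": re.compile(r"\b[\w.-]+\.exe\b", re.IGNORECASE),
    "registry": re.compile(r"\bHK(?:LM|CU)\\[\w\\]+"),
    "network": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}(?::\d{1,5})?\b"),
}


def extract_indicators(source: str) -> Dict[str, List[str]]:
    """Return the unique matches per category, sorted for stable output."""
    return {
        kind: sorted(set(pattern.findall(source)))
        for kind, pattern in PATTERNS.items()
    }
```

Indicators found this way would still need filtering, since strings such as `powershell.exe` appear in benign scripts as well as exploits.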

Rule Quality Levels

Basic Rules: Generated from CVE description only
Exploit-Based Rules: Enhanced with GitHub exploit analysis (marked with 🔍)
Confidence Ratings:
- High: CVSS ≥9.0 + exploit analysis
- Medium: CVSS ≥7.0 or exploit analysis
- Low: Basic CVE description only
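
The confidence ratings above translate into a small decision function. This sketch reads the "+" in the High rating as a logical AND, which is an interpretation rather than something the text states outright; the function name `confidence` is hypothetical.

```python
from typing import Optional


def confidence(cvss: Optional[float], has_exploit_analysis: bool) -> str:
    """Map a CVSS score and the presence of exploit analysis to a rating."""
    score = cvss if cvss is not None else 0.0  # a missing score is treated as 0
    if score >= 9.0 and has_exploit_analysis:
        return "high"
    if score >= 7.0 or has_exploit_analysis:
        return "medium"
    return "low"
```

Under this reading, a critical CVE without exploit analysis is rated medium, as is a low-severity CVE that does have exploit analysis.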

Template Matching

PowerShell Execution: Exploit contains PowerShell scripts or cmdlets
Process Execution: Exploit shows process creation or command execution
Network Connection: Exploit demonstrates network communications
File Modification: Exploit involves file system operations
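
One way to implement the matching above is a fixed priority list over extracted indicator categories. In this sketch the order (PowerShell first, file modification last), the category keys, and the function `select_template` are assumptions; the source does not say how ties between templates are resolved.

```python
from typing import List, Mapping, Optional

# Checked in order; the first category with any indicators wins (assumed priority).
TEMPLATE_PRIORITY = [
    ("powershell", "PowerShell Execution"),
    ("processes", "Process Execution"),
    ("network", "Network Connection"),
    ("files", "File Modification"),
]


def select_template(indicators: Mapping[str, List[str]]) -> Optional[str]:
    """Return the template name for the first non-empty indicator category."""
    for key, template in TEMPLATE_PRIORITY:
        if indicators.get(key):
            return template
    return None  # no indicators: the caller falls back to a basic rule
```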

Example Enhanced Rule

title: CVE-2025-1234 Exploit-Based Detection
description: Detection for CVE-2025-1234 remote code execution [Enhanced with GitHub exploit analysis]
tags:
    - attack.t1059.001
    - cve-2025-1234
    - exploit.github
detection:
    selection:
        Image|contains:
            - "powershell.exe"
            - "malicious_payload.exe"
            - "reverse_shell.ps1"
    condition: selection
level: high
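
Before a generated rule is reviewed, it can be worth checking that it carries the fields the Sigma specification treats as required: `title`, `logsource`, and `detection` with a `condition`. The sketch below assumes the YAML has already been parsed into a dictionary (for example with a YAML library); the function name `missing_sigma_fields` is hypothetical. The example rule above, as printed, has no `logsource` block, so this check would report it.

```python
from collections.abc import Mapping
from typing import Any, Dict, List


def missing_sigma_fields(rule: Dict[str, Any]) -> List[str]:
    """List required top-level Sigma fields that are absent or empty."""
    missing = [name for name in ("title", "logsource", "detection")
               if not rule.get(name)]
    detection = rule.get("detection")
    # A detection block without a condition cannot be evaluated.
    if isinstance(detection, Mapping) and "condition" not in detection:
        missing.append("detection.condition")
    return missing
```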

Development

Local Development

Start the database:

docker-compose up -d db redis

Run the backend:

cd backend
pip install -r requirements.txt
uvicorn main:app --reload

Run the frontend:

cd frontend
npm install
npm start

Database Migration

The application automatically creates tables on startup. For manual schema changes:

# Connect to database
docker-compose exec db psql -U cve_user -d cve_sigma_db

# Run custom SQL
\i /path/to/migration.sql

SIGMA Rule Quality

Generated rules are marked as "experimental" and should be:

Reviewed by security analysts
Tested in a lab environment
Tuned to reduce false positives
Validated against real attack scenarios

Monitoring

Logs

View application logs:

# All services
docker-compose logs -f

# Specific service
docker-compose logs -f backend

Health Checks

The application includes health checks for database connectivity. Monitor with:

docker-compose ps

✅ Recent Fixes (July 2025)

Fixed 404 CVE fetch error: Corrected NVD API 2.0 endpoint format and parameters
Updated for current dates: Now properly fetches CVEs from July 2025 (current date)
Improved API integration: Better error handling, fallback mechanisms, and debugging
Enhanced date handling: Proper ISO-8601 format with UTC timezone
API key integration: Correctly passes API keys in headers for higher rate limits
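
The fixes above concern how the NVD request is put together. As a rough sketch of a request in that style, the function below builds a CVE API 2.0 URL with ISO-8601 UTC publication dates and puts the key in the `apiKey` header. The function name `build_nvd_request` and the seven-day default window are assumptions, not the project's code.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode

NVD_CVE_API = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def build_nvd_request(end: datetime, days: int = 7,
                      api_key: Optional[str] = None) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for CVEs published in the `days` before `end`."""
    end = end.astimezone(timezone.utc)
    start = end - timedelta(days=days)
    params = {
        # ISO-8601 with an explicit UTC offset; urlencode escapes ':' and '+'.
        "pubStartDate": start.isoformat(timespec="milliseconds"),
        "pubEndDate": end.isoformat(timespec="milliseconds"),
    }
    headers = {"apiKey": api_key} if api_key else {}
    return f"{NVD_CVE_API}?{urlencode(params)}", headers
```

Returning the URL and headers instead of sending the request keeps the date handling easy to inspect when a fetch fails.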

Troubleshooting

Common Issues

Frontend build fails with "npm ci" error: This is fixed in the current version. The Dockerfile now uses npm install instead of npm ci.
CVE Fetch returns 404: Fixed in latest version. The application now uses proper NVD API 2.0 format with current 2025 dates.
No CVEs being fetched:
- Check if you have an NVD API key configured in .env for better rate limits
- Use the "Test NVD API" button to verify connectivity
- Check backend logs: docker-compose logs -f backend
Database Connection Error: Ensure PostgreSQL is running and accessible
Frontend Not Loading: Verify backend is running and CORS is configured
Rule Generation Issues: Check CVE description quality and template matching
Port conflicts: If ports 3000, 8000, or 5432 are in use, stop other services or modify docker-compose.yml

API Key Setup

NVD API (Recommended)

For optimal CVE fetching performance:

Visit: https://nvd.nist.gov/developers/request-an-api-key
Add to your .env file: NVD_API_KEY=your_key_here
Restart the application

Without an API key: 5 requests per 30 seconds
With an API key: 50 requests per 30 seconds

GitHub API (Optional)

For enhanced exploit-based SIGMA rules:

Visit: https://github.com/settings/tokens
Create token with "public_repo" scope
Add to your .env file: GITHUB_TOKEN=your_token_here
Restart the application

Without a GitHub token: Basic rules only
With a GitHub token: Enhanced rules with exploit analysis (🔍 Exploit-Based)

Rate Limits

Without an API key, NVD limits requests to 5 per 30 seconds. With an API key, the limit increases to 50 per 30 seconds.
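
A client can stay within these limits by tracking its own request times. The sketch below is one possible sliding-window limiter, not the project's implementation; the class name `SlidingWindowLimiter` and the injectable `clock` and `sleep` parameters are assumptions added so the behaviour can be checked without real waiting.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window_seconds` span."""

    def __init__(self, max_requests: int, window_seconds: float = 30.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sleep = sleep
        self._sent = deque()  # timestamps of recent requests, oldest first

    def acquire(self) -> None:
        """Block until another request may be sent, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._sent and now - self._sent[0] >= self._window:
                self._sent.popleft()
            if len(self._sent) < self._max:
                self._sent.append(now)
                return
            # Wait until the oldest request falls out of the window.
            self._sleep(self._window - (now - self._sent[0]))
```

With the limits above, `SlidingWindowLimiter(5)` would suit unauthenticated use and `SlidingWindowLimiter(50)` use with an API key, calling `acquire()` before each request.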

Security Considerations

API Keys: Store NVD API keys securely using environment variables
Database Access: Use strong passwords and restrict database access
Network Security: Deploy behind a reverse proxy in production
Rule Validation: Always validate generated SIGMA rules before deployment

Contributing

Fork the repository
Create a feature branch
Make changes and add tests
Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

For issues and questions:

Check the troubleshooting section
Review application logs
Open an issue on GitHub

Roadmap

Planned features:

Custom rule template editor
MITRE ATT&CK mapping
Rule effectiveness scoring
Export to SIEM platforms
Advanced threat intelligence integration
Machine learning-based rule optimization