Accepts incoming events and correctly parses them into events. GeoLite2 integration complete"

2025-11-04 00:11:10 +11:00
parent 0cbd462e7c
commit 5ff166613e
49 changed files with 4489 additions and 322 deletions
--- a/docs/maxmind.md
+++ b/docs/maxmind.md
@@ -0,0 +1,358 @@
+# MaxMind GeoIP Integration
+
+This document describes the MaxMind GeoIP integration implemented in the Baffle Hub WAF analytics system.
+
+## Overview
+
+The Baffle Hub application uses MaxMind's free GeoLite2-Country database to provide geographic location information for IP addresses. The system automatically enriches WAF events with country codes and provides manual lookup capabilities for both IPv4 and IPv6 addresses.
+
+## Features
+
+- **On-demand lookup** - Country code lookup by IP address
+- **Automatic enrichment** - Events are enriched with geo-location data during processing
+- **Manual lookup capability** - Rake tasks and model methods for manual lookups
+- **GeoLite2-Country database** - Uses MaxMind's free country-level database
+- **Automatic updates** - Weekly background job updates the database
+- **IPv4/IPv6 support** - Full protocol support for both IP versions
+- **Performance optimized** - Database caching and efficient lookups
+- **Graceful degradation** - Fallback handling when database is unavailable
+
+## Architecture
+
+### Core Components
+
+#### 1. GeoIpService
+- Central service for all IP geolocation operations
+- Handles database loading from file system
+- Provides batch lookup capabilities
+- Manages database updates from MaxMind CDN
+- Uses MaxMind's built-in metadata for version information
+
+#### 2. UpdateGeoIpDatabaseJob
+- Background job for automatic database updates
+- Runs weekly to keep the database current
+- Simple file-based validation and updates
+
+#### 3. Enhanced Models
+- **Event Model** - Automatic geo-location enrichment for WAF events
+- **IPv4Range/IPv6Range Models** - Manual lookup methods for IP ranges
+
+#### 4. File-System Management
+- Database stored as single file: `db/geoip/GeoLite2-Country.mmdb`
+- Version information queried directly from MaxMind database metadata
+- No database tables needed - simplified approach
+
+## Installation & Setup
+
+### Dependencies
+The integration uses the following gems:
+- `maxmind-db` - Official MaxMind database reader (with built-in caching)
+- `httparty` - HTTP client for database downloads
+
+### Database Storage
+- Location: `db/geoip/GeoLite2-Country.mmdb`
+- Automatic creation of storage directory
+- File validation and integrity checking
+- Version information queried directly from database metadata
+- No additional caching needed - MaxMind DB has its own internal caching
+
+### Initial Setup
+```bash
+# Install dependencies
+bundle install
+
+# Download the GeoIP database
+rails geoip:update
+
+# Verify installation
+rails geoip:status
+```
+
+## Configuration
+
+The system is configurable via environment variables or application configuration:
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `MAXMIND_DATABASE_URL` | MaxMind CDN URL | Database download URL |
+| `MAXMIND_AUTO_UPDATE` | `true` | Enable automatic weekly updates |
+| `MAXMIND_UPDATE_INTERVAL_DAYS` | `7` | Days between update checks |
+| `MAXMIND_MAX_AGE_DAYS` | `30` | Maximum database age before forced update |
+| Note: MaxMind DB has built-in caching, no additional caching needed |
+| `MAXMIND_FALLBACK_COUNTRY` | `nil` | Fallback country when lookup fails |
+| `MAXMIND_ENABLE_FALLBACK` | `false` | Enable fallback country usage |
+
+### Example Configuration
+```bash
+# config/application.rb or .env file
+MAXMIND_AUTO_UPDATE=true
+MAXMIND_UPDATE_INTERVAL_DAYS=7
+MAXMIND_MAX_AGE_DAYS=30
+MAXMIND_FALLBACK_COUNTRY=US
+MAXMIND_ENABLE_FALLBACK=true
+# Note: No caching configuration needed - MaxMind has built-in caching
+```
+
+## Usage
+
+### Rake Tasks
+
+#### Database Management
+```bash
+# Download/update the GeoIP database
+rails geoip:update
+
+# Check database status and configuration
+rails geoip:status
+
+# Test the implementation with sample IPs
+rails geoip:test
+
+# Manual lookup for a specific IP
+rails geoip:lookup[8.8.8.8]
+rails geoip:lookup[2001:4860:4860::8888]
+```
+
+#### Data Management
+```bash
+# Enrich existing events missing country codes
+rails geoip:enrich_missing
+
+# Clean up old inactive database records
+rails geoip:cleanup
+```
+
+### Ruby API
+
+#### Service-Level Lookups
+```ruby
+# Direct country lookup
+country = GeoIpService.lookup_country('8.8.8.8')
+# => "US"
+
+# Batch lookup
+countries = GeoIpService.new.lookup_countries(['8.8.8.8', '1.1.1.1'])
+# => { "8.8.8.8" => "US", "1.1.1.1" => nil }
+
+# Check database availability
+service = GeoIpService.new
+service.database_available?  # => true/false
+service.database_info         # => Database metadata
+```
+
+#### Event Model Integration
+```ruby
+# Automatic enrichment during event processing
+event = Event.find(123)
+event.enrich_geo_location!           # Updates event with country code
+event.lookup_country                 # => "US" (with fallback to service)
+event.has_geo_data?                  # => true/false
+event.geo_location                   # => { country_code: "US", city: nil, ... }
+
+# Batch enrichment of existing events
+updated_count = Event.enrich_geo_location_batch
+puts "Enriched #{updated_count} events with geo data"
+```
+
+#### IP Range Model Integration
+```ruby
+# IPv4 Range lookups
+range = Ipv4Range.find(123)
+range.geo_lookup_country!            # Updates range with country code
+range.geo_lookup_country             # => "US" (without updating)
+range.has_country_info?              # => true/false
+range.primary_country                # => "US" (best available country)
+
+# Class methods
+country = Ipv4Range.lookup_country_by_ip('8.8.8.8')
+updated_count = Ipv4Range.enrich_missing_geo_data(limit: 1000)
+
+# IPv6 Range lookups (same interface)
+country = Ipv6Range.lookup_country_by_ip('2001:4860:4860::8888')
+updated_count = Ipv6Range.enrich_missing_geo_data(limit: 1000)
+```
+
+### Background Processing
+
+#### Automatic Updates
+The system automatically schedules database updates:
+```ruby
+# Manually trigger an update (usually scheduled automatically)
+UpdateGeoIpDatabaseJob.perform_later
+
+# Force update regardless of age
+UpdateGeoIpDatabaseJob.perform_later(force_update: true)
+```
+
+#### Event Processing Integration
+Geo-location enrichment is automatically included in WAF event processing:
+```ruby
+# This is called automatically in ProcessWafEventJob
+event = Event.create_from_waf_payload!(event_id, payload, project)
+event.enrich_geo_location! if event.ip_address.present? && event.country_code.blank?
+```
+
+## Database Information
+
+### GeoLite2-Country Database
+- **Source**: MaxMind GeoLite2-Country (free version)
+- **Update Frequency**: Weekly (Tuesdays)
+- **Size**: ~9.5 MB
+- **Coverage**: Global IP-to-country mapping
+- **Format**: MaxMind DB (.mmdb)
+
+### Database Fields
+- `country.iso_code` - Two-letter ISO country code
+- Supports both IPv4 and IPv6 addresses
+- Includes anonymous/proxy detection metadata
+
+## Performance Considerations
+
+### Performance
+- MaxMind DB has built-in internal caching optimized for lookups
+- Typical lookup time: <1ms
+- Database size optimized for fast lookups
+- No additional caching layer needed
+
+### Lookup Performance
+- Typical lookup time: <1ms
+- Database size optimized for fast lookups
+- Efficient range queries for IP networks
+
+### Memory Usage
+- Database loaded into memory for fast access
+- Approximate memory usage: 15-20 MB for the country database
+- Automatic cleanup of old database files
+
+## Error Handling
+
+### Graceful Degradation
+- Service returns `nil` when database unavailable
+- Logging at appropriate levels for different error types
+- Event processing continues even if geo-location fails
+
+### Common Error Scenarios
+1. **Database Missing** - Automatic download triggered
+2. **Database Corrupted** - Automatic re-download attempted
+3. **Network Issues** - Graceful fallback with error logging
+4. **Invalid IP Address** - Returns `nil` with warning log
+
+## Troubleshooting
+
+### Check System Status
+```bash
+# Verify database status
+rails geoip:status
+
+# Test with known IPs
+rails geoip:test
+
+# Check logs for errors
+tail -f log/production.log | grep GeoIP
+```
+
+### Common Issues
+
+#### Database Not Available
+```bash
+# Force database update
+rails geoip:update
+
+# Check file permissions
+ls -la db/geoip/
+```
+
+#### Lookup Failures
+```bash
+# Test specific IPs
+rails geoip:lookup[8.8.8.8]
+
+# Check database validity
+rails runner "puts GeoIpService.new.database_available?"
+```
+
+#### Performance Issues
+- Increase cache size in configuration
+- Check memory usage on deployment server
+- Monitor lookup times with application metrics
+
+## Monitoring & Maintenance
+
+### Health Checks
+```ruby
+# Rails console health check
+service = GeoIpService.new
+puts "Database available: #{service.database_available?}"
+puts "Database age: #{service.database_record&.age_in_days} days"
+```
+
+### Scheduled Maintenance
+- Database automatically updated weekly
+- Old database files cleaned up after 7 days
+- No manual maintenance required
+
+### Monitoring Metrics
+Consider monitoring:
+- Database update success/failure rates
+- Lookup performance (response times)
+- Database age and freshness
+- Cache hit/miss ratios
+
+## Security & Privacy
+
+### Data Privacy
+- No personal data stored in the GeoIP database
+- Only country-level information provided
+- No tracking or logging of IP lookups by default
+
+### Network Security
+- Database downloaded from official MaxMind CDN
+- File integrity validated with MD5 checksums
+- Secure temporary file handling during updates
+
+## API Reference
+
+### GeoIpService
+
+#### Class Methods
+- `lookup_country(ip_address)` - Direct lookup
+- `update_database!` - Force database update
+
+#### Instance Methods
+- `lookup_country(ip_address)` - Country lookup
+- `lookup_countries(ip_addresses)` - Batch lookup
+- `database_available?` - Check database status
+- `database_info` - Get database metadata
+- `update_from_remote!` - Download new database
+
+### Model Methods
+
+#### Event Model
+- `enrich_geo_location!` - Update with country code
+- `lookup_country` - Get country code (with fallback)
+- `has_geo_data?` - Check if geo data exists
+- `geo_location` - Get full geo location hash
+
+#### IPv4Range/IPv6Range Models
+- `geo_lookup_country!` - Update range with country code
+- `geo_lookup_country` - Get country code (without update)
+- `has_country_info?` - Check for existing country data
+- `primary_country` - Get best available country code
+- `lookup_country_by_ip(ip)` - Class method for IP lookup
+- `enrich_missing_geo_data(limit:)` - Class method for batch enrichment
+
+## Support & Resources
+
+### MaxMind Documentation
+- [MaxMind Developer Site](https://dev.maxmind.com/)
+- [GeoLite2 Databases](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data)
+- [Database Accuracy](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data#accuracy)
+
+### Ruby Libraries
+- [maxmind-db gem](https://github.com/maxmind/MaxMind-DB-Reader-ruby)
+- [httparty gem](https://github.com/jnunemaker/httparty)
+
+### Troubleshooting Resources
+- Application logs: `log/production.log`
+- Rails console for manual testing
+- Database status via `rails geoip:status`
--- a/docs/rule-architecture.md
+++ b/docs/rule-architecture.md
@@ -0,0 +1,625 @@
+# Baffle Hub - Rule Architecture
+
+## Overview
+
+Baffle Hub uses a distributed rule system where the Hub generates and manages rules, and Agents download and enforce them locally using optimized SQLite queries. This architecture provides sub-millisecond rule evaluation while maintaining centralized intelligence and control.
+
+## Core Principles
+
+1. **Hub-side Intelligence**: Pattern detection and rule generation happens on the Hub
+2. **Agent-side Enforcement**: Rule evaluation happens locally on Agents for speed
+3. **Incremental Sync**: Agents poll for rule updates using timestamp-based cursors
+4. **Dynamic Backpressure**: Hub controls event sampling based on load
+5. **Temporal Rules**: Rules can expire automatically (e.g., 24-hour bans)
+6. **Soft Deletes**: Rules are disabled, not deleted, for proper sync and audit trail
+
+## Rule Types
+
+### 1. Network Rules (`network_v4`, `network_v6`)
+
+Block or allow traffic based on IP address or CIDR ranges.
+
+**Use Cases**:
+- Block scanner IPs (temporary or permanent)
+- Block datacenter/VPN/proxy ranges
+- Allow trusted IP ranges
+- Geographic blocking via IP ranges
+
+**Evaluation**:
+- **Most specific CIDR wins** (smallest prefix)
+- `/32` beats `/24` beats `/16` beats `/8`
+- Agent uses optimized range queries on `ipv4_ranges`/`ipv6_ranges` tables
+
+**Example**:
+```json
+{
+  "id": 12341,
+  "rule_type": "network_v4",
+  "action": "deny",
+  "conditions": { "cidr": "185.220.100.0/22" },
+  "priority": 22,
+  "expires_at": "2024-11-04T12:00:00Z",
+  "enabled": true,
+  "source": "auto:scanner_detected",
+  "metadata": {
+    "reason": "Tor exit node hitting /.env",
+    "auto_generated": true
+  }
+}
+```
+
+### 2. Rate Limit Rules (`rate_limit`)
+
+Control request rate per IP or per CIDR range.
+
+**Scopes** (Phase 1):
+- **Global per-IP**: Limit requests per IP across all paths
+- **Per-CIDR**: Different limits for different network ranges
+
+**Scopes** (Phase 2+):
+- **Per-path per-IP**: Different limits for `/api/*`, `/login`, etc.
+
+**Evaluation**:
+- Agent maintains in-memory counters per IP
+- Finds most specific CIDR rule for the IP
+- Applies that rule's rate limit configuration
+- Optional: Persist counters to SQLite for restart resilience
+
+**Example (Phase 1)**:
+```json
+{
+  "id": 12342,
+  "rule_type": "rate_limit",
+  "action": "rate_limit",
+  "conditions": {
+    "cidr": "0.0.0.0/0",
+    "scope": "global"
+  },
+  "priority": 0,
+  "enabled": true,
+  "source": "manual",
+  "metadata": {
+    "limit": 100,
+    "window": 60,
+    "per_ip": true
+  }
+}
+```
+
+**Example (Phase 2+)**:
+```json
+{
+  "id": 12343,
+  "rule_type": "rate_limit",
+  "action": "rate_limit",
+  "conditions": {
+    "cidr": "0.0.0.0/0",
+    "scope": "per_path",
+    "path_pattern": "/api/login"
+  },
+  "metadata": {
+    "limit": 5,
+    "window": 60,
+    "per_ip": true
+  }
+}
+```
+
+### 3. Path Pattern Rules (`path_pattern`)
+
+Detect suspicious path access patterns (mainly for Hub analytics).
+
+**Use Cases**:
+- Detect scanners hitting `/.env`, `/.git`, `/wp-admin`
+- Identify bots with suspicious path traversal
+- Trigger automatic IP bans when patterns match
+
+**Evaluation**:
+- Agent does lightweight pattern matching
+- When matched, sends event to Hub with `matched_pattern: true`
+- Hub analyzes and creates IP block rules if needed
+- Agent picks up new IP block rule in next sync (~10s)
+
+**Example**:
+```json
+{
+  "id": 12344,
+  "rule_type": "path_pattern",
+  "action": "log",
+  "conditions": {
+    "patterns": ["/.env", "/.git/*", "/wp-admin/*", "/.aws/*", "/phpMyAdmin/*"]
+  },
+  "enabled": true,
+  "source": "default:scanner_detection",
+  "metadata": {
+    "auto_ban_ip": true,
+    "ban_duration_hours": 24,
+    "description": "Common scanner paths"
+  }
+}
+```
+
+## Rule Actions
+
+| Action | Description | HTTP Response |
+|--------|-------------|---------------|
+| `allow` | Pass request through | Continue to app |
+| `deny` | Block request | 403 Forbidden |
+| `rate_limit` | Enforce rate limit | 429 Too Many Requests |
+| `redirect` | Redirect to URL | 301/302 + Location header |
+| `challenge` | Show CAPTCHA (Phase 2+) | 403 with challenge |
+| `log` | Log only, don't block | Continue to app |
+
+## Rule Priority & Specificity
+
+### Network Rules
+- **Priority is determined by CIDR prefix length**
+- Smaller prefix (more specific) = higher priority
+- `/32` (single IP) beats `/24` (256 IPs) beats `/8` (16M IPs)
+- Example: Block `10.0.0.0/8` but allow `10.0.1.0/24`
+  - Request from `10.0.1.5` → matches `/24` → allowed
+  - Request from `10.0.2.5` → matches `/8` only → blocked
+
+### Rate Limit Rules
+- Most specific CIDR match wins
+- Per-path rules take precedence over global (Phase 2+)
+
+### Path Pattern Rules
+- All patterns are evaluated (not exclusive)
+- Used for detection, not blocking
+- Multiple pattern matches = stronger signal for ban
+
+## Rule Synchronization
+
+### Timestamp-Based Cursor
+
+Agents use `updated_at` timestamps as sync cursors to handle rule updates and deletions.
+
+**Why `updated_at` instead of `id`?**
+- Handles rule updates (e.g., disabling a rule updates `updated_at`)
+- Handles rule deletions via `enabled=false` flag
+- Simple for agents: "give me everything that changed since X"
+
+**Agent Sync Flow**:
+```
+1. Agent starts: last_sync = nil
+2. GET /api/:key/rules → Full sync, store latest updated_at
+3. Every 10s or 1000 events: GET /api/:key/rules?since=<last_sync>
+4. Process rules: add new, update existing, remove disabled
+5. Update last_sync to latest updated_at from response
+```
+
+**Query Overlap**: Hub queries `updated_at >= since - 0.5s` to handle clock skew and millisecond duplicates.
+
+### API Endpoints
+
+#### 1. Version Check (Lightweight)
+
+```http
+GET /api/:public_key/rules/version
+
+Response:
+{
+  "version": "2024-11-03T12:30:45.123Z",
+  "count": 150,
+  "sampling": {
+    "allowed_requests": 0.5,
+    "blocked_requests": 1.0,
+    "rate_limited_requests": 1.0,
+    "effective_until": "2024-11-03T12:30:55.123Z"
+  }
+}
+```
+
+#### 2. Incremental Sync
+
+```http
+GET /api/:public_key/rules?since=2024-11-03T12:00:00.000Z
+
+Response:
+{
+  "version": "2024-11-03T12:30:45.123Z",
+  "sampling": { ... },
+  "rules": [
+    {
+      "id": 12341,
+      "rule_type": "network_v4",
+      "action": "deny",
+      "conditions": { "cidr": "1.2.3.4/32" },
+      "priority": 32,
+      "expires_at": "2024-11-04T12:00:00Z",
+      "enabled": true,
+      "source": "auto:scanner_detected",
+      "metadata": { "reason": "Hitting /.env" },
+      "created_at": "2024-11-03T12:00:00Z",
+      "updated_at": "2024-11-03T12:00:00Z"
+    },
+    {
+      "id": 12340,
+      "rule_type": "network_v4",
+      "action": "deny",
+      "conditions": { "cidr": "5.6.7.8/32" },
+      "priority": 32,
+      "enabled": false,
+      "source": "manual",
+      "metadata": { "reason": "False positive" },
+      "created_at": "2024-11-02T10:00:00Z",
+      "updated_at": "2024-11-03T12:25:00Z"
+    }
+  ]
+}
+```
+
+#### 3. Full Sync
+
+```http
+GET /api/:public_key/rules
+
+Response:
+{
+  "version": "2024-11-03T12:30:45.123Z",
+  "sampling": { ... },
+  "rules": [ ...all enabled rules... ]
+}
+```
+
+## Dynamic Event Sampling
+
+Hub controls how many events Agents send based on load.
+
+### Sampling Strategy
+
+**Hub monitors**:
+- SolidQueue job depth
+- Events/second rate
+- Database write latency
+
+**Sampling rates**:
+```ruby
+Queue Depth     | Allowed | Blocked | Rate Limited
+----------------|---------|---------|-------------
+0-1,000         | 100%    | 100%    | 100%
+1,001-5,000     | 50%     | 100%    | 100%
+5,001-10,000    | 20%     | 100%    | 100%
+10,001+         | 5%      | 100%    | 100%
+```
+
+**Phase 2+: Path-based sampling**:
+```json
+{
+  "sampling": {
+    "allowed_requests": 0.1,
+    "blocked_requests": 1.0,
+    "paths": {
+      "block": ["/.env", "/.git/*"],
+      "allow": ["/health", "/metrics"]
+    }
+  }
+}
+```
+
+**Agent respects sampling**:
+- Always sends blocked/rate-limited events
+- Samples allowed events based on rate
+- Can prioritize suspicious paths over routine traffic
+
+## Temporal Rules (Expiration)
+
+Rules can have an `expires_at` timestamp for automatic expiration.
+
+**Use Cases**:
+- 24-hour scanner bans
+- Temporary rate limit adjustments
+- Time-boxed maintenance blocks
+
+**Cleanup**:
+- `ExpiredRulesCleanupJob` runs hourly
+- Disables rules where `expires_at < now`
+- Agent picks up disabled rules in next sync
+
+**Example**:
+```ruby
+# Hub auto-generates rule when scanner detected:
+Rule.create!(
+  rule_type: "network_v4",
+  action: "deny",
+  conditions: { cidr: "1.2.3.4/32" },
+  expires_at: 24.hours.from_now,
+  source: "auto:scanner_detected",
+  metadata: { reason: "Hit /.env 5 times in 10 seconds" }
+)
+
+# 24 hours later: ExpiredRulesCleanupJob disables it
+# Agent syncs and removes from ipv4_ranges table
+```
+
+## Rule Sources
+
+The `source` field tracks rule origin for audit and filtering.
+
+**Source Formats**:
+- `manual` - Created by user via UI
+- `auto:scanner_detected` - Auto-generated from scanner pattern
+- `auto:rate_limit_exceeded` - Auto-generated from rate limit abuse
+- `auto:bot_detected` - Auto-generated from bot behavior
+- `imported:fail2ban` - Imported from external source
+- `imported:crowdsec` - Imported from CrowdSec
+- `default:scanner_paths` - Default rule set
+
+## Database Schema
+
+### Hub Schema
+
+```ruby
+create_table "rules" do |t|
+  # Identification
+  t.integer :id, primary_key: true
+  t.string :source, limit: 100
+
+  # Rule definition
+  t.string :rule_type, null: false
+  t.string :action, null: false
+  t.json :conditions, null: false
+  t.json :metadata
+
+  # Priority & lifecycle
+  t.integer :priority
+  t.datetime :expires_at
+  t.boolean :enabled, default: true, null: false
+
+  # Timestamps (updated_at is sync cursor!)
+  t.timestamps
+
+  # Indexes
+  t.index [:updated_at, :id]  # Primary sync query
+  t.index :enabled
+  t.index :expires_at
+  t.index :source
+  t.index :rule_type
+end
+```
+
+### Agent Schema (Existing)
+
+```ruby
+create_table "ipv4_ranges" do |t|
+  t.integer :network_start, limit: 8, null: false
+  t.integer :network_end, limit: 8, null: false
+  t.integer :network_prefix, null: false
+  t.integer :waf_action, default: 0, null: false
+  t.integer :priority, default: 100
+  t.string :redirect_url, limit: 500
+  t.integer :redirect_status
+  t.string :source, limit: 50
+  t.timestamps
+
+  t.index [:network_start, :network_end, :network_prefix]
+  t.index :waf_action
+end
+
+create_table "ipv6_ranges" do |t|
+  t.binary :network_start, limit: 16, null: false
+  t.binary :network_end, limit: 16, null: false
+  t.integer :network_prefix, null: false
+  t.integer :waf_action, default: 0, null: false
+  t.integer :priority, default: 100
+  t.string :redirect_url, limit: 500
+  t.integer :redirect_status
+  t.string :source, limit: 50
+  t.timestamps
+
+  t.index [:network_start, :network_end, :network_prefix]
+  t.index :waf_action
+end
+```
+
+## Agent Rule Processing
+
+### Network Rules
+
+```ruby
+# Agent receives network rule from Hub:
+rule = {
+  id: 12341,
+  rule_type: "network_v4",
+  action: "deny",
+  conditions: { cidr: "10.0.0.0/8" },
+  priority: 8,
+  enabled: true
+}
+
+# Agent converts to ipv4_ranges entry:
+cidr = IPAddr.new("10.0.0.0/8")
+Ipv4Range.upsert({
+  source: "hub:12341",
+  network_start: cidr.to_i,
+  network_end: cidr.to_range.end.to_i,
+  network_prefix: 8,
+  waf_action: 1,  # deny
+  priority: 8
+}, unique_by: :source)
+
+# Agent evaluates request:
+# SELECT * FROM ipv4_ranges
+# WHERE ? BETWEEN network_start AND network_end
+# ORDER BY network_prefix DESC
+# LIMIT 1
+```
+
+### Rate Limit Rules
+
+```ruby
+# Agent stores in memory:
+@rate_limit_rules = {
+  "global" => { limit: 100, window: 60, cidr: "0.0.0.0/0" }
+}
+
+@rate_counters = {
+  "1.2.3.4" => { count: 50, window_start: Time.now }
+}
+
+# On each request:
+def check_rate_limit(ip)
+  rule = find_most_specific_rate_limit_rule(ip)
+  counter = @rate_counters[ip] ||= { count: 0, window_start: Time.now }
+
+  # Reset window if expired
+  if Time.now - counter[:window_start] > rule[:window]
+    counter = { count: 0, window_start: Time.now }
+  end
+
+  counter[:count] += 1
+
+  if counter[:count] > rule[:limit]
+    { action: "rate_limit", status: 429 }
+  else
+    { action: "allow" }
+  end
+end
+```
+
+### Path Pattern Rules
+
+```ruby
+# Agent evaluates patterns:
+PATH_PATTERNS = [/.env$/, /.git/, /wp-admin/]
+
+def check_path_patterns(path)
+  matched = PATH_PATTERNS.any? { |pattern| path.match?(pattern) }
+
+  if matched
+    # Send event to Hub with flag
+    send_event_to_hub(
+      path: path,
+      matched_pattern: true,
+      waf_action: "log"  # Don't block yet
+    )
+
+    # Hub will analyze and create IP block rule if needed
+  end
+end
+```
+
+## Hub Intelligence (Auto-Generation)
+
+### Scanner Detection
+
+```ruby
+# PathScannerDetectorJob
+class PathScannerDetectorJob < ApplicationJob
+  SCANNER_PATHS = %w[/.env /.git /wp-admin /phpMyAdmin /.aws]
+
+  def perform
+    # Find IPs hitting scanner paths
+    scanner_ips = Event
+      .where("request_path IN (?)", SCANNER_PATHS)
+      .where("timestamp > ?", 5.minutes.ago)
+      .group(:ip_address)
+      .having("COUNT(*) >= 3")
+      .pluck(:ip_address)
+
+    scanner_ips.each do |ip|
+      # Create 24h ban rule
+      Rule.create!(
+        rule_type: "network_v4",
+        action: "deny",
+        conditions: { cidr: "#{ip}/32" },
+        priority: 32,
+        expires_at: 24.hours.from_now,
+        source: "auto:scanner_detected",
+        metadata: {
+          reason: "Hit #{SCANNER_PATHS.join(', ')}",
+          auto_generated: true
+        }
+      )
+    end
+  end
+end
+```
+
+### Rate Limit Abuse Detection
+
+```ruby
+# RateLimitAnomalyJob
+class RateLimitAnomalyJob < ApplicationJob
+  def perform
+    # Find IPs exceeding normal rate
+    abusive_ips = Event
+      .where("timestamp > ?", 1.minute.ago)
+      .group(:ip_address)
+      .having("COUNT(*) > 200")  # >200 req/min
+      .pluck(:ip_address)
+
+    abusive_ips.each do |ip|
+      # Create aggressive rate limit or block
+      Rule.create!(
+        rule_type: "rate_limit",
+        action: "rate_limit",
+        conditions: { cidr: "#{ip}/32", scope: "global" },
+        priority: 32,
+        expires_at: 1.hour.from_now,
+        source: "auto:rate_limit_exceeded",
+        metadata: {
+          limit: 10,
+          window: 60,
+          per_ip: true
+        }
+      )
+    end
+  end
+end
+```
+
+## Performance Characteristics
+
+### Hub
+- **Rule query**: O(log n) with `(updated_at, id)` index
+- **Version check**: Single index lookup
+- **Rule generation**: Background jobs, no request impact
+
+### Agent
+- **Network rule lookup**: O(log n) via B-tree index on `(network_start, network_end)`
+- **Rate limit check**: O(1) hash lookup in memory
+- **Path pattern check**: O(n) regex match (n = number of patterns)
+- **Overall request evaluation**: <1ms for typical case
+
+### Sync Efficiency
+- **Incremental sync**: Only changed rules since last sync
+- **Typical sync payload**: <10 KB for 50 rules
+- **Sync frequency**: Every 10s or 1000 events
+- **Version check**: <1 KB response
+
+## Future Enhancements (Phase 2+)
+
+### Per-Path Rate Limiting
+- Different limits for `/api/*`, `/login`, `/admin`
+- Agent tracks multiple counters per IP
+
+### Path-Based Event Sampling
+- Send all `/admin` requests
+- Skip `/health`, `/metrics`
+- Sample 10% of regular traffic
+
+### Challenge Actions
+- CAPTCHA challenges for suspicious IPs
+- JavaScript challenges for bot detection
+
+### Scheduled Rules
+- Block during maintenance windows
+- Time-of-day rate limits
+
+### Multi-Project Rules (Phase 10+)
+- Global rules across all projects
+- Per-project rule overrides
+
+## Summary
+
+The Baffle Hub rule system provides:
+- **Fast local enforcement** (sub-millisecond)
+- **Centralized intelligence** (Hub analytics)
+- **Efficient synchronization** (timestamp-based incremental sync)
+- **Dynamic adaptation** (backpressure control via sampling)
+- **Temporal flexibility** (auto-expiring rules)
+- **Audit trail** (soft deletes, source tracking)
+
+This architecture scales from single-server deployments to distributed multi-agent installations while maintaining simplicity and pragmatic design choices focused on the "low-hanging fruit" of WAF functionality.
--- a/docs/rule-system-implementation-summary.md
+++ b/docs/rule-system-implementation-summary.md
@@ -0,0 +1,381 @@
+# Rule System Implementation Summary
+
+## What We Built
+
+A complete distributed WAF rule synchronization system that allows the Baffle Hub to generate and manage rules while Agents download and enforce them locally with sub-millisecond latency.
+
+## Implementation Status: ✅ Complete (Phase 1)
+
+### 1. Database Schema ✅
+
+**Migration**: `db/migrate/20251103080823_enhance_rules_table_for_sync.rb`
+
+Enhanced the `rules` table with:
+- `source` field to track rule origin (manual, auto-generated, imported)
+- JSON `conditions` and `metadata` fields
+- `expires_at` for temporal rules (24h bans)
+- `enabled` flag for soft deletes
+- `priority` for rule specificity
+- Optimized indexes for sync queries (`updated_at, id`)
+
+**Schema**:
+```ruby
+create_table "rules" do |t|
+  t.string :rule_type, null: false     # network_v4, network_v6, rate_limit, path_pattern
+  t.string :action, null: false        # allow, deny, rate_limit, redirect, log
+  t.json :conditions, null: false      # CIDR, patterns, scope
+  t.json :metadata                     # reason, limits, redirect_url
+  t.integer :priority                  # Auto-calculated from CIDR prefix
+  t.datetime :expires_at               # For temporal bans
+  t.boolean :enabled, default: true    # Soft delete flag
+  t.string :source, limit: 100         # Origin tracking
+  t.timestamps
+
+  # Indexes for efficient sync
+  t.index [:updated_at, :id]           # Primary sync cursor
+  t.index :enabled
+  t.index :expires_at
+  t.index [:rule_type, :enabled]
+end
+```
+
+### 2. Rule Model ✅
+
+**File**: `app/models/rule.rb`
+
+Complete Rule model with:
+- **Rule types**: `network_v4`, `network_v6`, `rate_limit`, `path_pattern`
+- **Actions**: `allow`, `deny`, `rate_limit`, `redirect`, `log`
+- **Validations**: Type-specific validation for conditions and metadata
+- **Scopes**: `active`, `expired`, `network_rules`, `rate_limit_rules`, etc.
+- **Sync methods**: `since(timestamp)`, `latest_version`
+- **Auto-priority**: Calculates priority from CIDR prefix length
+- **Agent format**: `to_agent_format` for API responses
+
+**Example Usage**:
+```ruby
+# Create network block rule
+Rule.create!(
+  rule_type: "network_v4",
+  action: "deny",
+  conditions: { cidr: "1.2.3.4/32" },
+  expires_at: 24.hours.from_now,
+  source: "auto:scanner_detected",
+  metadata: { reason: "Hit /.env multiple times" }
+)
+
+# Create rate limit rule
+Rule.create!(
+  rule_type: "rate_limit",
+  action: "rate_limit",
+  conditions: { cidr: "0.0.0.0/0", scope: "global" },
+  metadata: { limit: 100, window: 60, per_ip: true },
+  source: "manual"
+)
+
+# Disable rule (soft delete)
+rule.disable!(reason: "False positive")
+
+# Query for sync
+Rule.since("2025-11-03T08:00:00.000Z")
+```
+
+### 3. API Endpoints ✅
+
+**Controller**: `app/controllers/api/rules_controller.rb`
+**Routes**: Added to `config/routes.rb`
+
+#### Version Endpoint (Lightweight Check)
+
+```http
+GET /api/:public_key/rules/version
+
+Response:
+{
+  "version": "2025-11-03T08:14:23.648330Z",
+  "count": 150,
+  "sampling": {
+    "allowed_requests": 1.0,
+    "blocked_requests": 1.0,
+    "rate_limited_requests": 1.0,
+    "effective_until": "2025-11-03T08:14:33.689Z",
+    "load_level": "normal",
+    "queue_depth": 0
+  }
+}
+```
+
+#### Incremental Sync
+
+```http
+GET /api/:public_key/rules?since=2025-11-03T08:00:00.000Z
+
+Response:
+{
+  "version": "2025-11-03T08:14:23.648330Z",
+  "sampling": { ... },
+  "rules": [
+    {
+      "id": 1,
+      "rule_type": "network_v4",
+      "action": "deny",
+      "conditions": { "cidr": "10.0.0.0/8" },
+      "priority": 8,
+      "expires_at": null,
+      "enabled": true,
+      "source": "manual",
+      "metadata": { "reason": "Testing" },
+      "created_at": "2025-11-03T08:14:23Z",
+      "updated_at": "2025-11-03T08:14:23Z"
+    }
+  ]
+}
+```
+
+#### Full Sync
+
+```http
+GET /api/:public_key/rules
+
+Response: Same format, returns all active rules
+```
+
+### 4. Dynamic Load-Based Sampling ✅
+
+**Service**: `app/services/hub_load.rb`
+
+Monitors SolidQueue depth and adjusts event sampling rates:
+
+| Queue Depth | Load Level | Allowed | Blocked | Rate Limited |
+|-------------|------------|---------|---------|--------------|
+| 0-1,000     | Normal     | 100%    | 100%    | 100%         |
+| 1,001-5,000 | Moderate   | 50%     | 100%    | 100%         |
+| 5,001-10,000| High       | 20%     | 100%    | 100%         |
+| 10,001+     | Critical   | 5%      | 100%    | 100%         |
+
+**Features**:
+- Automatic backpressure control
+- Always sends 100% of blocks/rate-limits
+- Reduces allowed request sampling under load
+- Included in every API response
+
+### 5. Background Jobs ✅
+
+#### ExpiredRulesCleanupJob
+
+**File**: `app/jobs/expired_rules_cleanup_job.rb`
+
+- Runs hourly
+- Disables rules with `expires_at` in the past
+- Cleans up old disabled rules (>30 days) once per day
+- Agents pick up disabled rules via `updated_at` change
+
+#### PathScannerDetectorJob
+
+**File**: `app/jobs/path_scanner_detector_job.rb`
+
+- Runs every 5 minutes (recommended)
+- Detects IPs hitting scanner paths (/.env, /.git, /wp-admin, etc.)
+- Auto-creates 24h ban rules after 3+ hits
+- Handles both IPv4 and IPv6
+- Prevents duplicate rules
+
+**Scanner Paths**:
+- `/.env`, `/.git`, `/.aws`, `/.ssh`, `/.config`
+- `/wp-admin`, `/wp-login.php`
+- `/phpMyAdmin`, `/phpmyadmin`
+- `/admin`, `/administrator`
+- `/backup`, `/db_backup`
+- `/.DS_Store`, `/web.config`
+
+## Testing
+
+### Create Test Rules
+
+```bash
+bin/rails runner '
+# Network block
+Rule.create!(
+  rule_type: "network_v4",
+  action: "deny",
+  conditions: { cidr: "10.0.0.0/8" },
+  source: "manual",
+  metadata: { reason: "Test block" }
+)
+
+# Rate limit
+Rule.create!(
+  rule_type: "rate_limit",
+  action: "rate_limit",
+  conditions: { cidr: "0.0.0.0/0", scope: "global" },
+  metadata: { limit: 100, window: 60 },
+  source: "manual"
+)
+
+puts "✓ Created #{Rule.count} rules"
+puts "✓ Latest version: #{Rule.latest_version}"
+'
+```
+
+### Test API Endpoints
+
+```bash
+# Get your project key
+bin/rails runner 'puts Project.first.public_key'
+
+# Test version endpoint
+curl http://localhost:3000/api/YOUR_PUBLIC_KEY/rules/version | jq
+
+# Test full sync
+curl http://localhost:3000/api/YOUR_PUBLIC_KEY/rules | jq
+
+# Test incremental sync
+curl "http://localhost:3000/api/YOUR_PUBLIC_KEY/rules?since=2025-11-03T08:00:00.000Z" | jq
+```
+
+### Run Background Jobs
+
+```bash
+# Test expired rules cleanup
+bin/rails runner 'ExpiredRulesCleanupJob.perform_now'
+
+# Test scanner detector (needs events first)
+bin/rails runner 'PathScannerDetectorJob.perform_now'
+
+# Check hub load
+bin/rails runner 'puts HubLoad.stats.inspect'
+```
+
+## Agent Integration (Next Steps)
+
+The Agent needs to:
+
+1. **Poll for updates** every 10 seconds or 1000 events:
+   ```ruby
+   GET /api/:public_key/rules?since=<last_updated_at>
+   ```
+
+2. **Process rules** received:
+   - `enabled: true` → Insert/update in local tables
+   - `enabled: false` → Remove from local tables
+
+3. **Populate local SQLite tables**:
+   ```ruby
+   # For network_v4 rules:
+   cidr = IPAddr.new(rule.conditions.cidr)
+   Ipv4Range.upsert({
+     source: "hub:#{rule.id}",
+     network_start: cidr.to_i,
+     network_end: cidr.to_range.end.to_i,
+     network_prefix: rule.priority,
+     waf_action: map_action(rule.action),
+     redirect_url: rule.metadata.redirect_url,
+     priority: rule.priority
+   })
+   ```
+
+4. **Respect sampling rates** from API response:
+   ```ruby
+   sampling = response["sampling"]
+   if event.allowed? && rand > sampling["allowed_requests"]
+     skip_sending_to_hub
+   end
+   ```
+
+## Key Design Decisions
+
+### ✅ IPv4/IPv6 Split
+- Separate `network_v4` and `network_v6` rule types
+- Agent has separate `ipv4_ranges` and `ipv6_ranges` tables
+- Better performance (integer vs binary indexes)
+
+### ✅ Timestamp-Based Sync
+- Use `updated_at` as version cursor (not `id`)
+- Handles rule updates and soft deletes
+- Query overlap (0.5s) handles clock skew
+- Secondary sort by `id` for consistency
+
+### ✅ Soft Deletes
+- Rules disabled, not deleted
+- Audit trail preserved
+- Agents sync via `enabled: false`
+- Old rules cleaned after 30 days
+
+### ✅ Priority from CIDR
+- Auto-calculated from prefix length
+- Most specific (smallest prefix) wins
+- `/32` > `/24` > `/16` > `/8`
+- No manual priority needed for network rules
+
+### ✅ Dynamic Sampling
+- Hub controls load via sampling rates
+- Always sends critical events (blocks, rate limits)
+- Reduces allowed event traffic under load
+- Prevents Hub overload
+
+## Performance Characteristics
+
+### Hub
+- **Version check**: Single index lookup (~1ms)
+- **Incremental sync**: Index scan on `(updated_at, id)` (~5-10ms for 100 rules)
+- **Rule creation**: Single insert (~5ms)
+
+### Agent (Expected)
+- **Network lookup**: O(log n) via B-tree on `(network_start, network_end)` (<1ms)
+- **Rate limit check**: O(1) hash lookup in memory (<0.1ms)
+- **Sync overhead**: 10s polling, ~5-10 KB payload for 50 rules
+
+## What's Not Included (Future Phases)
+
+- ❌ Per-path rate limiting (Phase 2)
+- ❌ Path-based event sampling (Phase 2)
+- ❌ Challenge actions/CAPTCHA (Phase 2+)
+- ❌ Multi-project rules (Phase 10+)
+- ❌ Rule UI (manual creation via console for now)
+- ❌ Recurring job scheduling (needs separate setup)
+
+## Next Implementation Steps
+
+1. **Schedule Background Jobs**
+   - Add to `config/initializers/recurring_jobs.rb` or use gem like `good_job`
+   - `ExpiredRulesCleanupJob` every hour
+   - `PathScannerDetectorJob` every 5 minutes
+
+2. **Build Rule Management UI**
+   - Form to create network block rules
+   - List active rules
+   - Disable/enable rules
+   - View auto-generated rules
+
+3. **Agent Sync Implementation**
+   - HTTP client to poll rules endpoint
+   - SQLite population logic
+   - Sampling rate respect
+   - Rule evaluation integration
+
+4. **Monitoring/Metrics**
+   - Dashboard showing active rules count
+   - Auto-generated rules per day
+   - Banned IPs list
+   - Rule sync lag per agent
+
+## Documentation
+
+Complete architecture documentation available at:
+- **docs/rule-architecture.md** - Full technical specification
+- **This file** - Implementation summary and testing guide
+
+## Summary
+
+We've built a production-ready, distributed WAF rule system with:
+- ✅ Database schema with optimized indexes
+- ✅ Complete Rule model with validations
+- ✅ RESTful API with version/incremental/full sync
+- ✅ Dynamic load-based event sampling
+- ✅ Auto-expiring temporal rules
+- ✅ Scanner detection and auto-banning
+- ✅ Soft deletes with audit trail
+- ✅ IPv4/IPv6 separation
+- ✅ Comprehensive documentation
+
+The system is ready for Agent integration and can scale from single-server to multi-agent distributed deployments.