Accepts incoming events and correctly parses them into events. GeoLite2 integration complete
358
docs/maxmind.md
Normal file
@@ -0,0 +1,358 @@
# MaxMind GeoIP Integration

This document describes the MaxMind GeoIP integration implemented in the Baffle Hub WAF analytics system.

## Overview

The Baffle Hub application uses MaxMind's free GeoLite2-Country database to provide geographic location information for IP addresses. The system automatically enriches WAF events with country codes and provides manual lookup capabilities for both IPv4 and IPv6 addresses.

## Features

- **On-demand lookup** - Country code lookup by IP address
- **Automatic enrichment** - Events are enriched with geo-location data during processing
- **Manual lookup capability** - Rake tasks and model methods for manual lookups
- **GeoLite2-Country database** - Uses MaxMind's free country-level database
- **Automatic updates** - Weekly background job updates the database
- **IPv4/IPv6 support** - Full protocol support for both IP versions
- **Performance optimized** - Database caching and efficient lookups
- **Graceful degradation** - Fallback handling when database is unavailable

## Architecture

### Core Components

#### 1. GeoIpService
- Central service for all IP geolocation operations
- Handles database loading from file system
- Provides batch lookup capabilities
- Manages database updates from MaxMind CDN
- Uses MaxMind's built-in metadata for version information

#### 2. UpdateGeoIpDatabaseJob
- Background job for automatic database updates
- Runs weekly to keep the database current
- Simple file-based validation and updates

#### 3. Enhanced Models
- **Event Model** - Automatic geo-location enrichment for WAF events
- **Ipv4Range/Ipv6Range Models** - Manual lookup methods for IP ranges

#### 4. File-System Management
- Database stored as single file: `db/geoip/GeoLite2-Country.mmdb`
- Version information queried directly from MaxMind database metadata
- No database tables needed - simplified approach
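
To illustrate how version information can come from metadata alone, here is a minimal sketch. It assumes the metadata object exposes `database_type`, `build_epoch` and `ip_version`, as the `maxmind-db` gem's metadata does. `FakeMetadata` and `describe_database` are hypothetical names, used so the snippet runs without the gem or a database file; they are not part of `GeoIpService`.

```ruby
# Stand-in for the metadata object a MaxMind DB reader would return.
# Hypothetical: with the real gem this would come from `reader.metadata`.
FakeMetadata = Struct.new(:database_type, :build_epoch, :ip_version)

# Summarise version information from metadata only (no database table needed).
def describe_database(metadata, now: Time.now)
  built_at = Time.at(metadata.build_epoch).utc
  {
    type: metadata.database_type,
    built_at: built_at,
    age_days: ((now - built_at) / 86_400).floor,
    ip_version: metadata.ip_version
  }
end

metadata = FakeMetadata.new("GeoLite2-Country", Time.utc(2024, 1, 2).to_i, 6)
info = describe_database(metadata, now: Time.utc(2024, 1, 12))
puts "#{info[:type]} built #{info[:built_at]} (#{info[:age_days]} days old)"
```

Deriving the age from `build_epoch` rather than from a stored record keeps the file itself as the single source of truth.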

## Installation & Setup

### Dependencies
The integration uses the following gems:
- `maxmind-db` - Official MaxMind database reader (with built-in caching)
- `httparty` - HTTP client for database downloads

### Database Storage
- Location: `db/geoip/GeoLite2-Country.mmdb`
- Automatic creation of storage directory
- File validation and integrity checking
- Version information queried directly from database metadata
- No additional caching needed - MaxMind DB has its own internal caching

### Initial Setup
```bash
# Install dependencies
bundle install

# Download the GeoIP database
rails geoip:update

# Verify installation
rails geoip:status
```

## Configuration

The system is configurable via environment variables or application configuration:

| Variable | Default | Description |
|----------|---------|-------------|
| `MAXMIND_DATABASE_URL` | MaxMind CDN URL | Database download URL |
| `MAXMIND_AUTO_UPDATE` | `true` | Enable automatic weekly updates |
| `MAXMIND_UPDATE_INTERVAL_DAYS` | `7` | Days between update checks |
| `MAXMIND_MAX_AGE_DAYS` | `30` | Maximum database age before forced update |
| `MAXMIND_FALLBACK_COUNTRY` | `nil` | Fallback country when lookup fails |
| `MAXMIND_ENABLE_FALLBACK` | `false` | Enable fallback country usage |

Note: MaxMind DB has built-in caching, so no additional caching configuration is needed.
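
The variables above could be read along these lines. This is a sketch only: `GeoIpSettings` is a hypothetical module, and the accepted truthy strings are an assumption. Only the variable names and defaults come from the table.

```ruby
# Hypothetical settings loader; variable names and defaults mirror the table.
module GeoIpSettings
  # Assumption: these strings count as "true" for boolean variables.
  TRUE_VALUES = %w[1 true yes on].freeze

  # `env` is anything Hash-like; ENV works, and a plain Hash helps in tests.
  def self.load(env = ENV)
    {
      auto_update: boolean(env.fetch("MAXMIND_AUTO_UPDATE", "true")),
      update_interval_days: Integer(env.fetch("MAXMIND_UPDATE_INTERVAL_DAYS", "7")),
      max_age_days: Integer(env.fetch("MAXMIND_MAX_AGE_DAYS", "30")),
      fallback_country: env["MAXMIND_FALLBACK_COUNTRY"],
      enable_fallback: boolean(env.fetch("MAXMIND_ENABLE_FALLBACK", "false"))
    }
  end

  def self.boolean(value)
    TRUE_VALUES.include?(value.to_s.strip.downcase)
  end
end

settings = GeoIpSettings.load({ "MAXMIND_MAX_AGE_DAYS" => "14", "MAXMIND_ENABLE_FALLBACK" => "TRUE" })
puts settings.inspect
```

Using `Integer(...)` instead of `to_i` makes a malformed value such as `MAXMIND_MAX_AGE_DAYS=thirty` fail loudly at boot rather than silently become `0`.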

### Example Configuration
```bash
# .env file (or exported in the process environment)
MAXMIND_AUTO_UPDATE=true
MAXMIND_UPDATE_INTERVAL_DAYS=7
MAXMIND_MAX_AGE_DAYS=30
MAXMIND_FALLBACK_COUNTRY=US
MAXMIND_ENABLE_FALLBACK=true
# Note: No caching configuration needed - MaxMind has built-in caching
```

## Usage

### Rake Tasks

#### Database Management
```bash
# Download/update the GeoIP database
rails geoip:update

# Check database status and configuration
rails geoip:status

# Test the implementation with sample IPs
rails geoip:test

# Manual lookup for a specific IP
# (quote the task so shells such as zsh do not expand the brackets)
rails "geoip:lookup[8.8.8.8]"
rails "geoip:lookup[2001:4860:4860::8888]"
```

#### Data Management
```bash
# Enrich existing events missing country codes
rails geoip:enrich_missing

# Clean up old inactive database records
rails geoip:cleanup
```

### Ruby API

#### Service-Level Lookups
```ruby
# Direct country lookup
country = GeoIpService.lookup_country('8.8.8.8')
# => "US"

# Batch lookup
countries = GeoIpService.new.lookup_countries(['8.8.8.8', '1.1.1.1'])
# => { "8.8.8.8" => "US", "1.1.1.1" => nil }

# Check database availability
service = GeoIpService.new
service.database_available? # => true/false
service.database_info # => Database metadata
```
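
For readers curious how a batch lookup can sit on top of a single-IP lookup, here is a minimal sketch. It is not the actual `GeoIpService` code. `FakeReader` and `SketchGeoLookup` are hypothetical; the fake reader only imitates the `get(ip)` call of a MaxMind DB reader, returning a record Hash or `nil`, so the snippet runs without a database file.

```ruby
require "ipaddr"

# Hypothetical stand-in for a MaxMind DB reader: `get(ip)` returns a record
# Hash for known addresses and nil otherwise.
class FakeReader
  RECORDS = {
    "8.8.8.8" => { "country" => { "iso_code" => "US" } }
  }.freeze

  def get(ip)
    RECORDS[ip]
  end
end

# Sketch of a batch lookup built on a single-IP lookup.
class SketchGeoLookup
  def initialize(reader)
    @reader = reader
  end

  # Returns a two-letter country code, or nil for unknown or malformed IPs.
  def lookup_country(ip)
    IPAddr.new(ip) # raises IPAddr::InvalidAddressError on malformed input
    @reader.get(ip)&.dig("country", "iso_code")
  rescue IPAddr::InvalidAddressError
    nil
  end

  # De-duplicates the input so each address is looked up only once.
  def lookup_countries(ips)
    ips.uniq.to_h { |ip| [ip, lookup_country(ip)] }
  end
end

lookup = SketchGeoLookup.new(FakeReader.new)
puts lookup.lookup_countries(["8.8.8.8", "192.0.2.1", "bogus", "8.8.8.8"]).inspect
```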

#### Event Model Integration
```ruby
# Automatic enrichment during event processing
event = Event.find(123)
event.enrich_geo_location! # Updates event with country code
event.lookup_country # => "US" (with fallback to service)
event.has_geo_data? # => true/false
event.geo_location # => { country_code: "US", city: nil, ... }

# Batch enrichment of existing events
updated_count = Event.enrich_geo_location_batch
puts "Enriched #{updated_count} events with geo data"
```

#### IP Range Model Integration
```ruby
# IPv4 Range lookups
range = Ipv4Range.find(123)
range.geo_lookup_country! # Updates range with country code
range.geo_lookup_country # => "US" (without updating)
range.has_country_info? # => true/false
range.primary_country # => "US" (best available country)

# Class methods
country = Ipv4Range.lookup_country_by_ip('8.8.8.8')
updated_count = Ipv4Range.enrich_missing_geo_data(limit: 1000)

# IPv6 Range lookups (same interface)
country = Ipv6Range.lookup_country_by_ip('2001:4860:4860::8888')
updated_count = Ipv6Range.enrich_missing_geo_data(limit: 1000)
```

### Background Processing

#### Automatic Updates
The system automatically schedules database updates:
```ruby
# Manually trigger an update (usually scheduled automatically)
UpdateGeoIpDatabaseJob.perform_later

# Force update regardless of age
UpdateGeoIpDatabaseJob.perform_later(force_update: true)
```
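
The decision such a job has to make can be sketched as a pure function. This is an assumption about how the check interval, maximum age and `force_update` flag interact, not the job's actual logic; `update_due?` and its parameters are hypothetical names, with defaults taken from the Configuration table.

```ruby
SECONDS_PER_DAY = 86_400

# Hypothetical helper: decides whether a download should be attempted.
#   database_mtime - Time the database file was last written, or nil if absent
#   last_check_at  - Time of the previous update check, or nil if never checked
def update_due?(database_mtime:, last_check_at:, now: Time.now,
                interval_days: 7, max_age_days: 30, force: false)
  return true if force                   # explicit override
  return true if database_mtime.nil?     # no database on disk yet
  return true if now - database_mtime > max_age_days * SECONDS_PER_DAY # too old

  # Otherwise only check again once the interval has elapsed.
  last_check_at.nil? || now - last_check_at >= interval_days * SECONDS_PER_DAY
end

now = Time.utc(2024, 6, 30)
puts update_due?(database_mtime: now - 5 * SECONDS_PER_DAY,
                 last_check_at: now - 2 * SECONDS_PER_DAY, now: now)
```

Keeping the decision free of I/O makes it easy to unit test with fixed timestamps.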

#### Event Processing Integration
Geo-location enrichment is automatically included in WAF event processing:
```ruby
# This is called automatically in ProcessWafEventJob
event = Event.create_from_waf_payload!(event_id, payload, project)
event.enrich_geo_location! if event.ip_address.present? && event.country_code.blank?
```

## Database Information

### GeoLite2-Country Database
- **Source**: MaxMind GeoLite2-Country (free version)
- **Update Frequency**: Weekly (Tuesdays)
- **Size**: ~9.5 MB
- **Coverage**: Global IP-to-country mapping
- **Format**: MaxMind DB (.mmdb)

### Database Fields
- `country.iso_code` - Two-letter ISO country code
- Supports both IPv4 and IPv6 addresses
- Includes anonymous/proxy detection metadata
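
As an illustration of the `country.iso_code` field, the sketch below pulls the code out of a lookup record. The record is an abridged, hand-written example of the nested Hash shape a country lookup returns, not real output. The `country_code` helper and its fallback to `registered_country` are assumptions for illustration.

```ruby
# Abridged example record (illustrative values, string keys).
record = {
  "continent" => { "code" => "NA", "names" => { "en" => "North America" } },
  "country" => { "iso_code" => "US", "names" => { "en" => "United States" } },
  "registered_country" => { "iso_code" => "US" }
}

# Hypothetical helper: prefer country.iso_code, then registered_country.
# Returns nil when the record is missing or carries no country at all.
def country_code(record)
  return nil if record.nil?

  record.dig("country", "iso_code") || record.dig("registered_country", "iso_code")
end

puts country_code(record)
```

`Hash#dig` returns `nil` instead of raising when an intermediate key is absent, which suits records that contain only some of the fields.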

## Performance Considerations

### Lookup Performance
- MaxMind DB has built-in internal caching optimized for lookups
- Typical lookup time: <1ms
- Database size optimized for fast lookups
- Efficient range queries for IP networks
- No additional caching layer needed

### Memory Usage
- Database loaded into memory for fast access
- Approximate memory usage: 15-20 MB for the country database
- Automatic cleanup of old database files

## Error Handling

### Graceful Degradation
- Service returns `nil` when database unavailable
- Logging at appropriate levels for different error types
- Event processing continues even if geo-location fails

### Common Error Scenarios
1. **Database Missing** - Automatic download triggered
2. **Database Corrupted** - Automatic re-download attempted
3. **Network Issues** - Graceful fallback with error logging
4. **Invalid IP Address** - Returns `nil` with warning log
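
The degradation rules above can be sketched as a small wrapper. This is an illustration under stated assumptions, not the service's real code. `SafeCountryLookup` is a hypothetical class, the reader is any object with a `get(ip)` method returning a record Hash or `nil`, and a `nil` reader models an unavailable database.

```ruby
require "ipaddr"
require "logger"

# Hypothetical wrapper: lookups never raise, they return nil (or a fallback).
class SafeCountryLookup
  def initialize(reader:, logger: Logger.new($stderr), fallback_country: nil)
    @reader = reader # nil models "database unavailable"
    @logger = logger
    @fallback_country = fallback_country
  end

  def call(ip)
    return degraded("database unavailable") if @reader.nil?

    IPAddr.new(ip.to_s) # validate before touching the reader
    @reader.get(ip.to_s)&.dig("country", "iso_code") || @fallback_country
  rescue IPAddr::InvalidAddressError
    @logger.warn("GeoIP: invalid IP address #{ip.inspect}")
    nil
  rescue StandardError => e
    degraded("lookup failed: #{e.class}")
  end

  private

  # Logs the problem and returns the optional fallback country.
  def degraded(reason)
    @logger.error("GeoIP: #{reason}")
    @fallback_country
  end
end

# Tiny fake reader so the sketch runs without a database file.
reader = Object.new
def reader.get(ip)
  ip == "8.8.8.8" ? { "country" => { "iso_code" => "US" } } : nil
end

lookup = SafeCountryLookup.new(reader: reader, logger: Logger.new(nil))
puts lookup.call("8.8.8.8").inspect
puts lookup.call("999.1.1.1").inspect
```

Because every failure path returns a value instead of raising, a caller such as an event-processing job can continue even when geo-location fails.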

## Troubleshooting

### Check System Status
```bash
# Verify database status
rails geoip:status

# Test with known IPs
rails geoip:test

# Check logs for errors
tail -f log/production.log | grep GeoIP
```

### Common Issues

#### Database Not Available
```bash
# Force database update
rails geoip:update

# Check file permissions
ls -la db/geoip/
```

#### Lookup Failures
```bash
# Test specific IPs (quoted so the shell does not expand the brackets)
rails "geoip:lookup[8.8.8.8]"

# Check database validity
rails runner "puts GeoIpService.new.database_available?"
```

#### Performance Issues
- No cache size setting exists (see Configuration), so rule out database problems first with `rails geoip:status`
- Check memory usage on deployment server
- Monitor lookup times with application metrics

## Monitoring & Maintenance

### Health Checks
```ruby
# Rails console health check
service = GeoIpService.new
puts "Database available: #{service.database_available?}"
puts "Database age: #{service.database_record&.age_in_days} days"
```
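
Since the database lives in a single file, a health check can also be derived from the file itself. The following sketch assumes file modification time is an acceptable proxy for database age. `database_health` is a hypothetical helper, and the 30-day threshold mirrors `MAXMIND_MAX_AGE_DAYS`.

```ruby
require "tmpdir"

# Hypothetical health check based only on the database file on disk.
def database_health(path, max_age_days: 30, now: Time.now)
  unless File.file?(path) && File.size(path).positive?
    return { available: false, age_days: nil, stale: true }
  end

  age_days = ((now - File.mtime(path)) / 86_400).floor
  { available: true, age_days: age_days, stale: age_days > max_age_days }
end

# Demonstration against a throwaway placeholder file (not a real database).
Dir.mktmpdir do |dir|
  path = File.join(dir, "GeoLite2-Country.mmdb")
  File.binwrite(path, "placeholder bytes")
  puts database_health(path).inspect
  puts database_health(File.join(dir, "missing.mmdb")).inspect
end
```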

### Scheduled Maintenance
- Database automatically updated weekly
- Old database files cleaned up after 7 days
- No manual maintenance required

### Monitoring Metrics
Consider monitoring:
- Database update success/failure rates
- Lookup performance (response times)
- Database age and freshness
- Cache hit/miss ratios

## Security & Privacy

### Data Privacy
- No personal data stored in the GeoIP database
- Only country-level information provided
- No tracking or logging of IP lookups by default

### Network Security
- Database downloaded from official MaxMind CDN
- File integrity validated with MD5 checksums
- Secure temporary file handling during updates
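
A checksum check followed by an atomic swap could look roughly like this. It is a sketch under stated assumptions: `install_database` is a hypothetical helper, and the expected digest is assumed to be obtained separately from a trusted source. MD5 is used only because the list above names it; `Digest::SHA256` could be substituted the same way.

```ruby
require "digest/md5"
require "fileutils"
require "tmpdir"

# Hypothetical install step: verify the download, then swap it into place.
def install_database(downloaded_path, target_path, expected_md5:)
  actual = Digest::MD5.file(downloaded_path).hexdigest
  raise ArgumentError, "checksum mismatch" unless actual == expected_md5.downcase

  FileUtils.mkdir_p(File.dirname(target_path))
  # Stage next to the target so the final rename stays on one filesystem.
  staging = "#{target_path}.tmp-#{Process.pid}"
  FileUtils.cp(downloaded_path, staging)
  File.rename(staging, target_path) # atomic replace on the same filesystem
  target_path
ensure
  FileUtils.rm_f(staging) if staging # remove leftovers if a step failed
end

Dir.mktmpdir do |dir|
  download = File.join(dir, "download.mmdb")
  File.binwrite(download, "placeholder bytes") # not a real database
  target = File.join(dir, "db", "geoip", "GeoLite2-Country.mmdb")
  install_database(download, target, expected_md5: Digest::MD5.file(download).hexdigest)
  puts File.exist?(target)
end
```

Renaming a fully written staging file means readers see either the old database or the new one, never a partially copied file.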

## API Reference

### GeoIpService

#### Class Methods
- `lookup_country(ip_address)` - Direct lookup
- `update_database!` - Force database update

#### Instance Methods
- `lookup_country(ip_address)` - Country lookup
- `lookup_countries(ip_addresses)` - Batch lookup
- `database_available?` - Check database status
- `database_info` - Get database metadata
- `update_from_remote!` - Download new database

### Model Methods

#### Event Model
- `enrich_geo_location!` - Update with country code
- `lookup_country` - Get country code (with fallback)
- `has_geo_data?` - Check if geo data exists
- `geo_location` - Get full geo location hash

#### Ipv4Range/Ipv6Range Models
- `geo_lookup_country!` - Update range with country code
- `geo_lookup_country` - Get country code (without update)
- `has_country_info?` - Check for existing country data
- `primary_country` - Get best available country code
- `lookup_country_by_ip(ip)` - Class method for IP lookup
- `enrich_missing_geo_data(limit:)` - Class method for batch enrichment

## Support & Resources

### MaxMind Documentation
- [MaxMind Developer Site](https://dev.maxmind.com/)
- [GeoLite2 Databases](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data)
- [Database Accuracy](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data#accuracy)

### Ruby Libraries
- [maxmind-db gem](https://github.com/maxmind/MaxMind-DB-Reader-ruby)
- [httparty gem](https://github.com/jnunemaker/httparty)

### Troubleshooting Resources
- Application logs: `log/production.log`
- Rails console for manual testing
- Database status via `rails geoip:status`