# MaxMind GeoIP Integration

This document describes the MaxMind GeoIP integration implemented in the Baffle Hub WAF analytics system.

## Overview

The Baffle Hub application uses MaxMind's free GeoLite2-Country database to provide geographic location information for IP addresses. The system automatically enriches WAF events with country codes and provides manual lookup capabilities for both IPv4 and IPv6 addresses.

## Features

- **On-demand lookup** - Country code lookup by IP address
- **Automatic enrichment** - Events are enriched with geo-location data during processing
- **Manual lookup capability** - Rake tasks and model methods for manual lookups
- **GeoLite2-Country database** - Uses MaxMind's free country-level database
- **Automatic updates** - Weekly background job updates the database
- **IPv4/IPv6 support** - Full protocol support for both IP versions
- **Performance optimized** - Database caching and efficient lookups
- **Graceful degradation** - Fallback handling when database is unavailable

## Architecture

### Core Components

#### 1. GeoIpService
- Central service for all IP geolocation operations
- Handles database loading from file system
- Provides batch lookup capabilities
- Manages database updates from MaxMind CDN
- Uses MaxMind's built-in metadata for version information

#### 2. UpdateGeoIpDatabaseJob
- Background job for automatic database updates
- Runs weekly to keep the database current
- Simple file-based validation and updates

#### 3. Enhanced Models
- **`Event` Model** - Automatic geo-location enrichment for WAF events
- **`Ipv4Range`/`Ipv6Range` Models** - Manual lookup methods for IP ranges

#### 4. File-System Management
- Database stored as a single file: `db/geoip/GeoLite2-Country.mmdb`
- Version information queried directly from MaxMind database metadata
- No database tables needed, which keeps the approach simple
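
The component responsibilities above can be illustrated with a minimal sketch. This is not the application's actual `GeoIpService`: the names `GeoLookupSketch` and `StubReader` are hypothetical, and the stub stands in for a `MaxMind::DB` reader so the example runs without the database file. It assumes only that the reader exposes `get(ip)` and returns a nested hash with string keys (or `nil`).

```ruby
require "ipaddr"

# Hypothetical sketch of a GeoIpService-style wrapper (not the real class).
# The reader is any object responding to #get(ip) and returning a Hash or nil.
class GeoLookupSketch
  def initialize(reader)
    @reader = reader
  end

  # Returns a two-letter ISO code, or nil for invalid or unknown addresses.
  def lookup_country(ip_address)
    return nil unless valid_ip?(ip_address)

    record = @reader.get(ip_address.to_s)
    record&.dig("country", "iso_code")
  end

  # Batch lookup: maps each input address to its country code (or nil).
  def lookup_countries(ip_addresses)
    ip_addresses.to_h { |ip| [ip, lookup_country(ip)] }
  end

  private

  # IPAddr parses both IPv4 and IPv6 and raises on malformed input.
  def valid_ip?(value)
    return false if value.to_s.empty?

    IPAddr.new(value.to_s)
    true
  rescue IPAddr::Error
    false
  end
end

# Stub reader standing in for the real database in this example.
class StubReader
  DATA = {
    "8.8.8.8" => { "country" => { "iso_code" => "US" } },
    "2001:4860:4860::8888" => { "country" => { "iso_code" => "US" } }
  }.freeze

  def get(ip)
    DATA[ip]
  end
end

service = GeoLookupSketch.new(StubReader.new)
service.lookup_country("8.8.8.8")                   # => "US"
service.lookup_countries(["8.8.8.8", "not-an-ip"])  # => {"8.8.8.8"=>"US", "not-an-ip"=>nil}
```

Injecting the reader keeps the lookup logic testable without a database file on disk; whether the real service is structured this way is an assumption.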

## Installation & Setup

### Dependencies
The integration uses the following gems:
- `maxmind-db` - Official MaxMind database reader (with built-in caching)
- `httparty` - HTTP client for database downloads

### Database Storage
- Location: `db/geoip/GeoLite2-Country.mmdb`
- Automatic creation of storage directory
- File validation and integrity checking
- Version information queried directly from database metadata
- No additional caching needed - MaxMind DB has its own internal caching

### Initial Setup
```bash
# Install dependencies
bundle install

# Download the GeoIP database
rails geoip:update

# Verify installation
rails geoip:status
```

## Configuration

The system is configurable via environment variables or application configuration:

| Variable | Default | Description |
|----------|---------|-------------|
| `MAXMIND_DATABASE_URL` | MaxMind CDN URL | Database download URL |
| `MAXMIND_AUTO_UPDATE` | `true` | Enable automatic weekly updates |
| `MAXMIND_UPDATE_INTERVAL_DAYS` | `7` | Days between update checks |
| `MAXMIND_MAX_AGE_DAYS` | `30` | Maximum database age before forced update |
| `MAXMIND_FALLBACK_COUNTRY` | `nil` | Fallback country when lookup fails |
| `MAXMIND_ENABLE_FALLBACK` | `false` | Enable fallback country usage |

Note: MaxMind DB has built-in caching, so no additional caching configuration is needed.

### Example Configuration
```bash
# .env file (or export these variables in the process environment)
MAXMIND_AUTO_UPDATE=true
MAXMIND_UPDATE_INTERVAL_DAYS=7
MAXMIND_MAX_AGE_DAYS=30
MAXMIND_FALLBACK_COUNTRY=US
MAXMIND_ENABLE_FALLBACK=true
# Note: No caching configuration needed - MaxMind has built-in caching
```
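
As a rough illustration of how these variables might be read on the Ruby side, the sketch below loads them from an ENV-like hash and applies the documented defaults. The `GeoIpSettings` struct and the `load_geoip_settings` and `truthy?` helpers are hypothetical names introduced for this example, not part of the application.

```ruby
# Hypothetical settings loader: reads the documented variables from an
# ENV-like hash and applies the defaults from the configuration table.
GeoIpSettings = Struct.new(
  :auto_update, :update_interval_days, :max_age_days,
  :fallback_country, :enable_fallback,
  keyword_init: true
)

# Treats common "on" spellings as true; anything else is false.
def truthy?(value)
  %w[1 true yes on].include?(value.to_s.strip.downcase)
end

def load_geoip_settings(env = ENV)
  GeoIpSettings.new(
    auto_update: truthy?(env.fetch("MAXMIND_AUTO_UPDATE", "true")),
    # Integer(..., 10) raises on malformed numbers instead of silently returning 0.
    update_interval_days: Integer(env.fetch("MAXMIND_UPDATE_INTERVAL_DAYS", "7"), 10),
    max_age_days: Integer(env.fetch("MAXMIND_MAX_AGE_DAYS", "30"), 10),
    fallback_country: env["MAXMIND_FALLBACK_COUNTRY"],
    enable_fallback: truthy?(env.fetch("MAXMIND_ENABLE_FALLBACK", "false"))
  )
end

settings = load_geoip_settings(
  "MAXMIND_FALLBACK_COUNTRY" => "US",
  "MAXMIND_ENABLE_FALLBACK" => "true"
)
settings.update_interval_days # => 7
settings.fallback_country     # => "US"
```

Parsing integers strictly is a design choice here: a typo such as `MAXMIND_MAX_AGE_DAYS=3O` fails at boot rather than producing a surprising update schedule.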

## Usage

### Rake Tasks

#### Database Management
```bash
# Download/update the GeoIP database
rails geoip:update

# Check database status and configuration
rails geoip:status

# Test the implementation with sample IPs
rails geoip:test

# Manual lookup for a specific IP
# (quoted so shells such as zsh do not treat the brackets as a glob pattern)
rails "geoip:lookup[8.8.8.8]"
rails "geoip:lookup[2001:4860:4860::8888]"
```

#### Data Management
```bash
# Enrich existing events missing country codes
rails geoip:enrich_missing

# Clean up old database files
rails geoip:cleanup
```

### Ruby API

#### Service-Level Lookups
```ruby
# Direct country lookup
country = GeoIpService.lookup_country('8.8.8.8')
# => "US"

# Batch lookup
countries = GeoIpService.new.lookup_countries(['8.8.8.8', '1.1.1.1'])
# => { "8.8.8.8" => "US", "1.1.1.1" => nil }

# Check database availability
service = GeoIpService.new
service.database_available?  # => true/false
service.database_info        # => Database metadata
```

#### Event Model Integration
```ruby
# Automatic enrichment during event processing
event = Event.find(123)
event.enrich_geo_location!  # Updates event with country code
event.lookup_country        # => "US" (with fallback to service)
event.has_geo_data?         # => true/false
event.geo_location          # => { country_code: "US", city: nil, ... }

# Batch enrichment of existing events
updated_count = Event.enrich_geo_location_batch
puts "Enriched #{updated_count} events with geo data"
```

#### IP Range Model Integration
```ruby
# IPv4 Range lookups
range = Ipv4Range.find(123)
range.geo_lookup_country!  # Updates range with country code
range.geo_lookup_country   # => "US" (without updating)
range.has_country_info?    # => true/false
range.primary_country      # => "US" (best available country)

# Class methods
country = Ipv4Range.lookup_country_by_ip('8.8.8.8')
updated_count = Ipv4Range.enrich_missing_geo_data(limit: 1000)

# IPv6 Range lookups (same interface)
country = Ipv6Range.lookup_country_by_ip('2001:4860:4860::8888')
updated_count = Ipv6Range.enrich_missing_geo_data(limit: 1000)
```

### Background Processing

#### Automatic Updates
The system automatically schedules database updates:
```ruby
# Manually trigger an update (usually scheduled automatically)
UpdateGeoIpDatabaseJob.perform_later

# Force update regardless of age
UpdateGeoIpDatabaseJob.perform_later(force_update: true)
```
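
How the job decides whether to act is not shown in this document. One plausible reading of the `force_update` flag together with `MAXMIND_UPDATE_INTERVAL_DAYS` and `MAXMIND_MAX_AGE_DAYS` is sketched below. The helper `geoip_update_action` and its return symbols are hypothetical, and using the file's modification time as the database age is an assumption (the real job may read the build date from the database metadata instead).

```ruby
SECONDS_PER_DAY = 86_400

# Hypothetical decision helper (not the job's real logic).
# Returns :download when a refresh is mandatory, :check when an update
# check is due, and :skip when the file is still fresh.
def geoip_update_action(path, interval_days: 7, max_age_days: 30,
                        force_update: false, now: Time.now)
  return :download if force_update || !File.exist?(path)

  # Assumption: file mtime approximates the age of the database.
  age_days = (now - File.mtime(path)) / SECONDS_PER_DAY
  return :download if age_days >= max_age_days
  return :check if age_days >= interval_days

  :skip
end

geoip_update_action("db/geoip/GeoLite2-Country.mmdb", force_update: true) # => :download
```

Passing `now:` explicitly makes the age thresholds easy to exercise in tests without waiting for real time to pass.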

#### Event Processing Integration
Geo-location enrichment is automatically included in WAF event processing:
```ruby
# This is called automatically in ProcessWafEventJob
event = Event.create_from_waf_payload!(event_id, payload, project)
event.enrich_geo_location! if event.ip_address.present? && event.country_code.blank?
```

## Database Information

### GeoLite2-Country Database
- **Source**: MaxMind GeoLite2-Country (free version)
- **Update Frequency**: Weekly (Tuesdays)
- **Size**: ~9.5 MB
- **Coverage**: Global IP-to-country mapping
- **Format**: MaxMind DB (.mmdb)

### Database Fields
- `country.iso_code` - Two-letter ISO country code
- Supports both IPv4 and IPv6 addresses
- Includes anonymous/proxy detection metadata
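
To make the `country.iso_code` path concrete, the sketch below walks a record shaped like the nested, string-keyed hash a MaxMind DB reader returns for a country lookup. The values are illustrative rather than taken from a real database, and `country_code_from` is a hypothetical helper name.

```ruby
# Illustrative record; a real lookup may contain more (or fewer) keys.
record = {
  "continent" => { "code" => "NA", "names" => { "en" => "North America" } },
  "country" => { "iso_code" => "US", "names" => { "en" => "United States" } }
}

# Hypothetical helper: extracts the two-letter code, tolerating missing keys.
def country_code_from(record)
  return nil unless record.is_a?(Hash)

  code = record.dig("country", "iso_code")
  code&.upcase
end

country_code_from(record)               # => "US"
country_code_from({ "continent" => {} }) # => nil
country_code_from(nil)                  # => nil
```

Using `dig` avoids a `NoMethodError` when an address resolves to a record without a `country` entry.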

## Performance Considerations

### Caching
- MaxMind DB has built-in internal caching optimized for lookups
- No additional caching layer needed

### Lookup Performance
- Typical lookup time: <1ms
- Database size optimized for fast lookups
- Efficient range queries for IP networks

### Memory Usage
- Database loaded into memory for fast access
- Approximate memory usage: 15-20 MB for the country database
- Automatic cleanup of old database files
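
Keeping the memory figure above stable generally means opening the database once per process rather than once per lookup. The sketch below shows one way to do that, reopening only when the file on disk changes (for example after a weekly update). `ReaderHolder` is a hypothetical class and the opener block is supplied by the caller, so nothing here reflects how the application actually manages its reader.

```ruby
# Hypothetical process-wide holder: opens the database once and reopens it
# only when the file's modification time changes.
class ReaderHolder
  def initialize(path, &opener)
    @path = path
    @opener = opener
    @mutex = Mutex.new
    @reader = nil
    @loaded_mtime = nil
  end

  # Returns the current reader, or nil when the database file is missing.
  def reader
    @mutex.synchronize do
      mtime = File.exist?(@path) ? File.mtime(@path) : nil
      if mtime.nil?
        release
      elsif @reader.nil? || mtime != @loaded_mtime
        release
        @reader = @opener.call(@path)
        @loaded_mtime = mtime
      end
      @reader
    end
  end

  private

  # Closes the previous reader if it supports #close, then forgets it.
  def release
    @reader.close if @reader.respond_to?(:close)
    @reader = nil
  end
end

# With the maxmind-db gem the opener could be, for example:
#   ReaderHolder.new(path) { |p| MaxMind::DB.new(p, mode: MaxMind::DB::MODE_MEMORY) }
holder = ReaderHolder.new("db/geoip/GeoLite2-Country.mmdb") { |p| "reader for #{p}" }
holder.reader
```

The mutex matters in threaded servers and job runners, where two threads could otherwise both decide to reopen the file at the same moment.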

## Error Handling

### Graceful Degradation
- Service returns `nil` when database unavailable
- Logging at appropriate levels for different error types
- Event processing continues even if geo-location fails

### Common Error Scenarios
1. **Database Missing** - Automatic download triggered
2. **Database Corrupted** - Automatic re-download attempted
3. **Network Issues** - Graceful fallback with error logging
4. **Invalid IP Address** - Returns `nil` with warning log
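
The degradation behaviour listed above could look roughly like the following. This is a sketch under assumptions rather than the service's code: `safe_country_lookup` is a hypothetical helper, the reader is any object with `get(ip)`, and it is assumed that a malformed address surfaces as an `ArgumentError` (the error classes raised by Ruby's `IPAddr` are subclasses of it).

```ruby
require "logger"

# Hypothetical wrapper that never raises: it returns a country code, the
# configured fallback, or nil, and logs at a level matching the failure.
def safe_country_lookup(reader, ip, logger:, fallback_country: nil)
  if reader.nil?
    logger.error("GeoIP database unavailable; skipping lookup for #{ip}")
    return fallback_country
  end

  record = reader.get(ip)
  record&.dig("country", "iso_code") || fallback_country
rescue ArgumentError => e
  # Invalid input is the caller's problem, so warn and return nil.
  logger.warn("GeoIP invalid IP address #{ip.inspect}: #{e.message}")
  nil
rescue StandardError => e
  # Anything else (corrupt file, I/O failure) is logged as an error.
  logger.error("GeoIP lookup failed for #{ip}: #{e.class}: #{e.message}")
  fallback_country
end

logger = Logger.new($stderr)
safe_country_lookup(nil, "8.8.8.8", logger: logger)                         # => nil
safe_country_lookup(nil, "8.8.8.8", logger: logger, fallback_country: "US") # => "US"
```

Because the helper swallows every `StandardError`, event processing can call it unconditionally and treat a `nil` result as "no geo data".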

## Troubleshooting

### Check System Status
```bash
# Verify database status
rails geoip:status

# Test with known IPs
rails geoip:test

# Check logs for errors
tail -f log/production.log | grep GeoIP
```

### Common Issues

#### Database Not Available
```bash
# Force database update
rails geoip:update

# Check file permissions
ls -la db/geoip/
```

#### Lookup Failures
```bash
# Test specific IPs (quoted so the shell does not expand the brackets)
rails "geoip:lookup[8.8.8.8]"

# Check database validity
rails runner "puts GeoIpService.new.database_available?"
```

#### Performance Issues
- There is no cache size to tune (no caching configuration exists); confirm the database is loaded once and reused across lookups
- Check memory usage on deployment server
- Monitor lookup times with application metrics
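
For the last point, a lightweight way to measure lookup times is to wrap the call in a monotonic-clock timer. The helper below is a hypothetical sketch: `timed_lookup` and its `threshold_ms` parameter are names invented for this example, and the block stands in for whatever lookup call is being measured.

```ruby
# Hypothetical timing helper: yields the IP to the given block, measures the
# elapsed time with a monotonic clock, and warns when it exceeds a threshold.
def timed_lookup(ip, threshold_ms: 1.0)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield(ip)
  elapsed_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000.0
  warn format("GeoIP slow lookup for %s: %.3f ms", ip, elapsed_ms) if elapsed_ms > threshold_ms
  [result, elapsed_ms]
end

# A hash stands in for the real lookup so the example runs on its own.
sample = { "8.8.8.8" => "US" }
country, elapsed_ms = timed_lookup("8.8.8.8") { |ip| sample[ip] }
```

In the application the block would presumably be `{ |ip| GeoIpService.lookup_country(ip) }`, with the elapsed value forwarded to whatever metrics system is in use.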

## Monitoring & Maintenance

### Health Checks
```ruby
# Rails console health check
service = GeoIpService.new
puts "Database available: #{service.database_available?}"
puts "Database age: #{service.database_record&.age_in_days} days"
```

### Scheduled Maintenance
- Database automatically updated weekly
- Old database files cleaned up after 7 days
- No manual maintenance required

### Monitoring Metrics
Consider monitoring:
- Database update success/failure rates
- Lookup performance (response times)
- Database age and freshness
- Cache hit/miss ratios

## Security & Privacy

### Data Privacy
- No personal data stored in the GeoIP database
- Only country-level information provided
- No tracking or logging of IP lookups by default

### Network Security
- Database downloaded from official MaxMind CDN
- File integrity validated with MD5 checksums
- Secure temporary file handling during updates
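
The checksum and temporary-file points can be sketched as a verify-then-swap step. This is an illustration under assumptions: `install_verified_database` is a hypothetical helper, and where the expected MD5 value comes from (for instance a checksum published alongside the download) is not specified in this document.

```ruby
require "digest/md5"
require "fileutils"

# Hypothetical install step: verify a downloaded file against an expected
# MD5 hex digest, then move it into place via a rename in the target directory.
def install_verified_database(downloaded_path, expected_md5, target_path)
  actual = Digest::MD5.file(downloaded_path).hexdigest
  unless actual.casecmp?(expected_md5.strip)
    File.delete(downloaded_path) # discard the bad download
    return false
  end

  FileUtils.mkdir_p(File.dirname(target_path))
  staging = "#{target_path}.tmp-#{Process.pid}"
  FileUtils.mv(downloaded_path, staging)
  # Renaming within one directory is atomic on POSIX filesystems, so readers
  # see either the old file or the new one, never a partial write.
  File.rename(staging, target_path)
  true
end
```

A call might look like `install_verified_database(tmp_path, published_md5, "db/geoip/GeoLite2-Country.mmdb")`, where `tmp_path` and `published_md5` are placeholders. Note that an MD5 check of this kind guards against truncated or corrupted downloads; it is not a defence against deliberate tampering.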

## API Reference

### GeoIpService

#### Class Methods
- `lookup_country(ip_address)` - Direct lookup
- `update_database!` - Force database update

#### Instance Methods
- `lookup_country(ip_address)` - Country lookup
- `lookup_countries(ip_addresses)` - Batch lookup
- `database_available?` - Check database status
- `database_info` - Get database metadata
- `update_from_remote!` - Download new database

### Model Methods

#### Event Model
- `enrich_geo_location!` - Update with country code
- `lookup_country` - Get country code (with fallback)
- `has_geo_data?` - Check if geo data exists
- `geo_location` - Get full geo location hash

#### Ipv4Range/Ipv6Range Models
- `geo_lookup_country!` - Update range with country code
- `geo_lookup_country` - Get country code (without update)
- `has_country_info?` - Check for existing country data
- `primary_country` - Get best available country code
- `lookup_country_by_ip(ip)` - Class method for IP lookup
- `enrich_missing_geo_data(limit:)` - Class method for batch enrichment

## Support & Resources

### MaxMind Documentation
- [MaxMind Developer Site](https://dev.maxmind.com/)
- [GeoLite2 Databases](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data)
- [Database Accuracy](https://dev.maxmind.com/geoip/geolite2-free-geolocation-data#accuracy)

### Ruby Libraries
- [maxmind-db gem](https://github.com/maxmind/MaxMind-DB-Reader-ruby)
- [httparty gem](https://github.com/jnunemaker/httparty)

### Troubleshooting Resources
- Application logs: `log/production.log`
- Rails console for manual testing
- Database status via `rails geoip:status`