Velour Phase 5: Audio Support (Music & Audiobooks)
Phase 5 extends Velour from a video library to a comprehensive media library by adding support for music and audiobooks. This builds upon the extensible MediaFile architecture established in Phase 1.
Technology Stack
Audio Processing Components
- FFmpeg - Audio transcoding and metadata extraction (extends existing video processing)
- Ruby Audio Gems - ID3 tag parsing, waveform generation
- Active Storage - Album art and waveform visualization storage
- MediaInfo - Comprehensive audio metadata extraction
Database Schema Extensions
Audio Model (inherits from MediaFile)
class Audio < MediaFile
  # Audio-specific associations
  has_many :audio_assets, dependent: :destroy # album art, waveforms, lyrics

  # Audio-specific metadata store
  store :audio_metadata, accessors: [:sample_rate, :channels, :artist, :album, :track_number, :genre, :year]

  # Maps bit_rate (assumed to be in kbps) to a display label.
  # Anything above 320 is labelled "Lossless"; this is a bit-rate heuristic, not a codec check.
  def quality_label
    return "Unknown" unless bit_rate

    case bit_rate
    when 0..128 then "128kbps"
    when 129..192 then "192kbps"
    when 193..256 then "256kbps"
    when 257..320 then "320kbps"
    else "Lossless"
    end
  end

  # Human-readable name for the file format
  def format_type
    return "Unknown" unless format

    case format.downcase
    when "mp3" then "MP3"
    when "flac" then "FLAC"
    when "wav" then "WAV"
    when "aac", "m4a" then "AAC"
    when "ogg" then "OGG Vorbis"
    else format.upcase
    end
  end
end
class AudioAsset < ApplicationRecord
  belongs_to :audio

  # Integer-backed enum (positional syntax, required since Rails 8.0)
  enum :asset_type, { album_art: 0, waveform: 1, lyrics: 2 }

  # Uses Active Storage for file storage
  has_one_attached :file
end
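The quality_label method above infers "Lossless" from bit rate alone, so a high-bit-rate lossy file and a low-bit-rate FLAC could both be mislabelled. A minimal sketch of a format-aware variant follows. The AudioQuality module, its LOSSLESS_FORMATS list, and the assumption that bit rate is given in kbps are all illustrative and not part of the Velour codebase.

```ruby
# Hypothetical helper, shown only to illustrate a format-aware label.
module AudioQuality
  # Assumption: formats treated as lossless regardless of bit rate.
  LOSSLESS_FORMATS = %w[flac wav alac].freeze

  # bit_rate_kbps: Numeric or nil; format: String or nil
  def self.label(bit_rate_kbps, format)
    return "Lossless" if format && LOSSLESS_FORMATS.include?(format.downcase)
    return "Unknown" unless bit_rate_kbps

    # Same buckets as quality_label, but the top bucket stays lossy.
    if bit_rate_kbps <= 128 then "128kbps"
    elsif bit_rate_kbps <= 192 then "192kbps"
    elsif bit_rate_kbps <= 256 then "256kbps"
    else "320kbps"
    end
  end
end
```

Checking the format first means the bit-rate buckets only ever describe lossy files.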
Extended Work Model
class Work < ApplicationRecord
  # Existing video associations
  has_many :videos, dependent: :destroy
  has_many :external_ids, dependent: :destroy

  # New audio associations
  has_many :audios, dependent: :destroy

  # Enhanced primary media selection: the most recently created file, audio or video
  def primary_media
    (audios + videos).max_by(&:created_at)
  end

  def primary_video
    videos.order(created_at: :desc).first
  end

  def primary_audio
    audios.order(created_at: :desc).first
  end

  # Content type detection
  def video_content?
    videos.exists?
  end

  def audio_content?
    audios.exists?
  end

  def mixed_content?
    video_content? && audio_content?
  end
end
Audio Processing Pipeline
AudioProcessorJob
class AudioProcessorJob < ApplicationJob
  queue_as :processing

  def perform(audio_id)
    audio = Audio.find(audio_id)

    # Extract audio metadata
    AudioMetadataExtractor.new(audio).extract!

    # Extract embedded album art, if present
    AlbumArtExtractor.new(audio).extract!

    # Generate waveform visualization
    WaveformGenerator.new(audio).generate!

    # Check web compatibility and transcode if needed
    unless AudioTranscoder.new(audio).web_compatible?
      AudioTranscoderJob.perform_later(audio_id)
    end

    audio.update!(processed: true)
  rescue => e
    # audio is nil when Audio.find itself raised, so guard before recording the error
    audio&.update(processing_error: e.message)
    raise
  end
end
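AudioMetadataExtractor is referenced above but not defined in this document. One plausible approach is to run ffprobe with `-print_format json -show_format -show_streams` and map the result onto the accessors in the Audio metadata store. The sketch below covers only the mapping step. The FfprobeParser module and its return shape are assumptions for illustration, and ffprobe reports bit rates in bits per second, so a conversion would be needed if bit_rate is stored in kbps.

```ruby
require "json"

# Hypothetical helper: turns ffprobe JSON output into a metadata hash.
module FfprobeParser
  def self.parse(json)
    data   = JSON.parse(json)
    # Files with embedded cover art also carry a video stream, so pick the audio one.
    stream = (data["streams"] || []).find { |s| s["codec_type"] == "audio" } || {}
    format = data["format"] || {}
    # Tag key casing varies between containers, so normalise to lower case.
    tags   = (format["tags"] || {}).transform_keys(&:downcase)

    {
      sample_rate:  stream["sample_rate"]&.to_i,
      channels:     stream["channels"],
      bit_rate:     (stream["bit_rate"] || format["bit_rate"])&.to_i, # bits per second
      duration:     format["duration"]&.to_f,
      artist:       tags["artist"],
      album:        tags["album"],
      track_number: tags["track"]&.to_i, # "3/12" keeps the leading 3
      genre:        tags["genre"],
      year:         tags["date"]&.to_i   # "2019-05-01" keeps the leading 2019
    }
  end
end
```

Keeping the parsing separate from the ffprobe invocation lets the mapping be tested against captured JSON without needing the binary installed.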
AudioTranscoderJob
class AudioTranscoderJob < ApplicationJob
  queue_as :transcoding

  def perform(audio_id)
    audio = Audio.find(audio_id)
    AudioTranscoder.new(audio).transcode_for_web!
  end
end
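The AudioTranscoder class behind web_compatible? and transcode_for_web! is not shown in this document. The following is a rough sketch of the two decisions it has to make: whether a format needs transcoding, and which FFmpeg arguments to use. The WebAudio module, the list of web-friendly formats, and the AAC target at 192k are assumptions chosen for illustration.

```ruby
# Hypothetical helper illustrating the transcode decision and command.
module WebAudio
  # Assumption: a conservative set of formats served to browsers as-is.
  WEB_FORMATS = %w[mp3 aac m4a].freeze

  def self.web_compatible?(format)
    WEB_FORMATS.include?(format.to_s.downcase)
  end

  # Returns an argv array so the command can be run without shell interpolation.
  # -y overwrites the output, -vn drops any cover-art video stream,
  # -c:a / -b:a select the audio codec and bit rate.
  def self.transcode_command(input, output, ffmpeg: "ffmpeg", bitrate: "192k")
    [ffmpeg, "-y", "-i", input, "-vn", "-c:a", "aac", "-b:a", bitrate, output]
  end
end
```

The array could be passed to `system(*command)`, with the `ffmpeg:` argument taken from the FFMPEG_PATH setting described under Configuration.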
File Discovery Extensions
Enhanced FileScannerService
class FileScannerService
  AUDIO_EXTENSIONS = %w[mp3 flac wav aac m4a ogg wma].freeze

  def scan_directory(storage_location)
    # Existing video scanning logic
    scan_videos(storage_location)

    # New audio scanning logic
    scan_audio(storage_location)
  end

  private

  def scan_audio(storage_location)
    AUDIO_EXTENSIONS.each do |ext|
      pattern = File.join(storage_location.path, "**", "*.#{ext}")
      # FNM_CASEFOLD also matches upper-case extensions such as .MP3
      Dir.glob(pattern, File::FNM_CASEFOLD).each do |file_path|
        process_audio_file(file_path, storage_location)
      end
    end
  end

  def process_audio_file(file_path, storage_location)
    filename = File.basename(file_path)

    # Skip files already indexed for this storage location (matched by filename only)
    return if Audio.joins(:storage_location).exists?(filename: filename, storage_locations: { id: storage_location.id })

    # Create Work based on filename parsing (album/track structure)
    work = find_or_create_audio_work(filename, file_path)

    # Create Audio record
    audio = Audio.create!(
      work: work,
      storage_location: storage_location,
      filename: filename,
      xxhash64: calculate_xxhash64(file_path)
    )

    AudioProcessorJob.perform_later(audio.id)
  end
end
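find_or_create_audio_work depends on parsing an album/track structure out of the path, which the scanner above leaves undefined. A minimal sketch of that parsing step follows, assuming a library laid out as "Artist/Album/NN - Title.ext". The AudioPathParser module and that directory convention are illustrative assumptions; real libraries vary widely, and embedded tags may be a better primary source.

```ruby
# Hypothetical parser for an "Artist/Album/NN - Title.ext" layout.
module AudioPathParser
  # Leading track number, a separator (-, . or _), then the title.
  TRACK_PATTERN = /\A(?<number>\d+)\s*[-._]\s*(?<title>.+)\z/

  def self.parse(file_path)
    parts = file_path.split(File::SEPARATOR).reject(&:empty?)
    stem  = File.basename(file_path, ".*") # filename without extension
    match = TRACK_PATTERN.match(stem)

    {
      artist:       parts.length >= 3 ? parts[-3] : nil,
      album:        parts.length >= 2 ? parts[-2] : nil,
      track_number: match && match[:number].to_i,
      title:        match ? match[:title].strip : stem
    }
  end
end
```

When the filename has no recognisable track prefix, the whole stem is used as the title and track_number is nil, so the caller can fall back to tag metadata.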
User Interface Extensions
Audio Player Integration
- Video.js Audio Plugin - Extend existing video player for audio
- Waveform Visualization - Interactive seeking with waveform display
- Chapter Support - Essential for audiobooks
- Speed Control - Variable playback speed for audiobooks
Library Organization
- Album View - Grid layout with album art
- Artist Pages - Discography and album organization
- Audiobook Progress - Chapter tracking and resume functionality
- Mixed Media Collections - Works containing both video and audio content
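Audiobook resume needs a way to map a saved playback position back to a chapter. The sketch below shows one way to do that lookup with a binary search over chapter start times. The Audiobook module and Chapter struct are hypothetical, since this document does not define how chapters are stored, and the chapter list is assumed to be sorted by start time.

```ruby
# Hypothetical structures for illustrating chapter lookup.
module Audiobook
  Chapter = Struct.new(:title, :start_seconds)

  # Returns the chapter containing `position` (in seconds), or nil when the
  # list is empty or the position precedes the first chapter.
  # Assumes `chapters` is sorted by start_seconds.
  def self.chapter_at(chapters, position)
    return nil if chapters.empty?

    # Index of the first chapter that starts after the position.
    next_index = chapters.bsearch_index { |c| c.start_seconds > position }
    return chapters.last if next_index.nil?

    next_index.zero? ? nil : chapters[next_index - 1]
  end
end
```

A chapter that starts exactly at the saved position counts as the current chapter, which matches how a player would resume at a chapter boundary.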
Audio-Specific Features
- Playlist Creation - Custom playlists for music
- Shuffle Play - Random playback for albums/artists
- Gapless Playback - Seamless track transitions
- Lyrics Display - Embedded or external lyrics support
Implementation Timeline
Phase 5A: Audio Foundation (Weeks 1-2)
- Create Audio model inheriting from MediaFile
- Implement AudioProcessorJob and audio metadata extraction
- Extend FileScannerService for audio formats
- Basic audio streaming endpoint
Phase 5B: Audio Processing (Week 3)
- Album art extraction and storage
- Waveform generation
- Audio transcoding for web compatibility
- Quality optimization and format conversion
Phase 5C: User Interface (Week 4)
- Audio player component (extends Video.js)
- Album and artist browsing interfaces
- Audio library management views
- Search and filtering for audio content
Phase 5D: Advanced Features (Week 5)
- Chapter support for audiobooks
- Playlist creation and management
- Mixed media Works (video + audio)
- Audio-specific user preferences
Migration Strategy
Database Migrations
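# Extend videos table for STI (already done in Phase 1)
# Add audio-specific columns if needed
class AddAudioFeatures < ActiveRecord::Migration[8.1]
  def change
    create_table :audio_assets do |t|
      # Audio is an STI subclass, so audio_id must reference the table that stores MediaFile rows.
      # Assumption: that table is media_files; adjust to_table to match the Phase 1 schema.
      t.references :audio, null: false, foreign_key: { to_table: :media_files }
      # Integer column backing the AudioAsset asset_type enum
      t.integer :asset_type, null: false
      t.timestamps
    end

    # Audio-specific indexes (only apply if artist/album are promoted to real columns;
    # same media_files assumption as above)
    add_index :media_files, :artist if column_exists?(:media_files, :artist)
    add_index :media_files, :album if column_exists?(:media_files, :album)
  end
end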
Backward Compatibility
- All existing video functionality remains unchanged
- Video URLs and routes continue to work identically
- Database migrations are additive (the STI type column from Phase 1 and the new audio_assets table)
- No breaking changes to existing API
Configuration
Environment Variables
# Audio Processing (extends existing video processing)
FFMPEG_PATH=/usr/bin/ffmpeg
AUDIO_TRANSCODE_QUALITY=high
MAX_AUDIO_TRANSCODE_SIZE_GB=10
# Audio Features
ENABLE_AUDIO_SCANNING=true
ENABLE_WAVEFORM_GENERATION=true
AUDIO_THUMBNAIL_SIZE=300x300
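The variables above arrive as strings, so the application needs to coerce them into booleans, integers, and dimensions. A minimal sketch of that coercion follows. The AudioConfig module is hypothetical, and using the values listed above as defaults is an assumption rather than documented behaviour.

```ruby
# Hypothetical loader for the audio-related environment variables.
module AudioConfig
  TRUE_VALUES = %w[1 true yes on].freeze

  # Accepts any Hash-like object responding to fetch, so tests can pass a plain Hash.
  def self.load(env = ENV)
    # "300x300" becomes [300, 300]; base 10 avoids octal parsing of "010".
    width, height = env.fetch("AUDIO_THUMBNAIL_SIZE", "300x300")
                       .split("x", 2).map { |n| Integer(n, 10) }

    {
      ffmpeg_path:           env.fetch("FFMPEG_PATH", "ffmpeg"),
      transcode_quality:     env.fetch("AUDIO_TRANSCODE_QUALITY", "high"),
      max_transcode_size_gb: Integer(env.fetch("MAX_AUDIO_TRANSCODE_SIZE_GB", "10"), 10),
      scanning_enabled:      truthy?(env.fetch("ENABLE_AUDIO_SCANNING", "true")),
      waveforms_enabled:     truthy?(env.fetch("ENABLE_WAVEFORM_GENERATION", "true")),
      thumbnail_size:        [width, height]
    }
  end

  def self.truthy?(value)
    TRUE_VALUES.include?(value.to_s.strip.downcase)
  end
end
```

Malformed numbers raise ArgumentError from Integer, which surfaces configuration mistakes at boot rather than during a processing job.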
Storage Considerations
- Album art storage in Active Storage
- Waveform images (generated per track)
- Potential audio transcoding cache
- Audio-specific metadata storage
Testing Strategy
Model Tests
- Audio model validation and inheritance
- Work model mixed content handling
- Audio metadata extraction accuracy
Integration Tests
- Audio processing pipeline end-to-end
- Audio streaming with seeking support
- File scanner audio discovery
System Tests
- Audio player functionality
- Album/artist interface navigation
- Mixed media library browsing
Performance Considerations
Audio Processing
- Parallel audio metadata extraction
- Efficient album art extraction
- Optimized waveform generation
- Background transcoding queue management
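Waveform generation is cheaper when the visualization is built from a small set of peak values rather than every sample. A minimal sketch of that reduction step is shown below, assuming the audio has already been decoded to an array of floats between -1.0 and 1.0. The Waveform module is hypothetical and the decoding step (for example via FFmpeg) is out of scope here.

```ruby
# Hypothetical peak reduction for waveform display.
module Waveform
  # Splits samples into roughly `buckets` equal slices and keeps the largest
  # absolute amplitude of each. Returns at most `buckets` values.
  def self.peaks(samples, buckets)
    raise ArgumentError, "buckets must be positive" unless buckets.positive?
    return [] if samples.empty?

    slice_size = (samples.length.to_f / buckets).ceil
    samples.each_slice(slice_size).map { |slice| slice.map(&:abs).max }
  end
end
```

Storing a few hundred peaks per track instead of a rendered image is one way to keep waveform data compact, since the client can draw it at any size.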
Storage Optimization
- Compressed waveform storage
- Album art caching and optimization
- Efficient audio streaming with range requests
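Seeking in a streamed track relies on HTTP range requests, where the player sends a header such as `Range: bytes=500-` and the server answers with status 206 and only that slice. The web server or Rack layer often handles this already. The sketch below only illustrates the parsing logic for a single range, and the ByteRange module is a hypothetical name.

```ruby
# Hypothetical parser for a single-range HTTP Range header.
module ByteRange
  PATTERN = /\Abytes=(\d*)-(\d*)\z/

  # Returns an inclusive Range of byte offsets, or nil when the header is
  # missing, malformed, multi-range, or cannot be satisfied for file_size.
  def self.parse(header, file_size)
    match = PATTERN.match(header.to_s.strip)
    return nil unless match

    first, last = match[1], match[2]
    return nil if first.empty? && last.empty?

    if first.empty?
      # Suffix form "bytes=-N": the final N bytes of the file.
      length = last.to_i
      return nil if length.zero?
      start  = [file_size - length, 0].max
      finish = file_size - 1
    else
      start  = first.to_i
      # An end offset past the file is clamped to the last byte.
      finish = last.empty? ? file_size - 1 : [last.to_i, file_size - 1].min
    end

    return nil if start >= file_size || start > finish
    start..finish
  end
end
```

A nil result would typically map to serving the full file, or to a 416 response when a Range header was present but could not be satisfied.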
User Experience
- Fast audio library browsing
- Quick album art loading
- Responsive audio player controls
Future Extensions
Phase 5+ Possibilities
- Podcast Support - RSS feed integration and episode management
- Radio Streaming - Internet radio station integration
- Music Discovery - Similar artist recommendations
- Audio Bookmarks - Detailed note-taking for audiobooks
- Social Features - Sharing playlists and recommendations
This phase transforms Velour from a video library into a comprehensive personal media platform while maintaining the simplicity and robustness of the existing architecture.