velour/docs/phases/phase_5.md
Dan Milne 88a906064f
Much base work started
2025-10-31 14:36:14 +11:00

Velour Phase 5: Audio Support (Music & Audiobooks)

Phase 5 extends Velour from a video library to a comprehensive media library by adding support for music and audiobooks. This builds upon the extensible MediaFile architecture established in Phase 1.

Technology Stack

Audio Processing Components

  • FFmpeg - Audio transcoding and metadata extraction (extends existing video processing)
  • Ruby Audio Gems - ID3 tag parsing, waveform generation
  • Active Storage - Album art and waveform visualization storage
  • MediaInfo - Comprehensive audio metadata extraction
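
As a rough illustration of the FFmpeg-based metadata extraction listed above, the sketch below parses the JSON that ffprobe (shipped with FFmpeg) can print for a file. The helper names `ffprobe_command` and `parse_probe`, the field mapping, and the sample JSON are assumptions made for illustration; they are not Velour's AudioMetadataExtractor.

```ruby
require "json"

# Hypothetical helper: argument list for an ffprobe call that prints
# container and stream details as JSON.
def ffprobe_command(path)
  ["ffprobe", "-v", "quiet", "-print_format", "json",
   "-show_format", "-show_streams", path]
end

# Hypothetical helper: maps ffprobe JSON onto the fields the Audio model stores.
def parse_probe(json)
  data   = JSON.parse(json)
  stream = data.fetch("streams", []).find { |s| s["codec_type"] == "audio" } || {}
  format = data.fetch("format", {})
  # Tag keys differ in case between containers, so normalise them.
  tags   = (format["tags"] || {}).transform_keys(&:downcase)

  {
    sample_rate:  stream["sample_rate"]&.to_i,
    channels:     stream["channels"],
    bit_rate:     format["bit_rate"] ? format["bit_rate"].to_i / 1000 : nil, # kbps
    artist:       tags["artist"],
    album:        tags["album"],
    track_number: tags["track"]&.to_i,            # "3/12" becomes 3
    genre:        tags["genre"],
    year:         tags["date"]&.slice(0, 4)&.to_i # "2019-05-01" becomes 2019
  }
end

# Made-up sample in the shape ffprobe produces.
SAMPLE_PROBE = <<~JSON
  {
    "streams": [
      { "codec_type": "audio", "sample_rate": "44100", "channels": 2 }
    ],
    "format": {
      "bit_rate": "320000",
      "tags": { "ARTIST": "Example Artist", "album": "Example Album",
                "track": "3/12", "genre": "Jazz", "date": "2019-05-01" }
    }
  }
JSON

metadata = parse_probe(SAMPLE_PROBE)
```

In a real extractor the JSON would come from running the command, for example with `Open3.capture2(*ffprobe_command(path))`, rather than from a literal string.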

Database Schema Extensions

Audio Model (inherits from MediaFile)

class Audio < MediaFile
  # Audio-specific associations
  has_many :audio_assets, dependent: :destroy  # album art, waveforms

  # Audio-specific metadata store
  store :audio_metadata, accessors: [:sample_rate, :channels, :artist, :album, :track_number, :genre, :year]

  # Audio-specific methods
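  # Buckets bit_rate (assumed to be stored in kbps) into a display label.
  # Ranges share endpoints so fractional rates such as 128.5 still match; the first match wins.
  # Treating anything above 320 as lossless is a heuristic, not a codec check.
  def quality_label
    return "Unknown" unless bit_rate
    case bit_rate
    when 0..128 then "128kbps"
    when 128..192 then "192kbps"
    when 192..256 then "256kbps"
    when 256..320 then "320kbps"
    else "Lossless"
    end
  end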

  def format_type
    return "Unknown" unless format
    case format.downcase
    when "mp3" then "MP3"
    when "flac" then "FLAC"
    when "wav" then "WAV"
    when "aac", "m4a" then "AAC"
    when "ogg" then "OGG Vorbis"
    else format.upcase
    end
  end
end

class AudioAsset < ApplicationRecord
  belongs_to :audio

  # Positional enum syntax; the keyword form (enum asset_type: {...}) was removed in Rails 8
  enum :asset_type, { album_art: 0, waveform: 1, lyrics: 2 }

  # Uses Active Storage for file storage
  has_one_attached :file
end

Extended Work Model

class Work < ApplicationRecord
  # Existing video associations
  has_many :videos, dependent: :destroy
  has_many :external_ids, dependent: :destroy

  # New audio associations
  has_many :audios, dependent: :destroy

  # Enhanced primary media selection
  def primary_media
    # Most recently created file across both media types
    (audios + videos).max_by(&:created_at)
  end

  def primary_video
    videos.order(created_at: :desc).first
  end

  def primary_audio
    audios.order(created_at: :desc).first
  end

  # Content type detection
  def video_content?
    videos.exists?
  end

  def audio_content?
    audios.exists?
  end

  def mixed_content?
    video_content? && audio_content?
  end
end

Audio Processing Pipeline

AudioProcessorJob

class AudioProcessorJob < ApplicationJob
  queue_as :processing

  def perform(audio_id)
    audio = Audio.find(audio_id)

    # Extract audio metadata
    AudioMetadataExtractor.new(audio).extract!

    # Generate album art if embedded
    AlbumArtExtractor.new(audio).extract!

    # Generate waveform visualization
    WaveformGenerator.new(audio).generate!

    # Check web compatibility and transcode if needed
    unless AudioTranscoder.new(audio).web_compatible?
      AudioTranscoderJob.perform_later(audio_id)
    end

    audio.update!(processed: true)
  rescue => e
    # audio is nil when Audio.find itself raised, so guard before recording the error
    audio&.update(processing_error: e.message)
    raise
  end
end

AudioTranscoderJob

class AudioTranscoderJob < ApplicationJob
  queue_as :transcoding

  def perform(audio_id)
    audio = Audio.find(audio_id)
    AudioTranscoder.new(audio).transcode_for_web!
  end
end
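
The web-compatibility check and the transcode step used by the two jobs above are not spelled out in this document. A minimal sketch of how they might look follows. The constant `WEB_COMPATIBLE_FORMATS`, both helper names, the choice of AAC in an M4A container, and the 192 kbps default are all assumptions, not the actual AudioTranscoder.

```ruby
# Assumption: a conservative set of formats that browsers play natively.
# Adjust this list for the browsers Velour needs to support.
WEB_COMPATIBLE_FORMATS = %w[mp3 aac m4a].freeze

# Hypothetical stand-in for AudioTranscoder#web_compatible?
def web_compatible?(format)
  WEB_COMPATIBLE_FORMATS.include?(format.to_s.downcase)
end

# Hypothetical helper: ffmpeg arguments for an AAC transcode.
#   -vn                   drop any video stream (embedded cover art)
#   -c:a aac / -b:a       audio codec and target bitrate
#   -movflags +faststart  move the index to the front so playback can start early
def transcode_command(input, output, bitrate_kbps: 192)
  ["ffmpeg", "-i", input, "-vn", "-c:a", "aac",
   "-b:a", "#{bitrate_kbps}k", "-movflags", "+faststart", output]
end

command = transcode_command("/media/in.wma", "/media/out.m4a")
```

Keeping the command as an array and running it with `system(*command)` avoids passing file names through a shell.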

File Discovery Extensions

Enhanced FileScannerService

class FileScannerService
  AUDIO_EXTENSIONS = %w[mp3 flac wav aac m4a ogg wma].freeze

  def scan_directory(storage_location)
    # Existing video scanning logic
    scan_videos(storage_location)

    # New audio scanning logic
    scan_audio(storage_location)
  end

  private

  def scan_audio(storage_location)
    AUDIO_EXTENSIONS.each do |ext|
      Dir.glob(File.join(storage_location.path, "**", "*.#{ext}")).each do |file_path|
        process_audio_file(file_path, storage_location)
      end
    end
  end

  def process_audio_file(file_path, storage_location)
    filename = File.basename(file_path)
    return if Audio.joins(:storage_location).exists?(filename: filename, storage_locations: { id: storage_location.id })

    # Create Work based on filename parsing (album/track structure)
    work = find_or_create_audio_work(filename, file_path)

    # Create Audio record and keep a reference so the job can be enqueued with its id
    audio = Audio.create!(
      work: work,
      storage_location: storage_location,
      filename: filename,
      xxhash64: calculate_xxhash64(file_path)
    )

    AudioProcessorJob.perform_later(audio.id)
  end
end
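
The scanner above relies on find_or_create_audio_work to turn a filename and path into album and track structure, but the parsing rules are not given. One possible approach is sketched below. It assumes a directory layout of `<Artist>/<Album>/<NN> - <Title>.<ext>`, and the names `TRACK_PATTERN` and `parse_audio_path` are hypothetical.

```ruby
# Leading track number, a separator, then the title: "03 - Sinnerman"
TRACK_PATTERN = /\A(?<track>\d{1,3})[\s._-]+(?<title>.+)\z/

# Hypothetical helper: derives artist, album, track number and title from a path.
def parse_audio_path(file_path)
  stem   = File.basename(file_path, ".*")                       # name without extension
  album  = File.basename(File.dirname(file_path))               # parent directory
  artist = File.basename(File.dirname(File.dirname(file_path))) # grandparent directory
  match  = TRACK_PATTERN.match(stem)

  {
    artist:       artist,
    album:        album,
    track_number: match && match[:track].to_i,
    title:        match ? match[:title].strip : stem
  }
end

parsed = parse_audio_path("/music/Nina Simone/Pastel Blues/03 - Sinnerman.mp3")
```

Embedded tags are usually more reliable than paths, so a real implementation would probably treat this as a fallback for files with missing metadata.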

User Interface Extensions

Audio Player Integration

  • Video.js Audio Plugin - Extend existing video player for audio
  • Waveform Visualization - Interactive seeking with waveform display
  • Chapter Support - Essential for audiobooks
  • Speed Control - Variable playback speed for audiobooks

Library Organization

  • Album View - Grid layout with album art
  • Artist Pages - Discography and album organization
  • Audiobook Progress - Chapter tracking and resume functionality
  • Mixed Media Collections - Works containing both video and audio content
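
Chapter tracking and resume for audiobooks comes down to mapping a saved playback position onto a chapter. The sketch below shows one way to do that lookup. The `Chapter` struct and `chapter_at` are hypothetical names, and the chapter list is assumed to be sorted by start time.

```ruby
Chapter = Struct.new(:title, :start_seconds)

# Returns the chapter containing `position` (in seconds), or nil when the
# position falls before the first chapter or the list is empty.
def chapter_at(chapters, position)
  # Binary search for the first chapter that starts after the position;
  # the chapter just before it is the one being played.
  index = chapters.bsearch_index { |chapter| chapter.start_seconds > position }
  return chapters.last if index.nil?
  index.zero? ? nil : chapters[index - 1]
end

chapters = [
  Chapter.new("Prologue", 0),
  Chapter.new("Chapter 1", 600),
  Chapter.new("Chapter 2", 1500)
]

current = chapter_at(chapters, 725)
```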

Audio-Specific Features

  • Playlist Creation - Custom playlists for music
  • Shuffle Play - Random playback for albums/artists
  • Gapless Playback - Seamless track transitions
  • Lyrics Display - Embedded or external lyrics support

Implementation Timeline

Phase 5A: Audio Foundation (Weeks 1-2)

  • Create Audio model inheriting from MediaFile
  • Implement AudioProcessorJob and audio metadata extraction
  • Extend FileScannerService for audio formats
  • Basic audio streaming endpoint

Phase 5B: Audio Processing (Week 3)

  • Album art extraction and storage
  • Waveform generation
  • Audio transcoding for web compatibility
  • Quality optimization and format conversion
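
Waveform generation typically means reducing decoded PCM samples to a small number of peak values that the player can draw. A minimal sketch of that reduction follows. The function name `waveform_peaks` is hypothetical, and the samples are assumed to be floats between -1.0 and 1.0, which is what FFmpeg produces when decoding to raw `f32le` output.

```ruby
# Reduces samples to at most `buckets` peak amplitudes, each rounded to 3 places.
def waveform_peaks(samples, buckets)
  return [] if samples.empty? || buckets <= 0

  # Number of samples summarised by each bucket, rounded up so the
  # result never exceeds the requested bucket count.
  slice_size = (samples.length / buckets.to_f).ceil
  samples.each_slice(slice_size).map { |slice| slice.map(&:abs).max.round(3) }
end

samples = [0.1, -0.5, 0.2, 0.9, -0.3, 0.0, 0.7, -0.8]
peaks = waveform_peaks(samples, 4)
# peaks is [0.5, 0.9, 0.3, 0.8]
```

Storing peaks as a compact numeric array rather than a rendered image would let the client draw the waveform at any size, though that is a design choice this document leaves open.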

Phase 5C: User Interface (Week 4)

  • Audio player component (extends Video.js)
  • Album and artist browsing interfaces
  • Audio library management views
  • Search and filtering for audio content

Phase 5D: Advanced Features (Week 5)

  • Chapter support for audiobooks
  • Playlist creation and management
  • Mixed media Works (video + audio)
  • Audio-specific user preferences

Migration Strategy

Database Migrations

# Extend videos table for STI (already done in Phase 1)
# Add audio-specific columns if needed
class AddAudioFeatures < ActiveRecord::Migration[8.1]
  def change
    create_table :audio_assets do |t|
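      # Audio is an STI subclass, so there is no audios table to reference.
      # Assumption: the shared MediaFile table is videos, as the comment above implies.
      t.references :audio, null: false, foreign_key: { to_table: :videos }
      t.integer :asset_type, null: false, default: 0  # backs the integer enum on AudioAsset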
      t.timestamps
    end

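    # Audio-specific indexes. Under STI these live on the shared table (assumed to be videos),
    # and only apply if artist/album are promoted from the audio_metadata store to real columns.
    add_index :videos, :artist if column_exists?(:videos, :artist)
    add_index :videos, :album if column_exists?(:videos, :album)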
  end
end

Backward Compatibility

  • All existing video functionality remains unchanged
  • Video URLs and routes continue to work identically
  • Database migrations are additive (the Phase 1 type column plus the new audio_assets table)
  • No breaking changes to existing API

Configuration

Environment Variables

# Audio Processing (extends existing video processing)
FFMPEG_PATH=/usr/bin/ffmpeg
AUDIO_TRANSCODE_QUALITY=high
MAX_AUDIO_TRANSCODE_SIZE_GB=10

# Audio Features
ENABLE_AUDIO_SCANNING=true
ENABLE_WAVEFORM_GENERATION=true
AUDIO_THUMBNAIL_SIZE=300x300
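
A sketch of how these variables might be read into typed settings follows. The `AudioConfig` struct, `TRUE_VALUES`, and `load_audio_config` are hypothetical names; the defaults simply mirror the values listed above.

```ruby
AudioConfig = Struct.new(
  :ffmpeg_path, :transcode_quality, :max_transcode_size_gb,
  :scanning_enabled, :waveforms_enabled, :thumbnail_size,
  keyword_init: true
)

# Strings treated as "on" for the ENABLE_* flags (an assumption).
TRUE_VALUES = %w[1 true yes on].freeze

# Accepts ENV or any Hash with string keys, which keeps it easy to test.
def load_audio_config(env = ENV)
  flag = lambda do |key, default|
    TRUE_VALUES.include?(env.fetch(key, default).to_s.strip.downcase)
  end
  # "300x300" becomes [300, 300]; Integer() raises on malformed input.
  width, height = env.fetch("AUDIO_THUMBNAIL_SIZE", "300x300").split("x").map { |n| Integer(n) }

  AudioConfig.new(
    ffmpeg_path:           env.fetch("FFMPEG_PATH", "/usr/bin/ffmpeg"),
    transcode_quality:     env.fetch("AUDIO_TRANSCODE_QUALITY", "high"),
    max_transcode_size_gb: Integer(env.fetch("MAX_AUDIO_TRANSCODE_SIZE_GB", "10")),
    scanning_enabled:      flag.call("ENABLE_AUDIO_SCANNING", "true"),
    waveforms_enabled:     flag.call("ENABLE_WAVEFORM_GENERATION", "true"),
    thumbnail_size:        [width, height]
  )
end

config = load_audio_config(
  "ENABLE_WAVEFORM_GENERATION" => "false",
  "AUDIO_THUMBNAIL_SIZE" => "600x400"
)
```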

Storage Considerations

  • Album art storage in Active Storage
  • Waveform images (generated per track)
  • Potential audio transcoding cache
  • Audio-specific metadata storage

Testing Strategy

Model Tests

  • Audio model validation and inheritance
  • Work model mixed content handling
  • Audio metadata extraction accuracy

Integration Tests

  • Audio processing pipeline end-to-end
  • Audio streaming with seeking support
  • File scanner audio discovery

System Tests

  • Audio player functionality
  • Album/artist interface navigation
  • Mixed media library browsing

Performance Considerations

Audio Processing

  • Parallel audio metadata extraction
  • Efficient album art extraction
  • Optimized waveform generation
  • Background transcoding queue management

Storage Optimization

  • Compressed waveform storage
  • Album art caching and optimization
  • Efficient audio streaming with range requests
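
Seeking within a stream depends on honouring the HTTP Range header. The sketch below parses a single byte range into a Ruby Range that a streaming endpoint could use to read just the requested slice. The function name `parse_byte_range` is hypothetical, and multi-range requests are deliberately not handled.

```ruby
# Parses "bytes=0-1023", "bytes=500-" or "bytes=-500" against a file size.
# Returns an inclusive Range of byte offsets, or nil if the header is
# missing, malformed, or cannot be satisfied.
def parse_byte_range(header, file_size)
  match = /\Abytes=(\d*)-(\d*)\z/.match(header.to_s)
  return nil unless match

  first, last = match[1], match[2]
  return nil if first.empty? && last.empty?

  if first.empty?
    # Suffix form: the final N bytes of the file.
    length = last.to_i
    return nil if length.zero?
    start  = [file_size - length, 0].max
    finish = file_size - 1
  else
    start  = first.to_i
    # An open-ended or oversized end is clamped to the last byte.
    finish = last.empty? ? file_size - 1 : [last.to_i, file_size - 1].min
  end

  return nil if start >= file_size || start > finish
  start..finish
end

range = parse_byte_range("bytes=4000-", 5000)
```

A non-nil result would normally be answered with status 206 and a `Content-Range: bytes <start>-<end>/<size>` header, while nil falls back to a full response or a 416 error.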

User Experience

  • Fast audio library browsing
  • Quick album art loading
  • Responsive audio player controls

Future Extensions

Phase 5+ Possibilities

  • Podcast Support - RSS feed integration and episode management
  • Radio Streaming - Internet radio station integration
  • Music Discovery - Similar artist recommendations
  • Audio Bookmarks - Detailed note-taking for audiobooks
  • Social Features - Sharing playlists and recommendations

This phase transforms Velour from a video library into a comprehensive personal media platform while maintaining the simplicity and robustness of the existing architecture.