⚙️ Configuration Schema
The pipeline is configured through a single config block, so you can run different experiments without changing the code: edit the config and rerun.
🔑 Main Parameters
📂 Dataset
- ml_latest_small → MovieLens Latest Small
- lfm360k → Last.fm 360K
🧩 Embeddings
- textual → Text-only embeddings
- visual → Visual-only embeddings
- audio → Audio-only embeddings
- fused_concat → Concatenate all modalities
- fused_pca → PCA-based fusion (128-dim)
- fused_cca → CCA-based fusion (64-dim)
- fused_avg → Average embeddings
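As a rough illustration of the two simplest fused options, the sketch below implements concatenation and averaging in plain Python. The function names `fuse_concat` and `fuse_avg` are hypothetical, and the averaging variant assumes every modality vector has the same dimensionality, which the schema does not state. PCA- and CCA-based fusion are left out because they need a fitted projection.

```python
from typing import Dict, List

Vector = List[float]


def fuse_concat(modalities: Dict[str, Vector]) -> Vector:
    """Concatenate modality vectors in a fixed (sorted-by-name) order."""
    fused: Vector = []
    for name in sorted(modalities):
        fused.extend(modalities[name])
    return fused


def fuse_avg(modalities: Dict[str, Vector]) -> Vector:
    """Element-wise mean; assumes every modality has the same dimension."""
    vectors = list(modalities.values())
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("fuse_avg needs equal-length modality vectors")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Toy 2-dimensional embeddings for one item (made-up values).
item = {
    "textual": [1.0, 0.0],
    "visual": [0.0, 1.0],
    "audio": [0.5, 0.5],
}
print(fuse_concat(item))  # 6 values: audio, then textual, then visual
print(fuse_avg(item))     # 2 values: the per-dimension mean
```

Sorting by modality name keeps the concatenation order stable across runs, so the same index always refers to the same modality.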
🤖 LLM Model
- openai → OpenAI Ada embeddings
- sentence_transformer → MiniLM or similar
- llama3 → HuggingFace LLaMA model
👤 User Vector Strategy
- random → Random baseline (sanity check)
- average → Average of liked item embeddings
- temporal → Weighted by recency of interactions
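The three strategies can be sketched in a few lines of plain Python. This is only an illustration: the function names are hypothetical, and the exponential half-life weighting used for `temporal` is an assumption, since the schema only says that interactions are weighted by recency.

```python
import random
from typing import List, Sequence, Tuple

Vector = List[float]


def random_user_vector(dim: int, seed: int = 42) -> Vector:
    """Sanity-check baseline: a seeded Gaussian vector unrelated to the user."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def average_user_vector(liked: Sequence[Vector]) -> Vector:
    """Unweighted mean of the embeddings of liked items."""
    dim = len(liked[0])
    return [sum(v[i] for v in liked) / len(liked) for i in range(dim)]


def temporal_user_vector(
    liked: Sequence[Tuple[Vector, float]], now: float, half_life: float = 30.0
) -> Vector:
    """Weighted mean of (embedding, timestamp) pairs.

    Assumed scheme: a weight halves for every `half_life` units of age.
    """
    dim = len(liked[0][0])
    weights = [0.5 ** ((now - ts) / half_life) for _, ts in liked]
    total = sum(weights)
    return [
        sum(w * v[i] for w, (v, _) in zip(weights, liked)) / total
        for i in range(dim)
    ]


# Two liked items: an older one (t=0) and a recent one (t=30), with now=30.
history = [([1.0, 0.0], 0.0), ([0.0, 1.0], 30.0)]
print(average_user_vector([v for v, _ in history]))
print(temporal_user_vector(history, now=30.0))
```

With these toy inputs the temporal vector leans toward the more recent item, while the plain average treats both items equally.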
🔎 Retrieval
- N → Number of nearest neighbors (default: 50)
🎬 Recommendation
- K → Final top recommendations (default: 10)
- explainable → true/false (include reasoning or not)
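One way to read N and K together is as a two-stage process: retrieve the N nearest items, then keep the top K as final recommendations. The sketch below assumes cosine similarity as the distance measure and uses a similarity statement as a stand-in for the reasoning text; the function names and the `reason` field are hypothetical, and the actual pipeline may rank or explain differently.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    user_vec: Vector, item_vecs: Dict[str, Vector], n: int = 50
) -> List[Tuple[str, float]]:
    """Stage 1: the n items most similar to the user vector."""
    scored = [(item_id, cosine(user_vec, vec)) for item_id, vec in item_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def recommend(
    candidates: List[Tuple[str, float]], k: int = 10, explainable: bool = False
) -> List[dict]:
    """Stage 2: keep the top k candidates, optionally with a reason string."""
    results = []
    for item_id, score in candidates[:k]:
        rec = {"item": item_id, "score": score}
        if explainable:
            # Placeholder reasoning; a real explanation step is not specified here.
            rec["reason"] = f"cosine similarity {score:.2f} to the user vector"
        results.append(rec)
    return results


items = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
candidates = retrieve([1.0, 0.0], items, n=2)
print(recommend(candidates, k=1, explainable=True))
```

Because stage 2 only sees the retrieved candidates, K larger than N has no effect in this sketch; it simply returns all N.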
⚡ Runtime Settings
- use_gpu → Use GPU acceleration if available
- seed → Random seed for reproducibility
- batch_size → Batch size for embedding & retrieval
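A minimal sketch of how `seed` and `batch_size` might be applied, using only the standard library. The helper names `seed_everything` and `batched` are hypothetical. GPU selection is not shown because it depends on the deep learning framework in use.

```python
import random
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def seed_everything(seed: int) -> None:
    """Seed Python's built-in RNG.

    Libraries such as NumPy or PyTorch keep their own RNG state and would
    need to be seeded separately if the pipeline uses them.
    """
    random.seed(seed)


def batched(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


seed_everything(42)
item_ids = list(range(10))
for batch in batched(item_ids, batch_size=4):
    print(batch)  # the last batch is shorter when sizes do not divide evenly
```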
🛠 Example Config Block
dataset: ml_latest_small
embeddings: fused_pca
llm_model: sentence_transformer
user_vector: temporal
retrieval:
  N: 50
recommendation:
  K: 10
  explainable: true
runtime:
  use_gpu: true
  seed: 42
  batch_size: 64
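Since every experiment is driven by this block, it can help to validate it before a run. The sketch below checks a config that has already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`) against the values listed in this section. The names `ALLOWED` and `validate_config` are hypothetical, and the rule that K must not exceed N is an assumption rather than something the schema states.

```python
from typing import Any, Dict

# Allowed values copied from the parameter lists in this section.
ALLOWED = {
    "dataset": {"ml_latest_small", "lfm360k"},
    "embeddings": {
        "textual", "visual", "audio",
        "fused_concat", "fused_pca", "fused_cca", "fused_avg",
    },
    "llm_model": {"openai", "sentence_transformer", "llama3"},
    "user_vector": {"random", "average", "temporal"},
}


def _positive_int(value: Any, name: str) -> int:
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return value


def validate_config(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Check a parsed config and fill in the documented defaults for N and K."""
    for key, allowed in ALLOWED.items():
        if cfg.get(key) not in allowed:
            raise ValueError(f"{key}={cfg.get(key)!r} is not one of {sorted(allowed)}")

    retrieval = dict(cfg.get("retrieval") or {})
    recommendation = dict(cfg.get("recommendation") or {})
    retrieval["N"] = _positive_int(retrieval.get("N", 50), "retrieval.N")
    recommendation["K"] = _positive_int(recommendation.get("K", 10), "recommendation.K")

    # Assumption: the final list is drawn from the N retrieved neighbors.
    if recommendation["K"] > retrieval["N"]:
        raise ValueError("recommendation.K cannot exceed retrieval.N")

    return {**cfg, "retrieval": retrieval, "recommendation": recommendation}


example = {
    "dataset": "ml_latest_small",
    "embeddings": "fused_pca",
    "llm_model": "sentence_transformer",
    "user_vector": "temporal",
    "retrieval": {"N": 50},
    "recommendation": {"K": 10, "explainable": True},
    "runtime": {"use_gpu": True, "seed": 42, "batch_size": 64},
}
validated = validate_config(example)
print(validated["retrieval"], validated["recommendation"])
```

Failing fast on an unknown value such as a misspelled embedding name is cheaper than discovering it partway through an experiment.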
✅ Notes
- Default values are set to match the benchmarks used in experiments.
- Changing any config parameter requires no code edits; just update the block.
- This keeps the system flexible for rapid experimentation and easy to extend.