⚙️ Configuration Schema

The pipeline is configured through a single config block, so you can run different experiments without changing the code: edit the config and rerun.


🔑 Main Parameters

📂 Dataset

  • ml_latest_small → MovieLens Latest Small
  • lfm360k → Last.fm 360K

🧩 Embeddings

  • textual → Text-only embeddings
  • visual → Visual-only embeddings
  • audio → Audio-only embeddings
  • fused_concat → Concatenate all modalities
  • fused_pca → PCA-based fusion (128-dim)
  • fused_cca → CCA-based fusion (64-dim)
  • fused_avg → Average embeddings
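To make the two simplest fusion options concrete, here is a minimal sketch of what `fused_concat` and `fused_avg` could look like. The function names, the alphabetical ordering of modalities, and the requirement that averaged vectors share one dimension are assumptions for illustration, not the pipeline's actual code. `fused_pca` and `fused_cca` need a linear algebra library and are left out.

```python
from typing import Dict, List

Vector = List[float]


def fuse_concat(modalities: Dict[str, Vector]) -> Vector:
    """fused_concat: join the modality vectors end to end."""
    fused: Vector = []
    # Sort by modality name so every item uses the same layout (assumed order).
    for name in sorted(modalities):
        fused.extend(modalities[name])
    return fused


def fuse_avg(modalities: Dict[str, Vector]) -> Vector:
    """fused_avg: element-wise mean across modalities."""
    vectors = list(modalities.values())
    if not vectors:
        raise ValueError("at least one modality is required")
    dim = len(vectors[0])
    # Averaging only makes sense when all vectors have the same length.
    if any(len(v) != dim for v in vectors):
        raise ValueError("fused_avg needs equal-length vectors")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical 2-dimensional embeddings for one item.
item = {
    "textual": [1.0, 0.0],
    "visual": [0.0, 1.0],
    "audio": [0.5, 0.5],
}

concat_vec = fuse_concat(item)  # length 6: audio, then textual, then visual
avg_vec = fuse_avg(item)        # length 2
```

Concatenation keeps every modality's signal but grows the vector, while averaging keeps the size fixed at the cost of blending the modalities together.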

🤖 LLM Model

  • openai → OpenAI Ada embeddings
  • sentence_transformer → MiniLM or similar
  • llama3 → HuggingFace LLaMA model

👤 User Vector Strategy

  • random → Random baseline (sanity check)
  • average → Average of liked item embeddings
  • temporal → Weighted by recency of interactions
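The three strategies can be sketched as follows. The docs only say that `temporal` weights interactions by recency, so the exponential decay with a `half_life` parameter used here is an assumption, as are all function names and the Gaussian draw for the random baseline.

```python
import random
from typing import List, Sequence, Tuple

Vector = List[float]


def random_user_vector(dim: int, seed: int) -> Vector:
    """random: a seeded random vector, useful only as a sanity-check baseline."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def average_user_vector(liked: Sequence[Vector]) -> Vector:
    """average: plain mean of the liked item embeddings."""
    dim = len(liked[0])
    return [sum(v[i] for v in liked) / len(liked) for i in range(dim)]


def temporal_user_vector(
    interactions: Sequence[Tuple[Vector, float]],
    now: float,
    half_life: float = 30.0,  # hypothetical parameter, same unit as timestamps
) -> Vector:
    """temporal: weighted mean where newer interactions count more.

    Assumed weighting: weight = 0.5 ** (age / half_life).
    """
    dim = len(interactions[0][0])
    weights = [0.5 ** ((now - ts) / half_life) for _, ts in interactions]
    total = sum(weights)
    return [
        sum(w * vec[i] for w, (vec, _) in zip(weights, interactions)) / total
        for i in range(dim)
    ]


# Two liked items: one seen 30 time units ago, one seen just now.
history = [([1.0, 0.0], 0.0), ([0.0, 1.0], 30.0)]

avg_profile = average_user_vector([vec for vec, _ in history])
recent_profile = temporal_user_vector(history, now=30.0, half_life=30.0)
```

With these inputs the average profile treats both items equally, while the temporal profile leans toward the more recent item because the older one carries half the weight.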

🔎 Retrieval

  • N → Number of nearest neighbors (default: 50)

🎬 Recommendation

  • K → Final top recommendations (default: 10)
  • explainable → true/false (whether to include reasoning with each recommendation)
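The relationship between `N`, `K`, and `explainable` can be sketched as a two-stage process: retrieve the `N` closest items, then keep the top `K`. Cosine similarity, the plain truncation from `N` to `K`, and the placeholder reason string are all assumptions here. The real pipeline may rerank candidates or generate its reasoning differently.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(user_vec: Vector, items: Dict[str, Vector], n: int = 50) -> List[Tuple[str, float]]:
    """Retrieval stage: the n items most similar to the user vector."""
    scored = [(item_id, cosine(user_vec, vec)) for item_id, vec in items.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def recommend(candidates: List[Tuple[str, float]], k: int = 10, explainable: bool = True) -> List[dict]:
    """Recommendation stage: keep the top k candidates, optionally with a reason."""
    results = []
    for item_id, score in candidates[:k]:
        rec = {"item": item_id, "score": round(score, 3)}
        if explainable:
            # Placeholder text; the actual reasoning format is not documented.
            rec["reason"] = f"similarity {score:.3f} to your profile"
        results.append(rec)
    return results


# Hypothetical catalogue of three items with 2-dimensional embeddings.
catalogue = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
candidates = retrieve([1.0, 0.0], catalogue, n=2)
top = recommend(candidates, k=1, explainable=True)
```

Under this reading `K` should not exceed `N`, since the final list is drawn from the retrieved candidates.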

⚡ Runtime Settings

  • use_gpu → Use GPU acceleration if available
  • seed → Random seed for reproducibility
  • batch_size → Batch size for embedding & retrieval
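As a rough sketch of how these three settings are typically applied, the helpers below seed the random number generator, pick a device, and split work into batches. The helper names are hypothetical, and the use of PyTorch for GPU detection is an assumption: the docs do not say which framework the pipeline uses.

```python
import random
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def seed_everything(seed: int) -> None:
    """Seed Python's RNG. Other libraries in use would need seeding as well."""
    random.seed(seed)


def pick_device(use_gpu: bool) -> str:
    """Return "cuda" only when a GPU is requested and actually available."""
    if not use_gpu:
        return "cpu"
    try:
        import torch  # assumed optional dependency, not confirmed by the docs
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def batches(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


seed_everything(42)
device = pick_device(use_gpu=False)
chunks = [list(chunk) for chunk in batches(list(range(5)), batch_size=2)]
```

Falling back to the CPU when no GPU is found matches the "if available" wording of `use_gpu`, so the same config can run on machines with and without a GPU.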

🛠 Example Config Block

dataset: ml_latest_small
embeddings: fused_pca
llm_model: sentence_transformer
user_vector: temporal
retrieval:
  N: 50
recommendation:
  K: 10
  explainable: true
runtime:
  use_gpu: true
  seed: 42
  batch_size: 64
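A config like this is easy to check before a run starts. The sketch below validates an already-parsed config (for example, the dict a YAML parser would return) against the values listed on this page and fills in the documented defaults for `N` and `K`. The `validate_config` function and the rule that `K` must not exceed `N` are assumptions for illustration, not part of the pipeline.

```python
from typing import Any, Dict

# Allowed values, copied from the parameter lists on this page.
ALLOWED = {
    "dataset": {"ml_latest_small", "lfm360k"},
    "embeddings": {
        "textual", "visual", "audio",
        "fused_concat", "fused_pca", "fused_cca", "fused_avg",
    },
    "llm_model": {"openai", "sentence_transformer", "llama3"},
    "user_vector": {"random", "average", "temporal"},
}


def validate_config(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Check enum-style keys and resolve N and K with their documented defaults."""
    for key, allowed in ALLOWED.items():
        value = cfg.get(key)
        if value not in allowed:
            raise ValueError(f"{key}={value!r} is not one of {sorted(allowed)}")

    n = cfg.get("retrieval", {}).get("N", 50)       # documented default: 50
    k = cfg.get("recommendation", {}).get("K", 10)  # documented default: 10
    # Assumed rule: the final list is drawn from the N retrieved candidates.
    if not (isinstance(n, int) and isinstance(k, int) and 0 < k <= n):
        raise ValueError(f"expected integers with 0 < K <= N, got K={k!r}, N={n!r}")

    resolved = dict(cfg)
    resolved["retrieval"] = {**cfg.get("retrieval", {}), "N": n}
    resolved["recommendation"] = {**cfg.get("recommendation", {}), "K": k}
    return resolved


example = {
    "dataset": "ml_latest_small",
    "embeddings": "fused_pca",
    "llm_model": "sentence_transformer",
    "user_vector": "temporal",
    "retrieval": {"N": 50},
    "recommendation": {"K": 10, "explainable": True},
    "runtime": {"use_gpu": True, "seed": 42, "batch_size": 64},
}
resolved = validate_config(example)
```

Failing fast on a misspelled value such as `fused_pcs` is cheaper than discovering it partway through an experiment.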

✅ Notes

  • Default values are set to match the benchmarks used in experiments.
  • Changing any config parameter requires no code edits: just update the block.
  • Because each experiment is defined entirely by its config block, variations are quick to set up and compare.