
πŸ‘£ Pipeline Walkthrough

1️⃣ Data Preparation and Ingestion

We start by loading data from MovieLens (small or 1M).

  • User–item interactions are cleaned and indexed.
  • Metadata like title, genres, and tags is merged in.
  • Missing information is handled gracefully (e.g., absent descriptions are filled in via LLM enrichment), which matters for cold-start and incomplete data.

πŸ‘‰ Example:

  • Input: Nixon (1995) β†’ Genres: Drama, Biography β†’ Description: (missing)
  • Output (after LLM enrichment): β€œNixon (1995) explores the troubled psyche and political career of America's 37th president...”
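
A minimal sketch of this step, assuming MovieLens-style `ratings.csv`/`movies.csv` files; `enrich_description` is a hypothetical stand-in for the LLM enrichment call:

```python
import pandas as pd

def enrich_description(title: str, genres: str) -> str:
    # Hypothetical stand-in: a real implementation would prompt an LLM
    # with the title and genres and return its generated summary.
    return f"{title} is a {genres.replace('|', ', ').lower()} film."

ratings = pd.read_csv("ratings.csv")   # userId, movieId, rating, timestamp
movies = pd.read_csv("movies.csv")     # movieId, title, genres

# Merge item metadata into the interaction table.
df = ratings.merge(movies, on="movieId", how="left")

# Fill in missing descriptions via (mock) LLM enrichment.
if "description" not in movies.columns:
    movies["description"] = None
missing = movies["description"].isna()
movies.loc[missing, "description"] = movies.loc[missing].apply(
    lambda row: enrich_description(row["title"], row["genres"]), axis=1
)
```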

We also tag each movie by popularity:

  • Head (top 10%)
  • Mid-tail (next 40%)
  • Long-tail (bottom 50%)
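
Continuing the sketch above, the buckets can be assigned from per-item interaction counts (the cut-offs mirror the 10% / 40% / 50% split):

```python
# Percentile rank of each movie by interaction count (0 = most popular).
counts = df.groupby("movieId")["userId"].count()
pct = counts.rank(pct=True, ascending=False)

def popularity_bucket(p: float) -> str:
    if p <= 0.10:
        return "head"        # top 10%
    if p <= 0.50:
        return "mid-tail"    # next 40%
    return "long-tail"       # bottom 50%

movies["popularity"] = movies["movieId"].map(pct.map(popularity_bucket))
```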

2️⃣ Multimodal Embedding Extraction

For each item, we generate embeddings (vector representations) from different sources:

  • Text β†’ From movie descriptions using models like OpenAI Ada, MiniLM, or LLaMA.
  • Visual → From trailer keyframes using ResNet-50 (2048-dim vectors).
  • Audio (optional) β†’ Extract MFCC features (128-dim).
  • Fusion → Combine modalities using methods like (see the sketch after the example below):
      • Concat (stack vectors)
      • PCA (reduce to 128-dim)
      • CCA (align text & visual spaces)
      • Avg (simple average)

πŸ‘‰ Example:

  • Text embedding: [... -0.31, 0.54, 1.02 ...]
  • Visual embedding: [... 0.11, -0.22, 0.91 ...]
  • CCA fusion β†’ joint 64-dim vector: [... 0.44, 0.08, -0.32 ...]
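
A minimal sketch of the four fusion options using scikit-learn; the matrices and dimensions here are illustrative placeholders, not the project's actual shapes:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
E_text = rng.normal(size=(1000, 384))    # illustrative text embeddings
E_vis = rng.normal(size=(1000, 2048))    # illustrative ResNet-50 features

# Concat: stack the vectors side by side.
E_concat = np.hstack([E_text, E_vis])                  # (1000, 2432)

# PCA: reduce the concatenated space to 128 dims.
E_pca = PCA(n_components=128).fit_transform(E_concat)  # (1000, 128)

# CCA: align text & visual spaces, then merge the projections.
T_c, V_c = CCA(n_components=64).fit_transform(E_text, E_vis)
E_cca = (T_c + V_c) / 2                                # (1000, 64) joint vector

# Avg: simple average (dims must match, so project visual down first).
E_vis_384 = PCA(n_components=384).fit_transform(E_vis)
E_avg = (E_text + E_vis_384) / 2                       # (1000, 384)
```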

3️⃣ Embedding Swap & Re-embedding

If movie info changes (e.g., after augmentation), we re-run the embedding step. ✅ Ensures every experiment uses up-to-date representations.


4️⃣ User Embedding Construction

We create user profiles in the same space as items. Options:

  • Random β†’ sanity check baseline.
  • Average β†’ mean of all liked items.
  • Temporal β†’ recent interactions get more weight.

πŸ‘‰ Example: User 42 watched Nixon (1995), The Post, Frost/Nixon.

  • More recent movies (e.g., The Post) count more in their profile.
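
A sketch of the Average and Temporal strategies, assuming `item_emb` maps movieId → vector and `history` is a list of (movieId, unix_timestamp) pairs; the 90-day half-life is an illustrative choice:

```python
import numpy as np

def average_profile(history, item_emb):
    # Mean of all liked items.
    vecs = np.stack([item_emb[mid] for mid, _ in history])
    return vecs.mean(axis=0)

def temporal_profile(history, item_emb, half_life_days=90.0):
    # Exponential decay: recent interactions get more weight.
    now = max(ts for _, ts in history)
    w = np.array([0.5 ** ((now - ts) / (86400 * half_life_days))
                  for _, ts in history])
    vecs = np.stack([item_emb[mid] for mid, _ in history])
    return (w[:, None] * vecs).sum(axis=0) / w.sum()
```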

5️⃣ Candidate Retrieval

With user embeddings ready:

  • We build a kNN index of all items.
  • For each user β†’ find top-N most similar items.

This step reflects all upstream choices (embedding model, fusion method, user strategy).
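
A minimal retrieval sketch with scikit-learn's brute-force kNN (cosine distance); `item_matrix`, `item_ids`, and `user_vec` are assumed to come from the earlier steps, and a production setup might swap in FAISS or Annoy:

```python
from sklearn.neighbors import NearestNeighbors

# item_matrix: (n_items, d) item embeddings; user_vec: (d,) user profile.
index = NearestNeighbors(n_neighbors=50, metric="cosine").fit(item_matrix)

_, nbr_idx = index.kneighbors(user_vec.reshape(1, -1))
candidates = [item_ids[i] for i in nbr_idx[0]]  # top-N candidate movieIds
```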


6️⃣ Profile Augmentation & LLM Prompting

We enhance user profiles with structured info:

  • Manual: Extracted from history (genres, top movies, tags).
  • LLM-based: A short natural-language taste summary generated by the LLM.

πŸ‘‰ Example Profile:

```json
{
  "Genres": ["Drama", "Biography"],
  "Top items": ["Nixon (1995)", "The Post"],
  "Taste": "Prefers political dramas exploring real historical events."
}
```

This profile + candidate movies β†’ passed to the LLM with instructions.
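
A sketch of how the prompt might be assembled; the instruction wording here is illustrative, not the project's actual template:

```python
import json

def build_prompt(profile: dict, candidates: list[dict]) -> str:
    # Format each candidate as "- [id] title" so the LLM can cite IDs back.
    lines = "\n".join(f"- [{c['id']}] {c['title']}" for c in candidates)
    return (
        "You are a movie recommender.\n\n"
        f"User profile:\n{json.dumps(profile, indent=2)}\n\n"
        f"Candidate movies:\n{lines}\n\n"
        "Rank the candidates for this user and return a JSON list of movie IDs."
    )
```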


7️⃣ LLM Re-ranking

The LLM receives:

  1. User profile
  2. Candidate movies
  3. Instructions

It outputs a ranked list.

  • ID-only mode β†’ Just movie IDs (privacy-friendly).
  • Explainable mode β†’ IDs + reasoning.

If parsing fails → fall back to the kNN list.
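
A sketch of that fallback logic, assuming the LLM was asked for a JSON list of IDs:

```python
import json
import re

def parse_ranking(llm_output: str, knn_candidates: list[int]) -> list[int]:
    valid = set(knn_candidates)
    try:
        match = re.search(r"\[.*?\]", llm_output, re.DOTALL)
        ranked = json.loads(match.group(0)) if match else []
        # Keep only known candidate IDs, preserving the LLM's order.
        ranked = [i for i in ranked if i in valid]
        if ranked:
            return ranked
    except (json.JSONDecodeError, TypeError):
        pass
    return knn_candidates  # fallback: original kNN order
```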


8️⃣ Evaluation & Logging

Every run is measured using:

  • Accuracy → Recall@K, nDCG@K, MAP, MRR.
  • Beyond-accuracy β†’ Coverage, novelty, diversity, long-tail %.
  • Fairness/robustness β†’ Cold-start, exposure balance.

Results are logged per user, averaged, and exported (CSV/Parquet). βœ… Intermediate artifacts (embeddings, candidates, logs) are checkpointed for reproducibility.
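
For reference, a minimal sketch of two of the accuracy metrics, where `ranked` is the recommended ID list and `relevant` a user's held-out positives:

```python
import math

def recall_at_k(ranked: list[int], relevant: set[int], k: int) -> float:
    # Fraction of the user's relevant items found in the top-k list.
    return len(set(ranked[:k]) & relevant) / len(relevant) if relevant else 0.0

def ndcg_at_k(ranked: list[int], relevant: set[int], k: int) -> float:
    # Binary-relevance nDCG: discounted hits over the ideal ordering.
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(ranked[:k]) if item in relevant)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / idcg if idcg else 0.0
```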