Ignito

[ML System Design Tech Case Study Pulse #14] Billions of Personalized Recommendations in Real Time: How Instagram Scaling Actually Works

Behind the tech, with a detailed explanation and flow chart.

Naina Chaturvedi
Oct 22, 2025
∙ Paid

Table of Contents

  1. User Session Flow: What Happens When You Open Instagram

  2. Understanding Instagram

  3. The Multi-Stage Recommendation Pipeline

  4. Real-Time Serving Infrastructure

  5. Machine Learning Models and Training

  6. Content Understanding and Feature Engineering

  7. Distributed Systems Architecture


User Session Flow: What Happens When You Open Instagram

Before diving into the technical architecture, let’s follow Maria’s journey when she opens Instagram and navigates to the Explore tab. This seemingly simple interaction triggers one of the most sophisticated recommendation systems in the world.

The Complete User Experience Flow

┌─────────────────────────────────────────────────────────────┐
│                    USER SESSION JOURNEY                       │
├─────────────────────────────────────────────────────────────┤
│                                                               │
│   1. APP LAUNCH & AUTHENTICATION                             │
│   ┌──────────────┐                                          │
│   │ User: Maria  │ → Opens Instagram app                    │
│   │ Device: iPhone│ → User authenticated via token         │
│   │ Location: NYC │ → Session context established          │
│   └──────────────┘                                          │
│          ↓                                                   │
│                                                               │
│   2. EXPLORE TAB CLICKED (< 50ms)                           │
│   ┌──────────────────────────────────┐                      │
│   │ • Request sent to recommendation │                      │
│   │   service with user context      │                      │
│   │ • User ID: 12345678              │                      │
│   │ • Device info: iOS, timezone     │                      │
│   │ • Recent activity signals        │                      │
│   └──────────────────────────────────┘                      │
│          ↓                                                   │
│                                                               │
│   3. CANDIDATE GENERATION (< 100ms)                         │
│   ┌──────────────────────────────────┐                      │
│   │ • Query 50M+ posts from multiple │                      │
│   │   candidate sources               │                      │
│   │ • Following network: 5,000 posts │                      │
│   │ • Similar users: 10,000 posts    │                      │
│   │ • Trending content: 15,000 posts │                      │
│   │ • Topic-based: 20,000 posts      │                      │
│   │ Total candidates: 50,000 posts   │                      │
│   └──────────────────────────────────┘                      │
│          ↓                                                   │
│                                                               │
│   4. RANKING & SCORING (< 200ms)                            │
│   ┌──────────────────────────────────┐                      │
│   │ ML Models Process Each Post:     │                      │
│   │ • Engagement prediction: 0.85    │                      │
│   │ • Content quality score: 0.92    │                      │
│   │ • User interest match: 0.78      │                      │
│   │ • Diversity factor: 0.65         │                      │
│   │ → Final score: 0.84              │                      │
│   │ → Ranked list of 150 posts       │                      │
│   └──────────────────────────────────┘                      │
│          ↓                                                   │
│                                                               │
│   5. CONTENT DELIVERY (< 100ms)                             │
│   ┌──────────────────────────────────┐                      │
│   │ • Top 24 posts selected for grid │                      │
│   │ • Images/videos served from CDN  │                      │
│   │ • Prefetch next batch in background│                     │
│   │ • Analytics events logged        │                      │
│   └──────────────────────────────────┘                      │
│          ↓                                                   │
│                                                               │
│   6. USER INTERACTION                                        │
│   ┌──────────────────────────────────┐                      │
│   │ Maria sees personalized grid:    │                      │
│   │ • Travel photos (her interest)   │                      │
│   │ • Food content (recent searches) │                      │
│   │ • Art posts (friend’s activity)  │                      │
│   │ • Recipe videos (trending)       │                      │
│   │ [User starts browsing & engaging]│                      │
│   └──────────────────────────────────┘                      │
│                                                               │
│   Total Response Time: < 450ms ⚡                            │
└─────────────────────────────────────────────────────────────┘
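
The flow chart shows four per-post signals collapsing into a final score of 0.84, but it does not say how they are combined. A weighted sum is one common way to do this, so the minimal sketch below assumes that. The weights and the names `SIGNAL_WEIGHTS`, `STAGE_BUDGET_MS`, and `final_score` are hypothetical, chosen only so the chart's example numbers reproduce; they are not Instagram's actual values. The sketch also adds up the per-stage latency budgets from steps 2 through 5.

```python
from __future__ import annotations

# Assumption: signals are blended with a weighted sum. These weights are
# illustrative only and were picked to reproduce the chart's example.
SIGNAL_WEIGHTS = {
    "engagement": 0.40,
    "quality": 0.35,
    "interest_match": 0.15,
    "diversity": 0.10,
}

# Upper-bound latency budgets in milliseconds, taken from steps 2-5 of the chart.
STAGE_BUDGET_MS = {
    "explore_request": 50,
    "candidate_generation": 100,
    "ranking": 200,
    "content_delivery": 100,
}


def final_score(signals: dict[str, float]) -> float:
    """Blend per-post signals into a single score using a weighted sum."""
    return sum(SIGNAL_WEIGHTS[name] * value for name, value in signals.items())


def total_budget_ms() -> int:
    """Sum the per-stage budgets to get the end-to-end latency ceiling."""
    return sum(STAGE_BUDGET_MS.values())


# The example signal values shown in the chart's ranking step.
example_signals = {
    "engagement": 0.85,
    "quality": 0.92,
    "interest_match": 0.78,
    "diversity": 0.65,
}

print(round(final_score(example_signals), 2))
print(total_budget_ms())
```

With these assumed weights the blend rounds to 0.84, and the stage budgets sum to the 450 ms ceiling stated at the bottom of the chart.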

When Maria opens Instagram and taps the Explore tab, her device immediately sends a request containing her unique user ID, device information including iOS version and timezone, and recent activity signals from her current session. This request travels through Instagram’s global load balancers to the nearest data center, typically reaching servers within 50 milliseconds due to Meta’s extensive edge network infrastructure.
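
To make the shape of that request concrete, here is a minimal sketch of what such a payload could look like. The class name `ExploreRequest`, its field names, and the JSON encoding are all assumptions for illustration; the post does not describe Instagram's real wire format.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class ExploreRequest:
    """Hypothetical Explore request; field names are illustrative only."""

    user_id: int
    os_name: str
    timezone: str
    recent_activity: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the request context for the recommendation service."""
        return json.dumps(asdict(self), sort_keys=True)


# Example values loosely based on Maria's session in the flow chart.
request = ExploreRequest(
    user_id=12345678,
    os_name="iOS",
    timezone="America/New_York",
    recent_activity=["search:food", "like:travel"],
)
print(request.to_json())
```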

The recommendation service springs into action, first retrieving Maria’s user profile and recent interaction history from distributed caches. Simultaneously, multiple candidate generation systems begin pulling relevant content from different sources: posts from accounts Maria follows but hasn’t seen recently, content from users with similar interests identified through collaborative filtering, currently trending posts that match her historical preferences, and topic-based recommendations derived from her search history and engagement patterns.
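
The "simultaneously" part is what keeps this stage inside its latency budget. Below is a rough sketch of that fan-out using `asyncio`: every source is queried concurrently, slow or failing sources are skipped, and the results are merged with duplicates dropped. The four source functions, their post IDs, and the `generate_candidates` helper are hypothetical stand-ins, not Instagram's services.

```python
from __future__ import annotations

import asyncio


# Each coroutine stands in for a real retrieval service (hypothetical names
# and hard-coded post IDs); the short sleep simulates a network call.
async def from_following(user_id: int) -> list[int]:
    await asyncio.sleep(0.001)
    return [101, 102, 103]


async def from_similar_users(user_id: int) -> list[int]:
    await asyncio.sleep(0.001)
    return [103, 204, 205]


async def from_trending(user_id: int) -> list[int]:
    await asyncio.sleep(0.001)
    return [205, 306]


async def from_topics(user_id: int) -> list[int]:
    await asyncio.sleep(0.001)
    return [306, 407, 101]


SOURCES = (from_following, from_similar_users, from_trending, from_topics)


async def generate_candidates(user_id: int, timeout_s: float = 0.1) -> list[int]:
    """Query all sources concurrently and merge results without duplicates."""
    calls = [asyncio.wait_for(source(user_id), timeout_s) for source in SOURCES]
    results = await asyncio.gather(*calls, return_exceptions=True)

    merged: dict[int, None] = {}  # a dict keeps first-seen order
    for result in results:
        if isinstance(result, BaseException):
            continue  # a slow or failing source should not block the response
        for post_id in result:
            merged.setdefault(post_id, None)
    return list(merged)


candidates = asyncio.run(generate_candidates(12345678, timeout_s=5.0))
print(candidates)
```

The default `timeout_s` of 0.1 mirrors the 100 ms candidate generation budget from the flow chart. Dropping a timed-out source instead of failing the whole request is a design assumption here, trading a little recall for predictable latency.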

Within 100 milliseconds, these systems have identified approximately 50,000 candidate posts from Instagram’s corpus of billions of pieces of content. This wide funnel is tuned for high recall: the goal is to capture as much potentially relevant content as possible before applying more sophisticated, and more expensive, ranking models. Candidate generation relies heavily on approximate nearest neighbor search over embeddings, which lets the system quickly find similar content across multiple dimensions without scanning the full corpus.
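
The post does not say which approximate nearest neighbor method is used, so the sketch below picks one simple family, random-hyperplane locality-sensitive hashing, purely to illustrate the idea. Vectors separated by a small angle tend to fall on the same side of each random hyperplane, so they tend to share a bucket, and a query only scores the posts in its own bucket. The `HyperplaneLSH` class, the toy 4-dimensional embeddings, and the post IDs are all hypothetical.

```python
from __future__ import annotations

import math
import random
from collections import defaultdict


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HyperplaneLSH:
    """Toy random-hyperplane LSH index over post embeddings."""

    def __init__(self, dim: int, num_planes: int = 8, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)
        ]
        self.buckets: dict[tuple[bool, ...], list[str]] = defaultdict(list)
        self.vectors: dict[str, list[float]] = {}

    def _signature(self, vec: list[float]) -> tuple[bool, ...]:
        # One bit per hyperplane: which side of the plane the vector lies on.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in self.planes
        )

    def add(self, post_id: str, vec: list[float]) -> None:
        self.vectors[post_id] = vec
        self.buckets[self._signature(vec)].append(post_id)

    def query(self, vec: list[float], k: int = 3) -> list[str]:
        # Score only the posts in the matching bucket, not the whole index.
        bucket = self.buckets.get(self._signature(vec), [])
        ranked = sorted(
            bucket, key=lambda pid: cosine(vec, self.vectors[pid]), reverse=True
        )
        return ranked[:k]


index = HyperplaneLSH(dim=4)
index.add("travel_1", [0.9, 0.1, 0.0, 0.2])
index.add("travel_2", [0.8, 0.2, 0.1, 0.1])
index.add("food_1", [0.1, 0.9, 0.2, 0.0])
index.add("art_1", [0.0, 0.1, 0.9, 0.3])

print(index.query([0.9, 0.1, 0.0, 0.2], k=2))
```

The trade-off is the "approximate" part: a true neighbor that lands in a different bucket is missed. Real systems typically reduce that risk with multiple hash tables or graph-based indexes, at the cost of more memory and compute.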
