Ignito

[Most Asked: ML System Design Case Studies #28] What 3 Billion+ People See: Facebook's News Feed Algorithm : How it Actually Works

Uncovering crucial engineering insights and technical details.

Aug 21, 2025

Table of Contents

  1. User Session Flow

  2. Understanding the News Feed Problem

  3. The Machine Learning Revolution: From Chronological to Algorithmic

  4. The Neural Network Architecture

  5. Feature Engineering at Scale

  6. Engineering Implementation Details

  7. Engineering Insights & Conclusions

  8. TL;DR


User Session Flow: What Happens When You Open Facebook

The Complete User Journey

When Sarah opens Facebook on her phone during her morning commute, a multi-stage machine learning pipeline starts running. In under 200 milliseconds, Facebook's News Feed algorithm narrows tens of thousands of potential posts to a scored shortlist, predicts which ones Sarah is most likely to engage with, and delivers a personalized feed that feels crafted just for her. Here is the complete journey from app launch to scrolling through posts.

Diagram 1.1: Complete User Session Flow

┌─────────────────────────────────────────────────────────────┐
│                    USER SESSION FLOW                          │
├─────────────────────────────────────────────────────────────┤
│                                                               │
│   1. APP LAUNCH                                              │
│   ┌──────────────┐                                          │
│   │ User: Sarah  │ → Opens Facebook app at 8:15 AM          │
│   │ Location: NYC│ → Device: iPhone 15, iOS 17.1            │
│   │ Context: Commuting on subway                             │
│   └──────────────┘                                          │
│          ↓                                                   │
│                                                               │
│   2. AUTHENTICATION & CONTEXT (< 50ms)                      │
│   ┌──────────────────────────────────┐                      │
│   │ • User ID: 12345 authenticated    │                      │
│   │ • Device fingerprint captured     │                      │
│   │ • Location: Manhattan detected    │                      │
│   │ • Time: 8:15 AM EST recorded      │                      │
│   │ • Network: T-Mobile 5G identified │                      │
│   │ • Last active: 11:30 PM yesterday │                      │
│   └──────────────────────────────────┘                      │
│          ↓                                                   │
│                                                               │
│   3. CANDIDATE GENERATION (< 100ms)                         │
│   ┌──────────────────────────────────┐                      │
│   │ From 50,000 possible posts:       │                      │
│   │ • Friends' posts: 2,500           │                      │
│   │ • Pages followed: 1,200           │                      │
│   │ • Groups joined: 800              │                      │
│   │ • Ads eligible: 300               │                      │
│   │ → Filtered to top 10,000 candidates│                      │
│   └──────────────────────────────────┘                      │
│          ↓                                                   │
│                                                               │
│   4. MACHINE LEARNING SCORING (< 80ms)                      │
│   ┌──────────────────────────────────┐                      │
│   │ For each of 10,000 posts:         │                      │
│   │ • Extract 10,000+ features        │                      │
│   │ • Run multi-task neural network   │                      │
│   │ • Predict engagement probabilities │                      │
│   │ • Calculate inventory value        │                      │
│   │ → Ranked list of scored posts     │                      │
│   └──────────────────────────────────┘                      │
│          ↓                                                   │
│                                                               │
│   5. FINAL RANKING & FILTERING (< 30ms)                     │
│   ┌──────────────────────────────────┐                      │
│   │ Apply business logic:              │                      │
│   │ • Diversity constraints (max 2     │                      │
│   │   posts per friend consecutively)  │                      │
│   │ • Policy checks (hate speech, etc) │                      │
│   │ • Ad load optimization             │                      │
│   │ • Time decay adjustments           │                      │
│   │ → Final feed order determined      │                      │
│   └──────────────────────────────────┘                      │
│          ↓                                                   │
│                                                               │
│   6. CONTENT DELIVERY                                        │
│   ┌──────────────────────────────────┐                      │
│   │ First 5 posts rendered immediately│                      │
│   │ • Images preloaded from CDN       │                      │
│   │ • Video thumbnails optimized      │                      │
│   │ • Infinite scroll ready           │                      │
│   │ Total latency: 180ms ⚡            │                      │
│   │ [Sarah sees her personalized feed] │                      │
│   └──────────────────────────────────┘                      │
│                                                               │
│   Total Processing: < 200ms across 6 datacenters            │
└─────────────────────────────────────────────────────────────┘
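
The per-stage ceilings in Diagram 1.1 are worth adding up. The sketch below only does arithmetic on the numbers printed in the diagram; the function names are illustrative and not part of any Facebook system. It shows that the four stage ceilings sum to more than the stated 200 ms total, so the stages cannot all use their full budget back to back. One plausible reading, which the diagram does not confirm, is that the ceilings are upper bounds and that stages overlap or usually finish early.

```python
# Per-stage latency ceilings copied from Diagram 1.1, in milliseconds.
STAGE_BUDGET_MS = {
    "authentication_and_context": 50,
    "candidate_generation": 100,
    "ml_scoring": 80,
    "final_ranking_and_filtering": 30,
}
TOTAL_BUDGET_MS = 200      # "Total Processing: < 200ms"
REPORTED_LATENCY_MS = 180  # "Total latency: 180ms"


def worst_case_sequential_ms(budgets):
    """Latency if every stage ran back to back and used its full ceiling."""
    return sum(budgets.values())


def required_savings_ms(budgets, total_budget):
    """Time that overlap or early completion must save to meet the total."""
    return max(0, worst_case_sequential_ms(budgets) - total_budget)


worst_case = worst_case_sequential_ms(STAGE_BUDGET_MS)
print(worst_case)                                             # 260
print(required_savings_ms(STAGE_BUDGET_MS, TOTAL_BUDGET_MS))  # 60
print(worst_case - REPORTED_LATENCY_MS)                       # 80
```

In other words, hitting the reported 180 ms requires the pipeline to come in about 80 ms under the sum of its stage ceilings.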

Breaking Down Each Step in Detail

Step 1: App Launch and Context Capture

The moment Sarah taps the Facebook icon, the app begins collecting contextual signals that will influence her feed. The client captures her device type (iPhone 15), operating system version (iOS 17.1), current time (8:15 AM EST), and approximate location (Manhattan, NYC). It also notes environmental context: accelerometer data suggests she is in motion, likely commuting. This context matters because Facebook has learned that commuters prefer different content than people browsing at home, typically shorter, more engaging posts that can be consumed quickly.
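
The context signals from Step 1 can be pictured as a small record attached to the feed request. This is a minimal sketch, not Facebook's actual schema: the `SessionContext` class, its field names, the `prefers_short_content` helper, and the 7-10 AM rush-hour window are all assumptions made for illustration, and the calendar date is arbitrary. A production system would more likely feed these signals into a model as features than hard-code a rule.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class SessionContext:
    """Hypothetical container for the signals captured at app launch."""
    device_model: str
    os_version: str
    local_time: datetime
    approx_location: str
    in_motion: bool  # e.g. derived from accelerometer data


def prefers_short_content(ctx: SessionContext) -> bool:
    """Illustrative heuristic: a moving user at rush hour is a likely commuter."""
    is_rush_hour = 7 <= ctx.local_time.hour < 10  # assumed window
    return ctx.in_motion and is_rush_hour


sarah_ctx = SessionContext(
    device_model="iPhone 15",
    os_version="iOS 17.1",
    local_time=datetime(2025, 8, 21, 8, 15),  # 8:15 AM; the date is arbitrary
    approx_location="Manhattan, NYC",
    in_motion=True,
)
print(prefers_short_content(sarah_ctx))  # True
```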

Step 2: Authentication and User Profile Loading

Within 50 milliseconds, Facebook's authentication systems verify Sarah's identity and load her user profile from distributed databases. The profile includes her social graph (3,247 friends, 156 pages followed, 23 groups joined), her historical engagement patterns (on average she likes 15 posts per day, comments twice, and shares once), and her interests inferred from past behavior (technology, cooking, travel). The system also notes when she was last active (11:30 PM yesterday) to estimate how much new content has accumulated since her last session.
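
To make the Step 2 profile concrete, here is a minimal sketch that holds the figures quoted above and computes how long Sarah has been away. The `UserProfile` class, its field names, and the `time_away` helper are hypothetical; only the numbers come from the text, and the calendar dates are arbitrary stand-ins for "yesterday" and "today".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class UserProfile:
    """Hypothetical snapshot of the profile loaded after authentication."""
    user_id: int
    friend_count: int
    pages_followed: int
    groups_joined: int
    avg_daily_likes: float
    avg_daily_comments: float
    avg_daily_shares: float
    interests: tuple
    last_active: datetime


def time_away(profile: UserProfile, session_start: datetime) -> timedelta:
    """Gap since the previous session; a longer gap means more unseen posts."""
    return session_start - profile.last_active


sarah = UserProfile(
    user_id=12345,
    friend_count=3247,
    pages_followed=156,
    groups_joined=23,
    avg_daily_likes=15,
    avg_daily_comments=2,
    avg_daily_shares=1,
    interests=("technology", "cooking", "travel"),
    last_active=datetime(2025, 8, 20, 23, 30),  # 11:30 PM "yesterday"
)
gap = time_away(sarah, datetime(2025, 8, 21, 8, 15))  # 8:15 AM "today"
print(gap)                         # 8:45:00
print(gap.total_seconds() / 3600)  # 8.75
```

With these inputs the candidate generator in the next step only needs to look at content from roughly the last 8 hours and 45 minutes.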

Step 3: Candidate Generation - Narrowing 50,000 to 10,000

From the massive universe of potentially relevant content, Facebook's candidate generation system quickly filters down to a manageable set. Of Sarah's 3,247 friends, about 400 have posted something since her last session. Her 156 followed pages have published roughly 1,200 new posts, and the 23 groups she has joined have accumulated 800 new posts. There are also about 300 ads that target her demographic and interests. The system additionally considers viral content from the broader network, such as posts from friends-of-friends that are receiving high engagement. The diagram's itemized counts (2,500 + 1,200 + 800 + 300) total 4,800, so the remainder of the 50,000-post universe presumably comes from this broader-network content. Through filters for recency, basic relevance, and spam detection, the 50,000-post universe is narrowed to the top 10,000 candidates.
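
The funnel described in Step 3 can be sketched as a filter followed by a top-k selection. This is an illustration under stated assumptions, not Facebook's implementation: the `Post` fields, the cheap `relevance` score, the `max_age_hours` cutoff, and the `generate_candidates` function are all hypothetical. In the article's scenario, `limit` would be 10,000 and the input would hold about 50,000 posts.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    """Hypothetical candidate record."""
    post_id: int
    source: str       # "friend", "page", "group", "ad", or "network"
    age_hours: float
    is_spam: bool
    relevance: float  # cheap first-pass score, not the full ML model


def generate_candidates(posts, max_age_hours, limit):
    """Drop spam and stale posts, then keep the `limit` most relevant."""
    eligible = (
        p for p in posts
        if not p.is_spam and p.age_hours <= max_age_hours
    )
    # nlargest avoids fully sorting the pool when limit is much smaller.
    return heapq.nlargest(limit, eligible, key=lambda p: p.relevance)


posts = [
    Post(1, "friend", 2.0, False, 0.90),
    Post(2, "page", 30.0, False, 0.80),   # too old for a 24-hour cutoff
    Post(3, "group", 5.0, True, 0.95),    # flagged as spam
    Post(4, "ad", 1.0, False, 0.40),
    Post(5, "network", 3.0, False, 0.70),
]
top = generate_candidates(posts, max_age_hours=24.0, limit=2)
print([p.post_id for p in top])  # [1, 5]
```

Keeping this stage cheap is the point of the design: the expensive per-post scoring described in the diagram's step 4 then runs only on the survivors.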
