[System Design Tech Case Study Pulse #44] Processing Crucial 1 Trillion Edges in Social Graph with 1ms Read Times: Magic Behind Facebook TAO and How It Works

With a detailed explanation and flow chart.

Dec 17, 2024


Hi All,

Facebook’s TAO (The Associations and Objects) is a geographically distributed data store optimized for the social graph. It lets Facebook serve more than 1 trillion edges (connections) with read times of about 1 ms.

TAO's architecture supports low-latency reads and high-throughput writes for millions of concurrent users, handling interactions such as "likes," "follows," and "friendships" across Facebook's social network. It does this with a caching layer, multi-tiered databases, and hierarchical replication.
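To make "objects" and "associations" concrete, here is a minimal in-memory sketch of the data model: objects are typed nodes (users, posts) and associations are typed, timestamped edges between them. The class `ToyGraph` and its methods `add_object`, `add_assoc`, `assoc_count`, and `assoc_range` are hypothetical names chosen for illustration. This is not TAO's actual API or storage layout.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    """Hypothetical in-memory stand-in for an objects-and-associations store."""

    # object id -> {"type": ..., plus arbitrary fields}
    objects: dict = field(default_factory=dict)
    # (source id, edge type) -> list of (timestamp, destination id), newest first
    assocs: dict = field(default_factory=dict)

    def add_object(self, obj_id, otype, **data):
        """Store a typed node such as a user or a post."""
        self.objects[obj_id] = {"type": otype, **data}

    def add_assoc(self, id1, atype, id2, time):
        """Store a typed, timestamped edge from id1 to id2."""
        edges = self.assocs.setdefault((id1, atype), [])
        edges.append((time, id2))
        edges.sort(reverse=True)  # keep the newest edge first

    def assoc_count(self, id1, atype):
        """Return how many edges of this type leave id1."""
        return len(self.assocs.get((id1, atype), []))

    def assoc_range(self, id1, atype, pos, limit):
        """Return up to `limit` destination ids, newest first, starting at `pos`."""
        edges = self.assocs.get((id1, atype), [])
        return [id2 for _, id2 in edges[pos:pos + limit]]


graph = ToyGraph()
graph.add_object(1, "user", name="alice")
graph.add_object(2, "user", name="bob")
graph.add_object(100, "post", text="hello")
graph.add_assoc(100, "liked_by", 1, time=10)
graph.add_assoc(100, "liked_by", 2, time=20)

like_count = graph.assoc_count(100, "liked_by")
latest_liker = graph.assoc_range(100, "liked_by", 0, 1)
```

Keeping each edge list ordered by time reflects a common social-graph query ("show the most recent likes first"). A production store would shard, persist, and cache these lists rather than hold them in one process.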


Let's explore how Facebook engineered this system to handle 1 trillion edges while maintaining 1ms read latencies, examining the key architectural decisions, caching strategies, and optimization techniques.

System Overview

Before diving into TAO's architecture, let's examine the key metrics that demonstrate its scale:

Graph edges served: 1+ trillion
Read latency: ~1ms average
QPS (Queries Per Second): Millions
Data size: Petabytes
Availability: 99.99%+
Global deployment: Multiple data centers
Types of objects: Users, posts, comments, likes, etc.
Edge types: Hundreds (friendships, follows, likes, etc.)
Write throughput: Millions per second


How it works (Tech in depth)

User Interaction:
- A user sends a request to view or update social graph data (e.g., viewing a friend’s profile or posting a comment), which is routed to the nearest data center for fast processing.
Local Cache Lookup:
- TAO’s caching layer first checks whether the requested data is available locally. On a cache hit, the data is served without a database call, resulting in a sub-1ms response.
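
The two steps above can be sketched as a read-through cache sitting in each region. This is a simplified illustration, not TAO's implementation: the names `ReadThroughCache`, `nearest_region`, and `load_from_db`, the region labels, and the latency figures are all hypothetical. The miss path shown here (load from the database, then populate the cache) is the standard read-through pattern and is an assumption, since the text above only describes the hit path.

```python
class ReadThroughCache:
    """Hypothetical per-region cache that falls back to a database loader."""

    def __init__(self, load_from_db):
        self._load = load_from_db  # called only on a cache miss
        self._data = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._data:
            # Cache hit: serve from memory, no database call.
            self.hits += 1
            return self._data[key]
        # Cache miss (assumed behavior): load once, then keep a local copy.
        self.misses += 1
        value = self._load(key)
        self._data[key] = value
        return value


def nearest_region(latency_ms_by_region):
    """Pick the region with the lowest measured latency (illustrative routing)."""
    return min(latency_ms_by_region, key=latency_ms_by_region.get)


# Stand-in for the backing database, with a log of every lookup it serves.
database = {("post:100", "liked_by"): [2, 1]}
db_calls = []


def load_from_db(key):
    db_calls.append(key)
    return database.get(key)


caches = {
    "us-east": ReadThroughCache(load_from_db),
    "eu-west": ReadThroughCache(load_from_db),
}

# Step 1: route the request to the nearest region.
region = nearest_region({"us-east": 12.0, "eu-west": 85.0})
cache = caches[region]

# Step 2: look up the data locally; the first read misses, the second hits.
first_read = cache.get(("post:100", "liked_by"))
second_read = cache.get(("post:100", "liked_by"))
```

After the first read warms the cache, repeated reads of the same key never reach the database, which is where the low read latency on a hit comes from.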
