
Latency Numbers Worth Knowing

Came across this classic post about computer performance numbers. Really puts things in perspective.

The Numbers

Here are the key ones (approximate, 2024 hardware):

L1 cache reference            0.5 ns
Branch mispredict             5 ns
L2 cache reference            7 ns
Mutex lock/unlock             100 ns
Main memory reference         100 ns
Compress 1KB with Snappy      10 µs (10,000 ns)
Send 2KB over 1 Gbps network  20 µs
Read 1MB from memory          250 µs
SSD random read               150 µs
Read 1MB from SSD             1 ms
Disk seek (HDD)               10 ms
Read 1MB from network         10 ms
Send packet CA→EU→CA          150 ms

What This Means

Cache is Fast

  • L1 cache: 0.5ns
  • Memory: 100ns (200x slower)
  • SSD: 150µs (300,000x slower)

Moral: Keep hot data in cache. Sequential access > random access.

Locks Are Expensive

Mutex lock/unlock costs about as much as a main memory reference (~100 ns). For high-frequency operations, that adds up.

// Bad: lock on every increment
type Counter struct {
    mu    sync.Mutex
    count int64
}

func (c *Counter) Inc() {
    c.mu.Lock()
    c.count++ // ~100ns just for the lock/unlock
    c.mu.Unlock()
}

// Better: a single atomic instruction, ~5-10ns
func (c *Counter) IncAtomic() {
    atomic.AddInt64(&c.count, 1)
}

Network vs Disk vs Memory

Memory:   100 ns
SSD:      150 µs  (1,500x slower)
Network:  10 ms   (100,000x slower)

Even a “slow” memory access is orders of magnitude faster than disk or network.

Compression Can Help

Compressing 1KB with Snappy takes ~10µs, and sending 1KB over a 1 Gbps link takes ~10µs. So if you compress 2KB down to 1KB, you pay roughly 20µs to compress plus 10µs to send — about 30µs, versus 20µs for sending it uncompressed. At 1 Gbps that is close to break-even; the payoff comes from slower links or better compression ratios.

For slow networks or large data, compression is worth it.

How I Use This

1. Cache Aggressively

Even a 50% cache hit rate saves huge amounts of time:

100 requests to a 10ms backend:
- No cache:  100 × 10ms = 1000ms
- 50% cache: 50 × 100ns + 50 × 10ms ≈ 500ms (the cache hits are essentially free)

2. Batch Operations

// Instead of N network calls (one round trip per user)
for _, id := range ids {
    user := api.GetUser(id) // ~150ms per round trip
    users = append(users, user)
}

// Better: one batch call, one round trip
users := api.GetUsers(ids) // ~150ms total

3. Use SSDs

Random reads: HDD (10ms) vs SSD (150µs) = 67x faster

4. Sequential > Random

Sequential 1MB from memory: 250µs
Random memory accesses: ~100ns each × 1,000,000 = 100ms

400x difference! Structure data for sequential access.

The 1-Second Scale

If an L1 cache access (0.5ns) took 1 second, then at the same scale:

  • Memory access = ~3 minutes
  • SSD read = ~3.5 days
  • HDD seek = ~8 months
  • Network round trip = ~10 years

Puts it in perspective.

Takeaway

You don’t need to memorize the exact numbers, but knowing the rough orders of magnitude helps:

  • Cache: nanoseconds
  • Memory: ~100 nanoseconds
  • SSD: microseconds
  • Network/Disk: milliseconds

Each jump is roughly two to three orders of magnitude. Design accordingly.

This post is licensed under CC BY 4.0 by the author.