Rate Limiting: Leaky Bucket vs Token Bucket

Been looking into rate limiting for calling third-party APIs. Two main algorithms keep coming up: Leaky Bucket and Token Bucket. Here’s what I learned.

The Problem

Calling external APIs without rate limiting = bad time:

  • Hit rate limits → errors
  • Waste quota → costs money
  • Bursts → overwhelm services

Need to smooth out requests.

Leaky Bucket

Imagine a bucket with a hole:

  • Requests go in at any rate
  • Leak out at constant rate
  • Bucket full? Drop requests
import (
    "sync"
    "time"
)

type LeakyBucket struct {
    capacity  int           // max requests the bucket can hold
    remaining int           // free slots left; leaking frees them back up
    leakRate  time.Duration // one slot leaks free per interval
    lastLeak  time.Time
    mu        sync.Mutex
}

func (lb *LeakyBucket) Allow() bool {
    lb.mu.Lock()
    defer lb.mu.Unlock()

    // Leak based on time elapsed
    now := time.Now()
    elapsed := now.Sub(lb.lastLeak)
    leaked := int(elapsed / lb.leakRate)

    if leaked > 0 {
        lb.remaining += leaked
        if lb.remaining > lb.capacity {
            lb.remaining = lb.capacity
        }
        lb.lastLeak = now
    }

    if lb.remaining > 0 {
        lb.remaining--
        return true
    }
    return false
}

Good for: Smooth, constant output rate. Like streaming data or strict API rate limits.

Token Bucket

Tokens refill at constant rate:

  • Bucket starts with tokens
  • Request consumes token
  • Out of tokens? Wait or reject
  • Allows bursts (if tokens available)
type TokenBucket struct {
    capacity   int
    tokens     float64
    refillRate float64  // tokens per second
    lastRefill time.Time
    mu         sync.Mutex
}

func (tb *TokenBucket) Allow() bool {
    tb.mu.Lock()
    defer tb.mu.Unlock()

    now := time.Now()
    elapsed := now.Sub(tb.lastRefill).Seconds()

    // Add tokens
    tb.tokens += elapsed * tb.refillRate
    if tb.tokens > float64(tb.capacity) {
        tb.tokens = float64(tb.capacity)
    }
    tb.lastRefill = now

    if tb.tokens >= 1 {
        tb.tokens -= 1
        return true
    }
    return false
}

Good for: User-facing APIs. Allows occasional bursts for better UX.

Key Difference

  • Leaky Bucket: Constant output, smooths bursts
  • Token Bucket: Allows bursts, maintains average rate

What I’m Using

For calling third-party APIs: Leaky Bucket. I want predictable, steady rate.

For my own API: Token Bucket. Users can burst occasionally, better experience.

Or just use golang.org/x/time/rate - it’s basically a token bucket:

import "golang.org/x/time/rate"

limiter := rate.NewLimiter(10, 100) // 10 req/s, burst of 100

if limiter.Allow() {
    // make request
}

// Or block until a token is available (Wait returns an error
// if the context is cancelled or its deadline would be exceeded)
if err := limiter.Wait(ctx); err != nil {
    return err
}

Redis Implementation

For distributed systems, you need shared state. I found a Redis rate limiting article (in Chinese) that shows a Lua script approach.

Basic idea:

-- Token bucket in Redis (simplified)
local key      = KEYS[1]
local capacity = tonumber(ARGV[1])
local rate     = tonumber(ARGV[2]) -- tokens per second

local tokens = tonumber(redis.call('GET', key)) or capacity
local now    = tonumber(redis.call('TIME')[1])
local last   = tonumber(redis.call('GET', key .. ':last')) or now

-- Refill tokens based on elapsed time
local elapsed = now - last
tokens = math.min(capacity, tokens + elapsed * rate)

-- Consume a token if one is available; either way, persist
-- the refilled count so the next call starts from fresh state
if tokens >= 1 then
    redis.call('SET', key, tokens - 1)
    redis.call('SET', key .. ':last', now)
    return 1
else
    redis.call('SET', key, tokens)
    redis.call('SET', key .. ':last', now)
    return 0
end

The script runs atomically inside Redis, so it works correctly across multiple app servers.

That’s my understanding so far. Both algorithms work, just pick based on whether you need smoothing (leaky) or bursts (token).

This post is licensed under CC BY 4.0 by the author.