 Compaction and Optimization
Learning Objectives
- Master different compaction strategies (size-tiered, leveled, universal)
- Implement Bloom filters for 10-100x speedup on negative lookups
- Build block caching for hot data optimization
- Learn compression techniques and trade-offs
- Implement batched writes and resource throttling
Lesson 6.1: Compaction Strategies
Why Compaction Matters
As the database operates, SSTables accumulate at different levels, causing performance degradation.
Without compaction:
Level 0: sstable-0, sstable-1, sstable-2, ... sstable-100 (100 files!)
Level 1: sstable-l1-0, sstable-l1-1, ...
Level 2: ...
Problem:
- Read "user:500": Must check 100+ files in Level 0
- Each file access = disk seek (expensive)
- Read latency degrades over time
- Disk space grows unbounded
With compaction:
Before:
  Level 0: 100 small files (1-10MB each)
  Level 1: 10 medium files (50MB each)
After compaction:
  Level 0: 4 small files (5MB each)
  Level 1: 5 medium files (100MB each)
Benefits:
- Read latency: O(100 seeks) → O(5 seeks)
- Disk space: Unused space reclaimed
- Tombstone cleanup: delete markers removed
Compaction Strategies: Size-Tiered
Size-Tiered Compaction (SimpleDB, Cassandra):
Trigger: When level has too many files (e.g., > 10 files)
Action: 
  1. Select all files at level N that fit in memory
  2. Merge into single file
  3. Write to level N+1
  4. Delete source files
Timeline:
t0: Level 0: [f0(5MB), f1(5MB), f2(5MB)] → Trigger (3 files)
t1: Merge into temp file (15MB)
t2: Move to Level 1: [f1_0(15MB)] + existing files
t3: Level 0: [] (files deleted)
t4: New writes go to Level 0 again
Advantages:
- Simple to implement
- Good for write-heavy workloads
- Low CPU overhead during compaction
Disadvantages:
- Write amplification: ~5-10x
- Uneven file sizes at same level
- Space overhead during merge
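To make the trigger-and-merge cycle concrete, here is a minimal Go sketch of the merge step under the assumptions above: the inputs are same-level files given oldest to newest, the newest value for a key wins, and tombstones are dropped only when the output is the bottom level (otherwise an older version of the key further down could reappear). The Entry type and mergeSizeTiered name are illustrative, not the course's exact API.

import (
    "bytes"
    "sort"
)

// Entry is a simplified key/value record; Tombstone marks a deletion.
// (Hypothetical shape -- the lab's real entry type may differ.)
type Entry struct {
    Key       []byte
    Value     []byte
    Tombstone bool
}

// mergeSizeTiered merges the entries of several same-level SSTables
// (oldest to newest) into one sorted entry list for the next level.
func mergeSizeTiered(inputs [][]Entry, bottomLevel bool) []Entry {
    // Newest value wins: later inputs overwrite earlier ones.
    latest := make(map[string]Entry)
    for _, file := range inputs {
        for _, e := range file {
            latest[string(e.Key)] = e
        }
    }

    out := make([]Entry, 0, len(latest))
    for _, e := range latest {
        if e.Tombstone && bottomLevel {
            continue // tombstone cleanup: only safe at the bottom level
        }
        out = append(out, e)
    }

    // SSTables must stay sorted by key.
    sort.Slice(out, func(i, j int) bool {
        return bytes.Compare(out[i].Key, out[j].Key) < 0
    })
    return out
}

The caller writes the returned entries as a single file at level N+1 and deletes the source files, exactly the four-step action listed above.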
Leveled Compaction
Leveled Compaction (LevelDB, RocksDB):
Design:
- Level 0: Multiple files (overlap allowed)
- Level 1+: Single file per key range (NO overlap)
Trigger: When level size exceeds limit (e.g., 10MB)
Action:
  1. Pick file from level N
  2. Find overlapping files in level N+1
  3. Merge all into level N+1
  4. Delete source files
Timeline:
Level 0: [file0(4MB), file1(3MB), file2(2MB)] → Size: 9MB (below 10MB)
         [file0(4MB), file1(3MB), file2(2MB), file3(5MB)] → Size: 14MB (TRIGGER)
Select file0 from L0:
  - Key range: ["a", "g"]
  - Find overlapping in L1: [file_l1_0 ("a"-"m")]
  - Merge: file0 + file_l1_0 → file_l1_0'
  - Delete: file0, file_l1_0
Result:
Level 0: [file1(3MB), file2(2MB), file3(5MB)]
Level 1: [file_l1_0' (overlapping merged)]
Advantages:
- Write amplification: ~10x (predictable)
- Bounded file count per level
- Space efficient
- Consistent performance
Disadvantages:
- Complex implementation
- Higher CPU during compaction
- Must maintain key range invariants
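The extra work in leveled compaction is step 2: finding which level-N+1 files overlap the key range of the file being pushed down. A minimal sketch of that selection, assuming each SSTable's metadata records its smallest and largest key (the keyRange type and field names are illustrative):

import "bytes"

// keyRange describes the span of keys an SSTable covers.
// (Illustrative -- the lab's SSTable metadata may use different names.)
type keyRange struct {
    FirstKey []byte
    LastKey  []byte
}

// overlaps reports whether two key ranges intersect.
func overlaps(a, b keyRange) bool {
    return bytes.Compare(a.FirstKey, b.LastKey) <= 0 &&
        bytes.Compare(b.FirstKey, a.LastKey) <= 0
}

// pickOverlapping returns the indexes of next-level files whose key ranges
// intersect the file chosen from level N. Because files within level N+1
// never overlap each other, the result is a contiguous run of files.
func pickOverlapping(chosen keyRange, nextLevel []keyRange) []int {
    var victims []int
    for i, r := range nextLevel {
        if overlaps(chosen, r) {
            victims = append(victims, i)
        }
    }
    return victims
}

The chosen file and all returned victims are merged and rewritten into level N+1, preserving the invariant that files within a level never overlap.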
Compaction Trade-offs Comparison
| Aspect | Size-Tiered | Leveled | Universal | 
|---|---|---|---|
| Write Amplification | 5-10x | ~10x | ~5x | 
| Read Amplification | Medium | Low | Low-Medium | 
| Space Overhead | High | Low | Medium | 
| Implementation | Simple | Complex | Medium | 
| Best For | Write-heavy | Read-heavy | Balanced | 
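In practice the strategy is a per-database configuration choice driven by the workload, as in the table above. A small sketch of how an engine might expose that knob (the Options fields and constants below are hypothetical, not the course's actual configuration):

// CompactionStrategy selects how SSTables are merged in the background.
type CompactionStrategy int

const (
    SizeTiered CompactionStrategy = iota // write-heavy workloads
    Leveled                              // read-heavy workloads
    Universal                            // balanced workloads
)

// Options carries compaction tuning knobs read at startup.
// (Hypothetical shape for illustration.)
type Options struct {
    Strategy         CompactionStrategy
    MaxFilesPerLevel int   // size-tiered trigger
    MaxLevelBytes    int64 // leveled trigger
}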
Lesson 6.2: Optimization Techniques
Bloom Filters for Faster Lookups
Problem: To check whether a key exists, the database must read each candidate SSTable from disk (expensive I/O).
Without Bloom Filter:
Read "user:500":
  1. Check memtable: Not found
  2. Check SSTable 1: Read file (disk I/O) - Not found
  3. Check SSTable 2: Read file (disk I/O) - Not found
  4. Check SSTable 3: Read file (disk I/O) - Not found
  
  Result: 3 disk I/Os for a missing key!
Solution: Bloom Filter
A probabilistic data structure that answers "is key in this file?" quickly:
- No false negatives (if it says no, the key is definitely not there)
- Few false positives (it might say yes when the key is actually absent)
- Fast negative lookups with no disk I/O
- Small: roughly 10 bits per entry for a ~1% false positive rate
Bloom Filter:
- Bit array: [0, 0, 1, 0, 1, 1, 0, 1, ...]
- Hash functions: h1, h2, h3
Put "user:1":
  Set bits at positions: h1("user:1"), h2("user:1"), h3("user:1")
Check "user:1":
  All three positions set? YES, probably in set
  
Check "user:500":
  All three positions set? NO, definitely not in set!
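Below is a minimal Bloom filter sketch using two FNV-1a-derived hashes combined by double hashing to simulate k hash functions. The sizing, hashing scheme, and method names are illustrative choices, not the lab's required design.

import "hash/fnv"

// BloomFilter is a fixed-size bit array probed at k derived positions.
type BloomFilter struct {
    bits []uint64
    m    uint64 // number of bits
    k    uint64 // number of hash probes
}

func NewBloomFilter(m, k uint64) *BloomFilter {
    return &BloomFilter{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

// hashes derives two base hashes from FNV-1a for double hashing.
func (bf *BloomFilter) hashes(key []byte) (uint64, uint64) {
    h := fnv.New64a()
    h.Write(key)
    h1 := h.Sum64()
    h2 := h1>>33 | h1<<31 // cheap second hash derived from the first
    return h1, h2
}

// Add sets the k bit positions for key.
func (bf *BloomFilter) Add(key []byte) {
    h1, h2 := bf.hashes(key)
    for i := uint64(0); i < bf.k; i++ {
        pos := (h1 + i*h2) % bf.m
        bf.bits[pos/64] |= 1 << (pos % 64)
    }
}

// MayContain returns false only when the key is definitely absent.
func (bf *BloomFilter) MayContain(key []byte) bool {
    h1, h2 := bf.hashes(key)
    for i := uint64(0); i < bf.k; i++ {
        pos := (h1 + i*h2) % bf.m
        if bf.bits[pos/64]&(1<<(pos%64)) == 0 {
            return false
        }
    }
    return true
}

On the read path, each SSTable's filter is consulted before any disk access, so most files are skipped outright for missing keys (db.sstables, sst.filter, and sst.Get are assumed names for the surrounding structures):

// get checks newest to oldest, skipping SSTables whose filter rules the key out.
func (db *DB) get(key []byte) ([]byte, bool) {
    if v, ok := db.memtable.Get(key); ok {
        return v, true
    }
    for _, sst := range db.sstables { // assumed newest-first order
        if !sst.filter.MayContain(key) {
            continue // definitely not in this file: no disk I/O
        }
        if v, ok := sst.Get(key); ok {
            return v, true
        }
    }
    return nil, false
}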
Block Cache for Hot Data
Problem: Frequently accessed blocks still require a disk read on every lookup.
import (
    "sync"
    "time"
)

// BlockCache caches decoded SSTable blocks in memory, bounded by maxSize bytes.
type BlockCache struct {
    mu      sync.RWMutex
    cache   map[string]*CacheEntry
    maxSize int64
    size    int64
}

type CacheEntry struct {
    block    *DataBlock
    lastUsed time.Time
}

// Get retrieves a block from the cache and refreshes its LRU timestamp.
// A full Lock is taken (not RLock) because lastUsed is mutated on every hit.
func (bc *BlockCache) Get(key string) (*DataBlock, bool) {
    bc.mu.Lock()
    defer bc.mu.Unlock()

    entry, ok := bc.cache[key]
    if ok {
        entry.lastUsed = time.Now()
        return entry.block, true
    }
    return nil, false
}

// Put stores a block, evicting least-recently-used entries until it fits.
func (bc *BlockCache) Put(key string, block *DataBlock) {
    bc.mu.Lock()
    defer bc.mu.Unlock()

    // If the key is already cached, drop the old entry and its size first.
    if old, ok := bc.cache[key]; ok {
        bc.size -= old.block.Size()
        delete(bc.cache, key)
    }

    // Evict in a loop until the new block fits.
    for bc.size+block.Size() > bc.maxSize && len(bc.cache) > 0 {
        bc.evictLRU()
    }

    bc.cache[key] = &CacheEntry{
        block:    block,
        lastUsed: time.Now(),
    }
    bc.size += block.Size()
}
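Put calls evictLRU, which isn't shown above; here is a minimal sketch that scans for the oldest lastUsed timestamp. An O(n) scan is fine for a first version; container/list gives O(1) eviction later.

// evictLRU removes the single least-recently-used entry.
// The caller must already hold bc.mu.
func (bc *BlockCache) evictLRU() {
    var oldestKey string
    var oldestTime time.Time
    first := true
    for key, entry := range bc.cache {
        if first || entry.lastUsed.Before(oldestTime) {
            oldestKey = key
            oldestTime = entry.lastUsed
            first = false
        }
    }
    if !first { // cache was non-empty
        bc.size -= bc.cache[oldestKey].block.Size()
        delete(bc.cache, oldestKey)
    }
}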
Compression Strategies
Problem: Uncompressed SSTables consume a lot of disk space.
import (
    "bytes"
    "compress/gzip"
    "io"

    "github.com/golang/snappy"
)

// CompressionCodec compresses/decompresses SSTable blocks.
type CompressionCodec interface {
    Compress(data []byte) ([]byte, error)
    Decompress(data []byte) ([]byte, error)
}

// SnappyCodec uses Snappy (fast, moderate ratio).
type SnappyCodec struct{}

func (sc SnappyCodec) Compress(data []byte) ([]byte, error) {
    return snappy.Encode(nil, data), nil
}

func (sc SnappyCodec) Decompress(data []byte) ([]byte, error) {
    return snappy.Decode(nil, data)
}

// GzipCodec uses gzip (slower, better ratio).
type GzipCodec struct {
    level int
}

func (gc GzipCodec) Compress(data []byte) ([]byte, error) {
    var buf bytes.Buffer
    w, err := gzip.NewWriterLevel(&buf, gc.level)
    if err != nil {
        return nil, err
    }
    if _, err := w.Write(data); err != nil {
        return nil, err
    }
    if err := w.Close(); err != nil { // Close flushes the final block
        return nil, err
    }
    return buf.Bytes(), nil
}

// Decompress is included so GzipCodec satisfies CompressionCodec.
func (gc GzipCodec) Decompress(data []byte) ([]byte, error) {
    r, err := gzip.NewReader(bytes.NewReader(data))
    if err != nil {
        return nil, err
    }
    defer r.Close()
    return io.ReadAll(r)
}
Compression Trade-offs:
Snappy (Fast):
- Compression ratio: 0.5-0.7 of original size
- CPU overhead: Low
- Best for: Recent data, frequent reads
Gzip (Better ratio):
- Compression ratio: 0.3-0.5 of original size
- CPU overhead: High
- Best for: Older data, infrequent reads
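One way to act on this trade-off, as the bullets suggest, is to pick the codec by level: hot upper levels get Snappy, cold lower levels get gzip. A small illustrative helper, reusing the codecs and imports above (the level-based rule is just one reasonable policy, not the course's prescribed one):

// codecForLevel favours speed for upper (recent, hot) levels and
// compression ratio for the bottom (older, cold) level.
func codecForLevel(level, maxLevel int) CompressionCodec {
    if level >= maxLevel-1 {
        return GzipCodec{level: gzip.BestCompression}
    }
    return SnappyCodec{}
}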
Batched Writes
Problem: Writing entries one at a time is slow: every write acquires the lock and hits the WAL separately.
// Batch groups multiple operations so they share one lock acquisition
// and one WAL write.
type Batch struct {
    entries []*LogEntry
}

// Put queues a put operation in the batch.
func (b *Batch) Put(key, value []byte) {
    b.entries = append(b.entries, &LogEntry{
        Op:    OpPut,
        Key:   key,
        Value: value,
    })
}

// Delete queues a delete operation in the batch.
func (b *Batch) Delete(key []byte) {
    b.entries = append(b.entries, &LogEntry{
        Op:  OpDelete,
        Key: key,
    })
}

// Commit applies all operations atomically.
func (b *Batch) Commit(db *DB) error {
    db.mu.Lock()
    defer db.mu.Unlock()

    // Single WAL write first, so the batch is durable before it becomes visible.
    if err := db.wal.WriteBatch(b.entries); err != nil {
        return err
    }

    // All memtable updates under one lock acquisition.
    for _, entry := range b.entries {
        switch entry.Op {
        case OpPut:
            db.memtable.Put(entry.Key, entry.Value)
        case OpDelete:
            db.memtable.Delete(entry.Key)
        }
    }
    return nil
}
// Usage
batch := &Batch{}
batch.Put([]byte("k1"), []byte("v1"))
batch.Put([]byte("k2"), []byte("v2"))
batch.Put([]byte("k3"), []byte("v3"))
batch.Commit(db)  // All or nothing
Hands-on Lab: Implement Leveled Compaction with Bloom Filters
Build a complete compaction engine with optimization techniques.
Lab Tasks:
1. Implement LeveledCompactor with size-based triggers
2. Build Bloom filters with configurable false positive rate
3. Implement block cache with LRU eviction
4. Add compression support (Snappy + Gzip)
5. Implement batched writes
6. Write comprehensive tests and benchmarks
7. Measure performance improvements
Summary & Learning Outcomes
What You've Learned:
- Three major compaction strategies and their trade-offs
- Bloom filters for 10-100x speedup on negative lookups
- Block caching for hot data optimization
- Compression techniques and space/CPU trade-offs
- Batched writes for reduced lock contention
- Resource throttling and background processing
Key Metrics Achieved:
- Compaction reduces file count by 90%
- Bloom filters skip ~99% of negative lookups
- Block cache achieves 10+ MB/s for hot data
- Compression ratio: 0.3-0.5 (3-5GB from 10GB)
- Batched writes reduce lock contention by 10x