
Performance Benchmarking Guide #

Overview #

This guide covers performance testing and benchmarking methodologies for IceFireDB. Learn how to measure, analyze, and optimize IceFireDB performance for different workloads and storage drivers.

Benchmarking Tools #

redis-benchmark #

The standard Redis benchmarking tool works with IceFireDB out of the box:

# Basic benchmark
redis-benchmark -h localhost -p 11001 -t set,get -c 50 -n 100000

# Comprehensive test
redis-benchmark -h localhost -p 11001 \
  -t set,get,incr,lpush,lpop,sadd,spop,lrange \
  -c 100 -n 1000000 -P 16

# Specific command testing
redis-benchmark -h localhost -p 11001 \
  -t set -c 50 -n 500000 --csv

# Pipeline testing
redis-benchmark -h localhost -p 11001 \
  -t set,get -c 100 -n 1000000 -P 100

memtier_benchmark #

For more advanced benchmarking:

memtier_benchmark -s localhost -p 11001 \
  --protocol=redis --clients=50 --threads=4 \
  --ratio=1:1 --test-time=300 --key-pattern=S:S

Custom Scripts #

Python example for custom workload testing:

import redis
import time
import statistics

r = redis.Redis(host='localhost', port=11001, decode_responses=True)

# Warm up the connection and storage layer
for i in range(1000):
    r.set(f'key_{i}', f'value_{i}')

# Benchmark SET operations with a monotonic, high-resolution clock
times = []
for i in range(10000):
    start = time.perf_counter()
    r.set(f'bench_{i}', 'x' * 100)
    times.append(time.perf_counter() - start)

times.sort()
print(f"SET ops/sec: {len(times)/sum(times):.0f}")
print(f"Average latency: {statistics.mean(times)*1000:.2f}ms")
print(f"P95 latency: {times[int(len(times)*0.95)]*1000:.2f}ms")

Benchmarking Methodology #

Test Environment Setup #

  1. Hardware Considerations:

    • CPU: Modern multi-core processor
    • Memory: Sufficient RAM for workload
    • Storage: SSD recommended for disk-based drivers
    • Network: Gigabit Ethernet for distributed tests
  2. IceFireDB Configuration:

    network:
      max_connections: 10000
    
    performance:
      cache_size: 2048
      max_memory: 4294967296
    
    log:
      level: "warn"  # Reduce logging overhead
    
  3. System Tuning:

    # Increase file limits
    ulimit -n 100000
    
    # Network tuning
    sysctl -w net.core.somaxconn=65535
    sysctl -w net.ipv4.tcp_max_syn_backlog=65535
    

Benchmark Scenarios #

1. Throughput Testing #

# Maximum throughput test
redis-benchmark -h localhost -p 11001 \
  -t set -c 512 -n 10000000 -P 512 -q

# Example output (throughput depends on hardware):
# SET: 253232.12 requests per second

2. Latency Testing #

# Low concurrency latency test
redis-benchmark -h localhost -p 11001 \
  -t set -c 1 -n 10000 -P 1 --csv

# High percentiles
memtier_benchmark -s localhost -p 11001 \
  --protocol=redis --clients=10 --threads=2 \
  --ratio=1:0 --test-time=60 --key-pattern=S:S \
  --hide-histogram

3. Mixed Workload Testing #

# Read-heavy workload (80% read, 20% write)
memtier_benchmark -s localhost -p 11001 \
  --protocol=redis --clients=50 --threads=4 \
  --ratio=4:1 --test-time=300

# Write-heavy workload
memtier_benchmark -s localhost -p 11001 \
  --protocol=redis --clients=50 --threads=4 \
  --ratio=1:4 --test-time=300

4. Data Size Impact #

# Different value sizes
for size in 100 500 1000 5000; do
    echo "Value size: $size bytes"
    redis-benchmark -h localhost -p 11001 \
      -t set -c 50 -n 100000 -d $size -q
done

Performance by Storage Driver #

LevelDB Driver Performance #

Best for: Balanced read/write workloads

# LevelDB benchmark results
SET: 250,000 - 300,000 ops/sec
GET: 2,000,000 - 2,500,000 ops/sec
Latency: 0.1 - 2ms (P99)

Optimization Tips:

  • Increase write_buffer_size for write-heavy workloads
  • Use compression for smaller data sizes
  • Adjust block_size for your access patterns

BadgerDB Driver Performance #

Best for: Write-intensive workloads

# BadgerDB benchmark results  
SET: 300,000 - 400,000 ops/sec
GET: 1,800,000 - 2,200,000 ops/sec
Latency: 0.1 - 3ms (P99)

Optimization Tips:

  • Use SSD storage for best performance
  • Tune value_log_file_size for your workload
  • Enable compression for value logs

IPFS Driver Performance #

Best for: Decentralized storage

# IPFS benchmark results (local node)
SET: 15,000 - 25,000 ops/sec  
GET: 40,000 - 60,000 ops/sec
Latency: 5 - 50ms (P99)

# IPFS benchmark results (remote nodes)
SET: 5,000 - 10,000 ops/sec
GET: 10,000 - 20,000 ops/sec  
Latency: 20 - 200ms (P99)

Optimization Tips:

  • Increase hot_cache_size for better read performance
  • Use local IPFS nodes for lower latency
  • Optimize network connectivity between nodes

CRDT Driver Performance #

Best for: Distributed consistency

# CRDT benchmark results
SET: 120,000 - 180,000 ops/sec
GET: 700,000 - 900,000 ops/sec  
Latency: 1 - 5ms (P99)
Sync Latency: 10 - 100ms (cross-node)

Optimization Tips:

  • Adjust sync_interval based on consistency requirements
  • Monitor conflict resolution overhead
  • Use appropriate conflict resolution strategy

Advanced Benchmarking #

Long-running Tests #

# 24-hour endurance test
memtier_benchmark -s localhost -p 11001 \
  --protocol=redis --clients=100 --threads=8 \
  --ratio=1:1 --test-time=86400 --key-pattern=R:R

# Monitor memory usage over time
while true; do
    redis-cli -p 11001 info memory | grep used_memory_human
    sleep 60
done

Cluster Benchmarking #

# Multi-node cluster test
# On node 1
memtier_benchmark -s node1 -p 11001 --clients=50

# On node 2  
memtier_benchmark -s node2 -p 11001 --clients=50

# Monitor cluster sync performance
redis-cli -p 11001 info replication

Custom Workload Generation #

Python script for realistic workload simulation:

import redis
import random
import time

r = redis.Redis(host='localhost', port=11001)

# Realistic key distribution (power law)
def generate_key():
    if random.random() < 0.8:  # 80% hot keys
        return f'hot_{random.randint(1, 1000)}'
    else:  # 20% cold keys
        return f'cold_{random.randint(1, 100000)}'

# Simulate real workload
operations = []
for _ in range(1000000):
    key = generate_key()
    if random.random() < 0.7:  # 70% reads
        operations.append(('GET', key))
    else:  # 30% writes
        operations.append(('SET', key, f'value_{random.randint(1, 10000)}'))

# Execute and measure
start = time.time()
for op in operations:
    if op[0] == 'GET':
        r.get(op[1])
    else:
        r.set(op[1], op[2])

duration = time.time() - start
print(f"Operations: {len(operations)}")
print(f"Duration: {duration:.2f}s")
print(f"Throughput: {len(operations)/duration:.0f} ops/sec")

Performance Metrics #

Key Metrics to Monitor #

  1. Throughput: Operations per second
  2. Latency: Response time percentiles (P50, P95, P99)
  3. Memory Usage: RSS, used_memory, peak memory
  4. CPU Utilization: User vs system time
  5. Network I/O: Bytes in/out, connections
  6. Disk I/O: Read/write operations, throughput
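The latency percentiles above can be computed from raw timing samples with a few lines of Python. A minimal sketch using the nearest-rank method (the function name is illustrative):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the value at rank ceil(p/100 * N)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Per-request latencies in milliseconds
latencies_ms = [0.2, 0.25, 0.27, 0.28, 0.3, 0.31, 0.33, 0.9, 1.4, 2.1]
for p in (50, 95, 99):
    print(f"P{p}: {percentile(latencies_ms, p):.2f}ms")
```

Nearest-rank avoids interpolation surprises on small sample sets; for large sample counts the difference between methods is negligible.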

Monitoring Commands #

# Real-time monitoring
redis-cli -p 11001 --stat

# Detailed metrics
redis-cli -p 11001 info all

# Specific sections
redis-cli -p 11001 info memory
redis-cli -p 11001 info stats
redis-cli -p 11001 info persistence

# Slow log analysis
redis-cli -p 11001 slowlog get 10
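For automated monitoring, the INFO reply is plain `key:value` text and is straightforward to parse. A minimal sketch (pair it with any Redis client or with captured `redis-cli` output):

```python
def parse_info(raw):
    """Parse Redis INFO text into a flat dict, skipping blank and section lines."""
    result = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        key, _, value = line.partition(':')
        result[key] = value
    return result

sample = """# Memory
used_memory:1048576
used_memory_human:1.00M
"""
info = parse_info(sample)
print(info['used_memory_human'])
```

All values stay strings here; convert the numeric fields you care about at the call site.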

Optimization Techniques #

Configuration Optimizations #

# High-performance configuration
network:
  max_connections: 100000
  tcp_keepalive: 300

performance:
  cache_size: 4096  # 4GB
  max_memory: 8589934592  # 8GB
  max_memory_policy: "volatile-lru"

storage:
  driver: "badger"
  value_log_file_size: 2147483648  # 2GB
  num_compactors: 4
  num_level_zero_tables: 8

OS-Level Optimizations #

# Linux performance tuning
echo 'net.core.somaxconn=65535' >> /etc/sysctl.conf
echo 'vm.overcommit_memory=1' >> /etc/sysctl.conf
echo 'vm.swappiness=1' >> /etc/sysctl.conf

# SSD optimization (use 'mq-deadline' or 'none' on multi-queue kernels)
echo 'deadline' > /sys/block/sda/queue/scheduler

# Memory management
echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled

Application-Level Optimizations #

  1. Use pipelining for bulk operations
  2. Batch small operations together
  3. Use appropriate data types for your workload
  4. Monitor and evict unused keys regularly
  5. Use compression for large values
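Pipelining and batching (points 1 and 2) can be sketched with redis-py; the batching helper below is generic, while the commented usage assumes a reachable IceFireDB instance on port 11001:

```python
def chunked(items, size):
    """Split an iterable into lists of at most `size` elements."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch

def bulk_set(client, pairs, batch_size=500):
    """SET key/value pairs in pipelined batches to cut network round trips.

    `client` is expected to be a redis.Redis instance.
    """
    written = 0
    for batch in chunked(pairs, batch_size):
        pipe = client.pipeline(transaction=False)
        for key, value in batch:
            pipe.set(key, value)
        pipe.execute()
        written += len(batch)
    return written

# Usage (assumes redis-py and a running instance):
# import redis
# r = redis.Redis(host='localhost', port=11001)
# bulk_set(r, ((f'key_{i}', f'value_{i}') for i in range(100000)))
```

Batch sizes of a few hundred commands usually capture most of the round-trip savings without building oversized reply buffers.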

Troubleshooting Performance Issues #

Common Performance Problems #

  1. High Latency:

    • Check network connectivity
    • Monitor system resource usage
    • Review slow log queries
  2. Low Throughput:

    • Verify client configuration
    • Check for resource bottlenecks
    • Review storage driver performance
  3. Memory Issues:

    • Monitor memory usage patterns
    • Adjust max memory policy
    • Implement key eviction strategies

Diagnostic Commands #

# Check current performance
redis-cli -p 11001 info commandstats

# Monitor real-time performance
redis-cli -p 11001 monitor | head -100

# Analyze memory usage
redis-cli -p 11001 memory stats

# Check persistence performance
redis-cli -p 11001 info persistence
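The `commandstats` section is especially useful for finding per-command hotspots; its lines can be parsed into numbers with a short sketch like this (field names follow the standard Redis `cmdstat_*` format):

```python
def parse_commandstats(raw):
    """Parse 'info commandstats' lines such as
    cmdstat_set:calls=1000,usec=5000,usec_per_call=5.00"""
    stats = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith('cmdstat_'):
            continue
        name, _, fields = line.partition(':')
        stats[name[len('cmdstat_'):]] = {
            k: float(v) for k, v in (f.split('=') for f in fields.split(','))
        }
    return stats

sample = ("cmdstat_set:calls=1000,usec=5000,usec_per_call=5.00\n"
          "cmdstat_get:calls=4000,usec=4000,usec_per_call=1.00\n")
stats = parse_commandstats(sample)
# Find the command with the highest average cost
slowest = max(stats, key=lambda c: stats[c]['usec_per_call'])
print(slowest)
```

Sorting by `usec_per_call` highlights expensive commands; sorting by total `usec` highlights where overall CPU time goes.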

Benchmark Results Interpretation #

Expected Performance Ranges #

| Driver   | SET ops/sec | GET ops/sec | P99 Latency |
|----------|-------------|-------------|-------------|
| LevelDB  | 250-300K    | 2.0-2.5M    | 0.1-2ms     |
| BadgerDB | 300-400K    | 1.8-2.2M    | 0.1-3ms     |
| IPFS     | 15-25K      | 40-60K      | 5-50ms      |
| CRDT     | 120-180K    | 700-900K    | 1-5ms       |

Factors Affecting Performance #

  1. Data Size: Larger values reduce throughput
  2. Concurrency: Higher concurrency increases throughput but may increase latency
  3. Network: Latency and bandwidth affect distributed performance
  4. Hardware: CPU, memory, and storage speed determine maximum performance
  5. Workload Pattern: Read vs write ratio affects optimal configuration
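The concurrency/latency trade-off (point 2) follows Little's law: in a closed-loop benchmark, throughput is bounded by concurrency divided by average latency. A quick sanity check for interpreting results:

```python
def max_throughput(clients, avg_latency_ms):
    """Little's law upper bound: each client completes 1/latency ops per second."""
    return clients / (avg_latency_ms / 1000.0)

# 50 clients that each see 0.5 ms average latency can sustain
# at most 50 / 0.0005 = 100,000 ops/sec
print(f"{max_throughput(50, 0.5):,.0f} ops/sec")
```

If measured throughput is far below this bound, the clients (not the server) may be the bottleneck; add clients, threads, or pipelining before blaming the database.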

Continuous Performance Testing #

Automated Benchmarking #

Set up automated performance testing:

#!/bin/bash
# daily-benchmark.sh

DATE=$(date +%Y%m%d)
RESULTS="benchmark_results_$DATE.csv"

echo "date,driver,test,throughput,latency_p99" > $RESULTS

# Test different drivers
for driver in leveldb badger ipfs; do
    # Switch driver
    redis-cli -p 11001 DRIVER.SELECT $driver
    
    # Run benchmarks
    # With -q, throughput is the second field ("SET: 253232.12 requests per second")
    redis-benchmark -h localhost -p 11001 -t set -c 50 -n 100000 -q | \
        awk -v d="$driver" -v date="$DATE" '{print date "," d ",set," $2 ","}' >> $RESULTS

    redis-benchmark -h localhost -p 11001 -t get -c 50 -n 100000 -q | \
        awk -v d="$driver" -v date="$DATE" '{print date "," d ",get," $2 ","}' >> $RESULTS
done

Performance Regression Testing #

Monitor performance over time to detect regressions:

#!/bin/bash
# weekly-performance-report.sh

WEEK=$(date +%Y-%U)
redis-cli -p 11001 info stats > "stats_$WEEK.log"
redis-cli -p 11001 info memory > "memory_$WEEK.log"

# Compare with previous week
# Alert on significant changes
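The comparison and alerting step can be automated with a simple threshold check. A sketch (the 10% threshold is an arbitrary example, not a recommendation):

```python
def is_regression(baseline_ops, current_ops, threshold=0.10):
    """True if current throughput dropped more than `threshold` versus baseline."""
    if baseline_ops <= 0:
        raise ValueError("baseline must be positive")
    drop = (baseline_ops - current_ops) / baseline_ops
    return drop > threshold

print(is_regression(250000, 200000))  # 20% drop
print(is_regression(250000, 245000))  # 2% drop
```

Comparing against a rolling median of recent weeks, rather than a single previous run, makes the check less sensitive to one-off noise.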

Best Practices #

  1. Test realistic workloads that match production usage
  2. Run long-term tests to identify memory leaks or degradation
  3. Monitor system resources during testing
  4. Document benchmark configurations for reproducibility
  5. Compare across versions to track performance improvements
  6. Test failure scenarios and recovery performance
  7. Validate with multiple tools for comprehensive analysis

See Also #