sharded-gotify/benchmark/HARDWARE_RECOMMENDATIONS.md

# Hardware Recommendations for WebSocket Benchmarking

## Testing Millions of WebSocket Connections

To properly test Gotify's ability to handle millions of concurrent WebSocket connections, you need to consider several hardware and system factors.

## M4 Mac Mini Considerations

### Pros:
- **Powerful CPU**: M4 chip has excellent single-threaded and multi-threaded performance
- **Unified Memory**: Fast memory access
- **Energy Efficient**: Can run tests for extended periods

### Cons:
- **Limited RAM Options**: Max 24GB (M4 Pro) or 36GB (M4 Max) - may be limiting for millions of connections
- **macOS Limitations**:
  - Lower default file descriptor limits (~10,000)
  - Docker Desktop overhead
  - Network stack differences from Linux
- **Memory Per Connection**:
  - Each WebSocket connection uses ~2-4KB of memory
  - 1M connections = ~2-4GB just for connections
  - Plus OS overhead, buffers, etc.
  - Realistically need 8-16GB+ for 1M connections

### M4 Mac Mini Verdict:
✅ **Good for**: Testing up to 100K-500K connections, development, validation
❌ **Limited for**: Testing true millions of connections (1M+)

## Recommended Hardware for Full-Scale Testing

### Option 1: Linux Server (Recommended)
**Best for: 1M+ connections**

**Minimum Specs:**
- **CPU**: 8+ cores (Intel Xeon or AMD EPYC)
- **RAM**: 32GB+ (64GB+ recommended for 1M+ connections)
- **Network**: 10Gbps+ network interface
- **OS**: Linux (Ubuntu 22.04+ or similar)
- **Storage**: SSD for database

**Why Linux:**
- Higher file descriptor limits (can be set to 1M+)
- Better network stack performance
- Native Docker (no Desktop overhead)
- More control over system resources

**System Tuning Required:**
```bash
# Increase file descriptor limits
ulimit -n 1000000
echo "* soft nofile 1000000" >> /etc/security/limits.conf
echo "* hard nofile 1000000" >> /etc/security/limits.conf

# Network tuning
echo 'net.core.somaxconn = 65535' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_max_syn_backlog = 65535' >> /etc/sysctl.conf
echo 'net.ipv4.ip_local_port_range = 1024 65535' >> /etc/sysctl.conf
sysctl -p
```

### Option 2: Cloud Instance (AWS/GCP/Azure)
**Best for: Flexible scaling**

**Recommended Instance Types:**
- **AWS**: c6i.4xlarge or larger (16+ vCPUs, 32GB+ RAM)
- **GCP**: n2-standard-16 or larger
- **Azure**: Standard_D16s_v3 or larger

**Benefits:**
- Can scale up/down as needed
- High network bandwidth
- Easy to replicate test environments
- Can test with multiple instances

### Option 3: Dedicated Server
**Best for: Consistent long-term testing**

- **CPU**: 16+ cores
- **RAM**: 64GB+
- **Network**: 10Gbps+
- **Cost**: $200-500/month for good hardware

## Memory Requirements by Connection Count

| Connections | Estimated RAM | Recommended RAM | Notes |
|------------|---------------|------------------|-------|
| 10K        | 2-4GB         | 8GB              | M4 Mac Mini ✅ |
| 100K       | 4-8GB         | 16GB             | M4 Mac Mini ⚠️ (may work) |
| 500K       | 8-16GB        | 32GB             | M4 Mac Mini ❌ |
| 1M         | 16-32GB       | 64GB             | Linux Server ✅ |
| 5M         | 80-160GB      | 256GB            | High-end Server ✅ |

*Note: RAM estimates include OS, Docker, and application overhead*

## Network Requirements

### Bandwidth Calculation:
- Each WebSocket connection: ~1-2KB initial handshake
- Ping/pong messages: ~10 bytes every 45 seconds
- Message delivery: Variable (depends on message size)

**For 1M connections:**
- Initial connection burst: ~1-2GB
- Sustained: ~100-200MB/s for ping/pong
- Message delivery: Depends on message rate

**Recommendation**: 10Gbps network for 1M+ connections

## Testing Strategy by Hardware

### M4 Mac Mini (24-36GB RAM)
1. **Start Small**: Test with 10K connections
2. **Scale Gradually**: 50K → 100K → 250K
3. **Monitor Memory**: Watch for OOM conditions
4. **Focus on**: Shard comparison, latency testing, throughput at moderate scale

**Commands:**
```bash
# Test with 10K connections
./benchmark/run-benchmark.sh scale 1k

# Compare shard configurations
./benchmark/run-benchmark.sh all
```

### Linux Server (32GB+ RAM)
1. **Full Scale Testing**: 100K → 500K → 1M+ connections
2. **Multiple Instances**: Test horizontal scaling
3. **Stress Testing**: Find breaking points
4. **Production Simulation**: Real-world scenarios

**Commands:**
```bash
# Test with 100K connections
SCALE=10k ./benchmark/run-benchmark.sh scale 10k

# Test with custom high connection count
# (modify k6 scripts for higher VU counts)
```

## System Limits to Check

### macOS (M4 Mac Mini)
```bash
# Check current limits
ulimit -n                    # File descriptors
sysctl kern.maxfiles         # Max open files
sysctl kern.maxfilesperproc  # Max files per process

# Increase limits (temporary)
ulimit -n 65536
```

### Linux
```bash
# Check limits
ulimit -n
cat /proc/sys/fs/file-max

# Increase limits (permanent)
# Edit /etc/security/limits.conf
```

## Docker Considerations

### Docker Desktop (macOS)
- **Overhead**: ~2-4GB RAM for Docker Desktop
- **Performance**: Slightly slower than native Linux
- **Limits**: Subject to macOS system limits

### Native Docker (Linux)
- **Overhead**: Minimal (~500MB)
- **Performance**: Near-native
- **Limits**: Can use full system resources

## Recommendations

### For Development & Initial Testing:
✅ **M4 Mac Mini is fine**
- Test up to 100K-250K connections
- Validate shard configurations
- Test latency and throughput
- Develop and debug

### For Production-Scale Testing:
✅ **Use Linux Server**
- Test 1M+ connections
- Validate true scalability
- Stress testing
- Production simulation

### Hybrid Approach:
1. **Develop on M4 Mac Mini**: Quick iteration, smaller scale tests
2. **Validate on Linux Server**: Full-scale testing before production

## Quick Start on M4 Mac Mini

```bash
# 1. Increase file descriptor limits
ulimit -n 65536

# 2. Start single instance for testing
docker-compose -f docker-compose.benchmark.yml up -d gotify-256

# 3. Run small-scale test
./benchmark/run-benchmark.sh 256 websocket-simple.js

# 4. Monitor resources
docker stats gotify-bench-256
```

## Quick Start on Linux Server

```bash
# 1. Tune system limits (see above)
# 2. Start all instances
docker-compose -f docker-compose.benchmark.yml up -d --build

# 3. Run full-scale test
./benchmark/run-benchmark.sh scale 10k

# 4. Monitor system resources
htop
docker stats
```

## Conclusion

**M4 Mac Mini**: Great for development and testing up to ~250K connections
**Linux Server**: Required for testing true millions of connections

Start with the M4 Mac Mini to validate the setup and optimizations, then move to a Linux server for full-scale production validation.