
Hardware Recommendations for WebSocket Benchmarking

Testing Millions of WebSocket Connections

To properly test Gotify's ability to handle millions of concurrent WebSocket connections, you need to consider several hardware and system factors.

M4 Mac Mini Considerations

Pros:

Powerful CPU: M4 chip has excellent single-threaded and multi-threaded performance
Unified Memory: Fast memory access
Energy Efficient: Can run tests for extended periods

Cons:

Limited RAM Options: Max 32GB (M4) or 64GB (M4 Pro); common 16-24GB configurations may be limiting for millions of connections
macOS Limitations:
- Lower default file descriptor limits (~10,000)
- Docker Desktop overhead
- Network stack differences from Linux
Memory Per Connection:
- Each WebSocket connection uses ~2-4KB of memory
- 1M connections = ~2-4GB just for connections
- Plus OS overhead, buffers, etc.
- Realistically need 8-16GB+ for 1M connections
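
The per-connection arithmetic above can be sketched in shell. This is a rough estimate only: the helper name `conn_mem_mib` is made up for illustration, and the 2-4KB per connection is the assumption stated above, not a measured value for Gotify.

```shell
#!/bin/sh
# Rough memory estimate for idle WebSocket connections.
# conn_mem_mib is a hypothetical helper, not part of Gotify or the benchmark scripts.
# Usage: conn_mem_mib CONNECTIONS KIB_PER_CONNECTION  (prints whole MiB, rounded down)
conn_mem_mib() {
  echo $(( $1 * $2 / 1024 ))
}

echo "1M connections at 2KiB each: $(conn_mem_mib 1000000 2) MiB"
echo "1M connections at 4KiB each: $(conn_mem_mib 1000000 4) MiB"
```

This covers connection state only. OS overhead, socket buffers, and Docker come on top, which is why the realistic figure is several times larger.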

M4 Mac Mini Verdict:

✅ Good for: Testing up to roughly 100K-250K connections, development, validation
❌ Limited for: Testing true millions of connections (1M+)

Recommended Hardware for Full-Scale Testing

Option 1: Linux Server (Recommended)

Best for: 1M+ connections

Minimum Specs:

CPU: 8+ cores (Intel Xeon or AMD EPYC)
RAM: 32GB+ (64GB+ recommended for 1M+ connections)
Network: 10Gbps+ network interface
OS: Linux (Ubuntu 22.04+ or similar)
Storage: SSD for database

Why Linux:

Higher file descriptor limits (can be set to 1M+)
Better network stack performance
Native Docker (no Desktop overhead)
More control over system resources

System Tuning Required:

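# Increase file descriptor limits for the current shell (run as root)
ulimit -n 1000000

# Persist the limits for new login sessions (appending to this file needs root)
# Note: the "*" wildcard does not apply to the root user; add explicit "root" lines if needed
echo "* soft nofile 1000000" >> /etc/security/limits.conf
echo "* hard nofile 1000000" >> /etc/security/limits.conf

# Network tuning: larger accept/SYN queues and a wider ephemeral port range
echo 'net.core.somaxconn = 65535' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_max_syn_backlog = 65535' >> /etc/sysctl.conf
echo 'net.ipv4.ip_local_port_range = 1024 65535' >> /etc/sysctl.conf

# Apply the settings from /etc/sysctl.conf
sysctl -p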
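
After tuning, it can help to read the values back and compare them with the targets. The following is a minimal sketch, assuming a Linux host with `sysctl` available; `check_min` is a hypothetical helper name, and the thresholds simply repeat the numbers used in the tuning commands.

```shell
#!/bin/sh
# Compare a current value against the target used in the tuning commands.
# check_min is a hypothetical helper. Usage: check_min NAME ACTUAL WANTED
check_min() {
  case "$2" in
    unlimited) echo "ok   $1=$2 (wanted >= $3)"; return 0 ;;
    ''|*[!0-9]*) echo "skip $1 (could not read value)"; return 0 ;;
  esac
  if [ "$2" -ge "$3" ]; then
    echo "ok   $1=$2 (wanted >= $3)"
  else
    echo "LOW  $1=$2 (wanted >= $3)"
  fi
}

# Values are read from the current shell and kernel; missing tools yield "skip".
check_min nofile "$(ulimit -n)" 1000000
check_min net.core.somaxconn "$(sysctl -n net.core.somaxconn 2>/dev/null)" 65535
check_min net.ipv4.tcp_max_syn_backlog "$(sysctl -n net.ipv4.tcp_max_syn_backlog 2>/dev/null)" 65535
```

Run it in the same shell (or as the same user) that will start the server or the load generator, since `ulimit -n` is per process.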

Option 2: Cloud Instance (AWS/GCP/Azure)

Best for: Flexible scaling

Recommended Instance Types:

AWS: c6i.4xlarge or larger (16+ vCPUs, 32GB+ RAM)
GCP: n2-standard-16 or larger
Azure: Standard_D16s_v3 or larger

Benefits:

Can scale up/down as needed
High network bandwidth
Easy to replicate test environments
Can test with multiple instances

Option 3: Dedicated Server

Best for: Consistent long-term testing

CPU: 16+ cores
RAM: 64GB+
Network: 10Gbps+
Cost: $200-500/month for good hardware

Memory Requirements by Connection Count

Connections   Estimated RAM   Recommended RAM   Notes
10K           2-4GB           8GB               M4 Mac Mini ✅
100K          4-8GB           16GB              M4 Mac Mini ⚠️ (may work)
500K          8-16GB          32GB              M4 Mac Mini ❌
1M            16-32GB         64GB              Linux Server ✅
5M            80-160GB        256GB             High-end Server ✅

Note: RAM estimates include OS, Docker, and application overhead
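
For scripting test plans, the "Recommended RAM" column can be encoded as a small lookup. This is only a sketch: `recommended_ram_gb` is a hypothetical helper, and rounding a connection count up to the next table row is an assumption chosen to stay on the conservative side.

```shell
#!/bin/sh
# Map a target connection count to the "Recommended RAM" column of the table.
# recommended_ram_gb is a hypothetical helper; counts between rows round up
# to the next row, which is a conservative assumption.
recommended_ram_gb() {
  if   [ "$1" -le 10000 ];   then echo 8
  elif [ "$1" -le 100000 ];  then echo 16
  elif [ "$1" -le 500000 ];  then echo 32
  elif [ "$1" -le 1000000 ]; then echo 64
  elif [ "$1" -le 5000000 ]; then echo 256
  else
    echo "no table entry above 5M connections" >&2
    return 1
  fi
}

echo "250K connections: $(recommended_ram_gb 250000)GB recommended"
echo "1M connections: $(recommended_ram_gb 1000000)GB recommended"
```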

Network Requirements

Bandwidth Calculation:

Each WebSocket connection: ~1-2KB initial handshake
Ping/pong messages: ~10 bytes every 45 seconds
Message delivery: Variable (depends on message size)

For 1M connections:

Initial connection burst: ~1-2GB
Sustained: ~0.2MB/s of ping/pong payload (1M × 10 bytes ÷ 45s), likely only a few MB/s once TCP/IP headers are counted
Message delivery: Depends on message rate

Recommendation: 10Gbps network for 1M+ connections
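
The per-connection figures above (1-2KB per handshake, 10 bytes of ping/pong every 45 seconds) can be turned into totals with integer shell arithmetic. A minimal sketch follows; the helper names are hypothetical, and the results count payload only, so TCP/IP header overhead and actual message traffic come on top.

```shell
#!/bin/sh
# Hypothetical helpers for the bandwidth estimate; payload bytes only.
# keepalive_bytes_per_sec CONNECTIONS BYTES_PER_PING INTERVAL_SECONDS
keepalive_bytes_per_sec() {
  echo $(( $1 * $2 / $3 ))
}

# handshake_burst_mib CONNECTIONS KIB_PER_HANDSHAKE  (whole MiB, rounded down)
handshake_burst_mib() {
  echo $(( $1 * $2 / 1024 ))
}

echo "ping/pong payload: $(keepalive_bytes_per_sec 1000000 10 45) bytes/s"
echo "handshake burst: $(handshake_burst_mib 1000000 1) to $(handshake_burst_mib 1000000 2) MiB"
```

Keepalive traffic is small at this scale; message delivery rate is what decides whether the link becomes the bottleneck.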

Testing Strategy by Hardware

M4 Mac Mini (24-32GB RAM)

Start Small: Test with 10K connections
Scale Gradually: 50K → 100K → 250K
Monitor Memory: Watch for OOM conditions
Focus on: Shard comparison, latency testing, throughput at moderate scale

Commands:

# Run the scale test with the 1k setting
./benchmark/run-benchmark.sh scale 1k

# Compare shard configurations
./benchmark/run-benchmark.sh all

Linux Server (32GB+ RAM)

Full Scale Testing: 100K → 500K → 1M+ connections
Multiple Instances: Test horizontal scaling
Stress Testing: Find breaking points
Production Simulation: Real-world scenarios

Commands:

# Run the scale test with the 10k setting
SCALE=10k ./benchmark/run-benchmark.sh scale 10k

# Test with custom high connection count
# (modify k6 scripts for higher VU counts)
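
When raising the connection count, the load generator can hit a limit of its own: every TCP connection from one source IP to a single server IP and port needs a distinct local port. The sketch below assumes the `ip_local_port_range` of 1024-65535 from the tuning section and a single server address; `source_ips_needed` is a hypothetical helper name.

```shell
#!/bin/sh
# Estimate how many client source IPs are needed to open N connections
# to one server IP:port, given the local (ephemeral) port range.
# source_ips_needed is a hypothetical helper.
# Usage: source_ips_needed CONNECTIONS PORT_LOW PORT_HIGH
source_ips_needed() {
  ports=$(( $3 - $2 + 1 ))
  # Ceiling division: round up to a whole number of source IPs
  echo $(( ($1 + ports - 1) / ports ))
}

echo "1M connections need about $(source_ips_needed 1000000 1024 65535) source IPs"
```

In practice this means spreading the k6 load over several client machines or addresses rather than one.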

System Limits to Check

macOS (M4 Mac Mini)

# Check current limits
ulimit -n                    # File descriptors
sysctl kern.maxfiles         # Max open files
sysctl kern.maxfilesperproc  # Max files per process

# Increase limits (temporary)
ulimit -n 65536

Linux

# Check limits
ulimit -n
cat /proc/sys/fs/file-max

# Increase limits (permanent)
# Edit /etc/security/limits.conf
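
During a test run it is useful to see how close a process is to its descriptor limit. This is a minimal sketch that relies on the Linux `/proc` filesystem; `fd_usage` is a hypothetical helper, and the PID passed to it is whatever process you want to inspect (the current shell is used here only as a stand-in).

```shell
#!/bin/sh
# Print open file descriptor count and soft limit for a process (Linux /proc only).
# fd_usage is a hypothetical helper. Usage: fd_usage PID
fd_usage() {
  # Each entry in /proc/PID/fd is one open descriptor (sockets included)
  used=$(ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' ')
  # In /proc/PID/limits, the 4th field of the "Max open files" row is the soft limit
  limit=$(awk '/Max open files/ {print $4}' "/proc/$1/limits" 2>/dev/null)
  echo "pid=$1 used=$used soft_limit=${limit:-unknown}"
}

# Stand-in example: inspect the current shell
fd_usage $$
```

Reading another user's process may require root.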

Docker Considerations

Docker Desktop (macOS)

Overhead: ~2-4GB RAM for Docker Desktop
Performance: Slightly slower than native Linux
Limits: Subject to macOS system limits

Native Docker (Linux)

Overhead: Minimal (~500MB)
Performance: Near-native
Limits: Can use full system resources

Recommendations

For Development & Initial Testing:

✅ M4 Mac Mini is fine

Test up to 100K-250K connections
Validate shard configurations
Test latency and throughput
Develop and debug

For Production-Scale Testing:

✅ Use Linux Server

Test 1M+ connections
Validate true scalability
Stress testing
Production simulation

Hybrid Approach:

Develop on M4 Mac Mini: Quick iteration, smaller scale tests
Validate on Linux Server: Full-scale testing before production

Quick Start on M4 Mac Mini

# 1. Increase file descriptor limits
ulimit -n 65536

# 2. Start single instance for testing
docker-compose -f docker-compose.benchmark.yml up -d gotify-256

# 3. Run small-scale test
./benchmark/run-benchmark.sh 256 websocket-simple.js

# 4. Monitor resources
docker stats gotify-bench-256

Quick Start on Linux Server

# 1. Tune system limits (see above)
# 2. Start all instances
docker-compose -f docker-compose.benchmark.yml up -d --build

# 3. Run full-scale test
./benchmark/run-benchmark.sh scale 10k

# 4. Monitor system resources
htop
docker stats

Conclusion

M4 Mac Mini: Great for development and testing up to ~250K connections
Linux Server: Required for testing true millions of connections

Start with the M4 Mac Mini to validate the setup and optimizations, then move to a Linux server for full-scale production validation.
