# Gotify WebSocket Performance Benchmarking
This directory contains tools and configurations for benchmarking Gotify's WebSocket performance with different shard configurations.
## Overview
The benchmarking setup allows you to:
- Test multiple Gotify instances with different shard counts (64, 128, 256, 512, 1024)
- Measure WebSocket connection performance, latency, and throughput
- Compare performance across different shard configurations
- Test connection scaling (1K, 10K, 100K+ concurrent connections)
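The shard count under test controls how a sharded connection registry splits its client map: each user hashes to one shard, and each shard has its own lock, so connects and broadcasts touching different shards never contend. A minimal Python sketch of the idea (the hash function and data layout here are illustrative, not Gotify's actual implementation):

```python
import hashlib
import threading

class ShardedRegistry:
    """Connection registry split into N independently locked shards."""

    def __init__(self, shard_count=256):
        self.shard_count = shard_count
        # One lock per shard: registrations on different shards never contend.
        self.shards = [{"lock": threading.Lock(), "clients": {}}
                       for _ in range(shard_count)]

    def shard_for(self, user_id):
        # Stable hash -> shard index; all sessions of a user land on one shard.
        digest = hashlib.sha256(str(user_id).encode()).digest()
        return int.from_bytes(digest[:4], "big") % self.shard_count

    def register(self, user_id, conn):
        shard = self.shards[self.shard_for(user_id)]
        with shard["lock"]:
            shard["clients"].setdefault(user_id, []).append(conn)
```

More shards means finer-grained locking (less contention under load) at the cost of a little more bookkeeping memory, which is exactly the trade-off these benchmarks measure.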
## Prerequisites
- Docker and Docker Compose installed
- At least 8GB of available RAM (for running multiple instances)
- Sufficient CPU cores (recommended: 4+ cores)
## Quick Start
### 1. Start All Benchmark Instances
```bash
# Build and start all Gotify instances with different shard counts
docker-compose -f docker-compose.benchmark.yml up -d --build
```
This will start 5 Gotify instances:
- `gotify-64` on port 8080 (64 shards)
- `gotify-128` on port 8081 (128 shards)
- `gotify-256` on port 8082 (256 shards, default)
- `gotify-512` on port 8083 (512 shards)
- `gotify-1024` on port 8084 (1024 shards)
### 2. Verify Services Are Running
```bash
# Check health of all instances
curl http://localhost:8080/health
curl http://localhost:8081/health
curl http://localhost:8082/health
curl http://localhost:8083/health
curl http://localhost:8084/health
```
### 3. Run Benchmarks
#### Run All Benchmarks (Compare All Shard Counts)
```bash
./benchmark/run-benchmark.sh all
```
#### Run Benchmark Against Specific Instance
```bash
# Test instance with 256 shards
./benchmark/run-benchmark.sh 256
# Test instance with 512 shards
./benchmark/run-benchmark.sh 512
```
#### Run Connection Scaling Test
```bash
# Test with 1K connections
./benchmark/run-benchmark.sh scale 1k
# Test with 10K connections
./benchmark/run-benchmark.sh scale 10k
```
#### Stop All Services
```bash
./benchmark/run-benchmark.sh stop
```
## Manual k6 Testing
You can also run k6 tests manually for more control:
### Simple Connection Test
```bash
docker run --rm -i --network gotify_benchmark-net \
  -v "$(pwd)/benchmark/k6:/scripts" \
  -e BASE_URL="http://gotify-256:80" \
  grafana/k6:latest run /scripts/websocket-simple.js
```
### Full WebSocket Test
```bash
docker run --rm -i --network gotify_benchmark-net \
  -v "$(pwd)/benchmark/k6:/scripts" \
  -e BASE_URL="http://gotify-256:80" \
  grafana/k6:latest run /scripts/websocket-test.js
```
### Connection Scaling Test
```bash
docker run --rm -i --network gotify_benchmark-net \
  -v "$(pwd)/benchmark/k6:/scripts" \
  -e BASE_URL="http://gotify-256:80" \
  -e SCALE="10k" \
  grafana/k6:latest run /scripts/connection-scaling.js
```
## Test Scripts
### `websocket-simple.js`
- Quick validation test
- 100 virtual users for 2 minutes
- Basic connection and message delivery checks
### `websocket-test.js`
- Comprehensive performance test
- Gradual ramp-up: 1K → 5K → 10K connections
- Measures connection time, latency, throughput
- Includes thresholds for performance validation
### `connection-scaling.js`
- Tests different connection scales
- Configurable via `SCALE` environment variable (1k, 10k, 100k)
- Measures connection establishment time
- Tracks message delivery latency
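The `SCALE` values are shorthand for connection counts. A small parser like the following turns them into numbers (a hypothetical sketch of what the scaling script does; the actual `connection-scaling.js` logic may differ):

```python
def parse_scale(scale):
    """Turn a SCALE value like '1k', '10k', or '100k' into a connection count."""
    scale = scale.strip().lower()
    if scale.endswith("k"):
        return int(float(scale[:-1]) * 1000)
    return int(scale)
```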
## Metrics Collected
The benchmarks collect the following metrics:
### Connection Metrics
- **Connection Time**: Time to establish WebSocket connection
- **Connection Success Rate**: Percentage of successful connections
- **Connection Duration**: How long connections stay alive
### Message Metrics
- **Message Latency**: Time from message creation to delivery (P50, P95, P99)
- **Messages Per Second**: Throughput of message delivery
- **Message Success Rate**: Percentage of messages successfully delivered
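P50, P95, and P99 are percentiles over the collected latency samples. k6 computes these for you; the sketch below just shows the math, using the nearest-rank method (one common convention among several):

```python
def percentile(samples, p):
    """Nearest-rank percentile: smallest sample >= p% of the distribution."""
    ordered = sorted(samples)
    rank = -(-len(ordered) * p // 100)  # ceil(n * p / 100)
    return ordered[max(int(rank), 1) - 1]
```

A P95 of 80 ms means 95% of messages were delivered within 80 ms; the tail percentiles (P95/P99) are usually where shard-count differences show up first.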
### Resource Metrics
- **CPU Usage**: Per-instance CPU utilization
- **Memory Usage**: Per-instance memory consumption
- **Memory Per Connection**: Average memory used per WebSocket connection
## Interpreting Results
### Shard Count Comparison
When comparing different shard counts, look for:
1. **Connection Time**: Lower is better
- More shards should reduce lock contention
- Expect 64 shards to have higher connection times under load
- 256-512 shards typically provide optimal balance
2. **Message Latency**: Lower is better
- P95 latency should be < 100ms for most scenarios
- Higher shard counts may reduce latency under high concurrency
3. **Throughput**: Higher is better
- Messages per second should scale with shard count up to a point
- Diminishing returns after optimal shard count
4. **Memory Usage**: Lower is better
- More shards = slightly more memory overhead
- Balance between performance and memory
### Optimal Shard Count
Based on testing, the recommended shard counts are:
- **< 10K connections**: 128-256 shards
- **10K-100K connections**: 256-512 shards
- **100K-1M connections**: 512-1024 shards
- **> 1M connections**: 1024+ shards (may need a custom build)
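The bands above can be written as a small lookup helper (the thresholds come straight from the list; the function name and tuple return are ours):

```python
def recommended_shards(connections):
    """Shard-count band suggested above for an expected connection load."""
    if connections < 10_000:
        return (128, 256)
    if connections < 100_000:
        return (256, 512)
    if connections <= 1_000_000:
        return (512, 1024)
    return (1024, None)  # 1024+ shards; beyond 1M may need a custom build
```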
## Benchmark Scenarios
### Scenario 1: Connection Scaling
Test how many concurrent connections each configuration can handle:
```bash
./benchmark/run-benchmark.sh scale 1k # Start with 1K
./benchmark/run-benchmark.sh scale 10k # Then 10K
./benchmark/run-benchmark.sh scale 100k # Finally 100K
```
### Scenario 2: Shard Comparison
Compare performance across all shard configurations:
```bash
./benchmark/run-benchmark.sh all
```
### Scenario 3: Message Throughput
Test message delivery rate with different connection counts:
- Modify k6 scripts to send messages via REST API
- Measure delivery latency through WebSocket
### Scenario 4: Latency Testing
Focus on P50, P95, P99 latency metrics:
- Run tests with steady connection count
- Send messages at controlled rate
- Analyze latency distribution
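"Controlled rate" here just means fixed pacing per sender: divide the target aggregate rate across the virtual users and sleep the corresponding gap between sends. The arithmetic, as a sketch:

```python
def send_interval_ms(target_rate_per_sec, senders):
    """Gap (ms) each sender waits so the aggregate send rate hits the target."""
    return 1000.0 * senders / target_rate_per_sec
```

For example, 100 VUs targeting 1,000 messages/second in total should each pause 100 ms between sends.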
## Configuration
### Adjusting Shard Counts
Edit `docker-compose.benchmark.yml` to modify shard counts:
```yaml
environment:
  - GOTIFY_SERVER_STREAM_SHARDCOUNT=256
```
### Adjusting Buffer Sizes
Modify buffer sizes in config files or environment variables:
```yaml
environment:
  - GOTIFY_SERVER_STREAM_READBUFFERSIZE=8192
  - GOTIFY_SERVER_STREAM_WRITEBUFFERSIZE=8192
  - GOTIFY_SERVER_STREAM_CHANNELBUFFERSIZE=10
```
### Custom k6 Test Parameters
Modify k6 test scripts to adjust:
- Virtual users (VUs)
- Test duration
- Ramp-up/ramp-down stages
- Thresholds
## Troubleshooting
### Services Won't Start
1. Check Docker resources:
```bash
docker system df
docker system prune # If needed
```
2. Verify ports are available:
```bash
lsof -i :8080-8084
```
3. Check logs:
```bash
docker-compose -f docker-compose.benchmark.yml logs
```
### High Connection Failures
1. Increase system limits:
```bash
# Linux: Increase file descriptor limits
ulimit -n 65536
```
2. Check Docker resource limits:
- Increase memory allocation
- Increase CPU allocation
3. Reduce concurrent connections in test scripts
### Memory Issues
1. Monitor memory usage:
```bash
docker stats
```
2. Reduce number of instances running simultaneously
3. Adjust shard counts (fewer shards = less memory)
### Slow Performance
1. Check CPU usage: `docker stats`
2. Verify network connectivity between containers
3. Check for resource contention
4. Consider running tests sequentially instead of parallel
## Results Storage
Benchmark results are stored in:
- `benchmark/results/` - Detailed logs per shard configuration
- k6 output includes summary statistics
## Advanced Usage
### Custom Test Scenarios
Create custom k6 scripts in `benchmark/k6/`:
```javascript
import ws from 'k6/ws';
import { check } from 'k6';

export const options = {
  vus: 1000,
  duration: '5m',
};

export default function () {
  // Your custom test logic
}
```
### Monitoring with Prometheus
Add Prometheus to `docker-compose.benchmark.yml` for detailed metrics collection.
### Load Balancer Testing
Test with a load balancer in front of multiple instances to simulate production scenarios.
## Performance Expectations
Based on the optimizations implemented, expect roughly:
- **Connection Capacity**: 100K-1M+ concurrent connections per instance
- **Message Latency**: P95 < 100ms for most scenarios
- **Throughput**: 10K+ messages/second per instance
- **Memory**: ~2-4KB per connection (varies by shard count)
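The ~2-4 KiB/connection figure translates directly into instance sizing. A back-of-envelope estimator (assuming the per-connection figure above; real usage varies with shard count and buffer sizes):

```python
def estimated_memory_mib(connections, bytes_per_conn=4 * 1024):
    """Back-of-envelope instance memory from the ~2-4 KiB/connection figure."""
    return connections * bytes_per_conn / (1024 * 1024)
```

At 4 KiB per connection, 100K connections need roughly 390 MiB just for connection state, which is why the Prerequisites section asks for 8 GB of RAM when running five instances side by side.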
## Contributing
When adding new benchmark scenarios:
1. Add k6 script to `benchmark/k6/`
2. Update this README with usage instructions
3. Add configuration if needed
4. Test and validate results
## References
- [k6 WebSocket Documentation](https://k6.io/docs/javascript-api/k6-ws/)
- [Gotify Configuration](https://gotify.net/docs/config)
- [WebSocket Performance Best Practices](https://www.ably.com/topic/websockets)