- Implement sharded client storage (256 shards by default) to eliminate mutex contention - Replace slice-based storage with map structure for O(1) token lookup - Increase WebSocket buffer sizes (8192 bytes) and channel buffers (10 messages) - Optimize Notify method with per-shard locking - Add configuration options for shard count and buffer sizes - Add comprehensive benchmarking setup with docker-compose - Include k6 load testing scripts for WebSocket performance testing - All existing tests pass with new sharded implementation |
||
|---|---|---|
| .. | ||
| configs | ||
| k6 | ||
| HARDWARE_RECOMMENDATIONS.md | ||
| MEMORY_ANALYSIS.md | ||
| README.md | ||
| run-benchmark.sh | ||
README.md
Gotify WebSocket Performance Benchmarking
This directory contains tools and configurations for benchmarking Gotify's WebSocket performance with different shard configurations.
Overview
The benchmarking setup allows you to:
- Test multiple Gotify instances with different shard counts (64, 128, 256, 512, 1024)
- Measure WebSocket connection performance, latency, and throughput
- Compare performance across different shard configurations
- Test connection scaling (1K, 10K, 100K+ concurrent connections)
Prerequisites
- Docker and Docker Compose installed
- At least 8GB of available RAM (for running multiple instances)
- Sufficient CPU cores (recommended: 4+ cores)
Quick Start
1. Start All Benchmark Instances
# Build and start all Gotify instances with different shard counts
docker-compose -f docker-compose.benchmark.yml up -d --build
This will start 5 Gotify instances:
gotify-64on port 8080 (64 shards)gotify-128on port 8081 (128 shards)gotify-256on port 8082 (256 shards, default)gotify-512on port 8083 (512 shards)gotify-1024on port 8084 (1024 shards)
2. Verify Services Are Running
# Check health of all instances
curl http://localhost:8080/health
curl http://localhost:8081/health
curl http://localhost:8082/health
curl http://localhost:8083/health
curl http://localhost:8084/health
3. Run Benchmarks
Run All Benchmarks (Compare All Shard Counts)
./benchmark/run-benchmark.sh all
Run Benchmark Against Specific Instance
# Test instance with 256 shards
./benchmark/run-benchmark.sh 256
# Test instance with 512 shards
./benchmark/run-benchmark.sh 512
Run Connection Scaling Test
# Test with 1K connections
./benchmark/run-benchmark.sh scale 1k
# Test with 10K connections
./benchmark/run-benchmark.sh scale 10k
Stop All Services
./benchmark/run-benchmark.sh stop
Manual k6 Testing
You can also run k6 tests manually for more control:
Simple Connection Test
docker run --rm -i --network gotify_benchmark-net \
-v $(pwd)/benchmark/k6:/scripts \
-e BASE_URL="http://gotify-256:80" \
grafana/k6:latest run /scripts/websocket-simple.js
Full WebSocket Test
docker run --rm -i --network gotify_benchmark-net \
-v $(pwd)/benchmark/k6:/scripts \
-e BASE_URL="http://gotify-256:80" \
grafana/k6:latest run /scripts/websocket-test.js
Connection Scaling Test
docker run --rm -i --network gotify_benchmark-net \
-v $(pwd)/benchmark/k6:/scripts \
-e BASE_URL="http://gotify-256:80" \
-e SCALE="10k" \
grafana/k6:latest run /scripts/connection-scaling.js
Test Scripts
websocket-simple.js
- Quick validation test
- 100 virtual users for 2 minutes
- Basic connection and message delivery checks
websocket-test.js
- Comprehensive performance test
- Gradual ramp-up: 1K → 5K → 10K connections
- Measures connection time, latency, throughput
- Includes thresholds for performance validation
connection-scaling.js
- Tests different connection scales
- Configurable via
SCALEenvironment variable (1k, 10k, 100k) - Measures connection establishment time
- Tracks message delivery latency
Metrics Collected
The benchmarks collect the following metrics:
Connection Metrics
- Connection Time: Time to establish WebSocket connection
- Connection Success Rate: Percentage of successful connections
- Connection Duration: How long connections stay alive
Message Metrics
- Message Latency: Time from message creation to delivery (P50, P95, P99)
- Messages Per Second: Throughput of message delivery
- Message Success Rate: Percentage of messages successfully delivered
Resource Metrics
- CPU Usage: Per-instance CPU utilization
- Memory Usage: Per-instance memory consumption
- Memory Per Connection: Average memory used per WebSocket connection
Interpreting Results
Shard Count Comparison
When comparing different shard counts, look for:
-
Connection Time: Lower is better
- More shards should reduce lock contention
- Expect 64 shards to have higher connection times under load
- 256-512 shards typically provide optimal balance
-
Message Latency: Lower is better
- P95 latency should be < 100ms for most scenarios
- Higher shard counts may reduce latency under high concurrency
-
Throughput: Higher is better
- Messages per second should scale with shard count up to a point
- Diminishing returns after optimal shard count
-
Memory Usage: Lower is better
- More shards = slightly more memory overhead
- Balance between performance and memory
Optimal Shard Count
Based on testing, recommended shard counts:
- < 10K connections: 128-256 shards
- 10K-100K connections: 256-512 shards
- 100K-1M connections: 512-1024 shards
- > 1M connections: 1024+ shards (may need custom build)
Benchmark Scenarios
Scenario 1: Connection Scaling
Test how many concurrent connections each configuration can handle:
./benchmark/run-benchmark.sh scale 1k # Start with 1K
./benchmark/run-benchmark.sh scale 10k # Then 10K
./benchmark/run-benchmark.sh scale 100k # Finally 100K
Scenario 2: Shard Comparison
Compare performance across all shard configurations:
./benchmark/run-benchmark.sh all
Scenario 3: Message Throughput
Test message delivery rate with different connection counts:
- Modify k6 scripts to send messages via REST API
- Measure delivery latency through WebSocket
Scenario 4: Latency Testing
Focus on P50, P95, P99 latency metrics:
- Run tests with steady connection count
- Send messages at controlled rate
- Analyze latency distribution
Configuration
Adjusting Shard Counts
Edit docker-compose.benchmark.yml to modify shard counts:
environment:
- GOTIFY_SERVER_STREAM_SHARDCOUNT=256
Adjusting Buffer Sizes
Modify buffer sizes in config files or environment variables:
environment:
- GOTIFY_SERVER_STREAM_READBUFFERSIZE=8192
- GOTIFY_SERVER_STREAM_WRITEBUFFERSIZE=8192
- GOTIFY_SERVER_STREAM_CHANNELBUFFERSIZE=10
Custom k6 Test Parameters
Modify k6 test scripts to adjust:
- Virtual users (VUs)
- Test duration
- Ramp-up/ramp-down stages
- Thresholds
Troubleshooting
Services Won't Start
-
Check Docker resources:
docker system df docker system prune # If needed -
Verify ports are available:
lsof -i :8080-8084 -
Check logs:
docker-compose -f docker-compose.benchmark.yml logs
High Connection Failures
-
Increase system limits:
# Linux: Increase file descriptor limits ulimit -n 65536 -
Check Docker resource limits:
- Increase memory allocation
- Increase CPU allocation
-
Reduce concurrent connections in test scripts
Memory Issues
-
Monitor memory usage:
docker stats -
Reduce number of instances running simultaneously
-
Adjust shard counts (fewer shards = less memory)
Slow Performance
- Check CPU usage:
docker stats - Verify network connectivity between containers
- Check for resource contention
- Consider running tests sequentially instead of parallel
Results Storage
Benchmark results are stored in:
benchmark/results/- Detailed logs per shard configuration- k6 output includes summary statistics
Advanced Usage
Custom Test Scenarios
Create custom k6 scripts in benchmark/k6/:
import ws from 'k6/ws';
import { check } from 'k6';
export const options = {
vus: 1000,
duration: '5m',
};
export default function() {
// Your custom test logic
}
Monitoring with Prometheus
Add Prometheus to docker-compose.benchmark.yml for detailed metrics collection.
Load Balancer Testing
Test with a load balancer in front of multiple instances to simulate production scenarios.
Performance Expectations
Based on optimizations implemented:
- Connection Capacity: 100K-1M+ concurrent connections per instance
- Message Latency: P95 < 100ms for most scenarios
- Throughput: 10K+ messages/second per instance
- Memory: ~2-4KB per connection (varies by shard count)
Contributing
When adding new benchmark scenarios:
- Add k6 script to
benchmark/k6/ - Update this README with usage instructions
- Add configuration if needed
- Test and validate results