mirror of
https://github.com/rustfs/rustfs.git
synced 2026-01-17 09:40:32 +00:00
330 lines
7.6 KiB
Markdown
330 lines
7.6 KiB
Markdown
# RustFS Performance Testing Guide
|
||
|
||
This document describes the recommended tools and workflows for benchmarking RustFS and analyzing performance bottlenecks.
|
||
|
||
## Overview
|
||
|
||
RustFS exposes several complementary tooling options:
|
||
|
||
1. **Profiling** – collect CPU samples through the built-in `pprof` endpoints.
|
||
2. **Load testing** – drive concurrent requests with dedicated client utilities.
|
||
3. **Monitoring and analysis** – inspect collected metrics to locate hotspots.
|
||
|
||
## Prerequisites
|
||
|
||
### 1. Enable profiling support
|
||
|
||
Set the profiling environment variable before launching RustFS:
|
||
|
||
```bash
|
||
export RUSTFS_ENABLE_PROFILING=true
|
||
./rustfs
|
||
```
|
||
|
||
### 2. Install required tooling
|
||
|
||
Make sure the following dependencies are available:
|
||
|
||
```bash
|
||
# Base tools
|
||
curl # HTTP requests
|
||
jq # JSON processing (optional)
|
||
|
||
# Analysis tools
|
||
go # Go pprof CLI (optional, required for protobuf output)
|
||
python3 # Python load-testing scripts
|
||
|
||
# macOS users
|
||
brew install curl jq go python3
|
||
|
||
# Ubuntu/Debian users
|
||
sudo apt-get install curl jq golang-go python3
|
||
```
|
||
|
||
## Performance Testing Methods
|
||
|
||
### Method 1: Use the dedicated profiling script (recommended)
|
||
|
||
The repository ships with a helper script for common profiling flows:
|
||
|
||
```bash
|
||
# Show command help
|
||
./scripts/profile_rustfs.sh help
|
||
|
||
# Check profiler status
|
||
./scripts/profile_rustfs.sh status
|
||
|
||
# Capture a 30 second flame graph
|
||
./scripts/profile_rustfs.sh flamegraph
|
||
|
||
# Download protobuf-formatted samples
|
||
./scripts/profile_rustfs.sh protobuf
|
||
|
||
# Collect both formats
|
||
./scripts/profile_rustfs.sh both
|
||
|
||
# Provide custom arguments
|
||
./scripts/profile_rustfs.sh -d 60 -u http://192.168.1.100:9000 both
|
||
```
|
||
|
||
### Method 2: Run the Python end-to-end tester
|
||
|
||
A Python utility combines background load generation with profiling:
|
||
|
||
```bash
|
||
# Launch the integrated test harness
|
||
python3 test_load.py
|
||
```
|
||
|
||
The script will:
|
||
|
||
1. Launch multi-threaded S3 operations as load.
|
||
2. Pull profiling samples in parallel.
|
||
3. Produce a flame graph for investigation.
|
||
|
||
### Method 3: Simple shell-based load test
|
||
|
||
For quick smoke checks, a lightweight bash script is also provided:
|
||
|
||
```bash
|
||
# Execute a lightweight benchmark
|
||
./simple_load_test.sh
|
||
```
|
||
|
||
## Profiling Output Formats
|
||
|
||
### 1. Flame graph (SVG)
|
||
|
||
- **Purpose**: Visualize CPU time distribution.
|
||
- **File name**: `rustfs_profile_TIMESTAMP.svg`
|
||
- **How to view**: Open the SVG in a browser.
|
||
- **Interpretation tips**:
|
||
- Width reflects CPU time per function.
|
||
- Height illustrates call-stack depth.
|
||
- Click to zoom into specific frames.
|
||
|
||
```bash
|
||
# Example: open the file in a browser
|
||
open profiles/rustfs_profile_20240911_143000.svg
|
||
```
|
||
|
||
### 2. Protobuf samples
|
||
|
||
- **Purpose**: Feed data to the `go tool pprof` command.
|
||
- **File name**: `rustfs_profile_TIMESTAMP.pb`
|
||
- **Tooling**: `go tool pprof`
|
||
|
||
```bash
|
||
# Analyze the protobuf output
|
||
go tool pprof profiles/rustfs_profile_20240911_143000.pb
|
||
|
||
# Common pprof commands
|
||
(pprof) top # Show hottest call sites
|
||
(pprof) list func # Display annotated source for a function
|
||
(pprof) web # Launch the web UI (requires graphviz)
|
||
(pprof) png # Render a PNG flame chart
|
||
(pprof) help # List available commands
|
||
```
|
||
|
||
## API Usage
|
||
|
||
### Check profiling status
|
||
|
||
```bash
|
||
curl "http://127.0.0.1:9000/rustfs/admin/debug/pprof/status"
|
||
```
|
||
|
||
Sample response:
|
||
|
||
```json
|
||
{
|
||
"enabled": "true",
|
||
"sampling_rate": "100"
|
||
}
|
||
```
|
||
|
||
### Capture profiling data
|
||
|
||
```bash
|
||
# Fetch a 30-second flame graph
|
||
curl "http://127.0.0.1:9000/rustfs/admin/debug/pprof/profile?seconds=30&format=flamegraph" \
|
||
-o profile.svg
|
||
|
||
# Fetch protobuf output
|
||
curl "http://127.0.0.1:9000/rustfs/admin/debug/pprof/profile?seconds=30&format=protobuf" \
|
||
-o profile.pb
|
||
```
|
||
|
||
**Parameters**
|
||
- `seconds`: Duration between 1 and 300 seconds.
|
||
- `format`: Output format (`flamegraph`/`svg` or `protobuf`/`pb`).
|
||
|
||
## Load Testing Scenarios
|
||
|
||
### 1. S3 API workload
|
||
|
||
Use the Python harness to exercise a complete S3 workflow:
|
||
|
||
```python
|
||
# Basic configuration
|
||
tester = S3LoadTester(
|
||
endpoint="http://127.0.0.1:9000",
|
||
access_key="rustfsadmin",
|
||
secret_key="rustfsadmin"
|
||
)
|
||
|
||
# Execute the load test
|
||
# Four threads, ten operations each
|
||
tester.run_load_test(num_threads=4, operations_per_thread=10)
|
||
```
|
||
|
||
Each iteration performs:
|
||
1. Upload a 1 MB object.
|
||
2. Download the object.
|
||
3. Delete the object.
|
||
|
||
### 2. Custom load scenarios
|
||
|
||
```bash
|
||
# Create a test bucket
|
||
curl -X PUT "http://127.0.0.1:9000/test-bucket"
|
||
|
||
# Concurrent uploads
|
||
for i in {1..10}; do
|
||
echo "test data $i" | curl -X PUT "http://127.0.0.1:9000/test-bucket/object-$i" -d @- &
|
||
done
|
||
wait
|
||
|
||
# Concurrent downloads
|
||
for i in {1..10}; do
|
||
curl "http://127.0.0.1:9000/test-bucket/object-$i" > /dev/null &
|
||
done
|
||
wait
|
||
```
|
||
|
||
## Profiling Best Practices
|
||
|
||
### 1. Environment preparation
|
||
|
||
- Confirm that `RUSTFS_ENABLE_PROFILING=true` is set.
|
||
- Use an isolated benchmark environment to avoid interference.
|
||
- Reserve disk space for generated profile artifacts.
|
||
|
||
### 2. Data collection tips
|
||
|
||
- **Warm-up**: Run a light workload for 5–10 minutes before sampling.
|
||
- **Sampling window**: Capture 30–60 seconds under steady load.
|
||
- **Multiple samples**: Take several runs to compare results.
|
||
|
||
### 3. Analysis focus areas
|
||
|
||
When inspecting flame graphs, pay attention to:
|
||
|
||
1. **The widest frames** – most CPU time consumed.
|
||
2. **Flat plateaus** – likely bottlenecks.
|
||
3. **Deep call stacks** – recursion or complex logic.
|
||
4. **Unexpected syscalls** – I/O stalls or allocation churn.
|
||
|
||
### 4. Common issues
|
||
|
||
- **Lock contention**: Investigate frames under `std::sync`.
|
||
- **Memory allocation**: Search for `alloc`-related frames.
|
||
- **I/O wait**: Review filesystem or network call stacks.
|
||
- **Serialization overhead**: Look for JSON/XML parsing hotspots.
|
||
|
||
## Troubleshooting
|
||
|
||
### 1. Profiling disabled
|
||
|
||
Error: `{"enabled":"false"}`
|
||
|
||
**Fix**:
|
||
|
||
```bash
|
||
export RUSTFS_ENABLE_PROFILING=true
|
||
# Restart RustFS
|
||
```
|
||
|
||
### 2. Connection refused
|
||
|
||
Error: `Connection refused`
|
||
|
||
**Checklist**:
|
||
- Confirm RustFS is running.
|
||
- Ensure the port number is correct (default 9000).
|
||
- Verify firewall rules.
|
||
|
||
### 3. Oversized profile output
|
||
|
||
If artifacts become too large:
|
||
- Shorten the capture window (e.g., 15–30 seconds).
|
||
- Reduce load-test concurrency.
|
||
- Prefer protobuf output instead of SVG.
|
||
|
||
## Configuration Parameters
|
||
|
||
### Environment variables
|
||
|
||
| Variable | Default | Description |
|
||
|------|--------|------|
|
||
| `RUSTFS_ENABLE_PROFILING` | `false` | Enable profiling support |
|
||
| `RUSTFS_URL` | `http://127.0.0.1:9000` | RustFS endpoint |
|
||
| `PROFILE_DURATION` | `30` | Profiling duration in seconds |
|
||
| `OUTPUT_DIR` | `./profiles` | Output directory |
|
||
|
||
### Script arguments
|
||
|
||
```bash
|
||
./scripts/profile_rustfs.sh [OPTIONS] [COMMAND]
|
||
|
||
OPTIONS:
|
||
-u, --url URL RustFS URL
|
||
-d, --duration SECONDS Profile duration
|
||
-o, --output DIR Output directory
|
||
|
||
COMMANDS:
|
||
status Check profiler status
|
||
flamegraph Collect a flame graph
|
||
protobuf Collect protobuf samples
|
||
both Collect both formats (default)
|
||
```
|
||
|
||
## Output Locations
|
||
|
||
- **Script output**: `./profiles/`
|
||
- **Python script**: `/tmp/rustfs_profiles/`
|
||
- **File naming**: `rustfs_profile_TIMESTAMP.{svg|pb}`
|
||
|
||
## Example Workflow
|
||
|
||
1. **Launch RustFS**
|
||
```bash
|
||
RUSTFS_ENABLE_PROFILING=true ./rustfs
|
||
```
|
||
|
||
2. **Verify profiling availability**
|
||
```bash
|
||
./scripts/profile_rustfs.sh status
|
||
```
|
||
|
||
3. **Start a load test**
|
||
```bash
|
||
python3 test_load.py &
|
||
```
|
||
|
||
4. **Collect samples**
|
||
```bash
|
||
./scripts/profile_rustfs.sh -d 60 both
|
||
```
|
||
|
||
5. **Inspect the results**
|
||
```bash
|
||
# Review the flame graph
|
||
open profiles/rustfs_profile_*.svg
|
||
|
||
# Or analyze the protobuf output
|
||
go tool pprof profiles/rustfs_profile_*.pb
|
||
```
|
||
|
||
Following this workflow helps you understand RustFS performance characteristics, locate bottlenecks, and implement targeted optimizations.
|