diff --git a/.docker/README.md b/.docker/README.md
index 6c553495..05d0210f 100644
--- a/.docker/README.md
+++ b/.docker/README.md
@@ -1,261 +1,131 @@
-# RustFS Docker Images
+# RustFS Docker Infrastructure
-This directory contains Docker configuration files and supporting infrastructure for building and running RustFS container images.
+This directory contains the complete Docker infrastructure for building, deploying, and monitoring RustFS. It provides ready-to-use configurations for development, testing, and production-grade observability.
-## 📁 Directory Structure
+## 📂 Directory Structure
-```
-rustfs/
-├── Dockerfile # Production image (Alpine + pre-built binaries)
-├── Dockerfile.source # Development image (Debian + source build)
-├── docker-buildx.sh # Multi-architecture build script
-├── Makefile # Build automation with simplified commands
-└── .docker/ # Supporting infrastructure
- ├── observability/ # Monitoring and observability configs
- ├── compose/ # Docker Compose configurations
- ├── mqtt/ # MQTT broker configs
- └── openobserve-otel/ # OpenObserve + OpenTelemetry configs
-```
+| Directory | Description | Status |
+| :--- | :--- | :--- |
+| **[`observability/`](observability/README.md)** | **[RECOMMENDED]** Full-stack observability (Prometheus, Grafana, Tempo, Loki). | ✅ Production-Ready |
+| **[`compose/`](compose/README.md)** | Specialized setups (e.g., 4-node distributed cluster testing). | ⚠️ Testing Only |
+| **[`mqtt/`](mqtt/README.md)** | EMQX Broker configuration for MQTT integration testing. | 🧪 Development |
+| **[`openobserve-otel/`](openobserve-otel/README.md)** | Alternative lightweight observability stack using OpenObserve. | 🔄 Alternative |
-## 🎯 Image Variants
+---
-### Core Images
+## 📄 Root Directory Files
-| Image | Base OS | Build Method | Size | Use Case |
-|-------|---------|--------------|------|----------|
-| `production` (default) | Alpine 3.18 | GitHub Releases | Smallest | Production deployment |
-| `source` | Debian Bookworm | Source build | Medium | Custom builds with cross-compilation |
-| `dev` | Debian Bookworm | Development tools | Large | Interactive development |
+The following files in the project root are essential for Docker operations:
-## 🚀 Usage Examples
+### Build Scripts & Dockerfiles
-### Quick Start (Production)
+| File | Description | Usage |
+| :--- | :--- | :--- |
+| **`docker-buildx.sh`** | **Multi-Arch Build Script**<br>Automates building and pushing Docker images for `amd64` and `arm64`. Supports release and dev channels. | `./docker-buildx.sh --push` |
+| **`Dockerfile`** | **Production Image (Alpine)**<br>Lightweight image using musl libc. Downloads pre-built binaries from GitHub Releases. | `docker build -t rustfs:latest .` |
+| **`Dockerfile.glibc`** | **Production Image (Ubuntu)**<br>Standard glibc-based image. Useful if you need specific dynamic libraries. | `docker build -f Dockerfile.glibc .` |
+| **`Dockerfile.source`** | **Development Image**<br>Builds RustFS from source code and includes build tools. Ideal for local development and CI. | `docker build -f Dockerfile.source .` |
+### Docker Compose Configurations
+
+| File | Description | Usage |
+| :--- | :--- | :--- |
+| **`docker-compose.yml`** | **Main Development Setup**<br>Comprehensive setup with profiles for development, observability, and proxying. | `docker compose up -d`<br>`docker compose --profile observability up -d` |
+| **`docker-compose-simple.yml`** | **Quick Start Setup**<br>Minimal configuration running a single RustFS instance with 4 volumes. Perfect for first-time users. | `docker compose -f docker-compose-simple.yml up -d` |
+
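+For orientation, a single-node setup along the lines of `docker-compose-simple.yml` can be sketched as follows. The volume names and mount paths here are illustrative assumptions; the actual file in the project root is authoritative:
+
+```yaml
+# Hypothetical minimal single-node sketch; see docker-compose-simple.yml for the real file.
+services:
+  rustfs:
+    image: rustfs/rustfs:latest
+    ports:
+      - "9000:9000"
+    volumes:
+      - data0:/data/rustfs0   # four data volumes, matching the "4 volumes" layout above
+      - data1:/data/rustfs1
+      - data2:/data/rustfs2
+      - data3:/data/rustfs3
+
+volumes:
+  data0:
+  data1:
+  data2:
+  data3:
+```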
+---
+
+## 🌟 Observability Stack (Recommended)
+
+Located in: [`.docker/observability/`](observability/README.md)
+
+We provide a comprehensive, industry-standard observability stack designed for deep insights into RustFS performance. This is the recommended setup for both development and production monitoring.
+
+### Components
+- **Metrics**: Prometheus (Collection) + Grafana (Visualization)
+- **Traces**: Tempo (Storage) + Jaeger (UI)
+- **Logs**: Loki
+- **Ingestion**: OpenTelemetry Collector
+
+### Key Features
+- **Full Persistence**: All metrics, logs, and traces are saved to Docker volumes, ensuring no data loss on restarts.
+- **Correlation**: Seamlessly jump between Logs, Traces, and Metrics in Grafana.
+- **High Performance**: Optimized configurations for batching, compression, and memory management.
+
+### Quick Start
```bash
-# Default production image (Alpine + GitHub Releases)
-docker run -p 9000:9000 rustfs/rustfs:latest
-
-# Specific version
-docker run -p 9000:9000 rustfs/rustfs:1.2.3
+cd .docker/observability
+docker compose up -d
```
-### Complete Tag Strategy Examples
+---
+## 🧪 Specialized Environments
+
+Located in: [`.docker/compose/`](compose/README.md)
+
+These configurations are tailored for specific testing scenarios that require complex topologies.
+
+### Distributed Cluster (4 Nodes)
+Simulates a real-world distributed environment with 4 RustFS nodes running locally.
```bash
-# Stable Releases
-docker run rustfs/rustfs:1.2.3 # Main version (production)
-docker run rustfs/rustfs:1.2.3-production # Explicit production variant
-docker run rustfs/rustfs:1.2.3-source # Source build variant
-docker run rustfs/rustfs:latest # Latest stable
-
-# Prerelease Versions
-docker run rustfs/rustfs:1.3.0-alpha.2 # Specific alpha version
-docker run rustfs/rustfs:alpha # Latest alpha
-docker run rustfs/rustfs:beta # Latest beta
-docker run rustfs/rustfs:rc # Latest release candidate
-
-# Development Versions
-docker run rustfs/rustfs:dev # Latest main branch development
-docker run rustfs/rustfs:dev-13e4a0b # Specific commit
-docker run rustfs/rustfs:dev-latest # Latest development
-docker run rustfs/rustfs:main-latest # Main branch latest
+docker compose -f .docker/compose/docker-compose.cluster.yaml up -d
```
-### Development Environment
-
+### Integrated Observability Test
+A self-contained environment running 4 RustFS nodes alongside the full observability stack. Useful for end-to-end testing of telemetry.
```bash
-# Quick setup using Makefile (recommended)
-make docker-dev-local # Build development image locally
-make dev-env-start # Start development container
-
-# Manual Docker commands
-docker run -it -v $(pwd):/workspace -p 9000:9000 rustfs/rustfs:latest-dev
-
-# Build from source locally
-docker build -f Dockerfile.source -t rustfs:custom .
-
-# Development with hot reload
-docker-compose up rustfs-dev
+docker compose -f .docker/compose/docker-compose.observability.yaml up -d
```
-## 🏗️ Build Arguments and Scripts
+---
-### Using Makefile Commands (Recommended)
+## 📡 MQTT Integration
-The easiest way to build images using simplified commands:
+Located in: [`.docker/mqtt/`](mqtt/README.md)
+
+Provides an EMQX broker for testing RustFS MQTT features.
+
+### Quick Start
```bash
-# Development images (build from source)
-make docker-dev-local # Build for local use (single arch)
-make docker-dev # Build multi-arch (for CI/CD)
-make docker-dev-push REGISTRY=xxx # Build and push to registry
-
-# Production images (using pre-built binaries)
-make docker-buildx # Build multi-arch production images
-make docker-buildx-push # Build and push production images
-make docker-buildx-version VERSION=v1.0.0 # Build specific version
-
-# Development environment
-make dev-env-start # Start development container
-make dev-env-stop # Stop development container
-make dev-env-restart # Restart development container
-
-# Help
-make help-docker # Show all Docker-related commands
+cd .docker/mqtt
+docker compose up -d
```
+- **Dashboard**: [http://localhost:18083](http://localhost:18083) (Default: `admin` / `public`)
+- **MQTT Port**: `1883`
-### Using docker-buildx.sh (Advanced)
+---
-For direct script usage and advanced scenarios:
+## 👁️ Alternative: OpenObserve
+Located in: [`.docker/openobserve-otel/`](openobserve-otel/README.md)
+
+For users preferring a lightweight, all-in-one solution, we support OpenObserve. It combines logs, metrics, and traces into a single binary and UI.
+
+### Quick Start
```bash
-# Build latest version for all architectures
-./docker-buildx.sh
-
-# Build and push to registry
-./docker-buildx.sh --push
-
-# Build specific version
-./docker-buildx.sh --release v1.2.3
-
-# Build and push specific version
-./docker-buildx.sh --release v1.2.3 --push
+cd .docker/openobserve-otel
+docker compose up -d
```
-### Manual Docker Builds
+---
-All images support dynamic version selection:
+## 🔧 Common Operations
+
+### Cleaning Up
+To stop all containers and remove volumes (**WARNING**: deletes all persisted data):
```bash
-# Build production image with latest release
-docker build --build-arg RELEASE="latest" -t rustfs:latest .
-
-# Build from source with specific target
-docker build -f Dockerfile.source \
- --build-arg TARGETPLATFORM="linux/amd64" \
- -t rustfs:source .
-
-# Development build
-docker build -f Dockerfile.source -t rustfs:dev .
+docker compose down -v
```
-## 🔧 Binary Download Sources
-
-### Unified GitHub Releases
-
-The production image downloads from GitHub Releases for reliability and transparency:
-
-- ✅ **production** → GitHub Releases API with automatic latest detection
-- ✅ **Checksum verification** → SHA256SUMS validation when available
-- ✅ **Multi-architecture** → Supports amd64 and arm64
-
-### Source Build
-
-The source variant compiles from source code with advanced features:
-
-- 🔧 **Cross-compilation** → Supports multiple target platforms via `TARGETPLATFORM`
-- ⚡ **Build caching** → sccache for faster compilation
-- 🎯 **Optimized builds** → Release optimizations with LTO and symbol stripping
-
-## 📋 Architecture Support
-
-All variants support multi-architecture builds:
-
-- **linux/amd64** (x86_64)
-- **linux/arm64** (aarch64)
-
-Architecture is automatically detected during build using Docker's `TARGETARCH` build argument.
-
-## 🔐 Security Features
-
-- **Checksum Verification**: Production image verifies SHA256SUMS when available
-- **Non-root User**: All images run as user `rustfs` (UID 1000)
-- **Minimal Runtime**: Production image only includes necessary dependencies
-- **Secure Defaults**: No hardcoded credentials or keys
-
-## 🛠️ Development Workflow
-
-### Quick Start with Makefile (Recommended)
-
+### Viewing Logs
+To follow logs for a specific service:
```bash
-# 1. Start development environment
-make dev-env-start
-
-# 2. Your development container is now running with:
-# - Port 9000 exposed for RustFS
-# - Port 9010 exposed for admin console
-# - Current directory mounted as /workspace
-
-# 3. Stop when done
-make dev-env-stop
+docker compose logs -f [service_name]
```
-### Manual Development Setup
-
+### Checking Status
+To see the status of all running containers:
```bash
-# Build development image from source
-make docker-dev-local
-
-# Or use traditional Docker commands
-docker build -f Dockerfile.source -t rustfs:dev .
-
-# Run with development tools
-docker run -it -v $(pwd):/workspace -p 9000:9000 rustfs:dev bash
-
-# Or use docker-compose for complex setups
-docker-compose up rustfs-dev
+docker compose ps
```
-
-### Common Development Tasks
-
-```bash
-# Build and test locally
-make build # Build binary natively
-make docker-dev-local # Build development Docker image
-make test # Run tests
-make fmt # Format code
-make clippy # Run linter
-
-# Get help
-make help # General help
-make help-docker # Docker-specific help
-make help-build # Build-specific help
-```
-
-## 🚀 CI/CD Integration
-
-The project uses GitHub Actions for automated multi-architecture Docker builds:
-
-### Automated Builds
-
-- **Tags**: Automatic builds triggered on version tags (e.g., `v1.2.3`)
-- **Main Branch**: Development builds with `dev-latest` and `main-latest` tags
-- **Pull Requests**: Test builds without registry push
-
-### Build Variants
-
-Each build creates three image variants:
-
-- `rustfs/rustfs:v1.2.3` (production - Alpine-based)
-- `rustfs/rustfs:v1.2.3-source` (source build - Debian-based)
-- `rustfs/rustfs:v1.2.3-dev` (development - Debian-based with tools)
-
-### Manual Builds
-
-Trigger custom builds via GitHub Actions:
-
-```bash
-# Use workflow_dispatch to build specific versions
-# Available options: latest, main-latest, dev-latest, v1.2.3, dev-abc123
-```
-
-## 📦 Supporting Infrastructure
-
-The `.docker/` directory contains supporting configuration files:
-
-- **observability/** - Prometheus, Grafana, OpenTelemetry configs
-- **compose/** - Multi-service Docker Compose setups
-- **mqtt/** - MQTT broker configurations
-- **openobserve-otel/** - Log aggregation and tracing setup
-
-See individual README files in each subdirectory for specific usage instructions.
diff --git a/.docker/compose/README.md b/.docker/compose/README.md
index 600a6aad..06a91bee 100644
--- a/.docker/compose/README.md
+++ b/.docker/compose/README.md
@@ -1,80 +1,44 @@
-# Docker Compose Configurations
+# Specialized Docker Compose Configurations
-This directory contains specialized Docker Compose configurations for different use cases.
+This directory contains specialized Docker Compose configurations for specific testing scenarios.
+
+## ⚠️ Important Note
+
+**For Observability:**
+We **strongly recommend** using the new, fully integrated observability stack located in `../observability/`. It provides a production-ready setup with Prometheus, Grafana, Tempo, Loki, and OpenTelemetry Collector, all with persistent storage and optimized configurations.
+
+The `docker-compose.observability.yaml` in this directory is kept for legacy reference or specific minimal testing needs but is **not** the primary recommended setup.
## 📁 Configuration Files
-This directory contains specialized Docker Compose configurations and their associated Dockerfiles, keeping related files organized together.
+### Cluster Testing
-### Main Configuration (Root Directory)
+- **`docker-compose.cluster.yaml`**
+ - **Purpose**: Simulates a 4-node RustFS distributed cluster.
+ - **Use Case**: Testing distributed storage logic, consensus, and failover.
+ - **Nodes**: 4 RustFS instances.
+ - **Storage**: Uses local HTTP endpoints.
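+
+For reference, each node in this cluster is defined roughly as sketched below. The `build`, `ports`, and `networks` keys mirror the actual compose file; the `RUSTFS_VOLUMES` variable and its endpoint syntax are illustrative assumptions — consult `docker-compose.cluster.yaml` for the authoritative definition:
+
+```yaml
+# Hypothetical excerpt for one node; see docker-compose.cluster.yaml for the real definitions.
+services:
+  node1:
+    build:
+      context: ../..
+      dockerfile: Dockerfile.source
+    environment:
+      # Illustrative: each node addresses all peers via their HTTP endpoints
+      - RUSTFS_VOLUMES=http://node{1...4}:9000/data
+    ports:
+      - "9001:9000"
+    networks:
+      - rustfs-network
+```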
-- **`../../docker-compose.yml`** - **Default Production Setup**
- - Complete production-ready configuration
- - Includes RustFS server + full observability stack
- - Supports multiple profiles: `dev`, `observability`, `cache`, `proxy`
- - Recommended for most users
+### Legacy / Minimal Observability
-### Specialized Configurations
-
-- **`docker-compose.cluster.yaml`** - **Distributed Testing**
- - 4-node cluster setup for testing distributed storage
- - Uses local compiled binaries
- - Simulates multi-node environment
- - Ideal for development and cluster testing
-
-- **`docker-compose.observability.yaml`** - **Observability Focus**
- - Specialized setup for testing observability features
- - Includes OpenTelemetry, Jaeger, Prometheus, Loki, Grafana
- - Uses `../../Dockerfile.source` for builds
- - Perfect for observability development
+- **`docker-compose.observability.yaml`**
+ - **Purpose**: A minimal observability setup.
+ - **Status**: **Deprecated**. Please use `../observability/docker-compose.yml` instead.
## 🚀 Usage Examples
-### Production Setup
-
-```bash
-# Start main service
-docker-compose up -d
-
-# Start with development profile
-docker-compose --profile dev up -d
-
-# Start with full observability
-docker-compose --profile observability up -d
-```
-
### Cluster Testing
-```bash
-# Build and start 4-node cluster (run from project root)
-cd .docker/compose
-docker-compose -f docker-compose.cluster.yaml up -d
-
-# Or run directly from project root
-docker-compose -f .docker/compose/docker-compose.cluster.yaml up -d
-```
-
-### Observability Testing
+To start a 4-node cluster for distributed testing:
```bash
-# Start observability-focused environment (run from project root)
-cd .docker/compose
-docker-compose -f docker-compose.observability.yaml up -d
-
-# Or run directly from project root
-docker-compose -f .docker/compose/docker-compose.observability.yaml up -d
+# From project root
+docker compose -f .docker/compose/docker-compose.cluster.yaml up -d
```
-## 🔧 Configuration Overview
+### (Deprecated) Minimal Observability
-| Configuration | Nodes | Storage | Observability | Use Case |
-|---------------|-------|---------|---------------|----------|
-| **Main** | 1 | Volume mounts | Full stack | Production |
-| **Cluster** | 4 | HTTP endpoints | Basic | Testing |
-| **Observability** | 4 | Local data | Advanced | Development |
-
-## 📝 Notes
-
-- Always ensure you have built the required binaries before starting cluster tests
-- The main configuration is sufficient for most use cases
-- Specialized configurations are for specific testing scenarios
+```bash
+# From project root
+docker compose -f .docker/compose/docker-compose.observability.yaml up -d
+```
diff --git a/.docker/compose/docker-compose.observability.yaml b/.docker/compose/docker-compose.observability.yaml
index 08127078..e6495b4c 100644
--- a/.docker/compose/docker-compose.observability.yaml
+++ b/.docker/compose/docker-compose.observability.yaml
@@ -13,65 +13,126 @@
# limitations under the License.
services:
+ # --- Observability Stack ---
+
+ tempo-init:
+ image: busybox:latest
+ command: [ "sh", "-c", "chown -R 10001:10001 /var/tempo" ]
+ volumes:
+ - tempo-data:/var/tempo
+ user: root
+ networks:
+ - rustfs-network
+ restart: "no"
+
+ tempo:
+ image: grafana/tempo:latest
+ user: "10001"
+ command: [ "-config.file=/etc/tempo.yaml" ]
+ volumes:
+ - ../../.docker/observability/tempo.yaml:/etc/tempo.yaml:ro
+ - tempo-data:/var/tempo
+ ports:
+ - "3200:3200" # tempo
+ - "4317" # otlp grpc
+ - "4318" # otlp http
+ restart: unless-stopped
+ networks:
+ - rustfs-network
+
otel-collector:
- image: otel/opentelemetry-collector-contrib:0.129.1
+ image: otel/opentelemetry-collector-contrib:latest
environment:
- TZ=Asia/Shanghai
volumes:
- - ../../.docker/observability/otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml
+ - ../../.docker/observability/otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml:ro
ports:
- - 1888:1888
- - 8888:8888
- - 8889:8889
- - 13133:13133
- - 4317:4317
- - 4318:4318
- - 55679:55679
+ - "1888:1888" # pprof
+ - "8888:8888" # Prometheus metrics for Collector
+ - "8889:8889" # Prometheus metrics for application indicators
+ - "13133:13133" # health check
+ - "4317:4317" # OTLP gRPC
+ - "4318:4318" # OTLP HTTP
+ - "55679:55679" # zpages
networks:
- rustfs-network
+ depends_on:
+ - tempo
+ - jaeger
+ - prometheus
+ - loki
+
jaeger:
- image: jaegertracing/jaeger:2.8.0
+ image: jaegertracing/jaeger:latest
environment:
- TZ=Asia/Shanghai
+ - SPAN_STORAGE_TYPE=badger
+ - BADGER_EPHEMERAL=false
+ - BADGER_DIRECTORY_VALUE=/badger/data
+ - BADGER_DIRECTORY_KEY=/badger/key
+ - COLLECTOR_OTLP_ENABLED=true
+ volumes:
+ - jaeger-data:/badger
ports:
- - "16686:16686"
- - "14317:4317"
- - "14318:4318"
+ - "16686:16686" # Web UI
+ - "14269:14269" # Admin/Metrics
networks:
- rustfs-network
+
prometheus:
- image: prom/prometheus:v3.4.2
+ image: prom/prometheus:latest
environment:
- TZ=Asia/Shanghai
volumes:
- - ../../.docker/observability/prometheus.yml:/etc/prometheus/prometheus.yml
+ - ../../.docker/observability/prometheus.yml:/etc/prometheus/prometheus.yml:ro
+ - prometheus-data:/prometheus
ports:
- "9090:9090"
+ command:
+ - '--config.file=/etc/prometheus/prometheus.yml'
+ - '--web.enable-otlp-receiver'
+ - '--web.enable-remote-write-receiver'
+ - '--enable-feature=promql-experimental-functions'
+ - '--storage.tsdb.path=/prometheus'
+ - '--web.console.libraries=/usr/share/prometheus/console_libraries'
+ - '--web.console.templates=/usr/share/prometheus/consoles'
networks:
- rustfs-network
+
loki:
- image: grafana/loki:3.5.1
+ image: grafana/loki:latest
environment:
- TZ=Asia/Shanghai
volumes:
- - ../../.docker/observability/loki-config.yaml:/etc/loki/local-config.yaml
+ - ../../.docker/observability/loki-config.yaml:/etc/loki/local-config.yaml:ro
+ - loki-data:/loki
ports:
- "3100:3100"
command: -config.file=/etc/loki/local-config.yaml
networks:
- rustfs-network
+
grafana:
- image: grafana/grafana:12.0.2
+ image: grafana/grafana:latest
ports:
- "3000:3000" # Web UI
environment:
- GF_SECURITY_ADMIN_PASSWORD=admin
+ - GF_SECURITY_ADMIN_USER=admin
- TZ=Asia/Shanghai
+ - GF_INSTALL_PLUGINS=grafana-pyroscope-datasource
+ - GF_DASHBOARDS_DEFAULT_HOME_DASHBOARD_PATH=/var/lib/grafana/dashboards/home.json
networks:
- rustfs-network
volumes:
- ../../.docker/observability/grafana/provisioning:/etc/grafana/provisioning:ro
- ../../.docker/observability/grafana/dashboards:/var/lib/grafana/dashboards:ro
+ depends_on:
+ - prometheus
+ - tempo
+ - loki
+
+ # --- RustFS Cluster ---
node1:
build:
@@ -86,9 +147,11 @@ services:
- RUSTFS_OBS_LOGGER_LEVEL=debug
platform: linux/amd64
ports:
- - "9001:9000" # Map port 9001 of the host to port 9000 of the container
+ - "9001:9000"
networks:
- rustfs-network
+ depends_on:
+ - otel-collector
node2:
build:
@@ -103,9 +166,11 @@ services:
- RUSTFS_OBS_LOGGER_LEVEL=debug
platform: linux/amd64
ports:
- - "9002:9000" # Map port 9002 of the host to port 9000 of the container
+ - "9002:9000"
networks:
- rustfs-network
+ depends_on:
+ - otel-collector
node3:
build:
@@ -120,9 +185,11 @@ services:
- RUSTFS_OBS_LOGGER_LEVEL=debug
platform: linux/amd64
ports:
- - "9003:9000" # Map port 9003 of the host to port 9000 of the container
+ - "9003:9000"
networks:
- rustfs-network
+ depends_on:
+ - otel-collector
node4:
build:
@@ -137,9 +204,17 @@ services:
- RUSTFS_OBS_LOGGER_LEVEL=debug
platform: linux/amd64
ports:
- - "9004:9000" # Map port 9004 of the host to port 9000 of the container
+ - "9004:9000"
networks:
- rustfs-network
+ depends_on:
+ - otel-collector
+
+volumes:
+ prometheus-data:
+ tempo-data:
+ loki-data:
+ jaeger-data:
networks:
rustfs-network:
diff --git a/.docker/mqtt/README.md b/.docker/mqtt/README.md
new file mode 100644
index 00000000..f0d05559
--- /dev/null
+++ b/.docker/mqtt/README.md
@@ -0,0 +1,30 @@
+# MQTT Broker (EMQX)
+
+This directory contains the configuration for running an EMQX MQTT broker, which can be used for testing RustFS's MQTT integration.
+
+## 🚀 Quick Start
+
+To start the EMQX broker:
+
+```bash
+docker compose up -d
+```
+
+## 📊 Access
+
+- **Dashboard**: [http://localhost:18083](http://localhost:18083)
+- **Default Credentials**: `admin` / `public`
+- **MQTT Port**: `1883`
+- **WebSocket Port**: `8083`
+
+## 🛠️ Configuration
+
+The `docker-compose.yml` file sets up a single-node EMQX instance.
+
+- **Persistence**: Data is not persisted by default (for testing).
+- **Network**: Uses the default bridge network.
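+
+The service definition in this directory's `docker-compose.yml` can be sketched as follows. The image tag and the commented volume mapping are illustrative assumptions (the ports match those listed above); check the compose file itself before relying on them:
+
+```yaml
+# Hypothetical sketch of the EMQX service; see this directory's docker-compose.yml for the real one.
+services:
+  emqx:
+    image: emqx/emqx:latest
+    ports:
+      - "1883:1883"    # MQTT
+      - "8083:8083"    # MQTT over WebSocket
+      - "18083:18083"  # Dashboard
+    # volumes:                      # uncomment to persist broker state across restarts
+    #   - emqx-data:/opt/emqx/data
+```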
+
+## 📝 Notes
+
+- This setup is intended for development and testing purposes.
+- For production deployments, please refer to the official [EMQX Documentation](https://www.emqx.io/docs/en/latest/).
diff --git a/.docker/nginx/nginx.conf b/.docker/nginx/nginx.conf
new file mode 100644
index 00000000..2e574e04
--- /dev/null
+++ b/.docker/nginx/nginx.conf
@@ -0,0 +1,82 @@
+# Copyright 2024 RustFS Team
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+worker_processes auto;
+pid /var/run/nginx.pid;
+
+events {
+ worker_connections 1024;
+}
+
+http {
+ include /etc/nginx/mime.types;
+ default_type application/octet-stream;
+
+ log_format main '$remote_addr - $remote_user [$time_local] "$request" '
+ '$status $body_bytes_sent "$http_referer" '
+ '"$http_user_agent" "$http_x_forwarded_for"';
+
+ access_log /var/log/nginx/access.log main;
+ error_log /var/log/nginx/error.log warn;
+
+ sendfile on;
+ keepalive_timeout 65;
+
+ # RustFS Server Block
+ server {
+ listen 80;
+ server_name localhost;
+
+ # Redirect HTTP to HTTPS (optional, uncomment if SSL is configured)
+ # return 301 https://$host$request_uri;
+
+ location / {
+ proxy_pass http://rustfs:9000;
+ proxy_set_header Host $host;
+ proxy_set_header X-Real-IP $remote_addr;
+ proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+ proxy_set_header X-Forwarded-Proto $scheme;
+
+ # S3 specific headers
+ proxy_set_header X-Amz-Date $http_x_amz_date;
+ proxy_set_header Authorization $http_authorization;
+
+ # Disable buffering for large uploads
+ proxy_request_buffering off;
+ client_max_body_size 0;
+ }
+
+ location /rustfs/console {
+ proxy_pass http://rustfs:9001;
+ proxy_set_header Host $host;
+ proxy_set_header X-Real-IP $remote_addr;
+ proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+ proxy_set_header X-Forwarded-Proto $scheme;
+ }
+ }
+
+ # SSL Configuration (Example)
+ # server {
+ # listen 443 ssl;
+ # server_name localhost;
+ #
+ # ssl_certificate /etc/nginx/ssl/server.crt;
+ # ssl_certificate_key /etc/nginx/ssl/server.key;
+ #
+ # location / {
+ # proxy_pass http://rustfs:9000;
+ # ...
+ # }
+ # }
+}
diff --git a/.docker/nginx/ssl/.keep b/.docker/nginx/ssl/.keep
new file mode 100644
index 00000000..e69de29b
diff --git a/.docker/observability/.gitignore b/.docker/observability/.gitignore
new file mode 100644
index 00000000..02b805c3
--- /dev/null
+++ b/.docker/observability/.gitignore
@@ -0,0 +1,5 @@
+jaeger-data/*
+loki-data/*
+prometheus-data/*
+tempo-data/*
+grafana-data/*
\ No newline at end of file
diff --git a/.docker/observability/README.md b/.docker/observability/README.md
index e24e928a..43b45cec 100644
--- a/.docker/observability/README.md
+++ b/.docker/observability/README.md
@@ -1,109 +1,85 @@
-# Observability
+# RustFS Observability Stack
-This directory contains the observability stack for the application. The stack is composed of the following components:
+This directory contains the comprehensive observability stack for RustFS, designed to provide deep insights into application performance, logs, and traces.
-- Prometheus v3.2.1
-- Grafana 11.6.0
-- Loki 3.4.2
-- Jaeger 2.4.0
-- Otel Collector 0.120.0 # 0.121.0 remove loki
+## Components
-## Prometheus
+The stack is composed of the following best-in-class open-source components:
-Prometheus is a monitoring and alerting toolkit. It scrapes metrics from instrumented jobs, either directly or via an
-intermediary push gateway for short-lived jobs. It stores all scraped samples locally and runs rules over this data to
-either aggregate and record new time series from existing data or generate alerts. Grafana or other API consumers can be
-used to visualize the collected data.
+- **Prometheus** (v2.53.1): The industry standard for metric collection and alerting.
+- **Grafana** (v11.1.0): The leading platform for observability visualization.
+- **Loki** (v3.1.0): A horizontally-scalable, highly-available, multi-tenant log aggregation system.
+- **Tempo** (v2.5.0): A high-volume, minimal dependency distributed tracing backend.
+- **Jaeger** (v1.59.0): Distributed tracing system (configured as a secondary UI/storage).
+- **OpenTelemetry Collector** (v0.104.0): A vendor-agnostic implementation for receiving, processing, and exporting telemetry data.
-## Grafana
+## Architecture
-Grafana is a multi-platform open-source analytics and interactive visualization web application. It provides charts,
-graphs, and alerts for the web when connected to supported data sources.
+1. **Telemetry Collection**: Applications send OTLP (OpenTelemetry Protocol) data (Metrics, Logs, Traces) to the **OpenTelemetry Collector**.
+2. **Processing & Exporting**: The Collector processes the data (batching, memory limiting) and exports it to the respective backends:
+ - **Traces** -> **Tempo** (Primary) & **Jaeger** (Secondary/Optional)
+ - **Metrics** -> **Prometheus** (via scraping the Collector's exporter)
+ - **Logs** -> **Loki**
+3. **Visualization**: **Grafana** connects to all backends (Prometheus, Tempo, Loki, Jaeger) to provide a unified dashboard experience.
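+
+The routing described above corresponds to a Collector pipeline roughly like the sketch below. Exporter names and endpoints are illustrative assumptions; `otel-collector-config.yaml` in this directory is the authoritative configuration:
+
+```yaml
+# Hypothetical pipeline sketch; see otel-collector-config.yaml for the actual configuration.
+receivers:
+  otlp:
+    protocols:
+      grpc:
+      http:
+processors:
+  memory_limiter:
+  batch:
+exporters:
+  otlp/tempo:
+    endpoint: tempo:4317            # traces to Tempo
+  prometheus:
+    endpoint: 0.0.0.0:8889          # metrics scraped by Prometheus
+  loki:
+    endpoint: http://loki:3100/loki/api/v1/push   # logs to Loki
+service:
+  pipelines:
+    traces:
+      receivers: [otlp]
+      processors: [memory_limiter, batch]
+      exporters: [otlp/tempo]
+    metrics:
+      receivers: [otlp]
+      processors: [memory_limiter, batch]
+      exporters: [prometheus]
+    logs:
+      receivers: [otlp]
+      processors: [memory_limiter, batch]
+      exporters: [loki]
+```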
-## Loki
+## Features
-Loki is a horizontally-scalable, highly-available, multi-tenant log aggregation system inspired by Prometheus. It is
-designed to be very cost-effective and easy to operate. It does not index the contents of the logs, but rather a set of
-labels for each log stream.
+- **Full Persistence**: All data (Metrics, Logs, Traces) is persisted to Docker volumes, ensuring no data loss on restart.
+- **Correlation**: Seamless navigation between Metrics, Logs, and Traces in Grafana.
+ - Jump from a Metric spike to relevant Traces.
+ - Jump from a Trace to relevant Logs.
+- **High Performance**: Optimized configurations for batching, compression, and memory management.
+- **Standardized Protocols**: Built entirely on OpenTelemetry standards.
-## Jaeger
+## Quick Start
-Jaeger is a distributed tracing system released as open source by Uber Technologies. It is used for monitoring and
-troubleshooting microservices-based distributed systems, including:
+### Prerequisites
-- Distributed context propagation
-- Distributed transaction monitoring
-- Root cause analysis
-- Service dependency analysis
-- Performance / latency optimization
+- Docker
+- Docker Compose
-## Otel Collector
+### Deploy
-The OpenTelemetry Collector offers a vendor-agnostic implementation on how to receive, process, and export telemetry
-data. It removes the need to run, operate, and maintain multiple agents/collectors in order to support open-source
-observability data formats (e.g. Jaeger, Prometheus, etc.) sending to one or more open-source or commercial back-ends.
-
-## How to use
-
-To deploy the observability stack, run the following command:
-
-- docker latest version
+Run the following command to start the entire stack:
```bash
-docker compose -f docker-compose.yml -f docker-compose.override.yml up -d
+docker compose up -d
```
-- docker compose v2.0.0 or before
+### Access Dashboards
+
+| Service | URL | Credentials | Description |
+| :--- | :--- | :--- | :--- |
+| **Grafana** | [http://localhost:3000](http://localhost:3000) | `admin` / `admin` | Main visualization hub. |
+| **Prometheus** | [http://localhost:9090](http://localhost:9090) | - | Metric queries and status. |
+| **Jaeger UI** | [http://localhost:16686](http://localhost:16686) | - | Secondary trace visualization. |
+| **Tempo** | [http://localhost:3200](http://localhost:3200) | - | Tempo status/metrics. |
+
+## Configuration
+
+### Data Persistence
+
+Data is stored in the following Docker volumes:
+
+- `prometheus-data`: Prometheus metrics
+- `tempo-data`: Tempo traces (WAL and Blocks)
+- `loki-data`: Loki logs (Chunks and Rules)
+- `jaeger-data`: Jaeger traces (Badger DB)
+
+To clear all data:
```bash
-docker-compose -f docker-compose.yml -f docker-compose.override.yml up -d
+docker compose down -v
```
-To access the Grafana dashboard, navigate to `http://localhost:3000` in your browser. The default username and password
-are `admin` and `admin`, respectively.
-
-To access the Jaeger dashboard, navigate to `http://localhost:16686` in your browser.
-
-To access the Prometheus dashboard, navigate to `http://localhost:9090` in your browser.
-
-## How to stop
-
-To stop the observability stack, run the following command:
-
-```bash
-docker compose -f docker-compose.yml -f docker-compose.override.yml down
-```
-
-## How to remove data
-
-To remove the data generated by the observability stack, run the following command:
-
-```bash
-docker compose -f docker-compose.yml -f docker-compose.override.yml down -v
-```
-
-## How to configure
-
-To configure the observability stack, modify the `docker-compose.override.yml` file. The file contains the following
-
-```yaml
-services:
- prometheus:
- environment:
- - PROMETHEUS_CONFIG_FILE=/etc/prometheus/prometheus.yml
- volumes:
- - ./prometheus.yml:/etc/prometheus/prometheus.yml
-
- grafana:
- environment:
- - GF_SECURITY_ADMIN_PASSWORD=admin
- volumes:
- - ./grafana/provisioning:/etc/grafana/provisioning
-```
-
-The `prometheus` service mounts the `prometheus.yml` file to `/etc/prometheus/prometheus.yml`. The `grafana` service
-mounts the `grafana/provisioning` directory to `/etc/grafana/provisioning`. You can modify these files to configure the
-observability stack.
+### Customization
+
+- **Prometheus**: Edit `prometheus.yml` to add scrape targets or alerting rules.
+- **Grafana**: Dashboards and datasources are provisioned from the `grafana/` directory.
+- **Collector**: Edit `otel-collector-config.yaml` to modify pipelines, processors, or exporters.
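For example, a scrape job for an additional target might look like this (a sketch; the job name and `rustfs:9000` target are placeholders to adjust for your deployment):

```yaml
scrape_configs:
  - job_name: 'rustfs-extra'          # hypothetical job name
    static_configs:
      - targets: [ 'rustfs:9000' ]    # placeholder target; adjust as needed
    scrape_interval: 15s
```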
+
+## Troubleshooting
+
+- **Service Health**: Check the status of all services with `docker compose ps`.
+- **Logs**: Follow the logs of a specific service with `docker compose logs -f <service>`.
+- **OTel Collector**: Check `http://localhost:13133` for health status and `http://localhost:1888/debug/pprof/` for profiling data.
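The provisioned Loki datasource links logs to traces by extracting IDs from lines that match `trace_id=(\w+)`. A quick local sanity check of that pattern (a sketch; the log line is illustrative):

```shell
# A log line in the shape the derived field expects (illustrative):
line='level=info msg="object stored" trace_id=4bf92f3577b34da6a3ce929d0e0e4736'

# Extract the trace ID with an equivalent POSIX-friendly pattern:
tid=$(printf '%s\n' "$line" | grep -oE 'trace_id=[[:alnum:]]+' | cut -d= -f2)
echo "$tid"
```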
diff --git a/.docker/observability/README_ZH.md b/.docker/observability/README_ZH.md
index 48568689..75d6e80e 100644
--- a/.docker/observability/README_ZH.md
+++ b/.docker/observability/README_ZH.md
@@ -1,27 +1,85 @@
-## 部署可观测性系统
+# RustFS 可观测性技术栈
-OpenTelemetry Collector 提供了一个厂商中立的遥测数据处理方案,用于接收、处理和导出遥测数据。它消除了为支持多种开源可观测性数据格式(如
-Jaeger、Prometheus 等)而需要运行和维护多个代理/收集器的必要性。
+本目录包含 RustFS 的全面可观测性技术栈,旨在提供对应用程序性能、日志和追踪的深入洞察。
-### 快速部署
+## 组件
-1. 进入 `.docker/observability` 目录
-2. 执行以下命令启动服务:
+该技术栈由以下一流的开源组件组成:
+
+- **Prometheus** (v2.53.1): 行业标准的指标收集和告警工具。
+- **Grafana** (v11.1.0): 领先的可观测性可视化平台。
+- **Loki** (v3.1.0): 水平可扩展、高可用、多租户的日志聚合系统。
+- **Tempo** (v2.5.0): 高吞吐量、最小依赖的分布式追踪后端。
+- **Jaeger** (v1.59.0): 分布式追踪系统(配置为辅助 UI/存储)。
+- **OpenTelemetry Collector** (v0.104.0): 接收、处理和导出遥测数据的供应商无关实现。
+
+## 架构
+
+1. **遥测收集**: 应用程序将 OTLP (OpenTelemetry Protocol) 数据(指标、日志、追踪)发送到 **OpenTelemetry Collector**。
+2. **处理与导出**: Collector 处理数据(批处理、内存限制)并将其导出到相应的后端:
+ - **追踪** -> **Tempo** (主要) & **Jaeger** (辅助/可选)
+ - **指标** -> **Prometheus** (通过抓取 Collector 的导出器)
+ - **日志** -> **Loki**
+3. **可视化**: **Grafana** 连接到所有后端(Prometheus, Tempo, Loki, Jaeger),提供统一的仪表盘体验。
+
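上述追踪管道在 `otel-collector-config.yaml` 中大致对应如下片段(示意):

```yaml
service:
  pipelines:
    traces:
      receivers: [ otlp ]
      processors: [ memory_limiter, batch ]
      exporters: [ otlp/tempo, otlp/jaeger ]
```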
+## 特性
+
+- **完全持久化**: 所有数据(指标、日志、追踪)都持久化到 Docker 卷,确保重启后无数据丢失。
+- **关联性**: 在 Grafana 中实现指标、日志和追踪之间的无缝导航。
+ - 从指标峰值跳转到相关追踪。
+ - 从追踪跳转到相关日志。
+- **高性能**: 针对批处理、压缩和内存管理进行了优化配置。
+- **标准化协议**: 完全基于 OpenTelemetry 标准构建。
+
+## 快速开始
+
+### 前置条件
+
+- Docker
+- Docker Compose
+
+### 部署
+
+运行以下命令启动整个技术栈:
```bash
-docker compose -f docker-compose.yml up -d
+docker compose up -d
```
-### 访问监控面板
+### 访问仪表盘
-服务启动后,可通过以下地址访问各个监控面板:
+| 服务 | URL | 凭据 | 描述 |
+| :--- | :--- | :--- | :--- |
+| **Grafana** | [http://localhost:3000](http://localhost:3000) | `admin` / `admin` | 主要可视化中心。 |
+| **Prometheus** | [http://localhost:9090](http://localhost:9090) | - | 指标查询和状态。 |
+| **Jaeger UI** | [http://localhost:16686](http://localhost:16686) | - | 辅助追踪可视化。 |
+| **Tempo** | [http://localhost:3200](http://localhost:3200) | - | Tempo 状态/指标。 |
-- Grafana: `http://localhost:3000` (默认账号/密码:`admin`/`admin`)
-- Jaeger: `http://localhost:16686`
-- Prometheus: `http://localhost:9090`
+## 配置
-## 配置可观测性
+### 数据持久化
-```shell
-export RUSTFS_OBS_ENDPOINT="http://localhost:4317" # OpenTelemetry Collector 地址
+数据存储在以下 Docker 卷中:
+
+- `prometheus-data`: Prometheus 指标
+- `tempo-data`: Tempo 追踪 (WAL 和 Blocks)
+- `loki-data`: Loki 日志 (Chunks 和 Rules)
+- `jaeger-data`: Jaeger 追踪 (Badger DB)
+
+要清除所有数据:
+
+```bash
+docker compose down -v
```
+
+### 自定义
+
+- **Prometheus**: 编辑 `prometheus.yml` 以添加抓取目标或告警规则。
+- **Grafana**: 仪表盘和数据源从 `grafana/` 目录预置。
+- **Collector**: 编辑 `otel-collector-config.yaml` 以修改管道、处理器或导出器。
+
+## 故障排除
+
+- **服务健康**: 使用 `docker compose ps` 检查服务健康状况。
+- **日志**: 使用 `docker compose logs -f <service>` 查看特定服务的日志。
+- **OTel Collector**: 检查 `http://localhost:13133` 获取健康状态,检查 `http://localhost:1888/debug/pprof/` 进行性能分析。
diff --git a/.docker/observability/docker-compose.yml b/.docker/observability/docker-compose.yml
index a9d12685..5ed6ac6d 100644
--- a/.docker/observability/docker-compose.yml
+++ b/.docker/observability/docker-compose.yml
@@ -14,6 +14,8 @@
services:
+ # --- Tracing ---
+
tempo-init:
image: busybox:latest
command: [ "sh", "-c", "chown -R 10001:10001 /var/tempo" ]
@@ -26,74 +28,52 @@ services:
tempo:
image: grafana/tempo:latest
- user: "10001" # The container must be started with root to execute chown in the script
- command: [ "-config.file=/etc/tempo.yaml" ] # This is passed as a parameter to the entry point script
+ user: "10001"
+ command: [ "-config.file=/etc/tempo.yaml" ]
volumes:
- ./tempo.yaml:/etc/tempo.yaml:ro
- ./tempo-data:/var/tempo
ports:
- "3200:3200" # tempo
- - "24317:4317" # otlp grpc
- - "24318:4318" # otlp http
+ - "4317" # otlp grpc
+ - "4318" # otlp http
restart: unless-stopped
networks:
- otel-network
healthcheck:
- test: [ "CMD", "wget", "--spider", "-q", "http://localhost:3200/metrics" ]
+ test: [ "CMD-SHELL", "wget --spider -q http://localhost:3200/metrics || exit 1" ]
interval: 10s
timeout: 5s
- retries: 3
- start_period: 15s
-
- otel-collector:
- image: otel/opentelemetry-collector-contrib:latest
- environment:
- - TZ=Asia/Shanghai
- volumes:
- - ./otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml:ro
- ports:
- - "1888:1888" # pprof
- - "8888:8888" # Prometheus metrics for Collector
- - "8889:8889" # Prometheus metrics for application indicators
- - "13133:13133" # health check
- - "4317:4317" # OTLP gRPC
- - "4318:4318" # OTLP HTTP
- - "55679:55679" # zpages
- networks:
- - otel-network
- depends_on:
- jaeger:
- condition: service_started
- tempo:
- condition: service_started
- prometheus:
- condition: service_started
- loki:
- condition: service_started
- healthcheck:
- test: [ "CMD", "wget", "--spider", "-q", "http://localhost:13133" ]
- interval: 10s
- timeout: 5s
- retries: 3
+ retries: 5
+ start_period: 40s
jaeger:
image: jaegertracing/jaeger:latest
environment:
- TZ=Asia/Shanghai
- - SPAN_STORAGE_TYPE=memory
+ - SPAN_STORAGE_TYPE=badger
+ - BADGER_EPHEMERAL=false
+ - BADGER_DIRECTORY_VALUE=/badger/data
+ - BADGER_DIRECTORY_KEY=/badger/key
- COLLECTOR_OTLP_ENABLED=true
+ volumes:
+ - ./jaeger-data:/badger
ports:
- "16686:16686" # Web UI
- - "14317:4317" # OTLP gRPC
- - "14318:4318" # OTLP HTTP
- - "18888:8888" # collector
+ - "14269:14269" # Admin/Metrics
+ - "4317"
+ - "4318"
networks:
- otel-network
healthcheck:
- test: [ "CMD", "wget", "--spider", "-q", "http://localhost:16686" ]
+ test: [ "CMD-SHELL", "wget --spider -q http://localhost:14269 || exit 1" ]
interval: 10s
timeout: 5s
- retries: 3
+ retries: 5
+ start_period: 20s
+
+ # --- Metrics ---
+
prometheus:
image: prom/prometheus:latest
environment:
@@ -105,11 +85,11 @@ services:
- "9090:9090"
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- - '--web.enable-otlp-receiver' # Enable OTLP
- - '--web.enable-remote-write-receiver' # Enable remote write
- - '--enable-feature=promql-experimental-functions' # Enable info()
- - '--storage.tsdb.min-block-duration=15m' # Minimum block duration
- - '--storage.tsdb.max-block-duration=1h' # Maximum block duration
+ - '--web.enable-otlp-receiver'
+ - '--web.enable-remote-write-receiver'
+ - '--enable-feature=promql-experimental-functions'
+ - '--storage.tsdb.min-block-duration=2h'
+ - '--storage.tsdb.max-block-duration=2h'
- '--log.level=info'
- '--storage.tsdb.retention.time=30d'
- '--storage.tsdb.path=/prometheus'
@@ -119,37 +99,78 @@ services:
networks:
- otel-network
healthcheck:
- test: [ "CMD", "wget", "--spider", "-q", "http://localhost:9090/-/healthy" ]
+ test: [ "CMD-SHELL", "wget --spider -q http://localhost:9090/-/healthy || exit 1" ]
interval: 10s
timeout: 5s
retries: 3
+
+ # --- Logging ---
+
loki:
image: grafana/loki:latest
environment:
- TZ=Asia/Shanghai
volumes:
- ./loki-config.yaml:/etc/loki/local-config.yaml:ro
+ - ./loki-data:/loki
ports:
- "3100:3100"
command: -config.file=/etc/loki/local-config.yaml
networks:
- otel-network
healthcheck:
- test: [ "CMD", "wget", "--spider", "-q", "http://localhost:3100/ready" ]
+ test: [ "CMD-SHELL", "wget --spider -q http://localhost:3100/metrics || exit 1" ]
+ interval: 15s
+ timeout: 10s
+ retries: 5
+ start_period: 60s
+
+ # --- Collection ---
+
+ otel-collector:
+ image: otel/opentelemetry-collector-contrib:latest
+ environment:
+ - TZ=Asia/Shanghai
+ volumes:
+ - ./otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml:ro
+ ports:
+ - "1888:1888" # pprof
+ - "8888:8888" # Prometheus metrics for Collector
+ - "8889:8889" # Prometheus metrics for application indicators
+ - "13133:13133" # health check
+ - "4317:4317" # OTLP gRPC
+ - "4318:4318" # OTLP HTTP
+ - "55679:55679" # zpages
+ networks:
+ - otel-network
+ depends_on:
+ - tempo
+ - jaeger
+ - prometheus
+ - loki
+ healthcheck:
+ test: [ "CMD-SHELL", "wget --spider -q http://localhost:13133 || exit 1" ]
interval: 10s
timeout: 5s
retries: 3
+ start_period: 20s
+
+ # --- Visualization ---
+
grafana:
image: grafana/grafana:latest
ports:
- - "3000:3000" # Web UI
+ - "3000:3000"
volumes:
- - ./grafana-datasources.yaml:/etc/grafana/provisioning/datasources/datasources.yaml
+ - ./grafana/provisioning:/etc/grafana/provisioning
+ - ./grafana/dashboards:/var/lib/grafana/dashboards
+ - ./grafana-data:/var/lib/grafana
environment:
- GF_SECURITY_ADMIN_PASSWORD=admin
- GF_SECURITY_ADMIN_USER=admin
- TZ=Asia/Shanghai
- GF_INSTALL_PLUGINS=grafana-pyroscope-datasource
+ - GF_DASHBOARDS_DEFAULT_HOME_DASHBOARD_PATH=/var/lib/grafana/dashboards/home.json
restart: unless-stopped
networks:
- otel-network
@@ -158,7 +179,7 @@ services:
- tempo
- loki
healthcheck:
- test: [ "CMD", "wget", "--spider", "-q", "http://localhost:3000/api/health" ]
+ test: [ "CMD-SHELL", "wget --spider -q http://localhost:3000/api/health || exit 1" ]
interval: 10s
timeout: 5s
retries: 3
@@ -166,11 +187,14 @@ services:
volumes:
prometheus-data:
tempo-data:
+ loki-data:
+ jaeger-data:
+ grafana-data:
networks:
otel-network:
driver: bridge
- name: "network_otel_config"
+ name: "network_otel"
ipam:
config:
- subnet: 172.28.0.0/16
diff --git a/.docker/observability/grafana-data/.gitignore b/.docker/observability/grafana-data/.gitignore
new file mode 100644
index 00000000..f59ec20a
--- /dev/null
+++ b/.docker/observability/grafana-data/.gitignore
@@ -0,0 +1 @@
+*
\ No newline at end of file
diff --git a/.docker/observability/grafana/provisioning/datasources.yaml b/.docker/observability/grafana/provisioning/datasources.yaml
index babfd530..b83f3b30 100644
--- a/.docker/observability/grafana/provisioning/datasources.yaml
+++ b/.docker/observability/grafana/provisioning/datasources.yaml
@@ -1,3 +1,17 @@
+# Copyright 2024 RustFS Team
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
apiVersion: 1
datasources:
@@ -7,102 +21,77 @@ datasources:
access: proxy
orgId: 1
url: http://prometheus:9090
- basicAuth: false
- isDefault: false
+ isDefault: true
version: 1
editable: false
jsonData:
httpMethod: GET
+ exemplarTraceIdDestinations:
+ - name: trace_id
+ datasourceUid: tempo
+
- name: Tempo
type: tempo
+ uid: tempo
access: proxy
orgId: 1
url: http://tempo:3200
- basicAuth: false
- isDefault: true
+ isDefault: false
version: 1
editable: false
- apiVersion: 1
- uid: tempo
jsonData:
httpMethod: GET
serviceMap:
datasourceUid: prometheus
- streamingEnabled:
- search: true
- tracesToLogsV2:
- # Field with an internal link pointing to a logs data source in Grafana.
- # datasourceUid value must match the uid value of the logs data source.
- datasourceUid: 'loki'
- spanStartTimeShift: '-1h'
- spanEndTimeShift: '1h'
- tags: [ 'job', 'instance', 'pod', 'namespace' ]
- filterByTraceID: false
+ tracesToLogs:
+ datasourceUid: loki
+ tags: [ 'job', 'instance', 'pod', 'namespace', 'service.name' ]
+ mappedTags: [ { key: 'service.name', value: 'app' } ]
+ spanStartTimeShift: '1s'
+ spanEndTimeShift: '-1s'
+ filterByTraceID: true
filterBySpanID: false
- customQuery: true
- query: 'method="$${__span.tags.method}"'
- tracesToMetrics:
- datasourceUid: 'prometheus'
- spanStartTimeShift: '-1h'
- spanEndTimeShift: '1h'
- tags: [ { key: 'service.name', value: 'service' }, { key: 'job' } ]
- queries:
- - name: 'Sample query'
- query: 'sum(rate(traces_spanmetrics_latency_bucket{$$__tags}[5m]))'
- tracesToProfiles:
- datasourceUid: 'grafana-pyroscope-datasource'
- tags: [ 'job', 'instance', 'pod', 'namespace' ]
- profileTypeId: 'process_cpu:cpu:nanoseconds:cpu:nanoseconds'
- customQuery: true
- query: 'method="$${__span.tags.method}"'
- serviceMap:
- datasourceUid: 'prometheus'
- nodeGraph:
- enabled: true
- search:
- hide: false
- traceQuery:
- timeShiftEnabled: true
- spanStartTimeShift: '-1h'
- spanEndTimeShift: '1h'
- spanBar:
- type: 'Tag'
- tag: 'http.path'
- streamingEnabled:
- search: true
- - name: Jaeger
- type: jaeger
- uid: Jaeger
- url: http://jaeger:16686
- basicAuth: false
- access: proxy
- readOnly: false
- isDefault: false
- jsonData:
- tracesToLogsV2:
- # Field with an internal link pointing to a logs data source in Grafana.
- # datasourceUid value must match the uid value of the logs data source.
- datasourceUid: 'loki'
- spanStartTimeShift: '1h'
- spanEndTimeShift: '-1h'
- tags: [ 'job', 'instance', 'pod', 'namespace' ]
- filterByTraceID: false
- filterBySpanID: false
- customQuery: true
- query: 'method="$${__span.tags.method}"'
tracesToMetrics:
- datasourceUid: 'Prometheus'
- spanStartTimeShift: '1h'
- spanEndTimeShift: '-1h'
- tags: [ { key: 'service.name', value: 'service' }, { key: 'job' } ]
+ datasourceUid: prometheus
+ tags: [ { key: 'service.name' }, { key: 'job' } ]
queries:
- - name: 'Sample query'
- query: 'sum(rate(traces_spanmetrics_latency_bucket{$$__tags}[5m]))'
+ - name: 'Service-Level Latency'
+ query: 'sum(rate(traces_spanmetrics_latency_bucket{$$__tags}[5m])) by (le)'
+ - name: 'Service-Level Calls'
+ query: 'sum(rate(traces_spanmetrics_calls_total{$$__tags}[5m]))'
+ - name: 'Service-Level Errors'
+ query: 'sum(rate(traces_spanmetrics_calls_total{status_code="ERROR", $$__tags}[5m]))'
nodeGraph:
enabled: true
- traceQuery:
- timeShiftEnabled: true
- spanStartTimeShift: '1h'
- spanEndTimeShift: '-1h'
- spanBar:
- type: 'None'
\ No newline at end of file
+
+ - name: Loki
+ type: loki
+ uid: loki
+ orgId: 1
+ url: http://loki:3100
+ isDefault: false
+ version: 1
+ editable: false
+ jsonData:
+ derivedFields:
+ - datasourceUid: tempo
+ matcherRegex: 'trace_id=(\w+)'
+ name: 'TraceID'
+ url: '$${__value.raw}'
+
+ - name: Jaeger
+ type: jaeger
+ uid: jaeger
+ url: http://jaeger:16686
+ access: proxy
+ isDefault: false
+ editable: false
+ jsonData:
+ tracesToLogs:
+ datasourceUid: loki
+ tags: [ 'job', 'instance', 'pod', 'namespace', 'service.name' ]
+ mappedTags: [ { key: 'service.name', value: 'app' } ]
+ spanStartTimeShift: '1s'
+ spanEndTimeShift: '-1s'
+ filterByTraceID: true
+ filterBySpanID: false
diff --git a/.docker/observability/jaeger-config.yaml b/.docker/observability/jaeger-config.yaml
index 9f1f1ca0..271b3f66 100644
--- a/.docker/observability/jaeger-config.yaml
+++ b/.docker/observability/jaeger-config.yaml
@@ -31,29 +31,19 @@ service:
host: 0.0.0.0
port: 8888
logs:
- level: debug
- # TODO Initialize telemetry tracer once OTEL released new feature.
- # https://github.com/open-telemetry/opentelemetry-collector/issues/10663
+ level: info
extensions:
healthcheckv2:
use_v2: true
http:
- # pprof:
- # endpoint: 0.0.0.0:1777
- # zpages:
- # endpoint: 0.0.0.0:55679
-
jaeger_query:
storage:
- traces: some_store
- traces_archive: another_store
+ traces: badger_store
ui:
config_file: ./cmd/jaeger/config-ui.json
log_access: true
- # The maximum duration that is considered for clock skew adjustments.
- # Defaults to 0 seconds, which means it's disabled.
max_clock_skew_adjust: 0s
grpc:
endpoint: 0.0.0.0:16685
@@ -62,26 +52,16 @@ extensions:
jaeger_storage:
backends:
- some_store:
- memory:
- max_traces: 1000000
- max_events: 100000
- another_store:
- memory:
- max_traces: 1000000
- metric_backends:
- some_metrics_storage:
- prometheus:
- endpoint: http://prometheus:9090
- normalize_calls: true
- normalize_duration: true
+ badger_store:
+ badger:
+ ephemeral: false
+ directory_key: /badger/key
+ directory_value: /badger/data
+ span_store_ttl: 72h
remote_sampling:
- # You can either use file or adaptive sampling strategy in remote_sampling
- # file:
- # path: ./cmd/jaeger/sampling-strategies.json
adaptive:
- sampling_store: some_store
+ sampling_store: badger_store
initial_sampling_probability: 0.1
http:
grpc:
@@ -103,12 +83,8 @@ receivers:
processors:
batch:
- metadata_keys: [ "span.kind", "http.method", "http.status_code", "db.system", "db.statement", "messaging.system", "messaging.destination", "messaging.operation","span.events","span.links" ]
- # Adaptive Sampling Processor is required to support adaptive sampling.
- # It expects remote_sampling extension with `adaptive:` config to be enabled.
adaptive_sampling:
exporters:
jaeger_storage_exporter:
- trace_storage: some_store
-
+ trace_storage: badger_store
diff --git a/.docker/observability/jaeger-data/.gitignore b/.docker/observability/jaeger-data/.gitignore
new file mode 100644
index 00000000..f59ec20a
--- /dev/null
+++ b/.docker/observability/jaeger-data/.gitignore
@@ -0,0 +1 @@
+*
\ No newline at end of file
diff --git a/.docker/observability/loki-config.yaml b/.docker/observability/loki-config.yaml
index 4f5add74..daee60c6 100644
--- a/.docker/observability/loki-config.yaml
+++ b/.docker/observability/loki-config.yaml
@@ -17,16 +17,16 @@ auth_enabled: false
server:
http_listen_port: 3100
grpc_listen_port: 9096
- log_level: debug
+ log_level: info
grpc_server_max_concurrent_streams: 1000
common:
instance_addr: 127.0.0.1
- path_prefix: /tmp/loki
+ path_prefix: /loki
storage:
filesystem:
- chunks_directory: /tmp/loki/chunks
- rules_directory: /tmp/loki/rules
+ chunks_directory: /loki/chunks
+ rules_directory: /loki/rules
replication_factor: 1
ring:
kvstore:
@@ -66,17 +66,3 @@ ruler:
frontend:
encoding: protobuf
-
-
-# By default, Loki will send anonymous, but uniquely-identifiable usage and configuration
-# analytics to Grafana Labs. These statistics are sent to https://stats.grafana.org/
-#
-# Statistics help us better understand how Loki is used, and they show us performance
-# levels for most users. This helps us prioritize features and documentation.
-# For more information on what's sent, look at
-# https://github.com/grafana/loki/blob/main/pkg/analytics/stats.go
-# Refer to the buildReport method to see what goes into a report.
-#
-# If you would like to disable reporting, uncomment the following lines:
-#analytics:
-# reporting_enabled: false
diff --git a/.docker/observability/loki-data/.gitignore b/.docker/observability/loki-data/.gitignore
new file mode 100644
index 00000000..f59ec20a
--- /dev/null
+++ b/.docker/observability/loki-data/.gitignore
@@ -0,0 +1 @@
+*
\ No newline at end of file
diff --git a/.docker/observability/otel-collector-config.yaml b/.docker/observability/otel-collector-config.yaml
index 078318f1..da566c36 100644
--- a/.docker/observability/otel-collector-config.yaml
+++ b/.docker/observability/otel-collector-config.yaml
@@ -15,69 +15,70 @@
receivers:
otlp:
protocols:
- grpc: # OTLP gRPC receiver
+ grpc:
endpoint: 0.0.0.0:4317
- http: # OTLP HTTP receiver
+ http:
endpoint: 0.0.0.0:4318
processors:
- batch: # Batch processor to improve throughput
- timeout: 5s
- send_batch_size: 1000
- metadata_keys: [ ]
- metadata_cardinality_limit: 1000
+ batch:
+ timeout: 1s
+ send_batch_size: 1024
memory_limiter:
check_interval: 1s
- limit_mib: 512
+ limit_mib: 1024
+ spike_limit_mib: 256
transform/logs:
log_statements:
- context: log
statements:
- # Extract Body as attribute "message"
- set(attributes["message"], body.string)
- # Retain the original Body
- set(attributes["log.body"], body.string)
exporters:
- otlp/traces: # OTLP exporter for trace data
- endpoint: "http://jaeger:4317" # OTLP gRPC endpoint for Jaeger
+ otlp/tempo:
+ endpoint: "tempo:4317"
tls:
- insecure: true # TLS is disabled in the development environment and a certificate needs to be configured in the production environment.
- compression: gzip # Enable compression to reduce network bandwidth
+ insecure: true
+ compression: gzip
retry_on_failure:
- enabled: true # Enable retry on failure
- initial_interval: 1s # Initial interval for retry
- max_interval: 30s # Maximum interval for retry
- max_elapsed_time: 300s # Maximum elapsed time for retry
+ enabled: true
+ initial_interval: 1s
+ max_interval: 30s
+ max_elapsed_time: 300s
sending_queue:
- enabled: true # Enable sending queue
- num_consumers: 10 # Number of consumers
- queue_size: 5000 # Queue size
- otlp/tempo: # OTLP exporter for trace data
- endpoint: "http://tempo:4317" # OTLP gRPC endpoint for tempo
+ enabled: true
+ num_consumers: 10
+ queue_size: 5000
+
+ otlp/jaeger:
+ endpoint: "jaeger:4317"
tls:
- insecure: true # TLS is disabled in the development environment and a certificate needs to be configured in the production environment.
- compression: gzip # Enable compression to reduce network bandwidth
+ insecure: true
+ compression: gzip
retry_on_failure:
- enabled: true # Enable retry on failure
- initial_interval: 1s # Initial interval for retry
- max_interval: 30s # Maximum interval for retry
- max_elapsed_time: 300s # Maximum elapsed time for retry
+ enabled: true
+ initial_interval: 1s
+ max_interval: 30s
+ max_elapsed_time: 300s
sending_queue:
- enabled: true # Enable sending queue
- num_consumers: 10 # Number of consumers
- queue_size: 5000 # Queue size
- prometheus: # Prometheus exporter for metrics data
- endpoint: "0.0.0.0:8889" # Prometheus scraping endpoint
- send_timestamps: true # Send timestamp
- metric_expiration: 5m # Metric expiration time
+ enabled: true
+ num_consumers: 10
+ queue_size: 5000
+
+ prometheus:
+ endpoint: "0.0.0.0:8889"
+ send_timestamps: true
+ metric_expiration: 5m
resource_to_telemetry_conversion:
- enabled: true # Enable resource to telemetry conversion
- otlphttp/loki: # Loki exporter for log data
+ enabled: true
+
+ otlphttp/loki:
endpoint: "http://loki:3100/otlp"
tls:
insecure: true
- compression: gzip # Enable compression to reduce network bandwidth
+ compression: gzip
+
extensions:
health_check:
endpoint: 0.0.0.0:13133
@@ -85,13 +86,14 @@ extensions:
endpoint: 0.0.0.0:1888
zpages:
endpoint: 0.0.0.0:55679
+
service:
- extensions: [ health_check, pprof, zpages ] # Enable extension
+ extensions: [ health_check, pprof, zpages ]
pipelines:
traces:
receivers: [ otlp ]
processors: [ memory_limiter, batch ]
- exporters: [ otlp/traces, otlp/tempo ]
+ exporters: [ otlp/tempo, otlp/jaeger ]
metrics:
receivers: [ otlp ]
processors: [ batch ]
@@ -102,20 +104,13 @@ service:
exporters: [ otlphttp/loki ]
telemetry:
logs:
- level: "debug" # Collector log level
- encoding: "json" # Log encoding: console or json
+ level: "info"
+ encoding: "json"
metrics:
- level: "detailed" # Can be basic, normal, detailed
+ level: "normal"
readers:
- - periodic:
- exporter:
- otlp:
- protocol: http/protobuf
- endpoint: http://otel-collector:4318
- pull:
exporter:
prometheus:
host: '0.0.0.0'
port: 8888
-
-
diff --git a/.docker/observability/prometheus.yml b/.docker/observability/prometheus.yml
index 88b0d0af..25266a3b 100644
--- a/.docker/observability/prometheus.yml
+++ b/.docker/observability/prometheus.yml
@@ -17,27 +17,40 @@ global:
evaluation_interval: 15s
external_labels:
cluster: 'rustfs-dev' # Label to identify the cluster
- relica: '1' # Replica identifier
+ replica: '1' # Replica identifier
scrape_configs:
- - job_name: 'otel-collector-internal'
+ - job_name: 'otel-collector'
static_configs:
- targets: [ 'otel-collector:8888' ] # Scrape metrics from Collector
scrape_interval: 10s
+
- job_name: 'rustfs-app-metrics'
static_configs:
- targets: [ 'otel-collector:8889' ] # Application indicators
scrape_interval: 15s
metric_relabel_configs:
+ - source_labels: [ __name__ ]
+ regex: 'go_.*'
+ action: drop # Drop Go runtime metrics if not needed
+
- job_name: 'tempo'
static_configs:
- targets: [ 'tempo:3200' ] # Scrape metrics from Tempo
+
- job_name: 'jaeger'
static_configs:
- - targets: [ 'jaeger:8888' ] # Jaeger admin port
+ - targets: [ 'jaeger:14269' ] # Jaeger admin port (14269 is standard for admin/metrics)
+
+ - job_name: 'loki'
+ static_configs:
+ - targets: [ 'loki:3100' ]
+
+ - job_name: 'prometheus'
+ static_configs:
+ - targets: [ 'localhost:9090' ]
otlp:
- # Recommended attributes to be promoted to labels.
promote_resource_attributes:
- service.instance.id
- service.name
@@ -56,10 +69,8 @@ otlp:
- k8s.pod.name
- k8s.replicaset.name
- k8s.statefulset.name
- # Ingest OTLP data keeping all characters in metric/label names.
translation_strategy: NoUTF8EscapingWithSuffixes
storage:
- # OTLP is a push-based protocol, Out of order samples is a common scenario.
tsdb:
- out_of_order_time_window: 30m
\ No newline at end of file
+ out_of_order_time_window: 30m
diff --git a/.docker/observability/tempo.yaml b/.docker/observability/tempo.yaml
index 714d1310..3099aec9 100644
--- a/.docker/observability/tempo.yaml
+++ b/.docker/observability/tempo.yaml
@@ -1,18 +1,21 @@
-stream_over_http_enabled: true
+# Copyright 2024 RustFS Team
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
server:
http_listen_port: 3200
log_level: info
-query_frontend:
- search:
- duration_slo: 5s
- throughput_bytes_slo: 1.073741824e+09
- metadata_slo:
- duration_slo: 5s
- throughput_bytes_slo: 1.073741824e+09
- trace_by_id:
- duration_slo: 5s
-
distributor:
receivers:
otlp:
@@ -25,10 +28,6 @@ distributor:
ingester:
max_block_duration: 5m # cut the headblock when this much time passes. this is being set for demo purposes and should probably be left alone normally
-compactor:
- compaction:
- block_retention: 1h # overall Tempo trace retention. set for demo purposes
-
metrics_generator:
registry:
external_labels:
@@ -49,9 +48,3 @@ storage:
path: /var/tempo/wal # where to store the wal locally
local:
path: /var/tempo/blocks
-
-overrides:
- defaults:
- metrics_generator:
- processors: [ service-graphs, span-metrics, local-blocks ] # enables metrics generator
- generate_native_histograms: both
\ No newline at end of file
diff --git a/.docker/openobserve-otel/README.md b/.docker/openobserve-otel/README.md
index fcf89c0d..bbc863d3 100644
--- a/.docker/openobserve-otel/README.md
+++ b/.docker/openobserve-otel/README.md
@@ -5,71 +5,57 @@
English | [中文](README_ZH.md)
-This directory contains the configuration files for setting up an observability stack with OpenObserve and OpenTelemetry
-Collector.
+This directory contains the configuration for an **alternative** observability stack using OpenObserve.
-### Overview
+## ⚠️ Note
-This setup provides a complete observability solution for your applications:
+For the **recommended** observability stack (Prometheus, Grafana, Tempo, Loki), please see `../observability/`.
-- **OpenObserve**: A modern, open-source observability platform for logs, metrics, and traces.
-- **OpenTelemetry Collector**: Collects and processes telemetry data before sending it to OpenObserve.
+## 🌟 Overview
-### Setup Instructions
+OpenObserve is a lightweight, all-in-one observability platform that handles logs, metrics, and traces in a single binary. This setup is ideal for:
+- Resource-constrained environments.
+- Quick setup and testing.
+- Users who prefer a unified UI.
-1. **Prerequisites**:
- - Docker and Docker Compose installed
- - Sufficient memory resources (minimum 2GB recommended)
+## 🚀 Quick Start
-2. **Starting the Services**:
- ```bash
- cd .docker/openobserve-otel
- docker compose -f docker-compose.yml up -d
- ```
+### 1. Start Services
-3. **Accessing the Dashboard**:
- - OpenObserve UI: http://localhost:5080
- - Default credentials:
- - Username: root@rustfs.com
- - Password: rustfs123
+```bash
+cd .docker/openobserve-otel
+docker compose up -d
+```
-### Configuration
+### 2. Access Dashboard
-#### OpenObserve Configuration
+- **URL**: [http://localhost:5080](http://localhost:5080)
+- **Username**: `root@rustfs.com`
+- **Password**: `rustfs123`
-The OpenObserve service is configured with:
+## 🛠️ Configuration
-- Root user credentials
-- Data persistence through a volume mount
-- Memory cache enabled
-- Health checks
-- Exposed ports:
- - 5080: HTTP API and UI
- - 5081: OTLP gRPC
+### OpenObserve
-#### OpenTelemetry Collector Configuration
+- **Persistence**: Data is persisted to a Docker volume.
+- **Ports**:
+ - `5080`: HTTP API and UI
+ - `5081`: OTLP gRPC
-The collector is configured to:
+### OpenTelemetry Collector
-- Receive telemetry data via OTLP (HTTP and gRPC)
-- Collect logs from files
-- Process data in batches
-- Export data to OpenObserve
-- Manage memory usage
+- **Receivers**: OTLP (gRPC `4317`, HTTP `4318`)
+- **Exporters**: Sends data to OpenObserve.
-### Integration with Your Application
+## 🔗 Integration
-To send telemetry data from your application, configure your OpenTelemetry SDK to send data to:
+Configure your application to send OTLP data to the collector:
-- OTLP gRPC: `localhost:4317`
-- OTLP HTTP: `localhost:4318`
+- **Endpoint**: `http://localhost:4318` (HTTP) or `localhost:4317` (gRPC)
-For example, in a Rust application using the `rustfs-obs` library:
+Example for RustFS:
```bash
export RUSTFS_OBS_ENDPOINT=http://localhost:4318
-export RUSTFS_OBS_SERVICE_NAME=yourservice
-export RUSTFS_OBS_SERVICE_VERSION=1.0.0
-export RUSTFS_OBS_ENVIRONMENT=development
+export RUSTFS_OBS_SERVICE_NAME=rustfs-node-1
```
-
diff --git a/.docker/openobserve-otel/README_ZH.md b/.docker/openobserve-otel/README_ZH.md
index 2e2e80c9..d926d8bd 100644
--- a/.docker/openobserve-otel/README_ZH.md
+++ b/.docker/openobserve-otel/README_ZH.md
@@ -5,71 +5,57 @@
[English](README.md) | 中文
-## 中文
+本目录包含使用 OpenObserve 的**替代**可观测性技术栈配置。
-本目录包含搭建 OpenObserve 和 OpenTelemetry Collector 可观测性栈的配置文件。
+## ⚠️ 注意
-### 概述
+对于**推荐**的可观测性技术栈(Prometheus, Grafana, Tempo, Loki),请参阅 `../observability/`。
-此设置为应用程序提供了完整的可观测性解决方案:
+## 🌟 概览
-- **OpenObserve**:现代化、开源的可观测性平台,用于日志、指标和追踪。
-- **OpenTelemetry Collector**:收集和处理遥测数据,然后将其发送到 OpenObserve。
+OpenObserve 是一个轻量级、一体化的可观测性平台,在一个二进制文件中处理日志、指标和追踪。此设置非常适合:
+- 资源受限的环境。
+- 快速设置和测试。
+- 喜欢统一 UI 的用户。
-### 设置说明
+## 🚀 快速开始
-1. **前提条件**:
- - 已安装 Docker 和 Docker Compose
- - 足够的内存资源(建议至少 2GB)
-
-2. **启动服务**:
- ```bash
- cd .docker/openobserve-otel
- docker compose -f docker-compose.yml up -d
- ```
-
-3. **访问仪表板**:
- - OpenObserve UI:http://localhost:5080
- - 默认凭据:
- - 用户名:root@rustfs.com
- - 密码:rustfs123
-
-### 配置
-
-#### OpenObserve 配置
-
-OpenObserve 服务配置:
-
-- 根用户凭据
-- 通过卷挂载实现数据持久化
-- 启用内存缓存
-- 健康检查
-- 暴露端口:
- - 5080:HTTP API 和 UI
- - 5081:OTLP gRPC
-
-#### OpenTelemetry Collector 配置
-
-收集器配置为:
-
-- 通过 OTLP(HTTP 和 gRPC)接收遥测数据
-- 从文件中收集日志
-- 批处理数据
-- 将数据导出到 OpenObserve
-- 管理内存使用
-
-### 与应用程序集成
-
-要从应用程序发送遥测数据,将 OpenTelemetry SDK 配置为发送数据到:
-
-- OTLP gRPC:`localhost:4317`
-- OTLP HTTP:`localhost:4318`
-
-例如,在使用 `rustfs-obs` 库的 Rust 应用程序中:
+### 1. 启动服务
```bash
-export RUSTFS_OBS_ENDPOINT=http://localhost:4317
-export RUSTFS_OBS_SERVICE_NAME=yourservice
-export RUSTFS_OBS_SERVICE_VERSION=1.0.0
-export RUSTFS_OBS_ENVIRONMENT=development
-```
\ No newline at end of file
+cd .docker/openobserve-otel
+docker compose up -d
+```
+
+### 2. 访问仪表盘
+
+- **URL**: [http://localhost:5080](http://localhost:5080)
+- **用户名**: `root@rustfs.com`
+- **密码**: `rustfs123`
+
+## 🛠️ 配置
+
+### OpenObserve
+
+- **持久化**: 数据持久化到 Docker 卷。
+- **端口**:
+ - `5080`: HTTP API 和 UI
+ - `5081`: OTLP gRPC
+
+### OpenTelemetry Collector
+
+- **接收器**: OTLP (gRPC `4317`, HTTP `4318`)
+- **导出器**: 将数据发送到 OpenObserve。
+
+## 🔗 集成
+
+配置您的应用程序将 OTLP 数据发送到收集器:
+
+- **端点**: `http://localhost:4318` (HTTP) 或 `localhost:4317` (gRPC)
+
+RustFS 示例:
+
+```bash
+export RUSTFS_OBS_ENDPOINT=http://localhost:4318
+export RUSTFS_OBS_SERVICE_NAME=rustfs-node-1
+```
diff --git a/Cargo.lock b/Cargo.lock
index e549865d..b42e5a9e 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -530,9 +530,9 @@ dependencies = [
[[package]]
name = "async-compression"
-version = "0.4.40"
+version = "0.4.41"
source = "registry+https://github.com/rust-lang/crates.io-index"
-checksum = "7d67d43201f4d20c78bcda740c142ca52482d81da80681533d33bf3f0596c8e2"
+checksum = "d0f9ee0f6e02ffd7ad5816e9464499fba7b3effd01123b515c41d1697c43dad1"
dependencies = [
"compression-codecs",
"compression-core",
@@ -8119,7 +8119,7 @@ checksum = "9774ba4a74de5f7b1c1451ed6cd5285a32eddb5cccb8cc655a4e50009e06477f"
[[package]]
name = "s3s"
version = "0.13.0-alpha.3"
-source = "git+https://github.com/s3s-project/s3s.git?rev=61b96d11de81c508ba5361864676824f318ef65c#61b96d11de81c508ba5361864676824f318ef65c"
+source = "git+https://github.com/s3s-project/s3s.git?rev=218000387f4c3e67ad478bfc4587931f88b37006#218000387f4c3e67ad478bfc4587931f88b37006"
dependencies = [
"arc-swap",
"arrayvec",
diff --git a/Cargo.toml b/Cargo.toml
index d8dd651c..58d64c36 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -105,7 +105,7 @@ rustfs-protocols = { path = "crates/protocols", version = "0.0.5" }
# Async Runtime and Networking
async-channel = "2.5.0"
-async-compression = { version = "0.4.40" }
+async-compression = { version = "0.4.41" }
async-recursion = "1.1.1"
async-trait = "0.1.89"
axum = "0.8.8"
@@ -235,7 +235,7 @@ rumqttc = { version = "0.25.1" }
rustix = { version = "1.1.4", features = ["fs"] }
rust-embed = { version = "8.11.0" }
rustc-hash = { version = "2.1.1" }
-s3s = { version = "0.13.0-alpha.3", features = ["minio"], git = "https://github.com/s3s-project/s3s.git", rev = "61b96d11de81c508ba5361864676824f318ef65c" }
+s3s = { version = "0.13.0-alpha.3", features = ["minio"], git = "https://github.com/s3s-project/s3s.git", rev = "218000387f4c3e67ad478bfc4587931f88b37006" }
serial_test = "3.4.0"
shadow-rs = { version = "1.7.0", default-features = false }
siphasher = "1.0.2"
diff --git a/Dockerfile b/Dockerfile
index db9c0eb7..6cb8eabb 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,3 +1,17 @@
+# Copyright 2024 RustFS Team
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
FROM alpine:3.23 AS build
ARG TARGETARCH
diff --git a/Dockerfile.glibc b/Dockerfile.glibc
index 434e8fa0..0d06d71b 100644
--- a/Dockerfile.glibc
+++ b/Dockerfile.glibc
@@ -1,3 +1,17 @@
+# Copyright 2024 RustFS Team
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
FROM ubuntu:24.04 AS build
ARG TARGETARCH
diff --git a/Dockerfile.source b/Dockerfile.source
index 3dc2304b..764f8513 100644
--- a/Dockerfile.source
+++ b/Dockerfile.source
@@ -1,4 +1,18 @@
# syntax=docker/dockerfile:1.6
+# Copyright 2024 RustFS Team
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
# Multi-stage Dockerfile for RustFS - LOCAL DEVELOPMENT ONLY
#
# IMPORTANT: This Dockerfile builds RustFS from source for local development and testing.
diff --git a/crates/config/src/constants/app.rs b/crates/config/src/constants/app.rs
index 6460eea8..7b1aa275 100644
--- a/crates/config/src/constants/app.rs
+++ b/crates/config/src/constants/app.rs
@@ -150,8 +150,8 @@ pub const DEFAULT_CONSOLE_ADDRESS: &str = concat!(":", DEFAULT_CONSOLE_PORT);
/// Default region for rustfs
/// This is the default region for rustfs.
/// It is used to identify the region of the application.
-/// Default value: cn-east-1
-pub const RUSTFS_REGION: &str = "cn-east-1";
+/// Default value: rustfs-global-0
+pub const RUSTFS_REGION: &str = "rustfs-global-0";
/// Default log filename for rustfs
/// This is the default log filename for rustfs.
diff --git a/crates/ecstore/src/global.rs b/crates/ecstore/src/global.rs
index 65035a06..559f109c 100644
--- a/crates/ecstore/src/global.rs
+++ b/crates/ecstore/src/global.rs
@@ -42,7 +42,6 @@ lazy_static! {
static ref GLOBAL_RUSTFS_PORT: OnceLock = OnceLock::new();
static ref globalDeploymentIDPtr: OnceLock = OnceLock::new();
pub static ref GLOBAL_OBJECT_API: OnceLock> = OnceLock::new();
- pub static ref GLOBAL_LOCAL_DISK: Arc>>> = Arc::new(RwLock::new(Vec::new()));
pub static ref GLOBAL_IsErasure: RwLock<bool> = RwLock::new(false);
pub static ref GLOBAL_IsDistErasure: RwLock<bool> = RwLock::new(false);
pub static ref GLOBAL_IsErasureSD: RwLock<bool> = RwLock::new(false);
@@ -57,8 +56,8 @@ lazy_static! {
pub static ref GLOBAL_LocalNodeName: String = "127.0.0.1:9000".to_string();
pub static ref GLOBAL_LocalNodeNameHex: String = rustfs_utils::crypto::hex(GLOBAL_LocalNodeName.as_bytes());
pub static ref GLOBAL_NodeNamesHex: HashMap = HashMap::new();
- pub static ref GLOBAL_REGION: OnceLock<String> = OnceLock::new();
- pub static ref GLOBAL_LOCAL_LOCK_CLIENT: OnceLock> = OnceLock::new();
+ pub static ref GLOBAL_REGION: OnceLock<s3s::region::Region> = OnceLock::new();
+ pub static ref GLOBAL_LOCAL_LOCK_CLIENT: OnceLock> = OnceLock::new();
pub static ref GLOBAL_LOCK_CLIENTS: OnceLock>> = OnceLock::new();
pub static ref GLOBAL_BUCKET_MONITOR: OnceLock> = OnceLock::new();
}
@@ -243,20 +242,20 @@ type TypeLocalDiskSetDrives = Vec>>>;
/// Set the global region
///
/// # Arguments
-/// * `region` - The region string to set globally
+/// * `region` - The Region instance to set globally
///
/// # Returns
/// * None
-pub fn set_global_region(region: String) {
+pub fn set_global_region(region: s3s::region::Region) {
GLOBAL_REGION.set(region).unwrap();
}
/// Get the global region
///
/// # Returns
-/// * Option<String> - The global region string, if set
+/// * Option<s3s::region::Region> - The global region, if set
///
-pub fn get_global_region() -> Option<String> {
+pub fn get_global_region() -> Option<s3s::region::Region> {
GLOBAL_REGION.get().cloned()
}
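The `String` → `Region` migration above can be sketched in isolation. The `Region` newtype below is a stand-in, not the real `s3s::region::Region`; it only mirrors the calls the diff itself uses (`new`, `as_str`, `parse`, `Display`) around the same write-once `OnceLock` pattern as `GLOBAL_REGION`:

```rust
use std::fmt;
use std::str::FromStr;
use std::sync::OnceLock;

// Stand-in for s3s::region::Region (assumption): a validated newtype.
#[derive(Debug, Clone, PartialEq)]
struct Region(String);

impl Region {
    fn new(s: String) -> Option<Self> {
        // s3s performs real validation; here we only reject empty names.
        if s.is_empty() { None } else { Some(Region(s)) }
    }
    fn as_str(&self) -> &str {
        &self.0
    }
}

impl fmt::Display for Region {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

impl FromStr for Region {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Region::new(s.to_string()).ok_or_else(|| "empty region".to_string())
    }
}

// Write-once global, as in GLOBAL_REGION.
static GLOBAL_REGION: OnceLock<Region> = OnceLock::new();

fn set_global_region(region: Region) {
    GLOBAL_REGION.set(region).unwrap();
}

fn get_global_region() -> Option<Region> {
    GLOBAL_REGION.get().cloned()
}

fn main() {
    assert!(get_global_region().is_none());
    set_global_region("rustfs-global-0".parse().unwrap());
    assert_eq!(get_global_region().unwrap().as_str(), "rustfs-global-0");
    println!("region = {}", get_global_region().unwrap());
}
```

Carrying a validated type instead of a bare `String` pushes region validation to the edges (parse once, then pass `Region` everywhere), which is what lets call sites below switch to `region.as_str()` and `region.to_string()`.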
diff --git a/crates/iam/Cargo.toml b/crates/iam/Cargo.toml
index 67428cd7..bc19644a 100644
--- a/crates/iam/Cargo.toml
+++ b/crates/iam/Cargo.toml
@@ -51,10 +51,10 @@ rustfs-utils = { workspace = true, features = ["path"] }
tokio-util.workspace = true
pollster.workspace = true
reqwest = { workspace = true }
-url = { workspace = true }
moka = { workspace = true }
openidconnect = { workspace = true }
http = { workspace = true }
+url = { workspace = true }
[dev-dependencies]
pollster.workspace = true
diff --git a/crates/obs/src/telemetry.rs b/crates/obs/src/telemetry.rs
index e152cc3f..6119149d 100644
--- a/crates/obs/src/telemetry.rs
+++ b/crates/obs/src/telemetry.rs
@@ -179,7 +179,7 @@ fn format_with_color(w: &mut dyn std::io::Write, now: &mut DeferredNow, record:
let binding = std::thread::current();
let thread_name = binding.name().unwrap_or("unnamed");
let thread_id = format!("{:?}", std::thread::current().id());
- writeln!(
+ write!(
w,
"[{}] {} [{}] [{}:{}] [{}:{}] {}",
now.now().format(flexi_logger::TS_DASHES_BLANK_COLONS_DOT_BLANK),
@@ -200,7 +200,7 @@ fn format_for_file(w: &mut dyn std::io::Write, now: &mut DeferredNow, record: &R
let binding = std::thread::current();
let thread_name = binding.name().unwrap_or("unnamed");
let thread_id = format!("{:?}", std::thread::current().id());
- writeln!(
+ write!(
w,
"[{}] {} [{}] [{}:{}] [{}:{}] {}",
now.now().format(flexi_logger::TS_DASHES_BLANK_COLONS_DOT_BLANK),
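The `writeln!` → `write!` switch in both format functions matters because flexi_logger appends the record's line ending itself after calling the user-supplied format function; a format that also writes `\n` yields a blank line after every entry. A dependency-free illustration of that double-newline effect (the `log_with` helper simulates the logger's behavior and is not flexi_logger API):

```rust
use std::io::Write;

// Simulates a logger that appends its own newline after invoking the
// user-supplied format function (as flexi_logger does).
fn log_with<F: Fn(&mut dyn Write, &str) -> std::io::Result<()>>(fmt: F, msg: &str) -> String {
    let mut buf = Vec::new();
    fmt(&mut buf, msg).unwrap();
    buf.push(b'\n'); // the logger's own line ending
    String::from_utf8(buf).unwrap()
}

fn main() {
    // write!: exactly one newline per record
    let good = log_with(|w, m| write!(w, "[INFO] {m}"), "ready");
    // writeln!: the format adds a second newline -> blank line per record
    let bad = log_with(|w, m| writeln!(w, "[INFO] {m}"), "ready");
    assert_eq!(good, "[INFO] ready\n");
    assert_eq!(bad, "[INFO] ready\n\n");
    println!("good = {good:?}\nbad  = {bad:?}");
}
```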
diff --git a/docker-buildx.sh b/docker-buildx.sh
index ed19c077..e674bdc0 100755
--- a/docker-buildx.sh
+++ b/docker-buildx.sh
@@ -1,4 +1,17 @@
#!/usr/bin/env bash
+# Copyright 2024 RustFS Team
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
set -e
diff --git a/docker-compose-simple.yml b/docker-compose-simple.yml
index 5827ccab..dc107d03 100644
--- a/docker-compose-simple.yml
+++ b/docker-compose-simple.yml
@@ -1,3 +1,17 @@
+# Copyright 2024 RustFS Team
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
version: "3.9"
services:
diff --git a/docker-compose.yml b/docker-compose.yml
index b8dc5d36..5fdf9c40 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -106,7 +106,37 @@ services:
profiles:
- dev
- # OpenTelemetry Collector
+ # --- Observability Stack ---
+
+ tempo-init:
+ image: busybox:latest
+ command: [ "sh", "-c", "chown -R 10001:10001 /var/tempo" ]
+ volumes:
+ - tempo_data:/var/tempo
+ user: root
+ networks:
+ - rustfs-network
+ restart: "no"
+ profiles:
+ - observability
+
+ tempo:
+ image: grafana/tempo:latest
+ user: "10001"
+ command: [ "-config.file=/etc/tempo.yaml" ]
+ volumes:
+ - ./.docker/observability/tempo.yaml:/etc/tempo.yaml:ro
+ - tempo_data:/var/tempo
+ ports:
+ - "3200:3200" # tempo
+ - "4317" # otlp grpc
+ - "4318" # otlp http
+ restart: unless-stopped
+ networks:
+ - rustfs-network
+ profiles:
+ - observability
+
otel-collector:
image: otel/opentelemetry-collector-contrib:latest
container_name: otel-collector
@@ -115,32 +145,45 @@ services:
volumes:
- ./.docker/observability/otel-collector-config.yaml:/etc/otelcol-contrib/otel-collector.yml:ro
ports:
- - "4317:4317" # OTLP gRPC receiver
- - "4318:4318" # OTLP HTTP receiver
- - "8888:8888" # Prometheus metrics
- - "8889:8889" # Prometheus exporter metrics
+ - "1888:1888" # pprof
+ - "8888:8888" # Prometheus metrics for Collector
+ - "8889:8889" # Prometheus metrics for application indicators
+ - "13133:13133" # health check
+ - "4317:4317" # OTLP gRPC
+ - "4318:4318" # OTLP HTTP
+ - "55679:55679" # zpages
networks:
- rustfs-network
restart: unless-stopped
profiles:
- observability
+ depends_on:
+ - tempo
+ - jaeger
+ - prometheus
+ - loki
- # Jaeger for tracing
jaeger:
- image: jaegertracing/all-in-one:latest
+ image: jaegertracing/jaeger:latest
container_name: jaeger
- ports:
- - "16686:16686" # Jaeger UI
- - "14250:14250" # Jaeger gRPC
environment:
+ - TZ=Asia/Shanghai
+ - SPAN_STORAGE_TYPE=badger
+ - BADGER_EPHEMERAL=false
+ - BADGER_DIRECTORY_VALUE=/badger/data
+ - BADGER_DIRECTORY_KEY=/badger/key
- COLLECTOR_OTLP_ENABLED=true
+ volumes:
+ - jaeger_data:/badger
+ ports:
+ - "16686:16686" # Web UI
+ - "14269:14269" # Admin/Metrics
networks:
- rustfs-network
restart: unless-stopped
profiles:
- observability
- # Prometheus for metrics
prometheus:
image: prom/prometheus:latest
container_name: prometheus
@@ -152,17 +195,35 @@ services:
command:
- "--config.file=/etc/prometheus/prometheus.yml"
- "--storage.tsdb.path=/prometheus"
- - "--web.console.libraries=/etc/prometheus/console_libraries"
- - "--web.console.templates=/etc/prometheus/consoles"
+ - "--web.console.libraries=/usr/share/prometheus/console_libraries"
+ - "--web.console.templates=/usr/share/prometheus/consoles"
- "--storage.tsdb.retention.time=200h"
- "--web.enable-lifecycle"
+ - "--web.enable-otlp-receiver"
+ - "--web.enable-remote-write-receiver"
+ networks:
+ - rustfs-network
+ restart: unless-stopped
+ profiles:
+ - observability
+
+ loki:
+ image: grafana/loki:latest
+ container_name: loki
+ environment:
+ - TZ=Asia/Shanghai
+ volumes:
+ - ./.docker/observability/loki-config.yaml:/etc/loki/local-config.yaml:ro
+ - loki_data:/loki
+ ports:
+ - "3100:3100"
+ command: -config.file=/etc/loki/local-config.yaml
networks:
- rustfs-network
restart: unless-stopped
profiles:
- observability
- # Grafana for visualization
grafana:
image: grafana/grafana:latest
container_name: grafana
@@ -171,6 +232,7 @@ services:
environment:
- GF_SECURITY_ADMIN_USER=admin
- GF_SECURITY_ADMIN_PASSWORD=admin
+ - GF_INSTALL_PLUGINS=grafana-pyroscope-datasource
volumes:
- grafana_data:/var/lib/grafana
- ./.docker/observability/grafana/provisioning:/etc/grafana/provisioning:ro
@@ -180,6 +242,10 @@ services:
restart: unless-stopped
profiles:
- observability
+ depends_on:
+ - prometheus
+ - tempo
+ - loki
# NGINX reverse proxy (optional)
nginx:
@@ -228,6 +294,12 @@ volumes:
driver: local
grafana_data:
driver: local
+ tempo_data:
+ driver: local
+ loki_data:
+ driver: local
+ jaeger_data:
+ driver: local
logs:
driver: local
cargo_registry:
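The new Tempo/Loki/Jaeger services above are all gated behind the `observability` Compose profile, so a plain `docker compose up` still starts only the core services. A minimal sketch of the pattern (service names here are illustrative, not taken from the file):

```yaml
services:
  app:
    image: rustfs/rustfs:latest   # no profile: always started

  tempo:
    image: grafana/tempo:latest
    profiles:
      - observability             # started only when the profile is enabled
```

Enable the stack explicitly with `docker compose --profile observability up -d`; without the flag, Compose skips every service that carries a profile.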
diff --git a/rustfs/src/admin/handlers/event.rs b/rustfs/src/admin/handlers/event.rs
index 1a5b29d4..01d9e9f6 100644
--- a/rustfs/src/admin/handlers/event.rs
+++ b/rustfs/src/admin/handlers/event.rs
@@ -324,7 +324,10 @@ impl Operation for ListTargetsArns {
.clone()
.ok_or_else(|| s3_error!(InvalidRequest, "region not found"))?;
- let data_target_arn_list: Vec<_> = active_targets.iter().map(|id| id.to_arn(®ion).to_string()).collect();
+ let data_target_arn_list: Vec<_> = active_targets
+ .iter()
+ .map(|id| id.to_arn(region.as_str()).to_string())
+ .collect();
let data = serde_json::to_vec(&data_target_arn_list)
.map_err(|e| s3_error!(InternalError, "failed to serialize targets: {}", e))?;
diff --git a/rustfs/src/admin/router.rs b/rustfs/src/admin/router.rs
index e6d94395..270ddc17 100644
--- a/rustfs/src/admin/router.rs
+++ b/rustfs/src/admin/router.rs
@@ -242,7 +242,7 @@ impl Operation for AdminOperation {
#[derive(Debug, Clone)]
pub struct Extra {
pub credentials: Option,
- pub region: Option<String>,
+ pub region: Option<s3s::region::Region>,
pub service: Option,
}
diff --git a/rustfs/src/app/bucket_usecase.rs b/rustfs/src/app/bucket_usecase.rs
index b02c799e..337f7eb7 100644
--- a/rustfs/src/app/bucket_usecase.rs
+++ b/rustfs/src/app/bucket_usecase.rs
@@ -30,6 +30,7 @@ use crate::storage::*;
use futures::StreamExt;
use http::StatusCode;
use metrics::counter;
+use rustfs_config::RUSTFS_REGION;
use rustfs_ecstore::bucket::{
lifecycle::bucket_lifecycle_ops::validate_transition_tier,
metadata::{
@@ -57,6 +58,7 @@ use rustfs_targets::{
use rustfs_utils::http::RUSTFS_FORCE_DELETE;
use rustfs_utils::string::parse_bool;
use s3s::dto::*;
+use s3s::region::Region;
use s3s::xml;
use s3s::{S3Error, S3ErrorCode, S3Request, S3Response, S3Result, s3_error};
use std::{fmt::Display, sync::Arc};
@@ -73,8 +75,8 @@ fn to_internal_error(err: impl Display) -> S3Error {
S3Error::with_message(S3ErrorCode::InternalError, format!("{err}"))
}
-fn resolve_notification_region(global_region: Option<String>, request_region: Option<String>) -> String {
- global_region.unwrap_or_else(|| request_region.unwrap_or_default())
+fn resolve_notification_region(global_region: Option<Region>, request_region: Option<Region>) -> Region {
+ global_region.unwrap_or_else(|| request_region.unwrap_or_else(|| Region::new(RUSTFS_REGION.into()).expect("valid region")))
}
#[derive(Debug, Clone, PartialEq, Eq)]
@@ -165,7 +167,7 @@ impl DefaultBucketUsecase {
self.context.clone()
}
- fn global_region(&self) -> Option<String> {
+ fn global_region(&self) -> Option<s3s::region::Region> {
self.context.as_ref().and_then(|context| context.region().get())
}
@@ -431,7 +433,7 @@ impl DefaultBucketUsecase {
if let Some(region) = self.global_region() {
return Ok(S3Response::new(GetBucketLocationOutput {
- location_constraint: Some(BucketLocationConstraint::from(region)),
+ location_constraint: Some(BucketLocationConstraint::from(region.to_string())),
}));
}
@@ -1230,9 +1232,9 @@ impl DefaultBucketUsecase {
let event_rules =
event_rules_result.map_err(|e| s3_error!(InvalidArgument, "Invalid ARN in notification configuration: {e}"))?;
warn!("notify event rules: {:?}", &event_rules);
-
+ let region_clone = region.clone();
notify
- .add_event_specific_rules(&bucket, ®ion, &event_rules)
+ .add_event_specific_rules(&bucket, region_clone.as_str(), &event_rules)
.await
.map_err(|e| s3_error!(InternalError, "Failed to add rules: {e}"))?;
@@ -1800,20 +1802,23 @@ mod tests {
#[test]
fn resolve_notification_region_prefers_global_region() {
- let region = resolve_notification_region(Some("us-east-1".to_string()), Some("ap-southeast-1".to_string()));
+ let binding = resolve_notification_region(Some("us-east-1".parse().unwrap()), Some("ap-southeast-1".parse().unwrap()));
+ let region = binding.as_str();
assert_eq!(region, "us-east-1");
}
#[test]
fn resolve_notification_region_falls_back_to_request_region() {
- let region = resolve_notification_region(None, Some("ap-southeast-1".to_string()));
+ let binding = resolve_notification_region(None, Some("ap-southeast-1".parse().unwrap()));
+ let region = binding.as_str();
assert_eq!(region, "ap-southeast-1");
}
#[test]
- fn resolve_notification_region_defaults_to_empty() {
- let region = resolve_notification_region(None, None);
- assert!(region.is_empty());
+ fn resolve_notification_region_defaults_value() {
+ let binding = resolve_notification_region(None, None);
+ let region = binding.as_str();
+ assert_eq!(region, RUSTFS_REGION);
}
#[tokio::test]
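The fallback chain in `resolve_notification_region` — global region, then request region, then the compile-time default — is plain `Option` combinator layering, and the updated tests above pin down all three branches. A standalone sketch of the same logic (using `String` in place of `Region` to stay dependency-free):

```rust
// Mirrors the RUSTFS_REGION constant in rustfs-config.
const RUSTFS_REGION: &str = "rustfs-global-0";

fn resolve_notification_region(global: Option<String>, request: Option<String>) -> String {
    // Prefer the globally configured region, then the request's region,
    // and finally fall back to the built-in default.
    global.unwrap_or_else(|| request.unwrap_or_else(|| RUSTFS_REGION.to_string()))
}

fn main() {
    assert_eq!(
        resolve_notification_region(Some("us-east-1".into()), Some("ap-southeast-1".into())),
        "us-east-1"
    );
    assert_eq!(
        resolve_notification_region(None, Some("ap-southeast-1".into())),
        "ap-southeast-1"
    );
    assert_eq!(resolve_notification_region(None, None), RUSTFS_REGION);
    println!("ok");
}
```

Note how the behavior change is visible in the tests: the old code returned an empty string when nothing was set, while the new code guarantees a usable default region.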
diff --git a/rustfs/src/app/context.rs b/rustfs/src/app/context.rs
index c0ab28b2..c4230409 100644
--- a/rustfs/src/app/context.rs
+++ b/rustfs/src/app/context.rs
@@ -75,7 +75,7 @@ pub trait EndpointsInterface: Send + Sync {
/// Region interface for application-layer use-cases.
pub trait RegionInterface: Send + Sync {
- fn get(&self) -> Option<String>;
+ fn get(&self) -> Option<s3s::region::Region>;
}
/// Tier config interface for application-layer and admin handlers.
@@ -190,7 +190,7 @@ impl EndpointsInterface for EndpointsHandle {
pub struct RegionHandle;
impl RegionInterface for RegionHandle {
- fn get(&self) -> Option<String> {
+ fn get(&self) -> Option<s3s::region::Region> {
get_global_region()
}
}
diff --git a/rustfs/src/app/multipart_usecase.rs b/rustfs/src/app/multipart_usecase.rs
index 24cc209d..824a3944 100644
--- a/rustfs/src/app/multipart_usecase.rs
+++ b/rustfs/src/app/multipart_usecase.rs
@@ -27,6 +27,7 @@ use crate::storage::options::{
use crate::storage::*;
use bytes::Bytes;
use futures::StreamExt;
+use rustfs_config::RUSTFS_REGION;
use rustfs_ecstore::StorageAPI;
use rustfs_ecstore::bucket::quota::checker::QuotaChecker;
use rustfs_ecstore::bucket::{
@@ -164,7 +165,7 @@ impl DefaultMultipartUsecase {
self.context.as_ref().and_then(|context| context.bucket_metadata().handle())
}
- fn global_region(&self) -> Option<String> {
+ fn global_region(&self) -> Option<s3s::region::Region> {
self.context.as_ref().and_then(|context| context.region().get())
}
@@ -422,12 +423,12 @@ impl DefaultMultipartUsecase {
}
}
- let region = self.global_region().unwrap_or_else(|| "us-east-1".to_string());
+ let region = self.global_region().unwrap_or_else(|| RUSTFS_REGION.parse().unwrap());
let output = CompleteMultipartUploadOutput {
bucket: Some(bucket.clone()),
key: Some(key.clone()),
e_tag: obj_info.etag.clone().map(|etag| to_s3s_etag(&etag)),
- location: Some(region.clone()),
+ location: Some(region.to_string()),
server_side_encryption: server_side_encryption.clone(),
ssekms_key_id: ssekms_key_id.clone(),
checksum_crc32: checksum_crc32.clone(),
@@ -448,7 +449,7 @@ impl DefaultMultipartUsecase {
bucket: Some(bucket.clone()),
key: Some(key.clone()),
e_tag: obj_info.etag.clone().map(|etag| to_s3s_etag(&etag)),
- location: Some(region),
+ location: Some(region.to_string()),
server_side_encryption,
ssekms_key_id,
checksum_crc32,
diff --git a/rustfs/src/auth.rs b/rustfs/src/auth.rs
index 59e919cb..f0ed933f 100644
--- a/rustfs/src/auth.rs
+++ b/rustfs/src/auth.rs
@@ -276,7 +276,7 @@ pub fn get_condition_values(
header: &HeaderMap,
cred: &Credentials,
version_id: Option<&str>,
- region: Option<&str>,
+ region: Option<s3s::region::Region>,
remote_addr: Option,
) -> HashMap> {
let username = if cred.is_temp() || cred.is_service_account() {
@@ -362,7 +362,7 @@ pub fn get_condition_values(
}
if let Some(lc) = region
- && !lc.is_empty()
+ && !lc.as_str().is_empty()
{
args.insert("LocationConstraint".to_owned(), vec![lc.to_string()]);
}
diff --git a/rustfs/src/config/mod.rs b/rustfs/src/config/mod.rs
index 1705d6aa..9a3f1e84 100644
--- a/rustfs/src/config/mod.rs
+++ b/rustfs/src/config/mod.rs
@@ -15,8 +15,10 @@
use clap::Parser;
use clap::builder::NonEmptyStringValueParser;
use const_str::concat;
+use rustfs_config::RUSTFS_REGION;
use std::path::PathBuf;
use std::string::ToString;
+
shadow_rs::shadow!(build);
pub mod workload_profiles;
@@ -191,8 +193,10 @@ pub struct Config {
/// tls path for rustfs API and console.
pub tls_path: Option,
+ /// License key for enterprise features
pub license: Option,
+ /// Region for the server, used for signing and other region-specific behavior
pub region: Option<String>,
/// Enable KMS encryption for server-side encryption
@@ -280,6 +284,9 @@ impl Config {
.trim()
.to_string();
+ // Region is optional, but if not set, we should default to "rustfs-global-0" for signing compatibility with AWS S3 clients
+ let region = region.or_else(|| Some(RUSTFS_REGION.to_string()));
+
Ok(Config {
volumes,
address,
@@ -329,15 +336,3 @@ impl std::fmt::Debug for Config {
.finish()
}
}
-
-// lazy_static::lazy_static! {
-// pub(crate) static ref OPT: OnceLock = OnceLock::new();
-// }
-
-// pub fn init_config(opt: Opt) {
-// OPT.set(opt).expect("Failed to set global config");
-// }
-
-// pub fn get_config() -> &'static Opt {
-// OPT.get().expect("Global config not initialized")
-// }
diff --git a/rustfs/src/init.rs b/rustfs/src/init.rs
index 366100d7..09f8d77d 100644
--- a/rustfs/src/init.rs
+++ b/rustfs/src/init.rs
@@ -93,14 +93,15 @@ pub(crate) fn init_update_check() {
/// * `buckets` - A vector of bucket names to process
#[instrument(skip_all)]
pub(crate) async fn add_bucket_notification_configuration(buckets: Vec<String>) {
- let region_opt = rustfs_ecstore::global::get_global_region();
- let region = match region_opt {
- Some(ref r) if !r.is_empty() => r,
- _ => {
+ let global_region = rustfs_ecstore::global::get_global_region();
+ let region = global_region
+ .as_ref()
+ .filter(|r| !r.as_str().is_empty())
+ .map(|r| r.as_str())
+ .unwrap_or_else(|| {
warn!("Global region is not set; attempting notification configuration for all buckets with an empty region.");
""
- }
- };
+ });
for bucket in buckets.iter() {
let has_notification_config = metadata_sys::get_notification_config(bucket).await.unwrap_or_else(|err| {
warn!("get_notification_config err {:?}", err);
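The rewritten lookup above borrows from the `Option` instead of matching on it: `as_ref().filter(..).map(..)` yields an `Option<&str>` that can default to `""` without cloning. A dependency-free sketch of the same shape (using `String` where the real code holds a `Region`):

```rust
// Borrow the region out of the Option, treating "" as unset.
fn effective_region(global_region: &Option<String>) -> &str {
    global_region
        .as_ref()
        .filter(|r| !r.as_str().is_empty()) // "" counts as not configured
        .map(|r| r.as_str())
        .unwrap_or_else(|| {
            // In RustFS this branch also emits a warning before defaulting.
            ""
        })
}

fn main() {
    assert_eq!(effective_region(&Some("eu-west-1".to_string())), "eu-west-1");
    assert_eq!(effective_region(&Some(String::new())), "");
    assert_eq!(effective_region(&None), "");
    println!("ok");
}
```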
@@ -368,7 +369,7 @@ pub async fn init_ftp_system() -> Result