Files
rustfs/.docker/observability/prometheus.yml
houseme 29056a767a Refactor Telemetry Initialization and Environment Utilities (#811)
* improve code for metrics

* improve code for metrics

* fix

* fix

* Refactor telemetry initialization and environment functions ordering

- Reorder functions in envs.rs by type size (8-bit to 64-bit, signed before unsigned) and add missing variants like get_env_opt_u16.
- Optimize init_telemetry to support three modes: stdout logging (default error level with span tracing), file rolling logs (size-based with retention), and HTTP-based observability with sub-endpoints (trace, metric, log) falling back to unified endpoint.
- Fix stdout logging issue by retaining WorkerGuard in OtelGuard to prevent premature release of async writer threads.
- Enhance observability mode with HTTP protocol, compression, and proper resource management.
- Update OtelGuard to include tracing_guard for stdout and flexi_logger_handles for file logging.
- Improve error handling and configuration extraction in OtelConfig.

* fix

* up

* fix

* fix

* improve code for obs

* fix

* fix
2025-11-07 20:01:54 +08:00

55 lines
1.7 KiB
YAML

# Copyright 2024 RustFS Team
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
global:
scrape_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
scrape_configs:
- job_name: 'otel-collector'
static_configs:
- targets: [ 'otel-collector:8888' ] # Scrape metrics from Collector
- job_name: 'otel-metrics'
static_configs:
- targets: [ 'otel-collector:8889' ] # Application indicators
- job_name: 'tempo'
static_configs:
- targets: [ 'tempo:3200' ] # Scrape metrics from Tempo
otlp:
# Recommended attributes to be promoted to labels.
promote_resource_attributes:
- service.instance.id
- service.name
- service.namespace
- cloud.availability_zone
- cloud.region
- container.name
- deployment.environment.name
- k8s.cluster.name
- k8s.container.name
- k8s.cronjob.name
- k8s.daemonset.name
- k8s.deployment.name
- k8s.job.name
- k8s.namespace.name
- k8s.pod.name
- k8s.replicaset.name
- k8s.statefulset.name
# Ingest OTLP data keeping all characters in metric/label names.
translation_strategy: NoUTF8EscapingWithSuffixes
storage:
# OTLP is a push-based protocol, Out of order samples is a common scenario.
tsdb:
out_of_order_time_window: 30m