diff --git a/.envrc b/.envrc new file mode 100644 index 00000000..8392d159 --- /dev/null +++ b/.envrc @@ -0,0 +1 @@ +use flake \ No newline at end of file diff --git a/.github/s3tests/README.md b/.github/s3tests/README.md new file mode 100644 index 00000000..af61ed25 --- /dev/null +++ b/.github/s3tests/README.md @@ -0,0 +1,103 @@ +# S3 Compatibility Tests Configuration + +This directory contains the configuration for running [Ceph S3 compatibility tests](https://github.com/ceph/s3-tests) against RustFS. + +## Configuration File + +The `s3tests.conf` file is based on the official `s3tests.conf.SAMPLE` from the ceph/s3-tests repository. It uses environment variable substitution via `envsubst` to configure the endpoint and credentials. + +### Key Configuration Points + +- **Host**: Set via `${S3_HOST}` environment variable (e.g., `rustfs-single` for single-node, `lb` for multi-node) +- **Port**: 9000 (standard RustFS port) +- **Credentials**: Uses `${S3_ACCESS_KEY}` and `${S3_SECRET_KEY}` from workflow environment +- **TLS**: Disabled (`is_secure = False`) + +## Test Execution Strategy + +### Network Connectivity Fix + +Tests run inside a Docker container on the `rustfs-net` network, which allows them to resolve and connect to the RustFS container hostnames. This fixes the "Temporary failure in name resolution" error that occurred when tests ran on the GitHub runner host. + +### Performance Optimizations + +1. **Parallel Execution**: Uses `pytest-xdist` with `-n 4` to run tests in parallel across 4 workers +2. **Load Distribution**: Uses `--dist=loadgroup` to distribute test groups across workers +3. **Fail-Fast**: Uses `--maxfail=50` to stop after 50 failures, saving time on catastrophic failures + +### Feature Filtering + +Tests are filtered using pytest markers (`-m`) to skip features not yet supported by RustFS: + +- `lifecycle` - Bucket lifecycle policies +- `versioning` - Object versioning +- `s3website` - Static website hosting +- `bucket_logging` - Bucket logging +- `encryption` / `sse_s3` - Server-side encryption +- `cloud_transition` / `cloud_restore` - Cloud storage transitions +- `lifecycle_expiration` / `lifecycle_transition` - Lifecycle operations + +This filtering: +1. Reduces test execution time significantly (from 1+ hour to ~10-15 minutes) +2. Focuses on features RustFS currently supports +3. 
Avoids hundreds of expected failures + +## Running Tests Locally + +### Single-Node Test + +```bash +# Set credentials +export S3_ACCESS_KEY=rustfsadmin +export S3_SECRET_KEY=rustfsadmin + +# Start RustFS container +docker run -d --name rustfs-single \ + --network rustfs-net \ + -e RUSTFS_ADDRESS=0.0.0.0:9000 \ + -e RUSTFS_ACCESS_KEY=$S3_ACCESS_KEY \ + -e RUSTFS_SECRET_KEY=$S3_SECRET_KEY \ + -e RUSTFS_VOLUMES="/data/rustfs0 /data/rustfs1 /data/rustfs2 /data/rustfs3" \ + rustfs-ci + +# Generate config +export S3_HOST=rustfs-single +envsubst < .github/s3tests/s3tests.conf > /tmp/s3tests.conf + +# Run tests +docker run --rm \ + --network rustfs-net \ + -v /tmp/s3tests.conf:/etc/s3tests.conf:ro \ + python:3.12-slim \ + bash -c ' + apt-get update -qq && apt-get install -y -qq git + git clone --depth 1 https://github.com/ceph/s3-tests.git /s3-tests + cd /s3-tests + pip install -q -r requirements.txt pytest-xdist + S3TEST_CONF=/etc/s3tests.conf pytest -v -n 4 \ + s3tests/functional/test_s3.py \ + -m "not lifecycle and not versioning and not s3website and not bucket_logging and not encryption and not sse_s3" + ' +``` + +## Test Results Interpretation + +- **PASSED**: Test succeeded, feature works correctly +- **FAILED**: Test failed, indicates a potential bug or incompatibility +- **ERROR**: Test setup failed (e.g., network issues, missing dependencies) +- **SKIPPED**: Test skipped due to marker filtering + +## Adding New Feature Support + +When adding support for a new S3 feature to RustFS: + +1. Remove the corresponding marker from the filter in `.github/workflows/e2e-s3tests.yml` +2. Run the tests to verify compatibility +3. Fix any failing tests +4. Update this README to reflect the newly supported feature + +## References + +- [Ceph S3 Tests Repository](https://github.com/ceph/s3-tests) +- [S3 API Compatibility](https://docs.aws.amazon.com/AmazonS3/latest/API/) +- [pytest-xdist Documentation](https://pytest-xdist.readthedocs.io/) diff --git a/.github/s3tests/s3tests.conf b/.github/s3tests/s3tests.conf new file mode 100644 index 00000000..c7f8acf3 --- /dev/null +++ b/.github/s3tests/s3tests.conf @@ -0,0 +1,185 @@ +# RustFS s3-tests configuration +# Based on: https://github.com/ceph/s3-tests/blob/master/s3tests.conf.SAMPLE +# +# Usage: +# Single-node: S3_HOST=rustfs-single envsubst < s3tests.conf > /tmp/s3tests.conf +# Multi-node: S3_HOST=lb envsubst < s3tests.conf > /tmp/s3tests.conf + +[DEFAULT] +## this section is just used for host, port and bucket_prefix + +# host set for RustFS - will be substituted via envsubst +host = ${S3_HOST} + +# port for RustFS +port = 9000 + +## say "False" to disable TLS +is_secure = False + +## say "False" to disable SSL Verify +ssl_verify = False + +[fixtures] +## all the buckets created will start with this prefix; +## {random} will be filled with random characters to pad +## the prefix to 30 characters long, and avoid collisions +bucket prefix = rustfs-{random}- + +# all the iam account resources (users, roles, etc) created +# will start with this name prefix +iam name prefix = s3-tests- + +# all the iam account resources (users, roles, etc) created +# will start with this path prefix +iam path prefix = /s3-tests/ + +[s3 main] +# main display_name +display_name = RustFS Tester + +# main user_id +user_id = rustfsadmin + +# main email +email = tester@rustfs.local + +# zonegroup api_name for bucket location +api_name = default + +## main AWS access key +access_key = ${S3_ACCESS_KEY} + +## main AWS secret key +secret_key = ${S3_SECRET_KEY} + +## replace with key 
id obtained when secret is created, or delete if KMS not tested +#kms_keyid = 01234567-89ab-cdef-0123-456789abcdef + +## Storage classes +#storage_classes = "LUKEWARM, FROZEN" + +## Lifecycle debug interval (default: 10) +#lc_debug_interval = 20 +## Restore debug interval (default: 100) +#rgw_restore_debug_interval = 60 +#rgw_restore_processor_period = 60 + +[s3 alt] +# alt display_name +display_name = RustFS Alt Tester + +## alt email +email = alt@rustfs.local + +# alt user_id +user_id = rustfsalt + +# alt AWS access key (must be different from s3 main for many tests) +access_key = ${S3_ALT_ACCESS_KEY} + +# alt AWS secret key +secret_key = ${S3_ALT_SECRET_KEY} + +#[s3 cloud] +## to run the testcases with "cloud_transition" for transition +## and "cloud_restore" for restore attribute. +## Note: the waiting time may have to tweaked depending on +## the I/O latency to the cloud endpoint. + +## host set for cloud endpoint +# host = localhost + +## port set for cloud endpoint +# port = 8001 + +## say "False" to disable TLS +# is_secure = False + +## cloud endpoint credentials +# access_key = 0555b35654ad1656d804 +# secret_key = h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q== + +## storage class configured as cloud tier on local rgw server +# cloud_storage_class = CLOUDTIER + +## Below are optional - + +## Above configured cloud storage class config options +# retain_head_object = false +# allow_read_through = false # change it to enable read_through +# read_through_restore_days = 2 +# target_storage_class = Target_SC +# target_path = cloud-bucket + +## another regular storage class to test multiple transition rules, +# storage_class = S1 + +[s3 tenant] +# tenant display_name +display_name = RustFS Tenant Tester + +# tenant user_id +user_id = rustfstenant + +# tenant AWS access key +access_key = ${S3_ACCESS_KEY} + +# tenant AWS secret key +secret_key = ${S3_SECRET_KEY} + +# tenant email +email = tenant@rustfs.local + +# tenant name +tenant = testx + +#following section needs to be added for all sts-tests +[iam] +#used for iam operations in sts-tests +#email +email = s3@rustfs.local + +#user_id +user_id = rustfsiam + +#access_key +access_key = ${S3_ACCESS_KEY} + +#secret_key +secret_key = ${S3_SECRET_KEY} + +#display_name +display_name = RustFS IAM User + +# iam account root user for iam_account tests +[iam root] +access_key = ${S3_ACCESS_KEY} +secret_key = ${S3_SECRET_KEY} +user_id = RGW11111111111111111 +email = account1@rustfs.local + +# iam account root user in a different account than [iam root] +[iam alt root] +access_key = ${S3_ACCESS_KEY} +secret_key = ${S3_SECRET_KEY} +user_id = RGW22222222222222222 +email = account2@rustfs.local + +#following section needs to be added when you want to run Assume Role With Webidentity test +[webidentity] +#used for assume role with web identity test in sts-tests +#all parameters will be obtained from ceph/qa/tasks/keycloak.py +#token= + +#aud= + +#sub= + +#azp= + +#user_token=] + +#thumbprint= + +#KC_REALM= diff --git a/.github/workflows/audit.yml b/.github/workflows/audit.yml index 03a5c8a2..d54bbfef 100644 --- a/.github/workflows/audit.yml +++ b/.github/workflows/audit.yml @@ -40,11 +40,11 @@ env: jobs: security-audit: name: Security Audit - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 timeout-minutes: 15 steps: - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 - name: Install cargo-audit uses: taiki-e/install-action@v2 @@ -65,14 +65,14 @@ jobs: dependency-review: name: Dependency Review - runs-on: 
ubuntu-latest + runs-on: ubicloud-standard-2 if: github.event_name == 'pull_request' permissions: contents: read pull-requests: write steps: - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 - name: Dependency Review uses: actions/dependency-review-action@v4 diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml index baa6d266..7390d7c8 100644 --- a/.github/workflows/build.yml +++ b/.github/workflows/build.yml @@ -83,7 +83,7 @@ jobs: # Build strategy check - determine build type based on trigger build-check: name: Build Strategy Check - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 outputs: should_build: ${{ steps.check.outputs.should_build }} build_type: ${{ steps.check.outputs.build_type }} @@ -92,7 +92,7 @@ jobs: is_prerelease: ${{ steps.check.outputs.is_prerelease }} steps: - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 with: fetch-depth: 0 @@ -167,19 +167,19 @@ jobs: matrix: include: # Linux builds - - os: ubuntu-latest + - os: ubicloud-standard-2 target: x86_64-unknown-linux-musl cross: false platform: linux - - os: ubuntu-latest + - os: ubicloud-standard-2 target: aarch64-unknown-linux-musl cross: true platform: linux - - os: ubuntu-latest + - os: ubicloud-standard-2 target: x86_64-unknown-linux-gnu cross: false platform: linux - - os: ubuntu-latest + - os: ubicloud-standard-2 target: aarch64-unknown-linux-gnu cross: true platform: linux @@ -203,7 +203,7 @@ jobs: # platform: windows steps: - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 with: fetch-depth: 0 @@ -454,7 +454,7 @@ jobs: OSS_ACCESS_KEY_ID: ${{ secrets.ALICLOUDOSS_KEY_ID }} OSS_ACCESS_KEY_SECRET: ${{ secrets.ALICLOUDOSS_KEY_SECRET }} OSS_REGION: cn-beijing - OSS_ENDPOINT: https://oss-cn-beijing.aliyuncs.com + OSS_ENDPOINT: https://oss-accelerate.aliyuncs.com shell: bash run: | BUILD_TYPE="${{ needs.build-check.outputs.build_type }}" @@ -532,7 +532,7 @@ jobs: name: Build Summary needs: [ build-check, build-rustfs ] if: always() && needs.build-check.outputs.should_build == 'true' - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 steps: - name: Build completion summary shell: bash @@ -584,7 +584,7 @@ jobs: name: Create GitHub Release needs: [ build-check, build-rustfs ] if: startsWith(github.ref, 'refs/tags/') && needs.build-check.outputs.build_type != 'development' - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 permissions: contents: write outputs: @@ -592,7 +592,7 @@ jobs: release_url: ${{ steps.create.outputs.release_url }} steps: - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 with: fetch-depth: 0 @@ -670,13 +670,13 @@ jobs: name: Upload Release Assets needs: [ build-check, build-rustfs, create-release ] if: startsWith(github.ref, 'refs/tags/') && needs.build-check.outputs.build_type != 'development' - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 permissions: contents: write actions: read steps: - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 - name: Download all build artifacts uses: actions/download-artifact@v5 @@ -751,7 +751,7 @@ jobs: name: Update Latest Version needs: [ build-check, upload-release-assets ] if: startsWith(github.ref, 'refs/tags/') - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 steps: - name: Update latest.json env: @@ -801,12 +801,12 @@ jobs: name: Publish Release needs: [ build-check, create-release, upload-release-assets ] if: startsWith(github.ref, 'refs/tags/') && 
needs.build-check.outputs.build_type != 'development' - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 permissions: contents: write steps: - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 - name: Update release notes and publish env: diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index ed5571d3..ae3a308c 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -4,7 +4,7 @@ # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # -# http://www.apache.org/licenses/LICENSE-2.0 +# http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, @@ -62,17 +62,23 @@ on: permissions: contents: read +concurrency: + group: ${{ github.workflow }}-${{ github.ref }} + cancel-in-progress: true + env: CARGO_TERM_COLOR: always RUST_BACKTRACE: 1 + CARGO_BUILD_JOBS: 2 jobs: + skip-check: name: Skip Duplicate Actions permissions: actions: write contents: read - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 outputs: should_skip: ${{ steps.skip_check.outputs.should_skip }} steps: @@ -83,15 +89,13 @@ jobs: concurrent_skipping: "same_content_newer" cancel_others: true paths_ignore: '["*.md", "docs/**", "deploy/**"]' - # Never skip release events and tag pushes do_not_skip: '["workflow_dispatch", "schedule", "merge_group", "release", "push"]' - typos: name: Typos - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 steps: - - uses: actions/checkout@v5 + - uses: actions/checkout@v6 - uses: dtolnay/rust-toolchain@stable - name: Typos check with custom config file uses: crate-ci/typos@master @@ -100,13 +104,11 @@ jobs: name: Test and Lint needs: skip-check if: needs.skip-check.outputs.should_skip != 'true' - runs-on: ubuntu-latest + runs-on: ubicloud-standard-4 timeout-minutes: 60 steps: - - name: Delete huge unnecessary tools folder - run: rm -rf /opt/hostedtoolcache - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 - name: Setup Rust environment uses: ./.github/actions/setup @@ -116,6 +118,9 @@ jobs: github-token: ${{ secrets.GITHUB_TOKEN }} cache-save-if: ${{ github.ref == 'refs/heads/main' }} + - name: Install cargo-nextest + uses: taiki-e/install-action@nextest + - name: Run tests run: | cargo nextest run --all --exclude e2e_test @@ -131,11 +136,16 @@ jobs: name: End-to-End Tests needs: skip-check if: needs.skip-check.outputs.should_skip != 'true' - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 timeout-minutes: 30 steps: - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 + + - name: Clean up previous test run + run: | + rm -rf /tmp/rustfs + rm -f /tmp/rustfs.log - name: Setup Rust environment uses: ./.github/actions/setup @@ -155,7 +165,8 @@ jobs: - name: Build debug binary run: | touch rustfs/build.rs - cargo build -p rustfs --bins + # Limit concurrency to prevent OOM + cargo build -p rustfs --bins --jobs 2 - name: Run end-to-end tests run: | diff --git a/.github/workflows/docker.yml b/.github/workflows/docker.yml index a8919c92..383dcd57 100644 --- a/.github/workflows/docker.yml +++ b/.github/workflows/docker.yml @@ -72,7 +72,7 @@ jobs: # Check if we should build Docker images build-check: name: Docker Build Check - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 outputs: should_build: ${{ steps.check.outputs.should_build }} should_push: ${{ steps.check.outputs.should_push }} 
@@ -83,7 +83,7 @@ jobs: create_latest: ${{ steps.check.outputs.create_latest }} steps: - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 with: fetch-depth: 0 # For workflow_run events, checkout the specific commit that triggered the workflow @@ -162,11 +162,11 @@ jobs: if [[ "$version" == *"alpha"* ]] || [[ "$version" == *"beta"* ]] || [[ "$version" == *"rc"* ]]; then build_type="prerelease" is_prerelease=true - # TODO: 临时修改 - 当前允许 alpha 版本也创建 latest 标签 - # 等版本稳定后,需要移除下面这行,恢复原有逻辑(只有稳定版本才创建 latest) + # TODO: Temporary change - currently allows alpha versions to also create latest tags + # After the version is stable, you need to remove the following line and restore the original logic (latest is created only for stable versions) if [[ "$version" == *"alpha"* ]]; then create_latest=true - echo "🧪 Building Docker image for prerelease: $version (临时允许创建 latest 标签)" + echo "🧪 Building Docker image for prerelease: $version (temporarily allowing creation of latest tag)" else echo "🧪 Building Docker image for prerelease: $version" fi @@ -215,11 +215,11 @@ jobs: v*alpha*|v*beta*|v*rc*|*alpha*|*beta*|*rc*) build_type="prerelease" is_prerelease=true - # TODO: 临时修改 - 当前允许 alpha 版本也创建 latest 标签 - # 等版本稳定后,需要移除下面的 if 块,恢复原有逻辑 + # TODO: Temporary change - currently allows alpha versions to also create latest tags + # After the version is stable, you need to remove the if block below and restore the original logic. if [[ "$input_version" == *"alpha"* ]]; then create_latest=true - echo "🧪 Building with prerelease version: $input_version (临时允许创建 latest 标签)" + echo "🧪 Building with prerelease version: $input_version (temporarily allowing creation of latest tag)" else echo "🧪 Building with prerelease version: $input_version" fi @@ -264,11 +264,11 @@ jobs: name: Build Docker Images needs: build-check if: needs.build-check.outputs.should_build == 'true' - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 timeout-minutes: 60 steps: - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 - name: Login to Docker Hub uses: docker/login-action@v3 @@ -330,9 +330,9 @@ jobs: # Add channel tags for prereleases and latest for stable if [[ "$CREATE_LATEST" == "true" ]]; then - # TODO: 临时修改 - 当前 alpha 版本也会创建 latest 标签 - # 等版本稳定后,这里的逻辑保持不变,但上游的 CREATE_LATEST 设置需要恢复 - # Stable release (以及临时的 alpha 版本) + # TODO: Temporary change - the current alpha version will also create the latest tag + # After the version is stabilized, the logic here remains unchanged, but the upstream CREATE_LATEST setting needs to be restored. 
+ # Stable release (and temporary alpha versions) TAGS="$TAGS,${{ env.REGISTRY_DOCKERHUB }}:latest" elif [[ "$BUILD_TYPE" == "prerelease" ]]; then # Prerelease channel tags (alpha, beta, rc) @@ -404,7 +404,7 @@ jobs: name: Docker Build Summary needs: [ build-check, build-docker ] if: always() && needs.build-check.outputs.should_build == 'true' - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 steps: - name: Docker build completion summary run: | @@ -429,10 +429,10 @@ jobs: "prerelease") echo "🧪 Prerelease Docker image has been built with ${VERSION} tags" echo "⚠️ This is a prerelease image - use with caution" - # TODO: 临时修改 - alpha 版本当前会创建 latest 标签 - # 等版本稳定后,需要恢复下面的提示信息 + # TODO: Temporary change - alpha versions currently create the latest tag + # After the version is stable, you need to restore the following prompt information if [[ "$VERSION" == *"alpha"* ]] && [[ "$CREATE_LATEST" == "true" ]]; then - echo "🏷️ Latest tag has been created for alpha version (临时措施)" + echo "🏷️ Latest tag has been created for alpha version (temporary measures)" else echo "🚫 Latest tag NOT created for prerelease" fi diff --git a/.github/workflows/e2e-mint.yml b/.github/workflows/e2e-mint.yml new file mode 100644 index 00000000..a9de46f7 --- /dev/null +++ b/.github/workflows/e2e-mint.yml @@ -0,0 +1,260 @@ +# Copyright 2024 RustFS Team +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +name: e2e-mint + +on: + push: + branches: [ main ] + paths: + - ".github/workflows/e2e-mint.yml" + - "Dockerfile.source" + - "rustfs/**" + - "crates/**" + workflow_dispatch: + inputs: + run-multi: + description: "Run multi-node Mint as well" + required: false + default: "false" + +env: + ACCESS_KEY: rustfsadmin + SECRET_KEY: rustfsadmin + RUST_LOG: info + PLATFORM: linux/amd64 + +jobs: + mint-single: + runs-on: ubicloud-standard-2 + timeout-minutes: 40 + steps: + - name: Checkout + uses: actions/checkout@v6 + + - name: Enable buildx + uses: docker/setup-buildx-action@v3 + + - name: Build RustFS image (source) + run: | + DOCKER_BUILDKIT=1 docker buildx build --load \ + --platform ${PLATFORM} \ + -t rustfs-ci \ + -f Dockerfile.source . 
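The `workflow_dispatch` trigger declared above exposes a `run-multi` input for opting into the multi-node Mint job. A minimal sketch of dispatching this workflow manually, assuming the GitHub CLI (`gh`) is installed and authenticated against the repository:

```bash
# Kick off the single-node Mint job only (run-multi defaults to "false")
gh workflow run e2e-mint.yml

# Kick off both the single-node and multi-node Mint jobs
gh workflow run e2e-mint.yml -f run-multi=true

# Check the status of recent runs of this workflow
gh run list --workflow=e2e-mint.yml --limit 3
```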
+ + - name: Create network + run: | + docker network inspect rustfs-net >/dev/null 2>&1 || docker network create rustfs-net + + - name: Remove existing rustfs-single (if any) + run: docker rm -f rustfs-single >/dev/null 2>&1 || true + + - name: Start single RustFS + run: | + docker run -d --name rustfs-single \ + --network rustfs-net \ + -e RUSTFS_ADDRESS=0.0.0.0:9000 \ + -e RUSTFS_ACCESS_KEY=$ACCESS_KEY \ + -e RUSTFS_SECRET_KEY=$SECRET_KEY \ + -e RUSTFS_VOLUMES="/data/rustfs0 /data/rustfs1 /data/rustfs2 /data/rustfs3" \ + -v /tmp/rustfs-single:/data \ + rustfs-ci + + - name: Wait for RustFS ready + run: | + for i in {1..30}; do + if docker exec rustfs-single curl -sf http://localhost:9000/health >/dev/null; then + exit 0 + fi + sleep 2 + done + echo "RustFS did not become ready" >&2 + docker logs rustfs-single || true + exit 1 + + - name: Run Mint (single, S3-only) + run: | + mkdir -p artifacts/mint-single + docker run --rm --network rustfs-net \ + --platform ${PLATFORM} \ + -e SERVER_ENDPOINT=rustfs-single:9000 \ + -e ACCESS_KEY=$ACCESS_KEY \ + -e SECRET_KEY=$SECRET_KEY \ + -e ENABLE_HTTPS=0 \ + -e SERVER_REGION=us-east-1 \ + -e RUN_ON_FAIL=1 \ + -e MINT_MODE=core \ + -v ${GITHUB_WORKSPACE}/artifacts/mint-single:/mint/log \ + --entrypoint /mint/mint.sh \ + minio/mint:edge \ + awscli aws-sdk-go aws-sdk-java-v2 aws-sdk-php aws-sdk-ruby s3cmd s3select + + - name: Collect RustFS logs + run: | + mkdir -p artifacts/rustfs-single + docker logs rustfs-single > artifacts/rustfs-single/rustfs.log || true + + - name: Upload artifacts + uses: actions/upload-artifact@v4 + with: + name: mint-single + path: artifacts/** + + mint-multi: + if: github.event_name == 'workflow_dispatch' && github.event.inputs.run-multi == 'true' + needs: mint-single + runs-on: ubicloud-standard-2 + timeout-minutes: 60 + steps: + - name: Checkout + uses: actions/checkout@v6 + + - name: Enable buildx + uses: docker/setup-buildx-action@v3 + + - name: Build RustFS image (source) + run: | + DOCKER_BUILDKIT=1 docker buildx build --load \ + --platform ${PLATFORM} \ + -t rustfs-ci \ + -f Dockerfile.source . 
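When debugging the multi-node variant, the Compose stack created in the following step can be inspected with standard Docker Compose commands. A rough sketch, assuming Docker Compose v2 and the `compose.yml` / `haproxy.cfg` files generated below:

```bash
# List the four RustFS nodes and the HAProxy load balancer
docker compose -f compose.yml ps

# Tail logs from one backend node and from the load balancer
docker compose -f compose.yml logs --tail=50 rustfs1
docker compose -f compose.yml logs --tail=50 lb

# Probe the health endpoint through HAProxy from inside the rustfs-net network
docker run --rm --network rustfs-net curlimages/curl -sf http://lb:9000/health && echo OK
```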
+ + - name: Prepare cluster compose + run: | + cat > compose.yml <<'EOF' + version: '3.8' + services: + rustfs1: + image: rustfs-ci + hostname: rustfs1 + networks: [rustfs-net] + environment: + - RUSTFS_ADDRESS=0.0.0.0:9000 + - RUSTFS_ACCESS_KEY=${ACCESS_KEY} + - RUSTFS_SECRET_KEY=${SECRET_KEY} + - RUSTFS_VOLUMES=/data/rustfs0 /data/rustfs1 /data/rustfs2 /data/rustfs3 + volumes: + - rustfs1-data:/data + rustfs2: + image: rustfs-ci + hostname: rustfs2 + networks: [rustfs-net] + environment: + - RUSTFS_ADDRESS=0.0.0.0:9000 + - RUSTFS_ACCESS_KEY=${ACCESS_KEY} + - RUSTFS_SECRET_KEY=${SECRET_KEY} + - RUSTFS_VOLUMES=/data/rustfs0 /data/rustfs1 /data/rustfs2 /data/rustfs3 + volumes: + - rustfs2-data:/data + rustfs3: + image: rustfs-ci + hostname: rustfs3 + networks: [rustfs-net] + environment: + - RUSTFS_ADDRESS=0.0.0.0:9000 + - RUSTFS_ACCESS_KEY=${ACCESS_KEY} + - RUSTFS_SECRET_KEY=${SECRET_KEY} + - RUSTFS_VOLUMES=/data/rustfs0 /data/rustfs1 /data/rustfs2 /data/rustfs3 + volumes: + - rustfs3-data:/data + rustfs4: + image: rustfs-ci + hostname: rustfs4 + networks: [rustfs-net] + environment: + - RUSTFS_ADDRESS=0.0.0.0:9000 + - RUSTFS_ACCESS_KEY=${ACCESS_KEY} + - RUSTFS_SECRET_KEY=${SECRET_KEY} + - RUSTFS_VOLUMES=/data/rustfs0 /data/rustfs1 /data/rustfs2 /data/rustfs3 + volumes: + - rustfs4-data:/data + lb: + image: haproxy:2.9 + hostname: lb + networks: [rustfs-net] + ports: + - "9000:9000" + volumes: + - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro + networks: + rustfs-net: + name: rustfs-net + volumes: + rustfs1-data: + rustfs2-data: + rustfs3-data: + rustfs4-data: + EOF + + cat > haproxy.cfg <<'EOF' + defaults + mode http + timeout connect 5s + timeout client 30s + timeout server 30s + + frontend fe_s3 + bind *:9000 + default_backend be_s3 + + backend be_s3 + balance roundrobin + server s1 rustfs1:9000 check + server s2 rustfs2:9000 check + server s3 rustfs3:9000 check + server s4 rustfs4:9000 check + EOF + + - name: Launch cluster + run: docker compose -f compose.yml up -d + + - name: Wait for LB ready + run: | + for i in {1..60}; do + if docker run --rm --network rustfs-net curlimages/curl -sf http://lb:9000/health >/dev/null; then + exit 0 + fi + sleep 2 + done + echo "LB or backend not ready" >&2 + docker compose -f compose.yml logs --tail=200 || true + exit 1 + + - name: Run Mint (multi, S3-only) + run: | + mkdir -p artifacts/mint-multi + docker run --rm --network rustfs-net \ + --platform ${PLATFORM} \ + -e SERVER_ENDPOINT=lb:9000 \ + -e ACCESS_KEY=$ACCESS_KEY \ + -e SECRET_KEY=$SECRET_KEY \ + -e ENABLE_HTTPS=0 \ + -e SERVER_REGION=us-east-1 \ + -e RUN_ON_FAIL=1 \ + -e MINT_MODE=core \ + -v ${GITHUB_WORKSPACE}/artifacts/mint-multi:/mint/log \ + --entrypoint /mint/mint.sh \ + minio/mint:edge \ + awscli aws-sdk-go aws-sdk-java-v2 aws-sdk-php aws-sdk-ruby s3cmd s3select + + - name: Collect logs + run: | + mkdir -p artifacts/cluster + docker compose -f compose.yml logs --no-color > artifacts/cluster/cluster.log || true + + - name: Upload artifacts + uses: actions/upload-artifact@v4 + with: + name: mint-multi + path: artifacts/** diff --git a/.github/workflows/e2e-s3tests.yml b/.github/workflows/e2e-s3tests.yml new file mode 100644 index 00000000..e23e3a94 --- /dev/null +++ b/.github/workflows/e2e-s3tests.yml @@ -0,0 +1,422 @@ +# Copyright 2024 RustFS Team +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +name: e2e-s3tests + +on: + workflow_dispatch: + inputs: + test-mode: + description: "Test mode to run" + required: true + type: choice + default: "single" + options: + - single + - multi + xdist: + description: "Enable pytest-xdist (parallel). '0' to disable." + required: false + default: "0" + maxfail: + description: "Stop after N failures (debug friendly)" + required: false + default: "1" + markexpr: + description: "pytest -m expression (feature filters)" + required: false + default: "not lifecycle and not versioning and not s3website and not bucket_logging and not encryption" + +env: + # main user + S3_ACCESS_KEY: rustfsadmin + S3_SECRET_KEY: rustfsadmin + # alt user (must be different from main for many s3-tests) + S3_ALT_ACCESS_KEY: rustfsalt + S3_ALT_SECRET_KEY: rustfsalt + + S3_REGION: us-east-1 + + RUST_LOG: info + PLATFORM: linux/amd64 + +defaults: + run: + shell: bash + +jobs: + s3tests-single: + if: github.event.inputs.test-mode == 'single' + runs-on: ubicloud-standard-2 + timeout-minutes: 120 + steps: + - uses: actions/checkout@v6 + + - name: Enable buildx + uses: docker/setup-buildx-action@v3 + + - name: Build RustFS image (source, cached) + run: | + DOCKER_BUILDKIT=1 docker buildx build --load \ + --platform ${PLATFORM} \ + --cache-from type=gha \ + --cache-to type=gha,mode=max \ + -t rustfs-ci \ + -f Dockerfile.source . + + - name: Create network + run: docker network inspect rustfs-net >/dev/null 2>&1 || docker network create rustfs-net + + - name: Remove existing rustfs-single (if any) + run: docker rm -f rustfs-single >/dev/null 2>&1 || true + + - name: Start single RustFS + run: | + docker run -d --name rustfs-single \ + --network rustfs-net \ + -p 9000:9000 \ + -e RUSTFS_ADDRESS=0.0.0.0:9000 \ + -e RUSTFS_ACCESS_KEY=$S3_ACCESS_KEY \ + -e RUSTFS_SECRET_KEY=$S3_SECRET_KEY \ + -e RUSTFS_VOLUMES="/data/rustfs0 /data/rustfs1 /data/rustfs2 /data/rustfs3" \ + -v /tmp/rustfs-single:/data \ + rustfs-ci + + - name: Wait for RustFS ready + run: | + for i in {1..60}; do + if curl -sf http://127.0.0.1:9000/health >/dev/null 2>&1; then + echo "RustFS is ready" + exit 0 + fi + + if [ "$(docker inspect -f '{{.State.Running}}' rustfs-single 2>/dev/null)" != "true" ]; then + echo "RustFS container not running" >&2 + docker logs rustfs-single || true + exit 1 + fi + + sleep 2 + done + + echo "Health check timed out" >&2 + docker logs rustfs-single || true + exit 1 + + - name: Generate s3tests config + run: | + export S3_HOST=127.0.0.1 + envsubst < .github/s3tests/s3tests.conf > s3tests.conf + + - name: Provision s3-tests alt user (required by suite) + run: | + python3 -m pip install --user --upgrade pip awscurl + export PATH="$HOME/.local/bin:$PATH" + + # Admin API requires AWS SigV4 signing. awscurl is used by RustFS codebase as well. 
+ awscurl \ + --service s3 \ + --region "${S3_REGION}" \ + --access_key "${S3_ACCESS_KEY}" \ + --secret_key "${S3_SECRET_KEY}" \ + -X PUT \ + -H 'Content-Type: application/json' \ + -d '{"secretKey":"'"${S3_ALT_SECRET_KEY}"'","status":"enabled","policy":"readwrite"}' \ + "http://127.0.0.1:9000/rustfs/admin/v3/add-user?accessKey=${S3_ALT_ACCESS_KEY}" + + # Explicitly attach built-in policy via policy mapping. + # s3-tests relies on alt client being able to ListBuckets during setup cleanup. + awscurl \ + --service s3 \ + --region "${S3_REGION}" \ + --access_key "${S3_ACCESS_KEY}" \ + --secret_key "${S3_SECRET_KEY}" \ + -X PUT \ + "http://127.0.0.1:9000/rustfs/admin/v3/set-user-or-group-policy?policyName=readwrite&userOrGroup=${S3_ALT_ACCESS_KEY}&isGroup=false" + + # Sanity check: alt user can list buckets (should not be AccessDenied). + awscurl \ + --service s3 \ + --region "${S3_REGION}" \ + --access_key "${S3_ALT_ACCESS_KEY}" \ + --secret_key "${S3_ALT_SECRET_KEY}" \ + -X GET \ + "http://127.0.0.1:9000/" >/dev/null + + - name: Prepare s3-tests + run: | + python3 -m pip install --user --upgrade pip tox + export PATH="$HOME/.local/bin:$PATH" + git clone --depth 1 https://github.com/ceph/s3-tests.git s3-tests + + - name: Run ceph s3-tests (debug friendly) + run: | + export PATH="$HOME/.local/bin:$PATH" + mkdir -p artifacts/s3tests-single + + cd s3-tests + + set -o pipefail + + MAXFAIL="${{ github.event.inputs.maxfail }}" + if [ -z "$MAXFAIL" ]; then MAXFAIL="1"; fi + + MARKEXPR="${{ github.event.inputs.markexpr }}" + if [ -z "$MARKEXPR" ]; then MARKEXPR="not lifecycle and not versioning and not s3website and not bucket_logging and not encryption"; fi + + XDIST="${{ github.event.inputs.xdist }}" + if [ -z "$XDIST" ]; then XDIST="0"; fi + XDIST_ARGS="" + if [ "$XDIST" != "0" ]; then + # Add pytest-xdist to requirements.txt so tox installs it inside + # its virtualenv. Installing outside tox does NOT work. + echo "pytest-xdist" >> requirements.txt + XDIST_ARGS="-n $XDIST --dist=loadgroup" + fi + + # Run tests from s3tests/functional (boto2+boto3 combined directory). + S3TEST_CONF=${GITHUB_WORKSPACE}/s3tests.conf \ + tox -- \ + -vv -ra --showlocals --tb=long \ + --maxfail="$MAXFAIL" \ + --junitxml=${GITHUB_WORKSPACE}/artifacts/s3tests-single/junit.xml \ + $XDIST_ARGS \ + s3tests/functional/test_s3.py \ + -m "$MARKEXPR" \ + 2>&1 | tee ${GITHUB_WORKSPACE}/artifacts/s3tests-single/pytest.log + + - name: Collect RustFS logs + if: always() + run: | + mkdir -p artifacts/rustfs-single + docker logs rustfs-single > artifacts/rustfs-single/rustfs.log 2>&1 || true + docker inspect rustfs-single > artifacts/rustfs-single/inspect.json || true + + - name: Upload artifacts + if: always() && env.ACT != 'true' + uses: actions/upload-artifact@v4 + with: + name: s3tests-single + path: artifacts/** + + s3tests-multi: + if: github.event_name == 'workflow_dispatch' && github.event.inputs.test-mode == 'multi' + runs-on: ubicloud-standard-2 + timeout-minutes: 150 + steps: + - uses: actions/checkout@v6 + + - name: Enable buildx + uses: docker/setup-buildx-action@v3 + + - name: Build RustFS image (source, cached) + run: | + DOCKER_BUILDKIT=1 docker buildx build --load \ + --platform ${PLATFORM} \ + --cache-from type=gha \ + --cache-to type=gha,mode=max \ + -t rustfs-ci \ + -f Dockerfile.source . 
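Because `e2e-s3tests` only runs via `workflow_dispatch`, the inputs defined at the top of this workflow (`test-mode`, `xdist`, `maxfail`, `markexpr`) are supplied at dispatch time. A minimal sketch, assuming the GitHub CLI (`gh`) is installed and authenticated:

```bash
# Single-node mode with the defaults above (serial run, stop at the first failure)
gh workflow run e2e-s3tests.yml -f test-mode=single

# Multi-node mode with 4 pytest-xdist workers, up to 50 tolerated failures,
# and a custom marker expression
gh workflow run e2e-s3tests.yml \
  -f test-mode=multi \
  -f xdist=4 \
  -f maxfail=50 \
  -f markexpr="not lifecycle and not versioning and not s3website"
```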
+ + - name: Prepare cluster compose + run: | + cat > compose.yml <<'EOF' + services: + rustfs1: + image: rustfs-ci + hostname: rustfs1 + networks: [rustfs-net] + environment: + RUSTFS_ADDRESS: "0.0.0.0:9000" + RUSTFS_ACCESS_KEY: ${S3_ACCESS_KEY} + RUSTFS_SECRET_KEY: ${S3_SECRET_KEY} + RUSTFS_VOLUMES: "/data/rustfs0 /data/rustfs1 /data/rustfs2 /data/rustfs3" + volumes: + - rustfs1-data:/data + rustfs2: + image: rustfs-ci + hostname: rustfs2 + networks: [rustfs-net] + environment: + RUSTFS_ADDRESS: "0.0.0.0:9000" + RUSTFS_ACCESS_KEY: ${S3_ACCESS_KEY} + RUSTFS_SECRET_KEY: ${S3_SECRET_KEY} + RUSTFS_VOLUMES: "/data/rustfs0 /data/rustfs1 /data/rustfs2 /data/rustfs3" + volumes: + - rustfs2-data:/data + rustfs3: + image: rustfs-ci + hostname: rustfs3 + networks: [rustfs-net] + environment: + RUSTFS_ADDRESS: "0.0.0.0:9000" + RUSTFS_ACCESS_KEY: ${S3_ACCESS_KEY} + RUSTFS_SECRET_KEY: ${S3_SECRET_KEY} + RUSTFS_VOLUMES: "/data/rustfs0 /data/rustfs1 /data/rustfs2 /data/rustfs3" + volumes: + - rustfs3-data:/data + rustfs4: + image: rustfs-ci + hostname: rustfs4 + networks: [rustfs-net] + environment: + RUSTFS_ADDRESS: "0.0.0.0:9000" + RUSTFS_ACCESS_KEY: ${S3_ACCESS_KEY} + RUSTFS_SECRET_KEY: ${S3_SECRET_KEY} + RUSTFS_VOLUMES: "/data/rustfs0 /data/rustfs1 /data/rustfs2 /data/rustfs3" + volumes: + - rustfs4-data:/data + lb: + image: haproxy:2.9 + hostname: lb + networks: [rustfs-net] + ports: + - "9000:9000" + volumes: + - ./haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro + networks: + rustfs-net: + name: rustfs-net + volumes: + rustfs1-data: + rustfs2-data: + rustfs3-data: + rustfs4-data: + EOF + + cat > haproxy.cfg <<'EOF' + defaults + mode http + timeout connect 5s + timeout client 30s + timeout server 30s + + frontend fe_s3 + bind *:9000 + default_backend be_s3 + + backend be_s3 + balance roundrobin + server s1 rustfs1:9000 check + server s2 rustfs2:9000 check + server s3 rustfs3:9000 check + server s4 rustfs4:9000 check + EOF + + - name: Launch cluster + run: docker compose -f compose.yml up -d + + - name: Wait for LB ready + run: | + for i in {1..90}; do + if curl -sf http://127.0.0.1:9000/health >/dev/null 2>&1; then + echo "Load balancer is ready" + exit 0 + fi + sleep 2 + done + echo "LB or backend not ready" >&2 + docker compose -f compose.yml logs --tail=200 || true + exit 1 + + - name: Generate s3tests config + run: | + export S3_HOST=127.0.0.1 + envsubst < .github/s3tests/s3tests.conf > s3tests.conf + + - name: Provision s3-tests alt user (required by suite) + run: | + python3 -m pip install --user --upgrade pip awscurl + export PATH="$HOME/.local/bin:$PATH" + + awscurl \ + --service s3 \ + --region "${S3_REGION}" \ + --access_key "${S3_ACCESS_KEY}" \ + --secret_key "${S3_SECRET_KEY}" \ + -X PUT \ + -H 'Content-Type: application/json' \ + -d '{"secretKey":"'"${S3_ALT_SECRET_KEY}"'","status":"enabled","policy":"readwrite"}' \ + "http://127.0.0.1:9000/rustfs/admin/v3/add-user?accessKey=${S3_ALT_ACCESS_KEY}" + + awscurl \ + --service s3 \ + --region "${S3_REGION}" \ + --access_key "${S3_ACCESS_KEY}" \ + --secret_key "${S3_SECRET_KEY}" \ + -X PUT \ + "http://127.0.0.1:9000/rustfs/admin/v3/set-user-or-group-policy?policyName=readwrite&userOrGroup=${S3_ALT_ACCESS_KEY}&isGroup=false" + + awscurl \ + --service s3 \ + --region "${S3_REGION}" \ + --access_key "${S3_ALT_ACCESS_KEY}" \ + --secret_key "${S3_ALT_SECRET_KEY}" \ + -X GET \ + "http://127.0.0.1:9000/" >/dev/null + + - name: Prepare s3-tests + run: | + python3 -m pip install --user --upgrade pip tox + export PATH="$HOME/.local/bin:$PATH" + git 
clone --depth 1 https://github.com/ceph/s3-tests.git s3-tests + + - name: Run ceph s3-tests (multi, debug friendly) + run: | + export PATH="$HOME/.local/bin:$PATH" + mkdir -p artifacts/s3tests-multi + + cd s3-tests + + set -o pipefail + + MAXFAIL="${{ github.event.inputs.maxfail }}" + if [ -z "$MAXFAIL" ]; then MAXFAIL="1"; fi + + MARKEXPR="${{ github.event.inputs.markexpr }}" + if [ -z "$MARKEXPR" ]; then MARKEXPR="not lifecycle and not versioning and not s3website and not bucket_logging and not encryption"; fi + + XDIST="${{ github.event.inputs.xdist }}" + if [ -z "$XDIST" ]; then XDIST="0"; fi + XDIST_ARGS="" + if [ "$XDIST" != "0" ]; then + # Add pytest-xdist to requirements.txt so tox installs it inside + # its virtualenv. Installing outside tox does NOT work. + echo "pytest-xdist" >> requirements.txt + XDIST_ARGS="-n $XDIST --dist=loadgroup" + fi + + # Run tests from s3tests/functional (boto2+boto3 combined directory). + S3TEST_CONF=${GITHUB_WORKSPACE}/s3tests.conf \ + tox -- \ + -vv -ra --showlocals --tb=long \ + --maxfail="$MAXFAIL" \ + --junitxml=${GITHUB_WORKSPACE}/artifacts/s3tests-multi/junit.xml \ + $XDIST_ARGS \ + s3tests/functional/test_s3.py \ + -m "$MARKEXPR" \ + 2>&1 | tee ${GITHUB_WORKSPACE}/artifacts/s3tests-multi/pytest.log + + - name: Collect logs + if: always() + run: | + mkdir -p artifacts/cluster + docker compose -f compose.yml logs --no-color > artifacts/cluster/cluster.log 2>&1 || true + + - name: Upload artifacts + if: always() && env.ACT != 'true' + uses: actions/upload-artifact@v4 + with: + name: s3tests-multi + path: artifacts/** diff --git a/.github/workflows/helm-package.yml b/.github/workflows/helm-package.yml index ccefc6eb..954d7c41 100644 --- a/.github/workflows/helm-package.yml +++ b/.github/workflows/helm-package.yml @@ -1,9 +1,23 @@ +# Copyright 2024 RustFS Team +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + name: Publish helm chart to artifacthub on: workflow_run: - workflows: ["Build and Release"] - types: [completed] + workflows: [ "Build and Release" ] + types: [ completed ] permissions: contents: read @@ -13,7 +27,7 @@ env: jobs: build-helm-package: - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 # Only run on successful builds triggered by tag pushes (version format: x.y.z or x.y.z-suffix) if: | github.event.workflow_run.conclusion == 'success' && @@ -22,9 +36,9 @@ jobs: steps: - name: Checkout helm chart repo - uses: actions/checkout@v2 + uses: actions/checkout@v6 - - name: Replace chart appversion + - name: Replace chart app version run: | set -e set -x @@ -40,7 +54,7 @@ jobs: cp helm/README.md helm/rustfs/ package_version=$(echo $new_version | awk -F '-' '{print $2}' | awk -F '.' 
'{print $NF}') helm package ./helm/rustfs --destination helm/rustfs/ --version "0.0.$package_version" - + - name: Upload helm package as artifact uses: actions/upload-artifact@v4 with: @@ -49,25 +63,25 @@ jobs: retention-days: 1 publish-helm-package: - runs-on: ubuntu-latest - needs: [build-helm-package] + runs-on: ubicloud-standard-2 + needs: [ build-helm-package ] steps: - name: Checkout helm package repo - uses: actions/checkout@v2 + uses: actions/checkout@v6 with: - repository: rustfs/helm + repository: rustfs/helm token: ${{ secrets.RUSTFS_HELM_PACKAGE }} - + - name: Download helm package uses: actions/download-artifact@v4 with: name: helm-package path: ./ - + - name: Set up helm uses: azure/setup-helm@v4.3.0 - + - name: Generate index run: helm repo index . --url https://charts.rustfs.com diff --git a/.github/workflows/issue-translator.yml b/.github/workflows/issue-translator.yml index 0cb805d4..b3c9d206 100644 --- a/.github/workflows/issue-translator.yml +++ b/.github/workflows/issue-translator.yml @@ -25,7 +25,7 @@ permissions: jobs: build: - runs-on: ubuntu-latest + runs-on: ubicloud-standard-4 steps: - uses: usthe/issues-translate-action@v2.7 with: diff --git a/.github/workflows/performance.yml b/.github/workflows/performance.yml index 52274035..954fd000 100644 --- a/.github/workflows/performance.yml +++ b/.github/workflows/performance.yml @@ -40,11 +40,11 @@ env: jobs: performance-profile: name: Performance Profiling - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 timeout-minutes: 30 steps: - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 - name: Setup Rust environment uses: ./.github/actions/setup @@ -115,11 +115,11 @@ jobs: benchmark: name: Benchmark Tests - runs-on: ubuntu-latest + runs-on: ubicloud-standard-2 timeout-minutes: 45 steps: - name: Checkout repository - uses: actions/checkout@v5 + uses: actions/checkout@v6 - name: Setup Rust environment uses: ./.github/actions/setup diff --git a/.gitignore b/.gitignore index 1b46a92f..d0139ca6 100644 --- a/.gitignore +++ b/.gitignore @@ -2,6 +2,7 @@ .DS_Store .idea .vscode +.direnv/ /test /logs /data @@ -23,4 +24,13 @@ profile.json *.go *.pb *.svg -deploy/logs/*.log.* \ No newline at end of file +deploy/logs/*.log.* + +# s3-tests local artifacts (root directory only) +/s3-tests/ +/s3-tests-local/ +/s3tests.conf +/s3tests.conf.* +*.events +*.audit +*.snappy \ No newline at end of file diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml new file mode 100644 index 00000000..9482f3b9 --- /dev/null +++ b/.pre-commit-config.yaml @@ -0,0 +1,32 @@ +# See https://pre-commit.com for more information +# See https://pre-commit.com/hooks.html for more hooks +repos: + - repo: local + hooks: + - id: cargo-fmt + name: cargo fmt + entry: cargo fmt --all --check + language: system + types: [rust] + pass_filenames: false + + - id: cargo-clippy + name: cargo clippy + entry: cargo clippy --all-targets --all-features -- -D warnings + language: system + types: [rust] + pass_filenames: false + + - id: cargo-check + name: cargo check + entry: cargo check --all-targets + language: system + types: [rust] + pass_filenames: false + + - id: cargo-test + name: cargo test + entry: bash -c 'cargo test --workspace --exclude e2e_test && cargo test --all --doc' + language: system + types: [rust] + pass_filenames: false diff --git a/.vscode/launch.json b/.vscode/launch.json index f054a23a..62da1e91 100644 --- a/.vscode/launch.json +++ b/.vscode/launch.json @@ -1,9 +1,31 @@ { - // 使用 IntelliSense 了解相关属性。 - // 
悬停以查看现有属性的描述。 - // 欲了解更多信息,请访问: https://go.microsoft.com/fwlink/?linkid=830387 "version": "0.2.0", "configurations": [ + { + "type": "lldb", + "request": "launch", + "name": "Debug(only) executable 'rustfs'", + "env": { + "RUST_LOG": "rustfs=info,ecstore=info,s3s=info,iam=info", + "RUSTFS_SKIP_BACKGROUND_TASK": "on" + //"RUSTFS_OBS_LOG_DIRECTORY": "./deploy/logs", + // "RUSTFS_POLICY_PLUGIN_URL":"http://localhost:8181/v1/data/rustfs/authz/allow", + // "RUSTFS_POLICY_PLUGIN_AUTH_TOKEN":"your-opa-token" + }, + "program": "${workspaceFolder}/target/debug/rustfs", + "args": [ + "--access-key", + "rustfsadmin", + "--secret-key", + "rustfsadmin", + "--address", + "0.0.0.0:9010", + "--server-domains", + "127.0.0.1:9010", + "./target/volume/test{1...4}" + ], + "cwd": "${workspaceFolder}" + }, { "type": "lldb", "request": "launch", @@ -67,12 +89,8 @@ "test", "--no-run", "--lib", - "--package=ecstore" - ], - "filter": { - "name": "ecstore", - "kind": "lib" - } + "--package=rustfs-ecstore" + ] }, "args": [], "cwd": "${workspaceFolder}" @@ -95,6 +113,7 @@ // "RUSTFS_OBS_TRACE_ENDPOINT": "http://127.0.0.1:4318/v1/traces", // jeager otlp http endpoint // "RUSTFS_OBS_METRIC_ENDPOINT": "http://127.0.0.1:4318/v1/metrics", // default otlp http endpoint // "RUSTFS_OBS_LOG_ENDPOINT": "http://127.0.0.1:4318/v1/logs", // default otlp http endpoint + // "RUSTFS_COMPRESS_ENABLE": "true", "RUSTFS_CONSOLE_ADDRESS": "127.0.0.1:9001", "RUSTFS_OBS_LOG_DIRECTORY": "./target/logs", }, diff --git a/AGENTS.md b/AGENTS.md index 4990fc0b..0670af41 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -2,6 +2,7 @@ ## Communication Rules - Respond to the user in Chinese; use English in all other contexts. +- Code and documentation must be written in English only. Chinese text is allowed solely as test data/fixtures when a case explicitly requires Chinese-language content for validation. ## Project Structure & Module Organization The workspace root hosts shared dependencies in `Cargo.toml`. The service binary lives under `rustfs/src/main.rs`, while reusable crates sit in `crates/` (`crypto`, `iam`, `kms`, and `e2e_test`). Local fixtures for standalone flows reside in `test_standalone/`, deployment manifests are under `deploy/`, Docker assets sit at the root, and automation lives in `scripts/`. Skim each crate’s README or module docs before contributing changes. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index be58a46e..6b9dcfc4 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -2,6 +2,8 @@ ## 📋 Code Quality Requirements +For instructions on setting up and running the local development environment, please see [Development Guide](docs/DEVELOPMENT.md). + ### 🔧 Code Formatting Rules **MANDATORY**: All code must be properly formatted before committing. This project enforces strict formatting standards to maintain code consistency and readability. 
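The formatting and lint requirements described here are also wired up as local hooks in the `.pre-commit-config.yaml` added earlier in this diff. A minimal sketch of enabling them, assuming Python/pip is available to install the `pre-commit` tool:

```bash
# Install pre-commit and register the git hook defined in .pre-commit-config.yaml
pip install pre-commit
pre-commit install

# Run every hook (cargo fmt, clippy, check, test) against the whole workspace once
pre-commit run --all-files
```

After `pre-commit install`, the fmt/clippy/check/test hooks run automatically on each commit, mirroring what CI enforces.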
diff --git a/Cargo.lock b/Cargo.lock index 94b0e800..155c7eef 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -216,15 +216,18 @@ dependencies = [ [[package]] name = "arc-swap" -version = "1.7.1" +version = "1.8.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "69f7f8c3906b62b754cd5326047894316021dcfe5a194c8ea52bdd94934a3457" +checksum = "51d03449bb8ca2cc2ef70869af31463d1ae5ccc8fa3e334b307203fbf815207e" +dependencies = [ + "rustversion", +] [[package]] name = "argon2" -version = "0.6.0-rc.3" +version = "0.6.0-rc.5" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "53fc8992356faa4da0422d552f1dc7d7fda26927165069fd0af2d565f0b0fc6f" +checksum = "a26e88a084142953a0415c47ddf4081eddf9a6d310012bbe92e9827d03e447f0" dependencies = [ "base64ct", "blake2 0.11.0-rc.3", @@ -323,7 +326,7 @@ dependencies = [ "arrow-schema", "arrow-select", "atoi", - "base64 0.22.1", + "base64", "chrono", "comfy-table", "half", @@ -515,9 +518,9 @@ dependencies = [ [[package]] name = "async-lock" -version = "3.4.1" +version = "3.4.2" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "5fd03604047cee9b6ce9de9f70c6cd540a0520c813cbd49bae61f33ab80ed1dc" +checksum = "290f7f2596bd5b78a9fec8088ccd89180d7f9f55b94b0576823bbbdc72ee8311" dependencies = [ "event-listener", "event-listener-strategy", @@ -602,9 +605,9 @@ checksum = "c08606f8c3cbf4ce6ec8e28fb0014a2c086708fe954eaa885384a6165172e7e8" [[package]] name = "aws-config" -version = "1.8.11" +version = "1.8.12" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "a0149602eeaf915158e14029ba0c78dedb8c08d554b024d54c8f239aab46511d" +checksum = "96571e6996817bf3d58f6b569e4b9fd2e9d2fcf9f7424eed07b2ce9bb87535e5" dependencies = [ "aws-credential-types", "aws-runtime", @@ -632,9 +635,9 @@ dependencies = [ [[package]] name = "aws-credential-types" -version = "1.2.10" +version = "1.2.11" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "b01c9521fa01558f750d183c8c68c81b0155b9d193a4ba7f84c36bd1b6d04a06" +checksum = "3cd362783681b15d136480ad555a099e82ecd8e2d10a841e14dfd0078d67fee3" dependencies = [ "aws-smithy-async", "aws-smithy-runtime-api", @@ -644,9 +647,9 @@ dependencies = [ [[package]] name = "aws-lc-rs" -version = "1.15.1" +version = "1.15.2" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "6b5ce75405893cd713f9ab8e297d8e438f624dde7d706108285f7e17a25a180f" +checksum = "6a88aab2464f1f25453baa7a07c84c5b7684e274054ba06817f382357f77a288" dependencies = [ "aws-lc-sys", "zeroize", @@ -654,9 +657,9 @@ dependencies = [ [[package]] name = "aws-lc-sys" -version = "0.34.0" +version = "0.35.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "179c3777a8b5e70e90ea426114ffc565b2c1a9f82f6c4a0c5a34aa6ef5e781b6" +checksum = "b45afffdee1e7c9126814751f88dddc747f41d91da16c9551a0f1e8a11e788a1" dependencies = [ "cc", "cmake", @@ -666,9 +669,9 @@ dependencies = [ [[package]] name = "aws-runtime" -version = "1.5.16" +version = "1.5.17" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "7ce527fb7e53ba9626fc47824f25e256250556c40d8f81d27dd92aa38239d632" +checksum = "d81b5b2898f6798ad58f484856768bca817e3cd9de0974c24ae0f1113fe88f1b" dependencies = [ "aws-credential-types", "aws-sigv4", @@ -691,9 +694,9 @@ dependencies = [ [[package]] name = "aws-sdk-s3" -version = "1.116.0" +version = "1.119.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = 
"cd4c10050aa905b50dc2a1165a9848d598a80c3a724d6f93b5881aa62235e4a5" +checksum = "1d65fddc3844f902dfe1864acb8494db5f9342015ee3ab7890270d36fbd2e01c" dependencies = [ "aws-credential-types", "aws-runtime", @@ -725,9 +728,9 @@ dependencies = [ [[package]] name = "aws-sdk-sso" -version = "1.90.0" +version = "1.91.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "4f18e53542c522459e757f81e274783a78f8c81acdfc8d1522ee8a18b5fb1c66" +checksum = "8ee6402a36f27b52fe67661c6732d684b2635152b676aa2babbfb5204f99115d" dependencies = [ "aws-credential-types", "aws-runtime", @@ -747,9 +750,9 @@ dependencies = [ [[package]] name = "aws-sdk-ssooidc" -version = "1.92.0" +version = "1.93.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "532f4d866012ffa724a4385c82e8dd0e59f0ca0e600f3f22d4c03b6824b34e4a" +checksum = "a45a7f750bbd170ee3677671ad782d90b894548f4e4ae168302c57ec9de5cb3e" dependencies = [ "aws-credential-types", "aws-runtime", @@ -769,9 +772,9 @@ dependencies = [ [[package]] name = "aws-sdk-sts" -version = "1.94.0" +version = "1.95.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "1be6fbbfa1a57724788853a623378223fe828fc4c09b146c992f0c95b6256174" +checksum = "55542378e419558e6b1f398ca70adb0b2088077e79ad9f14eb09441f2f7b2164" dependencies = [ "aws-credential-types", "aws-runtime", @@ -792,9 +795,9 @@ dependencies = [ [[package]] name = "aws-sigv4" -version = "1.3.6" +version = "1.3.7" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "c35452ec3f001e1f2f6db107b6373f1f48f05ec63ba2c5c9fa91f07dad32af11" +checksum = "69e523e1c4e8e7e8ff219d732988e22bfeae8a1cafdbe6d9eca1546fa080be7c" dependencies = [ "aws-credential-types", "aws-smithy-eventstream", @@ -820,9 +823,9 @@ dependencies = [ [[package]] name = "aws-smithy-async" -version = "1.2.6" +version = "1.2.7" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "127fcfad33b7dfc531141fda7e1c402ac65f88aca5511a4d31e2e3d2cd01ce9c" +checksum = "9ee19095c7c4dda59f1697d028ce704c24b2d33c6718790c7f1d5a3015b4107c" dependencies = [ "futures-util", "pin-project-lite", @@ -831,9 +834,9 @@ dependencies = [ [[package]] name = "aws-smithy-checksums" -version = "0.63.11" +version = "0.63.12" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "95bd108f7b3563598e4dc7b62e1388c9982324a2abd622442167012690184591" +checksum = "87294a084b43d649d967efe58aa1f9e0adc260e13a6938eb904c0ae9b45824ae" dependencies = [ "aws-smithy-http", "aws-smithy-types", @@ -851,9 +854,9 @@ dependencies = [ [[package]] name = "aws-smithy-eventstream" -version = "0.60.13" +version = "0.60.14" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "e29a304f8319781a39808847efb39561351b1bb76e933da7aa90232673638658" +checksum = "dc12f8b310e38cad85cf3bef45ad236f470717393c613266ce0a89512286b650" dependencies = [ "aws-smithy-types", "bytes", @@ -862,9 +865,9 @@ dependencies = [ [[package]] name = "aws-smithy-http" -version = "0.62.5" +version = "0.62.6" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "445d5d720c99eed0b4aa674ed00d835d9b1427dd73e04adaf2f94c6b2d6f9fca" +checksum = "826141069295752372f8203c17f28e30c464d22899a43a0c9fd9c458d469c88b" dependencies = [ "aws-smithy-eventstream", "aws-smithy-runtime-api", @@ -884,9 +887,9 @@ dependencies = [ [[package]] name = "aws-smithy-http-client" -version = "1.1.4" +version = "1.1.5" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = 
"623254723e8dfd535f566ee7b2381645f8981da086b5c4aa26c0c41582bb1d2c" +checksum = "59e62db736db19c488966c8d787f52e6270be565727236fd5579eaa301e7bc4a" dependencies = [ "aws-smithy-async", "aws-smithy-runtime-api", @@ -904,7 +907,7 @@ dependencies = [ "pin-project-lite", "rustls 0.21.12", "rustls 0.23.35", - "rustls-native-certs 0.8.2", + "rustls-native-certs", "rustls-pki-types", "tokio", "tokio-rustls 0.26.4", @@ -914,27 +917,27 @@ dependencies = [ [[package]] name = "aws-smithy-json" -version = "0.61.7" +version = "0.61.9" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "2db31f727935fc63c6eeae8b37b438847639ec330a9161ece694efba257e0c54" +checksum = "49fa1213db31ac95288d981476f78d05d9cbb0353d22cdf3472cc05bb02f6551" dependencies = [ "aws-smithy-types", ] [[package]] name = "aws-smithy-observability" -version = "0.1.4" +version = "0.1.5" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "2d1881b1ea6d313f9890710d65c158bdab6fb08c91ea825f74c1c8c357baf4cc" +checksum = "17f616c3f2260612fe44cede278bafa18e73e6479c4e393e2c4518cf2a9a228a" dependencies = [ "aws-smithy-runtime-api", ] [[package]] name = "aws-smithy-query" -version = "0.60.8" +version = "0.60.9" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "d28a63441360c477465f80c7abac3b9c4d075ca638f982e605b7dc2a2c7156c9" +checksum = "ae5d689cf437eae90460e944a58b5668530d433b4ff85789e69d2f2a556e057d" dependencies = [ "aws-smithy-types", "urlencoding", @@ -942,9 +945,9 @@ dependencies = [ [[package]] name = "aws-smithy-runtime" -version = "1.9.4" +version = "1.9.5" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "0bbe9d018d646b96c7be063dd07987849862b0e6d07c778aad7d93d1be6c1ef0" +checksum = "a392db6c583ea4a912538afb86b7be7c5d8887d91604f50eb55c262ee1b4a5f5" dependencies = [ "aws-smithy-async", "aws-smithy-http", @@ -966,9 +969,9 @@ dependencies = [ [[package]] name = "aws-smithy-runtime-api" -version = "1.9.2" +version = "1.9.3" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "ec7204f9fd94749a7c53b26da1b961b4ac36bf070ef1e0b94bb09f79d4f6c193" +checksum = "ab0d43d899f9e508300e587bf582ba54c27a452dd0a9ea294690669138ae14a2" dependencies = [ "aws-smithy-async", "aws-smithy-types", @@ -983,9 +986,9 @@ dependencies = [ [[package]] name = "aws-smithy-types" -version = "1.3.4" +version = "1.3.5" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "25f535879a207fce0db74b679cfc3e91a3159c8144d717d55f5832aea9eef46e" +checksum = "905cb13a9895626d49cf2ced759b062d913834c7482c38e49557eac4e6193f01" dependencies = [ "base64-simd", "bytes", @@ -1009,18 +1012,18 @@ dependencies = [ [[package]] name = "aws-smithy-xml" -version = "0.60.12" +version = "0.60.13" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "eab77cdd036b11056d2a30a7af7b775789fb024bf216acc13884c6c97752ae56" +checksum = "11b2f670422ff42bf7065031e72b45bc52a3508bd089f743ea90731ca2b6ea57" dependencies = [ "xmlparser", ] [[package]] name = "aws-types" -version = "1.3.10" +version = "1.3.11" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "d79fb68e3d7fe5d4833ea34dc87d2e97d26d3086cb3da660bb6b1f76d98680b6" +checksum = "1d980627d2dd7bfc32a3c025685a033eeab8d365cc840c631ef59d1b8f428164" dependencies = [ "aws-credential-types", "aws-smithy-async", @@ -1032,9 +1035,9 @@ dependencies = [ [[package]] name = "axum" -version = "0.8.7" +version = "0.8.8" source = 
"registry+https://github.com/rust-lang/crates.io-index" -checksum = "5b098575ebe77cb6d14fc7f32749631a6e44edbef6b796f89b020e99ba20d425" +checksum = "8b52af3cb4058c895d37317bb27508dccc8e5f2d39454016b297bf4a400597b8" dependencies = [ "axum-core", "bytes", @@ -1084,9 +1087,9 @@ dependencies = [ [[package]] name = "axum-extra" -version = "0.12.2" +version = "0.12.3" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "dbfe9f610fe4e99cf0cfcd03ccf8c63c28c616fe714d80475ef731f3b13dd21b" +checksum = "6dfbd6109d91702d55fc56df06aae7ed85c465a7a451db6c0e54a4b9ca5983d1" dependencies = [ "axum", "axum-core", @@ -1158,12 +1161,6 @@ version = "0.3.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "d8b59d472eab27ade8d770dcb11da7201c11234bef9f82ce7aa517be028d462b" -[[package]] -name = "base64" -version = "0.21.7" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "9d297deb1925b89f2ccc13d7635fa0714f12c87adce1c75356b39ca9b7178567" - [[package]] name = "base64" version = "0.22.1" @@ -1182,9 +1179,9 @@ dependencies = [ [[package]] name = "base64ct" -version = "1.8.0" +version = "1.8.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "55248b47b0caf0546f7988906588779981c43bb1bc9d0c44087278f80cdb44ba" +checksum = "0e050f626429857a27ddccb31e0aca21356bfa709c04041aefddac081a8f068a" [[package]] name = "bigdecimal" @@ -1343,9 +1340,9 @@ dependencies = [ [[package]] name = "bumpalo" -version = "3.19.0" +version = "3.19.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "46c5e41b57b8bba42a04676d81cb89e9ee8e859a1a66f80a5a72e1cb76b34d43" +checksum = "5dd9dc738b7a8311c7ade152424974d8115f2cdad61e8dab8dac9f2362298510" [[package]] name = "bytemuck" @@ -1423,47 +1420,31 @@ dependencies = [ [[package]] name = "camino" -version = "1.2.1" +version = "1.2.2" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "276a59bf2b2c967788139340c9f0c5b12d7fd6630315c15c217e559de85d2609" +checksum = "e629a66d692cb9ff1a1c664e41771b3dcaf961985a9774c0eb0bd1b51cf60a48" dependencies = [ "serde_core", ] [[package]] name = "cargo-platform" -version = "0.3.1" +version = "0.3.2" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "122ec45a44b270afd1402f351b782c676b173e3c3fb28d86ff7ebfb4d86a4ee4" +checksum = "87a0c0e6148f11f01f32650a2ea02d532b2ad4e81d8bd41e6e565b5adc5e6082" dependencies = [ "serde", -] - -[[package]] -name = "cargo-util-schemas" -version = "0.8.2" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "7dc1a6f7b5651af85774ae5a34b4e8be397d9cf4bc063b7e6dbd99a841837830" -dependencies = [ - "semver", - "serde", - "serde-untagged", - "serde-value", - "thiserror 2.0.17", - "toml", - "unicode-xid", - "url", + "serde_core", ] [[package]] name = "cargo_metadata" -version = "0.22.0" +version = "0.23.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "0c3f56c207c76c07652489840ff98687dcf213de178ac0974660d6fefeaf5ec6" +checksum = "ef987d17b0a113becdd19d3d0022d04d7ef41f9efe4f3fb63ac44ba61df3ade9" dependencies = [ "camino", "cargo-platform", - "cargo-util-schemas", "semver", "serde", "serde_json", @@ -1478,9 +1459,9 @@ checksum = "37b2a672a2cb129a2e41c10b1224bb368f9f37a2b16b612598138befd7b37eb5" [[package]] name = "cc" -version = "1.2.48" +version = "1.2.50" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "c481bdbf0ed3b892f6f806287d72acd515b352a4ec27a208489b8c1bc839633a" +checksum = 
"9f50d563227a1c37cc0a263f64eca3334388c01c5e4c4861a9def205c614383c" dependencies = [ "find-msvc-tools", "jobserver", @@ -1581,7 +1562,7 @@ version = "0.4.4" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "773f3b9af64447d2ce9850330c473515014aa235e6a783b02db81ff39e4a3dad" dependencies = [ - "crypto-common 0.1.6", + "crypto-common 0.1.7", "inout 0.1.4", ] @@ -1638,9 +1619,9 @@ checksum = "a1d728cc89cf3aee9ff92b05e62b19ee65a02b5702cff7d5a377e32c6ae29d8d" [[package]] name = "cmake" -version = "0.1.54" +version = "0.1.57" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "e7caa3f9de89ddbe2c607f4101924c5abec803763ae9534e4f4d7d8f84aa81f0" +checksum = "75443c44cd6b379beb8c5b45d85d0773baf31cce901fe7bb252f4eff3008ef7d" dependencies = [ "cc", ] @@ -1704,18 +1685,18 @@ dependencies = [ [[package]] name = "const-str" -version = "0.7.0" +version = "0.7.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "f4d34b8f066904ed7cfa4a6f9ee96c3214aa998cb44b69ca20bd2054f47402ed" +checksum = "b0664d2867b4a32697dfe655557f5c3b187e9b605b38612a748e5ec99811d160" dependencies = [ "const-str-proc-macro", ] [[package]] name = "const-str-proc-macro" -version = "0.7.0" +version = "0.7.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "a08a8aee16926ee1c4ad18868b8c3dfe5106359053f91e035861ec2a17116988" +checksum = "5c25c2a02ba19f2d4fd9f54d5f239f97c867deb7397763a9771edab63c44a4fa" dependencies = [ "proc-macro2", "quote", @@ -1803,9 +1784,9 @@ dependencies = [ [[package]] name = "crc" -version = "3.4.0" +version = "3.3.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "5eb8a2a1cd12ab0d987a5d5e825195d372001a4094a0376319d5a0ad71c1ba0d" +checksum = "9710d3b3739c2e349eb44fe848ad0b7c8cb1e42bd87ee49371df2f7acaf3e675" dependencies = [ "crc-catalog", ] @@ -1970,9 +1951,9 @@ dependencies = [ [[package]] name = "crypto-common" -version = "0.1.6" +version = "0.1.7" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "1bfb12502f3fc46cca1bb51ac28df9d618d813cdc3d2f25b9fe775a34af26bb3" +checksum = "78c8292055d1c1df0cce5d180393dc8cce0abec0a7102adb6c7b1eef6016d60a" dependencies = [ "generic-array", "typenum", @@ -2087,6 +2068,16 @@ dependencies = [ "darling_macro 0.21.3", ] +[[package]] +name = "darling" +version = "0.23.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "25ae13da2f202d56bd7f91c25fba009e7717a1e4a1cc98a76d844b65ae912e9d" +dependencies = [ + "darling_core 0.23.0", + "darling_macro 0.23.0", +] + [[package]] name = "darling_core" version = "0.14.4" @@ -2129,6 +2120,19 @@ dependencies = [ "syn 2.0.111", ] +[[package]] +name = "darling_core" +version = "0.23.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9865a50f7c335f53564bb694ef660825eb8610e0a53d3e11bf1b0d3df31e03b0" +dependencies = [ + "ident_case", + "proc-macro2", + "quote", + "strsim 0.11.1", + "syn 2.0.111", +] + [[package]] name = "darling_macro" version = "0.14.4" @@ -2162,6 +2166,17 @@ dependencies = [ "syn 2.0.111", ] +[[package]] +name = "darling_macro" +version = "0.23.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ac3984ec7bd6cfa798e62b4a642426a5be0e68f9401cfc2a01e3fa9ea2fcdb8d" +dependencies = [ + "darling_core 0.23.0", + "quote", + "syn 2.0.111", +] + [[package]] name = "dashmap" version = "6.1.0" @@ -2520,7 +2535,7 @@ checksum = "794a9db7f7b96b3346fc007ff25e994f09b8f0511b4cf7dff651fadfe3ebb28f" 
dependencies = [ "arrow", "arrow-buffer", - "base64 0.22.1", + "base64", "blake2 0.10.6", "blake3", "chrono", @@ -2968,7 +2983,7 @@ checksum = "9ed9a281f7bc9b7576e61468ba615a66a5c8cfdff42420a70aa82701a3b1e292" dependencies = [ "block-buffer 0.10.4", "const-oid 0.9.6", - "crypto-common 0.1.6", + "crypto-common 0.1.7", "subtle", ] @@ -3044,7 +3059,7 @@ dependencies = [ "async-trait", "aws-config", "aws-sdk-s3", - "base64 0.22.1", + "base64", "bytes", "chrono", "flatbuffers", @@ -3053,6 +3068,7 @@ dependencies = [ "rand 0.10.0-rc.5", "reqwest", "rmp-serde", + "rustfs-common", "rustfs-ecstore", "rustfs-filemeta", "rustfs-lock", @@ -3375,9 +3391,9 @@ checksum = "1d674e81391d1e1ab681a28d99df07927c6d4aa5b027d7da16ba32d1d21ecd99" [[package]] name = "flatbuffers" -version = "25.9.23" +version = "25.12.19" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "09b6620799e7340ebd9968d2e0708eb82cf1971e9a16821e2091b6d6e475eed5" +checksum = "35f6839d7b3b98adde531effaf34f0c2badc6f4735d26fe74709d8e513a96ef3" dependencies = [ "bitflags 2.10.0", "rustc_version", @@ -3457,9 +3473,9 @@ dependencies = [ [[package]] name = "fs-err" -version = "3.2.0" +version = "3.2.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "62d91fd049c123429b018c47887d3f75a265540dd3c30ba9cb7bae9197edb03a" +checksum = "824f08d01d0f496b3eca4f001a13cf17690a6ee930043d20817f547455fd98f8" dependencies = [ "autocfg", "tokio", @@ -3577,9 +3593,9 @@ dependencies = [ [[package]] name = "generic-array" -version = "0.14.9" +version = "0.14.7" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "4bb6743198531e02858aeaea5398fcc883e71851fcbcb5a2f773e2fb6cb1edf2" +checksum = "85649ca51fd72272d7821adaf274ad91c288277713d9c18820d8499a7ff69e9a" dependencies = [ "typenum", "version_check", @@ -3648,20 +3664,20 @@ checksum = "0cc23270f6e1808e30a928bdc84dea0b9b4136a8bc82338574f23baf47bbd280" [[package]] name = "google-cloud-auth" -version = "1.2.0" +version = "1.3.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "1cc977b20996b87e207b0a004ea34aa5f0f8692c44a1ca8c8802a08f553bf79c" +checksum = "590a1c28795779d5da6fda35b149d5271bcddcf2ce1709eae9e9460faf2f2aa9" dependencies = [ "async-trait", - "base64 0.22.1", + "base64", "bon", + "bytes", "google-cloud-gax", "http 1.4.0", - "jsonwebtoken", "reqwest", "rustc_version", "rustls 0.23.35", - "rustls-pemfile 2.2.0", + "rustls-pemfile", "serde", "serde_json", "thiserror 2.0.17", @@ -3671,11 +3687,11 @@ dependencies = [ [[package]] name = "google-cloud-gax" -version = "1.3.1" +version = "1.4.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "48c26e6f1be47e93e5360a77e67e4e996a2d838b1924ffe0763bcb21d47be68b" +checksum = "324fb97d35103787e80a33ed41ccc43d947c376d2ece68ca53e860f5844dbe24" dependencies = [ - "base64 0.22.1", + "base64", "bytes", "futures", "google-cloud-rpc", @@ -3691,20 +3707,23 @@ dependencies = [ [[package]] name = "google-cloud-gax-internal" -version = "0.7.5" +version = "0.7.6" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "69168fd1f81869bb8682883d27c56e3e499840d45b27b884b289ec0d5f2b442a" +checksum = "7b75b810886ae872aca68a35ad1d4d5e8f2be39e40238116d8aff9d778f04b38" dependencies = [ "bytes", + "futures", "google-cloud-auth", "google-cloud-gax", "google-cloud-rpc", "google-cloud-wkt", "http 1.4.0", + "http-body 1.0.1", "http-body-util", "hyper 1.8.1", "opentelemetry-semantic-conventions", "percent-encoding", + "pin-project", "prost 
0.14.1", "prost-types", "reqwest", @@ -3722,9 +3741,9 @@ dependencies = [ [[package]] name = "google-cloud-iam-v1" -version = "1.1.0" +version = "1.2.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "a2f2c6d094d0ed9453de0fba8bb690b0c039a3d056f009d2e6c7909c32a446bb" +checksum = "498a68e2a958e8aa9938f7db2c7147aad1b5a0ff2cd47c5ba4e10cb0dcb5bfc5" dependencies = [ "async-trait", "bytes", @@ -3742,9 +3761,9 @@ dependencies = [ [[package]] name = "google-cloud-longrunning" -version = "1.2.0" +version = "1.3.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "398201f50d0dd0180105628c370ba5dae77a3f5e842eebce494f451caee96371" +checksum = "1c80938e704401a47fdf36b51ec10e1a99b1ec22793d607afd0e67c7b675b8b3" dependencies = [ "async-trait", "bytes", @@ -3762,9 +3781,9 @@ dependencies = [ [[package]] name = "google-cloud-lro" -version = "1.1.1" +version = "1.2.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "f5259a172f712809460ad10336b322caf0cd37cf1469aecc950bf6bf0026fbd7" +checksum = "49747b7b684b804a2d1040c2cdb21238b3d568a41ab9e36c423554509112f61d" dependencies = [ "google-cloud-gax", "google-cloud-longrunning", @@ -3776,9 +3795,9 @@ dependencies = [ [[package]] name = "google-cloud-rpc" -version = "1.1.0" +version = "1.2.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "e5b655e3540a78e18fd753ebd8f11e068210a3fa392892370f932ffcc8774346" +checksum = "bd10e97751ca894f9dad6be69fcef1cb72f5bc187329e0254817778fc8235030" dependencies = [ "bytes", "google-cloud-wkt", @@ -3789,12 +3808,12 @@ dependencies = [ [[package]] name = "google-cloud-storage" -version = "1.4.0" +version = "1.5.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "931b69ac5996d0216e74e22e1843e025bef605ba8c062003d40c1565d90594b4" +checksum = "043be824d1b105bfdce786c720e45cae04e66436f8e5d0168e98ca8e5715ce9f" dependencies = [ "async-trait", - "base64 0.22.1", + "base64", "bytes", "crc32c", "futures", @@ -3832,9 +3851,9 @@ dependencies = [ [[package]] name = "google-cloud-type" -version = "1.1.0" +version = "1.2.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "290760412b63cd266376273e4fbeb13afaa4bc7dadd5340786c916866139e14c" +checksum = "9390ac2f3f9882ff42956b25ea65b9f546c8dd44c131726d75a96bf744ec75f6" dependencies = [ "bytes", "google-cloud-wkt", @@ -3845,11 +3864,11 @@ dependencies = [ [[package]] name = "google-cloud-wkt" -version = "1.1.0" +version = "1.2.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "02931df6af9beda1c852bbbbe5f7b6ba6ae5e4cd49c029fa0ca2cecc787cd9b1" +checksum = "c6f270e404be7ce76a3260abe0c3c71492ab2599ccd877f3253f3dd552f48cc9" dependencies = [ - "base64 0.22.1", + "base64", "bytes", "serde", "serde_json", @@ -4239,7 +4258,6 @@ dependencies = [ "hyper 0.14.32", "log", "rustls 0.21.12", - "rustls-native-certs 0.6.3", "tokio", "tokio-rustls 0.24.1", ] @@ -4255,7 +4273,7 @@ dependencies = [ "hyper-util", "log", "rustls 0.23.35", - "rustls-native-certs 0.8.2", + "rustls-native-certs", "rustls-pki-types", "tokio", "tokio-rustls 0.26.4", @@ -4282,7 +4300,7 @@ version = "0.1.19" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "727805d60e7938b76b826a6ef209eb70eaa1812794f9424d4a4e2d740662df5f" dependencies = [ - "base64 0.22.1", + "base64", "bytes", "futures-channel", "futures-core", @@ -4374,9 +4392,9 @@ checksum = "7aedcccd01fc5fe81e6b489c15b247b8b0690feb23304303a9e560f37efc560a" [[package]] 
name = "icu_properties" -version = "2.1.1" +version = "2.1.2" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "e93fcd3157766c0c8da2f8cff6ce651a31f0810eaa1c51ec363ef790bbb5fb99" +checksum = "020bfc02fe870ec3a66d93e677ccca0562506e5872c650f893269e08615d74ec" dependencies = [ "icu_collections", "icu_locale_core", @@ -4388,9 +4406,9 @@ dependencies = [ [[package]] name = "icu_properties_data" -version = "2.1.1" +version = "2.1.2" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "02845b3647bb045f1100ecd6480ff52f34c35f82d9880e029d329c21d1054899" +checksum = "616c294cf8d725c6afcd8f55abc17c56464ef6211f9ed59cccffe534129c77af" [[package]] name = "icu_provider" @@ -4609,9 +4627,9 @@ dependencies = [ [[package]] name = "itoa" -version = "1.0.15" +version = "1.0.16" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "4a5f13b858c8d314ee3e8f639011f7ccefe71f97f96e50151fb991f267928e2c" +checksum = "7ee5b5339afb4c41626dde77b7a611bd4f2c202b897852b4bcf5d03eddc61010" [[package]] name = "jemalloc_pprof" @@ -4656,7 +4674,7 @@ version = "10.2.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "c76e1c7d7df3e34443b3621b459b066a7b79644f059fc8b2db7070c825fd417e" dependencies = [ - "base64 0.22.1", + "base64", "ed25519-dalek", "getrandom 0.2.16", "hmac 0.12.1", @@ -4799,13 +4817,13 @@ dependencies = [ [[package]] name = "libredox" -version = "0.1.10" +version = "0.1.11" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "416f7e718bdb06000964960ffa43b4335ad4012ae8b99060261aa4a8088d5ccb" +checksum = "df15f6eac291ed1cf25865b1ee60399f57e7c227e7f51bdbd4c5270396a9ed50" dependencies = [ "bitflags 2.10.0", "libc", - "redox_syscall", + "redox_syscall 0.6.0", ] [[package]] @@ -4828,9 +4846,9 @@ dependencies = [ [[package]] name = "libz-rs-sys" -version = "0.5.3" +version = "0.5.5" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "8b484ba8d4f775eeca644c452a56650e544bf7e617f1d170fe7298122ead5222" +checksum = "c10501e7805cee23da17c7790e59df2870c0d4043ec6d03f67d31e2b53e77415" dependencies = [ "zlib-rs", ] @@ -4866,9 +4884,9 @@ dependencies = [ [[package]] name = "local-ip-address" -version = "0.6.6" +version = "0.6.8" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "786c72d9739fc316a7acf9b22d9c2794ac9cb91074e9668feb04304ab7219783" +checksum = "0a60bf300a990b2d1ebdde4228e873e8e4da40d834adbf5265f3da1457ede652" dependencies = [ "libc", "neli", @@ -4940,9 +4958,9 @@ dependencies = [ [[package]] name = "lzma-rust2" -version = "0.13.0" +version = "0.15.4" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "c60a23ffb90d527e23192f1246b14746e2f7f071cb84476dd879071696c18a4a" +checksum = "48172246aa7c3ea28e423295dd1ca2589a24617cc4e588bb8cfe177cb2c54d95" dependencies = [ "crc", "sha2 0.10.9", @@ -5090,9 +5108,9 @@ dependencies = [ [[package]] name = "mio" -version = "1.1.0" +version = "1.1.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "69d83b0086dc8ecf3ce9ae2874b2d1290252e2a30720bea58a5c6639b0092873" +checksum = "a69bcab0ad47271a0234d9422b131806bf3968021e5dc9328caf2d4cd58557fc" dependencies = [ "libc", "log", @@ -5102,9 +5120,9 @@ dependencies = [ [[package]] name = "moka" -version = "0.12.11" +version = "0.12.12" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "8261cd88c312e0004c1d51baad2980c66528dfdb2bee62003e643a4d8f86b077" +checksum = 
"a3dec6bd31b08944e08b58fd99373893a6c17054d6f3ea5006cc894f4f4eee2a" dependencies = [ "async-lock", "crossbeam-channel", @@ -5115,7 +5133,6 @@ dependencies = [ "futures-util", "parking_lot", "portable-atomic", - "rustc_version", "smallvec", "tagptr", "uuid", @@ -5129,9 +5146,9 @@ checksum = "1d87ecb2933e8aeadb3e3a02b828fed80a7528047e68b4f424523a0981a3a084" [[package]] name = "neli" -version = "0.7.2" +version = "0.7.3" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "87fe4204517c0dafc04a1d99ecb577d52c0ffc81e1bbe5cf322769aa8fbd1b05" +checksum = "e23bebbf3e157c402c4d5ee113233e5e0610cc27453b2f07eefce649c7365dcc" dependencies = [ "bitflags 2.10.0", "byteorder", @@ -5145,9 +5162,9 @@ dependencies = [ [[package]] name = "neli-proc-macros" -version = "0.2.1" +version = "0.2.2" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "90e502fe5db321c6e0ae649ccda600675680125a8e8dee327744fe1910b19332" +checksum = "05d8d08c6e98f20a62417478ebf7be8e1425ec9acecc6f63e22da633f6b71609" dependencies = [ "either", "proc-macro2", @@ -5249,9 +5266,9 @@ checksum = "5e0826a989adedc2a244799e823aece04662b66609d96af8dff7ac6df9a8925d" [[package]] name = "ntapi" -version = "0.4.1" +version = "0.4.2" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "e8a3895c6391c39d7fe7ebc444a87eb2991b2a0bc718fdabd071eec617fc68e4" +checksum = "c70f219e21142367c70c0b30c6a9e3a14d55b4d12a204d897fbec83a0363f081" dependencies = [ "winapi", ] @@ -5700,7 +5717,7 @@ checksum = "2621685985a2ebf1c516881c026032ac7deafcda1a2c9b7850dc81e3dfcb64c1" dependencies = [ "cfg-if", "libc", - "redox_syscall", + "redox_syscall 0.5.18", "smallvec", "windows-link 0.2.1", ] @@ -5719,7 +5736,7 @@ dependencies = [ "arrow-ipc", "arrow-schema", "arrow-select", - "base64 0.22.1", + "base64", "brotli 8.0.2", "bytes", "chrono", @@ -5744,13 +5761,12 @@ dependencies = [ [[package]] name = "password-hash" -version = "0.6.0-rc.3" +version = "0.6.0-rc.6" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "11ceb29fb5976f752babcc02842a530515b714919233f0912845c742dffb6246" +checksum = "383d290055c99f2dd7dece082088d89494dff6d79277fbac4a7da21c1bf2ab6b" dependencies = [ - "base64ct", - "rand_core 0.10.0-rc-2", - "subtle", + "getrandom 0.3.4", + "phc", ] [[package]] @@ -5761,9 +5777,9 @@ checksum = "57c0d7b74b563b49d38dae00a0c37d4d6de9b432382b2892f0574ddcae73fd0a" [[package]] name = "pastey" -version = "0.2.0" +version = "0.2.1" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "57d6c094ee800037dff99e02cab0eaf3142826586742a270ab3d7a62656bd27a" +checksum = "b867cad97c0791bbd3aaa6472142568c6c9e8f71937e98379f584cfb0cf35bec" [[package]] name = "path-absolutize" @@ -5801,9 +5817,9 @@ dependencies = [ [[package]] name = "pbkdf2" -version = "0.13.0-rc.3" +version = "0.13.0-rc.5" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "2c148c9a0a9a7d256a8ea004fae8356c02ccc44cf8c06e7d68fdbedb48de1beb" +checksum = "c015873c38594dfb7724f90b2ed912a606697393bda2d39fd83c2394301f808a" dependencies = [ "digest 0.11.0-rc.4", "hmac 0.13.0-rc.3", @@ -5815,7 +5831,7 @@ version = "3.0.6" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "1d30c53c26bc5b31a98cd02d20f25a7c8567146caf63ed593a9d87b2775291be" dependencies = [ - "base64 0.22.1", + "base64", "serde_core", ] @@ -5865,6 +5881,17 @@ dependencies = [ "serde", ] +[[package]] +name = "phc" +version = "0.6.0-rc.0" +source = 
"registry+https://github.com/rust-lang/crates.io-index" +checksum = "c61f960577aaac5c259bc0866d685ba315c0ed30793c602d7287f54980913863" +dependencies = [ + "base64ct", + "getrandom 0.3.4", + "subtle", +] + [[package]] name = "phf" version = "0.11.3" @@ -6042,6 +6069,12 @@ dependencies = [ "plotters-backend", ] +[[package]] +name = "pollster" +version = "0.4.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "2f3a9f18d041e6d0e102a0a46750538147e5e8992d3b4873aaafee2520b00ce3" + [[package]] name = "poly1305" version = "0.9.0-rc.3" @@ -6065,9 +6098,9 @@ dependencies = [ [[package]] name = "portable-atomic" -version = "1.11.1" +version = "1.12.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "f84267b20a16ea918e43c6a88433c2d54fa145c92a811b5b047ccbe153674483" +checksum = "f59e70c4aef1e55797c2e8fd94a4f2a973fc972cfde0e0b05f683667b0cd39dd" [[package]] name = "potential_utf" @@ -6173,7 +6206,7 @@ version = "3.4.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "219cb19e96be00ab2e37d6e299658a0cfa83e52429179969b0f0121b4ac46983" dependencies = [ - "toml_edit 0.23.7", + "toml_edit 0.23.10+spec-1.0.0", ] [[package]] @@ -6600,6 +6633,15 @@ dependencies = [ "bitflags 2.10.0", ] +[[package]] +name = "redox_syscall" +version = "0.6.0" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "ec96166dafa0886eb81fe1c0a388bece180fbef2135f97c1e2cf8302e74b43b5" +dependencies = [ + "bitflags 2.10.0", +] + [[package]] name = "redox_users" version = "0.5.2" @@ -6686,11 +6728,11 @@ checksum = "ba39f3699c378cd8970968dcbff9c43159ea4cfbd88d43c00b22f2ef10a435d2" [[package]] name = "reqwest" -version = "0.12.25" +version = "0.12.28" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "b6eff9328d40131d43bd911d42d79eb6a47312002a4daefc9e37f17e74a7701a" +checksum = "eddd3ca559203180a307f12d114c268abf583f59b03cb906fd0b3ff8646c1147" dependencies = [ - "base64 0.22.1", + "base64", "bytes", "encoding_rs", "futures-channel", @@ -6711,7 +6753,7 @@ dependencies = [ "pin-project-lite", "quinn", "rustls 0.23.35", - "rustls-native-certs 0.8.2", + "rustls-native-certs", "rustls-pki-types", "serde", "serde_json", @@ -6777,12 +6819,12 @@ dependencies = [ [[package]] name = "rmcp" -version = "0.10.0" +version = "0.12.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "38b18323edc657390a6ed4d7a9110b0dec2dc3ed128eb2a123edfbafabdbddc5" +checksum = "528d42f8176e6e5e71ea69182b17d1d0a19a6b3b894b564678b74cd7cab13cfa" dependencies = [ "async-trait", - "base64 0.22.1", + "base64", "chrono", "futures", "pastey", @@ -6799,11 +6841,11 @@ dependencies = [ [[package]] name = "rmcp-macros" -version = "0.10.0" +version = "0.12.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "c75d0a62676bf8c8003c4e3c348e2ceb6a7b3e48323681aaf177fdccdac2ce50" +checksum = "e3f81daaa494eb8e985c9462f7d6ce1ab05e5299f48aafd76cdd3d8b060e6f59" dependencies = [ - "darling 0.21.3", + "darling 0.23.0", "proc-macro2", "quote", "serde_json", @@ -6812,22 +6854,19 @@ dependencies = [ [[package]] name = "rmp" -version = "0.8.14" +version = "0.8.15" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "228ed7c16fa39782c3b3468e974aec2795e9089153cd08ee2e9aefb3613334c4" +checksum = "4ba8be72d372b2c9b35542551678538b562e7cf86c3315773cae48dfbfe7790c" dependencies = [ - "byteorder", "num-traits", - "paste", ] [[package]] name = "rmp-serde" -version = "1.3.0" +version = "1.3.1" source 
= "registry+https://github.com/rust-lang/crates.io-index" -checksum = "52e599a477cf9840e92f2cde9a7189e67b42c57532749bf90aea6ec10facd4db" +checksum = "72f81bee8c8ef9b577d1681a70ebbc962c232461e397b22c208c43c04b67a155" dependencies = [ - "byteorder", "rmp", "serde", ] @@ -6911,8 +6950,8 @@ dependencies = [ "flume", "futures-util", "log", - "rustls-native-certs 0.8.2", - "rustls-pemfile 2.2.0", + "rustls-native-certs", + "rustls-pemfile", "rustls-webpki 0.102.8", "thiserror 2.0.17", "tokio", @@ -6988,7 +7027,7 @@ dependencies = [ "axum", "axum-extra", "axum-server", - "base64 0.22.1", + "base64", "base64-simd", "bytes", "chrono", @@ -7001,6 +7040,7 @@ dependencies = [ "hex-simd", "http 1.4.0", "http-body 1.0.1", + "http-body-util", "hyper 1.8.1", "hyper-util", "jemalloc_pprof", @@ -7034,6 +7074,7 @@ dependencies = [ "rustfs-rio", "rustfs-s3select-api", "rustfs-s3select-query", + "rustfs-scanner", "rustfs-targets", "rustfs-utils", "rustfs-zip", @@ -7042,6 +7083,7 @@ dependencies = [ "serde", "serde_json", "serde_urlencoded", + "serial_test", "shadow-rs", "socket2 0.6.1", "subtle", @@ -7112,6 +7154,7 @@ dependencies = [ name = "rustfs-audit" version = "0.0.5" dependencies = [ + "async-trait", "chrono", "const-str", "futures", @@ -7177,7 +7220,7 @@ dependencies = [ "cfg-if", "chacha20poly1305", "jsonwebtoken", - "pbkdf2 0.13.0-rc.3", + "pbkdf2 0.13.0-rc.5", "rand 0.10.0-rc.5", "serde_json", "sha2 0.11.0-rc.3", @@ -7197,7 +7240,7 @@ dependencies = [ "aws-credential-types", "aws-sdk-s3", "aws-smithy-types", - "base64 0.22.1", + "base64", "base64-simd", "byteorder", "bytes", @@ -7220,7 +7263,6 @@ dependencies = [ "lazy_static", "md-5 0.11.0-rc.3", "moka", - "nix 0.30.1", "num_cpus", "parking_lot", "path-absolutize", @@ -7262,10 +7304,10 @@ dependencies = [ "tonic", "tower", "tracing", + "tracing-subscriber", "url", "urlencoding", "uuid", - "winapi", "xxhash-rust", ] @@ -7301,6 +7343,7 @@ dependencies = [ "base64-simd", "futures", "jsonwebtoken", + "pollster", "rand 0.10.0-rc.5", "rustfs-crypto", "rustfs-ecstore", @@ -7322,7 +7365,7 @@ version = "0.0.5" dependencies = [ "aes-gcm", "async-trait", - "base64 0.22.1", + "base64", "chacha20poly1305", "chrono", "md5", @@ -7335,7 +7378,6 @@ dependencies = [ "tempfile", "thiserror 2.0.17", "tokio", - "tokio-test", "tracing", "url", "uuid", @@ -7398,6 +7440,7 @@ dependencies = [ name = "rustfs-notify" version = "0.0.5" dependencies = [ + "arc-swap", "async-trait", "axum", "chrono", @@ -7455,10 +7498,14 @@ dependencies = [ name = "rustfs-policy" version = "0.0.5" dependencies = [ + "async-trait", "base64-simd", "chrono", + "futures", "ipnetwork", "jsonwebtoken", + "moka", + "pollster", "rand 0.10.0-rc.5", "regex", "reqwest", @@ -7493,7 +7540,7 @@ name = "rustfs-rio" version = "0.0.5" dependencies = [ "aes-gcm", - "base64 0.22.1", + "base64", "bytes", "crc-fast", "faster-hex", @@ -7529,6 +7576,7 @@ dependencies = [ "futures-core", "http 1.4.0", "object_store", + "parking_lot", "pin-project-lite", "rustfs-common", "rustfs-ecstore", @@ -7558,6 +7606,39 @@ dependencies = [ "tracing", ] +[[package]] +name = "rustfs-scanner" +version = "0.0.5" +dependencies = [ + "anyhow", + "async-trait", + "chrono", + "futures", + "http 1.4.0", + "path-clean", + "rand 0.10.0-rc.5", + "rmp-serde", + "rustfs-common", + "rustfs-config", + "rustfs-ecstore", + "rustfs-filemeta", + "rustfs-madmin", + "rustfs-utils", + "s3s", + "serde", + "serde_json", + "serial_test", + "tempfile", + "thiserror 2.0.17", + "time", + "tokio", + "tokio-test", + "tokio-util", + "tracing", + 
"tracing-subscriber", + "uuid", +] + [[package]] name = "rustfs-signer" version = "0.0.5" @@ -7621,7 +7702,7 @@ dependencies = [ "regex", "rustfs-config", "rustls 0.23.35", - "rustls-pemfile 2.2.0", + "rustls-pemfile", "rustls-pki-types", "s3s", "serde", @@ -7707,9 +7788,9 @@ dependencies = [ [[package]] name = "rustix" -version = "1.1.2" +version = "1.1.3" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "cd15f8a2c5551a84d56efdc1cd049089e409ac19a3072d5037a17fd70719ff3e" +checksum = "146c9e247ccc180c1f61615433868c99f3de3ae256a30a43b49f67c2d9171f34" dependencies = [ "bitflags 2.10.0", "errno", @@ -7746,18 +7827,6 @@ dependencies = [ "zeroize", ] -[[package]] -name = "rustls-native-certs" -version = "0.6.3" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "a9aace74cb666635c918e9c12bc0d348266037aa8eb599b5cba565709a8dff00" -dependencies = [ - "openssl-probe", - "rustls-pemfile 1.0.4", - "schannel", - "security-framework 2.11.1", -] - [[package]] name = "rustls-native-certs" version = "0.8.2" @@ -7767,16 +7836,7 @@ dependencies = [ "openssl-probe", "rustls-pki-types", "schannel", - "security-framework 3.5.1", -] - -[[package]] -name = "rustls-pemfile" -version = "1.0.4" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "1c74cae0a4cf6ccbbf5f359f08efdf8ee7e1dc532573bf0db71968cb56b1448c" -dependencies = [ - "base64 0.21.7", + "security-framework", ] [[package]] @@ -7790,9 +7850,9 @@ dependencies = [ [[package]] name = "rustls-pki-types" -version = "1.13.1" +version = "1.13.2" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "708c0f9d5f54ba0272468c1d306a52c495b31fa155e91bc25371e6df7996908c" +checksum = "21e6f2ab2928ca4291b86736a8bd920a277a399bba1589409d72154ff87c1282" dependencies = [ "web-time", "zeroize", @@ -7839,15 +7899,14 @@ checksum = "b39cdef0fa800fc44525c84ccb54a029961a8215f9619753635a9c0d2538d46d" [[package]] name = "ryu" -version = "1.0.20" +version = "1.0.21" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "28d3b2b1366ec20994f1fd18c3c594f05c5dd4bc44d8bb0c1c632c8d6829481f" +checksum = "62049b2877bf12821e8f9ad256ee38fdc31db7387ec2d3b3f403024de2034aea" [[package]] name = "s3s" -version = "0.12.0-rc.4" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "538c74372dc8900685cd9bdd122a587fdf63a24f6c9d5878812241f7682338fd" +version = "0.13.0-alpha" +source = "git+https://github.com/s3s-project/s3s.git?branch=main#f6198bbf49abe60066fe47cbbefcb7078863b3e9" dependencies = [ "arrayvec", "async-trait", @@ -8008,19 +8067,6 @@ dependencies = [ "zeroize", ] -[[package]] -name = "security-framework" -version = "2.11.1" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "897b2245f0b511c87893af39b033e5ca9cce68824c4d7e7630b5a1d339658d02" -dependencies = [ - "bitflags 2.10.0", - "core-foundation 0.9.4", - "core-foundation-sys", - "libc", - "security-framework-sys", -] - [[package]] name = "security-framework" version = "3.5.1" @@ -8070,28 +8116,6 @@ dependencies = [ "serde_derive", ] -[[package]] -name = "serde-untagged" -version = "0.1.9" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "f9faf48a4a2d2693be24c6289dbe26552776eb7737074e6722891fadbe6c5058" -dependencies = [ - "erased-serde", - "serde", - "serde_core", - "typeid", -] - -[[package]] -name = "serde-value" -version = "0.7.0" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = 
"f3a1a3341211875ef120e117ea7fd5228530ae7e7036a779fdc9117be6b3282c" -dependencies = [ - "ordered-float", - "serde", -] - [[package]] name = "serde_core" version = "1.0.228" @@ -8134,15 +8158,15 @@ dependencies = [ [[package]] name = "serde_json" -version = "1.0.145" +version = "1.0.147" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "402a6f66d8c709116cf22f558eab210f5a50187f702eb4d7e5ef38d9a7f1c79c" +checksum = "6af14725505314343e673e9ecb7cd7e8a36aa9791eb936235a3567cc31447ae4" dependencies = [ "itoa", "memchr", - "ryu", "serde", "serde_core", + "zmij", ] [[package]] @@ -8183,7 +8207,7 @@ version = "3.16.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "4fa237f2807440d238e0364a218270b98f767a00d3dada77b1c53ae88940e2e7" dependencies = [ - "base64 0.22.1", + "base64", "chrono", "hex", "indexmap 1.9.3", @@ -8289,9 +8313,9 @@ dependencies = [ [[package]] name = "shadow-rs" -version = "1.4.0" +version = "1.5.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "72d18183cef626bce22836103349c7050d73db799be0171386b80947d157ae32" +checksum = "ff351910f271e7065781b6b4f0f43cb515d474d812f31176a0246d9058e47d5d" dependencies = [ "cargo_metadata", "const_format", @@ -8366,9 +8390,9 @@ dependencies = [ [[package]] name = "simd-adler32" -version = "0.3.7" +version = "0.3.8" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "d66dc143e6b11c1eddc06d5c423cfc97062865baf299914ab64caa38182078fe" +checksum = "e320a6c5ad31d271ad523dcf3ad13e2767ad8b1cb8f047f75a8aeaf8da139da2" [[package]] name = "simdutf8" @@ -8862,14 +8886,14 @@ dependencies = [ [[package]] name = "tempfile" -version = "3.23.0" +version = "3.24.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "2d31c77bdf42a745371d260a26ca7163f1e0924b64afa0b688e61b5a9fa02f16" +checksum = "655da9c7eb6305c55742045d5a8d2037996d61d8de95806335c7c86ce0f82e9c" dependencies = [ "fastrand", "getrandom 0.3.4", "once_cell", - "rustix 1.1.2", + "rustix 1.1.3", "windows-sys 0.61.2", ] @@ -9183,9 +9207,9 @@ dependencies = [ [[package]] name = "toml_datetime" -version = "0.7.3" +version = "0.7.5+spec-1.1.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "f2cdb639ebbc97961c51720f858597f7f24c4fc295327923af55b74c3c724533" +checksum = "92e1cfed4a3038bc5a127e35a2d360f145e1f4b971b551a2ba5fd7aedf7e1347" dependencies = [ "serde_core", ] @@ -9206,21 +9230,21 @@ dependencies = [ [[package]] name = "toml_edit" -version = "0.23.7" +version = "0.23.10+spec-1.0.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "6485ef6d0d9b5d0ec17244ff7eb05310113c3f316f2d14200d4de56b3cb98f8d" +checksum = "84c8b9f757e028cee9fa244aea147aab2a9ec09d5325a9b01e0a49730c2b5269" dependencies = [ "indexmap 2.12.1", - "toml_datetime 0.7.3", + "toml_datetime 0.7.5+spec-1.1.0", "toml_parser", "winnow", ] [[package]] name = "toml_parser" -version = "1.0.4" +version = "1.0.6+spec-1.1.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "c0cbe268d35bdb4bb5a56a2de88d0ad0eb70af5384a99d648cd4b3d04039800e" +checksum = "a3198b4b0a8e11f09dd03e133c0280504d0801269e9afa46362ffde1cbeebf44" dependencies = [ "winnow", ] @@ -9239,7 +9263,7 @@ checksum = "eb7613188ce9f7df5bfe185db26c5814347d110db17920415cf2fbcad85e7203" dependencies = [ "async-trait", "axum", - "base64 0.22.1", + "base64", "bytes", "flate2", "h2 0.4.12", @@ -9251,7 +9275,7 @@ dependencies = [ "hyper-util", "percent-encoding", "pin-project", - 
"rustls-native-certs 0.8.2", + "rustls-native-certs", "socket2 0.6.1", "sync_wrapper", "tokio", @@ -9360,9 +9384,9 @@ checksum = "8df9b6e13f2d32c91b9bd719c00d1958837bc7dec474d94952798cc8e69eeec3" [[package]] name = "tracing" -version = "0.1.43" +version = "0.1.44" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "2d15d90a0b5c19378952d479dc858407149d7bb45a14de0142f6c534b16fc647" +checksum = "63e71662fa4b2a2c3a26f570f037eb95bb1f85397f3cd8076caed2f026a6d100" dependencies = [ "log", "pin-project-lite", @@ -9395,9 +9419,9 @@ dependencies = [ [[package]] name = "tracing-core" -version = "0.1.35" +version = "0.1.36" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "7a04e24fab5c89c6a36eb8558c9656f30d81de51dfa4d3b45f26b21d61fa0a6c" +checksum = "db97caf9d906fbde555dd62fa95ddba9eecfd14cb388e4f491a66d74cd5fb79a" dependencies = [ "once_cell", "valuable", @@ -9516,9 +9540,9 @@ checksum = "14eff19b8dc1ace5bf7e4d920b2628ae3837f422ff42210cb1567cbf68b5accf" [[package]] name = "tzdb" -version = "0.7.2" +version = "0.7.3" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "0be2ea5956f295449f47c0b825c5e109022ff1a6a53bb4f77682a87c2341fbf5" +checksum = "56d4e985b6dda743ae7fd4140c28105316ffd75bc58258ee6cc12934e3eb7a0c" dependencies = [ "iana-time-zone", "tz-rs", @@ -9527,9 +9551,9 @@ dependencies = [ [[package]] name = "tzdb_data" -version = "0.2.2" +version = "0.2.3" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "9c4c81d75033770e40fbd3643ce7472a1a9fd301f90b7139038228daf8af03ec" +checksum = "42302a846dea7ab786f42dc5f519387069045acff793e1178d9368414168fe95" dependencies = [ "tz-rs", ] @@ -10259,7 +10283,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "32e45ad4206f6d2479085147f02bc2ef834ac85886624a23575ae137c8aa8156" dependencies = [ "libc", - "rustix 1.1.2", + "rustix 1.1.3", ] [[package]] @@ -10408,9 +10432,9 @@ dependencies = [ [[package]] name = "zip" -version = "6.0.0" +version = "7.0.0" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "eb2a05c7c36fde6c09b08576c9f7fb4cda705990f73b58fe011abf7dfb24168b" +checksum = "bdd8a47718a4ee5fe78e07667cd36f3de80e7c2bfe727c7074245ffc7303c037" dependencies = [ "aes 0.8.4", "arbitrary", @@ -10419,6 +10443,7 @@ dependencies = [ "crc32fast", "deflate64", "flate2", + "generic-array", "getrandom 0.3.4", "hmac 0.12.1", "indexmap 2.12.1", @@ -10435,9 +10460,15 @@ dependencies = [ [[package]] name = "zlib-rs" -version = "0.5.3" +version = "0.5.5" source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "36134c44663532e6519d7a6dfdbbe06f6f8192bde8ae9ed076e9b213f0e31df7" +checksum = "40990edd51aae2c2b6907af74ffb635029d5788228222c4bb811e9351c0caad3" + +[[package]] +name = "zmij" +version = "0.1.7" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "9e404bcd8afdaf006e529269d3e85a743f9480c3cef60034d77860d02964f3ba" [[package]] name = "zopfli" diff --git a/Cargo.toml b/Cargo.toml index df0fdc4a..9ea702d2 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -34,6 +34,7 @@ members = [ "crates/targets", # Target-specific configurations and utilities "crates/s3select-api", # S3 Select API interface "crates/s3select-query", # S3 Select query engine + "crates/scanner", # Scanner for data integrity checks and health monitoring "crates/signer", # client signer "crates/checksums", # client checksums "crates/utils", # Utility functions and helpers @@ -86,6 +87,7 @@ rustfs-protos = { path = 
"crates/protos", version = "0.0.5" } rustfs-rio = { path = "crates/rio", version = "0.0.5" } rustfs-s3select-api = { path = "crates/s3select-api", version = "0.0.5" } rustfs-s3select-query = { path = "crates/s3select-query", version = "0.0.5" } +rustfs-scanner = { path = "crates/scanner", version = "0.0.5" } rustfs-signer = { path = "crates/signer", version = "0.0.5" } rustfs-targets = { path = "crates/targets", version = "0.0.5" } rustfs-utils = { path = "crates/utils", version = "0.0.5" } @@ -97,18 +99,20 @@ async-channel = "2.5.0" async-compression = { version = "0.4.19" } async-recursion = "1.1.1" async-trait = "0.1.89" -axum = "0.8.7" -axum-extra = "0.12.2" +axum = "0.8.8" +axum-extra = "0.12.3" axum-server = { version = "0.8.0", features = ["tls-rustls-no-provider"], default-features = false } futures = "0.3.31" futures-core = "0.3.31" futures-util = "0.3.31" +pollster = "0.4.0" hyper = { version = "1.8.1", features = ["http2", "http1", "server"] } hyper-rustls = { version = "0.27.7", default-features = false, features = ["native-tokio", "http1", "tls12", "logging", "http2", "ring", "webpki-roots"] } hyper-util = { version = "0.1.19", features = ["tokio", "server-auto", "server-graceful"] } http = "1.4.0" http-body = "1.0.1" -reqwest = { version = "0.12.25", default-features = false, features = ["rustls-tls-webpki-roots", "charset", "http2", "system-proxy", "stream", "json", "blocking"] } +http-body-util = "0.1.3" +reqwest = { version = "0.12.28", default-features = false, features = ["rustls-tls-webpki-roots", "charset", "http2", "system-proxy", "stream", "json", "blocking"] } socket2 = "0.6.1" tokio = { version = "1.48.0", features = ["fs", "rt-multi-thread"] } tokio-rustls = { version = "0.26.4", default-features = false, features = ["logging", "tls12", "ring"] } @@ -125,31 +129,31 @@ tower-http = { version = "0.6.8", features = ["cors"] } bytes = { version = "1.11.0", features = ["serde"] } bytesize = "2.3.1" byteorder = "1.5.0" -flatbuffers = "25.9.23" +flatbuffers = "25.12.19" form_urlencoded = "1.2.2" prost = "0.14.1" quick-xml = "0.38.4" -rmcp = { version = "0.10.0" } -rmp = { version = "0.8.14" } -rmp-serde = { version = "1.3.0" } +rmcp = { version = "0.12.0" } +rmp = { version = "0.8.15" } +rmp-serde = { version = "1.3.1" } serde = { version = "1.0.228", features = ["derive"] } -serde_json = { version = "1.0.145", features = ["raw_value"] } +serde_json = { version = "1.0.147", features = ["raw_value"] } serde_urlencoded = "0.7.1" schemars = "1.1.0" # Cryptography and Security aes-gcm = { version = "0.11.0-rc.2", features = ["rand_core"] } -argon2 = { version = "0.6.0-rc.3", features = ["std"] } +argon2 = { version = "0.6.0-rc.5" } blake3 = { version = "1.8.2", features = ["rayon", "mmap"] } chacha20poly1305 = { version = "0.11.0-rc.2" } crc-fast = "1.6.0" hmac = { version = "0.13.0-rc.3" } jsonwebtoken = { version = "10.2.0", features = ["rust_crypto"] } -pbkdf2 = "0.13.0-rc.3" +pbkdf2 = "0.13.0-rc.5" rsa = { version = "0.10.0-rc.10" } rustls = { version = "0.23.35", features = ["ring", "logging", "std", "tls12"], default-features = false } rustls-pemfile = "2.2.0" -rustls-pki-types = "1.13.1" +rustls-pki-types = "1.13.2" sha1 = "0.11.0-rc.3" sha2 = "0.11.0-rc.3" subtle = "2.6" @@ -162,20 +166,20 @@ time = { version = "0.3.44", features = ["std", "parsing", "formatting", "macros # Utilities and Tools anyhow = "1.0.100" -arc-swap = "1.7.1" +arc-swap = "1.8.0" astral-tokio-tar = "0.5.6" atoi = "2.0.0" atomic_enum = "0.3.0" -aws-config = { version = "1.8.11" } 
-aws-credential-types = { version = "1.2.10" } -aws-sdk-s3 = { version = "1.116.0", default-features = false, features = ["sigv4a", "rustls", "rt-tokio"] } -aws-smithy-types = { version = "1.3.4" } +aws-config = { version = "1.8.12" } +aws-credential-types = { version = "1.2.11" } +aws-sdk-s3 = { version = "1.119.0", default-features = false, features = ["sigv4a", "rustls", "rt-tokio"] } +aws-smithy-types = { version = "1.3.5" } base64 = "0.22.1" base64-simd = "0.8.0" brotli = "8.0.2" cfg-if = "1.0.4" clap = { version = "4.5.53", features = ["derive", "env"] } -const-str = { version = "0.7.0", features = ["std", "proc"] } +const-str = { version = "0.7.1", features = ["std", "proc"] } convert_case = "0.10.0" criterion = { version = "0.8", features = ["html_reports"] } crossbeam-queue = "0.3.12" @@ -186,8 +190,8 @@ faster-hex = "0.10.0" flate2 = "1.1.5" flexi_logger = { version = "0.31.7", features = ["trc", "dont_minimize_extra_stacks", "compress", "kv", "json"] } glob = "0.3.3" -google-cloud-storage = "1.4.0" -google-cloud-auth = "1.2.0" +google-cloud-storage = "1.5.0" +google-cloud-auth = "1.3.0" hashbrown = { version = "0.16.1", features = ["serde", "rayon"] } heed = { version = "0.22.0" } hex-simd = "0.8.0" @@ -196,13 +200,13 @@ ipnetwork = { version = "0.21.1", features = ["serde"] } lazy_static = "1.5.0" libc = "0.2.178" libsystemd = "0.7.2" -local-ip-address = "0.6.6" +local-ip-address = "0.6.8" lz4 = "1.28.1" matchit = "0.9.0" md-5 = "0.11.0-rc.3" md5 = "0.8.0" mime_guess = "2.0.5" -moka = { version = "0.12.11", features = ["future"] } +moka = { version = "0.12.12", features = ["future"] } netif = "0.1.6" nix = { version = "0.30.1", features = ["fs"] } nu-ansi-term = "0.50.3" @@ -221,9 +225,9 @@ regex = { version = "1.12.2" } rumqttc = { version = "0.25.1" } rust-embed = { version = "8.9.0" } rustc-hash = { version = "2.1.1" } -s3s = { version = "0.12.0-rc.4", features = ["minio"] } +s3s = { version = "0.13.0-alpha", features = ["minio"], git = "https://github.com/s3s-project/s3s.git", branch = "main" } serial_test = "3.2.0" -shadow-rs = { version = "1.4.0", default-features = false } +shadow-rs = { version = "1.5.0", default-features = false } siphasher = "1.0.1" smallvec = { version = "1.15.1", features = ["serde"] } smartstring = "1.0.1" @@ -234,10 +238,10 @@ strum = { version = "0.27.2", features = ["derive"] } sysctl = "0.7.1" sysinfo = "0.37.2" temp-env = "0.3.6" -tempfile = "3.23.0" +tempfile = "3.24.0" test-case = "3.3.1" thiserror = "2.0.17" -tracing = { version = "0.1.43" } +tracing = { version = "0.1.44" } tracing-appender = "0.2.4" tracing-error = "0.2.1" tracing-opentelemetry = "0.32.0" @@ -251,7 +255,7 @@ walkdir = "2.5.0" wildmatch = { version = "2.6.1", features = ["serde"] } winapi = { version = "0.3.9" } xxhash-rust = { version = "0.8.15", features = ["xxh64", "xxh3"] } -zip = "6.0.0" +zip = "7.0.0" zstd = "0.13.3" # Observability and Metrics @@ -277,7 +281,7 @@ pprof = { version = "0.15.0", features = ["flamegraph", "protobuf-codec"] } [workspace.metadata.cargo-shear] -ignored = ["rustfs", "rustfs-mcp", "tokio-test"] +ignored = ["rustfs", "rustfs-mcp"] [profile.release] opt-level = 3 diff --git a/Dockerfile.source b/Dockerfile.source index c4d9a430..442775bc 100644 --- a/Dockerfile.source +++ b/Dockerfile.source @@ -39,7 +39,9 @@ RUN set -eux; \ libssl-dev \ lld \ protobuf-compiler \ - flatbuffers-compiler; \ + flatbuffers-compiler \ + gcc-aarch64-linux-gnu \ + gcc-x86-64-linux-gnu; \ rm -rf /var/lib/apt/lists/* # Optional: cross toolchain for aarch64 (only when 
targeting linux/arm64) @@ -51,18 +53,18 @@ RUN set -eux; \ rm -rf /var/lib/apt/lists/*; \ fi -# Add Rust targets based on TARGETPLATFORM +# Add Rust targets for both arches (to support cross-builds on multi-arch runners) RUN set -eux; \ - case "${TARGETPLATFORM:-linux/amd64}" in \ - linux/amd64) rustup target add x86_64-unknown-linux-gnu ;; \ - linux/arm64) rustup target add aarch64-unknown-linux-gnu ;; \ - *) echo "Unsupported TARGETPLATFORM=${TARGETPLATFORM}" >&2; exit 1 ;; \ - esac + rustup target add x86_64-unknown-linux-gnu aarch64-unknown-linux-gnu; \ + rustup component add rust-std-x86_64-unknown-linux-gnu rust-std-aarch64-unknown-linux-gnu # Cross-compilation environment (used only when targeting aarch64) ENV CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=aarch64-linux-gnu-gcc ENV CC_aarch64_unknown_linux_gnu=aarch64-linux-gnu-gcc ENV CXX_aarch64_unknown_linux_gnu=aarch64-linux-gnu-g++ +ENV CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER=x86_64-linux-gnu-gcc +ENV CC_x86_64_unknown_linux_gnu=x86_64-linux-gnu-gcc +ENV CXX_x86_64_unknown_linux_gnu=x86_64-linux-gnu-g++ WORKDIR /usr/src/rustfs @@ -72,7 +74,6 @@ COPY Cargo.toml Cargo.lock ./ # 2) workspace member manifests (adjust if workspace layout changes) COPY rustfs/Cargo.toml rustfs/Cargo.toml COPY crates/*/Cargo.toml crates/ -COPY cli/rustfs-gui/Cargo.toml cli/rustfs-gui/Cargo.toml # Pre-fetch dependencies for better caching RUN --mount=type=cache,target=/usr/local/cargo/registry \ @@ -117,6 +118,49 @@ RUN --mount=type=cache,target=/usr/local/cargo/registry \ ;; \ esac +# ----------------------------- +# Development stage (keeps toolchain) +# ----------------------------- +FROM builder AS dev + +ARG BUILD_DATE +ARG VCS_REF + +LABEL name="RustFS (dev-source)" \ + maintainer="RustFS Team" \ + build-date="${BUILD_DATE}" \ + vcs-ref="${VCS_REF}" \ + description="RustFS - local development with Rust toolchain." + +# Install runtime dependencies that might be missing in partial builder +# (builder already has build-essential, lld, etc.) 
+WORKDIR /app + +ENV CARGO_INCREMENTAL=1 + +# Ensure we have the same default env vars available +ENV RUSTFS_ADDRESS=":9000" \ + RUSTFS_ACCESS_KEY="rustfsadmin" \ + RUSTFS_SECRET_KEY="rustfsadmin" \ + RUSTFS_CONSOLE_ENABLE="true" \ + RUSTFS_VOLUMES="/data" \ + RUST_LOG="warn" \ + RUSTFS_OBS_LOG_DIRECTORY="/logs" \ + RUSTFS_USERNAME="rustfs" \ + RUSTFS_GROUPNAME="rustfs" \ + RUSTFS_UID="1000" \ + RUSTFS_GID="1000" + +# Note: We don't COPY source here because we expect it to be mounted at /app +# We rely on cargo run to build and run +EXPOSE 9000 9001 + +COPY entrypoint.sh /entrypoint.sh +RUN chmod +x /entrypoint.sh + +ENTRYPOINT ["/entrypoint.sh"] +CMD ["cargo", "run", "--bin", "rustfs", "--"] + # ----------------------------- # Runtime stage (Ubuntu minimal) # ----------------------------- diff --git a/Makefile b/Makefile index 60bfce83..40ac1738 100644 --- a/Makefile +++ b/Makefile @@ -9,30 +9,53 @@ CONTAINER_NAME ?= rustfs-dev DOCKERFILE_PRODUCTION = Dockerfile DOCKERFILE_SOURCE = Dockerfile.source +# Fatal check +# Checks all required dependencies and exits with error if not found +# (e.g., cargo, rustfmt) +check-%: + @command -v $* >/dev/null 2>&1 || { \ + echo >&2 "❌ '$*' is not installed."; \ + exit 1; \ + } + +# Warning-only check +# Checks for optional dependencies and issues a warning if not found +# (e.g., cargo-nextest for enhanced testing) +warn-%: + @command -v $* >/dev/null 2>&1 || { \ + echo >&2 "⚠️ '$*' is not installed."; \ + } + +# For checking dependencies use check- or warn- +.PHONY: core-deps fmt-deps test-deps +core-deps: check-cargo +fmt-deps: check-rustfmt +test-deps: warn-cargo-nextest + # Code quality and formatting targets .PHONY: fmt -fmt: +fmt: core-deps fmt-deps @echo "🔧 Formatting code..." cargo fmt --all .PHONY: fmt-check -fmt-check: +fmt-check: core-deps fmt-deps @echo "📝 Checking code formatting..." cargo fmt --all --check .PHONY: clippy -clippy: +clippy: core-deps @echo "🔍 Running clippy checks..." cargo clippy --fix --allow-dirty cargo clippy --all-targets --all-features -- -D warnings .PHONY: check -check: +check: core-deps @echo "🔨 Running compilation check..." cargo check --all-targets .PHONY: test -test: +test: core-deps test-deps @echo "🧪 Running tests..." @if command -v cargo-nextest >/dev/null 2>&1; then \ cargo nextest run --all --exclude e2e_test; \ @@ -42,16 +65,16 @@ test: fi cargo test --all --doc -.PHONY: pre-commit -pre-commit: fmt clippy check test - @echo "✅ All pre-commit checks passed!" - .PHONY: setup-hooks setup-hooks: @echo "🔧 Setting up git hooks..." chmod +x .git/hooks/pre-commit @echo "✅ Git hooks setup complete!" +.PHONY: pre-commit +pre-commit: fmt clippy check test + @echo "✅ All pre-commit checks passed!" + .PHONY: e2e-server e2e-server: sh $(shell pwd)/scripts/run.sh @@ -186,8 +209,6 @@ docker-dev-push: --push \ . - - # Local production builds using direct buildx (alternative to docker-buildx.sh) .PHONY: docker-buildx-production-local docker-buildx-production-local: @@ -247,8 +268,6 @@ dev-env-stop: .PHONY: dev-env-restart dev-env-restart: dev-env-stop dev-env-start - - # ======================================================================================== # Build Utilities # ======================================================================================== diff --git a/README.md b/README.md index bf6d7fb6..3bf4d0e4 100644 --- a/README.md +++ b/README.md @@ -103,7 +103,7 @@ The RustFS container runs as a non-root user `rustfs` (UID `10001`). 
If you run docker run -d -p 9000:9000 -p 9001:9001 -v $(pwd)/data:/data -v $(pwd)/logs:/logs rustfs/rustfs:latest # Using specific version - docker run -d -p 9000:9000 -p 9001:9001 -v $(pwd)/data:/data -v $(pwd)/logs:/logs rustfs/rustfs:1.0.0.alpha.68 + docker run -d -p 9000:9000 -p 9001:9001 -v $(pwd)/data:/data -v $(pwd)/logs:/logs rustfs/rustfs:1.0.0-alpha.76 ``` You can also use Docker Compose. Using the `docker-compose.yml` file in the root directory: @@ -153,11 +153,28 @@ make help-docker # Show all Docker-related commands Follow the instructions in the [Helm Chart README](https://charts.rustfs.com/) to install RustFS on a Kubernetes cluster. +### 5\. Nix Flake (Option 5) + +If you have [Nix with flakes enabled](https://nixos.wiki/wiki/Flakes#Enable_flakes): + +```bash +# Run directly without installing +nix run github:rustfs/rustfs + +# Build the binary +nix build github:rustfs/rustfs +./result/bin/rustfs --help + +# Or from a local checkout +nix build +nix run +``` + ----- ### Accessing RustFS -5. **Access the Console**: Open your web browser and navigate to `http://localhost:9000` to access the RustFS console. +5. **Access the Console**: Open your web browser and navigate to `http://localhost:9001` to access the RustFS console. * Default credentials: `rustfsadmin` / `rustfsadmin` 6. **Create a Bucket**: Use the console to create a new bucket for your objects. 7. **Upload Objects**: You can upload files directly through the console or use S3-compatible APIs/clients to interact with your RustFS instance. diff --git a/SECURITY.md b/SECURITY.md index 988d29e9..7f28a238 100644 --- a/SECURITY.md +++ b/SECURITY.md @@ -2,8 +2,7 @@ ## Supported Versions -Use this section to tell people about which versions of your project are -currently being supported with security updates. +Security updates are provided for the latest released version of this project. | Version | Supported | | ------- | ------------------ | @@ -11,8 +10,10 @@ currently being supported with security updates. ## Reporting a Vulnerability -Use this section to tell people how to report a vulnerability. +Please report security vulnerabilities **privately** via GitHub Security Advisories: -Tell them where to go, how often they can expect to get an update on a -reported vulnerability, what to expect if the vulnerability is accepted or -declined, etc. +https://github.com/rustfs/rustfs/security/advisories/new + +Do **not** open a public issue for security-sensitive bugs. + +You can expect an initial response within a reasonable timeframe. Further updates will be provided as the report is triaged. 
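As a supplement to the README's note above that S3-compatible APIs and clients can be used to interact with a RustFS instance, here is a minimal sketch using the AWS CLI. It assumes the default `rustfsadmin` / `rustfsadmin` credentials and the S3 endpoint on port 9000 from the quick-start section; the bucket and file names are placeholders.

```bash
# Point the AWS CLI at the local RustFS S3 endpoint (default credentials assumed)
export AWS_ACCESS_KEY_ID=rustfsadmin
export AWS_SECRET_ACCESS_KEY=rustfsadmin

# Create a bucket, upload a file, and list its contents via the S3-compatible API
aws --endpoint-url http://localhost:9000 s3 mb s3://demo-bucket
aws --endpoint-url http://localhost:9000 s3 cp ./hello.txt s3://demo-bucket/hello.txt
aws --endpoint-url http://localhost:9000 s3 ls s3://demo-bucket
```

Any other S3 SDK or client should work the same way against the same endpoint and credentials.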
diff --git a/build-rustfs.sh b/build-rustfs.sh index 651ef735..51e2383c 100755 --- a/build-rustfs.sh +++ b/build-rustfs.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # RustFS Binary Build Script # This script compiles RustFS binaries for different platforms and architectures diff --git a/crates/ahm/src/heal/channel.rs b/crates/ahm/src/heal/channel.rs index 4490239f..4a2df7f5 100644 --- a/crates/ahm/src/heal/channel.rs +++ b/crates/ahm/src/heal/channel.rs @@ -183,7 +183,7 @@ impl HealChannelProcessor { HealType::Object { bucket: request.bucket.clone(), object: prefix.clone(), - version_id: None, + version_id: request.object_version_id.clone(), } } else { HealType::Bucket { @@ -366,6 +366,7 @@ mod tests { id: "test-id".to_string(), bucket: "test-bucket".to_string(), object_prefix: None, + object_version_id: None, disk: None, priority: HealChannelPriority::Normal, scan_mode: None, @@ -394,6 +395,7 @@ mod tests { id: "test-id".to_string(), bucket: "test-bucket".to_string(), object_prefix: Some("test-object".to_string()), + object_version_id: None, disk: None, priority: HealChannelPriority::High, scan_mode: Some(HealScanMode::Deep), @@ -425,6 +427,7 @@ mod tests { id: "test-id".to_string(), bucket: "test-bucket".to_string(), object_prefix: None, + object_version_id: None, disk: Some("pool_0_set_1".to_string()), priority: HealChannelPriority::Critical, scan_mode: None, @@ -453,6 +456,7 @@ mod tests { id: "test-id".to_string(), bucket: "test-bucket".to_string(), object_prefix: None, + object_version_id: None, disk: Some("invalid-disk-id".to_string()), priority: HealChannelPriority::Normal, scan_mode: None, @@ -488,6 +492,7 @@ mod tests { id: "test-id".to_string(), bucket: "test-bucket".to_string(), object_prefix: None, + object_version_id: None, disk: None, priority: channel_priority, scan_mode: None, @@ -516,6 +521,7 @@ mod tests { id: "test-id".to_string(), bucket: "test-bucket".to_string(), object_prefix: None, + object_version_id: None, disk: None, priority: HealChannelPriority::Normal, scan_mode: None, @@ -545,6 +551,7 @@ mod tests { id: "test-id".to_string(), bucket: "test-bucket".to_string(), object_prefix: Some("".to_string()), // Empty prefix should be treated as bucket heal + object_version_id: None, disk: None, priority: HealChannelPriority::Normal, scan_mode: None, diff --git a/crates/ahm/src/heal/manager.rs b/crates/ahm/src/heal/manager.rs index 39c5f8fd..4e287e38 100644 --- a/crates/ahm/src/heal/manager.rs +++ b/crates/ahm/src/heal/manager.rs @@ -468,14 +468,17 @@ impl HealManager { let active_heals = self.active_heals.clone(); let cancel_token = self.cancel_token.clone(); let storage = self.storage.clone(); - - info!( - "start_auto_disk_scanner: Starting auto disk scanner with interval: {:?}", - config.read().await.heal_interval - ); + let mut duration = { + let config = config.read().await; + config.heal_interval + }; + if duration < Duration::from_secs(1) { + duration = Duration::from_secs(1); + } + info!("start_auto_disk_scanner: Starting auto disk scanner with interval: {:?}", duration); tokio::spawn(async move { - let mut interval = interval(config.read().await.heal_interval); + let mut interval = interval(duration); loop { tokio::select! 
{ diff --git a/crates/ahm/src/scanner/data_scanner.rs b/crates/ahm/src/scanner/data_scanner.rs index 900d40ce..eaf85255 100644 --- a/crates/ahm/src/scanner/data_scanner.rs +++ b/crates/ahm/src/scanner/data_scanner.rs @@ -29,8 +29,8 @@ use rustfs_ecstore::{ self as ecstore, StorageAPI, bucket::versioning::VersioningApi, bucket::versioning_sys::BucketVersioningSys, - data_usage::{aggregate_local_snapshots, store_data_usage_in_backend}, - disk::{Disk, DiskAPI, DiskStore, RUSTFS_META_BUCKET, WalkDirOptions}, + data_usage::{aggregate_local_snapshots, compute_bucket_usage, store_data_usage_in_backend}, + disk::{DiskAPI, DiskStore, RUSTFS_META_BUCKET, WalkDirOptions}, set_disk::SetDisks, store_api::ObjectInfo, }; @@ -137,6 +137,8 @@ pub struct Scanner { data_usage_stats: Arc>>, /// Last data usage statistics collection time last_data_usage_collection: Arc>>, + /// Backoff timestamp for heavy fallback collection + fallback_backoff_until: Arc>>, /// Heal manager for auto-heal integration heal_manager: Option>, @@ -192,6 +194,7 @@ impl Scanner { disk_metrics: Arc::new(Mutex::new(HashMap::new())), data_usage_stats: Arc::new(Mutex::new(HashMap::new())), last_data_usage_collection: Arc::new(RwLock::new(None)), + fallback_backoff_until: Arc::new(RwLock::new(None)), heal_manager, node_scanner, stats_aggregator, @@ -473,6 +476,8 @@ impl Scanner { size: usage.total_size as i64, delete_marker: !usage.has_live_object && usage.delete_markers_count > 0, mod_time: usage.last_modified_ns.and_then(Self::ns_to_offset_datetime), + // Set is_latest to true for live objects - required for lifecycle expiration evaluation + is_latest: usage.has_live_object, ..Default::default() } } @@ -879,6 +884,7 @@ impl Scanner { /// Collect and persist data usage statistics async fn collect_and_persist_data_usage(&self) -> Result<()> { info!("Starting data usage collection and persistence"); + let now = SystemTime::now(); // Get ECStore instance let Some(ecstore) = rustfs_ecstore::new_object_layer_fn() else { @@ -886,6 +892,10 @@ impl Scanner { return Ok(()); }; + // Helper to avoid hammering the storage layer with repeated realtime scans. + let mut use_cached_on_backoff = false; + let fallback_backoff_secs = Duration::from_secs(300); + // Run local usage scan and aggregate snapshots; fall back to on-demand build when necessary. let mut data_usage = match local_scan::scan_and_persist_local_usage(ecstore.clone()).await { Ok(outcome) => { @@ -907,16 +917,55 @@ impl Scanner { "Failed to aggregate local data usage snapshots, falling back to realtime collection: {}", e ); - self.build_data_usage_from_ecstore(&ecstore).await? + match self.maybe_fallback_collection(now, fallback_backoff_secs, &ecstore).await? { + Some(usage) => usage, + None => { + use_cached_on_backoff = true; + DataUsageInfo::default() + } + } } } } Err(e) => { warn!("Local usage scan failed (using realtime collection instead): {}", e); - self.build_data_usage_from_ecstore(&ecstore).await? + match self.maybe_fallback_collection(now, fallback_backoff_secs, &ecstore).await? { + Some(usage) => usage, + None => { + use_cached_on_backoff = true; + DataUsageInfo::default() + } + } } }; + // If heavy fallback was skipped due to backoff, try to reuse cached stats to avoid empty responses. 
+ if use_cached_on_backoff && data_usage.buckets_usage.is_empty() { + let cached = { + let guard = self.data_usage_stats.lock().await; + guard.values().next().cloned() + }; + if let Some(cached_usage) = cached { + data_usage = cached_usage; + } + + // If there is still no data, try backend before persisting zeros + if data_usage.buckets_usage.is_empty() { + if let Ok(existing) = rustfs_ecstore::data_usage::load_data_usage_from_backend(ecstore.clone()).await { + if !existing.buckets_usage.is_empty() { + info!("Using existing backend data usage during fallback backoff"); + data_usage = existing; + } + } + } + + // Avoid overwriting valid backend stats with zeros when fallback is throttled + if data_usage.buckets_usage.is_empty() { + warn!("Skipping data usage persistence: fallback throttled and no cached/backend data available"); + return Ok(()); + } + } + // Make sure bucket counters reflect aggregated content data_usage.buckets_count = data_usage.buckets_usage.len() as u64; if data_usage.last_update.is_none() { @@ -959,8 +1008,31 @@ impl Scanner { Ok(()) } + async fn maybe_fallback_collection( + &self, + now: SystemTime, + backoff: Duration, + ecstore: &Arc, + ) -> Result> { + let backoff_until = *self.fallback_backoff_until.read().await; + let within_backoff = backoff_until.map(|ts| now < ts).unwrap_or(false); + + if within_backoff { + warn!( + "Skipping heavy data usage fallback within backoff window (until {:?}); using cached stats if available", + backoff_until + ); + return Ok(None); + } + + let usage = self.build_data_usage_from_ecstore(ecstore).await?; + let mut backoff_guard = self.fallback_backoff_until.write().await; + *backoff_guard = Some(now + backoff); + Ok(Some(usage)) + } + /// Build data usage statistics directly from ECStore - async fn build_data_usage_from_ecstore(&self, ecstore: &Arc) -> Result { + pub async fn build_data_usage_from_ecstore(&self, ecstore: &Arc) -> Result { let mut data_usage = DataUsageInfo::default(); // Get bucket list @@ -973,6 +1045,8 @@ impl Scanner { data_usage.last_update = Some(SystemTime::now()); let mut total_objects = 0u64; + let mut total_versions = 0u64; + let mut total_delete_markers = 0u64; let mut total_size = 0u64; for bucket_info in buckets { @@ -980,37 +1054,26 @@ impl Scanner { continue; // Skip system buckets } - // Try to get actual object count for this bucket - let (object_count, bucket_size) = match ecstore - .clone() - .list_objects_v2( - &bucket_info.name, - "", // prefix - None, // continuation_token - None, // delimiter - 100, // max_keys - small limit for performance - false, // fetch_owner - None, // start_after - false, // incl_deleted - ) - .await - { - Ok(result) => { - let count = result.objects.len() as u64; - let size = result.objects.iter().map(|obj| obj.size as u64).sum(); - (count, size) - } - Err(_) => (0, 0), - }; + // Use ecstore pagination helper to avoid truncating at 100 objects + let (object_count, bucket_size, versions_count, delete_markers) = + match compute_bucket_usage(ecstore.clone(), &bucket_info.name).await { + Ok(usage) => (usage.objects_count, usage.size, usage.versions_count, usage.delete_markers_count), + Err(e) => { + warn!("Failed to compute bucket usage for {}: {}", bucket_info.name, e); + (0, 0, 0, 0) + } + }; total_objects += object_count; + total_versions += versions_count; + total_delete_markers += delete_markers; total_size += bucket_size; let bucket_usage = rustfs_common::data_usage::BucketUsageInfo { size: bucket_size, objects_count: object_count, - versions_count: object_count, // 
Simplified - delete_markers_count: 0, + versions_count, + delete_markers_count: delete_markers, ..Default::default() }; @@ -1020,7 +1083,8 @@ impl Scanner { data_usage.objects_total_count = total_objects; data_usage.objects_total_size = total_size; - data_usage.versions_total_count = total_objects; + data_usage.versions_total_count = total_versions; + data_usage.delete_markers_total_count = total_delete_markers; } Err(e) => { warn!("Failed to list buckets for data usage collection: {}", e); @@ -1913,7 +1977,7 @@ impl Scanner { } else { // Apply lifecycle actions if let Some(lifecycle_config) = &lifecycle_config { - if let Disk::Local(_local_disk) = &**disk { + if disk.is_local() { let vcfg = BucketVersioningSys::get(bucket).await.ok(); let mut scanner_item = ScannerItem { @@ -2554,6 +2618,7 @@ impl Scanner { disk_metrics: Arc::clone(&self.disk_metrics), data_usage_stats: Arc::clone(&self.data_usage_stats), last_data_usage_collection: Arc::clone(&self.last_data_usage_collection), + fallback_backoff_until: Arc::clone(&self.fallback_backoff_until), heal_manager: self.heal_manager.clone(), node_scanner: Arc::clone(&self.node_scanner), stats_aggregator: Arc::clone(&self.stats_aggregator), diff --git a/crates/ahm/src/scanner/local_scan/mod.rs b/crates/ahm/src/scanner/local_scan/mod.rs index 7e31d711..39387c24 100644 --- a/crates/ahm/src/scanner/local_scan/mod.rs +++ b/crates/ahm/src/scanner/local_scan/mod.rs @@ -84,6 +84,9 @@ pub async fn scan_and_persist_local_usage(store: Arc) -> Result) -> Result id.to_string(), None => { diff --git a/crates/ahm/src/scanner/stats_aggregator.rs b/crates/ahm/src/scanner/stats_aggregator.rs index ed56b549..0c019c3a 100644 --- a/crates/ahm/src/scanner/stats_aggregator.rs +++ b/crates/ahm/src/scanner/stats_aggregator.rs @@ -347,7 +347,8 @@ impl DecentralizedStatsAggregator { // update cache *self.cached_stats.write().await = Some(aggregated.clone()); - *self.cache_timestamp.write().await = aggregation_timestamp; + // Use the time when aggregation completes as cache timestamp to avoid premature expiry during long runs + *self.cache_timestamp.write().await = SystemTime::now(); Ok(aggregated) } @@ -359,7 +360,8 @@ impl DecentralizedStatsAggregator { // update cache *self.cached_stats.write().await = Some(aggregated.clone()); - *self.cache_timestamp.write().await = now; + // Cache timestamp should reflect completion time rather than aggregation start + *self.cache_timestamp.write().await = SystemTime::now(); Ok(aggregated) } diff --git a/crates/ahm/tests/data_usage_fallback_test.rs b/crates/ahm/tests/data_usage_fallback_test.rs new file mode 100644 index 00000000..03a7cfe5 --- /dev/null +++ b/crates/ahm/tests/data_usage_fallback_test.rs @@ -0,0 +1,112 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +#![cfg(test)] + +use rustfs_ahm::scanner::data_scanner::Scanner; +use rustfs_common::data_usage::DataUsageInfo; +use rustfs_ecstore::GLOBAL_Endpoints; +use rustfs_ecstore::bucket::metadata_sys::{BucketMetadataSys, GLOBAL_BucketMetadataSys}; +use rustfs_ecstore::endpoints::EndpointServerPools; +use rustfs_ecstore::store::ECStore; +use rustfs_ecstore::store_api::{ObjectIO, PutObjReader, StorageAPI}; +use std::sync::{Arc, Once}; +use tempfile::TempDir; +use tokio::sync::RwLock; +use tokio_util::sync::CancellationToken; +use tracing::Level; + +/// Build a minimal single-node ECStore over a temp directory and populate objects. +async fn create_store_with_objects(count: usize) -> (TempDir, std::sync::Arc) { + let temp_dir = TempDir::new().expect("temp dir"); + let root = temp_dir.path().to_string_lossy().to_string(); + + // Create endpoints from the temp dir + let (endpoint_pools, _setup) = EndpointServerPools::from_volumes("127.0.0.1:0", vec![root]) + .await + .expect("endpoint pools"); + + // Seed globals required by metadata sys if not already set + if GLOBAL_Endpoints.get().is_none() { + let _ = GLOBAL_Endpoints.set(endpoint_pools.clone()); + } + + let store = ECStore::new("127.0.0.1:0".parse().unwrap(), endpoint_pools, CancellationToken::new()) + .await + .expect("create store"); + + if rustfs_ecstore::global::new_object_layer_fn().is_none() { + rustfs_ecstore::global::set_object_layer(store.clone()).await; + } + + // Initialize metadata system before bucket operations + if GLOBAL_BucketMetadataSys.get().is_none() { + let mut sys = BucketMetadataSys::new(store.clone()); + sys.init(Vec::new()).await; + let _ = GLOBAL_BucketMetadataSys.set(Arc::new(RwLock::new(sys))); + } + + store + .make_bucket("fallback-bucket", &rustfs_ecstore::store_api::MakeBucketOptions::default()) + .await + .expect("make bucket"); + + for i in 0..count { + let key = format!("obj-{i:04}"); + let data = format!("payload-{i}"); + let mut reader = PutObjReader::from_vec(data.into_bytes()); + store + .put_object("fallback-bucket", &key, &mut reader, &rustfs_ecstore::store_api::ObjectOptions::default()) + .await + .expect("put object"); + } + + (temp_dir, store) +} + +static INIT: Once = Once::new(); + +fn init_tracing(filter_level: Level) { + INIT.call_once(|| { + let _ = tracing_subscriber::fmt() + .with_env_filter(tracing_subscriber::EnvFilter::from_default_env()) + .with_max_level(filter_level) + .with_timer(tracing_subscriber::fmt::time::UtcTime::rfc_3339()) + .with_thread_names(true) + .try_init(); + }); +} + +#[tokio::test] +async fn fallback_builds_full_counts_over_100_objects() { + init_tracing(Level::ERROR); + let (_tmp, store) = create_store_with_objects(1000).await; + let scanner = Scanner::new(None, None); + + // Directly call the fallback builder to ensure pagination works. 
+ let usage: DataUsageInfo = scanner.build_data_usage_from_ecstore(&store).await.expect("fallback usage"); + + let bucket = usage.buckets_usage.get("fallback-bucket").expect("bucket usage present"); + + assert!( + usage.objects_total_count >= 1000, + "total objects should be >=1000, got {}", + usage.objects_total_count + ); + assert!( + bucket.objects_count >= 1000, + "bucket objects should be >=1000, got {}", + bucket.objects_count + ); +} diff --git a/crates/ahm/tests/heal_integration_test.rs b/crates/ahm/tests/heal_integration_test.rs index 85ce694e..8ad68d79 100644 --- a/crates/ahm/tests/heal_integration_test.rs +++ b/crates/ahm/tests/heal_integration_test.rs @@ -38,9 +38,13 @@ use walkdir::WalkDir; static GLOBAL_ENV: OnceLock<(Vec, Arc, Arc)> = OnceLock::new(); static INIT: Once = Once::new(); -fn init_tracing() { +pub fn init_tracing() { INIT.call_once(|| { - let _ = tracing_subscriber::fmt::try_init(); + let _ = tracing_subscriber::fmt() + .with_env_filter(tracing_subscriber::EnvFilter::from_default_env()) + .with_timer(tracing_subscriber::fmt::time::UtcTime::rfc_3339()) + .with_thread_names(true) + .try_init(); }); } @@ -356,7 +360,7 @@ mod serial_tests { // Create heal manager with faster interval let cfg = HealConfig { - heal_interval: Duration::from_secs(2), + heal_interval: Duration::from_secs(1), ..Default::default() }; let heal_manager = HealManager::new(heal_storage.clone(), Some(cfg)); diff --git a/crates/audit/Cargo.toml b/crates/audit/Cargo.toml index 414e05fc..ae97033e 100644 --- a/crates/audit/Cargo.toml +++ b/crates/audit/Cargo.toml @@ -29,6 +29,7 @@ categories = ["web-programming", "development-tools", "asynchronous", "api-bindi rustfs-targets = { workspace = true } rustfs-config = { workspace = true, features = ["audit", "constants"] } rustfs-ecstore = { workspace = true } +async-trait = { workspace = true } chrono = { workspace = true } const-str = { workspace = true } futures = { workspace = true } diff --git a/crates/audit/src/factory.rs b/crates/audit/src/factory.rs new file mode 100644 index 00000000..9beded31 --- /dev/null +++ b/crates/audit/src/factory.rs @@ -0,0 +1,224 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +use crate::AuditEntry; +use async_trait::async_trait; +use hashbrown::HashSet; +use rumqttc::QoS; +use rustfs_config::audit::{AUDIT_MQTT_KEYS, AUDIT_WEBHOOK_KEYS, ENV_AUDIT_MQTT_KEYS, ENV_AUDIT_WEBHOOK_KEYS}; +use rustfs_config::{ + AUDIT_DEFAULT_DIR, DEFAULT_LIMIT, MQTT_BROKER, MQTT_KEEP_ALIVE_INTERVAL, MQTT_PASSWORD, MQTT_QOS, MQTT_QUEUE_DIR, + MQTT_QUEUE_LIMIT, MQTT_RECONNECT_INTERVAL, MQTT_TOPIC, MQTT_USERNAME, WEBHOOK_AUTH_TOKEN, WEBHOOK_CLIENT_CERT, + WEBHOOK_CLIENT_KEY, WEBHOOK_ENDPOINT, WEBHOOK_QUEUE_DIR, WEBHOOK_QUEUE_LIMIT, +}; +use rustfs_ecstore::config::KVS; +use rustfs_targets::{ + Target, + error::TargetError, + target::{mqtt::MQTTArgs, webhook::WebhookArgs}, +}; +use std::time::Duration; +use tracing::{debug, warn}; +use url::Url; + +/// Trait for creating targets from configuration +#[async_trait] +pub trait TargetFactory: Send + Sync { + /// Creates a target from configuration + async fn create_target(&self, id: String, config: &KVS) -> Result + Send + Sync>, TargetError>; + + /// Validates target configuration + fn validate_config(&self, id: &str, config: &KVS) -> Result<(), TargetError>; + + /// Returns a set of valid configuration field names for this target type. + /// This is used to filter environment variables. + fn get_valid_fields(&self) -> HashSet; + + /// Returns a set of valid configuration env field names for this target type. + /// This is used to filter environment variables. + fn get_valid_env_fields(&self) -> HashSet; +} + +/// Factory for creating Webhook targets +pub struct WebhookTargetFactory; + +#[async_trait] +impl TargetFactory for WebhookTargetFactory { + async fn create_target(&self, id: String, config: &KVS) -> Result + Send + Sync>, TargetError> { + // All config values are now read directly from the merged `config` KVS. + let endpoint = config + .lookup(WEBHOOK_ENDPOINT) + .ok_or_else(|| TargetError::Configuration("Missing webhook endpoint".to_string()))?; + let parsed_endpoint = endpoint.trim(); + let endpoint_url = Url::parse(parsed_endpoint) + .map_err(|e| TargetError::Configuration(format!("Invalid endpoint URL: {e} (value: '{parsed_endpoint}')")))?; + + let args = WebhookArgs { + enable: true, // If we are here, it's already enabled. + endpoint: endpoint_url, + auth_token: config.lookup(WEBHOOK_AUTH_TOKEN).unwrap_or_default(), + queue_dir: config.lookup(WEBHOOK_QUEUE_DIR).unwrap_or(AUDIT_DEFAULT_DIR.to_string()), + queue_limit: config + .lookup(WEBHOOK_QUEUE_LIMIT) + .and_then(|v| v.parse::().ok()) + .unwrap_or(DEFAULT_LIMIT), + client_cert: config.lookup(WEBHOOK_CLIENT_CERT).unwrap_or_default(), + client_key: config.lookup(WEBHOOK_CLIENT_KEY).unwrap_or_default(), + target_type: rustfs_targets::target::TargetType::AuditLog, + }; + + let target = rustfs_targets::target::webhook::WebhookTarget::new(id, args)?; + Ok(Box::new(target)) + } + + fn validate_config(&self, _id: &str, config: &KVS) -> Result<(), TargetError> { + // Validation also uses the merged `config` KVS directly. 
+ let endpoint = config + .lookup(WEBHOOK_ENDPOINT) + .ok_or_else(|| TargetError::Configuration("Missing webhook endpoint".to_string()))?; + debug!("endpoint: {}", endpoint); + let parsed_endpoint = endpoint.trim(); + Url::parse(parsed_endpoint) + .map_err(|e| TargetError::Configuration(format!("Invalid endpoint URL: {e} (value: '{parsed_endpoint}')")))?; + + let client_cert = config.lookup(WEBHOOK_CLIENT_CERT).unwrap_or_default(); + let client_key = config.lookup(WEBHOOK_CLIENT_KEY).unwrap_or_default(); + + if client_cert.is_empty() != client_key.is_empty() { + return Err(TargetError::Configuration( + "Both client_cert and client_key must be specified together".to_string(), + )); + } + + let queue_dir = config.lookup(WEBHOOK_QUEUE_DIR).unwrap_or(AUDIT_DEFAULT_DIR.to_string()); + if !queue_dir.is_empty() && !std::path::Path::new(&queue_dir).is_absolute() { + return Err(TargetError::Configuration("Webhook queue directory must be an absolute path".to_string())); + } + + Ok(()) + } + + fn get_valid_fields(&self) -> HashSet { + AUDIT_WEBHOOK_KEYS.iter().map(|s| s.to_string()).collect() + } + + fn get_valid_env_fields(&self) -> HashSet { + ENV_AUDIT_WEBHOOK_KEYS.iter().map(|s| s.to_string()).collect() + } +} + +/// Factory for creating MQTT targets +pub struct MQTTTargetFactory; + +#[async_trait] +impl TargetFactory for MQTTTargetFactory { + async fn create_target(&self, id: String, config: &KVS) -> Result + Send + Sync>, TargetError> { + let broker = config + .lookup(MQTT_BROKER) + .ok_or_else(|| TargetError::Configuration("Missing MQTT broker".to_string()))?; + let broker_url = Url::parse(&broker) + .map_err(|e| TargetError::Configuration(format!("Invalid broker URL: {e} (value: '{broker}')")))?; + + let topic = config + .lookup(MQTT_TOPIC) + .ok_or_else(|| TargetError::Configuration("Missing MQTT topic".to_string()))?; + + let args = MQTTArgs { + enable: true, // Assumed enabled. 
+ broker: broker_url, + topic, + qos: config + .lookup(MQTT_QOS) + .and_then(|v| v.parse::().ok()) + .map(|q| match q { + 0 => QoS::AtMostOnce, + 1 => QoS::AtLeastOnce, + 2 => QoS::ExactlyOnce, + _ => QoS::AtLeastOnce, + }) + .unwrap_or(QoS::AtLeastOnce), + username: config.lookup(MQTT_USERNAME).unwrap_or_default(), + password: config.lookup(MQTT_PASSWORD).unwrap_or_default(), + max_reconnect_interval: config + .lookup(MQTT_RECONNECT_INTERVAL) + .and_then(|v| v.parse::().ok()) + .map(Duration::from_secs) + .unwrap_or_else(|| Duration::from_secs(5)), + keep_alive: config + .lookup(MQTT_KEEP_ALIVE_INTERVAL) + .and_then(|v| v.parse::().ok()) + .map(Duration::from_secs) + .unwrap_or_else(|| Duration::from_secs(30)), + queue_dir: config.lookup(MQTT_QUEUE_DIR).unwrap_or(AUDIT_DEFAULT_DIR.to_string()), + queue_limit: config + .lookup(MQTT_QUEUE_LIMIT) + .and_then(|v| v.parse::().ok()) + .unwrap_or(DEFAULT_LIMIT), + target_type: rustfs_targets::target::TargetType::AuditLog, + }; + + let target = rustfs_targets::target::mqtt::MQTTTarget::new(id, args)?; + Ok(Box::new(target)) + } + + fn validate_config(&self, _id: &str, config: &KVS) -> Result<(), TargetError> { + let broker = config + .lookup(MQTT_BROKER) + .ok_or_else(|| TargetError::Configuration("Missing MQTT broker".to_string()))?; + let url = Url::parse(&broker) + .map_err(|e| TargetError::Configuration(format!("Invalid broker URL: {e} (value: '{broker}')")))?; + + match url.scheme() { + "tcp" | "ssl" | "ws" | "wss" | "mqtt" | "mqtts" => {} + _ => { + return Err(TargetError::Configuration("Unsupported broker URL scheme".to_string())); + } + } + + if config.lookup(MQTT_TOPIC).is_none() { + return Err(TargetError::Configuration("Missing MQTT topic".to_string())); + } + + if let Some(qos_str) = config.lookup(MQTT_QOS) { + let qos = qos_str + .parse::() + .map_err(|_| TargetError::Configuration("Invalid QoS value".to_string()))?; + if qos > 2 { + return Err(TargetError::Configuration("QoS must be 0, 1, or 2".to_string())); + } + } + + let queue_dir = config.lookup(MQTT_QUEUE_DIR).unwrap_or_default(); + if !queue_dir.is_empty() { + if !std::path::Path::new(&queue_dir).is_absolute() { + return Err(TargetError::Configuration("MQTT queue directory must be an absolute path".to_string())); + } + if let Some(qos_str) = config.lookup(MQTT_QOS) { + if qos_str == "0" { + warn!("Using queue_dir with QoS 0 may result in event loss"); + } + } + } + + Ok(()) + } + + fn get_valid_fields(&self) -> HashSet { + AUDIT_MQTT_KEYS.iter().map(|s| s.to_string()).collect() + } + + fn get_valid_env_fields(&self) -> HashSet { + ENV_AUDIT_MQTT_KEYS.iter().map(|s| s.to_string()).collect() + } +} diff --git a/crates/audit/src/lib.rs b/crates/audit/src/lib.rs index 8207bc23..7cca0063 100644 --- a/crates/audit/src/lib.rs +++ b/crates/audit/src/lib.rs @@ -20,6 +20,7 @@ pub mod entity; pub mod error; +pub mod factory; pub mod global; pub mod observability; pub mod registry; diff --git a/crates/audit/src/registry.rs b/crates/audit/src/registry.rs index 30aa325a..c73b300a 100644 --- a/crates/audit/src/registry.rs +++ b/crates/audit/src/registry.rs @@ -12,29 +12,26 @@ // See the License for the specific language governing permissions and // limitations under the License. 
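The registry changes that follow wire these webhook/MQTT factories into a lookup keyed by target type, so new target kinds can be added without touching the config-resolution loop. A reduced, synchronous sketch of that factory-registry pattern is shown below; every name here is illustrative only, and the real trait is async and produces audit targets rather than strings.

```rust
use std::collections::HashMap;

// Reduced illustration of the factory-registry pattern used by AuditRegistry.
trait Factory {
    fn create(&self, id: &str) -> String;
}

struct WebhookFactory;
impl Factory for WebhookFactory {
    fn create(&self, id: &str) -> String {
        format!("webhook target '{id}'")
    }
}

struct Registry {
    factories: HashMap<String, Box<dyn Factory>>,
}

impl Registry {
    fn new() -> Self {
        let mut factories: HashMap<String, Box<dyn Factory>> = HashMap::new();
        // Built-in factories are registered up front, keyed by target type.
        factories.insert("webhook".to_string(), Box::new(WebhookFactory));
        Self { factories }
    }

    fn create_target(&self, target_type: &str, id: &str) -> Option<String> {
        // Dispatch to the factory registered for this target type, if any.
        self.factories.get(target_type).map(|f| f.create(id))
    }
}

fn main() {
    let registry = Registry::new();
    assert_eq!(
        registry.create_target("webhook", "primary").as_deref(),
        Some("webhook target 'primary'")
    );
    assert!(registry.create_target("mqtt", "primary").is_none());
}
```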
-use crate::{AuditEntry, AuditError, AuditResult}; -use futures::{StreamExt, stream::FuturesUnordered}; +use crate::{ + AuditEntry, AuditError, AuditResult, + factory::{MQTTTargetFactory, TargetFactory, WebhookTargetFactory}, +}; +use futures::StreamExt; +use futures::stream::FuturesUnordered; use hashbrown::{HashMap, HashSet}; -use rustfs_config::{ - DEFAULT_DELIMITER, ENABLE_KEY, ENV_PREFIX, MQTT_BROKER, MQTT_KEEP_ALIVE_INTERVAL, MQTT_PASSWORD, MQTT_QOS, MQTT_QUEUE_DIR, - MQTT_QUEUE_LIMIT, MQTT_RECONNECT_INTERVAL, MQTT_TOPIC, MQTT_USERNAME, WEBHOOK_AUTH_TOKEN, WEBHOOK_BATCH_SIZE, - WEBHOOK_CLIENT_CERT, WEBHOOK_CLIENT_KEY, WEBHOOK_ENDPOINT, WEBHOOK_HTTP_TIMEOUT, WEBHOOK_MAX_RETRY, WEBHOOK_QUEUE_DIR, - WEBHOOK_QUEUE_LIMIT, WEBHOOK_RETRY_INTERVAL, audit::AUDIT_ROUTE_PREFIX, -}; +use rustfs_config::{DEFAULT_DELIMITER, ENABLE_KEY, ENV_PREFIX, EnableState, audit::AUDIT_ROUTE_PREFIX}; use rustfs_ecstore::config::{Config, KVS}; -use rustfs_targets::{ - Target, TargetError, - target::{ChannelTargetType, TargetType, mqtt::MQTTArgs, webhook::WebhookArgs}, -}; +use rustfs_targets::{Target, TargetError, target::ChannelTargetType}; +use std::str::FromStr; use std::sync::Arc; -use std::time::Duration; use tracing::{debug, error, info, warn}; -use url::Url; /// Registry for managing audit targets pub struct AuditRegistry { /// Storage for created targets targets: HashMap + Send + Sync>>, + /// Factories for creating targets + factories: HashMap>, } impl Default for AuditRegistry { @@ -46,162 +43,207 @@ impl Default for AuditRegistry { impl AuditRegistry { /// Creates a new AuditRegistry pub fn new() -> Self { - Self { targets: HashMap::new() } + let mut registry = AuditRegistry { + factories: HashMap::new(), + targets: HashMap::new(), + }; + + // Register built-in factories + registry.register(ChannelTargetType::Webhook.as_str(), Box::new(WebhookTargetFactory)); + registry.register(ChannelTargetType::Mqtt.as_str(), Box::new(MQTTTargetFactory)); + + registry } - /// Creates all audit targets from system configuration and environment variables. + /// Registers a new factory for a target type + /// + /// # Arguments + /// * `target_type` - The type of the target (e.g., "webhook", "mqtt"). + /// * `factory` - The factory instance to create targets of this type. + pub fn register(&mut self, target_type: &str, factory: Box) { + self.factories.insert(target_type.to_string(), factory); + } + + /// Creates a target of the specified type with the given ID and configuration + /// + /// # Arguments + /// * `target_type` - The type of the target (e.g., "webhook", "mqtt"). + /// * `id` - The identifier for the target instance. + /// * `config` - The configuration key-value store for the target. + /// + /// # Returns + /// * `Result + Send + Sync>, TargetError>` - The created target or an error. + pub async fn create_target( + &self, + target_type: &str, + id: String, + config: &KVS, + ) -> Result + Send + Sync>, TargetError> { + let factory = self + .factories + .get(target_type) + .ok_or_else(|| TargetError::Configuration(format!("Unknown target type: {target_type}")))?; + + // Validate configuration before creating target + factory.validate_config(&id, config)?; + + // Create target + factory.create_target(id, config).await + } + + /// Creates all targets from a configuration + /// Create all notification targets from system configuration and environment variables. /// This method processes the creation of each target concurrently as follows: - /// 1. Iterate through supported target types (webhook, mqtt). - /// 2. 
For each type, resolve its configuration from file and environment variables. + /// 1. Iterate through all registered target types (e.g. webhooks, mqtt). + /// 2. For each type, resolve its configuration in the configuration file and environment variables. /// 3. Identify all target instance IDs that need to be created. - /// 4. Merge configurations with precedence: ENV > file instance > file default. - /// 5. Create async tasks for enabled instances. - /// 6. Execute tasks concurrently and collect successful targets. - /// 7. Persist successful configurations back to system storage. - pub async fn create_targets_from_config( - &mut self, + /// 4. Combine the default configuration, file configuration, and environment variable configuration for each instance. + /// 5. If the instance is enabled, create an asynchronous task for it to instantiate. + /// 6. Concurrency executes all creation tasks and collects results. + pub async fn create_audit_targets_from_config( + &self, config: &Config, ) -> AuditResult + Send + Sync>>> { // Collect only environment variables with the relevant prefix to reduce memory usage let all_env: Vec<(String, String)> = std::env::vars().filter(|(key, _)| key.starts_with(ENV_PREFIX)).collect(); - // A collection of asynchronous tasks for concurrently executing target creation let mut tasks = FuturesUnordered::new(); - // let final_config = config.clone(); - + // let final_config = config.clone(); // Clone a configuration for aggregating the final result // Record the defaults for each segment so that the segment can eventually be rebuilt let mut section_defaults: HashMap = HashMap::new(); - - // Supported target types for audit - let target_types = vec![ChannelTargetType::Webhook.as_str(), ChannelTargetType::Mqtt.as_str()]; - - // 1. Traverse all target types and process them - for target_type in target_types { - let span = tracing::Span::current(); - span.record("target_type", target_type); - info!(target_type = %target_type, "Starting audit target type processing"); + // 1. Traverse all registered plants and process them by target type + for (target_type, factory) in &self.factories { + tracing::Span::current().record("target_type", target_type.as_str()); + info!("Start working on target types..."); // 2. Prepare the configuration source + // 2.1. Get the configuration segment in the file, e.g. 'audit_webhook' let section_name = format!("{AUDIT_ROUTE_PREFIX}{target_type}").to_lowercase(); let file_configs = config.0.get(§ion_name).cloned().unwrap_or_default(); + // 2.2. Get the default configuration for that type let default_cfg = file_configs.get(DEFAULT_DELIMITER).cloned().unwrap_or_default(); - debug!(?default_cfg, "Retrieved default configuration"); + debug!(?default_cfg, "Get the default configuration"); // Save defaults for eventual write back section_defaults.insert(section_name.clone(), default_cfg.clone()); - // Get valid fields for the target type - let valid_fields = match target_type { - "webhook" => get_webhook_valid_fields(), - "mqtt" => get_mqtt_valid_fields(), - _ => { - warn!(target_type = %target_type, "Unknown target type, skipping"); - continue; - } - }; - debug!(?valid_fields, "Retrieved valid configuration fields"); + // *** Optimization point 1: Get all legitimate fields of the current target type *** + let valid_fields = factory.get_valid_fields(); + debug!(?valid_fields, "Get the legitimate configuration fields"); // 3. 
Resolve instance IDs and configuration overrides from environment variables let mut instance_ids_from_env = HashSet::new(); - let mut env_overrides: HashMap> = HashMap::new(); - - for (env_key, env_value) in &all_env { - let audit_prefix = format!("{ENV_PREFIX}{AUDIT_ROUTE_PREFIX}{target_type}").to_uppercase(); - if !env_key.starts_with(&audit_prefix) { - continue; - } - - let suffix = &env_key[audit_prefix.len()..]; - if suffix.is_empty() { - continue; - } - - // Parse field and instance from suffix (FIELD_INSTANCE or FIELD) - let (field_name, instance_id) = if let Some(last_underscore) = suffix.rfind('_') { - let potential_field = &suffix[1..last_underscore]; // Skip leading _ - let potential_instance = &suffix[last_underscore + 1..]; - - // Check if the part before the last underscore is a valid field - if valid_fields.contains(&potential_field.to_lowercase()) { - (potential_field.to_lowercase(), potential_instance.to_lowercase()) - } else { - // Treat the entire suffix as field name with default instance - (suffix[1..].to_lowercase(), DEFAULT_DELIMITER.to_string()) + // 3.1. Instance discovery: Based on the '..._ENABLE_INSTANCEID' format + let enable_prefix = + format!("{ENV_PREFIX}{AUDIT_ROUTE_PREFIX}{target_type}{DEFAULT_DELIMITER}{ENABLE_KEY}{DEFAULT_DELIMITER}") + .to_uppercase(); + for (key, value) in &all_env { + if EnableState::from_str(value).ok().map(|s| s.is_enabled()).unwrap_or(false) { + if let Some(id) = key.strip_prefix(&enable_prefix) { + if !id.is_empty() { + instance_ids_from_env.insert(id.to_lowercase()); + } } - } else { - // No underscore, treat as field with default instance - (suffix[1..].to_lowercase(), DEFAULT_DELIMITER.to_string()) - }; - - if valid_fields.contains(&field_name) { - if instance_id != DEFAULT_DELIMITER { - instance_ids_from_env.insert(instance_id.clone()); - } - env_overrides - .entry(instance_id) - .or_default() - .insert(field_name, env_value.clone()); - } else { - debug!( - env_key = %env_key, - field_name = %field_name, - "Ignoring environment variable field not found in valid fields for target type {}", - target_type - ); } } - debug!(?env_overrides, "Completed environment variable analysis"); + + // 3.2. Parse all relevant environment variable configurations + // 3.2.1. Build environment variable prefixes such as 'RUSTFS_AUDIT_WEBHOOK_' + let env_prefix = format!("{ENV_PREFIX}{AUDIT_ROUTE_PREFIX}{target_type}{DEFAULT_DELIMITER}").to_uppercase(); + // 3.2.2. 
'env_overrides' is used to store configurations parsed from environment variables in the format: {instance id -> {field -> value}} + let mut env_overrides: HashMap> = HashMap::new(); + for (key, value) in &all_env { + if let Some(rest) = key.strip_prefix(&env_prefix) { + // Use rsplitn to split from the right side to properly extract the INSTANCE_ID at the end + // Format: _ or + let mut parts = rest.rsplitn(2, DEFAULT_DELIMITER); + + // The first part from the right is INSTANCE_ID + let instance_id_part = parts.next().unwrap_or(DEFAULT_DELIMITER); + // The remaining part is FIELD_NAME + let field_name_part = parts.next(); + + let (field_name, instance_id) = match field_name_part { + // Case 1: The format is _ + // e.g., rest = "ENDPOINT_PRIMARY" -> field_name="ENDPOINT", instance_id="PRIMARY" + Some(field) => (field.to_lowercase(), instance_id_part.to_lowercase()), + // Case 2: The format is (without INSTANCE_ID) + // e.g., rest = "ENABLE" -> field_name="ENABLE", instance_id="" (Universal configuration `_ DEFAULT_DELIMITER`) + None => (instance_id_part.to_lowercase(), DEFAULT_DELIMITER.to_string()), + }; + + // *** Optimization point 2: Verify whether the parsed field_name is legal *** + if !field_name.is_empty() && valid_fields.contains(&field_name) { + debug!( + instance_id = %if instance_id.is_empty() { DEFAULT_DELIMITER } else { &instance_id }, + %field_name, + %value, + "Parsing to environment variables" + ); + env_overrides + .entry(instance_id) + .or_default() + .insert(field_name, value.clone()); + } else { + // Ignore illegal field names + warn!( + field_name = %field_name, + "Ignore environment variable fields, not found in the list of valid fields for target type {}", + target_type + ); + } + } + } + debug!(?env_overrides, "Complete the environment variable analysis"); // 4. Determine all instance IDs that need to be processed let mut all_instance_ids: HashSet = file_configs.keys().filter(|k| *k != DEFAULT_DELIMITER).cloned().collect(); all_instance_ids.extend(instance_ids_from_env); - debug!(?all_instance_ids, "Determined all instance IDs"); + debug!(?all_instance_ids, "Determine all instance IDs"); // 5. Merge configurations and create tasks for each instance for id in all_instance_ids { - // 5.1. Merge configuration, priority: Environment variables > File instance > File default + // 5.1. Merge configuration, priority: Environment variables > File instance configuration > File default configuration let mut merged_config = default_cfg.clone(); - - // Apply file instance configuration if available + // Instance-specific configuration in application files if let Some(file_instance_cfg) = file_configs.get(&id) { merged_config.extend(file_instance_cfg.clone()); } - - // Apply environment variable overrides + // Application instance-specific environment variable configuration if let Some(env_instance_cfg) = env_overrides.get(&id) { + // Convert HashMap to KVS let mut kvs_from_env = KVS::new(); for (k, v) in env_instance_cfg { kvs_from_env.insert(k.clone(), v.clone()); } merged_config.extend(kvs_from_env); } - debug!(instance_id = %id, ?merged_config, "Completed configuration merge"); + debug!(instance_id = %id, ?merged_config, "Complete configuration merge"); // 5.2. 
Check if the instance is enabled let enabled = merged_config .lookup(ENABLE_KEY) - .map(|v| parse_enable_value(&v)) + .map(|v| { + EnableState::from_str(v.as_str()) + .ok() + .map(|s| s.is_enabled()) + .unwrap_or(false) + }) .unwrap_or(false); if enabled { - info!(instance_id = %id, "Creating audit target"); - - // Create task for concurrent execution - let target_type_clone = target_type.to_string(); - let id_clone = id.clone(); - let merged_config_arc = Arc::new(merged_config.clone()); - let task = tokio::spawn(async move { - let result = create_audit_target(&target_type_clone, &id_clone, &merged_config_arc).await; - (target_type_clone, id_clone, result, merged_config_arc) + info!(instance_id = %id, "Target is enabled, ready to create a task"); + // 5.3. Create asynchronous tasks for enabled instances + let target_type_clone = target_type.clone(); + let tid = id.clone(); + let merged_config_arc = Arc::new(merged_config); + tasks.push(async move { + let result = factory.create_target(tid.clone(), &merged_config_arc).await; + (target_type_clone, tid, result, Arc::clone(&merged_config_arc)) }); - - tasks.push(task); - - // Update final config with successful instance - // final_config.0.entry(section_name.clone()).or_default().insert(id, merged_config); } else { - info!(instance_id = %id, "Skipping disabled audit target, will be removed from final configuration"); + info!(instance_id = %id, "Skip the disabled target and will be removed from the final configuration"); // Remove disabled target from final configuration // final_config.0.entry(section_name.clone()).or_default().remove(&id); } @@ -211,30 +253,28 @@ impl AuditRegistry { // 6. Concurrently execute all creation tasks and collect results let mut successful_targets = Vec::new(); let mut successful_configs = Vec::new(); - while let Some(task_result) = tasks.next().await { - match task_result { - Ok((target_type, id, result, kvs_arc)) => match result { - Ok(target) => { - info!(target_type = %target_type, instance_id = %id, "Created audit target successfully"); - successful_targets.push(target); - successful_configs.push((target_type, id, kvs_arc)); - } - Err(e) => { - error!(target_type = %target_type, instance_id = %id, error = %e, "Failed to create audit target"); - } - }, + while let Some((target_type, id, result, final_config)) = tasks.next().await { + match result { + Ok(target) => { + info!(target_type = %target_type, instance_id = %id, "Create a target successfully"); + successful_targets.push(target); + successful_configs.push((target_type, id, final_config)); + } Err(e) => { - error!(error = %e, "Task execution failed"); + error!(target_type = %target_type, instance_id = %id, error = %e, "Failed to create a target"); } } } - // Rebuild in pieces based on "default items + successful instances" and overwrite writeback to ensure that deleted/disabled instances will not be "resurrected" + // 7. 
Aggregate new configuration and write back to system configuration if !successful_configs.is_empty() || !section_defaults.is_empty() { - info!("Prepare to rebuild and save target configurations to the system configuration..."); + info!( + "Prepare to update {} successfully created target configurations to the system configuration...", + successful_configs.len() + ); - // Aggregate successful instances into segments let mut successes_by_section: HashMap> = HashMap::new(); + for (target_type, id, kvs) in successful_configs { let section_name = format!("{AUDIT_ROUTE_PREFIX}{target_type}").to_lowercase(); successes_by_section @@ -244,76 +284,99 @@ impl AuditRegistry { } let mut new_config = config.clone(); - // Collection of segments that need to be processed: Collect all segments where default items exist or where successful instances exist let mut sections: HashSet = HashSet::new(); sections.extend(section_defaults.keys().cloned()); sections.extend(successes_by_section.keys().cloned()); - for section_name in sections { + for section in sections { let mut section_map: std::collections::HashMap = std::collections::HashMap::new(); - - // The default entry (if present) is written back to `_` - if let Some(default_cfg) = section_defaults.get(§ion_name) { - if !default_cfg.is_empty() { - section_map.insert(DEFAULT_DELIMITER.to_string(), default_cfg.clone()); + // Add default item + if let Some(default_kvs) = section_defaults.get(§ion) { + if !default_kvs.is_empty() { + section_map.insert(DEFAULT_DELIMITER.to_string(), default_kvs.clone()); } } - // Successful instance write back - if let Some(instances) = successes_by_section.get(§ion_name) { + // Add successful instance item + if let Some(instances) = successes_by_section.get(§ion) { for (id, kvs) in instances { section_map.insert(id.clone(), kvs.clone()); } } - // Empty segments are removed and non-empty segments are replaced as a whole. + // Empty breaks are removed and non-empty breaks are replaced entirely. if section_map.is_empty() { - new_config.0.remove(§ion_name); + new_config.0.remove(§ion); } else { - new_config.0.insert(section_name, section_map); + new_config.0.insert(section, section_map); } } - // 7. Save the new configuration to the system - let Some(store) = rustfs_ecstore::new_object_layer_fn() else { + let Some(store) = rustfs_ecstore::global::new_object_layer_fn() else { return Err(AuditError::StorageNotAvailable( "Failed to save target configuration: server storage not initialized".to_string(), )); }; match rustfs_ecstore::config::com::save_server_config(store, &new_config).await { - Ok(_) => info!("New audit configuration saved to system successfully"), + Ok(_) => { + info!("The new configuration was saved to the system successfully.") + } Err(e) => { - error!(error = %e, "Failed to save new audit configuration"); + error!("Failed to save the new configuration: {}", e); return Err(AuditError::SaveConfig(Box::new(e))); } } } + + info!(count = successful_targets.len(), "All target processing completed"); Ok(successful_targets) } /// Adds a target to the registry + /// + /// # Arguments + /// * `id` - The identifier for the target. + /// * `target` - The target instance to be added. pub fn add_target(&mut self, id: String, target: Box + Send + Sync>) { self.targets.insert(id, target); } /// Removes a target from the registry + /// + /// # Arguments + /// * `id` - The identifier for the target to be removed. + /// + /// # Returns + /// * `Option + Send + Sync>>` - The removed target if it existed. 
pub fn remove_target(&mut self, id: &str) -> Option + Send + Sync>> { self.targets.remove(id) } /// Gets a target from the registry + /// + /// # Arguments + /// * `id` - The identifier for the target to be retrieved. + /// + /// # Returns + /// * `Option<&(dyn Target + Send + Sync)>` - The target if it exists. pub fn get_target(&self, id: &str) -> Option<&(dyn Target + Send + Sync)> { self.targets.get(id).map(|t| t.as_ref()) } /// Lists all target IDs + /// + /// # Returns + /// * `Vec` - A vector of all target IDs in the registry. pub fn list_targets(&self) -> Vec { self.targets.keys().cloned().collect() } /// Closes all targets and clears the registry + /// + /// # Returns + /// * `AuditResult<()>` - Result indicating success or failure. pub async fn close_all(&mut self) -> AuditResult<()> { let mut errors = Vec::new(); @@ -331,152 +394,3 @@ impl AuditRegistry { Ok(()) } } - -/// Creates an audit target based on type and configuration -async fn create_audit_target( - target_type: &str, - id: &str, - config: &KVS, -) -> Result + Send + Sync>, TargetError> { - match target_type { - val if val == ChannelTargetType::Webhook.as_str() => { - let args = parse_webhook_args(id, config)?; - let target = rustfs_targets::target::webhook::WebhookTarget::new(id.to_string(), args)?; - Ok(Box::new(target)) - } - val if val == ChannelTargetType::Mqtt.as_str() => { - let args = parse_mqtt_args(id, config)?; - let target = rustfs_targets::target::mqtt::MQTTTarget::new(id.to_string(), args)?; - Ok(Box::new(target)) - } - _ => Err(TargetError::Configuration(format!("Unknown target type: {target_type}"))), - } -} - -/// Gets valid field names for webhook configuration -fn get_webhook_valid_fields() -> HashSet { - vec![ - ENABLE_KEY.to_string(), - WEBHOOK_ENDPOINT.to_string(), - WEBHOOK_AUTH_TOKEN.to_string(), - WEBHOOK_CLIENT_CERT.to_string(), - WEBHOOK_CLIENT_KEY.to_string(), - WEBHOOK_BATCH_SIZE.to_string(), - WEBHOOK_QUEUE_LIMIT.to_string(), - WEBHOOK_QUEUE_DIR.to_string(), - WEBHOOK_MAX_RETRY.to_string(), - WEBHOOK_RETRY_INTERVAL.to_string(), - WEBHOOK_HTTP_TIMEOUT.to_string(), - ] - .into_iter() - .collect() -} - -/// Gets valid field names for MQTT configuration -fn get_mqtt_valid_fields() -> HashSet { - vec![ - ENABLE_KEY.to_string(), - MQTT_BROKER.to_string(), - MQTT_TOPIC.to_string(), - MQTT_USERNAME.to_string(), - MQTT_PASSWORD.to_string(), - MQTT_QOS.to_string(), - MQTT_KEEP_ALIVE_INTERVAL.to_string(), - MQTT_RECONNECT_INTERVAL.to_string(), - MQTT_QUEUE_DIR.to_string(), - MQTT_QUEUE_LIMIT.to_string(), - ] - .into_iter() - .collect() -} - -/// Parses webhook arguments from KVS configuration -fn parse_webhook_args(_id: &str, config: &KVS) -> Result { - let endpoint = config - .lookup(WEBHOOK_ENDPOINT) - .filter(|s| !s.is_empty()) - .ok_or_else(|| TargetError::Configuration("webhook endpoint is required".to_string()))?; - - let endpoint_url = - Url::parse(&endpoint).map_err(|e| TargetError::Configuration(format!("invalid webhook endpoint URL: {e}")))?; - - let args = WebhookArgs { - enable: true, // Already validated as enabled - endpoint: endpoint_url, - auth_token: config.lookup(WEBHOOK_AUTH_TOKEN).unwrap_or_default(), - queue_dir: config.lookup(WEBHOOK_QUEUE_DIR).unwrap_or_default(), - queue_limit: config - .lookup(WEBHOOK_QUEUE_LIMIT) - .and_then(|s| s.parse().ok()) - .unwrap_or(100000), - client_cert: config.lookup(WEBHOOK_CLIENT_CERT).unwrap_or_default(), - client_key: config.lookup(WEBHOOK_CLIENT_KEY).unwrap_or_default(), - target_type: TargetType::AuditLog, - }; - - args.validate()?; - 
Ok(args) -} - -/// Parses MQTT arguments from KVS configuration -fn parse_mqtt_args(_id: &str, config: &KVS) -> Result { - let broker = config - .lookup(MQTT_BROKER) - .filter(|s| !s.is_empty()) - .ok_or_else(|| TargetError::Configuration("MQTT broker is required".to_string()))?; - - let broker_url = Url::parse(&broker).map_err(|e| TargetError::Configuration(format!("invalid MQTT broker URL: {e}")))?; - - let topic = config - .lookup(MQTT_TOPIC) - .filter(|s| !s.is_empty()) - .ok_or_else(|| TargetError::Configuration("MQTT topic is required".to_string()))?; - - let qos = config - .lookup(MQTT_QOS) - .and_then(|s| s.parse::().ok()) - .and_then(|q| match q { - 0 => Some(rumqttc::QoS::AtMostOnce), - 1 => Some(rumqttc::QoS::AtLeastOnce), - 2 => Some(rumqttc::QoS::ExactlyOnce), - _ => None, - }) - .unwrap_or(rumqttc::QoS::AtLeastOnce); - - let args = MQTTArgs { - enable: true, // Already validated as enabled - broker: broker_url, - topic, - qos, - username: config.lookup(MQTT_USERNAME).unwrap_or_default(), - password: config.lookup(MQTT_PASSWORD).unwrap_or_default(), - max_reconnect_interval: parse_duration(&config.lookup(MQTT_RECONNECT_INTERVAL).unwrap_or_else(|| "5s".to_string())) - .unwrap_or(Duration::from_secs(5)), - keep_alive: parse_duration(&config.lookup(MQTT_KEEP_ALIVE_INTERVAL).unwrap_or_else(|| "60s".to_string())) - .unwrap_or(Duration::from_secs(60)), - queue_dir: config.lookup(MQTT_QUEUE_DIR).unwrap_or_default(), - queue_limit: config.lookup(MQTT_QUEUE_LIMIT).and_then(|s| s.parse().ok()).unwrap_or(100000), - target_type: TargetType::AuditLog, - }; - - args.validate()?; - Ok(args) -} - -/// Parses enable value from string -fn parse_enable_value(value: &str) -> bool { - matches!(value.to_lowercase().as_str(), "1" | "on" | "true" | "yes") -} - -/// Parses duration from string (e.g., "3s", "5m") -fn parse_duration(s: &str) -> Option { - if let Some(stripped) = s.strip_suffix('s') { - stripped.parse::().ok().map(Duration::from_secs) - } else if let Some(stripped) = s.strip_suffix('m') { - stripped.parse::().ok().map(|m| Duration::from_secs(m * 60)) - } else if let Some(stripped) = s.strip_suffix("ms") { - stripped.parse::().ok().map(Duration::from_millis) - } else { - s.parse::().ok().map(Duration::from_secs) - } -} diff --git a/crates/audit/src/system.rs b/crates/audit/src/system.rs index cbfd2d51..ad80ffe9 100644 --- a/crates/audit/src/system.rs +++ b/crates/audit/src/system.rs @@ -58,6 +58,12 @@ impl AuditSystem { } /// Starts the audit system with the given configuration + /// + /// # Arguments + /// * `config` - The configuration to use for starting the audit system + /// + /// # Returns + /// * `AuditResult<()>` - Result indicating success or failure pub async fn start(&self, config: Config) -> AuditResult<()> { let state = self.state.write().await; @@ -87,7 +93,7 @@ impl AuditSystem { // Create targets from configuration let mut registry = self.registry.lock().await; - match registry.create_targets_from_config(&config).await { + match registry.create_audit_targets_from_config(&config).await { Ok(targets) => { if targets.is_empty() { info!("No enabled audit targets found, keeping audit system stopped"); @@ -143,6 +149,9 @@ impl AuditSystem { } /// Pauses the audit system + /// + /// # Returns + /// * `AuditResult<()>` - Result indicating success or failure pub async fn pause(&self) -> AuditResult<()> { let mut state = self.state.write().await; @@ -161,6 +170,9 @@ impl AuditSystem { } /// Resumes the audit system + /// + /// # Returns + /// * `AuditResult<()>` - Result indicating 
success or failure pub async fn resume(&self) -> AuditResult<()> { let mut state = self.state.write().await; @@ -179,6 +191,9 @@ impl AuditSystem { } /// Stops the audit system and closes all targets + /// + /// # Returns + /// * `AuditResult<()>` - Result indicating success or failure pub async fn close(&self) -> AuditResult<()> { let mut state = self.state.write().await; @@ -223,11 +238,20 @@ impl AuditSystem { } /// Checks if the audit system is running + /// + /// # Returns + /// * `bool` - True if running, false otherwise pub async fn is_running(&self) -> bool { matches!(*self.state.read().await, AuditSystemState::Running) } /// Dispatches an audit log entry to all active targets + /// + /// # Arguments + /// * `entry` - The audit log entry to dispatch + /// + /// # Returns + /// * `AuditResult<()>` - Result indicating success or failure pub async fn dispatch(&self, entry: Arc) -> AuditResult<()> { let start_time = std::time::Instant::now(); @@ -319,6 +343,13 @@ impl AuditSystem { Ok(()) } + /// Dispatches a batch of audit log entries to all active targets + /// + /// # Arguments + /// * `entries` - A vector of audit log entries to dispatch + /// + /// # Returns + /// * `AuditResult<()>` - Result indicating success or failure pub async fn dispatch_batch(&self, entries: Vec>) -> AuditResult<()> { let start_time = std::time::Instant::now(); @@ -386,7 +417,13 @@ impl AuditSystem { Ok(()) } - // New: Audit flow background tasks, based on send_from_store, including retries and exponential backoffs + /// Starts the audit stream processing for a target with batching and retry logic + /// # Arguments + /// * `store` - The store from which to read audit entries + /// * `target` - The target to which audit entries will be sent + /// + /// This function spawns a background task that continuously reads audit entries from the provided store + /// and attempts to send them to the specified target. 
It implements retry logic with exponential backoff fn start_audit_stream_with_batching( &self, store: Box, Error = StoreError, Key = Key> + Send>, @@ -462,6 +499,12 @@ impl AuditSystem { } /// Enables a specific target + /// + /// # Arguments + /// * `target_id` - The ID of the target to enable + /// + /// # Returns + /// * `AuditResult<()>` - Result indicating success or failure pub async fn enable_target(&self, target_id: &str) -> AuditResult<()> { // This would require storing enabled/disabled state per target // For now, just check if target exists @@ -475,6 +518,12 @@ impl AuditSystem { } /// Disables a specific target + /// + /// # Arguments + /// * `target_id` - The ID of the target to disable + /// + /// # Returns + /// * `AuditResult<()>` - Result indicating success or failure pub async fn disable_target(&self, target_id: &str) -> AuditResult<()> { // This would require storing enabled/disabled state per target // For now, just check if target exists @@ -488,6 +537,12 @@ impl AuditSystem { } /// Removes a target from the system + /// + /// # Arguments + /// * `target_id` - The ID of the target to remove + /// + /// # Returns + /// * `AuditResult<()>` - Result indicating success or failure pub async fn remove_target(&self, target_id: &str) -> AuditResult<()> { let mut registry = self.registry.lock().await; if let Some(target) = registry.remove_target(target_id) { @@ -502,6 +557,13 @@ impl AuditSystem { } /// Updates or inserts a target + /// + /// # Arguments + /// * `target_id` - The ID of the target to upsert + /// * `target` - The target instance to insert or update + /// + /// # Returns + /// * `AuditResult<()>` - Result indicating success or failure pub async fn upsert_target(&self, target_id: String, target: Box + Send + Sync>) -> AuditResult<()> { let mut registry = self.registry.lock().await; @@ -523,18 +585,33 @@ impl AuditSystem { } /// Lists all targets + /// + /// # Returns + /// * `Vec` - List of target IDs pub async fn list_targets(&self) -> Vec { let registry = self.registry.lock().await; registry.list_targets() } /// Gets information about a specific target + /// + /// # Arguments + /// * `target_id` - The ID of the target to retrieve + /// + /// # Returns + /// * `Option` - Target ID if found pub async fn get_target(&self, target_id: &str) -> Option { let registry = self.registry.lock().await; registry.get_target(target_id).map(|target| target.id().to_string()) } /// Reloads configuration and updates targets + /// + /// # Arguments + /// * `new_config` - The new configuration to load + /// + /// # Returns + /// * `AuditResult<()>` - Result indicating success or failure pub async fn reload_config(&self, new_config: Config) -> AuditResult<()> { info!("Reloading audit system configuration"); @@ -554,7 +631,7 @@ impl AuditSystem { } // Create new targets from updated configuration - match registry.create_targets_from_config(&new_config).await { + match registry.create_audit_targets_from_config(&new_config).await { Ok(targets) => { info!(target_count = targets.len(), "Reloaded audit targets successfully"); @@ -594,16 +671,22 @@ impl AuditSystem { } /// Gets current audit system metrics + /// + /// # Returns + /// * `AuditMetricsReport` - Current metrics report pub async fn get_metrics(&self) -> observability::AuditMetricsReport { observability::get_metrics_report().await } /// Validates system performance against requirements + /// + /// # Returns + /// * `PerformanceValidation` - Performance validation results pub async fn validate_performance(&self) -> 
observability::PerformanceValidation { observability::validate_performance().await } - /// Resets all metrics + /// Resets all metrics to initial state pub async fn reset_metrics(&self) { observability::reset_metrics().await; } diff --git a/crates/audit/tests/integration_test.rs b/crates/audit/tests/integration_test.rs index d889c84e..f2ef342e 100644 --- a/crates/audit/tests/integration_test.rs +++ b/crates/audit/tests/integration_test.rs @@ -43,11 +43,11 @@ async fn test_config_parsing_webhook() { audit_webhook_section.insert("_".to_string(), default_kvs); config.0.insert("audit_webhook".to_string(), audit_webhook_section); - let mut registry = AuditRegistry::new(); + let registry = AuditRegistry::new(); // This should not fail even if server storage is not initialized // as it's an integration test - let result = registry.create_targets_from_config(&config).await; + let result = registry.create_audit_targets_from_config(&config).await; // We expect this to fail due to server storage not being initialized // but the parsing should work correctly diff --git a/crates/audit/tests/performance_test.rs b/crates/audit/tests/performance_test.rs index 4080c47b..b96e92eb 100644 --- a/crates/audit/tests/performance_test.rs +++ b/crates/audit/tests/performance_test.rs @@ -44,7 +44,7 @@ async fn test_audit_system_startup_performance() { #[tokio::test] async fn test_concurrent_target_creation() { // Test that multiple targets can be created concurrently - let mut registry = AuditRegistry::new(); + let registry = AuditRegistry::new(); // Create config with multiple webhook instances let mut config = rustfs_ecstore::config::Config(std::collections::HashMap::new()); @@ -63,7 +63,7 @@ async fn test_concurrent_target_creation() { let start = Instant::now(); // This will fail due to server storage not being initialized, but we can measure timing - let result = registry.create_targets_from_config(&config).await; + let result = registry.create_audit_targets_from_config(&config).await; let elapsed = start.elapsed(); println!("Concurrent target creation took: {elapsed:?}"); diff --git a/crates/audit/tests/system_integration_test.rs b/crates/audit/tests/system_integration_test.rs index 267a9fc1..d60c6f18 100644 --- a/crates/audit/tests/system_integration_test.rs +++ b/crates/audit/tests/system_integration_test.rs @@ -135,7 +135,7 @@ async fn test_global_audit_functions() { #[tokio::test] async fn test_config_parsing_with_multiple_instances() { - let mut registry = AuditRegistry::new(); + let registry = AuditRegistry::new(); // Create config with multiple webhook instances let mut config = Config(HashMap::new()); @@ -164,7 +164,7 @@ async fn test_config_parsing_with_multiple_instances() { config.0.insert("audit_webhook".to_string(), webhook_section); // Try to create targets from config - let result = registry.create_targets_from_config(&config).await; + let result = registry.create_audit_targets_from_config(&config).await; // Should fail due to server storage not initialized, but parsing should work match result { diff --git a/crates/common/src/globals.rs b/crates/common/src/globals.rs index 141003a2..e0f6a38a 100644 --- a/crates/common/src/globals.rs +++ b/crates/common/src/globals.rs @@ -19,21 +19,26 @@ use std::sync::LazyLock; use tokio::sync::RwLock; use tonic::transport::Channel; -pub static GLOBAL_Local_Node_Name: LazyLock> = LazyLock::new(|| RwLock::new("".to_string())); -pub static GLOBAL_Rustfs_Host: LazyLock> = LazyLock::new(|| RwLock::new("".to_string())); -pub static GLOBAL_Rustfs_Port: LazyLock> = 
LazyLock::new(|| RwLock::new("9000".to_string())); -pub static GLOBAL_Rustfs_Addr: LazyLock> = LazyLock::new(|| RwLock::new("".to_string())); -pub static GLOBAL_Conn_Map: LazyLock>> = LazyLock::new(|| RwLock::new(HashMap::new())); +pub static GLOBAL_LOCAL_NODE_NAME: LazyLock> = LazyLock::new(|| RwLock::new("".to_string())); +pub static GLOBAL_RUSTFS_HOST: LazyLock> = LazyLock::new(|| RwLock::new("".to_string())); +pub static GLOBAL_RUSTFS_PORT: LazyLock> = LazyLock::new(|| RwLock::new("9000".to_string())); +pub static GLOBAL_RUSTFS_ADDR: LazyLock> = LazyLock::new(|| RwLock::new("".to_string())); +pub static GLOBAL_CONN_MAP: LazyLock>> = LazyLock::new(|| RwLock::new(HashMap::new())); +pub static GLOBAL_ROOT_CERT: LazyLock>>> = LazyLock::new(|| RwLock::new(None)); pub async fn set_global_addr(addr: &str) { - *GLOBAL_Rustfs_Addr.write().await = addr.to_string(); + *GLOBAL_RUSTFS_ADDR.write().await = addr.to_string(); +} + +pub async fn set_global_root_cert(cert: Vec) { + *GLOBAL_ROOT_CERT.write().await = Some(cert); } /// Evict a stale/dead connection from the global connection cache. /// This is critical for cluster recovery when a node dies unexpectedly (e.g., power-off). /// By removing the cached connection, subsequent requests will establish a fresh connection. pub async fn evict_connection(addr: &str) { - let removed = GLOBAL_Conn_Map.write().await.remove(addr); + let removed = GLOBAL_CONN_MAP.write().await.remove(addr); if removed.is_some() { tracing::warn!("Evicted stale connection from cache: {}", addr); } @@ -41,12 +46,12 @@ pub async fn evict_connection(addr: &str) { /// Check if a connection exists in the cache for the given address. pub async fn has_cached_connection(addr: &str) -> bool { - GLOBAL_Conn_Map.read().await.contains_key(addr) + GLOBAL_CONN_MAP.read().await.contains_key(addr) } /// Clear all cached connections. Useful for full cluster reset/recovery. 
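+///
+/// # Example
+///
+/// A minimal sketch of the intended recovery flow (the peer address below is hypothetical):
+///
+/// ```ignore
+/// // Drop the cached channel for a peer that died unexpectedly, confirm it is
+/// // gone, then wipe the whole cache on a full cluster reset.
+/// evict_connection("10.0.0.12:9000").await;
+/// assert!(!has_cached_connection("10.0.0.12:9000").await);
+/// clear_all_connections().await;
+/// ```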
pub async fn clear_all_connections() { - let mut map = GLOBAL_Conn_Map.write().await; + let mut map = GLOBAL_CONN_MAP.write().await; let count = map.len(); map.clear(); if count > 0 { diff --git a/crates/common/src/heal_channel.rs b/crates/common/src/heal_channel.rs index 67b7f46d..0d9ef98e 100644 --- a/crates/common/src/heal_channel.rs +++ b/crates/common/src/heal_channel.rs @@ -212,6 +212,8 @@ pub struct HealChannelRequest { pub bucket: String, /// Object prefix (optional) pub object_prefix: Option, + /// Object version ID (optional) + pub object_version_id: Option, /// Force start heal pub force_start: bool, /// Priority @@ -346,6 +348,7 @@ pub fn create_heal_request( id: Uuid::new_v4().to_string(), bucket, object_prefix, + object_version_id: None, force_start, priority: priority.unwrap_or_default(), pool_index: None, @@ -374,6 +377,7 @@ pub fn create_heal_request_with_options( id: Uuid::new_v4().to_string(), bucket, object_prefix, + object_version_id: None, force_start, priority: priority.unwrap_or_default(), pool_index, @@ -503,6 +507,7 @@ pub async fn send_heal_disk(set_disk_id: String, priority: Option Self { + Self::new() + } +} + +impl GlobalReadiness { + /// Create a new GlobalReadiness instance with initial status as Starting + /// # Returns + /// A new instance of GlobalReadiness + pub fn new() -> Self { + Self { + status: AtomicU8::new(SystemStage::Booting as u8), + } + } + + /// Update the system to a new stage + /// + /// # Arguments + /// * `step` - The SystemStage step to mark as ready + pub fn mark_stage(&self, step: SystemStage) { + self.status.fetch_max(step as u8, Ordering::SeqCst); + } + + /// Check if the service is fully ready + /// # Returns + /// `true` if the service is fully ready, `false` otherwise + pub fn is_ready(&self) -> bool { + self.status.load(Ordering::SeqCst) == SystemStage::FullReady as u8 + } +} + +#[cfg(test)] +mod tests { + use super::*; + use std::sync::Arc; + use std::thread; + + #[test] + fn test_initial_state() { + let readiness = GlobalReadiness::new(); + assert!(!readiness.is_ready()); + assert_eq!(readiness.status.load(Ordering::SeqCst), SystemStage::Booting as u8); + } + + #[test] + fn test_mark_stage_progression() { + let readiness = GlobalReadiness::new(); + readiness.mark_stage(SystemStage::StorageReady); + assert!(!readiness.is_ready()); + assert_eq!(readiness.status.load(Ordering::SeqCst), SystemStage::StorageReady as u8); + + readiness.mark_stage(SystemStage::IamReady); + assert!(!readiness.is_ready()); + assert_eq!(readiness.status.load(Ordering::SeqCst), SystemStage::IamReady as u8); + + readiness.mark_stage(SystemStage::FullReady); + assert!(readiness.is_ready()); + } + + #[test] + fn test_no_regression() { + let readiness = GlobalReadiness::new(); + readiness.mark_stage(SystemStage::FullReady); + readiness.mark_stage(SystemStage::IamReady); // Should not regress + assert!(readiness.is_ready()); + } + + #[test] + fn test_concurrent_marking() { + let readiness = Arc::new(GlobalReadiness::new()); + let mut handles = vec![]; + + for _ in 0..10 { + let r = Arc::clone(&readiness); + handles.push(thread::spawn(move || { + r.mark_stage(SystemStage::StorageReady); + r.mark_stage(SystemStage::IamReady); + r.mark_stage(SystemStage::FullReady); + })); + } + + for h in handles { + h.join().unwrap(); + } + + assert!(readiness.is_ready()); + } + + #[test] + fn test_is_ready_only_at_full_ready() { + let readiness = GlobalReadiness::new(); + assert!(!readiness.is_ready()); + + readiness.mark_stage(SystemStage::StorageReady); + 
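+        // StorageReady is an intermediate stage, so the system must not report ready yet.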
assert!(!readiness.is_ready()); + + readiness.mark_stage(SystemStage::IamReady); + assert!(!readiness.is_ready()); + + readiness.mark_stage(SystemStage::FullReady); + assert!(readiness.is_ready()); + } +} diff --git a/crates/config/src/audit/mod.rs b/crates/config/src/audit/mod.rs index 92a57212..793845ff 100644 --- a/crates/config/src/audit/mod.rs +++ b/crates/config/src/audit/mod.rs @@ -29,7 +29,7 @@ pub const AUDIT_PREFIX: &str = "audit"; pub const AUDIT_ROUTE_PREFIX: &str = const_str::concat!(AUDIT_PREFIX, DEFAULT_DELIMITER); pub const AUDIT_WEBHOOK_SUB_SYS: &str = "audit_webhook"; -pub const AUDIT_MQTT_SUB_SYS: &str = "mqtt_webhook"; +pub const AUDIT_MQTT_SUB_SYS: &str = "audit_mqtt"; pub const AUDIT_STORE_EXTENSION: &str = ".audit"; #[allow(dead_code)] diff --git a/crates/config/src/constants/app.rs b/crates/config/src/constants/app.rs index f62b6407..0610319e 100644 --- a/crates/config/src/constants/app.rs +++ b/crates/config/src/constants/app.rs @@ -89,6 +89,30 @@ pub const RUSTFS_TLS_KEY: &str = "rustfs_key.pem"; /// This is the default cert for TLS. pub const RUSTFS_TLS_CERT: &str = "rustfs_cert.pem"; +/// Default public certificate filename for rustfs +/// This is the default public certificate filename for rustfs. +/// It is used to store the public certificate of the application. +/// Default value: public.crt +pub const RUSTFS_PUBLIC_CERT: &str = "public.crt"; + +/// Default CA certificate filename for rustfs +/// This is the default CA certificate filename for rustfs. +/// It is used to store the CA certificate of the application. +/// Default value: ca.crt +pub const RUSTFS_CA_CERT: &str = "ca.crt"; + +/// Default HTTP prefix for rustfs +/// This is the default HTTP prefix for rustfs. +/// It is used to identify HTTP URLs. +/// Default value: http:// +pub const RUSTFS_HTTP_PREFIX: &str = "http://"; + +/// Default HTTPS prefix for rustfs +/// This is the default HTTPS prefix for rustfs. +/// It is used to identify HTTPS URLs. +/// Default value: https:// +pub const RUSTFS_HTTPS_PREFIX: &str = "https://"; + /// Default port for rustfs /// This is the default port for rustfs. /// This is used to bind the server to a specific port. diff --git a/crates/config/src/constants/body_limits.rs b/crates/config/src/constants/body_limits.rs new file mode 100644 index 00000000..4a806045 --- /dev/null +++ b/crates/config/src/constants/body_limits.rs @@ -0,0 +1,56 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +//! Request body size limits for admin API endpoints +//! +//! These limits prevent DoS attacks through unbounded memory allocation +//! while allowing legitimate use cases. + +/// Maximum size for standard admin API request bodies (1 MB) +/// Used for: user creation/update, policies, tier config, KMS config, events, groups, service accounts +/// Rationale: Admin API payloads are typically JSON/XML configs under 100KB. +/// AWS IAM policy limit is 6KB-10KB. 1MB provides generous headroom. 
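+///
+/// # Example
+///
+/// A minimal sketch of how a caller might enforce this cap before buffering a
+/// request body (the guard function below is illustrative, not the actual handler code):
+///
+/// ```ignore
+/// fn check_admin_body_len(len: usize) -> Result<(), &'static str> {
+///     if len > MAX_ADMIN_REQUEST_BODY_SIZE {
+///         return Err("admin request body exceeds the 1 MB limit");
+///     }
+///     Ok(())
+/// }
+///
+/// assert!(check_admin_body_len(64 * 1024).is_ok());
+/// assert!(check_admin_body_len(2 * 1024 * 1024).is_err());
+/// ```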
+pub const MAX_ADMIN_REQUEST_BODY_SIZE: usize = 1024 * 1024; // 1 MB
+
+/// Maximum size for IAM import/export operations (10 MB)
+/// Used for: IAM entity imports/exports containing multiple users, policies, groups
+/// Rationale: ZIP archives with hundreds of IAM entities. 10MB allows ~10,000 small configs.
+pub const MAX_IAM_IMPORT_SIZE: usize = 10 * 1024 * 1024; // 10 MB
+
+/// Maximum size for bucket metadata import operations (100 MB)
+/// Used for: Bucket metadata import containing configurations for many buckets
+/// Rationale: Large deployments may have thousands of buckets with various configs.
+/// 100MB allows importing metadata for ~10,000 buckets with reasonable configs.
+pub const MAX_BUCKET_METADATA_IMPORT_SIZE: usize = 100 * 1024 * 1024; // 100 MB
+
+/// Maximum size for healing operation requests (1 MB)
+/// Used for: Healing parameters and configuration
+/// Rationale: Healing requests contain bucket/object paths and options. Should be small.
+pub const MAX_HEAL_REQUEST_SIZE: usize = 1024 * 1024; // 1 MB
+
+/// Maximum size for S3 client response bodies (10 MB)
+/// Used for: Reading responses from remote S3-compatible services (ACL, attributes, lists)
+///
+/// Rationale: Responses from external S3-compatible services should be bounded.
+/// - ACL XML responses: typically < 10KB
+/// - Object attributes: typically < 100KB
+/// - List responses: typically < 1MB (1000 objects with metadata)
+/// - Location/error responses: typically < 10KB
+///
+/// 10MB provides generous headroom for legitimate responses while preventing
+/// memory exhaustion from malicious or misconfigured remote services; anything
+/// larger usually indicates misconfiguration or a potential attack.
+pub const MAX_S3_CLIENT_RESPONSE_SIZE: usize = 10 * 1024 * 1024; // 10 MB
diff --git a/crates/config/src/constants/compress.rs b/crates/config/src/constants/compress.rs
new file mode 100644
index 00000000..4af04571
--- /dev/null
+++ b/crates/config/src/constants/compress.rs
@@ -0,0 +1,61 @@
+// Copyright 2024 RustFS Team
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+//! HTTP Response Compression Configuration
+//!
+//! This module provides configuration options for HTTP response compression.
+//! By default, compression is disabled (aligned with MinIO behavior).
+//! When enabled via `RUSTFS_COMPRESS_ENABLE=on`, compression can be configured
+//! to apply only to specific file extensions, MIME types, and minimum file sizes.
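+//!
+//! # Example
+//!
+//! A rough sketch of how these settings could be read from the environment
+//! (illustrative only; the actual configuration loader may differ):
+//!
+//! ```ignore
+//! let enabled = std::env::var(ENV_COMPRESS_ENABLE)
+//!     .map(|v| matches!(v.to_ascii_lowercase().as_str(), "on" | "true" | "yes" | "1"))
+//!     .unwrap_or(DEFAULT_COMPRESS_ENABLE);
+//! let min_size: u64 = std::env::var(ENV_COMPRESS_MIN_SIZE)
+//!     .ok()
+//!     .and_then(|v| v.parse().ok())
+//!     .unwrap_or(DEFAULT_COMPRESS_MIN_SIZE);
+//! ```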
+ +/// Environment variable to enable/disable HTTP response compression +/// Default: off (disabled) +/// Values: on, off, true, false, yes, no, 1, 0 +/// Example: RUSTFS_COMPRESS_ENABLE=on +pub const ENV_COMPRESS_ENABLE: &str = "RUSTFS_COMPRESS_ENABLE"; + +/// Default compression enable state +/// Aligned with MinIO behavior - compression is disabled by default +pub const DEFAULT_COMPRESS_ENABLE: bool = false; + +/// Environment variable for file extensions that should be compressed +/// Comma-separated list of file extensions (with or without leading dot) +/// Default: "" (empty, meaning use MIME type matching only) +/// Example: RUSTFS_COMPRESS_EXTENSIONS=.txt,.log,.csv,.json,.xml,.html,.css,.js +pub const ENV_COMPRESS_EXTENSIONS: &str = "RUSTFS_COMPRESS_EXTENSIONS"; + +/// Default file extensions for compression +/// Empty by default - relies on MIME type matching +pub const DEFAULT_COMPRESS_EXTENSIONS: &str = ""; + +/// Environment variable for MIME types that should be compressed +/// Comma-separated list of MIME types, supports wildcard (*) for subtypes +/// Default: "text/*,application/json,application/xml,application/javascript" +/// Example: RUSTFS_COMPRESS_MIME_TYPES=text/*,application/json,application/xml +pub const ENV_COMPRESS_MIME_TYPES: &str = "RUSTFS_COMPRESS_MIME_TYPES"; + +/// Default MIME types for compression +/// Includes common text-based content types that benefit from compression +pub const DEFAULT_COMPRESS_MIME_TYPES: &str = "text/*,application/json,application/xml,application/javascript"; + +/// Environment variable for minimum file size to apply compression +/// Files smaller than this size will not be compressed +/// Default: 1000 (bytes) +/// Example: RUSTFS_COMPRESS_MIN_SIZE=1000 +pub const ENV_COMPRESS_MIN_SIZE: &str = "RUSTFS_COMPRESS_MIN_SIZE"; + +/// Default minimum file size for compression (in bytes) +/// Files smaller than 1000 bytes typically don't benefit from compression +/// and the compression overhead may outweigh the benefits +pub const DEFAULT_COMPRESS_MIN_SIZE: u64 = 1000; diff --git a/crates/config/src/constants/env.rs b/crates/config/src/constants/env.rs index e78c2b90..84116ba5 100644 --- a/crates/config/src/constants/env.rs +++ b/crates/config/src/constants/env.rs @@ -16,7 +16,8 @@ pub const DEFAULT_DELIMITER: &str = "_"; pub const ENV_PREFIX: &str = "RUSTFS_"; pub const ENV_WORD_DELIMITER: &str = "_"; -pub const DEFAULT_DIR: &str = "/opt/rustfs/events"; // Default directory for event store +pub const EVENT_DEFAULT_DIR: &str = "/opt/rustfs/events"; // Default directory for event store +pub const AUDIT_DEFAULT_DIR: &str = "/opt/rustfs/audit"; // Default directory for audit store pub const DEFAULT_LIMIT: u64 = 100000; // Default store limit /// Standard config keys and values. diff --git a/crates/config/src/constants/mod.rs b/crates/config/src/constants/mod.rs index 3c68f472..a03e03ad 100644 --- a/crates/config/src/constants/mod.rs +++ b/crates/config/src/constants/mod.rs @@ -13,11 +13,14 @@ // limitations under the License. 
pub(crate) mod app;
+pub(crate) mod body_limits;
+pub(crate) mod compress;
pub(crate) mod console;
pub(crate) mod env;
pub(crate) mod heal;
pub(crate) mod object;
pub(crate) mod profiler;
pub(crate) mod runtime;
+pub(crate) mod scanner;
pub(crate) mod targets;
pub(crate) mod tls;
diff --git a/crates/config/src/constants/runtime.rs b/crates/config/src/constants/runtime.rs
index b9fa5862..04afaf84 100644
--- a/crates/config/src/constants/runtime.rs
+++ b/crates/config/src/constants/runtime.rs
@@ -39,3 +39,10 @@ pub const DEFAULT_MAX_IO_EVENTS_PER_TICK: usize = 1024;
/// Event polling default (Tokio default 61)
pub const DEFAULT_EVENT_INTERVAL: u32 = 61;
pub const DEFAULT_RNG_SEED: Option = None; // None means random
+
+/// Threshold for small object seek support, in bytes.
+///
+/// When an object is smaller than this size, rustfs will provide seek support.
+///
+/// Default is set to 10MB.
+pub const DEFAULT_OBJECT_SEEK_SUPPORT_THRESHOLD: usize = 10 * 1024 * 1024;
diff --git a/crates/config/src/constants/scanner.rs b/crates/config/src/constants/scanner.rs
new file mode 100644
index 00000000..ff024150
--- /dev/null
+++ b/crates/config/src/constants/scanner.rs
@@ -0,0 +1,28 @@
+// Copyright 2024 RustFS Team
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+/// Environment variable name that specifies the data scanner start delay in seconds.
+/// - Purpose: Define the delay between data scanner operations.
+/// - Unit: seconds (u64).
+/// - Valid values: any positive integer.
+/// - Semantics: This delay controls how frequently the data scanner checks for and processes data; shorter delays lead to more responsive scanning but may increase system load.
+/// - Example: `export RUSTFS_DATA_SCANNER_START_DELAY_SECS=10`
+/// - Note: Choose an appropriate delay that balances scanning responsiveness with overall system performance.
+pub const ENV_DATA_SCANNER_START_DELAY_SECS: &str = "RUSTFS_DATA_SCANNER_START_DELAY_SECS";
+
+/// Default data scanner start delay in seconds if not specified in the environment variable.
+/// - Value: 60 seconds.
+/// - Rationale: This default interval provides a reasonable balance between scanning responsiveness and system load for most deployments.
+/// - Adjustments: Users may modify this value via the `RUSTFS_DATA_SCANNER_START_DELAY_SECS` environment variable based on their specific scanning requirements and system performance.
+pub const DEFAULT_DATA_SCANNER_START_DELAY_SECS: u64 = 60;
diff --git a/crates/config/src/constants/tls.rs b/crates/config/src/constants/tls.rs
index cfda42e2..6cbebcd4 100644
--- a/crates/config/src/constants/tls.rs
+++ b/crates/config/src/constants/tls.rs
@@ -12,4 +12,26 @@
// See the License for the specific language governing permissions and
// limitations under the License.
+/// TLS related environment variable names and default values
+/// Environment variable to enable TLS key logging
+/// When set to "1", RustFS will log TLS keys to the specified file for debugging purposes.
+/// By default, this is disabled.
+/// To enable, set the environment variable RUSTFS_TLS_KEYLOG=1
pub const ENV_TLS_KEYLOG: &str = "RUSTFS_TLS_KEYLOG";
+
+/// Default value for TLS key logging
+/// By default, RustFS does not log TLS keys.
+/// To change this behavior, set the environment variable RUSTFS_TLS_KEYLOG=1
+pub const DEFAULT_TLS_KEYLOG: bool = false;
+
+/// Environment variable to trust system CA certificates
+/// When set to "1", RustFS will trust system CA certificates in addition to any
+/// custom CA certificates provided in the configuration.
+/// By default, this is disabled.
+/// To enable, set the environment variable RUSTFS_TRUST_SYSTEM_CA=1
+pub const ENV_TRUST_SYSTEM_CA: &str = "RUSTFS_TRUST_SYSTEM_CA";
+
+/// Default value for trusting system CA certificates
+/// By default, RustFS does not trust system CA certificates.
+/// To change this behavior, set the environment variable RUSTFS_TRUST_SYSTEM_CA=1
+pub const DEFAULT_TRUST_SYSTEM_CA: bool = false;
diff --git a/crates/config/src/lib.rs b/crates/config/src/lib.rs
index 0202d6dd..d2278fef 100644
--- a/crates/config/src/lib.rs
+++ b/crates/config/src/lib.rs
@@ -17,6 +17,10 @@ pub mod constants;
#[cfg(feature = "constants")]
pub use constants::app::*;
#[cfg(feature = "constants")]
+pub use constants::body_limits::*;
+#[cfg(feature = "constants")]
+pub use constants::compress::*;
+#[cfg(feature = "constants")]
pub use constants::console::*;
#[cfg(feature = "constants")]
pub use constants::env::*;
@@ -29,6 +33,8 @@ pub use constants::profiler::*;
#[cfg(feature = "constants")]
pub use constants::runtime::*;
#[cfg(feature = "constants")]
+pub use constants::scanner::*;
+#[cfg(feature = "constants")]
pub use constants::targets::*;
#[cfg(feature = "constants")]
pub use constants::tls::*;
diff --git a/crates/config/src/notify/mod.rs b/crates/config/src/notify/mod.rs
index 91a78de4..59e6493f 100644
--- a/crates/config/src/notify/mod.rs
+++ b/crates/config/src/notify/mod.rs
@@ -24,13 +24,45 @@ pub use webhook::*;
use crate::DEFAULT_DELIMITER;
-// --- Configuration Constants ---
+/// Default target identifier for notifications.
+/// Used by the notification system when no specific target is provided;
+/// it names the default target stream or endpoint in that case.
pub const DEFAULT_TARGET: &str = "1";
-
+/// Notification prefix used for routing and identification.
+/// It is used when constructing notification-related routes and identifiers within the system.
pub const NOTIFY_PREFIX: &str = "notify";
+/// Notification route prefix, combining the notification prefix with the default delimiter.
+/// Used by the notification system for defining notification-related routes.
+/// Example: "notify_"
pub const NOTIFY_ROUTE_PREFIX: &str = const_str::concat!(NOTIFY_PREFIX, DEFAULT_DELIMITER);
+/// Name of the environment variable that configures target stream concurrency.
+/// Controls how many target streams are processed in parallel by the notification system.
+/// Defaults to [`DEFAULT_NOTIFY_TARGET_STREAM_CONCURRENCY`] if not set.
+/// Example: `RUSTFS_NOTIFY_TARGET_STREAM_CONCURRENCY=20`.
+pub const ENV_NOTIFY_TARGET_STREAM_CONCURRENCY: &str = "RUSTFS_NOTIFY_TARGET_STREAM_CONCURRENCY";
+
+/// Default concurrency for target stream processing in the notification system
+/// This value is used if the environment variable `RUSTFS_NOTIFY_TARGET_STREAM_CONCURRENCY` is not set.
+/// It defines how many target streams can be processed in parallel by the notification system at any given time. +/// Adjust this value based on your system's capabilities and expected load. +pub const DEFAULT_NOTIFY_TARGET_STREAM_CONCURRENCY: usize = 20; + +/// Name of the environment variable that configures send concurrency. +/// Controls how many send operations are processed in parallel by the notification system. +/// Defaults to [`DEFAULT_NOTIFY_SEND_CONCURRENCY`] if not set. +/// Example: `RUSTFS_NOTIFY_SEND_CONCURRENCY=64`. +pub const ENV_NOTIFY_SEND_CONCURRENCY: &str = "RUSTFS_NOTIFY_SEND_CONCURRENCY"; + +/// Default concurrency for send operations in the notification system +/// This value is used if the environment variable `RUSTFS_NOTIFY_SEND_CONCURRENCY` is not set. +/// It defines how many send operations can be processed in parallel by the notification system at any given time. +/// Adjust this value based on your system's capabilities and expected load. +pub const DEFAULT_NOTIFY_SEND_CONCURRENCY: usize = 64; + #[allow(dead_code)] pub const NOTIFY_SUB_SYSTEMS: &[&str] = &[NOTIFY_MQTT_SUB_SYS, NOTIFY_WEBHOOK_SUB_SYS]; diff --git a/crates/config/src/notify/store.rs b/crates/config/src/notify/store.rs index ed838b05..3dab3de2 100644 --- a/crates/config/src/notify/store.rs +++ b/crates/config/src/notify/store.rs @@ -15,5 +15,5 @@ pub const DEFAULT_EXT: &str = ".unknown"; // Default file extension pub const COMPRESS_EXT: &str = ".snappy"; // Extension for compressed files -/// STORE_EXTENSION - file extension of an event file in store -pub const STORE_EXTENSION: &str = ".event"; +/// NOTIFY_STORE_EXTENSION - file extension of an event file in store +pub const NOTIFY_STORE_EXTENSION: &str = ".event"; diff --git a/crates/crypto/Cargo.toml b/crates/crypto/Cargo.toml index f29fee53..b5e47cf5 100644 --- a/crates/crypto/Cargo.toml +++ b/crates/crypto/Cargo.toml @@ -30,7 +30,7 @@ workspace = true [dependencies] aes-gcm = { workspace = true, optional = true } -argon2 = { workspace = true, features = ["std"], optional = true } +argon2 = { workspace = true, optional = true } cfg-if = { workspace = true } chacha20poly1305 = { workspace = true, optional = true } jsonwebtoken = { workspace = true } diff --git a/crates/e2e_test/Cargo.toml b/crates/e2e_test/Cargo.toml index 07e2b239..e1fcbe8a 100644 --- a/crates/e2e_test/Cargo.toml +++ b/crates/e2e_test/Cargo.toml @@ -25,6 +25,7 @@ workspace = true [dependencies] rustfs-ecstore.workspace = true +rustfs-common.workspace = true flatbuffers.workspace = true futures.workspace = true rustfs-lock.workspace = true @@ -49,4 +50,4 @@ uuid = { workspace = true } base64 = { workspace = true } rand = { workspace = true } chrono = { workspace = true } -md5 = { workspace = true } \ No newline at end of file +md5 = { workspace = true } diff --git a/crates/e2e_test/src/common.rs b/crates/e2e_test/src/common.rs index a3cf1371..9fecad3c 100644 --- a/crates/e2e_test/src/common.rs +++ b/crates/e2e_test/src/common.rs @@ -327,7 +327,8 @@ pub async fn execute_awscurl( if !output.status.success() { let stderr = String::from_utf8_lossy(&output.stderr); - return Err(format!("awscurl failed: {stderr}").into()); + let stdout = String::from_utf8_lossy(&output.stdout); + return Err(format!("awscurl failed: stderr='{stderr}', stdout='{stdout}'").into()); } let response = String::from_utf8_lossy(&output.stdout).to_string(); @@ -352,3 +353,13 @@ pub async fn awscurl_get( ) -> Result> { execute_awscurl(url, "GET", None, access_key, secret_key).await } + +/// Helper function 
for PUT requests +pub async fn awscurl_put( + url: &str, + body: &str, + access_key: &str, + secret_key: &str, +) -> Result> { + execute_awscurl(url, "PUT", Some(body), access_key, secret_key).await +} diff --git a/crates/e2e_test/src/data_usage_test.rs b/crates/e2e_test/src/data_usage_test.rs new file mode 100644 index 00000000..1121b366 --- /dev/null +++ b/crates/e2e_test/src/data_usage_test.rs @@ -0,0 +1,73 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +use aws_sdk_s3::primitives::ByteStream; +use rustfs_common::data_usage::DataUsageInfo; +use serial_test::serial; + +use crate::common::{RustFSTestEnvironment, TEST_BUCKET, awscurl_get, init_logging}; + +/// Regression test for data usage accuracy (issue #1012). +/// Launches rustfs, writes 1000 objects, then asserts admin data usage reports the full count. +#[tokio::test(flavor = "multi_thread")] +#[serial] +#[ignore = "Starts a rustfs server and requires awscurl; enable when running full E2E"] +async fn data_usage_reports_all_objects() -> Result<(), Box> { + init_logging(); + + let mut env = RustFSTestEnvironment::new().await?; + env.start_rustfs_server(vec![]).await?; + + let client = env.create_s3_client(); + + // Create bucket and upload objects + client.create_bucket().bucket(TEST_BUCKET).send().await?; + + for i in 0..1000 { + let key = format!("obj-{i:04}"); + client + .put_object() + .bucket(TEST_BUCKET) + .key(key) + .body(ByteStream::from_static(b"hello-world")) + .send() + .await?; + } + + // Query admin data usage API + let url = format!("{}/rustfs/admin/v3/datausageinfo", env.url); + let resp = awscurl_get(&url, &env.access_key, &env.secret_key).await?; + let usage: DataUsageInfo = serde_json::from_str(&resp)?; + + // Assert total object count and per-bucket count are not truncated + let bucket_usage = usage + .buckets_usage + .get(TEST_BUCKET) + .cloned() + .expect("bucket usage should exist"); + + assert!( + usage.objects_total_count >= 1000, + "total object count should be at least 1000, got {}", + usage.objects_total_count + ); + assert!( + bucket_usage.objects_count >= 1000, + "bucket object count should be at least 1000, got {}", + bucket_usage.objects_count + ); + + env.stop_server(); + Ok(()) +} diff --git a/crates/e2e_test/src/lib.rs b/crates/e2e_test/src/lib.rs index ac6f3805..ac430785 100644 --- a/crates/e2e_test/src/lib.rs +++ b/crates/e2e_test/src/lib.rs @@ -18,6 +18,10 @@ mod reliant; #[cfg(test)] pub mod common; +// Data usage regression tests +#[cfg(test)] +mod data_usage_test; + // KMS-specific test modules #[cfg(test)] mod kms; @@ -29,3 +33,7 @@ mod special_chars_test; // Content-Encoding header preservation test #[cfg(test)] mod content_encoding_test; + +// Policy variables tests +#[cfg(test)] +mod policy; diff --git a/crates/e2e_test/src/policy/README.md b/crates/e2e_test/src/policy/README.md new file mode 100644 index 00000000..16d4a4dc --- /dev/null +++ b/crates/e2e_test/src/policy/README.md @@ -0,0 +1,39 @@ +# RustFS Policy Variables Tests + +This directory 
contains comprehensive end-to-end tests for AWS IAM policy variables in RustFS. + +## Test Overview + +The tests cover the following AWS policy variable scenarios: + +1. **Single-value variables** - Basic variable resolution like `${aws:username}` +2. **Multi-value variables** - Variables that can have multiple values +3. **Variable concatenation** - Combining variables with static text like `prefix-${aws:username}-suffix` +4. **Nested variables** - Complex nested variable patterns like `${${aws:username}-test}` +5. **Deny scenarios** - Testing deny policies with variables + +## Prerequisites + +- RustFS server binary +- `awscurl` utility for admin API calls +- AWS SDK for Rust (included in the project) + +## Running Tests + +### Run All Policy Tests Using Unified Test Runner + +```bash +# Run all policy tests with comprehensive reporting +# Note: Requires a RustFS server running on localhost:9000 +cargo test -p e2e_test policy::test_runner::test_policy_full_suite -- --nocapture --ignored --test-threads=1 + +# Run only critical policy tests +cargo test -p e2e_test policy::test_runner::test_policy_critical_suite -- --nocapture --ignored --test-threads=1 +``` + +### Run All Policy Tests + +```bash +# From the project root directory +cargo test -p e2e_test policy:: -- --nocapture --ignored --test-threads=1 +``` \ No newline at end of file diff --git a/crates/s3select-api/src/query/datasource/mod.rs b/crates/e2e_test/src/policy/mod.rs similarity index 70% rename from crates/s3select-api/src/query/datasource/mod.rs rename to crates/e2e_test/src/policy/mod.rs index 6238cfff..6efa597a 100644 --- a/crates/s3select-api/src/query/datasource/mod.rs +++ b/crates/e2e_test/src/policy/mod.rs @@ -11,3 +11,12 @@ // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. // See the License for the specific language governing permissions and // limitations under the License. + +//! Policy-specific tests for RustFS +//! +//! This module provides comprehensive tests for AWS IAM policy variables +//! including single-value, multi-value, and nested variable scenarios. + +mod policy_variables_test; +mod test_env; +mod test_runner; diff --git a/crates/e2e_test/src/policy/policy_variables_test.rs b/crates/e2e_test/src/policy/policy_variables_test.rs new file mode 100644 index 00000000..187f355c --- /dev/null +++ b/crates/e2e_test/src/policy/policy_variables_test.rs @@ -0,0 +1,798 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +//! 
Tests for AWS IAM policy variables with single-value, multi-value, and nested scenarios + +use crate::common::{awscurl_put, init_logging}; +use crate::policy::test_env::PolicyTestEnvironment; +use aws_sdk_s3::primitives::ByteStream; +use serial_test::serial; +use tracing::info; + +/// Helper function to create a regular user with given credentials +async fn create_user( + env: &PolicyTestEnvironment, + username: &str, + password: &str, +) -> Result<(), Box> { + let create_user_body = serde_json::json!({ + "secretKey": password, + "status": "enabled" + }) + .to_string(); + + let create_user_url = format!("{}/rustfs/admin/v3/add-user?accessKey={}", env.url, username); + awscurl_put(&create_user_url, &create_user_body, &env.access_key, &env.secret_key).await?; + Ok(()) +} + +/// Helper function to create an STS user with given credentials +async fn create_sts_user( + env: &PolicyTestEnvironment, + username: &str, + password: &str, +) -> Result<(), Box> { + // For STS, we create a regular user first, then use it to assume roles + create_user(env, username, password).await?; + Ok(()) +} + +/// Helper function to create and attach a policy +async fn create_and_attach_policy( + env: &PolicyTestEnvironment, + policy_name: &str, + username: &str, + policy_document: serde_json::Value, +) -> Result<(), Box> { + let policy_string = policy_document.to_string(); + + // Create policy + let add_policy_url = format!("{}/rustfs/admin/v3/add-canned-policy?name={}", env.url, policy_name); + awscurl_put(&add_policy_url, &policy_string, &env.access_key, &env.secret_key).await?; + + // Attach policy to user + let attach_policy_url = format!( + "{}/rustfs/admin/v3/set-user-or-group-policy?policyName={}&userOrGroup={}&isGroup=false", + env.url, policy_name, username + ); + awscurl_put(&attach_policy_url, "", &env.access_key, &env.secret_key).await?; + Ok(()) +} + +/// Helper function to clean up test resources +async fn cleanup_user_and_policy(env: &PolicyTestEnvironment, username: &str, policy_name: &str) { + // Create admin client for cleanup + let admin_client = env.create_s3_client(&env.access_key, &env.secret_key); + + // Delete buckets that might have been created by this user + let bucket_patterns = [ + format!("{username}-test-bucket"), + format!("{username}-bucket1"), + format!("{username}-bucket2"), + format!("{username}-bucket3"), + format!("prefix-{username}-suffix"), + format!("{username}-test"), + format!("{username}-sts-bucket"), + format!("{username}-service-bucket"), + "private-test-bucket".to_string(), // For deny test + ]; + + // Try to delete objects and buckets + for bucket_name in &bucket_patterns { + let _ = admin_client + .delete_object() + .bucket(bucket_name) + .key("test-object.txt") + .send() + .await; + let _ = admin_client + .delete_object() + .bucket(bucket_name) + .key("test-sts-object.txt") + .send() + .await; + let _ = admin_client + .delete_object() + .bucket(bucket_name) + .key("test-service-object.txt") + .send() + .await; + let _ = admin_client.delete_bucket().bucket(bucket_name).send().await; + } + + // Remove user + let remove_user_url = format!("{}/rustfs/admin/v3/remove-user?accessKey={}", env.url, username); + let _ = awscurl_put(&remove_user_url, "", &env.access_key, &env.secret_key).await; + + // Remove policy + let remove_policy_url = format!("{}/rustfs/admin/v3/remove-canned-policy?name={}", env.url, policy_name); + let _ = awscurl_put(&remove_policy_url, "", &env.access_key, &env.secret_key).await; +} + +/// Test AWS policy variables with single-value scenarios 
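+///
+/// For example, with the test user `testuser1` below, the policy resource
+/// `arn:aws:s3:::${aws:username}-*` is expected to resolve to
+/// `arn:aws:s3:::testuser1-*`, so creating `testuser1-test-bucket` should be
+/// allowed while creating `other-user-bucket` should be rejected.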
+#[tokio::test(flavor = "multi_thread")] +#[serial] +#[ignore = "Starts a rustfs server; enable when running full E2E"] +pub async fn test_aws_policy_variables_single_value() -> Result<(), Box> { + test_aws_policy_variables_single_value_impl().await +} + +/// Implementation function for single-value policy variables test +pub async fn test_aws_policy_variables_single_value_impl() -> Result<(), Box> { + init_logging(); + info!("Starting AWS policy variables single-value test"); + + let env = PolicyTestEnvironment::with_address("127.0.0.1:9000").await?; + + test_aws_policy_variables_single_value_impl_with_env(&env).await +} + +/// Implementation function for single-value policy variables test with shared environment +pub async fn test_aws_policy_variables_single_value_impl_with_env( + env: &PolicyTestEnvironment, +) -> Result<(), Box> { + // Create test user + let test_user = "testuser1"; + let test_password = "testpassword123"; + let policy_name = "test-single-value-policy"; + + // Create cleanup function + let cleanup = || async { + cleanup_user_and_policy(env, test_user, policy_name).await; + }; + + let create_user_body = serde_json::json!({ + "secretKey": test_password, + "status": "enabled" + }) + .to_string(); + + let create_user_url = format!("{}/rustfs/admin/v3/add-user?accessKey={}", env.url, test_user); + awscurl_put(&create_user_url, &create_user_body, &env.access_key, &env.secret_key).await?; + + // Create policy with single-value AWS variables + let policy_document = serde_json::json!({ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": ["s3:ListAllMyBuckets"], + "Resource": ["arn:aws:s3:::*"] + }, + { + "Effect": "Allow", + "Action": ["s3:CreateBucket"], + "Resource": [format!("arn:aws:s3:::{}-*", "${aws:username}")] + }, + { + "Effect": "Allow", + "Action": ["s3:ListBucket"], + "Resource": [format!("arn:aws:s3:::{}-*", "${aws:username}")] + }, + { + "Effect": "Allow", + "Action": ["s3:PutObject", "s3:GetObject"], + "Resource": [format!("arn:aws:s3:::{}-*/*", "${aws:username}")] + } + ] + }) + .to_string(); + + let add_policy_url = format!("{}/rustfs/admin/v3/add-canned-policy?name={}", env.url, policy_name); + awscurl_put(&add_policy_url, &policy_document, &env.access_key, &env.secret_key).await?; + + // Attach policy to user + let attach_policy_url = format!( + "{}/rustfs/admin/v3/set-user-or-group-policy?policyName={}&userOrGroup={}&isGroup=false", + env.url, policy_name, test_user + ); + awscurl_put(&attach_policy_url, "", &env.access_key, &env.secret_key).await?; + + // Create S3 client for test user + let test_client = env.create_s3_client(test_user, test_password); + + tokio::time::sleep(std::time::Duration::from_millis(500)).await; + + // Test 1: User should be able to list buckets (allowed by policy) + info!("Test 1: User listing buckets"); + let list_result = test_client.list_buckets().send().await; + if let Err(e) = list_result { + cleanup().await; + return Err(format!("User should be able to list buckets: {e}").into()); + } + + // Test 2: User should be able to create bucket matching username pattern + info!("Test 2: User creating bucket matching pattern"); + let bucket_name = format!("{test_user}-test-bucket"); + let create_result = test_client.create_bucket().bucket(&bucket_name).send().await; + if let Err(e) = create_result { + cleanup().await; + return Err(format!("User should be able to create bucket matching username pattern: {e}").into()); + } + + // Test 3: User should be able to list objects in their own bucket + info!("Test 3: 
User listing objects in their bucket"); + let list_objects_result = test_client.list_objects_v2().bucket(&bucket_name).send().await; + if let Err(e) = list_objects_result { + cleanup().await; + return Err(format!("User should be able to list objects in their own bucket: {e}").into()); + } + + // Test 4: User should be able to put object in their own bucket + info!("Test 4: User putting object in their bucket"); + let put_result = test_client + .put_object() + .bucket(&bucket_name) + .key("test-object.txt") + .body(ByteStream::from_static(b"Hello, Policy Variables!")) + .send() + .await; + if let Err(e) = put_result { + cleanup().await; + return Err(format!("User should be able to put object in their own bucket: {e}").into()); + } + + // Test 5: User should be able to get object from their own bucket + info!("Test 5: User getting object from their bucket"); + let get_result = test_client + .get_object() + .bucket(&bucket_name) + .key("test-object.txt") + .send() + .await; + if let Err(e) = get_result { + cleanup().await; + return Err(format!("User should be able to get object from their own bucket: {e}").into()); + } + + // Test 6: User should NOT be able to create bucket NOT matching username pattern + info!("Test 6: User attempting to create bucket NOT matching pattern"); + let other_bucket_name = "other-user-bucket"; + let create_other_result = test_client.create_bucket().bucket(other_bucket_name).send().await; + if create_other_result.is_ok() { + cleanup().await; + return Err("User should NOT be able to create bucket NOT matching username pattern".into()); + } + + // Cleanup + info!("Cleaning up test resources"); + cleanup().await; + + info!("AWS policy variables single-value test completed successfully"); + Ok(()) +} + +/// Test AWS policy variables with multi-value scenarios +#[tokio::test(flavor = "multi_thread")] +#[serial] +#[ignore = "Starts a rustfs server; enable when running full E2E"] +pub async fn test_aws_policy_variables_multi_value() -> Result<(), Box> { + test_aws_policy_variables_multi_value_impl().await +} + +/// Implementation function for multi-value policy variables test +pub async fn test_aws_policy_variables_multi_value_impl() -> Result<(), Box> { + init_logging(); + info!("Starting AWS policy variables multi-value test"); + + let env = PolicyTestEnvironment::with_address("127.0.0.1:9000").await?; + + test_aws_policy_variables_multi_value_impl_with_env(&env).await +} + +/// Implementation function for multi-value policy variables test with shared environment +pub async fn test_aws_policy_variables_multi_value_impl_with_env( + env: &PolicyTestEnvironment, +) -> Result<(), Box> { + // Create test user + let test_user = "testuser2"; + let test_password = "testpassword123"; + let policy_name = "test-multi-value-policy"; + + // Create cleanup function + let cleanup = || async { + cleanup_user_and_policy(env, test_user, policy_name).await; + }; + + // Create user + create_user(env, test_user, test_password).await?; + + // Create policy with multi-value AWS variables + let policy_document = serde_json::json!({ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": ["s3:ListAllMyBuckets"], + "Resource": ["arn:aws:s3:::*"] + }, + { + "Effect": "Allow", + "Action": ["s3:CreateBucket"], + "Resource": [ + format!("arn:aws:s3:::{}-bucket1", "${aws:username}"), + format!("arn:aws:s3:::{}-bucket2", "${aws:username}"), + format!("arn:aws:s3:::{}-bucket3", "${aws:username}") + ] + }, + { + "Effect": "Allow", + "Action": ["s3:ListBucket"], + "Resource": [ + 
format!("arn:aws:s3:::{}-bucket1", "${aws:username}"), + format!("arn:aws:s3:::{}-bucket2", "${aws:username}"), + format!("arn:aws:s3:::{}-bucket3", "${aws:username}") + ] + } + ] + }); + + create_and_attach_policy(env, policy_name, test_user, policy_document).await?; + + // Create S3 client for test user + let test_client = env.create_s3_client(test_user, test_password); + + // Test 1: User should be able to create buckets matching any of the multi-value patterns + info!("Test 1: User creating first bucket matching multi-value pattern"); + let bucket1_name = format!("{test_user}-bucket1"); + let create_result1 = test_client.create_bucket().bucket(&bucket1_name).send().await; + if let Err(e) = create_result1 { + cleanup().await; + return Err(format!("User should be able to create first bucket matching multi-value pattern: {e}").into()); + } + + info!("Test 2: User creating second bucket matching multi-value pattern"); + let bucket2_name = format!("{test_user}-bucket2"); + let create_result2 = test_client.create_bucket().bucket(&bucket2_name).send().await; + if let Err(e) = create_result2 { + cleanup().await; + return Err(format!("User should be able to create second bucket matching multi-value pattern: {e}").into()); + } + + info!("Test 3: User creating third bucket matching multi-value pattern"); + let bucket3_name = format!("{test_user}-bucket3"); + let create_result3 = test_client.create_bucket().bucket(&bucket3_name).send().await; + if let Err(e) = create_result3 { + cleanup().await; + return Err(format!("User should be able to create third bucket matching multi-value pattern: {e}").into()); + } + + // Test 4: User should NOT be able to create bucket NOT matching any multi-value pattern + info!("Test 4: User attempting to create bucket NOT matching any pattern"); + let other_bucket_name = format!("{test_user}-other-bucket"); + let create_other_result = test_client.create_bucket().bucket(&other_bucket_name).send().await; + if create_other_result.is_ok() { + cleanup().await; + return Err("User should NOT be able to create bucket NOT matching any multi-value pattern".into()); + } + + // Test 5: User should be able to list objects in their allowed buckets + info!("Test 5: User listing objects in allowed buckets"); + let list_objects_result1 = test_client.list_objects_v2().bucket(&bucket1_name).send().await; + if let Err(e) = list_objects_result1 { + cleanup().await; + return Err(format!("User should be able to list objects in first allowed bucket: {e}").into()); + } + + let list_objects_result2 = test_client.list_objects_v2().bucket(&bucket2_name).send().await; + if let Err(e) = list_objects_result2 { + cleanup().await; + return Err(format!("User should be able to list objects in second allowed bucket: {e}").into()); + } + + // Cleanup + info!("Cleaning up test resources"); + cleanup().await; + + info!("AWS policy variables multi-value test completed successfully"); + Ok(()) +} + +/// Test AWS policy variables with variable concatenation +#[tokio::test(flavor = "multi_thread")] +#[serial] +#[ignore = "Starts a rustfs server; enable when running full E2E"] +pub async fn test_aws_policy_variables_concatenation() -> Result<(), Box> { + test_aws_policy_variables_concatenation_impl().await +} + +/// Implementation function for concatenation policy variables test +pub async fn test_aws_policy_variables_concatenation_impl() -> Result<(), Box> { + init_logging(); + info!("Starting AWS policy variables concatenation test"); + + let env = PolicyTestEnvironment::with_address("127.0.0.1:9000").await?; 
+ + test_aws_policy_variables_concatenation_impl_with_env(&env).await +} + +/// Implementation function for concatenation policy variables test with shared environment +pub async fn test_aws_policy_variables_concatenation_impl_with_env( + env: &PolicyTestEnvironment, +) -> Result<(), Box> { + // Create test user + let test_user = "testuser3"; + let test_password = "testpassword123"; + let policy_name = "test-concatenation-policy"; + + // Create cleanup function + let cleanup = || async { + cleanup_user_and_policy(env, test_user, policy_name).await; + }; + + // Create user + create_user(env, test_user, test_password).await?; + + // Create policy with variable concatenation + let policy_document = serde_json::json!({ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": ["s3:ListAllMyBuckets"], + "Resource": ["arn:aws:s3:::*"] + }, + { + "Effect": "Allow", + "Action": ["s3:CreateBucket"], + "Resource": [format!("arn:aws:s3:::prefix-{}-suffix", "${aws:username}")] + }, + { + "Effect": "Allow", + "Action": ["s3:ListBucket"], + "Resource": [format!("arn:aws:s3:::prefix-{}-suffix", "${aws:username}")] + } + ] + }); + + create_and_attach_policy(env, policy_name, test_user, policy_document).await?; + + // Create S3 client for test user + let test_client = env.create_s3_client(test_user, test_password); + + // Add a small delay to allow policy to propagate + tokio::time::sleep(std::time::Duration::from_millis(500)).await; + + // Test: User should be able to create bucket matching concatenated pattern + info!("Test: User creating bucket matching concatenated pattern"); + let bucket_name = format!("prefix-{test_user}-suffix"); + let create_result = test_client.create_bucket().bucket(&bucket_name).send().await; + if let Err(e) = create_result { + cleanup().await; + return Err(format!("User should be able to create bucket matching concatenated pattern: {e}").into()); + } + + // Test: User should be able to list objects in the concatenated pattern bucket + info!("Test: User listing objects in concatenated pattern bucket"); + let list_objects_result = test_client.list_objects_v2().bucket(&bucket_name).send().await; + if let Err(e) = list_objects_result { + cleanup().await; + return Err(format!("User should be able to list objects in concatenated pattern bucket: {e}").into()); + } + + // Cleanup + info!("Cleaning up test resources"); + cleanup().await; + + info!("AWS policy variables concatenation test completed successfully"); + Ok(()) +} + +/// Test AWS policy variables with nested scenarios +#[tokio::test(flavor = "multi_thread")] +#[serial] +#[ignore = "Starts a rustfs server; enable when running full E2E"] +pub async fn test_aws_policy_variables_nested() -> Result<(), Box> { + test_aws_policy_variables_nested_impl().await +} + +/// Implementation function for nested policy variables test +pub async fn test_aws_policy_variables_nested_impl() -> Result<(), Box> { + init_logging(); + info!("Starting AWS policy variables nested test"); + + let env = PolicyTestEnvironment::with_address("127.0.0.1:9000").await?; + + test_aws_policy_variables_nested_impl_with_env(&env).await +} + +/// Test AWS policy variables with STS temporary credentials +#[tokio::test(flavor = "multi_thread")] +#[serial] +#[ignore = "Starts a rustfs server; enable when running full E2E"] +pub async fn test_aws_policy_variables_sts() -> Result<(), Box> { + test_aws_policy_variables_sts_impl().await +} + +/// Implementation function for STS policy variables test +pub async fn test_aws_policy_variables_sts_impl() 
-> Result<(), Box> { + init_logging(); + info!("Starting AWS policy variables STS test"); + + let env = PolicyTestEnvironment::with_address("127.0.0.1:9000").await?; + + test_aws_policy_variables_sts_impl_with_env(&env).await +} + +/// Implementation function for nested policy variables test with shared environment +pub async fn test_aws_policy_variables_nested_impl_with_env( + env: &PolicyTestEnvironment, +) -> Result<(), Box> { + // Create test user + let test_user = "testuser4"; + let test_password = "testpassword123"; + let policy_name = "test-nested-policy"; + + // Create cleanup function + let cleanup = || async { + cleanup_user_and_policy(env, test_user, policy_name).await; + }; + + // Create user + create_user(env, test_user, test_password).await?; + + // Create policy with nested variables - this tests complex variable resolution + let policy_document = serde_json::json!({ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": ["s3:ListAllMyBuckets"], + "Resource": ["arn:aws:s3:::*"] + }, + { + "Effect": "Allow", + "Action": ["s3:CreateBucket"], + "Resource": ["arn:aws:s3:::${${aws:username}-test}"] + }, + { + "Effect": "Allow", + "Action": ["s3:ListBucket"], + "Resource": ["arn:aws:s3:::${${aws:username}-test}"] + } + ] + }); + + create_and_attach_policy(env, policy_name, test_user, policy_document).await?; + + // Create S3 client for test user + let test_client = env.create_s3_client(test_user, test_password); + + // Add a small delay to allow policy to propagate + tokio::time::sleep(std::time::Duration::from_millis(500)).await; + + // Test nested variable resolution + info!("Test: Nested variable resolution"); + + // Create bucket with expected resolved name + let expected_bucket = format!("{test_user}-test"); + + // Attempt to create bucket with resolved name + let create_result = test_client.create_bucket().bucket(&expected_bucket).send().await; + + // Verify bucket creation succeeds (nested variable resolved correctly) + if let Err(e) = create_result { + cleanup().await; + return Err(format!("User should be able to create bucket with nested variable: {e}").into()); + } + + // Verify bucket creation fails with unresolved variable + let unresolved_bucket = format!("${{}}-test {test_user}"); + let create_unresolved = test_client.create_bucket().bucket(&unresolved_bucket).send().await; + + if create_unresolved.is_ok() { + cleanup().await; + return Err("User should NOT be able to create bucket with unresolved variable".into()); + } + + // Cleanup + info!("Cleaning up test resources"); + cleanup().await; + + info!("AWS policy variables nested test completed successfully"); + Ok(()) +} + +/// Implementation function for STS policy variables test with shared environment +pub async fn test_aws_policy_variables_sts_impl_with_env( + env: &PolicyTestEnvironment, +) -> Result<(), Box> { + // Create test user for STS + let test_user = "testuser-sts"; + let test_password = "testpassword123"; + let policy_name = "test-sts-policy"; + + // Create cleanup function + let cleanup = || async { + cleanup_user_and_policy(env, test_user, policy_name).await; + }; + + // Create STS user + create_sts_user(env, test_user, test_password).await?; + + // Create policy with STS-compatible variables + let policy_document = serde_json::json!({ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": ["s3:ListAllMyBuckets"], + "Resource": ["arn:aws:s3:::*"] + }, + { + "Effect": "Allow", + "Action": ["s3:CreateBucket"], + "Resource": 
[format!("arn:aws:s3:::{}-sts-bucket", "${aws:username}")] + }, + { + "Effect": "Allow", + "Action": ["s3:ListBucket", "s3:PutObject", "s3:GetObject"], + "Resource": [format!("arn:aws:s3:::{}-sts-bucket/*", "${aws:username}")] + } + ] + }); + + create_and_attach_policy(env, policy_name, test_user, policy_document).await?; + + // Create S3 client for test user + let test_client = env.create_s3_client(test_user, test_password); + + // Add a small delay to allow policy to propagate + tokio::time::sleep(std::time::Duration::from_millis(500)).await; + + // Test: User should be able to create bucket matching STS pattern + info!("Test: User creating bucket matching STS pattern"); + let bucket_name = format!("{test_user}-sts-bucket"); + let create_result = test_client.create_bucket().bucket(&bucket_name).send().await; + if let Err(e) = create_result { + cleanup().await; + return Err(format!("User should be able to create STS bucket: {e}").into()); + } + + // Test: User should be able to put object in STS bucket + info!("Test: User putting object in STS bucket"); + let put_result = test_client + .put_object() + .bucket(&bucket_name) + .key("test-sts-object.txt") + .body(ByteStream::from_static(b"STS Test Object")) + .send() + .await; + if let Err(e) = put_result { + cleanup().await; + return Err(format!("User should be able to put object in STS bucket: {e}").into()); + } + + // Test: User should be able to get object from STS bucket + info!("Test: User getting object from STS bucket"); + let get_result = test_client + .get_object() + .bucket(&bucket_name) + .key("test-sts-object.txt") + .send() + .await; + if let Err(e) = get_result { + cleanup().await; + return Err(format!("User should be able to get object from STS bucket: {e}").into()); + } + + // Test: User should be able to list objects in STS bucket + info!("Test: User listing objects in STS bucket"); + let list_result = test_client.list_objects_v2().bucket(&bucket_name).send().await; + if let Err(e) = list_result { + cleanup().await; + return Err(format!("User should be able to list objects in STS bucket: {e}").into()); + } + + // Cleanup + info!("Cleaning up test resources"); + cleanup().await; + + info!("AWS policy variables STS test completed successfully"); + Ok(()) +} + +/// Test AWS policy variables with deny scenarios +#[tokio::test(flavor = "multi_thread")] +#[serial] +#[ignore = "Starts a rustfs server; enable when running full E2E"] +pub async fn test_aws_policy_variables_deny() -> Result<(), Box> { + test_aws_policy_variables_deny_impl().await +} + +/// Implementation function for deny policy variables test +pub async fn test_aws_policy_variables_deny_impl() -> Result<(), Box> { + init_logging(); + info!("Starting AWS policy variables deny test"); + + let env = PolicyTestEnvironment::with_address("127.0.0.1:9000").await?; + + test_aws_policy_variables_deny_impl_with_env(&env).await +} + +/// Implementation function for deny policy variables test with shared environment +pub async fn test_aws_policy_variables_deny_impl_with_env( + env: &PolicyTestEnvironment, +) -> Result<(), Box> { + // Create test user + let test_user = "testuser5"; + let test_password = "testpassword123"; + let policy_name = "test-deny-policy"; + + // Create cleanup function + let cleanup = || async { + cleanup_user_and_policy(env, test_user, policy_name).await; + }; + + // Create user + create_user(env, test_user, test_password).await?; + + // Create policy with both allow and deny statements + let policy_document = serde_json::json!({ + "Version": 
"2012-10-17", + "Statement": [ + // Allow general access + { + "Effect": "Allow", + "Action": ["s3:ListAllMyBuckets"], + "Resource": ["arn:aws:s3:::*"] + }, + // Allow creating buckets matching username pattern + { + "Effect": "Allow", + "Action": ["s3:CreateBucket"], + "Resource": [format!("arn:aws:s3:::{}-*", "${aws:username}")] + }, + // Deny creating buckets with "private" in the name + { + "Effect": "Deny", + "Action": ["s3:CreateBucket"], + "Resource": ["arn:aws:s3:::*private*"] + } + ] + }); + + create_and_attach_policy(env, policy_name, test_user, policy_document).await?; + + // Create S3 client for test user + let test_client = env.create_s3_client(test_user, test_password); + + // Add a small delay to allow policy to propagate + tokio::time::sleep(std::time::Duration::from_millis(500)).await; + + // Test 1: User should be able to create bucket matching username pattern + info!("Test 1: User creating bucket matching username pattern"); + let bucket_name = format!("{test_user}-test-bucket"); + let create_result = test_client.create_bucket().bucket(&bucket_name).send().await; + if let Err(e) = create_result { + cleanup().await; + return Err(format!("User should be able to create bucket matching username pattern: {e}").into()); + } + + // Test 2: User should NOT be able to create bucket with "private" in the name (deny rule) + info!("Test 2: User attempting to create bucket with 'private' in name (should be denied)"); + let private_bucket_name = "private-test-bucket"; + let create_private_result = test_client.create_bucket().bucket(private_bucket_name).send().await; + if create_private_result.is_ok() { + cleanup().await; + return Err("User should NOT be able to create bucket with 'private' in name due to deny rule".into()); + } + + // Cleanup + info!("Cleaning up test resources"); + cleanup().await; + + info!("AWS policy variables deny test completed successfully"); + Ok(()) +} diff --git a/crates/e2e_test/src/policy/test_env.rs b/crates/e2e_test/src/policy/test_env.rs new file mode 100644 index 00000000..6e7392a0 --- /dev/null +++ b/crates/e2e_test/src/policy/test_env.rs @@ -0,0 +1,100 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +//! Custom test environment for policy variables tests +//! +//! This module provides a custom test environment that doesn't automatically +//! stop servers when destroyed, addressing the server stopping issue. 
+
+use aws_sdk_s3::Client;
+use aws_sdk_s3::config::{Config, Credentials, Region};
+use std::net::TcpStream;
+use std::time::Duration;
+use tokio::time::sleep;
+use tracing::{info, warn};
+
+// Default credentials
+const DEFAULT_ACCESS_KEY: &str = "rustfsadmin";
+const DEFAULT_SECRET_KEY: &str = "rustfsadmin";
+
+/// Custom test environment that doesn't automatically stop servers
+pub struct PolicyTestEnvironment {
+    pub temp_dir: String,
+    pub address: String,
+    pub url: String,
+    pub access_key: String,
+    pub secret_key: String,
+}
+
+impl PolicyTestEnvironment {
+    /// Create a new test environment with specific address
+    /// This environment won't stop any server when dropped
+    pub async fn with_address(address: &str) -> Result<Self, Box<dyn std::error::Error>> {
+        let temp_dir = format!("/tmp/rustfs_policy_test_{}", uuid::Uuid::new_v4());
+        tokio::fs::create_dir_all(&temp_dir).await?;
+
+        let url = format!("http://{address}");
+
+        Ok(Self {
+            temp_dir,
+            address: address.to_string(),
+            url,
+            access_key: DEFAULT_ACCESS_KEY.to_string(),
+            secret_key: DEFAULT_SECRET_KEY.to_string(),
+        })
+    }
+
+    /// Create an AWS S3 client configured for this RustFS instance
+    pub fn create_s3_client(&self, access_key: &str, secret_key: &str) -> Client {
+        let credentials = Credentials::new(access_key, secret_key, None, None, "policy-test");
+        let config = Config::builder()
+            .credentials_provider(credentials)
+            .region(Region::new("us-east-1"))
+            .endpoint_url(&self.url)
+            .force_path_style(true)
+            .behavior_version_latest()
+            .build();
+        Client::from_conf(config)
+    }
+
+    /// Wait for RustFS server to be ready by checking TCP connectivity
+    pub async fn wait_for_server_ready(&self) -> Result<(), Box<dyn std::error::Error>> {
+        info!("Waiting for RustFS server to be ready on {}", self.address);
+
+        for i in 0..30 {
+            if TcpStream::connect(&self.address).is_ok() {
+                info!("✅ RustFS server is ready after {} attempts", i + 1);
+                return Ok(());
+            }
+
+            if i == 29 {
+                return Err("RustFS server failed to become ready within 30 seconds".into());
+            }
+
+            sleep(Duration::from_secs(1)).await;
+        }
+
+        Ok(())
+    }
+}
+
+// Implement Drop trait that doesn't stop servers
+impl Drop for PolicyTestEnvironment {
+    fn drop(&mut self) {
+        // Clean up temp directory only, don't stop any server
+        if let Err(e) = std::fs::remove_dir_all(&self.temp_dir) {
+            warn!("Failed to clean up temp directory {}: {}", self.temp_dir, e);
+        }
+    }
+}
diff --git a/crates/e2e_test/src/policy/test_runner.rs b/crates/e2e_test/src/policy/test_runner.rs
new file mode 100644
index 00000000..38989579
--- /dev/null
+++ b/crates/e2e_test/src/policy/test_runner.rs
@@ -0,0 +1,247 @@
+// Copyright 2024 RustFS Team
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
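+
+//! Runner that drives the policy-variables test cases against a shared
+//! environment.
+//!
+//! A hedged sketch of how the suite below is meant to be invoked (illustrative
+//! only; it assumes a RustFS server is already running on 127.0.0.1:9000):
+//!
+//! ```ignore
+//! let suite = PolicyTestSuite::new()
+//!     .with_config(TestSuiteConfig { include_critical_only: true });
+//! let results = suite.run_test_suite().await;
+//! assert!(results.iter().all(|r| r.success), "some policy tests failed");
+//! ```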
+ +use crate::common::init_logging; +use crate::policy::test_env::PolicyTestEnvironment; +use serial_test::serial; +use std::time::Instant; +use tokio::time::{Duration, sleep}; +use tracing::{error, info}; + +/// Core test categories +#[derive(Debug, Clone, PartialEq, Eq)] +pub enum TestCategory { + SingleValue, + MultiValue, + Concatenation, + Nested, + DenyScenarios, +} + +impl TestCategory {} + +/// Test case definition +#[derive(Debug, Clone)] +pub struct TestDefinition { + pub name: String, + #[allow(dead_code)] + pub category: TestCategory, + pub is_critical: bool, +} + +impl TestDefinition { + pub fn new(name: impl Into, category: TestCategory, is_critical: bool) -> Self { + Self { + name: name.into(), + category, + is_critical, + } + } +} + +/// Test result +#[derive(Debug, Clone)] +pub struct TestResult { + pub test_name: String, + pub success: bool, + pub error_message: Option, +} + +impl TestResult { + pub fn success(test_name: String) -> Self { + Self { + test_name, + success: true, + error_message: None, + } + } + + pub fn failure(test_name: String, error: String) -> Self { + Self { + test_name, + success: false, + error_message: Some(error), + } + } +} + +/// Test suite configuration +#[derive(Debug, Clone, Default)] +pub struct TestSuiteConfig { + pub include_critical_only: bool, +} + +/// Policy test suite +pub struct PolicyTestSuite { + tests: Vec, + config: TestSuiteConfig, +} + +impl PolicyTestSuite { + /// Create default test suite + pub fn new() -> Self { + let tests = vec![ + TestDefinition::new("test_aws_policy_variables_single_value", TestCategory::SingleValue, true), + TestDefinition::new("test_aws_policy_variables_multi_value", TestCategory::MultiValue, true), + TestDefinition::new("test_aws_policy_variables_concatenation", TestCategory::Concatenation, true), + TestDefinition::new("test_aws_policy_variables_nested", TestCategory::Nested, true), + TestDefinition::new("test_aws_policy_variables_deny", TestCategory::DenyScenarios, true), + TestDefinition::new("test_aws_policy_variables_sts", TestCategory::SingleValue, true), + ]; + + Self { + tests, + config: TestSuiteConfig::default(), + } + } + + /// Configure test suite + pub fn with_config(mut self, config: TestSuiteConfig) -> Self { + self.config = config; + self + } + + /// Run test suite + pub async fn run_test_suite(&self) -> Vec { + init_logging(); + info!("Starting Policy Variables test suite"); + + let start_time = Instant::now(); + let mut results = Vec::new(); + + // Create test environment + let env = match PolicyTestEnvironment::with_address("127.0.0.1:9000").await { + Ok(env) => env, + Err(e) => { + error!("Failed to create test environment: {}", e); + return vec![TestResult::failure("env_creation".into(), e.to_string())]; + } + }; + + // Wait for server to be ready + if env.wait_for_server_ready().await.is_err() { + error!("Server is not ready"); + return vec![TestResult::failure("server_check".into(), "Server not ready".into())]; + } + + // Filter tests + let tests_to_run: Vec<&TestDefinition> = self + .tests + .iter() + .filter(|test| !self.config.include_critical_only || test.is_critical) + .collect(); + + info!("Scheduled {} tests", tests_to_run.len()); + + // Run tests + for (i, test_def) in tests_to_run.iter().enumerate() { + info!("Running test {}/{}: {}", i + 1, tests_to_run.len(), test_def.name); + let test_start = Instant::now(); + + let result = self.run_single_test(test_def, &env).await; + let test_duration = test_start.elapsed(); + + match result { + Ok(_) => { + info!("Test passed: {} 
({:.2}s)", test_def.name, test_duration.as_secs_f64()); + results.push(TestResult::success(test_def.name.clone())); + } + Err(e) => { + error!("Test failed: {} ({:.2}s): {}", test_def.name, test_duration.as_secs_f64(), e); + results.push(TestResult::failure(test_def.name.clone(), e.to_string())); + } + } + + // Delay between tests to avoid resource conflicts + if i < tests_to_run.len() - 1 { + sleep(Duration::from_secs(2)).await; + } + } + + // Print summary + self.print_summary(&results, start_time.elapsed()); + + results + } + + /// Run a single test + async fn run_single_test( + &self, + test_def: &TestDefinition, + env: &PolicyTestEnvironment, + ) -> Result<(), Box> { + match test_def.name.as_str() { + "test_aws_policy_variables_single_value" => { + super::policy_variables_test::test_aws_policy_variables_single_value_impl_with_env(env).await + } + "test_aws_policy_variables_multi_value" => { + super::policy_variables_test::test_aws_policy_variables_multi_value_impl_with_env(env).await + } + "test_aws_policy_variables_concatenation" => { + super::policy_variables_test::test_aws_policy_variables_concatenation_impl_with_env(env).await + } + "test_aws_policy_variables_nested" => { + super::policy_variables_test::test_aws_policy_variables_nested_impl_with_env(env).await + } + "test_aws_policy_variables_deny" => { + super::policy_variables_test::test_aws_policy_variables_deny_impl_with_env(env).await + } + "test_aws_policy_variables_sts" => { + super::policy_variables_test::test_aws_policy_variables_sts_impl_with_env(env).await + } + _ => Err(format!("Test {} not implemented", test_def.name).into()), + } + } + + /// Print test summary + fn print_summary(&self, results: &[TestResult], total_duration: Duration) { + info!("=== Test Suite Summary ==="); + info!("Total duration: {:.2}s", total_duration.as_secs_f64()); + info!("Total tests: {}", results.len()); + + let passed = results.iter().filter(|r| r.success).count(); + let failed = results.len() - passed; + let success_rate = (passed as f64 / results.len() as f64) * 100.0; + + info!("Passed: {} | Failed: {}", passed, failed); + info!("Success rate: {:.1}%", success_rate); + + if failed > 0 { + error!("Failed tests:"); + for result in results.iter().filter(|r| !r.success) { + error!(" - {}: {}", result.test_name, result.error_message.as_ref().unwrap()); + } + } + } +} + +/// Test suite +#[tokio::test] +#[serial] +#[ignore = "Connects to existing rustfs server"] +async fn test_policy_critical_suite() -> Result<(), Box> { + let config = TestSuiteConfig { + include_critical_only: true, + }; + let suite = PolicyTestSuite::new().with_config(config); + let results = suite.run_test_suite().await; + + let failed = results.iter().filter(|r| !r.success).count(); + if failed > 0 { + return Err(format!("Critical tests failed: {failed} failures").into()); + } + + info!("All critical tests passed"); + Ok(()) +} diff --git a/crates/e2e_test/src/reliant/get_deleted_object_test.rs b/crates/e2e_test/src/reliant/get_deleted_object_test.rs index 71df0858..b34159ec 100644 --- a/crates/e2e_test/src/reliant/get_deleted_object_test.rs +++ b/crates/e2e_test/src/reliant/get_deleted_object_test.rs @@ -127,12 +127,12 @@ async fn test_get_deleted_object_returns_nosuchkey() -> Result<(), Box { - panic!("Expected ServiceError with NoSuchKey, but got: {:?}", other_err); + panic!("Expected ServiceError with NoSuchKey, but got: {other_err:?}"); } } @@ -182,13 +182,12 @@ async fn test_head_deleted_object_returns_nosuchkey() -> Result<(), Box { - panic!("Expected ServiceError 
but got: {:?}", other_err); + panic!("Expected ServiceError but got: {other_err:?}"); } } @@ -220,11 +219,11 @@ async fn test_get_nonexistent_object_returns_nosuchkey() -> Result<(), Box { let s3_err = service_err.into_err(); - assert!(s3_err.is_no_such_key(), "Error should be NoSuchKey, got: {:?}", s3_err); + assert!(s3_err.is_no_such_key(), "Error should be NoSuchKey, got: {s3_err:?}"); info!("✅ GetObject correctly returns NoSuchKey for non-existent object"); } other_err => { - panic!("Expected ServiceError with NoSuchKey, but got: {:?}", other_err); + panic!("Expected ServiceError with NoSuchKey, but got: {other_err:?}"); } } @@ -266,15 +265,15 @@ async fn test_multiple_gets_deleted_object() -> Result<(), Box { let s3_err = service_err.into_err(); - assert!(s3_err.is_no_such_key(), "Attempt {}: Error should be NoSuchKey, got: {:?}", i, s3_err); + assert!(s3_err.is_no_such_key(), "Attempt {i}: Error should be NoSuchKey, got: {s3_err:?}"); } other_err => { - panic!("Attempt {}: Expected ServiceError but got: {:?}", i, other_err); + panic!("Attempt {i}: Expected ServiceError but got: {other_err:?}"); } } } diff --git a/crates/e2e_test/src/reliant/head_deleted_object_versioning_test.rs b/crates/e2e_test/src/reliant/head_deleted_object_versioning_test.rs new file mode 100644 index 00000000..a4d47175 --- /dev/null +++ b/crates/e2e_test/src/reliant/head_deleted_object_versioning_test.rs @@ -0,0 +1,138 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +//! Test for HeadObject on deleted objects with versioning enabled +//! +//! This test reproduces the issue where getting a deleted object returns +//! 200 OK instead of 404 NoSuchKey when versioning is enabled. 
+ +#![cfg(test)] + +use aws_config::meta::region::RegionProviderChain; +use aws_sdk_s3::Client; +use aws_sdk_s3::config::{Credentials, Region}; +use aws_sdk_s3::error::SdkError; +use aws_sdk_s3::types::{BucketVersioningStatus, VersioningConfiguration}; +use bytes::Bytes; +use serial_test::serial; +use std::error::Error; +use tracing::info; + +const ENDPOINT: &str = "http://localhost:9000"; +const ACCESS_KEY: &str = "rustfsadmin"; +const SECRET_KEY: &str = "rustfsadmin"; +const BUCKET: &str = "test-head-deleted-versioning-bucket"; + +async fn create_aws_s3_client() -> Result> { + let region_provider = RegionProviderChain::default_provider().or_else(Region::new("us-east-1")); + let shared_config = aws_config::defaults(aws_config::BehaviorVersion::latest()) + .region(region_provider) + .credentials_provider(Credentials::new(ACCESS_KEY, SECRET_KEY, None, None, "static")) + .endpoint_url(ENDPOINT) + .load() + .await; + + let client = Client::from_conf( + aws_sdk_s3::Config::from(&shared_config) + .to_builder() + .force_path_style(true) + .build(), + ); + Ok(client) +} + +/// Setup test bucket, creating it if it doesn't exist, and enable versioning +async fn setup_test_bucket(client: &Client) -> Result<(), Box> { + match client.create_bucket().bucket(BUCKET).send().await { + Ok(_) => {} + Err(SdkError::ServiceError(e)) => { + let e = e.into_err(); + let error_code = e.meta().code().unwrap_or(""); + if !error_code.eq("BucketAlreadyExists") && !error_code.eq("BucketAlreadyOwnedByYou") { + return Err(e.into()); + } + } + Err(e) => { + return Err(e.into()); + } + } + + // Enable versioning + client + .put_bucket_versioning() + .bucket(BUCKET) + .versioning_configuration( + VersioningConfiguration::builder() + .status(BucketVersioningStatus::Enabled) + .build(), + ) + .send() + .await?; + + Ok(()) +} + +/// Test that HeadObject on a deleted object returns NoSuchKey when versioning is enabled +#[tokio::test] +#[serial] +#[ignore = "requires running RustFS server at localhost:9000"] +async fn test_head_deleted_object_versioning_returns_nosuchkey() -> Result<(), Box> { + let _ = tracing_subscriber::fmt() + .with_max_level(tracing::Level::INFO) + .with_test_writer() + .try_init(); + + info!("🧪 Starting test_head_deleted_object_versioning_returns_nosuchkey"); + + let client = create_aws_s3_client().await?; + setup_test_bucket(&client).await?; + + let key = "test-head-deleted-versioning.txt"; + let content = b"Test content for HeadObject with versioning"; + + // Upload and verify + client + .put_object() + .bucket(BUCKET) + .key(key) + .body(Bytes::from_static(content).into()) + .send() + .await?; + + // Delete the object (creates a delete marker) + client.delete_object().bucket(BUCKET).key(key).send().await?; + + // Try to head the deleted object (latest version is delete marker) + let head_result = client.head_object().bucket(BUCKET).key(key).send().await; + + assert!(head_result.is_err(), "HeadObject on deleted object should return an error"); + + match head_result.unwrap_err() { + SdkError::ServiceError(service_err) => { + let s3_err = service_err.into_err(); + assert!( + s3_err.meta().code() == Some("NoSuchKey") + || s3_err.meta().code() == Some("NotFound") + || s3_err.meta().code() == Some("404"), + "Error should be NoSuchKey or NotFound, got: {s3_err:?}" + ); + info!("✅ HeadObject correctly returns NoSuchKey/NotFound"); + } + other_err => { + panic!("Expected ServiceError but got: {other_err:?}"); + } + } + + Ok(()) +} diff --git a/crates/e2e_test/src/reliant/mod.rs 
b/crates/e2e_test/src/reliant/mod.rs index 83d89906..05a4867b 100644 --- a/crates/e2e_test/src/reliant/mod.rs +++ b/crates/e2e_test/src/reliant/mod.rs @@ -14,6 +14,7 @@ mod conditional_writes; mod get_deleted_object_test; +mod head_deleted_object_versioning_test; mod lifecycle; mod lock; mod node_interact_test; diff --git a/crates/e2e_test/src/special_chars_test.rs b/crates/e2e_test/src/special_chars_test.rs index 157ec270..60a80fdd 100644 --- a/crates/e2e_test/src/special_chars_test.rs +++ b/crates/e2e_test/src/special_chars_test.rs @@ -256,7 +256,7 @@ mod tests { let output = result.unwrap(); let body_bytes = output.body.collect().await.unwrap().into_bytes(); - assert_eq!(body_bytes.as_ref(), *content, "Content mismatch for key '{}'", key); + assert_eq!(body_bytes.as_ref(), *content, "Content mismatch for key '{key}'"); info!("✅ PUT/GET succeeded for key: {}", key); } @@ -472,7 +472,7 @@ mod tests { info!("Testing COPY from '{}' to '{}'", src_key, dest_key); // COPY object - let copy_source = format!("{}/{}", bucket, src_key); + let copy_source = format!("{bucket}/{src_key}"); let result = client .copy_object() .bucket(bucket) @@ -543,7 +543,7 @@ mod tests { let output = result.unwrap(); let body_bytes = output.body.collect().await.unwrap().into_bytes(); - assert_eq!(body_bytes.as_ref(), *content, "Content mismatch for Unicode key '{}'", key); + assert_eq!(body_bytes.as_ref(), *content, "Content mismatch for Unicode key '{key}'"); info!("✅ PUT/GET succeeded for Unicode key: {}", key); } @@ -610,7 +610,7 @@ mod tests { let output = result.unwrap(); let body_bytes = output.body.collect().await.unwrap().into_bytes(); - assert_eq!(body_bytes.as_ref(), *content, "Content mismatch for key '{}'", key); + assert_eq!(body_bytes.as_ref(), *content, "Content mismatch for key '{key}'"); info!("✅ PUT/GET succeeded for key: {}", key); } @@ -658,7 +658,7 @@ mod tests { // Note: The validation happens on the server side, so we expect an error // For null byte, newline, and carriage return if key.contains('\0') || key.contains('\n') || key.contains('\r') { - assert!(result.is_err(), "Control character should be rejected for key: {:?}", key); + assert!(result.is_err(), "Control character should be rejected for key: {key:?}"); if let Err(e) = result { info!("✅ Control character correctly rejected: {:?}", e); } diff --git a/crates/ecstore/Cargo.toml b/crates/ecstore/Cargo.toml index c144bfb9..bd021c19 100644 --- a/crates/ecstore/Cargo.toml +++ b/crates/ecstore/Cargo.toml @@ -108,17 +108,12 @@ google-cloud-auth = { workspace = true } aws-config = { workspace = true } faster-hex = { workspace = true } -[target.'cfg(not(windows))'.dependencies] -nix = { workspace = true } - -[target.'cfg(windows)'.dependencies] -winapi = { workspace = true } - [dev-dependencies] tokio = { workspace = true, features = ["rt-multi-thread", "macros"] } criterion = { workspace = true, features = ["html_reports"] } temp-env = { workspace = true } +tracing-subscriber = { workspace = true } [build-dependencies] shadow-rs = { workspace = true, features = ["build", "metadata"] } diff --git a/crates/ecstore/run_benchmarks.sh b/crates/ecstore/run_benchmarks.sh index cf6988e0..7e5266c3 100755 --- a/crates/ecstore/run_benchmarks.sh +++ b/crates/ecstore/run_benchmarks.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # Copyright 2024 RustFS Team # # Licensed under the Apache License, Version 2.0 (the "License"); diff --git a/crates/ecstore/src/admin_server_info.rs b/crates/ecstore/src/admin_server_info.rs index 7917004c..324ec388 100644 
--- a/crates/ecstore/src/admin_server_info.rs +++ b/crates/ecstore/src/admin_server_info.rs @@ -23,7 +23,7 @@ use crate::{ }; use crate::data_usage::load_data_usage_cache; -use rustfs_common::{globals::GLOBAL_Local_Node_Name, heal_channel::DriveState}; +use rustfs_common::{GLOBAL_LOCAL_NODE_NAME, heal_channel::DriveState}; use rustfs_madmin::{ BackendDisks, Disk, ErasureSetInfo, ITEM_INITIALIZING, ITEM_OFFLINE, ITEM_ONLINE, InfoMessage, ServerProperties, }; @@ -128,7 +128,7 @@ async fn is_server_resolvable(endpoint: &Endpoint) -> Result<()> { } pub async fn get_local_server_property() -> ServerProperties { - let addr = GLOBAL_Local_Node_Name.read().await.clone(); + let addr = GLOBAL_LOCAL_NODE_NAME.read().await.clone(); let mut pool_numbers = HashSet::new(); let mut network = HashMap::new(); diff --git a/crates/ecstore/src/bucket/lifecycle/bucket_lifecycle_ops.rs b/crates/ecstore/src/bucket/lifecycle/bucket_lifecycle_ops.rs index d7404057..5742736b 100644 --- a/crates/ecstore/src/bucket/lifecycle/bucket_lifecycle_ops.rs +++ b/crates/ecstore/src/bucket/lifecycle/bucket_lifecycle_ops.rs @@ -953,7 +953,7 @@ impl LifecycleOps for ObjectInfo { lifecycle::ObjectOpts { name: self.name.clone(), user_tags: self.user_tags.clone(), - version_id: self.version_id.map(|v| v.to_string()).unwrap_or_default(), + version_id: self.version_id.clone(), mod_time: self.mod_time, size: self.size as usize, is_latest: self.is_latest, @@ -1067,7 +1067,7 @@ pub async fn eval_action_from_lifecycle( event } -async fn apply_transition_rule(event: &lifecycle::Event, src: &LcEventSrc, oi: &ObjectInfo) -> bool { +pub async fn apply_transition_rule(event: &lifecycle::Event, src: &LcEventSrc, oi: &ObjectInfo) -> bool { if oi.delete_marker || oi.is_dir { return false; } @@ -1161,7 +1161,7 @@ pub async fn apply_expiry_on_non_transitioned_objects( true } -async fn apply_expiry_rule(event: &lifecycle::Event, src: &LcEventSrc, oi: &ObjectInfo) -> bool { +pub async fn apply_expiry_rule(event: &lifecycle::Event, src: &LcEventSrc, oi: &ObjectInfo) -> bool { let mut expiry_state = GLOBAL_ExpiryState.write().await; expiry_state.enqueue_by_days(oi, event, src).await; true diff --git a/crates/ecstore/src/bucket/lifecycle/evaluator.rs b/crates/ecstore/src/bucket/lifecycle/evaluator.rs new file mode 100644 index 00000000..fc34f5f8 --- /dev/null +++ b/crates/ecstore/src/bucket/lifecycle/evaluator.rs @@ -0,0 +1,192 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
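+
+//! Batch lifecycle evaluation over all versions of an object.
+//!
+//! A minimal usage sketch (illustrative only; `lc`, `lock_cfg`, `repl_cfg` and
+//! `versions` are assumed inputs, where `versions` holds one `ObjectOpts` per
+//! version of a single object so that `versions.len() == versions[0].num_versions`):
+//!
+//! ```ignore
+//! let evaluator = Evaluator::new(Arc::new(lc))
+//!     .with_lock_retention(lock_cfg.map(Arc::new))
+//!     .with_replication_config(repl_cfg.map(Arc::new));
+//! let events = evaluator.eval(&versions).await?;
+//! ```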
+ +use std::sync::Arc; + +use s3s::dto::{ + BucketLifecycleConfiguration, ObjectLockConfiguration, ObjectLockEnabled, ObjectLockLegalHoldStatus, ObjectLockRetentionMode, +}; +use time::OffsetDateTime; +use tracing::info; + +use crate::bucket::lifecycle::lifecycle::{Event, Lifecycle, ObjectOpts}; +use crate::bucket::object_lock::ObjectLockStatusExt; +use crate::bucket::object_lock::objectlock::{get_object_legalhold_meta, get_object_retention_meta, utc_now_ntp}; +use crate::bucket::replication::ReplicationConfig; +use rustfs_common::metrics::IlmAction; + +/// Evaluator - evaluates lifecycle policy on objects for the given lifecycle +/// configuration, lock retention configuration and replication configuration. +pub struct Evaluator { + policy: Arc, + lock_retention: Option>, + repl_cfg: Option>, +} + +impl Evaluator { + /// NewEvaluator - creates a new evaluator with the given lifecycle + pub fn new(policy: Arc) -> Self { + Self { + policy, + lock_retention: None, + repl_cfg: None, + } + } + + /// WithLockRetention - sets the lock retention configuration for the evaluator + pub fn with_lock_retention(mut self, lr: Option>) -> Self { + self.lock_retention = lr; + self + } + + /// WithReplicationConfig - sets the replication configuration for the evaluator + pub fn with_replication_config(mut self, rcfg: Option>) -> Self { + self.repl_cfg = rcfg; + self + } + + /// IsPendingReplication checks if the object is pending replication. + pub fn is_pending_replication(&self, obj: &ObjectOpts) -> bool { + use crate::bucket::replication::ReplicationConfigurationExt; + if self.repl_cfg.is_none() { + return false; + } + if let Some(rcfg) = &self.repl_cfg { + if rcfg + .config + .as_ref() + .is_some_and(|config| config.has_active_rules(obj.name.as_str(), true)) + && !obj.version_purge_status.is_empty() + { + return true; + } + } + false + } + + /// IsObjectLocked checks if it is appropriate to remove an + /// object according to locking configuration when this is lifecycle/ bucket quota asking. + /// (copied over from enforceRetentionForDeletion) + pub fn is_object_locked(&self, obj: &ObjectOpts) -> bool { + if self.lock_retention.as_ref().is_none_or(|v| { + v.object_lock_enabled + .as_ref() + .is_none_or(|v| v.as_str() != ObjectLockEnabled::ENABLED) + }) { + return false; + } + + if obj.delete_marker { + return false; + } + + let lhold = get_object_legalhold_meta(obj.user_defined.clone()); + if lhold + .status + .is_some_and(|v| v.valid() && v.as_str() == ObjectLockLegalHoldStatus::ON) + { + return true; + } + + let ret = get_object_retention_meta(obj.user_defined.clone()); + if ret + .mode + .is_some_and(|v| matches!(v.as_str(), ObjectLockRetentionMode::COMPLIANCE | ObjectLockRetentionMode::GOVERNANCE)) + { + let t = utc_now_ntp(); + if let Some(retain_until) = ret.retain_until_date { + if OffsetDateTime::from(retain_until).gt(&t) { + return true; + } + } + } + false + } + + /// eval will return a lifecycle event for each object in objs for a given time. 
+ async fn eval_inner(&self, objs: &[ObjectOpts], now: OffsetDateTime) -> Vec { + let mut events = vec![Event::default(); objs.len()]; + let mut newer_noncurrent_versions = 0; + + 'top_loop: { + for (i, obj) in objs.iter().enumerate() { + let mut event = self.policy.eval_inner(obj, now, newer_noncurrent_versions).await; + match event.action { + IlmAction::DeleteAllVersionsAction | IlmAction::DelMarkerDeleteAllVersionsAction => { + // Skip if bucket has object locking enabled; To prevent the + // possibility of violating an object retention on one of the + // noncurrent versions of this object. + if self.lock_retention.as_ref().is_some_and(|v| { + v.object_lock_enabled + .as_ref() + .is_some_and(|v| v.as_str() == ObjectLockEnabled::ENABLED) + }) { + event = Event::default(); + } else { + // No need to evaluate remaining versions' lifecycle + // events after DeleteAllVersionsAction* + events[i] = event; + + info!("eval_inner: skipping remaining versions' lifecycle events after DeleteAllVersionsAction*"); + + break 'top_loop; + } + } + IlmAction::DeleteVersionAction | IlmAction::DeleteRestoredVersionAction => { + // Defensive code, should never happen + if obj.version_id.is_none_or(|v| v.is_nil()) { + event.action = IlmAction::NoneAction; + } + if self.is_object_locked(obj) { + event = Event::default(); + } + + if self.is_pending_replication(obj) { + event = Event::default(); + } + } + _ => {} + } + + if !obj.is_latest { + match event.action { + IlmAction::DeleteVersionAction => { + // this noncurrent version will be expired, nothing to add + } + _ => { + // this noncurrent version will be spared + newer_noncurrent_versions += 1; + } + } + } + events[i] = event; + } + } + events + } + + /// Eval will return a lifecycle event for each object in objs + pub async fn eval(&self, objs: &[ObjectOpts]) -> Result, std::io::Error> { + if objs.is_empty() { + return Ok(vec![]); + } + if objs.len() != objs[0].num_versions { + return Err(std::io::Error::new( + std::io::ErrorKind::InvalidInput, + format!("number of versions mismatch, expected {}, got {}", objs[0].num_versions, objs.len()), + )); + } + Ok(self.eval_inner(objs, OffsetDateTime::now_utc()).await) + } +} diff --git a/crates/ecstore/src/bucket/lifecycle/lifecycle.rs b/crates/ecstore/src/bucket/lifecycle/lifecycle.rs index c435dca5..9666699b 100644 --- a/crates/ecstore/src/bucket/lifecycle/lifecycle.rs +++ b/crates/ecstore/src/bucket/lifecycle/lifecycle.rs @@ -18,19 +18,23 @@ #![allow(unused_must_use)] #![allow(clippy::all)] +use rustfs_filemeta::{ReplicationStatusType, VersionPurgeStatusType}; use s3s::dto::{ BucketLifecycleConfiguration, ExpirationStatus, LifecycleExpiration, LifecycleRule, NoncurrentVersionTransition, ObjectLockConfiguration, ObjectLockEnabled, RestoreRequest, Transition, }; use std::cmp::Ordering; +use std::collections::HashMap; use std::env; use std::fmt::Display; use std::sync::Arc; use time::macros::{datetime, offset}; use time::{self, Duration, OffsetDateTime}; use tracing::info; +use uuid::Uuid; use crate::bucket::lifecycle::rule::TransitionOps; +use crate::store_api::ObjectInfo; pub const TRANSITION_COMPLETE: &str = "complete"; pub const TRANSITION_PENDING: &str = "pending"; @@ -131,11 +135,11 @@ impl RuleValidate for LifecycleRule { pub trait Lifecycle { async fn has_transition(&self) -> bool; fn has_expiry(&self) -> bool; - async fn has_active_rules(&self, prefix: &str) -> bool; + fn has_active_rules(&self, prefix: &str) -> bool; async fn validate(&self, lr: &ObjectLockConfiguration) -> Result<(), std::io::Error>; 
async fn filter_rules(&self, obj: &ObjectOpts) -> Option>; async fn eval(&self, obj: &ObjectOpts) -> Event; - async fn eval_inner(&self, obj: &ObjectOpts, now: OffsetDateTime) -> Event; + async fn eval_inner(&self, obj: &ObjectOpts, now: OffsetDateTime, newer_noncurrent_versions: usize) -> Event; //fn set_prediction_headers(&self, w: http.ResponseWriter, obj: ObjectOpts); async fn noncurrent_versions_expiration_limit(self: Arc, obj: &ObjectOpts) -> Event; } @@ -160,7 +164,7 @@ impl Lifecycle for BucketLifecycleConfiguration { false } - async fn has_active_rules(&self, prefix: &str) -> bool { + fn has_active_rules(&self, prefix: &str) -> bool { if self.rules.len() == 0 { return false; } @@ -274,16 +278,26 @@ impl Lifecycle for BucketLifecycleConfiguration { } async fn eval(&self, obj: &ObjectOpts) -> Event { - self.eval_inner(obj, OffsetDateTime::now_utc()).await + self.eval_inner(obj, OffsetDateTime::now_utc(), 0).await } - async fn eval_inner(&self, obj: &ObjectOpts, now: OffsetDateTime) -> Event { + async fn eval_inner(&self, obj: &ObjectOpts, now: OffsetDateTime, newer_noncurrent_versions: usize) -> Event { let mut events = Vec::::new(); info!( "eval_inner: object={}, mod_time={:?}, now={:?}, is_latest={}, delete_marker={}", obj.name, obj.mod_time, now, obj.is_latest, obj.delete_marker ); - if obj.mod_time.expect("err").unix_timestamp() == 0 { + + // Gracefully handle missing mod_time instead of panicking + let mod_time = match obj.mod_time { + Some(t) => t, + None => { + info!("eval_inner: mod_time is None for object={}, returning default event", obj.name); + return Event::default(); + } + }; + + if mod_time.unix_timestamp() == 0 { info!("eval_inner: mod_time is 0, returning default event"); return Event::default(); } @@ -323,7 +337,7 @@ impl Lifecycle for BucketLifecycleConfiguration { } if let Some(days) = expiration.days { - let expected_expiry = expected_expiry_time(obj.mod_time.unwrap(), days /*, date*/); + let expected_expiry = expected_expiry_time(mod_time, days /*, date*/); if now.unix_timestamp() >= expected_expiry.unix_timestamp() { events.push(Event { action: IlmAction::DeleteVersionAction, @@ -426,10 +440,10 @@ impl Lifecycle for BucketLifecycleConfiguration { obj.is_latest, obj.delete_marker, obj.version_id, - (obj.is_latest || obj.version_id.is_empty()) && !obj.delete_marker + (obj.is_latest || obj.version_id.is_none_or(|v| v.is_nil())) && !obj.delete_marker ); // Allow expiration for latest objects OR non-versioned objects (empty version_id) - if (obj.is_latest || obj.version_id.is_empty()) && !obj.delete_marker { + if (obj.is_latest || obj.version_id.is_none_or(|v| v.is_nil())) && !obj.delete_marker { info!("eval_inner: entering expiration check"); if let Some(ref expiration) = rule.expiration { if let Some(ref date) = expiration.date { @@ -446,11 +460,11 @@ impl Lifecycle for BucketLifecycleConfiguration { }); } } else if let Some(days) = expiration.days { - let expected_expiry: OffsetDateTime = expected_expiry_time(obj.mod_time.unwrap(), days); + let expected_expiry: OffsetDateTime = expected_expiry_time(mod_time, days); info!( "eval_inner: expiration check - days={}, obj_time={:?}, expiry_time={:?}, now={:?}, should_expire={}", days, - obj.mod_time.expect("err!"), + mod_time, expected_expiry, now, now.unix_timestamp() > expected_expiry.unix_timestamp() @@ -649,7 +663,7 @@ pub struct ObjectOpts { pub user_tags: String, pub mod_time: Option, pub size: usize, - pub version_id: String, + pub version_id: Option, pub is_latest: bool, pub delete_marker: bool, pub 
num_versions: usize, @@ -659,12 +673,37 @@ pub struct ObjectOpts { pub restore_expires: Option, pub versioned: bool, pub version_suspended: bool, + pub user_defined: HashMap, + pub version_purge_status: VersionPurgeStatusType, + pub replication_status: ReplicationStatusType, } impl ObjectOpts { pub fn expired_object_deletemarker(&self) -> bool { self.delete_marker && self.num_versions == 1 } + + pub fn from_object_info(oi: &ObjectInfo) -> Self { + Self { + name: oi.name.clone(), + user_tags: oi.user_tags.clone(), + mod_time: oi.mod_time, + size: oi.size as usize, + version_id: oi.version_id.clone(), + is_latest: oi.is_latest, + delete_marker: oi.delete_marker, + num_versions: oi.num_versions, + successor_mod_time: oi.successor_mod_time, + transition_status: oi.transitioned_object.status.clone(), + restore_ongoing: oi.restore_ongoing, + restore_expires: oi.restore_expires, + versioned: false, + version_suspended: false, + user_defined: oi.user_defined.clone(), + version_purge_status: oi.version_purge_status.clone(), + replication_status: oi.replication_status.clone(), + } + } } #[derive(Debug, Clone)] diff --git a/crates/ecstore/src/bucket/lifecycle/mod.rs b/crates/ecstore/src/bucket/lifecycle/mod.rs index 6189d6c6..2eed2bfc 100644 --- a/crates/ecstore/src/bucket/lifecycle/mod.rs +++ b/crates/ecstore/src/bucket/lifecycle/mod.rs @@ -14,6 +14,7 @@ pub mod bucket_lifecycle_audit; pub mod bucket_lifecycle_ops; +pub mod evaluator; pub mod lifecycle; pub mod rule; pub mod tier_last_day_stats; diff --git a/crates/ecstore/src/bucket/lifecycle/tier_sweeper.rs b/crates/ecstore/src/bucket/lifecycle/tier_sweeper.rs index 26f94031..35a87a0b 100644 --- a/crates/ecstore/src/bucket/lifecycle/tier_sweeper.rs +++ b/crates/ecstore/src/bucket/lifecycle/tier_sweeper.rs @@ -21,6 +21,7 @@ use sha2::{Digest, Sha256}; use std::any::Any; use std::io::Write; +use uuid::Uuid; use xxhash_rust::xxh64; use super::bucket_lifecycle_ops::{ExpiryOp, GLOBAL_ExpiryState, TransitionedObject}; @@ -34,7 +35,7 @@ static XXHASH_SEED: u64 = 0; struct ObjSweeper { object: String, bucket: String, - version_id: String, + version_id: Option, versioned: bool, suspended: bool, transition_status: String, @@ -54,8 +55,8 @@ impl ObjSweeper { }) } - pub fn with_version(&mut self, vid: String) -> &Self { - self.version_id = vid; + pub fn with_version(&mut self, vid: Option) -> &Self { + self.version_id = vid.clone(); self } @@ -72,8 +73,8 @@ impl ObjSweeper { version_suspended: self.suspended, ..Default::default() }; - if self.suspended && self.version_id == "" { - opts.version_id = String::from(""); + if self.suspended && self.version_id.is_none_or(|v| v.is_nil()) { + opts.version_id = None; } opts } @@ -94,7 +95,7 @@ impl ObjSweeper { if !self.versioned || self.suspended { // 1, 2.a, 2.b del_tier = true; - } else if self.versioned && self.version_id != "" { + } else if self.versioned && self.version_id.is_some_and(|v| !v.is_nil()) { // 3.a del_tier = true; } diff --git a/crates/ecstore/src/bucket/metadata_sys.rs b/crates/ecstore/src/bucket/metadata_sys.rs index 395a8b76..d11ede91 100644 --- a/crates/ecstore/src/bucket/metadata_sys.rs +++ b/crates/ecstore/src/bucket/metadata_sys.rs @@ -175,6 +175,13 @@ pub async fn created_at(bucket: &str) -> Result { bucket_meta_sys.created_at(bucket).await } +pub async fn list_bucket_targets(bucket: &str) -> Result { + let bucket_meta_sys_lock = get_bucket_metadata_sys()?; + let bucket_meta_sys = bucket_meta_sys_lock.read().await; + + bucket_meta_sys.get_bucket_targets_config(bucket).await +} + 
#[derive(Debug)] pub struct BucketMetadataSys { metadata_map: RwLock>>, diff --git a/crates/ecstore/src/bucket/object_lock/mod.rs b/crates/ecstore/src/bucket/object_lock/mod.rs index cb562dfc..0a821f81 100644 --- a/crates/ecstore/src/bucket/object_lock/mod.rs +++ b/crates/ecstore/src/bucket/object_lock/mod.rs @@ -15,7 +15,7 @@ pub mod objectlock; pub mod objectlock_sys; -use s3s::dto::{ObjectLockConfiguration, ObjectLockEnabled}; +use s3s::dto::{ObjectLockConfiguration, ObjectLockEnabled, ObjectLockLegalHoldStatus}; pub trait ObjectLockApi { fn enabled(&self) -> bool; @@ -28,3 +28,13 @@ impl ObjectLockApi for ObjectLockConfiguration { .is_some_and(|v| v.as_str() == ObjectLockEnabled::ENABLED) } } + +pub trait ObjectLockStatusExt { + fn valid(&self) -> bool; +} + +impl ObjectLockStatusExt for ObjectLockLegalHoldStatus { + fn valid(&self) -> bool { + matches!(self.as_str(), ObjectLockLegalHoldStatus::ON | ObjectLockLegalHoldStatus::OFF) + } +} diff --git a/crates/ecstore/src/bucket/policy_sys.rs b/crates/ecstore/src/bucket/policy_sys.rs index 6f1b68c3..14e54252 100644 --- a/crates/ecstore/src/bucket/policy_sys.rs +++ b/crates/ecstore/src/bucket/policy_sys.rs @@ -22,7 +22,7 @@ pub struct PolicySys {} impl PolicySys { pub async fn is_allowed(args: &BucketPolicyArgs<'_>) -> bool { match Self::get(args.bucket).await { - Ok(cfg) => return cfg.is_allowed(args), + Ok(cfg) => return cfg.is_allowed(args).await, Err(err) => { if err != StorageError::ConfigNotFound { info!("config get err {:?}", err); diff --git a/crates/ecstore/src/bucket/replication/replication_pool.rs b/crates/ecstore/src/bucket/replication/replication_pool.rs index 84c3a6ae..70dd8900 100644 --- a/crates/ecstore/src/bucket/replication/replication_pool.rs +++ b/crates/ecstore/src/bucket/replication/replication_pool.rs @@ -9,8 +9,11 @@ use std::sync::Arc; use std::sync::atomic::AtomicI32; use std::sync::atomic::Ordering; +use crate::bucket::bucket_target_sys::BucketTargetSys; +use crate::bucket::metadata_sys; use crate::bucket::replication::replication_resyncer::{ - BucketReplicationResyncStatus, DeletedObjectReplicationInfo, ReplicationResyncer, + BucketReplicationResyncStatus, DeletedObjectReplicationInfo, ReplicationConfig, ReplicationResyncer, + get_heal_replicate_object_info, }; use crate::bucket::replication::replication_state::ReplicationStats; use crate::config::com::read_config; @@ -26,8 +29,10 @@ use rustfs_filemeta::ReplicationStatusType; use rustfs_filemeta::ReplicationType; use rustfs_filemeta::ReplicationWorkerOperation; use rustfs_filemeta::ResyncDecision; +use rustfs_filemeta::VersionPurgeStatusType; use rustfs_filemeta::replication_statuses_map; use rustfs_filemeta::version_purge_statuses_map; +use rustfs_filemeta::{REPLICATE_EXISTING, REPLICATE_HEAL, REPLICATE_HEAL_DELETE}; use rustfs_utils::http::RESERVED_METADATA_PREFIX_LOWER; use time::OffsetDateTime; use time::format_description::well_known::Rfc3339; @@ -1033,3 +1038,152 @@ pub async fn schedule_replication_delete(dv: DeletedObjectReplicationInfo) { } } } + +/// QueueReplicationHeal is a wrapper for queue_replication_heal_internal +pub async fn queue_replication_heal(bucket: &str, oi: ObjectInfo, retry_count: u32) { + // ignore modtime zero objects + if oi.mod_time.is_none() || oi.mod_time == Some(OffsetDateTime::UNIX_EPOCH) { + return; + } + + let rcfg = match metadata_sys::get_replication_config(bucket).await { + Ok((config, _)) => config, + Err(err) => { + warn!("Failed to get replication config for bucket {}: {}", bucket, err); + + return; + } + }; + + let 
tgts = match BucketTargetSys::get().list_bucket_targets(bucket).await { + Ok(targets) => Some(targets), + Err(err) => { + warn!("Failed to list bucket targets for bucket {}: {}", bucket, err); + None + } + }; + + let rcfg_wrapper = ReplicationConfig::new(Some(rcfg), tgts); + queue_replication_heal_internal(bucket, oi, rcfg_wrapper, retry_count).await; +} + +/// queue_replication_heal_internal enqueues objects that failed replication OR eligible for resyncing through +/// an ongoing resync operation or via existing objects replication configuration setting. +pub async fn queue_replication_heal_internal( + _bucket: &str, + oi: ObjectInfo, + rcfg: ReplicationConfig, + retry_count: u32, +) -> ReplicateObjectInfo { + let mut roi = ReplicateObjectInfo::default(); + + // ignore modtime zero objects + if oi.mod_time.is_none() || oi.mod_time == Some(OffsetDateTime::UNIX_EPOCH) { + return roi; + } + + if rcfg.config.is_none() || rcfg.remotes.is_none() { + return roi; + } + + roi = get_heal_replicate_object_info(&oi, &rcfg).await; + roi.retry_count = retry_count; + + if !roi.dsc.replicate_any() { + return roi; + } + + // early return if replication already done, otherwise we need to determine if this + // version is an existing object that needs healing. + if roi.replication_status == ReplicationStatusType::Completed + && roi.version_purge_status.is_empty() + && !roi.existing_obj_resync.must_resync() + { + return roi; + } + + if roi.delete_marker || !roi.version_purge_status.is_empty() { + let (version_id, dm_version_id) = if roi.version_purge_status.is_empty() { + (None, roi.version_id) + } else { + (roi.version_id, None) + }; + + let dv = DeletedObjectReplicationInfo { + delete_object: crate::store_api::DeletedObject { + object_name: roi.name.clone(), + delete_marker_version_id: dm_version_id, + version_id, + replication_state: roi.replication_state.clone(), + delete_marker_mtime: roi.mod_time, + delete_marker: roi.delete_marker, + ..Default::default() + }, + bucket: roi.bucket.clone(), + op_type: ReplicationType::Heal, + event_type: REPLICATE_HEAL_DELETE.to_string(), + ..Default::default() + }; + + // heal delete marker replication failure or versioned delete replication failure + if roi.replication_status == ReplicationStatusType::Pending + || roi.replication_status == ReplicationStatusType::Failed + || roi.version_purge_status == VersionPurgeStatusType::Failed + || roi.version_purge_status == VersionPurgeStatusType::Pending + { + if let Some(pool) = GLOBAL_REPLICATION_POOL.get() { + pool.queue_replica_delete_task(dv).await; + } + return roi; + } + + // if replication status is Complete on DeleteMarker and existing object resync required + let existing_obj_resync = roi.existing_obj_resync.clone(); + if existing_obj_resync.must_resync() + && (roi.replication_status == ReplicationStatusType::Completed || roi.replication_status.is_empty()) + { + queue_replicate_deletes_wrapper(dv, existing_obj_resync).await; + return roi; + } + + return roi; + } + + if roi.existing_obj_resync.must_resync() { + roi.op_type = ReplicationType::ExistingObject; + } + + match roi.replication_status { + ReplicationStatusType::Pending | ReplicationStatusType::Failed => { + roi.event_type = REPLICATE_HEAL.to_string(); + if let Some(pool) = GLOBAL_REPLICATION_POOL.get() { + pool.queue_replica_task(roi.clone()).await; + } + return roi; + } + _ => {} + } + + if roi.existing_obj_resync.must_resync() { + roi.event_type = REPLICATE_EXISTING.to_string(); + if let Some(pool) = GLOBAL_REPLICATION_POOL.get() { + 
pool.queue_replica_task(roi.clone()).await; + } + } + + roi +} + +/// Wrapper function for queueing replicate deletes with resync decision +async fn queue_replicate_deletes_wrapper(doi: DeletedObjectReplicationInfo, existing_obj_resync: ResyncDecision) { + for (k, v) in existing_obj_resync.targets.iter() { + if v.replicate { + let mut dv = doi.clone(); + dv.reset_id = v.reset_id.clone(); + dv.target_arn = k.clone(); + if let Some(pool) = GLOBAL_REPLICATION_POOL.get() { + pool.queue_replica_delete_task(dv).await; + } + } + } +} diff --git a/crates/ecstore/src/bucket/replication/replication_resyncer.rs b/crates/ecstore/src/bucket/replication/replication_resyncer.rs index eebc1ff8..d1cf7e8c 100644 --- a/crates/ecstore/src/bucket/replication/replication_resyncer.rs +++ b/crates/ecstore/src/bucket/replication/replication_resyncer.rs @@ -744,7 +744,7 @@ impl ReplicationWorkerOperation for DeletedObjectReplicationInfo { } } -#[derive(Debug, Clone, Default)] +#[derive(Debug, Clone, Default, Serialize, Deserialize)] pub struct ReplicationConfig { pub config: Option, pub remotes: Option, diff --git a/crates/ecstore/src/cache_value/metacache_set.rs b/crates/ecstore/src/cache_value/metacache_set.rs index 621ffea7..b71b2c30 100644 --- a/crates/ecstore/src/cache_value/metacache_set.rs +++ b/crates/ecstore/src/cache_value/metacache_set.rs @@ -16,7 +16,7 @@ use crate::disk::error::DiskError; use crate::disk::{self, DiskAPI, DiskStore, WalkDirOptions}; use futures::future::join_all; use rustfs_filemeta::{MetaCacheEntries, MetaCacheEntry, MetacacheReader, is_io_eof}; -use std::{future::Future, pin::Pin, sync::Arc}; +use std::{future::Future, pin::Pin}; use tokio::spawn; use tokio_util::sync::CancellationToken; use tracing::{error, info, warn}; @@ -71,14 +71,14 @@ pub async fn list_path_raw(rx: CancellationToken, opts: ListPathRawOptions) -> d let mut jobs: Vec>> = Vec::new(); let mut readers = Vec::with_capacity(opts.disks.len()); - let fds = Arc::new(opts.fallback_disks.clone()); + let fds = opts.fallback_disks.iter().flatten().cloned().collect::>(); let cancel_rx = CancellationToken::new(); for disk in opts.disks.iter() { let opdisk = disk.clone(); let opts_clone = opts.clone(); - let fds_clone = fds.clone(); + let mut fds_clone = fds.clone(); let cancel_rx_clone = cancel_rx.clone(); let (rd, mut wr) = tokio::io::duplex(64); readers.push(MetacacheReader::new(rd)); @@ -113,21 +113,20 @@ pub async fn list_path_raw(rx: CancellationToken, opts: ListPathRawOptions) -> d } while need_fallback { - // warn!("list_path_raw: while need_fallback start"); - let disk = match fds_clone.iter().find(|d| d.is_some()) { - Some(d) => { - if let Some(disk) = d.clone() { - disk - } else { - warn!("list_path_raw: fallback disk is none"); - break; - } - } - None => { - warn!("list_path_raw: fallback disk is none2"); - break; + let disk_op = { + if fds_clone.is_empty() { + None + } else { + let disk = fds_clone.remove(0); + if disk.is_online().await { Some(disk.clone()) } else { None } } }; + + let Some(disk) = disk_op else { + warn!("list_path_raw: fallback disk is none"); + break; + }; + match disk .as_ref() .walk_dir( diff --git a/crates/ecstore/src/checksum.rs b/crates/ecstore/src/checksum.rs deleted file mode 100644 index dd8be1e6..00000000 --- a/crates/ecstore/src/checksum.rs +++ /dev/null @@ -1,350 +0,0 @@ -#![allow(clippy::map_entry)] -// Copyright 2024 RustFS Team -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. 
-// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. -#![allow(unused_imports)] -#![allow(unused_variables)] -#![allow(unused_mut)] -#![allow(unused_assignments)] -#![allow(unused_must_use)] -#![allow(clippy::all)] - -use lazy_static::lazy_static; -use rustfs_checksums::ChecksumAlgorithm; -use std::collections::HashMap; - -use crate::client::{api_put_object::PutObjectOptions, api_s3_datatypes::ObjectPart}; -use crate::{disk::DiskAPI, store_api::GetObjectReader}; -use rustfs_utils::crypto::{base64_decode, base64_encode}; -use s3s::header::{ - X_AMZ_CHECKSUM_ALGORITHM, X_AMZ_CHECKSUM_CRC32, X_AMZ_CHECKSUM_CRC32C, X_AMZ_CHECKSUM_SHA1, X_AMZ_CHECKSUM_SHA256, -}; - -use enumset::{EnumSet, EnumSetType, enum_set}; - -#[derive(Debug, EnumSetType, Default)] -#[enumset(repr = "u8")] -pub enum ChecksumMode { - #[default] - ChecksumNone, - ChecksumSHA256, - ChecksumSHA1, - ChecksumCRC32, - ChecksumCRC32C, - ChecksumCRC64NVME, - ChecksumFullObject, -} - -lazy_static! { - static ref C_ChecksumMask: EnumSet = { - let mut s = EnumSet::all(); - s.remove(ChecksumMode::ChecksumFullObject); - s - }; - static ref C_ChecksumFullObjectCRC32: EnumSet = - enum_set!(ChecksumMode::ChecksumCRC32 | ChecksumMode::ChecksumFullObject); - static ref C_ChecksumFullObjectCRC32C: EnumSet = - enum_set!(ChecksumMode::ChecksumCRC32C | ChecksumMode::ChecksumFullObject); -} -const AMZ_CHECKSUM_CRC64NVME: &str = "x-amz-checksum-crc64nvme"; - -impl ChecksumMode { - //pub const CRC64_NVME_POLYNOMIAL: i64 = 0xad93d23594c93659; - - pub fn base(&self) -> ChecksumMode { - let s = EnumSet::from(*self).intersection(*C_ChecksumMask); - match s.as_u8() { - 1_u8 => ChecksumMode::ChecksumNone, - 2_u8 => ChecksumMode::ChecksumSHA256, - 4_u8 => ChecksumMode::ChecksumSHA1, - 8_u8 => ChecksumMode::ChecksumCRC32, - 16_u8 => ChecksumMode::ChecksumCRC32C, - 32_u8 => ChecksumMode::ChecksumCRC64NVME, - _ => panic!("enum err."), - } - } - - pub fn is(&self, t: ChecksumMode) -> bool { - *self & t == t - } - - pub fn key(&self) -> String { - //match c & checksumMask { - match self { - ChecksumMode::ChecksumCRC32 => { - return X_AMZ_CHECKSUM_CRC32.to_string(); - } - ChecksumMode::ChecksumCRC32C => { - return X_AMZ_CHECKSUM_CRC32C.to_string(); - } - ChecksumMode::ChecksumSHA1 => { - return X_AMZ_CHECKSUM_SHA1.to_string(); - } - ChecksumMode::ChecksumSHA256 => { - return X_AMZ_CHECKSUM_SHA256.to_string(); - } - ChecksumMode::ChecksumCRC64NVME => { - return AMZ_CHECKSUM_CRC64NVME.to_string(); - } - _ => { - return "".to_string(); - } - } - } - - pub fn can_composite(&self) -> bool { - let s = EnumSet::from(*self).intersection(*C_ChecksumMask); - match s.as_u8() { - 2_u8 => true, - 4_u8 => true, - 8_u8 => true, - 16_u8 => true, - _ => false, - } - } - - pub fn can_merge_crc(&self) -> bool { - let s = EnumSet::from(*self).intersection(*C_ChecksumMask); - match s.as_u8() { - 8_u8 => true, - 16_u8 => true, - 32_u8 => true, - _ => false, - } - } - - pub fn full_object_requested(&self) -> bool { - let s = EnumSet::from(*self).intersection(*C_ChecksumMask); - match s.as_u8() { - //C_ChecksumFullObjectCRC32 as u8 => true, - //C_ChecksumFullObjectCRC32C as u8 => true, - 32_u8 => true, - _ => false, - } 
- } - - pub fn key_capitalized(&self) -> String { - self.key() - } - - pub fn raw_byte_len(&self) -> usize { - let u = EnumSet::from(*self).intersection(*C_ChecksumMask).as_u8(); - if u == ChecksumMode::ChecksumCRC32 as u8 || u == ChecksumMode::ChecksumCRC32C as u8 { - 4 - } else if u == ChecksumMode::ChecksumSHA1 as u8 { - use sha1::Digest; - sha1::Sha1::output_size() as usize - } else if u == ChecksumMode::ChecksumSHA256 as u8 { - use sha2::Digest; - sha2::Sha256::output_size() as usize - } else if u == ChecksumMode::ChecksumCRC64NVME as u8 { - 8 - } else { - 0 - } - } - - pub fn hasher(&self) -> Result, std::io::Error> { - match /*C_ChecksumMask & **/self { - ChecksumMode::ChecksumCRC32 => { - return Ok(ChecksumAlgorithm::Crc32.into_impl()); - } - ChecksumMode::ChecksumCRC32C => { - return Ok(ChecksumAlgorithm::Crc32c.into_impl()); - } - ChecksumMode::ChecksumSHA1 => { - return Ok(ChecksumAlgorithm::Sha1.into_impl()); - } - ChecksumMode::ChecksumSHA256 => { - return Ok(ChecksumAlgorithm::Sha256.into_impl()); - } - ChecksumMode::ChecksumCRC64NVME => { - return Ok(ChecksumAlgorithm::Crc64Nvme.into_impl()); - } - _ => return Err(std::io::Error::other("unsupported checksum type")), - } - } - - pub fn is_set(&self) -> bool { - let s = EnumSet::from(*self).intersection(*C_ChecksumMask); - s.len() == 1 - } - - pub fn set_default(&mut self, t: ChecksumMode) { - if !self.is_set() { - *self = t; - } - } - - pub fn encode_to_string(&self, b: &[u8]) -> Result { - if !self.is_set() { - return Ok("".to_string()); - } - let mut h = self.hasher()?; - h.update(b); - let hash = h.finalize(); - Ok(base64_encode(hash.as_ref())) - } - - pub fn to_string(&self) -> String { - //match c & checksumMask { - match self { - ChecksumMode::ChecksumCRC32 => { - return "CRC32".to_string(); - } - ChecksumMode::ChecksumCRC32C => { - return "CRC32C".to_string(); - } - ChecksumMode::ChecksumSHA1 => { - return "SHA1".to_string(); - } - ChecksumMode::ChecksumSHA256 => { - return "SHA256".to_string(); - } - ChecksumMode::ChecksumNone => { - return "".to_string(); - } - ChecksumMode::ChecksumCRC64NVME => { - return "CRC64NVME".to_string(); - } - _ => { - return "".to_string(); - } - } - } - - // pub fn check_sum_reader(&self, r: GetObjectReader) -> Result { - // let mut h = self.hasher()?; - // Ok(Checksum::new(self.clone(), h.sum().as_bytes())) - // } - - // pub fn check_sum_bytes(&self, b: &[u8]) -> Result { - // let mut h = self.hasher()?; - // Ok(Checksum::new(self.clone(), h.sum().as_bytes())) - // } - - pub fn composite_checksum(&self, p: &mut [ObjectPart]) -> Result { - if !self.can_composite() { - return Err(std::io::Error::other("cannot do composite checksum")); - } - p.sort_by(|i, j| { - if i.part_num < j.part_num { - std::cmp::Ordering::Less - } else if i.part_num > j.part_num { - std::cmp::Ordering::Greater - } else { - std::cmp::Ordering::Equal - } - }); - let c = self.base(); - let crc_bytes = Vec::::with_capacity(p.len() * self.raw_byte_len() as usize); - let mut h = self.hasher()?; - h.update(crc_bytes.as_ref()); - let hash = h.finalize(); - Ok(Checksum { - checksum_type: self.clone(), - r: hash.as_ref().to_vec(), - computed: false, - }) - } - - pub fn full_object_checksum(&self, p: &mut [ObjectPart]) -> Result { - todo!(); - } -} - -#[derive(Default)] -pub struct Checksum { - checksum_type: ChecksumMode, - r: Vec, - computed: bool, -} - -#[allow(dead_code)] -impl Checksum { - fn new(t: ChecksumMode, b: &[u8]) -> Checksum { - if t.is_set() && b.len() == t.raw_byte_len() { - return Checksum { - checksum_type: 
t, - r: b.to_vec(), - computed: false, - }; - } - Checksum::default() - } - - #[allow(dead_code)] - fn new_checksum_string(t: ChecksumMode, s: &str) -> Result { - let b = match base64_decode(s.as_bytes()) { - Ok(b) => b, - Err(err) => return Err(std::io::Error::other(err.to_string())), - }; - if t.is_set() && b.len() == t.raw_byte_len() { - return Ok(Checksum { - checksum_type: t, - r: b, - computed: false, - }); - } - Ok(Checksum::default()) - } - - fn is_set(&self) -> bool { - self.checksum_type.is_set() && self.r.len() == self.checksum_type.raw_byte_len() - } - - fn encoded(&self) -> String { - if !self.is_set() { - return "".to_string(); - } - base64_encode(&self.r) - } - - #[allow(dead_code)] - fn raw(&self) -> Option> { - if !self.is_set() { - return None; - } - Some(self.r.clone()) - } -} - -pub fn add_auto_checksum_headers(opts: &mut PutObjectOptions) { - opts.user_metadata - .insert("X-Amz-Checksum-Algorithm".to_string(), opts.auto_checksum.to_string()); - if opts.auto_checksum.full_object_requested() { - opts.user_metadata - .insert("X-Amz-Checksum-Type".to_string(), "FULL_OBJECT".to_string()); - } -} - -pub fn apply_auto_checksum(opts: &mut PutObjectOptions, all_parts: &mut [ObjectPart]) -> Result<(), std::io::Error> { - if opts.auto_checksum.can_composite() && !opts.auto_checksum.is(ChecksumMode::ChecksumFullObject) { - let crc = opts.auto_checksum.composite_checksum(all_parts)?; - opts.user_metadata = { - let mut hm = HashMap::new(); - hm.insert(opts.auto_checksum.key(), crc.encoded()); - hm - } - } else if opts.auto_checksum.can_merge_crc() { - let crc = opts.auto_checksum.full_object_checksum(all_parts)?; - opts.user_metadata = { - let mut hm = HashMap::new(); - hm.insert(opts.auto_checksum.key_capitalized(), crc.encoded()); - hm.insert("X-Amz-Checksum-Type".to_string(), "FULL_OBJECT".to_string()); - hm - } - } - - Ok(()) -} diff --git a/crates/ecstore/src/chunk_stream.rs b/crates/ecstore/src/chunk_stream.rs deleted file mode 100644 index 41b3b2d9..00000000 --- a/crates/ecstore/src/chunk_stream.rs +++ /dev/null @@ -1,270 +0,0 @@ -// Copyright 2024 RustFS Team -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. 
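
The checksum module removed above treats a checksum request as a small bit set: one bit per base algorithm (CRC32, CRC32C, SHA1, SHA256, CRC64NVME) plus a `ChecksumFullObject` modifier, with `C_ChecksumMask` stripping the modifier so helpers like `base()` can recover the underlying algorithm. A minimal, std-only sketch of that mask-and-flag idea; the bit values and helper names here are illustrative, not the `enumset` representation the real code derives:

```rust
// Simplified model of the mask/flag layout used by `ChecksumMode`.
// Bit assignments are illustrative; the real code derives them via `enumset`.
const CRC32: u8 = 0b0000_0001;
const CRC32C: u8 = 0b0000_0010;
const SHA256: u8 = 0b0000_0100;
const FULL_OBJECT: u8 = 0b1000_0000; // modifier, not an algorithm
const ALGO_MASK: u8 = !FULL_OBJECT;  // analogous to C_ChecksumMask

fn base(mode: u8) -> u8 {
    // Drop the FULL_OBJECT modifier, keep only the base algorithm bit.
    mode & ALGO_MASK
}

fn full_object_requested(mode: u8) -> bool {
    mode & FULL_OBJECT != 0
}

fn main() {
    // "Full-object CRC32", in the spirit of C_ChecksumFullObjectCRC32.
    let mode = CRC32 | FULL_OBJECT;
    assert_eq!(base(mode), CRC32);
    assert!(full_object_requested(mode));
    assert!(!full_object_requested(CRC32C));
    println!("base algorithm bit: {:#010b}", base(mode));
}
```
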
- -// use crate::error::StdError; -// use bytes::Bytes; -// use futures::pin_mut; -// use futures::stream::{Stream, StreamExt}; -// use std::future::Future; -// use std::pin::Pin; -// use std::task::{Context, Poll}; -// use transform_stream::AsyncTryStream; - -// pub type SyncBoxFuture<'a, T> = Pin + Send + Sync + 'a>>; - -// pub struct ChunkedStream<'a> { -// /// inner -// inner: AsyncTryStream>>, - -// remaining_length: usize, -// } - -// impl<'a> ChunkedStream<'a> { -// pub fn new(body: S, content_length: usize, chunk_size: usize, need_padding: bool) -> Self -// where -// S: Stream> + Send + Sync + 'a, -// { -// let inner = AsyncTryStream::<_, _, SyncBoxFuture<'a, Result<(), StdError>>>::new(|mut y| { -// #[allow(clippy::shadow_same)] // necessary for `pin_mut!` -// Box::pin(async move { -// pin_mut!(body); -// // Data left over from the previous call -// let mut prev_bytes = Bytes::new(); -// let mut read_size = 0; - -// loop { -// let data: Vec = { -// // Read a fixed-size chunk -// match Self::read_data(body.as_mut(), prev_bytes, chunk_size).await { -// None => break, -// Some(Err(e)) => return Err(e), -// Some(Ok((data, remaining_bytes))) => { -// // debug!( -// // "content_length:{},read_size:{}, read_data data:{}, remaining_bytes: {} ", -// // content_length, -// // read_size, -// // data.len(), -// // remaining_bytes.len() -// // ); - -// prev_bytes = remaining_bytes; -// data -// } -// } -// }; - -// for bytes in data { -// read_size += bytes.len(); -// // debug!("read_size {}, content_length {}", read_size, content_length,); -// y.yield_ok(bytes).await; -// } - -// if read_size + prev_bytes.len() >= content_length { -// // debug!( -// // "Finished reading: read_size:{} + prev_bytes.len({}) == content_length {}", -// // read_size, -// // prev_bytes.len(), -// // content_length, -// // ); - -// // Pad with zeros? 
-// if !need_padding { -// y.yield_ok(prev_bytes).await; -// break; -// } - -// let mut bytes = vec![0u8; chunk_size]; -// let (left, _) = bytes.split_at_mut(prev_bytes.len()); -// left.copy_from_slice(&prev_bytes); - -// y.yield_ok(Bytes::from(bytes)).await; - -// break; -// } -// } - -// // debug!("chunked stream exit"); - -// Ok(()) -// }) -// }); -// Self { -// inner, -// remaining_length: content_length, -// } -// } -// /// read data and return remaining bytes -// async fn read_data( -// mut body: Pin<&mut S>, -// prev_bytes: Bytes, -// data_size: usize, -// ) -> Option, Bytes), StdError>> -// where -// S: Stream> + Send, -// { -// let mut bytes_buffer = Vec::new(); - -// // Run only once -// let mut push_data_bytes = |mut bytes: Bytes| { -// // debug!("read from body {} split per {}, prev_bytes: {}", bytes.len(), data_size, prev_bytes.len()); - -// if bytes.is_empty() { -// return None; -// } - -// if data_size == 0 { -// return Some(bytes); -// } - -// // Merge with the previous data -// if !prev_bytes.is_empty() { -// let need_size = data_size.wrapping_sub(prev_bytes.len()); -// // debug!( -// // "Previous leftover {}, take {} now, total: {}", -// // prev_bytes.len(), -// // need_size, -// // prev_bytes.len() + need_size -// // ); -// if bytes.len() >= need_size { -// let data = bytes.split_to(need_size); -// let mut combined = Vec::new(); -// combined.extend_from_slice(&prev_bytes); -// combined.extend_from_slice(&data); - -// // debug!( -// // "Fetched more bytes than needed: {}, merged result {}, remaining bytes {}", -// // need_size, -// // combined.len(), -// // bytes.len(), -// // ); - -// bytes_buffer.push(Bytes::from(combined)); -// } else { -// let mut combined = Vec::new(); -// combined.extend_from_slice(&prev_bytes); -// combined.extend_from_slice(&bytes); - -// // debug!( -// // "Fetched fewer bytes than needed: {}, merged result {}, remaining bytes {}, return immediately", -// // need_size, -// // combined.len(), -// // bytes.len(), -// // ); - -// return Some(Bytes::from(combined)); -// } -// } - -// // If the fetched data exceeds the chunk, slice the required size -// if data_size <= bytes.len() { -// let n = bytes.len() / data_size; - -// for _ in 0..n { -// let data = bytes.split_to(data_size); - -// // println!("bytes_buffer.push: {}, remaining: {}", data.len(), bytes.len()); -// bytes_buffer.push(data); -// } - -// Some(bytes) -// } else { -// // Insufficient data -// Some(bytes) -// } -// }; - -// // Remaining data -// let remaining_bytes = 'outer: { -// // // Exit if the previous data was sufficient -// // if let Some(remaining_bytes) = push_data_bytes(prev_bytes) { -// // println!("Consuming leftovers"); -// // break 'outer remaining_bytes; -// // } - -// loop { -// match body.next().await? 
{ -// Err(e) => return Some(Err(e)), -// Ok(bytes) => { -// if let Some(remaining_bytes) = push_data_bytes(bytes) { -// break 'outer remaining_bytes; -// } -// } -// } -// } -// }; - -// Some(Ok((bytes_buffer, remaining_bytes))) -// } - -// fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll>> { -// let ans = Pin::new(&mut self.inner).poll_next(cx); -// if let Poll::Ready(Some(Ok(ref bytes))) = ans { -// self.remaining_length = self.remaining_length.saturating_sub(bytes.len()); -// } -// ans -// } - -// // pub fn exact_remaining_length(&self) -> usize { -// // self.remaining_length -// // } -// } - -// impl Stream for ChunkedStream<'_> { -// type Item = Result; - -// fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll> { -// self.poll(cx) -// } - -// fn size_hint(&self) -> (usize, Option) { -// (0, None) -// } -// } - -// #[cfg(test)] -// mod test { - -// use super::*; - -// #[tokio::test] -// async fn test_chunked_stream() { -// let chunk_size = 4; - -// let data1 = vec![1u8; 7777]; // 65536 -// let data2 = vec![1u8; 7777]; // 65536 - -// let content_length = data1.len() + data2.len(); - -// let chunk1 = Bytes::from(data1); -// let chunk2 = Bytes::from(data2); - -// let chunk_results: Vec> = vec![Ok(chunk1), Ok(chunk2)]; - -// let stream = futures::stream::iter(chunk_results); - -// let mut chunked_stream = ChunkedStream::new(stream, content_length, chunk_size, true); - -// loop { -// let ans1 = chunked_stream.next().await; -// if ans1.is_none() { -// break; -// } - -// let bytes = ans1.unwrap().unwrap(); -// assert!(bytes.len() == chunk_size) -// } - -// // assert_eq!(ans1.unwrap(), chunk1_data.as_slice()); -// } -// } diff --git a/crates/ecstore/src/client/api_get_object_acl.rs b/crates/ecstore/src/client/api_get_object_acl.rs index 1e811512..e0ef8ddb 100644 --- a/crates/ecstore/src/client/api_get_object_acl.rs +++ b/crates/ecstore/src/client/api_get_object_acl.rs @@ -18,19 +18,17 @@ #![allow(unused_must_use)] #![allow(clippy::all)] +use crate::client::{ + api_error_response::http_resp_to_error_response, + api_get_options::GetObjectOptions, + transition_api::{ObjectInfo, ReaderImpl, RequestMetadata, TransitionClient}, +}; use bytes::Bytes; use http::{HeaderMap, HeaderValue}; +use rustfs_config::MAX_S3_CLIENT_RESPONSE_SIZE; +use rustfs_utils::EMPTY_STRING_SHA256_HASH; use s3s::dto::Owner; use std::collections::HashMap; -use std::io::Cursor; -use tokio::io::BufReader; - -use crate::client::{ - api_error_response::{err_invalid_argument, http_resp_to_error_response}, - api_get_options::GetObjectOptions, - transition_api::{ObjectInfo, ReadCloser, ReaderImpl, RequestMetadata, TransitionClient, to_object_info}, -}; -use rustfs_utils::EMPTY_STRING_SHA256_HASH; #[derive(Clone, Debug, Default, serde::Serialize, serde::Deserialize)] pub struct Grantee { @@ -90,7 +88,12 @@ impl TransitionClient { return Err(std::io::Error::other(http_resp_to_error_response(&resp, b, bucket_name, object_name))); } - let b = resp.body_mut().store_all_unlimited().await.unwrap().to_vec(); + let b = resp + .body_mut() + .store_all_limited(MAX_S3_CLIENT_RESPONSE_SIZE) + .await + .unwrap() + .to_vec(); let mut res = match quick_xml::de::from_str::(&String::from_utf8(b).unwrap()) { Ok(result) => result, Err(err) => { diff --git a/crates/ecstore/src/client/api_get_object_attributes.rs b/crates/ecstore/src/client/api_get_object_attributes.rs index fd8015ad..874a0968 100644 --- a/crates/ecstore/src/client/api_get_object_attributes.rs +++ b/crates/ecstore/src/client/api_get_object_attributes.rs @@ 
-21,24 +21,17 @@ use bytes::Bytes; use http::{HeaderMap, HeaderValue}; use std::collections::HashMap; -use std::io::Cursor; use time::OffsetDateTime; -use tokio::io::BufReader; use crate::client::constants::{GET_OBJECT_ATTRIBUTES_MAX_PARTS, GET_OBJECT_ATTRIBUTES_TAGS, ISO8601_DATEFORMAT}; -use rustfs_utils::EMPTY_STRING_SHA256_HASH; -use s3s::header::{ - X_AMZ_DELETE_MARKER, X_AMZ_MAX_PARTS, X_AMZ_METADATA_DIRECTIVE, X_AMZ_OBJECT_ATTRIBUTES, X_AMZ_PART_NUMBER_MARKER, - X_AMZ_REQUEST_CHARGED, X_AMZ_RESTORE, X_AMZ_VERSION_ID, -}; -use s3s::{Body, dto::Owner}; - use crate::client::{ - api_error_response::err_invalid_argument, api_get_object_acl::AccessControlPolicy, - api_get_options::GetObjectOptions, - transition_api::{ObjectInfo, ReadCloser, ReaderImpl, RequestMetadata, TransitionClient, to_object_info}, + transition_api::{ReaderImpl, RequestMetadata, TransitionClient}, }; +use rustfs_config::MAX_S3_CLIENT_RESPONSE_SIZE; +use rustfs_utils::EMPTY_STRING_SHA256_HASH; +use s3s::Body; +use s3s::header::{X_AMZ_MAX_PARTS, X_AMZ_OBJECT_ATTRIBUTES, X_AMZ_PART_NUMBER_MARKER, X_AMZ_VERSION_ID}; pub struct ObjectAttributesOptions { pub max_parts: i64, @@ -143,7 +136,12 @@ impl ObjectAttributes { self.last_modified = mod_time; self.version_id = h.get(X_AMZ_VERSION_ID).unwrap().to_str().unwrap().to_string(); - let b = resp.body_mut().store_all_unlimited().await.unwrap().to_vec(); + let b = resp + .body_mut() + .store_all_limited(MAX_S3_CLIENT_RESPONSE_SIZE) + .await + .unwrap() + .to_vec(); let mut response = match quick_xml::de::from_str::(&String::from_utf8(b).unwrap()) { Ok(result) => result, Err(err) => { @@ -224,7 +222,12 @@ impl TransitionClient { } if resp.status() != http::StatusCode::OK { - let b = resp.body_mut().store_all_unlimited().await.unwrap().to_vec(); + let b = resp + .body_mut() + .store_all_limited(MAX_S3_CLIENT_RESPONSE_SIZE) + .await + .unwrap() + .to_vec(); let err_body = String::from_utf8(b).unwrap(); let mut er = match quick_xml::de::from_str::(&err_body) { Ok(result) => result, diff --git a/crates/ecstore/src/client/api_list.rs b/crates/ecstore/src/client/api_list.rs index fdbffc68..73839025 100644 --- a/crates/ecstore/src/client/api_list.rs +++ b/crates/ecstore/src/client/api_list.rs @@ -18,10 +18,6 @@ #![allow(unused_must_use)] #![allow(clippy::all)] -use bytes::Bytes; -use http::{HeaderMap, StatusCode}; -use std::collections::HashMap; - use crate::client::{ api_error_response::http_resp_to_error_response, api_s3_datatypes::{ @@ -31,7 +27,11 @@ use crate::client::{ transition_api::{ReaderImpl, RequestMetadata, TransitionClient}, }; use crate::store_api::BucketInfo; +use bytes::Bytes; +use http::{HeaderMap, StatusCode}; +use rustfs_config::MAX_S3_CLIENT_RESPONSE_SIZE; use rustfs_utils::hash::EMPTY_STRING_SHA256_HASH; +use std::collections::HashMap; impl TransitionClient { pub fn list_buckets(&self) -> Result, std::io::Error> { @@ -102,7 +102,12 @@ impl TransitionClient { } //let mut list_bucket_result = ListBucketV2Result::default(); - let b = resp.body_mut().store_all_unlimited().await.unwrap().to_vec(); + let b = resp + .body_mut() + .store_all_limited(MAX_S3_CLIENT_RESPONSE_SIZE) + .await + .unwrap() + .to_vec(); let mut list_bucket_result = match quick_xml::de::from_str::(&String::from_utf8(b).unwrap()) { Ok(result) => result, Err(err) => { diff --git a/crates/ecstore/src/client/bucket_cache.rs b/crates/ecstore/src/client/bucket_cache.rs index 8bd22605..6db43358 100644 --- a/crates/ecstore/src/client/bucket_cache.rs +++ b/crates/ecstore/src/client/bucket_cache.rs @@ 
-18,23 +18,19 @@ #![allow(unused_must_use)] #![allow(clippy::all)] -use http::Request; -use hyper::StatusCode; -use hyper::body::Incoming; -use std::{collections::HashMap, sync::Arc}; -use tracing::warn; -use tracing::{debug, error, info}; - +use super::constants::UNSIGNED_PAYLOAD; +use super::credentials::SignatureType; use crate::client::{ - api_error_response::{http_resp_to_error_response, to_error_response}, + api_error_response::http_resp_to_error_response, transition_api::{CreateBucketConfiguration, LocationConstraint, TransitionClient}, }; +use http::Request; +use hyper::StatusCode; +use rustfs_config::MAX_S3_CLIENT_RESPONSE_SIZE; use rustfs_utils::hash::EMPTY_STRING_SHA256_HASH; use s3s::Body; use s3s::S3ErrorCode; - -use super::constants::UNSIGNED_PAYLOAD; -use super::credentials::SignatureType; +use std::collections::HashMap; #[derive(Debug, Clone)] pub struct BucketLocationCache { @@ -212,7 +208,12 @@ async fn process_bucket_location_response( } //} - let b = resp.body_mut().store_all_unlimited().await.unwrap().to_vec(); + let b = resp + .body_mut() + .store_all_limited(MAX_S3_CLIENT_RESPONSE_SIZE) + .await + .unwrap() + .to_vec(); let mut location = "".to_string(); if tier_type == "huaweicloud" { let d = quick_xml::de::from_str::(&String::from_utf8(b).unwrap()).unwrap(); diff --git a/crates/ecstore/src/client/hook_reader.rs b/crates/ecstore/src/client/hook_reader.rs deleted file mode 100644 index 38d2c3f8..00000000 --- a/crates/ecstore/src/client/hook_reader.rs +++ /dev/null @@ -1,59 +0,0 @@ -#![allow(clippy::map_entry)] -// Copyright 2024 RustFS Team -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. 
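
The recurring change across these client hunks replaces `store_all_unlimited()` with `store_all_limited(MAX_S3_CLIENT_RESPONSE_SIZE)`, so error and XML response bodies from a remote tier are buffered only up to a configured cap rather than without bound. A std-only sketch of the capping idea; this is not the `s3s::Body` API, just one reasonable shape for the check, and the helper name is mine:

```rust
use std::io::{Cursor, Read};

/// Read at most `limit` bytes from `r`, returning an error if the payload is larger.
/// A sketch of the intent behind `store_all_limited(MAX_S3_CLIENT_RESPONSE_SIZE)`:
/// never buffer an unbounded remote response before handing it to the XML parser.
fn read_all_limited<R: Read>(mut r: R, limit: u64) -> std::io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    // take(limit + 1) lets the caller detect "too large" instead of silently truncating.
    r.by_ref().take(limit + 1).read_to_end(&mut buf)?;
    if buf.len() as u64 > limit {
        return Err(std::io::Error::other("response body exceeds configured limit"));
    }
    Ok(buf)
}

fn main() -> std::io::Result<()> {
    let small = Cursor::new(vec![1u8; 16]);
    assert_eq!(read_all_limited(small, 64)?.len(), 16);

    let big = Cursor::new(vec![1u8; 128]);
    assert!(read_all_limited(big, 64).is_err());
    Ok(())
}
```

Erroring out past the cap, instead of silently truncating, also keeps a cut-off XML document from reaching the deserializer.
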
- -use std::{collections::HashMap, sync::Arc}; - -use crate::{ - disk::{ - error::{is_unformatted_disk, DiskError}, - format::{DistributionAlgoVersion, FormatV3}, - new_disk, DiskAPI, DiskInfo, DiskOption, DiskStore, - }, - store_api::{ - BucketInfo, BucketOptions, CompletePart, DeleteBucketOptions, DeletedObject, GetObjectReader, HTTPRangeSpec, - ListMultipartsInfo, ListObjectVersionsInfo, ListObjectsV2Info, MakeBucketOptions, MultipartInfo, MultipartUploadResult, - ObjectIO, ObjectInfo, ObjectOptions, ObjectToDelete, PartInfo, PutObjReader, StorageAPI, - }, - credentials::{Credentials, SignatureType,}, - api_put_object_multipart::UploadPartParams, -}; - -use http::HeaderMap; -use tokio_util::sync::CancellationToken; -use tracing::warn; -use tracing::{error, info}; -use url::Url; - -struct HookReader { - source: GetObjectReader, - hook: GetObjectReader, -} - -impl HookReader { - pub fn new(source: GetObjectReader, hook: GetObjectReader) -> HookReader { - HookReader { - source, - hook, - } - } - - fn seek(&self, offset: i64, whence: i64) -> Result { - todo!(); - } - - fn read(&self, b: &[u8]) -> Result { - todo!(); - } -} \ No newline at end of file diff --git a/crates/ecstore/src/client/transition_api.rs b/crates/ecstore/src/client/transition_api.rs index c0d7092f..2be5d7c2 100644 --- a/crates/ecstore/src/client/transition_api.rs +++ b/crates/ecstore/src/client/transition_api.rs @@ -18,6 +18,20 @@ #![allow(unused_must_use)] #![allow(clippy::all)] +use crate::client::bucket_cache::BucketLocationCache; +use crate::client::{ + api_error_response::{err_invalid_argument, http_resp_to_error_response, to_error_response}, + api_get_options::GetObjectOptions, + api_put_object::PutObjectOptions, + api_put_object_multipart::UploadPartParams, + api_s3_datatypes::{ + CompleteMultipartUpload, CompletePart, ListBucketResult, ListBucketV2Result, ListMultipartUploadsResult, + ListObjectPartsResult, ObjectPart, + }, + constants::{UNSIGNED_PAYLOAD, UNSIGNED_PAYLOAD_TRAILER}, + credentials::{CredContext, Credentials, SignatureType, Static}, +}; +use crate::{client::checksum::ChecksumMode, store_api::GetObjectReader}; use bytes::Bytes; use futures::{Future, StreamExt}; use http::{HeaderMap, HeaderName}; @@ -30,7 +44,18 @@ use hyper_util::{client::legacy::Client, client::legacy::connect::HttpConnector, use md5::Digest; use md5::Md5; use rand::Rng; +use rustfs_config::MAX_S3_CLIENT_RESPONSE_SIZE; +use rustfs_rio::HashReader; use rustfs_utils::HashAlgorithm; +use rustfs_utils::{ + net::get_endpoint_url, + retry::{ + DEFAULT_RETRY_CAP, DEFAULT_RETRY_UNIT, MAX_JITTER, MAX_RETRY, RetryTimer, is_http_status_retryable, is_s3code_retryable, + }, +}; +use s3s::S3ErrorCode; +use s3s::dto::ReplicationStatus; +use s3s::{Body, dto::Owner}; use serde::{Deserialize, Serialize}; use sha2::Sha256; use std::io::Cursor; @@ -48,31 +73,6 @@ use tracing::{debug, error, warn}; use url::{Url, form_urlencoded}; use uuid::Uuid; -use crate::client::bucket_cache::BucketLocationCache; -use crate::client::{ - api_error_response::{err_invalid_argument, http_resp_to_error_response, to_error_response}, - api_get_options::GetObjectOptions, - api_put_object::PutObjectOptions, - api_put_object_multipart::UploadPartParams, - api_s3_datatypes::{ - CompleteMultipartUpload, CompletePart, ListBucketResult, ListBucketV2Result, ListMultipartUploadsResult, - ListObjectPartsResult, ObjectPart, - }, - constants::{UNSIGNED_PAYLOAD, UNSIGNED_PAYLOAD_TRAILER}, - credentials::{CredContext, Credentials, SignatureType, Static}, -}; -use 
crate::{client::checksum::ChecksumMode, store_api::GetObjectReader}; -use rustfs_rio::HashReader; -use rustfs_utils::{ - net::get_endpoint_url, - retry::{ - DEFAULT_RETRY_CAP, DEFAULT_RETRY_UNIT, MAX_JITTER, MAX_RETRY, RetryTimer, is_http_status_retryable, is_s3code_retryable, - }, -}; -use s3s::S3ErrorCode; -use s3s::dto::ReplicationStatus; -use s3s::{Body, dto::Owner}; - const C_USER_AGENT: &str = "RustFS (linux; x86)"; const SUCCESS_STATUS: [StatusCode; 3] = [StatusCode::OK, StatusCode::NO_CONTENT, StatusCode::PARTIAL_CONTENT]; @@ -291,7 +291,12 @@ impl TransitionClient { //if self.is_trace_enabled && !(self.trace_errors_only && resp.status() == StatusCode::OK) { if resp.status() != StatusCode::OK { //self.dump_http(&cloned_req, &resp)?; - let b = resp.body_mut().store_all_unlimited().await.unwrap().to_vec(); + let b = resp + .body_mut() + .store_all_limited(MAX_S3_CLIENT_RESPONSE_SIZE) + .await + .unwrap() + .to_vec(); warn!("err_body: {}", String::from_utf8(b).unwrap()); } @@ -334,7 +339,12 @@ impl TransitionClient { } } - let b = resp.body_mut().store_all_unlimited().await.unwrap().to_vec(); + let b = resp + .body_mut() + .store_all_limited(MAX_S3_CLIENT_RESPONSE_SIZE) + .await + .unwrap() + .to_vec(); let mut err_response = http_resp_to_error_response(&resp, b.clone(), &metadata.bucket_name, &metadata.object_name); err_response.message = format!("remote tier error: {}", err_response.message); diff --git a/crates/ecstore/src/config/audit.rs b/crates/ecstore/src/config/audit.rs index afbab13b..f0c86403 100644 --- a/crates/ecstore/src/config/audit.rs +++ b/crates/ecstore/src/config/audit.rs @@ -14,7 +14,7 @@ use crate::config::{KV, KVS}; use rustfs_config::{ - COMMENT_KEY, DEFAULT_DIR, DEFAULT_LIMIT, ENABLE_KEY, EnableState, MQTT_BROKER, MQTT_KEEP_ALIVE_INTERVAL, MQTT_PASSWORD, + COMMENT_KEY, DEFAULT_LIMIT, ENABLE_KEY, EVENT_DEFAULT_DIR, EnableState, MQTT_BROKER, MQTT_KEEP_ALIVE_INTERVAL, MQTT_PASSWORD, MQTT_QOS, MQTT_QUEUE_DIR, MQTT_QUEUE_LIMIT, MQTT_RECONNECT_INTERVAL, MQTT_TOPIC, MQTT_USERNAME, WEBHOOK_AUTH_TOKEN, WEBHOOK_BATCH_SIZE, WEBHOOK_CLIENT_CERT, WEBHOOK_CLIENT_KEY, WEBHOOK_ENDPOINT, WEBHOOK_HTTP_TIMEOUT, WEBHOOK_MAX_RETRY, WEBHOOK_QUEUE_DIR, WEBHOOK_QUEUE_LIMIT, WEBHOOK_RETRY_INTERVAL, @@ -63,7 +63,7 @@ pub static DEFAULT_AUDIT_WEBHOOK_KVS: LazyLock = LazyLock::new(|| { }, KV { key: WEBHOOK_QUEUE_DIR.to_owned(), - value: DEFAULT_DIR.to_owned(), + value: EVENT_DEFAULT_DIR.to_owned(), hidden_if_empty: false, }, KV { @@ -131,7 +131,7 @@ pub static DEFAULT_AUDIT_MQTT_KVS: LazyLock = LazyLock::new(|| { }, KV { key: MQTT_QUEUE_DIR.to_owned(), - value: DEFAULT_DIR.to_owned(), + value: EVENT_DEFAULT_DIR.to_owned(), hidden_if_empty: false, }, KV { diff --git a/crates/ecstore/src/config/notify.rs b/crates/ecstore/src/config/notify.rs index 74157f52..c9ebf3ba 100644 --- a/crates/ecstore/src/config/notify.rs +++ b/crates/ecstore/src/config/notify.rs @@ -14,7 +14,7 @@ use crate::config::{KV, KVS}; use rustfs_config::{ - COMMENT_KEY, DEFAULT_DIR, DEFAULT_LIMIT, ENABLE_KEY, EnableState, MQTT_BROKER, MQTT_KEEP_ALIVE_INTERVAL, MQTT_PASSWORD, + COMMENT_KEY, DEFAULT_LIMIT, ENABLE_KEY, EVENT_DEFAULT_DIR, EnableState, MQTT_BROKER, MQTT_KEEP_ALIVE_INTERVAL, MQTT_PASSWORD, MQTT_QOS, MQTT_QUEUE_DIR, MQTT_QUEUE_LIMIT, MQTT_RECONNECT_INTERVAL, MQTT_TOPIC, MQTT_USERNAME, WEBHOOK_AUTH_TOKEN, WEBHOOK_CLIENT_CERT, WEBHOOK_CLIENT_KEY, WEBHOOK_ENDPOINT, WEBHOOK_QUEUE_DIR, WEBHOOK_QUEUE_LIMIT, }; @@ -47,7 +47,7 @@ pub static DEFAULT_NOTIFY_WEBHOOK_KVS: LazyLock = LazyLock::new(|| { }, KV { key: 
WEBHOOK_QUEUE_DIR.to_owned(), - value: DEFAULT_DIR.to_owned(), + value: EVENT_DEFAULT_DIR.to_owned(), hidden_if_empty: false, }, KV { @@ -114,7 +114,7 @@ pub static DEFAULT_NOTIFY_MQTT_KVS: LazyLock = LazyLock::new(|| { }, KV { key: MQTT_QUEUE_DIR.to_owned(), - value: DEFAULT_DIR.to_owned(), + value: EVENT_DEFAULT_DIR.to_owned(), hidden_if_empty: false, }, KV { diff --git a/crates/ecstore/src/data_usage.rs b/crates/ecstore/src/data_usage.rs index 822aaa38..4bfd1ea7 100644 --- a/crates/ecstore/src/data_usage.rs +++ b/crates/ecstore/src/data_usage.rs @@ -32,6 +32,7 @@ use rustfs_common::data_usage::{ BucketTargetUsageInfo, BucketUsageInfo, DataUsageCache, DataUsageEntry, DataUsageInfo, DiskUsageStatus, SizeSummary, }; use rustfs_utils::path::SLASH_SEPARATOR; +use tokio::fs; use tracing::{error, info, warn}; use crate::error::Error; @@ -63,6 +64,21 @@ lazy_static::lazy_static! { /// Store data usage info to backend storage pub async fn store_data_usage_in_backend(data_usage_info: DataUsageInfo, store: Arc) -> Result<(), Error> { + // Prevent older data from overwriting newer persisted stats + if let Ok(buf) = read_config(store.clone(), &DATA_USAGE_OBJ_NAME_PATH).await { + if let Ok(existing) = serde_json::from_slice::(&buf) { + if let (Some(new_ts), Some(existing_ts)) = (data_usage_info.last_update, existing.last_update) { + if new_ts <= existing_ts { + info!( + "Skip persisting data usage: incoming last_update {:?} <= existing {:?}", + new_ts, existing_ts + ); + return Ok(()); + } + } + } + } + let data = serde_json::to_vec(&data_usage_info).map_err(|e| Error::other(format!("Failed to serialize data usage info: {e}")))?; @@ -160,6 +176,39 @@ pub async fn load_data_usage_from_backend(store: Arc) -> Result) { + if let Some(update) = snapshot.last_update { + if latest_update.is_none_or(|current| update > current) { + *latest_update = Some(update); + } + } + + snapshot.recompute_totals(); + + aggregated.objects_total_count = aggregated.objects_total_count.saturating_add(snapshot.objects_total_count); + aggregated.versions_total_count = aggregated.versions_total_count.saturating_add(snapshot.versions_total_count); + aggregated.delete_markers_total_count = aggregated + .delete_markers_total_count + .saturating_add(snapshot.delete_markers_total_count); + aggregated.objects_total_size = aggregated.objects_total_size.saturating_add(snapshot.objects_total_size); + + for (bucket, usage) in snapshot.buckets_usage.into_iter() { + let bucket_size = usage.size; + match aggregated.buckets_usage.entry(bucket.clone()) { + Entry::Occupied(mut entry) => entry.get_mut().merge(&usage), + Entry::Vacant(entry) => { + entry.insert(usage.clone()); + } + } + + aggregated + .bucket_sizes + .entry(bucket) + .and_modify(|size| *size = size.saturating_add(bucket_size)) + .or_insert(bucket_size); + } +} + pub async fn aggregate_local_snapshots(store: Arc) -> Result<(Vec, DataUsageInfo), Error> { let mut aggregated = DataUsageInfo::default(); let mut latest_update: Option = None; @@ -196,7 +245,24 @@ pub async fn aggregate_local_snapshots(store: Arc) -> Result<(Vec) -> Result<(Vec current) { - latest_update = Some(update); - } - } - - aggregated.objects_total_count = aggregated.objects_total_count.saturating_add(snapshot.objects_total_count); - aggregated.versions_total_count = - aggregated.versions_total_count.saturating_add(snapshot.versions_total_count); - aggregated.delete_markers_total_count = aggregated - .delete_markers_total_count - .saturating_add(snapshot.delete_markers_total_count); - 
aggregated.objects_total_size = aggregated.objects_total_size.saturating_add(snapshot.objects_total_size); - - for (bucket, usage) in snapshot.buckets_usage.into_iter() { - let bucket_size = usage.size; - match aggregated.buckets_usage.entry(bucket.clone()) { - Entry::Occupied(mut entry) => entry.get_mut().merge(&usage), - Entry::Vacant(entry) => { - entry.insert(usage.clone()); - } - } - - aggregated - .bucket_sizes - .entry(bucket) - .and_modify(|size| *size = size.saturating_add(bucket_size)) - .or_insert(bucket_size); - } + merge_snapshot(&mut aggregated, snapshot, &mut latest_update); } statuses.push(status); @@ -549,3 +585,94 @@ pub async fn save_data_usage_cache(cache: &DataUsageCache, name: &str) -> crate: save_config(store, &name, buf).await?; Ok(()) } + +#[cfg(test)] +mod tests { + use super::*; + use rustfs_common::data_usage::BucketUsageInfo; + + fn aggregate_for_test( + inputs: Vec<(DiskUsageStatus, Result, Error>)>, + ) -> (Vec, DataUsageInfo) { + let mut aggregated = DataUsageInfo::default(); + let mut latest_update: Option = None; + let mut statuses = Vec::new(); + + for (mut status, snapshot_result) in inputs { + if let Ok(Some(snapshot)) = snapshot_result { + status.snapshot_exists = true; + status.last_update = snapshot.last_update; + merge_snapshot(&mut aggregated, snapshot, &mut latest_update); + } + statuses.push(status); + } + + aggregated.buckets_count = aggregated.buckets_usage.len() as u64; + aggregated.last_update = latest_update; + aggregated.disk_usage_status = statuses.clone(); + + (statuses, aggregated) + } + + #[test] + fn aggregate_skips_corrupted_snapshot_and_preserves_other_disks() { + let mut good_snapshot = LocalUsageSnapshot::new(LocalUsageSnapshotMeta { + disk_id: "good-disk".to_string(), + pool_index: Some(0), + set_index: Some(0), + disk_index: Some(0), + }); + good_snapshot.last_update = Some(SystemTime::now()); + good_snapshot.buckets_usage.insert( + "bucket-a".to_string(), + BucketUsageInfo { + objects_count: 3, + versions_count: 3, + size: 42, + ..Default::default() + }, + ); + good_snapshot.recompute_totals(); + + let bad_snapshot_err: Result, Error> = Err(Error::other("corrupted snapshot payload")); + + let inputs = vec![ + ( + DiskUsageStatus { + disk_id: "bad-disk".to_string(), + pool_index: Some(0), + set_index: Some(0), + disk_index: Some(1), + last_update: None, + snapshot_exists: false, + }, + bad_snapshot_err, + ), + ( + DiskUsageStatus { + disk_id: "good-disk".to_string(), + pool_index: Some(0), + set_index: Some(0), + disk_index: Some(0), + last_update: None, + snapshot_exists: false, + }, + Ok(Some(good_snapshot)), + ), + ]; + + let (statuses, aggregated) = aggregate_for_test(inputs); + + // Bad disk stays non-existent, good disk is marked present + let bad_status = statuses.iter().find(|s| s.disk_id == "bad-disk").unwrap(); + assert!(!bad_status.snapshot_exists); + let good_status = statuses.iter().find(|s| s.disk_id == "good-disk").unwrap(); + assert!(good_status.snapshot_exists); + + // Aggregated data is from good snapshot only + assert_eq!(aggregated.objects_total_count, 3); + assert_eq!(aggregated.objects_total_size, 42); + assert_eq!(aggregated.buckets_count, 1); + assert_eq!(aggregated.buckets_usage.get("bucket-a").map(|b| (b.objects_count, b.size)), Some((3, 42))); + } +} diff --git a/crates/ecstore/src/disk/disk_store.rs b/crates/ecstore/src/disk/disk_store.rs new file mode 100644 index 00000000..12cad517 --- /dev/null +++ b/crates/ecstore/src/disk/disk_store.rs @@ -0,0 +1,781 @@ +// Copyright 2024 RustFS Team +// +// 
Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +use crate::disk::{ + CheckPartsResp, DeleteOptions, DiskAPI, DiskError, DiskInfo, DiskInfoOptions, DiskLocation, Endpoint, Error, + FileInfoVersions, ReadMultipleReq, ReadMultipleResp, ReadOptions, RenameDataResp, Result, UpdateMetadataOpts, VolumeInfo, + WalkDirOptions, + local::{LocalDisk, ScanGuard}, +}; +use bytes::Bytes; +use rustfs_filemeta::{FileInfo, ObjectPartInfo, RawFileInfo}; +use rustfs_utils::string::parse_bool_with_default; +use std::{ + path::PathBuf, + sync::{ + Arc, + atomic::{AtomicI64, AtomicU32, Ordering}, + }, + time::Duration, +}; +use tokio::{sync::RwLock, time}; +use tokio_util::sync::CancellationToken; +use tracing::{debug, info, warn}; +use uuid::Uuid; + +/// Disk health status constants +const DISK_HEALTH_OK: u32 = 0; +const DISK_HEALTH_FAULTY: u32 = 1; + +pub const ENV_RUSTFS_DRIVE_ACTIVE_MONITORING: &str = "RUSTFS_DRIVE_ACTIVE_MONITORING"; +pub const ENV_RUSTFS_DRIVE_MAX_TIMEOUT_DURATION: &str = "RUSTFS_DRIVE_MAX_TIMEOUT_DURATION"; +pub const CHECK_EVERY: Duration = Duration::from_secs(15); +pub const SKIP_IF_SUCCESS_BEFORE: Duration = Duration::from_secs(5); +pub const CHECK_TIMEOUT_DURATION: Duration = Duration::from_secs(5); + +lazy_static::lazy_static! { + static ref TEST_OBJ: String = format!("health-check-{}", Uuid::new_v4()); + static ref TEST_DATA: Bytes = Bytes::from(vec![42u8; 2048]); + static ref TEST_BUCKET: String = ".rustfs.sys/tmp".to_string(); +} + +pub fn get_max_timeout_duration() -> Duration { + std::env::var(ENV_RUSTFS_DRIVE_MAX_TIMEOUT_DURATION) + .map(|v| Duration::from_secs(v.parse::().unwrap_or(30))) + .unwrap_or(Duration::from_secs(30)) +} + +/// DiskHealthTracker tracks the health status of a disk. +/// Similar to Go's diskHealthTracker. 
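
Both monitoring knobs introduced in this new module are read defensively from the environment: `get_max_timeout_duration()` falls back to 30 seconds when `RUSTFS_DRIVE_MAX_TIMEOUT_DURATION` is unset or not a valid number of seconds, and `RUSTFS_DRIVE_ACTIVE_MONITORING` goes through `parse_bool_with_default`. A small sketch of that parse-with-fallback shape; the helper name is illustrative:

```rust
use std::time::Duration;

/// Resolve a timeout from an optional raw string, falling back to a default
/// when the value is missing or malformed. In the patch, `raw` comes from
/// `std::env::var(ENV_RUSTFS_DRIVE_MAX_TIMEOUT_DURATION).ok()`.
fn timeout_from_value(raw: Option<&str>, default_secs: u64) -> Duration {
    raw.and_then(|v| v.parse::<u64>().ok())
        .map(Duration::from_secs)
        .unwrap_or(Duration::from_secs(default_secs))
}

fn main() {
    // Unset variable: default applies.
    assert_eq!(timeout_from_value(None, 30), Duration::from_secs(30));
    // Malformed value: fall back rather than panic.
    assert_eq!(timeout_from_value(Some("not-a-number"), 30), Duration::from_secs(30));
    // Valid value wins.
    assert_eq!(timeout_from_value(Some("45"), 30), Duration::from_secs(45));
}
```
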
+#[derive(Debug)] +pub struct DiskHealthTracker { + /// Atomic timestamp of last successful operation + pub last_success: AtomicI64, + /// Atomic timestamp of last operation start + pub last_started: AtomicI64, + /// Atomic disk status (OK or Faulty) + pub status: AtomicU32, + /// Atomic number of waiting operations + pub waiting: AtomicU32, +} + +impl DiskHealthTracker { + /// Create a new disk health tracker + pub fn new() -> Self { + let now = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_nanos() as i64; + + Self { + last_success: AtomicI64::new(now), + last_started: AtomicI64::new(now), + status: AtomicU32::new(DISK_HEALTH_OK), + waiting: AtomicU32::new(0), + } + } + + /// Log a successful operation + pub fn log_success(&self) { + let now = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_nanos() as i64; + self.last_success.store(now, Ordering::Relaxed); + } + + /// Check if disk is faulty + pub fn is_faulty(&self) -> bool { + self.status.load(Ordering::Relaxed) == DISK_HEALTH_FAULTY + } + + /// Set disk as faulty + pub fn set_faulty(&self) { + self.status.store(DISK_HEALTH_FAULTY, Ordering::Relaxed); + } + + /// Set disk as OK + pub fn set_ok(&self) { + self.status.store(DISK_HEALTH_OK, Ordering::Relaxed); + } + + pub fn swap_ok_to_faulty(&self) -> bool { + self.status + .compare_exchange(DISK_HEALTH_OK, DISK_HEALTH_FAULTY, Ordering::Relaxed, Ordering::Relaxed) + .is_ok() + } + + /// Increment waiting operations counter + pub fn increment_waiting(&self) { + self.waiting.fetch_add(1, Ordering::Relaxed); + } + + /// Decrement waiting operations counter + pub fn decrement_waiting(&self) { + self.waiting.fetch_sub(1, Ordering::Relaxed); + } + + /// Get waiting operations count + pub fn waiting_count(&self) -> u32 { + self.waiting.load(Ordering::Relaxed) + } + + /// Get last success timestamp + pub fn last_success(&self) -> i64 { + self.last_success.load(Ordering::Relaxed) + } +} + +impl Default for DiskHealthTracker { + fn default() -> Self { + Self::new() + } +} + +/// Health check context key for tracking disk operations +#[derive(Debug, Clone)] +struct HealthDiskCtxKey; + +#[derive(Debug)] +struct HealthDiskCtxValue { + last_success: Arc, +} + +impl HealthDiskCtxValue { + fn log_success(&self) { + let now = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_nanos() as i64; + self.last_success.store(now, Ordering::Relaxed); + } +} + +/// LocalDiskWrapper wraps a DiskStore with health tracking capabilities. +/// This is similar to Go's xlStorageDiskIDCheck. 
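
`swap_ok_to_faulty` above is what keeps failure handling idempotent: a compare-and-exchange means only the first failed health check wins the OK-to-FAULTY transition, so the monitor that follows spawns at most one recovery task per failure episode while concurrent failures observe the lost exchange and back off. A condensed, std-only version of that transition:

```rust
use std::sync::atomic::{AtomicU32, Ordering};

const DISK_HEALTH_OK: u32 = 0;
const DISK_HEALTH_FAULTY: u32 = 1;

/// Flip the status from OK to FAULTY exactly once.
/// Returns true only for the caller that performed the transition, which is
/// what allows a single recovery monitor to be spawned per failure episode.
fn swap_ok_to_faulty(status: &AtomicU32) -> bool {
    status
        .compare_exchange(DISK_HEALTH_OK, DISK_HEALTH_FAULTY, Ordering::Relaxed, Ordering::Relaxed)
        .is_ok()
}

fn main() {
    let status = AtomicU32::new(DISK_HEALTH_OK);
    assert!(swap_ok_to_faulty(&status));  // first failure wins the transition
    assert!(!swap_ok_to_faulty(&status)); // later failures see FAULTY and do nothing extra
    assert_eq!(status.load(Ordering::Relaxed), DISK_HEALTH_FAULTY);
}
```
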
+#[derive(Debug, Clone)] +pub struct LocalDiskWrapper { + /// The underlying disk store + disk: Arc, + /// Health tracker + health: Arc, + /// Whether health checking is enabled + health_check: bool, + /// Cancellation token for monitoring tasks + cancel_token: CancellationToken, + /// Disk ID for stale checking + disk_id: Arc>>, +} + +impl LocalDiskWrapper { + /// Create a new LocalDiskWrapper + pub fn new(disk: Arc, health_check: bool) -> Self { + // Check environment variable for health check override + // Default to true if not set, but only enable if both param and env are true + let env_health_check = std::env::var(ENV_RUSTFS_DRIVE_ACTIVE_MONITORING) + .map(|v| parse_bool_with_default(&v, true)) + .unwrap_or(true); + + let ret = Self { + disk, + health: Arc::new(DiskHealthTracker::new()), + health_check: health_check && env_health_check, + cancel_token: CancellationToken::new(), + disk_id: Arc::new(RwLock::new(None)), + }; + + ret.start_monitoring(); + + ret + } + + pub fn get_disk(&self) -> Arc { + self.disk.clone() + } + + /// Start the disk monitoring if health_check is enabled + pub fn start_monitoring(&self) { + if self.health_check { + let health = Arc::clone(&self.health); + let cancel_token = self.cancel_token.clone(); + let disk = Arc::clone(&self.disk); + + tokio::spawn(async move { + Self::monitor_disk_writable(disk, health, cancel_token).await; + }); + } + } + + /// Stop the disk monitoring + pub async fn stop_monitoring(&self) { + self.cancel_token.cancel(); + } + + /// Monitor disk writability periodically + async fn monitor_disk_writable(disk: Arc, health: Arc, cancel_token: CancellationToken) { + // TODO: config interval + + let mut interval = time::interval(CHECK_EVERY); + + loop { + tokio::select! { + _ = cancel_token.cancelled() => { + return; + } + _ = interval.tick() => { + if cancel_token.is_cancelled() { + return; + } + + if health.status.load(Ordering::Relaxed) != DISK_HEALTH_OK { + continue; + } + + let last_success_nanos = health.last_success.load(Ordering::Relaxed); + let elapsed = Duration::from_nanos( + (std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_nanos() as i64 - last_success_nanos) as u64 + ); + + if elapsed < SKIP_IF_SUCCESS_BEFORE { + continue; + } + + tokio::time::sleep(Duration::from_secs(1)).await; + + + debug!("health check: performing health check"); + if Self::perform_health_check(disk.clone(), &TEST_BUCKET, &TEST_OBJ, &TEST_DATA, true, CHECK_TIMEOUT_DURATION).await.is_err() && health.swap_ok_to_faulty() { + // Health check failed, disk is considered faulty + warn!("health check: failed, disk is considered faulty"); + + health.increment_waiting(); // Balance the increment from failed operation + + let health_clone = Arc::clone(&health); + let disk_clone = disk.clone(); + let cancel_clone = cancel_token.clone(); + + tokio::spawn(async move { + Self::monitor_disk_status(disk_clone, health_clone, cancel_clone).await; + }); + } + } + } + } + } + + /// Perform a health check by writing and reading a test file + async fn perform_health_check( + disk: Arc, + test_bucket: &str, + test_filename: &str, + test_data: &Bytes, + check_faulty_only: bool, + timeout_duration: Duration, + ) -> Result<()> { + // Perform health check with timeout + let health_check_result = tokio::time::timeout(timeout_duration, async { + // Try to write test data + disk.write_all(test_bucket, test_filename, test_data.clone()).await?; + + // Try to read back the data + let read_data = disk.read_all(test_bucket, test_filename).await?; + + 
// Verify data integrity + if read_data.len() != test_data.len() { + warn!( + "health check: test file data length mismatch: expected {} bytes, got {}", + test_data.len(), + read_data.len() + ); + if check_faulty_only { + return Ok(()); + } + return Err(DiskError::FaultyDisk); + } + + // Clean up + disk.delete( + test_bucket, + test_filename, + DeleteOptions { + recursive: false, + immediate: false, + undo_write: false, + old_data_dir: None, + }, + ) + .await?; + + Ok(()) + }) + .await; + + match health_check_result { + Ok(result) => match result { + Ok(()) => Ok(()), + Err(e) => { + debug!("health check: failed: {:?}", e); + + if e == DiskError::FaultyDisk { + return Err(e); + } + + if check_faulty_only { Ok(()) } else { Err(e) } + } + }, + Err(_) => { + // Timeout occurred + warn!("health check: timeout after {:?}", timeout_duration); + Err(DiskError::FaultyDisk) + } + } + } + + /// Monitor disk status and try to bring it back online + async fn monitor_disk_status(disk: Arc, health: Arc, cancel_token: CancellationToken) { + const CHECK_EVERY: Duration = Duration::from_secs(5); + + let mut interval = time::interval(CHECK_EVERY); + + loop { + tokio::select! { + _ = cancel_token.cancelled() => { + return; + } + _ = interval.tick() => { + if cancel_token.is_cancelled() { + return; + } + + match Self::perform_health_check(disk.clone(), &TEST_BUCKET, &TEST_OBJ, &TEST_DATA, false, CHECK_TIMEOUT_DURATION).await { + Ok(_) => { + info!("Disk {} is back online", disk.to_string()); + health.set_ok(); + health.decrement_waiting(); + return; + } + Err(e) => { + warn!("Disk {} still faulty: {:?}", disk.to_string(), e); + } + } + } + } + } + } + + async fn check_id(&self, want_id: Option) -> Result<()> { + if want_id.is_none() { + return Ok(()); + } + + let stored_disk_id = self.disk.get_disk_id().await?; + + if stored_disk_id != want_id { + return Err(Error::other(format!("Disk ID mismatch wanted {:?}, got {:?}", want_id, stored_disk_id))); + } + + Ok(()) + } + + /// Check if disk ID is stale + async fn check_disk_stale(&self) -> Result<()> { + let Some(current_disk_id) = *self.disk_id.read().await else { + return Ok(()); + }; + + let stored_disk_id = match self.disk.get_disk_id().await? { + Some(id) => id, + None => return Ok(()), // Empty disk ID is allowed during initialization + }; + + if current_disk_id != stored_disk_id { + return Err(DiskError::DiskNotFound); + } + + Ok(()) + } + + /// Set the disk ID + pub async fn set_disk_id_internal(&self, id: Option) -> Result<()> { + let mut disk_id = self.disk_id.write().await; + *disk_id = id; + Ok(()) + } + + /// Get the current disk ID + pub async fn get_current_disk_id(&self) -> Option { + *self.disk_id.read().await + } + + /// Track disk health for an operation. + /// This method should wrap disk operations to ensure health checking. 
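
`track_disk_health`, defined next, wraps every delegated disk call: it refuses work on a faulty or stale disk, stamps the start time, and bounds the operation with `tokio::time::timeout` unless the caller passes `Duration::ZERO` to mean "no deadline". A sketch of just the deadline handling, with the health counters left out; the helper name `with_deadline` is mine, and it assumes a tokio runtime with the time feature enabled:

```rust
use std::time::Duration;
use tokio::time::timeout;

/// Run an async disk operation with an upper bound on wall-clock time,
/// turning a timeout into an error instead of letting the caller hang.
async fn with_deadline<T, F, Fut>(op: F, limit: Duration) -> std::io::Result<T>
where
    F: FnOnce() -> Fut,
    Fut: std::future::Future<Output = std::io::Result<T>>,
{
    if limit.is_zero() {
        // Duration::ZERO means "no deadline" in the wrapper that follows.
        return op().await;
    }
    match timeout(limit, op()).await {
        Ok(result) => result,
        Err(_) => Err(std::io::Error::other(format!("disk operation timeout after {limit:?}"))),
    }
}

#[tokio::main]
async fn main() {
    // A fast operation completes inside the deadline.
    let ok = with_deadline(|| async { Ok::<_, std::io::Error>(42u32) }, Duration::from_millis(50)).await;
    assert_eq!(ok.unwrap(), 42);

    // A slow operation is cut off and surfaces as an error.
    let slow = with_deadline(
        || async {
            tokio::time::sleep(Duration::from_millis(200)).await;
            Ok::<_, std::io::Error>(0u32)
        },
        Duration::from_millis(20),
    )
    .await;
    assert!(slow.is_err());
}
```
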
+ pub async fn track_disk_health(&self, operation: F, timeout_duration: Duration) -> Result + where + F: FnOnce() -> Fut, + Fut: std::future::Future>, + { + // Check if disk is faulty + if self.health.is_faulty() { + warn!("local disk {} health is faulty, returning error", self.to_string()); + return Err(DiskError::FaultyDisk); + } + + // Check if disk is stale + self.check_disk_stale().await?; + + // Record operation start + let now = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_nanos() as i64; + self.health.last_started.store(now, Ordering::Relaxed); + self.health.increment_waiting(); + + if timeout_duration == Duration::ZERO { + let result = operation().await; + self.health.decrement_waiting(); + if result.is_ok() { + self.health.log_success(); + } + return result; + } + // Execute the operation with timeout + let result = tokio::time::timeout(timeout_duration, operation()).await; + + match result { + Ok(operation_result) => { + // Log success and decrement waiting counter + if operation_result.is_ok() { + self.health.log_success(); + } + self.health.decrement_waiting(); + operation_result + } + Err(_) => { + // Timeout occurred, mark disk as potentially faulty and decrement waiting counter + self.health.decrement_waiting(); + warn!("disk operation timeout after {:?}", timeout_duration); + Err(DiskError::other(format!("disk operation timeout after {:?}", timeout_duration))) + } + } + } +} + +#[async_trait::async_trait] +impl DiskAPI for LocalDiskWrapper { + async fn read_metadata(&self, volume: &str, path: &str) -> Result { + self.track_disk_health(|| async { self.disk.read_metadata(volume, path).await }, Duration::ZERO) + .await + } + + fn start_scan(&self) -> ScanGuard { + self.disk.start_scan() + } + + fn to_string(&self) -> String { + self.disk.to_string() + } + + async fn is_online(&self) -> bool { + let Ok(Some(disk_id)) = self.disk.get_disk_id().await else { + return false; + }; + + let Some(current_disk_id) = *self.disk_id.read().await else { + return false; + }; + + current_disk_id == disk_id + } + + fn is_local(&self) -> bool { + self.disk.is_local() + } + + fn host_name(&self) -> String { + self.disk.host_name() + } + + fn endpoint(&self) -> Endpoint { + self.disk.endpoint() + } + + async fn close(&self) -> Result<()> { + self.stop_monitoring().await; + self.disk.close().await + } + + async fn get_disk_id(&self) -> Result> { + self.disk.get_disk_id().await + } + + async fn set_disk_id(&self, id: Option) -> Result<()> { + self.set_disk_id_internal(id).await + } + + fn path(&self) -> PathBuf { + self.disk.path() + } + + fn get_disk_location(&self) -> DiskLocation { + self.disk.get_disk_location() + } + + async fn disk_info(&self, opts: &DiskInfoOptions) -> Result { + if opts.noop && opts.metrics { + let mut info = DiskInfo::default(); + // Add health metrics + info.metrics.total_waiting = self.health.waiting_count(); + if self.health.is_faulty() { + return Err(DiskError::FaultyDisk); + } + return Ok(info); + } + + if self.health.is_faulty() { + return Err(DiskError::FaultyDisk); + } + + let result = self.disk.disk_info(opts).await?; + + if let Some(current_disk_id) = *self.disk_id.read().await + && Some(current_disk_id) != result.id + { + return Err(DiskError::DiskNotFound); + }; + + Ok(result) + } + + async fn make_volume(&self, volume: &str) -> Result<()> { + self.track_disk_health(|| async { self.disk.make_volume(volume).await }, get_max_timeout_duration()) + .await + } + + async fn make_volumes(&self, volumes: Vec<&str>) -> 
Result<()> { + self.track_disk_health(|| async { self.disk.make_volumes(volumes).await }, get_max_timeout_duration()) + .await + } + + async fn list_volumes(&self) -> Result> { + self.track_disk_health(|| async { self.disk.list_volumes().await }, Duration::ZERO) + .await + } + + async fn stat_volume(&self, volume: &str) -> Result { + self.track_disk_health(|| async { self.disk.stat_volume(volume).await }, get_max_timeout_duration()) + .await + } + + async fn delete_volume(&self, volume: &str) -> Result<()> { + self.track_disk_health(|| async { self.disk.delete_volume(volume).await }, Duration::ZERO) + .await + } + + async fn walk_dir(&self, opts: WalkDirOptions, wr: &mut W) -> Result<()> { + self.track_disk_health(|| async { self.disk.walk_dir(opts, wr).await }, Duration::ZERO) + .await + } + + async fn delete_version( + &self, + volume: &str, + path: &str, + fi: FileInfo, + force_del_marker: bool, + opts: DeleteOptions, + ) -> Result<()> { + self.track_disk_health( + || async { self.disk.delete_version(volume, path, fi, force_del_marker, opts).await }, + get_max_timeout_duration(), + ) + .await + } + + async fn delete_versions(&self, volume: &str, versions: Vec, opts: DeleteOptions) -> Vec> { + // Check if disk is faulty before proceeding + if self.health.is_faulty() { + return vec![Some(DiskError::FaultyDisk); versions.len()]; + } + + // Check if disk is stale + if let Err(e) = self.check_disk_stale().await { + return vec![Some(e); versions.len()]; + } + + // Record operation start + let now = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_nanos() as i64; + self.health.last_started.store(now, Ordering::Relaxed); + self.health.increment_waiting(); + + // Execute the operation + let result = self.disk.delete_versions(volume, versions, opts).await; + + self.health.decrement_waiting(); + let has_err = result.iter().any(|e| e.is_some()); + if !has_err { + // Log success and decrement waiting counter + self.health.log_success(); + } + + result + } + + async fn delete_paths(&self, volume: &str, paths: &[String]) -> Result<()> { + self.track_disk_health(|| async { self.disk.delete_paths(volume, paths).await }, get_max_timeout_duration()) + .await + } + + async fn write_metadata(&self, org_volume: &str, volume: &str, path: &str, fi: FileInfo) -> Result<()> { + self.track_disk_health( + || async { self.disk.write_metadata(org_volume, volume, path, fi).await }, + get_max_timeout_duration(), + ) + .await + } + + async fn update_metadata(&self, volume: &str, path: &str, fi: FileInfo, opts: &UpdateMetadataOpts) -> Result<()> { + self.track_disk_health( + || async { self.disk.update_metadata(volume, path, fi, opts).await }, + get_max_timeout_duration(), + ) + .await + } + + async fn read_version( + &self, + org_volume: &str, + volume: &str, + path: &str, + version_id: &str, + opts: &ReadOptions, + ) -> Result { + self.track_disk_health( + || async { self.disk.read_version(org_volume, volume, path, version_id, opts).await }, + get_max_timeout_duration(), + ) + .await + } + + async fn read_xl(&self, volume: &str, path: &str, read_data: bool) -> Result { + self.track_disk_health(|| async { self.disk.read_xl(volume, path, read_data).await }, get_max_timeout_duration()) + .await + } + + async fn rename_data( + &self, + src_volume: &str, + src_path: &str, + fi: FileInfo, + dst_volume: &str, + dst_path: &str, + ) -> Result { + self.track_disk_health( + || async { self.disk.rename_data(src_volume, src_path, fi, dst_volume, dst_path).await }, + 
get_max_timeout_duration(), + ) + .await + } + + async fn list_dir(&self, origvolume: &str, volume: &str, dir_path: &str, count: i32) -> Result> { + self.track_disk_health( + || async { self.disk.list_dir(origvolume, volume, dir_path, count).await }, + get_max_timeout_duration(), + ) + .await + } + + async fn read_file(&self, volume: &str, path: &str) -> Result { + self.track_disk_health(|| async { self.disk.read_file(volume, path).await }, get_max_timeout_duration()) + .await + } + + async fn read_file_stream(&self, volume: &str, path: &str, offset: usize, length: usize) -> Result { + self.track_disk_health( + || async { self.disk.read_file_stream(volume, path, offset, length).await }, + get_max_timeout_duration(), + ) + .await + } + + async fn append_file(&self, volume: &str, path: &str) -> Result { + self.track_disk_health(|| async { self.disk.append_file(volume, path).await }, Duration::ZERO) + .await + } + + async fn create_file(&self, origvolume: &str, volume: &str, path: &str, file_size: i64) -> Result { + self.track_disk_health( + || async { self.disk.create_file(origvolume, volume, path, file_size).await }, + Duration::ZERO, + ) + .await + } + + async fn rename_file(&self, src_volume: &str, src_path: &str, dst_volume: &str, dst_path: &str) -> Result<()> { + self.track_disk_health( + || async { self.disk.rename_file(src_volume, src_path, dst_volume, dst_path).await }, + get_max_timeout_duration(), + ) + .await + } + + async fn rename_part(&self, src_volume: &str, src_path: &str, dst_volume: &str, dst_path: &str, meta: Bytes) -> Result<()> { + self.track_disk_health( + || async { self.disk.rename_part(src_volume, src_path, dst_volume, dst_path, meta).await }, + get_max_timeout_duration(), + ) + .await + } + + async fn delete(&self, volume: &str, path: &str, opt: DeleteOptions) -> Result<()> { + self.track_disk_health(|| async { self.disk.delete(volume, path, opt).await }, get_max_timeout_duration()) + .await + } + + async fn verify_file(&self, volume: &str, path: &str, fi: &FileInfo) -> Result { + self.track_disk_health(|| async { self.disk.verify_file(volume, path, fi).await }, Duration::ZERO) + .await + } + + async fn check_parts(&self, volume: &str, path: &str, fi: &FileInfo) -> Result { + self.track_disk_health(|| async { self.disk.check_parts(volume, path, fi).await }, Duration::ZERO) + .await + } + + async fn read_parts(&self, bucket: &str, paths: &[String]) -> Result> { + self.track_disk_health(|| async { self.disk.read_parts(bucket, paths).await }, Duration::ZERO) + .await + } + + async fn read_multiple(&self, req: ReadMultipleReq) -> Result> { + self.track_disk_health(|| async { self.disk.read_multiple(req).await }, Duration::ZERO) + .await + } + + async fn write_all(&self, volume: &str, path: &str, data: Bytes) -> Result<()> { + self.track_disk_health(|| async { self.disk.write_all(volume, path, data).await }, get_max_timeout_duration()) + .await + } + + async fn read_all(&self, volume: &str, path: &str) -> Result { + self.track_disk_health(|| async { self.disk.read_all(volume, path).await }, get_max_timeout_duration()) + .await + } +} diff --git a/crates/ecstore/src/disk/endpoint.rs b/crates/ecstore/src/disk/endpoint.rs index f1de59e1..952cda94 100644 --- a/crates/ecstore/src/disk/endpoint.rs +++ b/crates/ecstore/src/disk/endpoint.rs @@ -198,15 +198,22 @@ impl Endpoint { } } - pub fn get_file_path(&self) -> &str { - let path = self.url.path(); + pub fn get_file_path(&self) -> String { + let path: &str = self.url.path(); + let decoded: std::borrow::Cow<'_, str> = match 
urlencoding::decode(path) { + Ok(decoded) => decoded, + Err(e) => { + debug!("Failed to decode path '{}': {}, using original path", path, e); + std::borrow::Cow::Borrowed(path) + } + }; #[cfg(windows)] if self.url.scheme() == "file" { - let stripped = path.strip_prefix('/').unwrap_or(path); + let stripped: &str = decoded.strip_prefix('/').unwrap_or(&decoded); debug!("get_file_path windows: path={}", stripped); - return stripped; + return stripped.to_string(); } - path + decoded.into_owned() } } @@ -501,6 +508,45 @@ mod test { assert_eq!(endpoint.get_type(), EndpointType::Path); } + #[test] + fn test_endpoint_with_spaces_in_path() { + let path_with_spaces = "/Users/test/Library/Application Support/rustfs/data"; + let endpoint = Endpoint::try_from(path_with_spaces).unwrap(); + assert_eq!(endpoint.get_file_path(), path_with_spaces); + assert!(endpoint.is_local); + assert_eq!(endpoint.get_type(), EndpointType::Path); + } + + #[test] + fn test_endpoint_percent_encoding_roundtrip() { + let path_with_spaces = "/Users/test/Library/Application Support/rustfs/data"; + let endpoint = Endpoint::try_from(path_with_spaces).unwrap(); + + // Verify that the URL internally stores percent-encoded path + assert!( + endpoint.url.path().contains("%20"), + "URL path should contain percent-encoded spaces: {}", + endpoint.url.path() + ); + + // Verify that get_file_path() decodes the percent-encoded path correctly + assert_eq!( + endpoint.get_file_path(), + "/Users/test/Library/Application Support/rustfs/data", + "get_file_path() should decode percent-encoded spaces" + ); + } + + #[test] + fn test_endpoint_with_various_special_characters() { + // Test path with multiple special characters that get percent-encoded + let path_with_special = "/tmp/test path/data[1]/file+name&more"; + let endpoint = Endpoint::try_from(path_with_special).unwrap(); + + // get_file_path() should return the original path with decoded characters + assert_eq!(endpoint.get_file_path(), path_with_special); + } + #[test] fn test_endpoint_update_is_local() { let mut endpoint = Endpoint::try_from("http://localhost:9000/path").unwrap(); diff --git a/crates/ecstore/src/disk/error.rs b/crates/ecstore/src/disk/error.rs index 6ef2c05e..9fb5f81c 100644 --- a/crates/ecstore/src/disk/error.rs +++ b/crates/ecstore/src/disk/error.rs @@ -16,7 +16,6 @@ use std::hash::{Hash, Hasher}; use std::io::{self}; use std::path::PathBuf; -use tracing::error; pub type Error = DiskError; pub type Result = core::result::Result; diff --git a/crates/ecstore/src/disk/local.rs b/crates/ecstore/src/disk/local.rs index 5ed851e6..a9395575 100644 --- a/crates/ecstore/src/disk/local.rs +++ b/crates/ecstore/src/disk/local.rs @@ -69,7 +69,7 @@ use tokio::sync::RwLock; use tracing::{debug, error, info, warn}; use uuid::Uuid; -#[derive(Debug)] +#[derive(Debug, Clone)] pub struct FormatInfo { pub id: Option, pub data: Bytes, @@ -77,16 +77,6 @@ pub struct FormatInfo { pub last_check: Option, } -impl FormatInfo { - pub fn last_check_valid(&self) -> bool { - let now = OffsetDateTime::now_utc(); - self.file_info.is_some() - && self.id.is_some() - && self.last_check.is_some() - && (now.unix_timestamp() - self.last_check.unwrap().unix_timestamp() <= 1) - } -} - /// A helper enum to handle internal buffer types for writing data. 
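
The `get_file_path()` change above exists because `Endpoint` keeps the drive path inside a URL, which percent-encodes spaces and other special characters; the accessor now decodes before the path reaches the filesystem layer, falling back to the raw path if decoding fails. A standalone illustration of that round-trip, assuming the same `url` and `urlencoding` crates the module already uses and a Unix-style absolute path:

```rust
use url::Url;

fn main() {
    let original = "/Users/test/Library/Application Support/rustfs/data";

    // Storing the path in a file:// URL percent-encodes the space.
    let url = Url::from_file_path(original).expect("absolute path");
    assert!(url.path().contains("%20"));

    // Decoding restores the original path; on failure, fall back to the raw
    // URL path, which is the behavior of the patched `get_file_path()`.
    let decoded = urlencoding::decode(url.path())
        .map(|c| c.into_owned())
        .unwrap_or_else(|_| url.path().to_string());
    assert_eq!(decoded, original);
}
```
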
pub enum InternalBuf<'a> { Ref(&'a [u8]), @@ -99,7 +89,7 @@ pub struct LocalDisk { pub format_info: RwLock, pub endpoint: Endpoint, pub disk_info_cache: Arc>, - pub scanning: AtomicU32, + pub scanning: Arc, pub rotational: bool, pub fstype: String, pub major: u64, @@ -185,7 +175,7 @@ impl LocalDisk { }; let root_clone = root.clone(); let update_fn: UpdateFn = Box::new(move || { - let disk_id = id.map_or("".to_string(), |id| id.to_string()); + let disk_id = id; let root = root_clone.clone(); Box::pin(async move { match get_disk_info(root.clone()).await { @@ -200,7 +190,7 @@ impl LocalDisk { minor: info.minor, fs_type: info.fstype, root_disk: root, - id: disk_id.to_string(), + id: disk_id, ..Default::default() }; // if root { @@ -225,7 +215,7 @@ impl LocalDisk { format_path, format_info: RwLock::new(format_info), disk_info_cache: Arc::new(cache), - scanning: AtomicU32::new(0), + scanning: Arc::new(AtomicU32::new(0)), rotational: Default::default(), fstype: Default::default(), minor: Default::default(), @@ -683,6 +673,8 @@ impl LocalDisk { return Err(DiskError::FileNotFound); } + debug!("read_raw: file_path: {:?}", file_path.as_ref()); + let meta_path = file_path.as_ref().join(Path::new(STORAGE_FORMAT_FILE)); let res = { @@ -692,6 +684,7 @@ impl LocalDisk { match self.read_metadata_with_dmtime(meta_path).await { Ok(res) => Ok(res), Err(err) => { + warn!("read_raw: error: {:?}", err); if err == Error::FileNotFound && !skip_access_checks(volume_dir.as_ref().to_string_lossy().to_string().as_str()) { @@ -717,20 +710,6 @@ impl LocalDisk { Ok((buf, mtime)) } - async fn read_metadata(&self, file_path: impl AsRef) -> Result> { - // Try to use cached file content reading for better performance, with safe fallback - let path = file_path.as_ref().to_path_buf(); - - // First, try the cache - if let Ok(bytes) = get_global_file_cache().get_file_content(path.clone()).await { - return Ok(bytes.to_vec()); - } - - // Fallback to direct read if cache fails - let (data, _) = self.read_metadata_with_dmtime(file_path.as_ref()).await?; - Ok(data) - } - async fn read_metadata_with_dmtime(&self, file_path: impl AsRef) -> Result<(Vec, Option)> { check_path_length(file_path.as_ref().to_string_lossy().as_ref())?; @@ -892,7 +871,7 @@ impl LocalDisk { } // write_all_private with check_path_length - #[tracing::instrument(level = "debug", skip_all)] + #[tracing::instrument(level = "debug", skip(self, buf, sync, skip_parent))] pub async fn write_all_private(&self, volume: &str, path: &str, buf: Bytes, sync: bool, skip_parent: &Path) -> Result<()> { let volume_dir = self.get_bucket_path(volume)?; let file_path = volume_dir.join(Path::new(&path)); @@ -1084,7 +1063,7 @@ impl LocalDisk { if entry.ends_with(STORAGE_FORMAT_FILE) { let metadata = self - .read_metadata(self.get_object_path(bucket, format!("{}/{}", ¤t, &entry).as_str())?) 
+ .read_metadata(bucket, format!("{}/{}", ¤t, &entry).as_str()) .await?; let entry = entry.strip_suffix(STORAGE_FORMAT_FILE).unwrap_or_default().to_owned(); @@ -1100,7 +1079,7 @@ impl LocalDisk { out.write_obj(&MetaCacheEntry { name: name.clone(), - metadata, + metadata: metadata.to_vec(), ..Default::default() }) .await?; @@ -1167,14 +1146,14 @@ impl LocalDisk { let fname = format!("{}/{}", &meta.name, STORAGE_FORMAT_FILE); - match self.read_metadata(self.get_object_path(&opts.bucket, fname.as_str())?).await { + match self.read_metadata(&opts.bucket, fname.as_str()).await { Ok(res) => { if is_dir_obj { meta.name = meta.name.trim_end_matches(GLOBAL_DIR_SUFFIX_WITH_SLASH).to_owned(); meta.name.push_str(SLASH_SEPARATOR); } - meta.metadata = res; + meta.metadata = res.to_vec(); out.write_obj(&meta).await?; @@ -1221,6 +1200,14 @@ impl LocalDisk { } } +pub struct ScanGuard(pub Arc); + +impl Drop for ScanGuard { + fn drop(&mut self) { + self.0.fetch_sub(1, Ordering::Relaxed); + } +} + fn is_root_path(path: impl AsRef) -> bool { path.as_ref().components().count() == 1 && path.as_ref().has_root() } @@ -1295,7 +1282,7 @@ impl DiskAPI for LocalDisk { } #[tracing::instrument(skip(self))] async fn is_online(&self) -> bool { - self.check_format_json().await.is_ok() + true } #[tracing::instrument(skip(self))] @@ -1342,24 +1329,40 @@ impl DiskAPI for LocalDisk { #[tracing::instrument(level = "debug", skip(self))] async fn get_disk_id(&self) -> Result> { - let mut format_info = self.format_info.write().await; + let format_info = { + let format_info = self.format_info.read().await; + format_info.clone() + }; let id = format_info.id; - if format_info.last_check_valid() { - return Ok(id); + // if format_info.last_check_valid() { + // return Ok(id); + // } + + if format_info.file_info.is_some() && id.is_some() { + // check last check time + if let Some(last_check) = format_info.last_check { + if last_check.unix_timestamp() + 1 < OffsetDateTime::now_utc().unix_timestamp() { + return Ok(id); + } + } } let file_meta = self.check_format_json().await?; if let Some(file_info) = &format_info.file_info { if super::fs::same_file(&file_meta, file_info) { + let mut format_info = self.format_info.write().await; format_info.last_check = Some(OffsetDateTime::now_utc()); + drop(format_info); return Ok(id); } } + debug!("get_disk_id: read format.json"); + let b = fs::read(&self.format_path).await.map_err(to_unformatted_disk_error)?; let fm = FormatV3::try_from(b.as_slice()).map_err(|e| { @@ -1375,20 +1378,19 @@ impl DiskAPI for LocalDisk { return Err(DiskError::InconsistentDisk); } + let mut format_info = self.format_info.write().await; format_info.id = Some(disk_id); format_info.file_info = Some(file_meta); format_info.data = b.into(); format_info.last_check = Some(OffsetDateTime::now_utc()); + drop(format_info); Ok(Some(disk_id)) } #[tracing::instrument(skip(self))] - async fn set_disk_id(&self, id: Option) -> Result<()> { + async fn set_disk_id(&self, _id: Option) -> Result<()> { // No setup is required locally - // TODO: add check_id_store - let mut format_info = self.format_info.write().await; - format_info.id = id; Ok(()) } @@ -1853,19 +1855,20 @@ impl DiskAPI for LocalDisk { let mut objs_returned = 0; if opts.base_dir.ends_with(SLASH_SEPARATOR) { - let fpath = self.get_object_path( - &opts.bucket, - path_join_buf(&[ - format!("{}{}", opts.base_dir.trim_end_matches(SLASH_SEPARATOR), GLOBAL_DIR_SUFFIX).as_str(), - STORAGE_FORMAT_FILE, - ]) - .as_str(), - )?; - - if let Ok(data) = self.read_metadata(fpath).await { + if 
let Ok(data) = self + .read_metadata( + &opts.bucket, + path_join_buf(&[ + format!("{}{}", opts.base_dir.trim_end_matches(SLASH_SEPARATOR), GLOBAL_DIR_SUFFIX).as_str(), + STORAGE_FORMAT_FILE, + ]) + .as_str(), + ) + .await + { let meta = MetaCacheEntry { name: opts.base_dir.clone(), - metadata: data, + metadata: data.to_vec(), ..Default::default() }; out.write_obj(&meta).await?; @@ -2438,8 +2441,32 @@ impl DiskAPI for LocalDisk { info.endpoint = self.endpoint.to_string(); info.scanning = self.scanning.load(Ordering::SeqCst) == 1; + if info.id.is_none() { + info.id = self.get_disk_id().await.unwrap_or(None); + } + Ok(info) } + #[tracing::instrument(skip(self))] + fn start_scan(&self) -> ScanGuard { + self.scanning.fetch_add(1, Ordering::Relaxed); + ScanGuard(Arc::clone(&self.scanning)) + } + + async fn read_metadata(&self, volume: &str, path: &str) -> Result { + // Try to use cached file content reading for better performance, with safe fallback + let file_path = self.get_object_path(volume, path)?; + // let file_path = file_path.join(Path::new(STORAGE_FORMAT_FILE)); + + // First, try the cache + if let Ok(bytes) = get_global_file_cache().get_file_content(file_path.clone()).await { + return Ok(bytes); + } + + // Fallback to direct read if cache fails + let (data, _) = self.read_metadata_with_dmtime(file_path).await?; + Ok(data.into()) + } } async fn get_disk_info(drive_path: PathBuf) -> Result<(rustfs_utils::os::DiskInfo, bool)> { @@ -2705,39 +2732,6 @@ mod test { } } - #[tokio::test] - async fn test_format_info_last_check_valid() { - let now = OffsetDateTime::now_utc(); - - // Valid format info - let valid_format_info = FormatInfo { - id: Some(Uuid::new_v4()), - data: vec![1, 2, 3].into(), - file_info: Some(fs::metadata("../../../..").await.unwrap()), - last_check: Some(now), - }; - assert!(valid_format_info.last_check_valid()); - - // Invalid format info (missing id) - let invalid_format_info = FormatInfo { - id: None, - data: vec![1, 2, 3].into(), - file_info: Some(fs::metadata("../../../..").await.unwrap()), - last_check: Some(now), - }; - assert!(!invalid_format_info.last_check_valid()); - - // Invalid format info (old timestamp) - let old_time = OffsetDateTime::now_utc() - time::Duration::seconds(10); - let old_format_info = FormatInfo { - id: Some(Uuid::new_v4()), - data: vec![1, 2, 3].into(), - file_info: Some(fs::metadata("../../../..").await.unwrap()), - last_check: Some(old_time), - }; - assert!(!old_format_info.last_check_valid()); - } - #[tokio::test] async fn test_read_file_exists() { let test_file = "./test_read_exists.txt"; diff --git a/crates/ecstore/src/disk/mod.rs b/crates/ecstore/src/disk/mod.rs index 5f419380..f0a75636 100644 --- a/crates/ecstore/src/disk/mod.rs +++ b/crates/ecstore/src/disk/mod.rs @@ -12,6 +12,7 @@ // See the License for the specific language governing permissions and // limitations under the License. 
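The `ScanGuard` returned by `start_scan` above is a small RAII counter: constructing it bumps the shared `scanning` counter and dropping it decrements the counter again, so `disk_info` can report a scan in progress without any explicit "stop scan" call. A self-contained sketch of the same pattern, assuming the `scanning` field is an `Arc<AtomicU32>` as the diff suggests (the demo `Disk` struct and `is_scanning` helper are illustrative):

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicU32, Ordering};

pub struct ScanGuard(pub Arc<AtomicU32>);

impl Drop for ScanGuard {
    fn drop(&mut self) {
        // Undo the increment performed by start_scan once the guard leaves scope.
        self.0.fetch_sub(1, Ordering::Relaxed);
    }
}

struct Disk {
    scanning: Arc<AtomicU32>,
}

impl Disk {
    fn start_scan(&self) -> ScanGuard {
        self.scanning.fetch_add(1, Ordering::Relaxed);
        ScanGuard(Arc::clone(&self.scanning))
    }

    fn is_scanning(&self) -> bool {
        // The diff compares the counter against 1; any non-zero value works here.
        self.scanning.load(Ordering::Relaxed) > 0
    }
}

fn main() {
    let disk = Disk { scanning: Arc::new(AtomicU32::new(0)) };
    {
        let _guard = disk.start_scan();
        assert!(disk.is_scanning());
    } // _guard dropped here, counter returns to zero
    assert!(!disk.is_scanning());
}
```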
+pub mod disk_store; pub mod endpoint; pub mod error; pub mod error_conv; @@ -30,6 +31,8 @@ pub const FORMAT_CONFIG_FILE: &str = "format.json"; pub const STORAGE_FORMAT_FILE: &str = "xl.meta"; pub const STORAGE_FORMAT_FILE_BACKUP: &str = "xl.meta.bkp"; +use crate::disk::disk_store::LocalDiskWrapper; +use crate::disk::local::ScanGuard; use crate::rpc::RemoteDisk; use bytes::Bytes; use endpoint::Endpoint; @@ -51,7 +54,7 @@ pub type FileWriter = Box; #[derive(Debug)] pub enum Disk { - Local(Box), + Local(Box), Remote(Box), } @@ -393,12 +396,26 @@ impl DiskAPI for Disk { Disk::Remote(remote_disk) => remote_disk.disk_info(opts).await, } } + + fn start_scan(&self) -> ScanGuard { + match self { + Disk::Local(local_disk) => local_disk.start_scan(), + Disk::Remote(remote_disk) => remote_disk.start_scan(), + } + } + + async fn read_metadata(&self, volume: &str, path: &str) -> Result { + match self { + Disk::Local(local_disk) => local_disk.read_metadata(volume, path).await, + Disk::Remote(remote_disk) => remote_disk.read_metadata(volume, path).await, + } + } } pub async fn new_disk(ep: &Endpoint, opt: &DiskOption) -> Result { if ep.is_local { let s = LocalDisk::new(ep, opt.cleanup).await?; - Ok(Arc::new(Disk::Local(Box::new(s)))) + Ok(Arc::new(Disk::Local(Box::new(LocalDiskWrapper::new(Arc::new(s), opt.health_check))))) } else { let remote_disk = RemoteDisk::new(ep, opt).await?; Ok(Arc::new(Disk::Remote(Box::new(remote_disk)))) @@ -456,6 +473,7 @@ pub trait DiskAPI: Debug + Send + Sync + 'static { opts: &ReadOptions, ) -> Result; async fn read_xl(&self, volume: &str, path: &str, read_data: bool) -> Result; + async fn read_metadata(&self, volume: &str, path: &str) -> Result; async fn rename_data( &self, src_volume: &str, @@ -487,6 +505,7 @@ pub trait DiskAPI: Debug + Send + Sync + 'static { async fn write_all(&self, volume: &str, path: &str, data: Bytes) -> Result<()>; async fn read_all(&self, volume: &str, path: &str) -> Result; async fn disk_info(&self, opts: &DiskInfoOptions) -> Result; + fn start_scan(&self) -> ScanGuard; } #[derive(Debug, Default, Serialize, Deserialize)] @@ -534,7 +553,7 @@ pub struct DiskInfo { pub scanning: bool, pub endpoint: String, pub mount_path: String, - pub id: String, + pub id: Option, pub rotational: bool, pub metrics: DiskMetrics, pub error: String, @@ -1015,7 +1034,7 @@ mod tests { let endpoint = Endpoint::try_from(test_dir).unwrap(); let local_disk = LocalDisk::new(&endpoint, false).await.unwrap(); - let disk = Disk::Local(Box::new(local_disk)); + let disk = Disk::Local(Box::new(LocalDiskWrapper::new(Arc::new(local_disk), false))); // Test basic methods assert!(disk.is_local()); diff --git a/crates/ecstore/src/endpoints.rs b/crates/ecstore/src/endpoints.rs index 5f3572e7..1a334c07 100644 --- a/crates/ecstore/src/endpoints.rs +++ b/crates/ecstore/src/endpoints.rs @@ -232,7 +232,7 @@ impl PoolEndpointList { for endpoints in pool_endpoint_list.inner.iter_mut() { // Check whether same path is not used in endpoints of a host on different port. 
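The new `read_metadata(volume, path)` trait method centralises what was previously an internal helper: the local implementation resolves the object path, tries the global file cache first, and only falls back to reading `xl.meta` from disk when the cache misses. A stripped-down sketch of that cache-first-with-fallback shape, using a hypothetical in-memory cache in place of the real global file cache:

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::Mutex;

// Hypothetical stand-in for the global file cache used in the diff; the real
// cache type and its async API are not reproduced here.
struct FileCache {
    inner: Mutex<HashMap<PathBuf, Vec<u8>>>,
}

impl FileCache {
    fn get_file_content(&self, path: &Path) -> Option<Vec<u8>> {
        self.inner.lock().unwrap().get(path).cloned()
    }
}

fn read_metadata(cache: &FileCache, file_path: &Path) -> std::io::Result<Vec<u8>> {
    // First, try the cache.
    if let Some(bytes) = cache.get_file_content(file_path) {
        return Ok(bytes);
    }
    // Fall back to a direct read if the cache cannot serve the request.
    std::fs::read(file_path)
}
```

In the diff the same fallback also preserves the modification-time lookup via `read_metadata_with_dmtime`, which this sketch omits.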
- let mut path_ip_map: HashMap<&str, HashSet> = HashMap::new(); + let mut path_ip_map: HashMap> = HashMap::new(); let mut host_ip_cache = HashMap::new(); for ep in endpoints.as_ref() { if !ep.url.has_host() { @@ -275,8 +275,9 @@ impl PoolEndpointList { match path_ip_map.entry(path) { Entry::Occupied(mut e) => { if e.get().intersection(host_ip_set).count() > 0 { + let path_key = e.key().clone(); return Err(Error::other(format!( - "same path '{path}' can not be served by different port on same address" + "same path '{path_key}' can not be served by different port on same address" ))); } e.get_mut().extend(host_ip_set.iter()); @@ -295,7 +296,7 @@ impl PoolEndpointList { } let path = ep.get_file_path(); - if local_path_set.contains(path) { + if local_path_set.contains(&path) { return Err(Error::other(format!( "path '{path}' cannot be served by different address on same server" ))); diff --git a/crates/ecstore/src/erasure.rs b/crates/ecstore/src/erasure.rs deleted file mode 100644 index 2939fe13..00000000 --- a/crates/ecstore/src/erasure.rs +++ /dev/null @@ -1,586 +0,0 @@ -// Copyright 2024 RustFS Team -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. - -use crate::bitrot::{BitrotReader, BitrotWriter}; -use crate::disk::error::{Error, Result}; -use crate::disk::error_reduce::{reduce_write_quorum_errs, OBJECT_OP_IGNORED_ERRS}; -use crate::io::Etag; -use bytes::{Bytes, BytesMut}; -use futures::future::join_all; -use reed_solomon_erasure::galois_8::ReedSolomon; -use smallvec::SmallVec; -use std::any::Any; -use std::io::ErrorKind; -use std::sync::{mpsc, Arc}; -use tokio::io::{AsyncRead, AsyncWrite}; -use tokio::io::{AsyncReadExt, AsyncWriteExt}; -use tokio::sync::mpsc; -use tracing::warn; -use tracing::{error, info}; -use uuid::Uuid; - -use crate::disk::error::DiskError; - -#[derive(Default)] -pub struct Erasure { - data_shards: usize, - parity_shards: usize, - encoder: Option, - pub block_size: usize, - _id: Uuid, - _buf: Vec, -} - -impl Erasure { - pub fn new(data_shards: usize, parity_shards: usize, block_size: usize) -> Self { - // debug!( - // "Erasure new data_shards {},parity_shards {} block_size {} ", - // data_shards, parity_shards, block_size - // ); - let mut encoder = None; - if parity_shards > 0 { - encoder = Some(ReedSolomon::new(data_shards, parity_shards).unwrap()); - } - - Erasure { - data_shards, - parity_shards, - block_size, - encoder, - _id: Uuid::new_v4(), - _buf: vec![0u8; block_size], - } - } - - #[tracing::instrument(level = "info", skip(self, reader, writers))] - pub async fn encode( - self: Arc, - mut reader: S, - writers: &mut [Option], - // block_size: usize, - total_size: usize, - write_quorum: usize, - ) -> Result<(usize, String)> - where - S: AsyncRead + Etag + Unpin + Send + 'static, - { - let (tx, mut rx) = mpsc::channel(5); - let task = tokio::spawn(async move { - let mut buf = vec![0u8; self.block_size]; - let mut total: usize = 0; - loop { - if total_size > 0 { - let new_len = { - let remain = total_size - total; - if remain > self.block_size { 
self.block_size } else { remain } - }; - - if new_len == 0 && total > 0 { - break; - } - - buf.resize(new_len, 0u8); - match reader.read_exact(&mut buf).await { - Ok(res) => res, - Err(e) => { - if let ErrorKind::UnexpectedEof = e.kind() { - break; - } else { - return Err(e.into()); - } - } - }; - total += buf.len(); - } - let blocks = Arc::new(Box::pin(self.clone().encode_data(&buf)?)); - let _ = tx.send(blocks).await; - if total_size == 0 { - break; - } - } - let etag = reader.etag().await; - Ok((total, etag)) - }); - - while let Some(blocks) = rx.recv().await { - let write_futures = writers.iter_mut().enumerate().map(|(i, w_op)| { - let i_inner = i; - let blocks_inner = blocks.clone(); - async move { - if let Some(w) = w_op { - w.write(blocks_inner[i_inner].clone()).await.err() - } else { - Some(DiskError::DiskNotFound) - } - } - }); - let errs = join_all(write_futures).await; - let none_count = errs.iter().filter(|&x| x.is_none()).count(); - if none_count >= write_quorum { - if total_size == 0 { - break; - } - continue; - } - - if let Some(err) = reduce_write_quorum_errs(&errs, OBJECT_OP_IGNORED_ERRS, write_quorum) { - warn!("Erasure encode errs {:?}", &errs); - return Err(err); - } - } - task.await? - } - - pub async fn decode( - &self, - writer: &mut W, - readers: Vec>, - offset: usize, - length: usize, - total_length: usize, - ) -> (usize, Option) - where - W: AsyncWriteExt + Send + Unpin + 'static, - { - if length == 0 { - return (0, None); - } - - let mut reader = ShardReader::new(readers, self, offset, total_length); - - // debug!("ShardReader {:?}", &reader); - - let start_block = offset / self.block_size; - let end_block = (offset + length) / self.block_size; - - // debug!("decode block from {} to {}", start_block, end_block); - - let mut bytes_written = 0; - - for block_idx in start_block..=end_block { - let (block_offset, block_length) = if start_block == end_block { - (offset % self.block_size, length) - } else if block_idx == start_block { - let block_offset = offset % self.block_size; - (block_offset, self.block_size - block_offset) - } else if block_idx == end_block { - (0, (offset + length) % self.block_size) - } else { - (0, self.block_size) - }; - - if block_length == 0 { - // debug!("block_length == 0 break"); - break; - } - - // debug!("decode {} block_offset {},block_length {} ", block_idx, block_offset, block_length); - - let mut bufs = match reader.read().await { - Ok(bufs) => bufs, - Err(err) => return (bytes_written, Some(err)), - }; - - if self.parity_shards > 0 { - if let Err(err) = self.decode_data(&mut bufs) { - return (bytes_written, Some(err)); - } - } - - let written_n = match self - .write_data_blocks(writer, bufs, self.data_shards, block_offset, block_length) - .await - { - Ok(n) => n, - Err(err) => { - error!("write_data_blocks err {:?}", &err); - return (bytes_written, Some(err)); - } - }; - - bytes_written += written_n; - - // debug!("decode {} written_n {}, total_written: {} ", block_idx, written_n, bytes_written); - } - - if bytes_written != length { - // debug!("bytes_written != length: {} != {} ", bytes_written, length); - return (bytes_written, Some(Error::other("erasure decode less data"))); - } - - (bytes_written, None) - } - - async fn write_data_blocks( - &self, - writer: &mut W, - bufs: Vec>>, - data_blocks: usize, - offset: usize, - length: usize, - ) -> Result - where - W: AsyncWrite + Send + Unpin + 'static, - { - if bufs.len() < data_blocks { - return Err(Error::other("read bufs not match data_blocks")); - } - - let data_len: usize = 
bufs - .iter() - .take(data_blocks) - .filter(|v| v.is_some()) - .map(|v| v.as_ref().unwrap().len()) - .sum(); - if data_len < length { - return Err(Error::other(format!("write_data_blocks data_len < length {} < {}", data_len, length))); - } - - let mut offset = offset; - - // debug!("write_data_blocks offset {}, length {}", offset, length); - - let mut write = length; - let mut total_written = 0; - - for opt_buf in bufs.iter().take(data_blocks) { - let buf = opt_buf.as_ref().unwrap(); - - if offset >= buf.len() { - offset -= buf.len(); - continue; - } - - let buf = &buf[offset..]; - - offset = 0; - - // debug!("write_data_blocks write buf len {}", buf.len()); - - if write < buf.len() { - let buf = &buf[..write]; - - // debug!("write_data_blocks write buf less len {}", buf.len()); - writer.write_all(buf).await?; - // debug!("write_data_blocks write done len {}", buf.len()); - total_written += buf.len(); - break; - } - - writer.write_all(buf).await?; - let n = buf.len(); - - // debug!("write_data_blocks write done len {}", n); - write -= n; - total_written += n; - } - - Ok(total_written) - } - - pub fn total_shard_count(&self) -> usize { - self.data_shards + self.parity_shards - } - - #[tracing::instrument(level = "info", skip_all, fields(data_len=data.len()))] - pub fn encode_data(self: Arc, data: &[u8]) -> Result> { - let (shard_size, total_size) = self.need_size(data.len()); - - // Generate the total length required for all shards - let mut data_buffer = BytesMut::with_capacity(total_size); - - // Copy the source data - data_buffer.extend_from_slice(data); - data_buffer.resize(total_size, 0u8); - - { - // Perform EC encoding; the results go into data_buffer - let data_slices: SmallVec<[&mut [u8]; 16]> = data_buffer.chunks_exact_mut(shard_size).collect(); - - // Only perform EC encoding when parity shards are present - if self.parity_shards > 0 { - self.encoder.as_ref().unwrap().encode(data_slices).map_err(Error::other)?; - } - } - - // Zero-copy shards: every shard references data_buffer - let mut data_buffer = data_buffer.freeze(); - let mut shards = Vec::with_capacity(self.total_shard_count()); - for _ in 0..self.total_shard_count() { - let shard = data_buffer.split_to(shard_size); - shards.push(shard); - } - - Ok(shards) - } - - pub fn decode_data(&self, shards: &mut [Option>]) -> Result<()> { - if self.parity_shards > 0 { - self.encoder.as_ref().unwrap().reconstruct(shards).map_err(Error::other)?; - } - - Ok(()) - } - - // The length per shard and the total required length - fn need_size(&self, data_size: usize) -> (usize, usize) { - let shard_size = self.shard_size(data_size); - (shard_size, shard_size * (self.total_shard_count())) - } - - // Compute each shard size - pub fn shard_size(&self, data_size: usize) -> usize { - data_size.div_ceil(self.data_shards) - } - // returns final erasure size from original size. - pub fn shard_file_size(&self, total_size: usize) -> usize { - if total_size == 0 { - return 0; - } - - let num_shards = total_size / self.block_size; - let last_block_size = total_size % self.block_size; - let last_shard_size = last_block_size.div_ceil(self.data_shards); - num_shards * self.shard_size(self.block_size) + last_shard_size - - // When writing, EC pads the data so the last shard length should match - // if last_block_size != 0 { - // num_shards += 1 - // } - // num_shards * self.shard_size(self.block_size) - } - - // where erasure reading begins. 
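The erasure module is removed in this diff, but its sizing math is compact and easy to misread: each shard holds `ceil(data_size / data_shards)` bytes, and a shard file is one full shard per complete block plus a shorter shard for the final partial block. A standalone sketch of those two formulas with worked example numbers (the 4-shard / 1 MiB constants below are made up for illustration):

```rust
// Per-shard size for one block of data: ceil(data_size / data_shards).
fn shard_size(data_size: usize, data_shards: usize) -> usize {
    data_size.div_ceil(data_shards)
}

// Total on-disk size of one shard file for an object of total_size bytes.
fn shard_file_size(total_size: usize, block_size: usize, data_shards: usize) -> usize {
    if total_size == 0 {
        return 0;
    }
    let full_blocks = total_size / block_size;
    let last_block_size = total_size % block_size;
    let last_shard_size = last_block_size.div_ceil(data_shards);
    full_blocks * shard_size(block_size, data_shards) + last_shard_size
}

fn main() {
    // 4 data shards, 1 MiB blocks, a 2.5 MiB object:
    let (data_shards, block_size) = (4usize, 1usize << 20);
    let total_size = 2 * (1 << 20) + (1 << 19);
    // Two full blocks contribute 256 KiB each per shard; the 512 KiB tail adds 128 KiB.
    assert_eq!(shard_file_size(total_size, block_size, data_shards), 2 * (1 << 18) + (1 << 17));
}
```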
- pub fn shard_file_offset(&self, start_offset: usize, length: usize, total_length: usize) -> usize { - let shard_size = self.shard_size(self.block_size); - let shard_file_size = self.shard_file_size(total_length); - let end_shard = (start_offset + length) / self.block_size; - let mut till_offset = end_shard * shard_size + shard_size; - if till_offset > shard_file_size { - till_offset = shard_file_size; - } - - till_offset - } - - pub async fn heal( - &self, - writers: &mut [Option], - readers: Vec>, - total_length: usize, - _prefer: &[bool], - ) -> Result<()> { - info!( - "Erasure heal, writers len: {}, readers len: {}, total_length: {}", - writers.len(), - readers.len(), - total_length - ); - if writers.len() != self.parity_shards + self.data_shards { - return Err(Error::other("invalid argument")); - } - let mut reader = ShardReader::new(readers, self, 0, total_length); - - let start_block = 0; - let mut end_block = total_length / self.block_size; - if total_length % self.block_size != 0 { - end_block += 1; - } - - let mut errs = Vec::new(); - for _ in start_block..end_block { - let mut bufs = reader.read().await?; - - if self.parity_shards > 0 { - self.encoder.as_ref().unwrap().reconstruct(&mut bufs).map_err(Error::other)?; - } - - let shards = bufs.into_iter().flatten().map(Bytes::from).collect::>(); - if shards.len() != self.parity_shards + self.data_shards { - return Err(Error::other("can not reconstruct data")); - } - - for (i, w) in writers.iter_mut().enumerate() { - if w.is_none() { - continue; - } - match w.as_mut().unwrap().write(shards[i].clone()).await { - Ok(_) => {} - Err(e) => { - info!("write failed, err: {:?}", e); - errs.push(e); - } - } - } - } - if !errs.is_empty() { - return Err(errs[0].clone().into()); - } - - Ok(()) - } -} - -#[async_trait::async_trait] -pub trait Writer { - fn as_any(&self) -> &dyn Any; - async fn write(&mut self, buf: Bytes) -> Result<()>; - async fn close(&mut self) -> Result<()> { - Ok(()) - } -} - -#[async_trait::async_trait] -pub trait ReadAt { - async fn read_at(&mut self, offset: usize, length: usize) -> Result<(Vec, usize)>; -} - -pub struct ShardReader { - readers: Vec>, // Disk readers - data_block_count: usize, // Total number of shards - parity_block_count: usize, - shard_size: usize, // Block size per shard (read one block at a time) - shard_file_size: usize, // Total size of the shard file - offset: usize, // Offset within the shard -} - -impl ShardReader { - pub fn new(readers: Vec>, ec: &Erasure, offset: usize, total_length: usize) -> Self { - Self { - readers, - data_block_count: ec.data_shards, - parity_block_count: ec.parity_shards, - shard_size: ec.shard_size(ec.block_size), - shard_file_size: ec.shard_file_size(total_length), - offset: (offset / ec.block_size) * ec.shard_size(ec.block_size), - } - } - - pub async fn read(&mut self) -> Result>>> { - // let mut disks = self.readers; - let reader_length = self.readers.len(); - // Length of the block to read - let mut read_length = self.shard_size; - if self.offset + read_length > self.shard_file_size { - read_length = self.shard_file_size - self.offset - } - - if read_length == 0 { - return Ok(vec![None; reader_length]); - } - - // debug!("shard reader read offset {}, shard_size {}", self.offset, read_length); - - let mut futures = Vec::with_capacity(reader_length); - let mut errors = Vec::with_capacity(reader_length); - - let mut ress = Vec::with_capacity(reader_length); - - for disk in self.readers.iter_mut() { - // if disk.is_none() { - // ress.push(None); - // 
errors.push(Some(Error::new(DiskError::DiskNotFound))); - // continue; - // } - - // let disk: &mut BitrotReader = disk.as_mut().unwrap(); - let offset = self.offset; - futures.push(async move { - if let Some(disk) = disk { - disk.read_at(offset, read_length).await - } else { - Err(DiskError::DiskNotFound) - } - }); - } - - let results = join_all(futures).await; - for result in results { - match result { - Ok((res, _)) => { - ress.push(Some(res)); - errors.push(None); - } - Err(e) => { - ress.push(None); - errors.push(Some(e)); - } - } - } - - if !self.can_decode(&ress) { - warn!("ec decode read ress {:?}", &ress); - warn!("ec decode read errors {:?}", &errors); - - return Err(Error::other("shard reader read failed")); - } - - self.offset += self.shard_size; - - Ok(ress) - } - - fn can_decode(&self, bufs: &[Option>]) -> bool { - let c = bufs.iter().filter(|v| v.is_some()).count(); - if self.parity_block_count > 0 { - c >= self.data_block_count - } else { - c == self.data_block_count - } - } -} - -// fn shards_to_option_shards(shards: &[Vec]) -> Vec>> { -// let mut result = Vec::with_capacity(shards.len()); - -// for v in shards.iter() { -// let inner: Vec = v.clone(); -// result.push(Some(inner)); -// } -// result -// } - -#[cfg(test)] -mod test { - use super::*; - - #[test] - fn test_erasure() { - let data_shards = 3; - let parity_shards = 2; - let data: &[u8] = &[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]; - let ec = Erasure::new(data_shards, parity_shards, 1); - let shards = Arc::new(ec).encode_data(data).unwrap(); - println!("shards:{:?}", shards); - - let mut s: Vec<_> = shards - .iter() - .map(|d| if d.is_empty() { None } else { Some(d.to_vec()) }) - .collect(); - - // let mut s = shards_to_option_shards(&shards); - - // s[0] = None; - s[4] = None; - s[3] = None; - - println!("sss:{:?}", &s); - - let ec = Erasure::new(data_shards, parity_shards, 1); - ec.decode_data(&mut s).unwrap(); - // ec.encoder.reconstruct(&mut s).unwrap(); - - println!("sss:{:?}", &s); - } -} diff --git a/crates/ecstore/src/lib.rs b/crates/ecstore/src/lib.rs index 3194f2b8..d8ea3440 100644 --- a/crates/ecstore/src/lib.rs +++ b/crates/ecstore/src/lib.rs @@ -20,7 +20,6 @@ pub mod batch_processor; pub mod bitrot; pub mod bucket; pub mod cache_value; -mod chunk_stream; pub mod compress; pub mod config; pub mod data_usage; diff --git a/crates/ecstore/src/metrics_realtime.rs b/crates/ecstore/src/metrics_realtime.rs index a0f711e1..2bbe6456 100644 --- a/crates/ecstore/src/metrics_realtime.rs +++ b/crates/ecstore/src/metrics_realtime.rs @@ -19,11 +19,7 @@ use crate::{ // utils::os::get_drive_stats, }; use chrono::Utc; -use rustfs_common::{ - globals::{GLOBAL_Local_Node_Name, GLOBAL_Rustfs_Addr}, - heal_channel::DriveState, - metrics::global_metrics, -}; +use rustfs_common::{GLOBAL_LOCAL_NODE_NAME, GLOBAL_RUSTFS_ADDR, heal_channel::DriveState, metrics::global_metrics}; use rustfs_madmin::metrics::{DiskIOStats, DiskMetric, RealtimeMetrics}; use rustfs_utils::os::get_drive_stats; use serde::{Deserialize, Serialize}; @@ -86,7 +82,7 @@ pub async fn collect_local_metrics(types: MetricType, opts: &CollectMetricsOpts) return real_time_metrics; } - let mut by_host_name = GLOBAL_Rustfs_Addr.read().await.clone(); + let mut by_host_name = GLOBAL_RUSTFS_ADDR.read().await.clone(); if !opts.hosts.is_empty() { let server = get_local_server_property().await; if opts.hosts.contains(&server.endpoint) { @@ -95,7 +91,7 @@ pub async fn collect_local_metrics(types: MetricType, opts: &CollectMetricsOpts) return real_time_metrics; } } - let 
local_node_name = GLOBAL_Local_Node_Name.read().await.clone(); + let local_node_name = GLOBAL_LOCAL_NODE_NAME.read().await.clone(); if by_host_name.starts_with(":") && !local_node_name.starts_with(":") { by_host_name = local_node_name; } diff --git a/crates/ecstore/src/pools.rs b/crates/ecstore/src/pools.rs index f503ff9d..f2e0d19d 100644 --- a/crates/ecstore/src/pools.rs +++ b/crates/ecstore/src/pools.rs @@ -440,11 +440,11 @@ impl PoolMeta { } } -fn path2_bucket_object(name: &str) -> (String, String) { +pub fn path2_bucket_object(name: &str) -> (String, String) { path2_bucket_object_with_base_path("", name) } -fn path2_bucket_object_with_base_path(base_path: &str, path: &str) -> (String, String) { +pub fn path2_bucket_object_with_base_path(base_path: &str, path: &str) -> (String, String) { // Trim the base path and leading slash let trimmed_path = path .strip_prefix(base_path) @@ -452,7 +452,9 @@ fn path2_bucket_object_with_base_path(base_path: &str, path: &str) -> (String, S .strip_prefix(SLASH_SEPARATOR) .unwrap_or(path); // Find the position of the first '/' - let pos = trimmed_path.find(SLASH_SEPARATOR).unwrap_or(trimmed_path.len()); + let Some(pos) = trimmed_path.find(SLASH_SEPARATOR) else { + return (trimmed_path.to_string(), "".to_string()); + }; // Split into bucket and prefix let bucket = &trimmed_path[0..pos]; let prefix = &trimmed_path[pos + 1..]; // +1 to skip the '/' character if it exists diff --git a/crates/ecstore/src/rpc/peer_s3_client.rs b/crates/ecstore/src/rpc/peer_s3_client.rs index ac0a035c..fe251a3e 100644 --- a/crates/ecstore/src/rpc/peer_s3_client.rs +++ b/crates/ecstore/src/rpc/peer_s3_client.rs @@ -13,14 +13,18 @@ // limitations under the License. use crate::bucket::metadata_sys; +use crate::disk::error::DiskError; use crate::disk::error::{Error, Result}; use crate::disk::error_reduce::{BUCKET_OP_IGNORED_ERRS, is_all_buckets_not_found, reduce_write_quorum_errs}; -use crate::disk::{DiskAPI, DiskStore}; +use crate::disk::{DiskAPI, DiskStore, disk_store::get_max_timeout_duration}; use crate::global::GLOBAL_LOCAL_DISK_MAP; use crate::store::all_local_disk; use crate::store_utils::is_reserved_or_invalid_bucket; use crate::{ - disk::{self, VolumeInfo}, + disk::{ + self, VolumeInfo, + disk_store::{CHECK_EVERY, CHECK_TIMEOUT_DURATION, DiskHealthTracker}, + }, endpoints::{EndpointServerPools, Node}, store_api::{BucketInfo, BucketOptions, DeleteBucketOptions, MakeBucketOptions}, }; @@ -32,10 +36,11 @@ use rustfs_protos::node_service_time_out_client; use rustfs_protos::proto_gen::node_service::{ DeleteBucketRequest, GetBucketInfoRequest, HealBucketRequest, ListBucketRequest, MakeBucketRequest, }; -use std::{collections::HashMap, fmt::Debug, sync::Arc}; -use tokio::sync::RwLock; +use std::{collections::HashMap, fmt::Debug, sync::Arc, time::Duration}; +use tokio::{net::TcpStream, sync::RwLock, time}; +use tokio_util::sync::CancellationToken; use tonic::Request; -use tracing::info; +use tracing::{debug, info, warn}; type Client = Arc>; @@ -559,16 +564,160 @@ pub struct RemotePeerS3Client { pub node: Option, pub pools: Option>, addr: String, + /// Health tracker for connection monitoring + health: Arc, + /// Cancellation token for monitoring tasks + cancel_token: CancellationToken, } impl RemotePeerS3Client { pub fn new(node: Option, pools: Option>) -> Self { let addr = node.as_ref().map(|v| v.url.to_string()).unwrap_or_default().to_string(); - Self { node, pools, addr } + let client = Self { + node, + pools, + addr, + health: Arc::new(DiskHealthTracker::new()), + 
cancel_token: CancellationToken::new(), + }; + + // Start health monitoring + client.start_health_monitoring(); + + client } + pub fn get_addr(&self) -> String { self.addr.clone() } + + /// Start health monitoring for the remote peer + fn start_health_monitoring(&self) { + let health = Arc::clone(&self.health); + let cancel_token = self.cancel_token.clone(); + let addr = self.addr.clone(); + + tokio::spawn(async move { + Self::monitor_remote_peer_health(addr, health, cancel_token).await; + }); + } + + /// Monitor remote peer health periodically + async fn monitor_remote_peer_health(addr: String, health: Arc, cancel_token: CancellationToken) { + let mut interval = time::interval(CHECK_EVERY); + + loop { + tokio::select! { + _ = cancel_token.cancelled() => { + debug!("Health monitoring cancelled for remote peer: {}", addr); + return; + } + _ = interval.tick() => { + if cancel_token.is_cancelled() { + return; + } + + // Skip health check if peer is already marked as faulty + if health.is_faulty() { + continue; + } + + // Perform basic connectivity check + if Self::perform_connectivity_check(&addr).await.is_err() && health.swap_ok_to_faulty() { + warn!("Remote peer health check failed for {}: marking as faulty", addr); + + // Start recovery monitoring + let health_clone = Arc::clone(&health); + let addr_clone = addr.clone(); + let cancel_clone = cancel_token.clone(); + + tokio::spawn(async move { + Self::monitor_remote_peer_recovery(addr_clone, health_clone, cancel_clone).await; + }); + } + } + } + } + } + + /// Monitor remote peer recovery and mark as healthy when recovered + async fn monitor_remote_peer_recovery(addr: String, health: Arc, cancel_token: CancellationToken) { + let mut interval = time::interval(Duration::from_secs(5)); // Check every 5 seconds + + loop { + tokio::select! 
{ + _ = cancel_token.cancelled() => { + return; + } + _ = interval.tick() => { + if Self::perform_connectivity_check(&addr).await.is_ok() { + info!("Remote peer recovered: {}", addr); + health.set_ok(); + return; + } + } + } + } + } + + /// Perform basic connectivity check for remote peer + async fn perform_connectivity_check(addr: &str) -> Result<()> { + use tokio::time::timeout; + + let url = url::Url::parse(addr).map_err(|e| Error::other(format!("Invalid URL: {}", e)))?; + + let Some(host) = url.host_str() else { + return Err(Error::other("No host in URL".to_string())); + }; + + let port = url.port_or_known_default().unwrap_or(80); + + // Try to establish TCP connection + match timeout(CHECK_TIMEOUT_DURATION, TcpStream::connect((host, port))).await { + Ok(Ok(_)) => Ok(()), + _ => Err(Error::other(format!("Cannot connect to {}:{}", host, port))), + } + } + + /// Execute operation with timeout and health tracking + async fn execute_with_timeout(&self, operation: F, timeout_duration: Duration) -> Result + where + F: FnOnce() -> Fut, + Fut: std::future::Future>, + { + // Check if peer is faulty + if self.health.is_faulty() { + return Err(DiskError::FaultyDisk); + } + + // Record operation start + let now = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_nanos() as i64; + self.health.last_started.store(now, std::sync::atomic::Ordering::Relaxed); + self.health.increment_waiting(); + + // Execute operation with timeout + let result = time::timeout(timeout_duration, operation()).await; + + match result { + Ok(operation_result) => { + // Log success and decrement waiting counter + if operation_result.is_ok() { + self.health.log_success(); + } + self.health.decrement_waiting(); + operation_result + } + Err(_) => { + // Timeout occurred, mark peer as potentially faulty + self.health.decrement_waiting(); + warn!("Remote peer operation timeout after {:?}", timeout_duration); + Err(Error::other(format!("Remote peer operation timeout after {:?}", timeout_duration))) + } + } + } } #[async_trait] @@ -578,115 +727,145 @@ impl PeerS3Client for RemotePeerS3Client { } async fn heal_bucket(&self, bucket: &str, opts: &HealOpts) -> Result { - let options: String = serde_json::to_string(opts)?; - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(HealBucketRequest { - bucket: bucket.to_string(), - options, - }); - let response = client.heal_bucket(request).await?.into_inner(); - if !response.success { - return if let Some(err) = response.error { - Err(err.into()) - } else { - Err(Error::other("")) - }; - } + self.execute_with_timeout( + || async { + let options: String = serde_json::to_string(opts)?; + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(HealBucketRequest { + bucket: bucket.to_string(), + options, + }); + let response = client.heal_bucket(request).await?.into_inner(); + if !response.success { + return if let Some(err) = response.error { + Err(err.into()) + } else { + Err(Error::other("")) + }; + } - Ok(HealResultItem { - heal_item_type: HealItemType::Bucket.to_string(), - bucket: bucket.to_string(), - set_count: 0, - ..Default::default() - }) + Ok(HealResultItem { + heal_item_type: HealItemType::Bucket.to_string(), + bucket: bucket.to_string(), + set_count: 0, + ..Default::default() + }) + }, + 
get_max_timeout_duration(), + ) + .await } async fn list_bucket(&self, opts: &BucketOptions) -> Result> { - let options = serde_json::to_string(opts)?; - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(ListBucketRequest { options }); - let response = client.list_bucket(request).await?.into_inner(); - if !response.success { - return if let Some(err) = response.error { - Err(err.into()) - } else { - Err(Error::other("")) - }; - } - let bucket_infos = response - .bucket_infos - .into_iter() - .filter_map(|json_str| serde_json::from_str::(&json_str).ok()) - .collect(); + self.execute_with_timeout( + || async { + let options = serde_json::to_string(opts)?; + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(ListBucketRequest { options }); + let response = client.list_bucket(request).await?.into_inner(); + if !response.success { + return if let Some(err) = response.error { + Err(err.into()) + } else { + Err(Error::other("")) + }; + } + let bucket_infos = response + .bucket_infos + .into_iter() + .filter_map(|json_str| serde_json::from_str::(&json_str).ok()) + .collect(); - Ok(bucket_infos) + Ok(bucket_infos) + }, + get_max_timeout_duration(), + ) + .await } async fn make_bucket(&self, bucket: &str, opts: &MakeBucketOptions) -> Result<()> { - let options = serde_json::to_string(opts)?; - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(MakeBucketRequest { - name: bucket.to_string(), - options, - }); - let response = client.make_bucket(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let options = serde_json::to_string(opts)?; + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(MakeBucketRequest { + name: bucket.to_string(), + options, + }); + let response = client.make_bucket(request).await?.into_inner(); - // TODO: deal with error - if !response.success { - return if let Some(err) = response.error { - Err(err.into()) - } else { - Err(Error::other("")) - }; - } + // TODO: deal with error + if !response.success { + return if let Some(err) = response.error { + Err(err.into()) + } else { + Err(Error::other("")) + }; + } - Ok(()) + Ok(()) + }, + get_max_timeout_duration(), + ) + .await } async fn get_bucket_info(&self, bucket: &str, opts: &BucketOptions) -> Result { - let options = serde_json::to_string(opts)?; - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(GetBucketInfoRequest { - bucket: bucket.to_string(), - options, - }); - let response = client.get_bucket_info(request).await?.into_inner(); - if !response.success { - return if let Some(err) = response.error { - Err(err.into()) - } else { - Err(Error::other("")) - }; - } - let bucket_info = serde_json::from_str::(&response.bucket_info)?; + self.execute_with_timeout( + || async { + let options = serde_json::to_string(opts)?; + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(GetBucketInfoRequest { 
+ bucket: bucket.to_string(), + options, + }); + let response = client.get_bucket_info(request).await?.into_inner(); + if !response.success { + return if let Some(err) = response.error { + Err(err.into()) + } else { + Err(Error::other("")) + }; + } + let bucket_info = serde_json::from_str::(&response.bucket_info)?; - Ok(bucket_info) + Ok(bucket_info) + }, + get_max_timeout_duration(), + ) + .await } async fn delete_bucket(&self, bucket: &str, _opts: &DeleteBucketOptions) -> Result<()> { - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(DeleteBucketRequest { - bucket: bucket.to_string(), - }); - let response = client.delete_bucket(request).await?.into_inner(); - if !response.success { - return if let Some(err) = response.error { - Err(err.into()) - } else { - Err(Error::other("")) - }; - } + let request = Request::new(DeleteBucketRequest { + bucket: bucket.to_string(), + }); + let response = client.delete_bucket(request).await?.into_inner(); + if !response.success { + return if let Some(err) = response.error { + Err(err.into()) + } else { + Err(Error::other("")) + }; + } - Ok(()) + Ok(()) + }, + get_max_timeout_duration(), + ) + .await } } diff --git a/crates/ecstore/src/rpc/remote_disk.rs b/crates/ecstore/src/rpc/remote_disk.rs index 5e024f0b..6b3dac54 100644 --- a/crates/ecstore/src/rpc/remote_disk.rs +++ b/crates/ecstore/src/rpc/remote_disk.rs @@ -12,37 +12,50 @@ // See the License for the specific language governing permissions and // limitations under the License. 
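Every RPC in the peer client above, and in the remote disk below, is funnelled through the same `execute_with_timeout` shape: bail out immediately if the health tracker already marks the endpoint as faulty, run the request under `tokio::time::timeout`, then log success or surface a timeout error. A reduced sketch of that shape, with a bare `AtomicBool` standing in for the real `DiskHealthTracker` and its waiting/last-started bookkeeping (tokio with the `rt` and `macros` features is assumed):

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Duration;
use tokio::time;

// Minimal stand-in for the health tracker used in the diff.
struct Health {
    faulty: AtomicBool,
}

async fn execute_with_timeout<F, Fut, T>(health: &Arc<Health>, op: F, limit: Duration) -> Result<T, String>
where
    F: FnOnce() -> Fut,
    Fut: std::future::Future<Output = Result<T, String>>,
{
    // Fail fast when the endpoint is already known to be unhealthy.
    if health.faulty.load(Ordering::Relaxed) {
        return Err("endpoint marked faulty".into());
    }
    match time::timeout(limit, op()).await {
        Ok(result) => result,
        // The timeout itself is reported as an error; a separate monitor task
        // decides whether to flip the endpoint to faulty.
        Err(_) => Err(format!("operation timed out after {limit:?}")),
    }
}

#[tokio::main]
async fn main() {
    let health = Arc::new(Health { faulty: AtomicBool::new(false) });
    let res = execute_with_timeout(&health, || async { Ok::<_, String>(42) }, Duration::from_secs(1)).await;
    assert_eq!(res, Ok(42));
}
```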
-use std::{path::PathBuf, time::Duration}; - +use crate::disk::{ + CheckPartsResp, DeleteOptions, DiskAPI, DiskInfo, DiskInfoOptions, DiskLocation, DiskOption, FileInfoVersions, + ReadMultipleReq, ReadMultipleResp, ReadOptions, RenameDataResp, UpdateMetadataOpts, VolumeInfo, WalkDirOptions, + disk_store::{ + CHECK_EVERY, CHECK_TIMEOUT_DURATION, ENV_RUSTFS_DRIVE_ACTIVE_MONITORING, SKIP_IF_SUCCESS_BEFORE, get_max_timeout_duration, + }, + endpoint::Endpoint, + local::ScanGuard, +}; +use crate::disk::{FileReader, FileWriter}; +use crate::disk::{disk_store::DiskHealthTracker, error::DiskError}; +use crate::{ + disk::error::{Error, Result}, + rpc::build_auth_headers, +}; use bytes::Bytes; use futures::lock::Mutex; use http::{HeaderMap, HeaderValue, Method, header::CONTENT_TYPE}; +use rustfs_filemeta::{FileInfo, ObjectPartInfo, RawFileInfo}; +use rustfs_protos::proto_gen::node_service::RenamePartRequest; use rustfs_protos::{ node_service_time_out_client, proto_gen::node_service::{ CheckPartsRequest, DeletePathsRequest, DeleteRequest, DeleteVersionRequest, DeleteVersionsRequest, DeleteVolumeRequest, DiskInfoRequest, ListDirRequest, ListVolumesRequest, MakeVolumeRequest, MakeVolumesRequest, ReadAllRequest, - ReadMultipleRequest, ReadPartsRequest, ReadVersionRequest, ReadXlRequest, RenameDataRequest, RenameFileRequest, - StatVolumeRequest, UpdateMetadataRequest, VerifyFileRequest, WriteAllRequest, WriteMetadataRequest, + ReadMetadataRequest, ReadMultipleRequest, ReadPartsRequest, ReadVersionRequest, ReadXlRequest, RenameDataRequest, + RenameFileRequest, StatVolumeRequest, UpdateMetadataRequest, VerifyFileRequest, WriteAllRequest, WriteMetadataRequest, }, }; - -use crate::disk::{ - CheckPartsResp, DeleteOptions, DiskAPI, DiskInfo, DiskInfoOptions, DiskLocation, DiskOption, FileInfoVersions, - ReadMultipleReq, ReadMultipleResp, ReadOptions, RenameDataResp, UpdateMetadataOpts, VolumeInfo, WalkDirOptions, - endpoint::Endpoint, -}; -use crate::disk::{FileReader, FileWriter}; -use crate::{ - disk::error::{Error, Result}, - rpc::build_auth_headers, -}; -use rustfs_filemeta::{FileInfo, ObjectPartInfo, RawFileInfo}; -use rustfs_protos::proto_gen::node_service::RenamePartRequest; use rustfs_rio::{HttpReader, HttpWriter}; +use rustfs_utils::string::parse_bool_with_default; +use std::{ + path::PathBuf, + sync::{ + Arc, + atomic::{AtomicU32, Ordering}, + }, + time::Duration, +}; +use tokio::time; use tokio::{io::AsyncWrite, net::TcpStream, time::timeout}; +use tokio_util::sync::CancellationToken; use tonic::Request; -use tracing::{debug, info}; +use tracing::{debug, info, warn}; use uuid::Uuid; #[derive(Debug)] @@ -52,12 +65,17 @@ pub struct RemoteDisk { pub url: url::Url, pub root: PathBuf, endpoint: Endpoint, + pub scanning: Arc, + /// Whether health checking is enabled + health_check: bool, + /// Health tracker for connection monitoring + health: Arc, + /// Cancellation token for monitoring tasks + cancel_token: CancellationToken, } -const REMOTE_DISK_ONLINE_PROBE_TIMEOUT: Duration = Duration::from_millis(750); - impl RemoteDisk { - pub async fn new(ep: &Endpoint, _opt: &DiskOption) -> Result { + pub async fn new(ep: &Endpoint, opt: &DiskOption) -> Result { // let root = fs::canonicalize(ep.url.path()).await?; let root = PathBuf::from(ep.get_file_path()); let addr = if let Some(port) = ep.url.port() { @@ -65,13 +83,185 @@ impl RemoteDisk { } else { format!("{}://{}", ep.url.scheme(), ep.url.host_str().unwrap()) }; - Ok(Self { + + let env_health_check = std::env::var(ENV_RUSTFS_DRIVE_ACTIVE_MONITORING) + 
.map(|v| parse_bool_with_default(&v, true)) + .unwrap_or(true); + + let disk = Self { id: Mutex::new(None), - addr, + addr: addr.clone(), url: ep.url.clone(), root, endpoint: ep.clone(), - }) + scanning: Arc::new(AtomicU32::new(0)), + health_check: opt.health_check && env_health_check, + health: Arc::new(DiskHealthTracker::new()), + cancel_token: CancellationToken::new(), + }; + + // Start health monitoring + disk.start_health_monitoring(); + + Ok(disk) + } + + /// Start health monitoring for the remote disk + fn start_health_monitoring(&self) { + if self.health_check { + let health = Arc::clone(&self.health); + let cancel_token = self.cancel_token.clone(); + let addr = self.addr.clone(); + + tokio::spawn(async move { + Self::monitor_remote_disk_health(addr, health, cancel_token).await; + }); + } + } + + /// Monitor remote disk health periodically + async fn monitor_remote_disk_health(addr: String, health: Arc, cancel_token: CancellationToken) { + let mut interval = time::interval(CHECK_EVERY); + + // Perform basic connectivity check + if Self::perform_connectivity_check(&addr).await.is_err() && health.swap_ok_to_faulty() { + warn!("Remote disk health check failed for {}: marking as faulty", addr); + + // Start recovery monitoring + let health_clone = Arc::clone(&health); + let addr_clone = addr.clone(); + let cancel_clone = cancel_token.clone(); + + tokio::spawn(async move { + Self::monitor_remote_disk_recovery(addr_clone, health_clone, cancel_clone).await; + }); + } + + loop { + tokio::select! { + _ = cancel_token.cancelled() => { + debug!("Health monitoring cancelled for remote disk: {}", addr); + return; + } + _ = interval.tick() => { + if cancel_token.is_cancelled() { + return; + } + + // Skip health check if disk is already marked as faulty + if health.is_faulty() { + continue; + } + + let last_success_nanos = health.last_success.load(Ordering::Relaxed); + let elapsed = Duration::from_nanos( + (std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_nanos() as i64 - last_success_nanos) as u64 + ); + + if elapsed < SKIP_IF_SUCCESS_BEFORE { + continue; + } + + // Perform basic connectivity check + if Self::perform_connectivity_check(&addr).await.is_err() && health.swap_ok_to_faulty() { + warn!("Remote disk health check failed for {}: marking as faulty", addr); + + // Start recovery monitoring + let health_clone = Arc::clone(&health); + let addr_clone = addr.clone(); + let cancel_clone = cancel_token.clone(); + + tokio::spawn(async move { + Self::monitor_remote_disk_recovery(addr_clone, health_clone, cancel_clone).await; + }); + } + } + } + } + } + + /// Monitor remote disk recovery and mark as healthy when recovered + async fn monitor_remote_disk_recovery(addr: String, health: Arc, cancel_token: CancellationToken) { + let mut interval = time::interval(CHECK_EVERY); + + loop { + tokio::select! 
{ + _ = cancel_token.cancelled() => { + return; + } + _ = interval.tick() => { + if Self::perform_connectivity_check(&addr).await.is_ok() { + info!("Remote disk recovered: {}", addr); + health.set_ok(); + return; + } + } + } + } + } + + /// Perform basic connectivity check for remote disk + async fn perform_connectivity_check(addr: &str) -> Result<()> { + let url = url::Url::parse(addr).map_err(|e| Error::other(format!("Invalid URL: {}", e)))?; + + let Some(host) = url.host_str() else { + return Err(Error::other("No host in URL".to_string())); + }; + + let port = url.port_or_known_default().unwrap_or(80); + + // Try to establish TCP connection + match timeout(CHECK_TIMEOUT_DURATION, TcpStream::connect((host, port))).await { + Ok(Ok(stream)) => { + drop(stream); + Ok(()) + } + _ => Err(Error::other(format!("Cannot connect to {}:{}", host, port))), + } + } + + /// Execute operation with timeout and health tracking + async fn execute_with_timeout(&self, operation: F, timeout_duration: Duration) -> Result + where + F: FnOnce() -> Fut, + Fut: std::future::Future>, + { + // Check if disk is faulty + if self.health.is_faulty() { + warn!("remote disk {} health is faulty, returning error", self.to_string()); + return Err(DiskError::FaultyDisk); + } + + // Record operation start + let now = std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_nanos() as i64; + self.health.last_started.store(now, std::sync::atomic::Ordering::Relaxed); + self.health.increment_waiting(); + + // Execute operation with timeout + let result = time::timeout(timeout_duration, operation()).await; + + match result { + Ok(operation_result) => { + // Log success and decrement waiting counter + if operation_result.is_ok() { + self.health.log_success(); + } + self.health.decrement_waiting(); + operation_result + } + Err(_) => { + // Timeout occurred, mark disk as potentially faulty + self.health.decrement_waiting(); + warn!("Remote disk operation timeout after {:?}", timeout_duration); + Err(Error::other(format!("Remote disk operation timeout after {:?}", timeout_duration))) + } + } } } @@ -85,19 +275,8 @@ impl DiskAPI for RemoteDisk { #[tracing::instrument(skip(self))] async fn is_online(&self) -> bool { - let Some(host) = self.endpoint.url.host_str().map(|host| host.to_string()) else { - return false; - }; - - let port = self.endpoint.url.port_or_known_default().unwrap_or(80); - - match timeout(REMOTE_DISK_ONLINE_PROBE_TIMEOUT, TcpStream::connect((host, port))).await { - Ok(Ok(stream)) => { - drop(stream); - true - } - _ => false, - } + // If disk is marked as faulty, consider it offline + !self.health.is_faulty() } #[tracing::instrument(skip(self))] @@ -114,6 +293,7 @@ impl DiskAPI for RemoteDisk { } #[tracing::instrument(skip(self))] async fn close(&self) -> Result<()> { + self.cancel_token.cancel(); Ok(()) } #[tracing::instrument(skip(self))] @@ -164,108 +344,143 @@ impl DiskAPI for RemoteDisk { #[tracing::instrument(skip(self))] async fn make_volume(&self, volume: &str) -> Result<()> { info!("make_volume"); - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(MakeVolumeRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - }); - let response = client.make_volume(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not 
get client, err: {err}")))?; + let request = Request::new(MakeVolumeRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.make_volume(request).await?.into_inner(); - Ok(()) + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } + + Ok(()) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn make_volumes(&self, volumes: Vec<&str>) -> Result<()> { info!("make_volumes"); - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(MakeVolumesRequest { - disk: self.endpoint.to_string(), - volumes: volumes.iter().map(|s| (*s).to_string()).collect(), - }); - let response = client.make_volumes(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(MakeVolumesRequest { + disk: self.endpoint.to_string(), + volumes: volumes.iter().map(|s| (*s).to_string()).collect(), + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.make_volumes(request).await?.into_inner(); - Ok(()) + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } + + Ok(()) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn list_volumes(&self) -> Result> { info!("list_volumes"); - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(ListVolumesRequest { - disk: self.endpoint.to_string(), - }); - let response = client.list_volumes(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(ListVolumesRequest { + disk: self.endpoint.to_string(), + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.list_volumes(request).await?.into_inner(); - let infos = response - .volume_infos - .into_iter() - .filter_map(|json_str| serde_json::from_str::(&json_str).ok()) - .collect(); + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } - Ok(infos) + let infos = response + .volume_infos + .into_iter() + .filter_map(|json_str| serde_json::from_str::(&json_str).ok()) + .collect(); + + Ok(infos) + }, + Duration::ZERO, + ) + .await } #[tracing::instrument(skip(self))] async fn stat_volume(&self, volume: &str) -> Result { info!("stat_volume"); - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(StatVolumeRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - }); - let response = client.stat_volume(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(StatVolumeRequest { + disk: 
self.endpoint.to_string(), + volume: volume.to_string(), + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.stat_volume(request).await?.into_inner(); - let volume_info = serde_json::from_str::(&response.volume_info)?; + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } - Ok(volume_info) + let volume_info = serde_json::from_str::(&response.volume_info)?; + + Ok(volume_info) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn delete_volume(&self, volume: &str) -> Result<()> { info!("delete_volume {}/{}", self.endpoint.to_string(), volume); - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(DeleteVolumeRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - }); - let response = client.delete_volume(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(DeleteVolumeRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.delete_volume(request).await?.into_inner(); - Ok(()) + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } + + Ok(()) + }, + Duration::ZERO, + ) + .await } // // FIXME: TODO: use writer @@ -328,36 +543,47 @@ impl DiskAPI for RemoteDisk { opts: DeleteOptions, ) -> Result<()> { info!("delete_version"); - let file_info = serde_json::to_string(&fi)?; - let opts = serde_json::to_string(&opts)?; - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(DeleteVersionRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - path: path.to_string(), - file_info, - force_del_marker, - opts, - }); + self.execute_with_timeout( + || async { + let file_info = serde_json::to_string(&fi)?; + let opts = serde_json::to_string(&opts)?; - let response = client.delete_version(request).await?.into_inner(); + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(DeleteVersionRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + path: path.to_string(), + file_info, + force_del_marker, + opts, + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.delete_version(request).await?.into_inner(); - // let raw_file_info = serde_json::from_str::(&response.raw_file_info)?; + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } - Ok(()) + // let raw_file_info = serde_json::from_str::(&response.raw_file_info)?; + + Ok(()) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn delete_versions(&self, volume: &str, versions: Vec, opts: DeleteOptions) -> Vec> { info!("delete_versions"); + if self.health.is_faulty() { + return vec![Some(DiskError::FaultyDisk); versions.len()]; + } + let opts = match serde_json::to_string(&opts) { Ok(opts) => opts, Err(err) => { @@ -401,12 +627,24 @@ impl 
DiskAPI for RemoteDisk { // TODO: use Error not string - let response = match client.delete_versions(request).await { + let result = self + .execute_with_timeout( + || async { + client + .delete_versions(request) + .await + .map_err(|err| Error::other(format!("delete_versions failed: {err}"))) + }, + get_max_timeout_duration(), + ) + .await; + + let response = match result { Ok(response) => response, Err(err) => { let mut errors = Vec::with_capacity(versions.len()); for _ in 0..versions.len() { - errors.push(Some(Error::other(err.to_string()))); + errors.push(Some(err.clone())); } return errors; } @@ -437,71 +675,110 @@ impl DiskAPI for RemoteDisk { async fn delete_paths(&self, volume: &str, paths: &[String]) -> Result<()> { info!("delete_paths"); let paths = paths.to_owned(); - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(DeletePathsRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - paths, - }); - let response = client.delete_paths(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(DeletePathsRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + paths: paths.clone(), + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.delete_paths(request).await?.into_inner(); - Ok(()) + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } + + Ok(()) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn write_metadata(&self, _org_volume: &str, volume: &str, path: &str, fi: FileInfo) -> Result<()> { info!("write_metadata {}/{}", volume, path); let file_info = serde_json::to_string(&fi)?; + + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(WriteMetadataRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + path: path.to_string(), + file_info: file_info.clone(), + }); + + let response = client.write_metadata(request).await?.into_inner(); + + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } + + Ok(()) + }, + get_max_timeout_duration(), + ) + .await + } + + async fn read_metadata(&self, volume: &str, path: &str) -> Result { let mut client = node_service_time_out_client(&self.addr) .await .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(WriteMetadataRequest { - disk: self.endpoint.to_string(), + let request = Request::new(ReadMetadataRequest { volume: volume.to_string(), path: path.to_string(), - file_info, + disk: self.endpoint.to_string(), }); - let response = client.write_metadata(request).await?.into_inner(); + let response = client.read_metadata(request).await?.into_inner(); if !response.success { return Err(response.error.unwrap_or_default().into()); } - Ok(()) + Ok(response.data) } #[tracing::instrument(skip(self))] async fn update_metadata(&self, volume: &str, path: &str, fi: FileInfo, opts: &UpdateMetadataOpts) -> Result<()> { info!("update_metadata"); let file_info = serde_json::to_string(&fi)?; - let opts = 
serde_json::to_string(&opts)?; + let opts_str = serde_json::to_string(&opts)?; - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(UpdateMetadataRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - path: path.to_string(), - file_info, - opts, - }); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(UpdateMetadataRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + path: path.to_string(), + file_info: file_info.clone(), + opts: opts_str.clone(), + }); - let response = client.update_metadata(request).await?.into_inner(); + let response = client.update_metadata(request).await?.into_inner(); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } - Ok(()) + Ok(()) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] @@ -514,51 +791,65 @@ impl DiskAPI for RemoteDisk { opts: &ReadOptions, ) -> Result { info!("read_version"); - let opts = serde_json::to_string(opts)?; - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(ReadVersionRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - path: path.to_string(), - version_id: version_id.to_string(), - opts, - }); + let opts_str = serde_json::to_string(opts)?; - let response = client.read_version(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(ReadVersionRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + path: path.to_string(), + version_id: version_id.to_string(), + opts: opts_str.clone(), + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.read_version(request).await?.into_inner(); - let file_info = serde_json::from_str::(&response.file_info)?; + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } - Ok(file_info) + let file_info = serde_json::from_str::(&response.file_info)?; + + Ok(file_info) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(level = "debug", skip(self))] async fn read_xl(&self, volume: &str, path: &str, read_data: bool) -> Result { info!("read_xl {}/{}/{}", self.endpoint.to_string(), volume, path); - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(ReadXlRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - path: path.to_string(), - read_data, - }); - let response = client.read_xl(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(ReadXlRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + path: path.to_string(), + read_data, 
+ }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.read_xl(request).await?.into_inner(); - let raw_file_info = serde_json::from_str::(&response.raw_file_info)?; + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } - Ok(raw_file_info) + let raw_file_info = serde_json::from_str::(&response.raw_file_info)?; + + Ok(raw_file_info) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] @@ -571,33 +862,45 @@ impl DiskAPI for RemoteDisk { dst_path: &str, ) -> Result { info!("rename_data {}/{}/{}/{}", self.addr, self.endpoint.to_string(), dst_volume, dst_path); - let file_info = serde_json::to_string(&fi)?; - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(RenameDataRequest { - disk: self.endpoint.to_string(), - src_volume: src_volume.to_string(), - src_path: src_path.to_string(), - file_info, - dst_volume: dst_volume.to_string(), - dst_path: dst_path.to_string(), - }); - let response = client.rename_data(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let file_info = serde_json::to_string(&fi)?; + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(RenameDataRequest { + disk: self.endpoint.to_string(), + src_volume: src_volume.to_string(), + src_path: src_path.to_string(), + file_info, + dst_volume: dst_volume.to_string(), + dst_path: dst_path.to_string(), + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.rename_data(request).await?.into_inner(); - let rename_data_resp = serde_json::from_str::(&response.rename_data_resp)?; + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } - Ok(rename_data_resp) + let rename_data_resp = serde_json::from_str::(&response.rename_data_resp)?; + + Ok(rename_data_resp) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn list_dir(&self, _origvolume: &str, volume: &str, dir_path: &str, count: i32) -> Result> { debug!("list_dir {}/{}", volume, dir_path); + + if self.health.is_faulty() { + return Err(DiskError::FaultyDisk); + } + let mut client = node_service_time_out_client(&self.addr) .await .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; @@ -621,6 +924,10 @@ impl DiskAPI for RemoteDisk { async fn walk_dir(&self, opts: WalkDirOptions, wr: &mut W) -> Result<()> { info!("walk_dir {}", self.endpoint.to_string()); + if self.health.is_faulty() { + return Err(DiskError::FaultyDisk); + } + let url = format!( "{}/rustfs/rpc/walk_dir?disk={}", self.endpoint.grid_host(), @@ -644,6 +951,10 @@ impl DiskAPI for RemoteDisk { async fn read_file(&self, volume: &str, path: &str) -> Result { info!("read_file {}/{}", volume, path); + if self.health.is_faulty() { + return Err(DiskError::FaultyDisk); + } + let url = format!( "{}/rustfs/rpc/read_file_stream?disk={}&volume={}&path={}&offset={}&length={}", self.endpoint.grid_host(), @@ -670,6 +981,11 @@ impl DiskAPI for RemoteDisk { // offset, // length // ); + + if self.health.is_faulty() { + return Err(DiskError::FaultyDisk); + } + let url = format!( "{}/rustfs/rpc/read_file_stream?disk={}&volume={}&path={}&offset={}&length={}", self.endpoint.grid_host(), @@ -690,6 +1006,10 @@ impl 
DiskAPI for RemoteDisk { async fn append_file(&self, volume: &str, path: &str) -> Result { info!("append_file {}/{}", volume, path); + if self.health.is_faulty() { + return Err(DiskError::FaultyDisk); + } + let url = format!( "{}/rustfs/rpc/put_file_stream?disk={}&volume={}&path={}&append={}&size={}", self.endpoint.grid_host(), @@ -716,6 +1036,10 @@ impl DiskAPI for RemoteDisk { // file_size // ); + if self.health.is_faulty() { + return Err(DiskError::FaultyDisk); + } + let url = format!( "{}/rustfs/rpc/put_file_stream?disk={}&volume={}&path={}&append={}&size={}", self.endpoint.grid_host(), @@ -735,216 +1059,282 @@ impl DiskAPI for RemoteDisk { #[tracing::instrument(level = "debug", skip(self))] async fn rename_file(&self, src_volume: &str, src_path: &str, dst_volume: &str, dst_path: &str) -> Result<()> { info!("rename_file"); - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(RenameFileRequest { - disk: self.endpoint.to_string(), - src_volume: src_volume.to_string(), - src_path: src_path.to_string(), - dst_volume: dst_volume.to_string(), - dst_path: dst_path.to_string(), - }); - let response = client.rename_file(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(RenameFileRequest { + disk: self.endpoint.to_string(), + src_volume: src_volume.to_string(), + src_path: src_path.to_string(), + dst_volume: dst_volume.to_string(), + dst_path: dst_path.to_string(), + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.rename_file(request).await?.into_inner(); - Ok(()) + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } + + Ok(()) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn rename_part(&self, src_volume: &str, src_path: &str, dst_volume: &str, dst_path: &str, meta: Bytes) -> Result<()> { info!("rename_part {}/{}", src_volume, src_path); - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(RenamePartRequest { - disk: self.endpoint.to_string(), - src_volume: src_volume.to_string(), - src_path: src_path.to_string(), - dst_volume: dst_volume.to_string(), - dst_path: dst_path.to_string(), - meta, - }); - let response = client.rename_part(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(RenamePartRequest { + disk: self.endpoint.to_string(), + src_volume: src_volume.to_string(), + src_path: src_path.to_string(), + dst_volume: dst_volume.to_string(), + dst_path: dst_path.to_string(), + meta, + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.rename_part(request).await?.into_inner(); - Ok(()) + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } + + Ok(()) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn delete(&self, volume: &str, path: &str, opt: DeleteOptions) -> Result<()> { info!("delete 
{}/{}/{}", self.endpoint.to_string(), volume, path); - let options = serde_json::to_string(&opt)?; - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(DeleteRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - path: path.to_string(), - options, - }); - let response = client.delete(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let options = serde_json::to_string(&opt)?; + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(DeleteRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + path: path.to_string(), + options, + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.delete(request).await?.into_inner(); - Ok(()) + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } + + Ok(()) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn verify_file(&self, volume: &str, path: &str, fi: &FileInfo) -> Result { info!("verify_file"); - let file_info = serde_json::to_string(&fi)?; - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(VerifyFileRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - path: path.to_string(), - file_info, - }); - let response = client.verify_file(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let file_info = serde_json::to_string(&fi)?; + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(VerifyFileRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + path: path.to_string(), + file_info, + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.verify_file(request).await?.into_inner(); - let check_parts_resp = serde_json::from_str::(&response.check_parts_resp)?; + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } - Ok(check_parts_resp) + let check_parts_resp = serde_json::from_str::(&response.check_parts_resp)?; + + Ok(check_parts_resp) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn read_parts(&self, bucket: &str, paths: &[String]) -> Result> { - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(ReadPartsRequest { - disk: self.endpoint.to_string(), - bucket: bucket.to_string(), - paths: paths.to_vec(), - }); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(ReadPartsRequest { + disk: self.endpoint.to_string(), + bucket: bucket.to_string(), + paths: paths.to_vec(), + }); - let response = client.read_parts(request).await?.into_inner(); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.read_parts(request).await?.into_inner(); + if !response.success { 
+ return Err(response.error.unwrap_or_default().into()); + } - let read_parts_resp = rmp_serde::from_slice::>(&response.object_part_infos)?; + let read_parts_resp = rmp_serde::from_slice::>(&response.object_part_infos)?; - Ok(read_parts_resp) + Ok(read_parts_resp) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn check_parts(&self, volume: &str, path: &str, fi: &FileInfo) -> Result { info!("check_parts"); - let file_info = serde_json::to_string(&fi)?; - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(CheckPartsRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - path: path.to_string(), - file_info, - }); - let response = client.check_parts(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let file_info = serde_json::to_string(&fi)?; + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(CheckPartsRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + path: path.to_string(), + file_info, + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.check_parts(request).await?.into_inner(); - let check_parts_resp = serde_json::from_str::(&response.check_parts_resp)?; + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } - Ok(check_parts_resp) + let check_parts_resp = serde_json::from_str::(&response.check_parts_resp)?; + + Ok(check_parts_resp) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn read_multiple(&self, req: ReadMultipleReq) -> Result> { info!("read_multiple {}/{}/{}", self.endpoint.to_string(), req.bucket, req.prefix); - let read_multiple_req = serde_json::to_string(&req)?; - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(ReadMultipleRequest { - disk: self.endpoint.to_string(), - read_multiple_req, - }); - let response = client.read_multiple(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let read_multiple_req = serde_json::to_string(&req)?; + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(ReadMultipleRequest { + disk: self.endpoint.to_string(), + read_multiple_req, + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.read_multiple(request).await?.into_inner(); - let read_multiple_resps = response - .read_multiple_resps - .into_iter() - .filter_map(|json_str| serde_json::from_str::(&json_str).ok()) - .collect(); + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } - Ok(read_multiple_resps) + let read_multiple_resps = response + .read_multiple_resps + .into_iter() + .filter_map(|json_str| serde_json::from_str::(&json_str).ok()) + .collect(); + + Ok(read_multiple_resps) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn write_all(&self, volume: &str, path: &str, data: Bytes) -> Result<()> { info!("write_all"); - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| 
Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(WriteAllRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - path: path.to_string(), - data, - }); - let response = client.write_all(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(WriteAllRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + path: path.to_string(), + data, + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.write_all(request).await?.into_inner(); - Ok(()) + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } + + Ok(()) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn read_all(&self, volume: &str, path: &str) -> Result { info!("read_all {}/{}", volume, path); - let mut client = node_service_time_out_client(&self.addr) - .await - .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; - let request = Request::new(ReadAllRequest { - disk: self.endpoint.to_string(), - volume: volume.to_string(), - path: path.to_string(), - }); - let response = client.read_all(request).await?.into_inner(); + self.execute_with_timeout( + || async { + let mut client = node_service_time_out_client(&self.addr) + .await + .map_err(|err| Error::other(format!("can not get client, err: {err}")))?; + let request = Request::new(ReadAllRequest { + disk: self.endpoint.to_string(), + volume: volume.to_string(), + path: path.to_string(), + }); - if !response.success { - return Err(response.error.unwrap_or_default().into()); - } + let response = client.read_all(request).await?.into_inner(); - Ok(response.data) + if !response.success { + return Err(response.error.unwrap_or_default().into()); + } + + Ok(response.data) + }, + get_max_timeout_duration(), + ) + .await } #[tracing::instrument(skip(self))] async fn disk_info(&self, opts: &DiskInfoOptions) -> Result { + if self.health.is_faulty() { + return Err(DiskError::FaultyDisk); + } + let opts = serde_json::to_string(&opts)?; let mut client = node_service_time_out_client(&self.addr) .await @@ -964,14 +1354,35 @@ impl DiskAPI for RemoteDisk { Ok(disk_info) } + + #[tracing::instrument(skip(self))] + fn start_scan(&self) -> ScanGuard { + self.scanning.fetch_add(1, Ordering::Relaxed); + ScanGuard(Arc::clone(&self.scanning)) + } } #[cfg(test)] mod tests { use super::*; + use std::sync::Once; use tokio::net::TcpListener; + use tracing::Level; use uuid::Uuid; + static INIT: Once = Once::new(); + + fn init_tracing(filter_level: Level) { + INIT.call_once(|| { + let _ = tracing_subscriber::fmt() + .with_env_filter(tracing_subscriber::EnvFilter::from_default_env()) + .with_max_level(filter_level) + .with_timer(tracing_subscriber::fmt::time::UtcTime::rfc_3339()) + .with_thread_names(true) + .try_init(); + }); + } + #[tokio::test] async fn test_remote_disk_creation() { let url = url::Url::parse("http://example.com:9000/path").unwrap(); @@ -1080,6 +1491,8 @@ mod tests { #[tokio::test] async fn test_remote_disk_is_online_detects_missing_listener() { + init_tracing(Level::ERROR); + let listener = TcpListener::bind("127.0.0.1:0").await.unwrap(); let addr = listener.local_addr().unwrap(); let ip = addr.ip(); @@ -1098,10 +1511,14 @@ mod tests { let disk_option = DiskOption { cleanup: false, - 
health_check: false, + health_check: true, }; let remote_disk = RemoteDisk::new(&endpoint, &disk_option).await.unwrap(); + + // wait for health check connect timeout + tokio::time::sleep(Duration::from_secs(6)).await; + assert!(!remote_disk.is_online().await); } diff --git a/crates/ecstore/src/set_disk.rs b/crates/ecstore/src/set_disk.rs index 054934e6..a48193cb 100644 --- a/crates/ecstore/src/set_disk.rs +++ b/crates/ecstore/src/set_disk.rs @@ -73,6 +73,7 @@ use rustfs_filemeta::{ FileInfo, FileMeta, FileMetaShallowVersion, MetaCacheEntries, MetaCacheEntry, MetadataResolutionParams, ObjectPartInfo, RawFileInfo, ReplicationStatusType, VersionPurgeStatusType, file_info_from_raw, merge_file_meta_versions, }; +use rustfs_lock::FastLockGuard; use rustfs_lock::fast_lock::types::LockResult; use rustfs_madmin::heal_commands::{HealDriveInfo, HealResultItem}; use rustfs_rio::{EtagResolvable, HashReader, HashReaderMut, TryGetIndex as _, WarpReader}; @@ -174,56 +175,56 @@ impl SetDisks { }) } - async fn cached_disk_health(&self, index: usize) -> Option { - let cache = self.disk_health_cache.read().await; - cache - .get(index) - .and_then(|entry| entry.as_ref().and_then(|state| state.cached_value())) - } + // async fn cached_disk_health(&self, index: usize) -> Option { + // let cache = self.disk_health_cache.read().await; + // cache + // .get(index) + // .and_then(|entry| entry.as_ref().and_then(|state| state.cached_value())) + // } - async fn update_disk_health(&self, index: usize, online: bool) { - let mut cache = self.disk_health_cache.write().await; - if cache.len() <= index { - cache.resize(index + 1, None); - } - cache[index] = Some(DiskHealthEntry { - last_check: Instant::now(), - online, - }); - } + // async fn update_disk_health(&self, index: usize, online: bool) { + // let mut cache = self.disk_health_cache.write().await; + // if cache.len() <= index { + // cache.resize(index + 1, None); + // } + // cache[index] = Some(DiskHealthEntry { + // last_check: Instant::now(), + // online, + // }); + // } - async fn is_disk_online_cached(&self, index: usize, disk: &DiskStore) -> bool { - if let Some(online) = self.cached_disk_health(index).await { - return online; - } + // async fn is_disk_online_cached(&self, index: usize, disk: &DiskStore) -> bool { + // if let Some(online) = self.cached_disk_health(index).await { + // return online; + // } - let disk_clone = disk.clone(); - let online = timeout(DISK_ONLINE_TIMEOUT, async move { disk_clone.is_online().await }) - .await - .unwrap_or(false); - self.update_disk_health(index, online).await; - online - } + // let disk_clone = disk.clone(); + // let online = timeout(DISK_ONLINE_TIMEOUT, async move { disk_clone.is_online().await }) + // .await + // .unwrap_or(false); + // self.update_disk_health(index, online).await; + // online + // } - async fn filter_online_disks(&self, disks: Vec>) -> (Vec>, usize) { - let mut filtered = Vec::with_capacity(disks.len()); - let mut online_count = 0; + // async fn filter_online_disks(&self, disks: Vec>) -> (Vec>, usize) { + // let mut filtered = Vec::with_capacity(disks.len()); + // let mut online_count = 0; - for (idx, disk) in disks.into_iter().enumerate() { - if let Some(disk_store) = disk { - if self.is_disk_online_cached(idx, &disk_store).await { - filtered.push(Some(disk_store)); - online_count += 1; - } else { - filtered.push(None); - } - } else { - filtered.push(None); - } - } + // for (idx, disk) in disks.into_iter().enumerate() { + // if let Some(disk_store) = disk { + // if self.is_disk_online_cached(idx, 
&disk_store).await { + // filtered.push(Some(disk_store)); + // online_count += 1; + // } else { + // filtered.push(None); + // } + // } else { + // filtered.push(None); + // } + // } - (filtered, online_count) - } + // (filtered, online_count) + // } fn format_lock_error(&self, bucket: &str, object: &str, mode: &str, err: &LockResult) -> String { match err { LockResult::Timeout => { @@ -259,9 +260,28 @@ impl SetDisks { } async fn get_online_disks(&self) -> Vec> { - let disks = self.get_disks_internal().await; - let (filtered, _) = self.filter_online_disks(disks).await; - filtered.into_iter().filter(|disk| disk.is_some()).collect() + let mut disks = self.get_disks_internal().await; + + // TODO: diskinfo filter online + + let mut new_disk = Vec::with_capacity(disks.len()); + + for disk in disks.iter() { + if let Some(d) = disk { + if d.is_online().await { + new_disk.push(disk.clone()); + } + } + } + + let mut rng = rand::rng(); + + disks.shuffle(&mut rng); + + new_disk + // let disks = self.get_disks_internal().await; + // let (filtered, _) = self.filter_online_disks(disks).await; + // filtered.into_iter().filter(|disk| disk.is_some()).collect() } async fn get_online_local_disks(&self) -> Vec> { let mut disks = self.get_online_disks().await; @@ -1468,17 +1488,7 @@ impl SetDisks { let version_id = version_id.clone(); tokio::spawn(async move { if let Some(disk) = disk { - if version_id.is_empty() { - match disk.read_xl(&bucket, &object, read_data).await { - Ok(info) => { - let fi = file_info_from_raw(info, &bucket, &object, read_data).await?; - Ok(fi) - } - Err(err) => Err(err), - } - } else { - disk.read_version(&org_bucket, &bucket, &object, &version_id, &opts).await - } + disk.read_version(&org_bucket, &bucket, &object, &version_id, &opts).await } else { Err(DiskError::DiskNotFound) } @@ -1799,14 +1809,14 @@ impl SetDisks { } pub async fn renew_disk(&self, ep: &Endpoint) { - debug!("renew_disk start {:?}", ep); + debug!("renew_disk: start {:?}", ep); let (new_disk, fm) = match Self::connect_endpoint(ep).await { Ok(res) => res, Err(e) => { - warn!("connect_endpoint err {:?}", &e); + warn!("renew_disk: connect_endpoint err {:?}", &e); if ep.is_local && e == DiskError::UnformattedDisk { - info!("unformatteddisk will trigger heal_disk, {:?}", ep); + info!("renew_disk unformatteddisk will trigger heal_disk, {:?}", ep); let set_disk_id = format!("pool_{}_set_{}", ep.pool_idx, ep.set_idx); let _ = send_heal_disk(set_disk_id, Some(HealChannelPriority::Normal)).await; } @@ -1817,7 +1827,7 @@ impl SetDisks { let (set_idx, disk_idx) = match self.find_disk_index(&fm) { Ok(res) => res, Err(e) => { - warn!("find_disk_index err {:?}", e); + warn!("renew_disk: find_disk_index err {:?}", e); return; } }; @@ -1837,7 +1847,7 @@ impl SetDisks { } } - debug!("renew_disk update {:?}", fm.erasure.this); + debug!("renew_disk: update {:?}", fm.erasure.this); let mut disk_lock = self.disks.write().await; disk_lock[disk_idx] = Some(new_disk); @@ -2672,7 +2682,7 @@ impl SetDisks { let (mut parts_metadata, errs) = { let mut retry_count = 0; loop { - let (parts, errs) = Self::read_all_fileinfo(&disks, "", bucket, object, version_id, true, true).await?; + let (parts, errs) = Self::read_all_fileinfo(&disks, "", bucket, object, version_id, false, false).await?; // Check if we have enough valid metadata to proceed // If we have too many errors, and we haven't exhausted retries, try again @@ -2699,7 +2709,14 @@ impl SetDisks { retry_count += 1; } }; - info!(parts_count = parts_metadata.len(), ?errs, "File info read 
complete"); + info!( + parts_count = parts_metadata.len(), + bucket = bucket, + object = object, + version_id = version_id, + ?errs, + "File info read complete" + ); if DiskError::is_all_not_found(&errs) { warn!( "heal_object failed, all obj part not found, bucket: {}, obj: {}, version_id: {}", @@ -3051,7 +3068,7 @@ impl SetDisks { for (index, disk) in latest_disks.iter().enumerate() { if let Some(outdated_disk) = &out_dated_disks[index] { info!(disk_index = index, "Creating writer for outdated disk"); - let writer = create_bitrot_writer( + let writer = match create_bitrot_writer( is_inline_buffer, Some(outdated_disk), RUSTFS_META_TMP_BUCKET, @@ -3060,7 +3077,19 @@ impl SetDisks { erasure.shard_size(), HashAlgorithm::HighwayHash256, ) - .await?; + .await + { + Ok(writer) => writer, + Err(err) => { + warn!( + "create_bitrot_writer disk {}, err {:?}, skipping operation", + outdated_disk.to_string(), + err + ); + writers.push(None); + continue; + } + }; writers.push(Some(writer)); } else { info!(disk_index = index, "Skipping writer (disk not outdated)"); @@ -3790,8 +3819,8 @@ impl ObjectIO for SetDisks { #[tracing::instrument(level = "debug", skip(self, data,))] async fn put_object(&self, bucket: &str, object: &str, data: &mut PutObjReader, opts: &ObjectOptions) -> Result { - let disks_snapshot = self.get_disks_internal().await; - let (disks, filtered_online) = self.filter_online_disks(disks_snapshot).await; + let disks = self.get_disks_internal().await; + // let (disks, filtered_online) = self.filter_online_disks(disks_snapshot).await; // Acquire per-object exclusive lock via RAII guard. It auto-releases asynchronously on drop. let _object_lock_guard = if !opts.no_lock { @@ -3832,13 +3861,13 @@ impl ObjectIO for SetDisks { write_quorum += 1 } - if filtered_online < write_quorum { - warn!( - "online disk snapshot {} below write quorum {} for {}/{}; returning erasure write quorum error", - filtered_online, write_quorum, bucket, object - ); - return Err(to_object_err(Error::ErasureWriteQuorum, vec![bucket, object])); - } + // if filtered_online < write_quorum { + // warn!( + // "online disk snapshot {} below write quorum {} for {}/{}; returning erasure write quorum error", + // filtered_online, write_quorum, bucket, object + // ); + // return Err(to_object_err(Error::ErasureWriteQuorum, vec![bucket, object])); + // } let mut fi = FileInfo::new([bucket, object].join("/").as_str(), data_drives, parity_drives); @@ -3877,8 +3906,10 @@ impl ObjectIO for SetDisks { let mut writers = Vec::with_capacity(shuffle_disks.len()); let mut errors = Vec::with_capacity(shuffle_disks.len()); for disk_op in shuffle_disks.iter() { - if let Some(disk) = disk_op { - let writer = create_bitrot_writer( + if let Some(disk) = disk_op + && disk.is_online().await + { + let writer = match create_bitrot_writer( is_inline_buffer, Some(disk), RUSTFS_META_TMP_BUCKET, @@ -3887,29 +3918,16 @@ impl ObjectIO for SetDisks { erasure.shard_size(), HashAlgorithm::HighwayHash256, ) - .await?; - - // let writer = if is_inline_buffer { - // BitrotWriter::new( - // Writer::from_cursor(Cursor::new(Vec::new())), - // erasure.shard_size(), - // HashAlgorithm::HighwayHash256, - // ) - // } else { - // let f = match disk - // .create_file("", RUSTFS_META_TMP_BUCKET, &tmp_object, erasure.shard_file_size(data.content_length)) - // .await - // { - // Ok(f) => f, - // Err(e) => { - // errors.push(Some(e)); - // writers.push(None); - // continue; - // } - // }; - - // BitrotWriter::new(Writer::from_tokio_writer(f), erasure.shard_size(), 
HashAlgorithm::HighwayHash256) - // }; + .await + { + Ok(writer) => writer, + Err(err) => { + warn!("create_bitrot_writer disk {}, err {:?}, skipping operation", disk.to_string(), err); + errors.push(Some(err)); + writers.push(None); + continue; + } + }; writers.push(Some(writer)); errors.push(None); @@ -4058,6 +4076,14 @@ impl ObjectIO for SetDisks { #[async_trait::async_trait] impl StorageAPI for SetDisks { + #[tracing::instrument(skip(self))] + async fn new_ns_lock(&self, bucket: &str, object: &str) -> Result { + self.fast_lock_manager + .acquire_write_lock(bucket, object, self.locker_owner.as_str()) + .await + .map_err(|e| Error::other(self.format_lock_error(bucket, object, "write", &e))) + } + #[tracing::instrument(skip(self))] async fn backend_info(&self) -> rustfs_madmin::BackendInfo { unimplemented!() @@ -4072,7 +4098,7 @@ impl StorageAPI for SetDisks { async fn local_storage_info(&self) -> rustfs_madmin::StorageInfo { let disks = self.get_disks_internal().await; - let mut local_disks: Vec>> = Vec::new(); + let mut local_disks: Vec> = Vec::new(); let mut local_endpoints = Vec::new(); for (i, ep) in self.set_endpoints.iter().enumerate() { @@ -4908,9 +4934,7 @@ impl StorageAPI for SetDisks { for disk in disks.iter() { if let Some(disk) = disk { - if disk.is_online().await { - continue; - } + continue; } let _ = self.add_partial(bucket, object, opts.version_id.as_ref().expect("err")).await; break; @@ -5129,16 +5153,16 @@ impl StorageAPI for SetDisks { return Err(Error::other(format!("checksum mismatch: {checksum}"))); } - let disks_snapshot = self.get_disks_internal().await; - let (disks, filtered_online) = self.filter_online_disks(disks_snapshot).await; + let disks = self.get_disks_internal().await; + // let (disks, filtered_online) = self.filter_online_disks(disks_snapshot).await; - if filtered_online < write_quorum { - warn!( - "online disk snapshot {} below write quorum {} for multipart {}/{}; returning erasure write quorum error", - filtered_online, write_quorum, bucket, object - ); - return Err(to_object_err(Error::ErasureWriteQuorum, vec![bucket, object])); - } + // if filtered_online < write_quorum { + // warn!( + // "online disk snapshot {} below write quorum {} for multipart {}/{}; returning erasure write quorum error", + // filtered_online, write_quorum, bucket, object + // ); + // return Err(to_object_err(Error::ErasureWriteQuorum, vec![bucket, object])); + // } let shuffle_disks = Self::shuffle_disks(&disks, &fi.erasure.distribution); @@ -5152,7 +5176,7 @@ impl StorageAPI for SetDisks { let mut errors = Vec::with_capacity(shuffle_disks.len()); for disk_op in shuffle_disks.iter() { if let Some(disk) = disk_op { - let writer = create_bitrot_writer( + let writer = match create_bitrot_writer( false, Some(disk), RUSTFS_META_TMP_BUCKET, @@ -5161,23 +5185,16 @@ impl StorageAPI for SetDisks { erasure.shard_size(), HashAlgorithm::HighwayHash256, ) - .await?; - - // let writer = { - // let f = match disk - // .create_file("", RUSTFS_META_TMP_BUCKET, &tmp_part_path, erasure.shard_file_size(data.content_length)) - // .await - // { - // Ok(f) => f, - // Err(e) => { - // errors.push(Some(e)); - // writers.push(None); - // continue; - // } - // }; - - // BitrotWriter::new(Writer::from_tokio_writer(f), erasure.shard_size(), HashAlgorithm::HighwayHash256) - // }; + .await + { + Ok(writer) => writer, + Err(err) => { + warn!("create_bitrot_writer disk {}, err {:?}, skipping operation", disk.to_string(), err); + errors.push(Some(err)); + writers.push(None); + continue; + } + }; 
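
Both `create_bitrot_writer` hunks above (the put_object path and the multipart path) switch from `?`-propagation to a `match` that records the failure, pushes a `None` writer, and continues. A standalone sketch of that degrade-instead-of-abort pattern, using stand-in types rather than the real RustFS `DiskAPI`/bitrot writer signatures:

```rust
// Standalone sketch (FakeDisk/FakeWriter are stand-ins, not RustFS types):
// a writer that cannot be created is recorded as `None` plus an error, keeping
// indices aligned with the disk list so later quorum logic can decide whether
// the write may still proceed.
#[derive(Debug)]
struct SetupError(String);

struct FakeDisk {
    online: bool,
}

struct FakeWriter;

impl FakeDisk {
    fn is_online(&self) -> bool {
        self.online
    }

    fn create_writer(&self) -> Result<FakeWriter, SetupError> {
        if self.online {
            Ok(FakeWriter)
        } else {
            Err(SetupError("disk offline".to_string()))
        }
    }
}

fn build_writers(disks: &[Option<FakeDisk>]) -> (Vec<Option<FakeWriter>>, Vec<Option<SetupError>>) {
    let mut writers = Vec::with_capacity(disks.len());
    let mut errors = Vec::with_capacity(disks.len());
    for disk in disks {
        match disk {
            Some(d) if d.is_online() => match d.create_writer() {
                Ok(w) => {
                    writers.push(Some(w));
                    errors.push(None);
                }
                Err(e) => {
                    // Skip this disk instead of failing the whole operation.
                    writers.push(None);
                    errors.push(Some(e));
                }
            },
            _ => {
                writers.push(None);
                errors.push(Some(SetupError("disk not found".to_string())));
            }
        }
    }
    (writers, errors)
}

fn main() {
    let disks = vec![Some(FakeDisk { online: true }), Some(FakeDisk { online: false }), None];
    let (writers, errors) = build_writers(&disks);
    assert_eq!(writers.iter().filter(|w| w.is_some()).count(), 1);
    assert_eq!(errors.iter().filter(|e| e.is_some()).count(), 2);
}
```
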
writers.push(Some(writer)); errors.push(None); @@ -6521,6 +6538,10 @@ async fn disks_with_all_parts( let corrupted = !meta.mod_time.eq(&latest_meta.mod_time) || !meta.data_dir.eq(&latest_meta.data_dir); if corrupted { + warn!( + "disks_with_all_partsv2: metadata is corrupted, object_name={}, index: {index}", + object_name + ); meta_errs[index] = Some(DiskError::FileCorrupt); parts_metadata[index] = FileInfo::default(); continue; @@ -6528,6 +6549,10 @@ async fn disks_with_all_parts( if erasure_distribution_reliable { if !meta.is_valid() { + warn!( + "disks_with_all_partsv2: metadata is not valid, object_name={}, index: {index}", + object_name + ); parts_metadata[index] = FileInfo::default(); meta_errs[index] = Some(DiskError::FileCorrupt); continue; @@ -6538,6 +6563,10 @@ async fn disks_with_all_parts( // Erasure distribution is not the same as onlineDisks // attempt a fix if possible, assuming other entries // might have the right erasure distribution. + warn!( + "disks_with_all_partsv2: erasure distribution is not the same as onlineDisks, object_name={}, index: {index}", + object_name + ); parts_metadata[index] = FileInfo::default(); meta_errs[index] = Some(DiskError::FileCorrupt); continue; @@ -6769,7 +6798,7 @@ async fn get_disks_info(disks: &[Option], eps: &[Endpoint]) -> Vec{ @@ -366,6 +367,10 @@ impl ObjectIO for Sets { #[async_trait::async_trait] impl StorageAPI for Sets { + #[tracing::instrument(skip(self))] + async fn new_ns_lock(&self, bucket: &str, object: &str) -> Result { + self.disk_set[0].new_ns_lock(bucket, object).await + } #[tracing::instrument(skip(self))] async fn backend_info(&self) -> rustfs_madmin::BackendInfo { unimplemented!() diff --git a/crates/ecstore/src/store.rs b/crates/ecstore/src/store.rs index 3097a9e2..06d2bd4d 100644 --- a/crates/ecstore/src/store.rs +++ b/crates/ecstore/src/store.rs @@ -55,9 +55,10 @@ use futures::future::join_all; use http::HeaderMap; use lazy_static::lazy_static; use rand::Rng as _; -use rustfs_common::globals::{GLOBAL_Local_Node_Name, GLOBAL_Rustfs_Host, GLOBAL_Rustfs_Port}; use rustfs_common::heal_channel::{HealItemType, HealOpts}; +use rustfs_common::{GLOBAL_LOCAL_NODE_NAME, GLOBAL_RUSTFS_HOST, GLOBAL_RUSTFS_PORT}; use rustfs_filemeta::FileInfo; +use rustfs_lock::FastLockGuard; use rustfs_madmin::heal_commands::HealResultItem; use rustfs_utils::path::{SLASH_SEPARATOR, decode_dir_object, encode_dir_object, path_join_buf}; use s3s::dto::{BucketVersioningStatus, ObjectLockConfiguration, ObjectLockEnabled, VersioningConfiguration}; @@ -127,11 +128,11 @@ impl ECStore { info!("ECStore new address: {}", address.to_string()); let mut host = address.ip().to_string(); if host.is_empty() { - host = GLOBAL_Rustfs_Host.read().await.to_string() + host = GLOBAL_RUSTFS_HOST.read().await.to_string() } let mut port = address.port().to_string(); if port.is_empty() { - port = GLOBAL_Rustfs_Port.read().await.to_string() + port = GLOBAL_RUSTFS_PORT.read().await.to_string() } info!("ECStore new host: {}, port: {}", host, port); init_local_peer(&endpoint_pools, &host, &port).await; @@ -1151,6 +1152,10 @@ lazy_static! 
{ #[async_trait::async_trait] impl StorageAPI for ECStore { + #[instrument(skip(self))] + async fn new_ns_lock(&self, bucket: &str, object: &str) -> Result { + self.pools[0].new_ns_lock(bucket, object).await + } #[instrument(skip(self))] async fn backend_info(&self) -> rustfs_madmin::BackendInfo { let (standard_sc_parity, rr_sc_parity) = { @@ -2329,15 +2334,15 @@ async fn init_local_peer(endpoint_pools: &EndpointServerPools, host: &String, po if peer_set.is_empty() { if !host.is_empty() { - *GLOBAL_Local_Node_Name.write().await = format!("{host}:{port}"); + *GLOBAL_LOCAL_NODE_NAME.write().await = format!("{host}:{port}"); return; } - *GLOBAL_Local_Node_Name.write().await = format!("127.0.0.1:{port}"); + *GLOBAL_LOCAL_NODE_NAME.write().await = format!("127.0.0.1:{port}"); return; } - *GLOBAL_Local_Node_Name.write().await = peer_set[0].clone(); + *GLOBAL_LOCAL_NODE_NAME.write().await = peer_set[0].clone(); } pub fn is_valid_object_prefix(_object: &str) -> bool { diff --git a/crates/ecstore/src/store_api.rs b/crates/ecstore/src/store_api.rs index 90c8fd96..e1c2b21c 100644 --- a/crates/ecstore/src/store_api.rs +++ b/crates/ecstore/src/store_api.rs @@ -30,6 +30,7 @@ use rustfs_filemeta::{ FileInfo, MetaCacheEntriesSorted, ObjectPartInfo, REPLICATION_RESET, REPLICATION_STATUS, ReplicateDecision, ReplicationState, ReplicationStatusType, VersionPurgeStatusType, replication_statuses_map, version_purge_statuses_map, }; +use rustfs_lock::FastLockGuard; use rustfs_madmin::heal_commands::HealResultItem; use rustfs_rio::Checksum; use rustfs_rio::{DecompressReader, HashReader, LimitReader, WarpReader}; @@ -827,7 +828,12 @@ impl ObjectInfo { for entry in entries.entries() { if entry.is_object() { if let Some(delimiter) = &delimiter { - if let Some(idx) = entry.name.trim_start_matches(prefix).find(delimiter) { + let remaining = if entry.name.starts_with(prefix) { + &entry.name[prefix.len()..] + } else { + entry.name.as_str() + }; + if let Some(idx) = remaining.find(delimiter.as_str()) { let idx = prefix.len() + idx + delimiter.len(); if let Some(curr_prefix) = entry.name.get(0..idx) { if curr_prefix == prev_prefix { @@ -878,7 +884,14 @@ impl ObjectInfo { if entry.is_dir() { if let Some(delimiter) = &delimiter { - if let Some(idx) = entry.name.trim_start_matches(prefix).find(delimiter) { + if let Some(idx) = { + let remaining = if entry.name.starts_with(prefix) { + &entry.name[prefix.len()..] + } else { + entry.name.as_str() + }; + remaining.find(delimiter.as_str()) + } { let idx = prefix.len() + idx + delimiter.len(); if let Some(curr_prefix) = entry.name.get(0..idx) { if curr_prefix == prev_prefix { @@ -914,7 +927,12 @@ impl ObjectInfo { for entry in entries.entries() { if entry.is_object() { if let Some(delimiter) = &delimiter { - if let Some(idx) = entry.name.trim_start_matches(prefix).find(delimiter) { + let remaining = if entry.name.starts_with(prefix) { + &entry.name[prefix.len()..] + } else { + entry.name.as_str() + }; + if let Some(idx) = remaining.find(delimiter.as_str()) { let idx = prefix.len() + idx + delimiter.len(); if let Some(curr_prefix) = entry.name.get(0..idx) { if curr_prefix == prev_prefix { @@ -951,7 +969,14 @@ impl ObjectInfo { if entry.is_dir() { if let Some(delimiter) = &delimiter { - if let Some(idx) = entry.name.trim_start_matches(prefix).find(delimiter) { + if let Some(idx) = { + let remaining = if entry.name.starts_with(prefix) { + &entry.name[prefix.len()..] 
+ } else { + entry.name.as_str() + }; + remaining.find(delimiter.as_str()) + } { let idx = prefix.len() + idx + delimiter.len(); if let Some(curr_prefix) = entry.name.get(0..idx) { if curr_prefix == prev_prefix { @@ -1275,6 +1300,7 @@ pub trait ObjectIO: Send + Sync + Debug + 'static { #[allow(clippy::too_many_arguments)] pub trait StorageAPI: ObjectIO + Debug { // NewNSLock TODO: + async fn new_ns_lock(&self, bucket: &str, object: &str) -> Result; // Shutdown TODO: // NSScanner TODO: diff --git a/crates/ecstore/src/store_init.rs b/crates/ecstore/src/store_init.rs index 965088d0..437b5218 100644 --- a/crates/ecstore/src/store_init.rs +++ b/crates/ecstore/src/store_init.rs @@ -265,7 +265,10 @@ pub async fn load_format_erasure(disk: &DiskStore, heal: bool) -> disk::error::R .map_err(|e| match e { DiskError::FileNotFound => DiskError::UnformattedDisk, DiskError::DiskNotFound => DiskError::UnformattedDisk, - _ => e, + _ => { + warn!("load_format_erasure err: {:?} {:?}", disk.to_string(), e); + e + } })?; let mut fm = FormatV3::try_from(data.as_ref())?; @@ -312,17 +315,18 @@ async fn save_format_file_all(disks: &[Option], formats: &[Option, format: &Option) -> disk::error::Result<()> { - if disk.is_none() { + let Some(disk) = disk else { return Err(DiskError::DiskNotFound); - } + }; - let format = format.as_ref().unwrap(); + let Some(format) = format else { + return Err(DiskError::other("format is none")); + }; let json_data = format.to_json()?; let tmpfile = Uuid::new_v4().to_string(); - let disk = disk.as_ref().unwrap(); disk.write_all(RUSTFS_META_BUCKET, tmpfile.as_str(), json_data.into_bytes().into()) .await?; diff --git a/crates/ecstore/src/tier/warm_backend_azure2.rs b/crates/ecstore/src/tier/warm_backend_azure2.rs deleted file mode 100644 index 338a475d..00000000 --- a/crates/ecstore/src/tier/warm_backend_azure2.rs +++ /dev/null @@ -1,231 +0,0 @@ -// Copyright 2024 RustFS Team -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. 
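
The `store_api.rs` delimiter hunks earlier in this diff stop using `trim_start_matches(prefix)` and instead strip the prefix at most once before searching for the delimiter. A small standalone example (not RustFS code) of the behavioural difference that motivates this, since the subsequent `prefix.len() + idx` arithmetic assumes a single removal:

```rust
// Illustrative only: `trim_start_matches` strips *every* leading repetition of
// the pattern, while `strip_prefix` (like the fixed slicing code) removes the
// prefix at most once.
fn main() {
    let prefix = "a/";
    let name = "a/a/object.txt";

    // Strips both leading "a/" segments -> "object.txt"
    let trimmed = name.trim_start_matches(prefix);

    // Strips exactly one "a/" -> "a/object.txt"
    let stripped = name.strip_prefix(prefix).unwrap_or(name);

    assert_eq!(trimmed, "object.txt");
    assert_eq!(stripped, "a/object.txt");

    // With `trimmed`, an index computed as `prefix.len() + trimmed.find('/')`
    // no longer corresponds to a position in the original `name`, so
    // common-prefix slicing can land on the wrong boundary.
    println!("trim: {trimmed}, strip: {stripped}");
}
```
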
-#![allow(unused_imports)] -#![allow(unused_variables)] -#![allow(unused_mut)] -#![allow(unused_assignments)] -#![allow(unused_must_use)] -#![allow(clippy::all)] - -use std::collections::HashMap; -use std::sync::Arc; - -use azure_core::http::{Body, ClientOptions, RequestContent}; -use azure_storage::StorageCredentials; -use azure_storage_blobs::prelude::*; - -use crate::client::{ - admin_handler_utils::AdminError, - api_put_object::PutObjectOptions, - transition_api::{Options, ReadCloser, ReaderImpl}, -}; -use crate::tier::{ - tier_config::TierAzure, - warm_backend::{WarmBackend, WarmBackendGetOpts}, -}; -use tracing::warn; - -const MAX_MULTIPART_PUT_OBJECT_SIZE: i64 = 1024 * 1024 * 1024 * 1024 * 5; -const MAX_PARTS_COUNT: i64 = 10000; -const _MAX_PART_SIZE: i64 = 1024 * 1024 * 1024 * 5; -const MIN_PART_SIZE: i64 = 1024 * 1024 * 128; - -pub struct WarmBackendAzure { - pub client: Arc, - pub bucket: String, - pub prefix: String, - pub storage_class: String, -} - -impl WarmBackendAzure { - pub async fn new(conf: &TierAzure, tier: &str) -> Result { - if conf.access_key == "" || conf.secret_key == "" { - return Err(std::io::Error::other("both access and secret keys are required")); - } - - if conf.bucket == "" { - return Err(std::io::Error::other("no bucket name was provided")); - } - - let creds = StorageCredentials::access_key(conf.access_key.clone(), conf.secret_key.clone()); - let client = ClientBuilder::new(conf.access_key.clone(), creds) - //.endpoint(conf.endpoint) - .blob_service_client(); - let client = Arc::new(client); - Ok(Self { - client, - bucket: conf.bucket.clone(), - prefix: conf.prefix.strip_suffix("/").unwrap_or(&conf.prefix).to_owned(), - storage_class: "".to_string(), - }) - } - - /*pub fn tier(&self) -> *blob.AccessTier { - if self.storage_class == "" { - return None; - } - for t in blob.PossibleAccessTierValues() { - if strings.EqualFold(self.storage_class, t) { - return &t - } - } - None - }*/ - - pub fn get_dest(&self, object: &str) -> String { - let mut dest_obj = object.to_string(); - if self.prefix != "" { - dest_obj = format!("{}/{}", &self.prefix, object); - } - return dest_obj; - } -} - -#[async_trait::async_trait] -impl WarmBackend for WarmBackendAzure { - async fn put_with_meta( - &self, - object: &str, - r: ReaderImpl, - length: i64, - meta: HashMap, - ) -> Result { - let part_size = length; - let client = self.client.clone(); - let container_client = client.container_client(self.bucket.clone()); - let blob_client = container_client.blob_client(self.get_dest(object)); - /*let res = blob_client - .upload( - RequestContent::from(match r { - ReaderImpl::Body(content_body) => content_body.to_vec(), - ReaderImpl::ObjectBody(mut content_body) => content_body.read_all().await?, - }), - false, - length as u64, - None, - ) - .await - else { - return Err(std::io::Error::other("upload error")); - };*/ - - let Ok(res) = blob_client - .put_block_blob(match r { - ReaderImpl::Body(content_body) => content_body.to_vec(), - ReaderImpl::ObjectBody(mut content_body) => content_body.read_all().await?, - }) - .content_type("text/plain") - .into_future() - .await - else { - return Err(std::io::Error::other("put_block_blob error")); - }; - - //self.ToObjectError(err, object) - Ok(res.request_id.to_string()) - } - - async fn put(&self, object: &str, r: ReaderImpl, length: i64) -> Result { - self.put_with_meta(object, r, length, HashMap::new()).await - } - - async fn get(&self, object: &str, rv: &str, opts: WarmBackendGetOpts) -> Result { - let client = self.client.clone(); - let 
container_client = client.container_client(self.bucket.clone()); - let blob_client = container_client.blob_client(self.get_dest(object)); - blob_client.get(); - todo!(); - } - - async fn remove(&self, object: &str, rv: &str) -> Result<(), std::io::Error> { - let client = self.client.clone(); - let container_client = client.container_client(self.bucket.clone()); - let blob_client = container_client.blob_client(self.get_dest(object)); - blob_client.delete(); - todo!(); - } - - async fn in_use(&self) -> Result { - /*let result = self.client - .list_objects_v2(&self.bucket, &self.prefix, "", "", SLASH_SEPARATOR, 1) - .await?; - - Ok(result.common_prefixes.len() > 0 || result.contents.len() > 0)*/ - Ok(false) - } -} - -/*fn azure_to_object_error(err: Error, params: Vec) -> Option { - if err == nil { - return nil - } - - bucket := "" - object := "" - if len(params) >= 1 { - bucket = params[0] - } - if len(params) == 2 { - object = params[1] - } - - azureErr, ok := err.(*azcore.ResponseError) - if !ok { - // We don't interpret non Azure errors. As azure errors will - // have StatusCode to help to convert to object errors. - return err - } - - serviceCode := azureErr.ErrorCode - statusCode := azureErr.StatusCode - - azureCodesToObjectError(err, serviceCode, statusCode, bucket, object) -}*/ - -/*fn azure_codes_to_object_error(err: Error, service_code: String, status_code: i32, bucket: String, object: String) -> Option { - switch serviceCode { - case "ContainerNotFound", "ContainerBeingDeleted": - err = BucketNotFound{Bucket: bucket} - case "ContainerAlreadyExists": - err = BucketExists{Bucket: bucket} - case "InvalidResourceName": - err = BucketNameInvalid{Bucket: bucket} - case "RequestBodyTooLarge": - err = PartTooBig{} - case "InvalidMetadata": - err = UnsupportedMetadata{} - case "BlobAccessTierNotSupportedForAccountType": - err = NotImplemented{} - case "OutOfRangeInput": - err = ObjectNameInvalid{ - Bucket: bucket, - Object: object, - } - default: - switch statusCode { - case http.StatusNotFound: - if object != "" { - err = ObjectNotFound{ - Bucket: bucket, - Object: object, - } - } else { - err = BucketNotFound{Bucket: bucket} - } - case http.StatusBadRequest: - err = BucketNameInvalid{Bucket: bucket} - } - } - return err -}*/ diff --git a/crates/filemeta/src/filemeta.rs b/crates/filemeta/src/filemeta.rs index ad3b0f9e..bae4b6ad 100644 --- a/crates/filemeta/src/filemeta.rs +++ b/crates/filemeta/src/filemeta.rs @@ -942,6 +942,41 @@ impl FileMeta { } } + pub fn get_file_info_versions(&self, volume: &str, path: &str, include_free_versions: bool) -> Result { + let mut versions = self.into_file_info_versions(volume, path, true)?; + + let mut n = 0; + + let mut versions_vec = Vec::new(); + + for fi in versions.versions.iter() { + if fi.tier_free_version() { + if !include_free_versions { + versions.free_versions.push(fi.clone()); + } + } else { + if !include_free_versions { + versions_vec.push(fi.clone()); + } + n += 1; + } + } + + if !include_free_versions { + versions.versions = versions_vec; + } + + for fi in versions.free_versions.iter_mut() { + fi.num_versions = n; + } + + Ok(versions) + } + + pub fn get_all_file_info_versions(&self, volume: &str, path: &str, all_parts: bool) -> Result { + self.into_file_info_versions(volume, path, all_parts) + } + pub fn into_file_info_versions(&self, volume: &str, path: &str, all_parts: bool) -> Result { let mut versions = Vec::new(); for version in self.versions.iter() { diff --git a/crates/filemeta/src/headers.rs b/crates/filemeta/src/headers.rs deleted 
file mode 100644
index 687198a0..00000000
--- a/crates/filemeta/src/headers.rs
+++ /dev/null
@@ -1,52 +0,0 @@
-// Copyright 2024 RustFS Team
-//
-// Licensed under the Apache License, Version 2.0 (the "License");
-// you may not use this file except in compliance with the License.
-// You may obtain a copy of the License at
-//
-// http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing, software
-// distributed under the License is distributed on an "AS IS" BASIS,
-// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-// See the License for the specific language governing permissions and
-// limitations under the License.
-
-pub const AMZ_META_UNENCRYPTED_CONTENT_LENGTH: &str = "X-Amz-Meta-X-Amz-Unencrypted-Content-Length";
-pub const AMZ_META_UNENCRYPTED_CONTENT_MD5: &str = "X-Amz-Meta-X-Amz-Unencrypted-Content-Md5";
-
-pub const AMZ_STORAGE_CLASS: &str = "x-amz-storage-class";
-
-pub const RESERVED_METADATA_PREFIX: &str = "X-RustFS-Internal-";
-pub const RESERVED_METADATA_PREFIX_LOWER: &str = "x-rustfs-internal-";
-
-pub const RUSTFS_HEALING: &str = "X-Rustfs-Internal-healing";
-// pub const RUSTFS_DATA_MOVE: &str = "X-Rustfs-Internal-data-mov";
-
-// pub const X_RUSTFS_INLINE_DATA: &str = "x-rustfs-inline-data";
-
-pub const VERSION_PURGE_STATUS_KEY: &str = "X-Rustfs-Internal-purgestatus";
-
-pub const X_RUSTFS_HEALING: &str = "X-Rustfs-Internal-healing";
-pub const X_RUSTFS_DATA_MOV: &str = "X-Rustfs-Internal-data-mov";
-
-pub const AMZ_OBJECT_TAGGING: &str = "X-Amz-Tagging";
-pub const AMZ_BUCKET_REPLICATION_STATUS: &str = "X-Amz-Replication-Status";
-pub const AMZ_DECODED_CONTENT_LENGTH: &str = "X-Amz-Decoded-Content-Length";
-
-pub const RUSTFS_DATA_MOVE: &str = "X-Rustfs-Internal-data-mov";
-
-// Server-side encryption headers
-pub const AMZ_SERVER_SIDE_ENCRYPTION: &str = "x-amz-server-side-encryption";
-pub const AMZ_SERVER_SIDE_ENCRYPTION_AWS_KMS_KEY_ID: &str = "x-amz-server-side-encryption-aws-kms-key-id";
-pub const AMZ_SERVER_SIDE_ENCRYPTION_CONTEXT: &str = "x-amz-server-side-encryption-context";
-pub const AMZ_SERVER_SIDE_ENCRYPTION_CUSTOMER_ALGORITHM: &str = "x-amz-server-side-encryption-customer-algorithm";
-pub const AMZ_SERVER_SIDE_ENCRYPTION_CUSTOMER_KEY: &str = "x-amz-server-side-encryption-customer-key";
-pub const AMZ_SERVER_SIDE_ENCRYPTION_CUSTOMER_KEY_MD5: &str = "x-amz-server-side-encryption-customer-key-md5";
-
-// SSE-C copy source headers
-pub const AMZ_COPY_SOURCE_SERVER_SIDE_ENCRYPTION_CUSTOMER_ALGORITHM: &str =
-    "x-amz-copy-source-server-side-encryption-customer-algorithm";
-pub const AMZ_COPY_SOURCE_SERVER_SIDE_ENCRYPTION_CUSTOMER_KEY: &str = "x-amz-copy-source-server-side-encryption-customer-key";
-pub const AMZ_COPY_SOURCE_SERVER_SIDE_ENCRYPTION_CUSTOMER_KEY_MD5: &str =
-    "x-amz-copy-source-server-side-encryption-customer-key-md5";
diff --git a/crates/filemeta/src/metacache.rs b/crates/filemeta/src/metacache.rs
index c07de472..8daa9c1a 100644
--- a/crates/filemeta/src/metacache.rs
+++ b/crates/filemeta/src/metacache.rs
@@ -831,10 +831,16 @@ impl Cache {
         }
     }

+    #[allow(unsafe_code)]
     async fn update(&self) -> std::io::Result<()> {
         match (self.update_fn)().await {
             Ok(val) => {
-                self.val.store(Box::into_raw(Box::new(val)), AtomicOrdering::SeqCst);
+                let old = self.val.swap(Box::into_raw(Box::new(val)), AtomicOrdering::SeqCst);
+                if !old.is_null() {
+                    unsafe {
+                        drop(Box::from_raw(old));
+                    }
+                }
                 let now = SystemTime::now()
                     .duration_since(UNIX_EPOCH)
                     .expect("Time went backwards")
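
The `metacache.rs` hunk above turns a plain `store` of a newly boxed value into a `swap` so the previous allocation can be reclaimed; the `#[allow(unsafe_code)]` is there because reboxing the old pointer needs an `unsafe` block. A minimal sketch of the same pattern with a hypothetical `CachedValue` type, assuming no reader still holds the old raw pointer when it is dropped:

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical single-slot cache illustrating the swap-then-drop fix.
struct CachedValue<T> {
    slot: AtomicPtr<T>,
}

impl<T> CachedValue<T> {
    fn new(initial: T) -> Self {
        Self {
            slot: AtomicPtr::new(Box::into_raw(Box::new(initial))),
        }
    }

    /// Publish a new value. A bare `store` would leak the old allocation;
    /// `swap` hands back the previous pointer so it can be reboxed and dropped.
    fn publish(&self, value: T) {
        let old = self.slot.swap(Box::into_raw(Box::new(value)), Ordering::SeqCst);
        if !old.is_null() {
            // SAFETY: `old` came from Box::into_raw in `new`/`publish` and is no
            // longer reachable through `slot` after the swap. This assumes no
            // reader still dereferences the old pointer at this point.
            unsafe { drop(Box::from_raw(old)) };
        }
    }
}

impl<T> Drop for CachedValue<T> {
    fn drop(&mut self) {
        let ptr = *self.slot.get_mut();
        if !ptr.is_null() {
            // SAFETY: exclusive access in Drop; the pointer came from Box::into_raw.
            unsafe { drop(Box::from_raw(ptr)) };
        }
    }
}

fn main() {
    let cache = CachedValue::new(1u64);
    cache.publish(2); // the old boxed `1` is freed here instead of leaking
}
```

Without the swap-and-drop, every refresh would leak one boxed value for the lifetime of the process, which is what the hunk addresses.
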
diff --git a/crates/filemeta/src/replication.rs b/crates/filemeta/src/replication.rs index 1f3358c8..1137c144 100644 --- a/crates/filemeta/src/replication.rs +++ b/crates/filemeta/src/replication.rs @@ -709,7 +709,7 @@ pub fn parse_replicate_decision(_bucket: &str, s: &str) -> std::io::Result Error::PolicyTooLarge, Error::ConfigNotFound => Error::ConfigNotFound, Error::Io(e) => Error::Io(std::io::Error::new(e.kind(), e.to_string())), + Error::IamSysAlreadyInitialized => Error::IamSysAlreadyInitialized, } } } @@ -226,6 +230,7 @@ impl From for Error { rustfs_policy::error::Error::StringError(s) => Error::StringError(s), rustfs_policy::error::Error::CryptoError(e) => Error::CryptoError(e), rustfs_policy::error::Error::ErrCredMalformed => Error::ErrCredMalformed, + rustfs_policy::error::Error::IamSysAlreadyInitialized => Error::IamSysAlreadyInitialized, } } } diff --git a/crates/iam/src/lib.rs b/crates/iam/src/lib.rs index ebefb72f..f217b84e 100644 --- a/crates/iam/src/lib.rs +++ b/crates/iam/src/lib.rs @@ -18,30 +18,58 @@ use rustfs_ecstore::store::ECStore; use std::sync::{Arc, OnceLock}; use store::object::ObjectStore; use sys::IamSys; -use tracing::{debug, instrument}; +use tracing::{error, info, instrument}; pub mod cache; pub mod error; pub mod manager; pub mod store; -pub mod utils; - pub mod sys; +pub mod utils; static IAM_SYS: OnceLock>> = OnceLock::new(); #[instrument(skip(ecstore))] pub async fn init_iam_sys(ecstore: Arc) -> Result<()> { - debug!("init iam system"); - let s = IamCache::new(ObjectStore::new(ecstore)).await; + if IAM_SYS.get().is_some() { + info!("IAM system already initialized, skipping."); + return Ok(()); + } - IAM_SYS.get_or_init(move || IamSys::new(s).into()); + info!("Starting IAM system initialization sequence..."); + + // 1. Create the persistent storage adapter + let storage_adapter = ObjectStore::new(ecstore); + + // 2. Create the cache manager. + // The `new` method now performs a blocking initial load from disk. + let cache_manager = IamCache::new(storage_adapter).await; + + // 3. Construct the system interface + let iam_instance = Arc::new(IamSys::new(cache_manager)); + + // 4. Securely set the global singleton + if IAM_SYS.set(iam_instance).is_err() { + error!("Critical: Race condition detected during IAM initialization!"); + return Err(Error::IamSysAlreadyInitialized); + } + + info!("IAM system initialization completed successfully."); Ok(()) } #[inline] pub fn get() -> Result>> { - IAM_SYS.get().map(Arc::clone).ok_or(Error::IamSysNotInitialized) + let sys = IAM_SYS.get().map(Arc::clone).ok_or(Error::IamSysNotInitialized)?; + + // Double-check the internal readiness state. The OnceLock is only set + // after initialization and data loading complete, so this is a defensive + // guard to ensure callers never operate on a partially initialized system. 
+ if !sys.is_ready() { + return Err(Error::IamSysNotInitialized); + } + + Ok(sys) } pub fn get_global_iam_sys() -> Option>> { diff --git a/crates/iam/src/manager.rs b/crates/iam/src/manager.rs index 7153bba4..5fa5220b 100644 --- a/crates/iam/src/manager.rs +++ b/crates/iam/src/manager.rs @@ -23,6 +23,7 @@ use crate::{ UpdateServiceAccountOpts, }, }; +use futures::future::join_all; use rustfs_ecstore::global::get_global_action_cred; use rustfs_madmin::{AccountStatus, AddOrUpdateUserReq, GroupDesc}; use rustfs_policy::{ @@ -36,6 +37,7 @@ use rustfs_policy::{ use rustfs_utils::path::path_join_buf; use serde::{Deserialize, Serialize}; use serde_json::Value; +use std::sync::atomic::AtomicU8; use std::{ collections::{HashMap, HashSet}, sync::{ @@ -75,9 +77,19 @@ fn get_iam_format_file_path() -> String { path_join_buf(&[&IAM_CONFIG_PREFIX, IAM_FORMAT_FILE]) } +#[repr(u8)] +#[derive(Debug, PartialEq)] +pub enum IamState { + Uninitialized = 0, + Loading = 1, + Ready = 2, + Error = 3, +} + pub struct IamCache { pub cache: Cache, pub api: T, + pub state: Arc, pub loading: Arc, pub roles: HashMap>, pub send_chan: Sender, @@ -88,12 +100,19 @@ impl IamCache where T: Store, { + /// Create a new IAM system instance + /// # Arguments + /// * `api` - The storage backend implementing the Store trait + /// + /// # Returns + /// An Arc-wrapped instance of IamSystem pub(crate) async fn new(api: T) -> Arc { let (sender, receiver) = mpsc::channel::(100); let sys = Arc::new(Self { api, cache: Cache::default(), + state: Arc::new(AtomicU8::new(IamState::Uninitialized as u8)), loading: Arc::new(AtomicBool::new(false)), send_chan: sender, roles: HashMap::new(), @@ -104,10 +123,32 @@ where sys } + /// Initialize the IAM system async fn init(self: Arc, receiver: Receiver) -> Result<()> { + self.state.store(IamState::Loading as u8, Ordering::SeqCst); + // Ensure the IAM format file is persisted first self.clone().save_iam_formatter().await?; - self.clone().load().await?; + // Critical: Load all existing users/policies into memory cache + const MAX_RETRIES: usize = 3; + for attempt in 0..MAX_RETRIES { + if let Err(e) = self.clone().load().await { + if attempt == MAX_RETRIES - 1 { + self.state.store(IamState::Error as u8, Ordering::SeqCst); + error!("IAM fail to load initial data after {} attempts: {:?}", MAX_RETRIES, e); + return Err(e); + } else { + warn!("IAM load failed, retrying... 
attempt {}", attempt + 1); + tokio::time::sleep(Duration::from_secs(1)).await; + } + } else { + break; + } + } + self.state.store(IamState::Ready as u8, Ordering::SeqCst); + info!("IAM System successfully initialized and marked as READY"); + + // Background ticker for synchronization // Check if environment variable is set let skip_background_task = std::env::var("RUSTFS_SKIP_BACKGROUND_TASK").is_ok(); @@ -151,6 +192,11 @@ where Ok(()) } + /// Check if IAM system is ready + pub fn is_ready(&self) -> bool { + self.state.load(Ordering::SeqCst) == IamState::Ready as u8 + } + async fn _notify(&self) { self.send_chan.send(OffsetDateTime::now_utc().unix_timestamp()).await.unwrap(); } @@ -402,13 +448,25 @@ where self.cache.policy_docs.store(Arc::new(cache)); - let ret = m + let items: Vec<_> = m.into_iter().map(|(k, v)| (k, v.policy.clone())).collect(); + + let futures: Vec<_> = items.iter().map(|(_, policy)| policy.match_resource(bucket_name)).collect(); + + let results = join_all(futures).await; + + let filtered = items .into_iter() - .filter(|(_, v)| bucket_name.is_empty() || v.policy.match_resource(bucket_name)) - .map(|(k, v)| (k, v.policy)) + .zip(results) + .filter_map(|((k, policy), matches)| { + if bucket_name.is_empty() || matches { + Some((k, policy)) + } else { + None + } + }) .collect(); - Ok(ret) + Ok(filtered) } pub async fn merge_policies(&self, name: &str) -> (String, Policy) { @@ -456,22 +514,51 @@ where self.cache.policy_docs.store(Arc::new(cache)); - let ret = m - .into_iter() - .filter(|(_, v)| bucket_name.is_empty() || v.policy.match_resource(bucket_name)) + let items: Vec<_> = m.into_iter().map(|(k, v)| (k, v.clone())).collect(); + + let futures: Vec<_> = items + .iter() + .map(|(_, policy_doc)| policy_doc.policy.match_resource(bucket_name)) .collect(); - Ok(ret) + let results = join_all(futures).await; + + let filtered = items + .into_iter() + .zip(results) + .filter_map(|((k, policy_doc), matches)| { + if bucket_name.is_empty() || matches { + Some((k, policy_doc)) + } else { + None + } + }) + .collect(); + + Ok(filtered) } pub async fn list_policy_docs_internal(&self, bucket_name: &str) -> Result> { - let ret = self - .cache - .policy_docs - .load() + let cache = self.cache.policy_docs.load(); + let items: Vec<_> = cache.iter().map(|(k, v)| (k.clone(), v.clone())).collect(); + + let futures: Vec<_> = items .iter() - .filter(|(_, v)| bucket_name.is_empty() || v.policy.match_resource(bucket_name)) - .map(|(k, v)| (k.clone(), v.clone())) + .map(|(_, policy_doc)| policy_doc.policy.match_resource(bucket_name)) + .collect(); + + let results = join_all(futures).await; + + let ret = items + .into_iter() + .zip(results) + .filter_map(|((k, policy_doc), matches)| { + if bucket_name.is_empty() || matches { + Some((k, policy_doc)) + } else { + None + } + }) .collect(); Ok(ret) @@ -1753,7 +1840,7 @@ fn filter_policies(cache: &Cache, policy_name: &str, bucket_name: &str) -> (Stri } if let Some(p) = cache.policy_docs.load().get(&policy) { - if bucket_name.is_empty() || p.policy.match_resource(bucket_name) { + if bucket_name.is_empty() || pollster::block_on(p.policy.match_resource(bucket_name)) { policies.push(policy); to_merge.push(p.policy.clone()); } diff --git a/crates/iam/src/store/object.rs b/crates/iam/src/store/object.rs index 0390587c..5479465b 100644 --- a/crates/iam/src/store/object.rs +++ b/crates/iam/src/store/object.rs @@ -38,7 +38,7 @@ use std::sync::LazyLock; use std::{collections::HashMap, sync::Arc}; use tokio::sync::mpsc::{self, Sender}; use 
tokio_util::sync::CancellationToken; -use tracing::{info, warn}; +use tracing::{debug, error, info, warn}; pub static IAM_CONFIG_PREFIX: LazyLock = LazyLock::new(|| format!("{RUSTFS_CONFIG_PREFIX}/iam")); pub static IAM_CONFIG_USERS_PREFIX: LazyLock = LazyLock::new(|| format!("{RUSTFS_CONFIG_PREFIX}/iam/users/")); @@ -341,6 +341,27 @@ impl ObjectStore { Ok(policies) } + /// Checks if the underlying ECStore is ready for metadata operations. + /// This prevents silent failures during the storage boot-up phase. + /// + /// Performs a lightweight probe by attempting to read a known configuration object. + /// If the object is not found, it indicates the storage metadata is not ready. + /// The upper-level caller should handle retries if needed. + async fn check_storage_readiness(&self) -> Result<()> { + // Probe path for a fixed object under the IAM root prefix. + // If it doesn't exist, the system bucket or metadata is not ready. + let probe_path = format!("{}/format.json", *IAM_CONFIG_PREFIX); + + match read_config(self.object_api.clone(), &probe_path).await { + Ok(_) => Ok(()), + Err(rustfs_ecstore::error::StorageError::ConfigNotFound) => Err(Error::other(format!( + "Storage metadata not ready: probe object '{}' not found (expected IAM config to be initialized)", + probe_path + ))), + Err(e) => Err(e.into()), + } + } + // async fn load_policy(&self, name: &str) -> Result { // let mut policy = self // .load_iam_config::(&format!("config/iam/policies/{name}/policy.json")) @@ -398,13 +419,50 @@ impl Store for ObjectStore { Ok(serde_json::from_slice(&data)?) } + /// Saves IAM configuration with a retry mechanism on failure. + /// + /// Attempts to save the IAM configuration up to 5 times if the storage layer is not ready, + /// using exponential backoff between attempts (starting at 200ms, doubling each retry). + /// + /// # Arguments + /// + /// * `item` - The IAM configuration item to save, must implement `Serialize` and `Send`. + /// * `path` - The path where the configuration will be saved. + /// + /// # Returns + /// + /// * `Result<()>` - `Ok(())` on success, or an `Error` if all attempts fail. #[tracing::instrument(level = "debug", skip(self, item, path))] async fn save_iam_config(&self, item: Item, path: impl AsRef + Send) -> Result<()> { let mut data = serde_json::to_vec(&item)?; data = Self::encrypt_data(&data)?; - save_config(self.object_api.clone(), path.as_ref(), data).await?; - Ok(()) + let mut attempts = 0; + let max_attempts = 5; + let path_ref = path.as_ref(); + + loop { + match save_config(self.object_api.clone(), path_ref, data.clone()).await { + Ok(_) => { + debug!("Successfully saved IAM config to {}", path_ref); + return Ok(()); + } + Err(e) if attempts < max_attempts => { + attempts += 1; + // Exponential backoff: 200ms, 400ms, 800ms... + let wait_ms = 200 * (1 << attempts); + warn!( + "Storage layer not ready for IAM write (attempt {}/{}). Retrying in {}ms. 
Path: {}, Error: {:?}", + attempts, max_attempts, wait_ms, path_ref, e + ); + tokio::time::sleep(std::time::Duration::from_millis(wait_ms)).await; + } + Err(e) => { + error!("Final failure saving IAM config to {}: {:?}", path_ref, e); + return Err(e.into()); + } + } + } } async fn delete_iam_config(&self, path: impl AsRef + Send) -> Result<()> { delete_config(self.object_api.clone(), path.as_ref()).await?; @@ -418,8 +476,16 @@ impl Store for ObjectStore { user_identity: UserIdentity, _ttl: Option, ) -> Result<()> { - self.save_iam_config(user_identity, get_user_identity_path(name, user_type)) - .await + // Pre-check storage health + self.check_storage_readiness().await?; + + let path = get_user_identity_path(name, user_type); + debug!("Saving IAM identity to path: {}", path); + + self.save_iam_config(user_identity, path).await.map_err(|e| { + error!("ObjectStore save failure for {}: {:?}", name, e); + e + }) } async fn delete_user_identity(&self, name: &str, user_type: UserType) -> Result<()> { self.delete_iam_config(get_user_identity_path(name, user_type)) diff --git a/crates/iam/src/sys.rs b/crates/iam/src/sys.rs index 94f9e96a..a05cdb6b 100644 --- a/crates/iam/src/sys.rs +++ b/crates/iam/src/sys.rs @@ -67,6 +67,13 @@ pub struct IamSys { } impl IamSys { + /// Create a new IamSys instance with the given IamCache store + /// + /// # Arguments + /// * `store` - An Arc to the IamCache instance + /// + /// # Returns + /// A new instance of IamSys pub fn new(store: Arc>) -> Self { tokio::spawn(async move { match opa::lookup_config().await { @@ -87,6 +94,11 @@ impl IamSys { roles_map: HashMap::new(), } } + + /// Check if the IamSys has a watcher configured + /// + /// # Returns + /// `true` if a watcher is configured, `false` otherwise pub fn has_watcher(&self) -> bool { self.store.api.has_watcher() } @@ -755,10 +767,10 @@ impl IamSys { let (has_session_policy, is_allowed_sp) = is_allowed_by_session_policy(args); if has_session_policy { - return is_allowed_sp && (is_owner || combined_policy.is_allowed(args)); + return is_allowed_sp && (is_owner || combined_policy.is_allowed(args).await); } - is_owner || combined_policy.is_allowed(args) + is_owner || combined_policy.is_allowed(args).await } pub async fn is_allowed_service_account(&self, args: &Args<'_>, parent_user: &str) -> bool { @@ -814,15 +826,15 @@ impl IamSys { }; if sa_str == INHERITED_POLICY_TYPE { - return is_owner || combined_policy.is_allowed(&parent_args); + return is_owner || combined_policy.is_allowed(&parent_args).await; } let (has_session_policy, is_allowed_sp) = is_allowed_by_session_policy_for_service_account(args); if has_session_policy { - return is_allowed_sp && (is_owner || combined_policy.is_allowed(&parent_args)); + return is_allowed_sp && (is_owner || combined_policy.is_allowed(&parent_args).await); } - is_owner || combined_policy.is_allowed(&parent_args) + is_owner || combined_policy.is_allowed(&parent_args).await } pub async fn get_combined_policy(&self, policies: &[String]) -> Policy { @@ -857,7 +869,12 @@ impl IamSys { return false; } - self.get_combined_policy(&policies).await.is_allowed(args) + self.get_combined_policy(&policies).await.is_allowed(args).await + } + + /// Check if the underlying store is ready + pub fn is_ready(&self) -> bool { + self.store.is_ready() } } @@ -883,7 +900,7 @@ fn is_allowed_by_session_policy(args: &Args<'_>) -> (bool, bool) { let mut session_policy_args = args.clone(); session_policy_args.is_owner = false; - (has_session_policy, sub_policy.is_allowed(&session_policy_args)) + 
(has_session_policy, pollster::block_on(sub_policy.is_allowed(&session_policy_args))) } fn is_allowed_by_session_policy_for_service_account(args: &Args<'_>) -> (bool, bool) { @@ -909,7 +926,7 @@ fn is_allowed_by_session_policy_for_service_account(args: &Args<'_>) -> (bool, b let mut session_policy_args = args.clone(); session_policy_args.is_owner = false; - (has_session_policy, sub_policy.is_allowed(&session_policy_args)) + (has_session_policy, pollster::block_on(sub_policy.is_allowed(&session_policy_args))) } #[derive(Debug, Clone, Default)] diff --git a/crates/kms/Cargo.toml b/crates/kms/Cargo.toml index 5e9e0159..912121c6 100644 --- a/crates/kms/Cargo.toml +++ b/crates/kms/Cargo.toml @@ -61,7 +61,6 @@ reqwest = { workspace = true } vaultrs = { workspace = true } [dev-dependencies] -tokio-test = { workspace = true } tempfile = { workspace = true } [features] diff --git a/crates/lock/src/fast_lock/benchmarks.rs b/crates/lock/src/fast_lock/benchmarks.rs deleted file mode 100644 index 930a5a81..00000000 --- a/crates/lock/src/fast_lock/benchmarks.rs +++ /dev/null @@ -1,325 +0,0 @@ -// Copyright 2024 RustFS Team -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. - -// Benchmarks comparing fast lock vs old lock performance - -#[cfg(test)] -#[allow(dead_code)] // Temporarily disable benchmark tests -mod benchmarks { - use super::super::*; - use std::sync::Arc; - use std::time::{Duration, Instant}; - use tokio::task; - - /// Benchmark single-threaded lock operations - #[tokio::test] - async fn bench_single_threaded_fast_locks() { - let manager = Arc::new(FastObjectLockManager::new()); - let iterations = 10000; - - // Warm up - for i in 0..100 { - let _guard = manager - .acquire_write_lock("bucket", &format!("warm_{}", i), "owner") - .await - .unwrap(); - } - - // Benchmark write locks - let start = Instant::now(); - for i in 0..iterations { - let _guard = manager - .acquire_write_lock("bucket", &format!("object_{}", i), "owner") - .await - .unwrap(); - } - let duration = start.elapsed(); - - println!("Fast locks: {} write locks in {:?}", iterations, duration); - println!("Average: {:?} per lock", duration / iterations); - - let metrics = manager.get_metrics(); - println!("Fast path rate: {:.2}%", metrics.shard_metrics.fast_path_rate() * 100.0); - - // Should be much faster than old implementation - assert!(duration.as_millis() < 1000, "Should complete 10k locks in <1s"); - assert!(metrics.shard_metrics.fast_path_rate() > 0.95, "Should have >95% fast path rate"); - } - - /// Benchmark concurrent lock operations - #[tokio::test] - async fn bench_concurrent_fast_locks() { - let manager = Arc::new(FastObjectLockManager::new()); - let concurrent_tasks = 100; - let iterations_per_task = 100; - - let start = Instant::now(); - - let mut handles = Vec::new(); - for task_id in 0..concurrent_tasks { - let manager_clone = manager.clone(); - let handle = task::spawn(async move { - for i in 0..iterations_per_task { - let object_name = format!("obj_{}_{}", task_id, i); - let _guard = manager_clone - 
.acquire_write_lock("bucket", &object_name, &format!("owner_{}", task_id)) - .await - .unwrap(); - - // Simulate some work - tokio::task::yield_now().await; - } - }); - handles.push(handle); - } - - // Wait for all tasks - for handle in handles { - handle.await.unwrap(); - } - - let duration = start.elapsed(); - let total_ops = concurrent_tasks * iterations_per_task; - - println!("Concurrent fast locks: {} operations across {} tasks in {:?}", - total_ops, concurrent_tasks, duration); - println!("Throughput: {:.2} ops/sec", total_ops as f64 / duration.as_secs_f64()); - - let metrics = manager.get_metrics(); - println!("Fast path rate: {:.2}%", metrics.shard_metrics.fast_path_rate() * 100.0); - println!("Contention events: {}", metrics.shard_metrics.contention_events); - - // Should maintain high throughput even with concurrency - assert!(duration.as_millis() < 5000, "Should complete concurrent ops in <5s"); - } - - /// Benchmark contended lock operations - #[tokio::test] - async fn bench_contended_locks() { - let manager = Arc::new(FastObjectLockManager::new()); - let concurrent_tasks = 50; - let shared_objects = 10; // High contention on few objects - let iterations_per_task = 50; - - let start = Instant::now(); - - let mut handles = Vec::new(); - for task_id in 0..concurrent_tasks { - let manager_clone = manager.clone(); - let handle = task::spawn(async move { - for i in 0..iterations_per_task { - let object_name = format!("shared_{}", i % shared_objects); - - // Mix of read and write operations - if i % 3 == 0 { - // Write operation - if let Ok(_guard) = manager_clone - .acquire_write_lock("bucket", &object_name, &format!("owner_{}", task_id)) - .await - { - tokio::task::yield_now().await; - } - } else { - // Read operation - if let Ok(_guard) = manager_clone - .acquire_read_lock("bucket", &object_name, &format!("owner_{}", task_id)) - .await - { - tokio::task::yield_now().await; - } - } - } - }); - handles.push(handle); - } - - // Wait for all tasks - for handle in handles { - handle.await.unwrap(); - } - - let duration = start.elapsed(); - - println!("Contended locks: {} tasks on {} objects in {:?}", - concurrent_tasks, shared_objects, duration); - - let metrics = manager.get_metrics(); - println!("Total acquisitions: {}", metrics.shard_metrics.total_acquisitions()); - println!("Fast path rate: {:.2}%", metrics.shard_metrics.fast_path_rate() * 100.0); - println!("Average wait time: {:?}", metrics.shard_metrics.avg_wait_time()); - println!("Timeout rate: {:.2}%", metrics.shard_metrics.timeout_rate() * 100.0); - - // Even with contention, should maintain reasonable performance - assert!(metrics.shard_metrics.timeout_rate() < 0.1, "Should have <10% timeout rate"); - assert!(metrics.shard_metrics.avg_wait_time() < Duration::from_millis(100), "Avg wait should be <100ms"); - } - - /// Benchmark batch operations - #[tokio::test] - async fn bench_batch_operations() { - let manager = FastObjectLockManager::new(); - let batch_sizes = vec![10, 50, 100, 500]; - - for batch_size in batch_sizes { - // Create batch request - let mut batch = BatchLockRequest::new("batch_owner"); - for i in 0..batch_size { - batch = batch.add_write_lock("bucket", &format!("batch_obj_{}", i)); - } - - let start = Instant::now(); - let result = manager.acquire_locks_batch(batch).await; - let duration = start.elapsed(); - - assert!(result.all_acquired, "Batch should succeed"); - println!("Batch size {}: {:?} ({:.2} μs per lock)", - batch_size, - duration, - duration.as_micros() as f64 / batch_size as f64); - - // Batch 
should be much faster than individual acquisitions - assert!(duration.as_millis() < batch_size as u128 / 10, - "Batch should be 10x+ faster than individual locks"); - } - } - - /// Benchmark version-specific locks - #[tokio::test] - async fn bench_versioned_locks() { - let manager = Arc::new(FastObjectLockManager::new()); - let objects = 100; - let versions_per_object = 10; - - let start = Instant::now(); - - let mut handles = Vec::new(); - for obj_id in 0..objects { - let manager_clone = manager.clone(); - let handle = task::spawn(async move { - for version in 0..versions_per_object { - let _guard = manager_clone - .acquire_write_lock_versioned( - "bucket", - &format!("obj_{}", obj_id), - &format!("v{}", version), - "version_owner" - ) - .await - .unwrap(); - } - }); - handles.push(handle); - } - - for handle in handles { - handle.await.unwrap(); - } - - let duration = start.elapsed(); - let total_ops = objects * versions_per_object; - - println!("Versioned locks: {} version locks in {:?}", total_ops, duration); - println!("Throughput: {:.2} locks/sec", total_ops as f64 / duration.as_secs_f64()); - - let metrics = manager.get_metrics(); - println!("Fast path rate: {:.2}%", metrics.shard_metrics.fast_path_rate() * 100.0); - - // Versioned locks should not interfere with each other - assert!(metrics.shard_metrics.fast_path_rate() > 0.9, "Should maintain high fast path rate"); - } - - /// Compare with theoretical maximum performance - #[tokio::test] - async fn bench_theoretical_maximum() { - let manager = Arc::new(FastObjectLockManager::new()); - let iterations = 100000; - - // Measure pure fast path performance (no contention) - let start = Instant::now(); - for i in 0..iterations { - let _guard = manager - .acquire_write_lock("bucket", &format!("unique_{}", i), "owner") - .await - .unwrap(); - } - let duration = start.elapsed(); - - println!("Theoretical maximum: {} unique locks in {:?}", iterations, duration); - println!("Rate: {:.2} locks/sec", iterations as f64 / duration.as_secs_f64()); - println!("Latency: {:?} per lock", duration / iterations); - - let metrics = manager.get_metrics(); - println!("Fast path rate: {:.2}%", metrics.shard_metrics.fast_path_rate() * 100.0); - - // Should achieve very high performance with no contention - assert!(metrics.shard_metrics.fast_path_rate() > 0.99, "Should be nearly 100% fast path"); - assert!(duration.as_secs_f64() / (iterations as f64) < 0.0001, "Should be <100μs per lock"); - } - - /// Performance regression test - #[tokio::test] - async fn performance_regression_test() { - let manager = Arc::new(FastObjectLockManager::new()); - - // This test ensures we maintain performance targets - let test_cases = vec![ - ("single_thread", 1, 10000), - ("low_contention", 10, 1000), - ("high_contention", 100, 100), - ]; - - for (test_name, threads, ops_per_thread) in test_cases { - let start = Instant::now(); - - let mut handles = Vec::new(); - for thread_id in 0..threads { - let manager_clone = manager.clone(); - let handle = task::spawn(async move { - for op_id in 0..ops_per_thread { - let object = if threads == 1 { - format!("obj_{}_{}", thread_id, op_id) - } else { - format!("obj_{}", op_id % 100) // Create contention - }; - - let owner = format!("owner_{}", thread_id); - let _guard = manager_clone - .acquire_write_lock("bucket", object, owner) - .await - .unwrap(); - } - }); - handles.push(handle); - } - - for handle in handles { - handle.await.unwrap(); - } - - let duration = start.elapsed(); - let total_ops = threads * ops_per_thread; - let 
ops_per_sec = total_ops as f64 / duration.as_secs_f64(); - - println!("{}: {:.2} ops/sec", test_name, ops_per_sec); - - // Performance targets (adjust based on requirements) - match test_name { - "single_thread" => assert!(ops_per_sec > 50000.0, "Single thread should exceed 50k ops/sec"), - "low_contention" => assert!(ops_per_sec > 20000.0, "Low contention should exceed 20k ops/sec"), - "high_contention" => assert!(ops_per_sec > 5000.0, "High contention should exceed 5k ops/sec"), - _ => {} - } - } - } -} \ No newline at end of file diff --git a/crates/lock/src/fast_lock/mod.rs b/crates/lock/src/fast_lock/mod.rs index d6e89243..3cd4b9c9 100644 --- a/crates/lock/src/fast_lock/mod.rs +++ b/crates/lock/src/fast_lock/mod.rs @@ -37,9 +37,6 @@ pub mod shard; pub mod state; pub mod types; -// #[cfg(test)] -// pub mod benchmarks; // Temporarily disabled due to compilation issues - // Re-export main types pub use disabled_manager::DisabledLockManager; pub use guard::FastLockGuard; diff --git a/crates/mcp/Dockerfile b/crates/mcp/Dockerfile index 5ec9501c..d9c95e94 100644 --- a/crates/mcp/Dockerfile +++ b/crates/mcp/Dockerfile @@ -12,4 +12,6 @@ WORKDIR /app COPY --from=builder /build/target/release/rustfs-mcp /app/ -ENTRYPOINT ["/app/rustfs-mcp"] \ No newline at end of file +RUN apt-get update && apt-get install -y ca-certificates && update-ca-certificates + +ENTRYPOINT ["/app/rustfs-mcp"] diff --git a/crates/notify/Cargo.toml b/crates/notify/Cargo.toml index bdccac6c..a4626675 100644 --- a/crates/notify/Cargo.toml +++ b/crates/notify/Cargo.toml @@ -28,8 +28,9 @@ documentation = "https://docs.rs/rustfs-notify/latest/rustfs_notify/" [dependencies] rustfs-config = { workspace = true, features = ["notify", "constants"] } rustfs-ecstore = { workspace = true } -rustfs-utils = { workspace = true, features = ["path", "sys"] } rustfs-targets = { workspace = true } +rustfs-utils = { workspace = true } +arc-swap = { workspace = true } async-trait = { workspace = true } chrono = { workspace = true, features = ["serde"] } futures = { workspace = true } @@ -40,7 +41,6 @@ rayon = { workspace = true } rumqttc = { workspace = true } rustc-hash = { workspace = true } serde = { workspace = true } -serde_json = { workspace = true } starshard = { workspace = true } thiserror = { workspace = true } tokio = { workspace = true, features = ["rt-multi-thread", "sync", "time"] } @@ -52,6 +52,8 @@ wildmatch = { workspace = true, features = ["serde"] } tokio = { workspace = true, features = ["test-util"] } tracing-subscriber = { workspace = true, features = ["env-filter"] } axum = { workspace = true } +rustfs-utils = { workspace = true, features = ["path", "sys"] } +serde_json = { workspace = true } [lints] workspace = true diff --git a/crates/notify/examples/webhook.rs b/crates/notify/examples/webhook.rs index b0f47dc9..e7d81c94 100644 --- a/crates/notify/examples/webhook.rs +++ b/crates/notify/examples/webhook.rs @@ -110,20 +110,21 @@ async fn reset_webhook_count(Query(params): Query, headers: HeaderM let reason = params.reason.unwrap_or_else(|| "Reason not provided".to_string()); println!("Reset webhook count, reason: {reason}"); - + let time_now = chrono::offset::Utc::now().to_string(); for header in headers { let (key, value) = header; - println!("Header: {key:?}: {value:?}"); + println!("Header: {key:?}: {value:?}, time: {time_now}"); } println!("Reset webhook count printed headers"); // Reset the counter to 0 WEBHOOK_COUNT.store(0, Ordering::SeqCst); println!("Webhook count has been reset to 0."); + let time_now = 
chrono::offset::Utc::now().to_string(); Response::builder() .header("Foo", "Bar") .status(StatusCode::OK) - .body(format!("Webhook count reset successfully current_count:{current_count}")) + .body(format!("Webhook count reset successfully current_count:{current_count},time: {time_now}")) .unwrap() } @@ -167,7 +168,11 @@ async fn receive_webhook(Json(payload): Json) -> StatusCode { serde_json::to_string_pretty(&payload).unwrap() ); WEBHOOK_COUNT.fetch_add(1, Ordering::SeqCst); - println!("Total webhook requests received: {}", WEBHOOK_COUNT.load(Ordering::SeqCst)); + println!( + "Total webhook requests received: {} , Time: {}", + WEBHOOK_COUNT.load(Ordering::SeqCst), + chrono::offset::Utc::now() + ); StatusCode::OK } diff --git a/crates/notify/src/event.rs b/crates/notify/src/event.rs index 97958e30..ad70e51e 100644 --- a/crates/notify/src/event.rs +++ b/crates/notify/src/event.rs @@ -20,6 +20,7 @@ use url::form_urlencoded; /// Represents the identity of the user who triggered the event #[derive(Debug, Clone, Serialize, Deserialize)] +#[serde(rename_all = "camelCase")] pub struct Identity { /// The principal ID of the user pub principal_id: String, @@ -27,6 +28,7 @@ pub struct Identity { /// Represents the bucket that the object is in #[derive(Debug, Clone, Serialize, Deserialize)] +#[serde(rename_all = "camelCase")] pub struct Bucket { /// The name of the bucket pub name: String, @@ -38,6 +40,7 @@ pub struct Bucket { /// Represents the object that the event occurred on #[derive(Debug, Clone, Serialize, Deserialize, Default)] +#[serde(rename_all = "camelCase")] pub struct Object { /// The key (name) of the object pub key: String, @@ -62,6 +65,7 @@ pub struct Object { /// Metadata about the event #[derive(Debug, Clone, Serialize, Deserialize)] +#[serde(rename_all = "camelCase")] pub struct Metadata { /// The schema version of the event #[serde(rename = "s3SchemaVersion")] @@ -76,13 +80,13 @@ pub struct Metadata { /// Information about the source of the event #[derive(Debug, Clone, Serialize, Deserialize)] +#[serde(rename_all = "camelCase")] pub struct Source { /// The host where the event originated pub host: String, /// The port on the host pub port: String, /// The user agent that caused the event - #[serde(rename = "userAgent")] pub user_agent: String, } diff --git a/crates/notify/src/factory.rs b/crates/notify/src/factory.rs index 84cf1be6..fb4d6312 100644 --- a/crates/notify/src/factory.rs +++ b/crates/notify/src/factory.rs @@ -18,9 +18,9 @@ use hashbrown::HashSet; use rumqttc::QoS; use rustfs_config::notify::{ENV_NOTIFY_MQTT_KEYS, ENV_NOTIFY_WEBHOOK_KEYS, NOTIFY_MQTT_KEYS, NOTIFY_WEBHOOK_KEYS}; use rustfs_config::{ - DEFAULT_DIR, DEFAULT_LIMIT, MQTT_BROKER, MQTT_KEEP_ALIVE_INTERVAL, MQTT_PASSWORD, MQTT_QOS, MQTT_QUEUE_DIR, MQTT_QUEUE_LIMIT, - MQTT_RECONNECT_INTERVAL, MQTT_TOPIC, MQTT_USERNAME, WEBHOOK_AUTH_TOKEN, WEBHOOK_CLIENT_CERT, WEBHOOK_CLIENT_KEY, - WEBHOOK_ENDPOINT, WEBHOOK_QUEUE_DIR, WEBHOOK_QUEUE_LIMIT, + DEFAULT_LIMIT, EVENT_DEFAULT_DIR, MQTT_BROKER, MQTT_KEEP_ALIVE_INTERVAL, MQTT_PASSWORD, MQTT_QOS, MQTT_QUEUE_DIR, + MQTT_QUEUE_LIMIT, MQTT_RECONNECT_INTERVAL, MQTT_TOPIC, MQTT_USERNAME, WEBHOOK_AUTH_TOKEN, WEBHOOK_CLIENT_CERT, + WEBHOOK_CLIENT_KEY, WEBHOOK_ENDPOINT, WEBHOOK_QUEUE_DIR, WEBHOOK_QUEUE_LIMIT, }; use rustfs_ecstore::config::KVS; use rustfs_targets::{ @@ -60,14 +60,15 @@ impl TargetFactory for WebhookTargetFactory { let endpoint = config .lookup(WEBHOOK_ENDPOINT) .ok_or_else(|| TargetError::Configuration("Missing webhook endpoint".to_string()))?; - let endpoint_url 
= Url::parse(&endpoint) - .map_err(|e| TargetError::Configuration(format!("Invalid endpoint URL: {e} (value: '{endpoint}')")))?; + let parsed_endpoint = endpoint.trim(); + let endpoint_url = Url::parse(parsed_endpoint) + .map_err(|e| TargetError::Configuration(format!("Invalid endpoint URL: {e} (value: '{parsed_endpoint}')")))?; let args = WebhookArgs { enable: true, // If we are here, it's already enabled. endpoint: endpoint_url, auth_token: config.lookup(WEBHOOK_AUTH_TOKEN).unwrap_or_default(), - queue_dir: config.lookup(WEBHOOK_QUEUE_DIR).unwrap_or(DEFAULT_DIR.to_string()), + queue_dir: config.lookup(WEBHOOK_QUEUE_DIR).unwrap_or(EVENT_DEFAULT_DIR.to_string()), queue_limit: config .lookup(WEBHOOK_QUEUE_LIMIT) .and_then(|v| v.parse::().ok()) @@ -100,7 +101,7 @@ impl TargetFactory for WebhookTargetFactory { )); } - let queue_dir = config.lookup(WEBHOOK_QUEUE_DIR).unwrap_or(DEFAULT_DIR.to_string()); + let queue_dir = config.lookup(WEBHOOK_QUEUE_DIR).unwrap_or(EVENT_DEFAULT_DIR.to_string()); if !queue_dir.is_empty() && !std::path::Path::new(&queue_dir).is_absolute() { return Err(TargetError::Configuration("Webhook queue directory must be an absolute path".to_string())); } @@ -159,7 +160,7 @@ impl TargetFactory for MQTTTargetFactory { .and_then(|v| v.parse::().ok()) .map(Duration::from_secs) .unwrap_or_else(|| Duration::from_secs(30)), - queue_dir: config.lookup(MQTT_QUEUE_DIR).unwrap_or(DEFAULT_DIR.to_string()), + queue_dir: config.lookup(MQTT_QUEUE_DIR).unwrap_or(EVENT_DEFAULT_DIR.to_string()), queue_limit: config .lookup(MQTT_QUEUE_LIMIT) .and_then(|v| v.parse::().ok()) diff --git a/crates/notify/src/integration.rs b/crates/notify/src/integration.rs index 4afa0145..ddce7560 100644 --- a/crates/notify/src/integration.rs +++ b/crates/notify/src/integration.rs @@ -12,10 +12,12 @@ // See the License for the specific language governing permissions and // limitations under the License. 
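The factory changes above trim the configured webhook endpoint before parsing it and keep the existing requirement that any on-disk queue directory be an absolute path. A rough stand-alone sketch of those two validations, with plain `String` errors standing in for the crate's `TargetError::Configuration`:

```rust
use std::path::Path;
use url::Url;

/// Rough stand-in for the factory checks above: trim the configured endpoint
/// before parsing it, and require any on-disk queue directory to be absolute.
fn validate_webhook_config(endpoint: &str, queue_dir: &str) -> Result<Url, String> {
    let endpoint = endpoint.trim();
    let url = Url::parse(endpoint).map_err(|e| format!("Invalid endpoint URL: {e} (value: '{endpoint}')"))?;

    // Empty means "no persistent queue"; otherwise the path must be absolute.
    if !queue_dir.is_empty() && !Path::new(queue_dir).is_absolute() {
        return Err("Webhook queue directory must be an absolute path".to_string());
    }
    Ok(url)
}

fn main() {
    assert!(validate_webhook_config(" http://127.0.0.1:3020/webhook ", "").is_ok());
    assert!(validate_webhook_config("http://127.0.0.1:3020/webhook", "relative/dir").is_err());
}
```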
+use crate::notification_system_subscriber::NotificationSystemSubscriberView; use crate::{ Event, error::NotificationError, notifier::EventNotifier, registry::TargetRegistry, rules::BucketNotificationConfig, stream, }; use hashbrown::HashMap; +use rustfs_config::notify::{DEFAULT_NOTIFY_TARGET_STREAM_CONCURRENCY, ENV_NOTIFY_TARGET_STREAM_CONCURRENCY}; use rustfs_ecstore::config::{Config, KVS}; use rustfs_targets::EventName; use rustfs_targets::arn::TargetID; @@ -103,22 +105,22 @@ pub struct NotificationSystem { concurrency_limiter: Arc, /// Monitoring indicators metrics: Arc, + /// Subscriber view + subscriber_view: NotificationSystemSubscriberView, } impl NotificationSystem { /// Creates a new NotificationSystem pub fn new(config: Config) -> Self { + let concurrency_limiter = + rustfs_utils::get_env_usize(ENV_NOTIFY_TARGET_STREAM_CONCURRENCY, DEFAULT_NOTIFY_TARGET_STREAM_CONCURRENCY); NotificationSystem { + subscriber_view: NotificationSystemSubscriberView::new(), notifier: Arc::new(EventNotifier::new()), registry: Arc::new(TargetRegistry::new()), config: Arc::new(RwLock::new(config)), stream_cancellers: Arc::new(RwLock::new(HashMap::new())), - concurrency_limiter: Arc::new(Semaphore::new( - std::env::var("RUSTFS_TARGET_STREAM_CONCURRENCY") - .ok() - .and_then(|s| s.parse().ok()) - .unwrap_or(20), - )), // Limit the maximum number of concurrent processing events to 20 + concurrency_limiter: Arc::new(Semaphore::new(concurrency_limiter)), // Limit the maximum number of concurrent processing events to 20 metrics: Arc::new(NotificationMetrics::new()), } } @@ -190,8 +192,11 @@ impl NotificationSystem { } /// Checks if there are active subscribers for the given bucket and event name. - pub async fn has_subscriber(&self, bucket: &str, event_name: &EventName) -> bool { - self.notifier.has_subscriber(bucket, event_name).await + pub async fn has_subscriber(&self, bucket: &str, event: &EventName) -> bool { + if !self.subscriber_view.has_subscriber(bucket, event) { + return false; + } + self.notifier.has_subscriber(bucket, event).await } async fn update_config_and_reload(&self, mut modifier: F) -> Result<(), NotificationError> @@ -214,6 +219,11 @@ impl NotificationSystem { return Ok(()); } + // Save the modified configuration to storage + rustfs_ecstore::config::com::save_server_config(store, &new_config) + .await + .map_err(|e| NotificationError::SaveConfig(e.to_string()))?; + info!("Configuration updated. Reloading system..."); self.reload_config(new_config).await } @@ -233,15 +243,18 @@ impl NotificationSystem { pub async fn remove_target(&self, target_id: &TargetID, target_type: &str) -> Result<(), NotificationError> { info!("Attempting to remove target: {}", target_id); + let ttype = target_type.to_lowercase(); + let tname = target_id.name.to_lowercase(); + self.update_config_and_reload(|config| { let mut changed = false; - if let Some(targets_of_type) = config.0.get_mut(target_type) { - if targets_of_type.remove(&target_id.name).is_some() { + if let Some(targets_of_type) = config.0.get_mut(&ttype) { + if targets_of_type.remove(&tname).is_some() { info!("Removed target {} from configuration", target_id); changed = true; } if targets_of_type.is_empty() { - config.0.remove(target_type); + config.0.remove(&ttype); } } if !changed { @@ -266,20 +279,24 @@ impl NotificationSystem { /// If the target configuration is invalid, it returns Err(NotificationError::Configuration). 
pub async fn set_target_config(&self, target_type: &str, target_name: &str, kvs: KVS) -> Result<(), NotificationError> { info!("Setting config for target {} of type {}", target_name, target_type); + let ttype = target_type.to_lowercase(); + let tname = target_name.to_lowercase(); self.update_config_and_reload(|config| { - config - .0 - .entry(target_type.to_string()) - .or_default() - .insert(target_name.to_string(), kvs.clone()); + config.0.entry(ttype.clone()).or_default().insert(tname.clone(), kvs.clone()); true // The configuration is always modified }) .await } /// Removes all notification configurations for a bucket. - pub async fn remove_bucket_notification_config(&self, bucket_name: &str) { - self.notifier.remove_rules_map(bucket_name).await; + /// If the configuration is successfully removed, the entire notification system will be automatically reloaded. + /// + /// # Arguments + /// * `bucket` - The name of the bucket whose notification configuration is to be removed. + /// + pub async fn remove_bucket_notification_config(&self, bucket: &str) { + self.subscriber_view.clear_bucket(bucket); + self.notifier.remove_rules_map(bucket).await; } /// Removes a Target configuration. @@ -296,23 +313,50 @@ impl NotificationSystem { /// If the target configuration does not exist, it returns Ok(()) without making any changes. pub async fn remove_target_config(&self, target_type: &str, target_name: &str) -> Result<(), NotificationError> { info!("Removing config for target {} of type {}", target_name, target_type); - self.update_config_and_reload(|config| { - let mut changed = false; - if let Some(targets) = config.0.get_mut(&target_type.to_lowercase()) { - if targets.remove(&target_name.to_lowercase()).is_some() { - changed = true; + + let ttype = target_type.to_lowercase(); + let tname = target_name.to_lowercase(); + + let target_id = TargetID { + id: tname.clone(), + name: ttype.clone(), + }; + + // Deletion is prohibited if bucket rules refer to it + if self.notifier.is_target_bound_to_any_bucket(&target_id).await { + return Err(NotificationError::Configuration(format!( + "Target is still bound to bucket rules and deletion is prohibited: type={} name={}", + ttype, tname + ))); + } + + let config_result = self + .update_config_and_reload(|config| { + let mut changed = false; + if let Some(targets) = config.0.get_mut(&ttype) { + if targets.remove(&tname).is_some() { + changed = true; + } + if targets.is_empty() { + config.0.remove(target_type); + } } - if targets.is_empty() { - config.0.remove(target_type); + if !changed { + info!("Target {} of type {} not found, no changes made.", target_name, target_type); } - } - if !changed { - info!("Target {} of type {} not found, no changes made.", target_name, target_type); - } - debug!("Config after remove: {:?}", config); - changed - }) - .await + debug!("Config after remove: {:?}", config); + changed + }) + .await; + + if config_result.is_ok() { + // Remove from target list + let target_list = self.notifier.target_list(); + let mut target_list_guard = target_list.write().await; + let _ = target_list_guard.remove_target_only(&target_id).await; + } + + config_result } /// Enhanced event stream startup function, including monitoring and concurrency control @@ -343,6 +387,9 @@ impl NotificationSystem { let _ = cancel_tx.send(()).await; } + // Clear the target_list and ensure that reload is a replacement reconstruction (solve the target_list len unchanged/residual problem) + self.notifier.remove_all_bucket_targets().await; + // Update the config 
self.update_config(new_config.clone()).await; @@ -373,15 +420,16 @@ impl NotificationSystem { // The storage of the cloned target and the target itself let store_clone = store.boxed_clone(); - let target_box = target.clone_dyn(); - let target_arc = Arc::from(target_box); - - // Add a reference to the monitoring metrics - let metrics = self.metrics.clone(); - let semaphore = self.concurrency_limiter.clone(); + // let target_box = target.clone_dyn(); + let target_arc = Arc::from(target.clone_dyn()); // Encapsulated enhanced version of start_event_stream - let cancel_tx = self.enhanced_start_event_stream(store_clone, target_arc, metrics, semaphore); + let cancel_tx = self.enhanced_start_event_stream( + store_clone, + target_arc, + self.metrics.clone(), + self.concurrency_limiter.clone(), + ); // Start event stream processing and save cancel sender // let cancel_tx = start_event_stream(store_clone, target_clone); @@ -408,17 +456,18 @@ impl NotificationSystem { /// Loads the bucket notification configuration pub async fn load_bucket_notification_config( &self, - bucket_name: &str, - config: &BucketNotificationConfig, + bucket: &str, + cfg: &BucketNotificationConfig, ) -> Result<(), NotificationError> { - let arn_list = self.notifier.get_arn_list(&config.region).await; + self.subscriber_view.apply_bucket_config(bucket, cfg); + let arn_list = self.notifier.get_arn_list(&cfg.region).await; if arn_list.is_empty() { return Err(NotificationError::Configuration("No targets configured".to_string())); } info!("Available ARNs: {:?}", arn_list); // Validate the configuration against the available ARNs - if let Err(e) = config.validate(&config.region, &arn_list) { - debug!("Bucket notification config validation region:{} failed: {}", &config.region, e); + if let Err(e) = cfg.validate(&cfg.region, &arn_list) { + debug!("Bucket notification config validation region:{} failed: {}", &cfg.region, e); if !e.to_string().contains("ARN not found") { return Err(NotificationError::BucketNotification(e.to_string())); } else { @@ -426,9 +475,9 @@ impl NotificationSystem { } } - let rules_map = config.get_rules_map(); - self.notifier.add_rules_map(bucket_name, rules_map.clone()).await; - info!("Loaded notification config for bucket: {}", bucket_name); + let rules_map = cfg.get_rules_map(); + self.notifier.add_rules_map(bucket, rules_map.clone()).await; + info!("Loaded notification config for bucket: {}", bucket); Ok(()) } diff --git a/crates/notify/src/lib.rs b/crates/notify/src/lib.rs index cc514dbe..4181e4d0 100644 --- a/crates/notify/src/lib.rs +++ b/crates/notify/src/lib.rs @@ -23,6 +23,7 @@ mod event; pub mod factory; mod global; pub mod integration; +mod notification_system_subscriber; pub mod notifier; pub mod registry; pub mod rules; diff --git a/crates/notify/src/notification_system_subscriber.rs b/crates/notify/src/notification_system_subscriber.rs new file mode 100644 index 00000000..11014fb5 --- /dev/null +++ b/crates/notify/src/notification_system_subscriber.rs @@ -0,0 +1,74 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+// See the License for the specific language governing permissions and +// limitations under the License. + +use crate::BucketNotificationConfig; +use crate::rules::{BucketRulesSnapshot, DynRulesContainer, SubscriberIndex}; +use rustfs_targets::EventName; + +/// NotificationSystemSubscriberView - Provides an interface to manage and query +/// the subscription status of buckets in the notification system. +#[derive(Debug)] +pub struct NotificationSystemSubscriberView { + index: SubscriberIndex, +} + +impl NotificationSystemSubscriberView { + /// Creates a new NotificationSystemSubscriberView with an empty SubscriberIndex. + /// + /// Returns a new instance of NotificationSystemSubscriberView. + pub fn new() -> Self { + Self { + index: SubscriberIndex::default(), + } + } + + /// Checks if a bucket has any subscribers for a specific event. + /// This is a quick check using the event mask in the snapshot. + /// + /// # Arguments + /// * `bucket` - The name of the bucket to check. + /// * `event` - The event name to check for subscriptions. + /// + /// Returns `true` if there are subscribers for the event, `false` otherwise. + #[inline] + pub fn has_subscriber(&self, bucket: &str, event: &EventName) -> bool { + self.index.has_subscriber(bucket, event) + } + + /// Builds and atomically replaces a bucket's subscription snapshot from the configuration. + /// + /// Core principle: masks and rules are calculated and stored together in the same update. + /// + /// # Arguments + /// * `bucket` - The name of the bucket to update. + /// * `cfg` - The bucket notification configuration to compile into a snapshot. + pub fn apply_bucket_config(&self, bucket: &str, cfg: &BucketNotificationConfig) { + // *It is recommended to merge compile into one function to ensure the same origin. + let snapshot: BucketRulesSnapshot = cfg.compile_snapshot(); + + // *debug to prevent inconsistencies from being introduced when modifying the compile logic in the future. + snapshot.debug_assert_mask_consistent(); + + self.index.store_snapshot(bucket, snapshot); + } + + /// Clears a bucket's subscription snapshot. + /// + /// #Arguments + /// * `bucket` - The name of the bucket to clear. + #[inline] + pub fn clear_bucket(&self, bucket: &str) { + self.index.clear_bucket(bucket); + } +} diff --git a/crates/notify/src/notifier.rs b/crates/notify/src/notifier.rs index b570fd6f..78beda9c 100644 --- a/crates/notify/src/notifier.rs +++ b/crates/notify/src/notifier.rs @@ -14,19 +14,21 @@ use crate::{error::NotificationError, event::Event, rules::RulesMap}; use hashbrown::HashMap; +use rustfs_config::notify::{DEFAULT_NOTIFY_SEND_CONCURRENCY, ENV_NOTIFY_SEND_CONCURRENCY}; use rustfs_targets::EventName; use rustfs_targets::Target; use rustfs_targets::arn::TargetID; use rustfs_targets::target::EntityTarget; use starshard::AsyncShardedHashMap; use std::sync::Arc; -use tokio::sync::RwLock; +use tokio::sync::{RwLock, Semaphore}; use tracing::{debug, error, info, instrument, warn}; /// Manages event notification to targets based on rules pub struct EventNotifier { target_list: Arc>, bucket_rules_map: Arc>, + send_limiter: Arc, } impl Default for EventNotifier { @@ -37,16 +39,41 @@ impl Default for EventNotifier { impl EventNotifier { /// Creates a new EventNotifier + /// + /// # Returns + /// Returns a new instance of EventNotifier. 
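The subscriber view introduced above lets the hot path answer "does anyone subscribe to this event on this bucket?" from a precomputed snapshot instead of walking the full rules map. `SubscriberIndex` and `BucketRulesSnapshot` are defined elsewhere in the crate; the sketch below only illustrates the general event-mask idea with hypothetical names and a plain `u64` bitmask, not the project's actual snapshot layout.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical event-mask pre-filter: each bucket stores a bitmask of subscribed
/// event kinds, so the common "nobody listens" case is a single load plus an AND.
#[derive(Default)]
struct EventMaskIndex {
    masks: RwLock<HashMap<String, u64>>,
}

impl EventMaskIndex {
    /// Recompute a bucket's mask from the event bits of its compiled rules.
    fn store_mask(&self, bucket: &str, event_bits: impl IntoIterator<Item = u64>) {
        let mask = event_bits.into_iter().fold(0u64, |acc, bit| acc | bit);
        self.masks.write().unwrap().insert(bucket.to_string(), mask);
    }

    /// Cheap check used before consulting the full rules map.
    fn has_subscriber(&self, bucket: &str, event_bit: u64) -> bool {
        self.masks
            .read()
            .unwrap()
            .get(bucket)
            .map(|mask| mask & event_bit != 0)
            .unwrap_or(false)
    }

    fn clear_bucket(&self, bucket: &str) {
        self.masks.write().unwrap().remove(bucket);
    }
}

fn main() {
    const OBJECT_CREATED: u64 = 1 << 0;
    const OBJECT_REMOVED: u64 = 1 << 1;

    let index = EventMaskIndex::default();
    index.store_mask("photos", [OBJECT_CREATED]);

    assert!(index.has_subscriber("photos", OBJECT_CREATED));
    assert!(!index.has_subscriber("photos", OBJECT_REMOVED));
    index.clear_bucket("photos");
    assert!(!index.has_subscriber("photos", OBJECT_CREATED));
}
```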
pub fn new() -> Self { + let max_inflight = rustfs_utils::get_env_usize(ENV_NOTIFY_SEND_CONCURRENCY, DEFAULT_NOTIFY_SEND_CONCURRENCY); EventNotifier { target_list: Arc::new(RwLock::new(TargetList::new())), bucket_rules_map: Arc::new(AsyncShardedHashMap::new(0)), + send_limiter: Arc::new(Semaphore::new(max_inflight)), } } + /// Checks whether a TargetID is still referenced by any bucket's rules. + /// + /// # Arguments + /// * `target_id` - The TargetID to check. + /// + /// # Returns + /// Returns `true` if the TargetID is bound to any bucket, otherwise `false`. + pub async fn is_target_bound_to_any_bucket(&self, target_id: &TargetID) -> bool { + // `AsyncShardedHashMap::iter()`: Traverse (bucket_name, rules_map) + let items = self.bucket_rules_map.iter().await; + for (_bucket, rules_map) in items { + if rules_map.contains_target_id(target_id) { + return true; + } + } + false + } + /// Returns a reference to the target list /// This method provides access to the target list for external use. /// + /// # Returns + /// Returns an `Arc>` representing the target list. pub fn target_list(&self) -> Arc> { Arc::clone(&self.target_list) } @@ -54,17 +81,23 @@ impl EventNotifier { /// Removes all notification rules for a bucket /// /// # Arguments - /// * `bucket_name` - The name of the bucket for which to remove rules + /// * `bucket` - The name of the bucket for which to remove rules /// /// This method removes all rules associated with the specified bucket name. /// It will log a message indicating the removal of rules. - pub async fn remove_rules_map(&self, bucket_name: &str) { - if self.bucket_rules_map.remove(&bucket_name.to_string()).await.is_some() { - info!("Removed all notification rules for bucket: {}", bucket_name); + pub async fn remove_rules_map(&self, bucket: &str) { + if self.bucket_rules_map.remove(&bucket.to_string()).await.is_some() { + info!("Removed all notification rules for bucket: {}", bucket); } } /// Returns a list of ARNs for the registered targets + /// + /// # Arguments + /// * `region` - The region to use for generating the ARNs + /// + /// # Returns + /// Returns a vector of strings representing the ARNs of the registered targets pub async fn get_arn_list(&self, region: &str) -> Vec { let target_list_guard = self.target_list.read().await; target_list_guard @@ -75,24 +108,37 @@ impl EventNotifier { } /// Adds a rules map for a bucket - pub async fn add_rules_map(&self, bucket_name: &str, rules_map: RulesMap) { + /// + /// # Arguments + /// * `bucket` - The name of the bucket for which to add the rules map + /// * `rules_map` - The rules map to add for the bucket + pub async fn add_rules_map(&self, bucket: &str, rules_map: RulesMap) { if rules_map.is_empty() { - self.bucket_rules_map.remove(&bucket_name.to_string()).await; + self.bucket_rules_map.remove(&bucket.to_string()).await; } else { - self.bucket_rules_map.insert(bucket_name.to_string(), rules_map).await; + self.bucket_rules_map.insert(bucket.to_string(), rules_map).await; } - info!("Added rules for bucket: {}", bucket_name); + info!("Added rules for bucket: {}", bucket); } /// Gets the rules map for a specific bucket. - pub async fn get_rules_map(&self, bucket_name: &str) -> Option { - self.bucket_rules_map.get(&bucket_name.to_string()).await + /// + /// # Arguments + /// * `bucket` - The name of the bucket for which to get the rules map + /// + /// # Returns + /// Returns `Some(RulesMap)` if rules exist for the bucket, otherwise returns `None`. 
+ pub async fn get_rules_map(&self, bucket: &str) -> Option { + self.bucket_rules_map.get(&bucket.to_string()).await } /// Removes notification rules for a bucket - pub async fn remove_notification(&self, bucket_name: &str) { - self.bucket_rules_map.remove(&bucket_name.to_string()).await; - info!("Removed notification rules for bucket: {}", bucket_name); + /// + /// # Arguments + /// * `bucket` - The name of the bucket for which to remove notification rules + pub async fn remove_notification(&self, bucket: &str) { + self.bucket_rules_map.remove(&bucket.to_string()).await; + info!("Removed notification rules for bucket: {}", bucket); } /// Removes all targets @@ -125,69 +171,87 @@ impl EventNotifier { } /// Sends an event to the appropriate targets based on the bucket rules + /// + /// # Arguments + /// * `event` - The event to send #[instrument(skip_all)] pub async fn send(&self, event: Arc) { let bucket_name = &event.s3.bucket.name; let object_key = &event.s3.object.key; let event_name = event.event_name; - if let Some(rules) = self.bucket_rules_map.get(bucket_name).await { - let target_ids = rules.match_rules(event_name, object_key); - if target_ids.is_empty() { - debug!("No matching targets for event in bucket: {}", bucket_name); - return; - } - let target_ids_len = target_ids.len(); - let mut handles = vec![]; - // Use scope to limit the borrow scope of target_list - { - let target_list_guard = self.target_list.read().await; - info!("Sending event to targets: {:?}", target_ids); - for target_id in target_ids { - // `get` now returns Option> - if let Some(target_arc) = target_list_guard.get(&target_id) { - // Clone an Arc> (which is where target_list is stored) to move into an asynchronous task - // target_arc is already Arc, clone it for the async task - let cloned_target_for_task = target_arc.clone(); - let event_clone = event.clone(); - let target_name_for_task = cloned_target_for_task.name(); // Get the name before generating the task - debug!("Preparing to send event to target: {}", target_name_for_task); - // Use cloned data in closures to avoid borrowing conflicts - // Create an EntityTarget from the event - let entity_target: Arc> = Arc::new(EntityTarget { - object_name: object_key.to_string(), - bucket_name: bucket_name.to_string(), - event_name, - data: event_clone.clone().as_ref().clone(), - }); - let handle = tokio::spawn(async move { - if let Err(e) = cloned_target_for_task.save(entity_target.clone()).await { - error!("Failed to send event to target {}: {}", target_name_for_task, e); - } else { - debug!("Successfully saved event to target {}", target_name_for_task); - } - }); - handles.push(handle); - } else { - warn!("Target ID {:?} found in rules but not in target list.", target_id); - } - } - // target_list is automatically released here - } - - // Wait for all tasks to be completed - for handle in handles { - if let Err(e) = handle.await { - error!("Task for sending/saving event failed: {}", e); - } - } - info!("Event processing initiated for {} targets for bucket: {}", target_ids_len, bucket_name); - } else { + let Some(rules) = self.bucket_rules_map.get(bucket_name).await else { debug!("No rules found for bucket: {}", bucket_name); + return; + }; + + let target_ids = rules.match_rules(event_name, object_key); + if target_ids.is_empty() { + debug!("No matching targets for event in bucket: {}", bucket_name); + return; } + let target_ids_len = target_ids.len(); + let mut handles = vec![]; + + // Use scope to limit the borrow scope of target_list + let target_list_guard = 
self.target_list.read().await; + info!("Sending event to targets: {:?}", target_ids); + for target_id in target_ids { + // `get` now returns Option> + if let Some(target_arc) = target_list_guard.get(&target_id) { + // Clone an Arc> (which is where target_list is stored) to move into an asynchronous task + // target_arc is already Arc, clone it for the async task + let target_for_task = target_arc.clone(); + let limiter = self.send_limiter.clone(); + let event_clone = event.clone(); + let target_name_for_task = target_for_task.name(); // Get the name before generating the task + debug!("Preparing to send event to target: {}", target_name_for_task); + // Use cloned data in closures to avoid borrowing conflicts + // Create an EntityTarget from the event + let entity_target: Arc> = Arc::new(EntityTarget { + object_name: object_key.to_string(), + bucket_name: bucket_name.to_string(), + event_name, + data: event_clone.as_ref().clone(), + }); + let handle = tokio::spawn(async move { + let _permit = match limiter.acquire_owned().await { + Ok(p) => p, + Err(e) => { + error!("Failed to acquire send permit for target {}: {}", target_name_for_task, e); + return; + } + }; + if let Err(e) = target_for_task.save(entity_target.clone()).await { + error!("Failed to send event to target {}: {}", target_name_for_task, e); + } else { + debug!("Successfully saved event to target {}", target_name_for_task); + } + }); + handles.push(handle); + } else { + warn!("Target ID {:?} found in rules but not in target list.", target_id); + } + } + // target_list is automatically released here + drop(target_list_guard); + + // Wait for all tasks to be completed + for handle in handles { + if let Err(e) = handle.await { + error!("Task for sending/saving event failed: {}", e); + } + } + info!("Event processing initiated for {} targets for bucket: {}", target_ids_len, bucket_name); } /// Initializes the targets for buckets + /// + /// # Arguments + /// * `targets_to_init` - A vector of boxed targets to initialize + /// + /// # Returns + /// Returns `Ok(())` if initialization is successful, otherwise returns a `NotificationError`. #[instrument(skip(self, targets_to_init))] pub async fn init_bucket_targets( &self, @@ -195,6 +259,10 @@ impl EventNotifier { ) -> Result<(), NotificationError> { // Currently active, simpler logic let mut target_list_guard = self.target_list.write().await; //Gets a write lock for the TargetList + + // Clear existing targets first - rebuild from scratch to ensure consistency with new configuration + target_list_guard.clear(); + for target_boxed in targets_to_init { // Traverse the incoming Box debug!("init bucket target: {}", target_boxed.name()); @@ -214,6 +282,7 @@ impl EventNotifier { /// A thread-safe list of targets pub struct TargetList { + /// Map of TargetID to Target targets: HashMap + Send + Sync>>, } @@ -230,6 +299,12 @@ impl TargetList { } /// Adds a target to the list + /// + /// # Arguments + /// * `target` - The target to add + /// + /// # Returns + /// Returns `Ok(())` if the target was added successfully, or a `NotificationError` if an error occurred. pub fn add(&mut self, target: Arc + Send + Sync>) -> Result<(), NotificationError> { let id = target.id(); if self.targets.contains_key(&id) { @@ -240,8 +315,19 @@ impl TargetList { Ok(()) } + /// Clears all targets from the list + pub fn clear(&mut self) { + self.targets.clear(); + } + /// Removes a target by ID. Note: This does not stop its associated event stream. /// Stream cancellation should be handled by EventNotifier. 
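The reworked `send` above still spawns one delivery task per matched target, but each task must now hold a permit from the `send_limiter` semaphore, so in-flight deliveries stay bounded by the configured concurrency. A stripped-down sketch of that bounded fan-out, with a plain string standing in for the crate's `Target` trait and the limit passed directly instead of read from the environment:

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

/// Bounded fan-out: spawn one task per target, but allow at most `max_inflight`
/// deliveries to run concurrently by making every task hold a semaphore permit.
async fn deliver_to_all(targets: Vec<String>, max_inflight: usize) {
    let limiter = Arc::new(Semaphore::new(max_inflight));
    let mut handles = Vec::with_capacity(targets.len());

    for target in targets {
        let limiter = limiter.clone();
        handles.push(tokio::spawn(async move {
            // Acquire before doing the work; the permit is released when dropped.
            let _permit = match limiter.acquire_owned().await {
                Ok(permit) => permit,
                Err(_) => return, // semaphore closed, nothing to do
            };
            // Stand-in for `target.save(entity_target).await`.
            println!("delivering event to {target}");
        }));
    }

    for handle in handles {
        if let Err(e) = handle.await {
            eprintln!("delivery task failed: {e}");
        }
    }
}

#[tokio::main]
async fn main() {
    deliver_to_all(vec!["webhook-1".into(), "mqtt-1".into()], 2).await;
}
```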
+ /// + /// # Arguments + /// * `id` - The ID of the target to remove + /// + /// # Returns + /// Returns the removed target if it existed, otherwise `None`. pub async fn remove_target_only(&mut self, id: &TargetID) -> Option + Send + Sync>> { if let Some(target_arc) = self.targets.remove(id) { if let Err(e) = target_arc.close().await { @@ -269,6 +355,12 @@ impl TargetList { } /// Returns a target by ID + /// + /// # Arguments + /// * `id` - The ID of the target to retrieve + /// + /// # Returns + /// Returns the target if it exists, otherwise `None`. pub fn get(&self, id: &TargetID) -> Option + Send + Sync>> { self.targets.get(id).cloned() } @@ -283,7 +375,7 @@ impl TargetList { self.targets.len() } - // is_empty can be derived from len() + /// is_empty can be derived from len() pub fn is_empty(&self) -> bool { self.targets.is_empty() } diff --git a/crates/notify/src/registry.rs b/crates/notify/src/registry.rs index 9d649793..cdf3aa11 100644 --- a/crates/notify/src/registry.rs +++ b/crates/notify/src/registry.rs @@ -16,9 +16,11 @@ use crate::Event; use crate::factory::{MQTTTargetFactory, TargetFactory, WebhookTargetFactory}; use futures::stream::{FuturesUnordered, StreamExt}; use hashbrown::{HashMap, HashSet}; -use rustfs_config::{DEFAULT_DELIMITER, ENABLE_KEY, ENV_PREFIX, notify::NOTIFY_ROUTE_PREFIX}; +use rustfs_config::{DEFAULT_DELIMITER, ENABLE_KEY, ENV_PREFIX, EnableState, notify::NOTIFY_ROUTE_PREFIX}; use rustfs_ecstore::config::{Config, KVS}; use rustfs_targets::{Target, TargetError, target::ChannelTargetType}; +use std::str::FromStr; +use std::sync::Arc; use tracing::{debug, error, info, warn}; /// Registry for managing target factories @@ -117,11 +119,7 @@ impl TargetRegistry { format!("{ENV_PREFIX}{NOTIFY_ROUTE_PREFIX}{target_type}{DEFAULT_DELIMITER}{ENABLE_KEY}{DEFAULT_DELIMITER}") .to_uppercase(); for (key, value) in &all_env { - if value.eq_ignore_ascii_case(rustfs_config::EnableState::One.as_str()) - || value.eq_ignore_ascii_case(rustfs_config::EnableState::On.as_str()) - || value.eq_ignore_ascii_case(rustfs_config::EnableState::True.as_str()) - || value.eq_ignore_ascii_case(rustfs_config::EnableState::Yes.as_str()) - { + if EnableState::from_str(value).ok().map(|s| s.is_enabled()).unwrap_or(false) { if let Some(id) = key.strip_prefix(&enable_prefix) { if !id.is_empty() { instance_ids_from_env.insert(id.to_lowercase()); @@ -208,10 +206,10 @@ impl TargetRegistry { let enabled = merged_config .lookup(ENABLE_KEY) .map(|v| { - v.eq_ignore_ascii_case(rustfs_config::EnableState::One.as_str()) - || v.eq_ignore_ascii_case(rustfs_config::EnableState::On.as_str()) - || v.eq_ignore_ascii_case(rustfs_config::EnableState::True.as_str()) - || v.eq_ignore_ascii_case(rustfs_config::EnableState::Yes.as_str()) + EnableState::from_str(v.as_str()) + .ok() + .map(|s| s.is_enabled()) + .unwrap_or(false) }) .unwrap_or(false); @@ -220,10 +218,10 @@ impl TargetRegistry { // 5.3. 
Create asynchronous tasks for enabled instances let target_type_clone = target_type.clone(); let tid = id.clone(); - let merged_config_arc = std::sync::Arc::new(merged_config); + let merged_config_arc = Arc::new(merged_config); tasks.push(async move { let result = factory.create_target(tid.clone(), &merged_config_arc).await; - (target_type_clone, tid, result, std::sync::Arc::clone(&merged_config_arc)) + (target_type_clone, tid, result, Arc::clone(&merged_config_arc)) }); } else { info!(instance_id = %id, "Skip the disabled target and will be removed from the final configuration"); diff --git a/crates/notify/src/rules/config.rs b/crates/notify/src/rules/config.rs index 5be48e8d..607e6aa0 100644 --- a/crates/notify/src/rules/config.rs +++ b/crates/notify/src/rules/config.rs @@ -15,13 +15,60 @@ use super::rules_map::RulesMap; use super::xml_config::ParseConfigError as BucketNotificationConfigError; use crate::rules::NotificationConfiguration; -use crate::rules::pattern_rules; -use crate::rules::target_id_set; -use hashbrown::HashMap; +use crate::rules::subscriber_snapshot::{BucketRulesSnapshot, DynRulesContainer, RuleEvents, RulesContainer}; use rustfs_targets::EventName; use rustfs_targets::arn::TargetID; use serde::{Deserialize, Serialize}; use std::io::Read; +use std::sync::Arc; + +/// A "rule view", only used for snapshot mask/consistency verification. +/// Here we choose to generate the view by "single event" to ensure that event_mask calculation is reliable and simple. +#[derive(Debug)] +struct RuleView { + events: Vec, +} + +impl RuleEvents for RuleView { + fn subscribed_events(&self) -> &[EventName] { + &self.events + } +} + +/// Adapt RulesMap to RulesContainer. +/// Key point: The items returned by iter_rules are &dyn RuleEvents, so a RuleView list is cached in the container. +#[derive(Debug)] +struct CompiledRules { + // Keep RulesMap (can be used later if you want to make more complex judgments during the snapshot reading phase) + #[allow(dead_code)] + rules_map: RulesMap, + // for RulesContainer::iter_rules + rule_views: Vec, +} + +impl CompiledRules { + fn from_rules_map(rules_map: &RulesMap) -> Self { + let mut rule_views = Vec::new(); + + for ev in rules_map.iter_events() { + rule_views.push(RuleView { events: vec![ev] }); + } + + Self { + rules_map: rules_map.clone(), + rule_views, + } + } +} + +impl RulesContainer for CompiledRules { + type Rule = dyn RuleEvents; + + fn iter_rules<'a>(&'a self) -> Box + 'a> { + // Key: Convert &RuleView into &dyn RuleEvents + Box::new(self.rule_views.iter().map(|v| v as &dyn RuleEvents)) + } +} /// Configuration for bucket notifications. /// This struct now holds the parsed and validated rules in the new RulesMap format. @@ -119,11 +166,26 @@ impl BucketNotificationConfig { pub fn set_region(&mut self, region: &str) { self.region = region.to_string(); } -} -// Add a helper to PatternRules if not already present -impl pattern_rules::PatternRules { - pub fn inner(&self) -> &HashMap { - &self.rules + /// Compiles the current BucketNotificationConfig into a BucketRulesSnapshot. + /// This involves transforming the rules into a format suitable for runtime use, + /// and calculating the event mask based on the subscribed events of the rules. + /// + /// # Returns + /// A BucketRulesSnapshot containing the compiled rules and event mask. 
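The registry hunk above collapses the four chained `eq_ignore_ascii_case` checks into `EnableState::from_str(value).ok().map(|s| s.is_enabled()).unwrap_or(false)`. A small sketch of that parsing shape with a stand-in enum; the real `EnableState` lives in `rustfs_config`, so the variants and accepted spellings below are assumptions for illustration only:

```rust
use std::str::FromStr;

/// Stand-in for rustfs_config::EnableState, for illustration only.
#[derive(Debug, Clone, Copy)]
enum EnableState {
    On,
    Off,
}

impl FromStr for EnableState {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Accept the usual truthy/falsy spellings case-insensitively.
        match s.to_ascii_lowercase().as_str() {
            "1" | "on" | "true" | "yes" => Ok(EnableState::On),
            "0" | "off" | "false" | "no" => Ok(EnableState::Off),
            other => Err(format!("unrecognized enable value: {other}")),
        }
    }
}

impl EnableState {
    fn is_enabled(self) -> bool {
        matches!(self, EnableState::On)
    }
}

fn main() {
    // Unparseable values fall back to "disabled" instead of erroring out,
    // matching the `.ok().map(..).unwrap_or(false)` chain in the registry.
    for value in ["on", "TRUE", "0", "maybe"] {
        let enabled = EnableState::from_str(value).ok().map(|s| s.is_enabled()).unwrap_or(false);
        println!("{value} -> {enabled}");
    }
}
```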
+ pub fn compile_snapshot(&self) -> BucketRulesSnapshot { + // 1) Generate container from RulesMap + let compiled = CompiledRules::from_rules_map(self.get_rules_map()); + let rules: Arc = Arc::new(compiled) as Arc; + + // 2) Calculate event_mask + let mut mask = 0u64; + for rule in rules.iter_rules() { + for ev in rule.subscribed_events() { + mask |= ev.mask(); + } + } + + BucketRulesSnapshot { event_mask: mask, rules } } } diff --git a/crates/notify/src/rules/mod.rs b/crates/notify/src/rules/mod.rs index 69b141f4..b976ddd9 100644 --- a/crates/notify/src/rules/mod.rs +++ b/crates/notify/src/rules/mod.rs @@ -12,22 +12,24 @@ // See the License for the specific language governing permissions and // limitations under the License. +mod config; pub mod pattern; -pub mod pattern_rules; -pub mod rules_map; -pub mod target_id_set; +mod pattern_rules; +mod rules_map; +mod subscriber_index; +mod subscriber_snapshot; +mod target_id_set; pub mod xml_config; // For XML structure definition and parsing - -pub mod config; // Definition and parsing for BucketNotificationConfig +// Definition and parsing for BucketNotificationConfig // Re-export key types from submodules for easy access to `crate::rules::TypeName` // Re-export key types from submodules for external use pub use config::BucketNotificationConfig; // Assume that BucketNotificationConfigError is also defined in config.rs // Or if it is still an alias for xml_config::ParseConfigError , adjust accordingly -pub use xml_config::ParseConfigError as BucketNotificationConfigError; - pub use pattern_rules::PatternRules; pub use rules_map::RulesMap; +pub use subscriber_index::*; +pub use subscriber_snapshot::*; pub use target_id_set::TargetIdSet; -pub use xml_config::{NotificationConfiguration, ParseConfigError}; +pub use xml_config::{NotificationConfiguration, ParseConfigError, ParseConfigError as BucketNotificationConfigError}; diff --git a/crates/notify/src/rules/pattern_rules.rs b/crates/notify/src/rules/pattern_rules.rs index 20b0fe93..06b31f07 100644 --- a/crates/notify/src/rules/pattern_rules.rs +++ b/crates/notify/src/rules/pattern_rules.rs @@ -12,8 +12,8 @@ // See the License for the specific language governing permissions and // limitations under the License. -use super::pattern; -use super::target_id_set::TargetIdSet; +use crate::rules::TargetIdSet; +use crate::rules::pattern; use hashbrown::HashMap; use rayon::prelude::*; use rustfs_targets::arn::TargetID; @@ -27,31 +27,69 @@ pub struct PatternRules { } impl PatternRules { + /// Create a new, empty PatternRules. pub fn new() -> Self { Default::default() } /// Add rules: Pattern and Target ID. /// If the schema already exists, add target_id to the existing TargetIdSet. + /// + /// # Arguments + /// * `pattern` - The object name pattern. + /// * `target_id` - The TargetID to associate with the pattern. pub fn add(&mut self, pattern: String, target_id: TargetID) { self.rules.entry(pattern).or_default().insert(target_id); } /// Checks if there are any rules that match the given object name. + /// + /// # Arguments + /// * `object_name` - The object name to match against the patterns. + /// + /// # Returns + /// `true` if any pattern matches the object name, otherwise `false`. pub fn match_simple(&self, object_name: &str) -> bool { self.rules.keys().any(|p| pattern::match_simple(p, object_name)) } /// Returns all TargetIDs that match the object name. 
+ /// + /// Performance optimization points: + /// 1) Small collections are serialized directly to avoid rayon scheduling/merging overhead + /// 2) When hitting, no longer temporarily allocate TargetIdSet for each rule, but directly extend + /// + /// # Arguments + /// * `object_name` - The object name to match against the patterns. + /// + /// # Returns + /// A TargetIdSet containing all TargetIDs that match the object name. pub fn match_targets(&self, object_name: &str) -> TargetIdSet { + let n = self.rules.len(); + if n == 0 { + return TargetIdSet::new(); + } + + // Experience Threshold: Serial is usually faster below this value (can be adjusted after benchmarking) + const PAR_THRESHOLD: usize = 128; + + if n < PAR_THRESHOLD { + let mut out = TargetIdSet::new(); + for (pattern_str, target_set) in self.rules.iter() { + if pattern::match_simple(pattern_str, object_name) { + out.extend(target_set.iter().cloned()); + } + } + return out; + } + // Parallel path: Each thread accumulates a local set and finally merges it to reduce frequent allocations self.rules .par_iter() - .filter_map(|(pattern_str, target_set)| { + .fold(TargetIdSet::new, |mut local, (pattern_str, target_set)| { if pattern::match_simple(pattern_str, object_name) { - Some(target_set.iter().cloned().collect::()) - } else { - None + local.extend(target_set.iter().cloned()); } + local }) .reduce(TargetIdSet::new, |mut acc, set| { acc.extend(set); @@ -65,6 +103,11 @@ impl PatternRules { /// Merge another PatternRules. /// Corresponding to Go's `Rules.Union`. + /// # Arguments + /// * `other` - The PatternRules to merge with. + /// + /// # Returns + /// A new PatternRules containing the union of both. pub fn union(&self, other: &Self) -> Self { let mut new_rules = self.clone(); for (pattern, their_targets) in &other.rules { @@ -76,6 +119,13 @@ impl PatternRules { /// Calculate the difference from another PatternRules. /// Corresponding to Go's `Rules.Difference`. + /// The result contains only the patterns and TargetIDs that are in `self` but not in `other`. + /// + /// # Arguments + /// * `other` - The PatternRules to compare against. + /// + /// # Returns + /// A new PatternRules containing the difference. pub fn difference(&self, other: &Self) -> Self { let mut result_rules = HashMap::new(); for (pattern, self_targets) in &self.rules { @@ -94,4 +144,59 @@ impl PatternRules { } PatternRules { rules: result_rules } } + + /// Merge another PatternRules into self in place. + /// Corresponding to Go's `Rules.UnionInPlace`. + /// # Arguments + /// * `other` - The PatternRules to merge with. + pub fn union_in_place(&mut self, other: &Self) { + for (pattern, their_targets) in &other.rules { + self.rules + .entry(pattern.clone()) + .or_default() + .extend(their_targets.iter().cloned()); + } + } + + /// Calculate the difference from another PatternRules in place. + /// Corresponding to Go's `Rules.DifferenceInPlace`. + /// The result contains only the patterns and TargetIDs that are in `self` but not in `other`. + /// # Arguments + /// * `other` - The PatternRules to compare against. + pub fn difference_in_place(&mut self, other: &Self) { + self.rules.retain(|pattern, self_targets| { + if let Some(other_targets) = other.rules.get(pattern) { + // Remove other_targets from self_targets + self_targets.retain(|tid| !other_targets.contains(tid)); + } + !self_targets.is_empty() + }); + } + + /// Remove a pattern and its associated TargetID set from the PatternRules. + /// + /// # Arguments + /// * `pattern` - The pattern to remove. 
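`PatternRules::match_targets` above only reaches for rayon once the rule count passes `PAR_THRESHOLD`, and on the parallel path it folds matches into per-thread sets before one final reduce instead of allocating a temporary set per matching rule. A standalone sketch of that strategy; it uses `std` collections and a made-up predicate rather than the crate's `hashbrown` map and `pattern::match_simple`:

```rust
use std::collections::{HashMap, HashSet};
use rayon::prelude::*;

/// Collect the values of every entry whose key matches `predicate`,
/// going parallel only when the map is large enough to pay for it.
fn matching_values(rules: &HashMap<String, HashSet<u32>>, predicate: impl Fn(&str) -> bool + Sync) -> HashSet<u32> {
    const PAR_THRESHOLD: usize = 128;

    if rules.len() < PAR_THRESHOLD {
        // Serial fast path: extend one accumulator, no rayon scheduling overhead.
        let mut out = HashSet::new();
        for (pattern, ids) in rules {
            if predicate(pattern) {
                out.extend(ids.iter().copied());
            }
        }
        return out;
    }

    // Parallel path: each worker folds matches into a thread-local set,
    // and the partial sets are merged once at the end.
    rules
        .par_iter()
        .fold(HashSet::new, |mut local, (pattern, ids)| {
            if predicate(pattern) {
                local.extend(ids.iter().copied());
            }
            local
        })
        .reduce(HashSet::new, |mut acc, set| {
            acc.extend(set);
            acc
        })
}

fn main() {
    let mut rules = HashMap::new();
    rules.insert("logs/*".to_string(), HashSet::from([1u32, 2]));
    rules.insert("images/*".to_string(), HashSet::from([3u32]));
    let hits = matching_values(&rules, |pattern| pattern.starts_with("logs/"));
    assert_eq!(hits, HashSet::from([1, 2]));
}
```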
+ pub fn remove_pattern(&mut self, pattern: &str) -> bool { + self.rules.remove(pattern).is_some() + } + + /// Determine whether the current PatternRules contains the specified TargetID (referenced by any pattern). + /// + /// # Parameters + /// * `target_id` - The TargetID to check for existence within the PatternRules + /// + /// # Returns + /// * `true` if the TargetID exists in any of the patterns; `false` otherwise. + pub fn contains_target_id(&self, target_id: &TargetID) -> bool { + self.rules.values().any(|set| set.contains(target_id)) + } + + /// Expose the internal rules for use in scenarios such as BucketNotificationConfig::validate. + /// + /// # Returns + /// A reference to the internal HashMap of patterns to TargetIdSets. + pub fn inner(&self) -> &HashMap { + &self.rules + } } diff --git a/crates/notify/src/rules/rules_map.rs b/crates/notify/src/rules/rules_map.rs index 59bb9c6c..c0f29675 100644 --- a/crates/notify/src/rules/rules_map.rs +++ b/crates/notify/src/rules/rules_map.rs @@ -12,8 +12,7 @@ // See the License for the specific language governing permissions and // limitations under the License. -use super::pattern_rules::PatternRules; -use super::target_id_set::TargetIdSet; +use crate::rules::{PatternRules, TargetIdSet}; use hashbrown::HashMap; use rustfs_targets::EventName; use rustfs_targets::arn::TargetID; @@ -31,6 +30,9 @@ pub struct RulesMap { impl RulesMap { /// Create a new, empty RulesMap. + /// + /// # Returns + /// A new instance of RulesMap with an empty map and a total_events_mask set to 0. pub fn new() -> Self { Default::default() } @@ -67,12 +69,12 @@ impl RulesMap { /// Merge another RulesMap. /// `RulesMap.Add(rulesMap2 RulesMap) corresponding to Go + /// + /// # Parameters + /// * `other_map` - The other RulesMap to be merged into the current one. pub fn add_map(&mut self, other_map: &Self) { for (event_name, other_pattern_rules) in &other_map.map { - let self_pattern_rules = self.map.entry(*event_name).or_default(); - // PatternRules::union Returns the new PatternRules, we need to modify the existing ones - let merged_rules = self_pattern_rules.union(other_pattern_rules); - *self_pattern_rules = merged_rules; + self.map.entry(*event_name).or_default().union_in_place(other_pattern_rules); } // Directly merge two masks. self.total_events_mask |= other_map.total_events_mask; @@ -81,11 +83,14 @@ impl RulesMap { /// Remove another rule defined in the RulesMap from the current RulesMap. /// /// After the rule is removed, `total_events_mask` is recalculated to ensure its accuracy. + /// + /// # Parameters + /// * `other_map` - The other RulesMap containing rules to be removed from the current one. pub fn remove_map(&mut self, other_map: &Self) { let mut events_to_remove = Vec::new(); for (event_name, self_pattern_rules) in &mut self.map { if let Some(other_pattern_rules) = other_map.map.get(event_name) { - *self_pattern_rules = self_pattern_rules.difference(other_pattern_rules); + self_pattern_rules.difference_in_place(other_pattern_rules); if self_pattern_rules.is_empty() { events_to_remove.push(*event_name); } @@ -102,6 +107,9 @@ impl RulesMap { /// /// This method uses a bitmask for a quick check of O(1) complexity. /// `event_name` can be a compound type, such as `ObjectCreatedAll`. + /// + /// # Parameters + /// * `event_name` - The event name to check for subscribers. 
pub fn has_subscriber(&self, event_name: &EventName) -> bool { // event_name.mask() will handle compound events correctly (self.total_events_mask & event_name.mask()) != 0 @@ -112,39 +120,54 @@ impl RulesMap { /// # Notice /// The `event_name` parameter should be a specific, non-compound event type. /// Because this is taken from the `Event` object that actually occurs. + /// + /// # Parameters + /// * `event_name` - The specific event name to match against. + /// * `object_key` - The object key to match against the patterns in the rules. + /// + /// # Returns + /// * A set of TargetIDs that match the given event and object key. pub fn match_rules(&self, event_name: EventName, object_key: &str) -> TargetIdSet { // Use bitmask to quickly determine whether there is a matching rule if (self.total_events_mask & event_name.mask()) == 0 { return TargetIdSet::new(); // No matching rules } - // First try to directly match the event name - if let Some(pattern_rules) = self.map.get(&event_name) { - let targets = pattern_rules.match_targets(object_key); - if !targets.is_empty() { - return targets; - } - } - // Go's RulesMap[eventName] is directly retrieved, and if it does not exist, it is empty Rules. - // Rust's HashMap::get returns Option. If the event name does not exist, there is no rule. - // Compound events (such as ObjectCreatedAll) have been expanded as a single event when add_rule_config. - // Therefore, a single event name should be used when querying. - // If event_name itself is a single type, look it up directly. - // If event_name is a compound type, Go's logic is expanded when added. - // Here match_rules should receive events that may already be single. - // If the caller passes in a compound event, it should expand itself or handle this function first. - // Assume that event_name is already a specific event that can be used for searching. + // In Go, RulesMap[eventName] returns empty rules if the key doesn't exist. + // Rust's HashMap::get returns Option, so missing key means no rules. + // Compound events like ObjectCreatedAll are expanded into specific events during add_rule_config. + // Thus, queries should use specific event names. + // If event_name is compound, expansion happens at addition time. + // match_rules assumes event_name is already a specific event for lookup. + // Callers should expand compound events before calling this method. self.map .get(&event_name) .map_or_else(TargetIdSet::new, |pr| pr.match_targets(object_key)) } /// Check if RulesMap is empty. + /// + /// # Returns + /// * `true` if there are no rules in the map; `false` otherwise pub fn is_empty(&self) -> bool { self.map.is_empty() } + /// Determine whether the current RulesMap contains the specified TargetID (referenced by any event / pattern). + /// + /// # Parameters + /// * `target_id` - The TargetID to check for existence within the RulesMap + /// + /// # Returns + /// * `true` if the TargetID exists in any of the PatternRules; `false` otherwise. + pub fn contains_target_id(&self, target_id: &TargetID) -> bool { + self.map.values().any(|pr| pr.contains_target_id(target_id)) + } + /// Returns a clone of internal rules for use in scenarios such as BucketNotificationConfig::validate. + /// + /// # Returns + /// A reference to the internal HashMap of EventName to PatternRules. pub fn inner(&self) -> &HashMap { &self.map } @@ -160,18 +183,32 @@ impl RulesMap { } /// Remove rules and optimize performance + /// + /// # Parameters + /// * `event_name` - The EventName from which to remove the rule. 
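`RulesMap` keeps `total_events_mask` up to date so that `has_subscriber` and the early return in `match_rules` cost a single AND instead of a map walk. A tiny illustration of the bit-flag technique; the constants below are invented stand-ins for what `EventName::mask()` returns:

```rust
// Illustrative per-event bit flags; the real values come from EventName::mask().
const OBJECT_CREATED_PUT: u64 = 1 << 0;
const OBJECT_CREATED_COPY: u64 = 1 << 1;
const OBJECT_REMOVED_DELETE: u64 = 1 << 2;

// A compound "created" event is just the OR of its member events.
const OBJECT_CREATED_ALL: u64 = OBJECT_CREATED_PUT | OBJECT_CREATED_COPY;

fn main() {
    // Built incrementally as rules are added, like RulesMap::total_events_mask.
    let total_events_mask = OBJECT_CREATED_PUT | OBJECT_CREATED_COPY;

    // O(1) subscription check: any bit overlap means at least one rule cares.
    assert!(total_events_mask & OBJECT_CREATED_ALL != 0);
    assert!(total_events_mask & OBJECT_REMOVED_DELETE == 0);

    // Removing rules can clear bits, which is why remove_rule recalculates
    // the mask after pruning empty entries.
}
```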
+ /// * `pattern` - The pattern of the rule to be removed. #[allow(dead_code)] pub fn remove_rule(&mut self, event_name: &EventName, pattern: &str) { + let mut remove_event = false; + if let Some(pattern_rules) = self.map.get_mut(event_name) { - pattern_rules.rules.remove(pattern); + pattern_rules.remove_pattern(pattern); if pattern_rules.is_empty() { - self.map.remove(event_name); + remove_event = true; } } + + if remove_event { + self.map.remove(event_name); + } + self.recalculate_mask(); // Delay calculation mask } - /// Batch Delete Rules + /// Batch Delete Rules and Optimize Performance + /// + /// # Parameters + /// * `event_names` - A slice of EventNames to be removed. #[allow(dead_code)] pub fn remove_rules(&mut self, event_names: &[EventName]) { for event_name in event_names { @@ -181,9 +218,27 @@ impl RulesMap { } /// Update rules and optimize performance + /// + /// # Parameters + /// * `event_name` - The EventName to update. + /// * `pattern` - The pattern of the rule to be updated. + /// * `target_id` - The TargetID to be added. #[allow(dead_code)] pub fn update_rule(&mut self, event_name: EventName, pattern: String, target_id: TargetID) { self.map.entry(event_name).or_default().add(pattern, target_id); self.total_events_mask |= event_name.mask(); // Update only the relevant bitmask } + + /// Iterate all EventName keys contained in this RulesMap. + /// + /// Used by snapshot compilation to compute bucket event_mask. + /// + /// # Returns + /// An iterator over all EventName keys in the RulesMap. + #[inline] + pub fn iter_events(&self) -> impl Iterator + '_ { + // `inner()` is already used by config.rs, so we reuse it here. + // If the key type is `EventName`, `.copied()` is the cheapest way to return values. + self.inner().keys().copied() + } } diff --git a/crates/notify/src/rules/subscriber_index.rs b/crates/notify/src/rules/subscriber_index.rs new file mode 100644 index 00000000..205bc58a --- /dev/null +++ b/crates/notify/src/rules/subscriber_index.rs @@ -0,0 +1,131 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +use crate::rules::{BucketRulesSnapshot, BucketSnapshotRef, DynRulesContainer}; +use arc_swap::ArcSwap; +use rustfs_targets::EventName; +use starshard::ShardedHashMap; +use std::fmt; +use std::sync::Arc; + +/// A global bucket -> snapshot index. +/// +/// Read path: lock-free load (ArcSwap) +/// Write path: atomic replacement after building a new snapshot +pub struct SubscriberIndex { + // Use starshard for sharding to reduce lock competition when the number of buckets is large + inner: ShardedHashMap>>>, + // Cache an "empty rule container" for empty snapshots (avoids building every time) + empty_rules: Arc, +} + +/// Avoid deriving fields that do not support Debug +impl fmt::Debug for SubscriberIndex { + fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { + f.debug_struct("SubscriberIndex").finish_non_exhaustive() + } +} + +impl SubscriberIndex { + /// Create a new SubscriberIndex. 
+ /// + /// # Arguments + /// * `empty_rules` - An Arc to an empty rules container used for empty snapshots + /// + /// Returns a new instance of SubscriberIndex. + pub fn new(empty_rules: Arc) -> Self { + Self { + inner: ShardedHashMap::new(64), + empty_rules, + } + } + + /// Get the current snapshot of a bucket. + /// If it does not exist, return empty snapshot. + /// + /// # Arguments + /// * `bucket` - The name of the bucket to load. + /// + /// Returns the snapshot reference for the specified bucket. + pub fn load_snapshot(&self, bucket: &str) -> BucketSnapshotRef { + match self.inner.get(&bucket.to_string()) { + Some(cell) => cell.load_full(), + None => Arc::new(BucketRulesSnapshot::empty(self.empty_rules.clone())), + } + } + + /// Quickly determine whether the bucket has a subscription to an event. + /// This judgment can be consistent with subsequent rule matching when reading the same snapshot. + /// + /// # Arguments + /// * `bucket` - The name of the bucket to check. + /// * `event` - The event name to check for subscriptions. + /// + /// Returns `true` if there are subscribers for the event, `false` otherwise. + #[inline] + pub fn has_subscriber(&self, bucket: &str, event: &EventName) -> bool { + let snap = self.load_snapshot(bucket); + if snap.event_mask == 0 { + return false; + } + snap.has_event(event) + } + + /// Atomically update a bucket's snapshot (whole package replacement). + /// + /// - The caller first builds the complete `BucketRulesSnapshot` (including event\_mask and rules). + /// - This method ensures that the read path will not observe intermediate states. + /// + /// # Arguments + /// * `bucket` - The name of the bucket to update. + /// * `new_snapshot` - The new snapshot to store for the bucket. + pub fn store_snapshot(&self, bucket: &str, new_snapshot: BucketRulesSnapshot) { + let key = bucket.to_string(); + + let cell = self.inner.get(&key).unwrap_or_else(|| { + // Insert a default cell (empty snapshot) + let init = Arc::new(ArcSwap::from_pointee(BucketRulesSnapshot::empty(self.empty_rules.clone()))); + self.inner.insert(key.clone(), init.clone()); + init + }); + + cell.store(Arc::new(new_snapshot)); + } + + /// Delete the bucket's subscription view (make it empty). + /// + /// # Arguments + /// * `bucket` - The name of the bucket to clear. + pub fn clear_bucket(&self, bucket: &str) { + if let Some(cell) = self.inner.get(&bucket.to_string()) { + cell.store(Arc::new(BucketRulesSnapshot::empty(self.empty_rules.clone()))); + } + } +} + +impl Default for SubscriberIndex { + fn default() -> Self { + // An available empty rule container is required; here it is implemented using minimal empty + #[derive(Debug)] + struct EmptyRules; + impl crate::rules::subscriber_snapshot::RulesContainer for EmptyRules { + type Rule = dyn crate::rules::subscriber_snapshot::RuleEvents; + fn iter_rules<'a>(&'a self) -> Box + 'a> { + Box::new(std::iter::empty()) + } + } + + Self::new(Arc::new(EmptyRules) as Arc) + } +} diff --git a/crates/notify/src/rules/subscriber_snapshot.rs b/crates/notify/src/rules/subscriber_snapshot.rs new file mode 100644 index 00000000..4eed5d28 --- /dev/null +++ b/crates/notify/src/rules/subscriber_snapshot.rs @@ -0,0 +1,117 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
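`SubscriberIndex` above keeps one `ArcSwap` cell per bucket, so readers take a lock-free `load_full()` while writers publish a fully built snapshot with a single `store()`. A minimal sketch of that read/replace pattern using the `arc-swap` crate and a toy snapshot type (the struct and values are illustrative):

```rust
use arc_swap::ArcSwap;
use std::sync::Arc;

/// Toy stand-in for BucketRulesSnapshot: an event mask plus some rule data.
#[derive(Debug)]
struct Snapshot {
    event_mask: u64,
    rules: Vec<String>,
}

fn main() {
    // Start from an empty snapshot; readers never observe a partially built one.
    let cell = ArcSwap::from_pointee(Snapshot { event_mask: 0, rules: Vec::new() });

    // Read path: a cheap atomic load, no locks held while matching rules.
    let snap: Arc<Snapshot> = cell.load_full();
    assert_eq!(snap.event_mask, 0);

    // Write path: build the complete new snapshot off to the side,
    // then publish it in one atomic swap.
    cell.store(Arc::new(Snapshot {
        event_mask: 0b11,
        rules: vec!["logs/*".to_string()],
    }));

    // Readers holding the old Arc keep using it; new loads see the update.
    assert_eq!(cell.load_full().event_mask, 0b11);
    assert_eq!(snap.event_mask, 0); // the earlier snapshot is unchanged
}
```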
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +use rustfs_targets::EventName; +use std::sync::Arc; + +/// Let the rules structure provide "what events it is subscribed to". +/// This way BucketRulesSnapshot does not need to know the internal shape of rules. +pub trait RuleEvents { + fn subscribed_events(&self) -> &[EventName]; +} + +/// Let the rules container provide the ability to iterate over all rules (abstracting only to the minimum necessary). +pub trait RulesContainer { + type Rule: RuleEvents + ?Sized; + fn iter_rules<'a>(&'a self) -> Box + 'a>; + + /// Fast empty judgment for snapshots (fix missing `rules.is_empty()`) + fn is_empty(&self) -> bool { + self.iter_rules().next().is_none() + } +} + +/// Represents a bucket's notification subscription view snapshot (immutable). +/// +/// - `event_mask`: Quickly determine whether there is a subscription to a certain type of event (bitset/flags). +/// - `rules`: precise rule mapping (prefix/suffix/pattern -> targets). +/// +/// The read path only reads this snapshot to ensure consistency. +#[derive(Debug, Clone)] +pub struct BucketRulesSnapshot +where + R: RulesContainer + ?Sized, +{ + pub event_mask: u64, + pub rules: Arc, +} + +impl BucketRulesSnapshot +where + R: RulesContainer + ?Sized, +{ + /// Create an empty snapshot with no subscribed events and no rules. + /// + /// # Arguments + /// * `rules` - An Arc to a rules container (can be an empty container). + /// + /// # Returns + /// An instance of `BucketRulesSnapshot` with an empty event mask. + #[inline] + pub fn empty(rules: Arc) -> Self { + Self { event_mask: 0, rules } + } + + /// Check if the snapshot has any subscribers for the specified event. + /// + /// # Arguments + /// * `event` - The event name to check for subscriptions. + /// + /// # Returns + /// `true` if there are subscribers for the event, `false` otherwise. + #[inline] + pub fn has_event(&self, event: &EventName) -> bool { + (self.event_mask & event.mask()) != 0 + } + + /// Check if the snapshot is empty (no subscribed events or rules). + /// + /// # Returns + /// `true` if the snapshot is empty, `false` otherwise. + #[inline] + pub fn is_empty(&self) -> bool { + self.event_mask == 0 || self.rules.is_empty() + } + + /// [debug] Assert that `event_mask` is consistent with the event declared in `rules`. + /// + /// Constraints: + /// - only runs in debug builds (release incurs no cost). + /// - If the rule contains compound events (\*All / Everything), rely on `EventName::mask()` to automatically expand. + #[inline] + pub fn debug_assert_mask_consistent(&self) { + #[cfg(debug_assertions)] + { + let mut recomputed = 0u64; + for rule in self.rules.iter_rules() { + for ev in rule.subscribed_events() { + recomputed |= ev.mask(); + } + } + + debug_assert!( + recomputed == self.event_mask, + "BucketRulesSnapshot.event_mask inconsistent: stored={:#x}, recomputed={:#x}", + self.event_mask, + recomputed + ); + } + } +} + +/// Unify trait-object snapshot types (fix Sized / missing generic arguments) +pub type DynRulesContainer = dyn RulesContainer + Send + Sync; + +/// Expose Arc form to facilitate sharing. 
+pub type BucketSnapshotRef = Arc>; diff --git a/crates/notify/src/rules/xml_config.rs b/crates/notify/src/rules/xml_config.rs index 134f0db2..698167d6 100644 --- a/crates/notify/src/rules/xml_config.rs +++ b/crates/notify/src/rules/xml_config.rs @@ -12,7 +12,7 @@ // See the License for the specific language governing permissions and // limitations under the License. -use super::pattern; +use crate::rules::pattern; use hashbrown::HashSet; use rustfs_targets::EventName; use rustfs_targets::arn::{ARN, ArnError, TargetIDError}; diff --git a/crates/notify/src/stream.rs b/crates/notify/src/stream.rs index 9b37c13b..8c70d3c2 100644 --- a/crates/notify/src/stream.rs +++ b/crates/notify/src/stream.rs @@ -13,18 +13,23 @@ // limitations under the License. use crate::{Event, integration::NotificationMetrics}; -use rustfs_targets::StoreError; -use rustfs_targets::Target; -use rustfs_targets::TargetError; -use rustfs_targets::store::{Key, Store}; -use rustfs_targets::target::EntityTarget; +use rustfs_targets::{ + StoreError, Target, TargetError, + store::{Key, Store}, + target::EntityTarget, +}; use std::sync::Arc; use std::time::{Duration, Instant}; use tokio::sync::{Semaphore, mpsc}; use tokio::time::sleep; use tracing::{debug, error, info, warn}; -/// Streams events from the store to the target +/// Streams events from the store to the target with retry logic +/// +/// # Arguments +/// - `store`: The event store +/// - `target`: The target to send events to +/// - `cancel_rx`: Receiver to listen for cancellation signals pub async fn stream_events( store: &mut (dyn Store + Send), target: &dyn Target, @@ -67,6 +72,7 @@ pub async fn stream_events( match target.send_from_store(key.clone()).await { Ok(_) => { info!("Successfully sent event for target: {}", target.name()); + // send_from_store deletes the event from store on success success = true; } Err(e) => { @@ -104,6 +110,13 @@ pub async fn stream_events( } /// Starts the event streaming process for a target +/// +/// # Arguments +/// - `store`: The event store +/// - `target`: The target to send events to +/// +/// # Returns +/// A sender to signal cancellation of the event stream pub fn start_event_stream( mut store: Box + Send>, target: Arc + Send + Sync>, @@ -119,6 +132,15 @@ pub fn start_event_stream( } /// Start event stream with batch processing +/// +/// # Arguments +/// - `store`: The event store +/// - `target`: The target to send events to clients +/// - `metrics`: Metrics for monitoring +/// - `semaphore`: Semaphore to limit concurrency +/// +/// # Returns +/// A sender to signal cancellation of the event stream pub fn start_event_stream_with_batching( mut store: Box, Error = StoreError, Key = Key> + Send>, target: Arc + Send + Sync>, @@ -136,6 +158,16 @@ pub fn start_event_stream_with_batching( } /// Event stream processing with batch processing +/// +/// # Arguments +/// - `store`: The event store +/// - `target`: The target to send events to clients +/// - `cancel_rx`: Receiver to listen for cancellation signals +/// - `metrics`: Metrics for monitoring +/// - `semaphore`: Semaphore to limit concurrency +/// +/// # Notes +/// This function processes events in batches to improve efficiency. 
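`start_event_stream_with_batching` and `stream_events_with_batching` below drain events in groups instead of one at a time. The crate reads batches out of a persistent `Store`, but the underlying shape (accumulate up to a size cap or until a flush deadline, then process the whole batch) is the same as this generic sketch over a tokio channel; the channel, sizes and timings are illustrative, not the crate's values:

```rust
use std::time::Duration;
use tokio::sync::mpsc;
use tokio::time::timeout;

async fn process_batch(batch: Vec<String>) {
    // Stand-in for sending the batch to a target and deleting it from the store.
    println!("processing {} events", batch.len());
}

async fn run_batching_loop(mut rx: mpsc::Receiver<String>) {
    const MAX_BATCH: usize = 32;
    const FLUSH_INTERVAL: Duration = Duration::from_millis(200);

    let mut batch = Vec::with_capacity(MAX_BATCH);
    loop {
        // Wait for the next event, but never longer than the flush interval.
        match timeout(FLUSH_INTERVAL, rx.recv()).await {
            Ok(Some(event)) => {
                batch.push(event);
                if batch.len() >= MAX_BATCH {
                    process_batch(std::mem::take(&mut batch)).await;
                }
            }
            // Timer fired: flush whatever has accumulated so far.
            Err(_) => {
                if !batch.is_empty() {
                    process_batch(std::mem::take(&mut batch)).await;
                }
            }
            // Channel closed: flush the tail and stop.
            Ok(None) => {
                if !batch.is_empty() {
                    process_batch(batch).await;
                }
                break;
            }
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(1024);
    let worker = tokio::spawn(run_batching_loop(rx));
    for i in 0..100 {
        tx.send(format!("event-{i}")).await.expect("receiver alive");
    }
    drop(tx); // closing the channel lets the loop flush and exit
    worker.await.expect("batching task panicked");
}
```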
pub async fn stream_events_with_batching( store: &mut (dyn Store, Error = StoreError, Key = Key> + Send), target: &dyn Target, @@ -231,7 +263,17 @@ pub async fn stream_events_with_batching( } } -/// Processing event batches +/// Processing event batches for targets +/// # Arguments +/// - `batch`: The batch of events to process +/// - `batch_keys`: The corresponding keys of the events in the batch +/// - `target`: The target to send events to clients +/// - `max_retries`: Maximum number of retries for sending an event +/// - `base_delay`: Base delay duration for retries +/// - `metrics`: Metrics for monitoring +/// - `semaphore`: Semaphore to limit concurrency +/// # Notes +/// This function processes a batch of events, sending each event to the target with retry async fn process_batch( batch: &mut Vec>, batch_keys: &mut Vec, @@ -262,6 +304,7 @@ async fn process_batch( // Retry logic while retry_count < max_retries && !success { + // After sending successfully, the event in the storage is deleted synchronously. match target.send_from_store(key.clone()).await { Ok(_) => { info!("Successfully sent event for target: {}, Key: {}", target.name(), key.to_string()); diff --git a/crates/obs/src/telemetry.rs b/crates/obs/src/telemetry.rs index e2c5baf7..2aa2642c 100644 --- a/crates/obs/src/telemetry.rs +++ b/crates/obs/src/telemetry.rs @@ -39,9 +39,9 @@ use rustfs_config::{ ENV_OBS_LOG_DIRECTORY, ENV_OBS_LOG_FLUSH_MS, ENV_OBS_LOG_MESSAGE_CAPA, ENV_OBS_LOG_POOL_CAPA, }, }; -use rustfs_utils::{get_env_u64, get_env_usize, get_local_ip_with_default}; +use rustfs_utils::{get_env_opt_str, get_env_u64, get_env_usize, get_local_ip_with_default}; use smallvec::SmallVec; -use std::{borrow::Cow, env, fs, io::IsTerminal, time::Duration}; +use std::{borrow::Cow, fs, io::IsTerminal, time::Duration}; use tracing::info; use tracing_error::ErrorLayer; use tracing_opentelemetry::{MetricsLayer, OpenTelemetryLayer}; @@ -574,8 +574,8 @@ pub(crate) fn init_telemetry(config: &OtelConfig) -> Result>) -> bool { + pub async fn evaluate(&self, values: &HashMap>) -> bool { + self.evaluate_with_resolver(values, None).await + } + + pub async fn evaluate_with_resolver( + &self, + values: &HashMap>, + resolver: Option<&dyn PolicyVariableResolver>, + ) -> bool { for c in self.for_any_value.iter() { - if !c.evaluate(false, values) { + if !c.evaluate_with_resolver(false, values, resolver).await { return false; } } for c in self.for_all_values.iter() { - if !c.evaluate(true, values) { + if !c.evaluate_with_resolver(true, values, resolver).await { return false; } } for c in self.for_normal.iter() { - if !c.evaluate(false, values) { + if !c.evaluate_with_resolver(false, values, resolver).await { return false; } } diff --git a/crates/policy/src/policy/function/condition.rs b/crates/policy/src/policy/function/condition.rs index 85c0db36..5792f252 100644 --- a/crates/policy/src/policy/function/condition.rs +++ b/crates/policy/src/policy/function/condition.rs @@ -12,6 +12,7 @@ // See the License for the specific language governing permissions and // limitations under the License. 
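The `process_batch` helper in the `stream.rs` hunk above retries `send_from_store` up to `max_retries` times with a delay derived from `base_delay` before giving up on an event. A generic sketch of that retry loop, assuming tokio; the linear backoff and the simulated transient failure are illustrative, not the crate's exact policy:

```rust
use std::time::Duration;
use tokio::time::sleep;

#[derive(Debug)]
struct SendError(String);

/// Stand-in for target.send_from_store(key): fails until the third attempt.
async fn send_once(attempt: u32) -> Result<(), SendError> {
    if attempt < 2 {
        Err(SendError(format!("transient failure on attempt {attempt}")))
    } else {
        Ok(())
    }
}

/// Retry with a linearly growing delay, giving up after `max_retries` attempts.
async fn send_with_retry(max_retries: u32, base_delay: Duration) -> Result<(), SendError> {
    let mut attempt = 0;
    loop {
        match send_once(attempt).await {
            Ok(()) => return Ok(()),
            Err(e) if attempt + 1 < max_retries => {
                eprintln!("attempt {attempt} failed: {e:?}, retrying");
                // Back off a little longer after every failure before trying again.
                sleep(base_delay * (attempt + 1)).await;
                attempt += 1;
            }
            Err(e) => return Err(e),
        }
    }
}

#[tokio::main]
async fn main() {
    let result = send_with_retry(5, Duration::from_millis(100)).await;
    assert!(result.is_ok());
}
```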
+use crate::policy::variables::PolicyVariableResolver; use serde::Deserialize; use serde::de::{Error, MapAccess}; use serde::ser::SerializeMap; @@ -106,16 +107,21 @@ impl Condition { } } - pub fn evaluate(&self, for_all: bool, values: &HashMap>) -> bool { + pub async fn evaluate_with_resolver( + &self, + for_all: bool, + values: &HashMap>, + resolver: Option<&dyn PolicyVariableResolver>, + ) -> bool { use Condition::*; let r = match self { - StringEquals(s) => s.evaluate(for_all, false, false, false, values), - StringNotEquals(s) => s.evaluate(for_all, false, false, true, values), - StringEqualsIgnoreCase(s) => s.evaluate(for_all, true, false, false, values), - StringNotEqualsIgnoreCase(s) => s.evaluate(for_all, true, false, true, values), - StringLike(s) => s.evaluate(for_all, false, true, false, values), - StringNotLike(s) => s.evaluate(for_all, false, true, true, values), + StringEquals(s) => s.evaluate_with_resolver(for_all, false, false, false, values, resolver).await, + StringNotEquals(s) => s.evaluate_with_resolver(for_all, false, false, true, values, resolver).await, + StringEqualsIgnoreCase(s) => s.evaluate_with_resolver(for_all, true, false, false, values, resolver).await, + StringNotEqualsIgnoreCase(s) => s.evaluate_with_resolver(for_all, true, false, true, values, resolver).await, + StringLike(s) => s.evaluate_with_resolver(for_all, false, true, false, values, resolver).await, + StringNotLike(s) => s.evaluate_with_resolver(for_all, false, true, true, values, resolver).await, BinaryEquals(s) => s.evaluate(values), IpAddress(s) => s.evaluate(values), NotIpAddress(s) => s.evaluate(values), diff --git a/crates/policy/src/policy/function/string.rs b/crates/policy/src/policy/function/string.rs index 29e098c4..f7207feb 100644 --- a/crates/policy/src/policy/function/string.rs +++ b/crates/policy/src/policy/function/string.rs @@ -21,26 +21,30 @@ use std::{borrow::Cow, collections::HashMap}; use crate::policy::function::func::FuncKeyValue; use crate::policy::utils::wildcard; +use futures::future; use serde::{Deserialize, Deserializer, Serialize, de, ser::SerializeSeq}; use super::{func::InnerFunc, key_name::KeyName}; +use crate::policy::variables::PolicyVariableResolver; pub type StringFunc = InnerFunc; impl StringFunc { - pub(crate) fn evaluate( + #[allow(clippy::too_many_arguments)] + pub(crate) async fn evaluate_with_resolver( &self, for_all: bool, ignore_case: bool, like: bool, negate: bool, values: &HashMap>, + resolver: Option<&dyn PolicyVariableResolver>, ) -> bool { for inner in self.0.iter() { let result = if like { - inner.eval_like(for_all, values) ^ negate + inner.eval_like(for_all, values, resolver).await ^ negate } else { - inner.eval(for_all, ignore_case, values) ^ negate + inner.eval(for_all, ignore_case, values, resolver).await ^ negate }; if !result { @@ -53,7 +57,13 @@ impl StringFunc { } impl FuncKeyValue { - fn eval(&self, for_all: bool, ignore_case: bool, values: &HashMap>) -> bool { + async fn eval( + &self, + for_all: bool, + ignore_case: bool, + values: &HashMap>, + resolver: Option<&dyn PolicyVariableResolver>, + ) -> bool { let rvalues = values // http.CanonicalHeaderKey ? 
.get(self.key.name().as_str()) @@ -70,12 +80,20 @@ impl FuncKeyValue { }) .unwrap_or_default(); - let fvalues = self - .values - .0 - .iter() - .map(|c| { - let mut c = Cow::from(c); + let resolved_values: Vec> = futures::future::join_all(self.values.0.iter().map(|c| async { + if let Some(res) = resolver { + super::super::variables::resolve_aws_variables(c, res).await + } else { + vec![c.to_string()] + } + })) + .await; + + let fvalues = resolved_values + .into_iter() + .flatten() + .map(|resolved_c| { + let mut c = Cow::from(resolved_c); for key in KeyName::COMMON_KEYS { match values.get(key.name()).and_then(|x| x.first()) { Some(v) if !v.is_empty() => return Cow::Owned(c.to_mut().replace(&key.var_name(), v)), @@ -97,15 +115,32 @@ impl FuncKeyValue { } } - fn eval_like(&self, for_all: bool, values: &HashMap>) -> bool { + async fn eval_like( + &self, + for_all: bool, + values: &HashMap>, + resolver: Option<&dyn PolicyVariableResolver>, + ) -> bool { if let Some(rvalues) = values.get(self.key.name().as_str()) { for v in rvalues.iter() { - let matched = self + let resolved_futures: Vec<_> = self .values .0 .iter() - .map(|c| { - let mut c = Cow::from(c); + .map(|c| async { + if let Some(res) = resolver { + super::super::variables::resolve_aws_variables(c, res).await + } else { + vec![c.to_string()] + } + }) + .collect(); + let resolved_values = future::join_all(resolved_futures).await; + let matched = resolved_values + .into_iter() + .flatten() + .map(|resolved_c| { + let mut c = Cow::from(resolved_c); for key in KeyName::COMMON_KEYS { match values.get(key.name()).and_then(|x| x.first()) { Some(v) if !v.is_empty() => return Cow::Owned(c.to_mut().replace(&key.var_name(), v)), @@ -214,6 +249,7 @@ mod tests { key_name::AwsKeyName::*, key_name::KeyName::{self, *}, }; + use std::collections::HashMap; use crate::policy::function::key_name::S3KeyName::S3LocationConstraint; use test_case::test_case; @@ -275,16 +311,13 @@ mod tests { negate: bool, values: Vec<(&str, Vec<&str>)>, ) -> bool { - let result = s.eval( - for_all, - ignore_case, - &values - .into_iter() - .map(|(k, v)| (k.to_owned(), v.into_iter().map(ToOwned::to_owned).collect::>())) - .collect(), - ); + let map: HashMap> = values + .into_iter() + .map(|(k, v)| (k.to_owned(), v.into_iter().map(ToOwned::to_owned).collect::>())) + .collect(); + let result = s.eval(for_all, ignore_case, &map, None); - result ^ negate + pollster::block_on(result) ^ negate } #[test_case(new_fkv("s3:x-amz-copy-source", vec!["mybucket/myobject"]), false, vec![("x-amz-copy-source", vec!["mybucket/myobject"])] => true ; "1")] @@ -380,15 +413,13 @@ mod tests { } fn test_eval_like(s: FuncKeyValue, for_all: bool, negate: bool, values: Vec<(&str, Vec<&str>)>) -> bool { - let result = s.eval_like( - for_all, - &values - .into_iter() - .map(|(k, v)| (k.to_owned(), v.into_iter().map(ToOwned::to_owned).collect::>())) - .collect(), - ); + let map: HashMap> = values + .into_iter() + .map(|(k, v)| (k.to_owned(), v.into_iter().map(ToOwned::to_owned).collect::>())) + .collect(); + let result = s.eval_like(for_all, &map, None); - result ^ negate + pollster::block_on(result) ^ negate } #[test_case(new_fkv("s3:x-amz-copy-source", vec!["mybucket/myobject"]), false, vec![("x-amz-copy-source", vec!["mybucket/myobject"])] => true ; "1")] diff --git a/crates/policy/src/policy/policy.rs b/crates/policy/src/policy/policy.rs index 334ae165..45d368d8 100644 --- a/crates/policy/src/policy/policy.rs +++ b/crates/policy/src/policy/policy.rs @@ -62,9 +62,9 @@ pub struct Policy { } impl Policy 
{ - pub fn is_allowed(&self, args: &Args) -> bool { + pub async fn is_allowed(&self, args: &Args<'_>) -> bool { for statement in self.statements.iter().filter(|s| matches!(s.effect, Effect::Deny)) { - if !statement.is_allowed(args) { + if !statement.is_allowed(args).await { return false; } } @@ -74,7 +74,7 @@ impl Policy { } for statement in self.statements.iter().filter(|s| matches!(s.effect, Effect::Allow)) { - if statement.is_allowed(args) { + if statement.is_allowed(args).await { return true; } } @@ -82,9 +82,9 @@ impl Policy { false } - pub fn match_resource(&self, resource: &str) -> bool { + pub async fn match_resource(&self, resource: &str) -> bool { for statement in self.statements.iter() { - if statement.resources.match_resource(resource) { + if statement.resources.match_resource(resource).await { return true; } } @@ -188,9 +188,9 @@ pub struct BucketPolicy { } impl BucketPolicy { - pub fn is_allowed(&self, args: &BucketPolicyArgs) -> bool { + pub async fn is_allowed(&self, args: &BucketPolicyArgs<'_>) -> bool { for statement in self.statements.iter().filter(|s| matches!(s.effect, Effect::Deny)) { - if !statement.is_allowed(args) { + if !statement.is_allowed(args).await { return false; } } @@ -200,7 +200,7 @@ impl BucketPolicy { } for statement in self.statements.iter().filter(|s| matches!(s.effect, Effect::Allow)) { - if statement.is_allowed(args) { + if statement.is_allowed(args).await { return true; } } @@ -525,4 +525,281 @@ mod test { // assert_eq!(p, p2); Ok(()) } + + #[tokio::test] + async fn test_aws_username_policy_variable() -> Result<()> { + let data = r#" +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": ["s3:ListBucket"], + "Resource": ["arn:aws:s3:::${aws:username}-*"] + } + ] +} +"#; + + let policy = Policy::parse_config(data.as_bytes())?; + + let conditions = HashMap::new(); + + // Test allowed case - user testuser accessing testuser-bucket + let mut claims1 = HashMap::new(); + claims1.insert("username".to_string(), Value::String("testuser".to_string())); + + let args1 = Args { + account: "testuser", + groups: &None, + action: Action::S3Action(crate::policy::action::S3Action::ListBucketAction), + bucket: "testuser-bucket", + conditions: &conditions, + is_owner: false, + object: "", + claims: &claims1, + deny_only: false, + }; + + // Test denied case - user otheruser accessing testuser-bucket + let mut claims2 = HashMap::new(); + claims2.insert("username".to_string(), Value::String("otheruser".to_string())); + + let args2 = Args { + account: "otheruser", + groups: &None, + action: Action::S3Action(crate::policy::action::S3Action::ListBucketAction), + bucket: "testuser-bucket", + conditions: &conditions, + is_owner: false, + object: "", + claims: &claims2, + deny_only: false, + }; + + assert!(pollster::block_on(policy.is_allowed(&args1))); + assert!(!pollster::block_on(policy.is_allowed(&args2))); + + Ok(()) + } + + #[tokio::test] + async fn test_aws_userid_policy_variable() -> Result<()> { + let data = r#" +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": ["s3:ListBucket"], + "Resource": ["arn:aws:s3:::${aws:userid}-bucket"] + } + ] +} +"#; + + let policy = Policy::parse_config(data.as_bytes())?; + + let mut claims = HashMap::new(); + claims.insert("sub".to_string(), Value::String("AIDACKCEVSQ6C2EXAMPLE".to_string())); + + let conditions = HashMap::new(); + + // Test allowed case + let args1 = Args { + account: "testuser", + groups: &None, + action: 
Action::S3Action(crate::policy::action::S3Action::ListBucketAction), + bucket: "AIDACKCEVSQ6C2EXAMPLE-bucket", + conditions: &conditions, + is_owner: false, + object: "", + claims: &claims, + deny_only: false, + }; + + // Test denied case + let args2 = Args { + account: "testuser", + groups: &None, + action: Action::S3Action(crate::policy::action::S3Action::ListBucketAction), + bucket: "OTHERUSER-bucket", + conditions: &conditions, + is_owner: false, + object: "", + claims: &claims, + deny_only: false, + }; + + assert!(pollster::block_on(policy.is_allowed(&args1))); + assert!(!pollster::block_on(policy.is_allowed(&args2))); + + Ok(()) + } + + #[tokio::test] + async fn test_aws_policy_variables_concatenation() -> Result<()> { + let data = r#" +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": ["s3:ListBucket"], + "Resource": ["arn:aws:s3:::${aws:username}-${aws:userid}-bucket"] + } + ] +} +"#; + + let policy = Policy::parse_config(data.as_bytes())?; + + let mut claims = HashMap::new(); + claims.insert("username".to_string(), Value::String("testuser".to_string())); + claims.insert("sub".to_string(), Value::String("AIDACKCEVSQ6C2EXAMPLE".to_string())); + + let conditions = HashMap::new(); + + // Test allowed case + let args1 = Args { + account: "testuser", + groups: &None, + action: Action::S3Action(crate::policy::action::S3Action::ListBucketAction), + bucket: "testuser-AIDACKCEVSQ6C2EXAMPLE-bucket", + conditions: &conditions, + is_owner: false, + object: "", + claims: &claims, + deny_only: false, + }; + + // Test denied case + let args2 = Args { + account: "testuser", + groups: &None, + action: Action::S3Action(crate::policy::action::S3Action::ListBucketAction), + bucket: "otheruser-AIDACKCEVSQ6C2EXAMPLE-bucket", + conditions: &conditions, + is_owner: false, + object: "", + claims: &claims, + deny_only: false, + }; + + assert!(pollster::block_on(policy.is_allowed(&args1))); + assert!(!pollster::block_on(policy.is_allowed(&args2))); + + Ok(()) + } + + #[tokio::test] + async fn test_aws_policy_variables_nested() -> Result<()> { + let data = r#" +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": ["s3:ListBucket"], + "Resource": ["arn:aws:s3:::${${aws:PrincipalType}-${aws:userid}}"] + } + ] +} +"#; + + let policy = Policy::parse_config(data.as_bytes())?; + + let mut claims = HashMap::new(); + claims.insert("sub".to_string(), Value::String("AIDACKCEVSQ6C2EXAMPLE".to_string())); + // For PrincipalType, it will default to "User" when not explicitly set + + let conditions = HashMap::new(); + + // Test allowed case + let args1 = Args { + account: "testuser", + groups: &None, + action: Action::S3Action(crate::policy::action::S3Action::ListBucketAction), + bucket: "User-AIDACKCEVSQ6C2EXAMPLE", + conditions: &conditions, + is_owner: false, + object: "", + claims: &claims, + deny_only: false, + }; + + // Test denied case + let args2 = Args { + account: "testuser", + groups: &None, + action: Action::S3Action(crate::policy::action::S3Action::ListBucketAction), + bucket: "User-OTHERUSER", + conditions: &conditions, + is_owner: false, + object: "", + claims: &claims, + deny_only: false, + }; + + assert!(pollster::block_on(policy.is_allowed(&args1))); + assert!(!pollster::block_on(policy.is_allowed(&args2))); + + Ok(()) + } + + #[tokio::test] + async fn test_aws_policy_variables_multi_value() -> Result<()> { + let data = r#" +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": ["s3:ListBucket"], + "Resource": 
["arn:aws:s3:::${aws:username}-bucket"] + } + ] +} +"#; + + let policy = Policy::parse_config(data.as_bytes())?; + + let mut claims = HashMap::new(); + // Test with array value for username + claims.insert( + "username".to_string(), + Value::Array(vec![Value::String("user1".to_string()), Value::String("user2".to_string())]), + ); + + let conditions = HashMap::new(); + + let args1 = Args { + account: "user1", + groups: &None, + action: Action::S3Action(crate::policy::action::S3Action::ListBucketAction), + bucket: "user1-bucket", + conditions: &conditions, + is_owner: false, + object: "", + claims: &claims, + deny_only: false, + }; + + let args2 = Args { + account: "user2", + groups: &None, + action: Action::S3Action(crate::policy::action::S3Action::ListBucketAction), + bucket: "user2-bucket", + conditions: &conditions, + is_owner: false, + object: "", + claims: &claims, + deny_only: false, + }; + + // Either user1 or user2 should be allowed + assert!(pollster::block_on(policy.is_allowed(&args1)) || pollster::block_on(policy.is_allowed(&args2))); + + Ok(()) + } } diff --git a/crates/policy/src/policy/resource.rs b/crates/policy/src/policy/resource.rs index c7415861..a491c55b 100644 --- a/crates/policy/src/policy/resource.rs +++ b/crates/policy/src/policy/resource.rs @@ -24,15 +24,25 @@ use super::{ Error as IamError, Validator, function::key_name::KeyName, utils::{path, wildcard}, + variables::PolicyVariableResolver, }; #[derive(Serialize, Deserialize, Clone, Default, Debug)] pub struct ResourceSet(pub HashSet); impl ResourceSet { - pub fn is_match(&self, resource: &str, conditions: &HashMap>) -> bool { + pub async fn is_match(&self, resource: &str, conditions: &HashMap>) -> bool { + self.is_match_with_resolver(resource, conditions, None).await + } + + pub async fn is_match_with_resolver( + &self, + resource: &str, + conditions: &HashMap>, + resolver: Option<&dyn PolicyVariableResolver>, + ) -> bool { for re in self.0.iter() { - if re.is_match(resource, conditions) { + if re.is_match_with_resolver(resource, conditions, resolver).await { return true; } } @@ -40,9 +50,9 @@ impl ResourceSet { false } - pub fn match_resource(&self, resource: &str) -> bool { + pub async fn match_resource(&self, resource: &str) -> bool { for re in self.0.iter() { - if re.match_resource(resource) { + if re.match_resource(resource).await { return true; } } @@ -85,31 +95,56 @@ pub enum Resource { impl Resource { pub const S3_PREFIX: &'static str = "arn:aws:s3:::"; - pub fn is_match(&self, resource: &str, conditions: &HashMap>) -> bool { - let mut pattern = match self { + pub async fn is_match(&self, resource: &str, conditions: &HashMap>) -> bool { + self.is_match_with_resolver(resource, conditions, None).await + } + + pub async fn is_match_with_resolver( + &self, + resource: &str, + conditions: &HashMap>, + resolver: Option<&dyn PolicyVariableResolver>, + ) -> bool { + let pattern = match self { Resource::S3(s) => s.to_owned(), Resource::Kms(s) => s.to_owned(), }; - if !conditions.is_empty() { - for key in KeyName::COMMON_KEYS { - if let Some(rvalue) = conditions.get(key.name()) { - if matches!(rvalue.first().map(|c| !c.is_empty()), Some(true)) { - pattern = pattern.replace(&key.var_name(), &rvalue[0]); + + let patterns = if let Some(res) = resolver { + super::variables::resolve_aws_variables(&pattern, res).await + } else { + vec![pattern.clone()] + }; + + for pattern in patterns { + let mut resolved_pattern = pattern; + + // Apply condition substitutions + if !conditions.is_empty() { + for key in 
KeyName::COMMON_KEYS { + if let Some(rvalue) = conditions.get(key.name()) { + if matches!(rvalue.first().map(|c| !c.is_empty()), Some(true)) { + resolved_pattern = resolved_pattern.replace(&key.var_name(), &rvalue[0]); + } } } } + + let cp = path::clean(resource); + if cp != "." && cp == resolved_pattern.as_str() { + return true; + } + + if wildcard::is_match(resolved_pattern, resource) { + return true; + } } - let cp = path::clean(resource); - if cp != "." && cp == pattern.as_str() { - return true; - } - - wildcard::is_match(pattern, resource) + false } - pub fn match_resource(&self, resource: &str) -> bool { - self.is_match(resource, &HashMap::new()) + pub async fn match_resource(&self, resource: &str) -> bool { + self.is_match(resource, &HashMap::new()).await } } @@ -197,6 +232,7 @@ mod tests { #[test_case("arn:aws:s3:::mybucket","mybucket/myobject" => false; "15")] fn test_resource_is_match(resource: &str, object: &str) -> bool { let resource: Resource = resource.try_into().unwrap(); - resource.is_match(object, &HashMap::new()) + + pollster::block_on(resource.is_match(object, &HashMap::new())) } } diff --git a/crates/policy/src/policy/statement.rs b/crates/policy/src/policy/statement.rs index 8b7218ac..a27d8528 100644 --- a/crates/policy/src/policy/statement.rs +++ b/crates/policy/src/policy/statement.rs @@ -15,6 +15,7 @@ use super::{ ActionSet, Args, BucketPolicyArgs, Effect, Error as IamError, Functions, ID, Principal, ResourceSet, Validator, action::Action, + variables::{VariableContext, VariableResolver}, }; use crate::error::{Error, Result}; use serde::{Deserialize, Serialize}; @@ -68,7 +69,24 @@ impl Statement { false } - pub fn is_allowed(&self, args: &Args) -> bool { + pub async fn is_allowed(&self, args: &Args<'_>) -> bool { + let mut context = VariableContext::new(); + context.claims = Some(args.claims.clone()); + context.conditions = args.conditions.clone(); + context.account_id = Some(args.account.to_string()); + + let username = if let Some(parent) = args.claims.get("parent").and_then(|v| v.as_str()) { + // For temp credentials or service account credentials, username is parent_user + parent.to_string() + } else { + // For regular user credentials, username is access_key + args.account.to_string() + }; + + context.username = Some(username); + + let resolver = VariableResolver::new(context); + let check = 'c: { if (!self.actions.is_match(&args.action) && !self.actions.is_empty()) || self.not_actions.is_match(&args.action) { break 'c false; @@ -86,14 +104,20 @@ impl Statement { } if self.is_kms() && (resource == "/" || self.resources.is_empty()) { - break 'c self.conditions.evaluate(args.conditions); + break 'c self.conditions.evaluate_with_resolver(args.conditions, Some(&resolver)).await; } - if !self.resources.is_match(&resource, args.conditions) && !self.is_admin() && !self.is_sts() { + if !self + .resources + .is_match_with_resolver(&resource, args.conditions, Some(&resolver)) + .await + && !self.is_admin() + && !self.is_sts() + { break 'c false; } - self.conditions.evaluate(args.conditions) + self.conditions.evaluate_with_resolver(args.conditions, Some(&resolver)).await }; self.effect.is_allowed(check) @@ -155,7 +179,7 @@ pub struct BPStatement { } impl BPStatement { - pub fn is_allowed(&self, args: &BucketPolicyArgs) -> bool { + pub async fn is_allowed(&self, args: &BucketPolicyArgs<'_>) -> bool { let check = 'c: { if !self.principal.is_match(args.account) { break 'c false; @@ -176,15 +200,15 @@ impl BPStatement { resource.push('/'); } - if !self.resources.is_empty() 
&& !self.resources.is_match(&resource, args.conditions) { + if !self.resources.is_empty() && !self.resources.is_match(&resource, args.conditions).await { break 'c false; } - if !self.not_resources.is_empty() && self.not_resources.is_match(&resource, args.conditions) { + if !self.not_resources.is_empty() && self.not_resources.is_match(&resource, args.conditions).await { break 'c false; } - self.conditions.evaluate(args.conditions) + self.conditions.evaluate(args.conditions).await }; self.effect.is_allowed(check) diff --git a/crates/policy/src/policy/variables.rs b/crates/policy/src/policy/variables.rs new file mode 100644 index 00000000..db35663e --- /dev/null +++ b/crates/policy/src/policy/variables.rs @@ -0,0 +1,465 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +use async_trait::async_trait; +use moka::future::Cache; +use serde_json::Value; +use std::collections::HashMap; +use std::future::Future; +use std::time::Duration; +use time::OffsetDateTime; + +/// Context information for variable resolution +#[derive(Debug, Clone, Default)] +pub struct VariableContext { + pub is_https: bool, + pub source_ip: Option, + pub account_id: Option, + pub region: Option, + pub username: Option, + pub claims: Option>, + pub conditions: HashMap>, + pub custom_variables: HashMap, +} + +impl VariableContext { + pub fn new() -> Self { + Self::default() + } +} + +pub struct VariableResolverCache { + /// Moka cache storing resolved results + cache: Cache, +} + +impl VariableResolverCache { + pub fn new(capacity: usize, ttl_seconds: u64) -> Self { + let cache = Cache::builder() + .max_capacity(capacity as u64) + .time_to_live(Duration::from_secs(ttl_seconds)) + .build(); + + Self { cache } + } + + pub async fn get(&self, key: &str) -> Option { + self.cache.get(key).await + } + + pub async fn put(&self, key: String, value: String) { + self.cache.insert(key, value).await; + } + + pub async fn clear(&self) { + self.cache.invalidate_all(); + } +} + +/// Cached dynamic AWS variable resolver +pub struct CachedAwsVariableResolver { + inner: VariableResolver, + cache: VariableResolverCache, +} + +impl CachedAwsVariableResolver { + pub fn new(context: VariableContext) -> Self { + Self { + inner: VariableResolver::new(context), + cache: VariableResolverCache::new(100, 300), // 100 entries, 5 minutes expiration + } + } + + pub fn is_dynamic(&self, variable_name: &str) -> bool { + self.inner.is_dynamic(variable_name) + } +} + +#[async_trait] +impl PolicyVariableResolver for CachedAwsVariableResolver { + async fn resolve(&self, variable_name: &str) -> Option { + if self.is_dynamic(variable_name) { + return self.inner.resolve(variable_name).await; + } + + if let Some(cached) = self.cache.get(variable_name).await { + return Some(cached); + } + + let value = self.inner.resolve(variable_name).await?; + self.cache.put(variable_name.to_string(), value.clone()).await; + Some(value) + } + + async fn resolve_multiple(&self, variable_name: &str) -> Option> { + 
self.inner.resolve_multiple(variable_name).await
+    }
+
+    fn is_dynamic(&self, variable_name: &str) -> bool {
+        self.inner.is_dynamic(variable_name)
+    }
+}
+
+/// Policy variable resolver trait
+#[async_trait]
+pub trait PolicyVariableResolver: Sync {
+    async fn resolve(&self, variable_name: &str) -> Option<String>;
+    async fn resolve_multiple(&self, variable_name: &str) -> Option<Vec<String>> {
+        self.resolve(variable_name).await.map(|s| vec![s])
+    }
+    fn is_dynamic(&self, variable_name: &str) -> bool;
+}
+
+/// AWS variable resolver
+pub struct VariableResolver {
+    context: VariableContext,
+}
+
+impl VariableResolver {
+    pub fn new(context: VariableContext) -> Self {
+        Self { context }
+    }
+
+    fn get_claim_as_strings(&self, claim_name: &str) -> Option<Vec<String>> {
+        self.context
+            .claims
+            .as_ref()
+            .and_then(|claims| claims.get(claim_name))
+            .and_then(|value| match value {
+                Value::String(s) => Some(vec![s.clone()]),
+                Value::Array(arr) => Some(
+                    arr.iter()
+                        .filter_map(|item| match item {
+                            Value::String(s) => Some(s.clone()),
+                            Value::Number(n) => Some(n.to_string()),
+                            Value::Bool(b) => Some(b.to_string()),
+                            _ => None,
+                        })
+                        .collect(),
+                ),
+                Value::Number(n) => Some(vec![n.to_string()]),
+                Value::Bool(b) => Some(vec![b.to_string()]),
+                _ => None,
+            })
+    }
+
+    fn resolve_username(&self) -> Option<String> {
+        self.context.username.clone()
+    }
+
+    fn resolve_userid(&self) -> Option<String> {
+        self.get_claim_as_strings("sub")
+            .or_else(|| self.get_claim_as_strings("parent"))
+            .and_then(|mut vec| vec.pop()) // take a single value, preserving the original logic
+    }
+
+    fn resolve_principal_type(&self) -> String {
+        if let Some(claims) = &self.context.claims {
+            if claims.contains_key("roleArn") {
+                return "AssumedRole".to_string();
+            }
+
+            if claims.contains_key("parent") && claims.contains_key("sa-policy") {
+                return "ServiceAccount".to_string();
+            }
+        }
+
+        "User".to_string()
+    }
+
+    fn resolve_secure_transport(&self) -> String {
+        if self.context.is_https { "true" } else { "false" }.to_string()
+    }
+
+    fn resolve_current_time(&self) -> String {
+        let now = OffsetDateTime::now_utc();
+        now.format(&time::format_description::well_known::Rfc3339)
+            .unwrap_or_else(|_| now.to_string())
+    }
+
+    fn resolve_epoch_time(&self) -> String {
+        OffsetDateTime::now_utc().unix_timestamp().to_string()
+    }
+
+    fn resolve_account_id(&self) -> Option<String> {
+        self.context.account_id.clone()
+    }
+
+    fn resolve_region(&self) -> Option<String> {
+        self.context.region.clone()
+    }
+
+    fn resolve_source_ip(&self) -> Option<String> {
+        self.context.source_ip.clone()
+    }
+
+    fn resolve_custom_variable(&self, variable_name: &str) -> Option<String> {
+        let custom_key = variable_name.strip_prefix("custom:")?;
+        self.context.custom_variables.get(custom_key).cloned()
+    }
+}
+
+#[async_trait]
+impl PolicyVariableResolver for VariableResolver {
+    async fn resolve(&self, variable_name: &str) -> Option<String> {
+        match variable_name {
+            "aws:username" => self.resolve_username(),
+            "aws:userid" => self.resolve_userid(),
+            "aws:PrincipalType" => Some(self.resolve_principal_type()),
+            "aws:SecureTransport" => Some(self.resolve_secure_transport()),
+            "aws:CurrentTime" => Some(self.resolve_current_time()),
+            "aws:EpochTime" => Some(self.resolve_epoch_time()),
+            "aws:AccountId" => self.resolve_account_id(),
+            "aws:Region" => self.resolve_region(),
+            "aws:SourceIp" => self.resolve_source_ip(),
+            _ => {
+                // Handle custom:* variables
+                if variable_name.starts_with("custom:") {
+                    self.resolve_custom_variable(variable_name)
+                } else {
+                    None
+                }
+            }
+        }
+    }
+
+    async fn resolve_multiple(&self, variable_name: &str) -> Option<Vec<String>> {
+        match
variable_name { + "aws:username" => self.resolve_username().map(|s| vec![s]), + + "aws:userid" => self + .get_claim_as_strings("sub") + .or_else(|| self.get_claim_as_strings("parent")), + + _ => self.resolve(variable_name).await.map(|s| vec![s]), + } + } + + fn is_dynamic(&self, variable_name: &str) -> bool { + matches!(variable_name, "aws:CurrentTime" | "aws:EpochTime") + } +} + +pub async fn resolve_aws_variables(pattern: &str, resolver: &dyn PolicyVariableResolver) -> Vec { + let mut results = vec![pattern.to_string()]; + + let mut changed = true; + let max_iterations = 10; // Prevent infinite loops + let mut iteration = 0; + + while changed && iteration < max_iterations { + changed = false; + iteration += 1; + + let mut new_results = Vec::new(); + for result in &results { + let resolved = resolve_single_pass(result, resolver).await; + if resolved.len() > 1 || (resolved.len() == 1 && &resolved[0] != result) { + changed = true; + } + new_results.extend(resolved); + } + + // Remove duplicates while preserving order + results.clear(); + let mut seen = std::collections::HashSet::new(); + for result in new_results { + if seen.insert(result.clone()) { + results.push(result); + } + } + } + + results +} + +// Need to box the future to avoid infinite size due to recursion +fn resolve_aws_variables_boxed<'a>( + pattern: &'a str, + resolver: &'a dyn PolicyVariableResolver, +) -> std::pin::Pin> + Send + 'a>> { + Box::pin(resolve_aws_variables(pattern, resolver)) +} + +/// Single pass resolution of variables in a string +async fn resolve_single_pass(pattern: &str, resolver: &dyn PolicyVariableResolver) -> Vec { + // Find all ${...} format variables + let mut results = vec![pattern.to_string()]; + + // Process each result string + let mut i = 0; + while i < results.len() { + let mut start = 0; + let mut modified = false; + + // Find variables in current string + while let Some(pos) = results[i][start..].find("${") { + let actual_pos = start + pos; + + // Find the matching closing brace, taking into account nested braces + let mut brace_count = 1; + let mut end_pos = actual_pos + 2; // Start after "${" + + while end_pos < results[i].len() && brace_count > 0 { + match results[i].chars().nth(end_pos).unwrap() { + '{' => brace_count += 1, + '}' => brace_count -= 1, + _ => {} + } + if brace_count > 0 { + end_pos += 1; + } + } + + if brace_count == 0 { + let var_name = &results[i][actual_pos + 2..end_pos]; + + // Check if this is a nested variable (contains ${...} inside) + if var_name.contains("${") { + // For nested variables like ${${a}-${b}}, we need to resolve the inner variables first + // Then use the resolved result as a new variable to resolve + let resolved_inner = resolve_aws_variables_boxed(var_name, resolver).await; + let mut new_results = Vec::new(); + + for resolved_var_name in resolved_inner { + let prefix = &results[i][..actual_pos]; + let suffix = &results[i][end_pos + 1..]; + new_results.push(format!("{prefix}{resolved_var_name}{suffix}")); + } + + if !new_results.is_empty() { + // Update result set + results.splice(i..i + 1, new_results); + modified = true; + break; + } else { + // If we couldn't resolve the nested variable, keep the original + start = end_pos + 1; + } + } else { + // Regular variable resolution + if let Some(values) = resolver.resolve_multiple(var_name).await { + if !values.is_empty() { + // If there are multiple values, create a new result for each value + let mut new_results = Vec::new(); + let prefix = &results[i][..actual_pos]; + let suffix = &results[i][end_pos + 
1..]; + + for value in values { + new_results.push(format!("{prefix}{value}{suffix}")); + } + + results.splice(i..i + 1, new_results); + modified = true; + break; + } else { + // Variable resolved to empty, just remove the variable placeholder + let mut new_results = Vec::new(); + let prefix = &results[i][..actual_pos]; + let suffix = &results[i][end_pos + 1..]; + new_results.push(format!("{prefix}{suffix}")); + + results.splice(i..i + 1, new_results); + modified = true; + break; + } + } else { + // Variable not found, skip + start = end_pos + 1; + } + } + } else { + // No matching closing brace found, break loop + break; + } + } + + if !modified { + i += 1; + } + } + + results +} + +#[cfg(test)] +mod tests { + use super::*; + use serde_json::Value; + use std::collections::HashMap; + + #[tokio::test] + async fn test_resolve_aws_variables_with_username() { + let mut context = VariableContext::new(); + context.username = Some("testuser".to_string()); + + let resolver = VariableResolver::new(context); + let result = resolve_aws_variables("${aws:username}-bucket", &resolver).await; + assert_eq!(result, vec!["testuser-bucket".to_string()]); + } + + #[tokio::test] + async fn test_resolve_aws_variables_with_userid() { + let mut claims = HashMap::new(); + claims.insert("sub".to_string(), Value::String("AIDACKCEVSQ6C2EXAMPLE".to_string())); + + let mut context = VariableContext::new(); + context.claims = Some(claims); + + let resolver = VariableResolver::new(context); + let result = resolve_aws_variables("${aws:userid}-bucket", &resolver).await; + assert_eq!(result, vec!["AIDACKCEVSQ6C2EXAMPLE-bucket".to_string()]); + } + + #[tokio::test] + async fn test_resolve_aws_variables_with_multiple_variables() { + let mut claims = HashMap::new(); + claims.insert("sub".to_string(), Value::String("AIDACKCEVSQ6C2EXAMPLE".to_string())); + + let mut context = VariableContext::new(); + context.claims = Some(claims); + context.username = Some("testuser".to_string()); + + let resolver = VariableResolver::new(context); + let result = resolve_aws_variables("${aws:username}-${aws:userid}-bucket", &resolver).await; + assert_eq!(result, vec!["testuser-AIDACKCEVSQ6C2EXAMPLE-bucket".to_string()]); + } + + #[tokio::test] + async fn test_resolve_aws_variables_no_variables() { + let context = VariableContext::new(); + let resolver = VariableResolver::new(context); + + let result = resolve_aws_variables("test-bucket", &resolver).await; + assert_eq!(result, vec!["test-bucket".to_string()]); + } + + #[tokio::test] + async fn test_cached_aws_variable_resolver_dynamic_variables() { + let context = VariableContext::new(); + + let cached_resolver = CachedAwsVariableResolver::new(context); + + // Dynamic variables should not be cached + let result1 = resolve_aws_variables("${aws:EpochTime}-bucket", &cached_resolver).await; + + // Add a delay of 1 second to ensure different timestamps + tokio::time::sleep(Duration::from_secs(1)).await; + + let result2 = resolve_aws_variables("${aws:EpochTime}-bucket", &cached_resolver).await; + + // Both results should be different (different timestamps) + assert_ne!(result1, result2); + } +} diff --git a/crates/policy/tests/policy_is_allowed.rs b/crates/policy/tests/policy_is_allowed.rs index bbffc481..00991a71 100644 --- a/crates/policy/tests/policy_is_allowed.rs +++ b/crates/policy/tests/policy_is_allowed.rs @@ -612,7 +612,7 @@ struct ArgsBuilder { "24" )] fn policy_is_allowed(policy: Policy, args: ArgsBuilder) -> bool { - policy.is_allowed(&Args { + pollster::block_on(policy.is_allowed(&Args { 
account: &args.account, groups: &{ if args.groups.is_empty() { @@ -628,5 +628,5 @@ fn policy_is_allowed(policy: Policy, args: ArgsBuilder) -> bool { object: &args.object, claims: &args.claims, deny_only: args.deny_only, - }) + })) } diff --git a/crates/protos/src/generated/proto_gen/node_service.rs b/crates/protos/src/generated/proto_gen/node_service.rs index c7a0ac7f..dd0bc308 100644 --- a/crates/protos/src/generated/proto_gen/node_service.rs +++ b/crates/protos/src/generated/proto_gen/node_service.rs @@ -438,6 +438,24 @@ pub struct DeletePathsResponse { pub error: ::core::option::Option, } #[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)] +pub struct ReadMetadataRequest { + #[prost(string, tag = "1")] + pub disk: ::prost::alloc::string::String, + #[prost(string, tag = "2")] + pub volume: ::prost::alloc::string::String, + #[prost(string, tag = "3")] + pub path: ::prost::alloc::string::String, +} +#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)] +pub struct ReadMetadataResponse { + #[prost(bool, tag = "1")] + pub success: bool, + #[prost(message, optional, tag = "2")] + pub error: ::core::option::Option, + #[prost(bytes = "bytes", tag = "3")] + pub data: ::prost::bytes::Bytes, +} +#[derive(Clone, PartialEq, Eq, Hash, ::prost::Message)] pub struct UpdateMetadataRequest { #[prost(string, tag = "1")] pub disk: ::prost::alloc::string::String, @@ -1524,6 +1542,21 @@ pub mod node_service_client { .insert(GrpcMethod::new("node_service.NodeService", "UpdateMetadata")); self.inner.unary(req, path, codec).await } + pub async fn read_metadata( + &mut self, + request: impl tonic::IntoRequest, + ) -> std::result::Result, tonic::Status> { + self.inner + .ready() + .await + .map_err(|e| tonic::Status::unknown(format!("Service was not ready: {}", e.into())))?; + let codec = tonic_prost::ProstCodec::default(); + let path = http::uri::PathAndQuery::from_static("/node_service.NodeService/ReadMetadata"); + let mut req = request.into_request(); + req.extensions_mut() + .insert(GrpcMethod::new("node_service.NodeService", "ReadMetadata")); + self.inner.unary(req, path, codec).await + } pub async fn write_metadata( &mut self, request: impl tonic::IntoRequest, @@ -2403,6 +2436,10 @@ pub mod node_service_server { &self, request: tonic::Request, ) -> std::result::Result, tonic::Status>; + async fn read_metadata( + &self, + request: tonic::Request, + ) -> std::result::Result, tonic::Status>; async fn write_metadata( &self, request: tonic::Request, @@ -3407,6 +3444,34 @@ pub mod node_service_server { }; Box::pin(fut) } + "/node_service.NodeService/ReadMetadata" => { + #[allow(non_camel_case_types)] + struct ReadMetadataSvc(pub Arc); + impl tonic::server::UnaryService for ReadMetadataSvc { + type Response = super::ReadMetadataResponse; + type Future = BoxFuture, tonic::Status>; + fn call(&mut self, request: tonic::Request) -> Self::Future { + let inner = Arc::clone(&self.0); + let fut = async move { ::read_metadata(&inner, request).await }; + Box::pin(fut) + } + } + let accept_compression_encodings = self.accept_compression_encodings; + let send_compression_encodings = self.send_compression_encodings; + let max_decoding_message_size = self.max_decoding_message_size; + let max_encoding_message_size = self.max_encoding_message_size; + let inner = self.inner.clone(); + let fut = async move { + let method = ReadMetadataSvc(inner); + let codec = tonic_prost::ProstCodec::default(); + let mut grpc = tonic::server::Grpc::new(codec) + .apply_compression_config(accept_compression_encodings, 
send_compression_encodings) + .apply_max_message_size_config(max_decoding_message_size, max_encoding_message_size); + let res = grpc.unary(method, req).await; + Ok(res) + }; + Box::pin(fut) + } "/node_service.NodeService/WriteMetadata" => { #[allow(non_camel_case_types)] struct WriteMetadataSvc(pub Arc); diff --git a/crates/protos/src/lib.rs b/crates/protos/src/lib.rs index 4242a76f..e54f1edf 100644 --- a/crates/protos/src/lib.rs +++ b/crates/protos/src/lib.rs @@ -15,19 +15,24 @@ #[allow(unsafe_code)] mod generated; -use std::{error::Error, time::Duration}; - -pub use generated::*; use proto_gen::node_service::node_service_client::NodeServiceClient; -use rustfs_common::globals::{GLOBAL_Conn_Map, evict_connection}; +use rustfs_common::{GLOBAL_CONN_MAP, GLOBAL_ROOT_CERT, evict_connection}; +use std::{error::Error, time::Duration}; use tonic::{ Request, Status, metadata::MetadataValue, service::interceptor::InterceptedService, - transport::{Channel, Endpoint}, + transport::{Certificate, Channel, ClientTlsConfig, Endpoint}, }; use tracing::{debug, warn}; +// Type alias for the complex client type +pub type NodeServiceClientType = NodeServiceClient< + InterceptedService) -> Result, Status> + Send + Sync + 'static>>, +>; + +pub use generated::*; + // Default 100 MB pub const DEFAULT_GRPC_SERVER_MESSAGE_LEN: usize = 100 * 1024 * 1024; @@ -46,6 +51,12 @@ const HTTP2_KEEPALIVE_TIMEOUT_SECS: u64 = 3; /// Overall RPC timeout - maximum time for any single RPC operation const RPC_TIMEOUT_SECS: u64 = 30; +/// Default HTTPS prefix for rustfs +/// This is the default HTTPS prefix for rustfs. +/// It is used to identify HTTPS URLs. +/// Default value: https:// +const RUSTFS_HTTPS_PREFIX: &str = "https://"; + /// Creates a new gRPC channel with optimized keepalive settings for cluster resilience. /// /// This function is designed to detect dead peers quickly: @@ -56,7 +67,7 @@ const RPC_TIMEOUT_SECS: u64 = 30; async fn create_new_channel(addr: &str) -> Result> { debug!("Creating new gRPC channel to: {}", addr); - let connector = Endpoint::from_shared(addr.to_string())? + let mut connector = Endpoint::from_shared(addr.to_string())? // Fast connection timeout for dead peer detection .connect_timeout(Duration::from_secs(CONNECT_TIMEOUT_SECS)) // TCP-level keepalive - OS will probe connection @@ -70,11 +81,42 @@ async fn create_new_channel(addr: &str) -> Result> { // Overall timeout for any RPC - fail fast on unresponsive peers .timeout(Duration::from_secs(RPC_TIMEOUT_SECS)); + let root_cert = GLOBAL_ROOT_CERT.read().await; + if addr.starts_with(RUSTFS_HTTPS_PREFIX) { + if let Some(cert_pem) = root_cert.as_ref() { + let ca = Certificate::from_pem(cert_pem); + // Derive the hostname from the HTTPS URL for TLS hostname verification. + let domain = addr + .trim_start_matches(RUSTFS_HTTPS_PREFIX) + .split('/') + .next() + .unwrap_or("") + .split(':') + .next() + .unwrap_or(""); + let tls = if !domain.is_empty() { + ClientTlsConfig::new().ca_certificate(ca).domain_name(domain) + } else { + // Fallback: configure TLS without explicit domain if parsing fails. + ClientTlsConfig::new().ca_certificate(ca) + }; + connector = connector.tls_config(tls)?; + debug!("Configured TLS with custom root certificate for: {}", addr); + } else { + debug!("Using system root certificates for TLS: {}", addr); + } + } else { + // Custom root certificates are configured but will be ignored for non-HTTPS addresses. 
+ if root_cert.is_some() { + warn!("Custom root certificates are configured but not used because the address does not use HTTPS: {addr}"); + } + } + let channel = connector.connect().await?; // Cache the new connection { - GLOBAL_Conn_Map.write().await.insert(addr.to_string(), channel.clone()); + GLOBAL_CONN_MAP.write().await.insert(addr.to_string(), channel.clone()); } debug!("Successfully created and cached gRPC channel to: {}", addr); @@ -111,7 +153,7 @@ pub async fn node_service_time_out_client( let token: MetadataValue<_> = "rustfs rpc".parse()?; // Try to get cached channel - let cached_channel = { GLOBAL_Conn_Map.read().await.get(addr).cloned() }; + let cached_channel = { GLOBAL_CONN_MAP.read().await.get(addr).cloned() }; let channel = match cached_channel { Some(channel) => { diff --git a/crates/protos/src/main.rs b/crates/protos/src/main.rs index fe18772a..95d6d79e 100644 --- a/crates/protos/src/main.rs +++ b/crates/protos/src/main.rs @@ -46,7 +46,7 @@ fn main() -> Result<(), AnyError> { }; if !need_compile { - println!("no need to compile protos.{}", need_compile); + println!("no need to compile protos.{need_compile}"); return Ok(()); } diff --git a/crates/protos/src/node.proto b/crates/protos/src/node.proto index c3b535c6..bb23ac75 100644 --- a/crates/protos/src/node.proto +++ b/crates/protos/src/node.proto @@ -313,6 +313,18 @@ message DeletePathsResponse { optional Error error = 2; } +message ReadMetadataRequest { + string disk = 1; + string volume = 2; + string path = 3; +} + +message ReadMetadataResponse { + bool success = 1; + optional Error error = 2; + bytes data = 3; +} + message UpdateMetadataRequest { string disk = 1; string volume = 2; @@ -786,6 +798,7 @@ service NodeService { rpc StatVolume(StatVolumeRequest) returns (StatVolumeResponse) {}; rpc DeletePaths(DeletePathsRequest) returns (DeletePathsResponse) {}; rpc UpdateMetadata(UpdateMetadataRequest) returns (UpdateMetadataResponse) {}; + rpc ReadMetadata(ReadMetadataRequest) returns (ReadMetadataResponse) {}; rpc WriteMetadata(WriteMetadataRequest) returns (WriteMetadataResponse) {}; rpc ReadVersion(ReadVersionRequest) returns (ReadVersionResponse) {}; rpc ReadXL(ReadXLRequest) returns (ReadXLResponse) {}; @@ -794,6 +807,7 @@ service NodeService { rpc ReadMultiple(ReadMultipleRequest) returns (ReadMultipleResponse) {}; rpc DeleteVolume(DeleteVolumeRequest) returns (DeleteVolumeResponse) {}; rpc DiskInfo(DiskInfoRequest) returns (DiskInfoResponse) {}; + /* -------------------------------lock service-------------------------- */ diff --git a/crates/s3select-api/Cargo.toml b/crates/s3select-api/Cargo.toml index bcb575b5..60c14156 100644 --- a/crates/s3select-api/Cargo.toml +++ b/crates/s3select-api/Cargo.toml @@ -39,6 +39,7 @@ object_store = { workspace = true } pin-project-lite.workspace = true s3s.workspace = true snafu = { workspace = true, features = ["backtrace"] } +parking_lot.workspace = true tokio.workspace = true tokio-util.workspace = true tracing.workspace = true diff --git a/crates/s3select-api/src/lib.rs b/crates/s3select-api/src/lib.rs index 3cee17e7..322bb436 100644 --- a/crates/s3select-api/src/lib.rs +++ b/crates/s3select-api/src/lib.rs @@ -12,10 +12,9 @@ // See the License for the specific language governing permissions and // limitations under the License. 
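The hostname derivation added to `create_new_channel` above trims the `https://` prefix, then drops any path and port before handing the result to `ClientTlsConfig::domain_name`. A minimal standalone sketch of that parsing; `extract_domain` is a hypothetical helper name used only for illustration, not an API exported by the crate:

```rust
/// Sketch of the TLS hostname derivation performed in create_new_channel,
/// under the assumption that only https:// peers get explicit TLS config.
fn extract_domain(addr: &str) -> Option<&str> {
    const HTTPS_PREFIX: &str = "https://";
    // Non-HTTPS addresses never reach the TLS branch.
    let rest = addr.strip_prefix(HTTPS_PREFIX)?;
    // Drop any path component, then any port, keeping only the host name.
    let host = rest.split('/').next().unwrap_or("").split(':').next().unwrap_or("");
    if host.is_empty() { None } else { Some(host) }
}

fn main() {
    assert_eq!(extract_domain("https://node1.rustfs.local:9000/rpc"), Some("node1.rustfs.local"));
    assert_eq!(extract_domain("http://node1:9000"), None);
    println!("hostname derivation matches the create_new_channel parsing");
}
```

When parsing yields an empty host, the patch falls back to `ClientTlsConfig::new().ca_certificate(ca)` without an explicit domain, which keeps the channel usable while still pinning the custom root certificate.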
-use std::fmt::Display; - use datafusion::{common::DataFusionError, sql::sqlparser::parser::ParserError}; use snafu::{Backtrace, Location, Snafu}; +use std::fmt::Display; pub mod object_store; pub mod query; diff --git a/crates/s3select-api/src/query/execution.rs b/crates/s3select-api/src/query/execution.rs index ce26ff0c..86559908 100644 --- a/crates/s3select-api/src/query/execution.rs +++ b/crates/s3select-api/src/query/execution.rs @@ -15,10 +15,11 @@ use std::fmt::Display; use std::pin::Pin; use std::sync::Arc; -use std::sync::atomic::{AtomicPtr, Ordering}; use std::task::{Context, Poll}; use std::time::{Duration, Instant}; +use parking_lot::RwLock; + use async_trait::async_trait; use datafusion::arrow::datatypes::{Schema, SchemaRef}; use datafusion::arrow::record_batch::RecordBatch; @@ -132,7 +133,7 @@ pub struct QueryStateMachine { pub session: SessionCtx, pub query: Query, - state: AtomicPtr, + state: RwLock, start: Instant, } @@ -141,14 +142,14 @@ impl QueryStateMachine { Self { session, query, - state: AtomicPtr::new(Box::into_raw(Box::new(QueryState::ACCEPTING))), + state: RwLock::new(QueryState::ACCEPTING), start: Instant::now(), } } pub fn begin_analyze(&self) { // TODO record time - self.translate_to(Box::new(QueryState::RUNNING(RUNNING::ANALYZING))); + self.translate_to(QueryState::RUNNING(RUNNING::ANALYZING)); } pub fn end_analyze(&self) { @@ -157,7 +158,7 @@ impl QueryStateMachine { pub fn begin_optimize(&self) { // TODO record time - self.translate_to(Box::new(QueryState::RUNNING(RUNNING::OPTIMIZING))); + self.translate_to(QueryState::RUNNING(RUNNING::OPTIMIZING)); } pub fn end_optimize(&self) { @@ -166,7 +167,7 @@ impl QueryStateMachine { pub fn begin_schedule(&self) { // TODO - self.translate_to(Box::new(QueryState::RUNNING(RUNNING::SCHEDULING))); + self.translate_to(QueryState::RUNNING(RUNNING::SCHEDULING)); } pub fn end_schedule(&self) { @@ -175,29 +176,29 @@ impl QueryStateMachine { pub fn finish(&self) { // TODO - self.translate_to(Box::new(QueryState::DONE(DONE::FINISHED))); + self.translate_to(QueryState::DONE(DONE::FINISHED)); } pub fn cancel(&self) { // TODO - self.translate_to(Box::new(QueryState::DONE(DONE::CANCELLED))); + self.translate_to(QueryState::DONE(DONE::CANCELLED)); } pub fn fail(&self) { // TODO - self.translate_to(Box::new(QueryState::DONE(DONE::FAILED))); + self.translate_to(QueryState::DONE(DONE::FAILED)); } - pub fn state(&self) -> &QueryState { - unsafe { &*self.state.load(Ordering::Relaxed) } + pub fn state(&self) -> QueryState { + self.state.read().clone() } pub fn duration(&self) -> Duration { self.start.elapsed() } - fn translate_to(&self, state: Box) { - self.state.store(Box::into_raw(state), Ordering::Relaxed); + fn translate_to(&self, state: QueryState) { + *self.state.write() = state; } } diff --git a/crates/s3select-api/src/query/mod.rs b/crates/s3select-api/src/query/mod.rs index f21da83a..d83af94b 100644 --- a/crates/s3select-api/src/query/mod.rs +++ b/crates/s3select-api/src/query/mod.rs @@ -12,13 +12,11 @@ // See the License for the specific language governing permissions and // limitations under the License. 
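The `QueryStateMachine` change above replaces the `AtomicPtr` state (which leaked each previously boxed state and handed out references with no lifetime guarantee) with a `parking_lot::RwLock`: transitions take a short write lock and `state()` returns an owned clone. A minimal sketch of that pattern, with an illustrative `State` enum standing in for `QueryState`:

```rust
use parking_lot::RwLock;

#[derive(Clone, Debug, PartialEq)]
enum State {
    Accepting,
    Running,
    Done,
}

struct Machine {
    state: RwLock<State>,
}

impl Machine {
    fn new() -> Self {
        Self { state: RwLock::new(State::Accepting) }
    }

    fn translate_to(&self, next: State) {
        // Overwrites in place; nothing is boxed or leaked.
        *self.state.write() = next;
    }

    fn state(&self) -> State {
        // Readers get an owned snapshot instead of a raw-pointer deref.
        self.state.read().clone()
    }
}

fn main() {
    let m = Machine::new();
    m.translate_to(State::Running);
    assert_eq!(m.state(), State::Running);
    m.translate_to(State::Done);
    assert_eq!(m.state(), State::Done);
}
```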
-use std::sync::Arc; - use s3s::dto::SelectObjectContentInput; +use std::sync::Arc; pub mod analyzer; pub mod ast; -pub mod datasource; pub mod dispatcher; pub mod execution; pub mod function; diff --git a/crates/s3select-api/src/query/session.rs b/crates/s3select-api/src/query/session.rs index e96bc638..ab790542 100644 --- a/crates/s3select-api/src/query/session.rs +++ b/crates/s3select-api/src/query/session.rs @@ -12,20 +12,17 @@ // See the License for the specific language governing permissions and // limitations under the License. -use std::sync::Arc; - +use crate::query::Context; +use crate::{QueryError, QueryResult, object_store::EcObjectStore}; use datafusion::{ execution::{SessionStateBuilder, context::SessionState, runtime_env::RuntimeEnvBuilder}, parquet::data_type::AsBytes, prelude::SessionContext, }; use object_store::{ObjectStore, memory::InMemory, path::Path}; +use std::sync::Arc; use tracing::error; -use crate::{QueryError, QueryResult, object_store::EcObjectStore}; - -use super::Context; - #[derive(Clone)] pub struct SessionCtx { _desc: Arc, diff --git a/crates/scanner/Cargo.toml b/crates/scanner/Cargo.toml new file mode 100644 index 00000000..3257b3b4 --- /dev/null +++ b/crates/scanner/Cargo.toml @@ -0,0 +1,61 @@ +# Copyright 2024 RustFS Team +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +[package] +name = "rustfs-scanner" +version.workspace = true +edition.workspace = true +authors = ["RustFS Team"] +license.workspace = true +repository.workspace = true +rust-version.workspace = true +homepage.workspace = true +description = "RustFS Scanner provides scanning capabilities for data integrity checks, health monitoring, and storage analysis." 
+keywords = ["RustFS", "scanner", "health-monitoring", "data-integrity", "storage-analysis", "Minio"]
+categories = ["web-programming", "development-tools", "filesystem"]
+documentation = "https://docs.rs/rustfs-scanner/latest/rustfs_scanner/"
+
+[lints]
+workspace = true
+
+[dependencies]
+rustfs-config = { workspace = true }
+rustfs-common = { workspace = true }
+rustfs-utils = { workspace = true }
+tokio = { workspace = true, features = ["full"] }
+tracing = { workspace = true }
+serde = { workspace = true, features = ["derive"] }
+serde_json = { workspace = true }
+thiserror = { workspace = true }
+uuid = { workspace = true, features = ["v4", "serde"] }
+anyhow = { workspace = true }
+async-trait = { workspace = true }
+futures = { workspace = true }
+time = { workspace = true }
+chrono = { workspace = true }
+path-clean = { workspace = true }
+rmp-serde = { workspace = true }
+rustfs-filemeta = { workspace = true }
+rustfs-madmin = { workspace = true }
+tokio-util = { workspace = true }
+rustfs-ecstore = { workspace = true }
+http = { workspace = true }
+rand = { workspace = true }
+s3s = { workspace = true }
+
+[dev-dependencies]
+tokio-test = { workspace = true }
+tracing-subscriber = { workspace = true }
+tempfile = { workspace = true }
+serial_test = { workspace = true }
diff --git a/crates/scanner/README.md b/crates/scanner/README.md
new file mode 100644
index 00000000..c51c16e2
--- /dev/null
+++ b/crates/scanner/README.md
@@ -0,0 +1,36 @@
+# RustFS Scanner
+
+RustFS Scanner provides scanning capabilities for data integrity checks, health monitoring, and storage analysis.
+
+## Features
+
+- Data integrity scanning
+- Health monitoring
+- Storage analysis
+- Extensible scanning framework
+
+## Usage Example
+
+```rust
+use rustfs_scanner::ScannerError;
+
+// TODO: add a usage example
+```
+
+## Development
+
+### Build
+
+```bash
+cargo build --package rustfs-scanner
+```
+
+### Test
+
+```bash
+cargo test --package rustfs-scanner
+```
+
+## License
+
+Apache License 2.0
diff --git a/crates/scanner/src/data_usage.rs b/crates/scanner/src/data_usage.rs
new file mode 100644
index 00000000..cd8feeea
--- /dev/null
+++ b/crates/scanner/src/data_usage.rs
@@ -0,0 +1,39 @@
+// Copyright 2024 RustFS Team
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+ +use rustfs_ecstore::disk::{BUCKET_META_PREFIX, RUSTFS_META_BUCKET}; +use rustfs_utils::path::SLASH_SEPARATOR; +use std::sync::LazyLock; + +// Data usage constants +pub const DATA_USAGE_ROOT: &str = SLASH_SEPARATOR; + +const DATA_USAGE_OBJ_NAME: &str = ".usage.json"; + +const DATA_USAGE_BLOOM_NAME: &str = ".bloomcycle.bin"; + +pub const DATA_USAGE_CACHE_NAME: &str = ".usage-cache.bin"; + +// Data usage paths (computed at runtime) +pub static DATA_USAGE_BUCKET: LazyLock = + LazyLock::new(|| format!("{RUSTFS_META_BUCKET}{SLASH_SEPARATOR}{BUCKET_META_PREFIX}")); + +pub static DATA_USAGE_OBJ_NAME_PATH: LazyLock = + LazyLock::new(|| format!("{BUCKET_META_PREFIX}{SLASH_SEPARATOR}{DATA_USAGE_OBJ_NAME}")); + +pub static DATA_USAGE_BLOOM_NAME_PATH: LazyLock = + LazyLock::new(|| format!("{BUCKET_META_PREFIX}{SLASH_SEPARATOR}{DATA_USAGE_BLOOM_NAME}")); + +pub static BACKGROUND_HEAL_INFO_PATH: LazyLock = + LazyLock::new(|| format!("{BUCKET_META_PREFIX}{SLASH_SEPARATOR}.background-heal.json")); diff --git a/crates/scanner/src/data_usage_define.rs b/crates/scanner/src/data_usage_define.rs new file mode 100644 index 00000000..b0eee07c --- /dev/null +++ b/crates/scanner/src/data_usage_define.rs @@ -0,0 +1,1602 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
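The data-usage paths in `data_usage.rs` above are composed lazily from ecstore constants the first time they are read. A small self-contained sketch of that `LazyLock` composition; the literal values chosen for `RUSTFS_META_BUCKET` and `BUCKET_META_PREFIX` below are placeholder assumptions for illustration, not the crate's real definitions:

```rust
use std::sync::LazyLock;

// Placeholder values; the real constants come from rustfs_ecstore::disk
// and rustfs_utils::path.
const RUSTFS_META_BUCKET: &str = ".rustfs.sys"; // assumed value
const BUCKET_META_PREFIX: &str = "buckets";     // assumed value
const SLASH_SEPARATOR: &str = "/";

const DATA_USAGE_OBJ_NAME: &str = ".usage.json";

// Each path is formatted once, on first access, then reused.
static DATA_USAGE_BUCKET: LazyLock<String> =
    LazyLock::new(|| format!("{RUSTFS_META_BUCKET}{SLASH_SEPARATOR}{BUCKET_META_PREFIX}"));

static DATA_USAGE_OBJ_NAME_PATH: LazyLock<String> =
    LazyLock::new(|| format!("{BUCKET_META_PREFIX}{SLASH_SEPARATOR}{DATA_USAGE_OBJ_NAME}"));

fn main() {
    println!("{}", *DATA_USAGE_BUCKET);        // ".rustfs.sys/buckets" under the assumed values
    println!("{}", *DATA_USAGE_OBJ_NAME_PATH); // "buckets/.usage.json" under the assumed values
}
```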
+ +use path_clean::PathClean; +use s3s::dto::BucketLifecycleConfiguration; +use serde::{Deserialize, Serialize}; +use std::{ + collections::{HashMap, HashSet}, + hash::{DefaultHasher, Hash, Hasher}, + path::Path, + sync::{Arc, LazyLock}, + time::SystemTime, +}; + +use http::HeaderMap; +use rustfs_ecstore::{ + StorageAPI, + bucket::{lifecycle::lifecycle::TRANSITION_COMPLETE, replication::ReplicationConfig}, + config::{com::save_config, storageclass}, + disk::{BUCKET_META_PREFIX, RUSTFS_META_BUCKET}, + error::{Error, Result as StorageResult, StorageError}, + store_api::{ObjectInfo, ObjectOptions}, +}; +use rustfs_utils::path::{SLASH_SEPARATOR, path_join_buf}; +use tokio::time::{Duration, sleep, timeout}; +use tracing::{error, warn}; + +// Data usage constants +pub const DATA_USAGE_ROOT: &str = SLASH_SEPARATOR; + +const DATA_USAGE_OBJ_NAME: &str = ".usage.json"; + +const DATA_USAGE_BLOOM_NAME: &str = ".bloomcycle.bin"; + +pub const DATA_USAGE_CACHE_NAME: &str = ".usage-cache.bin"; + +// Data usage paths (computed at runtime) +pub static DATA_USAGE_BUCKET: LazyLock = + LazyLock::new(|| format!("{RUSTFS_META_BUCKET}{SLASH_SEPARATOR}{BUCKET_META_PREFIX}")); + +pub static DATA_USAGE_OBJ_NAME_PATH: LazyLock = + LazyLock::new(|| format!("{BUCKET_META_PREFIX}{SLASH_SEPARATOR}{DATA_USAGE_OBJ_NAME}")); + +pub static DATA_USAGE_BLOOM_NAME_PATH: LazyLock = + LazyLock::new(|| format!("{BUCKET_META_PREFIX}{SLASH_SEPARATOR}{DATA_USAGE_BLOOM_NAME}")); + +pub static BACKGROUND_HEAL_INFO_PATH: LazyLock = + LazyLock::new(|| format!("{BUCKET_META_PREFIX}{SLASH_SEPARATOR}.background-heal.json")); + +#[derive(Clone, Copy, Default, Debug, Serialize, Deserialize, PartialEq)] +pub struct TierStats { + pub total_size: u64, + pub num_versions: i32, + pub num_objects: i32, +} + +impl TierStats { + pub fn add(&self, u: &TierStats) -> TierStats { + TierStats { + total_size: self.total_size + u.total_size, + num_versions: self.num_versions + u.num_versions, + num_objects: self.num_objects + u.num_objects, + } + } + + pub fn from_object_info(oi: &ObjectInfo) -> Self { + TierStats { + total_size: oi.size as u64, + num_versions: 1, + num_objects: if oi.is_latest { 1 } else { 0 }, + } + } +} + +#[derive(Clone, Debug, Default, Serialize, Deserialize, PartialEq)] +pub struct AllTierStats { + pub tiers: HashMap, +} + +impl AllTierStats { + pub fn new() -> Self { + Self { tiers: HashMap::new() } + } + + pub fn add_sizes(&mut self, tiers: HashMap) { + for (tier, st) in tiers { + self.tiers + .insert(tier.clone(), self.tiers.get(&tier).unwrap_or(&TierStats::default()).add(&st)); + } + } + + pub fn merge(&mut self, other: AllTierStats) { + for (tier, st) in other.tiers { + self.tiers + .insert(tier.clone(), self.tiers.get(&tier).unwrap_or(&TierStats::default()).add(&st)); + } + } + + pub fn populate_stats(&self, stats: &mut HashMap) { + for (tier, st) in &self.tiers { + stats.insert( + tier.clone(), + TierStats { + total_size: st.total_size, + num_versions: st.num_versions, + num_objects: st.num_objects, + }, + ); + } + } +} + +/// Bucket target usage info provides replication statistics +#[derive(Debug, Default, Clone, Serialize, Deserialize)] +pub struct BucketTargetUsageInfo { + pub replication_pending_size: u64, + pub replication_failed_size: u64, + pub replicated_size: u64, + pub replica_size: u64, + pub replication_pending_count: u64, + pub replication_failed_count: u64, + pub replicated_count: u64, +} + +/// Bucket usage info provides bucket-level statistics +#[derive(Debug, Default, Clone, Serialize, Deserialize)] +pub 
struct BucketUsageInfo { + pub size: u64, + // Following five fields suffixed with V1 are here for backward compatibility + // Total Size for objects that have not yet been replicated + pub replication_pending_size_v1: u64, + // Total size for objects that have witness one or more failures and will be retried + pub replication_failed_size_v1: u64, + // Total size for objects that have been replicated to destination + pub replicated_size_v1: u64, + // Total number of objects pending replication + pub replication_pending_count_v1: u64, + // Total number of objects that failed replication + pub replication_failed_count_v1: u64, + + pub objects_count: u64, + pub object_size_histogram: HashMap, + pub object_versions_histogram: HashMap, + pub versions_count: u64, + pub delete_markers_count: u64, + pub replica_size: u64, + pub replica_count: u64, + pub replication_info: HashMap, +} + +/// DataUsageInfo represents data usage stats of the underlying storage +#[derive(Debug, Default, Clone, Serialize, Deserialize)] +pub struct DataUsageInfo { + /// Total capacity + pub total_capacity: u64, + /// Total used capacity + pub total_used_capacity: u64, + /// Total free capacity + pub total_free_capacity: u64, + + /// LastUpdate is the timestamp of when the data usage info was last updated + pub last_update: Option, + + /// Objects total count across all buckets + pub objects_total_count: u64, + /// Versions total count across all buckets + pub versions_total_count: u64, + /// Delete markers total count across all buckets + pub delete_markers_total_count: u64, + /// Objects total size across all buckets + pub objects_total_size: u64, + /// Replication info across all buckets + pub replication_info: HashMap, + + /// Total number of buckets in this cluster + pub buckets_count: u64, + /// Buckets usage info provides following information across all buckets + pub buckets_usage: HashMap, + /// Deprecated kept here for backward compatibility reasons + pub bucket_sizes: HashMap, + /// Per-disk snapshot information when available + #[serde(default)] + pub disk_usage_status: Vec, +} + +/// Metadata describing the status of a disk-level data usage snapshot. 
+#[derive(Debug, Default, Clone, Serialize, Deserialize)] +pub struct DiskUsageStatus { + pub disk_id: String, + pub pool_index: Option, + pub set_index: Option, + pub disk_index: Option, + pub last_update: Option, + pub snapshot_exists: bool, +} + +/// Size summary for a single object or group of objects +#[derive(Debug, Default, Clone)] +pub struct SizeSummary { + /// Total size + pub total_size: usize, + /// Number of versions + pub versions: usize, + /// Number of delete markers + pub delete_markers: usize, + /// Replicated size + pub replicated_size: i64, + /// Replicated count + pub replicated_count: usize, + /// Pending size + pub pending_size: i64, + /// Failed size + pub failed_size: i64, + /// Replica size + pub replica_size: i64, + /// Replica count + pub replica_count: usize, + /// Pending count + pub pending_count: usize, + /// Failed count + pub failed_count: usize, + /// Replication target stats + pub repl_target_stats: HashMap, + pub tier_stats: HashMap, +} + +impl SizeSummary { + pub fn actions_accounting(&mut self, oi: &ObjectInfo, size: i64, actual_size: i64) { + if oi.delete_marker { + self.delete_markers += 1; + } + + if oi.version_id.is_some_and(|v| !v.is_nil()) && size == actual_size { + self.versions += 1; + } + + self.total_size += if size > 0 { size as usize } else { 0 }; + + if oi.delete_marker || oi.transitioned_object.free_version { + return; + } + + let mut tier = oi.storage_class.clone().unwrap_or(storageclass::STANDARD.to_string()); + if oi.transitioned_object.status == TRANSITION_COMPLETE { + tier = oi.transitioned_object.tier.clone(); + } + + if let Some(tier_stats) = self.tier_stats.get_mut(&tier) { + tier_stats.add(&TierStats::from_object_info(oi)); + } + } +} + +/// Replication target size summary +#[derive(Debug, Default, Clone)] +pub struct ReplTargetSizeSummary { + /// Replicated size + pub replicated_size: i64, + /// Replicated count + pub replicated_count: usize, + /// Pending size + pub pending_size: i64, + /// Failed size + pub failed_size: i64, + /// Pending count + pub pending_count: usize, + /// Failed count + pub failed_count: usize, +} + +// ===== Cache-related data structures ===== + +/// Data usage hash for path-based caching +#[derive(Clone, Debug, Default, Eq, PartialEq)] +pub struct DataUsageHash(pub String); + +impl DataUsageHash { + pub fn string(&self) -> String { + self.0.clone() + } + + pub fn key(&self) -> String { + self.0.clone() + } + + pub fn mod_(&self, cycle: u32, cycles: u32) -> bool { + if cycles <= 1 { + return cycles == 1; + } + + let hash = self.calculate_hash(); + hash as u32 % cycles == cycle % cycles + } + + pub fn mod_alt(&self, cycle: u32, cycles: u32) -> bool { + if cycles <= 1 { + return cycles == 1; + } + + let hash = self.calculate_hash(); + + warn!( + "mod_alt: key: {} hash: {} cycle: {} cycles: {} mod: {} hash >> 32: {} result: {}", + self.0, + hash, + cycle, + cycles, + (hash >> 32) as u32, + (hash >> 32) as u32 % cycles, + cycle % cycles, + ); + + (hash >> 32) as u32 % cycles == cycle % cycles + } + + fn calculate_hash(&self) -> u64 { + let mut hasher = DefaultHasher::new(); + self.0.hash(&mut hasher); + hasher.finish() + } +} + +/// Data usage hash map type +pub type DataUsageHashMap = HashSet; + +/// Size histogram for object size distribution +#[derive(Clone, Debug, Serialize, Deserialize)] +pub struct SizeHistogram(Vec); + +impl Default for SizeHistogram { + fn default() -> Self { + Self(vec![0; 11]) // DATA_USAGE_BUCKET_LEN = 11 + } +} + +impl SizeHistogram { + pub fn add(&mut self, size: u64) { + let 
intervals = [ + (0, 1024), // LESS_THAN_1024_B + (1024, 64 * 1024 - 1), // BETWEEN_1024_B_AND_64_KB + (64 * 1024, 256 * 1024 - 1), // BETWEEN_64_KB_AND_256_KB + (256 * 1024, 512 * 1024 - 1), // BETWEEN_256_KB_AND_512_KB + (512 * 1024, 1024 * 1024 - 1), // BETWEEN_512_KB_AND_1_MB + (1024, 1024 * 1024 - 1), // BETWEEN_1024B_AND_1_MB + (1024 * 1024, 10 * 1024 * 1024 - 1), // BETWEEN_1_MB_AND_10_MB + (10 * 1024 * 1024, 64 * 1024 * 1024 - 1), // BETWEEN_10_MB_AND_64_MB + (64 * 1024 * 1024, 128 * 1024 * 1024 - 1), // BETWEEN_64_MB_AND_128_MB + (128 * 1024 * 1024, 512 * 1024 * 1024 - 1), // BETWEEN_128_MB_AND_512_MB + (512 * 1024 * 1024, u64::MAX), // GREATER_THAN_512_MB + ]; + + for (idx, (start, end)) in intervals.iter().enumerate() { + if size >= *start && size <= *end { + self.0[idx] += 1; + break; + } + } + } + + pub fn to_map(&self) -> HashMap { + let names = [ + "LESS_THAN_1024_B", + "BETWEEN_1024_B_AND_64_KB", + "BETWEEN_64_KB_AND_256_KB", + "BETWEEN_256_KB_AND_512_KB", + "BETWEEN_512_KB_AND_1_MB", + "BETWEEN_1024B_AND_1_MB", + "BETWEEN_1_MB_AND_10_MB", + "BETWEEN_10_MB_AND_64_MB", + "BETWEEN_64_MB_AND_128_MB", + "BETWEEN_128_MB_AND_512_MB", + "GREATER_THAN_512_MB", + ]; + + let mut res = HashMap::new(); + let mut spl_count = 0; + for (count, name) in self.0.iter().zip(names.iter()) { + if name == &"BETWEEN_1024B_AND_1_MB" { + res.insert(name.to_string(), spl_count); + } else if name.starts_with("BETWEEN_") && name.contains("_KB_") && name.contains("_MB") { + spl_count += count; + res.insert(name.to_string(), *count); + } else { + res.insert(name.to_string(), *count); + } + } + res + } +} + +/// Versions histogram for version count distribution +#[derive(Clone, Debug, Serialize, Deserialize)] +pub struct VersionsHistogram(Vec); + +impl Default for VersionsHistogram { + fn default() -> Self { + Self(vec![0; 7]) // DATA_USAGE_VERSION_LEN = 7 + } +} + +impl VersionsHistogram { + pub fn add(&mut self, count: u64) { + let intervals = [ + (0, 0), // UNVERSIONED + (1, 1), // SINGLE_VERSION + (2, 9), // BETWEEN_2_AND_10 + (10, 99), // BETWEEN_10_AND_100 + (100, 999), // BETWEEN_100_AND_1000 + (1000, 9999), // BETWEEN_1000_AND_10000 + (10000, u64::MAX), // GREATER_THAN_10000 + ]; + + for (idx, (start, end)) in intervals.iter().enumerate() { + if count >= *start && count <= *end { + self.0[idx] += 1; + break; + } + } + } + + pub fn to_map(&self) -> HashMap { + let names = [ + "UNVERSIONED", + "SINGLE_VERSION", + "BETWEEN_2_AND_10", + "BETWEEN_10_AND_100", + "BETWEEN_100_AND_1000", + "BETWEEN_1000_AND_10000", + "GREATER_THAN_10000", + ]; + + let mut res = HashMap::new(); + for (count, name) in self.0.iter().zip(names.iter()) { + res.insert(name.to_string(), *count); + } + res + } +} + +/// Replication statistics for a single target +#[derive(Debug, Default, Clone, Serialize, Deserialize)] +pub struct ReplicationStats { + pub pending_size: u64, + pub replicated_size: u64, + pub failed_size: u64, + pub failed_count: u64, + pub pending_count: u64, + pub missed_threshold_size: u64, + pub after_threshold_size: u64, + pub missed_threshold_count: u64, + pub after_threshold_count: u64, + pub replicated_count: u64, +} + +impl ReplicationStats { + pub fn empty(&self) -> bool { + self.replicated_size == 0 && self.failed_size == 0 && self.failed_count == 0 + } +} + +/// Replication statistics for all targets +#[derive(Debug, Default, Clone, Serialize, Deserialize)] +pub struct ReplicationAllStats { + pub targets: HashMap, + pub replica_size: u64, + pub replica_count: u64, +} + +impl ReplicationAllStats { + pub 
fn empty(&self) -> bool { + if self.replica_size != 0 && self.replica_count != 0 { + return false; + } + for (_, v) in self.targets.iter() { + if !v.empty() { + return false; + } + } + true + } +} + +/// Data usage cache entry +#[derive(Clone, Debug, Default, Serialize, Deserialize)] +pub struct DataUsageEntry { + pub children: DataUsageHashMap, + // These fields do not include any children. + pub size: usize, + pub objects: usize, + pub versions: usize, + pub delete_markers: usize, + pub obj_sizes: SizeHistogram, + pub obj_versions: VersionsHistogram, + pub replication_stats: Option, + pub compacted: bool, +} + +impl DataUsageEntry { + pub fn add_child(&mut self, hash: &DataUsageHash) { + if self.children.contains(&hash.key()) { + return; + } + self.children.insert(hash.key()); + } + + pub fn add_sizes(&mut self, summary: &SizeSummary) { + self.size += summary.total_size; + self.versions += summary.versions; + self.delete_markers += summary.delete_markers; + self.obj_sizes.add(summary.total_size as u64); + self.obj_versions.add(summary.versions as u64); + + let replication_stats = if self.replication_stats.is_none() { + self.replication_stats = Some(ReplicationAllStats::default()); + self.replication_stats.as_mut().unwrap() + } else { + self.replication_stats.as_mut().unwrap() + }; + replication_stats.replica_size += summary.replica_size as u64; + replication_stats.replica_count += summary.replica_count as u64; + + for (arn, st) in &summary.repl_target_stats { + let tgt_stat = replication_stats + .targets + .entry(arn.to_string()) + .or_insert(ReplicationStats::default()); + tgt_stat.pending_size += st.pending_size as u64; + tgt_stat.failed_size += st.failed_size as u64; + tgt_stat.replicated_size += st.replicated_size as u64; + tgt_stat.replicated_count += st.replicated_count as u64; + tgt_stat.failed_count += st.failed_count as u64; + tgt_stat.pending_count += st.pending_count as u64; + } + } + + pub fn merge(&mut self, other: &DataUsageEntry) { + self.objects += other.objects; + self.versions += other.versions; + self.delete_markers += other.delete_markers; + self.size += other.size; + + if let Some(o_rep) = &other.replication_stats { + if self.replication_stats.is_none() { + self.replication_stats = Some(ReplicationAllStats::default()); + } + let s_rep = self.replication_stats.as_mut().unwrap(); + s_rep.targets.clear(); + s_rep.replica_size += o_rep.replica_size; + s_rep.replica_count += o_rep.replica_count; + for (arn, stat) in o_rep.targets.iter() { + let st = s_rep.targets.entry(arn.clone()).or_default(); + *st = ReplicationStats { + pending_size: stat.pending_size + st.pending_size, + failed_size: stat.failed_size + st.failed_size, + replicated_size: stat.replicated_size + st.replicated_size, + pending_count: stat.pending_count + st.pending_count, + failed_count: stat.failed_count + st.failed_count, + replicated_count: stat.replicated_count + st.replicated_count, + ..Default::default() + }; + } + } + + for (i, v) in other.obj_sizes.0.iter().enumerate() { + self.obj_sizes.0[i] += v; + } + + for (i, v) in other.obj_versions.0.iter().enumerate() { + self.obj_versions.0[i] += v; + } + } +} + +#[derive(Clone, Debug, Default, Serialize, Deserialize)] +pub struct DataUsageEntryInfo { + pub name: String, + pub parent: String, + pub entry: DataUsageEntry, +} + +/// Data usage cache info +#[derive(Clone, Debug, Default, Serialize, Deserialize)] +pub struct DataUsageCacheInfo { + pub name: String, + pub next_cycle: u64, + pub last_update: Option, + pub skip_healing: bool, + pub lifecycle: 
Option>, + pub replication: Option>, +} + +/// Data usage cache +#[derive(Clone, Debug, Default, Serialize, Deserialize)] +pub struct DataUsageCache { + pub info: DataUsageCacheInfo, + pub cache: HashMap, +} + +impl DataUsageCache { + pub fn replace(&mut self, path: &str, parent: &str, e: DataUsageEntry) { + let hash = hash_path(path); + self.cache.insert(hash.key(), e); + if !parent.is_empty() { + let phash = hash_path(parent); + let p = { + let p = self.cache.entry(phash.key()).or_default(); + p.add_child(&hash); + p.clone() + }; + self.cache.insert(phash.key(), p); + } + } + + pub fn replace_hashed(&mut self, hash: &DataUsageHash, parent: &Option, e: &DataUsageEntry) { + self.cache.insert(hash.key(), e.clone()); + if let Some(parent) = parent { + self.cache.entry(parent.key()).or_default().add_child(hash); + } + } + + pub fn find(&self, path: &str) -> Option { + self.cache.get(&hash_path(path).key()).cloned() + } + + pub fn find_children_copy(&mut self, h: DataUsageHash) -> DataUsageHashMap { + self.cache.entry(h.string()).or_default().children.clone() + } + + pub fn flatten(&self, root: &DataUsageEntry) -> DataUsageEntry { + let mut root = root.clone(); + for id in root.children.clone().iter() { + if let Some(e) = self.cache.get(id) { + let mut e = e.clone(); + if !e.children.is_empty() { + e = self.flatten(&e); + } + root.merge(&e); + } + } + root.children.clear(); + root + } + + pub fn copy_with_children(&mut self, src: &DataUsageCache, hash: &DataUsageHash, parent: &Option) { + if let Some(e) = src.cache.get(&hash.string()) { + self.cache.insert(hash.key(), e.clone()); + for ch in e.children.iter() { + if *ch == hash.key() { + return; + } + self.copy_with_children(src, &DataUsageHash(ch.to_string()), &Some(hash.clone())); + } + if let Some(parent) = parent { + let p = self.cache.entry(parent.key()).or_default(); + p.add_child(hash); + } + } + } + + pub fn delete_recursive(&mut self, hash: &DataUsageHash) { + let mut need_remove = Vec::new(); + if let Some(v) = self.cache.get(&hash.string()) { + for child in v.children.iter() { + need_remove.push(child.clone()); + } + } + self.cache.remove(&hash.string()); + need_remove.iter().for_each(|child| { + self.delete_recursive(&DataUsageHash(child.to_string())); + }); + } + + pub fn size_recursive(&self, path: &str) -> Option { + match self.find(path) { + Some(root) => { + if root.children.is_empty() { + return Some(root); + } + let mut flat = self.flatten(&root); + if flat.replication_stats.is_some() && flat.replication_stats.as_ref().unwrap().empty() { + flat.replication_stats = None; + } + Some(flat) + } + None => None, + } + } + + pub fn search_parent(&self, hash: &DataUsageHash) -> Option { + let want = hash.key(); + if let Some(last_index) = want.rfind('/') { + if let Some(v) = self.find(&want[0..last_index]) { + if v.children.contains(&want) { + let found = hash_path(&want[0..last_index]); + return Some(found); + } + } + } + + for (k, v) in self.cache.iter() { + if v.children.contains(&want) { + let found = DataUsageHash(k.clone()); + return Some(found); + } + } + None + } + + pub fn is_compacted(&self, hash: &DataUsageHash) -> bool { + match self.cache.get(&hash.key()) { + Some(due) => due.compacted, + None => false, + } + } + + pub fn force_compact(&mut self, limit: usize) { + if self.cache.len() < limit { + return; + } + let top = hash_path(&self.info.name).key(); + let top_e = match self.find(&top) { + Some(e) => e, + None => return, + }; + // Note: DATA_SCANNER_FORCE_COMPACT_AT_FOLDERS constant would need to be passed as 
parameter + // or defined in common crate if needed + if top_e.children.len() > 250_000 { + // DATA_SCANNER_FORCE_COMPACT_AT_FOLDERS + self.reduce_children_of(&hash_path(&self.info.name), limit, true); + } + if self.cache.len() <= limit { + return; + } + + let mut found = HashSet::new(); + found.insert(top); + mark(self, &top_e, &mut found); + self.cache.retain(|k, _| { + if !found.contains(k) { + return false; + } + true + }); + } + + pub fn reduce_children_of(&mut self, path: &DataUsageHash, limit: usize, compact_self: bool) { + let e = match self.cache.get(&path.key()) { + Some(e) => e, + None => return, + }; + + if e.compacted { + return; + } + + if e.children.len() > limit && compact_self { + let mut flat = self.size_recursive(&path.key()).unwrap_or_default(); + flat.compacted = true; + self.delete_recursive(path); + self.replace_hashed(path, &None, &flat); + return; + } + let total = self.total_children_rec(&path.key()); + if total < limit { + return; + } + + let mut leaves = Vec::new(); + let mut remove = total - limit; + add(self, path, &mut leaves); + leaves.sort_by(|a, b| a.objects.cmp(&b.objects)); + + while remove > 0 && !leaves.is_empty() { + let e = leaves.first().unwrap(); + let candidate = e.path.clone(); + if candidate == *path && !compact_self { + break; + } + let removing = self.total_children_rec(&candidate.key()); + let mut flat = match self.size_recursive(&candidate.key()) { + Some(flat) => flat, + None => { + leaves.remove(0); + continue; + } + }; + + flat.compacted = true; + self.delete_recursive(&candidate); + self.replace_hashed(&candidate, &None, &flat); + + remove -= removing; + leaves.remove(0); + } + } + + pub fn total_children_rec(&self, path: &str) -> usize { + let root = self.find(path); + + if root.is_none() { + return 0; + } + let root = root.unwrap(); + if root.children.is_empty() { + return 0; + } + + let mut n = root.children.len(); + for ch in root.children.iter() { + n += self.total_children_rec(ch); + } + n + } + + pub fn merge(&mut self, o: &DataUsageCache) { + let mut existing_root = self.root(); + let other_root = o.root(); + if existing_root.is_none() && other_root.is_none() { + return; + } + if other_root.is_none() { + return; + } + if existing_root.is_none() { + *self = o.clone(); + return; + } + if o.info.last_update.gt(&self.info.last_update) { + self.info.last_update = o.info.last_update; + } + + existing_root.as_mut().unwrap().merge(other_root.as_ref().unwrap()); + self.cache.insert(hash_path(&self.info.name).key(), existing_root.unwrap()); + let e_hash = self.root_hash(); + for key in other_root.as_ref().unwrap().children.iter() { + let entry = &o.cache[key]; + let flat = o.flatten(entry); + let mut existing = self.cache[key].clone(); + existing.merge(&flat); + self.replace_hashed(&DataUsageHash(key.clone()), &Some(e_hash.clone()), &existing); + } + } + + pub fn root_hash(&self) -> DataUsageHash { + hash_path(&self.info.name) + } + + pub fn root(&self) -> Option { + self.find(&self.info.name) + } + + /// Convert cache to DataUsageInfo for a specific path + pub fn dui(&self, path: &str, buckets: &[String]) -> DataUsageInfo { + let e = match self.find(path) { + Some(e) => e, + None => return DataUsageInfo::default(), + }; + let flat = self.flatten(&e); + + let mut buckets_usage = HashMap::new(); + for bucket_name in buckets.iter() { + let e = match self.find(bucket_name) { + Some(e) => e, + None => continue, + }; + let flat = self.flatten(&e); + let mut bui = BucketUsageInfo { + size: flat.size as u64, + versions_count: flat.versions as 
u64, + objects_count: flat.objects as u64, + delete_markers_count: flat.delete_markers as u64, + object_size_histogram: flat.obj_sizes.to_map(), + object_versions_histogram: flat.obj_versions.to_map(), + ..Default::default() + }; + + if let Some(rs) = &flat.replication_stats { + bui.replica_size = rs.replica_size; + bui.replica_count = rs.replica_count; + + for (arn, stat) in rs.targets.iter() { + bui.replication_info.insert( + arn.clone(), + BucketTargetUsageInfo { + replication_pending_size: stat.pending_size, + replicated_size: stat.replicated_size, + replication_failed_size: stat.failed_size, + replication_pending_count: stat.pending_count, + replication_failed_count: stat.failed_count, + replicated_count: stat.replicated_count, + ..Default::default() + }, + ); + } + } + buckets_usage.insert(bucket_name.clone(), bui); + } + + DataUsageInfo { + last_update: self.info.last_update, + objects_total_count: flat.objects as u64, + versions_total_count: flat.versions as u64, + delete_markers_total_count: flat.delete_markers as u64, + objects_total_size: flat.size as u64, + buckets_count: e.children.len() as u64, + buckets_usage, + ..Default::default() + } + } + + pub fn marshal_msg(&self) -> Result, Box> { + let mut buf = Vec::new(); + self.serialize(&mut rmp_serde::Serializer::new(&mut buf))?; + Ok(buf) + } + + pub fn unmarshal(buf: &[u8]) -> Result> { + let t: Self = rmp_serde::from_slice(buf)?; + Ok(t) + } + + /// Only backend errors are returned as errors. + /// The loader is optimistic and has no locking, but tries 5 times before giving up. + /// If the object is not found, a nil error with empty data usage cache is returned. + pub async fn load(&mut self, store: Arc, name: &str) -> StorageResult<()> { + // By default, empty data usage cache + *self = DataUsageCache::default(); + + // Caches are read+written without locks + let mut retries = 0; + while retries < 5 { + let (should_retry, cache_opt, result) = Self::try_load_inner(store.clone(), name, Duration::from_secs(60)).await; + result?; + if let Some(cache) = cache_opt { + *self = cache; + return Ok(()); + } + if !should_retry { + break; + } + + // Try backup file + let backup_name = format!("{name}.bkp"); + let (backup_retry, backup_cache_opt, backup_result) = + Self::try_load_inner(store.clone(), &backup_name, Duration::from_secs(30)).await; + if backup_result.is_err() { + // Error loading backup, continue retry + } else if let Some(cache) = backup_cache_opt { + // Only return when we have valid data from the backup + *self = cache; + return Ok(()); + } else if !backup_retry { + // Backup not found and not retryable + break; + } + + retries += 1; + // Random sleep between 0 and 1 second + let sleep_ms: u64 = rand::random::() % 1000; + sleep(Duration::from_millis(sleep_ms)).await; + } + + if retries == 5 { + warn!("maximum retry reached to load the data usage cache `{}`", name); + } + + Ok(()) + } + // Inner load function that attempts to load from a specific path + // Returns (should_retry, cache_option, error_option) + async fn try_load_inner( + store: Arc, + load_name: &str, + timeout_duration: Duration, + ) -> (bool, Option, StorageResult<()>) { + // Abandon if more than time.Minute, so we don't hold up scanner. + // drive timeout by default is 2 minutes, we do not need to wait longer. 
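+        // In this port the bound is the caller-supplied timeout_duration: load() uses 60s for the primary object and 30s for the .bkp fallback.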
+ let load_fut = async { + // First try: RUSTFS_META_BUCKET + BUCKET_META_PREFIX/name + let path = path_join_buf(&[BUCKET_META_PREFIX, load_name]); + match store + .get_object_reader( + RUSTFS_META_BUCKET, + &path, + None, + HeaderMap::new(), + &ObjectOptions { + no_lock: true, + ..Default::default() + }, + ) + .await + { + Ok(mut reader) => { + match reader.read_all().await { + Ok(data) => { + match DataUsageCache::unmarshal(&data) { + Ok(cache) => Ok(Some(cache)), + Err(_) => { + // Deserialization failed, but we got data + Ok(None) + } + } + } + Err(e) => { + // Read error + Err(e) + } + } + } + Err(err) => { + match err { + Error::FileNotFound | Error::VolumeNotFound | Error::ObjectNotFound(_, _) | Error::BucketNotFound(_) => { + // Try second location: DATA_USAGE_BUCKET/name + match store + .get_object_reader( + &DATA_USAGE_BUCKET, + load_name, + None, + HeaderMap::new(), + &ObjectOptions { + no_lock: true, + ..Default::default() + }, + ) + .await + { + Ok(mut reader) => match reader.read_all().await { + Ok(data) => match DataUsageCache::unmarshal(&data) { + Ok(cache) => Ok(Some(cache)), + Err(_) => Ok(None), + }, + Err(e) => Err(e), + }, + Err(inner_err) => match inner_err { + Error::FileNotFound + | Error::VolumeNotFound + | Error::ObjectNotFound(_, _) + | Error::BucketNotFound(_) => { + // Object not found in both locations + Ok(None) + } + Error::ErasureReadQuorum => { + // InsufficientReadQuorum - retry + Ok(None) + } + _ => { + // Other storage errors - retry + if matches!( + inner_err, + Error::FaultyDisk | Error::DiskFull | Error::StorageFull | Error::SlowDown + ) { + return Ok(None); + } + Err(inner_err) + } + }, + } + } + Error::ErasureReadQuorum => { + // InsufficientReadQuorum - retry + Ok(None) + } + _ => { + // Other storage errors - retry + if matches!(err, Error::FaultyDisk | Error::DiskFull | Error::StorageFull | Error::SlowDown) { + return Ok(None); + } + Err(err) + } + } + } + } + }; + + match timeout(timeout_duration, load_fut).await { + Ok(result) => match result { + Ok(Some(cache)) => (false, Some(cache), Ok(())), + Ok(None) => { + // Not found or deserialization failed - check if we should retry + // For now, we don't retry on not found + (false, None, Ok(())) + } + Err(e) => { + // Check if it's a retryable error + if matches!( + e, + Error::ErasureReadQuorum | Error::FaultyDisk | Error::DiskFull | Error::StorageFull | Error::SlowDown + ) { + (true, None, Ok(())) + } else { + (false, None, Err(e)) + } + } + }, + Err(_) => { + // Timeout - retry + (true, None, Ok(())) + } + } + } + + pub async fn save(&self, store: Arc, name: &str) -> StorageResult<()> { + let mut buf = Vec::new(); + self.serialize(&mut rmp_serde::Serializer::new(&mut buf))?; + + let store_clone = store.clone(); + let buf_clone = buf.clone(); + let res = timeout(Duration::from_secs(5), async move { + save_config(store_clone, name, buf_clone).await?; + Ok::<(), StorageError>(()) + }) + .await + .map_err(|e| StorageError::other(format!("Failed to save data usage cache: {e}")))?; + + if let Err(e) = res { + error!("Failed to save data usage cache: {e}"); + return Err(e); + } + + let store_clone = store.clone(); + let backup_name = format!("{name}.bkp"); + let res = timeout(Duration::from_secs(5), async move { + save_config(store_clone, backup_name.as_str(), buf).await?; + Ok::<(), StorageError>(()) + }) + .await + .map_err(|e| StorageError::other(format!("Failed to save data usage cache: {e}")))?; + if let Err(e) = res { + error!("Failed to save data usage cache backup: {e}"); + return Err(e); + } 
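+        // Reaching this point means both the primary cache object and its .bkp copy were persisted, each within a 5s timeout.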
+ Ok(()) + } +} + +/// Trait for storage-specific operations on DataUsageCache +#[async_trait::async_trait] +pub trait DataUsageCacheStorage { + /// Load data usage cache from backend storage + async fn load(store: &dyn std::any::Any, name: &str) -> Result> + where + Self: Sized; + + /// Save data usage cache to backend storage + async fn save(&self, name: &str) -> Result<(), Box>; +} + +// Helper structs and functions for cache operations +#[derive(Default, Clone)] +struct Inner { + objects: usize, + path: DataUsageHash, +} + +fn add(data_usage_cache: &DataUsageCache, path: &DataUsageHash, leaves: &mut Vec) { + let e = match data_usage_cache.cache.get(&path.key()) { + Some(e) => e, + None => return, + }; + if !e.children.is_empty() { + return; + } + + let sz = data_usage_cache.size_recursive(&path.key()).unwrap_or_default(); + leaves.push(Inner { + objects: sz.objects, + path: path.clone(), + }); + for ch in e.children.iter() { + add(data_usage_cache, &DataUsageHash(ch.clone()), leaves); + } +} + +fn mark(duc: &DataUsageCache, entry: &DataUsageEntry, found: &mut HashSet) { + for k in entry.children.iter() { + found.insert(k.to_string()); + if let Some(ch) = duc.cache.get(k) { + mark(duc, ch, found); + } + } +} + +/// Hash a path for data usage caching +pub fn hash_path(data: &str) -> DataUsageHash { + DataUsageHash(Path::new(&data).clean().to_string_lossy().to_string()) +} + +impl DataUsageInfo { + /// Create a new DataUsageInfo + pub fn new() -> Self { + Self::default() + } + + /// Add object metadata to data usage statistics + pub fn add_object(&mut self, object_path: &str, meta_object: &rustfs_filemeta::MetaObject) { + // This method is kept for backward compatibility + // For accurate version counting, use add_object_from_file_meta instead + let bucket_name = match self.extract_bucket_from_path(object_path) { + Ok(name) => name, + Err(_) => return, + }; + + // Update bucket statistics + if let Some(bucket_usage) = self.buckets_usage.get_mut(&bucket_name) { + bucket_usage.size += meta_object.size as u64; + bucket_usage.objects_count += 1; + bucket_usage.versions_count += 1; // Simplified: assume 1 version per object + + // Update size histogram + let total_size = meta_object.size as u64; + let size_ranges = [ + ("0-1KB", 0, 1024), + ("1KB-1MB", 1024, 1024 * 1024), + ("1MB-10MB", 1024 * 1024, 10 * 1024 * 1024), + ("10MB-100MB", 10 * 1024 * 1024, 100 * 1024 * 1024), + ("100MB-1GB", 100 * 1024 * 1024, 1024 * 1024 * 1024), + ("1GB+", 1024 * 1024 * 1024, u64::MAX), + ]; + + for (range_name, min_size, max_size) in size_ranges { + if total_size >= min_size && total_size < max_size { + *bucket_usage.object_size_histogram.entry(range_name.to_string()).or_insert(0) += 1; + break; + } + } + + // Update version histogram (simplified - count as single version) + *bucket_usage + .object_versions_histogram + .entry("SINGLE_VERSION".to_string()) + .or_insert(0) += 1; + } else { + // Create new bucket usage + let mut bucket_usage = BucketUsageInfo { + size: meta_object.size as u64, + objects_count: 1, + versions_count: 1, + ..Default::default() + }; + bucket_usage.object_size_histogram.insert("0-1KB".to_string(), 1); + bucket_usage.object_versions_histogram.insert("SINGLE_VERSION".to_string(), 1); + self.buckets_usage.insert(bucket_name, bucket_usage); + } + + // Update global statistics + self.objects_total_size += meta_object.size as u64; + self.objects_total_count += 1; + self.versions_total_count += 1; + } + + /// Add object from FileMeta for accurate version counting + pub fn 
add_object_from_file_meta(&mut self, object_path: &str, file_meta: &rustfs_filemeta::FileMeta) { + let bucket_name = match self.extract_bucket_from_path(object_path) { + Ok(name) => name, + Err(_) => return, + }; + + // Calculate accurate statistics from all versions + let mut total_size = 0u64; + let mut versions_count = 0u64; + let mut delete_markers_count = 0u64; + let mut latest_object_size = 0u64; + + // Process all versions to get accurate counts + for version in &file_meta.versions { + match rustfs_filemeta::FileMetaVersion::try_from(version.clone()) { + Ok(ver) => { + if let Some(obj) = ver.object { + total_size += obj.size as u64; + versions_count += 1; + latest_object_size = obj.size as u64; // Keep track of latest object size + } else if ver.delete_marker.is_some() { + delete_markers_count += 1; + } + } + Err(_) => { + // Skip invalid versions + continue; + } + } + } + + // Update bucket statistics + if let Some(bucket_usage) = self.buckets_usage.get_mut(&bucket_name) { + bucket_usage.size += total_size; + bucket_usage.objects_count += 1; + bucket_usage.versions_count += versions_count; + bucket_usage.delete_markers_count += delete_markers_count; + + // Update size histogram based on latest object size + let size_ranges = [ + ("0-1KB", 0, 1024), + ("1KB-1MB", 1024, 1024 * 1024), + ("1MB-10MB", 1024 * 1024, 10 * 1024 * 1024), + ("10MB-100MB", 10 * 1024 * 1024, 100 * 1024 * 1024), + ("100MB-1GB", 100 * 1024 * 1024, 1024 * 1024 * 1024), + ("1GB+", 1024 * 1024 * 1024, u64::MAX), + ]; + + for (range_name, min_size, max_size) in size_ranges { + if latest_object_size >= min_size && latest_object_size < max_size { + *bucket_usage.object_size_histogram.entry(range_name.to_string()).or_insert(0) += 1; + break; + } + } + + // Update version histogram based on actual version count + let version_ranges = [ + ("1", 1, 1), + ("2-5", 2, 5), + ("6-10", 6, 10), + ("11-50", 11, 50), + ("51-100", 51, 100), + ("100+", 101, usize::MAX), + ]; + + for (range_name, min_versions, max_versions) in version_ranges { + if versions_count as usize >= min_versions && versions_count as usize <= max_versions { + *bucket_usage + .object_versions_histogram + .entry(range_name.to_string()) + .or_insert(0) += 1; + break; + } + } + } else { + // Create new bucket usage + let mut bucket_usage = BucketUsageInfo { + size: total_size, + objects_count: 1, + versions_count, + delete_markers_count, + ..Default::default() + }; + + // Set size histogram + let size_ranges = [ + ("0-1KB", 0, 1024), + ("1KB-1MB", 1024, 1024 * 1024), + ("1MB-10MB", 1024 * 1024, 10 * 1024 * 1024), + ("10MB-100MB", 10 * 1024 * 1024, 100 * 1024 * 1024), + ("100MB-1GB", 100 * 1024 * 1024, 1024 * 1024 * 1024), + ("1GB+", 1024 * 1024 * 1024, u64::MAX), + ]; + + for (range_name, min_size, max_size) in size_ranges { + if latest_object_size >= min_size && latest_object_size < max_size { + bucket_usage.object_size_histogram.insert(range_name.to_string(), 1); + break; + } + } + + // Set version histogram + let version_ranges = [ + ("1", 1, 1), + ("2-5", 2, 5), + ("6-10", 6, 10), + ("11-50", 11, 50), + ("51-100", 51, 100), + ("100+", 101, usize::MAX), + ]; + + for (range_name, min_versions, max_versions) in version_ranges { + if versions_count as usize >= min_versions && versions_count as usize <= max_versions { + bucket_usage.object_versions_histogram.insert(range_name.to_string(), 1); + break; + } + } + + self.buckets_usage.insert(bucket_name, bucket_usage); + // Update buckets count when adding new bucket + self.buckets_count = self.buckets_usage.len() as 
u64; + } + + // Update global statistics + self.objects_total_size += total_size; + self.objects_total_count += 1; + self.versions_total_count += versions_count; + self.delete_markers_total_count += delete_markers_count; + } + + /// Extract bucket name from object path + pub fn extract_bucket_from_path(&self, object_path: &str) -> Result> { + let parts: Vec<&str> = object_path.split('/').collect(); + if parts.is_empty() { + return Err("Invalid object path: empty".into()); + } + Ok(parts[0].to_string()) + } + + /// Update capacity information + pub fn update_capacity(&mut self, total: u64, used: u64, free: u64) { + self.total_capacity = total; + self.total_used_capacity = used; + self.total_free_capacity = free; + self.last_update = Some(SystemTime::now()); + } + + /// Add bucket usage info + pub fn add_bucket_usage(&mut self, bucket: String, usage: BucketUsageInfo) { + self.buckets_usage.insert(bucket.clone(), usage); + self.buckets_count = self.buckets_usage.len() as u64; + self.last_update = Some(SystemTime::now()); + } + + /// Get bucket usage info + pub fn get_bucket_usage(&self, bucket: &str) -> Option<&BucketUsageInfo> { + self.buckets_usage.get(bucket) + } + + /// Calculate total statistics from all buckets + pub fn calculate_totals(&mut self) { + self.objects_total_count = 0; + self.versions_total_count = 0; + self.delete_markers_total_count = 0; + self.objects_total_size = 0; + + for usage in self.buckets_usage.values() { + self.objects_total_count += usage.objects_count; + self.versions_total_count += usage.versions_count; + self.delete_markers_total_count += usage.delete_markers_count; + self.objects_total_size += usage.size; + } + } + + /// Merge another DataUsageInfo into this one + pub fn merge(&mut self, other: &DataUsageInfo) { + // Merge bucket usage + for (bucket, usage) in &other.buckets_usage { + if let Some(existing) = self.buckets_usage.get_mut(bucket) { + existing.merge(usage); + } else { + self.buckets_usage.insert(bucket.clone(), usage.clone()); + } + } + + self.disk_usage_status.extend(other.disk_usage_status.iter().cloned()); + + // Recalculate totals + self.calculate_totals(); + + // Ensure buckets_count stays consistent with buckets_usage + self.buckets_count = self.buckets_usage.len() as u64; + + // Update last update time + if let Some(other_update) = other.last_update { + if self.last_update.is_none() || other_update > self.last_update.unwrap() { + self.last_update = Some(other_update); + } + } + } +} + +impl BucketUsageInfo { + /// Create a new BucketUsageInfo + pub fn new() -> Self { + Self::default() + } + + /// Add size summary to this bucket usage + pub fn add_size_summary(&mut self, summary: &SizeSummary) { + self.size += summary.total_size as u64; + self.versions_count += summary.versions as u64; + self.delete_markers_count += summary.delete_markers as u64; + self.replica_size += summary.replica_size as u64; + self.replica_count += summary.replica_count as u64; + } + + /// Merge another BucketUsageInfo into this one + pub fn merge(&mut self, other: &BucketUsageInfo) { + self.size += other.size; + self.objects_count += other.objects_count; + self.versions_count += other.versions_count; + self.delete_markers_count += other.delete_markers_count; + self.replica_size += other.replica_size; + self.replica_count += other.replica_count; + + // Merge histograms + for (key, value) in &other.object_size_histogram { + *self.object_size_histogram.entry(key.clone()).or_insert(0) += value; + } + + for (key, value) in &other.object_versions_histogram { + 
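+            // Version-count buckets accumulate the same way as the size buckets above.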
*self.object_versions_histogram.entry(key.clone()).or_insert(0) += value; + } + + // Merge replication info + for (target, info) in &other.replication_info { + let entry = self.replication_info.entry(target.clone()).or_default(); + entry.replicated_size += info.replicated_size; + entry.replica_size += info.replica_size; + entry.replication_pending_size += info.replication_pending_size; + entry.replication_failed_size += info.replication_failed_size; + entry.replication_pending_count += info.replication_pending_count; + entry.replication_failed_count += info.replication_failed_count; + entry.replicated_count += info.replicated_count; + } + + // Merge backward compatibility fields + self.replication_pending_size_v1 += other.replication_pending_size_v1; + self.replication_failed_size_v1 += other.replication_failed_size_v1; + self.replicated_size_v1 += other.replicated_size_v1; + self.replication_pending_count_v1 += other.replication_pending_count_v1; + self.replication_failed_count_v1 += other.replication_failed_count_v1; + } +} + +impl SizeSummary { + /// Create a new SizeSummary + pub fn new() -> Self { + Self::default() + } + + /// Add another SizeSummary to this one + pub fn add(&mut self, other: &SizeSummary) { + self.total_size += other.total_size; + self.versions += other.versions; + self.delete_markers += other.delete_markers; + self.replicated_size += other.replicated_size; + self.replicated_count += other.replicated_count; + self.pending_size += other.pending_size; + self.failed_size += other.failed_size; + self.replica_size += other.replica_size; + self.replica_count += other.replica_count; + self.pending_count += other.pending_count; + self.failed_count += other.failed_count; + + // Merge replication target stats + for (target, stats) in &other.repl_target_stats { + let entry = self.repl_target_stats.entry(target.clone()).or_default(); + entry.replicated_size += stats.replicated_size; + entry.replicated_count += stats.replicated_count; + entry.pending_size += stats.pending_size; + entry.failed_size += stats.failed_size; + entry.pending_count += stats.pending_count; + entry.failed_count += stats.failed_count; + } + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_data_usage_info_creation() { + let mut info = DataUsageInfo::new(); + info.update_capacity(1000, 500, 500); + + assert_eq!(info.total_capacity, 1000); + assert_eq!(info.total_used_capacity, 500); + assert_eq!(info.total_free_capacity, 500); + assert!(info.last_update.is_some()); + } + + #[test] + fn test_bucket_usage_info_merge() { + let mut usage1 = BucketUsageInfo::new(); + usage1.size = 100; + usage1.objects_count = 10; + usage1.versions_count = 5; + + let mut usage2 = BucketUsageInfo::new(); + usage2.size = 200; + usage2.objects_count = 20; + usage2.versions_count = 10; + + usage1.merge(&usage2); + + assert_eq!(usage1.size, 300); + assert_eq!(usage1.objects_count, 30); + assert_eq!(usage1.versions_count, 15); + } + + #[test] + fn test_size_summary_add() { + let mut summary1 = SizeSummary::new(); + summary1.total_size = 100; + summary1.versions = 5; + + let mut summary2 = SizeSummary::new(); + summary2.total_size = 200; + summary2.versions = 10; + + summary1.add(&summary2); + + assert_eq!(summary1.total_size, 300); + assert_eq!(summary1.versions, 15); + } +} diff --git a/crates/scanner/src/error.rs b/crates/scanner/src/error.rs new file mode 100644 index 00000000..adf06639 --- /dev/null +++ b/crates/scanner/src/error.rs @@ -0,0 +1,36 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the 
Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +use thiserror::Error; + +/// Scanner-related errors +#[derive(Error, Debug)] +#[non_exhaustive] +pub enum ScannerError { + /// Configuration error + #[error("Configuration error: {0}")] + Config(String), + + /// I/O error + #[error("I/O error: {0}")] + Io(#[from] std::io::Error), + + /// Serialization error + #[error("Serialization error: {0}")] + Serialization(#[from] serde_json::Error), + + /// Generic error + #[error("Scanner error: {0}")] + Other(String), +} diff --git a/crates/scanner/src/last_minute.rs b/crates/scanner/src/last_minute.rs new file mode 100644 index 00000000..b4a776f9 --- /dev/null +++ b/crates/scanner/src/last_minute.rs @@ -0,0 +1,886 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +use std::time::{Duration, SystemTime, UNIX_EPOCH}; + +#[allow(dead_code)] +#[derive(Debug, Default)] +pub struct TimedAction { + count: u64, + acc_time: u64, + min_time: Option, + max_time: Option, + bytes: u64, +} + +#[allow(dead_code)] +impl TimedAction { + // Avg returns the average time spent on the action. + pub fn avg(&self) -> Option { + if self.count == 0 { + return None; + } + Some(Duration::from_nanos(self.acc_time / self.count)) + } + + // AvgBytes returns the average bytes processed. + pub fn avg_bytes(&self) -> u64 { + if self.count == 0 { + return 0; + } + self.bytes / self.count + } + + // Merge other into t. 
+    pub fn merge(&mut self, other: TimedAction) {
+        self.count += other.count;
+        self.acc_time += other.acc_time;
+        self.bytes += other.bytes;
+
+        if self.count == 0 {
+            self.min_time = other.min_time;
+        }
+        if let Some(other_min) = other.min_time {
+            self.min_time = self.min_time.map_or(Some(other_min), |min| Some(min.min(other_min)));
+        }
+
+        self.max_time = self
+            .max_time
+            .map_or(other.max_time, |max| Some(max.max(other.max_time.unwrap_or(0))));
+    }
+}
+
+#[allow(dead_code)]
+#[derive(Debug)]
+enum SizeCategory {
+    SizeLessThan1KiB = 0,
+    SizeLessThan1MiB,
+    SizeLessThan10MiB,
+    SizeLessThan100MiB,
+    SizeLessThan1GiB,
+    SizeGreaterThan1GiB,
+    // Add new entries here
+    SizeLastElemMarker,
+}
+
+impl std::fmt::Display for SizeCategory {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        let s = match *self {
+            SizeCategory::SizeLessThan1KiB => "SizeLessThan1KiB",
+            SizeCategory::SizeLessThan1MiB => "SizeLessThan1MiB",
+            SizeCategory::SizeLessThan10MiB => "SizeLessThan10MiB",
+            SizeCategory::SizeLessThan100MiB => "SizeLessThan100MiB",
+            SizeCategory::SizeLessThan1GiB => "SizeLessThan1GiB",
+            SizeCategory::SizeGreaterThan1GiB => "SizeGreaterThan1GiB",
+            SizeCategory::SizeLastElemMarker => "SizeLastElemMarker",
+        };
+        write!(f, "{s}")
+    }
+}
+
+#[derive(Clone, Debug, Default, Copy)]
+pub struct AccElem {
+    pub total: u64,
+    pub size: u64,
+    pub n: u64,
+}
+
+impl AccElem {
+    pub fn add(&mut self, dur: &Duration) {
+        let dur = dur.as_secs();
+        self.total = self.total.wrapping_add(dur);
+        self.n = self.n.wrapping_add(1);
+    }
+
+    pub fn merge(&mut self, b: &AccElem) {
+        self.n = self.n.wrapping_add(b.n);
+        self.total = self.total.wrapping_add(b.total);
+        self.size = self.size.wrapping_add(b.size);
+    }
+
+    pub fn avg(&self) -> Duration {
+        if self.n >= 1 && self.total > 0 {
+            return Duration::from_secs(self.total / self.n);
+        }
+        Duration::from_secs(0)
+    }
+}
+
+#[derive(Clone, Debug)]
+pub struct LastMinuteLatency {
+    pub totals: Vec<AccElem>,
+    pub last_sec: u64,
+}
+
+impl Default for LastMinuteLatency {
+    fn default() -> Self {
+        Self {
+            totals: vec![AccElem::default(); 60],
+            last_sec: Default::default(),
+        }
+    }
+}
+
+impl LastMinuteLatency {
+    pub fn merge(&mut self, o: &LastMinuteLatency) -> LastMinuteLatency {
+        let mut merged = LastMinuteLatency::default();
+        let mut x = o.clone();
+        if self.last_sec > o.last_sec {
+            x.forward_to(self.last_sec);
+            merged.last_sec = self.last_sec;
+        } else {
+            self.forward_to(o.last_sec);
+            merged.last_sec = o.last_sec;
+        }
+
+        // Sum the forwarded copy, not the original, so stale slots stay cleared.
+        for i in 0..merged.totals.len() {
+            merged.totals[i] = AccElem {
+                total: self.totals[i].total + x.totals[i].total,
+                n: self.totals[i].n + x.totals[i].n,
+                size: self.totals[i].size + x.totals[i].size,
+            }
+        }
+        merged
+    }
+
+    pub fn add(&mut self, t: &Duration) {
+        let sec = SystemTime::now()
+            .duration_since(UNIX_EPOCH)
+            .expect("Time went backwards")
+            .as_secs();
+        self.forward_to(sec);
+        let win_idx = sec % 60;
+        self.totals[win_idx as usize].add(t);
+        self.last_sec = sec;
+    }
+
+    pub fn add_all(&mut self, sec: u64, a: &AccElem) {
+        self.forward_to(sec);
+        let win_idx = sec % 60;
+        self.totals[win_idx as usize].merge(a);
+        self.last_sec = sec;
+    }
+
+    pub fn get_total(&mut self) -> AccElem {
+        let mut res = AccElem::default();
+        let sec = SystemTime::now()
+            .duration_since(UNIX_EPOCH)
+            .expect("Time went backwards")
+            .as_secs();
+        self.forward_to(sec);
+        for elem in self.totals.iter() {
+            res.merge(elem);
+        }
+        res
+    }
+
+    pub fn forward_to(&mut self, t: u64) {
+        if self.last_sec >= t {
+            return;
+        }
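+        // A gap of 60 seconds or more means every slot in the window is stale, so reset it wholesale.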
if t - self.last_sec >= 60 { + self.totals = vec![AccElem::default(); 60]; + self.last_sec = t; + return; + } + while self.last_sec != t { + let idx = (self.last_sec + 1) % 60; + self.totals[idx as usize] = AccElem::default(); + self.last_sec += 1; + } + } +} +#[cfg(test)] +mod tests { + use super::*; + use std::time::Duration; + + #[test] + fn test_acc_elem_default() { + let elem = AccElem::default(); + assert_eq!(elem.total, 0); + assert_eq!(elem.size, 0); + assert_eq!(elem.n, 0); + } + + #[test] + fn test_acc_elem_add_single_duration() { + let mut elem = AccElem::default(); + let duration = Duration::from_secs(5); + + elem.add(&duration); + + assert_eq!(elem.total, 5); + assert_eq!(elem.n, 1); + assert_eq!(elem.size, 0); // size is not modified by add + } + + #[test] + fn test_acc_elem_add_multiple_durations() { + let mut elem = AccElem::default(); + + elem.add(&Duration::from_secs(3)); + elem.add(&Duration::from_secs(7)); + elem.add(&Duration::from_secs(2)); + + assert_eq!(elem.total, 12); + assert_eq!(elem.n, 3); + assert_eq!(elem.size, 0); + } + + #[test] + fn test_acc_elem_add_zero_duration() { + let mut elem = AccElem::default(); + let duration = Duration::from_secs(0); + + elem.add(&duration); + + assert_eq!(elem.total, 0); + assert_eq!(elem.n, 1); + } + + #[test] + fn test_acc_elem_add_subsecond_duration() { + let mut elem = AccElem::default(); + // Duration less than 1 second should be truncated to 0 + let duration = Duration::from_millis(500); + + elem.add(&duration); + + assert_eq!(elem.total, 0); // as_secs() truncates subsecond values + assert_eq!(elem.n, 1); + } + + #[test] + fn test_acc_elem_merge_empty_elements() { + let mut elem1 = AccElem::default(); + let elem2 = AccElem::default(); + + elem1.merge(&elem2); + + assert_eq!(elem1.total, 0); + assert_eq!(elem1.size, 0); + assert_eq!(elem1.n, 0); + } + + #[test] + fn test_acc_elem_merge_with_data() { + let mut elem1 = AccElem { + total: 10, + size: 100, + n: 2, + }; + let elem2 = AccElem { + total: 15, + size: 200, + n: 3, + }; + + elem1.merge(&elem2); + + assert_eq!(elem1.total, 25); + assert_eq!(elem1.size, 300); + assert_eq!(elem1.n, 5); + } + + #[test] + fn test_acc_elem_merge_one_empty() { + let mut elem1 = AccElem { + total: 10, + size: 100, + n: 2, + }; + let elem2 = AccElem::default(); + + elem1.merge(&elem2); + + assert_eq!(elem1.total, 10); + assert_eq!(elem1.size, 100); + assert_eq!(elem1.n, 2); + } + + #[test] + fn test_acc_elem_avg_with_data() { + let elem = AccElem { + total: 15, + size: 0, + n: 3, + }; + + let avg = elem.avg(); + assert_eq!(avg, Duration::from_secs(5)); // 15 / 3 = 5 + } + + #[test] + fn test_acc_elem_avg_zero_count() { + let elem = AccElem { + total: 10, + size: 0, + n: 0, + }; + + let avg = elem.avg(); + assert_eq!(avg, Duration::from_secs(0)); + } + + #[test] + fn test_acc_elem_avg_zero_total() { + let elem = AccElem { total: 0, size: 0, n: 5 }; + + let avg = elem.avg(); + assert_eq!(avg, Duration::from_secs(0)); + } + + #[test] + fn test_acc_elem_avg_rounding() { + let elem = AccElem { + total: 10, + size: 0, + n: 3, + }; + + let avg = elem.avg(); + assert_eq!(avg, Duration::from_secs(3)); // 10 / 3 = 3 (integer division) + } + + #[test] + fn test_last_minute_latency_default() { + let latency = LastMinuteLatency::default(); + + assert_eq!(latency.totals.len(), 60); + assert_eq!(latency.last_sec, 0); + + // All elements should be default (empty) + for elem in &latency.totals { + assert_eq!(elem.total, 0); + assert_eq!(elem.size, 0); + assert_eq!(elem.n, 0); + } + } + + #[test] + fn 
test_last_minute_latency_forward_to_same_time() { + let mut latency = LastMinuteLatency { + last_sec: 100, + ..Default::default() + }; + + // Add some data to verify it's not cleared + latency.totals[0].total = 10; + latency.totals[0].n = 1; + + latency.forward_to(100); // Same time + + assert_eq!(latency.last_sec, 100); + assert_eq!(latency.totals[0].total, 10); // Data should remain + assert_eq!(latency.totals[0].n, 1); + } + + #[test] + fn test_last_minute_latency_forward_to_past_time() { + let mut latency = LastMinuteLatency { + last_sec: 100, + ..Default::default() + }; + + // Add some data to verify it's not cleared + latency.totals[0].total = 10; + latency.totals[0].n = 1; + + latency.forward_to(50); // Past time + + assert_eq!(latency.last_sec, 100); // Should not change + assert_eq!(latency.totals[0].total, 10); // Data should remain + assert_eq!(latency.totals[0].n, 1); + } + + #[test] + fn test_last_minute_latency_forward_to_large_gap() { + let mut latency = LastMinuteLatency { + last_sec: 100, + ..Default::default() + }; + + // Add some data to verify it's cleared + latency.totals[0].total = 10; + latency.totals[0].n = 1; + + latency.forward_to(200); // Gap >= 60 seconds + + assert_eq!(latency.last_sec, 200); // last_sec should be updated to target time + + // All data should be cleared + for elem in &latency.totals { + assert_eq!(elem.total, 0); + assert_eq!(elem.size, 0); + assert_eq!(elem.n, 0); + } + } + + #[test] + fn test_last_minute_latency_forward_to_small_gap() { + let mut latency = LastMinuteLatency { + last_sec: 100, + ..Default::default() + }; + + // Add data at specific indices + latency.totals[41].total = 10; // (100 + 1) % 60 = 41 + latency.totals[42].total = 20; // (100 + 2) % 60 = 42 + + latency.forward_to(102); // Forward by 2 seconds + + assert_eq!(latency.last_sec, 102); + + // The slots that were advanced should be cleared + assert_eq!(latency.totals[41].total, 0); // Cleared during forward + assert_eq!(latency.totals[42].total, 0); // Cleared during forward + } + + #[test] + fn test_last_minute_latency_add_all() { + let mut latency = LastMinuteLatency::default(); + let acc_elem = AccElem { + total: 15, + size: 100, + n: 3, + }; + + latency.add_all(1000, &acc_elem); + + assert_eq!(latency.last_sec, 1000); + let idx = 1000 % 60; // Should be 40 + assert_eq!(latency.totals[idx as usize].total, 15); + assert_eq!(latency.totals[idx as usize].size, 100); + assert_eq!(latency.totals[idx as usize].n, 3); + } + + #[test] + fn test_last_minute_latency_add_all_multiple() { + let mut latency = LastMinuteLatency::default(); + + let acc_elem1 = AccElem { + total: 10, + size: 50, + n: 2, + }; + let acc_elem2 = AccElem { + total: 20, + size: 100, + n: 4, + }; + + latency.add_all(1000, &acc_elem1); + latency.add_all(1000, &acc_elem2); // Same second + + let idx = 1000 % 60; + assert_eq!(latency.totals[idx as usize].total, 30); // 10 + 20 + assert_eq!(latency.totals[idx as usize].size, 150); // 50 + 100 + assert_eq!(latency.totals[idx as usize].n, 6); // 2 + 4 + } + + #[test] + fn test_last_minute_latency_merge_same_time() { + let mut latency1 = LastMinuteLatency::default(); + let mut latency2 = LastMinuteLatency::default(); + + latency1.last_sec = 1000; + latency2.last_sec = 1000; + + // Add data to both + latency1.totals[0].total = 10; + latency1.totals[0].n = 2; + latency2.totals[0].total = 20; + latency2.totals[0].n = 3; + + let merged = latency1.merge(&latency2); + + assert_eq!(merged.last_sec, 1000); + assert_eq!(merged.totals[0].total, 30); // 10 + 20 + 
assert_eq!(merged.totals[0].n, 5); // 2 + 3 + } + + #[test] + fn test_last_minute_latency_merge_different_times() { + let mut latency1 = LastMinuteLatency::default(); + let mut latency2 = LastMinuteLatency::default(); + + latency1.last_sec = 1000; + latency2.last_sec = 1010; // 10 seconds later + + // Add data to both + latency1.totals[0].total = 10; + latency2.totals[0].total = 20; + + let merged = latency1.merge(&latency2); + + assert_eq!(merged.last_sec, 1010); // Should use the later time + assert_eq!(merged.totals[0].total, 30); + } + + #[test] + fn test_last_minute_latency_merge_empty() { + let mut latency1 = LastMinuteLatency::default(); + let latency2 = LastMinuteLatency::default(); + + let merged = latency1.merge(&latency2); + + assert_eq!(merged.last_sec, 0); + for elem in &merged.totals { + assert_eq!(elem.total, 0); + assert_eq!(elem.size, 0); + assert_eq!(elem.n, 0); + } + } + + #[test] + fn test_last_minute_latency_window_wraparound() { + let mut latency = LastMinuteLatency::default(); + + // Test that indices wrap around correctly + for sec in 0..120 { + // Test for 2 minutes + let acc_elem = AccElem { + total: sec, + size: 0, + n: 1, + }; + latency.add_all(sec, &acc_elem); + + let expected_idx = sec % 60; + assert_eq!(latency.totals[expected_idx as usize].total, sec); + } + } + + #[test] + fn test_last_minute_latency_time_progression() { + let mut latency = LastMinuteLatency::default(); + + // Add data at time 1000 + latency.add_all( + 1000, + &AccElem { + total: 10, + size: 0, + n: 1, + }, + ); + + // Forward to time 1030 (30 seconds later) + latency.forward_to(1030); + + // Original data should still be there + let idx_1000 = 1000 % 60; + assert_eq!(latency.totals[idx_1000 as usize].total, 10); + + // Forward to time 1070 (70 seconds from original, > 60 seconds) + latency.forward_to(1070); + + // All data should be cleared due to large gap + for elem in &latency.totals { + assert_eq!(elem.total, 0); + assert_eq!(elem.n, 0); + } + } + + #[test] + fn test_last_minute_latency_realistic_scenario() { + let mut latency = LastMinuteLatency::default(); + let base_time = 1000u64; + + // Add data for exactly 60 seconds to fill the window + for i in 0..60 { + let current_time = base_time + i; + let duration_secs = i % 10 + 1; // Varying durations 1-10 seconds + let acc_elem = AccElem { + total: duration_secs, + size: 1024 * (i % 5 + 1), // Varying sizes + n: 1, + }; + + latency.add_all(current_time, &acc_elem); + } + + // Count non-empty slots after filling the window + let mut non_empty_count = 0; + let mut total_n = 0; + let mut total_sum = 0; + + for elem in &latency.totals { + if elem.n > 0 { + non_empty_count += 1; + total_n += elem.n; + total_sum += elem.total; + } + } + + // We should have exactly 60 non-empty slots (one for each second in the window) + assert_eq!(non_empty_count, 60); + assert_eq!(total_n, 60); // 60 data points total + assert!(total_sum > 0); + + // Test manual total calculation (get_total uses system time which interferes with test) + let mut manual_total = AccElem::default(); + for elem in &latency.totals { + manual_total.merge(elem); + } + assert_eq!(manual_total.n, 60); + assert_eq!(manual_total.total, total_sum); + } + + #[test] + fn test_acc_elem_clone_and_debug() { + let elem = AccElem { + total: 100, + size: 200, + n: 5, + }; + + let cloned = elem; + assert_eq!(elem.total, cloned.total); + assert_eq!(elem.size, cloned.size); + assert_eq!(elem.n, cloned.n); + + // Test Debug trait + let debug_str = format!("{elem:?}"); + 
assert!(debug_str.contains("100")); + assert!(debug_str.contains("200")); + assert!(debug_str.contains("5")); + } + + #[test] + fn test_last_minute_latency_clone() { + let mut latency = LastMinuteLatency { + last_sec: 1000, + ..Default::default() + }; + latency.totals[0].total = 100; + latency.totals[0].n = 5; + + let cloned = latency.clone(); + assert_eq!(latency.last_sec, cloned.last_sec); + assert_eq!(latency.totals[0].total, cloned.totals[0].total); + assert_eq!(latency.totals[0].n, cloned.totals[0].n); + } + + #[test] + fn test_edge_case_max_values() { + let mut elem = AccElem { + total: u64::MAX - 50, + size: u64::MAX - 50, + n: u64::MAX - 50, + }; + + let other = AccElem { + total: 100, + size: 100, + n: 100, + }; + + // This should not panic due to overflow, values will wrap around + elem.merge(&other); + + // Values should wrap around due to overflow (wrapping_add behavior) + assert_eq!(elem.total, 49); // (u64::MAX - 50) + 100 wraps to 49 + assert_eq!(elem.size, 49); + assert_eq!(elem.n, 49); + } + + #[test] + fn test_forward_to_boundary_conditions() { + let mut latency = LastMinuteLatency { + last_sec: 59, + ..Default::default() + }; + + // Add data at the last slot + latency.totals[59].total = 100; + latency.totals[59].n = 1; + + // Forward exactly 60 seconds (boundary case) + latency.forward_to(119); + + // All data should be cleared + for elem in &latency.totals { + assert_eq!(elem.total, 0); + assert_eq!(elem.n, 0); + } + } + + #[test] + fn test_get_total_with_data() { + let mut latency = LastMinuteLatency::default(); + + // Set a recent timestamp to avoid forward_to clearing data + let current_time = SystemTime::now() + .duration_since(UNIX_EPOCH) + .expect("Time went backwards") + .as_secs(); + latency.last_sec = current_time; + + // Add data to multiple slots + latency.totals[0] = AccElem { + total: 10, + size: 100, + n: 1, + }; + latency.totals[1] = AccElem { + total: 20, + size: 200, + n: 2, + }; + latency.totals[59] = AccElem { + total: 30, + size: 300, + n: 3, + }; + + let total = latency.get_total(); + + assert_eq!(total.total, 60); + assert_eq!(total.size, 600); + assert_eq!(total.n, 6); + } + + #[test] + fn test_window_index_calculation() { + // Test that window index calculation works correctly + let _latency = LastMinuteLatency::default(); + + let acc_elem = AccElem { total: 1, size: 1, n: 1 }; + + // Test various timestamps + let test_cases = [(0, 0), (1, 1), (59, 59), (60, 0), (61, 1), (119, 59), (120, 0)]; + + for (timestamp, expected_idx) in test_cases { + let mut test_latency = LastMinuteLatency::default(); + test_latency.add_all(timestamp, &acc_elem); + + assert_eq!( + test_latency.totals[expected_idx].n, 1, + "Failed for timestamp {timestamp} (expected index {expected_idx})" + ); + } + } + + #[test] + fn test_concurrent_safety_simulation() { + // Simulate concurrent access patterns + let mut latency = LastMinuteLatency::default(); + + // Use current time to ensure data doesn't get cleared by get_total + let current_time = SystemTime::now() + .duration_since(UNIX_EPOCH) + .expect("Time went backwards") + .as_secs(); + + // Simulate rapid additions within a 60-second window + for i in 0..1000 { + let acc_elem = AccElem { + total: (i % 10) + 1, // Ensure non-zero values + size: (i % 100) + 1, + n: 1, + }; + // Keep all timestamps within the current minute window + latency.add_all(current_time - (i % 60), &acc_elem); + } + + let total = latency.get_total(); + assert!(total.n > 0, "Total count should be greater than 0"); + assert!(total.total > 0, "Total time 
should be greater than 0"); + } + + #[test] + fn test_acc_elem_debug_format() { + let elem = AccElem { + total: 123, + size: 456, + n: 789, + }; + + let debug_str = format!("{elem:?}"); + assert!(debug_str.contains("123")); + assert!(debug_str.contains("456")); + assert!(debug_str.contains("789")); + } + + #[test] + fn test_large_values() { + let mut elem = AccElem::default(); + + // Test with large duration values + let large_duration = Duration::from_secs(u64::MAX / 2); + elem.add(&large_duration); + + assert_eq!(elem.total, u64::MAX / 2); + assert_eq!(elem.n, 1); + + // Test average calculation with large values + let avg = elem.avg(); + assert_eq!(avg, Duration::from_secs(u64::MAX / 2)); + } + + #[test] + fn test_zero_duration_handling() { + let mut elem = AccElem::default(); + + let zero_duration = Duration::from_secs(0); + elem.add(&zero_duration); + + assert_eq!(elem.total, 0); + assert_eq!(elem.n, 1); + assert_eq!(elem.avg(), Duration::from_secs(0)); + } +} + +const SIZE_LAST_ELEM_MARKER: usize = 10; // Assumed marker size is 10, modify according to actual situation + +#[allow(dead_code)] +#[derive(Debug, Default)] +pub struct LastMinuteHistogram { + histogram: Vec, + size: u32, +} + +impl LastMinuteHistogram { + pub fn merge(&mut self, other: &LastMinuteHistogram) { + for i in 0..self.histogram.len() { + self.histogram[i].merge(&other.histogram[i]); + } + } + + pub fn add(&mut self, size: i64, t: Duration) { + let index = size_to_tag(size); + self.histogram[index].add(&t); + } + + pub fn get_avg_data(&mut self) -> [AccElem; SIZE_LAST_ELEM_MARKER] { + let mut res = [AccElem::default(); SIZE_LAST_ELEM_MARKER]; + for (i, elem) in self.histogram.iter_mut().enumerate() { + res[i] = elem.get_total(); + } + res + } +} + +fn size_to_tag(size: i64) -> usize { + match size { + _ if size < 1024 => 0, // sizeLessThan1KiB + _ if size < 1024 * 1024 => 1, // sizeLessThan1MiB + _ if size < 10 * 1024 * 1024 => 2, // sizeLessThan10MiB + _ if size < 100 * 1024 * 1024 => 3, // sizeLessThan100MiB + _ if size < 1024 * 1024 * 1024 => 4, // sizeLessThan1GiB + _ => 5, // sizeGreaterThan1GiB + } +} diff --git a/crates/scanner/src/lib.rs b/crates/scanner/src/lib.rs new file mode 100644 index 00000000..94148fee --- /dev/null +++ b/crates/scanner/src/lib.rs @@ -0,0 +1,34 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +#![cfg_attr(docsrs, feature(doc_auto_cfg))] +#![warn( + // missing_docs, + rustdoc::missing_crate_level_docs, + unreachable_pub, + rust_2018_idioms +)] + +pub mod data_usage; +pub mod data_usage_define; +pub mod error; +pub mod last_minute; +pub mod metrics; +pub mod scanner; +pub mod scanner_folder; +pub mod scanner_io; + +pub use data_usage_define::*; +pub use error::ScannerError; +pub use scanner::init_data_scanner; diff --git a/crates/scanner/src/metrics.rs b/crates/scanner/src/metrics.rs new file mode 100644 index 00000000..cb387728 --- /dev/null +++ b/crates/scanner/src/metrics.rs @@ -0,0 +1,576 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +use crate::last_minute::{AccElem, LastMinuteLatency}; +use chrono::{DateTime, Utc}; +use rustfs_madmin::metrics::ScannerMetrics as M_ScannerMetrics; +use serde::{Deserialize, Serialize}; +use std::{ + collections::HashMap, + fmt::Display, + pin::Pin, + sync::{ + Arc, OnceLock, + atomic::{AtomicU64, Ordering}, + }, + time::{Duration, SystemTime}, +}; +use tokio::sync::{Mutex, RwLock}; + +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub enum IlmAction { + NoneAction = 0, + DeleteAction, + DeleteVersionAction, + TransitionAction, + TransitionVersionAction, + DeleteRestoredAction, + DeleteRestoredVersionAction, + DeleteAllVersionsAction, + DelMarkerDeleteAllVersionsAction, + ActionCount, +} + +impl IlmAction { + pub fn delete_restored(&self) -> bool { + *self == Self::DeleteRestoredAction || *self == Self::DeleteRestoredVersionAction + } + + pub fn delete_versioned(&self) -> bool { + *self == Self::DeleteVersionAction || *self == Self::DeleteRestoredVersionAction + } + + pub fn delete_all(&self) -> bool { + *self == Self::DeleteAllVersionsAction || *self == Self::DelMarkerDeleteAllVersionsAction + } + + pub fn delete(&self) -> bool { + if self.delete_restored() { + return true; + } + *self == Self::DeleteVersionAction + || *self == Self::DeleteAction + || *self == Self::DeleteAllVersionsAction + || *self == Self::DelMarkerDeleteAllVersionsAction + } +} + +impl Display for IlmAction { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + write!(f, "{self:?}") + } +} + +pub static GLOBAL_METRICS: OnceLock> = OnceLock::new(); + +pub fn global_metrics() -> &'static Arc { + GLOBAL_METRICS.get_or_init(|| Arc::new(Metrics::new())) +} + +#[derive(Clone, Debug, PartialEq, PartialOrd)] +pub enum Metric { + // START Realtime metrics, that only records + // last minute latencies and total operation count. + ReadMetadata = 0, + CheckMissing, + SaveUsage, + ApplyAll, + ApplyVersion, + TierObjSweep, + HealCheck, + Ilm, + CheckReplication, + Yield, + CleanAbandoned, + ApplyNonCurrent, + HealAbandonedVersion, + + // START Trace metrics: + StartTrace, + ScanObject, // Scan object. All operations included. + HealAbandonedObject, + + // END realtime metrics: + LastRealtime, + + // Trace only metrics: + ScanFolder, // Scan a folder on disk, recursively. + ScanCycle, // Full cycle, cluster global. 
+ ScanBucketDrive, // Single bucket on one drive. + CompactFolder, // Folder compacted. + + // Must be last: + Last, +} + +impl Metric { + /// Convert to string representation for metrics + pub fn as_str(self) -> &'static str { + match self { + Self::ReadMetadata => "read_metadata", + Self::CheckMissing => "check_missing", + Self::SaveUsage => "save_usage", + Self::ApplyAll => "apply_all", + Self::ApplyVersion => "apply_version", + Self::TierObjSweep => "tier_obj_sweep", + Self::HealCheck => "heal_check", + Self::Ilm => "ilm", + Self::CheckReplication => "check_replication", + Self::Yield => "yield", + Self::CleanAbandoned => "clean_abandoned", + Self::ApplyNonCurrent => "apply_non_current", + Self::HealAbandonedVersion => "heal_abandoned_version", + Self::StartTrace => "start_trace", + Self::ScanObject => "scan_object", + Self::HealAbandonedObject => "heal_abandoned_object", + Self::LastRealtime => "last_realtime", + Self::ScanFolder => "scan_folder", + Self::ScanCycle => "scan_cycle", + Self::ScanBucketDrive => "scan_bucket_drive", + Self::CompactFolder => "compact_folder", + Self::Last => "last", + } + } + + /// Convert from index back to enum (safe version) + pub fn from_index(index: usize) -> Option { + if index >= Self::Last as usize { + return None; + } + // Safe conversion using match instead of unsafe transmute + match index { + 0 => Some(Self::ReadMetadata), + 1 => Some(Self::CheckMissing), + 2 => Some(Self::SaveUsage), + 3 => Some(Self::ApplyAll), + 4 => Some(Self::ApplyVersion), + 5 => Some(Self::TierObjSweep), + 6 => Some(Self::HealCheck), + 7 => Some(Self::Ilm), + 8 => Some(Self::CheckReplication), + 9 => Some(Self::Yield), + 10 => Some(Self::CleanAbandoned), + 11 => Some(Self::ApplyNonCurrent), + 12 => Some(Self::HealAbandonedVersion), + 13 => Some(Self::StartTrace), + 14 => Some(Self::ScanObject), + 15 => Some(Self::HealAbandonedObject), + 16 => Some(Self::LastRealtime), + 17 => Some(Self::ScanFolder), + 18 => Some(Self::ScanCycle), + 19 => Some(Self::ScanBucketDrive), + 20 => Some(Self::CompactFolder), + 21 => Some(Self::Last), + _ => None, + } + } +} + +/// Thread-safe wrapper for LastMinuteLatency with atomic operations +#[derive(Default)] +pub struct LockedLastMinuteLatency { + latency: Arc>, +} + +impl Clone for LockedLastMinuteLatency { + fn clone(&self) -> Self { + Self { + latency: Arc::clone(&self.latency), + } + } +} + +impl LockedLastMinuteLatency { + pub fn new() -> Self { + Self { + latency: Arc::new(Mutex::new(LastMinuteLatency::default())), + } + } + + /// Add a duration measurement + pub async fn add(&self, duration: Duration) { + self.add_size(duration, 0).await; + } + + /// Add a duration measurement with size + pub async fn add_size(&self, duration: Duration, size: u64) { + let mut latency = self.latency.lock().await; + let now = SystemTime::now() + .duration_since(SystemTime::UNIX_EPOCH) + .unwrap_or_default() + .as_secs(); + + let elem = AccElem { + n: 1, + total: duration.as_secs(), + size, + }; + latency.add_all(now, &elem); + } + + /// Get total accumulated metrics for the last minute + pub async fn total(&self) -> AccElem { + let mut latency = self.latency.lock().await; + latency.get_total() + } +} + +/// Current path tracker for monitoring active scan paths +struct CurrentPathTracker { + current_path: Arc>, +} + +impl CurrentPathTracker { + fn new(initial_path: String) -> Self { + Self { + current_path: Arc::new(RwLock::new(initial_path)), + } + } + + async fn update_path(&self, path: String) { + *self.current_path.write().await = path; + } + + 
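+    /// Return a snapshot of the path this disk's scanner is currently working on.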
async fn get_path(&self) -> String { + self.current_path.read().await.clone() + } +} + +/// Main scanner metrics structure +pub struct Metrics { + // All fields must be accessed atomically and aligned. + operations: Vec, + latency: Vec, + actions: Vec, + actions_latency: Vec, + // Current paths contains disk -> tracker mappings + current_paths: Arc>>>, + + // Cycle information + cycle_info: Arc>>, +} + +// This is a placeholder. We'll need to define this struct. +#[derive(Clone, Debug, Default, Serialize, Deserialize)] +pub struct CurrentCycle { + pub current: u64, + pub next: u64, + pub cycle_completed: Vec>, + pub started: DateTime, +} + +impl CurrentCycle { + pub fn unmarshal(&mut self, buf: &[u8]) -> Result<(), Box> { + *self = rmp_serde::from_slice(buf)?; + Ok(()) + } + + pub fn marshal(&self) -> Result, Box> { + Ok(rmp_serde::to_vec(self)?) + } +} + +impl Metrics { + pub fn new() -> Self { + let operations = (0..Metric::Last as usize).map(|_| AtomicU64::new(0)).collect(); + + let latency = (0..Metric::LastRealtime as usize) + .map(|_| LockedLastMinuteLatency::new()) + .collect(); + + Self { + operations, + latency, + actions: (0..IlmAction::ActionCount as usize).map(|_| AtomicU64::new(0)).collect(), + actions_latency: vec![LockedLastMinuteLatency::default(); IlmAction::ActionCount as usize], + current_paths: Arc::new(RwLock::new(HashMap::new())), + cycle_info: Arc::new(RwLock::new(None)), + } + } + + /// Log scanner action with custom metadata - compatible with existing usage + pub fn log(metric: Metric) -> impl Fn(&HashMap) { + let metric = metric as usize; + let start_time = SystemTime::now(); + move |_custom: &HashMap| { + let duration = SystemTime::now().duration_since(start_time).unwrap_or_default(); + + // Update operation count + global_metrics().operations[metric].fetch_add(1, Ordering::Relaxed); + + // Update latency for realtime metrics (spawn async task for this) + if (metric) < Metric::LastRealtime as usize { + let metric_index = metric; + tokio::spawn(async move { + global_metrics().latency[metric_index].add(duration).await; + }); + } + + // Log trace metrics + if metric as u8 > Metric::StartTrace as u8 { + //debug!(metric = metric.as_str(), duration_ms = duration.as_millis(), "Scanner trace metric"); + } + } + } + + /// Time scanner action with size - returns function that takes size + pub fn time_size(metric: Metric) -> impl Fn(u64) { + let metric = metric as usize; + let start_time = SystemTime::now(); + move |size: u64| { + let duration = SystemTime::now().duration_since(start_time).unwrap_or_default(); + + // Update operation count + global_metrics().operations[metric].fetch_add(1, Ordering::Relaxed); + + // Update latency for realtime metrics with size (spawn async task) + if (metric) < Metric::LastRealtime as usize { + let metric_index = metric; + tokio::spawn(async move { + global_metrics().latency[metric_index].add_size(duration, size).await; + }); + } + } + } + + /// Time a scanner action - returns a closure to call when done + pub fn time(metric: Metric) -> impl Fn() { + let metric = metric as usize; + let start_time = SystemTime::now(); + move || { + let duration = SystemTime::now().duration_since(start_time).unwrap_or_default(); + + // Update operation count + global_metrics().operations[metric].fetch_add(1, Ordering::Relaxed); + + // Update latency for realtime metrics (spawn async task) + if (metric) < Metric::LastRealtime as usize { + let metric_index = metric; + tokio::spawn(async move { + global_metrics().latency[metric_index].add(duration).await; + }); 
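+                // The returned closure itself stays non-async; latency recording finishes on the spawned task.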
+ } + } + } + + /// Time N scanner actions - returns function that takes count, then returns completion function + pub fn time_n(metric: Metric) -> Box Box + Send + Sync> { + let metric = metric as usize; + let start_time = SystemTime::now(); + Box::new(move |count: usize| { + Box::new(move || { + let duration = SystemTime::now().duration_since(start_time).unwrap_or_default(); + + // Update operation count + global_metrics().operations[metric].fetch_add(count as u64, Ordering::Relaxed); + + // Update latency for realtime metrics (spawn async task) + if (metric) < Metric::LastRealtime as usize { + let metric_index = metric; + tokio::spawn(async move { + global_metrics().latency[metric_index].add(duration).await; + }); + } + }) + }) + } + + /// Time ILM action with versions - returns function that takes versions, then returns completion function + pub fn time_ilm(a: IlmAction) -> Box Box + Send + Sync> { + let a_clone = a as usize; + if a_clone == IlmAction::NoneAction as usize || a_clone >= IlmAction::ActionCount as usize { + return Box::new(move |_: u64| Box::new(move || {})); + } + let start = SystemTime::now(); + Box::new(move |versions: u64| { + Box::new(move || { + let duration = SystemTime::now().duration_since(start).unwrap_or(Duration::from_secs(0)); + tokio::spawn(async move { + global_metrics().actions[a_clone].fetch_add(versions, Ordering::Relaxed); + global_metrics().actions_latency[a_clone].add(duration).await; + }); + }) + }) + } + + /// Increment time with specific duration + pub async fn inc_time(metric: Metric, duration: Duration) { + let metric = metric as usize; + // Update operation count + global_metrics().operations[metric].fetch_add(1, Ordering::Relaxed); + + // Update latency for realtime metrics + if (metric) < Metric::LastRealtime as usize { + global_metrics().latency[metric].add(duration).await; + } + } + + /// Get lifetime operation count for a metric + pub fn lifetime(&self, metric: Metric) -> u64 { + let metric = metric as usize; + if (metric) >= Metric::Last as usize { + return 0; + } + self.operations[metric].load(Ordering::Relaxed) + } + + /// Get last minute statistics for a metric + pub async fn last_minute(&self, metric: Metric) -> AccElem { + let metric = metric as usize; + if (metric) >= Metric::LastRealtime as usize { + return AccElem::default(); + } + self.latency[metric].total().await + } + + /// Set current cycle information + pub async fn set_cycle(&self, cycle: Option) { + *self.cycle_info.write().await = cycle; + } + + /// Get current cycle information + pub async fn get_cycle(&self) -> Option { + self.cycle_info.read().await.clone() + } + + /// Get current active paths + pub async fn get_current_paths(&self) -> Vec { + let mut result = Vec::new(); + let paths = self.current_paths.read().await; + + for (disk, tracker) in paths.iter() { + let path = tracker.get_path().await; + result.push(format!("{disk}/{path}")); + } + + result + } + + /// Get number of active drives + pub async fn active_drives(&self) -> usize { + self.current_paths.read().await.len() + } + + /// Generate metrics report + pub async fn report(&self) -> M_ScannerMetrics { + let mut metrics = M_ScannerMetrics::default(); + + // Set cycle information + if let Some(cycle) = self.get_cycle().await { + metrics.current_cycle = cycle.current; + metrics.cycles_completed_at = cycle.cycle_completed; + metrics.current_started = cycle.started; + } + + metrics.collected_at = Utc::now(); + metrics.active_paths = self.get_current_paths().await; + + // Lifetime operations + for i in 
0..Metric::Last as usize { + let count = self.operations[i].load(Ordering::Relaxed); + if count > 0 { + if let Some(metric) = Metric::from_index(i) { + metrics.life_time_ops.insert(metric.as_str().to_string(), count); + } + } + } + + // Last minute statistics for realtime metrics + for i in 0..Metric::LastRealtime as usize { + let last_min = self.latency[i].total().await; + if last_min.n > 0 { + if let Some(_metric) = Metric::from_index(i) { + // Convert to madmin TimedAction format if needed + // This would require implementing the conversion + } + } + } + + metrics + } +} + +// Type aliases for compatibility with existing code +pub type UpdateCurrentPathFn = Arc Pin + Send>> + Send + Sync>; +pub type CloseDiskFn = Arc Pin + Send>> + Send + Sync>; + +/// Create a current path updater for tracking scan progress +pub fn current_path_updater(disk: &str, initial: &str) -> (UpdateCurrentPathFn, CloseDiskFn) { + let tracker = Arc::new(CurrentPathTracker::new(initial.to_string())); + let disk_name = disk.to_string(); + + // Store the tracker in global metrics + let tracker_clone = Arc::clone(&tracker); + let disk_clone = disk_name.clone(); + tokio::spawn(async move { + global_metrics().current_paths.write().await.insert(disk_clone, tracker_clone); + }); + + let update_fn = { + let tracker = Arc::clone(&tracker); + Arc::new(move |path: &str| -> Pin + Send>> { + let tracker = Arc::clone(&tracker); + let path = path.to_string(); + Box::pin(async move { + tracker.update_path(path).await; + }) + }) + }; + + let done_fn = { + let disk_name = disk_name.clone(); + Arc::new(move || -> Pin + Send>> { + let disk_name = disk_name.clone(); + Box::pin(async move { + global_metrics().current_paths.write().await.remove(&disk_name); + }) + }) + }; + + (update_fn, done_fn) +} + +impl Default for Metrics { + fn default() -> Self { + Self::new() + } +} + +pub struct CloseDiskGuard(CloseDiskFn); + +impl CloseDiskGuard { + pub fn new(close_disk: CloseDiskFn) -> Self { + Self(close_disk) + } + + pub async fn close(&self) { + (self.0)().await; + } +} + +impl Drop for CloseDiskGuard { + fn drop(&mut self) { + // Drop cannot be async, so we spawn the async cleanup task + // The task will run in the background and complete asynchronously + if let Ok(handle) = tokio::runtime::Handle::try_current() { + let close_fn = self.0.clone(); + handle.spawn(async move { + close_fn().await; + }); + } else { + // If we're not in a tokio runtime context, we can't spawn + // This is a best-effort cleanup, so we just skip it + } + } +} diff --git a/crates/scanner/src/scanner.rs b/crates/scanner/src/scanner.rs new file mode 100644 index 00000000..292007c9 --- /dev/null +++ b/crates/scanner/src/scanner.rs @@ -0,0 +1,269 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
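+
+// Scanner driver: after a randomized startup delay the node tries to take the
+// leader lock on RUSTFS_META_BUCKET/leader.lock and, while it holds it, runs
+// namespace scan cycles on a timer, persisting progress between runs.
+//
+// The persisted cycle state is the next-cycle counter as 8 little-endian bytes
+// followed by the msgpack-encoded `CurrentCycle` (see the save path inside
+// `run_data_scanner` below). Sketch of the layout:
+//
+//     buf.extend_from_slice(&cycle_info.next.to_le_bytes());          // bytes 0..8
+//     buf.extend_from_slice(&cycle_info.marshal().unwrap_or_default()); // msgpack payload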
+ +use std::sync::Arc; + +use crate::data_usage::{BACKGROUND_HEAL_INFO_PATH, DATA_USAGE_BLOOM_NAME_PATH, DATA_USAGE_OBJ_NAME_PATH}; +use crate::metrics::CurrentCycle; +use crate::metrics::global_metrics; +use crate::scanner_folder::data_usage_update_dir_cycles; +use crate::scanner_io::ScannerIO; +use crate::{DataUsageInfo, ScannerError}; +use chrono::{DateTime, Utc}; +use rustfs_common::heal_channel::HealScanMode; +use rustfs_config::{DEFAULT_DATA_SCANNER_START_DELAY_SECS, ENV_DATA_SCANNER_START_DELAY_SECS}; +use rustfs_ecstore::StorageAPI as _; +use rustfs_ecstore::config::com::{read_config, save_config}; +use rustfs_ecstore::disk::RUSTFS_META_BUCKET; +use rustfs_ecstore::error::Error as EcstoreError; +use rustfs_ecstore::global::is_erasure_sd; +use rustfs_ecstore::store::ECStore; +use serde::{Deserialize, Serialize}; +use tokio::sync::mpsc; +use tokio::time::Duration; +use tokio_util::sync::CancellationToken; +use tracing::{debug, error, info, warn}; + +fn data_scanner_start_delay() -> Duration { + let secs = rustfs_utils::get_env_u64(ENV_DATA_SCANNER_START_DELAY_SECS, DEFAULT_DATA_SCANNER_START_DELAY_SECS); + Duration::from_secs(secs) +} + +pub async fn init_data_scanner(ctx: CancellationToken, storeapi: Arc) { + let ctx_clone = ctx.clone(); + let storeapi_clone = storeapi.clone(); + tokio::spawn(async move { + let sleep_time = Duration::from_secs(rand::random::() % 5); + tokio::time::sleep(sleep_time).await; + + loop { + if ctx_clone.is_cancelled() { + break; + } + + if let Err(e) = run_data_scanner(ctx_clone.clone(), storeapi_clone.clone()).await { + error!("Failed to run data scanner: {e}"); + } + tokio::time::sleep(data_scanner_start_delay()).await; + } + }); +} + +fn get_cycle_scan_mode(_current_cycle: u64, _bitrot_start_cycle: u64, _bitrot_start_time: Option>) -> HealScanMode { + // TODO: from config + HealScanMode::Normal +} + +/// Background healing information +#[derive(Clone, Debug, Default, Serialize, Deserialize)] +#[serde(rename_all = "camelCase")] +pub struct BackgroundHealInfo { + /// Bitrot scan start time + pub bitrot_start_time: Option>, + /// Bitrot scan start cycle + pub bitrot_start_cycle: u64, + /// Current scan mode + pub current_scan_mode: HealScanMode, +} + +/// Read background healing information from storage +pub async fn read_background_heal_info(storeapi: Arc) -> BackgroundHealInfo { + // Skip for ErasureSD setup + if is_erasure_sd().await { + return BackgroundHealInfo::default(); + } + + // Get last healing information + match read_config(storeapi, &BACKGROUND_HEAL_INFO_PATH).await { + Ok(buf) => match serde_json::from_slice::(&buf) { + Ok(info) => info, + Err(e) => { + error!("Failed to unmarshal background heal info from {}: {}", &*BACKGROUND_HEAL_INFO_PATH, e); + BackgroundHealInfo::default() + } + }, + Err(e) => { + // Only log if it's not a ConfigNotFound error + if e != EcstoreError::ConfigNotFound { + warn!("Failed to read background heal info from {}: {}", &*BACKGROUND_HEAL_INFO_PATH, e); + } + BackgroundHealInfo::default() + } + } +} + +/// Save background healing information to storage +pub async fn save_background_heal_info(storeapi: Arc, info: BackgroundHealInfo) { + // Skip for ErasureSD setup + if is_erasure_sd().await { + return; + } + + // Serialize to JSON + let data = match serde_json::to_vec(&info) { + Ok(data) => data, + Err(e) => { + error!("Failed to marshal background heal info: {}", e); + return; + } + }; + + // Save configuration + if let Err(e) = save_config(storeapi, &BACKGROUND_HEAL_INFO_PATH, data).await { + warn!("Failed to 
save background heal info to {}: {}", &*BACKGROUND_HEAL_INFO_PATH, e); + } +} + +pub async fn run_data_scanner(ctx: CancellationToken, storeapi: Arc) -> Result<(), ScannerError> { + // TODO: leader lock + let _guard = match storeapi.new_ns_lock(RUSTFS_META_BUCKET, "leader.lock").await { + Ok(guard) => guard, + Err(e) => { + error!("run_data_scanner: other node is running, failed to acquire leader lock: {e}"); + return Ok(()); + } + }; + + let mut cycle_info = CurrentCycle::default(); + let buf = read_config(storeapi.clone(), &DATA_USAGE_BLOOM_NAME_PATH) + .await + .unwrap_or_default(); + if buf.len() == 8 { + cycle_info.next = u64::from_le_bytes(buf.try_into().unwrap_or_default()); + } else if buf.len() > 8 { + cycle_info.next = u64::from_le_bytes(buf[0..8].try_into().unwrap_or_default()); + if let Err(e) = cycle_info.unmarshal(&buf[8..]) { + warn!("Failed to unmarshal cycle info: {e}"); + } + } + + let mut ticker = tokio::time::interval(data_scanner_start_delay()); + loop { + tokio::select! { + _ = ctx.cancelled() => { + break; + } + _ = ticker.tick() => { + + cycle_info.current = cycle_info.next; + cycle_info.started = Utc::now(); + + global_metrics().set_cycle(Some(cycle_info.clone())).await; + + let background_heal_info = read_background_heal_info(storeapi.clone()).await; + + let scan_mode = get_cycle_scan_mode(cycle_info.current, background_heal_info.bitrot_start_cycle, background_heal_info.bitrot_start_time); + if background_heal_info.current_scan_mode != scan_mode { + let mut new_heal_info = background_heal_info.clone(); + new_heal_info.current_scan_mode = scan_mode; + + if scan_mode == HealScanMode::Deep { + new_heal_info.bitrot_start_cycle = cycle_info.current; + new_heal_info.bitrot_start_time = Some(Utc::now()); + } + + save_background_heal_info(storeapi.clone(), new_heal_info).await; + } + + + + let (sender, receiver) = tokio::sync::mpsc::channel::(1); + let storeapi_clone = storeapi.clone(); + let ctx_clone = ctx.clone(); + tokio::spawn(async move { + store_data_usage_in_backend(ctx_clone, storeapi_clone, receiver).await; + }); + + + if let Err(e) = storeapi.clone().nsscanner(ctx.clone(), sender, cycle_info.current, scan_mode).await { + error!("Failed to scan namespace: {e}"); + } else { + info!("Namespace scanned successfully"); + + cycle_info.next +=1; + cycle_info.current = 0; + cycle_info.cycle_completed.push(Utc::now()); + + if cycle_info.cycle_completed.len() >= data_usage_update_dir_cycles() as usize { + cycle_info.cycle_completed = cycle_info.cycle_completed.split_off(data_usage_update_dir_cycles() as usize); + } + + global_metrics().set_cycle(Some(cycle_info.clone())).await; + + let cycle_info_buf = cycle_info.marshal().unwrap_or_default(); + + let mut buf = Vec::with_capacity(cycle_info_buf.len() + 8); + buf.extend_from_slice(&cycle_info.next.to_le_bytes()); + buf.extend_from_slice(&cycle_info_buf); + + + if let Err(e) = save_config(storeapi.clone(), &DATA_USAGE_BLOOM_NAME_PATH, buf).await { + error!("Failed to save data usage bloom name to {}: {}", &*DATA_USAGE_BLOOM_NAME_PATH, e); + } else { + info!("Data usage bloom name saved successfully"); + } + + + } + + ticker.reset(); + } + } + } + + global_metrics().set_cycle(None).await; + Ok(()) +} + +/// Store data usage info in backend. Will store all objects sent on the receiver until closed. 
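+///
+/// Minimal wiring sketch (mirrors the call in `run_data_scanner`; names are illustrative):
+///
+/// ```ignore
+/// let (tx, rx) = tokio::sync::mpsc::channel::<DataUsageInfo>(1);
+/// tokio::spawn(store_data_usage_in_backend(ctx.clone(), store.clone(), rx));
+/// // feed `tx` from the namespace scanner; dropping it ends the task
+/// ```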
+pub async fn store_data_usage_in_backend(
+    ctx: CancellationToken,
+    storeapi: Arc<ECStore>,
+    mut receiver: mpsc::Receiver<DataUsageInfo>,
+) {
+    let mut attempts = 1u32;
+
+    while let Some(data_usage_info) = receiver.recv().await {
+        if ctx.is_cancelled() {
+            break;
+        }
+
+        debug!("store_data_usage_in_backend: received data usage info: {:?}", &data_usage_info);
+
+        // Serialize to JSON
+        let data = match serde_json::to_vec(&data_usage_info) {
+            Ok(data) => data,
+            Err(e) => {
+                error!("Failed to marshal data usage info: {}", e);
+                continue;
+            }
+        };
+
+        // Save a backup every 10th update
+        if attempts > 10 {
+            let backup_path = format!("{:?}.bkp", &DATA_USAGE_OBJ_NAME_PATH);
+            if let Err(e) = save_config(storeapi.clone(), &backup_path, data.clone()).await {
+                warn!("Failed to save data usage backup to {}: {}", backup_path, e);
+            }
+            attempts = 1;
+        }
+
+        // Save main configuration
+        if let Err(e) = save_config(storeapi.clone(), &DATA_USAGE_OBJ_NAME_PATH, data).await {
+            error!("Failed to save data usage info to {:?}: {e}", &DATA_USAGE_OBJ_NAME_PATH);
+        }
+
+        attempts += 1;
+    }
+}
diff --git a/crates/scanner/src/scanner_folder.rs b/crates/scanner/src/scanner_folder.rs
new file mode 100644
index 00000000..83922cbb
--- /dev/null
+++ b/crates/scanner/src/scanner_folder.rs
@@ -0,0 +1,1208 @@
+// Copyright 2024 RustFS Team
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
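+
+// Folder-scanning internals. A folder is walked with `tokio::fs::read_dir`; files are
+// priced through `ScannerIODisk::get_size`, subfolders are recursed into, and the
+// resulting tree is compacted once it crosses the DATA_SCANNER_COMPACT_* thresholds so
+// the persisted data-usage cache stays bounded.
+//
+// Healing is sampled rather than exhaustive: with the default heal probability of 1024
+// (RUSTFS_HEAL_OBJECT_SELECT_PROB), roughly one object path in 1024 is selected per
+// cycle. Illustrative form of the check used inside `scan_folder` below:
+//
+//     let heal_enabled = this_hash.mod_alt(cycle / prob_div, heal_object_select / prob_div)
+//         && scanner.should_heal().await;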
+ +use std::collections::HashSet; +use std::fs::FileType; +use std::sync::Arc; +use std::time::{Duration, SystemTime}; + +use crate::ReplTargetSizeSummary; +use crate::data_usage_define::{DataUsageCache, DataUsageEntry, DataUsageHash, DataUsageHashMap, SizeSummary, hash_path}; +use crate::error::ScannerError; +use crate::metrics::{UpdateCurrentPathFn, current_path_updater}; +use crate::scanner_io::ScannerIODisk as _; +use rustfs_common::heal_channel::{HEAL_DELETE_DANGLING, HealChannelRequest, HealOpts, HealScanMode, send_heal_request}; +use rustfs_common::metrics::IlmAction; +use rustfs_ecstore::StorageAPI; +use rustfs_ecstore::bucket::lifecycle::bucket_lifecycle_audit::LcEventSrc; +use rustfs_ecstore::bucket::lifecycle::bucket_lifecycle_ops::apply_expiry_rule; +use rustfs_ecstore::bucket::lifecycle::evaluator::Evaluator; +use rustfs_ecstore::bucket::lifecycle::{ + bucket_lifecycle_ops::apply_transition_rule, + lifecycle::{Event, Lifecycle, ObjectOpts}, +}; +use rustfs_ecstore::bucket::replication::{ReplicationConfig, ReplicationConfigurationExt as _, queue_replication_heal_internal}; +use rustfs_ecstore::bucket::versioning::VersioningApi; +use rustfs_ecstore::bucket::versioning_sys::BucketVersioningSys; +use rustfs_ecstore::cache_value::metacache_set::{ListPathRawOptions, list_path_raw}; +use rustfs_ecstore::disk::error::DiskError; +use rustfs_ecstore::disk::{Disk, DiskAPI as _, DiskInfoOptions}; +use rustfs_ecstore::error::StorageError; +use rustfs_ecstore::global::is_erasure; +use rustfs_ecstore::pools::{path2_bucket_object, path2_bucket_object_with_base_path}; +use rustfs_ecstore::store_api::{ObjectInfo, ObjectToDelete}; +use rustfs_ecstore::store_utils::is_reserved_or_invalid_bucket; +use rustfs_filemeta::{MetaCacheEntries, MetaCacheEntry, MetadataResolutionParams, ReplicationStatusType}; +use rustfs_utils::path::{SLASH_SEPARATOR, path_join_buf}; +use s3s::dto::{BucketLifecycleConfiguration, ObjectLockConfiguration}; +use tokio::select; +use tokio::sync::mpsc; +use tokio_util::sync::CancellationToken; +use tracing::{debug, error, info, warn}; + +// Constants from Go code +const DATA_SCANNER_SLEEP_PER_FOLDER: Duration = Duration::from_millis(1); +const DATA_USAGE_UPDATE_DIR_CYCLES: u32 = 16; +const DATA_SCANNER_COMPACT_LEAST_OBJECT: usize = 500; +const DATA_SCANNER_COMPACT_AT_CHILDREN: usize = 10000; +const DATA_SCANNER_COMPACT_AT_FOLDERS: usize = DATA_SCANNER_COMPACT_AT_CHILDREN / 4; +const DATA_SCANNER_FORCE_COMPACT_AT_FOLDERS: usize = 250_000; +const DEFAULT_HEAL_OBJECT_SELECT_PROB: u32 = 1024; +const ENV_DATA_USAGE_UPDATE_DIR_CYCLES: &str = "RUSTFS_DATA_USAGE_UPDATE_DIR_CYCLES"; +const ENV_HEAL_OBJECT_SELECT_PROB: &str = "RUSTFS_HEAL_OBJECT_SELECT_PROB"; + +pub fn data_usage_update_dir_cycles() -> u32 { + rustfs_utils::get_env_u32(ENV_DATA_USAGE_UPDATE_DIR_CYCLES, DATA_USAGE_UPDATE_DIR_CYCLES) +} + +pub fn heal_object_select_prob() -> u32 { + rustfs_utils::get_env_u32(ENV_HEAL_OBJECT_SELECT_PROB, DEFAULT_HEAL_OBJECT_SELECT_PROB) +} + +/// Cached folder information for scanning +#[derive(Clone, Debug)] +pub struct CachedFolder { + pub name: String, + pub parent: Option, + pub object_heal_prob_div: u32, +} + +/// Type alias for get size function +pub type GetSizeFn = Box Result + Send + Sync>; + +/// Scanner item representing a file during scanning +#[derive(Clone, Debug)] +pub struct ScannerItem { + pub path: String, + pub bucket: String, + pub prefix: String, + pub object_name: String, + pub file_type: FileType, + pub lifecycle: Option>, + pub replication: Option>, + pub 
heal_enabled: bool, + pub heal_bitrot: bool, + pub debug: bool, +} + +impl ScannerItem { + /// Get the object path (prefix + object_name) + pub fn object_path(&self) -> String { + if self.prefix.is_empty() { + self.object_name.clone() + } else { + path_join_buf(&[&self.prefix, &self.object_name]) + } + } + + /// Transform meta directory by splitting prefix and extracting object name + /// This converts a directory path like "bucket/dir1/dir2/file" to prefix="bucket/dir1/dir2" and object_name="file" + pub fn transform_meta_dir(&mut self) { + let prefix = self.prefix.clone(); // Clone to avoid borrow checker issues + let split: Vec<&str> = prefix.split(SLASH_SEPARATOR).collect(); + + if split.len() > 1 { + let prefix_parts: Vec<&str> = split[..split.len() - 1].to_vec(); + self.prefix = path_join_buf(&prefix_parts); + } else { + self.prefix = String::new(); + } + + // Object name is the last element + self.object_name = split.last().unwrap_or(&"").to_string(); + } + + pub async fn apply_actions( + &mut self, + store: Arc, + object_infos: Vec, + lock_retention: Option>, + size_summary: &mut SizeSummary, + ) { + if object_infos.is_empty() { + debug!("apply_actions: no object infos for object: {}", self.object_path()); + return; + } + debug!("apply_actions: applying actions for object: {}", self.object_path()); + + let versioning_config = match BucketVersioningSys::get(&self.bucket).await { + Ok(versioning_config) => versioning_config, + Err(_) => { + warn!("apply_actions: Failed to get versioning configuration for bucket {}", self.bucket); + return; + } + }; + + let Some(lifecycle) = self.lifecycle.as_ref() else { + let mut cumulative_size = 0; + for oi in object_infos.iter() { + let actual_size = match oi.get_actual_size() { + Ok(size) => size, + Err(_) => { + warn!("apply_actions: Failed to get actual size for object {}", oi.name); + continue; + } + }; + + let size = self.heal_actions(store.clone(), oi, actual_size, size_summary).await; + + size_summary.actions_accounting(oi, size, actual_size); + + cumulative_size += size; + } + + self.alert_excessive_versions(object_infos.len(), cumulative_size); + + debug!("apply_actions: done for now no lifecycle config"); + return; + }; + + debug!("apply_actions: got lifecycle config for object: {}", self.object_path()); + + let object_opts = object_infos + .iter() + .map(ObjectOpts::from_object_info) + .collect::>(); + + let events = match Evaluator::new(lifecycle.clone()) + .with_lock_retention(lock_retention) + .with_replication_config(self.replication.clone()) + .eval(&object_opts) + .await + { + Ok(events) => events, + Err(e) => { + warn!("apply_actions: Failed to evaluate lifecycle for object: {}", e); + return; + } + }; + let mut to_delete_objs: Vec = Vec::new(); + let mut noncurrent_events: Vec = Vec::new(); + let mut cumulative_size = 0; + let mut remaining_versions = object_infos.len(); + 'eventLoop: { + for (i, event) in events.iter().enumerate() { + let oi = &object_infos[i]; + let actual_size = match oi.get_actual_size() { + Ok(size) => size, + Err(_) => { + warn!("apply_actions: Failed to get actual size for object {}", oi.name); + 0 + } + }; + + let mut size = actual_size; + + match event.action { + IlmAction::DeleteAllVersionsAction | IlmAction::DelMarkerDeleteAllVersionsAction => { + remaining_versions = 0; + debug!("apply_actions: applying expiry rule for object: {} {}", oi.name, event.action); + apply_expiry_rule(event, &LcEventSrc::Scanner, oi).await; + break 'eventLoop; + } + + IlmAction::DeleteAction | 
IlmAction::DeleteRestoredAction | IlmAction::DeleteRestoredVersionAction => { + if !versioning_config.prefix_enabled(&self.object_path()) && event.action == IlmAction::DeleteAction { + remaining_versions -= 1; + size = 0; + } + + debug!("apply_actions: applying expiry rule for object: {} {}", oi.name, event.action); + apply_expiry_rule(event, &LcEventSrc::Scanner, oi).await; + } + IlmAction::DeleteVersionAction => { + remaining_versions -= 1; + size = 0; + if let Some(opt) = object_opts.get(i) { + to_delete_objs.push(ObjectToDelete { + object_name: opt.name.clone(), + version_id: opt.version_id, + ..Default::default() + }); + } + noncurrent_events.push(event.clone()); + } + IlmAction::TransitionAction | IlmAction::TransitionVersionAction => { + debug!("apply_actions: applying transition rule for object: {} {}", oi.name, event.action); + apply_transition_rule(event, &LcEventSrc::Scanner, oi).await; + } + + IlmAction::NoneAction | IlmAction::ActionCount => { + size = self.heal_actions(store.clone(), oi, actual_size, size_summary).await; + } + } + + size_summary.actions_accounting(oi, size, actual_size); + + cumulative_size += size; + } + } + + if !to_delete_objs.is_empty() { + // TODO: enqueueNoncurrentVersions + } + self.alert_excessive_versions(remaining_versions, cumulative_size); + } + + async fn heal_actions( + &mut self, + store: Arc, + oi: &ObjectInfo, + actual_size: i64, + size_summary: &mut SizeSummary, + ) -> i64 { + debug!("heal_actions: healing object: {} {}", self.object_path(), oi.name); + + let mut size = actual_size; + + if self.heal_enabled { + size = self.apply_heal(store, oi).await; + } else { + debug!("heal_actions: heal is disabled for object: {} {}", self.object_path(), oi.name); + } + + self.heal_replication(oi, size_summary).await; + + size + } + + async fn heal_replication(&mut self, oi: &ObjectInfo, size_summary: &mut SizeSummary) { + if oi.version_id.is_none_or(|v| v.is_nil()) { + debug!("heal_replication: no version id for object: {} {}", self.object_path(), oi.name); + return; + } + + let Some(replication) = self.replication.clone() else { + debug!("heal_replication: no replication config for object: {} {}", self.object_path(), oi.name); + return; + }; + + let roi = queue_replication_heal_internal(&oi.bucket, oi.clone(), (*replication).clone(), 0).await; + if oi.delete_marker || oi.version_purge_status.is_empty() { + debug!( + "heal_replication: delete marker or version purge status is empty for object: {} {}", + self.object_path(), + oi.name + ); + return; + } + + for (arn, target_status) in roi.target_statuses.iter() { + if !size_summary.repl_target_stats.contains_key(arn.as_str()) { + size_summary + .repl_target_stats + .insert(arn.clone(), ReplTargetSizeSummary::default()); + } + + if let Some(repl_target_size_summary) = size_summary.repl_target_stats.get_mut(arn.as_str()) { + match target_status { + ReplicationStatusType::Pending => { + repl_target_size_summary.pending_size += roi.size; + repl_target_size_summary.pending_count += 1; + size_summary.pending_size += roi.size; + size_summary.pending_count += 1; + } + ReplicationStatusType::Failed => { + repl_target_size_summary.failed_size += roi.size; + repl_target_size_summary.failed_count += 1; + size_summary.failed_size += roi.size; + size_summary.failed_count += 1; + } + ReplicationStatusType::Completed | ReplicationStatusType::CompletedLegacy => { + repl_target_size_summary.replicated_size += roi.size; + repl_target_size_summary.replicated_count += 1; + size_summary.replicated_size += roi.size; + 
size_summary.replicated_count += 1; + } + _ => {} + } + } + } + + if oi.replication_status == ReplicationStatusType::Replica { + size_summary.replica_size += roi.size; + size_summary.replica_count += 1; + } + } + + async fn apply_heal(&mut self, store: Arc, oi: &ObjectInfo) -> i64 { + debug!( + "apply_heal: bucket: {}, object_path: {}, version_id: {}", + self.bucket, + self.object_path(), + oi.version_id.unwrap_or_default() + ); + + let scan_mode = if self.heal_bitrot { + HealScanMode::Deep + } else { + HealScanMode::Normal + }; + + match store + .clone() + .heal_object( + self.bucket.as_str(), + self.object_path().as_str(), + oi.version_id.map(|v| v.to_string()).unwrap_or_default().as_str(), + &HealOpts { + remove: HEAL_DELETE_DANGLING, + scan_mode, + ..Default::default() + }, + ) + .await + { + Ok((result, err)) => { + if let Some(err) = err { + warn!("apply_heal: failed to heal object: {}", err); + } + result.object_size as i64 + } + Err(e) => { + warn!("apply_heal: failed to heal object: {}", e); + 0 + } + } + } + + fn alert_excessive_versions(&self, _object_infos_length: usize, _cumulative_size: i64) { + // TODO: Implement alerting for excessive versions + } +} + +/// Folder scanner for scanning directory structures +pub struct FolderScanner { + root: String, + old_cache: DataUsageCache, + new_cache: DataUsageCache, + update_cache: DataUsageCache, + + data_usage_scanner_debug: bool, + heal_object_select: u32, + scan_mode: HealScanMode, + + we_sleep: Box bool + Send + Sync>, + // should_heal: Arc bool + Send + Sync>, + disks: Vec>, + disks_quorum: usize, + + updates: Option>, + last_update: SystemTime, + + update_current_path: UpdateCurrentPathFn, + + skip_heal: Arc, + local_disk: Arc, +} + +impl FolderScanner { + pub async fn should_heal(&self) -> bool { + if self.skip_heal.load(std::sync::atomic::Ordering::Relaxed) { + debug!("should_heal: false skip_heal is true for root: {}", self.root); + return false; + } + if self.heal_object_select == 0 { + debug!("should_heal: false heal_object_select is 0 for root: {}", self.root); + return false; + } + + if self + .local_disk + .disk_info(&DiskInfoOptions::default()) + .await + .unwrap_or_default() + .healing + { + self.skip_heal.store(true, std::sync::atomic::Ordering::Relaxed); + debug!("should_heal: false healing is true for root: {}", self.root); + return false; + } + + debug!("should_heal: true for root: {}", self.root); + true + } + + /// Set heal object select probability + pub fn set_heal_object_select(&mut self, prob: u32) { + self.heal_object_select = prob; + } + + /// Set debug mode + pub fn set_debug(&mut self, debug: bool) { + self.data_usage_scanner_debug = debug; + } + + /// Send update if enough time has passed + /// Should be called on a regular basis when the new_cache contains more recent total than previously. + /// May or may not send an update upstream. + pub async fn send_update(&mut self) { + // Send at most an update every minute. 
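+        // `size_recursive` flattens the subtree rooted at the cache name into a single
+        // `DataUsageEntry`, so every message pushed upstream is a self-contained snapshot
+        // of the totals scanned so far, not a delta. Receiver side, roughly (sketch):
+        //
+        //     while let Some(entry) = updates_rx.recv().await {
+        //         // `entry` carries the running object/size totals for this disk's scan
+        //     }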
+ if self.updates.is_none() { + return; + } + + let elapsed = self.last_update.elapsed().unwrap_or(Duration::from_secs(0)); + if elapsed < Duration::from_secs(60) { + debug!("send_update: done for now elapsed time is less than 60 seconds"); + return; + } + + if let Some(flat) = self.update_cache.size_recursive(&self.new_cache.info.name) { + if let Some(ref updates) = self.updates { + // Try to send without blocking + if let Err(e) = updates.send(flat.clone()).await { + error!("send_update: failed to send update: {}", e); + } + self.last_update = SystemTime::now(); + debug!("send_update: sent update for folder: {}", self.new_cache.info.name); + } + } + } + + /// Scan a folder recursively + /// Files found in the folders will be added to new_cache. + #[allow(clippy::never_loop)] + #[allow(unused_assignments)] + pub async fn scan_folder( + &mut self, + ctx: CancellationToken, + folder: CachedFolder, + into: &mut DataUsageEntry, + ) -> Result<(), ScannerError> { + if ctx.is_cancelled() { + return Err(ScannerError::Other("Operation cancelled".to_string())); + } + + let this_hash = hash_path(&folder.name); + // Store initial compaction state. + let was_compacted = into.compacted; + + let wait_time = None; + + loop { + if ctx.is_cancelled() { + return Err(ScannerError::Other("Operation cancelled".to_string())); + } + + let mut abandoned_children: DataUsageHashMap = HashSet::new(); + if !into.compacted { + abandoned_children = self.old_cache.find_children_copy(this_hash.clone()); + } + + debug!("scan_folder : {}/{}", &self.root, &folder.name); + let (_, prefix) = path2_bucket_object_with_base_path(&self.root, &folder.name); + + let active_life_cycle = if self + .old_cache + .info + .lifecycle + .as_ref() + .is_some_and(|v| v.has_active_rules(&prefix)) + { + self.old_cache.info.lifecycle.clone() + } else { + None + }; + + let active_replication = + if self.old_cache.info.replication.as_ref().is_some_and(|v| { + !v.is_empty() && v.config.as_ref().is_some_and(|config| config.has_active_rules(&prefix, true)) + }) { + self.old_cache.info.replication.clone() + } else { + None + }; + + if (self.we_sleep)() { + tokio::time::sleep(DATA_SCANNER_SLEEP_PER_FOLDER).await; + } + + let mut existing_folders: Vec = Vec::new(); + let mut new_folders: Vec = Vec::new(); + let mut found_objects = false; + + let dir_path = path_join_buf(&[&self.root, &folder.name]); + + debug!("scan_folder: dir_path: {:?}", dir_path); + + let mut dir_reader = tokio::fs::read_dir(&dir_path) + .await + .map_err(|e| ScannerError::Other(e.to_string()))?; + + while let Some(entry) = dir_reader + .next_entry() + .await + .map_err(|e| ScannerError::Other(e.to_string()))? + { + let file_name = entry.file_name().to_string_lossy().to_string(); + if file_name.is_empty() || file_name == "." || file_name == ".." { + debug!("scan_folder: done for now file_name is empty or . 
or .."); + continue; + } + + let file_path = entry.path().to_string_lossy().to_string(); + + let trim_dir_name = file_path.strip_prefix(&dir_path).unwrap_or(&file_path); + + let entry_name = path_join_buf(&[&folder.name, trim_dir_name]); + + if entry_name.is_empty() || entry_name == folder.name { + debug!("scan_folder: done for now entry_name is empty or equals folder name"); + continue; + } + + let entry_type = entry.file_type().await.map_err(|e| ScannerError::Other(e.to_string()))?; + + // ok + debug!("scan_folder: entry_name: {:?}", entry_name); + + let (bucket, prefix) = path2_bucket_object_with_base_path(self.root.as_str(), &entry_name); + if bucket.is_empty() { + debug!("scan_folder: done for now bucket is empty"); + break; + } + + if is_reserved_or_invalid_bucket(&bucket, false) { + debug!("scan_folder: done for now bucket is reserved or invalid"); + break; + } + + if ctx.is_cancelled() { + debug!("scan_folder: done for now operation cancelled"); + break; + } + + debug!("scan_folder: bucket: {:?}, prefix: {:?}", bucket, prefix); + + if entry_type.is_dir() { + let h = hash_path(&entry_name); + + if h == this_hash { + debug!("scan_folder: done for now self folder"); + continue; + } + + let exists = self.old_cache.cache.contains_key(&h.key()); + + let this = CachedFolder { + name: entry_name.clone(), + parent: Some(this_hash.clone()), + object_heal_prob_div: folder.object_heal_prob_div, + }; + + abandoned_children.remove(&h.key()); + + if exists { + debug!("scan_folder: adding existing folder: {}", entry_name); + existing_folders.push(this); + self.update_cache + .copy_with_children(&self.old_cache, &h, &Some(this_hash.clone())); + } else { + debug!("scan_folder: adding new folder: {}", entry_name); + new_folders.push(this); + } + continue; + } + + let mut wait = wait_time; + + if (self.we_sleep)() { + wait = Some(SystemTime::now()); + } + + debug!( + "scan_folder: heal_enabled: {} next_cycle: {} heal_object_select: {} object_heal_prob_div: {} should_heal: {}", + this_hash.mod_alt( + self.old_cache.info.next_cycle as u32 / folder.object_heal_prob_div, + self.heal_object_select / folder.object_heal_prob_div + ), + self.old_cache.info.next_cycle, + self.heal_object_select, + folder.object_heal_prob_div, + self.should_heal().await, + ); + + let heal_enabled = this_hash.mod_alt( + self.old_cache.info.next_cycle as u32 / folder.object_heal_prob_div, + self.heal_object_select / folder.object_heal_prob_div, + ) && self.should_heal().await; + + let mut item = ScannerItem { + path: file_path, + bucket, + prefix: rustfs_utils::path::dir(&prefix), + object_name: file_name, + lifecycle: active_life_cycle.clone(), + replication: active_replication.clone(), + heal_enabled, + heal_bitrot: self.scan_mode == HealScanMode::Deep, + debug: self.data_usage_scanner_debug, + file_type: entry_type, + }; + + debug!("scan_folder: item: {:?}", item); + + let sz = match self.local_disk.get_size(item.clone()).await { + Ok(sz) => sz, + Err(e) => { + warn!("scan_folder: failed to get size for item {}: {}", item.path, e); + // TODO: check error type + if let Some(t) = wait { + if let Ok(elapsed) = t.elapsed() { + tokio::time::sleep(elapsed).await; + } + } + + if e != StorageError::other("skip file".to_string()) { + warn!("scan_folder: failed to get size for item {}: {}", item.path, e); + } + continue; + } + }; + + debug!("scan_folder: got size for item {}: {:?}", item.path, &sz); + + found_objects = true; + + item.transform_meta_dir(); + + abandoned_children.remove(&path_join_buf(&[&item.bucket, 
&item.object_path()])); + + // TODO: check err + into.add_sizes(&sz); + into.objects += 1; + + if let Some(t) = wait { + if let Ok(elapsed) = t.elapsed() { + tokio::time::sleep(elapsed).await; + } + } + } + + if found_objects && is_erasure().await { + // If we found an object in erasure mode, we skip subdirs (only datadirs)... + info!("scan_folder: done for now found an object in erasure mode"); + break; + } + + // If we have many subfolders, compact ourself. + let should_compact = (self.new_cache.info.name != folder.name + && existing_folders.len() + new_folders.len() >= DATA_SCANNER_COMPACT_AT_FOLDERS) + || existing_folders.len() + new_folders.len() >= DATA_SCANNER_FORCE_COMPACT_AT_FOLDERS; + + // TODO: Check for excess folders and send events + + if !into.compacted && should_compact { + into.compacted = true; + new_folders.append(&mut existing_folders); + + existing_folders.clear(); + + if self.data_usage_scanner_debug { + debug!("scan_folder: Preemptively compacting: {}, entries: {}", folder.name, new_folders.len()); + } + } + + if !into.compacted { + for folder_item in &existing_folders { + let h = hash_path(&folder_item.name); + self.update_cache.copy_with_children(&self.old_cache, &h, &folder_item.parent); + } + } + + // Scan new folders + for folder_item in new_folders { + if ctx.is_cancelled() { + return Err(ScannerError::Other("Operation cancelled".to_string())); + } + + let h = hash_path(&folder_item.name); + // Add new folders to the update tree so totals update for these. + if !into.compacted { + let mut found_any = false; + let mut parent = this_hash.clone(); + let update_cache_name_hash = hash_path(&self.update_cache.info.name); + + while parent != update_cache_name_hash { + let parent_key = parent.key(); + let e = self.update_cache.find(&parent_key); + if e.is_none_or(|v| v.compacted) { + found_any = true; + break; + } + if let Some(next) = self.update_cache.search_parent(&parent) { + parent = next; + } else { + found_any = true; + break; + } + } + if !found_any { + // Add non-compacted empty entry. + self.update_cache + .replace_hashed(&h, &Some(this_hash.clone()), &DataUsageEntry::default()); + } + } + + (self.update_current_path)(&folder_item.name).await; + + let mut dst = if !into.compacted { + DataUsageEntry::default() + } else { + into.clone() + }; + + // Use Box::pin for recursive async call + let fut = Box::pin(self.scan_folder(ctx.clone(), folder_item.clone(), &mut dst)); + fut.await.map_err(|e| ScannerError::Other(e.to_string()))?; + + if !into.compacted { + let h = DataUsageHash(folder_item.name.clone()); + into.add_child(&h); + // We scanned a folder, optionally send update. + self.update_cache.delete_recursive(&h); + self.update_cache.copy_with_children(&self.new_cache, &h, &folder_item.parent); + self.send_update().await; + } + + if !into.compacted + && let Some(parent) = self.update_cache.find(&this_hash.key()) + && !parent.compacted + { + self.update_cache.delete_recursive(&h); + self.update_cache + .copy_with_children(&self.new_cache, &h, &Some(this_hash.clone())); + } + } + + // Scan existing folders + for mut folder_item in existing_folders { + if ctx.is_cancelled() { + return Err(ScannerError::Other("Operation cancelled".to_string())); + } + + let h = hash_path(&folder_item.name); + + if !into.compacted && self.old_cache.is_compacted(&h) { + let next_cycle = self.old_cache.info.next_cycle as u32; + if !h.mod_(next_cycle, data_usage_update_dir_cycles()) { + // Transfer and add as child... 
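+                    // Carry-over path: with the default of 16 update cycles
+                    // (RUSTFS_DATA_USAGE_UPDATE_DIR_CYCLES), a compacted branch is only
+                    // rescanned when its hash falls into this cycle's window, i.e. roughly
+                    // 1/16 of compacted folders per cycle; the rest are copied unchanged
+                    // from the previous cache.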
+ self.new_cache.copy_with_children(&self.old_cache, &h, &folder_item.parent); + into.add_child(&h); + continue; + } + + folder_item.object_heal_prob_div = data_usage_update_dir_cycles(); + } + + (self.update_current_path)(&folder_item.name).await; + + let mut dst = if !into.compacted { + DataUsageEntry::default() + } else { + into.clone() + }; + + // Use Box::pin for recursive async call + let fut = Box::pin(self.scan_folder(ctx.clone(), folder_item.clone(), &mut dst)); + fut.await.map_err(|e| ScannerError::Other(e.to_string()))?; + + if !into.compacted { + let h = DataUsageHash(folder_item.name.clone()); + into.add_child(&h); + // We scanned a folder, optionally send update. + self.update_cache.delete_recursive(&h); + self.update_cache.copy_with_children(&self.new_cache, &h, &folder_item.parent); + self.send_update().await; + } + } + + // Scan for healing + if abandoned_children.is_empty() || !self.should_heal().await { + info!("scan_folder: done for now abandoned children are empty or we are not healing"); + // If we are not heal scanning, return now. + break; + } + + if self.disks.is_empty() || self.disks_quorum == 0 { + info!("scan_folder: done for now disks are empty or quorum is 0"); + break; + } + + debug!("scan_folder: scanning for healing abandoned children: {:?}", abandoned_children); + + let mut resolver = MetadataResolutionParams { + dir_quorum: self.disks_quorum, + obj_quorum: self.disks_quorum, + bucket: "".to_string(), + strict: false, + ..Default::default() + }; + + for name in abandoned_children { + if !self.should_heal().await { + break; + } + + let (bucket, prefix) = path2_bucket_object(name.as_str()); + + if bucket != resolver.bucket { + debug!("scan_folder: sending heal request for bucket: {}", bucket); + send_heal_request(HealChannelRequest { + bucket: bucket.clone(), + ..Default::default() + }) + .await + .map_err(|e| ScannerError::Other(e.to_string()))?; + } + + resolver.bucket = bucket.clone(); + + let child_ctx = ctx.child_token(); + + let (agreed_tx, mut agreed_rx) = mpsc::channel::(1); + let (partial_tx, mut partial_rx) = mpsc::channel::(1); + let (finished_tx, mut finished_rx) = mpsc::channel::>>(1); + + let disks = self.disks.iter().cloned().map(Some).collect(); + let disks_quorum = self.disks_quorum; + let bucket_clone = bucket.clone(); + let prefix_clone = prefix.clone(); + let child_ctx_clone = child_ctx.clone(); + let agreed_tx = agreed_tx.clone(); + let partial_tx = partial_tx.clone(); + let finished_tx = finished_tx.clone(); + + debug!("scan_folder: listing path: {}/{}", bucket, prefix); + tokio::spawn(async move { + if let Err(e) = list_path_raw( + child_ctx_clone.clone(), + ListPathRawOptions { + disks, + bucket: bucket_clone.clone(), + path: prefix_clone.clone(), + recursive: true, + report_not_found: true, + min_disks: disks_quorum, + agreed: Some(Box::new(move |entry: MetaCacheEntry| { + let entry_name = entry.name.clone(); + let agreed_tx = agreed_tx.clone(); + Box::pin(async move { + if let Err(e) = agreed_tx.send(entry_name).await { + error!("scan_folder: list_path_raw: failed to send entry name: {}: {}", entry.name, e); + } + }) + })), + partial: Some(Box::new(move |entries: MetaCacheEntries, _: &[Option]| { + let partial_tx = partial_tx.clone(); + Box::pin(async move { + if let Err(e) = partial_tx.send(entries).await { + error!("scan_folder: list_path_raw: failed to send partial err: {}", e); + } + }) + })), + finished: Some(Box::new(move |errs: &[Option]| { + let finished_tx = finished_tx.clone(); + let errs_clone = errs.to_vec(); + 
Box::pin(async move { + if let Err(e) = finished_tx.send(errs_clone).await { + error!("scan_folder: list_path_raw: failed to send finished errs: {}", e); + } + }) + })), + ..Default::default() + }, + ) + .await + { + error!("scan_folder: failed to list path: {}/{}: {}", bucket_clone, prefix_clone, e); + } + }); + + let mut found_objects = false; + + loop { + select! { + Some(entry_name) = agreed_rx.recv() => { + debug!("scan_folder: list_path_raw: found object: {}/{}", bucket, entry_name); + (self.update_current_path)(&entry_name).await; + } + Some(entries) = partial_rx.recv() => { + debug!("scan_folder: list_path_raw: found partial entries: {:?}", entries); + if !self.should_heal().await { + child_ctx.cancel(); + break; + } + + let entry_option = match entries.resolve(resolver.clone()){ + Some(entry) => { + Some(entry) + } + None => { + let (entry,_) = entries.first_found(); + entry + } + }; + + + let Some(entry) = entry_option else { + debug!("scan_folder: list_path_raw: found no entry"); + break; + }; + + (self.update_current_path)(&entry.name).await; + + if entry.is_dir() { + continue; + } + + + + + let fivs = match entry.file_info_versions(&bucket) { + Ok(fivs) => fivs, + Err(e) => { + error!("scan_folder: list_path_raw: failed to get file info versions: {}", e); + if let Err(e) = send_heal_request(HealChannelRequest { + bucket: bucket.clone(), + object_prefix: Some(entry.name.clone()), + ..Default::default() + }).await { + error!("scan_folder: list_path_raw: failed to send heal request: {}", e); + continue; + } + + + found_objects = true; + + continue; + } + }; + + for fiv in fivs.versions { + + if let Err(e) = send_heal_request(HealChannelRequest { + bucket: bucket.clone(), + object_prefix: Some(entry.name.clone()), + object_version_id: fiv.version_id.map(|v| v.to_string()), + ..Default::default() + }).await { + error!("scan_folder: list_path_raw: failed to send heal request: {}", e); + continue; + } + + found_objects = true; + + } + + + } + Some(errs) = finished_rx.recv() => { + debug!("scan_folder: list_path_raw: found finished errs: {:?}", errs); + child_ctx.cancel(); + } + _ = child_ctx.cancelled() => { + debug!("scan_folder: list_path_raw: child context cancelled loop break"); + break; + } + } + } + + if found_objects { + let folder_item = CachedFolder { + name: name.clone(), + parent: Some(this_hash.clone()), + object_heal_prob_div: 1, + }; + + let mut dst = if !into.compacted { + DataUsageEntry::default() + } else { + into.clone() + }; + + // Use Box::pin for recursive async call + let fut = Box::pin(self.scan_folder(ctx.clone(), folder_item.clone(), &mut dst)); + fut.await.map_err(|e| ScannerError::Other(e.to_string()))?; + + if !into.compacted { + let h = DataUsageHash(folder_item.name.clone()); + into.add_child(&h); + // We scanned a folder, optionally send update. + self.update_cache.delete_recursive(&h); + self.update_cache.copy_with_children(&self.new_cache, &h, &folder_item.parent); + self.send_update().await; + } + } + } + + break; + } + + if !was_compacted { + self.new_cache.replace_hashed(&this_hash, &folder.parent, into); + } + + if !into.compacted && self.new_cache.info.name != folder.name { + if let Some(mut flat) = self.new_cache.size_recursive(&this_hash.key()) { + flat.compacted = true; + let mut should_compact = false; + + if flat.objects < DATA_SCANNER_COMPACT_LEAST_OBJECT { + should_compact = true; + } else { + // Compact if we only have objects as children... 
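+                // Compaction rule: folders below DATA_SCANNER_COMPACT_LEAST_OBJECT objects
+                // are always flattened; larger ones are flattened only if every child is a
+                // plain leaf (no children of its own, at most one object), so real subtrees
+                // keep their own addressable totals.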
+ should_compact = true; + for k in &into.children { + if let Some(v) = self.new_cache.cache.get(k) { + if !v.children.is_empty() || v.objects > 1 { + should_compact = false; + break; + } + } + } + } + + if should_compact { + self.new_cache.delete_recursive(&this_hash); + self.new_cache.replace_hashed(&this_hash, &folder.parent, &flat); + } + } + } + + // Compact if too many children... + if !into.compacted { + self.new_cache.reduce_children_of( + &this_hash, + DATA_SCANNER_COMPACT_AT_CHILDREN, + self.new_cache.info.name != folder.name, + ); + } + + if self.update_cache.cache.contains_key(&this_hash.key()) && !was_compacted { + // Replace if existed before. + if let Some(flat) = self.new_cache.size_recursive(&this_hash.key()) { + self.update_cache.delete_recursive(&this_hash); + self.update_cache.replace_hashed(&this_hash, &folder.parent, &flat); + } + } + + Ok(()) + } + + pub fn as_mut_new_cache(&mut self) -> &mut DataUsageCache { + &mut self.new_cache + } +} + +/// Scan a data folder +/// This function scans the basepath+cache.info.name and returns an updated cache. +/// The returned cache will always be valid, but may not be updated from the existing. +/// Before each operation sleepDuration is called which can be used to temporarily halt the scanner. +/// If the supplied context is canceled the function will return at the first chance. +#[allow(clippy::too_many_arguments)] +pub async fn scan_data_folder( + ctx: CancellationToken, + disks: Vec>, + local_disk: Arc, + cache: DataUsageCache, + updates: Option>, + scan_mode: HealScanMode, + we_sleep: Box bool + Send + Sync>, +) -> Result { + use crate::data_usage_define::DATA_USAGE_ROOT; + + // Check that we're not trying to scan the root + if cache.info.name.is_empty() || cache.info.name == DATA_USAGE_ROOT { + return Err(ScannerError::Other("internal error: root scan attempted".to_string())); + } + + // Get disk path + let base_path = local_disk.path().to_string_lossy().to_string(); + + let (update_current_path, close_disk) = current_path_updater(&base_path, &cache.info.name); + + // Create skip_heal flag + let is_erasure_mode = is_erasure().await; + let skip_heal = std::sync::Arc::new(std::sync::atomic::AtomicBool::new(!is_erasure_mode || cache.info.skip_healing)); + + // Create heal_object_select flag + let heal_object_select = if is_erasure_mode && !cache.info.skip_healing { + heal_object_select_prob() + } else { + 0 + }; + + let disks_quorum = disks.len() / 2; + + // Create folder scanner + let mut scanner = FolderScanner { + root: base_path, + old_cache: cache.clone(), + new_cache: DataUsageCache { + info: cache.info.clone(), + ..Default::default() + }, + update_cache: DataUsageCache { + info: cache.info.clone(), + ..Default::default() + }, + data_usage_scanner_debug: false, + heal_object_select, + scan_mode, + we_sleep, + disks, + disks_quorum, + updates, + last_update: SystemTime::UNIX_EPOCH, + update_current_path, + skip_heal, + local_disk, + }; + + // Check if context is cancelled + if ctx.is_cancelled() { + return Err(ScannerError::Other("Operation cancelled".to_string())); + } + + // Read top level in bucket + let mut root = DataUsageEntry::default(); + let folder = CachedFolder { + name: cache.info.name.clone(), + parent: None, + object_heal_prob_div: 1, + }; + + warn!("scan_data_folder: folder: {:?}", folder); + + // Scan the folder + match scanner.scan_folder(ctx, folder, &mut root).await { + Ok(()) => { + // Get the new cache and finalize it + let new_cache = scanner.as_mut_new_cache(); + 
new_cache.force_compact(DATA_SCANNER_COMPACT_AT_CHILDREN); + new_cache.info.last_update = Some(SystemTime::now()); + new_cache.info.next_cycle = cache.info.next_cycle; + + (close_disk)().await; + Ok(new_cache.clone()) + } + Err(e) => { + (close_disk)().await; + // No useful information, return original cache + Err(e) + } + } +} diff --git a/crates/scanner/src/scanner_io.rs b/crates/scanner/src/scanner_io.rs new file mode 100644 index 00000000..0b33e884 --- /dev/null +++ b/crates/scanner/src/scanner_io.rs @@ -0,0 +1,631 @@ +use crate::scanner_folder::{ScannerItem, scan_data_folder}; +use crate::{ + DATA_USAGE_CACHE_NAME, DATA_USAGE_ROOT, DataUsageCache, DataUsageCacheInfo, DataUsageEntry, DataUsageEntryInfo, + DataUsageInfo, SizeSummary, TierStats, +}; +use futures::future::join_all; +use rand::seq::SliceRandom as _; +use rustfs_common::heal_channel::HealScanMode; +use rustfs_ecstore::bucket::bucket_target_sys::BucketTargetSys; +use rustfs_ecstore::bucket::lifecycle::lifecycle::Lifecycle; +use rustfs_ecstore::bucket::metadata_sys::{get_lifecycle_config, get_object_lock_config, get_replication_config}; +use rustfs_ecstore::bucket::replication::{ReplicationConfig, ReplicationConfigurationExt}; +use rustfs_ecstore::bucket::versioning::VersioningApi as _; +use rustfs_ecstore::bucket::versioning_sys::BucketVersioningSys; +use rustfs_ecstore::config::storageclass; +use rustfs_ecstore::disk::STORAGE_FORMAT_FILE; +use rustfs_ecstore::disk::{Disk, DiskAPI}; +use rustfs_ecstore::error::{Error, StorageError}; +use rustfs_ecstore::global::GLOBAL_TierConfigMgr; +use rustfs_ecstore::new_object_layer_fn; +use rustfs_ecstore::set_disk::SetDisks; +use rustfs_ecstore::store_api::{BucketInfo, BucketOptions, ObjectInfo}; +use rustfs_ecstore::{StorageAPI, error::Result, store::ECStore}; +use rustfs_filemeta::FileMeta; +use rustfs_utils::path::{SLASH_SEPARATOR, path_join_buf}; +use s3s::dto::{BucketLifecycleConfiguration, ReplicationConfiguration}; +use std::collections::HashMap; +use std::time::SystemTime; +use std::{fmt::Debug, sync::Arc}; +use time::OffsetDateTime; +use tokio::sync::{Mutex, mpsc}; +use tokio::time::Duration; +use tokio_util::sync::CancellationToken; +use tracing::{debug, error, info, warn}; + +#[async_trait::async_trait] +pub trait ScannerIO: Send + Sync + Debug + 'static { + async fn nsscanner( + &self, + ctx: CancellationToken, + updates: mpsc::Sender, + want_cycle: u64, + scan_mode: HealScanMode, + ) -> Result<()>; +} + +#[async_trait::async_trait] +pub trait ScannerIOCache: Send + Sync + Debug + 'static { + async fn nsscanner_cache( + self: Arc, + ctx: CancellationToken, + buckets: Vec, + updates: mpsc::Sender, + want_cycle: u64, + scan_mode: HealScanMode, + ) -> Result<()>; +} + +#[async_trait::async_trait] +pub trait ScannerIODisk: Send + Sync + Debug + 'static { + async fn nsscanner_disk( + &self, + ctx: CancellationToken, + cache: DataUsageCache, + updates: Option>, + scan_mode: HealScanMode, + ) -> Result; + + async fn get_size(&self, item: ScannerItem) -> Result; +} + +#[async_trait::async_trait] +impl ScannerIO for ECStore { + async fn nsscanner( + &self, + ctx: CancellationToken, + updates: mpsc::Sender, + want_cycle: u64, + scan_mode: HealScanMode, + ) -> Result<()> { + let child_token = ctx.child_token(); + + let all_buckets = self.list_bucket(&BucketOptions::default()).await?; + + if all_buckets.is_empty() { + if let Err(e) = updates.send(DataUsageInfo::default()).await { + error!("Failed to send data usage info: {}", e); + } + return Ok(()); + } + + let mut total_results = 0; 
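+        // Fan-out layout: one result slot per erasure set across all pools. Each set scans
+        // independently into its slot and a background task merges the slots into a single
+        // DataUsageInfo for the `updates` channel. Equivalent count (sketch):
+        //
+        //     let total_results: usize = self.pools.iter().map(|p| p.disk_set.len()).sum();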
+ for pool in self.pools.iter() { + total_results += pool.disk_set.len(); + } + + let results = vec![DataUsageCache::default(); total_results]; + let results_mutex: Arc>> = Arc::new(Mutex::new(results)); + let first_err_mutex: Arc>> = Arc::new(Mutex::new(None)); + let mut results_index: i32 = -1_i32; + let mut wait_futs = Vec::new(); + + for pool in self.pools.iter() { + for set in pool.disk_set.iter() { + results_index += 1; + + let results_index_clone = results_index as usize; + // Clone the Arc to move it into the spawned task + let set_clone: Arc = Arc::clone(set); + + let child_token_clone = child_token.clone(); + let want_cycle_clone = want_cycle; + let scan_mode_clone = scan_mode; + let results_mutex_clone = results_mutex.clone(); + let first_err_mutex_clone = first_err_mutex.clone(); + + let (tx, mut rx) = tokio::sync::mpsc::channel::(1); + + // Spawn task to receive and store results + let receiver_fut = tokio::spawn(async move { + while let Some(result) = rx.recv().await { + let mut results = results_mutex_clone.lock().await; + results[results_index_clone] = result; + } + }); + wait_futs.push(receiver_fut); + + let all_buckets_clone = all_buckets.clone(); + // Spawn task to run the scanner + let scanner_fut = tokio::spawn(async move { + if let Err(e) = set_clone + .nsscanner_cache(child_token_clone.clone(), all_buckets_clone, tx, want_cycle_clone, scan_mode_clone) + .await + { + error!("Failed to scan set: {e}"); + let _ = first_err_mutex_clone.lock().await.insert(e); + child_token_clone.cancel(); + } + }); + wait_futs.push(scanner_fut); + } + } + + let (update_tx, mut update_rx) = tokio::sync::oneshot::channel::<()>(); + + let all_buckets_clone = all_buckets.iter().map(|b| b.name.clone()).collect::>(); + tokio::spawn(async move { + let mut last_update = SystemTime::now(); + + let mut ticker = tokio::time::interval(Duration::from_secs(30)); + loop { + tokio::select! 
{ + _ = child_token.cancelled() => { + break; + } + res = &mut update_rx => { + if res.is_err() { + break; + } + + let results = results_mutex.lock().await; + let mut all_merged = DataUsageCache::default(); + for result in results.iter() { + if result.info.last_update.is_none() { + return; + } + all_merged.merge(result); + } + + if all_merged.root().is_some() && all_merged.info.last_update.unwrap() > last_update { + if let Err(e) = updates + .send(all_merged.dui(&all_merged.info.name, &all_buckets_clone)) + .await { + error!("Failed to send data usage info: {}", e); + } + } + break; + } + _ = ticker.tick() => { + let results = results_mutex.lock().await; + let mut all_merged = DataUsageCache::default(); + for result in results.iter() { + if result.info.last_update.is_none() { + return; + } + all_merged.merge(result); + } + + if all_merged.root().is_some() && all_merged.info.last_update.unwrap() > last_update { + if let Err(e) = updates + .send(all_merged.dui(&all_merged.info.name, &all_buckets_clone)) + .await { + error!("Failed to send data usage info: {}", e); + } + last_update = all_merged.info.last_update.unwrap(); + } + } + } + } + }); + + let _ = join_all(wait_futs).await; + + let _ = update_tx.send(()); + + Ok(()) + } +} + +#[async_trait::async_trait] +impl ScannerIOCache for SetDisks { + async fn nsscanner_cache( + self: Arc, + ctx: CancellationToken, + buckets: Vec, + updates: mpsc::Sender, + want_cycle: u64, + scan_mode: HealScanMode, + ) -> Result<()> { + if buckets.is_empty() { + return Ok(()); + } + + let (disks, healing) = self.get_online_disks_with_healing(false).await; + if disks.is_empty() { + info!("No online disks available for set"); + return Ok(()); + } + + let mut old_cache = DataUsageCache::default(); + old_cache.load(self.clone(), DATA_USAGE_CACHE_NAME).await?; + + let mut cache = DataUsageCache { + info: DataUsageCacheInfo { + name: DATA_USAGE_ROOT.to_string(), + next_cycle: old_cache.info.next_cycle, + ..Default::default() + }, + cache: HashMap::new(), + }; + + let (bucket_tx, bucket_rx) = mpsc::channel::(buckets.len()); + + let mut permutes = buckets.clone(); + permutes.shuffle(&mut rand::rng()); + + for bucket in permutes.iter() { + if old_cache.find(&bucket.name).is_none() { + if let Err(e) = bucket_tx.send(bucket.clone()).await { + error!("Failed to send bucket info: {}", e); + } + } + } + + for bucket in permutes.iter() { + if let Some(c) = old_cache.find(&bucket.name) { + cache.replace(&bucket.name, DATA_USAGE_ROOT, c.clone()); + + if let Err(e) = bucket_tx.send(bucket.clone()).await { + error!("Failed to send bucket info: {}", e); + } + } + } + + drop(bucket_tx); + + let cache_mutex: Arc> = Arc::new(Mutex::new(cache)); + + let (bucket_result_tx, mut bucket_result_rx) = mpsc::channel::(disks.len()); + + let cache_mutex_clone = cache_mutex.clone(); + let store_clone = self.clone(); + let ctx_clone = ctx.clone(); + let send_update_fut = tokio::spawn(async move { + let mut ticker = tokio::time::interval(Duration::from_secs(30 + rand::random::() % 10)); + + let mut last_update = None; + + loop { + tokio::select! 
{ + _ = ctx_clone.cancelled() => { + break; + } + _ = ticker.tick() => { + + let cache = cache_mutex_clone.lock().await; + if cache.info.last_update == last_update { + continue; + } + + if let Err(e) = cache.save(store_clone.clone(), DATA_USAGE_CACHE_NAME).await { + error!("Failed to save data usage cache: {}", e); + } + + if let Err(e) = updates.send(cache.clone()).await { + error!("Failed to send data usage cache: {}", e); + + } + + last_update = cache.info.last_update; + } + res = bucket_result_rx.recv() => { + if let Some(result) = res { + let mut cache = cache_mutex_clone.lock().await; + cache.replace(&result.name, &result.parent, result.entry); + cache.info.last_update = Some(SystemTime::now()); + + } else { + let mut cache = cache_mutex_clone.lock().await; + cache.info.next_cycle =want_cycle; + cache.info.last_update = Some(SystemTime::now()); + + if let Err(e) = cache.save(store_clone.clone(), DATA_USAGE_CACHE_NAME).await { + error!("Failed to save data usage cache: {}", e); + } + + if let Err(e) = updates.send(cache.clone()).await { + error!("Failed to send data usage cache: {}", e); + + } + + return; + } + } + } + } + }); + + let mut futs = Vec::new(); + + let bucket_rx_mutex: Arc>> = Arc::new(Mutex::new(bucket_rx)); + let bucket_result_tx_clone: Arc>> = Arc::new(Mutex::new(bucket_result_tx)); + for disk in disks.into_iter() { + let bucket_rx_mutex_clone = bucket_rx_mutex.clone(); + let ctx_clone = ctx.clone(); + let store_clone_clone = self.clone(); + let bucket_result_tx_clone_clone = bucket_result_tx_clone.clone(); + futs.push(tokio::spawn(async move { + while let Some(bucket) = bucket_rx_mutex_clone.lock().await.recv().await { + if ctx_clone.is_cancelled() { + break; + } + + debug!("nsscanner_disk: got bucket: {}", bucket.name); + + let cache_name = path_join_buf(&[&bucket.name, DATA_USAGE_CACHE_NAME]); + + let mut cache = DataUsageCache::default(); + if let Err(e) = cache.load(store_clone_clone.clone(), &cache_name).await { + error!("Failed to load data usage cache: {}", e); + } + + if cache.info.name.is_empty() { + cache.info.name = bucket.name.clone(); + } + + cache.info.skip_healing = healing; + cache.info.next_cycle = want_cycle; + if cache.info.name != bucket.name { + cache.info = DataUsageCacheInfo { + name: bucket.name.clone(), + next_cycle: want_cycle, + ..Default::default() + }; + } + + warn!("nsscanner_disk: cache.info.name: {:?}", cache.info.name); + + let (updates_tx, mut updates_rx) = mpsc::channel::(1); + + let ctx_clone_clone = ctx_clone.clone(); + let bucket_name_clone = bucket.name.clone(); + let bucket_result_tx_clone_clone_clone = bucket_result_tx_clone_clone.clone(); + let update_fut = tokio::spawn(async move { + while let Some(result) = updates_rx.recv().await { + if ctx_clone_clone.is_cancelled() { + break; + } + + if let Err(e) = bucket_result_tx_clone_clone_clone + .lock() + .await + .send(DataUsageEntryInfo { + name: bucket_name_clone.clone(), + parent: DATA_USAGE_ROOT.to_string(), + entry: result, + }) + .await + { + error!("Failed to send data usage entry info: {}", e); + } + } + }); + + let before = cache.info.last_update; + + cache = match disk + .clone() + .nsscanner_disk(ctx_clone.clone(), cache.clone(), Some(updates_tx), scan_mode) + .await + { + Ok(cache) => cache, + Err(e) => { + error!("Failed to scan disk: {}", e); + + if let (Some(last_update), Some(before_update)) = (cache.info.last_update, before) { + if last_update > before_update { + if let Err(e) = cache.save(store_clone_clone.clone(), cache_name.as_str()).await { + error!("Failed to 
save data usage cache: {}", e); + } + } + } + + if let Err(e) = update_fut.await { + error!("Failed to update data usage cache: {}", e); + } + continue; + } + }; + + debug!("nsscanner_disk: got cache: {}", cache.info.name); + + if let Err(e) = update_fut.await { + error!("nsscanner_disk: Failed to update data usage cache: {}", e); + } + + let root = if let Some(r) = cache.root() { + cache.flatten(&r) + } else { + DataUsageEntry::default() + }; + + if ctx_clone.is_cancelled() { + break; + } + + debug!("nsscanner_disk: sending data usage entry info: {}", cache.info.name); + + if let Err(e) = bucket_result_tx_clone_clone + .lock() + .await + .send(DataUsageEntryInfo { + name: cache.info.name.clone(), + parent: DATA_USAGE_ROOT.to_string(), + entry: root, + }) + .await + { + error!("nsscanner_disk: Failed to send data usage entry info: {}", e); + } + + if let Err(e) = cache.save(store_clone_clone.clone(), &cache_name).await { + error!("nsscanner_disk: Failed to save data usage cache: {}", e); + } + } + })); + } + + let _ = join_all(futs).await; + + warn!("nsscanner_cache: joining all futures"); + + drop(bucket_result_tx_clone); + + warn!("nsscanner_cache: dropping bucket result tx"); + + send_update_fut.await?; + + warn!("nsscanner_cache: done"); + + Ok(()) + } +} + +#[async_trait::async_trait] +impl ScannerIODisk for Disk { + async fn get_size(&self, mut item: ScannerItem) -> Result { + if !item.path.ends_with(&format!("{SLASH_SEPARATOR}{STORAGE_FORMAT_FILE}")) { + return Err(StorageError::other("skip file".to_string())); + } + + debug!("get_size: reading metadata for {}/{}", &item.bucket, &item.object_path()); + + let data = match self.read_metadata(&item.bucket, &item.object_path()).await { + Ok(data) => data, + Err(e) => return Err(StorageError::other(format!("Failed to read metadata: {e}"))), + }; + + item.transform_meta_dir(); + + let meta = FileMeta::load(&data)?; + let fivs = match meta.get_file_info_versions(item.bucket.as_str(), item.object_path().as_str(), false) { + Ok(versions) => versions, + Err(e) => { + error!("Failed to get file info versions: {}", e); + return Err(StorageError::other("skip file".to_string())); + } + }; + + let versioned = BucketVersioningSys::get(&item.bucket) + .await + .map(|v| v.versioned(&item.object_path())) + .unwrap_or(false); + + let object_infos = fivs + .versions + .iter() + .map(|v| ObjectInfo::from_file_info(v, item.bucket.as_str(), item.object_path().as_str(), versioned)) + .collect::>(); + + let mut size_summary = SizeSummary::default(); + + let tiers = { + let tier_config_mgr = GLOBAL_TierConfigMgr.read().await; + tier_config_mgr.list_tiers() + }; + + for tier in tiers.iter() { + size_summary.tier_stats.insert(tier.name.clone(), TierStats::default()); + } + if !size_summary.tier_stats.is_empty() { + size_summary + .tier_stats + .insert(storageclass::STANDARD.to_string(), TierStats::default()); + size_summary + .tier_stats + .insert(storageclass::RRS.to_string(), TierStats::default()); + } + + let lock_config = match get_object_lock_config(&item.bucket).await { + Ok((cfg, _)) => Some(Arc::new(cfg)), + Err(_) => None, + }; + + let Some(ecstore) = new_object_layer_fn() else { + error!("ECStore not available"); + return Err(StorageError::other("ECStore not available".to_string())); + }; + + item.apply_actions(ecstore, object_infos, lock_config, &mut size_summary) + .await; + + // TODO: enqueueFreeVersion + + Ok(size_summary) + } + async fn nsscanner_disk( + &self, + ctx: CancellationToken, + cache: DataUsageCache, + updates: Option>, + scan_mode: 
HealScanMode, + ) -> Result { + // match self { + // Disk::Local(local_disk) => local_disk.nsscanner_disk(ctx, cache, updates, scan_mode).await, + // Disk::Remote(remote_disk) => remote_disk.nsscanner_disk(ctx, cache, updates, scan_mode).await, + // } + + let _guard = self.start_scan(); + + let mut cache = cache; + + let (lifecycle_config, _) = get_lifecycle_config(&cache.info.name) + .await + .unwrap_or((BucketLifecycleConfiguration::default(), OffsetDateTime::now_utc())); + + if lifecycle_config.has_active_rules("") { + cache.info.lifecycle = Some(Arc::new(lifecycle_config)); + } + + let (replication_config, _) = get_replication_config(&cache.info.name).await.unwrap_or(( + ReplicationConfiguration { + role: "".to_string(), + rules: vec![], + }, + OffsetDateTime::now_utc(), + )); + + if replication_config.has_active_rules("", true) { + if let Ok(targets) = BucketTargetSys::get().list_bucket_targets(&cache.info.name).await { + cache.info.replication = Some(Arc::new(ReplicationConfig { + config: Some(replication_config), + remotes: Some(targets), + })); + } + } + + // TODO: object lock + + let Some(ecstore) = new_object_layer_fn() else { + error!("ECStore not available"); + return Err(StorageError::other("ECStore not available".to_string())); + }; + + let disk_location = self.get_disk_location(); + + let (Some(pool_idx), Some(set_idx)) = (disk_location.pool_idx, disk_location.set_idx) else { + error!("Disk location not available"); + return Err(StorageError::other("Disk location not available".to_string())); + }; + + let disks_result = ecstore.get_disks(pool_idx, set_idx).await?; + + let Some(disk_idx) = disk_location.disk_idx else { + error!("Disk index not available"); + return Err(StorageError::other("Disk index not available".to_string())); + }; + + let local_disk = if let Some(Some(local_disk)) = disks_result.get(disk_idx) { + local_disk.clone() + } else { + error!("Local disk not available"); + return Err(StorageError::other("Local disk not available".to_string())); + }; + + let disks = disks_result.into_iter().flatten().collect::>>(); + + // Create we_sleep function (always return false for now, can be enhanced later) + let we_sleep: Box bool + Send + Sync> = Box::new(|| false); + + let result = scan_data_folder(ctx, disks, local_disk, cache, updates, scan_mode, we_sleep).await; + + match result { + Ok(mut data_usage_info) => { + data_usage_info.info.last_update = Some(SystemTime::now()); + Ok(data_usage_info) + } + Err(e) => Err(StorageError::other(format!("Failed to scan data folder: {e}"))), + } + } +} diff --git a/crates/targets/src/event_name.rs b/crates/targets/src/event_name.rs index 0ee1dce2..6df8d3f8 100644 --- a/crates/targets/src/event_name.rs +++ b/crates/targets/src/event_name.rs @@ -12,7 +12,6 @@ // See the License for the specific language governing permissions and // limitations under the License. -use serde::{Deserialize, Serialize}; use std::fmt; /// Error returned when parsing event name string fails. @@ -29,7 +28,7 @@ impl std::error::Error for ParseEventNameError {} /// Represents the type of event that occurs on the object. /// Based on AWS S3 event type and includes RustFS extension. 
-#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq, Hash, Default)] +#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Default)] pub enum EventName { // Single event type (values are 1-32 for compatible mask logic) ObjectAccessedGet = 1, @@ -289,3 +288,79 @@ impl From<&str> for EventName { EventName::parse(event_str).unwrap_or_else(|e| panic!("{}", e)) } } + +impl serde::ser::Serialize for EventName { + fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error> + where + S: serde::ser::Serializer, + { + serializer.serialize_str(self.as_str()) + } +} + +impl<'de> serde::de::Deserialize<'de> for EventName { + fn deserialize<D>(deserializer: D) -> Result<Self, D::Error> + where + D: serde::de::Deserializer<'de>, + { + let s = String::deserialize(deserializer)?; + let s = Self::parse(&s).map_err(serde::de::Error::custom)?; + Ok(s) + } +} + +#[cfg(test)] +mod tests { + use super::*; + + // test serialization + #[test] + fn test_event_name_serialization_and_deserialization() { + struct TestCase { + event: EventName, + serialized_str: &'static str, + } + + let test_cases = vec![ + TestCase { + event: EventName::BucketCreated, + serialized_str: "\"s3:BucketCreated:*\"", + }, + TestCase { + event: EventName::ObjectCreatedAll, + serialized_str: "\"s3:ObjectCreated:*\"", + }, + TestCase { + event: EventName::ObjectCreatedPut, + serialized_str: "\"s3:ObjectCreated:Put\"", + }, + ]; + + for case in &test_cases { + let serialized = serde_json::to_string(&case.event); + assert!(serialized.is_ok(), "Serialization failed for `{}`", case.serialized_str); + assert_eq!(serialized.unwrap(), case.serialized_str); + + let deserialized = serde_json::from_str::<EventName>(case.serialized_str); + assert!(deserialized.is_ok(), "Deserialization failed for `{}`", case.serialized_str); + assert_eq!(deserialized.unwrap(), case.event); + } + } + + #[test] + fn test_invalid_event_name_deserialization() { + let invalid_str = "\"s3:InvalidEvent:Test\""; + let deserialized = serde_json::from_str::<EventName>(invalid_str); + assert!(deserialized.is_err(), "Deserialization should fail for invalid event name"); + + // Serializing EventName::Everything produces an empty string, but deserializing an empty string should fail.
+ let event_name = EventName::Everything; + let serialized_str = "\"\""; + let serialized = serde_json::to_string(&event_name); + assert!(serialized.is_ok(), "Serialization failed for `{serialized_str}`"); + assert_eq!(serialized.unwrap(), serialized_str); + + let deserialized = serde_json::from_str::(serialized_str); + assert!(deserialized.is_err(), "Deserialization should fail for empty string"); + } +} diff --git a/crates/targets/src/lib.rs b/crates/targets/src/lib.rs index aae0ac3e..a2351fb0 100644 --- a/crates/targets/src/lib.rs +++ b/crates/targets/src/lib.rs @@ -27,6 +27,7 @@ pub use target::Target; /// Represents a log of events for sending to targets #[derive(Debug, Clone, Serialize, Deserialize)] +#[serde(rename_all = "PascalCase")] pub struct TargetLog { /// The event name pub event_name: EventName, diff --git a/crates/targets/src/store.rs b/crates/targets/src/store.rs index 139e32be..be15a9b4 100644 --- a/crates/targets/src/store.rs +++ b/crates/targets/src/store.rs @@ -312,7 +312,7 @@ where compress: true, }; - let data = serde_json::to_vec(&item).map_err(|e| StoreError::Serialization(e.to_string()))?; + let data = serde_json::to_vec(&*item).map_err(|e| StoreError::Serialization(e.to_string()))?; self.write_file(&key, &data)?; Ok(key) diff --git a/crates/targets/src/target/mod.rs b/crates/targets/src/target/mod.rs index 627fa8d0..876f186b 100644 --- a/crates/targets/src/target/mod.rs +++ b/crates/targets/src/target/mod.rs @@ -159,3 +159,30 @@ impl std::fmt::Display for TargetType { } } } + +/// Decodes a form-urlencoded object name to its original form. +/// +/// This function properly handles form-urlencoded strings where spaces are +/// represented as `+` symbols. It first replaces `+` with spaces, then +/// performs standard percent-decoding. +/// +/// # Arguments +/// * `encoded` - The form-urlencoded string to decode +/// +/// # Returns +/// The decoded string, or an error if decoding fails +/// +/// # Example +/// ``` +/// use rustfs_targets::target::decode_object_name; +/// +/// let encoded = "greeting+file+%282%29.csv"; +/// let decoded = decode_object_name(encoded).unwrap(); +/// assert_eq!(decoded, "greeting file (2).csv"); +/// ``` +pub fn decode_object_name(encoded: &str) -> Result { + let replaced = encoded.replace("+", " "); + urlencoding::decode(&replaced) + .map(|s| s.into_owned()) + .map_err(|e| TargetError::Encoding(format!("Failed to decode object key: {e}"))) +} diff --git a/crates/targets/src/target/mqtt.rs b/crates/targets/src/target/mqtt.rs index 45b73e5e..9de8ac94 100644 --- a/crates/targets/src/target/mqtt.rs +++ b/crates/targets/src/target/mqtt.rs @@ -12,12 +12,15 @@ // See the License for the specific language governing permissions and // limitations under the License. 
-use crate::store::Key; -use crate::target::{ChannelTargetType, EntityTarget, TargetType}; -use crate::{StoreError, Target, TargetLog, arn::TargetID, error::TargetError, store::Store}; +use crate::{ + StoreError, Target, TargetLog, + arn::TargetID, + error::TargetError, + store::{Key, QueueStore, Store}, + target::{ChannelTargetType, EntityTarget, TargetType}, +}; use async_trait::async_trait; -use rumqttc::{AsyncClient, EventLoop, MqttOptions, Outgoing, Packet, QoS}; -use rumqttc::{ConnectionError, mqttbytes::Error as MqttBytesError}; +use rumqttc::{AsyncClient, ConnectionError, EventLoop, MqttOptions, Outgoing, Packet, QoS, mqttbytes::Error as MqttBytesError}; use serde::Serialize; use serde::de::DeserializeOwned; use std::sync::Arc; @@ -29,7 +32,6 @@ use std::{ use tokio::sync::{Mutex, OnceCell, mpsc}; use tracing::{debug, error, info, instrument, trace, warn}; use url::Url; -use urlencoding; const DEFAULT_CONNECTION_TIMEOUT: Duration = Duration::from_secs(15); const EVENT_LOOP_POLL_TIMEOUT: Duration = Duration::from_secs(10); // For initial connection check in task @@ -130,10 +132,10 @@ where debug!(target_id = %target_id, path = %specific_queue_path.display(), "Initializing queue store for MQTT target"); let extension = match args.target_type { TargetType::AuditLog => rustfs_config::audit::AUDIT_STORE_EXTENSION, - TargetType::NotifyEvent => rustfs_config::notify::STORE_EXTENSION, + TargetType::NotifyEvent => rustfs_config::notify::NOTIFY_STORE_EXTENSION, }; - let store = crate::store::QueueStore::>::new(specific_queue_path, args.queue_limit, extension); + let store = QueueStore::>::new(specific_queue_path, args.queue_limit, extension); if let Err(e) = store.open() { error!( target_id = %target_id, @@ -255,8 +257,8 @@ where .as_ref() .ok_or_else(|| TargetError::Configuration("MQTT client not initialized".to_string()))?; - let object_name = urlencoding::decode(&event.object_name) - .map_err(|e| TargetError::Encoding(format!("Failed to decode object key: {e}")))?; + // Decode form-urlencoded object name + let object_name = crate::target::decode_object_name(&event.object_name)?; let key = format!("{}/{}", event.bucket_name, object_name); diff --git a/crates/targets/src/target/webhook.rs b/crates/targets/src/target/webhook.rs index d2de20e9..5c505e3b 100644 --- a/crates/targets/src/target/webhook.rs +++ b/crates/targets/src/target/webhook.rs @@ -12,16 +12,17 @@ // See the License for the specific language governing permissions and // limitations under the License. 
-use crate::target::{ChannelTargetType, EntityTarget, TargetType}; use crate::{ StoreError, Target, TargetLog, arn::TargetID, error::TargetError, - store::{Key, Store}, + store::{Key, QueueStore, Store}, + target::{ChannelTargetType, EntityTarget, TargetType}, }; use async_trait::async_trait; use reqwest::{Client, StatusCode, Url}; -use rustfs_config::notify::STORE_EXTENSION; +use rustfs_config::audit::AUDIT_STORE_EXTENSION; +use rustfs_config::notify::NOTIFY_STORE_EXTENSION; use serde::Serialize; use serde::de::DeserializeOwned; use std::{ @@ -35,7 +36,6 @@ use std::{ use tokio::net::lookup_host; use tokio::sync::mpsc; use tracing::{debug, error, info, instrument}; -use urlencoding; /// Arguments for configuring a Webhook target #[derive(Debug, Clone)] @@ -155,11 +155,11 @@ where PathBuf::from(&args.queue_dir).join(format!("rustfs-{}-{}", ChannelTargetType::Webhook.as_str(), target_id.id)); let extension = match args.target_type { - TargetType::AuditLog => rustfs_config::audit::AUDIT_STORE_EXTENSION, - TargetType::NotifyEvent => STORE_EXTENSION, + TargetType::AuditLog => AUDIT_STORE_EXTENSION, + TargetType::NotifyEvent => NOTIFY_STORE_EXTENSION, }; - let store = crate::store::QueueStore::>::new(queue_dir, args.queue_limit, extension); + let store = QueueStore::>::new(queue_dir, args.queue_limit, extension); if let Err(e) = store.open() { error!("Failed to open store for Webhook target {}: {}", target_id.id, e); @@ -220,8 +220,8 @@ where async fn send(&self, event: &EntityTarget) -> Result<(), TargetError> { info!("Webhook Sending event to webhook target: {}", self.id); - let object_name = urlencoding::decode(&event.object_name) - .map_err(|e| TargetError::Encoding(format!("Failed to decode object key: {e}")))?; + // Decode form-urlencoded object name + let object_name = crate::target::decode_object_name(&event.object_name)?; let key = format!("{}/{}", event.bucket_name, object_name); @@ -420,3 +420,51 @@ where self.args.enable } } + +#[cfg(test)] +mod tests { + use crate::target::decode_object_name; + use url::form_urlencoded; + + #[test] + fn test_decode_object_name_with_spaces() { + // Test case from the issue: "greeting file (2).csv" + let object_name = "greeting file (2).csv"; + + // Simulate what event.rs does: form-urlencoded encoding (spaces become +) + let form_encoded = form_urlencoded::byte_serialize(object_name.as_bytes()).collect::(); + assert_eq!(form_encoded, "greeting+file+%282%29.csv"); + + // Test the decode_object_name helper function + let decoded = decode_object_name(&form_encoded).unwrap(); + assert_eq!(decoded, object_name); + assert!(!decoded.contains('+'), "Decoded string should not contain + symbols"); + } + + #[test] + fn test_decode_object_name_with_special_chars() { + // Test with various special characters + let test_cases = vec![ + ("folder/greeting file (2).csv", "folder%2Fgreeting+file+%282%29.csv"), + ("test file.txt", "test+file.txt"), + ("my file (copy).pdf", "my+file+%28copy%29.pdf"), + ("file with spaces and (parentheses).doc", "file+with+spaces+and+%28parentheses%29.doc"), + ]; + + for (original, form_encoded) in test_cases { + // Test the decode_object_name helper function + let decoded = decode_object_name(form_encoded).unwrap(); + assert_eq!(decoded, original, "Failed to decode: {}", form_encoded); + } + } + + #[test] + fn test_decode_object_name_without_spaces() { + // Test that files without spaces still work correctly + let object_name = "simple-file.txt"; + let form_encoded = form_urlencoded::byte_serialize(object_name.as_bytes()).collect::(); 
+ + let decoded = decode_object_name(&form_encoded).unwrap(); + assert_eq!(decoded, object_name); + } +} diff --git a/crates/utils/Cargo.toml b/crates/utils/Cargo.toml index 5a0bd187..9b05e84e 100644 --- a/crates/utils/Cargo.toml +++ b/crates/utils/Cargo.toml @@ -84,7 +84,7 @@ tls = ["dep:rustls", "dep:rustls-pemfile", "dep:rustls-pki-types"] # tls charac net = ["ip", "dep:url", "dep:netif", "dep:futures", "dep:transform-stream", "dep:bytes", "dep:s3s", "dep:hyper", "dep:thiserror", "dep:tokio"] # network features with DNS resolver io = ["dep:tokio"] path = [] -notify = ["dep:hyper", "dep:s3s", "dep:hashbrown", "dep:thiserror", "dep:serde", "dep:libc"] # file system notification features +notify = ["dep:hyper", "dep:s3s", "dep:hashbrown", "dep:thiserror", "dep:serde", "dep:libc", "dep:url", "dep:regex"] # file system notification features compress = ["dep:flate2", "dep:brotli", "dep:snap", "dep:lz4", "dep:zstd"] string = ["dep:regex", "dep:rand"] crypto = ["dep:base64-simd", "dep:hex-simd", "dep:hmac", "dep:hyper", "dep:sha1"] diff --git a/crates/utils/src/certs.rs b/crates/utils/src/certs.rs index 24657f7a..463874ed 100644 --- a/crates/utils/src/certs.rs +++ b/crates/utils/src/certs.rs @@ -21,7 +21,7 @@ use std::collections::HashMap; use std::io::Error; use std::path::Path; use std::sync::Arc; -use std::{env, fs, io}; +use std::{fs, io}; use tracing::{debug, warn}; /// Load public certificate from file. @@ -243,17 +243,7 @@ pub fn create_multi_cert_resolver( /// * A boolean indicating whether TLS key logging is enabled based on the `RUSTFS_TLS_KEYLOG` environment variable. /// pub fn tls_key_log() -> bool { - env::var("RUSTFS_TLS_KEYLOG") - .map(|v| { - let v = v.trim(); - v.eq_ignore_ascii_case("1") - || v.eq_ignore_ascii_case("on") - || v.eq_ignore_ascii_case("true") - || v.eq_ignore_ascii_case("yes") - || v.eq_ignore_ascii_case("enabled") - || v.eq_ignore_ascii_case("t") - }) - .unwrap_or(false) + crate::get_env_bool(rustfs_config::ENV_TLS_KEYLOG, rustfs_config::DEFAULT_TLS_KEYLOG) } #[cfg(test)] diff --git a/crates/utils/src/string.rs b/crates/utils/src/string.rs index 42a8e0a6..8d3879d1 100644 --- a/crates/utils/src/string.rs +++ b/crates/utils/src/string.rs @@ -48,6 +48,14 @@ pub fn parse_bool(str: &str) -> Result { } } +pub fn parse_bool_with_default(str: &str, default: bool) -> bool { + match str { + "1" | "t" | "T" | "true" | "TRUE" | "True" | "on" | "ON" | "On" | "enabled" => true, + "0" | "f" | "F" | "false" | "FALSE" | "False" | "off" | "OFF" | "Off" | "disabled" => false, + _ => default, + } +} + /// Matches a simple pattern against a name using wildcards. /// /// # Arguments diff --git a/docker-buildx.sh b/docker-buildx.sh index d5770078..ed19c077 100755 --- a/docker-buildx.sh +++ b/docker-buildx.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash set -e diff --git a/docker-compose.yml b/docker-compose.yml index 492803e3..2dd53a8c 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -12,8 +12,6 @@ # See the License for the specific language governing permissions and # limitations under the License. 
-version: "3.8" - services: # RustFS main service rustfs: @@ -30,11 +28,11 @@ services: - "9000:9000" # S3 API port - "9001:9001" # Console port environment: - - RUSTFS_VOLUMES=/data/rustfs{0...3} # Define 4 storage volumes + - RUSTFS_VOLUMES=/data/rustfs{0..3} # Define 4 storage volumes - RUSTFS_ADDRESS=0.0.0.0:9000 - RUSTFS_CONSOLE_ADDRESS=0.0.0.0:9001 - RUSTFS_CONSOLE_ENABLE=true - - RUSTFS_EXTERNAL_ADDRESS=:9000 # Same as internal since no port mapping + - RUSTFS_EXTERNAL_ADDRESS=:9000 # Same as internal since no port mapping - RUSTFS_CORS_ALLOWED_ORIGINS=* - RUSTFS_CONSOLE_CORS_ALLOWED_ORIGINS=* - RUSTFS_ACCESS_KEY=rustfsadmin @@ -43,9 +41,9 @@ services: - RUSTFS_TLS_PATH=/opt/tls - RUSTFS_OBS_ENDPOINT=http://otel-collector:4317 volumes: - - deploy/data/pro:/data - - deploy/logs:/app/logs - - deploy/data/certs/:/opt/tls # TLS configuration, you should create tls directory and put your tls files in it and then specify the path here + - ./deploy/data/pro:/data + - ./deploy/logs:/app/logs + - ./deploy/data/certs/:/opt/tls # TLS configuration, you should create tls directory and put your tls files in it and then specify the path here networks: - rustfs-network restart: unless-stopped @@ -61,7 +59,9 @@ services: retries: 3 start_period: 40s depends_on: - - otel-collector + otel-collector: + condition: service_started + required: false # Development environment rustfs-dev: @@ -70,16 +70,17 @@ services: build: context: . dockerfile: Dockerfile.source + target: dev # Pure development environment ports: - "9010:9000" # S3 API port - "9011:9001" # Console port environment: - - RUSTFS_VOLUMES=/data/rustfs{1...4} + - RUSTFS_VOLUMES=/data/rustfs{0..3} - RUSTFS_ADDRESS=0.0.0.0:9000 - RUSTFS_CONSOLE_ADDRESS=0.0.0.0:9001 - RUSTFS_CONSOLE_ENABLE=true - - RUSTFS_EXTERNAL_ADDRESS=:9010 # External port mapping 9010 -> 9000 + - RUSTFS_EXTERNAL_ADDRESS=:9010 # External port mapping 9010 -> 9000 - RUSTFS_CORS_ALLOWED_ORIGINS=* - RUSTFS_CONSOLE_CORS_ALLOWED_ORIGINS=* - RUSTFS_ACCESS_KEY=devadmin @@ -88,7 +89,8 @@ services: - RUSTFS_OBS_LOG_DIRECTORY=/logs volumes: - .:/app # Mount source code to /app for development - - deploy/data/dev:/data + - cargo_registry:/usr/local/cargo/registry # Mount cargo registry to avoid re-downloading + - ./deploy/data/dev:/data networks: - rustfs-network restart: unless-stopped @@ -230,3 +232,5 @@ volumes: driver: local logs: driver: local + cargo_registry: + driver: local diff --git a/docs/DEVELOPMENT.md b/docs/DEVELOPMENT.md new file mode 100644 index 00000000..60767eb6 --- /dev/null +++ b/docs/DEVELOPMENT.md @@ -0,0 +1,71 @@ +# RustFS Local Development Guide + +This guide explains how to set up and run a local development environment for RustFS using Docker. This approach allows you to build and run the code from source in a consistent environment without needing to install the Rust toolchain on your host machine. + +## Prerequisites + +- [Docker](https://docs.docker.com/get-docker/) +- [Docker Compose](https://docs.docker.com/compose/install/) + +## Quick Start + +The development environment is configured as a Docker Compose profile named `dev`. + +### 1. Setup Console UI (Optional) + +If you want to use the Console UI, you must download the static assets first. The default source checkout does not include them. + +```bash +bash scripts/static.sh +``` + +### 2. 
Start the Environment + +To start the development container: + +```bash +docker compose --profile dev up -d rustfs-dev +``` + +**Note**: The first run will take some time (5-10 minutes) because it builds the docker image and compiles all Rust dependencies from source. Subsequent runs will be much faster. + +### 3. View Logs + +To follow the application logs: + +```bash +docker compose --profile dev logs -f rustfs-dev +``` + +### 4. Access the Services + +- **S3 API**: `http://localhost:9010` +- **Console UI**: `http://localhost:9011/rustfs/console/index.html` + +## Workflow + +### Making Changes +The source code from your local `rustfs` directory is mounted into the container at `/app`. You can edit files in your preferred IDE on your host machine. + +### Applying Changes +Since the application runs via `cargo run`, you need to restart the container to pick up changes. Thanks to incremental compilation, this is fast. + +```bash +docker compose --profile dev restart rustfs-dev +``` + +### Rebuilding Dependencies +If you modify `Cargo.toml` or `Cargo.lock`, you generally need to rebuild the Docker image to update the cached dependencies layer: + +```bash +docker compose --profile dev build rustfs-dev +``` + +## Troubleshooting + +### `VolumeNotFound` Error +If you see an error like `Error: Custom { kind: Other, error: VolumeNotFound }`, it means the `rustfs` binary was started without valid volume arguments. +The development image uses `entrypoint.sh` to parse the `RUSTFS_VOLUMES` environment variable (supporting `{N..M}` syntax), create the directories, and pass them to `cargo run`. Ensure your `RUSTFS_VOLUMES` variable is correctly formatted. + +### Slow Initial Build +This is expected. The `dev` stage in `Dockerfile.source` compiles all dependencies from scratch. Because the `/usr/local/cargo/registry` is mounted as a volume, these compiled artifacts are preserved between restarts, making future builds fast. diff --git a/docs/compression-best-practices.md b/docs/compression-best-practices.md index 77d66ce8..6a10e7db 100644 --- a/docs/compression-best-practices.md +++ b/docs/compression-best-practices.md @@ -3,7 +3,89 @@ ## Overview This document outlines best practices for HTTP response compression in RustFS, based on lessons learned from fixing the -NoSuchKey error response regression (Issue #901). +NoSuchKey error response regression (Issue #901) and the whitelist-based compression redesign (Issue #902). + +## Whitelist-Based Compression (Issue #902) + +### Design Philosophy + +After Issue #901, we identified that the blacklist approach (compress everything except known problematic types) was +still causing issues with browser downloads showing "unknown file size". In Issue #902, we redesigned the compression +system using a **whitelist approach** aligned with MinIO's behavior: + +1. **Compression is disabled by default** - Opt-in rather than opt-out +2. **Only explicitly configured content types are compressed** - Preserves Content-Length for all other responses +3. **Fine-grained configuration** - Control via file extensions, MIME types, and size thresholds +4. 
**Skip already-encoded content** - Avoid double compression + +### Configuration Options + +RustFS provides flexible compression configuration via environment variables and command-line arguments: + +| Environment Variable | CLI Argument | Default | Description | +|---------------------|--------------|---------|-------------| +| `RUSTFS_COMPRESS_ENABLE` | | `false` | Enable/disable compression | +| `RUSTFS_COMPRESS_EXTENSIONS` | | `""` | File extensions to compress (e.g., `.txt,.log,.csv`) | +| `RUSTFS_COMPRESS_MIME_TYPES` | | `text/*,application/json,...` | MIME types to compress (supports wildcards) | +| `RUSTFS_COMPRESS_MIN_SIZE` | | `1000` | Minimum file size (bytes) for compression | + +### Usage Examples + +```bash +# Enable compression for text files and JSON +RUSTFS_COMPRESS_ENABLE=on \ +RUSTFS_COMPRESS_EXTENSIONS=.txt,.log,.csv,.json,.xml \ +RUSTFS_COMPRESS_MIME_TYPES=text/*,application/json,application/xml \ +RUSTFS_COMPRESS_MIN_SIZE=1000 \ +rustfs /data + +# Or using command-line arguments +rustfs /data \ + --compress-enable \ + --compress-extensions ".txt,.log,.csv" \ + --compress-mime-types "text/*,application/json" \ + --compress-min-size 1000 +``` + +### Implementation Details + +The `CompressionPredicate` implements intelligent compression decisions: + +```rust +impl Predicate for CompressionPredicate { + fn should_compress(&self, response: &Response) -> bool { + // 1. Check if compression is enabled + if !self.config.enabled { return false; } + + // 2. Never compress error responses + if status.is_client_error() || status.is_server_error() { return false; } + + // 3. Skip already-encoded content (gzip, br, deflate, etc.) + if has_content_encoding(response) { return false; } + + // 4. Check minimum size threshold + if content_length < self.config.min_size { return false; } + + // 5. Check whitelist: extension OR MIME type must match + if matches_extension(response) || matches_mime_type(response) { + return true; + } + + // 6. Default: don't compress (whitelist approach) + false + } +} +``` + +### Benefits of Whitelist Approach + +| Aspect | Blacklist (Old) | Whitelist (New) | +|--------|-----------------|-----------------| +| Default behavior | Compress most content | No compression | +| Content-Length | Often removed | Preserved for unmatched types | +| Browser downloads | "Unknown file size" | Accurate file size shown | +| Configuration | Complex exclusion rules | Simple inclusion rules | +| MinIO compatibility | Different behavior | Aligned behavior | ## Key Principles @@ -38,21 +120,54 @@ if status.is_client_error() || status.is_server_error() { - May actually increase payload size - Adds latency without benefit -**Recommended Threshold**: 256 bytes minimum +**Recommended Threshold**: 1000 bytes minimum (configurable via `RUSTFS_COMPRESS_MIN_SIZE`) **Implementation**: ```rust if let Some(content_length) = response.headers().get(CONTENT_LENGTH) { if let Ok(length) = content_length.to_str()?.parse::()? { - if length < 256 { + if length < self.config.min_size { return false; // Don't compress small responses } } } ``` -### 3. Maintain Observability +### 3. Skip Already-Encoded Content + +**Rationale**: If the response already has a `Content-Encoding` header (e.g., gzip, br, deflate, zstd), the content +is already compressed. 
Re-compressing provides no benefit and may cause issues: + +- Double compression wastes CPU cycles +- May corrupt data or increase size +- Breaks decompression on client side + +**Implementation**: + +```rust +// Skip if content is already encoded (e.g., gzip, br, deflate, zstd) +if let Some(content_encoding) = response.headers().get(CONTENT_ENCODING) { + if let Ok(encoding) = content_encoding.to_str() { + let encoding_lower = encoding.to_lowercase(); + // "identity" means no encoding, so we can still compress + if encoding_lower != "identity" && !encoding_lower.is_empty() { + debug!("Skipping compression for already encoded response: {}", encoding); + return false; + } + } +} +``` + +**Common Content-Encoding Values**: + +- `gzip` - GNU zip compression +- `br` - Brotli compression +- `deflate` - Deflate compression +- `zstd` - Zstandard compression +- `identity` - No encoding (compression allowed) + +### 4. Maintain Observability **Rationale**: Compression decisions can affect debugging and troubleshooting. Always log when compression is skipped. @@ -84,38 +199,58 @@ grep "Skipping compression" logs/rustfs.log | wc -l .layer(CompressionLayer::new()) ``` -**Problem**: Can cause Content-Length mismatches with error responses +**Problem**: Can cause Content-Length mismatches with error responses and browser download issues -### ✅ Using Intelligent Predicates +### ❌ Using Blacklist Approach ```rust -// GOOD - Filter based on status and size -.layer(CompressionLayer::new().compress_when(ShouldCompress)) -``` - -### ❌ Ignoring Content-Length Header - -```rust -// BAD - Only checking status +// BAD - Blacklist approach (compress everything except...) fn should_compress(&self, response: &Response) -> bool { - !response.status().is_client_error() + // Skip images, videos, archives... + if is_already_compressed_type(content_type) { return false; } + true // Compress everything else } ``` -**Problem**: May compress tiny responses unnecessarily +**Problem**: Removes Content-Length for many file types, causing "unknown file size" in browsers -### ✅ Checking Both Status and Size +### ✅ Using Whitelist-Based Predicate ```rust -// GOOD - Multi-criteria decision +// GOOD - Whitelist approach with configurable predicate +.layer(CompressionLayer::new().compress_when(CompressionPredicate::new(config))) +``` + +### ❌ Ignoring Content-Encoding Header + +```rust +// BAD - May double-compress already compressed content fn should_compress(&self, response: &Response) -> bool { - // Check status + matches_mime_type(response) // Missing Content-Encoding check +} +``` + +**Problem**: Double compression wastes CPU and may corrupt data + +### ✅ Comprehensive Checks + +```rust +// GOOD - Multi-criteria whitelist decision +fn should_compress(&self, response: &Response) -> bool { + // 1. Must be enabled + if !self.config.enabled { return false; } + + // 2. Skip error responses if response.status().is_error() { return false; } - // Check size - if get_content_length(response) < 256 { return false; } + // 3. Skip already-encoded content + if has_content_encoding(response) { return false; } - true + // 4. Check minimum size + if get_content_length(response) < self.config.min_size { return false; } + + // 5. 
Must match whitelist (extension OR MIME type) + matches_extension(response) || matches_mime_type(response) } ``` @@ -224,28 +359,52 @@ async fn test_error_response_not_truncated() { ## Migration Guide +### Migrating from Blacklist to Whitelist Approach + +If you're upgrading from an older RustFS version with blacklist-based compression: + +1. **Compression is now disabled by default** + - Set `RUSTFS_COMPRESS_ENABLE=on` to enable + - This ensures backward compatibility for existing deployments + +2. **Configure your whitelist** + ```bash + # Example: Enable compression for common text formats + RUSTFS_COMPRESS_ENABLE=on + RUSTFS_COMPRESS_EXTENSIONS=.txt,.log,.csv,.json,.xml,.html,.css,.js + RUSTFS_COMPRESS_MIME_TYPES=text/*,application/json,application/xml,application/javascript + RUSTFS_COMPRESS_MIN_SIZE=1000 + ``` + +3. **Verify browser downloads** + - Check that file downloads show accurate file sizes + - Verify Content-Length headers are preserved for non-compressed content + ### Updating Existing Code If you're adding compression to an existing service: -1. **Start Conservative**: Only compress responses > 1KB -2. **Monitor Impact**: Watch CPU and latency metrics -3. **Lower Threshold Gradually**: Test with smaller thresholds -4. **Always Exclude Errors**: Never compress 4xx/5xx +1. **Start with compression disabled** (default) +2. **Define your whitelist**: Identify content types that benefit from compression +3. **Set appropriate thresholds**: Start with 1KB minimum size +4. **Enable and monitor**: Watch CPU, latency, and download behavior ### Rollout Strategy 1. **Stage 1**: Deploy to canary (5% traffic) - Monitor for 24 hours - Check error rates and latency + - Verify browser download behavior 2. **Stage 2**: Expand to 25% traffic - Monitor for 48 hours - Validate compression ratios + - Check Content-Length preservation 3. **Stage 3**: Full rollout (100% traffic) - Continue monitoring for 1 week - Document any issues + - Fine-tune whitelist based on actual usage ## Related Documentation @@ -253,13 +412,33 @@ If you're adding compression to an existing service: - [tower-http Compression](https://docs.rs/tower-http/latest/tower_http/compression/) - [HTTP Content-Encoding](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Encoding) +## Architecture + +### Module Structure + +The compression functionality is organized in a dedicated module for maintainability: + +``` +rustfs/src/server/ +├── compress.rs # Compression configuration and predicate +├── http.rs # HTTP server (uses compress module) +└── mod.rs # Module declarations +``` + +### Key Components + +1. **`CompressionConfig`** - Stores compression settings parsed from environment/CLI +2. **`CompressionPredicate`** - Implements `tower_http::compression::predicate::Predicate` +3. **Configuration Constants** - Defined in `crates/config/src/constants/compress.rs` + ## References 1. Issue #901: NoSuchKey error response regression -2. [Google Web Fundamentals - Text Compression](https://web.dev/reduce-network-payloads-using-text-compression/) -3. [AWS Best Practices - Response Compression](https://docs.aws.amazon.com/whitepapers/latest/s3-optimizing-performance-best-practices/) +2. Issue #902: Whitelist-based compression redesign +3. [Google Web Fundamentals - Text Compression](https://web.dev/reduce-network-payloads-using-text-compression/) +4. 
[AWS Best Practices - Response Compression](https://docs.aws.amazon.com/whitepapers/latest/s3-optimizing-performance-best-practices/) --- -**Last Updated**: 2025-11-24 +**Last Updated**: 2025-12-13 **Maintainer**: RustFS Team diff --git a/docs/console-separation.md b/docs/console-separation.md index 8b6b3861..7795b4fd 100644 --- a/docs/console-separation.md +++ b/docs/console-separation.md @@ -1068,7 +1068,7 @@ curl http://localhost:9001/health #### Docker Migration Example ```bash -#!/bin/bash +#!/usr/bin/env bash # migrate-docker.sh # Stop old container diff --git a/docs/examples/docker/docker-quickstart.sh b/docs/examples/docker/docker-quickstart.sh index 03ceb78a..a83da686 100755 --- a/docs/examples/docker/docker-quickstart.sh +++ b/docs/examples/docker/docker-quickstart.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # RustFS Docker Quick Start Script # This script provides easy deployment commands for different scenarios diff --git a/docs/examples/docker/enhanced-docker-deployment.sh b/docs/examples/docker/enhanced-docker-deployment.sh index 0baefda4..aa6f5ee8 100755 --- a/docs/examples/docker/enhanced-docker-deployment.sh +++ b/docs/examples/docker/enhanced-docker-deployment.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # RustFS Enhanced Docker Deployment Examples # This script demonstrates various deployment scenarios for RustFS with console separation diff --git a/docs/examples/docker/enhanced-security-deployment.sh b/docs/examples/docker/enhanced-security-deployment.sh index d5c2aa33..63c401ae 100755 --- a/docs/examples/docker/enhanced-security-deployment.sh +++ b/docs/examples/docker/enhanced-security-deployment.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # RustFS Enhanced Security Deployment Script # This script demonstrates production-ready deployment with enhanced security features diff --git a/docs/examples/mnmd/test-deployment.sh b/docs/examples/mnmd/test-deployment.sh index 89c3b9e3..5433632a 100755 --- a/docs/examples/mnmd/test-deployment.sh +++ b/docs/examples/mnmd/test-deployment.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # Copyright 2024 RustFS Team # # Licensed under the Apache License, Version 2.0 (the "License"); diff --git a/docs/security/dos-prevention-body-limits.md b/docs/security/dos-prevention-body-limits.md new file mode 100644 index 00000000..a60d2ede --- /dev/null +++ b/docs/security/dos-prevention-body-limits.md @@ -0,0 +1,42 @@ +# DoS Prevention: Request/Response Body Size Limits + +## Executive Summary + +This document describes the implementation of request and response body size limits in RustFS to prevent Denial of Service (DoS) attacks through unbounded memory allocation. The previous use of `usize::MAX` with `store_all_limited()` posed a critical security risk allowing attackers to exhaust server memory. + +## Security Risk Assessment + +### Vulnerability: Unbounded Memory Allocation + +**Severity**: High +**Impact**: Server memory exhaustion, service unavailability +**Likelihood**: High (easily exploitable) + +**Previous Code** (vulnerable): +```rust +let body = input.store_all_limited(usize::MAX).await?; +``` + +On a 64-bit system, `usize::MAX` is approximately 18 exabytes, effectively unlimited. 
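+For reference, here is a minimal, self-contained sketch of the bounded-read idea (a synchronous `std::io` version). The `read_body_limited` helper, the `main` harness, and the inlined constant value are illustrative assumptions only, not the actual RustFS helpers, which read request bodies asynchronously via `store_all_limited()`:
+
+```rust
+use std::io::{self, Read};
+
+/// Read at most `limit` bytes; reject anything larger instead of buffering it.
+/// This mirrors the intent of store_all_limited(limit) with a named constant
+/// rather than usize::MAX, so memory use is bounded per request.
+fn read_body_limited<R: Read>(mut body: R, limit: usize) -> io::Result<Vec<u8>> {
+    let mut buf = Vec::new();
+    // Read one byte past the limit so oversized bodies are detected, not silently truncated.
+    let read = body.by_ref().take(limit as u64 + 1).read_to_end(&mut buf)?;
+    if read > limit {
+        return Err(io::Error::new(
+            io::ErrorKind::InvalidData,
+            "request body exceeds configured limit",
+        ));
+    }
+    Ok(buf)
+}
+
+fn main() -> io::Result<()> {
+    const MAX_ADMIN_REQUEST_BODY_SIZE: usize = 1024 * 1024; // 1 MB, matching the table below
+    let payload = vec![0u8; 512];
+    let body = read_body_limited(&payload[..], MAX_ADMIN_REQUEST_BODY_SIZE)?;
+    assert_eq!(body.len(), 512);
+    Ok(())
+}
+```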
+ +## Implemented Limits + +| Limit | Size | Use Cases | +|-------|------|-----------| +| `MAX_ADMIN_REQUEST_BODY_SIZE` | 1 MB | User management, policies, tier/KMS/event configs | +| `MAX_IAM_IMPORT_SIZE` | 10 MB | IAM import/export (ZIP archives) | +| `MAX_BUCKET_METADATA_IMPORT_SIZE` | 100 MB | Bucket metadata import | +| `MAX_HEAL_REQUEST_SIZE` | 1 MB | Healing operations | +| `MAX_S3_RESPONSE_SIZE` | 10 MB | S3 client responses from remote services | + +## Rationale + +- AWS IAM policy limit: 6KB-10KB +- Typical payloads: < 100KB +- 1MB-100MB limits provide generous headroom while preventing DoS +- Based on real-world usage analysis and industry standards + +## Files Modified + +- 22 files updated across admin handlers and S3 client modules +- 2 new files: `rustfs/src/admin/constants.rs`, `crates/ecstore/src/client/body_limits.rs` diff --git a/entrypoint.sh b/entrypoint.sh index f9e605f6..f17bc757 100755 --- a/entrypoint.sh +++ b/entrypoint.sh @@ -13,6 +13,10 @@ elif [ "${1#-}" != "$1" ]; then elif [ "$1" = "rustfs" ]; then shift set -- /usr/bin/rustfs "$@" +elif [ "$1" = "/usr/bin/rustfs" ]; then + : # already normalized +elif [ "$1" = "cargo" ]; then + : # Pass through cargo command as-is else set -- /usr/bin/rustfs "$@" fi @@ -22,8 +26,35 @@ DATA_VOLUMES="" process_data_volumes() { VOLUME_RAW="${RUSTFS_VOLUMES:-/data}" # Convert comma/tab to space - VOLUME_LIST=$(echo "$VOLUME_RAW" | tr ',\t' ' ') + VOLUME_LIST_RAW=$(echo "$VOLUME_RAW" | tr ',\t' ' ') + VOLUME_LIST="" + for vol in $VOLUME_LIST_RAW; do + # Helper to manually expand {N..M} since sh doesn't support it on variables + if echo "$vol" | grep -E -q "\{[0-9]+\.\.[0-9]+\}"; then + PREFIX=${vol%%\{*} + SUFFIX=${vol##*\}} + RANGE=${vol#*\{} + RANGE=${RANGE%\}} + START=${RANGE%%..*} + END=${RANGE##*..} + + # Check if START and END are numbers + if [ "$START" -eq "$START" ] 2>/dev/null && [ "$END" -eq "$END" 2>/dev/null ]; then + i=$START + while [ "$i" -le "$END" ]; do + VOLUME_LIST="$VOLUME_LIST ${PREFIX}${i}${SUFFIX}" + i=$((i+1)) + done + else + # Fallback if not numbers + VOLUME_LIST="$VOLUME_LIST $vol" + fi + else + VOLUME_LIST="$VOLUME_LIST $vol" + fi + done + for vol in $VOLUME_LIST; do case "$vol" in /*) diff --git a/flake.lock b/flake.lock new file mode 100644 index 00000000..4822d9da --- /dev/null +++ b/flake.lock @@ -0,0 +1,27 @@ +{ + "nodes": { + "nixpkgs": { + "locked": { + "lastModified": 1765270179, + "narHash": "sha256-g2a4MhRKu4ymR4xwo+I+auTknXt/+j37Lnf0Mvfl1rE=", + "owner": "NixOS", + "repo": "nixpkgs", + "rev": "677fbe97984e7af3175b6c121f3c39ee5c8d62c9", + "type": "github" + }, + "original": { + "owner": "NixOS", + "ref": "nixpkgs-unstable", + "repo": "nixpkgs", + "type": "github" + } + }, + "root": { + "inputs": { + "nixpkgs": "nixpkgs" + } + } + }, + "root": "root", + "version": 7 +} diff --git a/flake.nix b/flake.nix new file mode 100644 index 00000000..be6b90b2 --- /dev/null +++ b/flake.nix @@ -0,0 +1,69 @@ +# Nix flake for building RustFS +# +# Prerequisites: +# Install Nix: https://nixos.org/download/ +# Enable flakes: https://nixos.wiki/wiki/Flakes#Enable_flakes +# +# Usage: +# nix build # Build rustfs binary +# nix run # Build and run rustfs +# ./result/bin/rustfs --help +{ + description = "RustFS - High-performance S3-compatible object storage"; + + inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixpkgs-unstable"; + + outputs = + { nixpkgs, ... 
}: + let + systems = [ + "x86_64-linux" + "aarch64-linux" + "x86_64-darwin" + "aarch64-darwin" + ]; + forAllSystems = nixpkgs.lib.genAttrs systems; + in + { + packages = forAllSystems ( + system: + let + pkgs = nixpkgs.legacyPackages.${system}; + in + { + default = pkgs.rustPlatform.buildRustPackage { + pname = "rustfs"; + version = "0.0.5"; + + src = ./.; + + cargoLock = { + lockFile = ./Cargo.lock; + allowBuiltinFetchGit = true; + }; + + nativeBuildInputs = with pkgs; [ + pkg-config + protobuf + ]; + + buildInputs = with pkgs; [ openssl ]; + + cargoBuildFlags = [ + "--package" + "rustfs" + ]; + + doCheck = false; + + meta = { + description = "High-performance S3-compatible object storage"; + homepage = "https://rustfs.com"; + license = pkgs.lib.licenses.asl20; + mainProgram = "rustfs"; + }; + }; + } + ); + }; +} diff --git a/helm/README.md b/helm/README.md index 924da3ab..95515d27 100644 --- a/helm/README.md +++ b/helm/README.md @@ -9,28 +9,133 @@ RustFS helm chart supports **standalone and distributed mode**. For standalone m **NOTE**: Please make sure which mode suits for you situation and specify the right parameter to install rustfs on kubernetes. +--- + # Parameters Overview -| parameter | description | default value | -| -- | -- | -- | -| replicaCount | Number of cluster nodes. | Default is `4`. | -| mode.standalone.enabled | RustFS standalone mode support, namely one pod one pvc. | Default is `false` | -| mode.distributed.enabled | RustFS distributed mode support, namely multiple pod multiple pvc. | Default is `true`. | -| image.repository | docker image repository. | rustfs/rustfs. | -| image.tag | the tag for rustfs docker image | "latest" | -| secret.rustfs.access_key | RustFS Access Key ID | `rustfsadmin` | -| secret.rustfs.secret_key | RustFS Secret Key ID | `rustfsadmin` | -| storageclass.name | The name for StorageClass. | `local-path` | -| storageclass.dataStorageSize | The storage size for data PVC. | `256Mi` | -| storageclass.logStorageSize | The storage size for log PVC. | `256Mi` | -| ingress.className | Specify the ingress class, traefik or nginx. | `nginx` | +| Parameter | Type | Default value | Description | +|-----|------|---------|-------------| +| affinity.nodeAffinity | object | `{}` | | +| affinity.podAntiAffinity.enabled | bool | `true` | | +| affinity.podAntiAffinity.topologyKey | string | `"kubernetes.io/hostname"` | | +| commonLabels | object | `{}` | Labels to add to all deployed objects. | +| config.rustfs.address | string | `":9000"` | | +| config.rustfs.console_address | string | `":9001"` | | +| config.rustfs.console_enable | string | `"true"` | | +| config.rustfs.log_level | string | `"debug"` | | +| config.rustfs.obs_environment | string | `"develop"` | | +| config.rustfs.obs_log_directory | string | `"/logs"` | | +| config.rustfs.region | string | `"us-east-1"` | | +| config.rustfs.rust_log | string | `"debug"` | | +| config.rustfs.volumes | string | `""` | | +| containerSecurityContext.capabilities.drop[0] | string | `"ALL"` | | +| containerSecurityContext.readOnlyRootFilesystem | bool | `true` | | +| containerSecurityContext.runAsNonRoot | bool | `true` | | +| extraManifests | list | `[]` | List of additional k8s manifests. | +| fullnameOverride | string | `""` | | +| image.pullPolicy | string | `"IfNotPresent"` | | +| image.repository | string | `"rustfs/rustfs"` | RustFS docker image repository. | +| image.tag | string | `"latest"` | The tag for rustfs docker image. 
| +| imagePullSecrets | list | `[]` | A List of secrets to pull image from private registry. | +| imageRegistryCredentials.email | string | `""` | The email to pull rustfs image from private registry. | +| imageRegistryCredentials.enabled | bool | `false` | To indicate whether pull image from private registry. | +| imageRegistryCredentials.password | string | `""` | The password to pull rustfs image from private registry. | +| imageRegistryCredentials.registry | string | `""` | Private registry url to pull rustfs image. | +| imageRegistryCredentials.username | string | `""` | The username to pull rustfs image from private registry. | +| ingress.className | string | `"traefik"` | Specify the ingress class, traefik or nginx. | +| ingress.enabled | bool | `true` | | +| ingress.hosts[0].host | string | `"example.rustfs.com"` | | +| ingress.hosts[0].paths[0].path | string | `"/"` | | +| ingress.hosts[0].paths[0].pathType | string | `"ImplementationSpecific"` | | +| ingress.nginxAnnotations."nginx.ingress.kubernetes.io/affinity" | string | `"cookie"` | | +| ingress.nginxAnnotations."nginx.ingress.kubernetes.io/session-cookie-expires" | string | `"3600"` | | +| ingress.nginxAnnotations."nginx.ingress.kubernetes.io/session-cookie-hash" | string | `"sha1"` | | +| ingress.nginxAnnotations."nginx.ingress.kubernetes.io/session-cookie-max-age" | string | `"3600"` | | +| ingress.nginxAnnotations."nginx.ingress.kubernetes.io/session-cookie-name" | string | `"rustfs"` | | +| ingress.customAnnotations | dict | `{}` |Customize annotations. | +| ingress.traefikAnnotations."traefik.ingress.kubernetes.io/service.sticky.cookie" | string | `"true"` | | +| ingress.traefikAnnotations."traefik.ingress.kubernetes.io/service.sticky.cookie.httponly" | string | `"true"` | | +| ingress.traefikAnnotations."traefik.ingress.kubernetes.io/service.sticky.cookie.name" | string | `"rustfs"` | | +| ingress.traefikAnnotations."traefik.ingress.kubernetes.io/service.sticky.cookie.samesite" | string | `"none"` | | +| ingress.traefikAnnotations."traefik.ingress.kubernetes.io/service.sticky.cookie.secure" | string | `"true"` | | +| ingress.tls.enabled | bool | `false` | Enable tls and access rustfs via https. | +| ingress.tls.certManager.enabled | string | `false` | Enable cert manager support to generate certificate automatically. | +| ingress.tls.crt | string | "" | The content of certificate file. | +| ingress.tls.key | string | "" | The content of key file. | +| livenessProbe.failureThreshold | int | `3` | | +| livenessProbe.httpGet.path | string | `"/health"` | | +| livenessProbe.httpGet.port | string | `"endpoint"` | | +| livenessProbe.initialDelaySeconds | int | `10` | | +| livenessProbe.periodSeconds | int | `5` | | +| livenessProbe.successThreshold | int | `1` | | +| livenessProbe.timeoutSeconds | int | `3` | | +| mode.distributed.enabled | bool | `true` | RustFS distributed mode support, namely multiple pod multiple pvc. | +| mode.standalone.enabled | bool | `false` | RustFS standalone mode support, namely one pod one pvc. 
| +| nameOverride | string | `""` | | +| nodeSelector | object | `{}` | | +| podAnnotations | object | `{}` | | +| podLabels | object | `{}` | | +| podSecurityContext.fsGroup | int | `10001` | | +| podSecurityContext.runAsGroup | int | `10001` | | +| podSecurityContext.runAsUser | int | `10001` | | +| readinessProbe.failureThreshold | int | `3` | | +| readinessProbe.httpGet.path | string | `"/health"` | | +| readinessProbe.httpGet.port | string | `"endpoint"` | | +| readinessProbe.initialDelaySeconds | int | `30` | | +| readinessProbe.periodSeconds | int | `5` | | +| readinessProbe.successThreshold | int | `1` | | +| readinessProbe.timeoutSeconds | int | `3` | | +| replicaCount | int | `4` | Number of cluster nodes. | +| resources.limits.cpu | string | `"200m"` | | +| resources.limits.memory | string | `"512Mi"` | | +| resources.requests.cpu | string | `"100m"` | | +| resources.requests.memory | string | `"128Mi"` | | +| secret.existingSecret | string | `""` | Use existing secret with a credentials. | +| secret.rustfs.access_key | string | `"rustfsadmin"` | RustFS Access Key ID | +| secret.rustfs.secret_key | string | `"rustfsadmin"` | RustFS Secret Key ID | +| service.type | string | `"NodePort"` | | +| service.console.nodePort | int | `32001` | | +| service.console.port | int | `9001` | | +| service.endpoint.nodePort | int | `32000` | | +| service.endpoint.port | int | `9000` | | +| serviceAccount.annotations | object | `{}` | | +| serviceAccount.automount | bool | `true` | | +| serviceAccount.create | bool | `true` | | +| serviceAccount.name | string | `""` | | +| storageclass.dataStorageSize | string | `"256Mi"` | The storage size for data PVC. | +| storageclass.logStorageSize | string | `"256Mi"` | The storage size for logs PVC. | +| storageclass.name | string | `"local-path"` | The name for StorageClass. | +| tolerations | list | `[]` | | +--- -**NOTE**: [`local-path`](https://github.com/rancher/local-path-provisioner) is used by k3s. If you want to use `local-path`, running the command, +**NOTE**: -``` -kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/v0.0.32/deploy/local-path-storage.yaml -``` +The chart pulls the rustfs image from Docker Hub by default. For private registries, provide either: + +- **Existing secrets**: Set `imagePullSecrets` with an array of secret names + ```yaml + imagePullSecrets: + - name: my-existing-secret + ``` + +- **Auto-generated secret**: Enable `imageRegistryCredentials.enabled: true` and specify credentials plus your image details + ```yaml + imageRegistryCredentials: + enabled: true + registry: myregistry.com + username: myuser + password: mypass + email: user@example.com + ``` + +Both approaches support pulling from private registries seamlessly and you can also combine them. + +- The chart default pull rustfs image from dockerhub, if your rustfs image stores in private registry, you can use either existing image Pull secrets with parameter `imagePullSecrets` or create one setting `imageRegistryCredentials.enabled` to `true`,and then specify the `imageRegistryCredentials.registry/username/password/email` as well as `image.repository`,`image.tag` to pull rustfs image from your private registry. + +- The default storageclass is [`local-path`](https://github.com/rancher/local-path-provisioner),if you want to specify your own storageclass, try to set parameter `storageclass.name`. 
+ +- The default size for data and logs dir is **256Mi** which must satisfy the production usage,you should specify `storageclass.dataStorageSize` and `storageclass.logStorageSize` to change the size, for example, 1Ti for data and 1Gi for logs. # Installation @@ -41,7 +146,7 @@ kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisione Due to the traefik and ingress has different session sticky/affinity annotations, and rustfs support both those two controller, you should specify parameter `ingress.className` to select the right one which suits for you. -## Installation with traekfik controller +## Installation with traefik controller If your ingress class is `traefik`, running the command: @@ -75,20 +180,20 @@ Check the ingress status ``` kubectl -n rustfs get ing NAME CLASS HOSTS ADDRESS PORTS AGE -rustfs nginx your.rustfs.com 10.43.237.152 80, 443 29m +rustfs nginx example.rustfs.com 10.43.237.152 80, 443 29m ``` -Access the rustfs cluster via `https://your.rustfs.com` with the default username and password `rustfsadmin`. +Access the rustfs cluster via `https://example.rustfs.com` with the default username and password `rustfsadmin`. -> Replace the `your.rustfs.com` with your own domain as well as the certificates. +> Replace the `example.rustfs.com` with your own domain as well as the certificates. # TLS configuration -By default, tls is not enabled.If you want to enable tls(recommendated),you can follow below steps: +By default, tls is not enabled. If you want to enable tls(recommendated),you can follow below steps: * Step 1: Certification generation -You can request cert and key from CA or use the self-signed cert(**not recommendated on prod**),and put those two files(eg, `tls.crt` and `tls.key`) under some directory on server, for example `tls` directory. +You can request cert and key from CA or use the self-signed cert(**not recommendated on prod**), and put those two files(eg, `tls.crt` and `tls.key`) under some directory on server, for example `tls` directory. * Step 2: Certification specifying @@ -104,4 +209,4 @@ Uninstalling the rustfs installation with command, ``` helm uninstall rustfs -n rustfs -``` \ No newline at end of file +``` diff --git a/helm/rustfs/Chart.yaml b/helm/rustfs/Chart.yaml index 2cc92efa..68118e54 100644 --- a/helm/rustfs/Chart.yaml +++ b/helm/rustfs/Chart.yaml @@ -15,7 +15,7 @@ type: application # This is the chart version. This version number should be incremented each time you make changes # to the chart and its templates, including the app version. # Versions are expected to follow Semantic Versioning (https://semver.org/) -version: 1.0.3 +version: 0.0.76 # This is the version number of the application being deployed. This version number should be # incremented each time you make changes to the application. Versions are not expected to diff --git a/helm/rustfs/templates/NOTES.txt b/helm/rustfs/templates/NOTES.txt index 7f5eb704..e73932fb 100644 --- a/helm/rustfs/templates/NOTES.txt +++ b/helm/rustfs/templates/NOTES.txt @@ -1,22 +1,10 @@ -1. Get the application URL by running these commands: +1. Watch all pods come up + kubectl get pods -w -l app.kubernetes.io/name={{ include "rustfs.name" . }} -n {{ .Release.Namespace }} {{- if .Values.ingress.enabled }} +2. 
Visit the dashboard {{- range $host := .Values.ingress.hosts }} {{- range .paths }} http{{ if $.Values.ingress.tls }}s{{ end }}://{{ $host.host }}{{ .path }} {{- end }} {{- end }} -{{- else if contains "NodePort" .Values.service.type }} - export NODE_PORT=$(kubectl get --namespace {{ .Release.Namespace }} -o jsonpath="{.spec.ports[0].nodePort}" services {{ include "rustfs.fullname" . }}) - export NODE_IP=$(kubectl get nodes --namespace {{ .Release.Namespace }} -o jsonpath="{.items[0].status.addresses[0].address}") - echo http://$NODE_IP:$NODE_PORT -{{- else if contains "LoadBalancer" .Values.service.type }} - NOTE: It may take a few minutes for the LoadBalancer IP to be available. - You can watch its status by running 'kubectl get --namespace {{ .Release.Namespace }} svc -w {{ include "rustfs.fullname" . }}' - export SERVICE_IP=$(kubectl get svc --namespace {{ .Release.Namespace }} {{ include "rustfs.fullname" . }} --template "{{"{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}"}}") - echo http://$SERVICE_IP:{{ .Values.service.port }} -{{- else if contains "ClusterIP" .Values.service.type }} - export POD_NAME=$(kubectl get pods --namespace {{ .Release.Namespace }} -l "app.kubernetes.io/name={{ include "rustfs.name" . }},app.kubernetes.io/instance={{ .Release.Name }}" -o jsonpath="{.items[0].metadata.name}") - export CONTAINER_PORT=$(kubectl get pod --namespace {{ .Release.Namespace }} $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}") - echo "Visit http://127.0.0.1:8080 to use your application" - kubectl --namespace {{ .Release.Namespace }} port-forward $POD_NAME 8080:$CONTAINER_PORT {{- end }} diff --git a/helm/rustfs/templates/_helpers.tpl b/helm/rustfs/templates/_helpers.tpl index 667b9ece..c9ab646b 100644 --- a/helm/rustfs/templates/_helpers.tpl +++ b/helm/rustfs/templates/_helpers.tpl @@ -71,3 +71,43 @@ Return the secret name {{- printf "%s-secret" (include "rustfs.fullname" .) }} {{- end }} {{- end }} + +{{/* +Return image pull secret content +*/}} +{{- define "imagePullSecret" }} +{{- with .Values.imageRegistryCredentials }} +{{- printf "{\"auths\":{\"%s\":{\"username\":\"%s\",\"password\":\"%s\",\"email\":\"%s\",\"auth\":\"%s\"}}}" .registry .username .password .email (printf "%s:%s" .username .password | b64enc) | b64enc }} +{{- end }} +{{- end }} + +{{/* +Return the default imagePullSecret name +*/}} +{{- define "rustfs.imagePullSecret.name" -}} +{{- printf "%s-registry-secret" (include "rustfs.fullname" .) }} +{{- end }} + +{{/* +Render imagePullSecrets for workloads - appends registry secret +*/}} +{{- define "chart.imagePullSecrets" -}} +{{- $secrets := .Values.imagePullSecrets | default list }} +{{- if .Values.imageRegistryCredentials.enabled }} +{{- $secrets = append $secrets (dict "name" (include "rustfs.imagePullSecret.name" .)) }} +{{- end }} +{{- toYaml $secrets }} +{{- end }} + +{{/* +Render RUSTFS_VOLUMES +*/}} +{{- define "rustfs.volumes" -}} +{{- if eq (int .Values.replicaCount) 4 }} +{{- printf "http://%s-{0...%d}.%s-headless:%d/data/rustfs{0...%d}" (include "rustfs.fullname" .) (sub (.Values.replicaCount | int) 1) (include "rustfs.fullname" . ) (.Values.service.endpoint.port | int) (sub (.Values.replicaCount | int) 1) }} +{{- end }} +{{- if eq (int .Values.replicaCount) 16 }} +{{- printf "http://%s-{0...%d}.%s-headless:%d/data" (include "rustfs.fullname" .) (sub (.Values.replicaCount | int) 1) (include "rustfs.fullname" .) 
(.Values.service.endpoint.port | int) }} +{{- end }} +{{- end }} + diff --git a/helm/rustfs/templates/configmap.yaml b/helm/rustfs/templates/configmap.yaml index 910ec874..e2a75a6d 100644 --- a/helm/rustfs/templates/configmap.yaml +++ b/helm/rustfs/templates/configmap.yaml @@ -2,19 +2,20 @@ apiVersion: v1 kind: ConfigMap metadata: name: {{ include "rustfs.fullname" . }}-config + labels: + {{- toYaml .Values.commonLabels | nindent 4 }} data: RUSTFS_ADDRESS: {{ .Values.config.rustfs.address | quote }} RUSTFS_CONSOLE_ADDRESS: {{ .Values.config.rustfs.console_address | quote }} - RUSTFS_OBS_LOG_DIRECTORY: {{ .Values.config.rustfs.obs_log_directory | quote }} RUSTFS_CONSOLE_ENABLE: {{ .Values.config.rustfs.console_enable | quote }} + RUSTFS_OBS_LOG_DIRECTORY: {{ .Values.config.rustfs.obs_log_directory | quote }} RUSTFS_OBS_LOGGER_LEVEL: {{ .Values.config.rustfs.log_level | quote }} - {{- if .Values.mode.distributed.enabled }} - {{- if eq (int .Values.replicaCount) 4 }} - RUSTFS_VOLUMES: "http://{{ include "rustfs.fullname" . }}-{0...3}.{{ include "rustfs.fullname" . }}-headless:9000/data/rustfs{0...3}" - {{- else if eq (int .Values.replicaCount) 16 }} - RUSTFS_VOLUMES: "http://{{ include "rustfs.fullname" . }}-{0...15}.{{ include "rustfs.fullname" . }}-headless:9000/data" + RUSTFS_OBS_ENVIRONMENT: {{ .Values.config.rustfs.obs_environment | quote }} + {{- if .Values.config.rustfs.region }} + RUSTFS_REGION: {{ .Values.config.rustfs.region | quote }} {{- end }} + {{- if .Values.mode.distributed.enabled }} + RUSTFS_VOLUMES: {{ .Values.config.rustfs.volumes | default (include "rustfs.volumes" .) }} {{- else }} RUSTFS_VOLUMES: "/data" {{- end }} - RUSTFS_OBS_ENVIRONMENT: "develop" diff --git a/helm/rustfs/templates/deployment.yaml b/helm/rustfs/templates/deployment.yaml index 2edc4736..1a2672b3 100644 --- a/helm/rustfs/templates/deployment.yaml +++ b/helm/rustfs/templates/deployment.yaml @@ -4,28 +4,63 @@ kind: Deployment metadata: name: {{ include "rustfs.fullname" . }} labels: - app: {{ include "rustfs.name" . }} + {{- include "rustfs.labels" . | nindent 4 }} + {{- with .Values.commonLabels }} + {{- toYaml . | nindent 4 }} + {{- end }} spec: replicas: 1 selector: matchLabels: - app: {{ include "rustfs.name" . }} + {{- include "rustfs.selectorLabels" . | nindent 6 }} template: metadata: labels: - app: {{ include "rustfs.name" . }} + {{- include "rustfs.selectorLabels" . | nindent 8 }} + {{- with .Values.podLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} spec: + {{- with include "chart.imagePullSecrets" . }} + imagePullSecrets: + {{- . | nindent 8 }} + {{- end }} + {{- if .Values.affinity }} + affinity: + {{- if .Values.affinity.nodeAffinity }} + nodeAffinity: + {{- toYaml .Values.affinity.nodeAffinity | nindent 10 }} + {{- if .Values.affinity.podAntiAffinity.enabled }} + podAntiAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + - labelSelector: + matchExpressions: + - key: app.kubernetes.io/name + operator: In + values: + - {{ include "rustfs.name" . 
}} + topologyKey: {{ .Values.affinity.podAntiAffinity.topologyKey }} + {{- end }} + {{- end }} + {{- end }} + {{- if .Values.tolerations }} + tolerations: + {{- toYaml .Values.tolerations | nindent 8 }} + {{- end }} {{- if .Values.podSecurityContext }} securityContext: - {{- toYaml .Values.podSecurityContext | nindent 12 }} + {{- toYaml .Values.podSecurityContext | nindent 8 }} + {{- end }} + {{- if .Values.imagePullSecrets }} + imagePullSecrets: + {{- toYaml .Values.imagePullSecrets | nindent 8 }} {{- end }} initContainers: - name: init-step - image: busybox - imagePullPolicy: {{ .Values.image.pullPolicy }} + image: "{{ .Values.initStep.image.repository }}:{{ .Values.initStep.image.tag }}" + imagePullPolicy: {{ .Values.initStep.image.pullPolicy }} securityContext: - runAsUser: 0 - runAsGroup: 0 + {{- toYaml .Values.initStep.containerSecurityContext | nindent 12 }} command: - sh - -c @@ -47,10 +82,10 @@ spec: {{- toYaml .Values.containerSecurityContext | nindent 12 }} {{- end }} ports: - - containerPort: {{ .Values.service.ep_port }} - name: endpoint - - containerPort: {{ .Values.service.console_port }} - name: console + - name: endpoint + containerPort: {{ .Values.service.endpoint.port }} + - name: console + containerPort: {{ .Values.service.console.port }} envFrom: - configMapRef: name: {{ include "rustfs.fullname" . }}-config @@ -86,7 +121,11 @@ spec: mountPath: /logs - name: data mountPath: /data + - name: tmp + mountPath: /tmp volumes: + - name: tmp + emptyDir: {} - name: logs persistentVolumeClaim: claimName: {{ include "rustfs.fullname" . }}-logs diff --git a/helm/rustfs/templates/ingress.yaml b/helm/rustfs/templates/ingress.yaml index 94eedfc7..89f99c4d 100644 --- a/helm/rustfs/templates/ingress.yaml +++ b/helm/rustfs/templates/ingress.yaml @@ -1,29 +1,35 @@ {{- if .Values.ingress.enabled -}} +{{- $secretName := .Values.ingress.tls.secretName }} +{{- $ingressAnnotations := dict }} +{{- if eq .Values.ingress.className "nginx" }} +{{- $ingressAnnotations = .Values.ingress.nginxAnnotations }} +{{- else if eq .Values.ingress.className "" }} +{{- $ingressAnnotations = .Values.ingress.customAnnotations }} +{{- end }} apiVersion: networking.k8s.io/v1 kind: Ingress metadata: name: {{ include "rustfs.fullname" . }} labels: {{- include "rustfs.labels" . | nindent 4 }} - {{- if eq .Values.ingress.className "nginx" }} - {{- with .Values.ingress.nginxAnnotations }} + {{- with .Values.commonLabels }} + {{- toYaml . | nindent 4 }} + {{- end }} + {{- with $ingressAnnotations }} annotations: {{- toYaml . | nindent 4 }} {{- end }} - {{- end }} spec: {{- with .Values.ingress.className }} ingressClassName: {{ . }} {{- end }} - {{- if .Values.tls.enabled }} + {{- if .Values.ingress.tls.enabled }} tls: - {{- range .Values.ingress.tls }} - hosts: - {{- range .hosts }} - - {{ . | quote }} + {{- range .Values.ingress.hosts }} + - {{ .host | quote }} {{- end }} - secretName: {{ .secretName }} - {{- end }} + secretName: {{ $secretName }} {{- end }} rules: {{- range .Values.ingress.hosts }} diff --git a/helm/rustfs/templates/pvc.yaml b/helm/rustfs/templates/pvc.yaml index 1cab744d..a50a04e9 100644 --- a/helm/rustfs/templates/pvc.yaml +++ b/helm/rustfs/templates/pvc.yaml @@ -3,6 +3,8 @@ apiVersion: v1 kind: PersistentVolumeClaim metadata: name: {{ include "rustfs.fullname" . 
}}-data + labels: + {{- toYaml .Values.commonLabels | nindent 4 }} spec: accessModes: ["ReadWriteOnce"] storageClassName: {{ .Values.storageclass.name }} @@ -15,10 +17,12 @@ apiVersion: v1 kind: PersistentVolumeClaim metadata: name: {{ include "rustfs.fullname" . }}-logs + labels: + {{- toYaml .Values.commonLabels | nindent 4 }} spec: accessModes: ["ReadWriteOnce"] storageClassName: {{ .Values.storageclass.name }} resources: requests: storage: {{ .Values.storageclass.logStorageSize }} -{{- end }} \ No newline at end of file +{{- end }} diff --git a/helm/rustfs/templates/secret-tls.yaml b/helm/rustfs/templates/secret-tls.yaml index 8c78787b..28b50600 100644 --- a/helm/rustfs/templates/secret-tls.yaml +++ b/helm/rustfs/templates/secret-tls.yaml @@ -1,10 +1,12 @@ -{{- if .Values.tls.enabled }} +{{- if and .Values.ingress.tls.enabled (not .Values.ingress.tls.certManager.enabled) }} apiVersion: v1 kind: Secret metadata: name: {{ include "rustfs.fullname" . }}-tls + labels: + {{- toYaml .Values.commonLabels | nindent 4 }} type: kubernetes.io/tls data: - tls.crt : {{ .Values.tls.crt | b64enc | quote }} - tls.key : {{ .Values.tls.key | b64enc | quote }} -{{- end }} \ No newline at end of file + tls.crt : {{ .Values.ingress.tls.crt | b64enc | quote }} + tls.key : {{ .Values.ingress.tls.key | b64enc | quote }} +{{- end }} diff --git a/helm/rustfs/templates/secret.yaml b/helm/rustfs/templates/secret.yaml index 7d061828..2caa8509 100644 --- a/helm/rustfs/templates/secret.yaml +++ b/helm/rustfs/templates/secret.yaml @@ -3,8 +3,23 @@ apiVersion: v1 kind: Secret metadata: name: {{ include "rustfs.secretName" . }} + labels: + {{- toYaml .Values.commonLabels | nindent 4 }} type: Opaque data: RUSTFS_ACCESS_KEY: {{ .Values.secret.rustfs.access_key | b64enc | quote }} RUSTFS_SECRET_KEY: {{ .Values.secret.rustfs.secret_key | b64enc | quote }} {{- end }} + +--- +{{- if .Values.imageRegistryCredentials.enabled }} +apiVersion: v1 +kind: Secret +metadata: + name: {{ include "rustfs.imagePullSecret.name" . }} + labels: + {{- toYaml .Values.commonLabels | nindent 4 }} +type: kubernetes.io/dockerconfigjson +data: + .dockerconfigjson: {{ template "imagePullSecret" . }} +{{- end }} diff --git a/helm/rustfs/templates/service.yaml b/helm/rustfs/templates/service.yaml index 3275a822..347383ab 100644 --- a/helm/rustfs/templates/service.yaml +++ b/helm/rustfs/templates/service.yaml @@ -5,27 +5,24 @@ metadata: name: {{ include "rustfs.fullname" . }}-headless labels: {{- include "rustfs.labels" . | nindent 4 }} + {{- with .Values.commonLabels }} + {{- toYaml . | nindent 4 }} + {{- end }} spec: + {{- /* headless service */}} clusterIP: None publishNotReadyAddresses: true ports: - {{- if .Values.ingress.enabled }} - - port: 9000 - {{- else }} - - port: {{ .Values.service.ep_port }} - {{- end }} - targetPort: {{ .Values.service.ep_port }} - protocol: TCP - name: endpoint - - port: {{ .Values.service.console_port }} - targetPort: 9001 - protocol: TCP - name: console + - name: endpoint + port: {{ .Values.service.endpoint.port }} + - name: console + port: {{ .Values.service.console.port }} selector: - app: {{ include "rustfs.name" . }} + {{- include "rustfs.selectorLabels" . | nindent 4 }} {{- end }} --- +{{- $serviceType := .Values.service.type }} apiVersion: v1 kind: Service metadata: @@ -40,10 +37,13 @@ metadata: {{- end }} labels: {{- include "rustfs.labels" . | nindent 4 }} + {{- with .Values.commonLabels }} + {{- toYaml . 
| nindent 4 }} + {{- end }} spec: - {{- if .Values.ingress.enabled }} + {{- if eq $serviceType "ClusterIP" }} type: ClusterIP - {{- else }} + {{- else if eq $serviceType "NodePort" }} type: NodePort sessionAffinity: ClientIP sessionAffinityConfig: @@ -51,13 +51,17 @@ spec: timeoutSeconds: 10800 {{- end }} ports: - - port: {{ .Values.service.ep_port }} - targetPort: {{ .Values.service.ep_port }} - protocol: TCP - name: endpoint - - port: {{ .Values.service.console_port }} - targetPort: {{ .Values.service.console_port }} - protocol: TCP - name: console + - name: endpoint + port: {{ .Values.service.endpoint.port }} + targetPort: {{ .Values.service.endpoint.port }} + {{- if eq $serviceType "NodePort" }} + nodePort: {{ .Values.service.endpoint.nodePort }} + {{- end }} + - name: console + port: {{ .Values.service.console.port }} + targetPort: {{ .Values.service.console.port }} + {{- if eq $serviceType "NodePort" }} + nodePort: {{ .Values.service.console.nodePort }} + {{- end }} selector: - app: {{ include "rustfs.name" . }} + {{- include "rustfs.selectorLabels" . | nindent 4 }} diff --git a/helm/rustfs/templates/serviceaccount.yaml b/helm/rustfs/templates/serviceaccount.yaml index a70c5d2e..9edd6d7b 100644 --- a/helm/rustfs/templates/serviceaccount.yaml +++ b/helm/rustfs/templates/serviceaccount.yaml @@ -5,6 +5,9 @@ metadata: name: {{ include "rustfs.serviceAccountName" . }} labels: {{- include "rustfs.labels" . | nindent 4 }} + {{- with .Values.commonLabels }} + {{- toYaml . | nindent 4 }} + {{- end }} {{- with .Values.serviceAccount.annotations }} annotations: {{- toYaml . | nindent 4 }} diff --git a/helm/rustfs/templates/statefulset.yaml b/helm/rustfs/templates/statefulset.yaml index 931cfff4..5fcfcc7d 100644 --- a/helm/rustfs/templates/statefulset.yaml +++ b/helm/rustfs/templates/statefulset.yaml @@ -1,34 +1,80 @@ +{{- $logDir := .Values.config.rustfs.obs_log_directory }} + {{- if .Values.mode.distributed.enabled }} +--- apiVersion: apps/v1 kind: StatefulSet metadata: name: {{ include "rustfs.fullname" . }} + labels: + {{- include "rustfs.labels" . | nindent 4 }} + {{- with .Values.commonLabels }} + {{- toYaml . | nindent 4 }} + {{- end }} spec: serviceName: {{ include "rustfs.fullname" . }}-headless replicas: {{ .Values.replicaCount }} podManagementPolicy: Parallel selector: matchLabels: - app: {{ include "rustfs.name" . }} + {{- include "rustfs.selectorLabels" . | nindent 6 }} template: metadata: labels: - app: {{ include "rustfs.name" . }} + {{- include "rustfs.selectorLabels" . | nindent 8 }} + {{- with .Values.podLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} spec: + {{- with include "chart.imagePullSecrets" . }} + imagePullSecrets: + {{- . | nindent 8 }} + {{- end }} + {{- if and .Values.nodeSelector (not .Values.affinity.nodeAffinity) }} + nodeSelector: + {{- toYaml .Values.nodeSelector | nindent 8 }} + {{- end }} + {{- if .Values.affinity }} + affinity: + nodeAffinity: + {{- if .Values.affinity.nodeAffinity }} + {{- toYaml .Values.affinity.nodeAffinity | nindent 10 }} + {{- else }} + {} + {{- if .Values.affinity.podAntiAffinity.enabled }} + {{- end }} + podAntiAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + - labelSelector: + matchExpressions: + - key: app.kubernetes.io/name + operator: In + values: + - {{ include "rustfs.name" . 
}} + topologyKey: {{ .Values.affinity.podAntiAffinity.topologyKey }} + {{- end }} + {{- end }} + {{- if .Values.tolerations }} + tolerations: + {{- toYaml .Values.tolerations | nindent 8 }} + {{- end }} {{- if .Values.podSecurityContext }} securityContext: - {{- toYaml .Values.podSecurityContext | nindent 12 }} + {{- toYaml .Values.podSecurityContext | nindent 8 }} + {{- end }} + {{- if .Values.imagePullSecrets }} + imagePullSecrets: + {{- toYaml .Values.imagePullSecrets | nindent 8 }} {{- end }} initContainers: - name: init-step - image: busybox - imagePullPolicy: {{ .Values.image.pullPolicy }} + image: "{{ .Values.initStep.image.repository }}:{{ .Values.initStep.image.tag }}" + imagePullPolicy: {{ .Values.initStep.image.pullPolicy }} securityContext: - runAsUser: 0 - runAsGroup: 0 + {{- toYaml .Values.initStep.containerSecurityContext | nindent 12 }} env: - name: REPLICA_COUNT - value: "{{ .Values.replicaCount }}" + value: {{ .Values.replicaCount | quote }} command: - sh - -c @@ -40,9 +86,8 @@ spec: elif [ "$REPLICA_COUNT" -eq 16 ]; then mkdir -p /data fi - - chown -R 10001:10001 /data - chown -R 10001:10001 /logs + mkdir -p {{ $logDir }} + chown -R 10001:10001 /data {{ $logDir }} volumeMounts: {{- if eq (int .Values.replicaCount) 4 }} {{- range $i := until (int .Values.replicaCount) }} @@ -54,7 +99,7 @@ spec: mountPath: /data {{- end }} - name: logs - mountPath: /logs + mountPath: {{ $logDir }} containers: - name: {{ .Chart.Name }} image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}" @@ -62,16 +107,13 @@ spec: imagePullPolicy: {{ .Values.image.pullPolicy }} {{- if .Values.containerSecurityContext }} securityContext: - {{- toYaml .Values.containerSecurityContext | nindent 12 }} + {{- toYaml .Values.containerSecurityContext | nindent 12 }} {{- end }} ports: - - containerPort: {{ .Values.service.ep_port }} - name: endpoint - - containerPort: {{ .Values.service.console_port }} - name: console - env: - - name: REPLICA_COUNT - value: "{{ .Values.replicaCount }}" + - name: endpoint + containerPort: {{ .Values.service.endpoint.port }} + - name: console + containerPort: {{ .Values.service.console.port }} envFrom: - configMapRef: name: {{ include "rustfs.fullname" . 
}}-config @@ -85,26 +127,14 @@ spec: memory: {{ .Values.resources.limits.memory }} cpu: {{ .Values.resources.limits.cpu }} livenessProbe: - httpGet: - path: /health - port: 9000 - initialDelaySeconds: 10 - periodSeconds: 5 - timeoutSeconds: 3 - successThreshold: 1 - failureThreshold: 3 + {{- toYaml .Values.livenessProbe | nindent 12 }} readinessProbe: - httpGet: - path: /health - port: 9000 - initialDelaySeconds: 30 - periodSeconds: 5 - timeoutSeconds: 3 - successThreshold: 1 - failureThreshold: 3 + {{- toYaml .Values.readinessProbe | nindent 12 }} volumeMounts: + - name: tmp + mountPath: /tmp - name: logs - mountPath: /logs + mountPath: {{ $logDir }} {{- if eq (int .Values.replicaCount) 4 }} {{- range $i := until (int .Values.replicaCount) }} - name: data-rustfs-{{ $i }} @@ -114,34 +144,43 @@ spec: - name: data mountPath: /data {{- end }} + volumes: + - name: tmp + emptyDir: {} volumeClaimTemplates: - metadata: name: logs + labels: + {{- toYaml .Values.commonLabels | nindent 10 }} spec: accessModes: ["ReadWriteOnce"] - storageClassName: {{ $.Values.storageclass.name }} + storageClassName: {{ .Values.storageclass.name }} resources: requests: - storage: {{ $.Values.storageclass.logStorageSize}} + storage: {{ .Values.storageclass.logStorageSize }} {{- if eq (int .Values.replicaCount) 4 }} {{- range $i := until (int .Values.replicaCount) }} - metadata: name: data-rustfs-{{ $i }} + labels: + {{- toYaml $.Values.commonLabels | nindent 10 }} spec: accessModes: ["ReadWriteOnce"] storageClassName: {{ $.Values.storageclass.name }} resources: requests: - storage: {{ $.Values.storageclass.dataStorageSize}} + storage: {{ $.Values.storageclass.dataStorageSize }} {{- end }} {{- else if eq (int .Values.replicaCount) 16 }} - metadata: name: data + labels: + {{- toYaml .Values.commonLabels | nindent 10 }} spec: accessModes: ["ReadWriteOnce"] - storageClassName: {{ $.Values.storageclass.name }} + storageClassName: {{ .Values.storageclass.name }} resources: requests: - storage: {{ $.Values.storageclass.dataStorageSize}} + storage: {{ .Values.storageclass.dataStorageSize }} {{- end }} {{- end }} diff --git a/helm/rustfs/templates/tests/test-connection.yaml b/helm/rustfs/templates/tests/test-connection.yaml index 42d4fff0..ee879f85 100644 --- a/helm/rustfs/templates/tests/test-connection.yaml +++ b/helm/rustfs/templates/tests/test-connection.yaml @@ -11,5 +11,5 @@ spec: - name: wget image: busybox command: ['wget'] - args: ['{{ include "rustfs.fullname" . }}:{{ .Values.service.port }}'] + args: ['-O', '/dev/null', '{{ include "rustfs.fullname" . }}-svc:{{ .Values.service.endpoint.port }}/health'] restartPolicy: Never diff --git a/helm/rustfs/values.yaml b/helm/rustfs/values.yaml index 17d23c43..5159b478 100644 --- a/helm/rustfs/values.yaml +++ b/helm/rustfs/values.yaml @@ -11,15 +11,22 @@ image: # This sets the pull policy for images. pullPolicy: IfNotPresent # Overrides the image tag whose default is the chart appVersion. - tag: "latest" + tag: "1.0.0-alpha.73" # This is for the secrets for pulling an image from a private repository more information can be found here: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/ imagePullSecrets: [] + +imageRegistryCredentials: + enabled: false + registry: "" + username: "" + password: "" + email: "" + # This is to override the chart name. 
nameOverride: "" fullnameOverride: "" - mode: standalone: enabled: false @@ -34,13 +41,18 @@ secret: config: rustfs: - volume: "/data/rustfs0,/data/rustfs1,/data/rustfs2,/data/rustfs3" - address: "0.0.0.0:9000" - console_address: "0.0.0.0:9001" + # Examples + # volumes: "/data/rustfs0,/data/rustfs1,/data/rustfs2,/data/rustfs3" + # volumes: "http://rustfs-{0...3}.rustfs-headless:9000/data/rustfs{0...3}" + volumes: "" + address: ":9000" + console_enable: "true" + console_address: ":9001" log_level: "debug" rust_log: "debug" - console_enable: "true" + region: "us-east-1" obs_log_directory: "/logs" + obs_environment: "develop" # This section builds out the service account more information can be found here: https://kubernetes.io/docs/concepts/security/service-accounts/ serviceAccount: @@ -57,13 +69,17 @@ serviceAccount: # This is for setting Kubernetes Annotations to a Pod. # For more information checkout: https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/ podAnnotations: {} + # This is for setting Kubernetes Labels to a Pod. # For more information checkout: https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ podLabels: {} +# Labels to add to all deployed objects +commonLabels: {} + podSecurityContext: fsGroup: 10001 - runAsUser: 10001 + runAsUser: 10001 runAsGroup: 10001 containerSecurityContext: @@ -74,14 +90,18 @@ containerSecurityContext: runAsNonRoot: true service: - type: NodePort - ep_port: 9000 - console_port: 9001 + type: ClusterIP + endpoint: + port: 9000 + nodePort: 32000 + console: + port: 9001 + nodePort: 32001 # This block is for setting up the ingress for more information can be found here: https://kubernetes.io/docs/concepts/services-networking/ingress/ ingress: enabled: true - className: "traefik" # Specify the classname, traefik or nginx. Different classname has different annotations for session sticky. + className: "nginx" # Specify the classname, traefik or nginx. Different classname has different annotations for session sticky. traefikAnnotations: traefik.ingress.kubernetes.io/service.sticky.cookie: "true" traefik.ingress.kubernetes.io/service.sticky.cookie.httponly: "true" @@ -94,20 +114,20 @@ ingress: nginx.ingress.kubernetes.io/session-cookie-hash: sha1 nginx.ingress.kubernetes.io/session-cookie-max-age: "3600" nginx.ingress.kubernetes.io/session-cookie-name: rustfs + customAnnotations: # Specify custom annotations + {} # Customize annotations hosts: - - host: your.rustfs.com + - host: example.rustfs.com paths: - path: / - pathType: ImplementationSpecific + pathType: Prefix tls: - - secretName: rustfs-tls - hosts: - - your.rustfs.com - -tls: - enabled: false - crt: tls.crt - key: tls.key + enabled: false # Enable tls and access rustfs via https. + certManager: + enabled: false # Enable certmanager to generate certificate for rustfs, default false. 
+ secretName: secret-tls + crt: tls.crt + key: tls.key resources: # We usually recommend not to specify default resources and to leave this as a conscious @@ -125,29 +145,48 @@ resources: livenessProbe: httpGet: path: /health - port: http + port: endpoint + initialDelaySeconds: 10 + periodSeconds: 5 + timeoutSeconds: 3 + successThreshold: 1 + failureThreshold: 3 + readinessProbe: httpGet: path: /health - port: http - -# This section is for setting up autoscaling more information can be found here: https://kubernetes.io/docs/concepts/workloads/autoscaling/ -autoscaling: - enabled: false - minReplicas: 1 - maxReplicas: 100 - targetCPUUtilizationPercentage: 80 - # targetMemoryUtilizationPercentage: 80 + port: endpoint + initialDelaySeconds: 30 + periodSeconds: 5 + timeoutSeconds: 3 + successThreshold: 1 + failureThreshold: 3 nodeSelector: {} tolerations: [] -affinity: {} +affinity: + podAntiAffinity: + enabled: true + topologyKey: kubernetes.io/hostname + nodeAffinity: {} storageclass: name: local-path dataStorageSize: 256Mi logStorageSize: 256Mi +# Init container parameters. +initStep: + image: + repository: busybox + pullPolicy: IfNotPresent + tag: "latest" + containerSecurityContext: + runAsUser: 0 + runAsGroup: 0 + + + extraManifests: [] diff --git a/rustfs/Cargo.toml b/rustfs/Cargo.toml index e4c685eb..73438f1a 100644 --- a/rustfs/Cargo.toml +++ b/rustfs/Cargo.toml @@ -60,6 +60,7 @@ rustfs-s3select-query = { workspace = true } rustfs-targets = { workspace = true } rustfs-utils = { workspace = true, features = ["full"] } rustfs-zip = { workspace = true } +rustfs-scanner = { workspace = true } # Async Runtime and Networking async-trait = { workspace = true } @@ -72,6 +73,7 @@ hyper.workspace = true hyper-util.workspace = true http.workspace = true http-body.workspace = true +http-body-util.workspace = true reqwest = { workspace = true } socket2 = { workspace = true } tokio = { workspace = true, features = ["rt-multi-thread", "macros", "net", "signal", "process", "io-util"] } @@ -144,6 +146,7 @@ pprof = { workspace = true } [dev-dependencies] uuid = { workspace = true, features = ["v4"] } +serial_test = { workspace = true } [build-dependencies] http.workspace = true diff --git a/rustfs/src/admin/auth.rs b/rustfs/src/admin/auth.rs index 8b994097..2f101099 100644 --- a/rustfs/src/admin/auth.rs +++ b/rustfs/src/admin/auth.rs @@ -1,6 +1,18 @@ -use std::collections::HashMap; -use std::sync::Arc; +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+use crate::auth::get_condition_values; use http::HeaderMap; use rustfs_iam::store::object::ObjectStore; use rustfs_iam::sys::IamSys; @@ -9,8 +21,8 @@ use rustfs_policy::policy::Args; use rustfs_policy::policy::action::Action; use s3s::S3Result; use s3s::s3_error; - -use crate::auth::get_condition_values; +use std::collections::HashMap; +use std::sync::Arc; pub async fn validate_admin_request( headers: &HeaderMap, diff --git a/rustfs/src/admin/console.rs b/rustfs/src/admin/console.rs index 0fe66040..b541edf1 100644 --- a/rustfs/src/admin/console.rs +++ b/rustfs/src/admin/console.rs @@ -14,6 +14,7 @@ use crate::config::build; use crate::license::get_license; +use crate::server::{CONSOLE_PREFIX, FAVICON_PATH, HEALTH_PREFIX, RUSTFS_ADMIN_PREFIX}; use axum::{ Router, body::Body, @@ -45,9 +46,6 @@ use tower_http::timeout::TimeoutLayer; use tower_http::trace::TraceLayer; use tracing::{debug, error, info, instrument, warn}; -pub(crate) const CONSOLE_PREFIX: &str = "/rustfs/console"; -const RUSTFS_ADMIN_PREFIX: &str = "/rustfs/admin/v3"; - #[derive(RustEmbed)] #[folder = "$CARGO_MANIFEST_DIR/static"] struct StaticFiles; @@ -457,7 +455,7 @@ fn get_console_config_from_env() -> (bool, u32, u64, String) { /// # Returns: /// - `true` if the path is for console access, `false` otherwise. pub fn is_console_path(path: &str) -> bool { - path == "/favicon.ico" || path.starts_with(CONSOLE_PREFIX) + path == FAVICON_PATH || path.starts_with(CONSOLE_PREFIX) } /// Setup comprehensive middleware stack with tower-http features @@ -477,11 +475,11 @@ fn setup_console_middleware_stack( auth_timeout: u64, ) -> Router { let mut app = Router::new() - .route("/favicon.ico", get(static_handler)) + .route(FAVICON_PATH, get(static_handler)) .route(&format!("{CONSOLE_PREFIX}/license"), get(license_handler)) .route(&format!("{CONSOLE_PREFIX}/config.json"), get(config_handler)) .route(&format!("{CONSOLE_PREFIX}/version"), get(version_handler)) - .route(&format!("{CONSOLE_PREFIX}/health"), get(health_check).head(health_check)) + .route(&format!("{CONSOLE_PREFIX}{HEALTH_PREFIX}"), get(health_check).head(health_check)) .nest(CONSOLE_PREFIX, Router::new().fallback_service(get(static_handler))) .fallback_service(get(static_handler)); diff --git a/rustfs/src/admin/handlers.rs b/rustfs/src/admin/handlers.rs index 878bb3b9..91bb86a3 100644 --- a/rustfs/src/admin/handlers.rs +++ b/rustfs/src/admin/handlers.rs @@ -24,6 +24,7 @@ use http::{HeaderMap, HeaderValue, Uri}; use hyper::StatusCode; use matchit::Params; use rustfs_common::heal_channel::HealOpts; +use rustfs_config::{MAX_ADMIN_REQUEST_BODY_SIZE, MAX_HEAL_REQUEST_SIZE}; use rustfs_ecstore::admin_server_info::get_server_info; use rustfs_ecstore::bucket::bucket_target_sys::BucketTargetSys; use rustfs_ecstore::bucket::metadata::BUCKET_TARGETS_FILE; @@ -71,7 +72,6 @@ use tokio_stream::wrappers::ReceiverStream; use tracing::debug; use tracing::{error, info, warn}; use url::Host; -// use url::UrlQuery; pub mod bucket_meta; pub mod event; @@ -158,14 +158,15 @@ impl Operation for IsAdminHandler { return Err(s3_error!(InvalidRequest, "get cred failed")); }; - let (_cred, _owner) = + let (cred, _owner) = check_key_valid(get_session_token(&req.uri, &req.headers).unwrap_or_default(), &input_cred.access_key).await?; let access_key_to_check = input_cred.access_key.clone(); // Check if the user is admin by comparing with global credentials let is_admin = if let Some(sys_cred) = get_global_action_cred() { - sys_cred.access_key == access_key_to_check + 
crate::auth::constant_time_eq(&access_key_to_check, &sys_cred.access_key) + || crate::auth::constant_time_eq(&cred.parent_user, &sys_cred.access_key) } else { false }; @@ -686,7 +687,6 @@ impl Stream for MetricsStream { type Item = Result; fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll> { - info!("MetricsStream poll_next"); let this = Pin::into_inner(self); this.inner.poll_next_unpin(cx) } @@ -748,7 +748,6 @@ impl Operation for MetricsHandler { let body = Body::from(in_stream); spawn(async move { while n > 0 { - info!("loop, n: {n}"); let mut m = RealtimeMetrics::default(); let m_local = collect_local_metrics(types, &opts).await; m.merge(m_local); @@ -765,7 +764,6 @@ impl Operation for MetricsHandler { // todo write resp match serde_json::to_vec(&m) { Ok(re) => { - info!("got metrics, send it to client, m: {m:?}"); let _ = tx.send(Ok(Bytes::from(re))).await; } Err(e) => { @@ -859,11 +857,11 @@ impl Operation for HealHandler { let Some(cred) = req.credentials else { return Err(s3_error!(InvalidRequest, "get cred failed")) }; info!("cred: {:?}", cred); let mut input = req.input; - let bytes = match input.store_all_unlimited().await { + let bytes = match input.store_all_limited(MAX_HEAL_REQUEST_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); - return Err(s3_error!(InvalidRequest, "get body failed")); + return Err(s3_error!(InvalidRequest, "heal request body too large or failed to read")); } }; info!("bytes: {:?}", bytes); @@ -1051,11 +1049,11 @@ impl Operation for SetRemoteTargetHandler { .map_err(ApiError::from)?; let mut input = req.input; - let body = match input.store_all_unlimited().await { + let body = match input.store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); - return Err(s3_error!(InvalidRequest, "get body failed")); + return Err(s3_error!(InvalidRequest, "remote target configuration body too large or failed to read")); } }; @@ -1326,8 +1324,7 @@ impl Operation for ProfileHandler { let target_arch = std::env::consts::ARCH; let target_env = option_env!("CARGO_CFG_TARGET_ENV").unwrap_or("unknown"); let msg = format!( - "CPU profiling is not supported on this platform. target_os={}, target_env={}, target_arch={}, requested_url={}", - target_os, target_env, target_arch, requested_url + "CPU profiling is not supported on this platform. 
target_os={target_os}, target_env={target_env}, target_arch={target_arch}, requested_url={requested_url}" ); return Ok(S3Response::new((StatusCode::NOT_IMPLEMENTED, Body::from(msg)))); } diff --git a/rustfs/src/admin/handlers/bucket_meta.rs b/rustfs/src/admin/handlers/bucket_meta.rs index 1989cf9d..ea553672 100644 --- a/rustfs/src/admin/handlers/bucket_meta.rs +++ b/rustfs/src/admin/handlers/bucket_meta.rs @@ -21,9 +21,9 @@ use crate::{ admin::{auth::validate_admin_request, router::Operation}, auth::{check_key_valid, get_session_token}, }; - use http::{HeaderMap, StatusCode}; use matchit::Params; +use rustfs_config::MAX_BUCKET_METADATA_IMPORT_SIZE; use rustfs_ecstore::{ StorageAPI, bucket::{ @@ -393,11 +393,11 @@ impl Operation for ImportBucketMetadata { .await?; let mut input = req.input; - let body = match input.store_all_unlimited().await { + let body = match input.store_all_limited(MAX_BUCKET_METADATA_IMPORT_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); - return Err(s3_error!(InvalidRequest, "get body failed")); + return Err(s3_error!(InvalidRequest, "bucket metadata import body too large or failed to read")); } }; diff --git a/rustfs/src/admin/handlers/event.rs b/rustfs/src/admin/handlers/event.rs index 8aabbf5f..a8b93227 100644 --- a/rustfs/src/admin/handlers/event.rs +++ b/rustfs/src/admin/handlers/event.rs @@ -17,7 +17,7 @@ use crate::auth::{check_key_valid, get_session_token}; use http::{HeaderMap, StatusCode}; use matchit::Params; use rustfs_config::notify::{NOTIFY_MQTT_SUB_SYS, NOTIFY_WEBHOOK_SUB_SYS}; -use rustfs_config::{ENABLE_KEY, EnableState}; +use rustfs_config::{ENABLE_KEY, EnableState, MAX_ADMIN_REQUEST_BODY_SIZE}; use rustfs_targets::check_mqtt_broker_available; use s3s::header::CONTENT_LENGTH; use s3s::{Body, S3Error, S3ErrorCode, S3Request, S3Response, S3Result, header::CONTENT_TYPE, s3_error}; @@ -140,7 +140,7 @@ impl Operation for NotificationTarget { // 4. The parsing request body is KVS (Key-Value Store) let mut input = req.input; - let body = input.store_all_unlimited().await.map_err(|e| { + let body = input.store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE).await.map_err(|e| { warn!("failed to read request body: {:?}", e); s3_error!(InvalidRequest, "failed to read request body") })?; diff --git a/rustfs/src/admin/handlers/group.rs b/rustfs/src/admin/handlers/group.rs index 953f3105..c7866a81 100644 --- a/rustfs/src/admin/handlers/group.rs +++ b/rustfs/src/admin/handlers/group.rs @@ -12,8 +12,13 @@ // See the License for the specific language governing permissions and // limitations under the License. 
+use crate::{ + admin::{auth::validate_admin_request, router::Operation, utils::has_space_be}, + auth::{check_key_valid, constant_time_eq, get_session_token}, +}; use http::{HeaderMap, StatusCode}; use matchit::Params; +use rustfs_config::MAX_ADMIN_REQUEST_BODY_SIZE; use rustfs_ecstore::global::get_global_action_cred; use rustfs_iam::error::{is_err_no_such_group, is_err_no_such_user}; use rustfs_madmin::GroupAddRemove; @@ -27,11 +32,6 @@ use serde::Deserialize; use serde_urlencoded::from_bytes; use tracing::warn; -use crate::{ - admin::{auth::validate_admin_request, router::Operation, utils::has_space_be}, - auth::{check_key_valid, constant_time_eq, get_session_token}, -}; - #[derive(Debug, Deserialize, Default)] pub struct GroupQuery { pub group: String, @@ -213,11 +213,11 @@ impl Operation for UpdateGroupMembers { .await?; let mut input = req.input; - let body = match input.store_all_unlimited().await { + let body = match input.store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); - return Err(s3_error!(InvalidRequest, "get body failed")); + return Err(s3_error!(InvalidRequest, "group configuration body too large or failed to read")); } }; diff --git a/rustfs/src/admin/handlers/kms.rs b/rustfs/src/admin/handlers/kms.rs index dbe74dbb..741508c2 100644 --- a/rustfs/src/admin/handlers/kms.rs +++ b/rustfs/src/admin/handlers/kms.rs @@ -20,6 +20,7 @@ use crate::auth::{check_key_valid, get_session_token}; use base64::Engine; use hyper::{HeaderMap, StatusCode}; use matchit::Params; +use rustfs_config::MAX_ADMIN_REQUEST_BODY_SIZE; use rustfs_kms::{get_global_encryption_service, types::*}; use rustfs_policy::policy::action::{Action, AdminAction}; use s3s::header::CONTENT_TYPE; @@ -131,7 +132,7 @@ impl Operation for CreateKeyHandler { let body = req .input - .store_all_unlimited() + .store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE) .await .map_err(|e| s3_error!(InvalidRequest, "failed to read request body: {}", e))?; @@ -325,7 +326,7 @@ impl Operation for GenerateDataKeyHandler { let body = req .input - .store_all_unlimited() + .store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE) .await .map_err(|e| s3_error!(InvalidRequest, "failed to read request body: {}", e))?; diff --git a/rustfs/src/admin/handlers/kms_dynamic.rs b/rustfs/src/admin/handlers/kms_dynamic.rs index 150fc3ea..95bcddb7 100644 --- a/rustfs/src/admin/handlers/kms_dynamic.rs +++ b/rustfs/src/admin/handlers/kms_dynamic.rs @@ -19,6 +19,7 @@ use crate::admin::auth::validate_admin_request; use crate::auth::{check_key_valid, get_session_token}; use hyper::StatusCode; use matchit::Params; +use rustfs_config::MAX_ADMIN_REQUEST_BODY_SIZE; use rustfs_ecstore::config::com::{read_config, save_config}; use rustfs_ecstore::new_object_layer_fn; use rustfs_kms::{ @@ -102,7 +103,7 @@ impl Operation for ConfigureKmsHandler { let body = req .input - .store_all_unlimited() + .store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE) .await .map_err(|e| s3_error!(InvalidRequest, "failed to read request body: {}", e))?; @@ -200,7 +201,7 @@ impl Operation for StartKmsHandler { let body = req .input - .store_all_unlimited() + .store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE) .await .map_err(|e| s3_error!(InvalidRequest, "failed to read request body: {}", e))?; @@ -469,7 +470,7 @@ impl Operation for ReconfigureKmsHandler { let body = req .input - .store_all_unlimited() + .store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE) .await .map_err(|e| s3_error!(InvalidRequest, "failed to read request body: {}", e))?; diff --git 
a/rustfs/src/admin/handlers/kms_keys.rs b/rustfs/src/admin/handlers/kms_keys.rs index 3b52841a..661b1ba9 100644 --- a/rustfs/src/admin/handlers/kms_keys.rs +++ b/rustfs/src/admin/handlers/kms_keys.rs @@ -19,6 +19,7 @@ use crate::admin::auth::validate_admin_request; use crate::auth::{check_key_valid, get_session_token}; use hyper::{HeaderMap, StatusCode}; use matchit::Params; +use rustfs_config::MAX_ADMIN_REQUEST_BODY_SIZE; use rustfs_kms::{KmsError, get_global_kms_service_manager, types::*}; use rustfs_policy::policy::action::{Action, AdminAction}; use s3s::header::CONTENT_TYPE; @@ -83,7 +84,7 @@ impl Operation for CreateKmsKeyHandler { let body = req .input - .store_all_unlimited() + .store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE) .await .map_err(|e| s3_error!(InvalidRequest, "failed to read request body: {}", e))?; @@ -216,7 +217,7 @@ impl Operation for DeleteKmsKeyHandler { let body = req .input - .store_all_unlimited() + .store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE) .await .map_err(|e| s3_error!(InvalidRequest, "failed to read request body: {}", e))?; @@ -364,7 +365,7 @@ impl Operation for CancelKmsKeyDeletionHandler { let body = req .input - .store_all_unlimited() + .store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE) .await .map_err(|e| s3_error!(InvalidRequest, "failed to read request body: {}", e))?; diff --git a/rustfs/src/admin/handlers/policies.rs b/rustfs/src/admin/handlers/policies.rs index a65e7ee5..76915be0 100644 --- a/rustfs/src/admin/handlers/policies.rs +++ b/rustfs/src/admin/handlers/policies.rs @@ -18,6 +18,7 @@ use crate::{ }; use http::{HeaderMap, StatusCode}; use matchit::Params; +use rustfs_config::MAX_ADMIN_REQUEST_BODY_SIZE; use rustfs_ecstore::global::get_global_action_cred; use rustfs_iam::error::is_err_no_such_user; use rustfs_iam::store::MappedPolicy; @@ -139,11 +140,11 @@ impl Operation for AddCannedPolicy { } let mut input = req.input; - let policy_bytes = match input.store_all_unlimited().await { + let policy_bytes = match input.store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); - return Err(s3_error!(InvalidRequest, "get body failed")); + return Err(s3_error!(InvalidRequest, "policy configuration body too large or failed to read")); } }; diff --git a/rustfs/src/admin/handlers/rebalance.rs b/rustfs/src/admin/handlers/rebalance.rs index ca5b60f5..736c8754 100644 --- a/rustfs/src/admin/handlers/rebalance.rs +++ b/rustfs/src/admin/handlers/rebalance.rs @@ -12,8 +12,13 @@ // See the License for the specific language governing permissions and // limitations under the License. 
+use crate::{ + admin::{auth::validate_admin_request, router::Operation}, + auth::{check_key_valid, get_session_token}, +}; use http::{HeaderMap, StatusCode}; use matchit::Params; +use rustfs_ecstore::rebalance::RebalanceMeta; use rustfs_ecstore::{ StorageAPI, error::StorageError, @@ -33,12 +38,6 @@ use std::time::Duration; use time::OffsetDateTime; use tracing::warn; -use crate::{ - admin::{auth::validate_admin_request, router::Operation}, - auth::{check_key_valid, get_session_token}, -}; -use rustfs_ecstore::rebalance::RebalanceMeta; - #[derive(Debug, Clone, Deserialize, Serialize)] pub struct RebalanceResp { pub id: String, diff --git a/rustfs/src/admin/handlers/service_account.rs b/rustfs/src/admin/handlers/service_account.rs index 935abcc0..1340d13f 100644 --- a/rustfs/src/admin/handlers/service_account.rs +++ b/rustfs/src/admin/handlers/service_account.rs @@ -18,6 +18,7 @@ use crate::{admin::router::Operation, auth::check_key_valid}; use http::HeaderMap; use hyper::StatusCode; use matchit::Params; +use rustfs_config::MAX_ADMIN_REQUEST_BODY_SIZE; use rustfs_ecstore::global::get_global_action_cred; use rustfs_iam::error::is_err_no_such_service_account; use rustfs_iam::sys::{NewServiceAccountOpts, UpdateServiceAccountOpts}; @@ -48,11 +49,14 @@ impl Operation for AddServiceAccount { check_key_valid(get_session_token(&req.uri, &req.headers).unwrap_or_default(), &req_cred.access_key).await?; let mut input = req.input; - let body = match input.store_all_unlimited().await { + let body = match input.store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); - return Err(s3_error!(InvalidRequest, "get body failed")); + return Err(s3_error!( + InvalidRequest, + "service account configuration body too large or failed to read" + )); } }; @@ -235,11 +239,14 @@ impl Operation for UpdateServiceAccount { // })?; let mut input = req.input; - let body = match input.store_all_unlimited().await { + let body = match input.store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); - return Err(s3_error!(InvalidRequest, "get body failed")); + return Err(s3_error!( + InvalidRequest, + "service account configuration body too large or failed to read" + )); } }; @@ -439,8 +446,8 @@ impl Operation for ListServiceAccount { let query = { if let Some(query) = req.uri.query() { - let input: ListServiceAccountQuery = - from_bytes(query.as_bytes()).map_err(|_e| s3_error!(InvalidArgument, "get body failed"))?; + let input: ListServiceAccountQuery = from_bytes(query.as_bytes()) + .map_err(|_e| s3_error!(InvalidArgument, "invalid service account query parameters"))?; input } else { ListServiceAccountQuery::default() @@ -549,8 +556,8 @@ impl Operation for DeleteServiceAccount { let query = { if let Some(query) = req.uri.query() { - let input: AccessKeyQuery = - from_bytes(query.as_bytes()).map_err(|_e| s3_error!(InvalidArgument, "get body failed"))?; + let input: AccessKeyQuery = from_bytes(query.as_bytes()) + .map_err(|_e| s3_error!(InvalidArgument, "invalid access key query parameters"))?; input } else { AccessKeyQuery::default() diff --git a/rustfs/src/admin/handlers/sts.rs b/rustfs/src/admin/handlers/sts.rs index 757a4843..9770784d 100644 --- a/rustfs/src/admin/handlers/sts.rs +++ b/rustfs/src/admin/handlers/sts.rs @@ -18,6 +18,7 @@ use crate::{ }; use http::StatusCode; use matchit::Params; +use rustfs_config::MAX_ADMIN_REQUEST_BODY_SIZE; use rustfs_ecstore::bucket::utils::serialize; use 
rustfs_iam::{manager::get_token_signing_key, sys::SESSION_POLICY_NAME}; use rustfs_policy::{auth::get_new_credentials_with_metadata, policy::Policy}; @@ -71,15 +72,15 @@ impl Operation for AssumeRoleHandle { let mut input = req.input; - let bytes = match input.store_all_unlimited().await { + let bytes = match input.store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); - return Err(s3_error!(InvalidRequest, "get body failed")); + return Err(s3_error!(InvalidRequest, "STS request body too large or failed to read")); } }; - let body: AssumeRoleRequest = from_bytes(&bytes).map_err(|_e| s3_error!(InvalidRequest, "get body failed"))?; + let body: AssumeRoleRequest = from_bytes(&bytes).map_err(|_e| s3_error!(InvalidRequest, "invalid STS request format"))?; if body.action.as_str() != ASSUME_ROLE_ACTION { return Err(s3_error!(InvalidArgument, "not support action")); diff --git a/rustfs/src/admin/handlers/tier.rs b/rustfs/src/admin/handlers/tier.rs index 6fc1e7f7..4fdd8954 100644 --- a/rustfs/src/admin/handlers/tier.rs +++ b/rustfs/src/admin/handlers/tier.rs @@ -13,24 +13,13 @@ // limitations under the License. #![allow(unused_variables, unused_mut, unused_must_use)] -use http::{HeaderMap, StatusCode}; -//use iam::get_global_action_cred; -use matchit::Params; -use rustfs_policy::policy::action::{Action, AdminAction}; -use s3s::{ - Body, S3Error, S3ErrorCode, S3Request, S3Response, S3Result, - header::{CONTENT_LENGTH, CONTENT_TYPE}, - s3_error, -}; -use serde_urlencoded::from_bytes; -use time::OffsetDateTime; -use tracing::{debug, warn}; - use crate::{ admin::{auth::validate_admin_request, router::Operation}, auth::{check_key_valid, get_session_token}, }; - +use http::{HeaderMap, StatusCode}; +use matchit::Params; +use rustfs_config::MAX_ADMIN_REQUEST_BODY_SIZE; use rustfs_ecstore::{ config::storageclass, global::GLOBAL_TierConfigMgr, @@ -44,6 +33,15 @@ use rustfs_ecstore::{ }, }, }; +use rustfs_policy::policy::action::{Action, AdminAction}; +use s3s::{ + Body, S3Error, S3ErrorCode, S3Request, S3Response, S3Result, + header::{CONTENT_LENGTH, CONTENT_TYPE}, + s3_error, +}; +use serde_urlencoded::from_bytes; +use time::OffsetDateTime; +use tracing::{debug, warn}; #[derive(Debug, Clone, serde::Deserialize, Default)] pub struct AddTierQuery { @@ -95,11 +93,11 @@ impl Operation for AddTier { validate_admin_request(&req.headers, &cred, owner, false, vec![Action::AdminAction(AdminAction::SetTierAction)]).await?; let mut input = req.input; - let body = match input.store_all_unlimited().await { + let body = match input.store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); - return Err(s3_error!(InvalidRequest, "get body failed")); + return Err(s3_error!(InvalidRequest, "tier configuration body too large or failed to read")); } }; @@ -223,11 +221,11 @@ impl Operation for EditTier { validate_admin_request(&req.headers, &cred, owner, false, vec![Action::AdminAction(AdminAction::SetTierAction)]).await?; let mut input = req.input; - let body = match input.store_all_unlimited().await { + let body = match input.store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); - return Err(s3_error!(InvalidRequest, "get body failed")); + return Err(s3_error!(InvalidRequest, "tier configuration body too large or failed to read")); } }; diff --git a/rustfs/src/admin/handlers/trace.rs b/rustfs/src/admin/handlers/trace.rs index 8b1e0b84..1b9577a1 
100644 --- a/rustfs/src/admin/handlers/trace.rs +++ b/rustfs/src/admin/handlers/trace.rs @@ -12,6 +12,7 @@ // See the License for the specific language governing permissions and // limitations under the License. +use crate::admin::router::Operation; use http::StatusCode; use hyper::Uri; use matchit::Params; @@ -20,8 +21,6 @@ use rustfs_madmin::service_commands::ServiceTraceOpts; use s3s::{Body, S3Request, S3Response, S3Result, s3_error}; use tracing::warn; -use crate::admin::router::Operation; - #[allow(dead_code)] fn extract_trace_options(uri: &Uri) -> S3Result { let mut st_opts = ServiceTraceOpts::default(); diff --git a/rustfs/src/admin/handlers/user.rs b/rustfs/src/admin/handlers/user.rs index be20eda0..0ab6a128 100644 --- a/rustfs/src/admin/handlers/user.rs +++ b/rustfs/src/admin/handlers/user.rs @@ -18,6 +18,7 @@ use crate::{ }; use http::{HeaderMap, StatusCode}; use matchit::Params; +use rustfs_config::{MAX_ADMIN_REQUEST_BODY_SIZE, MAX_IAM_IMPORT_SIZE}; use rustfs_ecstore::global::get_global_action_cred; use rustfs_iam::{ store::{GroupInfo, MappedPolicy, UserType}, @@ -76,7 +77,7 @@ impl Operation for AddUser { } let mut input = req.input; - let body = match input.store_all_unlimited().await { + let body = match input.store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); @@ -636,7 +637,7 @@ impl Operation for ImportIam { .await?; let mut input = req.input; - let body = match input.store_all_unlimited().await { + let body = match input.store_all_limited(MAX_IAM_IMPORT_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); diff --git a/rustfs/src/admin/mod.rs b/rustfs/src/admin/mod.rs index 01d4942c..22f6a881 100644 --- a/rustfs/src/admin/mod.rs +++ b/rustfs/src/admin/mod.rs @@ -22,6 +22,7 @@ pub mod utils; #[cfg(test)] mod console_test; +use crate::server::{ADMIN_PREFIX, HEALTH_PREFIX, PROFILE_CPU_PATH, PROFILE_MEMORY_PATH}; use handlers::{ GetReplicationMetricsHandler, HealthCheckHandler, IsAdminHandler, ListRemoteTargetHandler, RemoveRemoteTargetHandler, SetRemoteTargetHandler, bucket_meta, @@ -37,17 +38,21 @@ use router::{AdminOperation, S3Router}; use rpc::register_rpc_route; use s3s::route::S3Route; -const ADMIN_PREFIX: &str = "/rustfs/admin"; -// const ADMIN_PREFIX: &str = "/minio/admin"; - +/// Create admin router +/// +/// # Arguments +/// * `console_enabled` - Whether the console is enabled +/// +/// # Returns +/// An instance of S3Route for admin operations pub fn make_admin_route(console_enabled: bool) -> std::io::Result { let mut r: S3Router = S3Router::new(console_enabled); // Health check endpoint for monitoring and orchestration - r.insert(Method::GET, "/health", AdminOperation(&HealthCheckHandler {}))?; - r.insert(Method::HEAD, "/health", AdminOperation(&HealthCheckHandler {}))?; - r.insert(Method::GET, "/profile/cpu", AdminOperation(&TriggerProfileCPU {}))?; - r.insert(Method::GET, "/profile/memory", AdminOperation(&TriggerProfileMemory {}))?; + r.insert(Method::GET, HEALTH_PREFIX, AdminOperation(&HealthCheckHandler {}))?; + r.insert(Method::HEAD, HEALTH_PREFIX, AdminOperation(&HealthCheckHandler {}))?; + r.insert(Method::GET, PROFILE_CPU_PATH, AdminOperation(&TriggerProfileCPU {}))?; + r.insert(Method::GET, PROFILE_MEMORY_PATH, AdminOperation(&TriggerProfileMemory {}))?; // 1 r.insert(Method::POST, "/", AdminOperation(&sts::AssumeRoleHandle {}))?; diff --git a/rustfs/src/admin/router.rs b/rustfs/src/admin/router.rs index c3a63b42..09c390cf 100644 --- a/rustfs/src/admin/router.rs +++ 
b/rustfs/src/admin/router.rs @@ -12,10 +12,9 @@ // See the License for the specific language governing permissions and // limitations under the License. -use crate::admin::ADMIN_PREFIX; use crate::admin::console::is_console_path; use crate::admin::console::make_console_server; -use crate::admin::rpc::RPC_PREFIX; +use crate::server::{ADMIN_PREFIX, HEALTH_PREFIX, PROFILE_CPU_PATH, PROFILE_MEMORY_PATH, RPC_PREFIX}; use hyper::HeaderMap; use hyper::Method; use hyper::StatusCode; @@ -86,22 +85,26 @@ where fn is_match(&self, method: &Method, uri: &Uri, headers: &HeaderMap, _: &mut Extensions) -> bool { let path = uri.path(); // Profiling endpoints - if method == Method::GET && (path == "/profile/cpu" || path == "/profile/memory") { + if method == Method::GET && (path == PROFILE_CPU_PATH || path == PROFILE_MEMORY_PATH) { return true; } // Health check - if (method == Method::HEAD || method == Method::GET) && path == "/health" { + if (method == Method::HEAD || method == Method::GET) && path == HEALTH_PREFIX { return true; } // AssumeRole - if method == Method::POST && path == "/" { - if let Some(val) = headers.get(header::CONTENT_TYPE) { - if val.as_bytes() == b"application/x-www-form-urlencoded" { - return true; - } - } + if method == Method::POST + && path == "/" + && headers + .get(header::CONTENT_TYPE) + .and_then(|v| v.to_str().ok()) + .map(|ct| ct.split(';').next().unwrap_or("").trim().to_lowercase()) + .map(|ct| ct == "application/x-www-form-urlencoded") + .unwrap_or(false) + { + return true; } path.starts_with(ADMIN_PREFIX) || path.starts_with(RPC_PREFIX) || is_console_path(path) @@ -113,12 +116,12 @@ where let path = req.uri.path(); // Profiling endpoints - if req.method == Method::GET && (path == "/profile/cpu" || path == "/profile/memory") { + if req.method == Method::GET && (path == PROFILE_CPU_PATH || path == PROFILE_MEMORY_PATH) { return Ok(()); } // Health check - if (req.method == Method::HEAD || req.method == Method::GET) && path == "/health" { + if (req.method == Method::HEAD || req.method == Method::GET) && path == HEALTH_PREFIX { return Ok(()); } diff --git a/rustfs/src/admin/rpc.rs b/rustfs/src/admin/rpc.rs index bc03cae5..8098236d 100644 --- a/rustfs/src/admin/rpc.rs +++ b/rustfs/src/admin/rpc.rs @@ -15,10 +15,12 @@ use super::router::AdminOperation; use super::router::Operation; use super::router::S3Router; +use crate::server::RPC_PREFIX; use futures::StreamExt; use http::StatusCode; use hyper::Method; use matchit::Params; +use rustfs_config::MAX_ADMIN_REQUEST_BODY_SIZE; use rustfs_ecstore::disk::DiskAPI; use rustfs_ecstore::disk::WalkDirOptions; use rustfs_ecstore::set_disk::DEFAULT_READ_BUFFER_SIZE; @@ -35,8 +37,6 @@ use tokio::io::AsyncWriteExt; use tokio_util::io::ReaderStream; use tracing::warn; -pub const RPC_PREFIX: &str = "/rustfs/rpc"; - pub fn register_rpc_route(r: &mut S3Router) -> std::io::Result<()> { r.insert( Method::GET, @@ -141,11 +141,11 @@ impl Operation for WalkDir { }; let mut input = req.input; - let body = match input.store_all_unlimited().await { + let body = match input.store_all_limited(MAX_ADMIN_REQUEST_BODY_SIZE).await { Ok(b) => b, Err(e) => { warn!("get body failed, e: {:?}", e); - return Err(s3_error!(InvalidRequest, "get body failed")); + return Err(s3_error!(InvalidRequest, "RPC request body too large or failed to read")); } }; diff --git a/rustfs/src/auth.rs b/rustfs/src/auth.rs index cc2d24c2..79cb2922 100644 --- a/rustfs/src/auth.rs +++ b/rustfs/src/auth.rs @@ -66,7 +66,7 @@ const SIGN_V2_ALGORITHM: &str = "AWS "; const SIGN_V4_ALGORITHM: 
&str = "AWS4-HMAC-SHA256"; const STREAMING_CONTENT_SHA256: &str = "STREAMING-AWS4-HMAC-SHA256-PAYLOAD"; const STREAMING_CONTENT_SHA256_TRAILER: &str = "STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER"; -pub const UNSIGNED_PAYLOAD_TRAILER: &str = "STREAMING-UNSIGNED-PAYLOAD-TRAILER"; +pub(crate) const UNSIGNED_PAYLOAD_TRAILER: &str = "STREAMING-UNSIGNED-PAYLOAD-TRAILER"; const ACTION_HEADER: &str = "Action"; const AMZ_CREDENTIAL: &str = "X-Amz-Credential"; const AMZ_ACCESS_KEY_ID: &str = "AWSAccessKeyId"; diff --git a/rustfs/src/config/config_test.rs b/rustfs/src/config/config_test.rs index 1f875fae..4e449b04 100644 --- a/rustfs/src/config/config_test.rs +++ b/rustfs/src/config/config_test.rs @@ -13,9 +13,48 @@ // limitations under the License. #[cfg(test)] +#[allow(unsafe_op_in_unsafe_fn)] mod tests { use crate::config::Opt; use clap::Parser; + use rustfs_ecstore::disks_layout::DisksLayout; + use serial_test::serial; + use std::env; + + /// Helper function to run test with environment variable set. + /// Automatically cleans up the environment variable after the test. + /// + /// # Safety + /// This function uses unsafe env::set_var and env::remove_var. + /// Tests using this helper must be marked with #[serial] to avoid race conditions. + #[allow(unsafe_code)] + fn with_env_var(key: &str, value: &str, test_fn: F) + where + F: FnOnce(), + { + unsafe { + env::set_var(key, value); + } + // Ensure cleanup happens even if test panics + let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(test_fn)); + unsafe { + env::remove_var(key); + } + // Re-panic if the test failed + if let Err(e) = result { + std::panic::resume_unwind(e); + } + } + + /// Helper to parse volumes and verify the layout. + fn verify_layout(volumes: &[T], verify_fn: F) + where + T: AsRef, + F: FnOnce(&DisksLayout), + { + let layout = DisksLayout::from_volumes(volumes).expect("Failed to parse volumes"); + verify_fn(&layout); + } #[test] fn test_default_console_configuration() { @@ -66,4 +105,422 @@ mod tests { assert_eq!(endpoint_port, 9000); assert_eq!(console_port, 9001); } + + #[test] + fn test_volumes_and_disk_layout_parsing() { + use rustfs_ecstore::disks_layout::DisksLayout; + + // Test case 1: Single volume path + let args = vec!["rustfs", "/data/vol1"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 1); + assert_eq!(opt.volumes[0], "/data/vol1"); + + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Failed to parse single volume"); + assert!(!layout.is_empty_layout()); + assert!(layout.is_single_drive_layout()); + assert_eq!(layout.get_single_drive_layout(), "/data/vol1"); + + // Test case 2: Multiple volume paths (space-separated via env) + let args = vec!["rustfs", "/data/vol1", "/data/vol2", "/data/vol3", "/data/vol4"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 4); + + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Failed to parse multiple volumes"); + assert!(!layout.is_empty_layout()); + assert!(!layout.is_single_drive_layout()); + assert_eq!(layout.get_set_count(0), 1); + assert_eq!(layout.get_drives_per_set(0), 4); + + // Test case 3: Ellipses pattern - simple range + let args = vec!["rustfs", "/data/vol{1...4}"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 1); + assert_eq!(opt.volumes[0], "/data/vol{1...4}"); + + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Failed to parse ellipses pattern"); + assert!(!layout.is_empty_layout()); + assert_eq!(layout.get_set_count(0), 1); + 
assert_eq!(layout.get_drives_per_set(0), 4); + + // Test case 4: Ellipses pattern - larger range that creates multiple sets + let args = vec!["rustfs", "/data/vol{1...16}"]; + let opt = Opt::parse_from(args); + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Failed to parse ellipses with multiple sets"); + assert!(!layout.is_empty_layout()); + assert_eq!(layout.get_drives_per_set(0), 16); + + // Test case 5: Distributed setup pattern + let args = vec!["rustfs", "http://server{1...4}/data/vol{1...4}"]; + let opt = Opt::parse_from(args); + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Failed to parse distributed pattern"); + assert!(!layout.is_empty_layout()); + assert_eq!(layout.get_drives_per_set(0), 16); + + // Test case 6: Multiple pools (legacy: false) + let args = vec!["rustfs", "http://server1/data{1...4}", "http://server2/data{1...4}"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 2); + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Failed to parse multiple pools"); + assert!(!layout.legacy); + assert_eq!(layout.pools.len(), 2); + + // Test case 7: Minimum valid drives for erasure coding (2 drives minimum) + let args = vec!["rustfs", "/data/vol1", "/data/vol2"]; + let opt = Opt::parse_from(args); + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Should succeed with 2 drives"); + assert_eq!(layout.get_drives_per_set(0), 2); + + // Test case 8: Invalid - single drive not enough for erasure coding + let args = vec!["rustfs", "/data/vol1"]; + let opt = Opt::parse_from(args); + // Single drive is special case and should succeed for single drive layout + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Single drive should work"); + assert!(layout.is_single_drive_layout()); + + // Test case 9: Command line with both address and volumes + let args = vec![ + "rustfs", + "/data/vol{1...8}", + "--address", + ":9000", + "--console-address", + ":9001", + ]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 1); + assert_eq!(opt.address, ":9000"); + assert_eq!(opt.console_address, ":9001"); + + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Failed to parse with address args"); + assert!(!layout.is_empty_layout()); + assert_eq!(layout.get_drives_per_set(0), 8); + + // Test case 10: Multiple ellipses in single argument - nested pattern + let args = vec!["rustfs", "/data{0...3}/vol{0...4}"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 1); + assert_eq!(opt.volumes[0], "/data{0...3}/vol{0...4}"); + + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Failed to parse nested ellipses pattern"); + assert!(!layout.is_empty_layout()); + // 4 data dirs * 5 vols = 20 drives + let total_drives = layout.get_set_count(0) * layout.get_drives_per_set(0); + assert_eq!(total_drives, 20, "Expected 20 drives from /data{{0...3}}/vol{{0...4}}"); + + // Test case 11: Multiple pools with nested ellipses patterns + let args = vec!["rustfs", "/data{0...3}/vol{0...4}", "/data{4...7}/vol{0...4}"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 2); + + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Failed to parse multiple pools with nested patterns"); + assert!(!layout.legacy); + assert_eq!(layout.pools.len(), 2); + + // Each pool should have 20 drives (4 * 5) + let pool0_drives = layout.get_set_count(0) * layout.get_drives_per_set(0); + let pool1_drives = layout.get_set_count(1) * layout.get_drives_per_set(1); + assert_eq!(pool0_drives, 20, 
"Pool 0 should have 20 drives"); + assert_eq!(pool1_drives, 20, "Pool 1 should have 20 drives"); + + // Test case 11: Complex distributed pattern with multiple ellipses + let args = vec!["rustfs", "http://server{1...2}.local/disk{1...8}"]; + let opt = Opt::parse_from(args); + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Failed to parse distributed nested pattern"); + assert!(!layout.is_empty_layout()); + // 2 servers * 8 disks = 16 drives + let total_drives = layout.get_set_count(0) * layout.get_drives_per_set(0); + assert_eq!(total_drives, 16, "Expected 16 drives from server{{1...2}}/disk{{1...8}}"); + + // Test case 12: Zero-padded patterns + let args = vec!["rustfs", "/data/vol{01...16}"]; + let opt = Opt::parse_from(args); + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Failed to parse zero-padded pattern"); + assert!(!layout.is_empty_layout()); + assert_eq!(layout.get_drives_per_set(0), 16); + } + + /// Test environment variable parsing for volumes. + /// Uses #[serial] to avoid concurrent env var modifications. + #[test] + #[serial] + #[allow(unsafe_code)] + fn test_rustfs_volumes_env_variable() { + // Test case 1: Single volume via environment variable + with_env_var("RUSTFS_VOLUMES", "/data/vol1", || { + let args = vec!["rustfs"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 1); + assert_eq!(opt.volumes[0], "/data/vol1"); + + let layout = DisksLayout::from_volumes(&opt.volumes).expect("Failed to parse single volume from env"); + assert!(layout.is_single_drive_layout()); + }); + + // Test case 2: Multiple volumes via environment variable (space-separated) + with_env_var("RUSTFS_VOLUMES", "/data/vol1 /data/vol2 /data/vol3 /data/vol4", || { + let args = vec!["rustfs"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 4); + assert_eq!(opt.volumes[0], "/data/vol1"); + assert_eq!(opt.volumes[1], "/data/vol2"); + assert_eq!(opt.volumes[2], "/data/vol3"); + assert_eq!(opt.volumes[3], "/data/vol4"); + + verify_layout(&opt.volumes, |layout| { + assert!(!layout.is_single_drive_layout()); + assert_eq!(layout.get_drives_per_set(0), 4); + }); + }); + + // Test case 3: Ellipses pattern via environment variable + with_env_var("RUSTFS_VOLUMES", "/data/vol{1...4}", || { + let args = vec!["rustfs"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 1); + assert_eq!(opt.volumes[0], "/data/vol{1...4}"); + + verify_layout(&opt.volumes, |layout| { + assert_eq!(layout.get_drives_per_set(0), 4); + }); + }); + + // Test case 4: Larger range with ellipses + with_env_var("RUSTFS_VOLUMES", "/data/vol{1...16}", || { + let args = vec!["rustfs"]; + let opt = Opt::parse_from(args); + verify_layout(&opt.volumes, |layout| { + assert_eq!(layout.get_drives_per_set(0), 16); + }); + }); + + // Test case 5: Distributed setup pattern + with_env_var("RUSTFS_VOLUMES", "http://server{1...4}/data/vol{1...4}", || { + let args = vec!["rustfs"]; + let opt = Opt::parse_from(args); + verify_layout(&opt.volumes, |layout| { + assert_eq!(layout.get_drives_per_set(0), 16); + }); + }); + + // Test case 6: Multiple pools via environment variable (space-separated) + with_env_var("RUSTFS_VOLUMES", "http://server1/data{1...4} http://server2/data{1...4}", || { + let args = vec!["rustfs"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 2); + verify_layout(&opt.volumes, |layout| { + assert!(!layout.legacy); + assert_eq!(layout.pools.len(), 2); + }); + }); + + // Test case 7: Nested ellipses pattern + with_env_var("RUSTFS_VOLUMES", 
"/data{0...3}/vol{0...4}", || { + let args = vec!["rustfs"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 1); + assert_eq!(opt.volumes[0], "/data{0...3}/vol{0...4}"); + + verify_layout(&opt.volumes, |layout| { + let total_drives = layout.get_set_count(0) * layout.get_drives_per_set(0); + assert_eq!(total_drives, 20, "Expected 20 drives from /data{{0...3}}/vol{{0...4}}"); + }); + }); + + // Test case 8: Multiple pools with nested ellipses + with_env_var("RUSTFS_VOLUMES", "/data{0...3}/vol{0...4} /data{4...7}/vol{0...4}", || { + let args = vec!["rustfs"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 2); + + verify_layout(&opt.volumes, |layout| { + assert_eq!(layout.pools.len(), 2); + let pool0_drives = layout.get_set_count(0) * layout.get_drives_per_set(0); + let pool1_drives = layout.get_set_count(1) * layout.get_drives_per_set(1); + assert_eq!(pool0_drives, 20, "Pool 0 should have 20 drives"); + assert_eq!(pool1_drives, 20, "Pool 1 should have 20 drives"); + }); + }); + + // Test case 9: Complex distributed pattern with multiple ellipses + with_env_var("RUSTFS_VOLUMES", "http://server{1...2}.local/disk{1...8}", || { + let args = vec!["rustfs"]; + let opt = Opt::parse_from(args); + verify_layout(&opt.volumes, |layout| { + let total_drives = layout.get_set_count(0) * layout.get_drives_per_set(0); + assert_eq!(total_drives, 16, "Expected 16 drives from server{{1...2}}/disk{{1...8}}"); + }); + }); + + // Test case 10: Zero-padded patterns + with_env_var("RUSTFS_VOLUMES", "/data/vol{01...16}", || { + let args = vec!["rustfs"]; + let opt = Opt::parse_from(args); + verify_layout(&opt.volumes, |layout| { + assert_eq!(layout.get_drives_per_set(0), 16); + }); + }); + + // Test case 11: Environment variable with additional CLI options + with_env_var("RUSTFS_VOLUMES", "/data/vol{1...8}", || { + let args = vec!["rustfs", "--address", ":9000", "--console-address", ":9001"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 1); + assert_eq!(opt.address, ":9000"); + assert_eq!(opt.console_address, ":9001"); + + verify_layout(&opt.volumes, |layout| { + assert_eq!(layout.get_drives_per_set(0), 8); + }); + }); + + // Test case 12: Command line argument overrides environment variable + with_env_var("RUSTFS_VOLUMES", "/data/vol1", || { + let args = vec!["rustfs", "/override/vol1"]; + let opt = Opt::parse_from(args); + assert_eq!(opt.volumes.len(), 1); + // CLI argument should override environment variable + assert_eq!(opt.volumes[0], "/override/vol1"); + }); + } + + /// Test boundary cases for path parsing. + /// NOTE: Current implementation uses space as delimiter, + /// which means paths with spaces are NOT supported. 
+ #[test] + #[serial] + #[allow(unsafe_code)] + fn test_volumes_boundary_cases() { + // Test case 1: Paths with spaces are not properly supported (known limitation) + // This test documents the current behavior - space-separated paths will be split + with_env_var("RUSTFS_VOLUMES", "/data/my disk/vol1", || { + let args = vec!["rustfs"]; + let opt = Opt::try_parse_from(args).expect("Failed to parse with spaces in path"); + // Current behavior: space causes split into 2 volumes + assert_eq!(opt.volumes.len(), 2, "Paths with spaces are split (known limitation)"); + assert_eq!(opt.volumes[0], "/data/my"); + assert_eq!(opt.volumes[1], "disk/vol1"); + }); + + // Test case 2: Empty environment variable causes parsing failure + // because volumes is required and NonEmptyStringValueParser filters empty strings + with_env_var("RUSTFS_VOLUMES", "", || { + let args = vec!["rustfs"]; + let result = Opt::try_parse_from(args); + // Should fail because no volumes provided (empty string filtered out) + assert!(result.is_err(), "Empty RUSTFS_VOLUMES should fail parsing (required field)"); + }); + + // Test case 2b: Multiple consecutive spaces create empty strings during splitting + // This causes parsing to fail because volumes is required and empty strings are invalid + with_env_var("RUSTFS_VOLUMES", "/data/vol1 /data/vol2", || { + let args = vec!["rustfs"]; + let result = Opt::try_parse_from(args); + // Should fail because double space creates an empty element + assert!(result.is_err(), "Multiple consecutive spaces should cause parsing failure"); + }); + + // Test case 3: Very long path with ellipses (stress test) + // Note: Large drive counts may be automatically split into multiple sets + let long_path = format!("/very/long/path/structure/with/many/directories/vol{{1...{}}}", 100); + with_env_var("RUSTFS_VOLUMES", &long_path, || { + let args = vec!["rustfs"]; + let opt = Opt::try_parse_from(args).expect("Failed to parse with long ellipses path"); + verify_layout(&opt.volumes, |layout| { + // Total drives should be 100, but may be distributed across sets + let total_drives = layout.get_set_count(0) * layout.get_drives_per_set(0); + assert_eq!(total_drives, 100, "Total drives should be 100"); + }); + }); + } + + /// Test error handling for invalid ellipses patterns. 
+ #[test] + fn test_invalid_ellipses_patterns() { + // Test case 1: Invalid ellipses format (letters instead of numbers) + let args = vec!["rustfs", "/data/vol{a...z}"]; + let opt = Opt::parse_from(args); + let result = DisksLayout::from_volumes(&opt.volumes); + assert!(result.is_err(), "Invalid ellipses pattern with letters should fail"); + + // Test case 2: Reversed range (larger to smaller) + let args = vec!["rustfs", "/data/vol{10...1}"]; + let opt = Opt::parse_from(args); + let result = DisksLayout::from_volumes(&opt.volumes); + // Depending on implementation, this may succeed with 0 drives or fail + // Document actual behavior + if let Ok(layout) = result { + assert!( + layout.is_empty_layout() || layout.get_drives_per_set(0) == 0, + "Reversed range should result in empty layout" + ); + } + } + + #[test] + fn test_server_domains_parsing() { + // Test case 1: server domains without ports + let args = vec![ + "rustfs", + "/data/vol1", + "--server-domains", + "example.com,api.example.com,cdn.example.com", + ]; + let opt = Opt::parse_from(args); + + assert_eq!(opt.server_domains.len(), 3); + assert_eq!(opt.server_domains[0], "example.com"); + assert_eq!(opt.server_domains[1], "api.example.com"); + assert_eq!(opt.server_domains[2], "cdn.example.com"); + + // Test case 2: server domains with ports + let args = vec![ + "rustfs", + "/data/vol1", + "--server-domains", + "example.com:9000,api.example.com:8080,cdn.example.com:443", + ]; + let opt = Opt::parse_from(args); + + assert_eq!(opt.server_domains.len(), 3); + assert_eq!(opt.server_domains[0], "example.com:9000"); + assert_eq!(opt.server_domains[1], "api.example.com:8080"); + assert_eq!(opt.server_domains[2], "cdn.example.com:443"); + + // Test case 3: mixed server domains (with and without ports) + let args = vec![ + "rustfs", + "/data/vol1", + "--server-domains", + "example.com,api.example.com:9000,cdn.example.com,storage.example.com:8443", + ]; + let opt = Opt::parse_from(args); + + assert_eq!(opt.server_domains.len(), 4); + assert_eq!(opt.server_domains[0], "example.com"); + assert_eq!(opt.server_domains[1], "api.example.com:9000"); + assert_eq!(opt.server_domains[2], "cdn.example.com"); + assert_eq!(opt.server_domains[3], "storage.example.com:8443"); + + // Test case 4: single domain with port + let args = vec!["rustfs", "/data/vol1", "--server-domains", "example.com:9000"]; + let opt = Opt::parse_from(args); + + assert_eq!(opt.server_domains.len(), 1); + assert_eq!(opt.server_domains[0], "example.com:9000"); + + // Test case 5: localhost with different ports + let args = vec![ + "rustfs", + "/data/vol1", + "--server-domains", + "localhost:9000,127.0.0.1:9000,localhost", + ]; + let opt = Opt::parse_from(args); + + assert_eq!(opt.server_domains.len(), 3); + assert_eq!(opt.server_domains[0], "localhost:9000"); + assert_eq!(opt.server_domains[1], "127.0.0.1:9000"); + assert_eq!(opt.server_domains[2], "localhost"); + } } diff --git a/rustfs/src/config/mod.rs b/rustfs/src/config/mod.rs index 1e553d89..14923522 100644 --- a/rustfs/src/config/mod.rs +++ b/rustfs/src/config/mod.rs @@ -13,6 +13,7 @@ // limitations under the License. use clap::Parser; +use clap::builder::NonEmptyStringValueParser; use const_str::concat; use std::string::ToString; shadow_rs::shadow!(build); @@ -50,7 +51,12 @@ const LONG_VERSION: &str = concat!( #[command(version = SHORT_VERSION, long_version = LONG_VERSION)] pub struct Opt { /// DIR points to a directory on a filesystem. 
- #[arg(required = true, env = "RUSTFS_VOLUMES")] + #[arg( + required = true, + env = "RUSTFS_VOLUMES", + value_delimiter = ' ', + value_parser = NonEmptyStringValueParser::new() + )] pub volumes: Vec, /// bind to a specific ADDRESS:PORT, ADDRESS can be an IP or hostname @@ -58,7 +64,12 @@ pub struct Opt { pub address: String, /// Domain name used for virtual-hosted-style requests. - #[arg(long, env = "RUSTFS_SERVER_DOMAINS")] + #[arg( + long, + env = "RUSTFS_SERVER_DOMAINS", + value_delimiter = ',', + value_parser = NonEmptyStringValueParser::new() + )] pub server_domains: Vec, /// Access key used for authentication. diff --git a/rustfs/src/init.rs b/rustfs/src/init.rs index 397829ea..1db6eca7 100644 --- a/rustfs/src/init.rs +++ b/rustfs/src/init.rs @@ -13,7 +13,8 @@ // limitations under the License. use crate::storage::ecfs::{process_lambda_configurations, process_queue_configurations, process_topic_configurations}; -use crate::{admin, config}; +use crate::{admin, config, version}; +use chrono::Datelike; use rustfs_config::{DEFAULT_UPDATE_CHECK, ENV_UPDATE_CHECK}; use rustfs_ecstore::bucket::metadata_sys; use rustfs_notify::notifier_global; @@ -23,6 +24,21 @@ use std::env; use std::io::Error; use tracing::{debug, error, info, instrument, warn}; +#[instrument] +pub(crate) fn print_server_info() { + let current_year = chrono::Utc::now().year(); + // Use custom macros to print server information + info!("RustFS Object Storage Server"); + info!("Copyright: 2024-{} RustFS, Inc", current_year); + info!("License: Apache-2.0 https://www.apache.org/licenses/LICENSE-2.0"); + info!("Version: {}", version::get_version()); + info!("Docs: https://rustfs.com/docs/"); +} + +/// Initialize the asynchronous update check system. +/// This function checks if update checking is enabled via +/// environment variable or default configuration. If enabled, +/// it spawns an asynchronous task to check for updates with a timeout. pub(crate) fn init_update_check() { let update_check_enable = env::var(ENV_UPDATE_CHECK) .unwrap_or_else(|_| DEFAULT_UPDATE_CHECK.to_string()) @@ -70,6 +86,12 @@ pub(crate) fn init_update_check() { }); } +/// Add existing bucket notification configurations to the global notifier system. +/// This function retrieves notification configurations for each bucket +/// and registers the corresponding event rules with the notifier system. +/// It processes queue, topic, and lambda configurations and maps them to event rules. +/// # Arguments +/// * `buckets` - A vector of bucket names to process #[instrument(skip_all)] pub(crate) async fn add_bucket_notification_configuration(buckets: Vec) { let region_opt = rustfs_ecstore::global::get_global_region(); @@ -128,6 +150,15 @@ pub(crate) async fn add_bucket_notification_configuration(buckets: Vec) } /// Initialize KMS system and configure if enabled +/// +/// This function initializes the global KMS service manager. If KMS is enabled +/// via command line options, it configures and starts the service accordingly. +/// If not enabled, it attempts to load any persisted KMS configuration from +/// cluster storage and starts the service if found. 
+/// # Arguments +/// * `opt` - The application configuration options +/// +/// Returns `std::io::Result<()>` indicating success or failure #[instrument(skip(opt))] pub(crate) async fn init_kms_system(opt: &config::Opt) -> std::io::Result<()> { // Initialize global KMS service manager (starts in NotConfigured state) diff --git a/rustfs/src/main.rs b/rustfs/src/main.rs index bdc93286..5b5a0d2e 100644 --- a/rustfs/src/main.rs +++ b/rustfs/src/main.rs @@ -16,9 +16,8 @@ mod admin; mod auth; mod config; mod error; -// mod grpc; mod init; -pub mod license; +mod license; mod profiling; mod server; mod storage; @@ -26,36 +25,35 @@ mod update; mod version; // Ensure the correct path for parse_license is imported -use crate::init::{add_bucket_notification_configuration, init_buffer_profile_system, init_kms_system, init_update_check}; +use crate::init::{ + add_bucket_notification_configuration, init_buffer_profile_system, init_kms_system, init_update_check, print_server_info, +}; use crate::server::{ - SHUTDOWN_TIMEOUT, ServiceState, ServiceStateManager, ShutdownSignal, init_event_notifier, shutdown_event_notifier, + SHUTDOWN_TIMEOUT, ServiceState, ServiceStateManager, ShutdownSignal, init_cert, init_event_notifier, shutdown_event_notifier, start_audit_system, start_http_server, stop_audit_system, wait_for_shutdown, }; -use chrono::Datelike; use clap::Parser; use license::init_license; -use rustfs_ahm::{ - Scanner, create_ahm_services_cancel_token, heal::storage::ECStoreHealStorage, init_heal_manager, - scanner::data_scanner::ScannerConfig, shutdown_ahm_services, -}; -use rustfs_common::globals::set_global_addr; -use rustfs_ecstore::bucket::metadata_sys::init_bucket_metadata_sys; -use rustfs_ecstore::bucket::replication::{GLOBAL_REPLICATION_POOL, init_background_replication}; -use rustfs_ecstore::config as ecconfig; -use rustfs_ecstore::config::GLOBAL_CONFIG_SYS; -use rustfs_ecstore::store_api::BucketOptions; +use rustfs_ahm::{create_ahm_services_cancel_token, heal::storage::ECStoreHealStorage, init_heal_manager, shutdown_ahm_services}; +use rustfs_common::{GlobalReadiness, SystemStage, set_global_addr}; use rustfs_ecstore::{ StorageAPI, + bucket::metadata_sys::init_bucket_metadata_sys, + bucket::replication::{GLOBAL_REPLICATION_POOL, init_background_replication}, + config as ecconfig, + config::GLOBAL_CONFIG_SYS, endpoints::EndpointServerPools, global::{set_global_rustfs_port, shutdown_background_services}, notification_sys::new_global_notification_sys, set_global_endpoints, store::ECStore, store::init_local_disks, + store_api::BucketOptions, update_erasure_type, }; use rustfs_iam::init_iam_sys; use rustfs_obs::{init_obs, set_global_guard}; +use rustfs_scanner::init_data_scanner; use rustfs_utils::net::parse_and_resolve_address; use std::io::{Error, Result}; use std::sync::Arc; @@ -70,25 +68,6 @@ static GLOBAL: tikv_jemallocator::Jemalloc = tikv_jemallocator::Jemalloc; #[global_allocator] static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc; -const LOGO: &str = r#" - -░█▀▄░█░█░█▀▀░▀█▀░█▀▀░█▀▀ -░█▀▄░█░█░▀▀█░░█░░█▀▀░▀▀█ -░▀░▀░▀▀▀░▀▀▀░░▀░░▀░░░▀▀▀ - -"#; - -#[instrument] -fn print_server_info() { - let current_year = chrono::Utc::now().year(); - // Use custom macros to print server information - info!("RustFS Object Storage Server"); - info!("Copyright: 2024-{} RustFS, Inc", current_year); - info!("License: Apache-2.0 https://www.apache.org/licenses/LICENSE-2.0"); - info!("Version: {}", version::get_version()); - info!("Docs: https://rustfs.com/docs/"); -} - fn main() -> Result<()> { let runtime = 
server::get_tokio_runtime_builder() .build() @@ -121,11 +100,16 @@ async fn async_main() -> Result<()> { } // print startup logo - info!("{}", LOGO); + info!("{}", server::LOGO); // Initialize performance profiling if enabled profiling::init_from_env().await; + // Initialize TLS if a certificate path is provided + if let Some(tls_path) = &opt.tls_path { + init_cert(tls_path).await + } + // Run parameters match run(opt).await { Ok(_) => Ok(()), @@ -139,6 +123,8 @@ async fn async_main() -> Result<()> { #[instrument(skip(opt))] async fn run(opt: config::Opt) -> Result<()> { debug!("opt: {:?}", &opt); + // 1. Initialize global readiness tracker + let readiness = Arc::new(GlobalReadiness::new()); if let Some(region) = &opt.region { rustfs_ecstore::global::set_global_region(region.clone()); @@ -210,14 +196,14 @@ async fn run(opt: config::Opt) -> Result<()> { let s3_shutdown_tx = { let mut s3_opt = opt.clone(); s3_opt.console_enable = false; - let s3_shutdown_tx = start_http_server(&s3_opt, state_manager.clone()).await?; + let s3_shutdown_tx = start_http_server(&s3_opt, state_manager.clone(), readiness.clone()).await?; Some(s3_shutdown_tx) }; let console_shutdown_tx = if opt.console_enable && !opt.console_address.is_empty() { let mut console_opt = opt.clone(); console_opt.address = console_opt.console_address.clone(); - let console_shutdown_tx = start_http_server(&console_opt, state_manager.clone()).await?; + let console_shutdown_tx = start_http_server(&console_opt, state_manager.clone(), readiness.clone()).await?; Some(console_shutdown_tx) } else { None @@ -232,6 +218,7 @@ async fn run(opt: config::Opt) -> Result<()> { let ctx = CancellationToken::new(); // init store + // 2. Start Storage Engine (ECStore) let store = ECStore::new(server_addr, endpoint_pools.clone(), ctx.clone()) .await .inspect_err(|err| { @@ -239,10 +226,20 @@ async fn run(opt: config::Opt) -> Result<()> { })?; ecconfig::init(); - // config system configuration - GLOBAL_CONFIG_SYS.init(store.clone()).await?; - // init replication_pool + // // Initialize global configuration system + let mut retry_count = 0; + while let Err(e) = GLOBAL_CONFIG_SYS.init(store.clone()).await { + error!("GLOBAL_CONFIG_SYS.init failed {:?}", e); + // TODO: check error type + retry_count += 1; + if retry_count > 15 { + return Err(Error::other("GLOBAL_CONFIG_SYS.init failed")); + } + tokio::time::sleep(tokio::time::Duration::from_secs(1)).await; + } + readiness.mark_stage(SystemStage::StorageReady); + // init replication_pool init_background_replication(store.clone()).await; // Initialize KMS system if enabled init_kms_system(&opt).await?; @@ -275,7 +272,10 @@ async fn run(opt: config::Opt) -> Result<()> { init_bucket_metadata_sys(store.clone(), buckets.clone()).await; + // 3. 
Initialize IAM System (Blocking load) + // This ensures data is in memory before moving forward init_iam_sys(store.clone()).await.map_err(Error::other)?; + readiness.mark_stage(SystemStage::IamReady); add_bucket_notification_configuration(buckets.clone()).await; @@ -301,23 +301,29 @@ async fn run(opt: config::Opt) -> Result<()> { // Initialize heal manager and scanner based on environment variables if enable_heal || enable_scanner { - if enable_heal { - // Initialize heal manager with channel processor - let heal_storage = Arc::new(ECStoreHealStorage::new(store.clone())); - let heal_manager = init_heal_manager(heal_storage, None).await?; + let heal_storage = Arc::new(ECStoreHealStorage::new(store.clone())); - if enable_scanner { - info!(target: "rustfs::main::run","Starting scanner with heal manager..."); - let scanner = Scanner::new(Some(ScannerConfig::default()), Some(heal_manager)); - scanner.start().await?; - } else { - info!(target: "rustfs::main::run","Scanner disabled, but heal manager is initialized and available"); - } - } else if enable_scanner { - info!("Starting scanner without heal manager..."); - let scanner = Scanner::new(Some(ScannerConfig::default()), None); - scanner.start().await?; - } + init_heal_manager(heal_storage, None).await?; + + init_data_scanner(ctx.clone(), store.clone()).await; + + // if enable_heal { + // // Initialize heal manager with channel processor + // let heal_storage = Arc::new(ECStoreHealStorage::new(store.clone())); + // let heal_manager = init_heal_manager(heal_storage, None).await?; + + // if enable_scanner { + // info!(target: "rustfs::main::run","Starting scanner with heal manager..."); + // let scanner = Scanner::new(Some(ScannerConfig::default()), Some(heal_manager)); + // scanner.start().await?; + // } else { + // info!(target: "rustfs::main::run","Scanner disabled, but heal manager is initialized and available"); + // } + // } else if enable_scanner { + // info!("Starting scanner without heal manager..."); + // let scanner = Scanner::new(Some(ScannerConfig::default()), None); + // scanner.start().await?; + // } } else { info!(target: "rustfs::main::run","Both scanner and heal are disabled, skipping AHM service initialization"); } @@ -327,6 +333,15 @@ async fn run(opt: config::Opt) -> Result<()> { init_update_check(); + println!( + "RustFS server started successfully at {}, current time: {}", + &server_address, + chrono::offset::Utc::now().to_string() + ); + info!(target: "rustfs::main::run","server started successfully at {}", &server_address); + // 4. Mark as Full Ready now that critical components are warm + readiness.mark_stage(SystemStage::FullReady); + // Perform hibernation for 1 second tokio::time::sleep(SHUTDOWN_TIMEOUT).await; // listen to the shutdown signal diff --git a/rustfs/src/profiling.rs b/rustfs/src/profiling.rs index e237330c..a4c5ce8b 100644 --- a/rustfs/src/profiling.rs +++ b/rustfs/src/profiling.rs @@ -38,8 +38,7 @@ fn get_platform_info() -> (String, String, String) { pub async fn dump_cpu_pprof_for(_duration: std::time::Duration) -> Result { let (target_os, target_env, target_arch) = get_platform_info(); let msg = format!( - "CPU profiling is not supported on this platform. target_os={}, target_env={}, target_arch={}", - target_os, target_env, target_arch + "CPU profiling is not supported on this platform. 
target_os={target_os}, target_env={target_env}, target_arch={target_arch}" ); Err(msg) } @@ -48,8 +47,7 @@ pub async fn dump_cpu_pprof_for(_duration: std::time::Duration) -> Result Result { let (target_os, target_env, target_arch) = get_platform_info(); let msg = format!( - "Memory profiling is not supported on this platform. target_os={}, target_env={}, target_arch={}", - target_os, target_env, target_arch + "Memory profiling is not supported on this platform. target_os={target_os}, target_env={target_env}, target_arch={target_arch}" ); Err(msg) } diff --git a/rustfs/src/server/audit.rs b/rustfs/src/server/audit.rs index 2a81af15..144f7446 100644 --- a/rustfs/src/server/audit.rs +++ b/rustfs/src/server/audit.rs @@ -12,8 +12,7 @@ // See the License for the specific language governing permissions and // limitations under the License. -use rustfs_audit::system::AuditSystemState; -use rustfs_audit::{AuditError, AuditResult, audit_system, init_audit_system}; +use rustfs_audit::{AuditError, AuditResult, audit_system, init_audit_system, system::AuditSystemState}; use rustfs_config::DEFAULT_DELIMITER; use rustfs_ecstore::config::GLOBAL_SERVER_CONFIG; use tracing::{info, warn}; @@ -69,7 +68,9 @@ pub(crate) async fn start_audit_system() -> AuditResult<()> { mqtt_config.is_some(), webhook_config.is_some() ); + // 3. Initialize and start the audit system let system = init_audit_system(); + // Check if the audit system is already running let state = system.get_state().await; if state == AuditSystemState::Running { warn!( diff --git a/rustfs/src/server/cert.rs b/rustfs/src/server/cert.rs new file mode 100644 index 00000000..93013be0 --- /dev/null +++ b/rustfs/src/server/cert.rs @@ -0,0 +1,160 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +use rustfs_common::set_global_root_cert; +use rustfs_config::{RUSTFS_CA_CERT, RUSTFS_PUBLIC_CERT, RUSTFS_TLS_CERT}; +use tracing::{debug, info}; + +/// Initialize TLS certificates for inter-node communication. +/// This function attempts to load certificates from the specified `tls_path`. +/// It looks for `rustfs_cert.pem`, `public.crt`, and `ca.crt` files. +/// Additionally, it tries to load system root certificates from common locations +/// to ensure trust for public CAs when mixing self-signed and public certificates. +/// If any certificates are found, they are set as the global root certificates. 
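A minimal sketch of how `init_cert` is wired at startup, mirroring the `main.rs` hunk earlier in this diff; the `opt.tls_path` option is taken from that hunk, and everything else only restates what the `cert.rs` code below does:

```rust
// Editorial sketch, not part of the diff. Certificate loading is opt-in:
// it only runs when a TLS directory was supplied via --tls-path / its env variable.
if let Some(tls_path) = &opt.tls_path {
    // Looks for rustfs_cert.pem (also checking one level of subdirectories),
    // then public.crt and ca.crt at the top level, optionally appends a system
    // CA bundle depending on RUSTFS_TRUST_SYSTEM_CA, and installs the combined
    // PEM data as the global root certificates used for inter-node TLS.
    init_cert(tls_path).await;
}
```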
+pub(crate) async fn init_cert(tls_path: &str) { + let mut cert_data = Vec::new(); + + // Try rustfs_cert.pem (custom cert name) + walk_dir(std::path::PathBuf::from(tls_path), RUSTFS_TLS_CERT, &mut cert_data).await; + + // Try public.crt (common CA name) + let public_cert_path = std::path::Path::new(tls_path).join(RUSTFS_PUBLIC_CERT); + load_cert_file(public_cert_path.to_str().unwrap_or_default(), &mut cert_data, "CA certificate").await; + + // Try ca.crt (common CA name) + let ca_cert_path = std::path::Path::new(tls_path).join(RUSTFS_CA_CERT); + load_cert_file(ca_cert_path.to_str().unwrap_or_default(), &mut cert_data, "CA certificate").await; + + let trust_system_ca = rustfs_utils::get_env_bool(rustfs_config::ENV_TRUST_SYSTEM_CA, rustfs_config::DEFAULT_TRUST_SYSTEM_CA); + if !trust_system_ca { + // Attempt to load system root certificates to maintain trust for public CAs + // This is important when mixing self-signed internal certs with public external certs + let system_ca_paths = [ + "/etc/ssl/certs/ca-certificates.crt", // Debian/Ubuntu/Alpine + "/etc/pki/tls/certs/ca-bundle.crt", // Fedora/RHEL/CentOS + "/etc/ssl/ca-bundle.pem", // OpenSUSE + "/etc/pki/tls/cacert.pem", // OpenELEC + "/etc/ssl/cert.pem", // macOS/FreeBSD + "/usr/local/etc/openssl/cert.pem", // macOS/Homebrew OpenSSL + "/usr/local/share/certs/ca-root-nss.crt", // FreeBSD + "/etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem", // RHEL + "/usr/share/pki/ca-trust-legacy/ca-bundle.legacy.crt", // RHEL legacy + ]; + + let mut system_cert_loaded = false; + for path in system_ca_paths { + if load_cert_file(path, &mut cert_data, "system root certificates").await { + system_cert_loaded = true; + info!("Loaded system root certificates from {}", path); + break; // Stop after finding the first valid bundle + } + } + + if !system_cert_loaded { + debug!("Could not find system root certificates in common locations."); + } + } else { + info!("Loading system root certificates disabled via RUSTFS_TRUST_SYSTEM_CA"); + } + if !cert_data.is_empty() { + set_global_root_cert(cert_data).await; + info!("Configured custom root certificates for inter-node communication"); + } +} + +/// Helper function to load a certificate file and append to cert_data. +/// Returns true if the file was successfully loaded. +async fn load_cert_file(path: &str, cert_data: &mut Vec, desc: &str) -> bool { + if tokio::fs::metadata(path).await.is_ok() { + if let Ok(data) = tokio::fs::read(path).await { + cert_data.extend(data); + cert_data.push(b'\n'); + info!("Loaded {} from {}", desc, path); + true + } else { + debug!("Failed to read {} from {}", desc, path); + false + } + } else { + debug!("{} file not found at {}", desc, path); + false + } +} + +/// Load the certificate file if its name matches `cert_name`. +/// If it matches, the certificate data is appended to `cert_data`. +/// +/// # Parameters +/// - `entry`: The directory entry to check. +/// - `cert_name`: The name of the certificate file to match. +/// - `cert_data`: A mutable vector to append loaded certificate data. +async fn load_if_matches(entry: &tokio::fs::DirEntry, cert_name: &str, cert_data: &mut Vec) { + let fname = entry.file_name().to_string_lossy().to_string(); + if fname == cert_name { + let p = entry.path(); + load_cert_file(&p.to_string_lossy(), cert_data, "certificate").await; + } +} + +/// Search the directory at `path` and one level of subdirectories to find and load +/// certificates matching `cert_name`. Loaded certificate data is appended to +/// `cert_data`. 
+/// # Parameters +/// - `path`: The starting directory path to search for certificates. +/// - `cert_name`: The name of the certificate file to look for. +/// - `cert_data`: A mutable vector to append loaded certificate data. +async fn walk_dir(path: std::path::PathBuf, cert_name: &str, cert_data: &mut Vec) { + if let Ok(mut rd) = tokio::fs::read_dir(&path).await { + while let Ok(Some(entry)) = rd.next_entry().await { + if let Ok(ft) = entry.file_type().await { + if ft.is_file() { + load_if_matches(&entry, cert_name, cert_data).await; + } else if ft.is_dir() { + // Only check direct subdirectories, no deeper recursion + if let Ok(mut sub_rd) = tokio::fs::read_dir(&entry.path()).await { + while let Ok(Some(sub_entry)) = sub_rd.next_entry().await { + if let Ok(sub_ft) = sub_entry.file_type().await { + if sub_ft.is_file() { + load_if_matches(&sub_entry, cert_name, cert_data).await; + } + // Ignore subdirectories and symlinks in subdirs to limit to one level + } + } + } + } else if ft.is_symlink() { + // Follow symlink and treat target as file or directory, but limit to one level + if let Ok(meta) = tokio::fs::metadata(&entry.path()).await { + if meta.is_file() { + load_if_matches(&entry, cert_name, cert_data).await; + } else if meta.is_dir() { + // Treat as directory but only check its direct contents + if let Ok(mut sub_rd) = tokio::fs::read_dir(&entry.path()).await { + while let Ok(Some(sub_entry)) = sub_rd.next_entry().await { + if let Ok(sub_ft) = sub_entry.file_type().await { + if sub_ft.is_file() { + load_if_matches(&sub_entry, cert_name, cert_data).await; + } + // Ignore deeper levels + } + } + } + } + } + } + } + } + } else { + debug!("Certificate directory not found: {}", path.display()); + } +} diff --git a/rustfs/src/server/compress.rs b/rustfs/src/server/compress.rs new file mode 100644 index 00000000..9276869f --- /dev/null +++ b/rustfs/src/server/compress.rs @@ -0,0 +1,479 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +//! HTTP Response Compression Module +//! +//! This module provides configurable HTTP response compression functionality +//! using a whitelist-based approach. Unlike traditional blacklist approaches, +//! this design only compresses explicitly configured content types, which: +//! +//! 1. Preserves Content-Length for all other responses (better browser UX) +//! 2. Aligns with MinIO's opt-in compression behavior +//! 3. Provides fine-grained control over what gets compressed +//! +//! # Configuration +//! +//! Compression can be configured via environment variables or command line options: +//! +//! - `RUSTFS_COMPRESS_ENABLE` - Enable/disable compression (default: off) +//! - `RUSTFS_COMPRESS_EXTENSIONS` - File extensions to compress (e.g., `.txt,.log,.csv`) +//! - `RUSTFS_COMPRESS_MIME_TYPES` - MIME types to compress (e.g., `text/*,application/json`) +//! - `RUSTFS_COMPRESS_MIN_SIZE` - Minimum file size for compression (default: 1000 bytes) +//! +//! # Example +//! +//! ```bash +//! RUSTFS_COMPRESS_ENABLE=on \ +//! 
RUSTFS_COMPRESS_EXTENSIONS=.txt,.log,.csv \ +//! RUSTFS_COMPRESS_MIME_TYPES=text/*,application/json \ +//! RUSTFS_COMPRESS_MIN_SIZE=1000 \ +//! rustfs /data +//! ``` + +use http::Response; +use rustfs_config::{ + DEFAULT_COMPRESS_ENABLE, DEFAULT_COMPRESS_EXTENSIONS, DEFAULT_COMPRESS_MIME_TYPES, DEFAULT_COMPRESS_MIN_SIZE, + ENV_COMPRESS_ENABLE, ENV_COMPRESS_EXTENSIONS, ENV_COMPRESS_MIME_TYPES, ENV_COMPRESS_MIN_SIZE, EnableState, +}; +use std::str::FromStr; +use tower_http::compression::predicate::Predicate; +use tracing::debug; + +/// Configuration for HTTP response compression. +/// +/// This structure holds the whitelist-based compression settings: +/// - File extensions that should be compressed (checked via Content-Disposition header) +/// - MIME types that should be compressed (supports wildcards like `text/*`) +/// - Minimum file size threshold for compression +/// +/// When compression is enabled, only responses matching these criteria will be compressed. +/// This approach aligns with MinIO's behavior where compression is opt-in rather than default. +#[derive(Clone, Debug)] +pub struct CompressionConfig { + /// Whether compression is enabled + pub enabled: bool, + /// File extensions to compress (normalized to lowercase with leading dot) + pub extensions: Vec, + /// MIME type patterns to compress (supports wildcards like `text/*`) + pub mime_patterns: Vec, + /// Minimum file size (in bytes) for compression + pub min_size: u64, +} + +impl CompressionConfig { + /// Create a new compression configuration from environment variables + /// + /// Reads the following environment variables: + /// - `RUSTFS_COMPRESS_ENABLE` - Enable/disable compression (default: false) + /// - `RUSTFS_COMPRESS_EXTENSIONS` - File extensions to compress (default: "") + /// - `RUSTFS_COMPRESS_MIME_TYPES` - MIME types to compress (default: "text/*,application/json,...") + /// - `RUSTFS_COMPRESS_MIN_SIZE` - Minimum file size for compression (default: 1000) + pub fn from_env() -> Self { + // Read compression enable state + let enabled = std::env::var(ENV_COMPRESS_ENABLE) + .ok() + .and_then(|v| EnableState::from_str(&v).ok()) + .map(|state| state.is_enabled()) + .unwrap_or(DEFAULT_COMPRESS_ENABLE); + + // Read file extensions + let extensions_str = std::env::var(ENV_COMPRESS_EXTENSIONS).unwrap_or_else(|_| DEFAULT_COMPRESS_EXTENSIONS.to_string()); + let extensions: Vec = if extensions_str.is_empty() { + Vec::new() + } else { + extensions_str + .split(',') + .map(|s| { + let s = s.trim().to_lowercase(); + if s.starts_with('.') { s } else { format!(".{s}") } + }) + .filter(|s| s.len() > 1) + .collect() + }; + + // Read MIME type patterns + let mime_types_str = std::env::var(ENV_COMPRESS_MIME_TYPES).unwrap_or_else(|_| DEFAULT_COMPRESS_MIME_TYPES.to_string()); + let mime_patterns: Vec = if mime_types_str.is_empty() { + Vec::new() + } else { + mime_types_str + .split(',') + .map(|s| s.trim().to_lowercase()) + .filter(|s| !s.is_empty()) + .collect() + }; + + // Read minimum file size + let min_size = std::env::var(ENV_COMPRESS_MIN_SIZE) + .ok() + .and_then(|v| v.parse::().ok()) + .unwrap_or(DEFAULT_COMPRESS_MIN_SIZE); + + Self { + enabled, + extensions, + mime_patterns, + min_size, + } + } + + /// Check if a MIME type matches any of the configured patterns + pub(crate) fn matches_mime_type(&self, content_type: &str) -> bool { + let ct_lower = content_type.to_lowercase(); + // Extract the main MIME type (before any parameters like charset) + let main_type = ct_lower.split(';').next().unwrap_or(&ct_lower).trim(); + + for 
pattern in &self.mime_patterns { + if pattern.ends_with("/*") { + // Wildcard pattern like "text/*" + let prefix = &pattern[..pattern.len() - 1]; // "text/" + if main_type.starts_with(prefix) { + return true; + } + } else if main_type == pattern { + // Exact match + return true; + } + } + false + } + + /// Check if a filename matches any of the configured extensions + /// The filename is extracted from Content-Disposition header + pub(crate) fn matches_extension(&self, filename: &str) -> bool { + if self.extensions.is_empty() { + return false; + } + + let filename_lower = filename.to_lowercase(); + for ext in &self.extensions { + if filename_lower.ends_with(ext) { + return true; + } + } + false + } + + /// Extract filename from Content-Disposition header + /// Format: attachment; filename="example.txt" or attachment; filename=example.txt + pub(crate) fn extract_filename_from_content_disposition(header_value: &str) -> Option { + // Look for filename= or filename*= parameter + let lower = header_value.to_lowercase(); + + // Try to find filename="..." or filename=... + if let Some(idx) = lower.find("filename=") { + let start = idx + "filename=".len(); + let rest = &header_value[start..]; + + // Check if it's quoted + if let Some(stripped) = rest.strip_prefix('"') { + // Find closing quote + if let Some(end_quote) = stripped.find('"') { + return Some(stripped[..end_quote].to_string()); + } + } else { + // Unquoted - take until semicolon or end + let end = rest.find(';').unwrap_or(rest.len()); + return Some(rest[..end].trim().to_string()); + } + } + + None + } +} + +impl Default for CompressionConfig { + fn default() -> Self { + Self { + enabled: rustfs_config::DEFAULT_COMPRESS_ENABLE, + extensions: rustfs_config::DEFAULT_COMPRESS_EXTENSIONS + .split(',') + .filter_map(|s| { + let s = s.trim().to_lowercase(); + if s.is_empty() { + None + } else if s.starts_with('.') { + Some(s) + } else { + Some(format!(".{s}")) + } + }) + .collect(), + mime_patterns: rustfs_config::DEFAULT_COMPRESS_MIME_TYPES + .split(',') + .map(|s| s.trim().to_lowercase()) + .filter(|s| !s.is_empty()) + .collect(), + min_size: rustfs_config::DEFAULT_COMPRESS_MIN_SIZE, + } + } +} + +/// Predicate to determine if a response should be compressed. +/// +/// This predicate implements a whitelist-based compression approach: +/// - Only compresses responses that match configured file extensions OR MIME types +/// - Respects minimum file size threshold +/// - Always skips error responses (4xx, 5xx) to avoid Content-Length issues +/// +/// # Design Philosophy +/// Unlike the previous blacklist approach, this whitelist approach: +/// 1. Only compresses explicitly configured content types +/// 2. Preserves Content-Length for all other responses (better browser UX) +/// 3. Aligns with MinIO's opt-in compression behavior +/// +/// # Note on tower-http Integration +/// The `tower-http::CompressionLayer` automatically handles: +/// - Skipping responses with `Content-Encoding` header (already compressed) +/// - Skipping responses with `Content-Range` header (Range requests) +/// +/// These checks are performed before calling this predicate, so we don't need to check them here. +/// +/// # Extension Matching +/// File extension matching works by extracting the filename from the +/// `Content-Disposition` response header (e.g., `attachment; filename="file.txt"`). +/// +/// # Performance +/// This predicate is evaluated per-response and has O(n) complexity where n is +/// the number of configured extensions/MIME patterns. 
+#[derive(Clone, Debug)] +pub struct CompressionPredicate { + config: CompressionConfig, +} + +impl CompressionPredicate { + /// Create a new compression predicate with the given configuration + pub fn new(config: CompressionConfig) -> Self { + Self { config } + } +} + +impl Predicate for CompressionPredicate { + fn should_compress(&self, response: &Response) -> bool + where + B: http_body::Body, + { + // If compression is disabled, never compress + if !self.config.enabled { + return false; + } + + let status = response.status(); + + // Never compress error responses (4xx and 5xx status codes) + // This prevents Content-Length mismatch issues with error responses + if status.is_client_error() || status.is_server_error() { + debug!("Skipping compression for error response: status={}", status.as_u16()); + return false; + } + + // Note: CONTENT_ENCODING and CONTENT_RANGE checks are handled by tower-http's + // CompressionLayer before calling this predicate, so we don't need to check them here. + + // Check Content-Length header for minimum size threshold + if let Some(content_length) = response.headers().get(http::header::CONTENT_LENGTH) { + if let Ok(length_str) = content_length.to_str() { + if let Ok(length) = length_str.parse::() { + if length < self.config.min_size { + debug!( + "Skipping compression for small response: size={} bytes, min_size={}", + length, self.config.min_size + ); + return false; + } + } + } + } + + // Check if the response matches configured extension via Content-Disposition + if let Some(content_disposition) = response.headers().get(http::header::CONTENT_DISPOSITION) { + if let Ok(cd) = content_disposition.to_str() { + if let Some(filename) = CompressionConfig::extract_filename_from_content_disposition(cd) { + if self.config.matches_extension(&filename) { + debug!("Compressing response: filename '{}' matches configured extension", filename); + return true; + } + } + } + } + + // Check if the response matches configured MIME type + if let Some(content_type) = response.headers().get(http::header::CONTENT_TYPE) { + if let Ok(ct) = content_type.to_str() { + if self.config.matches_mime_type(ct) { + debug!("Compressing response: Content-Type '{}' matches configured MIME pattern", ct); + return true; + } + } + } + + // Default: don't compress (whitelist approach) + debug!("Skipping compression: response does not match any configured extension or MIME type"); + false + } +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_compression_config_default() { + let config = CompressionConfig::default(); + assert!(!config.enabled); + assert!(config.extensions.is_empty()); + assert!(!config.mime_patterns.is_empty()); + assert_eq!(config.min_size, 1000); + } + + #[test] + fn test_compression_config_mime_matching() { + let config = CompressionConfig { + enabled: true, + extensions: vec![], + mime_patterns: vec!["text/*".to_string(), "application/json".to_string()], + min_size: 1000, + }; + + // Test wildcard matching + assert!(config.matches_mime_type("text/plain")); + assert!(config.matches_mime_type("text/html")); + assert!(config.matches_mime_type("text/css")); + assert!(config.matches_mime_type("TEXT/PLAIN")); // case insensitive + + // Test exact matching + assert!(config.matches_mime_type("application/json")); + assert!(config.matches_mime_type("application/json; charset=utf-8")); + + // Test non-matching types + assert!(!config.matches_mime_type("image/png")); + assert!(!config.matches_mime_type("application/octet-stream")); + 
assert!(!config.matches_mime_type("video/mp4")); + } + + #[test] + fn test_compression_config_extension_matching() { + let config = CompressionConfig { + enabled: true, + extensions: vec![".txt".to_string(), ".log".to_string(), ".csv".to_string()], + mime_patterns: vec![], + min_size: 1000, + }; + + // Test matching extensions + assert!(config.matches_extension("file.txt")); + assert!(config.matches_extension("path/to/file.log")); + assert!(config.matches_extension("data.csv")); + assert!(config.matches_extension("FILE.TXT")); // case insensitive + + // Test non-matching extensions + assert!(!config.matches_extension("image.png")); + assert!(!config.matches_extension("archive.zip")); + assert!(!config.matches_extension("document.pdf")); + } + + #[test] + fn test_extract_filename_from_content_disposition() { + // Quoted filename + assert_eq!( + CompressionConfig::extract_filename_from_content_disposition(r#"attachment; filename="example.txt""#), + Some("example.txt".to_string()) + ); + + // Unquoted filename + assert_eq!( + CompressionConfig::extract_filename_from_content_disposition("attachment; filename=example.log"), + Some("example.log".to_string()) + ); + + // Filename with path + assert_eq!( + CompressionConfig::extract_filename_from_content_disposition(r#"attachment; filename="path/to/file.csv""#), + Some("path/to/file.csv".to_string()) + ); + + // Mixed case + assert_eq!( + CompressionConfig::extract_filename_from_content_disposition(r#"Attachment; FILENAME="test.json""#), + Some("test.json".to_string()) + ); + + // No filename + assert_eq!(CompressionConfig::extract_filename_from_content_disposition("inline"), None); + } + + #[test] + fn test_compression_config_from_empty_strings() { + // Simulate config with empty extension and mime strings + let config = CompressionConfig { + enabled: true, + extensions: "" + .split(',') + .map(|s| s.trim().to_lowercase()) + .filter(|s| !s.is_empty()) + .collect(), + mime_patterns: "" + .split(',') + .map(|s| s.trim().to_lowercase()) + .filter(|s| !s.is_empty()) + .collect(), + min_size: 1000, + }; + + assert!(config.extensions.is_empty()); + assert!(config.mime_patterns.is_empty()); + assert!(!config.matches_extension("file.txt")); + assert!(!config.matches_mime_type("text/plain")); + } + + #[test] + fn test_compression_config_extension_normalization() { + // Extensions should be normalized with leading dot + let extensions: Vec = "txt,.log,csv" + .split(',') + .map(|s| { + let s = s.trim().to_lowercase(); + if s.starts_with('.') { s } else { format!(".{s}") } + }) + .filter(|s| s.len() > 1) + .collect(); + + assert_eq!(extensions, vec![".txt", ".log", ".csv"]); + } + + #[test] + fn test_compression_predicate_creation() { + // Test that CompressionPredicate can be created with various configs + let config_disabled = CompressionConfig { + enabled: false, + extensions: vec![".txt".to_string()], + mime_patterns: vec!["text/*".to_string()], + min_size: 0, + }; + let predicate = CompressionPredicate::new(config_disabled.clone()); + assert!(!predicate.config.enabled); + + let config_enabled = CompressionConfig { + enabled: true, + extensions: vec![".txt".to_string(), ".log".to_string()], + mime_patterns: vec!["text/*".to_string(), "application/json".to_string()], + min_size: 1000, + }; + let predicate = CompressionPredicate::new(config_enabled.clone()); + assert!(predicate.config.enabled); + assert_eq!(predicate.config.extensions.len(), 2); + assert_eq!(predicate.config.mime_patterns.len(), 2); + assert_eq!(predicate.config.min_size, 1000); + } +} 
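Before the `http.rs` hunk below, a short editorial sketch of how the whitelist types defined in `compress.rs` behave; it uses only the `CompressionConfig` / `CompressionPredicate` API shown above and mirrors the unit tests in that file (assumes it runs inside the same crate so the `pub(crate)` items are in scope):

```rust
// Editorial sketch, not part of the diff.
#[test]
fn whitelist_predicate_walkthrough() {
    let config = CompressionConfig {
        enabled: true,
        extensions: vec![".log".to_string()],
        mime_patterns: vec!["text/*".to_string(), "application/json".to_string()],
        min_size: 1000,
    };

    // MIME matching: exact entries plus trailing "/*" wildcards; parameters such as
    // "; charset=utf-8" are stripped before the comparison.
    assert!(config.matches_mime_type("text/plain; charset=utf-8"));
    assert!(!config.matches_mime_type("application/octet-stream"));

    // Extension matching runs against the filename carried in Content-Disposition.
    let filename = CompressionConfig::extract_filename_from_content_disposition(
        r#"attachment; filename="access.log""#,
    )
    .expect("filename present");
    assert!(config.matches_extension(&filename));

    // http.rs below plugs the predicate into tower-http as:
    // CompressionLayer::new().compress_when(CompressionPredicate::new(config))
}
```

The same `http.rs` hunk also threads a `GlobalReadiness` handle into every connection and inserts a `ReadinessGateLayer` ahead of the business logic; the staged startup it gates on is the one marked in the `main.rs` hunk earlier in this diff, roughly:

```rust
// Editorial sketch of the staged readiness flow, condensed from main.rs above.
let readiness = Arc::new(GlobalReadiness::new()); // created before the HTTP servers start
readiness.mark_stage(SystemStage::StorageReady);  // after GLOBAL_CONFIG_SYS.init succeeds
readiness.mark_stage(SystemStage::IamReady);      // after init_iam_sys completes
readiness.mark_stage(SystemStage::FullReady);     // once startup has finished
// Requests arriving earlier are stopped by ReadinessGateLayer::new(readiness) in http.rs.
```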
diff --git a/rustfs/src/server/http.rs b/rustfs/src/server/http.rs index 521c2b06..53a03bca 100644 --- a/rustfs/src/server/http.rs +++ b/rustfs/src/server/http.rs @@ -13,10 +13,11 @@ // limitations under the License. // Ensure the correct path for parse_license is imported +use super::compress::{CompressionConfig, CompressionPredicate}; use crate::admin; use crate::auth::IAMAuth; use crate::config; -use crate::server::{ServiceState, ServiceStateManager, hybrid::hybrid, layer::RedirectLayer}; +use crate::server::{ReadinessGateLayer, ServiceState, ServiceStateManager, hybrid::hybrid, layer::RedirectLayer}; use crate::storage; use crate::storage::tonic_service::make_server; use bytes::Bytes; @@ -28,6 +29,7 @@ use hyper_util::{ service::TowerToHyperService, }; use metrics::{counter, histogram}; +use rustfs_common::GlobalReadiness; use rustfs_config::{DEFAULT_ACCESS_KEY, DEFAULT_SECRET_KEY, MI_B, RUSTFS_TLS_CERT, RUSTFS_TLS_KEY}; use rustfs_protos::proto_gen::node_service::node_service_server::NodeServiceServer; use rustfs_utils::net::parse_and_resolve_address; @@ -43,7 +45,7 @@ use tokio_rustls::TlsAcceptor; use tonic::{Request, Status, metadata::MetadataValue}; use tower::ServiceBuilder; use tower_http::catch_panic::CatchPanicLayer; -use tower_http::compression::{CompressionLayer, predicate::Predicate}; +use tower_http::compression::CompressionLayer; use tower_http::cors::{AllowOrigin, Any, CorsLayer}; use tower_http::request_id::{MakeRequestUuid, PropagateRequestIdLayer, SetRequestIdLayer}; use tower_http::trace::TraceLayer; @@ -108,63 +110,10 @@ fn get_cors_allowed_origins() -> String { .unwrap_or(rustfs_config::DEFAULT_CONSOLE_CORS_ALLOWED_ORIGINS.to_string()) } -/// Predicate to determine if a response should be compressed. -/// -/// This predicate implements intelligent compression selection to avoid issues -/// with error responses and small payloads. It excludes: -/// - Client error responses (4xx status codes) - typically small XML/JSON error messages -/// - Server error responses (5xx status codes) - ensures error details are preserved -/// - Very small responses (< 256 bytes) - compression overhead outweighs benefits -/// -/// # Rationale -/// The CompressionLayer can cause Content-Length header mismatches with error responses, -/// particularly when the s3s library generates XML error responses (~119 bytes for NoSuchKey). -/// By excluding these responses from compression, we ensure: -/// 1. Error responses are sent with accurate Content-Length headers -/// 2. Clients receive complete error bodies without truncation -/// 3. Small responses avoid compression overhead -/// -/// # Performance -/// This predicate is evaluated per-response and has O(1) complexity. 
-#[derive(Clone, Copy, Debug)] -struct ShouldCompress; - -impl Predicate for ShouldCompress { - fn should_compress(&self, response: &Response) -> bool - where - B: http_body::Body, - { - let status = response.status(); - - // Never compress error responses (4xx and 5xx status codes) - // This prevents Content-Length mismatch issues with error responses - if status.is_client_error() || status.is_server_error() { - debug!("Skipping compression for error response: status={}", status.as_u16()); - return false; - } - - // Check Content-Length header to avoid compressing very small responses - // Responses smaller than 256 bytes typically don't benefit from compression - // and may actually increase in size due to compression overhead - if let Some(content_length) = response.headers().get(http::header::CONTENT_LENGTH) { - if let Ok(length_str) = content_length.to_str() { - if let Ok(length) = length_str.parse::() { - if length < 256 { - debug!("Skipping compression for small response: size={} bytes", length); - return false; - } - } - } - } - - // Compress successful responses with sufficient size - true - } -} - pub async fn start_http_server( opt: &config::Opt, worker_state_manager: ServiceStateManager, + readiness: Arc, ) -> Result> { let server_addr = parse_and_resolve_address(opt.address.as_str()).map_err(Error::other)?; let server_port = server_addr.port(); @@ -172,16 +121,26 @@ pub async fn start_http_server( // The listening address and port are obtained from the parameters let listener = { let mut server_addr = server_addr; - let mut socket = socket2::Socket::new( + + // Try to create a socket for the address family; if that fails, fallback to IPv4. + let mut socket = match socket2::Socket::new( socket2::Domain::for_address(server_addr), socket2::Type::STREAM, Some(socket2::Protocol::TCP), - )?; + ) { + Ok(s) => s, + Err(e) => { + warn!("Failed to create socket for {:?}: {}, falling back to IPv4", server_addr, e); + let ipv4_addr = SocketAddr::new(std::net::Ipv4Addr::UNSPECIFIED.into(), server_addr.port()); + server_addr = ipv4_addr; + socket2::Socket::new(socket2::Domain::IPV4, socket2::Type::STREAM, Some(socket2::Protocol::TCP))? + } + }; + // If address is IPv6 try to enable dual-stack; on failure, switch to IPv4 socket. if server_addr.is_ipv6() { if let Err(e) = socket.set_only_v6(false) { - warn!("Failed to set IPV6_V6ONLY=false, falling back to IPv4-only: {}", e); - // Fallback to a new IPv4 socket if setting dual-stack fails. + warn!("Failed to set IPV6_V6ONLY=false, attempting IPv4 fallback: {}", e); let ipv4_addr = SocketAddr::new(std::net::Ipv4Addr::UNSPECIFIED.into(), server_addr.port()); server_addr = ipv4_addr; socket = socket2::Socket::new(socket2::Domain::IPV4, socket2::Type::STREAM, Some(socket2::Protocol::TCP))?; @@ -193,8 +152,27 @@ pub async fn start_http_server( socket.set_reuse_address(true)?; // Set the socket to non-blocking before passing it to Tokio. socket.set_nonblocking(true)?; - socket.bind(&server_addr.into())?; - socket.listen(backlog)?; + + // Attempt bind; if bind fails for IPv6, try IPv4 fallback once more. 
+ if let Err(bind_err) = socket.bind(&server_addr.into()) { + warn!("Failed to bind to {}: {}.", server_addr, bind_err); + if server_addr.is_ipv6() { + // Try IPv4 fallback + let ipv4_addr = SocketAddr::new(std::net::Ipv4Addr::UNSPECIFIED.into(), server_addr.port()); + server_addr = ipv4_addr; + socket = socket2::Socket::new(socket2::Domain::IPV4, socket2::Type::STREAM, Some(socket2::Protocol::TCP))?; + socket.set_reuse_address(true)?; + socket.set_nonblocking(true)?; + socket.bind(&server_addr.into())?; + // [FIX] Ensure fallback socket is moved to listening state as well. + socket.listen(backlog)?; + } else { + return Err(bind_err); + } + } else { + // Listen on the socket when initial bind succeeded + socket.listen(backlog)?; + } TcpListener::from_std(socket.into())? }; @@ -232,7 +210,7 @@ pub async fn start_http_server( println!("Console WebUI (localhost): {protocol}://127.0.0.1:{server_port}/rustfs/console/index.html",); } else { info!(target: "rustfs::main::startup","RustFS API: {api_endpoints} {localhost_endpoint}"); - println!("RustFS API: {api_endpoints} {localhost_endpoint}"); + println!("RustFS Http API: {api_endpoints} {localhost_endpoint}"); println!("RustFS Start Time: {now_time}"); if DEFAULT_ACCESS_KEY.eq(&opt.access_key) && DEFAULT_SECRET_KEY.eq(&opt.secret_key) { warn!( @@ -290,6 +268,17 @@ pub async fn start_http_server( Some(cors_allowed_origins) }; + // Create compression configuration from environment variables + let compression_config = CompressionConfig::from_env(); + if compression_config.enabled { + info!( + "HTTP response compression enabled: extensions={:?}, mime_patterns={:?}, min_size={} bytes", + compression_config.extensions, compression_config.mime_patterns, compression_config.min_size + ); + } else { + debug!("HTTP response compression is disabled"); + } + let is_console = opt.console_enable; tokio::spawn(async move { // Create CORS layer inside the server loop closure @@ -395,15 +384,16 @@ pub async fn start_http_server( warn!(?err, "Failed to set set_send_buffer_size"); } - process_connection( - socket, - tls_acceptor.clone(), - http_server.clone(), - s3_service.clone(), - graceful.clone(), - cors_layer.clone(), + let connection_ctx = ConnectionContext { + http_server: http_server.clone(), + s3_service: s3_service.clone(), + cors_layer: cors_layer.clone(), + compression_config: compression_config.clone(), is_console, - ); + readiness: readiness.clone(), + }; + + process_connection(socket, tls_acceptor.clone(), connection_ctx, graceful.clone()); } worker_state_manager.update(ServiceState::Stopping); @@ -496,6 +486,16 @@ async fn setup_tls_acceptor(tls_path: &str) -> Result> { Ok(None) } +#[derive(Clone)] +struct ConnectionContext { + http_server: Arc>, + s3_service: S3Service, + cors_layer: CorsLayer, + compression_config: CompressionConfig, + is_console: bool, + readiness: Arc, +} + /// Process a single incoming TCP connection. 
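The listener setup above now recovers from IPv6-specific failures at three points: socket creation, the dual-stack (`set_only_v6(false)`) configuration, and bind. A compacted sketch of that sequence, using the same `socket2`/`tokio` calls as the hunk; the function name, address, and backlog here are illustrative, not RustFS defaults:

```rust
use std::net::{Ipv4Addr, SocketAddr};
use tokio::net::TcpListener;

/// Sketch of the create -> configure -> bind -> listen sequence with IPv4 fallback.
fn bind_with_ipv4_fallback(mut addr: SocketAddr, backlog: i32) -> std::io::Result<TcpListener> {
    // Helper that builds a fresh IPv4 socket plus the matching wildcard address.
    let ipv4_socket = |port: u16| -> std::io::Result<(socket2::Socket, SocketAddr)> {
        let s = socket2::Socket::new(socket2::Domain::IPV4, socket2::Type::STREAM, Some(socket2::Protocol::TCP))?;
        Ok((s, SocketAddr::new(Ipv4Addr::UNSPECIFIED.into(), port)))
    };

    // 1) Create a socket for the requested address family, or fall back to IPv4.
    let mut socket = match socket2::Socket::new(
        socket2::Domain::for_address(addr),
        socket2::Type::STREAM,
        Some(socket2::Protocol::TCP),
    ) {
        Ok(s) => s,
        Err(_) => {
            let (s, a) = ipv4_socket(addr.port())?;
            addr = a;
            s
        }
    };

    // 2) For IPv6, try to enable dual-stack; if that fails, switch to a fresh IPv4 socket.
    if addr.is_ipv6() && socket.set_only_v6(false).is_err() {
        let (s, a) = ipv4_socket(addr.port())?;
        addr = a;
        socket = s;
    }

    socket.set_reuse_address(true)?;
    socket.set_nonblocking(true)?;

    // 3) Bind; if an IPv6 bind fails, retry once on IPv4 before giving up, then listen.
    if let Err(e) = socket.bind(&addr.into()) {
        if !addr.is_ipv6() {
            return Err(e);
        }
        let (s, a) = ipv4_socket(addr.port())?;
        socket = s;
        socket.set_reuse_address(true)?;
        socket.set_nonblocking(true)?;
        socket.bind(&a.into())?;
    }
    socket.listen(backlog)?;

    // `from_std` must be called from within a Tokio runtime, as it is in the server.
    TcpListener::from_std(socket.into())
}
```

The server additionally tunes socket buffer sizes and hands the accepted connections to the TLS/HTTP pipeline; the sketch only covers the fallback logic itself.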
/// /// This function is executed in a new Tokio task and it will: @@ -507,13 +507,19 @@ async fn setup_tls_acceptor(tls_path: &str) -> Result> { fn process_connection( socket: TcpStream, tls_acceptor: Option>, - http_server: Arc>, - s3_service: S3Service, + context: ConnectionContext, graceful: Arc, - cors_layer: CorsLayer, - is_console: bool, ) { tokio::spawn(async move { + let ConnectionContext { + http_server, + s3_service, + cors_layer, + compression_config, + is_console, + readiness, + } = context; + // Build services inside each connected task to avoid passing complex service types across tasks, // It also ensures that each connection has an independent service instance. let rpc_service = NodeServiceServer::with_interceptor(make_server(), check_auth); @@ -522,6 +528,9 @@ fn process_connection( let hybrid_service = ServiceBuilder::new() .layer(SetRequestIdLayer::x_request_id(MakeRequestUuid)) .layer(CatchPanicLayer::new()) + // CRITICAL: Insert ReadinessGateLayer before business logic + // This stops requests from hitting IAMAuth or Storage if they are not ready. + .layer(ReadinessGateLayer::new(readiness)) .layer( TraceLayer::new_for_http() .make_span_with(|request: &HttpRequest<_>| { @@ -577,8 +586,9 @@ fn process_connection( ) .layer(PropagateRequestIdLayer::x_request_id()) .layer(cors_layer) - // Compress responses, but exclude error responses to avoid Content-Length mismatch issues - .layer(CompressionLayer::new().compress_when(ShouldCompress)) + // Compress responses based on whitelist configuration + // Only compresses when enabled and matches configured extensions/MIME types + .layer(CompressionLayer::new().compress_when(CompressionPredicate::new(compression_config))) .option_layer(if is_console { Some(RedirectLayer) } else { None }) .service(service); diff --git a/rustfs/src/server/mod.rs b/rustfs/src/server/mod.rs index 5aee97e3..28af0093 100644 --- a/rustfs/src/server/mod.rs +++ b/rustfs/src/server/mod.rs @@ -13,17 +13,23 @@ // limitations under the License. mod audit; +mod cert; +mod compress; +mod event; mod http; mod hybrid; mod layer; +mod prefix; +mod readiness; +mod runtime; mod service_state; -mod event; -mod runtime; - pub(crate) use audit::{start_audit_system, stop_audit_system}; +pub(crate) use cert::init_cert; pub(crate) use event::{init_event_notifier, shutdown_event_notifier}; pub(crate) use http::start_http_server; +pub(crate) use prefix::*; +pub(crate) use readiness::ReadinessGateLayer; pub(crate) use runtime::get_tokio_runtime_builder; pub(crate) use service_state::SHUTDOWN_TIMEOUT; pub(crate) use service_state::ServiceState; diff --git a/rustfs/src/server/prefix.rs b/rustfs/src/server/prefix.rs new file mode 100644 index 00000000..bdb8216a --- /dev/null +++ b/rustfs/src/server/prefix.rs @@ -0,0 +1,55 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +/// Predefined CPU profiling path for RustFS server. +/// This path is used to access CPU profiling data. 
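The per-connection stack above replaces the old status/size-based `ShouldCompress` with a whitelist-driven `CompressionPredicate` built from `CompressionConfig::from_env()`. The sketch below is not the `server::compress` implementation; it only illustrates a predicate of the same shape (the `tower_http` `Predicate` trait), with made-up MIME prefixes and minimum size:

```rust
use http::Response;
use tower_http::compression::predicate::Predicate;

/// Illustrative whitelist predicate: compress only when the Content-Type matches
/// an allowed prefix and the declared body size is at least `min_size` bytes.
#[derive(Clone)]
struct WhitelistCompress {
    mime_prefixes: Vec<&'static str>,
    min_size: u64,
}

impl Predicate for WhitelistCompress {
    fn should_compress<B>(&self, response: &Response<B>) -> bool
    where
        B: http_body::Body,
    {
        // Size gate: skip tiny bodies where compression overhead dominates.
        let big_enough = response
            .headers()
            .get(http::header::CONTENT_LENGTH)
            .and_then(|v| v.to_str().ok())
            .and_then(|s| s.parse::<u64>().ok())
            .map(|len| len >= self.min_size)
            // Unknown length (e.g. streaming): defer to the MIME check.
            .unwrap_or(true);

        // MIME gate: only compress whitelisted content types.
        let mime_ok = response
            .headers()
            .get(http::header::CONTENT_TYPE)
            .and_then(|v| v.to_str().ok())
            .map(|ct| self.mime_prefixes.iter().any(|p| ct.starts_with(p)))
            .unwrap_or(false);

        big_enough && mime_ok
    }
}
```

As in the hunk, such a predicate is attached with `CompressionLayer::new().compress_when(...)`; the real `CompressionPredicate` additionally honours the configured file extensions.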
+pub(crate) const PROFILE_CPU_PATH: &str = "/profile/cpu"; + +/// This path is used to access memory profiling data. +pub(crate) const PROFILE_MEMORY_PATH: &str = "/profile/memory"; + +/// Favicon path to handle browser requests for the favicon. +/// This path serves the favicon.ico file. +pub(crate) const FAVICON_PATH: &str = "/favicon.ico"; + +/// Predefined health check path for RustFS server. +/// This path is used to check the health status of the server. +pub(crate) const HEALTH_PREFIX: &str = "/health"; + +/// Predefined administrative prefix for RustFS server routes. +/// This prefix is used for endpoints that handle administrative tasks +/// such as configuration, monitoring, and management. +pub(crate) const ADMIN_PREFIX: &str = "/rustfs/admin"; + +/// Versioned administrative prefix for RustFS server routes. +/// This is ADMIN_PREFIX extended with the current admin API version. +pub(crate) const RUSTFS_ADMIN_PREFIX: &str = "/rustfs/admin/v3"; + +/// Predefined console prefix for RustFS server routes. +/// This prefix is used for endpoints that handle console-related tasks +/// such as user interface and management. +pub(crate) const CONSOLE_PREFIX: &str = "/rustfs/console"; + +/// Predefined RPC prefix for RustFS server routes. +/// This prefix is used for endpoints that handle remote procedure calls (RPC). +pub(crate) const RPC_PREFIX: &str = "/rustfs/rpc"; + +/// LOGO art for RustFS server. +pub(crate) const LOGO: &str = r#" + +░█▀▄░█░█░█▀▀░▀█▀░█▀▀░█▀▀ +░█▀▄░█░█░▀▀█░░█░░█▀▀░▀▀█ +░▀░▀░▀▀▀░▀▀▀░░▀░░▀░░░▀▀▀ + +"#; diff --git a/rustfs/src/server/readiness.rs b/rustfs/src/server/readiness.rs new file mode 100644 index 00000000..a79ad083 --- /dev/null +++ b/rustfs/src/server/readiness.rs @@ -0,0 +1,129 @@ +// Copyright 2024 RustFS Team +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +use bytes::Bytes; +use http::{Request as HttpRequest, Response, StatusCode}; +use http_body::Body; +use http_body_util::{BodyExt, Full}; +use hyper::body::Incoming; +use rustfs_common::GlobalReadiness; +use std::future::Future; +use std::pin::Pin; +use std::sync::Arc; +use std::task::{Context, Poll}; +use tower::{Layer, Service}; + +/// ReadinessGateLayer ensures that the system components (IAM, Storage) +/// are fully initialized before allowing any request to proceed.
+#[derive(Clone)] +pub struct ReadinessGateLayer { + readiness: Arc, +} + +impl ReadinessGateLayer { + /// Create a new ReadinessGateLayer + /// # Arguments + /// * `readiness` - An Arc to the GlobalReadiness instance + /// + /// # Returns + /// A new instance of ReadinessGateLayer + pub fn new(readiness: Arc) -> Self { + Self { readiness } + } +} + +impl Layer for ReadinessGateLayer { + type Service = ReadinessGateService; + + /// Wrap the inner service with ReadinessGateService + /// # Arguments + /// * `inner` - The inner service to wrap + /// # Returns + /// An instance of ReadinessGateService + fn layer(&self, inner: S) -> Self::Service { + ReadinessGateService { + inner, + readiness: self.readiness.clone(), + } + } +} + +#[derive(Clone)] +pub struct ReadinessGateService { + inner: S, + readiness: Arc, +} + +type BoxError = Box; +type BoxBody = http_body_util::combinators::UnsyncBoxBody; +impl Service> for ReadinessGateService +where + S: Service, Response = Response> + Clone + Send + 'static, + S::Future: Send + 'static, + S::Error: Send + 'static, + B: Body + Send + 'static, + B::Error: Into + Send + 'static, +{ + type Response = Response; + type Error = S::Error; + type Future = Pin> + Send>>; + + fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll> { + self.inner.poll_ready(cx) + } + + fn call(&mut self, req: HttpRequest) -> Self::Future { + let mut inner = self.inner.clone(); + let readiness = self.readiness.clone(); + Box::pin(async move { + let path = req.uri().path(); + // 1) Exact match: fixed probe/resource path + let is_exact_probe = matches!( + path, + crate::server::PROFILE_MEMORY_PATH + | crate::server::PROFILE_CPU_PATH + | crate::server::HEALTH_PREFIX + | crate::server::FAVICON_PATH + ); + + // 2) Prefix matching: the entire set of route prefixes (including their subpaths) + let is_prefix_probe = path.starts_with(crate::server::RUSTFS_ADMIN_PREFIX) + || path.starts_with(crate::server::CONSOLE_PREFIX) + || path.starts_with(crate::server::RPC_PREFIX) + || path.starts_with(crate::server::ADMIN_PREFIX); + + let is_probe = is_exact_probe || is_prefix_probe; + if !is_probe && !readiness.is_ready() { + let body: BoxBody = Full::new(Bytes::from_static(b"Service not ready")) + .map_err(|e| -> BoxError { Box::new(e) }) + .boxed_unsync(); + + let resp = Response::builder() + .status(StatusCode::SERVICE_UNAVAILABLE) + .header(http::header::RETRY_AFTER, "5") + .header(http::header::CONTENT_TYPE, "text/plain; charset=utf-8") + .header(http::header::CACHE_CONTROL, "no-store") + .body(body) + .expect("failed to build not ready response"); + return Ok(resp); + } + let resp = inner.call(req).await?; + // System is ready, forward to the actual S3/RPC handlers + // Transparently converts any response body into a BoxBody, and then Trace/Cors/Compression continues to work + let (parts, body) = resp.into_parts(); + let body: BoxBody = body.map_err(Into::into).boxed_unsync(); + Ok(Response::from_parts(parts, body)) + }) + } +} diff --git a/rustfs/src/storage/concurrency.rs b/rustfs/src/storage/concurrency.rs index cc78ef6d..4ab95135 100644 --- a/rustfs/src/storage/concurrency.rs +++ b/rustfs/src/storage/concurrency.rs @@ -1165,12 +1165,12 @@ impl HotObjectCache { #[allow(dead_code)] async fn invalidate_versioned(&self, bucket: &str, key: &str, version_id: Option<&str>) { // Always invalidate the latest version key - let base_key = format!("{}/{}", bucket, key); + let base_key = format!("{bucket}/{key}"); self.invalidate(&base_key).await; // Also invalidate the specific version if provided 
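The gate in `readiness.rs` above lets probe and management traffic through while everything else waits for readiness: four exact paths plus four route prefixes (note that `ADMIN_PREFIX` already covers `RUSTFS_ADMIN_PREFIX`, so the extra prefix check does not change the outcome). Restated as a pure function using the literal values from `server::prefix`; the helper name and the test are illustrative only:

```rust
/// Standalone restatement of the gate's routing decision, handy for testing
/// the path rules in isolation. The literals mirror the server::prefix constants.
fn is_probe_path(path: &str) -> bool {
    // Exact probe/resource paths bypass the readiness check.
    let exact = matches!(path, "/profile/cpu" | "/profile/memory" | "/health" | "/favicon.ico");
    // Whole route families (and their subpaths) bypass it as well.
    let by_prefix = path.starts_with("/rustfs/admin/v3")
        || path.starts_with("/rustfs/console")
        || path.starts_with("/rustfs/rpc")
        || path.starts_with("/rustfs/admin");
    exact || by_prefix
}

#[cfg(test)]
mod probe_path_tests {
    use super::is_probe_path;

    #[test]
    fn probe_paths_bypass_the_gate() {
        assert!(is_probe_path("/health"));
        assert!(is_probe_path("/rustfs/admin/v3/info"));
        assert!(is_probe_path("/rustfs/console/index.html"));
        // Ordinary S3 traffic is gated until readiness is signalled.
        assert!(!is_probe_path("/my-bucket/my-key"));
        // Only the exact "/health" path is whitelisted, so subpaths are gated.
        assert!(!is_probe_path("/health/live"));
    }
}
```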
if let Some(vid) = version_id { - let versioned_key = format!("{}?versionId={}", base_key, vid); + let versioned_key = format!("{base_key}?versionId={vid}"); self.invalidate(&versioned_key).await; } } @@ -1625,8 +1625,8 @@ impl ConcurrencyManager { /// Cache key string pub fn make_cache_key(bucket: &str, key: &str, version_id: Option<&str>) -> String { match version_id { - Some(vid) => format!("{}/{}?versionId={}", bucket, key, vid), - None => format!("{}/{}", bucket, key), + Some(vid) => format!("{bucket}/{key}?versionId={vid}"), + None => format!("{bucket}/{key}"), } } @@ -1650,13 +1650,19 @@ pub fn get_concurrency_manager() -> &'static ConcurrencyManager { &CONCURRENCY_MANAGER } +/// Testing helper to reset the global request counter. +#[allow(dead_code)] +pub(crate) fn reset_active_get_requests() { + ACTIVE_GET_REQUESTS.store(0, Ordering::Relaxed); +} + #[cfg(test)] mod tests { use super::*; #[test] fn test_concurrent_request_tracking() { - // Ensure we start from a clean state + reset_active_get_requests(); assert_eq!(GetObjectGuard::concurrent_requests(), 0); let _guard1 = GetObjectGuard::new(); @@ -1722,7 +1728,7 @@ mod tests { // Fill cache with objects for i in 0..200 { let data = vec![0u8; 64 * KI_B]; - cache.put(format!("key_{}", i), data).await; + cache.put(format!("key_{i}"), data).await; } let stats = cache.stats().await; @@ -1779,8 +1785,7 @@ mod tests { let result = get_advanced_buffer_size(32 * KI_B as i64, 256 * KI_B, true); assert!( (16 * KI_B..=64 * KI_B).contains(&result), - "Small files should use reduced buffer: {}", - result + "Small files should use reduced buffer: {result}" ); } diff --git a/rustfs/src/storage/concurrent_get_object_test.rs b/rustfs/src/storage/concurrent_get_object_test.rs index b5b3fbca..dbaee439 100644 --- a/rustfs/src/storage/concurrent_get_object_test.rs +++ b/rustfs/src/storage/concurrent_get_object_test.rs @@ -214,9 +214,7 @@ mod tests { // Allow widened range due to parallel test execution affecting global counter assert!( (64 * KI_B..=MI_B).contains(&buffer_size), - "{}: buffer should be in valid range 64KB-1MB, got {} bytes", - description, - buffer_size + "{description}: buffer should be in valid range 64KB-1MB, got {buffer_size} bytes" ); } } @@ -229,22 +227,20 @@ mod tests { let min_buffer = get_concurrency_aware_buffer_size(small_file, 64 * KI_B); assert!( min_buffer >= 32 * KI_B, - "Buffer should have minimum size of 32KB for tiny files, got {}", - min_buffer + "Buffer should have minimum size of 32KB for tiny files, got {min_buffer}" ); // Test maximum buffer size (capped at 1MB when base is reasonable) let huge_file = 10 * 1024 * MI_B as i64; // 10GB file let max_buffer = get_concurrency_aware_buffer_size(huge_file, MI_B); - assert!(max_buffer <= MI_B, "Buffer should not exceed 1MB cap when requested, got {}", max_buffer); + assert!(max_buffer <= MI_B, "Buffer should not exceed 1MB cap when requested, got {max_buffer}"); // Test buffer size scaling with base - when base is small, result respects the limits let medium_file = 200 * KI_B as i64; // 200KB file (>100KB so minimum is 64KB) let buffer = get_concurrency_aware_buffer_size(medium_file, 128 * KI_B); assert!( (64 * KI_B..=MI_B).contains(&buffer), - "Buffer should be between 64KB and 1MB, got {}", - buffer + "Buffer should be between 64KB and 1MB, got {buffer}" ); } @@ -271,7 +267,7 @@ mod tests { let elapsed = start.elapsed(); // With 64 permits, 10 concurrent tasks should complete quickly - assert!(elapsed < Duration::from_secs(1), "Should complete within 1 second, took 
{:?}", elapsed); + assert!(elapsed < Duration::from_secs(1), "Should complete within 1 second, took {elapsed:?}"); } /// Test Moka cache operations: insert, retrieve, stats, and clear. @@ -373,7 +369,7 @@ mod tests { let num_objects = 20; // Total 120MB > 100MB limit for i in 0..num_objects { - let key = format!("test/object{}", i); + let key = format!("test/object{i}"); let data = vec![i as u8; object_size]; manager.cache_object(key, data).await; sleep(Duration::from_millis(10)).await; // Give Moka time to process @@ -407,7 +403,7 @@ mod tests { // Cache multiple objects for i in 0..10 { - let key = format!("batch/object{}", i); + let key = format!("batch/object{i}"); let data = vec![i as u8; 100 * KI_B]; // 100KB each manager.cache_object(key, data).await; } @@ -415,14 +411,14 @@ mod tests { sleep(Duration::from_millis(100)).await; // Test batch get - let keys: Vec = (0..10).map(|i| format!("batch/object{}", i)).collect(); + let keys: Vec = (0..10).map(|i| format!("batch/object{i}")).collect(); let results = manager.get_cached_batch(&keys).await; assert_eq!(results.len(), 10, "Should return result for each key"); // Verify all objects were retrieved let hits = results.iter().filter(|r| r.is_some()).count(); - assert!(hits >= 8, "Most objects should be cached (got {}/10 hits)", hits); + assert!(hits >= 8, "Most objects should be cached (got {hits}/10 hits)"); // Mix of existing and non-existing keys let mixed_keys = vec![ @@ -442,7 +438,7 @@ mod tests { // Prepare objects for warming let objects: Vec<(String, Vec)> = (0..5) - .map(|i| (format!("warm/object{}", i), vec![i as u8; 500 * KI_B])) + .map(|i| (format!("warm/object{i}"), vec![i as u8; 500 * KI_B])) .collect(); // Warm cache @@ -452,8 +448,8 @@ mod tests { // Verify all objects are cached for (key, data) in objects { let cached = manager.get_cached(&key).await; - assert!(cached.is_some(), "Warmed object {} should be cached", key); - assert_eq!(*cached.unwrap(), data, "Cached data for {} should match", key); + assert!(cached.is_some(), "Warmed object {key} should be cached"); + assert_eq!(*cached.unwrap(), data, "Cached data for {key} should match"); } let stats = manager.cache_stats().await; @@ -467,7 +463,7 @@ mod tests { // Cache objects with different access patterns for i in 0..5 { - let key = format!("hot/object{}", i); + let key = format!("hot/object{i}"); let data = vec![i as u8; 100 * KI_B]; manager.cache_object(key, data).await; } @@ -532,25 +528,23 @@ mod tests { /// Test advanced buffer sizing with file patterns #[tokio::test] async fn test_advanced_buffer_sizing() { + crate::storage::concurrency::reset_active_get_requests(); + let base_buffer = 256 * KI_B; // 256KB base // Test small file optimization let small_size = get_advanced_buffer_size(128 * KI_B as i64, base_buffer, false); assert!( small_size < base_buffer, - "Small files should use smaller buffers: {} < {}", - small_size, - base_buffer + "Small files should use smaller buffers: {small_size} < {base_buffer}" ); - assert!(small_size >= 16 * KI_B, "Should not go below minimum: {}", small_size); + assert!(small_size >= 16 * KI_B, "Should not go below minimum: {small_size}"); // Test sequential read optimization let seq_size = get_advanced_buffer_size(32 * MI_B as i64, base_buffer, true); assert!( seq_size >= base_buffer, - "Sequential reads should use larger buffers: {} >= {}", - seq_size, - base_buffer + "Sequential reads should use larger buffers: {seq_size} >= {base_buffer}" ); // Test large file with high concurrency @@ -558,9 +552,7 @@ mod tests { let 
large_concurrent = get_advanced_buffer_size(100 * MI_B as i64, base_buffer, false); assert!( large_concurrent <= base_buffer, - "High concurrency should reduce buffer: {} <= {}", - large_concurrent, - base_buffer + "High concurrency should reduce buffer: {large_concurrent} <= {base_buffer}" ); } @@ -571,7 +563,7 @@ mod tests { // Pre-populate cache for i in 0..20 { - let key = format!("concurrent/object{}", i); + let key = format!("concurrent/object{i}"); let data = vec![i as u8; 100 * KI_B]; manager.cache_object(key, data).await; } @@ -600,8 +592,7 @@ mod tests { // Moka's lock-free design should handle this quickly assert!( elapsed < Duration::from_millis(500), - "Concurrent cache access should be fast (took {:?})", - elapsed + "Concurrent cache access should be fast (took {elapsed:?})" ); } @@ -635,7 +626,7 @@ mod tests { // Cache some objects for i in 0..5 { - let key = format!("hitrate/object{}", i); + let key = format!("hitrate/object{i}"); let data = vec![i as u8; 100 * KI_B]; manager.cache_object(key, data).await; } @@ -645,16 +636,16 @@ mod tests { // Mix of hits and misses for i in 0..10 { let key = if i < 5 { - format!("hitrate/object{}", i) // Hit + format!("hitrate/object{i}") // Hit } else { - format!("hitrate/missing{}", i) // Miss + format!("hitrate/missing{i}") // Miss }; let _ = manager.get_cached(&key).await; } // Hit rate should be around 50% let hit_rate = manager.cache_hit_rate(); - assert!((40.0..=60.0).contains(&hit_rate), "Hit rate should be ~50%, got {:.1}%", hit_rate); + assert!((40.0..=60.0).contains(&hit_rate), "Hit rate should be ~50%, got {hit_rate:.1}%"); } /// Test TTL expiration (Moka automatic cleanup) @@ -686,7 +677,7 @@ mod tests { // Pre-populate for i in 0..50 { - let key = format!("bench/object{}", i); + let key = format!("bench/object{i}"); let data = vec![i as u8; 500 * KI_B]; manager.cache_object(key, data).await; } @@ -725,9 +716,18 @@ mod tests { seq_duration.as_secs_f64() / conc_duration.as_secs_f64() ); - // Concurrent should be faster or similar (lock-free advantage) - // Allow some margin for test variance - assert!(conc_duration <= seq_duration * 2, "Concurrent access should not be significantly slower"); + assert!(seq_duration > Duration::from_micros(0), "Sequential access should take some time"); + assert!(conc_duration > Duration::from_micros(0), "Concurrent access should take some time"); + + // Record performance indicators for analysis, but not as a basis for testing failure + let speedup_ratio = seq_duration.as_secs_f64() / conc_duration.as_secs_f64(); + if speedup_ratio < 0.8 { + println!("Warning: Concurrent access is significantly slower than sequential ({speedup_ratio:.2}x)"); + } else if speedup_ratio > 1.2 { + println!("Info: Concurrent access is significantly faster than sequential ({speedup_ratio:.2}x)"); + } else { + println!("Info: Performance difference between concurrent and sequential access is modest ({speedup_ratio:.2}x)"); + } } /// Test cache writeback mechanism @@ -1222,14 +1222,13 @@ mod tests { // Average should be around 20ms assert!( avg >= Duration::from_millis(15) && avg <= Duration::from_millis(25), - "Average should be around 20ms, got {:?}", - avg + "Average should be around 20ms, got {avg:?}" ); // Max should be 30ms assert_eq!(max, Duration::from_millis(30), "Max should be 30ms"); // P95 should be at or near 30ms - assert!(p95 >= Duration::from_millis(25), "P95 should be near 30ms, got {:?}", p95); + assert!(p95 >= Duration::from_millis(25), "P95 should be near 30ms, got {p95:?}"); } } diff --git 
a/rustfs/src/storage/ecfs.rs b/rustfs/src/storage/ecfs.rs index a96d1a42..508824ef 100644 --- a/rustfs/src/storage/ecfs.rs +++ b/rustfs/src/storage/ecfs.rs @@ -134,11 +134,15 @@ use std::{ sync::{Arc, LazyLock}, }; use time::{OffsetDateTime, format_description::well_known::Rfc3339}; -use tokio::{io::AsyncRead, sync::mpsc}; +use tokio::{ + io::{AsyncRead, AsyncSeek}, + sync::mpsc, +}; use tokio_stream::wrappers::ReceiverStream; use tokio_tar::Archive; use tokio_util::io::{ReaderStream, StreamReader}; use tracing::{debug, error, info, instrument, warn}; +use urlencoding::encode; use uuid::Uuid; macro_rules! try_ { @@ -397,6 +401,19 @@ impl AsyncRead for InMemoryAsyncReader { } } +impl AsyncSeek for InMemoryAsyncReader { + fn start_seek(mut self: std::pin::Pin<&mut Self>, position: std::io::SeekFrom) -> std::io::Result<()> { + // std::io::Cursor natively supports negative SeekCurrent offsets + // It will automatically handle validation and return an error if the final position would be negative + std::io::Seek::seek(&mut self.cursor, position)?; + Ok(()) + } + + fn poll_complete(self: std::pin::Pin<&mut Self>, _cx: &mut std::task::Context<'_>) -> std::task::Poll> { + std::task::Poll::Ready(Ok(self.cursor.position())) + } +} + async fn decrypt_multipart_managed_stream( mut encrypted_stream: Box, parts: &[ObjectPartInfo], @@ -465,7 +482,7 @@ fn validate_object_key(key: &str, operation: &str) -> S3Result<()> { if key.contains(['\0', '\n', '\r']) { return Err(S3Error::with_message( S3ErrorCode::InvalidArgument, - format!("Object key contains invalid control characters: {:?}", key), + format!("Object key contains invalid control characters: {key:?}"), )); } @@ -793,6 +810,9 @@ impl S3 for FS { key, server_side_encryption: requested_sse, ssekms_key_id: requested_kms_key_id, + sse_customer_algorithm, + sse_customer_key, + sse_customer_key_md5, .. } = req.input.clone(); let (src_bucket, src_key, version_id) = match copy_source { @@ -940,6 +960,44 @@ impl S3 for FS { } } + // Apply SSE-C encryption if customer-provided key is specified + if let (Some(sse_alg), Some(sse_key), Some(sse_md5)) = (&sse_customer_algorithm, &sse_customer_key, &sse_customer_key_md5) + { + if sse_alg.as_str() == "AES256" { + let key_bytes = BASE64_STANDARD.decode(sse_key.as_str()).map_err(|e| { + error!("Failed to decode SSE-C key: {}", e); + ApiError::from(StorageError::other("Invalid SSE-C key")) + })?; + + if key_bytes.len() != 32 { + return Err(ApiError::from(StorageError::other("SSE-C key must be 32 bytes")).into()); + } + + let computed_md5 = BASE64_STANDARD.encode(md5::compute(&key_bytes).0); + if computed_md5 != sse_md5.as_str() { + return Err(ApiError::from(StorageError::other("SSE-C key MD5 mismatch")).into()); + } + + // Store original size before encryption + src_info + .user_defined + .insert("x-amz-server-side-encryption-customer-original-size".to_string(), actual_size.to_string()); + + // SAFETY: The length of `key_bytes` is checked to be 32 bytes above, + // so this conversion cannot fail. 
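The SSE-C branch above, like the matching checks later in `put_object`/`upload_part`/`get_object`, validates the customer-provided key the same way each time: base64-decode it, require exactly 32 bytes, and compare the client's MD5 against the base64 of the raw 16-byte digest (not its hex form, which the older code used). A small helper in that shape, assuming the same `base64` and `md5` crates used in the hunk; the function name and string error type are illustrative:

```rust
use base64::Engine;
use base64::prelude::BASE64_STANDARD;

/// Decode and verify an SSE-C key/MD5 pair as sent in the
/// x-amz-server-side-encryption-customer-key / -key-md5 headers.
fn validate_sse_c_key(key_b64: &str, key_md5_b64: &str) -> Result<[u8; 32], String> {
    let key_bytes = BASE64_STANDARD
        .decode(key_b64)
        .map_err(|e| format!("invalid SSE-C key encoding: {e}"))?;

    if key_bytes.len() != 32 {
        return Err("SSE-C key must be 32 bytes".to_string());
    }

    // The key MD5 is the base64 of the raw digest bytes.
    let computed_md5 = BASE64_STANDARD.encode(md5::compute(&key_bytes).0);
    if computed_md5 != key_md5_b64 {
        return Err("SSE-C key MD5 mismatch".to_string());
    }

    // Length was checked above, so the conversion cannot fail.
    Ok(key_bytes.try_into().expect("length already checked"))
}
```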
+ let key_array: [u8; 32] = key_bytes.try_into().expect("key length already checked"); + // Generate deterministic nonce from bucket-key + let nonce_source = format!("{bucket}-{key}"); + let nonce_hash = md5::compute(nonce_source.as_bytes()); + let nonce: [u8; 12] = nonce_hash.0[..12] + .try_into() + .expect("MD5 hash is always 16 bytes; taking first 12 bytes for nonce is safe"); + + let encrypt_reader = EncryptReader::new(reader, key_array, nonce); + reader = HashReader::new(Box::new(encrypt_reader), -1, actual_size, None, None, false).map_err(ApiError::from)?; + } + } + src_info.put_object_reader = Some(PutObjReader::new(reader)); // check quota @@ -949,6 +1007,19 @@ impl S3 for FS { src_info.user_defined.insert(k, v); } + // Store SSE-C metadata for GET responses + if let Some(ref sse_alg) = sse_customer_algorithm { + src_info.user_defined.insert( + "x-amz-server-side-encryption-customer-algorithm".to_string(), + sse_alg.as_str().to_string(), + ); + } + if let Some(ref sse_md5) = sse_customer_key_md5 { + src_info + .user_defined + .insert("x-amz-server-side-encryption-customer-key-md5".to_string(), sse_md5.clone()); + } + // TODO: src tags let oi = store @@ -979,6 +1050,8 @@ impl S3 for FS { copy_object_result: Some(copy_object_result), server_side_encryption: effective_sse, ssekms_key_id: effective_kms_key_id, + sse_customer_algorithm, + sse_customer_key_md5, ..Default::default() }; @@ -1798,12 +1871,12 @@ impl S3 for FS { mod_time: cached .last_modified .as_ref() - .and_then(|s| time::OffsetDateTime::parse(s, &time::format_description::well_known::Rfc3339).ok()), + .and_then(|s| OffsetDateTime::parse(s, &Rfc3339).ok()), size: cached.content_length, actual_size: cached.content_length, is_dir: false, user_defined: cached.user_metadata.clone(), - version_id: cached.version_id.as_ref().and_then(|v| uuid::Uuid::parse_str(v).ok()), + version_id: cached.version_id.as_ref().and_then(|v| Uuid::parse_str(v).ok()), delete_marker: cached.delete_marker, content_type: cached.content_type.clone(), content_encoding: cached.content_encoding.clone(), @@ -2037,8 +2110,8 @@ impl S3 for FS { let mut key_array = [0u8; 32]; key_array.copy_from_slice(&key_bytes[..32]); - // Verify MD5 hash of the key matches what we expect - let computed_md5 = format!("{:x}", md5::compute(&key_bytes)); + // Verify MD5 hash of the key matches what the client claims + let computed_md5 = BASE64_STANDARD.encode(md5::compute(&key_bytes).0); if computed_md5 != *sse_key_md5_provided { return Err(ApiError::from(StorageError::other("SSE-C key MD5 mismatch")).into()); } @@ -2152,7 +2225,7 @@ impl S3 for FS { let mut buf = Vec::with_capacity(response_content_length as usize); if let Err(e) = tokio::io::AsyncReadExt::read_to_end(&mut final_stream, &mut buf).await { error!("Failed to read object into memory for caching: {}", e); - return Err(ApiError::from(StorageError::other(format!("Failed to read object for caching: {}", e))).into()); + return Err(ApiError::from(StorageError::other(format!("Failed to read object for caching: {e}"))).into()); } // Verify we read the expected amount @@ -2165,17 +2238,15 @@ impl S3 for FS { } // Build CachedGetObject with full metadata for cache writeback - let last_modified_str = info - .mod_time - .and_then(|t| match t.format(&time::format_description::well_known::Rfc3339) { - Ok(s) => Some(s), - Err(e) => { - warn!("Failed to format last_modified for cache writeback: {}", e); - None - } - }); + let last_modified_str = info.mod_time.and_then(|t| match t.format(&Rfc3339) { + Ok(s) => Some(s), + Err(e) 
=> { + warn!("Failed to format last_modified for cache writeback: {}", e); + None + } + }); - let cached_response = CachedGetObject::new(bytes::Bytes::from(buf.clone()), response_content_length) + let cached_response = CachedGetObject::new(Bytes::from(buf.clone()), response_content_length) .with_content_type(info.content_type.clone().unwrap_or_default()) .with_e_tag(info.etag.clone().unwrap_or_default()) .with_last_modified(last_modified_str.unwrap_or_default()); @@ -2209,11 +2280,55 @@ impl S3 for FS { ); Some(StreamingBlob::wrap(ReaderStream::with_capacity(final_stream, optimal_buffer_size))) } else { - // Standard streaming path for large objects or range/part requests - Some(StreamingBlob::wrap(bytes_stream( - ReaderStream::with_capacity(final_stream, optimal_buffer_size), - response_content_length as usize, - ))) + let seekable_object_size_threshold = rustfs_config::DEFAULT_OBJECT_SEEK_SUPPORT_THRESHOLD; + + let should_provide_seek_support = response_content_length > 0 + && response_content_length <= seekable_object_size_threshold as i64 + && part_number.is_none() + && rs.is_none(); + + if should_provide_seek_support { + debug!( + "Reading small object into memory for seek support: key={} size={}", + cache_key, response_content_length + ); + + // Read the stream into memory + let mut buf = Vec::with_capacity(response_content_length as usize); + match tokio::io::AsyncReadExt::read_to_end(&mut final_stream, &mut buf).await { + Ok(_) => { + // Verify we read the expected amount + if buf.len() != response_content_length as usize { + warn!( + "Object size mismatch during seek support read: expected={} actual={}", + response_content_length, + buf.len() + ); + } + + // Create seekable in-memory reader (similar to MinIO SDK's bytes.Reader) + let mem_reader = InMemoryAsyncReader::new(buf); + Some(StreamingBlob::wrap(bytes_stream( + ReaderStream::with_capacity(Box::new(mem_reader), optimal_buffer_size), + response_content_length as usize, + ))) + } + Err(e) => { + error!("Failed to read object into memory for seek support: {}", e); + // Fallback to streaming if read fails + Some(StreamingBlob::wrap(bytes_stream( + ReaderStream::with_capacity(final_stream, optimal_buffer_size), + response_content_length as usize, + ))) + } + } + } else { + // Standard streaming path for large objects or range/part requests + Some(StreamingBlob::wrap(bytes_stream( + ReaderStream::with_capacity(final_stream, optimal_buffer_size), + response_content_length as usize, + ))) + } }; // Extract SSE information from metadata for response @@ -2388,9 +2503,22 @@ impl S3 for FS { let info = store.get_object_info(&bucket, &key, &opts).await.map_err(ApiError::from)?; + if info.delete_marker { + if opts.version_id.is_none() { + return Err(S3Error::new(S3ErrorCode::NoSuchKey)); + } + return Err(S3Error::new(S3ErrorCode::MethodNotAllowed)); + } + if let Some(match_etag) = if_none_match { - if info.etag.as_ref().is_some_and(|etag| etag == match_etag.as_str()) { - return Err(S3Error::new(S3ErrorCode::NotModified)); + if let Some(strong_etag) = match_etag.into_etag() { + if info + .etag + .as_ref() + .is_some_and(|etag| ETag::Strong(etag.clone()) == strong_etag) + { + return Err(S3Error::new(S3ErrorCode::NotModified)); + } } } @@ -2405,8 +2533,14 @@ impl S3 for FS { } if let Some(match_etag) = if_match { - if info.etag.as_ref().is_some_and(|etag| etag != match_etag.as_str()) { - return Err(S3Error::new(S3ErrorCode::PreconditionFailed)); + if let Some(strong_etag) = match_etag.into_etag() { + if info + .etag + .as_ref() + 
.is_some_and(|etag| ETag::Strong(etag.clone()) != strong_etag) + { + return Err(S3Error::new(S3ErrorCode::PreconditionFailed)); + } } } else if let Some(unmodified_since) = if_unmodified_since { if info.mod_time.is_some_and(|mod_time| { @@ -2450,6 +2584,13 @@ impl S3 for FS { .map(|v| SSECustomerAlgorithm::from(v.clone())); let sse_customer_key_md5 = metadata_map.get("x-amz-server-side-encryption-customer-key-md5").cloned(); let ssekms_key_id = metadata_map.get("x-amz-server-side-encryption-aws-kms-key-id").cloned(); + // Prefer explicit storage_class from object info; fall back to persisted metadata header. + let storage_class = info + .storage_class + .clone() + .or_else(|| metadata_map.get("x-amz-storage-class").cloned()) + .filter(|s| !s.is_empty()) + .map(StorageClass::from); let mut checksum_crc32 = None; let mut checksum_crc32c = None; @@ -2503,6 +2644,7 @@ impl S3 for FS { checksum_sha256, checksum_crc64nvme, checksum_type, + storage_class, // metadata: object_metadata, ..Default::default() }; @@ -2587,16 +2729,52 @@ impl S3 for FS { async fn list_objects(&self, req: S3Request) -> S3Result> { let v2_resp = self.list_objects_v2(req.map_input(Into::into)).await?; - Ok(v2_resp.map_output(|v2| ListObjectsOutput { - contents: v2.contents, - delimiter: v2.delimiter, - encoding_type: v2.encoding_type, - name: v2.name, - prefix: v2.prefix, - max_keys: v2.max_keys, - common_prefixes: v2.common_prefixes, - is_truncated: v2.is_truncated, - ..Default::default() + Ok(v2_resp.map_output(|v2| { + // For ListObjects (v1) API, NextMarker should be the last item returned when truncated + // When both Contents and CommonPrefixes are present, NextMarker should be the + // lexicographically last item (either last key or last prefix) + let next_marker = if v2.is_truncated.unwrap_or(false) { + let last_key = v2 + .contents + .as_ref() + .and_then(|contents| contents.last()) + .and_then(|obj| obj.key.as_ref()) + .cloned(); + + let last_prefix = v2 + .common_prefixes + .as_ref() + .and_then(|prefixes| prefixes.last()) + .and_then(|prefix| prefix.prefix.as_ref()) + .cloned(); + + // NextMarker should be the lexicographically last item + // This matches Ceph S3 behavior used by s3-tests + match (last_key, last_prefix) { + (Some(k), Some(p)) => { + // Return the lexicographically greater one + if k > p { Some(k) } else { Some(p) } + } + (Some(k), None) => Some(k), + (None, Some(p)) => Some(p), + (None, None) => None, + } + } else { + None + }; + + ListObjectsOutput { + contents: v2.contents, + delimiter: v2.delimiter, + encoding_type: v2.encoding_type, + name: v2.name, + prefix: v2.prefix, + max_keys: v2.max_keys, + common_prefixes: v2.common_prefixes, + is_truncated: v2.is_truncated, + next_marker, + ..Default::default() + } })) } @@ -2607,6 +2785,7 @@ impl S3 for FS { bucket, continuation_token, delimiter, + encoding_type, fetch_owner, max_keys, prefix, @@ -2669,13 +2848,31 @@ impl S3 for FS { // warn!("object_infos objects {:?}", object_infos.objects); + // Apply URL encoding if encoding_type is "url" + // Note: S3 URL encoding should encode special characters but preserve path separators (/) + let should_encode = encoding_type.as_ref().map(|e| e.as_str() == "url").unwrap_or(false); + + // Helper function to encode S3 keys/prefixes (preserving /) + // S3 URL encoding encodes special characters but keeps '/' unencoded + let encode_s3_name = |name: &str| -> String { + name.split('/') + .map(|part| encode(part).to_string()) + .collect::>() + .join("/") + }; + let objects: Vec = object_infos .objects .iter() 
.filter(|v| !v.name.is_empty()) .map(|v| { + let key = if should_encode { + encode_s3_name(&v.name) + } else { + v.name.to_owned() + }; let mut obj = Object { - key: Some(v.name.to_owned()), + key: Some(key), last_modified: v.mod_time.map(Timestamp::from), size: Some(v.get_actual_size().unwrap_or_default()), e_tag: v.etag.clone().map(|etag| to_s3s_etag(&etag)), @@ -2693,14 +2890,18 @@ impl S3 for FS { }) .collect(); - let key_count = objects.len() as i32; - - let common_prefixes = object_infos + let common_prefixes: Vec = object_infos .prefixes .into_iter() - .map(|v| CommonPrefix { prefix: Some(v) }) + .map(|v| { + let prefix = if should_encode { encode_s3_name(&v) } else { v }; + CommonPrefix { prefix: Some(prefix) } + }) .collect(); + // KeyCount should include both objects and common prefixes per S3 API spec + let key_count = (objects.len() + common_prefixes.len()) as i32; + // Encode next_continuation_token to base64 let next_continuation_token = object_infos .next_continuation_token @@ -2714,6 +2915,7 @@ impl S3 for FS { max_keys: Some(max_keys), contents: Some(objects), delimiter, + encoding_type: encoding_type.clone(), name: Some(bucket), prefix: Some(prefix), common_prefixes: Some(common_prefixes), @@ -2761,9 +2963,10 @@ impl S3 for FS { key: Some(v.name.to_owned()), last_modified: v.mod_time.map(Timestamp::from), size: Some(v.size), - version_id: v.version_id.map(|v| v.to_string()), + version_id: Some(v.version_id.map(|v| v.to_string()).unwrap_or_else(|| "null".to_string())), is_latest: Some(v.is_latest), e_tag: v.etag.clone().map(|etag| to_s3s_etag(&etag)), + storage_class: v.storage_class.clone().map(ObjectVersionStorageClass::from), ..Default::default() // TODO: another fields } }) @@ -2783,13 +2986,17 @@ impl S3 for FS { .filter(|o| o.delete_marker) .map(|o| DeleteMarkerEntry { key: Some(o.name.clone()), - version_id: o.version_id.map(|v| v.to_string()), + version_id: Some(o.version_id.map(|v| v.to_string()).unwrap_or_else(|| "null".to_string())), is_latest: Some(o.is_latest), last_modified: o.mod_time.map(Timestamp::from), ..Default::default() }) .collect::>(); + // Only set next_version_id_marker if it has a value, per AWS S3 API spec + // boto3 expects it to be a string or omitted, not None + let next_version_id_marker = object_infos.next_version_idmarker.filter(|v| !v.is_empty()); + let output = ListObjectVersionsOutput { is_truncated: Some(object_infos.is_truncated), max_keys: Some(key_count), @@ -2799,6 +3006,8 @@ impl S3 for FS { common_prefixes: Some(common_prefixes), versions: Some(objects), delete_markers: Some(delete_markers), + next_key_marker: object_infos.next_marker, + next_version_id_marker, ..Default::default() }; @@ -2807,7 +3016,7 @@ impl S3 for FS { // #[instrument(level = "debug", skip(self, req))] async fn put_object(&self, req: S3Request) -> S3Result> { - let helper = OperationHelper::new(&req, EventName::ObjectCreatedPut, "s3:PutObject"); + let mut helper = OperationHelper::new(&req, EventName::ObjectCreatedPut, "s3:PutObject"); if req .headers .get("X-Amz-Meta-Snowball-Auto-Extract") @@ -2856,13 +3065,25 @@ impl S3 for FS { Ok(info) => { if !info.delete_marker { if let Some(ifmatch) = if_match { - if info.etag.as_ref().is_some_and(|etag| etag != ifmatch.as_str()) { - return Err(s3_error!(PreconditionFailed)); + if let Some(strong_etag) = ifmatch.into_etag() { + if info + .etag + .as_ref() + .is_some_and(|etag| ETag::Strong(etag.clone()) != strong_etag) + { + return Err(s3_error!(PreconditionFailed)); + } } } if let Some(ifnonematch) = if_none_match { 
- if info.etag.as_ref().is_some_and(|etag| etag == ifnonematch.as_str()) { - return Err(s3_error!(PreconditionFailed)); + if let Some(strong_etag) = ifnonematch.into_etag() { + if info + .etag + .as_ref() + .is_some_and(|etag| ETag::Strong(etag.clone()) == strong_etag) + { + return Err(s3_error!(PreconditionFailed)); + } } } } @@ -3046,8 +3267,8 @@ impl S3 for FS { let mut key_array = [0u8; 32]; key_array.copy_from_slice(&key_bytes[..32]); - // Verify MD5 hash of the key - let computed_md5 = format!("{:x}", md5::compute(&key_bytes)); + // Verify MD5 hash of the key matches what the client claims + let computed_md5 = BASE64_STANDARD.encode(md5::compute(&key_bytes).0); if computed_md5 != *sse_key_md5_provided { return Err(ApiError::from(StorageError::other("SSE-C key MD5 mismatch")).into()); } @@ -3125,6 +3346,12 @@ impl S3 for FS { let put_bucket = bucket.clone(); let put_key = key.clone(); let put_version = obj_info.version_id.map(|v| v.to_string()); + + helper = helper.object(obj_info.clone()); + if let Some(version_id) = &put_version { + helper = helper.version_id(version_id.clone()); + } + tokio::spawn(async move { manager .invalidate_cache_versioned(&put_bucket, &put_key, put_version.as_deref()) @@ -3477,8 +3704,8 @@ impl S3 for FS { let mut key_array = [0u8; 32]; key_array.copy_from_slice(&key_bytes[..32]); - // Verify MD5 hash of the key - let computed_md5 = format!("{:x}", md5::compute(&key_bytes)); + // Verify MD5 hash of the key matches what the client claims + let computed_md5 = BASE64_STANDARD.encode(md5::compute(&key_bytes).0); if computed_md5 != *sse_key_md5_provided { return Err(ApiError::from(StorageError::other("SSE-C key MD5 mismatch")).into()); } @@ -3655,7 +3882,12 @@ impl S3 for FS { // Validate copy conditions (simplified for now) if let Some(if_match) = copy_source_if_match { if let Some(ref etag) = src_info.etag { - if etag != &if_match { + if let Some(strong_etag) = if_match.into_etag() { + if ETag::Strong(etag.clone()) != strong_etag { + return Err(s3_error!(PreconditionFailed)); + } + } else { + // Weak ETag in If-Match should fail return Err(s3_error!(PreconditionFailed)); } } else { @@ -3665,9 +3897,12 @@ impl S3 for FS { if let Some(if_none_match) = copy_source_if_none_match { if let Some(ref etag) = src_info.etag { - if etag == &if_none_match { - return Err(s3_error!(PreconditionFailed)); + if let Some(strong_etag) = if_none_match.into_etag() { + if ETag::Strong(etag.clone()) == strong_etag { + return Err(s3_error!(PreconditionFailed)); + } } + // Weak ETag in If-None-Match is ignored (doesn't match) } } @@ -3939,13 +4174,25 @@ impl S3 for FS { Ok(info) => { if !info.delete_marker { if let Some(ifmatch) = if_match { - if info.etag.as_ref().is_some_and(|etag| etag != ifmatch.as_str()) { - return Err(s3_error!(PreconditionFailed)); + if let Some(strong_etag) = ifmatch.into_etag() { + if info + .etag + .as_ref() + .is_some_and(|etag| ETag::Strong(etag.clone()) != strong_etag) + { + return Err(s3_error!(PreconditionFailed)); + } } } if let Some(ifnonematch) = if_none_match { - if info.etag.as_ref().is_some_and(|etag| etag == ifnonematch.as_str()) { - return Err(s3_error!(PreconditionFailed)); + if let Some(strong_etag) = ifnonematch.into_etag() { + if info + .etag + .as_ref() + .is_some_and(|etag| ETag::Strong(etag.clone()) == strong_etag) + { + return Err(s3_error!(PreconditionFailed)); + } } } } @@ -4942,6 +5189,7 @@ impl S3 for FS { let (clear_result, event_rules) = tokio::join!(clear_rules, parse_rules); clear_result.map_err(|e| s3_error!(InternalError, 
"Failed to clear rules: {e}"))?; + warn!("notify event rules: {:?}", &event_rules); // Add a new notification rule notifier_global::add_event_specific_rules(&bucket, ®ion, &event_rules) @@ -5569,6 +5817,60 @@ mod tests { // and various dependencies that make unit testing challenging. For comprehensive testing // of S3 operations, integration tests would be more appropriate. + #[test] + fn test_list_objects_v2_key_count_includes_prefixes() { + // Test that KeyCount calculation includes both objects and common prefixes + // This verifies the fix for S3 API compatibility where KeyCount should equal + // the sum of Contents and CommonPrefixes lengths + + // Simulate the calculation logic from list_objects_v2 + let objects_count = 3_usize; + let common_prefixes_count = 2_usize; + + // KeyCount should include both objects and common prefixes per S3 API spec + let key_count = (objects_count + common_prefixes_count) as i32; + + assert_eq!(key_count, 5); + + // Edge cases: verify calculation logic + let no_objects = 0_usize; + let no_prefixes = 0_usize; + assert_eq!((no_objects + no_prefixes) as i32, 0); + + let one_object = 1_usize; + assert_eq!((one_object + no_prefixes) as i32, 1); + + let one_prefix = 1_usize; + assert_eq!((no_objects + one_prefix) as i32, 1); + } + + #[test] + fn test_s3_url_encoding_preserves_slash() { + // Test that S3 URL encoding preserves path separators (/) + // This verifies the encoding logic for EncodingType=url parameter + + use urlencoding::encode; + + // Helper function matching the implementation + let encode_s3_name = |name: &str| -> String { + name.split('/') + .map(|part| encode(part).to_string()) + .collect::>() + .join("/") + }; + + // Test cases from s3-tests + assert_eq!(encode_s3_name("asdf+b"), "asdf%2Bb"); + assert_eq!(encode_s3_name("foo+1/bar"), "foo%2B1/bar"); + assert_eq!(encode_s3_name("foo/"), "foo/"); + assert_eq!(encode_s3_name("quux ab/"), "quux%20ab/"); + + // Edge cases + assert_eq!(encode_s3_name("normal/key"), "normal/key"); + assert_eq!(encode_s3_name("key+with+plus"), "key%2Bwith%2Bplus"); + assert_eq!(encode_s3_name("key with spaces"), "key%20with%20spaces"); + } + #[test] fn test_s3_error_scenarios() { // Test that we can create expected S3 errors for common validation cases diff --git a/rustfs/src/storage/error.rs b/rustfs/src/storage/error.rs deleted file mode 100644 index e3b10cde..00000000 --- a/rustfs/src/storage/error.rs +++ /dev/null @@ -1,499 +0,0 @@ -// Copyright 2024 RustFS Team -// -// Licensed under the Apache License, Version 2.0 (the "License"); -// you may not use this file except in compliance with the License. -// You may obtain a copy of the License at -// -// http://www.apache.org/licenses/LICENSE-2.0 -// -// Unless required by applicable law or agreed to in writing, software -// distributed under the License is distributed on an "AS IS" BASIS, -// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -// See the License for the specific language governing permissions and -// limitations under the License. 
- -use ecstore::error::StorageError; -use rustfs_common::error::Error; -use s3s::{s3_error, S3Error, S3ErrorCode}; -pub fn to_s3_error(err: Error) -> S3Error { - if let Some(storage_err) = err.downcast_ref::() { - return match storage_err { - StorageError::NotImplemented => s3_error!(NotImplemented), - StorageError::InvalidArgument(bucket, object, version_id) => { - s3_error!(InvalidArgument, "Invalid arguments provided for {}/{}-{}", bucket, object, version_id) - } - StorageError::MethodNotAllowed => s3_error!(MethodNotAllowed), - StorageError::BucketNotFound(bucket) => { - s3_error!(NoSuchBucket, "bucket not found {}", bucket) - } - StorageError::BucketNotEmpty(bucket) => s3_error!(BucketNotEmpty, "bucket not empty {}", bucket), - StorageError::BucketNameInvalid(bucket) => s3_error!(InvalidBucketName, "invalid bucket name {}", bucket), - StorageError::ObjectNameInvalid(bucket, object) => { - s3_error!(InvalidArgument, "invalid object name {}/{}", bucket, object) - } - StorageError::BucketExists(bucket) => s3_error!(BucketAlreadyExists, "{}", bucket), - StorageError::StorageFull => s3_error!(ServiceUnavailable, "Storage reached its minimum free drive threshold."), - StorageError::SlowDown => s3_error!(SlowDown, "Please reduce your request rate"), - StorageError::PrefixAccessDenied(bucket, object) => { - s3_error!(AccessDenied, "PrefixAccessDenied {}/{}", bucket, object) - } - StorageError::InvalidUploadIDKeyCombination(bucket, object) => { - s3_error!(InvalidArgument, "Invalid UploadID KeyCombination: {}/{}", bucket, object) - } - StorageError::MalformedUploadID(bucket) => s3_error!(InvalidArgument, "Malformed UploadID: {}", bucket), - StorageError::ObjectNameTooLong(bucket, object) => { - s3_error!(InvalidArgument, "Object name too long: {}/{}", bucket, object) - } - StorageError::ObjectNamePrefixAsSlash(bucket, object) => { - s3_error!(InvalidArgument, "Object name contains forward slash as prefix: {}/{}", bucket, object) - } - StorageError::ObjectNotFound(bucket, object) => s3_error!(NoSuchKey, "{}/{}", bucket, object), - StorageError::VersionNotFound(bucket, object, version_id) => { - s3_error!(NoSuchVersion, "{}/{}/{}", bucket, object, version_id) - } - StorageError::InvalidUploadID(bucket, object, version_id) => { - s3_error!(InvalidPart, "Invalid upload id: {}/{}-{}", bucket, object, version_id) - } - StorageError::InvalidVersionID(bucket, object, version_id) => { - s3_error!(InvalidArgument, "Invalid version id: {}/{}-{}", bucket, object, version_id) - } - // extended - StorageError::DataMovementOverwriteErr(bucket, object, version_id) => s3_error!( - InvalidArgument, - "invalid data movement operation, source and destination pool are the same for : {}/{}-{}", - bucket, - object, - version_id - ), - - // extended - StorageError::ObjectExistsAsDirectory(bucket, object) => { - s3_error!(InvalidArgument, "Object exists on :{} as directory {}", bucket, object) - } - StorageError::InvalidPart(bucket, object, version_id) => { - s3_error!( - InvalidPart, - "Specified part could not be found. 
PartNumber {}, Expected {}, got {}", - bucket, - object, - version_id - ) - } - StorageError::DoneForNow => s3_error!(InternalError, "DoneForNow"), - }; - } - - if is_err_file_not_found(&err) { - return S3Error::with_message(S3ErrorCode::NoSuchKey, format!(" ec err {}", err)); - } - - S3Error::with_message(S3ErrorCode::InternalError, format!(" ec err {}", err)) -} - -#[cfg(test)] -mod tests { - use super::*; - use s3s::S3ErrorCode; - - #[test] - fn test_to_s3_error_not_implemented() { - let storage_err = StorageError::NotImplemented; - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::NotImplemented); - } - - #[test] - fn test_to_s3_error_invalid_argument() { - let storage_err = - StorageError::InvalidArgument("test-bucket".to_string(), "test-object".to_string(), "test-version".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidArgument); - assert!(s3_err.message().unwrap().contains("Invalid arguments provided")); - assert!(s3_err.message().unwrap().contains("test-bucket")); - assert!(s3_err.message().unwrap().contains("test-object")); - assert!(s3_err.message().unwrap().contains("test-version")); - } - - #[test] - fn test_to_s3_error_method_not_allowed() { - let storage_err = StorageError::MethodNotAllowed; - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::MethodNotAllowed); - } - - #[test] - fn test_to_s3_error_bucket_not_found() { - let storage_err = StorageError::BucketNotFound("test-bucket".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::NoSuchBucket); - assert!(s3_err.message().unwrap().contains("bucket not found")); - assert!(s3_err.message().unwrap().contains("test-bucket")); - } - - #[test] - fn test_to_s3_error_bucket_not_empty() { - let storage_err = StorageError::BucketNotEmpty("test-bucket".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::BucketNotEmpty); - assert!(s3_err.message().unwrap().contains("bucket not empty")); - assert!(s3_err.message().unwrap().contains("test-bucket")); - } - - #[test] - fn test_to_s3_error_bucket_name_invalid() { - let storage_err = StorageError::BucketNameInvalid("invalid-bucket-name".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidBucketName); - assert!(s3_err.message().unwrap().contains("invalid bucket name")); - assert!(s3_err.message().unwrap().contains("invalid-bucket-name")); - } - - #[test] - fn test_to_s3_error_object_name_invalid() { - let storage_err = StorageError::ObjectNameInvalid("test-bucket".to_string(), "invalid-object".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidArgument); - assert!(s3_err.message().unwrap().contains("invalid object name")); - assert!(s3_err.message().unwrap().contains("test-bucket")); - assert!(s3_err.message().unwrap().contains("invalid-object")); - } - - #[test] - fn test_to_s3_error_bucket_exists() { - let storage_err = StorageError::BucketExists("existing-bucket".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::BucketAlreadyExists); - 
assert!(s3_err.message().unwrap().contains("existing-bucket")); - } - - #[test] - fn test_to_s3_error_storage_full() { - let storage_err = StorageError::StorageFull; - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::ServiceUnavailable); - assert!( - s3_err - .message() - .unwrap() - .contains("Storage reached its minimum free drive threshold") - ); - } - - #[test] - fn test_to_s3_error_slow_down() { - let storage_err = StorageError::SlowDown; - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::SlowDown); - assert!(s3_err.message().unwrap().contains("Please reduce your request rate")); - } - - #[test] - fn test_to_s3_error_prefix_access_denied() { - let storage_err = StorageError::PrefixAccessDenied("test-bucket".to_string(), "test-prefix".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::AccessDenied); - assert!(s3_err.message().unwrap().contains("PrefixAccessDenied")); - assert!(s3_err.message().unwrap().contains("test-bucket")); - assert!(s3_err.message().unwrap().contains("test-prefix")); - } - - #[test] - fn test_to_s3_error_invalid_upload_id_key_combination() { - let storage_err = StorageError::InvalidUploadIDKeyCombination("test-bucket".to_string(), "test-object".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidArgument); - assert!(s3_err.message().unwrap().contains("Invalid UploadID KeyCombination")); - assert!(s3_err.message().unwrap().contains("test-bucket")); - assert!(s3_err.message().unwrap().contains("test-object")); - } - - #[test] - fn test_to_s3_error_malformed_upload_id() { - let storage_err = StorageError::MalformedUploadID("malformed-id".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidArgument); - assert!(s3_err.message().unwrap().contains("Malformed UploadID")); - assert!(s3_err.message().unwrap().contains("malformed-id")); - } - - #[test] - fn test_to_s3_error_object_name_too_long() { - let storage_err = StorageError::ObjectNameTooLong("test-bucket".to_string(), "very-long-object-name".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidArgument); - assert!(s3_err.message().unwrap().contains("Object name too long")); - assert!(s3_err.message().unwrap().contains("test-bucket")); - assert!(s3_err.message().unwrap().contains("very-long-object-name")); - } - - #[test] - fn test_to_s3_error_object_name_prefix_as_slash() { - let storage_err = StorageError::ObjectNamePrefixAsSlash("test-bucket".to_string(), "/invalid-object".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidArgument); - assert!( - s3_err - .message() - .unwrap() - .contains("Object name contains forward slash as prefix") - ); - assert!(s3_err.message().unwrap().contains("test-bucket")); - assert!(s3_err.message().unwrap().contains("/invalid-object")); - } - - #[test] - fn test_to_s3_error_object_not_found() { - let storage_err = StorageError::ObjectNotFound("test-bucket".to_string(), "missing-object".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::NoSuchKey); - 
assert!(s3_err.message().unwrap().contains("test-bucket")); - assert!(s3_err.message().unwrap().contains("missing-object")); - } - - #[test] - fn test_to_s3_error_version_not_found() { - let storage_err = - StorageError::VersionNotFound("test-bucket".to_string(), "test-object".to_string(), "missing-version".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::NoSuchVersion); - assert!(s3_err.message().unwrap().contains("test-bucket")); - assert!(s3_err.message().unwrap().contains("test-object")); - assert!(s3_err.message().unwrap().contains("missing-version")); - } - - #[test] - fn test_to_s3_error_invalid_upload_id() { - let storage_err = - StorageError::InvalidUploadID("test-bucket".to_string(), "test-object".to_string(), "invalid-upload-id".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidPart); - assert!(s3_err.message().unwrap().contains("Invalid upload id")); - assert!(s3_err.message().unwrap().contains("test-bucket")); - assert!(s3_err.message().unwrap().contains("test-object")); - assert!(s3_err.message().unwrap().contains("invalid-upload-id")); - } - - #[test] - fn test_to_s3_error_invalid_version_id() { - let storage_err = StorageError::InvalidVersionID( - "test-bucket".to_string(), - "test-object".to_string(), - "invalid-version-id".to_string(), - ); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidArgument); - assert!(s3_err.message().unwrap().contains("Invalid version id")); - assert!(s3_err.message().unwrap().contains("test-bucket")); - assert!(s3_err.message().unwrap().contains("test-object")); - assert!(s3_err.message().unwrap().contains("invalid-version-id")); - } - - #[test] - fn test_to_s3_error_data_movement_overwrite_err() { - let storage_err = StorageError::DataMovementOverwriteErr( - "test-bucket".to_string(), - "test-object".to_string(), - "test-version".to_string(), - ); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidArgument); - assert!(s3_err.message().unwrap().contains("invalid data movement operation")); - assert!(s3_err.message().unwrap().contains("source and destination pool are the same")); - assert!(s3_err.message().unwrap().contains("test-bucket")); - assert!(s3_err.message().unwrap().contains("test-object")); - assert!(s3_err.message().unwrap().contains("test-version")); - } - - #[test] - fn test_to_s3_error_object_exists_as_directory() { - let storage_err = StorageError::ObjectExistsAsDirectory("test-bucket".to_string(), "directory-object".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidArgument); - assert!(s3_err.message().unwrap().contains("Object exists on")); - assert!(s3_err.message().unwrap().contains("as directory")); - assert!(s3_err.message().unwrap().contains("test-bucket")); - assert!(s3_err.message().unwrap().contains("directory-object")); - } - - #[test] - fn test_to_s3_error_insufficient_read_quorum() { - let storage_err = StorageError::InsufficientReadQuorum; - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::SlowDown); - assert!( - s3_err - .message() - .unwrap() - .contains("Storage resources are insufficient for the read operation") - ); - } - - #[test] - fn 
test_to_s3_error_insufficient_write_quorum() { - let storage_err = StorageError::InsufficientWriteQuorum; - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::SlowDown); - assert!( - s3_err - .message() - .unwrap() - .contains("Storage resources are insufficient for the write operation") - ); - } - - #[test] - fn test_to_s3_error_decommission_not_started() { - let storage_err = StorageError::DecommissionNotStarted; - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidArgument); - assert!(s3_err.message().unwrap().contains("Decommission Not Started")); - } - - #[test] - fn test_to_s3_error_decommission_already_running() { - let storage_err = StorageError::DecommissionAlreadyRunning; - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InternalError); - assert!(s3_err.message().unwrap().contains("Decommission already running")); - } - - #[test] - fn test_to_s3_error_volume_not_found() { - let storage_err = StorageError::VolumeNotFound("test-volume".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::NoSuchBucket); - assert!(s3_err.message().unwrap().contains("bucket not found")); - assert!(s3_err.message().unwrap().contains("test-volume")); - } - - #[test] - fn test_to_s3_error_invalid_part() { - let storage_err = StorageError::InvalidPart(1, "expected-part".to_string(), "got-part".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidPart); - assert!(s3_err.message().unwrap().contains("Specified part could not be found")); - assert!(s3_err.message().unwrap().contains("PartNumber")); - assert!(s3_err.message().unwrap().contains("expected-part")); - assert!(s3_err.message().unwrap().contains("got-part")); - } - - #[test] - fn test_to_s3_error_done_for_now() { - let storage_err = StorageError::DoneForNow; - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InternalError); - assert!(s3_err.message().unwrap().contains("DoneForNow")); - } - - #[test] - fn test_to_s3_error_non_storage_error() { - // Test with a non-StorageError - let err = Error::from_string("Generic error message".to_string()); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InternalError); - assert!(s3_err.message().unwrap().contains("ec err")); - assert!(s3_err.message().unwrap().contains("Generic error message")); - } - - #[test] - fn test_to_s3_error_with_unicode_strings() { - let storage_err = StorageError::BucketNotFound("test-bucket".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::NoSuchBucket); - assert!(s3_err.message().unwrap().contains("bucket not found")); - assert!(s3_err.message().unwrap().contains("test-bucket")); - } - - #[test] - fn test_to_s3_error_with_special_characters() { - let storage_err = StorageError::ObjectNameInvalid("bucket-with-@#$%".to_string(), "object-with-!@#$%^&*()".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::InvalidArgument); - assert!(s3_err.message().unwrap().contains("invalid object name")); - assert!(s3_err.message().unwrap().contains("bucket-with-@#$%")); - 
assert!(s3_err.message().unwrap().contains("object-with-!@#$%^&*()")); - } - - #[test] - fn test_to_s3_error_with_empty_strings() { - let storage_err = StorageError::BucketNotFound("".to_string()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::NoSuchBucket); - assert!(s3_err.message().unwrap().contains("bucket not found")); - } - - #[test] - fn test_to_s3_error_with_very_long_strings() { - let long_bucket_name = "a".repeat(1000); - let storage_err = StorageError::BucketNotFound(long_bucket_name.clone()); - let err = Error::new(storage_err); - let s3_err = to_s3_error(err); - - assert_eq!(*s3_err.code(), S3ErrorCode::NoSuchBucket); - assert!(s3_err.message().unwrap().contains("bucket not found")); - assert!(s3_err.message().unwrap().contains(&long_bucket_name)); - } -} diff --git a/rustfs/src/storage/tonic_service.rs b/rustfs/src/storage/tonic_service.rs index eebe1c74..70e8b6fa 100644 --- a/rustfs/src/storage/tonic_service.rs +++ b/rustfs/src/storage/tonic_service.rs @@ -16,7 +16,7 @@ use bytes::Bytes; use futures::Stream; use futures_util::future::join_all; use rmp_serde::{Deserializer, Serializer}; -use rustfs_common::{globals::GLOBAL_Local_Node_Name, heal_channel::HealOpts}; +use rustfs_common::{GLOBAL_LOCAL_NODE_NAME, heal_channel::HealOpts}; use rustfs_ecstore::{ admin_server_info::get_local_server_property, bucket::{metadata::load_bucket_metadata, metadata_sys}, @@ -1072,6 +1072,29 @@ impl Node for NodeService { })) } } + async fn read_metadata(&self, request: Request) -> Result, Status> { + let request = request.into_inner(); + if let Some(disk) = self.find_disk(&request.disk).await { + match disk.read_metadata(&request.volume, &request.path).await { + Ok(data) => Ok(Response::new(ReadMetadataResponse { + success: true, + data, + error: None, + })), + Err(err) => Ok(Response::new(ReadMetadataResponse { + success: false, + data: Bytes::new(), + error: Some(err.into()), + })), + } + } else { + Ok(Response::new(ReadMetadataResponse { + success: false, + data: Bytes::new(), + error: Some(DiskError::other("can not find disk".to_string()).into()), + })) + } + } async fn update_metadata(&self, request: Request) -> Result, Status> { let request = request.into_inner(); @@ -1646,7 +1669,7 @@ impl Node for NodeService { } async fn get_net_info(&self, _request: Request) -> Result, Status> { - let addr = GLOBAL_Local_Node_Name.read().await.clone(); + let addr = GLOBAL_LOCAL_NODE_NAME.read().await.clone(); let info = get_net_info(&addr, ""); let mut buf = Vec::new(); if let Err(err) = info.serialize(&mut Serializer::new(&mut buf)) { @@ -1701,7 +1724,7 @@ impl Node for NodeService { &self, _request: Request, ) -> Result, Status> { - let addr = GLOBAL_Local_Node_Name.read().await.clone(); + let addr = GLOBAL_LOCAL_NODE_NAME.read().await.clone(); let info = get_sys_services(&addr); let mut buf = Vec::new(); if let Err(err) = info.serialize(&mut Serializer::new(&mut buf)) { @@ -1719,7 +1742,7 @@ impl Node for NodeService { } async fn get_sys_config(&self, _request: Request) -> Result, Status> { - let addr = GLOBAL_Local_Node_Name.read().await.clone(); + let addr = GLOBAL_LOCAL_NODE_NAME.read().await.clone(); let info = get_sys_config(&addr); let mut buf = Vec::new(); if let Err(err) = info.serialize(&mut Serializer::new(&mut buf)) { @@ -1737,7 +1760,7 @@ impl Node for NodeService { } async fn get_sys_errors(&self, _request: Request) -> Result, Status> { - let addr = GLOBAL_Local_Node_Name.read().await.clone(); + let addr = 
GLOBAL_LOCAL_NODE_NAME.read().await.clone(); let info = get_sys_errors(&addr); let mut buf = Vec::new(); if let Err(err) = info.serialize(&mut Serializer::new(&mut buf)) { @@ -1755,7 +1778,7 @@ impl Node for NodeService { } async fn get_mem_info(&self, _request: Request) -> Result, Status> { - let addr = GLOBAL_Local_Node_Name.read().await.clone(); + let addr = GLOBAL_LOCAL_NODE_NAME.read().await.clone(); let info = get_mem_info(&addr); let mut buf = Vec::new(); if let Err(err) = info.serialize(&mut Serializer::new(&mut buf)) { @@ -1798,7 +1821,7 @@ impl Node for NodeService { } async fn get_proc_info(&self, _request: Request) -> Result, Status> { - let addr = GLOBAL_Local_Node_Name.read().await.clone(); + let addr = GLOBAL_LOCAL_NODE_NAME.read().await.clone(); let info = get_proc_info(&addr); let mut buf = Vec::new(); if let Err(err) = info.serialize(&mut Serializer::new(&mut buf)) { diff --git a/scripts/dev_deploy.sh b/scripts/dev_deploy.sh index 23da85a0..c73b9ce1 100755 --- a/scripts/dev_deploy.sh +++ b/scripts/dev_deploy.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # Copyright 2024 RustFS Team # # Licensed under the Apache License, Version 2.0 (the "License"); diff --git a/scripts/dev_rustfs.sh b/scripts/dev_rustfs.sh index 11ce4389..7a69e1e2 100644 --- a/scripts/dev_rustfs.sh +++ b/scripts/dev_rustfs.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # Copyright 2024 RustFS Team # # Licensed under the Apache License, Version 2.0 (the "License"); diff --git a/scripts/e2e-run.sh b/scripts/e2e-run.sh index 9127fd0c..b518c598 100755 --- a/scripts/e2e-run.sh +++ b/scripts/e2e-run.sh @@ -1,4 +1,5 @@ -#!/bin/bash -ex +#!/usr/bin/env bash +set -ex # Copyright 2024 RustFS Team # # Licensed under the Apache License, Version 2.0 (the "License"); diff --git a/scripts/install-flatc.sh b/scripts/install-flatc.sh index 1f95a9cc..b787b8a4 100755 --- a/scripts/install-flatc.sh +++ b/scripts/install-flatc.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # Install flatc 25.9.23 on macOS set -e diff --git a/scripts/install-protoc.sh b/scripts/install-protoc.sh index dfb52a0a..3d85cf21 100755 --- a/scripts/install-protoc.sh +++ b/scripts/install-protoc.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # Install protoc 33.1 on macOS set -e diff --git a/scripts/notify.sh b/scripts/notify.sh index 49aedaf7..1acbcea2 100755 --- a/scripts/notify.sh +++ b/scripts/notify.sh @@ -1,4 +1,5 @@ -#!/bin/bash -e +#!/usr/bin/env bash +set -e # Copyright 2024 RustFS Team # # Licensed under the Apache License, Version 2.0 (the "License"); diff --git a/scripts/run.sh b/scripts/run.sh index 2b75d326..59374340 100755 --- a/scripts/run.sh +++ b/scripts/run.sh @@ -1,4 +1,6 @@ -#!/bin/bash -e +#!/usr/bin/env bash +set -e + # Copyright 2024 RustFS Team # # Licensed under the Apache License, Version 2.0 (the "License"); @@ -34,7 +36,7 @@ mkdir -p ./target/volume/test{1..4} if [ -z "$RUST_LOG" ]; then export RUST_BACKTRACE=1 - export RUST_LOG="rustfs=debug,ecstore=info,s3s=debug,iam=info" + export RUST_LOG="rustfs=debug,ecstore=info,s3s=debug,iam=info,notify=info" fi # export RUSTFS_ERASURE_SET_DRIVE_COUNT=5 @@ -77,7 +79,7 @@ export RUSTFS_OBS_LOG_FLUSH_MS=300 # Log flush interval in milliseconds #tokio runtime export RUSTFS_RUNTIME_WORKER_THREADS=16 export RUSTFS_RUNTIME_MAX_BLOCKING_THREADS=1024 -export RUSTFS_RUNTIME_THREAD_PRINT_ENABLED=true +export RUSTFS_RUNTIME_THREAD_PRINT_ENABLED=false # shellcheck disable=SC2125 export RUSTFS_RUNTIME_THREAD_STACK_SIZE=1024*1024 export RUSTFS_RUNTIME_THREAD_KEEP_ALIVE=60 
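
The script diffs above switch from `#!/bin/bash -e`/`-ex` to `#!/usr/bin/env bash` with an explicit `set -e`/`set -ex` line; this is the portable form, since Linux passes everything after the interpreter path in a shebang as a single argument, so flags cannot reliably ride on an `env`-style shebang.

Further up, `tonic_service.rs` gains a `read_metadata` RPC. Below is a minimal sketch of the response pattern it follows, with plain structs standing in for the generated protobuf types and a hypothetical `read_from_disk` helper (both are illustrative assumptions, not the real API):

```rust
// Sketch only: simplified stand-ins for the generated ReadMetadataResponse
// and for the disk lookup used by the real handler.
#[derive(Debug, Default)]
struct ReadMetadataResponse {
    success: bool,
    data: Vec<u8>,
    error: Option<String>,
}

// Hypothetical lookup: returns metadata bytes or an error message.
async fn read_from_disk(volume: &str, path: &str) -> Result<Vec<u8>, String> {
    if volume.is_empty() || path.is_empty() {
        return Err("can not find disk".to_string());
    }
    Ok(format!("{volume}/{path}").into_bytes())
}

// Mirrors the handler's shape: failures are carried inside the response
// (success = false, error = Some(..)) rather than surfacing as a
// transport-level gRPC Status error.
async fn read_metadata(volume: &str, path: &str) -> ReadMetadataResponse {
    match read_from_disk(volume, path).await {
        Ok(data) => ReadMetadataResponse { success: true, data, error: None },
        Err(err) => ReadMetadataResponse { success: false, data: Vec::new(), error: Some(err) },
    }
}

#[tokio::main] // assumes tokio, which the tonic-based service already uses
async fn main() {
    let resp = read_metadata("vol0", "bucket/object/xl.meta").await;
    println!("success={} bytes={}", resp.success, resp.data.len());
}
```

This keeps the RPC itself infallible at the transport level; callers branch on `success` instead of catching a `Status` error.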
@@ -88,31 +90,87 @@ export OTEL_INSTRUMENTATION_VERSION="0.1.1" export OTEL_INSTRUMENTATION_SCHEMA_URL="https://opentelemetry.io/schemas/1.31.0" export OTEL_INSTRUMENTATION_ATTRIBUTES="env=production" -# notify -export RUSTFS_NOTIFY_WEBHOOK_ENABLE="on" # Whether to enable webhook notification -export RUSTFS_NOTIFY_WEBHOOK_ENDPOINT="http://[::]:3020/webhook" # Webhook notification address -export RUSTFS_NOTIFY_WEBHOOK_QUEUE_DIR="$current_dir/deploy/logs/notify" +# # notify +# export RUSTFS_NOTIFY_WEBHOOK_ENABLE="on" # Whether to enable webhook notification +# export RUSTFS_NOTIFY_WEBHOOK_ENDPOINT="http://[::]:3020/webhook" # Webhook notification address +# export RUSTFS_NOTIFY_WEBHOOK_QUEUE_DIR="$current_dir/deploy/logs/notify" -export RUSTFS_NOTIFY_WEBHOOK_ENABLE_PRIMARY="on" # Whether to enable webhook notification -export RUSTFS_NOTIFY_WEBHOOK_ENDPOINT_PRIMARY="http://[::]:3020/webhook" # Webhook notification address -export RUSTFS_NOTIFY_WEBHOOK_QUEUE_DIR_PRIMARY="$current_dir/deploy/logs/notify" +# export RUSTFS_NOTIFY_WEBHOOK_ENABLE_PRIMARY="on" # Whether to enable webhook notification +# export RUSTFS_NOTIFY_WEBHOOK_ENDPOINT_PRIMARY="http://[::]:3020/webhook" # Webhook notification address +# export RUSTFS_NOTIFY_WEBHOOK_QUEUE_DIR_PRIMARY="$current_dir/deploy/logs/notify" -export RUSTFS_NOTIFY_WEBHOOK_ENABLE_MASTER="on" # Whether to enable webhook notification -export RUSTFS_NOTIFY_WEBHOOK_ENDPOINT_MASTER="http://[::]:3020/webhook" # Webhook notification address -export RUSTFS_NOTIFY_WEBHOOK_QUEUE_DIR_MASTER="$current_dir/deploy/logs/notify" +# export RUSTFS_NOTIFY_WEBHOOK_ENABLE_MASTER="on" # Whether to enable webhook notification +# export RUSTFS_NOTIFY_WEBHOOK_ENDPOINT_MASTER="http://[::]:3020/webhook" # Webhook notification address +# export RUSTFS_NOTIFY_WEBHOOK_QUEUE_DIR_MASTER="$current_dir/deploy/logs/notify" + +# export RUSTFS_AUDIT_WEBHOOK_ENABLE="on" # Whether to enable webhook audit +# export RUSTFS_AUDIT_WEBHOOK_ENDPOINT="http://[::]:3020/webhook" # Webhook audit address +# export RUSTFS_AUDIT_WEBHOOK_QUEUE_DIR="$current_dir/deploy/logs/audit" + +# export RUSTFS_AUDIT_WEBHOOK_ENABLE_PRIMARY="on" # Whether to enable webhook audit +# export RUSTFS_AUDIT_WEBHOOK_ENDPOINT_PRIMARY="http://[::]:3020/webhook" # Webhook audit address +# export RUSTFS_AUDIT_WEBHOOK_QUEUE_DIR_PRIMARY="$current_dir/deploy/logs/audit" + +# export RUSTFS_AUDIT_WEBHOOK_ENABLE_MASTER="on" # Whether to enable webhook audit +# export RUSTFS_AUDIT_WEBHOOK_ENDPOINT_MASTER="http://[::]:3020/webhook" # Webhook audit address +# export RUSTFS_AUDIT_WEBHOOK_QUEUE_DIR_MASTER="$current_dir/deploy/logs/audit" # export RUSTFS_POLICY_PLUGIN_URL="http://localhost:8181/v1/data/rustfs/authz/allow" # The URL of the OPA system # export RUSTFS_POLICY_PLUGIN_AUTH_TOKEN="your-opa-token" # The authentication token for the OPA system is optional export RUSTFS_NS_SCANNER_INTERVAL=60 # Object scanning interval in seconds -# exportRUSTFS_SKIP_BACKGROUND_TASK=true +# export RUSTFS_SKIP_BACKGROUND_TASK=true -# export RUSTFS_COMPRESSION_ENABLED=true # Whether to enable compression +# Storage level compression (compression at object storage level) +# export RUSTFS_COMPRESSION_ENABLED=true # Whether to enable storage-level compression for objects + +# HTTP Response Compression (whitelist-based, aligned with MinIO) +# By default, HTTP response compression is DISABLED (aligned with MinIO behavior) +# When enabled, only explicitly configured file types will be compressed +# This preserves Content-Length headers for better browser 
download experience + +# Enable HTTP response compression +# export RUSTFS_COMPRESS_ENABLE=on + +# Example 1: Compress text files and logs +# Suitable for log files, text documents, CSV files +# export RUSTFS_COMPRESS_ENABLE=on +# export RUSTFS_COMPRESS_EXTENSIONS=.txt,.log,.csv +# export RUSTFS_COMPRESS_MIME_TYPES=text/* +# export RUSTFS_COMPRESS_MIN_SIZE=1000 + +# Example 2: Compress JSON and XML API responses +# Suitable for API services that return JSON/XML data +# export RUSTFS_COMPRESS_ENABLE=on +# export RUSTFS_COMPRESS_EXTENSIONS=.json,.xml +# export RUSTFS_COMPRESS_MIME_TYPES=application/json,application/xml +# export RUSTFS_COMPRESS_MIN_SIZE=1000 + +# Example 3: Comprehensive web content compression +# Suitable for web applications (HTML, CSS, JavaScript, JSON) +# export RUSTFS_COMPRESS_ENABLE=on +# export RUSTFS_COMPRESS_EXTENSIONS=.html,.css,.js,.json,.xml,.txt,.svg +# export RUSTFS_COMPRESS_MIME_TYPES=text/*,application/json,application/xml,application/javascript,image/svg+xml +# export RUSTFS_COMPRESS_MIN_SIZE=1000 + +# Example 4: Compress only large text files (minimum 10KB) +# Useful when you want to avoid compression overhead for small files +# export RUSTFS_COMPRESS_ENABLE=on +# export RUSTFS_COMPRESS_EXTENSIONS=.txt,.log +# export RUSTFS_COMPRESS_MIME_TYPES=text/* +# export RUSTFS_COMPRESS_MIN_SIZE=10240 + +# Notes: +# - Only files matching EITHER extensions OR MIME types will be compressed (whitelist approach) +# - Error responses (4xx, 5xx) are never compressed to avoid Content-Length issues +# - Already encoded content (gzip, br, deflate, zstd) is automatically skipped +# - Minimum size threshold prevents compression of small files where overhead > benefit +# - Wildcard patterns supported in MIME types (e.g., text/* matches text/plain, text/html, etc.) 
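
The whitelist rules spelled out in the notes above reduce to a small decision function. The sketch below is a hypothetical helper (names and signature are illustrative, not the actual middleware) that applies the same checks: the status and encoding guards, the size threshold, and an extension-or-MIME whitelist with `type/*` wildcards:

```rust
// Hypothetical sketch of the whitelist check described above; the real
// middleware lives in the RustFS HTTP layer, this only illustrates the rules.
fn should_compress(
    extensions: &[&str],   // RUSTFS_COMPRESS_EXTENSIONS, e.g. [".txt", ".log"]
    mime_types: &[&str],   // RUSTFS_COMPRESS_MIME_TYPES, e.g. ["text/*", "application/json"]
    min_size: u64,         // RUSTFS_COMPRESS_MIN_SIZE
    path: &str,
    content_type: &str,
    content_length: u64,
    status: u16,
    content_encoding: Option<&str>,
) -> bool {
    // Error responses (4xx/5xx) are never compressed, keeping Content-Length intact.
    if status >= 400 {
        return false;
    }
    // Already-encoded bodies (gzip, br, deflate, zstd) are skipped.
    if content_encoding.is_some() {
        return false;
    }
    // Below the threshold the overhead outweighs the benefit.
    if content_length < min_size {
        return false;
    }
    // Whitelist: extension match OR MIME-type match, with `type/*` wildcards.
    let ext_match = extensions.iter().any(|ext| path.ends_with(ext));
    let mime_match = mime_types.iter().any(|pat| match pat.strip_suffix('*') {
        Some(prefix) => content_type.starts_with(prefix), // "text/*" -> "text/"
        None => content_type == *pat,
    });
    ext_match || mime_match
}

fn main() {
    // Mirrors "Example 1" above: text files of at least 1000 bytes.
    assert!(should_compress(
        &[".txt", ".log", ".csv"], &["text/*"], 1000,
        "reports/2024.csv", "text/csv", 4096, 200, None,
    ));
    // Below the size threshold: left uncompressed.
    assert!(!should_compress(
        &[".txt"], &["text/*"], 1000,
        "notes.txt", "text/plain", 200, 200, None,
    ));
    println!("whitelist sketch ok");
}
```

With the default (compression disabled), no check runs at all; enabling `RUSTFS_COMPRESS_ENABLE=on` compresses only what the whitelist admits, which is why Content-Length stays intact for everything else.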
#export RUSTFS_REGION="us-east-1" -export RUSTFS_ENABLE_SCANNER=false +export RUSTFS_ENABLE_SCANNER=true export RUSTFS_ENABLE_HEAL=false @@ -125,6 +183,9 @@ export RUSTFS_ENABLE_PROFILING=false # Heal configuration queue size export RUSTFS_HEAL_QUEUE_SIZE=10000 +# rustfs trust system CA certificates +export RUSTFS_TRUST_SYSTEM_CA=true + if [ -n "$1" ]; then export RUSTFS_VOLUMES="$1" fi @@ -153,5 +214,4 @@ fi # To run in release mode, use the following line #cargo run --profile release --bin rustfs # To run in debug mode, use the following line -cargo run --bin rustfs - +cargo run --bin rustfs \ No newline at end of file diff --git a/scripts/run_e2e_tests.sh b/scripts/run_e2e_tests.sh index c9e0894d..754782f1 100755 --- a/scripts/run_e2e_tests.sh +++ b/scripts/run_e2e_tests.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # E2E Test Runner Script # Automatically starts RustFS instance, runs tests, and cleans up diff --git a/scripts/run_scanner_benchmarks.sh b/scripts/run_scanner_benchmarks.sh index bbf68530..dce92f2b 100755 --- a/scripts/run_scanner_benchmarks.sh +++ b/scripts/run_scanner_benchmarks.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # Scanner performance benchmark runner # Usage: ./scripts/run_scanner_benchmarks.sh [test_type] [quick] diff --git a/scripts/setup-test-binaries.sh b/scripts/setup-test-binaries.sh index f3f01662..fa2389b0 100755 --- a/scripts/setup-test-binaries.sh +++ b/scripts/setup-test-binaries.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # Setup test binaries for Docker build testing # This script creates temporary binary files for testing Docker build process diff --git a/scripts/test.sh b/scripts/test.sh index b4e1c68a..cca9e750 100755 --- a/scripts/test.sh +++ b/scripts/test.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # Copyright 2024 RustFS Team # # Licensed under the Apache License, Version 2.0 (the "License"); diff --git a/scripts/test/delete_xldir.sh b/scripts/test/delete_xldir.sh index 8b6896cd..ad422668 100755 --- a/scripts/test/delete_xldir.sh +++ b/scripts/test/delete_xldir.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # Delete all directories ending with __XLDIR__ in the specified path diff --git a/scripts/test/delete_xldir_simple.sh b/scripts/test/delete_xldir_simple.sh index 04d4406e..493e88e6 100755 --- a/scripts/test/delete_xldir_simple.sh +++ b/scripts/test/delete_xldir_simple.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash # Simple version: Delete all directories ending with __XLDIR__ in the specified path
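
Circling back to the `to_s3_error` tests earlier in this diff: the assertions pin a straightforward StorageError-to-S3ErrorCode mapping. The sketch below restates a few representative arms using a trimmed stand-in enum (the real `StorageError` lives in the storage crates and has many more variants); the codes come directly from the test expectations:

```rust
use s3s::S3ErrorCode;

// Trimmed stand-in for the real StorageError; only a handful of variants shown.
#[derive(Debug)]
enum StorageError {
    NotImplemented,
    BucketNotFound(String),
    BucketNotEmpty(String),
    ObjectNotFound(String, String),
    VersionNotFound(String, String, String),
    StorageFull,
    SlowDown,
    DoneForNow,
}

// Each storage-layer error maps to the S3 error code the API layer must return;
// anything unrecognized falls back to InternalError (or NoSuchKey for
// file-not-found), as the tail of to_s3_error above shows.
fn to_s3_code(err: &StorageError) -> S3ErrorCode {
    match err {
        StorageError::NotImplemented => S3ErrorCode::NotImplemented,
        StorageError::BucketNotFound(_) => S3ErrorCode::NoSuchBucket,
        StorageError::BucketNotEmpty(_) => S3ErrorCode::BucketNotEmpty,
        StorageError::ObjectNotFound(_, _) => S3ErrorCode::NoSuchKey,
        StorageError::VersionNotFound(_, _, _) => S3ErrorCode::NoSuchVersion,
        StorageError::StorageFull => S3ErrorCode::ServiceUnavailable,
        StorageError::SlowDown => S3ErrorCode::SlowDown,
        StorageError::DoneForNow => S3ErrorCode::InternalError,
    }
}

fn main() {
    assert_eq!(to_s3_code(&StorageError::BucketNotFound("b".into())), S3ErrorCode::NoSuchBucket);
    assert_eq!(to_s3_code(&StorageError::SlowDown), S3ErrorCode::SlowDown);
    println!("mapping sketch ok");
}
```

The production function additionally carries the bucket, object, and version names into the error message, which is what most of the test assertions check for.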