rustfs/docs/compression-best-practices.md
HTTP Response Compression Best Practices in RustFS

Overview

This document outlines best practices for HTTP response compression in RustFS, based on lessons learned from fixing the NoSuchKey error response regression (Issue #901).

Key Principles

1. Never Compress Error Responses

Rationale: Error responses are typically small (100-500 bytes) and need to be transmitted accurately. Compression can:

  • Introduce Content-Length header mismatches
  • Add unnecessary overhead for small payloads
  • Potentially corrupt error details during buffering

Implementation:

```rust
// Always check the status code first
if status.is_client_error() || status.is_server_error() {
    return false; // Don't compress error responses
}
```

Affected Status Codes:

  • 4xx Client Errors (400, 403, 404, etc.)
  • 5xx Server Errors (500, 502, 503, etc.)

2. Size-Based Compression Threshold

Rationale: Compression costs CPU time and adds buffering overhead. For very small responses:

  • Compression overhead > space savings
  • May actually increase payload size
  • Adds latency without benefit

Recommended Threshold: 256 bytes minimum

Implementation:

```rust
if let Some(value) = response.headers().get(CONTENT_LENGTH) {
    // Header values are bytes; parse to u64, ignoring malformed values
    if let Some(length) = value.to_str().ok().and_then(|s| s.parse::<u64>().ok()) {
        if length < 256 {
            return false; // Don't compress small responses
        }
    }
}
```

3. Maintain Observability

Rationale: Compression decisions can affect debugging and troubleshooting. Always log when compression is skipped.

Implementation:

```rust
debug!(
    "Skipping compression for error response: status={}",
    status.as_u16()
);
```

Log Analysis:

```bash
# Monitor compression decisions
RUST_LOG=rustfs::server::http=debug ./target/release/rustfs

# Look for patterns
grep "Skipping compression" logs/rustfs.log | wc -l
```

Common Pitfalls

Compressing All Responses Blindly

```rust
// BAD - No filtering
.layer(CompressionLayer::new())
```

Problem: Can cause Content-Length mismatches with error responses.

Using Intelligent Predicates

```rust
// GOOD - Filter based on status and size
.layer(CompressionLayer::new().compress_when(ShouldCompress))
```

Ignoring Content-Length Header

```rust
// BAD - Only checking status
fn should_compress<B>(&self, response: &Response<B>) -> bool {
    !response.status().is_client_error()
}
```

Problem: May compress tiny responses unnecessarily, and still compresses 5xx error responses.

Checking Both Status and Size

```rust
// GOOD - Multi-criteria decision
fn should_compress<B>(&self, response: &Response<B>) -> bool {
    // Check status: never compress 4xx/5xx
    let status = response.status();
    if status.is_client_error() || status.is_server_error() {
        return false;
    }

    // Check size against the 256-byte threshold
    if get_content_length(response) < 256 {
        return false;
    }

    true
}
```
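The same two checks can be exercised without any HTTP framework. The sketch below is illustrative, working on a plain status code and an optional parsed Content-Length; it is not RustFS's actual `ShouldCompress` type:

```rust
// Minimal, framework-free sketch of the compression decision.
// `status` is the HTTP status code; `content_length` is the parsed
// Content-Length header, if present (streaming responses may omit it).
fn should_compress(status: u16, content_length: Option<u64>) -> bool {
    // Never compress error responses (4xx/5xx)
    if status >= 400 {
        return false;
    }
    // Skip responses known to be below the 256-byte threshold
    if let Some(len) = content_length {
        if len < 256 {
            return false;
        }
    }
    // Unknown length (e.g. streaming): allow compression
    true
}
```

Note that when the length is unknown the sketch opts in to compression; a production predicate might additionally inspect Content-Type.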

Performance Considerations

CPU Usage

  • Compression CPU Cost: ~1-5ms for typical responses
  • Benefit: 70-90% size reduction for text/json
  • Break-even: Responses > 512 bytes on fast networks

Network Latency

  • Savings: Proportional to size reduction
  • Break-even: ~256 bytes on typical connections
  • Diminishing Returns: Below 128 bytes
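A rough back-of-envelope check makes the break-even concrete. All inputs below (compression ratio, link throughput, CPU cost) are illustrative assumptions, not measured RustFS figures:

```rust
// Estimates whether compression saves more transfer time than it
// costs in CPU. All inputs are illustrative assumptions.
fn compression_pays_off(
    size_bytes: f64,        // uncompressed response size
    compressed_ratio: f64,  // compressed_size / original_size, e.g. 0.2
    link_bytes_per_ms: f64, // effective throughput; 125.0 ~= 1 Mbit/s
    cpu_cost_ms: f64,       // time spent compressing
) -> bool {
    let saved_bytes = size_bytes * (1.0 - compressed_ratio);
    let saved_ms = saved_bytes / link_bytes_per_ms;
    saved_ms > cpu_cost_ms
}
```

Under these assumptions, a 10 KB JSON body at 1 Mbit/s easily repays 1 ms of CPU, while a 100-byte error body does not.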

Memory Usage

  • Buffer Size: Usually 4-16KB per connection
  • Trade-off: Memory vs. bandwidth
  • Recommendation: Profile in production

Testing Guidelines

Unit Tests

Test compression predicate logic:

```rust
use http::{header::CONTENT_LENGTH, Response};

#[test]
fn test_should_not_compress_errors() {
    let predicate = ShouldCompress;
    let response = Response::builder()
        .status(404)
        .body(())
        .unwrap();

    assert!(!predicate.should_compress(&response));
}

#[test]
fn test_should_not_compress_small_responses() {
    let predicate = ShouldCompress;
    let response = Response::builder()
        .status(200)
        .header(CONTENT_LENGTH, "100")
        .body(())
        .unwrap();

    assert!(!predicate.should_compress(&response));
}
```

Integration Tests

Test actual S3 API responses:

```rust
#[tokio::test]
async fn test_error_response_not_truncated() {
    let response = client
        .get_object()
        .bucket("test")
        .key("nonexistent")
        .send()
        .await;

    // Should get a proper NoSuchKey error, not a truncation/parse error
    match response.unwrap_err() {
        SdkError::ServiceError(context) => {
            assert!(context.err().is_no_such_key());
        }
        other => panic!("Expected ServiceError, got {other:?}"),
    }
}
```

Monitoring and Alerts

Metrics to Track

  1. Compression Ratio: compressed_size / original_size
  2. Compression Skip Rate: skipped_count / total_count
  3. Error Response Size Distribution
  4. CPU Usage During Compression

Alert Conditions

```yaml
# Prometheus alert rules
- alert: HighCompressionSkipRate
  expr: |
    rate(http_compression_skipped_total[5m])
    / rate(http_responses_total[5m]) > 0.5
  annotations:
    summary: "More than 50% of responses skipping compression"

- alert: LargeErrorResponses
  expr: |
    histogram_quantile(0.95,
      rate(http_error_response_size_bytes_bucket[5m])) > 1024
  annotations:
    summary: "Error responses larger than 1KB"
```

Migration Guide

Updating Existing Code

If you're adding compression to an existing service:

  1. Start Conservative: Only compress responses > 1KB
  2. Monitor Impact: Watch CPU and latency metrics
  3. Lower Threshold Gradually: Test with smaller thresholds
  4. Always Exclude Errors: Never compress 4xx/5xx
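One way to support "start conservative, lower gradually" is to make the threshold configurable rather than hard-coded. A sketch (the setting name and default are hypothetical, not an actual RustFS option):

```rust
// Parses a threshold from an optional config/env string, falling back
// to a conservative 1024-byte default for the initial rollout.
fn compression_threshold(raw: Option<&str>) -> u64 {
    raw.and_then(|v| v.parse().ok()).unwrap_or(1024)
}
```

The raw value could come from, say, an environment variable such as `RUSTFS_COMPRESS_MIN_SIZE` (an illustrative name), letting operators lower the threshold during rollout without a rebuild.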

Rollout Strategy

  1. Stage 1: Deploy to canary (5% traffic)
     • Monitor for 24 hours
     • Check error rates and latency
  2. Stage 2: Expand to 25% traffic
     • Monitor for 48 hours
     • Validate compression ratios
  3. Stage 3: Full rollout (100% traffic)
     • Continue monitoring for 1 week
     • Document any issues

References

  1. Issue #901: NoSuchKey error response regression
  2. Google Web Fundamentals - Text Compression
  3. AWS Best Practices - Response Compression

Last Updated: 2025-11-24
Maintainer: RustFS Team