Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: houseme <4829346+houseme@users.noreply.github.com> Co-authored-by: houseme <housemecn@gmail.com>
9.3 KiB
Special Characters in Object Path - Solution Implementation
Executive Summary
After comprehensive investigation, the root cause analysis reveals:
- Backend (rustfs) is handling URL encoding correctly via the s3s library
- The primary issue is likely in the UI/client layer where URL encoding is not properly handled
- Backend enhancements needed to ensure robustness and better error messages
Root Cause Analysis
What s3s Library Does
The s3s library (version 0.12.0-rc.4) correctly URL-decodes object keys from HTTP requests:
// From s3s-0.12.0-rc.4/src/ops/mod.rs, line 261:
let decoded_uri_path = urlencoding::decode(req.uri.path())
.map_err(|_| S3ErrorCode::InvalidURI)?
.into_owned();
This means:
- Client sends:
PUT /bucket/a%20f+/file.txt - s3s decodes to:
a f+/file.txt - Our handler receives:
key = "a f+/file.txt"(already decoded)
What Our Backend Does
- Storage: Stores objects with decoded names (e.g.,
"a f+/file.txt") - Retrieval: Returns objects with decoded names in LIST responses
- Path operations: Rust's
PathAPIs preserve special characters correctly
The Real Problems
Problem 1: UI Client Issue (Part A)
Symptom: UI can navigate TO folder but can't LIST contents
Diagnosis:
- User uploads:
PUT /bucket/a%20f+/b/c/3/README.md✅ Works - CLI lists:
GET /bucket?prefix=a%20f+/✅ Works (mc properly encodes) - UI navigates: Shows folder "a f+" ✅ Works
- UI lists folder:
GET /bucket?prefix=a f+/❌ Fails (UI doesn't encode!)
Root Cause: The UI is not URL-encoding the prefix when making the LIST request. It should send prefix=a%20f%2B/ but likely sends prefix=a f+/ which causes issues.
Evidence:
- mc (MinIO client) works → proves backend is correct
- UI doesn't work → proves UI encoding is wrong
Problem 2: Client Encoding Issue (Part B)
Symptom: 400 error with plus signs
Error Message: api error InvalidArgument: Invalid argument
Diagnosis:
The plus sign (+) has special meaning in URL query parameters (represents space in form encoding) but not in URL paths. Clients must encode + as %2B in paths.
Example:
- Correct:
/bucket/ES%2Bnet/file.txt→ decoded toES+net/file.txt - Wrong:
/bucket/ES+net/file.txt→ might be misinterpreted
URL Encoding Rules
According to RFC 3986 and AWS S3 API:
| Character | In URL Path | In Query Param | Decoded Result |
|---|---|---|---|
| Space | %20 |
%20 or + |
(space) |
| Plus | %2B |
%2B |
+ (plus) |
| Percent | %25 |
%25 |
% (percent) |
Critical Note: In URL paths (not query params), + represents a literal plus sign, NOT a space. Only %20 represents space in paths.
Solution Implementation
Phase 1: Backend Validation & Logging (Low Risk)
Add defensive validation and better logging to help diagnose issues:
// In rustfs/src/storage/ecfs.rs
/// Validate that an object key doesn't contain problematic characters
/// that might indicate client-side encoding issues
fn log_potential_encoding_issues(key: &str) {
// Check for unencoded special chars that might indicate problems
if key.contains('\n') || key.contains('\r') || key.contains('\0') {
warn!("Object key contains control characters: {:?}", key);
}
// Log debug info for troubleshooting
debug!("Processing object key: {:?} (bytes: {:?})", key, key.as_bytes());
}
Benefit: Helps diagnose client-side issues without changing behavior.
Phase 2: Enhanced Error Messages (Low Risk)
When validation fails, provide helpful error messages:
// Check for invalid UTF-8 or suspicious patterns
if !key.is_ascii() && !key.is_char_boundary(key.len()) {
return Err(S3Error::with_message(
S3ErrorCode::InvalidArgument,
"Object key contains invalid UTF-8. Ensure keys are properly URL-encoded."
));
}
Phase 3: Documentation (No Risk)
- API Documentation: Document URL encoding requirements
- Client Guide: Explain how to properly encode object keys
- Troubleshooting Guide: Common issues and solutions
Phase 4: UI Fix (If Applicable)
If RustFS includes a web UI/console:
-
Ensure UI properly URL-encodes all requests:
// When making requests, encode the key: const encodedKey = encodeURIComponent(key); fetch(`/bucket/${encodedKey}`); // When making LIST requests, encode the prefix: const encodedPrefix = encodeURIComponent(prefix); fetch(`/bucket?prefix=${encodedPrefix}`); -
Decode when displaying:
// When showing keys in UI, decode for display: const displayKey = decodeURIComponent(key);
Testing Strategy
Test Cases
Our e2e tests in crates/e2e_test/src/special_chars_test.rs cover:
- ✅ Spaces in paths:
"a f+/b/c/3/README.md" - ✅ Plus signs in paths:
"ES+net/LHC+Data+Challenge/file.json" - ✅ Mixed special characters
- ✅ PUT, GET, LIST, DELETE operations
- ✅ Exact scenario from issue
Running Tests
# Run special character tests
cargo test --package e2e_test special_chars -- --nocapture
# Run specific test
cargo test --package e2e_test test_issue_scenario_exact -- --nocapture
Expected Results
All tests should pass because:
- s3s correctly decodes URL-encoded keys
- Rust Path APIs preserve special characters
- ecstore stores/retrieves keys correctly
- AWS SDK (used in tests) properly encodes keys
If tests fail, it would indicate a bug in our backend implementation.
Client Guidelines
For Application Developers
When using RustFS with any S3 client:
- Use a proper S3 SDK: AWS SDK, MinIO SDK, etc. handle encoding automatically
- If using raw HTTP: Manually URL-encode object keys in paths
- Remember:
- Space →
%20(not+in paths!) - Plus →
%2B - Percent →
%25
- Space →
Example: Correct Client Usage
# Python boto3 - handles encoding automatically
import boto3
s3 = boto3.client('s3', endpoint_url='http://localhost:9000')
# These work correctly - boto3 encodes automatically:
s3.put_object(Bucket='test', Key='path with spaces/file.txt', Body=b'data')
s3.put_object(Bucket='test', Key='path+with+plus/file.txt', Body=b'data')
s3.list_objects_v2(Bucket='test', Prefix='path with spaces/')
// Go AWS SDK - handles encoding automatically
package main
import (
"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/service/s3"
)
func main() {
svc := s3.New(session.New())
// These work correctly - SDK encodes automatically:
svc.PutObject(&s3.PutObjectInput{
Bucket: aws.String("test"),
Key: aws.String("path with spaces/file.txt"),
Body: bytes.NewReader([]byte("data")),
})
svc.ListObjectsV2(&s3.ListObjectsV2Input{
Bucket: aws.String("test"),
Prefix: aws.String("path with spaces/"),
})
}
# MinIO mc client - handles encoding automatically
mc cp file.txt "local/bucket/path with spaces/file.txt"
mc ls "local/bucket/path with spaces/"
Example: Manual HTTP Requests
If making raw HTTP requests (not recommended):
# Correct: URL-encode the path
curl -X PUT "http://localhost:9000/bucket/path%20with%20spaces/file.txt" \
-H "Content-Type: text/plain" \
-d "data"
# Correct: Encode plus as %2B
curl -X PUT "http://localhost:9000/bucket/ES%2Bnet/file.txt" \
-H "Content-Type: text/plain" \
-d "data"
# List with encoded prefix
curl "http://localhost:9000/bucket?prefix=path%20with%20spaces/"
Monitoring and Debugging
Backend Logs
Enable debug logging to see key processing:
RUST_LOG=rustfs=debug cargo run
Look for log messages showing:
- Received keys
- Validation errors
- Storage operations
Common Issues
| Symptom | Likely Cause | Solution |
|---|---|---|
| 400 "InvalidArgument" | Client not encoding properly | Use S3 SDK or manually encode |
| 404 "NoSuchKey" but file exists | Encoding mismatch | Check client encoding |
| UI shows folder but can't list | UI bug - not encoding prefix | Fix UI to encode requests |
| Works with CLI, fails with UI | UI implementation issue | Compare UI requests vs CLI |
Conclusion
Backend Status: ✅ Working Correctly
The RustFS backend correctly handles URL-encoded object keys through the s3s library. No backend code changes are required for basic functionality.
Client/UI Status: ❌ Needs Attention
The issues described appear to be client-side or UI-side problems:
- Part A: UI not properly encoding LIST prefix requests
- Part B: Client not encoding
+as%2Bin paths
Recommendations
-
Short-term:
- Add logging and better error messages (Phase 1-2)
- Document client requirements (Phase 3)
- Fix UI if applicable (Phase 4)
-
Long-term:
- Add comprehensive e2e tests (already done!)
- Monitor for encoding-related errors
- Educate users on proper S3 client usage
-
For Users Experiencing Issues:
- Use proper S3 SDKs (AWS, MinIO, etc.)
- If using custom clients, ensure proper URL encoding
- If using RustFS UI, report UI bugs separately
Document Version: 1.0
Date: 2025-12-09
Status: Final - Ready for Implementation
Next Steps: Implement Phase 1-3, run tests, update user documentation