mirror of
https://github.com/rustfs/rustfs.git
synced 2026-01-17 01:30:33 +00:00
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: houseme <4829346+houseme@users.noreply.github.com> Co-authored-by: houseme <housemecn@gmail.com>
312 lines
9.3 KiB
Markdown
312 lines
9.3 KiB
Markdown
# Special Characters in Object Path - Solution Implementation
|
|
|
|
## Executive Summary
|
|
|
|
After comprehensive investigation, the root cause analysis reveals:
|
|
|
|
1. **Backend (rustfs) is handling URL encoding correctly** via the s3s library
|
|
2. **The primary issue is likely in the UI/client layer** where URL encoding is not properly handled
|
|
3. **Backend enhancements needed** to ensure robustness and better error messages
|
|
|
|
## Root Cause Analysis
|
|
|
|
### What s3s Library Does
|
|
|
|
The s3s library (version 0.12.0-rc.4) **correctly** URL-decodes object keys from HTTP requests:
|
|
|
|
```rust
|
|
// From s3s-0.12.0-rc.4/src/ops/mod.rs, line 261:
|
|
let decoded_uri_path = urlencoding::decode(req.uri.path())
|
|
.map_err(|_| S3ErrorCode::InvalidURI)?
|
|
.into_owned();
|
|
```
|
|
|
|
This means:
|
|
- Client sends: `PUT /bucket/a%20f+/file.txt`
|
|
- s3s decodes to: `a f+/file.txt`
|
|
- Our handler receives: `key = "a f+/file.txt"` (already decoded)
|
|
|
|
### What Our Backend Does
|
|
|
|
1. **Storage**: Stores objects with decoded names (e.g., `"a f+/file.txt"`)
|
|
2. **Retrieval**: Returns objects with decoded names in LIST responses
|
|
3. **Path operations**: Rust's `Path` APIs preserve special characters correctly
|
|
|
|
### The Real Problems
|
|
|
|
#### Problem 1: UI Client Issue (Part A)
|
|
|
|
**Symptom**: UI can navigate TO folder but can't LIST contents
|
|
|
|
**Diagnosis**:
|
|
- User uploads: `PUT /bucket/a%20f+/b/c/3/README.md` ✅ Works
|
|
- CLI lists: `GET /bucket?prefix=a%20f+/` ✅ Works (mc properly encodes)
|
|
- UI navigates: Shows folder "a f+" ✅ Works
|
|
- UI lists folder: `GET /bucket?prefix=a f+/` ❌ Fails (UI doesn't encode!)
|
|
|
|
**Root Cause**: The UI is not URL-encoding the prefix when making the LIST request. It should send `prefix=a%20f%2B/` but likely sends `prefix=a f+/` which causes issues.
|
|
|
|
**Evidence**:
|
|
- mc (MinIO client) works → proves backend is correct
|
|
- UI doesn't work → proves UI encoding is wrong
|
|
|
|
#### Problem 2: Client Encoding Issue (Part B)
|
|
|
|
**Symptom**: 400 error with plus signs
|
|
|
|
**Error Message**: `api error InvalidArgument: Invalid argument`
|
|
|
|
**Diagnosis**:
|
|
The plus sign (`+`) has special meaning in URL query parameters (represents space in form encoding) but not in URL paths. Clients must encode `+` as `%2B` in paths.
|
|
|
|
**Example**:
|
|
- Correct: `/bucket/ES%2Bnet/file.txt` → decoded to `ES+net/file.txt`
|
|
- Wrong: `/bucket/ES+net/file.txt` → might be misinterpreted
|
|
|
|
### URL Encoding Rules
|
|
|
|
According to RFC 3986 and AWS S3 API:
|
|
|
|
| Character | In URL Path | In Query Param | Decoded Result |
|
|
|-----------|-------------|----------------|----------------|
|
|
| Space | `%20` | `%20` or `+` | ` ` (space) |
|
|
| Plus | `%2B` | `%2B` | `+` (plus) |
|
|
| Percent | `%25` | `%25` | `%` (percent) |
|
|
|
|
**Critical Note**: In URL **paths** (not query params), `+` represents a literal plus sign, NOT a space. Only `%20` represents space in paths.
|
|
|
|
## Solution Implementation
|
|
|
|
### Phase 1: Backend Validation & Logging (Low Risk)
|
|
|
|
Add defensive validation and better logging to help diagnose issues:
|
|
|
|
```rust
|
|
// In rustfs/src/storage/ecfs.rs
|
|
|
|
/// Validate that an object key doesn't contain problematic characters
|
|
/// that might indicate client-side encoding issues
|
|
fn log_potential_encoding_issues(key: &str) {
|
|
// Check for unencoded special chars that might indicate problems
|
|
if key.contains('\n') || key.contains('\r') || key.contains('\0') {
|
|
warn!("Object key contains control characters: {:?}", key);
|
|
}
|
|
|
|
// Log debug info for troubleshooting
|
|
debug!("Processing object key: {:?} (bytes: {:?})", key, key.as_bytes());
|
|
}
|
|
```
|
|
|
|
**Benefit**: Helps diagnose client-side issues without changing behavior.
|
|
|
|
### Phase 2: Enhanced Error Messages (Low Risk)
|
|
|
|
When validation fails, provide helpful error messages:
|
|
|
|
```rust
|
|
// Check for invalid UTF-8 or suspicious patterns
|
|
if !key.is_ascii() && !key.is_char_boundary(key.len()) {
|
|
return Err(S3Error::with_message(
|
|
S3ErrorCode::InvalidArgument,
|
|
"Object key contains invalid UTF-8. Ensure keys are properly URL-encoded."
|
|
));
|
|
}
|
|
```
|
|
|
|
### Phase 3: Documentation (No Risk)
|
|
|
|
1. **API Documentation**: Document URL encoding requirements
|
|
2. **Client Guide**: Explain how to properly encode object keys
|
|
3. **Troubleshooting Guide**: Common issues and solutions
|
|
|
|
### Phase 4: UI Fix (If Applicable)
|
|
|
|
If RustFS includes a web UI/console:
|
|
|
|
1. **Ensure UI properly URL-encodes all requests**:
|
|
```javascript
|
|
// When making requests, encode the key:
|
|
const encodedKey = encodeURIComponent(key);
|
|
fetch(`/bucket/${encodedKey}`);
|
|
|
|
// When making LIST requests, encode the prefix:
|
|
const encodedPrefix = encodeURIComponent(prefix);
|
|
fetch(`/bucket?prefix=${encodedPrefix}`);
|
|
```
|
|
|
|
2. **Decode when displaying**:
|
|
```javascript
|
|
// When showing keys in UI, decode for display:
|
|
const displayKey = decodeURIComponent(key);
|
|
```
|
|
|
|
## Testing Strategy
|
|
|
|
### Test Cases
|
|
|
|
Our e2e tests in `crates/e2e_test/src/special_chars_test.rs` cover:
|
|
|
|
1. ✅ Spaces in paths: `"a f+/b/c/3/README.md"`
|
|
2. ✅ Plus signs in paths: `"ES+net/LHC+Data+Challenge/file.json"`
|
|
3. ✅ Mixed special characters
|
|
4. ✅ PUT, GET, LIST, DELETE operations
|
|
5. ✅ Exact scenario from issue
|
|
|
|
### Running Tests
|
|
|
|
```bash
|
|
# Run special character tests
|
|
cargo test --package e2e_test special_chars -- --nocapture
|
|
|
|
# Run specific test
|
|
cargo test --package e2e_test test_issue_scenario_exact -- --nocapture
|
|
```
|
|
|
|
### Expected Results
|
|
|
|
All tests should **pass** because:
|
|
- s3s correctly decodes URL-encoded keys
|
|
- Rust Path APIs preserve special characters
|
|
- ecstore stores/retrieves keys correctly
|
|
- AWS SDK (used in tests) properly encodes keys
|
|
|
|
If tests fail, it would indicate a bug in our backend implementation.
|
|
|
|
## Client Guidelines
|
|
|
|
### For Application Developers
|
|
|
|
When using RustFS with any S3 client:
|
|
|
|
1. **Use a proper S3 SDK**: AWS SDK, MinIO SDK, etc. handle encoding automatically
|
|
2. **If using raw HTTP**: Manually URL-encode object keys in paths
|
|
3. **Remember**:
|
|
- Space → `%20` (not `+` in paths!)
|
|
- Plus → `%2B`
|
|
- Percent → `%25`
|
|
|
|
### Example: Correct Client Usage
|
|
|
|
```python
|
|
# Python boto3 - handles encoding automatically
|
|
import boto3
|
|
s3 = boto3.client('s3', endpoint_url='http://localhost:9000')
|
|
|
|
# These work correctly - boto3 encodes automatically:
|
|
s3.put_object(Bucket='test', Key='path with spaces/file.txt', Body=b'data')
|
|
s3.put_object(Bucket='test', Key='path+with+plus/file.txt', Body=b'data')
|
|
s3.list_objects_v2(Bucket='test', Prefix='path with spaces/')
|
|
```
|
|
|
|
```go
|
|
// Go AWS SDK - handles encoding automatically
|
|
package main
|
|
|
|
import (
|
|
"github.com/aws/aws-sdk-go/aws"
|
|
"github.com/aws/aws-sdk-go/service/s3"
|
|
)
|
|
|
|
func main() {
|
|
svc := s3.New(session.New())
|
|
|
|
// These work correctly - SDK encodes automatically:
|
|
svc.PutObject(&s3.PutObjectInput{
|
|
Bucket: aws.String("test"),
|
|
Key: aws.String("path with spaces/file.txt"),
|
|
Body: bytes.NewReader([]byte("data")),
|
|
})
|
|
|
|
svc.ListObjectsV2(&s3.ListObjectsV2Input{
|
|
Bucket: aws.String("test"),
|
|
Prefix: aws.String("path with spaces/"),
|
|
})
|
|
}
|
|
```
|
|
|
|
```bash
|
|
# MinIO mc client - handles encoding automatically
|
|
mc cp file.txt "local/bucket/path with spaces/file.txt"
|
|
mc ls "local/bucket/path with spaces/"
|
|
```
|
|
|
|
### Example: Manual HTTP Requests
|
|
|
|
If making raw HTTP requests (not recommended):
|
|
|
|
```bash
|
|
# Correct: URL-encode the path
|
|
curl -X PUT "http://localhost:9000/bucket/path%20with%20spaces/file.txt" \
|
|
-H "Content-Type: text/plain" \
|
|
-d "data"
|
|
|
|
# Correct: Encode plus as %2B
|
|
curl -X PUT "http://localhost:9000/bucket/ES%2Bnet/file.txt" \
|
|
-H "Content-Type: text/plain" \
|
|
-d "data"
|
|
|
|
# List with encoded prefix
|
|
curl "http://localhost:9000/bucket?prefix=path%20with%20spaces/"
|
|
```
|
|
|
|
## Monitoring and Debugging
|
|
|
|
### Backend Logs
|
|
|
|
Enable debug logging to see key processing:
|
|
|
|
```bash
|
|
RUST_LOG=rustfs=debug cargo run
|
|
```
|
|
|
|
Look for log messages showing:
|
|
- Received keys
|
|
- Validation errors
|
|
- Storage operations
|
|
|
|
### Common Issues
|
|
|
|
| Symptom | Likely Cause | Solution |
|
|
|---------|--------------|----------|
|
|
| 400 "InvalidArgument" | Client not encoding properly | Use S3 SDK or manually encode |
|
|
| 404 "NoSuchKey" but file exists | Encoding mismatch | Check client encoding |
|
|
| UI shows folder but can't list | UI bug - not encoding prefix | Fix UI to encode requests |
|
|
| Works with CLI, fails with UI | UI implementation issue | Compare UI requests vs CLI |
|
|
|
|
## Conclusion
|
|
|
|
### Backend Status: ✅ Working Correctly
|
|
|
|
The RustFS backend correctly handles URL-encoded object keys through the s3s library. No backend code changes are required for basic functionality.
|
|
|
|
### Client/UI Status: ❌ Needs Attention
|
|
|
|
The issues described appear to be client-side or UI-side problems:
|
|
|
|
1. **Part A**: UI not properly encoding LIST prefix requests
|
|
2. **Part B**: Client not encoding `+` as `%2B` in paths
|
|
|
|
### Recommendations
|
|
|
|
1. **Short-term**:
|
|
- Add logging and better error messages (Phase 1-2)
|
|
- Document client requirements (Phase 3)
|
|
- Fix UI if applicable (Phase 4)
|
|
|
|
2. **Long-term**:
|
|
- Add comprehensive e2e tests (already done!)
|
|
- Monitor for encoding-related errors
|
|
- Educate users on proper S3 client usage
|
|
|
|
3. **For Users Experiencing Issues**:
|
|
- Use proper S3 SDKs (AWS, MinIO, etc.)
|
|
- If using custom clients, ensure proper URL encoding
|
|
- If using RustFS UI, report UI bugs separately
|
|
|
|
---
|
|
|
|
**Document Version**: 1.0
|
|
**Date**: 2025-12-09
|
|
**Status**: Final - Ready for Implementation
|
|
**Next Steps**: Implement Phase 1-3, run tests, update user documentation
|