# Working with Special Characters in Object Names

## Overview

This guide explains how to properly handle special characters (spaces, plus signs, etc.) in S3 object names when using RustFS.

## Quick Reference

| Character | What You Type | How It's Stored | How to Access It |
|-----------|---------------|-----------------|------------------|
| Space | `my file.txt` | `my file.txt` | Use proper S3 client/SDK |
| Plus | `test+file.txt` | `test+file.txt` | Use proper S3 client/SDK |
| Percent | `test%file.txt` | `test%file.txt` | Use proper S3 client/SDK |

**Key Point**: Use a proper S3 SDK or client. They handle URL encoding automatically!

## Recommended Approach: Use S3 SDKs

The easiest and most reliable way to work with object names containing special characters is to use an official S3 SDK. These handle all encoding automatically.

### AWS CLI

```bash
# Works correctly - AWS CLI handles encoding
aws --endpoint-url=http://localhost:9000 s3 cp file.txt "s3://mybucket/path with spaces/file.txt"
aws --endpoint-url=http://localhost:9000 s3 ls "s3://mybucket/path with spaces/"

# Works with plus signs
aws --endpoint-url=http://localhost:9000 s3 cp data.json "s3://mybucket/ES+net/data.json"
```

### MinIO Client (mc)

```bash
# Configure RustFS endpoint
mc alias set myrustfs http://localhost:9000 ACCESS_KEY SECRET_KEY

# Upload with spaces in path
mc cp README.md "myrustfs/mybucket/a f+/b/c/3/README.md"

# List contents
mc ls "myrustfs/mybucket/a f+/"
mc ls "myrustfs/mybucket/a f+/b/c/3/"

# Works with plus signs
mc cp file.txt "myrustfs/mybucket/ES+net/file.txt"
```

### Python (boto3)

```python
import boto3

# Configure client
s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY'
)

# Upload with spaces - boto3 handles encoding automatically
s3.put_object(
    Bucket='mybucket',
    Key='path with spaces/file.txt',
    Body=b'file content'
)

# List objects - boto3 encodes the prefix automatically
response = s3.list_objects_v2(
    Bucket='mybucket',
    Prefix='path with spaces/'
)

for obj in response.get('Contents', []):
    print(obj['Key'])  # Will print: "path with spaces/file.txt"

# Works with plus signs
s3.put_object(
    Bucket='mybucket',
    Key='ES+net/LHC+Data+Challenge/file.json',
    Body=b'data'
)
```

### Go (AWS SDK)

```go
package main

import (
    "bytes"
    "fmt"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/credentials"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

func main() {
    // Configure session
    sess := session.Must(session.NewSession(&aws.Config{
        Endpoint:         aws.String("http://localhost:9000"),
        Region:           aws.String("us-east-1"),
        Credentials:      credentials.NewStaticCredentials("ACCESS_KEY", "SECRET_KEY", ""),
        S3ForcePathStyle: aws.Bool(true),
    }))

    svc := s3.New(sess)

    // Upload with spaces - SDK handles encoding
    _, err := svc.PutObject(&s3.PutObjectInput{
        Bucket: aws.String("mybucket"),
        Key:    aws.String("path with spaces/file.txt"),
        Body:   bytes.NewReader([]byte("content")),
    })
    if err != nil {
        panic(err)
    }

    // List objects - SDK handles encoding
    result, err := svc.ListObjectsV2(&s3.ListObjectsV2Input{
        Bucket: aws.String("mybucket"),
        Prefix: aws.String("path with spaces/"),
    })
    if err != nil {
        panic(err)
    }

    for _, obj := range result.Contents {
        fmt.Println(*obj.Key)
    }
}
```

### Node.js (AWS SDK v3)

```javascript
const { S3Client, PutObjectCommand, ListObjectsV2Command } = require("@aws-sdk/client-s3");

// Configure client
const client = new S3Client({
  endpoint: "http://localhost:9000",
  region: "us-east-1",
  credentials: {
    accessKeyId: "ACCESS_KEY",
    secretAccessKey: "SECRET_KEY",
  },
  forcePathStyle: true,
});

// Upload with spaces - SDK handles encoding
async function upload() {
  const command = new PutObjectCommand({
    Bucket: "mybucket",
    Key: "path with spaces/file.txt",
    Body: "file content",
  });

  await client.send(command);
}

// List objects - SDK handles encoding
async function list() {
  const command = new ListObjectsV2Command({
    Bucket: "mybucket",
    Prefix: "path with spaces/",
  });

  const response = await client.send(command);

  for (const obj of response.Contents || []) {
    console.log(obj.Key);
  }
}
```

## Advanced: Manual HTTP Requests

**⚠️ Not Recommended**: Only use this if you can't use an S3 SDK.

If you must make raw HTTP requests, you need to manually URL-encode the object key in the path:

### URL Encoding Rules

| Character | Encoding | Example |
|-----------|----------|---------|
| Space | `%20` | `my file.txt` → `my%20file.txt` |
| Plus | `%2B` | `test+file.txt` → `test%2Bfile.txt` |
| Percent | `%25` | `test%file.txt` → `test%25file.txt` |
| Slash (in name) | `%2F` | `test/file.txt` → `test%2Ffile.txt` |

**Important**: In URL **paths** (not query parameters):

- `%20` = space
- `+` = literal plus sign (NOT space!)
- To represent a plus sign, use `%2B`
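
For reference, here is a minimal Python sketch of these rules using only the standard library's `urllib.parse` module (the file names are illustrative, not part of any RustFS API):

```python
from urllib.parse import quote, quote_plus

# Path segments: quote() percent-encodes spaces and plus signs,
# leaving '/' intact so the key keeps its folder structure.
print(quote("path with spaces/file.txt"))  # path%20with%20spaces/file.txt
print(quote("ES+net/file.txt"))            # ES%2Bnet/file.txt
print(quote("test%file.txt"))              # test%25file.txt

# Query parameters: quote_plus() turns spaces into '+', which is why a
# literal '+' in a *path* must be written as %2B instead.
print(quote_plus("path with spaces/"))     # path+with+spaces%2F
```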

### Example: Manual curl Request

```bash
# Upload object with spaces
curl -X PUT "http://localhost:9000/mybucket/path%20with%20spaces/file.txt" \
  -H "Authorization: AWS4-HMAC-SHA256 ..." \
  -d "file content"

# Upload object with plus signs
curl -X PUT "http://localhost:9000/mybucket/ES%2Bnet/file.txt" \
  -H "Authorization: AWS4-HMAC-SHA256 ..." \
  -d "data"

# List objects (prefix in query parameter)
curl "http://localhost:9000/mybucket?prefix=path%20with%20spaces/"

# Note: You'll also need to compute an AWS Signature V4.
# This is complex - use an SDK instead!
```

## Troubleshooting

### Issue: "UI can navigate to folder but can't list contents"

**Symptom**:

- You uploaded: `mc cp file.txt "myrustfs/bucket/a f+/b/c/file.txt"`
- You can see the folder `"a f+"` in the UI
- But clicking on it shows "No Objects"

**Root Cause**: The UI may not be properly URL-encoding the prefix when making the LIST request.

**Solution**:

1. **Use the CLI instead**: `mc ls "myrustfs/bucket/a f+/b/c/"` works correctly
2. **Check the UI console**: open the browser DevTools, look at the Network tab, and check whether the request is properly encoded (a boto3 check of the same prefix is sketched below)
3. **Report a UI bug**: if you are using the RustFS web console, this is a UI bug to report

**Workaround**: Use the CLI for operations with special characters until the UI is fixed.
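
To confirm that the backend itself handles the prefix correctly (and that only the UI request is at fault), a quick boto3 check can list the same prefix. This is a sketch: the endpoint, credentials, and bucket name are placeholders matching the example above.

```python
import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
)

# boto3 percent-encodes the prefix (space and '+' included) in the query
# string, which is what the UI's LIST request should also be doing.
resp = s3.list_objects_v2(Bucket='bucket', Prefix='a f+/b/c/')
for obj in resp.get('Contents', []):
    print(obj['Key'])  # expect: a f+/b/c/file.txt
```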
### Issue: "400 Bad Request: Invalid Argument"
|
|
|
|
**Symptom**:
|
|
```
|
|
Error: api error InvalidArgument: Invalid argument
|
|
```
|
|
|
|
**Possible Causes**:
|
|
|
|
1. **Client not encoding plus signs**
|
|
- Problem: Client sends `/bucket/ES+net/file.txt`
|
|
- Solution: Client should send `/bucket/ES%2Bnet/file.txt`
|
|
- Fix: Use a proper S3 SDK
|
|
|
|
2. **Control characters in key**
|
|
- Problem: Key contains null bytes, newlines, etc.
|
|
- Solution: Remove invalid characters from key name
|
|
|
|
3. **Double-encoding**
|
|
- Problem: Client encodes twice: `%20` → `%2520`
|
|
- Solution: Only encode once, or use SDK
|
|
|
|
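
To make the double-encoding pitfall concrete, here is a minimal Python sketch using only the standard library (the key name is just an example):

```python
from urllib.parse import quote, unquote

key = "path with spaces/file.txt"

encoded_once = quote(key)            # path%20with%20spaces/file.txt  (correct)
encoded_twice = quote(encoded_once)  # path%2520with%2520spaces/file.txt  (wrong)

# The server decodes the path once, so a double-encoded key ends up being
# stored or looked up with a literal "%20" in its name instead of a space.
print(unquote(encoded_twice))        # path%20with%20spaces/file.txt
```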

**Debugging**:

Enable debug logging on RustFS:

```bash
RUST_LOG=rustfs=debug ./rustfs server /data
```

Look for log lines like:

```
DEBUG rustfs::storage::ecfs: PUT object with special characters in key: "a f+/file.txt"
DEBUG rustfs::storage::ecfs: LIST objects with special characters in prefix: "ES+net/"
```
### Issue: "NoSuchKey error but file exists"
|
|
|
|
**Symptom**:
|
|
- Upload: `PUT /bucket/test+file.txt` works
|
|
- List: `GET /bucket?prefix=test` shows: `test+file.txt`
|
|
- Get: `GET /bucket/test+file.txt` fails with NoSuchKey
|
|
|
|
**Root Cause**: Key was stored with one encoding, requested with another.
|
|
|
|
**Diagnosis**:
|
|
```bash
|
|
# Check what name is actually stored
|
|
mc ls --recursive myrustfs/bucket/
|
|
|
|
# Try different encodings
|
|
curl "http://localhost:9000/bucket/test+file.txt" # Literal +
|
|
curl "http://localhost:9000/bucket/test%2Bfile.txt" # Encoded +
|
|
curl "http://localhost:9000/bucket/test%20file.txt" # Space (if + was meant as space)
|
|
```
|
|
|
|
**Solution**: Use a consistent S3 client/SDK for all operations.
|
|
|
|
### Issue: "Special characters work in CLI but not in UI"
|
|
|
|
**Root Cause**: This is a UI bug. The backend (RustFS) handles special characters correctly when accessed via proper S3 clients.
|
|
|
|
**Verification**:
|
|
```bash
|
|
# These should all work:
|
|
mc cp file.txt "myrustfs/bucket/test with spaces/file.txt"
|
|
mc ls "myrustfs/bucket/test with spaces/"
|
|
|
|
aws --endpoint-url=http://localhost:9000 s3 cp file.txt "s3://bucket/test with spaces/file.txt"
|
|
aws --endpoint-url=http://localhost:9000 s3 ls "s3://bucket/test with spaces/"
|
|
```
|
|
|
|
**Solution**: Report as UI bug. Use CLI for now.
|
|
|
|

## Best Practices

### 1. Use Simple Names When Possible

Avoid special characters if you don't need them:

- ✅ Good: `my-file.txt`, `data_2024.json`, `report-final.pdf`
- ⚠️ Acceptable but complex: `my file.txt`, `data+backup.json`, `report (final).pdf`

### 2. Always Use S3 SDKs/Clients

Don't try to build raw HTTP requests yourself. Use:

- AWS CLI
- MinIO client (mc)
- AWS SDKs (Python/boto3, Go, Node.js, Java, etc.)
- Other S3-compatible SDKs

### 3. Understand URL Encoding

If you must work with URLs directly:

- **In URL paths**: Space=`%20`, Plus=`%2B`, `+` means literal plus
- **In query params**: Space=`%20` or `+`, Plus=`%2B`
- Use a URL encoding library in your language

### 4. Test Your Client

Before deploying:

```bash
# Test with spaces
mc cp test.txt "myrustfs/bucket/test with spaces/file.txt"
mc ls "myrustfs/bucket/test with spaces/"

# Test with plus
mc cp test.txt "myrustfs/bucket/test+plus/file.txt"
mc ls "myrustfs/bucket/test+plus/"

# Test with mixed
mc cp test.txt "myrustfs/bucket/test with+mixed/file.txt"
mc ls "myrustfs/bucket/test with+mixed/"
```
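
If your application talks to RustFS through an SDK rather than the CLI, the same round trip can be scripted. A minimal boto3 sketch (endpoint, credentials, and bucket name are placeholders):

```python
import boto3

s3 = boto3.client(
    's3',
    endpoint_url='http://localhost:9000',
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
)

# Round-trip a key containing both a space and a plus sign.
key = 'test with+mixed/file.txt'
s3.put_object(Bucket='mybucket', Key=key, Body=b'probe')

listed = s3.list_objects_v2(Bucket='mybucket', Prefix='test with+mixed/')
assert any(obj['Key'] == key for obj in listed.get('Contents', []))

body = s3.get_object(Bucket='mybucket', Key=key)['Body'].read()
assert body == b'probe'
print('special-character round trip OK')
```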

## Technical Details

### How RustFS Handles Special Characters

1. **Request Reception**: The client sends an HTTP request with a URL-encoded path

   ```
   PUT /bucket/test%20file.txt
   ```

2. **URL Decoding**: The s3s library decodes the path

   ```rust
   let decoded = urlencoding::decode("/bucket/test%20file.txt").unwrap();
   // Result: "/bucket/test file.txt"
   ```

3. **Storage**: The object is stored with the decoded name

   ```
   Stored as: "test file.txt"
   ```

4. **Retrieval**: The object is retrieved by the decoded name

   ```rust
   let key = "test file.txt"; // Already decoded by s3s
   store.get_object(bucket, key)
   ```

5. **Response**: The key is returned in the response (decoded)

   ```xml
   <Key>test file.txt</Key>
   ```

6. **Client Display**: S3 clients display the decoded name

   ```
   Shows: test file.txt
   ```

### URL Encoding Standards

RustFS follows:

- **RFC 3986**: URI Generic Syntax
- **AWS S3 API**: Object key encoding rules
- **HTTP/1.1**: URL encoding in request URIs

Key points:

- Keys are UTF-8 strings
- URL encoding is only for HTTP transport
- Keys are stored and compared in decoded form
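
As a small illustration of the last two points (not RustFS code, just the encoding round trip): however a client encodes the key on the wire, the decoded UTF-8 string is what identifies the object.

```python
from urllib.parse import quote, unquote

stored_key = "test file.txt"                 # keys live in decoded, UTF-8 form

wire_path = "/bucket/" + quote(stored_key)   # /bucket/test%20file.txt on the wire
assert unquote(wire_path.split("/", 2)[2]) == stored_key

# A Unicode key works the same way: it is percent-encoded for transport only.
unicode_key = "データ/резюме.txt"
assert unquote(quote(unicode_key)) == unicode_key
```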

## FAQs

**Q: Can I use spaces in object names?**
A: Yes, but use an S3 SDK, which handles the encoding automatically.

**Q: Why does `+` not work as a space?**
A: In URL paths, `+` represents a literal plus sign. Only in query parameters does `+` mean a space. Use `%20` for spaces in paths.

**Q: Does RustFS support Unicode in object names?**
A: Yes. Object names are UTF-8 strings and can contain any valid UTF-8 character.

**Q: What characters are forbidden?**
A: Control characters (null bytes, newlines, carriage returns) are rejected. All printable characters are allowed.

**Q: How do I fix the "UI can't list folder" issue?**
A: Use the CLI (mc or aws-cli) instead. This is a UI bug, not a backend issue.

**Q: Why do some clients work but others don't?**
A: Proper S3 SDKs handle encoding correctly; custom clients may have bugs. Always use official SDKs.

## Getting Help

If you encounter issues:

1. **Check this guide first**
2. **Verify you're using an S3 SDK** (not raw HTTP)
3. **Test with the mc client** to determine whether the issue is in the backend or the client
4. **Enable debug logging** on RustFS: `RUST_LOG=rustfs=debug`
5. **Report issues** at: https://github.com/rustfs/rustfs/issues

Include the following in bug reports:

- Client/SDK used (and version)
- The exact object name causing the issue
- Whether the mc client works
- Debug logs from RustFS

---

**Last Updated**: 2025-12-09
**RustFS Version**: 0.0.5+
**Related Documents**:

- [Special Characters Analysis](./special-characters-in-path-analysis.md)
- [Special Characters Solution](./special-characters-solution.md)