Merge branch 'main' of github.com:rustfs/s3-rustfs into feature/observability-metrics

# Conflicts:
#	.github/workflows/build.yml
#	.github/workflows/ci.yml
#	Cargo.lock
#	Cargo.toml
#	appauth/src/token.rs
#	crates/config/src/config.rs
#	crates/event-notifier/examples/simple.rs
#	crates/event-notifier/src/global.rs
#	crates/event-notifier/src/lib.rs
#	crates/event-notifier/src/notifier.rs
#	crates/event-notifier/src/store.rs
#	crates/filemeta/src/filemeta.rs
#	crates/notify/examples/webhook.rs
#	crates/utils/Cargo.toml
#	ecstore/Cargo.toml
#	ecstore/src/cmd/bucket_replication.rs
#	ecstore/src/config/com.rs
#	ecstore/src/disk/error.rs
#	ecstore/src/disk/mod.rs
#	ecstore/src/set_disk.rs
#	ecstore/src/store_api.rs
#	ecstore/src/store_list_objects.rs
#	iam/Cargo.toml
#	iam/src/manager.rs
#	policy/Cargo.toml
#	rustfs/src/admin/rpc.rs
#	rustfs/src/main.rs
#	rustfs/src/storage/mod.rs
houseme
2025-06-19 13:16:48 +08:00
249 changed files with 25137 additions and 11731 deletions


@@ -3,6 +3,7 @@
## ⚠️ CRITICAL DEVELOPMENT RULES ⚠️
### 🚨 NEVER COMMIT DIRECTLY TO MASTER/MAIN BRANCH 🚨
- **This is the most important rule - NEVER modify code directly on main or master branch**
- **Always work on feature branches and use pull requests for all changes**
- **Any direct commits to master/main branch are strictly forbidden**
@@ -15,31 +16,50 @@
6. Create a pull request for review
## Project Overview
RustFS is a high-performance distributed object storage system written in Rust and compatible with the S3 API. The project uses a modular architecture and supports erasure-coded storage, multi-tenant management, observability, and other enterprise features.
## Core Architecture Principles
### 1. Modular Design
- Project uses Cargo workspace structure, containing multiple independent crates
- Core modules: `rustfs` (main service), `ecstore` (erasure coding storage), `common` (shared components)
- Functional modules: `iam` (identity management), `madmin` (management interface), `crypto` (encryption), etc.
- Tool modules: `cli` (command line tool), `crates/*` (utility libraries)
### 2. Asynchronous Programming Pattern
- Comprehensive use of `tokio` async runtime
- Prioritize `async/await` syntax
- Use `async-trait` for async methods in traits
- Avoid blocking operations, use `spawn_blocking` when necessary
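For example, a minimal sketch of offloading blocking file I/O onto tokio's blocking pool (the helper name and call site are hypothetical):
```rust
use tokio::task;

// Hypothetical helper: run blocking std I/O without stalling the async runtime.
async fn read_file_blocking(path: String) -> std::io::Result<Vec<u8>> {
    task::spawn_blocking(move || std::fs::read(path))
        .await
        .map_err(std::io::Error::other)?
}
```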
### 3. Error Handling Strategy
- Use unified error type `common::error::Error`
- Support error chains and context information
- Use `thiserror` to define specific error types
- Error conversion uses `downcast_ref` for type checking
- **Use modular, type-safe error handling with `thiserror`**
- Each module should define its own error type using `thiserror::Error` derive macro
- Support error chains and context information through `#[from]` and `#[source]` attributes
- Use `Result<T>` type aliases for consistency within each module
- Error conversion between modules should use explicit `From` implementations
- Follow the pattern: `pub type Result<T> = core::result::Result<T, Error>`
- Use `#[error("description")]` attributes for clear error messages
- Support error downcasting when needed through `other()` helper methods
- Implement `Clone` for errors when required by the domain logic
- **Current module error types:**
- `ecstore::error::StorageError` - Storage layer errors
- `ecstore::disk::error::DiskError` - Disk operation errors
- `iam::error::Error` - Identity and access management errors
- `policy::error::Error` - Policy-related errors
- `crypto::error::Error` - Cryptographic operation errors
- `filemeta::error::Error` - File metadata errors
- `rustfs::error::ApiError` - API layer errors
- Module-specific error types for specialized functionality
## Code Style Guidelines
### 1. Formatting Configuration
```toml
max_width = 130
fn_call_width = 90
@@ -55,21 +75,25 @@ single_line_let_else_max_width = 100
Before every commit, you **MUST**:
1. **Format your code**:
```bash
cargo fmt --all
```
2. **Verify formatting**:
```bash
cargo fmt --all --check
```
3. **Pass clippy checks**:
```bash
cargo clippy --all-targets --all-features -- -D warnings
```
4. **Ensure compilation**:
```bash
cargo check --all-targets
```
@@ -144,6 +168,7 @@ Example output when formatting fails:
```
### 3. Naming Conventions
- Use `snake_case` for functions, variables, modules
- Use `PascalCase` for types, traits, enums
- Constants use `SCREAMING_SNAKE_CASE`
@@ -153,6 +178,7 @@ Example output when formatting fails:
- Choose names that clearly express the purpose and intent
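Illustrative declarations only (these items are made up for the example, not actual RustFS APIs):
```rust
const DEFAULT_READ_QUORUM: usize = 2;     // SCREAMING_SNAKE_CASE for constants

pub struct ObjectMetadata {}              // PascalCase for types

pub trait StorageBackend {}               // PascalCase for traits

fn parse_bucket_name(raw: &str) -> &str { // snake_case for functions and variables
    raw.trim()
}
```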
### 4. Type Declaration Guidelines
- **Prefer type inference over explicit type declarations** when the type is obvious from context
- Let the Rust compiler infer types whenever possible to reduce verbosity and improve maintainability
- Only specify types explicitly when:
@@ -162,6 +188,7 @@ Example output when formatting fails:
- Needed to resolve ambiguity between multiple possible types
**Good examples (prefer these):**
```rust
// Compiler can infer the type
let items = vec![1, 2, 3, 4];
@@ -173,6 +200,7 @@ let filtered: Vec<_> = items.iter().filter(|&&x| x > 2).collect();
```
**Avoid unnecessary explicit types:**
```rust
// Unnecessary - type is obvious
let items: Vec<i32> = vec![1, 2, 3, 4];
@@ -181,6 +209,7 @@ let result: ProcessResult = process_data(&input);
```
**When explicit types are beneficial:**
```rust
// API boundaries - always specify types
pub fn process_data(input: &[u8]) -> Result<ProcessResult, Error> { ... }
@@ -193,6 +222,7 @@ let cache: HashMap<String, Arc<Mutex<CacheEntry>>> = HashMap::new();
```
### 5. Documentation Comments
- Public APIs must have documentation comments
- Use `///` for documentation comments
- Complex functions add `# Examples` and `# Parameters` descriptions
@@ -201,6 +231,7 @@ let cache: HashMap<String, Arc<Mutex<CacheEntry>>> = HashMap::new();
- Avoid meaningless comments like "debug 111" or placeholder text
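A hypothetical example that follows these rules (the function is illustrative, not an existing API):
```rust
/// Splits the disks of an erasure set into data and parity shards.
///
/// # Parameters
/// - `disk_count`: total number of disks in the erasure set
///
/// # Examples
/// `split_shards(12)` returns `(6, 6)`.
pub fn split_shards(disk_count: usize) -> (usize, usize) {
    let parity = disk_count / 2;
    (disk_count - parity, parity)
}
```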
### 6. Import Guidelines
- Standard library imports first
- Third-party crate imports in the middle
- Project internal imports last
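For example (the internal paths are illustrative):
```rust
// Standard library
use std::collections::HashMap;
use std::sync::Arc;

// Third-party crates
use serde::Deserialize;
use tokio::sync::RwLock;

// Project internal
use crate::config::Config;
use crate::error::Result;
```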
@@ -209,6 +240,7 @@ let cache: HashMap<String, Arc<Mutex<CacheEntry>>> = HashMap::new();
## Asynchronous Programming Guidelines
### 1. Trait Definition
```rust
#[async_trait::async_trait]
pub trait StorageAPI: Send + Sync {
@@ -217,6 +249,7 @@ pub trait StorageAPI: Send + Sync {
```
### 2. Error Handling
```rust
// Use ? operator to propagate errors
async fn example_function() -> Result<()> {
@@ -227,6 +260,7 @@ async fn example_function() -> Result<()> {
```
### 3. Concurrency Control
- Use `Arc` and `Mutex`/`RwLock` for shared state management
- Prioritize async locks from `tokio::sync`
- Avoid holding locks for long periods
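A minimal sketch of the pattern above, using a hypothetical shared registry (not an actual RustFS type):
```rust
use std::sync::Arc;
use tokio::sync::RwLock;

#[derive(Clone)]
pub struct BucketRegistry {
    inner: Arc<RwLock<Vec<String>>>,
}

impl BucketRegistry {
    pub async fn add(&self, bucket: String) {
        // The write guard is dropped at the end of this scope,
        // so the lock is never held across another await point.
        let mut guard = self.inner.write().await;
        guard.push(bucket);
    }

    pub async fn contains(&self, bucket: &str) -> bool {
        self.inner.read().await.iter().any(|b| b.as_str() == bucket)
    }
}
```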
@@ -234,6 +268,7 @@ async fn example_function() -> Result<()> {
## Logging and Tracing Guidelines
### 1. Tracing Usage
```rust
#[tracing::instrument(skip(self, data))]
async fn process_data(&self, data: &[u8]) -> Result<()> {
@@ -243,6 +278,7 @@ async fn process_data(&self, data: &[u8]) -> Result<()> {
```
### 2. Log Levels
- `error!`: System errors requiring immediate attention
- `warn!`: Warning information that may affect functionality
- `info!`: Important business information
@@ -250,6 +286,7 @@ async fn process_data(&self, data: &[u8]) -> Result<()> {
- `trace!`: Detailed execution paths
### 3. Structured Logging
```rust
info!(
counter.rustfs_api_requests_total = 1_u64,
@@ -262,45 +299,213 @@ info!(
## Error Handling Guidelines
### 1. Error Type Definition
```rust
#[derive(Debug, thiserror::Error)]
// Use thiserror for module-specific error types
#[derive(thiserror::Error, Debug)]
pub enum MyError {
#[error("IO error: {0}")]
Io(#[from] std::io::Error),
#[error("Storage error: {0}")]
Storage(#[from] ecstore::error::StorageError),
#[error("Custom error: {message}")]
Custom { message: String },
#[error("File not found: {path}")]
FileNotFound { path: String },
#[error("Invalid configuration: {0}")]
InvalidConfig(String),
}
// Provide Result type alias for the module
pub type Result<T> = core::result::Result<T, MyError>;
```
### 2. Error Helper Methods
```rust
impl MyError {
/// Create error from any compatible error type
pub fn other<E>(error: E) -> Self
where
E: Into<Box<dyn std::error::Error + Send + Sync>>,
{
MyError::Io(std::io::Error::other(error))
}
}
```
### 2. Error Conversion
### 3. Error Conversion Between Modules
```rust
pub fn to_s3_error(err: Error) -> S3Error {
if let Some(storage_err) = err.downcast_ref::<StorageError>() {
match storage_err {
StorageError::ObjectNotFound(bucket, object) => {
s3_error!(NoSuchKey, "{}/{}", bucket, object)
// Convert between different module error types
impl From<ecstore::error::StorageError> for MyError {
fn from(e: ecstore::error::StorageError) -> Self {
match e {
ecstore::error::StorageError::FileNotFound => {
MyError::FileNotFound { path: "unknown".to_string() }
}
// Other error types...
_ => MyError::Storage(e),
}
}
}
// Provide reverse conversion when needed
impl From<MyError> for ecstore::error::StorageError {
fn from(e: MyError) -> Self {
match e {
MyError::FileNotFound { .. } => ecstore::error::StorageError::FileNotFound,
MyError::Storage(e) => e,
_ => ecstore::error::StorageError::other(e),
}
}
// Default error handling
}
```
### 3. Error Context
### 4. Error Context and Propagation
```rust
// Add error context
.map_err(|e| Error::from_string(format!("Failed to process {}: {}", path, e)))?
// Use ? operator for clean error propagation
async fn example_function() -> Result<()> {
let data = read_file("path").await?;
process_data(data).await?;
Ok(())
}
// Add context to errors
fn process_with_context(path: &str) -> Result<()> {
std::fs::read(path)
.map_err(|e| MyError::Custom {
message: format!("Failed to read {}: {}", path, e)
})?;
Ok(())
}
```
### 5. API Error Conversion (S3 Example)
```rust
// Convert storage errors to API-specific errors
use s3s::{S3Error, S3ErrorCode};
#[derive(Debug)]
pub struct ApiError {
pub code: S3ErrorCode,
pub message: String,
pub source: Option<Box<dyn std::error::Error + Send + Sync>>,
}
impl From<ecstore::error::StorageError> for ApiError {
fn from(err: ecstore::error::StorageError) -> Self {
let code = match &err {
ecstore::error::StorageError::BucketNotFound(_) => S3ErrorCode::NoSuchBucket,
ecstore::error::StorageError::ObjectNotFound(_, _) => S3ErrorCode::NoSuchKey,
ecstore::error::StorageError::BucketExists(_) => S3ErrorCode::BucketAlreadyExists,
ecstore::error::StorageError::InvalidArgument(_, _, _) => S3ErrorCode::InvalidArgument,
ecstore::error::StorageError::MethodNotAllowed => S3ErrorCode::MethodNotAllowed,
ecstore::error::StorageError::StorageFull => S3ErrorCode::ServiceUnavailable,
_ => S3ErrorCode::InternalError,
};
ApiError {
code,
message: err.to_string(),
source: Some(Box::new(err)),
}
}
}
impl From<ApiError> for S3Error {
fn from(err: ApiError) -> Self {
let mut s3e = S3Error::with_message(err.code, err.message);
if let Some(source) = err.source {
s3e.set_source(source);
}
s3e
}
}
```
### 6. Error Handling Best Practices
#### Pattern Matching and Error Classification
```rust
// Use pattern matching for specific error handling
async fn handle_storage_operation() -> Result<()> {
match storage.get_object("bucket", "key").await {
Ok(object) => process_object(object),
Err(ecstore::error::StorageError::ObjectNotFound(bucket, key)) => {
warn!("Object not found: {}/{}", bucket, key);
create_default_object(bucket, key).await
}
Err(ecstore::error::StorageError::BucketNotFound(bucket)) => {
error!("Bucket not found: {}", bucket);
Err(MyError::Custom {
message: format!("Bucket {} does not exist", bucket)
})
}
Err(e) => {
error!("Storage operation failed: {}", e);
Err(MyError::Storage(e))
}
}
}
```
#### Error Aggregation and Reporting
```rust
// Collect and report multiple errors
pub fn validate_configuration(config: &Config) -> Result<()> {
let mut errors = Vec::new();
if config.bucket_name.is_empty() {
errors.push("Bucket name cannot be empty");
}
if config.region.is_empty() {
errors.push("Region must be specified");
}
if !errors.is_empty() {
return Err(MyError::Custom {
message: format!("Configuration validation failed: {}", errors.join(", "))
});
}
Ok(())
}
```
#### Contextual Error Information
```rust
// Add operation context to errors
#[tracing::instrument(skip(self))]
async fn upload_file(&self, bucket: &str, key: &str, data: Vec<u8>) -> Result<()> {
self.storage
.put_object(bucket, key, data)
.await
.map_err(|e| MyError::Custom {
message: format!("Failed to upload {}/{}: {}", bucket, key, e)
})
}
```
## Performance Optimization Guidelines
### 1. Memory Management
- Use `Bytes` instead of `Vec<u8>` for zero-copy operations
- Avoid unnecessary cloning, use reference passing
- Use `Arc` for sharing large objects
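A small sketch of both points; `LargeConfig` and the 16-byte header split are purely illustrative:
```rust
use bytes::Bytes;
use std::sync::Arc;

struct LargeConfig {}

// Slicing `Bytes` shares the underlying buffer instead of copying it.
fn split_header(payload: Bytes) -> (Bytes, Bytes) {
    let cut = payload.len().min(16);
    (payload.slice(..cut), payload.slice(cut..))
}

// Share a large, immutable object across tasks without cloning its contents.
fn share_config(config: LargeConfig) -> Arc<LargeConfig> {
    Arc::new(config)
}
```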
### 2. Concurrency Optimization
```rust
// Use join_all for concurrent operations
let futures = disks.iter().map(|disk| disk.operation());
@@ -308,12 +513,14 @@ let results = join_all(futures).await;
```
### 3. Caching Strategy
- Use `lazy_static` or `OnceCell` for global caching
- Implement LRU cache to avoid memory leaks
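A sketch of a lazily initialized global cache, assuming the `once_cell` crate is available; a bounded LRU (for example via the `lru` crate) would be preferable for entries that can grow without limit:
```rust
use once_cell::sync::Lazy;
use std::collections::HashMap;
use std::sync::RwLock;

// Hypothetical process-wide cache, initialized on first access.
static REGION_CACHE: Lazy<RwLock<HashMap<String, String>>> =
    Lazy::new(|| RwLock::new(HashMap::new()));

fn cache_region(bucket: &str, region: &str) {
    REGION_CACHE
        .write()
        .expect("cache lock poisoned")
        .insert(bucket.to_string(), region.to_string());
}
```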
## Testing Guidelines
### 1. Unit Tests
```rust
#[cfg(test)]
mod tests {
@@ -331,14 +538,55 @@ mod tests {
fn test_with_cases(input: &str, expected: &str) {
assert_eq!(function(input), expected);
}
#[test]
fn test_error_conversion() {
use ecstore::error::StorageError;
let storage_err = StorageError::BucketNotFound("test-bucket".to_string());
let api_err: ApiError = storage_err.into();
assert_eq!(api_err.code, S3ErrorCode::NoSuchBucket);
assert!(api_err.message.contains("test-bucket"));
assert!(api_err.source.is_some());
}
#[test]
fn test_error_types() {
let io_err = std::io::Error::new(std::io::ErrorKind::NotFound, "file not found");
let my_err = MyError::Io(io_err);
// Test error matching
match my_err {
MyError::Io(_) => {}, // Expected
_ => panic!("Unexpected error type"),
}
}
#[test]
fn test_error_context() {
let result = process_with_context("nonexistent_file.txt");
assert!(result.is_err());
let err = result.unwrap_err();
match err {
MyError::Custom { message } => {
assert!(message.contains("Failed to read"));
assert!(message.contains("nonexistent_file.txt"));
}
_ => panic!("Expected Custom error"),
}
}
}
```
### 2. Integration Tests
- Use `e2e_test` module for end-to-end testing
- Simulate real storage environments
### 3. Test Quality Standards
- Write meaningful test cases that verify actual functionality
- Avoid placeholder or debug content like "debug 111", "test test", etc.
- Use descriptive test names that clearly indicate what is being tested
@@ -348,9 +596,11 @@ mod tests {
## Cross-Platform Compatibility Guidelines
### 1. CPU Architecture Compatibility
- **Always consider multi-platform and different CPU architecture compatibility** when writing code
- Support major architectures: x86_64, aarch64 (ARM64), and other target platforms
- Use conditional compilation for architecture-specific code:
```rust
#[cfg(target_arch = "x86_64")]
fn optimized_x86_64_function() { /* x86_64 specific implementation */ }
@@ -363,16 +613,19 @@ fn generic_function() { /* Generic fallback implementation */ }
```
### 2. Platform-Specific Dependencies
- Use feature flags for platform-specific dependencies
- Provide fallback implementations for unsupported platforms
- Test on multiple architectures in CI/CD pipeline
### 3. Endianness Considerations
- Use explicit byte order conversion when dealing with binary data
- Prefer `to_le_bytes()`, `from_le_bytes()` for consistent little-endian format
- Use `byteorder` crate for complex binary format handling
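For example, encoding a length prefix in an explicit byte order keeps the on-disk format identical on x86_64 and aarch64:
```rust
fn encode_len(len: u32) -> [u8; 4] {
    len.to_le_bytes()
}

fn decode_len(buf: [u8; 4]) -> u32 {
    u32::from_le_bytes(buf)
}
```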
### 4. SIMD and Performance Optimizations
- Use portable SIMD libraries like `wide` or `packed_simd`
- Provide fallback implementations for non-SIMD architectures
- Use runtime feature detection when appropriate
@@ -380,10 +633,12 @@ fn generic_function() { /* Generic fallback implementation */ }
## Security Guidelines
### 1. Memory Safety
- Disable `unsafe` code (workspace.lints.rust.unsafe_code = "deny")
- Use `rustls` instead of `openssl`
### 2. Authentication and Authorization
```rust
// Use IAM system for permission checks
let identity = iam.authenticate(&access_key, &secret_key).await?;
@@ -393,11 +648,13 @@ iam.authorize(&identity, &action, &resource).await?;
## Configuration Management Guidelines
### 1. Environment Variables
- Use `RUSTFS_` prefix
- Support both configuration files and environment variables
- Provide reasonable default values
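A minimal sketch of the override order (the variable name and default value are illustrative):
```rust
use std::env;

// The environment variable wins; otherwise fall back to a sensible default.
fn listen_address() -> String {
    env::var("RUSTFS_ADDRESS").unwrap_or_else(|_| "0.0.0.0:9000".to_string())
}
```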
### 2. Configuration Structure
```rust
#[derive(Debug, Deserialize, Clone)]
pub struct Config {
@@ -411,10 +668,12 @@ pub struct Config {
## Dependency Management Guidelines
### 1. Workspace Dependencies
- Manage versions uniformly at workspace level
- Use `workspace = true` to inherit configuration
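For example (the crate and version shown are illustrative):
```toml
# Root Cargo.toml
[workspace.dependencies]
tokio = { version = "1", features = ["full"] }

# Member crate Cargo.toml
[dependencies]
tokio = { workspace = true }
```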
### 2. Feature Flags
```toml
[features]
default = ["file"]
@@ -425,15 +684,18 @@ kafka = ["dep:rdkafka"]
## Deployment and Operations Guidelines
### 1. Containerization
- Provide Dockerfile and docker-compose configuration
- Support multi-stage builds to optimize image size
### 2. Observability
- Integrate OpenTelemetry for distributed tracing
- Support Prometheus metrics collection
- Provide Grafana dashboards
### 3. Health Checks
```rust
// Implement health check endpoint
async fn health_check() -> Result<HealthStatus> {
@@ -444,6 +706,7 @@ async fn health_check() -> Result<HealthStatus> {
## Code Review Checklist
### 1. **Code Formatting and Quality (MANDATORY)**
- [ ] **Code is properly formatted** (`cargo fmt --all --check` passes)
- [ ] **All clippy warnings are resolved** (`cargo clippy --all-targets --all-features -- -D warnings` passes)
- [ ] **Code compiles successfully** (`cargo check --all-targets` passes)
@@ -451,27 +714,32 @@ async fn health_check() -> Result<HealthStatus> {
- [ ] **No formatting-related changes** mixed with functional changes (separate commits)
### 2. Functionality
- [ ] Are all error cases properly handled?
- [ ] Is there appropriate logging?
- [ ] Is there necessary test coverage?
### 3. Performance
- [ ] Are unnecessary memory allocations avoided?
- [ ] Are async operations used correctly?
- [ ] Are there potential deadlock risks?
### 4. Security
- [ ] Are input parameters properly validated?
- [ ] Are there appropriate permission checks?
- [ ] Is information leakage avoided?
### 5. Cross-Platform Compatibility
- [ ] Does the code work on different CPU architectures (x86_64, aarch64)?
- [ ] Are platform-specific features properly gated with conditional compilation?
- [ ] Is byte order handling correct for binary data?
- [ ] Are there appropriate fallback implementations for unsupported platforms?
### 6. Code Commits and Documentation
- [ ] Does it comply with [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/)?
- [ ] Are commit messages concise and under 72 characters for the title line?
- [ ] Commit titles should be concise and in English, avoid Chinese
@@ -480,6 +748,7 @@ async fn health_check() -> Result<HealthStatus> {
## Common Patterns and Best Practices
### 1. Resource Management
```rust
// Use RAII pattern for resource management
pub struct ResourceGuard {
@@ -494,6 +763,7 @@ impl Drop for ResourceGuard {
```
### 2. Dependency Injection
```rust
// Use dependency injection pattern
pub struct Service {
@@ -503,6 +773,7 @@ pub struct Service {
```
### 3. Graceful Shutdown
```rust
// Implement graceful shutdown
async fn shutdown_gracefully(shutdown_rx: &mut Receiver<()>) {
@@ -521,16 +792,19 @@ async fn shutdown_gracefully(shutdown_rx: &mut Receiver<()>) {
## Domain-Specific Guidelines
### 1. Storage Operations
- All storage operations must support erasure coding
- Implement read/write quorum mechanisms
- Support data integrity verification
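A minimal, hypothetical sketch of a quorum check; the actual ecstore logic derives quorum sizes from the erasure-set layout:
```rust
// Succeed only if at least `write_quorum` of the per-disk writes succeeded.
fn meets_write_quorum<T, E>(results: &[Result<T, E>], write_quorum: usize) -> bool {
    results.iter().filter(|r| r.is_ok()).count() >= write_quorum
}
```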
### 2. Network Communication
- Use gRPC for internal service communication
- HTTP/HTTPS support for S3-compatible API
- Implement connection pooling and retry mechanisms
### 3. Metadata Management
- Use FlatBuffers for serialization
- Support version control and migration
- Implement metadata caching
@@ -540,11 +814,12 @@ These rules should serve as guiding principles when developing the RustFS projec
### 4. Code Operations
#### Branch Management
- **🚨 CRITICAL: NEVER modify code directly on main or master branch - THIS IS ABSOLUTELY FORBIDDEN 🚨**
- **⚠️ ANY DIRECT COMMITS TO MASTER/MAIN WILL BE REJECTED AND MUST BE REVERTED IMMEDIATELY ⚠️**
- **Always work on feature branches - NO EXCEPTIONS**
- Always check the .cursorrules file before starting to ensure you understand the project guidelines
- **MANDATORY workflow for ALL changes:**
1. `git checkout main` (switch to main branch)
2. `git pull` (get latest changes)
3. `git checkout -b feat/your-feature-name` (create and switch to feature branch)
@@ -552,28 +827,54 @@ These rules should serve as guiding principles when developing the RustFS projec
5. Test thoroughly before committing
6. Commit and push to the feature branch
7. Create a pull request for code review
- Use descriptive branch names following the pattern: `feat/feature-name`, `fix/issue-name`, `refactor/component-name`, etc.
- **Double-check current branch before ANY commit: `git branch` to ensure you're NOT on main/master**
- Ensure all changes are made on feature branches and merged through pull requests
#### Development Workflow
- Use English for all code comments, documentation, and variable names
- Write meaningful and descriptive names for variables, functions, and methods
- Avoid meaningless test content like "debug 111" or placeholder values
- Before each change, carefully read the existing code so you understand its structure and implementation; do not break existing logic or introduce new issues
- Ensure each change includes sufficient test cases to guarantee correctness
- Do not arbitrarily change numbers or constants in test cases; analyze their meaning carefully so the tests remain correct
- When writing or modifying tests, review existing test cases for descriptive naming and rigorous logic; revise any that fall short
- **Before committing any changes, run `cargo clippy --all-targets --all-features -- -D warnings` to ensure all code passes Clippy checks**
- After finishing development, run `git add .` and then `git commit -m "feat: feature description"` or `"fix: issue description"`, following [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/)
- **Keep commit messages concise and under 72 characters** for the title line; use the body for detailed explanations if needed
- After finishing development, push the feature branch to the remote repository
- After completing a change, summarize it briefly instead of creating summary files, following [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/)
- Provide the change description needed for the PR in the conversation, following [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/)
- **Always provide PR descriptions in English** after completing any changes, including:
- Clear and concise title following Conventional Commits format
- Detailed description of what was changed and why
- List of key changes and improvements
- Any breaking changes or migration notes if applicable
- Testing information and verification steps
- **Provide PR descriptions in copyable markdown format** enclosed in code blocks for easy one-click copying
## 🚫 AI Documentation Generation Restrictions
### Prohibited: Generated Summary Documents
- **Creating AI-generated summary documents of any kind is strictly forbidden**
- **Do not create documents packed with emojis, elaborate formatted tables, and other typical AI styling**
- **Do not generate the following kinds of documents in the project:**
- Benchmark summary documents (BENCHMARK*.md)
- Implementation comparison documents (IMPLEMENTATION_COMPARISON*.md)
- Performance analysis reports
- Architecture summary documents
- Feature comparison documents
- Any document filled with emojis and decorative formatting
- **If documentation is needed, create it only when the user explicitly asks for it, and keep the style concise and practical**
- **Documentation should focus on the information that is actually needed and avoid excessive formatting and decoration**
- **Any AI-generated summary document that is discovered should be deleted immediately**
### Allowed Document Types
- README.md (project introduction, kept concise)
- Technical documentation (only when explicitly needed)
- User manuals (only when explicitly needed)
- API documentation (generated from code)
- Changelog (CHANGELOG.md)


@@ -1,4 +1,4 @@
FROM m.daocloud.io/docker.io/library/ubuntu:22.04
FROM ubuntu:22.04
ENV LANG C.UTF-8
@@ -18,10 +18,7 @@ RUN wget https://github.com/google/flatbuffers/releases/download/v25.2.10/Linux.
&& mv flatc /usr/local/bin/ && chmod +x /usr/local/bin/flatc && rm -rf Linux.flatc.binary.g++-13.zip
# install rust
ENV RUSTUP_DIST_SERVER="https://rsproxy.cn"
ENV RUSTUP_UPDATE_ROOT="https://rsproxy.cn/rustup"
RUN curl -o rustup-init.sh --proto '=https' --tlsv1.2 -sSf https://rsproxy.cn/rustup-init.sh \
&& sh rustup-init.sh -y && rm -rf rustup-init.sh
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
COPY .docker/cargo.config.toml /root/.cargo/config.toml


@@ -1,4 +1,4 @@
FROM m.daocloud.io/docker.io/library/rockylinux:9.3 AS builder
FROM rockylinux:9.3 AS builder
ENV LANG C.UTF-8
@@ -25,10 +25,7 @@ RUN wget https://github.com/google/flatbuffers/releases/download/v25.2.10/Linux.
&& rm -rf Linux.flatc.binary.g++-13.zip
# install rust
ENV RUSTUP_DIST_SERVER="https://rsproxy.cn"
ENV RUSTUP_UPDATE_ROOT="https://rsproxy.cn/rustup"
RUN curl -o rustup-init.sh --proto '=https' --tlsv1.2 -sSf https://rsproxy.cn/rustup-init.sh \
&& sh rustup-init.sh -y && rm -rf rustup-init.sh
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
COPY .docker/cargo.config.toml /root/.cargo/config.toml


@@ -1,4 +1,4 @@
FROM m.daocloud.io/docker.io/library/ubuntu:22.04
FROM ubuntu:22.04
ENV LANG C.UTF-8
@@ -18,10 +18,7 @@ RUN wget https://github.com/google/flatbuffers/releases/download/v25.2.10/Linux.
&& mv flatc /usr/local/bin/ && chmod +x /usr/local/bin/flatc && rm -rf Linux.flatc.binary.g++-13.zip
# install rust
ENV RUSTUP_DIST_SERVER="https://rsproxy.cn"
ENV RUSTUP_UPDATE_ROOT="https://rsproxy.cn/rustup"
RUN curl -o rustup-init.sh --proto '=https' --tlsv1.2 -sSf https://rsproxy.cn/rustup-init.sh \
&& sh rustup-init.sh -y && rm -rf rustup-init.sh
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
COPY .docker/cargo.config.toml /root/.cargo/config.toml


@@ -1,13 +1,5 @@
[source.crates-io]
registry = "https://github.com/rust-lang/crates.io-index"
replace-with = 'rsproxy-sparse'
[source.rsproxy]
registry = "https://rsproxy.cn/crates.io-index"
[registries.rsproxy]
index = "https://rsproxy.cn/crates.io-index"
[source.rsproxy-sparse]
registry = "sparse+https://rsproxy.cn/index/"
[net]
git-fetch-with-cli = true


@@ -13,9 +13,9 @@ inputs:
description: "Cache key for shared cache"
cache-save-if:
required: true
default: true
default: ${{ github.ref == 'refs/heads/main' }}
description: "Cache save condition"
run-os:
runs-on:
required: true
default: "ubuntu-latest"
description: "Running system"
@@ -24,7 +24,7 @@ runs:
using: "composite"
steps:
- name: Install system dependencies
if: inputs.run-os == 'ubuntu-latest'
if: inputs.runs-on == 'ubuntu-latest'
shell: bash
run: |
sudo apt update
@@ -45,7 +45,6 @@ runs:
- uses: Swatinem/rust-cache@v2
with:
cache-on-failure: true
cache-all-crates: true
shared-key: ${{ inputs.cache-shared-key }}
save-if: ${{ inputs.cache-save-if }}


@@ -17,42 +17,114 @@ jobs:
matrix:
os: [ ubuntu-latest, macos-latest, windows-latest ]
variant:
- { profile: release, target: x86_64-unknown-linux-musl, glibc: "default" }
- { profile: release, target: x86_64-unknown-linux-gnu, glibc: "default" }
- {
profile: release,
target: x86_64-unknown-linux-musl,
glibc: "default",
}
- {
profile: release,
target: x86_64-unknown-linux-gnu,
glibc: "default",
}
- { profile: release, target: aarch64-apple-darwin, glibc: "default" }
#- { profile: release, target: aarch64-unknown-linux-gnu, glibc: "default" }
- { profile: release, target: aarch64-unknown-linux-musl, glibc: "default" }
- {
profile: release,
target: aarch64-unknown-linux-musl,
glibc: "default",
}
#- { profile: release, target: x86_64-pc-windows-msvc, glibc: "default" }
exclude:
# Linux targets on non-Linux systems
- os: macos-latest
variant: { profile: release, target: x86_64-unknown-linux-gnu, glibc: "default" }
variant:
{
profile: release,
target: x86_64-unknown-linux-gnu,
glibc: "default",
}
- os: macos-latest
variant: { profile: release, target: x86_64-unknown-linux-musl, glibc: "default" }
variant:
{
profile: release,
target: x86_64-unknown-linux-musl,
glibc: "default",
}
- os: macos-latest
variant: { profile: release, target: aarch64-unknown-linux-gnu, glibc: "default" }
variant:
{
profile: release,
target: aarch64-unknown-linux-gnu,
glibc: "default",
}
- os: macos-latest
variant: { profile: release, target: aarch64-unknown-linux-musl, glibc: "default" }
variant:
{
profile: release,
target: aarch64-unknown-linux-musl,
glibc: "default",
}
- os: windows-latest
variant: { profile: release, target: x86_64-unknown-linux-gnu, glibc: "default" }
variant:
{
profile: release,
target: x86_64-unknown-linux-gnu,
glibc: "default",
}
- os: windows-latest
variant: { profile: release, target: x86_64-unknown-linux-musl, glibc: "default" }
variant:
{
profile: release,
target: x86_64-unknown-linux-musl,
glibc: "default",
}
- os: windows-latest
variant: { profile: release, target: aarch64-unknown-linux-gnu, glibc: "default" }
variant:
{
profile: release,
target: aarch64-unknown-linux-gnu,
glibc: "default",
}
- os: windows-latest
variant: { profile: release, target: aarch64-unknown-linux-musl, glibc: "default" }
variant:
{
profile: release,
target: aarch64-unknown-linux-musl,
glibc: "default",
}
# Apple targets on non-macOS systems
- os: ubuntu-latest
variant: { profile: release, target: aarch64-apple-darwin, glibc: "default" }
variant:
{
profile: release,
target: aarch64-apple-darwin,
glibc: "default",
}
- os: windows-latest
variant: { profile: release, target: aarch64-apple-darwin, glibc: "default" }
variant:
{
profile: release,
target: aarch64-apple-darwin,
glibc: "default",
}
# Windows targets on non-Windows systems
- os: ubuntu-latest
variant: { profile: release, target: x86_64-pc-windows-msvc, glibc: "default" }
variant:
{
profile: release,
target: x86_64-pc-windows-msvc,
glibc: "default",
}
- os: macos-latest
variant: { profile: release, target: x86_64-pc-windows-msvc, glibc: "default" }
variant:
{
profile: release,
target: x86_64-pc-windows-msvc,
glibc: "default",
}
steps:
- name: Checkout repository
@@ -89,7 +161,7 @@ jobs:
if: steps.cache-protoc.outputs.cache-hit != 'true'
uses: arduino/setup-protoc@v3
with:
version: '31.1'
version: "31.1"
repo-token: ${{ secrets.GITHUB_TOKEN }}
- name: Setup Flatc
@@ -107,10 +179,10 @@ jobs:
# Set up Zig for cross-compilation
- uses: mlugg/setup-zig@v2
if: matrix.variant.glibc != 'default' || contains(matrix.variant.target, 'linux')
if: matrix.variant.glibc != 'default' || contains(matrix.variant.target, 'aarch64-unknown-linux')
- uses: taiki-e/install-action@cargo-zigbuild
if: matrix.variant.glibc != 'default' || contains(matrix.variant.target, 'linux')
if: matrix.variant.glibc != 'default' || contains(matrix.variant.target, 'aarch64-unknown-linux')
# Download static resources
- name: Download and Extract Static Assets
@@ -150,7 +222,7 @@ jobs:
# Determine whether to use zigbuild
USE_ZIGBUILD=false
if [[ "$GLIBC" != "default" || "$TARGET" == *"linux"* ]]; then
if [[ "$GLIBC" != "default" || "$TARGET" == *"aarch64-unknown-linux"* ]]; then
USE_ZIGBUILD=true
echo "Using zigbuild for cross-compilation"
fi
@@ -180,14 +252,14 @@ jobs:
if [[ "$GLIBC" != "default" ]]; then
BIN_NAME="${BIN_NAME}.glibc${GLIBC}"
fi
# Windows builds get the .exe suffix; other systems get the .bin suffix
if [[ "${{ matrix.variant.target }}" == *"windows"* ]]; then
BIN_NAME="${BIN_NAME}.exe"
else
BIN_NAME="${BIN_NAME}.bin"
fi
echo "Binary name will be: $BIN_NAME"
echo "::group::Building rustfs"
@@ -265,17 +337,56 @@ jobs:
path: ${{ steps.package.outputs.artifact_name }}.zip
retention-days: 7
# Install ossutil2 tool for OSS upload
- name: Install ossutil2
if: startsWith(github.ref, 'refs/tags/') || github.ref == 'refs/heads/main'
shell: bash
run: |
echo "::group::Installing ossutil2"
# Download and install ossutil based on platform
if [ "${{ runner.os }}" = "Linux" ]; then
curl -o ossutil.zip https://gosspublic.alicdn.com/ossutil/v2/2.1.1/ossutil-2.1.1-linux-amd64.zip
unzip -o ossutil.zip
chmod 755 ossutil-2.1.1-linux-amd64/ossutil
sudo mv ossutil-2.1.1-linux-amd64/ossutil /usr/local/bin/
rm -rf ossutil.zip ossutil-2.1.1-linux-amd64
elif [ "${{ runner.os }}" = "macOS" ]; then
if [ "$(uname -m)" = "arm64" ]; then
curl -o ossutil.zip https://gosspublic.alicdn.com/ossutil/v2/2.1.1/ossutil-2.1.1-mac-arm64.zip
else
curl -o ossutil.zip https://gosspublic.alicdn.com/ossutil/v2/2.1.1/ossutil-2.1.1-mac-amd64.zip
fi
unzip -o ossutil.zip
chmod 755 ossutil-*/ossutil
sudo mv ossutil-*/ossutil /usr/local/bin/
rm -rf ossutil.zip ossutil-*
elif [ "${{ runner.os }}" = "Windows" ]; then
curl -o ossutil.zip https://gosspublic.alicdn.com/ossutil/v2/2.1.1/ossutil-2.1.1-windows-amd64.zip
unzip -o ossutil.zip
mv ossutil-*/ossutil.exe /usr/bin/ossutil.exe
rm -rf ossutil.zip ossutil-*
fi
echo "ossutil2 installation completed"
# Set the OSS configuration
ossutil config set Region oss-cn-beijing
ossutil config set endpoint oss-cn-beijing.aliyuncs.com
ossutil config set accessKeyID ${{ secrets.ALICLOUDOSS_KEY_ID }}
ossutil config set accessKeySecret ${{ secrets.ALICLOUDOSS_KEY_SECRET }}
- name: Upload to Aliyun OSS
if: startsWith(github.ref, 'refs/tags/') || github.ref == 'refs/heads/main'
uses: JohnGuan/oss-upload-action@main
with:
key-id: ${{ secrets.ALICLOUDOSS_KEY_ID }}
key-secret: ${{ secrets.ALICLOUDOSS_KEY_SECRET }}
region: oss-cn-beijing
bucket: rustfs-artifacts
assets: |
${{ steps.package.outputs.artifact_name }}.zip:/artifacts/rustfs/${{ steps.package.outputs.artifact_name }}.zip
${{ steps.package.outputs.artifact_name }}.zip:/artifacts/rustfs/${{ steps.package.outputs.artifact_name }}.latest.zip
shell: bash
env:
OSSUTIL_ACCESS_KEY_ID: ${{ secrets.ALICLOUDOSS_KEY_ID }}
OSSUTIL_ACCESS_KEY_SECRET: ${{ secrets.ALICLOUDOSS_KEY_SECRET }}
OSSUTIL_ENDPOINT: https://oss-cn-beijing.aliyuncs.com
run: |
echo "::group::Uploading files to OSS"
# Upload the artifact file to two different paths
ossutil cp "${{ steps.package.outputs.artifact_name }}.zip" "oss://rustfs-artifacts/artifacts/rustfs/${{ steps.package.outputs.artifact_name }}.zip" --force
ossutil cp "${{ steps.package.outputs.artifact_name }}.zip" "oss://rustfs-artifacts/artifacts/rustfs/${{ steps.package.outputs.artifact_name }}.latest.zip" --force
echo "Successfully uploaded artifacts to OSS"
# Determine whether to perform GUI construction based on conditions
- name: Prepare for GUI build
@@ -393,16 +504,17 @@ jobs:
# Upload GUI to Alibaba Cloud OSS
- name: Upload GUI to Aliyun OSS
if: startsWith(github.ref, 'refs/tags/')
uses: JohnGuan/oss-upload-action@main
with:
key-id: ${{ secrets.ALICLOUDOSS_KEY_ID }}
key-secret: ${{ secrets.ALICLOUDOSS_KEY_SECRET }}
region: oss-cn-beijing
bucket: rustfs-artifacts
assets: |
${{ steps.build_gui.outputs.gui_artifact_name }}.zip:/artifacts/rustfs/${{ steps.build_gui.outputs.gui_artifact_name }}.zip
${{ steps.build_gui.outputs.gui_artifact_name }}.zip:/artifacts/rustfs/${{ steps.build_gui.outputs.gui_artifact_name }}.latest.zip
shell: bash
env:
OSSUTIL_ACCESS_KEY_ID: ${{ secrets.ALICLOUDOSS_KEY_ID }}
OSSUTIL_ACCESS_KEY_SECRET: ${{ secrets.ALICLOUDOSS_KEY_SECRET }}
OSSUTIL_ENDPOINT: https://oss-cn-beijing.aliyuncs.com
run: |
echo "::group::Uploading GUI files to OSS"
# Upload the GUI artifact file to two different paths
ossutil cp "${{ steps.build_gui.outputs.gui_artifact_name }}.zip" "oss://rustfs-artifacts/artifacts/rustfs/${{ steps.build_gui.outputs.gui_artifact_name }}.zip" --force
ossutil cp "${{ steps.build_gui.outputs.gui_artifact_name }}.zip" "oss://rustfs-artifacts/artifacts/rustfs/${{ steps.build_gui.outputs.gui_artifact_name }}.latest.zip" --force
echo "Successfully uploaded GUI artifacts to OSS"
merge:
runs-on: ubuntu-latest


@@ -11,9 +11,6 @@ on:
- cron: '0 0 * * 0' # at midnight of each sunday
workflow_dispatch:
env:
CARGO_TERM_COLOR: always
jobs:
skip-check:
permissions:
@@ -30,93 +27,52 @@ jobs:
cancel_others: true
paths_ignore: '["*.md"]'
# Quality checks for pull requests
pr-checks:
name: Pull Request Quality Checks
if: github.event_name == 'pull_request'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4.2.2
- uses: ./.github/actions/setup
- name: Format Check
run: cargo fmt --all --check
- name: Lint Check
run: cargo check --all-targets
- name: Clippy Check
run: cargo clippy --all-targets --all-features -- -D warnings
- name: Unit Tests
run: cargo test --all --exclude e2e_test
develop:
needs: skip-check
if: needs.skip-check.outputs.should_skip != 'true'
runs-on: ubuntu-latest
timeout-minutes: 60
steps:
- uses: actions/checkout@v4.2.2
- uses: actions/checkout@v4
- uses: ./.github/actions/setup
- name: Test
run: cargo test --all --exclude e2e_test
- name: Format
run: cargo fmt --all --check
- name: Lint
run: cargo check --all-targets
- name: Clippy
run: cargo clippy --all-targets --all-features -- -D warnings
- name: Test
run: cargo test --all --exclude e2e_test
s3s-e2e:
name: E2E (s3s-e2e)
needs: skip-check
if: needs.skip-check.outputs.should_skip != 'true'
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- uses: actions/checkout@v4.2.2
- uses: ./.github/actions/setup
- name: Install s3s-e2e
uses: taiki-e/cache-cargo-install-action@v2
with:
tool: s3s-e2e
git: https://github.com/Nugine/s3s.git
rev: b7714bfaa17ddfa9b23ea01774a1e7bbdbfc2ca3
- name: Build debug
run: |
touch rustfs/build.rs
cargo build -p rustfs --bins
- name: Pack artifacts
run: |
mkdir -p ./target/artifacts
cp target/debug/rustfs ./target/artifacts/rustfs-debug
- uses: actions/upload-artifact@v4
with:
name: rustfs
path: ./target/artifacts/*
s3s-e2e:
name: E2E (s3s-e2e)
needs:
- skip-check
- develop
if: needs.skip-check.outputs.should_skip != 'true'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4.2.2
- uses: dtolnay/rust-toolchain@stable
- uses: Swatinem/rust-cache@v2
with:
cache-on-failure: true
cache-all-crates: true
- name: Install s3s-e2e
run: |
cargo install s3s-e2e --git https://github.com/Nugine/s3s.git
s3s-e2e --version
- uses: actions/download-artifact@v4
with:
name: rustfs
path: ./target/artifacts
- name: Run s3s-e2e
timeout-minutes: 10
run: |
./scripts/e2e-run.sh ./target/artifacts/rustfs-debug /tmp/rustfs
s3s-e2e --version
./scripts/e2e-run.sh ./target/debug/rustfs /tmp/rustfs
- uses: actions/upload-artifact@v4
with:
name: s3s-e2e.logs
path: /tmp/rustfs.log
path: /tmp/rustfs.log

.github/workflows/docker.yml (new file)

@@ -0,0 +1,227 @@
name: Build and Push Docker Images
on:
push:
branches:
- main
tags:
- "v*"
pull_request:
branches:
- main
workflow_dispatch:
inputs:
push_to_registry:
description: "Push images to registry"
required: false
default: "true"
type: boolean
env:
REGISTRY_IMAGE_DOCKERHUB: rustfs/rustfs
REGISTRY_IMAGE_GHCR: ghcr.io/${{ github.repository }}
jobs:
# Skip duplicate job runs
skip-check:
permissions:
actions: write
contents: read
runs-on: ubuntu-latest
outputs:
should_skip: ${{ steps.skip_check.outputs.should_skip }}
steps:
- id: skip_check
uses: fkirc/skip-duplicate-actions@v5
with:
concurrent_skipping: "same_content_newer"
cancel_others: true
paths_ignore: '["*.md", "docs/**"]'
# Build RustFS binary for different platforms
build-binary:
needs: skip-check
if: needs.skip-check.outputs.should_skip != 'true'
strategy:
matrix:
include:
- target: x86_64-unknown-linux-musl
os: ubuntu-latest
arch: amd64
use_cross: false
- target: aarch64-unknown-linux-gnu
os: ubuntu-latest
arch: arm64
use_cross: true
runs-on: ${{ matrix.os }}
timeout-minutes: 120
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Setup Rust toolchain
uses: actions-rust-lang/setup-rust-toolchain@v1
with:
target: ${{ matrix.target }}
components: rustfmt, clippy
- name: Install cross-compilation dependencies (native build)
if: matrix.use_cross == false
run: |
sudo apt-get update
sudo apt-get install -y musl-tools
- name: Install cross tool (cross compilation)
if: matrix.use_cross == true
uses: taiki-e/install-action@v2
with:
tool: cross
- name: Install protoc
uses: arduino/setup-protoc@v3
with:
version: "31.1"
repo-token: ${{ secrets.GITHUB_TOKEN }}
- name: Install flatc
uses: Nugine/setup-flatc@v1
with:
version: "25.2.10"
- name: Cache cargo dependencies
uses: actions/cache@v3
with:
path: |
~/.cargo/registry
~/.cargo/git
target
key: ${{ runner.os }}-cargo-${{ matrix.target }}-${{ hashFiles('**/Cargo.lock') }}
restore-keys: |
${{ runner.os }}-cargo-${{ matrix.target }}-
${{ runner.os }}-cargo-
- name: Generate protobuf code
run: cargo run --bin gproto
- name: Build RustFS binary (native)
if: matrix.use_cross == false
run: |
cargo build --release --target ${{ matrix.target }} --bin rustfs
- name: Build RustFS binary (cross)
if: matrix.use_cross == true
run: |
cross build --release --target ${{ matrix.target }} --bin rustfs
- name: Upload binary artifact
uses: actions/upload-artifact@v4
with:
name: rustfs-${{ matrix.arch }}
path: target/${{ matrix.target }}/release/rustfs
retention-days: 1
# Build and push multi-arch Docker images
build-images:
needs: [skip-check, build-binary]
if: needs.skip-check.outputs.should_skip != 'true'
runs-on: ubuntu-latest
timeout-minutes: 60
strategy:
matrix:
image-type: [production, ubuntu, rockylinux, devenv]
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Download binary artifacts
uses: actions/download-artifact@v4
with:
path: ./artifacts
- name: Setup binary files
run: |
mkdir -p target/x86_64-unknown-linux-musl/release
mkdir -p target/aarch64-unknown-linux-gnu/release
cp artifacts/rustfs-amd64/rustfs target/x86_64-unknown-linux-musl/release/
cp artifacts/rustfs-arm64/rustfs target/aarch64-unknown-linux-gnu/release/
chmod +x target/*/release/rustfs
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Login to Docker Hub
if: github.event_name != 'pull_request' && (github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/'))
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Login to GitHub Container Registry
if: github.event_name != 'pull_request' && (github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/'))
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Set Dockerfile and context
id: dockerfile
run: |
case "${{ matrix.image-type }}" in
production)
echo "dockerfile=Dockerfile" >> $GITHUB_OUTPUT
echo "context=." >> $GITHUB_OUTPUT
echo "suffix=" >> $GITHUB_OUTPUT
;;
ubuntu)
echo "dockerfile=.docker/Dockerfile.ubuntu22.04" >> $GITHUB_OUTPUT
echo "context=." >> $GITHUB_OUTPUT
echo "suffix=-ubuntu22.04" >> $GITHUB_OUTPUT
;;
rockylinux)
echo "dockerfile=.docker/Dockerfile.rockylinux9.3" >> $GITHUB_OUTPUT
echo "context=." >> $GITHUB_OUTPUT
echo "suffix=-rockylinux9.3" >> $GITHUB_OUTPUT
;;
devenv)
echo "dockerfile=.docker/Dockerfile.devenv" >> $GITHUB_OUTPUT
echo "context=." >> $GITHUB_OUTPUT
echo "suffix=-devenv" >> $GITHUB_OUTPUT
;;
esac
- name: Extract metadata
id: meta
uses: docker/metadata-action@v5
with:
images: |
${{ env.REGISTRY_IMAGE_DOCKERHUB }}
${{ env.REGISTRY_IMAGE_GHCR }}
tags: |
type=ref,event=branch,suffix=${{ steps.dockerfile.outputs.suffix }}
type=ref,event=pr,suffix=${{ steps.dockerfile.outputs.suffix }}
type=semver,pattern={{version}},suffix=${{ steps.dockerfile.outputs.suffix }}
type=semver,pattern={{major}}.{{minor}},suffix=${{ steps.dockerfile.outputs.suffix }}
type=semver,pattern={{major}},suffix=${{ steps.dockerfile.outputs.suffix }}
type=raw,value=latest,suffix=${{ steps.dockerfile.outputs.suffix }},enable={{is_default_branch}}
flavor: |
latest=false
- name: Build and push multi-arch Docker image
uses: docker/build-push-action@v5
with:
context: ${{ steps.dockerfile.outputs.context }}
file: ${{ steps.dockerfile.outputs.dockerfile }}
platforms: linux/amd64,linux/arm64
push: ${{ (github.event_name != 'pull_request' && (github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/'))) || github.event.inputs.push_to_registry == 'true' }}
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha,scope=${{ matrix.image-type }}
cache-to: type=gha,mode=max,scope=${{ matrix.image-type }}
build-args: |
BUILDTIME=${{ fromJSON(steps.meta.outputs.json).labels['org.opencontainers.image.created'] }}
VERSION=${{ fromJSON(steps.meta.outputs.json).labels['org.opencontainers.image.version'] }}
REVISION=${{ fromJSON(steps.meta.outputs.json).labels['org.opencontainers.image.revision'] }}

Cargo.lock (generated)

File diff suppressed because it is too large


@@ -20,14 +20,17 @@ members = [
"rustfs", # Core file system implementation
"s3select/api", # S3 Select API interface
"s3select/query", # S3 Select query engine
"crates/zip",
"crates/filemeta",
"crates/rio",
]
resolver = "2"
[workspace.package]
edition = "2021"
edition = "2024"
license = "Apache-2.0"
repository = "https://github.com/rustfs/rustfs"
rust-version = "1.75"
rust-version = "1.85"
version = "0.0.1"
[workspace.lints.rust]
@@ -54,6 +57,10 @@ rustfs-config = { path = "./crates/config", version = "0.0.1" }
rustfs-obs = { path = "crates/obs", version = "0.0.1" }
rustfs-notify = { path = "crates/notify", version = "0.0.1" }
rustfs-utils = { path = "crates/utils", version = "0.0.1" }
rustfs-rio = { path = "crates/rio", version = "0.0.1" }
rustfs-filemeta = { path = "crates/filemeta", version = "0.0.1" }
rustfs-disk = { path = "crates/disk", version = "0.0.1" }
rustfs-error = { path = "crates/error", version = "0.0.1" }
workers = { path = "./common/workers", version = "0.0.1" }
aes-gcm = { version = "0.10.3", features = ["std"] }
arc-swap = "1.7.1"
@@ -68,14 +75,16 @@ axum-extra = "0.10.1"
axum-server = { version = "0.7.2", features = ["tls-rustls"] }
backon = "1.5.1"
base64-simd = "0.8.0"
base64 = "0.22.1"
blake2 = "0.10.6"
bytes = "1.10.1"
bytes = { version = "1.10.1", features = ["serde"] }
bytesize = "2.0.1"
byteorder = "1.5.0"
cfg-if = "1.0.0"
chacha20poly1305 = { version = "0.10.1" }
chrono = { version = "0.4.41", features = ["serde"] }
clap = { version = "4.5.40", features = ["derive", "env"] }
config = "0.15.11"
const-str = { version = "0.6.2", features = ["std", "proc"] }
crc32fast = "1.4.2"
datafusion = "46.0.1"
@@ -84,7 +93,7 @@ dioxus = { version = "0.6.3", features = ["router"] }
dirs = "6.0.0"
dotenvy = "0.15.7"
flatbuffers = "25.2.10"
flexi_logger = { version = "0.30.2", features = ["trc"] }
flexi_logger = { version = "0.30.2", features = ["trc", "dont_minimize_extra_stacks"] }
futures = "0.3.31"
futures-core = "0.3.31"
futures-util = "0.3.31"
@@ -92,6 +101,7 @@ glob = "0.3.2"
hex = "0.4.3"
hex-simd = "0.8.0"
highway = { version = "1.3.0" }
hmac = "0.12.1"
hyper = "1.6.0"
hyper-util = { version = "0.1.14", features = [
"tokio",
@@ -141,6 +151,7 @@ opentelemetry-semantic-conventions = { version = "0.30.0", features = [
parking_lot = "0.12.4"
path-absolutize = "3.1.1"
path-clean = "1.0.1"
blake3 = { version = "1.8.2" }
pbkdf2 = "0.12.2"
percent-encoding = "2.3.1"
pin-project-lite = "0.2.16"
@@ -149,8 +160,13 @@ prost = "0.13.5"
prost-build = "0.13.5"
protobuf = "3.7"
rand = "0.9.1"
brotli = "8.0.1"
flate2 = "1.1.1"
zstd = "0.13.3"
lz4 = "1.28.1"
rdkafka = { version = "0.37.0", features = ["tokio"] }
reed-solomon-erasure = { version = "6.0.0", features = ["simd-accel"] }
reed-solomon-simd = { version = "3.0.0" }
regex = { version = "1.11.1" }
reqwest = { version = "0.12.19", default-features = false, features = [
"rustls-tls",
@@ -186,7 +202,6 @@ serde_with = "3.12.0"
sha2 = "0.10.9"
siphasher = "1.0.1"
smallvec = { version = "1.15.1", features = ["serde"] }
snafu = "0.8.6"
snap = "1.1.1"
socket2 = "0.5.10"
@@ -241,10 +256,10 @@ inherits = "dev"
[profile.release]
opt-level = 3
lto = "thin"
codegen-units = 1
panic = "abort" # Optional, remove the panic expansion code
strip = true # strip symbol information to reduce binary size
#lto = "thin"
#codegen-units = 1
#panic = "abort" # Optional, remove the panic expansion code
#strip = true # strip symbol information to reduce binary size
[profile.production]
inherits = "release"


@@ -1,17 +1,37 @@
FROM alpine:latest
# RUN apk add --no-cache <package-name>
# Install runtime dependencies
RUN apk add --no-cache \
ca-certificates \
tzdata \
&& rm -rf /var/cache/apk/*
# Create rustfs user and group
RUN addgroup -g 1000 rustfs && \
adduser -D -s /bin/sh -u 1000 -G rustfs rustfs
WORKDIR /app
RUN mkdir -p /data/rustfs0 /data/rustfs1 /data/rustfs2 /data/rustfs3
# Create data directories
RUN mkdir -p /data/rustfs{0,1,2,3} && \
chown -R rustfs:rustfs /data /app
COPY ./target/x86_64-unknown-linux-musl/release/rustfs /app/rustfs
# Copy binary based on target architecture
COPY --chown=rustfs:rustfs \
target/*/release/rustfs \
/app/rustfs
RUN chmod +x /app/rustfs
EXPOSE 9000
EXPOSE 9001
# Switch to non-root user
USER rustfs
# Expose ports
EXPOSE 9000 9001
CMD ["/app/rustfs"]
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD wget --no-verbose --tries=1 --spider http://localhost:9000/health || exit 1
# Set default command
CMD ["/app/rustfs"]

Dockerfile.multi-stage (new file)

@@ -0,0 +1,121 @@
# Multi-stage Dockerfile for RustFS
# Supports cross-compilation for amd64 and arm64 architectures
ARG TARGETPLATFORM
ARG BUILDPLATFORM
# Build stage
FROM --platform=$BUILDPLATFORM rust:1.85-bookworm AS builder
# Install required build dependencies
RUN apt-get update && apt-get install -y \
wget \
git \
curl \
unzip \
gcc \
pkg-config \
libssl-dev \
lld \
&& rm -rf /var/lib/apt/lists/*
# Install cross-compilation tools for ARM64
RUN if [ "$TARGETPLATFORM" = "linux/arm64" ]; then \
apt-get update && \
apt-get install -y gcc-aarch64-linux-gnu && \
rm -rf /var/lib/apt/lists/*; \
fi
# Install protoc
RUN wget https://github.com/protocolbuffers/protobuf/releases/download/v31.1/protoc-31.1-linux-x86_64.zip \
&& unzip protoc-31.1-linux-x86_64.zip -d protoc3 \
&& mv protoc3/bin/* /usr/local/bin/ && chmod +x /usr/local/bin/protoc \
&& mv protoc3/include/* /usr/local/include/ && rm -rf protoc-31.1-linux-x86_64.zip protoc3
# Install flatc
RUN wget https://github.com/google/flatbuffers/releases/download/v25.2.10/Linux.flatc.binary.g++-13.zip \
&& unzip Linux.flatc.binary.g++-13.zip \
&& mv flatc /usr/local/bin/ && chmod +x /usr/local/bin/flatc && rm -rf Linux.flatc.binary.g++-13.zip
# Set up Rust targets based on platform
RUN case "$TARGETPLATFORM" in \
"linux/amd64") rustup target add x86_64-unknown-linux-gnu ;; \
"linux/arm64") rustup target add aarch64-unknown-linux-gnu ;; \
*) echo "Unsupported platform: $TARGETPLATFORM" && exit 1 ;; \
esac
# Set up environment for cross-compilation
ENV CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=aarch64-linux-gnu-gcc
ENV CC_aarch64_unknown_linux_gnu=aarch64-linux-gnu-gcc
ENV CXX_aarch64_unknown_linux_gnu=aarch64-linux-gnu-g++
WORKDIR /usr/src/rustfs
# Copy Cargo files for dependency caching
COPY Cargo.toml Cargo.lock ./
COPY */Cargo.toml ./*/
# Create dummy main.rs files for dependency compilation
RUN find . -name "Cargo.toml" -not -path "./Cargo.toml" | \
xargs -I {} dirname {} | \
xargs -I {} sh -c 'mkdir -p {}/src && echo "fn main() {}" > {}/src/main.rs'
# Build dependencies only (cache layer)
RUN case "$TARGETPLATFORM" in \
"linux/amd64") cargo build --release --target x86_64-unknown-linux-gnu ;; \
"linux/arm64") cargo build --release --target aarch64-unknown-linux-gnu ;; \
esac
# Copy source code
COPY . .
# Generate protobuf code
RUN cargo run --bin gproto
# Build the actual application
RUN case "$TARGETPLATFORM" in \
"linux/amd64") \
cargo build --release --target x86_64-unknown-linux-gnu --bin rustfs && \
cp target/x86_64-unknown-linux-gnu/release/rustfs /usr/local/bin/rustfs \
;; \
"linux/arm64") \
cargo build --release --target aarch64-unknown-linux-gnu --bin rustfs && \
cp target/aarch64-unknown-linux-gnu/release/rustfs /usr/local/bin/rustfs \
;; \
esac
# Runtime stage - Ubuntu minimal for better compatibility
FROM ubuntu:22.04
# Install runtime dependencies
RUN apt-get update && apt-get install -y \
ca-certificates \
tzdata \
wget \
&& rm -rf /var/lib/apt/lists/*
# Create rustfs user and group
RUN groupadd -g 1000 rustfs && \
useradd -d /app -g rustfs -u 1000 -s /bin/bash rustfs
WORKDIR /app
# Create data directories
RUN mkdir -p /data/rustfs{0,1,2,3} && \
chown -R rustfs:rustfs /data /app
# Copy binary from builder stage
COPY --from=builder /usr/local/bin/rustfs /app/rustfs
RUN chmod +x /app/rustfs && chown rustfs:rustfs /app/rustfs
# Switch to non-root user
USER rustfs
# Expose ports
EXPOSE 9000 9001
# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD wget --no-verbose --tries=1 --spider http://localhost:9000/health || exit 1
# Set default command
CMD ["/app/rustfs"]

Makefile

@@ -79,3 +79,121 @@ build: BUILD_CMD = /root/.cargo/bin/cargo build --release --bin rustfs --target-
build:
$(DOCKER_CLI) build -t $(ROCKYLINUX_BUILD_IMAGE_NAME) -f $(DOCKERFILE_PATH)/Dockerfile.$(BUILD_OS) .
$(DOCKER_CLI) run --rm --name $(ROCKYLINUX_BUILD_CONTAINER_NAME) -v $(shell pwd):/root/s3-rustfs -it $(ROCKYLINUX_BUILD_IMAGE_NAME) $(BUILD_CMD)
.PHONY: build-musl
build-musl:
@echo "🔨 Building rustfs for x86_64-unknown-linux-musl..."
cargo build --target x86_64-unknown-linux-musl --bin rustfs -r
.PHONY: build-gnu
build-gnu:
@echo "🔨 Building rustfs for x86_64-unknown-linux-gnu..."
cargo build --target x86_64-unknown-linux-gnu --bin rustfs -r
.PHONY: deploy-dev
deploy-dev: build-musl
@echo "🚀 Deploying to dev server: $${IP}"
./scripts/dev_deploy.sh $${IP}
# Multi-architecture Docker build targets
.PHONY: docker-build-multiarch
docker-build-multiarch:
@echo "🏗️ Building multi-architecture Docker images..."
./scripts/build-docker-multiarch.sh
.PHONY: docker-build-multiarch-push
docker-build-multiarch-push:
@echo "🚀 Building and pushing multi-architecture Docker images..."
./scripts/build-docker-multiarch.sh --push
.PHONY: docker-build-multiarch-version
docker-build-multiarch-version:
@if [ -z "$(VERSION)" ]; then \
echo "❌ 错误: 请指定版本, 例如: make docker-build-multiarch-version VERSION=v1.0.0"; \
exit 1; \
fi
@echo "🏗️ Building multi-architecture Docker images (version: $(VERSION))..."
./scripts/build-docker-multiarch.sh --version $(VERSION)
.PHONY: docker-push-multiarch-version
docker-push-multiarch-version:
@if [ -z "$(VERSION)" ]; then \
echo "❌ 错误: 请指定版本, 例如: make docker-push-multiarch-version VERSION=v1.0.0"; \
exit 1; \
fi
@echo "🚀 Building and pushing multi-architecture Docker images (version: $(VERSION))..."
./scripts/build-docker-multiarch.sh --version $(VERSION) --push
.PHONY: docker-build-ubuntu
docker-build-ubuntu:
@echo "🏗️ Building multi-architecture Ubuntu Docker images..."
./scripts/build-docker-multiarch.sh --type ubuntu
.PHONY: docker-build-rockylinux
docker-build-rockylinux:
@echo "🏗️ Building multi-architecture RockyLinux Docker images..."
./scripts/build-docker-multiarch.sh --type rockylinux
.PHONY: docker-build-devenv
docker-build-devenv:
@echo "🏗️ Building multi-architecture development environment Docker images..."
./scripts/build-docker-multiarch.sh --type devenv
.PHONY: docker-build-all-types
docker-build-all-types:
@echo "🏗️ Building all multi-architecture Docker image types..."
./scripts/build-docker-multiarch.sh --type production
./scripts/build-docker-multiarch.sh --type ubuntu
./scripts/build-docker-multiarch.sh --type rockylinux
./scripts/build-docker-multiarch.sh --type devenv
.PHONY: docker-inspect-multiarch
docker-inspect-multiarch:
@if [ -z "$(IMAGE)" ]; then \
echo "❌ 错误: 请指定镜像, 例如: make docker-inspect-multiarch IMAGE=rustfs/rustfs:latest"; \
exit 1; \
fi
@echo "🔍 Inspecting multi-architecture image: $(IMAGE)"
docker buildx imagetools inspect $(IMAGE)
.PHONY: build-cross-all
build-cross-all:
@echo "🔧 Building all target architectures..."
@if ! command -v cross &> /dev/null; then \
echo "📦 Installing cross..."; \
cargo install cross; \
fi
@echo "🔨 Generating protobuf code..."
cargo run --bin gproto || true
@echo "🔨 Building x86_64-unknown-linux-musl..."
cargo build --release --target x86_64-unknown-linux-musl --bin rustfs
@echo "🔨 Building aarch64-unknown-linux-gnu..."
cross build --release --target aarch64-unknown-linux-gnu --bin rustfs
@echo "✅ All architectures built successfully!"
.PHONY: help-docker
help-docker:
@echo "🐳 Docker 多架构构建帮助:"
@echo ""
@echo "基本构建:"
@echo " make docker-build-multiarch # 构建多架构镜像(不推送)"
@echo " make docker-build-multiarch-push # 构建并推送多架构镜像"
@echo ""
@echo "版本构建:"
@echo " make docker-build-multiarch-version VERSION=v1.0.0 # 构建指定版本"
@echo " make docker-push-multiarch-version VERSION=v1.0.0 # 构建并推送指定版本"
@echo ""
@echo "镜像类型:"
@echo " make docker-build-ubuntu # 构建 Ubuntu 镜像"
@echo " make docker-build-rockylinux # 构建 RockyLinux 镜像"
@echo " make docker-build-devenv # 构建开发环境镜像"
@echo " make docker-build-all-types # 构建所有类型镜像"
@echo ""
@echo "辅助工具:"
@echo " make build-cross-all # 构建所有架构的二进制文件"
@echo " make docker-inspect-multiarch IMAGE=xxx # 检查镜像的架构支持"
@echo ""
@echo "环境变量 (在推送时需要设置):"
@echo " DOCKERHUB_USERNAME Docker Hub 用户名"
@echo " DOCKERHUB_TOKEN Docker Hub 访问令牌"
@echo " GITHUB_TOKEN GitHub 访问令牌"

View File

@@ -1,4 +1,3 @@
use common::error::Result;
use rsa::Pkcs1v15Encrypt;
use rsa::{
pkcs8::{DecodePrivateKey, DecodePublicKey},
@@ -6,6 +5,7 @@ use rsa::{
RsaPrivateKey, RsaPublicKey,
};
use serde::{Deserialize, Serialize};
use std::io::{Error, Result};
#[derive(Serialize, Deserialize, Debug, Default, Clone)]
pub struct Token {
@@ -19,8 +19,8 @@ pub struct Token {
// Returns the base64-encoded encrypted string
pub fn gencode(token: &Token, key: &str) -> Result<String> {
let data = serde_json::to_vec(token)?;
let public_key = RsaPublicKey::from_public_key_pem(key)?;
let encrypted_data = public_key.encrypt(&mut OsRng, Pkcs1v15Encrypt, &data)?;
let public_key = RsaPublicKey::from_public_key_pem(key).map_err(Error::other)?;
let encrypted_data = public_key.encrypt(&mut OsRng, Pkcs1v15Encrypt, &data).map_err(Error::other)?;
Ok(base64_simd::URL_SAFE_NO_PAD.encode_to_string(&encrypted_data))
}
@@ -29,9 +29,11 @@ pub fn gencode(token: &Token, key: &str) -> Result<String> {
// [key] private key string
// Returns a Token object
pub fn parse(token: &str, key: &str) -> Result<Token> {
let encrypted_data = base64_simd::URL_SAFE_NO_PAD.decode_to_vec(token.as_bytes())?;
let private_key = RsaPrivateKey::from_pkcs8_pem(key)?;
let decrypted_data = private_key.decrypt(Pkcs1v15Encrypt, &encrypted_data)?;
let encrypted_data = base64_simd::URL_SAFE_NO_PAD
.decode_to_vec(token.as_bytes())
.map_err(Error::other)?;
let private_key = RsaPrivateKey::from_pkcs8_pem(key).map_err(Error::other)?;
let decrypted_data = private_key.decrypt(Pkcs1v15Encrypt, &encrypted_data).map_err(Error::other)?;
let res: Token = serde_json::from_slice(&decrypted_data)?;
Ok(res)
}
@@ -50,7 +52,7 @@ pub fn parse_license(license: &str) -> Result<Token> {
// }
}
static TEST_PRIVATE_KEY:&str ="-----BEGIN PRIVATE KEY-----\nMIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQCj86SrJIuxSxR6\nBJ/dlJEUIj6NeBRnhLQlCDdovuz61+7kJXVcxaR66w4m8W7SLEUP+IlPtnn6vmiG\n7XMhGNHIr7r1JsEVVLhZmL3tKI66DEZl786ZhG81BWqUlmcooIPS8UEPZNqJXLuz\nVGhxNyVGbj/tV7QC2pSISnKaixc+nrhxvo7w56p5qrm9tik0PjTgfZsUePkoBsSN\npoRkAauS14MAzK6HGB75CzG3dZqXUNWSWVocoWtQbZUwFGXyzU01ammsHQDvc2xu\nK1RQpd1qYH5bOWZ0N0aPFwT0r59HztFXg9sbjsnuhO1A7OiUOkc6iGVuJ0wm/9nA\nwZIBqzgjAgMBAAECggEAPMpeSEbotPhNw2BrllE76ec4omPfzPJbiU+em+wPGoNu\nRJHPDnMKJbl6Kd5jZPKdOOrCnxfd6qcnQsBQa/kz7+GYxMV12l7ra+1Cnujm4v0i\nLTHZvPpp8ZLsjeOmpF3AAzsJEJgon74OqtOlVjVIUPEYKvzV9ijt4gsYq0zfdYv0\nhrTMzyrGM4/UvKLsFIBROAfCeWfA7sXLGH8JhrRAyDrtCPzGtyyAmzoHKHtHafcB\nuyPFw/IP8otAgpDk5iiQPNkH0WwzAQIm12oHuNUa66NwUK4WEjXTnDg8KeWLHHNv\nIfN8vdbZchMUpMIvvkr7is315d8f2cHCB5gEO+GWAQKBgQDR/0xNll+FYaiUKCPZ\nvkOCAd3l5mRhsqnjPQ/6Ul1lAyYWpoJSFMrGGn/WKTa/FVFJRTGbBjwP+Mx10bfb\ngUg2GILDTISUh54fp4zngvTi9w4MWGKXrb7I1jPkM3vbJfC/v2fraQ/r7qHPpO2L\nf6ZbGxasIlSvr37KeGoelwcAQQKBgQDH3hmOTS2Hl6D4EXdq5meHKrfeoicGN7m8\noQK7u8iwn1R9zK5nh6IXxBhKYNXNwdCQtBZVRvFjjZ56SZJb7lKqa1BcTsgJfZCy\nnI3Uu4UykrECAH8AVCVqBXUDJmeA2yE+gDAtYEjvhSDHpUfWxoGHr0B/Oqk2Lxc/\npRy1qV5fYwKBgBWSL/hYVf+RhIuTg/s9/BlCr9SJ0g3nGGRrRVTlWQqjRCpXeFOO\nJzYqSq9pFGKUggEQxoOyJEFPwVDo9gXqRcyov+Xn2kaXl7qQr3yoixc1YZALFDWY\nd1ySBEqQr0xXnV9U/gvEgwotPRnjSzNlLWV2ZuHPtPtG/7M0o1H5GZMBAoGAKr3N\nW0gX53o+my4pCnxRQW+aOIsWq1a5aqRIEFudFGBOUkS2Oz+fI1P1GdrRfhnnfzpz\n2DK+plp/vIkFOpGhrf4bBlJ2psjqa7fdANRFLMaAAfyXLDvScHTQTCcnVUAHQPVq\n2BlSH56pnugyj7SNuLV6pnql+wdhAmRN2m9o1h8CgYAbX2juSr4ioXwnYjOUdrIY\n4+ERvHcXdjoJmmPcAm4y5NbSqLXyU0FQmplNMt2A5LlniWVJ9KNdjAQUt60FZw/+\nr76LdxXaHNZghyx0BOs7mtq5unSQXamZ8KixasfhE9uz3ij1jXjG6hafWkS8/68I\nuWbaZqgvy7a9oPHYlKH7Jg==\n-----END PRIVATE KEY-----\n";
static TEST_PRIVATE_KEY: &str = "-----BEGIN PRIVATE KEY-----\nMIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQCj86SrJIuxSxR6\nBJ/dlJEUIj6NeBRnhLQlCDdovuz61+7kJXVcxaR66w4m8W7SLEUP+IlPtnn6vmiG\n7XMhGNHIr7r1JsEVVLhZmL3tKI66DEZl786ZhG81BWqUlmcooIPS8UEPZNqJXLuz\nVGhxNyVGbj/tV7QC2pSISnKaixc+nrhxvo7w56p5qrm9tik0PjTgfZsUePkoBsSN\npoRkAauS14MAzK6HGB75CzG3dZqXUNWSWVocoWtQbZUwFGXyzU01ammsHQDvc2xu\nK1RQpd1qYH5bOWZ0N0aPFwT0r59HztFXg9sbjsnuhO1A7OiUOkc6iGVuJ0wm/9nA\nwZIBqzgjAgMBAAECggEAPMpeSEbotPhNw2BrllE76ec4omPfzPJbiU+em+wPGoNu\nRJHPDnMKJbl6Kd5jZPKdOOrCnxfd6qcnQsBQa/kz7+GYxMV12l7ra+1Cnujm4v0i\nLTHZvPpp8ZLsjeOmpF3AAzsJEJgon74OqtOlVjVIUPEYKvzV9ijt4gsYq0zfdYv0\nhrTMzyrGM4/UvKLsFIBROAfCeWfA7sXLGH8JhrRAyDrtCPzGtyyAmzoHKHtHafcB\nuyPFw/IP8otAgpDk5iiQPNkH0WwzAQIm12oHuNUa66NwUK4WEjXTnDg8KeWLHHNv\nIfN8vdbZchMUpMIvvkr7is315d8f2cHCB5gEO+GWAQKBgQDR/0xNll+FYaiUKCPZ\nvkOCAd3l5mRhsqnjPQ/6Ul1lAyYWpoJSFMrGGn/WKTa/FVFJRTGbBjwP+Mx10bfb\ngUg2GILDTISUh54fp4zngvTi9w4MWGKXrb7I1jPkM3vbJfC/v2fraQ/r7qHPpO2L\nf6ZbGxasIlSvr37KeGoelwcAQQKBgQDH3hmOTS2Hl6D4EXdq5meHKrfeoicGN7m8\noQK7u8iwn1R9zK5nh6IXxBhKYNXNwdCQtBZVRvFjjZ56SZJb7lKqa1BcTsgJfZCy\nnI3Uu4UykrECAH8AVCVqBXUDJmeA2yE+gDAtYEjvhSDHpUfWxoGHr0B/Oqk2Lxc/\npRy1qV5fYwKBgBWSL/hYVf+RhIuTg/s9/BlCr9SJ0g3nGGRrRVTlWQqjRCpXeFOO\nJzYqSq9pFGKUggEQxoOyJEFPwVDo9gXqRcyov+Xn2kaXl7qQr3yoixc1YZALFDWY\nd1ySBEqQr0xXnV9U/gvEgwotPRnjSzNlLWV2ZuHPtPtG/7M0o1H5GZMBAoGAKr3N\nW0gX53o+my4pCnxRQW+aOIsWq1a5aqRIEFudFGBOUkS2Oz+fI1P1GdrRfhnnfzpz\n2DK+plp/vIkFOpGhrf4bBlJ2psjqa7fdANRFLMaAAfyXLDvScHTQTCcnVUAHQPVq\n2BlSH56pnugyj7SNuLV6pnql+wdhAmRN2m9o1h8CgYAbX2juSr4ioXwnYjOUdrIY\n4+ERvHcXdjoJmmPcAm4y5NbSqLXyU0FQmplNMt2A5LlniWVJ9KNdjAQUt60FZw/+\nr76LdxXaHNZghyx0BOs7mtq5unSQXamZ8KixasfhE9uz3ij1jXjG6hafWkS8/68I\nuWbaZqgvy7a9oPHYlKH7Jg==\n-----END PRIVATE KEY-----\n";
#[cfg(test)]
mod tests {

View File

@@ -11,7 +11,7 @@ use tokio::fs;
use tokio::fs::File;
use tokio::io::AsyncWriteExt;
use tokio::net::TcpStream;
use tokio::sync::{mpsc, Mutex};
use tokio::sync::{Mutex, mpsc};
#[derive(RustEmbed)]
#[folder = "$CARGO_MANIFEST_DIR/embedded-rustfs/"]
@@ -746,10 +746,10 @@ mod tests {
assert_eq!(ServiceManager::extract_port("host:0"), Some(0));
assert_eq!(ServiceManager::extract_port("host:65535"), Some(65535));
assert_eq!(ServiceManager::extract_port("host:65536"), None); // Out of range
// IPv6-like address - extract_port takes the second part after split(':')
// For "::1:8080", split(':') gives ["", "", "1", "8080"], nth(1) gives ""
// IPv6-like address - extract_port takes the second part after split(':')
// For "::1:8080", split(':') gives ["", "", "1", "8080"], nth(1) gives ""
assert_eq!(ServiceManager::extract_port("::1:8080"), None); // Second part is empty
// For "[::1]:8080", split(':') gives ["[", "", "1]", "8080"], nth(1) gives ""
// For "[::1]:8080", split(':') gives ["[", "", "1]", "8080"], nth(1) gives ""
assert_eq!(ServiceManager::extract_port("[::1]:8080"), None); // Second part is empty
}

View File

@@ -11,6 +11,13 @@ pub struct Error {
}
impl Error {
pub fn other<E>(error: E) -> Self
where
E: std::fmt::Display + Into<Box<dyn std::error::Error + Send + Sync>>,
{
Self::from_std_error(error.into())
}
/// Create a new error from a `std::error::Error`.
#[must_use]
#[track_caller]

View File

@@ -1,5 +1,5 @@
pub mod bucket_stats;
pub mod error;
// pub mod error;
pub mod globals;
pub mod last_minute;

View File

@@ -3,7 +3,7 @@ use std::time::{Duration, Instant};
use tokio::{sync::mpsc::Sender, time::sleep};
use tracing::{info, warn};
use crate::{lock_args::LockArgs, LockApi, Locker};
use crate::{LockApi, Locker, lock_args::LockArgs};
const DRW_MUTEX_REFRESH_INTERVAL: Duration = Duration::from_secs(10);
const LOCK_RETRY_MIN_INTERVAL: Duration = Duration::from_millis(250);
@@ -117,7 +117,10 @@ impl DRWMutex {
quorum += 1;
}
}
info!("lockBlocking {}/{} for {:?}: lockType readLock({}), additional opts: {:?}, quorum: {}, tolerance: {}, lockClients: {}\n", id, source, self.names, is_read_lock, opts, quorum, tolerance, locker_len);
info!(
"lockBlocking {}/{} for {:?}: lockType readLock({}), additional opts: {:?}, quorum: {}, tolerance: {}, lockClients: {}\n",
id, source, self.names, is_read_lock, opts, quorum, tolerance, locker_len
);
// Recalculate tolerance after potential quorum adjustment
// Use saturating_sub to prevent underflow
@@ -376,8 +379,8 @@ mod tests {
use super::*;
use crate::local_locker::LocalLocker;
use async_trait::async_trait;
use common::error::{Error, Result};
use std::collections::HashMap;
use std::io::{Error, Result};
use std::sync::{Arc, Mutex};
// Mock locker for testing
@@ -436,10 +439,10 @@ mod tests {
async fn lock(&mut self, args: &LockArgs) -> Result<bool> {
let mut state = self.state.lock().unwrap();
if state.should_fail {
return Err(Error::from_string("Mock lock failure"));
return Err(Error::other("Mock lock failure"));
}
if !state.is_online {
return Err(Error::from_string("Mock locker offline"));
return Err(Error::other("Mock locker offline"));
}
// Check if already locked
@@ -454,7 +457,7 @@ mod tests {
async fn unlock(&mut self, args: &LockArgs) -> Result<bool> {
let mut state = self.state.lock().unwrap();
if state.should_fail {
return Err(Error::from_string("Mock unlock failure"));
return Err(Error::other("Mock unlock failure"));
}
Ok(state.locks.remove(&args.uid).is_some())
@@ -463,10 +466,10 @@ mod tests {
async fn rlock(&mut self, args: &LockArgs) -> Result<bool> {
let mut state = self.state.lock().unwrap();
if state.should_fail {
return Err(Error::from_string("Mock rlock failure"));
return Err(Error::other("Mock rlock failure"));
}
if !state.is_online {
return Err(Error::from_string("Mock locker offline"));
return Err(Error::other("Mock locker offline"));
}
// Check if write lock exists
@@ -481,7 +484,7 @@ mod tests {
async fn runlock(&mut self, args: &LockArgs) -> Result<bool> {
let mut state = self.state.lock().unwrap();
if state.should_fail {
return Err(Error::from_string("Mock runlock failure"));
return Err(Error::other("Mock runlock failure"));
}
Ok(state.read_locks.remove(&args.uid).is_some())
@@ -490,7 +493,7 @@ mod tests {
async fn refresh(&mut self, _args: &LockArgs) -> Result<bool> {
let state = self.state.lock().unwrap();
if state.should_fail {
return Err(Error::from_string("Mock refresh failure"));
return Err(Error::other("Mock refresh failure"));
}
Ok(true)
}
@@ -880,8 +883,8 @@ mod tests {
// Case 1: Even number of lockers
let locks = vec!["uid1".to_string(), "uid2".to_string(), "uid3".to_string(), "uid4".to_string()];
let tolerance = 2; // locks.len() / 2 = 4 / 2 = 2
// locks.len() - tolerance = 4 - 2 = 2, which equals tolerance
// So the special case applies: un_locks_failed >= tolerance
// locks.len() - tolerance = 4 - 2 = 2, which equals tolerance
// So the special case applies: un_locks_failed >= tolerance
// All 4 failed unlocks
assert!(check_failed_unlocks(&locks, tolerance)); // 4 >= 2 = true
@@ -897,8 +900,8 @@ mod tests {
// Case 2: Odd number of lockers
let locks = vec!["uid1".to_string(), "uid2".to_string(), "uid3".to_string()];
let tolerance = 1; // locks.len() / 2 = 3 / 2 = 1
// locks.len() - tolerance = 3 - 1 = 2, which does NOT equal tolerance (1)
// So the normal case applies: un_locks_failed > tolerance
// locks.len() - tolerance = 3 - 1 = 2, which does NOT equal tolerance (1)
// So the normal case applies: un_locks_failed > tolerance
// 3 failed unlocks
assert!(check_failed_unlocks(&locks, tolerance)); // 3 > 1 = true

View File

@@ -3,11 +3,11 @@
use std::sync::Arc;
use async_trait::async_trait;
use common::error::Result;
use lazy_static::lazy_static;
use local_locker::LocalLocker;
use lock_args::LockArgs;
use remote_client::RemoteClient;
use std::io::Result;
use tokio::sync::RwLock;
pub mod drwmutex;

View File

@@ -1,11 +1,11 @@
use async_trait::async_trait;
use common::error::{Error, Result};
use std::io::{Error, Result};
use std::{
collections::HashMap,
time::{Duration, Instant},
};
use crate::{lock_args::LockArgs, Locker};
use crate::{Locker, lock_args::LockArgs};
const MAX_DELETE_LIST: usize = 1000;
@@ -116,7 +116,7 @@ impl LocalLocker {
impl Locker for LocalLocker {
async fn lock(&mut self, args: &LockArgs) -> Result<bool> {
if args.resources.len() > MAX_DELETE_LIST {
return Err(Error::from_string(format!(
return Err(Error::other(format!(
"internal error: LocalLocker.lock called with more than {} resources",
MAX_DELETE_LIST
)));
@@ -152,7 +152,7 @@ impl Locker for LocalLocker {
async fn unlock(&mut self, args: &LockArgs) -> Result<bool> {
if args.resources.len() > MAX_DELETE_LIST {
return Err(Error::from_string(format!(
return Err(Error::other(format!(
"internal error: LocalLocker.unlock called with more than {} resources",
MAX_DELETE_LIST
)));
@@ -197,7 +197,7 @@ impl Locker for LocalLocker {
async fn rlock(&mut self, args: &LockArgs) -> Result<bool> {
if args.resources.len() != 1 {
return Err(Error::from_string("internal error: localLocker.RLock called with more than one resource"));
return Err(Error::other("internal error: localLocker.RLock called with more than one resource"));
}
let resource = &args.resources[0];
@@ -241,7 +241,7 @@ impl Locker for LocalLocker {
async fn runlock(&mut self, args: &LockArgs) -> Result<bool> {
if args.resources.len() != 1 {
return Err(Error::from_string("internal error: localLocker.RLock called with more than one resource"));
return Err(Error::other("internal error: localLocker.RLock called with more than one resource"));
}
let mut reply = false;
@@ -249,7 +249,7 @@ impl Locker for LocalLocker {
match self.lock_map.get_mut(resource) {
Some(lris) => {
if is_write_lock(lris) {
return Err(Error::from_string(format!("runlock attempted on a write locked entity: {}", resource)));
return Err(Error::other(format!("runlock attempted on a write locked entity: {}", resource)));
} else {
lris.retain(|lri| {
if lri.uid == args.uid && (args.owner.is_empty() || lri.owner == args.owner) {
@@ -389,8 +389,8 @@ fn format_uuid(s: &mut String, idx: &usize) {
#[cfg(test)]
mod test {
use super::LocalLocker;
use crate::{lock_args::LockArgs, Locker};
use common::error::Result;
use crate::{Locker, lock_args::LockArgs};
use std::io::Result;
use tokio;
#[tokio::test]

View File

@@ -125,7 +125,7 @@ impl LRWMutex {
mod test {
use std::{sync::Arc, time::Duration};
use common::error::Result;
use std::io::Result;
use tokio::time::sleep;
use crate::lrwmutex::LRWMutex;

View File

@@ -5,11 +5,11 @@ use tokio::sync::RwLock;
use uuid::Uuid;
use crate::{
LockApi,
drwmutex::{DRWMutex, Options},
lrwmutex::LRWMutex,
LockApi,
};
use common::error::Result;
use std::io::Result;
pub type RWLockerImpl = Box<dyn RWLocker + Send + Sync>;
@@ -258,12 +258,12 @@ impl RWLocker for LocalLockInstance {
mod test {
use std::{sync::Arc, time::Duration};
use common::error::Result;
use std::io::Result;
use tokio::sync::RwLock;
use crate::{
drwmutex::Options,
namespace_lock::{new_nslock, NsLockMap},
namespace_lock::{NsLockMap, new_nslock},
};
#[tokio::test]

View File

@@ -1,10 +1,10 @@
use async_trait::async_trait;
use common::error::{Error, Result};
use protos::{node_service_time_out_client, proto_gen::node_service::GenerallyLockRequest};
use std::io::{Error, Result};
use tonic::Request;
use tracing::info;
use crate::{lock_args::LockArgs, Locker};
use crate::{Locker, lock_args::LockArgs};
#[derive(Debug, Clone)]
pub struct RemoteClient {
@@ -25,13 +25,13 @@ impl Locker for RemoteClient {
let args = serde_json::to_string(args)?;
let mut client = node_service_time_out_client(&self.addr)
.await
.map_err(|err| Error::from_string(format!("can not get client, err: {}", err)))?;
.map_err(|err| Error::other(format!("can not get client, err: {}", err)))?;
let request = Request::new(GenerallyLockRequest { args });
let response = client.lock(request).await?.into_inner();
let response = client.lock(request).await.map_err(Error::other)?.into_inner();
if let Some(error_info) = response.error_info {
return Err(Error::from_string(error_info));
return Err(Error::other(error_info));
}
Ok(response.success)
@@ -42,13 +42,13 @@ impl Locker for RemoteClient {
let args = serde_json::to_string(args)?;
let mut client = node_service_time_out_client(&self.addr)
.await
.map_err(|err| Error::from_string(format!("can not get client, err: {}", err)))?;
.map_err(|err| Error::other(format!("can not get client, err: {}", err)))?;
let request = Request::new(GenerallyLockRequest { args });
let response = client.un_lock(request).await?.into_inner();
let response = client.un_lock(request).await.map_err(Error::other)?.into_inner();
if let Some(error_info) = response.error_info {
return Err(Error::from_string(error_info));
return Err(Error::other(error_info));
}
Ok(response.success)
@@ -59,13 +59,13 @@ impl Locker for RemoteClient {
let args = serde_json::to_string(args)?;
let mut client = node_service_time_out_client(&self.addr)
.await
.map_err(|err| Error::from_string(format!("can not get client, err: {}", err)))?;
.map_err(|err| Error::other(format!("can not get client, err: {}", err)))?;
let request = Request::new(GenerallyLockRequest { args });
let response = client.r_lock(request).await?.into_inner();
let response = client.r_lock(request).await.map_err(Error::other)?.into_inner();
if let Some(error_info) = response.error_info {
return Err(Error::from_string(error_info));
return Err(Error::other(error_info));
}
Ok(response.success)
@@ -76,13 +76,13 @@ impl Locker for RemoteClient {
let args = serde_json::to_string(args)?;
let mut client = node_service_time_out_client(&self.addr)
.await
.map_err(|err| Error::from_string(format!("can not get client, err: {}", err)))?;
.map_err(|err| Error::other(format!("can not get client, err: {}", err)))?;
let request = Request::new(GenerallyLockRequest { args });
let response = client.r_un_lock(request).await?.into_inner();
let response = client.r_un_lock(request).await.map_err(Error::other)?.into_inner();
if let Some(error_info) = response.error_info {
return Err(Error::from_string(error_info));
return Err(Error::other(error_info));
}
Ok(response.success)
@@ -93,13 +93,13 @@ impl Locker for RemoteClient {
let args = serde_json::to_string(args)?;
let mut client = node_service_time_out_client(&self.addr)
.await
.map_err(|err| Error::from_string(format!("can not get client, err: {}", err)))?;
.map_err(|err| Error::other(format!("can not get client, err: {}", err)))?;
let request = Request::new(GenerallyLockRequest { args });
let response = client.refresh(request).await?.into_inner();
let response = client.refresh(request).await.map_err(Error::other)?.into_inner();
if let Some(error_info) = response.error_info {
return Err(Error::from_string(error_info));
return Err(Error::other(error_info));
}
Ok(response.success)
@@ -110,13 +110,13 @@ impl Locker for RemoteClient {
let args = serde_json::to_string(args)?;
let mut client = node_service_time_out_client(&self.addr)
.await
.map_err(|err| Error::from_string(format!("can not get client, err: {}", err)))?;
.map_err(|err| Error::other(format!("can not get client, err: {}", err)))?;
let request = Request::new(GenerallyLockRequest { args });
let response = client.force_un_lock(request).await?.into_inner();
let response = client.force_un_lock(request).await.map_err(Error::other)?.into_inner();
if let Some(error_info) = response.error_info {
return Err(Error::from_string(error_info));
return Err(Error::other(error_info));
}
Ok(response.success)

View File

@@ -29,7 +29,7 @@ pub mod models {
#[inline]
unsafe fn follow(buf: &'a [u8], loc: usize) -> Self::Inner {
Self {
_tab: flatbuffers::Table::new(buf, loc),
_tab: unsafe { flatbuffers::Table::new(buf, loc) },
}
}
}

View File

@@ -11,15 +11,15 @@ pub struct Error {
pub struct PingRequest {
#[prost(uint64, tag = "1")]
pub version: u64,
#[prost(bytes = "vec", tag = "2")]
pub body: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub body: ::prost::bytes::Bytes,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct PingResponse {
#[prost(uint64, tag = "1")]
pub version: u64,
#[prost(bytes = "vec", tag = "2")]
pub body: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub body: ::prost::bytes::Bytes,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct HealBucketRequest {
@@ -105,8 +105,8 @@ pub struct ReadAllRequest {
pub struct ReadAllResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub data: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub data: ::prost::bytes::Bytes,
#[prost(message, optional, tag = "3")]
pub error: ::core::option::Option<Error>,
}
@@ -119,8 +119,8 @@ pub struct WriteAllRequest {
pub volume: ::prost::alloc::string::String,
#[prost(string, tag = "3")]
pub path: ::prost::alloc::string::String,
#[prost(bytes = "vec", tag = "4")]
pub data: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "4")]
pub data: ::prost::bytes::Bytes,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct WriteAllResponse {
@@ -191,7 +191,7 @@ pub struct CheckPartsResponse {
pub error: ::core::option::Option<Error>,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct RenamePartRequst {
pub struct RenamePartRequest {
#[prost(string, tag = "1")]
pub disk: ::prost::alloc::string::String,
#[prost(string, tag = "2")]
@@ -202,8 +202,8 @@ pub struct RenamePartRequst {
pub dst_volume: ::prost::alloc::string::String,
#[prost(string, tag = "5")]
pub dst_path: ::prost::alloc::string::String,
#[prost(bytes = "vec", tag = "6")]
pub meta: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "6")]
pub meta: ::prost::bytes::Bytes,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct RenamePartResponse {
@@ -213,7 +213,7 @@ pub struct RenamePartResponse {
pub error: ::core::option::Option<Error>,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct RenameFileRequst {
pub struct RenameFileRequest {
#[prost(string, tag = "1")]
pub disk: ::prost::alloc::string::String,
#[prost(string, tag = "2")]
@@ -243,8 +243,8 @@ pub struct WriteRequest {
pub path: ::prost::alloc::string::String,
#[prost(bool, tag = "4")]
pub is_append: bool,
#[prost(bytes = "vec", tag = "5")]
pub data: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "5")]
pub data: ::prost::bytes::Bytes,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct WriteResponse {
@@ -271,8 +271,8 @@ pub struct ReadAtRequest {
pub struct ReadAtResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub data: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub data: ::prost::bytes::Bytes,
#[prost(int64, tag = "3")]
pub read_size: i64,
#[prost(message, optional, tag = "4")]
@@ -300,8 +300,8 @@ pub struct WalkDirRequest {
/// indicate which one in the disks
#[prost(string, tag = "1")]
pub disk: ::prost::alloc::string::String,
#[prost(bytes = "vec", tag = "2")]
pub walk_dir_options: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub walk_dir_options: ::prost::bytes::Bytes,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct WalkDirResponse {
@@ -633,8 +633,8 @@ pub struct LocalStorageInfoRequest {
pub struct LocalStorageInfoResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub storage_info: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub storage_info: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -647,8 +647,8 @@ pub struct ServerInfoRequest {
pub struct ServerInfoResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub server_properties: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub server_properties: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -658,8 +658,8 @@ pub struct GetCpusRequest {}
pub struct GetCpusResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub cpus: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub cpus: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -669,8 +669,8 @@ pub struct GetNetInfoRequest {}
pub struct GetNetInfoResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub net_info: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub net_info: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -680,8 +680,8 @@ pub struct GetPartitionsRequest {}
pub struct GetPartitionsResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub partitions: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub partitions: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -691,8 +691,8 @@ pub struct GetOsInfoRequest {}
pub struct GetOsInfoResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub os_info: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub os_info: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -702,8 +702,8 @@ pub struct GetSeLinuxInfoRequest {}
pub struct GetSeLinuxInfoResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub sys_services: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub sys_services: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -713,8 +713,8 @@ pub struct GetSysConfigRequest {}
pub struct GetSysConfigResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub sys_config: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub sys_config: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -724,8 +724,8 @@ pub struct GetSysErrorsRequest {}
pub struct GetSysErrorsResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub sys_errors: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub sys_errors: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -735,24 +735,24 @@ pub struct GetMemInfoRequest {}
pub struct GetMemInfoResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub mem_info: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub mem_info: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct GetMetricsRequest {
#[prost(bytes = "vec", tag = "1")]
pub metric_type: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "vec", tag = "2")]
pub opts: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "1")]
pub metric_type: ::prost::bytes::Bytes,
#[prost(bytes = "bytes", tag = "2")]
pub opts: ::prost::bytes::Bytes,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct GetMetricsResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub realtime_metrics: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub realtime_metrics: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -762,8 +762,8 @@ pub struct GetProcInfoRequest {}
pub struct GetProcInfoResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub proc_info: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub proc_info: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -786,7 +786,7 @@ pub struct DownloadProfileDataResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(map = "string, bytes", tag = "2")]
pub data: ::std::collections::HashMap<::prost::alloc::string::String, ::prost::alloc::vec::Vec<u8>>,
pub data: ::std::collections::HashMap<::prost::alloc::string::String, ::prost::bytes::Bytes>,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -799,8 +799,8 @@ pub struct GetBucketStatsDataRequest {
pub struct GetBucketStatsDataResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub bucket_stats: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub bucket_stats: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -810,8 +810,8 @@ pub struct GetSrMetricsDataRequest {}
pub struct GetSrMetricsDataResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub sr_metrics_summary: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub sr_metrics_summary: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -821,8 +821,8 @@ pub struct GetAllBucketStatsRequest {}
pub struct GetAllBucketStatsResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub bucket_stats_map: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub bucket_stats_map: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -979,36 +979,36 @@ pub struct BackgroundHealStatusRequest {}
pub struct BackgroundHealStatusResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub bg_heal_state: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub bg_heal_state: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct GetMetacacheListingRequest {
#[prost(bytes = "vec", tag = "1")]
pub opts: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "1")]
pub opts: ::prost::bytes::Bytes,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct GetMetacacheListingResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub metacache: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub metacache: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct UpdateMetacacheListingRequest {
#[prost(bytes = "vec", tag = "1")]
pub metacache: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "1")]
pub metacache: ::prost::bytes::Bytes,
}
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct UpdateMetacacheListingResponse {
#[prost(bool, tag = "1")]
pub success: bool,
#[prost(bytes = "vec", tag = "2")]
pub metacache: ::prost::alloc::vec::Vec<u8>,
#[prost(bytes = "bytes", tag = "2")]
pub metacache: ::prost::bytes::Bytes,
#[prost(string, optional, tag = "3")]
pub error_info: ::core::option::Option<::prost::alloc::string::String>,
}
@@ -1091,9 +1091,9 @@ pub mod node_service_client {
F: tonic::service::Interceptor,
T::ResponseBody: Default,
T: tonic::codegen::Service<
http::Request<tonic::body::Body>,
Response = http::Response<<T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody>,
>,
http::Request<tonic::body::Body>,
Response = http::Response<<T as tonic::client::GrpcService<tonic::body::Body>>::ResponseBody>,
>,
<T as tonic::codegen::Service<http::Request<tonic::body::Body>>>::Error:
Into<StdError> + std::marker::Send + std::marker::Sync,
{
@@ -1298,7 +1298,7 @@ pub mod node_service_client {
}
pub async fn rename_part(
&mut self,
request: impl tonic::IntoRequest<super::RenamePartRequst>,
request: impl tonic::IntoRequest<super::RenamePartRequest>,
) -> std::result::Result<tonic::Response<super::RenamePartResponse>, tonic::Status> {
self.inner
.ready()
@@ -1313,7 +1313,7 @@ pub mod node_service_client {
}
pub async fn rename_file(
&mut self,
request: impl tonic::IntoRequest<super::RenameFileRequst>,
request: impl tonic::IntoRequest<super::RenameFileRequest>,
) -> std::result::Result<tonic::Response<super::RenameFileResponse>, tonic::Status> {
self.inner
.ready()
@@ -2330,11 +2330,11 @@ pub mod node_service_server {
) -> std::result::Result<tonic::Response<super::CheckPartsResponse>, tonic::Status>;
async fn rename_part(
&self,
request: tonic::Request<super::RenamePartRequst>,
request: tonic::Request<super::RenamePartRequest>,
) -> std::result::Result<tonic::Response<super::RenamePartResponse>, tonic::Status>;
async fn rename_file(
&self,
request: tonic::Request<super::RenameFileRequst>,
request: tonic::Request<super::RenameFileRequest>,
) -> std::result::Result<tonic::Response<super::RenameFileResponse>, tonic::Status>;
async fn write(
&self,
@@ -2989,10 +2989,10 @@ pub mod node_service_server {
"/node_service.NodeService/RenamePart" => {
#[allow(non_camel_case_types)]
struct RenamePartSvc<T: NodeService>(pub Arc<T>);
impl<T: NodeService> tonic::server::UnaryService<super::RenamePartRequst> for RenamePartSvc<T> {
impl<T: NodeService> tonic::server::UnaryService<super::RenamePartRequest> for RenamePartSvc<T> {
type Response = super::RenamePartResponse;
type Future = BoxFuture<tonic::Response<Self::Response>, tonic::Status>;
fn call(&mut self, request: tonic::Request<super::RenamePartRequst>) -> Self::Future {
fn call(&mut self, request: tonic::Request<super::RenamePartRequest>) -> Self::Future {
let inner = Arc::clone(&self.0);
let fut = async move { <T as NodeService>::rename_part(&inner, request).await };
Box::pin(fut)
@@ -3017,10 +3017,10 @@ pub mod node_service_server {
"/node_service.NodeService/RenameFile" => {
#[allow(non_camel_case_types)]
struct RenameFileSvc<T: NodeService>(pub Arc<T>);
impl<T: NodeService> tonic::server::UnaryService<super::RenameFileRequst> for RenameFileSvc<T> {
impl<T: NodeService> tonic::server::UnaryService<super::RenameFileRequest> for RenameFileSvc<T> {
type Response = super::RenameFileResponse;
type Future = BoxFuture<tonic::Response<Self::Response>, tonic::Status>;
fn call(&mut self, request: tonic::Request<super::RenameFileRequst>) -> Self::Future {
fn call(&mut self, request: tonic::Request<super::RenameFileRequest>) -> Self::Future {
let inner = Arc::clone(&self.0);
let fut = async move { <T as NodeService>::rename_file(&inner, request).await };
Box::pin(fut)

View File

@@ -7,10 +7,10 @@ use common::globals::GLOBAL_Conn_Map;
pub use generated::*;
use proto_gen::node_service::node_service_client::NodeServiceClient;
use tonic::{
Request, Status,
metadata::MetadataValue,
service::interceptor::InterceptedService,
transport::{Channel, Endpoint},
Request, Status,
};
// Default 100 MB

View File

@@ -43,6 +43,7 @@ fn main() -> Result<(), AnyError> {
// .file_descriptor_set_path(descriptor_set_path)
.protoc_arg("--experimental_allow_proto3_optional")
.compile_well_known_types(true)
.bytes(["."])
.emit_rerun_if_changed(false)
.compile_protos(proto_files, &[proto_dir.clone()])
.map_err(|e| format!("Failed to generate protobuf file: {e}."))?;

View File

@@ -129,7 +129,7 @@ message CheckPartsResponse {
optional Error error = 3;
}
message RenamePartRequst {
message RenamePartRequest {
string disk = 1;
string src_volume = 2;
string src_path = 3;
@@ -143,7 +143,7 @@ message RenamePartResponse {
optional Error error = 2;
}
message RenameFileRequst {
message RenameFileRequest {
string disk = 1;
string src_volume = 2;
string src_path = 3;
@@ -175,7 +175,7 @@ message WriteResponse {
// string path = 3;
// bytes data = 4;
// }
//
//
// message AppendResponse {
// bool success = 1;
// optional Error error = 2;
@@ -755,8 +755,8 @@ service NodeService {
rpc Delete(DeleteRequest) returns (DeleteResponse) {};
rpc VerifyFile(VerifyFileRequest) returns (VerifyFileResponse) {};
rpc CheckParts(CheckPartsRequest) returns (CheckPartsResponse) {};
rpc RenamePart(RenamePartRequst) returns (RenamePartResponse) {};
rpc RenameFile(RenameFileRequst) returns (RenameFileResponse) {};
rpc RenamePart(RenamePartRequest) returns (RenamePartResponse) {};
rpc RenameFile(RenameFileRequest) returns (RenameFileResponse) {};
rpc Write(WriteRequest) returns (WriteResponse) {};
rpc WriteStream(stream WriteRequest) returns (stream WriteResponse) {};
// rpc Append(AppendRequest) returns (AppendResponse) {};

View File

@@ -200,7 +200,7 @@ mod tests {
// Test port related constants
assert_eq!(DEFAULT_PORT, 9000);
assert_eq!(DEFAULT_CONSOLE_PORT, 9002);
assert_eq!(DEFAULT_CONSOLE_PORT, 9001);
assert_ne!(DEFAULT_PORT, DEFAULT_CONSOLE_PORT, "Main port and console port should be different");
}
@@ -215,7 +215,7 @@ mod tests {
"Address should contain the default port"
);
assert_eq!(DEFAULT_CONSOLE_ADDRESS, ":9002");
assert_eq!(DEFAULT_CONSOLE_ADDRESS, ":9001");
assert!(DEFAULT_CONSOLE_ADDRESS.starts_with(':'), "Console address should start with colon");
assert!(
DEFAULT_CONSOLE_ADDRESS.contains(&DEFAULT_CONSOLE_PORT.to_string()),

View File

@@ -0,0 +1,32 @@
[package]
name = "rustfs-filemeta"
edition.workspace = true
license.workspace = true
repository.workspace = true
rust-version.workspace = true
version.workspace = true
[dependencies]
crc32fast = "1.4.2"
rmp.workspace = true
rmp-serde.workspace = true
serde.workspace = true
time.workspace = true
uuid = { workspace = true, features = ["v4", "fast-rng", "serde"] }
tokio = { workspace = true, features = ["io-util", "macros", "sync"] }
xxhash-rust = { version = "0.8.15", features = ["xxh64"] }
bytes.workspace = true
rustfs-utils = {workspace = true, features= ["hash"]}
byteorder = "1.5.0"
tracing.workspace = true
thiserror.workspace = true
[dev-dependencies]
criterion = { version = "0.5", features = ["html_reports"] }
[[bench]]
name = "xl_meta_bench"
harness = false
[lints]
workspace = true

crates/filemeta/README.md (new file, 238 lines)
View File

@@ -0,0 +1,238 @@
# RustFS FileMeta
A high-performance Rust implementation of xl-storage-format-v2, providing complete compatibility with the S3-compatible xl.meta metadata format while offering enhanced performance and safety.
## Overview
This crate implements the XL (Erasure Coded) metadata format used for distributed object storage. It provides:
- **Full S3 Compatibility**: 100% compatible with xl.meta file format
- **High Performance**: Optimized for speed with sub-microsecond parsing times
- **Memory Safety**: Written in safe Rust with comprehensive error handling
- **Comprehensive Testing**: Extensive test suite with real metadata validation
- **Cross-Platform**: Supports multiple CPU architectures (x86_64, aarch64)
## Features
### Core Functionality
- ✅ XL v2 file format parsing and serialization
- ✅ MessagePack-based metadata encoding/decoding
- ✅ Version management with modification time sorting
- ✅ Erasure coding information storage
- ✅ Inline data support for small objects
- ✅ CRC32 integrity verification using xxHash64
- ✅ Delete marker handling
- ✅ Legacy version support
### Advanced Features
- ✅ Signature calculation for version integrity
- ✅ Metadata validation and compatibility checking
- ✅ Version statistics and analytics
- ✅ Async I/O support with tokio
- ✅ Comprehensive error handling
- ✅ Performance benchmarking
## Performance
Based on our benchmarks:
| Operation | Time | Description |
|-----------|------|-------------|
| Parse Real xl.meta | ~255 ns | Parse authentic xl metadata |
| Parse Complex xl.meta | ~1.1 µs | Parse multi-version metadata |
| Serialize Real xl.meta | ~659 ns | Serialize to xl format |
| Round-trip Real xl.meta | ~1.3 µs | Parse + serialize cycle |
| Version Statistics | ~5.2 ns | Calculate version stats |
| Integrity Validation | ~7.8 ns | Validate metadata integrity |
## Usage
### Basic Usage
```rust
use rustfs_filemeta::file_meta::FileMeta;
// Load metadata from bytes
let metadata = FileMeta::load(&xl_meta_bytes)?;
// Access version information
for version in &metadata.versions {
println!("Version ID: {:?}", version.header.version_id);
println!("Mod Time: {:?}", version.header.mod_time);
}
// Serialize back to bytes
let serialized = metadata.marshal_msg()?;
```
### Advanced Usage
```rust
use rustfs_filemeta::file_meta::FileMeta;
// Load with validation
let mut metadata = FileMeta::load(&xl_meta_bytes)?;
// Validate integrity
metadata.validate_integrity()?;
// Check xl format compatibility
if metadata.is_compatible_with_meta() {
println!("Compatible with xl format");
}
// Get version statistics
let stats = metadata.get_version_stats();
println!("Total versions: {}", stats.total_versions);
println!("Object versions: {}", stats.object_versions);
println!("Delete markers: {}", stats.delete_markers);
```
### Working with FileInfo
```rust
use rustfs_filemeta::fileinfo::FileInfo;
use rustfs_filemeta::file_meta::FileMetaVersion;
// Convert FileInfo into a metadata version (cloned because the conversion takes
// ownership; assumes FileInfo implements Clone)
let file_info = FileInfo::new("bucket", "object.txt");
let meta_version = FileMetaVersion::from(file_info.clone());

// Add the version to the metadata
metadata.add_version(file_info)?;
```
## Data Structures
### FileMeta
The main metadata container that holds all versions and inline data:
```rust
pub struct FileMeta {
pub versions: Vec<FileMetaShallowVersion>,
pub data: InlineData,
pub meta_ver: u8,
}
```
### FileMetaVersion
Represents a single object version:
```rust
pub struct FileMetaVersion {
pub version_type: VersionType,
pub object: Option<MetaObject>,
pub delete_marker: Option<MetaDeleteMarker>,
pub write_version: u64,
}
```
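
Since a non-legacy version is expected to carry exactly one of the two optional payloads, a version can be classified directly from those fields. A minimal illustrative sketch over the struct above (the `describe` helper is not part of the crate):

```rust
use rustfs_filemeta::file_meta::FileMetaVersion;

// Classify a version by which optional payload is present (see the fields above).
fn describe(v: &FileMetaVersion) -> &'static str {
    match (&v.object, &v.delete_marker) {
        (Some(_), None) => "object version",
        (None, Some(_)) => "delete marker",
        _ => "legacy or unknown version",
    }
}
```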
### MetaObject
Contains object-specific metadata including erasure coding information:
```rust
pub struct MetaObject {
pub version_id: Option<Uuid>,
pub data_dir: Option<Uuid>,
pub erasure_algorithm: ErasureAlgo,
pub erasure_m: usize,
pub erasure_n: usize,
// ... additional fields
}
```
## File Format Compatibility
This implementation is fully compatible with xl-storage-format-v2:
- **Header Format**: XL2 v1 format with proper version checking
- **Serialization**: MessagePack encoding identical to standard format
- **Checksums**: xxHash64-based CRC validation
- **Version Types**: Support for Object, Delete, and Legacy versions
- **Inline Data**: Compatible inline data storage for small objects
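
As a concrete compatibility check, the operations shown earlier in this README can be chained into a round trip. A minimal sketch, assuming the `error` module's `Result` alias (added later in this change) is publicly reachable at this path:

```rust
use rustfs_filemeta::error::Result;
use rustfs_filemeta::file_meta::FileMeta;

// Parse, validate, re-serialize, and re-parse an xl.meta buffer.
fn roundtrip_check(bytes: &[u8]) -> Result<bool> {
    let meta = FileMeta::load(bytes)?;   // XL2 v1 header check + MessagePack decode
    meta.validate_integrity()?;          // xxHash64-based CRC validation
    let rewritten = meta.marshal_msg()?; // serialize back to the xl format
    FileMeta::load(&rewritten)?;         // the rewritten buffer must parse again
    Ok(meta.is_compatible_with_meta())
}
```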
## Testing
The crate includes comprehensive tests with real xl metadata:
```bash
# Run all tests
cargo test
# Run benchmarks
cargo bench
# Run with coverage (no "coverage" cargo feature is defined; use an external tool such as cargo-llvm-cov)
cargo llvm-cov
```
### Test Coverage
- ✅ Real xl.meta file compatibility
- ✅ Complex multi-version scenarios
- ✅ Error handling and recovery
- ✅ Inline data processing
- ✅ Signature calculation
- ✅ Round-trip serialization
- ✅ Performance benchmarks
- ✅ Edge cases and boundary conditions
## Architecture
The crate follows a modular design:
```
src/
├── file_meta.rs # Core metadata structures and logic
├── file_meta_inline.rs # Inline data handling
├── fileinfo.rs # File information structures
├── test_data.rs # Test data generation
└── lib.rs # Public API exports
```
## Error Handling
Comprehensive error handling with detailed error messages:
```rust
use rustfs_filemeta::error::Error;
match FileMeta::load(&invalid_data) {
    Ok(metadata) => { /* process metadata */ },
    Err(Error::FileCorrupt) => {
        eprintln!("Corrupted or truncated metadata");
    },
    Err(Error::Io(e)) => {
        eprintln!("I/O error: {}", e);
    },
    Err(e) => {
        eprintln!("Other error: {}", e);
    }
}
```
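
Errors that do not match a predefined variant can be wrapped with the crate's `Error::other` helper (defined in the error module added in this change), which stores them as an I/O error. A minimal sketch with a hypothetical `require_nonempty` function:

```rust
use rustfs_filemeta::error::{Error, Result};

fn require_nonempty(data: &[u8]) -> Result<()> {
    if data.is_empty() {
        // Wrap an ad-hoc message; Error::other boxes it into the Io variant
        return Err(Error::other("empty xl.meta buffer"));
    }
    Ok(())
}
```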
## Dependencies
- `rmp` - MessagePack serialization
- `uuid` - UUID handling
- `time` - Date/time operations
- `xxhash-rust` - Fast hashing
- `tokio` - Async runtime (optional)
- `criterion` - Benchmarking (dev dependency)
## Contributing
1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Ensure all tests pass
5. Submit a pull request
## License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
## Acknowledgments
- Original xl-storage-format-v2 implementation contributors
- Rust community for excellent crates and tooling
- Contributors and testers who helped improve this implementation

View File

@@ -0,0 +1,95 @@
use criterion::{Criterion, black_box, criterion_group, criterion_main};
use rustfs_filemeta::{FileMeta, test_data::*};
fn bench_create_real_xlmeta(c: &mut Criterion) {
c.bench_function("create_real_xlmeta", |b| b.iter(|| black_box(create_real_xlmeta().unwrap())));
}
fn bench_create_complex_xlmeta(c: &mut Criterion) {
c.bench_function("create_complex_xlmeta", |b| b.iter(|| black_box(create_complex_xlmeta().unwrap())));
}
fn bench_parse_real_xlmeta(c: &mut Criterion) {
let data = create_real_xlmeta().unwrap();
c.bench_function("parse_real_xlmeta", |b| b.iter(|| black_box(FileMeta::load(&data).unwrap())));
}
fn bench_parse_complex_xlmeta(c: &mut Criterion) {
let data = create_complex_xlmeta().unwrap();
c.bench_function("parse_complex_xlmeta", |b| b.iter(|| black_box(FileMeta::load(&data).unwrap())));
}
fn bench_serialize_real_xlmeta(c: &mut Criterion) {
let data = create_real_xlmeta().unwrap();
let fm = FileMeta::load(&data).unwrap();
c.bench_function("serialize_real_xlmeta", |b| b.iter(|| black_box(fm.marshal_msg().unwrap())));
}
fn bench_serialize_complex_xlmeta(c: &mut Criterion) {
let data = create_complex_xlmeta().unwrap();
let fm = FileMeta::load(&data).unwrap();
c.bench_function("serialize_complex_xlmeta", |b| b.iter(|| black_box(fm.marshal_msg().unwrap())));
}
fn bench_round_trip_real_xlmeta(c: &mut Criterion) {
let original_data = create_real_xlmeta().unwrap();
c.bench_function("round_trip_real_xlmeta", |b| {
b.iter(|| {
let fm = FileMeta::load(&original_data).unwrap();
let serialized = fm.marshal_msg().unwrap();
black_box(FileMeta::load(&serialized).unwrap())
})
});
}
fn bench_round_trip_complex_xlmeta(c: &mut Criterion) {
let original_data = create_complex_xlmeta().unwrap();
c.bench_function("round_trip_complex_xlmeta", |b| {
b.iter(|| {
let fm = FileMeta::load(&original_data).unwrap();
let serialized = fm.marshal_msg().unwrap();
black_box(FileMeta::load(&serialized).unwrap())
})
});
}
fn bench_version_stats(c: &mut Criterion) {
let data = create_complex_xlmeta().unwrap();
let fm = FileMeta::load(&data).unwrap();
c.bench_function("version_stats", |b| b.iter(|| black_box(fm.get_version_stats())));
}
fn bench_validate_integrity(c: &mut Criterion) {
let data = create_real_xlmeta().unwrap();
let fm = FileMeta::load(&data).unwrap();
c.bench_function("validate_integrity", |b| {
b.iter(|| {
fm.validate_integrity().unwrap();
black_box(())
})
});
}
criterion_group!(
benches,
bench_create_real_xlmeta,
bench_create_complex_xlmeta,
bench_parse_real_xlmeta,
bench_parse_complex_xlmeta,
bench_serialize_real_xlmeta,
bench_serialize_complex_xlmeta,
bench_round_trip_real_xlmeta,
bench_round_trip_complex_xlmeta,
bench_version_stats,
bench_validate_integrity
);
criterion_main!(benches);

View File

@@ -0,0 +1,569 @@
pub type Result<T> = core::result::Result<T, Error>;
#[derive(thiserror::Error, Debug)]
pub enum Error {
#[error("File not found")]
FileNotFound,
#[error("File version not found")]
FileVersionNotFound,
#[error("Volume not found")]
VolumeNotFound,
#[error("File corrupt")]
FileCorrupt,
#[error("Done for now")]
DoneForNow,
#[error("Method not allowed")]
MethodNotAllowed,
#[error("Unexpected error")]
Unexpected,
#[error("I/O error: {0}")]
Io(std::io::Error),
#[error("rmp serde decode error: {0}")]
RmpSerdeDecode(String),
#[error("rmp serde encode error: {0}")]
RmpSerdeEncode(String),
#[error("Invalid UTF-8: {0}")]
FromUtf8(String),
#[error("rmp decode value read error: {0}")]
RmpDecodeValueRead(String),
#[error("rmp encode value write error: {0}")]
RmpEncodeValueWrite(String),
#[error("rmp decode num value read error: {0}")]
RmpDecodeNumValueRead(String),
#[error("rmp decode marker read error: {0}")]
RmpDecodeMarkerRead(String),
#[error("time component range error: {0}")]
TimeComponentRange(String),
#[error("uuid parse error: {0}")]
UuidParse(String),
}
impl Error {
pub fn other<E>(error: E) -> Error
where
E: Into<Box<dyn std::error::Error + Send + Sync>>,
{
std::io::Error::other(error).into()
}
}
impl PartialEq for Error {
fn eq(&self, other: &Self) -> bool {
match (self, other) {
(Error::FileCorrupt, Error::FileCorrupt) => true,
(Error::DoneForNow, Error::DoneForNow) => true,
(Error::MethodNotAllowed, Error::MethodNotAllowed) => true,
(Error::FileNotFound, Error::FileNotFound) => true,
(Error::FileVersionNotFound, Error::FileVersionNotFound) => true,
(Error::VolumeNotFound, Error::VolumeNotFound) => true,
(Error::Io(e1), Error::Io(e2)) => e1.kind() == e2.kind() && e1.to_string() == e2.to_string(),
(Error::RmpSerdeDecode(e1), Error::RmpSerdeDecode(e2)) => e1 == e2,
(Error::RmpSerdeEncode(e1), Error::RmpSerdeEncode(e2)) => e1 == e2,
(Error::RmpDecodeValueRead(e1), Error::RmpDecodeValueRead(e2)) => e1 == e2,
(Error::RmpEncodeValueWrite(e1), Error::RmpEncodeValueWrite(e2)) => e1 == e2,
(Error::RmpDecodeNumValueRead(e1), Error::RmpDecodeNumValueRead(e2)) => e1 == e2,
(Error::TimeComponentRange(e1), Error::TimeComponentRange(e2)) => e1 == e2,
(Error::UuidParse(e1), Error::UuidParse(e2)) => e1 == e2,
(Error::Unexpected, Error::Unexpected) => true,
(a, b) => a.to_string() == b.to_string(),
}
}
}
impl Clone for Error {
fn clone(&self) -> Self {
match self {
Error::FileNotFound => Error::FileNotFound,
Error::FileVersionNotFound => Error::FileVersionNotFound,
Error::FileCorrupt => Error::FileCorrupt,
Error::DoneForNow => Error::DoneForNow,
Error::MethodNotAllowed => Error::MethodNotAllowed,
Error::VolumeNotFound => Error::VolumeNotFound,
Error::Io(e) => Error::Io(std::io::Error::new(e.kind(), e.to_string())),
Error::RmpSerdeDecode(s) => Error::RmpSerdeDecode(s.clone()),
Error::RmpSerdeEncode(s) => Error::RmpSerdeEncode(s.clone()),
Error::FromUtf8(s) => Error::FromUtf8(s.clone()),
Error::RmpDecodeValueRead(s) => Error::RmpDecodeValueRead(s.clone()),
Error::RmpEncodeValueWrite(s) => Error::RmpEncodeValueWrite(s.clone()),
Error::RmpDecodeNumValueRead(s) => Error::RmpDecodeNumValueRead(s.clone()),
Error::RmpDecodeMarkerRead(s) => Error::RmpDecodeMarkerRead(s.clone()),
Error::TimeComponentRange(s) => Error::TimeComponentRange(s.clone()),
Error::UuidParse(s) => Error::UuidParse(s.clone()),
Error::Unexpected => Error::Unexpected,
}
}
}
impl From<std::io::Error> for Error {
fn from(e: std::io::Error) -> Self {
match e.kind() {
std::io::ErrorKind::UnexpectedEof => Error::Unexpected,
_ => Error::Io(e),
}
}
}
impl From<Error> for std::io::Error {
fn from(e: Error) -> Self {
match e {
Error::Unexpected => std::io::Error::new(std::io::ErrorKind::UnexpectedEof, "Unexpected EOF"),
Error::Io(e) => e,
_ => std::io::Error::other(e.to_string()),
}
}
}
impl From<rmp_serde::decode::Error> for Error {
fn from(e: rmp_serde::decode::Error) -> Self {
Error::RmpSerdeDecode(e.to_string())
}
}
impl From<rmp_serde::encode::Error> for Error {
fn from(e: rmp_serde::encode::Error) -> Self {
Error::RmpSerdeEncode(e.to_string())
}
}
impl From<std::string::FromUtf8Error> for Error {
fn from(e: std::string::FromUtf8Error) -> Self {
Error::FromUtf8(e.to_string())
}
}
impl From<rmp::decode::ValueReadError> for Error {
fn from(e: rmp::decode::ValueReadError) -> Self {
Error::RmpDecodeValueRead(e.to_string())
}
}
impl From<rmp::encode::ValueWriteError> for Error {
fn from(e: rmp::encode::ValueWriteError) -> Self {
Error::RmpEncodeValueWrite(e.to_string())
}
}
impl From<rmp::decode::NumValueReadError> for Error {
fn from(e: rmp::decode::NumValueReadError) -> Self {
Error::RmpDecodeNumValueRead(e.to_string())
}
}
impl From<time::error::ComponentRange> for Error {
fn from(e: time::error::ComponentRange) -> Self {
Error::TimeComponentRange(e.to_string())
}
}
impl From<uuid::Error> for Error {
fn from(e: uuid::Error) -> Self {
Error::UuidParse(e.to_string())
}
}
impl From<rmp::decode::MarkerReadError> for Error {
fn from(e: rmp::decode::MarkerReadError) -> Self {
let serr = format!("{:?}", e);
Error::RmpDecodeMarkerRead(serr)
}
}
pub fn is_io_eof(e: &Error) -> bool {
match e {
Error::Io(e) => e.kind() == std::io::ErrorKind::UnexpectedEof,
_ => false,
}
}
#[cfg(test)]
mod tests {
use super::*;
use std::io::{Error as IoError, ErrorKind};
#[test]
fn test_filemeta_error_from_io_error() {
let io_error = IoError::new(ErrorKind::PermissionDenied, "permission denied");
let filemeta_error: Error = io_error.into();
match filemeta_error {
Error::Io(inner_io) => {
assert_eq!(inner_io.kind(), ErrorKind::PermissionDenied);
assert!(inner_io.to_string().contains("permission denied"));
}
_ => panic!("Expected Io variant"),
}
}
#[test]
fn test_filemeta_error_other_function() {
let custom_error = "Custom filemeta error";
let filemeta_error = Error::other(custom_error);
match filemeta_error {
Error::Io(io_error) => {
assert!(io_error.to_string().contains(custom_error));
assert_eq!(io_error.kind(), ErrorKind::Other);
}
_ => panic!("Expected Io variant"),
}
}
#[test]
fn test_filemeta_error_conversions() {
// Test various error conversions
let serde_decode_err =
rmp_serde::decode::Error::InvalidMarkerRead(std::io::Error::new(ErrorKind::InvalidData, "invalid"));
let filemeta_error: Error = serde_decode_err.into();
assert!(matches!(filemeta_error, Error::RmpSerdeDecode(_)));
// Test with string-based error that we can actually create
let encode_error_string = "test encode error";
let filemeta_error = Error::RmpSerdeEncode(encode_error_string.to_string());
assert!(matches!(filemeta_error, Error::RmpSerdeEncode(_)));
let utf8_err = std::string::String::from_utf8(vec![0xFF]).unwrap_err();
let filemeta_error: Error = utf8_err.into();
assert!(matches!(filemeta_error, Error::FromUtf8(_)));
}
#[test]
fn test_filemeta_error_clone() {
let test_cases = vec![
Error::FileNotFound,
Error::FileVersionNotFound,
Error::VolumeNotFound,
Error::FileCorrupt,
Error::DoneForNow,
Error::MethodNotAllowed,
Error::Unexpected,
Error::Io(IoError::new(ErrorKind::NotFound, "test")),
Error::RmpSerdeDecode("test decode error".to_string()),
Error::RmpSerdeEncode("test encode error".to_string()),
Error::FromUtf8("test utf8 error".to_string()),
Error::RmpDecodeValueRead("test value read error".to_string()),
Error::RmpEncodeValueWrite("test value write error".to_string()),
Error::RmpDecodeNumValueRead("test num read error".to_string()),
Error::RmpDecodeMarkerRead("test marker read error".to_string()),
Error::TimeComponentRange("test time error".to_string()),
Error::UuidParse("test uuid error".to_string()),
];
for original_error in test_cases {
let cloned_error = original_error.clone();
assert_eq!(original_error, cloned_error);
}
}
#[test]
fn test_filemeta_error_partial_eq() {
// Test equality for simple variants
assert_eq!(Error::FileNotFound, Error::FileNotFound);
assert_ne!(Error::FileNotFound, Error::FileVersionNotFound);
// Test equality for Io variants
let io1 = Error::Io(IoError::new(ErrorKind::NotFound, "test"));
let io2 = Error::Io(IoError::new(ErrorKind::NotFound, "test"));
let io3 = Error::Io(IoError::new(ErrorKind::PermissionDenied, "test"));
assert_eq!(io1, io2);
assert_ne!(io1, io3);
// Test equality for string variants
let decode1 = Error::RmpSerdeDecode("error message".to_string());
let decode2 = Error::RmpSerdeDecode("error message".to_string());
let decode3 = Error::RmpSerdeDecode("different message".to_string());
assert_eq!(decode1, decode2);
assert_ne!(decode1, decode3);
}
#[test]
fn test_filemeta_error_display() {
let test_cases = vec![
(Error::FileNotFound, "File not found"),
(Error::FileVersionNotFound, "File version not found"),
(Error::VolumeNotFound, "Volume not found"),
(Error::FileCorrupt, "File corrupt"),
(Error::DoneForNow, "Done for now"),
(Error::MethodNotAllowed, "Method not allowed"),
(Error::Unexpected, "Unexpected error"),
(Error::RmpSerdeDecode("test".to_string()), "rmp serde decode error: test"),
(Error::RmpSerdeEncode("test".to_string()), "rmp serde encode error: test"),
(Error::FromUtf8("test".to_string()), "Invalid UTF-8: test"),
(Error::TimeComponentRange("test".to_string()), "time component range error: test"),
(Error::UuidParse("test".to_string()), "uuid parse error: test"),
];
for (error, expected_message) in test_cases {
assert_eq!(error.to_string(), expected_message);
}
}
#[test]
fn test_rmp_conversions() {
// Test rmp value read error (this one works since it has the same signature)
let value_read_err = rmp::decode::ValueReadError::InvalidMarkerRead(std::io::Error::new(ErrorKind::InvalidData, "test"));
let filemeta_error: Error = value_read_err.into();
assert!(matches!(filemeta_error, Error::RmpDecodeValueRead(_)));
// Test rmp num value read error
let num_value_err =
rmp::decode::NumValueReadError::InvalidMarkerRead(std::io::Error::new(ErrorKind::InvalidData, "test"));
let filemeta_error: Error = num_value_err.into();
assert!(matches!(filemeta_error, Error::RmpDecodeNumValueRead(_)));
}
#[test]
fn test_time_and_uuid_conversions() {
// Test time component range error
use time::{Date, Month};
let time_result = Date::from_calendar_date(2023, Month::January, 32); // Invalid day
assert!(time_result.is_err());
let time_error = time_result.unwrap_err();
let filemeta_error: Error = time_error.into();
assert!(matches!(filemeta_error, Error::TimeComponentRange(_)));
// Test UUID parse error
let uuid_result = uuid::Uuid::parse_str("invalid-uuid");
assert!(uuid_result.is_err());
let uuid_error = uuid_result.unwrap_err();
let filemeta_error: Error = uuid_error.into();
assert!(matches!(filemeta_error, Error::UuidParse(_)));
}
#[test]
fn test_marker_read_error_conversion() {
// Test rmp marker read error conversion
let marker_err = rmp::decode::MarkerReadError(std::io::Error::new(ErrorKind::InvalidData, "marker test"));
let filemeta_error: Error = marker_err.into();
assert!(matches!(filemeta_error, Error::RmpDecodeMarkerRead(_)));
assert!(filemeta_error.to_string().contains("marker"));
}
#[test]
fn test_is_io_eof_function() {
// Test is_io_eof helper function
let eof_error = Error::Io(IoError::new(ErrorKind::UnexpectedEof, "eof"));
assert!(is_io_eof(&eof_error));
let not_eof_error = Error::Io(IoError::new(ErrorKind::NotFound, "not found"));
assert!(!is_io_eof(&not_eof_error));
let non_io_error = Error::FileNotFound;
assert!(!is_io_eof(&non_io_error));
}
#[test]
fn test_filemeta_error_to_io_error_conversion() {
// Test conversion from FileMeta Error to io::Error through other function
let original_io_error = IoError::new(ErrorKind::InvalidData, "test data");
let filemeta_error = Error::other(original_io_error);
match filemeta_error {
Error::Io(io_err) => {
assert_eq!(io_err.kind(), ErrorKind::Other);
assert!(io_err.to_string().contains("test data"));
}
_ => panic!("Expected Io variant"),
}
}
#[test]
fn test_filemeta_error_roundtrip_conversion() {
// Test roundtrip conversion: io::Error -> FileMeta Error -> io::Error
let original_io_error = IoError::new(ErrorKind::PermissionDenied, "permission test");
// Convert to FileMeta Error
let filemeta_error: Error = original_io_error.into();
// Extract the io::Error back
match filemeta_error {
Error::Io(extracted_io_error) => {
assert_eq!(extracted_io_error.kind(), ErrorKind::PermissionDenied);
assert!(extracted_io_error.to_string().contains("permission test"));
}
_ => panic!("Expected Io variant"),
}
}
#[test]
fn test_filemeta_error_io_error_kinds_preservation() {
let io_error_kinds = vec![
ErrorKind::NotFound,
ErrorKind::PermissionDenied,
ErrorKind::ConnectionRefused,
ErrorKind::ConnectionReset,
ErrorKind::ConnectionAborted,
ErrorKind::NotConnected,
ErrorKind::AddrInUse,
ErrorKind::AddrNotAvailable,
ErrorKind::BrokenPipe,
ErrorKind::AlreadyExists,
ErrorKind::WouldBlock,
ErrorKind::InvalidInput,
ErrorKind::InvalidData,
ErrorKind::TimedOut,
ErrorKind::WriteZero,
ErrorKind::Interrupted,
ErrorKind::UnexpectedEof,
ErrorKind::Other,
];
for kind in io_error_kinds {
let io_error = IoError::new(kind, format!("test error for {:?}", kind));
let filemeta_error: Error = io_error.into();
match filemeta_error {
Error::Unexpected => {
assert_eq!(kind, ErrorKind::UnexpectedEof);
}
Error::Io(extracted_io_error) => {
assert_eq!(extracted_io_error.kind(), kind);
assert!(extracted_io_error.to_string().contains("test error"));
}
_ => panic!("Expected Io variant for kind {:?}", kind),
}
}
}
#[test]
fn test_filemeta_error_downcast_chain() {
// Test error downcast chain functionality
let original_io_error = IoError::new(ErrorKind::InvalidData, "original error");
let filemeta_error = Error::other(original_io_error);
// The error should be wrapped as an Io variant
if let Error::Io(io_err) = filemeta_error {
// The wrapped error should be Other kind (from std::io::Error::other)
assert_eq!(io_err.kind(), ErrorKind::Other);
// But the message should still contain the original error information
assert!(io_err.to_string().contains("original error"));
} else {
panic!("Expected Io variant");
}
}
#[test]
fn test_filemeta_error_maintains_error_information() {
let test_cases = vec![
(ErrorKind::NotFound, "file not found"),
(ErrorKind::PermissionDenied, "access denied"),
(ErrorKind::InvalidData, "corrupt data"),
(ErrorKind::TimedOut, "operation timed out"),
];
for (kind, message) in test_cases {
let io_error = IoError::new(kind, message);
let error_message = io_error.to_string();
let filemeta_error: Error = io_error.into();
match filemeta_error {
Error::Io(extracted_io_error) => {
assert_eq!(extracted_io_error.kind(), kind);
assert_eq!(extracted_io_error.to_string(), error_message);
}
_ => panic!("Expected Io variant"),
}
}
}
#[test]
fn test_filemeta_error_complex_conversion_chain() {
// Test conversion from string error types that we can actually create
// Test with UUID error conversion
let uuid_result = uuid::Uuid::parse_str("invalid-uuid-format");
assert!(uuid_result.is_err());
let uuid_error = uuid_result.unwrap_err();
let filemeta_error: Error = uuid_error.into();
match filemeta_error {
Error::UuidParse(message) => {
assert!(message.contains("invalid"));
}
_ => panic!("Expected UuidParse variant"),
}
// Test with time error conversion
use time::{Date, Month};
let time_result = Date::from_calendar_date(2023, Month::January, 32); // Invalid day
assert!(time_result.is_err());
let time_error = time_result.unwrap_err();
let filemeta_error2: Error = time_error.into();
match filemeta_error2 {
Error::TimeComponentRange(message) => {
assert!(message.contains("range"));
}
_ => panic!("Expected TimeComponentRange variant"),
}
// Test with UTF8 error conversion
let utf8_result = std::string::String::from_utf8(vec![0xFF]);
assert!(utf8_result.is_err());
let utf8_error = utf8_result.unwrap_err();
let filemeta_error3: Error = utf8_error.into();
match filemeta_error3 {
Error::FromUtf8(message) => {
assert!(message.contains("utf"));
}
_ => panic!("Expected FromUtf8 variant"),
}
}
#[test]
fn test_filemeta_error_equality_with_io_errors() {
// Test equality comparison for Io variants
let io_error1 = IoError::new(ErrorKind::NotFound, "test message");
let io_error2 = IoError::new(ErrorKind::NotFound, "test message");
let io_error3 = IoError::new(ErrorKind::PermissionDenied, "test message");
let io_error4 = IoError::new(ErrorKind::NotFound, "different message");
let filemeta_error1 = Error::Io(io_error1);
let filemeta_error2 = Error::Io(io_error2);
let filemeta_error3 = Error::Io(io_error3);
let filemeta_error4 = Error::Io(io_error4);
// Same kind and message should be equal
assert_eq!(filemeta_error1, filemeta_error2);
// Different kinds should not be equal
assert_ne!(filemeta_error1, filemeta_error3);
// Different messages should not be equal
assert_ne!(filemeta_error1, filemeta_error4);
}
#[test]
fn test_filemeta_error_clone_io_variants() {
let io_error = IoError::new(ErrorKind::ConnectionReset, "connection lost");
let original_error = Error::Io(io_error);
let cloned_error = original_error.clone();
// Cloned error should be equal to original
assert_eq!(original_error, cloned_error);
// Both should maintain the same properties
match (original_error, cloned_error) {
(Error::Io(orig_io), Error::Io(cloned_io)) => {
assert_eq!(orig_io.kind(), cloned_io.kind());
assert_eq!(orig_io.to_string(), cloned_io.to_string());
}
_ => panic!("Both should be Io variants"),
}
}
}


@@ -0,0 +1,457 @@
use crate::error::{Error, Result};
use crate::headers::RESERVED_METADATA_PREFIX_LOWER;
use crate::headers::RUSTFS_HEALING;
use bytes::Bytes;
use rmp_serde::Serializer;
use rustfs_utils::HashAlgorithm;
use serde::Deserialize;
use serde::Serialize;
use std::collections::HashMap;
use time::OffsetDateTime;
use uuid::Uuid;
pub const ERASURE_ALGORITHM: &str = "rs-vandermonde";
pub const BLOCK_SIZE_V2: usize = 1024 * 1024; // 1M
// Additional constants from Go version
pub const NULL_VERSION_ID: &str = "null";
// pub const RUSTFS_ERASURE_UPGRADED: &str = "x-rustfs-internal-erasure-upgraded";
#[derive(Serialize, Deserialize, Debug, PartialEq, Clone, Default)]
pub struct ObjectPartInfo {
pub etag: String,
pub number: usize,
pub size: usize,
pub actual_size: i64, // Original data size
pub mod_time: Option<OffsetDateTime>,
// Index holds the index of the part in the erasure coding
pub index: Option<Bytes>,
// Checksums holds checksums of the part
pub checksums: Option<HashMap<String, String>>,
}
#[derive(Serialize, Deserialize, Debug, PartialEq, Default, Clone)]
// ChecksumInfo - carries checksums of individual scattered parts per disk.
pub struct ChecksumInfo {
pub part_number: usize,
pub algorithm: HashAlgorithm,
pub hash: Bytes,
}
#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, PartialOrd, Default, Clone)]
pub enum ErasureAlgo {
#[default]
Invalid = 0,
ReedSolomon = 1,
}
impl ErasureAlgo {
pub fn valid(&self) -> bool {
*self > ErasureAlgo::Invalid
}
pub fn to_u8(&self) -> u8 {
match self {
ErasureAlgo::Invalid => 0,
ErasureAlgo::ReedSolomon => 1,
}
}
pub fn from_u8(u: u8) -> Self {
match u {
1 => ErasureAlgo::ReedSolomon,
_ => ErasureAlgo::Invalid,
}
}
}
impl std::fmt::Display for ErasureAlgo {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
ErasureAlgo::Invalid => write!(f, "Invalid"),
ErasureAlgo::ReedSolomon => write!(f, "{}", ERASURE_ALGORITHM),
}
}
}
#[derive(Serialize, Deserialize, Debug, PartialEq, Default, Clone)]
// ErasureInfo holds erasure coding and bitrot related information.
pub struct ErasureInfo {
// Algorithm is the String representation of erasure-coding-algorithm
pub algorithm: String,
// DataBlocks is the number of data blocks for erasure-coding
pub data_blocks: usize,
// ParityBlocks is the number of parity blocks for erasure-coding
pub parity_blocks: usize,
// BlockSize is the size of one erasure-coded block
pub block_size: usize,
// Index is the index of the current disk
pub index: usize,
// Distribution is the distribution of the data and parity blocks
pub distribution: Vec<usize>,
// Checksums holds all bitrot checksums of all erasure encoded blocks
pub checksums: Vec<ChecksumInfo>,
}
pub fn calc_shard_size(block_size: usize, data_shards: usize) -> usize {
(block_size.div_ceil(data_shards) + 1) & !1
}
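// Illustrative values (not from the original source), assuming 4 data shards:
//   calc_shard_size(1_048_576, 4) == 262_144   // exact division, already even
//   calc_shard_size(1_048_577, 4) == 262_146   // div_ceil rounds up, then (+1 & !1) keeps the size even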
impl ErasureInfo {
pub fn get_checksum_info(&self, part_number: usize) -> ChecksumInfo {
for sum in &self.checksums {
if sum.part_number == part_number {
return sum.clone();
}
}
ChecksumInfo {
algorithm: HashAlgorithm::HighwayHash256S,
..Default::default()
}
}
/// Calculate the size of each shard.
pub fn shard_size(&self) -> usize {
calc_shard_size(self.block_size, self.data_blocks)
}
/// Calculate the total erasure file size (the final on-disk shard size) for a given original size.
pub fn shard_file_size(&self, total_length: i64) -> i64 {
if total_length == 0 {
return 0;
}
if total_length < 0 {
return total_length;
}
let total_length = total_length as usize;
let num_shards = total_length / self.block_size;
let last_block_size = total_length % self.block_size;
let last_shard_size = calc_shard_size(last_block_size, self.data_blocks);
(num_shards * self.shard_size() + last_shard_size) as i64
}
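// Worked example (hedged, derived from the formula above): with block_size = 1 MiB,
// data_blocks = 4 and total_length = 2.5 MiB, num_shards = 2 and the last block is
// 512 KiB, so shard_file_size = 2 * 262_144 + 131_072 = 655_360 bytes per shard.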
/// Check if this ErasureInfo equals another ErasureInfo
pub fn equals(&self, other: &ErasureInfo) -> bool {
self.algorithm == other.algorithm
&& self.data_blocks == other.data_blocks
&& self.parity_blocks == other.parity_blocks
&& self.block_size == other.block_size
&& self.index == other.index
&& self.distribution == other.distribution
}
}
// #[derive(Debug, Clone)]
#[derive(Serialize, Deserialize, Debug, PartialEq, Clone, Default)]
pub struct FileInfo {
pub volume: String,
pub name: String,
pub version_id: Option<Uuid>,
pub is_latest: bool,
pub deleted: bool,
// Transition related fields
pub transition_status: Option<String>,
pub transitioned_obj_name: Option<String>,
pub transition_tier: Option<String>,
pub transition_version_id: Option<String>,
pub expire_restored: bool,
pub data_dir: Option<Uuid>,
pub mod_time: Option<OffsetDateTime>,
pub size: i64,
// File mode bits
pub mode: Option<u32>,
// WrittenByVersion is the unix time stamp of the version that created this version of the object
pub written_by_version: Option<u64>,
pub metadata: HashMap<String, String>,
pub parts: Vec<ObjectPartInfo>,
pub erasure: ErasureInfo,
// MarkDeleted marks this version as deleted
pub mark_deleted: bool,
// ReplicationState - Internal replication state to be passed back in ObjectInfo
// pub replication_state: Option<ReplicationState>, // TODO: implement ReplicationState
pub data: Option<Bytes>,
pub num_versions: usize,
pub successor_mod_time: Option<OffsetDateTime>,
pub fresh: bool,
pub idx: usize,
// Combined checksum when object was uploaded
pub checksum: Option<Bytes>,
pub versioned: bool,
}
impl FileInfo {
pub fn new(object: &str, data_blocks: usize, parity_blocks: usize) -> Self {
let indexes = {
let cardinality = data_blocks + parity_blocks;
let mut nums = vec![0; cardinality];
let key_crc = crc32fast::hash(object.as_bytes());
let start = key_crc as usize % cardinality;
for i in 1..=cardinality {
nums[i - 1] = 1 + ((start + i) % cardinality);
}
nums
};
Self {
erasure: ErasureInfo {
algorithm: String::from(ERASURE_ALGORITHM),
data_blocks,
parity_blocks,
block_size: BLOCK_SIZE_V2,
distribution: indexes,
..Default::default()
},
..Default::default()
}
}
pub fn is_valid(&self) -> bool {
if self.deleted {
return true;
}
let data_blocks = self.erasure.data_blocks;
let parity_blocks = self.erasure.parity_blocks;
(data_blocks >= parity_blocks)
&& (data_blocks > 0)
&& (self.erasure.index > 0
&& self.erasure.index <= data_blocks + parity_blocks
&& self.erasure.distribution.len() == (data_blocks + parity_blocks))
}
pub fn get_etag(&self) -> Option<String> {
self.metadata.get("etag").cloned()
}
pub fn write_quorum(&self, quorum: usize) -> usize {
if self.deleted {
return quorum;
}
if self.erasure.data_blocks == self.erasure.parity_blocks {
return self.erasure.data_blocks + 1;
}
self.erasure.data_blocks
}
pub fn marshal_msg(&self) -> Result<Vec<u8>> {
let mut buf = Vec::new();
self.serialize(&mut Serializer::new(&mut buf))?;
Ok(buf)
}
pub fn unmarshal(buf: &[u8]) -> Result<Self> {
let t: FileInfo = rmp_serde::from_slice(buf)?;
Ok(t)
}
pub fn add_object_part(
&mut self,
num: usize,
etag: String,
part_size: usize,
mod_time: Option<OffsetDateTime>,
actual_size: i64,
index: Option<Bytes>,
) {
let part = ObjectPartInfo {
etag,
number: num,
size: part_size,
mod_time,
actual_size,
index,
checksums: None,
};
for p in self.parts.iter_mut() {
if p.number == num {
*p = part;
return;
}
}
self.parts.push(part);
self.parts.sort_by(|a, b| a.number.cmp(&b.number));
}
/// Returns the index of the part containing `offset`, plus the remaining offset within that part.
pub fn to_part_offset(&self, offset: usize) -> Result<(usize, usize)> {
if offset == 0 {
return Ok((0, 0));
}
let mut part_offset = offset;
for (i, part) in self.parts.iter().enumerate() {
let part_index = i;
if part_offset < part.size {
return Ok((part_index, part_offset));
}
part_offset -= part.size
}
Err(Error::other("part not found"))
}
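// Example (illustrative only): with parts of sizes [5, 10], to_part_offset(7)
// returns Ok((1, 2)): the byte lands in the second part, 2 bytes into it.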
pub fn set_healing(&mut self) {
self.metadata.insert(RUSTFS_HEALING.to_string(), "true".to_string());
}
pub fn set_inline_data(&mut self) {
self.metadata
.insert(format!("{}inline-data", RESERVED_METADATA_PREFIX_LOWER).to_owned(), "true".to_owned());
}
pub fn set_data_moved(&mut self) {
self.metadata
.insert(format!("{}data-moved", RESERVED_METADATA_PREFIX_LOWER).to_owned(), "true".to_owned());
}
pub fn inline_data(&self) -> bool {
self.metadata
.contains_key(format!("{}inline-data", RESERVED_METADATA_PREFIX_LOWER).as_str())
&& !self.is_remote()
}
/// Check if the object is compressed
pub fn is_compressed(&self) -> bool {
self.metadata
.contains_key(&format!("{}compression", RESERVED_METADATA_PREFIX_LOWER))
}
/// Check if the object is remote (transitioned to another tier)
pub fn is_remote(&self) -> bool {
!self.transition_tier.as_ref().is_none_or(|s| s.is_empty())
}
/// Get the data directory for this object
pub fn get_data_dir(&self) -> String {
if self.deleted {
return "delete-marker".to_string();
}
self.data_dir.map_or("".to_string(), |dir| dir.to_string())
}
/// Read quorum returns expected read quorum for this FileInfo
pub fn read_quorum(&self, dquorum: usize) -> usize {
if self.deleted {
return dquorum;
}
self.erasure.data_blocks
}
/// Create a shallow copy with minimal information for READ MRF checks
pub fn shallow_copy(&self) -> Self {
Self {
volume: self.volume.clone(),
name: self.name.clone(),
version_id: self.version_id,
deleted: self.deleted,
erasure: self.erasure.clone(),
..Default::default()
}
}
/// Check if this FileInfo equals another FileInfo
pub fn equals(&self, other: &FileInfo) -> bool {
// Check if both are compressed or both are not compressed
if self.is_compressed() != other.is_compressed() {
return false;
}
// Check transition info
if !self.transition_info_equals(other) {
return false;
}
// Check mod time
if self.mod_time != other.mod_time {
return false;
}
// Check erasure info
self.erasure.equals(&other.erasure)
}
/// Check if transition related information are equal
pub fn transition_info_equals(&self, other: &FileInfo) -> bool {
self.transition_status == other.transition_status
&& self.transition_tier == other.transition_tier
&& self.transitioned_obj_name == other.transitioned_obj_name
&& self.transition_version_id == other.transition_version_id
}
/// Check if metadata maps are equal
pub fn metadata_equals(&self, other: &FileInfo) -> bool {
if self.metadata.len() != other.metadata.len() {
return false;
}
for (k, v) in &self.metadata {
if other.metadata.get(k) != Some(v) {
return false;
}
}
true
}
/// Check if replication related fields are equal
pub fn replication_info_equals(&self, other: &FileInfo) -> bool {
self.mark_deleted == other.mark_deleted
// TODO: Add replication_state comparison when implemented
// && self.replication_state == other.replication_state
}
}
#[derive(Debug, Default, Clone, Serialize, Deserialize)]
pub struct FileInfoVersions {
// Name of the volume.
pub volume: String,
// Name of the file.
pub name: String,
// Represents the latest mod time of the
// latest version.
pub latest_mod_time: Option<OffsetDateTime>,
pub versions: Vec<FileInfo>,
pub free_versions: Vec<FileInfo>,
}
impl FileInfoVersions {
pub fn find_version_index(&self, v: &str) -> Option<usize> {
if v.is_empty() {
return None;
}
let vid = Uuid::parse_str(v).unwrap_or_default();
self.versions.iter().position(|v| v.version_id == Some(vid))
}
/// Calculate the total size of all versions for this object
pub fn size(&self) -> i64 {
self.versions.iter().map(|v| v.size).sum()
}
}
#[derive(Default, Serialize, Deserialize)]
pub struct RawFileInfo {
pub buf: Vec<u8>,
}
#[derive(Debug, Default, Clone, Serialize, Deserialize)]
pub struct FilesInfo {
pub files: Vec<FileInfo>,
pub is_truncated: bool,
}

File diff suppressed because it is too large


@@ -1,4 +1,4 @@
use common::error::{Error, Result};
use crate::error::{Error, Result};
use serde::{Deserialize, Serialize};
use std::io::{Cursor, Read};
use uuid::Uuid;
@@ -27,11 +27,7 @@ impl InlineData {
}
pub fn after_version(&self) -> &[u8] {
if self.0.is_empty() {
&self.0
} else {
&self.0[1..]
}
if self.0.is_empty() { &self.0 } else { &self.0[1..] }
}
pub fn find(&self, key: &str) -> Result<Option<Vec<u8>>> {
@@ -90,7 +86,7 @@ impl InlineData {
let field = String::from_utf8(field_buff)?;
if field.is_empty() {
return Err(Error::msg("InlineData key empty"));
return Err(Error::other("InlineData key empty"));
}
let bin_len = rmp::decode::read_bin_len(&mut cur)? as usize;


@@ -0,0 +1,23 @@
pub const AMZ_META_UNENCRYPTED_CONTENT_LENGTH: &str = "X-Amz-Meta-X-Amz-Unencrypted-Content-Length";
pub const AMZ_META_UNENCRYPTED_CONTENT_MD5: &str = "X-Amz-Meta-X-Amz-Unencrypted-Content-Md5";
pub const AMZ_STORAGE_CLASS: &str = "x-amz-storage-class";
pub const RESERVED_METADATA_PREFIX: &str = "X-RustFS-Internal-";
pub const RESERVED_METADATA_PREFIX_LOWER: &str = "x-rustfs-internal-";
pub const RUSTFS_HEALING: &str = "X-Rustfs-Internal-healing";
// pub const RUSTFS_DATA_MOVE: &str = "X-Rustfs-Internal-data-mov";
// pub const X_RUSTFS_INLINE_DATA: &str = "x-rustfs-inline-data";
pub const VERSION_PURGE_STATUS_KEY: &str = "X-Rustfs-Internal-purgestatus";
pub const X_RUSTFS_HEALING: &str = "X-Rustfs-Internal-healing";
pub const X_RUSTFS_DATA_MOV: &str = "X-Rustfs-Internal-data-mov";
pub const AMZ_OBJECT_TAGGING: &str = "X-Amz-Tagging";
pub const AMZ_BUCKET_REPLICATION_STATUS: &str = "X-Amz-Replication-Status";
pub const AMZ_DECODED_CONTENT_LENGTH: &str = "X-Amz-Decoded-Content-Length";
pub const RUSTFS_DATA_MOVE: &str = "X-Rustfs-Internal-data-mov";


@@ -0,0 +1,14 @@
mod error;
mod fileinfo;
mod filemeta;
mod filemeta_inline;
pub mod headers;
mod metacache;
pub mod test_data;
pub use error::*;
pub use fileinfo::*;
pub use filemeta::*;
pub use filemeta_inline::*;
pub use metacache::*;


@@ -0,0 +1,874 @@
use crate::error::{Error, Result};
use crate::{FileInfo, FileInfoVersions, FileMeta, FileMetaShallowVersion, VersionType, merge_file_meta_versions};
use rmp::Marker;
use serde::{Deserialize, Serialize};
use std::cmp::Ordering;
use std::str::from_utf8;
use std::{
fmt::Debug,
future::Future,
pin::Pin,
ptr,
sync::{
Arc,
atomic::{AtomicPtr, AtomicU64, Ordering as AtomicOrdering},
},
time::{Duration, SystemTime, UNIX_EPOCH},
};
use time::OffsetDateTime;
use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite, AsyncWriteExt};
use tokio::spawn;
use tokio::sync::Mutex;
use tracing::warn;
const SLASH_SEPARATOR: &str = "/";
#[derive(Clone, Debug, Default)]
pub struct MetadataResolutionParams {
pub dir_quorum: usize,
pub obj_quorum: usize,
pub requested_versions: usize,
pub bucket: String,
pub strict: bool,
pub candidates: Vec<Vec<FileMetaShallowVersion>>,
}
#[derive(Clone, Debug, Default, Serialize, Deserialize, PartialEq)]
pub struct MetaCacheEntry {
/// name is the full name of the object including prefixes
pub name: String,
/// Metadata. If none is present it is not an object but only a prefix.
/// Entries without metadata will only be present in non-recursive scans.
pub metadata: Vec<u8>,
/// cached contains the metadata if decoded.
#[serde(skip)]
pub cached: Option<FileMeta>,
/// Indicates the entry can be reused and only one reference to metadata is expected.
pub reusable: bool,
}
impl MetaCacheEntry {
pub fn marshal_msg(&self) -> Result<Vec<u8>> {
let mut wr = Vec::new();
rmp::encode::write_bool(&mut wr, true)?;
rmp::encode::write_str(&mut wr, &self.name)?;
rmp::encode::write_bin(&mut wr, &self.metadata)?;
Ok(wr)
}
pub fn is_dir(&self) -> bool {
self.metadata.is_empty() && self.name.ends_with('/')
}
pub fn is_in_dir(&self, dir: &str, separator: &str) -> bool {
if dir.is_empty() {
let idx = self.name.find(separator);
return idx.is_none() || idx.unwrap() == self.name.len() - separator.len();
}
let ext = self.name.trim_start_matches(dir);
if ext.len() != self.name.len() {
let idx = ext.find(separator);
return idx.is_none() || idx.unwrap() == ext.len() - separator.len();
}
false
}
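// Semantics sketch (assumed examples, not from the source):
//   is_in_dir("foo/bar/", "foo/", "/")    -> true  (direct child of dir)
//   is_in_dir("foo/bar/baz", "foo/", "/") -> false (nested deeper than one level)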
pub fn is_object(&self) -> bool {
!self.metadata.is_empty()
}
pub fn is_object_dir(&self) -> bool {
!self.metadata.is_empty() && self.name.ends_with(SLASH_SEPARATOR)
}
pub fn is_latest_delete_marker(&mut self) -> bool {
if let Some(cached) = &self.cached {
if cached.versions.is_empty() {
return true;
}
return cached.versions[0].header.version_type == VersionType::Delete;
}
if !FileMeta::is_xl2_v1_format(&self.metadata) {
return false;
}
match FileMeta::check_xl2_v1(&self.metadata) {
Ok((meta, _, _)) => {
if !meta.is_empty() {
return FileMeta::is_latest_delete_marker(meta);
}
}
Err(_) => return true,
}
match self.xl_meta() {
Ok(res) => {
if res.versions.is_empty() {
return true;
}
res.versions[0].header.version_type == VersionType::Delete
}
Err(_) => true,
}
}
#[tracing::instrument(level = "debug", skip(self))]
pub fn to_fileinfo(&self, bucket: &str) -> Result<FileInfo> {
if self.is_dir() {
return Ok(FileInfo {
volume: bucket.to_owned(),
name: self.name.clone(),
..Default::default()
});
}
if self.cached.is_some() {
let fm = self.cached.as_ref().unwrap();
if fm.versions.is_empty() {
return Ok(FileInfo {
volume: bucket.to_owned(),
name: self.name.clone(),
deleted: true,
is_latest: true,
mod_time: Some(OffsetDateTime::UNIX_EPOCH),
..Default::default()
});
}
let fi = fm.into_fileinfo(bucket, self.name.as_str(), "", false, false)?;
return Ok(fi);
}
let mut fm = FileMeta::new();
fm.unmarshal_msg(&self.metadata)?;
let fi = fm.into_fileinfo(bucket, self.name.as_str(), "", false, false)?;
Ok(fi)
}
pub fn file_info_versions(&self, bucket: &str) -> Result<FileInfoVersions> {
if self.is_dir() {
return Ok(FileInfoVersions {
volume: bucket.to_string(),
name: self.name.clone(),
versions: vec![FileInfo {
volume: bucket.to_string(),
name: self.name.clone(),
..Default::default()
}],
..Default::default()
});
}
let mut fm = FileMeta::new();
fm.unmarshal_msg(&self.metadata)?;
fm.into_file_info_versions(bucket, self.name.as_str(), false)
}
pub fn matches(&self, other: Option<&MetaCacheEntry>, strict: bool) -> (Option<MetaCacheEntry>, bool) {
if other.is_none() {
return (None, false);
}
let other = other.unwrap();
if self.name != other.name {
if self.name < other.name {
return (Some(self.clone()), false);
}
return (Some(other.clone()), false);
}
if other.is_dir() || self.is_dir() {
if self.is_dir() {
return (Some(self.clone()), other.is_dir() == self.is_dir());
}
return (Some(other.clone()), other.is_dir() == self.is_dir());
}
let self_vers = match &self.cached {
Some(file_meta) => file_meta.clone(),
None => match FileMeta::load(&self.metadata) {
Ok(meta) => meta,
Err(_) => return (None, false),
},
};
let other_vers = match &other.cached {
Some(file_meta) => file_meta.clone(),
None => match FileMeta::load(&other.metadata) {
Ok(meta) => meta,
Err(_) => return (None, false),
},
};
if self_vers.versions.len() != other_vers.versions.len() {
match self_vers.lastest_mod_time().cmp(&other_vers.lastest_mod_time()) {
Ordering::Greater => return (Some(self.clone()), false),
Ordering::Less => return (Some(other.clone()), false),
_ => {}
}
if self_vers.versions.len() > other_vers.versions.len() {
return (Some(self.clone()), false);
}
return (Some(other.clone()), false);
}
let mut prefer = None;
for (s_version, o_version) in self_vers.versions.iter().zip(other_vers.versions.iter()) {
if s_version.header != o_version.header {
if s_version.header.has_ec() != o_version.header.has_ec() {
// One version has EC and the other doesn't - may have been written later.
// Compare without considering EC.
let (mut a, mut b) = (s_version.header.clone(), o_version.header.clone());
(a.ec_n, a.ec_m, b.ec_n, b.ec_m) = (0, 0, 0, 0);
if a == b {
continue;
}
}
if !strict && s_version.header.matches_not_strict(&o_version.header) {
if prefer.is_none() {
if s_version.header.sorts_before(&o_version.header) {
prefer = Some(self.clone());
} else {
prefer = Some(other.clone());
}
}
continue;
}
if prefer.is_some() {
return (prefer, false);
}
if s_version.header.sorts_before(&o_version.header) {
return (Some(self.clone()), false);
}
return (Some(other.clone()), false);
}
}
if prefer.is_none() {
prefer = Some(self.clone());
}
(prefer, true)
}
pub fn xl_meta(&mut self) -> Result<FileMeta> {
if self.is_dir() {
return Err(Error::FileNotFound);
}
if let Some(meta) = &self.cached {
Ok(meta.clone())
} else {
if self.metadata.is_empty() {
return Err(Error::FileNotFound);
}
let meta = FileMeta::load(&self.metadata)?;
self.cached = Some(meta.clone());
Ok(meta)
}
}
}
#[derive(Debug, Default)]
pub struct MetaCacheEntries(pub Vec<Option<MetaCacheEntry>>);
impl MetaCacheEntries {
#[allow(clippy::should_implement_trait)]
pub fn as_ref(&self) -> &[Option<MetaCacheEntry>] {
&self.0
}
pub fn resolve(&self, mut params: MetadataResolutionParams) -> Option<MetaCacheEntry> {
if self.0.is_empty() {
warn!("decommission_pool: entries resolve empty");
return None;
}
let mut dir_exists = 0;
let mut selected = None;
params.candidates.clear();
let mut objs_agree = 0;
let mut objs_valid = 0;
for entry in self.0.iter().flatten() {
let mut entry = entry.clone();
warn!("decommission_pool: entries resolve entry {:?}", entry.name);
if entry.name.is_empty() {
continue;
}
if entry.is_dir() {
dir_exists += 1;
selected = Some(entry.clone());
warn!("decommission_pool: entries resolve entry dir {:?}", entry.name);
continue;
}
let xl = match entry.xl_meta() {
Ok(xl) => xl,
Err(e) => {
warn!("decommission_pool: entries resolve entry xl_meta {:?}", e);
continue;
}
};
objs_valid += 1;
params.candidates.push(xl.versions.clone());
if selected.is_none() {
selected = Some(entry.clone());
objs_agree = 1;
warn!("decommission_pool: entries resolve entry selected {:?}", entry.name);
continue;
}
if let (prefer, true) = entry.matches(selected.as_ref(), params.strict) {
selected = prefer;
objs_agree += 1;
warn!("decommission_pool: entries resolve entry prefer {:?}", entry.name);
continue;
}
}
let Some(selected) = selected else {
warn!("decommission_pool: entries resolve entry no selected");
return None;
};
if selected.is_dir() && dir_exists >= params.dir_quorum {
warn!("decommission_pool: entries resolve entry dir selected {:?}", selected.name);
return Some(selected);
}
// If we would never be able to reach read quorum.
if objs_valid < params.obj_quorum {
warn!(
"decommission_pool: entries resolve entry not enough objects {} < {}",
objs_valid, params.obj_quorum
);
return None;
}
if objs_agree == objs_valid {
warn!("decommission_pool: entries resolve entry all agree {} == {}", objs_agree, objs_valid);
return Some(selected);
}
let Some(cached) = selected.cached else {
warn!("decommission_pool: entries resolve entry no cached");
return None;
};
let versions = merge_file_meta_versions(params.obj_quorum, params.strict, params.requested_versions, &params.candidates);
if versions.is_empty() {
warn!("decommission_pool: entries resolve entry no versions");
return None;
}
let metadata = match cached.marshal_msg() {
Ok(meta) => meta,
Err(e) => {
warn!("decommission_pool: entries resolve entry marshal_msg {:?}", e);
return None;
}
};
// Merge if we have disagreement.
// Create a new merged result.
let new_selected = MetaCacheEntry {
name: selected.name.clone(),
cached: Some(FileMeta {
meta_ver: cached.meta_ver,
versions,
..Default::default()
}),
reusable: true,
metadata,
};
warn!("decommission_pool: entries resolve entry selected {:?}", new_selected.name);
Some(new_selected)
}
pub fn first_found(&self) -> (Option<MetaCacheEntry>, usize) {
(self.0.iter().find(|x| x.is_some()).cloned().unwrap_or_default(), self.0.len())
}
}
#[derive(Debug, Default)]
pub struct MetaCacheEntriesSortedResult {
pub entries: Option<MetaCacheEntriesSorted>,
pub err: Option<Error>,
}
#[derive(Debug, Default)]
pub struct MetaCacheEntriesSorted {
pub o: MetaCacheEntries,
pub list_id: Option<String>,
pub reuse: bool,
pub last_skipped_entry: Option<String>,
}
impl MetaCacheEntriesSorted {
pub fn entries(&self) -> Vec<&MetaCacheEntry> {
let entries: Vec<&MetaCacheEntry> = self.o.0.iter().flatten().collect();
entries
}
pub fn forward_past(&mut self, marker: Option<String>) {
if let Some(val) = marker {
if let Some(idx) = self.o.0.iter().flatten().position(|v| v.name > val) {
self.o.0 = self.o.0.split_off(idx);
}
}
}
}
const METACACHE_STREAM_VERSION: u8 = 2;
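// Wire format sketch (informal, reconstructed from MetacacheWriter / MetacacheReader below):
//   u8(version = 2)                           written once by init()
//   [ bool(true), str(name), bin(metadata) ]  repeated per entry by write_obj()
//   bool(false)                               terminator written by close()
// The reader accepts stream versions 1 and 2 (see check_init()).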
#[derive(Debug)]
pub struct MetacacheWriter<W> {
wr: W,
created: bool,
buf: Vec<u8>,
}
impl<W: AsyncWrite + Unpin> MetacacheWriter<W> {
pub fn new(wr: W) -> Self {
Self {
wr,
created: false,
buf: Vec::new(),
}
}
pub async fn flush(&mut self) -> Result<()> {
self.wr.write_all(&self.buf).await?;
self.buf.clear();
Ok(())
}
pub async fn init(&mut self) -> Result<()> {
if !self.created {
rmp::encode::write_u8(&mut self.buf, METACACHE_STREAM_VERSION).map_err(|e| Error::other(format!("{:?}", e)))?;
self.flush().await?;
self.created = true;
}
Ok(())
}
pub async fn write(&mut self, objs: &[MetaCacheEntry]) -> Result<()> {
if objs.is_empty() {
return Ok(());
}
self.init().await?;
for obj in objs.iter() {
if obj.name.is_empty() {
return Err(Error::other("metacacheWriter: no name"));
}
self.write_obj(obj).await?;
}
Ok(())
}
pub async fn write_obj(&mut self, obj: &MetaCacheEntry) -> Result<()> {
self.init().await?;
rmp::encode::write_bool(&mut self.buf, true).map_err(|e| Error::other(format!("{:?}", e)))?;
rmp::encode::write_str(&mut self.buf, &obj.name).map_err(|e| Error::other(format!("{:?}", e)))?;
rmp::encode::write_bin(&mut self.buf, &obj.metadata).map_err(|e| Error::other(format!("{:?}", e)))?;
self.flush().await?;
Ok(())
}
pub async fn close(&mut self) -> Result<()> {
rmp::encode::write_bool(&mut self.buf, false).map_err(|e| Error::other(format!("{:?}", e)))?;
self.flush().await?;
Ok(())
}
}
pub struct MetacacheReader<R> {
rd: R,
init: bool,
err: Option<Error>,
buf: Vec<u8>,
offset: usize,
current: Option<MetaCacheEntry>,
}
impl<R: AsyncRead + Unpin> MetacacheReader<R> {
pub fn new(rd: R) -> Self {
Self {
rd,
init: false,
err: None,
buf: Vec::new(),
offset: 0,
current: None,
}
}
pub async fn read_more(&mut self, read_size: usize) -> Result<&[u8]> {
let ext_size = read_size + self.offset;
let extra = ext_size - self.offset;
if self.buf.capacity() >= ext_size {
// Capacity is already sufficient; just grow the length in place.
self.buf.resize(ext_size, 0);
} else {
self.buf.extend(vec![0u8; extra]);
}
let pref = self.offset;
self.rd.read_exact(&mut self.buf[pref..ext_size]).await?;
self.offset += read_size;
let data = &self.buf[pref..ext_size];
Ok(data)
}
fn reset(&mut self) {
self.buf.clear();
self.offset = 0;
}
async fn check_init(&mut self) -> Result<()> {
if !self.init {
let ver = match rmp::decode::read_u8(&mut self.read_more(2).await?) {
Ok(res) => res,
Err(err) => {
self.err = Some(Error::other(format!("{:?}", err)));
0
}
};
match ver {
1 | 2 => (),
_ => {
self.err = Some(Error::other("invalid version"));
}
}
self.init = true;
}
Ok(())
}
async fn read_str_len(&mut self) -> Result<u32> {
let mark = match rmp::decode::read_marker(&mut self.read_more(1).await?) {
Ok(res) => res,
Err(err) => {
let err: Error = err.into();
self.err = Some(err.clone());
return Err(err);
}
};
match mark {
Marker::FixStr(size) => Ok(u32::from(size)),
Marker::Str8 => Ok(u32::from(self.read_u8().await?)),
Marker::Str16 => Ok(u32::from(self.read_u16().await?)),
Marker::Str32 => Ok(self.read_u32().await?),
_marker => Err(Error::other("str marker err")),
}
}
async fn read_bin_len(&mut self) -> Result<u32> {
let mark = match rmp::decode::read_marker(&mut self.read_more(1).await?) {
Ok(res) => res,
Err(err) => {
let err: Error = err.into();
self.err = Some(err.clone());
return Err(err);
}
};
match mark {
Marker::Bin8 => Ok(u32::from(self.read_u8().await?)),
Marker::Bin16 => Ok(u32::from(self.read_u16().await?)),
Marker::Bin32 => Ok(self.read_u32().await?),
_ => Err(Error::other("bin marker err")),
}
}
async fn read_u8(&mut self) -> Result<u8> {
let buf = self.read_more(1).await?;
Ok(u8::from_be_bytes(buf.try_into().expect("Slice with incorrect length")))
}
async fn read_u16(&mut self) -> Result<u16> {
let buf = self.read_more(2).await?;
Ok(u16::from_be_bytes(buf.try_into().expect("Slice with incorrect length")))
}
async fn read_u32(&mut self) -> Result<u32> {
let buf = self.read_more(4).await?;
Ok(u32::from_be_bytes(buf.try_into().expect("Slice with incorrect length")))
}
pub async fn skip(&mut self, size: usize) -> Result<()> {
self.check_init().await?;
if let Some(err) = &self.err {
return Err(err.clone());
}
let mut n = size;
if self.current.is_some() {
n -= 1;
self.current = None;
}
while n > 0 {
match rmp::decode::read_bool(&mut self.read_more(1).await?) {
Ok(res) => {
if !res {
return Ok(());
}
}
Err(err) => {
let err: Error = err.into();
self.err = Some(err.clone());
return Err(err);
}
};
let l = self.read_str_len().await?;
let _ = self.read_more(l as usize).await?;
let l = self.read_bin_len().await?;
let _ = self.read_more(l as usize).await?;
n -= 1;
}
Ok(())
}
pub async fn peek(&mut self) -> Result<Option<MetaCacheEntry>> {
self.check_init().await?;
if let Some(err) = &self.err {
return Err(err.clone());
}
match rmp::decode::read_bool(&mut self.read_more(1).await?) {
Ok(res) => {
if !res {
return Ok(None);
}
}
Err(err) => {
let err: Error = err.into();
self.err = Some(err.clone());
return Err(err);
}
};
let l = self.read_str_len().await?;
let buf = self.read_more(l as usize).await?;
let name_buf = buf.to_vec();
let name = match from_utf8(&name_buf) {
Ok(decoded) => decoded.to_owned(),
Err(err) => {
self.err = Some(Error::other(err.to_string()));
return Err(Error::other(err.to_string()));
}
};
let l = self.read_bin_len().await?;
let buf = self.read_more(l as usize).await?;
let metadata = buf.to_vec();
self.reset();
let entry = Some(MetaCacheEntry {
name,
metadata,
cached: None,
reusable: false,
});
self.current = entry.clone();
Ok(entry)
}
pub async fn read_all(&mut self) -> Result<Vec<MetaCacheEntry>> {
let mut ret = Vec::new();
loop {
if let Some(entry) = self.peek().await? {
ret.push(entry);
continue;
}
break;
}
Ok(ret)
}
}
pub type UpdateFn<T> = Box<dyn Fn() -> Pin<Box<dyn Future<Output = std::io::Result<T>> + Send>> + Send + Sync + 'static>;
#[derive(Clone, Debug, Default)]
pub struct Opts {
pub return_last_good: bool,
pub no_wait: bool,
}
pub struct Cache<T: Clone + Debug + Send> {
update_fn: UpdateFn<T>,
ttl: Duration,
opts: Opts,
val: AtomicPtr<T>,
last_update_ms: AtomicU64,
updating: Arc<Mutex<bool>>,
}
impl<T: Clone + Debug + Send + 'static> Cache<T> {
pub fn new(update_fn: UpdateFn<T>, ttl: Duration, opts: Opts) -> Self {
let val = AtomicPtr::new(ptr::null_mut());
Self {
update_fn,
ttl,
opts,
val,
last_update_ms: AtomicU64::new(0),
updating: Arc::new(Mutex::new(false)),
}
}
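// Usage sketch (hypothetical caller; `fetch_value` is a placeholder async fn
// returning std::io::Result<T>, not part of this crate):
//   let cache = Arc::new(Cache::new(
//       Box::new(|| Box::pin(async { fetch_value().await })),
//       Duration::from_secs(5),
//       Opts { return_last_good: true, no_wait: false },
//   ));
//   let v = cache.clone().get().await?;  // refreshes at most once per `ttl` window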
#[allow(unsafe_code)]
pub async fn get(self: Arc<Self>) -> std::io::Result<T> {
let v_ptr = self.val.load(AtomicOrdering::SeqCst);
let v = if v_ptr.is_null() {
None
} else {
Some(unsafe { (*v_ptr).clone() })
};
let now = SystemTime::now()
.duration_since(UNIX_EPOCH)
.expect("Time went backwards")
.as_secs();
if now - self.last_update_ms.load(AtomicOrdering::SeqCst) < self.ttl.as_secs() {
if let Some(v) = v {
return Ok(v);
}
}
if self.opts.no_wait && v.is_some() && now - self.last_update_ms.load(AtomicOrdering::SeqCst) < self.ttl.as_secs() * 2 {
if self.updating.try_lock().is_ok() {
let this = Arc::clone(&self);
spawn(async move {
let _ = this.update().await;
});
}
return Ok(v.unwrap());
}
let _ = self.updating.lock().await;
if let Ok(duration) =
SystemTime::now().duration_since(UNIX_EPOCH + Duration::from_secs(self.last_update_ms.load(AtomicOrdering::SeqCst)))
{
if duration < self.ttl {
return Ok(v.unwrap());
}
}
match self.update().await {
Ok(_) => {
let v_ptr = self.val.load(AtomicOrdering::SeqCst);
let v = if v_ptr.is_null() {
None
} else {
Some(unsafe { (*v_ptr).clone() })
};
Ok(v.unwrap())
}
Err(err) => Err(err),
}
}
async fn update(&self) -> std::io::Result<()> {
match (self.update_fn)().await {
Ok(val) => {
self.val.store(Box::into_raw(Box::new(val)), AtomicOrdering::SeqCst);
let now = SystemTime::now()
.duration_since(UNIX_EPOCH)
.expect("Time went backwards")
.as_secs();
self.last_update_ms.store(now, AtomicOrdering::SeqCst);
Ok(())
}
Err(err) => {
let v_ptr = self.val.load(AtomicOrdering::SeqCst);
if self.opts.return_last_good && !v_ptr.is_null() {
return Ok(());
}
Err(err)
}
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use std::io::Cursor;
#[tokio::test]
async fn test_writer() {
let mut f = Cursor::new(Vec::new());
let mut w = MetacacheWriter::new(&mut f);
let mut objs = Vec::new();
for i in 0..10 {
let info = MetaCacheEntry {
name: format!("item{}", i),
metadata: vec![0u8, 10],
cached: None,
reusable: false,
};
objs.push(info);
}
w.write(&objs).await.unwrap();
w.close().await.unwrap();
let data = f.into_inner();
let nf = Cursor::new(data);
let mut r = MetacacheReader::new(nf);
let nobjs = r.read_all().await.unwrap();
assert_eq!(objs, nobjs);
}
}


@@ -0,0 +1,292 @@
use crate::error::Result;
use crate::filemeta::*;
use std::collections::HashMap;
use time::OffsetDateTime;
use uuid::Uuid;
/// Create realistic xl.meta file data for testing
pub fn create_real_xlmeta() -> Result<Vec<u8>> {
let mut fm = FileMeta::new();
// Create a realistic object version
let version_id = Uuid::parse_str("01234567-89ab-cdef-0123-456789abcdef")?;
let data_dir = Uuid::parse_str("fedcba98-7654-3210-fedc-ba9876543210")?;
let mut metadata = HashMap::new();
metadata.insert("Content-Type".to_string(), "text/plain".to_string());
metadata.insert("X-Amz-Meta-Author".to_string(), "test-user".to_string());
metadata.insert("X-Amz-Meta-Created".to_string(), "2024-01-15T10:30:00Z".to_string());
let object_version = MetaObject {
version_id: Some(version_id),
data_dir: Some(data_dir),
erasure_algorithm: crate::fileinfo::ErasureAlgo::ReedSolomon,
erasure_m: 4,
erasure_n: 2,
erasure_block_size: 1024 * 1024, // 1MB
erasure_index: 1,
erasure_dist: vec![0, 1, 2, 3, 4, 5],
bitrot_checksum_algo: ChecksumAlgo::HighwayHash,
part_numbers: vec![1],
part_etags: vec!["d41d8cd98f00b204e9800998ecf8427e".to_string()],
part_sizes: vec![1024],
part_actual_sizes: vec![1024],
part_indices: Vec::new(),
size: 1024,
mod_time: Some(OffsetDateTime::from_unix_timestamp(1705312200)?), // 2024-01-15 10:30:00 UTC
meta_sys: HashMap::new(),
meta_user: metadata,
};
let file_version = FileMetaVersion {
version_type: VersionType::Object,
object: Some(object_version),
delete_marker: None,
write_version: 1,
};
let shallow_version = FileMetaShallowVersion::try_from(file_version)?;
fm.versions.push(shallow_version);
// Add a delete-marker version
let delete_version_id = Uuid::parse_str("11111111-2222-3333-4444-555555555555")?;
let delete_marker = MetaDeleteMarker {
version_id: Some(delete_version_id),
mod_time: Some(OffsetDateTime::from_unix_timestamp(1705312260)?), // one minute later
meta_sys: None,
};
let delete_file_version = FileMetaVersion {
version_type: VersionType::Delete,
object: None,
delete_marker: Some(delete_marker),
write_version: 2,
};
let delete_shallow_version = FileMetaShallowVersion::try_from(delete_file_version)?;
fm.versions.push(delete_shallow_version);
// Add a Legacy version for testing
let legacy_version_id = Uuid::parse_str("aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee")?;
let legacy_version = FileMetaVersion {
version_type: VersionType::Legacy,
object: None,
delete_marker: None,
write_version: 3,
};
let mut legacy_shallow = FileMetaShallowVersion::try_from(legacy_version)?;
legacy_shallow.header.version_id = Some(legacy_version_id);
legacy_shallow.header.mod_time = Some(OffsetDateTime::from_unix_timestamp(1705312140)?); // an earlier timestamp
fm.versions.push(legacy_shallow);
// Sort versions by mod time (newest first)
fm.versions.sort_by(|a, b| b.header.mod_time.cmp(&a.header.mod_time));
fm.marshal_msg()
}
/// Create a complex xl.meta file containing multiple versions
pub fn create_complex_xlmeta() -> Result<Vec<u8>> {
let mut fm = FileMeta::new();
// Create 10 object versions
for i in 0i64..10i64 {
let version_id = Uuid::new_v4();
let data_dir = if i % 3 == 0 { Some(Uuid::new_v4()) } else { None };
let mut metadata = HashMap::new();
metadata.insert("Content-Type".to_string(), "application/octet-stream".to_string());
metadata.insert("X-Amz-Meta-Version".to_string(), i.to_string());
metadata.insert("X-Amz-Meta-Test".to_string(), format!("test-value-{}", i));
let object_version = MetaObject {
version_id: Some(version_id),
data_dir,
erasure_algorithm: crate::fileinfo::ErasureAlgo::ReedSolomon,
erasure_m: 4,
erasure_n: 2,
erasure_block_size: 1024 * 1024,
erasure_index: (i % 6) as usize,
erasure_dist: vec![0, 1, 2, 3, 4, 5],
bitrot_checksum_algo: ChecksumAlgo::HighwayHash,
part_numbers: vec![1],
part_etags: vec![format!("etag-{:08x}", i)],
part_sizes: vec![1024 * (i + 1) as usize],
part_actual_sizes: vec![1024 * (i + 1)],
part_indices: Vec::new(),
size: 1024 * (i + 1),
mod_time: Some(OffsetDateTime::from_unix_timestamp(1705312200 + i * 60)?),
meta_sys: HashMap::new(),
meta_user: metadata,
};
let file_version = FileMetaVersion {
version_type: VersionType::Object,
object: Some(object_version),
delete_marker: None,
write_version: (i + 1) as u64,
};
let shallow_version = FileMetaShallowVersion::try_from(file_version)?;
fm.versions.push(shallow_version);
// Add a delete marker after every third version
if i % 3 == 2 {
let delete_version_id = Uuid::new_v4();
let delete_marker = MetaDeleteMarker {
version_id: Some(delete_version_id),
mod_time: Some(OffsetDateTime::from_unix_timestamp(1705312200 + i * 60 + 30)?),
meta_sys: None,
};
let delete_file_version = FileMetaVersion {
version_type: VersionType::Delete,
object: None,
delete_marker: Some(delete_marker),
write_version: (i + 100) as u64,
};
let delete_shallow_version = FileMetaShallowVersion::try_from(delete_file_version)?;
fm.versions.push(delete_shallow_version);
}
}
// Sort versions by mod time (newest first)
fm.versions.sort_by(|a, b| b.header.mod_time.cmp(&a.header.mod_time));
fm.marshal_msg()
}
/// Create a corrupted xl.meta file for error-handling tests
pub fn create_corrupted_xlmeta() -> Vec<u8> {
let mut data = vec![
// Valid file header
b'X', b'L', b'2', b' ', // "XL2 " magic
1, 0, 3, 0, // version fields
0xc6, 0x00, 0x00, 0x00, 0x10, // valid bin32 length marker, but the declared length does not match the data
];
// Append less data than declared
data.extend_from_slice(&[0x42; 8]); // only 8 bytes, but 16 were declared
data
}
/// Create an empty xl.meta file
pub fn create_empty_xlmeta() -> Result<Vec<u8>> {
let fm = FileMeta::new();
fm.marshal_msg()
}
/// Helper that validates parsed metadata
pub fn verify_parsed_metadata(fm: &FileMeta, expected_versions: usize) -> Result<()> {
assert_eq!(fm.versions.len(), expected_versions, "version count mismatch");
assert_eq!(fm.meta_ver, crate::filemeta::XL_META_VERSION, "metadata version mismatch");
// Verify that versions are sorted by mod time
for i in 1..fm.versions.len() {
let prev_time = fm.versions[i - 1].header.mod_time;
let curr_time = fm.versions[i].header.mod_time;
if let (Some(prev), Some(curr)) = (prev_time, curr_time) {
assert!(prev >= curr, "versions are not correctly sorted by mod time");
}
}
Ok(())
}
/// Create an xl.meta file containing inline data
pub fn create_xlmeta_with_inline_data() -> Result<Vec<u8>> {
let mut fm = FileMeta::new();
// Add inline data
let inline_data = b"This is inline data for testing purposes";
let version_id = Uuid::new_v4();
fm.data.replace(&version_id.to_string(), inline_data.to_vec())?;
let object_version = MetaObject {
version_id: Some(version_id),
data_dir: None,
erasure_algorithm: crate::fileinfo::ErasureAlgo::ReedSolomon,
erasure_m: 1,
erasure_n: 1,
erasure_block_size: 64 * 1024,
erasure_index: 0,
erasure_dist: vec![0, 1],
bitrot_checksum_algo: ChecksumAlgo::HighwayHash,
part_numbers: vec![1],
part_etags: Vec::new(),
part_sizes: vec![inline_data.len()],
part_actual_sizes: Vec::new(),
part_indices: Vec::new(),
size: inline_data.len() as i64,
mod_time: Some(OffsetDateTime::now_utc()),
meta_sys: HashMap::new(),
meta_user: HashMap::new(),
};
let file_version = FileMetaVersion {
version_type: VersionType::Object,
object: Some(object_version),
delete_marker: None,
write_version: 1,
};
let shallow_version = FileMetaShallowVersion::try_from(file_version)?;
fm.versions.push(shallow_version);
fm.marshal_msg()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_create_real_xlmeta() {
let data = create_real_xlmeta().expect("failed to create test data");
assert!(!data.is_empty(), "generated data should not be empty");
// Verify the file header
assert_eq!(&data[0..4], b"XL2 ", "incorrect file header");
// Try to parse it
let fm = FileMeta::load(&data).expect("parse failed");
verify_parsed_metadata(&fm, 3).expect("verification failed");
}
#[test]
fn test_create_complex_xlmeta() {
let data = create_complex_xlmeta().expect("failed to create complex test data");
assert!(!data.is_empty(), "generated data should not be empty");
let fm = FileMeta::load(&data).expect("parse failed");
assert!(fm.versions.len() >= 10, "should have at least 10 versions");
}
#[test]
fn test_create_xlmeta_with_inline_data() {
let data = create_xlmeta_with_inline_data().expect("failed to create inline-data test data");
assert!(!data.is_empty(), "generated data should not be empty");
let fm = FileMeta::load(&data).expect("parse failed");
assert_eq!(fm.versions.len(), 1, "should have exactly 1 version");
assert!(!fm.data.as_slice().is_empty(), "should contain inline data");
}
#[test]
fn test_corrupted_xlmeta_handling() {
let data = create_corrupted_xlmeta();
let result = FileMeta::load(&data);
assert!(result.is_err(), "corrupted data should fail to parse");
}
#[test]
fn test_empty_xlmeta() {
let data = create_empty_xlmeta().expect("failed to create empty test data");
let fm = FileMeta::load(&data).expect("failed to parse empty data");
assert_eq!(fm.versions.len(), 0, "an empty file should have no versions");
}
}


@@ -1,7 +1,7 @@
use crate::Error;
use serde::{Deserialize, Serialize};
use serde_with::{DeserializeFromStr, SerializeDisplay};
use smallvec::{smallvec, SmallVec};
use smallvec::{SmallVec, smallvec};
use std::borrow::Cow;
use std::collections::HashMap;
use std::time::{SystemTime, UNIX_EPOCH};


@@ -1,5 +1,5 @@
use opentelemetry::global;
use rustfs_obs::{get_logger, init_obs, log_info, BaseLogEntry, ServerLogEntry, SystemObserver};
use rustfs_obs::{BaseLogEntry, ServerLogEntry, SystemObserver, get_logger, init_obs, log_info};
use std::collections::HashMap;
use std::time::{Duration, SystemTime};
use tracing::{error, info, instrument};


@@ -1,6 +1,6 @@
use crate::logger::InitLogStatus;
use crate::telemetry::{init_telemetry, OtelGuard};
use crate::{get_global_logger, init_global_logger, AppConfig, Logger};
use crate::telemetry::{OtelGuard, init_telemetry};
use crate::{AppConfig, Logger, get_global_logger, init_global_logger};
use std::sync::{Arc, Mutex};
use tokio::sync::{OnceCell, SetError};
use tracing::{error, info};
@@ -102,8 +102,8 @@ pub fn get_logger() -> &'static Arc<tokio::sync::Mutex<Logger>> {
/// ```rust
/// use rustfs_obs::{ init_obs, set_global_guard};
///
/// fn init() -> Result<(), Box<dyn std::error::Error>> {
/// let guard = init_obs(None);
/// async fn init() -> Result<(), Box<dyn std::error::Error>> {
/// let (_, guard) = init_obs(None).await;
/// set_global_guard(guard)?;
/// Ok(())
/// }


@@ -1,6 +1,6 @@
use crate::sinks::Sink;
use crate::{
sinks, AppConfig, AuditLogEntry, BaseLogEntry, ConsoleLogEntry, GlobalError, OtelConfig, ServerLogEntry, UnifiedLogEntry,
AppConfig, AuditLogEntry, BaseLogEntry, ConsoleLogEntry, GlobalError, OtelConfig, ServerLogEntry, UnifiedLogEntry, sinks,
};
use rustfs_config::{APP_NAME, ENVIRONMENT, SERVICE_VERSION};
use std::sync::Arc;


@@ -1,5 +1,5 @@
use crate::sinks::Sink;
use crate::UnifiedLogEntry;
use crate::sinks::Sink;
use async_trait::async_trait;
/// Webhook Sink Implementation


@@ -1,11 +1,11 @@
use crate::GlobalError;
use crate::system::attributes::ProcessAttributes;
use crate::system::gpu::GpuCollector;
use crate::system::metrics::{Metrics, DIRECTION, INTERFACE, STATUS};
use crate::GlobalError;
use crate::system::metrics::{DIRECTION, INTERFACE, Metrics, STATUS};
use opentelemetry::KeyValue;
use std::time::SystemTime;
use sysinfo::{Networks, Pid, ProcessStatus, System};
use tokio::time::{sleep, Duration};
use tokio::time::{Duration, sleep};
/// Collector is responsible for collecting system metrics and attributes.
/// It uses the sysinfo crate to gather information about the system and processes.


@@ -1,14 +1,14 @@
#[cfg(feature = "gpu")]
use crate::GlobalError;
#[cfg(feature = "gpu")]
use crate::system::attributes::ProcessAttributes;
#[cfg(feature = "gpu")]
use crate::system::metrics::Metrics;
#[cfg(feature = "gpu")]
use crate::GlobalError;
use nvml_wrapper::Nvml;
#[cfg(feature = "gpu")]
use nvml_wrapper::enums::device::UsedGpuMemory;
#[cfg(feature = "gpu")]
use nvml_wrapper::Nvml;
#[cfg(feature = "gpu")]
use sysinfo::Pid;
#[cfg(feature = "gpu")]
use tracing::warn;


@@ -1,19 +1,19 @@
use crate::OtelConfig;
use flexi_logger::{style, Age, Cleanup, Criterion, DeferredNow, FileSpec, LogSpecification, Naming, Record, WriteMode};
use flexi_logger::{Age, Cleanup, Criterion, DeferredNow, FileSpec, LogSpecification, Naming, Record, WriteMode, style};
use nu_ansi_term::Color;
use opentelemetry::trace::TracerProvider;
use opentelemetry::{global, KeyValue};
use opentelemetry::{KeyValue, global};
use opentelemetry_appender_tracing::layer::OpenTelemetryTracingBridge;
use opentelemetry_otlp::WithExportConfig;
use opentelemetry_sdk::logs::SdkLoggerProvider;
use opentelemetry_sdk::{
Resource,
metrics::{MeterProviderBuilder, PeriodicReader, SdkMeterProvider},
trace::{RandomIdGenerator, Sampler, SdkTracerProvider},
Resource,
};
use opentelemetry_semantic_conventions::{
attribute::{DEPLOYMENT_ENVIRONMENT_NAME, NETWORK_LOCAL_ADDRESS, SERVICE_VERSION as OTEL_SERVICE_VERSION},
SCHEMA_URL,
attribute::{DEPLOYMENT_ENVIRONMENT_NAME, NETWORK_LOCAL_ADDRESS, SERVICE_VERSION as OTEL_SERVICE_VERSION},
};
use rustfs_config::{
APP_NAME, DEFAULT_LOG_DIR, DEFAULT_LOG_KEEP_FILES, DEFAULT_LOG_LEVEL, ENVIRONMENT, METER_INTERVAL, SAMPLE_RATIO,
@@ -27,7 +27,7 @@ use tracing::info;
use tracing_error::ErrorLayer;
use tracing_opentelemetry::{MetricsLayer, OpenTelemetryLayer};
use tracing_subscriber::fmt::format::FmtSpan;
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt, EnvFilter, Layer};
use tracing_subscriber::{EnvFilter, Layer, layer::SubscriberExt, util::SubscriberInitExt};
/// A guard object that manages the lifecycle of OpenTelemetry components.
///


@@ -1,4 +1,4 @@
use crate::{sinks::Sink, UnifiedLogEntry};
use crate::{UnifiedLogEntry, sinks::Sink};
use std::sync::Arc;
use tokio::sync::mpsc::Receiver;

crates/rio/Cargo.toml Normal file

@@ -0,0 +1,34 @@
[package]
name = "rustfs-rio"
edition.workspace = true
license.workspace = true
repository.workspace = true
rust-version.workspace = true
version.workspace = true
[lints]
workspace = true
[dependencies]
tokio = { workspace = true, features = ["full"] }
rand = { workspace = true }
md-5 = { workspace = true }
http.workspace = true
aes-gcm = "0.10.3"
crc32fast = "1.4.2"
pin-project-lite.workspace = true
async-trait.workspace = true
base64-simd = "0.8.0"
hex-simd = "0.8.0"
serde = { workspace = true }
bytes.workspace = true
reqwest.workspace = true
tokio-util.workspace = true
futures.workspace = true
rustfs-utils = {workspace = true, features= ["io","hash","compress"]}
byteorder.workspace = true
serde_json.workspace = true
[dev-dependencies]
criterion = { version = "0.5.1", features = ["async", "async_tokio", "tokio"] }
tokio-test = "0.4"


@@ -0,0 +1,672 @@
use bytes::Bytes;
use serde::{Deserialize, Serialize};
use std::io::{self, Read, Seek, SeekFrom};
const S2_INDEX_HEADER: &[u8] = b"s2idx\x00";
const S2_INDEX_TRAILER: &[u8] = b"\x00xdi2s";
const MAX_INDEX_ENTRIES: usize = 1 << 16;
const MIN_INDEX_DIST: i64 = 1 << 20;
// const MIN_INDEX_DIST: i64 = 0;
pub trait TryGetIndex {
fn try_get_index(&self) -> Option<&Index> {
None
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Index {
pub total_uncompressed: i64,
pub total_compressed: i64,
info: Vec<IndexInfo>,
est_block_uncomp: i64,
}
impl Default for Index {
fn default() -> Self {
Self::new()
}
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct IndexInfo {
pub compressed_offset: i64,
pub uncompressed_offset: i64,
}
#[allow(dead_code)]
impl Index {
pub fn new() -> Self {
Self {
total_uncompressed: -1,
total_compressed: -1,
info: Vec::new(),
est_block_uncomp: 0,
}
}
#[allow(dead_code)]
fn reset(&mut self, max_block: usize) {
self.est_block_uncomp = max_block as i64;
self.total_compressed = -1;
self.total_uncompressed = -1;
self.info.clear();
}
pub fn len(&self) -> usize {
self.info.len()
}
fn alloc_infos(&mut self, n: usize) {
if n > MAX_INDEX_ENTRIES {
panic!("n > MAX_INDEX_ENTRIES");
}
self.info = Vec::with_capacity(n);
}
pub fn add(&mut self, compressed_offset: i64, uncompressed_offset: i64) -> io::Result<()> {
if self.info.is_empty() {
self.info.push(IndexInfo {
compressed_offset,
uncompressed_offset,
});
return Ok(());
}
let last_idx = self.info.len() - 1;
let latest = &mut self.info[last_idx];
if latest.uncompressed_offset == uncompressed_offset {
latest.compressed_offset = compressed_offset;
return Ok(());
}
if latest.uncompressed_offset > uncompressed_offset {
return Err(io::Error::new(
io::ErrorKind::InvalidData,
format!(
"internal error: Earlier uncompressed received ({} > {})",
latest.uncompressed_offset, uncompressed_offset
),
));
}
if latest.compressed_offset > compressed_offset {
return Err(io::Error::new(
io::ErrorKind::InvalidData,
format!(
"internal error: Earlier compressed received ({} > {})",
latest.compressed_offset, compressed_offset
),
));
}
if latest.uncompressed_offset + MIN_INDEX_DIST > uncompressed_offset {
return Ok(());
}
self.info.push(IndexInfo {
compressed_offset,
uncompressed_offset,
});
self.total_compressed = compressed_offset;
self.total_uncompressed = uncompressed_offset;
Ok(())
}
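/// Returns the (compressed_offset, uncompressed_offset) pair of the entry at or
/// immediately before `offset` in the uncompressed stream. A negative `offset` is
/// interpreted relative to the end of the stream; offsets past the end, or lookups
/// on an empty index, return `UnexpectedEof`.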
pub fn find(&self, offset: i64) -> io::Result<(i64, i64)> {
if self.total_uncompressed < 0 {
return Err(io::Error::other("corrupt index"));
}
let mut offset = offset;
if offset < 0 {
offset += self.total_uncompressed;
if offset < 0 {
return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "offset out of bounds"));
}
}
if offset > self.total_uncompressed {
return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "offset out of bounds"));
}
if self.info.is_empty() {
return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "empty index"));
}
if self.info.len() > 200 {
let n = self
.info
.binary_search_by(|info| {
if info.uncompressed_offset > offset {
std::cmp::Ordering::Greater
} else {
std::cmp::Ordering::Less
}
})
.unwrap_or_else(|i| i);
if n == 0 {
return Ok((self.info[0].compressed_offset, self.info[0].uncompressed_offset));
}
return Ok((self.info[n - 1].compressed_offset, self.info[n - 1].uncompressed_offset));
}
let mut compressed_off = 0;
let mut uncompressed_off = 0;
for info in &self.info {
if info.uncompressed_offset > offset {
break;
}
compressed_off = info.compressed_offset;
uncompressed_off = info.uncompressed_offset;
}
Ok((compressed_off, uncompressed_off))
}
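/// Shrinks the index in place once it holds MAX_INDEX_ENTRIES entries (or while the
/// estimated block size is below MIN_INDEX_DIST), keeping every (remove_n + 1)-th
/// entry and scaling est_block_uncomp to match.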
fn reduce(&mut self) {
if self.info.len() < MAX_INDEX_ENTRIES && self.est_block_uncomp >= MIN_INDEX_DIST {
return;
}
let mut remove_n = (self.info.len() + 1) / MAX_INDEX_ENTRIES;
let src = self.info.clone();
let mut j = 0;
while self.est_block_uncomp * (remove_n as i64 + 1) < MIN_INDEX_DIST && self.info.len() / (remove_n + 1) > 1000 {
remove_n += 1;
}
let mut idx = 0;
while idx < src.len() {
self.info[j] = src[idx].clone();
j += 1;
idx += remove_n + 1;
}
self.info.truncate(j);
self.est_block_uncomp += self.est_block_uncomp * remove_n as i64;
}
pub fn into_vec(mut self) -> Bytes {
let mut b = Vec::new();
self.append_to(&mut b, self.total_uncompressed, self.total_compressed);
Bytes::from(b)
}
pub fn append_to(&mut self, b: &mut Vec<u8>, uncomp_total: i64, comp_total: i64) {
self.reduce();
let init_size = b.len();
// Add skippable header
b.extend_from_slice(&[0x50, 0x2A, 0x4D, 0x18]); // ChunkTypeIndex
b.extend_from_slice(&[0, 0, 0]); // Placeholder for chunk length
// Add header
b.extend_from_slice(S2_INDEX_HEADER);
// Add total sizes
let mut tmp = [0u8; 8];
let n = write_varint(&mut tmp, uncomp_total);
b.extend_from_slice(&tmp[..n]);
let n = write_varint(&mut tmp, comp_total);
b.extend_from_slice(&tmp[..n]);
let n = write_varint(&mut tmp, self.est_block_uncomp);
b.extend_from_slice(&tmp[..n]);
let n = write_varint(&mut tmp, self.info.len() as i64);
b.extend_from_slice(&tmp[..n]);
// Check if we should add uncompressed offsets
let mut has_uncompressed = 0u8;
for (idx, info) in self.info.iter().enumerate() {
if idx == 0 {
if info.uncompressed_offset != 0 {
has_uncompressed = 1;
break;
}
continue;
}
if info.uncompressed_offset != self.info[idx - 1].uncompressed_offset + self.est_block_uncomp {
has_uncompressed = 1;
break;
}
}
b.push(has_uncompressed);
// Add uncompressed offsets if needed
if has_uncompressed == 1 {
for (idx, info) in self.info.iter().enumerate() {
let mut u_off = info.uncompressed_offset;
if idx > 0 {
let prev = &self.info[idx - 1];
u_off -= prev.uncompressed_offset + self.est_block_uncomp;
}
let n = write_varint(&mut tmp, u_off);
b.extend_from_slice(&tmp[..n]);
}
}
// Add compressed offsets
let mut c_predict = self.est_block_uncomp / 2;
for (idx, info) in self.info.iter().enumerate() {
let mut c_off = info.compressed_offset;
if idx > 0 {
let prev = &self.info[idx - 1];
c_off -= prev.compressed_offset + c_predict;
c_predict += c_off / 2;
}
let n = write_varint(&mut tmp, c_off);
b.extend_from_slice(&tmp[..n]);
}
// Add total size and trailer
let total_size = (b.len() - init_size + 4 + S2_INDEX_TRAILER.len()) as u32;
b.extend_from_slice(&total_size.to_le_bytes());
b.extend_from_slice(S2_INDEX_TRAILER);
// Update chunk length
let chunk_len = b.len() - init_size - 4;
b[init_size + 1] = chunk_len as u8;
b[init_size + 2] = (chunk_len >> 8) as u8;
b[init_size + 3] = (chunk_len >> 16) as u8;
}
pub fn load<'a>(&mut self, mut b: &'a [u8]) -> io::Result<&'a [u8]> {
if b.len() <= 4 + S2_INDEX_HEADER.len() + S2_INDEX_TRAILER.len() {
return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "buffer too small"));
}
if b[0] != 0x50 || b[1] != 0x2A || b[2] != 0x4D || b[3] != 0x18 {
return Err(io::Error::other("invalid chunk type"));
}
let chunk_len = (b[1] as usize) | ((b[2] as usize) << 8) | ((b[3] as usize) << 16);
b = &b[4..];
if b.len() < chunk_len {
return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "buffer too small"));
}
if !b.starts_with(S2_INDEX_HEADER) {
return Err(io::Error::other("invalid header"));
}
b = &b[S2_INDEX_HEADER.len()..];
// Read total uncompressed
let (v, n) = read_varint(b)?;
if v < 0 {
return Err(io::Error::other("invalid uncompressed size"));
}
self.total_uncompressed = v;
b = &b[n..];
// Read total compressed
let (v, n) = read_varint(b)?;
if v < 0 {
return Err(io::Error::other("invalid compressed size"));
}
self.total_compressed = v;
b = &b[n..];
// Read est block uncomp
let (v, n) = read_varint(b)?;
if v < 0 {
return Err(io::Error::other("invalid block size"));
}
self.est_block_uncomp = v;
b = &b[n..];
// Read number of entries
let (v, n) = read_varint(b)?;
if v < 0 || v > MAX_INDEX_ENTRIES as i64 {
return Err(io::Error::other("invalid number of entries"));
}
let entries = v as usize;
b = &b[n..];
self.alloc_infos(entries);
if b.is_empty() {
return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "buffer too small"));
}
let has_uncompressed = b[0];
b = &b[1..];
if has_uncompressed & 1 != has_uncompressed {
return Err(io::Error::other("invalid uncompressed flag"));
}
// Read uncompressed offsets
for idx in 0..entries {
let mut u_off = 0i64;
if has_uncompressed != 0 {
let (v, n) = read_varint(b)?;
u_off = v;
b = &b[n..];
}
if idx > 0 {
let prev = self.info[idx - 1].uncompressed_offset;
u_off += prev + self.est_block_uncomp;
if u_off <= prev {
return Err(io::Error::other("invalid offset"));
}
}
if u_off < 0 {
return Err(io::Error::other("negative offset"));
}
self.info[idx].uncompressed_offset = u_off;
}
// Read compressed offsets
let mut c_predict = self.est_block_uncomp / 2;
for idx in 0..entries {
let (v, n) = read_varint(b)?;
let mut c_off = v;
b = &b[n..];
if idx > 0 {
c_predict += c_off / 2;
let prev = self.info[idx - 1].compressed_offset;
c_off += prev + c_predict;
if c_off <= prev {
return Err(io::Error::other("invalid offset"));
}
}
if c_off < 0 {
return Err(io::Error::other("negative offset"));
}
self.info[idx].compressed_offset = c_off;
}
if b.len() < 4 + S2_INDEX_TRAILER.len() {
return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "buffer too small"));
}
// Skip size
b = &b[4..];
// Check trailer
if !b.starts_with(S2_INDEX_TRAILER) {
return Err(io::Error::other("invalid trailer"));
}
Ok(&b[S2_INDEX_TRAILER.len()..])
}
pub fn load_stream<R: Read + Seek>(&mut self, mut rs: R) -> io::Result<()> {
// Go to end
rs.seek(SeekFrom::End(-10))?;
let mut tmp = [0u8; 10];
rs.read_exact(&mut tmp)?;
// Check trailer
if &tmp[4..4 + S2_INDEX_TRAILER.len()] != S2_INDEX_TRAILER {
return Err(io::Error::other("invalid trailer"));
}
let sz = u32::from_le_bytes(tmp[..4].try_into().unwrap());
if sz > 0x7fffffff {
return Err(io::Error::other("size too large"));
}
rs.seek(SeekFrom::End(-(sz as i64)))?;
let mut buf = vec![0u8; sz as usize];
rs.read_exact(&mut buf)?;
self.load(&buf)?;
Ok(())
}
pub fn to_json(&self) -> serde_json::Result<Vec<u8>> {
#[derive(Serialize)]
struct Offset {
compressed: i64,
uncompressed: i64,
}
#[derive(Serialize)]
struct IndexJson {
total_uncompressed: i64,
total_compressed: i64,
offsets: Vec<Offset>,
est_block_uncompressed: i64,
}
let json = IndexJson {
total_uncompressed: self.total_uncompressed,
total_compressed: self.total_compressed,
offsets: self
.info
.iter()
.map(|info| Offset {
compressed: info.compressed_offset,
uncompressed: info.uncompressed_offset,
})
.collect(),
est_block_uncompressed: self.est_block_uncomp,
};
serde_json::to_vec_pretty(&json)
}
}
// Helper functions for varint encoding/decoding
fn write_varint(buf: &mut [u8], mut v: i64) -> usize {
let mut n = 0;
while v >= 0x80 {
buf[n] = (v as u8) | 0x80;
v >>= 7;
n += 1;
}
buf[n] = v as u8;
n + 1
}
fn read_varint(buf: &[u8]) -> io::Result<(i64, usize)> {
let mut result = 0i64;
let mut shift = 0;
let mut n = 0;
while n < buf.len() {
let byte = buf[n];
n += 1;
result |= ((byte & 0x7F) as i64) << shift;
if byte < 0x80 {
return Ok((result, n));
}
shift += 7;
}
Err(io::Error::new(io::ErrorKind::UnexpectedEof, "unexpected EOF"))
}
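// For example, write_varint encodes 300 as [0xAC, 0x02]: each byte carries seven bits
// of the value (least-significant group first) and the high bit flags a continuation
// byte; read_varint reverses this and also reports how many bytes it consumed.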
// Helper functions for index header manipulation
#[allow(dead_code)]
pub fn remove_index_headers(b: &[u8]) -> Option<&[u8]> {
if b.len() < 4 + S2_INDEX_TRAILER.len() {
return None;
}
// Skip size
let b = &b[4..];
// Check trailer
if !b.starts_with(S2_INDEX_TRAILER) {
return None;
}
Some(&b[S2_INDEX_TRAILER.len()..])
}
#[allow(dead_code)]
pub fn restore_index_headers(in_data: &[u8]) -> Vec<u8> {
if in_data.is_empty() {
return Vec::new();
}
let mut b = Vec::with_capacity(4 + S2_INDEX_HEADER.len() + in_data.len() + S2_INDEX_TRAILER.len() + 4);
b.extend_from_slice(&[0x50, 0x2A, 0x4D, 0x18]);
b.extend_from_slice(S2_INDEX_HEADER);
b.extend_from_slice(in_data);
let total_size = (b.len() + 4 + S2_INDEX_TRAILER.len()) as u32;
b.extend_from_slice(&total_size.to_le_bytes());
b.extend_from_slice(S2_INDEX_TRAILER);
let chunk_len = b.len() - 4;
b[1] = chunk_len as u8;
b[2] = (chunk_len >> 8) as u8;
b[3] = (chunk_len >> 16) as u8;
b
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_index_new() {
let index = Index::new();
assert_eq!(index.total_uncompressed, -1);
assert_eq!(index.total_compressed, -1);
assert!(index.info.is_empty());
assert_eq!(index.est_block_uncomp, 0);
}
#[test]
fn test_index_add() -> io::Result<()> {
let mut index = Index::new();
// Add the first index entry
index.add(100, 1000)?;
assert_eq!(index.info.len(), 1);
assert_eq!(index.info[0].compressed_offset, 100);
assert_eq!(index.info[0].uncompressed_offset, 1000);
// Adding an entry with the same uncompressed offset updates the previous entry in place
index.add(200, 1000)?;
assert_eq!(index.info.len(), 1);
assert_eq!(index.info[0].compressed_offset, 200);
assert_eq!(index.info[0].uncompressed_offset, 1000);
// Add a new entry (the distance is large enough for it to be recorded)
index.add(300, 2000 + MIN_INDEX_DIST)?;
assert_eq!(index.info.len(), 2);
assert_eq!(index.info[1].compressed_offset, 300);
assert_eq!(index.info[1].uncompressed_offset, 2000 + MIN_INDEX_DIST);
Ok(())
}
#[test]
fn test_index_add_errors() {
let mut index = Index::new();
// Add an initial entry
index.add(100, 1000).unwrap();
// Adding a smaller uncompressed offset must fail
let err = index.add(200, 500).unwrap_err();
assert_eq!(err.kind(), io::ErrorKind::InvalidData);
// Adding a smaller compressed offset must fail
let err = index.add(50, 2000).unwrap_err();
assert_eq!(err.kind(), io::ErrorKind::InvalidData);
}
#[test]
fn test_index_find() -> io::Result<()> {
let mut index = Index::new();
index.total_uncompressed = 1000 + MIN_INDEX_DIST * 3;
index.total_compressed = 5000;
// Add test data spaced so the entries satisfy the MIN_INDEX_DIST requirement
index.add(100, 1000)?;
index.add(300, 1000 + MIN_INDEX_DIST)?;
index.add(500, 1000 + MIN_INDEX_DIST * 2)?;
// Look up an offset inside the first block
let (comp, uncomp) = index.find(1500)?;
assert_eq!(comp, 100);
assert_eq!(uncomp, 1000);
// Look up an exact boundary offset
let (comp, uncomp) = index.find(1000 + MIN_INDEX_DIST)?;
assert_eq!(comp, 300);
assert_eq!(uncomp, 1000 + MIN_INDEX_DIST);
// Look up the last entry
let (comp, uncomp) = index.find(1000 + MIN_INDEX_DIST * 2)?;
assert_eq!(comp, 500);
assert_eq!(uncomp, 1000 + MIN_INDEX_DIST * 2);
Ok(())
}
#[test]
fn test_index_find_errors() {
let mut index = Index::new();
index.total_uncompressed = 10000;
index.total_compressed = 5000;
// An uninitialized index must return an error
let uninit_index = Index::new();
let err = uninit_index.find(1000).unwrap_err();
assert_eq!(err.kind(), io::ErrorKind::Other);
// An out-of-range offset must return an error
let err = index.find(15000).unwrap_err();
assert_eq!(err.kind(), io::ErrorKind::UnexpectedEof);
// A negative offset on an empty index must also return an error
let err = match index.find(-1000) {
Ok(_) => panic!("should be error"),
Err(e) => e,
};
assert_eq!(err.kind(), io::ErrorKind::UnexpectedEof);
}
#[test]
fn test_index_reduce() {
let mut index = Index::new();
index.est_block_uncomp = MIN_INDEX_DIST;
// Add more than MAX_INDEX_ENTRIES entries, spaced to satisfy MIN_INDEX_DIST
for i in 0..MAX_INDEX_ENTRIES + 100 {
index.add(i as i64 * 100, i as i64 * MIN_INDEX_DIST).unwrap();
}
// Call reduce manually
index.reduce();
// Verify the number of entries was reduced
assert!(index.info.len() <= MAX_INDEX_ENTRIES);
}
#[test]
fn test_index_json() -> io::Result<()> {
let mut index = Index::new();
// Add some test data
index.add(100, 1000)?;
index.add(300, 2000 + MIN_INDEX_DIST)?;
// Serialize to JSON
let json = index.to_json().unwrap();
let json_str = String::from_utf8(json).unwrap();
println!("json_str: {}", json_str);
// Verify the JSON content
assert!(json_str.contains("\"compressed\": 100"));
assert!(json_str.contains("\"uncompressed\": 1000"));
assert!(json_str.contains("\"est_block_uncompressed\": 0"));
Ok(())
}
}

View File

@@ -0,0 +1,510 @@
use crate::compress_index::{Index, TryGetIndex};
use crate::{EtagResolvable, HashReaderDetector};
use crate::{HashReaderMut, Reader};
use pin_project_lite::pin_project;
use rustfs_utils::compress::{CompressionAlgorithm, compress_block, decompress_block};
use rustfs_utils::{put_uvarint, uvarint};
use std::cmp::min;
use std::io::{self};
use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::{AsyncRead, ReadBuf};
// use tracing::error;
const COMPRESS_TYPE_COMPRESSED: u8 = 0x00;
const COMPRESS_TYPE_UNCOMPRESSED: u8 = 0x01;
const COMPRESS_TYPE_END: u8 = 0xFF;
const DEFAULT_BLOCK_SIZE: usize = 1 << 20; // 1MB
const HEADER_LEN: usize = 8;
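// On the wire the stream is a sequence of blocks, each framed as an 8-byte header,
// a uvarint of the uncompressed length, and the (possibly compressed) payload;
// the exact header layout is documented on DecompressReader below.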
pin_project! {
#[derive(Debug)]
/// A reader wrapper that compresses data on the fly with the configured compression algorithm.
pub struct CompressReader<R> {
#[pin]
pub inner: R,
buffer: Vec<u8>,
pos: usize,
done: bool,
block_size: usize,
compression_algorithm: CompressionAlgorithm,
index: Index,
written: usize,
uncomp_written: usize,
temp_buffer: Vec<u8>,
temp_pos: usize,
}
}
impl<R> CompressReader<R>
where
R: Reader,
{
pub fn new(inner: R, compression_algorithm: CompressionAlgorithm) -> Self {
Self {
inner,
buffer: Vec::new(),
pos: 0,
done: false,
compression_algorithm,
block_size: DEFAULT_BLOCK_SIZE,
index: Index::new(),
written: 0,
uncomp_written: 0,
temp_buffer: Vec::with_capacity(DEFAULT_BLOCK_SIZE), // Pre-allocate capacity
temp_pos: 0,
}
}
/// Optional: allow users to customize block_size
pub fn with_block_size(inner: R, block_size: usize, compression_algorithm: CompressionAlgorithm) -> Self {
Self {
inner,
buffer: Vec::new(),
pos: 0,
done: false,
compression_algorithm,
block_size,
index: Index::new(),
written: 0,
uncomp_written: 0,
temp_buffer: Vec::with_capacity(block_size),
temp_pos: 0,
}
}
}
impl<R> TryGetIndex for CompressReader<R>
where
R: Reader,
{
fn try_get_index(&self) -> Option<&Index> {
Some(&self.index)
}
}
impl<R> AsyncRead for CompressReader<R>
where
R: AsyncRead + Unpin + Send + Sync,
{
fn poll_read(self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<io::Result<()>> {
let mut this = self.project();
// Copy from buffer first if available
if *this.pos < this.buffer.len() {
let to_copy = min(buf.remaining(), this.buffer.len() - *this.pos);
buf.put_slice(&this.buffer[*this.pos..*this.pos + to_copy]);
*this.pos += to_copy;
if *this.pos == this.buffer.len() {
this.buffer.clear();
*this.pos = 0;
}
return Poll::Ready(Ok(()));
}
if *this.done {
return Poll::Ready(Ok(()));
}
// Fill temporary buffer
while this.temp_buffer.len() < *this.block_size {
let remaining = *this.block_size - this.temp_buffer.len();
let mut temp = vec![0u8; remaining];
let mut temp_buf = ReadBuf::new(&mut temp);
match this.inner.as_mut().poll_read(cx, &mut temp_buf) {
Poll::Pending => {
if this.temp_buffer.is_empty() {
return Poll::Pending;
}
break;
}
Poll::Ready(Ok(())) => {
let n = temp_buf.filled().len();
if n == 0 {
if this.temp_buffer.is_empty() {
return Poll::Ready(Ok(()));
}
break;
}
this.temp_buffer.extend_from_slice(&temp[..n]);
}
Poll::Ready(Err(e)) => {
// error!("CompressReader poll_read: read inner error: {e}");
return Poll::Ready(Err(e));
}
}
}
// Process accumulated data
if !this.temp_buffer.is_empty() {
let uncompressed_data = &this.temp_buffer;
let out = build_compressed_block(uncompressed_data, *this.compression_algorithm);
*this.written += out.len();
*this.uncomp_written += uncompressed_data.len();
if let Err(e) = this.index.add(*this.written as i64, *this.uncomp_written as i64) {
// error!("CompressReader index add error: {e}");
return Poll::Ready(Err(e));
}
*this.buffer = out;
*this.pos = 0;
this.temp_buffer.truncate(0); // More efficient way to clear
let to_copy = min(buf.remaining(), this.buffer.len());
buf.put_slice(&this.buffer[..to_copy]);
*this.pos += to_copy;
if *this.pos == this.buffer.len() {
this.buffer.clear();
*this.pos = 0;
}
Poll::Ready(Ok(()))
} else {
Poll::Pending
}
}
}
impl<R> EtagResolvable for CompressReader<R>
where
R: EtagResolvable,
{
fn try_resolve_etag(&mut self) -> Option<String> {
self.inner.try_resolve_etag()
}
}
impl<R> HashReaderDetector for CompressReader<R>
where
R: HashReaderDetector,
{
fn is_hash_reader(&self) -> bool {
self.inner.is_hash_reader()
}
fn as_hash_reader_mut(&mut self) -> Option<&mut dyn HashReaderMut> {
self.inner.as_hash_reader_mut()
}
}
pin_project! {
/// A reader wrapper that decompresses data on the fly with the configured compression algorithm.
/// Header format:
/// - Byte 0: compression type (0x00 = compressed, 0x01 = uncompressed, 0xFF = end)
/// - Bytes 1-3: block body length, little-endian (the body is the uvarint-encoded uncompressed length followed by the payload)
/// - Bytes 4-7: CRC32 checksum of the uncompressed data, little-endian
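///
/// For example (illustrative values): a block whose uncompressed form is 1000 bytes
/// (uvarint 0xE8 0x07) and whose compressed payload is 300 bytes has a body length of
/// 302, so with a CRC32 of 0xAABBCCDD the header is
/// [0x00, 0x2E, 0x01, 0x00, 0xDD, 0xCC, 0xBB, 0xAA], followed by the 302-byte body.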
#[derive(Debug)]
pub struct DecompressReader<R> {
#[pin]
pub inner: R,
buffer: Vec<u8>,
buffer_pos: usize,
finished: bool,
// Fields for saving header read progress across polls
header_buf: [u8; 8],
header_read: usize,
header_done: bool,
// Fields for saving compressed block read progress across polls
compressed_buf: Option<Vec<u8>>,
compressed_read: usize,
compressed_len: usize,
compression_algorithm: CompressionAlgorithm,
}
}
impl<R> DecompressReader<R>
where
R: AsyncRead + Unpin + Send + Sync,
{
pub fn new(inner: R, compression_algorithm: CompressionAlgorithm) -> Self {
Self {
inner,
buffer: Vec::new(),
buffer_pos: 0,
finished: false,
header_buf: [0u8; 8],
header_read: 0,
header_done: false,
compressed_buf: None,
compressed_read: 0,
compressed_len: 0,
compression_algorithm,
}
}
}
impl<R> AsyncRead for DecompressReader<R>
where
R: AsyncRead + Unpin + Send + Sync,
{
fn poll_read(self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<io::Result<()>> {
let mut this = self.project();
// Copy from buffer first if available
if *this.buffer_pos < this.buffer.len() {
let to_copy = min(buf.remaining(), this.buffer.len() - *this.buffer_pos);
buf.put_slice(&this.buffer[*this.buffer_pos..*this.buffer_pos + to_copy]);
*this.buffer_pos += to_copy;
if *this.buffer_pos == this.buffer.len() {
this.buffer.clear();
*this.buffer_pos = 0;
}
return Poll::Ready(Ok(()));
}
if *this.finished {
return Poll::Ready(Ok(()));
}
// Read header
while !*this.header_done && *this.header_read < HEADER_LEN {
let mut temp = [0u8; HEADER_LEN];
let mut temp_buf = ReadBuf::new(&mut temp[0..HEADER_LEN - *this.header_read]);
match this.inner.as_mut().poll_read(cx, &mut temp_buf) {
Poll::Pending => return Poll::Pending,
Poll::Ready(Ok(())) => {
let n = temp_buf.filled().len();
if n == 0 {
break;
}
this.header_buf[*this.header_read..*this.header_read + n].copy_from_slice(&temp_buf.filled()[..n]);
*this.header_read += n;
}
Poll::Ready(Err(e)) => {
// error!("DecompressReader poll_read: read header error: {e}");
return Poll::Ready(Err(e));
}
}
if *this.header_read < HEADER_LEN {
return Poll::Pending;
}
}
if !*this.header_done && *this.header_read == 0 {
return Poll::Ready(Ok(()));
}
let typ = this.header_buf[0];
let len = (this.header_buf[1] as usize) | ((this.header_buf[2] as usize) << 8) | ((this.header_buf[3] as usize) << 16);
let crc = (this.header_buf[4] as u32)
| ((this.header_buf[5] as u32) << 8)
| ((this.header_buf[6] as u32) << 16)
| ((this.header_buf[7] as u32) << 24);
*this.header_read = 0;
*this.header_done = true;
if this.compressed_buf.is_none() {
*this.compressed_len = len;
*this.compressed_buf = Some(vec![0u8; *this.compressed_len]);
*this.compressed_read = 0;
}
let compressed_buf = this.compressed_buf.as_mut().unwrap();
while *this.compressed_read < *this.compressed_len {
let mut temp_buf = ReadBuf::new(&mut compressed_buf[*this.compressed_read..]);
match this.inner.as_mut().poll_read(cx, &mut temp_buf) {
Poll::Pending => return Poll::Pending,
Poll::Ready(Ok(())) => {
let n = temp_buf.filled().len();
if n == 0 {
break;
}
*this.compressed_read += n;
}
Poll::Ready(Err(e)) => {
// error!("DecompressReader poll_read: read compressed block error: {e}");
this.compressed_buf.take();
*this.compressed_read = 0;
*this.compressed_len = 0;
return Poll::Ready(Err(e));
}
}
}
// Parse the uvarint-encoded uncompressed length; bound the scan without slicing past the end of a short block.
let (uncompress_len, uvarint_len) = uvarint(&compressed_buf[..compressed_buf.len().min(16)]);
let compressed_data = &compressed_buf[uvarint_len as usize..];
let decompressed = if typ == COMPRESS_TYPE_COMPRESSED {
match decompress_block(compressed_data, *this.compression_algorithm) {
Ok(out) => out,
Err(e) => {
// error!("DecompressReader decompress_block error: {e}");
this.compressed_buf.take();
*this.compressed_read = 0;
*this.compressed_len = 0;
return Poll::Ready(Err(e));
}
}
} else if typ == COMPRESS_TYPE_UNCOMPRESSED {
compressed_data.to_vec()
} else if typ == COMPRESS_TYPE_END {
this.compressed_buf.take();
*this.compressed_read = 0;
*this.compressed_len = 0;
*this.finished = true;
return Poll::Ready(Ok(()));
} else {
// error!("DecompressReader unknown compression type: {typ}");
this.compressed_buf.take();
*this.compressed_read = 0;
*this.compressed_len = 0;
return Poll::Ready(Err(io::Error::new(io::ErrorKind::InvalidData, "Unknown compression type")));
};
if decompressed.len() != uncompress_len as usize {
// error!("DecompressReader decompressed length mismatch: {} != {}", decompressed.len(), uncompress_len);
this.compressed_buf.take();
*this.compressed_read = 0;
*this.compressed_len = 0;
return Poll::Ready(Err(io::Error::new(io::ErrorKind::InvalidData, "Decompressed length mismatch")));
}
let actual_crc = crc32fast::hash(&decompressed);
if actual_crc != crc {
// error!("DecompressReader CRC32 mismatch: actual {actual_crc} != expected {crc}");
this.compressed_buf.take();
*this.compressed_read = 0;
*this.compressed_len = 0;
return Poll::Ready(Err(io::Error::new(io::ErrorKind::InvalidData, "CRC32 mismatch")));
}
*this.buffer = decompressed;
*this.buffer_pos = 0;
this.compressed_buf.take();
*this.compressed_read = 0;
*this.compressed_len = 0;
*this.header_done = false;
let to_copy = min(buf.remaining(), this.buffer.len());
buf.put_slice(&this.buffer[..to_copy]);
*this.buffer_pos += to_copy;
if *this.buffer_pos == this.buffer.len() {
this.buffer.clear();
*this.buffer_pos = 0;
}
Poll::Ready(Ok(()))
}
}
impl<R> EtagResolvable for DecompressReader<R>
where
R: EtagResolvable,
{
fn try_resolve_etag(&mut self) -> Option<String> {
self.inner.try_resolve_etag()
}
}
impl<R> HashReaderDetector for DecompressReader<R>
where
R: HashReaderDetector,
{
fn is_hash_reader(&self) -> bool {
self.inner.is_hash_reader()
}
fn as_hash_reader_mut(&mut self) -> Option<&mut dyn HashReaderMut> {
self.inner.as_hash_reader_mut()
}
}
/// Build compressed block with header + uvarint + compressed data
fn build_compressed_block(uncompressed_data: &[u8], compression_algorithm: CompressionAlgorithm) -> Vec<u8> {
let crc = crc32fast::hash(uncompressed_data);
let compressed_data = compress_block(uncompressed_data, compression_algorithm);
let uncompressed_len = uncompressed_data.len();
let mut uncompressed_len_buf = [0u8; 10];
let int_len = put_uvarint(&mut uncompressed_len_buf[..], uncompressed_len as u64);
let len = compressed_data.len() + int_len;
let mut header = [0u8; HEADER_LEN];
header[0] = COMPRESS_TYPE_COMPRESSED;
header[1] = (len & 0xFF) as u8;
header[2] = ((len >> 8) & 0xFF) as u8;
header[3] = ((len >> 16) & 0xFF) as u8;
header[4] = (crc & 0xFF) as u8;
header[5] = ((crc >> 8) & 0xFF) as u8;
header[6] = ((crc >> 16) & 0xFF) as u8;
header[7] = ((crc >> 24) & 0xFF) as u8;
let mut out = Vec::with_capacity(len + HEADER_LEN);
out.extend_from_slice(&header);
out.extend_from_slice(&uncompressed_len_buf[..int_len]);
out.extend_from_slice(&compressed_data);
out
}
#[cfg(test)]
mod tests {
use crate::WarpReader;
use super::*;
use std::io::Cursor;
use tokio::io::{AsyncReadExt, BufReader};
#[tokio::test]
async fn test_compress_reader_basic() {
let data = b"hello world, hello world, hello world!";
let reader = Cursor::new(&data[..]);
let mut compress_reader = CompressReader::new(WarpReader::new(reader), CompressionAlgorithm::Gzip);
let mut compressed = Vec::new();
compress_reader.read_to_end(&mut compressed).await.unwrap();
// Unpack with DecompressReader
let mut decompress_reader = DecompressReader::new(Cursor::new(compressed.clone()), CompressionAlgorithm::Gzip);
let mut decompressed = Vec::new();
decompress_reader.read_to_end(&mut decompressed).await.unwrap();
assert_eq!(&decompressed, data);
}
#[tokio::test]
async fn test_compress_reader_basic_deflate() {
let data = b"hello world, hello world, hello world!";
let reader = BufReader::new(&data[..]);
let mut compress_reader = CompressReader::new(WarpReader::new(reader), CompressionAlgorithm::Deflate);
let mut compressed = Vec::new();
compress_reader.read_to_end(&mut compressed).await.unwrap();
// Unpack with DecompressReader
let mut decompress_reader = DecompressReader::new(Cursor::new(compressed.clone()), CompressionAlgorithm::Deflate);
let mut decompressed = Vec::new();
decompress_reader.read_to_end(&mut decompressed).await.unwrap();
assert_eq!(&decompressed, data);
}
#[tokio::test]
async fn test_compress_reader_empty() {
let data = b"";
let reader = BufReader::new(&data[..]);
let mut compress_reader = CompressReader::new(WarpReader::new(reader), CompressionAlgorithm::Gzip);
let mut compressed = Vec::new();
compress_reader.read_to_end(&mut compressed).await.unwrap();
let mut decompress_reader = DecompressReader::new(Cursor::new(compressed.clone()), CompressionAlgorithm::Gzip);
let mut decompressed = Vec::new();
decompress_reader.read_to_end(&mut decompressed).await.unwrap();
assert_eq!(&decompressed, data);
}
#[tokio::test]
async fn test_compress_reader_large() {
use rand::Rng;
// Generate 1MB of random bytes
let mut data = vec![0u8; 1024 * 1024];
rand::rng().fill(&mut data[..]);
let reader = Cursor::new(data.clone());
let mut compress_reader = CompressReader::new(WarpReader::new(reader), CompressionAlgorithm::Gzip);
let mut compressed = Vec::new();
compress_reader.read_to_end(&mut compressed).await.unwrap();
let mut decompress_reader = DecompressReader::new(Cursor::new(compressed.clone()), CompressionAlgorithm::Gzip);
let mut decompressed = Vec::new();
decompress_reader.read_to_end(&mut decompressed).await.unwrap();
assert_eq!(&decompressed, &data);
}
#[tokio::test]
async fn test_compress_reader_large_deflate() {
use rand::Rng;
// Generate roughly 3MB of random bytes
let mut data = vec![0u8; 1024 * 1024 * 3 + 512];
rand::rng().fill(&mut data[..]);
let reader = Cursor::new(data.clone());
let mut compress_reader = CompressReader::new(WarpReader::new(reader), CompressionAlgorithm::default());
let mut compressed = Vec::new();
compress_reader.read_to_end(&mut compressed).await.unwrap();
let mut decompress_reader = DecompressReader::new(Cursor::new(compressed.clone()), CompressionAlgorithm::default());
let mut decompressed = Vec::new();
decompress_reader.read_to_end(&mut decompressed).await.unwrap();
assert_eq!(&decompressed, &data);
}
}

View File

@@ -0,0 +1,436 @@
use crate::HashReaderDetector;
use crate::HashReaderMut;
use crate::compress_index::{Index, TryGetIndex};
use crate::{EtagResolvable, Reader};
use aes_gcm::aead::Aead;
use aes_gcm::{Aes256Gcm, KeyInit, Nonce};
use pin_project_lite::pin_project;
use rustfs_utils::{put_uvarint, put_uvarint_len};
use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::{AsyncRead, ReadBuf};
pin_project! {
/// A reader wrapper that encrypts data on the fly using AES-256-GCM.
/// This is a demonstration. For production, use a secure and audited crypto library.
#[derive(Debug)]
pub struct EncryptReader<R> {
#[pin]
pub inner: R,
key: [u8; 32], // AES-256-GCM key
nonce: [u8; 12], // 96-bit nonce for GCM
buffer: Vec<u8>,
buffer_pos: usize,
finished: bool,
}
}
impl<R> EncryptReader<R>
where
R: Reader,
{
pub fn new(inner: R, key: [u8; 32], nonce: [u8; 12]) -> Self {
Self {
inner,
key,
nonce,
buffer: Vec::new(),
buffer_pos: 0,
finished: false,
}
}
}
impl<R> AsyncRead for EncryptReader<R>
where
R: AsyncRead + Unpin + Send + Sync,
{
fn poll_read(self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<std::io::Result<()>> {
let mut this = self.project();
// Serve from buffer if any
if *this.buffer_pos < this.buffer.len() {
let to_copy = std::cmp::min(buf.remaining(), this.buffer.len() - *this.buffer_pos);
buf.put_slice(&this.buffer[*this.buffer_pos..*this.buffer_pos + to_copy]);
*this.buffer_pos += to_copy;
if *this.buffer_pos == this.buffer.len() {
this.buffer.clear();
*this.buffer_pos = 0;
}
return Poll::Ready(Ok(()));
}
if *this.finished {
return Poll::Ready(Ok(()));
}
// Read a fixed block size from inner
let block_size = 8 * 1024;
let mut temp = vec![0u8; block_size];
let mut temp_buf = ReadBuf::new(&mut temp);
match this.inner.as_mut().poll_read(cx, &mut temp_buf) {
Poll::Pending => Poll::Pending,
Poll::Ready(Ok(())) => {
let n = temp_buf.filled().len();
if n == 0 {
// EOF, write end header
let mut header = [0u8; 8];
header[0] = 0xFF; // type: end
*this.buffer = header.to_vec();
*this.buffer_pos = 0;
*this.finished = true;
let to_copy = std::cmp::min(buf.remaining(), this.buffer.len());
buf.put_slice(&this.buffer[..to_copy]);
*this.buffer_pos += to_copy;
Poll::Ready(Ok(()))
} else {
// Encrypt the chunk
let cipher = Aes256Gcm::new_from_slice(this.key).expect("key");
let nonce = Nonce::from_slice(this.nonce);
let plaintext = &temp_buf.filled()[..n];
let plaintext_len = plaintext.len();
let crc = crc32fast::hash(plaintext);
let ciphertext = cipher
.encrypt(nonce, plaintext)
.map_err(|e| std::io::Error::other(format!("encrypt error: {e}")))?;
let int_len = put_uvarint_len(plaintext_len as u64);
let clen = int_len + ciphertext.len() + 4;
// Header: 8 bytes
// 0: type (0 = encrypted, 0xFF = end)
// 1-3: body length (little endian u24) = uvarint(plaintext_len) + ciphertext length + 4 (the reader subtracts 4 for the CRC carried in this header)
// 4-7: CRC32 of the plaintext (little endian u32)
let mut header = [0u8; 8];
header[0] = 0x00; // 0 = encrypted
header[1] = (clen & 0xFF) as u8;
header[2] = ((clen >> 8) & 0xFF) as u8;
header[3] = ((clen >> 16) & 0xFF) as u8;
header[4] = (crc & 0xFF) as u8;
header[5] = ((crc >> 8) & 0xFF) as u8;
header[6] = ((crc >> 16) & 0xFF) as u8;
header[7] = ((crc >> 24) & 0xFF) as u8;
let mut out = Vec::with_capacity(8 + int_len + ciphertext.len());
out.extend_from_slice(&header);
let mut plaintext_len_buf = vec![0u8; int_len];
put_uvarint(&mut plaintext_len_buf, plaintext_len as u64);
out.extend_from_slice(&plaintext_len_buf);
out.extend_from_slice(&ciphertext);
*this.buffer = out;
*this.buffer_pos = 0;
let to_copy = std::cmp::min(buf.remaining(), this.buffer.len());
buf.put_slice(&this.buffer[..to_copy]);
*this.buffer_pos += to_copy;
Poll::Ready(Ok(()))
}
}
Poll::Ready(Err(e)) => Poll::Ready(Err(e)),
}
}
}
impl<R> EtagResolvable for EncryptReader<R>
where
R: EtagResolvable,
{
fn try_resolve_etag(&mut self) -> Option<String> {
self.inner.try_resolve_etag()
}
}
impl<R> HashReaderDetector for EncryptReader<R>
where
R: EtagResolvable + HashReaderDetector,
{
fn is_hash_reader(&self) -> bool {
self.inner.is_hash_reader()
}
fn as_hash_reader_mut(&mut self) -> Option<&mut dyn HashReaderMut> {
self.inner.as_hash_reader_mut()
}
}
impl<R> TryGetIndex for EncryptReader<R>
where
R: TryGetIndex,
{
fn try_get_index(&self) -> Option<&Index> {
self.inner.try_get_index()
}
}
pin_project! {
/// A reader wrapper that decrypts data on the fly using AES-256-GCM.
/// This is a demonstration. For production, use a secure and audited crypto library.
#[derive(Debug)]
pub struct DecryptReader<R> {
#[pin]
pub inner: R,
key: [u8; 32], // AES-256-GCM key
nonce: [u8; 12], // 96-bit nonce for GCM
buffer: Vec<u8>,
buffer_pos: usize,
finished: bool,
// For block framing
header_buf: [u8; 8],
header_read: usize,
header_done: bool,
ciphertext_buf: Option<Vec<u8>>,
ciphertext_read: usize,
ciphertext_len: usize,
}
}
impl<R> DecryptReader<R>
where
R: Reader,
{
pub fn new(inner: R, key: [u8; 32], nonce: [u8; 12]) -> Self {
Self {
inner,
key,
nonce,
buffer: Vec::new(),
buffer_pos: 0,
finished: false,
header_buf: [0u8; 8],
header_read: 0,
header_done: false,
ciphertext_buf: None,
ciphertext_read: 0,
ciphertext_len: 0,
}
}
}
impl<R> AsyncRead for DecryptReader<R>
where
R: AsyncRead + Unpin + Send + Sync,
{
fn poll_read(self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<std::io::Result<()>> {
let mut this = self.project();
// Serve from buffer if any
if *this.buffer_pos < this.buffer.len() {
let to_copy = std::cmp::min(buf.remaining(), this.buffer.len() - *this.buffer_pos);
buf.put_slice(&this.buffer[*this.buffer_pos..*this.buffer_pos + to_copy]);
*this.buffer_pos += to_copy;
if *this.buffer_pos == this.buffer.len() {
this.buffer.clear();
*this.buffer_pos = 0;
}
return Poll::Ready(Ok(()));
}
if *this.finished {
return Poll::Ready(Ok(()));
}
// Read header (8 bytes), support partial header read
while !*this.header_done && *this.header_read < 8 {
let mut temp = [0u8; 8];
let mut temp_buf = ReadBuf::new(&mut temp[0..8 - *this.header_read]);
match this.inner.as_mut().poll_read(cx, &mut temp_buf) {
Poll::Pending => return Poll::Pending,
Poll::Ready(Ok(())) => {
let n = temp_buf.filled().len();
if n == 0 {
break;
}
this.header_buf[*this.header_read..*this.header_read + n].copy_from_slice(&temp_buf.filled()[..n]);
*this.header_read += n;
}
Poll::Ready(Err(e)) => return Poll::Ready(Err(e)),
}
if *this.header_read < 8 {
return Poll::Pending;
}
}
if !*this.header_done && *this.header_read == 8 {
*this.header_done = true;
}
if !*this.header_done {
return Poll::Pending;
}
let typ = this.header_buf[0];
let len = (this.header_buf[1] as usize) | ((this.header_buf[2] as usize) << 8) | ((this.header_buf[3] as usize) << 16);
let crc = (this.header_buf[4] as u32)
| ((this.header_buf[5] as u32) << 8)
| ((this.header_buf[6] as u32) << 16)
| ((this.header_buf[7] as u32) << 24);
*this.header_read = 0;
*this.header_done = false;
if typ == 0xFF {
*this.finished = true;
return Poll::Ready(Ok(()));
}
// Read ciphertext block (len bytes), support partial read
if this.ciphertext_buf.is_none() {
*this.ciphertext_len = len - 4; // 4 bytes for CRC32
*this.ciphertext_buf = Some(vec![0u8; *this.ciphertext_len]);
*this.ciphertext_read = 0;
}
let ciphertext_buf = this.ciphertext_buf.as_mut().unwrap();
while *this.ciphertext_read < *this.ciphertext_len {
let mut temp_buf = ReadBuf::new(&mut ciphertext_buf[*this.ciphertext_read..]);
match this.inner.as_mut().poll_read(cx, &mut temp_buf) {
Poll::Pending => return Poll::Pending,
Poll::Ready(Ok(())) => {
let n = temp_buf.filled().len();
if n == 0 {
break;
}
*this.ciphertext_read += n;
}
Poll::Ready(Err(e)) => {
this.ciphertext_buf.take();
*this.ciphertext_read = 0;
*this.ciphertext_len = 0;
return Poll::Ready(Err(e));
}
}
}
if *this.ciphertext_read < *this.ciphertext_len {
return Poll::Pending;
}
// Parse uvarint for plaintext length
let (plaintext_len, uvarint_len) = rustfs_utils::uvarint(&ciphertext_buf[0..16]);
let ciphertext = &ciphertext_buf[uvarint_len as usize..];
// Decrypt
let cipher = Aes256Gcm::new_from_slice(this.key).expect("key");
let nonce = Nonce::from_slice(this.nonce);
let plaintext = cipher
.decrypt(nonce, ciphertext)
.map_err(|e| std::io::Error::other(format!("decrypt error: {e}")))?;
if plaintext.len() != plaintext_len as usize {
this.ciphertext_buf.take();
*this.ciphertext_read = 0;
*this.ciphertext_len = 0;
return Poll::Ready(Err(std::io::Error::other("Plaintext length mismatch")));
}
// CRC32 check
let actual_crc = crc32fast::hash(&plaintext);
if actual_crc != crc {
this.ciphertext_buf.take();
*this.ciphertext_read = 0;
*this.ciphertext_len = 0;
return Poll::Ready(Err(std::io::Error::other("CRC32 mismatch")));
}
*this.buffer = plaintext;
*this.buffer_pos = 0;
// Clear block state for next block
this.ciphertext_buf.take();
*this.ciphertext_read = 0;
*this.ciphertext_len = 0;
let to_copy = std::cmp::min(buf.remaining(), this.buffer.len());
buf.put_slice(&this.buffer[..to_copy]);
*this.buffer_pos += to_copy;
Poll::Ready(Ok(()))
}
}
impl<R> EtagResolvable for DecryptReader<R>
where
R: EtagResolvable,
{
fn try_resolve_etag(&mut self) -> Option<String> {
self.inner.try_resolve_etag()
}
}
impl<R> HashReaderDetector for DecryptReader<R>
where
R: EtagResolvable + HashReaderDetector,
{
fn is_hash_reader(&self) -> bool {
self.inner.is_hash_reader()
}
fn as_hash_reader_mut(&mut self) -> Option<&mut dyn HashReaderMut> {
self.inner.as_hash_reader_mut()
}
}
#[cfg(test)]
mod tests {
use std::io::Cursor;
use crate::WarpReader;
use super::*;
use rand::RngCore;
use tokio::io::{AsyncReadExt, BufReader};
#[tokio::test]
async fn test_encrypt_decrypt_reader_aes256gcm() {
let data = b"hello sse encrypt";
let mut key = [0u8; 32];
let mut nonce = [0u8; 12];
rand::rng().fill_bytes(&mut key);
rand::rng().fill_bytes(&mut nonce);
let reader = BufReader::new(&data[..]);
let encrypt_reader = EncryptReader::new(WarpReader::new(reader), key, nonce);
// Encrypt
let mut encrypt_reader = encrypt_reader;
let mut encrypted = Vec::new();
encrypt_reader.read_to_end(&mut encrypted).await.unwrap();
// Decrypt using DecryptReader
let reader = Cursor::new(encrypted.clone());
let decrypt_reader = DecryptReader::new(WarpReader::new(reader), key, nonce);
let mut decrypt_reader = decrypt_reader;
let mut decrypted = Vec::new();
decrypt_reader.read_to_end(&mut decrypted).await.unwrap();
assert_eq!(&decrypted, data);
}
#[tokio::test]
async fn test_decrypt_reader_only() {
// Encrypt some data first
let data = b"test decrypt only";
let mut key = [0u8; 32];
let mut nonce = [0u8; 12];
rand::rng().fill_bytes(&mut key);
rand::rng().fill_bytes(&mut nonce);
// Encrypt
let reader = BufReader::new(&data[..]);
let encrypt_reader = EncryptReader::new(WarpReader::new(reader), key, nonce);
let mut encrypt_reader = encrypt_reader;
let mut encrypted = Vec::new();
encrypt_reader.read_to_end(&mut encrypted).await.unwrap();
// Now test DecryptReader
let reader = Cursor::new(encrypted.clone());
let decrypt_reader = DecryptReader::new(WarpReader::new(reader), key, nonce);
let mut decrypt_reader = decrypt_reader;
let mut decrypted = Vec::new();
decrypt_reader.read_to_end(&mut decrypted).await.unwrap();
assert_eq!(&decrypted, data);
}
#[tokio::test]
async fn test_encrypt_decrypt_reader_large() {
use rand::Rng;
let size = 1024 * 1024;
let mut data = vec![0u8; size];
rand::rng().fill(&mut data[..]);
let mut key = [0u8; 32];
let mut nonce = [0u8; 12];
rand::rng().fill_bytes(&mut key);
rand::rng().fill_bytes(&mut nonce);
let reader = std::io::Cursor::new(data.clone());
let encrypt_reader = EncryptReader::new(WarpReader::new(reader), key, nonce);
let mut encrypt_reader = encrypt_reader;
let mut encrypted = Vec::new();
encrypt_reader.read_to_end(&mut encrypted).await.unwrap();
let reader = std::io::Cursor::new(encrypted.clone());
let decrypt_reader = DecryptReader::new(WarpReader::new(reader), key, nonce);
let mut decrypt_reader = decrypt_reader;
let mut decrypted = Vec::new();
decrypt_reader.read_to_end(&mut decrypted).await.unwrap();
assert_eq!(&decrypted, &data);
}
}

crates/rio/src/etag.rs Normal file
View File

@@ -0,0 +1,248 @@
/*!
# AsyncRead Wrapper Types with ETag Support
This module demonstrates a pattern for handling wrapped AsyncRead types where:
- Reader types contain the actual ETag capability
- Wrapper types need to be recursively unwrapped
- The system can handle arbitrary nesting like `CompressReader<EncryptReader<EtagReader<R>>>`
## Key Components
### Trait-Based Approach
The `EtagResolvable` trait provides a clean way to handle recursive unwrapping:
- Reader types implement it by returning their ETag directly
- Wrapper types implement it by delegating to their inner type
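For instance, a wrapper's implementation is typically a one-line delegation to its inner reader (a sketch mirroring the impls in this crate; `MyWrapper` is a stand-in name, not a type defined here):
```ignore
impl<R: EtagResolvable> EtagResolvable for MyWrapper<R> {
    fn try_resolve_etag(&mut self) -> Option<String> {
        self.inner.try_resolve_etag()
    }
}
```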
## Usage Examples
```rust
use rustfs_rio::{CompressReader, EtagReader, resolve_etag_generic};
use rustfs_rio::WarpReader;
use rustfs_utils::compress::CompressionAlgorithm;
use tokio::io::BufReader;
use std::io::Cursor;
// Direct usage with trait-based approach
let data = b"test data";
let reader = BufReader::new(Cursor::new(&data[..]));
let reader = Box::new(WarpReader::new(reader));
let etag_reader = EtagReader::new(reader, Some("test_etag".to_string()));
let mut reader = CompressReader::new(etag_reader, CompressionAlgorithm::Gzip);
let etag = resolve_etag_generic(&mut reader);
```
*/
#[cfg(test)]
mod tests {
use crate::{CompressReader, EncryptReader, EtagReader, HashReader};
use crate::{WarpReader, resolve_etag_generic};
use rustfs_utils::compress::CompressionAlgorithm;
use std::io::Cursor;
use tokio::io::BufReader;
#[test]
fn test_etag_reader_resolution() {
let data = b"test data";
let reader = BufReader::new(Cursor::new(&data[..]));
let reader = Box::new(WarpReader::new(reader));
let mut etag_reader = EtagReader::new(reader, Some("test_etag".to_string()));
// Test direct ETag resolution
assert_eq!(resolve_etag_generic(&mut etag_reader), Some("test_etag".to_string()));
}
#[test]
fn test_hash_reader_resolution() {
let data = b"test data";
let reader = BufReader::new(Cursor::new(&data[..]));
let reader = Box::new(WarpReader::new(reader));
let mut hash_reader =
HashReader::new(reader, data.len() as i64, data.len() as i64, Some("hash_etag".to_string()), false).unwrap();
// Test HashReader ETag resolution
assert_eq!(resolve_etag_generic(&mut hash_reader), Some("hash_etag".to_string()));
}
#[test]
fn test_compress_reader_delegation() {
let data = b"test data for compression";
let reader = BufReader::new(Cursor::new(&data[..]));
let reader = Box::new(WarpReader::new(reader));
let etag_reader = EtagReader::new(reader, Some("compress_etag".to_string()));
let mut compress_reader = CompressReader::new(etag_reader, CompressionAlgorithm::Gzip);
// Test that CompressReader delegates to inner EtagReader
assert_eq!(resolve_etag_generic(&mut compress_reader), Some("compress_etag".to_string()));
}
#[test]
fn test_encrypt_reader_delegation() {
let data = b"test data for encryption";
let reader = BufReader::new(Cursor::new(&data[..]));
let reader = Box::new(WarpReader::new(reader));
let etag_reader = EtagReader::new(reader, Some("encrypt_etag".to_string()));
let key = [0u8; 32];
let nonce = [0u8; 12];
let mut encrypt_reader = EncryptReader::new(etag_reader, key, nonce);
// Test that EncryptReader delegates to inner EtagReader
assert_eq!(resolve_etag_generic(&mut encrypt_reader), Some("encrypt_etag".to_string()));
}
#[test]
fn test_complex_nesting() {
let data = b"test data for complex nesting";
let reader = BufReader::new(Cursor::new(&data[..]));
let reader = Box::new(WarpReader::new(reader));
// Create a complex nested structure: CompressReader<EncryptReader<EtagReader<BufReader<Cursor>>>>
let etag_reader = EtagReader::new(reader, Some("nested_etag".to_string()));
let key = [0u8; 32];
let nonce = [0u8; 12];
let encrypt_reader = EncryptReader::new(etag_reader, key, nonce);
let mut compress_reader = CompressReader::new(encrypt_reader, CompressionAlgorithm::Gzip);
// Test that nested structure can resolve ETag
assert_eq!(resolve_etag_generic(&mut compress_reader), Some("nested_etag".to_string()));
}
#[test]
fn test_hash_reader_in_nested_structure() {
let data = b"test data for hash reader nesting";
let reader = BufReader::new(Cursor::new(&data[..]));
let reader = Box::new(WarpReader::new(reader));
// Create nested structure: CompressReader<HashReader<BufReader<Cursor>>>
let hash_reader =
HashReader::new(reader, data.len() as i64, data.len() as i64, Some("hash_nested_etag".to_string()), false).unwrap();
let mut compress_reader = CompressReader::new(hash_reader, CompressionAlgorithm::Deflate);
// Test that nested HashReader can be resolved
assert_eq!(resolve_etag_generic(&mut compress_reader), Some("hash_nested_etag".to_string()));
}
#[test]
fn test_comprehensive_etag_extraction() {
println!("🔍 Testing comprehensive ETag extraction with real reader types...");
// Test 1: Simple EtagReader
let data1 = b"simple test";
let reader1 = BufReader::new(Cursor::new(&data1[..]));
let reader1 = Box::new(WarpReader::new(reader1));
let mut etag_reader = EtagReader::new(reader1, Some("simple_etag".to_string()));
assert_eq!(resolve_etag_generic(&mut etag_reader), Some("simple_etag".to_string()));
// Test 2: HashReader with ETag
let data2 = b"hash test";
let reader2 = BufReader::new(Cursor::new(&data2[..]));
let reader2 = Box::new(WarpReader::new(reader2));
let mut hash_reader =
HashReader::new(reader2, data2.len() as i64, data2.len() as i64, Some("hash_etag".to_string()), false).unwrap();
assert_eq!(resolve_etag_generic(&mut hash_reader), Some("hash_etag".to_string()));
// Test 3: Single wrapper - CompressReader<EtagReader>
let data3 = b"compress test";
let reader3 = BufReader::new(Cursor::new(&data3[..]));
let reader3 = Box::new(WarpReader::new(reader3));
let etag_reader3 = EtagReader::new(reader3, Some("compress_wrapped_etag".to_string()));
let mut compress_reader = CompressReader::new(etag_reader3, CompressionAlgorithm::Zstd);
assert_eq!(resolve_etag_generic(&mut compress_reader), Some("compress_wrapped_etag".to_string()));
// Test 4: Double wrapper - CompressReader<EncryptReader<EtagReader>>
let data4 = b"double wrap test";
let reader4 = BufReader::new(Cursor::new(&data4[..]));
let reader4 = Box::new(WarpReader::new(reader4));
let etag_reader4 = EtagReader::new(reader4, Some("double_wrapped_etag".to_string()));
let key = [1u8; 32];
let nonce = [1u8; 12];
let encrypt_reader4 = EncryptReader::new(etag_reader4, key, nonce);
let mut compress_reader4 = CompressReader::new(encrypt_reader4, CompressionAlgorithm::Gzip);
assert_eq!(resolve_etag_generic(&mut compress_reader4), Some("double_wrapped_etag".to_string()));
println!("✅ All ETag extraction methods work correctly!");
println!("✅ Trait-based approach handles recursive unwrapping!");
println!("✅ Complex nesting patterns with real reader types are supported!");
}
#[test]
fn test_real_world_scenario() {
println!("🔍 Testing real-world ETag extraction scenario with actual reader types...");
// Simulate a real-world scenario where we have nested AsyncRead wrappers
// and need to extract ETag information from deeply nested structures
let data = b"Real world test data that might be compressed and encrypted";
let base_reader = BufReader::new(Cursor::new(&data[..]));
let base_reader = Box::new(WarpReader::new(base_reader));
// Create a complex nested structure that might occur in practice:
// CompressReader<EncryptReader<HashReader<BufReader<Cursor>>>>
let hash_reader = HashReader::new(
base_reader,
data.len() as i64,
data.len() as i64,
Some("real_world_etag".to_string()),
false,
)
.unwrap();
let key = [42u8; 32];
let nonce = [24u8; 12];
let encrypt_reader = EncryptReader::new(hash_reader, key, nonce);
let mut compress_reader = CompressReader::new(encrypt_reader, CompressionAlgorithm::Deflate);
// Extract ETag using our generic system
let extracted_etag = resolve_etag_generic(&mut compress_reader);
println!("📋 Extracted ETag: {:?}", extracted_etag);
assert_eq!(extracted_etag, Some("real_world_etag".to_string()));
// Test another complex nesting with EtagReader at the core
let data2 = b"Another real world scenario";
let base_reader2 = BufReader::new(Cursor::new(&data2[..]));
let base_reader2 = Box::new(WarpReader::new(base_reader2));
let etag_reader = EtagReader::new(base_reader2, Some("core_etag".to_string()));
let key2 = [99u8; 32];
let nonce2 = [88u8; 12];
let encrypt_reader2 = EncryptReader::new(etag_reader, key2, nonce2);
let mut compress_reader2 = CompressReader::new(encrypt_reader2, CompressionAlgorithm::Zstd);
let trait_etag = resolve_etag_generic(&mut compress_reader2);
println!("📋 Trait-based ETag: {:?}", trait_etag);
assert_eq!(trait_etag, Some("core_etag".to_string()));
println!("✅ Real-world scenario test passed!");
println!(" - Successfully extracted ETag from nested CompressReader<EncryptReader<HashReader<AsyncRead>>>");
println!(" - Successfully extracted ETag from nested CompressReader<EncryptReader<EtagReader<AsyncRead>>>");
println!(" - Trait-based approach works with real reader types");
println!(" - System handles arbitrary nesting depths with actual implementations");
}
#[test]
fn test_no_etag_scenarios() {
println!("🔍 Testing scenarios where no ETag is available...");
// Test with HashReader that has no etag
let data = b"no etag test";
let reader = BufReader::new(Cursor::new(&data[..]));
let reader = Box::new(WarpReader::new(reader));
let mut hash_reader_no_etag = HashReader::new(reader, data.len() as i64, data.len() as i64, None, false).unwrap();
assert_eq!(resolve_etag_generic(&mut hash_reader_no_etag), None);
// Test with EtagReader that has None etag
let data2 = b"no etag test 2";
let reader2 = BufReader::new(Cursor::new(&data2[..]));
let reader2 = Box::new(WarpReader::new(reader2));
let mut etag_reader_none = EtagReader::new(reader2, None);
assert_eq!(resolve_etag_generic(&mut etag_reader_none), None);
// Test nested structure with no ETag at the core
let data3 = b"nested no etag test";
let reader3 = BufReader::new(Cursor::new(&data3[..]));
let reader3 = Box::new(WarpReader::new(reader3));
let etag_reader3 = EtagReader::new(reader3, None);
let mut compress_reader3 = CompressReader::new(etag_reader3, CompressionAlgorithm::Gzip);
assert_eq!(resolve_etag_generic(&mut compress_reader3), None);
println!("✅ No ETag scenarios handled correctly!");
}
}

View File

@@ -0,0 +1,229 @@
use crate::compress_index::{Index, TryGetIndex};
use crate::{EtagResolvable, HashReaderDetector, HashReaderMut, Reader};
use md5::{Digest, Md5};
use pin_project_lite::pin_project;
use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::{AsyncRead, ReadBuf};
pin_project! {
pub struct EtagReader {
#[pin]
pub inner: Box<dyn Reader>,
pub md5: Md5,
pub finished: bool,
pub checksum: Option<String>,
}
}
impl EtagReader {
pub fn new(inner: Box<dyn Reader>, checksum: Option<String>) -> Self {
Self {
inner,
md5: Md5::new(),
finished: false,
checksum,
}
}
/// Get the current md5 digest (etag) as a hex string.
/// The internal hasher is cloned on each call, so this can be invoked repeatedly;
/// once the stream has been fully read it always returns the same final etag.
pub fn get_etag(&mut self) -> String {
format!("{:x}", self.md5.clone().finalize())
}
}
impl AsyncRead for EtagReader {
fn poll_read(self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<std::io::Result<()>> {
let mut this = self.project();
let orig_filled = buf.filled().len();
let poll = this.inner.as_mut().poll_read(cx, buf);
if let Poll::Ready(Ok(())) = &poll {
let filled = &buf.filled()[orig_filled..];
if !filled.is_empty() {
this.md5.update(filled);
} else {
// EOF
*this.finished = true;
if let Some(checksum) = this.checksum {
let etag = format!("{:x}", this.md5.clone().finalize());
if *checksum != etag {
return Poll::Ready(Err(std::io::Error::new(std::io::ErrorKind::InvalidData, "Checksum mismatch")));
}
}
}
}
poll
}
}
impl EtagResolvable for EtagReader {
fn is_etag_reader(&self) -> bool {
true
}
fn try_resolve_etag(&mut self) -> Option<String> {
// EtagReader provides its own etag, not delegating to inner
if let Some(checksum) = &self.checksum {
Some(checksum.clone())
} else if self.finished {
Some(self.get_etag())
} else {
None
}
}
}
impl HashReaderDetector for EtagReader {
fn is_hash_reader(&self) -> bool {
self.inner.is_hash_reader()
}
fn as_hash_reader_mut(&mut self) -> Option<&mut dyn HashReaderMut> {
self.inner.as_hash_reader_mut()
}
}
impl TryGetIndex for EtagReader {
fn try_get_index(&self) -> Option<&Index> {
self.inner.try_get_index()
}
}
#[cfg(test)]
mod tests {
use crate::WarpReader;
use super::*;
use std::io::Cursor;
use tokio::io::{AsyncReadExt, BufReader};
#[tokio::test]
async fn test_etag_reader_basic() {
let data = b"hello world";
let mut hasher = Md5::new();
hasher.update(data);
let expected = format!("{:x}", hasher.finalize());
let reader = BufReader::new(&data[..]);
let reader = Box::new(WarpReader::new(reader));
let mut etag_reader = EtagReader::new(reader, None);
let mut buf = Vec::new();
let n = etag_reader.read_to_end(&mut buf).await.unwrap();
assert_eq!(n, data.len());
assert_eq!(&buf, data);
let etag = etag_reader.try_resolve_etag();
assert_eq!(etag, Some(expected));
}
#[tokio::test]
async fn test_etag_reader_empty() {
let data = b"";
let mut hasher = Md5::new();
hasher.update(data);
let expected = format!("{:x}", hasher.finalize());
let reader = BufReader::new(&data[..]);
let reader = Box::new(WarpReader::new(reader));
let mut etag_reader = EtagReader::new(reader, None);
let mut buf = Vec::new();
let n = etag_reader.read_to_end(&mut buf).await.unwrap();
assert_eq!(n, 0);
assert!(buf.is_empty());
let etag = etag_reader.try_resolve_etag();
assert_eq!(etag, Some(expected));
}
#[tokio::test]
async fn test_etag_reader_multiple_get() {
let data = b"abc123";
let mut hasher = Md5::new();
hasher.update(data);
let expected = format!("{:x}", hasher.finalize());
let reader = BufReader::new(&data[..]);
let reader = Box::new(WarpReader::new(reader));
let mut etag_reader = EtagReader::new(reader, None);
let mut buf = Vec::new();
let _ = etag_reader.read_to_end(&mut buf).await.unwrap();
// Call etag multiple times, should always return the same result
let etag1 = { etag_reader.try_resolve_etag() };
let etag2 = { etag_reader.try_resolve_etag() };
assert_eq!(etag1, Some(expected.clone()));
assert_eq!(etag2, Some(expected.clone()));
}
#[tokio::test]
async fn test_etag_reader_not_finished() {
let data = b"abc123";
let reader = BufReader::new(&data[..]);
let reader = Box::new(WarpReader::new(reader));
let mut etag_reader = EtagReader::new(reader, None);
// Do not read to end, etag should be None
let mut buf = [0u8; 2];
let _ = etag_reader.read(&mut buf).await.unwrap();
assert_eq!(etag_reader.try_resolve_etag(), None);
}
#[tokio::test]
async fn test_etag_reader_large_data() {
use rand::Rng;
// Generate 3MB random data
let size = 3 * 1024 * 1024;
let mut data = vec![0u8; size];
rand::rng().fill(&mut data[..]);
let mut hasher = Md5::new();
hasher.update(&data);
let cloned_data = data.clone();
let expected = format!("{:x}", hasher.finalize());
let reader = Cursor::new(data.clone());
let reader = Box::new(WarpReader::new(reader));
let mut etag_reader = EtagReader::new(reader, None);
let mut buf = Vec::new();
let n = etag_reader.read_to_end(&mut buf).await.unwrap();
assert_eq!(n, size);
assert_eq!(&buf, &cloned_data);
let etag = etag_reader.try_resolve_etag();
assert_eq!(etag, Some(expected));
}
#[tokio::test]
async fn test_etag_reader_checksum_match() {
let data = b"checksum test data";
let mut hasher = Md5::new();
hasher.update(data);
let expected = format!("{:x}", hasher.finalize());
let reader = BufReader::new(&data[..]);
let reader = Box::new(WarpReader::new(reader));
let mut etag_reader = EtagReader::new(reader, Some(expected.clone()));
let mut buf = Vec::new();
let n = etag_reader.read_to_end(&mut buf).await.unwrap();
assert_eq!(n, data.len());
assert_eq!(&buf, data);
// Checksum matches, so the etag should equal expected
assert_eq!(etag_reader.try_resolve_etag(), Some(expected));
}
#[tokio::test]
async fn test_etag_reader_checksum_mismatch() {
let data = b"checksum test data";
let wrong_checksum = "deadbeefdeadbeefdeadbeefdeadbeef".to_string();
let reader = BufReader::new(&data[..]);
let reader = Box::new(WarpReader::new(reader));
let mut etag_reader = EtagReader::new(reader, Some(wrong_checksum));
let mut buf = Vec::new();
// A checksum mismatch should surface as an InvalidData error
let err = etag_reader.read_to_end(&mut buf).await.unwrap_err();
assert_eq!(err.kind(), std::io::ErrorKind::InvalidData);
}
}

View File

@@ -0,0 +1,141 @@
use crate::compress_index::{Index, TryGetIndex};
use crate::{EtagResolvable, HashReaderDetector, HashReaderMut, Reader};
use pin_project_lite::pin_project;
use std::io::{Error, Result};
use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::{AsyncRead, ReadBuf};
pin_project! {
pub struct HardLimitReader {
#[pin]
pub inner: Box<dyn Reader>,
remaining: i64,
}
}
impl HardLimitReader {
pub fn new(inner: Box<dyn Reader>, limit: i64) -> Self {
HardLimitReader { inner, remaining: limit }
}
}
impl AsyncRead for HardLimitReader {
fn poll_read(mut self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<Result<()>> {
if self.remaining < 0 {
return Poll::Ready(Err(Error::other("input provided more bytes than specified")));
}
// Save the initial length
let before = buf.filled().len();
// Poll the inner reader
let this = self.as_mut().project();
let poll = this.inner.poll_read(cx, buf);
if let Poll::Ready(Ok(())) = &poll {
let after = buf.filled().len();
let read = (after - before) as i64;
self.remaining -= read;
if self.remaining < 0 {
return Poll::Ready(Err(Error::other("input provided more bytes than specified")));
}
}
poll
}
}
impl EtagResolvable for HardLimitReader {
fn try_resolve_etag(&mut self) -> Option<String> {
self.inner.try_resolve_etag()
}
}
impl HashReaderDetector for HardLimitReader {
fn is_hash_reader(&self) -> bool {
self.inner.is_hash_reader()
}
fn as_hash_reader_mut(&mut self) -> Option<&mut dyn HashReaderMut> {
self.inner.as_hash_reader_mut()
}
}
impl TryGetIndex for HardLimitReader {
fn try_get_index(&self) -> Option<&Index> {
self.inner.try_get_index()
}
}
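For orientation, a minimal sketch of composing `HardLimitReader` with the crate's `WarpReader` adapter (both appear elsewhere in this diff); any read past the configured byte limit surfaces as an `ErrorKind::Other` error:

```rust
// Sketch only: enforce a 4-byte upper bound on an in-memory reader.
use rustfs_rio::{HardLimitReader, WarpReader};
use std::io::Cursor;
use tokio::io::AsyncReadExt;

#[tokio::main]
async fn main() {
    let inner = Box::new(WarpReader::new(Cursor::new(b"hello".to_vec())));
    let mut limited = HardLimitReader::new(inner, 4);
    let mut buf = Vec::new();
    // The source holds 5 bytes, one more than the limit, so the read fails.
    assert!(limited.read_to_end(&mut buf).await.is_err());
}
```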
#[cfg(test)]
mod tests {
use std::vec;
use crate::WarpReader;
use super::*;
use rustfs_utils::read_full;
use tokio::io::{AsyncReadExt, BufReader};
#[tokio::test]
async fn test_hardlimit_reader_normal() {
let data = b"hello world";
let reader = BufReader::new(&data[..]);
let reader = Box::new(WarpReader::new(reader));
let hardlimit = HardLimitReader::new(reader, 20);
let mut r = hardlimit;
let mut buf = Vec::new();
let n = r.read_to_end(&mut buf).await.unwrap();
assert_eq!(n, data.len());
assert_eq!(&buf, data);
}
#[tokio::test]
async fn test_hardlimit_reader_exact_limit() {
let data = b"1234567890";
let reader = BufReader::new(&data[..]);
let reader = Box::new(WarpReader::new(reader));
let hardlimit = HardLimitReader::new(reader, 10);
let mut r = hardlimit;
let mut buf = Vec::new();
let n = r.read_to_end(&mut buf).await.unwrap();
assert_eq!(n, 10);
assert_eq!(&buf, data);
}
#[tokio::test]
async fn test_hardlimit_reader_exceed_limit() {
let data = b"abcdef";
let reader = BufReader::new(&data[..]);
let reader = Box::new(WarpReader::new(reader));
let hardlimit = HardLimitReader::new(reader, 3);
let mut r = hardlimit;
let mut buf = vec![0u8; 10];
// Reading past the limit should return an error
let err = match read_full(&mut r, &mut buf).await {
Ok(n) => {
println!("Read {} bytes", n);
assert_eq!(n, 3);
assert_eq!(&buf[..n], b"abc");
None
}
Err(e) => Some(e),
};
assert!(err.is_some());
let err = err.unwrap();
assert_eq!(err.kind(), std::io::ErrorKind::Other);
}
#[tokio::test]
async fn test_hardlimit_reader_empty() {
let data = b"";
let reader = BufReader::new(&data[..]);
let reader = Box::new(WarpReader::new(reader));
let hardlimit = HardLimitReader::new(reader, 5);
let mut r = hardlimit;
let mut buf = Vec::new();
let n = r.read_to_end(&mut buf).await.unwrap();
assert_eq!(n, 0);
assert_eq!(&buf, data);
}
}

View File

@@ -0,0 +1,582 @@
//! HashReader implementation with generic support
//!
//! This module provides a generic `HashReader<R>` that can wrap any type implementing
//! `AsyncRead + Unpin + Send + Sync + 'static + EtagResolvable`.
//!
//! ## Migration from the original Reader enum
//!
//! The original `HashReader::new` method that worked with the `Reader` enum
//! has been replaced with a generic approach. To preserve the original logic:
//!
//! ### Original logic (before generics):
//! ```ignore
//! // Original code would do:
//! // 1. Check if inner is already a HashReader
//! // 2. If size > 0, wrap with HardLimitReader
//! // 3. If !diskable_md5, wrap with EtagReader
//! // 4. Create HashReader with the wrapped reader
//!
//! let reader = HashReader::new(inner, size, actual_size, etag, diskable_md5)?;
//! ```
//!
//! ### New generic approach:
//! ```rust
//! use rustfs_rio::{HashReader, HardLimitReader, EtagReader};
//! use tokio::io::BufReader;
//! use std::io::Cursor;
//! use rustfs_rio::WarpReader;
//!
//! # tokio_test::block_on(async {
//! let data = b"hello world";
//! let reader = BufReader::new(Cursor::new(&data[..]));
//! let reader = Box::new(WarpReader::new(reader));
//! let size = data.len() as i64;
//! let actual_size = size;
//! let etag = None;
//! let diskable_md5 = false;
//!
//! // Method 1: Simple creation (recommended for most cases)
//! let hash_reader = HashReader::new(reader, size, actual_size, etag.clone(), diskable_md5).unwrap();
//!
//! // Method 2: With manual wrapping to recreate original logic
//! let reader2 = BufReader::new(Cursor::new(&data[..]));
//! let reader2 = Box::new(WarpReader::new(reader2));
//! let wrapped_reader: Box<dyn rustfs_rio::Reader> = if size > 0 {
//! if !diskable_md5 {
//! // Wrap with both HardLimitReader and EtagReader
//! let hard_limit = HardLimitReader::new(reader2, size);
//! Box::new(EtagReader::new(Box::new(hard_limit), etag.clone()))
//! } else {
//! // Only wrap with HardLimitReader
//! Box::new(HardLimitReader::new(reader2, size))
//! }
//! } else if !diskable_md5 {
//! // Only wrap with EtagReader
//! Box::new(EtagReader::new(reader2, etag.clone()))
//! } else {
//! // No wrapping needed
//! reader2
//! };
//! let hash_reader2 = HashReader::new(wrapped_reader, size, actual_size, etag, diskable_md5).unwrap();
//! # });
//! ```
//!
//! ## HashReader Detection
//!
//! The `HashReaderDetector` trait allows detection of existing HashReader instances:
//!
//! ```rust
//! use rustfs_rio::{HashReader, HashReaderDetector};
//! use tokio::io::BufReader;
//! use std::io::Cursor;
//! use rustfs_rio::WarpReader;
//!
//! # tokio_test::block_on(async {
//! let data = b"test";
//! let reader = BufReader::new(Cursor::new(&data[..]));
//! let hash_reader = HashReader::new(Box::new(WarpReader::new(reader)), 4, 4, None, false).unwrap();
//!
//! // Check if a type is a HashReader
//! assert!(hash_reader.is_hash_reader());
//!
//! // Constructing another HashReader also goes through new()
//! let reader2 = BufReader::new(Cursor::new(&data[..]));
//! let result = HashReader::new(Box::new(WarpReader::new(reader2)), 4, 4, None, false);
//! assert!(result.is_ok());
//! # });
//! ```
use pin_project_lite::pin_project;
use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::{AsyncRead, ReadBuf};
use crate::compress_index::{Index, TryGetIndex};
use crate::{EtagReader, EtagResolvable, HardLimitReader, HashReaderDetector, Reader};
/// Trait for mutable operations on HashReader
pub trait HashReaderMut {
fn bytes_read(&self) -> u64;
fn checksum(&self) -> &Option<String>;
fn set_checksum(&mut self, checksum: Option<String>);
fn size(&self) -> i64;
fn set_size(&mut self, size: i64);
fn actual_size(&self) -> i64;
fn set_actual_size(&mut self, actual_size: i64);
}
pin_project! {
pub struct HashReader {
#[pin]
pub inner: Box<dyn Reader>,
pub size: i64,
checksum: Option<String>,
pub actual_size: i64,
pub diskable_md5: bool,
bytes_read: u64,
// TODO: content_hash
}
}
impl HashReader {
pub fn new(
mut inner: Box<dyn Reader>,
size: i64,
actual_size: i64,
md5: Option<String>,
diskable_md5: bool,
) -> std::io::Result<Self> {
// Check if it's already a HashReader and update its parameters
if let Some(existing_hash_reader) = inner.as_hash_reader_mut() {
if existing_hash_reader.bytes_read() > 0 {
return Err(std::io::Error::new(
std::io::ErrorKind::InvalidData,
"Cannot create HashReader from an already read HashReader",
));
}
if let Some(checksum) = existing_hash_reader.checksum() {
if let Some(ref md5) = md5 {
if checksum != md5 {
return Err(std::io::Error::new(std::io::ErrorKind::InvalidData, "HashReader checksum mismatch"));
}
}
}
if existing_hash_reader.size() > 0 && size > 0 && existing_hash_reader.size() != size {
return Err(std::io::Error::new(
std::io::ErrorKind::InvalidData,
format!("HashReader size mismatch: expected {}, got {}", existing_hash_reader.size(), size),
));
}
existing_hash_reader.set_checksum(md5.clone());
if existing_hash_reader.size() < 0 && size >= 0 {
existing_hash_reader.set_size(size);
}
if existing_hash_reader.actual_size() <= 0 && actual_size >= 0 {
existing_hash_reader.set_actual_size(actual_size);
}
return Ok(Self {
inner,
size,
checksum: md5,
actual_size,
diskable_md5,
bytes_read: 0,
});
}
if size > 0 {
let hr = HardLimitReader::new(inner, size);
inner = Box::new(hr);
if !diskable_md5 && !inner.is_hash_reader() {
let er = EtagReader::new(inner, md5.clone());
inner = Box::new(er);
}
} else if !diskable_md5 {
let er = EtagReader::new(inner, md5.clone());
inner = Box::new(er);
}
Ok(Self {
inner,
size,
checksum: md5,
actual_size,
diskable_md5,
bytes_read: 0,
})
}
/// Update HashReader parameters
pub fn update_params(&mut self, size: i64, actual_size: i64, etag: Option<String>) {
if self.size < 0 && size >= 0 {
self.size = size;
}
if self.actual_size <= 0 && actual_size > 0 {
self.actual_size = actual_size;
}
if etag.is_some() {
self.checksum = etag;
}
}
pub fn size(&self) -> i64 {
self.size
}
pub fn actual_size(&self) -> i64 {
self.actual_size
}
}
impl HashReaderMut for HashReader {
fn bytes_read(&self) -> u64 {
self.bytes_read
}
fn checksum(&self) -> &Option<String> {
&self.checksum
}
fn set_checksum(&mut self, checksum: Option<String>) {
self.checksum = checksum;
}
fn size(&self) -> i64 {
self.size
}
fn set_size(&mut self, size: i64) {
self.size = size;
}
fn actual_size(&self) -> i64 {
self.actual_size
}
fn set_actual_size(&mut self, actual_size: i64) {
self.actual_size = actual_size;
}
}
impl AsyncRead for HashReader {
fn poll_read(self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<std::io::Result<()>> {
let this = self.project();
let poll = this.inner.poll_read(cx, buf);
if let Poll::Ready(Ok(())) = &poll {
let filled = buf.filled().len();
*this.bytes_read += filled as u64;
if filled == 0 {
// EOF
// TODO: check content_hash
}
}
poll
}
}
impl EtagResolvable for HashReader {
fn try_resolve_etag(&mut self) -> Option<String> {
if self.diskable_md5 {
return None;
}
if let Some(etag) = self.inner.try_resolve_etag() {
return Some(etag);
}
// If no etag from inner and we have a stored checksum, return it
self.checksum.clone()
}
}
impl HashReaderDetector for HashReader {
fn is_hash_reader(&self) -> bool {
true
}
fn as_hash_reader_mut(&mut self) -> Option<&mut dyn HashReaderMut> {
Some(self)
}
}
impl TryGetIndex for HashReader {
fn try_get_index(&self) -> Option<&Index> {
self.inner.try_get_index()
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::{DecryptReader, WarpReader, encrypt_reader};
use std::io::Cursor;
use tokio::io::{AsyncReadExt, BufReader};
#[tokio::test]
async fn test_hashreader_wrapping_logic() {
let data = b"hello world";
let size = data.len() as i64;
let actual_size = size;
let etag = None;
// Test 1: Simple creation
let reader1 = BufReader::new(Cursor::new(&data[..]));
let reader1 = Box::new(WarpReader::new(reader1));
let hash_reader1 = HashReader::new(reader1, size, actual_size, etag.clone(), false).unwrap();
assert_eq!(hash_reader1.size(), size);
assert_eq!(hash_reader1.actual_size(), actual_size);
// Test 2: With HardLimitReader wrapping
let reader2 = BufReader::new(Cursor::new(&data[..]));
let reader2 = Box::new(WarpReader::new(reader2));
let hard_limit = HardLimitReader::new(reader2, size);
let hard_limit = Box::new(hard_limit);
let hash_reader2 = HashReader::new(hard_limit, size, actual_size, etag.clone(), false).unwrap();
assert_eq!(hash_reader2.size(), size);
assert_eq!(hash_reader2.actual_size(), actual_size);
// Test 3: With EtagReader wrapping
let reader3 = BufReader::new(Cursor::new(&data[..]));
let reader3 = Box::new(WarpReader::new(reader3));
let etag_reader = EtagReader::new(reader3, etag.clone());
let etag_reader = Box::new(etag_reader);
let hash_reader3 = HashReader::new(etag_reader, size, actual_size, etag.clone(), false).unwrap();
assert_eq!(hash_reader3.size(), size);
assert_eq!(hash_reader3.actual_size(), actual_size);
}
#[tokio::test]
async fn test_hashreader_etag_basic() {
let data = b"hello hashreader";
let reader = BufReader::new(Cursor::new(&data[..]));
let reader = Box::new(WarpReader::new(reader));
let mut hash_reader = HashReader::new(reader, data.len() as i64, data.len() as i64, None, false).unwrap();
let mut buf = Vec::new();
let _ = hash_reader.read_to_end(&mut buf).await.unwrap();
// Etag resolution is delegated to the internal wrapping readers; here we only
let _etag = hash_reader.try_resolve_etag();
// verify that try_resolve_etag() can be called without panicking
assert_eq!(buf, data);
}
#[tokio::test]
async fn test_hashreader_diskable_md5() {
let data = b"no etag";
let reader = BufReader::new(Cursor::new(&data[..]));
let reader = Box::new(WarpReader::new(reader));
let mut hash_reader = HashReader::new(reader, data.len() as i64, data.len() as i64, None, true).unwrap();
let mut buf = Vec::new();
let _ = hash_reader.read_to_end(&mut buf).await.unwrap();
// Etag should be None when diskable_md5 is true
let etag = hash_reader.try_resolve_etag();
assert!(etag.is_none());
assert_eq!(buf, data);
}
#[tokio::test]
async fn test_hashreader_new_logic() {
let data = b"test data";
let reader = BufReader::new(Cursor::new(&data[..]));
let reader = Box::new(WarpReader::new(reader));
// Create a HashReader first
let hash_reader =
HashReader::new(reader, data.len() as i64, data.len() as i64, Some("test_etag".to_string()), false).unwrap();
let hash_reader = Box::new(WarpReader::new(hash_reader));
// Now try to create another HashReader from the existing one using new
let result = HashReader::new(hash_reader, data.len() as i64, data.len() as i64, Some("test_etag".to_string()), false);
assert!(result.is_ok());
let final_reader = result.unwrap();
assert_eq!(final_reader.checksum, Some("test_etag".to_string()));
assert_eq!(final_reader.size(), data.len() as i64);
}
#[tokio::test]
async fn test_for_wrapping_readers() {
use crate::{CompressReader, DecompressReader};
use md5::{Digest, Md5};
use rand::Rng;
use rand::RngCore;
use rustfs_utils::compress::CompressionAlgorithm;
// Generate 1MB random data
let size = 1024 * 1024;
let mut data = vec![0u8; size];
rand::rng().fill(&mut data[..]);
let mut hasher = Md5::new();
hasher.update(&data);
let expected = format!("{:x}", hasher.finalize());
println!("expected: {}", expected);
let reader = Cursor::new(data.clone());
let reader = BufReader::new(reader);
// Enable compression for this test
let is_compress = true;
let size = data.len() as i64;
let actual_size = data.len() as i64;
let reader = Box::new(WarpReader::new(reader));
// Create the HashReader
let mut hr = HashReader::new(reader, size, actual_size, Some(expected.clone()), false).unwrap();
// If compression is enabled, compress the data first
let compressed_data = if is_compress {
let mut compressed_buf = Vec::new();
let compress_reader = CompressReader::new(hr, CompressionAlgorithm::Gzip);
let mut compress_reader = compress_reader;
compress_reader.read_to_end(&mut compressed_buf).await.unwrap();
println!("Original size: {}, Compressed size: {}", data.len(), compressed_buf.len());
compressed_buf
} else {
// Otherwise read the raw data directly
let mut buf = Vec::new();
hr.read_to_end(&mut buf).await.unwrap();
buf
};
let mut key = [0u8; 32];
let mut nonce = [0u8; 12];
rand::rng().fill_bytes(&mut key);
rand::rng().fill_bytes(&mut nonce);
let is_encrypt = true;
if is_encrypt {
// Encrypt the compressed data
let encrypt_reader = encrypt_reader::EncryptReader::new(WarpReader::new(Cursor::new(compressed_data)), key, nonce);
let mut encrypted_data = Vec::new();
let mut encrypt_reader = encrypt_reader;
encrypt_reader.read_to_end(&mut encrypted_data).await.unwrap();
println!("Encrypted size: {}", encrypted_data.len());
// Decrypt the data
let decrypt_reader = DecryptReader::new(WarpReader::new(Cursor::new(encrypted_data)), key, nonce);
let mut decrypt_reader = decrypt_reader;
let mut decrypted_data = Vec::new();
decrypt_reader.read_to_end(&mut decrypted_data).await.unwrap();
if is_compress {
// If compression was used, decompress as well
let decompress_reader =
DecompressReader::new(WarpReader::new(Cursor::new(decrypted_data)), CompressionAlgorithm::Gzip);
let mut decompress_reader = decompress_reader;
let mut final_data = Vec::new();
decompress_reader.read_to_end(&mut final_data).await.unwrap();
println!("Final decompressed size: {}", final_data.len());
assert_eq!(final_data.len() as i64, actual_size);
assert_eq!(&final_data, &data);
} else {
// Without compression, compare the decrypted data directly
assert_eq!(decrypted_data.len() as i64, actual_size);
assert_eq!(&decrypted_data, &data);
}
return;
}
// Without encryption, handle compression/decompression directly
if is_compress {
let decompress_reader =
DecompressReader::new(WarpReader::new(Cursor::new(compressed_data)), CompressionAlgorithm::Gzip);
let mut decompress_reader = decompress_reader;
let mut decompressed = Vec::new();
decompress_reader.read_to_end(&mut decompressed).await.unwrap();
assert_eq!(decompressed.len() as i64, actual_size);
assert_eq!(&decompressed, &data);
} else {
assert_eq!(compressed_data.len() as i64, actual_size);
assert_eq!(&compressed_data, &data);
}
// Verify the etag (note: compression changes the data, so etag verification may need adjusting)
println!(
"Test completed successfully with compression: {}, encryption: {}",
is_compress, is_encrypt
);
}
#[tokio::test]
async fn test_compression_with_compressible_data() {
use crate::{CompressReader, DecompressReader};
use rustfs_utils::compress::CompressionAlgorithm;
// Create highly compressible data (repeated pattern)
let pattern = b"Hello, World! This is a test pattern that should compress well. ";
let repeat_count = 16384; // 16K repetitions
let mut data = Vec::new();
for _ in 0..repeat_count {
data.extend_from_slice(pattern);
}
println!("Original data size: {} bytes", data.len());
let reader = BufReader::new(Cursor::new(data.clone()));
let reader = Box::new(WarpReader::new(reader));
let hash_reader = HashReader::new(reader, data.len() as i64, data.len() as i64, None, false).unwrap();
// Test compression
let compress_reader = CompressReader::new(hash_reader, CompressionAlgorithm::Gzip);
let mut compressed_data = Vec::new();
let mut compress_reader = compress_reader;
compress_reader.read_to_end(&mut compressed_data).await.unwrap();
println!("Compressed data size: {} bytes", compressed_data.len());
println!("Compression ratio: {:.2}%", (compressed_data.len() as f64 / data.len() as f64) * 100.0);
// Verify compression actually reduced size for this compressible data
assert!(compressed_data.len() < data.len(), "Compression should reduce size for repetitive data");
// Test decompression
let decompress_reader = DecompressReader::new(Cursor::new(compressed_data), CompressionAlgorithm::Gzip);
let mut decompressed_data = Vec::new();
let mut decompress_reader = decompress_reader;
decompress_reader.read_to_end(&mut decompressed_data).await.unwrap();
// Verify decompressed data matches original
assert_eq!(decompressed_data.len(), data.len());
assert_eq!(&decompressed_data, &data);
println!("Compression/decompression test passed successfully!");
}
#[tokio::test]
async fn test_compression_algorithms() {
use crate::{CompressReader, DecompressReader};
use rustfs_utils::compress::CompressionAlgorithm;
let data = b"This is test data for compression algorithm testing. ".repeat(1000);
println!("Testing with {} bytes of data", data.len());
let algorithms = vec![
CompressionAlgorithm::Gzip,
CompressionAlgorithm::Deflate,
CompressionAlgorithm::Zstd,
];
for algorithm in algorithms {
println!("\nTesting algorithm: {:?}", algorithm);
let reader = BufReader::new(Cursor::new(data.clone()));
let reader = Box::new(WarpReader::new(reader));
let hash_reader = HashReader::new(reader, data.len() as i64, data.len() as i64, None, false).unwrap();
// Compress
let compress_reader = CompressReader::new(hash_reader, algorithm);
let mut compressed_data = Vec::new();
let mut compress_reader = compress_reader;
compress_reader.read_to_end(&mut compressed_data).await.unwrap();
println!(
" Compressed size: {} bytes (ratio: {:.2}%)",
compressed_data.len(),
(compressed_data.len() as f64 / data.len() as f64) * 100.0
);
// Decompress
let decompress_reader = DecompressReader::new(Cursor::new(compressed_data), algorithm);
let mut decompressed_data = Vec::new();
let mut decompress_reader = decompress_reader;
decompress_reader.read_to_end(&mut decompressed_data).await.unwrap();
// Verify
assert_eq!(decompressed_data.len(), data.len());
assert_eq!(&decompressed_data, &data);
println!(" ✓ Algorithm {:?} test passed", algorithm);
}
}
}

View File

@@ -0,0 +1,423 @@
use bytes::Bytes;
use futures::{Stream, TryStreamExt as _};
use http::HeaderMap;
use pin_project_lite::pin_project;
use reqwest::{Client, Method, RequestBuilder};
use std::error::Error as _;
use std::io::{self, Error};
use std::ops::Not as _;
use std::pin::Pin;
use std::sync::LazyLock;
use std::task::{Context, Poll};
use tokio::io::{AsyncRead, AsyncWrite, ReadBuf};
use tokio::sync::mpsc;
use tokio_util::io::StreamReader;
use crate::{EtagResolvable, HashReaderDetector, HashReaderMut};
fn get_http_client() -> Client {
// Reuse the HTTP connection pool in the global `reqwest::Client` instance
// TODO: interact with load balancing?
static CLIENT: LazyLock<Client> = LazyLock::new(Client::new);
CLIENT.clone()
}
static HTTP_DEBUG_LOG: bool = false;
#[inline(always)]
fn http_debug_log(args: std::fmt::Arguments) {
if HTTP_DEBUG_LOG {
println!("{}", args);
}
}
macro_rules! http_log {
($($arg:tt)*) => {
http_debug_log(format_args!($($arg)*));
};
}
pin_project! {
pub struct HttpReader {
url:String,
method: Method,
headers: HeaderMap,
inner: StreamReader<Pin<Box<dyn Stream<Item=std::io::Result<Bytes>>+Send+Sync>>, Bytes>,
}
}
impl HttpReader {
pub async fn new(url: String, method: Method, headers: HeaderMap, body: Option<Vec<u8>>) -> io::Result<Self> {
// http_log!("[HttpReader::new] url: {url}, method: {method:?}, headers: {headers:?}");
Self::with_capacity(url, method, headers, body, 0).await
}
/// Create a new HttpReader from a URL. The request is performed immediately.
pub async fn with_capacity(
url: String,
method: Method,
headers: HeaderMap,
body: Option<Vec<u8>>,
_read_buf_size: usize,
) -> io::Result<Self> {
// http_log!(
// "[HttpReader::with_capacity] url: {url}, method: {method:?}, headers: {headers:?}, buf_size: {}",
// _read_buf_size
// );
// First, check if the connection is available (HEAD)
let client = get_http_client();
let head_resp = client.head(&url).headers(headers.clone()).send().await;
match head_resp {
Ok(resp) => {
http_log!("[HttpReader::new] HEAD status: {}", resp.status());
if !resp.status().is_success() {
return Err(Error::other(format!("HEAD failed: url: {}, status {}", url, resp.status())));
}
}
Err(e) => {
http_log!("[HttpReader::new] HEAD error: {e}");
return Err(Error::other(e.source().map(|s| s.to_string()).unwrap_or_else(|| e.to_string())));
}
}
let client = get_http_client();
let mut request: RequestBuilder = client.request(method.clone(), url.clone()).headers(headers.clone());
if let Some(body) = body {
request = request.body(body);
}
let resp = request
.send()
.await
.map_err(|e| Error::other(format!("HttpReader HTTP request error: {}", e)))?;
if resp.status().is_success().not() {
return Err(Error::other(format!(
"HttpReader HTTP request failed with non-200 status {}",
resp.status()
)));
}
let stream = resp
.bytes_stream()
.map_err(|e| Error::other(format!("HttpReader stream error: {}", e)));
Ok(Self {
inner: StreamReader::new(Box::pin(stream)),
url,
method,
headers,
})
}
pub fn url(&self) -> &str {
&self.url
}
pub fn method(&self) -> &Method {
&self.method
}
pub fn headers(&self) -> &HeaderMap {
&self.headers
}
}
impl AsyncRead for HttpReader {
fn poll_read(mut self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<std::io::Result<()>> {
// http_log!(
// "[HttpReader::poll_read] url: {}, method: {:?}, buf.remaining: {}",
// self.url,
// self.method,
// buf.remaining()
// );
// Read from the inner stream
Pin::new(&mut self.inner).poll_read(cx, buf)
}
}
impl EtagResolvable for HttpReader {
fn is_etag_reader(&self) -> bool {
false
}
fn try_resolve_etag(&mut self) -> Option<String> {
None
}
}
impl HashReaderDetector for HttpReader {
fn is_hash_reader(&self) -> bool {
false
}
fn as_hash_reader_mut(&mut self) -> Option<&mut dyn HashReaderMut> {
None
}
}
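As a hedged usage sketch (the URL and the `fetch` helper below are placeholders, not part of this diff): `HttpReader::new` issues the HEAD preflight and the streaming request up front, after which the body is consumed like any other `AsyncRead`:

```rust
// Sketch only: url is a hypothetical endpoint; fetch is an illustrative helper.
use http::HeaderMap;
use reqwest::Method;
use rustfs_rio::HttpReader;
use tokio::io::AsyncReadExt;

async fn fetch(url: &str) -> std::io::Result<Vec<u8>> {
    // The HEAD probe and the GET are both performed inside new().
    let mut reader = HttpReader::new(url.to_string(), Method::GET, HeaderMap::new(), None).await?;
    let mut body = Vec::new();
    reader.read_to_end(&mut body).await?;
    Ok(body)
}
```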
struct ReceiverStream {
receiver: mpsc::Receiver<Option<Bytes>>,
}
impl Stream for ReceiverStream {
type Item = Result<Bytes, std::io::Error>;
fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
let poll = Pin::new(&mut self.receiver).poll_recv(cx);
// match &poll {
// Poll::Ready(Some(Some(bytes))) => {
// // http_log!("[ReceiverStream] poll_next: got {} bytes", bytes.len());
// }
// Poll::Ready(Some(None)) => {
// // http_log!("[ReceiverStream] poll_next: sender shutdown");
// }
// Poll::Ready(None) => {
// // http_log!("[ReceiverStream] poll_next: channel closed");
// }
// Poll::Pending => {
// // http_log!("[ReceiverStream] poll_next: pending");
// }
// }
match poll {
Poll::Ready(Some(Some(bytes))) => Poll::Ready(Some(Ok(bytes))),
Poll::Ready(Some(None)) => Poll::Ready(None), // Sender shutdown
Poll::Ready(None) => Poll::Ready(None),
Poll::Pending => Poll::Pending,
}
}
}
pin_project! {
pub struct HttpWriter {
url:String,
method: Method,
headers: HeaderMap,
err_rx: tokio::sync::oneshot::Receiver<std::io::Error>,
sender: tokio::sync::mpsc::Sender<Option<Bytes>>,
handle: tokio::task::JoinHandle<std::io::Result<()>>,
finish:bool,
}
}
impl HttpWriter {
/// Create a new HttpWriter for the given URL. The HTTP request is performed in the background.
pub async fn new(url: String, method: Method, headers: HeaderMap) -> io::Result<Self> {
// http_log!("[HttpWriter::new] url: {url}, method: {method:?}, headers: {headers:?}");
let url_clone = url.clone();
let method_clone = method.clone();
let headers_clone = headers.clone();
// First, try to write empty data to check if writable
let client = get_http_client();
let resp = client.put(&url).headers(headers.clone()).body(Vec::new()).send().await;
match resp {
Ok(resp) => {
// http_log!("[HttpWriter::new] empty PUT status: {}", resp.status());
if !resp.status().is_success() {
return Err(Error::other(format!("Empty PUT failed: status {}", resp.status())));
}
}
Err(e) => {
// http_log!("[HttpWriter::new] empty PUT error: {e}");
return Err(Error::other(format!("Empty PUT failed: {e}")));
}
}
let (sender, receiver) = tokio::sync::mpsc::channel::<Option<Bytes>>(8);
let (err_tx, err_rx) = tokio::sync::oneshot::channel::<io::Error>();
let handle = tokio::spawn(async move {
let stream = ReceiverStream { receiver };
let body = reqwest::Body::wrap_stream(stream);
// http_log!(
// "[HttpWriter::spawn] sending HTTP request: url={url_clone}, method={method_clone:?}, headers={headers_clone:?}"
// );
let client = get_http_client();
let request = client
.request(method_clone, url_clone.clone())
.headers(headers_clone.clone())
.body(body);
// Hold the request until the shutdown signal is received
let response = request.send().await;
match response {
Ok(resp) => {
// http_log!("[HttpWriter::spawn] got response: status={}", resp.status());
if !resp.status().is_success() {
let _ = err_tx.send(Error::other(format!(
"HttpWriter HTTP request failed with non-200 status {}",
resp.status()
)));
return Err(Error::other(format!("HTTP request failed with non-200 status {}", resp.status())));
}
}
Err(e) => {
// http_log!("[HttpWriter::spawn] HTTP request error: {e}");
let _ = err_tx.send(Error::other(format!("HTTP request failed: {}", e)));
return Err(Error::other(format!("HTTP request failed: {}", e)));
}
}
// http_log!("[HttpWriter::spawn] HTTP request completed, exiting");
Ok(())
});
// http_log!("[HttpWriter::new] connection established successfully");
Ok(Self {
url,
method,
headers,
err_rx,
sender,
handle,
finish: false,
})
}
pub fn url(&self) -> &str {
&self.url
}
pub fn method(&self) -> &Method {
&self.method
}
pub fn headers(&self) -> &HeaderMap {
&self.headers
}
}
impl AsyncWrite for HttpWriter {
fn poll_write(mut self: Pin<&mut Self>, _cx: &mut Context<'_>, buf: &[u8]) -> Poll<io::Result<usize>> {
// http_log!(
// "[HttpWriter::poll_write] url: {}, method: {:?}, buf.len: {}",
// self.url,
// self.method,
// buf.len()
// );
if let Ok(e) = Pin::new(&mut self.err_rx).try_recv() {
return Poll::Ready(Err(e));
}
self.sender
.try_send(Some(Bytes::copy_from_slice(buf)))
.map_err(|e| Error::other(format!("HttpWriter send error: {}", e)))?;
Poll::Ready(Ok(buf.len()))
}
fn poll_flush(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Result<(), io::Error>> {
Poll::Ready(Ok(()))
}
fn poll_shutdown(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Result<(), io::Error>> {
// let url = self.url.clone();
// let method = self.method.clone();
if !self.finish {
// http_log!("[HttpWriter::poll_shutdown] url: {}, method: {:?}", url, method);
self.sender
.try_send(None)
.map_err(|e| Error::other(format!("HttpWriter shutdown error: {}", e)))?;
// http_log!(
// "[HttpWriter::poll_shutdown] sent shutdown signal to HTTP request, url: {}, method: {:?}",
// url,
// method
// );
self.finish = true;
}
// Wait for the HTTP request to complete
use futures::FutureExt;
match Pin::new(&mut self.get_mut().handle).poll_unpin(_cx) {
Poll::Ready(Ok(_)) => {
// http_log!(
// "[HttpWriter::poll_shutdown] HTTP request finished successfully, url: {}, method: {:?}",
// url,
// method
// );
}
Poll::Ready(Err(e)) => {
// http_log!("[HttpWriter::poll_shutdown] HTTP request failed: {e}, url: {}, method: {:?}", url, method);
return Poll::Ready(Err(Error::other(format!("HTTP request failed: {}", e))));
}
Poll::Pending => {
// http_log!("[HttpWriter::poll_shutdown] HTTP request pending, url: {}, method: {:?}", url, method);
return Poll::Pending;
}
}
Poll::Ready(Ok(()))
}
}
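And the writing side, again as a sketch with a placeholder URL and helper name: bytes written to `HttpWriter` are forwarded over the mpsc channel into the background request task, and `shutdown()` sends the end-of-stream marker and waits for that task to finish:

```rust
// Sketch only: url is a hypothetical endpoint; upload is an illustrative helper.
use http::HeaderMap;
use reqwest::Method;
use rustfs_rio::HttpWriter;
use tokio::io::AsyncWriteExt;

async fn upload(url: &str, payload: &[u8]) -> std::io::Result<()> {
    // new() performs the empty-body PUT probe and spawns the streaming request.
    let mut writer = HttpWriter::new(url.to_string(), Method::PUT, HeaderMap::new()).await?;
    writer.write_all(payload).await?;
    // shutdown() signals end-of-stream and waits for the background task.
    writer.shutdown().await
}
```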
// #[cfg(test)]
// mod tests {
// use super::*;
// use reqwest::Method;
// use std::vec;
// use tokio::io::{AsyncReadExt, AsyncWriteExt};
// #[tokio::test]
// async fn test_http_writer_err() {
// // Use a real local server for integration, or mockito for unit test
// // Here, we use the Go test server at 127.0.0.1:8081 (scripts/testfile.go)
// let url = "http://127.0.0.1:8081/testfile".to_string();
// let data = vec![42u8; 8];
// // Write
// // Add header X-Deny-Write = 1 to simulate a non-writable target
// let mut headers = HeaderMap::new();
// headers.insert("X-Deny-Write", "1".parse().unwrap());
// // Use the PUT method here
// let writer_result = HttpWriter::new(url.clone(), Method::PUT, headers).await;
// match writer_result {
// Ok(mut writer) => {
// // If construction succeeds, the subsequent write should fail
// let write_result = writer.write_all(&data).await;
// assert!(write_result.is_err(), "write_all should fail when server denies write");
// if let Err(e) = write_result {
// println!("write_all error: {e}");
// }
// let shutdown_result = writer.shutdown().await;
// if let Err(e) = shutdown_result {
// println!("shutdown error: {e}");
// }
// }
// Err(e) => {
// // Failing directly at construction time is also acceptable
// println!("HttpWriter::new error: {e}");
// assert!(
// e.to_string().contains("Empty PUT failed") || e.to_string().contains("Forbidden"),
// "unexpected error: {e}"
// );
// return;
// }
// }
// // Should not reach here
// panic!("HttpWriter should not allow writing when server denies write");
// }
// #[tokio::test]
// async fn test_http_writer_and_reader_ok() {
// // Use the local Go test server
// let url = "http://127.0.0.1:8081/testfile".to_string();
// let data = vec![99u8; 512 * 1024]; // 512KB of data
// // Write (without X-Deny-Write)
// let headers = HeaderMap::new();
// let mut writer = HttpWriter::new(url.clone(), Method::PUT, headers).await.unwrap();
// writer.write_all(&data).await.unwrap();
// writer.shutdown().await.unwrap();
// http_log!("Wrote {} bytes to {} (ok case)", data.len(), url);
// // Read back
// let mut reader = HttpReader::with_capacity(url.clone(), Method::GET, HeaderMap::new(), 8192)
// .await
// .unwrap();
// let mut buf = Vec::new();
// reader.read_to_end(&mut buf).await.unwrap();
// assert_eq!(buf, data);
// // println!("Read {} bytes from {} (ok case)", buf.len(), url);
// // tokio::time::sleep(std::time::Duration::from_secs(2)).await; // Wait for server to process
// // println!("[test_http_writer_and_reader_ok] completed successfully");
// }
// }

69
crates/rio/src/lib.rs Normal file
View File

@@ -0,0 +1,69 @@
mod limit_reader;
pub use limit_reader::LimitReader;
mod etag_reader;
pub use etag_reader::EtagReader;
mod compress_index;
mod compress_reader;
pub use compress_reader::{CompressReader, DecompressReader};
mod encrypt_reader;
pub use encrypt_reader::{DecryptReader, EncryptReader};
mod hardlimit_reader;
pub use hardlimit_reader::HardLimitReader;
mod hash_reader;
pub use hash_reader::*;
pub mod reader;
pub use reader::WarpReader;
mod writer;
pub use writer::*;
mod http_reader;
pub use http_reader::*;
pub use compress_index::TryGetIndex;
mod etag;
pub trait Reader: tokio::io::AsyncRead + Unpin + Send + Sync + EtagResolvable + HashReaderDetector + TryGetIndex {}
// Trait for types that can be recursively searched for etag capability
pub trait EtagResolvable {
fn is_etag_reader(&self) -> bool {
false
}
fn try_resolve_etag(&mut self) -> Option<String> {
None
}
}
// Generic function that can work with any EtagResolvable type
pub fn resolve_etag_generic<R>(reader: &mut R) -> Option<String>
where
R: EtagResolvable,
{
reader.try_resolve_etag()
}
/// Trait to detect and manipulate HashReader instances
pub trait HashReaderDetector {
fn is_hash_reader(&self) -> bool {
false
}
fn as_hash_reader_mut(&mut self) -> Option<&mut dyn HashReaderMut> {
None
}
}
impl Reader for crate::HashReader {}
impl Reader for crate::HardLimitReader {}
impl Reader for crate::EtagReader {}
impl<R> Reader for crate::CompressReader<R> where R: Reader {}
impl<R> Reader for crate::EncryptReader<R> where R: Reader {}
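To illustrate the trait surface (a sketch only; `ZeroReader` is a hypothetical type, not part of this diff): because `EtagResolvable`, `HashReaderDetector`, and `TryGetIndex` all provide default method bodies, a plain `AsyncRead` type needs nothing more than empty marker impls to qualify as a `Reader`:

```rust
// Sketch only: ZeroReader is a made-up type that immediately reports EOF.
use rustfs_rio::{EtagResolvable, HashReaderDetector, Reader, TryGetIndex};
use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::{AsyncRead, ReadBuf};

struct ZeroReader;

impl AsyncRead for ZeroReader {
    fn poll_read(self: Pin<&mut Self>, _cx: &mut Context<'_>, _buf: &mut ReadBuf<'_>) -> Poll<std::io::Result<()>> {
        Poll::Ready(Ok(())) // no bytes filled: EOF
    }
}

// The marker traits all have defaults, so empty impls are enough.
impl EtagResolvable for ZeroReader {}
impl HashReaderDetector for ZeroReader {}
impl TryGetIndex for ZeroReader {}
impl Reader for ZeroReader {}
```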

View File

@@ -0,0 +1,188 @@
//! LimitReader: a wrapper for AsyncRead that limits the total number of bytes read.
//!
//! # Example
//! ```
//! use tokio::io::{AsyncReadExt, BufReader};
//! use rustfs_rio::LimitReader;
//!
//! #[tokio::main]
//! async fn main() {
//! let data = b"hello world";
//! let reader = BufReader::new(&data[..]);
//! let mut limit_reader = LimitReader::new(reader, data.len());
//!
//! let mut buf = Vec::new();
//! let n = limit_reader.read_to_end(&mut buf).await.unwrap();
//! assert_eq!(n, data.len());
//! assert_eq!(&buf, data);
//! }
//! ```
use pin_project_lite::pin_project;
use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::{AsyncRead, ReadBuf};
use crate::{EtagResolvable, HashReaderDetector, HashReaderMut};
pin_project! {
#[derive(Debug)]
pub struct LimitReader<R> {
#[pin]
pub inner: R,
limit: usize,
read: usize,
}
}
/// A wrapper for AsyncRead that limits the total number of bytes read.
impl<R> LimitReader<R>
where
R: AsyncRead + Unpin + Send + Sync,
{
/// Create a new LimitReader wrapping `inner`, with a total read limit of `limit` bytes.
pub fn new(inner: R, limit: usize) -> Self {
Self { inner, limit, read: 0 }
}
}
impl<R> AsyncRead for LimitReader<R>
where
R: AsyncRead + Unpin + Send + Sync,
{
fn poll_read(self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<std::io::Result<()>> {
let mut this = self.project();
let remaining = this.limit.saturating_sub(*this.read);
if remaining == 0 {
return Poll::Ready(Ok(()));
}
let orig_remaining = buf.remaining();
let allowed = remaining.min(orig_remaining);
if allowed == 0 {
return Poll::Ready(Ok(()));
}
if allowed == orig_remaining {
let before_size = buf.filled().len();
let poll = this.inner.as_mut().poll_read(cx, buf);
if let Poll::Ready(Ok(())) = &poll {
let n = buf.filled().len() - before_size;
*this.read += n;
}
poll
} else {
let mut temp = vec![0u8; allowed];
let mut temp_buf = ReadBuf::new(&mut temp);
let poll = this.inner.as_mut().poll_read(cx, &mut temp_buf);
if let Poll::Ready(Ok(())) = &poll {
let n = temp_buf.filled().len();
buf.put_slice(temp_buf.filled());
*this.read += n;
}
poll
}
}
}
impl<R> EtagResolvable for LimitReader<R>
where
R: EtagResolvable,
{
fn try_resolve_etag(&mut self) -> Option<String> {
self.inner.try_resolve_etag()
}
}
impl<R> HashReaderDetector for LimitReader<R>
where
R: HashReaderDetector,
{
fn is_hash_reader(&self) -> bool {
self.inner.is_hash_reader()
}
fn as_hash_reader_mut(&mut self) -> Option<&mut dyn HashReaderMut> {
self.inner.as_hash_reader_mut()
}
}
#[cfg(test)]
mod tests {
use std::io::Cursor;
use super::*;
use tokio::io::{AsyncReadExt, BufReader};
#[tokio::test]
async fn test_limit_reader_exact() {
let data = b"hello world";
let reader = BufReader::new(&data[..]);
let mut limit_reader = LimitReader::new(reader, data.len());
let mut buf = Vec::new();
let n = limit_reader.read_to_end(&mut buf).await.unwrap();
assert_eq!(n, data.len());
assert_eq!(&buf, data);
}
#[tokio::test]
async fn test_limit_reader_less_than_data() {
let data = b"hello world";
let reader = BufReader::new(&data[..]);
let mut limit_reader = LimitReader::new(reader, 5);
let mut buf = Vec::new();
let n = limit_reader.read_to_end(&mut buf).await.unwrap();
assert_eq!(n, 5);
assert_eq!(&buf, b"hello");
}
#[tokio::test]
async fn test_limit_reader_zero() {
let data = b"hello world";
let reader = BufReader::new(&data[..]);
let mut limit_reader = LimitReader::new(reader, 0);
let mut buf = Vec::new();
let n = limit_reader.read_to_end(&mut buf).await.unwrap();
assert_eq!(n, 0);
assert!(buf.is_empty());
}
#[tokio::test]
async fn test_limit_reader_multiple_reads() {
let data = b"abcdefghij";
let reader = BufReader::new(&data[..]);
let mut limit_reader = LimitReader::new(reader, 7);
let mut buf1 = [0u8; 3];
let n1 = limit_reader.read(&mut buf1).await.unwrap();
assert_eq!(n1, 3);
assert_eq!(&buf1, b"abc");
let mut buf2 = [0u8; 5];
let n2 = limit_reader.read(&mut buf2).await.unwrap();
assert_eq!(n2, 4);
assert_eq!(&buf2[..n2], b"defg");
let mut buf3 = [0u8; 2];
let n3 = limit_reader.read(&mut buf3).await.unwrap();
assert_eq!(n3, 0);
}
#[tokio::test]
async fn test_limit_reader_large_file() {
use rand::Rng;
// Generate a 3MB random byte array for testing
let size = 3 * 1024 * 1024;
let mut data = vec![0u8; size];
rand::rng().fill(&mut data[..]);
let reader = Cursor::new(data.clone());
let mut limit_reader = LimitReader::new(reader, size);
// Read data into buffer
let mut buf = Vec::new();
let n = limit_reader.read_to_end(&mut buf).await.unwrap();
assert_eq!(n, size);
assert_eq!(buf.len(), size);
assert_eq!(&buf, &data);
}
}

30
crates/rio/src/reader.rs Normal file
View File

@@ -0,0 +1,30 @@
use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::{AsyncRead, ReadBuf};
use crate::compress_index::TryGetIndex;
use crate::{EtagResolvable, HashReaderDetector, Reader};
pub struct WarpReader<R> {
inner: R,
}
impl<R: AsyncRead + Unpin + Send + Sync> WarpReader<R> {
pub fn new(inner: R) -> Self {
Self { inner }
}
}
impl<R: AsyncRead + Unpin + Send + Sync> AsyncRead for WarpReader<R> {
fn poll_read(mut self: Pin<&mut Self>, cx: &mut Context<'_>, buf: &mut ReadBuf<'_>) -> Poll<std::io::Result<()>> {
Pin::new(&mut self.inner).poll_read(cx, buf)
}
}
impl<R: AsyncRead + Unpin + Send + Sync> HashReaderDetector for WarpReader<R> {}
impl<R: AsyncRead + Unpin + Send + Sync> EtagResolvable for WarpReader<R> {}
impl<R: AsyncRead + Unpin + Send + Sync> TryGetIndex for WarpReader<R> {}
impl<R: AsyncRead + Unpin + Send + Sync> Reader for WarpReader<R> {}
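In practice `WarpReader` is the adapter that turns a plain tokio reader into the crate's boxed `Reader` object; its empty marker-trait impls simply fall back to the defaults (no etag, not a hash reader, no compression index). A minimal sketch, assuming only the items shown above:

```rust
// Sketch only: adapt an in-memory cursor into Box<dyn Reader>.
use rustfs_rio::{Reader, WarpReader};
use std::io::Cursor;

fn boxed(data: Vec<u8>) -> Box<dyn Reader> {
    Box::new(WarpReader::new(Cursor::new(data)))
}
```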

92
crates/rio/src/writer.rs Normal file
View File

@@ -0,0 +1,92 @@
use std::io::Cursor;
use std::pin::Pin;
use tokio::io::AsyncWrite;
use crate::HttpWriter;
pub enum Writer {
Cursor(Cursor<Vec<u8>>),
Http(HttpWriter),
Other(Box<dyn AsyncWrite + Unpin + Send + Sync>),
}
impl Writer {
/// Create a Writer::Other from any AsyncWrite + Unpin + Send type.
pub fn from_tokio_writer<W>(w: W) -> Self
where
W: AsyncWrite + Unpin + Send + Sync + 'static,
{
Writer::Other(Box::new(w))
}
pub fn from_cursor(w: Cursor<Vec<u8>>) -> Self {
Writer::Cursor(w)
}
pub fn from_http(w: HttpWriter) -> Self {
Writer::Http(w)
}
pub fn into_cursor_inner(self) -> Option<Vec<u8>> {
match self {
Writer::Cursor(w) => Some(w.into_inner()),
_ => None,
}
}
pub fn as_cursor(&mut self) -> Option<&mut Cursor<Vec<u8>>> {
match self {
Writer::Cursor(w) => Some(w),
_ => None,
}
}
pub fn as_http(&mut self) -> Option<&mut HttpWriter> {
match self {
Writer::Http(w) => Some(w),
_ => None,
}
}
pub fn into_http(self) -> Option<HttpWriter> {
match self {
Writer::Http(w) => Some(w),
_ => None,
}
}
pub fn into_cursor(self) -> Option<Cursor<Vec<u8>>> {
match self {
Writer::Cursor(w) => Some(w),
_ => None,
}
}
}
impl AsyncWrite for Writer {
fn poll_write(
self: std::pin::Pin<&mut Self>,
cx: &mut std::task::Context<'_>,
buf: &[u8],
) -> std::task::Poll<std::io::Result<usize>> {
match self.get_mut() {
Writer::Cursor(w) => Pin::new(w).poll_write(cx, buf),
Writer::Http(w) => Pin::new(w).poll_write(cx, buf),
Writer::Other(w) => Pin::new(w.as_mut()).poll_write(cx, buf),
}
}
fn poll_flush(self: std::pin::Pin<&mut Self>, cx: &mut std::task::Context<'_>) -> std::task::Poll<std::io::Result<()>> {
match self.get_mut() {
Writer::Cursor(w) => Pin::new(w).poll_flush(cx),
Writer::Http(w) => Pin::new(w).poll_flush(cx),
Writer::Other(w) => Pin::new(w.as_mut()).poll_flush(cx),
}
}
fn poll_shutdown(self: std::pin::Pin<&mut Self>, cx: &mut std::task::Context<'_>) -> std::task::Poll<std::io::Result<()>> {
match self.get_mut() {
Writer::Cursor(w) => Pin::new(w).poll_shutdown(cx),
Writer::Http(w) => Pin::new(w).poll_shutdown(cx),
Writer::Other(w) => Pin::new(w.as_mut()).poll_shutdown(cx),
}
}
}
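A short sketch of the `Writer` enum in its simplest configuration, buffering into a cursor and then recovering the bytes (the other variants route to `HttpWriter` or an arbitrary boxed `AsyncWrite`):

```rust
// Sketch only: write into the Cursor variant and take the buffer back out.
use rustfs_rio::Writer;
use std::io::Cursor;
use tokio::io::AsyncWriteExt;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let mut w = Writer::from_cursor(Cursor::new(Vec::new()));
    w.write_all(b"hello writer").await?;
    w.shutdown().await?;
    // into_cursor_inner() yields the accumulated buffer for Cursor-backed writers.
    assert_eq!(w.into_cursor_inner().unwrap(), b"hello writer");
    Ok(())
}
```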

View File

@@ -7,15 +7,40 @@ rust-version.workspace = true
version.workspace = true
[dependencies]
base64-simd = { workspace = true, optional = true }
blake3 = { workspace = true, optional = true }
crc32fast.workspace = true
hex-simd = { workspace = true, optional = true }
highway = { workspace = true, optional = true }
lazy_static = { workspace = true, optional = true }
local-ip-address = { workspace = true, optional = true }
rustfs-config = { workspace = true, features = ["constants"] }
md-5 = { workspace = true, optional = true }
netif = { workspace = true, optional = true }
nix = { workspace = true, optional = true }
regex = { workspace = true, optional = true }
rustls = { workspace = true, optional = true }
rustls-pemfile = { workspace = true, optional = true }
rustls-pki-types = { workspace = true, optional = true }
serde = { workspace = true, optional = true }
sha2 = { workspace = true, optional = true }
siphasher = { workspace = true, optional = true }
tempfile = { workspace = true, optional = true }
tokio = { workspace = true, optional = true, features = ["io-util", "macros"] }
tracing = { workspace = true }
url = { workspace = true, optional = true }
flate2 = { workspace = true, optional = true }
brotli = { workspace = true, optional = true }
zstd = { workspace = true, optional = true }
snap = { workspace = true, optional = true }
lz4 = { workspace = true, optional = true }
[dev-dependencies]
tempfile = { workspace = true }
rand = { workspace = true }
[target.'cfg(windows)'.dependencies]
winapi = { workspace = true, optional = true, features = ["std", "fileapi", "minwindef", "ntdef", "winnt"] }
[lints]
workspace = true
@@ -24,6 +49,13 @@ workspace = true
default = ["ip"] # features that are enabled by default
ip = ["dep:local-ip-address"] # ip characteristics and their dependencies
tls = ["dep:rustls", "dep:rustls-pemfile", "dep:rustls-pki-types"] # tls characteristics and their dependencies
net = ["ip"] # empty network features
net = ["ip", "dep:url", "dep:netif", "dep:lazy_static"] # empty network features
io = ["dep:tokio"]
path = []
compress = ["dep:flate2", "dep:brotli", "dep:snap", "dep:lz4", "dep:zstd"]
string = ["dep:regex", "dep:lazy_static"]
crypto = ["dep:base64-simd", "dep:hex-simd"]
hash = ["dep:highway", "dep:md-5", "dep:sha2", "dep:blake3", "dep:serde", "dep:siphasher"]
os = ["dep:nix", "dep:tempfile", "winapi"] # operating system utilities
integration = [] # integration test features
full = ["ip", "tls", "net", "integration"] # all features
full = ["ip", "tls", "net", "io", "hash", "os", "integration", "path", "crypto", "string", "compress"] # all features

View File

@@ -396,10 +396,12 @@ mod tests {
// Should fail because no certificates found
let result = load_all_certs_from_directory(temp_dir.path().to_str().unwrap());
assert!(result.is_err());
assert!(result
.unwrap_err()
.to_string()
.contains("No valid certificate/private key pair found"));
assert!(
result
.unwrap_err()
.to_string()
.contains("No valid certificate/private key pair found")
);
}
#[test]
@@ -412,10 +414,12 @@ mod tests {
let result = load_all_certs_from_directory(unicode_dir.to_str().unwrap());
assert!(result.is_err());
assert!(result
.unwrap_err()
.to_string()
.contains("No valid certificate/private key pair found"));
assert!(
result
.unwrap_err()
.to_string()
.contains("No valid certificate/private key pair found")
);
}
#[test]

View File

@@ -0,0 +1,318 @@
use std::io::Write;
use tokio::io;
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Default)]
pub enum CompressionAlgorithm {
None,
Gzip,
Deflate,
Zstd,
#[default]
Lz4,
Brotli,
Snappy,
}
impl CompressionAlgorithm {
pub fn as_str(&self) -> &str {
match self {
CompressionAlgorithm::None => "none",
CompressionAlgorithm::Gzip => "gzip",
CompressionAlgorithm::Deflate => "deflate",
CompressionAlgorithm::Zstd => "zstd",
CompressionAlgorithm::Lz4 => "lz4",
CompressionAlgorithm::Brotli => "brotli",
CompressionAlgorithm::Snappy => "snappy",
}
}
}
impl std::fmt::Display for CompressionAlgorithm {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "{}", self.as_str())
}
}
impl std::str::FromStr for CompressionAlgorithm {
type Err = std::io::Error;
fn from_str(s: &str) -> Result<Self, Self::Err> {
match s.to_lowercase().as_str() {
"gzip" => Ok(CompressionAlgorithm::Gzip),
"deflate" => Ok(CompressionAlgorithm::Deflate),
"zstd" => Ok(CompressionAlgorithm::Zstd),
"lz4" => Ok(CompressionAlgorithm::Lz4),
"brotli" => Ok(CompressionAlgorithm::Brotli),
"snappy" => Ok(CompressionAlgorithm::Snappy),
"none" => Ok(CompressionAlgorithm::None),
_ => Err(std::io::Error::other(format!("Unsupported compression algorithm: {}", s))),
}
}
}
pub fn compress_block(input: &[u8], algorithm: CompressionAlgorithm) -> Vec<u8> {
match algorithm {
CompressionAlgorithm::Gzip => {
let mut encoder = flate2::write::GzEncoder::new(Vec::new(), flate2::Compression::default());
let _ = encoder.write_all(input);
let _ = encoder.flush();
encoder.finish().unwrap_or_default()
}
CompressionAlgorithm::Deflate => {
let mut encoder = flate2::write::DeflateEncoder::new(Vec::new(), flate2::Compression::default());
let _ = encoder.write_all(input);
let _ = encoder.flush();
encoder.finish().unwrap_or_default()
}
CompressionAlgorithm::Zstd => {
let mut encoder = zstd::Encoder::new(Vec::new(), 0).expect("zstd encoder");
let _ = encoder.write_all(input);
encoder.finish().unwrap_or_default()
}
CompressionAlgorithm::Lz4 => {
let mut encoder = lz4::EncoderBuilder::new().build(Vec::new()).expect("lz4 encoder");
let _ = encoder.write_all(input);
let (out, result) = encoder.finish();
result.expect("lz4 finish");
out
}
CompressionAlgorithm::Brotli => {
let mut out = Vec::new();
brotli::CompressorWriter::new(&mut out, 4096, 5, 22)
.write_all(input)
.expect("brotli compress");
out
}
CompressionAlgorithm::Snappy => {
let mut encoder = snap::write::FrameEncoder::new(Vec::new());
let _ = encoder.write_all(input);
encoder.into_inner().unwrap_or_default()
}
CompressionAlgorithm::None => input.to_vec(),
}
}
pub fn decompress_block(compressed: &[u8], algorithm: CompressionAlgorithm) -> io::Result<Vec<u8>> {
match algorithm {
CompressionAlgorithm::Gzip => {
let mut decoder = flate2::read::GzDecoder::new(std::io::Cursor::new(compressed));
let mut out = Vec::new();
std::io::Read::read_to_end(&mut decoder, &mut out)?;
Ok(out)
}
CompressionAlgorithm::Deflate => {
let mut decoder = flate2::read::DeflateDecoder::new(std::io::Cursor::new(compressed));
let mut out = Vec::new();
std::io::Read::read_to_end(&mut decoder, &mut out)?;
Ok(out)
}
CompressionAlgorithm::Zstd => {
let mut decoder = zstd::Decoder::new(std::io::Cursor::new(compressed))?;
let mut out = Vec::new();
std::io::Read::read_to_end(&mut decoder, &mut out)?;
Ok(out)
}
CompressionAlgorithm::Lz4 => {
let mut decoder = lz4::Decoder::new(std::io::Cursor::new(compressed)).expect("lz4 decoder");
let mut out = Vec::new();
std::io::Read::read_to_end(&mut decoder, &mut out)?;
Ok(out)
}
CompressionAlgorithm::Brotli => {
let mut out = Vec::new();
let mut decoder = brotli::Decompressor::new(std::io::Cursor::new(compressed), 4096);
std::io::Read::read_to_end(&mut decoder, &mut out)?;
Ok(out)
}
CompressionAlgorithm::Snappy => {
let mut decoder = snap::read::FrameDecoder::new(std::io::Cursor::new(compressed));
let mut out = Vec::new();
std::io::Read::read_to_end(&mut decoder, &mut out)?;
Ok(out)
}
// "None" means the block was stored uncompressed, so hand it back unchanged.
CompressionAlgorithm::None => Ok(compressed.to_vec()),
}
}
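A minimal round-trip sketch over these helpers (the `rustfs_utils::compress` path is taken from the test imports elsewhere in this diff; the `main` wrapper is illustrative):

```rust
// Sketch only: compress a buffer and verify the round trip.
use rustfs_utils::compress::{compress_block, decompress_block, CompressionAlgorithm};

fn main() -> std::io::Result<()> {
    let algo: CompressionAlgorithm = "zstd".parse()?;
    let data = b"the same block goes in and comes back out".to_vec();
    let packed = compress_block(&data, algo);
    let unpacked = decompress_block(&packed, algo)?;
    assert_eq!(unpacked, data);
    Ok(())
}
```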
#[cfg(test)]
mod tests {
use super::*;
use std::str::FromStr;
use std::time::Instant;
#[test]
fn test_compress_decompress_gzip() {
let data = b"hello gzip compress";
let compressed = compress_block(data, CompressionAlgorithm::Gzip);
let decompressed = decompress_block(&compressed, CompressionAlgorithm::Gzip).unwrap();
assert_eq!(decompressed, data);
}
#[test]
fn test_compress_decompress_deflate() {
let data = b"hello deflate compress";
let compressed = compress_block(data, CompressionAlgorithm::Deflate);
let decompressed = decompress_block(&compressed, CompressionAlgorithm::Deflate).unwrap();
assert_eq!(decompressed, data);
}
#[test]
fn test_compress_decompress_zstd() {
let data = b"hello zstd compress";
let compressed = compress_block(data, CompressionAlgorithm::Zstd);
let decompressed = decompress_block(&compressed, CompressionAlgorithm::Zstd).unwrap();
assert_eq!(decompressed, data);
}
#[test]
fn test_compress_decompress_lz4() {
let data = b"hello lz4 compress";
let compressed = compress_block(data, CompressionAlgorithm::Lz4);
let decompressed = decompress_block(&compressed, CompressionAlgorithm::Lz4).unwrap();
assert_eq!(decompressed, data);
}
#[test]
fn test_compress_decompress_brotli() {
let data = b"hello brotli compress";
let compressed = compress_block(data, CompressionAlgorithm::Brotli);
let decompressed = decompress_block(&compressed, CompressionAlgorithm::Brotli).unwrap();
assert_eq!(decompressed, data);
}
#[test]
fn test_compress_decompress_snappy() {
let data = b"hello snappy compress";
let compressed = compress_block(data, CompressionAlgorithm::Snappy);
let decompressed = decompress_block(&compressed, CompressionAlgorithm::Snappy).unwrap();
assert_eq!(decompressed, data);
}
#[test]
fn test_from_str() {
assert_eq!(CompressionAlgorithm::from_str("gzip").unwrap(), CompressionAlgorithm::Gzip);
assert_eq!(CompressionAlgorithm::from_str("deflate").unwrap(), CompressionAlgorithm::Deflate);
assert_eq!(CompressionAlgorithm::from_str("zstd").unwrap(), CompressionAlgorithm::Zstd);
assert_eq!(CompressionAlgorithm::from_str("lz4").unwrap(), CompressionAlgorithm::Lz4);
assert_eq!(CompressionAlgorithm::from_str("brotli").unwrap(), CompressionAlgorithm::Brotli);
assert_eq!(CompressionAlgorithm::from_str("snappy").unwrap(), CompressionAlgorithm::Snappy);
assert!(CompressionAlgorithm::from_str("unknown").is_err());
}
#[test]
fn test_compare_compression_algorithms() {
use std::time::Instant;
let data = vec![42u8; 1024 * 100]; // 100KB of repetitive data
// let mut data = vec![0u8; 1024 * 1024];
// rand::thread_rng().fill(&mut data[..]);
let start = Instant::now();
let mut times = Vec::new();
times.push(("original", start.elapsed(), data.len()));
let start = Instant::now();
let gzip = compress_block(&data, CompressionAlgorithm::Gzip);
let gzip_time = start.elapsed();
times.push(("gzip", gzip_time, gzip.len()));
let start = Instant::now();
let deflate = compress_block(&data, CompressionAlgorithm::Deflate);
let deflate_time = start.elapsed();
times.push(("deflate", deflate_time, deflate.len()));
let start = Instant::now();
let zstd = compress_block(&data, CompressionAlgorithm::Zstd);
let zstd_time = start.elapsed();
times.push(("zstd", zstd_time, zstd.len()));
let start = Instant::now();
let lz4 = compress_block(&data, CompressionAlgorithm::Lz4);
let lz4_time = start.elapsed();
times.push(("lz4", lz4_time, lz4.len()));
let start = Instant::now();
let brotli = compress_block(&data, CompressionAlgorithm::Brotli);
let brotli_time = start.elapsed();
times.push(("brotli", brotli_time, brotli.len()));
let start = Instant::now();
let snappy = compress_block(&data, CompressionAlgorithm::Snappy);
let snappy_time = start.elapsed();
times.push(("snappy", snappy_time, snappy.len()));
println!("Compression results:");
for (name, dur, size) in &times {
println!("{}: {} bytes, {:?}", name, size, dur);
}
// All should decompress to the original
assert_eq!(decompress_block(&gzip, CompressionAlgorithm::Gzip).unwrap(), data);
assert_eq!(decompress_block(&deflate, CompressionAlgorithm::Deflate).unwrap(), data);
assert_eq!(decompress_block(&zstd, CompressionAlgorithm::Zstd).unwrap(), data);
assert_eq!(decompress_block(&lz4, CompressionAlgorithm::Lz4).unwrap(), data);
assert_eq!(decompress_block(&brotli, CompressionAlgorithm::Brotli).unwrap(), data);
assert_eq!(decompress_block(&snappy, CompressionAlgorithm::Snappy).unwrap(), data);
// All compressed results should not be empty
assert!(
!gzip.is_empty()
&& !deflate.is_empty()
&& !zstd.is_empty()
&& !lz4.is_empty()
&& !brotli.is_empty()
&& !snappy.is_empty()
);
}
#[test]
fn test_compression_benchmark() {
let sizes = [128 * 1024, 512 * 1024, 1024 * 1024];
let algorithms = [
CompressionAlgorithm::Gzip,
CompressionAlgorithm::Deflate,
CompressionAlgorithm::Zstd,
CompressionAlgorithm::Lz4,
CompressionAlgorithm::Brotli,
CompressionAlgorithm::Snappy,
];
println!("\n压缩算法基准测试结果:");
println!(
"{:<10} {:<10} {:<15} {:<15} {:<15}",
"数据大小", "算法", "压缩时间(ms)", "压缩后大小", "压缩率"
);
for size in sizes {
// Generate compressible data (a repeated text pattern)
let pattern = b"Hello, this is a test pattern that will be repeated multiple times to create compressible data. ";
let data: Vec<u8> = pattern.iter().cycle().take(size).copied().collect();
for algo in algorithms {
// Compression pass
let start = Instant::now();
let compressed = compress_block(&data, algo);
let compress_time = start.elapsed();
// Decompression pass
let start = Instant::now();
let _decompressed = decompress_block(&compressed, algo).unwrap();
let _decompress_time = start.elapsed();
// Compute the compression ratio
let compression_ratio = (size as f64 / compressed.len() as f64) as f32;
println!(
"{:<10} {:<10} {:<15.2} {:<15} {:<15.2}x",
format!("{}KB", size / 1024),
algo.as_str(),
compress_time.as_secs_f64() * 1000.0,
compressed.len(),
compression_ratio
);
// Verify the decompressed output
assert_eq!(_decompressed, data);
}
println!(); // Blank line separating results for different data sizes
}
}
}

206
crates/utils/src/hash.rs Normal file
View File

@@ -0,0 +1,206 @@
use highway::{HighwayHash, HighwayHasher, Key};
use md5::{Digest, Md5};
use serde::{Deserialize, Serialize};
use sha2::Sha256;
/// The fixed key for HighwayHash256. DO NOT change it; changing it breaks compatibility with existing hashes.
const HIGHWAY_HASH256_KEY: [u64; 4] = [3, 4, 2, 1];
#[derive(Serialize, Deserialize, Debug, PartialEq, Default, Clone, Eq, Hash)]
/// Supported hash algorithms for bitrot protection.
pub enum HashAlgorithm {
// SHA256 represents the SHA-256 hash function
SHA256,
// HighwayHash256 represents the HighwayHash-256 hash function
HighwayHash256,
// HighwayHash256S represents the Streaming HighwayHash-256 hash function
#[default]
HighwayHash256S,
// BLAKE2b512 represents the BLAKE2b-512 hash function
BLAKE2b512,
/// MD5 (128-bit)
Md5,
/// No hash (for testing or unprotected data)
None,
}
enum HashEncoded {
Md5([u8; 16]),
Sha256([u8; 32]),
HighwayHash256([u8; 32]),
HighwayHash256S([u8; 32]),
Blake2b512(blake3::Hash),
None,
}
impl AsRef<[u8]> for HashEncoded {
#[inline]
fn as_ref(&self) -> &[u8] {
match self {
HashEncoded::Md5(hash) => hash.as_ref(),
HashEncoded::Sha256(hash) => hash.as_ref(),
HashEncoded::HighwayHash256(hash) => hash.as_ref(),
HashEncoded::HighwayHash256S(hash) => hash.as_ref(),
HashEncoded::Blake2b512(hash) => hash.as_bytes(),
HashEncoded::None => &[],
}
}
}
#[inline]
fn u8x32_from_u64x4(input: [u64; 4]) -> [u8; 32] {
let mut output = [0u8; 32];
for (i, &n) in input.iter().enumerate() {
output[i * 8..(i + 1) * 8].copy_from_slice(&n.to_le_bytes());
}
output
}
impl HashAlgorithm {
/// Hash the input data and return the result as a byte-slice view (`impl AsRef<[u8]>`).
pub fn hash_encode(&self, data: &[u8]) -> impl AsRef<[u8]> {
match self {
HashAlgorithm::Md5 => HashEncoded::Md5(Md5::digest(data).into()),
HashAlgorithm::HighwayHash256 => {
let mut hasher = HighwayHasher::new(Key(HIGHWAY_HASH256_KEY));
hasher.append(data);
HashEncoded::HighwayHash256(u8x32_from_u64x4(hasher.finalize256()))
}
HashAlgorithm::SHA256 => HashEncoded::Sha256(Sha256::digest(data).into()),
HashAlgorithm::HighwayHash256S => {
let mut hasher = HighwayHasher::new(Key(HIGHWAY_HASH256_KEY));
hasher.append(data);
HashEncoded::HighwayHash256S(u8x32_from_u64x4(hasher.finalize256()))
}
HashAlgorithm::BLAKE2b512 => HashEncoded::Blake2b512(blake3::hash(data)),
HashAlgorithm::None => HashEncoded::None,
}
}
/// Return the output size in bytes for the hash algorithm.
pub fn size(&self) -> usize {
match self {
HashAlgorithm::SHA256 => 32,
HashAlgorithm::HighwayHash256 => 32,
HashAlgorithm::HighwayHash256S => 32,
HashAlgorithm::BLAKE2b512 => 32, // blake3 outputs 32 bytes by default
HashAlgorithm::Md5 => 16,
HashAlgorithm::None => 0,
}
}
}
use crc32fast::Hasher;
use siphasher::sip::SipHasher;
pub fn sip_hash(key: &str, cardinality: usize, id: &[u8; 16]) -> usize {
// The key must be exactly 16 bytes.
// Compute the SipHash of the key and reduce it modulo `cardinality`.
let result = SipHasher::new_with_key(id).hash(key.as_bytes());
result as usize % cardinality
}
pub fn crc_hash(key: &str, cardinality: usize) -> usize {
let mut hasher = Hasher::new(); // Create a new hasher
hasher.update(key.as_bytes()); // Feed the key bytes into the hasher
let checksum = hasher.finalize();
checksum as usize % cardinality
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_hash_algorithm_sizes() {
assert_eq!(HashAlgorithm::Md5.size(), 16);
assert_eq!(HashAlgorithm::HighwayHash256.size(), 32);
assert_eq!(HashAlgorithm::HighwayHash256S.size(), 32);
assert_eq!(HashAlgorithm::SHA256.size(), 32);
assert_eq!(HashAlgorithm::BLAKE2b512.size(), 32);
assert_eq!(HashAlgorithm::None.size(), 0);
}
#[test]
fn test_hash_encode_none() {
let data = b"test data";
let hash = HashAlgorithm::None.hash_encode(data);
let hash = hash.as_ref();
assert_eq!(hash.len(), 0);
}
#[test]
fn test_hash_encode_md5() {
let data = b"test data";
let hash = HashAlgorithm::Md5.hash_encode(data);
let hash = hash.as_ref();
assert_eq!(hash.len(), 16);
// MD5 should be deterministic
let hash2 = HashAlgorithm::Md5.hash_encode(data);
let hash2 = hash2.as_ref();
assert_eq!(hash, hash2);
}
#[test]
fn test_hash_encode_highway() {
let data = b"test data";
let hash = HashAlgorithm::HighwayHash256.hash_encode(data);
let hash = hash.as_ref();
assert_eq!(hash.len(), 32);
// HighwayHash should be deterministic
let hash2 = HashAlgorithm::HighwayHash256.hash_encode(data);
let hash2 = hash2.as_ref();
assert_eq!(hash, hash2);
}
#[test]
fn test_hash_encode_sha256() {
let data = b"test data";
let hash = HashAlgorithm::SHA256.hash_encode(data);
let hash = hash.as_ref();
assert_eq!(hash.len(), 32);
// SHA256 should be deterministic
let hash2 = HashAlgorithm::SHA256.hash_encode(data);
let hash2 = hash2.as_ref();
assert_eq!(hash, hash2);
}
#[test]
fn test_hash_encode_blake2b512() {
let data = b"test data";
let hash = HashAlgorithm::BLAKE2b512.hash_encode(data);
let hash = hash.as_ref();
assert_eq!(hash.len(), 32); // blake3 outputs 32 bytes by default
// BLAKE2b512 should be deterministic
let hash2 = HashAlgorithm::BLAKE2b512.hash_encode(data);
let hash2 = hash2.as_ref();
assert_eq!(hash, hash2);
}
#[test]
fn test_different_data_different_hashes() {
let data1 = b"test data 1";
let data2 = b"test data 2";
let md5_hash1 = HashAlgorithm::Md5.hash_encode(data1);
let md5_hash2 = HashAlgorithm::Md5.hash_encode(data2);
assert_ne!(md5_hash1.as_ref(), md5_hash2.as_ref());
let highway_hash1 = HashAlgorithm::HighwayHash256.hash_encode(data1);
let highway_hash2 = HashAlgorithm::HighwayHash256.hash_encode(data2);
assert_ne!(highway_hash1.as_ref(), highway_hash2.as_ref());
let sha256_hash1 = HashAlgorithm::SHA256.hash_encode(data1);
let sha256_hash2 = HashAlgorithm::SHA256.hash_encode(data2);
assert_ne!(sha256_hash1.as_ref(), sha256_hash2.as_ref());
let blake_hash1 = HashAlgorithm::BLAKE2b512.hash_encode(data1);
let blake_hash2 = HashAlgorithm::BLAKE2b512.hash_encode(data2);
assert_ne!(blake_hash1.as_ref(), blake_hash2.as_ref());
}
}
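The standalone `sip_hash` and `crc_hash` helpers above are not exercised by the tests in this file. A minimal, illustrative sketch of how either helper maps a key onto one of N slots; the object key and the 16-byte id below are placeholders, not values taken from the code base:
```rust
// Both helpers reduce a key to an index in [0, cardinality), e.g. to pick a disk or
// erasure set for an object key. Assumes `sip_hash` and `crc_hash` from this module.
fn pick_slot(key: &str, slots: usize, deployment_id: &[u8; 16]) -> usize {
    // Keyed placement: stable for a fixed deployment id.
    sip_hash(key, slots, deployment_id)
}

#[test]
fn example_key_to_slot_mapping() {
    let deployment_id = [7u8; 16]; // placeholder; real callers would derive this elsewhere
    assert!(pick_slot("bucket/object.txt", 8, &deployment_id) < 8);
    assert!(crc_hash("bucket/object.txt", 8) < 8);
    // The same inputs always map to the same slot.
    assert_eq!(
        pick_slot("bucket/object.txt", 8, &deployment_id),
        pick_slot("bucket/object.txt", 8, &deployment_id)
    );
}
```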

231
crates/utils/src/io.rs Normal file

@@ -0,0 +1,231 @@
use tokio::io::{AsyncRead, AsyncReadExt, AsyncWrite, AsyncWriteExt};
/// Write all bytes from buf to writer, returning the total number of bytes written.
pub async fn write_all<W: AsyncWrite + Send + Sync + Unpin>(writer: &mut W, buf: &[u8]) -> std::io::Result<usize> {
let mut total = 0;
while total < buf.len() {
match writer.write(&buf[total..]).await {
Ok(0) => {
break;
}
Ok(n) => total += n,
Err(e) => return Err(e),
}
}
Ok(total)
}
/// Read exactly buf.len() bytes into buf, or return an error if EOF is reached before.
/// Like Go's io.ReadFull.
#[allow(dead_code)]
pub async fn read_full<R: AsyncRead + Send + Sync + Unpin>(mut reader: R, mut buf: &mut [u8]) -> std::io::Result<usize> {
let mut total = 0;
while !buf.is_empty() {
let n = match reader.read(buf).await {
Ok(n) => n,
Err(e) => {
if total == 0 {
return Err(e);
}
return Err(std::io::Error::new(
std::io::ErrorKind::UnexpectedEof,
format!("read {} bytes, error: {}", total, e),
));
}
};
if n == 0 {
if total > 0 {
return Ok(total);
}
return Err(std::io::Error::new(std::io::ErrorKind::UnexpectedEof, "early EOF"));
}
buf = &mut buf[n..];
total += n;
}
Ok(total)
}
/// Encodes a u64 into buf and returns the number of bytes written.
/// Panics if buf is too small.
pub fn put_uvarint(buf: &mut [u8], x: u64) -> usize {
let mut i = 0;
let mut x = x;
while x >= 0x80 {
buf[i] = (x as u8) | 0x80;
x >>= 7;
i += 1;
}
buf[i] = x as u8;
i + 1
}
pub fn put_uvarint_len(x: u64) -> usize {
let mut i = 0;
let mut x = x;
while x >= 0x80 {
x >>= 7;
i += 1;
}
i + 1
}
/// Decodes a u64 from buf and returns (value, number of bytes read).
/// If buf is too small, returns (0, 0).
/// If overflow, returns (0, -(n as isize)), where n is the number of bytes read.
pub fn uvarint(buf: &[u8]) -> (u64, isize) {
let mut x: u64 = 0;
let mut s: u32 = 0;
for (i, &b) in buf.iter().enumerate() {
if i == 10 {
// MaxVarintLen64 = 10
return (0, -((i + 1) as isize));
}
if b < 0x80 {
if i == 9 && b > 1 {
return (0, -((i + 1) as isize));
}
return (x | ((b as u64) << s), (i + 1) as isize);
}
x |= ((b & 0x7F) as u64) << s;
s += 7;
}
(0, 0)
}
#[cfg(test)]
mod tests {
use super::*;
use tokio::io::BufReader;
#[tokio::test]
async fn test_read_full_exact() {
// let data = b"abcdef";
let data = b"channel async callback test data!";
let mut reader = BufReader::new(&data[..]);
let size = data.len();
let mut total = 0;
let mut rev = vec![0u8; size];
let mut count = 0;
while total < size {
let mut buf = [0u8; 8];
let n = read_full(&mut reader, &mut buf).await.unwrap();
total += n;
rev[total - n..total].copy_from_slice(&buf[..n]);
count += 1;
println!("count: {}, total: {}, n: {}", count, total, n);
}
assert_eq!(total, size);
assert_eq!(&rev, data);
}
#[tokio::test]
async fn test_read_full_short() {
let data = b"abc";
let mut reader = BufReader::new(&data[..]);
let mut buf = [0u8; 6];
let n = read_full(&mut reader, &mut buf).await.unwrap();
assert_eq!(n, 3);
assert_eq!(&buf[..n], data);
}
#[tokio::test]
async fn test_read_full_1m() {
let size = 1024 * 1024;
let data = vec![42u8; size];
let mut reader = BufReader::new(&data[..]);
let mut buf = vec![0u8; size / 3];
read_full(&mut reader, &mut buf).await.unwrap();
assert_eq!(buf, data[..size / 3]);
}
#[test]
fn test_put_uvarint_and_uvarint_zero() {
let mut buf = [0u8; 16];
let n = put_uvarint(&mut buf, 0);
let (decoded, m) = uvarint(&buf[..n]);
assert_eq!(decoded, 0);
assert_eq!(m as usize, n);
}
#[test]
fn test_put_uvarint_and_uvarint_max() {
let mut buf = [0u8; 16];
let n = put_uvarint(&mut buf, u64::MAX);
let (decoded, m) = uvarint(&buf[..n]);
assert_eq!(decoded, u64::MAX);
assert_eq!(m as usize, n);
}
#[test]
fn test_put_uvarint_and_uvarint_various() {
let mut buf = [0u8; 16];
for &v in &[1u64, 127, 128, 255, 300, 16384, u32::MAX as u64] {
let n = put_uvarint(&mut buf, v);
let (decoded, m) = uvarint(&buf[..n]);
assert_eq!(decoded, v, "decode mismatch for {}", v);
assert_eq!(m as usize, n, "length mismatch for {}", v);
}
}
#[test]
fn test_uvarint_incomplete() {
let buf = [0x80u8, 0x80, 0x80];
let (v, n) = uvarint(&buf);
assert_eq!(v, 0);
assert_eq!(n, 0);
}
#[test]
fn test_uvarint_overflow_case() {
let buf = [0xFFu8; 11];
let (v, n) = uvarint(&buf);
assert_eq!(v, 0);
assert!(n < 0);
}
#[tokio::test]
async fn test_write_all_basic() {
let data = b"hello world!";
let mut buf = Vec::new();
let n = write_all(&mut buf, data).await.unwrap();
assert_eq!(n, data.len());
assert_eq!(&buf, data);
}
#[tokio::test]
async fn test_write_all_partial() {
struct PartialWriter {
inner: Vec<u8>,
max_write: usize,
}
use std::pin::Pin;
use std::task::{Context, Poll};
use tokio::io::AsyncWrite;
impl AsyncWrite for PartialWriter {
fn poll_write(mut self: Pin<&mut Self>, _cx: &mut Context<'_>, buf: &[u8]) -> Poll<std::io::Result<usize>> {
let n = buf.len().min(self.max_write);
self.inner.extend_from_slice(&buf[..n]);
Poll::Ready(Ok(n))
}
fn poll_flush(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<std::io::Result<()>> {
Poll::Ready(Ok(()))
}
fn poll_shutdown(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<std::io::Result<()>> {
Poll::Ready(Ok(()))
}
}
let data = b"abcdefghijklmnopqrstuvwxyz";
let mut writer = PartialWriter {
inner: Vec::new(),
max_write: 5,
};
let n = write_all(&mut writer, data).await.unwrap();
assert_eq!(n, data.len());
assert_eq!(&writer.inner, data);
}
}
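The varint helpers above follow Go's unsigned LEB128 layout: low seven bits first, with the high bit set while more bytes follow. A small worked example of the byte layout, assuming `put_uvarint`, `put_uvarint_len`, and `uvarint` from this module are in scope:
```rust
// Worked example (illustrative): 300 = 0b1_0010_1100, so the encoding is two bytes:
// 0xAC (low 7 bits | continuation flag) followed by 0x02 (the remaining bits).
#[test]
fn varint_layout_example() {
    let mut buf = [0u8; 10]; // 10 bytes is the maximum encoded length of a u64
    let n = put_uvarint(&mut buf, 300);
    assert_eq!(&buf[..n], &[0xAC, 0x02]);
    assert_eq!(put_uvarint_len(300), n);
    let (value, read) = uvarint(&buf[..n]);
    assert_eq!((value, read), (300, 2));
}
```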


@@ -1,11 +1,44 @@
#[cfg(feature = "tls")]
mod certs;
pub mod certs;
#[cfg(feature = "ip")]
mod ip;
pub mod ip;
#[cfg(feature = "net")]
mod net;
pub mod net;
#[cfg(feature = "net")]
pub use net::*;
#[cfg(feature = "io")]
pub mod io;
#[cfg(feature = "hash")]
pub mod hash;
#[cfg(feature = "os")]
pub mod os;
#[cfg(feature = "path")]
pub mod path;
#[cfg(feature = "string")]
pub mod string;
#[cfg(feature = "crypto")]
pub mod crypto;
#[cfg(feature = "compress")]
pub mod compress;
#[cfg(feature = "tls")]
pub use certs::*;
#[cfg(feature = "hash")]
pub use hash::*;
#[cfg(feature = "io")]
pub use io::*;
#[cfg(feature = "ip")]
pub use ip::*;
#[cfg(feature = "crypto")]
pub use crypto::*;
#[cfg(feature = "compress")]
pub use compress::*;


@@ -1 +1,499 @@
use lazy_static::lazy_static;
use std::{
collections::HashSet,
fmt::Display,
net::{IpAddr, Ipv6Addr, SocketAddr, TcpListener, ToSocketAddrs},
};
use url::Host;
lazy_static! {
static ref LOCAL_IPS: Vec<IpAddr> = must_get_local_ips().unwrap();
}
/// helper for validating if the provided arg is an ip address.
pub fn is_socket_addr(addr: &str) -> bool {
// TODO IPv6 zone information?
addr.parse::<SocketAddr>().is_ok() || addr.parse::<IpAddr>().is_ok()
}
/// checks if server_addr is valid and local host.
pub fn check_local_server_addr(server_addr: &str) -> std::io::Result<SocketAddr> {
let addr: Vec<SocketAddr> = match server_addr.to_socket_addrs() {
Ok(addr) => addr.collect(),
Err(err) => return Err(std::io::Error::other(err)),
};
// 0.0.0.0 is a wildcard address and refers to local network
// addresses. I.e, 0.0.0.0:9000 like ":9000" refers to port
// 9000 on localhost.
for a in addr {
if a.ip().is_unspecified() {
return Ok(a);
}
let host = match a {
SocketAddr::V4(a) => Host::<&str>::Ipv4(*a.ip()),
SocketAddr::V6(a) => Host::Ipv6(*a.ip()),
};
if is_local_host(host, 0, 0)? {
return Ok(a);
}
}
Err(std::io::Error::other("host in server address should be this server"))
}
/// checks if the given parameter corresponds to one of
/// the local IPs of the current machine
pub fn is_local_host(host: Host<&str>, port: u16, local_port: u16) -> std::io::Result<bool> {
let local_set: HashSet<IpAddr> = LOCAL_IPS.iter().copied().collect();
let is_local_host = match host {
Host::Domain(domain) => {
let ips = match (domain, 0).to_socket_addrs().map(|v| v.map(|v| v.ip()).collect::<Vec<_>>()) {
Ok(ips) => ips,
Err(err) => return Err(std::io::Error::other(err)),
};
ips.iter().any(|ip| local_set.contains(ip))
}
Host::Ipv4(ip) => local_set.contains(&IpAddr::V4(ip)),
Host::Ipv6(ip) => local_set.contains(&IpAddr::V6(ip)),
};
if port > 0 {
return Ok(is_local_host && port == local_port);
}
Ok(is_local_host)
}
/// returns the IP addresses of the given host.
pub fn get_host_ip(host: Host<&str>) -> std::io::Result<HashSet<IpAddr>> {
match host {
Host::Domain(domain) => match (domain, 0)
.to_socket_addrs()
.map(|v| v.map(|v| v.ip()).collect::<HashSet<_>>())
{
Ok(ips) => Ok(ips),
Err(err) => Err(std::io::Error::other(err)),
},
Host::Ipv4(ip) => {
let mut set = HashSet::with_capacity(1);
set.insert(IpAddr::V4(ip));
Ok(set)
}
Host::Ipv6(ip) => {
let mut set = HashSet::with_capacity(1);
set.insert(IpAddr::V6(ip));
Ok(set)
}
}
}
pub fn get_available_port() -> u16 {
TcpListener::bind("0.0.0.0:0").unwrap().local_addr().unwrap().port()
}
/// returns IPs of local interface
pub fn must_get_local_ips() -> std::io::Result<Vec<IpAddr>> {
match netif::up() {
Ok(up) => Ok(up.map(|x| x.address().to_owned()).collect()),
Err(err) => Err(std::io::Error::other(format!("Unable to get IP addresses of this host: {}", err))),
}
}
#[derive(Debug, Clone)]
pub struct XHost {
pub name: String,
pub port: u16,
pub is_port_set: bool,
}
impl Display for XHost {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
if !self.is_port_set {
write!(f, "{}", self.name)
} else if self.name.contains(':') {
write!(f, "[{}]:{}", self.name, self.port)
} else {
write!(f, "{}:{}", self.name, self.port)
}
}
}
impl TryFrom<String> for XHost {
type Error = std::io::Error;
fn try_from(value: String) -> std::result::Result<Self, Self::Error> {
if let Some(addr) = value.to_socket_addrs()?.next() {
Ok(Self {
name: addr.ip().to_string(),
port: addr.port(),
is_port_set: addr.port() > 0,
})
} else {
Err(std::io::Error::new(std::io::ErrorKind::InvalidData, "value invalid"))
}
}
}
/// Parses the address string, processes the ":port" format for dual-stack binding,
/// and resolves the host name or IP address. If the port is 0, an available port is assigned.
pub fn parse_and_resolve_address(addr_str: &str) -> std::io::Result<SocketAddr> {
let resolved_addr: SocketAddr = if let Some(port) = addr_str.strip_prefix(":") {
// Process the ":port" format for double stack binding
let port_str = port;
let port: u16 = port_str
.parse()
.map_err(|e| std::io::Error::other(format!("Invalid port format: {}, err:{:?}", addr_str, e)))?;
let final_port = if port == 0 {
get_available_port() // assume get_available_port is available here
} else {
port
};
// Using IPv6 without address specified [::], it should handle both IPv4 and IPv6
SocketAddr::new(IpAddr::V6(Ipv6Addr::UNSPECIFIED), final_port)
} else {
// Use existing logic to handle regular address formats
let mut addr = check_local_server_addr(addr_str)?; // assume check_local_server_addr is available here
if addr.port() == 0 {
addr.set_port(get_available_port());
}
addr
};
Ok(resolved_addr)
}
#[cfg(test)]
mod test {
use std::net::{Ipv4Addr, Ipv6Addr};
use super::*;
#[test]
fn test_is_socket_addr() {
let test_cases = [
// Valid IP addresses
("192.168.1.0", true),
("127.0.0.1", true),
("10.0.0.1", true),
("0.0.0.0", true),
("255.255.255.255", true),
// Valid IPv6 addresses
("2001:db8::1", true),
("::1", true),
("::", true),
("fe80::1", true),
// Valid socket addresses
("192.168.1.0:8080", true),
("127.0.0.1:9000", true),
("[2001:db8::1]:9000", true),
("[::1]:8080", true),
("0.0.0.0:0", true),
// Invalid addresses
("localhost", false),
("localhost:9000", false),
("example.com", false),
("example.com:8080", false),
("http://192.168.1.0", false),
("http://192.168.1.0:9000", false),
("256.256.256.256", false),
("192.168.1", false),
("192.168.1.0.1", false),
("", false),
(":", false),
(":::", false),
("invalid_ip", false),
];
for (addr, expected) in test_cases {
let result = is_socket_addr(addr);
assert_eq!(expected, result, "addr: '{}', expected: {}, got: {}", addr, expected, result);
}
}
#[test]
fn test_check_local_server_addr() {
// Test valid local addresses
let valid_cases = ["localhost:54321", "127.0.0.1:9000", "0.0.0.0:9000", "[::1]:8080", "::1:8080"];
for addr in valid_cases {
let result = check_local_server_addr(addr);
assert!(result.is_ok(), "Expected '{}' to be valid, but got error: {:?}", addr, result);
}
// Test invalid addresses
let invalid_cases = [
("localhost", "invalid socket address"),
("", "invalid socket address"),
("example.org:54321", "host in server address should be this server"),
("8.8.8.8:53", "host in server address should be this server"),
(":-10", "invalid port value"),
("invalid:port", "invalid port value"),
];
for (addr, expected_error_pattern) in invalid_cases {
let result = check_local_server_addr(addr);
assert!(result.is_err(), "Expected '{}' to be invalid, but it was accepted: {:?}", addr, result);
let error_msg = result.unwrap_err().to_string();
assert!(
error_msg.contains(expected_error_pattern) || error_msg.contains("invalid socket address"),
"Error message '{}' doesn't contain expected pattern '{}' for address '{}'",
error_msg,
expected_error_pattern,
addr
);
}
}
#[test]
fn test_is_local_host() {
// Test localhost domain
let localhost_host = Host::Domain("localhost");
assert!(is_local_host(localhost_host, 0, 0).unwrap());
// Test loopback IP addresses
let ipv4_loopback = Host::Ipv4(Ipv4Addr::new(127, 0, 0, 1));
assert!(is_local_host(ipv4_loopback, 0, 0).unwrap());
let ipv6_loopback = Host::Ipv6(Ipv6Addr::new(0, 0, 0, 0, 0, 0, 0, 1));
assert!(is_local_host(ipv6_loopback, 0, 0).unwrap());
// Test port matching
let localhost_with_port1 = Host::Domain("localhost");
assert!(is_local_host(localhost_with_port1, 8080, 8080).unwrap());
let localhost_with_port2 = Host::Domain("localhost");
assert!(!is_local_host(localhost_with_port2, 8080, 9000).unwrap());
// Test non-local host
let external_host = Host::Ipv4(Ipv4Addr::new(8, 8, 8, 8));
assert!(!is_local_host(external_host, 0, 0).unwrap());
// Test invalid domain should return error
let invalid_host = Host::Domain("invalid.nonexistent.domain.example");
assert!(is_local_host(invalid_host, 0, 0).is_err());
}
#[test]
fn test_get_host_ip() {
// Test IPv4 address
let ipv4_host = Host::Ipv4(Ipv4Addr::new(192, 168, 1, 1));
let ipv4_result = get_host_ip(ipv4_host).unwrap();
assert_eq!(ipv4_result.len(), 1);
assert!(ipv4_result.contains(&IpAddr::V4(Ipv4Addr::new(192, 168, 1, 1))));
// Test IPv6 address
let ipv6_host = Host::Ipv6(Ipv6Addr::new(0x2001, 0xdb8, 0, 0, 0, 0, 0, 1));
let ipv6_result = get_host_ip(ipv6_host).unwrap();
assert_eq!(ipv6_result.len(), 1);
assert!(ipv6_result.contains(&IpAddr::V6(Ipv6Addr::new(0x2001, 0xdb8, 0, 0, 0, 0, 0, 1))));
// Test localhost domain
let localhost_host = Host::Domain("localhost");
let localhost_result = get_host_ip(localhost_host).unwrap();
assert!(!localhost_result.is_empty());
// Should contain at least loopback address
assert!(
localhost_result.contains(&IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1)))
|| localhost_result.contains(&IpAddr::V6(Ipv6Addr::new(0, 0, 0, 0, 0, 0, 0, 1)))
);
// Test invalid domain
let invalid_host = Host::Domain("invalid.nonexistent.domain.example");
assert!(get_host_ip(invalid_host).is_err());
}
#[test]
fn test_get_available_port() {
let port1 = get_available_port();
let port2 = get_available_port();
// Port should be in valid range (u16 max is always <= 65535)
assert!(port1 > 0);
assert!(port2 > 0);
// Different calls should typically return different ports
assert_ne!(port1, port2);
}
#[test]
fn test_must_get_local_ips() {
let local_ips = must_get_local_ips().unwrap();
let local_set: HashSet<IpAddr> = local_ips.into_iter().collect();
// Should contain loopback addresses
assert!(local_set.contains(&IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1))));
// Should not be empty
assert!(!local_set.is_empty());
// All IPs should be valid
for ip in &local_set {
match ip {
IpAddr::V4(_) | IpAddr::V6(_) => {} // Valid
}
}
}
#[test]
fn test_xhost_display() {
// Test without port
let host_no_port = XHost {
name: "example.com".to_string(),
port: 0,
is_port_set: false,
};
assert_eq!(host_no_port.to_string(), "example.com");
// Test with port (IPv4-like name)
let host_with_port = XHost {
name: "192.168.1.1".to_string(),
port: 8080,
is_port_set: true,
};
assert_eq!(host_with_port.to_string(), "192.168.1.1:8080");
// Test with port (IPv6-like name)
let host_ipv6_with_port = XHost {
name: "2001:db8::1".to_string(),
port: 9000,
is_port_set: true,
};
assert_eq!(host_ipv6_with_port.to_string(), "[2001:db8::1]:9000");
// Test domain name with port
let host_domain_with_port = XHost {
name: "example.com".to_string(),
port: 443,
is_port_set: true,
};
assert_eq!(host_domain_with_port.to_string(), "example.com:443");
}
#[test]
fn test_xhost_try_from() {
// Test valid IPv4 address with port
let result = XHost::try_from("192.168.1.1:8080".to_string()).unwrap();
assert_eq!(result.name, "192.168.1.1");
assert_eq!(result.port, 8080);
assert!(result.is_port_set);
// Test valid IPv4 address without port
let result = XHost::try_from("192.168.1.1:0".to_string()).unwrap();
assert_eq!(result.name, "192.168.1.1");
assert_eq!(result.port, 0);
assert!(!result.is_port_set);
// Test valid IPv6 address with port
let result = XHost::try_from("[2001:db8::1]:9000".to_string()).unwrap();
assert_eq!(result.name, "2001:db8::1");
assert_eq!(result.port, 9000);
assert!(result.is_port_set);
// Test localhost with port (localhost may resolve to either IPv4 or IPv6)
let result = XHost::try_from("localhost:3000".to_string()).unwrap();
// localhost can resolve to either 127.0.0.1 or ::1 depending on system configuration
assert!(result.name == "127.0.0.1" || result.name == "::1");
assert_eq!(result.port, 3000);
assert!(result.is_port_set);
// Test invalid format
let result = XHost::try_from("invalid_format".to_string());
assert!(result.is_err());
// Test empty string
let result = XHost::try_from("".to_string());
assert!(result.is_err());
}
#[test]
fn test_parse_and_resolve_address() {
// Test port-only format
let result = parse_and_resolve_address(":8080").unwrap();
assert_eq!(result.ip(), IpAddr::V6(Ipv6Addr::UNSPECIFIED));
assert_eq!(result.port(), 8080);
// Test port-only format with port 0 (should get available port)
let result = parse_and_resolve_address(":0").unwrap();
assert_eq!(result.ip(), IpAddr::V6(Ipv6Addr::UNSPECIFIED));
assert!(result.port() > 0);
// Test localhost with port
let result = parse_and_resolve_address("localhost:9000").unwrap();
assert_eq!(result.port(), 9000);
// Test localhost with port 0 (should get available port)
let result = parse_and_resolve_address("localhost:0").unwrap();
assert!(result.port() > 0);
// Test 0.0.0.0 with port
let result = parse_and_resolve_address("0.0.0.0:7000").unwrap();
assert_eq!(result.ip(), IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)));
assert_eq!(result.port(), 7000);
// Test invalid port format
let result = parse_and_resolve_address(":invalid_port");
assert!(result.is_err());
// Test invalid address
let result = parse_and_resolve_address("example.org:8080");
assert!(result.is_err());
}
#[test]
fn test_edge_cases() {
// Test empty string for is_socket_addr
assert!(!is_socket_addr(""));
// Test single colon for is_socket_addr
assert!(!is_socket_addr(":"));
// Test malformed IPv6 for is_socket_addr
assert!(!is_socket_addr("[::]"));
assert!(!is_socket_addr("[::1"));
// Test very long strings
let long_string = "a".repeat(1000);
assert!(!is_socket_addr(&long_string));
// Test unicode characters
assert!(!is_socket_addr("测试.example.com"));
// Test special characters
assert!(!is_socket_addr("test@example.com:8080"));
assert!(!is_socket_addr("http://example.com:8080"));
}
#[test]
fn test_boundary_values() {
// Test port boundaries
assert!(is_socket_addr("127.0.0.1:0"));
assert!(is_socket_addr("127.0.0.1:65535"));
assert!(!is_socket_addr("127.0.0.1:65536"));
// Test IPv4 boundaries
assert!(is_socket_addr("0.0.0.0"));
assert!(is_socket_addr("255.255.255.255"));
assert!(!is_socket_addr("256.0.0.0"));
assert!(!is_socket_addr("0.0.0.256"));
// Test XHost with boundary ports
let host_max_port = XHost {
name: "example.com".to_string(),
port: 65535,
is_port_set: true,
};
assert_eq!(host_max_port.to_string(), "example.com:65535");
let host_zero_port = XHost {
name: "example.com".to_string(),
port: 0,
is_port_set: true,
};
assert_eq!(host_zero_port.to_string(), "example.com:0");
}
}
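// Illustrative example (not part of the original change): a typical call site resolves the
// configured listen address once and then binds it. ":0" input selects [::] plus a free port;
// there is a small race between probing the free port and binding it, which is acceptable here.
#[test]
fn example_resolve_then_bind() {
    let addr = parse_and_resolve_address(":0").unwrap();
    let listener = std::net::TcpListener::bind(addr).unwrap();
    assert!(listener.local_addr().unwrap().port() > 0);
}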


@@ -1,16 +1,13 @@
use nix::sys::stat::{self, stat};
use nix::sys::statfs::{self, statfs, FsType};
use nix::sys::statfs::{self, FsType, statfs};
use std::fs::File;
use std::io::{self, BufRead, Error, ErrorKind};
use std::path::Path;
use crate::disk::Info;
use common::error::{Error as e_Error, Result};
use super::{DiskInfo, IOStats};
use super::IOStats;
/// returns total and free bytes available in a directory, e.g. `/`.
pub fn get_info(p: impl AsRef<Path>) -> std::io::Result<Info> {
/// Returns total and free bytes available in a directory, e.g. `/`.
pub fn get_info(p: impl AsRef<Path>) -> std::io::Result<DiskInfo> {
let stat_fs = statfs(p.as_ref())?;
let bsize = stat_fs.block_size() as u64;
@@ -21,30 +18,24 @@ pub fn get_info(p: impl AsRef<Path>) -> std::io::Result<Info> {
let reserved = match bfree.checked_sub(bavail) {
Some(reserved) => reserved,
None => {
return Err(Error::new(
ErrorKind::Other,
format!(
"detected f_bavail space ({}) > f_bfree space ({}), fs corruption at ({}). please run 'fsck'",
bavail,
bfree,
p.as_ref().display()
),
))
return Err(Error::other(format!(
"detected f_bavail space ({}) > f_bfree space ({}), fs corruption at ({}). please run 'fsck'",
bavail,
bfree,
p.as_ref().display()
)));
}
};
let total = match blocks.checked_sub(reserved) {
Some(total) => total * bsize,
None => {
return Err(Error::new(
ErrorKind::Other,
format!(
"detected reserved space ({}) > blocks space ({}), fs corruption at ({}). please run 'fsck'",
reserved,
blocks,
p.as_ref().display()
),
))
return Err(Error::other(format!(
"detected reserved space ({}) > blocks space ({}), fs corruption at ({}). please run 'fsck'",
reserved,
blocks,
p.as_ref().display()
)));
}
};
@@ -52,36 +43,31 @@ pub fn get_info(p: impl AsRef<Path>) -> std::io::Result<Info> {
let used = match total.checked_sub(free) {
Some(used) => used,
None => {
return Err(Error::new(
ErrorKind::Other,
format!(
"detected free space ({}) > total drive space ({}), fs corruption at ({}). please run 'fsck'",
free,
total,
p.as_ref().display()
),
))
return Err(Error::other(format!(
"detected free space ({}) > total drive space ({}), fs corruption at ({}). please run 'fsck'",
free,
total,
p.as_ref().display()
)));
}
};
let st = stat(p.as_ref())?;
Ok(Info {
Ok(DiskInfo {
total,
free,
used,
files: stat_fs.files(),
ffree: stat_fs.files_free(),
fstype: get_fs_type(stat_fs.filesystem_type()).to_string(),
major: stat::major(st.st_dev),
minor: stat::minor(st.st_dev),
..Default::default()
})
}
/// returns the filesystem type of the underlying mounted filesystem
/// Returns the filesystem type of the underlying mounted filesystem
///
/// TODO The following mapping could not find the corresponding constant in `nix`:
///
@@ -103,26 +89,28 @@ fn get_fs_type(fs_type: FsType) -> &'static str {
statfs::ECRYPTFS_SUPER_MAGIC => "ecryptfs",
statfs::OVERLAYFS_SUPER_MAGIC => "overlayfs",
statfs::REISERFS_SUPER_MAGIC => "REISERFS",
_ => "UNKNOWN",
}
}
pub fn same_disk(disk1: &str, disk2: &str) -> Result<bool> {
pub fn same_disk(disk1: &str, disk2: &str) -> std::io::Result<bool> {
let stat1 = stat(disk1)?;
let stat2 = stat(disk2)?;
Ok(stat1.st_dev == stat2.st_dev)
}
pub fn get_drive_stats(major: u32, minor: u32) -> Result<IOStats> {
pub fn get_drive_stats(major: u32, minor: u32) -> std::io::Result<IOStats> {
read_drive_stats(&format!("/sys/dev/block/{}:{}/stat", major, minor))
}
fn read_drive_stats(stats_file: &str) -> Result<IOStats> {
fn read_drive_stats(stats_file: &str) -> std::io::Result<IOStats> {
let stats = read_stat(stats_file)?;
if stats.len() < 11 {
return Err(e_Error::from_string(format!("found invalid format while reading {}", stats_file)));
return Err(Error::new(
ErrorKind::InvalidData,
format!("found invalid format while reading {}", stats_file),
));
}
let mut io_stats = IOStats {
read_ios: stats[0],
@@ -148,22 +136,24 @@ fn read_drive_stats(stats_file: &str) -> Result<IOStats> {
Ok(io_stats)
}
fn read_stat(file_name: &str) -> Result<Vec<u64>> {
// 打开文件
fn read_stat(file_name: &str) -> std::io::Result<Vec<u64>> {
// Open file
let path = Path::new(file_name);
let file = File::open(path)?;
// 创建一个 BufReader
// Create a BufReader
let reader = io::BufReader::new(file);
// 读取第一行
// Read first line
let mut stats = Vec::new();
if let Some(line) = reader.lines().next() {
let line = line?;
// 分割行并解析为 u64
// Split line and parse as u64
// https://rust-lang.github.io/rust-clippy/master/index.html#trim_split_whitespace
for token in line.split_whitespace() {
let ui64: u64 = token.parse()?;
let ui64: u64 = token
.parse()
.map_err(|e| Error::new(ErrorKind::InvalidData, format!("failed to parse '{}' as u64: {}", token, e)))?;
stats.push(ui64);
}
}

111
crates/utils/src/os/mod.rs Normal file

@@ -0,0 +1,111 @@
#[cfg(target_os = "linux")]
mod linux;
#[cfg(all(unix, not(target_os = "linux")))]
mod unix;
#[cfg(target_os = "windows")]
mod windows;
#[cfg(target_os = "linux")]
pub use linux::{get_drive_stats, get_info, same_disk};
// pub use linux::same_disk;
#[cfg(all(unix, not(target_os = "linux")))]
pub use unix::{get_drive_stats, get_info, same_disk};
#[cfg(target_os = "windows")]
pub use windows::{get_drive_stats, get_info, same_disk};
#[derive(Debug, Default, PartialEq)]
pub struct IOStats {
pub read_ios: u64,
pub read_merges: u64,
pub read_sectors: u64,
pub read_ticks: u64,
pub write_ios: u64,
pub write_merges: u64,
pub write_sectors: u64,
pub write_ticks: u64,
pub current_ios: u64,
pub total_ticks: u64,
pub req_ticks: u64,
pub discard_ios: u64,
pub discard_merges: u64,
pub discard_sectors: u64,
pub discard_ticks: u64,
pub flush_ios: u64,
pub flush_ticks: u64,
}
#[derive(Debug, Default, PartialEq)]
pub struct DiskInfo {
pub total: u64,
pub free: u64,
pub used: u64,
pub files: u64,
pub ffree: u64,
pub fstype: String,
pub major: u64,
pub minor: u64,
pub name: String,
pub rotational: bool,
pub nrrequests: u64,
}
#[cfg(test)]
mod tests {
use super::*;
use std::path::PathBuf;
#[test]
fn test_get_info_valid_path() {
let temp_dir = tempfile::tempdir().unwrap();
let info = get_info(temp_dir.path()).unwrap();
println!("Disk Info: {:?}", info);
assert!(info.total > 0);
assert!(info.free > 0);
assert!(info.used > 0);
assert!(info.files > 0);
assert!(info.ffree > 0);
assert!(!info.fstype.is_empty());
}
#[test]
fn test_get_info_invalid_path() {
let invalid_path = PathBuf::from("/invalid/path");
let result = get_info(&invalid_path);
assert!(result.is_err());
}
#[test]
fn test_same_disk_same_path() {
let temp_dir = tempfile::tempdir().unwrap();
let path = temp_dir.path().to_str().unwrap();
let result = same_disk(path, path).unwrap();
assert!(result);
}
#[test]
fn test_same_disk_different_paths() {
let temp_dir1 = tempfile::tempdir().unwrap();
let temp_dir2 = tempfile::tempdir().unwrap();
let path1 = temp_dir1.path().to_str().unwrap();
let path2 = temp_dir2.path().to_str().unwrap();
let result = same_disk(path1, path2).unwrap();
// Since both temporary directories are created in the same file system,
// they should be on the same disk in most cases
println!("Path1: {}, Path2: {}, Same disk: {}", path1, path2, result);
// Test passes if the function doesn't panic - the actual result depends on test environment
}
#[ignore] // FIXME: failed in github actions
#[test]
fn test_get_drive_stats_default() {
let stats = get_drive_stats(0, 0).unwrap();
assert_eq!(stats, IOStats::default());
}
}


@@ -1,12 +1,10 @@
use super::IOStats;
use crate::disk::Info;
use common::error::Result;
use super::{DiskInfo, IOStats};
use nix::sys::{stat::stat, statfs::statfs};
use std::io::Error;
use std::path::Path;
/// returns total and free bytes available in a directory, e.g. `/`.
pub fn get_info(p: impl AsRef<Path>) -> std::io::Result<Info> {
/// Returns total and free bytes available in a directory, e.g. `/`.
pub fn get_info(p: impl AsRef<Path>) -> std::io::Result<DiskInfo> {
let stat = statfs(p.as_ref())?;
let bsize = stat.block_size() as u64;
@@ -18,11 +16,11 @@ pub fn get_info(p: impl AsRef<Path>) -> std::io::Result<Info> {
Some(reserved) => reserved,
None => {
return Err(Error::other(format!(
"detected f_bavail space ({}) > f_bfree space ({}), fs corruption at ({}). please run fsck",
"detected f_bavail space ({}) > f_bfree space ({}), fs corruption at ({}). please run 'fsck'",
bavail,
bfree,
p.as_ref().display()
)))
)));
}
};
@@ -30,11 +28,11 @@ pub fn get_info(p: impl AsRef<Path>) -> std::io::Result<Info> {
Some(total) => total * bsize,
None => {
return Err(Error::other(format!(
"detected reserved space ({}) > blocks space ({}), fs corruption at ({}). please run fsck",
"detected reserved space ({}) > blocks space ({}), fs corruption at ({}). please run 'fsck'",
reserved,
blocks,
p.as_ref().display()
)))
)));
}
};
@@ -43,15 +41,15 @@ pub fn get_info(p: impl AsRef<Path>) -> std::io::Result<Info> {
Some(used) => used,
None => {
return Err(Error::other(format!(
"detected free space ({}) > total drive space ({}), fs corruption at ({}). please run fsck",
"detected free space ({}) > total drive space ({}), fs corruption at ({}). please run 'fsck'",
free,
total,
p.as_ref().display()
)))
)));
}
};
Ok(Info {
Ok(DiskInfo {
total,
free,
used,
@@ -62,13 +60,13 @@ pub fn get_info(p: impl AsRef<Path>) -> std::io::Result<Info> {
})
}
pub fn same_disk(disk1: &str, disk2: &str) -> Result<bool> {
pub fn same_disk(disk1: &str, disk2: &str) -> std::io::Result<bool> {
let stat1 = stat(disk1)?;
let stat2 = stat(disk2)?;
Ok(stat1.st_dev == stat2.st_dev)
}
pub fn get_drive_stats(_major: u32, _minor: u32) -> Result<IOStats> {
pub fn get_drive_stats(_major: u32, _minor: u32) -> std::io::Result<IOStats> {
Ok(IOStats::default())
}


@@ -1,8 +1,6 @@
#![allow(unsafe_code)] // TODO: audit unsafe code
use super::IOStats;
use crate::disk::Info;
use common::error::Result;
use super::{DiskInfo, IOStats};
use std::io::{Error, ErrorKind};
use std::mem;
use std::os::windows::ffi::OsStrExt;
@@ -12,8 +10,8 @@ use winapi::shared::ntdef::ULARGE_INTEGER;
use winapi::um::fileapi::{GetDiskFreeSpaceExW, GetDiskFreeSpaceW, GetVolumeInformationW, GetVolumePathNameW};
use winapi::um::winnt::{LPCWSTR, WCHAR};
/// returns total and free bytes available in a directory, e.g. `C:\`.
pub fn get_info(p: impl AsRef<Path>) -> Result<Info> {
/// Returns total and free bytes available in a directory, e.g. `C:\`.
pub fn get_info(p: impl AsRef<Path>) -> std::io::Result<DiskInfo> {
let path_wide: Vec<WCHAR> = p
.as_ref()
.canonicalize()?
@@ -35,7 +33,7 @@ pub fn get_info(p: impl AsRef<Path>) -> Result<Info> {
)
};
if success == 0 {
return Err(Error::last_os_error().into());
return Err(Error::last_os_error());
}
let total = unsafe { *lp_total_number_of_bytes.QuadPart() };
@@ -50,8 +48,7 @@ pub fn get_info(p: impl AsRef<Path>) -> Result<Info> {
total,
p.as_ref().display()
),
)
.into());
));
}
let mut lp_sectors_per_cluster: DWORD = 0;
@@ -69,10 +66,10 @@ pub fn get_info(p: impl AsRef<Path>) -> Result<Info> {
)
};
if success == 0 {
return Err(Error::last_os_error().into());
return Err(Error::last_os_error());
}
Ok(Info {
Ok(DiskInfo {
total,
free,
used: total - free,
@@ -83,15 +80,15 @@ pub fn get_info(p: impl AsRef<Path>) -> Result<Info> {
})
}
/// returns leading volume name.
fn get_volume_name(v: &[WCHAR]) -> Result<LPCWSTR> {
/// Returns leading volume name.
fn get_volume_name(v: &[WCHAR]) -> std::io::Result<LPCWSTR> {
let volume_name_size: DWORD = MAX_PATH as _;
let mut lp_volume_name_buffer: [WCHAR; MAX_PATH] = [0; MAX_PATH];
let success = unsafe { GetVolumePathNameW(v.as_ptr(), lp_volume_name_buffer.as_mut_ptr(), volume_name_size) };
if success == 0 {
return Err(Error::last_os_error().into());
return Err(Error::last_os_error());
}
Ok(lp_volume_name_buffer.as_ptr())
@@ -102,8 +99,8 @@ fn utf16_to_string(v: &[WCHAR]) -> String {
String::from_utf16_lossy(&v[..len])
}
/// returns the filesystem type of the underlying mounted filesystem
fn get_fs_type(p: &[WCHAR]) -> Result<String> {
/// Returns the filesystem type of the underlying mounted filesystem
fn get_fs_type(p: &[WCHAR]) -> std::io::Result<String> {
let path = get_volume_name(p)?;
let volume_name_size: DWORD = MAX_PATH as _;
@@ -130,16 +127,16 @@ fn get_fs_type(p: &[WCHAR]) -> Result<String> {
};
if success == 0 {
return Err(Error::last_os_error().into());
return Err(Error::last_os_error());
}
Ok(utf16_to_string(&lp_file_system_name_buffer))
}
pub fn same_disk(_add_extensiondisk1: &str, _disk2: &str) -> Result<bool> {
pub fn same_disk(_disk1: &str, _disk2: &str) -> std::io::Result<bool> {
Ok(false)
}
pub fn get_drive_stats(_major: u32, _minor: u32) -> Result<IOStats> {
pub fn get_drive_stats(_major: u32, _minor: u32) -> std::io::Result<IOStats> {
Ok(IOStats::default())
}


@@ -1,6 +1,105 @@
use common::error::{Error, Result};
use lazy_static::*;
use regex::Regex;
use std::io::{Error, Result};
pub fn parse_bool(str: &str) -> Result<bool> {
match str {
"1" | "t" | "T" | "true" | "TRUE" | "True" | "on" | "ON" | "On" | "enabled" => Ok(true),
"0" | "f" | "F" | "false" | "FALSE" | "False" | "off" | "OFF" | "Off" | "disabled" => Ok(false),
_ => Err(Error::other(format!("ParseBool: parsing {}", str))),
}
}
pub fn match_simple(pattern: &str, name: &str) -> bool {
if pattern.is_empty() {
return name == pattern;
}
if pattern == "*" {
return true;
}
// Do an extended wildcard '*' and '?' match.
deep_match_rune(name.as_bytes(), pattern.as_bytes(), true)
}
pub fn match_pattern(pattern: &str, name: &str) -> bool {
if pattern.is_empty() {
return name == pattern;
}
if pattern == "*" {
return true;
}
// Do an extended wildcard '*' and '?' match.
deep_match_rune(name.as_bytes(), pattern.as_bytes(), false)
}
pub fn has_pattern(patterns: &[&str], match_str: &str) -> bool {
for pattern in patterns {
if match_simple(pattern, match_str) {
return true;
}
}
false
}
pub fn has_string_suffix_in_slice(str: &str, list: &[&str]) -> bool {
let str = str.to_lowercase();
for v in list {
if *v == "*" {
return true;
}
if str.ends_with(&v.to_lowercase()) {
return true;
}
}
false
}
fn deep_match_rune(str_: &[u8], pattern: &[u8], simple: bool) -> bool {
let (mut str_, mut pattern) = (str_, pattern);
while !pattern.is_empty() {
match pattern[0] as char {
'*' => {
return if pattern.len() == 1 {
true
} else {
deep_match_rune(str_, &pattern[1..], simple)
|| (!str_.is_empty() && deep_match_rune(&str_[1..], pattern, simple))
};
}
'?' => {
if str_.is_empty() {
return simple;
}
}
_ => {
if str_.is_empty() || str_[0] != pattern[0] {
return false;
}
}
}
str_ = &str_[1..];
pattern = &pattern[1..];
}
str_.is_empty() && pattern.is_empty()
}
pub fn match_as_pattern_prefix(pattern: &str, text: &str) -> bool {
let mut i = 0;
while i < text.len() && i < pattern.len() {
match pattern.as_bytes()[i] as char {
'*' => return true,
'?' => i += 1,
_ => {
if pattern.as_bytes()[i] != text.as_bytes()[i] {
return false;
}
}
}
i += 1;
}
text.len() <= pattern.len()
}
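// Illustrative example (not part of the original change): match_simple and match_pattern
// differ only in how a trailing '?' is treated once the input is exhausted -- the "simple"
// variant lets it match nothing, while the strict variant requires one more byte.
#[test]
fn example_wildcard_matching_simple_vs_strict() {
    assert!(match_simple("abc?", "abc")); // lenient: trailing '?' may match nothing
    assert!(!match_pattern("abc?", "abc")); // strict: '?' must consume exactly one byte
    assert!(match_simple("a*c", "abbbc")); // '*' matches any run of bytes
    assert!(match_pattern("a*c", "abbbc"));
    assert!(match_as_pattern_prefix("rack-{1...4}", "rack-")); // "rack-" is a literal prefix of the pattern
}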
lazy_static! {
static ref ELLIPSES_RE: Regex = Regex::new(r"(.*)(\{[0-9a-z]*\.\.\.[0-9a-z]*\})(.*)").unwrap();
@@ -15,9 +114,9 @@ const ELLIPSES: &str = "...";
/// associated prefix and suffixes.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Pattern {
pub(crate) prefix: String,
pub(crate) suffix: String,
pub(crate) seq: Vec<String>,
pub prefix: String,
pub suffix: String,
pub seq: Vec<String>,
}
impl Pattern {
@@ -107,17 +206,20 @@ pub fn find_ellipses_patterns(arg: &str) -> Result<ArgPattern> {
let mut parts = match ELLIPSES_RE.captures(arg) {
Some(caps) => caps,
None => {
return Err(Error::from_string(format!("Invalid ellipsis format in ({}), Ellipsis range must be provided in format {{N...M}} where N and M are positive integers, M must be greater than N, with an allowed minimum range of 4", arg)));
return Err(Error::other(format!(
"Invalid ellipsis format in ({}), Ellipsis range must be provided in format {{N...M}} where N and M are positive integers, M must be greater than N, with an allowed minimum range of 4",
arg
)));
}
};
let mut pattens = Vec::new();
let mut patterns = Vec::new();
while let Some(prefix) = parts.get(1) {
let seq = parse_ellipses_range(parts[2].into())?;
match ELLIPSES_RE.captures(prefix.into()) {
Some(cs) => {
pattens.push(Pattern {
patterns.push(Pattern {
seq,
prefix: String::new(),
suffix: parts[3].into(),
@@ -125,7 +227,7 @@ pub fn find_ellipses_patterns(arg: &str) -> Result<ArgPattern> {
parts = cs;
}
None => {
pattens.push(Pattern {
patterns.push(Pattern {
seq,
prefix: prefix.as_str().to_owned(),
suffix: parts[3].into(),
@@ -138,17 +240,20 @@ pub fn find_ellipses_patterns(arg: &str) -> Result<ArgPattern> {
// Check if any of the prefix or suffixes now have flower braces
// left over, in such a case we generally think that there is
// perhaps a typo in users input and error out accordingly.
for p in pattens.iter() {
for p in patterns.iter() {
if p.prefix.contains(OPEN_BRACES)
|| p.prefix.contains(CLOSE_BRACES)
|| p.suffix.contains(OPEN_BRACES)
|| p.suffix.contains(CLOSE_BRACES)
{
return Err(Error::from_string(format!("Invalid ellipsis format in ({}), Ellipsis range must be provided in format {{N...M}} where N and M are positive integers, M must be greater than N, with an allowed minimum range of 4", arg)));
return Err(Error::other(format!(
"Invalid ellipsis format in ({}), Ellipsis range must be provided in format {{N...M}} where N and M are positive integers, M must be greater than N, with an allowed minimum range of 4",
arg
)));
}
}
Ok(ArgPattern::new(pattens))
Ok(ArgPattern::new(patterns))
}
/// returns true if input arg has ellipses type pattern.
@@ -165,10 +270,10 @@ pub fn has_ellipses<T: AsRef<str>>(s: &[T]) -> bool {
/// {33...64}
pub fn parse_ellipses_range(pattern: &str) -> Result<Vec<String>> {
if !pattern.contains(OPEN_BRACES) {
return Err(Error::from_string("Invalid argument"));
return Err(Error::other("Invalid argument"));
}
if !pattern.contains(OPEN_BRACES) {
return Err(Error::from_string("Invalid argument"));
if !pattern.contains(CLOSE_BRACES) {
return Err(Error::other("Invalid argument"));
}
let ellipses_range: Vec<&str> = pattern
@@ -178,15 +283,15 @@ pub fn parse_ellipses_range(pattern: &str) -> Result<Vec<String>> {
.collect();
if ellipses_range.len() != 2 {
return Err(Error::from_string("Invalid argument"));
return Err(Error::other("Invalid argument"));
}
// TODO: Add support for hexadecimals.
let start = ellipses_range[0].parse::<usize>()?;
let end = ellipses_range[1].parse::<usize>()?;
let start = ellipses_range[0].parse::<usize>().map_err(Error::other)?;
let end = ellipses_range[1].parse::<usize>().map_err(Error::other)?;
if start > end {
return Err(Error::from_string("Invalid argument:range start cannot be bigger than end"));
return Err(Error::other("Invalid argument:range start cannot be bigger than end"));
}
let mut ret: Vec<String> = Vec::with_capacity(end - start + 1);


@@ -608,8 +608,8 @@ mod tests {
#[tokio::test]
async fn test_decompress_with_invalid_format() {
// Test decompression with invalid format
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
let sample_content = b"Hello, compression world!";
let cursor = Cursor::new(sample_content);
@@ -634,8 +634,8 @@ mod tests {
#[tokio::test]
async fn test_decompress_with_zip_format() {
// Test decompression with Zip format (currently not supported)
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
let sample_content = b"Hello, compression world!";
let cursor = Cursor::new(sample_content);
@@ -660,8 +660,8 @@ mod tests {
#[tokio::test]
async fn test_decompress_error_propagation() {
// Test error propagation during decompression process
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
let sample_content = b"Hello, compression world!";
let cursor = Cursor::new(sample_content);
@@ -690,8 +690,8 @@ mod tests {
#[tokio::test]
async fn test_decompress_callback_execution() {
// Test callback function execution during decompression
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
let sample_content = b"Hello, compression world!";
let cursor = Cursor::new(sample_content);


@@ -1,7 +1,7 @@
use jsonwebtoken::{Algorithm, DecodingKey, TokenData, Validation};
use crate::jwt::Claims;
use crate::Error;
use crate::jwt::Claims;
pub fn decode(token: &str, token_secret: &[u8]) -> Result<TokenData<Claims>, Error> {
Ok(jsonwebtoken::decode(


@@ -1,7 +1,7 @@
use jsonwebtoken::{Algorithm, EncodingKey, Header};
use crate::jwt::Claims;
use crate::Error;
use crate::jwt::Claims;
pub fn encode(token_secret: &[u8], claims: &Claims) -> Result<String, Error> {
Ok(jsonwebtoken::encode(

221
docker-compose.yml Normal file

@@ -0,0 +1,221 @@
version: '3.8'
services:
# RustFS main service
rustfs:
image: rustfs/rustfs:latest
container_name: rustfs-server
build:
context: .
dockerfile: Dockerfile.multi-stage
args:
TARGETPLATFORM: linux/amd64
ports:
- "9000:9000" # S3 API port
- "9001:9001" # Console port
environment:
- RUSTFS_VOLUMES=/data/rustfs0,/data/rustfs1,/data/rustfs2,/data/rustfs3
- RUSTFS_ADDRESS=0.0.0.0:9000
- RUSTFS_CONSOLE_ENABLE=true
- RUSTFS_CONSOLE_ADDRESS=0.0.0.0:9001
- RUSTFS_ACCESS_KEY=rustfsadmin
- RUSTFS_SECRET_KEY=rustfsadmin
- RUSTFS_LOG_LEVEL=info
- RUSTFS_OBS_ENDPOINT=http://otel-collector:4317
volumes:
- rustfs_data_0:/data/rustfs0
- rustfs_data_1:/data/rustfs1
- rustfs_data_2:/data/rustfs2
- rustfs_data_3:/data/rustfs3
- ./logs:/app/logs
networks:
- rustfs-network
restart: unless-stopped
healthcheck:
test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:9000/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
depends_on:
- otel-collector
# Development environment
rustfs-dev:
image: rustfs/rustfs:devenv
container_name: rustfs-dev
build:
context: .
dockerfile: .docker/Dockerfile.devenv
ports:
- "9010:9000"
- "9011:9001"
environment:
- RUSTFS_VOLUMES=/data/rustfs0,/data/rustfs1
- RUSTFS_ADDRESS=0.0.0.0:9000
- RUSTFS_CONSOLE_ENABLE=true
- RUSTFS_CONSOLE_ADDRESS=0.0.0.0:9001
- RUSTFS_ACCESS_KEY=devadmin
- RUSTFS_SECRET_KEY=devadmin
- RUSTFS_LOG_LEVEL=debug
volumes:
- .:/root/s3-rustfs
- rustfs_dev_data:/data
networks:
- rustfs-network
restart: unless-stopped
profiles:
- dev
# OpenTelemetry Collector
otel-collector:
image: otel/opentelemetry-collector-contrib:latest
container_name: otel-collector
command:
- --config=/etc/otelcol-contrib/otel-collector.yml
volumes:
- ./.docker/observability/otel-collector.yml:/etc/otelcol-contrib/otel-collector.yml:ro
ports:
- "4317:4317" # OTLP gRPC receiver
- "4318:4318" # OTLP HTTP receiver
- "8888:8888" # Prometheus metrics
- "8889:8889" # Prometheus exporter metrics
networks:
- rustfs-network
restart: unless-stopped
profiles:
- observability
# Jaeger for tracing
jaeger:
image: jaegertracing/all-in-one:latest
container_name: jaeger
ports:
- "16686:16686" # Jaeger UI
- "14250:14250" # Jaeger gRPC
environment:
- COLLECTOR_OTLP_ENABLED=true
networks:
- rustfs-network
restart: unless-stopped
profiles:
- observability
# Prometheus for metrics
prometheus:
image: prom/prometheus:latest
container_name: prometheus
ports:
- "9090:9090"
volumes:
- ./.docker/observability/prometheus.yml:/etc/prometheus/prometheus.yml:ro
- prometheus_data:/prometheus
command:
- '--config.file=/etc/prometheus/prometheus.yml'
- '--storage.tsdb.path=/prometheus'
- '--web.console.libraries=/etc/prometheus/console_libraries'
- '--web.console.templates=/etc/prometheus/consoles'
- '--storage.tsdb.retention.time=200h'
- '--web.enable-lifecycle'
networks:
- rustfs-network
restart: unless-stopped
profiles:
- observability
# Grafana for visualization
grafana:
image: grafana/grafana:latest
container_name: grafana
ports:
- "3000:3000"
environment:
- GF_SECURITY_ADMIN_USER=admin
- GF_SECURITY_ADMIN_PASSWORD=admin
volumes:
- grafana_data:/var/lib/grafana
- ./.docker/observability/grafana/provisioning:/etc/grafana/provisioning:ro
- ./.docker/observability/grafana/dashboards:/var/lib/grafana/dashboards:ro
networks:
- rustfs-network
restart: unless-stopped
profiles:
- observability
# MinIO for S3 API testing
minio:
image: minio/minio:latest
container_name: minio-test
ports:
- "9020:9000"
- "9021:9001"
environment:
- MINIO_ROOT_USER=minioadmin
- MINIO_ROOT_PASSWORD=minioadmin
volumes:
- minio_data:/data
command: server /data --console-address ":9001"
networks:
- rustfs-network
restart: unless-stopped
profiles:
- testing
# Redis for caching (optional)
redis:
image: redis:7-alpine
container_name: redis
ports:
- "6379:6379"
volumes:
- redis_data:/data
networks:
- rustfs-network
restart: unless-stopped
profiles:
- cache
# NGINX reverse proxy (optional)
nginx:
image: nginx:alpine
container_name: nginx-proxy
ports:
- "80:80"
- "443:443"
volumes:
- ./.docker/nginx/nginx.conf:/etc/nginx/nginx.conf:ro
- ./.docker/nginx/ssl:/etc/nginx/ssl:ro
networks:
- rustfs-network
restart: unless-stopped
profiles:
- proxy
depends_on:
- rustfs
networks:
rustfs-network:
driver: bridge
ipam:
config:
- subnet: 172.20.0.0/16
volumes:
rustfs_data_0:
driver: local
rustfs_data_1:
driver: local
rustfs_data_2:
driver: local
rustfs_data_3:
driver: local
rustfs_dev_data:
driver: local
prometheus_data:
driver: local
grafana_data:
driver: local
minio_data:
driver: local
redis_data:
driver: local

530
docs/docker-build.md Normal file

@@ -0,0 +1,530 @@
# RustFS Docker Build and Deployment Guide
This document describes how to build and deploy RustFS using Docker, including the automated GitHub Actions workflow for building and pushing images to Docker Hub and GitHub Container Registry.
## 🚀 Quick Start
### Using Pre-built Images
```bash
# Pull and run the latest RustFS image
docker run -d \
--name rustfs \
-p 9000:9000 \
-p 9001:9001 \
-v rustfs_data:/data \
-e RUSTFS_VOLUMES=/data/rustfs0,/data/rustfs1,/data/rustfs2,/data/rustfs3 \
-e RUSTFS_ACCESS_KEY=rustfsadmin \
-e RUSTFS_SECRET_KEY=rustfsadmin \
-e RUSTFS_CONSOLE_ENABLE=true \
rustfs/rustfs:latest
```
### Using Docker Compose
```bash
# Basic deployment
docker-compose up -d
# Development environment
docker-compose --profile dev up -d
# With observability stack
docker-compose --profile observability up -d
# Full stack with all services
docker-compose --profile dev --profile observability --profile testing up -d
```
## 📦 Available Images
Our GitHub Actions workflow builds multiple image variants:
### Image Registries
- **Docker Hub**: `rustfs/rustfs`
- **GitHub Container Registry**: `ghcr.io/rustfs/s3-rustfs`
### Image Variants
| Variant | Tag Suffix | Description | Use Case |
|---------|------------|-------------|----------|
| Production | *(none)* | Minimal Ubuntu-based runtime | Production deployment |
| Ubuntu | `-ubuntu22.04` | Ubuntu 22.04 based build environment | Development/Testing |
| Rocky Linux | `-rockylinux9.3` | Rocky Linux 9.3 based build environment | Enterprise environments |
| Development | `-devenv` | Full development environment | Development/Debugging |
### Supported Architectures
All images support multi-architecture:
- `linux/amd64` (x86_64-unknown-linux-musl)
- `linux/arm64` (aarch64-unknown-linux-gnu)
### Tag Examples
```bash
# Latest production image
rustfs/rustfs:latest
rustfs/rustfs:main
# Specific version
rustfs/rustfs:v1.0.0
rustfs/rustfs:v1.0.0-ubuntu22.04
# Development environment
rustfs/rustfs:latest-devenv
rustfs/rustfs:main-devenv
```
## 🔧 GitHub Actions Workflow
The Docker build workflow (`.github/workflows/docker.yml`) automatically:
1. **Builds cross-platform binaries** for `amd64` and `arm64`
2. **Creates Docker images** for all variants
3. **Pushes to registries** (Docker Hub and GitHub Container Registry)
4. **Creates multi-arch manifests** for seamless platform selection
5. **Performs security scanning** using Trivy
### Cross-Compilation Strategy
To handle complex native dependencies, we use different compilation strategies:
- **x86_64**: Native compilation with `x86_64-unknown-linux-musl` for static linking
- **aarch64**: Cross-compilation with `aarch64-unknown-linux-gnu` using the `cross` tool
This approach ensures compatibility with various C libraries while maintaining performance.
### Workflow Triggers
- **Push to main branch**: Builds and pushes `main` and `latest` tags
- **Tag push** (`v*`): Builds and pushes version tags
- **Pull requests**: Builds images without pushing
- **Manual trigger**: Workflow dispatch with options
### Required Secrets
Configure these secrets in your GitHub repository:
```bash
# Docker Hub credentials
DOCKERHUB_USERNAME=your-dockerhub-username
DOCKERHUB_TOKEN=your-dockerhub-access-token
# GitHub token is automatically available
GITHUB_TOKEN=automatically-provided
```
## 🏗️ Building Locally
### Prerequisites
- Docker with BuildKit enabled
- Rust toolchain (1.85+)
- Protocol Buffers compiler (protoc 31.1+)
- FlatBuffers compiler (flatc 25.2.10+)
- `cross` tool for ARM64 compilation
### Installation Commands
```bash
# Install Rust targets
rustup target add x86_64-unknown-linux-musl
rustup target add aarch64-unknown-linux-gnu
# Install cross for ARM64 compilation
cargo install cross --git https://github.com/cross-rs/cross
# Install protoc (macOS)
brew install protobuf
# Install protoc (Ubuntu)
sudo apt-get install protobuf-compiler
# Install flatc
# Download from: https://github.com/google/flatbuffers/releases
```
### Build Commands
```bash
# Test cross-compilation setup
./scripts/test-cross-build.sh
# Build production image for local platform
docker build -t rustfs:local .
# Build multi-stage production image
docker build -f Dockerfile.multi-stage -t rustfs:multi-stage .
# Build specific variant
docker build -f .docker/Dockerfile.ubuntu22.04 -t rustfs:ubuntu .
# Build for specific platform
docker build --platform linux/amd64 -t rustfs:amd64 .
docker build --platform linux/arm64 -t rustfs:arm64 .
# Build multi-platform image
docker buildx build --platform linux/amd64,linux/arm64 -t rustfs:multi .
```
### Cross-Compilation
```bash
# Generate protobuf code first
cargo run --bin gproto
# Native x86_64 build
cargo build --release --target x86_64-unknown-linux-musl --bin rustfs
# Cross-compile for ARM64
cross build --release --target aarch64-unknown-linux-gnu --bin rustfs
```
### Build with Docker Compose
```bash
# Build all services
docker-compose build
# Build specific service
docker-compose build rustfs
# Build development environment
docker-compose build rustfs-dev
```
## 🚀 Deployment Options
### 1. Single Container
```bash
docker run -d \
--name rustfs \
--restart unless-stopped \
-p 9000:9000 \
-p 9001:9001 \
-v /data/rustfs:/data \
-e RUSTFS_VOLUMES=/data/rustfs0,/data/rustfs1,/data/rustfs2,/data/rustfs3 \
-e RUSTFS_ADDRESS=0.0.0.0:9000 \
-e RUSTFS_CONSOLE_ENABLE=true \
-e RUSTFS_CONSOLE_ADDRESS=0.0.0.0:9001 \
-e RUSTFS_ACCESS_KEY=rustfsadmin \
-e RUSTFS_SECRET_KEY=rustfsadmin \
rustfs/rustfs:latest
```
### 2. Docker Compose Profiles
```bash
# Production deployment
docker-compose up -d
# Development with debugging
docker-compose --profile dev up -d
# With monitoring stack
docker-compose --profile observability up -d
# Complete testing environment
docker-compose --profile dev --profile observability --profile testing up -d
```
### 3. Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: rustfs
spec:
replicas: 3
selector:
matchLabels:
app: rustfs
template:
metadata:
labels:
app: rustfs
spec:
containers:
- name: rustfs
image: rustfs/rustfs:latest
ports:
- containerPort: 9000
- containerPort: 9001
env:
- name: RUSTFS_VOLUMES
value: "/data/rustfs0,/data/rustfs1,/data/rustfs2,/data/rustfs3"
- name: RUSTFS_ADDRESS
value: "0.0.0.0:9000"
- name: RUSTFS_CONSOLE_ENABLE
value: "true"
- name: RUSTFS_CONSOLE_ADDRESS
value: "0.0.0.0:9001"
volumeMounts:
- name: data
mountPath: /data
volumes:
- name: data
persistentVolumeClaim:
claimName: rustfs-data
```
## ⚙️ Configuration
### Environment Variables
| Variable | Description | Default |
|----------|-------------|---------|
| `RUSTFS_VOLUMES` | Comma-separated list of data volumes | Required |
| `RUSTFS_ADDRESS` | Server bind address | `0.0.0.0:9000` |
| `RUSTFS_CONSOLE_ENABLE` | Enable web console | `false` |
| `RUSTFS_CONSOLE_ADDRESS` | Console bind address | `0.0.0.0:9001` |
| `RUSTFS_ACCESS_KEY` | S3 access key | `rustfsadmin` |
| `RUSTFS_SECRET_KEY` | S3 secret key | `rustfsadmin` |
| `RUSTFS_LOG_LEVEL` | Log level | `info` |
| `RUSTFS_OBS_ENDPOINT` | Observability endpoint | `""` |
| `RUSTFS_TLS_PATH` | TLS certificates path | `""` |
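The same variables can be collected in an env file and passed to the container with a single flag (values below are examples only):
```bash
# rustfs.env - example values
cat > rustfs.env <<'EOF'
RUSTFS_VOLUMES=/data/rustfs0,/data/rustfs1,/data/rustfs2,/data/rustfs3
RUSTFS_ADDRESS=0.0.0.0:9000
RUSTFS_CONSOLE_ENABLE=true
RUSTFS_LOG_LEVEL=debug
EOF

docker run -d --env-file rustfs.env -p 9000:9000 -v /data/rustfs:/data rustfs/rustfs:latest
```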
### Volume Mounts
- **Data volumes**: `/data/rustfs{0,1,2,3}` - RustFS data storage
- **Logs**: `/app/logs` - Application logs
- **Config**: `/etc/rustfs/` - Configuration files
- **TLS**: `/etc/ssl/rustfs/` - TLS certificates
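For example, configuration and TLS material can be mounted read-only alongside the data volume (host paths are illustrative):
```bash
docker run -d \
  -v /data/rustfs:/data \
  -v /srv/rustfs/config:/etc/rustfs:ro \
  -v /srv/rustfs/certs:/etc/ssl/rustfs:ro \
  rustfs/rustfs:latest
```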
### Ports
- **9000**: S3 API endpoint
- **9001**: Web console (if enabled)
- **9002**: Admin API (if enabled)
- **50051**: gRPC API (if enabled)
## 🔍 Monitoring and Observability
### Health Checks
The Docker images include built-in health checks:
```bash
# Check container health
docker ps --filter "name=rustfs" --format "table {{.Names}}\t{{.Status}}"
# View health check logs
docker inspect rustfs --format='{{json .State.Health}}'
```
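Health checks can also be declared or overridden at run time. The probe command below only illustrates the pattern; it is not necessarily the exact check baked into the image:
```bash
docker run -d \
  --health-cmd "curl -f http://localhost:9000/ || exit 1" \
  --health-interval 30s \
  --health-timeout 5s \
  --health-retries 3 \
  rustfs/rustfs:latest
```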
### Metrics and Tracing
When using the observability profile:
- **Prometheus**: http://localhost:9090
- **Grafana**: http://localhost:3000 (admin/admin)
- **Jaeger**: http://localhost:16686
- **OpenTelemetry Collector**: http://localhost:8888/metrics
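If you run your own Prometheus instead of the bundled stack, scraping the collector is a plain static job; the target below assumes the collector metrics port listed above and an `otel-collector` hostname on the compose network (both are assumptions):
```yaml
scrape_configs:
  - job_name: otel-collector
    static_configs:
      - targets: ["otel-collector:8888"] # assumed service name from docker-compose
```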
### Log Collection
```bash
# View container logs
docker logs rustfs -f
# Export logs
docker logs rustfs > rustfs.log 2>&1
```
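To keep container logs from growing without bound, the default `json-file` driver can be capped per container (sizes are examples):
```bash
docker run -d \
  --log-driver json-file \
  --log-opt max-size=50m \
  --log-opt max-file=5 \
  rustfs/rustfs:latest
```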
## 🛠️ Development
### Development Environment
```bash
# Start development container
docker-compose --profile dev up -d rustfs-dev
# Access development container
docker exec -it rustfs-dev bash
# Mount source code for live development
docker run -it --rm \
-v $(pwd):/root/s3-rustfs \
-p 9000:9000 \
rustfs/rustfs:devenv \
bash
```
### Building from Source in Container
```bash
# Use development image for building
docker run --rm \
-v $(pwd):/root/s3-rustfs \
-w /root/s3-rustfs \
rustfs/rustfs:ubuntu22.04 \
cargo build --release --bin rustfs
```
### Testing Cross-Compilation
```bash
# Run the test script to verify cross-compilation setup
./scripts/test-cross-build.sh
# This will test:
# - x86_64-unknown-linux-musl compilation
# - aarch64-unknown-linux-gnu cross-compilation
# - Docker builds for both architectures
```
## 🔐 Security
### Security Scanning
The workflow includes Trivy security scanning:
```bash
# Run security scan locally
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock \
-v $HOME/Library/Caches:/root/.cache/ \
aquasec/trivy:latest image rustfs/rustfs:latest
```
### Security Best Practices
1. **Use non-root user**: Images run as `rustfs` user (UID 1000)
2. **Minimal base images**: Ubuntu minimal for production
3. **Security updates**: Regular base image updates
4. **Secret management**: Use Docker secrets or environment files
5. **Network security**: Use Docker networks and proper firewall rules
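Point 1 usually corresponds to a Dockerfile stanza along these lines (a sketch of the pattern, not the project's exact Dockerfile):
```dockerfile
# Sketch: create an unprivileged user and drop root
RUN groupadd --gid 1000 rustfs \
    && useradd --uid 1000 --gid rustfs --create-home rustfs
USER rustfs
```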
## 📝 Troubleshooting
### Common Issues
#### 1. Cross-Compilation Failures
**Problem**: ARM64 build fails with linking errors
```bash
error: linking with `aarch64-linux-gnu-gcc` failed
```
**Solution**: Use the `cross` tool instead of native cross-compilation:
```bash
# Install cross tool
cargo install cross --git https://github.com/cross-rs/cross
# Use cross for ARM64 builds
cross build --release --target aarch64-unknown-linux-gnu --bin rustfs
```
#### 2. Protobuf Generation Issues
**Problem**: Missing protobuf definitions
```bash
error: failed to run custom build command for `protos`
```
**Solution**: Generate protobuf code first:
```bash
cargo run --bin gproto
```
#### 3. Docker Build Failures
**Problem**: Binary not found in Docker build
```bash
COPY failed: file not found in build context
```
**Solution**: Ensure binaries are built before Docker build:
```bash
# Build binaries first
cargo build --release --target x86_64-unknown-linux-musl --bin rustfs
cross build --release --target aarch64-unknown-linux-gnu --bin rustfs
# Then build Docker image
docker build .
```
### Debug Commands
```bash
# Check container status
docker ps -a
# View container logs
docker logs rustfs --tail 100
# Access container shell
docker exec -it rustfs bash
# Check resource usage
docker stats rustfs
# Inspect container configuration
docker inspect rustfs
# Test cross-compilation setup
./scripts/test-cross-build.sh
```
## 🔄 CI/CD Integration
### GitHub Actions
The provided workflow can be customized:
```yaml
# Override image names
env:
REGISTRY_IMAGE_DOCKERHUB: myorg/rustfs
REGISTRY_IMAGE_GHCR: ghcr.io/myorg/rustfs
```
### GitLab CI
```yaml
build:
stage: build
image: docker:latest
services:
- docker:dind
script:
- docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
- docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
```
### Jenkins Pipeline
```groovy
pipeline {
agent any
stages {
stage('Build') {
steps {
script {
docker.build("rustfs:${env.BUILD_ID}")
}
}
}
stage('Push') {
steps {
script {
docker.withRegistry('https://registry.hub.docker.com', 'dockerhub-credentials') {
docker.image("rustfs:${env.BUILD_ID}").push()
}
}
}
}
}
}
```
## 📚 Additional Resources
- [Docker Official Documentation](https://docs.docker.com/)
- [Docker Compose Reference](https://docs.docker.com/compose/)
- [GitHub Actions Documentation](https://docs.github.com/en/actions)
- [Cross-compilation with Rust](https://rust-lang.github.io/rustup/cross-compilation.html)
- [Cross tool documentation](https://github.com/cross-rs/cross)
- [RustFS Configuration Guide](../README.md)

View File

@@ -27,4 +27,6 @@ tokio = { workspace = true }
tower.workspace = true
url.workspace = true
madmin.workspace =true
common.workspace = true
common.workspace = true
rustfs-filemeta.workspace = true
bytes.workspace = true

View File

@@ -5,7 +5,7 @@ use std::{error::Error, sync::Arc, time::Duration};
use lock::{
drwmutex::Options,
lock_args::LockArgs,
namespace_lock::{new_nslock, NsLockMap},
namespace_lock::{NsLockMap, new_nslock},
new_lock_api,
};
use protos::{node_service_time_out_client, proto_gen::node_service::GenerallyLockRequest};
@@ -60,16 +60,16 @@ async fn test_lock_unlock_ns_lock() -> Result<(), Box<dyn Error>> {
vec![locker],
)
.await;
assert!(ns
.0
.write()
.await
.get_lock(&Options {
timeout: Duration::from_secs(5),
retry_interval: Duration::from_secs(1),
})
.await
.unwrap());
assert!(
ns.0.write()
.await
.get_lock(&Options {
timeout: Duration::from_secs(5),
retry_interval: Duration::from_secs(1),
})
.await
.unwrap()
);
ns.0.write().await.un_lock().await.unwrap();
Ok(())

View File

@@ -1,7 +1,7 @@
#![cfg(test)]
use ecstore::disk::{MetaCacheEntry, VolumeInfo, WalkDirOptions};
use ecstore::metacache::writer::{MetacacheReader, MetacacheWriter};
use ecstore::disk::{VolumeInfo, WalkDirOptions};
use futures::future::join_all;
use protos::proto_gen::node_service::WalkDirRequest;
use protos::{
@@ -12,11 +12,12 @@ use protos::{
},
};
use rmp_serde::{Deserializer, Serializer};
use rustfs_filemeta::{MetaCacheEntry, MetacacheReader, MetacacheWriter};
use serde::{Deserialize, Serialize};
use std::{error::Error, io::Cursor};
use tokio::spawn;
use tonic::codegen::tokio_stream::StreamExt;
use tonic::Request;
use tonic::codegen::tokio_stream::StreamExt;
const CLUSTER_ADDR: &str = "http://localhost:9000";
@@ -42,7 +43,7 @@ async fn ping() -> Result<(), Box<dyn Error>> {
// Construct PingRequest
let request = Request::new(PingRequest {
version: 1,
body: finished_data.to_vec(),
body: bytes::Bytes::copy_from_slice(finished_data),
});
// Send request and get response
@@ -113,7 +114,7 @@ async fn walk_dir() -> Result<(), Box<dyn Error>> {
let mut client = node_service_time_out_client(&CLUSTER_ADDR.to_string()).await?;
let request = Request::new(WalkDirRequest {
disk: "/home/dandan/code/rust/s3-rustfs/target/debug/data".to_string(),
walk_dir_options: buf,
walk_dir_options: buf.into(),
});
let mut response = client.walk_dir(request).await?.into_inner();
@@ -126,7 +127,7 @@ async fn walk_dir() -> Result<(), Box<dyn Error>> {
println!("{}", resp.error_info.unwrap_or("".to_string()));
}
let entry = serde_json::from_str::<MetaCacheEntry>(&resp.meta_cache_entry)
.map_err(|_e| common::error::Error::from_string(format!("Unexpected response: {:?}", response)))
.map_err(|_e| std::io::Error::other(format!("Unexpected response: {:?}", response)))
.unwrap();
out.write_obj(&entry).await.unwrap();
}

View File

@@ -10,6 +10,11 @@ rust-version.workspace = true
[lints]
workspace = true
[features]
default = ["reed-solomon-simd"]
reed-solomon-simd = []
reed-solomon-erasure = []
[dependencies]
rustfs-config = { workspace = true, features = ["constants"] }
async-trait.workspace = true
@@ -35,7 +40,8 @@ http.workspace = true
highway = { workspace = true }
url.workspace = true
uuid = { workspace = true, features = ["v4", "fast-rng", "serde"] }
reed-solomon-erasure = { workspace = true }
reed-solomon-erasure = { version = "6.0.0", features = ["simd-accel"] }
reed-solomon-simd = { version = "3.0.0" }
transform-stream = "0.3.1"
lazy_static.workspace = true
lock.workspace = true
@@ -50,7 +56,9 @@ tokio-util = { workspace = true, features = ["io", "compat"] }
crc32fast = { workspace = true }
siphasher = { workspace = true }
base64-simd = { workspace = true }
sha2 = { version = "0.11.0-pre.4" }
base64 = { workspace = true }
hmac = { workspace = true }
sha2 = { workspace = true }
hex-simd = { workspace = true }
path-clean = { workspace = true }
tempfile.workspace = true
@@ -71,6 +79,9 @@ rustfs-rsc = { workspace = true }
urlencoding = { workspace = true }
smallvec = { workspace = true }
shadow-rs.workspace = true
rustfs-filemeta.workspace = true
rustfs-utils = { workspace = true, features = ["full"] }
rustfs-rio.workspace = true
[target.'cfg(not(windows))'.dependencies]
nix = { workspace = true }
@@ -81,6 +92,16 @@ winapi = { workspace = true }
[dev-dependencies]
tokio = { workspace = true, features = ["rt-multi-thread", "macros"] }
criterion = { version = "0.5", features = ["html_reports"] }
temp-env = "0.2.0"
[build-dependencies]
shadow-rs = { workspace = true, features = ["build", "metadata"] }
[[bench]]
name = "erasure_benchmark"
harness = false
[[bench]]
name = "comparison_benchmark"
harness = false

109
ecstore/README_cn.md Normal file
View File

@@ -0,0 +1,109 @@
# ECStore - Erasure Coding Storage
ECStore provides erasure coding functionality for the RustFS project, supporting multiple Reed-Solomon implementations for optimal performance and compatibility.
## Reed-Solomon Implementations
### Available Backends
#### `reed-solomon-erasure` (Default)
- **Stability**: Mature and well-tested implementation
- **Performance**: Good performance with SIMD acceleration when available
- **Compatibility**: Works with any shard size
- **Memory**: Efficient memory usage
- **Use case**: Recommended for production use
#### `reed-solomon-simd` (Optional)
- **Performance**: Optimized SIMD implementation for maximum speed
- **Limitations**: Has restrictions on shard sizes (must be >= 64 bytes typically)
- **Memory**: May use more memory for small shards
- **Use case**: Best for large data blocks where performance is critical
### Feature Flags
Configure the Reed-Solomon implementation using Cargo features:
```toml
# Use default implementation (reed-solomon-erasure)
ecstore = "0.0.1"
# Use SIMD implementation for maximum performance
ecstore = { version = "0.0.1", features = ["reed-solomon-simd"], default-features = false }
# Use traditional implementation explicitly
ecstore = { version = "0.0.1", features = ["reed-solomon-erasure"], default-features = false }
```
### Usage Example
```rust
use ecstore::erasure_coding::Erasure;
// Create erasure coding instance
// 4 data shards, 2 parity shards, 1KB block size
let erasure = Erasure::new(4, 2, 1024);
// Encode data
let data = b"hello world from rustfs erasure coding";
let shards = erasure.encode_data(data)?;
// Simulate loss of one shard
let mut shards_opt: Vec<Option<Vec<u8>>> = shards
.iter()
.map(|b| Some(b.to_vec()))
.collect();
shards_opt[2] = None; // Lose shard 2
// Reconstruct missing data
erasure.decode_data(&mut shards_opt)?;
// Recover original data
let mut recovered = Vec::new();
for shard in shards_opt.iter().take(4) { // Only data shards
recovered.extend_from_slice(shard.as_ref().unwrap());
}
recovered.truncate(data.len());
assert_eq!(&recovered, data);
```
## Performance Considerations
### When to use `reed-solomon-simd`
- Large block sizes (>= 1KB recommended)
- High-throughput scenarios
- CPU-intensive workloads where encoding/decoding is the bottleneck
### When to use `reed-solomon-erasure`
- Small block sizes
- Memory-constrained environments
- General-purpose usage
- Production deployments requiring maximum stability
### Implementation Details
#### `reed-solomon-erasure`
- **Instance Reuse**: The encoder instance is cached and reused across multiple operations
- **Thread Safety**: Thread-safe with interior mutability
- **Memory Efficiency**: Lower memory footprint for small data
#### `reed-solomon-simd`
- **Instance Creation**: New encoder/decoder instances are created for each operation
- **API Design**: The SIMD implementation's API is designed for single-use instances
- **Performance Trade-off**: While instances are created per operation, the SIMD optimizations provide significant performance benefits for large data blocks
- **Optimization**: Future versions may implement instance pooling if the underlying API supports reuse
### Performance Tips
1. **Batch Operations**: When possible, batch multiple small operations into larger blocks
2. **Block Size Optimization**: Use block sizes that are multiples of 64 bytes for SIMD implementations
3. **Memory Allocation**: Pre-allocate buffers when processing multiple blocks
4. **Feature Selection**: Choose the appropriate feature based on your data size and performance requirements
## Cross-Platform Compatibility
Both implementations support:
- x86_64 with SIMD acceleration
- aarch64 (ARM64) with optimizations
- Other architectures with fallback implementations
The `reed-solomon-erasure` implementation provides better cross-platform compatibility and is recommended for most use cases.

View File

@@ -0,0 +1,330 @@
//! Benchmarks dedicated to comparing the performance of the Pure Erasure and Hybrid (SIMD) modes.
//!
//! This benchmark is compiled with different feature configurations to compare the two implementations directly.
//!
//! ## Running the comparison
//!
//! ```bash
//! # Benchmark the Pure Erasure implementation (default)
//! cargo bench --bench comparison_benchmark
//!
//! # Benchmark the Hybrid (SIMD) implementation
//! cargo bench --bench comparison_benchmark --features reed-solomon-simd
//!
//! # Benchmark the forced erasure-only mode
//! cargo bench --bench comparison_benchmark --features reed-solomon-erasure
//!
//! # Generate a comparison report
//! cargo bench --bench comparison_benchmark -- --save-baseline erasure
//! cargo bench --bench comparison_benchmark --features reed-solomon-simd -- --save-baseline hybrid
//! ```
use criterion::{BenchmarkId, Criterion, Throughput, black_box, criterion_group, criterion_main};
use ecstore::erasure_coding::Erasure;
use std::time::Duration;
/// Benchmark test data configuration
struct TestData {
data: Vec<u8>,
size_name: &'static str,
}
impl TestData {
fn new(size: usize, size_name: &'static str) -> Self {
let data = (0..size).map(|i| (i % 256) as u8).collect();
Self { data, size_name }
}
}
/// Generate test datasets of different sizes
fn generate_test_datasets() -> Vec<TestData> {
vec![
TestData::new(1024, "1KB"),            // small data
TestData::new(8 * 1024, "8KB"),        // small-to-medium data
TestData::new(64 * 1024, "64KB"),      // medium data
TestData::new(256 * 1024, "256KB"),    // medium-to-large data
TestData::new(1024 * 1024, "1MB"),     // large data
TestData::new(4 * 1024 * 1024, "4MB"), // very large data
]
}
/// Encoding performance comparison benchmark
fn bench_encode_comparison(c: &mut Criterion) {
let datasets = generate_test_datasets();
let configs = vec![
(4, 2, "4+2"), // common configuration
(6, 3, "6+3"), // 50% redundancy
(8, 4, "8+4"), // 50% redundancy, more shards
];
for dataset in &datasets {
for (data_shards, parity_shards, config_name) in &configs {
let test_name = format!("{}_{}_{}", dataset.size_name, config_name, get_implementation_name());
let mut group = c.benchmark_group("encode_comparison");
group.throughput(Throughput::Bytes(dataset.data.len() as u64));
group.sample_size(20);
group.measurement_time(Duration::from_secs(10));
// Check whether an erasure instance can be created; some configurations may fail in pure SIMD mode
match Erasure::new(*data_shards, *parity_shards, dataset.data.len()).encode_data(&dataset.data) {
Ok(_) => {
group.bench_with_input(
BenchmarkId::new("implementation", &test_name),
&(&dataset.data, *data_shards, *parity_shards),
|b, (data, data_shards, parity_shards)| {
let erasure = Erasure::new(*data_shards, *parity_shards, data.len());
b.iter(|| {
let shards = erasure.encode_data(black_box(data)).unwrap();
black_box(shards);
});
},
);
}
Err(e) => {
println!("⚠️ Skipping test {} - configuration not supported: {}", test_name, e);
}
}
group.finish();
}
}
}
/// Decoding performance comparison benchmark
fn bench_decode_comparison(c: &mut Criterion) {
let datasets = generate_test_datasets();
let configs = vec![(4, 2, "4+2"), (6, 3, "6+3"), (8, 4, "8+4")];
for dataset in &datasets {
for (data_shards, parity_shards, config_name) in &configs {
let test_name = format!("{}_{}_{}", dataset.size_name, config_name, get_implementation_name());
let erasure = Erasure::new(*data_shards, *parity_shards, dataset.data.len());
// Pre-encode the data - check whether this configuration is supported
match erasure.encode_data(&dataset.data) {
Ok(encoded_shards) => {
let mut group = c.benchmark_group("decode_comparison");
group.throughput(Throughput::Bytes(dataset.data.len() as u64));
group.sample_size(20);
group.measurement_time(Duration::from_secs(10));
group.bench_with_input(
BenchmarkId::new("implementation", &test_name),
&(&encoded_shards, *data_shards, *parity_shards),
|b, (shards, data_shards, parity_shards)| {
let erasure = Erasure::new(*data_shards, *parity_shards, dataset.data.len());
b.iter(|| {
// Simulate the maximum recoverable data loss
let mut shards_opt: Vec<Option<Vec<u8>>> =
shards.iter().map(|shard| Some(shard.to_vec())).collect();
// Lose as many shards as there are parity shards
for item in shards_opt.iter_mut().take(*parity_shards) {
*item = None;
}
erasure.decode_data(black_box(&mut shards_opt)).unwrap();
black_box(&shards_opt);
});
},
);
group.finish();
}
Err(e) => {
println!("⚠️ Skipping decode test {} - configuration not supported: {}", test_name, e);
}
}
}
}
}
/// Shard size sensitivity test
fn bench_shard_size_sensitivity(c: &mut Criterion) {
let data_shards = 4;
let parity_shards = 2;
// Test different shard sizes, paying particular attention to the SIMD threshold
let shard_sizes = vec![32, 64, 128, 256, 512, 1024, 2048, 4096, 8192];
let mut group = c.benchmark_group("shard_size_sensitivity");
group.sample_size(15);
group.measurement_time(Duration::from_secs(8));
for shard_size in shard_sizes {
let total_size = shard_size * data_shards;
let data = (0..total_size).map(|i| (i % 256) as u8).collect::<Vec<u8>>();
let test_name = format!("{}B_shard_{}", shard_size, get_implementation_name());
group.throughput(Throughput::Bytes(total_size as u64));
// Check whether this shard size is supported
let erasure = Erasure::new(data_shards, parity_shards, data.len());
match erasure.encode_data(&data) {
Ok(_) => {
group.bench_with_input(BenchmarkId::new("shard_size", &test_name), &data, |b, data| {
let erasure = Erasure::new(data_shards, parity_shards, data.len());
b.iter(|| {
let shards = erasure.encode_data(black_box(data)).unwrap();
black_box(shards);
});
});
}
Err(e) => {
println!("⚠️ Skipping shard-size test {} - not supported: {}", test_name, e);
}
}
}
group.finish();
}
/// High-load concurrency test
fn bench_concurrent_load(c: &mut Criterion) {
use std::sync::Arc;
use std::thread;
let data_size = 1024 * 1024; // 1MB
let data = Arc::new((0..data_size).map(|i| (i % 256) as u8).collect::<Vec<u8>>());
let erasure = Arc::new(Erasure::new(4, 2, data_size));
let mut group = c.benchmark_group("concurrent_load");
group.throughput(Throughput::Bytes(data_size as u64));
group.sample_size(10);
group.measurement_time(Duration::from_secs(15));
let test_name = format!("1MB_concurrent_{}", get_implementation_name());
group.bench_function(&test_name, |b| {
b.iter(|| {
let handles: Vec<_> = (0..4)
.map(|_| {
let data_clone = data.clone();
let erasure_clone = erasure.clone();
thread::spawn(move || {
let shards = erasure_clone.encode_data(&data_clone).unwrap();
black_box(shards);
})
})
.collect();
for handle in handles {
handle.join().unwrap();
}
});
});
group.finish();
}
/// Error recovery capability test
fn bench_error_recovery_performance(c: &mut Criterion) {
let data_size = 256 * 1024; // 256KB
let data = (0..data_size).map(|i| (i % 256) as u8).collect::<Vec<u8>>();
let configs = vec![
(4, 2, 1), // lose 1 shard
(4, 2, 2), // lose 2 shards (maximum recoverable)
(6, 3, 2), // lose 2 shards
(6, 3, 3), // lose 3 shards (maximum recoverable)
(8, 4, 3), // lose 3 shards
(8, 4, 4), // lose 4 shards (maximum recoverable)
];
let mut group = c.benchmark_group("error_recovery");
group.throughput(Throughput::Bytes(data_size as u64));
group.sample_size(15);
group.measurement_time(Duration::from_secs(8));
for (data_shards, parity_shards, lost_shards) in configs {
let erasure = Erasure::new(data_shards, parity_shards, data_size);
let test_name = format!("{}+{}_lost{}_{}", data_shards, parity_shards, lost_shards, get_implementation_name());
// Check whether this configuration is supported
match erasure.encode_data(&data) {
Ok(encoded_shards) => {
group.bench_with_input(
BenchmarkId::new("recovery", &test_name),
&(&encoded_shards, data_shards, parity_shards, lost_shards),
|b, (shards, data_shards, parity_shards, lost_shards)| {
let erasure = Erasure::new(*data_shards, *parity_shards, data_size);
b.iter(|| {
let mut shards_opt: Vec<Option<Vec<u8>>> = shards.iter().map(|shard| Some(shard.to_vec())).collect();
// Lose the specified number of shards
for item in shards_opt.iter_mut().take(*lost_shards) {
*item = None;
}
erasure.decode_data(black_box(&mut shards_opt)).unwrap();
black_box(&shards_opt);
});
},
);
}
Err(e) => {
println!("⚠️ Skipping error-recovery test {} - configuration not supported: {}", test_name, e);
}
}
}
group.finish();
}
/// Memory efficiency test
fn bench_memory_efficiency(c: &mut Criterion) {
let data_shards = 4;
let parity_shards = 2;
let data_size = 1024 * 1024; // 1MB
let mut group = c.benchmark_group("memory_efficiency");
group.throughput(Throughput::Bytes(data_size as u64));
group.sample_size(10);
group.measurement_time(Duration::from_secs(8));
let test_name = format!("memory_pattern_{}", get_implementation_name());
// Test the memory impact of many consecutive encodings
group.bench_function(format!("{}_continuous", test_name), |b| {
let erasure = Erasure::new(data_shards, parity_shards, data_size);
b.iter(|| {
for i in 0..10 {
let data = vec![(i % 256) as u8; data_size];
let shards = erasure.encode_data(black_box(&data)).unwrap();
black_box(shards);
}
});
});
// Test a large number of small encoding tasks
group.bench_function(format!("{}_small_chunks", test_name), |b| {
let chunk_size = 1024; // 1KB chunks
let erasure = Erasure::new(data_shards, parity_shards, chunk_size);
b.iter(|| {
for i in 0..1024 {
let data = vec![(i % 256) as u8; chunk_size];
let shards = erasure.encode_data(black_box(&data)).unwrap();
black_box(shards);
}
});
});
group.finish();
}
/// Return the name of the current implementation
fn get_implementation_name() -> &'static str {
#[cfg(feature = "reed-solomon-simd")]
return "hybrid";
#[cfg(not(feature = "reed-solomon-simd"))]
return "erasure";
}
criterion_group!(
benches,
bench_encode_comparison,
bench_decode_comparison,
bench_shard_size_sensitivity,
bench_concurrent_load,
bench_error_recovery_performance,
bench_memory_efficiency
);
criterion_main!(benches);

View File

@@ -0,0 +1,395 @@
//! Reed-Solomon erasure coding performance benchmarks.
//!
//! This benchmark compares the performance of different Reed-Solomon implementations:
//! - Default (Pure erasure): Stable reed-solomon-erasure implementation
//! - `reed-solomon-simd` feature: SIMD mode with optimized performance
//!
//! ## Running Benchmarks
//!
//! ```bash
//! # Run all benchmarks
//! cargo bench
//!
//! # Run a specific benchmark
//! cargo bench --bench erasure_benchmark
//!
//! # Generate an HTML report
//! cargo bench --bench erasure_benchmark -- --output-format html
//!
//! # Only benchmark encoding performance
//! cargo bench encode
//!
//! # Only benchmark decoding performance
//! cargo bench decode
//! ```
//!
//! ## Test Configurations
//!
//! The benchmarks test various scenarios:
//! - Different data sizes: 1KB, 64KB, 1MB, 16MB
//! - Different erasure coding configurations: (4,2), (6,3), (8,4)
//! - Both encoding and decoding operations
//! - Small vs large shard scenarios for SIMD optimization
use criterion::{BenchmarkId, Criterion, Throughput, black_box, criterion_group, criterion_main};
use ecstore::erasure_coding::{Erasure, calc_shard_size};
use std::time::Duration;
/// Benchmark configuration struct
#[derive(Clone, Debug)]
struct BenchConfig {
/// Number of data shards
data_shards: usize,
/// Number of parity shards
parity_shards: usize,
/// Test data size in bytes
data_size: usize,
/// Block size in bytes
block_size: usize,
/// Configuration name
name: String,
}
impl BenchConfig {
fn new(data_shards: usize, parity_shards: usize, data_size: usize, block_size: usize) -> Self {
Self {
data_shards,
parity_shards,
data_size,
block_size,
name: format!("{}+{}_{}KB_{}KB-block", data_shards, parity_shards, data_size / 1024, block_size / 1024),
}
}
}
/// Generate test data
fn generate_test_data(size: usize) -> Vec<u8> {
(0..size).map(|i| (i % 256) as u8).collect()
}
/// Benchmark: encoding performance comparison
fn bench_encode_performance(c: &mut Criterion) {
let configs = vec![
// Small data test - 1KB
BenchConfig::new(4, 2, 1024, 1024),
BenchConfig::new(6, 3, 1024, 1024),
BenchConfig::new(8, 4, 1024, 1024),
// Medium data test - 64KB
BenchConfig::new(4, 2, 64 * 1024, 64 * 1024),
BenchConfig::new(6, 3, 64 * 1024, 64 * 1024),
BenchConfig::new(8, 4, 64 * 1024, 64 * 1024),
// Large data test - 1MB
BenchConfig::new(4, 2, 1024 * 1024, 1024 * 1024),
BenchConfig::new(6, 3, 1024 * 1024, 1024 * 1024),
BenchConfig::new(8, 4, 1024 * 1024, 1024 * 1024),
// Very large data test - 16MB
BenchConfig::new(4, 2, 16 * 1024 * 1024, 16 * 1024 * 1024),
BenchConfig::new(6, 3, 16 * 1024 * 1024, 16 * 1024 * 1024),
];
for config in configs {
let data = generate_test_data(config.data_size);
// Benchmark the current default implementation (usually SIMD)
let mut group = c.benchmark_group("encode_current");
group.throughput(Throughput::Bytes(config.data_size as u64));
group.sample_size(10);
group.measurement_time(Duration::from_secs(5));
group.bench_with_input(BenchmarkId::new("current_impl", &config.name), &(&data, &config), |b, (data, config)| {
let erasure = Erasure::new(config.data_shards, config.parity_shards, config.block_size);
b.iter(|| {
let shards = erasure.encode_data(black_box(data)).unwrap();
black_box(shards);
});
});
group.finish();
// If the SIMD feature is enabled, also benchmark the dedicated erasure implementation for comparison
#[cfg(feature = "reed-solomon-simd")]
{
use ecstore::erasure_coding::ReedSolomonEncoder;
let mut erasure_group = c.benchmark_group("encode_erasure_only");
erasure_group.throughput(Throughput::Bytes(config.data_size as u64));
erasure_group.sample_size(10);
erasure_group.measurement_time(Duration::from_secs(5));
erasure_group.bench_with_input(
BenchmarkId::new("erasure_impl", &config.name),
&(&data, &config),
|b, (data, config)| {
let encoder = ReedSolomonEncoder::new(config.data_shards, config.parity_shards).unwrap();
b.iter(|| {
// Build the data structures needed for encoding
let per_shard_size = calc_shard_size(data.len(), config.data_shards);
let total_size = per_shard_size * (config.data_shards + config.parity_shards);
let mut buffer = vec![0u8; total_size];
buffer[..data.len()].copy_from_slice(data);
let slices: smallvec::SmallVec<[&mut [u8]; 16]> = buffer.chunks_exact_mut(per_shard_size).collect();
encoder.encode(black_box(slices)).unwrap();
black_box(&buffer);
});
},
);
erasure_group.finish();
}
// If the SIMD feature is enabled, benchmark the direct SIMD implementation for comparison
#[cfg(feature = "reed-solomon-simd")]
{
// Only benchmark SIMD for large shards; SIMD performs poorly for shards smaller than 512 bytes
let shard_size = calc_shard_size(config.data_size, config.data_shards);
if shard_size >= 512 {
let mut simd_group = c.benchmark_group("encode_simd_direct");
simd_group.throughput(Throughput::Bytes(config.data_size as u64));
simd_group.sample_size(10);
simd_group.measurement_time(Duration::from_secs(5));
simd_group.bench_with_input(
BenchmarkId::new("simd_impl", &config.name),
&(&data, &config),
|b, (data, config)| {
b.iter(|| {
// Use the SIMD implementation directly
let per_shard_size = calc_shard_size(data.len(), config.data_shards);
match reed_solomon_simd::ReedSolomonEncoder::new(
config.data_shards,
config.parity_shards,
per_shard_size,
) {
Ok(mut encoder) => {
// Create a correctly sized buffer and fill it with data
let mut buffer = vec![0u8; per_shard_size * config.data_shards];
let copy_len = data.len().min(buffer.len());
buffer[..copy_len].copy_from_slice(&data[..copy_len]);
// Add the data shards at the correct shard size
for chunk in buffer.chunks_exact(per_shard_size) {
encoder.add_original_shard(black_box(chunk)).unwrap();
}
let result = encoder.encode().unwrap();
black_box(result);
}
Err(_) => {
// SIMD does not support this configuration; skip
black_box(());
}
}
});
},
);
simd_group.finish();
}
}
}
}
/// Benchmark: decoding performance comparison
fn bench_decode_performance(c: &mut Criterion) {
let configs = vec![
// Medium data test - 64KB
BenchConfig::new(4, 2, 64 * 1024, 64 * 1024),
BenchConfig::new(6, 3, 64 * 1024, 64 * 1024),
// Large data test - 1MB
BenchConfig::new(4, 2, 1024 * 1024, 1024 * 1024),
BenchConfig::new(6, 3, 1024 * 1024, 1024 * 1024),
// Very large data test - 16MB
BenchConfig::new(4, 2, 16 * 1024 * 1024, 16 * 1024 * 1024),
];
for config in configs {
let data = generate_test_data(config.data_size);
let erasure = Erasure::new(config.data_shards, config.parity_shards, config.block_size);
// Pre-encode the data
let encoded_shards = erasure.encode_data(&data).unwrap();
// Benchmark decoding performance of the current default implementation
let mut group = c.benchmark_group("decode_current");
group.throughput(Throughput::Bytes(config.data_size as u64));
group.sample_size(10);
group.measurement_time(Duration::from_secs(5));
group.bench_with_input(
BenchmarkId::new("current_impl", &config.name),
&(&encoded_shards, &config),
|b, (shards, config)| {
let erasure = Erasure::new(config.data_shards, config.parity_shards, config.block_size);
b.iter(|| {
// Simulate data loss - lose one data shard and one parity shard
let mut shards_opt: Vec<Option<Vec<u8>>> = shards.iter().map(|shard| Some(shard.to_vec())).collect();
// Lose the last data shard and the first parity shard
shards_opt[config.data_shards - 1] = None;
shards_opt[config.data_shards] = None;
erasure.decode_data(black_box(&mut shards_opt)).unwrap();
black_box(&shards_opt);
});
},
);
group.finish();
// In hybrid mode (the default), also benchmark SIMD decoding performance
#[cfg(not(feature = "reed-solomon-erasure"))]
{
let shard_size = calc_shard_size(config.data_size, config.data_shards);
if shard_size >= 512 {
let mut simd_group = c.benchmark_group("decode_simd_direct");
simd_group.throughput(Throughput::Bytes(config.data_size as u64));
simd_group.sample_size(10);
simd_group.measurement_time(Duration::from_secs(5));
simd_group.bench_with_input(
BenchmarkId::new("simd_impl", &config.name),
&(&encoded_shards, &config),
|b, (shards, config)| {
b.iter(|| {
let per_shard_size = calc_shard_size(config.data_size, config.data_shards);
match reed_solomon_simd::ReedSolomonDecoder::new(
config.data_shards,
config.parity_shards,
per_shard_size,
) {
Ok(mut decoder) => {
// Add the available shards (except the lost ones)
for (i, shard) in shards.iter().enumerate() {
if i != config.data_shards - 1 && i != config.data_shards {
if i < config.data_shards {
decoder.add_original_shard(i, black_box(shard)).unwrap();
} else {
let recovery_idx = i - config.data_shards;
decoder.add_recovery_shard(recovery_idx, black_box(shard)).unwrap();
}
}
}
let result = decoder.decode().unwrap();
black_box(result);
}
Err(_) => {
// SIMD does not support this configuration; skip
black_box(());
}
}
});
},
);
simd_group.finish();
}
}
}
}
/// Benchmark: impact of different shard sizes on performance
fn bench_shard_size_impact(c: &mut Criterion) {
let shard_sizes = vec![64, 128, 256, 512, 1024, 2048, 4096, 8192];
let data_shards = 4;
let parity_shards = 2;
let mut group = c.benchmark_group("shard_size_impact");
group.sample_size(10);
group.measurement_time(Duration::from_secs(3));
for shard_size in shard_sizes {
let total_data_size = shard_size * data_shards;
let data = generate_test_data(total_data_size);
group.throughput(Throughput::Bytes(total_data_size as u64));
// Benchmark the current implementation
group.bench_with_input(BenchmarkId::new("current", format!("shard_{}B", shard_size)), &data, |b, data| {
let erasure = Erasure::new(data_shards, parity_shards, total_data_size);
b.iter(|| {
let shards = erasure.encode_data(black_box(data)).unwrap();
black_box(shards);
});
});
}
group.finish();
}
/// Benchmark: impact of coding configurations on performance
fn bench_coding_configurations(c: &mut Criterion) {
let configs = vec![
(2, 1),  // minimal redundancy
(3, 2),  // moderate redundancy
(4, 2),  // common configuration
(6, 3),  // 50% redundancy
(8, 4),  // 50% redundancy, more shards
(10, 5), // 50% redundancy, many shards
(12, 6), // 50% redundancy, even more shards
];
let data_size = 1024 * 1024; // 1MB of test data
let data = generate_test_data(data_size);
let mut group = c.benchmark_group("coding_configurations");
group.throughput(Throughput::Bytes(data_size as u64));
group.sample_size(10);
group.measurement_time(Duration::from_secs(5));
for (data_shards, parity_shards) in configs {
let config_name = format!("{}+{}", data_shards, parity_shards);
group.bench_with_input(BenchmarkId::new("encode", &config_name), &data, |b, data| {
let erasure = Erasure::new(data_shards, parity_shards, data_size);
b.iter(|| {
let shards = erasure.encode_data(black_box(data)).unwrap();
black_box(shards);
});
});
}
group.finish();
}
/// Benchmark: memory usage patterns
fn bench_memory_patterns(c: &mut Criterion) {
let data_shards = 4;
let parity_shards = 2;
let block_size = 1024 * 1024; // 1MB block
let mut group = c.benchmark_group("memory_patterns");
group.sample_size(10);
group.measurement_time(Duration::from_secs(5));
// Benchmark reusing the same Erasure instance
group.bench_function("reuse_erasure_instance", |b| {
let erasure = Erasure::new(data_shards, parity_shards, block_size);
let data = generate_test_data(block_size);
b.iter(|| {
let shards = erasure.encode_data(black_box(&data)).unwrap();
black_box(shards);
});
});
// Benchmark creating a new Erasure instance each time
group.bench_function("new_erasure_instance", |b| {
let data = generate_test_data(block_size);
b.iter(|| {
let erasure = Erasure::new(data_shards, parity_shards, block_size);
let shards = erasure.encode_data(black_box(&data)).unwrap();
black_box(shards);
});
});
group.finish();
}
// Benchmark group configuration
criterion_group!(
benches,
bench_encode_performance,
bench_decode_performance,
bench_shard_size_impact,
bench_coding_configurations,
bench_memory_patterns
);
criterion_main!(benches);

266
ecstore/run_benchmarks.sh Executable file
View File

@@ -0,0 +1,266 @@
#!/bin/bash
# Reed-Solomon implementation performance comparison script
#
# This script runs different benchmarks to compare the performance of the SIMD mode and the pure Erasure mode
#
# Usage:
# ./run_benchmarks.sh [quick|full|comparison]
#
# quick - quick test of the main scenarios
# full - full benchmark suite
# comparison - dedicated comparison of the two implementation modes
set -e
# Colored output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Print colored messages
print_info() {
echo -e "${BLUE}[INFO]${NC} $1"
}
print_success() {
echo -e "${GREEN}[SUCCESS]${NC} $1"
}
print_warning() {
echo -e "${YELLOW}[WARNING]${NC} $1"
}
print_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Check that the required tools are installed
check_requirements() {
print_info "检查系统要求..."
if ! command -v cargo &> /dev/null; then
print_error "cargo 未安装,请先安装 Rust 工具链"
exit 1
fi
# Check that criterion is listed as a dependency
if ! grep -q "criterion" Cargo.toml; then
print_error "Cargo.toml 中未找到 criterion 依赖"
exit 1
fi
print_success "系统要求检查通过"
}
# Clean up previous test results
cleanup() {
print_info "清理之前的测试结果..."
rm -rf target/criterion
print_success "清理完成"
}
# Run the pure Erasure mode benchmarks
run_erasure_benchmark() {
print_info "🏛️ 开始运行纯 Erasure 模式基准测试..."
echo "================================================"
cargo bench --bench comparison_benchmark \
--features reed-solomon-erasure \
-- --save-baseline erasure_baseline
print_success "纯 Erasure 模式基准测试完成"
}
# Run the SIMD mode benchmarks
run_simd_benchmark() {
print_info "🎯 开始运行SIMD模式基准测试..."
echo "================================================"
cargo bench --bench comparison_benchmark \
--features reed-solomon-simd \
-- --save-baseline simd_baseline
print_success "SIMD模式基准测试完成"
}
# Run the full benchmark suite
run_full_benchmark() {
print_info "🚀 开始运行完整基准测试套件..."
echo "================================================"
# Run the detailed benchmarks (using the default pure Erasure mode)
cargo bench --bench erasure_benchmark
print_success "完整基准测试套件完成"
}
# Run the performance comparison tests
run_comparison_benchmark() {
print_info "📊 开始运行性能对比测试..."
echo "================================================"
print_info "步骤 1: 测试纯 Erasure 模式..."
cargo bench --bench comparison_benchmark \
--features reed-solomon-erasure \
-- --save-baseline erasure_baseline
print_info "步骤 2: 测试SIMD模式并与 Erasure 模式对比..."
cargo bench --bench comparison_benchmark \
--features reed-solomon-simd \
-- --baseline erasure_baseline
print_success "性能对比测试完成"
}
# Generate the comparison report
generate_comparison_report() {
print_info "📊 生成性能比较报告..."
if [ -d "target/criterion" ]; then
print_info "基准测试结果已保存到 target/criterion/ 目录"
print_info "你可以打开 target/criterion/report/index.html 查看详细报告"
# If a Python environment is available, a simple HTTP server can be started to view the report
if command -v python3 &> /dev/null; then
print_info "你可以运行以下命令启动本地服务器查看报告:"
echo " cd target/criterion && python3 -m http.server 8080"
echo " 然后在浏览器中访问 http://localhost:8080/report/index.html"
fi
else
print_warning "未找到基准测试结果目录"
fi
}
# Quick test mode
run_quick_test() {
print_info "🏃 运行快速性能测试..."
print_info "测试纯 Erasure 模式..."
cargo bench --bench comparison_benchmark \
--features reed-solomon-erasure \
-- encode_comparison --quick
print_info "测试SIMD模式..."
cargo bench --bench comparison_benchmark \
--features reed-solomon-simd \
-- encode_comparison --quick
print_success "快速测试完成"
}
# Show help information
show_help() {
echo "Reed-Solomon 性能基准测试脚本"
echo ""
echo "实现模式:"
echo " 🏛️ 纯 Erasure 模式(默认)- 稳定兼容的 reed-solomon-erasure 实现"
echo " 🎯 SIMD模式 - 高性能SIMD优化实现"
echo ""
echo "使用方法:"
echo " $0 [command]"
echo ""
echo "命令:"
echo " quick 运行快速性能测试"
echo " full 运行完整基准测试套件默认Erasure模式"
echo " comparison 运行详细的实现模式对比测试"
echo " erasure 只测试纯 Erasure 模式"
echo " simd 只测试SIMD模式"
echo " clean 清理测试结果"
echo " help 显示此帮助信息"
echo ""
echo "示例:"
echo " $0 quick # 快速测试两种模式"
echo " $0 comparison # 详细对比测试"
echo " $0 full # 完整测试套件默认Erasure模式"
echo " $0 simd # 只测试SIMD模式"
echo " $0 erasure # 只测试纯 Erasure 模式"
echo ""
echo "模式说明:"
echo " Erasure模式: 使用reed-solomon-erasure实现稳定可靠"
echo " SIMD模式: 使用reed-solomon-simd实现高性能优化"
}
# Show test configuration information
show_test_info() {
print_info "📋 测试配置信息:"
echo " - 当前目录: $(pwd)"
echo " - Rust 版本: $(rustc --version)"
echo " - Cargo 版本: $(cargo --version)"
echo " - CPU 架构: $(uname -m)"
echo " - 操作系统: $(uname -s)"
# Check CPU features
if [ -f "/proc/cpuinfo" ]; then
echo " - CPU 型号: $(grep 'model name' /proc/cpuinfo | head -1 | cut -d: -f2 | xargs)"
if grep -q "avx2" /proc/cpuinfo; then
echo " - SIMD 支持: AVX2 ✅ (SIMD模式将利用SIMD优化)"
elif grep -q "sse4" /proc/cpuinfo; then
echo " - SIMD 支持: SSE4 ✅ (SIMD模式将利用SIMD优化)"
else
echo " - SIMD 支持: 未检测到高级 SIMD 特性"
fi
fi
echo " - 默认模式: 纯Erasure模式 (稳定可靠)"
echo " - 高性能模式: SIMD模式 (性能优化)"
echo ""
}
# Main function
main() {
print_info "🧪 Reed-Solomon 实现性能基准测试"
echo "================================================"
check_requirements
show_test_info
case "${1:-help}" in
"quick")
run_quick_test
generate_comparison_report
;;
"full")
cleanup
run_full_benchmark
generate_comparison_report
;;
"comparison")
cleanup
run_comparison_benchmark
generate_comparison_report
;;
"erasure")
cleanup
run_erasure_benchmark
generate_comparison_report
;;
"simd")
cleanup
run_simd_benchmark
generate_comparison_report
;;
"clean")
cleanup
;;
"help"|"--help"|"-h")
show_help
;;
*)
print_error "未知命令: $1"
echo ""
show_help
exit 1
;;
esac
print_success "✨ 基准测试执行完成!"
print_info "💡 提示: 推荐使用默认的纯Erasure模式对于高性能需求可考虑SIMD模式"
}
# Call the main function when the script is run directly
if [[ "${BASH_SOURCE[0]}" == "${0}" ]]; then
main "$@"
fi

View File

@@ -1,8 +1,9 @@
use crate::error::{Error, Result};
use crate::{
disk::endpoint::Endpoint,
global::{GLOBAL_Endpoints, GLOBAL_BOOT_TIME},
global::{GLOBAL_BOOT_TIME, GLOBAL_Endpoints},
heal::{
data_usage::{load_data_usage_from_backend, DATA_USAGE_CACHE_NAME, DATA_USAGE_ROOT},
data_usage::{DATA_USAGE_CACHE_NAME, DATA_USAGE_ROOT, load_data_usage_from_backend},
data_usage_cache::DataUsageCache,
heal_commands::{DRIVE_STATE_OK, DRIVE_STATE_UNFORMATTED},
},
@@ -11,10 +12,10 @@ use crate::{
store_api::StorageAPI,
};
use common::{
error::{Error, Result},
// error::{Error, Result},
globals::GLOBAL_Local_Node_Name,
};
use madmin::{BackendDisks, Disk, ErasureSetInfo, InfoMessage, ServerProperties, ITEM_INITIALIZING, ITEM_OFFLINE, ITEM_ONLINE};
use madmin::{BackendDisks, Disk, ErasureSetInfo, ITEM_INITIALIZING, ITEM_OFFLINE, ITEM_ONLINE, InfoMessage, ServerProperties};
use protos::{
models::{PingBody, PingBodyBuilder},
node_service_time_out_client,
@@ -87,12 +88,12 @@ async fn is_server_resolvable(endpoint: &Endpoint) -> Result<()> {
// Create the client
let mut client = node_service_time_out_client(&addr)
.await
.map_err(|err| Error::msg(err.to_string()))?;
.map_err(|err| Error::other(err.to_string()))?;
// Build the PingRequest
let request = Request::new(PingRequest {
version: 1,
body: finished_data.to_vec(),
body: bytes::Bytes::copy_from_slice(finished_data),
});
// Send the request and get the response
@@ -332,7 +333,7 @@ fn get_online_offline_disks_stats(disks_info: &[Disk]) -> (BackendDisks, Backend
async fn get_pools_info(all_disks: &[Disk]) -> Result<HashMap<i32, HashMap<i32, ErasureSetInfo>>> {
let Some(store) = new_object_layer_fn() else {
return Err(Error::msg("ServerNotInitialized"));
return Err(Error::other("ServerNotInitialized"));
};
let mut pools_info: HashMap<i32, HashMap<i32, ErasureSetInfo>> = HashMap::new();

View File

@@ -1,841 +1,174 @@
use crate::{
disk::{error::DiskError, Disk, DiskAPI},
erasure::{ReadAt, Writer},
io::{FileReader, FileWriter},
store_api::BitrotAlgorithm,
};
use blake2::Blake2b512;
use blake2::Digest as _;
use bytes::Bytes;
use common::error::{Error, Result};
use highway::{HighwayHash, HighwayHasher, Key};
use lazy_static::lazy_static;
use sha2::{digest::core_api::BlockSizeUser, Digest, Sha256};
use std::{any::Any, collections::HashMap, io::Cursor, sync::Arc};
use tokio::io::{AsyncReadExt as _, AsyncWriteExt};
use tracing::{error, info};
use crate::disk::error::DiskError;
use crate::disk::{self, DiskAPI as _, DiskStore};
use crate::erasure_coding::{BitrotReader, BitrotWriterWrapper, CustomWriter};
use rustfs_utils::HashAlgorithm;
use std::io::Cursor;
use tokio::io::AsyncRead;
lazy_static! {
static ref BITROT_ALGORITHMS: HashMap<BitrotAlgorithm, &'static str> = {
let mut m = HashMap::new();
m.insert(BitrotAlgorithm::SHA256, "sha256");
m.insert(BitrotAlgorithm::BLAKE2b512, "blake2b");
m.insert(BitrotAlgorithm::HighwayHash256, "highwayhash256");
m.insert(BitrotAlgorithm::HighwayHash256S, "highwayhash256S");
m
};
}
/// Create a BitrotReader from either inline data or disk file stream
///
/// # Parameters
/// * `inline_data` - Optional inline data, if present, will use Cursor to read from memory
/// * `disk` - Optional disk reference for file stream reading
/// * `bucket` - Bucket name for file path
/// * `path` - File path within the bucket
/// * `offset` - Starting offset for reading
/// * `length` - Length to read
/// * `shard_size` - Shard size for erasure coding
/// * `checksum_algo` - Hash algorithm for bitrot verification
#[allow(clippy::too_many_arguments)]
pub async fn create_bitrot_reader(
inline_data: Option<&[u8]>,
disk: Option<&DiskStore>,
bucket: &str,
path: &str,
offset: usize,
length: usize,
shard_size: usize,
checksum_algo: HashAlgorithm,
) -> disk::error::Result<Option<BitrotReader<Box<dyn AsyncRead + Send + Sync + Unpin>>>> {
// Calculate the total length to read, including the checksum overhead
let length = length.div_ceil(shard_size) * checksum_algo.size() + length;
// const MAGIC_HIGHWAY_HASH256_KEY: &[u8] = &[
// 0x4b, 0xe7, 0x34, 0xfa, 0x8e, 0x23, 0x8a, 0xcd, 0x26, 0x3e, 0x83, 0xe6, 0xbb, 0x96, 0x85, 0x52, 0x04, 0x0f, 0x93, 0x5d, 0xa3,
// 0x9f, 0x44, 0x14, 0x97, 0xe0, 0x9d, 0x13, 0x22, 0xde, 0x36, 0xa0,
// ];
const MAGIC_HIGHWAY_HASH256_KEY: &[u64; 4] = &[3, 4, 2, 1];
#[derive(Clone, Debug)]
pub enum Hasher {
SHA256(Sha256),
HighwayHash256(HighwayHasher),
BLAKE2b512(Blake2b512),
}
impl Hasher {
pub fn update(&mut self, data: impl AsRef<[u8]>) {
match self {
Hasher::SHA256(core_wrapper) => {
core_wrapper.update(data);
}
Hasher::HighwayHash256(highway_hasher) => {
highway_hasher.append(data.as_ref());
}
Hasher::BLAKE2b512(core_wrapper) => {
core_wrapper.update(data);
if let Some(data) = inline_data {
// Use inline data
let rd = Cursor::new(data.to_vec());
let reader = BitrotReader::new(Box::new(rd) as Box<dyn AsyncRead + Send + Sync + Unpin>, shard_size, checksum_algo);
Ok(Some(reader))
} else if let Some(disk) = disk {
// Read from disk
match disk.read_file_stream(bucket, path, offset, length).await {
Ok(rd) => {
let reader = BitrotReader::new(rd, shard_size, checksum_algo);
Ok(Some(reader))
}
Err(e) => Err(e),
}
}
pub fn finalize(self) -> Vec<u8> {
match self {
Hasher::SHA256(core_wrapper) => core_wrapper.finalize().to_vec(),
Hasher::HighwayHash256(highway_hasher) => highway_hasher
.finalize256()
.iter()
.flat_map(|&n| n.to_le_bytes()) // convert using little-endian byte order
.collect(),
Hasher::BLAKE2b512(core_wrapper) => core_wrapper.finalize().to_vec(),
}
}
pub fn size(&self) -> usize {
match self {
Hasher::SHA256(_) => Sha256::output_size(),
Hasher::HighwayHash256(_) => 32,
Hasher::BLAKE2b512(_) => Blake2b512::output_size(),
}
}
pub fn block_size(&self) -> usize {
match self {
Hasher::SHA256(_) => Sha256::block_size(),
Hasher::HighwayHash256(_) => 64,
Hasher::BLAKE2b512(_) => 64,
}
}
pub fn reset(&mut self) {
match self {
Hasher::SHA256(core_wrapper) => core_wrapper.reset(),
Hasher::HighwayHash256(highway_hasher) => {
let key = Key(*MAGIC_HIGHWAY_HASH256_KEY);
*highway_hasher = HighwayHasher::new(key);
}
Hasher::BLAKE2b512(core_wrapper) => core_wrapper.reset(),
}
} else {
// Neither inline data nor disk available
Ok(None)
}
}
impl BitrotAlgorithm {
pub fn new_hasher(&self) -> Hasher {
match self {
BitrotAlgorithm::SHA256 => Hasher::SHA256(Sha256::new()),
BitrotAlgorithm::HighwayHash256 | BitrotAlgorithm::HighwayHash256S => {
let key = Key(*MAGIC_HIGHWAY_HASH256_KEY);
Hasher::HighwayHash256(HighwayHasher::new(key))
}
BitrotAlgorithm::BLAKE2b512 => Hasher::BLAKE2b512(Blake2b512::new()),
}
}
pub fn available(&self) -> bool {
BITROT_ALGORITHMS.get(self).is_some()
}
pub fn string(&self) -> String {
BITROT_ALGORITHMS.get(self).map_or("".to_string(), |s| s.to_string())
}
}
#[derive(Debug)]
pub struct BitrotVerifier {
_algorithm: BitrotAlgorithm,
_sum: Vec<u8>,
}
impl BitrotVerifier {
pub fn new(algorithm: BitrotAlgorithm, checksum: &[u8]) -> BitrotVerifier {
BitrotVerifier {
_algorithm: algorithm,
_sum: checksum.to_vec(),
}
}
}
pub fn bitrot_algorithm_from_string(s: &str) -> BitrotAlgorithm {
for (k, v) in BITROT_ALGORITHMS.iter() {
if *v == s {
return k.clone();
}
}
BitrotAlgorithm::HighwayHash256S
}
pub type BitrotWriter = Box<dyn Writer + Send + 'static>;
// pub async fn new_bitrot_writer(
// disk: DiskStore,
// orig_volume: &str,
// volume: &str,
// file_path: &str,
// length: usize,
// algo: BitrotAlgorithm,
// shard_size: usize,
// ) -> Result<BitrotWriter> {
// if algo == BitrotAlgorithm::HighwayHash256S {
// return Ok(Box::new(
// StreamingBitrotWriter::new(disk, orig_volume, volume, file_path, length, algo, shard_size).await?,
// ));
// }
// Ok(Box::new(WholeBitrotWriter::new(disk, volume, file_path, algo, shard_size)))
// }
pub type BitrotReader = Box<dyn ReadAt + Send>;
// #[allow(clippy::too_many_arguments)]
// pub fn new_bitrot_reader(
// disk: DiskStore,
// data: &[u8],
// bucket: &str,
// file_path: &str,
// till_offset: usize,
// algo: BitrotAlgorithm,
// sum: &[u8],
// shard_size: usize,
// ) -> BitrotReader {
// if algo == BitrotAlgorithm::HighwayHash256S {
// return Box::new(StreamingBitrotReader::new(disk, data, bucket, file_path, algo, till_offset, shard_size));
// }
// Box::new(WholeBitrotReader::new(disk, bucket, file_path, algo, till_offset, sum))
// }
pub async fn close_bitrot_writers(writers: &mut [Option<BitrotWriter>]) -> Result<()> {
for w in writers.iter_mut().flatten() {
w.close().await?;
}
Ok(())
}
// pub fn bitrot_writer_sum(w: &BitrotWriter) -> Vec<u8> {
// if let Some(w) = w.as_any().downcast_ref::<WholeBitrotWriter>() {
// return w.hash.clone().finalize();
// }
// Vec::new()
// }
pub fn bitrot_shard_file_size(size: usize, shard_size: usize, algo: BitrotAlgorithm) -> usize {
if algo != BitrotAlgorithm::HighwayHash256S {
return size;
}
size.div_ceil(shard_size) * algo.new_hasher().size() + size
}
pub async fn bitrot_verify(
r: FileReader,
want_size: usize,
part_size: usize,
algo: BitrotAlgorithm,
_want: Vec<u8>,
mut shard_size: usize,
) -> Result<()> {
// if algo != BitrotAlgorithm::HighwayHash256S {
// let mut h = algo.new_hasher();
// h.update(r.get_ref());
// let hash = h.finalize();
// if hash != want {
// info!("bitrot_verify except: {:?}, got: {:?}", want, hash);
// return Err(Error::new(DiskError::FileCorrupt));
// }
// return Ok(());
// }
let mut h = algo.new_hasher();
let mut hash_buf = vec![0; h.size()];
let mut left = want_size;
if left != bitrot_shard_file_size(part_size, shard_size, algo.clone()) {
info!(
"bitrot_shard_file_size failed, left: {}, part_size: {}, shard_size: {}, algo: {:?}",
left, part_size, shard_size, algo
);
return Err(Error::new(DiskError::FileCorrupt));
}
let mut r = r;
while left > 0 {
h.reset();
let n = r.read_exact(&mut hash_buf).await?;
left -= n;
if left < shard_size {
shard_size = left;
}
let mut buf = vec![0; shard_size];
let read = r.read_exact(&mut buf).await?;
h.update(buf);
left -= read;
let hash = h.clone().finalize();
if h.clone().finalize() != hash_buf[0..n] {
info!("bitrot_verify except: {:?}, got: {:?}", hash_buf[0..n].to_vec(), hash);
return Err(Error::new(DiskError::FileCorrupt));
}
}
Ok(())
}
// pub struct WholeBitrotWriter {
// disk: DiskStore,
// volume: String,
// file_path: String,
// _shard_size: usize,
// pub hash: Hasher,
// }
// impl WholeBitrotWriter {
// pub fn new(disk: DiskStore, volume: &str, file_path: &str, algo: BitrotAlgorithm, shard_size: usize) -> Self {
// WholeBitrotWriter {
// disk,
// volume: volume.to_string(),
// file_path: file_path.to_string(),
// _shard_size: shard_size,
// hash: algo.new_hasher(),
// }
// }
// }
// #[async_trait::async_trait]
// impl Writer for WholeBitrotWriter {
// fn as_any(&self) -> &dyn Any {
// self
// }
// async fn write(&mut self, buf: &[u8]) -> Result<()> {
// let mut file = self.disk.append_file(&self.volume, &self.file_path).await?;
// let _ = file.write(buf).await?;
// self.hash.update(buf);
// Ok(())
// }
// }
// #[derive(Debug)]
// pub struct WholeBitrotReader {
// disk: DiskStore,
// volume: String,
// file_path: String,
// _verifier: BitrotVerifier,
// till_offset: usize,
// buf: Option<Vec<u8>>,
// }
// impl WholeBitrotReader {
// pub fn new(disk: DiskStore, volume: &str, file_path: &str, algo: BitrotAlgorithm, till_offset: usize, sum: &[u8]) -> Self {
// Self {
// disk,
// volume: volume.to_string(),
// file_path: file_path.to_string(),
// _verifier: BitrotVerifier::new(algo, sum),
// till_offset,
// buf: None,
// }
// }
// }
// #[async_trait::async_trait]
// impl ReadAt for WholeBitrotReader {
// async fn read_at(&mut self, offset: usize, length: usize) -> Result<(Vec<u8>, usize)> {
// if self.buf.is_none() {
// let buf_len = self.till_offset - offset;
// let mut file = self
// .disk
// .read_file_stream(&self.volume, &self.file_path, offset, length)
// .await?;
// let mut buf = vec![0u8; buf_len];
// file.read_at(offset, &mut buf).await?;
// self.buf = Some(buf);
// }
// if let Some(buf) = &mut self.buf {
// if buf.len() < length {
// return Err(Error::new(DiskError::LessData));
// }
// return Ok((buf.drain(0..length).collect::<Vec<_>>(), length));
// }
// Err(Error::new(DiskError::LessData))
// }
// }
// struct StreamingBitrotWriter {
// hasher: Hasher,
// tx: Sender<Option<Vec<u8>>>,
// task: Option<JoinHandle<()>>,
// }
// impl StreamingBitrotWriter {
// pub async fn new(
// disk: DiskStore,
// orig_volume: &str,
// volume: &str,
// file_path: &str,
// length: usize,
// algo: BitrotAlgorithm,
// shard_size: usize,
// ) -> Result<Self> {
// let hasher = algo.new_hasher();
// let (tx, mut rx) = mpsc::channel::<Option<Vec<u8>>>(10);
// let total_file_size = length.div_ceil(shard_size) * hasher.size() + length;
// let mut writer = disk.create_file(orig_volume, volume, file_path, total_file_size).await?;
// let task = spawn(async move {
// loop {
// if let Some(Some(buf)) = rx.recv().await {
// writer.write(&buf).await.unwrap();
// continue;
// }
// break;
// }
// });
// Ok(StreamingBitrotWriter {
// hasher,
// tx,
// task: Some(task),
// })
// }
// }
// #[async_trait::async_trait]
// impl Writer for StreamingBitrotWriter {
// fn as_any(&self) -> &dyn Any {
// self
// }
// async fn write(&mut self, buf: &[u8]) -> Result<()> {
// if buf.is_empty() {
// return Ok(());
// }
// self.hasher.reset();
// self.hasher.update(buf);
// let hash_bytes = self.hasher.clone().finalize();
// let _ = self.tx.send(Some(hash_bytes)).await?;
// let _ = self.tx.send(Some(buf.to_vec())).await?;
// Ok(())
// }
// async fn close(&mut self) -> Result<()> {
// let _ = self.tx.send(None).await?;
// if let Some(task) = self.task.take() {
// let _ = task.await; // wait for the task to finish
// }
// Ok(())
// }
// }
// #[derive(Debug)]
// struct StreamingBitrotReader {
// disk: DiskStore,
// _data: Vec<u8>,
// volume: String,
// file_path: String,
// till_offset: usize,
// curr_offset: usize,
// hasher: Hasher,
// shard_size: usize,
// buf: Vec<u8>,
// hash_bytes: Vec<u8>,
// }
// impl StreamingBitrotReader {
// pub fn new(
// disk: DiskStore,
// data: &[u8],
// volume: &str,
// file_path: &str,
// algo: BitrotAlgorithm,
// till_offset: usize,
// shard_size: usize,
// ) -> Self {
// let hasher = algo.new_hasher();
// Self {
// disk,
// _data: data.to_vec(),
// volume: volume.to_string(),
// file_path: file_path.to_string(),
// till_offset: till_offset.div_ceil(shard_size) * hasher.size() + till_offset,
// curr_offset: 0,
// hash_bytes: Vec::with_capacity(hasher.size()),
// hasher,
// shard_size,
// buf: Vec::new(),
// }
// }
// }
// #[async_trait::async_trait]
// impl ReadAt for StreamingBitrotReader {
// async fn read_at(&mut self, offset: usize, length: usize) -> Result<(Vec<u8>, usize)> {
// if offset % self.shard_size != 0 {
// return Err(Error::new(DiskError::Unexpected));
// }
// if self.buf.is_empty() {
// self.curr_offset = offset;
// let stream_offset = (offset / self.shard_size) * self.hasher.size() + offset;
// let buf_len = self.till_offset - stream_offset;
// let mut file = self.disk.read_file(&self.volume, &self.file_path).await?;
// let mut buf = vec![0u8; buf_len];
// file.read_at(stream_offset, &mut buf).await?;
// self.buf = buf;
// }
// if offset != self.curr_offset {
// return Err(Error::new(DiskError::Unexpected));
// }
// self.hash_bytes = self.buf.drain(0..self.hash_bytes.capacity()).collect();
// let buf = self.buf.drain(0..length).collect::<Vec<_>>();
// self.hasher.reset();
// self.hasher.update(&buf);
// let actual = self.hasher.clone().finalize();
// if actual != self.hash_bytes {
// return Err(Error::new(DiskError::FileCorrupt));
// }
// let readed_len = buf.len();
// self.curr_offset += readed_len;
// Ok((buf, readed_len))
// }
// }
pub struct BitrotFileWriter {
inner: Option<FileWriter>,
hasher: Hasher,
_shard_size: usize,
inline: bool,
inline_data: Vec<u8>,
}
impl BitrotFileWriter {
pub async fn new(
disk: Arc<Disk>,
volume: &str,
path: &str,
inline: bool,
algo: BitrotAlgorithm,
_shard_size: usize,
) -> Result<Self> {
let inner = if !inline {
Some(disk.create_file("", volume, path, 0).await?)
} else {
None
};
let hasher = algo.new_hasher();
Ok(Self {
inner,
inline,
inline_data: Vec::new(),
hasher,
_shard_size,
})
}
// pub fn writer(&self) -> &FileWriter {
// &self.inner
// }
pub fn inline_data(&self) -> &[u8] {
&self.inline_data
}
}
#[async_trait::async_trait]
impl Writer for BitrotFileWriter {
fn as_any(&self) -> &dyn Any {
self
}
#[tracing::instrument(level = "info", skip_all)]
async fn write(&mut self, buf: Bytes) -> Result<()> {
if buf.is_empty() {
return Ok(());
}
let mut hasher = self.hasher.clone();
let h_buf = buf.clone();
let hash_bytes = tokio::spawn(async move {
hasher.reset();
hasher.update(h_buf);
hasher.finalize()
})
.await?;
if let Some(f) = self.inner.as_mut() {
f.write_all(&hash_bytes).await?;
f.write_all(&buf).await?;
} else {
self.inline_data.extend_from_slice(&hash_bytes);
self.inline_data.extend_from_slice(&buf);
}
Ok(())
}
async fn close(&mut self) -> Result<()> {
if self.inline {
return Ok(());
}
if let Some(f) = self.inner.as_mut() {
f.shutdown().await?;
}
Ok(())
}
}
pub async fn new_bitrot_filewriter(
disk: Arc<Disk>,
/// Create a new BitrotWriterWrapper based on the provided parameters
///
/// # Parameters
/// - `is_inline_buffer`: If true, creates an in-memory buffer writer; if false, uses disk storage
/// - `disk`: Optional disk instance for file creation (used when is_inline_buffer is false)
/// - `shard_size`: Size of each shard for bitrot calculation
/// - `checksum_algo`: Hash algorithm to use for bitrot verification
/// - `volume`: Volume/bucket name for disk storage
/// - `path`: File path for disk storage
/// - `length`: Expected file length for disk storage
///
/// # Returns
/// A Result containing the BitrotWriterWrapper or an error
pub async fn create_bitrot_writer(
is_inline_buffer: bool,
disk: Option<&DiskStore>,
volume: &str,
path: &str,
inline: bool,
algo: BitrotAlgorithm,
length: i64,
shard_size: usize,
) -> Result<BitrotWriter> {
let w = BitrotFileWriter::new(disk, volume, path, inline, algo, shard_size).await?;
checksum_algo: HashAlgorithm,
) -> disk::error::Result<BitrotWriterWrapper> {
let writer = if is_inline_buffer {
CustomWriter::new_inline_buffer()
} else if let Some(disk) = disk {
let length = if length > 0 {
let length = length as usize;
(length.div_ceil(shard_size) * checksum_algo.size() + length) as i64
} else {
0
};
Ok(Box::new(w))
}
let file = disk.create_file("", volume, path, length).await?;
CustomWriter::new_tokio_writer(file)
} else {
return Err(DiskError::DiskNotFound);
};
struct BitrotFileReader {
disk: Arc<Disk>,
data: Option<Vec<u8>>,
volume: String,
file_path: String,
reader: Option<FileReader>,
till_offset: usize,
curr_offset: usize,
hasher: Hasher,
shard_size: usize,
// buf: Vec<u8>,
hash_bytes: Vec<u8>,
read_buf: Vec<u8>,
}
fn ceil(a: usize, b: usize) -> usize {
a.div_ceil(b)
}
impl BitrotFileReader {
pub fn new(
disk: Arc<Disk>,
data: Option<Vec<u8>>,
volume: String,
file_path: String,
algo: BitrotAlgorithm,
till_offset: usize,
shard_size: usize,
) -> Self {
let hasher = algo.new_hasher();
Self {
disk,
data,
volume,
file_path,
till_offset: ceil(till_offset, shard_size) * hasher.size() + till_offset,
curr_offset: 0,
hash_bytes: vec![0u8; hasher.size()],
hasher,
shard_size,
// buf: Vec::new(),
read_buf: Vec::new(),
reader: None,
}
}
}
#[async_trait::async_trait]
impl ReadAt for BitrotFileReader {
// Read `length` bytes at `offset`, verifying the shard's bitrot checksum
async fn read_at(&mut self, offset: usize, length: usize) -> Result<(Vec<u8>, usize)> {
if offset % self.shard_size != 0 {
error!(
"BitrotFileReader read_at offset % self.shard_size != 0 , {} % {} = {}",
offset,
self.shard_size,
offset % self.shard_size
);
return Err(Error::new(DiskError::Unexpected));
}
if self.reader.is_none() {
self.curr_offset = offset;
let stream_offset = (offset / self.shard_size) * self.hasher.size() + offset;
if let Some(data) = self.data.clone() {
self.reader = Some(Box::new(Cursor::new(data)));
} else {
self.reader = Some(
self.disk
.read_file_stream(&self.volume, &self.file_path, stream_offset, self.till_offset - stream_offset)
.await?,
);
}
}
if offset != self.curr_offset {
error!(
"BitrotFileReader read_at {}/{} offset != self.curr_offset, {} != {}",
&self.volume, &self.file_path, offset, self.curr_offset
);
return Err(Error::new(DiskError::Unexpected));
}
let reader = self.reader.as_mut().unwrap();
// let mut hash_buf = self.hash_bytes;
self.hash_bytes.clear();
self.hash_bytes.resize(self.hasher.size(), 0u8);
reader.read_exact(&mut self.hash_bytes).await?;
self.read_buf.clear();
self.read_buf.resize(length, 0u8);
reader.read_exact(&mut self.read_buf).await?;
self.hasher.reset();
self.hasher.update(&self.read_buf);
let actual = self.hasher.clone().finalize();
if actual != self.hash_bytes {
error!(
"BitrotFileReader read_at actual != self.hash_bytes, {:?} != {:?}",
actual, self.hash_bytes
);
return Err(Error::new(DiskError::FileCorrupt));
}
let readed_len = self.read_buf.len();
self.curr_offset += readed_len;
Ok((self.read_buf.clone(), readed_len))
// let stream_offset = (offset / self.shard_size) * self.hasher.size() + offset;
// let buf_len = self.hasher.size() + length;
// self.read_buf.clear();
// self.read_buf.resize(buf_len, 0u8);
// self.inner.read_at(stream_offset, &mut self.read_buf).await?;
// let hash_bytes = &self.read_buf.as_slice()[0..self.hash_bytes.capacity()];
// self.hash_bytes.clone_from_slice(hash_bytes);
// let buf = self.read_buf.as_slice()[self.hash_bytes.capacity()..self.hash_bytes.capacity() + length].to_vec();
// self.hasher.reset();
// self.hasher.update(&buf);
// let actual = self.hasher.clone().finalize();
// if actual != self.hash_bytes {
// return Err(Error::new(DiskError::FileCorrupt));
// }
// let readed_len = buf.len();
// self.curr_offset += readed_len;
// Ok((buf, readed_len))
}
}
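On the read side, a data offset that is a multiple of shard_size maps into the hash-framed stream as (offset / shard_size) * hash_size + offset, i.e. one extra hash prefix for every fully preceding shard. A small standalone check of that arithmetic, assuming 32-byte digests:
fn stream_offset(data_offset: usize, shard_size: usize, hash_size: usize) -> usize {
    // Same formula as read_at above: one hash prefix per fully preceding shard.
    (data_offset / shard_size) * hash_size + data_offset
}
fn main() {
    let (shard, hash) = (1024, 32);
    assert_eq!(stream_offset(0, shard, hash), 0);                          // start of shard 0
    assert_eq!(stream_offset(shard, shard, hash), hash + shard);           // start of shard 1
    assert_eq!(stream_offset(3 * shard, shard, hash), 3 * (hash + shard)); // start of shard 3
}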
pub fn new_bitrot_filereader(
disk: Arc<Disk>,
data: Option<Vec<u8>>,
volume: String,
file_path: String,
till_offset: usize,
algo: BitrotAlgorithm,
shard_size: usize,
) -> BitrotReader {
Box::new(BitrotFileReader::new(disk, data, volume, file_path, algo, till_offset, shard_size))
Ok(BitrotWriterWrapper::new(writer, shard_size, checksum_algo))
}
#[cfg(test)]
mod test {
use std::collections::HashMap;
mod tests {
use super::*;
use crate::{disk::error::DiskError, store_api::BitrotAlgorithm};
use common::error::{Error, Result};
use hex_simd::decode_to_vec;
#[tokio::test]
async fn test_create_bitrot_reader_with_inline_data() {
let test_data = b"hello world test data";
let shard_size = 16;
let checksum_algo = HashAlgorithm::HighwayHash256;
// use super::{bitrot_writer_sum, new_bitrot_reader};
let result =
create_bitrot_reader(Some(test_data), None, "test-bucket", "test-path", 0, 0, shard_size, checksum_algo).await;
#[test]
fn bitrot_self_test() -> Result<()> {
let mut checksums = HashMap::new();
checksums.insert(
BitrotAlgorithm::SHA256,
"a7677ff19e0182e4d52e3a3db727804abc82a5818749336369552e54b838b004",
);
checksums.insert(BitrotAlgorithm::BLAKE2b512, "e519b7d84b1c3c917985f544773a35cf265dcab10948be3550320d156bab612124a5ae2ae5a8c73c0eea360f68b0e28136f26e858756dbfe7375a7389f26c669");
checksums.insert(
BitrotAlgorithm::HighwayHash256,
"c81c2386a1f565e805513d630d4e50ff26d11269b21c221cf50fc6c29d6ff75b",
);
checksums.insert(
BitrotAlgorithm::HighwayHash256S,
"c81c2386a1f565e805513d630d4e50ff26d11269b21c221cf50fc6c29d6ff75b",
);
let iter = [
BitrotAlgorithm::SHA256,
BitrotAlgorithm::BLAKE2b512,
BitrotAlgorithm::HighwayHash256,
];
for algo in iter.iter() {
if !algo.available() || *algo != BitrotAlgorithm::HighwayHash256 {
continue;
}
let checksum = decode_to_vec(checksums.get(algo).unwrap())?;
let mut h = algo.new_hasher();
let mut msg = Vec::with_capacity(h.size() * h.block_size());
let mut sum = Vec::with_capacity(h.size());
for _ in (0..h.size() * h.block_size()).step_by(h.size()) {
h.update(&msg);
sum = h.finalize();
msg.extend(sum.clone());
h = algo.new_hasher();
}
if checksum != sum {
return Err(Error::new(DiskError::FileCorrupt));
}
}
Ok(())
assert!(result.is_ok());
assert!(result.unwrap().is_some());
}
// #[tokio::test]
// async fn test_all_bitrot_algorithms() -> Result<()> {
// for algo in BITROT_ALGORITHMS.keys() {
// test_bitrot_reader_writer_algo(algo.clone()).await?;
// }
#[tokio::test]
async fn test_create_bitrot_reader_without_data_or_disk() {
let shard_size = 16;
let checksum_algo = HashAlgorithm::HighwayHash256;
// Ok(())
// }
let result = create_bitrot_reader(None, None, "test-bucket", "test-path", 0, 1024, shard_size, checksum_algo).await;
// async fn test_bitrot_reader_writer_algo(algo: BitrotAlgorithm) -> Result<()> {
// let temp_dir = TempDir::new().unwrap().path().to_string_lossy().to_string();
// fs::create_dir_all(&temp_dir)?;
// let volume = "testvol";
// let file_path = "testfile";
assert!(result.is_ok());
assert!(result.unwrap().is_none());
}
// let ep = Endpoint::try_from(temp_dir.as_str())?;
// let opt = DiskOption::default();
// let disk = new_disk(&ep, &opt).await?;
// disk.make_volume(volume).await?;
// let mut writer = new_bitrot_writer(disk.clone(), "", volume, file_path, 35, algo.clone(), 10).await?;
#[tokio::test]
async fn test_create_bitrot_writer_inline() {
use rustfs_utils::HashAlgorithm;
// writer.write(b"aaaaaaaaaa").await?;
// writer.write(b"aaaaaaaaaa").await?;
// writer.write(b"aaaaaaaaaa").await?;
// writer.write(b"aaaaa").await?;
let wrapper = create_bitrot_writer(
true, // is_inline_buffer
None, // disk not needed for inline buffer
"test-volume",
"test-path",
1024, // length
1024, // shard_size
HashAlgorithm::HighwayHash256,
)
.await;
// let sum = bitrot_writer_sum(&writer);
// writer.close().await?;
assert!(wrapper.is_ok());
let mut wrapper = wrapper.unwrap();
// let mut reader = new_bitrot_reader(disk, b"", volume, file_path, 35, algo, &sum, 10);
// let read_len = 10;
// let mut result: Vec<u8>;
// (result, _) = reader.read_at(0, read_len).await?;
// assert_eq!(result, b"aaaaaaaaaa");
// (result, _) = reader.read_at(10, read_len).await?;
// assert_eq!(result, b"aaaaaaaaaa");
// (result, _) = reader.read_at(20, read_len).await?;
// assert_eq!(result, b"aaaaaaaaaa");
// (result, _) = reader.read_at(30, read_len / 2).await?;
// assert_eq!(result, b"aaaaa");
// Test writing some data
let test_data = b"hello world";
let result = wrapper.write(test_data).await;
assert!(result.is_ok());
// Ok(())
// }
// Test getting inline data
let inline_data = wrapper.into_inline_data();
assert!(inline_data.is_some());
// The inline data should contain both hash and data
let data = inline_data.unwrap();
assert!(!data.is_empty());
}
#[tokio::test]
async fn test_create_bitrot_writer_disk_without_disk() {
use rustfs_utils::HashAlgorithm;
// Test error case: trying to create disk writer without providing disk instance
let wrapper = create_bitrot_writer(
false, // is_inline_buffer = false, so needs disk
None, // disk = None, should cause error
"test-volume",
"test-path",
1024, // length
1024, // shard_size
HashAlgorithm::HighwayHash256,
)
.await;
assert!(wrapper.is_err());
let error = wrapper.unwrap_err();
println!("error: {:?}", error);
assert_eq!(error, DiskError::DiskNotFound);
}
}

View File

@@ -1,6 +1,6 @@
use common::error::Error;
use crate::error::Error;
#[derive(Debug, thiserror::Error, PartialEq, Eq)]
#[derive(Debug, thiserror::Error)]
pub enum BucketMetadataError {
#[error("tagging not found")]
TaggingNotFound,
@@ -18,18 +18,58 @@ pub enum BucketMetadataError {
BucketReplicationConfigNotFound,
#[error("bucket remote target not found")]
BucketRemoteTargetNotFound,
#[error("Io error: {0}")]
Io(std::io::Error),
}
impl BucketMetadataError {
pub fn is(&self, err: &Error) -> bool {
if let Some(e) = err.downcast_ref::<BucketMetadataError>() {
e == self
} else {
false
pub fn other<E>(error: E) -> Self
where
E: Into<Box<dyn std::error::Error + Send + Sync>>,
{
BucketMetadataError::Io(std::io::Error::other(error))
}
}
impl From<BucketMetadataError> for Error {
fn from(e: BucketMetadataError) -> Self {
match e {
BucketMetadataError::BucketPolicyNotFound => Error::BucketPolicyNotFound,
_ => Error::other(e),
}
}
}
impl From<Error> for BucketMetadataError {
fn from(e: Error) -> Self {
match e {
Error::BucketPolicyNotFound => BucketMetadataError::BucketPolicyNotFound,
Error::Io(e) => e.into(),
_ => BucketMetadataError::other(e),
}
}
}
impl From<std::io::Error> for BucketMetadataError {
fn from(e: std::io::Error) -> Self {
e.downcast::<BucketMetadataError>().unwrap_or_else(BucketMetadataError::other)
}
}
impl PartialEq for BucketMetadataError {
fn eq(&self, other: &Self) -> bool {
match (self, other) {
(BucketMetadataError::Io(e1), BucketMetadataError::Io(e2)) => {
e1.kind() == e2.kind() && e1.to_string() == e2.to_string()
}
(e1, e2) => e1.to_u32() == e2.to_u32(),
}
}
}
impl Eq for BucketMetadataError {}
impl BucketMetadataError {
pub fn to_u32(&self) -> u32 {
match self {
@@ -41,6 +81,7 @@ impl BucketMetadataError {
BucketMetadataError::BucketQuotaConfigNotFound => 0x06,
BucketMetadataError::BucketReplicationConfigNotFound => 0x07,
BucketMetadataError::BucketRemoteTargetNotFound => 0x08,
BucketMetadataError::Io(_) => 0x09,
}
}
@@ -54,6 +95,7 @@ impl BucketMetadataError {
0x06 => Some(BucketMetadataError::BucketQuotaConfigNotFound),
0x07 => Some(BucketMetadataError::BucketReplicationConfigNotFound),
0x08 => Some(BucketMetadataError::BucketRemoteTargetNotFound),
0x09 => Some(BucketMetadataError::Io(std::io::Error::other("Io error"))),
_ => None,
}
}
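The From impls above replace the old downcast-based checks; a minimal sketch of both conversion directions, assuming it sits inside the ecstore crate so the paths shown here resolve:
use crate::bucket::error::BucketMetadataError;
use crate::error::Error;
// Downcast-free classification: convert first, then compare variants
// (mirrors how PolicySys handles BucketPolicyNotFound below).
fn is_policy_missing(err: Error) -> bool {
    let berr: BucketMetadataError = err.into();
    berr == BucketMetadataError::BucketPolicyNotFound
}
// The reverse direction: a module error bubbles up as the storage-layer
// Error via the From impl above, preserving the well-known variant.
fn missing_policy() -> crate::error::Result<()> {
    Err(BucketMetadataError::BucketPolicyNotFound.into())
}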

View File

@@ -17,13 +17,13 @@ use time::OffsetDateTime;
use tracing::error;
use crate::bucket::target::BucketTarget;
use crate::bucket::utils::deserialize;
use crate::config::com::{read_config, save_config};
use crate::{config, new_object_layer_fn};
use common::error::{Error, Result};
use crate::error::{Error, Result};
use crate::new_object_layer_fn;
use crate::disk::BUCKET_META_PREFIX;
use crate::store::ECStore;
use crate::utils::xml::deserialize;
pub const BUCKET_METADATA_FILE: &str = ".metadata.bin";
pub const BUCKET_METADATA_FORMAT: u16 = 1;
@@ -178,7 +178,7 @@ impl BucketMetadata {
pub fn check_header(buf: &[u8]) -> Result<()> {
if buf.len() <= 4 {
return Err(Error::msg("read_bucket_metadata: data invalid"));
return Err(Error::other("read_bucket_metadata: data invalid"));
}
let format = LittleEndian::read_u16(&buf[0..2]);
@@ -186,12 +186,12 @@ impl BucketMetadata {
match format {
BUCKET_METADATA_FORMAT => {}
_ => return Err(Error::msg("read_bucket_metadata: format invalid")),
_ => return Err(Error::other("read_bucket_metadata: format invalid")),
}
match version {
BUCKET_METADATA_VERSION => {}
_ => return Err(Error::msg("read_bucket_metadata: version invalid")),
_ => return Err(Error::other("read_bucket_metadata: version invalid")),
}
Ok(())
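check_header above expects at least a 4-byte prefix: a little-endian u16 format followed, presumably at bytes 2..4, by a little-endian u16 version. A hedged sketch of building a buffer that would pass it, assuming BUCKET_METADATA_VERSION is likewise a u16 constant:
// Hypothetical helper for illustration; the version offset (bytes 2..4)
// is an assumption based on the `buf.len() <= 4` guard above.
fn encode_header(payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(4 + payload.len());
    buf.extend_from_slice(&BUCKET_METADATA_FORMAT.to_le_bytes());  // bytes 0..2
    buf.extend_from_slice(&BUCKET_METADATA_VERSION.to_le_bytes()); // bytes 2..4
    buf.extend_from_slice(payload);
    buf
}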
@@ -285,7 +285,7 @@ impl BucketMetadata {
self.bucket_targets_config_json = data.clone();
self.bucket_targets_config_updated_at = updated;
}
_ => return Err(Error::msg(format!("config file not found : {}", config_file))),
_ => return Err(Error::other(format!("config file not found : {}", config_file))),
}
Ok(updated)
@@ -296,7 +296,9 @@ impl BucketMetadata {
}
pub async fn save(&mut self) -> Result<()> {
let Some(store) = new_object_layer_fn() else { return Err(Error::msg("errServerNotInitialized")) };
let Some(store) = new_object_layer_fn() else {
return Err(Error::other("errServerNotInitialized"));
};
self.parse_all_configs(store.clone())?;
@@ -364,7 +366,7 @@ pub async fn load_bucket_metadata_parse(api: Arc<ECStore>, bucket: &str, parse:
let mut bm = match read_bucket_metadata(api.clone(), bucket).await {
Ok(res) => res,
Err(err) => {
if !config::error::is_err_config_not_found(&err) {
if err != Error::ConfigNotFound {
return Err(err);
}
@@ -388,7 +390,7 @@ pub async fn load_bucket_metadata_parse(api: Arc<ECStore>, bucket: &str, parse:
async fn read_bucket_metadata(api: Arc<ECStore>, bucket: &str) -> Result<BucketMetadata> {
if bucket.is_empty() {
error!("bucket name empty");
return Err(Error::msg("invalid argument"));
return Err(Error::other("invalid argument"));
}
let bm = BucketMetadata::new(bucket);
@@ -403,7 +405,7 @@ async fn read_bucket_metadata(api: Arc<ECStore>, bucket: &str) -> Result<BucketM
Ok(bm)
}
fn _write_time<S>(t: &OffsetDateTime, s: S) -> Result<S::Ok, S::Error>
fn _write_time<S>(t: &OffsetDateTime, s: S) -> std::result::Result<S::Ok, S::Error>
where
S: Serializer,
{

View File

@@ -3,18 +3,15 @@ use std::sync::OnceLock;
use std::time::Duration;
use std::{collections::HashMap, sync::Arc};
use crate::StorageAPI;
use crate::bucket::error::BucketMetadataError;
use crate::bucket::metadata::{load_bucket_metadata_parse, BUCKET_LIFECYCLE_CONFIG};
use crate::bucket::utils::is_meta_bucketname;
use crate::bucket::metadata::{BUCKET_LIFECYCLE_CONFIG, load_bucket_metadata_parse};
use crate::bucket::utils::{deserialize, is_meta_bucketname};
use crate::cmd::bucket_targets;
use crate::config::error::ConfigError;
use crate::disk::error::DiskError;
use crate::global::{is_dist_erasure, is_erasure, new_object_layer_fn, GLOBAL_Endpoints};
use crate::error::{Error, Result, is_err_bucket_not_found};
use crate::global::{GLOBAL_Endpoints, is_dist_erasure, is_erasure, new_object_layer_fn};
use crate::heal::heal_commands::HealOpts;
use crate::store::ECStore;
use crate::utils::xml::deserialize;
use crate::{config, StorageAPI};
use common::error::{Error, Result};
use futures::future::join_all;
use policy::policy::BucketPolicy;
use s3s::dto::{
@@ -26,7 +23,7 @@ use tokio::sync::RwLock;
use tokio::time::sleep;
use tracing::{error, warn};
use super::metadata::{load_bucket_metadata, BucketMetadata};
use super::metadata::{BucketMetadata, load_bucket_metadata};
use super::quota::BucketQuota;
use super::target::BucketTargets;
@@ -50,7 +47,7 @@ pub(super) fn get_bucket_metadata_sys() -> Result<Arc<RwLock<BucketMetadataSys>>
if let Some(sys) = GLOBAL_BucketMetadataSys.get() {
Ok(sys.clone())
} else {
Err(Error::msg("GLOBAL_BucketMetadataSys not init"))
Err(Error::other("GLOBAL_BucketMetadataSys not init"))
}
}
@@ -168,7 +165,7 @@ impl BucketMetadataSys {
if let Some(endpoints) = GLOBAL_Endpoints.get() {
endpoints.es_count() * 10
} else {
return Err(Error::msg("GLOBAL_Endpoints not init"));
return Err(Error::other("GLOBAL_Endpoints not init"));
}
};
@@ -248,14 +245,14 @@ impl BucketMetadataSys {
pub async fn get(&self, bucket: &str) -> Result<Arc<BucketMetadata>> {
if is_meta_bucketname(bucket) {
return Err(Error::new(ConfigError::NotFound));
return Err(Error::ConfigNotFound);
}
let map = self.metadata_map.read().await;
if let Some(bm) = map.get(bucket) {
Ok(bm.clone())
} else {
Err(Error::new(ConfigError::NotFound))
Err(Error::ConfigNotFound)
}
}
@@ -280,7 +277,7 @@ impl BucketMetadataSys {
let meta = match self.get_config_from_disk(bucket).await {
Ok(res) => res,
Err(err) => {
if !config::error::is_err_config_not_found(&err) {
if err != Error::ConfigNotFound {
return Err(err);
} else {
BucketMetadata::new(bucket)
@@ -304,16 +301,18 @@ impl BucketMetadataSys {
}
async fn update_and_parse(&mut self, bucket: &str, config_file: &str, data: Vec<u8>, parse: bool) -> Result<OffsetDateTime> {
let Some(store) = new_object_layer_fn() else { return Err(Error::msg("errServerNotInitialized")) };
let Some(store) = new_object_layer_fn() else {
return Err(Error::other("errServerNotInitialized"));
};
if is_meta_bucketname(bucket) {
return Err(Error::msg("errInvalidArgument"));
return Err(Error::other("errInvalidArgument"));
}
let mut bm = match load_bucket_metadata_parse(store, bucket, parse).await {
Ok(res) => res,
Err(err) => {
if !is_erasure().await && !is_dist_erasure().await && DiskError::VolumeNotFound.is(&err) {
if !is_erasure().await && !is_dist_erasure().await && is_err_bucket_not_found(&err) {
BucketMetadata::new(bucket)
} else {
return Err(err);
@@ -330,7 +329,7 @@ impl BucketMetadataSys {
async fn save(&self, bm: BucketMetadata) -> Result<()> {
if is_meta_bucketname(&bm.name) {
return Err(Error::msg("errInvalidArgument"));
return Err(Error::other("errInvalidArgument"));
}
let mut bm = bm;
@@ -345,7 +344,7 @@ impl BucketMetadataSys {
pub async fn get_config_from_disk(&self, bucket: &str) -> Result<BucketMetadata> {
println!("load data from disk");
if is_meta_bucketname(bucket) {
return Err(Error::msg("errInvalidArgument"));
return Err(Error::other("errInvalidArgument"));
}
load_bucket_metadata(self.api.clone(), bucket).await
@@ -364,10 +363,10 @@ impl BucketMetadataSys {
Ok(res) => res,
Err(err) => {
return if *self.initialized.read().await {
Err(Error::msg("errBucketMetadataNotInitialized"))
Err(Error::other("errBucketMetadataNotInitialized"))
} else {
Err(err)
}
};
}
};
@@ -385,7 +384,7 @@ impl BucketMetadataSys {
Ok((res, _)) => res,
Err(err) => {
warn!("get_versioning_config err {:?}", &err);
return if config::error::is_err_config_not_found(&err) {
return if err == Error::ConfigNotFound {
Ok((VersioningConfiguration::default(), OffsetDateTime::UNIX_EPOCH))
} else {
Err(err)
@@ -405,8 +404,8 @@ impl BucketMetadataSys {
Ok((res, _)) => res,
Err(err) => {
warn!("get_bucket_policy err {:?}", &err);
return if config::error::is_err_config_not_found(&err) {
Err(Error::new(BucketMetadataError::BucketPolicyNotFound))
return if err == Error::ConfigNotFound {
Err(BucketMetadataError::BucketPolicyNotFound.into())
} else {
Err(err)
};
@@ -416,7 +415,7 @@ impl BucketMetadataSys {
if let Some(config) = &bm.policy_config {
Ok((config.clone(), bm.policy_config_updated_at))
} else {
Err(Error::new(BucketMetadataError::BucketPolicyNotFound))
Err(BucketMetadataError::BucketPolicyNotFound.into())
}
}
@@ -425,8 +424,8 @@ impl BucketMetadataSys {
Ok((res, _)) => res,
Err(err) => {
warn!("get_tagging_config err {:?}", &err);
return if config::error::is_err_config_not_found(&err) {
Err(Error::new(BucketMetadataError::TaggingNotFound))
return if err == Error::ConfigNotFound {
Err(BucketMetadataError::TaggingNotFound.into())
} else {
Err(err)
};
@@ -436,7 +435,7 @@ impl BucketMetadataSys {
if let Some(config) = &bm.tagging_config {
Ok((config.clone(), bm.tagging_config_updated_at))
} else {
Err(Error::new(BucketMetadataError::TaggingNotFound))
Err(BucketMetadataError::TaggingNotFound.into())
}
}
@@ -444,9 +443,8 @@ impl BucketMetadataSys {
let bm = match self.get_config(bucket).await {
Ok((res, _)) => res,
Err(err) => {
warn!("get_object_lock_config err {:?}", &err);
return if config::error::is_err_config_not_found(&err) {
Err(Error::new(BucketMetadataError::BucketObjectLockConfigNotFound))
return if err == Error::ConfigNotFound {
Err(BucketMetadataError::BucketObjectLockConfigNotFound.into())
} else {
Err(err)
};
@@ -456,7 +454,7 @@ impl BucketMetadataSys {
if let Some(config) = &bm.object_lock_config {
Ok((config.clone(), bm.object_lock_config_updated_at))
} else {
Err(Error::new(BucketMetadataError::BucketObjectLockConfigNotFound))
Err(BucketMetadataError::BucketObjectLockConfigNotFound.into())
}
}
@@ -465,8 +463,8 @@ impl BucketMetadataSys {
Ok((res, _)) => res,
Err(err) => {
warn!("get_lifecycle_config err {:?}", &err);
return if config::error::is_err_config_not_found(&err) {
Err(Error::new(BucketMetadataError::BucketLifecycleNotFound))
return if err == Error::ConfigNotFound {
Err(BucketMetadataError::BucketLifecycleNotFound.into())
} else {
Err(err)
};
@@ -475,12 +473,12 @@ impl BucketMetadataSys {
if let Some(config) = &bm.lifecycle_config {
if config.rules.is_empty() {
Err(Error::new(BucketMetadataError::BucketLifecycleNotFound))
Err(BucketMetadataError::BucketLifecycleNotFound.into())
} else {
Ok((config.clone(), bm.lifecycle_config_updated_at))
}
} else {
Err(Error::new(BucketMetadataError::BucketLifecycleNotFound))
Err(BucketMetadataError::BucketLifecycleNotFound.into())
}
}
@@ -489,7 +487,7 @@ impl BucketMetadataSys {
Ok((bm, _)) => bm.notification_config.clone(),
Err(err) => {
warn!("get_notification_config err {:?}", &err);
if config::error::is_err_config_not_found(&err) {
if err == Error::ConfigNotFound {
None
} else {
return Err(err);
@@ -505,8 +503,8 @@ impl BucketMetadataSys {
Ok((res, _)) => res,
Err(err) => {
warn!("get_sse_config err {:?}", &err);
return if config::error::is_err_config_not_found(&err) {
Err(Error::new(BucketMetadataError::BucketSSEConfigNotFound))
return if err == Error::ConfigNotFound {
Err(BucketMetadataError::BucketSSEConfigNotFound.into())
} else {
Err(err)
};
@@ -516,7 +514,7 @@ impl BucketMetadataSys {
if let Some(config) = &bm.sse_config {
Ok((config.clone(), bm.encryption_config_updated_at))
} else {
Err(Error::new(BucketMetadataError::BucketSSEConfigNotFound))
Err(BucketMetadataError::BucketSSEConfigNotFound.into())
}
}
@@ -536,8 +534,8 @@ impl BucketMetadataSys {
Ok((res, _)) => res,
Err(err) => {
warn!("get_quota_config err {:?}", &err);
return if config::error::is_err_config_not_found(&err) {
Err(Error::new(BucketMetadataError::BucketQuotaConfigNotFound))
return if err == Error::ConfigNotFound {
Err(BucketMetadataError::BucketQuotaConfigNotFound.into())
} else {
Err(err)
};
@@ -547,7 +545,7 @@ impl BucketMetadataSys {
if let Some(config) = &bm.quota_config {
Ok((config.clone(), bm.quota_config_updated_at))
} else {
Err(Error::new(BucketMetadataError::BucketQuotaConfigNotFound))
Err(BucketMetadataError::BucketQuotaConfigNotFound.into())
}
}
@@ -555,14 +553,14 @@ impl BucketMetadataSys {
let (bm, reload) = match self.get_config(bucket).await {
Ok(res) => {
if res.0.replication_config.is_none() {
return Err(Error::new(BucketMetadataError::BucketReplicationConfigNotFound));
return Err(BucketMetadataError::BucketReplicationConfigNotFound.into());
}
res
}
Err(err) => {
warn!("get_replication_config err {:?}", &err);
return if config::error::is_err_config_not_found(&err) {
Err(Error::new(BucketMetadataError::BucketReplicationConfigNotFound))
return if err == Error::ConfigNotFound {
Err(BucketMetadataError::BucketReplicationConfigNotFound.into())
} else {
Err(err)
};
@@ -576,7 +574,7 @@ impl BucketMetadataSys {
//println!("549 {:?}", config.clone());
Ok((config.clone(), bm.replication_config_updated_at))
} else {
Err(Error::new(BucketMetadataError::BucketReplicationConfigNotFound))
Err(BucketMetadataError::BucketReplicationConfigNotFound.into())
}
}
@@ -585,8 +583,8 @@ impl BucketMetadataSys {
Ok(res) => res,
Err(err) => {
warn!("get_replication_config err {:?}", &err);
return if config::error::is_err_config_not_found(&err) {
Err(Error::new(BucketMetadataError::BucketRemoteTargetNotFound))
return if err == Error::ConfigNotFound {
Err(BucketMetadataError::BucketRemoteTargetNotFound.into())
} else {
Err(err)
};
@@ -603,7 +601,7 @@ impl BucketMetadataSys {
Ok(config.clone())
} else {
Err(Error::new(BucketMetadataError::BucketRemoteTargetNotFound))
Err(BucketMetadataError::BucketRemoteTargetNotFound.into())
}
}
}

View File

@@ -1,5 +1,5 @@
use super::{error::BucketMetadataError, metadata_sys::get_bucket_metadata_sys};
use common::error::Result;
use crate::error::Result;
use policy::policy::{BucketPolicy, BucketPolicyArgs};
use tracing::warn;
@@ -10,8 +10,9 @@ impl PolicySys {
match Self::get(args.bucket).await {
Ok(cfg) => return cfg.is_allowed(args),
Err(err) => {
if !BucketMetadataError::BucketPolicyNotFound.is(&err) {
warn!("config get err {:?}", err);
let berr: BucketMetadataError = err.into();
if berr != BucketMetadataError::BucketPolicyNotFound {
warn!("config get err {:?}", berr);
}
}
}

View File

@@ -1,4 +1,4 @@
use common::error::Result;
use crate::error::Result;
use rmp_serde::Serializer as rmpSerializer;
use serde::{Deserialize, Serialize};

View File

@@ -1,4 +1,4 @@
use common::error::Result;
use crate::error::Result;
use rmp_serde::Serializer as rmpSerializer;
use serde::{Deserialize, Serialize};
use time::OffsetDateTime;

View File

@@ -1,5 +1,6 @@
use crate::disk::RUSTFS_META_BUCKET;
use common::error::{Error, Result};
use crate::error::{Error, Result};
use s3s::xml;
pub fn is_meta_bucketname(name: &str) -> bool {
name.starts_with(RUSTFS_META_BUCKET)
@@ -13,60 +14,88 @@ lazy_static::lazy_static! {
static ref IP_ADDRESS: Regex = Regex::new(r"^(\d+\.){3}\d+$").unwrap();
}
pub fn check_bucket_name_common(bucket_name: &str, strict: bool) -> Result<(), Error> {
pub fn check_bucket_name_common(bucket_name: &str, strict: bool) -> Result<()> {
let bucket_name_trimmed = bucket_name.trim();
if bucket_name_trimmed.is_empty() {
return Err(Error::msg("Bucket name cannot be empty"));
return Err(Error::other("Bucket name cannot be empty"));
}
if bucket_name_trimmed.len() < 3 {
return Err(Error::msg("Bucket name cannot be shorter than 3 characters"));
return Err(Error::other("Bucket name cannot be shorter than 3 characters"));
}
if bucket_name_trimmed.len() > 63 {
return Err(Error::msg("Bucket name cannot be longer than 63 characters"));
return Err(Error::other("Bucket name cannot be longer than 63 characters"));
}
if bucket_name_trimmed == "rustfs" {
return Err(Error::msg("Bucket name cannot be rustfs"));
return Err(Error::other("Bucket name cannot be rustfs"));
}
if IP_ADDRESS.is_match(bucket_name_trimmed) {
return Err(Error::msg("Bucket name cannot be an IP address"));
return Err(Error::other("Bucket name cannot be an IP address"));
}
if bucket_name_trimmed.contains("..") || bucket_name_trimmed.contains(".-") || bucket_name_trimmed.contains("-.") {
return Err(Error::msg("Bucket name contains invalid characters"));
return Err(Error::other("Bucket name contains invalid characters"));
}
if strict {
if !VALID_BUCKET_NAME_STRICT.is_match(bucket_name_trimmed) {
return Err(Error::msg("Bucket name contains invalid characters"));
return Err(Error::other("Bucket name contains invalid characters"));
}
} else if !VALID_BUCKET_NAME.is_match(bucket_name_trimmed) {
return Err(Error::msg("Bucket name contains invalid characters"));
return Err(Error::other("Bucket name contains invalid characters"));
}
Ok(())
}
pub fn check_valid_bucket_name(bucket_name: &str) -> Result<(), Error> {
pub fn check_valid_bucket_name(bucket_name: &str) -> Result<()> {
check_bucket_name_common(bucket_name, false)
}
pub fn check_valid_bucket_name_strict(bucket_name: &str) -> Result<(), Error> {
pub fn check_valid_bucket_name_strict(bucket_name: &str) -> Result<()> {
check_bucket_name_common(bucket_name, true)
}
pub fn check_valid_object_name_prefix(object_name: &str) -> Result<(), Error> {
pub fn check_valid_object_name_prefix(object_name: &str) -> Result<()> {
if object_name.len() > 1024 {
return Err(Error::msg("Object name cannot be longer than 1024 characters"));
return Err(Error::other("Object name cannot be longer than 1024 characters"));
}
if !object_name.is_ascii() {
return Err(Error::msg("Object name with non-UTF-8 strings are not supported"));
return Err(Error::other("Object name with non-UTF-8 strings are not supported"));
}
Ok(())
}
pub fn check_valid_object_name(object_name: &str) -> Result<(), Error> {
pub fn check_valid_object_name(object_name: &str) -> Result<()> {
if object_name.trim().is_empty() {
return Err(Error::msg("Object name cannot be empty"));
return Err(Error::other("Object name cannot be empty"));
}
check_valid_object_name_prefix(object_name)
}
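A short, hedged illustration of what the checks above accept and reject; the exact character rules come from the VALID_BUCKET_NAME regexes defined near the top of this file:
// Illustrative only; acceptance of valid names depends on those regexes.
fn name_rules_demo() {
    assert!(check_valid_bucket_name("ab").is_err());          // shorter than 3 chars
    assert!(check_valid_bucket_name("rustfs").is_err());      // reserved name
    assert!(check_valid_bucket_name("192.168.0.1").is_err()); // IP-shaped name
    assert!(check_valid_bucket_name("my..bucket").is_err());  // ".." sequence
    assert!(check_valid_bucket_name("my-bucket").is_ok());    // ordinary lowercase name
    assert!(check_valid_object_name("").is_err());            // empty object name
    assert!(check_valid_object_name("photos/2025/cat.png").is_ok());
}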
pub fn deserialize<T>(input: &[u8]) -> xml::DeResult<T>
where
T: for<'xml> xml::Deserialize<'xml>,
{
let mut d = xml::Deserializer::new(input);
let ans = T::deserialize(&mut d)?;
d.expect_eof()?;
Ok(ans)
}
pub fn serialize_content<T: xml::SerializeContent>(val: &T) -> xml::SerResult<String> {
let mut buf = Vec::with_capacity(256);
{
let mut ser = xml::Serializer::new(&mut buf);
val.serialize_content(&mut ser)?;
}
Ok(String::from_utf8(buf).unwrap())
}
pub fn serialize<T: xml::Serialize>(val: &T) -> xml::SerResult<Vec<u8>> {
let mut buf = Vec::with_capacity(256);
{
let mut ser = xml::Serializer::new(&mut buf);
val.serialize(&mut ser)?;
}
Ok(buf)
}
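These helpers pair with the s3s dto types used throughout the bucket metadata code; a hedged round-trip sketch, assuming VersioningConfiguration implements the s3s xml Serialize/Deserialize traits (as its use elsewhere in this module suggests) and derives Default:
use s3s::dto::VersioningConfiguration;
fn xml_roundtrip() {
    let cfg = VersioningConfiguration::default();
    let bytes = serialize(&cfg).expect("serialize to XML");
    // deserialize() consumes the whole input (expect_eof), so trailing
    // garbage is rejected rather than silently ignored.
    let parsed: VersioningConfiguration = deserialize(&bytes).expect("parse back");
    let _ = parsed;
}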

View File

@@ -1,6 +1,6 @@
use s3s::dto::{BucketVersioningStatus, VersioningConfiguration};
use crate::utils::wildcard;
use rustfs_utils::string::match_simple;
pub trait VersioningApi {
fn enabled(&self) -> bool;
@@ -33,7 +33,7 @@ impl VersioningApi for VersioningConfiguration {
for p in excluded_prefixes.iter() {
if let Some(ref sprefix) = p.prefix {
let pattern = format!("{}*", sprefix);
if wildcard::match_simple(&pattern, prefix) {
if match_simple(&pattern, prefix) {
return false;
}
}
@@ -63,7 +63,7 @@ impl VersioningApi for VersioningConfiguration {
for p in excluded_prefixes.iter() {
if let Some(ref sprefix) = p.prefix {
let pattern = format!("{}*", sprefix);
if wildcard::match_simple(&pattern, prefix) {
if match_simple(&pattern, prefix) {
return true;
}
}
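The excluded-prefix check builds a simple wildcard pattern from each configured prefix; a hedged illustration of the intended matching, assuming match_simple treats '*' as matching any suffix:
use rustfs_utils::string::match_simple;
fn is_excluded(prefix_rule: &str, object_prefix: &str) -> bool {
    let pattern = format!("{}*", prefix_rule); // e.g. "tmp/" -> "tmp/*"
    match_simple(&pattern, object_prefix)
}
// is_excluded("tmp/", "tmp/cache/obj.bin") == true
// is_excluded("tmp/", "logs/2025/obj.bin") == false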

Some files were not shown because too many files have changed in this diff.