* Initial plan
* Implement priority-based heal queue with deduplication
Co-authored-by: houseme <4829346+houseme@users.noreply.github.com>
* Apply cargo fmt formatting fixes
Co-authored-by: houseme <4829346+houseme@users.noreply.github.com>
* Add queue monitoring, better error handling, and adaptive processing
- Add priority-based queue statistics tracking
- Implement queue capacity warnings (>80% full)
- Process multiple tasks per cycle when capacity allows
- Add proper error logging for failed heal request submissions
- Add Hash trait to HealPriority for HashMap support
- Improve observability with detailed queue status logs
Co-authored-by: houseme <4829346+houseme@users.noreply.github.com>
* Add comprehensive diagnostic logging to heal flow
- Add detailed logging for shard availability checks during reads
- Log successful heal request submissions
- Add detailed logging in heal_object for disk checking
- Log each disk's heal decision with reason and endpoint
- Add logging for heal writing and renaming phases
- Improve variable naming (nil_count -> available_shards)
- Add success confirmation messages for renamed healed data
These logs will help diagnose where the heal flow is failing.
Co-authored-by: houseme <4829346+houseme@users.noreply.github.com>
* fix
* feat(heal): enhance robustness, safety, and observability of heal process
- **Logging & Observability**:
- Upgrade critical heal logs from [info](/crates/e2e_test/src/reliant/node_interact_test.rs:196:0-213:1) to `warn` for better visibility.
- Implement structured logging with `tracing` fields for machine readability.
- Add `#[tracing::instrument]` to [HealTask](c/crates/ahm/src/heal/task.rs:182:0-205:1) and [SetDisks](/crates/ecstore/src/set_disk.rs:120:0-131:1) methods for automatic context propagation.
- **Robustness**:
- Add exponential backoff retry (3 attempts) for acquiring write locks in [heal_object](/crates/ahm/src/heal/storage.rs:438:4-460:5) to handle contention.
- Handle [rename_data](/crates/ecstore/src/set_disk.rs:392:4-516:5) failures gracefully by preserving temporary files instead of forcing deletion, preventing potential data loss.
- **Data Safety**:
- Fix [object_exists](/crates/ahm/src/heal/storage.rs:395:4-412:5) to propagate IO errors instead of treating them as "object not found".
- Update [ErasureSetHealer](/crates/ahm/src/heal/erasure_healer.rs:28:0-33:1) to mark objects as failed rather than skipped when existence checks error, ensuring they are tracked for retry.
* fix
* fmt
* improve code for heal_object
* fix
* fix
* fix
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: houseme <4829346+houseme@users.noreply.github.com>
Co-authored-by: houseme <housemecn@gmail.com>
* fix
* chore: upgrade cryptography libraries to RC versions
- Upgrade aes-gcm to 0.11.0-rc.2 with rand_core support
- Upgrade chacha20poly1305 to 0.11.0-rc.2
- Upgrade argon2 to 0.6.0-rc.2 with std features
- Upgrade hmac to 0.13.0-rc.3
- Upgrade pbkdf2 to 0.13.0-rc.2
- Upgrade rsa to 0.10.0-rc.10
- Upgrade sha1 and sha2 to 0.11.0-rc.3
- Upgrade md-5 to 0.11.0-rc.3
These upgrades provide enhanced security features and performance
improvements while maintaining backward compatibility with existing
encryption workflows.
* add
* improve code
* fix
* improve code for metrics
* improve code for metrics
* fix
* fix
* Refactor telemetry initialization and environment functions ordering
- Reorder functions in envs.rs by type size (8-bit to 64-bit, signed before unsigned) and add missing variants like get_env_opt_u16.
- Optimize init_telemetry to support three modes: stdout logging (default error level with span tracing), file rolling logs (size-based with retention), and HTTP-based observability with sub-endpoints (trace, metric, log) falling back to unified endpoint.
- Fix stdout logging issue by retaining WorkerGuard in OtelGuard to prevent premature release of async writer threads.
- Enhance observability mode with HTTP protocol, compression, and proper resource management.
- Update OtelGuard to include tracing_guard for stdout and flexi_logger_handles for file logging.
- Improve error handling and configuration extraction in OtelConfig.
* fix
* up
* fix
* fix
* improve code for obs
* fix
* fix
* improve code for dns log
* fix
* Improve comments, remove unused parameters in config.rs (opt), add observability enable flag, and enhance error logging in run function execution.
- Normalize ETags by removing quotes before comparison in complete_multipart_upload
- Fix ETag comparison in replication logic to handle quoted ETags from API responses
- Fix ETag comparison in transition object logic
- Add unit tests for trim_etag function
This fixes the ETag mismatch error when uploading large files (5GB+) via multipart upload,
which was caused by PR #592 adding quotes to ETag responses while internal storage remains unquoted.
Fixes#625
* feat(append): implement object append operations with state tracking
Signed-off-by: junxiang Mu <1948535941@qq.com>
* chore: rebase
Signed-off-by: junxiang Mu <1948535941@qq.com>
---------
Signed-off-by: junxiang Mu <1948535941@qq.com>
- Add lock timeout support and track acquisition time in lock state
- Improve lock conflict handling with detailed error messages
- Optimize lock reuse when already held by same owner
- Refactor lock state to store owner info and timeout duration
- Update all lock operations to handle new state structure
Signed-off-by: junxiang Mu <1948535941@qq.com>
* fix: remove code
* improve code for tokio runtime config
* improve code for main
* fix: add tokio enable_all
* upgrade version
* improve for Cargo.toml
* refactor: replace print statements with proper logging and fix grammar
- Fix English grammar errors in existing log messages
- Add tracing imports where needed
- Improve log message clarity and consistency
- Follow project logging best practices using tracing crate
* fix: resolve clippy warnings and format code
- Fix unused import warnings by making test imports conditional with #[cfg(test)]
- Fix unused variable warning by prefixing with underscore
- Run cargo fmt to fix formatting issues
- Ensure all code passes clippy checks with -D warnings flag
* refactor: move tracing::debug import into test module
Move the tracing::debug import from file-level #[cfg(test)] into the test module itself for better code organization and consistency with other test modules
* Checkpoint before follow-up message
Co-authored-by: anzhengchao <anzhengchao@gmail.com>
* refactor: move tracing::debug import into test module in user_agent.rs
Complete the refactoring by moving the tracing::debug import from file-level #[cfg(test)] into the test module for consistency across all test files
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>
* Checkpoint before follow-up message
Co-authored-by: anzhengchao <anzhengchao@gmail.com>
* Translate project documentation and comments from Chinese to English
Co-authored-by: anzhengchao <anzhengchao@gmail.com>
* Fix typo: "unparseable" to "unparsable" in version test comment
Co-authored-by: anzhengchao <anzhengchao@gmail.com>
* Refactor compression test code with minor syntax improvements
Co-authored-by: anzhengchao <anzhengchao@gmail.com>
---------
Co-authored-by: Cursor Agent <cursoragent@cursor.com>