* Initial plan
* Implement priority-based heal queue with deduplication
Co-authored-by: houseme <4829346+houseme@users.noreply.github.com>
* Apply cargo fmt formatting fixes
Co-authored-by: houseme <4829346+houseme@users.noreply.github.com>
* Add queue monitoring, better error handling, and adaptive processing
- Add priority-based queue statistics tracking
- Implement queue capacity warnings (>80% full)
- Process multiple tasks per cycle when capacity allows
- Add proper error logging for failed heal request submissions
- Add Hash trait to HealPriority for HashMap support
- Improve observability with detailed queue status logs
Co-authored-by: houseme <4829346+houseme@users.noreply.github.com>
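The queue behaviour described above (priority ordering, deduplication, per-priority statistics, and a warning threshold above 80% capacity) can be sketched roughly as follows. This is a minimal illustration, not the code from this change: `HealQueue`, `HealRequest`, the `key` field, and the three priority levels are hypothetical names chosen for the example.

```rust
use std::cmp::Ordering;
use std::collections::{BinaryHeap, HashMap, HashSet};

/// Hypothetical priority levels. `Hash` is derived so the type can key a `HashMap`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub enum HealPriority {
    Low,
    Normal,
    High,
}

/// Hypothetical request type; `key` is the deduplication key (e.g. "bucket/object").
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HealRequest {
    pub key: String,
    pub priority: HealPriority,
}

/// Heap entry: higher priority first, then FIFO (lower sequence number first).
struct Entry {
    seq: u64,
    request: HealRequest,
}

impl Ord for Entry {
    fn cmp(&self, other: &Self) -> Ordering {
        self.request
            .priority
            .cmp(&other.request.priority)
            .then_with(|| other.seq.cmp(&self.seq))
    }
}
impl PartialOrd for Entry {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl PartialEq for Entry {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for Entry {}

pub struct HealQueue {
    heap: BinaryHeap<Entry>,
    pending: HashSet<String>,
    capacity: usize,
    next_seq: u64,
}

impl HealQueue {
    pub fn new(capacity: usize) -> Self {
        Self { heap: BinaryHeap::new(), pending: HashSet::new(), capacity, next_seq: 0 }
    }

    /// Ok(true) if queued, Ok(false) if the key is already pending,
    /// Err(request) if the queue is full.
    pub fn push(&mut self, request: HealRequest) -> Result<bool, HealRequest> {
        if self.pending.contains(&request.key) {
            return Ok(false);
        }
        if self.heap.len() >= self.capacity {
            return Err(request);
        }
        self.pending.insert(request.key.clone());
        self.heap.push(Entry { seq: self.next_seq, request });
        self.next_seq += 1;
        Ok(true)
    }

    /// Removes the highest-priority request and frees its key for re-submission.
    pub fn pop(&mut self) -> Option<HealRequest> {
        let entry = self.heap.pop()?;
        self.pending.remove(&entry.request.key);
        Some(entry.request)
    }

    pub fn len(&self) -> usize {
        self.heap.len()
    }

    /// True when the queue is more than 80% full.
    pub fn is_nearly_full(&self) -> bool {
        self.heap.len() * 5 > self.capacity * 4
    }

    /// Number of queued requests per priority level.
    pub fn stats(&self) -> HashMap<HealPriority, usize> {
        let mut counts = HashMap::new();
        for entry in self.heap.iter() {
            *counts.entry(entry.request.priority).or_insert(0) += 1;
        }
        counts
    }
}
```

A caller would typically log a warning when `is_nearly_full()` returns true and log an error when `push` returns `Err`, which corresponds to the capacity warnings and submission error logging listed above.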
* Add comprehensive diagnostic logging to heal flow
- Add detailed logging for shard availability checks during reads
- Log successful heal request submissions
- Add detailed logging in heal_object for disk checking
- Log each disk's heal decision with reason and endpoint
- Add logging for heal writing and renaming phases
- Improve variable naming (nil_count -> available_shards)
- Add success confirmation messages for renamed healed data
These logs will help diagnose where the heal flow is failing.
Co-authored-by: houseme <4829346+houseme@users.noreply.github.com>
* fix
* feat(heal): enhance robustness, safety, and observability of heal process
- **Logging & Observability**:
  - Upgrade critical heal logs from `info` to `warn` for better visibility.
  - Implement structured logging with `tracing` fields for machine readability.
  - Add `#[tracing::instrument]` to `HealTask` and `SetDisks` methods for automatic context propagation.
- **Robustness**:
  - Add exponential backoff retry (3 attempts) for acquiring write locks in `heal_object` to handle contention.
  - Handle `rename_data` failures gracefully by preserving temporary files instead of forcing deletion, preventing potential data loss.
- **Data Safety**:
  - Fix `object_exists` to propagate IO errors instead of treating them as "object not found".
  - Update `ErasureSetHealer` to mark objects as failed rather than skipped when existence checks error, ensuring they are tracked for retry.
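The exponential backoff retry mentioned under Robustness can be illustrated with a small synchronous helper. This is only a sketch under stated assumptions: the helper name `retry_with_backoff`, its signature, and the blocking `sleep` are hypothetical, and the actual heal path is presumably async.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Calls `op` up to `max_attempts` times (at least once), sleeping
/// `base_delay * 2^(attempt - 1)` between failed attempts.
/// `op` receives the 1-based attempt number.
pub fn retry_with_backoff<T, E, F>(max_attempts: u32, base_delay: Duration, mut op: F) -> Result<T, E>
where
    F: FnMut(u32) -> Result<T, E>,
{
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                // Delay doubles each time: 1x, 2x, 4x, ... the base delay.
                sleep(base_delay * 2u32.pow(attempt - 1));
                attempt += 1;
            }
        }
    }
}
```

With `max_attempts = 3`, a lock acquisition that fails twice and then succeeds returns `Ok` on the third call; one that keeps failing returns the last error after the third attempt.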
* fix
* fmt
* improve code for heal_object
* fix
* fix
* fix
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: houseme <4829346+houseme@users.noreply.github.com>
Co-authored-by: houseme <housemecn@gmail.com>
* fix
* chore: upgrade cryptography libraries to RC versions
- Upgrade aes-gcm to 0.11.0-rc.2 with rand_core support
- Upgrade chacha20poly1305 to 0.11.0-rc.2
- Upgrade argon2 to 0.6.0-rc.2 with std features
- Upgrade hmac to 0.13.0-rc.3
- Upgrade pbkdf2 to 0.13.0-rc.2
- Upgrade rsa to 0.10.0-rc.10
- Upgrade sha1 and sha2 to 0.11.0-rc.3
- Upgrade md-5 to 0.11.0-rc.3
These upgrades provide enhanced security features and performance
improvements while maintaining backward compatibility with existing
encryption workflows.
* add
* improve code
* fix
* Add InvalidRangeSpec error
* Add EntityTooSmall to from_u32
* Add InvalidRangeSpec to from_u32
* Map InvalidRangeSpec to correct S3ErrorCode
* Return Error::InvalidRangeSpec
* Use auto implementation
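A rough sketch of the kind of mapping these commits describe: an error variant that round-trips through a `u32` code and maps to an S3 error code string. The enum name `StorageError`, the numeric values, and the method names other than `from_u32` are assumptions made for illustration; only the variant names come from the commit messages. `InvalidRange` and `EntityTooSmall` are the standard S3 error codes for these conditions.

```rust
/// Hypothetical error enum containing only the two variants named above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StorageError {
    EntityTooSmall,
    InvalidRangeSpec,
}

impl StorageError {
    /// Numeric codes here are placeholders, not the project's real values.
    pub fn to_u32(self) -> u32 {
        match self {
            StorageError::EntityTooSmall => 1,
            StorageError::InvalidRangeSpec => 2,
        }
    }

    /// Inverse of `to_u32`; unknown codes yield `None`.
    pub fn from_u32(code: u32) -> Option<Self> {
        match code {
            1 => Some(StorageError::EntityTooSmall),
            2 => Some(StorageError::InvalidRangeSpec),
            _ => None,
        }
    }

    /// S3 error code string returned to clients.
    pub fn s3_error_code(self) -> &'static str {
        match self {
            StorageError::EntityTooSmall => "EntityTooSmall",
            StorageError::InvalidRangeSpec => "InvalidRange",
        }
    }
}
```

Keeping `to_u32` and `from_u32` as exhaustive matches means a newly added variant fails to compile until `to_u32` handles it, which helps avoid the kind of omission these commits fix.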
* Add default storage class to ListObjectsV2
Resolves #764
* Add storage_class to response
* Make storage class optional so default won't be an empty string
---------
Co-authored-by: houseme <housemecn@gmail.com>
* improve code for dns log
* fix
* Improve comments, remove unused parameters in config.rs (opt), add an observability enable flag, and improve error logging when the run function executes.
- Reduce metrics push frequency from default to 3s for better performance
- Optimize resource utilization during metrics collection
- Improve real-time monitoring responsiveness
Related to admin metrics optimization on fix/admin-metrics branch
- Normalize ETags by removing quotes before comparison in complete_multipart_upload
- Fix ETag comparison in replication logic to handle quoted ETags from API responses
- Fix ETag comparison in transition object logic
- Add unit tests for trim_etag function
This fixes the ETag mismatch error when uploading large files (5GB+) via multipart upload,
which was caused by PR #592 adding quotes to ETag responses while internal storage remains unquoted.
Fixes #625
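The ETag normalization described above amounts to stripping surrounding double quotes before comparing. A minimal sketch, assuming `trim_etag` takes and returns a string slice (the real signature is not shown in this log) and using a hypothetical helper `etags_match`:

```rust
/// Removes leading and trailing double quotes from an ETag value.
/// Assumed signature; the function in the codebase may differ.
pub fn trim_etag(etag: &str) -> &str {
    etag.trim_matches('"')
}

/// Hypothetical helper: compares two ETags regardless of quoting.
pub fn etags_match(a: &str, b: &str) -> bool {
    trim_etag(a) == trim_etag(b)
}
```

With this, a quoted ETag received from an API response compares equal to the unquoted value held in internal storage, which is the mismatch the fix addresses.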
* feat(append): implement object append operations with state tracking
Signed-off-by: junxiang Mu <1948535941@qq.com>
* chore: rebase
Signed-off-by: junxiang Mu <1948535941@qq.com>
---------
Signed-off-by: junxiang Mu <1948535941@qq.com>
- Add lock timeout support and track acquisition time in lock state
- Improve lock conflict handling with detailed error messages
- Optimize lock reuse when already held by same owner
- Refactor lock state to store owner info and timeout duration
- Update all lock operations to handle new state structure
Signed-off-by: junxiang Mu <1948535941@qq.com>
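The lock state changes listed above (owner info, acquisition time, timeout, reuse by the same owner, and detailed conflict errors) can be sketched as follows. All type and field names here (`LockTable`, `LockState`, `LockError::Conflict`) are hypothetical, and the current time is passed in explicitly so the behaviour is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical lock state: who holds the lock, since when, and for how long.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LockState {
    pub owner: String,
    pub acquired_at: Instant,
    pub timeout: Duration,
}

/// Conflict error carrying enough detail for a useful message.
#[derive(Debug, PartialEq, Eq)]
pub enum LockError {
    Conflict { resource: String, held_by: String, requested_by: String },
}

#[derive(Default)]
pub struct LockTable {
    locks: HashMap<String, LockState>,
}

impl LockTable {
    pub fn acquire(&mut self, resource: &str, owner: &str, timeout: Duration, now: Instant) -> Result<(), LockError> {
        if let Some(state) = self.locks.get_mut(resource) {
            if state.owner == owner {
                // Same owner: reuse the lock and refresh its timing.
                state.acquired_at = now;
                state.timeout = timeout;
                return Ok(());
            }
            let expired = now.saturating_duration_since(state.acquired_at) >= state.timeout;
            if !expired {
                return Err(LockError::Conflict {
                    resource: resource.to_string(),
                    held_by: state.owner.clone(),
                    requested_by: owner.to_string(),
                });
            }
        }
        // Free, or held by another owner whose timeout has elapsed: take it over.
        self.locks.insert(
            resource.to_string(),
            LockState { owner: owner.to_string(), acquired_at: now, timeout },
        );
        Ok(())
    }

    /// Releases the lock only if `owner` currently holds it.
    pub fn release(&mut self, resource: &str, owner: &str) -> bool {
        let held_by_owner = self.locks.get(resource).map_or(false, |s| s.owner == owner);
        if held_by_owner {
            self.locks.remove(resource);
        }
        held_by_owner
    }
}
```

In this sketch an expired lock is taken over by the next requester; whether the real implementation does that or requires an explicit release is not stated in this log.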
* fix: fix datausageinfo
Signed-off-by: junxiang Mu <1948535941@qq.com>
* feat(data-usage): implement local disk snapshot aggregation for data usage statistics
Signed-off-by: junxiang Mu <1948535941@qq.com>
* feat(scanner): improve data usage collection with local scan aggregation
Signed-off-by: junxiang Mu <1948535941@qq.com>
* refactor: improve object existence check and code style
Signed-off-by: junxiang Mu <1948535941@qq.com>
---------
Signed-off-by: junxiang Mu <1948535941@qq.com>
* feat(kms): implement key management service with local and vault backends
Signed-off-by: junxiang Mu <1948535941@qq.com>
* feat(kms): enhance security with zeroize for sensitive data and improve key management
Signed-off-by: junxiang Mu <1948535941@qq.com>
* remove Hashi word
Signed-off-by: junxiang Mu <1948535941@qq.com>
* refactor: remove unused request structs from kms handlers
Signed-off-by: junxiang Mu <1948535941@qq.com>
---------
Signed-off-by: junxiang Mu <1948535941@qq.com>
* Refactor: reimplement lock
Signed-off-by: junxiang Mu <1948535941@qq.com>
* Fix: fix failing test case
Signed-off-by: junxiang Mu <1948535941@qq.com>
* Improve: lock perf
Signed-off-by: junxiang Mu <1948535941@qq.com>
* fix(lock): Fix resource cleanup issue when batch lock acquisition fails
- Ensure that locks already acquired are properly released when batch lock acquisition fails, to avoid memory leaks
- Improve the lock protection mechanism to prevent double-release issues
- Add complete Apache license declarations to all files
Signed-off-by: junxiang Mu <1948535941@qq.com>
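The batch cleanup fix can be pictured as an all-or-nothing acquisition: if any lock in the batch cannot be taken, the locks acquired so far are released before returning. The sketch below is illustrative only; `LockSet` and its methods are hypothetical names, and a real lock manager would track owners and be shared across threads.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory lock set keyed by resource name.
#[derive(Default)]
pub struct LockSet {
    held: HashSet<String>,
}

impl LockSet {
    /// Returns true if the lock was free and is now held.
    pub fn try_lock(&mut self, resource: &str) -> bool {
        self.held.insert(resource.to_string())
    }

    /// Returns false if the resource was not locked, so a double release is a no-op.
    pub fn unlock(&mut self, resource: &str) -> bool {
        self.held.remove(resource)
    }

    pub fn is_locked(&self, resource: &str) -> bool {
        self.held.contains(resource)
    }

    /// Acquires every lock in `resources` or none of them.
    /// On failure, returns the name of the resource that could not be locked.
    pub fn lock_batch(&mut self, resources: &[&str]) -> Result<(), String> {
        let mut acquired: Vec<&str> = Vec::new();
        for &resource in resources {
            if self.try_lock(resource) {
                acquired.push(resource);
            } else {
                // Roll back in reverse order so nothing acquired here stays held.
                for done in acquired.into_iter().rev() {
                    self.unlock(done);
                }
                return Err(resource.to_string());
            }
        }
        Ok(())
    }
}
```

Only locks taken by the failing call are rolled back; a lock that was already held before the batch started is left untouched.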
---------
Signed-off-by: junxiang Mu <1948535941@qq.com>