[#1608][part-8] feat(spark3): add a limit to the number of retries when block access is denied #1715

Closed
dingshun3016 wants to merge 2 commits

dingshun3016
Contributor

What changes were proposed in this pull request?

In #1617:
If servers in the cluster are frequently unhealthy (for example, because high load, high network card traffic, or disk utilization exceeds a threshold), reassignment can easily hit the blockFailSentRetryMaxTimes limit. Therefore, this PR adds a separate limit on the number of retries when block access is denied.
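
For illustration only, the idea of a dedicated cap on access-denied retries could be sketched as below. This is not the code from this PR: the class name `DeniedRetryLimiter`, its methods, and the per-block counter are all hypothetical and only assume that each block is identified by a `long` id.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from Uniffle.
// Tracks how many times each block was rejected with "access denied"
// and stops allowing resends once a configured cap is exceeded.
class DeniedRetryLimiter {
    private final int maxDeniedRetries;
    private final Map<Long, Integer> deniedRetries = new HashMap<>();

    DeniedRetryLimiter(int maxDeniedRetries) {
        if (maxDeniedRetries < 0) {
            throw new IllegalArgumentException("maxDeniedRetries must be >= 0");
        }
        this.maxDeniedRetries = maxDeniedRetries;
    }

    // Records one access-denied failure for the block.
    // Returns true while the block may still be resent, false once the cap is exceeded.
    boolean recordDeniedAndCheck(long blockId) {
        int count = deniedRetries.merge(blockId, 1, Integer::sum);
        return count <= maxDeniedRetries;
    }

    // Number of access-denied failures recorded so far for the block.
    int deniedCount(long blockId) {
        return deniedRetries.getOrDefault(blockId, 0);
    }
}
```

With a cap of 2, the first two denials for a block still allow a resend and the third does not. Counters are kept per block, so one frequently rejected block does not use up the budget of the others.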

Why are the changes needed?

Add a limit to the number of retries when block access is denied.

Fix: # (issue)

Does this PR introduce any user-facing change?

No.

How was this patch tested?

@codecov-commenter

codecov-commenter commented May 16, 2024

Codecov Report

Attention: Patch coverage is 54.54545%, with 5 lines in your changes missing coverage. Please review.

Project coverage is 54.05%. Comparing base (6f6d35a) to head (618fe20).
Report is 21 commits behind head on master.

| Files | Patch % | Lines |
|---|---|---|
| ...va/org/apache/uniffle/common/ShuffleBlockInfo.java | 25.00% | 3 Missing ⚠️ |
| ...ffle/client/impl/grpc/ShuffleServerGrpcClient.java | 0.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master    #1715      +/-   ##
============================================
- Coverage     54.86%   54.05%   -0.81%     
- Complexity     2358     2763     +405     
============================================
  Files           368      414      +46     
  Lines         16379    21774    +5395     
  Branches       1504     2054     +550     
============================================
+ Hits           8986    11770    +2784     
- Misses         6862     9260    +2398     
- Partials        531      744     +213     


Test Results

 2 391 files  ±0   2 391 suites  ±0   4h 57m 58s ⏱️ +25s
   929 tests ±0     928 ✅ ±0   1 💤 ±0  0 ❌ ±0 
10 763 runs  ±0  10 749 ✅ ±0  14 💤 ±0  0 ❌ ±0 

Results for commit 618fe20. ± Comparison against base commit de4b261.

@zuston
Member

zuston commented May 16, 2024

This looks strange: it partially replaces what blockMaxRetryTimes does and introduces an extra config option to meet some need, but the change is not compatible with the previous logic.

@dingshun3016
Contributor Author

What do you recommend? Should I continue to reuse blockMaxRetryTimes?

@zuston
Member

zuston commented May 22, 2024

After rethinking, I don't think we need this. If you want to increase the retry count, you just need to increase the block retry max times.

@dingshun3016
Contributor Author

OK, I will close it.
