[CIP-19][CELEBORN-1937] QuotaManager should support app threshold#3651

Open
leixm wants to merge 2 commits into apache:main from leixm:CELEBORN-1937

leixm commented Apr 7, 2026

What changes were proposed in this pull request?

Reopened from #3336. Adds app threshold support to QuotaManager.

Why are the changes needed?

For CIP-19: when cleaning up, prioritize abnormally large shuffle apps first, then sort the rest by app priority and consumption.
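
As a rough illustration of the ordering described above, here is a minimal Java sketch. It is not the code in this PR: `AppUsage`, `ExpirationOrder`, and `appThresholdBytes` are hypothetical names, and it assumes that a lower `priority` value means a less important app and that, within the same priority, heavier consumers are considered first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical per-app snapshot; not Celeborn's actual data model.
final class AppUsage {
    final String appId;
    final int priority;      // assumption: lower value = less important app
    final long bytesWritten; // shuffle bytes consumed by the app

    AppUsage(String appId, int priority, long bytesWritten) {
        this.appId = appId;
        this.priority = priority;
        this.bytesWritten = bytesWritten;
    }
}

final class ExpirationOrder {
    // Returns a copy of the apps in the order they would be considered for cleanup:
    // 1) apps above the app threshold, 2) lower priority, 3) higher consumption.
    static List<AppUsage> order(List<AppUsage> apps, long appThresholdBytes) {
        Comparator<AppUsage> cmp = Comparator
            // Boolean false sorts before true, so over-threshold apps come first.
            .comparing((AppUsage a) -> a.bytesWritten <= appThresholdBytes)
            .thenComparingInt(a -> a.priority)
            .thenComparing(Comparator.comparingLong((AppUsage a) -> a.bytesWritten).reversed());
        List<AppUsage> sorted = new ArrayList<>(apps);
        sorted.sort(cmp);
        return sorted;
    }
}
```

Putting the threshold check first in the comparator keeps the "abnormally large apps first" rule independent of how priority and consumption are later weighed.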

Does this PR resolve a correctness bug?

No.

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

UTs.

leixm commented Apr 8, 2026

@AngersZhuuuu @RexXiong Can you help review?

RexXiong commented Apr 9, 2026

Overall LGTM, but I also have a couple of questions regarding the expiration strategy when multiple apps exceed the app-level quota:

  1. Priority-based expiration: When a user exceeds their overall quota and multiple apps also exceed the app-level threshold, could we prioritize which apps to expire based on app priority (or consumption)? This would allow more important apps to continue running while less critical ones are throttled first.

  2. Incremental expiration: After expiring a single app, if the remaining apps no longer exceed the quota, is it possible to avoid expiring them? This would minimize the impact to the user's workload by only expiring the minimum number of apps necessary to bring usage back under the quota limit.


by claude

leixm commented Apr 9, 2026

  1. Ordering by app priority and consumption will be added in subsequent PRs.

  2. Incremental expiration was already supported before this PR. This PR aims to introduce a new app threshold based on fairness considerations. We don't want quotas to be consumed by a few apps. When the user/tenant/cluster quota is exceeded, apps exceeding the app threshold will be killed first.
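
To make the interaction between the app threshold and incremental expiration concrete, here is a minimal Java sketch. It is not the QuotaManager implementation: `QuotaSketch`, `selectAppsToExpire`, and the parameter names are hypothetical, and the second pass over apps below the threshold is an assumption about what happens when expiring the large apps is not enough.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class QuotaSketch {
    // Picks apps to expire, stopping as soon as total usage is back within quotaBytes.
    // usageByApp maps appId -> bytes consumed; iteration follows the map's own order.
    static List<String> selectAppsToExpire(Map<String, Long> usageByApp,
                                           long quotaBytes,
                                           long appThresholdBytes) {
        long total = 0;
        for (long bytes : usageByApp.values()) {
            total += bytes;
        }

        List<String> expired = new ArrayList<>();
        // Pass 0: apps above the app threshold. Pass 1 (assumed): the remaining apps.
        for (int pass = 0; pass < 2 && total > quotaBytes; pass++) {
            for (Map.Entry<String, Long> e : usageByApp.entrySet()) {
                if (total <= quotaBytes) {
                    break; // incremental: stop once usage is under the quota
                }
                boolean overThreshold = e.getValue() > appThresholdBytes;
                if ((pass == 0) == overThreshold) {
                    expired.add(e.getKey());
                    total -= e.getValue();
                }
            }
        }
        return expired;
    }
}
```

With usages `a=100, b=600, c=50, d=700`, a threshold of 500, and a quota of 900, only `b` is selected, because expiring it alone brings the total to 850; with a quota of 800, both `b` and `d` are selected.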
