Skip to content

global sort: reduce the memory usage of merge sort concurrent reader#62921

Merged
ti-chi-bot[bot] merged 8 commits intopingcap:masterfrom
fzzf678:fix_62853_merge_sort_oom
Aug 21, 2025
Merged

global sort: reduce the memory usage of merge sort concurrent reader#62921
ti-chi-bot[bot] merged 8 commits intopingcap:masterfrom
fzzf678:fix_62853_merge_sort_oom

Conversation

@fzzf678
Copy link
Copy Markdown
Contributor

@fzzf678 fzzf678 commented Aug 11, 2025

What problem does this PR solve?

Issue Number: close #62853

Problem Summary:

What changed and how does it work?

Reduce the concurrency of concurrent readers, pass the outerConcurrency correctly.

In the previous PR, we increased the concurrency to get better performance and if there are multiple readers concurrently read, the outerConcurrency will be used to control the whole concurrency. https://github.com/pingcap/tidb/pull/48871/files#diff-1ea5d87a202f7a3a089796fa8ac98c520879a2bdfdc739ed98a2261a9a7e8ad1R377-R381

But the outerConcurrency was forgotten to be passed(see https://github.com/pingcap/tidb/pull/48871/files#diff-b573df9af3de0ee789f4ec5114d897cc2f9b5ac57f7cc19757e2353127680edfR91), the reader concurrency is always 256 no matter how many reader we use, this leads to the OOM.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • For the OOM problem, after we pass the outerConcurrency correctly in this PR, the add index can be success without OOM. The profile as shown below, we can see the memory occupied by switchToConcurrentReader is 2GiB, this is expected because there are 8 readers and each has 32 concurrency with 256MiB memory, totally 8*256MiB = 2GiB.
截屏2025-08-19 20 01 06
  • About the merge sort performance, we test the merge sort time using tidb_ddl_reorg_worker_cnt = 4. Before this PR, each subtask tasks about 44-46 minutes. After this PR, each subtask tasks about 33-34 minutes to finish merge sort.
    We detect the reason is, previously the concurrency was too high, which slowed down the pull of some goroutines, causing tail requests to drag down the overall situation. After reducing the concurrency, resource contention and congestion were alleviated, slow requests were reduced, and the overall tail delay converged. Resource contention is reduced or congestion handling capabilities are improved, making data transmission more stable and efficient. This PR is mainly to fix the OOM problem, maybe we can continue investigating performance improvements later.

img_v3_02pa_61852241-d8c9-4dee-8434-e63cf6f23c5g

img_v3_02pa_d01f59fd-6a55-4298-be6a-4626a30eca8g

img_v3_02pb_b3dc4a5b-f78d-4a81-9b8e-a4b9d02cf14g img_v3_02pb_e2668838-dc01-4d55-b315-7ca579947efg

Before(repeat 3 times):

+---------------------+---------------------+----------+------+-----------+---------+
| start_time          | end_time            | duration | step | exec_id   | state   |
+---------------------+---------------------+----------+------+-----------+---------+
| 2025-08-12 05:28:30 | 2025-08-12 06:14:54 | 00:46:24 |    2 | tc-tidb-0 | succeed |
| 2025-08-12 05:28:30 | 2025-08-12 06:14:25 | 00:45:55 |    2 | tc-tidb-2 | succeed |
| 2025-08-12 05:28:30 | 2025-08-12 06:13:26 | 00:44:56 |    2 | tc-tidb-1 | succeed |
+---------------------+---------------------+----------+------+-----------+---------+

+---------------------+---------------------+----------+------+-----------+---------+
| start_time          | end_time            | duration | step | exec_id   | state   |
+---------------------+---------------------+----------+------+-----------+---------+
| 2025-08-12 08:41:37 | 2025-08-12 09:27:12 | 00:45:35 |    2 | tc-tidb-0 | succeed |
| 2025-08-12 08:41:37 | 2025-08-12 09:27:03 | 00:45:26 |    2 | tc-tidb-1 | succeed |
| 2025-08-12 08:41:37 | 2025-08-12 09:26:24 | 00:44:47 |    2 | tc-tidb-2 | succeed |
+---------------------+---------------------+----------+------+-----------+---------+

+---------------------+---------------------+----------+------+-----------+---------+
| start_time          | end_time            | duration | step | exec_id   | state   |
+---------------------+---------------------+----------+------+-----------+---------+
| 2025-08-12 10:48:09 | 2025-08-12 11:33:27 | 00:45:18 |    2 | tc-tidb-1 | succeed |
| 2025-08-12 10:48:09 | 2025-08-12 11:33:24 | 00:45:15 |    2 | tc-tidb-2 | succeed |
| 2025-08-12 10:48:10 | 2025-08-12 11:33:21 | 00:45:11 |    2 | tc-tidb-0 | succeed |
+---------------------+---------------------+----------+------+-----------+---------+

After(repeat 3 times):

+---------------------+---------------------+----------+------+-------------------------------------------------------------------------+---------+
| start_time          | end_time            | duration | step | exec_id                                                                 | state   |
+---------------------+---------------------+----------+------+-------------------------------------------------------------------------+---------+
| 2025-08-19 01:40:47 | 2025-08-19 02:15:01 | 00:34:14 |    2 | tc-tidb-1 | succeed |
| 2025-08-19 01:40:47 | 2025-08-19 02:14:25 | 00:33:38 |    2 | tc-tidb-0 | succeed |
| 2025-08-19 01:40:47 | 2025-08-19 02:14:10 | 00:33:23 |    2 | tc-tidb-2 | succeed |
+---------------------+---------------------+----------+------+-------------------------------------------------------------------------+---------+

+---------------------+---------------------+----------+------+-------------------------------------------------------------------------+---------+
| start_time          | end_time            | duration | step | exec_id                                                                 | state   |
+---------------------+---------------------+----------+------+-------------------------------------------------------------------------+---------+
| 2025-08-19 03:47:08 | 2025-08-19 04:21:05 | 00:33:57 |    2 | tc-tidb-0 | succeed |
| 2025-08-19 03:47:08 | 2025-08-19 04:20:58 | 00:33:50 |    2 | tc-tidb-1 | succeed |
| 2025-08-19 03:47:08 | 2025-08-19 04:20:42 | 00:33:34 |    2 | tc-tidb-2 | succeed |
+---------------------+---------------------+----------+------+-------------------------------------------------------------------------+---------+

+---------------------+---------------------+----------+------+-------------------------------------------------------------------------+---------+
| start_time          | end_time            | duration | step | exec_id                                                                 | state   |
+---------------------+---------------------+----------+------+-------------------------------------------------------------------------+---------+
| 2025-08-19 12:42:42 | 2025-08-19 13:17:04 | 00:34:22 |    2 | tc-tidb-2 | succeed |
| 2025-08-19 12:42:42 | 2025-08-19 13:16:44 | 00:34:02 |    2 | tc-tidb-0 | succeed |
| 2025-08-19 12:42:42 | 2025-08-19 13:16:41 | 00:33:59 |    2 | tc-tidb-1 | succeed |
+---------------------+---------------------+----------+------+-------------------------------------------------------------------------+---------+
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot bot commented Aug 11, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue do-not-merge/needs-tests-checked do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 11, 2025
@tiprow
Copy link
Copy Markdown

tiprow bot commented Aug 11, 2025

Hi @fzzf678. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot bot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. do-not-merge/needs-triage-completed and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. do-not-merge/needs-linked-issue do-not-merge/needs-tests-checked labels Aug 13, 2025
@fzzf678 fzzf678 changed the title global sort: limit the memory usage of merge sort MergeKVIter global sort: reduce the memory usage of merge sort MergeKVIter Aug 13, 2025
@fzzf678 fzzf678 changed the title global sort: reduce the memory usage of merge sort MergeKVIter global sort: reduce the memory usage of merge sort concurrent reader Aug 14, 2025
@fzzf678 fzzf678 marked this pull request as ready for review August 14, 2025 02:27
@ti-chi-bot ti-chi-bot bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 14, 2025
@fzzf678 fzzf678 assigned tangenta and unassigned tangenta Aug 14, 2025
@fzzf678 fzzf678 requested a review from tangenta August 14, 2025 02:28
@codecov
Copy link
Copy Markdown

codecov bot commented Aug 14, 2025

Codecov Report

❌ Patch coverage is 66.66667% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 73.4498%. Comparing base (6361402) to head (aee5212).
⚠️ Report is 472 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #62921        +/-   ##
================================================
+ Coverage   72.8138%   73.4498%   +0.6359%     
================================================
  Files          1823       1853        +30     
  Lines        494575     502708      +8133     
================================================
+ Hits         360119     369238      +9119     
+ Misses       112573     110903      -1670     
- Partials      21883      22567       +684     
Flag Coverage Δ
integration 44.7680% <0.0000%> (?)
unit 72.2793% <66.6666%> (+0.0052%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.8700% <ø> (ø)
parser ∅ <ø> (∅)
br 46.3505% <ø> (+0.0178%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@fzzf678
Copy link
Copy Markdown
Contributor Author

fzzf678 commented Aug 14, 2025

memSizePerCon := res.Mem.Capacity() / res.CPU.Capacity()
Should we consider using this memSizePerConto control the concurrency of concurrent reader?
The memSizePerCon can't control, need another value to control the total memory usage of concurrent reader

@ti-chi-bot ti-chi-bot bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. do-not-merge/needs-triage-completed labels Aug 19, 2025
@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Aug 20, 2025
Copy link
Copy Markdown
Contributor

@D3Hunter D3Hunter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

before this PR, each subtask tasks about 44-46 minutes. After this PR, each subtask tasks about 33-34 minutes to finish merge sort.

should before this PR OOM? where the 44-46 minutes came from?

i think this PR only avoid OOM, why the performance is getting better?

@fzzf678
Copy link
Copy Markdown
Contributor Author

fzzf678 commented Aug 20, 2025

before this PR, each subtask tasks about 44-46 minutes. After this PR, each subtask tasks about 33-34 minutes to finish merge sort.

should before this PR OOM? where the 44-46 minutes came from?

i think this PR only avoid OOM, why the performance is getting better?

The OOM occurs when tidb_ddl_reorg_worker_cnt = 8, the 44-46 minutes is tested by using the tidb_ddl_reorg_worker_cnt = 4. Supply this to the description.

@ti-chi-bot ti-chi-bot bot added approved lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Aug 21, 2025
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot bot commented Aug 21, 2025

[LGTM Timeline notifier]

Timeline:

  • 2025-08-20 03:44:24.556505674 +0000 UTC m=+411472.499681190: ☑️ agreed by wjhuang2016.
  • 2025-08-21 02:50:50.857473549 +0000 UTC m=+494658.800649066: ☑️ agreed by D3Hunter.

@hawkingrei
Copy link
Copy Markdown
Member

/retest

@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot bot commented Aug 21, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: D3Hunter, tangenta, wjhuang2016

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot merged commit fdca815 into pingcap:master Aug 21, 2025
25 checks passed
@ti-chi-bot ti-chi-bot bot added the needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. label Nov 28, 2025
ti-chi-bot pushed a commit to ti-chi-bot/tidb that referenced this pull request Nov 28, 2025
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot
Copy link
Copy Markdown
Member

In response to a cherrypick label: new pull request created to branch release-8.5: #64744.
But this PR has conflicts, please resolve them!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved lgtm needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TiDB OOM at merge sort step when using distributed framework to add index

6 participants