...
While updating tests to run on Windows Server 2022 for MongoDB 8.0 platform support, several issues were discovered in the noPassthrough suite:

https://spruce.mongodb.com/task/mongodb_mongo_master_enterprise_windows_all_feature_flags_required_noPassthrough_1_windows_enterprise_patch_d60231163ae986719f5b012c47fb065331fabdab_6669f1b564e1ae0007c8514b_24_06_12_19_07_21?execution=2&sortBy=STATUS&sortDir=ASC
https://spruce.mongodb.com/task/mongodb_mongo_master_enterprise_windows_all_feature_flags_required_noPassthrough_1_windows_enterprise_patch_d60231163ae986719f5b012c47fb065331fabdab_6669f1b564e1ae0007c8514b_24_06_12_19_07_21/tests?execution=1&sortBy=STATUS&sortDir=ASC
https://spruce.mongodb.com/task/mongodb_mongo_master_enterprise_windows_all_feature_flags_required_noPassthrough_1_windows_enterprise_patch_d60231163ae986719f5b012c47fb065331fabdab_6669f1b564e1ae0007c8514b_24_06_12_19_07_21/tests?execution=0&sortBy=STATUS&sortDir=ASC

The commit this branch is based on does not have this issue, and the only change is switching the Evergreen host distro from "windows-vsCurrent-large" (Windows Server 2019) to "windows-2022-large" (Windows Server 2022).

The version upgrade will use a workaround that decreases resmoke concurrency to avoid exhausting the system's memory (sketched after the comments at the end of this ticket), but it is still unclear why the upgrade caused memory usage to increase.

max.hirschhorn@mongodb.com's analysis:

The Evergreen timeout in execution #3 appears to be caused by slow resmoke logging, which led to the primary of the replica set stepping down and hitting fassert(7152000) because it was unable to step down quickly enough while the mongod was fsyncLocked (a sketch of this interaction follows the analysis below).

[js_test:sharded_pit_backup_restore_simple] d20846| 2024-06-13T01:49:41.751+01:00 I REPL 21809 [S] [ReplCoord-0] "Can't see a majority of the set, relinquishing primary"
...
[js_test:sharded_pit_backup_restore_simple] d20846| 2024-06-13T01:50:11.832+01:00 F REPL 5675600 [S] [ReplCoord-0] "Time out exceeded waiting for RSTL, stepUp/stepDown is not possible thus calling abort() to allow cluster to progress","attr":{"lockRep":{"ReplicationStateTransition":{"acquireCount":{"W":1},"acquireWaitCount":{"W":1},"timeAcquiringMicros":{"W":30079690}}}}
[js_test:sharded_pit_backup_restore_simple] d20846| 2024-06-13T01:50:11.832+01:00 F ASSERT 23089 [S] [ReplCoord-0] "Fatal assertion","attr":{"msgid":7152000,"file":"src\\mongo\\db\\repl\\replication_coordinator_impl.cpp","line":2964}

https://parsley.mongodb.com/test/mongodb_mongo_master_enterprise_windows_all_feature_flags_required_noPassthrough_1_windows_enterprise_patch_d60231163ae986719f5b012c47fb065331fabdab_6669f1b564e1ae0007c8514b_24_06_12_19_07_21/2/af21249a209a8a57122acbfa50b9bb32?bookmarks=0,118966,137712,239798,242772&filters=10020846%255C%257C.%2A%255C%255BReplCoord-0%255C%255D&shareLine=0

The Evergreen timeout in execution #2 appears to be caused by out_timeseries_cleans_up_bucket_collections.js, though I couldn't say why. The logs are incomplete for the other tests because the flush thread hit a MemoryError exception. Memory usage hits ~100% at 22:36 UTC, but neither the system logs nor system_resource_info.json identify what is consuming the excessive memory. Notably, the memory of the listed processes sums to only 10-13GB of the 33GB available.

The Evergreen failure in execution #1 has 7 of the 8 tests failing with "out of memory".
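On the execution #3 analysis above: the sketch below is a minimal illustration of the fsyncLock/step-down interaction described there, assuming PyMongo and a hypothetical node on port 20846. It is not the code path the backup/restore test actually exercises; it only shows why a step-down has to wait once the node is fsyncLocked.

    # Illustrative only (assumes PyMongo and a hypothetical mongod on port 20846).
    # While a node is fsyncLocked, a step-down must wait on the RSTL, which is the
    # contention the "Time out exceeded waiting for RSTL" / fassert(7152000) log
    # lines above point at.
    from pymongo import MongoClient

    primary = MongoClient("localhost", 20846, directConnection=True)

    primary.admin.command("fsync", lock=True)  # take the fsync lock
    try:
        # With the fsync lock held, this step-down stalls waiting for the RSTL.
        # In the failing run the implicit "can't see a majority" step-down waited
        # ~30s (timeAcquiringMicros above) and the node fasserted instead.
        primary.admin.command("replSetStepDown", 60, force=True)
    finally:
        primary.admin.command("fsyncUnlock")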
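On the unaccounted-for memory in execution #2: the snippet below is a minimal sketch of the kind of per-process RSS accounting referenced above (it assumes the psutil package and is not the tooling that produces system_resource_info.json). Summing per-process RSS this way is what only explained 10-13GB of the 33GB in use.

    # Illustrative sketch (assumes psutil): sum per-process RSS and compare it
    # against the system memory actually in use.
    import psutil

    total_rss = 0
    for proc in psutil.process_iter(["pid", "name", "memory_info"]):
        mem = proc.info["memory_info"]
        if mem is None:  # access denied for some system processes
            continue
        total_rss += mem.rss
        name = proc.info["name"] or "?"
        print(f"{proc.info['pid']:>8}  {name:<32} {mem.rss / 2**30:6.2f} GiB")

    vm = psutil.virtual_memory()
    print(f"sum of per-process RSS: {total_rss / 2**30:.2f} GiB")
    print(f"system memory in use:   {(vm.total - vm.available) / 2**30:.2f} GiB of {vm.total / 2**30:.2f} GiB")

A large gap between the two totals, as seen here, suggests the consumer is something this per-process view does not capture (e.g. kernel or cache usage), which matches the observation that no single listed process explains the exhaustion.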
xgen-internal-githook commented on Mon, 8 Jul 2024 22:57:27 +0000:
Author: Louis Williams <louiswilliams@users.noreply.github.com> (louiswilliams)
Message: SERVER-91824 Remove TODO for SERVER-91466 (#24430)
GitOrigin-RevId: a40e69bc20b36dfe7ffc3e241a7f77a2930cbfb3
Branch: master
https://github.com/mongodb/mongo/commit/691442bb1ec633ef090fdbda3a7457dc7fd6df8b

gregory.noma commented on Tue, 25 Jun 2024 15:32:52 +0000:
The sharded backup test spawns 9 mongods and one CSRS, so it's a pretty resource-intensive test. We also won't be supporting Windows as a production platform going forward, and the failures here have already been fixed, so closing out this ticket.

dbeng-pm-bot commented on Thu, 13 Jun 2024 19:41:51 +0000:
This issue has been flagged for rapid response! Assignees of rapid response tickets are responsible for providing a daily update on this issue using the 'Server Rapid Response' canned comment template. Any questions about this ticket can be directed to the #server-rapid-response Slack channel, and more information on the Server Rapid Response process can be found on the Wiki.

louis.williams commented on Thu, 13 Jun 2024 13:22:02 +0000:
Thanks max.hirschhorn@mongodb.com. I agree that we are likely running too many tasks on these hosts. I'm going to re-assign to Dev Prod Build to investigate whether reducing the number of tasks solves the problem.
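For reference on the concurrency workaround mentioned in the description (and echoed in the last comment): the sketch below lowers resmoke's parallelism so fewer mongod/mongos fixtures are alive at once on the host. The suite name and job count are placeholders, and it assumes resmoke's --jobs flag and a checkout of the mongo repository; it is not the actual Evergreen change.

    # Illustrative only: run a suite with a reduced resmoke job count to cap
    # peak memory. Run from the root of a mongo repository checkout.
    import subprocess
    import sys

    subprocess.check_call([
        sys.executable, "buildscripts/resmoke.py", "run",
        "--suites=no_passthrough",  # placeholder suite name
        "--jobs=1",                 # lower parallelism to limit concurrent fixtures
    ])

In Evergreen the same effect is normally achieved through a per-task resmoke jobs expansion rather than a hand-written invocation; the exact knob used for this workaround isn't shown in the ticket.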