...
BugZero found this defect 2726 days ago.
2017-07-07T15:09:03.839-0500 F - [repl index builder 23862] Got signal: 6 (Aborted). 0x5633b8a7f8a1 0x5633b8a7eab9 0x5633b8a7ef9d 0x7f1544cec890 0x7f1544967067 0x7f1544968448 0x5633b7d2ecf3 0x5633b8191a96 0x5633b89f2101 0x5633b94f3d30 0x7f1544ce5064 0x7f1544a1a62d ----- BEGIN BACKTRACE ----- {"backtrace":[{"b":"5633B7515000","o":"156A8A1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"5633B7515000","o":"1569AB9"},{"b":"5633B7515000","o":"1569F9D"},{"b":"7F1544CDD000","o":"F890"},{"b":"7F1544932000","o":"35067","s":"gsignal"},{"b":"7F1544932000","o":"36448","s":"abort"},{"b":"5633B7515000","o":"819CF3","s":"_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj"},{"b":"5633B7515000","o":"C7CA96","s":"_ZN5mongo12IndexBuilder3runEv"},{"b":"5633B7515000","o":"14DD101","s":"_ZN5mongo13BackgroundJob7jobBodyEv"},{"b":"5633B7515000","o":"1FDED30"},{"b":"7F1544CDD000","o":"8064"},{"b":"7F1544932000","o":"E862D","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.6", "gitVersion" : "c55eb86ef46ee7aede3b1e2a5d184a7df4bfb5b5", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "3.16.0-4-amd64", "version" : "#1 SMP Debian 3.16.43-2+deb8u2 (2017-06-26)", "machine" : "x86_64" }, "somap" : [ { "b" : "5633B7515000", "elfType" : 3, "buildId" : "A103B8CEADAFC57DD867918614DCE184B9D877C2" }, { "b" : "7FFCD8F3E000", "path" : "linux-vdso.so.1", "elfType" : 3, "buildId" : "FC9137B45D34B77AE9F781A05AA9CF5C3CD44D62" }, { "b" : "7F1545C19000", "path" : "/usr/lib/x86_64-linux-gnu/libssl.so.1.0.0", "elfType" : 3, "buildId" : "21115992A1F885E1ACE88AADA60F126AD9759D03" }, { "b" : "7F154581D000", "path" : "/usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0", "elfType" : 3, "buildId" : "32E9A5B9EED626E93DEEB00A49033F78652DB9A3" }, { "b" : "7F1545615000", "path" : "/lib/x86_64-linux-gnu/librt.so.1", "elfType" : 3, "buildId" : "A63C95FB33CCA970E141D2E13774B997C1CF0565" }, { "b" : "7F1545411000", "path" : "/lib/x86_64-linux-gnu/libdl.so.2", "elfType" : 3, "buildId" : 
"D70B531D672A34D71DB42EB32B68E63F2DCC5B6A" }, { "b" : "7F1545110000", "path" : "/lib/x86_64-linux-gnu/libm.so.6", "elfType" : 3, "buildId" : "152C93BA3E8590F7ED0BCDDF868600D55EC4DD6F" }, { "b" : "7F1544EFA000", "path" : "/lib/x86_64-linux-gnu/libgcc_s.so.1", "elfType" : 3, "buildId" : "D5FB04F64B3DAEA6D6B68B5E8B9D4D2BC1A6E1FC" }, { "b" : "7F1544CDD000", "path" : "/lib/x86_64-linux-gnu/libpthread.so.0", "elfType" : 3, "buildId" : "9DA9387A60FFC196AEDB9526275552AFEF499C44" }, { "b" : "7F1544932000", "path" : "/lib/x86_64-linux-gnu/libc.so.6", "elfType" : 3, "buildId" : "48C48BC6ABB794461B8A558DD76B29876A0551F0" }, { "b" : "7F1545E7A000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "1D98D41FBB1EABA7EC05D0FD7624B85D6F51C03C" } ] }} mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x5633b8a7f8a1] mongod(+0x1569AB9) [0x5633b8a7eab9] mongod(+0x1569F9D) [0x5633b8a7ef9d] libpthread.so.0(+0xF890) [0x7f1544cec890] libc.so.6(gsignal+0x37) [0x7f1544967067] libc.so.6(abort+0x148) [0x7f1544968448] mongod(_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj+0x0) [0x5633b7d2ecf3] mongod(_ZN5mongo12IndexBuilder3runEv+0xD86) [0x5633b8191a96] mongod(_ZN5mongo13BackgroundJob7jobBodyEv+0x131) [0x5633b89f2101] mongod(+0x1FDED30) [0x5633b94f3d30] libpthread.so.0(+0x8064) [0x7f1544ce5064] libc.so.6(clone+0x6D) [0x7f1544a1a62d] ----- END BACKTRACE -----
thomas.schubert commented on Wed, 19 Jul 2017 21:23:31 +0000: Hi sallgeud, Sorry for the delay getting back to you. From the log files, I can see that these crashes are the result of hitting a "Too many open files" system limit. We're aware that these replica sets may have hundreds of thousands of active data handles, as a result, unfortunately, this type of error is not unexpected. As you're aware, we have work scheduled to reduce the number of open files required for your schema and workload. For now, I would recommend reconfirming that your ulimits are appropriately set. Kind regards, Thomas

sallgeud commented on Wed, 19 Jul 2017 16:53:46 +0000: Uploaded. It happened in 3.4.5 for us in the previous few days, so I've uploaded the 3.4.5 logs as well.

sallgeud commented on Mon, 10 Jul 2017 19:29:55 +0000: Oops.. nm. See it now

sallgeud commented on Mon, 10 Jul 2017 19:29:37 +0000: Sure. Send me over the secure upload Link

thomas.schubert commented on Mon, 10 Jul 2017 17:41:59 +0000: Additionally, would you please provide the diagnostic.data, as that may help us rule out some other theories that would explain this behavior. Thanks again, Thomas

thomas.schubert commented on Mon, 10 Jul 2017 16:04:21 +0000: Hi sallgeud, Thanks for the report, to help us investigate would you please upload the complete log file that includes the fassert? I've created a secure upload portal for you to use. Kind regards, Thomas
The error occurred while running several copyDatabase functions concurrently. So far it has only been verified on 3.4.6 on Linux.
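
The comments above attribute the crash to a "Too many open files" limit and recommend reconfirming the ulimits. The following is a minimal sketch of one way to do that check on Linux, not a procedure given in this ticket. It assumes the standard `/proc/<pid>/limits` and `/proc/<pid>/fd` layout; the helper names (`current_nofile_limits`, `parse_max_open_files`, `open_fd_count`) are hypothetical and chosen only for illustration.

```python
import os
import resource


def current_nofile_limits():
    """Return (soft, hard) RLIMIT_NOFILE for the process running this script."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def parse_max_open_files(limits_text):
    """Extract (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    Values are returned as strings because the kernel may report 'unlimited'.
    Returns None if the row is not present.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # Row layout: Max open files <soft> <hard> files
            return fields[3], fields[4]
    return None


def open_fd_count(pid):
    """Count file descriptors currently open by pid (Linux only).

    Reading another user's /proc/<pid>/fd usually requires root.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    # Inspect this process; substitute the mongod pid to inspect the server.
    pid = os.getpid()
    with open(f"/proc/{pid}/limits") as fh:
        print("limits (soft, hard):", parse_max_open_files(fh.read()))
    print("open descriptors:", open_fd_count(pid))
```

Comparing the open descriptor count of the mongod process against its soft limit shows how close the server is to the failure described in the comments. The limits that matter are those of the running mongod process, which can differ from the limits of an interactive shell.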