...
When querying the "system.profile" collection with a user containing only the clusterMonitor role, closing the cursor appears to fail and "leak" the cursor until the 10 minute timeout. However querying with a user with higher privileges closes the cursor successfully. Repro steps are above. I tried with both mgo and Java drivers just to double check. Log level 2 output from a reproducer: 2015-09-10T18:31:50.300-0400 I NETWORK [initandlisten] connection accepted from 127.0.0.1:64007 #2199 (4 connections now open) 2015-09-10T18:31:50.307-0400 D COMMAND [conn2199] run command admin.$cmd { buildinfo: 1 } 2015-09-10T18:31:50.308-0400 I COMMAND [conn2199] command admin.$cmd command: buildInfo { buildinfo: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:977 locks:{} 0ms 2015-09-10T18:31:50.316-0400 D COMMAND [conn2199] run command admin.$cmd { ismaster: 1 } 2015-09-10T18:31:50.316-0400 I COMMAND [conn2199] command admin.$cmd command: isMaster { ismaster: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:465 locks:{} 0ms 2015-09-10T18:31:50.344-0400 I NETWORK [initandlisten] connection accepted from 127.0.0.1:64009 #2200 (5 connections now open) 2015-09-10T18:31:50.344-0400 D COMMAND [conn2200] run command admin.$cmd { buildinfo: 1 } 2015-09-10T18:31:50.344-0400 I COMMAND [conn2200] command admin.$cmd command: buildInfo { buildinfo: 1 } keyUpdates:0 writeConflicts:0 numYields:0 reslen:977 locks:{} 0ms 2015-09-10T18:31:50.357-0400 D COMMAND [conn2200] run command admin.$cmd { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: BinData(0, 6E2C2C6E3D636C75737465726D6F6E69746F722C723D585E292A4C7B2164637D744B4C2965796D556548265C4677) } 2015-09-10T18:31:50.358-0400 D QUERY [conn2200] Relevant index 0 is kp: { user: 1, db: 1 } io: { v: 1, unique: true, key: { user: 1, db: 1 }, name: "user_1_db_1", ns: "admin.system.users" } 2015-09-10T18:31:50.358-0400 D QUERY [conn2200] Only one plan is available; it will be run but will not be cached. 
query: { user: "clustermonitor", db: "admin" } sort: {} projection: {} skip: 0 limit: 0, planSummary: IXSCAN { user: 1, db: 1 } 2015-09-10T18:31:50.358-0400 D STORAGE [conn2200] WT begin_transaction 2015-09-10T18:31:50.358-0400 D STORAGE [conn2200] WT rollback_transaction 2015-09-10T18:31:50.362-0400 I COMMAND [conn2200] command admin.$cmd command: saslStart { saslStart: 1, mechanism: "SCRAM-SHA-1", payload: BinData(0, 6E2C2C6E3D636C75737465726D6F6E69746F722C723D585E292A4C7B2164637D744B4C2965796D556548265C4677) } keyUpdates:0 writeConflicts:0 numYields:0 reslen:171 locks:{ G lobal: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 4ms 2015-09-10T18:31:50.362-0400 D STORAGE [conn2200] WT begin_transaction 2015-09-10T18:31:50.363-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::beingReleased 2015-09-10T18:31:50.363-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::broughtBack 2015-09-10T18:31:50.363-0400 D STORAGE [conn2200] WT begin_transaction 2015-09-10T18:31:50.363-0400 D STORAGE [conn2200] WT commit_transaction 2015-09-10T18:31:50.363-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::beingReleased 2015-09-10T18:31:50.363-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::broughtBack 2015-09-10T18:31:50.363-0400 D STORAGE [conn2200] WT commit_transaction 2015-09-10T18:31:50.582-0400 D COMMAND [conn2200] run command admin.$cmd { saslContinue: 1, conversationId: 1, payload: BinData(0, 633D626977732C723D585E292A4C7B2164637D744B4C2965796D556548265C46772B5264594D65677343656D6871664D6841425237646A2F46556573484735505A2C703D576B...) } 2015-09-10T18:31:50.582-0400 I COMMAND [conn2200] command admin.$cmd command: saslContinue { saslContinue: 1, conversationId: 1, payload: BinData(0, 633D626977732C723D585E292A4C7B2164637D744B4C2965796D556548265C46772B5264594D65677343656D6871664D6841425237646A2F46556573484735505A2C703D576B...) } keyUpdates:0 writeConflicts:0 numYields:0 reslen:108 locks:{} 0ms 2015-09-10T18:31:50.583-0400 D COMMAND [conn2200] run command admin.$cmd { saslContinue: 1, conversationId: 1, payload: BinData(0, 763D4E373437683752596F3274504D4C49333047325862626D4E6D4D673D) } 2015-09-10T18:31:50.583-0400 D QUERY [conn2200] Relevant index 0 is kp: { user: 1, db: 1 } io: { v: 1, unique: true, key: { user: 1, db: 1 }, name: "user_1_db_1", ns: "admin.system.users" } 2015-09-10T18:31:50.583-0400 D QUERY [conn2200] Only one plan is available; it will be run but will not be cached. 
query: { user: "clustermonitor", db: "admin" } sort: {} projection: {} skip: 0 limit: 0, planSummary: IXSCAN { user: 1, db: 1 } 2015-09-10T18:31:50.583-0400 D STORAGE [conn2200] WT begin_transaction 2015-09-10T18:31:50.583-0400 D STORAGE [conn2200] WT rollback_transaction 2015-09-10T18:31:50.583-0400 I ACCESS [conn2200] Successfully authenticated as principal clustermonitor on admin 2015-09-10T18:31:50.584-0400 I COMMAND [conn2200] command admin.$cmd command: saslContinue { saslContinue: 1, conversationId: 1, payload: BinData(0, 763D4E373437683752596F3274504D4C49333047325862626D4E6D4D673D) } keyUpdates:0 writeConflicts:0 numYields:0 reslen:78 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 0ms 2015-09-10T18:31:50.584-0400 D STORAGE [conn2200] WT begin_transaction 2015-09-10T18:31:50.584-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::beingReleased 2015-09-10T18:31:50.584-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::broughtBack 2015-09-10T18:31:50.584-0400 D STORAGE [conn2200] WT begin_transaction 2015-09-10T18:31:50.584-0400 D STORAGE [conn2200] WT commit_transaction 2015-09-10T18:31:50.584-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::beingReleased 2015-09-10T18:31:50.584-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::broughtBack 2015-09-10T18:31:50.584-0400 D STORAGE [conn2200] WT commit_transaction 2015-09-10T18:31:50.584-0400 D QUERY [conn2200] Running query: query: {} sort: { $natural: -1 } projection: {} skip: 0 limit: 20 2015-09-10T18:31:50.584-0400 D QUERY [conn2200] Only one plan is available; it will be run but will not be cached. query: {} sort: { $natural: -1 } projection: {} skip: 0 limit: 20, planSummary: COLLSCAN 2015-09-10T18:31:50.584-0400 D STORAGE [conn2200] WT begin_transaction 2015-09-10T18:31:50.584-0400 D STORAGE [conn2200] WT rollback_transaction 2015-09-10T18:31:50.584-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::beingReleased 2015-09-10T18:31:50.584-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::broughtBack 2015-09-10T18:31:50.584-0400 I QUERY [conn2200] query admin.system.profile query: { $query: {}, $orderby: { $natural: -1 } } planSummary: COLLSCAN cursorid:39651685655 ntoreturn:20 ntoskip:0 nscanned:0 nscannedObjects:20 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:20 reslen:11243 locks:{ Global: { acquireCount: { r: 2 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } } } 0ms 2015-09-10T18:31:50.585-0400 D STORAGE [conn2200] WT begin_transaction 2015-09-10T18:31:50.585-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::beingReleased 2015-09-10T18:31:50.585-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::broughtBack 2015-09-10T18:31:50.585-0400 D STORAGE [conn2200] WT begin_transaction 2015-09-10T18:31:50.585-0400 D STORAGE [conn2200] WT commit_transaction 2015-09-10T18:31:50.585-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::beingReleased 2015-09-10T18:31:50.585-0400 D STORAGE [conn2200] WiredTigerRecoveryUnit::broughtBack 2015-09-10T18:31:50.585-0400 D STORAGE [conn2200] WT commit_transaction 2015-09-10T18:31:50.597-0400 I COMMAND [conn2200] killcursors: found 0 of 1 2015-09-10T18:31:50.598-0400 I QUERY [conn2200] killcursors keyUpdates:0 writeConflicts:0 numYields:0 locks:{} 0ms 2015-09-10T18:31:50.600-0400 D NETWORK [conn2199] SocketException: remote: 127.0.0.1:64007 error: 9001 socket exception [CLOSED] server [127.0.0.1:64007] 2015-09-10T18:31:50.600-0400 D NETWORK [conn2200] SocketException: remote: 127.0.0.1:64009 
error: 9001 socket exception [CLOSED] server [127.0.0.1:64009] 2015-09-10T18:31:50.600-0400 I NETWORK [conn2199] end connection 127.0.0.1:64007 (4 connections now open) 2015-09-10T18:31:50.601-0400 I NETWORK [conn2200] end connection 127.0.0.1:64009 (4 connections now open) Environment OS X 10.8.4 MongoDB 3.0.4, WT Replica Set with auth, authSchemaVersion 5
andreas.nilsson@10gen.com commented on Wed, 20 Apr 2016 20:03:43 +0000:
Should be reverted when SERVER-9609 is resolved.

xgen-internal-githook commented on Wed, 16 Sep 2015 18:44:36 +0000:
Author: Andreas Nilsson (agralius) <andreas.nilsson@10gen.com>
Message: SERVER-20364 Let find privilege -> killCursor privilege
Branch: v3.0
https://github.com/mongodb/mongo/commit/5cb8c7d9810f15ba6926afe9d16d81eeb25724e5

xgen-internal-githook commented on Wed, 16 Sep 2015 18:20:28 +0000:
Author: Andreas Nilsson (agralius) <andreas.nilsson@10gen.com>
Message: SERVER-20364 Let find privilege -> killCursor privilege
Branch: master
https://github.com/mongodb/mongo/commit/90d9a09c183b5dc6a83dfc30634e6807b393f457

rassi@10gen.com commented on Fri, 11 Sep 2015 07:32:15 +0000:
True, you'd get a truncated result set. I suspect that it doesn't make a difference for John's use case, though.

kevin.pulo@10gen.com commented on Fri, 11 Sep 2015 03:01:20 +0000:
The negative limit workaround won't work if you need to fetch more than one batch, though.

john.morales@10gen.com commented on Fri, 11 Sep 2015 02:44:50 +0000:
Confirmed with both the Go and Java reproducers that a negative limit works around the issue, thanks jason.rassi.

rassi@10gen.com commented on Fri, 11 Sep 2015 00:02:05 +0000:
Generally, cursors that are exhausted should be closed automatically by the server without the need for an explicit "kill cursors" request from the driver. However, cursors for queries that specify a positive limit will remain open even after the cursor.hasNext() call returns false, as the server does not know whether to interpret the positive ntoreturn value as a batchSize or a limit (edit: this is fixed in 3.2 with the introduction of the find command, which uses separate "batchSize" and "limit" options instead of ntoreturn). Thus, in this case, the driver will follow up with a "kill cursors" request, which should kill the cursor.

That aside, I've confirmed your suspicion from inspecting the server source code that users with the clusterMonitor role are allowed to create cursors on system.profile but are not allowed to kill them. I believe this bug has existed since the clusterMonitor role was first introduced. andreas.nilsson, can you triage? I'd suggest either adding ActionType::killCursors on "system.profile" to the out-argument in addClusterMonitorPrivileges() (and auditing the rest of role_graph_builtin_roles.cpp to fix all of the other places in that file with the same problem), or getting rid of ActionType::killCursors entirely and changing killCursors/OP_KILL_CURSORS to require ActionType::find instead.

John: as a workaround, can you see if the problem still manifests if you change the limit to a negative limit? I suspect it won't, as the server should close the cursor automatically in this case.
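To make the suggested workaround concrete, here is a minimal sketch using the Go (mgo) driver mentioned in the report. It assumes a *mgo.Session already dialed and authenticated as the clustermonitor user, and a profiled "test" database; both are placeholders rather than details taken from the actual reproducer (a fuller sketch of the setup follows the reproduction steps below).

    // Negative limit: the server returns a single batch of at most 20
    // documents and closes the cursor itself, so the driver never has to
    // send a killCursors request (which clusterMonitor is not authorized
    // to perform on system.profile).
    iter := session.DB("test").C("system.profile").
        Find(bson.M{}).
        Sort("-$natural").
        Limit(-20).
        Iter()

    var entry bson.M
    for iter.Next(&entry) {
        // Inspect the profile entry.
        fmt.Println(entry["op"], entry["ns"])
    }
    if err := iter.Close(); err != nil { // no lingering server-side cursor to kill
        log.Println("close:", err)
    }

As kevin.pulo and rassi note above, this returns at most one (possibly truncated) batch, so it only helps when a single batch of results is enough.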
(Not exactly clear which of these steps, if any, are optional. These are just the final steps I took. Reproduces 100% of the time.)

1. Start a new replica set with auth.
2. Add two users:
   a. "admin" with roles: userAdminAnyDatabase, readWriteAnyDatabase, clusterAdmin, dbAdminAnyDatabase, clusterManager.
   b. "clustermonitor" with the role "clusterMonitor" only.
3. Enable profiling level 2 on the "test" database.
4. As the "admin" user, perform some queries to populate the "system.profile" collection. (I had a few hundred entries.)
5. Run the Go reproducer or the Java reproducer; a minimal sketch of a similar reproducer follows these steps. (I tried using just the shell, but I wasn't clear on how to perform the explicit killCursors call from the shell. If possible, I imagine that would work, too.)

Note: if you instead use the "admin" user, the cursor is closed.
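For reference, this is a minimal sketch of what step 5 might look like with the Go (mgo) driver; it is not the actual attached reproducer, and the connection string, password, and profiled database name are placeholders:

    package main

    import (
        "fmt"
        "log"

        "gopkg.in/mgo.v2"
        "gopkg.in/mgo.v2/bson"
    )

    func main() {
        // Authenticate against the admin database as the clusterMonitor-only
        // user. Host, password, and database names are placeholders.
        session, err := mgo.Dial("mongodb://clustermonitor:secret@localhost:27017/admin")
        if err != nil {
            log.Fatal(err)
        }
        defer session.Close()

        // Same query shape as in the log output: filter {},
        // sort { $natural: -1 }, positive limit of 20.
        iter := session.DB("test").C("system.profile").
            Find(bson.M{}).
            Sort("-$natural").
            Limit(20).
            Iter()

        var entry bson.M
        for iter.Next(&entry) {
            fmt.Println(entry["op"], entry["ns"])
        }

        // With a positive limit the server-side cursor stays open after the
        // last batch, so Close() issues an explicit killCursors. Run as the
        // clustermonitor user, the server logs "killcursors: found 0 of 1"
        // and the cursor lingers until it times out; run as the "admin"
        // user, the cursor is killed as expected.
        if err := iter.Close(); err != nil {
            log.Println("close:", err)
        }
    }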