BugZero found this defect 2743 days ago.
Setup: a replica set with two data-bearing nodes and an arbiter, with one data node down. When performing a w:"majority" write that would otherwise fail with a duplicate key error, I expect it to fail with a write concern timeout error after the wtimeout expires. However, after waiting it still reports only the duplicate key error:

    replset:PRIMARY> db.test.insertOne({_id:1})
    { "acknowledged" : true, "insertedId" : 1 }
    replset:PRIMARY> db.test.insertOne({_id:1},{writeConcern:{w:"majority",wtimeout:5000}})
    ... WAIT ...
    2017-04-26T18:09:57.685-0400 E QUERY [thread1] WriteError: E11000 duplicate key error collection: test.test index: _id_ dup key: { : 1.0 } :
    WriteError({
        "index" : 0,
        "code" : 11000,
        "errmsg" : "E11000 duplicate key error collection: test.test index: _id_ dup key: { : 1.0 }",
        "op" : { "_id" : 1 }
    })
    WriteError@src/mongo/shell/bulk_api.js:469:48
    Bulk/mergeBatchResults@src/mongo/shell/bulk_api.js:836:49
    Bulk/executeBatch@src/mongo/shell/bulk_api.js:906:13
    Bulk/this.execute@src/mongo/shell/bulk_api.js:1150:21
    DBCollection.prototype.insertOne@src/mongo/shell/crud_api.js:242:9
    @(shell):1:1
    replset:PRIMARY> rs.status()
    {
        "set" : "replset",
        "date" : ISODate("2017-04-26T22:10:05.449Z"),
        "myState" : 1,
        "term" : NumberLong(3),
        "heartbeatIntervalMillis" : NumberLong(2000),
        "optimes" : {
            "lastCommittedOpTime" : { "ts" : Timestamp(1493242117, 1), "t" : NumberLong(1) },
            "appliedOpTime" : { "ts" : Timestamp(1493244598, 1), "t" : NumberLong(3) },
            "durableOpTime" : { "ts" : Timestamp(1493244598, 1), "t" : NumberLong(3) }
        },
        "members" : [
            {
                "_id" : 0,
                "name" : "AD-MAC10G.local:27017",
                "health" : 0,
                "state" : 8,
                "stateStr" : "(not reachable/healthy)",
                "uptime" : 0,
                "optime" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) },
                "optimeDurable" : { "ts" : Timestamp(0, 0), "t" : NumberLong(-1) },
                "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
                "lastHeartbeat" : ISODate("2017-04-26T22:10:04.105Z"),
                "lastHeartbeatRecv" : ISODate("2017-04-26T21:28:42.092Z"),
                "pingMs" : NumberLong(0),
                "lastHeartbeatMessage" : "Connection refused",
                "configVersion" : -1
            },
            {
                "_id" : 1,
                "name" : "AD-MAC10G.local:27018",
                "health" : 1,
                "state" : 1,
                "stateStr" : "PRIMARY",
                "uptime" : 2740,
                "optime" : { "ts" : Timestamp(1493244598, 1), "t" : NumberLong(3) },
                "optimeDate" : ISODate("2017-04-26T22:09:58Z"),
                "infoMessage" : "could not find member to sync from",
                "electionTime" : Timestamp(1493244557, 1),
                "electionDate" : ISODate("2017-04-26T22:09:17Z"),
                "configVersion" : 1,
                "self" : true
            },
            {
                "_id" : 2,
                "name" : "AD-MAC10G.local:27019",
                "health" : 1,
                "state" : 7,
                "stateStr" : "ARBITER",
                "uptime" : 57,
                "lastHeartbeat" : ISODate("2017-04-26T22:10:03.982Z"),
                "lastHeartbeatRecv" : ISODate("2017-04-26T22:10:02.801Z"),
                "pingMs" : NumberLong(0),
                "configVersion" : 1
            }
        ],
        "ok" : 1
    }
    replset:PRIMARY>

This makes it impossible to tell that the write concern was not satisfied.
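When the insert command is issued directly (rather than through the shell helpers), the server's reply can carry both a writeErrors array and a writeConcernError document, so a caller can in principle detect both failures. A minimal sketch of classifying such a reply follows; the classifyReply helper and the sample reply object are illustrative, not taken from the shell source:

```javascript
// Classify a raw insert-command reply. A reply may contain per-document
// write errors (e.g. E11000 duplicate key) and, independently, a
// writeConcernError when replication did not satisfy w within wtimeout.
function classifyReply(reply) {
  return {
    writeErrors: reply.writeErrors || [],
    // Command replies report a single writeConcernError document
    // (the shell bulk API aggregates these into writeConcernErrors).
    writeConcernError: reply.writeConcernError || null,
  };
}

// Hypothetical reply shaped like the errors in this ticket: the duplicate
// key error together with a wtimeout-flavored write concern error.
const reply = {
  ok: 1,
  n: 0,
  writeErrors: [
    { index: 0, code: 11000, errmsg: "E11000 duplicate key error" },
  ],
  writeConcernError: {
    code: 64,
    codeName: "WriteConcernFailed",
    errInfo: { wtimeout: true },
    errmsg: "waiting for replication timed out",
  },
};

const outcome = classifyReply(reply);
console.log(outcome.writeErrors.length, outcome.writeConcernError !== null);
```

A caller that only surfaces the first write error, as the shell did here, silently loses the second half of this information.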
spencer commented on Mon, 3 Jul 2017 21:48:03 +0000:
Looks like the issue was actually with the shell behavior, not the command behavior.

spencer commented on Wed, 17 May 2017 22:49:01 +0000:
Actually, I am unable to reproduce this outside of the shell bulk API. alex.komyagin, can you reproduce this using the command API directly? i.e.

    db.runCommand({insert:'test', documents: [{_id:2}], writeConcern:{w:3, wtimeout:3000}, ordered: true})

spencer commented on Wed, 17 May 2017 21:51:17 +0000:
So, hilariously, the difference between ordered and unordered is coming from the shell, not the server. I filed SERVER-29263 for that bug. There may still be a bug in the server where we don't return the write concern error in some places. I'm still investigating the server side of the issue.

alex.komyagin@10gen.com commented on Thu, 27 Apr 2017 20:45:13 +0000:
Just to add to this, ordered and unordered bulk operations are not consistent in reporting write concern failures after a timeout:

    replset:PRIMARY> db.test1.bulkWrite([{insertOne:{document:{_id:1}}}],{ordered:false,writeConcern:{w:2,wtimeout:1000}})
    2017-04-27T16:38:17.558-0400 E QUERY [thread1] BulkWriteError: write error at item 0 and problem enforcing write concern in bulk operation :
    BulkWriteError({
        "writeErrors" : [
            {
                "index" : 0,
                "code" : 11000,
                "errmsg" : "E11000 duplicate key error collection: test.test1 index: _id_ dup key: { : 1.0 }",
                "op" : { "_id" : 1 }
            }
        ],
        "writeConcernErrors" : [
            {
                "code" : 64,
                "codeName" : "WriteConcernFailed",
                "errInfo" : { "wtimeout" : true },
                "errmsg" : "waiting for replication timed out"
            }
        ],
        "nInserted" : 0,
        "nUpserted" : 0,
        "nMatched" : 0,
        "nModified" : 0,
        "nRemoved" : 0,
        "upserted" : [ ]
    })
    BulkWriteError@src/mongo/shell/bulk_api.js:372:48
    BulkWriteResult/this.toError@src/mongo/shell/bulk_api.js:336:24
    Bulk/this.execute@src/mongo/shell/bulk_api.js:1173:1
    DBCollection.prototype.bulkWrite@src/mongo/shell/crud_api.js:191:20
    @(shell):1:1
    replset:PRIMARY> db.test1.bulkWrite([{insertOne:{document:{_id:1}}}],{ordered:true,writeConcern:{w:2,wtimeout:1000}})
    2017-04-27T16:38:39.568-0400 E QUERY [thread1] BulkWriteError: write error at item 0 in bulk operation :
    BulkWriteError({
        "writeErrors" : [
            {
                "index" : 0,
                "code" : 11000,
                "errmsg" : "E11000 duplicate key error collection: test.test1 index: _id_ dup key: { : 1.0 }",
                "op" : { "_id" : 1 }
            }
        ],
        "writeConcernErrors" : [ ],
        "nInserted" : 0,
        "nUpserted" : 0,
        "nMatched" : 0,
        "nModified" : 0,
        "nRemoved" : 0,
        "upserted" : [ ]
    })

milkie commented on Wed, 26 Apr 2017 23:58:31 +0000:
Arguably, however, you should still be able to know if your timeout expired, because otherwise you would assume that your read could see the data even when using read concern majority. This seems like a bug that will be fixed if we merge the wtimeout with maxTimeMS.

milkie commented on Wed, 26 Apr 2017 23:54:10 +0000:
This behavior is by design. You were always destined to get the duplicate key error, regardless of your write concern. However, because you used w:majority, the server has to delay its response to you so that the preexisting data that caused your error can first replicate to a majority of nodes. This is to allow you to immediately issue a read concern level majority read and be able to see the duplicate data that caused the write error.
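Where the write concern errors are surfaced at all (as in the unordered bulkWrite output above), a caller can distinguish a replication timeout from an ordinary write error by checking errInfo.wtimeout on each write concern error. A small sketch under that assumption; the hasWtimeout helper name is my own:

```javascript
// Returns true when a bulk result's write concern errors include a
// replication timeout: code 64 (WriteConcernFailed) with
// errInfo.wtimeout set, as in the unordered example above.
function hasWtimeout(bulkResult) {
  const wcErrors = bulkResult.writeConcernErrors || [];
  return wcErrors.some((e) => e.errInfo && e.errInfo.wtimeout === true);
}

// Result shaped like the unordered bulkWrite error in this thread.
const unorderedResult = {
  writeErrors: [{ index: 0, code: 11000 }],
  writeConcernErrors: [
    {
      code: 64,
      codeName: "WriteConcernFailed",
      errInfo: { wtimeout: true },
      errmsg: "waiting for replication timed out",
    },
  ],
};

console.log(hasWtimeout(unorderedResult)); // → true
```

The ordered case in the thread returns an empty writeConcernErrors array for the same situation, so this check yields false there even though the timeout occurred, which is exactly the inconsistency being reported.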