MySQL deadlock on /notifications/individual/


#1

Observed Behavior:
We are investigating reports of intermittent “unable to connect to database” errors being observed on our Phabricator installation. Whilst investigating this, we found the following exception in our PHP FPM logs:

[13-Dec-2018 04:42:26 Etc/UTC] [2018-12-13 04:42:26] EXCEPTION: (AphrontDeadlockQueryException) #1213: Deadlock found when try
ing to get lock; try restarting transaction at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnecti
on.php:309]
[13-Dec-2018 04:42:26 Etc/UTC] arcanist(head=stable, ref.master=3534d2baca4b, ref.stable=45a8d22c74a6), phabricator(head=stabl
e, ref.master=9bfe558587aa, ref.stable=237a2a190984), phlab(head=master, ref.master=564c60d09ff4), phutil(head=stable, ref.master=35d0ec2dfa59, ref.stable=414a4c6abb1b)
[13-Dec-2018 04:42:26 Etc/UTC]   #0 <#2> AphrontBaseMySQLDatabaseConnection::throwCommonException(integer, string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:351]
[13-Dec-2018 04:42:26 Etc/UTC]   #1 <#2> AphrontBaseMySQLDatabaseConnection::throwQueryCodeException(integer, string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:292]
[13-Dec-2018 04:42:26 Etc/UTC]   #2 <#2> AphrontBaseMySQLDatabaseConnection::throwQueryException(mysqli) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:188]
[13-Dec-2018 04:42:26 Etc/UTC]   #3 <#2> AphrontBaseMySQLDatabaseConnection::executeQuery(PhutilQueryString) called at [<phutil>/src/xsprintf/queryfx.php:8]
[13-Dec-2018 04:42:26 Etc/UTC]   #4 <#2> queryfx(AphrontMySQLiDatabaseConnection, string, PhutilQueryString, PhabricatorMarkupCache, array, array) called at [<phutil>/src/aphront/storage/connection/AphrontDatabaseConnection.php:58]
[13-Dec-2018 04:42:26 Etc/UTC]   #5 <#2> AphrontDatabaseConnection::query(string, PhutilQueryString, PhabricatorMarkupCache, array, array) called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:1277]
[13-Dec-2018 04:42:26 Etc/UTC]   #6 <#2> LiskDAO::insertRecordIntoDatabase(string) called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:1095]
[13-Dec-2018 04:42:26 Etc/UTC]   #7 <#2> LiskDAO::replace() called at [<phabricator>/src/infrastructure/markup/PhabricatorMarkupEngine.php:287]
[13-Dec-2018 04:42:26 Etc/UTC]   #8 <#2> PhabricatorMarkupEngine::loadPreprocessorCaches(array, array) called at [<phabricator>/src/infrastructure/markup/PhabricatorMarkupEngine.php:133]
[13-Dec-2018 04:42:26 Etc/UTC]   #9 <#2> PhabricatorMarkupEngine::process() called at [<phabricator>/src/applications/feed/story/PhabricatorFeedStory.php:173]
[13-Dec-2018 04:42:26 Etc/UTC]   #10 <#2> PhabricatorFeedStory::loadAllFromRows(array, PhabricatorUser) called at [<phabricator>/src/applications/notification/query/PhabricatorNotificationQuery.php:70]
[13-Dec-2018 04:42:26 Etc/UTC]   #11 <#2> PhabricatorNotificationQuery::loadPage() called at [<phabricator>/src/infrastructure/query/policy/PhabricatorPolicyAwareQuery.php:236]
[13-Dec-2018 04:42:26 Etc/UTC]   #12 <#2> PhabricatorPolicyAwareQuery::execute() called at [<phabricator>/src/applications/notification/controller/PhabricatorNotificationIndividualController.php:13]
[13-Dec-2018 04:42:26 Etc/UTC]   #13 <#2> PhabricatorNotificationIndividualController::handleRequest(AphrontRequest) called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:259]
[13-Dec-2018 04:42:26 Etc/UTC]   #14 phlog(AphrontDeadlockQueryException) called at [<phabricator>/src/aphront/handler/PhabricatorAjaxRequestExceptionHandler.php:27]
[13-Dec-2018 04:42:26 Etc/UTC]   #15 PhabricatorAjaxRequestExceptionHandler::handleRequestThrowable(AphrontRequest, AphrontDeadlockQueryException) called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:671]
[13-Dec-2018 04:42:26 Etc/UTC]   #16 AphrontApplicationConfiguration::handleThrowable(AphrontDeadlockQueryException) called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:264]
[13-Dec-2018 04:42:26 Etc/UTC]   #17 AphrontApplicationConfiguration::processRequest(AphrontRequest, PhutilDeferredLog, AphrontPHPHTTPSink, MultimeterControl) called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:182]
[13-Dec-2018 04:42:26 Etc/UTC]   #18 AphrontApplicationConfiguration::runHTTPRequest(AphrontPHPHTTPSink) called at [<phabricator>/webroot/index.php:17]

From SHOW ENGINE INNODB STATUS:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2018-12-13 04:42:26 0x2af380b91700
*** (1) TRANSACTION:
TRANSACTION 963083924, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 7 lock struct(s), heap size 1136, 5 row lock(s), undo log entries 1
MySQL thread id 1104401682, OS thread handle 47208636163840, query id 7694145767 10.50.0.187 app update
REPLACE INTO `phabricator_cache`.`cache_markupcache` (`cacheKey`, `cacheData`, `metadata`, `dateCreated`, `dateModified`) VALUES ('feed:6634333534274666701:comment/2575924@17@7VwBk7PApozc', 'a:3:{s:6:\"output\";O:14:\"PhutilSafeHTML\":1:{s:23:\"\0PhutilSafeHTML\0content\";s:73:\"<ul class=\"remarkup-list\">\n<li class=\"remarkup-list-item\">path</li>\n</ul>\";}s:7:\"storage\";a:0:{}s:8:\"metadata\";a:0:{}}', '{\"host\":\"ip-10-50-0-187\"}', '1544676146', '1544676146')
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 859 page no 83111 n bits 320 index cacheKey of table `phabricator_cache`.`cache_markupcache` trx id 963083924 lock_mode X waiting
Record lock, heap no 7 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 30; hex 666565643a363633343333333533343237343636363730313a636f6d6d65; asc feed:6634333534274666701:comme; (total 56 bytes);
 1: len 4; hex 00355d30; asc  5]0;;

*** (2) TRANSACTION:
TRANSACTION 963083925, ACTIVE 0 sec updating or deleting
mysql tables in use 1, locked 1
8 lock struct(s), heap size 1136, 8 row lock(s), undo log entries 2
MySQL thread id 1104401687, OS thread handle 47225325033216, query id 7694145778 10.50.0.138 app update
REPLACE INTO `phabricator_cache`.`cache_markupcache` (`cacheKey`, `cacheData`, `metadata`, `dateCreated`, `dateModified`) VALUES ('feed:6634333534274666701:comment/2575924@17@7VwBk7PApozc', 'a:3:{s:6:\"output\";O:14:\"PhutilSafeHTML\":1:{s:23:\"\0PhutilSafeHTML\0content\";s:73:\"<ul class=\"remarkup-list\">\n<li class=\"remarkup-list-item\">path</li>\n</ul>\";}s:7:\"storage\";a:0:{}s:8:\"metadata\";a:0:{}}', '{\"host\":\"ip-10-50-0-138\"}', '1544676146', '1544676146')
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 859 page no 83111 n bits 320 index cacheKey of table `phabricator_cache`.`cache_markupcache` trx id 963083925 lock_mode X
Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
 0: len 30; hex 666566316138646633383366636133346337633334323661633836656438; asc fef1a8df383fca34c7c3426ac86ed8; (total 80 bytes);
 1: len 4; hex 00350978; asc  5 x;;

Record lock, heap no 7 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 30; hex 666565643a363633343333333533343237343636363730313a636f6d6d65; asc feed:6634333534274666701:comme; (total 56 bytes);
 1: len 4; hex 00355d30; asc  5]0;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 859 page no 83111 n bits 320 index cacheKey of table `phabricator_cache`.`cache_markupcache` trx id 963083925 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 7 PHYSICAL RECORD: n_fields 2; compact format; info bits 32
 0: len 30; hex 666565643a363633343333333533343237343636363730313a636f6d6d65; asc feed:6634333534274666701:comme; (total 56 bytes);
 1: len 4; hex 00355d30; asc  5]0;;

*** WE ROLL BACK TRANSACTION (1)

I don’t believe that this is related to the “unable to connect to MySQL” errors that we are seeing, but it does seem like a bug nonetheless.

Expected Behavior:
There shouldn’t be MySQL deadlocks.

Phabricator Version:
rP237a2a190984cb359f8de62cdf023cfcff8d80e6
rARC45a8d22c74a62624e69f5cd6ce901c9ab2658904
rPHU414a4c6abb1b574f576d72432e8ceb4b9b3cbf91

Reproduction Steps:
I apologise as I don’t have a way to reproduce this since it seems to be a concurrency issue.


#2

Here is a different deadlock:

[13-Dec-2018 14:46:12 Etc/UTC] [2018-12-13 14:46:12] EXCEPTION: (AphrontDeadlockQueryException) #1213: Deadlock found when trying to get lock; try restarting transaction at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:309]
[13-Dec-2018 14:46:12 Etc/UTC] arcanist(head=stable, ref.master=3534d2baca4b, ref.stable=45a8d22c74a6), phabricator(head=stable, ref.master=9bfe558587aa, ref.stable=237a2a190984), phlab(head=master, ref.master=564c60d09ff4), phutil(head=stable, ref.master=35d0ec2dfa59, ref.stable=414a4c6abb1b)
[13-Dec-2018 14:46:12 Etc/UTC]   #0 <#2> AphrontBaseMySQLDatabaseConnection::throwCommonException(integer, string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:351]
[13-Dec-2018 14:46:12 Etc/UTC]   #1 <#2> AphrontBaseMySQLDatabaseConnection::throwQueryCodeException(integer, string) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:292]
[13-Dec-2018 14:46:12 Etc/UTC]   #2 <#2> AphrontBaseMySQLDatabaseConnection::throwQueryException(mysqli) called at [<phutil>/src/aphront/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:188]
[13-Dec-2018 14:46:12 Etc/UTC]   #3 <#2> AphrontBaseMySQLDatabaseConnection::executeQuery(PhutilQueryString) called at [<phutil>/src/xsprintf/queryfx.php:8]
[13-Dec-2018 14:46:12 Etc/UTC]   #4 <#2> queryfx(AphrontMySQLiDatabaseConnection, string, string, array) called at [<phabricator>/src/applications/people/storage/PhabricatorUserCache.php:93]
[13-Dec-2018 14:46:12 Etc/UTC]   #5 <#2> PhabricatorUserCache::writeCaches(array) called at [<phabricator>/src/applications/people/storage/PhabricatorUserCache.php:51]
[13-Dec-2018 14:46:12 Etc/UTC]   #6 <#2> PhabricatorUserCache::writeCache(PhabricatorUserNotificationCountCacheType, string, string, string) called at [<phabricator>/src/applications/people/storage/PhabricatorUser.php:1549]
[13-Dec-2018 14:46:12 Etc/UTC]   #7 <#2> PhabricatorUser::requireCacheData(string) called at [<phabricator>/src/applications/people/storage/PhabricatorUser.php:808]
[13-Dec-2018 14:46:12 Etc/UTC]   #8 <#2> PhabricatorUser::getUnreadNotificationCount() called at [<phabricator>/src/applications/notification/controller/PhabricatorNotificationPanelController.php:9]
[13-Dec-2018 14:46:12 Etc/UTC]   #9 <#2> PhabricatorNotificationPanelController::handleRequest(AphrontRequest) called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:259]
[13-Dec-2018 14:46:12 Etc/UTC]   #10 phlog(AphrontDeadlockQueryException) called at [<phabricator>/src/aphront/handler/PhabricatorAjaxRequestExceptionHandler.php:27]
[13-Dec-2018 14:46:12 Etc/UTC]   #11 PhabricatorAjaxRequestExceptionHandler::handleRequestThrowable(AphrontRequest, AphrontDeadlockQueryException) called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:671]
[13-Dec-2018 14:46:12 Etc/UTC]   #12 AphrontApplicationConfiguration::handleThrowable(AphrontDeadlockQueryException) called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:264]
[13-Dec-2018 14:46:12 Etc/UTC]   #13 AphrontApplicationConfiguration::processRequest(AphrontRequest, PhutilDeferredLog, AphrontPHPHTTPSink, MultimeterControl) called at [<phabricator>/src/aphront/configuration/AphrontApplicationConfiguration.php:182]
[13-Dec-2018 14:46:12 Etc/UTC]   #14 AphrontApplicationConfiguration::runHTTPRequest(AphrontPHPHTTPSink) called at [<phabricator>/webroot/index.php:17]

From SHOW ENGINE INNODB STATUS:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2018-12-13 14:46:12 0x2aef826c3700
*** (1) TRANSACTION:
TRANSACTION 964075079, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s), undo log entries 1
MySQL thread id 1106663655, OS thread handle 47209772553984, query id 7713255515 10.50.0.123 app update
INSERT INTO `user_cache` (userPHID, cacheIndex, cacheKey, cacheData, cacheType)
          VALUES ('PHID-USER-ygcpy2bt4qt2o2v767sm', 'rOGS6Fh944dc', 'user.notification.count.v1', '4', 'notification.count')
          ON DUPLICATE KEY UPDATE
            cacheData = VALUES(cacheData),
            cacheType = VALUES(cacheType)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 2041 page no 41 n bits 248 index key_usercache of table `phabricator_user`.`user_cache` trx id 964075079 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 40 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 30; hex 504849442d555345522d7967637079326274347174326f3276373637736d; asc PHID-USER-ygcpy2bt4qt2o2v767sm;;
 1: len 12; hex 795f465058756a5439335533; asc y_FPXujT93U3;;
 2: len 4; hex 00000040; asc    @;;

*** (2) TRANSACTION:
TRANSACTION 964075078, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 1
MySQL thread id 1106663642, OS thread handle 47208173680384, query id 7713255463 10.50.0.132 app update
INSERT INTO `user_cache` (userPHID, cacheIndex, cacheKey, cacheData, cacheType)
          VALUES ('PHID-USER-ygcpy2bt4qt2o2v767sm', 'rOGS6Fh944dc', 'user.notification.count.v1', '4', 'notification.count')
          ON DUPLICATE KEY UPDATE
            cacheData = VALUES(cacheData),
            cacheType = VALUES(cacheType)
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 2041 page no 41 n bits 248 index key_usercache of table `phabricator_user`.`user_cache` trx id 964075078 lock_mode X locks gap before rec
Record lock, heap no 40 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 30; hex 504849442d555345522d7967637079326274347174326f3276373637736d; asc PHID-USER-ygcpy2bt4qt2o2v767sm;;
 1: len 12; hex 795f465058756a5439335533; asc y_FPXujT93U3;;
 2: len 4; hex 00000040; asc    @;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 2041 page no 41 n bits 248 index key_usercache of table `phabricator_user`.`user_cache` trx id 964075078 lock_mode X locks gap before rec insert intention waiting
Record lock, heap no 40 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 30; hex 504849442d555345522d7967637079326274347174326f3276373637736d; asc PHID-USER-ygcpy2bt4qt2o2v767sm;;
 1: len 12; hex 795f465058756a5439335533; asc y_FPXujT93U3;;
 2: len 4; hex 00000040; asc    @;;

*** WE ROLL BACK TRANSACTION (1)

#3

I asked our MySQL guru about this issue and he said this:

From the docs:

REPLACE is done like an INSERT if there is no collision on a unique key. Otherwise, an exclusive next-key lock is placed on the row to be replaced.

I haven’t seen phab’s code base but REPLACE is a slower version of INSERT ... ON DUPLICATE KEY UPDATE so it’s entirely possible that it holds the lock longer, so in this type of workload, it increases the risk for deadlocks. Note that REPLACE deletes the row and inserts a new one if there’s a match.