Importing a Subversion repository that includes entries with no author causes Phacility Phabricator to throw an exception

I am trying to import a Subversion repository into Phacility Phabricator. The first two entries in it look like this:

$ svn --non-interactive --no-auth-cache --username ***** --password ***** log --xml --limit 1 svn://svn.*****.com/*****@1
<?xml version="1.0" encoding="UTF-8"?>
<log>
<logentry
   revision="1">
<date>2007-06-15T19:42:24.000000Z</date>
<msg>Standard project directories initialized by cvs2svn.</msg>
</logentry>
</log>

$ svn --non-interactive --no-auth-cache --username ***** --password ***** log --xml --limit 1 svn://svn.*****.com/*****@2
<?xml version="1.0" encoding="UTF-8"?>
<log>
<logentry
   revision="2">
<author>danchris</author>
<date>2007-06-15T19:42:24.000000Z</date>
<msg>Initial checkin
</msg>
</logentry>
</log>


Notice that the first entry has no author field. When Phabricator parses that commit, it tries to insert a NULL identityNameRaw into repository_identity and throws an exception, as shown:

$ bin/repository reparse --importing --trace R2:1
>>> [3] (+0) <connect> phabricator_repository
<<< [3] (+3) <connect> 3,584 us
>>> [4] (+18) <query> SELECT `r`.* FROM `repository` `r` WHERE ((r.id IN (2))) ORDER BY `r`.`id` DESC 
<<< [4] (+18) <query> 446 us
>>> [5] (+19) <query> SELECT `commit`.* FROM `repository_commit` `commit` WHERE (((commit.repositoryID = 2 AND commit.commitIdentifier = '1'))) ORDER BY `commit`.`id` DESC 
<<< [5] (+19) <query> 230 us
>>> [6] (+20) <query> SELECT `r`.* FROM `repository` `r` WHERE (r.id IN (2)) ORDER BY `r`.`id` DESC 
<<< [6] (+20) <query> 192 us
>>> [7] (+25) <query> SELECT `commit`.* FROM `repository_commit` `commit` WHERE (commit.id IN (11)) ORDER BY `commit`.`id` DESC 
<<< [7] (+26) <query> 463 us
>>> [8] (+26) <query> SELECT `r`.* FROM `repository` `r` WHERE (r.id IN (2)) ORDER BY `r`.`id` DESC 
<<< [8] (+26) <query> 194 us
>>> [9] (+55) <conduit> diffusion.querycommits()
>>> [10] (+56) <query> SELECT `r`.* FROM `repository` `r` WHERE (r.phid IN ('PHID-REPO-zyvqs3tep7tgkpds5m6i')) ORDER BY `r`.`id` DESC 
<<< [10] (+56) <query> 424 us
>>> [11] (+58) <query> SELECT `commit`.* FROM `repository_commit` `commit` WHERE ((commit.phid IN ('PHID-CMIT-fkhvsby623aqpx7x2fxy')) AND (commit.repositoryID IN (2))) ORDER BY `commit`.`id` DESC LIMIT 101
<<< [11] (+58) <query> 337 us
>>> [12] (+58) <query> SELECT `r`.* FROM `repository` `r` WHERE (r.id IN (2)) ORDER BY `r`.`id` DESC 
<<< [12] (+59) <query> 197 us
>>> [13] (+59) <query> SELECT * FROM `phabricator_repository`.`repository_commitdata` WHERE commitID in (11) 
<<< [13] (+59) <query> 199 us
>>> [14] (+62) <connect> phabricator_passphrase
<<< [14] (+62) <connect> 499 us
>>> [15] (+63) <query> SELECT `c`.* FROM `passphrase_credential` `c` WHERE (c.phid IN ('PHID-CDTL-fndc45bgj4uqyhuay4iz')) ORDER BY `c`.`id` DESC 
<<< [15] (+63) <query> 164 us
>>> [16] (+63) <query> SELECT * FROM `phabricator_passphrase`.`passphrase_secret` WHERE id IN (1) 
<<< [16] (+63) <query> 144 us
>>> [17] (+64) <exec> $ svn --non-interactive --no-auth-cache --username '********' --password '********' log --xml --limit 1 svn://svn.*****.com/*****@1
<<< [17] (+276) <exec> 212,198 us
<<< [9] (+277) <conduit> 222,064 us
>>> [18] (+279) <query> SELECT `identity`.* FROM `repository_identity` `identity` WHERE (identity.identityNameHash IN ('qVzKuHbdOl_L')) ORDER BY `identity`.`id` DESC 
<<< [18] (+279) <query> 288 us
>>> [19] (+282) <connect> phabricator_repository
<<< [19] (+283) <connect> 633 us
>>> [20] (+283) <query> INSERT INTO `phabricator_repository`.`repository_identity` (`authorPHID`, `identityNameHash`, `identityNameRaw`, `identityNameEncoding`, `automaticGuessedUserPHID`, `manuallySetUserPHID`, `currentEffectiveUserPHID`, `emailAddress`, `phid`, `dateCreated`, `dateModified`) VALUES ('PHID-CMIT-fkhvsby623aqpx7x2fxy', 'qVzKuHbdOl_L', NULL, 'utf8', NULL, NULL, NULL, '', 'PHID-RIDT-6ceyxigtrhqxu4q2azof', '1589831743', '1589831743')
<<< [20] (+284) <query> 313 us
[2020-05-18 19:55:43] EXCEPTION: (AphrontQueryException) #1048: Column 'identityNameRaw' cannot be null at [<phabricator>/src/infrastructure/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:386]
arcanist(head=master, ref.master=e3030ebcad53), phabricator(head=master, ref.master=f86d822a37ea)
  #0 AphrontBaseMySQLDatabaseConnection::throwQueryCodeException called at [<phabricator>/src/infrastructure/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:320]
  #1 AphrontBaseMySQLDatabaseConnection::throwQueryException called at [<phabricator>/src/infrastructure/storage/connection/mysql/AphrontBaseMySQLDatabaseConnection.php:216]
  #2 AphrontBaseMySQLDatabaseConnection::executeQuery called at [<phabricator>/src/infrastructure/storage/xsprintf/queryfx.php:8]
  #3 queryfx called at [<phabricator>/src/infrastructure/storage/connection/AphrontDatabaseConnection.php:58]
  #4 AphrontDatabaseConnection::query called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:1103]
  #5 LiskDAO::insertRecordIntoDatabase called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:939]
  #6 LiskDAO::insert called at [<phabricator>/src/infrastructure/storage/lisk/LiskDAO.php:908]
  #7 LiskDAO::save called at [<phabricator>/src/applications/repository/storage/PhabricatorRepositoryIdentity.php:130]
  #8 PhabricatorRepositoryIdentity::save called at [<phabricator>/src/applications/diffusion/identity/DiffusionRepositoryIdentityEngine.php:106]
  #9 DiffusionRepositoryIdentityEngine::updateIdentity called at [<phabricator>/src/applications/diffusion/identity/DiffusionRepositoryIdentityEngine.php:48]
  #10 DiffusionRepositoryIdentityEngine::newResolvedIdentity called at [<phabricator>/src/applications/repository/worker/commitmessageparser/PhabricatorRepositoryCommitMessageParserWorker.php:76]
  #11 PhabricatorRepositoryCommitMessageParserWorker::updateCommitData called at [<phabricator>/src/applications/repository/worker/commitmessageparser/PhabricatorRepositoryCommitMessageParserWorker.php:42]
  #12 PhabricatorRepositoryCommitMessageParserWorker::parseCommit called at [<phabricator>/src/applications/repository/worker/PhabricatorRepositoryCommitParserWorker.php:51]
  #13 PhabricatorRepositoryCommitParserWorker::doWork called at [<phabricator>/src/infrastructure/daemon/workers/PhabricatorWorker.php:124]
  #14 PhabricatorWorker::executeTask called at [<phabricator>/src/infrastructure/daemon/workers/PhabricatorWorker.php:163]
  #15 PhabricatorWorker::scheduleTask called at [<phabricator>/src/applications/repository/management/PhabricatorRepositoryManagementReparseWorkflow.php:260]
  #16 PhabricatorRepositoryManagementReparseWorkflow::execute called at [<arcanist>/src/parser/argument/PhutilArgumentParser.php:492]
  #17 PhutilArgumentParser::parseWorkflowsFull called at [<arcanist>/src/parser/argument/PhutilArgumentParser.php:377]
  #18 PhutilArgumentParser::parseWorkflows called at [<phabricator>/scripts/repository/manage_repositories.php:22]
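
To find out which revisions are affected before importing, the XML that `svn log --xml` prints can be scanned for entries with no author. The following is a minimal sketch, assuming the log output has already been captured as text or bytes. The function name `revisions_without_author` is made up for illustration and is not part of Subversion or Phabricator.

```python
import xml.etree.ElementTree as ET


def revisions_without_author(log_xml):
    """Return the revision numbers of <logentry> elements lacking an author.

    log_xml is the captured output of `svn log --xml` (str or bytes).
    Hypothetical helper, for illustration only.
    """
    root = ET.fromstring(log_xml)
    missing = []
    for entry in root.iter("logentry"):
        # findtext returns None when the <author> element is absent.
        author = entry.findtext("author")
        if author is None or not author.strip():
            missing.append(entry.get("revision"))
    return missing
```

Run against the two entries shown above, this should report revision 1 only, since revision 2 carries an author.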

Reproduction Instructions
I can think of three ways to reproduce this problem:
1: Find or make a Subversion repository that has entries that do not include an author.
2: Convert a CVS repository to Subversion. This approach depends on whether or not the current version of cvs2svn still creates Subversion entries that do not include an author.
3: Put a wrapper in front of the svn command-line utility that removes the author field from its XML output.
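
For the third option, the filtering step could look roughly like the sketch below. It only covers removing the author element from captured `svn log --xml` output. How a wrapper would be wired in front of the real svn binary is not shown and would depend on the installation. The function name `strip_authors` is hypothetical.

```python
import xml.etree.ElementTree as ET


def strip_authors(log_xml):
    """Return the svn XML log with every <author> element removed.

    log_xml is the captured output of `svn log --xml` (str or bytes).
    Hypothetical helper, for illustration only.
    """
    root = ET.fromstring(log_xml)
    # Materialize the entries first so the tree is not modified
    # while it is being traversed.
    for entry in list(root.iter("logentry")):
        for author in entry.findall("author"):
            entry.remove(author)
    return ET.tostring(root, encoding="unicode")
```

The result keeps the revision, date, and message of each entry, so it resembles the authorless revision 1 shown at the top of this report.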

Phabricator/Arcanist Version
phabricator f86d822a37ea Mon, May 18
arcanist e3030ebcad53 Fri, May 15
php 7.4.3 apache2handler
diff 3.7 /usr/bin/diff
git 2.25.1 /usr/bin/git
hg Not Available
pygmentize 2.3.1 /usr/bin/pygmentize
svn 1.13.0 /usr/bin/svn

This may have been fixed by https://secure.phabricator.com/D21266, although I didn’t aim that change at Subversion and haven’t tested Subversion.

Thanks for fixing this. I updated to the latest version and was able to completely import my Subversion based repository.

Great, thanks for confirming the fix!