PhutilURI fails to parse URIs with an underscore in the subdomain part

I tried creating a Repository in Diffusion and adding a URI that contains an underscore to it. This fails with the error: The URI protocol is unrecognized. It should begin with "ssh://", "http://", "https://", "git://", "svn://", "svn+ssh://", or be in the form "git@domain.com:path".

The URI I tried to add is https://svn_server.example.com/.

I then went into the source code and found out that PhutilURI::getProtocol() silently fails here, returning an empty string for any URI that contains an underscore in the subdomain part.

Are there any quick workarounds here, please?

Reproduction Instructions

  1. Create a Repository in Diffusion
  2. Add a URI for the repository like https://svn_server.example.com/

Phabricator/Arcanist Version

phabricator: 3c432225251303558de0dae903656d6159f2bf47 (Sat, Jul 6)
arcanist: d92fa96366c0ed50e4257508148aa75192d4fb1f (Fri, Jun 21)
phutil: b416093386a225b1d9a2de906899b94cbf4babcb (Tue, Jul 9)
php: 5.4.16

Are there any quick workarounds here, please?

You can:

  • use a domain without an underscore; or
  • modify Phabricator to parse these unusual domains; or
  • use different software which supports these domains natively if support for unusual domains is critical in your use case.

What makes you say they are unusual? Underscores may be unusual in a registered domain name, but not in a subdomain. In any case, according to RFC 3986, an underscore is a perfectly valid character in the authority component of a URI.
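
To make the distinction concrete, here is a minimal Python sketch of the two rules in play. It is not PhutilURI's actual code, and the names is_rfc3986_reg_name and is_strict_hostname are made up for illustration. The first pattern follows the reg-name grammar from RFC 3986 section 3.2.2; the second is an assumed "letters, digits, hyphens" hostname rule in the style of RFC 952 / RFC 1123, which is the kind of check that would reject this host.

```python
import re

# RFC 3986, section 3.2.2:
#   reg-name   = *( unreserved / pct-encoded / sub-delims )
#   unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
REG_NAME = re.compile(
    r"(?:[A-Za-z0-9._~-]|%[0-9A-Fa-f]{2}|[!$&'()*+,;=])*"
)

# Assumed stricter rule (RFC 952 / RFC 1123 style): each dot-separated
# label holds only letters, digits, and hyphens, and does not start or
# end with a hyphen.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
HOSTNAME = re.compile(rf"{LABEL}(?:\.{LABEL})*")


def is_rfc3986_reg_name(host: str) -> bool:
    """True if host fits the RFC 3986 reg-name grammar."""
    return REG_NAME.fullmatch(host) is not None


def is_strict_hostname(host: str) -> bool:
    """True if host fits the stricter letters-digits-hyphens rule."""
    return HOSTNAME.fullmatch(host) is not None


if __name__ == "__main__":
    for host in ("svn_server.example.com", "svn-server.example.com"):
        print(host, is_rfc3986_reg_name(host), is_strict_hostname(host))
```

Under these assumptions, svn_server.example.com passes the URI grammar but fails the stricter hostname rule, while svn-server.example.com passes both.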

Could you please consider a patch on this matter?

What makes you say they are unusual?

I’d say these are unusual because I can’t remember ever encountering a normal service hosted on a domain with an underscore. I’ve used the internet for a long time, so I’d expect to have run into some of these domains by now if they were in common use.

I also can’t remember cases where other users have run into this kind of issue and made similar reports. If these sorts of domains were in common use, I’d expect we’d have received multiple reports earlier in the life of the software.

Offhand, the only situation I can recall where I’ve encountered an underscore in a domain is in the context of MX/SPF/DKIM metadata and only present as a prefix.

This question is frustrating because it feels disrespectful. I think these domains are very obviously unusual, and asking why I believe they’re unusual is asking me to spend time explaining something you already know.

In particular, it feels like your point of view is that you’re entitled to this change unless I can prove it’s a bad idea, and that the onus is on me to build a strong argument against it. Perhaps other software projects work like this, but Phabricator does not. The onus is on you to build a good case for changes you’d like to see in the upstream.

Could you please consider a patch on this matter?

The case for this change is very poor, so I don’t currently plan to make changes here.

Well I did not realize writing here is like writing a letter to the Supreme Court of the United States. I was just stating the issue I encountered while trying to set up Phabricator for use at my workplace.

As for disrespectful… really? Since the URI is valid by the standard, why would you refuse to support it so vehemently? I thought we all live and code by some standards in the world of computers.