C++ symbol indexing

tl;dr - Is the code change below a reasonable fix for T11135?

I just stumbled across the symbol indexing feature of Phabricator and thought it was pretty rad. I added a bunch of C++ symbols to my database using the provided scripts and found out quickly that the feature didn’t actually work. A bit of investigation into why led me to ⚓ T11135 Ctrl-click symbol lookup language in Diffusion ignores `syntax.filemap`. My project uses .cc for the C++ extension, rendering all of my ‘cpp’ language symbols in the database useless. A simple tweak to the generate_ctags_symbols.php script can fix that part by adding them as cc symbols instead of cpp, but the header files still won’t work (and there’s a few .cpp files in another part of the project that I’ll be indexing soon).

Given that ⚓ T13047 Plans: Symbol Indexes is out there in the future plans, I decided to take a stab at figuring out what makes it work in Differential and port those smarts to Diffusion. It turned out to be so simple that I thought I’d post it here for your consideration:

    diff --git a/src/applications/diffusion/document/DiffusionDocumentRenderingEngine.php b/src/applications/diffusion/document/DiffusionDocumentRenderingEngine.php
index 800d195726..ddbed58c2f 100644
--- a/src/applications/diffusion/document/DiffusionDocumentRenderingEngine.php
+++ b/src/applications/diffusion/document/DiffusionDocumentRenderingEngine.php
@@ -99,7 +99,8 @@ final class DiffusionDocumentRenderingEngine
     $symbol_repos = nonempty($repo->getSymbolSources(), array());
     $symbol_repos[] = $repo->getPHID();

-    $lang = last(explode('.', $drequest->getPath()));
+    $engine = PhabricatorSyntaxHighlighter::newEngine();
+    $lang = $engine->getLanguageFromFilename($drequest->getPath());

     return array(
       'repositories' => $symbol_repos,

This change, combined with mappings added to syntax.filemap, has the appearance of working. I plan to keep this change running locally, but I’m no expert by far, so maybe creating a syntax highlighter engine here is an awful thing to do. If it is, please let me know so I can revert it. If it looks good, maybe you want to incorporate it upstream. For C++ projects, it’s a nice improvement to the product for such a small code change, especially for an area that may not see the larger refactor for quite a while.

I think this is fine (and ridiculously simple :slight_smile: )
It’s a bit of a hack, because we want symbols, not highlighting, but that’s because the existing symbol mechanism is already tied to highlighting.

(1) doesn’t the Diffusion view already has a highlighting engine somewhere near by?
(2) Does this throw/log anything about directories/filenames with no . / images?

(1) It seems like it must, but after doing a bunch of grepping around in Diffusion, I’m not seeing it.
(2) I see no issues viewing images or files without extensions. Nothing is logged to the apache error log. Is there any other log location that I should check?

No, that’s pretty much it. Probably means it’s fine :slight_smile: