Allowing arc lint to run without some of the linters

Hello,

In our project we have a need to be able to run “arc lint” without some of the linters mentioned in .arclint. So, sometimes we do need to run all of the linters, and sometimes not. We expected the arc lint to have an optional argument, something like “arc lint --donotrunlinter myspecificlinter”. But, unfortunately, we did not find something like this supported. However, this seems like something really needed and natural to ask for. So we wanted to ask if it is possible to support this?

To temporary workaround the problem we have considered several options. We ended up creating a custom lint engine. We wanted this custom lint engine to be almost completely identical to ArcanistConfigurationDrivenLintEngine, with the only difference of pointing to another file with linters list, .arclint2. This suggested the following:
a. Currently ArcanistConfigurationDrivenLintEngine class is declared as final, so we could not inherit from it, but only from ArcanistLintEngine.
b. Secondly, if you agree with the bullet a. and make the ArcanistConfigurationDrivenLintEngine not final, it would be beneficial to have some short functions, for example get_dotarclint_file(), to allow the user to specify another .arclint file, while overwriting the function.
c. Continuing b., it might also be beneficial to have such an optional argument for the arc lint command. Something like “arc lint --dotarclint mydotarclintfile.

Currently, what we ended up doing is just copying ArcanistConfigurationDrivenLintEngine code to another file, renaming the class name and then making a couple of changes to replace ‘.arclint’ with our specific ‘.arclint2’. Which works, but it is not a proper software solution.

We would appreciate your response.
Best wishes.
Dmitry

IMPORTANT

  • Bug reports MUST include reproduction instructions which allow someone who does not have access to your environment to reproduce the issue you’re encountering.
  • Bug reports MUST be against a recent version of Phabricator, and include version information. You can find version information in “Config > Version Information” in the web UI, or arc version from the CLI.

Reproduction Instructions
Complete steps which allow someone else who does not have access to your environment to reproduce the bug.

Phabricator/Arcanist Version
Output from Config > Version Information or arc version.
$ arc version
arcanist cc850163f30c4697e925df0d6212469679600a2c (19 Nov 2019)
libphutil 39ed96cd818aae761ec92613a9ba0800824d0ab0 (30 Sep 2019)

What’s the use-case? Why do you sometimes need to run only part of the linters?

What’s wrong with running the TODO linter every time?

Sorry, Aviv. It was sent by a mistake before I finished to type the message :slight_smile:. Here is the full message.

Hi Aviv,

Thanks for the response.

An example to this, is our TODOs linter. In the code, when an engineer cannot finish some stuff properly, there is a possibility to leave a “TODO” comment. So, we are able to track unfinished work and act accordingly before the end of the project.

When submitting a differential we would like the TODO linter to run on the changed code to identify potential issues and signal “autofix”es and/or “advice”s.

After the arc diff is submitted, we have a Continuous Integration (CI) flow in place, which runs many tests, checks and reports to ensure that the diff did not break anything. One of the CI tasks is to run all of the linters on all of the files. We execute “arc lint –everything”. We have to avoid running the TODOs linter in this case, because it will run on all of the files (and there are ~10K files), will take very long time (which will eventually result in a timeout failure) and not accomplish anything, as it does not issue errors or warnings (just “autofix”es and “advice”s).

Hope it makes it a bit clearer.

Thank you.
Dmitry

I just ran grep TODO on 9000 files, and it took 15 seconds, so I’m not sure why your linter takes a “very long time”. Is this a real example or an artificial one?
I would expect the run-time of any linter to be very small relative to build time and the run-time of tests, which would be running in the CI phase.

Hi Aviv,

It is not just grepping.
Our TODO linter is a script and regex linter.
It does several things, among which are

  1. git grep on suspicious
  2. git blame through gitpython to identify the author of the todo.

The script takes a few second to run on each file. I have run “time” on running the todo script on one of files, and it took ~2.5-3 seconds to execute. Multiply it by 10K files and you get quite a long time, even if you allow some parallel execution.
But even beyond that, does it make sense to launch 10K jobs (or even 1 job) which do not contribute anything? We would really want to avoid this.

If you are against “arc lint --donotrunlinter myspecificlinter”, would you at least consider what I have written in the second part, while describing our workaround, and the problems we have encountered there, quoting:

To temporary workaround the problem we have considered several options. We ended up creating a custom lint engine. We wanted this custom lint engine to be almost completely identical to ArcanistConfigurationDrivenLintEngine, with the only difference of pointing to another file with linters list, .arclint2. This suggested the following:
a. Currently ArcanistConfigurationDrivenLintEngine class is declared as final, so we could not inherit from it, but only from ArcanistLintEngine.
b. Secondly, if you agree with the bullet a. and make the ArcanistConfigurationDrivenLintEngine not final, it would be beneficial to have some short functions, for example get_dotarclint_file(), to allow the user to specify another .arclint file, while overwriting the function.
c. Continuing b., it might also be beneficial to have such an optional argument for the arc lint command. Something like “arc lint --dotarclint mydotarclintfile.

Currently, what we ended up doing is just copying ArcanistConfigurationDrivenLintEngine code to another file, renaming the class name and then making a couple of changes to replace ‘.arclint’ with our specific ‘.arclint2’.

Thank you.
Dmitry

Your problems seems strange to me (I don’t see why lint should find the author, this can be performed later if it’s important), so I don’t see a use-case for the feature you request.

Having said that, I don’t see anything wrong with your workaround; I’d probably solve this by modifying .arclint in the ci station just before running arc lint.

Thanks Aviv for your reply.
We need the author because of the syntax requirement of the TODO message format. I did not go into specifics here, but the author is part of the autofix to make sure that the TODO has an owner.
Our workaround works, but is wrong in a sense that we just copied the content of your ArcanistConfigurationDrivenLintEngine into another class to make small changes instead of using inheritance.
What is the reason behind making ArcanistConfigurationDrivenLintEngine final? Could you please look again into a and b in the previous message?
We cannot amend the .arclint, as our project made a conscious decision not to auto-generate code (I do not have exact details why, can figure this out), and there are checks in place to avoid this.

I have some plans to introduce lint and unit “phases” (this may not be the final name) that will probably support this kind of thing.

The primary use case I have in mind is stuff like a “test” that spends an hour doing static analysis and isn’t ever appropriate to run on the client, but there are enough sort-of-plausible use cases like this one that I expect to just generalize it to the full lint and unit stages (and the new “beautifier” stage or whatever comes out of https://secure.phabricator.com/T8971).

If/when this exists, you will likely be able run arc lint --phase server (or whatever you call the phase) on the server to select appropriate linters. But I don’t have a timeline for this or a clearer picture of what it will look like.

Thank you Evan.
I will follow the T8971.
Looking forward to the outcome.
Dmitry