-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feature proposal] Per-Repo Linting #421
Comments
After reading through the documentation on the wiki for the colon project there is already an option for setting the base directory to do discovery from. Maybe this can be used for limiting tests to a specific repo. After that, there would have to be some concept of a per-repo test/lint config. I'm not sure where that would fit. Would that be something that would make sense as an extension to colcon? |
You don't need permission from the colcon maintainers to create your own external plugin implementing a That said, as it stands at present, colcon has zero repo-level visibility— the only unit of division it understands is the package, and a package could be a lot of different things: part of a git checkout, a partial checkout from a VCS which supports that such as svn or p4, an extracted tarball, a symlink to a checkout somewhere else, etc. So I think attempting to determine "repo membership" would basically be down to something like starting from each repo's directory and climbing back up the directory hierarchy looking for some kind of marker/config file. As someone who deals regularly with a workspace containing hundreds of packages from a variety of sources (internal and external to my company) at varying levels of lint compliance and overall maintained-ness, I certainly appreciate linting being managed at the individual package level and that's always what I've advocated for (I'm the original |
clang-tidy and many static analysis tools that depend on the compile_commands.json files would disagree slightly. Not that these files are the explicit product of building (like object files) but they are at least a sideafect. As of now, there is no nice way to add a clang-tidy check, even on a per-package level, and only run it when the compilation files exist as the ament_clang_tidy or ament_cmake_clang_tidy test has no way to affect how the project is built. Right now all the litners would be run, and clang-tidy would always fail for missing input. Secondly, one of the big motivations for this is clang-tidy takes a long time to run (~30 sec per compilation unit, of which MoveIt has hundreds). As a result of the lack of knowledge of the repo layer in the build environment, there is no nice way to filter to only packages that have changed or packages that have changed and their dependencies. In the case of clang-tidy the former is sufficient to keep a codebase in compliance. There are other static analysis tools that would also need to run over all dependencies also because they trace execution paths in binaries.
I'm sorry this has been your experience and does seem normal in a product or hasty prototyping, but I would hope that large open-source libraries would not have this practice. Instead, the maintainers would agree at an org or at the least repo level what linters to use and standardize on a set of configurations for them. My current use-case is both, products, and open source libraries. I feel that linters being applied globally is one of the best ways of using CI to enforce standards on all PRs accepted into the project. On the projects we also work on that are not libraries I've fought to have this same standard be applied to any code we build from source. The reason I think linting should be per-repo is that I've never seen code mixed from different sources (and therefore different standards) within a repo. If you need a fork or a source build of something external it normally comes with its own form of source control (or I can just put it in a separate repo). I think your use-case of having varying levels of linting inside a single repo as a result of mixing internally developed tools with external ones in one repo seems odd. Maybe you or others can point to open source examples where this is done? This seems like it'd be error-prone at best and should generally not be encouraged.
Another thing I'd like to point out is that while it is possible for there to be other forms of source control, as far as I'm concerned there is only git (or gerrit which is built on git). This is for our own sanity and those we work with. Any project where we are maintaining the code is either in git or we put it in git, or package it in a debian and put that in a package repo. We always strive to contribute upstream if possible when we make improvements. As to making this something that is more generic I would leave that to those who care for other forms of source control and hopefully answering the question of "which source files have changed in this sub-directory since <branch, tag, hash, whatever string key>?" is generic enough they can figure out how to suggest changes to work with their preferred source control. I'd be happy to return an error if given a path that didn't have a .git folder as a child directory.
I did attempt this as new parameters in ament_clang_tidy in a PR because I thought it was the right place to implement that (it is a standalone script that crawls the build and source directory executing clang-tidy). Every other linter can be used as is as we'll just have a target workspace in ci with only what we want to test and those are all much faster. I was told that was not the right place for it as nothing in ament should depend on package discovery which is a feature I needed for my approach. I started with the PR to ament_lint and then came here because I am frustrated with the federated nature of all the ci systems used by large (including ours) ros libraries and would like to work towards collaboration and standards in linting and testing in general to reduce the maintenance burden of CI systems for everyone (checkout moveit_ci if you want to see how we are working now). As it stands there are many projects with varying amounts of quality in their testing that are all struggling with similar problems. Instead of trying to fix these problems in our own little bubble for ourselves, I'd like to make this easier for others as well, and at the same time learn from others. The intent of putting this here was to get input from others as to how to go about implementing this and to propose that the default approach to linting should maybe be per-repo by default. The current only option (imho) results in a bunch of copy-pasted code and just plain doesn't work for clang-tidy on large projects. Are you using ament_clang_tidy in your 600 package project? Do you have 600 source packages in a single repo from many different original sources? If so, do you think that has been a good experience, would you recommend it to others as good practice? I think moveit itself has gotten too big for one repo and we have maybe two dozen packages. I don't like the CI and test times of our project... I could only imagine what running linters over that project would be like. Thank you for responding to the issue. I look forward to hearing your opinion further. |
As mentioned before In the current design One of the design philosophies of In ROS 2 specific linters provide their own executable which can be used to lint arbitrary file. Unfortunately there is no single executable which triggers all available linters - optimally depending on some package-level configuration only the ones which should be applied. I would think extending this subsystem to satisfy your use case would be a better direction rather than implementing a custom |
I'm sorry I don't understand what you mean by this? Right now I'm able to construct the commands to run the linter clang-tidy in ament_clang_tidy on any arbitrary package within the workspace. The problem is that to do that I depend on package discovery and that is something explicitly outside the available dependencies for ament (I implemented it using a script from catkin, but I could have done the same thing with colcon list). I'm not understanding why implementing something in a colcon extension (in this case constructing the commands to run specific linters) would prevent a user from calling those same commands outside of colcon? I can invoke cmake even though colcon can do it for me.
This doesn't apply to clang-tidy (the reason I'm here) because it works per compilation unit and depends on the compilation_commands.json files being exported per-package.
I am specifically trying to filter which linters are run based on what files have changed since some arbitrary point in the past. This is because running clang-tidy over all the compilation units in moveit takes hours on CI systems. Because of the execution time of doing this we wouldn't be able to just "always run clang-tidy on all packages" as a fact of configuration.
What is this subsystem? Are you referring to ament_lint? To be clearer what I'm hoping for is something like this:
Or maybe an extension on the way colcon test works so there is a new type (not just ctest and pytest) that is based on repo level lint config files:
One of the problems I see with what I'm asking is I would be introducing a dependency from one argument on another. For --packages-changed-since to work I need a --base-paths argument that points into some repo so I can discover the paths in that repo and the base repo path (there is a git command that'll do that or I can search for a .git directory). Once I have that I can find the root source directory by parsing the cmakecache for that package. With the base source directory, I can then ask git for a list of source files that have changed (filtering for add and modify diff types) since some branch/tag/hash. Using that I can filter to just run the linter over packages that have changed in the given PR. Your reaction to this complex behavior is probably that it is niche and specific to the requirements of running in CI. Ideally ci just verifies what the submitter already knows (that their change will pass). For that to be true I want people to be able to easily run linters locally. In the case of clang-tidy if it were to do the nieve thing and lint all the packages every time no one would use it. I'm not waiting around for an hour with my computer tied up after changing a couple of files to see if I pass the linter. Because of this, I need this sort of filtering for users also to run locally. Otherwise, people will continue to use CI as a way to test if their PR passes clang-tidy which is at the least doubling the amount CI runs. This is also bad because CI runs take over an hour as it is with this logic implemented in bash. I'm working to reduce this CI time separately but reducing the number of times CI runs by enabling users to run the same tests easily locally would be a major improvement for our workflow. |
To solve my immediate problem and better explain the use-case I'm looking for I created this package: https://github.com/tylerjw/moveit_ci_tools and this PR to ament_lint: ament/ament_lint#287 Together what I want for
I'd appreciate more advice on how I could structure these changes as first-class features in colcon or ament. I am content to let my desire for per-repo lint config go as it seems like more trouble than it is worth. However, some sort of cli way to run ament_lint tests over specific packages would be really nice. To do this I propose adding a --packages-select command to all the ament_lint scripts. Thoughts? |
Something to clarify,
This usage looks reasonable. I'm not familiar with the inner workings of |
If a feature like linting is implemented in That is why linting is and imo should be implemented as a test in each package. Independent of what a user uses they can always run that test shipped by the package.
And I think
If an upstream package is only configured but not built and installed it would be impossible to configure any downstream packages because the dependencies aren't available. That is why |
|
Featrure proposal for Per-Repo Linting
Linters are generally applied at the repo (or even standardized across orgs) and are not package-specific. The way it is now if you have many packages in a single repo to have standardized linting you need to specify the set of linters you want to use in every package.xml and CMakeLists.txt by copy-pasting them. This seems to me that the lint tests could be declared at a more optimal level of abstraction that would enable some advanced features.
I would like there to be a way to define which linters apply to all packages within a given repo. My idea is something like a
.lint.yaml
file in the root of the repo that would contain things like this:Then when someone runs
colcon test
it runs the linters per-repo. The way it exists now (afaik) the linters are run through ctest per-package via cmake args andcolcon test
is basically just a way to call ctest recursively per-package.This would enable you to have per-repo configurations for the linters which would reduce copy-pasted configurations, increase standardization across the repo, and enable features that operate on a per-repo level. The specific feature I want is to limit linting to packages with source changes since some point in the git history. By doing linting per-repo we would be able to have linting logic (including filtering) per-repo.
Background
I want to be able to use ament lint tests on a ros1 project built with colon. I also want to add features to the executing of those tests that are per-repo. Specifically, I want to only run clang-tidy on packages that have source file changes since a specified branch. I starting implementing these as features to ament_clang_tidy here: ament/ament_lint#280. There I was told that this should exist at a higher level of abstraction than the ament_lint script and should instead be done at or above colcon. One of the motivations for this change is to make it simple for developers to run the linters locally before submitting PRs. The current design of
colcon test
includes running the linters however it is done per-package instead of per-repo which does not adding per-repo linting logic.Other options
A new verb to colcon that handles per-repo execution of linters. Maybe
colcon lint
? This has some upsides as maybe it could be developed outside of colcon itself and installed as an extension or plugin of sorts. While this makes some parts of developing this feature easier it would add yet another disjointed way to run part of the tests.A script that executes colcon/git/ament commands. This is what I think @tfoote was suggesting I do. This is sort of what we have already with moveit_ci for clang-tidy. We have a bash script that depends on variables in travis to execute the clang-tidy tests as a step in CI. This script can't be easily used locally to test and is not easily reused for other projects.
The text was updated successfully, but these errors were encountered: