Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Insufficient functionality for ignoring length of specific body lines #255

Open
scop opened this issue Dec 23, 2021 · 8 comments
Open

Insufficient functionality for ignoring length of specific body lines #255

scop opened this issue Dec 23, 2021 · 8 comments
Labels
enhancement User-facing feature enhancements

Comments

@scop
Copy link
Contributor

scop commented Dec 23, 2021

In a nutshell, my use case is avoiding triggering line too long errors for certain lines in the commit body. I can accomplish that, but not in a way that it would not cause other problems. Some simplified examples follow, using lines starting with https:// as the example one I'd like to exempt from line length checks.

Try 1: ignore-body-lines

$ cat test-msg 
Title goes here

More info:
https://github.com/jorisroovers/gitlint/commit/3c017995633f602125f7caaa91e96bb767ca5953
$ cat .gitlint 
[ignore-body-lines]
regex = ^https://
$ cat test-msg | gitlint
3: B5 Body message is too short (10<20): "More info:"

This is an instance of the problem documented with ignore-body-lines (it mentions line number confusion only though which is much less problematic than resulting in the B5 false positive IMHO). Anyway, for this reason, ignore-body-lines does not fit the bill.

Try 2: ignore-body

$ cat test-msg 
Title goes here

More info:
https://github.com/jorisroovers/gitlint/commit/3c017995633f602125f7caaa91e96bb767ca5953

This is a very very very very very very very very very very very very very very very very long line whose line I do _not_ want to ignore.
$ cat .gitlint
[ignore-by-body]
regex = ^https://
ignore = body-max-line-length,body-min-length
$ cat test-msg | gitlint
# (no output)

To overcome the problem with ignore-body-lines line matches being entirely ignored and causing at least the B5 false positive, here's an approach using ignore-by-body with body-min-length in ignore. This sidesteps that problem, but causes another one: body-max-line-length must be in ignore in order for the line I want to be exempt for that check, but it causes line length checks to be skipped for all lines in the body. I'd like the "This is a very very ..." line to be flagged as too long (but I can see why it is currently not).

Did I miss something?

I'm wondering if I missed something that could be used to accomplish what I'm looking for without side effects.

Appendix

Aside, perhaps body lines without whitespace should be exempt for the line length checks by default. This would not help with my problem completely as it has a few more variables, but would make sense I think.

@scop
Copy link
Contributor Author

scop commented Dec 23, 2021

(The commit URL used as an example above does not have anything to do with the actual problem, it's just a random one I grabbed from this project for illustration purposes.)

@sigmavirus24
Copy link
Collaborator

I think a better example of what you're trying to show is:

> cat test-msg 
Title goes here

More info:
https://github.com/jorisroovers/gitlint/commit/3c017995633f602125f7caaa91e96bb767ca5953

This is a very very very very very very very very very very very very very very very very long line whose line I do _not_ want to ignore.

> cat .gitlint
[ignore-by-body]
regex = ^https://
ignore = body-min-length

> cat test-msg | gitlint  
4: B1 Line exceeds max length (87>80): "https://github.com/jorisroovers/gitlint/commit/3c017995633f602125f7caaa91e96bb767ca5953"
6: B1 Line exceeds max length (137>80): "This is a very very very very very very very very very very very very very very very very long line whose line I do _not_ want to ignore."

Because you're not explicitly ignoring that last line with ignore=body-max-length.

I suspect the idealized config that would work here would be:

> cat .gitlint
[body-max-line-length]
line-length=120
ignore-regex=^https?://

That said, I did find that this will give you what you want:

[ignore-body-lines]
regex = ^https?://

[ignore-by-body]
regex = ^More info:$
ignore = body-min-length

@scop
Copy link
Contributor Author

scop commented Dec 23, 2021

I think a better example of what you're trying to show is: [...]

Maybe, maybe not :) My example illustrates the problem of all long lines being ignored, yours illustrates none of them being ignored, with slightly different configs. But we are talking about the same thing, that's what's important here.

That said, I did find that this will give you what you want:

Sure, for that specific, simplified example. (It would fail to flag a commit whose entire body is More info: (without an URL etc) as invalid though.) But perhaps I should just give out the complete case, through the gitlint config in place, it's not that much more complex:

[general]
ignore = body-is-missing
ignore-fixup-commits = false
ignore-revert-commits = false
ignore-squash-commits = false
ignore-stdin = true
contrib = contrib-title-conventional-commits

In addition to the above base config, I'd like to ignore line lengths for lines starting with:

  • Co-authored-by:, and
  • an HTTP or HTTPS URL, possibly preceded by Refs or See

...without causing side effects hitting or missing other rules for other lines (I'm fine with ignoring everything on those long lines, if ignoring only the length isn't available). So in regex terms:

^(Co-authored-by:|((Refs|See) )?https?://\S+$)

Enumerating what are the accepted too short bodies to be ignored (like "More info:") as those special cases is not possible/feasible.

@jorisroovers
Copy link
Owner

jorisroovers commented Feb 10, 2022

Ok, let me see if I understand this correctly.

You’d like to have something like this:

# For matching lines, ignore ONLY the specified rules, but apply everything else
[ignore-body-lines]
regex=^Co-Authored-By
ignore=body-max-line-length

# When not specifiying the 'ignore' option, we're really just saying ignore all rules
[ignore-body-lines]
regex=^Co-Authored-By
ignore=all

# This has the same behavior as setting ignore=all
[ignore-body-lines]
regex=^Co-Authored-By

I can see this would be useful - it would require code changes to change how gitlint applies and implements the ignore rules though. @scop Please confirm or clarify :-)


Proposed implementation (gitlint internals)

Under-the-hood, the ignore-body-lines rule is really a ConfigurationRule.

class IgnoreBodyLines(ConfigurationRule):
name = "ignore-body-lines"
id = "I3"

ConfigurationRules are applied once per commit, before all other rules:

# Apply config rules
for rule in self.configuration_rules:
rule.apply(self.config, commit)

I think to support this, we'd need to change ConfigurationRules to have a new target attribute, being either Commit or Line (with Commit being default for backward compatibility).

We’d then apply the ConfigurationRules with target Commit as we do today, but then additionally apply all ConfigurationRules with target Line on a per line basis.

So this code, would need to change a bit:

# Apply config rules
for rule in self.configuration_rules:
rule.apply(self.config, commit)

Like so:

# We use a new `self.commit_configuration_rules` property that filters all configuration rules with target==Commit
for rule in self.commit_configuration_rules:
    rule.apply(self.config, commit)

And then lower down, right before this code

violations = []
# determine violations by applying all rules
violations.extend(self._apply_line_rules([commit.message.title], commit, self.title_line_rules, 1))
violations.extend(self._apply_line_rules(commit.message.body, commit, self.body_line_rules, 2))
violations.extend(self._apply_commit_rules(self.commit_rules, commit))

We’d add something like this:

# We use a new `self.line_configuration_rules` property that filters all configuration rules with target==Line
self._apply_line_rules(commit.message.body, commit, self.line_configuration_rules, 1))

In addition, ignore-body-lines would then need to be converted to a ConfigurationRule with target set to Line and support for the new ignore rule option.

I think this could work, and might even allow us to fix the line numbering discrepancy issue. I'd need to implement a prototype to make sure.

@scop
Copy link
Contributor Author

scop commented Feb 11, 2022

I think the suggested feature would work great for my purposes. Thanks for considering and working on it!

@jorisroovers jorisroovers added the enhancement User-facing feature enhancements label Feb 15, 2022
jorisroovers added a commit that referenced this issue Feb 15, 2022
@jorisroovers
Copy link
Owner

So I put together a quick-and-dirty prototype that seems to work.

@scop I’d appreciate if you can give this a try and see if this works for you:

#  git checkout instructions
git clone https://github.com/jorisroovers/gitlint.git
git checkout issues/255
python -m venv .venv
. .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt -r test-requirements.txt -r doc-requirements.txt
cd gitlint-core
python setup.py develop
cd ..
python setup.py develop
# gitlint should work at this point
gitlint --version

# test commit message and gitlint file
echo -e "Commit Title\n\nFoo\nThis \t line contains violations and it's also really long but that's on purpose  \nBar\n" > /tmp/test.txt
echo -e "[ignore-body-lines]\nregex=^This\nignore=body-trailing-whitespace,body-hard-tab\n" > /tmp/gitlint

# Testing the feature:
cat /tmp/test.txt  | gitlint -C /tmp/gitlint
# Output: notice how the rules B2 (body-trailing-whitespace) and B3 (body-hard-tab) are ignored
# But the rule B1 is still applied
# The line number is also fixed
4: B1 Line exceeds max length (81>80): "This 	 line contains violations and it's also really long but that's on purpose  "

Notes

  • This feature might slow down gitlint quite a bit since this feature requires us to clone the gitlint config object for every line. Even though performance has never been a big consideration for gitlint, I’d still like to run a few tests. Maybe this is a non-issue, but there’s a chance this will add up when linting a large set of commits
  • Current prototype implementation is sloppy, final implementation might require a bit of refactoring work beforehand
  • Needs tests, docs, and polishing, i.e. the majority of the work still needs to be done (this is like 20% done right now)

@Notgnoshi
Copy link
Contributor

An alternative approach I've used for this problem is to define a custom rule

import re
from gitlint.rules import BodyMaxLineLength

class BodyMaxLineLengthWithExceptions(BodyMaxLineLength):
    name = "body-max-line-length-with-exceptions"
    id = "UC1"

    def validate(self, line, commit):
        if line.startswith(" " * 4):
            return None

        ret = re.match(r"^\[\d+\] ", line)
        if ret is not None:
            return None

        return super().validate(line, commit)

The intent of the leading space exception was for block quotes, which are normally shell session, code snippets, or log snippets.

The intent of the footnote-style lines was for URLs, which is what I expect prompted this issue. The reason we chose footnotes was that you could put them inline with your commit message without impacting readability.

The downside of this approach is that, while it works great in CI pipelines, it forces you to disseminate both the custom rules, and the config that disables the default body-max-line-length for local use by developers, which is what prompted #188.

@scop
Copy link
Contributor Author

scop commented Jun 6, 2022

Sorry for taking ages to get back to this. I tried out the issues/255 branch briefly, and did not manage to find any problems with it 👍

But it just occurred to me that a general purpose "ignore output matching a given regex" would work for this purpose just fine, and I guess (100% unverified) it could be less intrusive to implement. It wouldn't be as elegant as more targeted approaches, but would serve as a brutally efficient last resort -- not only for this particular one, but for any message one wants to ignore for whatever reason.

So a given regex in this rule, say [ignore-output-lines] would be matched against the entire output of gitlint, and filtered out + no effect on the exit status. So for example, this config:

[ignore-output-lines]
regex = ^\d+:\s+B1\s[^:]:[\s"]*https?://.*

...could be used to ignore this:

3: B1 Line exceeds max length (125>80): "https://aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"

A downside would be dependency on gitlint's output formatting, but I wouldn't personally mind that, as long as it's stable.

geissonator pushed a commit to openbmc/openbmc-build-scripts that referenced this issue Aug 19, 2022
Throughout the project, it is a manual human process to enforce the idea
of commit message formatting, and leads to more conflict than would
ideally be required for something that's relatively algorithmic, and
able to be enforced by CI.  Jenkins is able to give faster response
times to users, thus ensuring that committers are more likely to be able
to resolve their commit message issues in a timely manner.

This commit adds the gitlint[1] application to our builds, and
integrates its checks with CI in the format-code.sh script.  Gitlint
appears to be a relatively active project with a number of releases,
relatively up to date commits on its github, and by the documentation as
well as this authors testing, appears to do exactly what the project
needs in terms of checks.

gitlint has a number of configuration options[2], of which the defaults
appear to be ok for OpenBMCs style requirements. This commit checks in a
.gitlint file that was generated via 'gitlint generate-config' to use as
a starting point.

To avoid impacting the entire project at this time, this commit checks
for the existence of a .openbmc-enforce-gitlint file in the root of the
directory, and uses that to gate its scanning.  At some point in the
future, once we've determined this meets our needs, this check will be
removed so that we can enforce this project-wide.

This commit makes use of the gitlint plugin system to support one
important feature that OpenBMC requires for block line length.  The
custom line length rule allows:
1. Block comments to exceed the line length limit
2. Signed-Off-By sections to exceed the line length limit
3. Tabbed in sections to exceed the line length limit

Presumably this same mechanism could be used to handle openbmc/openbmc
commits, to allow meta-<name> to precede the title and go over the
allowed limit, but for the moment, format-code.sh does not apply to
openbmc/openbmc, so this would be for a future change to repotest

When these fails, it attempts to give the user a message that conveys
these allowals to let them fix their commit message quickly.

Tested:
Created a commit with a title that was too long, as well as added a
.openbmc-enforce-gitlint file in bmcweb.  Ran openbmc-build-scripts and observed.

'''
-: UC1 Line exceeds max length (101>72).
    It's possible you intended to use one of the following exceptions:
    1. Put logs or bash lines in a quoted section with triple quotes (''') before and after the section
    2. Put a long link at the bottom in a footnote.  example: [1] https://my_long_link.com
    Line that was too long:
: "VERY LONG LOG LINE THAT TAKES WAY TOO MUCH SPACE AND GOES OVER 72 CHARACTERS, which is a problem"
'''

[1] https://jorisroovers.com/gitlint/
[2] https://jorisroovers.com/gitlint/configuration/
[3] https://jorisroovers.com/gitlint/user_defined_rules/
[4] jorisroovers/gitlint#255 (comment)

Signed-off-by: Ed Tanous <[email protected]>
Change-Id: If42a22bfeca223fd5bc8f35ed937aa5f60713f2a
neiljp added a commit to neiljp/zulip-terminal that referenced this issue Jun 1, 2024
This does not resolve issues for all previous commits where the B1 rule
of long lines in the body would fail, but will allow some obvious common
cases to pass.

Inspired by jorisroovers/gitlint#255, this should cover at least lines
starting with:
- http or https [where URLs can be long]
- Co-authored-by: [where emails/names can be long]

The downside to this rule is that the line numbering is apparently
offset, based on the gitlint documentation.
(https://jorisroovers.com/gitlint/latest/ignoring_commits/)
neiljp added a commit to zulip/zulip-terminal that referenced this issue Jun 1, 2024
This does not resolve issues for all previous commits where the B1 rule
of long lines in the body would fail, but will allow some obvious common
cases to pass.

Inspired by jorisroovers/gitlint#255, this should cover at least lines
starting with:
- http or https [where URLs can be long]
- Co-authored-by: [where emails/names can be long]

The downside to this rule is that the line numbering is apparently
offset, based on the gitlint documentation.
(https://jorisroovers.com/gitlint/latest/ignoring_commits/)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement User-facing feature enhancements
Projects
None yet
Development

No branches or pull requests

4 participants