Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring CLI wiring #75

Merged
merged 3 commits into from
Apr 27, 2023
Merged

Refactoring CLI wiring #75

merged 3 commits into from
Apr 27, 2023

Conversation

warpfork
Copy link
Collaborator

The main justifications of this are to make it easier to test, and to prepare for more features I want to work on that require more access to the CLI objects.

The justifications for this package structure can be found in the introduced HACKME file.

The main justifications of this are to make it easier to test,
and to prepare for more features I want to work on that require
more access to the CLI objects.

The justifications for this package structure can be found in
the introduced HACKME file.

The use of package-scope vars for the CLI object is considerably unfortunate,
but (again, as detailed in the HACKME), our CLI library already breaks the dam.
(This will become even more largely applicable when I get to work on
the planned features around help content generation, which will require
yet more global var touches to things inside the CLI library.)
So: in for a penny, in for a pound, and I hope I don't regret this too much.

Subcommands now register themselves in package init mechanisms.
As above: in for a penny, in for a pound, and I hope I don't regret this too much.

There's no more use of "internal" packages.
Those tend to footgun me painfully more often than not,
and so let's not have that.

Removed a fairly bonkers use of an "after" func in the CLI.
Its only user was part of the check subcommand.
Fewer lines of code and fewer weird long jumps across callbacks
occur by simply putting the code where it's used.

Many other miscellaneous fixes.
For example, fixing many references to `os.Args[0]` to use the app name.
Too many other small ones to recount.

A few things itch to me, but haven't yet received any changes:
there's still a "util" package in 'app/base/util',
and I'd like to tidy it away.
Also, the "middleware" functions in it for tracing are highly
redundant and I wonder if we can't make that a bit less frequently
copy-pasted.
This diff is already big enough, though.

The 'pkg/config' package also continues to exist, and I think it
should probably be unified into 'app/base' shortly.
The CLI library has features for env vars, as well as documenting them,
and we might as well use that.
(This config package is also pretty brazenly using package-scope vars
already too, so combining it with this other package that also does so
should produce no additional hurt.)

There are some uses of "../" in test setup that are also likely to
become broken soon -- as soon as we use 'app/testutil' from varying
depths of packages.  In this commit, that hasn't happened yet,
but expect some minor Fun dealing with that soon.
@warpfork
Copy link
Collaborator Author

I'm also, with some laments, going to move to remove the -strict flag from the serum-analyzer in our CI config.

"Strict" mode of the analyzer means that it reports a failure for any public functions or methods which lack the serum "// Errors: {...}" annotations in their docs.

I'm moving to drop the strict mode because it's turning out burdensome, and also the -strict flag is still not fully fulfilling it's own value. I think we can find a sweeter spot by continuing to use the serum-analyzer, but without the -strict flag.

I'm going to do a little bit of a write-up on that, and lessons-learned, for posterity:

The straw that breaks the camel's back

The straw that breaks the camel's back here is that I want to write a couple of dummy methods on dummy io.Reader implementations, and they want to just return io.EOF. It's supposed to be a very small number of lines of code, and it's in a little cul-de-sac of the codebase where they really just frankly don't matter.

The serum-analyzer, somewhat correctly (despite this running against the grain of community norms in golang) points out that returning io.EOF is a reference to a mutable value, and thus impossible to analyze, and a bad thing to return from a serum-annotated method. (More specifically, the current error message is "returned error may not be a parameter, receiver or global variable", but that's what it means.)

Now, while that's true, and I'm fine and even happy getting that shin-kick in code that I intend to be serum-annotated and highly orderly about its error values... the trouble is... this isn't such code. I don't actually care about having serum-annotations on these methods at all -- handwavey subjective judgements like "cul-de-sac that doesn't matter" aside, this code is being used by other code that doesn't use serum stuff, and through which the analysis does not proceed, so it's objectively pointless. (And on top of all that, I would argue that io.EOF isn't even really an "error" per se -- it's more of a status sentinel. But I digress.)

So I should just not put the serum annotations in the docs for those methods, right? Right; the analyzer is meant to be incrementally adoptable, so you can do that.

Except not with the -strict flag. The -strict flag removes that ability to be incrementally adoptable.

And the methods do have to be public, because they're to satisfy an interface. And remember, the definition of "strict" mode is that it now requires serum annotations on all public functions and methods. Whoops whoops whoops! There's no way out.

Other features that go-serum-analyzer could have to make this less problematic.

There are several ways the serum-analyzer could make this less painful (with varying levels of effort required to implement, or compromises in result):

  • go-serum-analyzer could be trained to ignore io.EOF, or even all package-scope vars.
    • This has a tradeoff in correctness. Probably a fine one, given that gophers assume package-scope vars are "constants in practice" on a regular basis, but still.
  • go-serum-analyzer could support "// Errors: not analyzed." annotations...
    • ... in the docs.
      • But this would mean readers of regular godocs would see it, so the phrasing should be sensible (and that would make it verbose).
    • ... as the first line of a function body?
      • The analyzer has no pragmas of that style yet, but I suppose it could.
    • ... on individual return sites?
      • No, actually... this would be a super bad idea and destroy the utility of the analyzer. A function is analyzed, or not.
    • In either case, it's adding boilerplate to every time the thing needs to be disabled, which deserves remark.
  • go-serum-analyzer could support using "strict" mode conditionally at package granularity.
    • This would probably be a good idea with nearly zero negative tradeoffs! Just needs implementation (and a decision on how to annotate it).
    • Technically, this may even already be attainable through invoking the CLI multiple times in different modes... but that seems unsustainable in many ways.

So, there are some options there. Some of them aren't terrible. But today, they're all filed under "potential future work".

Considering relative impact and harm-reduction

Dropping strict mode means we lose two things:

  1. We lose the imperative to add serum annotations to new code.
  2. We lose the sanity check against typos like "// Eror:".

But, keep in mind what "strict" mode was supposed to give us: it's supposed to make sure all code that becomes interactable beyond the package is annotated. (The implementation rule it uses is "all public functions and methods", but the goal that actually has value is "all code interactable beyond the package".)

So does "strict" mode actually live up to that goal?

Nope.

As an example of something else that breaks the veil anyway: If you take a real practical review of what's going on in our CLI code in this project, you'll see that there's an awful lot of functions that are being used as callbacks, and given to the CLI library that we use. (They're mostly declared at package scope, rather than inline, but nonetheless their actual use is via callback.) And almost none of them are publicly declared.

Callbacks are another way that code can become interactable beyond the current package.
Even when declared non-public.

So. Uuf.

I assert that:

  • -strict mode doesn't at all live up to its goals, because the callback loophole is wide enough to drive a whole fleet of trucks programs through;
  • losing the typo checking is mildly unfortunate (but in practice, we haven't actually seen this hit very often);
  • losing the imperative to add serum annotations is also unfortunate (but we can counter for that by code review; it's pretty visibility obvious, after all! And code review only has to detect that annotations are there; whether they're correct remains machine-enforced);
  • the improved development velocity and flexibility for accepting the above two tradeoffs is worth it.

Does serum analyze less when it's not in "strict" mode?

No, actually.

It depends on how much you have annotated, but it's quite possible to drop a large number of explicit annotations, and still end up with the same overall amount of analysis coverage.

The analysis tool is smart enough to recurse freely across unannotated functions! As long as they're called by an annotated function, they'll be analyzed. So, it's quite possible to have a large part of the codebase without explicit annotation boilerplate, but still be subjected to analysis.

This applies also to interfaces, where the presence of annotations on the interface means that any assignment of values into that interface will cause the implementation to be checked to match the constraints annotated on the interface (wherever the concrete type can be determined).

This means dropping the requirement of having every public function be analyzed is giving up a lot less ground than you might otherwise think.


So. Down with -strict mode. But long live go-serum-analyzer!

The "subcmd" folder seems confusing.
We now have packages per CLI command group,
and we have packages that are meant for reuse,
and I don't know exactly where these pieces of code are on
that spectrum, but I don't think a third whole category
will help anyone else understand it either, and is in fact
as likely to add confusion as not.

The "pkg/healthcheck" package already found itself in a
similarly ambiguous situation, and was already placed similarly,
so this continues that convention.

I *would* accept an argument that packages that contain
daemon-like code and goroutine usage belong in a different
grouping than code that's just directly callable...
and if we made that organizational ruling, then *some* of
the code in these packages might move to match.
But that's not today.

So.  For now: One less subtree.  Less to think about.
@warpfork warpfork merged commit 6c89cf5 into master Apr 27, 2023
4 checks passed
@warpfork warpfork deleted the cli-refactor branch April 27, 2023 09:49
@warpfork
Copy link
Collaborator Author

Merging this to move along. (I want to do some dependency updates and other stuff that doesn't seem wise to do with open branches.)

It's possible I'll still regret some of the changes in this diff towards using more package-scoped vars, but if so, that's just sadness I'll have to own. (I still haven't been able to find ways to further customize urfave/cli's help generation system to my liking without using package-scope mutations anyway, so, in for a penny, in for a pound...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant