Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extracting filesystem accesses; 'workspace inspect' command #13

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

warpfork
Copy link
Collaborator

I got busy without network for a while, so a heap of things come at once (the individual commits may end up more sane to review than the whole PR at once):

  • A new package is introduced, to be the home of anything to do with filesystem and data loading (see big doc file);
  • A copious amount of refactoring to use fs.FS in more places;
  • And there is now a workspace inspect subcommand, which takes a look around to see all the modules it can see, and gives some brief reporting on their health (are all imports resolvable, etc).
  • Extracted some stats counting features for Plots to be reusable to serve that (it's the same stuff as wf status is doing, just now in bulk).
  • (rather importantly) fixed that Plot stat counting stuff to be correctly recursive, rather than forget to count things beyond the top level object!

Everything about the appearance of the workspace inspect command should be considered comically WIP, for 'tis terribly ugly, but I hope it's an interesting start in a direction.

Created the 'data access broker' package.
So far it contains only the functions previously in the main command
package, but I think considerably more from the current pkg/workspace
package should be bound there in the near future as well.

The main command package carries an fs.FS around *much* more
consistently now.  This makes little practical difference in this diff,
because it's always still just `os.DirFS("/")`, but it makes the API
more futureproof and ready for pieces to be mocked and tested in units.

I skipped fs'izing some features (namely, quickstart) that are writing
to the filesystem heavily.  Golang still needs writability features
in the new fs packages.  Hopefully that comes out soon.

Added a new error code to describe data from newer versions.
However, I don't think those branches are actually reachable at all;
the libraries we're using from IPLD don't really disambiguate between
codec parse errors vs schema match errors... nor is it immediately
obvious how we'd disambiguate between {schema mismatch because of
something intended as a versioning hint} and {schema mismatch from
wildly different data}.  (Right now, everything will look like the
latter, even if it was something like a plot capsule with a string
looking like something as innocuous as "plot.v2".  Future work!)
Turns out fs.FS is very spikey about receiving a param that begins with
a slash.  Kind of unfortunate, honestly.  Would be a lot easier to use
if it just decided to normalize that out.  Ah well.

(Also turns out one of these was harder to notice than it should've
been because we were shrugging at an over-wide range of errors, rather
than only shrugging at a specifically acceptable not-exists error...
so that's a bug fixed, coincidentally.)

I hope this fixes all regressions.  It's not the cleanest changes now,
either... but this should continue to improve if we get writablity
features attached to the fs package in the future (which, again,
hopefully is very (very) soon now).
Right now this is behavior only found in `wf status`, but I want to
use it again in other places as well (such as workspace-wide stats).
The actual code that evaluates and execs things is, so the estimator
most definitely must match that as well.

Previously, this would underestimate how many things a plot was
referencing, and could miss it if mounts or ingests were used in a
subplot or in a protoformula instead of being imported and piped in.

If we had been using this for a security boundary, I'd say this is
rather bad.  However, we've only been treating this data as advisory
for now anyway, so... perhaps no klaxons.  But yikes.
Had to extract the closure to a method.  It seems to have have found an
unsupported situation in the analyzer tool that currently triggers a
panic.  :(

(But getting rid of a closure: alright, fine anyway; that avoids
some very unnecessary allocs anyway, as it turns out.)
Okay, not checkmarks yet.  I'm going to prettify this later.

But it does a bunch of interesting accounting.

And also, yes, having the ability to say "--gohard=false" is already
relevant in its performance impacts.  In the 29 modules currently
in the warpsys workspace... it takes 0m0.008s real wallclock time
to just enumerate them (makes sense; it's mostly walk, and just open
and parse on the module file itself to get the name)...

... and it takes a whopping 0m0.390s real wallclock seconds to collect
all these connectivity and health checks.  Like, it's *noticable*.

Mind, I suspect there's very low-hanging fruit for improving that.
I notice the system time is 0m0.273s of that (and 0m0.133s user time
from the same sample), so... I suspect even just throwing some caching
around the reads of the catalog would probably make a big difference.

Tracing might tell us more (but that work is going on parallel to this,
so, more on that later perhaps).
Characters placeholder.  I'm being entirely silly here.

This should get replaced by a JSON API or something, and a filter
function that emits ANSI colors and such.

But!  The demo is neat.  If you pipe it to "column -t", it DTRT,
and when using it on the warpsys repo?  Hej, suddenly all the example
modules stand out... because they use ingests, and get flagged with
cautionary markers!  Nice.
// Aliases: []string{"winspect"}, // doesn't put them at the top level. Womp.
Flags: []cli.Flag{
&cli.BoolFlag{
Name: "gohard",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha. Maybe health-check?

case 1:
return "✔"
case 2:
return "å"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

å is yellow?

but eventually you should still reach wfapi data types.

Functions that deal with the filesystem may expect to be dealing with either
a workspace filesystem (e.g., conmingled with other user files),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
a workspace filesystem (e.g., conmingled with other user files),
a workspace filesystem (e.g., commingled with other user files),

"path/filepath"
"time"

"github.com/go-git/go-git/v5"
"github.com/go-git/go-git/v5/plumbing"
"github.com/go-git/go-git/v5/storage/memory"
"github.com/urfave/cli/v2"
"go.opentelemetry.io/otel"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a thing to enforce our import patterns in CI?


// TODO: currently we read the module/plot from the provided path.
// instead, we should read it from the git cache dir
plot, err := plotFromFile(filepath.Join(path, PLOT_FILE_NAME))
// FIXME: though it's rare, this can be considerably divergent
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean?

TripleDogDare
TripleDogDare previously approved these changes Aug 27, 2022
@@ -95,6 +96,10 @@ func cmdStatus(c *cli.Context) error {
if err != nil {
return fmt.Errorf("failed to open module file: %s", err)
}
} else if os.IsNotExist(err) {
Copy link
Collaborator

@TripleDogDare TripleDogDare Aug 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if-else block should be re-written to make the else statements go away.

// TODO: a bool flag for this.
if d.IsDir() && len(path) > len(wsPath) {
_, e2 := fs.Stat(wsFs, filepath.Join(path, dab.MagicFilename_Workspace))
if e2 == nil || os.IsNotExist(e2) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invert if block to get rid of else statement.

// If this is a dir (beyond the root): look see if it contains a workspace marker.
// If it does, we might not want to report on it.
// TODO: a bool flag for this.
if d.IsDir() && len(path) > len(wsPath) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider making this whole block a function.

"github.com/warpfork/warpforge/wfapi"
)

// Might not match the package name -- funcs in this file certainly don't exec anything.
Copy link
Collaborator

@TripleDogDare TripleDogDare Aug 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So? I don't really care and I don't think this comment is useful.

haveHappyExit := false
if c.Bool("gohard") {
if err != nil {
goto _checksDone
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oy.

havePacksCached := false // maybe should have a variant for "or we have a replay we're hopeful about"?
haveRunrecord := false
haveHappyExit := false
// 0 = idk; 1 = yes; 2 = no. (0 generally doesn't get rendered.)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some iota enums make sense here.

@TripleDogDare
Copy link
Collaborator

Tracing might tell us more (but that work is going on parallel to this,
so, more on that later perhaps).

Give it a go. It's ready for you.

@TripleDogDare TripleDogDare dismissed their stale review August 27, 2022 05:40

meant to do commit-wise

@warpfork
Copy link
Collaborator Author

warpfork commented Sep 1, 2022

((fyi, I'm very probably going to just table this and re-implement most of it afresh. There's still some gnarly path bugs in here I just can't track down for the life of me, and I think it's gonna be a heck of a lot easier after the no-net-tests fix branch lands... and there's also already a bunch of conflicts here, and probably more to come when no-net-tests lands... so, yeah. I suspect it might end up faster to just reboot it (embarrassing though that is, heh).))

@TripleDogDare
Copy link
Collaborator

Yeah. This is a pretty solid draft though for a single offline power session.

@warpfork warpfork mentioned this pull request Sep 9, 2022
@warpfork warpfork marked this pull request as draft September 26, 2022 11:57
// - warpforge-error-catalog-parse -- like it says on the tin.
// - warpforge-error-io -- for IO errors while reading catalogs.
//
func ComputeStats(plot wfapi.Plot, wsSet workspace.WorkspaceSet) (PlotStats, error) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The is one thing in this PR that should definitely be rescued: this ComputeStats function is more correct than where it was hoisted from. The old occurrence isn't recursive, which is... wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants