
Parameterize a script.Pipe with a user data struct ? #185

Open
fbaube opened this issue Aug 21, 2023 · 5 comments
Comments

@fbaube

fbaube commented Aug 21, 2023

There are many pipeline packages "out there", but script seems to be one that gets it conceptually correct.

Question: should it be possible to attach a data structure to a pipeline using generics? Something like

// ParamPipe embeds Pipe and carries user data of any type T.
type ParamPipe[T any] struct {
        UserData T
        Pipe
}

A processing pipeline for an instance of a user-defined struct could then be constructed in a one-liner, and new functions could operate specifically on the user data.

I wish to write processing pipelines for chunks of content, and script's bash-style primitives provide a lot of helpful functionality.

@bitfield
Owner

Thanks for the suggestion, @fbaube! Can you come up with an example of the kind of program you'd like to write using this idea? That'll help me get a clearer picture of how it might work.

@gedw99
Contributor

gedw99 commented Aug 21, 2023

This is really similar to something I was messing with but never got around to doing.

Can I hit you with a use case?

I have 3 folders, and each is essentially its own binary microservice.

When one folder changes, I want to raise a change event to a broker like NATS. This is done by a filesystem watcher.

NATS broadcasts it to the other folders' binaries, which try to do some work and change their own file systems, and this raises more events.

This is called choreography. It's bottom-up workflow piping, where the workflow emerges from whatever file change events are being broadcast and who is listening.

It's simple, like this project, and the schema is just the file that changed and in which project.

@fbaube
Author

fbaube commented Aug 22, 2023

Can you come up with an example of the kind of program you'd like to write using this idea? That'll help me get a clearer picture of how it might work.

My goal is something like a DSL for processing Lightweight DITA. LwDITA is DITA with a greatly reduced tag set, plus support for HTML5 and Markdown. This is where script would be used, and I would create new ParamPipe functions.

(As an aside, I figure that when I have M pipelines for M files, each with N processing stages, then there are a number of ways this load could be distributed across multiple processors.)

So in the CLI program, the processing for a file looks (or will look) something like this:

  • Gather CLI references to files and directories
  • Expand directories into file lists
  • Process in-file metadata (e.g. HTML)
  • Read file content
  • Analyze file content (MIME type? Is XML? Has DOCTYPE? Is valid XML? etc.)
  • Parse file into an AST (e.g. using goldmark for Markdown, stdlib for HTML5 and XML)
  • Extract "interesting" links (cross-references, ToC entries, etc.)
  • (Note that up until this point, each file can be processed in isolation)
  • Resolve and check validity of inter-file links
  • Prepare file set for XSLT processing

fred

@bitfield
Owner

That sounds great! So what would the script code look like to do this?

@fbaube
Author

fbaube commented Aug 22, 2023

Good question! I already have code that looks like this:

return p.
        st1a_ProcessMetadata().
        st1b_GetCPR().         // Concrete Parse Results
        st1c_MakeAFLfromCFL(). // Abstract Flat List from Concrete Flat List
        st1d_PostMeta_notmkdn()

A pure DSL, though, would need to deal with how a list of N files fans out into N separate pipelines. I'm not sure whether script can do this, and it's not a typical task for a shell script either. I don't know whether there is a best practice for DSLs to do this.

To [Param]Pipe I would also add a debug io.Writer and a DB connection.
