Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for external frontends, take 2 #1600

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

EliRibble
Copy link

This is based on #1210 by collincross.

There are many requests for customizations to Ninja's build output cataloged in #746. This patch set attempts to address the underlying issue by spawning a separate frontend process and feeding serialized build status updates to it. The frontend can be extended without complicating Ninja's core logic, or can be completely swapped out for a custom frontend like implemented in #1020. The output can also be serialized to disk for later parsing, which will work well for buildbots.

The interface between Ninja and the frontend is documented in FRONTEND.md in 7th patch in the series, "Add frontend option for build status output"

This is an attempt to get external frontends landed in a way that is in harmony with ninja's policy of being conceptually simple. This means that the ninja binary will not depend on any implementation of protobufs. This PR will adds a new binary that has the ability to take a '--frontend' argument that passes protobuf messages and allows arbitrary logic for showing progress and messages to the user.

Add a tiny protobuf runtime in proto.cc and proto.h, and a python
script that takes a binary proto descriptor (produced by
protoc --descriptor_set_out) and produces a .pb.h file that
implements a subset of the protobuf C++ Message interface.

Original code from colincross.
@EliRibble
Copy link
Author

This PR is going to need some more work before it has fully achieved its purpose. From the original PR, #1210, there was the following post from nico:

I thought a bit about this. As said on the mail thread, I'm not a fan of putting protobuf in ninja itself (even if it's a reimplementation). At the same time, I agree that lots of people want to tweak ninja's output in many ways, and this is a good approach to attack that problem.

What do you think about adding a second binary, ninja_protoout or ninja_buildcore or what have you, somewhere below contrib/, make that link to the ninja lib, put the proto code there, and make the ninja lib have an interface for selecting the output behavior (and core ninja would provide a class that works like the current output and the new binary would instead use the proto behavior)? That way, ninja itself stays the way it is, and people who want very custom output can use the new binary for that?

I have not yet done that work. Before I do I'd like to actually understand the requirements. This is what I believe needs to be done:

  1. The ninja binary cannot have compiled into it any implementation of protobuf communication. For this PR, that means proto.h|proto.c.
  2. The current code in this PR creates a StatusSerializer or a StatusPrinter in real_main based on the presence of a --frontend parameter. Instead we'll need do work to break up real_main into code that can be shared and create two different source files, one that will be used by ninja and will always provide a StatusPrinter and another that will be used by ninja_buildcore and can provide StatusPrinter or StatusSerializer.
  3. There may be packaging work to add ninja_buildcore to releases.

Open questions:

Do I need to create a contrib directory in the root of the repository and put all protobuf-related source into it? If not, what does "somewhere below contrib/" mean?

Is there a preference in sharing the logic in ninja.cc between keeping most of the shared code within ninja.cc or moving shared code to a new source file?

Eli Ribble added 3 commits July 11, 2019 21:18
Make BuildStatus an abstract interface, and move the current
implementation to StatusPrinter, to make way for a serialized
Status output.

Original code by colincross.
Edges are nominally ordered by order in the build manifest, but in
fact are ordered by memory address.  In most cases the memory address
will be monontonically increasing.  Since serialized build output will
need unique IDs, add a monotonically increasing ID to edges, and use
that for sorting instead of memory address.

Original code by colincross.
Send all output after manifest parsing is finished to the Status
interface, so that when status frontends are added they can handle
build messages.

Original code from collincross.
@EliRibble EliRibble force-pushed the frontend branch 2 times, most recently from ce100e5 to f19f3f6 Compare July 12, 2019 04:22
Eli Ribble added 2 commits July 11, 2019 22:55
The times that end up in the build log currently originate in the
status printer, and are propagated back out to the Builder.  Move
the edge times into the Builder instead, and move the overall start
time into NinjaMain so it doesn't get reset during manifest
rebuilds.

Original source from colincross.
Store the number of running edges instead of trying to compute it
from the started and finshed edge counts, which may be different
for starting and ending status messages.  Allows removing the status
parameter to PrintStatus and the EdgeStatus enum.

Original code by colincross.
@EliRibble EliRibble force-pushed the frontend branch 2 times, most recently from 60023a3 to 22e94eb Compare July 12, 2019 05:59
Add a --frontend option that takes a command that will handle printing
build status.  The command will be executed with the read of a pipe on
fd 3, and each build event will cause a serialized message to be sent
on the pipe.

The frontend/ directory includes a description of the interface
between Ninja and a frontend, and an implementation of the interface
in python that mimics the native interface in ninja, which can be run
with:
ninja --frontend=frontend/native.py

Original code by colincross.
@jhasse
Copy link
Collaborator

jhasse commented Jul 17, 2019

Thanks! I've also done a rebase of this some some time ago: https://github.com/jhasse/ninja/tree/serialize Never managed to create a PR from it though. What else did you change beside rebasing?

From the original PR, #1210, there was the following post from nico:

In the post you've quoted nico is asking questions. I don't think that the final word about this has been spoking, so I wouldn't start on any work yet.

@EliRibble
Copy link
Author

I haven't changed anything yet aside from rebasing. That's because I agree, the final word has not been spoken on how this should be done.

@jonesmz

This comment was marked as abuse.

@jhasse jhasse mentioned this pull request Aug 4, 2019
virtual void BuildLoadDyndeps() = 0;
virtual void BuildStarted() = 0;
virtual void BuildFinished() = 0;

This comment was marked as abuse.

This comment was marked as abuse.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for protected, deleting Status* without a virtual dtor will result in a warning.

This comment was marked as abuse.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but even compilers that do may not be able to issue a diagnostic for this in every situation due to AST transformations losing the context of the delete operation.

Can you give an example?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some times,

How do you know that?

This comment was marked as abuse.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See GCC's or Clang's compiler manual.

This comment was marked as abuse.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I never said that I'm "against" it, just that there is no need.

virtual void BuildStarted();
virtual void BuildFinished();

virtual ~StatusPrinter() { }

This comment was marked as abuse.

src/status.h Outdated

#include <map>
#include <string>
using namespace std;

This comment was marked as abuse.

src/ninja.cc Outdated
@@ -1299,6 +1298,8 @@ NORETURN void real_main(int argc, char** argv) {
exit((ninja.*options.tool->func)(&options, argc, argv));
}

Status* status = new StatusPrinter(config);

This comment was marked as abuse.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to avoid auto_ptr. I would keep the raw pointer for now.

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

@@ -250,3 +251,24 @@ void StatusPrinter::PrintStatus(Edge* edge, EdgeStatus status) {
force_full_command ? LinePrinter::FULL : LinePrinter::ELIDE);
}

void StatusPrinter::Warning(const char* msg, ...) {

This comment was marked as abuse.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#ifdef #else with two code paths isn't a good idea, because only one of them will be tested by most people.

This comment was marked as abuse.

This comment was marked as abuse.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

???
I did:

because only one of them will be tested by most people.

This comment was marked as abuse.

@jonesmz

This comment was marked as abuse.

@jonesmz

This comment was marked as abuse.

@jhasse
Copy link
Collaborator

jhasse commented Aug 4, 2019

Let's not discuss Protobuf vs JSON here please.

The library approach is something to consider, requires more work though (might be worth it).

@jonesmz

This comment was marked as abuse.

@jhasse
Copy link
Collaborator

jhasse commented Aug 4, 2019

If the user wanted to have the output formatted differently, then the JSON can be piped to another program, in the conventional way program output is handled on posix like systems.

That change doesn't require JSON, it could also work with protobuf.

The reason for using fd 3 is that jobs in the console pool can then still directly use fds 0, 1 and 2.

@jonesmz

This comment was marked as abuse.

@jhasse
Copy link
Collaborator

jhasse commented Aug 4, 2019

ninja --serialized-output | protoc --decode_raw | your_plain_text_tool

And now please stop with the opinion pieces about protobuf. I think everyone is aware of the advantages and disadvantages, if not see https://www.google.com/search?q=protobuf+vs+json

While I don't know the syntax for it, i'm fairly sure it's possible to pipe-operator alternative file descriptors, isn't it?

As far as I know it isn't possible.

@Sarcasm
Copy link

Sarcasm commented Aug 4, 2019

I'm really sorry to talk about this too, but as a user I would find JSON more accessible.
For example, if I want to make a Ninja UI inside Emacs, using JSON I would just have to write the UI code.
Using protobuf would just be painful, bigger barrier to entry for what reasons?

I don't think googling "json vs protobuf" is of any help, every tools has pros and cons.
ninja is quite simple to use, its lack of external dependencies is quite convenient for building and distribution.
Adding protobuf for json reporting while a "json line" protocol could do the same job without the protobuf dependency seems weird to me.

I feel like it would be better to explain what are the protobuf benefits over json line for ninja users.
IMHO, you could put "convenience" on the side of JSON.

@jonesmz

This comment was marked as abuse.

@jonesmz

This comment was marked as abuse.

@jonesmz

This comment was marked as abuse.

@jhasse
Copy link
Collaborator

jhasse commented Aug 7, 2019

@jonesmz

Sorry, I definitely should have phrased that differently in a not-so-dismissive way.

The reason why I asked to stop the anti-protobuf arguments was NOT because I don't value them. It's because I already know them. The same goes with the other points you've mentioned. I'm well aware and I never said that I don't agree with you - this PR won't be merged any time soon, so no need to panic or object.

@Sarcasm

I feel like it would be better to explain what are the protobuf benefits over json line for ninja users.

I think the general pro-protobuf arguments apply.

@jonesmz

This comment was marked as abuse.

@colincross
Copy link
Contributor

I'm the original author of these changes from #1210. Not sure why this needed to be split into a new pull request, and I'd prefer that you leave me as the author in Git when making minor changes to my patches.

As for JSON vs. protobuf, I would much rather have used JSON, but JSON is actually terrible at passing arbitrary data. The data must be valid UTF8, which precludes some arbitrary binary sequences that might be found in the output of rules unless you base64 encode everything, which is both inefficient and loses the human-readability advantages of JSON. I originally wrote this with UBJSON instead of protobuf, but that loses most of the advantages of JSON without providing the advantages of protobuf (library support in multiple languages, easy cross-version compatibility). I ended up having to write rules on how to rev the schema without breaking compatibility with existing frontend, and there were no good tools to enforce the schema. So I went with protobuf instead.

I'm happy to continue working on this PR if there is a path forward, but it is merged as-is to the fork of ninja used in the Android build system and has been working great for us there. It has enabled us to store multiple output streams with different levels of logging, collect the error messages into a separate file, and produce an accurate build trace that can be loaded in chrome://tracing. Recently we have turned on local build output using a status table instead of a single status line that shows the description of the 8 longest currently running edges. There are more features planned including streamed build progress from our CI and colorized logs output from CI.

@jonesmz

This comment was marked as abuse.

@Sarcasm
Copy link

Sarcasm commented Aug 8, 2019

JSON is actually terrible at passing arbitrary data. The data must be valid UTF8, which precludes some arbitrary binary sequences that might be found in the output of rules unless you base64 encode everything, which is both inefficient and loses the human-readability advantages of JSON.

I agree it's technically possible to have arbitrary byte sequences, that contains binary data, and ninja should support it, but I don't think everything needs to be base64 encoded.

Looking at the src/frontend.proto, I see the following string fields (correct me if I'm wrong):

EdgeStarted#inputs
EdgeStarted#outputs
EdgeStarted#desc
EdgeStarted#command
EdgeFinished#output
Message#message

I'm not sure all theses can contain binary data, but if ninja had to support if for all these fields, it should be possible to just add something like <field>_format with a value of base64 or something like that, e.g.:

{
  "type": "EdgeStarted",
  "inputs": ["Zm9vLGNwcA=="],
  "inputs_type": "base64",
  "outputs": ["foo.o"],     # outputs_format can be skipped if it's utf-8
}

JSON has way more support than protobuf and is supported by default by a lots of programming languages or environments (Python standard library, Emacs Lisp, frameworks like Qt, ...).
While ninja can content itself to write a light version of the protobuf serialization and add the new features it needs bit by bit, I expect the protobuf consumers will often try to support all of protobuf unless the ninja subset is stable and publicly documented.

The C++ Module folks have a similar issue to deal with non-utf8 text and it looks like JSON is not a showstopper for that http://www.open-std.org/pipermail/tooling/2019-March/000553.html, I expect most real-life files will be human readable, so it seems a good compromise.

I'm not sure I want to write this, but really my Emacs integration example is interesting.
With a protobuf API, I would have to build either:

  1. a protobuf deserializer in Emacs Lisp (because no one exists), I suspect it would take more code and time than to make the ninja UI itself
  2. or depend on an external tool that to do the conversion to an intermediate format

While with JSON, there is built-in support, with efficient built-in support.
If it's just a simple integration just for my comfort, I can even skip the handling of non-utf-8 data.
I could easily test it with fake data that I write by hand.


Another thing, regarding the frontend that is invoked by ninja.
I also expected more an API like @jonesmz feel like it would be easier to have an API like this:

ninja --serialized-output

With output going to stdout or stderr.
And if user want confidence that the data is not mangled with other outputs, make it configurable, e.g.:

ninja --serialized-output --serialized-output-fd=3

That way, it seems easier to integrate in a tool like Emacs, because if I wanted to integrate the ninja build inside my Emacs session, with the current design, it looks like I would to write a frontend script, that forward the data to Emacs, so I would need 3 processes, while Emacs could maybe just invoke ninja with the rights params and consume the additional reporting stream.


Sorry to bring Emacs up, but I feel like editor integration is a reasonable example of a custom frontend, and right now the API does not seem easy to support that.

The other idea of a frontend I have in mind, is to make a python frontend that would make a D-Bus progress notification, so I can start the build in a workspace, go to another workspace, e.g. in my browser, and still track compilation progress.

@EliRibble
Copy link
Author

@colincross

I'm the original author of these changes from #1210. Not sure why this needed to be split into a new pull request, and I'd prefer that you leave me as the author in Git when making minor changes to my patches.

My apologies, I didn't realize that was something that could be done in Github. I wanted to try to move your ideas forward without requiring you to do additional work. I tried to point out as much as I could that the work is almost entirely yours. If there's something you'd like me to do to make this right please let me know.

@EliRibble
Copy link
Author

As far as the discussion here goes. I tend to agree with the idea that making ninja a library rather than making it start up a separate process to act as its frontend makes sense. I'll going to start looking in to what it would take to do that. I also think this PR could be modified to remove the (controversial) protobuf work to get the benefit of some of the changes while the discussion continues. I'll try to make that happen.

@jhasse
Copy link
Collaborator

jhasse commented Aug 9, 2019

My apologies, I didn't realize that was something that could be done in Github.

Not really GitHub, but normally Git will leave the original author of the commit when you rebase. It will look like this in GitHub: https://github.com/jhasse/ninja/commits/serialize "colincross authored and jhasse committed"

@EliRibble EliRibble mentioned this pull request Aug 9, 2019
@EliRibble
Copy link
Author

Created subset PR: #1631

@XVilka
Copy link

XVilka commented Jul 28, 2020

There are now some conflicts that should be resolved. Please rebase.

@kaspar030
Copy link

Another thing, regarding the frontend that is invoked by ninja.
I also expected more an API like @jonesmz feel like it would be easier to have an API like this:

ninja --serialized-output

[...]

That way, it seems easier to integrate in a tool like Emacs, because if I wanted to integrate the ninja build inside my Emacs session, with the current design, it looks like I would to write a frontend script, that forward the data to Emacs, so I would need 3 processes, while Emacs could maybe just invoke ninja with the rights params and consume the additional reporting stream.

Just as a datapoint, I agree with this.
I'm currently writing a frontend that "transparently" calls out to ninja. Currently I cannot now what ninja did. if it exited without error, I can just assume that everything has been built, and if it fails, ... I guess the need for the functionality in general is clear. :)

So with the solution this PR proposes, my frontend would call ninja with another tool (which is a "frontend" in ninja's terminology) as parameter which ninja executes, which would somehow communicate back to my frontend. That sounds clumsy, error prone and not trivial to get right in a portable way.

Just having ninja's output in a structured way that's easy to parse would be much easier to integrate.

I don't care if it is json or protobufs.

@amilcarlucas
Copy link

ping

@jhasse jhasse added this to the 2.0.0 milestone Apr 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants