Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for external frontends #1210

Closed
wants to merge 7 commits into from

Conversation

colincross
Copy link
Contributor

@colincross colincross commented Nov 21, 2016

There are many requests for customizations to Ninja's build output cataloged in #746. This patch set attempts to address the underlying issue by spawning a separate frontend process and feeding serialized build status updates to it. The frontend can be extended without complicating Ninja's core logic, or can be completely swapped out for a custom frontend like implemented in #1020. The output can also be serialized to disk for later parsing, which will work well for buildbots.

The interface between Ninja and the frontend is documented in FRONTEND.md in 7th patch in the series, "Add frontend option for build status output"

@colincross colincross force-pushed the serialize branch 4 times, most recently from 0ffc551 to 9fec830 Compare June 19, 2017 19:59
$ frontend/native.py 3< ninja.pb
```

The serialized output of a clean Ninja build is included in `frontend/ninja.pb`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

frontend/frontend.pb

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, frontend/frontend.pb is the compiled protobuf descriptor, not serialized build output.


To save serialized output to a file:
```
$ ./ninja --frontend='cat /proc/self/fd/3 > ninja.pb all
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cat <&3 >ninja.pb would be more portable (at least Mac, which doesn't have /proc)

And you're missing a closing '

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed

@jhasse
Copy link
Collaborator

jhasse commented Sep 4, 2017

Running with frontend/native.py I'm getting:

Traceback (most recent call last):
  File "frontend/native.py", line 300, in <module>
    main()
  File "frontend/native.py", line 297, in main
    native.handle(msg)
  File "frontend/native.py", line 124, in handle
    if msg.message.level == msg.message.Level.INFO:
AttributeError: 'EnumTypeWrapper' object has no attribute 'INFO'

Anything I'm missing?

@colincross
Copy link
Contributor Author

Oops, I misunderstood the python protobuf class structure, combined with an untested path. Pushed a quick fix for now.

@jhasse
Copy link
Collaborator

jhasse commented Sep 6, 2017

Works now, thanks :)

size = 0
shift = 0
while True:
byte = self.reader.read(1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use byte = bytearray(self.reader.read(1)) ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed and tested with python3

if len(byte) == 0:
raise StopIteration()

byte = ord(byte[0])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and remove ord call here: byte = byte[0]. It will then work in both Python 2 and 3.

jhasse added a commit to jhasse/ja that referenced this pull request Oct 4, 2017
Currently uses cat to forward the messages.

frontend.pb, frontend.py and native.py are taken from
ninja-build/ninja#1210 with minimal changes.
self.status_class = self.get_status_proto()

def get_status_proto(self):
set = google.protobuf.descriptor_pb2.FileDescriptorSet()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[pylint] W0622:Redefining built-in 'set' Maybe fd_set instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed


def get_status_proto(self):
set = google.protobuf.descriptor_pb2.FileDescriptorSet()
descriptor = os.path.join(os.path.dirname(sys.argv[0]), 'frontend.pb')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if sys.argv[0] is a good idea, maybe __file__ instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


message = self.reader.read(size)
if len(message) != size:
raise "Unexpected EOF reading %d bytes" % size
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raise EOFError("Unexpected EOF reading {} bytes".format(size))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

break
shift += 1
if shift > 4:
raise "Expected varint32 length-delimeted message"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of raising a str it's nicer to use something like RuntimeError (which inherits from BaseException).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, and all the other raised strings

@jhasse
Copy link
Collaborator

jhasse commented Oct 5, 2017

I've used this PR to write a new frontend which adds some features I've missed: https://bixense.com/ja/ It works pretty well so far :)

@colincross colincross force-pushed the serialize branch 2 times, most recently from 3c09006 to f5fe5af Compare October 5, 2017 18:17
Edges are nominally ordered by order in the build manifest, but in
fact are ordered by memory address.  In most cases the memory address
will be monontonically increasing.  Since serialized build output will
need unique IDs, add a monotonically increasing ID to edges, and use
that for sorting instead of memory address.
The times that end up in the build log currently originate in the
status printer, and are propagated back out to the Builder.  Move
the edge times into the Builder instead, and move the overall start
time into NinjaMain so it doesn't get reset during manifest
rebuilds.
Store the number of running edges instead of trying to compute it
from the started and finshed edge counts, which may be different
for starting and ending status messages.  Allows removing the status
parameter to PrintStatus and the EdgeStatus enum.
Make BuildStatus an abstract interface, and move the current
implementation to StatusPrinter, to make way for a serialized
Status output.
Send all output after manifest parsing is finished to the Status
interface, so that when status frontends are added they can handle
build messages.
Add a tiny protobuf runtime in proto.cc and proto.h, and a python
script that takes a binary proto descriptor (produced by
protoc --descriptor_set_out) and produces a .pb.h file that
implements a subset of the protobuf C++ Message interface.
Add a --frontend option that takes a command that will handle printing
build status.  The command will be executed with the read of a pipe on
fd 3, and each build event will cause a serialized message to be sent
on the pipe.

The frontend/ directory includes a description of the interface
between Ninja and a frontend, and an implementation of the interface
in python that mimics the native interface in ninja, which can be run
with:
ninja --frontend=frontend/native.py
@jhasse
Copy link
Collaborator

jhasse commented Nov 1, 2017

Is there anything I could do to help getting this merged?

@Boddlnagg
Copy link

I would also really like to see this being merged. My goal is to have a frontend that logs everything to a file (similar to a simple stdout redirect with current ninja), but still provides status output. In this way, warnings and executed commands can be found in the logfile, while a status line (and output of failed commands) is shown directly on the console.

@nico
Copy link
Collaborator

nico commented Apr 5, 2018

I thought a bit about this. As said on the mail thread, I'm not a fan of putting protobuf in ninja itself (even if it's a reimplementation). At the same time, I agree that lots of people want to tweak ninja's output in many ways, and this is a good approach to attack that problem.

What do you think about adding a second binary, ninja_protoout or ninja_buildcore or what have you, somewhere below contrib/, make that link to the ninja lib, put the proto code there, and make the ninja lib have an interface for selecting the output behavior (and core ninja would provide a class that works like the current output and the new binary would instead use the proto behavior)? That way, ninja itself stays the way it is, and people who want very custom output can use the new binary for that?

@jhasse
Copy link
Collaborator

jhasse commented Apr 5, 2018

Doesn't ninja also stay the way it is when one isn't using the --frontend option?

@nico
Copy link
Collaborator

nico commented Apr 5, 2018

The behavior stays the same, but the code still ends up in the binary.

@jhasse
Copy link
Collaborator

jhasse commented Apr 5, 2018

IMHO that's an acceptable tradeoff. What binary size difference are we talking about here? Probably a few kilobytes.

That way, ninja itself stays the way it is, and people who want very custom output can use the new binary for that?

So the binary gets distributed with ninja? That way the size of the package stays the same anyway.

If not, it would still be very inconvenient as frontends still need to ship a binary.

@nico
Copy link
Collaborator

nico commented Apr 5, 2018

It's more about the conceptual "ninja doesn't depend on protobuf" imho.

Depends on what you mean by "distributed". The only distribution we do is the releases page on github; we can certainly include the binary there. All other distribution is up to the people owning the linux packages and whatnot.

@jhasse
Copy link
Collaborator

jhasse commented Apr 5, 2018

It's more about the conceptual "ninja doesn't depend on protobuf" imho.

Can you elaborate what you mean by this exactly? Would a compile time switch be satisfactory for example?

Btw: An argument why merging this might actually decrease ninja's size in the long run: Lots of fancy interface stuff can be implemented by external frontends and doesn't need any code in ninja itself.

All other distribution is up to the people owning the linux packages and whatnot.

Yeah that's what I meant, because I think that's what most Linux people use.

@nico
Copy link
Collaborator

nico commented Apr 5, 2018

Ninja should stay small and simple. It should do builds, quickly, and not much else. Having a flexible cross-process communication protocol in there is philosophically out of scope for ninja itself.

@colincross
Copy link
Contributor Author

I'm happy to implement this if it is necessary to get this merged, but I agree with Jan that it doesn't make much sense vs. the extra complexity. The proto stuff is all encapsulated through a subclass of the Status interface anyways (StatusSerializer in status.cc, which uses the generated frontend.pb.h file and the 264 line proto runtime in src/proto.cc and src/proto.h), so moving it to its own binary won't materially change the new code.

All the argument parsing logic and the main loop in ninja.cc would need to be in the new binary, which probably means moving ninja.cc into libninja and putting two new main.cc files that just configure the output class and call real_main().

As a measure of how big this patch is, for a local build before and after serialize, there is an 8% increase in text section size:

	   text	   data	    bss	    dec	    hex	filename
before:	 179193	   2312	    560	 182065	  2c731	ninja
after:	 193758	   2952	    600	 197310	  302be	ninja

@nicolasdespres
Copy link
Contributor

I think there is a simpler approach to customize ninja's output. By simpler I mean with a smaller footprint on Ninja's code base. I think I have mentioned it in the past but never got comment on it.

Following the same spirit than the NINJA_STATUS environment variable we could have an environment variable specifying the template to use when printing the start of an edge (NINJA_START_EDGE), when an edge stop (NINJA_STOP_EDGE). Users can use the protocol they like (json or ad-hoc for instance) when defining the value of these variables. Eventually, we will have to support some extra placeholders (like one printing the job id so that users can match edge stop line with edge start line. Some escaping function would maybe land in ninja code base too but that will be all. Users would finally just have to write a filter understanding their protocol and do the nice printing.

@colincross
Copy link
Contributor Author

It's certainly simpler, but I have a few concerns about using NINJA_START_EDGE/NINJA_STOP_EDGE instead of proper serialization.

  • It only covers edges, this patch also covers messages from ninja and starting and ending the build. Messages (warnings and errors) in particular would require frontends to parse various strings that ninja prints. Protobuf also allows new messages to be added later (periodic messages containing system cpu usage or memory usage? heartbeat?)
  • There's no way to tell that edge output (which will follow a NINJA_STOP_EDGE) has finished until another edge has started or finished or the build has ended. When using pools, or at the end of builds, there is no guarantee that an edge finish is immediately followed by an edge start, so this could be a long time.
  • Printing things like the list of input/output targets will require a placeholder that produces an escaped list, which means there won't end up being that much variability in the output format. For example, %O could produce: "a.pb.cc", "a.pb.h". At that point ninja might as well have just produced the whole output in JSON instead of forcing each frontend to manually smoosh NINJA_START_EDGE into something that looks like JSON.
  • Separating the start/stop edge lines from edge output requires some sort of magic sequence that must never appear in an edge output, which seems hacky.

@colincross
Copy link
Contributor Author

Another use case covered by this pull request but not by NINJA_START_EDGE, and one of my main motivations for pursuing this pull request, is the ability to get ANSI escape codes (colorized logs) even when outputting to something that is not a smart terminal, like a file or a pipe. I want to produce build breakage emails that retain the original colorized error messages, but ninja strips them. This pull request puts all control of the output in the frontend, so ninja doesn't have to have an option for every different way someone wants the output to look.

@nicolasdespres
Copy link
Contributor

It is true that we cannot achieve the same level of versatility with two environment variables than with properly designed inter-process communication protocol.

  • For warning/error messages, I think most IDEs already parse compiler warnings/errors without problem so we could follow this pseudo-standard (as with "Entering directory...")
  • Edge's output should indeed be prefixed by some kind of edge id so that the front-end can relate one to another and enable stream-mode. In stream-mode ninja no longer buffers the whole command output but print each line prefixed by the edge id when it is received.
  • People know their build system and the commands involved with the kind of output they produce. I see no problem in letting front-end designers deciding of the magic sequence to find the beginning/end of an edge output. Something like __ninja_edge_start will work.
  • Placeholder expanded values could use the same syntax as in the ninja build file to express file list.
  • Ninja should not be concerned by ANSI escape codes specially in such a batch mode (except for dependency extraction). IMHO, front-ends are in charge of parsing command's output.

I think a compromise between versatility and code base impact should be found.

@jhasse
Copy link
Collaborator

jhasse commented Apr 9, 2018

I think most IDEs already parse compiler warnings/errors without problem

I can assure that they aren't doing this "without problems", I've had my own share of problems with that. More likely they are doing it because of no available alternative.

For example the Rust compiler added JSON output for IDEs, because the text parsing method was restricting them and IDE developers didn't like it either. Without it they wouldn't have been able to change the default output format for their error messages.

Keep that in mind: Frontends need a stable interface and if you let them use Ninja's normal output it will be harder to change that.

Something like __ninja_edge_start will work.

What if the output of an edge contains that, too? Might happen when you recursively call Ninja.

I think a compromise between versatility and code base impact should be found.

I think the code base impact of this approach will actually be smaller, because lots of other issues / feature requests / etc. can then be easily moved to frontends.

@nicolasdespres
Copy link
Contributor

@jhasse

Emacs and others finds error/warning messages. It is not pretty and probably hacky but it works 90% of the time. That's what I meant when I said "without problems".

It is extremely unlikely that __ninja_edge_start is included in the output and ninja does not support recursive mode. You could add the recursion level if needed anyway.

To all:

Please do not take me wrong. I am fully aware of the benefit of well defined interface and I generally prefer principled approaches. But this pull-request diffstat is +2,743 −400 and @nico's last comment suggests that it won't be merged unless a compromise is found. Ninja covers only 80% of build-system use cases. It is a design choice. Maintainers do not seem inclined to invest in the effort required to support the remaining 20% which may probably double the code base size. I think the same philosophy applies in the case of the front-end.

@areusch
Copy link

areusch commented Jul 18, 2018

Hi everyone,

Just wondering if there's been any more traction on merging this--it'd be super useful to me. In my use case, ninja is running potentially-interactive tools, so having a way to get status updates outside of stdout/stderr is pretty valuable.

Thanks!
Andrew

@Boddlnagg
Copy link

Boddlnagg commented Nov 16, 2018

I already stated my use case for this in #1210 (comment), and I'd like to add that this does not necessarily require a fully customizable frontend. It would be enough if ninja would (optionally) print information about started jobs in addition to finished jobs, so when I pipe the output to a different process ("custom frontend"), I can still show a status line about the currently running unfinished job(s).

Btw, this PR should be tagged with the frontend label.

@jonesmz

This comment was marked as abuse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants