add ANSI writer #9565

Open
wants to merge 14 commits into
base: main

silby

@silby silby commented Mar 12, 2024

This requires jgm/doclayout#28 in addition to the stuff I already got merged; cabal.project does not reflect this. (Update: the required doclayout commit is now added to cabal.project.)

I think that the surface area supported by what I have here is a minimum scope to be worth releasing, but I don't really think this is quite good enough as is to merge. I'm filing a draft PR to get some feedback on the choices and compromises made so far and what the requirements for a shipping ANSI writer should be. I'd really love some ideas on how to format headings! Some of what's here reflects arbitrary design choices that make me happy, like putting a four-column margin on the whole output.

It's a bit slow; rendering MANUAL.txt to ANSI takes 2.77 seconds on my machine, compared to 0.66 seconds for rendering it to HTML. I have an inkling this may be due to the quantity and size of definition lists in MANUAL.txt, but who knows. (Update: the slowness is fixed.)

There are no new tests or docs so far.


The ANSI writer (-t ansi) outputs a document formatted with ANSI control sequences for reading on the console.

Most Pandoc elements are supported and printed in a reasonable way, if not always ideally. This version does no detection of terminal capabilities nor does it fall back to different output styles for less-capable terminals.

Some gory details:

  • Title blocks are formatted with modest extravagance in --standalone mode.
  • Strong, Emph, Underline, and Strikeout spans are all formatted accordingly using SGR codes (which will be silently ignored by terminals that don’t support them).
  • Headings have somewhat arbitrary styles applied to them that probably need immediate improvement.
  • Blockquotes and all flavors of list look pretty good.
  • Code spans are colored magenta-on-white, which on the author’s terminal looks kind of like the pinkish treatment of code spans used by many stylesheets. This probably isn’t a good final decision.
  • Code blocks are formatted by Skylighting’s formatANSI using standard writer options and included directly in the output. This has some issues; see code comments.
  • Links are printed with OSC 8 to create hyperlinks and colored cyan. The author’s terminal automatically adds a dotted-underline to OSC 8 hyperlinks, but only colors them differently on command-mouseover. Setting an underlined style on links may be more broadly accessible. OSC 8 support is not checked for, so on terminals not supporting it or with support disabled, the link text will be colored but not do anything and the links will not be printed.
  • Images are displayed as their alt text. Support for the Kitty and iTerm 2 inline image protocols is planned. Supporting other terminals by using Chafa (https://hpjansson.org/chafa/) to print sixels etc would be cool too but the author would have to do some FFI stuff and it would add a dependency to Pandoc.
  • Tables are replaced with a useless placeholder. Table output using box-drawing characters is desired.
  • Subscripts and Superscripts are just parenthesized when accurate Unicode representations aren’t available. Because these span types could have all kinds of semantics, there’s not an obvious thing to do with them.
  • Simple math is translated to Pandoc inlines using existing functionality. An ambitious person could look into emulating the console-mode math output of a computer algebra system, or rendering each display math element as an image with TeX or Typst and including it, or some other thing.
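
For readers unfamiliar with SGR codes, here is a minimal sketch of what styling Strong, Emph, Underline, and Strikeout spans amounts to at the byte level. It is illustrative only and is not the code in this PR, which builds its output with doclayout; the helper names (`sgr`, `strong`, and so on) are made up for the example. The parameter numbers (1 bold, 3 italic, 4 underline, 9 crossed out, 0 reset) are the standard SGR values.

```haskell
-- Sketch only: hand-rolled SGR wrappers, not the writer's real implementation.

-- | The escape character that introduces every ANSI control sequence.
esc :: String
esc = "\ESC"

-- | Wrap text in one SGR (Select Graphic Rendition) code, then reset all
-- attributes. Note that the reset also cancels any enclosing style, so
-- naive nesting of these helpers loses the outer style after the inner span.
sgr :: Int -> String -> String
sgr code txt = esc ++ "[" ++ show code ++ "m" ++ txt ++ esc ++ "[0m"

-- Hypothetical helpers, one per span type mentioned above.
strong, emph, underline, strikeout :: String -> String
strong    = sgr 1  -- bold
emph      = sgr 3  -- italic
underline = sgr 4  -- underlined
strikeout = sgr 9  -- crossed out
```

Loading this in GHCi and running `putStrLn (strong "bold" ++ " and " ++ emph "italic")` should show the effect on a terminal that supports these attributes.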

@jgm
Owner

jgm commented Mar 13, 2024

Nice! Upload a screenshot?

The slowness does concern me. I think the main application of this would be reading files on your terminal, and a 2.7 sec delay seems too much for that. Can you pin down where the slowdown comes from, by experimentation or profiling?

@silby
Author

silby commented Mar 13, 2024

Ah, there's definitely something with super-linear time complexity going on with renderANSI, I'll try to fix that!

@silby
Author

silby commented Mar 13, 2024

See jgm/doclayout#29 for the performance issue.

@silby
Author

silby commented Mar 15, 2024

Here's a couple screenshots of how things look presently:

MANUAL.txt:

Screenshot 2024-03-15 at 2 08 42 PM

ol, ul, code, strikeout, math, blockquotes
Screenshot 2024-03-15 at 2 11 01 PM

@silby
Author

silby commented Mar 15, 2024

Staring at my first screenshot there I realized DocLayout's line breaking is broken by the styled text, filed jgm/doclayout#30 to track since I don't think I'm going to plow all the way to a solution right away.

@silby
Author

silby commented Mar 23, 2024

Rebased. Please note d664dbe; something about jgm/doclayout#31 has caused a whitespace regression here. Actually I just realized what it probably is; hang 2 (bullet <> space) ultimately is putting an empty doc inside the Prefixed, which is getting eliminated now by doclayout's flatten.

The djot reader parses `-\n` as an empty list item just fine, but nevertheless I'm not sure if tweaking this test is fine or if we should go put the missing space back where it belongs.

@silby silby marked this pull request as ready for review March 23, 2024 01:15
@jgm
Owner

jgm commented Mar 23, 2024

I'm not worried about that space; I think removing it is actually the intended behavior so I'll consider this a bug fix.

@jgm
Owner

jgm commented Mar 23, 2024

Some comments on the design choices:

> Title blocks are formatted with modest extravagance in --standalone mode.

I think the horizontal line with the emoji is too opinionated (even though it looks cool).

> Code blocks are formatted by Skylighting’s formatANSI using standard writer options and included directly in the output. This has some issues; see code comments.

Is the issue just that skylighting produces a fully rendered string? I don't see why that's a problem, necessarily.

> Setting an underlined style on links may be more broadly accessible.

Probably a good idea.

> OSC 8 support is not checked for, so on terminals not supporting it or with support disabled, the link text will be colored but not do anything and the links will not be printed.

Is there any way to check for it?

> Images are displayed as their alt text. Support for the Kitty and iTerm 2 inline image protocols is planned. Supporting other terminals by using Chafa (https://hpjansson.org/chafa/) to print sixels etc would be cool too but the author would have to do some FFI stuff and it would add a dependency to Pandoc.

I'd like to avoid C dependencies.

> Tables are replaced with a useless placeholder. Table output using box-drawing characters is desired.

For placeholder purposes, one option would be to include a table generated by the plain writer. This would give you the content, though without ANSI formatting.

I wonder whether it could work to use the gridTable function from T.P.Writers.Shared? That should in principle get you grid tables with ANSI formatting inside the cells.

@jgm
Owner

jgm commented Mar 23, 2024

> Code spans are colored magenta-on-white, which on the author’s terminal looks kind of like the pinkish treatment of code spans used by many stylesheets. This probably isn’t a good final decision.

Why not use skylighting for these too, for consistency?

@silby
Author

silby commented Mar 23, 2024

> Some comments on the design choices:

> > Title blocks are formatted with modest extravagance in --standalone mode.

> I think the horizontal line with the emoji is too opinionated (even though it looks cool).

Admitted. I'll calm this down.

> > Code blocks are formatted by Skylighting’s formatANSI using standard writer options and included directly in the output. This has some issues; see code comments.

> Is the issue just that skylighting produces a fully rendered string? I don't see why that's a problem, necessarily.

Honestly I only partly remember what I was talking about. The main thing that's subpar is if you select a highlight-style that uses a nondefault background color, you get what I would consider subpar results:

Screenshot 2024-03-23 at 2 12 45 PM

If we could get a Doc back from the highlighting we'd be able to keep the background color bounded to the borders of the code block, cf:

Screenshot 2024-03-23 at 2 17 37 PM

Overall I think the state of affairs with code blocks is fine for starters.

> > Setting an underlined style on links may be more broadly accessible.

> Probably a good idea.

  • Set underlined style on links.

> > OSC 8 support is not checked for, so on terminals not supporting it or with support disabled, the link text will be colored but not do anything and the links will not be printed.

> Is there any way to check for it?

Not really, I guess. Someone has a list of the terminal emulators that are known to support it, but there's no terminfo capability for it or a report control sequence that can tell it to us. That would leave us with the environment-variable equivalent of user-agent sniffing, which will be annoying to maintain and perpetually incomplete.

I think the ANSI writer does need to be able to print the links out visibly, as there are certainly widely used terminal emulators that don't and may never support OSC 8 (Terminal.app, xterm, urxvt). Some users with supporting terminals might not want it anyway. I think it has to be a writer option provided by a CLI flag.
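
To make the two behaviours concrete, here is a minimal sketch of a link renderer with a switch between OSC 8 and a visible-URL fallback. The `LinkStyle` type, the `renderLink` function, and the `label <url>` fallback format are all assumptions invented for this example; no such writer option exists in this PR yet. The OSC 8 framing itself (`ESC ] 8 ; params ; URI ST`, with `ST` written as `ESC \`) follows the published hyperlink convention.

```haskell
-- Sketch only: a hypothetical writer option for link rendering.
data LinkStyle
  = Osc8     -- emit an OSC 8 hyperlink; the URL itself stays invisible
  | Visible  -- print the URL next to the link text

-- | Render a link given its URL and its (already rendered) label.
renderLink :: LinkStyle -> String -> String -> String
renderLink Osc8 url label =
  -- Open the hyperlink (empty params field), print the label, then close
  -- it by sending OSC 8 again with an empty URI.
  "\ESC]8;;" ++ url ++ "\ESC\\" ++ label ++ "\ESC]8;;\ESC\\"
renderLink Visible url label =
  -- Fallback for terminals without OSC 8 support (format is an assumption).
  label ++ " <" ++ url ++ ">"
```

A terminal without OSC 8 support is expected to ignore the sequence and show only the label, which is exactly why the fallback needs to be something the user can choose explicitly.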

> > Images are displayed as their alt text. Support for the Kitty and iTerm 2 inline image protocols is planned. Supporting other terminals by using Chafa (https://hpjansson.org/chafa/) to print sixels etc would be cool too but the author would have to do some FFI stuff and it would add a dependency to Pandoc.

> I'd like to avoid C dependencies.

Valid. I don't think the lofi outputs supported by Chafa are a critical feature for Pandoc. The Kitty and iTerm 2 protocols will probably cover most of the users who would want to see images in the terminal anyway.

> > Tables are replaced with a useless placeholder. Table output using box-drawing characters is desired.

> For placeholder purposes, one option would be to include a table generated by the plain writer. This would give you the content, though without ANSI formatting.

> I wonder whether it could work to use the gridTable function from T.P.Writers.Shared? That should in principle get you grid tables with ANSI formatting inside the cells.

  • Try using gridTable for a first iteration of table output

@jgm
Owner

jgm commented Mar 23, 2024

> The main thing that's subpar is if you select a highlight-style that uses a nondefault background color, you get what I would consider subpar results

Ah yes, that is bad. I'd be open to having skylighting-format-ansi export a function that yields a Doc Text.

@jgm
Owner

jgm commented Mar 24, 2024

Another thought about the 4-space indentation on body text. This echoes the formatting of man pages, and it looks good and familiar. But one drawback is that it makes it hard to copy/paste content from a document you are viewing this way -- since you'll get these initial indents.

@silby
Author

silby commented Mar 25, 2024

I think the copy-paste drawback you mention is of limited relevance to the ANSI writer use-cases. The analogy to viewing man pages is apt; I think the ANSI writer should optimize for reading over other considerations. Even if there weren't an indented margin, copy-pasting would still be a bit annoying due to hard breaks in the formatted output. At least in Pandoc if someone wants a chunk of more pasteable text you can rerun with -t plain --wrap=none; I don't think there's a way to get unwrapped/unindented text output from mandoc for this purpose.

@silby
Author

silby commented Mar 26, 2024

> > Code spans are colored magenta-on-white, which on the author’s terminal looks kind of like the pinkish treatment of code spans used by many stylesheets. This probably isn’t a good final decision.

> Why not use skylighting for these too, for consistency?

I see relevant examples in other writers that support syntax highlighting. I sort of assume code spans marked up with their language are rare but it makes sense to provide as much support for it here as elsewhere. We'll still need some kind of fallback for generic code spans. The basic issue there of course is that all the text in the terminal is already fixed-width, so we can't use that for contrast.

Magenta-on-white sort of resembles what Slack does with code/fixed-width spans:

Screenshot 2024-03-25 at 7 58 46 PM

The GitHub CLI uses Glamour to render gfm and its code spans default to bright red on dark red, plus an extra space's worth of padding on each end of the span for additional contrast.

Screenshot 2024-03-25 at 8 01 32 PM

I want to land on something here that doesn't require me to get to jgm/doclayout#32 just yet, because I uh, don't want to work on it right now.

Any particular preferences or suggestions?

@jgm
Owner

jgm commented Mar 26, 2024

Copy-paste: I'm not sure. I certainly have wanted to copy code from man pages and READMEs before, and wouldn't have wanted to context shift, exit man, fire up a new program, and find the relevant section again. Even if we're optimizing for reading, copying text IS something one often wants to do when reading a document.

Certainly if I wanted to copy some text from a Word or HTML document or EPUB, I'd rather fire up pandoc -t ansi in the terminal and copy it from that instead of starting Word or a browser or an ebook reader.

Note also that eliminating the 4-space indent would make part of the problem with highlighted code blocks with backgrounds go away. (There would still be the odd color shift at the end of the first line, which I don't really understand....maybe this is something that needs to be fixed in skylighting-format-ansi?)

@silby
Author

silby commented Mar 27, 2024

> Note also that eliminating the 4-space indent would make part of the problem with highlighted code blocks with backgrounds go away. (There would still be the odd color shift at the end of the first line, which I don't really understand....maybe this is something that needs to be fixed in skylighting-format-ansi?)

I'm not positive I totally understand it either. Removing nest 4 from the body yields a code block that looks like this, in less -R,

Screenshot 2024-03-26 at 4 25 29 PM

but like the prior screenshot, when echoing directly to the terminal.

I think the different behavior of less is due to this documented behavior of less -R:

> Color escape sequences are only supported when the color is
> changed within one line, not across lines.  In other words, the beginning of each
> line is assumed to be normal (non-colored), regardless of any escape sequences in
> previous lines.

(wow there I go copy pasting from the man page)

So I interpret this to mean that less imputes a default-color control sequence at each \n in the input, in contrast to what my terminal is doing.

In any case, the best thing to do is use doclayout to put code blocks into an actual rectangle but as noted I think the current state is acceptable. The default pygments highlight style appears to stick to the default background color.
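
A minimal sketch of the "actual rectangle" idea, written against plain strings rather than doclayout: every line is padded to the block's width and carries its own background sequence and reset, so no color state crosses a newline and less -R and a bare terminal should agree. The function name `boxed` is made up, and the sketch assumes the input lines contain no escape sequences or wide characters, so that `length` equals the display width; highlighted output from skylighting would not satisfy that assumption.

```haskell
-- Sketch only: plain-string version of bounding a code block's background.

-- | Pad each line to the width of the longest line and wrap it in its own
-- background-colour SGR sequence followed by a reset.
-- The first argument is an SGR background parameter, e.g. 47 for white.
boxed :: Int -> [String] -> [String]
boxed bg ls = map paint ls
  where
    -- Width of the rectangle; 0 keeps 'maximum' safe on an empty block.
    width = maximum (0 : map length ls)
    paint l =
      "\ESC[" ++ show bg ++ "m"               -- start background on this line
        ++ l
        ++ replicate (width - length l) ' '  -- pad to the right edge
        ++ "\ESC[0m"                          -- reset before the newline
```

For instance, `putStr (unlines (boxed 47 ["main = do", "  print 1"]))` should draw both lines on a background of the same width.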


I don't want to bikeshed the matter of the margin; I can drop it from this PR with the idea that it makes sense to err on the side of less-opinionated output at least to begin with.

That does make the question of how to style headings a bit more of a headscratcher. What's here so far isn't thoroughly worked out but shifting the headings into the margin like man output was at least an available option. Since we can't make headings larger or use a different font family, it's not obvious how to create hierarchy.

Lynx's defaults are not at all exuberant:

Screenshot 2024-03-26 at 5 09 23 PM

The style used by the Charm.sh folks in https://github.com/charmbracelet/gum uses box-drawing characters for this in a way that seems fairly sensible, on top of a color contrast. I could take or leave the color part. (They have a 2-column left margin too, heh.)

Screenshot 2024-03-26 at 4 51 57 PM

We could try modestly more spacious decorations with box drawing characters as well, cf.

┌╴
╵ Heading 1 ╷
           ╶┘

We could even take advantage of markdown's familiarity and put the appropriate number of # before each heading.

Converting H1s to all-caps would match how man pages are written, but Pandoc doesn't really do any such thing by default elsewhere, and it makes no difference for many scripts.

Do you have any inclinations on this? I think getting headings right is the big remaining design decision I want to make.

@jgm
Owner

jgm commented Mar 30, 2024

One question is whether it's necessary to distinguish different heading levels visually. Personally, I think it's okay not to. Using indentation would give you a way to distinguish a couple of levels of heading, but not six (unless you indent the body text really far, or do what lynx seems to do, which is just ugly -- with the heading indented more than the contained body text).

Putting the heading in boldface and a different color from the body text definitely sets it apart as a heading, even without indentation.

I don't like the box-drawing character options much.

For reasons not currently clear to me(!), the tip of doclayout consumes
the breaking space that is followed by nothing (i.e. the ignored raw
LaTeX inline). The tests of djoths itself still pass.
@silby
Author

silby commented Apr 3, 2024

Added a commit with what I think is a reasonably good approach to headings, which is marking them with §. Only h1-h3 get this treatment; h4-h6 are indented and italicized but I think in any reasonable document hierarchy making them more distinctive with more § is too much ink.

Screenshot 2024-04-02 at 5 00 01 PM

@silby
Author

silby commented Apr 3, 2024

rebased and force-pushed.

@jgm
Owner

jgm commented Apr 3, 2024

I'm not a big fan of the new idea for sections. Having section headings indented to the right relative to body text just seems wrong. And I'm not sure I like the section symbols.

Would it be so bad just to make all headings boldface, flush-left, with space above and below? You wouldn't then be able to distinguish heading levels, but how big a drawback is that? If you have a document with many levels and need to keep track of them, you could always render with --number-sections (assuming your writer implements that), and it would make the levels very clear with numbering.

Another option would be, e.g.,

level 1 - bold, underlined, all caps
level 2 - bold, underlined
level 3 - bold
level 4 - underlined
level 5 - italics

or something like that. (Though personally I think this might be ugly.)

@silby
Author

silby commented Apr 3, 2024

Numbered sections aren't implemented so far. I can spend some time on it, but I think it could go in the backlog for after merge.

I think the minimum I want to ship with is H1s and H2s that are distinguishable from each other and from H3-H6. At this point that'd probably be bold+caps for H1, bold for H2, italic for H3-H6. This is slightly worse for scripts without capitals but I guess most such scripts also commonly don't have italics, so there's some degradation anyway. The remaining option which I have mostly discounted til now is to add a color; since I've used it sparingly elsewhere I think a contrasting color shared by all headings might be helpful after all.

Since you are pretty firmly on the side of keeping it simple I'll do another iteration with something like the above.

@jgm
Owner

jgm commented Apr 3, 2024

That sounds reasonable. I think the idea of having a color for the headings is a good one. (Might as well be the same for all levels, because color isn't a good way to distinguish heading levels.) The scheme you suggest is fine with me. Anyway, we can get a version of this out and see what people say...the heading scheme can always be tweaked in the future.

@jgm
Owner

jgm commented Apr 3, 2024

Implementing numbering shouldn't be too complicated.

The basic approach is to do something like

let blocks' = makeSections (writerNumberSections opts) Nothing blocks

where blocks are the body blocks of the document and opts are the writer options.
If writerNumberSections opts is True, this will add number attributes to the Header elements, and then you can just have your renderer for headings print these when present, perhaps in a special color to distinguish them from the heading text. Some extra Divs will be added in blocks', but your renderer can just ignore these (as I assume it does already).
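
To illustrate the kind of numbers a heading renderer would end up printing, here is a small base-only sketch that assigns dotted section numbers to a sequence of heading levels. It is not pandoc's makeSections and makes no claim to match it in edge cases; the function `numberHeadings` is invented for the example, and it assumes every level is at least 1.

```haskell
import Data.List (intercalate)

-- Sketch only: a stand-alone illustration of dotted section numbering.

-- | Given heading levels in document order (each >= 1), return the dotted
-- number for each heading, e.g. "1", "1.1", "1.2", "2".
numberHeadings :: [Int] -> [String]
numberHeadings = go []
  where
    go _ [] = []
    go counters (lvl : rest) =
      let -- Truncate (or zero-extend) the counters to the current depth ...
          padded    = take lvl (counters ++ repeat 0)
          -- ... then bump the counter for the current level.
          counters' = init padded ++ [last padded + 1]
      in intercalate "." (map show counters') : go counters' rest
```

In the real writer the numbers would simply be read off the Header attributes that makeSections adds, so none of this bookkeeping would live in the ANSI writer itself.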

All headings at the document top-level are green. Headings inner to
structures like blockquotes and lists are not. H1 is bold and all caps.
H2 is bold. H3-H6 are italic. The design here is meant to be relatively
boring/simple and allow telling H1 and H2 apart from each other and from
the remaining heading levels.
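
The scheme described in that commit message can be sketched as follows. This is a string-level illustration under stated assumptions, not the writer's code: the function name `heading` is made up, the SGR parameters used are 1 (bold), 3 (italic), and 32 (green foreground), and the sketch ignores the distinction between top-level and nested headings.

```haskell
import Data.Char (toUpper)

-- Sketch only: H1 bold and all caps, H2 bold, H3-H6 italic, all in green.
heading :: Int -> String -> String
heading level txt = "\ESC[" ++ codes ++ "m" ++ body ++ "\ESC[0m"
  where
    green = "32"
    (style, body)
      | level == 1 = ("1", map toUpper txt)  -- bold, all caps
      | level == 2 = ("1", txt)              -- bold
      | otherwise  = ("3", txt)              -- italic for H3-H6
    -- Several SGR parameters can share one sequence, separated by ';'.
    codes = style ++ ";" ++ green
```

As noted in the discussion, `toUpper` changes nothing for scripts without capital letters, so H1 and H2 look the same there.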
@jgm
Owner

jgm commented Apr 5, 2024

I tried this out; I think it's shaping up nicely.

One thought is that we could ultimately make things like colors and styles for section headings and code configurable. It's going to be hard to satisfy everyone's tastes; maybe in this way we could even make it possible to have indented text and unindented titles as you prefer. But this could be for the future.

I note that the colored background on inline code seems to have a kind of "padding"; it extends a character width ahead and behind. Is that intentional? I'm not sure I like that, but maybe I would if I saw the alternative. (One thing I don't like is that it might mislead people into thinking that the quoted code includes a space.)

When I render the pandoc manual, the code blocks with no syntax specified have no formatting whatsoever, which seems non-ideal. Not sure what the best approach is here. One idea would be to put all code blocks in a box with line-drawing characters, but of course that inhibits cut/paste.

Another possibility would be to use a different stroke color by default for code, including inline code and unhighlighted code blocks. To my eye, using a different foreground color would produce a more harmonious appearance than using a background color for inline code, especially if that extra padding is needed to make it look good.

Tables are still not rendered. I think we need some support for tables before this can be shipped.

@silby
Author

silby commented Apr 6, 2024

Thx for the comments! Going on vacation for a week, will work on wrapping this up when I’m back.

@silby
Author

silby commented Apr 16, 2024

> One thought is that we could ultimately make things like colors and styles for section headings and code configurable. It's going to be hard to satisfy everyone's tastes; maybe in this way we could even make it possible to have indented text and unindented titles as you prefer. But this could be for the future.

Yes, I would like for this to be configurable by authors/users somehow but I have only thought briefly about a UI. Presumably something like a stylesheet could be encoded into document metadata or a defaults file.

> I note that the colored background on inline code seems to have a kind of "padding"; it extends a character width ahead and behind. Is that intentional? I'm not sure I like that, but maybe I would if I saw the alternative. (One thing I don't like is that it might mislead people into thinking that the quoted code includes a space.)

This is cribbed from what the Charm family of products does with inline code spans. Them:

Screenshot 2024-04-16 at 11 44 27 AM

Us with:

Screenshot 2024-04-16 at 11 48 17 AM

Us without:

Screenshot 2024-04-16 at 11 47 09 AM

No strong feelings as to the default here, just trying stuff out.

> When I render the pandoc manual, the code blocks with no syntax specified have no formatting whatsoever, which seems non-ideal. Not sure what the best approach is here. One idea would be to put all code blocks in a box with line-drawing characters, but of course that inhibits cut/paste.

Yeah that needs some attention. I already have blockquotes using box-drawing characters for the left margin. I'll try some stuff.

> Another possibility would be to use a different stroke color by default for code, including inline code and unhighlighted code blocks. To my eye, using a different foreground color would produce a more harmonious appearance than using a background color for inline code, especially if that extra padding is needed to make it look good.

My sense is that having a different background color for inline code is common in onscreen styles; the technical audience reading docs with inline code using this writer will have some familiarity with that convention, not least from GitHub itself. See also, Slack:

Screenshot 2024-04-16 at 11 55 55 AM

StackOverflow:

Screenshot 2024-04-16 at 11 57 04 AM

Without being able to switch font families, I still think setting a stroke and background color for inline Code is our best default, with or without the extra horizontal padding for visual weight. This is also the only part of our "stylesheet" so far using other background colors (apart from whatever skylighting does), which I think is good.

> Tables are still not rendered. I think we need some support for tables before this can be shipped.

Will get a handle on this as well.

@silby
Author

silby commented Apr 16, 2024

Here's a possibility for code blocks that uses dashed box drawing borders, red fg for unhighlighted code, and a λ as a sigil. (Inspired by the pandoc.vim conceal mode for fenced code blocks.)

Screenshot 2024-04-16 at 12 18 08 PM Screenshot 2024-04-16 at 12 20 42 PM

@jgm
Owner

jgm commented Apr 16, 2024

My worry about using box drawing characters for code blocks is that it makes it hard to cut and paste from them. (This again!) A line above and below would be compatible with this goal, though -- maybe that's worth trying, but I can also see drawbacks to wasting the vertical space.

A third idea would be to simply use a very light background color for code blocks (whether highlighted or not). I'm not sure this is even possible with the highlighted blocks, since we just get them as an ANSI blob.

@jgm
Owner

jgm commented Apr 16, 2024

I think that for inline code I prefer the non-padded version. Ideally one would like something in between, as they have in Slack and GitHub, but that's not available in ANSI.

@silby
Author

silby commented Apr 17, 2024

> I think that for inline code I prefer the non-padded version. Ideally one would like something in between, as they have in Slack and GitHub, but that's not available in ANSI.

Yeah I'm fine dropping the padding.

> My worry about using box drawing characters for code blocks is that it makes it hard to cut and paste from them. (This again!) A line above and below would be compatible with this goal, though -- maybe that's worth trying, but I can also see drawbacks to wasting the vertical space.

I remain unconvinced that ease of copy-pasting deserves this much priority as we make decisions about the default styling. I see the use-case of the ANSI writer as allowing easy reading of styled documents in the terminal. If there are two just-as-good alternatives for styling an element I don't mind opting for the one that makes it easier to copy-paste. Note that when authors add .numberLines to code blocks, that will unavoidably make code less copy-pasteable in this output format anyway.

Nevertheless, this seems reasonable:

Screenshot 2024-04-17 at 10 54 24 AM

If you wanted to apply the same design idea to blockquotes, instead of using the vertical line in the margin, you could perhaps stick “” where the lambda is in the code block listing. Not an unheard of styling of blockquotes on the web. But I still think the line on the side is more familiar for that case, cf. GitHub, email clients, etc.

@jgm
Owner

jgm commented Apr 17, 2024

> I remain unconvinced that ease of copy-pasting deserves this much priority as we make decisions about the default styling. I see the use-case of the ANSI writer as allowing easy reading of styled documents in the terminal.

Well, we already rejected your original way of styling headings (with indented body text) for this reason. So if we wanted to revisit that desideratum, we'd need to revisit that as well.

I'm also not sure how much weight to give this.

As for the horizontal lines, I think the extra vertical space and noise is going to be a bit awkward with short code blocks (e.g. one-liners, which are not uncommon in manuals and the like). I also think the lambda is too opinionated. Would the difference in text color itself be enough to mark this off as a code block?

@silby
Author

silby commented Apr 17, 2024

> As for the horizontal lines, I think the extra vertical space and noise is going to be a bit awkward with short code blocks (e.g. one-liners, which are not uncommon in manuals and the like). I also think the lambda is too opinionated. Would the difference in text color itself be enough to mark this off as a code block?

I think indenting code blocks and coloring un-highlighted ones would be fine. The indent tracks with what the plain writer does.

@jgm
Owner

jgm commented Apr 18, 2024

Do you mean that code blocks that don't have syntax highlighting would be indented, but those that do wouldn't? Or that both would be indented?

@silby
Author

silby commented Apr 18, 2024

Both indented. Seems like it should be consistent in that regard.

@jgm
Owner

jgm commented Apr 18, 2024

If we do take this approach (indented code blocks), then we could also reconsider about headings (assuming you think your original indented approach was better -- I'm not sure what I think about that).

@silby
Author

silby commented May 2, 2024

fyi back on this finally, thinking about table output right now. I'd like to implement it translating from the Table structure to DocLayout blocks and have it look a little bit fancy instead of relying on T.P.W.Shared.gridTable. Handling row and column spans the "correct" (HTML) way seems like it could be annoying to implement, you'd need to map table cells to "slots" on a logical grid and then recursively build up contiguous rectangles. I bet that's a textbook algorithm but I sure don't know what the name of it would be. I might just do what gridTable does and treat spanned cells as blanks without actually letting the spanning content occupy those cells.
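
The first half of that idea, mapping cells to slots on a logical grid, can be sketched in a few lines. This is a simplified illustration and not a full HTML table layout algorithm: the `Cell` type and `placeCells` function are hypothetical, spans are assumed to be at least 1, overlapping spans in malformed tables are not handled, and the harder second step (composing cell contents into lines of text) is left out entirely.

```haskell
-- Sketch only: assign each table cell the grid slot of its top-left corner.

-- | A cell with its row span, column span (both >= 1), and some content.
data Cell = Cell { rowSpan :: Int, colSpan :: Int, content :: String }

-- | For each cell, in source order, the (row, column) of its top-left slot.
placeCells :: [[Cell]] -> [((Int, Int), Cell)]
placeCells = go 0 []
  where
    -- 'taken' lists every slot already covered by a previously placed cell.
    go _ _ [] = []
    go r taken (row : rest) =
      let (placed, taken') = placeRow r 0 taken row
      in placed ++ go (r + 1) taken' rest

    placeRow _ _ taken [] = ([], taken)
    placeRow r c taken (cell : cells)
      -- Slot occupied by a span from an earlier row: try the next column.
      | (r, c) `elem` taken = placeRow r (c + 1) taken (cell : cells)
      | otherwise =
          let covered = [ (r + dr, c + dc)
                        | dr <- [0 .. rowSpan cell - 1]
                        , dc <- [0 .. colSpan cell - 1] ]
              (more, taken') =
                placeRow r (c + colSpan cell) (covered ++ taken) cells
          in (((r, c), cell) : more, taken')
```

With a first row of a two-row-tall cell followed by a normal cell, and a second row containing one cell, the second-row cell lands in column 1 because column 0 is still covered by the span from above.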

@jgm
Owner

jgm commented May 2, 2024

> you'd need to map table cells to "slots" on a logical grid and then recursively build up contiguous rectangles

I had the impression that we have a function in pandoc that does this, though I wasn't involved with this code. @tarleb is that right?

@tarleb
Collaborator

tarleb commented May 2, 2024

Writing that algorithm is still on my todo list, unfortunately.

@silby
Author

silby commented May 3, 2024

Realized the notion I had for how to do this with existing doclayout capabilities isn't general enough anyway. The hard thing here is being able to take tables like this and compose the cell contents into lines of text.

Screenshot 2024-05-03 at 3 06 03 PM

You can't ask for that in Pandoc's markdown I don't think but the HTML reader will parse them fine, it'd be nice if I can do something "correct" here even if it's annoying code to write.

Edit: Started trying to write an imperative Python version of this and it's already a bit of a trial trying to figure out what you need to keep track of. I might not have as much passion for correctness as I thought.


3 participants