Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds --columns for Print using custom width #13 #164

Merged
merged 16 commits into from Sep 11, 2022
Merged

Adds --columns for Print using custom width #13 #164

merged 16 commits into from Sep 11, 2022

Conversation

sharifhsn
Copy link
Contributor

@sharifhsn sharifhsn commented Jul 12, 2022

This pull request fixes #13 partially by adding the --columns option to manually set the number of columns. The default is 2 as with previous behavior.

Some notes:

  • the short option is -w not -c as that option is already reserved for --bytes
  • it passes cargo test
  • I could have added the option to automatically set columns to terminal width, but that would require adding another dependency

Copy link
Contributor Author

@sharifhsn sharifhsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixes previous pull request by running cargo fmt.

@sharifhsn
Copy link
Contributor Author

I have a pull request ready for terminal width, but it requires adding the terminal_width dependency. I'll wait to submit it until this request is approved.

src/bin/hexyl.rs Outdated Show resolved Hide resolved
src/squeezer.rs Outdated Show resolved Hide resolved
Copy link
Owner

@sharkdp sharkdp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much. Works great!

Does this have any impact on performance? (see e.g. #66 (comment) on how to perform hexyl benchmarks using hyperfine)

@sharifhsn
Copy link
Contributor Author

Thank you very much. Works great!

Does this have any impact on performance? (see e.g. #66 (comment) on how to perform hexyl benchmarks using hyperfine)

No, I'll do that now. I couldn't find a benchmark harness before, thank you for pointing me to it.

@sharkdp
Copy link
Owner

sharkdp commented Jul 17, 2022

We should probably add a script with a few useful benchmarks, true. For this PR (or in general), I would suggest to use something like:

#!/bin/bash

dd if=/dev/urandom bs=1M count=1 of=/tmp/data

hyperfine \
    --warmup 3 \
    --shell=none \
    --output=pipe \
    --export-markdown=result-pipe.md \
    'hexyl /tmp/data'

hyperfine \
    --warmup 3 \
    --shell=none \
    --output=inherit \
    --export-markdown=result-inherit.md \
    'hexyl /tmp/data'

rm /tmp/data

Here, we use hyperfines new --output option to specify where to redirect the output of the benchmarked program. Using "pipe" is like doing hexyl … | cat > /dev/null (which is much different from doing hexyl … > /dev/null because hexyl doesn't directly write to /dev/null and kernel optimizations don't apply, i.e. write calls actually need to take place). And using "inherit" is like calling the program with the TTY attached (this will output a lot to your screen).

@sharkdp
Copy link
Owner

sharkdp commented Jul 17, 2022

You can also create two binaries, hexyl-master and hexyl-feature, and then run something like:

hyperfine \
    --warmup 3 \
    --shell=none \
    --output=pipe \
    --parameter-list version master,feature
    --export-markdown=result-pipe.md \
    './hexyl-{version} /tmp/data'

to see the direct comparison.

@sharifhsn
Copy link
Contributor Author

Command Mean [ms] Min [ms] Max [ms] Relative
hexyl data 421.3 ± 14.3 404.5 444.0 1.24 ± 0.11
hexyl --no-squeezing data 563.5 ± 14.7 542.3 583.4 1.65 ± 0.15
target/release/hexyl data 433.0 ± 16.0 409.1 462.5 1.27 ± 0.12
target/release/hexyl --no-squeezing data 526.2 ± 18.1 506.6 559.6 1.55 ± 0.14
target/release/hexyl --columns 4 data 367.4 ± 12.7 347.8 382.6 1.08 ± 0.10
target/release/hexyl --columns 4 --no-squeezing data 407.6 ± 14.2 375.5 426.5 1.20 ± 0.11
target/release/hexyl --columns 8 data 340.5 ± 28.6 306.6 399.2 1.00
target/release/hexyl --columns 8 --no-squeezing data 351.9 ± 15.3 318.9 366.2 1.03 ± 0.10

These are my results using 1 MB of /dev/urandom. I used my system-installed binary of hexyl to compare since I'm pretty sure there were no feature changes since the last release.

It seems as though for the default width of two columns the new version is slightly slower, but within noise for the benchmark. With more columns it's much faster, but that of course makes sense since it's writing fewer lines.

@sharifhsn
Copy link
Contributor Author

One more thing; I've implemented columns here as a u8, since I find it unlikely that anyone would ever use over 255 columns in the viewer. This also places a cap on terminal width at 8969. Is this desirable, or should I change the type to u16?

@sharifhsn
Copy link
Contributor Author

I've added the terminal width setting, so this should completely close #13.

@sharkdp
Copy link
Owner

sharkdp commented Jul 17, 2022

One more thing; I've implemented columns here as a u8, since I find it unlikely that anyone would ever use over 255 columns in the viewer. This also places a cap on terminal width at 8969. Is this desirable, or should I change the type to u16?

I wanted to remark this in my first review but didn't want to be overly annoying 😄. In my personal experience: trying to use size-limited (unsigned) integers (u8, u16, i8, i16) in places which are not size- or performance critical is almost always a mistake. It's usually meant as a (premature) optimization. But the reality is that it (1) leads to more errors due to overflows (2) leads to more complicated code due to more explicit casts between 8-bit, 16-bit and 32-bit numbers and (3) doesn't even necessarily generate faster code (because if they are not closely packed, we can't really make use of them) or smaller data structs (due to alignment mismatches).

So personally, for a single parameter, I would never mind trying to go for a small integer type and simply go with a 32-bit integer, or even isize/usize. Often, it's even undesirable to use unsigned integers for parameters that cannot (reasonably) be negative, because again, it might lead to more complicated code, and it can lead to tricky overflow errors.

@sharkdp
Copy link
Owner

sharkdp commented Jul 17, 2022

Feel free to bump the min. supported Rust version if that causes problems.

@sharkdp
Copy link
Owner

sharkdp commented Jul 17, 2022

Concerning the newly added options:

Would it be possible to scratch --auto-width and use something like --columns=max or --columns=auto instead?

Also, I think --terminal-width should be usable with other "width-setting" options. I think of it purely as an overwrite for the auto-detected terminal width. What do you think?

I've added the terminal width setting, so this should completely close #13.

Does it? I believe the issue also asks for custom widths that would not be multiples of 16. Like if you have some data structure with elements that are 14 byte long. Then you might want to specify --width=14.

@sharifhsn
Copy link
Contributor Author

One more thing; I've implemented columns here as a u8, since I find it unlikely that anyone would ever use over 255 columns in the viewer. This also places a cap on terminal width at 8969. Is this desirable, or should I change the type to u16?

I wanted to remark this in my first review but didn't want to be overly annoying smile. In my personal experience: trying to use size-limited (unsigned) integers (u8, u16, i8, i16) in places which are not size- or performance critical is almost always a mistake. It's usually meant as a (premature) optimization. But the reality is that it (1) leads to more errors due to overflows (2) leads to more complicated code due to more explicit casts between 8-bit, 16-bit and 32-bit numbers and (3) doesn't even necessarily generate faster code (because if they are not closely packed, we can't really make use of them) or smaller data structs (due to alignment mismatches).

So personally, for a single parameter, I would never mind trying to go for a small integer type and simply go with a 32-bit integer, or even isize/usize. Often, it's even undesirable to use unsigned integers for parameters that cannot (reasonably) be negative, because again, it might lead to more complicated code, and it can lead to tricky overflow errors.

Thank you for saying this! I am quite new to Rust so I appreciate the advice. Making it a u16 eliminates a cast (since terminal_width uses a u16) so I'll use that instead. Using a bigger value would be more of a hassle than a u16.

src/bin/hexyl.rs Outdated Show resolved Hide resolved
src/lib.rs Outdated
@@ -164,6 +166,7 @@ impl<'a, Writer: Write> Printer<'a, Writer> {
show_position_panel: bool,
border_style: BorderStyle,
use_squeeze: bool,
columns: u8,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking change for hexyl-as-a-library. This whole new method is really unfortunate. There are too many arguments even before this PR. We should really use a builder-pattern instead.

But that's not really the problem of your PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be willing to work on implementing a builder pattern if you're committed to pushing a breaking change anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a commit ready for a builder pattern. It uses a new PrinterBuilder struct, so it can technically be implemented in a non-breaking way, although that's probably undesirable. It's quite out of scope for this pull request, so I'll submit a new one once this one is approved.

src/lib.rs Outdated Show resolved Hide resolved
@sharifhsn
Copy link
Contributor Author

Concerning the newly added options:

Would it be possible to scratch --auto-width and use something like --columns=max or --columns=auto instead?

Of course.

Also, I think --terminal-width should be usable with other "width-setting" options. I think of it purely as an overwrite for the auto-detected terminal width. What do you think?

I was only going off of the original issue's suggestions. The use case for --terminal-width is probably in a situation where the width is necessarily limited for some reason. In that case, I would pick the minimum between the two options i.e. prefer --columns unless the number of columns exceeds that which is dictated by --terminal-width.

I've added the terminal width setting, so this should completely close #13.

Does it? I believe the issue also asks for custom widths that would not be multiples of 16. Like if you have some data structure with elements that are 14 byte long. Then you might want to specify --width=14.

Ah, I wasn't aware that was in the scope of the original issue. That would probably require a more significant rewrite.

@sharkdp
Copy link
Owner

sharkdp commented Jul 17, 2022

Ah, I wasn't aware that was in the scope of the original issue. That would probably require a more significant rewrite.

That's okay. Let's focus on --columns for now.

@sharifhsn
Copy link
Contributor Author

sharifhsn commented Jul 18, 2022

Ok, I think I misunderstood the initial issue. The custom number of columns refers to the individual columns, whereas my pull request focuses on customizing the number of panels. With that in mind, I think I should rename columns to panels and reflect that in the documentation.

A future change that customizes columns per panel would therefore be able to slot into this change better.

@sharkdp
Copy link
Owner

sharkdp commented Aug 14, 2022

Another option would be to leave the option name as is, but change the meaning. And then for this PR, we only allow --columns=8,16,24,…. What do you think?

@sharifhsn
Copy link
Contributor Author

Another option would be to leave the option name as is, but change the meaning. And then for this PR, we only allow --columns=8,16,24,…. What do you think?

I think it would be more ergonomic here to have the two options, columns-per-panel and number-of-panels.

In my mind, the idea behind variable number columns is to have the panel size itself adjust. With the other option, the panel size would always be 8 bytes, and a future ability to have e.g. columns=14 would leave blank space at the end.

It would be better to set columns=7 and panels=2 which is more explicit in my opinion.

@sharkdp
Copy link
Owner

sharkdp commented Aug 14, 2022

You are right.

Let's change the command line option name then 👍🏼

@sharifhsn
Copy link
Contributor Author

@sharkdp I think we can declare this PR complete? I don't see any other roadblocks.

src/bin/hexyl.rs Outdated Show resolved Hide resolved
@sharkdp
Copy link
Owner

sharkdp commented Sep 6, 2022

I have a commit ready for a builder pattern. It uses a new PrinterBuilder struct, so it can technically be implemented in a non-breaking way, although that's probably undesirable. It's quite out of scope for this pull request, so I'll submit a new one once this one is approved.

That would be great! If we do that, I would make that a breaking change, i.e. remove the old new method.

@sharkdp I think we can declare this PR complete? I don't see any other roadblocks.

Yes. Looks great. Two things: the inline comment regarding the short options. And it would be great if you could add a CHANGELOG entry in the "unreleased" section. The format for entries is:

- Description what has been changed, see #123 (@user)

where #123 links to the bug ticket and/or the PR and user is your username.

@sharkdp sharkdp merged commit 9bc8182 into sharkdp:master Sep 11, 2022
@sharkdp
Copy link
Owner

sharkdp commented Sep 11, 2022

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Print using custom width
2 participants