Support raw output #219

Open
jess-sol opened this issue Mar 20, 2024 · 2 comments

Comments

@jess-sol

Not sure if this makes sense for Hexyl to support, but since it already has -s and -c, it'd be really nice to be able to output raw binary data. That would make it quick to trim a large binary down to a small one after locating the portion you want. If there's interest in a --raw flag, I'd be happy to implement it.

@sharkdp
Owner

sharkdp commented Apr 7, 2024

That sounds like an interesting idea! I think it would be a nice feature to have, if it can be cleanly integrated into the code base.

We should maybe also research what other tools do (hexdump, xxd). I think they provide options to turn their output back into binary, which might be even more powerful, because you can build pipelines and integrate other tools as well.

@jess-sol
Author

jess-sol commented Apr 7, 2024

So I did a bit more digging on how other tools do it.

xxd supports this through the -r option:

-r | -revert
  Reverse operation: convert (or patch) hex dump into binary. If not writing to stdout, xxd writes into its output file without truncating it. Use the combination -r -p to read plain hexadecimal dumps without line number information and without a particular column layout. Additional whitespace and line breaks are allowed anywhere. Use the combination -r -b to read a bits dump instead of a hex dump.

Basically, the expectation is that you feed the output of a previous xxd command into xxd -r to get the original file back. Formatting options given to the first invocation must also be given to the second, for example:

xxd input | xxd -r
xxd -b input | xxd -br
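
To make the plain-dump case (-r -p) concrete, here is a minimal sketch of what reverting such a dump involves: drop all whitespace, then pair up hex digits into bytes. This is not xxd's actual code. `revert_plain_hex` is a hypothetical helper, and rejecting malformed input outright is my own assumption rather than something the man page specifies.

```rust
/// Decode a plain hex dump (digits only, whitespace allowed anywhere)
/// into bytes. Returns `None` if a non-hex character is found or the
/// number of digits is odd. Hypothetical helper, not part of xxd or hexyl.
fn revert_plain_hex(dump: &str) -> Option<Vec<u8>> {
    // Convert every non-whitespace character to its 4-bit value,
    // failing as soon as one of them is not a hex digit.
    let digits: Vec<u8> = dump
        .chars()
        .filter(|c| !c.is_whitespace())
        .map(|c| c.to_digit(16).map(|d| d as u8))
        .collect::<Option<Vec<u8>>>()?;

    // Two hex digits make one byte, so an odd count cannot be decoded.
    if digits.len() % 2 != 0 {
        return None;
    }

    // Combine each (high, low) pair of nibbles into one byte.
    Some(digits.chunks(2).map(|pair| (pair[0] << 4) | pair[1]).collect())
}

fn main() {
    // Spaces and line breaks in the dump are ignored.
    let bytes = revert_plain_hex("6865 78\n796c").expect("valid hex");
    assert_eq!(bytes, b"hexyl".to_vec());
}
```

Reverting the default xxd layout is more involved, since the offset column and the ASCII column have to be recognized and skipped as well.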

Hexdump, on the other hand, has a very powerful output formatting syntax. It provides a way to split output into groups of bytes, and consume/format some number of groups. This is how you'd output the raw binary:

hexdump input -ve '1/1 "%c"' # Number of groups / Number of bytes
hexdump input -ve '"%c"' # 1/1 elided

Hexdump's output formatting is powerful (see some examples in a SUSE blog post), though for most uses these days it seems more practical to write a Python script than a hexdump format file.

I think a --raw output would be the simplest code-wise. It would just require bailing out early and copying the reader to stdout after skip/take. I could see an argument for adding a custom output format option similar to hexdump's (though ideally simplified). In that case it'd make sense to collect some use cases that Hexyl would want to support and design towards those.
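
A rough sketch of the "skip, take, copy" idea, using only the Rust standard library. This is not Hexyl's code: `copy_raw` and its parameters are hypothetical names, and it assumes a seekable input (a file or an in-memory cursor). Non-seekable input such as a pipe would need the skipped bytes to be read and discarded instead.

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Copy at most `length` bytes (or everything up to EOF if `None`) from
/// `reader` to `writer`, starting `skip` bytes into the input.
/// Returns the number of bytes written. Hypothetical helper.
fn copy_raw<R: Read + Seek, W: Write>(
    mut reader: R,
    mut writer: W,
    skip: u64,
    length: Option<u64>,
) -> io::Result<u64> {
    // Skip: move to the requested offset.
    reader.seek(SeekFrom::Start(skip))?;

    // Take + copy: stream the selected bytes through unchanged.
    let written = match length {
        Some(n) => io::copy(&mut reader.take(n), &mut writer)?,
        None => io::copy(&mut reader, &mut writer)?,
    };
    writer.flush()?;
    Ok(written)
}

fn main() -> io::Result<()> {
    // Stand-in for a large binary; a real tool would open a file here.
    let input = io::Cursor::new(b"0123456789abcdef".to_vec());
    let stdout = io::stdout();
    // Roughly what `-s 4 -c 6` plus the proposed `--raw` would do.
    copy_raw(input, stdout.lock(), 4, Some(6))?;
    Ok(())
}
```

Since the bytes are passed through untouched, the output can be redirected to a file or piped into another tool.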
