Add support for Tamp (low-memory intended for embedded targets) #135

BrianPugh · 2024-02-25T21:11:55Z

Tamp is a low-memory, DEFLATE-inspired lossless compression library intended for embedded targets. It is aimed at situations where heatshrink would previously have been used. Tamp offers higher compression ratios, better tooling, a better API, small firmware size, and a small memory footprint (barely larger than the window buffer). Tamp has an easy-to-install CLI, a Python library, and a C implementation.

Tamp's design priorities, in order, are:

1. Low memory usage.
2. Good compression ratios.
3. Small firmware size.

Here's example output from running the following on an M1 MacBook Air; the typical use case is level 10:

$ ./lzbench -t16,16 -etamp silesia.tar
lzbench 1.8 (64-bit MacOS)  (null)
Assembled by P.Skibinski

Compressor name         Compress. Decompress. Compr. size  Ratio Filename
memcpy                  29634 MB/s 29552 MB/s   211975168 100.00 silesia.tar
tamp 1.3.1 -8            13.0 MB/s   188 MB/s   114873328  54.19 silesia.tar
tamp 1.3.1 -9            10.2 MB/s   196 MB/s   107632159  50.78 silesia.tar
tamp 1.3.1 -10           6.24 MB/s   202 MB/s   102660646  48.43 silesia.tar
tamp 1.3.1 -11           3.67 MB/s   206 MB/s    99280285  46.84 silesia.tar
tamp 1.3.1 -12           2.16 MB/s   216 MB/s    95567672  45.08 silesia.tar
tamp 1.3.1 -13           1.23 MB/s   227 MB/s    93114331  43.93 silesia.tar
tamp 1.3.1 -14           0.71 MB/s   236 MB/s    91469264  43.15 silesia.tar
tamp 1.3.1 -15           0.40 MB/s   243 MB/s    90653607  42.77 silesia.tar
done... (cIters=1 dIters=1 cTime=16.0 dTime=16.0 chunkSize=1706MB cSpeed=0MB)

tansy · 2024-03-10T01:30:17Z

It cannot decompress its own 'files'.
Also levels below 8 crash.

BrianPugh · 2024-03-10T03:08:56Z

It cannot decompress its own 'files'.

Can you elaborate? What issues are you seeing?

Also levels below 8 crash.

This is intentional; in Tamp, levels below 8 are invalid. This is why the range begins at 8 in lzbench.h.

tansy · 2024-03-10T18:07:36Z

It seems unable to correctly decompress its own compressed data. lzbench checks the correctness of the decompressed result, and if it differs from the original, it reports an error.
In simple terms: your decompressor does not decompress its own compressed stream back into the original data.

$ lzbench-tamp -etamp reymont 
lzbench 1.8 (32-bit Linux)

Compressor name         Compress. Decompress. Compr. size  Ratio Filename
tamp 1.3.1 -8            5.12 MB/s      ERROR     3281840  49.52 reymont
tamp 1.3.1 -9            3.38 MB/s      ERROR     3014779  45.49 reymont
tamp 1.3.1 -10           2.14 MB/s      ERROR     2847981  42.97 reymont
^C
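
The correctness check described above amounts to a round-trip comparison. The following is a minimal sketch of that idea, not lzbench's actual code; the function name `roundtrip_matches` is hypothetical. It treats a length mismatch as a failure before comparing any bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true only when the decoded buffer has the
 * same length and the same bytes as the original input. */
static bool roundtrip_matches(const unsigned char *original, size_t original_len,
                              const unsigned char *decoded, size_t decoded_len) {
    assert(original != NULL && decoded != NULL);
    if (decoded_len != original_len) {
        return false;  /* a wrong reported size is already a failure */
    }
    return memcmp(original, decoded, original_len) == 0;
}
```

A benchmark harness would print something like ERROR in the decompression column whenever a check of this kind returns false.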

BrianPugh · 2024-03-11T14:37:19Z

I know it's not a great response, but it works fine on my M1 macOS machine :D

I'll try to replicate it when I get my hands on my Linux box in a week.

BrianPugh · 2024-03-22T22:18:24Z

I just ran this on a 64-bit Linux machine without issues:

$ ./lzbench -t16,16 -etamp silesia.tar
lzbench 1.8 (64-bit Linux)  Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz
Assembled by P.Skibinski

Compressor name         Compress. Decompress. Compr. size  Ratio Filename
memcpy                  13128 MB/s 13567 MB/s   211975168 100.00 silesia.tar
tamp 1.3.1 -8            9.19 MB/s   101 MB/s   114873328  54.19 silesia.tar
tamp 1.3.1 -9            5.82 MB/s   104 MB/s   107632159  50.78 silesia.tar
tamp 1.3.1 -10           3.53 MB/s   109 MB/s   102660646  48.43 silesia.tar

I then cross-compiled it for 32-bit:

$ ./lzbench -t16,16 -etamp reymont
lzbench 1.8 (32-bit Linux)  Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz
Assembled by P.Skibinski

Compressor name         Compress. Decompress. Compr. size  Ratio Filename
memcpy                  15503 MB/s 15493 MB/s     6627202 100.00 reymont
tamp 1.3.1 -8            9.16 MB/s      ERROR     3281840  49.52 reymont
tamp 1.3.1 -9            6.23 MB/s      ERROR     3014779  45.49 reymont
tamp 1.3.1 -10           4.00 MB/s      ERROR     2847981  42.97 reymont

Investigating what the cause could be.

BrianPugh · 2024-03-22T22:30:00Z

@tansy this should be fixed now by 2a9d7d9. The issue was that I was passing the address of the reported size, an int64_t, as a size_t *. That isn't a problem if the int64_t is initialized to 0, but it was never explicitly initialized, so on a 32-bit build its upper 4 bytes held whatever garbage was on the stack. Because the code wrote through a size_t * pointer, only the lower 4 bytes were being updated.
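
The failure mode can be reproduced in isolation. The sketch below is illustrative only and is not the code from 2a9d7d9: `size32_t`, `report_size`, `buggy_reported_size`, and `fixed_reported_size` are hypothetical names, a `uint32_t` stands in for a 32-bit `size_t` so the behavior is the same on any host, and `memcpy` stands in for the mismatched pointer cast. The "fixed" variant shows one common way to avoid the problem, namely a correctly typed temporary that is widened by value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for size_t on a 32-bit target (assumption for this sketch). */
typedef uint32_t size32_t;

/* Hypothetical callee that reports a byte count through a 32-bit out-param. */
static void report_size(size32_t *out_size, size32_t produced) {
    *out_size = produced;
}

/* Buggy pattern: the 64-bit result starts out holding "stack garbage",
 * and only 4 of its 8 bytes are ever overwritten. */
static int64_t buggy_reported_size(int64_t garbage, size32_t produced) {
    int64_t result = garbage;  /* simulates an uninitialized local */
    size32_t partial;
    assert(sizeof partial < sizeof result);
    report_size(&partial, produced);
    memcpy(&result, &partial, sizeof partial);  /* first 4 bytes only */
    return result;
}

/* One possible fix: write into a correctly typed temporary, then widen. */
static int64_t fixed_reported_size(size32_t produced) {
    size32_t tmp = 0;
    report_size(&tmp, produced);
    return (int64_t)tmp;
}
```

On a little-endian machine, `buggy_reported_size(0, n)` happens to return `n`, which matches the observation that a zero-initialized variable hides the bug; any other starting value leaves stale bytes in the result.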

tansy · 2024-03-24T08:30:14Z

Yes, it works now.

tansy · 2024-03-24T09:51:26Z

This is intentional; in Tamp, levels below 8 are invalid

What's the rationale behind that?

BrianPugh · 2024-03-24T16:50:10Z

With Tamp, the compression level corresponds directly to the window size. Tamp's header uses 3 bits to represent the window size:

Number of bits, minus 8, used to represent the size of the shifting window. E.g., a 12-bit window is encoded as the number 4, 0b100. This means the smallest window is 256 bytes, and the largest is 32768 bytes.

For the API, it was decided that the user should just provide values in the range [8, 15] instead of [0, 7], as those values are more meaningful.
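
The mapping between the user-facing value and the 3-bit header field is plain arithmetic. Here is a small sketch of it; the constant and function names are hypothetical and are not part of Tamp's API, and the sketch says nothing about where the field sits inside the header byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch: the user-facing range [8, 15]. */
enum { SKETCH_MIN_WINDOW_BITS = 8, SKETCH_MAX_WINDOW_BITS = 15 };

/* User-facing window bits (8..15) -> 3-bit header field (0..7). */
static uint8_t window_bits_to_field(int window_bits) {
    assert(window_bits >= SKETCH_MIN_WINDOW_BITS &&
           window_bits <= SKETCH_MAX_WINDOW_BITS);
    return (uint8_t)(window_bits - SKETCH_MIN_WINDOW_BITS);
}

/* 3-bit header field (0..7) -> user-facing window bits (8..15). */
static int field_to_window_bits(uint8_t field) {
    assert(field <= 7);
    return field + SKETCH_MIN_WINDOW_BITS;
}

/* Size in bytes of the window buffer for a given number of window bits. */
static size_t window_size_bytes(int window_bits) {
    return (size_t)1 << window_bits;
}
```

With these helpers, a 12-bit window maps to the field value 4, and the two ends of the range give window sizes of 256 and 32768 bytes.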

tansy · 2024-03-26T12:52:42Z

It may be more meaningful to you, but not necessarily to the average user, who doesn't (even) know what a sliding window is.

BrianPugh · 2024-03-26T15:46:17Z

Tamp doesn't perform any allocations, so the user must provide the window buffer. It is much more natural to do:

TampCompressor compressor;
const int WINDOW_BITS = 10;
// The caller owns the window buffer: 2^WINDOW_BITS bytes.
unsigned char *window_buffer = malloc(1 << WINDOW_BITS);
TampConf conf = {
    .window = WINDOW_BITS,  // this will change in example below
    .literal = 8,
};
tamp_compressor_init(&compressor, &conf, window_buffer);

rather than

    .window = WINDOW_BITS - 8,  // magical number 8

In this PR, we could change the range expressed by lzbench to something like [1, 8] by adding a constant, but that seems unnecessarily confusing to me.

Add tamp v1.3.1

05480b5

Fix tamp benchmark on 32-bit systems.

2a9d7d9
