Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

force_align on vectors missing (or: how to align vectors in flatcc?) #179

Open
benvanik opened this issue Apr 16, 2021 · 37 comments
Open

force_align on vectors missing (or: how to align vectors in flatcc?) #179

benvanik opened this issue Apr 16, 2021 · 37 comments

Comments

@benvanik
Copy link

benvanik commented Apr 16, 2021

[ Summary: flatcc should eventually support force_align on vectors - meanwhile use vectors of structs and force_align structs, or use low level builder calls to create aligned vectors access as discussed below - the problems investigated were eventually decided to be incorrect use of API - low level alignment doesn't work if the buffer isn't started first. ]

Hello! I have some variable-length data that must be aligned on a 16-byte boundary that I'm storing in a [uint8]:

table Buffer {
  data:[uint8];
}

At some point it looks like the google flatbuffers library started supporting force_align on these vector fields, so this works there:

table Buffer {
  data:[ubyte] (force_align: 16);
}

The current flatcc compiler doesn't support this new usage of force_align yet though:

[build] bytecode_module_def.fbs:103:17: error: 'force_align': known attribute not expected in this context

My use requires me being able to ensure alignment for compatibility with DMA operations and SIMD code, but there are also some other big flatbuffers users that are now requiring it: https://github.com/tensorflow/tensorflow/blob/1e66b5790c93a1752fd8ea92a146aff811f2ee95/tensorflow/lite/schema/schema_v3a.fbs

If force_align on vectors could be supported in the flatcc parser/builder that'd be fantastic, but I was also wondering if there was a workaround that could be used (I suspect this isn't the first time alignment has come up as a question :). I tried to directly do this by passing an alignment during construction of the vector but that doesn't seem to do what I expected (16 byte alignment of the data in the flatbuffer):

flatcc_builder_start_vector(fbb, 1, /*align=*/16, FLATBUFFERS_COUNT_MAX(1));
uint8_t* ptr = flatbuffers_uint8_vec_extend(fbb, length);
memcpy(ptr, source_data, length);
flatbuffers_uint8_vec_end(fbb);

I'd expect the flatbuffers_uint8_vec_t when loaded from the buffer to be at a 16 byte offset (or, ignoring file header weirdness, would at least be consistently stride 16) however the pointers I get back seem rather arbitrary (4-, 8-, and 12-byte alignment).
I've started trying to use vectors of structs (that do support force_align in flatcc) but that also doesn't seem to do what I expect, but would love to hear if any of you have seen solutions in the past that are known to work. Also I could be doing something very wrong - I just started digging into the internals of flatcc and don't know much yet :)

@benvanik
Copy link
Author

it looks like the google flatbuffers compiler turns the force_align into a ForceVectorAlignment before starting the vector:
https://github.com/google/flatbuffers/blob/78f0c0d1d96a220163a03174d3240864a6a139da/include/flatbuffers/flatbuffers.h#L1667-L1675

Is there an equivalent on the flatcc builder API?

@mikkelfj
Copy link
Contributor

mikkelfj commented Apr 16, 2021

Hi Ben, to your last question first. Flatcc does not have ForceVectorAlignment equivalent. Alignment is taken from the schema during construction where supported. (EDIT: not entirely true, see my follow up comment). (Though there is force assignment to table field members similar to what flatcdoes to bypass skipping default values, but that is something different.

I don't really recall exactly what is supposed to be supported. You report either suggests a regression or something that hasn't yet been implemented. If it is supported by Googles flatc compiler it certainly should be in flatcc as well.

In either case I fully support adding force_align to vectors, I'm just not sure how much time I can commit to that for the time being.

@mikkelfj
Copy link
Contributor

mikkelfj commented Apr 16, 2021

If this is really important to you, you can call the low level builder interface to create the vector without generated code. This can be freely mixed with other code, you just have to be somewhat careful.

/**
* Creates a vector in a single operation using an externally supplied
* buffer. This completely bypasses the stack, but the size must be
* known and the content must be little endian. Do not use for strings
* and offset vectors. Other flatbuffer vectors could be used as a
* source, but the length prefix is not required.
*
* Set `max_count` to `FLATBUFFERS_COUNT_MAX(elem_size)` before a call
* to any string or vector operation to the get maximum safe vector
* size, or use (size_t)-1 if overflow is not a concern.
*
* The max count property is a global property that remains until
* explicitly changed.
*
* `max_count` is to prevent malicous or accidental overflow which is
* difficult to detect by multiplication alone, depending on the type
* sizes being used and having `max_count` thus avoids a division for
* every vector created. `max_count` does not guarantee a vector will
* fit in an empty buffer, it just ensures the internal size checks do
* not overflow. A safe, sane limit woud be max_count / 4 because that
* is half the maximum buffer size that can realistically be
* constructed, corresponding to a vector size of `UOFFSET_MAX / 4`
* which can always hold the vector in 1GB excluding the size field when
* sizeof(uoffset_t) = 4.
*/
flatcc_builder_ref_t flatcc_builder_create_vector(flatcc_builder_t *B,
const void *data, size_t count, size_t elem_size, uint16_t align, size_t max_count);
/**
* Starts a vector on the stack.
*
* Do not use these calls for string or offset vectors, but do store
* scalars, enums and structs, always in little endian encoding.
*
* Use `extend_vector` subsequently to add zero, one or more elements
* at time.
*
* See `create_vector` for `max_count` argument (strings and offset
* vectors have a fixed element size and does not need this argument).
*
* Returns 0 on success.
*/
int flatcc_builder_start_vector(flatcc_builder_t *B, size_t elem_size,
uint16_t align, size_t max_count);
/**
* Emits the vector constructed on the stack by start_vector.
*
* The vector may be accessed in the emitted stream using the returned
* reference, even if the containing buffer is still under construction.
* This may be useful for sorting. This api does not support sorting
* because offset vectors cannot read their references after emission,
* and while plain vectors could be sorted, it has been chosen that this
* task is better left as a separate processing step. Generated code can
* provide sorting functions that work on final in-memory buffers.
*/
flatcc_builder_ref_t flatcc_builder_end_vector(flatcc_builder_t *B);

Also, if you can help investigate if this is a regression or a missing feature, it would be helpful.

@mikkelfj
Copy link
Contributor

mikkelfj commented Apr 16, 2021

Another way to work around this is to create a struct with the necessary alignment and store that struct in a vector. It slowly comes back now - I think flatc originally skipped vector force_align because you could do this instead.

Using a struct in this way will not physically change the buffer and you can cast from a simple array if you create the vector in one operation from a C array, and likewise when reading.

EDIT: I should read more close:

I've started trying to use vectors of structs (that do support force_align in flatcc) but that also doesn't seem to do what I expect, but would love to hear if any of you have seen solutions in the past that are known to work. Also I could be doing something very wrong - I just started digging into the internals of flatcc and don't know much yet :)

Can you please elaborate on why this doesn't work?

You can also use the verifier to test buffer alignment. (For the same reason the verifier must be updated upon changes to force_align support).

@benvanik
Copy link
Author

Thanks for the suggestions! I agree that the support for force_align on vectors is not a regression in flatcc but was just a late addition to flatbuffers at some point - I suspect someone came along and ran into the same thing I did and threw it in there :)

I think I noticed what was throwing me off about my direct flatcc_builder_start_vector use: it is aligning correctly, however everything is offset 8 bytes in the final buffer due to the file identifier. If I strip the first 8 bytes with the identifier then everything is correctly aligned to what I specify. I suppose the question then becomes: is there a way to tell flatcc to pad out the buffers to ensure that alignment is still respected when the file identifier is prefixed? I think even if force_align was supported this may still be an issue that prevents the intended alignment from being met.

Today I'm doing:

file_identifier "FILE";
table MyRoot {
  data:[uint8];
}
root_type MyRoot;
flatcc_builder_init(&fbb);
flatcc_builder_start_vector(&fbb, 1, 16, FLATBUFFERS_COUNT_MAX(1));
uint8_t* p = flatbuffers_uint8_vec_extend(&fbb, 1234);
memset(p, 'x', 1234);
flatbuffers_uint8_vec_ref_t data = flatcc_builder_end_vector(&fbb);
MyTable_start_as_root(&fbb);
MyTable_data_add(&fbb, data);
MyTable_end_as_root(&fbb);
// etc

The output buffer then has 08000000 46494C45 for the identifier and later on the xxxxxxx at a +8 alignment.

Thoughts? And thanks for the help! If I can get something working I'm happy to write it up in the docs so that others coming along can use it too :)

@mikkelfj
Copy link
Contributor

mikkelfj commented Apr 16, 2021

I'd be happy to help you progress on this. The following is an explanation of how things work or are supposed to work, but it probably won't help much. But it is a start:

I think I noticed what was throwing me off about my direct flatcc_builder_start_vector use: it is aligning correctly, however everything is offset 8 bytes in the final buffer due to the file identifier. If I strip the first 8 bytes with the identifier then everything is correctly aligned to what I specify.

This is not correct. The header is included in the alignment. Are you sure the buffer is placed in sufficiently aligned memory? There are subtle differences in the way a buffer is created if you create it with an identifier or with a null identifier (I don't recall all the details up front). The buffer can also be created with an optional size prefix.

The buffer can have a header in a few different ways: there can optionally be a file identifer, and there can optionally be a size prefix. You need to all the verifier in the proper way, and then it will very from the buffer start.

There are some variants start_as_root_with_identifier and with _size.

The header format is : [<size:4 bytes>]<offset to root object:4 bytes>[<file identifier: 4 bytes>]

(personally I think this is a mess, but it is what we have)

It is actually rather complex, but when creating a buffer, flatcc adjusts alignment according these different kinds of headers. Technically it builds the buffer backwards, properly aligned relative to a virtual zero point, and then inserts enough prepadding between the header and data to ensure that everything keeps staying aligned from the buffer start based on the largest alignment requirement seen during construction.

Related, but not very important here:
You may also notice that vectors are aligned to their elements and that flatcc reader returns a pointer to the first element, which should be aligned. But the actual vector starts 4 bytes earlier, and this vector header is not ailgned if elements are larger than 4 bytes because the vector size field is 4 bytes.

@mikkelfj
Copy link
Contributor

Here is the padding before the header:

header_pad = front_pad(B, field_size + id_size + (uoffset_t)(with_size ? field_size : 0), align);

@mikkelfj
Copy link
Contributor

Thoughts? And thanks for the help! If I can get something working I'm happy to write it up in the docs so that others coming along can use it too :)

Documentation is always welcome. Buf if you dig sufficiently into this, you might consider implementing support for vector force align. I think it might be relatively easy, albeit hard to grasp.

/*
* `align`, `offset` are for structs only. 64-bit allows for
* dynamically configured 64-bit file offsets. Align is restricted to
* at most 256 and must be a power of 2.
*/
uint16_t align;

Here struct sets the align field

if ((m = knowns[fb_attr_force_align])) {
if (!is_valid_align(m->value.u)) {
error_sym(P, &ct->symbol, "'force_align' exceeds maximum permitted alignment or is not a power of 2");
} else {
/* This may still fail on natural alignment size. */
ct->align = (uint16_t)m->value.u;
}
}

In code generation, the align field might be ignored where not supported, i.e. non structs, but it could also already be supported, since e.g. integers have different alignment requirements already, so merely setting the align field could possibly work.

Complications: The generated verifier must also support this, and the JSON parser must also, but again, if the align field is already inspected, it might happen automatically.

On the other hand, if structs are not aligned correctly in vectors right now, there are bigger problems.

Also important to never reduce alignment below natural alignment.
I think this has something to do with it:

static inline uint64_t fb_align(uint64_t size, uint64_t align)

@benvanik
Copy link
Author

Thank you for explaining (I really appreciate it!) - but I'm realizing that I'm still missing something :)

I'm using flatcc_builder_copy_buffer to get the final builder bytes and fwrite-ing them to a file which then verifies/loads correctly with flatcc so I'm (fairly) sure it's ok. I'm looking at the bytes in the file as written to disk in a hex editor and definitely seeing that they are not 16 byte aligned as I had specified (the 'xxxx's being the data as in the example above):
image
There's dozens of these vectors in the file and all are exactly +8, which is great as it means they are all consistently aligned whereas before using the vector_start with align=16 they were all over the place, but also means I don't understand what you mean about the file header being included in the alignment. Is it possible there's a bug in flatcc's handling of this alignment, or something I need to set to ensure it is included?

I have been hacking around in my own code but if this seems like it should be working then I'll build out a reproducer based on samples/bugreport/ to help isolate whether it's my code producing the buffers or not.

@mikkelfj
Copy link
Contributor

mikkelfj commented Apr 16, 2021

I wrote the following just before receiving your last comment:

I was wondering: there could be a bug in the builder api where it does align vectors correctly but it fails to record the alignment size in the buffer state such that the prepadding works on wrong assumptions. This would also affect 64-bit integers because their alignment are larger than what is minimally required, so it ought to have been detected by now if that were the case.

@mikkelfj
Copy link
Contributor

I don't understand what you mean about the file header being included in the alignment

This means that if a something is aligned to, say, 16 bytes, then that something is placed at multiple of 16 bytes relative to the start of buffer where the header is placed. At least it is supposed to.

@benvanik
Copy link
Author

Ok cool - sounds like a reproducer is worth it! I'll put one together now.

@mikkelfj
Copy link
Contributor

uint16_t min_align;
/* The current active objects alignment isolated from nested activity. */

This min_align should be updated whenever adding something to the builder.

You could try adding a function flatcc_builder_set_min_align(B, align); and see if that helps. It just needs to set min_align to max of existing min_align and the new argument. Call it after starting the buffer and before ending it. See if that helps.

@mikkelfj
Copy link
Contributor

It is called for create_vector:

flatcc/src/runtime/builder.c

Lines 1380 to 1381 in e327f24

get_min_align(&align, field_size);
set_min_align(B, align);

But it appears to be missing from start_vector:

get_min_align(&align, field_size);

Does it help to add it there?

benvanik added a commit to benvanik/flatcc that referenced this issue Apr 16, 2021
@benvanik
Copy link
Author

Reproducer here: benvanik@017023d
If I run this I end up with an output.bin file with the vectors aligned to +8:
image

I tried adding the set_min_align(B, align); but it didn't change the output.

@mikkelfj
Copy link
Contributor

I tried adding the set_min_align(B, align); but it didn't change the output.

Exit frame calls set_min_align when a stack object pops, so end_vector should pop the frame and effectively call set_min_align at that point, provided the frame context has the proper local aligment stored.

set_min_align(B, B->align);

@mikkelfj
Copy link
Contributor

The fact the vectors are correctly aligned to something, here +8, is not surprising, but it confirms the suspicion that the padding between the header and data is incorrect sometimes. The min_align field should control this. It would be helpful with traces of the min_align value.

benvanik added a commit to benvanik/flatcc that referenced this issue Apr 16, 2021
@benvanik
Copy link
Author

I added some prints of it:

pre start min_align: 0
post start min_align: 0
pre end min_align: 0
post end min_align: 16
pre start min_align: 16
post start min_align: 16
pre end min_align: 16
post end min_align: 16
pre start min_align: 16
post start min_align: 16
pre end min_align: 16
post end min_align: 16
pre start min_align: 16
post start min_align: 16
pre end min_align: 16
post end min_align: 16
pre start min_align: 16
post start min_align: 16
pre end min_align: 16
post end min_align: 16
pre root end min_align: 1
post root end min_align: 16

It looks like it is staying as 16 after initially set in the first vector, but 1 right before I call the Eclectic_FooBar_end_as_root - perhaps that's the issue?

@mikkelfj
Copy link
Contributor

See my comments to the repro. That example is not really working as intended, I presume.

@mikkelfj
Copy link
Contributor

It looks like it is staying as 16 after initially set in the first vector, but 1 right before I call the Eclectic_FooBar_end_as_root - perhaps that's the issue?

Are you saying it suddently drops down to 1? That seems very odd. Can you track down where that happens?

@benvanik
Copy link
Author

(sorry didn't see comment on repro)

Yes - and if I set it back to 16 before ending the root then I get the correctly aligned output:

    B->min_align = 16;
    Eclectic_FooBar_end_as_root(B);

image

The verifier does succeed:

running build/myissue      
pre start min_align: 0     
post start min_align: 0    
pre end min_align: 0       
post end min_align: 16     
pre start min_align: 16    
post start min_align: 16   
pre end min_align: 16      
post end min_align: 16     
pre start min_align: 16    
post start min_align: 16   
pre end min_align: 16      
post end min_align: 16     
pre start min_align: 16    
post start min_align: 16   
pre end min_align: 16      
post end min_align: 16     
pre start min_align: 16    
post start min_align: 16   
pre end min_align: 16      
post end min_align: 16     
pre root end min_align: 1  
post root end min_align: 16
success                    

Regarding the comment on the repo: I would have expected the vector contents to be aligned to the alignment (the xxxx's). What does the align on start_vector do if not align the contents? (according to something I read - which I'll try to find - the alignment specifies the start of the vector contents but not the size prefix, which will remain 4 byte aligned)

@mikkelfj
Copy link
Contributor

My comment was misguided - I somehow thought you created a high level example using standard constructs, but you were using the low level features we discussed all along - brain fart. My comments are still relevant if you skip the low level details and want to test struct alignment.

I think I found the issue:

B->min_align = 1;

You cannot create the vector before you start the buffer. Usually it is harmless, especially outside of nested buffers, but this is an example of where it does not work.

@mikkelfj
Copy link
Contributor

mikkelfj commented Apr 16, 2021

Also note:

MyTable_start_as_root is really a combined start_buffer and start_table call. In principle you must start the buffer, then create the vector, then start the root table. That is how it works in other languages. But in flatcc you are allowed to create objects while parent objects are open. But you are not allowed to create objects before the buffer is initialized.

EDIT: fixed misleading text: changed "create the table" to "create the vector".

@mikkelfj
Copy link
Contributor

@benvanik
Copy link
Author

Ahhhh that did it!

I am now calling flatcc_builder_start_buffer(B, Eclectic_Buffer_file_identifier, 16, 0); immediately after init'ing the builder, recording my vectors, and then using Eclectic_FooBar_start instead of Eclectic_FooBar_start_as_root. This produces results that both verify and have the correct alignment in the data!

I had seen the buffer calls but must have just skipped over them as the _as_root was working (well, not as I wanted, but I didn't think to look there!). Very useful to know that the implicit start_as_root behavior may mask issues like this. I'll try these changes in my application and verify it works there too.

@mikkelfj
Copy link
Contributor

I actually wonder if the following example under Tables is correct:

Monster_start(B);
Monster_Hp_add(B, 80);
...
flatcc_builder_buffer_create(B, Monster_end(B));

@mikkelfj
Copy link
Contributor

It shouldn't be necessary to use the alignment of 16 in your start buffer call, provided that ordering is correct. 0 should work just as well.

@benvanik
Copy link
Author

benvanik commented Apr 16, 2021

You're right - I may have a use for block_align to pad the buffer out but it's not needed here. For anyone who googles this in the future and lands here this is the code to align a vector:

    flatcc_builder_init(B);
    // Start the root table before doing anything else:
    Eclectic_FooBar_start_as_root(B);

    // Specify an alignment for the vector contents:
    flatcc_builder_start_vector(B, 1, /*align=*/16, FLATBUFFERS_COUNT_MAX(1));
    uint8_t* p = flatcc_builder_extend_vector(B, 12345);
    memset(p, 'x', 12345);
    flatcc_builder_ref_t data_vec = flatcc_builder_end_vector(B);

    // Finish building the root table:
    Eclectic_FooBar_data_add(B, data_vec);
    Eclectic_FooBar_end_as_root(B);

@mikkelfj
Copy link
Contributor

Note that you can use start_as_root and then create you vectors. It is slightly less efficient because the root table is parked on the stack while the vectors are created, but nothing significant, and it can be much more convenient since you can add the vectors to parent one at time, something you cannot do in C++ or other languages.

@mikkelfj
Copy link
Contributor

mikkelfj commented Apr 16, 2021

On your example: while it probably is correct, a flatcc principle is to always pair start and end calls with _as_root or not.
To be strictly correct you should call root = Eclectic_FooBar_end(B), then end_buffer(B, root), or use start_as_root before the vectors as per my previous comment.

@benvanik
Copy link
Author

Totally correct - updated (and tested to verify my sanity :)

@mikkelfj
Copy link
Contributor

Also, the example requires use of little endian because the endian conversion is not automatic here. As long as the data is 8 bit units it doesn't matter. But given the nature of the problem, there are probably higher level structures stored in that vector, that isn't platform portable anyway.

@benvanik
Copy link
Author

Correct - in my particular case it's an entire file of a format that has its own specifications (WAV, PNG, etc) but you are right that if it was something like a list of floats then it would need special handling.

@mikkelfj
Copy link
Contributor

I still wonder why it didn' work with vectors of structs?

@benvanik
Copy link
Author

Now that I have this reproducer set up I'll give that a try too - it's possible you'll be able to spot what I was doing wrong :)

benvanik added a commit to iree-org/iree that referenced this issue Apr 16, 2021
This required a minor tweak to start_as_root the flatbuffer root tables
prior to recording any data. This was discovered thanks to the gracious
help (and patience) of of the flatcc author:
dvidelabs/flatcc#179

With this all of our binary blobs embedded into the flatbuffers are now
16-byte aligned with the option to tweak it further via a vm.rodata
attribute. We don't need utf8 strings to be 16-byte aligned, for example.
benvanik added a commit to iree-org/iree that referenced this issue Apr 16, 2021
This required a minor tweak to start_as_root the flatbuffer root tables
prior to recording any data. This was discovered thanks to the gracious
help (and patience) of of the flatcc author:
dvidelabs/flatcc#179

With this all of our binary blobs embedded into the flatbuffers are now
16-byte aligned with the option to tweak it further via a vm.rodata
attribute. We don't need utf8 strings to be 16-byte aligned, for example.
@benvanik
Copy link
Author

I think struct alignment is working perfectly (assuming this is what you intended):

struct Aligned128 (force_align: 16) {
  a:uint64;
  b:uint64;
}
table Buffer {
  data:[Aligned128];
}
      #define iree_align(value, alignment) \
        (((value) + (alignment)-1) / (alignment)) * (alignment)
      Eclectic_Buffer_start(B);
      Eclectic_Buffer_data_start(B);
      uint8_t* p = (uint8_t*)Eclectic_Buffer_data_extend(B, iree_align(sizes[i], 16) / 16);
      memset(p, 'x', sizes[i]);
      Eclectic_Buffer_data_end(B);
      buffer_refs[i] = Eclectic_Buffer_end(B);

This produces 16-byte aligned struct vectors in the output file:
image

I believe I was running into the same buffer start-related issue before with this approach as this only works with the start_buffer (or start_as_root) prior to the vector starts. So that was my main mistake, thankfully!

Now that I have the flexibility of doing the alignment based on runtime parameters via the flatcc_builder_start_vector I'll avoid specifying it in the file - since these are all byte buffers there's not much value in the generated named versions for my particular use case though there may be others who would want the convenience of force_align on vectors for other primitive types.

Thanks again for all the help! I leave it in your hands as to whether you leave this issue as the feature request for force_align so others see it or close it - I can recreate one without all the noise focused only on that feature :)

@mikkelfj
Copy link
Contributor

You are welcome, and thanks for the update. It is a feature that should be added, so I'll leave it open.

benvanik added a commit to iree-org/iree that referenced this issue Apr 16, 2021
This required a minor tweak to start_as_root the flatbuffer root tables
prior to recording any data. This was discovered thanks to the gracious
help (and patience) of of the flatcc author:
dvidelabs/flatcc#179

With this all of our binary blobs embedded into the flatbuffers are now
16-byte aligned with the option to tweak it further via a vm.rodata
attribute. We don't need utf8 strings to be 16-byte aligned, for example.
benvanik added a commit to iree-org/iree that referenced this issue Apr 16, 2021
This required a minor tweak to start_as_root the flatbuffer root tables
prior to recording any data. This was discovered thanks to the gracious
help (and patience) of of the flatcc author:
dvidelabs/flatcc#179

With this all of our binary blobs embedded into the flatbuffers are now
16-byte aligned with the option to tweak it further via a vm.rodata
attribute. We don't need utf8 strings to be 16-byte aligned, for example.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants