
Not sure how to interpret the output #9

Open
bartbutenaers opened this issue May 14, 2023 · 14 comments

Comments

@bartbutenaers

Hi @jveitchmichaelis,

Thank you for sharing these models!

When I run a COCO SSD model (from the coral.ai site) via tfjs, I get, as expected, 4 output tensors as the prediction result: scores, classes, bboxes, and detected object count.

I can load your models (i.e. edgetpu.tflite) without problems in tfjs, but afterwards the object detection output contains only one tensor. That tensor contains an array of 3087 sub-arrays (each containing 85 integers):

[screenshot]

I see that you use Python instead of TensorFlow.js, so I assume you don't use tfjs... But do you perhaps have an idea how I can interpret this output, or what I might be doing wrong? The tfjs-tflite package has status "work in progress", so perhaps it contains some bugs...

Thanks!!!

Bart

@jveitchmichaelis
Owner

jveitchmichaelis commented May 14, 2023 via email

@bartbutenaers
Author

Hi Josh,

Thanks for the pointers to the code!!

Although I am not very familiar with Python, the code was very illuminating. I am getting closer to a solution, but my bounding boxes do not fit my objects yet, so I would appreciate it if you could help me a bit more with this...

Can you please confirm whether my assumptions are correct:

  1. I have 1 tensor in the output, whose data array contains this information?

    [screenshot]

  2. The bounding boxes are in the format [center_x, center_y, width, height]?

  3. The bounding box coordinates are relative to the resized image dimensions (like the model requires), so I need to transform the coordinates like this: coordinate * original_input_image_dimension / resized_model_image_dimension?

  4. The confidence numbers are percentages, so not values between 0 and 1?

  5. I get a score per class (for each of the 80 classes) and I need to find the class with the highest score, which is the class of the detected object.

Or perhaps you have any other info that could help me determine why my bounding boxes do not match my objects inside the image...
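For reference, the assumptions above can be sketched in plain JavaScript. This is a hypothetical decoder, not code from the repo: the row layout ([center_x, center_y, width, height, objectness, ...80 class scores]), the model-input-pixel coordinate convention, and treating scores as 0..1 floats rather than percentages are all assumptions based on this thread.

```javascript
// Hypothetical sketch (not the repo's code): decode one row of the
// [3087, 85] output, assuming the layout
// [center_x, center_y, width, height, objectness, ...80 class scores]
// with box coordinates in model-input pixels and scores as 0..1 floats.
function decodeDetection(row, modelSize, origW, origH) {
  const [cx, cy, w, h, objectness] = row;
  const classScores = row.slice(5);

  // assumption 5: the detected class is the one with the highest score
  let bestClass = 0;
  for (let i = 1; i < classScores.length; i++) {
    if (classScores[i] > classScores[bestClass]) bestClass = i;
  }

  // overall confidence = objectness * best class score
  const score = objectness * classScores[bestClass];

  // assumptions 2 + 3: center/size -> corners, rescaled to the original image
  const sx = origW / modelSize;
  const sy = origH / modelSize;
  return {
    x1: (cx - w / 2) * sx,
    y1: (cy - h / 2) * sy,
    x2: (cx + w / 2) * sx,
    y2: (cy + h / 2) * sy,
    class: bestClass,
    score: score,
  };
}
```

This does not yet include NMS, so many overlapping boxes for the same object are still expected at this stage.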

@bartbutenaers
Author

@jveitchmichaelis,

I can't get the bounding boxes fixed when running your model in tfjs ;-(

When debugging your code, I see that x contains all floats:

[screenshot]

But when I debug my own code, I see that your model works with int32 data:

[screenshot]

And all values in my output are indeed integers...

I would appreciate it a lot if you could give me some advice based on your knowledge!! I think you do some extra postprocessing - besides the things I listed above - but not all of the Python code is entirely clear to me.

Thanks!
Bart

@jveitchmichaelis
Owner

jveitchmichaelis commented May 19, 2023 via email

@bartbutenaers
Author

Hi Josh,
That is very kind of you!

For example, if I use this image, then I get this output tensor:

[screenshot]

In the following file you can find the tensor's data array:
tensordata.txt

@bartbutenaers
Author

Hi @jveitchmichaelis,

I tried lots of things, but unfortunately I can't get this running...
I assume I am doing something really wrong, because I even get probability percentages above 100:

[screenshot]

It would be very appreciated if you could find some free time to have a look at my tensordata.txt file above.
Thanks!!!

@jveitchmichaelis
Owner

jveitchmichaelis commented Aug 22, 2023

Hi Bart, One thing that seems obviously wrong is that your tensor data is all integers. It should be float data - the values (mostly) represent probabilities. Are you casting somewhere?

The model does use integer math but there is some scaling that happens to the output tensor which converts back to float.

# Scale output

And see a few lines above, you should also scale the input to the model.

If you're just using the weights on their own (without scaling the input image or the predictions) I guess this won't work.

The shape looks good though!
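The two conversions described here can be sketched in plain JavaScript. The scale and zeroPoint values below are placeholders for illustration; the real ones are the calibration constants stored with the model, and the input and output tensors each have their own pair.

```javascript
// Hedged sketch of TFLite affine quantization. The scale/zeroPoint
// values are placeholder examples; the real ones are stored with the model.
const scale = 1 / 255;     // example scale
const zeroPoint = -128;    // example zero point for an int8 tensor

// real value -> quantized integer: q = round(x / scale) + zeroPoint
function quantize(x) {
  return Math.round(x / scale) + zeroPoint;
}

// quantized integer -> real value: x = (q - zeroPoint) * scale
function dequantize(q) {
  return (q - zeroPoint) * scale;
}
```

Round-tripping a value through quantize/dequantize only recovers it to within one quantization step, which is why the raw integer outputs look so coarse before rescaling.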

@bartbutenaers
Author

Hello Josh,
Yes, indeed it is all integers instead of floats. I don't know how that happens; I 'assume' tfjs does it somewhere under the covers...

The only input processing I do is a bilinear resizing of my input image tensor, to resize the image to the resolution requirements of your model.

I also have a normalization preprocessing part, but that is not being executed for your model, since both the input image tensor and your model's expected input type are int32:

[screenshot]

If I force this normalization to be executed for your model, then my prediction throws an exception:

[screenshot]

So I assume your input scaling does something different? It would be nice if you could explain it in pseudocode, so I can convert it to JavaScript, because unfortunately my Python knowledge is a bit lacking...

@jveitchmichaelis
Owner

jveitchmichaelis commented Aug 23, 2023

Sure, the process is described here: https://www.tensorflow.org/lite/performance/post_training_quantization

https://www.tensorflow.org/lite/performance/quantization_spec

See the bit at the bottom about the representation for quantized tensors. You need to apply the scaling both to the image going in (float -> int) and to the tensor coming out (int -> float). So going in, rearrange for the int8 term, and going out, you want the real-value term. If you're getting an int output, then tfjs isn't doing it for you, I think.

The scaling parameters are stored with the model as they're calibrated by running a bunch of images through. The Python code here reads them from the checkpoint.

This is not the usual 1/255 scaling you might do for a normal CNN. You need to do that first, and then apply the conversion to a scaled integer. I guess you could also roll it into one scale factor, but it's sensible to separate them for clarity.

We do the 1/255 here:

def get_image_tensor(img, max_size, debug=False):

Another example here https://www.tensorflow.org/lite/performance/post_training_integer_quant

In this example they check for a uint8 input (i.e. the model spec, not whether the image is 8-bit!) and if so they apply scaling. I've not had much luck looking for a tfjs version, but there may be one somewhere. Note they read the "quantization" parameter to get the scaling values.

See at the top where test_images is defined, they also scale by 255.

I guess tfjs doesn't have an 8-bit datatype, so you get int32?

Something like:

// note: the input and output tensors each have their own scale/zero_point
const norm_image = image.cast('float32').div(255);
const scaled_image = norm_image.div(scale).add(zero_point);

const pred = model.infer(scaled_image.cast('int32'));

// dequantize: real = (quantized - zero_point) * scale
const float_pred = pred.cast('float32').sub(zero_point).mul(scale);

// now do NMS, filter low probs etc.

@bartbutenaers
Author

Thanks for the clarification! I had never heard of quantization before...

I only had time last night to implement scaling on the output tensor (i.e. detectionResult):

let max = imageTensor.max().cast('float32');
let min = imageTensor.min().cast('float32');
let qmax = detectionResult.max().cast('float32');
let qmin = detectionResult.min().cast('float32');
let scaleFactor = max.sub(min).div(qmax.sub(qmin)).cast('float32');
let zeroPoint = qmin.sub(min.div(scaleFactor)).cast('float32');
let scaledDetectionResult = detectionResult.add(zeroPoint).mul(scaleFactor).cast('float32');

When applying this, the bounding boxes already start making sense (when I filter out the ones with a low score):

[screenshot]

Although there are still some things I don't get:

  • I have 19 detections for the same person.

  • Some of the scores have a value above 1:

    [screenshot]

  • Some scores (see previous screenshot) seem to be identical, which looks like duplicates to me. I'm not sure whether this is caused somewhere by my own code...

  • The bounding boxes don't enclose the person exactly. They seem to be a bit shifted in both directions.

Perhaps this is caused because I didn't scale the input yet. Will try to find some time tonight for that...
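On the duplicate detections: many overlapping boxes for one person are expected before non-maximum suppression (the "now do NMS" step in the snippet above). A minimal IoU-based sketch, assuming boxes are plain {x1, y1, x2, y2, score} objects in corner coordinates (tfjs also ships tf.image.nonMaxSuppressionAsync, which operates on tensors and could be used instead):

```javascript
// Minimal sketch of IoU-based non-maximum suppression. Assumption:
// boxes are {x1, y1, x2, y2, score} objects with corner coordinates.
function iou(a, b) {
  const ix = Math.max(0, Math.min(a.x2, b.x2) - Math.max(a.x1, b.x1));
  const iy = Math.max(0, Math.min(a.y2, b.y2) - Math.max(a.y1, b.y1));
  const inter = ix * iy;
  const areaA = (a.x2 - a.x1) * (a.y2 - a.y1);
  const areaB = (b.x2 - b.x1) * (b.y2 - b.y1);
  return inter / (areaA + areaB - inter);
}

function nms(boxes, iouThreshold = 0.45) {
  // sort by descending score, then greedily keep boxes that don't
  // overlap an already-kept box by more than the threshold
  const sorted = [...boxes].sort((a, b) => b.score - a.score);
  const kept = [];
  for (const box of sorted) {
    if (kept.every((k) => iou(k, box) < iouThreshold)) kept.push(box);
  }
  return kept;
}
```

With this applied per class (after filtering low scores), the 19 detections of the same person should collapse to one.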

@bartbutenaers
Author

About scaling the input image tensor: how can you determine the scale factor that you use in your code snippet BEFORE the prediction is executed? To determine the scaleFactor you need qmin and qmax, which are based on the prediction result that you don't have yet. Hmm, I assume my factor calculation is not correct :-(
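For what it's worth, the scale/zero-point pairs are not derived from the min/max of a particular prediction: per the TFLite quantization spec linked above, they are fixed calibration constants stored with the model, one pair per tensor, so they are known before inference. A sketch with hypothetical values (the numbers below are made up for illustration):

```javascript
// The quantization parameters are per-tensor constants baked into the
// model at conversion time. These values are hypothetical examples.
const inputQuant  = { scale: 1 / 255, zeroPoint: 0 };
const outputQuant = { scale: 0.005, zeroPoint: 4 };

// quantize a normalized [0, 1] pixel before inference
function quantizeInput(x) {
  return Math.round(x / inputQuant.scale + inputQuant.zeroPoint);
}

// dequantize a raw integer prediction after inference
function dequantizeOutput(q) {
  return (q - outputQuant.zeroPoint) * outputQuant.scale;
}
```

Reading these constants out of the model is the remaining tfjs-specific question; in the Python interpreter they come from the tensor's "quantization" metadata.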

@jveitchmichaelis
Owner

jveitchmichaelis commented Aug 24, 2023 via email

@jveitchmichaelis
Owner

jveitchmichaelis commented Aug 24, 2023 via email

@bartbutenaers
Author

BTW I could not find a way to get the quantization parameters from the model in tfjs, so I have asked for help on the TensorFlow forum. Hopefully somebody in their community can give me the golden tip...
