input and output tensors. #60

slimcdk · 2019-04-03T15:31:39Z

I'm in the process of converting the model to TensorFlow Lite, but I'm not very experienced with TensorFlow yet.

For the conversion I need to use the input and output tensor sizes. Where am I able to find those?

Will the input be the image size and color channels, e.g. [None, FLAGS.input_size, FLAGS.input_size, 3]?
And for the output, would that just be the num_of_joints number?

To clarify my question, I'm using the second code snippet provided by Pannag Sanketi: https://stackoverflow.com/questions/50632152/tensorflow-convert-pb-file-to-tflite-using-python

hkawii · 2019-05-12T01:47:19Z

Hello @slimcdk, did you find out how to convert it to TensorFlow Lite yet?
I'm searching for a way to do that but don't know where to start.

slimcdk · 2019-05-12T02:02:03Z

Hi

Yeah, take a look at my fork of the repo: https://github.com/slimcdk/convolutional-pose-machines-tensorflow
I found that the first tensor has a misspelling in the provided weights, which is corrected in the model source code.

I also managed to run inference on the model, but processing time was between 2 and 3 seconds on a Galaxy S10. I still need to create the Kalman filter and possibly the tracker module, or you could just feed the model with a fixed resolution.

hkawii · 2019-05-25T23:40:34Z

Hello @slimcdk, thank you so much for your help. Thanks to your comment, I finally managed to convert it to a tflite file.

I also managed to run inference on an iOS device, and processing time is better there.

But there is a problem with the output.
How can we get the labels from it? Forgive me, I'm completely new to this field and would appreciate any help.

This is an example of the output:

[[[[ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   ...
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]]

  [[ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   ...
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]]

  [[ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   ...
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]]

  ...

  [[ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   ...
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]]

  [[ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   ...
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]]

  [[ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   ...
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]
   [ 4.3078829e-03  7.9320744e-05 -1.2679170e-03 ... -1.5761635e-03
    -2.9271552e-03 -8.5114062e-02]]]]

slimcdk · 2019-05-26T03:29:33Z

Think about how the three color channels (red, green, blue) in a regular image form a stack of layers.

The output of this model is similar, but instead of three color channels you get 21 channels (heatmaps), one for each joint. Each heatmap is a 2D array of zeros (black pixels), except where a joint has been recognized; those spots are ones (white pixels), hence the name heatmap.

Each layer/heatmap is a 2D array, which can be seen as an x and y coordinate system. What you would do is first find the highest value in the heatmap, and then find the x and y indexes of that value.

The above calculation is done right here: https://github.com/timctho/convolutional-pose-machines-tensorflow/blob/master/run_demo_hand_with_tracker.py#L298-L299
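As a rough illustration of that peak search, here is a minimal NumPy sketch. It is not the code from the linked file; the function name `joints_from_heatmaps`, the 46x46 heatmap size, and the (batch, height, width, joints) layout are assumptions made for the example, so check them against your own model's output shape.

```python
import numpy as np

def joints_from_heatmaps(heatmaps):
    """Return a (num_joints, 2) array of (row, col) peak positions.

    heatmaps: array of shape (height, width, num_joints), i.e. one image
    taken out of the batch dimension (assumed layout, not confirmed).
    """
    height, width, num_joints = heatmaps.shape
    # Flatten each heatmap so argmax yields one flat index per joint.
    flat = heatmaps.reshape(height * width, num_joints)
    flat_idx = flat.argmax(axis=0)
    # Convert the flat indices back into 2D (row, col) coordinates.
    rows, cols = np.unravel_index(flat_idx, (height, width))
    return np.stack([rows, cols], axis=1)

# Synthetic stand-in for a model output: 21 heatmaps of an arbitrary
# 46x46 size, each with a single peak at a known position.
demo = np.zeros((1, 46, 46, 21), dtype=np.float32)
for j in range(21):
    demo[0, j, 2 * j, j] = 1.0

coords = joints_from_heatmaps(demo[0])
```

The coordinates are in heatmap pixels, so if the heatmap is smaller than the input image they would still need to be scaled up to image coordinates.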

This image visualizes all 21 heatmaps as a single layer, but behind the scenes each one is its own layer.
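One plausible way to get such a single-layer view is to take the per-pixel maximum across the joint channels. This is only a sketch under that assumption; the repository may combine the heatmaps differently, and the helper name `merge_heatmaps` and the 8x8 size are made up for the example.

```python
import numpy as np

def merge_heatmaps(heatmaps):
    """Collapse (height, width, num_joints) heatmaps into one 2D image.

    Taking the maximum over the last axis keeps every joint's peak
    visible in a single layer (assumed approach, for illustration).
    """
    return heatmaps.max(axis=-1)

# Three tiny synthetic heatmaps, each with one peak on the diagonal.
stack = np.zeros((8, 8, 3), dtype=np.float32)
for j in range(3):
    stack[2 * j, 2 * j, j] = 1.0

merged = merge_heatmaps(stack)
```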

luchen828 · 2020-04-16T00:49:57Z

Hi, do you know what values are in the label txt? Is it the name of the jpg + 4 coordinates of the hand bbox + 21 coordinates of the hand keypoints?
