
Misclassification vs. TensorFlow #170

Open
janjongboom opened this issue Aug 21, 2019 · 19 comments
@janjongboom
Member

I have an example program here: https://github.com/janjongboom/utensor-misclassifies

The first classification is completely wrong... I added a second one to do sanity checking and that one seems to be OK.

TF result:

0: 0.1487136036157608
1: 0.07502711564302444
2: 0.07752762734889984
3: 0.6987316608428955

uTensor result:

0: 0.532946
1: 0.180336
2: 0.112851
3: 0.173868

The trained model and classification scripts for TF and uTensor are available in the repo.

@dboyliao
Member

@mbartling
I think you are the one to blame XDDD

@janjongboom
According to my private messages with @mbartling, the SoftmaxOp implementation is incorrect.
It was a hack for an internal demo.
I just finished the graph constructor for the codegen, so I'll fix the implementation later.
Just give me some time.

@dboyliao
Member

Still, thanks for your demo code.
I'll test the result against your code.

@dboyliao
Member

dboyliao commented Aug 21, 2019

BTW, I'll implement it to match the C++ API of TF's softmax:
https://www.tensorflow.org/api_docs/cc/class/tensorflow/ops/softmax

which works only on 2-D tensors.

If you look at the python wrapper for softmax in TF:
https://github.com/tensorflow/tensorflow/blob/r1.14/tensorflow/python/ops/nn_ops.py#L2876-L2903

I think it's too much work to support a multi-dimensional softmax op in the uTensor runtime, at least for now.
What do you guys think?
@janjongboom @Knight-X @neil-tan @mbartling
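For reference, the trick that TF's Python wrapper uses to support N-D input on top of a 2-D kernel can be sketched in numpy (my own sketch, not TF or uTensor code): collapse all leading dims into one batch dim, run the 2-D softmax, then restore the shape.

```python
import numpy as np

def softmax_2d(x):
    # Numerically stable softmax over the last axis of a 2-D array.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def softmax_nd(x):
    # Mirror of the TF Python-wrapper trick: flatten leading dims into
    # one batch dim, apply the 2-D kernel, then restore the shape.
    shape = x.shape
    flat = x.reshape(-1, shape[-1])
    return softmax_2d(flat).reshape(shape)

p = softmax_nd(np.zeros((2, 3, 4)))  # uniform probabilities, shape preserved
```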

@janjongboom
Member Author

@dboyliao What do you mean? I'm not using any explicit softmax as far as I know. Just a simple MLP: 40x20x10x4.

@dboyliao
Member

https://github.com/janjongboom/utensor-misclassifies/blob/master/fw/utensor-model/trained.cpp#L537-L542

I see the SoftmaxOp is in your model cpp file.
So I think this misclassification issue is due to SoftmaxOp producing incorrect output.

@dboyliao
Member

@janjongboom
I'll need your training script for debugging.
It's a TF-related issue.

@dboyliao dboyliao mentioned this issue Aug 21, 2019
@janjongboom
Member Author

@dboyliao

    from keras.models import Sequential
    from keras.layers import Dense, InputLayer
    from keras import regularizers
    
    r = regularizers.l1(0.0001)
    
    model = Sequential()
    model.add(InputLayer(input_shape=(X_train.shape[1], ), name='x_input'))
    model.add(Dense(20, activation='relu', activity_regularizer=r))
    model.add(Dense(10, activation='relu', activity_regularizer=r))
    model.add(Dense(classes, activation='softmax', name='y_pred'))
    
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    
    model.fit(X_train, Y_train, batch_size=50, epochs=100, validation_data=(X_test, Y_test))

@janjongboom
Member Author

@dboyliao @mbartling Does landing #171 mean this issue should be fixed?

@neil-tan
Member

Further testing is required to confirm this. But yes, according to @dboyliao's comment on your sample training script, it should be.

@dboyliao
Member

dboyliao commented Aug 26, 2019

I'm pretty sure the original softmax in uTensor is incorrect.
The new softmax passes my unit tests, so I'm over 90% confident it's aligned with TF's on both 1-D and 2-D tensors.
That said, since Keras is a high-level TF API, I'm not sure whether it does anything unexpected; we'll need further integration tests to see if my PR solves your issue.

@janjongboom
Member Author

janjongboom commented Aug 26, 2019

@dboyliao @neil-tan I've tested this against the fix-softmax branch and updated my graph to the following (this might be incorrect, I'm not sure):

{
    ctx.add(new RamTensor<float>(), "y_pred/Softmax:0");
    ctx.push(new SoftmaxOp<float, float>(),
             { "y_pred/BiasAdd:0" },
             { "y_pred/Softmax:0" });
}

But it still yields the same incorrect results as on develop.

@dboyliao
Member

I see.
Then there must be a bug in the intermediate layers, given that I've run unit tests on the SoftmaxOp.
It will take me a while to find it.

@dboyliao
Member

dboyliao commented Aug 26, 2019

Let's see if I can come up with an automatic tests generator :P

@janjongboom
Member Author

@dboyliao @neil-tan Could you confirm that I'm using the API correctly here?

https://github.com/janjongboom/utensor-misclassifies/blob/e0bfb95df3cdd97a5a665e055e7f656e9797ad64/fw/main.cpp#L30

If the softmax issue is now fixed, I might be using the API incorrectly?

@Knight-X
Member

I think that API usage is fine for acquiring the output tensor.

@janjongboom
Member Author

janjongboom commented Sep 2, 2019

I have looked into the fix @dboyliao provided for SoftmaxOp (which was indeed broken). But the issue lies deeper, as the values are already incorrect before they are passed into the SoftmaxOp:

logit 0 -0.986198
logit 1 -2.069798
logit 2 -2.538547
logit 3 -2.106324

As you can see, logit 0's value is already too high.
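A quick numpy sanity check (my own sketch, not part of the repo): running a plain softmax over these logits reproduces the uTensor output from the first comment almost exactly, which confirms the SoftmaxOp itself now behaves correctly and the corruption happens upstream of it.

```python
import numpy as np

# The logits fed into SoftmaxOp, taken from the debug log above.
logits = np.array([-0.986198, -2.069798, -2.538547, -2.106324])

def softmax(x):
    # Numerically stable softmax over a 1-D vector.
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

probs = softmax(logits)
# Matches uTensor's (wrong) result: 0.532946, 0.180336, 0.112851, 0.173868
expected_utensor = np.array([0.532946, 0.180336, 0.112851, 0.173868])
assert np.allclose(probs, expected_utensor, atol=1e-4)
```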

@neil-tan @mbartling If I compile without quantization ops, this works fine:

utensor-cli convert  ../trained.pb --output-nodes=y_pred/Softmax -m utensor-model/ --transform-methods "dropout,inline,biasAdd,remove_id_op,refcnt"

That yields the exact response I would get from TF. Is there any way I can inspect both the quantized and the unquantized state side by side? I'm logging all input and output tensors right now, but I don't have a reference state to tell in which layer something goes wrong, which makes it virtually impossible to debug.
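In the meantime, one way to get a float reference per layer is to replay the MLP's forward pass in numpy (a sketch; it assumes you export the weight matrices as `(W, b)` pairs, e.g. via Keras's `model.get_weights()`) and compare each intermediate against the logged uTensor tensors.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward_with_intermediates(x, weights):
    # weights: list of (W, b) pairs, one per Dense layer, e.g.
    # [(W1, b1), (W2, b2), (W3, b3)] for the 40x20x10x4 MLP above.
    activations = []
    h = x
    for i, (W, b) in enumerate(weights):
        h = h @ W + b
        # ReLU on hidden layers; the final layer's raw logits feed softmax.
        if i < len(weights) - 1:
            h = relu(h)
        activations.append(h.copy())
    return activations

# Toy check with random weights matching the layer sizes in this issue.
rng = np.random.default_rng(0)
sizes = [40, 20, 10, 4]
weights = [(rng.standard_normal((a, b)) * 0.1, np.zeros(b))
           for a, b in zip(sizes[:-1], sizes[1:])]
acts = forward_with_intermediates(rng.standard_normal(40), weights)
```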

@janjongboom
Member Author

I have put up a tool (which I think is also helpful for other things) that compares output between uTensor and TF:

https://github.com/janjongboom/utensor-fuzzer

It currently finds 383 cases in the associated test file where uTensor misclassifies. Output is something like this:

ERR (line: 349) - TensorFlow: [ '0.03293', '0.00571', '0.00938', '0.18354', '0.76844' ] , uTensor: [ '0.00767', '0.00152', '0.00252', '0.02009', '0.96820' ]
ERR (line: 350) - TensorFlow: [ '0.04695', '0.00689', '0.01163', '0.80370', '0.13083' ] , uTensor: [ '0.01621', '0.00238', '0.00540', '0.92386', '0.05214' ]
...

Note that when quantization is disabled all these errors disappear, so the problem is definitely somewhere in the quantization path.
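For intuition on the error quantization introduces, here is a minimal affine 8-bit quantize/dequantize round trip (an assumed scheme, similar in spirit to TF's quint8 quantized ops; not necessarily uTensor's exact implementation) applied to the logits from earlier in the thread:

```python
import numpy as np

def quantize(x, lo, hi):
    # Affine 8-bit quantization over the range [lo, hi] (assumed scheme).
    scale = (hi - lo) / 255.0
    q = np.clip(np.round((x - lo) / scale), 0, 255).astype(np.uint8)
    return q, scale

def dequantize(q, lo, scale):
    return q.astype(np.float32) * scale + lo

x = np.array([-0.986198, -2.069798, -2.538547, -2.106324], dtype=np.float32)
q, scale = quantize(x, x.min(), x.max())
x_hat = dequantize(q, x.min(), scale)
# Per-value round-trip error is bounded by half a quantization step,
# far too small on its own to explain the misclassification above.
assert np.max(np.abs(x - x_hat)) <= scale / 2 + 1e-6
```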

@dboyliao
Member

dboyliao commented Sep 3, 2019

Okay. Good to know that the issue is due to quantization.
I'll try to see if I can integrate your work into the CLI so it'll be much easier to run the tests.

@janjongboom
Member Author

@dboyliao I've also added the simplest case I can think of here: https://github.com/janjongboom/utensor-fuzzer/tree/super-simple-usecase. Just a two-layer NN (one input layer plus a softmax layer) and two examples (one classifies correctly, one doesn't).
