Custom model with grad-CAM #216

Open
memon-aliraza opened this issue Apr 7, 2020 · 0 comments

I have a custom model that takes two input images and decides whether they are the same or different. I am having trouble passing multiple inputs when computing grad-CAM. My model architecture is as follows:

Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 256, 256, 3)  0                                            
__________________________________________________________________________________________________
input_2 (InputLayer)            (None, 256, 256, 3)  0                                            
__________________________________________________________________________________________________
encoder (Sequential)            (None, 7, 7, 256)    3752704     input_1[0][0]                    
                                                                 input_2[0][0]                    
__________________________________________________________________________________________________
Merged_feature_map (Concatenate (None, 7, 7, 512)    0           encoder[1][0]                    
                                                                 encoder[2][0]                    
__________________________________________________________________________________________________
mnet_conv1 (Conv2D)             (None, 7, 7, 1024)   2098176     Merged_feature_map[0][0]         
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 7, 7, 1024)   4096        mnet_conv1[0][0]                 
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 7, 7, 1024)   0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
mnet_pool1 (MaxPooling2D)       (None, 3, 3, 1024)   0           activation_1[0][0]               
__________________________________________________________________________________________________
mnet_conv2 (Conv2D)             (None, 3, 3, 2048)   8390656     mnet_pool1[0][0]                 
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 3, 3, 2048)   8192        mnet_conv2[0][0]                 
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 3, 3, 2048)   0           batch_normalization_2[0][0]      
__________________________________________________________________________________________________
mnet_pool2 (MaxPooling2D)       (None, 1, 1, 2048)   0           activation_2[0][0]               
__________________________________________________________________________________________________
reshape_1 (Reshape)             (None, 1, 2048)      0           mnet_pool2[0][0]                 
__________________________________________________________________________________________________
fc1 (Dense)                     (None, 1, 256)       524544      reshape_1[0][0]                  
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 1, 256)       1024        fc1[0][0]                        
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 1, 256)       0           batch_normalization_3[0][0]      
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 1, 256)       0           activation_3[0][0]               
__________________________________________________________________________________________________
fc2 (Dense)                     (None, 1, 128)       32896       dropout_1[0][0]                  
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 1, 128)       512         fc2[0][0]                        
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 1, 128)       0           batch_normalization_4[0][0]      
__________________________________________________________________________________________________
dropout_2 (Dropout)             (None, 1, 128)       0           activation_4[0][0]               
__________________________________________________________________________________________________
fc3 (Dense)                     (None, 1, 64)        8256        dropout_2[0][0]                  
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 1, 64)        256         fc3[0][0]                        
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 1, 64)        0           batch_normalization_5[0][0]      
__________________________________________________________________________________________________
dropout_3 (Dropout)             (None, 1, 64)        0           activation_5[0][0]               
__________________________________________________________________________________________________
fc4 (Dense)                     (None, 1, 1)         65          dropout_3[0][0]                  
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 1, 1)         4           fc4[0][0]                        
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 1, 1)         0           batch_normalization_6[0][0]      
__________________________________________________________________________________________________
dropout_4 (Dropout)             (None, 1, 1)         0           activation_6[0][0]               
__________________________________________________________________________________________________
reshape_2 (Reshape)             (None, 1)            0           dropout_4[0][0]                  
==================================================================================================

The model takes two inputs and extracts features with the shared encoder network. The two feature maps are concatenated, and the rest of the network decides whether the images are the same or not; a simplified sketch of the wiring is below.
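
This is only a rough reconstruction, not the exact training code: build_encoder() is a stand-in for the real encoder (only its (7, 7, 256) output shape matters here), and the kernel sizes, activations, and dropout rates shown are illustrative.

from keras.layers import (Input, Conv2D, BatchNormalization, Activation,
                          MaxPooling2D, Reshape, Dense, Dropout, Concatenate)
from keras.models import Model, Sequential

def build_encoder():
    # Stand-in for the real encoder; only its (7, 7, 256) output
    # shape matters for this sketch.
    enc = Sequential(name='encoder')
    enc.add(Conv2D(32, 3, strides=2, padding='same', activation='relu',
                   input_shape=(256, 256, 3)))                    # -> 128x128
    for filters in (64, 128, 128, 256):
        enc.add(Conv2D(filters, 3, strides=2, padding='same',
                       activation='relu'))                        # -> down to 8x8
    enc.add(Conv2D(256, 2, padding='valid', activation='relu'))   # -> 7x7
    return enc

input_1 = Input(shape=(256, 256, 3))
input_2 = Input(shape=(256, 256, 3))

encoder = build_encoder()      # shared weights: the same encoder network
feat_1 = encoder(input_1)      # is applied to both inputs
feat_2 = encoder(input_2)

x = Concatenate(name='Merged_feature_map')([feat_1, feat_2])
x = Conv2D(1024, 2, padding='same', name='mnet_conv1')(x)
x = MaxPooling2D(name='mnet_pool1')(Activation('relu')(BatchNormalization()(x)))
x = Conv2D(2048, 2, padding='same', name='mnet_conv2')(x)
x = MaxPooling2D(name='mnet_pool2')(Activation('relu')(BatchNormalization()(x)))
x = Reshape((1, 2048))(x)

for units, name in [(256, 'fc1'), (128, 'fc2'), (64, 'fc3')]:
    x = Dense(units, name=name)(x)
    x = Dropout(0.5)(Activation('relu')(BatchNormalization()(x)))

x = Dense(1, name='fc4')(x)
x = Dropout(0.5)(Activation('sigmoid')(BatchNormalization()(x)))
output = Reshape((1,))(x)

model = Model([input_1, input_2], output)

With that architecture in mind, I have tried the following code: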


import numpy as np
from keras.models import load_model
import keras.backend as K
import matplotlib.cm as cm
from vis.utils import utils
from vis.visualization import visualize_cam

model = load_model('model.h5', compile=False)

img1 = utils.load_img('/path/image1.jpg', target_size=(256, 256))
img2 = utils.load_img('/path/image2.jpg', target_size=(256, 256))

penultimate_layer = utils.find_layer_idx(model, 'mnet_conv2')
layer_idx = utils.find_layer_idx(model, 'fc4')

# Each image is passed on its own, even though the model has two inputs.
for img in [img1, img2]:
    grads = visualize_cam(model,
                          layer_idx,
                          filter_indices=1,
                          seed_input=img,
                          penultimate_layer_idx=penultimate_layer)

I am receiving the following error:

ValueError: slice index 1 of dimension 2 out of bounds. for 'strided_slice' (op: 'StridedSlice') with input shapes: [?,1,1], [3], [3], [3] and with computed input tensors: input[1] = <0 0 1>, input[2] = <0 0 2>, input[3] = <1 1 1>

I am looking for a way to pass both images and plot a heatmap on each of them or, if possible, to modify the network architecture so the images can be processed one at a time. In short, I want to visualize where the network is focusing for a given image.
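
Two things I noticed while digging into the traceback: fc4 has a single output unit, so its output shape is (None, 1, 1) (the [?,1,1] in the error), and filter_indices=1 slices one index past that dimension; 0 is the only valid value there. More fundamentally, the model has two inputs, so a single array as seed_input cannot feed it.

A direction I am considering is to bypass visualize_cam and compute Grad-CAM by hand at the two encoder call outputs (the encoder[1][0] and encoder[2][0] nodes in the summary), which gives one 7x7 heatmap per image. A rough sketch, assuming a TF1-style graph-mode Keras backend (the same one keras-vis needs) and an illustrative function name:

import numpy as np
import keras.backend as K

def grad_cam_per_input(model, img1, img2):
    # img1/img2: float arrays of shape (256, 256, 3), preprocessed the
    # same way the model saw them during training.
    score = model.output[:, 0]                 # the single similarity unit
    encoder = model.get_layer('encoder')
    # The shared encoder is called once per input; per the summary,
    # those calls are nodes 1 and 2 of the layer.
    feats = [encoder.get_output_at(1), encoder.get_output_at(2)]
    grads = K.gradients(score, feats)          # one gradient tensor per call
    fetch = K.function(model.inputs, feats + grads)

    f1, f2, g1, g2 = fetch([img1[None], img2[None]])
    cams = []
    for f, g in zip((f1, f2), (g1, g2)):
        weights = g[0].mean(axis=(0, 1))                 # channel weights (256,)
        cam = np.maximum((f[0] * weights).sum(-1), 0.)   # weighted sum + ReLU
        cams.append(cam / (cam.max() + 1e-8))            # 7x7, scaled to [0, 1]
    return cams

Each 7x7 map could then be upsampled to 256x256 and overlaid on the corresponding input image. Is something along these lines the recommended route for multi-input models, or can visualize_cam accept a list as seed_input?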
