Some question about computing the adversarial saliency map in JSMA attack #2306

HIT1180300227 opened this issue Oct 9, 2023 · 3 comments


@HIT1180300227

Hi,

When using the JSMA method, I found that the implementation of the adversarial saliency map in this toolbox is slightly different from the original paper:

In this toolbox, the corresponding implementation in saliency_map.py looks like this:

  def _saliency_map(self, x: np.ndarray, target: Union[np.ndarray, int], search_space: np.ndarray) -> np.ndarray:
  
    grads = self.estimator.class_gradient(x, label=target)
    grads = np.reshape(grads, (-1, self._nb_features))

    # Remove gradients for already used features
    used_features = 1 - search_space
    coeff = 2 * int(self.theta > 0) - 1
    grads[used_features == 1] = -np.inf * coeff

    if self.theta > 0:
        ind = np.argpartition(grads, -2, axis=1)[:, -2:]
    else:  # pragma: no cover
        ind = np.argpartition(-grads, -2, axis=1)[:, -2:]

    return ind

I notice that ind is selected directly from grads, i.e. only from the gradients of the target class.

But in the original paper, the adversarial saliency map is computed like this:

[Screenshot: the adversarial saliency map equation S(X, t) from the paper]

or with the heuristic equation like this:

[Screenshot: the paper's heuristic version of the saliency map equation]

I'm confused about this difference.
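
For concreteness, this is how I read the paper's per-feature saliency map (for the theta > 0 case, i.e. increasing features), as a minimal NumPy sketch. This is not the toolbox's code; the function name paper_saliency_map and the (nb_classes, nb_features) Jacobian layout are just my assumptions:

import numpy as np

def paper_saliency_map(jacobian, target):
    # jacobian: (nb_classes, nb_features) array holding dF_j(X) / dX_i
    # returns one saliency value per feature; larger means more salient
    target_grad = jacobian[target]                   # dF_t / dX_i
    other_grad = jacobian.sum(axis=0) - target_grad  # sum over j != t of dF_j / dX_i

    saliency = target_grad * np.abs(other_grad)
    # the paper sets S(X, t)[i] = 0 when the feature would decrease the target
    # output or increase the combined output of the other classes
    saliency[(target_grad < 0) | (other_grad > 0)] = 0.0
    return saliency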

@beat-buesser
Collaborator

Hi @HIT1180300227 I think this implementation of JSMA neglects the additional terms involving the gradients towards classes other than the target class. Have you been able to use the attack successfully?

@HIT1180300227
Author

Hi @beat-buesser ,

I use the JSMA method in the IDS (intrusion detection system) field. Specifically, I apply the targeted JSMA method to the statistical feature vectors as follows:

from art.estimators.classification import KerasClassifier
from art.attacks.evasion import SaliencyMapMethod

art_classifier = KerasClassifier(model=model, use_logits=False)
attack = SaliencyMapMethod(classifier=art_classifier, theta=theta, gamma=gamma, batch_size=1, verbose=True)

# x_test are the original statistical feature vectors
targeted_x_test_jsma = attack.generate(x=x_test, y=numpy_targets)

Before the JSMA attack, I get 90% classification accuracy. After applying this attack, the classification accuracy drops to 20%.
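
For reference, this is roughly how I measure that drop (just a sketch; y_test holding the true labels as class indices is assumed and not shown above):

import numpy as np

clean_pred = np.argmax(model.predict(x_test), axis=1)
adv_pred = np.argmax(model.predict(targeted_x_test_jsma), axis=1)

print("clean accuracy:", np.mean(clean_pred == y_test))        # about 0.90 in my case
print("adversarial accuracy:", np.mean(adv_pred == y_test))    # about 0.20 in my case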

It seems that although the implementation of this attack is not consistent with the original paper, it can still successfully fool the classification model.

Why does the JSMA attack still work?

@beat-buesser
Collaborator

beat-buesser commented Nov 1, 2023

Hi @HIT1180300227 I think it still works because the main component of the gradients is the same, i.e. the direction in which the target class's logit value increases. The paper is more precise in that it adds terms to this update direction to make sure the logits of the other classes are not increasing. It looks like for many applications these additional terms are small or negligible, but they would be more complicated to implement.
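
A quick toy check of this intuition (just a sketch with a hand-made Jacobian, not ART code): when the non-target gradients are small, ranking features by the target-class gradient alone selects the same features as the paper's full saliency map.

import numpy as np

jacobian = np.array([
    [0.90, 0.10, 0.50, -0.30, 0.70],      # dF_target / dX_i
    [-0.02, -0.01, -0.03, 0.01, -0.02],   # small gradients for the other classes
    [-0.01, -0.02, -0.01, 0.02, -0.03],
])
target = 0

target_grad = jacobian[target]
other_grad = jacobian[1:].sum(axis=0)    # sum over the non-target classes

# ART-style selection: top-2 features by the target-class gradient only
art_pick = set(np.argpartition(target_grad, -2)[-2:])

# Paper-style: S(X, t)[i] = dF_t/dX_i * |sum_{j != t} dF_j/dX_i|, zeroed where
# dF_t/dX_i < 0 or sum_{j != t} dF_j/dX_i > 0
saliency = target_grad * np.abs(other_grad)
saliency[(target_grad < 0) | (other_grad > 0)] = 0.0
paper_pick = set(np.argpartition(saliency, -2)[-2:])

print(art_pick, paper_pick)  # both {0, 4} for this Jacobian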
