@inproceedings{biggio2013evasion,
title={Evasion attacks against machine learning at test time},
author={Biggio, Battista and Corona, Igino and Maiorca, Davide and Nelson, Blaine and {\v{S}}rndi{\'c}, Nedim and Laskov, Pavel and Giacinto, Giorgio and Roli, Fabio},
booktitle={Joint European Conference on Machine Learning and Knowledge Discovery in Databases},
pages={387--402},
year={2013},
organization={Springer}
}
The authors presented a gradient-descent-based approach to attack the target model. The attack strategy is
$$
\begin{aligned}
\mathbf{x}^{*} = \underset{\mathbf{x}}{\arg\min}\ & \hat{g}(\mathbf{x}) \\
\text{s.t.}\ & d\left(\mathbf{x}, \mathbf{x}^{0}\right) \leq d_{\max}
\end{aligned}
\tag{1}
$$
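Eq. (1) can be sketched as projected gradient descent: take steps down the gradient of the surrogate discriminant $\hat{g}$, then project back onto the feasible ball $d(\mathbf{x}, \mathbf{x}^{0}) \leq d_{\max}$. A minimal sketch, assuming a hypothetical linear surrogate $\hat{g}(\mathbf{x}) = \mathbf{w}^\top\mathbf{x} + b$ and an $L_2$ distance (the paper also covers nonlinear SVM and neural-network discriminants):

```python
import numpy as np

def g_hat(x, w, b):
    """Hypothetical linear surrogate discriminant: g(x) = w.x + b.
    g > 0 => classified malicious, g < 0 => benign."""
    return w @ x + b

def evade(x0, w, b, d_max=1.0, eta=0.1, steps=100):
    """Projected gradient descent on Eq. (1): minimize g_hat(x)
    subject to ||x - x0||_2 <= d_max."""
    x = x0.copy()
    for _ in range(steps):
        x = x - eta * w                     # gradient of a linear g_hat is w
        delta = x - x0
        nrm = np.linalg.norm(delta)
        if nrm > d_max:                     # project back onto the L2 ball
            x = x0 + delta * (d_max / nrm)
    return x

# toy example: x0 starts on the malicious side of the boundary
w = np.array([1.0, -0.5]); b = 0.2
x0 = np.array([0.8, 0.3])                   # g_hat(x0) = 0.85 > 0
x_star = evade(x0, w, b, d_max=1.0)         # g_hat(x_star) < 0: evades
```

With `d_max` large enough, the final point sits on the boundary of the distance ball in the direction $-\mathbf{w}/\|\mathbf{w}\|$, the steepest-descent direction for a linear surrogate.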
This strategy is particularly susceptible to failure because the discriminant function $\hat{g}(\mathbf{x})$ does not incorporate the evidence we have about the data distribution $p(\mathbf{x})$. Gradient descent on Eq. (1) can therefore drive $\mathbf{x}$ into regions of feature space that are sparsely populated by real samples, where the surrogate's estimate of $\hat{g}$ is unreliable and the attack may still be detected by the true classifier.
Then an additional (mimicry) component is introduced; the objective becomes:
$$
\begin{aligned}
\underset{\mathbf{x}}{\arg\min}\ & F(\mathbf{x}) = \hat{g}(\mathbf{x}) - \frac{\lambda}{n} \sum_{i \mid y_{i}^{c}=-1} k\left(\frac{\mathbf{x}-\mathbf{x}_{i}}{h}\right) \\
\text{s.t.}\ & d\left(\mathbf{x}, \mathbf{x}^{0}\right) \leq d_{\max}
\end{aligned}
\tag{2}
$$
where $h$ is a bandwidth parameter for a kernel density estimator (KDE), $\lambda$ trades off evading $\hat{g}$ against mimicking benign data, and $n$ is the number of benign samples ($y_i^c = -1$) available to the adversary.
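A sketch of how the KDE term changes the descent direction, assuming a Gaussian kernel $k(\mathbf{u}) = \exp(-\|\mathbf{u}\|^2/2)$, an $L_2$ distance, and the same hypothetical linear surrogate as above (all illustrative choices, not prescribed by the paper):

```python
import numpy as np

# Hypothetical linear surrogate g_hat(x) = w.x + b (illustrative assumption)
w = np.array([1.0, -0.5]); b = 0.2

def F_grad(x, x_benign, lam=1.0, h=0.5):
    """Gradient of Eq. (2) with Gaussian kernel k(u) = exp(-||u||^2 / 2),
    so k((x - x_i)/h) = exp(-||x - x_i||^2 / (2 h^2)).
    The mimicry term's gradient pulls x toward benign samples."""
    n = len(x_benign)
    diffs = x - x_benign                                     # shape (n, d)
    k = np.exp(-np.sum(diffs ** 2, axis=1) / (2 * h ** 2))
    return w + (lam / (n * h ** 2)) * (k[:, None] * diffs).sum(axis=0)

def evade_kde(x0, x_benign, d_max=1.0, eta=0.1, steps=200):
    """Projected gradient descent on F(x), s.t. ||x - x0||_2 <= d_max."""
    x = x0.copy()
    for _ in range(steps):
        x = x - eta * F_grad(x, x_benign)
        delta = x - x0
        nrm = np.linalg.norm(delta)
        if nrm > d_max:
            x = x0 + delta * (d_max / nrm)                   # project onto ball
    return x

x0 = np.array([0.8, 0.3])                        # initial malicious sample
x_benign = np.array([[-1.0, 1.0], [-0.8, 1.2]])  # benign data (y_c = -1)
x_star = evade_kde(x0, x_benign)
```

Because the KDE gradient points away from benign samples (its negative points toward them), descending it biases the attack point into regions densely populated by benign data rather than into arbitrary low-$\hat{g}$ regions.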
The added component estimates the density of benign samples around $\mathbf{x}$ via the KDE; the negative sign in $F$ means minimizing $F$ rewards high benign density, so the attack point mimics legitimate samples instead of merely crossing the decision boundary.
❌ I do not understand the relationship between the component and $p(\mathbf{x})$.