Papers

Query-efficient Black-box Adversarial Examples

Andrew Ilyas*, Logan Engstrom*, Anish Athalye*, Jessy Lin*

https://arxiv.org/abs/1712.07113

Abstract

Current neural network-based image classifiers are susceptible to adversarial examples, even in the black-box setting, where the attacker is limited to query access without access to gradients. Previous methods, namely substitute networks and coordinate-based finite-difference methods, are either unreliable or query-inefficient, making them impractical for certain problems.

We introduce a new method for reliably generating adversarial examples under more restricted, practical black-box threat models. First, we apply natural evolution strategies to perform black-box attacks using two to three orders of magnitude fewer queries than previous methods. Second, we introduce a new algorithm to perform targeted adversarial attacks in the partial-information setting, where the attacker only has access to a limited number of target classes. Using these techniques, we successfully perform the first targeted adversarial attack against a commercially deployed machine learning system, the Google Cloud Vision API, in the partial-information setting.
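
The query-efficient attack can be sketched roughly as follows: estimate the gradient of the target-class probability from black-box queries alone using antithetic Gaussian sampling, then take projected sign-gradient steps. This is only an illustrative sketch under assumptions, not the paper's released implementation; query_prob(image, target_class) is a hypothetical stand-in for a single black-box query returning the classifier's probability for the target class, and all hyperparameters are placeholders.

# Illustrative sketch (assumes image is a float array in [0, 1]).
import numpy as np

def nes_gradient(image, target_class, query_prob, n_samples=50, sigma=0.001):
    """Estimate the gradient of the target-class probability via antithetic Gaussian sampling."""
    grad = np.zeros_like(image, dtype=np.float64)
    for _ in range(n_samples):
        u = np.random.randn(*image.shape)                 # random search direction
        p_plus = query_prob(image + sigma * u, target_class)
        p_minus = query_prob(image - sigma * u, target_class)
        grad += (p_plus - p_minus) * u                    # antithetic finite-difference term
    return grad / (2 * n_samples * sigma)

def targeted_attack(image, target_class, query_prob, eps=0.05, lr=0.01, steps=300):
    """Projected gradient ascent on the target-class probability within an L-infinity ball."""
    adv = image.copy()
    for _ in range(steps):
        g = nes_gradient(adv, target_class, query_prob)
        adv = adv + lr * np.sign(g)                       # ascend the estimated gradient
        adv = np.clip(adv, image - eps, image + eps)      # project back into the eps-ball
        adv = np.clip(adv, 0.0, 1.0)                      # keep a valid image
    return adv

Each gradient estimate above costs 2 * n_samples queries, which is where the query savings relative to per-coordinate finite differences come from.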

@unpublished{blackbox,
  author = {Andrew Ilyas and Logan Engstrom and Anish Athalye and Jessy Lin},
  title = {Query-efficient Black-box Adversarial Examples},
  year = {2017},
  url = {https://arxiv.org/abs/1712.07113},
}

Synthesizing Robust Adversarial Examples

Anish Athalye*, Logan Engstrom*, Andrew Ilyas*, Kevin Kwok

https://arxiv.org/abs/1707.07397

Abstract

Neural network-based classifiers parallel or exceed human-level accuracy on many common tasks and are used in practical systems. Yet neural networks are susceptible to adversarial examples, carefully perturbed inputs that cause networks to misbehave in arbitrarily chosen ways. When generated with standard methods, these examples do not consistently fool a classifier in the physical world due to viewpoint shifts, camera noise, and other natural transformations; they also require complete control over the direct input to the classifier, which is impossible in many real-world systems.

We introduce the first method for constructing real-world 3D objects that consistently fool a neural network across a wide distribution of angles and viewpoints. We present a general-purpose algorithm for generating adversarial examples that are robust across any chosen distribution of transformations. We demonstrate its application in two dimensions, producing adversarial images that are robust to noise, distortion, and affine transformation. Finally, we apply the algorithm to produce arbitrary physical 3D-printed adversarial objects, demonstrating that our approach works end-to-end in the real world. Our results show that adversarial examples are a practical concern for real-world systems.
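
One way to realize such an optimization over a distribution of transformations is to ascend the target-class log-probability averaged over randomly sampled transformations of the candidate image. The sketch below is an illustration under assumptions, not the paper's released code: sample_transform() and grad_through_transform(t, image, target_class) are hypothetical stand-ins for drawing a random transformation (e.g. rotation, lighting change, noise) and for a white-box gradient of the target-class log-probability backpropagated through that transformation.

# Illustrative sketch (assumes image is a float array in [0, 1]).
import numpy as np

def robust_targeted_attack(image, target_class, sample_transform, grad_through_transform,
                           eps=0.1, lr=0.01, steps=500, batch=10):
    """Ascend the target-class log-probability averaged over random transformations."""
    adv = image.copy()
    for _ in range(steps):
        grad = np.zeros_like(adv, dtype=np.float64)
        for _ in range(batch):
            t = sample_transform()                        # draw a transformation from the chosen distribution
            # gradient of log P(target | t(adv)) with respect to adv,
            # i.e. backpropagated through the transformation t
            grad += grad_through_transform(t, adv, target_class)
        grad /= batch                                     # Monte Carlo estimate of the expected gradient
        adv = adv + lr * np.sign(grad)                    # gradient ascent step
        adv = np.clip(adv, image - eps, image + eps)      # stay close to the original image
        adv = np.clip(adv, 0.0, 1.0)
    return adv

Because the perturbation is optimized in expectation over the transformation distribution rather than for a single rendering, the resulting example tends to remain adversarial when the object is viewed from new angles or under new conditions.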

@unpublished{robustadv,
  author = {Anish Athalye and Logan Engstrom and Andrew Ilyas and Kevin Kwok},
  title = {Synthesizing Robust Adversarial Examples},
  year = {2017},
  url = {https://arxiv.org/abs/1707.07397},
}