
Picture a machine learning (ML) model on the front lines of battle trying to determine whether a hazy-looking image shows a school bus or a combat vehicle. The model does not look the picture up in a library of stored images; it applies the patterns it learned during training to classify what it sees. End users have little choice but to assume in good faith that the model has been trained correctly and that the images it classifies as school buses really are school buses. But malicious actors (nation-state or otherwise) can feed the model subtly manipulated images, causing it to mislabel a school bus as a combat vehicle, or to miss areas of interest entirely. It is important for defense leaders to understand the current and rising threat of adversarial machine learning.
What is adversarial machine learning?
Attacks like the one above, which target ML models and how they are built and run, are known as adversarial ML, and they are designed to impose steep costs on missions. They come at a time when forces around the world are looking to increase their adoption of AI and ML. On June 30, 2022, NATO launched a $1 billion innovation fund with the goal of accelerating work on AI, big data, and automation. In 2022 alone, the Pentagon is looking to spend about $1 billion on AI-related technology.
Adversarial ML is not science fiction; it is a present, growing threat. And as the development of AI systems for combat scenarios accelerates, the risk and complexity of these attacks are only going to grow. In fact, many see the security of AI systems as one of the greatest challenges in the global AI race.
What adversarial machine learning attacks currently exist?
The military has special reason to worry about adversarial ML because these attacks usually target computer vision models. Image classifiers of this kind, used to distinguish enemy from friendly assets in imagery, are among the AI technologies likely to see early adoption.
Adversarial ML attacks fall into several categories. In a “white box” attack, the attacker has access to the model and its parameters; in a “black box” attack, the attacker lacks such access. The most common adversarial attacks at this time can be classified as evasion attacks or model inversion attacks.
– An evasion attack is one in which the threat actor dupes the model by feeding it deliberately manipulated inputs during deployment. Tricking a model into interpreting a school bus as an enemy Humvee, for example, falls under this category. Evasion attacks are often “untargeted,” meaning the attacker does not care how the model’s prediction changes, only that it changes. Targeted attacks, in contrast, aim to elicit a specific outcome or prediction from the model. (A minimal sketch of an evasion attack appears after this list.)
– In a model inversion attack, a threat actor attempts to learn something about the private training data using only access to the model. For example, if a model is used for facial recognition, an attacker might try to reconstruct the faces the model is capable of recognizing. (A sketch of this technique also appears after the list.)
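To make the evasion category concrete, here is a minimal sketch of an untargeted evasion attack using the Fast Gradient Sign Method (FGSM), a well-known white-box technique that nudges each pixel in the direction that most increases the model’s loss. The PyTorch classifier, image tensor, and epsilon value below are illustrative placeholders, not a description of any fielded system.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fgsm_evasion(model: nn.Module,
                 image: torch.Tensor,
                 label: torch.Tensor,
                 epsilon: float = 0.03) -> torch.Tensor:
    """Return an adversarially perturbed copy of `image` (untargeted FGSM)."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step each pixel in the direction that increases the loss, then keep
    # pixel values in the valid [0, 1] range.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0.0, 1.0).detach()

# Toy usage with a stand-in classifier (illustrative only).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
image = torch.rand(1, 3, 32, 32)   # placeholder image tensor
label = torch.tensor([0])          # placeholder true-class index
adversarial = fgsm_evasion(model, image, label)
print(model(image).argmax(1).item(), model(adversarial).argmax(1).item())
```

A targeted variant would instead step so as to decrease the loss toward an attacker-chosen class (a school bus pushed toward “combat vehicle,” say) rather than simply increasing the loss on the true label.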
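And here is a minimal sketch of the model inversion idea: the attacker optimizes an input to maximize the model’s score for one class, gradually reconstructing a representative example of the private training data (for a facial recognition model, an approximation of a face it was trained on). The sketch assumes white-box gradient access for simplicity; black-box variants estimate the same signal from repeated queries. All names and shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

def invert_class(model: nn.Module,
                 target_class: int,
                 input_shape=(1, 3, 32, 32),
                 steps: int = 200,
                 lr: float = 0.1) -> torch.Tensor:
    """Gradient-ascent reconstruction of what the model associates with
    `target_class`, starting from a blank image."""
    x = torch.zeros(input_shape, requires_grad=True)
    optimizer = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        score = model(x)[0, target_class]
        (-score).backward()          # maximize the target-class score
        optimizer.step()
        x.data.clamp_(0.0, 1.0)      # keep pixels in a valid range
    return x.detach()

# Toy usage with a stand-in classifier (illustrative only).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
reconstruction = invert_class(model, target_class=3)
```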
The need for ongoing independent testing & validation
Whatever kind of attack threat actors carry out, keeping them at bay is key to the successful field deployment of AI.
When missions involve split-second decisions based on data, you have to trust the data. But the possibility of adversarial ML attacks dilutes user confidence: Can we fully trust the answer the ML model is giving us? The answer might be “no,” and the primary safeguard is model testing before, during, and after deployment to gain visibility into any security issues.
Adversarial robustness, a model’s ability to resist being tricked, is a key performance indicator as we look to increase the deployment of AI and ML on the front lines. Repeated testing can catch malicious data and other instances of adversarial ML before they cause lasting damage in the field; a simple robustness check is sketched below.
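As a sketch of what such testing can look like in practice, the snippet below measures accuracy on clean inputs and on FGSM-perturbed inputs at growing perturbation budgets, producing a simple robustness curve that can be tracked over time. The model, data, and budget values are placeholders chosen for illustration, not a prescribed test suite.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def perturb(model, images, labels, epsilon):
    """One-step FGSM perturbation, as in the earlier evasion sketch."""
    images = images.clone().detach().requires_grad_(True)
    F.cross_entropy(model(images), labels).backward()
    return (images + epsilon * images.grad.sign()).clamp(0.0, 1.0).detach()

def robustness_report(model, images, labels, budgets=(0.0, 0.01, 0.03, 0.1)):
    """Accuracy under increasing perturbation budgets: a simple robustness KPI."""
    for eps in budgets:
        x = images if eps == 0.0 else perturb(model, images, labels, eps)
        with torch.no_grad():
            accuracy = (model(x).argmax(1) == labels).float().mean().item()
        print(f"epsilon={eps:.2f}  accuracy={accuracy:.3f}")

# Toy usage with a stand-in classifier and random data (illustrative only).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
images, labels = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
robustness_report(model, images, labels)
```

A sharp drop in accuracy as the budget grows is a signal that the model needs hardening (for example, adversarial training) before it can be trusted in the field.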
Testing is not a one-and-done operation. A model that proves resilient in the early stages of deployment is not guaranteed to stay resilient. Adversarial ML is an evolving field, and the methods threat actors use to cause harm are likely to change. Ongoing testing is key to staying ahead of that evolution on the front lines.
Adversarial attacks on ML models can be devastating, leading to loss of life and eroding trust in the systems themselves. Independent and frequent testing of ML models, both before and during deployment, is a strategic answer to a persistent threat that will only grow more urgent in the years ahead.