
AI Red-teaming: Using a cutting-edge military technique to safeguard your AI

Red-teaming, a military process through which one challenges assumptions and adopts an attacker’s mindset, can help organizations understand and address the unique risks introduced by artificial intelligence and machine learning. In this article, CalypsoAI’s Chief Business Officer discusses AI red-teaming.

I first learned about red-teaming while planning a wartime humanitarian operation. My team and I were in Northern Iraq, just outside the ISIS-held city of Mosul. We were preparing operational plans that called for us to enter the city to deliver much-needed humanitarian aid while Iraqi coalition forces were still engaging ISIS.

To make the desired impact, we would have to get to the front lines. Because the operation carried substantial risks, we diligently crafted contingency plans for a variety of potential failures. Having worked in conflict zones for years, I thought I understood how to effectively plan for — and mitigate — risk. But one of my colleagues, a veteran of two wars, was not convinced.

In his experience, it was not enough to have a contingency plan in place for a vehicle in our convoy breaking down, communications equipment failing, or a member of our team being wounded. As he reasoned, ISIS likely knew humanitarians and civilians would be right behind the military. In fact, given their expansive intelligence operation and the tacit support of many civilians in the region, they may have even known how to target our outfit directly.

We had already seen ISIS stage attacks to look like they were perpetrated by civilians, and we knew they had made a habit of using young boys as spotters. So, in short, this was no ordinary humanitarian mission. To mitigate these additional risks, we had to think like an attacker. If we were the enemy, my colleague asked, how would we attack our convoy?

The night before we entered Mosul, we ran through our operation again, this time with one of us acting like an attacker. We reassessed where to stop, how to manage crowds, and which routes to take. We put ourselves in the mindset of an ISIS militant — a dark place that was uncomfortable even for trained military professionals — and thought through how we might end up failing, or worse, inadvertently doing more harm than good to the civilian population we were trying to assist. By the end of the planning session, we had redesigned our strategy in a way that accounted for previously unseen risks.

Ultimately, the plan succeeded. Over the next week, we spent much of our time in an active battlespace, but thanks to our careful planning, we delivered our humanitarian aid effectively and all made it home safely.

In the years since this operation, I have used red-teaming to interrogate two administrations’ national security strategies as a White House Presidential Innovation Fellow, and I now use it regularly as an executive at CalypsoAI. And while red-teaming can make a considerable impact on any business, it is especially effective at fine-tuning artificial intelligence (AI) and machine learning (ML) applications, the success of which increasingly hinges on one’s ability to think like an adversary.

Why we need red-teaming for artificial intelligence and machine learning

Thinking like an adversary is uncomfortable; outside the military and security sectors, imagining how you would attack another person is quite unusual. On the battlefield, that kind of thinking saves lives. In the boardroom, it can save your company from catastrophic breaches and failures.

While red-teaming is a necessary component of any digital transformation process, it is absolutely central to deployments of AI/ML, as these technologies are maturing far faster than their attendant security protocols. As was the case during the early days of the internet, AI/ML is being incorporated into both consumer-facing and mission-critical systems despite containing massive security gaps. Red-teaming these systems is arguably the most effective way to close those gaps.

Unfortunately, instead of prioritizing security, most organizations are rapidly expanding their use of AI/ML and treating security not as a “must have,” but as a “nice to have.” In fact, according to a McKinsey survey of 2,000 participants working in 10 distinct sectors:

  • 30 percent of organizations are conducting AI pilots;
  • 47 percent embedded at least one AI capability in their standard business process in 2018, compared to just 21 percent in 2017;
  • 71 percent expect AI investment to increase significantly in the coming years;
  • 78 percent report receiving moderate or significant value from AI, compared to just one percent who report receiving negative value.

Despite their demonstrated willingness to pour resources into AI/ML capabilities, most organizations have yet to spend anywhere near as much on AI security — whether through SaaS platforms, consulting, or in-house development. In a Calypso survey of the AI market, we found only a handful of companies that were paying attention to the security implications of their AI/ML. This small cohort was composed primarily of large companies with both the capital and engineering talent to not only understand these implications fully, but actually address them.

However, for the majority of companies we surveyed, the primary obstacle to achieving AI security was neither a lack of capital nor a lack of engineering talent — plenty of firms had both, yet were still unconcerned — but a failure of imagination.

We need to start designing AI/ML with risks in mind

If you ask AI/ML developers to describe their jobs, most will talk about generating insights, building new products and services, creating optimization workflows, and/or building next-generation platforms. If security makes the list at all, it will almost always be somewhere near the end.

This is a significant oversight, as most corporate Chief Information Security Officers (CISOs) are already stretched thin attempting to secure traditional cyber-systems. What’s more, even if a forward-thinking CISO does have their eye on the new digital risks introduced by AI/ML, they likely don’t have the requisite staff or budget to address these concerns.

But beyond staffing and budgetary issues, there is a larger, more structural issue at play: the industry at large seldom designs technology — least of all AI/ML — with security risks in mind. Technologists tend to be optimists, believing we can create a better world through the development of new digital tools and techniques.

For instance, the engineers who designed the early generations of self-driving cars did not expect their cars’ computer vision systems to be hacked in ways that would cause crashes. Indeed, the very purpose of these systems was to avoid crashes! But this is exactly what happened. Likewise, the creators of voice applications did not write their code thinking people would embed hidden commands in audio files in ways that would compromise the applications’ security, but again, this is exactly what came to pass.

Such optimism-to-a-fault has shaped the tech space since at least the advent of the internet. Few people anticipated that the internet would become the driving engine of global finance and communications (and so much more) that it has become, and as a result, few meaningful security measures were built into its foundation. This created a massive cybersecurity debt that organizations are still trying to pay off today. The harsh reality is that we are building AI/ML in much the same way: open by default, and rarely with security built in.

Using red-teaming to play devil’s advocate

Developed in the military, red-teaming is the practice of forcing a team to think through dissenting views of its strategy and its security posture. In other words, red-teaming involves assuming the role not only of the devil’s advocate, but of an attacker. It’s not merely about assessing risks; it’s about looking for opportunities to structure the operating rhythms of a company so that security is woven into the very fabric of the organization.

Highly agile companies like Amazon and Google frequently use red-teaming (sometimes under a different name) to assess new strategies, products, and services. According to legendary Ford executive Alan Mulally, red-teaming is essential at a strategic level “because your competitors are changing, the technology is changing, and you’re never done. You always need to be working on a better plan to serve your customers and grow your business.”

In more strictly technological contexts, red-teams adopt the optimism of a technologist while assuming the role of an attacker. A good red-team should neither shoot down a project nor stand in the way of a new product rollout, but should instead take for granted that attackers will find some way to trick, fool, or maliciously abuse a digital system — and try to facilitate a defense against such abuses. Simply put, red-teams look at new technologies and ask, “If I were an attacker, how might I use this technology to my advantage?”

Types of AI/ML attacks

Taking a step back, it bears asking why anyone would want to attack an AI/ML system in the first place. In most cases, it has nothing to do with the core underlying technology. Instead, attackers simply use the endpoints created by an AI/ML system’s data collection processes as a means of executing some sort of criminal activity, such as bank fraud, data theft, or evading cybersecurity controls. Strictly speaking, then, these are not so much new risks as new permutations of the kinds of attacks that have been directed at traditional networks and systems for years.

There are three primary types of attacks that can be leveled against an AI/ML system. Each involves weaponizing data in a way that tricks the model into making the wrong judgment based on a given set of inputs.

  • Evasion attacks: Evasion attacks use slight changes, or perturbations, to data inputs that are designed to trick a classifier. These could include malware hidden inside an otherwise benign file, or slight changes to an image that lead a military AI to misclassify a warship (a minimal code sketch of this idea follows the list).
  • Poisoning attacks: In a poisoning attack, training data is perturbed in a way that creates backdoors in an AI/ML model’s decision-making. For example, an attacker could plant false data in the public data sources an organization uses to train its models. The attacker would then know that any model trained on this data contains specific backdoors they could exploit at any time.
  • Model stealing attacks: Model stealing attacks use black-box probing of a running model to understand how the model thinks. This type of attack could be used to steal stock trading algorithms from a rival firm, or to learn how to trick an AI/ML-enabled cybersecurity system by discovering what it classifies as benign and what it classifies as malicious.
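
To make the evasion case concrete, here is a minimal Python sketch of the idea. The toy linear classifier, the weights, and the epsilon budget are all my own illustrative assumptions, not a real model or CalypsoAI tooling; the point is simply that a small, bounded perturbation can flip a classifier’s decision.

import numpy as np

# Toy illustration of an evasion attack (not CalypsoAI tooling): a small,
# targeted perturbation flips a linear classifier's decision while every
# individual feature changes by at most epsilon.

rng = np.random.default_rng(0)

# Stand-in "model": logistic regression with fixed weights.
w = rng.normal(size=20)

def predict(x):
    """Probability that input x belongs to class 1."""
    return 1.0 / (1.0 + np.exp(-(x @ w)))

# A benign input that sits comfortably on the class-1 side (logit = +2).
x_clean = 2.0 * w / (w @ w)
print(f"clean score:        {predict(x_clean):.3f}")   # well above 0.5

# Evasion step: for a linear model, the gradient of the score with respect to
# the input is proportional to w, so stepping against sign(w) is the fastest
# way to drive the score down; the same idea underlies fast-gradient-sign
# attacks on neural networks.
epsilon = 0.3                                           # per-feature budget
x_adv = x_clean - epsilon * np.sign(w)

print(f"adversarial score:  {predict(x_adv):.3f}")      # pushed below 0.5
print(f"max feature change: {np.max(np.abs(x_adv - x_clean)):.3f}")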

The Calypso Attack Threat Model

Successful red-teaming begins with understanding the motivations behind an attack and the operational risks the attack could present. The Calypso Attack Threat Model gives red-teams a structured way to understand the critical aspects of an attack: it examines the motivations, technical feasibility, and outcomes of various attacks in order to gauge their relative risk.

The Calypso Attack Threat Model starts with the data being used. Is the AI/ML system using visual, audio, code, time series, or tabular data sources — or a mix of them all? Every data type is susceptible to adversarial attacks, but different types are more susceptible to certain kinds of attacks than others.

Next, the model looks at an attacker’s motivations. For instance, an attack that involves hacking the computer vision system of a self-driving car will yield limited returns for a financially-motivated attacker. Conversely, an attack that involves figuring out a bank’s fraud-detection algorithm will yield more immediate financial benefits. That said, red-teaming is all about imagination. Just because an attack’s potential payout might seem small to the average developer doesn’t mean the attack should be discounted. Instead, every risk should be thoroughly interrogated, and all worst-case scenarios should be played out in full.

Finally, the Calypso Attack Threat Model reviews the business, legal, and compliance risks of an AI/ML failure. This helps determine the overall cost to the company of an AI/ML breach, so that each possible failure can be weighted by what it would actually cost.
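
As an illustration only, a red-team could record these dimensions in a simple structure and use them to rank which attacks deserve attention first. The field names and the 1-to-5 scales below are my own assumptions for the sketch, not the actual structure of the Calypso Attack Threat Model.

from dataclasses import dataclass

@dataclass
class ThreatAssessment:
    """One candidate attack against an AI/ML system, scored along the
    dimensions discussed above (illustrative scales, not Calypso's)."""
    system: str            # the AI/ML system under review
    data_type: str         # "visual", "audio", "code", "time series", "tabular", ...
    attack_type: str       # "evasion", "poisoning", or "model stealing"
    attacker_payoff: int   # 1 (little to gain) .. 5 (large financial/strategic gain)
    feasibility: int       # 1 (needs insiders and rare skills) .. 5 (laptop and public access)
    business_impact: int   # 1 (nuisance) .. 5 (legal, compliance, or safety catastrophe)

    @property
    def risk_score(self) -> int:
        # Multiplicative, so a near-zero score on any dimension keeps overall risk low.
        return self.attacker_payoff * self.feasibility * self.business_impact


assessments = [
    ThreatAssessment("fraud-detection model", "tabular", "model stealing", 5, 3, 5),
    ThreatAssessment("self-driving vision stack", "visual", "evasion", 2, 2, 5),
]

for a in sorted(assessments, key=lambda a: a.risk_score, reverse=True):
    print(f"{a.system:28s} {a.attack_type:15s} risk={a.risk_score}")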

How to red-team AI/ML

As mentioned above, red-teaming an AI/ML system begins with answering questions about motivations and feasibility. Key questions to ask include:

  • What does this AI/ML system do and who benefits from its use?
    • Look into the benign applications of the AI/ML and the outcomes it is expected to deliver for your organization.
  • What benefit(s) could someone gain from attacking this system?
    • Examine the expected payoff of an attack on the system. Fraud? Terrorism? Fake news? Malware injection?
  • In what sorts of ways do people currently get these benefits?
    • How easy is it to achieve a similar payoff without attacking an AI/ML?
  • What level of technical capabilities would be required to execute the attack?
    • Can this attack be done by anyone with a laptop, or is sophisticated equipment or knowledge required?
  • Can we generate specific adversarial examples that break the system, and how quickly can it be done?
    • This is where Calypso’s Red-Team comes in handy. You need someone to try to break your model to find out whether the theoretical can be translated into the tactical.
  • What level of data access is required?
    • Does this attack need a conspiring insider or access to private training data, or can it be performed entirely from the outside? (The sketch after this list shows a model-stealing attack run entirely from the outside.)
  • When is the attack likely to take place: during training or once the model is deployed?
    • Poisoning attacks target the training pipeline, while evasion and model stealing attacks target a model that is already running.
  • What would the outcome of the attack be?
    • What would be the impact to your organization? What would be the size of the attacker’s payoff?
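
The data-access question above is worth dwelling on, because some attacks need no inside access at all. The toy sketch below is my own construction, not a real attack tool: the attacker only queries a deployed model and uses its answers to fit a surrogate that closely reproduces its decisions.

import numpy as np

rng = np.random.default_rng(2)

# The victim's private model: the attacker never sees these weights.
secret_w = rng.normal(size=10)

def victim_api(x_batch):
    """All the attacker can observe: the deployed model's yes/no decisions."""
    return (x_batch @ secret_w > 0).astype(float)

# The attacker sends ordinary-looking queries, records the responses, and fits
# a surrogate. For this toy linear scorer, least squares on the recorded labels
# is enough to recover the direction of the decision boundary.
queries = rng.normal(size=(2000, 10))
labels = victim_api(queries)
surrogate_w, *_ = np.linalg.lstsq(queries, labels - 0.5, rcond=None)

# How often does the stolen model agree with the victim on fresh inputs?
test = rng.normal(size=(5000, 10))
agreement = np.mean((test @ surrogate_w > 0) == (test @ secret_w > 0))
print(f"surrogate agrees with the victim on {agreement:.1%} of fresh inputs")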

After the theoretical questioning and red-teaming have taken place, it is time to actually attack the model. Using software like Calypso’s Vulnerability Audit Suite, developers and their managers should attack their model to test their preliminary assumptions. Calypso has developed an automated solution for testing thousands of adversarial attacks against AI models in order to assess red-team assumptions, find vulnerabilities, and build model robustness into the DevOps pipeline.
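
In very rough outline, that kind of automated testing sweeps a range of attack strengths against a model and reports how quickly accuracy collapses. The sketch below runs the idea on a toy linear classifier of my own; it is a simplification for illustration, not the Vulnerability Audit Suite itself.

import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=20)

def model(x_batch):
    """Toy classifier under test: returns the predicted class (0 or 1) per row."""
    return (x_batch @ w > 0).astype(int)

# Synthetic evaluation set with known labels.
X = rng.normal(size=(500, 20))
y = model(X)

def evade(x_batch, labels, eps):
    """Fast-gradient-sign-style evasion on the linear model: push each input
    toward the opposite class within a per-feature budget of eps."""
    direction = np.where(labels[:, None] == 1, -1.0, 1.0) * np.sign(w)
    return x_batch + eps * direction

print("eps    accuracy under attack")
for eps in (0.0, 0.05, 0.1, 0.2, 0.4):
    acc = np.mean(model(evade(X, y, eps)) == y)
    print(f"{eps:<6} {acc:.1%}")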

A red-teamed future

I don’t spend much time on battlefields these days — instead, as the Chief Business Officer of CalypsoAI, I build security solutions for AI/ML systems. However, when I was in Iraq, thinking like an attacker likely saved countless lives by helping my team and me understand — and prepare for — the worst-case scenario. Now, we at Calypso have taken these lessons and applied them to AI/ML security.

At the start of any new client project, my team and I begin by breaking the AI/ML models our client is using, developing, or planning to develop in the future. We assess risks and optimize our solution by thinking like an attacker. We think outside the box, imagining creative new ways an adversary could manipulate a model, perturb data inputs, and benefit from the model’s failures.

As artificial intelligence has begun its journey from the realm of fantasy to the realm of reality, the threat of data weaponization has become increasingly ominous. To understand — and, more importantly, counteract — this threat, you need to start thinking like an attacker. Doing so just might save your business.

