We've updated one of our most popular blogs with new information.
Safely and securely integrating multimodal models, or multiple models, into existing digital infrastructure is rapidly emerging as a central issue in conversations about why, when, and how to deploy generative AI (GenAI) models, including large language models (LLMs), across the enterprise. While many organizations have already deployed at least one GenAI model—nearly 29%, according to a recent Gartner survey—considerably more have put off deployment, and the balance continue to debate whether to bring LLMs into their systems at all. Even a single unimodal language model, such as GPT-3.5 or another early model, can deliver enormous gains in the efficiency and productivity of text-based tasks such as content generation, summarization, analysis, and translation. Bringing in multiple models, or models that consume or produce non-text content (multimodal models), is the creative version of a "force multiplier" and requires a different strategy.

Multimodal? Multi-Model?
The scenarios for organizations that adopt multiple models vary widely. While some organizations might want to leverage the capabilities of more than one large public model, for instance deploying both BERT and Gemini rather than choosing one or the other, many more options are available. In the almost two years since ChatGPT appeared on the business horizon, new model types have emerged, including:
- Targeted models focused on specific markets, such as finance, law, pharma, and retail
- Models embedded in software-as-a-service (SaaS) applications
- Internal models built in-house and trained on proprietary data
- Retrieval-augmented generation (RAG) pipelines, which extend large public models by retrieving more recent or proprietary information and supplying it to the model as context
- Small language models (SLMs) that require considerably less compute power and can run on a personal computer
Risks
The benefits of integrating multiple models and multimodal models into an organization are tremendous and continue to expand into new use cases at the developer, analyst, general user, and customer levels. The risks, however, cannot be overstated and must be addressed before integration. Adding any AI tool to an organization's digital infrastructure immediately expands the attack surface, and adding multimodal models complicates matters further: a multimodal model's architecture includes numerous entry points, each of which attackers can probe for weaknesses. Likewise, securing some modalities in a model or a system while overlooking others creates blind spots in the system's security posture. Attackers can always be counted on to exploit the weakest link, whether that means entering a system through an unsecured portal or slipping malicious code into a model's response.

As government regulations come into effect across the globe, compliance, especially around data privacy, has become a significant pain point for AI security teams. The data that must be protected no longer consists only of text records; it includes digital images, such as photographs; scans, such as retina scans, fingerprints, x-rays, and MRIs; audio files; voice prints, for instance those used for verbal confirmation; videos; and highly specialized scientific content. If exfiltrated, any of this data could be exposed in a breach, or manipulated and sold for malicious uses such as identity theft and the creation of deepfakes.

Integrating various modalities into existing digital infrastructure also presents interoperability challenges: different technologies and protocols must work together seamlessly, and a single misconfiguration or vulnerability in one component can compromise the security of the entire system.
Resolving these issues can be resource-intensive, can introduce model performance problems such as latency, and may require additional hardware, which, if not managed correctly, can itself introduce security vulnerabilities.

Solutions
The lack of standardized practices and strategies around AI use makes it even more challenging to ensure consistent security across user environments, but resources are emerging in the AI security space to make deployments more secure and, therefore, more attractive to risk-averse organizations. Our model-agnostic GenAI security platform is the first of its kind to provide 360° protection. A "weightless" trust and control layer positioned outside, but immediately adjacent to, the infrastructure, the CalypsoAI platform shields the organization from threats buried in model responses and other external threats, ensures that prompts containing damaging instructions or private information do not leave the system, and does all of this without introducing latency. Administrators can use policy-based access controls to authorize or block access to models by individual users or identified groups, which:
- Allows cost monitoring by limiting who can access the models
- Supports rate limits that prevent model denial-of-service (DoS) attacks from slowing or shutting down the system
- Enables scanner criteria to be customized to the needs of each team, supporting greater adherence to company acceptable-use policies
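Two of the controls listed above, group-based model authorization and per-user rate limiting, can be sketched in a few lines. This is an illustrative toy, not the CalypsoAI API: the policy structure, group names, and model names are all assumptions made for the example.

```python
# Hypothetical sketch: policy-based access control with a sliding-window
# per-user rate limit (requests per minute), gating calls to GenAI models.
import time
from collections import defaultdict, deque

# Illustrative policies: which models each group may use, and their
# requests-per-minute (rpm) ceiling to blunt model DoS attempts.
POLICIES = {
    "analysts": {"allowed_models": {"gpt-4o", "internal-finance"}, "rpm": 30},
    "general":  {"allowed_models": {"gpt-4o"}, "rpm": 10},
}

_request_log: dict[str, deque] = defaultdict(deque)  # user -> timestamps

def authorize(user: str, group: str, model: str) -> bool:
    """Allow the request only if the group's policy permits the model
    and the user is under their requests-per-minute limit."""
    policy = POLICIES.get(group)
    if policy is None or model not in policy["allowed_models"]:
        return False  # model not permitted for this group
    now = time.monotonic()
    log = _request_log[user]
    while log and now - log[0] > 60:  # discard requests older than a minute
        log.popleft()
    if len(log) >= policy["rpm"]:
        return False  # rate limit reached: mitigates model DoS
    log.append(now)
    return True
```

A gateway check like this also doubles as the cost-monitoring hook: every authorized call is logged per user, so spend can be attributed to the individuals and groups who generated it.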