Multimodal? Multi-Model?
The scenarios for an organization that deploys multiple models vary widely. While some organizations might want to leverage the capabilities of more than one large public LLM, for instance using both BERT and Gemini rather than one or the other, many more options are available to them. In the year and a bit since ChatGPT appeared on the business horizon, new model types have emerged, including:
- Targeted models focused on specific markets, such as finance, law, pharma, and retail
- Models embedded in software-as-a-service (SaaS) applications
- Internal models built in-house and trained on proprietary data
- Retrieval-augmented generation (RAG) systems, which augment large public models with retrieved documents to provide access to more recent or proprietary information
- Small language models (SLMs) that require considerably less compute power and, in some cases, can run on a personal computer
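The RAG pattern mentioned above can be sketched in a few lines. This is a minimal illustration only: the keyword-overlap retriever and the prompt assembly below are simplified stand-ins for a real vector store and LLM call, and all names are hypothetical.

```python
# Minimal sketch of the retrieval-augmented generation (RAG) pattern:
# retrieve relevant documents, then prepend them to the model prompt so
# the model can answer from data it was never trained on.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive keyword overlap with the query (a stand-in
    for embedding similarity search against a vector store)."""
    q_terms = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_terms & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Prepend the retrieved context so the model answers from it."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using this context:\n{ctx}\n\nQuestion: {query}"

documents = [
    "Q3 revenue grew 12% year over year.",
    "The new privacy policy takes effect in May.",
    "Headcount remained flat in Q3.",
]
query = "What was Q3 revenue growth?"
prompt = build_prompt(query, retrieve(query, documents))
```

In a production system the retriever would query an index of the organization's documents, which is also why RAG pipelines widen the attack surface: the retrieved content itself becomes an injection point.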
Risks
The benefits of integrating multiple models and multimodal models into an organizational structure are tremendous and continue to expand into new use cases at the developer, analyst, general user, and customer levels. The risks, however, cannot be overstated and must be addressed before integration. Adding any AI tool to an organization's digital infrastructure immediately expands the attack surface, and adding multimodal models complicates it further. A multimodal model's architecture includes numerous entry points, each of which an attacker can probe for vulnerabilities. Likewise, securing some modalities in a model or system while overlooking others creates blind spots in the security posture of a multimodal system. And attackers can always be counted on to exploit the weakest link, whether that means entering through an unsecured portal or slipping malicious code into a model's response.

As government regulations have begun to come into effect across the globe, compliance, especially around data privacy, has become a significant pain point for AI security teams. The data that must be protected no longer consists of just text records, but also digital images, such as photographs; scans, including retina scans, fingerprints, x-rays, and MRIs; audio files; voice prints, for instance those used for verbal confirmation; videos; and highly specialized scientific content. All of this data, if exfiltrated, could be exposed in a data breach, and could also be manipulated and sold for malicious uses, including the creation of deepfakes.

Integrating various modalities into existing digital infrastructure also presents interoperability challenges: different technologies and protocols must work together seamlessly, and a single misconfiguration or vulnerability in one component can compromise the entire system's security.
Fixing such issues can be resource-intensive, can introduce model performance problems, such as latency, and could require additional hardware, which, if not managed correctly, may itself introduce security vulnerabilities.
Solutions
The lack of standardized practices around AI use in general makes it even more challenging to ensure consistent security in user environments, but resources are emerging in the AI security space to make deployments more secure and, therefore, more attractive to risk-averse organizations. Our model-agnostic GenAI security and enablement platform is the first of its kind to provide 360° protection to systems. A "weightless" trust and control layer positioned outside, but immediately adjacent to, the infrastructure, our CalypsoAI platform shields the organization from threats buried in model responses and other external threats, ensures that prompts containing damaging instructions or private information do not leave the system, and does all of this without introducing latency into the process. Administrators can use policy-based access controls to authorize or block model access for individual users and for teams, which:
- Allows cost monitoring by limiting who can access the models
- Supports deployment of rate limits to ensure that Model Denial of Service (DoS) attacks do not succeed in slowing or shutting down the system
- Enables scanner criteria to be customized to the needs of each team, supporting greater adherence to company acceptable use policies
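The access-control and rate-limiting ideas in the list above can be sketched as follows. This is an illustrative model only, assuming a per-team model allow-list and a per-user token bucket; the class, policy names, and limits are hypothetical and do not represent the CalypsoAI API.

```python
import time

# Illustrative sketch: policy-based model access plus a per-user token
# bucket to throttle the request floods typical of a model DoS attempt.
# TEAM_MODEL_POLICY and the limit values below are made-up examples.

TEAM_MODEL_POLICY = {
    "finance": {"gpt-4", "internal-llm"},
    "support": {"internal-llm"},
}

class TokenBucket:
    """Token-bucket rate limiter: each request spends one token; tokens
    refill at a fixed rate, so sustained bursts get refused."""

    def __init__(self, rate_per_sec: float, capacity: int):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

buckets: dict[str, TokenBucket] = {}

def authorize(user: str, team: str, model: str) -> bool:
    """Block models the team may not use; throttle per-user bursts."""
    if model not in TEAM_MODEL_POLICY.get(team, set()):
        return False
    bucket = buckets.setdefault(user, TokenBucket(rate_per_sec=1.0, capacity=5))
    return bucket.allow()
```

With this shape, a user on the "support" team is denied "gpt-4" outright, while a burst of requests from any single user exhausts that user's bucket after a few calls, which is the behavior that keeps a model DoS from slowing or shutting down the system.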