Key Security Considerations to Keep in Mind When Using GenAI

With game-changing technologies like generative AI (GenAI), it's common for innovation to outpace standard operating procedures. Organizations are rushing to adopt GenAI without first establishing best practices for how to use the technology safely. While GenAI opens up countless opportunities to create value and leverage automation, teams have to be cautious about the information they input, how they interact with GenAI, where they store data, and more.

This article covers security considerations to remember when using GenAI. No matter how you are using GenAI in your business today, whether it's for marketing, software engineering, customer support, or some other purpose, it's essential to tread carefully. This applies to external GenAI products, as well as solutions you build in-house.

Only Use GenAI Apps from Reputable Businesses (and Still Vet the Tools Thoroughly)

As with any third-party software product, it's important to understand how the maintainer of your GenAI software approaches security. When using external products, ask vendors how their solutions handle data and what other services are involved in generating responses. Make sure you also have a clear understanding of who is responsible for what aspects of security. You may be surprised by the separation between the vendor's obligations and yours.

Additionally, get clarity on whether your inputs will eventually be used as training data. You may not want your organization's data surfacing for someone else's future request. This is why employees should receive training on how to interact with GenAI responsibly. Too much intellectual property is escaping organizations today as a result of improper GenAI usage.

Keep Sensitive Data Out of Prompts

Although this may seem obvious, users should not include sensitive information in prompts to GenAI applications unless they know the data will travel over a private connection. Sensitive information includes API keys, passwords, account numbers, protected health information (PHI), and anything else that shouldn't be exposed to the public internet. Even when GenAI vendors promise that submitted data is safe, it's up to users to share only appropriate information with the underlying large language models (LLMs).
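One way to reduce the risk is to scrub prompts before they leave your environment. The following is a minimal sketch, not a complete solution: the pattern list and the redact function name are illustrative assumptions, and a real deployment would need a broader, well-tested set of detectors.

```python
import re

# Illustrative patterns only (assumptions for this sketch).
# Extend and test these against the data your organization actually handles.
PATTERNS = {
    "AWS_ACCESS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(prompt: str) -> str:
    """Replace every match of a known sensitive pattern with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[REDACTED_{label}]", prompt)
    return prompt


if __name__ == "__main__":
    print(redact("Key AKIAIOSFODNN7EXAMPLE, mail jane.doe@example.com"))
```

Pattern matching like this catches only well-structured secrets. It complements employee training and private connectivity; it does not replace them.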

This is one reason Amazon Bedrock is an attractive option for teams developing their own GenAI applications. Whereas requests made to tools like ChatGPT travel over the public internet, Amazon Bedrock can be reached through VPC endpoints (AWS PrivateLink), which makes it easier to build and deploy a GenAI offering whose traffic stays within the AWS network.

Avoid Training Data Problems or Leakages

If you are building your own GenAI application, securing your training data is paramount. Any information used to train your AI should be free of errors and bias. Sensitive information should also be filtered out so that nothing inappropriate surfaces in responses to future users.
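As a rough illustration of the filtering step, the sketch below drops empty, duplicate, and sensitive-looking records from a list of training texts. The SENSITIVE pattern and the clean_training_records function are hypothetical names chosen for this example. Detecting errors and bias requires separate review and is not covered here.

```python
import re
from typing import Iterable, List

# Hypothetical marker patterns (assumptions): credential-style assignments
# and US Social Security number shapes. Tune these to your own data.
SENSITIVE = re.compile(
    r"(?i)\b(?:password|api[_-]?key|secret)\b\s*[:=]|\b\d{3}-\d{2}-\d{4}\b"
)


def clean_training_records(records: Iterable[str]) -> List[str]:
    """Drop empty, duplicate, and sensitive-looking records, preserving order."""
    seen = set()
    cleaned = []
    for record in records:
        text = " ".join(record.split())  # collapse runs of whitespace
        if not text:
            continue  # skip empty records
        if SENSITIVE.search(text):
            continue  # skip records that look like they contain secrets
        key = text.casefold()
        if key in seen:
            continue  # skip case-insensitive duplicates
        seen.add(key)
        cleaned.append(text)
    return cleaned
```

Dropping a whole record is the conservative choice in this sketch. Redacting only the matched span is an alternative when the rest of the record is valuable.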

Similarly, as with any application involving databases or data management solutions, GenAI data storage requires rigorous security. Make GenAI databases inaccessible from the public internet unless you explicitly intend for users to query them directly. Keep data behind robust firewalls and authentication processes. Implement strict password requirements and use tools like AWS Secrets Manager that can automatically rotate credentials.

Maintain Your Software Development Lifecycle

While there is an incentive to move fast, speed does not justify poor architecture design and development practices that put GenAI apps at risk. GenAI development should be subject to the same quality controls and deployment processes as any other software. Applications should be tested thoroughly and continuously.

MLOps teams should also be aware of problems like data drift and model degradation that affect large-scale machine learning projects. GenAI is, after all, artificial intelligence and deserves full attention in the software development lifecycle.
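To make data drift concrete, here is a small sketch of the population stability index (PSI), one commonly used drift metric. It is offered as an example and is not something this article prescribes. The function names and the choice of monitored feature (for instance, prompt length) are assumptions.

```python
import math
from typing import List, Sequence


def _bin_fractions(values: Sequence[float], edges: List[float]) -> List[float]:
    """Fraction of values falling in each bin; a small floor avoids log(0)."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        idx = sum(1 for e in edges if v > e)  # number of edges below v
        counts[idx] += 1
    total = len(values)
    return [max(c / total, 1e-6) for c in counts]


def population_stability_index(
    baseline: Sequence[float], current: Sequence[float], bins: int = 10
) -> float:
    """Compare two samples of one numeric feature; 0.0 means identical binning."""
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins
    # Equal-width bin edges derived from the baseline sample only.
    edges = [lo + width * i for i in range(1, bins)]
    expected = _bin_fractions(baseline, edges)
    actual = _bin_fractions(current, edges)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but suitable thresholds depend on the feature and should be validated against your own data.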

A Note About Hallucinations

One of the well-documented shortcomings of text-based GenAI solutions is their propensity to hallucinate, that is, to produce fluent output that is factually wrong. The LLM technology underneath today's leading GenAI text products is designed to predict the most likely next word given an existing sequence of words.

In other words, LLMs aren't actually thinking "intelligently." Outputs come from statistical patterns learned from training data, not from verified facts, so the content generated by LLMs may not be accurate. It's helpful to keep this in mind, both when using third-party GenAI products and when training your models in-house.
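A toy example can show why "most likely next word" does not mean "correct." The sketch below is a simple word-pair counter, which is far cruder than a real LLM (those use neural networks over long contexts). The corpus and function names are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the corpus."""
    following = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


corpus = [
    "the capital of france is paris",
    "the capital of italy is rome",
    "the capital of spain is madrid",
    "the office is closed",
    "the store is closed",
]
model = train_bigram_model(corpus)
```

Given the prompt "the capital of france is", this model looks only at the last word and predicts "closed", because that is the most frequent word after "is" in its training text. The output is statistically likely and factually wrong, which is a miniature version of a hallucination.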

Anthony Loss is the Director of Solution Strategy at ClearScale.
