Why AI Coding Assistants May Be the Greatest Cybersecurity Threat Facing Your Business

Follow these steps to ensure that AI-assisted coding doesn't become the weakest link in your cybersecurity strategy.

If you follow cybersecurity news and trends, you probably know all about threats like ransomware, phishing, and supply chain breaches, which were among the most serious threats of 2023.

But here's another risk that is arguably poised to become the single greatest security liability for businesses across the world in 2024 and beyond: improperly secured AI-powered software development tools. As more developers take advantage of generative AI to help write code, more organizations are likely to find themselves exposed to risks like leakage of sensitive data due to a lack of security controls around AI coding assistants.

Allow me to explain why this is such a serious problem, as well as the steps businesses should be taking now to ensure that AI-assisted coding doesn't become the weakest link in their cybersecurity strategy.

What Are AI Coding Assistants?

An AI coding assistant is a tool that uses AI to help developers write code. GitHub Copilot and Amazon CodeWhisperer are popular examples of AI coding assistants that have gained widespread adoption over the past couple of years. Copilot, for example, had been used by more than one million developers as of mid-2023, according to GitHub.

Most modern AI coding assistants are powered by large language models (LLMs), such as those from OpenAI. They work by analyzing huge volumes of code, identifying patterns, and using those patterns to generate new code or suggest coding enhancements. In most cases, the tools connect directly to the code repositories and Integrated Development Environments (IDEs) where developers store and write code — which means they have real-time access to application source code and the environment in which it's built and tested.
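
To make that data flow concrete, here is a minimal, hypothetical sketch of what an IDE-integrated assistant might do under the hood: gather the code surrounding the developer's cursor and send it to a remote LLM completion endpoint. The endpoint URL, API key variable, and request schema below are illustrative assumptions, not any vendor's actual API, but the pattern of source code leaving the developer's machine over the network is the part that matters for security.

```python
import os
import requests

# Hypothetical completion endpoint and API key -- illustrative only,
# not the actual API of Copilot, CodeWhisperer, or any other product.
COMPLETION_URL = "https://api.example-assistant.com/v1/complete"
API_KEY = os.environ.get("ASSISTANT_API_KEY", "")


def suggest_completion(file_contents: str, cursor_offset: int) -> str:
    """Send the code around the cursor to a remote LLM and return its suggestion."""
    # The context shipped to the service typically includes real, possibly
    # proprietary source code from the open file (and often from other files too).
    context = file_contents[max(0, cursor_offset - 2000):cursor_offset]

    response = requests.post(
        COMPLETION_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"prompt": context, "max_tokens": 64},
        timeout=10,
    )
    response.raise_for_status()
    # Assumed response shape for this sketch: {"completion": "<suggested code>"}
    return response.json().get("completion", "")
```

Whatever sits in the assistant's context window is transmitted to, and potentially retained by, a third-party service.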

The Security Risks of AI-Assisted Coding

Let me make one thing clear: There is real value in AI code assistants, which can dramatically speed up the process of developing applications. I'm not here to tell you that no one should use these tools.

What I am here to do is highlight the oft-overlooked security, compliance, and data privacy risks that can arise from insecure use of AI-assisted coding tools, which include the following:

Exposure of source code to third-party organizations

Because most AI-assisted coding tools can integrate directly with IDEs and code repositories, they can read the source code that developers write, which is often proprietary and not intended to be shared outside the organization. The relatively vague terms of service that AI coding tool vendors offer provide little assurance that the AI services won't ingest a company's proprietary code and share it with outside organizations.

Indeed, complaints that AI tools have been doing exactly that — leaking copyrighted code to third parties — go back at least to 2022, and there is no reason to think the risks have grown less serious today.

Leaked credentials

In addition to exposing source code, AI coding assistants may also leak secrets — such as passwords and access tokens. That's because secrets are sometimes embedded within source code or within the environment where new applications are built and tested. If AI tools can access your repositories or IDEs, they can also likely read your secrets.
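
To see why repository access so readily implies secret access, consider this deliberately simplified example: the kind of trivial pattern matching that surfaces hard-coded credentials in a source tree. Anything that can read the files, whether a secret scanner or an AI plugin with workspace access, can pick the values up. The file types and key formats below are simplified for illustration; real scanners cover far more.

```python
import re
from pathlib import Path

# Patterns for a few common credential shapes (simplified; real scanners use many more).
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID format
    re.compile(r"(?i)(password|api[_-]?key)\s*=\s*['\"][^'\"]{8,}['\"]"),
]


def find_embedded_secrets(root: str) -> list[tuple[str, str]]:
    """Walk a source tree and report lines that look like hard-coded secrets."""
    hits = []
    for path in Path(root).rglob("*.py"):
        for line in path.read_text(errors="ignore").splitlines():
            if any(p.search(line) for p in SECRET_PATTERNS):
                hits.append((str(path), line.strip()))
    return hits


if __name__ == "__main__":
    for file_name, line in find_embedded_secrets("."):
        print(f"possible secret in {file_name}: {line}")
```

If a few lines of regex can find these values, an assistant that ingests the same files as context certainly can.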

This is a truly grave risk because anyone who obtains secrets can potentially use them to wreak all manner of havoc — such as impersonating a company's employees, logging into databases, and turning off critical services.

Compliance risks

The risks of AI-assisted coding aren't limited to sensitive information that AI services can copy from your organization and expose externally. These services may also plant other companies' sensitive data inside your applications, leading to potential compliance violations.

This happens when an AI-assisted coding tool generates "new" code that actually contains copyrighted code borrowed from another company. If your developers don't catch the borrowed code — which they would have no practical way of doing in most cases — you may end up being accused of stealing someone else's intellectual property.

If you think you can simply say that violations like these aren't your developers' fault because the code was suggested by an AI service, think again. In most cases, there is no way to prove that the code was generated by an AI tool.

Shadow LLMs

There are a growing number of AI-assisted coding tools available today, and most install with just a few clicks or commands. In addition, browser-based chatbots, like ChatGPT, are easily accessible from any device with an internet connection.

This means developers have ready access to a variety of AI coding tools, making it challenging for businesses to track which ones their developers are using — let alone ensure they're using them securely. As a result, AI coding tools can introduce "shadow LLMs" into an organization. Shadow LLMs are a form of shadow IT — IT resources that are used inside an organization but are not properly monitored or secured.

Malicious code

Although most of the code produced by AI assistants is benign, there is a risk that they will generate malicious code. This could happen if the code they were trained on included malicious software, in which case they may reproduce elements of malicious applications.

Because it's impossible in most cases to know exactly which code a given tool was trained on or which new code it might generate, the risk of generating malicious code using AI tools is always out there.

Hallucination risks

The tendency of AI models to "hallucinate" — meaning produce false or unreliable information — could also cause AI coding assistants to generate risky code or software.

For example, researchers have demonstrated the phenomenon of package hallucination, in which coding assistants generate software dependencies on packages that don't actually exist. If threat actors detect these dependencies — which is relatively easy to do, because it requires simply comparing an application's dependency list against inventories of real packages — they can publish malicious packages under the same names as the hallucinated ones. If the hallucinated dependencies go undetected before the application is deployed, the app will download and execute the malicious packages in order to satisfy its dependencies.
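
A basic defense is to verify that every declared dependency actually exists on the official package index before anything is installed. The sketch below does this for Python dependencies by querying PyPI's public JSON endpoint, which returns a 404 for packages that don't exist; the requirements file name is just an example, and a real pipeline would also pin versions and verify hashes.

```python
import requests

PYPI_URL = "https://pypi.org/pypi/{name}/json"


def check_dependencies(requirements_file: str = "requirements.txt") -> list[str]:
    """Return the names of declared dependencies that do not exist on PyPI."""
    missing = []
    with open(requirements_file) as f:
        for line in f:
            # Strip comments and version specifiers to get the bare package name
            # (simplified parsing; real tooling handles extras, markers, etc.).
            name = line.split("#")[0].split("==")[0].split(">=")[0].strip()
            if not name:
                continue
            resp = requests.get(PYPI_URL.format(name=name), timeout=10)
            if resp.status_code == 404:
                missing.append(name)  # possibly hallucinated -- do not install blindly
    return missing


if __name__ == "__main__":
    for pkg in check_dependencies():
        print(f"WARNING: '{pkg}' is not on PyPI -- it may be a hallucinated dependency")
```

Note that existence alone proves little, since a hallucinated name may already have been squatted by an attacker, so unfamiliar packages deserve manual review even when the lookup succeeds.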

Managing AI-Assisted Coding Risks

In a perfect world, AI tools would be designed to prevent the types of risks described above. But given the tremendous complexity of AI services and the LLMs that power them, it's very challenging for tool vendors to guarantee that their software won't make mistakes that lead to security and compliance risks. Trusting AI-assisted coding to be risk-free is just not realistic today.

What businesses can do, however, is invest in measures that mitigate the risks posed by whichever AI-assisted coding tools their developers choose to leverage. Effective risk mitigations for AI coding assistants include:

  • Continuous monitoring of IDEs and browsers to track whether any AI coding assistants are in use and which data they can access (a minimal detection sketch follows this list).
  • Monitoring of code generated by AI coding assistants to check for code that may be subject to copyright or other restrictions.
  • Policy-based controls that make it possible to block IDEs from accessing secrets in situations where doing so may expose the secrets to AI services.
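
As a starting point for the first item, here is a minimal sketch of how such monitoring might look on a single developer workstation: scanning the local VS Code extensions directory for installed extensions whose identifiers suggest an AI coding assistant. The directory path and the watchlist entries are assumptions to be verified and adapted to your environment and IDEs, not an authoritative inventory.

```python
from pathlib import Path

# Default VS Code extensions directory on Linux/macOS; adjust for Windows or other IDEs.
EXTENSIONS_DIR = Path.home() / ".vscode" / "extensions"

# Illustrative watchlist of extension-ID prefixes associated with AI assistants.
# Verify and extend these identifiers for the tools relevant to your organization.
AI_ASSISTANT_PREFIXES = (
    "github.copilot",
    "amazonwebservices.",  # CodeWhisperer ships as part of the AWS Toolkit extension
    "tabnine.",
    "codeium.",
)


def detect_ai_assistants() -> list[str]:
    """Return installed extension folder names that match the AI-assistant watchlist."""
    if not EXTENSIONS_DIR.is_dir():
        return []
    return sorted(
        entry.name
        for entry in EXTENSIONS_DIR.iterdir()
        if entry.is_dir() and entry.name.lower().startswith(AI_ASSISTANT_PREFIXES)
    )


if __name__ == "__main__":
    for name in detect_ai_assistants():
        print(f"AI coding assistant detected: {name}")
```

In practice, findings like these would feed into whatever endpoint-management or asset-inventory tooling you already run, rather than an ad hoc script.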

With these protections in place, businesses can allow their developers to benefit from AI-assisted coding, while minimizing the risk of suffering security breaches or compliance violations.

Conclusion: AI-Assisted Coding Is Great — But Only if You Protect It

Again, AI-assisted coding tools are great resources that can help developers work faster, more efficiently, and with less toil.

But these solutions also pose some serious security risks that many organizations aren't even aware of. That's why it's high time for businesses to recognize that AI coding tools are part of their cybersecurity attack surface and to invest in protections that give their developers the freedom to benefit from generative AI without undercutting security and compliance for the business.

Ophir Dror is the co-founder and CPO of Lasso Security.
