NJIT Cybersecurity Research Adds Protection to AI-Built Code
Image created with AI
Published: May 26, 2026
Written by: Evan Koblentz
Software that will harden the security of AI-developed code is being developed at New Jersey Institute of Technology, funded by a $450,000 National Science Foundation grant.
NJIT professors Zephyr Yao and Iulian Neamtiu decided they’d seen enough of the downside of programming assisted by artificial intelligence — that this increasingly common process creates too many bugs — so they’re taking action now before it is too late.
“Undeniably, more and more programmers are using AI to help them write code, and somehow this looks productive, but it carries a lot of risk. They don't know what they're writing, and AI-generated code may look very nice and polished, right? And it still contains security errors,” Yao explained.
"There's an incoming wave of unscrutinized low-quality code generated by AI. We must act urgently to prevent that code from turning into widespread software disasters, or at least reduce the impact of such code," Neamtiu added.
Citing prior studies and their own preliminary work, Yao and Neamtiu stated that 40% of programs generated by large language models are buggy, that 65% of an LLM’s first attempts at code generation are insecure, and that attempting to fix these issues by adding more prompts only makes matters worse.
With the planned framework, not yet named, a developer would connect a code repository, hosted on a service such as GitHub, to their preferred AI system. The AI could be a mainstream system like Claude, Codex or Copilot. It could also be something proprietary to an organization.
Then, when acting on the developer’s prompt, the framework adds security guardrails — “Not just to write code, but also what safety rules the code has to follow. Then we check those against the generated code, look for security problems and guide AI to improve it iteratively,” Yao noted.
The researchers use both static analysis, which examines code without running it, and dynamic analysis, which observes code as it runs. They then feed the results back into iterative prompting.
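To make the distinction concrete, here is a minimal sketch in Python. It is not the researchers' tooling: the rule set, the function names and the two sample snippets are all assumptions made up for illustration. The static check reads the syntax tree and never runs the code, while the dynamic check runs the code on a hostile input and watches for a side effect.

```python
import ast

# Illustrative rule set only; a real analyzer covers far more than this.
RISKY_CALLS = {"eval", "exec"}


def static_findings(source):
    """Static analysis: walk the syntax tree without executing anything."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in RISKY_CALLS):
            findings.append(f"line {node.lineno}: call to {node.func.id}()")
    return findings


def dynamic_findings(source, func_name):
    """Dynamic analysis: run the code on a hostile input and observe it."""
    canary = []
    namespace = {"canary": canary}
    # This executes the code under test; a real tool would sandbox this step.
    exec(source, namespace)
    try:
        namespace[func_name]("canary.append('hit')")
    except Exception:
        pass  # rejecting the hostile input is the safe outcome
    return ["input was executed as code"] if canary else []


# Hypothetical AI-generated snippet and a safer revision of it.
GENERATED = "def calc(expr):\n    return eval(expr)\n"
SAFER = "def calc(expr):\n    return float(expr)\n"

if __name__ == "__main__":
    for label, code in (("generated", GENERATED), ("safer", SAFER)):
        print(label, static_findings(code), dynamic_findings(code, "calc"))
```

In this sketch both checks flag the first snippet and pass the second. The findings they return are the kind of result that could be put back into the next prompt.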
The framework will test for three broad bug categories: weaknesses on the industry-standard Common Weakness Enumeration Top 25, a list of the most dangerous software weaknesses; ambitious tasks with substantial context, such as large open-source projects; and finally some time-consuming, difficult bugs that require true expertise, Yao said. It then applies the results locally in the user’s specific code environment. The framework would also be expandable and scalable.
Yao cited several challenges during the upcoming three-year project. They must correctly translate security requirements into the right context that is useful for the language model, because language models are incapable of human understanding. Their software must also scale to real-world code bases without operating too slowly.
But most importantly, they need to verify that their tool’s revised output is actually better than the language model’s own submission.
“We make it automatic. This is part of the goal [is] an automated feedback loop that checks whether generated code is more secure over time or not, and one interesting observation we had is that sometimes AI would just make things worse in the iteration. If that happens, our system, the part that performs programming analysis and verification, will stop and explain the problem instead of just blindly accepting the next wrong answer,” he said.
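The loop Yao describes might be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the NJIT framework: the names refine, analyze and regenerate are hypothetical, and counting findings is only a stand-in for whatever measure of security the real system uses.

```python
def refine(code, analyze, regenerate, max_rounds=5):
    """Accept a revision only if it is no worse than the current code.

    analyze(code) returns a list of findings; regenerate(code, findings)
    returns a revised version. Both are hypothetical callables that the
    caller supplies.
    """
    findings = analyze(code)
    for _ in range(max_rounds):
        if not findings:
            return code, "clean"
        candidate = regenerate(code, findings)
        new_findings = analyze(candidate)
        if len(new_findings) > len(findings):
            # The revision got worse: stop and explain, keep the last good code.
            reason = (f"stopped: revision has {len(new_findings)} findings, "
                      f"previous version had {len(findings)}")
            return code, reason
        code, findings = candidate, new_findings
    return code, ("clean" if not findings else "round limit reached")


# Toy stand-ins for an analyzer and two models, for demonstration only.
def count_evals(code):
    return ["eval call"] * code.count("eval(")


def helpful_model(code, findings):
    return code.replace("eval(", "float(", 1)


def harmful_model(code, findings):
    return code + "c = eval(z)\n"


if __name__ == "__main__":
    start = "a = eval(x)\nb = eval(y)\n"
    print(refine(start, count_evals, helpful_model))
    print(refine(start, count_evals, harmful_model))
```

With the helpful stand-in, the loop runs until the analyzer reports nothing. With the harmful one, it halts on the first round and returns the original code along with a reason.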
Yao said he’d eventually like to see the framework integrated into AI-enabled development environments themselves, functioning as an open-source coding assistant. But whether his team members themselves will use AI in creating this framework remains to be seen. “If they do,” he said, “I will make sure that use of AI is secure and safe.”