The recent explosion of artificial intelligence (AI) applications has spurred nationwide efforts to ensure these technologies are safe and secure; at the University of Utah, President Taylor Randall announced a $100 million Responsible AI initiative last fall.
And while AI-based technologies are increasingly accessible to the public, research on improving the trustworthiness of these systems still requires a staggering amount of computational power. Time on the supercomputing clusters needed for this work is thus a highly coveted resource.
With this challenge in mind, and as a result of President Biden’s Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, the National Science Foundation (NSF) and the Department of Energy (DOE) recently launched an initiative that connects cutting-edge AI researchers with the most advanced computational resources: the National Artificial Intelligence Research Resource (NAIRR) Pilot.
Today, NSF and DOE announced the first class of researchers supported under this program. Among them is Vivek Srikumar, associate professor in the John and Marcia Price College of Engineering’s Kahlert School of Computing.
“Today marks a pivotal moment in the advancement of AI research as we announce the first round of NAIRR pilot projects. The NAIRR pilot, fueled by the need to advance responsible AI research and broaden access to cutting-edge resources needed for AI research, symbolizes a firm stride towards democratizing access to vital AI tools across the talented communities in all corners of our country,” said NSF Director Sethuraman Panchanathan. “While this is only the first step in our NAIRR efforts, we plan to rapidly expand our partnerships and secure the level of investments needed to realize the NAIRR vision and unlock the full potential of AI for the benefit of humanity and society.”
Srikumar’s proposal, “An Empirical Evaluation of Accuracy, Robustness and Bias of Compressed Language Model,” will investigate how well Large Language Models (LLMs) can perform under more stringent memory and storage constraints.
Such LLMs, which form the backbone of text-generating AI systems like ChatGPT, may contain hundreds of billions of parameters. Their sheer size, along with the computational power needed to run them quickly, means these systems typically run on cloud-based servers rather than as local copies on users’ own devices.
This arrangement, however, can be an insurmountable obstacle for many proposed AI applications, including those deployed in places without consistent internet access or those built on sensitive data with heightened privacy or security requirements.
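To make the scale concrete, a back-of-the-envelope calculation shows why such models strain local hardware. The sketch below uses a hypothetical round figure of 175 billion parameters (not a detail of Srikumar’s project) and simply multiplies parameter count by storage precision:

```python
# Back-of-the-envelope memory estimate for storing model weights.
# The parameter count and precisions are illustrative, not from the project.

def weight_memory_gb(num_parameters: int, bytes_per_parameter: float) -> float:
    """Approximate memory needed just to hold the weights, in gigabytes."""
    return num_parameters * bytes_per_parameter / 1e9

params = int(175e9)  # a hypothetical model with 175 billion parameters

for label, nbytes in [("32-bit float", 4), ("16-bit float", 2), ("8-bit int", 1)]:
    print(f"{label}: ~{weight_memory_gb(params, nbytes):,.0f} GB")

# 32-bit float: ~700 GB
# 16-bit float: ~350 GB
# 8-bit int:    ~175 GB
# Even at one byte per weight, a model of this size far exceeds the memory
# of a typical laptop or phone.
```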
Srikumar and colleagues will test several compression algorithms that could shrink LLMs to more manageable sizes, potentially enabling them to run on an air-gapped laptop or a cellphone. The challenge will be to find a technique that delivers meaningful size reductions without corrupting the relationships among the model’s parameters, which encode what these systems “learn” after being trained on natural human language.
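One widely used family of techniques in this space is post-training quantization, which stores each weight at lower precision and reconstructs an approximation when the model runs. The NumPy sketch below illustrates a simple symmetric 8-bit scheme on a single weight matrix; it is an assumption-laden toy example, not a description of the specific algorithms Srikumar’s team will evaluate:

```python
import numpy as np

# Minimal sketch of symmetric 8-bit weight quantization; illustrative only,
# not the compression methods actually under study in this project.

rng = np.random.default_rng(0)
weights = rng.normal(scale=0.02, size=(4096, 4096)).astype(np.float32)

# Quantize: map float32 weights onto 256 integer levels using one scale factor.
scale = np.abs(weights).max() / 127.0
quantized = np.round(weights / scale).astype(np.int8)   # 1 byte per weight

# Dequantize: recover an approximation of the original weights.
restored = quantized.astype(np.float32) * scale

print(f"original:   {weights.nbytes / 1e6:.1f} MB")
print(f"compressed: {quantized.nbytes / 1e6:.1f} MB (plus one float for the scale)")
print(f"mean reconstruction error: {np.abs(weights - restored).mean():.2e}")
```

Whether compression of this kind preserves accuracy, robustness, and bias characteristics at the scale of real LLMs is exactly the kind of empirical question the project sets out to answer.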
As part of this NAIRR-funded project, Srikumar will be able to test these compression techniques with 20,000 hours of compute time on Delta, an advanced research computing platform based at the University of Illinois Urbana-Champaign’s National Center for Supercomputing Applications.
Other projects granted computing allocations in this initial round span a diverse range of AI-related areas, including language model safety and security, privacy and federated models, and privacy-preserving synthetic data generation. Several focus on domain-specific research, such as using AI and satellite imagery to map permafrost disturbances, developing a foundation model for aquatic sciences, securing medical imaging data, and using AI for agricultural pest identification.
“DOE Office of Science has decades of experience in cutting-edge AI research and a longstanding commitment to developing world-leading high performance computing resources that are open to the scientific community,” said Harriet Kung, acting director of the DOE Office of Science. “We are proud to continue our mission by providing valuable access to some of the fastest computing facilities in the world to the NAIRR Pilot. Innovations developed in collaboration with industry partners are designed to address not only traditional scientific workloads but also the growing demands of AI research at scale. We are excited to see what the future holds for AI in science.”