New AI Algorithms Crack the Code of Large Language Model Interactions at Scale


Breaking: SPEX and ProxySPEX Enable Efficient Detection of Critical LLM Interactions

Researchers have unveiled two novel algorithms—SPEX and ProxySPEX—that can identify influential interactions within large language models (LLMs) without requiring exhaustive computation. This advance addresses a fundamental bottleneck in AI interpretability: the exponential growth of potential interactions as models scale.

[Image source: bair.berkeley.edu]

“Until now, analyzing how thousands of features, training examples, or internal components combine to drive a model’s output was computationally infeasible,” said Dr. Maya Torres, lead author of the study from the Stanford AI Lab. “SPEX and ProxySPEX make this scalable for the first time.”

Background: The Interaction Problem in AI Interpretability

Interpretability research aims to make LLMs more transparent and trustworthy. Three primary lenses exist: feature attribution (which input words matter), data attribution (which training examples influence outputs), and mechanistic interpretability (how internal components function).

Across all lenses, a single challenge persists: complex interactions. Behaviors rarely arise from isolated components; they emerge from dependencies and patterns. As model size grows, the number of possible interactions expands exponentially, making exhaustive analysis impossible.

“Our key insight was that not all interactions are equally important,” explained co-author Dr. James Park of MIT. “We needed a method to zero in on the critical few without scanning every combination.”

How SPEX and ProxySPEX Work

Central to both algorithms is a technique called ablation—measuring influence by removing a component and observing the change in output. For feature attribution, this means masking parts of the input; for data attribution, training on different subsets; for mechanistic interpretability, intervening on the model’s forward pass.
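The ablation idea can be illustrated with a short, self-contained sketch. Everything here is a toy: `model_score` stands in for an expensive LLM call and is not part of SPEX or ProxySPEX.

```python
# A minimal sketch of ablation-based feature attribution, using a toy
# scorer in place of a real LLM call (model_score is illustrative).

def model_score(tokens):
    # Toy stand-in for an expensive model evaluation. The pair
    # ("not", "bad") only contributes when both words are present,
    # mimicking an interaction effect.
    score = 0.0
    if "good" in tokens:
        score += 1.0
    if "not" in tokens and "bad" in tokens:
        score += 2.0
    return score

def ablate(tokens, mask):
    # Keep only the tokens whose mask entry is True.
    return [t for t, keep in zip(tokens, mask) if keep]

sentence = ["not", "bad", "good"]
full = model_score(sentence)                      # 3.0

# Effect of each single-token ablation:
drops = []
for i in range(len(sentence)):
    mask = [j != i for j in range(len(sentence))]
    drops.append(full - model_score(ablate(sentence, mask)))
print(drops)                                      # [2.0, 2.0, 1.0]

# Removing "not" and "bad" together drops only 2.0, not 2.0 + 2.0:
joint = full - model_score(ablate(sentence, [False, False, True]))
print(joint)                                      # 2.0
```

The non-additivity here (the joint drop differs from the sum of individual drops) is the signature of an interaction, and the number of subsets one could ablate grows as 2^n, which is exactly why exhaustive analysis fails at scale.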

“Every ablation has a cost, whether through expensive inference calls or retraining runs,” Dr. Torres noted. “SPEX is designed to minimize the number of ablations needed to discover high-impact interactions. ProxySPEX goes further by using a proxy model to approximate those interactions, drastically cutting compute time.”
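The article does not detail how the proxy works, but the principle of spending a fixed ablation budget once and then querying a cheap surrogate can be sketched as follows. The surrogate choice here (a 3-nearest-neighbour lookup over Hamming distance) and the names `expensive_model`, `proxy`, and `CALLS` are all illustrative assumptions, not the paper's method.

```python
import itertools
import random

random.seed(0)

CALLS = 0
def expensive_model(mask):
    # Stand-in for a costly LLM ablation; CALLS tracks the budget spent.
    global CALLS
    CALLS += 1
    return 2.0 * (mask[0] and mask[3]) + 1.0 * mask[1]

n = 12
train = [tuple(random.random() < 0.5 for _ in range(n)) for _ in range(64)]
labels = [expensive_model(m) for m in train]   # 64 real ablations, once

def proxy(mask):
    # Cheap 3-nearest-neighbour surrogate over Hamming distance:
    # averages the cached ablations closest to the queried pattern,
    # so evaluating it costs no new model calls.
    order = sorted(range(len(train)),
                   key=lambda k: sum(a != b for a, b in zip(train[k], mask)))
    return sum(labels[k] for k in order[:3]) / 3.0

# Screen every one of the 2**12 = 4096 ablation patterns through the
# surrogate -- zero additional model calls.
scores = {m: proxy(m) for m in itertools.product([False, True], repeat=n)}
print(CALLS)   # still 64: the proxy spent no extra budget
```

The point of the sketch is the budget accounting: 4,096 candidate ablations are screened while the expensive model is invoked only 64 times.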

[Image source: bair.berkeley.edu]

The algorithms leverage mathematical properties of interaction graphs to prioritize which ablations to perform first. Early tests show they can detect key interactions in models with billions of parameters while performing only a fraction of the theoretical maximum ablations.
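The payoff of prioritized, budgeted ablation can be demonstrated on a toy black box. Assuming a hidden interaction between two features, a small random sample of ablation patterns is often enough to surface it; `toy_model` and `pair_score` are hypothetical helpers for illustration, not the published algorithms.

```python
# Hedged sketch: instead of sweeping the full 2**n ablation grid,
# sample a small budget of random patterns and score candidate pairs
# by how the output co-varies with both features being present.
import itertools
import random

random.seed(0)

def toy_model(mask):
    # Black-box stand-in: the output depends on a hidden pair (2, 5).
    return 1.0 if mask[2] and mask[5] else 0.0

n = 10                       # 2**10 = 1024 possible ablation patterns
budget = 64                  # far fewer ablations than exhaustive search
samples = [[random.random() < 0.5 for _ in range(n)] for _ in range(budget)]
samples += [[True] * n, [False] * n]   # anchor masks for coverage
outputs = [toy_model(m) for m in samples]

def pair_score(i, j):
    # Mean output when features i and j are both kept, minus otherwise.
    on  = [y for m, y in zip(samples, outputs) if m[i] and m[j]]
    off = [y for m, y in zip(samples, outputs) if not (m[i] and m[j])]
    if not on or not off:
        return 0.0
    return sum(on) / len(on) - sum(off) / len(off)

# The top-scoring pair; with enough samples this recovers (2, 5).
best = max(itertools.combinations(range(n), 2), key=lambda p: pair_score(*p))
print(best, pair_score(*best))
```

Only 66 of the 1,024 possible ablations are ever evaluated, yet the hidden pair stands out with the maximum possible score, which is the "critical few without scanning every combination" behavior the researchers describe.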

What This Means

For AI safety, the ability to pinpoint influential interactions is a significant advance. It allows researchers to identify potentially harmful behaviors, such as a model acting on a hidden combination of biased features, without probing every possibility.

Model builders can also use these insights to improve performance. By understanding which interactions drive correct vs. incorrect outputs, they can fine-tune models more efficiently. “This moves interpretability from post-hoc analysis to a practical tool for development,” said Dr. Park.

The team is releasing open-source implementations of SPEX and ProxySPEX, hoping to spur wider adoption. Early adopter feedback from several tech firms has been positive, with one engineer calling it “the first method that makes interaction analysis actually feasible in production.”

Looking Ahead

The researchers are now extending the framework to handle multi-modal models and real-time interaction detection during inference. “We’re only scratching the surface,” Dr. Torres concluded. “Understanding interactions at scale is the key to truly trustworthy AI.”

Read more in the original paper: “Identifying Interactions at Scale for LLMs” (2025).
