Low-Rank Lens for Scalable LLMs Interpretability / Trimigno, G., Lombardo, G., Cagnoni, S. - (2026).
Low-Rank Lens for Scalable LLMs Interpretability
Giuseppe Trimigno (Conceptualization); Gianfranco Lombardo (Methodology); Stefano Cagnoni (Supervision)
2026-01-01
Abstract
Representation lenses expose layer-wise predictions in LLMs. Current methods rely on full-rank affine maps with quadratic cost. However, spectral evidence across multiple model families shows these maps are intrinsically low-rank. We propose LoRA-Lens, a low-rank residual alignment mechanism that reduces parameters by over 95% while preserving fidelity to the model’s final output. Experiments on OLMo, Qwen, and Gemma (up to 32B) demonstrate strong fidelity, large memory savings, robust transfer to instruction-tuned models, and effective early-exit inference.
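To make the parameter argument concrete, the following is a minimal sketch and not the authors' implementation. It assumes a full-rank lens of the form W h + b with W of size d x d, and a low-rank residual lens of the form h + B(A h) + b with A of size r x d and B of size d x r. The exact parameterization of LoRA-Lens is not given in the abstract, so this form, the function names, and the values d = 4096 and r = 64 are illustrative assumptions.

```python
# Hypothetical sketch: parameter counts for a full-rank affine lens versus
# an assumed low-rank residual lens. Names and sizes are illustrative only.

def full_rank_params(d: int) -> int:
    # Dense d x d matrix plus a bias vector: quadratic in the hidden size d.
    return d * d + d


def low_rank_params(d: int, r: int) -> int:
    # Two thin factors (r x d and d x r) plus a bias vector: linear in d.
    return 2 * d * r + d


def reduction(d: int, r: int) -> float:
    # Fraction of parameters saved relative to the full-rank lens.
    return 1.0 - low_rank_params(d, r) / full_rank_params(d)


def low_rank_residual(h, A, B, b):
    # Assumed form: h + B (A h) + b, with A as r rows of length d
    # and B as d rows of length r, all as plain Python lists.
    z = [sum(a * x for a, x in zip(row, h)) for row in A]
    return [
        h_i + sum(w * z_k for w, z_k in zip(row, z)) + b_i
        for h_i, row, b_i in zip(h, B, b)
    ]


if __name__ == "__main__":
    d, r = 4096, 64  # assumed hidden size and rank, not taken from the paper
    print(full_rank_params(d), low_rank_params(d, r), round(reduction(d, r), 4))
```

Under these assumed sizes the low-rank form stores roughly 3% of the full-rank parameters, which is consistent with the reduction of over 95% stated in the abstract. The actual saving depends on the rank and hidden size used in the paper.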


