LLMs in Staging: An Orchestrated LLM Workflow for Structured Augmentation with Fact Scoring / Trimigno, G., Lombardo, G., Tomaiuolo, M., Cagnoni, S., Poggi, A. - In: FUTURE INTERNET. - ISSN 1999-5903. - 17:12(2025), p. 535. [10.3390/fi17120535]
LLMs in Staging: An Orchestrated LLM Workflow for Structured Augmentation with Fact Scoring
Giuseppe Trimigno (Software)
Gianfranco Lombardo (Conceptualization)
Michele Tomaiuolo (Writing – Review & Editing)
Stefano Cagnoni (Writing – Review & Editing)
Agostino Poggi (Supervision)
2025-01-01
Abstract
Retrieval-augmented generation (RAG) enriches prompts with external knowledge, but it often relies on additional infrastructure that may be impractical in resource-constrained or offline settings. In addition, updating the internal knowledge of a language model through retraining is costly and inflexible. To address these limitations, we propose an explainable and structured prompt augmentation pipeline that enhances inputs using pre-trained models and rule-based extractors, without requiring external sources. We describe this approach as an orchestrated LLM workflow: a structured sequence in which lightweight LLM modules assume specialized roles. Specifically, (1) an extractor module identifies factual triples from input prompts by combining dependency parsing with a rule-based extraction algorithm; (2) a scorer module, based on a generic lightweight LLM, evaluates the importance of each triple via its self-attention patterns, leveraging internal beliefs to promote explainability and trustworthy cooperation with the downstream model; (3) a performer module processes the augmented prompt for downstream tasks in supervised fine-tuning or zero-shot settings. Much like in a theater staging, each module operates transparently behind the scenes to support and elevate the performer’s final output. We evaluate this approach across multiple performer architectures (encoder-only, encoder-decoder, and decoder-only) and NLP tasks (multiple-choice QA, open-book QA, and summarization). Our results show that this structured augmentation with scored facts yields consistent improvements compared to baseline prompting: up to a (Formula presented.) accuracy improvement for multiple-choice QA, up to a (Formula presented.) BLEURT improvement for open-book QA, and up to a (Formula presented.) ROUGE-L improvement for summarization. 
By decoupling knowledge scoring from task execution, our method provides a practical, interpretable, and low-cost alternative to RAG in static or knowledge-limited
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
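The three-stage workflow described in the abstract (extract triples, score them, pass an augmented prompt to the performer) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The extractor is a toy verb-splitting rule standing in for dependency parsing. The scorer averages a hand-supplied `token_weights` dictionary, a hypothetical stand-in for the attention mass a lightweight LLM would assign. The "Facts:" layout of the augmented prompt is also an assumption, since the abstract does not specify the format. All names (`Triple`, `extract_triples`, `score_triple`, `augment_prompt`) are illustrative.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Triple:
    subject: str
    relation: str
    obj: str

    def as_text(self) -> str:
        return f"{self.subject} {self.relation} {self.obj}"


# Assumption: a tiny fixed verb list replaces real dependency parsing.
RELATIONS = ("is", "are", "has", "contains")


def extract_triples(text: str) -> List[Triple]:
    """Toy extractor: split each declarative sentence at its first relation word."""
    triples = []
    for sentence in text.split("."):
        if sentence.strip().endswith("?"):
            continue  # skip questions, they carry no stated fact
        words = sentence.split()
        for i, word in enumerate(words):
            if word in RELATIONS and 0 < i < len(words) - 1:
                triples.append(
                    Triple(" ".join(words[:i]), word, " ".join(words[i + 1:]))
                )
                break
    return triples


def score_triple(triple: Triple, token_weights: Dict[str, float]) -> float:
    """Stand-in scorer: mean per-token weight (a proxy for attention mass)."""
    tokens = triple.as_text().lower().split()
    return sum(token_weights.get(t, 0.0) for t in tokens) / len(tokens)


def augment_prompt(prompt: str, token_weights: Dict[str, float], top_k: int = 2) -> str:
    """Prepend the top_k highest-scoring triples to the original prompt."""
    ranked = sorted(
        extract_triples(prompt),
        key=lambda t: score_triple(t, token_weights),
        reverse=True,
    )[:top_k]
    facts = [
        f"- {t.as_text()} (score={score_triple(t, token_weights):.2f})"
        for t in ranked
    ]
    return "Facts:\n" + "\n".join(facts) + "\n\nPrompt:\n" + prompt


if __name__ == "__main__":
    weights = {"mars": 0.9, "planet": 0.6, "phobos": 0.2}  # hypothetical values
    question = "Mars is a planet. Phobos is a moon. Which body is larger?"
    print(augment_prompt(question, weights, top_k=1))
```

In this sketch the scorer never sees the downstream task, which mirrors the decoupling of knowledge scoring from task execution described above. A performer model would simply receive the string returned by `augment_prompt`.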


