Trusting LLMs through attributions: the TreeFinder approach

Lize Pirenne, Gaspard Lambrechts, Norman Marlier, Maxence de la Brassinne Bonardeaux, Gilles Louppe, Damien Ernst

Access to the paper, Access to the code


Improving LLM trustworthiness

Large Language Models (LLMs) have been adopted to take over menial textual tasks such as mail classification or spell checking. However, some sectors, including legal and medical question-answering (QA), enterprise search, and content moderation, require levels of trust that most LLM systems do not yet achieve. Indeed, LLMs often (i) miss or ignore key sentences in long documents, (ii) are distracted by irrelevant text, and (iii) provide answers with little traceability. These issues make it hard to fact-check, debug, or certify model outputs in high-stakes settings.

One promising avenue for improving their trustworthiness, and as a result their adoption, is to make them auditable. This can be done by finding the sentences the LLM uses to generate its answer. These sentences are called contributive attributions.

Locating attributions can improve the whole LLM pipeline. First, the attributed sentences can act as a summary of the original context, which humans or other AI processes can rapidly scan for errors. Second, the attributions can be checked to ensure that they actually entail the answer, either with traditional natural language processing models or by running the LLM again on this new, smaller set (a sketch of such a check is given below). Third, they can be used to detect bias when a model is presented with contradictory sources.
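As an illustration of the second use, here is a minimal sketch of an entailment check with an off-the-shelf natural language inference model from Hugging Face. The model name, the 0.9 threshold, and the function name are illustrative choices made here, not part of the paper.

    # Minimal sketch: check that the attributed sentences entail the answer with an
    # off-the-shelf NLI model. Model name and threshold are illustrative choices.
    from transformers import pipeline

    nli = pipeline("text-classification", model="microsoft/deberta-large-mnli")

    def attributions_entail_answer(attributed_sentences, answer, threshold=0.9):
        """Return True if the concatenated attributions entail the answer."""
        premise = " ".join(attributed_sentences)
        result = nli([{"text": premise, "text_pair": answer}])[0]
        return result["label"] == "ENTAILMENT" and result["score"] >= threshold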

TreeFinder: a tractable evaluation of sentence importance

The value of a subcontext as an attribution can be measured by its necessity for producing the answer and its sufficiency to generate that answer on its own. We can estimate both by analysing how the probability of the answer drops for different subcontexts. Combining the two estimates, we construct a single probability-drop metric that measures a subcontext's role as an attribution for the answer. The remaining issue is the number of subcontexts: considering every possible ablation, in every order, renders the computation intractable.
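To make the idea concrete, here is a minimal sketch of how such probability drops could be estimated with a Hugging Face causal LM. The prompt template, the placeholder "gpt2" model, and the way the two drops are combined (a plain sum) are assumptions made for illustration; the paper's exact metric may differ.

    # Minimal sketch of necessity/sufficiency probability drops for one chunk.
    # Prompt template, model choice, and the final combination are assumptions.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained("gpt2")
    tokenizer = AutoTokenizer.from_pretrained("gpt2")

    def answer_log_prob(sentences, question, answer):
        """Log-probability of the answer tokens, given a prompt built from `sentences`."""
        prompt = f"Context: {' '.join(sentences)}\nQuestion: {question}\nAnswer:"
        prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
        answer_ids = tokenizer(" " + answer, return_tensors="pt",
                               add_special_tokens=False).input_ids
        input_ids = torch.cat([prompt_ids, answer_ids], dim=1)
        with torch.no_grad():
            logits = model(input_ids).logits
        # Positions prompt_len-1 .. end-1 predict the answer tokens.
        log_probs = torch.log_softmax(logits[0, prompt_ids.shape[1] - 1 : -1], dim=-1)
        return log_probs.gather(1, answer_ids[0].unsqueeze(1)).sum().item()

    def attribution_score(sentences, chunk, question, answer):
        """Combine necessity and sufficiency of `chunk` (a set of sentence indices)."""
        full = answer_log_prob(sentences, question, answer)
        without = answer_log_prob(
            [s for i, s in enumerate(sentences) if i not in chunk], question, answer)
        alone = answer_log_prob(
            [s for i, s in enumerate(sentences) if i in chunk], question, answer)
        necessity = full - without   # large drop => the chunk is needed for the answer
        sufficiency = alone - full   # small drop => the chunk alone suffices
        return necessity + sufficiency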

In this paper, we introduce TreeFinder, a model-agnostic framework that searches for contributive attributions by iteratively dropping smaller and smaller parts of the context and scoring them with our metric. This hierarchical, tree-based pruning strategy trades exhaustive search for a targeted, tractable one.
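The following sketch gives one possible reading of that search, reusing the attribution_score helper sketched above. The binary splits, the number of chunks kept per level, and the choice to leave unselected chunks in the context without refining them further are assumptions based on the description here, not the paper's exact algorithm.

    # Minimal sketch of a hierarchical pruning search (one possible reading of the
    # description above, not the paper's exact algorithm).
    def treefinder(sentences, question, answer, keep=2):
        """Return sentence indices ranked by their attribution score."""
        frontier = [list(range(len(sentences)))]  # chunks are lists of sentence indices
        scores = {}
        while frontier:
            # Split every chunk of the current level into two halves.
            halves = []
            for chunk in frontier:
                mid = len(chunk) // 2
                halves += [chunk[:mid], chunk[mid:]] if len(chunk) > 1 else [chunk]
            # Score each half by dropping it from the otherwise complete context.
            for chunk in halves:
                scores[tuple(chunk)] = attribution_score(
                    sentences, set(chunk), question, answer)
            # Only keep subdividing the best-scoring halves; the rest stay in the
            # context but are no longer refined.
            ranked = sorted(halves, key=lambda c: scores[tuple(c)], reverse=True)
            frontier = [c for c in ranked[:keep] if len(c) > 1]
        # Rank the single sentences that the search refined down to.
        singles = {c[0]: s for c, s in scores.items() if len(c) == 1}
        return sorted(singles, key=singles.get, reverse=True)

In this sketch, each level scores at most 2 * keep chunks and the tree depth grows logarithmically with the number of sentences, so the number of forward passes stays far below the exponential cost of exhaustive ablation.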

Experiments: better ranking of groups of sentences

We evaluate our method against two other methodologies that also leverage probability drops: ContextCite and TracLLM. On HotpotQA, we observe that TreeFinder finds groups of sentences that achieve better attribution metrics. Furthermore, the most important sentences taken separately, as defined by our metric, are ranked higher by TreeFinder than by ContextCite.

We complement the analysis in the appendix with experiments on LongBench and two subsets of Loogle. There, TracLLM and ContextCite find sentences with better attribution metrics. However, on these datasets as well, the ranking produced by TreeFinder is the one that corresponds best to the ground truth.

Future work: some possible improvements

Blindly using probability drops has inherent limitations. First, dropping a sentence can change the meaning of the context by breaking references; smarter sentence grouping (semantic or syntactic) could therefore further improve the method. Second, while keeping the whole context except the dropped chunk accurately informs us of that chunk's importance, permanently discarding the least interesting parts would speed up every estimation further down the tree. There should thus be an optimum between permanently removing a chunk, as TracLLM does, and simply no longer subdividing it, as TreeFinder does; finding this optimum deserves further research. Finally, there could be benefits to estimating a chunk's importance before measuring it, both to budget the search and to factor in surprise.

Conclusion: a simple and effective method for finding contributive attributions

TreeFinder provides a tractable way to find the sentences an LLM used to answer a question by combining necessity and sufficiency probability-drop metrics with a tree-pruning search. It is a practical step towards explainable QA. This research is especially valuable for long-document applications where confidence in the answer is crucial.

