Community Jameel

In this paper, we propose a novel approach to conformal prediction (CP) that is adapted to generative large language models (LLMs). Conformal prediction is a popular technique for deriving, from machine learning models, prediction sets that carry rigorous statistical performance guarantees. We extend conformal techniques to a broad class of language models that sample from a conditional distribution over the combinatorial, unbounded space of possible text outputs, given some input prompt. Specifically, we translate the process of constructing prediction sets into calibrating a stopping rule, under which we draw diverse samples from our model until we are confident that the growing set of candidate answers includes at least one high-quality response. At the same time, we calibrate a rejection rule that selectively discards low-quality or redundant responses to reduce sample noise. Under minimal assumptions, we prove that the resulting output sets contain at least one high-quality answer with a user-specified probability (such as 90%), while remaining empirically precise on average. Furthermore, within this set of sampled candidate answers, we show that we can also accurately identify subsets of individual components (e.g., phrases or sentences) that are each independently correct (e.g., that are not "hallucinations"), again with provably high probability. We demonstrate the effectiveness of our approach on multiple types of large language models applied to tasks in open-domain question answering, text summarisation, and radiology report generation.
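The sampling procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `sample_prediction_set`, its parameters, and the choice of the maximum kept quality score as the set-level confidence are all assumptions made for illustration, and the three thresholds are taken as already calibrated on held-out data, a step that is not shown here.

```python
from typing import Callable, List


def sample_prediction_set(
    sample: Callable[[], str],
    quality: Callable[[str], float],
    similarity: Callable[[str, str], float],
    reject_quality_below: float,
    reject_similarity_above: float,
    stop_confidence_at: float,
    max_samples: int = 20,
) -> List[str]:
    """Draw samples until the stopping rule fires (hypothetical helper).

    All thresholds are assumed to be pre-calibrated; calibration is omitted.
    """
    kept: List[str] = []
    kept_scores: List[float] = []
    for _ in range(max_samples):
        candidate = sample()
        score = quality(candidate)

        # Rejection rule: discard low-quality or redundant candidates.
        low_quality = score < reject_quality_below
        redundant = any(
            similarity(candidate, other) > reject_similarity_above
            for other in kept
        )
        if not (low_quality or redundant):
            kept.append(candidate)
            kept_scores.append(score)

        # Stopping rule: stop once the set-level confidence is high enough.
        # Using the best kept score as set confidence is an assumption.
        if kept_scores and max(kept_scores) >= stop_confidence_at:
            break
    return kept


# Toy usage with a deterministic "model" and made-up quality scores.
answers = iter(["paris?", "paris?", "Paris", "London"])
scores = {"paris?": 0.4, "Paris": 0.95, "London": 0.1}
result = sample_prediction_set(
    sample=lambda: next(answers),
    quality=lambda a: scores[a],
    similarity=lambda a, b: 1.0 if a == b else 0.0,
    reject_quality_below=0.2,
    reject_similarity_above=0.9,
    stop_confidence_at=0.9,
)
print(result)
```

In this toy run the second draw is rejected as a duplicate of the first, and sampling stops at the third draw because its score exceeds the stopping threshold, so the fourth answer is never sampled. The coverage guarantee stated in the abstract depends on how the thresholds are calibrated, which this sketch does not attempt to reproduce.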

Conformal language modeling

Generative AI in the era of 'alternative facts'

External data and AI are making each other more valuable

Rethinking patch dependence for masked autoencoders

Removing biases from molecular representations via information maximisation

Effective human-AI teams via learned natural language rules and onboarding

A deep dive into single-cell RNA sequencing foundation models

Antibiotic identified by AI

LLM-grounded video diffusion models

Successful Development of a Natural Language Processing Algorithm for Pancreatic Neoplasms and Associated Histologic Features

Leveraging artificial intelligence in the fight against infectious diseases

BioAutoMATED: An end-to-end automated machine learning tool for explanation and design of biological sequences

Comparison of mammography AI algorithms with a clinical risk model for 5-year breast cancer risk prediction: An observational study

Deep learning-guided discovery of an antibiotic targeting Acinetobacter baumannii

Algorithmic pluralism: A structural approach towards equal opportunity

Artificial intelligence and machine learning in lung cancer screening

Wide and deep neural networks achieve consistency for classification

Autocatalytic base editing for RNA-responsive translational control

DiffDock: Diffusion steps, twists and turns for molecular docking

Sybil: A Validated Deep Learning Model to Predict Future Lung Cancer Risk From a Single Low-Dose Chest Computed Tomography

Sequential multi-dimensional self-supervised learning for clinical time series

Queueing theory: Classical and modern methods

Upper body thermal images and associated clinical data from a pilot cohort study of COVID-19

Toward robust mammography-based models for breast cancer risk

The age of AI: And our human future

Uniform priors for data-efficient transfer

Machine learning under a modern optimisation lens

The marginal value of adaptive gradient methods in machine learning

Efficient graph-based image segmentation