Although Large Language Models (LLMs) have shown great potential in Natural Language Generation, it remains challenging to characterize the uncertainty of model generations, i.e., when users can trust model outputs. Our work stems from the heuristic fact that the tokens created by auto-regressive LLMs reflect the meaning of a generation unequally, i.e., some tokens are more relevant (or representative) than others, yet all tokens are valued equally when estimating uncertainty. This is due to linguistic redundancy, whereby a few keywords often suffice to convey the meaning of a long sentence. We term these inequalities generative inequalities and investigate how they affect uncertainty estimation. Our resu...
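To make the idea concrete, here is a minimal sketch of relevance-weighted sequence uncertainty, assuming per-token log-probabilities from the model and externally supplied relevance scores; the function and variable names are illustrative, not taken from the paper, and how relevance is computed is left open.

```python
def weighted_uncertainty(token_logprobs, relevance):
    """Relevance-weighted negative log-likelihood of a generated sequence.

    token_logprobs: log p(t_i | t_<i, x) for each generated token
    relevance: nonnegative scores; higher = token carries more meaning
    (both are assumed inputs in this sketch)
    """
    z = sum(relevance)
    # Weight each token's negative log-likelihood by its normalized
    # relevance, so filler tokens contribute less than keywords.
    return sum(-lp * (r / z) for lp, r in zip(token_logprobs, relevance))

# Uniform weights recover the usual length-normalized NLL:
lps = [-0.1, -2.3, -0.2]                            # hypothetical token log-probs
print(weighted_uncertainty(lps, [1.0, 1.0, 1.0]))   # plain average
print(weighted_uncertainty(lps, [0.1, 1.0, 0.1]))   # keyword-weighted
```

With uniform relevance this reduces to the standard equally-weighted estimate the abstract criticizes; skewing the weights toward keywords is what makes the generative inequalities matter.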
This paper introduces Bayesian uncertainty modeling using Stochastic Weight Averaging-Gaussian (SWAG...
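For readers unfamiliar with SWAG, the following is a minimal PyTorch sketch of its commonly described diagonal variant: collect running first and second moments of the weights along the SGD trajectory, then sample weights from the implied Gaussian at test time. The class and method names are illustrative, and the full method also keeps a low-rank covariance term that this sketch omits; the truncated abstract does not confirm the paper's exact setup.

```python
import torch

class SwagDiagonal:
    """Diagonal SWAG: running moments of weights collected during SGD,
    later used to sample weights w ~ N(mean, diag(var))."""

    def __init__(self, model):
        self.n = 0
        self.mean = [p.detach().clone().zero_() for p in model.parameters()]
        self.sq_mean = [p.detach().clone().zero_() for p in model.parameters()]

    def collect(self, model):
        # Call periodically (e.g., once per epoch late in training).
        self.n += 1
        for m, s, p in zip(self.mean, self.sq_mean, model.parameters()):
            m += (p.detach() - m) / self.n          # streaming mean
            s += (p.detach() ** 2 - s) / self.n     # streaming mean of squares

    def sample_into(self, model):
        # Draw one weight sample and load it into the model; average
        # predictions over several such draws for Bayesian inference.
        with torch.no_grad():
            for m, s, p in zip(self.mean, self.sq_mean, model.parameters()):
                var = (s - m ** 2).clamp_min(1e-30)
                p.copy_(m + var.sqrt() * torch.randn_like(m))
```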
Large language models (LLMs), such as GPT-3.5 and GPT-4, have greatly advanced the performance of ar...
Large language models (LLMs) have recently been used as models of psycholinguistic processing, usual...
The increased deployment of LMs for real-world tasks involving knowledge and facts makes it importan...
Recent advances in powerful Language Models have allowed Natural Language Generation (NLG) to emerge...
With the recent success of deep learning methods, neural-based models have achieved superior perform...
We show that a GPT-3 model can learn to express uncertainty about its own answers in natural languag...
Despite recent advances in statistical machine learning that significantly improve performance, the ...
We investigate the problem of determining the predictive confidence (or, conversely, uncertainty) of...
Generative language models are usually pretrained on large text corpora by predicting the next token...
Uncertainty decomposition refers to the task of decomposing the total uncertainty of a model into da...
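For reference, one common entropy-based formalization splits total predictive uncertainty into a data (aleatoric) and a model (epistemic) part via the posterior over parameters; since the abstract above is truncated, the paper may use a different decomposition.

```latex
% Law-of-total-entropy decomposition under a posterior p(theta | D):
\underbrace{\mathcal{H}\!\left[\,\mathbb{E}_{\theta \sim p(\theta \mid \mathcal{D})}\, p(y \mid x, \theta)\,\right]}_{\text{total}}
= \underbrace{\mathbb{E}_{\theta \sim p(\theta \mid \mathcal{D})}\, \mathcal{H}\!\left[\, p(y \mid x, \theta)\,\right]}_{\text{data (aleatoric)}}
+ \underbrace{\mathcal{I}\!\left[\, y ; \theta \mid x, \mathcal{D}\,\right]}_{\text{model (epistemic)}}
```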
Reliable uncertainty quantification is a first step towards building explainable, transparent, and a...
Language models (LMs) have demonstrated remarkable capabilities across a wide range of tasks in vari...
Cheap-to-Build Very Large-Language Models (CtB-LLMs) with affordable training are emerging as the ne...
Since convolutional neural networks (CNNs) achieved top performance on the ImageNet task in 2012, de...