BED-LLM: Intelligent Data Collection with LLMs and Bayesian Experimental Design

We propose a general-purpose approach for improving the ability of Large Language Models (LLMs) to intelligently and adaptively gather information from a user or other external source, using the framework of sequential Bayesian experimental design (BED). This enables LLMs to act as more effective multi-turn conversational agents and to interactively interface with external environments. Our method, which we call BED-LLM (Bayesian Experimental Design with Large Language Models), is based on iteratively choosing questions or queries that maximize the expected information gain (EIG) about the task of interest, given the responses gathered so far. We show how this EIG can be formulated in a principled way using a probabilistic model derived from the LLM's belief distribution, and provide detailed insights into the key decisions underlying its construction. Further central to the success of BED-LLM are a number of specific methodological innovations, such as a carefully designed EIG estimator that does not rely solely on in-context updates for conditioning on previous responses, and a targeted strategy for proposing candidate queries. We find that BED-LLM achieves substantial performance gains across a wide range of tests based on the 20-questions game and on using the LLM to infer user preferences, compared with direct prompting of the LLM and other adaptive design strategies.
- † University of Oxford
- ‡ City University of Hong Kong
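The question-selection rule described in the abstract can be illustrated with a minimal sketch for a discrete hypothesis space, where the EIG of a yes/no question can be computed exactly as the drop in entropy of the belief distribution. This is a toy example with illustrative data, not the paper's implementation: in BED-LLM the belief distribution and answer likelihoods would come from an LLM, and the EIG would be estimated rather than enumerated.

```python
import math

def eig(prior, answer_prob):
    """Expected information gain of a yes/no question.

    prior: dict hypothesis -> probability (current belief over hypotheses).
    answer_prob: dict hypothesis -> P(answer "yes" | hypothesis, question).
    Returns H(prior) - E_y[H(posterior given answer y)], in bits.
    """
    def entropy(dist):
        return -sum(p * math.log2(p) for p in dist.values() if p > 0)

    # Marginal probability of a "yes" answer under the current belief.
    p_yes = sum(prior[h] * answer_prob[h] for h in prior)

    expected_posterior_entropy = 0.0
    for answer, p_ans in (("yes", p_yes), ("no", 1.0 - p_yes)):
        if p_ans == 0:
            continue
        # Bayes update of the belief given this answer.
        posterior = {}
        for h in prior:
            lik = answer_prob[h] if answer == "yes" else 1.0 - answer_prob[h]
            posterior[h] = prior[h] * lik / p_ans
        expected_posterior_entropy += p_ans * entropy(posterior)

    return entropy(prior) - expected_posterior_entropy

# Toy 20-questions setting: uniform belief over four animals (illustrative).
prior = {"cat": 0.25, "dog": 0.25, "eagle": 0.25, "shark": 0.25}
# Candidate question "does it fly?" -- deterministic answer per hypothesis.
flies = {"cat": 0.0, "dog": 0.0, "eagle": 1.0, "shark": 0.0}
# Candidate question "is it a mammal?" -- splits the belief evenly.
mammal = {"cat": 1.0, "dog": 1.0, "eagle": 0.0, "shark": 0.0}

print(eig(prior, flies))   # ~0.811 bits (uneven 1/4 vs 3/4 split)
print(eig(prior, mammal))  # 1.0 bits: the even split is more informative
```

A greedy BED loop would score each candidate question this way, ask the highest-EIG one, update the belief with the observed answer, and repeat.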



