Stanford University
Framework for Evaluating Theory-of-Mind in Language Models
Pages
22
Time to read
61 mins
Publication
Language
English
This technical report presents a framework for evaluating the Theory-of-Mind (ToM) reasoning capabilities of Large Language Models (LLMs). It outlines the challenges of assessing how closely LLMs' understanding of mental states matches that of humans, focusing on the inconsistent results of previous evaluations and on the questionable validity of existing methodologies. To address these issues, the authors introduce a systematic approach that generates evaluations from causal templates, and they use it to build a new benchmark, BigToM. The benchmark comprises 25 controls and 5,000 model-written evaluations, which human participants rated higher than previous crowd-sourced evaluations. The report details how the evaluations are constructed and compares the social reasoning capabilities of various LLMs, finding that GPT-4 exhibits ToM capabilities that resemble human inference patterns, though less reliably. The findings contribute to a deeper understanding of LLMs' social reasoning abilities and highlight the effectiveness of the proposed framework.