Assessing Logical Reasoning Capabilities of Encoder-Only Transformer Models
Abstract
Transformer models have shown impressive abilities in natural language tasks such as text generation and question answering. Still, it is not clear whether these models can successfully perform a rule-guided task such as logical reasoning. In this paper, we investigate the extent to which encoder-only transformer language models (LMs) can reason according to logical rules. We ask whether these LMs can deduce theorems in propositional calculus and first-order logic, whether their relative success on these problems reflects general logical capabilities, and which layers contribute most to the task. First, we show that several encoder-only LMs can be trained, to a reasonable degree, to determine logical validity on various datasets. Next, by cross-probing the fine-tuned models on these datasets, we show that LMs have difficulty transferring their putative logical reasoning ability, which suggests that they may have learned dataset-specific features rather than a general capability. Finally, we conduct a layerwise probing experiment, which shows that the hypothesis classification task is mostly solved through the higher layers. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
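The layerwise probing experiment mentioned in the abstract can be pictured with a short sketch. The snippet below is a minimal illustration, not the paper's actual setup: it assumes a BERT-style encoder (`bert-base-uncased`), a few hypothetical premise-hypothesis pairs, and a simple logistic-regression probe fit on the [CLS] representation of every layer; the datasets, models, and probe design used in the paper may differ.

```python
import torch
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from transformers import AutoModel, AutoTokenizer

# Hypothetical premise-hypothesis pairs, labelled 1 if the hypothesis
# follows logically from the premises and 0 otherwise.
train_pairs = [
    ("If it rains, the street is wet. It rains.", "The street is wet."),
    ("All cats are animals.", "All animals are cats."),
]
train_labels = [1, 0]
test_pairs = [
    ("Every prime greater than two is odd. Seven is a prime greater than two.", "Seven is odd."),
    ("Some birds fly.", "All birds fly."),
]
test_labels = [1, 0]

model_name = "bert-base-uncased"  # assumed encoder-only model for the sketch
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()


def cls_per_layer(pairs):
    """Return one [num_examples, hidden_size] array of [CLS] vectors per layer."""
    premises, hypotheses = zip(*pairs)
    enc = tokenizer(list(premises), list(hypotheses),
                    padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc)
    # hidden_states is (embedding layer, layer 1, ..., layer N);
    # take the [CLS] position of each layer as the probe input.
    return [h[:, 0, :].numpy() for h in out.hidden_states]


train_feats = cls_per_layer(train_pairs)
test_feats = cls_per_layer(test_pairs)

# Fit an independent linear probe per layer and report its accuracy.
for layer, (x_tr, x_te) in enumerate(zip(train_feats, test_feats)):
    probe = LogisticRegression(max_iter=1000).fit(x_tr, train_labels)
    acc = accuracy_score(test_labels, probe.predict(x_te))
    print(f"layer {layer:2d}: accuracy = {acc:.2f}")
```

Comparing per-layer probe accuracies in this fashion is a standard way to estimate which layers carry most of the task-relevant information, which is the question the abstract raises about the higher layers.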
URI
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85204618703&doi=10.1007%2f978-3-031-71167-1_2&partnerID=40&md5=5443516f30a09ac64e156f7486dfe04c
https://repositorio.maua.br/handle/MAUA/584