{"pk":49383,"title":"Do Large Language Models Truly Grasp Mathematics? An Empirical Exploration from Cognitive Psychology","subtitle":null,"abstract":"The cognitive mechanism by which Large Language Models (LLMs) solve mathematical problems remains a widely debated and unresolved issue. Currently, there is little interpretable experimental evidence that connects LLMs' problem-solving with human cognitive psychology. To determine whether LLMs possess human-like mathematical reasoning, we modified the problems used in the human Cognitive Reflection Test (CRT). Our results show that even with the use of Chain-of-Thought (CoT) prompts, mainstream LLMs, including the o1 model (noted for its reasoning capabilities), have a high error rate when solving these modified CRT problems. Specifically, the average accuracy rate dropped by up to 50% compared to the original problems. Further analysis of LLMs' incorrect answers suggests that they primarily rely on pattern matching from their training data, which aligns more with human intuition (System 1 thinking) than with human-like reasoning (System 2 thinking). This finding challenges the belief that LLMs have genuine mathematical reasoning abilities comparable to humans. As a result, this work may adjust overly optimistic views on LLMs' progress toward Artificial General Intelligence.","language":"eng","license":{"name":"","short_name":"","text":null,"url":""},"keywords":[{"word":"Artificial Intelligence; Psychology; Cognitive architectures; Reasoning; Comparative Analysis"}],"section":"Papers with Poster Presentation","is_remote":true,"remote_url":"https://escholarship.org/uc/item/24x9t7s1","frozenauthors":[{"first_name":"Shuoyoucheng","middle_name":"","last_name":"Ma","name_suffix":"","institution":"National University of Defense Technology","department":""},{"first_name":"Wei","middle_name":"","last_name":"Xie","name_suffix":"","institution":"National University of Defense Technology","department":""},{"first_name":"Zhenhua","middle_name":"","last_name":"Wang","name_suffix":"","institution":"National University of Defense Technology","department":""},{"first_name":"Xiaobing","middle_name":"","last_name":"Sun","name_suffix":"","institution":"Agency for Science, Technology and Research","department":""},{"first_name":"Kai","middle_name":"","last_name":"Chen","name_suffix":"","institution":"University of Chinese Academy of Sciences","department":""},{"first_name":"Enze","middle_name":"","last_name":"Wang","name_suffix":"","institution":"College of Computer Science and Technology, National University of Defense Technology","department":""},{"first_name":"Wei","middle_name":"","last_name":"Liu","name_suffix":"","institution":"College of Computer Science and Technology","department":""},{"first_name":"Hanying","middle_name":"","last_name":"Tong","name_suffix":"","institution":"College of Computer Science and Technology","department":""}],"date_submitted":null,"date_accepted":null,"date_published":"2025-01-01T18:00:00Z","render_galley":null,"galleys":[{"label":"PDF","type":"pdf","path":"https://journalpub.escholarship.org/cognitivesciencesociety/article/49383/galley/37345/download/"}]}