{"pk":49583,"title":"Cognitive Insights into Document Comprehension: The Role of Reading Order and Visual Attention in Human and Large Language Models","subtitle":null,"abstract":"This study investigates how integrating human eye-tracking data into Large Language Models (LLMs) and Visual Large Language Models (VLLMs) can enhance document comprehension in tasks that require both linguistic understanding and visual attention, specifically Semantic Entity Recognition (SER) and Document Question Answering (DQA). Despite rapid advancements in AI-based document understanding, LLMs still face challenges in replicating the depth of human cognition, particularly in how reading order and visual attention affect comprehension. The results demonstrate that human reading order and the regions human readers focus on significantly affect performance in both tasks. Additionally, while LLMs do not need to fully mimic human reading sequences, their performance improves when their attention patterns align more closely with human visual strategies.
This highlights the importance of incorporating cognitive-inspired attention mechanisms in AI systems, offering a path to better AI models that reflect human cognitive strategies in complex document understanding.","language":"eng","license":{"name":"","short_name":"","text":null,"url":""},"keywords":[{"word":"Artificial Intelligence; Human-computer interaction; Natural Language Processing; Reading; Eye tracking"}],"section":"Papers with Poster Presentation","is_remote":true,"remote_url":"https://escholarship.org/uc/item/6mx1z1r4","frozenauthors":[{"first_name":"QingXuan","middle_name":"","last_name":"Wang","name_suffix":"","institution":"School of Computer Engineering and Science, Shanghai University","department":""},{"first_name":"Hao","middle_name":"","last_name":"Wang","name_suffix":"","institution":"School of Computer Engineering and Science, Shanghai University","department":""},{"first_name":"Huiran","middle_name":"","last_name":"Zhang","name_suffix":"","institution":"Shanghai University","department":""},{"first_name":"Chenhui","middle_name":"","last_name":"Chu","name_suffix":"","institution":"Kyoto University","department":""},{"first_name":"Rui","middle_name":"","last_name":"Wang","name_suffix":"","institution":"Shanghai Jiao Tong University","department":""},{"first_name":"Pinpin","middle_name":"","last_name":"Zhu","name_suffix":"","institution":"Shanghai University","department":""}],"date_submitted":null,"date_accepted":null,"date_published":"2025-01-01T12:00:00-06:00","render_galley":null,"galleys":[{"label":"PDF","type":"pdf","path":"https://journalpub.escholarship.org/cognitivesciencesociety/article/49583/galley/37545/download/"}]}