This is a Preprint and has not been peer reviewed. This is version 2 of this Preprint.
From linguistic evaluation to mechanistic verification: testing LLM-generated farm recommendations
Authors
Abstract
Large language models (LLMs) are increasingly used to generate farm-management advice, but the biophysical consequences of that advice remain largely unverified. We introduce a process-based verification framework that combines management portfolios generated by ChatGPT and Claude with the process-based model LandscapeDNDC across 11 contrasting agroecosystems. The LLMs produced agronomically plausible interventions, favouring changes in fertiliser timing, splitting and rates, and generally preserved crop yields. However, their agreement was markedly weaker for environmental targets such as nitrogen losses and soil organic carbon. LLMs predicted the direction of agri-environmental change more reliably than its magnitude: directional agreement averaged 86%, whereas only 49% of simulated responses fell within the expected ranges. Portfolios that pursued several targets simultaneously rarely performed consistently across sites. Our results show that process-based models can screen AI-generated farm recommendations for environmental burden shifting before they are used in practice.
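The abstract reports two agreement scores without defining them. As an illustration only, the sketch below shows one plausible way such scores could be computed. It assumes each LLM expectation is a range of relative change in percent, and it takes the expected direction to be the sign of the range midpoint. The `Expectation` class, both function names and all numbers are hypothetical toy values, not the authors' method or results.

```python
from dataclasses import dataclass


@dataclass
class Expectation:
    """Hypothetical container for one LLM-stated expected change."""
    low: float   # lower bound of expected relative change, in percent
    high: float  # upper bound of expected relative change, in percent


def sign(x: float) -> int:
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def directional_agreement(expected, simulated):
    """Share of cases where the simulated change has the expected sign.

    Assumption: the expected direction is the sign of the range midpoint.
    """
    hits = sum(
        sign(sim) == sign((exp.low + exp.high) / 2)
        for exp, sim in zip(expected, simulated)
    )
    return hits / len(simulated)


def range_agreement(expected, simulated):
    """Share of simulated changes that fall inside the expected range."""
    hits = sum(exp.low <= sim <= exp.high for exp, sim in zip(expected, simulated))
    return hits / len(simulated)


# Toy values for four indicator/site combinations (not data from the preprint).
expected = [
    Expectation(-30, -10),
    Expectation(-20, -5),
    Expectation(0, 10),
    Expectation(-15, -5),
]
simulated = [-12.0, -2.0, 4.0, 3.0]  # simulated relative change, in percent

print(directional_agreement(expected, simulated))
print(range_agreement(expected, simulated))
```

In this toy case the second response has the right sign but falls outside its range, which is the kind of gap between direction and magnitude the abstract describes.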
DOI
https://doi.org/10.31223/X5F78W
Subjects
Agricultural Science, Agriculture, Agronomy and Crop Sciences, Life Sciences, Biochemistry, Biophysics, and Structural Biology, Biogeochemistry, Environmental Indicators and Impact Assessment, Research Methods in Life Sciences, Soil Science, Sustainability
Keywords
LLM, PBM, LandscapeDNDC, ChatGPT, Claude, Foundational model, Verification layer, Farm recommendation, Agri-environment, Yields
Dates
Published: 2026-07-03 09:31
Last Updated: 2026-07-03 16:49
License
CC BY Attribution 4.0 International
Additional Metadata
Data Availability:
The analysis scripts, processed model outputs and prompting templates will be made available in a public repository upon release of the full manuscript. The present preprint reports a proof-of-concept analysis based on existing LandscapeDNDC demonstration setups.