This is a Preprint and has not been peer reviewed. The published version of this Preprint is available: https://doi.org/10.1016/j.cageo.2022.105248. This is version 2 of this Preprint.
Abstract
Decision trees (DT) are a machine learning method that has been widely used in the geosciences to automatically extract patterns from complex and high-dimensional data. However, like any data-based method, the application of DT is hindered by data limitations and can produce physically unrealistic results. We develop interactive DT (iDT) that put the human in the loop and integrate the power of experts’ scientific knowledge with the power of algorithms to automatically learn patterns from large datasets. We created an open-source Python toolbox that implements the iDT framework. Users can create new composite variables, manually change the variable and threshold used for a split, manually prune the tree, and group variables based on their physical meaning. We demonstrate with three case studies that iDT help experts incorporate their knowledge into DT development, achieving higher interpretability.
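To make the idea of manually changing the variable and threshold of a split concrete, the following is a minimal sketch in plain Python. It does not use or reproduce the iDT toolbox API. The function names, the variable names (`precip_mm`, `temp_c`), the class labels, and the toy data are all hypothetical, and Gini impurity is assumed as the split criterion purely for illustration. The sketch compares the split an exhaustive search would choose with a threshold an expert might impose for physical reasons.

```python
from collections import Counter

# Illustrative sketch only: NOT the iDT toolbox API. All names and data are hypothetical.

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_impurity(rows, labels, feature, threshold):
    """Weighted Gini impurity after splitting on rows[i][feature] <= threshold."""
    left = [y for x, y in zip(rows, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(rows, labels) if x[feature] > threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def best_split(rows, labels):
    """Exhaustive search for the (feature, threshold) with the lowest weighted impurity."""
    candidates = []
    for feature in rows[0]:
        values = sorted({x[feature] for x in rows})
        # Candidate thresholds are midpoints between consecutive observed values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            impurity = split_impurity(rows, labels, feature, threshold)
            candidates.append((impurity, feature, threshold))
    impurity, feature, threshold = min(candidates)
    return feature, threshold, impurity

# Hypothetical toy dataset: two variables, two classes.
rows = [
    {"precip_mm": 10, "temp_c": 2},
    {"precip_mm": 20, "temp_c": 5},
    {"precip_mm": 30, "temp_c": -1},
    {"precip_mm": 40, "temp_c": 8},
]
labels = ["dry", "dry", "wet", "wet"]

# Split chosen automatically by the impurity criterion.
auto_feature, auto_threshold, auto_impurity = best_split(rows, labels)

# Split imposed by an expert, e.g. a threshold with a known physical meaning.
manual_impurity = split_impurity(rows, labels, "precip_mm", 35)

print(auto_feature, auto_threshold, auto_impurity)
print("precip_mm", 35, manual_impurity)
```

In this sketch the expert-chosen threshold yields a higher impurity than the automatic one. That is the kind of trade-off an interactive workflow exposes: the expert can accept a statistically weaker split when it is easier to interpret physically.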
DOI
https://doi.org/10.31223/X5PP75
Subjects
Civil and Environmental Engineering, Engineering, Environmental Engineering
Keywords
Interactive Decision Trees, Human-in-the-Loop, Interpretability, Open-source toolbox
Dates
Published: 2021-07-24 14:34
Last Updated: 2022-03-29 21:30