A Minimum Information Framework for capturing FAIR data with small Uncrewed Aircraft Systems

Small Uncrewed Aircraft Systems (sUAS) are an increasingly common tool for data collection in many scientific fields. However, there are few standards or best practices guiding the collection, sharing, or publication of data collected with these tools. This makes collaboration, data quality control, and reproducibility challenging. To that end, we have used iterative rounds of research process modeling and user engagement to develop a Minimum Information Framework (MIF) to guide sUAS users in collecting the metadata necessary to ensure that their data is trustworthy and shareable. This MIF outlines 74 metadata terms in four classes that users should consider collecting for any given study. The MIF provides a foundation that can be used for developing standards and best practices.

practice development, and should inform the selection of formal metadata best practices. The terms in the MIF can be mapped to existing standards and ontologies in creating an application profile. The MIF can also be used as a checklist for different organizations and communities to explore the kinds of metadata that might be important in facilitating data reuse. We developed this framework in collaboration with sUAS producers via the authors' ongoing and extensive work building community around sUAS-based scientific research 1. We additionally developed in-depth case studies of real-world sUAS use for scientific research, conducted systems and workflow analysis, and conducted community surveys.

This framework is not intended as a standard in and of itself, but rather as a first step towards the development of domain- or institution-specific standards and best practices. We do not provide any guidance about specific tooling or other hardware setups that might make data more or less FAIR; we simply outline the metadata elements that are potentially important for the provisioning of FAIR data. We describe the implications of our design further in the discussion section.

We identified 74 terms, divided into the following four classes of information that must be collected to make sUAS data FAIR:

The MIF can be used by data collectors or archives to begin development of best practices or other guidelines for collecting and curating data. We do not expect that every group will need to capture every data element. Rather, the MIF outlines important data types that should be considered in any sUAS project. Research teams may wish to rank terms according to their importance for a given study, context, or organization. We demonstrate the use of the MIF to develop localized best practices with a group from the U.S. Long Term Ecological Research (LTER) network.

interviews with four scientists who use drones in their fieldwork and who use drone data in their research. We walked through the same survey of terms and asked for responses on the same four-point scale, and received richer responses that helped us better understand how users interpreted the proposed terms in their different domains.

We reviewed and revised our proposed MIF to incorporate this feedback. We found that our survey respondents and interview subjects sometimes offered contradictory opinions on the necessity of a particular term, which typically reflected the needs of their respective domains and the differing sets of terms deemed necessary for drone flight operations and management versus data reuse. We consequently retained many terms that would not necessarily be needed by all groups, with the idea that each group could create different application profiles from the MIF.

Phase III: Pilot instantiation of the MIF

The MIF was piloted in a Data Best Practices working group with the U.S. LTER data managers as described above. Through a 6-month collaboration, we demonstrated how the MIF might serve their emerging needs, and simultaneously refined our terms based on their feedback. We worked with their team and users to rank each metadata term according to its usefulness in three contexts: Discovery (enables search in data archives); Fitness for use (enables an end user to assess whether a dataset will suit their research needs); and Necessary for reuse (details that would be needed to reuse, reprocess, or otherwise interpret the data). For all three contexts, each term was assigned a value on a scale from 1 to 5 (where 1 = not important, 5 = absolutely necessary). The LTER information managers and their users provided us with expert input on these value assignments. Based on this input, we have now included these rankings in our published MIF, while also noting that these rankings may differ by user community.
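As an illustration only, the ranking scheme described above could be represented in code roughly as follows. This is a minimal sketch, not part of the published MIF: the term names (`flight_date`, `sensor_model`, `pilot_name`), the scores, the `TermRanking` class, the `application_profile` helper, and the cutoff of 4 are all hypothetical and chosen purely to show how a group might derive an application profile from per-context rankings.

```python
from dataclasses import dataclass

# The three ranking contexts named in the text.
CONTEXTS = ("discovery", "fitness_for_use", "necessary_for_reuse")


@dataclass(frozen=True)
class TermRanking:
    """One metadata term with a 1-5 score for each context."""

    term: str
    discovery: int
    fitness_for_use: int
    necessary_for_reuse: int

    def __post_init__(self):
        # Enforce the 1 (not important) to 5 (absolutely necessary) scale.
        for context in CONTEXTS:
            score = getattr(self, context)
            if not 1 <= score <= 5:
                raise ValueError(
                    f"{self.term}: {context} score {score} is outside 1-5"
                )


def application_profile(rankings, context, minimum=4):
    """Return the terms whose score in `context` is at least `minimum`.

    The default cutoff of 4 is an assumption for illustration.
    """
    if context not in CONTEXTS:
        raise ValueError(f"unknown context: {context}")
    return [r.term for r in rankings if getattr(r, context) >= minimum]


# Hypothetical terms and scores; NOT taken from the published MIF.
EXAMPLE_RANKINGS = [
    TermRanking("flight_date", 5, 4, 5),
    TermRanking("sensor_model", 3, 5, 5),
    TermRanking("pilot_name", 1, 2, 2),
]

discovery_terms = application_profile(EXAMPLE_RANKINGS, "discovery")
reuse_terms = application_profile(EXAMPLE_RANKINGS, "necessary_for_reuse")
```

Because rankings may differ by user community, a group adapting this sketch would supply its own scores and cutoff rather than reuse the placeholder values shown here.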