Data sets are often easy to find but complicated to use because scientists store and use information differently and describe it with different vocabularies.
Now, UF researchers are leading a national project to help scientists find and reuse data to analyze crops. With updated data and software, researchers can use information that will help them feed the world’s 7.9 billion people.
To increase food production, scientists constantly look for crops that resist diseases, pests, drought and floods, often through breeding. They also use data to develop models that guide where and when to plant fruits and vegetables.
UF/IFAS is leading a national initiative to update and streamline information on the Agricultural Research Data Network (ARDN), which combines information from researchers at different universities. The project will help researchers find, interpret and combine data from multiple sources.
Cheryl Porter, a UF/IFAS computer applications specialist and an investigator on the project, cited an example of how ARDN can help scientists who are developing a breeding tool.
The tool will allow scientists to link genetic sequence data for several crop varieties with experimental data collected for those varieties as they grow in different parts of the world. By combining the genetic data with the field crop experimental data, researchers can develop novel gene-based modeling tools that predict crop growth characteristics from the genetic material alone.
“We want to take the data that scientists have archived in their current forms,” Porter said. “Through software we’re developing, we’ll add notes that researchers globally can understand.”
By late October, the research team will post new software to a scientific data repository known as Ag Data Commons, a platform managed by the National Agricultural Library. Ag Data Commons contains data and documents from agriculture-related research.
Typically, researchers collect data and use them only for their own original experiments.
UF/IFAS professor Gerrit Hoogenboom, the principal investigator on the project, believes scientists could gain far more value from data if they were combined across locations, time and conditions.
“With the rapid increase in computer technologies for data analytics and especially artificial intelligence, these old data sets have even become more valuable,” said Hoogenboom, a faculty member in agricultural and biological engineering. “Although the interest of our group is mainly in crop modeling and decision support systems, making data findable, accessible and reusable has a much broader impact.”
The research by Hoogenboom and Porter is part of a $1 million, three-year NIFA-funded project involving Iowa State University, the University of Georgia, the University of Arizona and the Kellogg Biological Station (part of Michigan State University). Researchers just started the third year of the grant. Hoogenboom and his team have spent the last two years developing software for ARDN.
“ARDN provides a means of standardizing data so that users don’t have to figure it out for themselves. If we can reuse data that have already been collected, we conserve resources,” Porter said. “Scientists would otherwise have to go out and repeat an experiment. Implementing this new software with both current and old data and combining it with modeling, analytics, and AI is the new frontier in agricultural research for feeding the world.”