Soybean ‘Big Data’ Online

Infrastructure promotes collaboration among researchers

A new online data resource, the Soybean Knowledge Base (SoyKB), was unveiled by scientists at the University of Missouri. It will allow greater collaboration among international researchers, scientists and farmers to solve questions encountered in soybean research.

Gary Stacey, professor of plant sciences.

“Researchers essentially deposit their results from experiments into the database, and high capacity computer systems crunch the numbers to help determine results,” said Trupti Joshi, assistant research professor in computer science at Mizzou. “Their experiments become a part of the bigger picture allowing future researchers to narrow their own results.”

“Humans only can look at so many lines in an Excel spreadsheet—then it just kind of blurs,” said Gary Stacey, collaborator on the project, a professor of plant sciences in the College of Agriculture, Food and Natural Resources, and an investigator in the MU Bond Life Sciences Center. “We need these kinds of tools to be able to deal with this high-volume data. With this database, all the data is deposited and available so something that’s not valuable to me may be valuable to somebody else.”

In the era of “big data,” many scientific discoveries are being made without researchers ever setting foot in traditional laboratories. Often, data from numerous experiments is gathered and then disregarded, with only the desired results analyzed. SoyKB was developed to provide the digital infrastructure needed to store this previously disregarded data and take plant science to the next level.

Trupti Joshi developed the free online database that helps international researchers, scientists and farmers solve questions encountered in soybean research.

Collaborating for the Greater Research Good

Highly collaborative in nature, SoyKB uses computational methods developed by computer science engineers that can be used for many disciplines, such as health sciences, animal sciences, physics and genetics. Additionally, a 3D protein modeling tool available at the website assists researchers studying drug design. Pharmaceutical companies may test hypotheses and, in some situations, find that a proposed drug yields the expected results based solely on data analysis, making drug design more cost-effective.

Joshi is a valuable asset to the project because she has both a biology degree and a computer science background, or a “foot in each camp,” Stacey said.

“SoyKB has turned out to be a very good public resource for the soybean community to cross-reference and check the details of their findings,” Joshi said. “It can be really difficult for biologists to handle the large scope of data by themselves, and this tool allows researchers to focus more on the biology.”

SoyKB is a part of the U.S.’s $200 million “Big Data” Initiative, a program that works to improve the ability to extract knowledge and insight from large and complex collections of digital data and promises to help solve some of the nation’s most pressing challenges.

The progress of SoyKB was presented at the International Conference on Bioinformatics and Biomedicine in Shanghai. The project is funded by grants from the National Science Foundation.