Environmental Data Initiative receives NSF grant
The National Science Foundation has awarded a second grant to the Environmental Data Initiative to continue gathering and storing ecological data. A collaboration between researchers at the University of New Mexico, the University of Wisconsin, and the University of California – Santa Barbara, EDI aims to store environmental data in a way that is Findable, Accessible, Interoperable, and Reusable (FAIR). A portion of the grant will be used to purchase equipment and services from the UNM Center for Advanced Research Computing using the cost model for premium research computing resources.
In 2009, UNM Research Assistant Professor Mark Servilla and his colleagues began work on data repository software called Provenance Aware Synthesis Tracking Architecture, or PASTA. The software was created to maintain a persistent archive of ecological data for future reference. In 2016, Servilla’s team joined researchers at UW and UCSB to create the EDI. While Servilla’s group specializes in data storage, a team headed by Corinna Gries of UW took on the task of ecologist outreach – finding qualified ecologists who can contribute valuable research data. In its first year, EDI was awarded a $3.1 million grant ($1.5 million of which was allocated to UNM) from the NSF.
In its early stages, the EDI data repository was primarily used to store data contributed by researchers affiliated with the Long Term Ecological Research (LTER) network, but data storage has since been made available to all qualified ecologists. Today, the EDI data repository houses over 70,000 data packages on a wide range of environmental research topics. While ecologists must be vetted before contributing to the collection, the vast majority of EDI’s data are publicly accessible via the EDI website.
This year, the Environmental Data Initiative has received a second grant of $3 million ($1.4 million of which has been allocated to UNM) from the NSF to fund the continued upkeep of the data repository system. The system is in a maintenance phase but, as Dr. Servilla explains, “it’s never done.” The continuation of this project will require new hardware and regular software updates.
As of July 2020, EDI’s production and staging environment virtual servers are hosted entirely by CARC; its production data are housed on storage devices hosted by both CARC and University Libraries. Using hardware hosted by CARC allows EDI to maintain reliability and substantially increases the database’s storage capacity.
As for the future of the EDI data repository, Servilla would like to make the system more accessible, automated, and modern. EDI programmers are already well on their way to automating the process of converting new ecological data into the proper format for storage, and the team hopes to rely more heavily on cloud-based infrastructure in the future. The work of the Environmental Data Initiative will undoubtedly prove vital to future generations of ecologists, who will be able to use today’s data for tomorrow’s discoveries.