VertNet: distributed databases with backbone

Ara chloroptera. Dr. Lloyd Glenn Ingles © Cal. Academy of Sciences.

VertNet is an NSF-funded collaborative project that makes biodiversity data free and available on the web. VertNet is a tool designed to help people discover, capture, and publish biodiversity data. It is also the core of a collaboration among hundreds of biocollections that contribute biodiversity data and work together to improve it. VertNet is an engine for training current and future professionals to use and build upon best practices in data quality, curation, research, and data publishing. VertNet is also the aggregate of all the information that it mobilizes. To us, VertNet is all of these things and more.

The VertNet team includes collaborators from the University of California, the University of Colorado, the University of Kansas, and Tulane University, and partners from a wide range of biodiversity projects, who are working to build upon the successes of four classic vertebrate networks (FishNet, MaNIS, HerpNET, ORNIS), to combine them into a single integrated data portal, and to solve the problems that these networks face. Our goal is to design, implement, and maintain a cloud-based computing strategy to create a fast, sustainable, and scalable data platform with capabilities and applications for data discovery, data quality improvement, and visualization that go beyond those of the current networks.

In other words, we strive to make the lives of people who work with biodiversity data more productive by providing tools and services to make data easy to find, easy to publish, and easy to use.

Just like that. Easy.

Alarm over global climate change and associated loss of biodiversity has resulted in international demand for quick, reliable access to high-quality data on the spatio-temporal occurrence of species and their relation to the environment. Responses to this demand have led to the development of four NSF-funded vertebrate distributed database networks (FishNet2, MaNIS, HerpNET, ORNIS), which currently include 171 collections from 12 countries, with 52 additional collections (from 20 countries) committed to participation. Collectively, these networks have successfully demonstrated community data sharing and cooperative data management. Participation in each of these networks has far exceeded expectations, resulting in growing problems of scalability, performance, sustainability, and ability to incorporate new members.

The proposed creation of VertNet will address these problems by using a cloud-based computing strategy to create a fast, cost-effective, and scalable data platform. This new platform will have capabilities and applications for data discovery, data quality improvement, and visualization that go beyond those of the current networks. Specifically, VertNet will (1) have new user interfaces with expanded search capabilities (keyword and full text, synonyms for search terms, phylogenetic browsing), (2) incorporate new kinds of data (paleontological), (3) provide improved, open methods for accessing data (via application programming interfaces that connect web browsers, mobile devices, and integrated applications), (4) enable customized change notifications, (5) create novel annotation and user feedback services, and (6) integrate with several targeted biodiversity and collection management applications (GEOLocate, AmphibiaWeb, Map of Life, Specify, Arctos, DataONE, Encyclopedia of Life, and Animal Diversity Web).

This strategic combination of open access to data, new capabilities, and integration with other applications will transform the use of vertebrate biodiversity data for cross-disciplinary research and for conservation.

  • First

    VertNet will fulfill the critical need for high-quality data for biodiversity monitoring and assessment by providing an integrated, globally accessible, and sustainable infrastructure for vertebrate data.

  • Second

    VertNet represents a new model for data publishing that maintains the fundamental capacity for curation of data at the source, while leveraging the advantages of a cloud-based computing platform. This new model will support the anticipated growth in digitized vertebrate biodiversity data, and will serve as a model for other scientific communities with similar data sharing challenges and needs.

  • Third

    VertNet has the potential to transform how biodiversity science is conducted. Researchers endeavoring to model species distributions, determine overall patterns of global biodiversity, and document biodiversity changes over time will have access to new integrative tools and data that increase the quality and usefulness of information contributed by the VertNet community.

VertNet will continue a strong tradition of biodiversity informatics training and community-building. The four predecessor projects (FishNet2, MaNIS, HerpNET, ORNIS) have trained at least 175 undergraduates and 383 researchers from 41 countries, with considerable participation by groups historically underrepresented in the natural sciences. The impacts of VertNet will extend beyond the funded institutions by engaging students from across the United States in two Summer Internships in Biodiversity Informatics and two Workshops in Biodiversity Informatics. Each workshop will involve 25 undergraduate and graduate students. In addition, undergraduate students will be offered volunteer apprenticeships through existing programs at UC Berkeley. These internships, apprenticeships, and workshops will emphasize the use and analysis of aggregated biodiversity data from VertNet and other sites for research and data improvement (e.g., georeferencing), and will contribute information back to these sites.

Open access to VertNet data will facilitate repatriation of data to the countries of origin, enabling informed conservation assessments and designations of protected areas by non-governmental organizations and government agencies. Continued participation by VertNet personnel in international forums (e.g., Biodiversity Information Standards-TDWG), and close interaction with high-level data aggregators and biodiversity resources (Global Biodiversity Information Facility, National Biological Information Infrastructure, Map of Life, Encyclopedia of Life, AmphibiaWeb, Animal Diversity Web, and NEOMAP), will ensure that the solutions and applications developed in this project have maximum impact on biodiversity informatics and the global dissemination of vertebrate data.

*Text excerpted from the VertNet proposal (DBI-1062193) submitted to the National Science Foundation in August 2010.