Datasets, Tools & Code
Looking for VertNet snapshots? Need some tools for data analysis? Curious about our code base? All of that is here, and more. All resources produced for and by VertNet are free and open source, except where copyright or other restrictions apply. We encourage you to build upon these resources for your own professional and personal projects. Please send us some feedback if you have questions, suggestions, or comments.
Sometimes you just need all the data, so we created these taxon-based data snapshots
from the VertNet Index. Together, these datasets represent the entirety of the content
within the VertNet data portal at a given point in time. Our goal is to update these
snapshots quarterly. They will be made available to the public via CyVerse
on their Data Commons Repository.
We’ll also keep our first set of snapshots up on the KNB Data Repository, a
member node of DataONE. Both CyVerse and KNB will provide a persistent home
for these datasets and mint a DOI (digital object identifier),
so that you can refer to the dataset by name or by DOI, just like you would a publication.
September 2016 - CyVerse
- Amphibia – dx.doi.org/10.7946/P2F59W
- Aves – dx.doi.org/10.7946/P2K01C
- Fishes – dx.doi.org/10.7946/P2PP4B
- Mammalia – dx.doi.org/10.7946/P2TG68
- Reptilia – dx.doi.org/10.7946/P2Z59J
- Traits – dx.doi.org/10.7946/P23011
October 2015 - CyVerse
- Amphibia – dx.doi.org/10.7946/P2VC7X
- Aves – dx.doi.org/10.7946/P2059V
- Fishes – dx.doi.org/10.7946/P23W2C
- Mammalia – dx.doi.org/10.7946/P27P49
- Reptilia – dx.doi.org/10.7946/P2CC7K
April 2015 - KNB
- Amphibia - dx.doi.org/10.5063/F1VX0DF9
- Aves - dx.doi.org/10.5063/F1MG7MDB
- Fishes - dx.doi.org/10.5063/F1R49NQB
- Mammalia - dx.doi.org/10.5063/F1GQ6VPM
- Reptilia - dx.doi.org/10.5063/F10P0WX6
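The DOIs above are written in the older dx.doi.org form. If you want to cite or fetch a snapshot from a script, a minimal sketch like the one below turns any of them into a link on the doi.org resolver. The function name doi_to_url is our own illustration and not part of any VertNet tool.

```python
def doi_to_url(citation: str) -> str:
    """Turn a DOI written as 'dx.doi.org/10.7946/P2F59W' into an https URL.

    Hypothetical helper for illustration; not part of the VertNet code base.
    """
    doi = citation.strip()
    # Drop any scheme so the string starts with the host or the DOI itself.
    for scheme in ("https://", "http://"):
        if doi.startswith(scheme):
            doi = doi[len(scheme):]
    # Drop the resolver host, leaving only the bare DOI (prefix/suffix).
    for host in ("dx.doi.org/", "doi.org/"):
        if doi.startswith(host):
            doi = doi[len(host):]
    # Every DOI prefix begins with "10.".
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {citation!r}")
    return "https://doi.org/" + doi


if __name__ == "__main__":
    print(doi_to_url("dx.doi.org/10.7946/P2F59W"))
```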
VertNet snapshots creation process:
- VertNet loads its datastore into BigQuery, a Google tool, which we use to run queries over the complete VertNet index.
- From these queries we create taxon-based snapshots - Amphibia, Aves, Fishes*, Mammalia, and Reptilia (additional datasets may be produced in the future).
- These snapshots are converted into CSV (comma-separated) text files and compressed into .zip files.
- VertNet generates a metadata** file that describes the contents of each zipped dataset.
- Finally, we bundle up the metadata and the dataset and upload them to CyVerse.
* Yes, we understand that “fish” is a paraphyletic assemblage, but for the sake of simplicity and
to avoid a ton of “fish-related resources”, we have bundled everything under “fish”.
** The metadata we create is in the same format we use for any other resource in VertNet.
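Because each snapshot is a CSV file inside a .zip archive, you can read it without unpacking it first. The sketch below is one possible way to do that with the Python standard library. It assumes the archive holds at least one .csv member encoded as UTF-8 with a header row; the file name and column name in the usage note are placeholders, so check the metadata file for the real ones.

```python
import csv
import io
import zipfile
from collections import Counter


def iter_snapshot_rows(zip_path):
    """Yield each record of the first .csv member in the archive as a dict.

    Assumes UTF-8 text and a header row; both are assumptions, not
    documented guarantees of the snapshot format.
    """
    with zipfile.ZipFile(zip_path) as archive:
        csv_names = [n for n in archive.namelist() if n.lower().endswith(".csv")]
        if not csv_names:
            raise ValueError("no CSV file found in archive")
        with archive.open(csv_names[0]) as raw:
            # Wrap the binary stream so csv can read it as text, row by row.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text)


def count_by(zip_path, column):
    """Count records per distinct value of one column (missing -> "")."""
    return Counter(row.get(column, "") for row in iter_snapshot_rows(zip_path))
```

For example, count_by("snapshot.zip", "genus") would tally records per genus, where both the file name and the column name are hypothetical. Rows are streamed one at a time, so the whole file never has to fit in memory.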
Orig Release, 08April2015 (David Bloom)
Updated, 10Oct2016 (David Bloom)
Updated, 29Sept2016 (David Bloom)
VertNet on GitHub
- Gulo Repository (data harvesting)
- DwC Indexing Repository (Darwin Core data indexing)
- WebApp Repository (search portal)
- API Repository (search and download API)
- Toolkit Repository (data quality/cleanup)
- rVertNet Repository (R client for VertNet with rOpenSci)
Issue Tracking Code
Every data set harvested by VertNet has an associated issue tracking repository on GitHub. To track issues or follow a specific institution or data set, visit https://github.com and use the search field to find an institution. All VertNet-related repositories and issue lists are public. For example, to find all repositories for Tall Timbers Research Station, simply type ttrs-vertnet into the search field. Likewise, the Museum of Vertebrate Zoology and the University of Kansas can be found at mvz-vertnet and ku-vertnet, respectively.
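If you look up several institutions, you can build the search link directly instead of typing into the search field. The sketch below assumes the search term follows the pattern of the examples above (an institution code followed by -vertnet) and uses GitHub's search page with its q and type query parameters. The function name is our own and purely illustrative.

```python
from urllib.parse import urlencode


def vertnet_repo_search_url(institution_code: str) -> str:
    """Build a GitHub repository search link for one institution.

    Assumes the '<code>-vertnet' search term seen in the examples
    (ttrs-vertnet, mvz-vertnet, ku-vertnet); hypothetical helper.
    """
    query = f"{institution_code.strip().lower()}-vertnet"
    return "https://github.com/search?" + urlencode(
        {"q": query, "type": "repositories"}
    )


if __name__ == "__main__":
    for code in ("ttrs", "mvz", "ku"):
        print(vertnet_repo_search_url(code))
```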