This vignette demonstrates more advanced features and customization available in occCite. We recommend you read vignette("Simple.Rmd", package = "occCite") first, if you have not already done so.
Querying GBIF can take a long time, especially for multiple species or for well-known species with many records. In such cases, you may wish to load previously downloaded data sets from your computer by specifying the general location of your downloaded .zip files. occQuery will crawl through your specified GBIFDownloadDirectory to collect all the .zip files contained in that folder and its subfolders. It will then import the most recent downloads that match your taxon list. If you chose BIEN as well as GBIF, these GBIF data will be appended to the BIEN search results, just as in the simple real-time search. checkPreviousGBIFDownload is TRUE by default, but if loadLocalGBIFDownload is TRUE, occQuery will ignore checkPreviousGBIFDownload. It is also worth noting that occCite does not currently support mixed data download sources. That is, you cannot do GBIF queries for some taxa, download previously-prepared data sets for others, and load the rest from local data sets on your computer.
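To make the directory crawl described above more concrete, here is a minimal base R sketch of the general idea: list every .zip file under a folder and its subfolders, then order them newest first. This is not occCite's actual implementation. The folder name, the file names, and the use of file modification time as the measure of "most recent" are all assumptions made for illustration, and the sketch does not attempt to match files against a taxon list.

```r
# Illustrative only: occCite's internal logic may differ.
# Build a throwaway directory tree containing empty, fake .zip files.
# The directory and file names below are hypothetical.
download_dir <- file.path(tempdir(), "gbif_downloads_demo")
dir.create(file.path(download_dir, "older_runs"), recursive = TRUE,
           showWarnings = FALSE)
old_zip <- file.path(download_dir, "older_runs", "0000001-demo.zip")
new_zip <- file.path(download_dir, "0000002-demo.zip")
file.create(old_zip, new_zip)

# Give the files distinct modification times so "most recent" is unambiguous.
Sys.setFileTime(old_zip, as.POSIXct("2019-07-04 12:00:00", tz = "UTC"))
Sys.setFileTime(new_zip, as.POSIXct("2020-11-23 12:00:00", tz = "UTC"))

# Recursively collect every .zip file in the folder and its subfolders.
zip_files <- list.files(download_dir, pattern = "\\.zip$",
                        recursive = TRUE, full.names = TRUE)

# Order the files newest first by modification time (an assumption here).
zip_info <- file.info(zip_files)
zip_files <- zip_files[order(zip_info$mtime, decreasing = TRUE)]
most_recent <- zip_files[1]
print(basename(most_recent))
```

In practice you only need to pass the top-level folder as GBIFDownloadDirectory; the sketch simply shows why downloads stored in nested subfolders are still found.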
library(occCite)
# Simple search
# GBIFLogin is assumed to already exist in your session.
myOldOccCiteObject <- occQuery(x = "Protea cynaroides",
                               datasources = c("gbif", "bien"),
                               GBIFLogin = GBIFLogin,
                               GBIFDownloadDirectory =
                                 system.file('extdata/', package='occCite'),
                               checkPreviousGBIFDownload = TRUE)
Here is the result. Look familiar?
# GBIF search results
head(myOldOccCiteObject@occResults$`Protea cynaroides`$GBIF$OccurrenceTable)
## name longitude latitude day month year
## 1 Protea cynaroides 26.51756 -33.34703 22 10 2020
## 2 Protea cynaroides 19.45966 -34.52285 7 11 2020
## 3 Protea cynaroides 19.13672 -33.76127 1 11 2020
## 4 Protea cynaroides 18.42365 -33.96614 28 3 2019
## 5 Protea cynaroides 18.42872 -33.99052 6 9 2020
## 6 Protea cynaroides 25.23694 -33.88793 4 11 2020
## Dataset DatasetKey
## 1 iNaturalist research-grade observations 50c9509d-22c7-4a22-a47d-8c48425ef4a7
## 2 iNaturalist research-grade observations 50c9509d-22c7-4a22-a47d-8c48425ef4a7
## 3 iNaturalist research-grade observations 50c9509d-22c7-4a22-a47d-8c48425ef4a7
## 4 iNaturalist research-grade observations 50c9509d-22c7-4a22-a47d-8c48425ef4a7
## 5 iNaturalist research-grade observations 50c9509d-22c7-4a22-a47d-8c48425ef4a7
## 6 iNaturalist research-grade observations 50c9509d-22c7-4a22-a47d-8c48425ef4a7
## DataService
## 1 GBIF
## 2 GBIF
## 3 GBIF
## 4 GBIF
## 5 GBIF
## 6 GBIF
#The full summary
summary(myOldOccCiteObject)
##
## OccCite query occurred on: 24 November, 2020
##
## User query type: User-supplied list of taxa.
##
## Sources for taxonomic rectification: NCBI
##
##
## Taxonomic cleaning results:
##
## Input Name Best Match Taxonomic Databases w/ Matches
## 1 Protea cynaroides Protea cynaroides NCBI
##
## Sources for occurrence data: gbif, bien
##
## Species Occurrences Sources
## 1 Protea cynaroides 1293 17
##
## GBIF dataset DOIs:
##
## Species GBIF Access Date GBIF DOI
## 1 Protea cynaroides 2020-11-23 10.15468/dl.2449qy
Getting citation data works exactly the same way with previously downloaded data as it does with a fresh data set.
# Get citations
myOldOccCitations <- occCitation(myOldOccCiteObject)
## [1] "NOTE: 1 BIEN dataset(s) for Protea cynaroides is/are missing citation data. Key(s) missing citations are: 280. Source(s) are identified as: MO."
print(myOldOccCitations)
## Writing 5 Bibtex entries ... OK
## Results written to file 'temp.bib'
## AFFOUARD A, JOLY A, LOMBARDO J, CHAMP J, GOEAU H, BONNET P (2020). Pl@ntNet automatically identified occurrences. Version 1.2. Pl@ntNet. https://doi.org/10.15468/mma2ec. Accessed via GBIF on 2020-11-23.
## AFFOUARD A, JOLY A, LOMBARDO J, CHAMP J, GOEAU H, BONNET P (2020). Pl@ntNet observations. Version 1.2. Pl@ntNet. https://doi.org/10.15468/gtebaa. Accessed via GBIF on 2020-11-23.
## CEN Limousin & MAÇONNERIE Delphine. Accessed via BIEN on NA.
## Cameron E, Auckland Museum A M (2022). Auckland Museum Botany Collection. Version 1.71. Auckland War Memorial Museum. https://doi.org/10.15468/mnjkvv. Accessed via GBIF on 2020-11-23.
## Capers R (2014). CONN. University of Connecticut. https://doi.org/10.15468/w35jmd. Accessed via GBIF on 2020-11-23.
## Chamberlain, S., Barve, V., Mcglinn, D., Oldoni, D., Desmet, P., Geffert, L., Ram, K. (2022). rgbif: Interface to the Global Biodiversity Information Facility API. R package version 3.7.1. https://CRAN.R-project.org/package=rgbif.
## Chamberlain, S., Boettiger, C. (2017). R Python, and Ruby clients for GBIF species occurrence data. PeerJ PrePrints.
## Department of Agriculture and Fisheries. Accessed via BIEN on NA.
## Fatima Parker-Allie, Ranwashe F (2018). PRECIS. South African National Biodiversity Institute. https://doi.org/10.15468/rckmn2. Accessed via GBIF on 2020-11-23.
## ITA327. Accessed via BIEN on NA.
## MNHN, Chagnoux S (2022). The vascular plants collection (P) at the Herbarium of the Muséum national d'Histoire Naturelle (MNHN - Paris). Version 69.251. MNHN - Museum national d'Histoire naturelle. https://doi.org/10.15468/nc6rxy. Accessed via GBIF on 2020-11-23.
## Maitner, B. (2022). BIEN: Tools for Accessing the Botanical Information and Ecology. R package version 1.2.5. https://CRAN.R-project.org/package=BIEN.
## NA. Accessed via BIEN on NA.
## Owens, H., Merow, C., Maitner, B., Kass, J., Barve, V., Guralnick, R. (2022). occCite: Querying and Managing Large Biodiversity Occurrence Datasets. R package version 0.5.4. https://CRAN.R-project.org/package=occCite.
## R Core Team. (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
## Senckenberg (2020). African Plants - a photo guide. https://doi.org/10.15468/r9azth. Accessed via GBIF on 2020-11-23.
## Solomon J, Stimmel H (2021). Tropicos Specimen Data. Missouri Botanical Garden. https://doi.org/10.15468/hja69f. Accessed via GBIF on 2020-11-23.
## Tela Botanica. Carnet en Ligne. https://doi.org/10.15468/rydcn2. Accessed via GBIF on 2020-11-23.
## UPRM. Accessed via BIEN on NA.
## de Vries H, Lemmens M (2021). Observation.org, Nature data from around the World. Observation.org. https://doi.org/10.15468/5nilie. Accessed via GBIF on 2020-11-23.
## iNaturalist contributors, iNaturalist (2022). iNaturalist Research-grade Observations. iNaturalist.org. https://doi.org/10.15468/ab3s5x. Accessed via GBIF on 2020-11-23.
## naturgucker.de. naturgucker. https://doi.org/10.15468/uc1apo. Accessed via GBIF on 2020-11-23.
Note that you can also load multiple species using either a vector of species names or a phylogeny (provided you have previously downloaded data for all of the species of interest), and you can load occurrences from non-GBIF data sources (e.g. BIEN) in the same query.
In addition to doing a simple, single-species search, you can also use occCite to search for and manage occurrence datasets for multiple species. You can either submit a vector of species names, or you can submit a phylogeny! The occCitation function will return a named list of citation tables in the case of multiple species.
Here is an example of how such a search is structured, using an unpublished phylogeny of billfishes.
library(ape)
# Get tree
treeFile <- system.file("extdata/Fish_12Tax_time_calibrated.tre", package='occCite')
phylogeny <- ape::read.nexus(treeFile)
tree <- ape::extract.clade(phylogeny, 22)
# Query databases for names
myPhyOccCiteObject <- studyTaxonList(x = tree,
                                     datasources = "GBIF Backbone Taxonomy")
# Query GBIF for occurrence data
myPhyOccCiteObject <- occQuery(x = myPhyOccCiteObject,
                               datasources = "gbif",
                               GBIFDownloadDirectory = system.file('extdata/', package='occCite'),
                               loadLocalGBIFDownload = TRUE,
                               checkPreviousGBIFDownload = FALSE)
# What does a multispecies query look like?
summary(myPhyOccCiteObject)
##
## OccCite query occurred on: 20 March, 2022
##
## User query type: User-supplied phylogeny.
##
## Sources for taxonomic rectification: GBIF Backbone Taxonomy
##
##
## Taxonomic cleaning results:
##
## Input Name Best Match
## 1 Tetrapturus_angustirostris Tetrapturus angustirostris Tanaka, 1915
## 2 Tetrapturus_belone Tetrapturus belone Rafinesque, 1810
## 3 Tetrapturus_pfluegeri Tetrapturus pfluegeri Robins & de Sylva, 1963
## Taxonomic Databases w/ Matches
## 1 GBIF Backbone Taxonomy
## 2 GBIF Backbone Taxonomy
## 3 GBIF Backbone Taxonomy
##
## Sources for occurrence data: gbif
##
## Species Occurrences Sources
## 1 Tetrapturus angustirostris Tanaka, 1915 649 23
## 2 Tetrapturus belone Rafinesque, 1810 9 6
## 3 Tetrapturus pfluegeri Robins & de Sylva, 1963 410 8
##
## GBIF dataset DOIs:
##
## Species GBIF Access Date
## 1 Tetrapturus angustirostris Tanaka, 1915 2019-07-04
## 2 Tetrapturus belone Rafinesque, 1810 2019-07-04
## 3 Tetrapturus pfluegeri Robins & de Sylva, 1963 2019-07-04
## GBIF DOI
## 1 10.15468/dl.mumi5e
## 2 10.15468/dl.q2nxb1
## 3 10.15468/dl.qjidbs
When you have results for multiple species, as in this case, you can also plot the summary figures either for the whole search…
plot(myPhyOccCiteObject)
or you can plot the results by species!
plot(myPhyOccCiteObject, bySpecies = T, plotTypes = c("yearHistogram", "source"))
And then you can print out the citations, separated by species (or not, but in this example, they’re separate).
# Get citations
myPhyOccCitations <- occCitation(myPhyOccCiteObject)
## Warning in occCitation(myPhyOccCiteObject): GBIF connection unsuccessful
#Print citations as text with accession dates.
print(myPhyOccCitations, bySpecies = T)
## NULL
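In the run shown above, the GBIF connection failed, so there were no citations to print. When the lookup succeeds for several species, the text describes the result as a named list of citation tables. The following base R sketch shows one way such a structure could be handled per species, and how a NULL result might be guarded against. The list contents, the column name Citation, and the placeholder entries are all made up for illustration; they are not occCite's real output format.

```r
# Hypothetical stand-in for a multi-species citation result: a named list
# with one citation table (data frame) per species. The column name and
# entries are invented placeholders, not real occCite output.
mock_citations <- list(
  "Tetrapturus belone" = data.frame(
    Citation = c("Dataset A (placeholder)", "Dataset B (placeholder)"),
    stringsAsFactors = FALSE
  ),
  "Tetrapturus pfluegeri" = data.frame(
    Citation = "Dataset C (placeholder)",
    stringsAsFactors = FALSE
  )
)

# Count citation rows per species; the species names carry through.
citations_per_species <- vapply(mock_citations, nrow, integer(1))
print(citations_per_species)

# Guard against a failed lookup, represented here by NULL.
failed_result <- NULL
if (is.null(failed_result)) {
  message("No citations returned; check the GBIF connection and retry.")
}
```

Checking for NULL before printing or summarizing avoids silently carrying an empty citation result into later steps.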