| Title: | An Interface to the 'Arctos' Database |
|---|---|
| Description: | Performs requests to the 'Arctos' API to download data. Provides a set of builder classes for performing complex requests, as well as a set of simple functions for automating many common requests and workflows. More information about 'Arctos' can be found in Cicero et al. (2024) <doi:10.1371/journal.pone.0296478> or on their website <https://arctosdb.org/>. |
| Authors: | Harlan R. Williams [aut, cre] (ORCID: <https://orcid.org/0009-0000-9958-121X>), Jocelyn P. Colella [aut] (ORCID: <https://orcid.org/0000-0003-2463-1029>), Vijay Barve [aut] (ORCID: <https://orcid.org/0000-0002-4852-2567>), Michelle S. Koo [aut] (ORCID: <https://orcid.org/0000-0003-0410-722X>), Marlon E. Cobos [aut] (ORCID: <https://orcid.org/0000-0002-2611-1767>) |
| Maintainer: | Harlan R. Williams <[email protected]> |
| License: | GPL-3 |
| Version: | 0.1.4 |
| Built: | 2026-05-17 09:29:11 UTC |
| Source: | https://github.com/hrhwilliams/arctosr |
The ArctosR package provides a set of functions to help users
perform requests to the Arctos API to download data. It provides a set of
builder classes for performing complex requests, as well as a set of simple
functions for automating many common requests and workflows.
Arctos is a collection management information system serving over 5 million records from natural and cultural history collections. Arctos integrates access to collections from disciplines such as anthropology, botany, entomology, ethnology, herpetology, geology, ichthyology, mammalogy, mineralogy, ornithology, paleontology, and parasitology, as well as archival and cultural collections. The Arctos database is accessible through a web interface at https://arctos.database.museum/. More information about Arctos can be found at https://arctosdb.org/about/ and in Cicero et al. (2024), https://doi.org/10.1371/journal.pone.0296478.
get_query_parameters, get_result_parameters,
get_record_count, get_records,
check_for_status, get_error_response,
get_last_response_url,
response_data, save_response_rds,
read_response_rds, save_response_csv,
expand_column
Maintainer: Harlan R. Williams [email protected]
Authors:
Marlon E. Cobos [email protected]
Jocelyn P. Colella [email protected]
Michelle S. Koo [email protected]
Vijay Barve [email protected]
CatalogRequestBuilder
ArctosR::RequestBuilder -> CatalogRequestBuilder
set_limit()
Sets the limit on how many records to initially request from Arctos.
CatalogRequestBuilder$set_limit(limit)
limit: (integer(1)).
set_query()
Sets the query parameters to use to search Arctos.
CatalogRequestBuilder$set_query(...)
query: (list).
set_filter()
Sets the result parameters to use to filter out results.
CatalogRequestBuilder$set_filter(...)
query: (list).
set_parts()
Set parts to query over.
CatalogRequestBuilder$set_parts(...)
parts: (list).
set_attributes()
Set attributes to query over.
CatalogRequestBuilder$set_attributes(...)
attributes: (list).
set_components()
Set components to query over.
CatalogRequestBuilder$set_components(...)
components: (list).
set_columns()
Sets the columns in the dataframe returned by Arctos.
CatalogRequestBuilder$set_columns(...)
cols: (list).
set_columns_list()
Sets the columns in the dataframe returned by Arctos, given as a single list.
CatalogRequestBuilder$set_columns_list(l)
cols: (list).
from_previous_response()
Sets the columns in the dataframe returned by Arctos.
CatalogRequestBuilder$from_previous_response(response)
response: a response object from a previous request.
build_request()
Send a request for data to Arctos with parameters specified by the other methods called on this class.
CatalogRequestBuilder$build_request()
clone()
The objects of this class are cloneable with this method.
CatalogRequestBuilder$clone(deep = FALSE)
deep: Whether to make a deep clone.
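The setter methods above are meant to be combined before a request is sent. The following is a minimal sketch of how that might look, not a documented recipe. It assumes that `Query` is exported with a no-argument `new()` constructor, that `Query$catalog_request()` returns a `CatalogRequestBuilder` attached to the query, and that `Query$perform()` sends the request that was built. The helper name `fetch_wolves_with_builder` is hypothetical.

```r
# Hypothetical helper: a sketch of the builder workflow under the assumptions
# stated above (Query$new() takes no arguments; catalog_request() returns a
# CatalogRequestBuilder tied to the query).
fetch_wolves_with_builder <- function(limit = 50L) {
  q <- ArctosR::Query$new()
  builder <- q$catalog_request()

  # Each setter is called separately, so the sketch does not rely on the
  # setters returning the builder for chaining.
  builder$set_query(scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm")
  builder$set_columns("guid", "parts", "partdetail")
  builder$set_limit(limit)

  # Send the request to Arctos, then hand back the query object.
  q$perform()
  q
}

# Only contact Arctos in an interactive session with the package installed.
if (interactive() && requireNamespace("ArctosR", quietly = TRUE)) {
  q <- fetch_wolves_with_builder(limit = 10L)
}
```

For most uses, `get_records` covers the same ground with less code; the builder route is likely only worth it when a request needs parts, attributes, or components set individually.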
Checks if a response failed as part of a query.
check_for_status(query)

| Argument | Description |
|---|---|
| query | A query object to check the return status of. |

TRUE if the query succeeded, FALSE otherwise.

library(ArctosR)

if (interactive()) {
  # query with an invalid column name 'paarts'
  query <- get_records(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
    columns = list("guid", "paarts", "partdetail")
  )

  check_for_status(query)
}
Expand all information contained in a JSON formatted column in a query object. Information is presented as nested data frames if needed.
expand_column(query, column_name)

| Argument | Description |
|---|---|
| query | The query object with a JSON formatted column to be expanded. |
| column_name | (character) The name of the column to be expanded. |

Nothing.

library(ArctosR)

if (interactive()) {
  # Request to download all available data
  query <- get_records(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
    columns = list("guid", "parts", "partdetail")
  )

  # The partdetail column is a JSON list of parts and their attributes
  # This will convert the column to dataframes:
  expand_column(query, "partdetail")
}
Builder for the case where a request is made with the context of a previous response.
ArctosR::RequestBuilder -> FromResponseRequestBuilder
new()
FromResponseRequestBuilder$new(response, records)
request_more()
Request at most count more records from this response's original query.
FromResponseRequestBuilder$request_more(count)
count: number of additional records to request.
FromResponseRequestBuilder
build_request()
Perform the request.
FromResponseRequestBuilder$build_request()
Request
clone()
The objects of this class are cloneable with this method.
FromResponseRequestBuilder$clone(deep = FALSE)
deep: Whether to make a deep clone.
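As a rough illustration of how `request_more()` might be used to extend an existing query, the sketch below assumes that `Query$from_response_request()` returns a `FromResponseRequestBuilder` bound to the query's latest response and that `Query$perform()` then sends the follow-up request. Neither assumption is confirmed by this manual, and the helper name `fetch_more` is hypothetical.

```r
# Hypothetical helper: ask Arctos for up to `count` additional records for a
# query that has already been performed. Assumes from_response_request()
# returns a FromResponseRequestBuilder for the query's last response.
fetch_more <- function(query, count = 100L) {
  builder <- query$from_response_request()
  builder$request_more(count)

  # Send the follow-up request; the query object accumulates the new records.
  query$perform()
  query
}

# Only contact Arctos in an interactive session with the package installed.
if (interactive() && requireNamespace("ArctosR", quietly = TRUE)) {
  query <- ArctosR::get_records(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
    columns = list("guid", "parts"), limit = 50
  )
  query <- fetch_more(query, count = 50L)
}
```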
For debugging purposes, returns the error string sent by Arctos if the last response in a query object had a status code other than HTTP 200.

get_error_response(query)

| Argument | Description |
|---|---|
| query | A query object to return the error string of. |

The error string of a failing response object, or "No error" if the query didn't fail.

library(ArctosR)

if (interactive()) {
  # query with an invalid column name 'paarts'
  query <- get_records(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
    columns = list("guid", "paarts", "partdetail")
  )

  get_error_response(query)
}
Returns the last URL used by a request in a query object.

get_last_response_url(query)

| Argument | Description |
|---|---|
| query | A query object to return the URL for. |

The URL of the last performed request in a query object.

library(ArctosR)

if (interactive()) {
  query <- get_records(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
    columns = list("guid", "parts", "partdetail")
  )

  get_last_response_url(query)
}
Request information about all valid query parameters for querying Arctos.
get_query_parameters()

Data frame listing valid query parameters and associated description and
category. The returned columns are: display, obj_name, category,
subcategory, description. All entries in obj_name are valid parameters
to pass to get_records as keys.

library(ArctosR)

if (interactive()) {
  q <- get_query_parameters()
}
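Because the returned table has `obj_name` and `description` columns, it can be searched with ordinary data frame operations. The sketch below shows one way to look up query keys by keyword. The helper `find_query_keys` is hypothetical, and the small `toy` table uses made-up descriptions so that the example runs offline; it is not real Arctos output.

```r
# Hypothetical helper: return the query keys (obj_name) whose description
# mentions a search term. Works on any data frame with the documented
# obj_name and description columns.
find_query_keys <- function(params, term) {
  hits <- grepl(term, params$description, ignore.case = TRUE)
  params$obj_name[hits]
}

# Offline illustration with made-up descriptions (not real Arctos output).
toy <- data.frame(
  obj_name = c("scientific_name", "guid_prefix"),
  description = c("Taxon name of the record", "Collection prefix of the GUID"),
  stringsAsFactors = FALSE
)
taxon_keys <- find_query_keys(toy, "taxon")

# Against the live table, only in an interactive session.
if (interactive() && requireNamespace("ArctosR", quietly = TRUE)) {
  params <- ArctosR::get_query_parameters()
  find_query_keys(params, "taxon")
}
```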
Request from Arctos the total number of records that match a specific query.
A list of possible query keys can be obtained from the output of
get_query_parameters.
get_record_count(..., api_key)

| Argument | Description |
|---|---|
| ... | Query parameters and their values to pass to Arctos to search. For example, scientific_name = "Canis lupus". |
| api_key | (character) The API key to use for this request. |

The number of records matching the given query, as an integer.

library(ArctosR)

if (interactive()) {
  count <- get_record_count(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm"
  )
}
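One plausible use of the record count is to check the size of a result set before deciding whether to download everything with `all_records = TRUE`. The sketch below illustrates that pattern. The helper `should_fetch_all` and its threshold of 5000 are hypothetical choices for illustration, not limits imposed by Arctos or ArctosR.

```r
# Hypothetical helper: decide whether a query is small enough to download in
# full. The default threshold is an arbitrary illustration.
should_fetch_all <- function(count, max_records = 5000L) {
  is.numeric(count) && length(count) == 1L && !is.na(count) &&
    count <= max_records
}

# Only contact Arctos in an interactive session with the package installed.
if (interactive() && requireNamespace("ArctosR", quietly = TRUE)) {
  n <- ArctosR::get_record_count(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm"
  )

  # Download every matching record only when the count is manageable.
  query <- ArctosR::get_records(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
    columns = list("guid", "parts"),
    all_records = should_fetch_all(n)
  )
}
```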
Make a request to Arctos to return data based on a query. The columns
(fields) returned are specified in the list defined in columns.
A list of possible query keys can be obtained from the output of
get_query_parameters.
get_records(
  ...,
  api_key = NULL,
  columns = NULL,
  limit = NULL,
  filter_by = NULL,
  all_records = FALSE
)

| Argument | Description |
|---|---|
| ... | Query parameters and their values to pass to Arctos to search. For example, scientific_name = "Canis lupus". |
| api_key | (character) The API key to use for this request. |
| columns | A list of columns to be returned in the table of records to be downloaded from Arctos. |
| limit | (numeric) The maximum number of records to download at once. Default is 100. |
| filter_by | An optional list of record attributes to filter results by. |
| all_records | (logical) If true, the request is performed multiple times to obtain data from Arctos until all records matching the query have been downloaded. |

A query object consisting of metadata for each request sent to Arctos to fulfill the user's query, and a data frame of records.

library(ArctosR)

if (interactive()) {
  # Request to download all available data
  query <- get_records(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
    columns = list("guid", "parts", "partdetail")
  )
}

if (interactive()) {
  # Request to download data about rodents examined for Orthohantavirus
  orthohantavirus_MSB <- get_records(
    guid_prefix = "MSB:Mamm", taxon_name = "Rodentia",
    filter_by = list("detected" = "Orthohantavirus")
  )
}
A cataloged item in Arctos can be related to any number of other
items by relationships defined in the code table ctid_references. This
function returns, in a data frame, all items related by any such
relationship in the table.

get_relationships(guid, api_key = NULL)

| Argument | Description |
|---|---|
| guid | The Arctos GUID of the item to query relationships over. |
| api_key | (character) The API key to use for this request. |

A data frame of all related items. This contains URLs

library(ArctosR)

if (interactive()) {
  r <- get_relationships("MSB:Mamm:140026")
}
Returns the first URL used by a completed query which can be shared. The API key is automatically stripped from the URL for security.
get_request_url(query)

| Argument | Description |
|---|---|
| query | A completed query returned from |

A URL as a string.

library(ArctosR)

if (interactive()) {
  q <- get_records(guid_prefix = "MSB:Mamm")
  url <- get_request_url(q)
}
Request information about all valid result columns to request from Arctos.
get_result_parameters()

Data frame listing valid result columns and associated
description and category. The returned columns are: display, obj_name,
query_cost, category, description, default_order. The names in
obj_name are passed to get_records in the columns
parameter as a list.

library(ArctosR)

if (interactive()) {
  r <- get_result_parameters()
}
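Since the names in `obj_name` are passed to `get_records` as a list, a small helper can turn a subset of this table into a `columns` argument. The sketch below is one way to do that. The helper `columns_for_category` is hypothetical, and the `toy` table uses made-up category labels so that the example runs offline; real category values should be read from the table returned by Arctos.

```r
# Hypothetical helper: collect the obj_name entries of one category into the
# list form that get_records expects for `columns`.
columns_for_category <- function(result_params, category) {
  # which() drops rows whose category is NA instead of propagating NA.
  keep <- which(result_params$category == category)
  as.list(result_params$obj_name[keep])
}

# Offline illustration with made-up category labels (not real Arctos output).
toy <- data.frame(
  obj_name = c("guid", "parts", "partdetail"),
  category = c("record", "part", "part"),
  stringsAsFactors = FALSE
)
cols <- columns_for_category(toy, "part")

# Against the live table, only in an interactive session.
if (interactive() && requireNamespace("ArctosR", quietly = TRUE)) {
  r <- ArctosR::get_result_parameters()
  unique(r$category)  # inspect the real category labels first
}
```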
Builder for a request for query parameter or result parameter documentation from Arctos. For a valid request, only one method to specify the type of request can be called.
ArctosR::RequestBuilder -> InfoRequestBuilder
build_request()
InfoRequestBuilder$build_request()
clone()
The objects of this class are cloneable with this method.
InfoRequestBuilder$clone(deep = FALSE)
deep: Whether to make a deep clone.
Metadata for a specific HTTP response from Arctos.
to_list()
Metadata$to_list()
clone()
The objects of this class are cloneable with this method.
Metadata$clone(deep = FALSE)
deep: Whether to make a deep clone.
The results of a user query. Able to accept multiple responses to increase the record count, or to add columns.
catalog_request()
Query$catalog_request()
from_response_request()
Query$from_response_request()
info_request()
Query$info_request()
perform()
Query$perform(api_key = NULL)
save_metadata_json()
Query$save_metadata_json(file_path)
save_records_csv()
Query$save_records_csv(file_path, expanded = FALSE)
expand_col()
Query$expand_col(column_name)
get_responses()
Query$get_responses()
clone()
The objects of this class are cloneable with this method.
Query$clone(deep = FALSE)
deep: Whether to make a deep clone.
Load in a query object saved to an RDS file.
read_response_rds(filename)

| Argument | Description |
|---|---|
| filename | (character) The name of the file to load in. |

A query object.

library(ArctosR)

if (interactive()) {
  # Request to download all available data
  query <- get_records(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
    columns = list("guid", "parts", "partdetail")
  )

  # Save the data in a .RDS file
  save_response_rds(query, "wolves.RDS")

  # Load the data from the .RDS just saved
  read_response_rds("wolves.RDS")
}
A (possibly nested) data frame of records returned by a static set of query and result parameters.
new()
Records$new(df, tbl)
append()
Records$append(other)
save_flat_csv()
Writes the data in the response object to a CSV file.
Records$save_flat_csv(file_path)
save_nested_csvs()
Records$save_nested_csvs(file_path)
expand_col()
Expand a column of nested JSON tables in the response to a list of dataframes.
Records$expand_col(column)
col: (string)
clone()
The objects of this class are cloneable with this method.
Records$clone(deep = FALSE)
deep: Whether to make a deep clone.
A generic Arctos request. Not intended to be used directly. See InfoRequestBuilder and CatalogRequestBuilder.
with_endpoint()
Request$with_endpoint(endpoint)
add_param()
Request$add_param(...)
add_params()
Request$add_params(l)
serialize()
Request$serialize()
perform()
Request$perform(api_key = NULL)
from_raw_response()
Request$from_raw_response(raw_response)
clone()
The objects of this class are cloneable with this method.
Request$clone(deep = FALSE)
deep: Whether to make a deep clone.
A builder for a generic Arctos request. Not to be used directly.
debug()
Turn on printing of debug information.
RequestBuilder$debug()
build_request()
RequestBuilder$build_request()
clone()
The objects of this class are cloneable with this method.
RequestBuilder$clone(deep = FALSE)
deep: Whether to make a deep clone.
Response returned from Arctos.
new()
Response$new(request, raw_response)
set_start_index()
Response$set_start_index(start)
was_success()
Response$was_success()
is_empty()
Response$is_empty()
has_json_content()
Response$has_json_content()
to_list()
Response$to_list()
to_records()
Response$to_records(start = 0)
clone()
The objects of this class are cloneable with this method.
Response$clone(deep = FALSE)
deep: Whether to make a deep clone.
Obtain the data frame with the records from a successful query.
response_data(query)

| Argument | Description |
|---|---|
| query | The query object to extract the data frame from. |

A data frame with the information requested in the query.

library(ArctosR)

if (interactive()) {
  # Request to download all available data
  query <- get_records(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
    columns = list("guid", "parts", "partdetail")
  )

  # Grab the dataframe of records from the query
  df <- response_data(query)
}
Save the records inside the query object as a CSV file, optionally alongside metadata relating to the requests made to download the data.
save_response_csv(query, filename, expanded = FALSE, with_metadata = TRUE)

| Argument | Description |
|---|---|
| query | The query object to be saved. |
| filename | (character) Name of the file to be saved. |
| expanded | (logical) Setting this option to TRUE will create a folder of CSVs representing hierarchical data. See details. |
| with_metadata | Whether to save the metadata of the response as a JSON file alongside the CSV or folder of CSVs. |

Some columns from Arctos are themselves tables. To represent the structure of the data accurately, these inner tables can be saved as separate CSVs, each named according to the record it belongs to.

Nothing.

library(ArctosR)

if (interactive()) {
  # Request to download all available data
  query <- get_records(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
    columns = list("guid", "parts", "partdetail")
  )

  # Save the response in a flat CSV with an additional metadata file in JSON
  save_response_csv(query, "msb-wolves.csv", with_metadata = TRUE)
}
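To make the idea of one CSV per inner table more concrete, the following base-R sketch writes each record's nested table to its own file. It is an illustration of the general technique only: the helper `save_nested_tables`, the file naming scheme, and the example records are all made up, and none of it reflects how ArctosR actually names or lays out the files it writes when `expanded = TRUE`.

```r
# Illustration only: a base-R sketch of saving inner tables as one CSV per
# record. The file naming here is an assumption, not ArctosR's actual layout.
save_nested_tables <- function(ids, tables, dir) {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  paths <- character(length(ids))
  for (i in seq_along(ids)) {
    # Replace characters such as ':' that are awkward in file names.
    safe_id <- gsub("[^A-Za-z0-9_-]", "_", ids[[i]])
    paths[i] <- file.path(dir, paste0(safe_id, ".csv"))
    utils::write.csv(tables[[i]], paths[i], row.names = FALSE)
  }
  paths
}

# Made-up records: two GUID-like ids, each with its own small parts table.
ids <- c("EX:Mamm:1", "EX:Mamm:2")
tables <- list(
  data.frame(part = c("skull", "skin")),
  data.frame(part = "tissue")
)
written <- save_nested_tables(ids, tables, file.path(tempdir(), "nested-demo"))
basename(written)
```

Keeping one file per record preserves the link between an inner table and its parent row, which a single flat CSV cannot do without repeating or serializing the nested values.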
Save the query object as an RDS file, which stores the entire state of the query and can be loaded at a later time.
save_response_rds(query, filename)

| Argument | Description |
|---|---|
| query | The query object to be saved. |
| filename | (character) Name of the file to be saved. |

Nothing.

library(ArctosR)

if (interactive()) {
  # Request to download all available data
  query <- get_records(
    scientific_name = "Canis lupus", guid_prefix = "MSB:Mamm",
    columns = list("guid", "parts", "partdetail")
  )

  # Save the data in a .RDS file
  save_response_rds(query, "wolves.RDS")
}