getFileMetadata.Rd
Retrieve metadata for a file or the file itself from an ArtifactDB using its REST endpoints.
getFileMetadata(id, url, cache = NULL, follow.links = TRUE, user.agent = NULL)
id, string containing the ArtifactDB identifier for a file.
This is a concatenated identifier involving the project name, file path and version,
i.e., <project>:<path>@<version> (see Examples).
url, string containing the URL of the ArtifactDB REST endpoint.
cache, function to perform caching, see Details.
If NULL, no caching is performed.
follow.links, logical scalar indicating whether to follow links, if id is a link to another target resource.
If TRUE, metadata for the target resource is returned; otherwise, metadata for the link itself is returned.
user.agent, string containing the user agent, see authorizedVerb.
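To illustrate the <project>:<path>@<version> format, the sketch below assembles and splits an identifier using base R only. The helper names make_id and split_id are hypothetical and are not part of the package (packID is the package's own way of creating an id). The sketch also assumes that the project name contains no ":" and the version contains no "@".

```r
# Hypothetical helpers, for illustration only; not part of the package.
make_id <- function(project, path, version) {
    sprintf("%s:%s@%s", project, path, version)
}

split_id <- function(id) {
    # Project: everything before the first ':'.
    project <- sub(":.*$", "", id)
    # Version: everything after the last '@'.
    version <- sub("^.*@", "", id)
    # Path: whatever lies between the two delimiters.
    path <- substr(id, nchar(project) + 2L, nchar(id) - nchar(version) - 1L)
    list(project = project, path = path, version = version)
}

id <- make_id("test-public", "blah.txt", "base")
parts <- split_id(id)
str(parts)
```

Taking the project from the first ":" and the version from the last "@" lets the path itself contain either character, under the assumptions stated above.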
A list containing metadata for the specified file.
The contents will depend on the schema used by the ArtifactDB at url.
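Some field names in the returned list, such as $schema and _extra in the Examples output, are not syntactic R names, so they need [[ ]] or backticks. The sketch below uses a hand-built list that mirrors part of that output; it is a stand-in rather than a real response, and other schemas may use different fields.

```r
# Hand-built stand-in mirroring part of the Examples output; not a real response.
meta <- list(
    `$schema` = "generic_file/v1.json",
    path = "blah.txt",
    `_extra` = list(project_id = "test-public", version = "base")
)

# Syntactic names can be accessed directly with '$'.
file.path.field <- meta$path

# Non-syntactic names need [[ ]] with a string...
schema <- meta[["$schema"]]

# ... or backticks after '$'.
version <- meta$`_extra`$version
```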
The caching function should accept:
key, the endpoint URL used to acquire the requested resource.
This is used as a unique key for the requested metadata.
The caching mechanism should convert this into a suitable file path, e.g., via URLencode.
It can be assumed that any latest version aliases in the input id have already been resolved.
save, a function that accepts a single string containing a path on the local file system, generated from key.
It will perform the request to the ArtifactDB REST API, save the response to the specified path and return nothing.
The caching function should call save with a suitable key-derived path if key does not already exist in the cache.
The caching function itself should return the path used in save.
See biocCache for one possible implementation based on Bioconductor's BiocFileCache package.
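The contract described above can be exercised without a network connection. In the sketch below, fake.save is a hypothetical stand-in for the save function that getFileMetadata would supply, the key is a made-up endpoint URL, and the cache lives in a fresh temporary directory. None of these names come from the package.

```r
# Fresh cache directory, so the first lookup is guaranteed to be a miss.
cache.dir <- tempfile("cache-demo-")
dir.create(cache.dir)

# Caching function following the contract: accept 'key' and 'save',
# call 'save' only on a cache miss, and always return the path.
demo.cache <- function(key, save) {
    path <- file.path(cache.dir, URLencode(key, reserved = TRUE, repeated = TRUE))
    if (!file.exists(path)) {
        save(path)
    }
    path
}

# Hypothetical stand-in for 'save': counts its calls and writes a
# placeholder file instead of performing a request.
n.requests <- 0L
fake.save <- function(path) {
    n.requests <<- n.requests + 1L
    writeLines('{"path":"blah.txt"}', path)
    invisible(NULL)
}

# Made-up key, for illustration only.
key <- "https://example.org/files/test-public%3Ablah.txt%40base/metadata"
p1 <- demo.cache(key, fake.save) # miss: 'save' is called
p2 <- demo.cache(key, fake.save) # hit: 'save' is skipped
```

Encoding the key with reserved = TRUE turns "/" and ":" into percent escapes, so the whole key becomes a single file name inside the cache directory.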
packID, to create id from various pieces of information.
identityAvailable, to inject authorization information into the API request.
# No caching:
X <- getFileMetadata(example.id, url = example.url)
str(X)
#> List of 5
#> $ $schema : chr "generic_file/v1.json"
#> $ generic_file:List of 1
#> ..$ format: chr "text"
#> $ md5sum : chr "0eb827652a5c272e1c82002f1c972018"
#> $ path : chr "blah.txt"
#> $ _extra :List of 10
#> ..$ $schema : chr "generic_file/v1.json"
#> ..$ id : chr "test-public:blah.txt@base"
#> ..$ project_id : chr "test-public"
#> ..$ version : chr "base"
#> ..$ metapath : chr "blah.txt"
#> ..$ meta_indexed : chr "2022-10-12T19:23:40.530Z"
#> ..$ meta_uploaded: chr "2022-10-12T19:23:17.912Z"
#> ..$ uploaded : chr "2022-10-12T19:23:17.912Z"
#> ..$ uploader_name: chr "ArtifactDB-bot"
#> ..$ permissions :List of 5
#> .. ..$ scope : chr "project"
#> .. ..$ read_access : chr "public"
#> .. ..$ write_access: chr "owners"
#> .. ..$ owners : chr "ArtifactDB-bot"
#> .. ..$ viewers : list()
# Simple caching in the temporary directory:
tmp.cache <- file.path(tempdir(), "zircon-cache")
dir.create(tmp.cache)
#> Warning: '/tmp/RtmpqT1SXc/zircon-cache' already exists
cache.fun <- function(key, save) {
    path <- file.path(tmp.cache, URLencode(key, reserved=TRUE, repeated=TRUE))
    if (!file.exists(path)) {
        save(path)
    } else {
        cat("cache hit!\n")
    }
    path
}
X1 <- getFileMetadata(example.id, example.url, cache = cache.fun)
#> cache hit!
X2 <- getFileMetadata(example.id, example.url, cache = cache.fun) # just re-uses the cache
#> cache hit!