getFileMetadata.Rd
Retrieve metadata for a file from an ArtifactDB using its REST endpoints.
getFileMetadata(id, url, cache = NULL, follow.links = TRUE, user.agent = NULL)
id
String containing the ArtifactDB identifier for a file. This is a concatenated identifier involving the project name, file path and version, i.e., <project>:<path>@<version> (see Examples).
url
String containing the URL of the ArtifactDB REST endpoint.
cache
Function to perform caching, see Details. If NULL, no caching is performed.
follow.links
Logical scalar indicating whether to follow links, if id is a link to another target resource. If TRUE, metadata for the target resource is returned; otherwise, metadata for the link itself is returned.
user.agent
String containing the user agent, see authorizedVerb.
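The <project>:<path>@<version> layout of id can be illustrated with a minimal base R sketch. The helpers make_id and split_id below are hypothetical and not part of the package (packID is the package's own way to build an identifier); the values are taken from the Examples output.

```r
# Hypothetical helper (not part of the package): assemble an identifier
# in the <project>:<path>@<version> layout described above.
make_id <- function(project, path, version) {
    sprintf("%s:%s@%s", project, path, version)
}

# Hypothetical helper: split an identifier back into its three pieces.
# Assumes the project name contains no ':' and the version contains no '@'.
split_id <- function(id) {
    list(
        project = sub(":.*$", "", id),
        path = sub("^[^:]*:(.*)@[^@]*$", "\\1", id),
        version = sub("^.*@", "", id)
    )
}

id <- make_id("test-public", "blah.txt", "base")
id
#> [1] "test-public:blah.txt@base"

parts <- split_id(id)
```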
A list containing metadata for the specified file. The contents will depend on the schema used by the ArtifactDB at url.
The caching function should accept:
key, the endpoint URL used to acquire the requested resource. This is used as a unique key for the requested metadata. The caching mechanism should convert this into a suitable file path, e.g., via URLencode. It can be assumed that any latest version aliases in the input id have already been resolved.
save, a function that accepts a single string containing a path on the local file system, generated from key. It will perform the request to the ArtifactDB REST API, save the response to the specified path and return nothing.
The caching function should call save with a suitable key-derived path if key does not already exist in the cache. The caching function itself should return the path used in save.
See biocCache for one possible implementation based on Bioconductor's BiocFileCache package.
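The contract above can be exercised without any network access. The following is only a sketch: simple.cache is one possible caching function, mock.save is a hypothetical stand-in for the save function that getFileMetadata would supply, and the key URL and file contents are made up for illustration.

```r
# Fresh cache directory for this sketch.
cache.dir <- tempfile("cache-sketch-")
dir.create(cache.dir)

# One possible caching function: derive a file name from the key,
# call save() only on a cache miss, and always return the path.
simple.cache <- function(key, save) {
    path <- file.path(cache.dir, URLencode(key, reserved = TRUE, repeated = TRUE))
    if (!file.exists(path)) {
        save(path)
    }
    path
}

# Hypothetical stand-in for 'save': counts its calls and writes
# placeholder content instead of performing a real API request.
n.calls <- 0
mock.save <- function(path) {
    n.calls <<- n.calls + 1
    writeLines("placeholder response", path)
    invisible(NULL)
}

# Made-up key; the real key is whatever endpoint URL is passed in.
key <- "https://example.com/files/some-id/metadata"
p1 <- simple.cache(key, mock.save) # cache miss: mock.save is called
p2 <- simple.cache(key, mock.save) # cache hit: mock.save is skipped
```

After both calls, p1 and p2 are the same path and n.calls is 1, showing that the second call is served from the cache.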
packID, to create id from various pieces of information.
identityAvailable, to inject authorization information into the API request.
# No caching:
X <- getFileMetadata(example.id, url = example.url)
str(X)
#> List of 5
#> $ $schema : chr "generic_file/v1.json"
#> $ generic_file:List of 1
#> ..$ format: chr "text"
#> $ md5sum : chr "0eb827652a5c272e1c82002f1c972018"
#> $ path : chr "blah.txt"
#> $ _extra :List of 10
#> ..$ $schema : chr "generic_file/v1.json"
#> ..$ id : chr "test-public:blah.txt@base"
#> ..$ project_id : chr "test-public"
#> ..$ version : chr "base"
#> ..$ metapath : chr "blah.txt"
#> ..$ meta_indexed : chr "2022-10-12T19:23:40.530Z"
#> ..$ meta_uploaded: chr "2022-10-12T19:23:17.912Z"
#> ..$ uploaded : chr "2022-10-12T19:23:17.912Z"
#> ..$ uploader_name: chr "ArtifactDB-bot"
#> ..$ permissions :List of 5
#> .. ..$ scope : chr "project"
#> .. ..$ read_access : chr "public"
#> .. ..$ write_access: chr "owners"
#> .. ..$ owners : chr "ArtifactDB-bot"
#> .. ..$ viewers : list()
# Simple caching in the temporary directory:
tmp.cache <- file.path(tempdir(), "zircon-cache")
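dir.create(tmp.cache, showWarnings = FALSE) # do not warn if the directory already exists
cache.fun <- function(key, save) {
    # Encode the endpoint URL into a single file name inside the cache directory.
    path <- file.path(tmp.cache, URLencode(key, reserved=TRUE, repeated=TRUE))
    if (!file.exists(path)) {
        save(path) # cache miss: perform the request and write the response to 'path'
    } else {
        cat("cache hit!\n")
    }
    path
}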
X1 <- getFileMetadata(example.id, example.url, cache = cache.fun)
#> cache hit!
X2 <- getFileMetadata(example.id, example.url, cache = cache.fun) # just re-uses the cache
#> cache hit!