Retrieve metadata for a file or the file itself from an ArtifactDB using its REST endpoints.

getFileMetadata(id, url, cache = NULL, follow.links = TRUE, user.agent = NULL)

Arguments

id

String containing the ArtifactDB identifier for a file. This is a concatenated identifier involving the project name, file path and version, i.e., <project>:<path>@<version> (see Examples).

url

String containing the URL of the ArtifactDB REST endpoint.

cache

Function to perform caching; see Details. If NULL, no caching is performed.

follow.links

Logical scalar indicating whether to follow links, if id is a link to another target resource. If TRUE, metadata for the target resource is returned; otherwise, metadata for the link itself is returned.

user.agent

String containing the user agent; see authorizedVerb.
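The <project>:<path>@<version> layout of id can be illustrated with a minimal base R sketch. composeId and decomposeId are hypothetical helpers written for this illustration only, not package functions (packID is the package's own tool for building id). The splitting rule used here, project before the first colon and version after the last "@", is an assumption and may not match how a given ArtifactDB parses identifiers.

```r
# Hypothetical helper (not part of the package): join the three pieces
# in the <project>:<path>@<version> layout described for 'id'.
composeId <- function(project, path, version) {
    paste0(project, ":", path, "@", version)
}

# Hypothetical inverse. Assumes the project name ends at the first ':'
# and the version starts after the last '@'.
decomposeId <- function(id) {
    colon <- as.integer(regexpr(":", id, fixed = TRUE))
    at <- max(gregexpr("@", id, fixed = TRUE)[[1]])
    list(
        project = substr(id, 1, colon - 1),
        path = substr(id, colon + 1, at - 1),
        version = substr(id, at + 1, nchar(id))
    )
}

# Uses the identifier that appears in the Examples output.
demo.id <- composeId("test-public", "blah.txt", "base")
demo.parts <- decomposeId(demo.id)
```

Here demo.id is "test-public:blah.txt@base", and demo.parts recovers the three components from it.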

Value

A list containing metadata for the specified file. The contents will depend on the schema used by the ArtifactDB at url.

Details

The caching function should accept:

  • key, the endpoint URL used to acquire the requested resource. This is used as a unique key for the requested metadata. The caching mechanism should convert this into a suitable file path, e.g., via URLencode. It can be assumed that any latest version aliases in the input id have already been resolved.

  • save, a function that accepts a single string containing a path on the local file system, generated from key. It will perform the request to the ArtifactDB REST API, save the response to the specified path and return nothing. The caching function should call save with a suitable key-derived path if key does not already exist in the cache.

The caching function itself should return the path used in save. See biocCache for one possible implementation based on Bioconductor's BiocFileCache package.
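The contract above can be exercised without a live ArtifactDB by passing a stand-in for save. The following is only a sketch under stated assumptions: mock.save, sketch.cache, the demo.key URL and the JSON payload are all invented for illustration, and the real save function supplied by getFileMetadata performs an actual REST request instead.

```r
# Start from an empty cache directory so that the first call is a miss.
cache.dir <- file.path(tempdir(), "sketch-cache")
unlink(cache.dir, recursive = TRUE)
dir.create(cache.dir)

# Stand-in for 'save' (hypothetical): writes a fake response to 'path',
# returns nothing, and counts how often a "request" would have been made.
n.requests <- 0L
mock.save <- function(path) {
    n.requests <<- n.requests + 1L
    writeLines('{"path":"blah.txt"}', path)
    invisible(NULL)
}

# Caching function following the documented contract: derive a file path
# from 'key', call 'save' only on a cache miss, and return the path.
sketch.cache <- function(key, save) {
    path <- file.path(cache.dir, URLencode(key, reserved = TRUE, repeated = TRUE))
    if (!file.exists(path)) {
        save(path)
    }
    path
}

# Hypothetical endpoint URL, used purely as a cache key.
demo.key <- "https://example.org/files/test-public:blah.txt@base/metadata"
first <- sketch.cache(demo.key, mock.save)  # miss: mock.save is called
second <- sketch.cache(demo.key, mock.save) # hit: mock.save is skipped
```

Both calls return the same path, and n.requests stays at 1 because the second call finds the file already in the cache.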

See also

packID, to create id from various pieces of information.

identityAvailable, to inject authorization information into the API request.

Author

Aaron Lun

Examples

# No caching:
X <- getFileMetadata(example.id, url = example.url)
str(X)
#> List of 5
#>  $ $schema     : chr "generic_file/v1.json"
#>  $ generic_file:List of 1
#>   ..$ format: chr "text"
#>  $ md5sum      : chr "0eb827652a5c272e1c82002f1c972018"
#>  $ path        : chr "blah.txt"
#>  $ _extra      :List of 10
#>   ..$ $schema      : chr "generic_file/v1.json"
#>   ..$ id           : chr "test-public:blah.txt@base"
#>   ..$ project_id   : chr "test-public"
#>   ..$ version      : chr "base"
#>   ..$ metapath     : chr "blah.txt"
#>   ..$ meta_indexed : chr "2022-10-12T19:23:40.530Z"
#>   ..$ meta_uploaded: chr "2022-10-12T19:23:17.912Z"
#>   ..$ uploaded     : chr "2022-10-12T19:23:17.912Z"
#>   ..$ uploader_name: chr "ArtifactDB-bot"
#>   ..$ permissions  :List of 5
#>   .. ..$ scope       : chr "project"
#>   .. ..$ read_access : chr "public"
#>   .. ..$ write_access: chr "owners"
#>   .. ..$ owners      : chr "ArtifactDB-bot"
#>   .. ..$ viewers     : list()

# Simple caching in the temporary directory:
tmp.cache <- file.path(tempdir(), "zircon-cache")
dir.create(tmp.cache, showWarnings = FALSE) # no warning if the directory already exists
cache.fun <- function(key, save) {
    path <- file.path(tmp.cache, URLencode(key, reserved=TRUE, repeated=TRUE))
    if (!file.exists(path)) {
        save(path)
    } else {
        cat("cache hit!\n")
    }
    path
}
X1 <- getFileMetadata(example.id, example.url, cache = cache.fun)
#> cache hit!
X2 <- getFileMetadata(example.id, example.url, cache = cache.fun) # just re-uses the cache
#> cache hit!