Retrieve a file or its metadata

Retrieve metadata for a file or the file itself from an ArtifactDB using its REST endpoints.

getFileMetadata(id, url, cache = NULL, follow.links = TRUE, user.agent = NULL)

Arguments

id: String containing the ArtifactDB identifier for a file. This is a concatenated identifier involving the project name, file path and version, i.e., <project>:<path>@<version> (see Examples).
url: String containing the URL of the ArtifactDB REST endpoint.
cache: Function to perform caching, see Details. If NULL, no caching is performed.
follow.links: Logical scalar indicating whether to follow links, if id is a link to another target resource. If TRUE, metadata for the target resource is returned; otherwise, metadata for the link itself is returned.
user.agent: String containing the user agent, see authorizedVerb.
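
The identifier format described for id can be illustrated with base R alone. The following is a minimal sketch, not part of this function's interface: the component values are taken from the output shown in Examples, and the splitting step assumes that the project name contains no colon and the version contains no at-sign.

```r
# Components taken from the identifier shown in the Examples output.
project <- "test-public"
path <- "blah.txt"
version <- "base"

# Compose the identifier as <project>:<path>@<version>.
id <- paste0(project, ":", path, "@", version)

# Split it again. This assumes that the project name contains no ':' and
# the version contains no '@'; real identifiers may need stricter handling.
project2 <- sub(":.*$", "", id)
version2 <- sub("^.*@", "", id)
path2 <- sub("@[^@]*$", "", sub("^[^:]*:", "", id))
```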

Value

A list containing metadata for the specified file. The contents will depend on the schema used by the ArtifactDB at url.

Details

The caching function should accept:

key, the endpoint URL used to acquire the requested resource. This serves as a unique key for the requested metadata. The caching mechanism should convert it into a suitable file path, e.g., via URLencode. It can be assumed that any aliases for the latest version in the input id have already been resolved.
save, a function that accepts a single string containing a path on the local file system, derived from key. It performs the request to the ArtifactDB REST API, saves the response to the specified path and returns nothing. The caching function should call save with a suitable key-derived path if key does not already exist in the cache.

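The caching function itself should return the key-derived path, i.e., the path that was (or would have been) passed to save, regardless of whether the cache already contained the entry. See biocCache for one possible implementation based on Bioconductor's BiocFileCache package.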
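
The contract between the caching function and save can be exercised without a live ArtifactDB. The sketch below is illustrative only: sketch.cache, mock.save, the cache directory and the key URL are all hypothetical stand-ins, and mock.save writes a fixed string where the real save would perform a request.

```r
# Hypothetical cache directory; any writable location would do.
cache.dir <- file.path(tempdir(), "sketch-cache")
dir.create(cache.dir, showWarnings = FALSE)

sketch.cache <- function(key, save) {
    # Turn the endpoint URL into a single file name. repeated=TRUE forces
    # encoding even if 'key' already contains percent-escapes.
    fname <- URLencode(key, reserved = TRUE, repeated = TRUE)
    path <- file.path(cache.dir, fname)
    if (!file.exists(path)) {
        save(path)  # cache miss: 'save' populates the file at 'path'
    }
    path  # returned on both a hit and a miss
}

# Stand-in for the 'save' function that would normally be supplied by the
# caller; it counts its invocations and writes a fixed string to 'path'.
n.calls <- 0
mock.save <- function(path) {
    n.calls <<- n.calls + 1
    writeLines('{"path":"blah.txt"}', path)
    invisible(NULL)
}

# Made-up key; the real endpoint URL layout is not assumed here.
key <- "https://example.org/some/endpoint/test-public%3Ablah.txt%40base"
unlink(file.path(cache.dir, URLencode(key, reserved = TRUE, repeated = TRUE)))

p1 <- sketch.cache(key, mock.save)  # miss: mock.save is called
p2 <- sketch.cache(key, mock.save)  # hit: mock.save is not called again
```

Encoding the whole key into one file name keeps the cache flat, which avoids creating nested directories for each path component of the URL.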

Author

Aaron Lun

Examples

# No caching:
X <- getFileMetadata(example.id, url = example.url)
str(X)
#> List of 5
#>  $ $schema     : chr "generic_file/v1.json"
#>  $ generic_file:List of 1
#>   ..$ format: chr "text"
#>  $ md5sum      : chr "0eb827652a5c272e1c82002f1c972018"
#>  $ path        : chr "blah.txt"
#>  $ _extra      :List of 10
#>   ..$ $schema      : chr "generic_file/v1.json"
#>   ..$ id           : chr "test-public:blah.txt@base"
#>   ..$ project_id   : chr "test-public"
#>   ..$ version      : chr "base"
#>   ..$ metapath     : chr "blah.txt"
#>   ..$ meta_indexed : chr "2022-10-12T19:23:40.530Z"
#>   ..$ meta_uploaded: chr "2022-10-12T19:23:17.912Z"
#>   ..$ uploaded     : chr "2022-10-12T19:23:17.912Z"
#>   ..$ uploader_name: chr "ArtifactDB-bot"
#>   ..$ permissions  :List of 5
#>   .. ..$ scope       : chr "project"
#>   .. ..$ read_access : chr "public"
#>   .. ..$ write_access: chr "owners"
#>   .. ..$ owners      : chr "ArtifactDB-bot"
#>   .. ..$ viewers     : list()

# Simple caching in the temporary directory:
tmp.cache <- file.path(tempdir(), "zircon-cache")
dir.create(tmp.cache)
#> Warning: '/tmp/RtmpqT1SXc/zircon-cache' already exists
cache.fun <- function(key, save) {
    path <- file.path(tmp.cache, URLencode(key, reserved=TRUE, repeated=TRUE))
    if (!file.exists(path)) {
        save(path)
    } else {
        cat("cache hit!\n")
    }
    path
}
X1 <- getFileMetadata(example.id, example.url, cache = cache.fun)
#> cache hit!
X2 <- getFileMetadata(example.id, example.url, cache = cache.fun) # just re-uses the cache
#> cache hit!
