Python interface to the SewerRat API
Pretty much as it says on the tin: this package provides a Python client for the SewerRat API. It assumes that users of the sewerrat client and the SewerRat API itself have access to the same shared filesystem, as is typically the case on high-performance computing clusters in scientific institutions. To demonstrate, let’s spin up a mock SewerRat instance:
import sewerrat as sr
_, url = sr.start_sewerrat()
Let’s mock up a directory of metadata files:
import tempfile
import os
mydir = tempfile.mkdtemp()
with open(os.path.join(mydir, "metadata.json"), "w") as handle:
    handle.write('{ "first": "foo", "last": "bar" }')
os.mkdir(os.path.join(mydir, "diet"))
with open(os.path.join(mydir, "diet", "metadata.json"), "w") as handle:
    handle.write('{ "fish": "barramundi" }')
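As a variant on the mock-up above, the same layout can be generated with the standard json module, which avoids hand-written JSON strings and the quoting mistakes that come with them. This is only a sketch: write_metadata and demo_dir are hypothetical names introduced for illustration and are not part of sewerrat.

```python
import json
import os
import tempfile


def write_metadata(directory, metadata, name="metadata.json"):
    """Hypothetical helper: serialize `metadata` as JSON to `directory/name`."""
    os.makedirs(directory, exist_ok=True)  # create nested directories as needed
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        json.dump(metadata, handle)
    return path


# Same layout as the hand-written example, in a separate temporary directory.
demo_dir = tempfile.mkdtemp()
top_path = write_metadata(demo_dir, {"first": "foo", "last": "bar"})
diet_path = write_metadata(os.path.join(demo_dir, "diet"), {"fish": "barramundi"})

# Read one file back to confirm that it parses as valid JSON.
with open(diet_path) as handle:
    print(json.load(handle))
```

Because every file is produced by json.dump, each one is guaranteed to be syntactically valid JSON before the directory is registered.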
We can then register it via the register() function. Similarly, we can deregister this directory with deregister(mydir).
# Only indexing metadata files named 'metadata.json'.
sr.register(mydir, names=["metadata.json"], url=url)
To search the index, we use the query() function to perform free-text searches. This does not require filesystem access and can be done remotely.
sr.query(url, "foo")
sr.query(url, "bar*") # partial match to 'bar...'
sr.query(url, "bar* AND foo") # boolean operations
sr.query(url, "fish:bar*") # match in the 'fish' field
We can also search on the user, path components, and time of creation:
sr.query(url, user="LTLA") # created by myself
sr.query(url, path="diet/") # path has 'diet/' in it
import time
sr.query(url, after=time.time() - 3600) # created less than 1 hour ago
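The after argument in the call above is a Unix timestamp in seconds, so cutoffs other than "one hour ago" are easiest to build with the standard datetime module. The sketch below only computes the timestamps and does not contact a SewerRat instance; timestamp_before_now is a hypothetical helper name, not a sewerrat function.

```python
from datetime import datetime, timedelta, timezone


def timestamp_before_now(**delta):
    """Hypothetical helper: Unix timestamp for `timedelta(**delta)` before now."""
    return (datetime.now(timezone.utc) - timedelta(**delta)).timestamp()


one_hour_ago = timestamp_before_now(hours=1)  # same cutoff as time.time() - 3600
one_week_ago = timestamp_before_now(weeks=1)

# A fixed calendar date, interpreted in UTC to avoid local-timezone surprises.
start_of_2024 = datetime(2024, 1, 1, tzinfo=timezone.utc).timestamp()

# Any of these values could then be passed as the `after` argument,
# e.g. sr.query(url, after=one_week_ago)
```

Using timezone-aware datetimes keeps the cutoff independent of the timezone of the machine running the client.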
Once we find a file of interest from a registered directory, we can retrieve its metadata, or other files in the same directory, or the entire directory itself:
sr.retrieve_metadata(mydir + "/metadata.json", url)
sr.list_files(mydir, url)
sr.retrieve_file(mydir + "/diet/metadata.json", url)
sr.retrieve_directory(mydir, url)
Check out the API documentation for more details on each function. For the concepts underlying SewerRat itself, see the repository for a detailed explanation.