sewerrat package

Submodules

sewerrat.deregister module

sewerrat.deregister.deregister(path, url, retry=3, wait=1)

Deregister a directory from the SewerRat search index. The directory is assumed either to be world-readable and writable by the caller, or to no longer exist.

Parameters:
  • path (str) – Path to the directory to be deregistered.

  • url (str) – URL to the SewerRat REST API.

  • retry (int) – Deprecated, ignored.

  • wait (float) – Deprecated, ignored.

sewerrat.list_files module

sewerrat.list_files.list_files(path, url, recursive=True, force_remote=False)

List the contents of a registered directory or a subdirectory thereof.

Parameters:
  • path (str) – Absolute path of the directory to list.

  • url (str) – URL to the SewerRat REST API. Only used for remote access.

  • recursive (bool) – Whether to list the contents recursively. If False, the contents of subdirectories are not listed, and the names of directories are suffixed with / in the returned list.

  • force_remote (bool) – Whether to force remote access via the API, even if path is on the same filesystem as the caller.

Return type:

List[str]

Returns:

List of strings containing the relative paths of files in path.
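
The trailing / convention for recursive=False can be used to tell subdirectories from files. The following is a minimal sketch: split_listing and list_top_level are hypothetical helper names, not part of sewerrat, and the lazy import assumes the package is installed and the service at url is reachable.

```python
def split_listing(entries):
    """Separate a non-recursive listing into (subdirectories, files).

    Relies on the documented behaviour that directory names carry a
    trailing "/" when recursive=False.
    """
    subdirs = [e.rstrip("/") for e in entries if e.endswith("/")]
    files = [e for e in entries if not e.endswith("/")]
    return subdirs, files


def list_top_level(path, url):
    """List only the immediate contents of a registered directory."""
    import sewerrat  # imported lazily so split_listing works without it

    entries = sewerrat.list_files(path, url, recursive=False)
    return split_listing(entries)
```

Calling list_top_level with the absolute path of a registered directory and the API URL would return the two lists without descending into subdirectories.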

sewerrat.list_registered_directories module

sewerrat.list_registered_directories.list_registered_directories(url, user=None, contains=None, prefix=None)

List all registered directories in the SewerRat instance.

Parameters:
  • url (str) – URL to the SewerRat REST API.

  • user (Union[str, bool, None]) – Name of a user. If not None, results are filtered to directories registered by this user. If True, the name of the current user is used.

  • contains (Optional[str]) – String containing an absolute path. If not None, results are filtered to directories that contain this path.

  • prefix (Optional[str]) – String containing an absolute path or a prefix thereof. If not None, results are filtered to directories starting with this string.

Return type:

List[Dict]

Returns:

List of objects where each object corresponds to a registered directory and contains the path to the directory, the user who registered it, the Unix epoch time of the registration, and the names of the metadata files to be indexed.
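
As an illustration, the sketch below lists the current user's registrations and formats each one. The dictionary keys "path", "user" and "time" are assumptions, since the description above names the fields but not their keys. describe_registration and my_directories are hypothetical helpers.

```python
from datetime import datetime, timezone


def describe_registration(entry):
    """Format one registration record as a single line of text.

    ASSUMPTION: the record exposes the keys "path", "user" and "time";
    the documentation lists these fields but not their exact key names.
    """
    when = datetime.fromtimestamp(entry["time"], tz=timezone.utc)
    return f"{entry['path']} (registered by {entry['user']} at {when.isoformat()})"


def my_directories(url, prefix=None):
    """Describe the directories registered by the current user."""
    import sewerrat  # imported lazily; needs the package and a live service

    # user=True asks the function to use the current user's name.
    found = sewerrat.list_registered_directories(url, user=True, prefix=prefix)
    return [describe_registration(d) for d in found]
```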

sewerrat.query module

sewerrat.query.query(url, text=None, user=None, path=None, after=None, before=None, number=100, on_truncation='message')

Query the metadata in the SewerRat backend based on free text, the owner, creation time, etc. This function does not require filesystem access.

Parameters:
  • url (str) – String containing the URL to the SewerRat REST API.

  • text (Optional[str]) – String containing a free-text query, following the syntax described here. If None, no filtering is applied based on the metadata text.

  • user (Optional[str]) – String containing the name of the user who generated the metadata. If None, no filtering is applied based on the user.

  • path (Optional[str]) – String containing any component of the path to the metadata file. If None, no filtering is applied based on the path.

  • after (Optional[int]) – Integer containing a Unix time in seconds, where only files newer than after will be retained. If None, no filtering is applied to remove old files.

  • before (Optional[int]) – Integer containing a Unix time in seconds, where only files older than before will be retained. If None, no filtering is applied to remove new files.

  • number (int) – Integer specifying the maximum number of results to return.

  • on_truncation (Literal['message', 'warning', 'none']) – String specifying the action to take when the number of search results is capped by number.

Return type:

List[Dict]

Returns:

List of dictionaries where each inner dictionary corresponds to a metadata file and contains:

  • path, a string containing the path to the file.

  • user, the identity of the file owner.

  • time, the Unix time of most recent file modification.

  • metadata, the loaded JSON contents of the file, typically a dictionary representing a JSON object.
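
The filters can be combined in a single call. The sketch below searches for free text among files modified within the last few days. recent_matches is a hypothetical helper, and its query argument exists only so the function can be exercised without a running service.

```python
import time


def recent_matches(url, text, days=7, now=None, query=None):
    """Free-text search restricted to files modified in the last `days` days."""
    if query is None:
        import sewerrat  # imported lazily; needs the package and a live service

        query = sewerrat.query

    now = time.time() if now is None else now
    # `after` expects a Unix time in whole seconds.
    cutoff = int(now) - days * 86400
    return query(url, text=text, after=cutoff, number=50, on_truncation="warning")
```

Each returned dictionary then exposes the path, user, time and metadata fields listed above.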

sewerrat.register module

sewerrat.register.register(path, names, url, retry=3, wait=1)

Register a directory into the SewerRat search index. It is assumed that the directory is world-readable and that the caller has write access. If a metadata file cannot be indexed (e.g., due to incorrect formatting or insufficient permissions), a warning will be printed but the function will not throw an error.

Parameters:
  • path (str) – Path to the directory to be registered.

  • names (Union[str, List[str]]) – List of strings containing the base names of metadata files inside path to be indexed. Alternatively, a single string containing the base name for a single metadata file.

  • url (str) – URL to the SewerRat REST API.

  • retry (int) – Deprecated, ignored.

  • wait (int) – Deprecated, ignored.
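
Registration is commonly paired with a later call to deregister. One possible pattern, shown here as a sketch, wraps both in a context manager so the directory is removed from the index even if the body raises. registered is a hypothetical helper, and its register and deregister arguments exist only to allow substitution in tests.

```python
import contextlib


@contextlib.contextmanager
def registered(path, names, url, register=None, deregister=None):
    """Register `path` on entry and deregister it on exit."""
    if register is None or deregister is None:
        import sewerrat  # imported lazily; needs the package and a live service

        register = register or sewerrat.register
        deregister = deregister or sewerrat.deregister

    register(path, names, url)
    try:
        yield path
    finally:
        # Runs on normal exit and on exceptions alike.
        deregister(path, url)
```

A caller would write `with registered(my_dir, ["metadata.json"], url): ...`, where my_dir, the file name and url are placeholders for the caller's own values.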

sewerrat.retrieve_directory module

sewerrat.retrieve_directory.retrieve_directory(path, url, cache=None, force_remote=False, overwrite=False, concurrent=1, update_delay=3600)

Obtain the path to a registered directory or one of its subdirectories. This may create a local copy of the directory’s contents if the caller is not on the same filesystem.

Parameters:
  • path (str) – Absolute path to a registered directory or one of its subdirectories.

  • url (str) – URL to the SewerRat REST API. Only used for remote queries.

  • cache (Optional[str]) – Path to a cache directory. If None, an appropriate location is automatically chosen. Only used for remote access.

  • force_remote (bool) – Whether to force remote access. This will download all files in the path via the REST API and cache them locally, even if path is present on the same filesystem.

  • overwrite (bool) – Whether to overwrite existing files in the cache.

  • concurrent (int) – Number of concurrent downloads.

  • update_delay (int) – Delay interval in seconds before checking for updates in a cached directory. Only used for remote access.

Return type:

str

Returns:

Path to the subdirectory on the caller’s filesystem. This is either path if it is accessible, or a path to a local cache of the directory’s contents otherwise.
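
Because the returned path is usable whether the directory is local or cached, downstream code can treat it like any other directory. The sketch below walks the returned location and collects its files. local_files is a hypothetical helper, and its retrieve argument exists only to allow substitution in tests.

```python
import os


def local_files(path, url, retrieve=None, **kwargs):
    """Return the local root for `path` and the sorted files beneath it."""
    if retrieve is None:
        import sewerrat  # imported lazily; needs the package and a live service

        retrieve = sewerrat.retrieve_directory

    # Extra keyword arguments (e.g. cache, force_remote) are passed through.
    root = retrieve(path, url, **kwargs)

    found = []
    for current, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(current, name)
            found.append(os.path.relpath(full, root))
    return root, sorted(found)
```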

sewerrat.retrieve_file module

sewerrat.retrieve_file.retrieve_file(path, url, cache=None, force_remote=False, overwrite=False)

Retrieve the path to a single file in a registered directory. This will call the REST API if the caller is not on the same filesystem.

Parameters:
  • path – Absolute path to a file in a registered directory or one of its subdirectories.

  • url – URL to the SewerRat REST API. Only used for remote queries.

  • cache (Optional[str]) – Path to a cache directory. If None, an appropriate location is automatically chosen. Only used for remote access.

  • force_remote (bool) – Whether to force remote access. This will download path via the REST API and cache it locally, even if path is present on the same filesystem.

  • overwrite (bool) – Whether to overwrite existing files in the cache.

Return type:

str

Returns:

Path to the file on the caller’s filesystem. This is either path if it is accessible, or a path to a local copy otherwise.
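
Since the result is an ordinary local path, the file can be opened directly. The sketch below loads a JSON file through retrieve_file. load_json_file is a hypothetical helper, and its retrieve argument exists only to allow substitution in tests.

```python
import json


def load_json_file(path, url, retrieve=None, **kwargs):
    """Resolve `path` to a local file and parse it as JSON."""
    if retrieve is None:
        import sewerrat  # imported lazily; needs the package and a live service

        retrieve = sewerrat.retrieve_file

    # Extra keyword arguments (e.g. cache, overwrite) are passed through.
    local = retrieve(path, url, **kwargs)
    with open(local, "r", encoding="utf-8") as handle:
        return json.load(handle)
```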

sewerrat.retrieve_metadata module

sewerrat.retrieve_metadata.retrieve_metadata(path, url)

Retrieve a single metadata entry in a registered directory from the SewerRat API.

Parameters:
  • path (str) – Absolute path to a metadata file in a registered directory.

  • url (str) – URL to the SewerRat REST API.

Returns:

Dictionary containing:

  • path, the path to the metadata file.

  • user, the identity of the owning user.

  • time, the Unix time at which the file was modified.

  • metadata, the loaded metadata, typically another dictionary representing a JSON object.
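
A short sketch of reading one field from the returned entry follows. metadata_field is a hypothetical helper. It falls back to a default when the loaded metadata is not a dictionary, since the documentation only says it is typically one. The retrieve argument exists only to allow substitution in tests.

```python
def metadata_field(path, url, field, default=None, retrieve=None):
    """Fetch one top-level field from a metadata file's JSON contents."""
    if retrieve is None:
        import sewerrat  # imported lazily; needs the package and a live service

        retrieve = sewerrat.retrieve_metadata

    entry = retrieve(path, url)
    contents = entry["metadata"]
    # The metadata is usually, but not necessarily, a JSON object.
    if isinstance(contents, dict):
        return contents.get(field, default)
    return default
```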

sewerrat.start_sewerrat module

sewerrat.start_sewerrat.start_sewerrat(db=None, port=None, wait=1, version='1.1.0', overwrite=False)

Start a test SewerRat service.

Parameters:
  • db (Optional[str]) – Path to a SQLite database. If None, one is automatically created.

  • port (Optional[int]) – An available port. If None, one is automatically chosen.

  • wait (float) – Number of seconds to wait for the service to initialize before use.

  • version (str) – Version of the service to run.

  • overwrite (bool) – Whether to overwrite the existing SewerRat binary.

Return type:

Tuple[bool, int]

Returns:

A tuple indicating whether a new test service was created (or an existing instance was re-used) and its URL. If a service is already running, this function is a no-op and the configuration details of the existing service will be returned.

sewerrat.start_sewerrat.stop_sewerrat()

Stop the SewerRat test service started by start_sewerrat(). If no test service was running, this function is a no-op.
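
In test code, the two functions can be paired so that the service is shut down afterwards. The sketch below does so with a context manager. sewerrat_service is a hypothetical helper that yields the returned tuple unchanged. It stops the service only when this call created it, which is a design choice of the sketch and not documented behaviour. The start and stop arguments exist only to allow substitution in tests.

```python
import contextlib


@contextlib.contextmanager
def sewerrat_service(start=None, stop=None, **kwargs):
    """Start a test service, yield its details, and stop it afterwards."""
    if start is None or stop is None:
        import sewerrat  # imported lazily; needs the package installed

        start = start or sewerrat.start_sewerrat
        stop = stop or sewerrat.stop_sewerrat

    # The first element reports whether a new service was created;
    # the second identifies the service, as described above.
    created, location = start(**kwargs)
    try:
        yield created, location
    finally:
        # Leave a pre-existing service running for its other users.
        if created:
            stop()
```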

Module contents