VCF Experiment

Type: object

A summarized experiment subclass containing variant calling data, typically corresponding to the VCF classes in the VariantAnnotation Bioconductor package. It is guaranteed to have assays whose names and types are listed in the INFO fields of the header file. Each row corresponds to a genomic position.

Derived from summarized_experiment/v1.json: a summarized experiment where each row corresponds to a feature and each column corresponds to an experimental sample. The layout of this data structure is based on Bioconductor's SummarizedExperiment class. This metadata document contains pointers to the various components of the summarized experiment, including the row data, column data and assays.

No Additional Properties

Type: string

The schema to use.

Type: boolean Default: false

Is this a child document, only to be interpreted in the context of the parent document from which it is linked? This may have implications for search and metadata requirements.

Type: string

Path to the file in the project directory.

Type: object
No Additional Properties

Type: array of object

An array of pointers to the assay data. Each entry corresponds to a single assay in the summarized experiment object.

Must contain a minimum of 1 item

No Additional Items

Each item of this array must be:

Type: object

Type: string

Name of the assay. Each assay must have a non-empty name. Assay names should not be duplicated within assays.

Must be at least 1 character long

Type: object

Type: string

Relative path of the resource from the root of the project directory.

Type: enum (of string)

Type of file. Local files should be present in the same project directory.

Must be one of:

  • "local"
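The constraints on a single assay entry (a non-empty name plus a resource object with a path and a type of "local") can be sketched as a small check. This is only an illustration: the key names `name`, `resource`, `path` and `type` are assumptions, since the text above does not show the property names, and `check_assay_entry` is a hypothetical helper rather than part of any schema tooling.

```python
def check_assay_entry(entry):
    """Return a list of problems found in one assay entry (empty if none).

    The key names used here ("name", "resource", "path", "type") are
    assumed for illustration only.
    """
    problems = []

    # The assay name must be a string of at least 1 character.
    name = entry.get("name")
    if not isinstance(name, str) or len(name) < 1:
        problems.append("name must be a non-empty string")

    # The resource must be an object with a string path and type "local".
    resource = entry.get("resource")
    if not isinstance(resource, dict):
        problems.append("resource must be an object")
        return problems
    if not isinstance(resource.get("path"), str):
        problems.append("resource.path must be a string")
    if resource.get("type") != "local":
        problems.append('resource.type must be "local"')

    return problems


# Hypothetical example entry; the path is made up.
example_entry = {
    "name": "GT",
    "resource": {"path": "experiment/assay-1/matrix.h5", "type": "local"},
}
example_problems = check_assay_entry(example_entry)
```

An empty list means the entry satisfies the constraints listed above. A full JSON schema validator would also enforce anything else the schema declares.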

Type: object

Pointer to the column data. This should be a data frame (as defined by the data_frame schema) where each row corresponds to a column of the summarized experiment and each column contains some annotation for the experimental samples. Omitted if no sample-level annotation is present.

Type: object

Type: string

Relative path of the resource from the root of the project directory.

Type: enum (of string)

Type of file. Local files should be present in the same project directory.

Must be one of:

  • "local"

Type: array of integer

Dimensions of a two-dimensional object.

Must contain a minimum of 2 items

Must contain a maximum of 2 items

No Additional Items

Each item of this array must be:

Type: integer
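As a minimal sketch, the dimensions constraint (an array of exactly two integers) could be checked like this. The function name `is_valid_dimensions` is hypothetical, and the check is slightly stricter than JSON Schema, which would also accept a number such as 3.0 as an integer.

```python
def is_valid_dimensions(dims):
    """Check that dims is a list of exactly two integers."""
    return (
        isinstance(dims, list)
        and len(dims) == 2
        # bool is a subclass of int in Python, so exclude it explicitly.
        and all(isinstance(d, int) and not isinstance(d, bool) for d in dims)
    )


ok = is_valid_dimensions([1000, 3])    # rows, columns
too_short = is_valid_dimensions([1000])
```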

Type: object

Pointer to the additional metadata for this object, typically stored as a list (via the basic_list schema). Omitted if no additional metadata is present.

Type: object

Type: string

Relative path of the resource from the root of the project directory.

Type: enum (of string)

Type of file. Local files should be present in the same project directory.

Must be one of:

  • "local"

Type: object

Pointer to the row data. This should be a data frame (as defined by the data_frame schema) where each row corresponds to a row of the summarized experiment and each column contains some annotation for the features. Omitted if no feature-level annotation is present.

Type: object

Type: string

Relative path of the resource from the root of the project directory.

Type: enum (of string)

Type of file. Local files should be present in the same project directory.

Must be one of:

  • "local"

Type: object

Pointer to the genomic coordinates corresponding to the rows. This should comply with the genomic_ranges or genomic_ranges_list schemas, where each range or group defines the genomic location of the feature corresponding to a row of the summarized experiment. Omitted if no genomic coordinates are present. This is based on Bioconductor's RangedSummarizedExperiment class.

Type: object

Type: string

Relative path of the resource from the root of the project directory.

Type: enum (of string)

Type of file. Local files should be present in the same project directory.

Must be one of:

  • "local"

Type: object
No Additional Properties

Type: boolean

Is this an expanded VCF? In an expanded VCF, genomic positions with multiple alternative alleles are expanded into multiple rows, one per alternative allele. In a collapsed VCF, each row strictly corresponds to a single genomic position and any multi-allelic information is embedded in the same row.
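The difference between the collapsed and expanded layouts can be illustrated with a small sketch. The record layout (dictionaries with `pos`, `ref` and `alt` keys) and the function `expand_records` are assumptions made for this example; they are not the on-disk format described by this schema.

```python
def expand_records(collapsed):
    """Turn collapsed records (one per position, 'alt' is a list of
    alleles) into expanded records (one per alternative allele)."""
    expanded = []
    for rec in collapsed:
        for allele in rec["alt"]:
            expanded.append(
                {"pos": rec["pos"], "ref": rec["ref"], "alt": allele}
            )
    return expanded


# Two positions; the second one is multi-allelic.
collapsed = [
    {"pos": 101, "ref": "A", "alt": ["G"]},
    {"pos": 205, "ref": "C", "alt": ["T", "G"]},
]
expanded = expand_records(collapsed)
# The collapsed form has 2 rows; the expanded form has 3 rows,
# with position 205 appearing once per alternative allele.
```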

Type: object

Pointer to a data frame with a number of rows equal to the number of genomic positions in the VCF experiment. This should contain a fixed set of columns:
- REF, a DNA string set or character field containing the reference allele for each row.
- ALT, a list containing one or more alternative alleles for each position when expanded is false, and a DNA string set or character field when expanded is true.
- QUAL, an integer field with the quality score for the alternative allele calls.
- FILTER, a character field indicating whether a row passes filter (PASS) or fails for some reason as listed in the FILTER tags of the header.
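A minimal sketch of checking that a data frame carries the fixed columns listed above. It assumes the column names have already been read into a plain list, and the helper `missing_fixed_columns` is hypothetical.

```python
# Column names taken from the list above.
REQUIRED_FIXED_COLUMNS = ("REF", "ALT", "QUAL", "FILTER")


def missing_fixed_columns(column_names):
    """Return the required columns that are absent, in the order
    REF, ALT, QUAL, FILTER."""
    present = set(column_names)
    return [c for c in REQUIRED_FIXED_COLUMNS if c not in present]


complete = missing_fixed_columns(["REF", "ALT", "QUAL", "FILTER"])
incomplete = missing_fixed_columns(["REF", "QUAL"])
```

This checks only the presence of the columns. The per-column types (for example, ALT being a list when expanded is false) would need a separate check.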

Type: object

Type: string

Relative path of the resource from the root of the project directory.

Type: enum (of string)

Type of file. Local files should be present in the same project directory.

Must be one of:

  • "local"

Type: object

Pointer to a VCF file with header information but no other data. This is only used to transmit the headers efficiently, given that the headers are highly heterogeneous and do not easily fit into other formats.

Type: object

Type: string

Relative path of the resource from the root of the project directory.

Type: enum (of string)

Type of file. Local files should be present in the same project directory.

Must be one of:

  • "local"

Type: object

Pointer to a data frame with additional information about each genomic position. Column names and types are as listed in the INFO tags of the header.

Type: object

Type: string

Relative path of the resource from the root of the project directory.

Type: enum (of string)

Type of file. Local files should be present in the same project directory.

Must be one of:

  • "local"