AlignmentsTrack class and methods

A class to represent short sequences that have been aligned to a reference genome as they are typically generated in next generation sequencing experiments.

# S4 method for class 'AlignmentsTrack'
initialize(
  .Object,
  stackRanges = GRanges(),
  stacks = numeric(),
  sequences = DNAStringSet(),
  referenceSequence = NULL,
  ...
)

# S4 method for class 'ReferenceAlignmentsTrack'
initialize(
  .Object,
  stream,
  reference,
  mapping = list(),
  args = list(),
  defaults = list(),
  stacks = numeric(),
  stackRanges = GRanges(),
  sequences = Biostrings::DNAStringSet(),
  referenceSequence = NULL,
  ...
)

AlignmentsTrack(
  range = NULL,
  start = NULL,
  end = NULL,
  width = NULL,
  strand,
  chromosome,
  genome,
  stacking = "squish",
  id,
  cigar,
  mapq,
  flag = scanBamFlag(isUnmappedQuery = FALSE),
  isize,
  groupid,
  status,
  md,
  seqs,
  name = "AlignmentsTrack",
  isPaired = TRUE,
  importFunction,
  referenceSequence,
  ...
)

# S4 method for class 'AlignmentsTrack'
values(x)

# S4 method for class 'AlignmentsTrack'
chromosome(GdObject) <- value

# S4 method for class 'AlignmentsTrack'
stacks(GdObject)

# S4 method for class 'AlignmentsTrack'
setStacks(GdObject, ...)

# S4 method for class 'AlignmentsTrack'
subset(x, from = NULL, to = NULL, stacks = FALSE, use.defaults = TRUE, ...)

# S4 method for class 'ReferenceAlignmentsTrack'
subset(x, from, to, chromosome, ...)

# S4 method for class 'AlignmentsTrack'
drawAxis(GdObject, ...)

# S4 method for class 'AlignmentsTrack'
drawGD(GdObject, minBase, maxBase, prepare = FALSE, subset = TRUE, ...)

# S4 method for class 'AlignmentsTrack'
show(object)

# S4 method for class 'ReferenceAlignmentsTrack'
show(object)

Arguments

.Object

.Object

stackRanges

stackRanges

stacks

stacks

sequences

sequences

referenceSequence

An optional SequenceTrack object containing the reference sequence against which the reads have been aligned. This is only needed when mismatch information has to be added to the plot (i.e., the showMismatchs display parameter is TRUE) because this is normally not encoded in the BAM file. If not provided through this argument, the plotTracks function is smart enough to detect the presence of a SequenceTrack object in the track list and will use that as a reference sequence.

...

Additional items which will all be interpreted as further display parameters. See settings and the "Display Parameters" section below for details.

stream

stream

reference

reference

mapping

mapping

args

args

defaults

defaults

range

An optional meta argument to handle the different input types. If the range argument is missing, all the relevant information to create the object has to be provided as individual function arguments (see below).

The different input options for range are:

A character string:: the path to a BAM file containing the read alignments. To be precise, this will result in the instantiation of a ReferenceAlignmentsTrack object, but for the user this implementation detail should be of no concern.
A GRanges object:: the genomic ranges of the individual reads as well as the optional additional metadata columns id, cigar, mapq, flag, isize, groupid, status, md and seqs (see description of the individual function parameters below for details). Calling the constructor on a GRanges object without further arguments, e.g. AlignmentsTrack(range=obj) is equivalent to calling the coerce method as(obj, "AlignmentsTrack").
An IRanges object:: almost identical to the GRanges case, except that the chromosome and strand information as well as all additional metadata has to be provided in the separate chromosome, strand, feature, group or id arguments, because it can not be directly encoded in an IRanges object. Note that none of those inputs are mandatory, and if not provided explicitely the more or less reasonable default values chromosome=NA and strand="*" are used.
A data.frame object:: the data.frame needs to contain at least the two mandatory columns start and end with the range coordinates. It may also contain a chromosome and a strand column with the chromosome and strand information for each range. If missing it will be drawn from the separate chromosome or strand arguments. In addition, the id, cigar, mapq, flag, isize, groupid, status, md and seqs data can be provided as additional columns. The above comments about potential default values also apply here.

start, end, width

Integer vectors, giving the start and the end coordinates for the individual track items, or their width. Two of the three need to be specified, and have to be of equal length or of length one, in which case this single value will be recycled. Otherwise, the usual R recycling rules for vectors do not apply here.

strand

Character vector, the strand information for the reads. It may be provided in the form + for the Watson strand, - for the Crick strand or * for either one of the two. Needs to be of equal length as the provided genomic coordinates, or of length 1. Please note that paired reads need to be on opposite strands, and erroneous entries will result in casting of an error.

chromosome

The chromosome on which the track's genomic ranges are defined. A valid UCSC chromosome identifier if options(ucscChromosomeNames=TRUE). Please note that in this case only syntactic checking takes place, i.e., the argument value needs to be an integer, numeric character or a character of the form chrx, where x may be any possible string. The user has to make sure that the respective chromosome is indeed defined for the the track's genome. If not provided here, the constructor will try to construct the chromosome information based on the available inputs, and as a last resort will fall back to the value chrNA. Please note that by definition all objects in the Gviz package can only have a single active chromosome at a time (although internally the information for more than one chromosome may be present), and the user has to call the chromosome<- replacement method in order to change to a different active chromosome.

genome

The genome on which the track's ranges are defined. Usually this is a valid UCSC genome identifier, however this is not being formally checked at this point. If not provided here the constructor will try to extract this information from the provided input, and eventually will fall back to the default value of NA.

stacking

The stacking type for overlapping items of the track. One in c(hide, dense, squish, pack, full). Currently, only squish (make best use of the available space), dense (no stacking, collapse overlapping ranges), and hide (do not show any track items at all) are implemented.

id

Character vector of read identifiers. Those identifiers have to be unique, i.e., each range representing a read needs to have a unique id.

cigar

A character vector of valid CIGAR strings describing details of the alignment. Typically those include alignment gaps or insertions and deletions, but also hard and soft clipped read regions. If missing, a fully mapped read without gaps or indels is assumed. Needs to be of equal length as the provided genomic coordinates, or of length 1.

mapq

A numeric vector of read mapping qualities. Needs to be of equal length as the provided genomic coordinates, or of length 1.

flag

A named integer vector of length 2, as is produced by Rsamtools::scanBamFlag(), used to filter out undesirable reads. If missing, all mapped reads will be included.

isize

A numeric vector of empirical insert sizes. This only applies if the reads are paired. Needs to be of equal length as the provided genomic coordinates, or of length 1. Currently not used.

groupid

A factor (or vector than can be coerced into one) defining the read pairs. Reads with the same groupid are considered to be mates. Please note that each read group may only have one or two members. Needs to be of equal length as the provided genomic coordinates, or of length 1.

status

A factor describing the mapping status of a read. Has to be one in mated, unmated or ambiguous. Needs to be of equal length as the provided genomic coordinates, or of length 1.

md

A character vector describing the mapping details. This is effectively and alternative to the CIGAR encoding and it removes the dependency on a reference sequence to figure out read mismatches. Needs to be of equal length as the provided genomic coordinates, or of length 1. Currently not used.

seqs

DNAStringSet of read sequences.

name

Character scalar of the track's name used in the title panel when plotting.

isPaired

A logical scalar to determine whether the reads are paired or not. While this may be used to render paired-end data as single-end, the oppsite will typically not have any effect because the appropriate groupid settings will not be present. Thus setting isPaired to TRUE can usually be used to autodetect the pairing state of the input data.

importFunction

A user-defined function to be used to import the data from a file. This only applies when the range argument is a character string with the path to the input data file. The function needs to accept an argument x containing the file path and a second argument selection with the desired plotting ranges. It has to return a proper GRanges object with all the necessary metadata columns set. A single default import function is already implemented in the package for BAM files.

x

GdObject

Object of GdObject-class.

value

value

from, to

from,to

use.defaults

logical

minBase, maxBase

minBase,maxBase

prepare

logical

subset

logical

object

object

Value

The return value of the constructor function is a new object of class AlignmentsTrack or ReferenceAlignmentsTrack.

Functions

initialize(AlignmentsTrack): Initialize.
ReferenceAlignmentsTrack-class: The file-based version of the AlignmentsTrack-class.
initialize(ReferenceAlignmentsTrack): Initialize.
AlignmentsTrack(): Constructor for AlignmentsTrack-class.
values(AlignmentsTrack): Return all additional annotation information except for the genomic coordinates for the track items as a data.frame.
chromosome(AlignmentsTrack) <- value: replace the value of the track's chromosome. This has to be a valid UCSC chromosome identifier or an integer or character scalar that can be reasonably coerced into one.
stacks(AlignmentsTrack): return the stack indices for each track item.
setStacks(AlignmentsTrack): recompute the stacks based on the available space and on the object's track items and stacking settings.
subset(AlignmentsTrack): Subset a AlignmentsTrack by coordinates and sort if necessary.
subset(ReferenceAlignmentsTrack): Subset a ReferenceAlignmentsTrack by coordinates and sort if necessary.
drawAxis(AlignmentsTrack): add a y-axis to the title panel of a track.
drawGD(AlignmentsTrack): plot the object to a graphics device. The return value of this method is the input object, potentially updated during the plotting operation. Internally, there are two modes in which the method can be called. Either in 'prepare' mode, in which case no plotting is done but the object is preprocessed based on the available space, or in 'plotting' mode, in which case the actual graphical output is created. Since subsetting of the object can be potentially costly, this can be switched off in case subsetting has already been performed before or is not necessary.
show(AlignmentsTrack): Show method.
show(ReferenceAlignmentsTrack): Show method.

Objects from the Class

Objects can be created using the constructor function AlignmentsTrack.

Author

Florian Hahne

Examples

## Creating objects
afrom <- 2960000
ato <- 3160000
alTrack <- AlignmentsTrack(system.file(
    package = "Gviz", "extdata",
    "gapped.bam"
), isPaired = TRUE)
plotTracks(alTrack, from = afrom, to = ato, chromosome = "chr12")


## Omit the coverage or the pile-ups part
plotTracks(alTrack,
    from = afrom, to = ato, chromosome = "chr12",
    type = "coverage"
)

plotTracks(alTrack,
    from = afrom, to = ato, chromosome = "chr12",
    type = "pileup"
)


## Including sequence information with the constructor
if (require(BSgenome.Hsapiens.UCSC.hg19)) {
    strack <- SequenceTrack(Hsapiens, chromosome = "chr21")
    afrom <- 44945200
    ato <- 44947200
    alTrack <- AlignmentsTrack(system.file(
        package = "Gviz", "extdata",
        "snps.bam"
    ), isPaired = TRUE, referenceSequence = strack)
    plotTracks(alTrack, chromosome = "chr21", from = afrom, to = ato)

    ## Including sequence information in the track list
    alTrack <- AlignmentsTrack(system.file(
        package = "Gviz", "extdata",
        "snps.bam"
    ), isPaired = TRUE)
    plotTracks(c(alTrack, strack),
        chromosome = "chr21", from = 44946590,
        to = 44946660
    )
}
#> Loading required package: BSgenome.Hsapiens.UCSC.hg19
#> Loading required package: BSgenome
#> Loading required package: Biostrings
#> Loading required package: XVector
#> 
#> Attaching package: 'Biostrings'
#> The following object is masked from 'package:grid':
#> 
#>     pattern
#> The following object is masked from 'package:base':
#> 
#>     strsplit
#> Loading required package: BiocIO
#> Loading required package: rtracklayer
#> 
#> Attaching package: 'rtracklayer'
#> The following object is masked from 'package:BiocIO':
#> 
#>     FileForFormat

Arguments

Value

Functions

Objects from the Class

See also

Author

Examples