rnavigate package

Subpackages

Submodules

rnavigate.data_loading module

Parsing function for rnavigate.Sample data_keywords.

rnavigate.helper_functions module

Contains PlottingArgumentParser for plotting_functions.py and fit_data for retrieving aligned data objects

rnavigate.plotting_functions module

Contains all rnavigate convenience plotting functions

rnavigate.rnavigate module

The RNAvigate Sample object for parsing, assigning, and organizing data.

rnavigate.styles module

Contains global plot display settings.

Module contents

RNAvigate

RNA visualization and graphical analysis toolset

A Jupyter-compatible toolset for visually exploring RNA structure and chemical probing data.

class rnavigate.Sample(sample, inherit=None, keep_inherited_defaults=True, **data_keywords)

Bases: object

Loads and organizes RNA structural data for use with plotting functions.

The Sample class stores all of the relevant experimental and computational structural data for a single RNA experiment. Between samples, common data types should be given a common data keyword so they can be easily compared.

Parameters

samplestr: an arbitrary name. This will be used as a label in plot legends and titles to differentiate it from other samples
inheritSample or list of these, optional: Data keywords and associated data from other samples become data keywords and associated data of this sample. This does not make additional copies of the data: i.e. operations that make changes to inherited data change the original sample, and any other samples that inherited that data. This can be useful to save time and memory on operations and large data structures that are shared between samples.
keep_inherited_defaultsbool, default = True: whether to keep inherited default keywords
**kwargs: There are many built-in data keywords with different expectations and behaviors. See Loading data for more information.

Attributes

samplestr: the name of the sample
inputsdict: a dictionary of data keywords and their (user-defined) inputs
datadict: a dictionary of data keywords and their associated data
defaultsdict: a dictionary of data classes and their default data keywords

Example

>>> sample = rnavigate.Sample(
...     sample="My sample",
...     shapemap="path/to/shapemapper_profile.txt",
...     ss="path/to/structure.ct",
...     ringmap="path/to/ringmapper_rings.txt",
...     pdb="path/to/pdb.pdb",
...     arbitrary_keyword={
...         "sites": [10, 20, 30],
...         "name": "sites of interest",
...         "color": "red",
...     },
... )
>>> sample.print_data_keywords()
My sample data keywords:
  annotations:
    arbitrary_keyword (default)
  profiles:
    shapemap (default)
  structures:
    ss (default)
  interactions:
    ringmap (default)
  pdbs:
    pdb (default)
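
The sharing behavior described for the inherit parameter follows from ordinary Python reference semantics. The sketch below uses a hypothetical MiniSample stand-in (it is not the rnavigate.Sample class, and its names are assumptions made for illustration) to show that inherited data is stored by reference, so an in-place change made through one sample is visible from the other.

```python
class MiniSample:
    """Hypothetical stand-in for a sample; not the rnavigate.Sample class."""

    def __init__(self, name, inherit=None, **data):
        self.name = name
        self.data = {}
        if inherit is not None:
            # Store references to the parent's objects; nothing is copied.
            self.data.update(inherit.data)
        self.data.update(data)


parent = MiniSample("parent", profile=[0.1, 0.5, 0.9])
child = MiniSample("child", inherit=parent)

# Both samples point at the same list object.
shared = child.data["profile"] is parent.data["profile"]

# An in-place change made through the child is visible from the parent.
child.data["profile"][0] = 99.0
parent_first_value = parent.data["profile"][0]
```

This is why inheriting saves memory on large data structures, and also why operations on inherited data should be treated as changes to the original sample.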

filter_interactions(interactions, metric=None, cmap=None, normalization=None, values=None, **kwargs)

Sets coloring properties and applies filters to interactions data.

Parameters

interactionsrnavigate.data.Interactions or data keyword string: Interactions object to be filtered. If a string, value is replaced with self.get_data(interactions)
metricstr, optional: column of interactions data to be used as metric for coloring interactions. “Distance” will compute 3D distance in “pdb”, defaulting to 2’OH atom. “Distance_DMS” or “Distance_[atom id]” will use those atoms to compute distance.
cmapstr or list, optional: sets the interactions colormap, used to color interactions according to metric values.
normalizationstr, optional: “norm”: extreme values in colormap are given to the extreme values of interactions metric data; “bins”: data are colored according to which bin they fall in, values defines bins (list, length = 2 less than cmap); “min_max”: extreme values in cmap are given to values beyond minimum and maximum, defined by values
valuesvaries: behavior depends on normalization. “norm”: values are not needed; “bins”: list of floats containing the boundaries between bins (one fewer than the number of categories); “min_max”: list of floats containing the minimum and maximum
**kwargs: Other arguments are passed to interactions.filter()
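
The three normalization modes can be pictured as three ways of turning a metric value into a colormap position. The following is a minimal sketch under stated assumptions, not RNAvigate's implementation: the helper name normalize is hypothetical, “norm” and “min_max” are assumed to scale values onto the range 0 to 1, and “bins” is assumed to return the index of the bin a value falls in.

```python
from bisect import bisect_right


def normalize(metric_values, normalization, values=None):
    """Map metric values to colormap positions (hypothetical helper)."""
    if normalization == "norm":
        # Extremes of the data map to the extremes of the colormap.
        # Assumes the data contain at least two distinct values.
        low, high = min(metric_values), max(metric_values)
        return [(v - low) / (high - low) for v in metric_values]
    if normalization == "min_max":
        # Values beyond the given minimum and maximum are clipped.
        low, high = values
        clipped = [min(max(v, low), high) for v in metric_values]
        return [(v - low) / (high - low) for v in clipped]
    if normalization == "bins":
        # `values` holds the boundaries between bins, so there is
        # one more bin than there are boundaries.
        return [bisect_right(values, v) for v in metric_values]
    raise ValueError(f"unknown normalization: {normalization!r}")


distances = [2.0, 5.0, 12.0, 30.0]
by_norm = normalize(distances, "norm")
by_min_max = normalize(distances, "min_max", values=[0, 20])
by_bins = normalize(distances, "bins", values=[5, 10, 20])
```

With “min_max”, the value 30.0 lies beyond the maximum of 20 and receives the same position as 20 would. With “bins”, three boundaries produce four categories.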

get_data(data_keyword, data_class=None)

Replaces data keyword with data object, even if nested.

Parameters

data_keywordrnavigate.data.Data or data keyword or list/dict of these: If None, returns None. If a data keyword, returns associated data from sample If Data, returns that data. If a list or dictionary, returns list or dictionary with data keyword values replaced with associated Data
data_classrnavigate.data.Data class or subclass, optional: If provided, ensures that returned data is of this type.

Returns

Same type as data_keyword argument, with data keywords replaced by associated data.

Raises

ValueError: if data is not found in sample
ValueError: if the data retrieved is not of the specified data_class
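
The nested replacement that get_data describes can be sketched as a short recursive function. This is an illustration only: resolve and store are hypothetical names, every string is assumed to be a data keyword, and plain strings stand in for data objects.

```python
def resolve(data_keyword, store):
    """Replace data keywords with stored objects, even when nested.

    Hypothetical sketch of the behavior described above.
    """
    if data_keyword is None:
        return None
    if isinstance(data_keyword, list):
        return [resolve(item, store) for item in data_keyword]
    if isinstance(data_keyword, dict):
        return {key: resolve(value, store) for key, value in data_keyword.items()}
    if isinstance(data_keyword, str):
        if data_keyword not in store:
            raise ValueError(f"{data_keyword!r} not found in sample")
        return store[data_keyword]
    # Anything else is assumed to already be a data object.
    return data_keyword


store = {"shapemap": "<Profile object>", "ss": "<Structure object>"}
resolved = resolve({"profile": "shapemap", "structures": ["ss", None]}, store)

try:
    resolve("missing_keyword", store)
    missing_raises = False
except ValueError:
    missing_raises = True
```

The return value has the same shape as the input, with each keyword swapped for its data.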

inherit_data(inherit, keep_inherited_defaults, overwrite)

Retrieves and stores data and data keywords from other samples

Parameters

inheritSample or list of Samples: Other samples from which to inherit data and data keywords
keep_inherited_defaultsbool: Use default values from inherited samples
overwritebool: whether to overwrite any existing keywords with inherited keywords

print_data_keywords(return_dict=False)

Print a nicely formatted, organized list of data keywords.

Returns a dictionary of data keywords, organized by data type, if return_dict is True.

set_as_default(data_keyword, overwrite=True)

Set the given data keyword as the default for its data class

Its data class is determined automatically. Only one default exists per data class and per Sample object.

Parameters

data_keywordstr: The data keyword to set as the default
overwritebool, defaults to True: whether to overwrite a pre-existing default data keyword

set_data(data_keyword, inputs, overwrite=False)

Add data to Sample using the given data keyword and inputs

This method works similarly to the data keyword arguments used during Sample initialization:

my_sample = rnavigate.Sample(
    sample="name", data_keyword=inputs)

is equivalent to:

my_sample = rnavigate.Sample(
    sample="name")

my_sample.set_data(
    "data_keyword", inputs)

Parameters

data_keywordstr: a data keyword used to store and/or parse the inputs
inputsdict or rnavigate.data.Data: a dictionary used to create the data object or a data object itself
overwritebool, defaults to False: whether to overwrite a pre-existing data_keyword

rnavigate.plot_alignment(data1, data2, labels=None, plot_kwargs=None)

Plots the sequence alignment used to compare two sequences

Parameters

data1tuple (rnavigate Sample, data keyword string): a sample and data keyword to retrieve a sequence
data2tuple (rnavigate Sample, data keyword string): another sample and data keyword to retrieve a second sequence
labelslist of 2 strings, defaults to “sample.sample: data keyword” for each: Labels used for each sample
plot_kwargsdict, defaults to {}: passed to matplotlib.pyplot.subplots()

Returns

rnavigate.plots.Alignment: the Alignment plot object

rnavigate.plot_arcs(samples, sequence, structure=None, structure2=None, interactions=None, interactions2=None, profile=None, annotations=None, domains=None, labels=None, nt_ticks=(20, 5), profile_scale_factor=1, plot_error=False, annotation_mode='track', panels=None, seqbar=True, region='all', colorbars=True, title=True, plot_kwargs=None)

Plots interactions and/or base-pairs as arcs.

Parameters

sampleslist of rnavigate Samples

samples used to retrieve data

sequencedata keyword string, data object, or sequence string

All data are mapped to this sequence before plotting. If a data keyword string, data from the first sample will be used.

structuredata keyword string or data object, defaults to None

secondary structure to plot as arcs

structure2data keyword string or data object, defaults to None

Another secondary structure to compare with the first structure. Arcs will be colored depending on which structure they are in.

interactionsone of the formats below, defaults to None

format 1 (data or data keyword): Interactions to plot as arcs, no filtering performed
format 2 (dictionary): e.g. {“interactions”: format 1} additional filtering options can be added to the dictionary
format 3 (list of format 2 dictionaries): This format allows multiple filtering schemes to be applied; each will be plotted on a separate axis

interactions2one of the formats below, defaults to None

format 1 (data or data keyword): Interactions to plot as arcs, no filtering performed
format 2 (dictionary): e.g. {“interactions”: format 1} additional filtering options can be added to the dictionary

profiledata or data keyword, defaults to None

Profile from which values will be plotted

annotationslist of data keyword strings or data objects, defaults to []

Annotations used to highlight regions or sites of interest

domainsdata keyword string or data object, defaults to None

domains to label along x-axis

labelslist of strings, defaults to sample.sample for each sample

list containing Labels to be used in plot legends

nt_tickstuple of two integers, defaults to (20, 5)

The first integer is the gap between major tick marks; the second integer is the gap between minor tick marks.

profile_scale_factornumber, defaults to 1

Small profile values will be hard to see, and large profile values will overwhelm the plot. For example, use 1/10 to scale values down 10-fold, or use 10 to scale up 10-fold.

plot_errorbool, defaults to False

Whether to plot error bars, values are determined by profile.metric

annotation_mode{ “track” | “bars”}, default “track”

“track” will highlight annotations along the x-axis; “bars” will use a vertical transparent bar over the plot.

panelsdict, optional

A dictionary specifying whether plot elements are displayed on the “top” (above x-axis) or “bottom” (below x-axis). Only the values you wish to change from the default are needed. Defaults to {“interactions”: “bottom”, “interactions2”: “bottom”, “structure”: “top”, “profile”: “top”}.

seqbarbool, default True

whether to display the sequence along the x-axis

regionlist of 2 integers, defaults to [1, length of sequence]

start and end positions to plot. 1-indexed, inclusive.

colorbarsbool, default True

Whether to plot colorbars for all plot elements

titlebool, defaults to True

Whether to display titles for each axis

plot_kwargsdict, defaults to {}

Keyword-arguments passed to matplotlib.pyplot.subplots

Returns

rnavigate.plots.AP: the ArcPlot object
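
Two conventions in the parameters above are easy to get wrong: region is 1-indexed and inclusive, and panels only needs the entries that differ from the defaults. The sketch below illustrates both with hypothetical helper names (region_to_slice, merge_panels) and a made-up sequence; it shows the conventions, not how RNAvigate implements them.

```python
# Defaults as listed in the panels parameter description.
DEFAULT_PANELS = {
    "interactions": "bottom",
    "interactions2": "bottom",
    "structure": "top",
    "profile": "top",
}


def merge_panels(panels=None):
    """Overlay user choices on the defaults (hypothetical helper)."""
    return {**DEFAULT_PANELS, **(panels or {})}


def region_to_slice(region, sequence_length):
    """Convert a 1-indexed, inclusive region to a Python slice."""
    if region == "all":
        start, end = 1, sequence_length
    else:
        start, end = region
    # Python slices are 0-indexed and exclude the end position.
    return slice(start - 1, end)


sequence = "GGCUAACGUAGCC"  # made-up example sequence
subsequence = sequence[region_to_slice([3, 6], len(sequence))]
whole = sequence[region_to_slice("all", len(sequence))]
panels = merge_panels({"profile": "bottom"})
```

Here region [3, 6] selects nucleotides 3 through 6, and only the profile panel moves to the bottom.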

rnavigate.plot_arcs_compare(samples, sequence, structure=None, structure2=None, interactions=None, interactions2=None, profile=None, labels=None, profile_scale_factor=1, plot_error=False, region='all', colorbars=True, plot_kwargs=None)

Generates a single arc plot displaying combinations of secondary structures, per-nucleotide data, inter-nucleotide data, and sequence annotations. The first sample will be on top, the second on the bottom. Center shows how these sequences are being aligned. This view does not

Parameters

sampleslist of 2 rnavigate Samples

Samples used to retrieve data. This plotting function can only compare two samples at a time.

sequencedata keyword string, data object, or sequence string

All data are mapped to this sequence taken from their respective sample before plotting

structuredata keyword string or data object, defaults to None

secondary structure to plot as arcs

structure2data keyword string or data object, defaults to None

Another secondary structure to compare with the first structure. Arcs will be colored depending on which structure they are in.

interactionsone of the formats below, defaults to None

format 1 (data or data keyword): Interactions to plot as arcs, no filtering performed
format 2 (dictionary): e.g. {“interactions”: format 1} additional filtering options can be added to the dictionary
format 3 (list of format 2 dictionaries): This format allows multiple filtering schemes to be applied; each will be plotted on a separate axis

interactions2one of the formats below, defaults to None

format 1 (data or data keyword): Interactions to plot as arcs, no filtering performed
format 2 (dictionary): e.g. {“interactions”: format 1} additional filtering options can be added to the dictionary

profiledata keyword string or data object, defaults to None

Profile from which values will be plotted

labelslist of strings, defaults to sample.sample for each sample

list containing Labels to be used in plot legends

profile_scale_factornumber, defaults to 1

Small profile values will be hard to see, and large profile values will overwhelm the plot. For example, use 1/10 to scale values down 10-fold, or use 10 to scale up 10-fold.

plot_errorbool, defaults to False

Whether to plot error bars, values are determined by profile.metric

regionlist of 2 integers, defaults to [1, length of sequence]

start and end positions to plot. 1-indexed, inclusive.

colorbarsbool, defaults to True

Whether to plot color scales for all plot elements

plot_kwargsdict, defaults to {}

Keyword-arguments passed to matplotlib.pyplot.subplots

Returns

rnavigate.plots.AP plot: object containing matplotlib figure and axes with additional plotting and file saving methods

rnavigate.plot_circle(samples, sequence, structure=None, structure2=None, interactions=None, interactions2=None, annotations=None, profile=None, colors=None, nt_ticks=(20, 5), gap=30, labels=None, colorbars=True, plot_kwargs=None)

Creates a figure containing a circle plot for each sample given.

Data that can be plotted on circle plots includes annotations (highlights regions around the edge. Generates a multipanel secondary structure drawing with optional coloring by per-nucleotide data and display of inter-nucleotide data and/or sequence annotations. Each plot may display a unique sample and/or inter-nucleotide data filtering scheme.

Parameters

sampleslist of rnavigate Samples

samples used to retrieve data

sequencedata or data keyword

All data are mapped to this sequence before plotting

structuredata keyword string or data object, defaults to None

Structure used to plot base-pairs on circle plot

structure2data keyword str, data obj or list of either, defaults to None

Structures to compare with Structure. Each base-pair is colored by which structure contains it or how many structures contain it.

interactionsone of the formats below, defaults to None

format 1 (data or data keyword): Interactions to plot on circle plot, no filtering
format 2 (dictionary): e.g. {“interactions”: format 1} additional filtering options can be added to the dictionary
format 3 (list of format 2 dictionaries): This format allows multiple filtering schemes to be applied; each will be plotted on a separate axis

interactions2one of the formats below, defaults to None

format 1 (data or data keyword): Interactions to plot on circle plot, no filtering
format 2 (dictionary): e.g. {“interactions”: format 1} additional filtering options can be added to the dictionary

annotationslist of data keyword strings or data objects, defaults to []

Annotations used to highlight regions or sites of interest

profiledata keyword string or data object, defaults to None

Profile used for coloring if “profile” used in colors dictionary

labelslist of strings, defaults to sample.sample for each sample

list containing Labels to be used in plot legends

colorsdictionary, optional

A dictionary of element: value pairs that determines how colors will be applied to each plot element and whether that element is plotted. Only the elements you wish to change need to be included. Keys can be “sequence” (letter labels), “nucleotides” (circles behind letters), “structure” (lines connecting nucleotides), and “basepairs” (lines connecting base-paired nucleotides). Values can be: None (don’t plot), “sequence” (color by nucleotide identity), “position” (position in sequence), “annotations” (sequence annotations), “profile” (per-nucleotide data from profile argument), “structure” (base-pairing status), a single matplotlib color for all positions, or an array of one color per position which matches the structure length. “sequence” may also use “contrast”, which automatically chooses white or black for each letter to contrast with that “nucleotide” color. Defaults to {“sequence”: None, “nucleotides”: “sequence”, “structure”: “grey”}.

nt_tickstuple of two integers, defaults to (20, 5)

The first integer is the gap between major tick marks; the second integer is the gap between minor tick marks.

gapinteger, defaults to 30

Width of gap between 5’ and 3’ end in degrees

colorbarsbool, defaults to True

Whether to plot color scales for all plot elements

plot_kwargsdict, defaults to {}

Keyword-arguments passed to matplotlib.pyplot.subplots

Returns

rnavigate.plots.Circle: object containing matplotlib figure and axes with additional plotting and file saving methods

rnavigate.plot_disthist(samples, structure, interactions, bg_interactions=None, labels=None, same_axis=False, atom="O2'", rows=None, cols=None, plot_kwargs=None)

Calculates 3D distance of nucleotides in inter-nucleotide data and plots the distribution of these distances. Compares this to a “background” distribution consisting of either all pairwise distances in structure, or those defined by bg_interactions and bg_interactions_filter

Parameters

sampleslist of rnavigate Samples

Samples from which to retrieve data. There will be one panel for each sample unless same_axis is True.

structuredata keyword string or data object

secondary structure or 3D structure to calculate inter-nucleotide contact distance or 3D distance, respectively

interactionsone of the formats below

format 1 (data or data keyword): Interactions used to calculate distance histogram, no filtering
format 2 (dictionary): e.g. {“interactions”: format 1} additional filtering options can be added to the dictionary
format 3 (list of format 2 dictionaries): This format allows multiple filtering schemes to be applied; each will be plotted on a separate axis

bg_interactionsone of the formats below, defaults to None

format 1 (data or data keyword): Interactions to calculate background distance histogram, no filtering is performed
format 2 (dictionary): e.g. {“interactions”: format 1} additional filtering options can be added to the dictionary

if not provided, background distance histogram is calculated from all pairwise distances in structure

labelslist of strings, defaults to sample.sample for each sample

Labels to be used as titles. Must be the same length as the samples list.

atomstring or dictionary, defaults to “O2’”

The atoms between which distances are calculated. For DMS reactive atoms (N1 for A and G, N3 for U and C), use “DMS”. Use a dictionary to specify a different atom for each nucleotide, e.g. “DMS” == {“A”: “N1”, “G”: “N1”, “U”: “N3”, “C”: “N3”}.

rowsinteger, defaults to None (determined automatically)

number of rows of plots

colsinteger, defaults to None (determined automatically)

number of columns of plots

plot_kwargsdictionary, defaults to {}

Keyword-arguments passed to matplotlib.pyplot.subplots

Returns

rnavigate.plots.DistHist: object containing matplotlib figure and axes with additional plotting and file saving methods
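
The atom argument selects which atom of each nucleotide is used when measuring 3D distance. The sketch below shows one way that selection could work. The helper names and the coordinates are made up for illustration, and RNAvigate's own distance code may differ.

```python
import math

# The per-nucleotide mapping that "DMS" stands for, as given above.
DMS_ATOMS = {"A": "N1", "G": "N1", "U": "N3", "C": "N3"}


def atom_for(nucleotide, atom="O2'"):
    """Pick the atom name for one nucleotide (hypothetical helper)."""
    if atom == "DMS":
        atom = DMS_ATOMS
    if isinstance(atom, dict):
        return atom[nucleotide]
    return atom


def distance(coords, sequence, i, j, atom="O2'"):
    """Euclidean distance between positions i and j (1-indexed)."""
    a = coords[(i, atom_for(sequence[i - 1], atom))]
    b = coords[(j, atom_for(sequence[j - 1], atom))]
    return math.dist(a, b)


sequence = "AU"
# Made-up coordinates in angstroms, keyed by (position, atom name).
coords = {
    (1, "O2'"): (0.0, 0.0, 0.0),
    (2, "O2'"): (3.0, 4.0, 0.0),
    (1, "N1"): (0.0, 0.0, 0.0),
    (2, "N3"): (0.0, 0.0, 12.0),
}

default_distance = distance(coords, sequence, 1, 2)
dms_distance = distance(coords, sequence, 1, 2, atom="DMS")
```

The same pair of nucleotides yields different distances depending on the atoms chosen, which is why the atom should match the chemistry of the probing experiment.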

rnavigate.plot_heatmap(samples, sequence, structure=None, interactions=None, regions=None, labels=None, levels=None, interpolation='nearest', atom="O2'", plot_type='heatmap', weights=None, rows=None, cols=None, plot_kwargs=None)

Generates a multipanel plot displaying a heatmap of inter-nucleotide data (nucleotide resolution or 2D KDE) and/or contour map of pdb distances. Each plot may display a unique sample and/or filtering scheme.

Parameters

sampleslist of rnavigate Samples

samples used to retrieve data

sequencedata keyword string, data object, or sequence string

All data are mapped to this sequence before plotting

structuredata keyword string or data object, defaults to None

Secondary structure or 3D structure used to plot contour lines. Contour lines are drawn according to the levels argument.

interactionsone of the formats below, defaults to None

format 1 (data or data keyword): Interactions to plot as a heatmap, no filtering performed
format 2 (dictionary): e.g. {“interactions”: format 1} additional filtering options can be added to the dictionary
format 3 (list of format 2 dictionaries): This format allows multiple filtering schemes to be applied; each will be plotted on a separate axis

regionslist of lists of 4 integers, defaults to None (no boxes)

Each inner list defines two regions of the RNA that are interacting. A box will be drawn around this interaction on the heatmap. For example, [[10, 20, 50, 60], [35, 45, 70, 80]] draws 2 boxes: the first connects nucleotides 10-20 and 50-60, the second connects nucleotides 35-45 and 70-80.

labelslist of strings, defaults to sample.sample for each sample

Labels to be used as titles, must be same length as samples list

levelslist of floats, defaults to [5] contact distance or [20] 3D distance

contours are drawn separating nucleotides above and below these distances. If structure is a secondary structure, distance refers to contact distance. If structure is a 3D structure, distance refers to spatial distance in angstroms.

interpolationstring, defaults to “nearest”

One of matplotlib’s interpolations for heatmap (used with imshow). “nearest” works well for shorter RNAs (under 300 nt); “none” works well for longer RNAs (over 1200 nt).

atomstring or dictionary, defaults to “O2’”

The atoms between which distances are calculated. For DMS reactive atoms (N1 for A and G, N3 for U and C), use “DMS”. Use a dictionary to specify a different atom for each nucleotide, e.g. “DMS” == {“A”: “N1”, “G”: “N1”, “U”: “N3”, “C”: “N3”}.

plot_type“heatmap” or “kde”, defaults to “heatmap”

How to plot interactions data. “heatmap” will plot raw data, where each interaction is a pixel in a grid; “kde” will calculate a kernel density estimate and plot 5 levels.

weightsstring, defaults to None (no weights)

Weights to be used in kernel density estimation. Must be a column of interactions data.

rowsinteger, defaults to None (determined automatically)

number of rows of plots

colsinteger, defaults to None (determined automatically)

number of columns of plots

plot_kwargsdictionary, defaults to {}

Keyword-arguments passed to matplotlib.pyplot.subplots

Returns

rnavigate.plots.Heatmap: object containing matplotlib figure and axes with additional plotting and file saving methods

rnavigate.plot_linreg(samples, profile, sequence=None, structure=None, annotations=None, labels=None, kde=False, scale='linear', regression='pearson', colors='sequence', column=None, region='all', colorbars=True, plot_kwargs=None)

Performs linear regression analysis and generates scatter plots of all sample-to-sample profile vs. profile comparisons. Colors nucleotides by identity or base-pairing status.

Parameters

sampleslist of rnavigate Samples: samples used to retrieve data
profiledata keyword string or data object: per-nucleotide data on which to perform linear regression. All data are mapped to the sequence of the profile data from the first sample before plotting, unless sequence is supplied
sequencedata keyword str, data obj, or sequence str, defaults to None: a sequence to which all profiles are aligned. If a data keyword, uses data from the first sample
structuredata keyword string or data object, defaults to None: Structure used for coloring if colors argument is “structure”
annotationslist of data keyword strings or data objects, defaults to []: Annotations used for coloring if colors argument is “annotations”
labelslist of strings, defaults to sample.sample for each sample: list containing Labels to be used in plot legends
kdebool, defaults to False: whether to plot kde (density) instead of a scatter plot
scale“linear” or “log”, defaults to “linear”: “linear” performs regression on raw values and displays linear units; “log” performs regression on log10(values) and displays log10 units
regression“pearson” or “spearman”, defaults to “pearson”: “pearson” calculates Pearson R-squared (standard); “spearman” calculates Spearman R-squared (rank-order)
colorsstring or list of colors, defaults to “sequence”: Values can be: None (don’t plot), “sequence” (color by nucleotide identity), “position” (position in sequence), “annotations” (sequence annotations), “profile” (per-nucleotide data from profile argument), “structure” (base-pairing status), a single matplotlib color for all positions, or an array of one color per position which matches the structure length.
columnstring, defaults to profile.metric: column name of values from profile to use in regression
regionlist of 2 integers, defaults to [1, length of sequence]: start and end nucleotide positions to include. 1-indexed, inclusive
colorbarsbool, defaults to True: Whether to plot colorbars for scatter plot colors
plot_kwargsdict, defaults to {}: Keyword-arguments passed to matplotlib.pyplot.subplots

Returns

rnavigate.plots.LinReg: object containing matplotlib figure and axes with additional plotting and file saving methods
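
The effect of the scale argument on the regression statistic can be seen with a small worked example. The sketch below computes Pearson R-squared on raw values and on log10 values. The function names and data are hypothetical, and dropping non-positive values before taking logarithms is an assumption made here, not a documented RNAvigate behavior.

```python
import math


def pearson_r_squared(x, y):
    """Square of the Pearson correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov ** 2 / (var_x * var_y)


def regress(x, y, scale="linear"):
    """R-squared on raw values or on log10(values) (hypothetical helper)."""
    if scale == "log":
        # Assumption: pairs with non-positive values are dropped,
        # because log10 is undefined for them.
        pairs = [(math.log10(a), math.log10(b))
                 for a, b in zip(x, y) if a > 0 and b > 0]
        x, y = zip(*pairs)
    return pearson_r_squared(x, y)


# Made-up profiles related by a power law: y = x squared.
profile_1 = [1.0, 10.0, 100.0, 1000.0]
profile_2 = [v ** 2 for v in profile_1]

linear_r2 = regress(profile_1, profile_2, scale="linear")
log_r2 = regress(profile_1, profile_2, scale="log")
```

A power-law relationship is perfectly linear in log units, so the log-scale fit is better than the linear-scale fit for these data.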

rnavigate.plot_mol(samples, structure, profile=None, interactions=None, labels=None, style='cartoon', hide_cylinders=False, colors='grey', atom="O2'", rotation=None, orientation=None, get_orientation=False, title=True, colorbars=True, width=400, height=400, rows=None, cols=None, background_alpha=1, show=True)

Generates a multipanel interactive 3D molecular rendering of a PDB structure. Nucleotides may be colored by per-nucleotide data or custom color lists. Inter-nucleotide data may be displayed as cylinders connecting atoms or residues. Each plot may display a unique sample and/or filtering scheme.

Parameters

sampleslist of rnavigate Samples

samples used to retrieve data

structuredata keyword string or data object

3D structure to view as an interactive molecule. All data are mapped to this sequence before plotting.

profiledata keyword string or data object, defaults to None

Profile used to color nucleotides if colors=”profile”

interactionsone of the formats below, defaults to None

format 1 (data or data keyword): Interactions to plot on molecule, no filtering performed
format 2 (dictionary): e.g. {“interactions”: format 1} additional filtering options can be added to the dictionary
format 3 (list of format 2 dictionaries): This format allows multiple filtering schemes to be applied; each will be plotted on a separate axis

labelslist of strings, defaults to sample.sample for each sample

list containing Labels to be used in plot titles

style“cartoon”, “cross”, “line”, “sphere” or “stick”, defaults to “cartoon”

sets the py3Dmol style for drawing the molecule

hide_cylindersbool, defaults to False

whether to hide nucleotide cylinders (only shows backbone ribbon)

colorsstring or list of colors, defaults to “grey”

“sequence”: color by nucleotide identity; “position”: color by position in sequence; “annotations”: color by sequence annotations from annotations; “profile”: color by per-nucleotide data from profile; “structure”: color by base-pairing status; matplotlib color: all positions plotted in this color; array of colors: one per position, must match structure length

atomstring or dictionary, defaults to “O2’”

The atoms between which interactions are drawn. For DMS reactive atoms (N1 for A and G, N3 for U and C), use “DMS”. Use a dictionary to specify a different atom for each nucleotide, e.g. “DMS” == {“A”: “N1”, “G”: “N1”, “U”: “N3”, “C”: “N3”}.

rotationdictionary, defaults to {“x”: 0, “y”: 0, “z”: 0}

Axis-degrees pairs for setting the starting orientation of the molecule. Only the axes to be rotated are needed, e.g. {“x”: 180} flips the molecule on the x-axis.

orientationlist of 9 floats, defaults to None

Sets the precise starting orientation. See get_orientation for more details.

get_orientationbool, defaults to False

Allows getting the orientation for use with the orientation argument. All other arguments will be ignored, and a larger, single-panel view window is displayed with no title.
1. Adjust the molecule to the desired orientation.
2. Click on the molecule to display the orientation vector.
3. Copy this orientation vector (manually).
4. Provide this list of values to the orientation argument.

titlebool, defaults to True

whether to display the title

colorbarsbool, defaults to True

Whether to plot color scales for all plot elements

widthinteger, defaults to 400

width of view window in pixels

heightinteger, defaults to 400

height of view window in pixels

rowsinteger, defaults to None (set automatically)

the number of rows in the view window

colsinteger, defaults to None (set automatically)

the number of columns in the view window

background_alphafloat, defaults to 1 (completely opaque)

the opacity of the view window, must be between 0 and 1

showbool, defaults to True

whether to display the viewer object

Returns

rnavigate.plots.Mol: object containing py3Dmol viewer with additional plotting and file saving methods

rnavigate.plot_ntdist(samples, profile, labels=None, column=None, plot_kwargs=None)

Plots the distributions of values at A, U, C, and G.

Calculates the kernel density estimate (KDE) for each nucleobase and plots them on one axis per sample.

Parameters

sampleslist of rnavigate Samples: samples used to retrieve data
profiledata keyword string or data object: per-nucleotide data to plot per-nt-identity distributions
labelslist of strings, defaults to sample.sample for each sample: list containing Labels to be used in plot legends
columnstring, defaults to profile.metric: which column of data to use for KDE
plot_kwargsdict, defaults to {}: Keyword-arguments passed to matplotlib.pyplot.subplots

Returns

rnavigate.plots.NucleotideDistribution: object containing matplotlib figure and axes with additional plotting and file saving methods
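
The per-nucleobase distributions described above amount to grouping profile values by nucleotide identity and estimating a density for each group. The sketch below illustrates the idea with a hand-written Gaussian KDE. The fixed bandwidth, the helper names, and the data are assumptions made for illustration; RNAvigate's KDE settings are not specified here.

```python
import math
from collections import defaultdict


def group_by_nucleotide(sequence, values):
    """Collect per-nucleotide values under A, U, C, and G."""
    groups = defaultdict(list)
    for nucleotide, value in zip(sequence, values):
        groups[nucleotide].append(value)
    return dict(groups)


def gaussian_kde(samples, x, bandwidth=0.1):
    """Gaussian kernel density estimate at x (fixed, assumed bandwidth)."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm


sequence = "AUCGAU"  # made-up sequence
reactivities = [0.1, 0.8, 0.3, 0.2, 0.5, 0.6]  # made-up profile values

groups = group_by_nucleotide(sequence, reactivities)
# Density of the A distribution evaluated at a reactivity of 0.3.
density_a = gaussian_kde(groups["A"], 0.3)
# Density of the C distribution at its only observed value.
density_c = gaussian_kde(groups["C"], 0.3)
```

Evaluating each group's density over a grid of x values gives the four curves that would share one axis per sample.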

rnavigate.plot_options(samples)

Prints a list of plotting functions compatible with a sample or list of samples.

Some plotting functions require specific data classes to be loaded into the sample. For plotting multiple samples, data keywords that are not shared, or are shared, but are not of the same data class, are considered invalid.

Parameters

samplesrnavigate.Sample or list of rnavigate.Sample: samples to check for compatible plotting functions

rnavigate.plot_profile(samples, profile, sequence=None, annotations=None, domains=None, labels=None, nt_ticks=(20, 5), column=None, plot_error=True, annotations_mode='track', seqbar=True, region='all', colorbars=True, plot_kwargs=None)

Aligns reactivity profiles by sequence and plots them on separate axes.

Parameters

sampleslist of rnavigate Samples: samples used to retrieve data
profiledata keyword string or data object: Profile from which values will be plotted
sequencedata keyword str, data obj, or sequence str, defaults to profile: All data are mapped to this sequence before plotting. If a data keyword, data from the first sample will be used
annotationslist of data keyword strings or data objects, defaults to []: Annotations used to highlight regions or sites of interest
domainsdata keyword string or data object, defaults to None: domains to label along x-axis
labelslist of strings, defaults to sample.sample for each sample: list containing Labels to be used in plot legends
nt_tickstuple of two integers, defaults to (20, 5): the first integer is the gap between major tick marks; the second integer is the gap between minor tick marks
columnstring, defaults to profile.metric: column name of values from profile to plot
plot_errorbool, defaults to True: Whether to plot error bars, values are determined by profile.metric
annotations_mode“track” or “bars”, defaults to “track”: “track” will highlight annotations along the x-axis; “bars” will use a vertical transparent bar over the plot
seqbarbool, defaults to True: whether to display the sequence along the x-axis
regionlist of 2 integers, defaults to [1, length of sequence]: start and end positions to plot. 1-indexed, inclusive.
colorbarsbool, defaults to True: Whether to plot color scales for per-nucleotide data
plot_kwargsdictionary, defaults to {}: Keyword-arguments passed to matplotlib.pyplot.subplots

Returns

rnavigate.plots.Profile: the Profile plot object
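The region argument is 1-indexed and inclusive, which differs from Python's 0-indexed, end-exclusive slicing. The sketch below shows one way the conversion could work. The helper names resolve_region and slice_region are hypothetical and not part of RNAvigate; treating the signature default "all" as [1, length of sequence] follows the parameter description above.

```python
def resolve_region(region, sequence_length):
    """Turn "all" or a [start, end] pair into a validated (start, end) tuple."""
    if region == "all":
        return 1, sequence_length
    start, end = region
    if not 1 <= start <= end <= sequence_length:
        raise ValueError(f"region {region!r} is outside 1..{sequence_length}")
    return start, end


def slice_region(values, region="all"):
    """Select the values covered by a 1-indexed, inclusive region."""
    start, end = resolve_region(region, len(values))
    # 1-indexed inclusive [start, end] is the Python slice [start - 1:end]
    return values[start - 1:end]


print(slice_region("GGAUCC", [2, 4]))  # GAU
print(slice_region("GGAUCC"))          # GGAUCC
```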

rnavigate.plot_qc(samples, profile, labels=None)

Creates a multipanel quality control plot displaying mutations per molecule, read length distribution, and mutation rate distributions for modified and unmodified samples.

Parameters

sampleslist of rnavigate.Sample: samples to retrieve data from
profiledata keyword string or data object: SHAPE-MaP or similar data for plotting reactivity distributions. Must contain data from a ShapeMapper log file.
labelslist of str, defaults to sample.sample for each sample in samples: labels to be used on legends, must be same length as samples list

Returns

rnavigate.plots.QC: the quality control plot object

rnavigate.plot_roc(samples, structure, profile, labels=None, nts='AUCG', plot_kwargs=None)

Performs receiver operating characteristic (ROC) analysis, calculates the area under the ROC curve (AUC), and generates ROC plots to assess how well per-nucleotide data predict base-paired status. This is done for all positions together and for positions grouped by nucleotide, giving 5 plots: All, A, U, C, G.

Parameters

sampleslist of rnavigate Samples: samples used to retrieve data
structuredata keyword string or data object: secondary structure used to classify positions as paired or unpaired. Profile data for each sample are first aligned to this structure.
profiledata keyword string or data object: per-nucleotide data to perform ROC analysis
labelslist of strings, defaults to sample.sample for each sample: labels to be used in plot legends
ntsstring, defaults to “AUCG”: nucleotides for which to draw nucleotide-specific ROC plots
plot_kwargsdict, defaults to {}: Keyword-arguments passed to matplotlib.pyplot.subplots

Returns

rnavigate.plots.ROC: object containing matplotlib figure and axes with additional plotting and file saving methods
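For readers unfamiliar with ROC analysis, the calculation can be sketched in plain Python. This is a minimal illustration and not RNAvigate's implementation. It assumes that higher per-nucleotide values indicate unpaired positions, so "unpaired" is the positive class; the function names and example data are hypothetical.

```python
def roc_curve(scores, is_positive):
    """Return (fpr, tpr) lists from a threshold sweep, highest score first."""
    pairs = sorted(zip(scores, is_positive), key=lambda pair: -pair[0])
    n_pos = sum(1 for _, positive in pairs if positive)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative position")
    fpr, tpr = [0.0], [0.0]
    true_pos = false_pos = 0
    for i, (score, positive) in enumerate(pairs):
        if positive:
            true_pos += 1
        else:
            false_pos += 1
        # add a point only after the last member of a group of tied scores
        if i == len(pairs) - 1 or pairs[i + 1][0] != score:
            fpr.append(false_pos / n_neg)
            tpr.append(true_pos / n_pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2
               for i in range(1, len(fpr)))


def auc_by_nucleotide(sequence, scores, is_positive, nts="AUCG"):
    """AUC for all positions ("All") and for each nucleotide in `nts`."""
    results = {"All": auc(*roc_curve(scores, is_positive))}
    for nt in nts:
        idx = [i for i, letter in enumerate(sequence) if letter == nt]
        try:
            results[nt] = auc(*roc_curve([scores[i] for i in idx],
                                         [is_positive[i] for i in idx]))
        except ValueError:
            results[nt] = None  # this nucleotide lacks paired or unpaired positions
    return results


# Hypothetical data: reactivities and unpaired status for an 8-nt sequence.
sequence = "GAUCGAUC"
reactivity = [0.1, 0.9, 0.8, 0.2, 0.05, 1.2, 0.7, 0.1]
unpaired = [False, True, True, False, False, True, True, False]
print(auc_by_nucleotide(sequence, reactivity, unpaired))
```

An AUC of 1.0 means the values separate the two classes perfectly, while 0.5 means they do no better than chance.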

rnavigate.plot_shapemapper(sample, profile, label=None, panels=None)

Makes a standard ShapeMapper2 profile plot with 3 panels: normalized reactivities, mutation rates, and read depths.

Parameters

samplernavigate Sample: the sample from which profile data and the label will be retrieved
profiledata keyword string or data object: SHAPE-MaP or similar data for plotting profiles
labelstr, defaults to sample.sample: A label to use as the title of the figure
panelslist of str, defaults to [“profile”, “rates”, “depth”]: Which panels to include: options are “profile”, “rates”, and “depth”

Returns

rnavigate.plots.SM: the ShapeMapper2 plot object

rnavigate.plot_skyline(samples, profile, sequence=None, annotations=None, domains=None, labels=None, nt_ticks=(20, 5), columns=None, errors=None, annotations_mode='track', seqbar=True, region='all', plot_kwargs=None)

Plots multiple per-nucleotide datasets on a single axis.

Parameters

sampleslist of rnavigate Samples: samples used to retrieve data
profiledata keyword string or data object: Profile from which values will be plotted
sequencedata keyword str, data obj, or sequence str, defaults to profile: All data are mapped to this sequence before plotting. If a data keyword, data from the first sample will be used.
annotationslist of data keyword strings or data objects, defaults to []: Annotations used to highlight regions or sites of interest
domainsdata keyword string or data object, defaults to None: domains to label along x-axis
labelslist of str, defaults to sample.sample for each sample: labels to be used in plot legends
nt_tickstuple of two integers, defaults to (20, 5): the first integer is the gap between major tick marks; the second integer is the gap between minor tick marks
columnsstring or list of strings, defaults to profile.metric: column names of values from profile to plot
errorsstring or list of strings, defaults to None (no error bars): column names of error values for plotting error bars
annotations_mode“track” or “bars”, defaults to “track”: “track” will highlight annotations along the x-axis; “bars” will use a vertical transparent bar over the plot
seqbarbool, defaults to True: whether to display the sequence along the x-axis
regionlist of 2 integers, defaults to [1, length of sequence]: start and end positions to plot. 1-indexed, inclusive.
plot_kwargsdictionary, defaults to {}: Keyword-arguments passed to matplotlib.pyplot.subplots

Returns

rnavigate.plots.Skyline: the skyline plot object
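The nt_ticks gaps can be illustrated with a small sketch. It assumes that ticks fall on positions that are multiples of each gap, which is an assumption about tick placement rather than documented RNAvigate behavior. The helper name tick_positions is hypothetical.

```python
def tick_positions(region, nt_ticks=(20, 5)):
    """Return (major, minor) tick positions inside a 1-indexed, inclusive region.

    Assumes ticks are placed at multiples of each gap; positions that are also
    major ticks are left out of the minor list.
    """
    start, end = region
    major_gap, minor_gap = nt_ticks
    positions = range(start, end + 1)
    major = [p for p in positions if p % major_gap == 0]
    minor = [p for p in positions if p % minor_gap == 0 and p % major_gap != 0]
    return major, minor


major, minor = tick_positions((1, 50))
print(major)  # [20, 40]
print(minor)  # [5, 10, 15, 25, 30, 35, 45, 50]
```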

rnavigate.plot_ss(samples, structure, profile=None, annotations=None, interactions=None, interactions2=None, labels=None, colors=None, nt_ticks=None, bp_style='dotted', colorbars=True, plot_kwargs=None)

Generates a multipanel secondary structure drawing with optional coloring by per-nucleotide data and display of inter-nucleotide data and/or sequence annotations. Each plot may display a unique sample and/or inter-nucleotide data filtering scheme.

Parameters

sampleslist of rnavigate Samples

samples used to retrieve data

structuredata keyword string or data object

secondary structure to plot. All data are mapped to this sequence before plotting.

profiledata keyword string or data object, defaults to None

Profile used for coloring if “profile” is used in the colors dictionary

annotationslist of data keyword strings or data objects, defaults to []

Annotations used to highlight regions or sites of interest

interactionsone of the formats below, defaults to None

format 1 (data or data keyword): Interactions to plot on secondary structure, no filtering
format 2 (dictionary): e.g. {“interactions”: format 1}. Additional filtering options can be added to the dictionary.
format 3 (list of format 2 dictionaries): This format allows multiple filtering schemes to be applied; each will be plotted on a separate axis.

interactions2one of the formats below, defaults to None

format 1 (data or data keyword): Interactions to plot on secondary structure, no filtering
format 2 (dictionary): e.g. {“interactions”: format 1}. Additional filtering options can be added to the dictionary.

labelslist of strings, defaults to sample.sample for each sample

labels to be used in plot legends

colorsdictionary, optional

a dictionary of element: value pairs that determines how colors are applied to each plot element and whether that element is plotted. Only the elements you wish to change need to be included. Keys can be “sequence” (letter labels), “nucleotides” (circles behind letters), “structure” (lines connecting nucleotides), and “basepairs” (lines connecting base-paired nucleotides). Values can be: None (don’t plot), “sequence” (color by nucleotide identity), “position” (position in sequence), “annotations” (sequence annotations), “profile” (per-nucleotide data from profile argument), “structure” (base-pairing status), a single matplotlib color for all positions, or an array of one color per position which matches the structure length. “sequence” may also use “contrast”, which automatically chooses white or black for each letter to contrast with the “nucleotides” color. Defaults to {“sequence”: None, “nucleotides”: “sequence”, “structure”: “grey”, “basepairs”: “grey”}

nt_ticksinteger, defaults to None (no labels)

gap between major tick marks

bp_style“dotted”, “line”, or “conventional”, defaults to “dotted”

“dotted” plots basepairs as a dotted line; “line” plots basepairs as a solid line; “conventional” plots basepairs using Leontis-Westhof conventions for canonical and wobble pairs (“G-A” plotted as solid dot)

colorbarsbool, defaults to True

Whether to plot color scales for all plot elements

plot_kwargsdict, defaults to {}

Keyword-arguments passed to matplotlib.pyplot.subplots

Returns

rnavigate.plots.SS: object containing matplotlib figure and axes with additional plotting and file saving methods
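Two conventions in the parameters above lend themselves to a short sketch: colors only needs the elements you want to change, and interactions accepts three formats. The code below is an illustration under stated assumptions, not RNAvigate's implementation. The helper names resolve_colors and normalize_interactions are hypothetical, the default colors are copied from the description above, and "ringmap" and "pairmap" are example data keywords.

```python
DEFAULT_COLORS = {
    "sequence": None,
    "nucleotides": "sequence",
    "structure": "grey",
    "basepairs": "grey",
}


def resolve_colors(colors=None):
    """Overlay user-supplied element colors on the documented defaults."""
    resolved = dict(DEFAULT_COLORS)
    for element, value in (colors or {}).items():
        if element not in DEFAULT_COLORS:
            raise KeyError(f"unknown plot element: {element!r}")
        resolved[element] = value
    return resolved


def normalize_interactions(interactions):
    """Convert format 1, 2, or 3 into a list of format 2 dictionaries."""
    if interactions is None:
        return []
    if isinstance(interactions, list):   # format 3: one dictionary per axis
        return [dict(scheme) for scheme in interactions]
    if isinstance(interactions, dict):   # format 2: a single filtering scheme
        return [dict(interactions)]
    return [{"interactions": interactions}]  # format 1: data or data keyword


print(resolve_colors({"nucleotides": "profile", "sequence": "contrast"}))
print(normalize_interactions("ringmap"))
print(normalize_interactions([{"interactions": "ringmap"}, {"interactions": "pairmap"}]))
```

Normalizing to a single list form means the rest of a plotting routine only has to handle one case, with one dictionary per axis.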