rnavigate.analysis package
Submodules
rnavigate.analysis.auroc module
Windowed AUROC assesses agreement between reactivities and base-pairing.
- class rnavigate.analysis.auroc.WindowedAUROC(sample, window=81, profile='default_profile', structure='default_structure')
Bases: object
Compute and display windowed AUROC analysis.
This analysis computes the ROC curve over a sliding window for the performance of per-nucleotide data (usually SHAPE-MaP or DMS-MaP Normalized reactivity) in predicting the base-pairing status of each nucleotide. The area under this curve (AUROC) is displayed compared to the median across the RNA. Below, an arc plot displays the secondary structure and per-nucleotide profile.
AUROC values are expected to range from 0.5 (no predictive power) to 1.0 (perfect predictive power). A value near 0.5 indicates that the reactivity profile does not fit the structure prediction well. Regions with low AUROC are good candidates for further investigation with ensemble deconvolution.
References
- Lan, T.C.T., Allan, M.F., Malsick, L.E. et al. Secondary structural ensembles of the SARS-CoV-2 RNA genome in infected cells. Nat Commun 13, 1128 (2022). https://doi.org/10.1038/s41467-022-28603-2
Methods
- __init__: Computes the AUROC array and AUROC median.
- plot_auroc: Displays the AUROC analysis over the given region. Returns a Plot object.
Attributes
- sample : rnavigate.Sample
sample to retrieve profile and secondary structure
- structure : str
Data keyword of sample pointing to secondary structure, e.g. sample.data[structure]
- profile : str
Data keyword of sample pointing to profile, e.g. sample.data[profile]
- sequence : the sequence string of sample.data[structure]
- window : the size of the windows
- nt_length : the length of the sequence string
- auroc : the AUROC numpy array, length = nt_length, padded with np.nan
- median_auroc : the median of the auroc array
- plot_auroc(region=None)
Plot the result of the windowed AUROC analysis, with arc plot of structure and reactivity profile.
- Args:
- region (list of int: length 2, optional): Start and end nucleotide positions to plot. Defaults to [1, RNA length].
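The windowed AUROC idea described for WindowedAUROC can be sketched with NumPy alone. This is an illustration, not RNAvigate's implementation: the function names auroc and windowed_auroc, the boolean is_unpaired input, and the choice of unpaired nucleotides as the positive class are all assumptions.

```python
import numpy as np


def auroc(scores, is_unpaired):
    """AUROC from pairwise comparisons (Mann-Whitney U); ties count as 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(is_unpaired, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        return np.nan  # AUROC is undefined when only one class is present
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))


def windowed_auroc(reactivity, is_unpaired, window=81):
    """AUROC over a centered sliding window; positions near the ends stay NaN."""
    reactivity = np.asarray(reactivity, dtype=float)
    is_unpaired = np.asarray(is_unpaired, dtype=bool)
    n = len(reactivity)
    half = window // 2
    result = np.full(n, np.nan)
    for center in range(half, n - half):
        lo, hi = center - half, center + half + 1
        r, u = reactivity[lo:hi], is_unpaired[lo:hi]
        valid = ~np.isnan(r)  # skip nucleotides without reactivity data
        result[center] = auroc(r[valid], u[valid])
    return result
```

Under these assumptions, np.nanmedian(windowed_auroc(...)) plays the role of the median that each window is compared against.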
rnavigate.analysis.check_sequence module
SequenceChecker analysis used to inspect sequence differences.
Given a list of samples, we can inspect which data keywords belong to the samples, which sequences match up perfectly, and inspect the differences between sequences.
- class rnavigate.analysis.check_sequence.SequenceChecker(samples)
Bases: object
Check the sequences stored in a list of samples.
Attributes
- samples : list
samples in which to check sequences
- sequences : list
all unique sequence strings stored in the list of samples. These are converted to an all-uppercase RNA alphabet.
- keywords : list
all unique data keywords stored in the list of samples.
- which_sequences : Pandas.DataFrame
each row is a sample, keyword, and index of self.sequences
- get_keywords()
A list of all unique data keywords across samples.
- get_sequences()
A list of all unique sequences (uppercase RNA) across samples.
- get_which_sequences()
A DataFrame of sequence IDs (integers) for each data keyword.
- print_alignments(print_format='long', which='all')
Print alignments in the given format for sequence IDs provided.
Parameters
- print_format : string, defaults to “long”
What format to print the alignments in:
“cigar” prints the CIGAR string
“short” prints the numbers of mismatches and indels
“long” prints the location and nucleotide identity of all mismatches, insertions and deletions.
- which : tuple of two integers, defaults to “all” (every pairwise comparison)
two sequence IDs to compare.
- print_mulitple_sequence_alignment(base_sequence)
Print the multiple sequence alignment with nice formatting.
Parameters
- base_sequence : string
a sequence string that represents the longest common sequence. Usually, this is the return value from rnav.data.set_multiple_sequence_alignment()
- print_which_sequences()
Print sequence ID (integer) for each data keyword and sample.
- reset()
Reset keywords and sequences from sample list in case of changes.
- write_fasta(filename, which='all')
Write all unique sequences to a fasta file.
This is useful with external multiple sequence aligners such as ClustalOmega:
- upload the new fasta file
- under STEP 2 output format, select Pearson/FASTA
- click ‘Submit’
- wait for your alignment to finish
- download the alignment fasta file
- use rnav.data.set_multiple_sequence_alignment()
Parameters
- filename : string
path to a new file to which fasta entries are written
- which : list of integers, defaults to “all” (every sequence)
Sequence IDs to write to file.
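The bookkeeping that SequenceChecker describes (uppercase RNA normalization, unique sequences, and an integer sequence ID per sample and data keyword) can be illustrated without RNAvigate. The sketch below is hypothetical: to_rna, index_sequences, to_fasta, the nested-dictionary input, and the FASTA header format are assumptions for illustration, not the library's API.

```python
def to_rna(sequence):
    """Convert a sequence string to an all-uppercase RNA alphabet."""
    return sequence.upper().replace("T", "U")


def index_sequences(samples):
    """Collect unique sequences and map each (sample, keyword) to a sequence ID.

    `samples` is assumed to be {sample_name: {data_keyword: sequence_string}}.
    """
    unique = []
    rows = []
    for sample_name, data in samples.items():
        for keyword, sequence in data.items():
            rna = to_rna(sequence)
            if rna not in unique:
                unique.append(rna)
            # the sequence ID is the index into the list of unique sequences
            rows.append((sample_name, keyword, unique.index(rna)))
    return unique, rows


def to_fasta(sequences, which="all"):
    """Format the selected sequence IDs as FASTA text (header naming is assumed)."""
    ids = range(len(sequences)) if which == "all" else which
    return "".join(f">{i}\n{sequences[i]}\n" for i in ids)
```

Two data keywords that hold the same sequence in different case or alphabet (for example "acgt" and "ACGU") end up sharing one sequence ID, which is the situation the which_sequences table is meant to expose.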
rnavigate.analysis.deltashape module
DeltaSHAPE for detecting meaningful changes in SHAPE reactivity between two samples.
Parameters are optimized for detecting in-cell vs. cell-free protein protections and enhancements, but are useful for identifying any meaningful differences.
Copyright Matthew J. Smola 2015
Largely rewritten for RNAvigate by Patrick Irving 2023
- class rnavigate.analysis.deltashape.DeltaSHAPE(sample1, sample2, profile='shapemap', smoothing_window=3, zf_coeff=1.96, ss_thresh=1, site_window=3, site_nts=2)
Bases: Sample
Detects meaningful differences in chemical probing reactivity
References
doi:10.1021/acs.biochem.5b00977
Algorithm
- Extract SHAPE-MaP sequence, normalized profile, and normalized standard error from given samples
- Calculate smoothed profiles (mean) and propagate standard errors over rolling windows
- Subtract raw and smoothed normalized profiles and propagate errors
- Calculate Z-factors for smoothed data. This is the magnitude of the difference relative to the standard error
- Calculate Z-scores for smoothed data. This is the magnitude of the difference in standard deviations from the mean difference
- Call sites. Called sites must have a minimum number of nucleotides that pass Z-factor and Z-score thresholds per window.
Smoothing window size, Z-factor threshold, Z-score threshold, site-calling window size and minimum nucleotides per site can be specified.
- calculate_deltashape(smoothing_window=3, zf_coeff=1.96, ss_thresh=1, site_window=2, site_nts=3)
Calculate or recalculate deltaSHAPE profile and called sites
Parameters
- smoothing_window : int, default=3
Size of windows for data smoothing
- zf_coeff : float, default=1.96
Sites must have a difference of more than zf_coeff standard errors
- ss_thresh : int, default=1
Sites must have a difference that is ss_thresh standard deviations from the mean difference
- site_window : int, default=3
Number of nucleotides to include when calling sites
- site_nts : int, default=2
Number of nts within site_window that must pass thresholds
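The algorithm steps listed for DeltaSHAPE can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions, not RNAvigate's code: the function names are hypothetical, standard errors are propagated as the root sum of squares over each window, the Z-factor test is written as "absolute difference exceeds zf_coeff times the summed standard errors", and the sign convention (profile 1 minus profile 2) is assumed.

```python
import numpy as np


def rolling_mean(values, window):
    """Centered rolling mean for an odd window; edges are left as NaN."""
    values = np.asarray(values, dtype=float)
    out = np.full(len(values), np.nan)
    half = window // 2
    out[half:len(values) - half] = np.convolve(
        values, np.ones(window) / window, mode="valid")
    return out


def rolling_stderr(errors, window):
    """Propagate standard errors through the rolling mean (assumed independent)."""
    errors = np.asarray(errors, dtype=float)
    out = np.full(len(errors), np.nan)
    half = window // 2
    summed_squares = np.convolve(np.square(errors), np.ones(window), mode="valid")
    out[half:len(errors) - half] = np.sqrt(summed_squares) / window
    return out


def delta_shape_sketch(profile1, stderr1, profile2, stderr2, smoothing_window=3,
                       zf_coeff=1.96, ss_thresh=1, site_window=3, site_nts=2):
    """Return the smoothed difference and a boolean mask of called sites."""
    diff = (rolling_mean(profile1, smoothing_window)
            - rolling_mean(profile2, smoothing_window))
    se1 = rolling_stderr(stderr1, smoothing_window)
    se2 = rolling_stderr(stderr2, smoothing_window)
    # Z-factor test: the difference is large relative to the standard errors
    passes_zf = np.abs(diff) > zf_coeff * (se1 + se2)
    # Z-score test: the difference is far from the mean difference
    zscore = (diff - np.nanmean(diff)) / np.nanstd(diff)
    passes_ss = np.abs(zscore) >= ss_thresh
    passing = passes_zf & passes_ss
    # Site calling: at least site_nts passing nucleotides per site_window
    sites = np.zeros(len(diff), dtype=bool)
    for start in range(len(diff) - site_window + 1):
        span = slice(start, start + site_window)
        if passing[span].sum() >= site_nts:
            sites[span] |= passing[span]
    return diff, sites
```

In this sketch, positive values of diff at called sites mean higher reactivity in the first profile and negative values mean lower reactivity.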
- class rnavigate.analysis.deltashape.DeltaSHAPEProfile(input_data, metric='Smooth_diff', metric_defaults=None, sequence=None, name=None, **kwargs)
Bases: Profile
Profile data class for performing deltaSHAPE analysis
- calculate_deltashape(smoothing_window=3, zf_coeff=1.96, ss_thresh=1, site_window=3, site_nts=2)
Calculate the deltaSHAPE profile metrics
Parameters
- smoothing_window : int, default=3
Size of windows for data smoothing
- zf_coeff : float, default=1.96
Sites must have a difference of more than zf_coeff standard errors
- ss_thresh : int, default=1
Sites must have a difference that is ss_thresh standard deviations from the mean difference
- site_window : int, default=3
Number of nucleotides to include when calling sites
- site_nts : int, default=2
Number of nts within site_window that must pass thresholds
- get_enhancements_annotation()
Get an annotations object for the significant enhancements
- get_protections_annotation()
Get an annotations object for the significant protections
rnavigate.analysis.fragmapper module
Fragmapper analysis tools.
Description: FragMapper compares reactivity profile differences between SHAPE-MaP profiles. The intended application of Fragmapper is to detect fragment or ligand crosslinking sites in RNA.
- class rnavigate.analysis.fragmapper.FragMaP(input_data, parameters, metric='Delta_zscore', metric_defaults=None, read_table_kw=None, sequence=None, name=None)
Bases: Profile
- get_annotation()
- get_dataframe(profile1, profile2, mutation_rate_threshold, depth_threshold, delta_rate_threshold, zscore_threshold, zscore_min_threshold)
- property recreation_kwargs
A dictionary of keyword arguments to pass when recreating the object.
- class rnavigate.analysis.fragmapper.Fragmapper(sample1, sample2, parameters=None, profile='shapemap')
Bases: Sample
- plot_scatter(column='Modified_rate')
Generates scatter plots useful for fragmapper quality control.
- Args:
- column (str, optional):
Dataframe column containing data to plot (must be available for the sample and control). Defaults to “Modified_rate”.
- Returns:
- (matplotlib figure, matplotlib axis)
Scatter plot with control values on the x-axis, sample values on the y-axis, and each point representing a nucleotide not filtered out in the fragmapper pipeline.
- update_annotation()
- class rnavigate.analysis.fragmapper.FragmapperReplicates(samples_1: list, samples_2: list, parameters=None, profile='shapemap')
Bases: Sample
- average_columns(df: DataFrame, avg_columns: list[str] = ['Modified_mutations', 'Modified_effective_depth', 'Modified_rate'], sem_column: list[str] = ['Modified_rate'])
- merge_samples(samples: list, profile: str = 'shapemap', suffix: str = 'rep', columns: list = ['Nucleotide', 'Sequence', 'Modified_mutations', 'Modified_effective_depth', 'Modified_rate'], exceptions: list = ['Nucleotide', 'Sequence'])
- plot_scatter(column: str = 'Modified_rate', error: str = 'Std_err', label_size: int = None, ylabel: str = None, xlabel: str = None)
Generates scatter plots useful for fragmapper quality control.
- Args:
- column (str, optional):
Dataframe column containing data to plot (must be available for the sample and control). Defaults to “Modified_rate”.
- Returns:
- (matplotlib figure, matplotlib axis)
Scatter plot with control values on the x-axis, sample values on the y-axis, and each point representing a nucleotide not filtered out in the fragmapper pipeline.
- update_annotation()
rnavigate.analysis.logcompare module
LogCompare compares reactivity profiles for significant differences.
This analysis requires replicates.
- class rnavigate.analysis.logcompare.LogCompare(samples1, samples2, name1, name2, profile_kw, sequence=None, inherit=None)
Bases: Sample
Compares 2 experimental samples, given replicates of each sample.
Algorithm
1. Calculate the log10(modified/untreated) rate for each replicate.
2. Scale these values to minimize the median of the absolute difference between samples.
3. Calculate the standard error in these values for each replicate.
4. Calculate the difference between samples.
5. Calculate z-scores between samples.
6. Plot the results in two panels: (1) the scaled log10(modified/untreated) rate for each sample with error bars, and (2) the difference between samples, colored by z-score.
Methods
- __init__: computes log10(modified/untreated) rates, rescales the data, then calls make_plot()
- get_profile_sequence: gets log10(m/u) rate and sequence from sample
- rescale: rescales a profile to minimize difference to another profile
- load_replicates: calculates average and standard error of replicates
- make_plots: displays the two panels described above.
- Attributes:
- data (str): a key of sample.data to retrieve per-nucleotide data
- groups (dict): a dictionary containing the following key-value pairs:
  - 1: a dictionary containing these key-value pairs:
    - self.data: averaged scaled log10(m/u) across replicates
    - “stderr”: the standard errors across replicates
    - “stacked”: 2D array containing each scaled log10(m/u) array
    - “seq”: the sequence string
  - 2: same as 1 above, for the second sample
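The LogCompare steps can be sketched numerically as below. This is a hedged illustration and not the library's implementation: log_rate, rescale, and compare_groups are hypothetical names, the rescaling is assumed to be an additive shift in log space found by a simple grid search, and the z-score is assumed to be the difference divided by the combined standard error.

```python
import numpy as np


def log_rate(modified, untreated):
    """log10(modified/untreated); non-positive or undefined ratios become NaN."""
    modified = np.asarray(modified, dtype=float)
    untreated = np.asarray(untreated, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = modified / untreated
        return np.where(ratio > 0, np.log10(ratio), np.nan)


def rescale(profile, target_profile, low=-2.0, high=2.0, steps=4001):
    """Shift `profile` to minimize the median absolute difference to the target."""
    offsets = np.linspace(low, high, steps)
    costs = [np.nanmedian(np.abs(profile + offset - target_profile))
             for offset in offsets]
    return profile + offsets[int(np.argmin(costs))]


def compare_groups(replicates1, replicates2):
    """Difference and z-score between two groups of scaled log profiles."""
    stack1, stack2 = np.vstack(replicates1), np.vstack(replicates2)
    mean1, mean2 = np.nanmean(stack1, axis=0), np.nanmean(stack2, axis=0)
    # standard error of the mean across replicates
    se1 = np.nanstd(stack1, axis=0, ddof=1) / np.sqrt(stack1.shape[0])
    se2 = np.nanstd(stack2, axis=0, ddof=1) / np.sqrt(stack2.shape[0])
    difference = mean1 - mean2
    zscore = difference / np.sqrt(se1 ** 2 + se2 ** 2)
    return difference, zscore
```

A grid search is used here only because it is easy to read; any one-dimensional minimizer would serve the same purpose.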
- class rnavigate.analysis.logcompare.LogProfile(input_data, metric='mean_diff', metric_defaults=None, sequence=None, **kwargs)
Bases: Profile
A class for log10(Modified_rate/Untreated_rate) profiles.
- calc_profile(profile)
Calculate log10(Modified_rate/Untreated_rate) for the given sample/profile.
- Args:
sample (rnavigate.Sample): an rnavigate sample
- Returns:
np.array: log profile
- load_replicates(profiles)
Calculates log profiles, average and standard error for a group of replicates.
- Args:
*profiles (list of rnavigate.Sample): replicates to load
- rescale(profile, target_profile)
Scales profile to minimize difference to target_profile.
- Args:
profile (np.array): log10 profile to scale
target_profile (np.array): 2nd log10 profile
- Returns:
np.array: scaled profile
rnavigate.analysis.lowss module
Performs low SHAPE, low Shannon entropy analysis
- Citation:
- Siegfried, N., Busan, S., Rice, G. et al. RNA motif discovery by SHAPE and mutational profiling (SHAPE-MaP). Nat Methods 11, 959-965 (2014). https://doi.org/10.1038/nmeth.3029
Typical usage example:
import rnavigate as rnav
my_sample = rnav.Sample(
    sample="example sample",
    shapemap="my_shape_profile.txt",
    pairprob="pairing_probabilities.txt",
    ss="MFE_structure.ct"
)
lowss_sample = rnav.analysis.LowSS(my_sample)
plot = lowss_sample.plot_lowss()
plot.save("lowss_figure.svg")
- class rnavigate.analysis.lowss.LowSS(sample, window=55, shapemap='shapemap', pairprob='pairprob', structure='ss')
Bases: Sample
Creates a new RNAvigate Sample which computes and displays Low SHAPE, low Shannon entropy regions (LowSS) given a sample containing SHAPE reactivities, pairing probabilities, and MFE structure.
Methods
- __init__: performs the analysis
- plot_lowss: displays the result and returns plot object
Attributes
- sample : str
the new label for this Sample’s data on plots
- parent : rnavigate.Sample
the sample from which data is retrieved
- window : int
size of the windows, must be odd
- median_shape : float
global median SHAPE reactivity
- median_entropy : float
global median Shannon entropy
- data : dictionary
- dictionary of data keyword: Data objects, keys are:
- “structure” (rnav.data.SecondaryStructure)
copy of provided MFE structure
- “shapemap” (rnav.data.SHAPEMaP)
copy of provided SHAPE-MaP data aligned to “structure”
- “pairprob” (rnav.data.PairingProbability)
copy of pairing probabilities aligned to “structure”
- “entropies” (rnav.data.Profile)
Profile of Shannon entropies calculated from “pairprob”
- “lowSS” (rnav.data.Annotations)
annotations defining low SHAPE, low Shannon entropy regions
- plot_lowss(region=None, colorbars=True)
Visualize LowSS analysis over the given region.
Parameters
- region : integer or list of 2 integers, default=None (entire sequence)
If list: lowSS start and end positions to plot. If integer: region number, +/- 150 nts are shown.
- colorbars : bool, default=True
whether to plot colorbars for pairing probability
Returns
- rnavigate.plots.AP
LowSS visualization
- reset_lowss(maximum_shape=None, maximum_entropy=0.08)
Generates an annotation of lowSS regions. Stored as self.lowSS
Parameters
- maximum_shape : float, default=None (median SHAPE reactivity)
maximum normalized SHAPE reactivity to be called lowSS.
- maximum_entropy : float, default=0.08
maximum Shannon entropy to be called lowSS.
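The quantities behind a low SHAPE, low Shannon entropy call can be sketched with NumPy. This is an illustration under assumptions rather than RNAvigate's implementation: the function names are hypothetical, pairing probabilities are assumed to arrive as a dense square matrix, entropy is assumed to use log base 10, and regions are assumed to be called where windowed medians fall below both thresholds.

```python
import numpy as np


def shannon_entropy(pair_probs):
    """Per-nucleotide Shannon entropy from a matrix of pairing probabilities."""
    p = np.asarray(pair_probs, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        # zero-probability pairs contribute nothing to the entropy
        terms = np.where(p > 0, -p * np.log10(p), 0.0)
    return terms.sum(axis=1)


def windowed_median(values, window=55):
    """Centered rolling median for an odd window; edges are left as NaN."""
    values = np.asarray(values, dtype=float)
    half = window // 2
    out = np.full(len(values), np.nan)
    for center in range(half, len(values) - half):
        out[center] = np.nanmedian(values[center - half:center + half + 1])
    return out


def low_ss_mask(shape, entropy, window=55, maximum_shape=None,
                maximum_entropy=0.08):
    """Boolean mask of positions with low windowed SHAPE and low windowed entropy."""
    if maximum_shape is None:
        maximum_shape = np.nanmedian(shape)  # default: global median SHAPE
    median_shape = windowed_median(shape, window)
    median_entropy = windowed_median(entropy, window)
    return (median_shape < maximum_shape) & (median_entropy < maximum_entropy)
```

Positions near the sequence ends have no complete window, so they are never included in the mask in this sketch.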
Module contents
- class rnavigate.analysis.DeltaSHAPE(sample1, sample2, profile='shapemap', smoothing_window=3, zf_coeff=1.96, ss_thresh=1, site_window=3, site_nts=2)
Bases: Sample
Detects meaningful differences in chemical probing reactivity
References
doi:10.1021/acs.biochem.5b00977
Algorithm
- Extract SHAPE-MaP sequence, normalized profile, and normalized standard error from given samples
- Calculate smoothed profiles (mean) and propagate standard errors over rolling windows
- Subtract raw and smoothed normalized profiles and propagate errors
- Calculate Z-factors for smoothed data. This is the magnitude of the difference relative to the standard error
- Calculate Z-scores for smoothed data. This is the magnitude of the difference in standard deviations from the mean difference
- Call sites. Called sites must have a minimum number of nucleotides that pass Z-factor and Z-score thresholds per window.
Smoothing window size, Z-factor threshold, Z-score threshold, site-calling window size and minimum nucleotides per site can be specified.
- calculate_deltashape(smoothing_window=3, zf_coeff=1.96, ss_thresh=1, site_window=2, site_nts=3)
Calculate or recalculate deltaSHAPE profile and called sites
Parameters
- smoothing_window : int, default=3
Size of windows for data smoothing
- zf_coeff : float, default=1.96
Sites must have a difference of more than zf_coeff standard errors
- ss_thresh : int, default=1
Sites must have a difference that is ss_thresh standard deviations from the mean difference
- site_window : int, default=3
Number of nucleotides to include when calling sites
- site_nts : int, default=2
Number of nts within site_window that must pass thresholds
- class rnavigate.analysis.DeltaSHAPEProfile(input_data, metric='Smooth_diff', metric_defaults=None, sequence=None, name=None, **kwargs)
Bases: Profile
Profile data class for performing deltaSHAPE analysis
- calculate_deltashape(smoothing_window=3, zf_coeff=1.96, ss_thresh=1, site_window=3, site_nts=2)
Calculate the deltaSHAPE profile metrics
Parameters
- smoothing_window : int, default=3
Size of windows for data smoothing
- zf_coeff : float, default=1.96
Sites must have a difference of more than zf_coeff standard errors
- ss_thresh : int, default=1
Sites must have a difference that is ss_thresh standard deviations from the mean difference
- site_window : int, default=3
Number of nucleotides to include when calling sites
- site_nts : int, default=2
Number of nts within site_window that must pass thresholds
- get_enhancements_annotation()
Get an annotations object for the significant enhancements
- get_protections_annotation()
Get an annotations object for the significant protections
- class rnavigate.analysis.FragMaP(input_data, parameters, metric='Delta_zscore', metric_defaults=None, read_table_kw=None, sequence=None, name=None)
Bases: Profile
- get_annotation()
- get_dataframe(profile1, profile2, mutation_rate_threshold, depth_threshold, delta_rate_threshold, zscore_threshold, zscore_min_threshold)
- property recreation_kwargs
A dictionary of keyword arguments to pass when recreating the object.
- class rnavigate.analysis.Fragmapper(sample1, sample2, parameters=None, profile='shapemap')
Bases: Sample
- plot_scatter(column='Modified_rate')
Generates scatter plots useful for fragmapper quality control.
- Args:
- column (str, optional):
Dataframe column containing data to plot (must be available for the sample and control). Defaults to “Modified_rate”.
- Returns:
- (matplotlib figure, matplotlib axis)
Scatter plot with control values on the x-axis, sample values on the y-axis, and each point representing a nucleotide not filtered out in the fragmapper pipeline.
- update_annotation()
- class rnavigate.analysis.FragmapperReplicates(samples_1: list, samples_2: list, parameters=None, profile='shapemap')
Bases: Sample
- average_columns(df: DataFrame, avg_columns: list[str] = ['Modified_mutations', 'Modified_effective_depth', 'Modified_rate'], sem_column: list[str] = ['Modified_rate'])
- merge_samples(samples: list, profile: str = 'shapemap', suffix: str = 'rep', columns: list = ['Nucleotide', 'Sequence', 'Modified_mutations', 'Modified_effective_depth', 'Modified_rate'], exceptions: list = ['Nucleotide', 'Sequence'])
- plot_scatter(column: str = 'Modified_rate', error: str = 'Std_err', label_size: int = None, ylabel: str = None, xlabel: str = None)
Generates scatter plots useful for fragmapper quality control.
- Args:
- column (str, optional):
Dataframe column containing data to plot (must be available for the sample and control). Defaults to “Modified_rate”.
- Returns:
- (matplotlib figure, matplotlib axis)
Scatter plot with control values on the x-axis, sample values on the y-axis, and each point representing a nucleotide not filtered out in the fragmapper pipeline.
- update_annotation()
- class rnavigate.analysis.LogCompare(samples1, samples2, name1, name2, profile_kw, sequence=None, inherit=None)
Bases: Sample
Compares 2 experimental samples, given replicates of each sample.
Algorithm
1. Calculate the log10(modified/untreated) rate for each replicate.
2. Scale these values to minimize the median of the absolute difference between samples.
3. Calculate the standard error in these values for each replicate.
4. Calculate the difference between samples.
5. Calculate z-scores between samples.
6. Plot the results in two panels: (1) the scaled log10(modified/untreated) rate for each sample with error bars, and (2) the difference between samples, colored by z-score.
Methods
- __init__: computes log10(modified/untreated) rates, rescales the data, then calls make_plot()
- get_profile_sequence: gets log10(m/u) rate and sequence from sample
- rescale: rescales a profile to minimize difference to another profile
- load_replicates: calculates average and standard error of replicates
- make_plots: displays the two panels described above.
- Attributes:
- data (str): a key of sample.data to retrieve per-nucleotide data
- groups (dict): a dictionary containing the following key-value pairs:
  - 1: a dictionary containing these key-value pairs:
    - self.data: averaged scaled log10(m/u) across replicates
    - “stderr”: the standard errors across replicates
    - “stacked”: 2D array containing each scaled log10(m/u) array
    - “seq”: the sequence string
  - 2: same as 1 above, for the second sample
- class rnavigate.analysis.LowSS(sample, window=55, shapemap='shapemap', pairprob='pairprob', structure='ss')
Bases: Sample
Creates a new RNAvigate Sample which computes and displays Low SHAPE, low Shannon entropy regions (LowSS) given a sample containing SHAPE reactivities, pairing probabilities, and MFE structure.
Methods
- __init__: performs the analysis
- plot_lowss: displays the result and returns plot object
Attributes
- sample : str
the new label for this Sample’s data on plots
- parent : rnavigate.Sample
the sample from which data is retrieved
- window : int
size of the windows, must be odd
- median_shape : float
global median SHAPE reactivity
- median_entropy : float
global median Shannon entropy
- data : dictionary
- dictionary of data keyword: Data objects, keys are:
- “structure” (rnav.data.SecondaryStructure)
copy of provided MFE structure
- “shapemap” (rnav.data.SHAPEMaP)
copy of provided SHAPE-MaP data aligned to “structure”
- “pairprob” (rnav.data.PairingProbability)
copy of pairing probabilities aligned to “structure”
- “entropies” (rnav.data.Profile)
Profile of Shannon entropies calculated from “pairprob”
- “lowSS” (rnav.data.Annotations)
annotations defining low SHAPE, low Shannon entropy regions
- plot_lowss(region=None, colorbars=True)
Visualize LowSS analysis over the given region.
Parameters
- region : integer or list of 2 integers, default=None (entire sequence)
If list: lowSS start and end positions to plot. If integer: region number, +/- 150 nts are shown.
- colorbars : bool, default=True
whether to plot colorbars for pairing probability
Returns
- rnavigate.plots.AP
LowSS visualization
- reset_lowss(maximum_shape=None, maximum_entropy=0.08)
Generates an annotation of lowSS regions. Stored as self.lowSS
Parameters
- maximum_shape : float, default=None (median SHAPE reactivity)
maximum normalized SHAPE reactivity to be called lowSS.
- maximum_entropy : float, default=0.08
maximum Shannon entropy to be called lowSS.
- class rnavigate.analysis.SequenceChecker(samples)
Bases: object
Check the sequences stored in a list of samples.
Attributes
- samples : list
samples in which to check sequences
- sequences : list
all unique sequence strings stored in the list of samples. These are converted to an all-uppercase RNA alphabet.
- keywords : list
all unique data keywords stored in the list of samples.
- which_sequences : Pandas.DataFrame
each row is a sample, keyword, and index of self.sequences
- get_keywords()
A list of all unique data keywords across samples.
- get_sequences()
A list of all unique sequences (uppercase RNA) across samples.
- get_which_sequences()
A DataFrame of sequence IDs (integers) for each data keyword.
- print_alignments(print_format='long', which='all')
Print alignments in the given format for sequence IDs provided.
Parameters
- print_format : string, defaults to “long”
What format to print the alignments in:
“cigar” prints the CIGAR string
“short” prints the numbers of mismatches and indels
“long” prints the location and nucleotide identity of all mismatches, insertions and deletions.
- which : tuple of two integers, defaults to “all” (every pairwise comparison)
two sequence IDs to compare.
- print_mulitple_sequence_alignment(base_sequence)
Print the multiple sequence alignment with nice formatting.
Parameters
- base_sequence : string
a sequence string that represents the longest common sequence. Usually, this is the return value from rnav.data.set_multiple_sequence_alignment()
- print_which_sequences()
Print sequence ID (integer) for each data keyword and sample.
- reset()
Reset keywords and sequences from sample list in case of changes.
- write_fasta(filename, which='all')
Write all unique sequences to a fasta file.
This is useful with external multiple sequence aligners such as ClustalOmega:
- upload the new fasta file
- under STEP 2 output format, select Pearson/FASTA
- click ‘Submit’
- wait for your alignment to finish
- download the alignment fasta file
- use rnav.data.set_multiple_sequence_alignment()
Parameters
- filename : string
path to a new file to which fasta entries are written
- which : list of integers, defaults to “all” (every sequence)
Sequence IDs to write to file.
- class rnavigate.analysis.WindowedAUROC(sample, window=81, profile='default_profile', structure='default_structure')
Bases: object
Compute and display windowed AUROC analysis.
This analysis computes the ROC curve over a sliding window for the performance of per-nucleotide data (usually SHAPE-MaP or DMS-MaP Normalized reactivity) in predicting the base-pairing status of each nucleotide. The area under this curve (AUROC) is displayed compared to the median across the RNA. Below, an arc plot displays the secondary structure and per-nucleotide profile.
AUROC values are expected to range from 0.5 (no predictive power) to 1.0 (perfect predictive power). A value near 0.5 indicates that the reactivity profile does not fit the structure prediction well. Regions with low AUROC are good candidates for further investigation with ensemble deconvolution.
References
- Lan, T.C.T., Allan, M.F., Malsick, L.E. et al. Secondary structural ensembles of the SARS-CoV-2 RNA genome in infected cells. Nat Commun 13, 1128 (2022). https://doi.org/10.1038/s41467-022-28603-2
Methods
- __init__: Computes the AUROC array and AUROC median.
- plot_auroc: Displays the AUROC analysis over the given region. Returns a Plot object.
Attributes
- sample : rnavigate.Sample
sample to retrieve profile and secondary structure
- structure : str
Data keyword of sample pointing to secondary structure, e.g. sample.data[structure]
- profile : str
Data keyword of sample pointing to profile, e.g. sample.data[profile]
- sequence : the sequence string of sample.data[structure]
- window : the size of the windows
- nt_length : the length of the sequence string
- auroc : the AUROC numpy array, length = nt_length, padded with np.nan
- median_auroc : the median of the auroc array
- plot_auroc(region=None)
Plot the result of the windowed AUROC analysis, with arc plot of structure and reactivity profile.
- Args:
- region (list of int: length 2, optional): Start and end nucleotide positions to plot. Defaults to [1, RNA length].