Strategies for Linear Data

Recent literature describes several methods and strategies for converting linear experimental RNA structure probing data into pseudo energy terms that guide RNA secondary structure predictions. Such data may come from SHAPE, DMS, lead, inline probing, or similar techniques. The most commonly used method may be the one presented in Deigan et al. [2009], where the free energy evaluation for stacked base pairs includes a pseudo energy contribution derived from the reactivity values of the probing data. The higher the reactivity, the more strongly the stack is penalized by a positive energy.

Generic Probing Data Strategy API

Defines

VRNA_PROBING_DATA_WEIGHT_POSITION_WISE
#include <ViennaRNA/probing/basic.h>
VRNA_PROBING_DATA_SINGLE_STRATEGY
#include <ViennaRNA/probing/basic.h>
VRNA_PROBING_DATA_SINGLE_WEIGHT
#include <ViennaRNA/probing/basic.h>
VRNA_PROBING_DATA_DEFAULT
#include <ViennaRNA/probing/basic.h>
VRNA_PROBING_DATA_LINEAR_TARGET_STACK
#include <ViennaRNA/probing/basic.h>
VRNA_PROBING_DATA_LINEAR_TARGET_UP
#include <ViennaRNA/probing/basic.h>
VRNA_PROBING_DATA_LINEAR_TARGET_BP
#include <ViennaRNA/probing/basic.h>
VRNA_PROBING_METHOD_MULTI_PARAMS_0

probing data conversion flag for comparative structure predictions indicating no parameter to be sequence specific

#include <ViennaRNA/probing/basic.h>

VRNA_PROBING_METHOD_MULTI_PARAMS_1

probing data conversion flag for comparative structure predictions indicating 1st parameter to be sequence specific

#include <ViennaRNA/probing/basic.h>

VRNA_PROBING_METHOD_MULTI_PARAMS_2

probing data conversion flag for comparative structure predictions indicating 2nd parameter to be sequence specific

#include <ViennaRNA/probing/basic.h>

VRNA_PROBING_METHOD_MULTI_PARAMS_3

probing data conversion flag for comparative structure predictions indicating 3rd parameter to be sequence specific

#include <ViennaRNA/probing/basic.h>

VRNA_PROBING_METHOD_MULTI_PARAMS_DEFAULT

probing data conversion flag for comparative structure predictions indicating default parameter settings

#include <ViennaRNA/probing/basic.h>

Essentially, this setting indicates that all probing data is to be converted using the same parameters. Use any combination of VRNA_PROBING_METHOD_MULTI_PARAMS_1, VRNA_PROBING_METHOD_MULTI_PARAMS_2, VRNA_PROBING_METHOD_MULTI_PARAMS_3, and so on to indicate that the first, second, third, or other parameter is sequence specific.

Typedefs

typedef double *(*vrna_probing_strategy_f)(vrna_fold_compound_t *fc, const double *data, size_t data_size, unsigned int target, void *options)

Prototype of a strategy to derive pseudo energies from linear structure probing data.

#include <ViennaRNA/probing/basic.h>
Param fc:

The fold compound the probing data will be applied to

Param data:

The structure probing data (1-based array of reactivity values)

Param data_size:

The size of data, i.e. the total number of reactivity values

Param target:

The structural context for which pseudo energies are requested from the strategy

Param options:

An arbitrary data structure the strategy requires for working on the data

Return:

A pointer to an array of pseudo energies ready to be included as soft constraints

The Deigan 2009 Strategy API

Defines

VRNA_PROBING_METHOD_DEIGAN2009_DEFAULT_m

Default parameter for slope m as used in method of Deigan et al. [2009] .

#include <ViennaRNA/probing/strategy_deigan.h>

VRNA_PROBING_METHOD_DEIGAN2009_DEFAULT_b

Default parameter for intercept b as used in method of Deigan et al. [2009] .

#include <ViennaRNA/probing/strategy_deigan.h>

Functions

double *vrna_probing_strategy_deigan(vrna_fold_compound_t *fc, const double *data, size_t data_size, unsigned int target, void *options)
#include <ViennaRNA/probing/strategy_deigan.h>
void *vrna_probing_strategy_deigan_options(double m, double b, double max_value, vrna_math_fun_dbl_f cb_preprocess, vrna_math_fun_dbl_opt_t cb_preprocess_opt, vrna_math_fun_dbl_opt_free_f cb_preprocess_opt_free)
#include <ViennaRNA/probing/strategy_deigan.h>
void vrna_probing_strategy_deigan_options_free(void *options)
#include <ViennaRNA/probing/strategy_deigan.h>
vrna_probing_data_t vrna_probing_data_deigan(const double *reactivities, unsigned int n, double m, double b)

Prepare probing data according to Deigan et al. 2009 method.

#include <ViennaRNA/probing/strategy_deigan.h>

Prepares a data structure to be used with vrna_sc_probing() to guide RNA folding using the simple linear ansatz

\[ \Delta G_{\text{SHAPE}}(i) = m \ln(\text{SHAPE reactivity}(i)+1)+ b \]

to convert probing data, e.g. SHAPE reactivity values, to pseudo energies whenever a nucleotide \( i \) contributes to a stacked pair. A positive slope \( m \) penalizes high reactivities in paired regions, while a negative intercept \( b \) results in a confirmatory bonus free energy for correctly predicted base pairs. Since the energy evaluation of a base pair stack involves two pairs, the pseudo energies are added for all four contributing nucleotides. Consequently, the energy term is applied twice for pairs inside a helix and only once for pairs adjacent to other structures. For all other loop types the energy model remains unchanged, even when the experimental data strongly disagrees with a certain motif.

Note

For further details, we refer to Deigan et al. [2009] .

Parameters:
  • reactivities – 1-based array of per-nucleotide probing data, e.g. SHAPE reactivities

  • n – The length of the reactivities list

  • m – The slope used for the probing data to soft constraints conversion strategy

  • b – The intercept used for the probing data to soft constraints conversion strategy

Returns:

A pointer to a data structure containing the probing data and any preparations necessary to use it in vrna_sc_probing() according to the method of Deigan et al. [2009] or NULL on any error.

vrna_probing_data_t vrna_probing_data_deigan_trans(const double *reactivities, unsigned int n, double m, double b, vrna_math_fun_dbl_f cb_preprocess, vrna_math_fun_dbl_opt_t cb_preprocess_opt, vrna_math_fun_dbl_opt_free_f cb_preprocess_opt_free)
#include <ViennaRNA/probing/strategy_deigan.h>
vrna_probing_data_t vrna_probing_data_deigan_comparative(const double **reactivities, const unsigned int *n, unsigned int n_seq, double *ms, double *bs, unsigned int multi_params)

Prepare (multiple) probing data according to Deigan et al. 2009 method for comparative structure predictions.

#include <ViennaRNA/probing/strategy_deigan.h>

Similar to vrna_probing_data_deigan(), this function prepares a data structure to be used with vrna_sc_probing() to guide RNA folding using the simple linear ansatz

\[ \Delta G_{\text{SHAPE}}(i) = m \ln(\text{SHAPE reactivity}(i)+1)+ b \]

to convert probing data, e.g. SHAPE reactivity values, to pseudo energies whenever a nucleotide \( i \) contributes to a stacked pair. This function's purpose is to allow for adding multiple probing data sets as required for comparative structure predictions over multiple sequence alignments (MSA) with n_seq sequences. For that purpose, reactivities can be provided for any of the sequences in the MSA. Individual probing data is always expected to be specified in sequence coordinates, i.e. without considering gaps in the MSA. Therefore, each set of reactivities may have a different length, as specified by the parameter n. In addition, each set of probing data may undergo the conversion using different parameters \( m \) and \( b \). Whether or not multiple sets of conversion parameters are provided must be specified using the multi_params flag parameter. Use VRNA_PROBING_METHOD_MULTI_PARAMS_1 to indicate that ms points to an array of slopes, one for each sequence. Likewise, VRNA_PROBING_METHOD_MULTI_PARAMS_2 indicates that bs points to an array of intercepts, one for each sequence. The bitwise OR of the two values makes both parameters sequence specific.

Note

For further details, we refer to Deigan et al. [2009] .

Parameters:
  • reactivities – 0-based array of 1-based arrays of per-nucleotide probing data, e.g. SHAPE reactivities

  • n – 0-based array of lengths of the reactivities lists

  • n_seq – The number of sequences in the MSA

  • ms – 0-based array of the slopes used for the probing data to soft constraints conversion strategy or the address of a single slope value to be applied for all data

  • bs – 0-based array of the intercepts used for the probing data to soft constraints conversion strategy or the address of a single intercept value to be applied for all data

  • multi_params – A flag indicating what is passed through parameters ms and bs

Returns:

A pointer to a data structure containing the probing data and any preparations necessary to use it in vrna_sc_probing() according to the method of Deigan et al. [2009] or NULL on any error.
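The meaning of the multi_params flag for ms and bs can be pictured with the following sketch. The `SKETCH_MULTI_PARAMS_*` bit values are hypothetical stand-ins for VRNA_PROBING_METHOD_MULTI_PARAMS_0, VRNA_PROBING_METHOD_MULTI_PARAMS_1, and VRNA_PROBING_METHOD_MULTI_PARAMS_2 (the real values are defined in ViennaRNA/probing/basic.h and may differ), and the helper functions are illustrative only, not part of the library.

```c
#include <stddef.h>

/* Hypothetical bit values, chosen for illustration only. */
#define SKETCH_MULTI_PARAMS_0 0U
#define SKETCH_MULTI_PARAMS_1 1U
#define SKETCH_MULTI_PARAMS_2 2U

/* Slope for sequence s (0-based): ms[s] if slopes are sequence specific,
 * otherwise the single value that ms points to. */
double
sketch_slope_for(const double *ms, unsigned int s, unsigned int multi_params)
{
  return (multi_params & SKETCH_MULTI_PARAMS_1) ? ms[s] : ms[0];
}

/* Intercept for sequence s (0-based), selected the same way via the second flag. */
double
sketch_intercept_for(const double *bs, unsigned int s, unsigned int multi_params)
{
  return (multi_params & SKETCH_MULTI_PARAMS_2) ? bs[s] : bs[0];
}
```

With only the first flag set, a caller would pass an array of n_seq slopes together with the address of one shared intercept; combining both flags with a bitwise OR requires arrays of length n_seq for both parameters.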

vrna_probing_data_t vrna_probing_data_deigan_trans_comparative(const double **reactivities, const unsigned int *n, unsigned int n_seq, double *ms, double *bs, unsigned int multi_params, vrna_math_fun_dbl_f *cb_preprocess, vrna_math_fun_dbl_opt_t *cb_preprocess_opt, vrna_math_fun_dbl_opt_free_f *cb_preprocess_opt_free)
#include <ViennaRNA/probing/strategy_deigan.h>

The Zarringhalam 2012 Strategy API

Defines

VRNA_PROBING_METHOD_ZARRINGHALAM2012_DEFAULT_beta

Default parameter beta as used in method of Zarringhalam et al. [2012] .

#include <ViennaRNA/probing/strategy_zarringhalam.h>

VRNA_PROBING_METHOD_ZARRINGHALAM2012_DEFAULT_conversion

Default conversion method of probing data into probabilities as used in method of Zarringhalam et al. [2012] .

#include <ViennaRNA/probing/strategy_zarringhalam.h>

VRNA_PROBING_METHOD_ZARRINGHALAM2012_DEFAULT_probability

Default probability value for missing data in method of Zarringhalam et al. [2012] .

#include <ViennaRNA/probing/strategy_zarringhalam.h>

Functions

double *vrna_probing_strategy_zarringhalam(vrna_fold_compound_t *fc, const double *data, size_t data_size, unsigned int target, void *options)
#include <ViennaRNA/probing/strategy_zarringhalam.h>
void *vrna_probing_strategy_zarringhalam_options(double beta, double default_probability, double max_value, vrna_math_fun_dbl_f cb_preprocess, vrna_math_fun_dbl_opt_t cb_preprocess_opt, vrna_math_fun_dbl_opt_free_f cb_preprocess_opt_free)
#include <ViennaRNA/probing/strategy_zarringhalam.h>
void vrna_probing_strategy_zarringhalam_options_free(void *option)
#include <ViennaRNA/probing/strategy_zarringhalam.h>
vrna_probing_data_t vrna_probing_data_zarringhalam(const double *reactivities, unsigned int n, double beta, const char *pr_conversion, double pr_default)

Prepare probing data according to Zarringhalam et al. 2012 method.

#include <ViennaRNA/probing/strategy_zarringhalam.h>

Prepares a data structure to be used with vrna_sc_probing() to guide RNA folding using the method of Zarringhalam et al. [2012].

This method first converts the observed probing data of nucleotide \( i \) into a probability \( q_i \) that position \( i \) is unpaired by means of a non-linear map. Then pseudo-energies of the form

\[ \Delta G_{\text{SHAPE}}(x,i) = \beta\ |x_i - q_i| \]

are computed, where \( x_i=1 \) if position \( i \) is unpaired and \( x_i=0 \) if \( i \) is paired in a given secondary structure, so that \( x_i \) and \( q_i \) refer to the same state. The parameter \( \beta \) serves as a scaling factor. The magnitude of the discrepancy between prediction and experimental observation is represented by \( |x_i - q_i| \).

Note

For further details, we refer to Zarringhalam et al. [2012].

Parameters:
  • reactivities – 1-based array of per-nucleotide probing data, e.g. SHAPE reactivities

  • n – The length of the reactivities list

  • beta – The scaling factor \( \beta \) of the conversion function

  • pr_conversion – A flag that specifies how to convert reactivities to probabilities

  • pr_default – The default probability for a nucleotide for which reactivity data is missing

Returns:

A pointer to a data structure containing the probing data and any preparations necessary to use it in vrna_sc_probing() according to the method of Zarringhalam et al. [2012] or NULL on any error.
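As a minimal sketch of the pseudo-energy term \( \beta\ |x_i - q_i| \), the helper below evaluates it for one position. It is an illustration only and not the library's code: the function name is hypothetical, and the non-linear map from reactivities to probabilities is deliberately left out, so \( q_i \) is assumed to be already available.

```c
#include <math.h>

/* Pseudo energy beta * |x - q| for a single position.
 *   x: 0/1 indicator of the position's state in the candidate structure
 *   q: probability derived from the probing data, in [0, 1]
 * The conversion from reactivities to q is not shown here. */
double
zarringhalam_pseudo_energy(double beta, int x, double q)
{
  return beta * fabs((double)x - q);
}
```

For any position, the two possible states receive complementary penalties: the values for x = 0 and x = 1 always sum to \( \beta \), so data close to 0.5 barely discriminates between the two states.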

vrna_probing_data_t vrna_probing_data_zarringhalam_trans(const double *reactivities, unsigned int n, double beta, double pr_default, vrna_math_fun_dbl_f cb_preprocess, vrna_math_fun_dbl_opt_t cb_preprocess_opt, vrna_math_fun_dbl_opt_free_f cb_preprocess_opt_free)
#include <ViennaRNA/probing/strategy_zarringhalam.h>
vrna_probing_data_t vrna_probing_data_zarringhalam_comparative(const double **reactivities, unsigned int *n, unsigned int n_seq, double *betas, const char **pr_conversions, double *pr_defaults, unsigned int multi_params)

Prepare probing data according to Zarringhalam et al. 2012 method for comparative structure predictions.

#include <ViennaRNA/probing/strategy_zarringhalam.h>

Similar to vrna_probing_data_zarringhalam(), this function prepares a data structure to be used with vrna_sc_probing() to guide RNA folding using the method of Zarringhalam et al. [2012] .

This function's purpose is to allow for adding multiple probing data sets as required for comparative structure predictions over multiple sequence alignments (MSA) with n_seq sequences. For that purpose, reactivities can be provided for any of the sequences in the MSA. Individual probing data is always expected to be specified in sequence coordinates, i.e. without considering gaps in the MSA. Therefore, each set of reactivities may have a different length, as specified by the parameter n. In addition, each set of probing data may undergo the conversion using a different parameter \( \beta \). Additionally, the strategy for converting probing data to probabilities and the default values for missing data can be specified per sequence. Whether or not multiple conversion parameters are provided must be specified using the multi_params flag parameter. Use VRNA_PROBING_METHOD_MULTI_PARAMS_1 to indicate that betas points to an array of \( \beta \) values, one for each sequence. VRNA_PROBING_METHOD_MULTI_PARAMS_2 indicates that pr_conversions points to an array of probing data to probability conversion strategies, and VRNA_PROBING_METHOD_MULTI_PARAMS_3 indicates multiple default probabilities for missing data. The bitwise OR of the three values makes all of them sequence specific.

Note

For further details, we refer to Zarringhalam et al. [2012].

Parameters:
  • reactivities – 0-based array of 1-based arrays of per-nucleotide probing data, e.g. SHAPE reactivities

  • n – 0-based array of lengths of the reactivities lists

  • n_seq – The number of sequences in the MSA

  • betas – 0-based array with scaling factors \( \beta \) of the conversion function or the address of a scaling factor to be applied for all data

  • pr_conversions – 0-based array of flags that specify how to convert reactivities to probabilities or the address of a conversion strategy to be applied for all data

  • pr_defaults – 0-based array of default probabilities for nucleotides for which reactivity data is missing or the address of a single default probability to be applied for all data

  • multi_params – A flag indicating what is passed through parameters betas, pr_conversions, and pr_defaults

Returns:

A pointer to a data structure containing the probing data and any preparations necessary to use it in vrna_sc_probing() according to the method of Zarringhalam et al. [2012] or NULL on any error.

vrna_probing_data_t vrna_probing_data_zarringhalam_trans_comparative(const double **reactivities, unsigned int *n, unsigned int n_seq, double *betas, double *pr_defaults, unsigned int multi_params, vrna_math_fun_dbl_f *cb_preprocess, vrna_math_fun_dbl_opt_t *cb_preprocess_opt, vrna_math_fun_dbl_opt_free_f *cb_preprocess_opt_free)
#include <ViennaRNA/probing/strategy_zarringhalam.h>

The Eddy 2014 Strategy API

Defines

VRNA_PROBING_STRATEGY_EDDY_OPTIONS_DEFAULT

Default options for the Eddy [2014] probing data conversion strategy.

#include <ViennaRNA/probing/strategy_eddy.h>

See also

vrna_probing_strategy_eddy_options(), vrna_probing_data_eddy(), vrna_probing_data_eddy_trans(), vrna_probing_data_eddy_comparative(), vrna_probing_data_eddy_trans_comparative()

VRNA_PROBING_STRATEGY_EDDY_NO_TEMPERATURE_RESCALING

Prevent temperature dependent energy rescaling in Eddy [2014] strategy.

#include <ViennaRNA/probing/strategy_eddy.h>

This option flag forces the probing data conversion strategy to always use the same thermodynamic temperature \( T \), no matter what temperature the predictions are made for.

See also

vrna_probing_strategy_eddy_options(), vrna_probing_data_eddy(), vrna_probing_data_eddy_trans(), vrna_probing_data_eddy_comparative(), vrna_probing_data_eddy_trans_comparative()

Functions

double *vrna_probing_strategy_eddy(vrna_fold_compound_t *fc, const double *data, size_t data_size, unsigned int target, void *options)
#include <ViennaRNA/probing/strategy_eddy.h>
void *vrna_probing_strategy_eddy_options(double temperature, unsigned char options, const double *prior_unpaired, size_t prior_unpaired_size, const double *prior_paired, size_t prior_paired_size, vrna_math_fun_dbl_f cb_preprocess, vrna_math_fun_dbl_opt_t cb_preprocess_opt, vrna_math_fun_dbl_opt_free_f cb_preprocess_opt_free)
#include <ViennaRNA/probing/strategy_eddy.h>
void vrna_probing_strategy_eddy_options_free(void *options)
#include <ViennaRNA/probing/strategy_eddy.h>
struct vrna_probing_data_s *vrna_probing_data_eddy(const double *reactivities, unsigned int n, double temperature, unsigned char options, const double *unpaired_data, unsigned int unpaired_len, const double *paired_data, unsigned int paired_len)

Add probing data as soft constraints (Eddy/RNAprob-2 method).

#include <ViennaRNA/probing/strategy_eddy.h>

This approach of probing data directed RNA folding uses the probability framework proposed by Eddy [2014] :

\[ \Delta G_{\text{data}}(i) = - RT\ln(\mathbb{P}(\text{data}(i)\mid x_i\pi_i)) \]

to convert probing data to pseudo energies for given nucleotide \( x_i \) and class probability \( \pi_i \) at position \( i \). The conditional probability is taken from a prior-distribution of probing data for the respective classes.

Here, the method distinguishes exactly two different classes of structural context, (i) unpaired and (ii) paired positions, following the lines of the RNAprob-2 method of Deng et al. [2016]. The reactivity distribution is computed using Gaussian kernel density estimation (KDE) with bandwidth \( h \) computed using Scott's factor

\[ h = n^{-\frac{1}{5}} \]

where \( n \) is the number of data points of the prior distribution.

Note

For further details, we refer to Eddy [2014] and Deng et al. [2016] .

Parameters:
  • reactivities – A 1-based vector of probing data, e.g. normalized SHAPE reactivities

  • n – Length of reactivities

  • temperature – The thermodynamic temperature \( T \)

  • options – Options bit flags to change the behavior of the strategy

  • unpaired_data – Pointer to an array of probing data for unpaired nucleotides

  • unpaired_len – Length of unpaired_data

  • paired_data – Pointer to an array of probing data for paired nucleotides

  • paired_len – Length of paired_data

Returns:

A pointer to a data structure containing the probing data and any preparations necessary to use it in vrna_sc_probing() according to the method of Eddy [2014] or NULL on any error.

struct vrna_probing_data_s *vrna_probing_data_eddy_trans(const double *reactivities, unsigned int n, double temperature, unsigned char options, const double *unpaired_data, unsigned int unpaired_len, const double *paired_data, unsigned int paired_len, vrna_math_fun_dbl_f cb_preprocess, vrna_math_fun_dbl_opt_t cb_preprocess_opt, vrna_math_fun_dbl_opt_free_f cb_preprocess_opt_free)
#include <ViennaRNA/probing/strategy_eddy.h>
struct vrna_probing_data_s *vrna_probing_data_eddy_comparative(const double **reactivities, const unsigned int *n, unsigned int n_seq, double temperature, unsigned char options, const double **unpaired_datas, const unsigned int *unpaired_lens, const double **paired_datas, const unsigned int *paired_lens, unsigned int multi_params)

Add probing data as soft constraints (Eddy/RNAprob-2 method) for comparative structure predictions.

#include <ViennaRNA/probing/strategy_eddy.h>

Similar to vrna_probing_data_eddy(), this function prepares a data structure for probing data directed RNA folding. It uses the probability framework proposed by Eddy [2014] :

\[ \Delta G_{\text{data}}(i) = - RT\ln(\mathbb{P}(\text{data}(i)\mid x_i\pi_i)) \]

to convert probing data to pseudo energies for given nucleotide \( x_i \) and class probability \( \pi_i \) at position \( i \). The conditional probability is taken from a prior-distribution of probing data for the respective classes.

This function's purpose is to allow for adding multiple probing data sets as required for comparative structure predictions over multiple sequence alignments (MSA) with n_seq sequences. For that purpose, reactivities can be provided for any of the sequences in the MSA. Individual probing data is always expected to be specified in sequence coordinates, i.e. without considering gaps in the MSA. Therefore, each set of reactivities may have a different length, as specified by the parameter n. In addition, each set of probing data may undergo the conversion using different prior distributions for unpaired and paired nucleotides. Whether or not multiple sets of conversion priors are provided must be specified using the multi_params flag parameter. Use VRNA_PROBING_METHOD_MULTI_PARAMS_1 to indicate that unpaired_datas points to an array of unpaired probing data, one for each sequence. Similarly, VRNA_PROBING_METHOD_MULTI_PARAMS_2 indicates that paired_datas points to an array of paired probing data, one for each sequence. The bitwise OR of the two values makes both parameters sequence specific.

Note

For further details, we refer to Eddy [2014] and Deng et al. [2016] .

Parameters:
  • reactivities – 0-based array of 1-based arrays of per-nucleotide probing data, e.g. SHAPE reactivities

  • n – 0-based array of lengths of the reactivities lists

  • n_seq – The number of sequences in the MSA

  • temperature – The thermodynamic temperature \( T \)

  • options – Options bit flags to change the behavior of the strategy

  • unpaired_datas – 0-based array of 0-based arrays with probing data for unpaired nucleotides or address of a single array of such data

  • unpaired_lens – 0-based array of lengths for each probing data array in unpaired_datas

  • paired_datas – 0-based array of 0-based arrays with probing data for paired nucleotides or address of a single array of such data

  • paired_lens – 0-based array of lengths for each probing data array in paired_datas

  • multi_params – A flag indicating what is passed through parameters unpaired_datas and paired_datas

Returns:

A pointer to a data structure containing the probing data and any preparations necessary to use it in vrna_sc_probing() according to the method of Eddy [2014] or NULL on any error.
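Because the comparative functions expect probing data in sequence coordinates rather than alignment coordinates, it can help to see how the two relate. The sketch below maps each alignment column of one aligned sequence to its 1-based sequence position. The helper name is hypothetical, and treating only '-' and '.' as gap characters is an assumption; the library handles this mapping itself, so the code only illustrates the coordinate convention.

```c
#include <stddef.h>

/* For each alignment column c (0-based) of the gapped sequence `aligned`,
 * store in map[c] the 1-based position within the ungapped sequence,
 * or 0 if the column holds a gap. Returns the ungapped sequence length.
 * Assumption: '-' and '.' are the only gap characters. */
unsigned int
sketch_columns_to_positions(const char *aligned, unsigned int *map)
{
  unsigned int pos = 0;

  for (size_t c = 0; aligned[c] != '\0'; c++) {
    if ((aligned[c] == '-') || (aligned[c] == '.'))
      map[c] = 0;
    else
      map[c] = ++pos;
  }

  return pos;
}
```

For the aligned sequence "GG-A-C", the returned length is 4, which is the value to pass in n for that sequence, and a reactivity stored at index 3 of its 1-based data array belongs to alignment column 3 (0-based), the 'A'.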

struct vrna_probing_data_s *vrna_probing_data_eddy_trans_comparative(const double **reactivities, const unsigned int *n, unsigned int n_seq, double temperature, unsigned char options, const double **unpaired_datas, const unsigned int *unpaired_lens, const double **paired_datas, const unsigned int *paired_lens, unsigned int multi_params, vrna_math_fun_dbl_f *cb_preprocess, vrna_math_fun_dbl_opt_t *cb_preprocess_opt, vrna_math_fun_dbl_opt_free_f *cb_preprocess_opt_free)
#include <ViennaRNA/probing/strategy_eddy.h>