Experimental Structure Probing Data
While RNA secondary structure prediction yields good results in general, neither the underlying energy model nor its parameters are perfect: the parameters carry uncertainties, and the model itself rests on simplifying assumptions. Prediction performance can, however, be increased by integrating experimental RNA structure probing data, such as data derived from selective 2’-hydroxyl acylation analyzed by primer extension (SHAPE), dimethyl sulfate (DMS), inline probing, or similar techniques.
Such experimental probing data is usually integrated in the form of small perturbations of the evaluated energy contributions (Soft Constraints) that guide the prediction towards the information gained from the experiment.
Specialized Modules:
Structure Probing Data Workflow
Our implementations follow a workflow that divides the application of structure probing data into three steps:
Pre-processing of the probing data
Conversion of the pre-processed data into pseudo energy contributions
Passing the pseudo energies to the soft-constraints interface
The first two steps are handled by a dedicated strategy, which is essentially a
single callback function (vrna_probing_strategy_f). The probing data and
the strategy then need to be bundled, e.g. by vrna_probing_data_linear(), to obtain
a data structure of type vrna_probing_data_t. This data structure is passed
to the vrna_sc_probing() function, which performs the last of the three steps by
calling the strategy to process the data and adding the derived pseudo energy
contributions as soft constraints.
We have already implemented a few strategies that can be used out of the box, i.e. we provide wrappers that generate the bundled data structure for a particular strategy. The available API symbols can be found in Strategies for Linear Data.
However, the generic interface accepts any user-defined strategy, so users can implement new strategies and pass them to our API.
Once all three steps are complete, any subsequent prediction (MFE, partition function, etc.) takes the probing data into account and is guided by it.
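The three-step workflow can be illustrated with a minimal, library-independent Python sketch. It does not call the ViennaRNA API: the names MISSING, preprocess(), toy_strategy(), bundle() and apply_probing() are hypothetical stand-ins, the sentinel value is an assumption, and the linear conversion is a placeholder rather than a published method. The point is only to show how a single strategy callback covers steps one and two, while a separate function performs step three.

```python
import math

MISSING = -999.0  # hypothetical sentinel for "no measurement" (assumption)


def preprocess(reactivities):
    """Step 1: mark negative or non-finite reactivities as missing."""
    return [r if math.isfinite(r) and r >= 0.0 else MISSING for r in reactivities]


def toy_strategy(reactivities, scale=1.0):
    """Steps 1 and 2 in one callback: clean the data, then convert it.

    The linear conversion is a placeholder, not a published method.
    Missing values contribute a pseudo energy of 0.
    """
    cleaned = preprocess(reactivities)
    return [0.0 if r == MISSING else scale * r for r in cleaned]


def bundle(reactivities, strategy, **options):
    """Bundle raw data with a strategy callback and its options."""
    return {"data": list(reactivities), "strategy": strategy, "options": options}


def apply_probing(soft_constraints, bundled):
    """Step 3: call the strategy and add its pseudo energies per position."""
    energies = bundled["strategy"](bundled["data"], **bundled["options"])
    # Positions are numbered from 1 in this sketch.
    for pos, energy in enumerate(energies, start=1):
        soft_constraints[pos] = soft_constraints.get(pos, 0.0) + energy
    return len(energies)


sc = {}
probing = bundle([0.5, -1.0, 2.0], toy_strategy, scale=2.0)
applied = apply_probing(sc, probing)
```

Keeping the strategy as a plain callable means a different conversion can be swapped in without touching the bundling or the final application step.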
In the following, you’ll find the respective API symbols that allow for the integration of experimental probing data.
Generic Probing Data API
Include Experimental Structure Probing Data to Guide Structure Predictions.
Defines
-
VRNA_REACTIVITY_MISSING
#include <ViennaRNA/probing/basic.h>
Typedefs
-
typedef struct vrna_probing_data_s *vrna_probing_data_t
A data structure that contains RNA structure probing data and specifies how this data is to be integrated into structure predictions.
#include <ViennaRNA/probing/basic.h>
Functions
-
int vrna_sc_probing(vrna_fold_compound_t *fc, vrna_probing_data_t data)
Apply probing data (e.g. SHAPE) to guide the structure prediction.
#include <ViennaRNA/probing/basic.h>
- SWIG Wrapper Notes:
This function is attached as method sc_probing() to objects of type fold_compound. See, e.g. RNA.fold_compound.sc_probing() in the Python API.
See also
vrna_probing_data_t, vrna_probing_data_free(), vrna_probing_data_deigan(), vrna_probing_data_deigan_comparative(), vrna_probing_data_zarringhalam(), vrna_probing_data_zarringhalam_comparative(), vrna_probing_data_eddy(), vrna_probing_data_eddy_comparative()
- Parameters:
fc – The vrna_fold_compound_t the probing data should be applied to in subsequent computations
data – The prepared probing data and probing data integration strategy
- Returns:
The number of probing data sets applied, 0 upon any error
-
vrna_probing_data_t vrna_probing_data_linear(const double *data, unsigned int data_length, const double *data_weights, vrna_probing_strategy_f strategy_cb, void *strategy_cb_options, vrna_auxdata_free_f strategy_cb_options_free, unsigned int options)
#include <ViennaRNA/probing/basic.h>
-
vrna_probing_data_t vrna_probing_data_linear_multi(const double **data, unsigned int data_size, const unsigned int *data_lengths, const double **data_weights, vrna_probing_strategy_f *strategy_cbs, void **strategy_cbs_options, vrna_auxdata_free_f *strategy_cbs_options_free, unsigned int options)
#include <ViennaRNA/probing/basic.h>
-
void vrna_probing_data_free(vrna_probing_data_t d)
Free memory occupied by the (prepared) probing data.
#include <ViennaRNA/probing/basic.h>
-
unsigned int vrna_probing_data_linear_num(struct vrna_probing_data_s *data)
Get the number of structure probing data sets.
#include <ViennaRNA/probing/basic.h>
See also
vrna_probing_data_linear(), vrna_probing_data_linear_multi()
- Parameters:
data – The data structure storing the probing data
- Returns:
The number of probing data sets stored in data
-
double *vrna_probing_data_linear_raw(struct vrna_probing_data_s *data, unsigned int pos, unsigned int *data_size)
Get the raw probing data.
#include <ViennaRNA/probing/basic.h>
This function retrieves a copy of the structure probing data set number pos as stored in data. The number of values returned by this function is stored in data_size.
See also
vrna_probing_data_linear(), vrna_probing_data_linear_multi(), vrna_probing_data_linear_num(), vrna_probing_data_linear_weight(), vrna_probing_data_linear_energies()
- Parameters:
data – The data structure storing the probing data
pos – The position of the data set
data_size – A pointer to a variable that receives the number of returned values
- Returns:
A copy of the pos-th probing data set stored in data
-
double *vrna_probing_data_linear_weight(struct vrna_probing_data_s *data, unsigned int pos, unsigned int *data_size)
Get the weights for a probing data set.
#include <ViennaRNA/probing/basic.h>
See also
vrna_probing_data_linear(), vrna_probing_data_linear_multi(), vrna_probing_data_linear_num(), vrna_probing_data_linear_raw(), vrna_probing_data_linear_energies()
- Parameters:
data – The data structure storing the probing data
pos – The position of the weights vector
data_size – A pointer to a variable that receives the number of returned values
- Returns:
A copy of the pos-th weighting vector stored in data
-
double *vrna_probing_data_linear_energies(struct vrna_probing_data_s *data, unsigned int pos, vrna_fold_compound_t *fc, unsigned int target, unsigned int *data_size)
Get pseudo energy contributions from a structure probing data set.
#include <ViennaRNA/probing/basic.h>
This function retrieves a vector of pseudo energies derived from a set of structure probing data stored in data. For that, the function calls the probing data strategy that is associated with data.
See also
vrna_probing_data_linear(), vrna_probing_data_linear_multi(), vrna_probing_data_linear_num(), vrna_probing_data_linear_raw(), vrna_probing_data_linear_weight(), VRNA_PROBING_DATA_LINEAR_TARGET_STACK, VRNA_PROBING_DATA_LINEAR_TARGET_UP, VRNA_PROBING_DATA_LINEAR_TARGET_BP
- Parameters:
data – The data structure storing the probing data
pos – The position of the probing data set
fc – The fold_compound that will be passed through to the conversion strategy
target – The target, i.e. the structure context the pseudo energies should be applied to
data_size – A pointer to a variable that receives the number of returned values
- Returns:
An array of pseudo energies (1-based) or NULL
-
double **vrna_probing_data_load_n_distribute(unsigned int n_seq, unsigned int *ns, const char **sequences, const char **file_names, const int *file_name_association, unsigned int options)
#include <ViennaRNA/probing/basic.h>
-
vrna_probing_data_t vrna_probing_data_Deigan2009(const double *reactivities, unsigned int n, double m, double b)
#include <ViennaRNA/probing/basic.h>
- SWIG Wrapper Notes:
This function exists in two forms, (i) as overloaded function probing_data_Deigan2009() and (ii) as constructor of the probing_data object. For the former, the second argument n can be omitted since the length of the reactivities list is determined from the list itself. When the vrna_probing_data_s constructor is called with the three parameters reactivities, m and b, it will automatically create a prepared data structure for the Deigan et al. 2009 method. See, e.g. RNA.probing_data_Deigan2009() and RNA.probing_data() in the Python API.
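For orientation, the conversion usually attributed to Deigan et al. 2009 maps a reactivity r to a pseudo energy m · ln(r + 1) + b, applied to nucleotides in stacked base pairs. The Python sketch below only illustrates that formula and does not call ViennaRNA. The function name deigan_pseudo_energies() is hypothetical, the values m = 1.8 and b = -0.6 (kcal/mol) are commonly cited defaults used here as assumptions, and treating negative reactivities as missing data with zero contribution is likewise an assumption.

```python
import math


def deigan_pseudo_energies(reactivities, m=1.8, b=-0.6):
    """Convert reactivities into pseudo energies via m * ln(r + 1) + b.

    Hypothetical helper for illustration only. Negative reactivities are
    treated as missing and contribute 0 (an assumption of this sketch).
    """
    energies = []
    for r in reactivities:
        if r < 0.0:
            energies.append(0.0)  # missing value: no contribution
        else:
            energies.append(m * math.log(r + 1.0) + b)
    return energies


# Unreactive position, reactive position, missing value
energies = deigan_pseudo_energies([0.0, 1.0, -1.0])
```

With these parameters, an unreactive position (r = 0) receives a stabilizing contribution of b, and the pseudo energy grows logarithmically with the reactivity, so highly reactive positions are penalized when they are stacked.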
-
vrna_probing_data_t vrna_probing_data_Deigan2009_comparative(const double **reactivities, const unsigned int *n, unsigned int n_seq, double *ms, double *bs, unsigned int multi_params)
#include <ViennaRNA/probing/basic.h>
-
vrna_probing_data_t vrna_probing_data_Zarringhalam2012(const double *reactivities, unsigned int n, double beta, const char *pr_conversion, double pr_default)
#include <ViennaRNA/probing/basic.h>
-
vrna_probing_data_t vrna_probing_data_Zarringhalam2012_comparative(const double **reactivities, unsigned int *n, unsigned int n_seq, double *betas, const char **pr_conversions, double *pr_defaults, unsigned int multi_params)
#include <ViennaRNA/probing/basic.h>
-
vrna_probing_data_t vrna_probing_data_Eddy2014_2(const double *reactivities, unsigned int n, const double *unpaired_data, unsigned int unpaired_len, const double *paired_data, unsigned int paired_len)
#include <ViennaRNA/probing/basic.h>
-
vrna_probing_data_t vrna_probing_data_Eddy2014_2_comparative(const double **reactivities, unsigned int *n, unsigned int n_seq, const double **unpaired_datas, unsigned int *unpaired_lens, const double **paired_datas, unsigned int *paired_lens, unsigned int multi_params)
#include <ViennaRNA/probing/basic.h>