##########
RNAalifold
##########

:program:`RNAalifold` - manual page for RNAalifold 2.7.1

Synopsis
--------

.. code:: bash

    RNAalifold [options] [<input0.aln>] [<input1.aln>]...

DESCRIPTION
-----------

RNAalifold 2.7.1

Calculate secondary structures for a set of aligned RNAs.

Read aligned RNA sequences from stdin or file.aln and calculate their minimum
free energy (mfe) structure, partition function (pf) and base pairing
probability matrix. Currently, input alignments have to be in CLUSTAL,
Stockholm, FASTA, or MAF format. In interactive mode, the input format must be
set manually (default is Clustal); for input files it is determined
automatically unless explicitly set. RNAalifold writes the mfe structure in
bracket notation, its energy, the free energy of the thermodynamic ensemble and
the frequency of the mfe structure in the ensemble to stdout. It also produces
PostScript files with plots of the resulting secondary structure graph
("alirna.ps") and a "dot plot" of the base pairing matrix ("alidot.ps").
The file "alifold.out" will contain a list of likely pairs sorted by
credibility, suitable for viewing with "AliDot.pl". Be warned that output
files will overwrite any existing files of the same name.

.. option:: -h, --help

    Print help and exit

.. option:: --detailed-help

    Print help, including all details and hidden options, and exit

.. option:: --full-help

    Print help, including hidden options, and exit

.. option:: -V, --version

    Print version and exit

.. option:: -v, --verbose

    Be verbose. *(default=off)*


    Lower the log level setting such that even INFO messages are passed through.

.. option:: -q, --quiet

    Be quiet. *(default=off)*


    This option can be used to minimize the output of additional information and
    non-severe warnings which otherwise might spam stdout/stderr.

I/O Options:
^^^^^^^^^^^^



    Command line options for input and output (pre-)processing

.. option:: -f, --input-format=C|S|F|M

    File format of the input multiple sequence alignment (MSA).


    If this parameter is set, the input is considered to be in a particular file
    format. Otherwise, the program tries to determine the file format
    automatically, if an input file was provided in the set of parameters. In
    case the input MSA is provided in interactive mode, or from a terminal (TTY),
    the program's default is to assume CLUSTALW format.
    Currently, the following formats are available: ClustalW (``C``), Stockholm 1.0
    (``S``), FASTA/Pearson (``F``), and MAF (``M``).

.. option:: --mis

    Output "most informative sequence" instead of simple consensus: for each column of the alignment, output the set of nucleotides with frequency greater than average in IUPAC notation.


    *(default=off)*

.. option:: -j, --jobs[=number]

    Split batch input into jobs and start processing in parallel using multiple threads. A value of 0 uses as many parallel threads as there are computation cores available.


    *(default="0")*


    Default processing of input data is performed in a serial fashion, i.e. one
    alignment at a time. Using this switch, a user can instead start the
    computation for many alignments in the input in parallel. RNAalifold will
    create as many parallel computation slots as specified and assign the
    alignments of the input file(s) to the available slots. Note that this
    increases memory consumption, since input alignments have to be kept in memory
    until an empty compute slot is available and each running job requires its
    own dynamic programming matrices.

.. option:: --unordered

    Do not try to keep output in order with input while parallel processing is in place.


    *(default=off)*


    When parallel input processing (:option:`--jobs` flag) is enabled, the order in which
    input is processed depends on the host machine's job scheduler. Therefore, any
    output to stdout or files generated by this program will most likely not
    follow the order of the corresponding input data set. The default of
    RNAalifold is to use a specialized data structure to still keep the results
    output in order with the input data. However, this comes with a trade-off in
    terms of memory consumption, since all output must be kept in memory for as
    long as no chunks of consecutive, ordered output are available. By setting
    this flag, RNAalifold will not buffer individual results but print them as
    soon as they have been computed.

.. option:: --noconv

    Do not automatically substitute nucleotide "T" with "U".


    *(default=off)*

.. option:: -n, --continuous-ids

    Use continuous alignment ID numbering when no alignment ID can be retrieved from input data.


    *(default=off)*


    For historical reasons, RNAalifold produces a specific set of output file
    names for the first input alignment, "alirna.ps", "alidot.ps", etc. But for all
    further alignments in the input, it usually adopts a naming scheme based on
    IDs, which may be retrieved from the input alignment's meta-data, or
    generated by a prefix followed by an increasing counter. Setting this flag
    instructs RNAalifold to use the ID naming scheme also for the first
    alignment.

.. option:: --auto-id

    Automatically generate an ID for each alignment.


    *(default=off)*


    The default mode of RNAalifold is to automatically determine an ID from the
    input alignment if the input file format allows it. Alignment IDs
    are, for instance, usually given in Stockholm 1.0 formatted input. If this
    flag is active, RNAalifold ignores any IDs retrieved from the input and
    automatically generates an ID for each alignment.

.. option:: --id-prefix=STRING

    Prefix for automatically generated IDs (as used in output file names).


    *(default="alignment")*


    If this parameter is set, each alignment will be prefixed with the provided
    string. Hence, the output files will obey the following naming scheme:
    "prefix_xxxx_ss.ps" (secondary structure plot), "prefix_xxxx_dp.ps"
    (dot-plot), "prefix_xxxx_aln.ps" (annotated alignment), etc. where xxxx is
    the alignment number beginning with the second alignment in the input. Use
    this setting in conjunction with the :option:`--continuous-ids` flag to assign IDs
    beginning with the first input alignment.

.. option:: --id-delim=CHAR

    Change the delimiter between prefix and increasing number for automatically generated IDs (as used in output file names).


    *(default="_")*


    This parameter can be used to change the default delimiter ``_`` between the
    prefix string and the increasing number for automatically generated IDs.

.. option:: --id-digits=INT

    Specify the number of digits of the counter in automatically generated alignment IDs.


    *(default="4")*


    When alignment IDs are automatically generated, they receive an increasing
    number, starting with 1. This number will always be left-padded by leading
    zeros, such that the number takes up a certain width. Using this parameter,
    the width can be adjusted to the user's needs. We allow numbers in the range
    [1:18].

.. option:: --id-start=LONG

    Specify the first number in automatically generated alignment IDs.


    *(default="1")*


    When alignment IDs are automatically generated, they receive an increasing
    number, usually starting with 1. Using this parameter, the first number can
    be adjusted to the user's requirements. Note: negative numbers are not
    allowed.
    Note: Setting this parameter implies continuous alignment IDs, i.e. it
    activates the :option:`--continuous-ids` flag.
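
    The interplay of :option:`--id-prefix`, :option:`--id-delim`,
    :option:`--id-digits` and :option:`--id-start` can be illustrated with a
    short Python sketch. This is not RNAalifold's implementation; the helper
    name ``auto_id`` is hypothetical, and only the defaults and limits stated
    in the option descriptions are assumed.

    .. code:: python

        def auto_id(number, prefix="alignment", delim="_", digits=4):
            """Build an ID such as 'alignment_0001' (hypothetical helper)."""
            if not 1 <= digits <= 18:
                raise ValueError("digits must be in the range [1:18]")
            if number < 0:
                raise ValueError("negative numbers are not allowed")
            # Left-pad the counter with zeros up to the requested width.
            return f"{prefix}{delim}{number:0{digits}d}"


        # IDs for three alignments, counting from the default start value 1.
        ids = [auto_id(n) for n in range(1, 4)]
        print(ids)
        # Output file names then follow the "prefix_xxxx_ss.ps" scheme.
        print(ids[0] + "_ss.ps")

    With the defaults, the first generated ID would be ``alignment_0001``.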

.. option:: --filename-delim=CHAR

    Change the delimiting character used in sanitized filenames.


    *(default="ID-delimiter")*


    This parameter can be used to change the delimiting character used while
    sanitizing filenames, i.e. replacing invalid characters. Note that the
    default delimiter ALWAYS is the first character of the "ID delimiter" as
    supplied through the :option:`--id-delim` option. If the delimiter is a whitespace
    character or empty, invalid characters will simply be removed rather than
    substituted. Currently, we regard the following characters as illegal for use
    in filenames: backslash ``\``, slash ``/``, question mark ``?``, percent sign ``%``,
    asterisk ``*``, colon ``:``, pipe symbol ``|``, double quote ``"``, angle
    brackets ``<`` and ``>``.
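
    The substitution rule can be sketched in Python as follows. The function
    ``sanitize_filename`` is a hypothetical illustration of the behavior
    described above, not the code used by RNAalifold; in particular, using
    only the first character of a longer delimiter is an assumption.

    .. code:: python

        # Characters listed above as illegal in filenames.
        ILLEGAL = set('\\/?%*:|"<>')


        def sanitize_filename(name, delim="_"):
            """Replace illegal characters; drop them if delim is blank."""
            # Assumption: only the first character of the delimiter is used.
            repl = delim[:1]
            if repl.isspace():
                repl = ""
            return "".join(repl if ch in ILLEGAL else ch for ch in name)


        print(sanitize_filename("tRNA/seed:1"))        # tRNA_seed_1
        print(sanitize_filename("tRNA/seed:1", " "))   # tRNAseed1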

.. option:: --log-level=level

    Set log level threshold. *(default="2")*


    By default, any log messages are filtered such that only warnings (level 2)
    or errors (level 3) are printed. This setting allows for specifying the log
    level threshold, where higher values result in less information. Log-level 5
    turns off all messages, even errors and other critical information.

.. option:: --log-file[=filename]

    Print log messages to a file instead of stderr. *(default="RNAalifold.log")*

.. option:: --log-time

    Include time stamp in log messages.


    *(default=off)*

.. option:: --log-call

    Include file and line of log calling function.


    *(default=off)*

Algorithms:
^^^^^^^^^^^



    Select additional algorithms which should be included in the calculations.

.. option:: -p, --partfunc[=INT]

    Calculate the partition function and base pairing probability matrix in addition to the mfe structure. Default is calculation of mfe structure only.


    *(default="1")*


    In addition to the MFE structure we print a coarse representation of the pair
    probabilities in form of a pseudo bracket notation, followed by the ensemble
    free energy, as well as the centroid structure derived from the pair
    probabilities together with its free energy and distance to the ensemble.
    Finally it prints the frequency of the mfe structure.


    An optional value passed to this option changes the behavior of partition
    function calculation:
    :option:`-p0` deactivates the calculation of the pair probabilities, saving about 50%
    in runtime. This prints the ensemble free energy ``dG=-kT ln(Z)``.

.. option:: --betaScale=DOUBLE

    Set the scaling of the Boltzmann factors. *(default="1.")*


    The argument provided with this option is used to scale the thermodynamic
    temperature in the Boltzmann factors independently from the temperature of
    the individual loop energy contributions. The Boltzmann factors then become
    ``exp(- dG/(kTn*betaScale))`` where ``k`` is the Boltzmann constant, ``dG`` the
    free energy contribution of the state, ``T`` the absolute temperature and ``n``
    the number of sequences.
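
    As a numerical illustration of the expression above, the following Python
    sketch evaluates a single Boltzmann factor. The function name and the
    value used for ``k`` (the gas constant in kcal/(mol*K), since energies are
    given per mole) are assumptions made for this example.

    .. code:: python

        import math

        GAS_CONSTANT = 1.98717e-3  # kcal/(mol*K); assumed value for k


        def boltzmann_factor(dG, temp_celsius=37.0, n_seq=1, beta_scale=1.0):
            """Evaluate exp(-dG / (k*T*n*betaScale)) with dG in kcal/mol."""
            kT = GAS_CONSTANT * (temp_celsius + 273.15)
            return math.exp(-dG / (kT * n_seq * beta_scale))


        # A stabilizing contribution of -3 kcal/mol in a 3-sequence alignment.
        w_default = boltzmann_factor(-3.0, n_seq=3)
        w_scaled = boltzmann_factor(-3.0, n_seq=3, beta_scale=1.2)
        print(w_default, w_scaled)

    A scaling factor above 1 acts like a higher temperature in the Boltzmann
    factors only, so the weight of a stabilizing contribution shrinks.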

.. option:: -S, --pfScale=DOUBLE

    In the calculation of the pf use scale*mfe as an estimate for the ensemble free energy (used to avoid overflows).


    *(default="1.07")*


    The default is 1.07, useful values are 1.0 to 1.2. Occasionally needed for
    long sequences.

.. option:: --MEA[=gamma]

    Compute MEA (maximum expected accuracy) structure.


    *(default="1.")*


    The expected accuracy is computed from the pair probabilities: each base pair
    ``(i,j)`` receives a score ``2*gamma*p_ij`` and the score of an unpaired base is
    given by the probability of not forming a pair. The parameter gamma tunes the
    importance of correctly predicted pairs versus unpaired bases. Thus, for
    small values of gamma the MEA structure will contain only pairs with very
    high probability. Using :option:`--MEA` implies :option:`-p` for computing the pair
    probabilities.
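
    The scoring scheme can be made concrete with a small Python sketch that
    evaluates the expected accuracy of a given structure. It only scores a
    candidate structure and does not perform the optimization itself; the
    function name and the toy probabilities are hypothetical.

    .. code:: python

        def expected_accuracy(pairs, probs, length, gamma=1.0):
            """Score a structure given as a list of 1-based (i, j) pairs.

            probs maps (i, j) with i < j to the pair probability p_ij.
            """
            # Probability that position k is paired: sum over all its pairs.
            p_paired = [0.0] * (length + 1)
            for (i, j), p in probs.items():
                p_paired[i] += p
                p_paired[j] += p

            in_pair = {k for pair in pairs for k in pair}
            # Each base pair contributes 2*gamma*p_ij ...
            score = sum(2 * gamma * probs.get(pair, 0.0) for pair in pairs)
            # ... and each unpaired base its probability of being unpaired.
            score += sum(1.0 - p_paired[k]
                         for k in range(1, length + 1) if k not in in_pair)
            return score


        probs = {(1, 6): 0.9, (2, 5): 0.4}
        print(expected_accuracy([(1, 6)], probs, length=6))
        print(expected_accuracy([(1, 6), (2, 5)], probs, length=6))

    In this toy case, adding the low-probability pair (2, 5) lowers the score
    at ``gamma=1``, while a larger ``gamma`` would favor including it.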

.. option:: --sci

    Compute the structure conservation index (SCI) for the MFE consensus structure of the alignment.


    *(default=off)*

.. option:: -c, --circ

    Assume a circular (instead of linear) RNA molecule.


    *(default=off)*

.. option:: --bppmThreshold=cutoff

    Set the threshold/cutoff for base pair probabilities included in the postscript output.


    *(default="1e-6")*


    By setting the threshold the base pair probabilities that are included in the
    output can be varied. By default only those exceeding ``1e-6`` in probability
    will be shown as squares in the dot plot. A lower threshold includes more
    base pairs in the output, a higher threshold fewer.

.. option:: -g, --gquad

    Incorporate G-Quadruplex formation into the structure prediction algorithm.


    *(default=off)*

.. option:: -s, --stochBT=INT

    Stochastic backtrack. Compute a certain number of random structures with a probability dependent on the partition function. See :option:`-p` option in RNAsubopt.

.. option:: --stochBT_en=INT

    Same as :option:`-s` option but also print out the energies and probabilities of the backtracked structures.

.. option:: -N, --nonRedundant

    Enable non-redundant sampling strategy.


    *(default=off)*

.. option:: --random-seed=INT

    Set the seed for the random number generator

Structure Constraints:
^^^^^^^^^^^^^^^^^^^^^^



    Command line options to interact with the structure constraints feature of
    this program

.. option:: --maxBPspan=INT

    Set the maximum base pair span.


    *(default="-1")*

.. option:: -C, --constraint[=filename]

    Calculate structures subject to constraints. The constraining structure will be read from ``stdin``, the alignment has to be given as a file name on the command line.


    *(default="")*


    The program first reads the sequence, then a string containing constraints on
    the structure encoded with the symbols:


    ``.`` (no constraint for this base)


    ``|`` (the corresponding base has to be paired)


    ``x`` (the base is unpaired)


    ``<`` (base i is paired with a base j>i)


    ``>`` (base i is paired with a base j<i)


    and matching brackets ``(`` ``)`` (base i pairs base j)


    With the exception of ``|``, constraints will disallow all pairs conflicting
    with the constraint. This is usually sufficient to enforce the constraint,
    but occasionally a base may stay unpaired in spite of constraints. PF folding
    ignores constraints of type ``|``.
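
    Before passing a constraint string to the program, it can be useful to
    check that it only uses the symbols listed above and that its brackets are
    balanced. The following Python sketch does this; the function
    ``check_constraint`` is a hypothetical helper and not part of RNAalifold.

    .. code:: python

        ALLOWED = set(".|x<>()")


        def check_constraint(constraint):
            """Return the 1-based (i, j) pairs requested by matching brackets."""
            unknown = set(constraint) - ALLOWED
            if unknown:
                raise ValueError(f"unsupported symbols: {sorted(unknown)}")

            stack, pairs = [], []
            for pos, symbol in enumerate(constraint, start=1):
                if symbol == "(":
                    stack.append(pos)
                elif symbol == ")":
                    if not stack:
                        raise ValueError(f"unmatched ')' at position {pos}")
                    pairs.append((stack.pop(), pos))
            if stack:
                raise ValueError(f"unmatched '(' at position {stack[-1]}")
            return sorted(pairs)


        print(check_constraint("((..x..)).<..>|"))   # [(1, 9), (2, 8)]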

.. option:: --batch

    Use constraints for all alignment records. *(default=off)*


    Usually, constraints provided from an input file are only applied to a single
    sequence alignment. Therefore, RNAalifold will stop its computation and quit
    after the first input alignment has been processed. Using this switch, RNAalifold
    processes all sequence alignments in the input and applies the same provided
    constraints to each of them.

.. option:: --enforceConstraint

    Enforce base pairs given by round brackets ``(`` ``)`` in structure constraint.


    *(default=off)*

.. option:: --SS_cons

    Use consensus structures from Stockholm file (``#=GC SS_cons``) as constraint.


    *(default=off)*


    Stockholm formatted alignment files have the possibility to store a secondary
    structure string in one of their (``#=GC``) column annotation meta tags. The
    corresponding tag name is usually ``SS_cons``, a consensus secondary structure.
    Activating this flag allows one to use this consensus secondary structure
    from the input file as structure constraint. Currently, only the following
    characters are interpreted:


    ``(`` ``)`` [matching parentheses: column i pairs with column j]


    ``<`` ``>`` [matching angle brackets: column i pairs with column j]


    All other characters are not interpreted (yet).
    Note: Activating this flag implies :option:`--constraint`.
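
    To preview which pairs such an annotation would request, the consensus
    string can be reduced to plain dot-bracket notation. The Python sketch
    below is an illustration under assumptions: the helper name is
    hypothetical, and mapping all uninterpreted characters to ``.`` is a
    choice made here for display, not a statement about RNAalifold internals.

    .. code:: python

        def ss_cons_to_constraint(stockholm_text):
            """Map a '#=GC SS_cons' annotation to a dot-bracket string."""
            annotation = ""
            for line in stockholm_text.splitlines():
                fields = line.split()
                # Alignments may be split into blocks; concatenate the pieces.
                if len(fields) == 3 and fields[:2] == ["#=GC", "SS_cons"]:
                    annotation += fields[2]

            mapping = {"(": "(", ")": ")", "<": "(", ">": ")"}
            # Characters that are not interpreted are shown as "." here.
            return "".join(mapping.get(ch, ".") for ch in annotation)


        example = """# STOCKHOLM 1.0
        seq1          GGGAAACCC
        seq2          GGCAAAGCC
        #=GC SS_cons  <<<___>>>
        //
        """
        print(ss_cons_to_constraint(example))   # (((...)))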

.. option:: --shape=file1,file2

    Use SHAPE reactivity data to guide structure predictions.


    Multiple shape files for the individual sequences in the alignment may be
    specified as a comma separated list. An optional association of particular
    shape files to a specific sequence in the alignment can be expressed by
    prepending the sequence number to the filename, e.g.
    "5=seq5.shape,3=seq3.shape" will assign the reactivity values from file
    seq5.shape to the fifth sequence in the alignment, and the values from file
    seq3.shape to sequence 3. If no assignment is specified, the reactivity
    values are assigned to corresponding sequences in the order they are given.
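
    The two ways of associating files with sequences can be sketched in
    Python. The function ``assign_shape_files`` is hypothetical and only
    mirrors the description above; how RNAalifold treats a mix of numbered
    and unnumbered entries is not covered here.

    .. code:: python

        def assign_shape_files(argument):
            """Map 1-based sequence numbers to SHAPE file names."""
            assignment = {}
            for position, item in enumerate(argument.split(","), start=1):
                number, sep, filename = item.partition("=")
                if sep and number.isdigit():
                    # Explicit association such as "5=seq5.shape".
                    assignment[int(number)] = filename
                else:
                    # No number given: assign by order of appearance.
                    assignment[position] = item
            return assignment


        print(assign_shape_files("5=seq5.shape,3=seq3.shape"))
        print(assign_shape_files("a.shape,b.shape"))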

.. option:: --shapeMethod=D[mX][bY]

    Specify the method how to convert SHAPE reactivity data to pseudo energy contributions.


    *(default="D")*


    Currently, the only data conversion method available is that of Deigan et
    al. 2009. This method is the default and is recognized by a capital ``D`` in
    the provided parameter, i.e. :option:`--shapeMethod=`"D" is the default setting.
    The slope ``m`` and the intercept ``b`` can be set to a non-default value if
    necessary. Otherwise m=1.8 and b=-0.6, as stated in the paper mentioned
    before. To alter these parameters, e.g. m=1.9 and b=-0.7, use a parameter
    string like this: :option:`--shapeMethod=`"Dm1.9b-0.7". You may also provide only one
    of the two parameters, like :option:`--shapeMethod=`"Dm1.9" or
    :option:`--shapeMethod=`"Db-0.7".
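
    The following Python sketch shows how such a parameter string could be
    decoded and how slope and intercept enter the conversion. The pseudo
    energy formula ``m * ln(reactivity + 1) + b`` is the one published by
    Deigan et al.; the function names and the parsing code are assumptions
    for illustration and do not reproduce RNAalifold's own parser.

    .. code:: python

        import math
        import re


        def parse_shape_method(method="D"):
            """Return (m, b) for 'D', 'Dm1.9', 'Db-0.7' or 'Dm1.9b-0.7'."""
            number = r"[-+]?\d+(?:\.\d*)?"
            pattern = rf"D(?:m({number}))?(?:b({number}))?"
            match = re.fullmatch(pattern, method)
            if match is None:
                raise ValueError(f"unsupported method string: {method!r}")
            m, b = match.groups()
            # Fall back to the documented defaults m=1.8 and b=-0.6.
            return (float(m) if m else 1.8, float(b) if b else -0.6)


        def pseudo_energy(reactivity, m=1.8, b=-0.6):
            """Deigan et al. (2009): m * ln(reactivity + 1) + b."""
            return m * math.log(reactivity + 1.0) + b


        m, b = parse_shape_method("Dm1.9b-0.7")
        print(m, b, pseudo_energy(0.0, m, b), pseudo_energy(1.0, m, b))

    A reactivity of 0 yields exactly the intercept ``b``, and higher
    reactivities yield increasingly unfavorable pseudo energies.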

Energy Parameters:
^^^^^^^^^^^^^^^^^^



    Energy parameter sets can be adapted or loaded from user-provided input files

.. option:: -T, --temp=DOUBLE

    Rescale energy parameters to a temperature of temp C. Default is 37C.


    *(default="37.0")*

.. option:: -P, --paramFile=paramfile

    Read energy parameters from paramfile, instead of using the default parameter set.


    Different sets of energy parameters for RNA and DNA should accompany your
    distribution.
    See the RNAlib documentation for details on the file format. The placeholder
    file name ``DNA`` can be used to load DNA parameters without the need to
    actually specify any input file.

.. option:: -4, --noTetra

    Do not include special tabulated stabilizing energies for tri-, tetra- and hexaloop hairpins.


    *(default=off)*


    Mostly for testing.

.. option:: --salt=DOUBLE

    Set salt concentration in molar (M). Default is 1.021M.

Model Details:
^^^^^^^^^^^^^^



    Tweak the energy model and pairing rules additionally using the following
    parameters

.. option:: -d, --dangles=INT

    How to treat "dangling end" energies for bases adjacent to helices in free ends and multi-loops.


    *(default="2")*


    With :option:`-d2`, dangling energies will be added for the bases adjacent to a helix on
    both sides in any case.


    The option :option:`-d0` ignores dangling ends altogether (mostly for debugging).

.. option:: --noLP

    Produce structures without lonely pairs (helices of length 1).


    *(default=off)*


    For partition function folding this only disallows pairs that can only occur
    isolated. Other pairs may still occasionally occur as helices of length 1.

.. option:: --noGU

    Do not allow GU pairs.


    *(default=off)*

.. option:: --noClosingGU

    Do not allow GU pairs at the end of helices.


    *(default=off)*

.. option:: --cfactor=DOUBLE

    Set the weight of the covariance term in the energy function


    *(default="1.0")*

.. option:: --nfactor=DOUBLE

    Set the penalty for non-compatible sequences in the covariance term of the energy function


    *(default="1.0")*

.. option:: -E, --endgaps

    Score pairs with endgaps same as gap-gap pairs.


    *(default=off)*

.. option:: -R, --ribosum_file=ribosumfile

    Use the specified Ribosum matrix instead of the normal energy model.


    The matrices should be 6x6, with the terms in the order ``AU``, ``CG``,
    ``GC``, ``GU``, ``UA``, ``UG``.

.. option:: -r, --ribosum_scoring

    Use ribosum scoring matrix. *(default=off)*


    The matrix is chosen according to the minimal and maximal pairwise identities
    of the sequences in the file.

.. option:: --old

    Use old energy evaluation, treating gaps as characters.


    *(default=off)*

.. option:: --nsp=STRING

    Allow other pairs in addition to the usual AU, GC, and GU pairs.


    Its argument is a comma separated list of additionally allowed pairs. If the
    first character is a "-" then AB will imply that AB and BA are allowed
    pairs, e.g. :option:`--nsp=`"-GA" will allow GA and AG pairs. Nonstandard pairs are
    given 0 stacking energy.
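
    How the argument expands into a set of allowed pairs can be sketched in
    Python. The helper ``parse_nsp`` is hypothetical; it assumes that a
    leading "-" applies to every entry of the list, which the description
    above does not state explicitly.

    .. code:: python

        def parse_nsp(argument):
            """Return the set of additionally allowed ordered pairs."""
            symmetric = argument.startswith("-")
            if symmetric:
                argument = argument[1:]
            pairs = set()
            for item in argument.split(","):
                if len(item) != 2:
                    raise ValueError(f"expected a two-letter pair: {item!r}")
                pairs.add((item[0], item[1]))
                if symmetric:
                    # "-AB" allows BA in addition to AB.
                    pairs.add((item[1], item[0]))
            return pairs


        print(sorted(parse_nsp("-GA")))     # [('A', 'G'), ('G', 'A')]
        print(sorted(parse_nsp("GA,CU")))   # [('C', 'U'), ('G', 'A')]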

.. option:: --energyModel=INT

    Set energy model.


    Rarely used option to fold sequences from the artificial ABCD... alphabet,
    where A pairs B, C-D etc.  Use the energy parameters for GC (:option:`--energyModel` 1)
    or AU (:option:`--energyModel` 2) pairs.

.. option:: --helical-rise=FLOAT

    Set the helical rise of the helix in units of Angstrom.


    *(default="2.8")*


    Use with caution! This value will be re-set automatically to 3.4 in case DNA
    parameters are loaded via :option:`-P` DNA and no further value is provided.

.. option:: --backbone-length=FLOAT

    Set the average backbone length for looped regions in units of Angstrom.


    *(default="6.0")*


    Use with caution! This value will be re-set automatically to 6.76 in case DNA
    parameters are loaded via :option:`-P` DNA and no further value is provided.

Plotting:
^^^^^^^^^



    Command line options for changing the default behavior of structure layout
    and pairing probability plots

.. option:: --color

    Produce a colored version of the consensus structure plot "alirna.ps" (default b&w only)


    *(default=off)*

.. option:: --color-threshold=FLOAT

    Set the threshold of maximum counter examples for coloring consensus structure plot.


    *(default="2")*


    Floating point numbers between 0 and 1 are treated as frequencies among all
    sequences in the alignment. All other values will be truncated to integer and
    used as the absolute number of counter examples.

.. option:: --color-min-sat=FLOAT

    Set the minimum saturation for coloring consensus structure plot.


    *(default="0.2")*


    Floating point number >= 0 and smaller than 1.

.. option:: --aln

    Produce a colored and structure annotated alignment in PostScript format in the file "aln.ps" in the current directory.


    *(default=off)*

.. option:: --aln-EPS-cols=INT

    Number of columns in colored EPS alignment output.


    *(default="60")*


    A value less than 1 indicates that the output should not be wrapped at all.

.. option:: --aln-stk[=prefix]

    Create a multi-Stockholm formatted output file. *(default="RNAalifold_results")*


    The default file name used for the output is "RNAalifold_results.stk".
    Users may change the filename to "prefix.stk" by specifying the prefix as
    optional argument. The file will be created in the current directory if it
    does not already exist. In case the file already exists, output will be
    appended to it. Note: Any special characters in the filename will be replaced
    by the filename delimiter, hence there is no way to pass an entire directory
    path through this option yet. (See also the :option:`--filename-delim` parameter.)

.. option:: --noPS

    Do not produce postscript drawing of the mfe structure.


    *(default=off)*

.. option:: --noDP

    Do not produce dot-plot postscript file containing base pair or stack probabilities.


    *(default=off)*


    In combination with the :option:`-p` option, this flag turns off creation of individual
    dot-plot files. Consequently, computed base pair probability output is
    omitted but centroid and MEA structure prediction is still performed.

.. option:: -t, --layout-type=INT

    Choose the layout algorithm. *(default="1")*


    Select the layout algorithm that computes the nucleotide coordinates.
    Currently, the following algorithms are available:


    ``0``: simple radial layout


    ``1``: Naview layout (Bruccoleri et al. 1988)


    ``2``: circular layout


    ``3``: RNAturtle (Wiegreffe et al. 2018)


    ``4``: RNApuzzler (Wiegreffe et al. 2018)

Caveats:

Sequences are not weighted. If possible, do not mix very similar and dissimilar
sequences. Duplicate sequences, for example, can distort the prediction.

REFERENCES
----------

*If you use this program in your work you might want to cite:*

R. Lorenz, S.H. Bernhart, C. Hoener zu Siederdissen, H. Tafer, C. Flamm, P.F. Stadler and I.L. Hofacker (2011),
"ViennaRNA Package 2.0",
Algorithms for Molecular Biology: 6:26

I.L. Hofacker, W. Fontana, P.F. Stadler, S. Bonhoeffer, M. Tacker, P. Schuster (1994),
"Fast Folding and Comparison of RNA Secondary Structures",
Monatshefte f. Chemie: 125, pp 167-188

R. Lorenz, I.L. Hofacker, P.F. Stadler (2016),
"RNA folding with hard and soft constraints",
Algorithms for Molecular Biology 11:1 pp 1-13

The algorithm is a variant of the dynamic programming algorithms of M. Zuker and P. Stiegler (mfe)
and J.S. McCaskill (pf) adapted for sets of aligned sequences with covariance information.

Ivo L. Hofacker, Martin Fekete, and Peter F. Stadler (2002),
"Secondary Structure Prediction for Aligned RNA Sequences",
J.Mol.Biol.: 319, pp 1059-1066.

Stephan H. Bernhart, Ivo L. Hofacker, Sebastian Will, Andreas R. Gruber, and Peter F. Stadler (2008),
"RNAalifold: Improved consensus structure prediction for RNA alignments",
BMC Bioinformatics: 9, pp 474


*The energy parameters are taken from:*

D.H. Mathews, M.D. Disney, J.L. Childs, S.J. Schroeder, M. Zuker, D.H. Turner (2004),
"Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure",
Proc. Natl. Acad. Sci. USA: 101, pp 7287-7292

D.H. Turner, D.H. Mathews (2009),
"NNDB: The nearest neighbor parameter database for predicting stability of nucleic acid secondary structure",
Nucleic Acids Research: 38, pp 280-282

EXAMPLES
--------


A simple call to compute consensus MFE structure, ensemble free energy,
base pair probabilities, centroid structure, and MEA structure for a
multiple sequence alignment (MSA) provided as Stockholm formatted file
alignment.stk might look like:

.. code::

    $ RNAalifold -p --MEA alignment.stk
    


Consider the following MSA file for three sequences:

.. code::

    # STOCKHOLM 1.0
    
    #=GF AC   RF01293
    #=GF ID   ACA59
    #=GF DE   Small nucleolar RNA ACA59
    #=GF AU   Wilkinson A
    #=GF SE   Predicted; WAR; Wilkinson A
    #=GF SS   Predicted; WAR; Wilkinson A
    #=GF GA   43.00
    #=GF TC   44.90
    #=GF NC   40.30
    #=GF TP   Gene; snRNA; snoRNA; HACA-box;
    #=GF BM   cmbuild -F CM SEED
    #=GF CB   cmcalibrate --mpi CM
    #=GF SM   cmsearch --cpu 4 --verbose --nohmmonly -E 1000 -Z 549862.597050 CM SEQDB
    #=GF DR   snoRNABase; ACA59;
    #=GF DR   SO; 0001263; ncRNA_gene;
    #=GF DR   GO; 0006396; RNA processing;
    #=GF DR   GO; 0005730; nucleolus;
    #=GF RN   [1]
    #=GF RM   15199136
    #=GF RT   Human box H/ACA pseudouridylation guide RNA machinery.
    #=GF RA   Kiss AM, Jady BE, Bertrand E, Kiss T
    #=GF RL   Mol Cell Biol. 2004;24:5797-5807.
    #=GF WK   Small_nucleolar_RNA
    #=GF SQ   3
    
    
    AL031296.1/85969-86120     CUGCCUCACAACGUUUGUGCCUCAGUUACCCGUAGAUGUAGUGAGGGUAACAAUACUUACUCUCGUUGGUGAUAAGGAACAGCU
    AANU01225121.1/438-603     CUGCCUCACAACAUUUGUGCCUCAGUUACUCAUAGAUGUAGUGAGGGUGACAAUACUUACUCUCGUUGGUGAUAAGGAACAGCU
    AAWR02037329.1/29294-29150 ---CUCGACACCACU---GCCUCGGUUACCCAUCGGUGCAGUGCGGGUAGUAGUACCAAUGCUAAUUAGUUGUGAGGACCAACU
    #=GC SS_cons               -----((((,<<<<<<<<<___________>>>>>>>>>,,,,<<<<<<<______>>>>>>>,,,,,))))::::::::::::
    #=GC RF                    CUGCcccaCAaCacuuguGCCUCaGUUACcCauagguGuAGUGaGgGuggcAaUACccaCcCucgUUgGuggUaAGGAaCAgCU
    //
    



Then, the above program call will produce this output:

.. code::

    3 sequences; length of alignment 84.
    >ACA59
    CUGCCUCACAACAUUUGUGCCUCAGUUACCCAUAGAUGUAGUGAGGGUAACAAUACUUACUCUCGUUGGUGAUAAGGAACAGCU
    ...((((((.(((((((((...........))))))))).))))))..........(((((......)))))............ (-12.54 = -12.77 +   0.23)
    ...((((((.(((((((((...........))))))))).)))))){{,.......{{{{,......}))))............ [-14.38]
    ...((((((.(((((((((...........))))))))).))))))..........((((........))))............ {-12.44 = -12.33 +  -0.10 d=10.94}
    ...((((((.(((((((((...........))))))))).))))))..........((((........))))............ {-12.44 = -12.33 +  -0.10 MEA=66.65}
    frequency of mfe structure in ensemble 0.368739; ensemble diversity 17.77
    


Here, the first line is written to *stderr* and simply states the number of sequences and
the length of the alignment. This line can be suppressed using the :option:`--quiet` option.
The main output then consists of 7 lines. The first two resemble a FASTA record: a header
with the ID as read from the input data set, followed by the consensus sequence in the
second line. The third line consists of the consensus secondary structure in dot-bracket
notation followed by the averaged minimum free energy in parentheses. This energy is
composed of two major contributions, the actual free energies derived from the Nearest
Neighbor model, and the covariance pseudo-energy term, which are both displayed after
the equal sign. The fourth line shows the base pair propensity in pseudo dot-bracket
notation followed by the ensemble free energy dG = -kT ln(Z) in square brackets.
Similarly, the next two lines state the centroid and the MEA structure in dot-bracket
notation, followed by their corresponding free energy contributions, the mean distance
(d) to the ensemble as well as the maximum expected accuracy (MEA). Again, the free
energies are split into Nearest Neighbor contribution and the covariance pseudo-energy
term. The last line reports the frequency of the MFE structure in the ensemble and the
ensemble diversity.
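
For downstream processing, the MFE line of this output can be split into its
parts with a regular expression. The Python sketch below is one possible way
to do this, assuming the line layout shown in the example; the names
``ENERGY`` and ``parse_mfe_line`` are hypothetical.

.. code:: python

    import re

    # structure, then "(total = nearest-neighbor + covariance)"
    ENERGY = re.compile(
        r"^(?P<structure>\S+)\s+\(\s*(?P<total>-?\d+\.\d+)\s*=\s*"
        r"(?P<nn>-?\d+\.\d+)\s*\+\s*(?P<cov>-?\d+\.\d+)\s*\)$"
    )


    def parse_mfe_line(line):
        """Split the MFE line into structure and its energy terms."""
        match = ENERGY.match(line.strip())
        if match is None:
            raise ValueError("not an MFE line")
        values = {k: float(match[k]) for k in ("total", "nn", "cov")}
        return match["structure"], values


    line = ("...((((((.(((((((((...........))))))))).))))))"
            "..........(((((......)))))............"
            " (-12.54 = -12.77 +   0.23)")
    structure, energy = parse_mfe_line(line)
    print(energy)

The two terms should add up to the total, apart from rounding to two decimals.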

Furthermore, RNAalifold will produce three output files: ACA59_ss.ps, ACA59_dp.ps, and
ACA59_ali.out that contain the secondary structure drawing, the base pair probability
dot-plot, and a detailed table of base pair probabilities, respectively.



THE ALIOUT FILE
---------------


When computing base pair probabilities (:option:`--partfunc` option), RNAalifold will produce
a file with the suffix ``ali.out``. This file contains the base pairing probabilities between
different alignment columns together with some detailed statistics for the individual
sequences within the alignment. The file is a simple text file with a two-line header that
states the number of sequences and length of the alignment. The first few lines
of this file may look like:

.. code::

    3 sequence; length of alignment 84
    alifold output
    14    36  0  92.7%   0.212 CG:1    UA:2
    13    37  0  92.7%   0.218 GU:1    AU:2
    12    38  0  92.7%   0.231 CG:3
    15    35  0  91.9%   0.239 UG:3
    16    34  0  85.2%   0.434 UA:2    --:1
    8    42  0  80.7%   0.526 AU:3   +
    9    41  0  80.4%   0.542 CG:3   +
    7    43  1  80.1%   0.541 CG:2   +
    


Starting with the third row, there are at least six and at most 13 columns separated by
whitespace stating: (1) the i-position and (2) the j-position of a potential base pair
(i, j), followed by (3) the number of counter examples, i.e. the number of sequences in
the alignment that can't form a canonical base pair with their respective sequence positions.
Next is (4) the base pair probability in percent, and (5) a pseudo entropy measure
S_ij = S_i + S_j - p_ij ln(p_ij), where S_i and S_j are the positional entropies for the
two alignment columns i and j, and p_ij is the base pair probability. The following
columns (6-12) state the number of particular base pairs for the individual sequences in
the alignment. Here, we distinguish the base pairs "GC", "CG", "AU", "UA", "GU", "UG", and
the special case "--" that represents gaps at both positions i and j.
Finally, base pairs that are not part of the MFE structure are marked by an additional
"+" sign in the last column.
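
A data row of this file can be read with a few lines of Python. The sketch
below assumes the whitespace-separated layout of the example above; the
function ``parse_aliout_line`` and the dictionary keys are hypothetical names
chosen for this illustration.

.. code:: python

    def parse_aliout_line(line):
        """Parse one data row of an ali.out file into a dictionary."""
        fields = line.split()
        # A trailing "+" is the marker column, not a base pair count.
        marked = fields[-1] == "+"
        if marked:
            fields = fields[:-1]
        counts = {}
        for item in fields[5:]:
            pair_type, _, count = item.partition(":")
            counts[pair_type] = int(count)
        return {
            "i": int(fields[0]),
            "j": int(fields[1]),
            "counter_examples": int(fields[2]),
            "probability": float(fields[3].rstrip("%")) / 100.0,
            "pseudo_entropy": float(fields[4]),
            "pair_counts": counts,
            "marked": marked,
        }


    row = parse_aliout_line("16    34  0  85.2%   0.434 UA:2    --:1")
    print(row["i"], row["j"], row["pair_counts"])

The two header lines of the file have to be skipped before applying the
function to the remaining rows.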

AUTHOR
------


Ivo L Hofacker, Stephan Bernhart, Ronny Lorenz

REPORTING BUGS
--------------


If in doubt our program is right, nature is at fault.
Comments should be sent to rna@tbi.univie.ac.at.

SEE ALSO
--------


The ALIDOT package http://www.tbi.univie.ac.at/RNA/Alidot/