Help

IncaRNAfbinv offers an interactive environment for the inverse folding of RNA using a fragment-based design approach.
The algorithm implemented in our web server significantly extends and combines two complementary methods: RNAfbinv (Weinbrand et al., Bioinformatics 2013, 29(22): 2938-2940) and incaRNAtion (Reinharz et al., Bioinformatics 2013, 29(13): i308-i315).

The server receives the desired secondary structure in dot-bracket notation, along with additional parameters that let the user control specific aspects of the design. The maximum length allowed is 500 bases. The output includes the designed sequences and additional information such as the structural distance to the input, minimum free energy (based on the Turner 2004 energy model), neutrality, and more.

Input

  • Job name:

    For personal use. The job name can be used later to search for earlier results (up to 1 week old). This parameter is optional.
  • e-mail:

    Upon submission of the query form, an e-mail that includes a link to the results page will be sent to the given address. A second e-mail will be sent when the calculation is done and the results are ready for review. Entering your e-mail is optional but strongly recommended for requests that include mutational robustness or ask for a large number of designed sequences.
  • Target structure:

    A structure pattern in dot-bracket notation (pseudoknots are not supported). The legal characters are '.' to mark an unpaired base, '(' to mark the first base in a base pair and ')' to mark the second base in a base pair ('<' and '>' may be used instead of '(' and ')', respectively).
    Example:
    ((((((((...(.(((((.......))))).)........((((((.......))))))..))))))))
    Structure of the Guanine-binding riboswitch aptamer (Kim and Breaker, Biol. Cell, 2008).
  • Target sequence:

    A sequence pattern based on IUPAC sequence notation (not including 'x' and '-'). The sequence constraint is optional; if left empty, it is replaced with a run of 'N' characters as long as the structure. If used, the sequence constraint must have the same length as the target structure. Result sequences must fit this sequence pattern. The pattern is rigid: each constraint applies to the exact position (index) at which it is written.

    Example:

    The following sequence was constructed to match the structure above:
    NNNNNNNNUNNNNNNNNNNNNNNNNNNNNNNNNUNNNUNNNNNNNNNNNNNNNNNNNNNNYNNNNNNNN
    Positions whose sequence is conserved are constrained. Specifically, these are the nucleotides that interact with the purine ligand.
  • Target Energy: (Advanced Option)

    Designed sequences will aim to fit the given minimum free energy. The calculation is done using RNAfold from the Vienna RNA Package with the Turner 2004 energy model. Target energy is an optional input.
  • Target Mutational robustness: (Advanced Option)

    Designed sequences will aim to fit the given neutrality value, in the range [0,1]. Mutational robustness measures the base pair distance between the fold of the current sequence and the folds of all sequences that are a single point mutation away. This means that at every iteration, to calculate this value, RNAfbinv must fold 3 * length(sequence) sequences. Using this option slows the calculation down significantly, so it is limited to at most 300 iterations and 50 output sequences.
  • Simulated Annealing Iterations: (Advanced Option)

    The number of simulated annealing iterations performed by RNAfbinv. The default is 1000.
  • Motif constraints:

    Allows the user to select a single motif from the structure that will have a greater chance of appearing in the final result. The list of motifs is filled in once a legal structure is entered, alongside an image of the structure generated by VARNA (Visualization Applet for RNA).
  • Seed generation method:

    Any RNAfbinv run can start from a seed. The following methods are supported by the web server.
    • incaRNAtion
      Unlike the original RNAfbinv, incaRNAtion uses a global search strategy. The adaptive sampling approach simply generates sets of sequences by repeatedly running the stochastic backtrack algorithm. incaRNAtion also allows the user to set a desired GC content distribution for the designed sequences. Starting from incaRNAtion seeds allows RNAfbinv to reach the target structure in fewer iterations and generates sequences with approximately the GC content of the seeds.
      If selected, the user must set the GC content of the seed sequences (Advanced option). It is also possible to set a maximum GC content error from the selection. The GC error option only affects the GC content of the incaRNAtion seeds.
    • Random initial guess
      RNAfbinv starts from a totally random sequence.
    • User Defined
      RNAfbinv starts from a sequence given by the user. The sequence must be the same length as the structure and written in IUPAC sequence notation. The given sequence will be used as the input for RNAfbinv in all of the runs.
  • Number of output sequences

    Select the number of output designed sequences.
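
The fold count quoted for the mutational robustness option (3 * length(sequence)) can be illustrated with a minimal sketch. It enumerates every single-point mutant and scores them against the fold of the original sequence. The neutrality formula used here (1 minus the mean base pair distance divided by the sequence length) is an assumption, since this page does not state the exact formula; `toy_fold` is a hypothetical stand-in for a real folding routine such as RNAfold, and all function names are illustrative.

```python
from typing import Callable, Iterator, Set, Tuple

BASES = "ACGU"


def single_point_mutants(sequence: str) -> Iterator[str]:
    """Yield each sequence that differs from `sequence` at exactly one position."""
    for i, original in enumerate(sequence):
        for base in BASES:
            if base != original:
                yield sequence[:i] + base + sequence[i + 1:]


def base_pairs(structure: str) -> Set[Tuple[int, int]]:
    """Return the set of (i, j) pairs in a balanced dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def neutrality(sequence: str, fold: Callable[[str], str]) -> float:
    """ASSUMED formula: 1 - (mean base pair distance to the original fold) / length."""
    reference = base_pairs(fold(sequence))
    distances = [
        len(reference ^ base_pairs(fold(mutant)))  # pairs present in only one fold
        for mutant in single_point_mutants(sequence)
    ]
    return 1.0 - (sum(distances) / len(distances)) / len(sequence)


def toy_fold(sequence: str) -> str:
    """HYPOTHETICAL folding rule: pair the two ends only if they are G and C."""
    if sequence[0] == "G" and sequence[-1] == "C":
        return "(" + "." * (len(sequence) - 2) + ")"
    return "." * len(sequence)
```

Calling `neutrality("GAAAC", toy_fold)` folds the 15 single-point mutants of a 5-base sequence. With a real folding routine each of those folds is expensive, which is why the option carries tighter limits on iterations and output sequences.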

Examples

We provide two simple examples. The examples are accessible in the selection box at the bottom of the input page. Once an example is selected, press the set button to apply it to the input form.
  • Purine Riboswitch aptamer

    Structure of the Guanine-binding riboswitch aptamer (Kim and Breaker, Biol. Cell, 2008).
    ((((((((...(.(((((.......))))).)........((((((.......))))))..))))))))
    NNNNNNNNUNNNNNNNNNNNNNNNNNNNNNNNNUNNNUNNNNNNNNNNNNNNNNNNNNNNYNNNNNNNN
  • miRNA-146 precursor

    Structure of miRNA-146 precursor (Krol et al., J. Biol. Chem., 2004).
    ((((..((((((((((((.((((((((............)))))))).)))))))))))).))))
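
The input rules from the Input section (legal characters, balanced brackets, the 500-base limit, and a sequence constraint of the same length as the structure) can be sketched as a small validator. This is only an illustration and not the server's own code: `pair_table`, `default_constraint`, `matches_constraint` and the `IUPAC` table are hypothetical names. Treating '<' and '>' as freely interchangeable with '(' and ')' is an assumption, as is leaving 'T' out of the table because the designs are RNA.

```python
MAX_LENGTH = 500  # maximum structure length stated on this help page

# IUPAC nucleotide codes mapped to the RNA bases they allow ('T' omitted: assumption).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG", "N": "ACGU",
}


def pair_table(structure):
    """Return {i: j} for every paired position; raise ValueError on illegal input."""
    # Assumption: '<' and '>' are simple aliases for '(' and ')'.
    structure = structure.replace("<", "(").replace(">", ")")
    if len(structure) > MAX_LENGTH:
        raise ValueError("structure is longer than 500 bases")
    pairs, stack = {}, []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
        elif ch != ".":
            raise ValueError(f"illegal character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def default_constraint(structure):
    """An empty sequence constraint is replaced by one 'N' per structure position."""
    return "N" * len(structure)


def matches_constraint(sequence, constraint):
    """True if `sequence` fits the IUPAC `constraint` position by position."""
    if len(sequence) != len(constraint):
        return False
    return all(
        base in IUPAC.get(code, "")
        for base, code in zip(sequence.upper(), constraint.upper())
    )


mirna_146 = "((((..((((((((((((.((((((((............)))))))).)))))))))))).))))"
print(len(mirna_146), len(pair_table(mirna_146)) // 2)  # length, number of base pairs
```

The final line runs the validator on the miRNA-146 precursor example, which has no sequence constraint and would therefore fall back to `default_constraint`.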

Results

The results section contains the list of designed sequences with their predicted structures and the additional information described below. By default, results are sorted by Shapiro distance first and by BP distance second. The results can be downloaded in Excel format for further analysis.
  • Run no:

    The RNAfbinv run number. Only signifies the order of completion.
  • Sequence:

    The resulting designed sequence, with its predicted folded structure below it.
  • Shapiro structure: (Coarse grained representation)

    Fragment-based structure of the predicted fold. Hairpins, interior loops, bulges, multi-loops and stems are represented by (H), (I), (B), (M) and (S) respectively (Shapiro B.A., 1988).
  • Energy score (dG):

    Given the designed sequence and predicted structure we calculate the free energy using the Turner energy model, 2004. This value is in kcal/mol. The value is calculated using functions from the Vienna RNA Package.
  • Mutational Robustness

    Mutational robustness measures the base pair distance between the fold of the current sequence and the folds of all sequences that are a single point mutation away. This means that at every iteration, to calculate this value, RNAfbinv must fold 3 * length(sequence) sequences. Using this option slows the calculation down significantly, so it is limited to at most 300 iterations and 50 output sequences.
  • BP distance

    The base pair distance between the predicted fold of the resulting sequence and the target structure given in the input.
  • Shapiro distance

    The distance between the Shapiro tree-graph representation of the predicted fold of the resulting sequence and the Shapiro tree-graph representation of the target structure given in the input.
  • GC% content

    The percentage of GC in the result sequence.
  • Additional Information:

    Fold Image:
    A secondary structure image of the designed sequence and its predicted fold. This image is generated by VARNA (Visualization Applet for RNA).
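
Two of the result columns above, BP distance and GC% content, together with the default sort order, can be sketched in a few lines. This is an illustration under stated assumptions and not the server's implementation. BP distance is taken here as the number of base pairs present in one structure but not in the other. The dictionary keys `shapiro_distance` and `bp_distance` are hypothetical field names, and Shapiro distances are treated as given values, not computed.

```python
def base_pairs(structure):
    """Return the set of (i, j) pairs in a balanced dot-bracket string."""
    stack, pairs = [], set()
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.add((stack.pop(), i))
    return pairs


def bp_distance(predicted, target):
    """Number of base pairs found in exactly one of the two structures."""
    return len(base_pairs(predicted) ^ base_pairs(target))


def gc_percent(sequence):
    """Percentage of G and C bases in a non-empty sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    seq = sequence.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def default_sort(results):
    """Sort by Shapiro distance first and BP distance second (hypothetical keys)."""
    return sorted(results, key=lambda r: (r["shapiro_distance"], r["bp_distance"]))
```

For example, `bp_distance("((..))", "(....)")` compares a two-pair hairpin with a one-pair hairpin, and `default_sort` places designs that match the target at the fragment level ahead of those that only come close in base pairs.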

Run Time

The following table shows run times (in log10 seconds) for four different structures under five GC% contents. Tests were run with the default options. The graph shows the combined time of seed generation (when using incaRNAtion seeds) and the RNAfbinv calculation.

Structures:

  1. miRNA-146 precursor

    65 bases.
    ((((..((((((((((((.((((((((............)))))))).)))))))))))).))))
  2. Purine Riboswitch aptamer

    69 bases.
    ((((((((...(.(((((.......))))).)........((((((.......))))))..))))))))
  3. Cobalamin Riboswitch aptamer

    127 bases.
    ..((((((((......(((.......))).....((((......))))...........................(((((.......))))).....(((.......))).......))))))))..
  4. S14 Ribosomal RNA - Domain 2 (for timing purposes)

    361 bases.
    ..........(((((...(.((((.(.(((.(((((((.(((((((((((....(((((((.....)))))))...)))))))))..)))))))))...(((((((((..(((((((((..((((((((...(((......)))......))))))))..))....(..((....)))))))))).)))))).)))...))))..))))....((((((...((...((((.........))))...))))))))..........((((((..((((((((((((((.....))))))))))))))...((..)))).....)))))))))).(((......((((....))))....)))