Examples:
- (0|1)* - any alignment of matches (i.e. letter '1') or mismatches (indels or substitutions, i.e. letter '0')
- 10{3}(U!C) - a segment of the alignment has a match, followed by three consecutive mismatches (indels or substitutions), followed by a U-C substitution.
- (1(G!U)|(U!C)0)|00 - a segment of the alignment ends with two consecutive mismatches, and starts with either a match followed by a G-U pair or a U-C pair followed by a mismatch.
- i1sss(1|d) - a segment of the alignment starts with an insert (i.e. letter 'i') followed by a match, three consecutive substitutions (i.e. letter 's'), and ends with either a match or a delete (i.e. letter 'd')
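As a rough illustration of how such patterns read, the sketch below translates the letter-only patterns into ordinary Python regular expressions. It is not the tool's implementation. It assumes, hypothetically, that an alignment is encoded as a string with one letter per column ('1' match, 'i' insert, 'd' delete, 's' substitution), and it does not handle letter-specific terms such as (U!C). The name to_python_regex is made up for this example.

```python
import re

def to_python_regex(pattern: str) -> str:
    """Translate a letter-only alignment pattern into a Python regex.

    Assumption: '0' means "any mismatch", i.e. one of 'i', 'd' or 's'.
    Digits inside {...} are repeat counts and are left untouched.
    """
    out = []
    in_braces = False
    for ch in pattern:
        if ch == "{":
            in_braces = True
        elif ch == "}":
            in_braces = False
        out.append("[ids]" if ch == "0" and not in_braces else ch)
    return "".join(out)

# Hypothetical encoded alignment: two matches, an insert, a match,
# three substitutions, a delete and a final match.
alignment = "11i1sssd1"

print(to_python_regex("10{3}"))                                    # 1[ids]{3}
print(bool(re.search(to_python_regex("i1sss(1|d)"), alignment)))   # True
print(bool(re.fullmatch(to_python_regex("(0|1)*"), alignment)))    # True
```

re.search finds the pattern anywhere in the alignment, which mirrors the "a segment of the alignment" wording above, while re.fullmatch requires the whole alignment to fit the pattern.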
Scoring functions:
All the scoring functions work over a DNA/RNA alphabet.
In addition, the letter 'N' may specify an unknown letter in the sequence.
- RNA/DNA LCS - This simple longest common subsequence (LCS) scoring scheme assigns a score of 1 for every match (i.e. a letter aligned to itself) and a score of 0 otherwise (insert, delete or substitution).
This scoring function assigns 1 to an alignment of the letter 'N' (unknown) with itself and assigns 0 to an alignment of 'N' with any other letter.
- RNA Hybridization - This scoring scheme is intended for hybridization (base pairing interaction). It assigns a score of 1 to any Watson-Crick base pair (C:G or A:U), assigns a score of 0.5 to any wobble pair (G:U) and assigns a score of 0 otherwise.
This scoring function assigns 1 to an alignment of the letter 'N' (unknown) with itself and assigns 0 to an alignment of 'N' with any other letter.
- RNA Hybridization negative - This scoring scheme is intended for hybridization (base pairing interaction) and includes negative score values. It assigns a score of +3 to C:G pairs, +2 to U:A pairs, +0.8 to G:U (wobble) pairs, +1 to N:N (unknown) pairs and 0 otherwise.
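The three schemes above can be sketched as small lookup functions. This is a minimal illustration using only the values listed here, not the tool's actual code. The gap symbol '-', the function names, and the representation of an alignment as two equal-length strings are assumptions made for the example.

```python
GAP = "-"  # assumed symbol for an insert/delete column

# Pair scores keyed by the two letters in sorted order, so that
# C:G and G:C (and so on) look up the same entry.
HYBRIDIZATION = {
    ("C", "G"): 1.0, ("A", "U"): 1.0,   # Watson-Crick pairs
    ("G", "U"): 0.5,                    # wobble pair
    ("N", "N"): 1.0,                    # unknown with unknown
}
HYBRIDIZATION_NEGATIVE = {
    ("C", "G"): 3.0, ("A", "U"): 2.0,
    ("G", "U"): 0.8,
    ("N", "N"): 1.0,
}

def lcs_score(a: str, b: str) -> float:
    """RNA/DNA LCS: 1 for a letter aligned to itself, 0 otherwise."""
    return 1.0 if a == b and a != GAP else 0.0

def pair_score(table: dict, a: str, b: str) -> float:
    """Hybridization schemes: look up the unordered pair, default 0."""
    return table.get(tuple(sorted((a, b))), 0.0)

def alignment_score(score_fn, top: str, bottom: str) -> float:
    """Sum the per-column scores of two aligned, equal-length strings."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    return sum(score_fn(a, b) for a, b in zip(top, bottom))

# Matches in columns 1, 3 and 5 only.
print(alignment_score(lcs_score, "ACG-U", "AGGCU"))  # 3.0

# Three Watson-Crick pairs and two wobble pairs.
hyb = lambda a, b: pair_score(HYBRIDIZATION, a, b)
print(alignment_score(hyb, "GCAUG", "CGUGU"))        # 4.0

# Same pairing under the weighted scheme.
neg = lambda a, b: pair_score(HYBRIDIZATION_NEGATIVE, a, b)
print(alignment_score(neg, "GCAUG", "CGUGU"))
```

Keeping the pair values in tables makes the two hybridization schemes differ only in data, so changing a weight does not require touching the scoring logic.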