Flexible Search Method for Structural Motifs in RNA

  • Project number: 202-08-06
  • Students: Isana Vaksler
  • Supervisors: Klara Kedem, Danny Barash

The discovery of non-coding RNA (ncRNA) motifs and their role in regulating gene expression has recently attracted considerable attention. The goal is to find occurrences of these motifs in a sequence database.

Previous RNA motif search methods start from the primary sequence and only then take secondary structure into account. Since motifs vary in structural rigidity and in local sequence constraints, there is a need for algorithms and tools that can be fine-tuned to the RNA motif being searched for.

We present an RNA motif search tool called STRMS (Structural RNA Motif Search), which takes as input the secondary structure of the query, including local sequence and structure constraints, and a target sequence database. It reports all occurrences of the query in the target, ranked by their similarity to the query, and produces an HTML file that displays graphical images of the predicted structures for both the query and the candidate hits.

Our tool is flexible: it takes into account a large number of sequence options and the existence of potential pseudoknots, as dictated by specific queries. Our approach combines pre-folding with an O(mn) RNA pattern matching algorithm based on subtree homeomorphism for ordered, rooted trees.
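
To make the tree view of a secondary structure concrete, the following is a minimal sketch of how a pseudoknot-free structure could be turned into an ordered, rooted tree, the kind of object on which tree pattern matching operates. It is not the STRMS implementation and it does not perform the matching itself. The dot-bracket input format, the `Node` class and the `dot_bracket_to_tree` function are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical tree node: one base pair, or the virtual root (pair is None)."""
    pair: Optional[Tuple[int, int]] = None   # (opening index, closing index)
    unpaired: int = 0                        # unpaired bases directly enclosed by this pair
    children: List["Node"] = field(default_factory=list)  # nested pairs, in 5'->3' order


def dot_bracket_to_tree(structure: str) -> Node:
    """Build an ordered, rooted tree from a pseudoknot-free dot-bracket string.

    Assumption: '(' and ')' mark paired bases and '.' marks an unpaired base.
    """
    root = Node()
    stack = [root]            # nodes whose closing bracket has not been seen yet
    open_index = []           # opening positions, parallel to stack[1:]
    for i, ch in enumerate(structure):
        if ch == "(":
            node = Node()
            stack[-1].children.append(node)   # preserves left-to-right sibling order
            stack.append(node)
            open_index.append(i)
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError(f"unmatched ')' at position {i}")
            node = stack.pop()
            node.pair = (open_index.pop(), i)
        elif ch == ".":
            stack[-1].unpaired += 1
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if len(stack) != 1:
        raise ValueError("unmatched '(' in structure")
    return root


if __name__ == "__main__":
    tree = dot_bracket_to_tree("((..))..(.)")
    for child in tree.children:
        print(child.pair, child.unpaired, len(child.children))
```

In this representation, sibling order follows the sequence order of the enclosed base pairs, which is what makes the tree ordered. Pseudoknots cannot be expressed with a single bracket type, so a tool that supports them needs additional handling beyond this sketch.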

We used STRMS to search for both new and known RNA motifs (riboswitches and tRNAs) in large target databases. Our results point to a number of additional purine riboswitch candidates in newly sequenced bacteria, and demonstrate high sensitivity on known riboswitches and tRNAs.