bioinformatics - Any tool to align whole genome sequence data to another genome and give exon regions a higher mark?

Tuesday, 8 January 2008

If you are not trying to assemble the reads but only to align each one to the genome, you can use exonerate. On a Unix/Linux platform, once it is installed, run something like:

exonerate -m genome2genome WGS.fasta genome.fasta > out.txt

From the exonerate manual:

          genome2genome
                 This model is similar to the coding2coding model,
                 except introns are modelled on both sequences.
                 (not working well yet)

What I would recommend, though, is aligning against a reference cDNA dataset rather than the whole genome. In that case, use the following instead, with the cDNA set as the query (first file) and the WGS data as the target (second file):

exonerate -m cdna2genome genome_cdna.fasta WGS.fasta > out.txt

From the exonerate manual:

          cdna2genome
                 This combines properties of the est2genome and
                 coding2genome models, to allow modeling of a whole
                 cDNA where a central coding region can be flanked by
                 non-coding UTRs. When the CDS start and end is known
                 it may be specified using the --annotation option
                 (see below) to permit only the correct coding region
                 to appear in the alignment.
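
As a rough sketch of how the --annotation option mentioned in the manual might be used: the file names, the sequence id my_cdna_1, and the CDS coordinates below are all hypothetical placeholders. The four-field annotation layout (id, strand, CDS start, CDS length) is how I understand the exonerate manual to describe it, so check your installed man page before relying on it. The --bestn and --showtargetgff options are added here only to keep the output small and easier to post-process.

```shell
#!/bin/sh
# Hypothetical file names, for illustration only.
CDNA=genome_cdna.fasta      # reference cDNAs (query)
WGS=WGS.fasta               # whole-genome shotgun sequences (target)
ANNOT=cdna.annotation       # CDS annotation, one line per cDNA

# Assumed annotation layout: <id> <strand> <cds_start> <cds_length>
# "my_cdna_1" must match a sequence id in $CDNA; 308 and 1191 are made-up values.
printf '%s %s %s %s\n' my_cdna_1 + 308 1191 > "$ANNOT"

# Run the alignment only if exonerate is installed and both inputs exist.
if command -v exonerate >/dev/null 2>&1 && [ -f "$CDNA" ] && [ -f "$WGS" ]; then
    # --bestn 1 keeps the best hit per query; --showtargetgff yes adds GFF
    # lines (including exon features) on the target sequences.
    exonerate --model cdna2genome \
        --annotation "$ANNOT" \
        --bestn 1 \
        --showtargetgff yes \
        "$CDNA" "$WGS" > out.txt
else
    echo "exonerate or input files not found; skipping alignment" >&2
fi
```

With GFF output switched on, the exon lines in out.txt give the coordinates of the exon regions on each WGS sequence, which is probably the closest exonerate gets to marking exons separately from the rest of the alignment.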
