I am in the process of assembling and annotating the genome of a non-model organism, using almost exclusively short-read (paired-end Illumina) data. Throughput is one obvious benefit of these data (high coverage for reasonable cost, error rates notwithstanding), but another benefit of RNA-seq data in particular is that they can be used in multiple ways. For example, I am mapping the RNA-seq reads to the genome to estimate transcript abundances and run differential expression analyses, but I have also assembled the RNA-seq reads de novo to use as evidence for genome annotation.
There are, however, limitations to using RNA-seq-derived transcripts for annotation, and full-length cDNAs would facilitate much more reliable annotation. Unfortunately, I have no intuition for the relative cost, time commitment, or complexity of extracting and sequencing full-length cDNAs vs. next-gen sequencing. Can anyone here comment on this?