Sunday, 28 January 2007

bioinformatics - Organize sequence database

I recently took up the task of organizing the sequencing database for a research project. I am still unsure which type of database I should use. Any advice or recommendations would be wonderful.



This research project is run more like a class, in which students are taught how to sequence a plasmid by the Sanger method using paired-end primers. Each plasmid yields two entries (~1 kbp each). The students then BLAST the sequences, mostly to see whether there is a close match with certain related organisms.



The main purpose of the database is to monitor sequencing progress and to single out the samples that need to be redone. So far the record-keeping has not been easily accessible, hence the need for reorganization. Additionally, the students need to be able to view their own results and enter their BLAST results themselves. This last requirement seems to call for some kind of web server that students can connect to in order to enter their information.



There are ~18k bacterial clones containing plasmids, which means ~36k entries in this database. It would be great if there were a way to manage the trace files for the entries; if not, I think I can add a column for the file path.
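
To make these requirements concrete, here is a minimal sketch of how they might map onto tables, assuming SQLite through Python's built-in `sqlite3` module. The table names (`clone`, `seq_read`), the column names, the three status values, and the example label and trace path are all hypothetical choices for illustration, not an established layout for this kind of project.

```python
import sqlite3

# Hypothetical schema: one row per clone, two reads (forward/reverse) per clone.
SCHEMA = """
CREATE TABLE IF NOT EXISTS clone (
    clone_id INTEGER PRIMARY KEY,
    label    TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS seq_read (
    read_id    INTEGER PRIMARY KEY,
    clone_id   INTEGER NOT NULL REFERENCES clone(clone_id),
    direction  TEXT NOT NULL CHECK (direction IN ('F', 'R')),
    status     TEXT NOT NULL DEFAULT 'pending'
               CHECK (status IN ('pending', 'ok', 'redo')),
    sequence   TEXT,   -- the ~1 kbp read itself
    trace_path TEXT,   -- path to the trace file, stored outside the database
    blast_hit  TEXT,   -- entered by the student after running BLAST
    UNIQUE (clone_id, direction)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_clone(conn, label):
    """Register a clone together with its two pending reads."""
    cur = conn.execute("INSERT INTO clone (label) VALUES (?)", (label,))
    clone_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO seq_read (clone_id, direction) VALUES (?, ?)",
        [(clone_id, "F"), (clone_id, "R")],
    )
    conn.commit()
    return clone_id


def set_status(conn, clone_id, direction, status, trace_path=None):
    """Update one read's status; keep the old trace path if none is given."""
    conn.execute(
        "UPDATE seq_read SET status = ?, trace_path = COALESCE(?, trace_path) "
        "WHERE clone_id = ? AND direction = ?",
        (status, trace_path, clone_id, direction),
    )
    conn.commit()


def reads_to_redo(conn):
    """List (clone label, direction) for every read flagged for redoing."""
    rows = conn.execute(
        "SELECT c.label, r.direction FROM seq_read r "
        "JOIN clone c ON c.clone_id = r.clone_id "
        "WHERE r.status = 'redo' ORDER BY c.label, r.direction"
    )
    return rows.fetchall()


def progress(conn):
    """Count reads per status, as a rough progress report."""
    return dict(conn.execute("SELECT status, COUNT(*) FROM seq_read GROUP BY status"))
```

Calling `add_clone(conn, "plate01_A01")` and then `set_status(conn, clone_id, "R", "redo", "traces/plate01_A01_R.ab1")` would flag the reverse read of that clone, after which `reads_to_redo(conn)` lists it. At ~36k rows this is a small dataset for SQLite, though SQLite serializes writes, so data entry by many students at once would still need a web front end or a server-based database in front of it.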



Is there a database management program suitable for organizing sequences and trace files, with easy data entry by multiple users? It would also help if the program doesn't need much maintenance.



Disclaimer: I have very limited knowledge of database structure, having only taken an online introductory database class. As for programming, I am still learning Python. This project is a way to force myself to learn more.
