Global alignment takes the entirety of both sequences into consideration when finding an alignment, whereas local alignment may take only a small portion of each into account. This sounds confusing, so here is an example:
Let's say you have a large reference, maybe 2000 bp, and a sequence of about 100 bp, and that the reference contains the sequence almost exactly. If you did a local alignment, you would get a very good match. But if you did a global alignment, it might not report that match cleanly. Instead, it may spread matches across the entire reference, so you'd end up with an alignment with many large gaps. It does not matter that the sequence matches near perfectly at one particular region of the reference, because a global alignment has to account for both sequences from end to end (i.e. throughout the reference).
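To make the difference concrete, here is a minimal sketch of both scoring schemes in plain Python: a Needleman-Wunsch style global score and a Smith-Waterman style local score. The function names, the toy sequences, and the scoring values (match +1, mismatch -1, linear gap -2) are assumptions picked for illustration, not what any particular aligner uses.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch style score: both sequences aligned end to end."""
    # Row 0: aligning a prefix of b against nothing costs one gap per base.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman style score: best-scoring pair of substrings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets the alignment start fresh anywhere.
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


# Toy data (made up): a 44 bp "reference" that contains the 13 bp query exactly.
reference = "TTGACCGTAAGCTAGCTAGGATCCGATTACAGGCTTAACGGTCA"
query = "GGATCCGATTACA"

print("local score: ", local_score(reference, query))
print("global score:", global_score(reference, query))
```

With these toy values the local score is 13 (every query base matches), while the global score is -49, because the 31 reference bases outside the match all have to be paired with gaps.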
If you have a really good match, it may not matter which type of alignment you use. But when you have mismatches and the like, it starts to matter, because of how the alignments are scored. In the example above, let's say that there is a 100 bp region in the reference that matches your 100 bp sequence with 85% identity. A local alignment will very likely align there. Now let's say that the first 30 bp of your sequence match a region at the beginning of the reference at 95%, the next 30 bp match a region in the middle of the reference at 85%, and the final 40 bp match a region at the end of the reference at about 90%. In a global alignment the best-scoring alignment can be the gapped one, whereas in a local alignment the ungapped one would be best. The reason is not that gap penalties are lower in global alignment; they are parameters you choose in either case. It is that a global alignment has to pay for all ~1900 unmatched reference bases as gaps wherever the sequence is placed, so the split alignment's higher overall identity (about 90 matching bases instead of 85) can tip the balance. A local alignment pays nothing for the unaligned flanks, so the single ungapped region wins easily over an alignment that must bridge hundreds of bases of gap.
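The arithmetic behind that comparison can be sketched as follows. Everything here is an assumption for illustration: the affine scoring values (match +1, mismatch -1, gap open -2, gap extend -1), the helper name alignment_score, and the layout in which the three pieces sit 50 bp in from each end of the reference. Different penalties can flip the global result.

```python
def alignment_score(matches, mismatches, gap_runs, gap_columns,
                    match=1, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Score an alignment from its column counts (hypothetical affine scheme)."""
    return (matches * match + mismatches * mismatch
            + gap_runs * gap_open + gap_columns * gap_extend)


REF_LEN, QUERY_LEN = 2000, 100
unmatched_ref = REF_LEN - QUERY_LEN          # 1900 reference bases left over

# Contiguous placement: one 100 bp region at 85% identity.
contig_matches = 85
# Split placement: (piece length, percent identity) from the example.
pieces = [(30, 95), (30, 85), (40, 90)]
split_matches = sum(length * pct for length, pct in pieces) // 100   # 90

# Global: every leftover reference base is a gap column in both alignments.
global_contig = alignment_score(contig_matches, QUERY_LEN - contig_matches,
                                gap_runs=2, gap_columns=unmatched_ref)
global_split = alignment_score(split_matches, QUERY_LEN - split_matches,
                               gap_runs=4, gap_columns=unmatched_ref)

# Local: the flanks are free; only gaps between the pieces are charged.
# Assumes 50 bp flanks on each side, leaving 1800 internal gap columns.
local_contig = alignment_score(contig_matches, QUERY_LEN - contig_matches,
                               gap_runs=0, gap_columns=0)
local_split = alignment_score(split_matches, QUERY_LEN - split_matches,
                              gap_runs=2, gap_columns=unmatched_ref - 100)

print("global:", global_contig, "vs", global_split)
print("local: ", local_contig, "vs", local_split)
```

Under these assumptions the global scores are -1834 for the contiguous placement and -1828 for the split one, so the gapped alignment edges ahead; the local scores are 70 and -1724, so the ungapped region wins by a wide margin. In practice a local aligner would never report the negative-scoring split alignment at all; it would report the single best region.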
What you want to use depends on what you are doing. If you think your sequence is a subsequence of the reference, do a local alignment. But if you think your entire sequence should match your entire reference, you would do a global alignment.