Outline the steps used to find values for a BLOSUM amino acid similarity matrix.
i) Eliminating Sequences.
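Eliminate sequences that are more than r% identical, where r is the threshold that names the matrix (for example, r = 62 for BLOSUM62). There are two ways to do this: remove the redundant sequences from the block, or group the similar sequences into a cluster and replace them with a single sequence that represents the cluster.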
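As a rough illustration of the clustering route, the sketch below groups the sequences of one ungapped block so that any sequence more than r% identical to a member of a cluster joins that cluster. The function names percent_identity and cluster_block are hypothetical, and the single-linkage rule and strict "more than r%" comparison are assumptions made for this example.

```python
def percent_identity(a, b):
    """Percentage of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences in a block must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def cluster_block(seqs, r):
    """Group sequences so that a sequence more than r% identical to any
    member of a cluster joins (and possibly merges) that cluster.
    Assumption: single-linkage clustering with a strict threshold."""
    clusters = []
    for s in seqs:
        linked, rest = [], []
        for c in clusters:
            if any(percent_identity(s, m) > r for m in c):
                linked.append(c)
            else:
                rest.append(c)
        # Merge the new sequence with every cluster it is linked to.
        merged = [s] + [m for c in linked for m in c]
        clusters = rest + [merged]
    return clusters


# Toy block (made-up sequences): three similar sequences and one outlier.
clusters = cluster_block(["AAAA", "AAAT", "TTTT", "AATT"], r=62)
```

Each resulting cluster would then be treated as a single sequence in the later counting steps, which is what prevents a family with many near-identical members from dominating the matrix.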
ii) Calculating Frequency & Probability.
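The starting point is BLOCKS, a database storing ungapped alignments of the most conserved regions of protein families. These alignments are used to derive the BLOSUM matrices: the amino acid pairs in each column of a block are counted, and the counts are converted into observed pair frequencies and probabilities.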
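The frequency and probability calculation can be sketched as column-wise pair counting over one block. This is a simplified, unweighted version: it assumes every sequence counts fully (no cluster weighting), and the names count_pairs and pair_probabilities are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_pairs(block):
    """Count unordered amino acid pairs in every column of an ungapped block."""
    counts = Counter()
    for column in zip(*block):  # iterate over alignment columns
        for a, b in combinations(column, 2):
            counts[tuple(sorted((a, b)))] += 1
    return counts


def pair_probabilities(counts):
    """Convert raw pair counts into observed pair probabilities q_ij."""
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


# Toy block (made-up data) with three sequences and two columns.
counts = count_pairs(["AC", "AC", "GC"])
q = pair_probabilities(counts)
```

A column holding n sequences contributes n(n-1)/2 pairs, so the observed probabilities always sum to 1 over all unordered pairs.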
iii) Log-odds ratio.
It gives the ratio of the observed occurrence of each amino acid pair in the data to the occurrence expected by chance for that pair. The logarithm of this ratio is rounded off and used in the substitution matrix.
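A minimal sketch of this step, assuming the commonly used half-bit scaling: the score for a pair is 2 * log2(q_ij / e_ij) rounded to the nearest integer, where q_ij is the observed pair probability and e_ij is the probability expected from the individual amino acid frequencies. The function name log_odds_scores and the toy input are hypothetical.

```python
import math


def log_odds_scores(q):
    """Turn observed pair probabilities q[(a, b)] (unordered pairs, a <= b)
    into rounded log-odds scores in half-bit units."""
    # Marginal probability of each amino acid: p_i = q_ii + sum_{j != i} q_ij / 2
    p = {}
    for (a, b), prob in q.items():
        if a == b:
            p[a] = p.get(a, 0.0) + prob
        else:
            p[a] = p.get(a, 0.0) + prob / 2
            p[b] = p.get(b, 0.0) + prob / 2

    scores = {}
    for (a, b), prob in q.items():
        # Expected probability by chance: p_i^2 for identical residues,
        # 2 * p_i * p_j for an unordered pair of different residues.
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(2 * math.log2(prob / expected))
    return scores


# Toy observed probabilities (made-up values that sum to 1).
q = {("A", "A"): 1 / 6, ("A", "G"): 2 / 6, ("C", "C"): 3 / 6}
scores = log_odds_scores(q)
```

A positive score means the pair is seen more often than chance predicts, a negative score means it is seen less often, and zero means it occurs at about the chance rate.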
iv) BLOSUM Matrices.
The odds for relatedness are expressed as log-odds ratios, which are then rounded off to give the entries of the BLOSUM substitution matrices.
v) Score of the BLOSUM matrices.
A scoring matrix, or table of values, is required to evaluate the significance of a sequence alignment: each entry describes how likely a biologically meaningful amino-acid or nucleotide residue pair is to occur in an alignment.
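To show how such a table is used, the sketch below scores an ungapped pairwise alignment by summing the matrix entry for each aligned residue pair. The matrix values here are toy numbers chosen for illustration, not entries copied from a published BLOSUM matrix, and the name score_alignment is hypothetical.

```python
def score_alignment(seq1, seq2, matrix):
    """Sum substitution scores over the aligned positions of two
    equal-length, ungapped sequences. The matrix is keyed by
    unordered residue pairs stored as sorted tuples."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    return sum(matrix[tuple(sorted((a, b)))] for a, b in zip(seq1, seq2))


# Toy substitution scores (illustrative values only).
toy_matrix = {
    ("A", "A"): 4,
    ("A", "R"): -1,
    ("R", "R"): 5,
}

total = score_alignment("AAR", "ARR", toy_matrix)
```

Identical residues add a positive score and an unlikely substitution subtracts from the total, so a higher sum suggests a more biologically plausible alignment.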