Currently the multiple alignment similarity measure is incorrect.

Hamming distance measures the dissimilarity of two sequences and means "How many substitutions are needed to get one sequence from the other".

There must be two distance algorithms: 1) "Hamming distance" for dissimilarity and 2) "Simple similarity" for similarity.

They use the following weight schemes:

1)

w("A", "T") = 1

w("A", "-") = w("-", "A") = 0 or 1 (depending on the "Exclude gaps" option that will be added to the dialog)

w("-", "-") = 0

w("A", "A") = 0

2)

w("A", "T") = 0

w("A", "-") = w("-", "A") = 0

w("-", "-") = 0

w("A", "A") = 1

The measure is the total weight of all pairs of characters at corresponding positions in the two sequences. It is recommended to align the sequences first, because the measure is more meaningful for aligned sequences.
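The two weight schemes and the total weight could be sketched as follows. This is only an illustration, not the actual implementation: the function names are hypothetical, the sequences are assumed to be aligned rows of equal length, and the mapping of the "Exclude gaps" option (on gives 0, off gives 1) is an assumption, since the description above does not say which setting produces which value.

```python
GAP = "-"

def hamming_weight(a, b, exclude_gaps=True):
    """Scheme 1 (dissimilarity): weight of one pair of characters."""
    if a == GAP and b == GAP:
        return 0
    if a == GAP or b == GAP:
        # Assumption: "Exclude gaps" on -> 0, off -> 1.
        return 0 if exclude_gaps else 1
    return 0 if a == b else 1

def similarity_weight(a, b):
    """Scheme 2 (similarity): weight of one pair of characters."""
    if a == GAP or b == GAP:
        return 0
    return 1 if a == b else 0

def measure(seq1, seq2, weight):
    """Total weight over all positions of two aligned sequences."""
    # zip pairs characters by position; rows are assumed equally long.
    return sum(weight(a, b) for a, b in zip(seq1, seq2))

row1 = "AC-GT"
row2 = "ATTG-"
print(measure(row1, row2, hamming_weight))     # 1 (only C/T counts)
print(measure(row1, row2, similarity_weight))  # 2 (A/A and G/G)
```

With gaps counted, i.e. measure(row1, row2, lambda a, b: hamming_weight(a, b, exclude_gaps=False)), the same rows give 3, because the two gap columns are added to the single substitution.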

There are two ways to show the measure: as the raw weight value or as a similarity/dissimilarity estimate in percent. In the percentage case, the value must be calculated as the weight value divided by min(len1, len2), where len1 is the number of non-gap characters in the first sequence and len2 is the number of non-gap characters in the second sequence.
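A minimal sketch of the percentage calculation, assuming the raw weight has already been computed. The function names are hypothetical, the multiplication by 100 is my reading of "in percent", and returning 0 for a sequence that consists only of gaps is an assumption; the description does not cover that case.

```python
GAP = "-"

def non_gap_length(seq):
    """Number of non-gap characters in a sequence."""
    return sum(1 for ch in seq if ch != GAP)

def as_percent(weight, seq1, seq2):
    """Weight divided by min(len1, len2), expressed in percent."""
    shorter = min(non_gap_length(seq1), non_gap_length(seq2))
    if shorter == 0:
        # Assumption: avoid division by zero for all-gap sequences.
        return 0.0
    return 100.0 * weight / shorter

# Both rows have 4 non-gap characters, so a weight of 2 is 50 %.
print(as_percent(2, "AC-GT", "ATTG-"))  # 50.0
```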

The distance matrix view must also be revised. It must show similarity or dissimilarity depending on the algorithm chosen.