The traditional way in computational biology to demonstrate that a
technique or set of parameters is better is to pick a biologically
interesting problem and compare methods for solving it.
Many of the regularizers in Section 3 have been
validated in this way [HH92, BHK93, TAK94].
This sort of anecdotal evidence is valuable for establishing that a technique is useful on real biological problems, but it is difficult to quantify. One cannot easily determine how much improvement to expect on different problems, or whether the improved technique is better in general or only on the specific problem to which it was applied.
In this paper, the regularizers are compared quantitatively on a rather generic problem: independently encoding the columns of multiple alignments. This generic problem has some attractive features:
Throughout this paper, the trusted alignments used are those of the BLOCKS database [HH91], with the sequence weighting scheme mentioned in Section 5.