In addition to serial machines, the HMM software has been ported
to a Cray Y/MP and to C-Linda. Table 2 lists, for a
variety of machines, two metrics of performance: HMM CUPS (note that a
cell update involves more computation for HMM training than for
edit-distance calculation) and speedup relative to a Sun 4/50
(Sparc-2), as well as a rough, dimensionless measure of the difficulty
of adapting the code to each platform. Several interesting
observations can be drawn from the table. First, a working
array-processor program (MP-1, unoptimized) and an efficient one
(MP-1, optimized) differ greatly in both performance and
programming effort.
Second, porting to a different architecture takes time, and the
amount varies with the architecture. For the MasPar version, the
entire dynamic programming routine
was rewritten. The Cray case was simpler: the loop indices were
modified to allow vectorization. The Linda version was the simplest,
taking advantage of the coarse-grain parallelism available in training
the model with several hundred sequences. The CM-2 results are
estimated from a partial implementation of the dynamic programming
operation; performance was severely handicapped by the lack of local
addressing.
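
The kind of loop-index change mentioned for the Cray port can be illustrated on the simpler edit-distance recurrence. The sketch below is not the HMM code; it assumes the change resembles the common anti-diagonal (wavefront) reordering, and the function names are hypothetical. In row-major order each cell depends on the cell just computed in the same row, which blocks vectorization of the inner loop. Along an anti-diagonal (constant i + j), every cell depends only on the two previous anti-diagonals, so the inner loop carries no dependence.

```python
def _init_table(n, m):
    """Edit-distance table with the first row and column filled in."""
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    return d


def edit_distance_rowwise(a, b):
    """Row-major traversal: d[i][j] needs d[i][j-1] from the same inner loop."""
    n, m = len(a), len(b)
    d = _init_table(n, m)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[n][m]


def edit_distance_antidiagonal(a, b):
    """Anti-diagonal traversal: cells with the same k = i + j are independent."""
    n, m = len(a), len(b)
    d = _init_table(n, m)
    for k in range(2, n + m + 1):
        i_lo = max(1, k - m)   # keeps j = k - i at most m
        i_hi = min(n, k - 1)   # keeps j = k - i at least 1
        # No iteration of this inner loop reads a value written by another
        # iteration of it, so a vectorizing compiler could process it as one
        # vector operation (here it is still an ordinary Python loop).
        for i in range(i_lo, i_hi + 1):
            j = k - i
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[n][m]
```

Both functions perform the same n * m cell updates and return the same distance; only the order of the updates differs, which is why such a change can be confined to the loop indices.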