
Application to Other Genomes

While E. coli is an excellent testbed for gene recognition, the genomes of other species are also of interest. There is no reason why this parser could not be retrained on other prokaryotes, or even on archaebacteria. We recommend that the parser be modified to work with more general databases than EcoSeq. To demonstrate its applicability to organisms other than E. coli, the parser should be able to use training data from ACEDB [4], a genome database system used to store sequences from many organisms.

Eventually these methods will be applied to the human genome to locate coding regions. However, eukaryotic genomes are much more complex because their genes contain introns. Locating genes in human DNA will require extending the optimal parse method to handle multiple states. Sensors will be used to predict gene and nongene regions, as well as intron character. The parser will then be trained to find optimal weights for transitions between the gene, nongene, and intron states.
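
The three-state parse described above can be illustrated with a small sketch. It is not the parser from this report: it assumes a Viterbi-style dynamic program over log-scores, and the names optimal_parse, STATES, sensor_scores and transition_weights, as well as all numeric values, are hypothetical. The weights are fixed by hand here; training them is not shown. The one structural assumption is that an intron can only be entered from, and left to, the gene state.

```python
# Hypothetical sketch of a three-state optimal parse (not the report's code).
STATES = ("nongene", "gene", "intron")


def optimal_parse(sensor_scores, transition_weights, start_state="nongene"):
    """Return the highest-scoring state sequence for a run of positions.

    sensor_scores: list of dicts, one per position, mapping state -> log-score.
    transition_weights: dict mapping (previous, next) -> log-weight;
        pairs that are absent are treated as forbidden transitions.
    """
    if not sensor_scores:
        return []
    neg_inf = float("-inf")
    # The parse is assumed to begin in start_state.
    best = {s: (sensor_scores[0][s] if s == start_state else neg_inf)
            for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for scores in sensor_scores[1:]:
        new_best, pointers = {}, {}
        for s in STATES:
            prev_state, prev_score = None, neg_inf
            for p in STATES:
                total = best[p] + transition_weights.get((p, s), neg_inf)
                if total > prev_score:
                    prev_state, prev_score = p, total
            new_best[s] = prev_score + scores[s]
            pointers[s] = prev_state
        back.append(pointers)
        best = new_best
    state = max(STATES, key=lambda s: best[s])
    if best[state] == neg_inf:
        raise ValueError("no permitted parse for these transition weights")
    # Trace the best path back from the final position.
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hand-picked illustrative weights: staying in a state is cheap, switching
# is costly, and nongene <-> intron transitions are absent (forbidden).
STAY, SWITCH = -0.1, -2.0
example_weights = {
    ("nongene", "nongene"): STAY, ("nongene", "gene"): SWITCH,
    ("gene", "gene"): STAY, ("gene", "nongene"): SWITCH,
    ("gene", "intron"): SWITCH,
    ("intron", "intron"): STAY, ("intron", "gene"): SWITCH,
}


def favour(state, penalty=-5.0):
    """Build one position's sensor scores that favour a single state."""
    return {s: (0.0 if s == state else penalty) for s in STATES}


example_scores = [favour(s) for s in
                  ("nongene", "gene", "intron", "intron", "gene", "nongene")]
example_path = optimal_parse(example_scores, example_weights)
```

With these illustrative values the recovered path follows the sensors: nongene, gene, intron, intron, gene, nongene. In a trained parser the transition weights would be fitted to annotated sequences rather than chosen by hand.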



David Konerding
Sun May 21 12:19:38 PDT 1995