UCSC Computational Biology Research Projects


Last updated: Dec 23 2002

DNA

  • Genefinding:
  • Simple and hidden Markov models for finding REPs (PostScript only)
  • Using simple Markov Models to search DNA databases (HTML)
  • Linear hidden Markov models for protein and nucleic acid modeling (SAM)

RNA

  • Detecting base-pairs in RNA multiple alignments
  • Stochastic context-free grammars for RNA modeling
  • Gibbs Sampling for creating Stochastic context-free grammars for RNA modeling
  • Pseudoknot Modeling Using Intersections of Stochastic Context Free Grammars with Applications to Database Search

Proteins

  • Investigation of non-pairwise protein structure score functions using sets of decoy structures (compressed postscript, compressed pdf) PhD dissertation for Christian Barrett.
  • Linear hidden Markov models for protein and nucleic acid modeling (SAM)
  • Getting the most out of Hidden Markov Models. A tutorial presented at ISMB99 (Intelligent Systems in Molecular Biology).
  • Predicting Protein Structure using only Sequence Information. Proteins: Structure, Function, and Genetics, Supplement 3, 1999. The results of SAM-T98 at CASP3. (Official journal site)
  • Remote Homolog Detection with SAM-T98. Bioinformatics 14(10): 846-856, 1998. A more detailed presentation of SAM-T98 than the Proteins paper.
  • Sequence Comparisons Using Multiple Sequences Detect Three Times as Many Remote Homologues as Pairwise Methods (Journal of Molecular Biology, 284(4):249-51, 1998)
  • A discriminative framework for detecting remote protein homologies (Abstract, Postscript)
  • Weighting hidden Markov models for maximum discrimination. Bioinformatics, 14(9):772-82, 1998. [postscript]
  • Three papers on the Visualization and Sonification of Protein Structure Sequence Alignments
  • Neural-network protein structure prediction. A short poster presentation.
  • Predicting Protein Structure Using hidden Markov Models. Proteins: Structure, Function, and Genetics, Supplement 1, 1997, pp. 134-139. Fold recognition in CASP2. (compressed postscript) (Official journal web site)
  • "Hidden Markov Models in Computational Biology: Applications to Protein Modeling," Journal of Molecular Biology, 235:1501-1531, February 1994. [compressed postscript of longer tech. rep. version, part 1] [second part of postscript]
  • "Classifying G-Protein Coupled Receptors with Support Vector Machines" Master's Thesis, June 2000. [compressed postscript, pdf]

Estimating amino acid distributions

  • Dirichlet mixtures: a method for improving the estimation of amino acid distributions
  • Evaluating Regularizers for Estimating Distributions of Amino Acids (ISMB95)
  • Karplus, Kevin. Regularizers for Estimating Distributions of Amino Acids from Small Samples. UCSC-CRL-95-11. (longer tutorial version; PostScript only)

Parallel Sequence Analysis

  • The Kestrel VLSI sequence analysis co-processor
  • Linear hidden Markov models for protein and nucleic acid modeling (SAM)

Discriminative Models in Computational Biology

  • Support Vector Machine Classification and Validation of Cancer Tissue Samples Using Microarray Expression Data (Postscript of technical report)
  • Support Vector Machine Classification of Microarray Gene Expression Data (UCSC-CRL-99-09) (Postscript of technical report)
  • Exploiting generative models in discriminative classifiers (Abstract, Postscript)
  • A discriminative framework for detecting remote protein homologies (Abstract, Postscript, dataset [56 MB compressed])
  • Classifying G-protein coupled receptors with support vector machines (Postscript, PDF). Preprint of accepted paper to appear in Bioinformatics.

EST Analysis

  • A Probabilistic Approach to Consensus Multiple Alignment (Postscript)
  • Towards an Accurate EST consensus (Postscript)

Software Tools

  • LaTeX style file for CABIOS.