UCSC Computational Biology Research Projects
Last updated: Dec 23 2002
DNA
Genefinding:
A review of Genefinding methods [Postscript file] [pdf file] [html version] (Short version: [Postscript file] [pdf file])
Early genefinding collaboration with LBNL
Improved Splice Site Detection in Genie. RECOMB97 (gzip/postscript)
Integrating Database Homology in a Probabilistic Gene Structure Model. PSB97 (gzip/postscript)
A Generalized Hidden Markov Model for the Recognition of Human Genes in DNA. ISMB96 (gzip/postscript)
EcoParse is a mailserver for finding genes in E. coli DNA that uses a Hidden Markov Model. An abstract and a full paper are available.
Optimal Parse of DNA is a general gene-finding program for prokaryotes that lets the user add or remove "sensors" that look for coding signals. A Postscript version is also available.
Simple and hidden Markov models for finding REPs (PostScript only)
Using simple Markov Models to search DNA databases (HTML)
Linear hidden Markov models for protein and nucleic acid modeling (SAM)
RNA
Detecting base-pairs in RNA multiple alignments
Stochastic context-free grammars for RNA modeling
Gibbs Sampling for creating Stochastic context-free grammars for RNA modeling
Pseudoknot Modeling Using Intersections of Stochastic Context Free Grammars with Applications to Database Search
Proteins
Investigation of non-pairwise protein structure score functions using sets of decoy structures (compressed postscript, compressed pdf). PhD dissertation of Christian Barrett.
Linear hidden Markov models for protein and nucleic acid modeling (SAM)
Getting the most out of Hidden Markov Models
A tutorial presented at ISMB99 (Intelligent Systems for Molecular Biology).
Predicting Protein Structure using only Sequence Information
Proteins: Structure, Function, and Genetics, Supplement 3, 1999. The results of SAM-T98 at CASP3.
official journal site
Remote Homolog Detection with SAM-T98
Bioinformatics 14(10):846-856, 1998. A more detailed presentation of SAM-T98 than the Proteins paper.
Sequence Comparisons Using Multiple Sequences Detect Three Times as Many Remote Homologues as Pairwise Methods
(Journal of Molecular Biology, 284(4):249-51, 1998)
A discriminative framework for detecting remote protein homologies
(Abstract, Postscript)
Weighting hidden Markov models for maximum discrimination
Bioinformatics, 14(9):772-82, 1998. [postscript]
Three papers on the Visualization and Sonification of Protein Structure Sequence Alignments
Neural-network protein structure prediction
A short poster presentation.
Predicting Protein Structure Using Hidden Markov Models
Proteins: Structure, Function, and Genetics, Supplement 1, 1997, pp. 134-139. Fold recognition in CASP2 (compressed postscript)
Official journal web site.
"Hidden Markov Models in Computational Biology: Applications to Protein Modeling,"
Journal of Molecular Biology, 235:1501-1531, February 1994.
[compressed postscript of longer tech. rep. version, part 1]
[second part of postscript]
"Classifying G-Protein Coupled Receptors with Support Vector Machines"
Master's Thesis, June 2000. [compressed postscript, pdf]
Estimating amino acid distributions
Dirichlet mixtures: a method for improving the estimation of amino acid distributions
Evaluating Regularizers for Estimating Distributions of Amino Acids (ISMB95)
Karplus, Kevin. Regularizers for Estimating Distributions of Amino Acids from Small Samples. UCSC-CRL-95-11.
(longer tutorial version, Postscript only)
Parallel Sequence Analysis
The Kestrel VLSI sequence analysis co-processor
Linear hidden Markov models for protein and nucleic acid modeling (SAM)
Discriminative Models in Computational Biology
Support Vector Machine Classification and Validation of Cancer Tissue Samples Using Microarray Expression Data
(Postscript of technical report)
Support Vector Machine Classification of Microarray Gene Expression Data (UCSC-CRL-99-09)
(Postscript of technical report)
Exploiting generative models in discriminative classifiers
(Abstract, Postscript)
A discriminative framework for detecting remote protein homologies
(Abstract, Postscript, dataset: 56 MB compressed)
Classifying G-protein coupled receptors with support vector machines
(Postscript, PDF)
Preprint of accepted paper to appear in Bioinformatics.
EST Analysis
A Probabilistic Approach to Consensus Multiple Alignment
(Postscript)
Towards an Accurate EST consensus
(Postscript)
Software Tools
LaTeX style file for CABIOS.