Instructions for using NNPREDICT


nnpredict is a program that predicts the secondary structure type for each residue in an amino acid sequence. The basis of the prediction is a two-layer, feed-forward neural network. The network weights were determined by a separate program, a modification of the Parallel Distributed Processing suite of McClelland and Rumelhart (1). Complete details of how the network weights were determined are found in Kneller et al. (2).

nnpredict takes as input a sequence of either one-letter amino acid codes (A C D E F G H I K L M N P Q R S T V W Y) or three-letter amino acid codes separated by spaces (ALA CYS ASP GLU PHE GLY HIS ILE LYS LEU MET ASN PRO GLN ARG SER THR VAL TRP TYR). NOTE: B and Z are not recognized as valid amino acid codes. The output is a secondary structure prediction for each position in the sequence. Multiple-chain proteins can be predicted either in pieces or as a single sequence with a '!' character between chains.
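
The input conventions above can be checked before a sequence is submitted. The following is a minimal Python sketch, not part of nnpredict itself: the function name normalize_sequence, the three_letter flag, and the 'X' placeholder for unrecognized symbols are all hypothetical. It also assumes that lowercase input may be upper-cased and that, in three-letter mode, the '!' chain separator is set off by spaces like any other token; the source does not say how nnpredict treats either case.

```python
# Hypothetical pre-submission check; nnpredict does not ship this helper.
THREE_TO_ONE = {
    "ALA": "A", "CYS": "C", "ASP": "D", "GLU": "E", "PHE": "F",
    "GLY": "G", "HIS": "H", "ILE": "I", "LYS": "K", "LEU": "L",
    "MET": "M", "ASN": "N", "PRO": "P", "GLN": "Q", "ARG": "R",
    "SER": "S", "THR": "T", "VAL": "V", "TRP": "W", "TYR": "Y",
}
VALID_ONE_LETTER = set(THREE_TO_ONE.values())
CHAIN_BREAK = "!"      # separator between chains, as described above
PLACEHOLDER = "X"      # assumption: arbitrary marker for unrecognized symbols


def normalize_sequence(text, three_letter=False):
    """Return (one_letter_sequence, unrecognized_symbols).

    With three_letter=True the input is split on whitespace and each
    token is looked up as a three-letter code; otherwise every
    non-space character is treated as a one-letter code.
    """
    upper = text.upper()  # assumption: case is not significant
    if three_letter:
        tokens = upper.split()
    else:
        tokens = [ch for ch in upper if not ch.isspace()]

    residues = []
    unrecognized = []
    for token in tokens:
        if token == CHAIN_BREAK:
            residues.append(CHAIN_BREAK)
        elif three_letter and token in THREE_TO_ONE:
            residues.append(THREE_TO_ONE[token])
        elif not three_letter and token in VALID_ONE_LETTER:
            residues.append(token)
        else:
            # B, Z, and anything else outside the 20 standard codes
            unrecognized.append(token)
            residues.append(PLACEHOLDER)
    return "".join(residues), unrecognized


if __name__ == "__main__":
    print(normalize_sequence("ALA CYS ! GLY", three_letter=True))
    print(normalize_sequence("ACDB"))
```

An empty list of unrecognized symbols suggests that the sequence uses only the codes listed above; any reported symbol is one for which, per the description of the output, no prediction would be expected in the surrounding window.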

The predicted type for each residue will be one of: 'H', a helix element; 'E', a beta strand element; or '-', a turn element. If your sequence contains any symbols that are not standard amino acids, '?' characters will be used in the output to indicate that no prediction could be made in the window around the unrecognized symbol.
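
A prediction made of the symbols above can be summarized with a few lines of Python. This is only a sketch under the assumption that the prediction has been copied out as a plain string with one character per residue; the exact layout of the server's output is not specified here, and the names summarize_prediction and segments are hypothetical.

```python
from collections import Counter
from itertools import groupby

# Meaning of each output symbol, as described above.
LABELS = {"H": "helix", "E": "strand", "-": "turn", "?": "no prediction"}


def summarize_prediction(prediction):
    """Count how many residues were assigned to each predicted type."""
    counts = Counter(prediction)
    return {name: counts.get(symbol, 0) for symbol, name in LABELS.items()}


def segments(prediction):
    """Group consecutive identical symbols into (symbol, start, length).

    start is a 0-based index into the prediction string.
    """
    result = []
    start = 0
    for symbol, run in groupby(prediction):
        length = len(list(run))
        result.append((symbol, start, length))
        start += length
    return result


if __name__ == "__main__":
    example = "HHHH--EE?"  # made-up string, not real nnpredict output
    print(summarize_prediction(example))
    print(segments(example))
```

Runs of '?' in the segment list mark the regions where an unrecognized symbol prevented a prediction.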

nnpredict uses the tertiary structure class of the protein for prediction. The possible options are none, all-alpha, all-beta, and alpha/beta.


References:
(1) J. L. McClelland and D. E. Rumelhart. (1988) "Explorations in Parallel Distributed Processing" vol 3. pp 318-362. MIT Press, Cambridge MA.

(2) D. G. Kneller, F. E. Cohen and R. Langridge. (1990) "Improvements in Protein Secondary Structure Prediction by an Enhanced Neural Network" J. Mol. Biol. 214, pp 171-182.

Abstract for (2):
Computational neural networks have recently been used to predict the mapping between protein sequence and secondary structure. They have proven adequate for determining the first-order dependence between these two sets, but have, until now, been unable to garner higher-order information that helps determine secondary structure. By adding neural network units that detect periodicities in the input sequence, we have modestly increased the secondary structure prediction accuracy. The use of tertiary structural class causes a marked increase in accuracy. The best case prediction was 79% for the class of all-alpha proteins. A scheme for employing neural networks to validate and refine structural hypotheses is proposed. The operational difficulties of applying a learning algorithm to a dataset where sequence heterogeneity is under-represented and where local and global effects are inadequately partitioned are discussed.


This Web server was written by Nomi Harris
Copyright (C) 1995 Regents of the University of California
(Don't bother asking Nomi about the details of the nnpredict algorithm, because she didn't write nnpredict.)

nnpredict was written by Donald Kneller
Copyright (C) 1991 Regents of the University of California