meshi.applications.corpus
Class Corpus

java.lang.Object
  extended by meshi.applications.corpus.Corpus
All Implemented Interfaces:
CompositeTorsionsDefinitions, MeshiPotential, KeyWords

public class Corpus
extends java.lang.Object
implements CompositeTorsionsDefinitions, MeshiPotential, KeyWords

A class that holds the 20-type-mutation energy data of one or several proteins. An instance can be loaded from disk or created from a PDB file of a protein. Another way to create an instance is by merging two instances into a larger corpus.


Field Summary
static double[][] blosum62
           
protected  double[][][] coors
           
protected  double[][][] energies
           
protected  java.lang.String[] energyNames
           
protected  java.lang.String[] excludedProteinsFromUngapped
           
protected  double[][] forAlign1
           
protected  double[][] forAlign2
           
protected  int Nenergies
           
protected  int Nres
           
protected  double[] prePro
           
protected  java.lang.String[] proteinNames
           
protected  int[] protInd
           
protected  boolean[] resHasAllEne
           
protected  int[] resNum
           
protected  int[] resType
           
protected  double[][] torsions
           
protected  int[] ungapped
           
 
Fields inherited from interface meshi.energy.simpleEnergyTerms.compositeTorsions.CompositeTorsionsDefinitions
ALL, CHI_1, CHI_2, CHI_3, CHI_4, COIL, HELIX, NUM_SIDECHAIN_TORSIONS, OMG, OMNI, PHI, POLYNOMIAL_CHI_1, POLYNOMIAL_CHI_1_CHI_2, POLYNOMIAL_CHI_1_CHI_2_TORSIONS, POLYNOMIAL_CHI_1_CHI_3, POLYNOMIAL_CHI_1_CHI_3_TORSIONS, POLYNOMIAL_CHI_1_CHI_4, POLYNOMIAL_CHI_1_CHI_4_TORSIONS, POLYNOMIAL_CHI_1_TORSIONS, POLYNOMIAL_PHI_PSI, POLYNOMIAL_PHI_PSI_CHI_1, POLYNOMIAL_PHI_PSI_CHI_1_TORSIONS, POLYNOMIAL_PHI_PSI_TORSIONS, PREPRO, PSI, SHEET, TOTAL_TORSION_ANGLES, UNIDENTIFIED_TORSION_TYPE
 
Fields inherited from interface meshi.parameters.MeshiPotential
ACCESSIBLE, ALPHA_ANGLE_PARAMETERS, ALPHA_TORSION_PARAMETERS, ANGLE_PARAMETERS, ATOMIC_PAIRWISE_PMF_SUMMA_PARAMETERS, BOND_PARAMETERS, BURIED, COIL, COMPOSITE_PROPENSITY_2D_PARAMETERS, COMPOSITE_PROPENSITY_2D_WITH_PP_PARAMETERS, COMPOSITE_PROPENSITY_PARAMETERS, COMPOSITE_TORSIONS_PARAMETERS, CONTACTS_ENVIRONMENT_PARAMETERS, CONTACTS_PARAMETERS, COOPERATIVE_ATOMIC_PAIRWISE_PMF_SUMMA_PARAMETERS, COOPERATIVE_PROPENSITY_PARAMETERS, COOPERATIVE_RAMACHANDRAN_PARAMETERS, ELECTROSTATICS_PARAMETERS, EXCLUDED_VOL_PARAMETERS, FLAT_RAMACH_PARAMETERS, HELIX, HELIX_OR_COIL, HYDROGEN_BONDS_PAIRS_BETA_PARAMETERS, HYDROGEN_BONDS_PAIRS_HELIX_PARAMETERS, HYDROGEN_BONDS_PAIRS_PARAMETERS_SURFACE, LENNARD_JONES_PARAMETERS, LENNARD_JONES_PARAMETERS_BACKBONE, LENNARD_JONES_PARAMETERS_CA, LJ_ENVIRONMENT_PARAMETERS, LJ_ENVIRONMENT_PARAMETERS_BACKBONE, LJ_ENVIRONMENT_PARAMETERS_CA, ONE_ANGLE_PARAMETERS, OUT_OF_PLANE_PARAMETERS, PLANE_PARAMETERS, PROPENSITY_ANGLE_PARAMETERS, PROPENSITY_TORSION_PARAMETERS, SHEET, SHEET_OR_COIL, SOLVATE_LONG_HB_PARAMETERS, SOLVATE_MINIMIZE_HB_PARAMETERS, SOLVATE_NOHB_PARAMETERS, SOLVATE_PARAMETERS, SOLVATE_SC_PARAMETERS, TWO_ANGLES_PARAMETERS, TWO_TORSIONS_PARAMETERS
 
Fields inherited from interface meshi.util.KeyWords
AA_SEQUENCE, ACCESIBILITY_SEQUENCE, ALINMENT_FILE_PATH, ALL_CA, ALPHA_ANGLE_ENERGY, ALPHA_TORSION_ENERGY, ANGLE_ENERGY, ANGLE_X, ANGLE_Z, ATOMIC_PAIRWISE_PMF_SUMMA_ENERGY, BEAUTIFY_PROBLEMATIC_RANGE, BFGS, BOND_ENERGY, BUFFER_SIZE, CA_CLASH_DISTANCE, CA_LONG_DISTANCE, CA_MODEL, CA_SHORT_DISTANCE, CA_TETHER_ENERGY, CALPHA_HYDROGEN_BONDS, CALPHA_HYDROGEN_BONDS_PLANE, CASP_GROUP, CG, CHECK_INTERLOOP_DISTANCE, CLASH_DISTANCE, COMPOSITE_PROPENSITY_ENERGY, CONSENSUS_ENERGY, CONSTRICT, COOPERATIVE_ATOMIC_PAIRWISE_PMF_SUMMA_ENERGY, COOPERATIVE_ATOMIC_PAIRWISE_PMF_SUMMA_FILENAME, COOPERATIVE_PERATOM_SUMMA_ENERGY, COOPERATIVE_PERATOM_SUMMA_FILENAME, COOPERATIVE_PROPENSITY_ENERGY, COOPERATIVE_PROPENSITY_FILENAME, COOPERATIVE_RAMACHANDRAN_ENERGY, COOPERATIVE_RAMACHANDRAN_FILENAME, COOPERATIVE_Z_PROPENSITY_ENERGY, COOPERATIVE_Z_PROPENSITY_FILENAME, COOPERATIVE_Z_RAMACHANDRAN_ENERGY, COOPERATIVE_Z_RAMACHANDRAN_FILENAME, COOPERATIVE_Z_SUMMA_ENERGY, COOPERATIVE_Z_SUMMA_FILENAME, CORPUS_FILE_NAME, CSAonly_FILES_LOCATION_PATH, CUTOFF, CYLINDER_ENERGY, DICTIONARY_KEY, DIELECTRIC_CONSTANT, DISTANCE_CONSTRAINT_PCA, DISTANCE_CONSTRAINTS_ENERGY, DISTANCE_CONSTRAINTS_MASK, DISTANCE_FROM_CENTROID_ENERGY, DRESSER_FRAGMENTS, EDM_ENERGY, EDM_ENERGY_FILE_NAME, ELECTROSTATICS, END, EXCLUDED_VOL, FINAL_TEMPERATURE, FIX_C_TERMINAL, FIX_N_TERMINAL, FLAT_RAMACH_ENERGY, FREE_FINAL_MINIMIZATION, GRID_EDGE, HYDROGEN_BONDS, HYDROGEN_BONDS_ANGLES, HYDROGEN_BONDS_PAIRS, HYDROGEN_BONDS_PLANE, INFLATE_ENERGY, INITIAL_TEMPERATURE, INPUT_FILE, INTER_SEGMENT_FACTOR, INTER_SEGMENT_TOLERANCE, INTRA_SEGMENT_FACTOR, INTRA_SEGMENT_TOLERANCE, ITERATIONS_ALLATOM, ITERATIONS_BACKBONE, ITERATIONS_CA, KEY_KEY, KOEHL_FILE, LBFGS, LENNARD_JONES, LENNARD_JONES_CA, LINEAR_RG, LOOP1, LOOP2, LOOSEN_EDGE_LENGTH, MAX_ANGLE, MAX_CLASHES, MAX_DISTANCE, MAX_RUN_TIME, MAX_STEPS, MAX_WIDTH_OF_HAIRPIN, MCM, MCM_PERTURBATION, MESHILOG_KEY, METHOD, MIN_WIDTH_OF_HAIRPIN, MINIMIZATION_LOOP, MINIMIZE, MODE, MODEL, MODEL_DSSP, 
MODEL_NUMBER, N_ATOMS, N_TRIES, NON_FROZEN_BOND_DEPTH, NON_FROZEN_RADIUS, NONE, NUMBER, NUMBER_OF_CA_ITERATIONS, NUMBER_OF_CHAINS, NUMBER_OF_MODELS, NUMBER_OF_RUNS, OFF, ON, OPTIMIZER, OUT_OFPLANE_ENERGY, OUTPUT_FILE_NAME, OUTPUT_FILE_PATH, PARAMETERS_DIRECTORY, PDB_FILE, PLANE_ENERGY, PROPENSITY_TORSION_ENERGY, R_MAX, RAMACHANDRAN_SIDECHAIN_ENERGY, REFERENCE, RELAX, REPORT_EVERY, RESTART_EVERY, RMS_TARGET, ROTAMER_LIBRARY, SATURATION, SECONDARY_STRUCTURE, SEED, SEQUENCE, SHOTGUN_MODEL, SMOOTH_ROTAMER_LIBRARY_ENERGY, SOLVATE_ENERGY, SS_NAME, SS_SEQUENCE, STEEPEST_DECENT, STEPS, STRICT_CLASHES, STRUCTURE_NAMES, SUPERIMPOSE, SYMMETRY_ENERGY, TARGET_FILE_PATH, TARGET_NAME, TARGET_SEQUENCE, TEMPLATE_DISTANCE_CONSTRAINTS, TEMPLATE_DSSP, TEMPLATE_ENERGY, TEMPLATE_FILE_PATH, TEMPLATE_NAME, TEMPLATE_STRUCTURE, TEMPLATE_TARGET_ALIGNMENT, TETHER_ENERGY, TOLERANCE, TOPOLOGY_MAP, TWO_TORSIONS_ENERGY, UN_WARP_ENERGY, UNSATISFIED_CUTTOF, UP_TO_CUTOFF, USE_FAST_ARCCOS, VALUE_KEY, VOLUME_CONSTRAINT, WARP_ENERGY, WARP_STEP_SIZE, WARP_THRESHOLD, WEIGHT, WIDTH_OF_HAIRPIN
 
Constructor Summary
Corpus(java.lang.String exsitingCorpusFile)
          Reading a corpus from file.
Corpus(java.lang.String PDBfile, CommandList commands, EnergyCreator[] energyCreators)
           
 
Method Summary
 void buildUngappedArray(int fragL)
           
 void buildUngappedArray(int fragL, java.lang.String[] excludeProteins)
          Building the ungapped array for fragments of length fragL.
 void buildUngappedArraynoPG(int fragL)
           
 double calcRmsBetweenStruct(int ind1, int ind2, int fragL, int manner, int overlap)
          This method gives the rms between two fragments in the corpus (starting in ind1 and ind2).
private  void calculateSolRot1(java.lang.String PDBfile, CommandList commands)
           
private static void extractCoordinatesOfRes(Residue res, double[][] co)
           
 void findDisDist(int len)
          Find the distribution of distances between ends.
 int findInUngapped(java.lang.String proteinName, int residueNumber)
           
 void findPhiPsiRelations(int howMany)
          Find the relation between Delta{phi,psi} and RMS
 void GeneralThreadingExperiment(int fragL, int Ninstances, java.lang.String header, int RMSind1, int RMSind2)
          This is a method for general ungapped threading.
private static boolean haveAllBackboneAtoms(Residue res)
           
 void merge(Corpus corpus)
           
 void setExcludedProteinsFromUngapped(java.lang.String[] list)
           
private  void specialMutate(Protein aux, int mutateThis, int mutateTo, double[][] pp, DunbrackLib lib)
           
 void threadingExperiment_withPP(int fragL, double Wprop, double Wsolv)
           
 void threadingExperiment(int fragL)
           
 void threadingExperimentnoPG(int fragL)
           
 void writeToDisk(java.lang.String fileName)
          This overload writes the entire corpus to disk.
private  void writeToDisk(java.lang.String fileName, boolean[] toWrite)
          Writing all the residues in the corpus that are marked as TRUE in the 'toWrite' input array (its length must equal that of the corpus).
 void writeToDisk(java.lang.String fileName, int indAr, int fragL)
          This overload writes to disk a subset of the corpus: only the residues from the fragments whose indices are given in the input array.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

proteinNames

protected java.lang.String[] proteinNames

energyNames

protected java.lang.String[] energyNames

Nenergies

protected int Nenergies

Nres

protected int Nres

protInd

protected int[] protInd

resNum

protected int[] resNum

resType

protected int[] resType

prePro

protected double[] prePro

energies

protected double[][][] energies

coors

protected double[][][] coors

torsions

protected double[][] torsions

resHasAllEne

protected boolean[] resHasAllEne

ungapped

protected int[] ungapped

excludedProteinsFromUngapped

protected java.lang.String[] excludedProteinsFromUngapped

forAlign1

protected double[][] forAlign1

forAlign2

protected double[][] forAlign2

blosum62

public static final double[][] blosum62
Constructor Detail

Corpus

public Corpus(java.lang.String PDBfile,
              CommandList commands,
              EnergyCreator[] energyCreators)

Corpus

public Corpus(java.lang.String exsitingCorpusFile)
Reading a corpus from file. See the file format in the method 'writeToDisk'.

Method Detail

writeToDisk

private void writeToDisk(java.lang.String fileName,
                         boolean[] toWrite)
Writing all the residues in the corpus that are marked as TRUE in the 'toWrite' input array (its length must equal that of the corpus). The format is: ... ... > ... ... ... > > ... ... ... > .... > ... ... ... >


writeToDisk

public void writeToDisk(java.lang.String fileName)
This overload writes the entire corpus to disk. Residues that do not have all the energies are not written.


writeToDisk

public void writeToDisk(java.lang.String fileName,
                        int indAr,
                        int fragL)
This overload writes to disk a subset of the corpus: only the residues from the fragments whose indices are given in the input array. This is useful for creating more compact libraries from a big one. The format is the same as for the other method.
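
The relation between this overload and the private 'toWrite' overload is not spelled out here. One plausible reading, shown only as a sketch, is that the fragment start indices are expanded into a boolean mask over the residues. The class FragmentMaskSketch and the method maskFromFragments are hypothetical names introduced for illustration and are not part of Corpus.

```java
final class FragmentMaskSketch {
    // Hypothetical helper (an assumption, not the MESHI implementation):
    // marks every residue covered by a fragment of length fragL that starts
    // at one of the given indices.
    public static boolean[] maskFromFragments(int nRes, int[] fragmentStarts, int fragL) {
        boolean[] toWrite = new boolean[nRes];
        for (int start : fragmentStarts) {
            if (start < 0 || start + fragL > nRes) {
                throw new IllegalArgumentException("fragment out of range: " + start);
            }
            for (int k = 0; k < fragL; k++) {
                toWrite[start + k] = true; // overlapping fragments mark a residue only once
            }
        }
        return toWrite;
    }

    public static void main(String[] args) {
        boolean[] mask = maskFromFragments(8, new int[] {1, 2, 6}, 2);
        System.out.println(java.util.Arrays.toString(mask));
    }
}
```

A mask built this way has the same length as the corpus, which is the requirement stated for the 'toWrite' array.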


merge

public void merge(Corpus corpus)

buildUngappedArray

public void buildUngappedArray(int fragL)

buildUngappedArray

public void buildUngappedArray(int fragL,
                               java.lang.String[] excludeProteins)
Building the ungapped array for fragments of length fragL. Fragments from the proteins in the provided list are excluded.
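
The exact criteria used to build the ungapped array are not documented here. The sketch below assumes that a fragment start is "ungapped" when all fragL residues belong to the same protein and carry consecutive residue numbers, and that excluded proteins are matched by name. The arrays mirror the protInd, resNum and proteinNames fields; the class UngappedSketch and the method ungappedStarts are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class UngappedSketch {
    // Assumption: a start index is valid when the fragL residues that follow
    // it share one protein and have consecutive residue numbers.
    public static int[] ungappedStarts(int[] protInd, int[] resNum, String[] proteinNames,
                                       int fragL, String[] excludeProteins) {
        Set<String> excluded = new HashSet<>(Arrays.asList(excludeProteins));
        List<Integer> starts = new ArrayList<>();
        for (int start = 0; start + fragL <= protInd.length; start++) {
            if (excluded.contains(proteinNames[protInd[start]])) {
                continue; // fragments from excluded proteins are skipped
            }
            boolean ok = true;
            for (int k = 1; k < fragL && ok; k++) {
                int i = start + k;
                ok = protInd[i] == protInd[start] && resNum[i] == resNum[i - 1] + 1;
            }
            if (ok) {
                starts.add(start);
            }
        }
        int[] result = new int[starts.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = starts.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] protInd = {0, 0, 0, 0, 1, 1, 1};
        int[] resNum = {1, 2, 3, 5, 10, 11, 12};
        String[] names = {"A", "B"};
        System.out.println(Arrays.toString(ungappedStarts(protInd, resNum, names, 3, new String[0])));
    }
}
```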


threadingExperiment

public void threadingExperiment(int fragL)

threadingExperiment_withPP

public void threadingExperiment_withPP(int fragL,
                                       double Wprop,
                                       double Wsolv)

GeneralThreadingExperiment

public void GeneralThreadingExperiment(int fragL,
                                       int Ninstances,
                                       java.lang.String header,
                                       int RMSind1,
                                       int RMSind2)
This is a method for general ungapped threading. You give it the fragment length 'fragL' and the number of instances you want, 'Ninstances'. It will pick 'Ninstances' random fragments and thread the sequence of the first into all the others. Then it prints:
<1>
<2>
<3> ...
The RMS is calculated between two indices (inclusive) in the FRAG reference frame.
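
As a rough illustration of what ungapped threading of one fragment's sequence onto another fragment's structure can look like, the sketch below sums a per-position, per-residue-type energy over the fragment. The two-dimensional table layout (position, then residue type) is an assumption made for this example and is not the documented layout of the 'energies' field; ThreadingSketch and threadScore are hypothetical names.

```java
final class ThreadingSketch {
    // Assumed layout: energyByPosAndType[p][t] is the energy of placing
    // residue type t at corpus position p. The score of threading the
    // sequence starting at seqStart onto the structure starting at
    // structStart is the sum over the fragL aligned positions (no gaps).
    public static double threadScore(double[][] energyByPosAndType, int[] resType,
                                     int seqStart, int structStart, int fragL) {
        double sum = 0.0;
        for (int k = 0; k < fragL; k++) {
            sum += energyByPosAndType[structStart + k][resType[seqStart + k]];
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] energy = {{1, 10}, {2, 20}, {3, 30}, {4, 40}};
        int[] resType = {0, 1, 1, 0};
        // Sequence of the fragment at 0 threaded onto itself and onto the fragment at 2.
        System.out.println(threadScore(energy, resType, 0, 0, 2));
        System.out.println(threadScore(energy, resType, 0, 2, 2));
    }
}
```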


haveAllBackboneAtoms

private static boolean haveAllBackboneAtoms(Residue res)

extractCoordinatesOfRes

private static void extractCoordinatesOfRes(Residue res,
                                            double[][] co)

calculateSolRot1

private void calculateSolRot1(java.lang.String PDBfile,
                              CommandList commands)

specialMutate

private void specialMutate(Protein aux,
                           int mutateThis,
                           int mutateTo,
                           double[][] pp,
                           DunbrackLib lib)

findInUngapped

public int findInUngapped(java.lang.String proteinName,
                          int residueNumber)

calcRmsBetweenStruct

public double calcRmsBetweenStruct(int ind1,
                                   int ind2,
                                   int fragL,
                                   int manner,
                                   int overlap)
This method gives the rms between two fragments in the corpus (starting in ind1 and ind2). fragL is the fragment length. The fragments are superimposed according to an overlap region whose size is a parameter. The type of the overlap region is given by manner:
-1 - overlap of the N-terminus
0 - two overlap regions, one at each terminus (each of length overlap)
1 - overlap of the C-terminus
You can force the overlap of the entire fragments by giving (overlap=fragL) and (manner=1).
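
To make the 'manner' and 'overlap' parameters concrete, the sketch below lists which fragment-relative positions (0 to fragL-1) would form the superposition region under the description above. It only illustrates the index selection, not the superposition or the RMS calculation; OverlapRegionSketch and overlapPositions are hypothetical names, and treating position 0 as the N-terminal end is an assumption.

```java
final class OverlapRegionSketch {
    // Returns the fragment-relative positions used for superposition:
    // manner -1: the first 'overlap' positions (N-terminus)
    // manner  0: the first and the last 'overlap' positions
    // manner  1: the last 'overlap' positions (C-terminus)
    public static int[] overlapPositions(int fragL, int manner, int overlap) {
        if (overlap < 1 || overlap > fragL) {
            throw new IllegalArgumentException("overlap must be in 1..fragL");
        }
        if (manner < -1 || manner > 1) {
            throw new IllegalArgumentException("manner must be -1, 0 or 1");
        }
        boolean[] used = new boolean[fragL];
        if (manner <= 0) {
            for (int i = 0; i < overlap; i++) used[i] = true;
        }
        if (manner >= 0) {
            for (int i = fragL - overlap; i < fragL; i++) used[i] = true;
        }
        int count = 0;
        for (boolean u : used) {
            if (u) count++;
        }
        int[] positions = new int[count];
        int next = 0;
        for (int i = 0; i < fragL; i++) {
            if (used[i]) positions[next++] = i;
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(overlapPositions(9, 0, 2)));
    }
}
```

With overlap=fragL and manner=1 the selection covers every position, which matches the note about forcing the overlap of the entire fragments.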


buildUngappedArraynoPG

public void buildUngappedArraynoPG(int fragL)

threadingExperimentnoPG

public void threadingExperimentnoPG(int fragL)

findDisDist

public void findDisDist(int len)
Find the distribution of distances between ends.


findPhiPsiRelations

public void findPhiPsiRelations(int howMany)
Find the relation between Delta{phi,psi} and RMS


setExcludedProteinsFromUngapped

public void setExcludedProteinsFromUngapped(java.lang.String[] list)