ensembles.ens_build submodule

cg_openmm.ensembles.ens_build.get_ensemble(cgmodel, ensemble_size=100, high_energy=False, low_energy=False)[source]

Given a coarse-grained model, this function generates an ensemble of high-energy configurations and, by default, saves it to the foldamers/ensembles database for future use, provided that a high-energy ensemble with these settings does not already exist.

Parameters
  • cgmodel (class) – CGModel() class object.

  • ensemble_size (integer) – Number of structures to generate for this ensemble, default = 100

  • high_energy (Logical) – If set to ‘True’, this function will generate an ensemble of high-energy structures, default = False

  • low_energy (Logical) – If set to ‘True’, this function will generate an ensemble of low-energy structures, default = False

Returns

  • ensemble (List(positions(np.array(float*simtk.unit (shape = num_beads x 3))))) - A list of the positions for all members in the ensemble.

cg_openmm.ensembles.ens_build.get_ensemble_data(cgmodel, ensemble_directory)[source]

Given a CGModel() class object and an ‘ensemble_directory’, this function reads the PDB files within that directory, as well as any energy data those files contain.

Parameters
  • cgmodel (class) – CGModel() class object

  • ensemble_directory (str) – The path/name of the directory where PDB files for this ensemble are stored

Returns

  • ensemble (List(positions(np.array(float*simtk.unit (shape = num_beads x 3))))) - A list of the positions for all members in the ensemble.

  • ensemble_energies ( List(Quantity() )) - A list of the energies that were stored in the PDB files for the ensemble, if any.

Warning

When energies are written to a PDB file, only the sigma and epsilon values for the model are written alongside the positions. Unless you are confident about the model parameters that were used to generate the energies in the PDB files, it is best to recalculate them. The ‘cg_openmm’ package can do this: ‘get_mm_energy’, a function in ‘cg_openmm/cg_openmm/simulation/tools.py’, computes an updated energy for an individual ensemble member using the current coarse-grained model parameters.

cg_openmm.ensembles.ens_build.get_ensemble_directory(cgmodel, ensemble_type=None)[source]

Given a CGModel() class object, this function uses its attributes to assign an ensemble directory name.

For example, the directory name for a model with 20 monomers, all of which contain one backbone bead and one sidechain bead, and whose bond lengths are all 7.5 Angstroms, would be: “foldamers/ensembles/20_1_1_0_7.5_7.5_7.5”.

Parameters
  • cgmodel (class) – CGModel() class object

  • ensemble_type (str) – Designates the type of ensemble for which we will assign a directory name. default = None. Valid options include: “native” and “nonnative”

Returns

  • ensemble_directory ( str ) - The path/name for the ensemble directory.
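
As a rough illustration of the naming convention in the example above, the sketch below joins a sequence of model descriptors with underscores. The helper name build_ensemble_directory_name and its arguments are hypothetical and are not part of cg_openmm. The meaning of the fourth field (0 in the example) is not stated above, so it is passed through as an opaque value.

```python
def build_ensemble_directory_name(fields, root=("foldamers", "ensembles")):
    """Hypothetical helper: join model descriptors into a directory name.

    `fields` is any sequence of values describing the model; this sketch
    does not interpret them, it only formats them.
    """
    name = "_".join(str(field) for field in fields)
    return "/".join(list(root) + [name])


# Descriptors taken from the example in the text: 20 monomers, one backbone
# bead, one sidechain bead, an unexplained fourth field, three bond lengths.
directory = build_ensemble_directory_name([20, 1, 1, 0, 7.5, 7.5, 7.5])
print(directory)
```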

cg_openmm.ensembles.ens_build.get_ensembles(cgmodel, native_structure, ensemble_size=None)[source]

Given a native structure as input, this function builds both native and nonnative ensembles.

Parameters
  • cgmodel (class) – CGModel() class object

  • native_structure – The positions for the model’s native structure

  • ensemble_size (int) – The number of poses to generate for the nonnative ensemble, default = None

Returns

  • nonnative_ensemble (List(positions(np.array(float*simtk.unit (shape = num_beads x 3))))) - A list of the positions for all members in the nonnative ensemble

  • nonnative_ensemble_energies ( List(Quantity() )) - A list of the energies for all members of the nonnative ensemble

  • native_ensemble (List(positions(np.array(float*simtk.unit (shape = num_beads x 3))))) - A list of the positions for all members in the native ensemble

  • native_ensemble_energies ( List(Quantity() )) - A list of the energies for the native ensemble

cg_openmm.ensembles.ens_build.get_ensembles_from_replica_positions(cgmodel, replica_positions, replica_energies, temperature_list, native_fraction_cutoff=0.95, nonnative_fraction_cutoff=0.9, native_ensemble_size=10, nonnative_ensemble_size=100, decorrelate=True, native_structure_contact_distance_cutoff=None, optimize_Q=False)[source]

Given a coarse grained model and replica positions, this function: 1) decorrelates the samples, 2) clusters the samples with MSMBuilder, and 3) generates native and nonnative ensembles based upon the RMSD positions of decorrelated samples.

Parameters
  • cgmodel (class) – CGModel() class object

  • replica_positions (np.array( num_replicas x num_steps x np.array(float*simtk.unit (shape = num_beads x 3)))) – Positions for all output frames for all replicas

  • replica_energies (List( List( float * simtk.unit.energy for simulation_steps ) for num_replicas )) – List of dimension num_replicas X simulation_steps, which gives the energies for all replicas at all simulation steps

  • temperature_list (List( SIMTK Unit() * number_replicas )) – List of temperatures that will be used to define different replicas (thermodynamic states)

  • native_fraction_cutoff (float) – The fraction of native contacts above which a pose is considered ‘native’

  • nonnative_fraction_cutoff (float) – The fraction of native contacts below which a pose is considered ‘nonnative’

  • native_ensemble_size (int) – The number of poses to generate for a native ensemble

  • nonnative_ensemble_size (int) – The number of poses to generate for a nonnative ensemble

  • decorrelate (Logical) – Determines whether or not to subsample the replica exchange trajectories using pymbar, default = True

  • native_structure_contact_distance_cutoff – The distance below which two nonbonded, interacting particles are defined as a “native contact”, default = None

  • optimize_Q – Determines whether or not to call a procedure that optimizes parameters which influence determination of native contacts

Returns

  • native_ensemble (List(positions(np.array(float*simtk.unit (shape = num_beads x 3))))) - A list of the positions for all members in the native ensemble

  • native_ensemble_energies ( List(Quantity() )) - A list of the energies for the native ensemble

  • nonnative_ensemble (List(positions(np.array(float*simtk.unit (shape = num_beads x 3))))) - A list of the positions for all members in the nonnative ensemble

  • nonnative_ensemble_energies ( List(Quantity() )) - A list of the energies for all members of the nonnative ensemble
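
The final step, sorting poses into native and nonnative ensembles, can be sketched as below. This is a minimal illustration and not the cg_openmm implementation: it assumes the fraction of native contacts has already been computed for each pose, that poses above native_fraction_cutoff are native, that poses below nonnative_fraction_cutoff are nonnative, and that poses between the two cutoffs are discarded. The function name split_by_native_fraction is hypothetical, and decorrelation, clustering, and the ensemble size limits are left out.

```python
def split_by_native_fraction(poses, energies, native_fractions,
                             native_fraction_cutoff=0.95,
                             nonnative_fraction_cutoff=0.9):
    """Hypothetical helper: partition poses by their fraction of native contacts."""
    native_ensemble, native_ensemble_energies = [], []
    nonnative_ensemble, nonnative_ensemble_energies = [], []
    for pose, energy, fraction in zip(poses, energies, native_fractions):
        if fraction > native_fraction_cutoff:
            native_ensemble.append(pose)
            native_ensemble_energies.append(energy)
        elif fraction < nonnative_fraction_cutoff:
            nonnative_ensemble.append(pose)
            nonnative_ensemble_energies.append(energy)
        # Poses between the two cutoffs are assigned to neither ensemble.
    return (native_ensemble, native_ensemble_energies,
            nonnative_ensemble, nonnative_ensemble_energies)


# Placeholder labels stand in for position arrays.
poses = ["A", "B", "C", "D"]
energies = [-10.0, -4.0, -9.5, -7.0]
native_fractions = [0.98, 0.40, 0.96, 0.92]

native, native_e, nonnative, nonnative_e = split_by_native_fraction(
    poses, energies, native_fractions
)
print(native, nonnative)
```

In this example pose "D" has a native-contact fraction of 0.92, which falls between the two default cutoffs, so it appears in neither ensemble.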

cg_openmm.ensembles.ens_build.get_native_ensemble(cgmodel, native_structure, ensemble_size=10, native_fraction_cutoff=0.9, rmsd_cutoff=10.0, ensemble_build_method='mbar')[source]

Given a native structure as input, this function builds a “native” ensemble of structures.

Parameters
  • cgmodel (class) – CGModel() class object

  • native_structure – The positions for the model’s native structure

  • ensemble_size (int) – The number of poses to generate for the native ensemble, default = 10

  • native_fraction_cutoff (float) – The fraction of native contacts above which a pose will be considered “native”, default = 0.9

  • rmsd_cutoff (float) – The distance beyond which non-bonded interactions will be ignored, default = 10.0 x bond_length

  • ensemble_build_method (str) – The method that will be used to generate a native ensemble. Valid options include “mbar” and “native_contacts”. If the “mbar” approach is chosen, decorrelated replica exchange simulation data is used to generate the native ensemble. If the “native_contacts” approach is chosen, individual NVT simulations are used to generate the native ensemble, default = “mbar”

Returns

  • ensemble (List(positions(np.array(float*simtk.unit (shape = num_beads x 3))))) - A list of the positions for all members in the ensemble.

  • ensemble_energies ( List(Quantity() )) - A list of the energies that were stored in the PDB files for the ensemble, if any.

cg_openmm.ensembles.ens_build.get_native_structure(replica_positions, replica_energies, temperature_list)[source]

Given replica exchange run positions and energies, this function identifies the “native” structure, calculated as the structure with the lowest reduced potential energy.

Parameters
  • replica_energies (List( List( float * simtk.unit.energy for simulation_steps ) for num_replicas )) – List of dimension num_replicas X simulation_steps, which gives the energies for all replicas at all simulation steps

  • replica_positions (np.array( ( float * simtk.unit.positions for num_beads ) for simulation_steps )) – List of positions for all output frames for all replicas

  • temperature_list (List( SIMTK Unit() * number_replicas )) – List of temperatures that will be used to define different replicas (thermodynamic states)

Returns

  • native_structure ( np.array( float * simtk.unit.positions for num_beads ) ) - The predicted native structure
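
The selection rule described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library code: energies are plain floats in kJ/mol, temperatures are plain floats in kelvin, replica i is assumed to stay at temperature_list[i], and the reduced potential is taken to be E / (kB T). The function name find_lowest_reduced_energy_frame is hypothetical.

```python
# Molar Boltzmann constant (gas constant) in kJ/(mol*K).
KB = 0.008314462618


def find_lowest_reduced_energy_frame(replica_positions, replica_energies,
                                     temperature_list):
    """Hypothetical helper: return the frame with the lowest E / (kB * T)."""
    best = None  # (reduced energy, replica index, step index)
    for replica, (energies, temperature) in enumerate(
            zip(replica_energies, temperature_list)):
        for step, energy in enumerate(energies):
            reduced = energy / (KB * temperature)
            if best is None or reduced < best[0]:
                best = (reduced, replica, step)
    _, replica, step = best
    return replica_positions[replica][step], replica, step


# Two replicas, two frames each; each frame holds one bead for brevity.
replica_positions = [
    [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)]],
    [[(0.0, 1.0, 0.0)], [(0.0, 0.0, 1.0)]],
]
replica_energies = [[-5.0, -8.0], [-9.0, -6.0]]
temperature_list = [300.0, 600.0]

native_structure, replica, step = find_lowest_reduced_energy_frame(
    replica_positions, replica_energies, temperature_list
)
print(replica, step, native_structure)
```

Note that the frame with the lowest raw energy (-9.0 in the 600 K replica) is not selected here, because dividing by kB T favors the -8.0 frame from the 300 K replica.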

cg_openmm.ensembles.ens_build.get_nonnative_ensemble(cgmodel, native_structure, ensemble_size=100, native_fraction_cutoff=0.75, rmsd_cutoff=10.0, ensemble_build_method='mbar')[source]

Given a native structure as input, this function builds a “nonnative” ensemble of structures.

Parameters
  • cgmodel (class) – CGModel() class object

  • native_structure – The positions for the model’s native structure

  • ensemble_size (int) – The number of poses to generate for the nonnative ensemble, default = 100

  • native_fraction_cutoff (float) – The fraction of native contacts below which a pose will be considered “nonnative”, default = 0.75

  • rmsd_cutoff (float) – The distance beyond which non-bonded interactions will be ignored, default = 10.0 x bond_length

  • ensemble_build_method (str) – The method that will be used to generate a nonnative ensemble. Valid options include “mbar” and “native_contacts”. If the “mbar” approach is chosen, decorrelated replica exchange simulation data is used to generate the nonnative ensemble. If the “native_contacts” approach is chosen, individual NVT simulations are used to generate the nonnative ensemble, default = “mbar”

Returns

  • ensemble (List(positions(np.array(float*simtk.unit (shape = num_beads x 3))))) - A list of the positions for all members in the ensemble.

  • ensemble_energies ( List(Quantity() )) - A list of the energies that were stored in the PDB files for the ensemble, if any.

cg_openmm.ensembles.ens_build.get_pdb_list(ensemble_directory)[source]

Given an ‘ensemble_directory’, this function retrieves a list of the PDB files within it.

Parameters

  • ensemble_directory (str) – Path to a folder containing PDB files

Returns

  • pdb_list ( List(str) ) - A list of the PDB files in the provided ‘ensemble_directory’.
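
A directory listing of this kind can be sketched with the standard library alone. The helper below is hypothetical and only approximates the documented behavior: it assumes that PDB files are identified by a “.pdb” extension and that returning sorted full paths is acceptable.

```python
import os
import tempfile


def list_pdb_files(ensemble_directory):
    """Hypothetical helper: return sorted paths of the .pdb files in a folder."""
    return sorted(
        os.path.join(ensemble_directory, name)
        for name in os.listdir(ensemble_directory)
        if name.lower().endswith(".pdb")
    )


# Build a throwaway directory so the example runs anywhere.
with tempfile.TemporaryDirectory() as directory:
    for name in ("pose_1.pdb", "pose_2.pdb", "notes.txt"):
        open(os.path.join(directory, name), "w").close()
    pdb_names = [os.path.basename(path) for path in list_pdb_files(directory)]

print(pdb_names)
```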

cg_openmm.ensembles.ens_build.improve_ensemble(energy, positions, ensemble, ensemble_energies, unchanged_iterations)[source]

Given an energy and positions for a single pose, as well as the same data for a reference ensemble, this function “improves” the quality of the ensemble by identifying poses with the lowest potential energy.

Parameters
  • energy – The energy for a pose.

  • positions – Positions for the coarse-grained particles in the pose.

  • ensemble (List(positions(np.array(float*simtk.unit (shape = num_beads x 3))))) – A group of similar poses.

  • ensemble_energies – A list of energies for a conformational ensemble.

  • unchanged_iterations (int) – The number of iterations for which the ensemble has gone unchanged.

Returns

  • ensemble (List(positions(np.array(float*simtk.unit (shape = num_beads x 3))))) - A list of the positions for all members in the ensemble.

  • ensemble_energies ( List(Quantity() )) - A list of the energies that were stored in the PDB files for the ensemble, if any.

  • unchanged_iterations ( int ) - The number of iterations for which the ensemble has gone unchanged.
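
One plausible reading of this update step is sketched below. The replacement rule is an assumption, not taken from the library source: the candidate pose replaces the highest-energy member when its energy is lower, which resets unchanged_iterations to zero; otherwise the counter is incremented. The sketch assumes a non-empty ensemble and plain float energies, and the name improve_ensemble_sketch is hypothetical.

```python
def improve_ensemble_sketch(energy, positions, ensemble, ensemble_energies,
                            unchanged_iterations):
    """Hypothetical sketch: keep the lowest-energy poses seen so far."""
    # Index of the current highest-energy (worst) member.
    worst = max(range(len(ensemble_energies)),
                key=ensemble_energies.__getitem__)
    if energy < ensemble_energies[worst]:
        # Copy the lists so the caller's inputs are not modified in place.
        ensemble = list(ensemble)
        ensemble_energies = list(ensemble_energies)
        ensemble[worst] = positions
        ensemble_energies[worst] = energy
        unchanged_iterations = 0
    else:
        unchanged_iterations += 1
    return ensemble, ensemble_energies, unchanged_iterations


# Placeholder labels stand in for position arrays.
ensemble = ["p1", "p2", "p3"]
ensemble_energies = [-3.0, -1.0, -2.0]

# A lower-energy candidate replaces the worst member ("p2").
ensemble, ensemble_energies, unchanged = improve_ensemble_sketch(
    -2.5, "p4", ensemble, ensemble_energies, unchanged_iterations=4
)

# A higher-energy candidate leaves the ensemble alone and bumps the counter.
ensemble_2, energies_2, unchanged_2 = improve_ensemble_sketch(
    0.0, "p5", ensemble, ensemble_energies, unchanged
)
print(ensemble, ensemble_energies, unchanged, unchanged_2)
```

A caller could stop iterating once unchanged_iterations exceeds some threshold, treating the ensemble as converged.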

cg_openmm.ensembles.ens_build.test_energy(energy)[source]

Given an energy, this function determines if that energy is too large to be “physical”. This function is used to determine if the user-defined input parameters for a coarse grained model give a reasonable potential function.

Parameters

  • energy – The energy to test.

Returns

  • pass_energy_test ( Logical ) - A variable indicating if the energy passed (“True”) or failed (“False”) a “sanity” test for the model’s energy.
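
A sanity check of this kind might look like the sketch below. The threshold value and the function name energy_is_physical are assumptions made for illustration; the criterion actually used by test_energy is not stated here. The sketch treats an energy as unphysical if it is not finite or if its magnitude exceeds the threshold.

```python
import math


def energy_is_physical(energy, max_magnitude=1.0e6):
    """Hypothetical check: True if the energy is finite and not enormous.

    `max_magnitude` is an illustrative threshold in the same (unitless)
    scale as `energy`; it is not a value taken from cg_openmm.
    """
    return math.isfinite(energy) and abs(energy) < max_magnitude


reasonable = energy_is_physical(-125.0)
overlapping_beads = energy_is_physical(1.0e12)
undefined = energy_is_physical(float("nan"))
print(reasonable, overlapping_beads, undefined)
```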

cg_openmm.ensembles.ens_build.write_ensemble_pdb(cgmodel, ensemble_directory=None)[source]

Given a CGModel() class object that contains positions, this function writes a PDB file for the coarse grained model, using those positions.

Parameters
  • cgmodel (class) – CGModel() class object

  • ensemble_directory (str) – Path to a folder containing PDB files, default = None

Warning

If no ‘ensemble_directory’ is provided, the

cg_openmm.ensembles.ens_build.z_score(nonnative_ensemble_energies, native_ensemble_energies)[source]

Given a set of nonnative and native ensemble energies, this function computes the Z-score (for a set of model parameters).

Parameters
  • nonnative_ensemble_energies – A list of the energies for all members of the nonnative ensemble

  • native_ensemble_energies – A list of the energies for the native ensemble

Returns

  • z_score ( float ) - The Z-score for the input ensembles.
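
For orientation, one common definition of the Z-score measures how far the mean native energy lies from the mean nonnative energy, in units of the nonnative standard deviation. The sketch below assumes that definition, including its sign convention and the use of the population standard deviation; the formula used by z_score itself may differ. Energies are plain floats and the name z_score_sketch is hypothetical.

```python
import statistics


def z_score_sketch(nonnative_ensemble_energies, native_ensemble_energies):
    """Hypothetical sketch: (<E_native> - <E_nonnative>) / std(E_nonnative)."""
    nonnative_mean = statistics.mean(nonnative_ensemble_energies)
    nonnative_std = statistics.pstdev(nonnative_ensemble_energies)
    native_mean = statistics.mean(native_ensemble_energies)
    return (native_mean - nonnative_mean) / nonnative_std


nonnative_energies = [-2.0, -4.0, -6.0, -8.0]
native_energies = [-14.0, -16.0]

z = z_score_sketch(nonnative_energies, native_energies)
print(z)
```

With this sign convention, a more negative value indicates a native ensemble that is better separated in energy from the nonnative ensemble.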