thermo.calc submodule
- cg_openmm.thermo.calc.bootstrap_heat_capacity(frame_begin=0, sample_spacing=1, frame_end=-1, plot_file='heat_capacity_boot.pdf', output_data='output/output.nc', num_intermediate_states=0, frac_dT=0.05, conf_percent='sigma', n_trial_boot=200, U_kln=None, sparsify_stride=1, compute_entropies=False)[source]
Calculate and plot the heat capacity curve, with uncertainty determined using bootstrapping. Uncorrelated datasets are selected using a random starting frame, repeated n_trial_boot times. Uncertainty in melting point and full-width half maximum of the Cv curve are also returned. If a re-evaluated energy array is specified, the heat capacity of the re-evaluated system is calculated using MBAR reweighting, based on the original energies in the output.nc file.
Note
If using a re-evaluated energy matrix, U_kln should include all frames after frame_begin
- Parameters
frame_begin (int) – index of first frame defining the range of samples to use as a production period (default=0)
sample_spacing (int) – spacing of uncorrelated data points, for example determined from pymbar timeseries subsampleCorrelatedData (default=1)
frame_end (int) – index of last frame to include in heat capacity calculation (default=-1)
plot_file (str) – path to filename to output plot (default=’heat_capacity_boot.pdf’)
output_data (str) – Path to the output data for a NetCDF-formatted file containing replica exchange simulation data (default = “output/output.nc”)
num_intermediate_states (int) – The number of states to insert between existing states in ‘temperature_list’ (default=0)
frac_dT (float) – The fraction difference between temperatures points used to calculate finite difference derivatives (default=0.05)
conf_percent (float) – Confidence level in percent for outputting uncertainties (default = 68.27 = 1 sigma)
n_trial_boot (int) – number of trials to run for generating bootstrapping uncertainties (default=200)
U_kln (3d numpy array (float) with dimensions [replica, evaluated_state, frame]) – re-evaluated state energies array to be used for the MBAR calculation (starts at frame_begin)
sparsify_stride (int) – apply this stride to the replica energies from file, to match sparsified reevaluated energies (default=1)
compute_entropies (bool) – option to compute the entropy changes from lowest simulated temperature to each other temperature defined by num_intermediate_states (default=False)
- Returns
T_list ( List( float * unit.simtk.temperature ) ) - The temperature list corresponding to the heat capacity values in ‘Cv’
Cv_values ( List( float * kJ/mol/K ) ) - The heat capacity values for all (including inserted intermediates) states
Cv_uncertainty ( Tuple ( np.array(float) * kJ/mol/K ) ) - confidence interval for all Cv_values computed from bootstrapping
Tm_value ( float * unit.simtk.temperature ) - Melting point mean value computed from bootstrapping
Tm_uncertainty ( Tuple ( float * unit.simtk.temperature ) ) - confidence interval for melting point computed from bootstrapping
Cv_height_value ( float * kJ/mol/K ) - Height of the Cv peak relative to the lowest value over the temperature range used.
Cv_height_uncertainty ( Tuple ( np.array(float) * kJ/mol/K ) ) - confidence interval for all Cv_height_value computed from bootstrapping
FWHM_value ( float * unit.simtk.temperature ) - Cv full width half maximum mean value computed from bootstrapping
FWHM_uncertainty ( Tuple ( float * unit.simtk.temperature ) ) - confidence interval for Cv full width half maximum computed from bootstrapping
N_eff_values ( np.array( float ) ) - The bootstrap mean number of effective samples at all simulated and non-simulated (including inserted intermediates) states
- cg_openmm.thermo.calc.bootstrap_partial_heat_capacities(array_folded_states, frame_begin=0, sample_spacing=1, frame_end=-1, plot_file='partial_heat_capacity_boot.pdf', output_data='output/output.nc', num_intermediate_states=0, frac_dT=0.05, conf_percent='sigma', n_trial_boot=200)[source]
Given an array classifying each frame into discrete conformational states, compute the heat capacity curve, and contributions from each conformational class to the heat capacity curve, with uncertainties determined using bootstrapping. Uncorrelated datasets are selected using a random starting frame, repeated n_trial_boot times.
Note
array_folded_states should include all frames after frame_begin
Note
The partial contributions to heat capacity must be weighted by the probabilities of finding each conformational class to reconstruct the total heat capacity curve.
- Parameters
array_folded_states (2d numpy array (int)) – a precomputed array classifying the different conformational states
frame_begin (int) – index of first frame defining the range of samples to use as a production period (default=0)
sample_spacing (int) – spacing of uncorrelated data points, for example determined from pymbar timeseries subsampleCorrelatedData (default=1)
frame_end (int) – index of last frame to include in heat capacity calculation (default=-1)
plot_file (str) – path to filename to output plot (default=’heat_capacity_boot.pdf’)
output_data (str) – Path to the output data for a NetCDF-formatted file containing replica exchange simulation data (default = “output/output.nc”)
num_intermediate_states (int) – The number of states to insert between existing states in ‘temperature_list’ (default=0)
frac_dT (float) – The fraction difference between temperatures points used to calculate finite difference derivatives (default=0.05)
conf_percent (float) – Confidence level in percent for outputting uncertainties (default = 68.27 = 1 sigma)
n_trial_boot (int) – number of trials to run for generating bootstrapping uncertainties (default=200)
- Returns
T_list ( np.array ( float * unit.simtk.temperature ) ) - The temperature list corresponding to the heat capacity values in ‘Cv’
Cv_values ( dict ( np.array ( float * kJ/mol/K ) ) ) - The heat capacity values for all (including inserted intermediates) states
Cv_uncertainty ( dict ( Tuple ( np.array(float) * kJ/mol/K ) ) ) - confidence interval for all Cv_values computed from bootstrapping
Tm_value ( dict ( float * unit.simtk.temperature ) ) - Melting point mean value computed from bootstrapping
Tm_uncertainty ( dict ( Tuple ( float * unit.simtk.temperature ) ) ) - confidence interval for melting point computed from bootstrapping
Cv_height_value ( dict ( float * kJ/mol/K ) ) - Height of the Cv peak relative to the lowest value over the temperature range used.
Cv_height_uncertainty ( dict ( Tuple ( np.array(float) * kJ/mol/K ) ) ) - confidence interval for all Cv_height_value computed from bootstrapping
FWHM_value ( dict ( float * unit.simtk.temperature ) ) - Cv full width half maximum mean value computed from bootstrapping
FWHM_uncertainty ( dict ( Tuple ( float * unit.simtk.temperature ) ) ) - confidence interval for Cv full width half maximum computed from bootstrapping
U_expect_values ( dict ( np.array ( float * kJ/mol ) ) ) - Energy expectation values for each conformational state, at each T in T_list
U_expect_uncertainty ( dict ( Tuple ( np.array(float) * kJ/mol ) ) ) - confidence interval for U_expect_values computed from bootstrapping
- cg_openmm.thermo.calc.get_cv_FWHM(Cv_values, T_list)[source]
Internal function for getting the full-width half-maximum, melting point, and peak height from a heat capacity vs T dataset.
- Parameters
Cv_values (float * 1D np.array) – heat capacity data series (unitless)
T_list (float * 1D np.array) – temperature data series (unitless)
- Returns
FWHM ( float ) - Full width half maximum from heat capacity vs T
Tm ( float ) - Melting point from heat capacity vs T
Cv_height ( float ) - Relative height of heat capacity peak
- cg_openmm.thermo.calc.get_heat_capacity(frame_begin=0, sample_spacing=1, frame_end=-1, output_data='output/output.nc', num_intermediate_states=0, frac_dT=0.05, plot_file=None, bootstrap_energies=None)[source]
Given a .nc output and a number of intermediate states to insert for the temperature list, this function calculates and plots the heat capacity profile.
- Parameters
frame_begin (int) – index of first frame defining the range of samples to use as a production period (default=0)
sample_spacing (int) – spacing of uncorrelated data points, for example determined from pymbar timeseries subsampleCorrelatedData (default=1)
frame_end (int) – index of last frame to include in heat capacity calculation (default=-1)
output_data (str) – Path to the output data for a NetCDF-formatted file containing replica exchange simulation data (default = “output/output.nc”)
num_intermediate_states (int) – The number of states to insert between existing states in ‘temperature_list’ (default=0)
frac_dT (float) – The fraction difference between temperatures points used to calculate finite difference derivatives (default=0.05)
plotfile (str) – path to filename to output plot
bootstrap_energies (3d numpy array (float)) – a custom replica_energies array to be used for bootstrapping calculations. Used instead of the energies in the .nc file.
- Returns
Cv ( List( float * kJ/mol/K ) ) - The heat capacity values for all (including inserted intermediates) states
dCv ( List( float * kJ/mol/K ) ) - The uncertainty in the heat capacity values for intermediate states
new_temp_list ( List( float * unit.simtk.temperature ) ) - The temperature list corresponding to the heat capacity values in ‘Cv’
FWHM ( float * unit.simtk.temperature ) - Full width half maximum from heat capacity vs T
Tm ( float * unit.simtk.temperature ) - Melting point from heat capacity vs T
Cv_height ( float * kJ/mol/K ) - Relative height of heat capacity peak
N_eff( np.array( float ) ) - The number of effective samples at all (including inserted intermediates) states
- cg_openmm.thermo.calc.get_heat_capacity_derivative(Cv, temperature_list, plotfile='dCv_dT.pdf')[source]
Fit a heat capacity vs T dataset to cubic spline, and compute derivatives
- Parameters
Cv (Quantity or numpy 1D array) – heat capacity data series
temperature_list – List of temperatures used in replica exchange simulations
plotfile (str) – path to filename to output plot
- Returns
dCv_out ( 1D numpy array (float) ) - 1st derivative of heat capacity, from a cubic spline evaluated at each point in Cv)
d2Cv_out ( 1D numpy array (float) ) - 2nd derivative of heat capacity, from a cubic spline evaluated at each point in Cv)
spline_tck ( scipy spline object (tuple) ) - knot points (t), coefficients (c), and order of the spline (k) fit to Cv data
- cg_openmm.thermo.calc.get_heat_capacity_integral_entropy(Cv, temperature_list, plotfile='Cv_entropy_change.pdf')[source]
Fit Cv/T vs T dataset to cubic spline, and compute integrals (delta_S(T)) from T_0 to T_i Note this is the total entropy change from all states upon a temperature change, separate from the entropy of folding.
- Parameters
Cv (Quantity or numpy 1D array) – heat capacity data series
temperature_list – List of temperatures used in replica exchange simulations
plotfile (str) – path to filename to output plot (default=’Cv_entropy_change.pdf’)
- Returns
dCv_out ( 1D numpy array (float) ) - entropy changes from lowest temperature to each temperature in temperature_list
- cg_openmm.thermo.calc.get_heat_capacity_reeval(U_kln, output_data='output/output.nc', frame_begin=0, sample_spacing=1, frame_end=-1, num_intermediate_states=0, frac_dT=0.05, plot_file_sim=None, plot_file_reeval=None, bootstrap_energies=None)[source]
Given an array of re-evaluated energies at a non-simulated set of force field parameters, and a corresponding temperature list, compute heat capacity as a function of temperature.
- Parameters
U_kln (3d numpy array (float) with dimensions [replica, evaluated_state, frame]) – re-evaluated state energies array to be used for the MBAR calculation (first frame is frame_begin)
output_data (str) – Path to the output data for a NetCDF-formatted file containing replica exchange simulation data (default = “output/output.nc”)
frame_begin (int) – index of first frame defining the range of samples to use as a production period (default=0)
sample_spacing (int) – spacing of uncorrelated data points, for example determined from pymbar timeseries subsampleCorrelatedData (default=1)
frame_end (int) – index of last frame to include in heat capacity calculation (default=-1)
num_intermediate_states (int) – The number of states to insert between existing states in ‘temperature_list’ (default=0)
frac_dT (float) – The fraction difference between temperatures points used to calculate finite difference derivatives (default=0.05)
plot_file_sim (str) – path to filename to output plot for simulated heat capacity (default=None)
plot_file_reeval (str) – path to filename to output plot for reevaluated heat capacity (default=None)
bootstrap_energies (3d numpy array (float)) – a custom replica_energies array to be used for bootstrapping calculations. Used instead of the energies in the .nc file. (default=None)
- Returns
Cv_sim ( np.array( float * kJ/mol/K ) ) - The heat capacity values for all simulated (including inserted intermediates) states
dCv_sim ( np.array( float * kJ/mol/K ) ) - The uncertainty of Cv_sim values
Cv_reeval ( np.array( float * kJ/mol/K ) ) - The heat capacity values for all reevaluated (including inserted intermediates) states
dCv_reeval ( np.array( float * kJ/mol/K ) ) - The uncertainty of Cv_reeval values
full_T_list ( np.array( float * unit.simtk.temperature ) ) - The temperature list corresponding to the heat capacity values in ‘Cv’
FWHM ( float * unit.simtk.temperature ) - Full width half maximum from heat capacity vs T
Tm ( float * unit.simtk.temperature ) - Melting point from heat capacity vs T
Cv_height ( float * kJ/mol/K ) - Relative height of heat capacity peak
N_eff( np.array( float ) ) - The number of effective samples at all (including inserted intermediates) states
- cg_openmm.thermo.calc.get_partial_heat_capacities(array_folded_states, frame_begin=0, sample_spacing=1, frame_end=-1, output_data='output/output.nc', num_intermediate_states=0, frac_dT=0.05, plot_file=None, bootstrap_energies=None)[source]
Given an array classifying each frame into discrete conformational states, compute the heat capacity curve, and contributions from each conformational class to the heat capacity curve. Uncertainties can be calculated using the function bootstrap_partial_heat_capacities.
***TODO: add the uncertainty calculation
Note
array_folded_states should include all frames after frame_begin
Note
The partial contributions to heat capacity must be weighted by the probabilities of finding each conformational class to reconstruct the total heat capacity curve.
- Parameters
array_folded_states (2d numpy array (int)) – a precomputed array classifying the different conformational states
frame_begin (int) – index of first frame defining the range of samples to use as a production period (default=0)
sample_spacing (int) – spacing of uncorrelated data points, for example determined from pymbar timeseries subsampleCorrelatedData (default=1)
frame_end (int) – index of last frame to include in heat capacity calculation (default=-1)
output_data (str) – Path to the output data for a NetCDF-formatted file containing replica exchange simulation data (default = “output/output.nc”)
num_intermediate_states (int) – The number of states to insert between existing states in ‘temperature_list’ (default=0)
frac_dT (float) – The fraction difference between temperatures points used to calculate finite difference derivatives (default=0.05)
plotfile (str) – path to filename to output plot
bootstrap_energies (3d numpy array (float)) – a custom replica_energies array to be used for bootstrapping calculations. Used instead of the energies in the .nc file.
- Returns
Cv ( dict ( List( float * kJ/mol/K ) ) ) - For each conformational class, the heat capacity values for all (including inserted intermediates) states
***dCv ( dict ( List( float * kJ/mol/K ) ) ) - The uncertainty in the heat capacity values for intermediate states (not yet implemented)
new_temp_list ( List( float * unit.simtk.temperature ) ) - The temperature list corresponding to the heat capacity values in ‘Cv’
FWHM_partial ( dict ( float * unit.simtk.temperature ) ) - For each conformational class, full width half maximum from heat capacity vs T
Tm_partial ( dict ( float * unit.simtk.temperature ) ) - For each conformational class, melting point from heat capacity vs T
Cv_height_partial ( dict ( float * kJ/mol/K ) ) - For each conformational class, relative height of heat capacity peak
U_expect_confs ( dict ( np.array( float * kJ/mol ) ) - For each conformational class, the energy expectations (in kJ/mol) at each T, including intermediate states
- cg_openmm.thermo.calc.plot_heat_capacity(Cv, dCv, temperature_list, file_name='heat_capacity.pdf')[source]
Given an array of temperature-dependent heat capacity values and the uncertainties in their estimates, this function plots the heat capacity curve.
- Parameters
Cv (List( float * kJ/mol/K )) – The heat capacity data to plot.
dCv (List( float * kJ/mol/K )) – The uncertainties in the heat capacity data
temperature_list – List of temperatures used in replica exchange simulations
file_name (str) – The name/path of the file where plotting output will be written, default = “heat_capacity.pdf”
- cg_openmm.thermo.calc.plot_partial_heat_capacities(Cv_partial, dCv, temperature_list, file_name='heat_capacity_partial.pdf')[source]
Given an array of temperature-dependent heat capacity values and the uncertainties in their estimates, this function plots the heat capacity curve.
- Parameters
Cv_partial (dict ( list( float * kJ/mol/K ) )) – The heat capacity data to plot, grouped by conformational state.
dCv (dict ( list( float * kJ/mol/K ) )) – The uncertainties corresponding to Cv_partial
temperature_list – List of temperatures used in replica exchange simulations
file_name (str) – The name/path of the file where plotting output will be written, default = “heat_capacity_partial.pdf”