gt4sd.algorithms.controlled_sampling.paccmann_gp.implementation module

Implementation of PaccMann^GP conditional generator.

Summary

Classes:

GPConditionalGenerator

Conditional generator as implemented in https://doi.org/10.1021/acs.jcim.1c00889.

Reference

class GPConditionalGenerator(resources_path, temperature=1.4, generated_length=100, batch_size=32, limit=5.0, acquisition_function='EI', number_of_steps=32, number_of_initial_points=16, initial_point_generator='random', seed=42, number_of_optimization_rounds=1, sampling_variance=0.1, samples_for_evaluation=4, maximum_number_of_sampling_steps=32, device=None)[source]

Bases: object

Conditional generator as implemented in https://doi.org/10.1021/acs.jcim.1c00889.

__init__(resources_path, temperature=1.4, generated_length=100, batch_size=32, limit=5.0, acquisition_function='EI', number_of_steps=32, number_of_initial_points=16, initial_point_generator='random', seed=42, number_of_optimization_rounds=1, sampling_variance=0.1, samples_for_evaluation=4, maximum_number_of_sampling_steps=32, device=None)[source]

Initialize the conditional generator.

Parameters
  • resources_path (str) – directory containing the models and parameters.

  • temperature (float) – temperature parameter for the softmax sampling in decoding. Defaults to 1.4.

  • generated_length (int) – maximum length in tokens of the generated molecules (relates to the SMILES length). Defaults to 100.

  • batch_size (int) – batch size used for the generative model sampling. Defaults to 32.

  • limit (float) – hypercube limits in the latent space. Defaults to 5.0.

  • acquisition_function (str) – acquisition function used in the Gaussian process. Defaults to “EI”. More details in https://scikit-optimize.github.io/stable/modules/generated/skopt.gp_minimize.html.

  • number_of_steps (int) – number of steps for an optimization round. Defaults to 32.

  • number_of_initial_points (int) – number of initial points evaluated. Defaults to 16.

  • initial_point_generator (str) – scheme to generate initial points. Defaults to “random”. More details in https://scikit-optimize.github.io/stable/modules/generated/skopt.gp_minimize.html.

  • seed (int) – seed used for random number generation in the optimizer. Defaults to 42.

  • number_of_optimization_rounds (int) – maximum number of optimization rounds. Defaults to 1.

  • sampling_variance (float) – variance of the Gaussian noise applied during sampling from the optimal point. Defaults to 0.1.

  • samples_for_evaluation (int) – number of samples averaged for each minimization function evaluation. Defaults to 4.

  • maximum_number_of_sampling_steps (int) – maximum number of sampling steps in an optimization round. Defaults to 32.

  • device (Union[device, str, None]) – device used for computation. Defaults to None, in which case a default one is picked (“gpu” if present, “cpu” otherwise).
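
Several of the parameters above mirror arguments of skopt.gp_minimize, which the descriptions link to. The sketch below shows one plausible way they could be translated into keyword arguments for that function. It is an illustration under stated assumptions, not the library's code: the helper name build_gp_minimize_kwargs and the latent_dimension argument are hypothetical, and mapping number_of_steps to n_calls is a guess.

```python
from typing import Any, Dict, List, Tuple


def build_gp_minimize_kwargs(
    latent_dimension: int,
    limit: float = 5.0,
    acquisition_function: str = "EI",
    number_of_steps: int = 32,
    number_of_initial_points: int = 16,
    initial_point_generator: str = "random",
    seed: int = 42,
) -> Dict[str, Any]:
    """Hypothetical helper: map generator parameters to skopt.gp_minimize kwargs."""
    # The search space is a hypercube [-limit, limit] along every latent dimension.
    dimensions: List[Tuple[float, float]] = [(-limit, limit)] * latent_dimension
    return {
        "dimensions": dimensions,
        "acq_func": acquisition_function,
        # Assumption: one optimization round spends number_of_steps evaluations.
        "n_calls": number_of_steps,
        "n_initial_points": number_of_initial_points,
        "initial_point_generator": initial_point_generator,
        "random_state": seed,
    }


kwargs = build_gp_minimize_kwargs(latent_dimension=3)
print(kwargs["dimensions"])
```

With scikit-optimize installed, such a dictionary could be passed as skopt.gp_minimize(objective, **kwargs), where objective is whatever function scores a latent point.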

target_to_minimization_function(target)[source]

Use the target to configure a minimization function.

Parameters

target (Union[Dict[str, Dict[str, Any]], str]) – dictionary or JSON string describing the optimization target.

Return type

CombinedMinimization

Returns

a minimization function.
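
Because the target may arrive either as a dictionary or as a JSON string, a first step is to normalize it, then combine per-property scores into one value to minimize. The following is a minimal sketch of that idea, not the CombinedMinimization class itself. The names parse_target and combine, the "weight" key, the toy_score property, and the convention that scores lie in [0, 1] with higher being better are all assumptions made for illustration.

```python
import json
from typing import Any, Callable, Dict, Union

Target = Union[Dict[str, Dict[str, Any]], str]


def parse_target(target: Target) -> Dict[str, Dict[str, Any]]:
    """Return the target as a dictionary, decoding it from JSON if needed."""
    if isinstance(target, str):
        target = json.loads(target)
    if not isinstance(target, dict):
        raise ValueError("target must be a dictionary or a JSON object string")
    return target


def combine(
    scorers: Dict[str, Callable[[str], float]], target: Target
) -> Callable[[str], float]:
    """Build a weighted-sum objective where lower values are better."""
    parsed = parse_target(target)

    def objective(smiles: str) -> float:
        total = 0.0
        for name, parameters in parsed.items():
            # Hypothetical "weight" key; a missing weight counts as 1.0.
            weight = float(parameters.get("weight", 1.0))
            # Assumed score range [0, 1], higher is better, so minimize 1 - score.
            total += weight * (1.0 - scorers[name](smiles))
        return total

    return objective


# Toy scorer used only to exercise the sketch; it has no chemical meaning.
scorers = {"toy_score": lambda smiles: min(len(smiles) / 10.0, 1.0)}
from_json = combine(scorers, '{"toy_score": {"weight": 2.0}}')
from_dict = combine(scorers, {"toy_score": {"weight": 2.0}})
print(from_json("CCO"), from_dict("CCO"))
```

Both call styles yield the same objective, which is the point of accepting the two input types.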

set_seed()[source]

Set the seed for the random number generators.
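
The effect of seeding can be illustrated with the standard library alone. This is a sketch, not the method's body: the function name seed_everything is hypothetical, and a real implementation would presumably also seed the numerical and deep learning libraries it relies on.

```python
import random


def seed_everything(seed: int = 42) -> None:
    """Hypothetical helper: seed Python's built-in random number generator."""
    random.seed(seed)
    # Assumption: NumPy, PyTorch, or other generators in use would be seeded here too.


seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
print(first == second)
```

Re-seeding with the same value before each run makes the drawn sequences identical, which is what makes repeated optimization runs comparable.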

generate_batch(target)[source]

Generate molecules given a target.

Parameters

target (Any) – dictionary or JSON string describing the optimization target.

Return type

List[str]

Returns

a list of molecules as SMILES strings.
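
The sampling_variance parameter describes Gaussian noise applied when sampling from the optimal point. A minimal sketch of that step is shown below, assuming the optimal point is a plain list of latent coordinates. The function name sample_around is hypothetical, and clipping the noisy points back into the [-limit, limit] hypercube is an extra assumption rather than documented behaviour. Decoding the latent points into SMILES is left out.

```python
import math
import random
from typing import List, Sequence


def sample_around(
    optimal_point: Sequence[float],
    sampling_variance: float = 0.1,
    number_of_samples: int = 4,
    limit: float = 5.0,
    seed: int = 42,
) -> List[List[float]]:
    """Hypothetical helper: draw noisy latent points around an optimum."""
    rng = random.Random(seed)
    # random.gauss expects a standard deviation, so take the square root of the variance.
    standard_deviation = math.sqrt(sampling_variance)
    samples = []
    for _ in range(number_of_samples):
        noisy = [
            coordinate + rng.gauss(0.0, standard_deviation)
            for coordinate in optimal_point
        ]
        # Assumption: keep samples inside the latent hypercube.
        samples.append([max(-limit, min(limit, value)) for value in noisy])
    return samples


points = sample_around([0.0, 4.9, -4.9])
print(len(points), len(points[0]))
```

Each returned point would then be passed to the decoder, repeating for up to maximum_number_of_sampling_steps until enough molecules are collected.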