gt4sd.algorithms.conditional_generation.guacamol.implementation package

GuacaMol algorithms implementation module.

Reference

class AaeIterator(resource_path, n_samples, n_batch, max_len)

Bases: object

__init__(resource_path, n_samples, n_batch, max_len)

Initialize AaeIterator.

Parameters

resource_path (str) – path to load the hypothesis, candidate labels and, optionally, the SMILES file.
n_samples (int) – number of samples to draw.
n_batch (int) – size of the batch.
max_len (int) – max length of SMILES.

generate_batch(target=None)

Generate a batch of molecules.

Parameters: target – condition used for generation.
Return type: List[Any]
Returns: the generated molecules.

class Generator

Bases: object

Abstract interface for a conditional generator.

generate_batch(target)

Generate a batch of molecules.

Parameters: target – condition used for generation.
Return type: List[Any]
Returns: the generated molecules.
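
The interface above can be illustrated with a minimal sketch. The `Generator` class below is a local stand-in that mirrors the documented signature rather than the class shipped in gt4sd, and raising `NotImplementedError` in the base method is an assumption. `FixedListIterator` is a hypothetical subclass invented for illustration only.

```python
from typing import Any, List


class Generator:
    """Local stand-in mirroring the documented abstract interface."""

    def generate_batch(self, target: Any) -> List[Any]:
        # Assumption: the base class leaves the method to subclasses.
        raise NotImplementedError("Implement generate_batch in a subclass.")


class FixedListIterator(Generator):
    """Hypothetical subclass that returns SMILES from a fixed list."""

    def __init__(self, smiles: List[str], batch_size: int) -> None:
        self.smiles = smiles
        self.batch_size = batch_size

    def generate_batch(self, target: Any) -> List[Any]:
        # A real iterator would condition on `target`; this toy one ignores it.
        return self.smiles[: self.batch_size]


iterator = FixedListIterator(["CCO", "c1ccccc1", "CC(=O)O"], batch_size=2)
batch = iterator.generate_batch(target=None)
```

Every iterator in this module that derives from `Generator` exposes the same `generate_batch(target)` entry point, so calling code can swap one for another without changes.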

class GraphGAIterator(resource_path, batch_size, population_size, offspring_size, n_jobs, mutation_rate, random_start, generations, patience)

Bases: Generator

__init__(resource_path, batch_size, population_size, offspring_size, n_jobs, mutation_rate, random_start, generations, patience)

Initialize GraphGAIterator.

Parameters

resource_path – path to load the hypothesis, candidate labels and, optionally, the SMILES file.
batch_size (int) – number of molecules to generate.
population_size (int) – used for the initial generation of SMILES within the population.
n_jobs (int) – number of concurrently running jobs.
random_start (bool) – set to True to randomly choose the list of SMILES used for generating optimized molecules.
offspring_size (int) – number of molecules to select for the new population.
mutation_rate (float) – frequency of new mutations in a single gene or organism over time.
generations (int) – number of evolutionary generations.
patience (int) – used for early stopping if the population scores remain the same after generating molecules.

generate_batch(target)

Generate a batch of molecules.

Parameters: target – condition used for generation.
Return type: List[Any]
Returns: the generated molecules.
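
The interplay between `generations` and `patience` described above can be sketched as a generic early-stopping loop. This is one plausible reading of the parameter descriptions, not the gt4sd implementation; `run_with_patience` and `toy_step` are hypothetical names.

```python
from typing import Callable, List, Optional


def run_with_patience(
    step: Callable[[int], List[float]], generations: int, patience: int
) -> int:
    """Run at most `generations` steps and return how many were executed.

    Stops early once the population scores have stayed identical for
    `patience` consecutive generations.
    """
    previous: Optional[List[float]] = None
    stale = 0
    executed = 0
    for generation in range(generations):
        scores = step(generation)
        executed += 1
        if scores == previous:
            stale += 1
            if stale >= patience:
                break
        else:
            stale = 0
        previous = scores
    return executed


def toy_step(generation: int) -> List[float]:
    # Toy population whose best score improves until generation 3, then stalls.
    return [float(min(generation, 3))]


executed = run_with_patience(toy_step, generations=20, patience=2)
```

With the toy scores above, the loop stops well before the 20-generation budget because the scores stop changing after generation 3.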

class GraphMCTSIterator(init_smiles, batch_size, population_size, max_children, n_jobs, num_sims, max_atoms, generations, patience)

Bases: Generator

__init__(init_smiles, batch_size, population_size, max_children, n_jobs, num_sims, max_atoms, generations, patience)

Initialize GraphMCTSIterator.

Parameters

init_smiles (str) – path from which to load the hypothesis, candidate labels and, optionally, the SMILES file.
batch_size (int) – number of molecules to generate.
population_size (int) – used for the initial generation of SMILES within the population.
max_children (int) – maximum number of children a node can have.
n_jobs (int) – number of concurrently running jobs.
num_sims (float) – number of times to traverse the tree.
max_atoms (int) – maximum number of atoms to explore before the node state is treated as terminal.
generations (int) – number of evolutionary generations.
patience (int) – used for early stopping if the population scores remain the same after generating molecules.

generate_batch(target)

Generate a batch of molecules.

Parameters: target – condition used for generation.
Return type: List[Any]
Returns: the generated molecules.

class OrganIterator(resource_path, n_samples, n_batch, max_len)

Bases: object

__init__(resource_path, n_samples, n_batch, max_len)

Initialize OrganIterator.

Parameters

resource_path (str) – path to load the hypothesis, candidate labels and, optionally, the SMILES file.
n_samples (int) – number of samples to draw.
n_batch (int) – size of the batch.
max_len (int) – max length of SMILES.

generate_batch(target=None)

Generate a batch of molecules.

Parameters: target – condition used for generation.
Return type: List[Any]
Returns: the generated molecules.

class SMILESGAIterator(resource_path, batch_size, population_size, n_mutations, n_jobs, random_start, gene_size, generations, patience)

Bases: Generator

__init__(resource_path, batch_size, population_size, n_mutations, n_jobs, random_start, gene_size, generations, patience)

Initialize SMILESGAIterator.

Parameters

resource_path – path to load the hypothesis, candidate labels and, optionally, the SMILES file.
batch_size (int) – number of molecules to generate.
population_size (int) – used with n_mutations for the initial generation of SMILES within the population.
n_mutations (int) – used with population_size for the initial generation of SMILES within the population.
n_jobs (int) – number of concurrently running jobs.
random_start (bool) – set to True to randomly choose the list of SMILES used for generating optimized molecules.
gene_size (int) – size of the gene used when creating genes.
generations (int) – number of evolutionary generations.
patience (int) – used for early stopping if the population scores remain the same after generating molecules.

generate_batch(target)

Generate a batch of molecules.

Parameters: target – condition used for generation.
Return type: List[Any]
Returns: the generated molecules.

class SMILESLSTMHCIterator(resource_path, batch_size, n_epochs, mols_to_sample, n_jobs, random_start, optimize_n_epochs, benchmark_num_samples, keep_top, max_len, optimize_batch_size)

Bases: Generator

__init__(resource_path, batch_size, n_epochs, mols_to_sample, n_jobs, random_start, optimize_n_epochs, benchmark_num_samples, keep_top, max_len, optimize_batch_size)

Initialize SMILESLSTMHCIterator.

Parameters

resource_path – path to load the hypothesis, candidate labels and, optionally, the SMILES file.
batch_size (int) – number of molecules to generate.
n_epochs (int) – number of epochs to sample.
mols_to_sample (int) – number of molecules sampled at each step.
keep_top (int) – number of molecules kept at each step.
optimize_n_epochs (int) – number of epochs for the optimization.
benchmark_num_samples (int) – number of molecules to generate from the final model for the benchmark.
random_start (bool) – set to True to randomly choose the list of SMILES used for generating optimized molecules.
n_jobs (int) – number of concurrently running jobs.
max_len (int) – maximum length of a SMILES string.
optimize_batch_size (int) – batch size for the optimization.

generate_batch(target)

Generate a batch of molecules.

Parameters: target – condition used for generation.
Return type: List[Any]
Returns: the generated molecules.
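
The roles of `mols_to_sample` and `keep_top` can be sketched as a single hill-climbing selection step: sample candidates, merge them with the current pool, and keep the best. This is a toy illustration under stated assumptions, not the gt4sd code. `hill_climb_step` is a hypothetical helper, the sampler draws from a fixed list instead of an LSTM, string length stands in for the scoring function, and the model fine-tuning that would normally follow each step is omitted.

```python
import random
from typing import Callable, List


def hill_climb_step(
    pool: List[str],
    sample: Callable[[], str],
    score: Callable[[str], float],
    mols_to_sample: int,
    keep_top: int,
) -> List[str]:
    """Sample `mols_to_sample` candidates and keep the `keep_top` best overall."""
    candidates = pool + [sample() for _ in range(mols_to_sample)]
    return sorted(candidates, key=score, reverse=True)[:keep_top]


rng = random.Random(0)  # fixed seed so the toy run is reproducible
vocabulary = ["C", "CC", "CCC", "CCCC"]

pool: List[str] = []
for _ in range(3):  # three toy epochs
    pool = hill_climb_step(
        pool,
        sample=lambda: rng.choice(vocabulary),
        score=len,  # toy score: longer strings rank higher
        mols_to_sample=8,
        keep_top=2,
    )
```

Because the previous pool is merged back in before selection, the best score in the pool never decreases from one step to the next.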

class SMILESLSTMPPOIterator(resource_path, batch_size, episode_size, num_epochs, optimize_batch_size, entropy_weight, kl_div_weight, clip_param)

Bases: Generator

__init__(resource_path, batch_size, episode_size, num_epochs, optimize_batch_size, entropy_weight, kl_div_weight, clip_param)

Initialize SMILESLSTMPPOIterator.

Parameters

resource_path – path to load the hypothesis, candidate labels and, optionally, the SMILES file.
batch_size (int) – number of molecules to generate.
episode_size (int) – number of molecules sampled by the policy at the start of a series of PPO updates.
num_epochs (int) – number of epochs to sample.
optimize_batch_size (int) – batch size for the optimization.
entropy_weight (int) – weight used when calculating the entropy loss.
kl_div_weight (int) – weight used when calculating the Kullback-Leibler divergence loss.
clip_param (float) – used for determining how far the new policy may move from the old one.

generate_batch(target)

Generate a batch of molecules.

Parameters: target – condition used for generation.
Return type: List[Any]
Returns: the generated molecules.
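
As background for `clip_param`, the standard PPO clipped surrogate objective limits how far the probability ratio between the new and old policy can move the update. The sketch below shows that textbook formula for a single sample; it assumes the iterator follows standard PPO, which the source does not confirm, and `clipped_surrogate` is a hypothetical helper name.

```python
def clipped_surrogate(ratio: float, advantage: float, clip_param: float) -> float:
    """Per-sample PPO objective: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    clipped_ratio = max(1.0 - clip_param, min(ratio, 1.0 + clip_param))
    return min(ratio * advantage, clipped_ratio * advantage)


# The ratio 1.5 exceeds 1 + clip_param, so the gain is capped at 1.2 * advantage.
capped = clipped_surrogate(ratio=1.5, advantage=2.0, clip_param=0.2)

# Inside the clipping range, the objective is simply ratio * advantage.
unclipped = clipped_surrogate(ratio=1.1, advantage=1.0, clip_param=0.2)
```

A smaller `clip_param` keeps each update closer to the old policy; a larger value allows bigger policy changes per update.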

class VaeIterator(resource_path, n_samples, n_batch, max_len)

Bases: object

__init__(resource_path, n_samples, n_batch, max_len)

Initialize VaeIterator.

Parameters

resource_path (str) – path to load the hypothesis, candidate labels and, optionally, the SMILES file.
n_samples (int) – number of samples to draw.
n_batch (int) – size of the batch.
max_len (int) – max length of SMILES.

generate_batch(target=None)

Generate a batch of molecules.

Parameters: target – condition used for generation.
Return type: List[Any]
Returns: the generated molecules.