gt4sd.algorithms.conditional_generation.guacamol.implementation package

GuacaMol algorithms implementation module.

Reference

class AaeIterator(resource_path, n_samples, n_batch, max_len)[source]

Bases: object

__init__(resource_path, n_samples, n_batch, max_len)[source]

Initialize AaeIterator.

Parameters
  • resource_path (str) – path from which to load the hypothesis, candidate labels and, optionally, the SMILES file.

  • n_samples (int) – number of samples to draw.

  • n_batch (int) – size of the batch.

  • max_len (int) – maximum length of a SMILES string.

generate_batch(target=None)[source]

Generate a batch of molecules.

Parameters

target – condition used for generation.

Return type

List[Any]

Returns

the generated molecules.
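
The relationship between n_samples and n_batch is not spelled out in this reference. One plausible reading, stated here purely as an assumption, is that n_samples molecules are drawn in chunks of at most n_batch. The helper below (batch_sizes is a hypothetical name, not part of gt4sd) sketches that arithmetic.

```python
from typing import List


def batch_sizes(n_samples: int, n_batch: int) -> List[int]:
    """Split n_samples into chunks of at most n_batch (assumed behaviour)."""
    if n_samples < 0 or n_batch <= 0:
        raise ValueError("n_samples must be >= 0 and n_batch must be > 0")
    full, rest = divmod(n_samples, n_batch)
    # `full` complete batches, plus one smaller batch for any remainder.
    return [n_batch] * full + ([rest] if rest else [])


sizes = batch_sizes(n_samples=10, n_batch=4)
print(sizes)  # [4, 4, 2]
```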

class Generator[source]

Bases: object

Abstract interface for a conditional generator.

generate_batch(target)[source]

Generate a batch of molecules.

Parameters

target – condition used for generation.

Return type

List[Any]

Returns

the generated molecules.
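
As an illustration of the interface, the sketch below defines a stand-in Generator with the documented generate_batch(target) method and a toy subclass. The stand-in is written locally rather than imported from gt4sd, the NotImplementedError in the base class is an assumption, and ConstantGenerator is a hypothetical name used only for this example.

```python
from typing import Any, List


class Generator:
    """Local stand-in mirroring the documented interface (not gt4sd's class)."""

    def generate_batch(self, target: Any) -> List[Any]:
        # Assumption: the abstract method is meant to be overridden.
        raise NotImplementedError


class ConstantGenerator(Generator):
    """Hypothetical toy subclass that ignores the target."""

    def __init__(self, smiles: str, batch_size: int) -> None:
        self.smiles = smiles
        self.batch_size = batch_size

    def generate_batch(self, target: Any) -> List[Any]:
        # A real generator would use `target` to steer generation.
        return [self.smiles] * self.batch_size


batch = ConstantGenerator("CCO", batch_size=3).generate_batch(target=None)
print(batch)  # ['CCO', 'CCO', 'CCO']
```

Any object exposing generate_batch(target) and returning a list can be used the same way, which is what the iterator classes below have in common.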

class GraphGAIterator(resource_path, batch_size, population_size, offspring_size, n_jobs, mutation_rate, random_start, generations, patience)[source]

Bases: Generator

__init__(resource_path, batch_size, population_size, offspring_size, n_jobs, mutation_rate, random_start, generations, patience)[source]

Initialize GraphGAIterator.

Parameters
  • resource_path – path from which to load the hypothesis, candidate labels and, optionally, the SMILES file.

  • batch_size (int) – number of molecules to generate.

  • population_size (int) – used for the initial generation of SMILES within the population.

  • n_jobs (int) – number of concurrently running jobs.

  • random_start (bool) – set to True to randomly choose the list of SMILES used for generating optimized molecules.

  • offspring_size (int) – number of molecules to select for the new population.

  • mutation_rate (float) – frequency of new mutations in a single gene or organism over time.

  • generations (int) – number of evolutionary generations.

  • patience (int) – used for early stopping if the population scores remain the same after generating molecules.

generate_batch(target)[source]

Generate a batch of molecules.

Parameters

target – condition used for generation.

Return type

List[Any]

Returns

the generated molecules.
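
To make the roles of population_size, offspring_size, mutation_rate, generations and patience concrete, here is a minimal sketch of a generic evolutionary loop with patience-based early stopping. It works on toy strings rather than molecular graphs, and evolve, count_n and point_mutation are hypothetical names. It is not the GuacaMol or gt4sd implementation, whose selection and stopping rules may differ.

```python
import random
from typing import Callable, List, Tuple


def evolve(
    population: List[str],
    score: Callable[[str], float],
    mutate: Callable[[str, random.Random], str],
    population_size: int,
    offspring_size: int,
    mutation_rate: float,
    generations: int,
    patience: int,
    seed: int = 0,
) -> Tuple[List[str], int]:
    """Return the final population (best first) and the generations run."""
    rng = random.Random(seed)

    def rank(s: str) -> Tuple[float, str]:
        # Best score first; ties broken alphabetically for determinism.
        return (-score(s), s)

    population = sorted(set(population), key=rank)[:population_size]
    best, stale, ran = score(population[0]), 0, 0
    for _ in range(generations):
        ran += 1
        parents = [rng.choice(population) for _ in range(offspring_size)]
        # Each offspring is mutated with probability `mutation_rate`.
        offspring = [
            mutate(p, rng) if rng.random() < mutation_rate else p for p in parents
        ]
        merged = set(population) | set(offspring)
        population = sorted(merged, key=rank)[:population_size]
        new_best = score(population[0])
        if new_best > best:
            best, stale = new_best, 0
        else:
            stale += 1
            if stale >= patience:  # no improvement for `patience` rounds
                break
    return population, ran


def count_n(smiles: str) -> float:
    """Toy objective: number of nitrogen symbols in the string."""
    return float(smiles.count("N"))


def point_mutation(smiles: str, rng: random.Random) -> str:
    """Toy mutation: overwrite one random position with 'N'."""
    i = rng.randrange(len(smiles))
    return smiles[:i] + "N" + smiles[i + 1:]


final, ran = evolve(["CCCC"], count_n, point_mutation,
                    population_size=4, offspring_size=8, mutation_rate=0.5,
                    generations=50, patience=3)
print(final[0], ran)
```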

class GraphMCTSIterator(init_smiles, batch_size, population_size, max_children, n_jobs, num_sims, max_atoms, generations, patience)[source]

Bases: Generator

__init__(init_smiles, batch_size, population_size, max_children, n_jobs, num_sims, max_atoms, generations, patience)[source]

Initialize GraphMCTSIterator.

Parameters
  • init_smiles (str) – path from which to load the hypothesis, candidate labels and, optionally, the SMILES file.

  • batch_size (int) – number of molecules to generate.

  • population_size (int) – used for the initial generation of SMILES within the population.

  • max_children (int) – maximum number of children a node can have.

  • n_jobs (int) – number of concurrently running jobs.

  • num_sims (float) – number of times to traverse the tree.

  • max_atoms (int) – maximum number of atoms to explore before terminating the node state.

  • generations (int) – number of evolutionary generations.

  • patience (int) – used for early stopping if the population scores remain the same after generating molecules.

generate_batch(target)[source]

Generate a batch of molecules.

Parameters

target – condition used for generation.

Return type

List[Any]

Returns

the generated molecules.

class OrganIterator(resource_path, n_samples, n_batch, max_len)[source]

Bases: object

__init__(resource_path, n_samples, n_batch, max_len)[source]

Initialize OrganIterator.

Parameters
  • resource_path (str) – path from which to load the hypothesis, candidate labels and, optionally, the SMILES file.

  • n_samples (int) – number of samples to draw.

  • n_batch (int) – size of the batch.

  • max_len (int) – maximum length of a SMILES string.

generate_batch(target=None)[source]

Generate a batch of molecules.

Parameters

target – condition used for generation.

Return type

List[Any]

Returns

the generated molecules.

class SMILESGAIterator(resource_path, batch_size, population_size, n_mutations, n_jobs, random_start, gene_size, generations, patience)[source]

Bases: Generator

__init__(resource_path, batch_size, population_size, n_mutations, n_jobs, random_start, gene_size, generations, patience)[source]

Initialize SMILESGAIterator.

Parameters
  • resource_path – path from which to load the hypothesis, candidate labels and, optionally, the SMILES file.

  • batch_size (int) – number of molecules to generate.

  • population_size (int) – used with n_mutations for the initial generation of SMILES within the population.

  • n_mutations (int) – used with population_size for the initial generation of SMILES within the population.

  • n_jobs (int) – number of concurrently running jobs.

  • random_start (bool) – set to True to randomly choose the list of SMILES used for generating optimized molecules.

  • gene_size (int) – size of the gene used when creating genes.

  • generations (int) – number of evolutionary generations.

  • patience (int) – used for early stopping if the population scores remain the same after generating molecules.

generate_batch(target)[source]

Generate a batch of molecules.

Parameters

target – condition used for generation.

Return type

List[Any]

Returns

the generated molecules.

class SMILESLSTMHCIterator(resource_path, batch_size, n_epochs, mols_to_sample, n_jobs, random_start, optimize_n_epochs, benchmark_num_samples, keep_top, max_len, optimize_batch_size)[source]

Bases: Generator

__init__(resource_path, batch_size, n_epochs, mols_to_sample, n_jobs, random_start, optimize_n_epochs, benchmark_num_samples, keep_top, max_len, optimize_batch_size)[source]

Initialize SMILESLSTMHCIterator.

Parameters
  • resource_path – path from which to load the hypothesis, candidate labels and, optionally, the SMILES file.

  • batch_size (int) – number of molecules to generate.

  • n_epochs (int) – number of epochs to sample.

  • mols_to_sample (int) – number of molecules sampled at each step.

  • keep_top (int) – number of molecules kept at each step.

  • optimize_n_epochs (int) – number of epochs for the optimization.

  • benchmark_num_samples (int) – number of molecules to generate from the final model for the benchmark.

  • random_start (bool) – set to True to randomly choose the list of SMILES used for generating optimized molecules.

  • n_jobs (int) – number of concurrently running jobs.

  • max_len (int) – maximum length of a SMILES string.

  • optimize_batch_size (int) – batch size for the optimization.

generate_batch(target)[source]

Generate a batch of molecules.

Parameters

target – condition used for generation.

Return type

List[Any]

Returns

the generated molecules.
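
The interplay of n_epochs, mols_to_sample, keep_top and max_len can be illustrated with a toy hill-climbing loop. The sketch below replaces the LSTM with a single probability p_n of emitting "N" instead of "C", so it only shows the sample, keep-the-best, refit cycle. The function hill_climb and the toy model are assumptions made for this example and do not reflect the actual gt4sd or GuacaMol code.

```python
import random
from typing import Callable, List, Tuple


def hill_climb(
    score: Callable[[str], float],
    n_epochs: int,
    mols_to_sample: int,
    keep_top: int,
    max_len: int,
    seed: int = 0,
) -> Tuple[float, List[str]]:
    """Return the final toy model parameter and the last kept samples."""
    rng = random.Random(seed)
    p_n = 0.5  # toy "model": probability of emitting "N" rather than "C"
    kept: List[str] = []
    for _ in range(n_epochs):
        # 1. Sample `mols_to_sample` strings of length `max_len` from the model.
        samples = [
            "".join("N" if rng.random() < p_n else "C" for _ in range(max_len))
            for _ in range(mols_to_sample)
        ]
        # 2. Keep only the `keep_top` best-scoring samples.
        kept = sorted(samples, key=score, reverse=True)[:keep_top]
        # 3. Refit the model on the kept samples (maximum-likelihood estimate).
        p_n = sum(s.count("N") for s in kept) / sum(len(s) for s in kept)
    return p_n, kept


p_final, top = hill_climb(lambda s: s.count("N"), n_epochs=5,
                          mols_to_sample=20, keep_top=4, max_len=8)
print(p_final, top)
```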

class SMILESLSTMPPOIterator(resource_path, batch_size, episode_size, num_epochs, optimize_batch_size, entropy_weight, kl_div_weight, clip_param)[source]

Bases: Generator

__init__(resource_path, batch_size, episode_size, num_epochs, optimize_batch_size, entropy_weight, kl_div_weight, clip_param)[source]

Initialize SMILESLSTMPPOIterator.

Parameters
  • resource_path – path from which to load the hypothesis, candidate labels and, optionally, the SMILES file.

  • batch_size (int) – number of molecules to generate.

  • episode_size (int) – number of molecules sampled by the policy at the start of a series of PPO updates.

  • num_epochs (int) – number of epochs to sample.

  • optimize_batch_size (int) – batch size for the optimization.

  • entropy_weight (int) – weight of the entropy loss term.

  • kl_div_weight (int) – weight of the Kullback-Leibler divergence loss term.

  • clip_param (float) – limits how far the new policy may move from the old one.

generate_batch(target)[source]

Generate a batch of molecules.

Parameters

target – condition used for generation.

Return type

List[Any]

Returns

the generated molecules.
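
For orientation, the sketch below shows how clip_param, entropy_weight and kl_div_weight typically enter a PPO-style loss. The clipped surrogate term follows the standard PPO formulation; the way the entropy and KL terms are combined, and the function names clipped_surrogate and ppo_loss, are assumptions for illustration and may not match the gt4sd or GuacaMol implementation.

```python
import math
from typing import Sequence


def clipped_surrogate(new_log_prob: float, old_log_prob: float,
                      advantage: float, clip_param: float) -> float:
    """Standard PPO clipped surrogate term for a single sample."""
    ratio = math.exp(new_log_prob - old_log_prob)  # pi_new / pi_old
    clipped = max(1.0 - clip_param, min(ratio, 1.0 + clip_param))
    # Taking the minimum removes any incentive to move beyond the clip range.
    return min(ratio * advantage, clipped * advantage)


def ppo_loss(new_log_probs: Sequence[float], old_log_probs: Sequence[float],
             advantages: Sequence[float], entropy: float, kl_div: float,
             clip_param: float, entropy_weight: float,
             kl_div_weight: float) -> float:
    """Toy scalar loss; the weighting scheme here is an assumption."""
    terms = [
        clipped_surrogate(n, o, a, clip_param)
        for n, o, a in zip(new_log_probs, old_log_probs, advantages)
    ]
    surrogate = sum(terms) / len(terms)
    # Maximize the surrogate and the entropy, penalize the KL divergence.
    return -surrogate - entropy_weight * entropy + kl_div_weight * kl_div


loss = ppo_loss([0.0, math.log(2.0)], [0.0, 0.0], [1.0, 1.0],
                entropy=0.5, kl_div=0.1, clip_param=0.2,
                entropy_weight=1.0, kl_div_weight=10.0)
print(loss)
```

With clip_param=0.2, a probability ratio of 2 and a positive advantage contribute as if the ratio were 1.2, which is how the clip keeps each update close to the old policy.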

class VaeIterator(resource_path, n_samples, n_batch, max_len)[source]

Bases: object

__init__(resource_path, n_samples, n_batch, max_len)[source]

Initialize VaeIterator.

Parameters
  • resource_path (str) – path from which to load the hypothesis, candidate labels and, optionally, the SMILES file.

  • n_samples (int) – number of samples to draw.

  • n_batch (int) – size of the batch.

  • max_len (int) – maximum length of a SMILES string.

generate_batch(target=None)[source]

Generate a batch of molecules.

Parameters

target – condition used for generation.

Return type

List[Any]

Returns

the generated molecules.