gt4sd.algorithms.conditional_generation.paccmann_rl.implementation module¶
Implementation of PaccMann^RL conditional generators.
Summary¶
Classes:
ConditionalGenerator | Abstract interface for a conditional generator.
ProteinSequenceConditionalGenerator | Protein conditional generator as implemented in https://doi.org/10.1088/2632-2153/abe808 (originally https://arxiv.org/abs/2005.13285).
TranscriptomicConditionalGenerator | Transcriptomic conditional generator as implemented in https://doi.org/10.1016/j.isci.2021.102269 (originally https://doi.org/10.1007/978-3-030-45257-5_18, https://arxiv.org/abs/1909.05114).
Reference¶
- class ConditionalGenerator[source]¶
Bases:
ABC
Abstract interface for a conditional generator.
- device: device¶
device where the inference is running.
- temperature: float¶
temperature for the sampling.
- generated_length: int¶
maximum length of the generated molecules.
- selfies_conditional_generator_params: dict¶
parameters for the SELFIES generator.
- selfies_conditional_generator: TeacherVAE¶
SELFIES generator.
- smiles_language: SMILESLanguage¶
SMILES language instance.
- generator_latent_size: int¶
- encoder_latent_size: int¶
- get_smiles_from_latent(latent)[source]¶
Take samples from the latent space.
- Parameters
latent (Tensor) – latent vector tensor.
- Return type
List[str]
- Returns
SMILES list and indexes for the valid ones.
- __abstractmethods__ = frozenset({'get_latent'})¶
- __annotations__ = {'device': <class 'torch.device'>, 'encoder_latent_size': <class 'int'>, 'generated_length': <class 'int'>, 'generator_latent_size': <class 'int'>, 'selfies_conditional_generator': <class 'paccmann_chemistry.models.vae.TeacherVAE'>, 'selfies_conditional_generator_params': <class 'dict'>, 'smiles_language': <class 'pytoda.smiles.smiles_language.SMILESLanguage'>, 'temperature': <class 'float'>}¶
- __dict__ = mappingproxy({'__module__': 'gt4sd.algorithms.conditional_generation.paccmann_rl.implementation', '__annotations__': {'device': <class 'torch.device'>, 'temperature': <class 'float'>, 'generated_length': <class 'int'>, 'selfies_conditional_generator_params': <class 'dict'>, 'selfies_conditional_generator': <class 'paccmann_chemistry.models.vae.TeacherVAE'>, 'smiles_language': <class 'pytoda.smiles.smiles_language.SMILESLanguage'>, 'generator_latent_size': <class 'int'>, 'encoder_latent_size': <class 'int'>}, '__doc__': 'Abstract interface for a conditional generator.', 'get_smiles_from_latent': <function ConditionalGenerator.get_smiles_from_latent>, 'validate_molecules': <staticmethod(<function ConditionalGenerator.validate_molecules>)>, 'get_latent': <function ConditionalGenerator.get_latent>, 'generate_batch': <function ConditionalGenerator.generate_batch>, '__dict__': <attribute '__dict__' of 'ConditionalGenerator' objects>, '__weakref__': <attribute '__weakref__' of 'ConditionalGenerator' objects>, '__abstractmethods__': frozenset({'get_latent'}), '_abc_impl': <_abc._abc_data object>})¶
- __doc__ = 'Abstract interface for a conditional generator.'¶
- __module__ = 'gt4sd.algorithms.conditional_generation.paccmann_rl.implementation'¶
- __weakref__¶
list of weak references to the object (if defined)
- _abc_impl = <_abc._abc_data object>¶
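The abstract-interface pattern described above can be sketched with a toy class. This is a minimal illustration, not the gt4sd code: `ToyConditionalGenerator`, `LengthConditionedGenerator`, and the placeholder decoder are hypothetical names invented for this example. Only the idea that `get_latent` is the single abstract method, while `get_smiles_from_latent` is shared by all subclasses, is taken from the reference above.

```python
from abc import ABC, abstractmethod
from typing import List


class ToyConditionalGenerator(ABC):
    """Hypothetical stand-in for the abstract interface; not the gt4sd class."""

    @abstractmethod
    def get_latent(self, condition: str) -> List[float]:
        """Map a condition to a latent vector (subclasses must implement)."""

    def get_smiles_from_latent(self, latent: List[float]) -> List[str]:
        # Placeholder decoder (assumption): a real implementation would run
        # a trained generative model instead of repeating carbon atoms.
        return ["C" * (1 + int(abs(value))) for value in latent]


class LengthConditionedGenerator(ToyConditionalGenerator):
    """Toy subclass: the latent simply encodes the condition length."""

    def get_latent(self, condition: str) -> List[float]:
        return [float(len(condition))]


# The abstract base cannot be instantiated because get_latent is abstract.
try:
    ToyConditionalGenerator()
    base_is_instantiable = True
except TypeError:
    base_is_instantiable = False

generator = LengthConditionedGenerator()
smiles = generator.get_smiles_from_latent(generator.get_latent("MK"))
```

A concrete subclass only has to supply `get_latent`; decoding stays in one place in the base class.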
- class ProteinSequenceConditionalGenerator(resources_path, temperature=1.4, generated_length=100, samples_per_protein=100, device=None)[source]¶
Bases:
ConditionalGenerator
Protein conditional generator as implemented in https://doi.org/10.1088/2632-2153/abe808 (originally https://arxiv.org/abs/2005.13285). It generates ligands with high binding affinity and low toxicity.
- samples_per_protein¶
number of points sampled per protein. It has to be greater than 1.
- protein_embedding_encoder_params¶
parameters for the protein embedding encoder.
- protein_embedding_encoder¶
protein embedding encoder.
- __init__(resources_path, temperature=1.4, generated_length=100, samples_per_protein=100, device=None)[source]¶
Initialize the generator.
- Parameters
resources_path (str) – directory where to find models and parameters.
temperature (float) – temperature for the sampling. Defaults to 1.4.
generated_length (int) – maximum length of the generated molecules. Defaults to 100.
samples_per_protein (int) – number of points sampled per protein. It has to be greater than 1. Defaults to 100.
device (Union[device, str, None]) – device where the inference is running, given either as a dedicated class or as a string. If not provided, it is inferred.
- get_latent(protein)[source]¶
Given a protein, generate the latent representation.
- Parameters
protein (str) – the protein used as context/condition.
- Return type
Tensor
- Returns
the latent representation for the given context. It contains self.samples_per_protein repeats.
- __abstractmethods__ = frozenset({})¶
- __annotations__ = {'device': 'torch.device', 'encoder_latent_size': 'int', 'generated_length': 'int', 'generator_latent_size': 'int', 'selfies_conditional_generator': 'TeacherVAE', 'selfies_conditional_generator_params': 'dict', 'smiles_language': 'SMILESLanguage', 'temperature': 'float'}¶
- __doc__ = '\n Protein conditional generator as implemented in https://doi.org/10.1088/2632-2153/abe808\n (originally https://arxiv.org/abs/2005.13285).\n It generates highly binding and low toxic ligands.\n\n Attributes:\n samples_per_protein: number of points sampled per protein.\n It has to be greater than 1.\n protein_embedding_encoder_params: parameter for the protein embedding encoder.\n protein_embedding_encoder: protein embedding encoder.\n '¶
- __module__ = 'gt4sd.algorithms.conditional_generation.paccmann_rl.implementation'¶
- _abc_impl = <_abc._abc_data object>¶
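The reference above does not spell out how `temperature` enters the sampling. A minimal sketch, assuming the common convention of dividing logits by the temperature before a softmax, is shown below. The function names and the example logits are hypothetical and are not part of the gt4sd API.

```python
import math
import random
from typing import List, Sequence


def softmax_with_temperature(
    logits: Sequence[float], temperature: float
) -> List[float]:
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    shift = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - shift) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def sample_token(
    logits: Sequence[float], temperature: float, rng: random.Random
) -> int:
    """Draw one token index from the temperature-scaled distribution."""
    probabilities = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three tokens
sharp = softmax_with_temperature(logits, 1.0)
flat = softmax_with_temperature(logits, 1.4)  # the documented default
token = sample_token(logits, 1.4, random.Random(0))
```

Under this convention, a temperature above 1 flattens the distribution, so less likely tokens are sampled more often and the generated molecules become more diverse.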
- class TranscriptomicConditionalGenerator(resources_path, temperature=1.4, generated_length=100, samples_per_profile=100, device=None)[source]¶
Bases:
ConditionalGenerator
Transcriptomic conditional generator as implemented in https://doi.org/10.1016/j.isci.2021.102269 (originally https://doi.org/10.1007/978-3-030-45257-5_18, https://arxiv.org/abs/1909.05114). It generates small molecules that are highly effective against transcriptomic profiles.
- samples_per_profile¶
number of points sampled per profile. It has to be greater than 1.
- transcriptomic_encoder_params¶
parameters for the transcriptomic encoder.
- transcriptomic_encoder¶
transcriptomic encoder.
- __init__(resources_path, temperature=1.4, generated_length=100, samples_per_profile=100, device=None)[source]¶
Initialize the generator.
- Parameters
resources_path (str) – directory where to find models and parameters.
temperature (float) – temperature for the sampling. Defaults to 1.4.
generated_length (int) – maximum length of the generated molecules. Defaults to 100.
samples_per_profile (int) – number of points sampled per profile. It has to be greater than 1. Defaults to 100.
device (Union[device, str, None]) – device where the inference is running, given either as a dedicated class or as a string. If not provided, it is inferred.
- get_latent(profile)[source]¶
Given a profile, generate the latent representation.
- Parameters
profile (Union[ndarray, Series, str]) – the profile used as context/condition.
- Raises
ValueError – in case the profile has a size mismatch with the genes panel.
- Return type
Tensor
- Returns
the latent representation for the given context. It contains self.samples_per_profile repeats.
- __abstractmethods__ = frozenset({})¶
- __annotations__ = {'device': 'torch.device', 'encoder_latent_size': 'int', 'generated_length': 'int', 'generator_latent_size': 'int', 'selfies_conditional_generator': 'TeacherVAE', 'selfies_conditional_generator_params': 'dict', 'smiles_language': 'SMILESLanguage', 'temperature': 'float'}¶
- __doc__ = '\n Transcriptomic conditional generator as implemented in https://doi.org/10.1016/j.isci.2021.102269\n (originally https://doi.org/10.1007/978-3-030-45257-5_18, https://arxiv.org/abs/1909.05114).\n It generates highly effective small molecules against transcriptomic progiles.\n\n Attributes:\n samples_per_profile: number of points sampled per profile.\n It has to be greater than 1.\n transcriptomic_encoder_params: parameter for the protein embedding encoder.\n transcriptomic_encoder: protein embedding encoder.\n '¶
- __module__ = 'gt4sd.algorithms.conditional_generation.paccmann_rl.implementation'¶
- _abc_impl = <_abc._abc_data object>¶
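TranscriptomicConditionalGenerator.get_latent is documented to raise ValueError when the profile size does not match the genes panel. The check can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's implementation: `validate_profile`, `GENE_PANEL`, and the error message are hypothetical.

```python
from typing import List, Sequence


def validate_profile(profile: Sequence[float], genes: Sequence[str]) -> List[float]:
    """Check that a profile has exactly one value per gene in the panel."""
    if len(profile) != len(genes):
        raise ValueError(
            f"profile has {len(profile)} values "
            f"but the gene panel has {len(genes)} genes"
        )
    return [float(value) for value in profile]


GENE_PANEL = ["TP53", "BRCA1", "EGFR"]  # hypothetical three-gene panel

# A profile with one expression value per gene passes the check.
valid = validate_profile([0.1, 2.3, 1.7], GENE_PANEL)

# A profile with a missing value triggers the size-mismatch error.
try:
    validate_profile([0.1, 2.3], GENE_PANEL)
    mismatch_message = ""
except ValueError as error:
    mismatch_message = str(error)
```

Failing early on a size mismatch avoids feeding the encoder a vector whose positions no longer correspond to the expected genes.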