gt4sd.algorithms.conditional_generation.reinvent.reinvent_core.core module

MolecularAI implementation of sample generation, scaffold randomization, and retrieval of unique sampled sequences.

This file is adapted, with only minimal changes, from https://raw.githubusercontent.com/MolecularAI/Reinvent/982b26dd6cfeb8aa84b6d7e4a8c2a7edde2bad36/running_modes/lib_invent/rl_actions/sample_model.py. See README.md.

Summary

Classes:

ReinventBase

SampledSequencesDTO

Reference

class SampledSequencesDTO(scaffold, decoration, nll)[source]

Bases: object

scaffold: str
decoration: str
nll: float
__dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)

Dataclass parameters. The three fields (scaffold: str, decoration: str, nll: float) are all required constructor arguments with no default values.
__eq__(other)

Return self==value.

__hash__ = None
__init__(scaffold, decoration, nll)
__match_args__ = ('scaffold', 'decoration', 'nll')
__module__ = 'gt4sd.algorithms.conditional_generation.reinvent.reinvent_core.core'
__repr__()

Return repr(self).

__weakref__

list of weak references to the object (if defined)
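
As a rough illustration of the members listed above, the following sketch defines a local stand-in dataclass with the same three fields. It is not imported from gt4sd, and the SMILES strings and NLL value are made up. It shows the generated equality, the missing hash, and the field order used by the constructor.

```python
from dataclasses import dataclass, fields


@dataclass
class SampledSequencesDTO:
    """Local stand-in mirroring the documented fields; not the gt4sd class itself."""

    scaffold: str
    decoration: str
    nll: float


# Illustrative values only: a scaffold with an attachment point, a decoration, an NLL.
a = SampledSequencesDTO("[*]c1ccccc1", "C", 12.5)
b = SampledSequencesDTO(scaffold="[*]c1ccccc1", decoration="C", nll=12.5)

# eq=True: instances compare field by field.
print(a == b)

# eq=True without frozen=True sets __hash__ to None, so instances are unhashable.
try:
    hash(a)
except TypeError as err:
    print("unhashable:", err)

# Field order matches the constructor signature.
print([f.name for f in fields(SampledSequencesDTO)])
```

Because instances are unhashable, they cannot be placed in a set or used as dictionary keys directly; a tuple of their fields can serve as a key instead.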

class ReinventBase(model, batch_size, logger=None, randomize=False, sample_uniquely=True)[source]

Bases: object

__init__(model, batch_size, logger=None, randomize=False, sample_uniquely=True)[source]

Creates an instance of ReinventBase (the docstring inherited from the original source calls it SampleModel).

Parameters

model – A model instance (preferably in scaffold_decorating mode).

batch_size – Batch size to use.

get_dataloader(scaffold_list)[source]

Get a dataloader for the list of scaffolds to use with reinvent. NOTE: This method was factored out of the run method from the original source.

Parameters

scaffold_list – A list of scaffold SMILES.

Returns

An instance of a torch dataloader.

Return type

DataLoader

run(scaffold_list)[source]

Samples the model for the given number of SMILES. NOTE: This method was slightly adapted from the original source.

Parameters

scaffold_list – A list of scaffold SMILES.

Returns

A list of SampledSequencesDTO.

Return type

List[SampledSequencesDTO]
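
A minimal sketch of post-processing the list returned by run, assuming that nll is a negative log-likelihood, so that lower values mean the model considers the decoration more likely. The rank_by_likelihood helper and the stand-in dataclass are hypothetical and not part of gt4sd, and the records are made up.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class SampledSequencesDTO:
    """Local stand-in with the documented fields; not imported from gt4sd."""

    scaffold: str
    decoration: str
    nll: float


def rank_by_likelihood(samples: List[SampledSequencesDTO]) -> List[SampledSequencesDTO]:
    """Hypothetical helper: order samples from most to least likely (ascending NLL)."""
    return sorted(samples, key=lambda s: s.nll)


# Made-up records standing in for the output of run(scaffold_list).
samples = [
    SampledSequencesDTO("[*]c1ccccc1", "CC", 9.0),
    SampledSequencesDTO("[*]c1ccccc1", "C", 4.0),
    SampledSequencesDTO("[*]C1CCCCC1", "O", 6.5),
]

for s in rank_by_likelihood(samples):
    # exp(-nll) recovers the likelihood under the NLL assumption stated above.
    print(s.scaffold, s.decoration, f"nll={s.nll:.1f}", f"p={math.exp(-s.nll):.2e}")
```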

_sample_unique_sequences(sampled_sequences)[source]

Return type

List[SampledSequencesDTO]
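
The source gives no description for this method. One plausible reading, offered here only as an assumption, is that it drops repeated scaffold/decoration pairs and keeps the first occurrence of each. The unique_by_pair function and the stand-in dataclass below are hypothetical and are not the gt4sd implementation.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class SampledSequencesDTO:
    """Local stand-in with the documented fields; not imported from gt4sd."""

    scaffold: str
    decoration: str
    nll: float


def unique_by_pair(samples: List[SampledSequencesDTO]) -> List[SampledSequencesDTO]:
    """Hypothetical helper: keep the first sample for each (scaffold, decoration) pair."""
    seen: Set[Tuple[str, str]] = set()
    unique: List[SampledSequencesDTO] = []
    for sample in samples:
        # The DTO itself is unhashable, so a tuple of its string fields is the key.
        key = (sample.scaffold, sample.decoration)
        if key not in seen:
            seen.add(key)
            unique.append(sample)
    return unique


# Made-up records; the second one repeats the first pair with a different NLL.
samples = [
    SampledSequencesDTO("[*]c1ccccc1", "C", 4.0),
    SampledSequencesDTO("[*]c1ccccc1", "C", 4.2),
    SampledSequencesDTO("[*]c1ccccc1", "CC", 9.0),
]

for sample in unique_by_pair(samples):
    print(sample)
```

Keeping the first occurrence preserves the sampling order; a different policy, such as keeping the lowest NLL per pair, would need an explicit comparison.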

_randomize_scaffolds(scaffolds)[source]
__annotations__ = {}
__doc__ = None
__module__ = 'gt4sd.algorithms.conditional_generation.reinvent.reinvent_core.core'
__weakref__

list of weak references to the object (if defined)