gt4sd.algorithms.generation.moler.core module
MoLeR Algorithm.
MoLeR generation algorithm.
Summary
Classes:
MoLeR | MoLeR Algorithm.
MoLeRDefaultGenerator | Configuration to generate compounds using default parameters of MoLeR.
Reference
- class MoLeR(configuration, target=None)
Bases: GeneratorAlgorithm[S, None]
MoLeR Algorithm.
- __init__(configuration, target=None)
Instantiate MoLeR ready to generate items.
- Parameters
configuration (AlgorithmConfiguration[~S, None]) – domain and application specification defining parameters, types and validations.
target (None) – a target for which to generate items.
Example
An example for generating small molecules (SMILES) with the default configuration:
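    from gt4sd.algorithms.generation.moler.core import MoLeR, MoLeRDefaultGenerator

    configuration = MoLeRDefaultGenerator()
    # target is unused by MoLeR (see get_generator), so None is passed.
    algorithm = MoLeR(configuration=configuration, target=None)
    items = list(algorithm.sample(10))
    print(items)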
- get_generator(configuration, target)
Get the function to sample batches via the MoLeRGenerator.
- Parameters
configuration (AlgorithmConfiguration[~S, None]) – helps to set up the application.
target (None) – context or condition for the generation. Unused in the algorithm.
- Return type
Callable[[], Iterable[Any]]
- Returns
callable generating a batch of items.
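The return type above, a zero-argument callable that yields an iterable of items, suggests a batch-oriented sampling loop. The following is a minimal sketch of how such a callable could be consumed. The helper `take_samples` and the stand-in `fake_batch` are hypothetical names introduced for illustration; neither is part of gt4sd, and the library's own `sample` method may work differently.

```python
from itertools import islice
from typing import Any, Callable, Iterable, Iterator


def take_samples(
    generate: Callable[[], Iterable[Any]], number_of_items: int
) -> Iterator[Any]:
    """Yield up to number_of_items items by calling a batch generator repeatedly.

    Hypothetical helper for illustration; not part of gt4sd.
    """

    def stream() -> Iterator[Any]:
        while True:
            batch = list(generate())
            if not batch:  # stop if a call returns no items
                return
            yield from batch

    return islice(stream(), number_of_items)


def fake_batch() -> Iterable[str]:
    """Stand-in for the callable returned by get_generator (assumption)."""
    return ["CCO", "c1ccccc1", "CC(=O)O"]


# Five items are requested, so the three-item batch is drawn twice
# and the second batch is cut short.
items = list(take_samples(fake_batch, 5))
print(items)
```

Wrapping the batches in a lazy stream keeps the caller in control of how many items are drawn, regardless of the batch size the generator happens to use.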
- validate_configuration(configuration)
Overload to validate a configuration for the algorithm.
- Parameters
configuration (AlgorithmConfiguration[~S, None]) – the algorithm configuration.
- Raises
InvalidAlgorithmConfiguration – in case the configuration for the algorithm is invalid.
- Return type
AlgorithmConfiguration[~S, None]
- Returns
the validated configuration.
- __abstractmethods__ = frozenset({})
- __annotations__ = {'generate': 'Untargeted', 'generator': 'Union[Untargeted, Targeted[T]]', 'max_runtime': 'int', 'max_samples': 'int', 'target': 'Optional[T]'}
- __doc__ = 'MoLeR Algorithm.'
- __module__ = 'gt4sd.algorithms.generation.moler.core'
- __orig_bases__ = (gt4sd.algorithms.core.GeneratorAlgorithm[~S, NoneType],)
- __parameters__ = (~S,)
- _abc_impl = <_abc._abc_data object>
- class MoLeRDefaultGenerator(*args, **kwargs)
Bases: MoLeRDefaultGenerator, Generic[T]
Configuration to generate compounds using default parameters of MoLeR.
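- algorithm_type: ClassVar[str] = 'generation'
General type of generative algorithm.
- domain: ClassVar[str] = 'materials'
General application domain. Hints at input/output types.
- algorithm_version: str = 'v0'
To differentiate between different versions of an application.
There is no imposed naming convention.
- scaffolds: str = ''
Scaffolds as '.'-separated SMILES. If empty, no scaffolds are used.
- num_samples: int = 32
Number of molecules to sample per call.
- beam_size: int = 1
Beam size to use during decoding.
- seed: int = 0
Seed used for random number generation.
- num_workers: int = 6
Number of workers used for generation.
- seed_smiles: str = ''
Dot-separated SMILES used to initialize the encoder. If empty, random codes are used.
- sigma: float = 0.0
Variance of Gaussian noise being added to latent code.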
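The defaults listed above can be mirrored in a plain dataclass to show how the dot-separated `scaffolds` and `seed_smiles` strings might be split into individual SMILES. This is a sketch only: `MoLeRConfigSketch` and `split_smiles` are hypothetical names that do not exist in gt4sd, and the real configuration class may parse these fields differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MoLeRConfigSketch:
    """Stand-in mirroring the documented defaults; not the gt4sd class."""

    algorithm_version: str = "v0"
    scaffolds: str = ""
    num_samples: int = 32
    beam_size: int = 1
    seed: int = 0
    num_workers: int = 6
    seed_smiles: str = ""
    sigma: float = 0.0


def split_smiles(value: str) -> List[str]:
    """Split a '.'-separated SMILES string; an empty string yields an empty list."""
    return [part for part in value.split(".") if part]


# Override two fields and leave the rest at their documented defaults.
config = MoLeRConfigSketch(scaffolds="c1ccccc1.C1CCCCC1", num_samples=8)
print(split_smiles(config.scaffolds))    # two scaffolds
print(split_smiles(config.seed_smiles))  # empty list: no seed SMILES provided
```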
- get_target_description()
Get description of the target for generation.
- Return type
Optional[Dict[str, str]]
- Returns
target description, returns None in case no target is used.
- get_conditional_generator(resources_path)
Instantiate the actual generator implementation.
- Parameters
resources_path (str) – local path to model files.
- Return type
- Returns
instance with generate for generation.
- validate_item(item)
Check that item is a valid SMILES.
- Parameters
item (str) – a generated item that is possibly not valid.
- Raises
InvalidItem – in case the item cannot be validated.
- Return type
str
- Returns
the validated SMILES.
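The validate-or-raise contract described above can be illustrated with a small stand-in. The sketch below assumes a hypothetical `InvalidItemSketch` exception and takes the SMILES parser as a parameter (for example RDKit's `Chem.MolFromSmiles`, which returns `None` for unparsable input). It is not the gt4sd implementation, and `toy_parser` only checks parentheses so that the example runs without a chemistry toolkit.

```python
from typing import Callable, Optional


class InvalidItemSketch(ValueError):
    """Hypothetical stand-in for the InvalidItem exception."""


def validate_smiles(item: str, parse: Callable[[str], Optional[object]]) -> str:
    """Return item unchanged if parse accepts it, otherwise raise."""
    if not item or parse(item) is None:
        raise InvalidItemSketch(f"not a valid SMILES: {item!r}")
    return item


def toy_parser(smiles: str) -> Optional[object]:
    """Toy parser for the demo: only checks that parentheses are balanced.

    A real check would use a cheminformatics toolkit instead.
    """
    depth = 0
    for char in smiles:
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                return None
    return smiles if depth == 0 else None


print(validate_smiles("CC(=O)O", toy_parser))
try:
    validate_smiles("CC(=O", toy_parser)
except InvalidItemSketch as error:
    print(error)
```

Passing the parser in as an argument keeps the validation logic independent of any particular toolkit, which also makes it straightforward to test.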
- __annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': typing.ClassVar[str], 'algorithm_version': <class 'str'>, 'beam_size': <class 'int'>, 'domain': typing.ClassVar[str], 'num_samples': <class 'int'>, 'num_workers': <class 'int'>, 'scaffolds': <class 'str'>, 'seed': <class 'int'>, 'seed_smiles': <class 'str'>, 'sigma': <class 'float'>}
- __dataclass_fields__
Dataclass fields (name: type = default, followed by the description where one is set):
algorithm_application: ClassVar[str] = 'MoLeRDefaultGenerator'
algorithm_name: ClassVar[str] = 'MoLeR'
algorithm_type: ClassVar[str] = 'generation'
algorithm_version: str = 'v0'
beam_size: int = 1 – Beam size to use during decoding.
domain: ClassVar[str] = 'materials'
num_samples: int = 32 – Number of molecules to sample per call.
num_workers: int = 6 – Number of workers used for generation.
scaffolds: str = '' – Scaffolds as '.'-separated SMILES. If empty, no scaffolds are used.
seed: int = 0 – Seed used for random number generation.
seed_smiles: str = '' – Dot-separated SMILES used to initialize the encoder. If empty, random codes are used.
sigma: float = 0.0 – Variance of Gaussian noise being added to latent code.
- __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)
- __doc__ = 'Configuration to generate compounds using default parameters of MoLeR.'
- __eq__(other)
Return self==value.
- __hash__ = None
- __init__(*args, **kwargs)
- __match_args__ = ('algorithm_version', 'scaffolds', 'num_samples', 'beam_size', 'seed', 'num_workers', 'seed_smiles', 'sigma')
- __module__ = 'gt4sd.algorithms.generation.moler.core'
- __orig_bases__ = (<class 'types.MoLeRDefaultGenerator'>, typing.Generic[~T])
- __parameters__ = (~T,)
- __pydantic_complete__ = True
- __pydantic_config__ = {}
- __pydantic_decorators__ = DecoratorInfos(validators={}, field_validators={}, root_validators={}, field_serializers={}, model_serializers={}, model_validators={}, computed_fields={})
- __pydantic_fields__ = {'algorithm_version': FieldInfo(annotation=str, required=False, default='v0', init=True, init_var=False, kw_only=False), 'beam_size': FieldInfo(annotation=int, required=False, default=1, description='Beam size to use during decoding.', init=True, init_var=False, kw_only=False), 'num_samples': FieldInfo(annotation=int, required=False, default=32, description='Number of molecules to sample per call.', init=True, init_var=False, kw_only=False), 'num_workers': FieldInfo(annotation=int, required=False, default=6, description='Number of workers used for generation.', init=True, init_var=False, kw_only=False), 'scaffolds': FieldInfo(annotation=str, required=False, default='', description="Scaffolds as '.'-separated SMILES. If empty, no scaffolds are used.", init=True, init_var=False, kw_only=False), 'seed': FieldInfo(annotation=int, required=False, default=0, description='Seed used for random number generation.', init=True, init_var=False, kw_only=False), 'seed_smiles': FieldInfo(annotation=str, required=False, default='', description='Dot-separated SMILES used to initialize the encoder. If empty, random codes are used.', init=True, init_var=False, kw_only=False), 'sigma': FieldInfo(annotation=float, required=False, default=0.0, description='Variance of Gaussian noise being added to latent code.', init=True, init_var=False, kw_only=False)}
- __repr__()
Return repr(self).
- __signature__ = <Signature (algorithm_version: str = 'v0', scaffolds: str = '', num_samples: int = 32, beam_size: int = 1, seed: int = 0, num_workers: int = 6, seed_smiles: str = '', sigma: float = 0.0) -> None>
- __wrapped__
alias of MoLeRDefaultGenerator