gt4sd.algorithms.conditional_generation.regression_transformer.core module¶
RegressionTransformer algorithm.
RegressionTransformer is a multitask regression and conditional generation model.
Summary¶
Classes:
RegressionTransformer – RegressionTransformer algorithm.
RegressionTransformerMolecules – Configuration to generate molecules given a continuous property target and a molecular sub-structure.
RegressionTransformerProteins – Configuration to generate proteins given a continuous property target and a partial amino-acid sequence.
Reference¶
- class RegressionTransformer(configuration, target)[source]¶
Bases: GeneratorAlgorithm[S, T]
RegressionTransformer Algorithm.
- max_samples: int = 50¶
The maximum number of samples a user can try to run in one go
- __init__(configuration, target)[source]¶
Instantiate Regression Transformer ready to generate items.
- Parameters
configuration (AlgorithmConfiguration[~S, ~T]) – domain and application specification defining parameters, types and validations.
target (Optional[~T]) – a target for which to generate items.
Example
An example for generating protein sequences with high stability:
config = RegressionTransformerProteins(
    search='sample', temperature=2.0, tolerance=10
)
target = "<stab>0.393|GSQEVNSGT[MASK][MASK][MASK]YKNASPEEAE[MASK][MASK]IARKAGATTWTEKGNKWEIRI"
stability_generator = RegressionTransformer(configuration=config, target=target)
items = list(stability_generator.sample(10))
print(items)
- get_generator(configuration, target)[source]¶
Get the function to sample with the given configuration.
- Parameters
configuration (AlgorithmConfiguration[~S, ~T]) – helps to set up the specific application of the Regression Transformer.
target (Optional[~T]) – context or condition for the generation.
- Return type
Callable[[~T], Iterable[Any]]
- Returns
callable with target generating a batch of items.
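The sketch below is not part of the original docstring; it shows how the returned callable might be used directly, reusing the configuration and target from the class-level example above (in normal use, sample() invokes this callable internally).
from gt4sd.algorithms.conditional_generation.regression_transformer.core import (
    RegressionTransformer, RegressionTransformerProteins
)

# Minimal sketch (assumption: the returned callable is invoked with the target,
# as indicated by the Callable[[~T], Iterable[Any]] return type documented above).
config = RegressionTransformerProteins(search='sample', temperature=2.0, tolerance=10)
target = "<stab>0.393|GSQEVNSGT[MASK][MASK][MASK]YKNASPEEAE[MASK][MASK]IARKAGATTWTEKGNKWEIRI"
algorithm = RegressionTransformer(configuration=config, target=target)
generate = algorithm.get_generator(configuration=config, target=target)
batch = list(generate(target))  # one batch of generated items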
- validate_configuration(configuration)[source]¶
Overload to validate a configuration for the algorithm.
- Parameters
configuration (AlgorithmConfiguration[~S, ~T]) – the algorithm configuration.
- Raises
InvalidAlgorithmConfiguration – in case the configuration for the algorithm is invalid.
- Return type
AlgorithmConfiguration[~S, ~T]
- Returns
the validated configuration.
- __abstractmethods__ = frozenset({})¶
- __annotations__ = {'generate': 'Untargeted', 'generator': 'Union[Untargeted, Targeted[T]]', 'max_runtime': 'int', 'max_samples': <class 'int'>, 'target': 'Optional[T]'}¶
- __doc__ = 'RegressionTransformer Algorithm.'¶
- __module__ = 'gt4sd.algorithms.conditional_generation.regression_transformer.core'¶
- __orig_bases__ = (gt4sd.algorithms.core.GeneratorAlgorithm[~S, ~T],)¶
- __parameters__ = (~S, ~T)¶
- class RegressionTransformerMolecules(*args, **kwargs)[source]¶
Bases: RegressionTransformerMolecules, Generic[T]
Configuration to generate molecules given a continuous property target and a molecular sub-structure.
Implementation from the paper: https://arxiv.org/abs/2202.01338.
Examples
An example for generating a molecule around a desired property value:
config = RegressionTransformerMolecules(
    algorithm_version='solubility', search='sample', temperature=2, tolerance=5
)
target = "<esol>-3.534|[Br][C][=C][C][MASK][MASK][=C][C][=C][C][=C][Ring1][MASK][MASK][Branch2_3][Ring1][Branch1_2]"
solubility_generator = RegressionTransformer(configuration=config, target=target)
list(solubility_generator.sample(5))
An example for predicting the solubility of a molecule:
config = RegressionTransformerMolecules(
    algorithm_version='solubility', search='greedy'
)
target = "<esol>[MASK][MASK][MASK][MASK][MASK]|[Cl][C][Branch1_2][Branch1_2][=C][Branch1_1][C][Cl][Cl][Cl]"
solubility_generator = RegressionTransformer(configuration=config, target=target)
list(solubility_generator.sample(1))
- algorithm_type: ClassVar[str] = 'conditional_generation'¶
General type of generative algorithm.
- domain: ClassVar[str] = 'materials'¶
General application domain. Hints at input/output types.
- algorithm_version: str = 'solubility'¶
To differentiate between different versions of an application.
There is no imposed naming convention.
- search: str = 'sample'¶
Search algorithm to use for the generation: 'sample' or 'greedy'.
- temperature: float = 1.4¶
Temperature parameter for the softmax sampling in decoding.
- batch_size: int = 8¶
Batch size for the conditional generation.
- tolerance: Union[float, Dict[str, float]] = 20.0¶
Precision tolerance for the conditional generation task: the tolerated deviation between the desired/primed property and the predicted property of the generated molecule, given in percent of the property range encountered during training. Either a single float or a dict of floats keyed by property. The tolerance is only used for post-hoc filtering of the generated molecules.
- sampling_wrapper: Dict¶
High-level entry point for SMILES-level access: a dictionary used to build a custom sampling wrapper (see the sketch below). If it is used, the target needs to be a single SMILES string. Supported keys are 'fraction_to_mask' (ratio of tokens the model may change), 'tokens_to_mask' (atoms eligible for masking; an empty list, the default, means all tokens can be masked), 'property_goal' (target property values, keyed by properties supported by the algorithm version), 'substructures_to_mask' and 'substructures_to_keep' (SMILES substructures excluded from the stochastic masking; matching happens on the SELFIES string level), and 'text_filtering' (whether post-hoc filtering of 'substructures_to_keep' falls back to plain string matching when RDKit cannot parse a substructure; defaults to False and does not affect the generation itself).
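A hedged sketch of SMILES-level access via sampling_wrapper, following the keys listed in the field description above; the 'qed' algorithm version is one of the documented options, and the seed molecule and goal value are illustrative choices, not taken from the docstring.
from gt4sd.algorithms.conditional_generation.regression_transformer.core import (
    RegressionTransformer, RegressionTransformerMolecules
)

# Sketch: property-driven editing of a single SMILES through the sampling wrapper.
config = RegressionTransformerMolecules(
    algorithm_version='qed',
    search='sample',
    tolerance=5.0,
    sampling_wrapper={
        'fraction_to_mask': 0.5,           # up to half of the tokens may be changed
        'tokens_to_mask': [],              # empty list: all tokens are eligible
        'property_goal': {'<qed>': 0.85},  # desired property value
    },
)
# With a sampling wrapper the target must be a single SMILES string.
qed_generator = RegressionTransformer(configuration=config, target='COc1ccc(Cl)cc1')
samples = list(qed_generator.sample(5))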
- get_target_description()[source]¶
Get description of the target for generation.
- Return type
Dict[str, str]
- Returns
target description.
- get_conditional_generator(resources_path, context)[source]¶
Instantiate the actual generator implementation.
- Parameters
resources_path (str) – local path to model files.
context (str) – input sequence to be used for the generation.
- Return type
- Returns
instance with generate_batch method for targeted generation.
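A hypothetical sketch of calling this low-level generator directly; the resources path below is a placeholder, since in normal use the algorithm wrapper resolves the cached model files and calls this method itself.
from gt4sd.algorithms.conditional_generation.regression_transformer.core import (
    RegressionTransformerMolecules,
)

config = RegressionTransformerMolecules(algorithm_version='solubility', search='greedy')
target = "<esol>[MASK][MASK][MASK][MASK][MASK]|[Cl][C][Branch1_2][Branch1_2][=C][Branch1_1][C][Cl][Cl][Cl]"
# '/path/to/model/files' is a placeholder for a locally cached model version.
implementation = config.get_conditional_generator(
    resources_path='/path/to/model/files', context=target
)
batch = implementation.generate_batch(target)  # targeted generation, as described above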
- validate_item(item)[source]¶
Check that item is a valid sequence.
- Parameters
item (str) – a generated item that is possibly not valid.
- Raises
InvalidItem – in case the item can not be validated.
- Return type
Union[str, Mol]
- Returns
the validated item.
- classmethod get_filepath_mappings_for_training_pipeline_arguments()[source]¶
Get filepath mappings for the given training pipeline arguments.
- Parameters
training_pipeline_arguments (TrainingPipelineArguments) – training pipeline arguments.
- Return type
Dict[str, str]
- Returns
a mapping between artifacts’ files and training pipeline’s output files.
- __annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': typing.ClassVar[str], 'algorithm_version': <class 'str'>, 'batch_size': <class 'int'>, 'domain': typing.ClassVar[str], 'sampling_wrapper': typing.Dict, 'search': <class 'str'>, 'temperature': <class 'float'>, 'tolerance': typing.Union[float, typing.Dict[str, float]]}¶
- __dataclass_fields__ = {'algorithm_application': Field(name='algorithm_application',type=typing.ClassVar[str],default='RegressionTransformerMolecules',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_name': Field(name='algorithm_name',type=typing.ClassVar[str],default='RegressionTransformer',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_type': Field(name='algorithm_type',type=typing.ClassVar[str],default='conditional_generation',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_version': Field(name='algorithm_version',type=<class 'str'>,default='solubility',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'The version of the algorithm to use.', 'options': ['solubility', 'qed', 'logp_and_synthesizability']}),kw_only=False,_field_type=_FIELD), 'batch_size': Field(name='batch_size',type=<class 'int'>,default=8,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Batch size for the conditional generation'}),kw_only=False,_field_type=_FIELD), 'domain': Field(name='domain',type=typing.ClassVar[str],default='materials',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'sampling_wrapper': Field(name='sampling_wrapper',type=typing.Dict,default=<dataclasses._MISSING_TYPE object>,default_factory=<class 'dict'>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': "High-level entry point for SMILES-level access. Provide a\n dictionary that is used to build a custom sampling wrapper.\n NOTE: If this is used, the `target` needs to be a single SMILES string.\n Example: {\n 'fraction_to_mask': 0.5,\n 'tokens_to_mask': [],\n 'property_goal': {'<qed>': 0.85}\n }\n - 'fraction_to_mask' specifies the ratio of tokens that can be changed by\n the model.\n - 'tokens_to_mask' specifies which atoms can be masked. This defaults\n to an empty list, meaning that all tokens can be masked.\n - 'property_goal' specifies the target conditions for the generation. The\n properties need to be specified as a dictionary. The keys need to be\n properties supported by the algorithm version.\n - 'substructures_to_mask': Specifies a list of substructures that should be masked.\n Given in SMILES format. This is excluded from the stochastic masking.\n NOTE: The model operates on SELFIES and the matching of the substructures occurs\n in SELFIES simply on a string level.\n - 'substructures_to_keep': Specifies a list of substructures that should definitely be kept.\n Given in SMILES format. This is excluded from the stochastic masking.\n NOTE: This keeps tokens even if they are included in `tokens_to_mask`.\n NOTE: The model operates on SELFIES and the matching of the substructures occurs\n in SELFIES simply on a string level.\n - `text_filtering`: Generated sequences are post-hoc filtered for the presence of\n `substructures_to_keep`. 
This is done with RDKit substructure matches. If the sub-\n structure cant be converted to a mol object, this argument toggles whether a substructure\n should be ignored from post-hoc filtering (this happens per default) or whether\n filtering should occur on a pure string level. Defaults to False.\n NOTE: This does not affect the actual generation process.\n "}),kw_only=False,_field_type=_FIELD), 'search': Field(name='search',type=<class 'str'>,default='sample',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Search algorithm to use for the generation: sample or greedy', 'options': ['sample', 'greedy']}),kw_only=False,_field_type=_FIELD), 'temperature': Field(name='temperature',type=<class 'float'>,default=1.4,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Temperature parameter for the softmax sampling in decoding.'}),kw_only=False,_field_type=_FIELD), 'tolerance': Field(name='tolerance',type=typing.Union[float, typing.Dict[str, float]],default=20.0,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Precision tolerance for the conditional generation task. This is the\n tolerated eviation between desired/primed property and predicted property of the\n generated molecule. Given in percentage with respect to the property range encountered\n during training. Either a single float or a dict of floats with properties as\n NOTE: The tolerance is *only* used for post-hoc filtering of the generated molecules.\n '}),kw_only=False,_field_type=_FIELD)}¶
- __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶
- __eq__(other)¶
Return self==value.
- __hash__ = None¶
- __init__(*args, **kwargs)¶
- __match_args__ = ('algorithm_version', 'search', 'temperature', 'batch_size', 'tolerance', 'sampling_wrapper')¶
- __module__ = 'gt4sd.algorithms.conditional_generation.regression_transformer.core'¶
- __orig_bases__ = (<class 'types.RegressionTransformerMolecules'>, typing.Generic[~T])¶
- __parameters__ = (~T,)¶
- __pydantic_complete__ = True¶
- __pydantic_config__ = {}¶
- __pydantic_decorators__ = DecoratorInfos(validators={}, field_validators={}, root_validators={}, field_serializers={}, model_serializers={}, model_validators={}, computed_fields={})¶
- __pydantic_fields__ = {'algorithm_version': FieldInfo(annotation=str, required=False, default='solubility', description='The version of the algorithm to use.', init=True, init_var=False, kw_only=False), 'batch_size': FieldInfo(annotation=int, required=False, default=8, description='Batch size for the conditional generation', init=True, init_var=False, kw_only=False), 'sampling_wrapper': FieldInfo(annotation=Dict, required=False, default_factory=dict, description="High-level entry point for SMILES-level access. Provide a\n dictionary that is used to build a custom sampling wrapper.\n NOTE: If this is used, the `target` needs to be a single SMILES string.\n Example: {\n 'fraction_to_mask': 0.5,\n 'tokens_to_mask': [],\n 'property_goal': {'<qed>': 0.85}\n }\n - 'fraction_to_mask' specifies the ratio of tokens that can be changed by\n the model.\n - 'tokens_to_mask' specifies which atoms can be masked. This defaults\n to an empty list, meaning that all tokens can be masked.\n - 'property_goal' specifies the target conditions for the generation. The\n properties need to be specified as a dictionary. The keys need to be\n properties supported by the algorithm version.\n - 'substructures_to_mask': Specifies a list of substructures that should be masked.\n Given in SMILES format. This is excluded from the stochastic masking.\n NOTE: The model operates on SELFIES and the matching of the substructures occurs\n in SELFIES simply on a string level.\n - 'substructures_to_keep': Specifies a list of substructures that should definitely be kept.\n Given in SMILES format. This is excluded from the stochastic masking.\n NOTE: This keeps tokens even if they are included in `tokens_to_mask`.\n NOTE: The model operates on SELFIES and the matching of the substructures occurs\n in SELFIES simply on a string level.\n - `text_filtering`: Generated sequences are post-hoc filtered for the presence of\n `substructures_to_keep`. This is done with RDKit substructure matches. If the sub-\n structure cant be converted to a mol object, this argument toggles whether a substructure\n should be ignored from post-hoc filtering (this happens per default) or whether\n filtering should occur on a pure string level. Defaults to False.\n NOTE: This does not affect the actual generation process.\n ", init=True, init_var=False, kw_only=False), 'search': FieldInfo(annotation=str, required=False, default='sample', description='Search algorithm to use for the generation: sample or greedy', init=True, init_var=False, kw_only=False), 'temperature': FieldInfo(annotation=float, required=False, default=1.4, description='Temperature parameter for the softmax sampling in decoding.', init=True, init_var=False, kw_only=False), 'tolerance': FieldInfo(annotation=Union[float, Dict[str, float]], required=False, default=20.0, description='Precision tolerance for the conditional generation task. This is the\n tolerated eviation between desired/primed property and predicted property of the\n generated molecule. Given in percentage with respect to the property range encountered\n during training. Either a single float or a dict of floats with properties as\n NOTE: The tolerance is *only* used for post-hoc filtering of the generated molecules.\n ', init=True, init_var=False, kw_only=False)}¶
- __repr__()¶
Return repr(self).
- __signature__ = <Signature (algorithm_version: str = 'solubility', search: str = 'sample', temperature: float = 1.4, batch_size: int = 8, tolerance: Union[float, Dict[str, float]] = 20.0, sampling_wrapper: Dict = <factory>) -> None>¶
- __wrapped__¶
alias of RegressionTransformerMolecules
- class RegressionTransformerProteins(*args, **kwargs)[source]¶
Bases: RegressionTransformerProteins, Generic[T]
Configuration to generate proteins given a continuous property target and a partial amino-acid sequence.
Implementation from the paper: https://arxiv.org/abs/2202.01338. It can also predict the property given a full sequence.
Examples
An example for generating a peptide around a desired property value:
config = RegressionTransformerProteins(
    search='sample', temperature=2, tolerance=5
)
target = "<stab>1.1234|TTIKNG[MASK][MASK][MASK]YTVPLSPEQAAK[MASK][MASK][MASK]KKRWPDYEVQIHGNTVKVT"
stability_generator = RegressionTransformer(configuration=config, target=target)
list(stability_generator.sample(5))
An example for predicting the stability of a peptide:
config = RegressionTransformerProteins(search='greedy')
target = "<stab>[MASK][MASK][MASK][MASK][MASK]|GSQEVNSNASPEEAEIARKAGATTWTEKGNKWEIRI"
stability_generator = RegressionTransformer(configuration=config, target=target)
list(stability_generator.sample(1))
- algorithm_type: ClassVar[str] = 'conditional_generation'¶
General type of generative algorithm.
- domain: ClassVar[str] = 'materials'¶
General application domain. Hints at input/output types.
- algorithm_version: str = 'stability'¶
To differentiate between different versions of an application.
There is no imposed naming convention.
- search: str = 'sample'¶
Search algorithm to use for the generation: 'sample' or 'greedy'.
- temperature: float = 1.4¶
Temperature parameter for the softmax sampling in decoding.
- batch_size: int = 32¶
Batch size for the conditional generation.
- tolerance: Union[float, Dict[str, float]] = 20.0¶
Precision tolerance for the conditional generation task: the tolerated deviation between the desired/primed property and the predicted property of the generated sequence, given in percent of the property range encountered during training. Either a single float or a dict of floats keyed by property. The tolerance is only used for post-hoc filtering of the generated proteins.
- sampling_wrapper: Dict¶
High-level entry point that builds a custom sampling wrapper from a dictionary (see the sketch below). If it is used, the target is given as a single unmasked string. Supported keys are 'fraction_to_mask' (ratio of tokens the model may change), 'tokens_to_mask' (tokens eligible for masking; an empty list, the default, means all tokens can be masked) and 'property_goal' (target property values, keyed by properties supported by the algorithm version).
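A hedged sketch of sequence-level access via sampling_wrapper, analogous to the molecule sketch above; the '<stab>' property key, the goal value, and the use of a plain amino-acid sequence as the target are assumptions inferred from the stability examples in this module, not statements from the field documentation.
from gt4sd.algorithms.conditional_generation.regression_transformer.core import (
    RegressionTransformer, RegressionTransformerProteins
)

# Hedged sketch: sampling wrapper for the protein configuration.
config = RegressionTransformerProteins(
    search='sample',
    temperature=1.4,
    sampling_wrapper={
        'fraction_to_mask': 0.2,            # allow up to 20% of the residues to change
        'property_goal': {'<stab>': 1.0},   # assumed property key and goal value
    },
)
# Unmasked sequence taken from the prediction example above.
target = "GSQEVNSNASPEEAEIARKAGATTWTEKGNKWEIRI"
stability_generator = RegressionTransformer(configuration=config, target=target)
samples = list(stability_generator.sample(5))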
- get_target_description()[source]¶
Get description of the target for generation.
- Return type
Dict[str, str]
- Returns
target description.
- get_conditional_generator(resources_path, context)[source]¶
Instantiate the actual generator implementation.
- Parameters
resources_path (str) – local path to model files.
context (str) – input sequence to be used for the generation.
- Return type
- Returns
instance with generate_batch method for targeted generation.
- validate_item(item)[source]¶
Check that item is a valid sequence.
- Parameters
item (str) – a generated item that is possibly not valid.
- Raises
InvalidItem – in case the item can not be validated.
- Return type
Union[str, Mol]
- Returns
the validated item.
- __annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': typing.ClassVar[str], 'algorithm_version': <class 'str'>, 'batch_size': <class 'int'>, 'domain': typing.ClassVar[str], 'sampling_wrapper': typing.Dict, 'search': <class 'str'>, 'temperature': <class 'float'>, 'tolerance': typing.Union[float, typing.Dict[str, float]]}¶
- __dataclass_fields__ = {'algorithm_application': Field(name='algorithm_application',type=typing.ClassVar[str],default='RegressionTransformerProteins',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_name': Field(name='algorithm_name',type=typing.ClassVar[str],default='RegressionTransformer',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_type': Field(name='algorithm_type',type=typing.ClassVar[str],default='conditional_generation',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_version': Field(name='algorithm_version',type=<class 'str'>,default='stability',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'The version of the algorithm to use.', 'options': ['stability']}),kw_only=False,_field_type=_FIELD), 'batch_size': Field(name='batch_size',type=<class 'int'>,default=32,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Batch size for the conditional generation'}),kw_only=False,_field_type=_FIELD), 'domain': Field(name='domain',type=typing.ClassVar[str],default='materials',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'sampling_wrapper': Field(name='sampling_wrapper',type=typing.Dict,default=<dataclasses._MISSING_TYPE object>,default_factory=<class 'dict'>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': "High-level entry point for SMILES-level access. Provide a\n dictionary that is used to build a custom sampling wrapper.\n NOTE: If this is used, the `target` needs to be a single SMILES string.\n Example: {\n 'fraction_to_mask': 0.5,\n 'tokens_to_mask': [],\n 'property_goal': {'<qed>': 0.85}\n }\n - 'fraction_to_mask' specifies the ratio of tokens that can be changed by\n the model.\n - 'tokens_to_mask' specifies which atoms can be masked. This defaults\n to an empty list, meaning that all tokens can be masked.\n - 'property_goal' specifies the target conditions for the generation. The\n properties need to be specified as a dictionary. 
The keys need to be\n properties supported by the algorithm version.\n "}),kw_only=False,_field_type=_FIELD), 'search': Field(name='search',type=<class 'str'>,default='sample',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Search algorithm to use for the generation: sample or greedy'}),kw_only=False,_field_type=_FIELD), 'temperature': Field(name='temperature',type=<class 'float'>,default=1.4,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Temperature parameter for the softmax sampling in decoding.'}),kw_only=False,_field_type=_FIELD), 'tolerance': Field(name='tolerance',type=typing.Union[float, typing.Dict[str, float]],default=20.0,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Precision tolerance for the conditional generation task. This is the\n tolerated eviation between desired/primed property and predicted property of the\n generated molecule. Given in percentage with respect to the property range encountered\n during training. Either a single float or a dict of floats with properties as\n NOTE: The tolerance is *only* used for post-hoc filtering of the generated proteins.\n '}),kw_only=False,_field_type=_FIELD)}¶
- __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶
- __eq__(other)¶
Return self==value.
- __hash__ = None¶
- __init__(*args, **kwargs)¶
- __match_args__ = ('algorithm_version', 'search', 'temperature', 'batch_size', 'tolerance', 'sampling_wrapper')¶
- __module__ = 'gt4sd.algorithms.conditional_generation.regression_transformer.core'¶
- __orig_bases__ = (<class 'types.RegressionTransformerProteins'>, typing.Generic[~T])¶
- __parameters__ = (~T,)¶
- __pydantic_complete__ = True¶
- __pydantic_config__ = {}¶
- __pydantic_decorators__ = DecoratorInfos(validators={}, field_validators={}, root_validators={}, field_serializers={}, model_serializers={}, model_validators={}, computed_fields={})¶
- __pydantic_fields__ = {'algorithm_version': FieldInfo(annotation=str, required=False, default='stability', description='The version of the algorithm to use.', init=True, init_var=False, kw_only=False), 'batch_size': FieldInfo(annotation=int, required=False, default=32, description='Batch size for the conditional generation', init=True, init_var=False, kw_only=False), 'sampling_wrapper': FieldInfo(annotation=Dict, required=False, default_factory=dict, description="High-level entry point for SMILES-level access. Provide a\n dictionary that is used to build a custom sampling wrapper.\n NOTE: If this is used, the `target` needs to be a single SMILES string.\n Example: {\n 'fraction_to_mask': 0.5,\n 'tokens_to_mask': [],\n 'property_goal': {'<qed>': 0.85}\n }\n - 'fraction_to_mask' specifies the ratio of tokens that can be changed by\n the model.\n - 'tokens_to_mask' specifies which atoms can be masked. This defaults\n to an empty list, meaning that all tokens can be masked.\n - 'property_goal' specifies the target conditions for the generation. The\n properties need to be specified as a dictionary. The keys need to be\n properties supported by the algorithm version.\n ", init=True, init_var=False, kw_only=False), 'search': FieldInfo(annotation=str, required=False, default='sample', description='Search algorithm to use for the generation: sample or greedy', init=True, init_var=False, kw_only=False), 'temperature': FieldInfo(annotation=float, required=False, default=1.4, description='Temperature parameter for the softmax sampling in decoding.', init=True, init_var=False, kw_only=False), 'tolerance': FieldInfo(annotation=Union[float, Dict[str, float]], required=False, default=20.0, description='Precision tolerance for the conditional generation task. This is the\n tolerated eviation between desired/primed property and predicted property of the\n generated molecule. Given in percentage with respect to the property range encountered\n during training. Either a single float or a dict of floats with properties as\n NOTE: The tolerance is *only* used for post-hoc filtering of the generated proteins.\n ', init=True, init_var=False, kw_only=False)}¶
- __repr__()¶
Return repr(self).
- __signature__ = <Signature (algorithm_version: str = 'stability', search: str = 'sample', temperature: float = 1.4, batch_size: int = 32, tolerance: Union[float, Dict[str, float]] = 20.0, sampling_wrapper: Dict = <factory>) -> None>¶
- __wrapped__¶
alias of RegressionTransformerProteins