gt4sd.algorithms.core module

Base classes and core code used across multiple algorithms.

Summary

Classes:

AlgorithmConfiguration

Algorithm parameter definitions and implementation setup.

ConfigurablePropertyAlgorithmConfiguration

A configurable AlgorithmConfiguration to be used by the properties submodule.

GeneratorAlgorithm

Interface for automated generation via an AlgorithmConfiguration.

PredictorAlgorithm

Interface for automated prediction via an AlgorithmConfiguration.

PropertyPredictor

Functions:

get_configuration_class_with_attributes

Get AlgorithmConfiguration with set attributes.

Reference

class GeneratorAlgorithm(configuration, target=None)[source]

Bases: ABC, Generic[S, T]

Interface for automated generation via an AlgorithmConfiguration.

generator: Union[Callable[[], Iterable[Any]], Callable[[T], Iterable[Any]]]
target: Optional[T]
max_runtime: int = 86400

The maximum amount of time we should let the algorithm run

max_samples: int = 1000000

The maximum number of samples a user can try to run in one go

generate: Callable[[], Iterable[Any]]
__init__(configuration, target=None)[source]

Targeted or untargeted generation.

Parameters
  • configuration (AlgorithmConfiguration[~S, ~T]) – application-specific helper used to set up the generator.

  • target (Optional[~T, None]) – context or condition for the generation. Defaults to None.

abstract get_generator(configuration, target)[source]

Set up the detailed implementation using the configuration.

Note

This is the main method to implement in child classes. It is called at instantiation of the GeneratorAlgorithm and must return a callable:

  • Either untargeted: the callable takes no arguments, and target has to be None.

  • Or targeted: the callable takes the target as its argument, and target must not be None.

Parameters
  • configuration (AlgorithmConfiguration[~S, ~T]) – application-specific helper used to set up the generator.

  • target (Optional[~T, None]) – context or condition for the generation. Defaults to None.

Return type

Union[Callable[[], Iterable[Any]], Callable[[~T], Iterable[Any]]]

Returns

generator, the detail implementation used for generation. If the target is None, the generator is assumed to be untargeted.
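The contract above can be sketched as follows. To stay runnable without gt4sd installed, the sketch uses a stand-in base class (SketchGeneratorAlgorithm, a hypothetical name) and a plain dict as configuration; only the shape of get_generator is taken from this page, and the real base class may do more at instantiation.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Iterable, Optional, Union

Untargeted = Callable[[], Iterable[Any]]
Targeted = Callable[[str], Iterable[Any]]


class SketchGeneratorAlgorithm(ABC):
    """Hypothetical stand-in mirroring the get_generator contract."""

    def __init__(self, configuration: dict, target: Optional[str] = None):
        self.target = target
        # Called once at instantiation, as the note above describes.
        self.generator = self.get_generator(configuration, target)

    @abstractmethod
    def get_generator(
        self, configuration: dict, target: Optional[str]
    ) -> Union[Untargeted, Targeted]:
        ...


class PrefixGenerator(SketchGeneratorAlgorithm):
    """Toy child class: untargeted without a target, targeted otherwise."""

    def get_generator(self, configuration, target):
        size = configuration["batch_size"]
        if target is None:
            # Untargeted: the callable takes no arguments.
            return lambda: [f"item-{i}" for i in range(size)]
        # Targeted: the callable takes the target as its only argument.
        return lambda t: [f"{t}-{i}" for i in range(size)]


untargeted = PrefixGenerator({"batch_size": 2})
targeted = PrefixGenerator({"batch_size": 2}, target="CCO")
print(list(untargeted.generator()))
print(list(targeted.generator(targeted.target)))
```

The abstract method is the only thing a child class has to provide here; which of the two callables it returns depends solely on whether a target was passed.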

timeout(item_set, detail, error)[source]

Throws a timeout exception if applicable, otherwise returns items gracefully.

Parameters
  • item_set (Set) – Set of items generated thus far.

  • detail (str) – context or condition for the generation.

  • error (TimeoutError) – an error instance of a child class of TimeoutError, either GT4SDTimeoutError or SamplingError.

Raises

TimeoutError – If no items were sampled so far.

Return type

None
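A minimal sketch of the behaviour described for timeout, assuming the only decision is whether anything was sampled before the limit was hit. The function name timeout_or_return and the printed message are illustrative, not part of gt4sd.

```python
from typing import Set


def timeout_or_return(item_set: Set, detail: str, error: TimeoutError) -> None:
    """Hypothetical stand-in for the documented timeout behaviour."""
    if len(item_set) == 0:
        # Nothing was sampled before the limit was hit: surface the error.
        raise error
    # Some items exist: report and let the caller keep what was sampled.
    print(f"{detail} Returning {len(item_set)} item(s) sampled so far.")
    return None


timeout_or_return({"CCO"}, "Runtime limit reached.", TimeoutError("too slow"))
```

Passing the error instance in, rather than constructing it inside, lets the caller choose between the two error types named above.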

_setup_untargeted_generator(configuration, generator, target=None)[source]

Targeted or untargeted generation.

Parameters
  • configuration (AlgorithmConfiguration[~S, ~T]) – application-specific helper used to set up the generator.

  • generator (Union[Callable[[], Iterable[Any]], Callable[[~T], Iterable[Any]]]) – the detailed implementation used for generation. If the target is None, the generator is assumed to be untargeted.

  • target (Optional[~T, None]) – context or condition for the generation. Defaults to None.

Return type

Callable[[], Iterable[Any]]
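The return type suggests that a targeted callable is reduced to one that takes no arguments. One plausible way to do that is to bind the target with functools.partial; this is an assumption about the technique, not a copy of the gt4sd implementation, and setup_untargeted_generator is a hypothetical helper.

```python
from functools import partial
from typing import Any, Callable, Iterable, Optional


def setup_untargeted_generator(
    generator: Callable[..., Iterable[Any]], target: Optional[Any] = None
) -> Callable[[], Iterable[Any]]:
    """Hypothetical helper: return a callable that needs no arguments."""
    if target is None:
        # Already untargeted, use it as is.
        return generator
    # Bind the target so callers can invoke the result without arguments.
    return partial(generator, target)


def targeted(target: str) -> Iterable[str]:
    return [f"{target}-a", f"{target}-b"]


generate = setup_untargeted_generator(targeted, target="CCO")
print(list(generate()))
```

With this shape, downstream code such as a sampling loop only ever deals with a single zero-argument generate callable.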

sample(number_of_items=100)[source]

Generate a number of unique and valid items.

Filters duplicate items and iterates batches of generated items to reach the desired number of samples, but the number of yielded items is not guaranteed: In case the generate method does not create new samples for GT4SD_MAX_NUMBER_OF_STUCK_CALLS times, it will terminate the sampling process.

Parameters

number_of_items (int) – number of items to generate. Defaults to 100.

Raises
  • SamplingError – when requesting too many items.

  • GT4SDTimeoutError – when the algorithm takes longer than the allowed time limit, or when no items were yielded (i.e., no new samples were generated for many consecutive calls).

Yields

the items.

Return type

Iterator[~S]
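The sampling loop described above (deduplicate, iterate over batches, give up after too many batches without new items) can be sketched like this. MAX_STUCK_CALLS is an assumed stand-in for GT4SD_MAX_NUMBER_OF_STUCK_CALLS, and the sketch leaves out item validation, the runtime limit and the errors listed under Raises.

```python
from typing import Any, Callable, Iterable, Iterator

MAX_STUCK_CALLS = 3  # assumed stand-in for GT4SD_MAX_NUMBER_OF_STUCK_CALLS


def sample_unique(
    generate: Callable[[], Iterable[Any]], number_of_items: int = 100
) -> Iterator[Any]:
    """Yield unique items batch by batch until enough are collected."""
    seen = set()
    stuck_calls = 0
    while len(seen) < number_of_items and stuck_calls < MAX_STUCK_CALLS:
        found_new = False
        for item in generate():
            if item in seen:
                continue  # filter duplicates
            seen.add(item)
            found_new = True
            yield item
            if len(seen) == number_of_items:
                return
        # Count consecutive batches that produced nothing new.
        stuck_calls = 0 if found_new else stuck_calls + 1


def three_items() -> Iterable[str]:
    # A generator that can only ever produce three distinct items.
    return ["a", "b", "a", "c"]


items = list(sample_unique(three_items, number_of_items=10))
print(items)
```

The example asks for ten items from a source that only has three, which illustrates why the number of yielded items is not guaranteed.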

validate_configuration(configuration)[source]

Overload to validate a configuration for the algorithm.

Parameters

configuration (AlgorithmConfiguration) – the algorithm configuration.

Raises

InvalidAlgorithmConfiguration – in case the configuration for the algorithm is invalid.

Return type

AlgorithmConfiguration

Returns

the validated configuration.

class PredictorAlgorithm(configuration)[source]

Bases: ABC, Generic[S, T]

Interface for automated prediction via an AlgorithmConfiguration.

max_runtime: int = 86400

The maximum amount of time we should let the algorithm run

__init__(configuration)[source]

Set up the predictor from a configuration.

Parameters

configuration (AlgorithmConfiguration[~S, ~T]) – application-specific helper used to set up the predictor.

get_predictor(configuration)[source]

Set up the predictive model from the configuration. This is called at instantiation of the PredictorAlgorithm and must return a callable.

Return type

Callable[[Any], Any]

Returns

predictor, a callable that takes an item and returns a prediction.

abstract get_model(resources_path)[source]

Restore the model from a local path.

Note

This is the main method to implement in child classes. It is called at instantiation of the PredictorAlgorithm and must return a callable.

Parameters

resources_path (str) – local path to the downloaded artifacts.

Return type

Callable[[Any], Any]

Returns

Predictor (callable)

predict(input)[source]

Perform a prediction for an input.

Parameters

input (Any) – the input for the predictive model.

Raises

TimeoutError – when the walltime limit is hit.

Return type

Any

Returns

the prediction.

__call__(input)[source]

Alias for self.predict.

Return type

Any
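A rough sketch of the call flow documented for PredictorAlgorithm: the model is restored once at instantiation, predict applies it, and calling the instance is an alias for predict. The base class below is a hypothetical stand-in that takes a resources path directly; the real class takes a configuration and goes through get_predictor.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable


class SketchPredictorAlgorithm(ABC):
    """Hypothetical stand-in mirroring the documented call flow."""

    def __init__(self, resources_path: str):
        # The model is restored once, at instantiation.
        self.predictor = self.get_model(resources_path)

    @abstractmethod
    def get_model(self, resources_path: str) -> Callable[[Any], Any]:
        ...

    def predict(self, input: Any) -> Any:
        return self.predictor(input)

    def __call__(self, input: Any) -> Any:
        # Alias for self.predict.
        return self.predict(input)


class LengthPredictor(SketchPredictorAlgorithm):
    """Toy child class whose 'model' is just the built-in len."""

    def get_model(self, resources_path: str) -> Callable[[Any], Any]:
        # A real child class would load weights from resources_path.
        return len


model = LengthPredictor("/tmp/hypothetical-artifacts")
print(model.predict("CCO"), model("CCO"))
```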

class AlgorithmConfiguration(algorithm_version='')[source]

Bases: Generic[S, T]

Algorithm parameter definitions and implementation setup.

The signature of this class constructor (given by the instance attributes) is used for the REST API and needs to be serializable.

Child classes will add additional instance attributes to configure their respective algorithms. This will require setting default values for all of the attributes defined here. However, the values for algorithm_name and algorithm_application are set by the registering decorator.

This strict setup has the following desired effects:

  1. Ease child implementation. For example:

    from typing import ClassVar

    from gt4sd.algorithms.registry import ApplicationsRegistry
    from gt4sd.algorithms.core import AlgorithmConfiguration

    # ChildOfGeneratorAlgorithm is a GeneratorAlgorithm subclass defined elsewhere
    @ApplicationsRegistry.register_algorithm_application(ChildOfGeneratorAlgorithm)
    class ConfigurationForChildOfGeneratorAlgorithm(AlgorithmConfiguration):
        algorithm_type: ClassVar[str] = 'generation'
        domain: ClassVar[str] = 'materials'
        algorithm_version: str = 'version3.14'
        actual_parameter: float = 1.61

        # no __init__ definition required

  2. Retrieve the algorithm and configuration easily (via the four class attributes) from the ApplicationsRegistry. For example:

    from gt4sd.algorithms.registry import ApplicationsRegistry

    application = ApplicationsRegistry.get_application(
        algorithm_type='generation',
        domain='materials',
        algorithm_name='ChildOfGeneratorAlgorithm',
        algorithm_application='ConfigurationForChildOfGeneratorAlgorithm',
    )
    Algorithm = application.algorithm_class
    Configuration = application.configuration_class

  3. An effortless validation at instantiation via pydantic.

  4. An effortless mapping to artifacts on s3, see ensure_artifacts().
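The split between class attributes and instance attributes can be illustrated with the standard library alone, assuming the configuration behaves like a dataclass. SketchConfiguration is a hypothetical class; unlike the pydantic-backed validation mentioned above, a plain dataclass does not check types.

```python
from dataclasses import asdict, dataclass, fields
from typing import ClassVar


@dataclass
class SketchConfiguration:
    # Class-level attributes: shared by all instances, not constructor arguments.
    algorithm_type: ClassVar[str] = "generation"
    domain: ClassVar[str] = "materials"
    # Instance attributes: these form the constructor signature.
    algorithm_version: str = "version3.14"
    actual_parameter: float = 1.61


configuration = SketchConfiguration(actual_parameter=2.0)
init_parameters = [field.name for field in fields(SketchConfiguration)]
print(init_parameters)
print(asdict(configuration))
```

Only the instance attributes show up as fields, which is why every one of them needs a default value and must be serializable.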

Todo

show how to register a configuration manually (in case it applies to multiple algorithms and/or applications)

algorithm_type: ClassVar[str] = 'generation'

General type of generative algorithm.

domain: ClassVar[str]

General application domain. Hints at input/output types.

algorithm_name: ClassVar[str] = 'HuggingFaceGenerationAlgorithm'

Name of the algorithm to use with this configuration.

Will be set when registering to ApplicationsRegistry

algorithm_application: ClassVar[str] = 'HuggingFaceSeq2SeqGenerator'

Unique name for the application, i.e., the use of this configuration together with a specific algorithm.

Will be set when registering to ApplicationsRegistry, but can be given by direct registration (See register_algorithm_application)

algorithm_version: str = 't5-small'

To differentiate between different versions of an application.

There is no imposed naming convention.

get_target_description()[source]

Get description of the target for generation.

Return type

Optional[Dict[str, str], None]

Returns

target description, returns None in case no target is used.

to_dict()[source]

Represent the configuration as a dictionary.

Return type

Dict[str, Any]

Returns

description of the configuration with parameters description.

validate_item(item)[source]

Overload to validate an item.

Parameters

item (Any) – the item to validate.

Raises

InvalidItem – in case the item cannot be validated.

Returns

the validated item.

Return type

S
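A sketch of how an overloaded validate_item might be used to filter generated items. SketchInvalidItem stands in for the InvalidItem exception named above (its real constructor may differ), and the non-empty-string rule is purely illustrative.

```python
class SketchInvalidItem(ValueError):
    """Stand-in for the InvalidItem exception (assumed shape)."""


class SketchConfiguration:
    def validate_item(self, item):
        # Hypothetical rule: accept only non-empty strings.
        if not isinstance(item, str) or not item:
            raise SketchInvalidItem(f"cannot validate item: {item!r}")
        return item


def keep_valid(configuration, items):
    """Return the items that pass validation, in their original order."""
    valid = []
    for item in items:
        try:
            valid.append(configuration.validate_item(item))
        except SketchInvalidItem:
            continue  # drop items that fail validation
    return valid


print(keep_valid(SketchConfiguration(), ["CCO", "", None, "C"]))
```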

classmethod get_application_prefix()[source]

Get prefix up to the specific application.

Return type

str

Returns

the application prefix.

classmethod list_versions()[source]

Get possible algorithm versions.

Both S3 and the local cache are searched for matching versions.

Return type

Set[str]

Returns

viable values as algorithm_version for the environment.

classmethod list_remote_versions(prefix)[source]

Get possible algorithm versions on S3.

Before uploading an artifact to S3, we need to check that the particular version is not already present, so that it is not overwritten by mistake. If the final set is empty, we can upload the artifact folder. If the final set is not empty, we need to check that the specific version of interest is not in it.

Only S3 is searched (not the local cache) for matching versions.

Return type

Set[str]

Returns

viable values as algorithm_version for the environment.
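The pre-upload check described for list_remote_versions boils down to a set membership test. In the sketch below, can_upload is a hypothetical helper and remote_versions stands for the set that list_remote_versions would return.

```python
from typing import Set


def can_upload(remote_versions: Set[str], target_version: str) -> bool:
    """Return True when uploading would not overwrite an existing version."""
    if not remote_versions:
        # Nothing on the remote yet: safe to upload.
        return True
    # Otherwise the specific version of interest must be absent.
    return target_version not in remote_versions


print(can_upload(set(), "v1"))
print(can_upload({"v0", "v1"}, "v1"))
```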

classmethod get_filepath_mappings_for_training_pipeline_arguments(training_pipeline_arguments)[source]

Get filepath mappings for the given training pipeline arguments.

Parameters

training_pipeline_arguments (TrainingPipelineArguments) – training pipeline arguments.

Raises

ValueError – in case no mapping is available.

Return type

Dict[str, str]

Returns

a mapping between artifacts’ files and training pipeline’s output files.

classmethod save_version_from_training_pipeline_arguments_postprocess(training_pipeline_arguments)[source]

Postprocess after saving.

Parameters

training_pipeline_arguments (TrainingPipelineArguments) – training pipeline arguments.

classmethod save_version_from_training_pipeline_arguments(training_pipeline_arguments, target_version, source_version=None)[source]

Save a version using training pipeline arguments.

Parameters
  • training_pipeline_arguments (TrainingPipelineArguments) – training pipeline arguments.

  • target_version (str) – target version used to save the model in the cache.

  • source_version (Optional[str, None]) – source version to use for missing artifacts. Defaults to None, i.e., use the default version.

Return type

None

classmethod upload_version_from_training_pipeline_arguments_postprocess(training_pipeline_arguments)[source]

Postprocess after uploading. Not implemented yet.

Parameters

training_pipeline_arguments (TrainingPipelineArguments) – training pipeline arguments.

classmethod upload_version_from_training_pipeline_arguments(training_pipeline_arguments, target_version, source_version=None)[source]

Upload a version using training pipeline arguments.

Parameters
  • training_pipeline_arguments (TrainingPipelineArguments) – training pipeline arguments.

  • target_version (str) – target version used to save the model in s3.

  • source_version (Optional[str, None]) – source version to use for missing artifacts. Defaults to None, i.e., use the default version.

Return type

None

classmethod ensure_artifacts_for_version(algorithm_version)[source]

The artifacts matching the path defined by class attributes and the given version are downloaded.

That is, all objects under algorithm_type/algorithm_name/algorithm_application/algorithm_version in the bucket are downloaded.

Parameters

algorithm_version (str) – version of the algorithm to ensure artifacts for.

Return type

str

Returns

the common local path of the matching artifacts.

ensure_artifacts()[source]

The artifacts matching the path defined by class attributes are downloaded.

That is, all objects under algorithm_type/algorithm_name/algorithm_application/algorithm_version in the bucket are downloaded.

Return type

str

Returns

the common local path of the matching artifacts.
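The bucket prefix described above can be assembled from the four attributes. The helper artifact_prefix is hypothetical, and the values are the attribute defaults shown on this page; how gt4sd builds the prefix internally is not shown here.

```python
import posixpath


def artifact_prefix(
    algorithm_type: str,
    algorithm_name: str,
    algorithm_application: str,
    algorithm_version: str,
) -> str:
    """Join the four attributes into a forward-slash separated prefix."""
    return posixpath.join(
        algorithm_type, algorithm_name, algorithm_application, algorithm_version
    )


prefix = artifact_prefix(
    "generation",
    "HuggingFaceGenerationAlgorithm",
    "HuggingFaceSeq2SeqGenerator",
    "t5-small",
)
print(prefix)
```

posixpath is used instead of os.path so that the separator stays a forward slash on every platform, which matches how object keys in a bucket are usually written.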


class ConfigurablePropertyAlgorithmConfiguration(domain, algorithm_name, algorithm_application, algorithm_version='v0', algorithm_type='prediction')[source]

Bases: AlgorithmConfiguration

A configurable AlgorithmConfiguration to be used by the properties submodule.

module: str = 'properties'
__init__(domain, algorithm_name, algorithm_application, algorithm_version='v0', algorithm_type='prediction')[source]

Parameters
  • domain (str) – submodule of properties where the model resides.

  • algorithm_name (str) – name of the predictive model, e.g., MCA.

  • algorithm_application (str) – application of the algorithm, e.g., dataset it was trained on, like Tox21.

  • algorithm_version (str) – version of the algorithm, defaults to v0.

  • algorithm_type (str) – type of the algorithm. This should be prediction.

get_application_prefix()[source]

Get prefix up to the specific application.

NOTE: Unlike the parent method this uses the assigned attributes since it’s configurable.

Return type

str

Returns

the application prefix.

ensure_artifacts_for_version(algorithm_version)[source]

The artifacts matching the path defined by class attributes and the given version are downloaded.

NOTE: Unlike the parent method this uses the assigned attributes since it’s configurable.

That is, all objects under algorithm_type/algorithm_name/algorithm_application/algorithm_version in the bucket are downloaded.

Parameters

algorithm_version (str) – version of the algorithm to ensure artifacts for.

Return type

str

Returns

the common local path of the matching artifacts.

list_versions()[source]

Get possible algorithm versions.

NOTE: Unlike the parent method this uses the assigned attributes since it’s configurable.

Both S3 and the local cache are searched for matching versions.

Return type

Set[str]

Returns

viable values as algorithm_version for the environment.

get_configuration_class_with_attributes(klass)[source]

Get AlgorithmConfiguration with set attributes.

Parameters

klass (Type[AlgorithmConfiguration]) – a class to be used to extract attributes from.

Return type

Type[AlgorithmConfiguration]

Returns

a class with the attributes set.

class PropertyPredictor(context)[source]

Bases: ABC, Generic[S, U]

__init__(context)[source]

Property predictor to investigate items.

Parameters

context (~U) – the context in which a property of an item can be computed or checked; it is very application-specific.

abstract satisfies(item)[source]

Check whether an item satisfies given requirements.

Parameters

item (~S) – the item to check.

Return type

bool

compute(item)[source]

Compute some metric/property on an item.

Parameters

item (~S) – the item to compute a metric on.

Returns

the computed metric/property.

Return type

Any
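A sketch of a child class built on the interface above, where the context is a maximum length and the item is a string. SketchPropertyPredictor is a hypothetical stand-in for PropertyPredictor so that the example runs without gt4sd; the default behaviour of compute in the real class is not documented here and is assumed.

```python
from abc import ABC, abstractmethod
from typing import Any, Generic, TypeVar

S = TypeVar("S")
U = TypeVar("U")


class SketchPropertyPredictor(ABC, Generic[S, U]):
    """Hypothetical stand-in: satisfies is abstract, compute is optional."""

    def __init__(self, context: U) -> None:
        self.context = context

    @abstractmethod
    def satisfies(self, item: S) -> bool:
        ...

    def compute(self, item: S) -> Any:
        # Assumed default: child classes override this when they need it.
        raise NotImplementedError


class MaxLength(SketchPropertyPredictor[str, int]):
    """Toy predictor: an item satisfies it if it is short enough."""

    def compute(self, item: str) -> int:
        return len(item)

    def satisfies(self, item: str) -> bool:
        # The context holds the maximum allowed length.
        return self.compute(item) <= self.context


predictor = MaxLength(3)
print(predictor.satisfies("CCO"), predictor.satisfies("CCCC"))
```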
