gt4sd.algorithms.prediction.topics_zero_shot.core module¶
Algorithms for topic modelling using zero-shot learning via MNLI models.
Summary¶
Classes:
TopicsPredictor | Configuration to generate topics.
TopicsZeroShot | Topics prediction algorithm.
Reference¶
- class TopicsZeroShot(configuration, target)[source]¶
Bases: GeneratorAlgorithm[S, T]
Topics prediction algorithm.
- __init__(configuration, target)[source]¶
Instantiate TopicsZeroShot ready to predict topics.
- Parameters
configuration (AlgorithmConfiguration[~S, ~T]) – domain and application specification defining parameters, types and validations.
target (Optional[~T, None]) – a target for which to generate items.
Example
An example for predicting topics for a given text:
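    from gt4sd.algorithms.prediction.topics_zero_shot.core import TopicsPredictor, TopicsZeroShot

    config = TopicsPredictor()
    algorithm = TopicsZeroShot(configuration=config, target="This is a text I want to understand better")
    # Draw one sample: the predicted topics for the target text.
    items = list(algorithm.sample(1))
    print(items)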
- get_generator(configuration, target)[source]¶
Get the function to perform the prediction via TopicsZeroShot’s generator.
- Parameters
configuration (AlgorithmConfiguration[~S, ~T]) – helps to set up a specific application of TopicsZeroShot.
target (Optional[~T, None]) – context or condition for the generation.
- Return type
Callable[[~T], Iterable[Any]]
- Returns
callable that takes a target and predicts topics sorted by relevance.
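The contract described for get_generator, a function that receives a configuration and returns a callable mapping a target to topics sorted by relevance, can be illustrated with a small stand-alone sketch. It does not import gt4sd: ToyConfiguration, toy_get_generator and the word-count scoring are hypothetical stand-ins chosen for illustration, not the library's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class ToyConfiguration:
    """Hypothetical stand-in for an AlgorithmConfiguration."""

    candidate_topics: Tuple[str, ...] = ("chemistry", "music", "sports")


def toy_get_generator(
    configuration: ToyConfiguration, target: Optional[str] = None
) -> Callable[[str], Iterable[str]]:
    """Return a callable mapping a target text to topics sorted by relevance.

    The `target` argument only mirrors the documented signature; this
    sketch does not use it when building the callable.
    """

    def predict(text: str) -> List[str]:
        words = text.lower().split()
        # Toy relevance (an assumption): how often the topic name occurs.
        scores = {topic: words.count(topic) for topic in configuration.candidate_topics}
        # Highest score first; ties are broken alphabetically.
        return sorted(scores, key=lambda topic: (-scores[topic], topic))

    return predict


generator = toy_get_generator(ToyConfiguration())
ranked = generator("I love music and more music but little chemistry")
print(ranked)
```

Returning a closure keeps the configuration bound to the callable, so the caller only has to supply the target text.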
- class TopicsPredictor(*args, **kwargs)[source]¶
Bases: TopicsPredictor, Generic[T]
Configuration to generate topics.
- algorithm_type: ClassVar[str] = 'prediction'¶
General type of generative algorithm.
- domain: ClassVar[str] = 'nlp'¶
General application domain. Hints at input/output types.
- algorithm_version: str = 'dbpedia'¶
To differentiate between different versions of an application.
There is no imposed naming convention.
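- model_name: str = 'facebook/bart-large-mnli'¶
MNLI model name to use. If the model is not found in the cache, a download from HuggingFace will be attempted.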
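Zero-shot topic prediction with an MNLI model is commonly done by pairing the text with one hypothesis per candidate topic and comparing the resulting entailment scores. The sketch below covers only the final ranking step, on made-up logits. The function rank_topics, the topic names and the numbers are hypothetical, and whether this module normalizes scores in exactly this way is an assumption.

```python
import math
from typing import Dict, List, Tuple


def rank_topics(entailment_logits: Dict[str, float]) -> List[Tuple[str, float]]:
    """Softmax entailment logits across candidate topics, then sort descending."""
    # Subtract the maximum logit for numerical stability.
    peak = max(entailment_logits.values())
    exps = {topic: math.exp(logit - peak) for topic, logit in entailment_logits.items()}
    total = sum(exps.values())
    scores = {topic: value / total for topic, value in exps.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical entailment logits, one per candidate topic, for hypotheses
# such as "This example is about <topic>."
logits = {"Company": 0.2, "Artist": 2.1, "Animal": -1.3}
ranking = rank_topics(logits)
for topic, score in ranking:
    print(f"{topic}: {score:.3f}")
```

Normalizing across topics makes the scores sum to one, which suits the case where exactly one topic is expected to apply.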
- get_target_description()[source]¶
Get description of the target for generation.
- Return type
Dict[str, str]
- Returns
target description.
- get_conditional_generator(resources_path)[source]¶
Instantiate the actual generator implementation.
- Parameters
resources_path (str) – local path to model files.
- Return type
- Returns
instance with generate_batch method for targeted generation.
- __eq__(other)¶
Return self==value.
- __hash__ = None¶
- __init__(*args, **kwargs)¶
- __match_args__ = ('algorithm_version', 'model_name')¶
- __module__ = 'gt4sd.algorithms.prediction.topics_zero_shot.core'¶
- __orig_bases__ = (<class 'types.TopicsPredictor'>, typing.Generic[~T])¶
- __parameters__ = (~T,)¶
- __repr__()¶
Return repr(self).
- __signature__ = <Signature (algorithm_version: str = 'dbpedia', model_name: str = 'facebook/bart-large-mnli') -> None>¶
- __wrapped__¶
alias of TopicsPredictor