gt4sd.algorithms.generation.pgt.core module
Patent Generative Transformer (PGT) generation algorithm.
Summary
Classes:
PGT | PGT Algorithm.
PGTAlgorithmConfiguration | Basic configuration for a PGT algorithm.
PGTCoherenceChecker | Configuration for a PGT coherence check algorithm.
PGTEditor | Configuration for a PGT Editor algorithm.
PGTGenerator | Configuration for a PGT Generator algorithm.
Reference
- class PGT(configuration, target=None)
Bases: GeneratorAlgorithm[S, None]
PGT Algorithm.
- __init__(configuration, target=None)
Instantiate PGT ready to generate items.
- Parameters
configuration (AlgorithmConfiguration[~S, None]) – domain and application specification defining parameters, types and validations.
target (None) – unused since it is not a conditional generator.
Example
An example for generating an abstract from a given claim:
    from gt4sd.algorithms.generation.pgt.core import PGT, PGTGenerator

    # The task name uses hyphens, matching the supported task list of PGTGenerator.
    config = PGTGenerator(task="claim-to-abstract", input_text="My interesting claim")
    generator = PGT(configuration=config)
    print(list(generator.sample(1)))
- get_generator(configuration, target)
Get the function to sample with the given configuration.
- Parameters
configuration (AlgorithmConfiguration[~S, None]) – helps to set up specific application of PGT.
target (None) – context or condition for the generation. Unused in the algorithm.
- Return type
Callable[[], Iterable[Any]]
- Returns
callable with target generating a batch of items.
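The return type above is a zero-argument callable that yields one batch per call. The following is a minimal sketch of how a sampler could draw a fixed number of items from such a callable. It is not the GT4SD implementation of sample, and sample_from_batches and fake_generate are hypothetical names introduced only for illustration.

```python
from itertools import islice
from typing import Any, Callable, Iterable, Iterator, List


def sample_from_batches(
    generate: Callable[[], Iterable[Any]], number_of_items: int
) -> Iterator[Any]:
    """Yield up to number_of_items items by calling generate() repeatedly."""

    def stream() -> Iterator[Any]:
        while True:
            batch = list(generate())
            if not batch:
                # Stop instead of looping forever when a batch comes back empty.
                return
            yield from batch

    return islice(stream(), number_of_items)


def fake_generate() -> List[str]:
    """Hypothetical stand-in for the callable returned by get_generator."""
    return ["abstract A", "abstract B", "abstract C"]


print(list(sample_from_batches(fake_generate, 4)))
```

Requesting four items from a generator that returns three per batch triggers a second call, so the fourth item comes from the second batch.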
- validate_configuration(configuration)
Overload to validate a configuration for the algorithm.
- Parameters
configuration (AlgorithmConfiguration[~S, None]) – the algorithm configuration.
- Raises
InvalidAlgorithmConfiguration – in case the configuration for the algorithm is invalid.
- Return type
AlgorithmConfiguration[~S, None]
- Returns
the validated configuration.
- class PGTAlgorithmConfiguration(*args, **kwargs)
Bases: PGTAlgorithmConfiguration, Generic[T]
Basic configuration for a PGT algorithm.
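- algorithm_type: ClassVar[str] = 'generation'
General type of generative algorithm.
- domain: ClassVar[str] = 'nlp'
General application domain. Hints at input/output types.
- algorithm_version: str = 'v0'
To differentiate between different versions of an application.
There is no imposed naming convention.
- model_type: str = ''
Type of the model.
- max_length: int = 512
Maximum length of the generated text.
- top_k: int = 50
Number of top-k probability tokens to keep.
- top_p: float = 1.0
Only tokens with cumulative probabilities summing up to this value are kept.
- num_return_sequences: int = 3
Number of alternatives to be generated.
- no_repeat_ngram_size: int = 2
Size of n-gram to not appear twice.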
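The three sampling fields above (top_k, top_p and no_repeat_ngram_size) can be easier to read with a toy example. The sketch below shows the general idea behind each on a small hand-written distribution. It is a simplified illustration and not the code PGT runs internally. Both helper functions, filter_top_k_top_p and banned_next_tokens, are hypothetical.

```python
from typing import Dict, List, Set


def filter_top_k_top_p(
    probs: Dict[str, float], top_k: int = 50, top_p: float = 1.0
) -> Dict[str, float]:
    """Keep the top_k most likely tokens, then the smallest prefix reaching top_p."""
    # Sort tokens by descending probability and keep at most top_k of them.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        # Stop once the kept tokens cover at least top_p of the probability mass.
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    # Renormalise so the surviving probabilities sum to one.
    return {token: p / total for token, p in kept}


def banned_next_tokens(tokens: List[str], no_repeat_ngram_size: int = 2) -> Set[str]:
    """Return tokens that would complete an n-gram already present in tokens."""
    n = no_repeat_ngram_size
    if n <= 0:
        return set()
    # The last n - 1 tokens form the prefix of the n-gram about to be completed.
    prefix = tuple(tokens[len(tokens) - (n - 1):])
    banned = set()
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n - 1]) == prefix:
            banned.add(tokens[i + n - 1])
    return banned


probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(filter_top_k_top_p(probs, top_k=3, top_p=0.7))
print(banned_next_tokens(["a", "b", "c", "a"], no_repeat_ngram_size=2))
```

With the default top_p of 1.0 only the top_k cut is effective, and with no_repeat_ngram_size=2 a token is banned when it would repeat a bigram that already occurred.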
- get_target_description()
Get description of the target for generation.
- Return type
Optional[Dict[str, str]]
- Returns
target description, returns None in case no target is used.
- get_generator(resources_path, **kwargs)
Instantiate the actual PGT implementation.
- Parameters
resources_path (str) – local path to model files.
- Returns
instance with generate_batch method for targeted generation.
- classmethod save_version_from_training_pipeline_arguments_postprocess(training_pipeline_arguments)
Postprocess after saving. Removes the temporarily converted Hugging Face model if a PyTorch Lightning checkpoint is given.
- Parameters
training_pipeline_arguments (TrainingPipelineArguments) – training pipeline arguments.
- classmethod get_filepath_mappings_for_training_pipeline_arguments(training_pipeline_arguments)
Get filepath mappings for the given training pipeline arguments.
- Parameters
training_pipeline_arguments (TrainingPipelineArguments) – training pipeline arguments.
- Return type
Dict[str, str]
- Returns
a mapping between artifacts’ files and training pipeline’s output files.
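- class PGTGenerator(*args, **kwargs)
Bases: PGTGenerator
Configuration for a PGT Generator algorithm.
- input_text: str = 'This is my input'
Input text.
- task: str = 'title-to-abstract'
Generation task. Supported: title-to-abstract, abstract-to-claim, claim-to-abstract, abstract-to-title.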
- get_generator(resources_path, **kwargs)
Instantiate the actual PGT implementation for part of patent generation.
- Parameters
resources_path (str) – local path to model files.
- Returns
instance with generate_batch method for targeted generation.
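To make the documented defaults and the hyphenated task names concrete, here is a small stand-in written with the standard library only. GeneratorConfigSketch is a hypothetical class that mirrors the fields listed for PGTGenerator. The check in __post_init__ is an assumption added for illustration, and this page does not state whether the real class rejects unknown task names.

```python
from dataclasses import dataclass

# Task names as listed in the description of the task field.
SUPPORTED_TASKS = (
    "title-to-abstract",
    "abstract-to-claim",
    "claim-to-abstract",
    "abstract-to-title",
)


@dataclass
class GeneratorConfigSketch:
    """Hypothetical stand-in mirroring the documented PGTGenerator defaults."""

    input_text: str = "This is my input"
    task: str = "title-to-abstract"
    max_length: int = 512
    top_k: int = 50
    top_p: float = 1.0
    num_return_sequences: int = 3
    no_repeat_ngram_size: int = 2

    def __post_init__(self) -> None:
        # Assumed check: reject names outside the list, e.g. underscore variants.
        if self.task not in SUPPORTED_TASKS:
            raise ValueError(
                f"Unsupported task {self.task!r}; expected one of {SUPPORTED_TASKS}"
            )


config = GeneratorConfigSketch(task="claim-to-abstract", input_text="My interesting claim")
print(config.task, config.num_return_sequences)
```

A name such as claim_to_abstract, written with underscores, would fail this check, which is an easy slip to make when typing task names by hand.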
- __annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': 'ClassVar[str]', 'algorithm_version': 'str', 'domain': 'ClassVar[str]', 'input_text': <class 'str'>, 'max_length': 'int', 'model_type': 'str', 'no_repeat_ngram_size': 'int', 'num_return_sequences': 'int', 'task': <class 'str'>, 'top_k': 'int', 'top_p': 'float'}¶
- __dataclass_fields__ = {'algorithm_application': Field(name='algorithm_application',type=typing.ClassVar[str],default='PGTGenerator',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_name': Field(name='algorithm_name',type=typing.ClassVar[str],default='PGT',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_type': Field(name='algorithm_type',type=typing.ClassVar[str],default='generation',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_version': Field(name='algorithm_version',type=<class 'str'>,default='v0',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=False,_field_type=_FIELD), 'domain': Field(name='domain',type=typing.ClassVar[str],default='nlp',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'input_text': Field(name='input_text',type=<class 'str'>,default='This is my input',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Input text.'}),kw_only=False,_field_type=_FIELD), 'max_length': Field(name='max_length',type=<class 'int'>,default=512,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Maximum length of the generated text.'}),kw_only=False,_field_type=_FIELD), 'model_type': Field(name='model_type',type=<class 
'str'>,default='',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Type of the model.'}),kw_only=False,_field_type=_FIELD), 'no_repeat_ngram_size': Field(name='no_repeat_ngram_size',type=<class 'int'>,default=2,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Size of n-gram to not appear twice.'}),kw_only=False,_field_type=_FIELD), 'num_return_sequences': Field(name='num_return_sequences',type=<class 'int'>,default=3,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Number of alternatives to be generated.'}),kw_only=False,_field_type=_FIELD), 'task': Field(name='task',type=<class 'str'>,default='title-to-abstract',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Generation tasks. Supported: title-to-abstract, abstract-to-claim, claim-to-abstract, abstract-to-title'}),kw_only=False,_field_type=_FIELD), 'top_k': Field(name='top_k',type=<class 'int'>,default=50,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Number of top-k probability tokens to keep.'}),kw_only=False,_field_type=_FIELD), 'top_p': Field(name='top_p',type=<class 'float'>,default=1.0,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Only tokens with cumulative probabilities summing up to this value are kept.'}),kw_only=False,_field_type=_FIELD)}¶
- __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶
- __doc__ = 'Configuration for a PGT Generator algorithm'¶
- __eq__(other)¶
Return self==value.
- __hash__ = None¶
- __init__(*args, **kwargs)¶
- __match_args__ = ('algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_text', 'task')¶
- __module__ = 'gt4sd.algorithms.generation.pgt.core'¶
- __parameters__ = (~T,)¶
- __pydantic_complete__ = True¶
- __pydantic_config__ = {}¶
- __pydantic_core_schema__ = {'cls': <class 'gt4sd.algorithms.generation.pgt.core.PGTGenerator'>, 'config': {'title': 'PGTGenerator'}, 'fields': ['algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_text', 'task'], 'frozen': False, 'metadata': {'pydantic_js_annotation_functions': [], 'pydantic_js_functions': [functools.partial(<function modify_model_json_schema>, cls=<class 'gt4sd.algorithms.generation.pgt.core.PGTGenerator'>, title=None)]}, 'post_init': False, 'ref': 'types.PGTGenerator:94662825939424', 'schema': {'collect_init_only': False, 'computed_fields': [], 'dataclass_name': 'PGTGenerator', 'fields': [{'type': 'dataclass-field', 'name': 'algorithm_version', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'v0'}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'model_type', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': ''}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'max_length', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 512}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'top_k', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 50}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'top_p', 'schema': {'type': 
'default', 'schema': {'type': 'float'}, 'default': 1.0}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'num_return_sequences', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 3}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'no_repeat_ngram_size', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 2}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'input_text', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'This is my input'}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'task', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'title-to-abstract'}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}], 'type': 'dataclass-args'}, 'slots': True, 'type': 'dataclass'}¶
- __pydantic_decorators__ = DecoratorInfos(validators={}, field_validators={}, root_validators={}, field_serializers={}, model_serializers={}, model_validators={}, computed_fields={})¶
- __pydantic_fields__ = {'algorithm_version': FieldInfo(annotation=str, required=False, default='v0', init=True, init_var=False, kw_only=False), 'input_text': FieldInfo(annotation=str, required=False, default='This is my input', description='Input text.', init=True, init_var=False, kw_only=False), 'max_length': FieldInfo(annotation=int, required=False, default=512, description='Maximum length of the generated text.', init=True, init_var=False, kw_only=False), 'model_type': FieldInfo(annotation=str, required=False, default='', description='Type of the model.', init=True, init_var=False, kw_only=False), 'no_repeat_ngram_size': FieldInfo(annotation=int, required=False, default=2, description='Size of n-gram to not appear twice.', init=True, init_var=False, kw_only=False), 'num_return_sequences': FieldInfo(annotation=int, required=False, default=3, description='Number of alternatives to be generated.', init=True, init_var=False, kw_only=False), 'task': FieldInfo(annotation=str, required=False, default='title-to-abstract', description='Generation tasks. Supported: title-to-abstract, abstract-to-claim, claim-to-abstract, abstract-to-title', init=True, init_var=False, kw_only=False), 'top_k': FieldInfo(annotation=int, required=False, default=50, description='Number of top-k probability tokens to keep.', init=True, init_var=False, kw_only=False), 'top_p': FieldInfo(annotation=float, required=False, default=1.0, description='Only tokens with cumulative probabilities summing up to this value are kept.', init=True, init_var=False, kw_only=False)}¶
- __pydantic_serializer__ = SchemaSerializer(...)¶
Pydantic-generated serializer for the PGTGenerator dataclass. It covers nine fields, each with a default: algorithm_version (str), model_type (str), max_length (int), top_k (int), top_p (float), num_return_sequences (int), no_repeat_ngram_size (int), input_text (str), task (str).
- __pydantic_validator__ = SchemaValidator(title="PGTGenerator", ...)¶
Pydantic-generated validator for the PGTGenerator dataclass (validator name: dataclass-args[PGTGenerator]). Validation is non-strict, extra arguments are ignored, and the dataclass is not frozen. The nine fields are validated in this order: algorithm_version, model_type, max_length, top_k, top_p, num_return_sequences, no_repeat_ngram_size, input_text, task.
- __repr__()¶
Return repr(self).
- __signature__ = <Signature (*args: Any, algorithm_version: str = 'v0', model_type: str = '', max_length: int = 512, top_k: int = 50, top_p: float = 1.0, num_return_sequences: int = 3, no_repeat_ngram_size: int = 2, input_text: str = 'This is my input', task: str = 'title-to-abstract') -> None>¶
- __wrapped__¶
alias of
PGTGenerator
- algorithm_application: ClassVar[str] = 'PGTGenerator'¶
Unique name for the application, that is, the use of this configuration together with a specific algorithm.
Will be set when registering to ApplicationsRegistry, but can be given by direct registration (see register_algorithm_application).
- algorithm_name: ClassVar[str] = 'PGT'¶
Name of the algorithm to use with this configuration.
Will be set when registering to ApplicationsRegistry.
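The defaults listed in the PGTGenerator signature can be overridden by keyword. The following is a minimal sketch using a stand-in dataclass, `PGTGeneratorSketch`, which is hypothetical: it only mirrors the documented field names and defaults, and it does not import gt4sd or load any model.

```python
from dataclasses import asdict, dataclass, replace


@dataclass
class PGTGeneratorSketch:
    """Hypothetical stand-in mirroring the documented PGTGenerator fields."""

    algorithm_version: str = "v0"
    model_type: str = ""
    max_length: int = 512
    top_k: int = 50
    top_p: float = 1.0
    num_return_sequences: int = 3
    no_repeat_ngram_size: int = 2
    input_text: str = "This is my input"
    task: str = "title-to-abstract"


# Override only the fields that differ from the documented defaults.
config = PGTGeneratorSketch(input_text="A method for sorting packages", max_length=128)

# Derive a variant without mutating the original configuration.
single = replace(config, num_return_sequences=1)

print(asdict(single))
```

With the real class, the same keyword arguments would presumably be passed to PGTGenerator and the resulting configuration handed to PGT(configuration=...), as in the example given for the PGT class.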
- class PGTEditor(*args, **kwargs)[source]¶
Bases:
PGTEditor
Configuration for a PGT Editor algorithm.
- input_text: str = 'This is my input'¶
- input_type: str = 'abstract'¶
- get_generator(resources_path, **kwargs)[source]¶
Instantiate the actual PGT implementation for editing a part of a patent.
- Parameters
resources_path (str) – local path to model files.
- Return type
- Returns
instance with generate_batch method for targeted generation.
- __annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': 'ClassVar[str]', 'algorithm_version': 'str', 'domain': 'ClassVar[str]', 'input_text': <class 'str'>, 'input_type': <class 'str'>, 'max_length': 'int', 'model_type': 'str', 'no_repeat_ngram_size': 'int', 'num_return_sequences': 'int', 'top_k': 'int', 'top_p': 'float'}¶
- __dataclass_fields__ = {'algorithm_application': Field(name='algorithm_application',type=typing.ClassVar[str],default='PGTEditor',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_name': Field(name='algorithm_name',type=typing.ClassVar[str],default='PGT',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_type': Field(name='algorithm_type',type=typing.ClassVar[str],default='generation',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_version': Field(name='algorithm_version',type=<class 'str'>,default='v0',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=False,_field_type=_FIELD), 'domain': Field(name='domain',type=typing.ClassVar[str],default='nlp',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'input_text': Field(name='input_text',type=<class 'str'>,default='This is my input',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Input text.'}),kw_only=False,_field_type=_FIELD), 'input_type': Field(name='input_type',type=<class 'str'>,default='abstract',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Part of a patent the input text belongs. 
Supported: abstract, claim'}),kw_only=False,_field_type=_FIELD), 'max_length': Field(name='max_length',type=<class 'int'>,default=512,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Maximum length of the generated text.'}),kw_only=False,_field_type=_FIELD), 'model_type': Field(name='model_type',type=<class 'str'>,default='',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Type of the model.'}),kw_only=False,_field_type=_FIELD), 'no_repeat_ngram_size': Field(name='no_repeat_ngram_size',type=<class 'int'>,default=2,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Size of n-gram to not appear twice.'}),kw_only=False,_field_type=_FIELD), 'num_return_sequences': Field(name='num_return_sequences',type=<class 'int'>,default=3,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Number of alternatives to be generated.'}),kw_only=False,_field_type=_FIELD), 'top_k': Field(name='top_k',type=<class 'int'>,default=50,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Number of top-k probability tokens to keep.'}),kw_only=False,_field_type=_FIELD), 'top_p': Field(name='top_p',type=<class 'float'>,default=1.0,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Only tokens with cumulative probabilities summing up to this value are kept.'}),kw_only=False,_field_type=_FIELD)}¶
- __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶
- __doc__ = 'Configuration for a PGT Editor algorithm.'¶
- __eq__(other)¶
Return self==value.
- __hash__ = None¶
- __init__(*args, **kwargs)¶
- __match_args__ = ('algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_text', 'input_type')¶
- __module__ = 'gt4sd.algorithms.generation.pgt.core'¶
- __parameters__ = (~T,)¶
- __pydantic_complete__ = True¶
- __pydantic_config__ = {}¶
- __pydantic_core_schema__ = {'cls': <class 'gt4sd.algorithms.generation.pgt.core.PGTEditor'>, 'config': {'title': 'PGTEditor'}, 'fields': ['algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_text', 'input_type'], 'frozen': False, 'metadata': {'pydantic_js_annotation_functions': [], 'pydantic_js_functions': [functools.partial(<function modify_model_json_schema>, cls=<class 'gt4sd.algorithms.generation.pgt.core.PGTEditor'>, title=None)]}, 'post_init': False, 'ref': 'types.PGTEditor:94662829196352', 'schema': {'collect_init_only': False, 'computed_fields': [], 'dataclass_name': 'PGTEditor', 'fields': [{'type': 'dataclass-field', 'name': 'algorithm_version', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'v0'}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'model_type', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': ''}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'max_length', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 512}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'top_k', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 50}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'top_p', 'schema': {'type': 'default', 
'schema': {'type': 'float'}, 'default': 1.0}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'num_return_sequences', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 3}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'no_repeat_ngram_size', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 2}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'input_text', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'This is my input'}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'input_type', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'abstract'}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}], 'type': 'dataclass-args'}, 'slots': True, 'type': 'dataclass'}¶
- __pydantic_decorators__ = DecoratorInfos(validators={}, field_validators={}, root_validators={}, field_serializers={}, model_serializers={}, model_validators={}, computed_fields={})¶
- __pydantic_fields__ = {'algorithm_version': FieldInfo(annotation=str, required=False, default='v0', init=True, init_var=False, kw_only=False), 'input_text': FieldInfo(annotation=str, required=False, default='This is my input', description='Input text.', init=True, init_var=False, kw_only=False), 'input_type': FieldInfo(annotation=str, required=False, default='abstract', description='Part of a patent the input text belongs. Supported: abstract, claim', init=True, init_var=False, kw_only=False), 'max_length': FieldInfo(annotation=int, required=False, default=512, description='Maximum length of the generated text.', init=True, init_var=False, kw_only=False), 'model_type': FieldInfo(annotation=str, required=False, default='', description='Type of the model.', init=True, init_var=False, kw_only=False), 'no_repeat_ngram_size': FieldInfo(annotation=int, required=False, default=2, description='Size of n-gram to not appear twice.', init=True, init_var=False, kw_only=False), 'num_return_sequences': FieldInfo(annotation=int, required=False, default=3, description='Number of alternatives to be generated.', init=True, init_var=False, kw_only=False), 'top_k': FieldInfo(annotation=int, required=False, default=50, description='Number of top-k probability tokens to keep.', init=True, init_var=False, kw_only=False), 'top_p': FieldInfo(annotation=float, required=False, default=1.0, description='Only tokens with cumulative probabilities summing up to this value are kept.', init=True, init_var=False, kw_only=False)}¶
- __pydantic_serializer__ = SchemaSerializer(...)¶
Pydantic-generated serializer for the PGTEditor dataclass. It covers nine fields, each with a default: algorithm_version (str), model_type (str), max_length (int), top_k (int), top_p (float), num_return_sequences (int), no_repeat_ngram_size (int), input_text (str), input_type (str).
- __pydantic_validator__ = SchemaValidator(title="PGTEditor", ...)¶
Pydantic-generated validator for the PGTEditor dataclass (validator name: dataclass-args[PGTEditor]). Validation is non-strict, extra arguments are ignored, and the dataclass is not frozen. The nine fields are validated in this order: algorithm_version, model_type, max_length, top_k, top_p, num_return_sequences, no_repeat_ngram_size, input_text, input_type.
- __repr__()¶
Return repr(self).
- __signature__ = <Signature (*args: Any, algorithm_version: str = 'v0', model_type: str = '', max_length: int = 512, top_k: int = 50, top_p: float = 1.0, num_return_sequences: int = 3, no_repeat_ngram_size: int = 2, input_text: str = 'This is my input', input_type: str = 'abstract') -> None>¶
- algorithm_application: ClassVar[str] = 'PGTEditor'¶
Unique name for the application, that is, the use of this configuration together with a specific algorithm.
Will be set when registering to ApplicationsRegistry, but can be given by direct registration (see register_algorithm_application).
- algorithm_name: ClassVar[str] = 'PGT'¶
Name of the algorithm to use with this configuration.
Will be set when registering to ApplicationsRegistry.
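The field metadata describes top_k as the "number of top-k probability tokens to keep" and top_p as keeping "only tokens with cumulative probabilities summing up to this value". The sketch below illustrates that general filtering technique on a toy distribution in plain Python. The function name `filter_top_k_top_p` is hypothetical, and this is not the code PGT runs internally; it follows one common convention in which the token that crosses the top_p threshold is still kept.

```python
from typing import Dict


def filter_top_k_top_p(
    probs: Dict[str, float], top_k: int = 50, top_p: float = 1.0
) -> Dict[str, float]:
    """Keep the top_k most likely tokens, then the smallest prefix whose
    cumulative probability reaches top_p, and renormalize what is left."""
    # Rank tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)

    # Top-k: truncate the ranking (a non-positive top_k disables the filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: accumulate probability mass until the threshold is reached.
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(filter_top_k_top_p(toy, top_k=3))      # keeps a, b, c
print(filter_top_k_top_p(toy, top_p=0.7))    # keeps a, b
```

With the documented defaults (top_k=50, top_p=1.0) only the top-k cut has any effect, since the cumulative threshold of 1.0 is reached only after all remaining tokens are included.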
- class PGTCoherenceChecker(*args, **kwargs)[source]¶
Bases:
PGTCoherenceChecker
Configuration for a PGT coherence check algorithm.
- num_return_sequences: int = 1¶
- input_a: str = "I'm a stochastic parrot."¶
- input_b: str = "I'm a stochastic parrot."¶
- coherence_type: str = 'title-abstract'¶
- get_generator(resources_path, **kwargs)[source]¶
Instantiate the actual PGT implementation for patent coherence check.
- Parameters
resources_path (str) – local path to model files.
- Return type
- Returns
instance with generate_batch method for targeted generation.
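The field metadata for this class lists three supported values for coherence_type (title-abstract, abstract-claim, title-claim) and notes that num_return_sequences should always be 1 for a coherence check. A caller might verify both before building the configuration. The helper below, `check_coherence_settings`, is a hypothetical pre-flight check written for illustration; it is not part of gt4sd, and whether the library performs an equivalent validation itself is not stated here.

```python
SUPPORTED_COHERENCE_TYPES = ("title-abstract", "abstract-claim", "title-claim")


def check_coherence_settings(coherence_type: str, num_return_sequences: int = 1) -> None:
    """Raise ValueError if the settings contradict the documented field metadata."""
    if coherence_type not in SUPPORTED_COHERENCE_TYPES:
        raise ValueError(
            f"Unsupported coherence_type {coherence_type!r}; "
            f"expected one of {SUPPORTED_COHERENCE_TYPES}."
        )
    if num_return_sequences != 1:
        raise ValueError("num_return_sequences should always be 1 for a coherence check.")


# Passes silently: both values match the documented constraints.
check_coherence_settings("abstract-claim")

# Fails: the pair is given in an order that is not listed as supported.
try:
    check_coherence_settings("claim-title")
except ValueError as error:
    print(error)
```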
- __annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': 'ClassVar[str]', 'algorithm_version': 'str', 'coherence_type': <class 'str'>, 'domain': 'ClassVar[str]', 'input_a': <class 'str'>, 'input_b': <class 'str'>, 'max_length': 'int', 'model_type': 'str', 'no_repeat_ngram_size': 'int', 'num_return_sequences': <class 'int'>, 'top_k': 'int', 'top_p': 'float'}¶
- __dataclass_fields__ = {'algorithm_application': Field(name='algorithm_application',type=typing.ClassVar[str],default='PGTCoherenceChecker',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_name': Field(name='algorithm_name',type=typing.ClassVar[str],default='PGT',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_type': Field(name='algorithm_type',type=typing.ClassVar[str],default='generation',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'algorithm_version': Field(name='algorithm_version',type=<class 'str'>,default='v0',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=False,_field_type=_FIELD), 'coherence_type': Field(name='coherence_type',type=<class 'str'>,default='title-abstract',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Input types for the check. Supported: title-abstract, abstract-claim, title-claim'}),kw_only=False,_field_type=_FIELD), 'domain': Field(name='domain',type=typing.ClassVar[str],default='nlp',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({}),kw_only=<dataclasses._MISSING_TYPE object>,_field_type=_FIELD_CLASSVAR), 'input_a': Field(name='input_a',type=<class 'str'>,default="I'm a stochastic parrot.",default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'First input for coherence check.'}),kw_only=False,_field_type=_FIELD), 'input_b': Field(name='input_b',type=<class 'str'>,default="I'm a stochastic parrot.",default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Second input for coherence check.'}),kw_only=False,_field_type=_FIELD), 'max_length': Field(name='max_length',type=<class 'int'>,default=512,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Maximum length of the generated text.'}),kw_only=False,_field_type=_FIELD), 'model_type': Field(name='model_type',type=<class 'str'>,default='',default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Type of the model.'}),kw_only=False,_field_type=_FIELD), 'no_repeat_ngram_size': Field(name='no_repeat_ngram_size',type=<class 'int'>,default=2,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Size of n-gram to not appear twice.'}),kw_only=False,_field_type=_FIELD), 'num_return_sequences': Field(name='num_return_sequences',type=<class 'int'>,default=1,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Number of alternatives should be always 1 for coherence check.'}),kw_only=False,_field_type=_FIELD), 'top_k': Field(name='top_k',type=<class 'int'>,default=50,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Number of top-k probability tokens to keep.'}),kw_only=False,_field_type=_FIELD), 'top_p': Field(name='top_p',type=<class 'float'>,default=1.0,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'description': 'Only tokens with cumulative probabilities summing up to this value are kept.'}),kw_only=False,_field_type=_FIELD)}¶
- __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶
- __doc__ = 'Configuration for a PGT coherence check algorithm'¶
- __eq__(other)¶
Return self==value.
- __hash__ = None¶
- __init__(*args, **kwargs)¶
- __match_args__ = ('algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_a', 'input_b', 'coherence_type')¶
- __module__ = 'gt4sd.algorithms.generation.pgt.core'¶
- __parameters__ = (~T,)¶
- __pydantic_complete__ = True¶
- __pydantic_config__ = {}¶
- __pydantic_core_schema__ = {'cls': <class 'gt4sd.algorithms.generation.pgt.core.PGTCoherenceChecker'>, 'config': {'title': 'PGTCoherenceChecker'}, 'fields': ['algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_a', 'input_b', 'coherence_type'], 'frozen': False, 'metadata': {'pydantic_js_annotation_functions': [], 'pydantic_js_functions': [functools.partial(<function modify_model_json_schema>, cls=<class 'gt4sd.algorithms.generation.pgt.core.PGTCoherenceChecker'>, title=None)]}, 'post_init': False, 'ref': 'types.PGTCoherenceChecker:94662829214736', 'schema': {'collect_init_only': False, 'computed_fields': [], 'dataclass_name': 'PGTCoherenceChecker', 'fields': [{'type': 'dataclass-field', 'name': 'algorithm_version', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'v0'}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'model_type', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': ''}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'max_length', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 512}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'top_k', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 50}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 
'dataclass-field', 'name': 'top_p', 'schema': {'type': 'default', 'schema': {'type': 'float'}, 'default': 1.0}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'num_return_sequences', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 1}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'no_repeat_ngram_size', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 2}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'input_a', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': "I'm a stochastic parrot."}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'input_b', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': "I'm a stochastic parrot."}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}, {'type': 'dataclass-field', 'name': 'coherence_type', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'title-abstract'}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_functions': [], 'pydantic_js_annotation_functions': [<function get_json_schema_update_func.<locals>.json_schema_update_func>]}}], 'type': 'dataclass-args'}, 'slots': True, 'type': 'dataclass'}¶
- __pydantic_decorators__ = DecoratorInfos(validators={}, field_validators={}, root_validators={}, field_serializers={}, model_serializers={}, model_validators={}, computed_fields={})¶
- __pydantic_fields__ = {'algorithm_version': FieldInfo(annotation=str, required=False, default='v0', init=True, init_var=False, kw_only=False), 'coherence_type': FieldInfo(annotation=str, required=False, default='title-abstract', description='Input types for the check. Supported: title-abstract, abstract-claim, title-claim', init=True, init_var=False, kw_only=False), 'input_a': FieldInfo(annotation=str, required=False, default="I'm a stochastic parrot.", description='First input for coherence check.', init=True, init_var=False, kw_only=False), 'input_b': FieldInfo(annotation=str, required=False, default="I'm a stochastic parrot.", description='Second input for coherence check.', init=True, init_var=False, kw_only=False), 'max_length': FieldInfo(annotation=int, required=False, default=512, description='Maximum length of the generated text.', init=True, init_var=False, kw_only=False), 'model_type': FieldInfo(annotation=str, required=False, default='', description='Type of the model.', init=True, init_var=False, kw_only=False), 'no_repeat_ngram_size': FieldInfo(annotation=int, required=False, default=2, description='Size of n-gram to not appear twice.', init=True, init_var=False, kw_only=False), 'num_return_sequences': FieldInfo(annotation=int, required=False, default=1, description='Number of alternatives should be always 1 for coherence check.', init=True, init_var=False, kw_only=False), 'top_k': FieldInfo(annotation=int, required=False, default=50, description='Number of top-k probability tokens to keep.', init=True, init_var=False, kw_only=False), 'top_p': FieldInfo(annotation=float, required=False, default=1.0, description='Only tokens with cumulative probabilities summing up to this value are kept.', init=True, init_var=False, kw_only=False)}¶
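The `top_k` and `top_p` descriptions above refer to the usual truncation applied to a next-token distribution before sampling. The following is a minimal pure-Python sketch of that idea and not the code PGT runs: the function name `filter_top_k_top_p` and the toy distribution are hypothetical, and libraries differ in details such as whether the cumulative sum is taken before or after renormalizing.

```python
from typing import Dict


def filter_top_k_top_p(
    probs: Dict[str, float], top_k: int, top_p: float
) -> Dict[str, float]:
    """Keep the top_k most probable tokens, then the shortest prefix whose
    cumulative probability reaches top_p, and renormalize what is left."""
    # Sort tokens by descending probability and apply the top-k cut.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]

    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        # Stop once the kept tokens cover at least top_p of the mass.
        if cumulative >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Hypothetical toy distribution over four tokens.
toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
filtered = filter_top_k_top_p(toy, top_k=3, top_p=0.7)
print(filtered)
```

With the defaults listed here (`top_k=50`, `top_p=1.0`), this sketch would leave a four-token distribution unchanged, since neither cut removes anything.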
- __repr__()¶
Return repr(self).
- __signature__ = <Signature (*args: Any, algorithm_version: str = 'v0', model_type: str = '', max_length: int = 512, top_k: int = 50, top_p: float = 1.0, num_return_sequences: int = 1, no_repeat_ngram_size: int = 2, input_a: str = "I'm a stochastic parrot.", input_b: str = "I'm a stochastic parrot.", coherence_type: str = 'title-abstract') -> None>¶
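Since every field in the signature above has a default, a configuration can be built by passing only the values that differ. The sketch below is a stand-in written with a plain `dataclasses.dataclass`, not the gt4sd class itself: `CoherenceCheckConfig`, `SUPPORTED_COHERENCE_TYPES` and the `__post_init__` check are hypothetical and only mirror the field names, defaults and supported values listed in this reference.

```python
from dataclasses import dataclass

# Values taken from the coherence_type field description in this reference.
SUPPORTED_COHERENCE_TYPES = ("title-abstract", "abstract-claim", "title-claim")


@dataclass
class CoherenceCheckConfig:
    """Hypothetical stand-in mirroring the PGTCoherenceChecker fields."""

    algorithm_version: str = "v0"
    model_type: str = ""
    max_length: int = 512
    top_k: int = 50
    top_p: float = 1.0
    num_return_sequences: int = 1  # documented as always 1 for coherence checks
    no_repeat_ngram_size: int = 2
    input_a: str = "I'm a stochastic parrot."
    input_b: str = "I'm a stochastic parrot."
    coherence_type: str = "title-abstract"

    def __post_init__(self) -> None:
        # Assumption: the real class may not perform this check at construction.
        if self.coherence_type not in SUPPORTED_COHERENCE_TYPES:
            raise ValueError(f"unsupported coherence_type: {self.coherence_type!r}")


# Override only the inputs and the kind of check; keep the other defaults.
config = CoherenceCheckConfig(
    input_a="A title",
    input_b="An abstract",
    coherence_type="title-abstract",
)
print(config.coherence_type, config.max_length)
```

With gt4sd installed, the analogous call would presumably follow the pattern of the PGT example earlier in this module, passing `PGTCoherenceChecker(input_a="A title", input_b="An abstract", coherence_type="title-abstract")` as the `configuration` of `PGT` and then calling `sample(1)`.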
- __wrapped__¶
alias of PGTCoherenceChecker
- algorithm_application: ClassVar[str] = 'PGTCoherenceChecker'¶
Unique name for the application, that is, the use of this configuration together with a specific algorithm.
Will be set when registering to ApplicationsRegistry, but can be given by direct registration (see register_algorithm_application).
- algorithm_name: ClassVar[str] = 'PGT'¶
Name of the algorithm to use with this configuration.
Will be set when registering to
ApplicationsRegistry