gt4sd.algorithms.generation.pgt.core module¶
Patent Generative Transformer (PGT) generation algorithm.
Summary¶
Classes:
- PGT – PGT Algorithm.
- PGTAlgorithmConfiguration – Basic configuration for a PGT algorithm.
- PGTCoherenceChecker – Configuration for a PGT coherence check algorithm.
- PGTEditor – Configuration for a PGT Editor algorithm.
- PGTGenerator – Configuration for a PGT Generator algorithm.
Reference¶
- class PGT(configuration, target=None)[source]¶
Bases: GeneratorAlgorithm[S, None]
PGT Algorithm.
- __init__(configuration, target=None)[source]¶
Instantiate PGT ready to generate items.
- Parameters
configuration (AlgorithmConfiguration[S, None]) – domain and application specification defining parameters, types and validations.
target (None) – unused since it is not a conditional generator.
Example
An example of generating an abstract from a given claim:

config = PGTGenerator(task="claim-to-abstract", input_text="My interesting claim")
generator = PGT(configuration=config)
print(list(generator.sample(1)))
- get_generator(configuration, target)[source]¶
Get the function to sample with the given configuration.
- Parameters
configuration (AlgorithmConfiguration[S, None]) – helps to set up the specific application of PGT.
target (None) – context or condition for the generation. Unused in the algorithm.
- Return type
Callable[[], Iterable[Any]]
- Returns
callable generating a batch of items (the target is unused).
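For illustration, a minimal sketch of driving the returned callable directly; in normal use PGT.sample() does this internally. The configuration values below are placeholders:

from gt4sd.algorithms.generation.pgt.core import PGT, PGTGenerator

# Placeholder configuration: defaults are task='title-to-abstract' and a dummy input_text.
config = PGTGenerator()
algorithm = PGT(configuration=config)
# The returned callable takes no arguments and yields a batch of generated items.
generate_batch = algorithm.get_generator(configuration=config, target=None)
print(list(generate_batch()))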
- validate_configuration(configuration)[source]¶
Overload to validate a configuration for the algorithm.
- Parameters
configuration (AlgorithmConfiguration[S, None]) – the algorithm configuration.
- Raises
InvalidAlgorithmConfiguration – in case the configuration for the algorithm is invalid.
- Return type
AlgorithmConfiguration[S, None]
- Returns
the validated configuration.
- __abstractmethods__ = frozenset({})¶
- __annotations__ = {'generate': 'Untargeted', 'generator': 'Union[Untargeted, Targeted[T]]', 'max_runtime': 'int', 'max_samples': 'int', 'target': 'Optional[T]'}¶
- __doc__ = 'PGT Algorithm.'¶
- __module__ = 'gt4sd.algorithms.generation.pgt.core'¶
- __orig_bases__ = (gt4sd.algorithms.core.GeneratorAlgorithm[~S, NoneType],)¶
- __parameters__ = (~S,)¶
- class PGTAlgorithmConfiguration(*args, **kwargs)[source]¶
Bases: PGTAlgorithmConfiguration, Generic[T]
Basic configuration for a PGT algorithm.
- algorithm_type: ClassVar[str] = 'generation'¶
General type of generative algorithm.
- domain: ClassVar[str] = 'nlp'¶
General application domain. Hints at input/output types.
- algorithm_version: str = 'v0'¶
To differentiate between different versions of an application.
There is no imposed naming convention.
- model_type: str = ''¶
Type of the model.
- max_length: int = 512¶
Maximum length of the generated text.
- top_k: int = 50¶
Number of top-k probability tokens to keep.
- top_p: float = 1.0¶
Only tokens with cumulative probabilities summing up to this value are kept.
- num_return_sequences: int = 3¶
Number of alternatives to be generated.
- no_repeat_ngram_size: int = 2¶
Size of n-gram to not appear twice.
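Example
These sampling parameters are shared by all PGT applications. A minimal sketch (the input text is a placeholder) of overriding them on a concrete configuration such as PGTGenerator:

from gt4sd.algorithms.generation.pgt.core import PGT, PGTGenerator

# Placeholder input; tighter sampling and a single returned sequence.
config = PGTGenerator(
    task="title-to-abstract",
    input_text="Portable water purification device",
    max_length=256,
    top_k=30,
    top_p=0.95,
    num_return_sequences=1,
)
print(list(PGT(configuration=config).sample(1)))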
- get_target_description()[source]¶
Get description of the target for generation.
- Return type
Optional[Dict[str, str]]
- Returns
target description, returns None in case no target is used.
- get_generator(resources_path, **kwargs)[source]¶
Instantiate the actual PGT implementation.
- Parameters
resources_path (str) – local path to model files.
- Returns
instance with generate_batch method for targeted generation.
- classmethod save_version_from_training_pipeline_arguments_postprocess(training_pipeline_arguments)[source]¶
Postprocess after saving: removes the temporarily converted Hugging Face model if a PyTorch Lightning checkpoint is given.
- Parameters
training_pipeline_arguments (TrainingPipelineArguments) – training pipeline arguments.
- classmethod get_filepath_mappings_for_training_pipeline_arguments(training_pipeline_arguments)[source]¶
Get filepath mappings for the given training pipeline arguments.
- Parameters
training_pipeline_arguments (TrainingPipelineArguments) – training pipeline arguments.
- Return type
Dict[str, str]
- Returns
a mapping between artifacts’ files and training pipeline’s output files.
- __annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': typing.ClassVar[str], 'algorithm_version': <class 'str'>, 'domain': typing.ClassVar[str], 'max_length': <class 'int'>, 'model_type': <class 'str'>, 'no_repeat_ngram_size': <class 'int'>, 'num_return_sequences': <class 'int'>, 'top_k': <class 'int'>, 'top_p': <class 'float'>}¶
- __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶
- __doc__ = 'Basic configuration for a PGT algorithm'¶
- __eq__(other)¶
Return self==value.
- __hash__ = None¶
- __init__(*args, **kwargs)¶
- __match_args__ = ('algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size')¶
- __module__ = 'gt4sd.algorithms.generation.pgt.core'¶
- __orig_bases__ = (<class 'types.PGTAlgorithmConfiguration'>, typing.Generic[~T])¶
- __parameters__ = (~T,)¶
- __repr__()¶
Return repr(self).
- __signature__ = <Signature (algorithm_version: str = 'v0', model_type: str = '', max_length: int = 512, top_k: int = 50, top_p: float = 1.0, num_return_sequences: int = 3, no_repeat_ngram_size: int = 2) -> None>¶
- __wrapped__¶
alias of PGTAlgorithmConfiguration
- class PGTGenerator(*args, **kwargs)[source]¶
Bases: PGTGenerator
Configuration for a PGT Generator algorithm.
- input_text: str = 'This is my input'¶
Input text.
- task: str = 'title-to-abstract'¶
Generation task. Supported: title-to-abstract, abstract-to-claim, claim-to-abstract, abstract-to-title.
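Example
A minimal sketch (the abstract below is a placeholder) of generating candidate titles from an abstract:

from gt4sd.algorithms.generation.pgt.core import PGT, PGTGenerator

# Placeholder abstract; task selects which part of the patent to generate.
config = PGTGenerator(
    task="abstract-to-title",
    input_text="A method and apparatus for filtering airborne particles using electrostatic plates.",
    num_return_sequences=3,
)
print(list(PGT(configuration=config).sample(3)))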
- get_generator(resources_path, **kwargs)[source]¶
Instantiate the actual PGT implementation for part-of-patent generation.
- Parameters
resources_path (str) – local path to model files.
- Returns
instance with generate_batch method for targeted generation.
- __annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': 'ClassVar[str]', 'algorithm_version': 'str', 'domain': 'ClassVar[str]', 'input_text': <class 'str'>, 'max_length': 'int', 'model_type': 'str', 'no_repeat_ngram_size': 'int', 'num_return_sequences': 'int', 'task': <class 'str'>, 'top_k': 'int', 'top_p': 'float'}¶
- __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶
- __doc__ = 'Configuration for a PGT Generator algorithm'¶
- __eq__(other)¶
Return self==value.
- __hash__ = None¶
- __init__(*args, **kwargs)¶
- __match_args__ = ('algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_text', 'task')¶
- __module__ = 'gt4sd.algorithms.generation.pgt.core'¶
- __parameters__ = (~T,)¶
- __repr__()¶
Return repr(self).
- __signature__ = <Signature (*args: Any, algorithm_version: str = 'v0', model_type: str = '', max_length: int = 512, top_k: int = 50, top_p: float = 1.0, num_return_sequences: int = 3, no_repeat_ngram_size: int = 2, input_text: str = 'This is my input', task: str = 'title-to-abstract') -> None>¶
- __wrapped__¶
alias of PGTGenerator
- algorithm_application: ClassVar[str] = 'PGTGenerator'¶
Unique name for the application, i.e. the use of this configuration together with a specific algorithm.
Will be set when registering to ApplicationsRegistry, but can be given by direct registration (see register_algorithm_application).
- algorithm_name: ClassVar[str] = 'PGT'¶
Name of the algorithm to use with this configuration.
Will be set when registering to ApplicationsRegistry.
- class PGTEditor(*args, **kwargs)[source]¶
Bases: PGTEditor
Configuration for a PGT Editor algorithm.
- input_text: str = 'This is my input'¶
Input text.
- input_type: str = 'abstract'¶
Part of a patent the input text belongs to. Supported: abstract, claim.
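Example
A minimal sketch (the claim below is a placeholder) of editing part of a patent, here a claim:

from gt4sd.algorithms.generation.pgt.core import PGT, PGTEditor

# Placeholder claim to be edited; input_type selects the patent part (abstract or claim).
config = PGTEditor(
    input_type="claim",
    input_text="1. A sensor assembly comprising a housing and a temperature probe.",
)
print(list(PGT(configuration=config).sample(1)))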
- get_generator(resources_path, **kwargs)[source]¶
Instantiate the actual PGT implementation for part-of-patent editing.
- Parameters
resources_path (str) – local path to model files.
- Returns
instance with generate_batch method for targeted generation.
- __annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': 'ClassVar[str]', 'algorithm_version': 'str', 'domain': 'ClassVar[str]', 'input_text': <class 'str'>, 'input_type': <class 'str'>, 'max_length': 'int', 'model_type': 'str', 'no_repeat_ngram_size': 'int', 'num_return_sequences': 'int', 'top_k': 'int', 'top_p': 'float'}¶
- __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶
- __doc__ = 'Configuration for a PGT Editor algorithm.'¶
- __eq__(other)¶
Return self==value.
- __hash__ = None¶
- __init__(*args, **kwargs)¶
- __match_args__ = ('algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_text', 'input_type')¶
- __module__ = 'gt4sd.algorithms.generation.pgt.core'¶
- __parameters__ = (~T,)¶
- __repr__()¶
Return repr(self).
- __signature__ = <Signature (*args: Any, algorithm_version: str = 'v0', model_type: str = '', max_length: int = 512, top_k: int = 50, top_p: float = 1.0, num_return_sequences: int = 3, no_repeat_ngram_size: int = 2, input_text: str = 'This is my input', input_type: str = 'abstract') -> None>¶
- algorithm_application: ClassVar[str] = 'PGTEditor'¶
Unique name for the application that is the use of this configuration together with a specific algorithm.
Will be set when registering to ApplicationsRegistry, but can be given by direct registration (see register_algorithm_application).
- algorithm_name: ClassVar[str] = 'PGT'¶
Name of the algorithm to use with this configuration.
Will be set when registering to ApplicationsRegistry.
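Example
A minimal usage sketch for the editor configuration; the input text below is a placeholder and the call assumes model resources for the default algorithm_version can be resolved locally or downloaded:
config = PGTEditor(input_type="abstract", input_text="An illustrative patent abstract to edit.")
generator = PGT(configuration=config)
print(list(generator.sample(1)))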
- class PGTCoherenceChecker(*args, **kwargs)[source]¶
Bases: PGTCoherenceChecker
Configuration for a PGT coherence check algorithm
- num_return_sequences: int = 1¶
Number of alternatives; should always be 1 for the coherence check.
- input_a: str = "I'm a stochastic parrot."¶
First input for coherence check.
- input_b: str = "I'm a stochastic parrot."¶
Second input for coherence check.
- coherence_type: str = 'title-abstract'¶
Input types for the check. Supported: title-abstract, abstract-claim, title-claim.
- get_generator(resources_path, **kwargs)[source]¶
Instantiate the actual PGT implementation for patent coherence check.
- Parameters
resources_path (str) – local path to model files.
- Return type
- Returns
- instance with generate_batch method for targeted generation.
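Example
A minimal coherence-check sketch; the inputs are placeholders and the supported coherence_type values are title-abstract, abstract-claim and title-claim:
config = PGTCoherenceChecker(coherence_type="title-abstract", input_a="A patent title", input_b="A patent abstract")
generator = PGT(configuration=config)
print(list(generator.sample(1)))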
- __annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': 'ClassVar[str]', 'algorithm_version': 'str', 'coherence_type': <class 'str'>, 'domain': 'ClassVar[str]', 'input_a': <class 'str'>, 'input_b': <class 'str'>, 'max_length': 'int', 'model_type': 'str', 'no_repeat_ngram_size': 'int', 'num_return_sequences': <class 'int'>, 'top_k': 'int', 'top_p': 'float'}¶
- __dataclass_fields__ = {'algorithm_application': Field(...), 'algorithm_name': Field(...), 'algorithm_type': Field(...), 'algorithm_version': Field(...), 'coherence_type': Field(...), 'domain': Field(...), 'input_a': Field(...), 'input_b': Field(...), 'max_length': Field(...), 'model_type': Field(...), 'no_repeat_ngram_size': Field(...), 'num_return_sequences': Field(...), 'top_k': Field(...), 'top_p': Field(...)}¶
- __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶
- __doc__ = 'Configuration for a PGT coherence check algorithm'¶
- __eq__(other)¶
Return self==value.
- __hash__ = None¶
- __init__(*args, **kwargs)¶
- __match_args__ = ('algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_a', 'input_b', 'coherence_type')¶
- __module__ = 'gt4sd.algorithms.generation.pgt.core'¶
- __parameters__ = (~T,)¶
- __pydantic_complete__ = True¶
- __pydantic_config__ = {}¶
- __pydantic_core_schema__ = {'cls': <class 'gt4sd.algorithms.generation.pgt.core.PGTCoherenceChecker'>, 'config': {'title': 'PGTCoherenceChecker'}, 'fields': ['algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_a', 'input_b', 'coherence_type'], 'frozen': False, 'post_init': False, 'schema': {...}, 'type': 'dataclass'}¶
- __pydantic_decorators__ = DecoratorInfos(validators={}, field_validators={}, root_validators={}, field_serializers={}, model_serializers={}, model_validators={}, computed_fields={})¶
- __pydantic_fields__ = {'algorithm_version': FieldInfo(annotation=str, required=False, default='v0', init=True, init_var=False, kw_only=False), 'coherence_type': FieldInfo(annotation=str, required=False, default='title-abstract', description='Input types for the check. Supported: title-abstract, abstract-claim, title-claim', init=True, init_var=False, kw_only=False), 'input_a': FieldInfo(annotation=str, required=False, default="I'm a stochastic parrot.", description='First input for coherence check.', init=True, init_var=False, kw_only=False), 'input_b': FieldInfo(annotation=str, required=False, default="I'm a stochastic parrot.", description='Second input for coherence check.', init=True, init_var=False, kw_only=False), 'max_length': FieldInfo(annotation=int, required=False, default=512, description='Maximum length of the generated text.', init=True, init_var=False, kw_only=False), 'model_type': FieldInfo(annotation=str, required=False, default='', description='Type of the model.', init=True, init_var=False, kw_only=False), 'no_repeat_ngram_size': FieldInfo(annotation=int, required=False, default=2, description='Size of n-gram to not appear twice.', init=True, init_var=False, kw_only=False), 'num_return_sequences': FieldInfo(annotation=int, required=False, default=1, description='Number of alternatives should be always 1 for coherence check.', init=True, init_var=False, kw_only=False), 'top_k': FieldInfo(annotation=int, required=False, default=50, description='Number of top-k probability tokens to keep.', init=True, init_var=False, kw_only=False), 'top_p': FieldInfo(annotation=float, required=False, default=1.0, description='Only tokens with cumulative probabilities summing up to this value are kept.', init=True, init_var=False, kw_only=False)}¶
- __pydantic_serializer__ = SchemaSerializer(serializer=Dataclass(...), definitions=[])¶
- __pydantic_validator__ = SchemaValidator(title="PGTCoherenceChecker", validator=Dataclass( DataclassValidator { strict: false, validator: DataclassArgs( DataclassArgsValidator { fields: [ Field { kw_only: false, name: "algorithm_version", py_name: Py( 0x00007f9e9963bc80, ), init: true, init_only: false, lookup_key: Simple { key: "algorithm_version", py_key: Py( 0x00007f9dbdfd3e60, ), path: LookupPath( [ S( "algorithm_version", Py( 0x00007f9dbdfd1a70, ), ), ], ), }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f9e996507f0, ), ), on_error: Raise, validator: Str( StrValidator { strict: false, coerce_numbers_to_str: false, }, ), validate_default: false, copy_default: false, name: "default[str]", undefined: Py( 0x00007f9e9b5139a0, ), }, ), frozen: false, }, Field { kw_only: false, name: "model_type", py_name: Py( 0x00007f9e9a4b3ab0, ), init: true, init_only: false, lookup_key: Simple { key: "model_type", py_key: Py( 0x00007f9db7ca40f0, ), path: LookupPath( [ S( "model_type", Py( 0x00007f9db7ca40b0, ), ), ], ), }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f9e9d508030, ), ), on_error: Raise, validator: Str( StrValidator { strict: false, coerce_numbers_to_str: false, }, ), validate_default: false, copy_default: false, name: "default[str]", undefined: Py( 0x00007f9e9b5139a0, ), }, ), frozen: false, }, Field { kw_only: false, name: "max_length", py_name: Py( 0x00007f9e9a798ab0, ), init: true, init_only: false, lookup_key: Simple { key: "max_length", py_key: Py( 0x00007f9db7ca4070, ), path: LookupPath( [ S( "max_length", Py( 0x00007f9db7ca4030, ), ), ], ), }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f9db7c48450, ), ), on_error: Raise, validator: Int( IntValidator { strict: false, }, ), validate_default: false, copy_default: false, name: "default[int]", undefined: Py( 0x00007f9e9b5139a0, ), }, ), frozen: false, }, Field { kw_only: false, name: "top_k", py_name: Py( 0x00007f9e6fe3e7b0, ), init: true, init_only: false, lookup_key: Simple { key: "top_k", py_key: Py( 0x00007f9db7ca4130, ), path: LookupPath( [ S( "top_k", Py( 0x00007f9db7ca4170, ), ), ], ), }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f9e9d500710, ), ), on_error: Raise, validator: Int( IntValidator { strict: false, }, ), validate_default: false, copy_default: false, name: "default[int]", undefined: Py( 0x00007f9e9b5139a0, ), }, ), frozen: false, }, Field { kw_only: false, name: "top_p", py_name: Py( 0x00007f9dd967e430, ), init: true, init_only: false, lookup_key: Simple { key: "top_p", py_key: Py( 0x00007f9db7ca41b0, ), path: LookupPath( [ S( "top_p", Py( 0x00007f9db7ca41f0, ), ), ], ), }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f9db7fbf5b0, ), ), on_error: Raise, validator: Float( FloatValidator { strict: false, allow_inf_nan: true, }, ), validate_default: false, copy_default: false, name: "default[float]", undefined: Py( 0x00007f9e9b5139a0, ), }, ), frozen: false, }, Field { kw_only: false, name: "num_return_sequences", py_name: Py( 0x00007f9dd9060120, ), init: true, init_only: false, lookup_key: Simple { key: "num_return_sequences", py_key: Py( 0x00007f9dbdfd3e10, ), path: LookupPath( [ S( "num_return_sequences", Py( 0x00007f9dbdfd3cd0, ), ), ], ), }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f9e9d5000f0, ), ), on_error: Raise, validator: Int( IntValidator { strict: false, }, ), validate_default: false, copy_default: false, name: 
"default[int]", undefined: Py( 0x00007f9e9b5139a0, ), }, ), frozen: false, }, Field { kw_only: false, name: "no_repeat_ngram_size", py_name: Py( 0x00007f9dd9060080, ), init: true, init_only: false, lookup_key: Simple { key: "no_repeat_ngram_size", py_key: Py( 0x00007f9dbdfd1b10, ), path: LookupPath( [ S( "no_repeat_ngram_size", Py( 0x00007f9dbdfd1ac0, ), ), ], ), }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f9e9d500110, ), ), on_error: Raise, validator: Int( IntValidator { strict: false, }, ), validate_default: false, copy_default: false, name: "default[int]", undefined: Py( 0x00007f9e9b5139a0, ), }, ), frozen: false, }, Field { kw_only: false, name: "input_a", py_name: Py( 0x00007f9e6ffc25b0, ), init: true, init_only: false, lookup_key: Simple { key: "input_a", py_key: Py( 0x00007f9db7ca4230, ), path: LookupPath( [ S( "input_a", Py( 0x00007f9db7ca4270, ), ), ], ), }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f9dbe102420, ), ), on_error: Raise, validator: Str( StrValidator { strict: false, coerce_numbers_to_str: false, }, ), validate_default: false, copy_default: false, name: "default[str]", undefined: Py( 0x00007f9e9b5139a0, ), }, ), frozen: false, }, Field { kw_only: false, name: "input_b", py_name: Py( 0x00007f9e6ffc25f0, ), init: true, init_only: false, lookup_key: Simple { key: "input_b", py_key: Py( 0x00007f9db7ca42b0, ), path: LookupPath( [ S( "input_b", Py( 0x00007f9db7ca42f0, ), ), ], ), }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f9dbe102420, ), ), on_error: Raise, validator: Str( StrValidator { strict: false, coerce_numbers_to_str: false, }, ), validate_default: false, copy_default: false, name: "default[str]", undefined: Py( 0x00007f9e9b5139a0, ), }, ), frozen: false, }, Field { kw_only: false, name: "coherence_type", py_name: Py( 0x00007f9db7c64df0, ), init: true, init_only: false, lookup_key: Simple { key: "coherence_type", py_key: Py( 0x00007f9db7ca4330, ), path: LookupPath( [ S( "coherence_type", Py( 0x00007f9db7ca4370, ), ), ], ), }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f9db7c64db0, ), ), on_error: Raise, validator: Str( StrValidator { strict: false, coerce_numbers_to_str: false, }, ), validate_default: false, copy_default: false, name: "default[str]", undefined: Py( 0x00007f9e9b5139a0, ), }, ), frozen: false, }, ], positional_count: 10, init_only_count: None, dataclass_name: "PGTCoherenceChecker", validator_name: "dataclass-args[PGTCoherenceChecker]", extra_behavior: Ignore, extras_validator: None, loc_by_alias: true, }, ), class: Py( 0x000055e1b8798da0, ), generic_origin: None, fields: [ Py( 0x00007f9e9963bc80, ), Py( 0x00007f9e9a4b3ab0, ), Py( 0x00007f9e9a798ab0, ), Py( 0x00007f9e6fe3e7b0, ), Py( 0x00007f9dd967e430, ), Py( 0x00007f9dd9060120, ), Py( 0x00007f9dd9060080, ), Py( 0x00007f9e6ffc25b0, ), Py( 0x00007f9e6ffc25f0, ), Py( 0x00007f9db7c64df0, ), ], post_init: None, revalidate: Never, name: "PGTCoherenceChecker", frozen: false, slots: true, }, ), definitions=[], cache_strings=True)¶
- __repr__()¶
Return repr(self).
- __signature__ = <Signature (*args: Any, algorithm_version: str = 'v0', model_type: str = '', max_length: int = 512, top_k: int = 50, top_p: float = 1.0, num_return_sequences: int = 1, no_repeat_ngram_size: int = 2, input_a: str = "I'm a stochastic parrot.", input_b: str = "I'm a stochastic parrot.", coherence_type: str = 'title-abstract') -> None>¶
- __wrapped__¶
alias of PGTCoherenceChecker
- algorithm_application: ClassVar[str] = 'PGTCoherenceChecker'¶
Unique name for the application that is the use of this configuration together with a specific algorithm.
Will be set when registering to ApplicationsRegistry, but can be given by direct registration (see register_algorithm_application).
- algorithm_name: ClassVar[str] = 'PGT'¶
Name of the algorithm to use with this configuration.
Will be set when registering to ApplicationsRegistry.