gt4sd.algorithms.generation.pgt.core module¶

Patent Generative Transformer (PGT) generation algorithm.

Summary¶

Classes:

`PGT`	PGT Algorithm.
`PGTAlgorithmConfiguration`	Basic configuration for a PGT algorithm
`PGTCoherenceChecker`	Configuration for a PGT coherence check algorithm
`PGTEditor`	Configuration for a PGT Editor algorithm.
`PGTGenerator`	Configuration for a PGT Generator algorithm

Reference¶

class PGT(configuration, target=None)[source]¶

Bases: GeneratorAlgorithm[S, None]

PGT Algorithm.

__init__(configuration, target=None)[source]¶

Instantiate PGT ready to generate items.

Parameters

configuration (AlgorithmConfiguration[~S, None]) – domain and application specification defining parameters, types and validations.
target (None) – unused since it is not a conditional generator.

Example

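An example for generating an abstract from a given claim:

    from gt4sd.algorithms.generation.pgt.core import PGT, PGTGenerator

    # task must be one of the supported task names, which are hyphenated
    config = PGTGenerator(task="claim-to-abstract", input_text="My interesting claim")
    generator = PGT(configuration=config)
    print(list(generator.sample(1)))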

get_generator(configuration, target)[source]¶

Get the function to sample with the given configuration.

Parameters

configuration (AlgorithmConfiguration[~S, None]) – helps to set up specific application of PGT.
target (None) – context or condition for the generation. Unused in the algorithm.

Return type

Callable[[], Iterable[Any]]

Returns

callable with target generating a batch of items.
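
The contract described here, a zero-argument callable that produces a batch of items on each call, can be sketched as follows. This is only an illustration under stated assumptions: `ToyImplementation`, `get_sampler`, and `take` are hypothetical stand-ins defined in the snippet and are not part of gt4sd. They show the shape `Callable[[], Iterable[Any]]` without loading a model.

```python
from typing import Any, Callable, Iterable, Iterator, List


class ToyImplementation:
    """Hypothetical stand-in for a model wrapper; not a gt4sd class."""

    def __init__(self, input_text: str, num_return_sequences: int = 3) -> None:
        self.input_text = input_text
        self.num_return_sequences = num_return_sequences

    def generate_batch(self) -> List[str]:
        # A real implementation would run the language model here.
        return [
            f"{self.input_text} [variant {index}]"
            for index in range(self.num_return_sequences)
        ]


def get_sampler(implementation: ToyImplementation) -> Callable[[], Iterable[Any]]:
    """Return a zero-argument callable; no target is involved (untargeted)."""
    return implementation.generate_batch


def take(generate: Callable[[], Iterable[Any]], number_of_items: int) -> Iterator[Any]:
    """Call the batch function repeatedly until enough items were yielded."""
    produced = 0
    while produced < number_of_items:
        batch = list(generate())
        if not batch:  # avoid looping forever on an empty batch
            return
        for item in batch:
            if produced >= number_of_items:
                return
            yield item
            produced += 1


sampler = get_sampler(ToyImplementation("My interesting claim"))
items = list(take(sampler, 4))
```

Returning a bound method keeps the sampling loop independent of how a batch is produced: the caller only needs something it can call with no arguments.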

validate_configuration(configuration)[source]¶

Overload to validate a configuration for the algorithm.

Parameters: configuration (AlgorithmConfiguration[~S, None]) – the algorithm configuration.
Raises: InvalidAlgorithmConfiguration – in case the configuration for the algorithm is invalid.
Return type: AlgorithmConfiguration[~S, None]
Returns: the validated configuration.
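
The validate-then-return contract can be sketched with stand-in types. Everything in this snippet is an assumption made for illustration: `ToyConfiguration` and `InvalidToyConfiguration` are hypothetical local classes, not the gt4sd ones, and the specific checks (a known task name, `top_p` in (0, 1]) are examples rather than the checks the library performs.

```python
from dataclasses import dataclass


class InvalidToyConfiguration(Exception):
    """Hypothetical stand-in for InvalidAlgorithmConfiguration."""


# Task names as listed in the PGTGenerator field metadata on this page.
SUPPORTED_TASKS = (
    "title-to-abstract",
    "abstract-to-claim",
    "claim-to-abstract",
    "abstract-to-title",
)


@dataclass
class ToyConfiguration:
    """Hypothetical configuration with two of the documented fields."""

    task: str = "title-to-abstract"
    top_p: float = 1.0


def validate_configuration(configuration: object) -> ToyConfiguration:
    """Return the configuration unchanged if valid, otherwise raise."""
    if not isinstance(configuration, ToyConfiguration):
        raise InvalidToyConfiguration(
            f"unexpected configuration type: {type(configuration).__name__}"
        )
    if configuration.task not in SUPPORTED_TASKS:
        raise InvalidToyConfiguration(f"unsupported task: {configuration.task!r}")
    if not 0.0 < configuration.top_p <= 1.0:
        raise InvalidToyConfiguration(f"top_p out of range: {configuration.top_p}")
    return configuration
```

Returning the validated object lets the caller write `configuration = validate_configuration(configuration)` and continue with a value known to be well formed.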

__abstractmethods__ = frozenset({})¶

__annotations__ = {'generate': 'Untargeted', 'generator': 'Union[Untargeted, Targeted[T]]', 'max_runtime': 'int', 'max_samples': 'int', 'target': 'Optional[T]'}¶

__doc__ = 'PGT Algorithm.'¶

__module__ = 'gt4sd.algorithms.generation.pgt.core'¶

__orig_bases__ = (gt4sd.algorithms.core.GeneratorAlgorithm[~S, NoneType],)¶

__parameters__ = (~S,)¶

_abc_impl = <_abc._abc_data object>¶

class PGTAlgorithmConfiguration(*args, **kwargs)[source]¶

Bases: PGTAlgorithmConfiguration, Generic[T]

Basic configuration for a PGT algorithm

algorithm_type: ClassVar[str] = 'generation'¶: General type of generative algorithm.

domain: ClassVar[str] = 'nlp'¶: General application domain. Hints at input/output types.

algorithm_version: str = 'v0'¶

To differentiate between different versions of an application.

There is no imposed naming convention.

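model_type: str = ''¶

Type of the model.

max_length: int = 512¶

Maximum length of the generated text.

top_k: int = 50¶

Number of top-k probability tokens to keep.

top_p: float = 1.0¶

Only tokens with cumulative probabilities summing up to this value are kept.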
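
The `top_k` and `top_p` fields restrict which candidate tokens remain eligible at each sampling step. The sketch below shows one common reading of these two filters on a toy next-token distribution. It is an assumption about the semantics, not the library's code: `renormalize`, `top_k_filter`, and `top_p_filter` are hypothetical helpers, and the token probabilities are made up.

```python
from typing import Dict


def renormalize(probs: Dict[str, float]) -> Dict[str, float]:
    """Rescale the kept probabilities so that they sum to 1."""
    total = sum(probs.values())
    return {token: p / total for token, p in probs.items()}


def top_k_filter(probs: Dict[str, float], top_k: int) -> Dict[str, float]:
    """Keep only the top_k most probable tokens (top_k <= 0 keeps all)."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = ranked[:top_k] if top_k > 0 else ranked
    return renormalize(dict(kept))


def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Keep the smallest set of top tokens whose cumulative mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept: Dict[str, float] = {}
    cumulative = 0.0
    for token, p in ranked:
        kept[token] = p
        cumulative += p
        if cumulative >= top_p:
            break
    return renormalize(kept)


# Toy distribution over four candidate next tokens.
toy = {"the": 0.5, "a": 0.25, "method": 0.125, "device": 0.125}
after_top_k = top_k_filter(toy, top_k=2)
after_top_p = top_p_filter(toy, top_p=0.75)
unfiltered = top_p_filter(toy, top_p=1.0)
```

With the default `top_p = 1.0` the nucleus filter keeps every token, so under this reading only `top_k` narrows the candidates unless `top_p` is lowered.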

num_return_sequences: int = 3¶

Number of alternatives to be generated.

no_repeat_ngram_size: int = 2¶

Size of n-gram to not appear twice.
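
The `no_repeat_ngram_size` constraint can be pictured as banning, at each step, every token that would complete an n-gram already present in the generated sequence. The helper below is a hypothetical illustration of that idea on plain word tokens and is not taken from the library.

```python
from typing import List, Set


def banned_next_tokens(tokens: List[str], ngram_size: int) -> Set[str]:
    """Return the tokens that would repeat an existing n-gram if emitted next."""
    if ngram_size <= 0 or len(tokens) + 1 < ngram_size:
        return set()
    # The last (ngram_size - 1) tokens form the prefix of the next n-gram.
    prefix = tuple(tokens[len(tokens) - ngram_size + 1:])
    banned: Set[str] = set()
    for start in range(len(tokens) - ngram_size + 1):
        ngram = tuple(tokens[start:start + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


generated = ["a", "method", "for", "a"]
banned_bigrams = banned_next_tokens(generated, ngram_size=2)
banned_trigrams = banned_next_tokens(generated, ngram_size=3)
```

With the default value of 2, the sequence above already contains the bigram "a method", so "method" may not follow the final "a". With a size of 3 nothing is banned yet, because no trigram starting with "for a" has occurred.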

get_target_description()[source]¶

Get description of the target for generation.

Return type: Optional[Dict[str, str]]
Returns: target description, returns None in case no target is used.

get_generator(resources_path, **kwargs)[source]¶

Instantiate the actual PGT implementation.

Parameters

resources_path (str) – local path to model files.

Return type

Generator

Returns

instance with generate_batch method for targeted generation.

classmethod save_version_from_training_pipeline_arguments_postprocess()[source]¶

Postprocess after saving: remove the temporarily converted Hugging Face model if a PyTorch Lightning checkpoint was given.

Parameters: training_pipeline_arguments (TrainingPipelineArguments) – training pipeline arguments.

classmethod get_filepath_mappings_for_training_pipeline_arguments()[source]¶

Get filepath mappings for the given training pipeline arguments.

Parameters: training_pipeline_arguments (TrainingPipelineArguments) – training pipeline arguments.
Return type: Dict[str, str]
Returns: a mapping between artifacts’ files and training pipeline’s output files.

__annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': typing.ClassVar[str], 'algorithm_version': <class 'str'>, 'domain': typing.ClassVar[str], 'max_length': <class 'int'>, 'model_type': <class 'str'>, 'no_repeat_ngram_size': <class 'int'>, 'num_return_sequences': <class 'int'>, 'top_k': <class 'int'>, 'top_p': <class 'float'>}¶

__dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶

__doc__ = 'Basic configuration for a PGT algorithm'¶

__eq__(other)¶: Return self==value.

__hash__ = None¶

__init__(*args, **kwargs)¶

__is_pydantic_dataclass__ = True¶

__match_args__ = ('algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size')¶

__module__ = 'gt4sd.algorithms.generation.pgt.core'¶

__orig_bases__ = (<class 'types.PGTAlgorithmConfiguration'>, typing.Generic[~T])¶

__parameters__ = (~T,)¶

__pydantic_complete__ = True¶

__pydantic_config__ = {}¶

__pydantic_decorators__ = DecoratorInfos(validators={}, field_validators={}, root_validators={}, field_serializers={}, model_serializers={}, model_validators={}, computed_fields={})¶

__pydantic_fields__ = {'algorithm_version': FieldInfo(annotation=str, required=False, default='v0', init=True, init_var=False, kw_only=False), 'max_length': FieldInfo(annotation=int, required=False, default=512, description='Maximum length of the generated text.', init=True, init_var=False, kw_only=False), 'model_type': FieldInfo(annotation=str, required=False, default='', description='Type of the model.', init=True, init_var=False, kw_only=False), 'no_repeat_ngram_size': FieldInfo(annotation=int, required=False, default=2, description='Size of n-gram to not appear twice.', init=True, init_var=False, kw_only=False), 'num_return_sequences': FieldInfo(annotation=int, required=False, default=3, description='Number of alternatives to be generated.', init=True, init_var=False, kw_only=False), 'top_k': FieldInfo(annotation=int, required=False, default=50, description='Number of top-k probability tokens to keep.', init=True, init_var=False, kw_only=False), 'top_p': FieldInfo(annotation=float, required=False, default=1.0, description='Only tokens with cumulative probabilities summing up to this value are kept.', init=True, init_var=False, kw_only=False)}¶

classmethod __pydantic_fields_complete__()¶

Return whether the fields were successfully collected (i.e. type hints were successfully resolved).

This is a private property, not meant to be used outside Pydantic.

Return type: bool

__repr__()¶: Return repr(self).

__signature__ = <Signature (algorithm_version: str = 'v0', model_type: str = '', max_length: int = 512, top_k: int = 50, top_p: float = 1.0, num_return_sequences: int = 3, no_repeat_ngram_size: int = 2) -> None>¶

__wrapped__¶: alias of PGTAlgorithmConfiguration

class PGTGenerator(*args, **kwargs)[source]¶

Bases: PGTGenerator

Configuration for a PGT Generator algorithm

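input_text: str = 'This is my input'¶

Input text.

task: str = 'title-to-abstract'¶

Generation task. Supported: title-to-abstract, abstract-to-claim, claim-to-abstract, abstract-to-title.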

get_generator(resources_path, **kwargs)[source]¶

Instantiate the actual PGT implementation for part of patent generation.

Parameters

resources_path (str) – local path to model files.

Return type

Generator

Returns

instance with generate_batch method for targeted generation.

__annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': 'ClassVar[str]', 'algorithm_version': 'str', 'domain': 'ClassVar[str]', 'input_text': <class 'str'>, 'max_length': 'int', 'model_type': 'str', 'no_repeat_ngram_size': 'int', 'num_return_sequences': 'int', 'task': <class 'str'>, 'top_k': 'int', 'top_p': 'float'}¶

__dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶

__doc__ = 'Configuration for a PGT Generator algorithm'¶

__eq__(other)¶: Return self==value.

__hash__ = None¶

__init__(*args, **kwargs)¶

__is_pydantic_dataclass__ = True¶

__match_args__ = ('algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_text', 'task')¶

__module__ = 'gt4sd.algorithms.generation.pgt.core'¶

__parameters__ = (~T,)¶

__pydantic_complete__ = True¶

__pydantic_config__ = {}¶

__pydantic_decorators__ = DecoratorInfos(validators={}, field_validators={}, root_validators={}, field_serializers={}, model_serializers={}, model_validators={}, computed_fields={})¶

__pydantic_fields__ = {'algorithm_version': FieldInfo(annotation=str, required=False, default='v0', init=True, init_var=False, kw_only=False), 'input_text': FieldInfo(annotation=str, required=False, default='This is my input', description='Input text.', init=True, init_var=False, kw_only=False), 'max_length': FieldInfo(annotation=int, required=False, default=512, description='Maximum length of the generated text.', init=True, init_var=False, kw_only=False), 'model_type': FieldInfo(annotation=str, required=False, default='', description='Type of the model.', init=True, init_var=False, kw_only=False), 'no_repeat_ngram_size': FieldInfo(annotation=int, required=False, default=2, description='Size of n-gram to not appear twice.', init=True, init_var=False, kw_only=False), 'num_return_sequences': FieldInfo(annotation=int, required=False, default=3, description='Number of alternatives to be generated.', init=True, init_var=False, kw_only=False), 'task': FieldInfo(annotation=str, required=False, default='title-to-abstract', description='Generation tasks. Supported: title-to-abstract, abstract-to-claim, claim-to-abstract, abstract-to-title', init=True, init_var=False, kw_only=False), 'top_k': FieldInfo(annotation=int, required=False, default=50, description='Number of top-k probability tokens to keep.', init=True, init_var=False, kw_only=False), 'top_p': FieldInfo(annotation=float, required=False, default=1.0, description='Only tokens with cumulative probabilities summing up to this value are kept.', init=True, init_var=False, kw_only=False)}¶

classmethod __pydantic_fields_complete__()¶

Return whether the fields were successfully collected (i.e. type hints were successfully resolved).

This is a private property, not meant to be used outside Pydantic.

Return type: bool

__pydantic_validator__¶: pydantic-core SchemaValidator titled "PGTGenerator" (validator name dataclass-args[PGTGenerator]). Validation is non-strict, extra arguments are ignored, numbers are not coerced to strings, and the dataclass is not frozen and uses slots. It validates nine positional fields, each falling back to its default: algorithm_version (str), model_type (str), max_length (int), top_k (int), top_p (float, inf and NaN allowed), num_return_sequences (int), no_repeat_ngram_size (int), input_text (str), task (str).

__repr__()¶: Return repr(self).

__signature__ = <Signature (*args: Any, algorithm_version: str = 'v0', model_type: str = '', max_length: int = 512, top_k: int = 50, top_p: float = 1.0, num_return_sequences: int = 3, no_repeat_ngram_size: int = 2, input_text: str = 'This is my input', task: str = 'title-to-abstract') -> None>¶
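The top_k and top_p defaults in the signature above control which candidate tokens survive at each generation step. The following is a minimal, illustrative sketch of that filtering in plain Python; it is not taken from the gt4sd source, and the function name filter_top_k_top_p and the toy probabilities are hypothetical.

```python
from typing import Dict


def filter_top_k_top_p(
    probs: Dict[str, float], top_k: int = 50, top_p: float = 1.0
) -> Dict[str, float]:
    """Illustrative top-k / top-p filter over a token-probability mapping."""
    # Rank tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)

    # top-k: keep at most top_k candidates (a non-positive value disables the cut).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Renormalise so the surviving probabilities sum to 1 before the top-p cut.
    total = sum(p for _, p in ranked)
    ranked = [(token, p / total) for token, p in ranked]

    # top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalise the final candidate set.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Hypothetical next-token distribution.
toy = {"device": 0.5, "method": 0.3, "system": 0.15, "apparatus": 0.05}
print(filter_top_k_top_p(toy, top_k=2))
print(filter_top_k_top_p(toy, top_p=0.7))
```

With the documented defaults (top_k=50, top_p=1.0) a small vocabulary like this one passes through unchanged; lowering either value narrows the candidate set.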

__wrapped__¶: alias of PGTGenerator

algorithm_application: ClassVar[str] = 'PGTGenerator'¶

Unique name for the application, that is, the use of this configuration together with a specific algorithm.

It is set when registering with ApplicationsRegistry, but can also be given by direct registration (see register_algorithm_application).

algorithm_name: ClassVar[str] = 'PGT'¶

Name of the algorithm to use with this configuration.

It is set when registering with ApplicationsRegistry.

class PGTEditor(*args, **kwargs)[source]¶

Bases: PGTEditor

Configuration for a PGT Editor algorithm.

input_text: str = 'This is my input'¶

input_type: str = 'abstract'¶

get_generator(resources_path, **kwargs)[source]¶

Instantiate the actual PGT implementation for editing part of a patent.

Parameters

resources_path (str) – local path to model files.

Return type

Generator

Returns

instance with a generate_batch method for targeted generation.
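A hedged usage sketch for PGTEditor follows. It assumes that PGTEditor plugs into PGT the same way PGTGenerator does in the module-level example (PGT(configuration=...) followed by sample(1)); that pattern is not shown for PGTEditor in this reference. The helper names editor_kwargs and run_editor are hypothetical, and the input_type check is a local convenience based on the field description ("Supported: abstract, claim"), not library behaviour.

```python
from typing import Any, Dict, List

# Values listed in the input_type field description.
SUPPORTED_INPUT_TYPES = ("abstract", "claim")


def editor_kwargs(
    input_text: str, input_type: str = "abstract", num_return_sequences: int = 3
) -> Dict[str, Any]:
    """Build keyword arguments for PGTEditor, rejecting unsupported input types."""
    if input_type not in SUPPORTED_INPUT_TYPES:
        raise ValueError(
            f"input_type must be one of {SUPPORTED_INPUT_TYPES}, got {input_type!r}"
        )
    return {
        "input_text": input_text,
        "input_type": input_type,
        "num_return_sequences": num_return_sequences,
    }


def run_editor(kwargs: Dict[str, Any]) -> List[Any]:
    """Run the editor. Requires gt4sd to be installed and model files available."""
    # Imported lazily so the helper above stays usable without gt4sd.
    from gt4sd.algorithms.generation.pgt.core import PGT, PGTEditor

    config = PGTEditor(**kwargs)
    generator = PGT(configuration=config)  # assumed to mirror the PGTGenerator example
    return list(generator.sample(1))


kwargs = editor_kwargs("A method for cooling a battery pack.", input_type="claim")
print(kwargs)
# run_editor(kwargs)  # uncomment in an environment with gt4sd installed
```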

__annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': 'ClassVar[str]', 'algorithm_version': 'str', 'domain': 'ClassVar[str]', 'input_text': <class 'str'>, 'input_type': <class 'str'>, 'max_length': 'int', 'model_type': 'str', 'no_repeat_ngram_size': 'int', 'num_return_sequences': 'int', 'top_k': 'int', 'top_p': 'float'}¶

__dataclass_fields__¶: dataclass Field definitions. Every entry has init=True, repr=True, hash=None, compare=True and no default_factory. ClassVar entries are of kind _FIELD_CLASSVAR with empty metadata; the others are regular fields (_FIELD, kw_only=False) whose metadata description, where present, follows the default.

algorithm_application (ClassVar[str]) – default 'PGTEditor'.
algorithm_name (ClassVar[str]) – default 'PGT'.
algorithm_type (ClassVar[str]) – default 'generation'.
algorithm_version (str) – default 'v0'; empty metadata.
domain (ClassVar[str]) – default 'nlp'.
input_text (str) – default 'This is my input'; description 'Input text.'
input_type (str) – default 'abstract'; description 'Part of a patent the input text belongs. Supported: abstract, claim'
max_length (int) – default 512; description 'Maximum length of the generated text.'
model_type (str) – default ''; description 'Type of the model.'
no_repeat_ngram_size (int) – default 2; description 'Size of n-gram to not appear twice.'
num_return_sequences (int) – default 3; description 'Number of alternatives to be generated.'
top_k (int) – default 50; description 'Number of top-k probability tokens to keep.'
top_p (float) – default 1.0; description 'Only tokens with cumulative probabilities summing up to this value are kept.'

__dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶

__doc__ = 'Configuration for a PGT Editor algorithm.'¶

__eq__(other)¶: Return self==value.

__hash__ = None¶

__init__(*args, **kwargs)¶

__is_pydantic_dataclass__ = True¶

__match_args__ = ('algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_text', 'input_type')¶

__module__ = 'gt4sd.algorithms.generation.pgt.core'¶

__parameters__ = (~T,)¶

__pydantic_complete__ = True¶

__pydantic_config__ = {}¶

__pydantic_core_schema__ = {'cls': <class 'gt4sd.algorithms.generation.pgt.core.PGTEditor'>, 'config': {'title': 'PGTEditor'}, 'fields': ['algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_text', 'input_type'], 'frozen': False, 'post_init': False, 'ref': 'types.PGTEditor:93913123447408', 'schema': {'collect_init_only': False, 'computed_fields': [], 'dataclass_name': 'PGTEditor', 'fields': [{'type': 'dataclass-field', 'name': 'algorithm_version', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'v0'}, 'kw_only': False, 'init': True, 'metadata': {}}, {'type': 'dataclass-field', 'name': 'model_type', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': ''}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Type of the model.'}}}, {'type': 'dataclass-field', 'name': 'max_length', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 512}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Maximum length of the generated text.'}}}, {'type': 'dataclass-field', 'name': 'top_k', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 50}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Number of top-k probability tokens to keep.'}}}, {'type': 'dataclass-field', 'name': 'top_p', 'schema': {'type': 'default', 'schema': {'type': 'float'}, 'default': 1.0}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Only tokens with cumulative probabilities summing up to this value are kept.'}}}, {'type': 'dataclass-field', 'name': 'num_return_sequences', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 3}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Number of alternatives to be generated.'}}}, {'type': 'dataclass-field', 'name': 'no_repeat_ngram_size', 
'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 2}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Size of n-gram to not appear twice.'}}}, {'type': 'dataclass-field', 'name': 'input_text', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'This is my input'}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Input text.'}}}, {'type': 'dataclass-field', 'name': 'input_type', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'abstract'}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Part of a patent the input text belongs. Supported: abstract, claim'}}}], 'type': 'dataclass-args'}, 'slots': True, 'type': 'dataclass'}¶

__pydantic_decorators__ = DecoratorInfos(validators={}, field_validators={}, root_validators={}, field_serializers={}, model_serializers={}, model_validators={}, computed_fields={})¶

__pydantic_fields__ = {'algorithm_version': FieldInfo(annotation=str, required=False, default='v0', init=True, init_var=False, kw_only=False), 'input_text': FieldInfo(annotation=str, required=False, default='This is my input', description='Input text.', init=True, init_var=False, kw_only=False), 'input_type': FieldInfo(annotation=str, required=False, default='abstract', description='Part of a patent the input text belongs. Supported: abstract, claim', init=True, init_var=False, kw_only=False), 'max_length': FieldInfo(annotation=int, required=False, default=512, description='Maximum length of the generated text.', init=True, init_var=False, kw_only=False), 'model_type': FieldInfo(annotation=str, required=False, default='', description='Type of the model.', init=True, init_var=False, kw_only=False), 'no_repeat_ngram_size': FieldInfo(annotation=int, required=False, default=2, description='Size of n-gram to not appear twice.', init=True, init_var=False, kw_only=False), 'num_return_sequences': FieldInfo(annotation=int, required=False, default=3, description='Number of alternatives to be generated.', init=True, init_var=False, kw_only=False), 'top_k': FieldInfo(annotation=int, required=False, default=50, description='Number of top-k probability tokens to keep.', init=True, init_var=False, kw_only=False), 'top_p': FieldInfo(annotation=float, required=False, default=1.0, description='Only tokens with cumulative probabilities summing up to this value are kept.', init=True, init_var=False, kw_only=False)}¶
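The no_repeat_ngram_size description above ("Size of n-gram to not appear twice") can be made concrete with a small sketch. The function below is illustrative only and is not the gt4sd implementation; the name banned_next_tokens and the word-level tokens are hypothetical. Given the tokens generated so far, it returns the tokens that would complete an n-gram already present in the sequence.

```python
from typing import List, Set


def banned_next_tokens(tokens: List[str], no_repeat_ngram_size: int = 2) -> Set[str]:
    """Return tokens that would repeat an existing n-gram if generated next."""
    n = no_repeat_ngram_size
    # Nothing can be banned when the constraint is off or the sequence is too short.
    if n <= 0 or len(tokens) < n - 1:
        return set()

    # The last n-1 tokens are the prefix of the n-gram about to be completed.
    prefix = tuple(tokens[len(tokens) - (n - 1):]) if n > 1 else ()

    banned: Set[str] = set()
    for i in range(len(tokens) - n + 1):
        ngram = tuple(tokens[i:i + n])
        # If an earlier n-gram starts with the same prefix, its last token is banned.
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned


generated = ["a", "method", "for", "a"]
print(banned_next_tokens(generated, no_repeat_ngram_size=2))  # {'method'}
```

With the default of 2, generating "method" again after the second "a" would repeat the bigram ("a", "method"), so that token is excluded from the next step.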

classmethod __pydantic_fields_complete__()¶

Return whether the fields were successfully collected (i.e. type hints were successfully resolved).

This is a private property, not meant to be used outside Pydantic.

Return type: bool

__pydantic_serializer__¶: pydantic-core SchemaSerializer for the PGTEditor dataclass. It serializes nine fields, each with a default: algorithm_version (str), model_type (str), max_length (int), top_k (int), top_p (float), num_return_sequences (int), no_repeat_ngram_size (int), input_text (str), input_type (str).

__pydantic_validator__¶: pydantic-core SchemaValidator titled "PGTEditor" (validator name dataclass-args[PGTEditor]). Validation is non-strict, extra arguments are ignored, numbers are not coerced to strings, and the dataclass is not frozen and uses slots. It validates nine positional fields, each falling back to its default: algorithm_version (str), model_type (str), max_length (int), top_k (int), top_p (float, inf and NaN allowed), num_return_sequences (int), no_repeat_ngram_size (int), input_text (str), input_type (str).

__repr__()¶: Return repr(self).

__signature__ = <Signature (*args: Any, algorithm_version: str = 'v0', model_type: str = '', max_length: int = 512, top_k: int = 50, top_p: float = 1.0, num_return_sequences: int = 3, no_repeat_ngram_size: int = 2, input_text: str = 'This is my input', input_type: str = 'abstract') -> None>¶

__wrapped__¶: alias of PGTEditor

algorithm_application: ClassVar[str] = 'PGTEditor'¶

Unique name for the application, that is, the use of this configuration together with a specific algorithm.

It is set when registering with ApplicationsRegistry, but can also be given by direct registration (see register_algorithm_application).

algorithm_name: ClassVar[str] = 'PGT'¶

Name of the algorithm to use with this configuration.

It is set when registering with ApplicationsRegistry.

class PGTCoherenceChecker(*args, **kwargs)[source]¶

Bases: PGTCoherenceChecker

Configuration for a PGT coherence check algorithm

num_return_sequences: int = 1¶

input_a: str = "I'm a stochastic parrot."¶

input_b: str = "I'm a stochastic parrot."¶

coherence_type: str = 'title-abstract'¶

get_generator(resources_path, **kwargs)[source]¶

Instantiate the actual PGT implementation for patent coherence check.

Parameters

resources_path (str) – local path to model files.

Return type

Generator

Returns

instance with a generate_batch method for targeted generation.
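A hedged sketch of preparing a PGTCoherenceChecker configuration follows. The helper names coherence_kwargs and run_coherence_check are hypothetical. The supported coherence_type values and the fixed num_return_sequences of 1 come from the field descriptions of this class; how input_a and input_b map onto the two parts named in coherence_type is not stated in this reference, and the PGT(configuration=...).sample(1) pattern is assumed from the module-level PGTGenerator example.

```python
from typing import Any, Dict, List

# Values listed in the coherence_type field description.
SUPPORTED_COHERENCE_TYPES = ("title-abstract", "abstract-claim", "title-claim")


def coherence_kwargs(
    input_a: str, input_b: str, coherence_type: str = "title-abstract"
) -> Dict[str, Any]:
    """Build keyword arguments for PGTCoherenceChecker."""
    if coherence_type not in SUPPORTED_COHERENCE_TYPES:
        raise ValueError(
            f"coherence_type must be one of {SUPPORTED_COHERENCE_TYPES}, "
            f"got {coherence_type!r}"
        )
    return {
        "input_a": input_a,
        "input_b": input_b,
        "coherence_type": coherence_type,
        # The field description says this should always be 1 for a coherence check.
        "num_return_sequences": 1,
    }


def run_coherence_check(kwargs: Dict[str, Any]) -> List[Any]:
    """Run the check. Requires gt4sd to be installed and model files available."""
    # Imported lazily so the helper above stays usable without gt4sd.
    from gt4sd.algorithms.generation.pgt.core import PGT, PGTCoherenceChecker

    config = PGTCoherenceChecker(**kwargs)
    generator = PGT(configuration=config)  # assumed to mirror the PGTGenerator example
    return list(generator.sample(1))


kwargs = coherence_kwargs(
    "Battery cooling device",
    "A device that cools a battery pack using a liquid loop.",
)
print(kwargs["coherence_type"], kwargs["num_return_sequences"])
# run_coherence_check(kwargs)  # uncomment in an environment with gt4sd installed
```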

__annotations__ = {'algorithm_application': 'ClassVar[str]', 'algorithm_name': 'ClassVar[str]', 'algorithm_type': 'ClassVar[str]', 'algorithm_version': 'str', 'coherence_type': <class 'str'>, 'domain': 'ClassVar[str]', 'input_a': <class 'str'>, 'input_b': <class 'str'>, 'max_length': 'int', 'model_type': 'str', 'no_repeat_ngram_size': 'int', 'num_return_sequences': <class 'int'>, 'top_k': 'int', 'top_p': 'float'}¶

__dataclass_fields__¶: dataclass Field definitions. Every entry has init=True, repr=True, hash=None, compare=True and no default_factory. ClassVar entries are of kind _FIELD_CLASSVAR with empty metadata; the others are regular fields (_FIELD, kw_only=False) whose metadata description, where present, follows the default.

algorithm_application (ClassVar[str]) – default 'PGTCoherenceChecker'.
algorithm_name (ClassVar[str]) – default 'PGT'.
algorithm_type (ClassVar[str]) – default 'generation'.
algorithm_version (str) – default 'v0'; empty metadata.
coherence_type (str) – default 'title-abstract'; description 'Input types for the check. Supported: title-abstract, abstract-claim, title-claim'
domain (ClassVar[str]) – default 'nlp'.
input_a (str) – default "I'm a stochastic parrot."; description 'First input for coherence check.'
input_b (str) – default "I'm a stochastic parrot."; description 'Second input for coherence check.'
max_length (int) – default 512; description 'Maximum length of the generated text.'
model_type (str) – default ''; description 'Type of the model.'
no_repeat_ngram_size (int) – default 2; description 'Size of n-gram to not appear twice.'
num_return_sequences (int) – default 1; description 'Number of alternatives should be always 1 for coherence check.'
top_k (int) – default 50; description 'Number of top-k probability tokens to keep.'
top_p (float) – default 1.0; description 'Only tokens with cumulative probabilities summing up to this value are kept.'

__dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶

__doc__ = 'Configuration for a PGT coherence check algorithm'¶

__eq__(other)¶: Return self==value.

__hash__ = None¶
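The `None` value is what the standard library `dataclasses` decorator produces for the documented parameters `eq=True`, `frozen=False`, `unsafe_hash=False`: instances compare field by field but cannot be hashed. The sketch below illustrates this with a hypothetical stand-in class, `CoherenceConfigSketch`, that copies only three of the documented fields. It is not the gt4sd class and assumes nothing about gt4sd beyond the parameters shown in this reference.

```python
from dataclasses import dataclass


# Hypothetical stand-in, NOT the gt4sd class: it mirrors three documented
# fields and the documented dataclass parameters only.
@dataclass(eq=True, frozen=False, unsafe_hash=False)
class CoherenceConfigSketch:
    input_a: str = "I'm a stochastic parrot."
    input_b: str = "I'm a stochastic parrot."
    coherence_type: str = "title-abstract"


first = CoherenceConfigSketch()
second = CoherenceConfigSketch()

# eq=True generates a field-by-field __eq__.
same = first == second

# eq=True without frozen=True sets __hash__ to None, so hash() raises.
try:
    hash(first)
    hashable = True
except TypeError:
    hashable = False

print(same, hashable)
```

In practice this means a configuration object of this shape cannot be used as a dictionary key or set member.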

__init__(*args, **kwargs)¶

__is_pydantic_dataclass__ = True¶

__match_args__ = ('algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_a', 'input_b', 'coherence_type')¶

__module__ = 'gt4sd.algorithms.generation.pgt.core'¶

__parameters__ = (~T,)¶

__pydantic_complete__ = True¶

__pydantic_config__ = {}¶

__pydantic_core_schema__ = {'cls': <class 'gt4sd.algorithms.generation.pgt.core.PGTCoherenceChecker'>, 'config': {'title': 'PGTCoherenceChecker'}, 'fields': ['algorithm_version', 'model_type', 'max_length', 'top_k', 'top_p', 'num_return_sequences', 'no_repeat_ngram_size', 'input_a', 'input_b', 'coherence_type'], 'frozen': False, 'post_init': False, 'ref': 'types.PGTCoherenceChecker:93913128879136', 'schema': {'collect_init_only': False, 'computed_fields': [], 'dataclass_name': 'PGTCoherenceChecker', 'fields': [{'type': 'dataclass-field', 'name': 'algorithm_version', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'v0'}, 'kw_only': False, 'init': True, 'metadata': {}}, {'type': 'dataclass-field', 'name': 'model_type', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': ''}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Type of the model.'}}}, {'type': 'dataclass-field', 'name': 'max_length', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 512}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Maximum length of the generated text.'}}}, {'type': 'dataclass-field', 'name': 'top_k', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 50}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Number of top-k probability tokens to keep.'}}}, {'type': 'dataclass-field', 'name': 'top_p', 'schema': {'type': 'default', 'schema': {'type': 'float'}, 'default': 1.0}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Only tokens with cumulative probabilities summing up to this value are kept.'}}}, {'type': 'dataclass-field', 'name': 'num_return_sequences', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 1}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Number of alternatives should be always 1 for coherence 
check.'}}}, {'type': 'dataclass-field', 'name': 'no_repeat_ngram_size', 'schema': {'type': 'default', 'schema': {'type': 'int'}, 'default': 2}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Size of n-gram to not appear twice.'}}}, {'type': 'dataclass-field', 'name': 'input_a', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': "I'm a stochastic parrot."}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'First input for coherence check.'}}}, {'type': 'dataclass-field', 'name': 'input_b', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': "I'm a stochastic parrot."}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Second input for coherence check.'}}}, {'type': 'dataclass-field', 'name': 'coherence_type', 'schema': {'type': 'default', 'schema': {'type': 'str'}, 'default': 'title-abstract'}, 'kw_only': False, 'init': True, 'metadata': {'pydantic_js_updates': {'description': 'Input types for the check. Supported: title-abstract, abstract-claim, title-claim'}}}], 'type': 'dataclass-args'}, 'slots': True, 'type': 'dataclass'}¶

__pydantic_decorators__ = DecoratorInfos(validators={}, field_validators={}, root_validators={}, field_serializers={}, model_serializers={}, model_validators={}, computed_fields={})¶

__pydantic_fields__ = {'algorithm_version': FieldInfo(annotation=str, required=False, default='v0', init=True, init_var=False, kw_only=False), 'coherence_type': FieldInfo(annotation=str, required=False, default='title-abstract', description='Input types for the check. Supported: title-abstract, abstract-claim, title-claim', init=True, init_var=False, kw_only=False), 'input_a': FieldInfo(annotation=str, required=False, default="I'm a stochastic parrot.", description='First input for coherence check.', init=True, init_var=False, kw_only=False), 'input_b': FieldInfo(annotation=str, required=False, default="I'm a stochastic parrot.", description='Second input for coherence check.', init=True, init_var=False, kw_only=False), 'max_length': FieldInfo(annotation=int, required=False, default=512, description='Maximum length of the generated text.', init=True, init_var=False, kw_only=False), 'model_type': FieldInfo(annotation=str, required=False, default='', description='Type of the model.', init=True, init_var=False, kw_only=False), 'no_repeat_ngram_size': FieldInfo(annotation=int, required=False, default=2, description='Size of n-gram to not appear twice.', init=True, init_var=False, kw_only=False), 'num_return_sequences': FieldInfo(annotation=int, required=False, default=1, description='Number of alternatives should be always 1 for coherence check.', init=True, init_var=False, kw_only=False), 'top_k': FieldInfo(annotation=int, required=False, default=50, description='Number of top-k probability tokens to keep.', init=True, init_var=False, kw_only=False), 'top_p': FieldInfo(annotation=float, required=False, default=1.0, description='Only tokens with cumulative probabilities summing up to this value are kept.', init=True, init_var=False, kw_only=False)}¶
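The `top_k` and `top_p` descriptions above refer to the usual top-k and nucleus filtering of a next-token distribution. The following is a minimal sketch of that general technique on a toy distribution. The function `filter_top_k_top_p` and the toy tokens are hypothetical names introduced for illustration, and this is not the code that PGT or its underlying model library actually runs.

```python
from typing import Dict


def filter_top_k_top_p(
    probs: Dict[str, float], top_k: int = 50, top_p: float = 1.0
) -> Dict[str, float]:
    """Keep the top_k most likely tokens, then the smallest prefix whose
    cumulative probability reaches top_p, and renormalize what is left."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]
    kept = []
    cumulative = 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= top_p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


# Toy next-token distribution (illustrative values only).
toy = {"claim": 0.5, "method": 0.3, "device": 0.15, "parrot": 0.05}
filtered = filter_top_k_top_p(toy, top_k=3, top_p=0.7)
print(filtered)
```

With the documented defaults (`top_k=50`, `top_p=1.0`) a four-token distribution like this one passes through unchanged, which is why lower values are needed to restrict sampling.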

classmethod __pydantic_fields_complete__()¶

Return whether the fields were successfully collected (i.e. type hints were successfully resolved).

This is a private property, not meant to be used outside Pydantic.

Return type: bool

__pydantic_serializer__ = SchemaSerializer(serializer=Dataclass( DataclassSerializer { class: Py( 0x00005569daafc420, ), serializer: Fields( GeneralFieldsSerializer { fields: { "max_length": SerField { key_py: Py( 0x00007f8559f51ff0, ), alias: None, alias_py: None, serializer: Some( WithDefault( WithDefaultSerializer { default: Default( Py( 0x00007f855a11aa90, ), ), serializer: Int( IntSerializer, ), }, ), ), required: true, serialize_by_alias: None, }, "input_a": SerField { key_py: Py( 0x00007f8559f51fb0, ), alias: None, alias_py: None, serializer: Some( WithDefault( WithDefaultSerializer { default: Default( Py( 0x00007f8560dfba50, ), ), serializer: Str( StrSerializer, ), }, ), ), required: true, serialize_by_alias: None, }, "algorithm_version": SerField { key_py: Py( 0x00007f855b19c990, ), alias: None, alias_py: None, serializer: Some( WithDefault( WithDefaultSerializer { default: Default( Py( 0x00007f863c298a30, ), ), serializer: Str( StrSerializer, ), }, ), ), required: true, serialize_by_alias: None, }, "top_k": SerField { key_py: Py( 0x00007f8559f51f30, ), alias: None, alias_py: None, serializer: Some( WithDefault( WithDefaultSerializer { default: Default( Py( 0x00007f86404a0710, ), ), serializer: Int( IntSerializer, ), }, ), ), required: true, serialize_by_alias: None, }, "coherence_type": SerField { key_py: Py( 0x00007f8559f51bb0, ), alias: None, alias_py: None, serializer: Some( WithDefault( WithDefaultSerializer { default: Default( Py( 0x00007f8559f17430, ), ), serializer: Str( StrSerializer, ), }, ), ), required: true, serialize_by_alias: None, }, "no_repeat_ngram_size": SerField { key_py: Py( 0x00007f855b19d8e0, ), alias: None, alias_py: None, serializer: Some( WithDefault( WithDefaultSerializer { default: Default( Py( 0x00007f86404a0110, ), ), serializer: Int( IntSerializer, ), }, ), ), required: true, serialize_by_alias: None, }, "top_p": SerField { key_py: Py( 0x00007f8559f51ef0, ), alias: None, alias_py: None, serializer: Some( WithDefault( 
WithDefaultSerializer { default: Default( Py( 0x00007f855a11a190, ), ), serializer: Float( FloatSerializer { inf_nan_mode: Null, }, ), }, ), ), required: true, serialize_by_alias: None, }, "input_b": SerField { key_py: Py( 0x00007f8559f52730, ), alias: None, alias_py: None, serializer: Some( WithDefault( WithDefaultSerializer { default: Default( Py( 0x00007f8560dfba50, ), ), serializer: Str( StrSerializer, ), }, ), ), required: true, serialize_by_alias: None, }, "model_type": SerField { key_py: Py( 0x00007f8560c89570, ), alias: None, alias_py: None, serializer: Some( WithDefault( WithDefaultSerializer { default: Default( Py( 0x00007f86404a8030, ), ), serializer: Str( StrSerializer, ), }, ), ), required: true, serialize_by_alias: None, }, "num_return_sequences": SerField { key_py: Py( 0x00007f855b19c120, ), alias: None, alias_py: None, serializer: Some( WithDefault( WithDefaultSerializer { default: Default( Py( 0x00007f86404a00f0, ), ), serializer: Int( IntSerializer, ), }, ), ), required: true, serialize_by_alias: None, }, }, computed_fields: Some( ComputedFields( [], ), ), mode: SimpleDict, extra_serializer: None, filter: SchemaFilter { include: None, exclude: None, }, required_fields: 10, }, ), fields: [ Py( 0x00007f863c2ac030, ), Py( 0x00007f863e4c8170, ), Py( 0x00007f863d4cd670, ), Py( 0x00007f858fad4430, ), Py( 0x00007f857ab65db0, ), Py( 0x00007f857a823d70, ), Py( 0x00007f857a823cd0, ), Py( 0x00007f858fdcc2b0, ), Py( 0x00007f858fdcc2f0, ), Py( 0x00007f8559f17470, ), ], name: "PGTCoherenceChecker", }, ), definitions=[])¶

__pydantic_validator__ = SchemaValidator(title="PGTCoherenceChecker", validator=Dataclass( DataclassValidator { strict: false, validator: DataclassArgs( DataclassArgsValidator { fields: [ Field { kw_only: false, name: "algorithm_version", py_name: Py( 0x00007f863c2ac030, ), init: true, init_only: false, lookup_key_collection: LookupKeyCollection { by_name: Simple( LookupPath { first_item: PathItemString { key: "algorithm_version", py_key: Py( 0x00007f855b19d930, ), }, rest: [], }, ), by_alias: None, by_alias_then_name: None, }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f863c298a30, ), ), on_error: Raise, validator: Str( StrValidator { strict: false, coerce_numbers_to_str: false, }, ), validate_default: false, copy_default: false, name: "default[str]", undefined: Py( 0x00007f863e1e3a60, ), }, ), frozen: false, }, Field { kw_only: false, name: "model_type", py_name: Py( 0x00007f863e4c8170, ), init: true, init_only: false, lookup_key_collection: LookupKeyCollection { by_name: Simple( LookupPath { first_item: PathItemString { key: "model_type", py_key: Py( 0x00007f8560ca8b70, ), }, rest: [], }, ), by_alias: None, by_alias_then_name: None, }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f86404a8030, ), ), on_error: Raise, validator: Str( StrValidator { strict: false, coerce_numbers_to_str: false, }, ), validate_default: false, copy_default: false, name: "default[str]", undefined: Py( 0x00007f863e1e3a60, ), }, ), frozen: false, }, Field { kw_only: false, name: "max_length", py_name: Py( 0x00007f863d4cd670, ), init: true, init_only: false, lookup_key_collection: LookupKeyCollection { by_name: Simple( LookupPath { first_item: PathItemString { key: "max_length", py_key: Py( 0x00007f8559f51f70, ), }, rest: [], }, ), by_alias: None, by_alias_then_name: None, }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f855a11aa90, ), ), on_error: Raise, validator: Int( IntValidator { strict: 
false, }, ), validate_default: false, copy_default: false, name: "default[int]", undefined: Py( 0x00007f863e1e3a60, ), }, ), frozen: false, }, Field { kw_only: false, name: "top_k", py_name: Py( 0x00007f858fad4430, ), init: true, init_only: false, lookup_key_collection: LookupKeyCollection { by_name: Simple( LookupPath { first_item: PathItemString { key: "top_k", py_key: Py( 0x00007f8559f51e30, ), }, rest: [], }, ), by_alias: None, by_alias_then_name: None, }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f86404a0710, ), ), on_error: Raise, validator: Int( IntValidator { strict: false, }, ), validate_default: false, copy_default: false, name: "default[int]", undefined: Py( 0x00007f863e1e3a60, ), }, ), frozen: false, }, Field { kw_only: false, name: "top_p", py_name: Py( 0x00007f857ab65db0, ), init: true, init_only: false, lookup_key_collection: LookupKeyCollection { by_name: Simple( LookupPath { first_item: PathItemString { key: "top_p", py_key: Py( 0x00007f8559f518b0, ), }, rest: [], }, ), by_alias: None, by_alias_then_name: None, }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f855a11a190, ), ), on_error: Raise, validator: Float( FloatValidator { strict: false, allow_inf_nan: true, }, ), validate_default: false, copy_default: false, name: "default[float]", undefined: Py( 0x00007f863e1e3a60, ), }, ), frozen: false, }, Field { kw_only: false, name: "num_return_sequences", py_name: Py( 0x00007f857a823d70, ), init: true, init_only: false, lookup_key_collection: LookupKeyCollection { by_name: Simple( LookupPath { first_item: PathItemString { key: "num_return_sequences", py_key: Py( 0x00007f8560ac1840, ), }, rest: [], }, ), by_alias: None, by_alias_then_name: None, }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f86404a00f0, ), ), on_error: Raise, validator: Int( IntValidator { strict: false, }, ), validate_default: false, copy_default: false, name: "default[int]", undefined: 
Py( 0x00007f863e1e3a60, ), }, ), frozen: false, }, Field { kw_only: false, name: "no_repeat_ngram_size", py_name: Py( 0x00007f857a823cd0, ), init: true, init_only: false, lookup_key_collection: LookupKeyCollection { by_name: Simple( LookupPath { first_item: PathItemString { key: "no_repeat_ngram_size", py_key: Py( 0x00007f8561082150, ), }, rest: [], }, ), by_alias: None, by_alias_then_name: None, }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f86404a0110, ), ), on_error: Raise, validator: Int( IntValidator { strict: false, }, ), validate_default: false, copy_default: false, name: "default[int]", undefined: Py( 0x00007f863e1e3a60, ), }, ), frozen: false, }, Field { kw_only: false, name: "input_a", py_name: Py( 0x00007f858fdcc2b0, ), init: true, init_only: false, lookup_key_collection: LookupKeyCollection { by_name: Simple( LookupPath { first_item: PathItemString { key: "input_a", py_key: Py( 0x00007f8559f51c30, ), }, rest: [], }, ), by_alias: None, by_alias_then_name: None, }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f8560dfba50, ), ), on_error: Raise, validator: Str( StrValidator { strict: false, coerce_numbers_to_str: false, }, ), validate_default: false, copy_default: false, name: "default[str]", undefined: Py( 0x00007f863e1e3a60, ), }, ), frozen: false, }, Field { kw_only: false, name: "input_b", py_name: Py( 0x00007f858fdcc2f0, ), init: true, init_only: false, lookup_key_collection: LookupKeyCollection { by_name: Simple( LookupPath { first_item: PathItemString { key: "input_b", py_key: Py( 0x00007f8559f51cb0, ), }, rest: [], }, ), by_alias: None, by_alias_then_name: None, }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f8560dfba50, ), ), on_error: Raise, validator: Str( StrValidator { strict: false, coerce_numbers_to_str: false, }, ), validate_default: false, copy_default: false, name: "default[str]", undefined: Py( 0x00007f863e1e3a60, ), }, ), frozen: false, }, 
Field { kw_only: false, name: "coherence_type", py_name: Py( 0x00007f8559f17470, ), init: true, init_only: false, lookup_key_collection: LookupKeyCollection { by_name: Simple( LookupPath { first_item: PathItemString { key: "coherence_type", py_key: Py( 0x00007f8559f518f0, ), }, rest: [], }, ), by_alias: None, by_alias_then_name: None, }, validator: WithDefault( WithDefaultValidator { default: Default( Py( 0x00007f8559f17430, ), ), on_error: Raise, validator: Str( StrValidator { strict: false, coerce_numbers_to_str: false, }, ), validate_default: false, copy_default: false, name: "default[str]", undefined: Py( 0x00007f863e1e3a60, ), }, ), frozen: false, }, ], positional_count: 10, init_only_count: None, dataclass_name: "PGTCoherenceChecker", validator_name: "dataclass-args[PGTCoherenceChecker]", extra_behavior: Ignore, extras_validator: None, loc_by_alias: true, validate_by_alias: None, validate_by_name: None, }, ), class: Py( 0x00005569daafc420, ), generic_origin: None, fields: [ Py( 0x00007f863c2ac030, ), Py( 0x00007f863e4c8170, ), Py( 0x00007f863d4cd670, ), Py( 0x00007f858fad4430, ), Py( 0x00007f857ab65db0, ), Py( 0x00007f857a823d70, ), Py( 0x00007f857a823cd0, ), Py( 0x00007f858fdcc2b0, ), Py( 0x00007f858fdcc2f0, ), Py( 0x00007f8559f17470, ), ], post_init: None, revalidate: Never, name: "PGTCoherenceChecker", frozen: false, slots: true, }, ), definitions=[], cache_strings=True)¶

__repr__()¶: Return repr(self).

__signature__ = <Signature (*args: Any, algorithm_version: str = 'v0', model_type: str = '', max_length: int = 512, top_k: int = 50, top_p: float = 1.0, num_return_sequences: int = 1, no_repeat_ngram_size: int = 2, input_a: str = "I'm a stochastic parrot.", input_b: str = "I'm a stochastic parrot.", coherence_type: str = 'title-abstract') -> None>¶

__wrapped__¶: alias of PGTCoherenceChecker

algorithm_application: ClassVar[str] = 'PGTCoherenceChecker'¶

Unique name for the application, i.e. the use of this configuration together with a specific algorithm.

Will be set when registering to ApplicationsRegistry, but can also be given by direct registration (see register_algorithm_application).

algorithm_name: ClassVar[str] = 'PGT'¶

Name of the algorithm to use with this configuration.

Will be set when registering to ApplicationsRegistry.