gt4sd.cli.hf_to_st_converter module¶
Transformers pretrained model to SentenceTransformer model converter.
Summary¶
Classes:
TransformersToSentenceTransformersArguments | Transformers to Sentence Transformers converter arguments.
Functions:
main | Convert HF pretrained model to SentenceTransformer.
Reference¶
- class TransformersToSentenceTransformersArguments(model_name_or_path, pooling, output_path)[source]¶
Bases: object
Transformers to Sentence Transformers converter arguments.
- __name__ = 'TransformersToSentenceTransformersArguments'¶
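- model_name_or_path: str¶
HF model name or path.
- pooling: str¶
Comma separated pooling modes. Supported types: cls, max, mean, mean_sqrt.
- output_path: str¶
Path to the converted model.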
- __annotations__ = {'model_name_or_path': <class 'str'>, 'output_path': <class 'str'>, 'pooling': <class 'str'>}¶
- __dataclass_fields__ = {'model_name_or_path': Field(name='model_name_or_path',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'help': 'HF model name or path.'}),kw_only=False,_field_type=_FIELD), 'output_path': Field(name='output_path',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'help': 'Path to the converted model.'}),kw_only=False,_field_type=_FIELD), 'pooling': Field(name='pooling',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'help': 'Comma separated pooling modes. Supported types: cls, max, mean, mean_sqrt.'}),kw_only=False,_field_type=_FIELD)}¶
- __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶
- __dict__ = mappingproxy({'__module__': 'gt4sd.cli.hf_to_st_converter', '__annotations__': {'model_name_or_path': <class 'str'>, 'pooling': <class 'str'>, 'output_path': <class 'str'>}, '__doc__': 'Transformers to Sentence Transformers converter arguments.', '__name__': 'hf_to_st_converter_args', '__dict__': <attribute '__dict__' of 'TransformersToSentenceTransformersArguments' objects>, '__weakref__': <attribute '__weakref__' of 'TransformersToSentenceTransformersArguments' objects>, '__dataclass_params__': _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False), '__dataclass_fields__': {'model_name_or_path': Field(name='model_name_or_path',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'help': 'HF model name or path.'}),kw_only=False,_field_type=_FIELD), 'pooling': Field(name='pooling',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'help': 'Comma separated pooling modes. Supported types: cls, max, mean, mean_sqrt.'}),kw_only=False,_field_type=_FIELD), 'output_path': Field(name='output_path',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'help': 'Path to the converted model.'}),kw_only=False,_field_type=_FIELD)}, '__init__': <function TransformersToSentenceTransformersArguments.__init__>, '__repr__': <function TransformersToSentenceTransformersArguments.__repr__>, '__eq__': <function TransformersToSentenceTransformersArguments.__eq__>, '__hash__': None, '__match_args__': ('model_name_or_path', 'pooling', 'output_path')})¶
- __doc__ = 'Transformers to Sentence Transformers converter arguments.'¶
- __eq__(other)¶
Return self==value.
- __hash__ = None¶
- __init__(model_name_or_path, pooling, output_path)¶
- __match_args__ = ('model_name_or_path', 'pooling', 'output_path')¶
- __module__ = 'gt4sd.cli.hf_to_st_converter'¶
- __repr__()¶
Return repr(self).
- __weakref__¶
list of weak references to the object (if defined)
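The three fields above can be filled from command-line flags. The following is a minimal sketch under stated assumptions: `ConverterArgs` and `parse_args` are hypothetical stand-ins that mirror the documented fields and help strings, the `--field_name` flag spelling is assumed, and the standard-library `argparse` is used in place of whatever parser gt4sd itself relies on.

```python
import argparse
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class ConverterArgs:
    """Hypothetical stand-in mirroring the documented fields (not the gt4sd class)."""

    model_name_or_path: str = field(metadata={"help": "HF model name or path."})
    pooling: str = field(
        metadata={
            "help": "Comma separated pooling modes. "
            "Supported types: cls, max, mean, mean_sqrt."
        }
    )
    output_path: str = field(metadata={"help": "Path to the converted model."})


def parse_args(argv: Optional[List[str]] = None) -> ConverterArgs:
    """Build one required string flag per dataclass field and parse argv."""
    parser = argparse.ArgumentParser(
        description="Convert a HF pretrained model to SentenceTransformer."
    )
    for f in fields(ConverterArgs):
        # Flag spelling (--field_name) is an assumption made for this sketch.
        parser.add_argument(
            f"--{f.name}", type=str, required=True, help=f.metadata["help"]
        )
    namespace = parser.parse_args(argv)
    return ConverterArgs(**vars(namespace))


if __name__ == "__main__":
    print(parse_args())
```

Because the class is a dataclass with `eq=True`, two instances holding the same three values compare equal, which makes parsed arguments easy to check in tests.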
- main()[source]¶
Convert HF pretrained model to SentenceTransformer.
Create a SentenceTransformer model that uses a given HF model as the word embedding model, followed by an optional pooling layer. Multiple pooling modes can be concatenated.
The following parameters are parsed from the command line:
- the HF pretrained model to be used as the word embedding model.
- the pooling mode (more than one can be provided as a comma-separated list); the implemented options are “cls”, “max”, “mean” and “mean_sqrt”.
- the path where the generated SentenceTransformer model is saved.
- Return type
None
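The conversion described above can be sketched as follows. This is an illustration under stated assumptions, not the gt4sd source: `POOLING_FLAGS`, `pooling_flags` and `build_sentence_transformer` are hypothetical names, and the sketch assumes the pooling string maps onto the boolean keyword flags of `sentence_transformers.models.Pooling`. The library import is deferred to the build function so the parsing helper can be used on its own.

```python
from typing import Dict

# Assumed mapping from each documented pooling mode to the corresponding
# keyword flag of sentence_transformers.models.Pooling.
POOLING_FLAGS = {
    "cls": "pooling_mode_cls_token",
    "max": "pooling_mode_max_tokens",
    "mean": "pooling_mode_mean_tokens",
    "mean_sqrt": "pooling_mode_mean_sqrt_len_tokens",
}


def pooling_flags(pooling: str) -> Dict[str, bool]:
    """Turn a comma-separated pooling string into explicit boolean flags."""
    modes = [mode.strip() for mode in pooling.split(",") if mode.strip()]
    unknown = sorted(set(modes) - set(POOLING_FLAGS))
    if unknown:
        raise ValueError(f"Unsupported pooling modes: {', '.join(unknown)}")
    # Every flag is set explicitly so that library defaults do not leak in.
    return {flag: mode in modes for mode, flag in POOLING_FLAGS.items()}


def build_sentence_transformer(
    model_name_or_path: str, pooling: str, output_path: str
) -> None:
    """Assemble and save a SentenceTransformer (requires sentence-transformers)."""
    from sentence_transformers import SentenceTransformer, models

    word_embedding_model = models.Transformer(model_name_or_path)
    modules = [word_embedding_model]
    flags = pooling_flags(pooling)
    if any(flags.values()):
        # Selected pooling outputs are concatenated by the Pooling module.
        modules.append(
            models.Pooling(
                word_embedding_model.get_word_embedding_dimension(), **flags
            )
        )
    SentenceTransformer(modules=modules).save(output_path)
```

For example, `pooling_flags("cls,mean")` enables the CLS-token and mean-token flags and disables the other two, so the resulting sentence embedding would be the concatenation of both pooled vectors.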