gt4sd.cli.hf_to_st_converter module
Transformers pretrained model to SentenceTransformer model converter.
Summary
Classes:
TransformersToSentenceTransformersArguments | Transformers to Sentence Transformers converter arguments.
Functions:
main | Convert HF pretrained model to SentenceTransformer.
Reference
- class TransformersToSentenceTransformersArguments(model_name_or_path, pooling, output_path)
 Bases: object
 Transformers to Sentence Transformers converter arguments.
- __name__ = 'TransformersToSentenceTransformersArguments'¶
 
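- model_name_or_path: str
 HF model name or path.
- pooling: str
 Comma separated pooling modes. Supported types: cls, max, mean, mean_sqrt.
- output_path: str
 Path to the converted model.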
- __annotations__ = {'model_name_or_path': <class 'str'>, 'output_path': <class 'str'>, 'pooling': <class 'str'>}¶
 
- __dataclass_fields__ = {'model_name_or_path': Field(name='model_name_or_path',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'help': 'HF model name or path.'}),kw_only=False,_field_type=_FIELD), 'output_path': Field(name='output_path',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'help': 'Path to the converted model.'}),kw_only=False,_field_type=_FIELD), 'pooling': Field(name='pooling',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'help': 'Comma separated pooling modes. Supported types: cls, max, mean, mean_sqrt.'}),kw_only=False,_field_type=_FIELD)}¶
 
- __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False)¶
 
- __dict__ = mappingproxy({'__module__': 'gt4sd.cli.hf_to_st_converter', '__annotations__': {'model_name_or_path': <class 'str'>, 'pooling': <class 'str'>, 'output_path': <class 'str'>}, '__doc__': 'Transformers to Sentence Transformers converter arguments.', '__name__': 'hf_to_st_converter_args', '__dict__': <attribute '__dict__' of 'TransformersToSentenceTransformersArguments' objects>, '__weakref__': <attribute '__weakref__' of 'TransformersToSentenceTransformersArguments' objects>, '__dataclass_params__': _DataclassParams(init=True,repr=True,eq=True,order=False,unsafe_hash=False,frozen=False), '__dataclass_fields__': {'model_name_or_path': Field(name='model_name_or_path',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'help': 'HF model name or path.'}),kw_only=False,_field_type=_FIELD), 'pooling': Field(name='pooling',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'help': 'Comma separated pooling modes. Supported types: cls, max, mean, mean_sqrt.'}),kw_only=False,_field_type=_FIELD), 'output_path': Field(name='output_path',type=<class 'str'>,default=<dataclasses._MISSING_TYPE object>,default_factory=<dataclasses._MISSING_TYPE object>,init=True,repr=True,hash=None,compare=True,metadata=mappingproxy({'help': 'Path to the converted model.'}),kw_only=False,_field_type=_FIELD)}, '__init__': <function TransformersToSentenceTransformersArguments.__init__>, '__repr__': <function TransformersToSentenceTransformersArguments.__repr__>, '__eq__': <function TransformersToSentenceTransformersArguments.__eq__>, '__hash__': None, '__match_args__': ('model_name_or_path', 'pooling', 'output_path')})¶
 
- __doc__ = 'Transformers to Sentence Transformers converter arguments.'¶
 
- __eq__(other)¶
 Return self==value.
- __hash__ = None¶
 
- __init__(model_name_or_path, pooling, output_path)¶
 
- __match_args__ = ('model_name_or_path', 'pooling', 'output_path')¶
 
- __module__ = 'gt4sd.cli.hf_to_st_converter'¶
 
- __repr__()¶
 Return repr(self).
- __weakref__¶
 list of weak references to the object (if defined)
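The attribute dump above describes a plain dataclass with three required string fields, each carrying a 'help' entry in its metadata, and with `__hash__` set to None. The following is a minimal sketch of that shape using a local stand-in class (`ConverterArguments` is a hypothetical name, not the gt4sd class). The `argparse` flags derived from the field names are an assumption for illustration; the actual command-line flags of the converter are not shown on this page, and the argument values are example inputs only.

```python
import argparse
from dataclasses import dataclass, field, fields


@dataclass
class ConverterArguments:
    """Local stand-in mirroring the documented fields (not the gt4sd class)."""

    model_name_or_path: str = field(metadata={"help": "HF model name or path."})
    pooling: str = field(
        metadata={
            "help": "Comma separated pooling modes. "
            "Supported types: cls, max, mean, mean_sqrt."
        }
    )
    output_path: str = field(metadata={"help": "Path to the converted model."})


# Assumption: one required "--<field name>" flag per dataclass field,
# reusing the 'help' metadata as the flag description.
parser = argparse.ArgumentParser(description="Illustrative parser only.")
for f in fields(ConverterArguments):
    parser.add_argument(f"--{f.name}", type=str, required=True, help=f.metadata["help"])

# Example values; any model name or path could be used here.
namespace = parser.parse_args(
    [
        "--model_name_or_path", "bert-base-uncased",
        "--pooling", "cls,mean",
        "--output_path", "./converted",
    ]
)
args = ConverterArguments(**vars(namespace))
field_names = [f.name for f in fields(args)]

# eq=True with frozen=False sets __hash__ to None, so instances are unhashable.
try:
    hash(args)
    hashable = True
except TypeError:
    hashable = False
```

Because the generated `__eq__` compares field values, two instances built from the same arguments compare equal, while the missing `__hash__` means they cannot be used as dictionary keys or set members.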
- main()
 Convert HF pretrained model to SentenceTransformer.
 Creates a SentenceTransformer model that uses the given HF model as its word embedding model, followed by an optional pooling layer. Multiple pooling modes can be concatenated.
 The following parameters are parsed from the command line:
 - model_name_or_path: HF pretrained model to be used as the word embedding model.
 - pooling: the pooling mode; more than one can be provided as a comma-separated list. The implemented options are “cls”, “max”, “mean” and “mean_sqrt”.
 - output_path: path where the generated SentenceTransformer model is saved.
- Return type
 None
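The conversion described for main() can be sketched as follows. This is a minimal illustration and not the gt4sd implementation: `POOLING_FLAGS`, `pooling_kwargs` and `convert` are hypothetical names, and the mapping from the documented pooling names to the boolean keyword arguments of `sentence_transformers.models.Pooling` is an assumption about how the modes might be wired up. The `sentence_transformers` import is done lazily so that the parsing helper works on its own.

```python
from typing import Dict

# Assumed mapping from the documented pooling names to the boolean
# keyword arguments accepted by sentence_transformers.models.Pooling.
POOLING_FLAGS: Dict[str, str] = {
    "cls": "pooling_mode_cls_token",
    "max": "pooling_mode_max_tokens",
    "mean": "pooling_mode_mean_tokens",
    "mean_sqrt": "pooling_mode_mean_sqrt_len_tokens",
}


def pooling_kwargs(pooling: str) -> Dict[str, bool]:
    """Turn a comma-separated pooling string into Pooling keyword flags."""
    requested = {mode.strip() for mode in pooling.split(",") if mode.strip()}
    unknown = requested - set(POOLING_FLAGS)
    if unknown:
        raise ValueError(f"Unsupported pooling modes: {sorted(unknown)}")
    # Every flag is set explicitly so library defaults never leak in.
    return {flag: name in requested for name, flag in POOLING_FLAGS.items()}


def convert(model_name_or_path: str, pooling: str, output_path: str) -> None:
    """Hypothetical stand-in for the conversion performed by main()."""
    # Imported here so pooling_kwargs stays usable without the dependency.
    from sentence_transformers import SentenceTransformer, models

    word_embedding = models.Transformer(model_name_or_path)
    modules = [word_embedding]

    flags = pooling_kwargs(pooling)
    if any(flags.values()):
        # Enabling several modes makes Pooling concatenate their outputs.
        modules.append(
            models.Pooling(word_embedding.get_word_embedding_dimension(), **flags)
        )

    SentenceTransformer(modules=modules).save(output_path)
```

Under these assumptions, `convert("bert-base-uncased", "cls,mean", "./converted")` would save a model whose sentence embedding concatenates the CLS-token vector with the mean of the token embeddings; the model name and output path are example values only.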