gt4sd.algorithms.conditional_generation.regression_transformer.utils module

Summary

Functions:

filter_stubbed

Remove stub-like molecules that are substantially smaller than the target.

get_substructure_indices

Return the indices of all tokens in the full sequence that match the substructure.

Reference

get_substructure_indices(full_sequence, substructure)
Parameters
  • full_sequence (List[str]) – A list of strings, each representing a token from the full sequence.

  • substructure (List[str]) – A list of strings, each representing a token from the substructure that is contained in the full sequence.

Return type

List[int]

Returns

A list of integers, corresponding to all the indices of the tokens in the full sequence that match the substructure.
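
The matching described above can be illustrated with a minimal sketch. It is not the library's implementation: the name substructure_indices_sketch is hypothetical, and the sketch assumes that "match" means a contiguous run of tokens equal to the substructure, with every index reported once even when occurrences overlap.

```python
from typing import List


def substructure_indices_sketch(
    full_sequence: List[str], substructure: List[str]
) -> List[int]:
    """Hypothetical stand-in: indices of tokens covered by any contiguous match."""
    if not substructure:
        return []
    width = len(substructure)
    matched = set()
    # Slide a window of the substructure's length over the full sequence.
    for start in range(len(full_sequence) - width + 1):
        if full_sequence[start:start + width] == substructure:
            matched.update(range(start, start + width))
    # Assumption: indices are returned sorted and without duplicates.
    return sorted(matched)


tokens = ["C", "C", "(", "=", "O", ")", "O", "C"]
print(substructure_indices_sketch(tokens, ["(", "=", "O", ")"]))  # [2, 3, 4, 5]
```

Whether the actual function deduplicates overlapping matches or handles an empty substructure this way is not stated in this reference, so treat both as choices of the sketch.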

filter_stubbed(property_sequences, target, threshold=0.5)

Remove stub-like molecules that are substantially smaller than the target.

Parameters
  • property_sequences – Generated molecules paired with their properties. The properties play no role in filtering; they are only returned alongside the molecules that are kept.

  • target (str) – Seed molecule.

  • threshold (float) – Minimum size of a generated molecule, expressed as a fraction of the seed molecule's size. Molecules below this fraction are discarded. Defaults to 0.5.

Return type

Tuple[Tuple[str, str]]

Returns

Tuple of 2-tuples, each containing a generated molecule that passed the filter and its properties.
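
The thresholding logic can be sketched as follows. This is an illustration under stated assumptions, not the library's code: filter_stubbed_sketch, crude_size, and the size_fn parameter are hypothetical names, and this reference does not say how molecule size is measured. The sketch uses a deliberately crude proxy (the number of letters in the molecule string) and keeps molecules whose size is at least threshold times the seed's size.

```python
from typing import Callable, Tuple


def crude_size(molecule: str) -> int:
    """Hypothetical size proxy: count of alphabetic characters.

    Crude on purpose; a two-letter element such as "Cl" counts as 2.
    """
    return sum(ch.isalpha() for ch in molecule)


def filter_stubbed_sketch(
    property_sequences: Tuple[Tuple[str, str], ...],
    target: str,
    threshold: float = 0.5,
    size_fn: Callable[[str], int] = crude_size,
) -> Tuple[Tuple[str, str], ...]:
    """Drop (molecule, property) pairs whose molecule is too small."""
    min_size = threshold * size_fn(target)
    # Assumption: a molecule exactly at the threshold is kept.
    return tuple(
        (molecule, prop)
        for molecule, prop in property_sequences
        if size_fn(molecule) >= min_size
    )


pairs = (("CC", "0.1"), ("CCCCC", "0.5"), ("CCCCCCCCC", "0.9"))
print(filter_stubbed_sketch(pairs, target="CCCCCCCC"))
# (('CCCCC', '0.5'), ('CCCCCCCCC', '0.9'))
```

Passing the size measure as a function keeps the sketch independent of any cheminformatics toolkit; a chemically meaningful measure, such as an atom count from a parsed molecule, could be substituted for crude_size.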