gt4sd.frameworks.granular.dataloader.data_module module

Data module for granular.

Summary

Classes:

GranularDataModule

Data module from granular.

Reference

class GranularDataModule(dataset_list, validation_split=None, validation_indices_file=None, stratified_batch_file=None, stratified_value_name=None, batch_size=64, num_workers=1)

Bases: LightningDataModule

Data module from granular.

__init__(dataset_list, validation_split=None, validation_indices_file=None, stratified_batch_file=None, stratified_value_name=None, batch_size=64, num_workers=1)

Construct GranularDataModule.

Parameters
  • dataset_list (List[GranularDataset]) – a list of granular datasets.

  • validation_split (Optional[float]) – proportion of the data used for validation. Defaults to None, i.e., use the indices file if provided; otherwise half of the data is used for validation.

  • validation_indices_file (Optional[str]) – file containing the indices to use for validation. Defaults to None, i.e., use the validation split proportion if provided; otherwise half of the data is used for validation.

  • stratified_batch_file (Optional[str]) – stratified batch file for sampling. Defaults to None, i.e., no stratified sampling.

  • stratified_value_name (Optional[str]) – stratified value name. Defaults to None, i.e., no stratified sampling. Required when a stratified batch file is provided.

  • batch_size (int) – batch size. Defaults to 64.

  • num_workers (int) – number of workers. Defaults to 1.
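
The interplay between validation_split and validation_indices_file described above can be sketched in plain Python. This is a minimal illustration, not the GT4SD implementation: the function name split_indices, the seed argument, passing the indices as an in-memory sequence instead of a file, and raising an error when both options are given are all assumptions made for the example.

```python
import random
from typing import List, Optional, Sequence, Tuple


def split_indices(
    num_samples: int,
    validation_split: Optional[float] = None,
    validation_indices: Optional[Sequence[int]] = None,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, validation_indices) for num_samples items."""
    if validation_split is not None and validation_indices is not None:
        # Assumption: the documentation does not say what happens when both
        # options are given, so this sketch rejects the combination.
        raise ValueError("provide validation_split or validation_indices, not both")

    if validation_indices is not None:
        # Explicit indices are used as the validation set.
        held_out = set(validation_indices)
        train = [i for i in range(num_samples) if i not in held_out]
        return train, sorted(held_out)

    # With neither option given, fall back to half of the data.
    proportion = 0.5 if validation_split is None else validation_split
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("validation_split must lie in [0, 1]")

    shuffled = list(range(num_samples))
    random.Random(seed).shuffle(shuffled)
    num_validation = int(num_samples * proportion)
    validation = sorted(shuffled[:num_validation])
    train = sorted(shuffled[num_validation:])
    return train, validation
```

Calling split_indices(10) yields five training and five validation indices, while split_indices(10, validation_split=0.2) holds out two.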

static combine_datasets(dataset_list)

Combine granular datasets.

Parameters

dataset_list (List[GranularDataset]) – a list of granular datasets.

Return type

CombinedGranularDataset

Returns

a combined granular dataset.

prepare_train_data()

Prepare training dataset.

Return type

None

prepare_test_data(dataset_list)

Prepare testing dataset.

Parameters

dataset_list (List[GranularDataset]) – a list of granular datasets.

Return type

None

setup(stage=None)

Set up the data module.

Parameters

stage (Optional[str]) – stage considered; currently unused. Defaults to None.

Return type

None

static get_stratified_batch_sampler(stratified_batch_file, stratified_value_name, batch_size, selector_fn)

Get stratified batch sampler.

Parameters
  • stratified_batch_file (str) – stratified batch file for sampling.

  • stratified_value_name (str) – stratified value name.

  • batch_size (int) – batch size.

  • selector_fn (Callable[[DataFrame], DataFrame]) – selector function for stratified sampling.

Return type

StratifiedSampler

Returns

a stratified batch sampler.
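
To illustrate how the four arguments above could fit together, here is a minimal standard-library sketch of stratified batching. It is not the StratifiedSampler used by GT4SD: the function stratified_batches is hypothetical, a list of dictionaries stands in for the DataFrame read from the stratified batch file, and the round-robin dealing strategy is an assumption chosen for simplicity.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, List

Row = Dict[str, str]  # stand-in for one row of the stratified batch file


def stratified_batches(
    rows: List[Row],
    stratified_value_name: str,
    batch_size: int,
    selector_fn: Callable[[List[Row]], List[Row]],
    seed: int = 0,
) -> List[List[int]]:
    """Group positions of the selected rows into stratum-balanced batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")

    # selector_fn picks the rows to sample from (e.g. the training rows).
    selected = selector_fn(rows)
    if not selected:
        return []

    # Collect the positions of the selected rows per stratum value.
    by_stratum: Dict[str, List[int]] = defaultdict(list)
    for position, row in enumerate(selected):
        by_stratum[row[stratified_value_name]].append(position)

    # Batches can exceed batch_size by a few items when the number of
    # selected rows is not a multiple of batch_size.
    num_batches = max(1, len(selected) // batch_size)
    batches: List[List[int]] = [[] for _ in range(num_batches)]

    # Deal each stratum round-robin across the batches, so that per-stratum
    # counts differ by at most one between any two batches.
    rng = random.Random(seed)
    slot = 0
    for positions in by_stratum.values():
        rng.shuffle(positions)
        for position in positions:
            batches[slot % num_batches].append(position)
            slot += 1
    return batches
```

The returned positions refer to the rows kept by selector_fn, not to the original file, so a caller would need to map them back if the original row numbers matter.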

train_dataloader()

Get a training data loader.

Return type

DataLoader

Returns

a training data loader.

val_dataloader()

Get a validation data loader.

Return type

DataLoader

Returns

a validation data loader.
