Data loader

This section describes Fortuna’s data loader functionality. A DataLoader object is an iterable of two-element tuples of arrays (either NumPy arrays or JAX NumPy arrays), where the first component holds the input variables and the second holds the target variables. If you have a data loader of TensorFlow or PyTorch tensors, or of another kind, you can convert it into something Fortuna can consume using the appropriate DataLoader functionality (see from_tensorflow_data_loader() and from_torch_data_loader()).

A DataLoader also allows you to generate an InputsLoader or a TargetsLoader, i.e. data loaders of only input variables and only target variables, respectively (see to_inputs_loader() and to_targets_loader()). Additionally, you can convert a data loader into an array of inputs, an array of targets, or a tuple of input and target arrays (see to_array_inputs(), to_array_targets() and to_array_data()).
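
To make the structure concrete, the sketch below builds the kind of iterable described above with plain NumPy: a sequence of (inputs, targets) tuples, one per batch. It does not call Fortuna; the helper name make_batches and the toy array shapes are assumptions made for illustration only.

```python
import numpy as np

def make_batches(inputs, targets, batch_size):
    """Yield (inputs, targets) tuples, one per batch (hypothetical helper)."""
    for start in range(0, len(inputs), batch_size):
        yield inputs[start:start + batch_size], targets[start:start + batch_size]

# Toy data: 10 data points with 3 features each, and one integer label per point.
x = np.arange(30, dtype=np.float32).reshape(10, 3)
y = np.arange(10) % 2

batches = list(make_batches(x, y, batch_size=4))

# Keeping only the first or the second component of each tuple gives the
# inputs-only and targets-only views described in the text.
inputs_only = [batch_inputs for batch_inputs, _ in batches]
targets_only = [batch_targets for _, batch_targets in batches]
```

With Fortuna itself, the same tuple of arrays would typically be handed to from_array_data() rather than batched by hand.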

class fortuna.data.loader.DataLoader(iterable, num_unique_labels=None)[source]#
chop(divisor)[source]#

Chop the last part of each batch of the data loader, to make sure the number of data points per batch is a multiple of divisor.

Parameters:

divisor (int) – The number of data points in each batch must be a multiple of this value.

Returns:

A data loader with chopped batches.

Return type:

DataLoader
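
As a rough illustration of the chopping rule, the following NumPy sketch trims the tail of each batch until its size is a multiple of divisor. The helper chop_batches is hypothetical and is not Fortuna's implementation; in particular, dropping batches that contain fewer than divisor data points is an assumption of this sketch.

```python
import numpy as np

def chop_batches(batches, divisor):
    """Trim the tail of each batch so its size is a multiple of `divisor`.

    Hypothetical helper. Assumption: a batch with fewer than `divisor`
    data points is dropped rather than returned empty.
    """
    for inputs, targets in batches:
        keep = (len(targets) // divisor) * divisor
        if keep > 0:
            yield inputs[:keep], targets[:keep]

x = np.arange(20).reshape(10, 2)
y = np.arange(10)
# Three batches holding 4, 4 and 2 data points.
batches = [(x[i:i + 4], y[i:i + 4]) for i in range(0, 10, 4)]

chopped = list(chop_batches(batches, divisor=3))
sizes = [len(targets) for _, targets in chopped]
```

A common reason to want this is splitting every batch evenly across a fixed number of devices, in which case divisor would be the device count.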

classmethod from_array_data(data, batch_size=None, shuffle=False, prefetch=False)[source]#

Build a DataLoader object from a tuple of arrays of input and target variables, respectively.

Parameters:
  • data (Batch) – Input and target arrays of data.

  • batch_size (Optional[int]) – The batch size. If not given, the data will not be batched.

  • shuffle (bool) – Whether the data loader should shuffle at every call.

  • prefetch (bool) – Whether to prefetch the next batch.

Returns:

A data loader built out of the tuple of arrays.

Return type:

DataLoader
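
The sketch below shows one way to read the shuffle option, assuming that "at every call" means a fresh permutation each time the loader is iterated. The class ShufflingBatches and its seed argument are hypothetical stand-ins written with NumPy, not part of Fortuna.

```python
import numpy as np

class ShufflingBatches:
    """Hypothetical stand-in for a loader built from arrays with shuffling on."""

    def __init__(self, inputs, targets, batch_size, seed=0):
        self.inputs, self.targets = inputs, targets
        self.batch_size = batch_size
        self.rng = np.random.default_rng(seed)

    def __iter__(self):
        # A fresh permutation is drawn each time iteration starts.
        perm = self.rng.permutation(len(self.targets))
        for start in range(0, len(perm), self.batch_size):
            idx = perm[start:start + self.batch_size]
            # The same indices are used for inputs and targets, so pairs stay aligned.
            yield self.inputs[idx], self.targets[idx]

x = np.arange(12).reshape(6, 2)
y = np.arange(6)
loader = ShufflingBatches(x, y, batch_size=4)

# Two passes over the same loader; each visits every data point exactly once.
first_pass = np.concatenate([targets for _, targets in loader])
second_pass = np.concatenate([targets for _, targets in loader])
```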

classmethod from_callable_iterable(fun)#

Transform a callable iterable into a concrete instance of a subclass of BaseDataLoader.

Parameters:

fun (Callable[[], Iterable[Batch]]) – A callable iterable of tuples of input and target arrays.

Returns:

A concrete instance of a subclass of BaseDataLoader.

Return type:

T

classmethod from_inputs_loaders(inputs_loaders, targets, how='interpose')#

Transform a list of inputs loaders into a concrete instance of a subclass of BaseDataLoader. The newly created data loader is formed out of concatenated batches of inputs and the respective assigned target variable.

Parameters:
  • inputs_loaders (List[BaseInputsLoader]) – A list of inputs loaders.

  • targets (List[int]) – A target variable for each inputs loader.

  • how (str) – How the inputs_loaders will be combined: ‘interpose’ will interpose the inputs_loaders based on their batch sizes; ‘concatenate’ will ignore batch sizes and concatenate them.

Returns:

A concrete instance of a subclass of BaseDataLoader. The data loader object is formed by the concatenated batches of inputs, and the assigned targets.

Return type:

T
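
The idea of assigning one target per inputs loader can be sketched with NumPy as follows. This covers only a plausible reading of how='concatenate', in which the loaders are visited one after the other; the helper label_and_concatenate and the toy batches are assumptions, and the 'interpose' mode is not modelled here.

```python
import numpy as np

def label_and_concatenate(inputs_loaders, targets):
    """Pair every batch of each inputs loader with a constant target.

    Hypothetical helper: loaders are visited one after the other,
    regardless of how their batch sizes compare.
    """
    for batches, target in zip(inputs_loaders, targets):
        for inputs in batches:
            yield inputs, np.full(len(inputs), target)

# Two toy "inputs loaders", each represented as a list of input batches.
loader_a = [np.zeros((3, 2)), np.zeros((2, 2))]
loader_b = [np.ones((4, 2))]

data = list(label_and_concatenate([loader_a, loader_b], targets=[0, 1]))
all_targets = np.concatenate([batch_targets for _, batch_targets in data])
```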

classmethod from_iterable(iterable)#

Transform an iterable into a concrete instance of a subclass of BaseDataLoader.

Parameters:

iterable (Iterable[Batch]) – An iterable of tuples of input and target arrays.

Returns:

A concrete instance of a subclass of BaseDataLoader.

Return type:

T

classmethod from_tensorflow_data_loader(tf_data_loader)#

Transform a TensorFlow data loader into a concrete instance of a subclass of BaseDataLoader.

Parameters:

tf_data_loader – A TensorFlow data loader where each batch is a tuple of input and target Tensors.

Returns:

A concrete instance of a subclass of BaseDataLoader.

Return type:

T

classmethod from_torch_data_loader(torch_data_loader)#

Transform a PyTorch data loader into a concrete instance of a subclass of BaseDataLoader.

Parameters:

torch_data_loader – A PyTorch data loader where each batch is a tuple of input and target Tensors.

Returns:

A concrete instance of a subclass of BaseDataLoader.

Return type:

T

property input_shape: Union[Iterable[int], Dict[str, Iterable[int]]]#

Get the shape of the inputs in the data loader.

property num_unique_labels: Optional[int]#

Number of unique target labels in the task (classification only).

Returns:

Number of unique target labels in the task if it is a classification one. Otherwise returns None.

Return type:

Optional[int]

sample(seed, n_samples)[source]#

Sample from the data loader, with replacement.

Parameters:
  • seed (int) – Random seed.

  • n_samples (int) – Number of samples.

Returns:

A data loader made of the sampled data points.

Return type:

DataLoader
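
Sampling with replacement means the same data point may be drawn more than once, so n_samples can exceed the size of the loader. The NumPy sketch below illustrates this; the helper sample_with_replacement and the use of NumPy's random generator are assumptions of the sketch and say nothing about how Fortuna draws its samples internally.

```python
import numpy as np

def sample_with_replacement(inputs, targets, seed, n_samples):
    """Draw n_samples (input, target) pairs uniformly, with replacement.

    Hypothetical helper; the choice of NumPy's generator is an assumption.
    """
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(targets), size=n_samples)
    # Indexing both arrays with the same indices keeps pairs aligned.
    return inputs[idx], targets[idx]

x = np.arange(10).reshape(5, 2)
y = np.arange(5)

# Ask for more samples than there are data points: repeats are unavoidable.
xs, ys = sample_with_replacement(x, y, seed=0, n_samples=8)
# The same seed reproduces the same draw.
xs_again, ys_again = sample_with_replacement(x, y, seed=0, n_samples=8)
```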

property size: int#

The number of data points in the data loader.

Returns:

Number of data points.

Return type:

int

split(n_data)[source]#

Split a data loader into two data loaders.

Parameters:

n_data (int) – Number of data points after which the data loader should be split. The first returned data loader will contain exactly n_data data points. The second one will contain the remaining ones.

Returns:

The two data loaders made out of the original one.

Return type:

Tuple[DataLoader, DataLoader]
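
A minimal NumPy sketch of the splitting rule follows. The helper split_batches is hypothetical; cutting a batch in two when it straddles the split point is an assumption of this sketch rather than documented behaviour.

```python
import numpy as np

def split_batches(batches, n_data):
    """Split a list of (inputs, targets) batches after n_data data points."""
    first, second, seen = [], [], 0
    for inputs, targets in batches:
        # Number of points from this batch that still belong to the first part.
        take = min(max(n_data - seen, 0), len(targets))
        if take > 0:
            first.append((inputs[:take], targets[:take]))
        if take < len(targets):
            second.append((inputs[take:], targets[take:]))
        seen += len(targets)
    return first, second

x = np.arange(20).reshape(10, 2)
y = np.arange(10)
batches = [(x[i:i + 4], y[i:i + 4]) for i in range(0, 10, 4)]

first, second = split_batches(batches, n_data=6)
first_targets = np.concatenate([t for _, t in first])
second_targets = np.concatenate([t for _, t in second])
```

A typical use is carving a validation set out of a training loader.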

to_array_data()[source]#

Reduce a data loader to a tuple of input and target arrays.

Returns:

Tuple of input and target arrays.

Return type:

Batch
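
Conceptually, reducing a loader to arrays amounts to stacking its batches along the data point axis. The NumPy sketch below shows this under the assumption that inputs are plain arrays (not dictionaries of arrays); it does not call Fortuna.

```python
import numpy as np

x = np.arange(20).reshape(10, 2)
y = np.arange(10)
# Three batches holding 4, 4 and 2 data points.
batches = [(x[i:i + 4], y[i:i + 4]) for i in range(0, 10, 4)]

# Stack the batches back along the first (data point) axis.
inputs = np.concatenate([batch_inputs for batch_inputs, _ in batches], axis=0)
targets = np.concatenate([batch_targets for _, batch_targets in batches], axis=0)
```

Taking only inputs or only targets corresponds to to_array_inputs() and to_array_targets(), respectively.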

to_array_inputs()[source]#

Reduce a data loader to an array of input data.

Returns:

Array of input data.

Return type:

Array

to_array_targets()[source]#

Reduce a data loader to an array of target data.

Returns:

Array of target data.

Return type:

Array

to_inputs_loader()[source]#

Reduce a data loader to an inputs loader.

Returns:

The inputs loader derived from the data loader.

Return type:

InputsLoader

to_targets_loader()[source]#

Reduce a data loader to a targets loader.

Returns:

The targets loader derived from the data loader.

Return type:

TargetsLoader

to_transformed_data_loader(transform, status=None)#

Transform the batches of an existing data loader.

Parameters:
  • transform (Callable[[InputData, Array, Status], Tuple[InputData, Array, Status]]) – A transformation function. It takes a batch and returns its transformation. A status may be updated during the process.

  • status (Optional[Status]) – An initial status. This may include pre-computed objects used by the transformation.

Returns:

A concrete instance of a subclass of BaseDataLoader.

Return type:

T
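
The transform signature, taking (inputs, targets, status) and returning a tuple of the same three items, can be illustrated with the NumPy sketch below. The helpers transform_batches and standardize are hypothetical, and passing the returned status on to the next batch is an assumption about how the status is threaded through.

```python
import numpy as np

def transform_batches(batches, transform, status=None):
    """Apply transform to each batch, passing status from one batch to the next."""
    for inputs, targets in batches:
        inputs, targets, status = transform(inputs, targets, status)
        yield inputs, targets

def standardize(inputs, targets, status):
    # status carries pre-computed statistics; it is returned unchanged here.
    return (inputs - status["mean"]) / status["std"], targets, status

x = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10)
batches = [(x[i:i + 4], y[i:i + 4]) for i in range(0, 10, 4)]

# Pre-computed objects go into the initial status.
status = {"mean": x.mean(axis=0), "std": x.std(axis=0)}
transformed = list(transform_batches(batches, standardize, status))
new_inputs = np.concatenate([inputs for inputs, _ in transformed])
```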

class fortuna.data.loader.InputsLoader(iterable)[source]#
chop(divisor)[source]#

Chop the last part of each batch of the inputs loader, to make sure the number of data points per batch is a multiple of divisor.

Parameters:

divisor (int) – The number of data points in each batch must be a multiple of this value.

Returns:

An inputs loader with chopped batches.

Return type:

InputsLoader

classmethod from_array_inputs(inputs, batch_size=None, shuffle=False, prefetch=False)[source]#

Build an InputsLoader object from an array of input data.

Parameters:
  • inputs (Array) – Input array of data.

  • batch_size (Optional[int]) – The batch size. If not given, the inputs will not be batched.

  • shuffle (bool) – Whether the inputs loader should shuffle at every call.

  • prefetch (bool) – Whether to prefetch the next batch.

Returns:

An inputs loader built out of the array of inputs.

Return type:

InputsLoader

classmethod from_callable_iterable(fun)#

Transform a callable iterable into a concrete instance of a subclass of BaseInputsLoader.

Parameters:

fun (Callable[[], Iterable[InputData]]) – A callable iterable of input data.

Returns:

A concrete instance of a subclass of BaseInputsLoader.

Return type:

T

classmethod from_data_loader(data_loader)#

Reduce a data loader to an inputs loader.

Parameters:

data_loader (DataLoader) – A data loader.

Returns:

A concrete instance of a subclass of BaseInputsLoader.

Return type:

T

classmethod from_iterable(iterable)#

Transform an iterable into a concrete instance of a subclass of BaseInputsLoader.

Parameters:

iterable (Iterable[InputData]) – An iterable of input data.

Returns:

A concrete instance of a subclass of BaseInputsLoader.

Return type:

T

property input_shape: Union[Iterable[int], Dict[str, Iterable[int]]]#

Get the shape of the inputs in the inputs loader.

sample(seed, n_samples)[source]#

Sample from the inputs loader, with replacement.

Parameters:
  • seed (int) – Random seed.

  • n_samples (int) – Number of samples.

Returns:

An inputs loader made of the sampled inputs.

Return type:

InputsLoader

property size: int#

The number of data points in the inputs loader.

Returns:

Number of data points.

Return type:

int

split(n_data)[source]#

Split an inputs loader into two inputs loaders.

Parameters:

n_data (int) – Number of data points after which the inputs loader should be split. The first returned inputs loader will contain exactly n_data inputs. The second one will contain the remaining ones.

Returns:

The two inputs loaders made out of the original one.

Return type:

Tuple[InputsLoader, InputsLoader]

to_array_inputs()[source]#

Reduce an inputs loader to an array of inputs.

Returns:

Array of input data.

Return type:

Array

to_transformed_inputs_loader(transform, status=None)#

From an existing loader of inputs, create a loader with transformed inputs.

Parameters:
  • transform (Callable[[Array, Status], Tuple[Array, Status]]) – A transformation function. It takes a batch of inputs and returns their transformation.

  • status (Optional[Status]) – An initial status. This may include pre-computed objects used by the transformation.

Returns:

A concrete instance of a subclass of BaseInputsLoader.

Return type:

T

class fortuna.data.loader.TargetsLoader(iterable)[source]#
chop(divisor)[source]#

Chop the last part of each batch of the targets loader, to make sure the number of data points per batch is a multiple of divisor.

Parameters:

divisor (int) – The number of data points in each batch must be a multiple of this value.

Returns:

A targets loader with chopped batches.

Return type:

TargetsLoader

classmethod from_array_targets(targets, batch_size=None, shuffle=False, prefetch=False)[source]#

Build a TargetsLoader object from an array of target data.

Parameters:
  • targets (Array) – Target array of data.

  • batch_size (Optional[int]) – The batch size. If not given, the targets will not be batched.

  • shuffle (bool) – Whether the targets loader should shuffle at every call.

  • prefetch (bool) – Whether to prefetch the next batch.

Returns:

A targets loader built out of the array of targets.

Return type:

TargetsLoader

classmethod from_callable_iterable(fun)#

Transform a callable iterable into a concrete instance of a subclass of BaseTargetsLoader.

Parameters:

fun (Callable[[], Iterable[Union[Batch, InputData, Array]]]) – A callable iterable of target arrays.

Returns:

A concrete instance of a subclass of BaseTargetsLoader.

Return type:

T

classmethod from_data_loader(data_loader)#

Reduce a data loader to a targets loader.

Parameters:

data_loader (DataLoader) – A data loader.

Returns:

A concrete instance of a subclass of BaseTargetsLoader.

Return type:

T

classmethod from_iterable(iterable)#

Transform an iterable into a concrete instance of a subclass of BaseTargetsLoader.

Parameters:

iterable (Iterable[Array]) – An iterable of target arrays.

Returns:

A concrete instance of a subclass of BaseTargetsLoader.

Return type:

T

sample(seed, n_samples)[source]#

Sample from the targets loader, with replacement.

Parameters:
  • seed (int) – Random seed.

  • n_samples (int) – Number of samples.

Returns:

A targets loader made of the sampled targets.

Return type:

TargetsLoader

property size: int#

The number of data points in the targets loader.

Returns:

Number of data points.

Return type:

int

split(n_data)[source]#

Split a targets loader into two targets loaders.

Parameters:

n_data (int) – Number of data points after which the targets loader should be split. The first returned targets loader will contain exactly n_data targets. The second one will contain the remaining ones.

Returns:

The two targets loaders made out of the original one.

Return type:

Tuple[TargetsLoader, TargetsLoader]

to_array_targets()[source]#

Reduce a targets loader to an array of targets.

Returns:

Array of target data.

Return type:

Array

to_transformed_targets_loader(transform, status=None)#

From an existing loader of targets, create a loader with transformed targets.

Parameters:
  • transform (Callable[[Array, Status], Tuple[Array, Status]]) – A transformation function. It takes a batch of targets and returns their transformation. A status may be updated during the process.

  • status (Optional[Status]) – An initial status. This may include pre-computed objects used by the transformation.

Returns:

A concrete instance of a subclass of BaseTargetsLoader.

Return type:

T
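
To illustrate a status that is updated during the process, the NumPy sketch below one-hot encodes each batch of targets while accumulating per-class counts in the status. The function one_hot_and_count, the explicit loop, and the assumption that the updated status is handed to the next batch are all illustrative and do not describe Fortuna's internals.

```python
import numpy as np

def one_hot_and_count(targets, status):
    """One-hot encode integer targets and add their class counts to status."""
    n_classes = len(status)
    one_hot = np.eye(n_classes, dtype=int)[targets]
    return one_hot, status + one_hot.sum(axis=0)

# Three toy batches of integer class labels in {0, 1, 2}.
target_batches = [np.array([0, 1, 1]), np.array([2, 1]), np.array([0])]

status = np.zeros(3, dtype=int)  # initial status: no labels seen yet
transformed = []
for targets in target_batches:
    new_targets, status = one_hot_and_count(targets, status)
    transformed.append(new_targets)
```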