Data loader¶

This section describes Fortuna’s data loader functionality. A DataLoader object is an iterable of two-element tuples of arrays (either NumPy arrays or JAX NumPy arrays), where the first component is a batch of input variables and the second component is the corresponding batch of target variables. If you have a data loader of TensorFlow or PyTorch tensors, or of another type, you can convert it into something Fortuna can consume using the appropriate DataLoader functionality (see from_tensorflow_data_loader() and from_torch_data_loader()).

A DataLoader also allows you to generate an InputsLoader or a TargetsLoader, i.e. data loaders of only input variables or only target variables, respectively (see to_inputs_loader() and to_targets_loader()). Additionally, you can convert a data loader into an array of inputs, an array of targets, or a tuple of input and target arrays (see to_array_inputs(), to_array_targets() and to_array_data()).

class fortuna.data.loader.DataLoader(iterable, num_unique_labels=None)[source]¶

chop(divisor)[source]¶

Chop the last part of each batch of the data loader, so that the number of data points per batch is a multiple of divisor.

Parameters:: divisor (int) – The number by which the size of each batch must be divisible.
Returns:: A data loader with chopped batches.
Return type:: DataLoader
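
The chopping rule can be illustrated with plain NumPy. This is a minimal sketch of the behaviour described above, not Fortuna's implementation: `chop_batches` is a hypothetical helper, and it assumes that each batch is trimmed at the end so that its size becomes a multiple of `divisor`.

```python
import numpy as np


def chop_batches(batches, divisor):
    """Trim trailing data points so each batch size is a multiple of `divisor`.

    Hypothetical helper for illustration only; not part of Fortuna.
    """
    for inputs, targets in batches:
        remainder = targets.shape[0] % divisor
        if remainder == 0:
            # The batch size is already a multiple of the divisor.
            yield inputs, targets
        else:
            # Drop the last `remainder` data points of the batch.
            yield inputs[:-remainder], targets[:-remainder]


batches = [
    (np.zeros((8, 3)), np.zeros(8)),
    (np.zeros((6, 3)), np.zeros(6)),
]
chopped = list(chop_batches(batches, divisor=4))
sizes = [targets.shape[0] for _, targets in chopped]  # [8, 4]
```

Trimming of this kind is typically useful when every batch has to be split evenly, for example across several devices.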

classmethod from_array_data(data, batch_size=None, shuffle=False, prefetch=False)[source]¶

Build a DataLoader object from a tuple of arrays of input and target variables, respectively.

Parameters:

data (Batch) – Input and target arrays of data.
batch_size (Optional[int]) – The batch size. If not given, the data will not be batched.
shuffle (bool) – Whether the data loader should shuffle at every call.
prefetch (bool) – Whether to prefetch the next batch.

Returns:

A data loader built out of the tuple of arrays.

Return type:

DataLoader
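
To make the roles of batch_size and shuffle concrete, the following sketch batches a tuple of arrays with plain NumPy. It is an illustration under stated assumptions rather than Fortuna's code: `iterate_batches` and its `seed` argument are hypothetical, and prefetching is left out.

```python
import numpy as np


def iterate_batches(data, batch_size=None, shuffle=False, seed=0):
    """Yield (inputs, targets) batches from a tuple of arrays.

    Hypothetical helper for illustration only; `seed` is an assumption
    and is not a parameter of from_array_data.
    """
    inputs, targets = data
    n = inputs.shape[0]
    indices = np.arange(n)
    if shuffle:
        indices = np.random.default_rng(seed).permutation(n)
    if batch_size is None:
        # No batching: the whole data set is returned as a single batch.
        batch_size = n
    for start in range(0, n, batch_size):
        idx = indices[start:start + batch_size]
        yield inputs[idx], targets[idx]


inputs = np.arange(20, dtype=float).reshape(10, 2)
targets = np.arange(10)
batch_shapes = [
    (x.shape, y.shape)
    for x, y in iterate_batches((inputs, targets), batch_size=4)
]
```

With 10 data points and a batch size of 4, the last batch holds the 2 remaining points. Going by the signature above, the corresponding Fortuna call would be DataLoader.from_array_data((inputs, targets), batch_size=4).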

classmethod from_callable_iterable(fun)¶

Transform a callable iterable into a concrete instance of a subclass of BaseDataLoader.

Parameters:: fun (Callable[[], Iterable[Batch]]) – A callable iterable of tuples of input and target arrays.
Returns:: A concrete instance of a subclass of BaseDataLoader.
Return type:: T
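
One plausible reason to accept a callable rather than a generator object is that a generator can be consumed only once, while a callable can produce a fresh iterator for every pass over the data. The sketch below shows this in plain Python. `make_batches` and `ReiterableLoader` are hypothetical names, and the rationale is an assumption, not a statement about Fortuna's internals.

```python
import numpy as np


def make_batches():
    """Return a fresh generator of (inputs, targets) batches on every call."""
    for start in range(0, 6, 2):
        inputs = np.arange(start, start + 2, dtype=float).reshape(2, 1)
        targets = np.arange(start, start + 2)
        yield inputs, targets


# A generator object is exhausted after the first pass.
one_shot = make_batches()
first_pass = sum(1 for _ in one_shot)   # 3 batches
second_pass = sum(1 for _ in one_shot)  # 0 batches


class ReiterableLoader:
    """Hypothetical wrapper that calls `fun` anew on every iteration."""

    def __init__(self, fun):
        self._fun = fun

    def __iter__(self):
        return iter(self._fun())


loader = ReiterableLoader(make_batches)
passes = [sum(1 for _ in loader) for _ in range(2)]  # [3, 3]
```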

classmethod from_inputs_loaders(inputs_loaders, targets, how='interpose')¶

Transform a list of inputs loaders into a concrete instance of a subclass of BaseDataLoader. The newly created data loader is formed out of concatenated batches of inputs and the respective assigned target variable.

Parameters:

inputs_loaders (List[BaseInputsLoader]) – A list of inputs loaders.
targets (List[int]) – A target variable for each inputs loader.
how (str) – How the inputs_loaders will be combined: ‘interpose’ will interpose the inputs_loaders based on their batch sizes; ‘concatenate’ will ignore batch sizes and concatenate them.

Returns:

A concrete instance of a subclass of BaseDataLoader. The data loader object is formed by the concatenated batches of inputs, and the assigned targets.

Return type:

classmethod from_iterable(iterable)¶

Transform an iterable into a concrete instance of a subclass of BaseDataLoader.

Parameters:: iterable (Iterable[Batch]) – An iterable of tuples of input and target arrays.
Returns:: A concrete instance of a subclass of BaseDataLoader.
Return type:: T

classmethod from_tensorflow_data_loader(tf_data_loader)¶

Transform a TensorFlow data loader into a concrete instance of a subclass of BaseDataLoader.

Parameters:: tf_data_loader – A TensorFlow data loader where each batch is a tuple of input and target Tensors.
Returns:: A concrete instance of a subclass of BaseDataLoader.
Return type:: T

classmethod from_torch_data_loader(torch_data_loader)¶

Transform a PyTorch data loader into a concrete instance of a subclass of BaseDataLoader.

Parameters:: torch_data_loader – A PyTorch data loader where each batch is a tuple of input and target Tensors.
Returns:: A concrete instance of a subclass of BaseDataLoader.
Return type:: T

property input_shape: Iterable[int] | Dict[str, Iterable[int]]¶

Get the shape of the inputs in the data loader.

property num_unique_labels: int | None¶

Number of unique target labels in the task (classification only).

Returns:: Number of unique target labels in the task if it is a classification one. Otherwise returns None.
Return type:: int | None

sample(seed, n_samples)[source]¶

Sample from the data loader, with replacement.

Parameters:

seed (int) – Random seed.
n_samples (int) – Number of samples.

Returns:

A data loader made of the sampled data points.

Return type:

DataLoader
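
Sampling with replacement means that the same data point may be drawn more than once, so n_samples can exceed the size of the data set. The sketch below shows the idea on plain arrays. `sample_with_replacement` is a hypothetical helper, and the use of NumPy's default_rng is an assumption that says nothing about how Fortuna generates its random numbers.

```python
import numpy as np


def sample_with_replacement(inputs, targets, seed, n_samples):
    """Draw `n_samples` (input, target) pairs uniformly, with replacement.

    Hypothetical helper for illustration only; not part of Fortuna.
    """
    rng = np.random.default_rng(seed)
    # Indices may repeat because sampling is done with replacement.
    idx = rng.integers(0, inputs.shape[0], size=n_samples)
    return inputs[idx], targets[idx]


inputs = np.arange(10, dtype=float).reshape(5, 2)
targets = np.arange(5)
x1, y1 = sample_with_replacement(inputs, targets, seed=0, n_samples=8)
x2, y2 = sample_with_replacement(inputs, targets, seed=0, n_samples=8)
```

Using the same seed twice yields the same sample, and inputs and targets stay paired because both are indexed with the same indices.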

property size: int¶

The number of data points in the data loader.

Returns:: Number of data points.
Return type:: int

split(n_data)[source]¶

Split a data loader into two data loaders.

Parameters:: n_data (int) – Number of data points after which the data loader should be split. The first returned data loader will contain exactly n_data data points. The second one will contain the remaining ones.
Returns:: The two data loaders made out of the original one.
Return type:: Tuple[DataLoader, DataLoader]
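
When n_data does not fall on a batch boundary, one batch has to be cut in two. The following sketch shows one way to do this on a list of batches. `split_batches` is a hypothetical helper, and whether Fortuna handles the boundary batch in exactly this way is an assumption.

```python
import numpy as np


def split_batches(batches, n_data):
    """Split batches so that the first part holds exactly `n_data` points.

    Hypothetical helper for illustration only; not part of Fortuna.
    """
    first, second = [], []
    count = 0  # Number of data points seen before the current batch.
    for inputs, targets in batches:
        size = targets.shape[0]
        if count + size <= n_data:
            first.append((inputs, targets))
        elif count >= n_data:
            second.append((inputs, targets))
        else:
            # The split point falls inside this batch.
            cut = n_data - count
            first.append((inputs[:cut], targets[:cut]))
            second.append((inputs[cut:], targets[cut:]))
        count += size
    return first, second


batches = [(np.zeros((4, 2)), np.arange(i, i + 4)) for i in range(0, 12, 4)]
first, second = split_batches(batches, n_data=6)
first_sizes = [t.shape[0] for _, t in first]    # [4, 2]
second_sizes = [t.shape[0] for _, t in second]  # [2, 4]
```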

to_array_data()[source]¶

Reduce a data loader to a tuple of input and target arrays.

Returns:: Tuple of input and target arrays.
Return type:: Batch
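
Reducing a loader to arrays amounts to concatenating its batches along the first axis. A minimal sketch with plain NumPy follows. `batches_to_arrays` is a hypothetical helper, and it assumes that inputs are arrays rather than dictionaries of arrays.

```python
import numpy as np


def batches_to_arrays(batches):
    """Concatenate all batches into one input array and one target array.

    Hypothetical helper for illustration only; not part of Fortuna.
    """
    inputs, targets = zip(*batches)
    return np.concatenate(inputs, axis=0), np.concatenate(targets, axis=0)


batches = [
    (np.ones((4, 3)), np.zeros(4, dtype=int)),
    (np.ones((2, 3)), np.ones(2, dtype=int)),
]
all_inputs, all_targets = batches_to_arrays(batches)
```

Note that the whole data set then lives in memory at once, which may not be practical for large data sets.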

to_array_inputs()[source]¶

Reduce a data loader to an array of input data.

Returns:: Array of input data.
Return type:: Array

to_array_targets()[source]¶

Reduce a data loader to an array of target data.

Returns:: Array of target data.
Return type:: Array

to_inputs_loader()[source]¶

Reduce a data loader to an inputs loader.

Returns:: The inputs loader derived from the data loader.
Return type:: InputsLoader

to_targets_loader()[source]¶

Reduce a data loader to a targets loader.

Returns:: The targets loader derived from the data loader.
Return type:: TargetsLoader
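
Conceptually, to_inputs_loader() and to_targets_loader() keep one component of every (inputs, targets) batch and drop the other. The sketch below expresses this with two generators. `inputs_only` and `targets_only` are hypothetical helpers that operate on a plain list of batches, not on Fortuna objects.

```python
import numpy as np


def inputs_only(batches):
    """Yield only the input component of each batch."""
    for inputs, _ in batches:
        yield inputs


def targets_only(batches):
    """Yield only the target component of each batch."""
    for _, targets in batches:
        yield targets


batches = [
    (np.full((2, 3), i, dtype=float), np.array([i, i]))
    for i in range(3)
]
input_batches = list(inputs_only(batches))
target_batches = list(targets_only(batches))
```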

to_transformed_data_loader(transform, status=None)¶

Transform the batches of an existing data loader.

Parameters:

transform (Callable[[InputData, Array, Status], Tuple[InputData, Array, Status]]) – A transformation function. It takes a batch and returns its transformation. A status may be updated during the process.
status (Optional[Status]) – An initial status. This may include pre-computed objects used by the transformation.

Returns:

A concrete instance of a subclass of BaseDataLoader.
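
The signature of transform can be illustrated with a small stand-in. In this sketch the status is assumed to be a dictionary holding a precomputed mean and a batch counter, and the status is assumed to be carried over from one batch to the next. Neither assumption is confirmed by the description above, and `center_inputs` and `transformed_batches` are hypothetical names.

```python
import numpy as np


def center_inputs(inputs, targets, status):
    """Subtract a precomputed mean from the inputs and count the batch.

    Follows the (inputs, targets, status) -> (inputs, targets, status)
    shape of `transform`; the dictionary status is an assumption.
    """
    centered = inputs - status["mean"]
    new_status = {**status, "n_batches": status["n_batches"] + 1}
    return centered, targets, new_status


def transformed_batches(batches, transform, status=None):
    """Apply `transform` to every batch, threading the status through.

    Hypothetical helper for illustration only; not part of Fortuna.
    """
    for inputs, targets in batches:
        inputs, targets, status = transform(inputs, targets, status)
        yield inputs, targets


batches = [(np.full((2, 2), 3.0), np.array([0, 1])) for _ in range(2)]
status = {"mean": 1.0, "n_batches": 0}
out = list(transformed_batches(batches, center_inputs, status))
```

Here the initial status carries an object computed ahead of time (the mean), which matches the description of status as something that may include pre-computed objects used by the transformation.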

Return type:

class fortuna.data.loader.InputsLoader(iterable)[source]¶

chop(divisor)[source]¶

Chop the last part of each batch of the inputs loader, so that the number of data points per batch is a multiple of divisor.

Parameters:: divisor (int) – The number by which the size of each batch must be divisible.
Returns:: An inputs loader with chopped batches.
Return type:: InputsLoader

classmethod from_array_inputs(inputs, batch_size=None, shuffle=False, prefetch=False)[source]¶

Build an InputsLoader object from an array of input data.

Parameters:

inputs (Array) – Input array of data.
batch_size (Optional[int]) – The batch size. If not given, the inputs will not be batched.
shuffle (bool) – Whether the inputs loader should shuffle at every call.
prefetch (bool) – Whether to prefetch the next batch.

Returns:

An inputs loader built out of the array of inputs.

Return type:

InputsLoader

classmethod from_callable_iterable(fun)¶

Transform a callable iterable into a concrete instance of a subclass of BaseInputsLoader.

Parameters:: fun (Callable[[], Iterable[InputData]]) – A callable iterable of input data.
Returns:: A concrete instance of a subclass of BaseInputsLoader.
Return type:: T

classmethod from_data_loader(data_loader)¶

Reduce a data loader to an inputs loader.

Parameters:: data_loader (DataLoader) – A data loader.
Returns:: A concrete instance of a subclass of BaseInputsLoader.
Return type:: T

classmethod from_iterable(iterable)¶

Transform an iterable into a concrete instance of a subclass of BaseInputsLoader.

Parameters:: iterable (Iterable[InputData]) – An iterable of input data.
Returns:: A concrete instance of a subclass of BaseInputsLoader.
Return type:: T

property input_shape: Iterable[int] | Dict[str, Iterable[int]]¶

Get the shape of the inputs in the inputs loader.

sample(seed, n_samples)[source]¶

Sample from the inputs loader, with replacement.

Parameters:

seed (int) – Random seed.
n_samples (int) – Number of samples.

Returns:

An inputs loader made of the sampled inputs.

Return type:

InputsLoader

property size: int¶

The number of data points in the inputs loader.

Returns:: Number of data points.
Return type:: int

split(n_data)[source]¶

Split an inputs loader into two inputs loaders.

Parameters:: n_data (int) – Number of data points after which the inputs loader should be split. The first returned inputs loader will contain exactly n_data inputs. The second one will contain the remaining ones.
Returns:: The two inputs loaders made out of the original one.
Return type:: Tuple[InputsLoader, InputsLoader]

to_array_inputs()[source]¶

Reduce an inputs loader to an array of inputs.

Returns:: Array of input data.
Return type:: Array

to_transformed_inputs_loader(transform, status=None)¶

From an existing loader of inputs, create a loader with transformed inputs.

Parameters:

transform (Callable[[Array, Status], Tuple[Array, Status]]) – A transformation function. It takes a batch of inputs and returns their transformation.
status (Optional[Status]) – An initial status. This may include pre-computed objects used by the transformation.

Returns:

A concrete instance of a subclass of BaseInputsLoader.

Return type:

class fortuna.data.loader.TargetsLoader(iterable)[source]¶

chop(divisor)[source]¶

Chop the last part of each batch of the targets loader, so that the number of data points per batch is a multiple of divisor.

Parameters:: divisor (int) – The number by which the size of each batch must be divisible.
Returns:: A targets loader with chopped batches.
Return type:: TargetsLoader

classmethod from_array_targets(targets, batch_size=None, shuffle=False, prefetch=False)[source]¶

Build a TargetsLoader object from an array of target data.

Parameters:

targets (Array) – Target array of data.
batch_size (Optional[int]) – The batch size. If not given, the targets will not be batched.
shuffle (bool) – Whether the targets loader should shuffle at every call.
prefetch (bool) – Whether to prefetch the next batch.

Returns:

A targets loader built out of the array of targets.

Return type:

TargetsLoader

classmethod from_callable_iterable(fun)¶

Transform a callable iterable into a concrete instance of a subclass of BaseTargetsLoader.

Parameters:: fun (Callable[[], Iterable[Union[Batch, InputData, Array]]]) – A callable iterable of target arrays.
Returns:: A concrete instance of a subclass of BaseTargetsLoader.
Return type:: T

classmethod from_data_loader(data_loader)¶

Reduce a data loader to a targets loader.

Parameters:: data_loader (DataLoader) – A data loader.
Returns:: A concrete instance of a subclass of BaseTargetsLoader.
Return type:: T

classmethod from_iterable(iterable)¶

Transform an iterable into a concrete instance of a subclass of BaseTargetsLoader.

Parameters:: iterable (Iterable[Array]) – An iterable of target arrays.
Returns:: A concrete instance of a subclass of BaseTargetsLoader.
Return type:: T

sample(seed, n_samples)[source]¶

Sample from the targets loader, with replacement.

Parameters:

seed (int) – Random seed.
n_samples (int) – Number of samples.

Returns:

A targets loader made of the sampled targets.

Return type:

TargetsLoader

property size: int¶

The number of data points in the targets loader.

Returns:: Number of data points.
Return type:: int

split(n_data)[source]¶

Split a targets loader into two targets loaders.

Parameters:: n_data (int) – Number of data points after which the targets loader should be split. The first returned targets loader will contain exactly n_data targets. The second one will contain the remaining ones.
Returns:: The two targets loaders made out of the original one.
Return type:: Tuple[TargetsLoader, TargetsLoader]

to_array_targets()[source]¶

Reduce a targets loader to an array of targets.

Returns:: Array of target data.
Return type:: Array

to_transformed_targets_loader(transform, status=None)¶

From an existing loader of targets, create a loader with transformed targets.

Parameters:

transform (Callable[[Array, Status], Tuple[Array, Status]]) – A transformation function. It takes a batch of targets and returns their transformation. A status may be updated during the process.
status (Optional[Status]) – An initial status. This may include pre-computed objects used by the transformation.

Returns:

A concrete instance of a subclass of BaseTargetsLoader.

Return type: