Data loader
This section describes Fortuna’s data loader functionalities. A DataLoader object
is an iterable of two-element tuples of arrays (either NumPy arrays or JAX NumPy arrays),
where the first component holds the input variables and the second holds the target variables. If you have a data loader
of TensorFlow or PyTorch tensors, or of another type, you can convert it into something Fortuna can consume using
the appropriate DataLoader functionality
(see from_tensorflow_data_loader() and from_torch_data_loader()).
A DataLoader also allows you to generate an InputsLoader or a
TargetsLoader, i.e. data loaders of only input variables or only target variables, respectively
(see to_inputs_loader() and to_targets_loader()).
Additionally, you can convert a data loader into an array of inputs, an array of targets, or a tuple of input and target
arrays (see to_array_inputs(), to_array_targets() and to_array_data()).
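To make the structure concrete, here is a minimal NumPy-only sketch of the kind of iterable a data loader wraps: a sequence of (inputs, targets) batches. The helper name `batches` and the array shapes are illustrative assumptions and not part of Fortuna. With Fortuna installed, from_array_data() (documented below) is the intended way to build such a loader from arrays.

```python
import numpy as np


def batches(inputs, targets, batch_size):
    """Yield (inputs, targets) tuples, one per batch (hypothetical helper)."""
    for start in range(0, len(inputs), batch_size):
        yield inputs[start:start + batch_size], targets[start:start + batch_size]


inputs = np.arange(20, dtype=float).reshape(10, 2)  # 10 data points, 2 features
targets = np.arange(10) % 2                         # binary labels

# Each element of the iterable is a two-element tuple of arrays.
batch_shapes = [(x.shape, y.shape) for x, y in batches(inputs, targets, batch_size=4)]
# The last batch is smaller because 10 is not a multiple of 4.
```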
- class fortuna.data.loader.DataLoader(iterable, num_unique_labels=None)
- chop(divisor)
Chop the last part of each batch of the data loader, so that the number of data points per batch is a multiple of divisor.
- Parameters:
divisor (int) – The number of data points in each batch must be a multiple of this number.
- Returns:
A data loader with chopped batches.
- Return type:
DataLoader
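The chopping behaviour described for chop() can be sketched with plain NumPy as follows. This is an illustration of the idea, not Fortuna's implementation: the helper `chop_batches` is hypothetical, and dropping batches that are smaller than the divisor is an assumption of this sketch.

```python
import numpy as np


def chop_batches(batches, divisor):
    """Trim each batch so that its size is a multiple of `divisor` (hypothetical helper)."""
    for inputs, targets in batches:
        keep = (len(targets) // divisor) * divisor
        if keep > 0:  # assumption: batches smaller than `divisor` are dropped
            yield inputs[:keep], targets[:keep]


data = [
    (np.zeros((10, 3)), np.zeros(10)),  # batch of 10 points
    (np.zeros((7, 3)), np.zeros(7)),    # batch of 7 points
]
chopped_sizes = [len(y) for _, y in chop_batches(data, divisor=4)]
```

Trimming of this kind is typically useful when each batch has to be divided evenly, for example across several devices.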
- classmethod from_array_data(data, batch_size=None, shuffle=False, prefetch=False)
Build a DataLoader object from a tuple of arrays of input and target variables, respectively.
- Parameters:
data (Batch) – Input and target arrays of data.
batch_size (Optional[int]) – The batch size. If not given, the data will not be batched.
shuffle (bool) – Whether the data loader should shuffle at every call.
prefetch (bool) – Whether to prefetch the next batch.
- Returns:
A data loader built out of the tuple of arrays.
- Return type:
DataLoader
- classmethod from_callable_iterable(fun)
Transform a callable iterable into a concrete instance of a subclass of BaseDataLoader.
- Parameters:
fun (Callable[[], Iterable[Batch]]) – A callable iterable of tuples of input and target arrays.
- Returns:
A concrete instance of a subclass of BaseDataLoader.
- Return type:
T
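A callable iterable is a function that returns a fresh iterable each time it is called, which lets a loader be traversed more than once. The sketch below illustrates the idea with a generator function; `CallableLoader` and `make_batches` are hypothetical stand-ins, not Fortuna classes.

```python
import numpy as np


def make_batches():
    """A callable that returns a fresh iterator over (inputs, targets) batches."""
    for start in range(0, 6, 3):
        x = np.arange(start, start + 3, dtype=float).reshape(3, 1)
        yield x, (x[:, 0] > 2).astype(int)


class CallableLoader:
    """Hypothetical stand-in for a loader built from a callable iterable."""

    def __init__(self, fun):
        self._fun = fun

    def __iter__(self):
        # Call the function again on every pass, so the data is never exhausted.
        return iter(self._fun())


loader = CallableLoader(make_batches)
first_pass = sum(len(y) for _, y in loader)
second_pass = sum(len(y) for _, y in loader)

# By contrast, a bare generator can only be consumed once.
gen = make_batches()
consumed = len(list(gen))
leftover = len(list(gen))
```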
- classmethod from_inputs_loaders(inputs_loaders, targets, how='interpose')
Transform a list of inputs loaders into a concrete instance of a subclass of BaseDataLoader. The newly created data loader is formed out of concatenated batches of inputs and the respective assigned target variable.
- Parameters:
inputs_loaders (List[BaseInputsLoader]) – A list of inputs loaders.
targets (List[int]) – A target variable for each inputs loader.
how (str) – How the inputs_loaders will be combined: ‘interpose’ will interpose the inputs_loaders based on their batch sizes; ‘concatenate’ will ignore batch size and concatenate them.
- Returns:
A concrete instance of a subclass of BaseDataLoader. The data loader object is formed by the concatenated batches of inputs and the assigned targets.
- Return type:
T
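The two values of how can be read in the following way. This NumPy sketch shows one plausible interpretation of the description above and has not been checked against Fortuna's source; the function `combine` and the toy loaders are hypothetical.

```python
import numpy as np


def combine(inputs_loaders, targets, how="interpose"):
    """One plausible reading of the two modes (hypothetical, not Fortuna's code)."""
    if how == "interpose":
        # Take one batch from each loader in turn and merge them into a single batch.
        for group in zip(*inputs_loaders):
            x = np.concatenate(group, axis=0)
            y = np.concatenate([np.full(len(b), t) for b, t in zip(group, targets)])
            yield x, y
    elif how == "concatenate":
        # Go through the loaders one after the other, keeping their batches as they are.
        for loader, t in zip(inputs_loaders, targets):
            for b in loader:
                yield b, np.full(len(b), t)
    else:
        raise ValueError(f"Unknown value for `how`: {how}")


loader_a = [np.zeros((2, 3)), np.zeros((2, 3))]  # two batches of 2 inputs each
loader_b = [np.ones((1, 3)), np.ones((1, 3))]    # two batches of 1 input each

interposed = [y.tolist() for _, y in combine([loader_a, loader_b], [0, 1])]
concatenated = [
    y.tolist() for _, y in combine([loader_a, loader_b], [0, 1], how="concatenate")
]
```

Under this reading, ‘interpose’ yields batches that mix inputs from every loader, while ‘concatenate’ yields all batches with the first target before any batch with the second.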
- classmethod from_iterable(iterable)
Transform an iterable into a concrete instance of a subclass of BaseDataLoader.
- Parameters:
iterable (Iterable[Batch]) – An iterable of tuples of input and target arrays.
- Returns:
A concrete instance of a subclass of BaseDataLoader.
- Return type:
T
- classmethod from_tensorflow_data_loader(tf_data_loader)
Transform a TensorFlow data loader into a concrete instance of a subclass of BaseDataLoader.
- Parameters:
tf_data_loader – A TensorFlow data loader where each batch is a tuple of input and target Tensors.
- Returns:
A concrete instance of a subclass of BaseDataLoader.
- Return type:
T
- classmethod from_torch_data_loader(torch_data_loader)
Transform a PyTorch data loader into a concrete instance of a subclass of BaseDataLoader.
- Parameters:
torch_data_loader – A PyTorch data loader where each batch is a tuple of input and target Tensors.
- Returns:
A concrete instance of a subclass of BaseDataLoader.
- Return type:
T
- property input_shape: Union[Iterable[int], Dict[str, Iterable[int]]]
Get the shape of the inputs in the data loader.
- property num_unique_labels: Optional[int]
Number of unique target labels in the task (classification only).
- Returns:
Number of unique target labels in the task if it is a classification one. Otherwise returns None.
- Return type:
Optional[int]
- sample(seed, n_samples)
Sample from the data loader, with replacement.
- Parameters:
seed (int) – Random seed.
n_samples (int) – Number of samples.
- Returns:
A data loader made of the sampled data points.
- Return type:
DataLoader
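Sampling with replacement means that the same data point may be drawn more than once, and that the same seed reproduces the same draw. A minimal NumPy sketch of the idea is shown below; the helper `sample_with_replacement` is hypothetical, and Fortuna's own random number generation may differ.

```python
import numpy as np


def sample_with_replacement(inputs, targets, seed, n_samples):
    """Draw `n_samples` (input, target) pairs, allowing repeats (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(targets), size=n_samples)  # indices may repeat
    return inputs[idx], targets[idx]


inputs = np.arange(10).reshape(5, 2)  # row i is [2 * i, 2 * i + 1]
targets = np.arange(5)

# n_samples can exceed the number of data points because draws are with replacement.
x, y = sample_with_replacement(inputs, targets, seed=0, n_samples=8)
x_again, y_again = sample_with_replacement(inputs, targets, seed=0, n_samples=8)
```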
- property size: int
The number of data points in the data loader.
- Returns:
Number of data points.
- Return type:
int
- split(n_data)
Split a data loader into two data loaders.
- Parameters:
n_data (int) – Number of data points after which the data loader should be split. The first returned data loader will contain exactly n_data data points. The second one will contain the remaining ones.
- Returns:
The two data loaders made out of the original one.
- Return type:
Tuple[DataLoader, DataLoader]
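Because the data is batched, the split point can fall in the middle of a batch. The following NumPy sketch shows one way to handle that case; `split_batches` is a hypothetical helper, and how Fortuna arranges the batches of the two resulting loaders is not specified here.

```python
import numpy as np


def split_batches(batches, n_data):
    """Split batched data after exactly `n_data` points (hypothetical helper)."""
    first, second, seen = [], [], 0
    for x, y in batches:
        # Number of points of this batch that still belong to the first part.
        take = max(0, min(len(y), n_data - seen))
        if take > 0:
            first.append((x[:take], y[:take]))
        if take < len(y):
            second.append((x[take:], y[take:]))
        seen += len(y)
    return first, second


batches = [
    (np.zeros((4, 2)), np.zeros(4)),
    (np.zeros((4, 2)), np.zeros(4)),
    (np.zeros((2, 2)), np.zeros(2)),
]
first, second = split_batches(batches, n_data=6)
first_sizes = [len(y) for _, y in first]
second_sizes = [len(y) for _, y in second]
```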
- to_array_data()
Reduce a data loader to a tuple of input and target arrays.
- Returns:
Tuple of input and target arrays.
- Return type:
Batch
- to_array_inputs()
Reduce a data loader to an array of input data.
- Returns:
Array of input data.
- Return type:
Array
- to_array_targets()
Reduce a data loader to an array of target data.
- Returns:
Array of target data.
- Return type:
Array
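Conceptually, the three to_array_* methods collapse the batches back into full arrays. A minimal sketch with NumPy, assuming the batches are concatenated along the first axis:

```python
import numpy as np

# Two batches of (inputs, targets) with 4 and 3 data points.
batches = [
    (np.zeros((4, 2)), np.zeros(4)),
    (np.ones((3, 2)), np.ones(3)),
]

# Stack the batches along the first (data point) axis.
inputs = np.concatenate([x for x, _ in batches], axis=0)
targets = np.concatenate([y for _, y in batches], axis=0)
data = (inputs, targets)  # analogous to a tuple of input and target arrays
```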
- to_inputs_loader()
Reduce a data loader to an inputs loader.
- Returns:
The inputs loader derived from the data loader.
- Return type:
InputsLoader
- to_targets_loader()
Reduce a data loader to a targets loader.
- Returns:
The targets loader derived from the data loader.
- Return type:
TargetsLoader
- to_transformed_data_loader(transform, status=None)
Transform the batches of an existing data loader.
- Parameters:
transform (Callable[[InputData, Array, Status], Tuple[InputData, Array, Status]]) – A transformation function. It takes a batch and returns its transformation. A status may be updated during the process.
status (Optional[Status]) – An initial status. This may include pre-computed objects used by the transformation.
- Returns:
A concrete instance of a subclass of BaseDataLoader.
- Return type:
T
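The transform signature takes the inputs, the targets and a status, and returns all three. The sketch below illustrates how a status could carry a pre-computed object and be updated from batch to batch. Representing the status as a dictionary, and the helpers `center` and `apply_transform`, are assumptions made for illustration only.

```python
import numpy as np


def center(inputs, targets, status):
    """Subtract a pre-computed mean and count the batches seen so far."""
    new_status = {"mean": status["mean"], "n_batches": status["n_batches"] + 1}
    return inputs - status["mean"], targets, new_status


def apply_transform(batches, transform, status):
    """Apply `transform` to every batch, threading the status through (hypothetical)."""
    out = []
    for inputs, targets in batches:
        inputs, targets, status = transform(inputs, targets, status)
        out.append((inputs, targets))
    return out, status


batches = [
    (np.full((2, 2), 3.0), np.zeros(2)),
    (np.full((2, 2), 5.0), np.ones(2)),
]
initial_status = {"mean": 4.0, "n_batches": 0}  # pre-computed mean of the inputs
out, final_status = apply_transform(batches, center, initial_status)
```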
- class fortuna.data.loader.InputsLoader(iterable)
- chop(divisor)
Chop the last part of each batch of the inputs loader, so that the number of data points per batch is a multiple of divisor.
- Parameters:
divisor (int) – The number of data points in each batch must be a multiple of this number.
- Returns:
An inputs loader with chopped batches.
- Return type:
InputsLoader
- classmethod from_array_inputs(inputs, batch_size=None, shuffle=False, prefetch=False)
Build an InputsLoader object from an array of input data.
- Parameters:
inputs (Array) – Input array of data.
batch_size (Optional[int]) – The batch size. If not given, the inputs will not be batched.
shuffle (bool) – Whether the inputs loader should shuffle at every call.
prefetch (bool) – Whether to prefetch the next batch.
- Returns:
An inputs loader built out of the array of inputs.
- Return type:
InputsLoader
- classmethod from_callable_iterable(fun)
Transform a callable iterable into a concrete instance of a subclass of BaseInputsLoader.
- Parameters:
fun (Callable[[], Iterable[InputData]]) – A callable iterable of input data.
- Returns:
A concrete instance of a subclass of BaseInputsLoader.
- Return type:
T
- classmethod from_data_loader(data_loader)
Reduce a data loader to an inputs loader.
- Parameters:
data_loader (DataLoader) – A data loader.
- Returns:
A concrete instance of a subclass of BaseInputsLoader.
- Return type:
T
- classmethod from_iterable(iterable)
Transform an iterable into a concrete instance of a subclass of BaseInputsLoader.
- Parameters:
iterable (Iterable[InputData]) – An iterable of input data.
- Returns:
A concrete instance of a subclass of BaseInputsLoader.
- Return type:
T
- property input_shape: Union[Iterable[int], Dict[str, Iterable[int]]]
Get the shape of the inputs in the inputs loader.
- sample(seed, n_samples)
Sample from the inputs loader, with replacement.
- Parameters:
seed (int) – Random seed.
n_samples (int) – Number of samples.
- Returns:
An inputs loader made of the sampled inputs.
- Return type:
InputsLoader
- property size: int
The number of data points in the inputs loader.
- Returns:
Number of data points.
- Return type:
int
- split(n_data)
Split an inputs loader into two inputs loaders.
- Parameters:
n_data (int) – Number of data points after which the inputs loader should be split. The first returned inputs loader will contain exactly n_data inputs. The second one will contain the remaining ones.
- Returns:
The two inputs loaders made out of the original one.
- Return type:
Tuple[InputsLoader, InputsLoader]
- to_array_inputs()
Reduce an inputs loader to an array of inputs.
- Returns:
Array of input data.
- Return type:
Array
- to_transformed_inputs_loader(transform, status=None)
From an existing loader of inputs, create a loader with transformed inputs.
- Parameters:
transform (Callable[[Array, Status], Tuple[Array, Status]]) – A transformation function. It takes a batch of inputs and returns their transformation.
status (Optional[Status]) – An initial status. This may include pre-computed objects used by the transformation.
- Returns:
A concrete instance of a subclass of BaseInputsLoader.
- Return type:
T
- class fortuna.data.loader.TargetsLoader(iterable)
- chop(divisor)
Chop the last part of each batch of the targets loader, so that the number of data points per batch is a multiple of divisor.
- Parameters:
divisor (int) – The number of data points in each batch must be a multiple of this number.
- Returns:
A targets loader with chopped batches.
- Return type:
TargetsLoader
- classmethod from_array_targets(targets, batch_size=None, shuffle=False, prefetch=False)
Build a TargetsLoader object from an array of target data.
- Parameters:
targets (Array) – Target array of data.
batch_size (Optional[int]) – The batch size. If not given, the targets will not be batched.
shuffle (bool) – Whether the targets loader should shuffle at every call.
prefetch (bool) – Whether to prefetch the next batch.
- Returns:
A targets loader built out of the array of targets.
- Return type:
TargetsLoader
- classmethod from_callable_iterable(fun)
Transform a callable iterable into a concrete instance of a subclass of BaseTargetsLoader.
- Parameters:
fun (Callable[[], Iterable[Union[Batch, InputData, Array]]]) – A callable iterable of target arrays.
- Returns:
A concrete instance of a subclass of BaseTargetsLoader.
- Return type:
T
- classmethod from_data_loader(data_loader)
Reduce a data loader to a targets loader.
- Parameters:
data_loader (DataLoader) – A data loader.
- Returns:
A concrete instance of a subclass of BaseTargetsLoader.
- Return type:
T
- classmethod from_iterable(iterable)
Transform an iterable into a concrete instance of a subclass of BaseTargetsLoader.
- Parameters:
iterable (Iterable[Array]) – An iterable of target arrays.
- Returns:
A concrete instance of a subclass of BaseTargetsLoader.
- Return type:
T
- sample(seed, n_samples)
Sample from the targets loader, with replacement.
- Parameters:
seed (int) – Random seed.
n_samples (int) – Number of samples.
- Returns:
A targets loader made of the sampled targets.
- Return type:
TargetsLoader
- property size: int
The number of data points in the targets loader.
- Returns:
Number of data points.
- Return type:
int
- split(n_data)
Split a targets loader into two targets loaders.
- Parameters:
n_data (int) – Number of data points after which the targets loader should be split. The first returned targets loader will contain exactly n_data targets. The second one will contain the remaining ones.
- Returns:
The two targets loaders made out of the original one.
- Return type:
Tuple[TargetsLoader, TargetsLoader]
- to_array_targets()
Reduce a targets loader to an array of targets.
- Returns:
Array of target data.
- Return type:
Array
- to_transformed_targets_loader(transform, status=None)
From an existing loader of targets, create a loader with transformed targets.
- Parameters:
transform (Callable[[Array, Status], Tuple[Array, Status]]) – A transformation function. It takes a batch of targets and returns their transformation. A status may be updated during the process.
status (Optional[Status]) – An initial status. This may include pre-computed objects used by the transformation.
- Returns:
A concrete instance of a subclass of BaseTargetsLoader.
- Return type:
T