torch_concepts.data.PendulumDataset
- class PendulumDataset(root: str | None = None, n_theta: int = 100, n_phi: int = 1000, seed: int = 42, concept_subset: list | None = None, label_descriptions: dict | None = None)
Procedurally generated pendulum scene dataset for regression.
Each sample is a rendered image of a pendulum with a light source casting a shadow. The concepts are the pendulum angle (theta) and the light angle (phi), both continuous. The regression task is to predict the x-coordinate of the pendulum ball.
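To make the regression target concrete, here is a minimal sketch of how the ball's x-coordinate could follow from theta. It assumes a rigid rod, theta given in degrees and measured from the downward vertical, and a pivot at `pivot_x`. The helper name, its parameters, and the angle convention are illustrative assumptions, not the library's actual rendering code.

```python
import math


def pendulum_ball_x(theta_deg, length=1.0, pivot_x=0.0):
    """Hypothetical helper: horizontal position of the pendulum ball.

    Assumes theta is in degrees, measured from the downward vertical,
    and that the rod has a fixed length. The source does not confirm
    any of these conventions.
    """
    return pivot_x + length * math.sin(math.radians(theta_deg))


# A pendulum hanging straight down sits directly under the pivot.
x_rest = pendulum_ball_x(0.0)

# Swinging to one side shifts the ball by length * sin(theta).
x_swing = pendulum_ball_x(30.0, length=2.0)
```

Under these assumptions the target depends on theta alone, while phi only changes the shadow in the image.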
- Parameters:
root (str, optional) – Root directory to store/load the dataset. If None, defaults to './data/pendulum'.
n_theta (int, optional) – Number of theta angle steps for generation. Default: 100
n_phi (int, optional) – Number of phi angle steps for generation. Default: 1000
seed (int, optional) – Random seed for reproducibility. Default: 42
concept_subset (list of str, optional) – Subset of concept names to use. Default: None (all concepts).
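The parameters describe n_theta and n_phi as numbers of angle steps. One plausible reading, which the source does not confirm, is that samples are drawn from a grid of (theta, phi) pairs, giving n_theta * n_phi samples in total. The sketch below illustrates that reading only. The function name `angle_grid` and both angle ranges are hypothetical.

```python
from itertools import product


def angle_grid(n_theta, n_phi, theta_range=(-40.0, 40.0), phi_range=(60.0, 120.0)):
    """Hypothetical: evenly spaced (theta, phi) pairs over illustrative ranges."""

    def steps(lo, hi, n):
        # n evenly spaced values from lo to hi inclusive.
        if n == 1:
            return [lo]
        return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

    thetas = steps(theta_range[0], theta_range[1], n_theta)
    phis = steps(phi_range[0], phi_range[1], n_phi)
    # Cartesian product: every theta is paired with every phi.
    return list(product(thetas, phis))


grid = angle_grid(n_theta=10, n_phi=10)
n_samples = len(grid)  # 10 * 10 pairs under the grid assumption
```

If this reading holds, the defaults (100 and 1000) would yield 100,000 images, so small values such as 10 and 10 are convenient for quick experiments.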
- concepts
Tensor of shape (n_samples, 3) containing [theta, phi, pendulum_x].
- Type:
Examples
>>> from torch_concepts.data import PendulumDataset
>>> dataset = PendulumDataset(root='./data/pendulum', n_theta=10, n_phi=10)
>>> sample = dataset[0]
>>> x = sample['inputs']['x']  # image tensor (C, H, W)
>>> c = sample['concepts']['c']  # [theta, phi, pendulum_x]
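The concept vector in a sample is documented as [theta, phi, pendulum_x]. The following sketch shows one way to attach names to those three values. It uses a plain-Python stand-in for the sample so that it runs without the library. `CONCEPT_ORDER` and `name_concepts` are hypothetical names, and the numeric values are made up for illustration.

```python
# Order taken from the documented layout of the concept vector.
CONCEPT_ORDER = ("theta", "phi", "pendulum_x")


def name_concepts(c):
    """Hypothetical helper: map a length-3 concept vector to a name -> value dict."""
    values = [float(v) for v in c]
    if len(values) != len(CONCEPT_ORDER):
        raise ValueError(f"expected {len(CONCEPT_ORDER)} values, got {len(values)}")
    return dict(zip(CONCEPT_ORDER, values))


# Stand-in with the same nesting as the documented sample; values are illustrative.
sample = {"inputs": {"x": None}, "concepts": {"c": [12.5, 95.0, 0.22]}}
named = name_concepts(sample["concepts"]["c"])
```

When concept_subset is set, the vector may hold fewer entries or a different order, so the dataset's concept_names attribute is the safer source of names in that case.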
- __init__(root: str | None = None, n_theta: int = 100, n_phi: int = 1000, seed: int = 42, concept_subset: list | None = None, label_descriptions: dict | None = None)
Methods
__init__([root, n_theta, n_phi, seed, ...])
add_exogenous(name, value[, convert_precision])
add_scaler(key, scaler) – Add a scaler for preprocessing a specific tensor.
build() – Generate pendulum images and save metadata to disk.
collate(samples) – Collate samples into a batch, re-annotating the ground-truth concepts.
download() – This dataset is procedurally generated.
load() – Load the raw dataset and preprocess the data.
load_raw() – Load the raw dataset without any preprocessing.
maybe_build()
maybe_download()
remove_exogenous(name)
set_concepts(concepts) – Set concept annotations for the dataset.
set_graph(graph) – Set the adjacency matrix of the causal graph between concepts as a pandas DataFrame.
Attributes
annotations – Annotations for the concepts in the dataset.
concept_names – List of concept names in the dataset.
exogenous – Mapping of dataset's exogenous variables.
graph – Adjacency matrix of the causal graph between concepts.
has_concepts – Whether the dataset has concept annotations.
has_exogenous – Whether the dataset has exogenous information.
n_concepts – Number of concepts in the dataset.
n_exogenous – Number of exogenous variables in the dataset.
n_features – Shape of features in dataset's input (excluding number of samples).
n_samples – Number of samples in the dataset.
processed_filenames – The list of processed filenames in the self.root_dir folder that must be present in order to skip build().
processed_paths – The absolute paths of the processed files that must be present in order to skip building.
raw_filenames – The list of raw filenames in the self.root_dir folder that must be present in order to skip download().
raw_paths – The absolute paths of the raw files that must be present in order to skip downloading.
root_dir
shape – Shape of the input tensor.
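The processed_paths description implies a skip-if-present pattern: building is skipped when every processed file already exists. The sketch below illustrates that pattern in isolation. `maybe_build_sketch` is a hypothetical stand-in and is not the library's maybe_build() implementation. The file name `metadata.csv` is also an assumption.

```python
import os
import tempfile


def maybe_build_sketch(processed_paths, build):
    """Hypothetical: call build() only if a processed file is missing.

    Returns True if build() was called, False if it was skipped.
    """
    if all(os.path.exists(p) for p in processed_paths):
        return False
    build()
    return True


with tempfile.TemporaryDirectory() as root_dir:
    # Illustrative processed file; the real dataset's file names may differ.
    paths = [os.path.join(root_dir, "metadata.csv")]

    def build():
        # Stand-in for image generation: just write a header row.
        for p in paths:
            with open(p, "w") as f:
                f.write("theta,phi,pendulum_x\n")

    first = maybe_build_sketch(paths, build)   # files missing, so build runs
    second = maybe_build_sketch(paths, build)  # files present, so build is skipped
```

The same check against raw_paths would gate download(), although for this dataset download() is documented as a no-op because the data is procedurally generated.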