torch_concepts.data.BnLearnDataset
- class BnLearnDataset(name: str, root: str | None = None, seed: int = 42, n_gen: int = 10000, concept_subset: list | None = None, label_descriptions: dict | None = None, autoencoder_kwargs: dict | None = None)
Dataset class for the Asia dataset from bnlearn.
This dataset represents a small expert system that models the relationship between traveling to Asia, smoking habits, and various lung diseases.
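To make the structure of this expert system concrete, the following is a minimal sketch of ancestral (forward) sampling from the Asia network, using the conditional probabilities of the standard Asia benchmark. It is not taken from the library: the function names `sample_asia` and `generate` are hypothetical, and the assumption that `seed` and `n_gen` control a sampling procedure like this one is a guess based on the parameter names alone.

```python
import random

# Probability of dyspnoea given (bronchitis, either), from the standard Asia network.
DYSP_GIVEN = {
    (True, True): 0.9,
    (True, False): 0.8,
    (False, True): 0.7,
    (False, False): 0.1,
}


def sample_asia(rng: random.Random) -> dict:
    """Draw one joint sample by visiting nodes in topological order."""

    def flip(p: float) -> bool:
        return rng.random() < p

    asia = flip(0.01)                      # recent visit to Asia
    smoke = flip(0.50)                     # smoker
    tub = flip(0.05 if asia else 0.01)     # tuberculosis
    lung = flip(0.10 if smoke else 0.01)   # lung cancer
    bronc = flip(0.60 if smoke else 0.30)  # bronchitis
    either = tub or lung                   # deterministic OR of tub and lung
    xray = flip(0.98 if either else 0.05)  # positive chest X-ray
    dysp = flip(DYSP_GIVEN[(bronc, either)])
    return {
        "asia": asia, "tub": tub, "smoke": smoke, "lung": lung,
        "bronc": bronc, "either": either, "xray": xray, "dysp": dysp,
    }


def generate(n_gen: int = 10000, seed: int = 42) -> list:
    """Hypothetical helper: n_gen samples from a seeded generator."""
    rng = random.Random(seed)
    return [sample_asia(rng) for _ in range(n_gen)]
```

Calling `generate(n_gen=5, seed=0)` twice returns the same five samples, which is the kind of reproducibility a `seed` argument usually provides.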
- __init__(name: str, root: str | None = None, seed: int = 42, n_gen: int = 10000, concept_subset: list | None = None, label_descriptions: dict | None = None, autoencoder_kwargs: dict | None = None)
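As an illustration of the signature above, here is a hedged sketch of how the constructor might be called. The import path comes from the page title and the defaults come from the documented signature. The value `"asia"` for `name` is an assumption, and `ASIA_KWARGS` and `make_dataset` are hypothetical helpers, not part of the library.

```python
# Keyword arguments mirroring the documented signature.
# "asia" is an assumed value for `name`; the other values are the documented defaults.
ASIA_KWARGS = {"name": "asia", "root": None, "seed": 42, "n_gen": 10000}


def make_dataset(**overrides):
    """Instantiate BnLearnDataset with the defaults above, optionally overridden.

    The import is deferred so this module can be loaded even when
    torch_concepts is not installed.
    """
    from torch_concepts.data import BnLearnDataset  # path taken from the page title

    return BnLearnDataset(**{**ASIA_KWARGS, **overrides})
```

With the package installed, `dataset = make_dataset(seed=0)` should return an object exposing the attributes listed on this page, such as `dataset.n_samples` and `dataset.concept_names`.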
Methods
__init__(name[, root, seed, n_gen, ...])
add_exogenous(name, value[, convert_precision])
add_scaler(key, scaler): Add a scaler for preprocessing a specific tensor.
build(): Build the dataset from raw data into the self.root_dir folder, if needed.
collate(samples): Collate samples into a batch, re-annotating the ground-truth concepts.
download(): Download the dataset's files to the self.root_dir folder.
load(): Load the raw dataset and preprocess the data.
load_raw(): Load the raw dataset without any preprocessing.
maybe_build()
maybe_download()
remove_exogenous(name)
set_concepts(concepts): Set concept annotations for the dataset.
set_graph(graph): Set the adjacency matrix of the causal graph between concepts as a pandas DataFrame.
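To show what an adjacency matrix of the causal graph could look like, the sketch below builds one for the standard Asia network. The node names follow the bnlearn Asia benchmark and may differ from the dataset's actual `concept_names`. The row-is-parent, column-is-child orientation and the helper names `adjacency_matrix` and `to_dataframe` are assumptions made for illustration.

```python
# Nodes and directed edges (parent, child) of the standard Asia network.
NODES = ["asia", "tub", "smoke", "lung", "bronc", "either", "xray", "dysp"]
EDGES = [
    ("asia", "tub"),
    ("smoke", "lung"),
    ("smoke", "bronc"),
    ("tub", "either"),
    ("lung", "either"),
    ("either", "xray"),
    ("either", "dysp"),
    ("bronc", "dysp"),
]


def adjacency_matrix(nodes, edges):
    """Return a square 0/1 matrix with matrix[parent][child] = 1 for each edge."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for parent, child in edges:
        matrix[index[parent]][index[child]] = 1
    return matrix


def to_dataframe(nodes, matrix):
    """Wrap the matrix in a labelled pandas DataFrame (pandas imported lazily)."""
    import pandas as pd

    return pd.DataFrame(matrix, index=nodes, columns=nodes)


ADJ = adjacency_matrix(NODES, EDGES)
```

If the dataset's concept names and orientation match these assumptions, the result could be passed along as `dataset.set_graph(to_dataframe(NODES, ADJ))`.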
Attributes
annotations: Annotations for the concepts in the dataset.
concept_names: List of concept names in the dataset.
exogenous: Mapping of the dataset's exogenous variables.
graph: Adjacency matrix of the causal graph between concepts.
has_concepts: Whether the dataset has concept annotations.
has_exogenous: Whether the dataset has exogenous information.
n_concepts: Number of concepts in the dataset.
n_exogenous: Number of exogenous variables in the dataset.
n_features: Shape of the features in the dataset's input (excluding the number of samples).
n_samples: Number of samples in the dataset.
processed_filenames: List of processed filenames that will be created during the build step.
processed_paths: The absolute paths of the processed files that must be present in order to skip building.
raw_filenames: List of raw filenames that need to be present in the raw directory for the dataset to be considered present.
raw_paths: The absolute paths of the raw files that must be present in order to skip downloading.
root_dir
shape: Shape of the input tensor.