torch_concepts.data.datasets.bnlearn.BnLearnDataset
- class BnLearnDataset(name: str, root: str | None = None, seed: int = 42, n_gen: int = 10000, concept_subset: list | None = None, label_descriptions: dict | None = None, autoencoder_kwargs: dict | None = None)
Dataset class for the Asia dataset from bnlearn.
This dataset represents a small expert system that models the relationship between traveling to Asia, smoking habits, and various lung diseases.
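For orientation, the Asia network as distributed in the bnlearn repository has eight binary variables connected by eight directed edges. The sketch below writes that structure down and derives an adjacency matrix from it using only the standard library. The variable names follow the bnlearn network file; whether BnLearnDataset exposes the same names, ordering, or matrix orientation through its graph attribute is an assumption that this page does not confirm.

```python
# Structure of the Asia network (Lauritzen & Spiegelhalter, 1988), as
# distributed in the bnlearn repository. All eight variables are binary.
NODES = ["asia", "tub", "smoke", "lung", "bronc", "either", "xray", "dysp"]

# Directed edges, written as (parent, child).
EDGES = [
    ("asia", "tub"),       # visit to Asia -> tuberculosis
    ("smoke", "lung"),     # smoking -> lung cancer
    ("smoke", "bronc"),    # smoking -> bronchitis
    ("tub", "either"),     # tuberculosis -> "tuberculosis or lung cancer"
    ("lung", "either"),    # lung cancer -> "tuberculosis or lung cancer"
    ("either", "xray"),    # either disease -> positive X-ray
    ("either", "dysp"),    # either disease -> dyspnoea
    ("bronc", "dysp"),     # bronchitis -> dyspnoea
]


def adjacency_matrix(nodes, edges):
    """Return a nested dict where adj[parent][child] is 1 if parent -> child."""
    adj = {p: {c: 0 for c in nodes} for p in nodes}
    for parent, child in edges:
        adj[parent][child] = 1
    return adj


def parents(adj, node):
    """List the parents of `node` by reading its column of the matrix."""
    return [p for p in adj if adj[p][node] == 1]


ADJ = adjacency_matrix(NODES, EDGES)
print(parents(ADJ, "dysp"))  # ['bronc', 'either']
```

Rows index parents and columns index children here; that orientation is a choice made for this sketch only.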
- __init__(name: str, root: str | None = None, seed: int = 42, n_gen: int = 10000, concept_subset: list | None = None, label_descriptions: dict | None = None, autoencoder_kwargs: dict | None = None)
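A minimal construction sketch based only on the signature above. The import path is taken from this page's title; the value "asia" for name and the concept names passed as concept_subset are assumptions, since this page does not list the accepted values. The meaning of n_gen is also not stated here (it is presumably the number of generated samples). The import is deferred so the snippet can be read and loaded without torch_concepts installed.

```python
def asia_dataset_kwargs(concept_subset=None):
    """Collect constructor arguments, using the defaults from the signature."""
    return {
        "name": "asia",    # assumption: identifier of the Asia network
        "root": None,      # signature default
        "seed": 42,        # signature default
        "n_gen": 10000,    # signature default
        "concept_subset": concept_subset,
    }


def make_asia_dataset(concept_subset=None):
    """Instantiate BnLearnDataset; requires torch_concepts to be installed."""
    # Deferred import: only executed when the function is actually called.
    from torch_concepts.data.datasets.bnlearn import BnLearnDataset

    return BnLearnDataset(**asia_dataset_kwargs(concept_subset))


# Hypothetical subset; the concept names are assumed, not documented here.
kwargs = asia_dataset_kwargs(concept_subset=["smoke", "lung", "dysp"])
```

With the package installed, make_asia_dataset() would forward these arguments to the constructor; label_descriptions and autoencoder_kwargs are left at their None defaults.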
Methods
__init__(name[, root, seed, n_gen, ...])
add_exogenous(name, value[, convert_precision])
add_scaler(key, scaler): Add a scaler for preprocessing a specific tensor.
build(): Eventually build the dataset from raw data to the self.root_dir folder.
download(): Downloads the dataset's files to the self.root_dir folder.
load(): Loads the raw dataset and preprocesses the data.
load_raw(): Loads the raw dataset without any data preprocessing.
maybe_build()
maybe_download()
maybe_reduce_annotations(annotations[, ...]): Set concepts and labels for the dataset.
    annotations: Annotations object for all concepts.
    concept_names_subset: List of strings naming the subset of concepts to use. If None, all concepts are used.
remove_exogenous(name)
set_concepts(concepts): Set concept annotations for the dataset.
set_graph(graph): Set the adjacency matrix of the causal graph between concepts as a pandas DataFrame.
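The concept_names_subset behaviour described for maybe_reduce_annotations (keep only the named concepts, or all of them when None is passed) can be illustrated with plain lists. This is a sketch of the idea, not the library's implementation: reduce_concepts is a hypothetical helper, and both the error raised for unknown names and the ordering of the result are assumptions.

```python
def reduce_concepts(concept_names, rows, concept_names_subset=None):
    """Keep only the columns named in `concept_names_subset`.

    `rows` is a list of per-sample concept values aligned with `concept_names`.
    Passing None keeps every concept, mirroring the documented default.
    """
    if concept_names_subset is None:
        return list(concept_names), [list(r) for r in rows]

    # Assumption: unknown names are rejected rather than silently ignored.
    missing = [n for n in concept_names_subset if n not in concept_names]
    if missing:
        raise ValueError(f"Unknown concepts: {missing}")

    # Assumption: the output follows the order given in the subset.
    idx = [concept_names.index(n) for n in concept_names_subset]
    return list(concept_names_subset), [[r[i] for i in idx] for r in rows]


names = ["asia", "smoke", "lung"]
samples = [[0, 1, 1], [1, 0, 0]]
kept_names, kept_rows = reduce_concepts(names, samples, ["lung", "asia"])
all_names, all_rows = reduce_concepts(names, samples)
```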
Attributes
annotations: Annotations for the concepts in the dataset.
concept_names: List of concept names in the dataset.
exogenous: Mapping of dataset's exogenous variables.
graph: Adjacency matrix of the causal graph between concepts.
has_concepts: Whether the dataset has concept annotations.
has_exogenous: Whether the dataset has exogenous information.
n_concepts: Number of concepts in the dataset.
n_exogenous: Number of exogenous variables in the dataset.
n_features: Shape of features in dataset's input (excluding number of samples).
n_samples: Number of samples in the dataset.
List of processed filenames that will be created during build step.
processed_paths: The absolute paths of the processed files that must be present in order to skip building.
List of raw filenames that need to be present in the raw directory for the dataset to be considered present.
raw_paths: The absolute paths of the raw files that must be present in order to skip downloading.
root_dir
shape: Shape of the input tensor.
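The descriptions of raw_paths and processed_paths imply a caching pattern: downloading is skipped when the raw files exist, and building is skipped when the processed files exist. The sketch below illustrates that pattern with a hypothetical stand-in class and made-up file names. It is not the torch_concepts base class, and whether the real maybe_build() triggers maybe_download() is an assumption.

```python
import os
import tempfile


class CachedDatasetSketch:
    """Hypothetical stand-in illustrating the skip-if-present pattern."""

    def __init__(self, root_dir):
        self.root_dir = root_dir
        self.calls = []  # records which expensive steps actually ran

    @property
    def raw_paths(self):
        # Made-up file name, for illustration only.
        return [os.path.join(self.root_dir, "raw.csv")]

    @property
    def processed_paths(self):
        # Made-up file name, for illustration only.
        return [os.path.join(self.root_dir, "processed.csv")]

    def download(self):
        self.calls.append("download")
        with open(self.raw_paths[0], "w") as f:
            f.write("a,b\n1,0\n")

    def build(self):
        self.calls.append("build")
        with open(self.raw_paths[0]) as src, open(self.processed_paths[0], "w") as dst:
            dst.write(src.read())

    def maybe_download(self):
        # Download only if at least one raw file is missing.
        if not all(os.path.exists(p) for p in self.raw_paths):
            self.download()

    def maybe_build(self):
        # Build only if at least one processed file is missing.
        if not all(os.path.exists(p) for p in self.processed_paths):
            self.maybe_download()
            self.build()


with tempfile.TemporaryDirectory() as tmp:
    first = CachedDatasetSketch(tmp)
    first.maybe_build()   # nothing on disk yet: downloads, then builds
    second = CachedDatasetSketch(tmp)
    second.maybe_build()  # processed file exists: both steps are skipped
    FIRST_CALLS, SECOND_CALLS = first.calls, second.calls
```

The second instance does no work because the processed file is already present in the same root directory, which is the behaviour the attribute descriptions suggest.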