Data Modules

This module provides data module implementations for concept-based datasets.

Summary

DataModule Classes

BnLearnDataModule

DataModule for all Bayesian Network datasets.

Class Documentation

class BnLearnDataModule(seed: int, name: str, root: str | None = None, val_size: int | float = 0.1, test_size: int | float = 0.2, batch_size: int = 512, backbone: str | Callable[[Tensor], Tensor] | None = None, precompute_embs: bool = False, force_recompute: bool = False, n_gen: int = 10000, concept_subset: list | None = None, label_descriptions: dict | None = None, autoencoder_kwargs: dict | None = None, workers: int = 0, **kwargs)

Bases: ConceptDataModule

DataModule for all Bayesian Network datasets.

Handles data loading, splitting, and batching for all Bayesian Network datasets with support for concept-based learning.

Parameters:
  • seed – Random seed for data generation and splitting.

  • val_size – Validation set size (fraction or absolute count).

  • test_size – Test set size (fraction or absolute count).

  • batch_size – Batch size for dataloaders.

  • n_gen – Total number of samples to generate.

  • autoencoder_kwargs – Configuration for autoencoder-based feature extraction.

  • concept_subset – Subset of concepts to use. If None, uses all concepts.

  • label_descriptions – Dictionary mapping concept names to descriptions.

  • backbone – Model backbone to use (if applicable), given as a string name or a callable that maps a Tensor to a Tensor.

  • workers – Number of workers for dataloaders.
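
The val_size and test_size parameters accept either a fraction or an absolute count. The sketch below illustrates one plausible way such values could be resolved into a seeded train/validation/test split. It is not the implementation used by BnLearnDataModule: the helper names resolve_split_size and split_indices are hypothetical, and the rounding and shuffling behaviour shown here are assumptions made for illustration.

```python
import random


def resolve_split_size(size, n_total):
    """Turn a fraction (float) or an absolute count (int) into a sample count.

    Hypothetical helper: the rounding rule is an assumption, not library behaviour.
    """
    if isinstance(size, bool):
        raise TypeError("size must be an int or a float, not a bool")
    if isinstance(size, float):
        if not 0.0 <= size < 1.0:
            raise ValueError("a fractional size must lie in [0, 1)")
        return int(round(size * n_total))
    if isinstance(size, int):
        if not 0 <= size <= n_total:
            raise ValueError("an absolute size must lie in [0, n_total]")
        return size
    raise TypeError("size must be an int or a float")


def split_indices(n_total, val_size=0.1, test_size=0.2, seed=0):
    """Return (train, val, test) index lists from a seeded shuffle."""
    n_val = resolve_split_size(val_size, n_total)
    n_test = resolve_split_size(test_size, n_total)
    if n_val + n_test > n_total:
        raise ValueError("validation and test sets exceed the dataset size")

    indices = list(range(n_total))
    # A local Random instance keeps the split reproducible for a given seed
    # without touching the global random state.
    random.Random(seed).shuffle(indices)

    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx


# Fractions, using the default values from the signature above.
train, val, test = split_indices(10000, val_size=0.1, test_size=0.2, seed=42)

# Absolute counts behave the same way, without the fraction-to-count step.
train_abs, val_abs, test_abs = split_indices(100, val_size=10, test_size=20, seed=42)
```

Under these assumptions, the defaults val_size=0.1 and test_size=0.2 applied to n_gen=10000 samples leave 7000 samples for training, and reusing the same seed reproduces the same split.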