Random Variables

This module provides variable representations for concept-based probabilistic models.

Summary

Variable Classes

`Variable`	Represents a random variable in a concept-based probabilistic model.
`EndogenousVariable`	Represents an endogenous variable in a concept-based model.
`ExogenousVariable`	Represents an exogenous variable in a concept-based model.
`InputVariable`	Represents an input (latent) variable in a concept-based model.

Class Documentation

class Variable

Bases: object

Represents a random variable in a concept-based probabilistic model.

A Variable encapsulates one or more concepts along with their associated probability distribution, parent variables, and metadata. It supports multiple distribution types including Delta (deterministic), Bernoulli, Categorical, and Normal distributions.

The Variable class overrides __new__ so that a single constructor call can produce more than one object: initializing it with several concepts returns a list of Variable instances, one per concept, while initializing it with a single concept returns a single instance.
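The dispatch described above can be illustrated with a minimal, library-independent sketch. SketchVariable and its reduced argument list are hypothetical stand-ins, not the torch_concepts implementation; the sketch only shows how a __new__ override can return either one instance or a list of instances.

```python
from typing import List, Optional, Union


class SketchVariable:
    """Hypothetical stand-in for Variable; not the torch_concepts code."""

    def __new__(cls, concepts: Union[str, List[str]],
                parents: Optional[list] = None, size: int = 1):
        names = [concepts] if isinstance(concepts, str) else list(concepts)
        if len(names) > 1:
            # Returning a list (not an instance of cls) makes Python skip
            # __init__ for this call; each element is built by its own call.
            return [cls([name], parents=parents, size=size) for name in names]
        return super().__new__(cls)

    def __init__(self, concepts: Union[str, List[str]],
                 parents: Optional[list] = None, size: int = 1):
        # A bare string is wrapped so that `concepts` is always a list.
        self.concepts = [concepts] if isinstance(concepts, str) else list(concepts)
        self.parents = list(parents or [])
        self.size = size


single = SketchVariable("has_wheels")
many = SketchVariable(["A", "B", "C"])
print(type(single).__name__, single.concepts)
print(len(many), many[1].concepts)
```

One consequence of this design is that the return type depends on the argument: callers that pass several concepts should expect a list rather than a single object.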

concepts

List of concept names represented by this variable.

Type: List[str]

parents

List of parent variables in the graphical model.

Type: List[Variable]

distribution

PyTorch distribution class for this variable.

Type: Type[Distribution]

size

Size/cardinality of the variable (e.g., number of classes for Categorical).

Type: int

metadata

Additional metadata associated with the variable.

Type: Dict[str, Any]

Properties:

- out_features (int): Number of output features this variable produces.
- in_features (int): Total input features from all parent variables.

Example

>>> import torch
>>> from torch.distributions import Bernoulli, Categorical, Normal
>>> from torch_concepts import Variable
>>> from torch_concepts.distributions import Delta
>>>
>>> # Create a binary concept variable
>>> var_binary = Variable(
...     concepts='has_wheels',
...     parents=[],
...     distribution=Bernoulli,
...     size=1
... )
>>> print(var_binary.concepts)  # ['has_wheels']
>>> print(var_binary.out_features)  # 1
>>>
>>> # Create a categorical variable with 3 color classes
>>> var_color = Variable(
...     concepts=['color'],
...     parents=[],
...     distribution=Categorical,
...     size=3  # red, green, blue
... )
>>> print(var_color.out_features)  # 3
>>>
>>> # Create a deterministic (Delta) variable
>>> var_delta = Variable(
...     concepts=['continuous_feature'],
...     parents=[],
...     distribution=Delta,
...     size=1
... )
>>>
>>> # Create multiple variables at once
>>> vars_list = Variable(
...     concepts=['A', 'B', 'C'],
...     parents=[],
...     distribution=Delta,
...     size=1
... )
>>> print(len(vars_list))  # 3
>>> print(vars_list[0].concepts)  # ['A']
>>> print(vars_list[1].concepts)  # ['B']
>>>
>>> # Create variables with parent dependencies
>>> parent_var = Variable(
...     concepts=['parent_concept'],
...     parents=[],
...     distribution=Bernoulli,
...     size=1
... )
>>> child_var = Variable(
...     concepts=['child_concept'],
...     parents=[parent_var],
...     distribution=Bernoulli,
...     size=1
... )
>>> print(child_var.in_features)  # 1 (from parent)
>>> print(child_var.out_features)  # 1

property out_features: int

Calculate the number of output features for this variable.

The calculation depends on the distribution type:

- Delta/Normal: size * n_concepts
- Bernoulli: n_concepts (binary per concept)
- Categorical: size (single multi-class variable)

Returns: Number of output features.
Return type: int
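As a rough illustration of the rules listed for out_features, the sketch below applies them to plain distribution names. The function sketch_out_features and the use of strings instead of distribution classes are assumptions made to keep the example self-contained; the library's own property may be implemented differently.

```python
def sketch_out_features(distribution: str, size: int, n_concepts: int = 1) -> int:
    """Apply the documented rules; `distribution` is a plain name (assumption)."""
    if distribution in ("Delta", "Normal"):
        # Each concept contributes `size` features.
        return size * n_concepts
    if distribution == "Bernoulli":
        # One binary feature per concept, regardless of `size`.
        return n_concepts
    if distribution == "Categorical":
        # A single multi-class variable with `size` classes.
        return size
    raise ValueError(f"Unsupported distribution: {distribution!r}")


print(sketch_out_features("Bernoulli", size=1))
print(sketch_out_features("Categorical", size=3))
print(sketch_out_features("Delta", size=128))
```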

property in_features: int

Calculate total input features from all parent variables.

Returns: Sum of out_features from all parent variables.
Return type: int
Raises: TypeError – If any parent is not a Variable instance.
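The summation and the type check described for in_features can be sketched without the library. SketchNode is a hypothetical class with a fixed out_features value; it is not part of torch_concepts and only mirrors the documented behavior.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchNode:
    """Hypothetical node with a fixed out_features; not a library class."""
    name: str
    out_features: int
    parents: List["SketchNode"] = field(default_factory=list)

    @property
    def in_features(self) -> int:
        total = 0
        for parent in self.parents:
            # Reject anything that is not a node, as the documented
            # property does for non-Variable parents.
            if not isinstance(parent, SketchNode):
                raise TypeError(f"Parent {parent!r} is not a SketchNode")
            total += parent.out_features
        return total


color = SketchNode("color", out_features=3)
embedding = SketchNode("embedding", out_features=128)
label = SketchNode("label", out_features=1, parents=[color, embedding])
print(label.in_features)
```

A node without parents reports zero input features under this sketch, since the sum runs over an empty list.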

class EndogenousVariable

Bases: Variable

Represents an endogenous variable in a concept-based model.

Endogenous variables are observable and supervisable concepts that can be directly measured or annotated in the data. These are typically the concepts that we want to learn and predict, such as object attributes, semantic features, or intermediate representations that have ground truth labels.

concepts

List of concept names represented by this variable.

Type: List[str]

parents

List of parent variables in the graphical model.

Type: List[Variable]

distribution

PyTorch distribution class for this variable.

Type: Type[Distribution]

size

Size/cardinality of the variable.

Type: int

metadata

Additional metadata. Automatically includes 'variable_type': 'endogenous'.

Type: Dict[str, Any]
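One way a subclass can add a 'variable_type' entry to its metadata automatically is sketched below. The classes BaseSketch, EndogenousSketch, ExogenousSketch, and InputSketch, and the class-level variable_type tag, are hypothetical; the sketch also assumes the class tag overrides any user-supplied 'variable_type', which the documentation does not specify.

```python
from typing import Any, Dict, Optional


class BaseSketch:
    """Hypothetical base class; not the torch_concepts Variable."""
    variable_type: Optional[str] = None  # hypothetical class-level tag

    def __init__(self, name: str, metadata: Optional[Dict[str, Any]] = None):
        self.name = name
        # Copy so the caller's dictionary is never mutated.
        self.metadata: Dict[str, Any] = dict(metadata or {})
        if self.variable_type is not None:
            # Assumption: the class tag wins over a user-supplied value.
            self.metadata["variable_type"] = self.variable_type


class EndogenousSketch(BaseSketch):
    variable_type = "endogenous"


class ExogenousSketch(BaseSketch):
    variable_type = "exogenous"


class InputSketch(BaseSketch):
    variable_type = "input"


has_wings = EndogenousSketch("has_wings", metadata={"source": "annotated"})
print(has_wings.metadata)
```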

Example

>>> from torch.distributions import Bernoulli, Categorical
>>> from torch_concepts import EndogenousVariable
>>> # Observable binary concept
>>> has_wings = EndogenousVariable(
...     concepts='has_wings',
...     parents=[],
...     distribution=Bernoulli,
...     size=1
... )
>>>
>>> # Observable categorical concept (e.g., color)
>>> color = EndogenousVariable(
...     concepts=['color'],
...     parents=[],
...     distribution=Categorical,
...     size=3  # red, green, blue
... )

class ExogenousVariable

Bases: Variable

Represents an exogenous variable in a concept-based model.

Exogenous variables are high-dimensional representations related to a single endogenous variable. They capture rich, detailed information about a specific concept (e.g., image patches, embeddings, or feature vectors) that can be used to predict or explain the corresponding endogenous concept.

concepts

List of concept names represented by this variable.

Type: List[str]

parents

List of parent variables in the graphical model.

Type: List[Variable]

distribution

PyTorch distribution class for this variable.

Type: Type[Distribution]

size

Dimensionality of the high-dimensional representation.

Type: int

endogenous_var

The endogenous variable this exogenous variable is related to.

Type: Optional[EndogenousVariable]

metadata

Additional metadata. Automatically includes 'variable_type': 'exogenous'.

Type: Dict[str, Any]

Example

>>> from torch.distributions import Normal, Bernoulli
>>> from torch_concepts.distributions import Delta
>>> from torch_concepts import EndogenousVariable, ExogenousVariable
>>> # Endogenous concept
>>> has_wings = EndogenousVariable(
...     concepts='has_wings',
...     parents=[],
...     distribution=Bernoulli,
...     size=1
... )
>>>
>>> # Exogenous high-dim representation for has_wings
>>> wings_features = ExogenousVariable(
...     concepts='wings_exogenous',
...     parents=[],
...     distribution=Delta,
...     size=128,  # 128-dimensional exogenous representation
... )

class InputVariable

Bases: Variable

Represents an input (latent) variable in a concept-based model.

Latent variables are high-dimensional global representations of the whole input object (e.g., raw input images, text, or sensor data). They capture the complete information about the input before it is decomposed into specific concepts. These are typically unobserved, learned representations that encode all relevant information from the raw input.

concepts

List of concept names represented by this variable.

Type: List[str]

parents

List of parent variables in the graphical model (typically empty).

Type: List[Variable]

distribution

PyTorch distribution class for this variable.

Type: Type[Distribution]

size

Dimensionality of the latent representation.

Type: int

metadata

Additional metadata. Automatically includes 'variable_type': 'input'.

Type: Dict[str, Any]

Example

>>> from torch_concepts.distributions import Delta
>>> from torch_concepts import InputVariable
>>> # Global latent representation from input image
>>> image_latent = InputVariable(
...     concepts='global_image_features',
...     parents=[],
...     distribution=Delta,
...     size=512  # 512-dimensional global latent
... )
>>>
>>> # Multiple latent variables for hierarchical representation
>>> low_level_features = InputVariable(
...     concepts='low_level_features',
...     parents=[],
...     distribution=Delta,
...     size=256
... )
>>> high_level_features = InputVariable(
...     concepts='high_level_features',
...     parents=[low_level_features],
...     distribution=Delta,
...     size=512
... )