Random Variables

This module provides variable representations for concept-based probabilistic models.

Summary

Variable Classes

Variable

Represents a random variable in a concept-based Probabilistic Model.

EndogenousVariable

Represents an endogenous variable in a concept-based model.

ExogenousVariable

Represents an exogenous variable in a concept-based model.

InputVariable

Represents a latent variable in a concept-based model.

Class Documentation

class Variable(concepts: List[str], parents: List[Variable | str], distribution: Type[Distribution] | List[Type[Distribution]] | None = None, size: int | List[int] = 1, metadata: Dict[str, Any] | None = None)[source]

Bases: object

Represents a random variable in a concept-based Probabilistic Model.

A Variable encapsulates one or more concepts along with their associated probability distribution, parent variables, and metadata. It supports multiple distribution types including Delta (deterministic), Bernoulli, Categorical, and Normal distributions.

The Variable class implements a special __new__ method that allows creating multiple Variable instances when initialized with multiple concepts, or a single instance for a single concept.
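As a rough illustration of this pattern, the following minimal sketch uses a hypothetical ToyVariable class (not part of torch_concepts) whose __new__ returns a single instance for one concept and a list of instances for several concepts. It only mirrors the behaviour described above; the library's actual implementation may differ.

```python
from typing import List, Union


class ToyVariable:
    """Hypothetical stand-in for Variable; not the library's implementation."""

    def __new__(cls, concepts: Union[str, List[str]], size: int = 1):
        # Normalise a bare string to a one-element list (an assumption of this sketch).
        if isinstance(concepts, str):
            concepts = [concepts]
        if len(concepts) == 1:
            # Single concept: return a normal instance; Python then calls __init__.
            return super().__new__(cls)
        # Several concepts: build one instance per concept and return them as a list.
        # A list is not an instance of cls, so Python skips the automatic __init__
        # call and each element has to be initialised explicitly.
        instances = []
        for name in concepts:
            obj = super().__new__(cls)
            obj.__init__([name], size)
            instances.append(obj)
        return instances

    def __init__(self, concepts: Union[str, List[str]], size: int = 1):
        self.concepts = [concepts] if isinstance(concepts, str) else list(concepts)
        self.size = size


single = ToyVariable(['A'])
many = ToyVariable(['A', 'B', 'C'])
print(single.concepts)
print([v.concepts for v in many])
```

Because the multi-concept call returns a plain list, callers that pass several concepts should expect to index or iterate over the result rather than treat it as one variable.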

concepts

List of concept names represented by this variable.

Type:

List[str]

parents

List of parent variables in the graphical model.

Type:

List[Variable]

distribution

PyTorch distribution class for this variable.

Type:

Type[Distribution]

size

Size/cardinality of the variable (e.g., number of classes for Categorical).

Type:

int

metadata

Additional metadata associated with the variable.

Type:

Dict[str, Any]

Properties:

out_features (int): Number of output features this variable produces.

in_features (int): Total input features from all parent variables.

Example

>>> import torch
>>> from torch.distributions import Bernoulli, Categorical, Normal
>>> from torch_concepts import Variable
>>> from torch_concepts.distributions import Delta
>>>
>>> # Create a binary concept variable
>>> var_binary = Variable(
...     concepts='has_wheels',
...     parents=[],
...     distribution=Bernoulli,
...     size=1
... )
>>> print(var_binary.concepts)  # ['has_wheels']
>>> print(var_binary.out_features)  # 1
>>>
>>> # Create a categorical variable with 3 color classes
>>> var_color = Variable(
...     concepts=['color'],
...     parents=[],
...     distribution=Categorical,
...     size=3  # red, green, blue
... )
>>> print(var_color.out_features)  # 3
>>>
>>> # Create a deterministic (Delta) variable
>>> var_delta = Variable(
...     concepts=['continuous_feature'],
...     parents=[],
...     distribution=Delta,
...     size=1
... )
>>>
>>> # Create multiple variables at once
>>> vars_list = Variable(
...     concepts=['A', 'B', 'C'],
...     parents=[],
...     distribution=Delta,
...     size=1
... )
>>> print(len(vars_list))  # 3
>>> print(vars_list[0].concepts)  # ['A']
>>> print(vars_list[1].concepts)  # ['B']
>>>
>>> # Create variables with parent dependencies
>>> parent_var = Variable(
...     concepts=['parent_concept'],
...     parents=[],
...     distribution=Bernoulli,
...     size=1
... )
>>> child_var = Variable(
...     concepts=['child_concept'],
...     parents=[parent_var],
...     distribution=Bernoulli,
...     size=1
... )
>>> print(child_var.in_features)  # 1 (from parent)
>>> print(child_var.out_features)  # 1

property out_features: int

Calculate the number of output features for this variable.

The calculation depends on the distribution type:

- Delta/Normal: size * n_concepts
- Bernoulli: n_concepts (binary per concept)
- Categorical: size (single multi-class variable)
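The rules above can be sketched as a small standalone function. The helper name out_features_sketch and the use of distribution names as strings are assumptions made for illustration; the real property works on the variable's distribution class and may handle additional cases.

```python
def out_features_sketch(distribution: str, size: int, n_concepts: int) -> int:
    """Hypothetical helper mirroring the documented out_features rules."""
    if distribution in ("Delta", "Normal"):
        # Each concept contributes `size` features.
        return size * n_concepts
    if distribution == "Bernoulli":
        # One binary output per concept.
        return n_concepts
    if distribution == "Categorical":
        # A single multi-class variable with `size` classes.
        return size
    raise ValueError(f"Unsupported distribution in this sketch: {distribution}")


print(out_features_sketch("Bernoulli", size=1, n_concepts=1))
print(out_features_sketch("Categorical", size=3, n_concepts=1))
print(out_features_sketch("Delta", size=128, n_concepts=1))
```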

Returns:

Number of output features.

Return type:

int

property in_features: int

Calculate total input features from all parent variables.

Returns:

Sum of out_features from all parent variables.

Return type:

int

Raises:

TypeError – If any parent is not a Variable instance.
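To make the summation and the error case concrete, here is a minimal sketch built on a hypothetical Node class (not the library's Variable). It assumes only what is stated above: in_features is the sum of the parents' out_features, and a parent of the wrong type raises TypeError.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a variable with a fixed out_features value."""

    out_features: int
    parents: List["Node"] = field(default_factory=list)

    @property
    def in_features(self) -> int:
        total = 0
        for parent in self.parents:
            # Reject parents that are not nodes (e.g. unresolved string names).
            if not isinstance(parent, Node):
                raise TypeError(f"Parent {parent!r} is not a Node instance")
            total += parent.out_features
        return total


color = Node(out_features=3)       # e.g. a 3-class categorical parent
embedding = Node(out_features=128)  # e.g. a 128-dimensional Delta parent
child = Node(out_features=1, parents=[color, embedding])
print(child.in_features)  # 131
```

A variable with no parents has in_features equal to 0 under this rule.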

class EndogenousVariable(concepts: List[str], parents: List[Variable | str], distribution: Type[Distribution] | List[Type[Distribution]] | None = None, size: int | List[int] = 1, metadata: Dict[str, Any] | None = None)[source]

Bases: Variable

Represents an endogenous variable in a concept-based model.

Endogenous variables are observable and supervisable concepts that can be directly measured or annotated in the data. These are typically the concepts that we want to learn and predict, such as object attributes, semantic features, or intermediate representations that have ground truth labels.

concepts

List of concept names represented by this variable.

Type:

List[str]

parents

List of parent variables in the graphical model.

Type:

List[Variable]

distribution

PyTorch distribution class for this variable.

Type:

Type[Distribution]

size

Size/cardinality of the variable.

Type:

int

metadata

Additional metadata. Automatically includes 'variable_type': 'endogenous'.

Type:

Dict[str, Any]
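One way a subclass could add such a tag is sketched below. BaseVar and EndoVar are hypothetical classes, and the choice to copy the user's dictionary before adding the key is an assumption of this sketch rather than documented library behaviour.

```python
from typing import Any, Dict, List, Optional


class BaseVar:
    """Hypothetical base class that stores concepts and metadata."""

    def __init__(self, concepts: List[str], metadata: Optional[Dict[str, Any]] = None):
        self.concepts = list(concepts)
        # Copy so that the caller's dictionary is never modified.
        self.metadata = dict(metadata) if metadata else {}


class EndoVar(BaseVar):
    """Hypothetical subclass that tags itself as endogenous."""

    def __init__(self, concepts: List[str], metadata: Optional[Dict[str, Any]] = None):
        super().__init__(concepts, metadata)
        # Add the type tag on top of any user-supplied metadata.
        self.metadata["variable_type"] = "endogenous"


user_meta = {"source": "manual_annotation"}
var = EndoVar(["has_wings"], metadata=user_meta)
print(var.metadata)
```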

Example

>>> from torch.distributions import Bernoulli, Categorical
>>> from torch_concepts import EndogenousVariable
>>> # Observable binary concept
>>> has_wings = EndogenousVariable(
...     concepts='has_wings',
...     parents=[],
...     distribution=Bernoulli,
...     size=1
... )
>>>
>>> # Observable categorical concept (e.g., color)
>>> color = EndogenousVariable(
...     concepts=['color'],
...     parents=[],
...     distribution=Categorical,
...     size=3  # red, green, blue
... )

class ExogenousVariable(concepts: List[str], parents: List[Variable | str], distribution: Type[Distribution] | List[Type[Distribution]] | None = None, size: int | List[int] = 1, metadata: Dict[str, Any] | None = None)[source]

Bases: Variable

Represents an exogenous variable in a concept-based model.

Exogenous variables are high-dimensional representations related to a single endogenous variable. They capture rich, detailed information about a specific concept (e.g., image patches, embeddings, or feature vectors) that can be used to predict or explain the corresponding endogenous concept.

concepts

List of concept names represented by this variable.

Type:

List[str]

parents

List of parent variables in the graphical model.

Type:

List[Variable]

distribution

PyTorch distribution class for this variable.

Type:

Type[Distribution]

size

Dimensionality of the high-dimensional representation.

Type:

int

endogenous_var

The endogenous variable this exogenous variable is related to.

Type:

Optional[EndogenousVariable]

metadata

Additional metadata. Automatically includes 'variable_type': 'exogenous'.

Type:

Dict[str, Any]

Example

>>> from torch.distributions import Normal, Bernoulli
>>> from torch_concepts.distributions import Delta
>>> from torch_concepts import EndogenousVariable, ExogenousVariable
>>> # Endogenous concept
>>> has_wings = EndogenousVariable(
...     concepts='has_wings',
...     parents=[],
...     distribution=Bernoulli,
...     size=1
... )
>>>
>>> # Exogenous high-dim representation for has_wings
>>> wings_features = ExogenousVariable(
...     concepts='wings_exogenous',
...     parents=[],
...     distribution=Delta,
...     size=128,  # 128-dimensional exogenous representation
... )

class InputVariable(concepts: List[str], parents: List[Variable | str], distribution: Type[Distribution] | List[Type[Distribution]] | None = None, size: int | List[int] = 1, metadata: Dict[str, Any] | None = None)[source]

Bases: Variable

Represents a latent variable in a concept-based model.

Latent variables are high-dimensional global representations of the whole input object (e.g., raw input images, text, or sensor data). They capture the complete information about the input before it is decomposed into specific concepts. These are typically unobserved, learned representations that encode all relevant information from the raw input.

concepts

List of concept names represented by this variable.

Type:

List[str]

parents

List of parent variables in the graphical model (typically empty).

Type:

List[Variable]

distribution

PyTorch distribution class for this variable.

Type:

Type[Distribution]

size

Dimensionality of the latent representation.

Type:

int

metadata

Additional metadata. Automatically includes 'variable_type': 'input'.

Type:

Dict[str, Any]

Example

>>> from torch_concepts.distributions import Delta
>>> from torch_concepts import InputVariable
>>> # Global latent representation from input image
>>> image_latent = InputVariable(
...     concepts='global_image_features',
...     parents=[],
...     distribution=Delta,
...     size=512  # 512-dimensional global latent
... )
>>>
>>> # Multiple latent variables for hierarchical representation
>>> low_level_features = InputVariable(
...     concepts='low_level_features',
...     parents=[],
...     distribution=Delta,
...     size=256
... )
>>> high_level_features = InputVariable(
...     concepts='high_level_features',
...     parents=[low_level_features],
...     distribution=Delta,
...     size=512
... )