Probabilistic Models¶

This module provides probabilistic model implementations for concept-based reasoning.

Summary¶

Model Classes

`ProbabilisticModel`	Probabilistic Model for concept-based reasoning.
`ParametricCPD`	A ParametricCPD represents a conditional probability distribution (CPD) in a probabilistic graphical model.
`BipartiteModel`	Bipartite concept graph model with concepts and tasks in separate layers.
`GraphModel`	Concept-based model with explicit graph structure between concepts and tasks.

Class Documentation¶

class ProbabilisticModel(variables: List[Variable], parametric_cpds: List[ParametricCPD])[source]¶

Bases: Module

Probabilistic Model for concept-based reasoning.

This class represents a directed acyclic graph (DAG) where nodes are concept variables and edges represent probabilistic dependencies. Each variable has an associated CPD (neural network module) that computes its conditional probability given its parents.

variables¶

List of concept variables in the model.

Type:: List[Variable]

parametric_cpds¶

Dictionary mapping concept names to their CPDs.

Type:: nn.ModuleDict

concept_to_variable¶

Mapping from concept names to variables.

Type:: Dict[str, Variable]

Parameters:

variables – List of Variable objects defining the concepts.
parametric_cpds – List of ParametricCPD objects defining the conditional distributions.

Example

>>> import torch
>>> from torch_concepts import InputVariable, EndogenousVariable
>>> from torch_concepts.nn import ProbabilisticModel
>>> from torch_concepts.nn import ParametricCPD
>>> from torch_concepts.nn import LinearZC
>>> from torch_concepts.nn import LinearCC
>>> from torch_concepts.distributions import Delta
>>>
>>> # Define variables
>>> emb_var = InputVariable(concepts='input', parents=[], distribution=Delta, size=32)
>>> c1_var = EndogenousVariable(concepts='c1', parents=[emb_var], distribution=Delta, size=1)
>>> c2_var = EndogenousVariable(concepts='c2', parents=[c1_var], distribution=Delta, size=1)
>>>
>>> # Define CPDs (neural network modules)
>>> backbone = torch.nn.Linear(in_features=128, out_features=32)
>>> encoder = LinearZC(in_features=32, out_features=1)
>>> predictor = LinearCC(in_features_endogenous=1, out_features=1)
>>>
>>> parametric_cpds = [
...     ParametricCPD(concepts='input', parametrization=backbone),
...     ParametricCPD(concepts='c1', parametrization=encoder),
...     ParametricCPD(concepts='c2', parametrization=predictor)
... ]
>>>
>>> # Create ProbabilisticModel
>>> probabilistic_model = ProbabilisticModel(
...     variables=[emb_var, c1_var, c2_var],
...     parametric_cpds=parametric_cpds
... )
>>>
>>> print(f"Number of variables: {len(probabilistic_model.variables)}")
Number of variables: 3
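
To make the DAG semantics concrete, here is a minimal pure-Python sketch of parents-first evaluation. It is an illustration under stated assumptions, not the library's implementation: the names `parents`, `cpds`, and `evaluate` are hypothetical, plain floats stand in for tensors, and simple functions stand in for the neural network modules.

```python
import math

# Hypothetical stand-ins: parent names and one plain function per concept.
parents = {"input": [], "c1": ["input"], "c2": ["c1"]}
cpds = {
    "input": lambda x: x,                                        # identity "backbone"
    "c1": lambda inp: 1.0 / (1.0 + math.exp(-inp)),              # sigmoid "encoder"
    "c2": lambda c1: 1.0 / (1.0 + math.exp(-(2.0 * c1 - 1.0))),  # sigmoid "predictor"
}


def evaluate(x, order=("input", "c1", "c2")):
    """Compute every node parents-first; `order` must be a topological order."""
    values = {}
    for name in order:
        if not parents[name]:
            values[name] = cpds[name](x)      # root node: fed by the external input
        else:
            args = [values[p] for p in parents[name]]
            values[name] = cpds[name](*args)  # child node: fed by its parents' values
    return values


values = evaluate(0.0)
```

Because each node only reads values that were already computed, any topological order of the graph gives the same result.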

concept_to_variable: Dict[str, Variable]¶

get_by_distribution(distribution_class: Type[Distribution]) → List[Variable][source]¶

Get all variables with a specific distribution type.

Parameters:: distribution_class – The distribution class to filter by.
Returns:: Variables using the specified distribution.
Return type:: List[Variable]

get_variable_parents(concept_name: str) → List[Variable][source]¶

Get the parent variables of a concept.

Parameters:: concept_name – Name of the concept to query.
Returns:: List of parent variables, or empty list if none.
Return type:: List[Variable]

get_module_of_concept(concept_name: str) → Module | None[source]¶

Return the neural network module for a given concept.

Parameters:: concept_name – Name of the concept.
Returns:: The parametric_cpd module for the concept, or None if not found.
Return type:: Optional[nn.Module]
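
The three lookup methods above can be pictured with a small stand-in container. This is a hedged sketch rather than the library code: `Var`, `TinyModel`, `FakeBernoulli`, and `FakeDelta` are hypothetical names, and matching distributions by class identity is an assumption about how the filter behaves.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


class FakeBernoulli:  # placeholder distribution classes (hypothetical)
    pass


class FakeDelta:
    pass


@dataclass
class Var:
    name: str
    distribution: type
    parents: List["Var"] = field(default_factory=list)


class TinyModel:
    def __init__(self, variables: List[Var], modules: Dict[str, Callable]):
        self.variables = variables
        self.concept_to_variable = {v.name: v for v in variables}
        self.modules = modules

    def get_by_distribution(self, distribution_class: type) -> List[Var]:
        # Assumption: exact class match, no subclass handling.
        return [v for v in self.variables if v.distribution is distribution_class]

    def get_variable_parents(self, concept_name: str) -> List[Var]:
        var = self.concept_to_variable.get(concept_name)
        return list(var.parents) if var is not None else []

    def get_module_of_concept(self, concept_name: str) -> Optional[Callable]:
        return self.modules.get(concept_name)  # None when the concept is unknown


emb = Var("input", FakeDelta)
c1 = Var("c1", FakeBernoulli, parents=[emb])
model = TinyModel([emb, c1], {"input": abs, "c1": float})
```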

build_potentials()[source]¶

Build potential functions for all concepts in the ProbabilisticModel.

Returns:: Dictionary mapping concept names to their potential functions.
Return type:: Dict[str, callable]

build_cpts()[source]¶

Build Conditional Probability Tables (CPTs) for all concepts.

Returns:: Dictionary mapping concept names to their CPT functions.
Return type:: Dict[str, callable]
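
As a rough illustration of what a conditional probability table contains, the sketch below enumerates every state of a set of binary parents and records P(child = 1 | parents). It is not the library's `build_cpts`: the function name `build_cpt_rows`, the row layout, and the logistic parametrization are all assumptions made for the example.

```python
import itertools
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def build_cpt_rows(weights, bias):
    """Return one row per parent state: (*parent_values, P(child=1 | parents))."""
    rows = []
    for state in itertools.product((0, 1), repeat=len(weights)):
        logit = bias + sum(w * s for w, s in zip(weights, state))
        rows.append((*state, sigmoid(logit)))
    return rows


# Two binary parents give 2**2 = 4 rows.
cpt = build_cpt_rows(weights=[2.0, -1.0], bias=0.0)
```

The table grows exponentially with the number of parents, which is why enumerating it explicitly is only practical for small parent sets.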

training: bool¶

class ParametricCPD(concepts: str | List[str], parametrization: Module | List[Module])[source]¶

Bases: Module

A ParametricCPD represents a conditional probability distribution (CPD) in a probabilistic graphical model.

A ParametricCPD links concepts to neural network modules that compute probability distributions. It can automatically split multiple concepts into separate CPDs and supports building conditional probability tables (CPTs) and potential tables for inference.

Parameters:

concepts (Union[str, List[str]]) – A single concept name or a list of concept names. If a list of N concepts is provided, the ParametricCPD automatically splits into N separate ParametricCPD instances.
parametrization (Union[nn.Module, List[nn.Module]]) – A neural network module or list of modules that compute the probability distribution. If concepts is a list of length N, parametrization can be:
- A single module (replicated for all concepts)
- A list of N modules (one per concept)

concepts¶

List of concept names associated with this CPD.

Type:: List[str]

parametrization¶

The neural network module used to compute probabilities.

Type:: nn.Module

variable¶

The Variable instance this CPD is linked to (set by ProbabilisticModel).

Type:: Optional[Variable]

parents¶

List of parent Variables in the graphical model.

Type:: List[Variable]

Examples

>>> import torch
>>> import torch.nn as nn
>>> from torch_concepts.nn import ParametricCPD
>>>
>>> # Create different modules for different concepts
>>> module_a = nn.Linear(in_features=10, out_features=1)
>>> module_b = nn.Sequential(
...     nn.Linear(in_features=10, out_features=5),
...     nn.ReLU(),
...     nn.Linear(in_features=5, out_features=1)
... )
>>>
>>> # Create CPD with different modules
>>> cpd = ParametricCPD(
...     concepts=["binary_concept", "complex_concept"],
...     parametrization=[module_a, module_b]
... )
>>>
>>> print(cpd[0].parametrization)
Linear(in_features=10, out_features=1, bias=True)
>>> print(cpd[1].parametrization)
Sequential(...)

Notes

The ParametricCPD class uses a custom __new__ method to automatically split multiple concepts into separate ParametricCPD instances when a list is provided.
ParametricCPDs are typically created and managed by a ProbabilisticModel rather than directly.
The module should accept an ‘input’ keyword argument in its forward pass.
Supported distributions for CPT/potential building: Bernoulli, Categorical, Delta, Normal.
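
The splitting behaviour mentioned in the first note can be sketched with a plain Python class whose `__new__` returns a list when it receives several concept names. This is a minimal illustration, not the library's code: `SplitCPD` is a hypothetical name, and deep-copying a single shared module is an assumption about what "replicated" means.

```python
import copy


class SplitCPD:
    """Hypothetical stand-in showing the __new__-based splitting pattern."""

    def __new__(cls, concepts, parametrization):
        if isinstance(concepts, str):
            # Single concept: create a normal instance; __init__ runs afterwards.
            return super().__new__(cls)
        if isinstance(parametrization, (list, tuple)):
            if len(parametrization) != len(concepts):
                raise ValueError("need exactly one module per concept")
            modules = list(parametrization)
        else:
            # Assumption: one independent copy of the module per concept.
            modules = [copy.deepcopy(parametrization) for _ in concepts]
        # Returning a list (not an instance of cls) means __init__ is skipped here;
        # each element was already initialised by its own cls(...) call.
        return [cls(name, module) for name, module in zip(concepts, modules)]

    def __init__(self, concepts, parametrization):
        self.concepts = [concepts]
        self.parametrization = parametrization


cpds = SplitCPD(["a", "b"], [len, abs])    # list of two instances
single = SplitCPD("a", len)                # one instance
shared = SplitCPD(["a", "b"], {"w": 1.0})  # one object copied per concept
```

The consequence of this pattern is that the constructor call may return a list rather than a single object, which is why the example above indexes the result with `cpd[0]` and `cpd[1]`.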

See also

Variable: Represents a random variable in the probabilistic model.
ProbabilisticModel: Container that manages CPDs and variables.

variable: Variable | None¶

parents: List[Variable]¶

forward(**kwargs) → Tensor[source]¶

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

build_cpt() → Tensor[source]¶

build_potential() → Tensor[source]¶

training: bool¶

class BipartiteModel¶

Bases: GraphModel

Bipartite concept graph model with concepts and tasks in separate layers.

This model implements a bipartite graph structure where concepts only connect to tasks (not to each other), creating a clean separation between concept and task layers. This is useful for multi-task learning with shared concepts.

label_names¶

All node labels (concepts + tasks).

Type:: List[str]

concept_names¶

Concept node labels.

Type:: List[str]

task_names¶

Task node labels.

Type:: List[str]

Parameters:

task_names – List of task names (each must appear in the annotations labels).
input_size – Size of input features.
annotations – Annotations object with concept and task metadata.
encoder – LazyConstructor for encoding concepts from inputs.
predictor – LazyConstructor for predicting tasks from concepts.
use_source_exogenous – Whether to use exogenous features for source nodes.
source_exogenous – Optional propagator for source exogenous features.
internal_exogenous – Optional propagator for internal exogenous features.

Example

>>> import torch
>>> from torch_concepts import Annotations, AxisAnnotation
>>> from torch_concepts.nn import BipartiteModel, LazyConstructor, LinearCC
>>> from torch.distributions import Bernoulli
>>>
>>> # Define concepts and tasks
>>> all_labels = ('color', 'shape', 'size', 'task1', 'task2')
>>> metadata = {'color': {'distribution': Bernoulli},
...             'shape': {'distribution': Bernoulli},
...             'size': {'distribution': Bernoulli},
...             'task1': {'distribution': Bernoulli},
...             'task2': {'distribution': Bernoulli}}
>>> annotations = Annotations({
...     1: AxisAnnotation(labels=all_labels, metadata=metadata)
... })
>>>
>>> # Create bipartite model with tasks
>>> task_names = ['task1', 'task2']
>>>
>>> model = BipartiteModel(
...     task_names=task_names,
...     input_size=784,
...     annotations=annotations,
...     encoder=LazyConstructor(torch.nn.Linear),
...     predictor=LazyConstructor(LinearCC)
... )
>>>
>>> # Generate random input
>>> x = torch.randn(8, 784)  # batch_size=8
>>>
>>> # The forward pass is not shown here (its implementation depends on GraphModel).
>>> # Conceptually, concepts are encoded first, then tasks are predicted from concepts.
>>> print(model.concept_names)  # ['color', 'shape', 'size']
>>> print(model.task_names)     # ['task1', 'task2']
>>> print(model.probabilistic_model)
>>>
>>> # The bipartite structure ensures:
>>> # - Concepts don't predict other concepts
>>> # - Only concepts -> tasks edges exist

training: bool¶

class GraphModel(model_graph: ConceptGraph, input_size: int, annotations: Annotations, encoder: LazyConstructor | Module, predictor: LazyConstructor | Module, use_source_exogenous: bool | None = None, source_exogenous: LazyConstructor | Module | None = None, internal_exogenous: LazyConstructor | Module | None = None)[source]¶

Bases: BaseConstructor

Concept-based model with explicit graph structure between concepts and tasks.

This model builds a probabilistic model based on a provided concept graph structure. It automatically constructs the necessary variables and CPDs following the graph’s topological order, supporting both root concepts (encoded from inputs) and internal concepts (predicted from parents).

The graph structure defines dependencies between concepts, enabling:
- Hierarchical concept learning
- Causal reasoning with interventions
- Structured prediction with concept dependencies

model_graph¶

Directed acyclic graph defining concept relationships.

Type:: ConceptGraph

root_nodes¶

Concepts with no parents (encoded from inputs).

Type:: List[str]

internal_nodes¶

Concepts with parents (predicted from other concepts).

Type:: List[str]

graph_order¶

Topologically sorted concept names.

Type:: List[str]

probabilistic_model¶

Underlying PGM with variables and CPDs.

Type:: ProbabilisticModel

Parameters:

model_graph – ConceptGraph defining the structure (must be a DAG).
input_size – Size of input features.
annotations – Annotations object with concept metadata and distributions.
encoder – LazyConstructor for encoding root concepts from inputs.
predictor – LazyConstructor for predicting internal concepts from parents.
use_source_exogenous – Whether to use source exogenous features for predictions.
source_exogenous – Optional propagator for source exogenous features.
internal_exogenous – Optional propagator for internal exogenous features.

Raises:

AssertionError – If model_graph is not a DAG.
AssertionError – If node names don’t match annotations labels.

Example

>>> import torch
>>> import pandas as pd
>>> from torch_concepts import Annotations, AxisAnnotation, ConceptGraph
>>> from torch_concepts.nn import GraphModel, LazyConstructor, LinearCC
>>> from torch.distributions import Bernoulli
>>>
>>> # Define concepts and their structure
>>> # Structure: input -> [A, B] -> C -> D
>>> # A and B are root nodes (no parents)
>>> # C depends on A and B
>>> # D depends on C
>>> concept_names = ['A', 'B', 'C', 'D']
>>>
>>> # Create graph structure as adjacency matrix
>>> graph_df = pd.DataFrame(0, index=concept_names, columns=concept_names)
>>> graph_df.loc['A', 'C'] = 1  # A -> C
>>> graph_df.loc['B', 'C'] = 1  # B -> C
>>> graph_df.loc['C', 'D'] = 1  # C -> D
>>>
>>> graph = ConceptGraph(
...     torch.FloatTensor(graph_df.values),
...     node_names=concept_names
... )
>>>
>>> # Create annotations with distributions
>>> annotations = Annotations({
...     1: AxisAnnotation(
...         labels=tuple(concept_names),
...         metadata={
...             'A': {'distribution': Bernoulli},
...             'B': {'distribution': Bernoulli},
...             'C': {'distribution': Bernoulli},
...             'D': {'distribution': Bernoulli}
...         }
...     )
... })
>>>
>>> # Create GraphModel
>>> model = GraphModel(
...     model_graph=graph,
...     input_size=784,
...     annotations=annotations,
...     encoder=LazyConstructor(torch.nn.Linear),
...     predictor=LazyConstructor(LinearCC),
... )
>>>
>>> # Inspect the graph structure
>>> print(model.root_nodes)  # ['A', 'B'] - no parents
>>> print(model.internal_nodes)  # ['C', 'D'] - have parents
>>> print(model.graph_order)  # ['A', 'B', 'C', 'D'] - topological order
>>>
>>> # Check graph properties
>>> print(model.model_graph.is_dag())  # True
>>> print(model.model_graph.get_predecessors('C'))  # ['A', 'B']
>>> print(model.model_graph.get_successors('C'))  # ['D']
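
For intuition about how root nodes, internal nodes, and a topological order follow from an adjacency matrix, here is a self-contained sketch using Kahn's algorithm. It illustrates the general technique only: `analyse_dag` is a hypothetical helper, and nothing here is taken from the `ConceptGraph` or `GraphModel` implementation.

```python
from collections import deque


def analyse_dag(names, adj):
    """Return (root_nodes, internal_nodes, topological_order) for adj[i][j] = edge i -> j."""
    n = len(names)
    indegree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    roots = [names[j] for j in range(n) if indegree[j] == 0]
    internal = [names[j] for j in range(n) if indegree[j] > 0]

    # Kahn's algorithm: repeatedly emit nodes whose parents were all emitted.
    remaining = indegree[:]
    queue = deque(j for j in range(n) if remaining[j] == 0)
    order = []
    while queue:
        i = queue.popleft()
        order.append(names[i])
        for j in range(n):
            if adj[i][j]:
                remaining[j] -= 1
                if remaining[j] == 0:
                    queue.append(j)
    if len(order) != n:
        raise ValueError("graph contains a cycle, so it is not a DAG")
    return roots, internal, order


names = ["A", "B", "C", "D"]
adj = [
    [0, 0, 1, 0],  # A -> C
    [0, 0, 1, 0],  # B -> C
    [0, 0, 0, 1],  # C -> D
    [0, 0, 0, 0],
]
roots, internal, order = analyse_dag(names, adj)
```

A cycle leaves at least one node with unprocessed parents, so the same pass doubles as a DAG check.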

References

Dominici, et al. “Causal concept graph models: Beyond causal opacity in deep learning”, ICLR 2025. https://arxiv.org/abs/2405.16507
De Felice, et al. “Causally reliable concept bottleneck models”, NeurIPS. https://arxiv.org/abs/2503.04363v1

training: bool¶