Concept Predictors

This module provides predictor implementations that map from concepts to target predictions.

Summary

Predictor Classes

LinearCC

Linear concept predictor.

MixCUC

Concept exogenous predictor with mixture of concept activations and exogenous features.

HyperLinearCUC

Hypernetwork-based linear predictor for concept-based models.

CallableCC

A predictor that applies a custom callable function to concept representations.

Class Documentation

class LinearCC(in_features_endogenous: int, out_features: int, in_activation: Callable = torch.sigmoid)

Bases: BasePredictor

Linear concept predictor.

This predictor maps input concept endogenous values to output concept endogenous values by applying the input activation and then a linear layer.

in_features_endogenous

Number of input logit features.

Type:

int

out_features

Number of output concept features.

Type:

int

in_activation

Activation function for inputs (default: sigmoid).

Type:

Callable

predictor

The prediction network.

Type:

nn.Sequential

Parameters:
  • in_features_endogenous – Number of input logit features.

  • out_features – Number of output concept features.

  • in_activation – Activation function to apply to input endogenous (default: torch.sigmoid).

Example

>>> import torch
>>> from torch_concepts.nn import LinearCC
>>>
>>> # Create predictor
>>> predictor = LinearCC(
...     in_features_endogenous=10,
...     out_features=5
... )
>>>
>>> # Forward pass
>>> in_endogenous = torch.randn(2, 10)  # batch_size=2, in_features=10
>>> out_endogenous = predictor(in_endogenous)
>>> print(out_endogenous.shape)
torch.Size([2, 5])
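
The arithmetic behind this predictor can be sketched without any tensor library. The block below is a minimal illustration, not the library's implementation: it assumes that in_activation is applied element-wise to the inputs before the linear layer, as the parameter description suggests. The function name linear_concept_predictor and the weight and bias values are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, the default input activation."""
    return 1.0 / (1.0 + math.exp(-x))


def linear_concept_predictor(endogenous, weight, bias, in_activation=sigmoid):
    """Apply in_activation element-wise, then an affine map (one sample).

    endogenous: list of in_features floats
    weight:     out_features rows, each holding in_features floats
    bias:       list of out_features floats
    """
    activated = [in_activation(x) for x in endogenous]
    return [
        sum(w * a for w, a in zip(row, activated)) + b
        for row, b in zip(weight, bias)
    ]


# Hypothetical parameters: 3 input concepts, 2 outputs.
weight = [[1.0, 0.0, -1.0],
          [0.5, 0.5, 0.5]]
bias = [0.0, 0.1]
out = linear_concept_predictor([0.0, 2.0, -2.0], weight, bias)
```

Because the activation is applied to the inputs, the linear layer operates on values in (0, 1) whenever the default sigmoid is used.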

References

Koh et al. “Concept Bottleneck Models”, ICML 2020. https://arxiv.org/pdf/2007.04612

forward(endogenous: Tensor) -> Tensor

Forward pass through the predictor.

Parameters:

endogenous – Input endogenous of shape (batch_size, in_features_endogenous).

Returns:

Predicted concept probabilities of shape (batch_size, out_features).

Return type:

torch.Tensor

prune(mask: Tensor)

Prune input features based on a binary mask.

Removes input features where mask is False/0, reducing model complexity.

Parameters:

mask – Binary mask of shape (in_features_endogenous,) indicating which features to keep (True/1) or remove (False/0).

Example

>>> import torch
>>> from torch_concepts.nn import LinearCC
>>>
>>> predictor = LinearCC(in_features_endogenous=10, out_features=5)
>>>
>>> # Prune first 3 features
>>> mask = torch.tensor([0, 0, 0, 1, 1, 1, 1, 1, 1, 1], dtype=torch.bool)
>>> predictor.prune(mask)
>>>
>>> # Now only accepts 7 input features
>>> endogenous = torch.randn(2, 7)
>>> probs = predictor(endogenous)
>>> print(probs.shape)
torch.Size([2, 5])
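
The effect of a pruning mask on a linear layer can be sketched in plain Python. This is an illustration only: it assumes the weights form an (out_features, in_features) matrix and that pruning drops the columns whose mask entry is False. How LinearCC stores and rewrites its parameters internally is not described here, and prune_inputs is a hypothetical helper.

```python
def prune_inputs(weight, mask):
    """Keep only the weight columns whose mask entry is truthy."""
    if any(len(row) != len(mask) for row in weight):
        raise ValueError("mask length must match the number of input features")
    return [[w for w, keep in zip(row, mask) if keep] for row in weight]


# 5 outputs x 10 inputs; entry (r, c) is 10*r + c so columns are easy to track.
weight = [[float(10 * r + c) for c in range(10)] for r in range(5)]

# Drop the first 3 input features, keep the remaining 7.
mask = [False] * 3 + [True] * 7
pruned = prune_inputs(weight, mask)
```

After pruning, each row holds 7 weights, which matches the 7-feature input used in the example above.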
class MixCUC(in_features_endogenous: int, in_features_exogenous: int, out_features: int, in_activation: Callable = torch.sigmoid, cardinalities: List[int] | None = None)

Bases: BasePredictor

Concept exogenous predictor with mixture of concept activations and exogenous features.

This predictor implements the task predictor of the Concept Embedding Model (CEM), which combines concept activations with learned exogenous embeddings using a mixture operation.

Main reference: “Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off” (Espinosa Zarlenga et al., NeurIPS 2022).
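
The mixture operation can be sketched as follows. This follows the mixing rule from the CEM paper, where each concept embedding has two halves that are blended by the concept probability p as p * first_half + (1 - p) * second_half. Which half plays which role, and how MixCUC aggregates the mixed embeddings before its linear layer, are assumptions not confirmed by this page. The function mix_concept_embeddings is hypothetical.

```python
def mix_concept_embeddings(probs, exogenous):
    """Blend the two halves of each concept embedding by its probability.

    probs:     n_concepts floats in [0, 1]
    exogenous: n_concepts vectors, each of even length 2 * m
    returns:   n_concepts vectors of length m
    """
    mixed = []
    for p, emb in zip(probs, exogenous):
        if len(emb) % 2 != 0:
            raise ValueError("embedding size must be even")
        half = len(emb) // 2
        active, inactive = emb[:half], emb[half:]
        mixed.append([p * a + (1.0 - p) * b for a, b in zip(active, inactive)])
    return mixed


probs = [1.0, 0.0, 0.25]
exogenous = [[1.0, 2.0, 3.0, 4.0] for _ in probs]
mixed = mix_concept_embeddings(probs, exogenous)
```

With p = 1 the result is the first half, with p = 0 the second half, and intermediate probabilities interpolate between the two. This is why the mixed size is half of the exogenous latent size.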

in_features_endogenous

Number of input concept endogenous.

Type:

int

in_features_exogenous

Number of exogenous features.

Type:

int

out_features

Number of output features.

Type:

int

cardinalities

Cardinalities for grouped concepts.

Type:

List[int]

predictor

Linear predictor module.

Type:

nn.Module

Parameters:
  • in_features_endogenous – Number of input concept endogenous.

  • in_features_exogenous – Number of exogenous features (must be even).

  • out_features – Number of output task features.

  • in_activation – Activation function for concept endogenous (default: sigmoid).

  • cardinalities – List of concept group cardinalities (optional).

Example

>>> import torch
>>> from torch_concepts.nn import MixCUC
>>>
>>> # Create predictor with 10 concepts, 20 exogenous dims, 3 tasks
>>> predictor = MixCUC(
...     in_features_endogenous=10,
...     in_features_exogenous=10,  # Must be half of exogenous latent size when no cardinalities are provided
...     out_features=3,
...     in_activation=torch.sigmoid
... )
>>>
>>> # Generate random inputs
>>> concept_endogenous = torch.randn(4, 10)  # batch_size=4, n_concepts=10
>>> exogenous = torch.randn(4, 10, 20)  # (batch, n_concepts, emb_size)
>>>
>>> # Forward pass
>>> task_endogenous = predictor(endogenous=concept_endogenous, exogenous=exogenous)
>>> print(task_endogenous.shape)  # torch.Size([4, 3])
>>>
>>> # With concept groups (e.g., color has 3 values, shape has 4, etc.)
>>> predictor_grouped = MixCUC(
...     in_features_endogenous=10,
...     in_features_exogenous=20, # Must be equal to exogenous latent size when cardinalities are provided
...     out_features=3,
...     cardinalities=[3, 4, 3]  # 3 groups summing to 10
... )
>>>
>>> # Forward pass with grouped concepts
>>> task_endogenous = predictor_grouped(endogenous=concept_endogenous, exogenous=exogenous)
>>> print(task_endogenous.shape)  # torch.Size([4, 3])
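
How a cardinalities list partitions the concept axis can be sketched in plain Python. The block assumes that groups occupy consecutive positions along the concept dimension and that the cardinalities sum to the number of concepts, as in the comment above. What MixCUC then does with each group is not shown, and split_by_cardinalities is a hypothetical helper.

```python
def split_by_cardinalities(values, cardinalities):
    """Split a flat sequence into consecutive groups of the given sizes."""
    if sum(cardinalities) != len(values):
        raise ValueError("cardinalities must sum to the number of concepts")
    groups, start = [], 0
    for size in cardinalities:
        groups.append(values[start:start + size])
        start += size
    return groups


# 10 concept indices split into groups of 3, 4 and 3.
groups = split_by_cardinalities(list(range(10)), [3, 4, 3])
```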

References

Espinosa Zarlenga et al. “Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off”, NeurIPS 2022. https://arxiv.org/abs/2209.09056

forward(endogenous: Tensor, exogenous: Tensor) Tensor[source]

Forward pass through the predictor.

Parameters:
  • endogenous – Concept endogenous of shape (batch_size, n_concepts).

  • exogenous – Concept exogenous of shape (batch_size, n_concepts, emb_size).

Returns:

Task predictions of shape (batch_size, out_features).

Return type:

torch.Tensor

class HyperLinearCUC(in_features_endogenous: int, in_features_exogenous: int, embedding_size: int, in_activation: Callable = lambda x: x, use_bias: bool = True, init_bias_mean: float = 0.0, init_bias_std: float = 0.01, min_std: float = 1e-06)

Bases: BasePredictor

Hypernetwork-based linear predictor for concept-based models.

This predictor uses a hypernetwork to generate per-sample weights from exogenous features, enabling sample-adaptive predictions. It also supports stochastic biases with learnable mean and standard deviation.
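
The idea of per-sample weights can be sketched in plain Python. This is a simplified illustration under stated assumptions: the hypernetwork here is a single affine map from a context vector to one weight per concept, whereas the class uses a hidden layer of size embedding_size, and the stochastic bias is left out. The names hyper_linear, hyper_weight and hyper_bias are hypothetical.

```python
def hyper_linear(concepts, exogenous, hyper_weight, hyper_bias):
    """Predict one value per task with weights generated from context.

    concepts:     n_concepts floats (one sample)
    exogenous:    n_tasks context vectors of exog_dim floats (same sample)
    hyper_weight: n_concepts rows of exog_dim floats
    hyper_bias:   n_concepts floats
    """
    outputs = []
    for context in exogenous:
        # Hypernetwork step: context vector -> one weight per concept.
        task_weights = [
            sum(h * x for h, x in zip(row, context)) + b
            for row, b in zip(hyper_weight, hyper_bias)
        ]
        # Linear step: weighted sum of the concepts with the generated weights.
        outputs.append(sum(w * c for w, c in zip(task_weights, concepts)))
    return outputs


hyper_weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 concepts, exog_dim = 2
hyper_bias = [0.0, 0.0, 0.0]
concepts = [1.0, 2.0, 3.0]

# Same concepts, different context: the generated weights and outputs differ.
sample_a = hyper_linear(concepts, [[1.0, 0.0], [0.0, 1.0]], hyper_weight, hyper_bias)
sample_b = hyper_linear(concepts, [[0.0, 1.0], [1.0, 1.0]], hyper_weight, hyper_bias)
```

The two samples share the same concept values but receive different predictions because their exogenous features produce different weights.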

in_features_endogenous

Number of input concept endogenous.

Type:

int

in_features_exogenous

Number of exogenous features.

Type:

int

embedding_size

Hidden size of the hypernetwork.

Type:

int

out_features

Number of output features.

Type:

int

use_bias

Whether to use stochastic bias.

Type:

bool

hypernet

Hypernetwork that generates weights.

Type:

nn.Module

Parameters:
  • in_features_endogenous – Number of input concept endogenous.

  • in_features_exogenous – Number of exogenous input features.

  • embedding_size – Hidden dimension of hypernetwork.

  • in_activation – Activation function for concepts (default: identity).

  • use_bias – Whether to add stochastic bias (default: True).

  • init_bias_mean – Initial mean for bias distribution (default: 0.0).

  • init_bias_std – Initial std for bias distribution (default: 0.01).

  • min_std – Minimum std to ensure stability (default: 1e-6).

Example

>>> import torch
>>> from torch_concepts.nn import HyperLinearCUC
>>>
>>> # Create hypernetwork predictor
>>> predictor = HyperLinearCUC(
...     in_features_endogenous=10,  # 10 concepts
...     in_features_exogenous=128,  # 128-dim context features
...     embedding_size=64,  # Hidden dim of hypernet
...     use_bias=True
... )
>>>
>>> # Generate random inputs
>>> concept_endogenous = torch.randn(4, 10)   # batch_size=4, n_concepts=10
>>> exogenous = torch.randn(4, 3, 128)         # batch_size=4, n_tasks=3, exogenous_dim=128
>>>
>>> # Forward pass - generates per-sample weights via hypernetwork
>>> task_endogenous = predictor(endogenous=concept_endogenous, exogenous=exogenous)
>>> print(task_endogenous.shape)  # torch.Size([4, 3])
>>>
>>> # The hypernetwork generates different weights for each sample
>>> # This enables sample-adaptive predictions
>>>
>>> # Example without bias
>>> predictor_no_bias = HyperLinearCUC(
...     in_features_endogenous=10,
...     in_features_exogenous=128,
...     embedding_size=64,
...     use_bias=False
... )
>>>
>>> task_endogenous = predictor_no_bias(endogenous=concept_endogenous, exogenous=exogenous)
>>> print(task_endogenous.shape)  # torch.Size([4, 3])

References

Debot et al. “Interpretable Concept-Based Memory Reasoning”, NeurIPS 2024. https://arxiv.org/abs/2407.15527

forward(endogenous: Tensor, exogenous: Tensor) -> Tensor

Forward pass through hypernetwork predictor.

Parameters:
  • endogenous – Concept endogenous of shape (batch_size, n_concepts).

  • exogenous – Exogenous features of shape (batch_size, n_tasks, in_features_exogenous), with one context vector per task, as in the class example.

Returns:

Task predictions of shape (batch_size, out_features).

Return type:

torch.Tensor

prune(mask: Tensor)

Prune the predictor based on a concept mask.

Parameters:

mask – Binary mask of shape (n_concepts,) indicating which concepts to keep.

class CallableCC(func: Callable, in_activation: Callable = lambda x: x, use_bias: bool = True, init_bias_mean: float = 0.0, init_bias_std: float = 0.01, min_std: float = 1e-06)

Bases: BasePredictor

A predictor that applies a custom callable function to concept representations.

This predictor allows flexible task prediction by accepting any callable function that operates on concept representations. It optionally includes learnable stochastic bias parameters (mean and standard deviation) that are added to the output using the reparameterization trick for gradient-based learning.

The module can be used to write custom layers for standard Structural Causal Models (SCMs).

Parameters:
  • func – Callable function that takes concept probabilities and returns task predictions. Should accept a tensor of shape (batch_size, n_concepts) and return a tensor of shape (batch_size, out_features).

  • in_activation – Activation function to apply to input endogenous before passing to func. Default is identity (lambda x: x).

  • use_bias – Whether to add learnable stochastic bias to the output. Default is True.

  • init_bias_mean – Initial value for the bias mean parameter. Default is 0.0.

  • init_bias_std – Initial value for the bias standard deviation. Default is 0.01.

  • min_std – Minimum standard deviation floor for numerical stability. Default is 1e-6.
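
The stochastic bias can be sketched with the standard reparameterization trick, bias = mean + std * eps with eps drawn from N(0, 1). This is a minimal sketch, not the class's code: it assumes one independent bias draw per output and a simple max(std, min_std) floor, whereas the actual parameterization of the standard deviation is not described here. The helpers sample_bias, predict and double are hypothetical.

```python
import random


def sample_bias(mean, std, min_std=1e-6, rng=random):
    """Draw mean + max(std, min_std) * eps with eps ~ N(0, 1)."""
    eps = rng.gauss(0.0, 1.0)
    return mean + max(std, min_std) * eps


def predict(endogenous, func, mean=0.0, std=0.01, use_bias=True, rng=random):
    """Apply func, then optionally add a freshly sampled bias to each output."""
    out = func(endogenous)
    if not use_bias:
        return out
    return [o + sample_bias(mean, std, rng=rng) for o in out]


def double(xs):
    """Stand-in for a user-supplied callable."""
    return [2.0 * x for x in xs]


rng = random.Random(0)  # seeded generator for reproducibility
noisy = predict([1.0, -1.0], double, rng=rng)
clean = predict([1.0, -1.0], double, use_bias=False)
```

Writing the noise as mean + std * eps keeps the sample differentiable with respect to the mean and standard deviation, which is what allows both to be learned by gradient descent.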

Examples

>>> import torch
>>> from torch_concepts.nn import CallableCC
>>>
>>> # Generate sample data
>>> batch_size = 32
>>> n_concepts = 3
>>> endogenous = torch.randn(batch_size, n_concepts)
>>>
>>> # Define a polynomial function with fixed weights for 3 inputs, 2 outputs
>>> def quadratic_predictor(probs):
...     c0, c1, c2 = probs[:, 0:1], probs[:, 1:2], probs[:, 2:3]
...     output1 = 0.5*c0**2 + 1.0*c1**2 + 1.5*c2
...     output2 = 2.0*c0 - 1.0*c1**2 + 0.5*c2**3
...     return torch.cat([output1, output2], dim=1)
>>>
>>> predictor = CallableCC(
...     func=quadratic_predictor,
...     use_bias=True
... )
>>> predictions = predictor(endogenous)
>>> print(predictions.shape)  # torch.Size([32, 2])

References

Pearl, J. “Causality”, Cambridge University Press (2009).

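forward(endogenous: Tensor, *args, **kwargs) -> Tensor

Forward pass through the predictor.

Applies in_activation to endogenous, passes the result to func and, if use_bias is True, adds the stochastic bias to the output.

Parameters:

endogenous – Input endogenous of shape (batch_size, n_concepts).

Returns:

Task predictions of shape (batch_size, out_features).

Return type:

torch.Tensor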
