Concept Encoders
This module provides encoder implementations that transform input features into concept representations.
Summary
Encoder Classes
LinearZC | Encoder that predicts concept activations from latent.
LinearUC | Encoder that extracts concepts from exogenous variables.
StochasticZC | Stochastic encoder that predicts concept distributions with uncertainty.
LinearZU | Exogenous encoder that creates concept exogenous.
SelectorZU | Memory-based selector for concept exogenous with attention mechanism.
Class Documentation
- class LinearZC(in_features: int, out_features: int, *args, **kwargs)
Bases: BaseEncoder
Encoder that predicts concept activations from latent.
This encoder maps an input latent representation to concept endogenous values using a linear layer. It is typically used as the first layer in concept bottleneck models, where it extracts concepts from the latent produced by a neural network.
- encoder
The encoding network.
- Type:
nn.Sequential
- Parameters:
in_features – Number of input latent features.
out_features – Number of output concept features.
*args – Additional arguments for torch.nn.Linear.
**kwargs – Additional keyword arguments for torch.nn.Linear.
Example
>>> import torch
>>> from torch_concepts.nn import LinearZC
>>>
>>> # Create encoder
>>> encoder = LinearZC(
...     in_features=128,
...     out_features=10
... )
>>>
>>> # Forward pass with latent from a neural network
>>> latent = torch.randn(4, 128)  # batch_size=4, latent_dim=128
>>> concept_endogenous = encoder(latent)
>>> print(concept_endogenous.shape)
torch.Size([4, 10])
>>>
>>> # Apply sigmoid to get probabilities
>>> concept_probs = torch.sigmoid(concept_endogenous)
>>> print(concept_probs.shape)
torch.Size([4, 10])
References
Koh et al. “Concept Bottleneck Models”, ICML 2020. https://arxiv.org/pdf/2007.04612
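The description above can be illustrated with a minimal sketch in plain PyTorch. It assumes the encoder is nothing more than a linear layer wrapped in nn.Sequential; the class name LinearConceptEncoderSketch is hypothetical and this is not the library's source code.

```python
import torch
from torch import nn


class LinearConceptEncoderSketch(nn.Module):
    """Hypothetical stand-in for a linear latent-to-concept encoder."""

    def __init__(self, in_features: int, out_features: int, **kwargs):
        super().__init__()
        # Extra keyword arguments (e.g. bias=False) are forwarded to nn.Linear.
        self.encoder = nn.Sequential(nn.Linear(in_features, out_features, **kwargs))

    def forward(self, latent: torch.Tensor) -> torch.Tensor:
        # One unnormalized score (logit) per concept.
        return self.encoder(latent)


torch.manual_seed(0)
sketch = LinearConceptEncoderSketch(in_features=128, out_features=10)
latent = torch.randn(4, 128)      # batch of 4 latent vectors
logits = sketch(latent)           # shape: (4, 10)
probs = torch.sigmoid(logits)     # squash logits into probabilities
```

Keeping the output as logits and applying the sigmoid afterwards mirrors the documented example, and it leaves room for losses that expect logits directly.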
- class LinearUC(in_features_exogenous: int, n_exogenous_per_concept: int = 1)
Bases: BaseEncoder
Encoder that extracts concepts from exogenous variables.
This encoder processes exogenous latent variables to produce concept representations. It requires at least one exogenous variable per concept.
- encoder
The encoding network.
- Type:
nn.Sequential
- Parameters:
in_features_exogenous – Number of exogenous input features.
n_exogenous_per_concept – Number of exogenous variables per concept (default: 1).
Example
>>> import torch
>>> from torch_concepts.nn import LinearUC
>>>
>>> # Create encoder with 2 exogenous vars per concept
>>> encoder = LinearUC(
...     in_features_exogenous=5,
...     n_exogenous_per_concept=2
... )
>>>
>>> # Forward pass with exogenous variables
>>> # Expected input shape: (batch, out_features, in_features_exogenous * n_exogenous_per_concept)
>>> exog_vars = torch.randn(4, 3, 10)  # batch=4, concepts=3, exog_features=5*2
>>> concept_endogenous = encoder(exog_vars)
>>> print(concept_endogenous.shape)
torch.Size([4, 3])
References
Espinosa Zarlenga et al. “Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off”, NeurIPS 2022. https://arxiv.org/abs/2209.09056
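The shape contract in the example above, (batch, concepts, features) in and (batch, concepts) out, can be reproduced with a small sketch. It assumes a single linear layer shared across concepts that reduces each concept's exogenous vector to one score; the class name ExogenousToConceptSketch is hypothetical and the real layer may be built differently.

```python
import torch
from torch import nn


class ExogenousToConceptSketch(nn.Module):
    """Hypothetical sketch: one score per concept from its exogenous vector."""

    def __init__(self, in_features_exogenous: int, n_exogenous_per_concept: int = 1):
        super().__init__()
        # Assumption: the layer sees all exogenous variables of a concept concatenated.
        width = in_features_exogenous * n_exogenous_per_concept
        self.encoder = nn.Sequential(nn.Linear(width, 1))

    def forward(self, exogenous: torch.Tensor) -> torch.Tensor:
        # nn.Linear acts on the last dimension, so every concept is scored
        # independently: (batch, concepts, width) -> (batch, concepts, 1).
        scores = self.encoder(exogenous)
        return scores.squeeze(-1)  # (batch, concepts)


torch.manual_seed(0)
uc_sketch = ExogenousToConceptSketch(in_features_exogenous=5, n_exogenous_per_concept=2)
exog_vars = torch.randn(4, 3, 10)   # batch=4, concepts=3, width=5*2
concept_scores = uc_sketch(exog_vars)
```

Because the linear layer only touches the last dimension, the number of concepts does not need to be known when the module is constructed, which is consistent with the documented constructor taking no out_features argument.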
- class StochasticZC(in_features: int, out_features: int, num_monte_carlo: int = 200, eps: float = 1e-06)
Bases: BaseEncoder
Stochastic encoder that predicts concept distributions with uncertainty.
Encodes input latent into concept distributions by predicting both mean and covariance matrices. Uses Monte Carlo sampling from the predicted multivariate normal distribution to generate concept representations.
- mu
Network for predicting concept means.
- Type:
nn.Sequential
- sigma
Network for predicting covariance lower triangle.
- Type:
nn.Linear
- Parameters:
in_features – Number of input latent features.
out_features – Number of output concepts.
num_monte_carlo – Number of Monte Carlo samples for uncertainty (default: 200).
Example
>>> import torch
>>> from torch_concepts.nn import StochasticZC
>>>
>>> # Create stochastic encoder
>>> encoder = StochasticZC(
...     in_features=128,
...     out_features=5,
...     num_monte_carlo=100
... )
>>>
>>> # Forward pass with mean reduction
>>> latent = torch.randn(4, 128)
>>> concept_endogenous = encoder(latent, reduce=True)
>>> print(concept_endogenous.shape)
torch.Size([4, 5])
>>>
>>> # Forward pass keeping all MC samples
>>> concept_samples = encoder(latent, reduce=False)
>>> print(concept_samples.shape)
torch.Size([4, 5, 100])
References
Vandenhirtz et al. “Stochastic Concept Bottleneck Models”, 2024. https://arxiv.org/pdf/2406.19272
- forward(input: Tensor, reduce: bool = True) → Tensor
Predict concept scores with uncertainty via Monte Carlo sampling.
Predicts a multivariate normal distribution over concepts and samples from it using the reparameterization trick.
- Parameters:
input – Input latent of shape (batch_size, in_features).
reduce – If True, return the mean over MC samples; if False, return all samples (default: True).
- Returns:
Concept endogenous of shape (batch_size, out_features) if reduce=True, or (batch_size, out_features, num_monte_carlo) if reduce=False.
- Return type:
Tensor
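The sampling procedure described for StochasticZC can be sketched with torch.distributions. The sketch below is an illustration under stated assumptions, not the library's implementation: it assumes the sigma head outputs the entries of a lower-triangular scale factor, that the diagonal is made positive with softplus, and that eps is added to the diagonal for numerical stability. The class name StochasticEncoderSketch is hypothetical.

```python
import torch
import torch.nn.functional as F
from torch import nn
from torch.distributions import MultivariateNormal


class StochasticEncoderSketch(nn.Module):
    """Hypothetical sketch of a mean + lower-triangular covariance encoder."""

    def __init__(self, in_features, out_features, num_monte_carlo=200, eps=1e-6):
        super().__init__()
        self.out_features = out_features
        self.num_monte_carlo = num_monte_carlo
        self.eps = eps
        self.mu = nn.Sequential(nn.Linear(in_features, out_features))
        # One output per entry of the lower triangle, diagonal included.
        self.sigma = nn.Linear(in_features, out_features * (out_features + 1) // 2)

    def forward(self, x, reduce=True):
        k = self.out_features
        rows, cols = torch.tril_indices(k, k, device=x.device)
        raw = x.new_zeros(x.shape[0], k, k)
        raw[:, rows, cols] = self.sigma(x)  # fill the lower triangle
        # Assumption: softplus + eps keeps the diagonal strictly positive.
        diag = F.softplus(torch.diagonal(raw, dim1=-2, dim2=-1)) + self.eps
        scale_tril = torch.tril(raw, diagonal=-1) + torch.diag_embed(diag)
        dist = MultivariateNormal(self.mu(x), scale_tril=scale_tril)
        # rsample draws with the reparameterization trick, so gradients
        # flow back into both the mean and the scale networks.
        samples = dist.rsample((self.num_monte_carlo,))  # (mc, batch, k)
        samples = samples.permute(1, 2, 0)               # (batch, k, mc)
        return samples.mean(dim=-1) if reduce else samples


torch.manual_seed(0)
sc_sketch = StochasticEncoderSketch(in_features=128, out_features=5, num_monte_carlo=100)
latent = torch.randn(4, 128)
mean_scores = sc_sketch(latent, reduce=True)    # (4, 5)
all_samples = sc_sketch(latent, reduce=False)   # (4, 5, 100)
```

Parameterizing the scale factor rather than the covariance itself guarantees a valid (positive-definite) covariance without any explicit projection step.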
- class LinearZU(in_features: int, out_features: int, exogenous_size: int)
Bases: BaseEncoder
Exogenous encoder that creates concept exogenous.
Transforms the input latent into exogenous variables (external features) for each concept, producing for every sample a 2D output of shape (out_features, exogenous_size), i.e. a batched output of shape (batch_size, out_features, exogenous_size). Implements the ‘embedding generators’ from Concept Embedding Models (Espinosa Zarlenga et al., 2022).
- encoder
The encoding network.
- Type:
nn.Sequential
- Parameters:
in_features – Number of input latent features.
out_features – Number of output concepts.
exogenous_size – Dimension of each concept’s exogenous.
Example
>>> import torch
>>> from torch_concepts.nn import LinearZU
>>>
>>> # Create exogenous encoder
>>> encoder = LinearZU(
...     in_features=128,
...     out_features=5,
...     exogenous_size=16
... )
>>>
>>> # Forward pass
>>> latent = torch.randn(4, 128)  # batch_size=4
>>> exog = encoder(latent)
>>> print(exog.shape)
torch.Size([4, 5, 16])
>>>
>>> # Each concept has its own 16-dimensional exogenous
>>> print(f"Concept 0 exogenous shape: {exog[:, 0, :].shape}")
Concept 0 exogenous shape: torch.Size([4, 16])
References
Espinosa Zarlenga et al. “Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off”, NeurIPS 2022. https://arxiv.org/abs/2209.09056
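One way to obtain a per-concept 2D output from a flat latent is a wide linear layer followed by a reshape. The sketch below assumes exactly that structure and omits any nonlinearity; the class name ExogenousEncoderSketch is hypothetical and the library's layer may differ.

```python
import torch
from torch import nn


class ExogenousEncoderSketch(nn.Module):
    """Hypothetical sketch: latent -> one exogenous vector per concept."""

    def __init__(self, in_features: int, out_features: int, exogenous_size: int):
        super().__init__()
        self.encoder = nn.Sequential(
            # Produce all concept vectors at once as one flat vector...
            nn.Linear(in_features, out_features * exogenous_size),
            # ...then split the last dimension into (concepts, exogenous_size).
            nn.Unflatten(-1, (out_features, exogenous_size)),
        )

    def forward(self, latent: torch.Tensor) -> torch.Tensor:
        return self.encoder(latent)


torch.manual_seed(0)
zu_sketch = ExogenousEncoderSketch(in_features=128, out_features=5, exogenous_size=16)
latent = torch.randn(4, 128)
exog = zu_sketch(latent)        # (4, 5, 16)
concept0 = exog[:, 0, :]        # exogenous vector of concept 0, (4, 16)
```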
- class SelectorZU(in_features: int, memory_size: int, exogenous_size: int, out_features: int, temperature: float = 1.0, *args, **kwargs)
Bases: BaseEncoder
Memory-based selector for concept exogenous with attention mechanism.
This module maintains a learnable memory bank of exogenous and uses an attention mechanism to select relevant exogenous based on input. It supports both soft (weighted) and hard (Gumbel-softmax) selection.
- memory
Learnable memory bank.
- Type:
nn.Embedding
- selector
Attention network for memory selection.
- Type:
nn.Sequential
- Parameters:
in_features – Number of input latent features.
memory_size – Number of memory slots per concept.
exogenous_size – Dimension of each memory exogenous.
out_features – Number of output concepts.
temperature – Temperature parameter for selection (default: 1.0).
*args – Additional arguments for the linear layer.
**kwargs – Additional keyword arguments for the linear layer.
Example
>>> import torch
>>> from torch_concepts.nn import SelectorZU
>>>
>>> # Create memory selector
>>> selector = SelectorZU(
...     in_features=64,
...     memory_size=10,
...     exogenous_size=32,
...     out_features=5,
...     temperature=0.5
... )
>>>
>>> # Forward pass with soft selection
>>> latent = torch.randn(4, 64)  # batch_size=4
>>> selected = selector(latent, sampling=False)
>>> print(selected.shape)
torch.Size([4, 5, 32])
>>>
>>> # Forward pass with hard selection (Gumbel-softmax)
>>> selected_hard = selector(latent, sampling=True)
>>> print(selected_hard.shape)
torch.Size([4, 5, 32])
References
Debot et al. “Interpretable Concept-Based Memory Reasoning”, NeurIPS 2024. https://arxiv.org/abs/2407.15527
- forward(input: Tensor | None = None, sampling: bool = False) → Tensor
Select memory exogenous based on the input latent.
Computes attention weights over memory slots and returns a weighted combination of memory exogenous. Can use soft attention or hard selection via Gumbel-softmax.
- Parameters:
input – Input latent of shape (batch_size, in_features).
sampling – If True, use Gumbel-softmax for hard selection; if False, use soft attention (default: False).
- Returns:
Selected exogenous of shape (batch_size, out_features, exogenous_size).
- Return type:
Tensor
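The soft and hard selection modes can be sketched as follows. This is a minimal illustration under assumptions, not the library's code: it assumes the memory stores memory_size slots per concept in one embedding row, that soft selection is a temperature-scaled softmax, and that hard selection uses the straight-through Gumbel-softmax from torch.nn.functional. The class name MemorySelectorSketch and the helper select_weights are hypothetical.

```python
import torch
import torch.nn.functional as F
from torch import nn


class MemorySelectorSketch(nn.Module):
    """Hypothetical sketch of attention-based selection over a memory bank."""

    def __init__(self, in_features, memory_size, exogenous_size, out_features,
                 temperature=1.0):
        super().__init__()
        self.shape = (out_features, memory_size, exogenous_size)
        self.temperature = temperature
        # Assumption: one embedding row per concept holds all of its memory slots.
        self.memory = nn.Embedding(out_features, memory_size * exogenous_size)
        self.selector = nn.Sequential(
            nn.Linear(in_features, out_features * memory_size),
            nn.Unflatten(-1, (out_features, memory_size)),
        )

    def select_weights(self, x, sampling=False):
        logits = self.selector(x)  # (batch, concepts, slots)
        if sampling:
            # Straight-through Gumbel-softmax: one-hot in the forward pass.
            return F.gumbel_softmax(logits, tau=self.temperature, hard=True, dim=-1)
        return torch.softmax(logits / self.temperature, dim=-1)

    def forward(self, x, sampling=False):
        weights = self.select_weights(x, sampling)
        slots = self.memory.weight.view(*self.shape)  # (concepts, slots, size)
        # Weighted combination of each concept's slots for every sample.
        return torch.einsum("bcm,cme->bce", weights, slots)


torch.manual_seed(0)
sel_sketch = MemorySelectorSketch(in_features=64, memory_size=10,
                                  exogenous_size=32, out_features=5, temperature=0.5)
latent = torch.randn(4, 64)
selected_soft = sel_sketch(latent, sampling=False)       # (4, 5, 32)
selected_hard = sel_sketch(latent, sampling=True)        # (4, 5, 32)
hard_weights = sel_sketch.select_weights(latent, sampling=True)
```

With hard selection each concept picks a single slot per sample, so the output is a copy of one memory entry; with soft selection it is a blend, and a lower temperature makes the blend sharper.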