Interpretable Layers and Interventions
======================================

The Low-Level API provides building blocks to create concept-based models using interpretable layers and perform interventions using a PyTorch-like interface.

.. |pyc_logo| image:: https://raw.githubusercontent.com/pyc-team/pytorch_concepts/refs/heads/master/doc/_static/img/logos/pyc.svg
   :width: 20px
   :align: middle

.. |pytorch_logo| image:: https://raw.githubusercontent.com/pyc-team/pytorch_concepts/refs/heads/master/doc/_static/img/logos/pytorch.svg
   :width: 20px
   :align: middle

Design Principles
-----------------

Overview of Data Representations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In |pyc_logo| PyC, we distinguish between three types of data representations:

- **Input**: High-dimensional representations where exogenous and endogenous information is entangled
- **Exogenous**: Representations that are direct causes of endogenous variables
- **Endogenous**: Representations of observable quantities of interest

Layer Types
^^^^^^^^^^^

In |pyc_logo| PyC you will find three types of layers whose interfaces reflect the distinction between data representations:

- ``Encoder`` layers: Never take endogenous variables as input
- ``Predictor`` layers: Must take a set of endogenous variables as input
- Special layers: Perform operations like memory selection or graph learning

Layer Naming Standard
^^^^^^^^^^^^^^^^^^^^^

To make the type of a layer easy to identify, |pyc_logo| PyC names layers according to a consistent standard. Each layer name follows the format:

``<LayerType><InputType><OutputType>``

where:

- ``LayerType``: describes the type of layer (e.g., Linear, HyperLinear, Selector, Transformer, etc.)
- ``InputType`` and ``OutputType``: describe the type of data representations the layer takes as input and produces as output.

|pyc_logo| PyC uses the following abbreviations:

- ``Z``: Input
- ``U``: Exogenous
- ``C``: Endogenous

For instance, a layer named ``LinearZC`` is a linear layer that takes an ``Input`` representation and produces an ``Endogenous`` representation. Since it does not take any endogenous variables as input, it is an encoder layer.

.. code-block:: python

    import torch_concepts as pyc

    pyc.nn.LinearZC(in_features=10, out_features=3)

As another example, a layer named ``HyperLinearCUC`` is a hyper-network layer that takes both ``Endogenous`` and ``Exogenous`` representations as input and produces an ``Endogenous`` representation. Since it takes endogenous variables as input, it is a predictor layer.

.. code-block:: python

    pyc.nn.HyperLinearCUC(
        in_features_endogenous=10,
        in_features_exogenous=7,
        embedding_size=24,
        out_features=3
    )

As a final example, graph learners are special layers that learn relationships between concepts. They do not follow the standard naming convention of encoders and predictors, but their purpose should be clear from their name.

.. code-block:: python

    wanda = pyc.nn.WANDAGraphLearner(
        ['c1', 'c2', 'c3'],
        ['task A', 'task B', 'task C']
    )

Detailed Guides
---------------

.. dropdown:: Concept Bottleneck Model
    :icon: package

    **Import Libraries**

    To get started, import |pyc_logo| PyC and |pytorch_logo| PyTorch:

    .. code-block:: python

        import torch
        import torch_concepts as pyc

    **Create Sample Data**

    Generate random inputs and targets for demonstration:

    .. code-block:: python

        batch_size = 32
        input_dim = 64
        n_concepts = 5
        n_tasks = 3

        # Random input
        x = torch.randn(batch_size, input_dim)

        # Random concept labels (binary)
        concept_labels = torch.randint(0, 2, (batch_size, n_concepts)).float()

        # Random task labels (class indices)
        task_labels = torch.randint(0, n_tasks, (batch_size,))

    **Build a Concept Bottleneck Model**

    Use a ``ModuleDict`` to combine an encoder and a predictor:

    .. code-block:: python

        # Create model using ModuleDict
        model = torch.nn.ModuleDict({
            'encoder': pyc.nn.LinearZC(
                in_features=input_dim,
                out_features=n_concepts
            ),
            'predictor': pyc.nn.LinearCC(
                in_features_endogenous=n_concepts,
                out_features=n_tasks
            ),
        })

.. dropdown:: Inference and Training
    :icon: rocket

    **Inference**

    Once a concept bottleneck model is built, we can perform inference by first obtaining concept activations from the encoder, and then task predictions from the predictor:

    .. code-block:: python

        # Get concept endogenous values from the input
        concept_endogenous = model['encoder'](input=x)

        # Get task predictions from the concept endogenous values
        task_endogenous = model['predictor'](endogenous=concept_endogenous)

        print(f"Concept endogenous shape: {concept_endogenous.shape}")  # [32, 5]
        print(f"Task endogenous shape: {task_endogenous.shape}")  # [32, 3]

    **Compute Loss and Train**

    Train with both concept and task supervision:

    .. code-block:: python

        import torch.nn.functional as F

        # Compute losses; the encoder outputs are treated as logits, so the
        # sigmoid is folded into the numerically stable logits-based loss
        concept_loss = F.binary_cross_entropy_with_logits(concept_endogenous, concept_labels)
        task_loss = F.cross_entropy(task_endogenous, task_labels)
        total_loss = task_loss + 0.5 * concept_loss

        # Backpropagation
        total_loss.backward()

        print(f"Concept loss: {concept_loss.item():.4f}")
        print(f"Task loss: {task_loss.item():.4f}")

.. dropdown:: Interventions
    :icon: tools

    Intervene using the ``intervention`` context manager, which temporarily replaces the encoder layer.
    The context manager takes two main arguments: **strategies** and **policies**.

    - Intervention strategies define how the layer behaves during the intervention, e.g., setting concept endogenous to ground truth values.
    - Intervention policies define the priority/order of concepts to intervene on.

    .. code-block:: python

        from torch_concepts.nn import GroundTruthIntervention, UniformPolicy
        from torch_concepts.nn import intervention

        ground_truth = 10 * torch.rand_like(concept_endogenous)

        strategy = GroundTruthIntervention(model=model['encoder'], ground_truth=ground_truth)
        policy = UniformPolicy(out_features=n_concepts)

        # Apply intervention to encoder
        with intervention(
            policies=policy,
            strategies=strategy,
            target_concepts=[0, 2]
        ) as new_encoder_layer:
            intervened_concepts = new_encoder_layer(input=x)
            intervened_tasks = model['predictor'](endogenous=intervened_concepts)

            print(f"Original concept endogenous: {concept_endogenous[0]}")
            print(f"Original task predictions: {task_endogenous[0]}")
            print(f"Intervened concept endogenous: {intervened_concepts[0]}")
            print(f"Intervened task predictions: {intervened_tasks[0]}")

.. dropdown:: (Advanced) Graph Learning
    :icon: workflow

    Add a graph learner to discover concept relationships:

    .. code-block:: python

        # Define concept names
        concept_names = ['round', 'smooth', 'bright', 'large', 'centered']

        # Create WANDA graph learner
        graph_learner = pyc.nn.WANDAGraphLearner(
            row_labels=concept_names,
            col_labels=concept_names
        )

        print(f"Learned graph shape: {graph_learner.weighted_adj.shape}")

    The ``graph_learner.weighted_adj`` tensor contains a learnable adjacency matrix representing relationships between concepts.

.. dropdown:: (Advanced) Verifiable Concept-Based Models
    :icon: shield-check

    To design more complex concept-based models, you can combine multiple interpretable layers.
    For example, to build a verifiable concept-based model, we can use an encoder to predict concept activations,
    a selector to select relevant exogenous information, and a hyper-network predictor to make final predictions
    based on both concept activations and exogenous information.

    .. code-block:: python

        from torch_concepts.nn import LinearZC, SelectorZU, HyperLinearCUC

        # input_dim, n_concepts, and n_tasks are reused from the sample data above
        memory_size = 7
        exogenous_size = 16
        embedding_size = 5

        # Create model using ModuleDict
        model = torch.nn.ModuleDict({
            'encoder': LinearZC(
                in_features=input_dim,
                out_features=n_concepts
            ),
            'selector': SelectorZU(
                in_features=input_dim,
                memory_size=memory_size,
                exogenous_size=exogenous_size,
                out_features=n_tasks
            ),
            'predictor': HyperLinearCUC(
                in_features_endogenous=n_concepts,
                in_features_exogenous=exogenous_size,
                embedding_size=embedding_size,
            )
        })

Next Steps
----------

- Explore the full :doc:`Low-Level API documentation `
- Try the :doc:`Mid-Level API ` for probabilistic modeling
- Try the :doc:`Mid-Level API ` for causal modeling
- Check out :doc:`example notebooks `