Usage
Installation
To use causally, first install it by cloning the project via git:
$ git clone https://github.com/francescomontagna/causally.git
$ cd 'path/to/causally'
$ pip install .
For development purposes, install in editable mode (pip install -e .).
Graph generation
Structural causal models are associated with a causal graph. causally supports three models for random network generation:
Erdos-Renyi: generates directed acyclic graphs by randomly connecting pairs of nodes.
Barabasi-Albert: generates directed acyclic graphs by connecting pairs of nodes according to a preferential attachment scheme, in which nodes with higher degree are more likely to be assigned a new edge. The resulting scale-free graphs model the presence of hubs.
Gaussian Random Partition: generates directed acyclic graphs by sparsely connecting multiple Erdos-Renyi clusters together.
from causally.graph import random_graph
graph_generator = random_graph.ErdosRenyi(
    num_nodes=10,
    p_edge=0.4
)
adjacency = graph_generator.get_random_graph()
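To make the Erdos-Renyi description more concrete, here is a minimal sketch of one common way to sample a random directed acyclic graph: draw a random ordering of the nodes, then add each edge from an earlier node to a later node with probability p_edge. It is written in plain Python and is independent of causally; the helper name sample_er_dag and the convention that adjacency[i][j] == 1 means an edge from i to j are assumptions made for this illustration, not the library's implementation.

```python
import random


def sample_er_dag(num_nodes, p_edge, seed=None):
    """Sample a random DAG as a dense 0/1 adjacency matrix (list of lists).

    Hypothetical helper for illustration only; not part of causally.
    """
    rng = random.Random(seed)
    # Random causal ordering: order[k] is the node placed at position k.
    order = list(range(num_nodes))
    rng.shuffle(order)
    adjacency = [[0] * num_nodes for _ in range(num_nodes)]
    for pos_i in range(num_nodes):
        for pos_j in range(pos_i + 1, num_nodes):
            # Edges only point from earlier to later positions in the
            # ordering, so a directed cycle can never be created.
            if rng.random() < p_edge:
                adjacency[order[pos_i]][order[pos_j]] = 1
    return adjacency, order


adjacency, order = sample_er_dag(num_nodes=10, p_edge=0.4, seed=0)
num_edges = sum(map(sum, adjacency))
```

Because every edge respects the sampled ordering, that ordering is also a valid topological order of the graph, which is what makes sampling data node by node possible later on.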
Causal mechanisms
causally supports several functions for random generation of the causal mechanisms of structural causal models.
Linear mechanisms: generate a linear combination of the input observations.
Nonlinear mechanisms: generate a nonlinear combination of the input observations. The nonlinear function can be parametrized by a random neural network, or sampled from a Gaussian process whose covariance is the kernel matrix of the parents.
import numpy as np
from causally.scm import causal_mechanism as cm
mechanism = cm.NeuralNetMechanism()
parents = np.random.standard_normal(size=(1000, 2)) # 1000 samples, 2 parents
child = mechanism.predict(parents) # 1000 samples
Defining a custom causal mechanism is as simple as subclassing PredictionModel and implementing its abstract method predict().
import numpy as np
from causally.scm.causal_mechanism import PredictionModel
class SumOfSquares(PredictionModel):
    def predict(self, X):
        # Sum the squared parent values across columns: one effect per sample
        effect = np.square(X).sum(axis=1)
        return effect
mechanism = SumOfSquares()
parents = np.random.standard_normal(size=(1000, 2)) # 1000 samples, 2 parents
child = mechanism.predict(parents)
Noise terms
causally allows specifying the distribution of the exogenous random variables of the structural causal model.
Probability distributions of the noise terms are specified by subclassing Distribution and implementing its abstract method sample(). Additionally, samples can be generated as nonlinear transformations of a standard Normal. This is achieved by subclassing RandomNoiseDistribution and implementing its abstract method sample(). For example, use an instance of MLPNoise for noise terms generated by a nonlinear transformation of a standard Normal with a random neural network.
import numpy as np
from torch import nn
from causally.scm.noise import Distribution, MLPNoise, Normal
# Generate samples from a Normal distribution
normal_generator = Normal()
normal_samples = normal_generator.sample((1000, ))
# Generate samples from a Laplace distribution
class Laplace(Distribution):
    def __init__(self, loc: float = 1.0, scale: float = 2.0):
        self.loc = loc
        self.scale = scale
    def sample(self, size: tuple[int, ...]):
        return np.random.laplace(self.loc, self.scale, size)
laplace_generator = Laplace()
laplace_samples = laplace_generator.sample((1000, ))
# Generate samples as a random nonlinear transformation of a standard Normal
mlp_generator = MLPNoise(
    hidden_dim=100,
    activation=nn.Sigmoid(),
    bias=False,
)
mlp_samples = mlp_generator.sample((1000, ))
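The idea of noise generated as a nonlinear transformation of a standard Normal can be sketched without any dependencies. The example below draws standard Normal values and passes each through a fixed, randomly initialized one-hidden-layer network with sigmoid activations and no bias. The function name sample_transformed_noise, the weight initialization, and the architecture are assumptions for illustration; MLPNoise may differ in all of these details.

```python
import math
import random


def sample_transformed_noise(num_samples, hidden_dim=100, seed=None):
    """Transform standard Normal draws with a random one-hidden-layer network.

    Hypothetical helper for illustration only; not part of causally.
    """
    rng = random.Random(seed)
    # Weights are drawn once, so every sample goes through the same function.
    w_in = [rng.gauss(0.0, 1.0) for _ in range(hidden_dim)]
    w_out = [rng.gauss(0.0, 1.0) for _ in range(hidden_dim)]

    def transform(z):
        # Sigmoid hidden layer without bias, followed by a linear readout.
        hidden = [1.0 / (1.0 + math.exp(-w * z)) for w in w_in]
        return sum(v * h for v, h in zip(w_out, hidden))

    return [transform(rng.gauss(0.0, 1.0)) for _ in range(num_samples)]


samples = sample_transformed_noise(1000, hidden_dim=100, seed=0)
```

The resulting distribution is generally non-Gaussian, and changing the seed changes the random function and therefore the shape of the noise distribution.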
Structural causal models
causally supports the generation of structural causal models with linear and nonlinear mechanisms, and provides classes for generating data according to the following common SCMs:
The LinearModel, a causal model with linear mechanisms and additive noise terms:
\[X_i := \sum_{k \in \operatorname{PA}_i} w_k X_k + N_i\]
where \(\operatorname{PA}_i\) denotes the set of parents of the node \(X_i\), and \(N_i\) the exogenous random variable for \(X_i\).
The AdditiveNoiseModel, a causal model with nonlinear mechanisms and additive noise terms:
\[X_i := f_i(\operatorname{PA}_i) + N_i\]
where \(f_i\) is the nonlinear causal mechanism.
The PostNonlinearModel, a causal model with an invertible function applied to the output of a nonlinear additive noise model structural equation:
\[X_i := g_i(f_i(\operatorname{PA}_i) + N_i)\]
where \(g_i\) is an invertible function.
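The three structural equations above differ only in the mechanism and in whether an invertible function is applied afterwards. The following sketch samples from all three by visiting nodes in topological order, using plain Python. Everything in it is an assumption for illustration (the helper sample_scm, the fixed three-node graph, the shared mechanism for every node, standard Normal noise, and the cube as the invertible function); it is not how causally implements its models.

```python
import math
import random


def sample_scm(parents, mechanism, link=None, num_samples=1000, seed=None):
    """Ancestral sampling of X_i := link(mechanism(PA_i) + N_i).

    parents[i] lists the parents of node i. Nodes are assumed to be indexed
    in topological order, so every parent index is smaller than its child's.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(num_samples):
        x = []
        for pa in parents:
            noise = rng.gauss(0.0, 1.0)  # exogenous term N_i
            value = mechanism([x[k] for k in pa]) + noise
            x.append(link(value) if link is not None else value)
        data.append(x)
    return data


def linear_mechanism(pa_values):
    # Linear combination of the parents with all weights w_k = 0.5.
    return sum(0.5 * v for v in pa_values)


def nonlinear_mechanism(pa_values):
    # An arbitrary nonlinear function f_i of the parents.
    return math.sin(sum(pa_values))


def cube(value):
    # An invertible function g_i for the post-nonlinear model.
    return value ** 3


# Graph 0 -> 1, 0 -> 2, 1 -> 2, written as parent lists.
parents = [[], [0], [0, 1]]

linear_data = sample_scm(parents, linear_mechanism, seed=0)
anm_data = sample_scm(parents, nonlinear_mechanism, seed=0)
pnl_data = sample_scm(parents, nonlinear_mechanism, link=cube, seed=0)
```

With the same seed, the root node of the post-nonlinear sample is exactly the cube of the root node of the additive noise sample, which shows that the only difference between the two models is the function applied after the noise is added.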
In order to generate data from a structural causal model, we need instances of:
GraphGenerator, e.g. ErdosRenyi, specifying the parameters for sampling the random graph of the model.
Distribution, e.g. MLPNoise, specifying the parameters of the distribution of the noise terms.
PredictionModel, e.g. NeuralNetMechanism, specifying the class of causal mechanisms of the SCM.
Then, we can define a structural causal model, for example an AdditiveNoiseModel.
import causally.scm.scm as scm
import causally.graph.random_graph as rg
import causally.scm.noise as noise
import causally.scm.causal_mechanism as cm
# Erdos-Renyi graph generator
graph_generator = rg.ErdosRenyi(num_nodes=10, expected_degree=1)
# Generator of the noise terms
noise_generator = noise.MLPNoise()
# Nonlinear causal mechanisms (parametrized with a random neural network)
causal_mechanism = cm.NeuralNetMechanism()
# Generate the data
model = scm.AdditiveNoiseModel(
    num_samples=1000,
    graph_generator=graph_generator,
    noise_generator=noise_generator,
    causal_mechanism=causal_mechanism,
    seed=42
)
dataset, groundtruth = model.sample()
Challenging assumptions
The key feature of causally is its flexibility in specifying the assumptions of the structural causal model.
In particular, it can generate data that violate some of the most common assumptions of causal discovery algorithms, such as faithfulness of the distribution or the absence of latent confounders.
To specify your modelling assumptions, instantiate an SCMContext object, which lets you specify and parametrize them. The context is then passed as an argument to the SCM class constructor. As simple as that!
import causally.scm.scm as scm
import causally.graph.random_graph as rg
import causally.scm.noise as noise
import causally.scm.causal_mechanism as cm
from causally.scm.context import SCMContext
# Erdos-Renyi graph generator
graph_generator = rg.ErdosRenyi(num_nodes=10, expected_degree=1)
# Generator of the noise terms
noise_generator = noise.MLPNoise()
# Nonlinear causal mechanisms (parametrized with a random neural network)
causal_mechanism = cm.NeuralNetMechanism()
# Context for the assumptions
context = SCMContext()
# Make assumption: confounded model
context.confounded_model(p_confounder=0.1)
# Make assumption: unfaithful model
context.unfaithful_model(p_unfaithful=0.5)
# Generate the data
model = scm.AdditiveNoiseModel(
    num_samples=1000,
    graph_generator=graph_generator,
    noise_generator=noise_generator,
    causal_mechanism=causal_mechanism,
    scm_context=context,
    seed=42
)
# Sample from the model
dataset, groundtruth = model.sample()
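To see what a faithfulness violation looks like in data, consider a textbook linear example in which the direct effect of X on Z exactly cancels the indirect effect through Y. The sketch below is plain Python and independent of causally; the coefficients, variable names, and cancellation construction are assumptions chosen for illustration, and the way unfaithful_model produces unfaithful distributions may be different.

```python
import random


def covariance(a, b):
    """Unbiased sample covariance of two equally long sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


rng = random.Random(0)
num_samples = 20000

a, b = 1.0, 2.0  # coefficients of X -> Y and Y -> Z
c = -a * b       # direct effect X -> Z, chosen to cancel the path X -> Y -> Z

x = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
y = [a * xi + rng.gauss(0.0, 1.0) for xi in x]
z = [b * yi + c * xi + rng.gauss(0.0, 1.0) for xi, yi in zip(x, y)]

cov_xy = covariance(x, y)  # close to a: X and Y are clearly dependent
cov_xz = covariance(x, z)  # close to 0 although X -> Z is an edge in the graph
```

Substituting Y into the equation for Z gives Z = (ab + c) X + b N_Y + N_Z, so with c = -ab the dependence on X vanishes. A method that relies on faithfulness would read the missing correlation as a missing edge, which is why such datasets are useful for stress-testing causal discovery algorithms.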