Overview
"Causality is not assumed, inferred, or interpreted; it is survived or rejected under systematic perturbation tests."
COREX is a deterministic, graph-free, model-agnostic computational framework that treats causality as an empirically testable robustness property rather than an assumed structural characteristic. The framework implements a four-axis evaluation pipeline (Statistical Stability, Representation Invariance, Intervention Consistency, and Domain Robustness) and fuses their outputs through a weighted scoring function to produce a calibrated causal classification.
Contemporary machine learning systems routinely exploit shortcut correlations embedded in training distributions: associations that collapse the moment the data distribution shifts, feature encodings change, or interventions are applied. COREX provides a principled audit pipeline to classify any observed X → Y relationship as CAUSAL, SPURIOUS, or REPRESENTATION ARTIFACT.
4-Module Architecture
Module 01: Statistical Stability (S)
Tests whether P(Y|X) remains invariant across independently drawn subpopulations of the data. Performs random stratified partitioning into k folds, estimates conditional distributions using kernel density estimation, and computes pairwise KL divergence across fold estimates.
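The partition-and-compare procedure for this module can be sketched roughly as follows. This is a minimal illustration, not the COREX implementation: a plain random split stands in for stratified partitioning, a smoothed 2-D histogram on a shared quantile grid stands in for kernel density estimation, and the function names, the bin count, the smoothing constant, and the final clipping to [0, 1] are all assumptions.

```python
import numpy as np

def conditional_table(x, y, x_edges, y_edges, smoothing=0.5):
    """Histogram estimate of P(Y|X): one row per X bin, one column per Y bin."""
    counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])
    counts = counts + smoothing  # additive smoothing keeps every KL term finite
    return counts / counts.sum(axis=1, keepdims=True)

def statistical_stability(x, y, k=5, n_bins=4, seed=0):
    """1 - mean pairwise KL between per-fold estimates of P(Y|X), clipped to [0, 1]."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), k)  # random (not stratified) folds
    qs = np.linspace(0.0, 1.0, n_bins + 1)
    x_edges, y_edges = np.quantile(x, qs), np.quantile(y, qs)  # grid shared by all folds
    tables = [conditional_table(x[f], y[f], x_edges, y_edges) for f in folds]
    # KL is computed row by row (per X bin) and averaged with equal weight per bin.
    kls = [
        np.sum(tables[i] * np.log(tables[i] / tables[j]), axis=1).mean()
        for i in range(k) for j in range(k) if i != j
    ]
    return float(np.clip(1.0 - np.mean(kls), 0.0, 1.0))

# Synthetic data in which Y depends on X in the same way everywhere.
rng = np.random.default_rng(42)
x = rng.normal(size=2000)
y = 2.0 * x + rng.normal(scale=0.5, size=2000)
s_score = statistical_stability(x, y)
```

On data like this, where the same mechanism generates every fold, the per-fold tables differ only by sampling noise, so the score should sit close to 1.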
S = 1 - mean_{i≠j} KL[P(Y|X, D_i) ‖ P(Y|X, D_j)]
Module 02: Representation Invariance (R)
Evaluates whether the observed relationship persists when the feature representation of X is subjected to a structured family of transformations: linear projections, nonlinear embeddings, Gaussian noise injection, feature dropout, and PCA compression.
R = 1 - (1/|Φ|) Σ_φ ||P(Y|X) - P(Y|φ(X))||
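A rough sketch of how R could be computed for a one-dimensional feature is given here. It is an illustration under stated assumptions rather than the COREX code: P(Y|X) is approximated by a smoothed histogram over quantile bins, the norm is taken to be L1 averaged over X bins, the result is clipped to [0, 1], and only three of the listed transformation types are shown (a linear map, a nonlinear map, and Gaussian noise injection), since feature dropout and PCA compression need multivariate X. All function names are hypothetical.

```python
import numpy as np

def conditional_table(x, y, n_bins=4, smoothing=0.5):
    """Histogram estimate of P(Y|X): rows are X quantile bins, columns are Y quantile bins."""
    qs = np.linspace(0.0, 1.0, n_bins + 1)
    counts, _, _ = np.histogram2d(x, y, bins=[np.quantile(x, qs), np.quantile(y, qs)])
    counts = counts + smoothing
    return counts / counts.sum(axis=1, keepdims=True)

def representation_invariance(x, y, transforms, n_bins=4):
    """1 - mean L1 distance between P(Y|X) and P(Y|phi(X)) over the transform family."""
    base = conditional_table(x, y, n_bins)
    dists = []
    for phi in transforms:
        shifted = conditional_table(phi(x), y, n_bins)
        dists.append(np.abs(base - shifted).sum(axis=1).mean())  # L1 per X bin, then mean
    return float(np.clip(1.0 - np.mean(dists), 0.0, 1.0))

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = 2.0 * x + rng.normal(scale=0.5, size=2000)

transforms = [
    lambda v: 3.0 * v + 1.0,                            # linear map
    np.tanh,                                            # nonlinear, monotone map
    lambda v: v + rng.normal(scale=0.1, size=v.shape),  # Gaussian noise injection
]
r_score = representation_invariance(x, y, transforms)

# Replacing X with pure noise destroys the relationship and should lower the score.
r_destroyed = representation_invariance(x, y, [lambda v: rng.normal(size=v.shape)])
```

Because the bins are quantile-based, monotone re-encodings of X leave the table unchanged, so in this sketch only the noise transform contributes a non-zero distance.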
Module 03: Intervention Consistency (I)
Simulates causal interventions by applying controlled perturbations to X and observing the consistency of the downstream effect on Y. Uses propensity-score matched observational comparisons and synthetic counterfactual generation.
I = Consistency(do(X=x₁)→Y₁, do(X=x₂)→Y₂)
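The text does not define the Consistency function, so the following is only one possible reading, sketched under heavy assumptions. A linear surrogate model fitted on bootstrap resamples plays the role of the synthetic counterfactual generator, the contrast between the predicted outcomes under do(X=x₁) and do(X=x₂) is the effect of interest, and consistency is taken as one minus the relative spread of that contrast across resamples. Propensity-score matching is not shown, and the function name and its parameters are hypothetical.

```python
import numpy as np

def intervention_consistency(x, y, x1, x2, n_boot=50, seed=0):
    """Stability of the simulated contrast Y(do(X=x2)) - Y(do(X=x1)) across bootstrap fits."""
    rng = np.random.default_rng(seed)
    effects = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(x), size=len(x))         # bootstrap resample
        slope, intercept = np.polyfit(x[idx], y[idx], deg=1)
        y1 = slope * x1 + intercept                        # surrogate outcome under do(X=x1)
        y2 = slope * x2 + intercept                        # surrogate outcome under do(X=x2)
        effects.append(y2 - y1)
    effects = np.asarray(effects)
    spread = effects.std() / (abs(effects.mean()) + 1e-12)  # relative spread of the effect
    return float(np.clip(1.0 - spread, 0.0, 1.0))

rng = np.random.default_rng(1)
x = rng.normal(size=1000)
y_causal = 2.0 * x + rng.normal(scale=0.5, size=1000)  # X drives Y
y_noise = rng.normal(size=1000)                        # Y unrelated to X

i_causal = intervention_consistency(x, y_causal, x1=0.0, x2=1.0)
i_noise = intervention_consistency(x, y_noise, x1=0.0, x2=1.0)
```

When X genuinely drives Y, the estimated contrast barely moves between resamples and the score approaches 1. When Y is unrelated to X, the contrast fluctuates around zero and the score drops.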
Module 04: Domain Robustness (D)
Evaluates whether the predictive relationship generalizes across environments with distinct data-generating distributions. Constructs pseudo-environments through clustering in covariate space and assesses stability via coefficient of variation.
D = 1 - CV(P(Y|X, e)) over e ∈ E
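As a hedged illustration of the pseudo-environment idea, the sketch below splits the data into quantile bands of a single covariate `z` (a simple stand-in for clustering in covariate space), summarises P(Y|X, e) in each band by the slope of a linear fit, and takes one minus the coefficient of variation of those slopes. The slope summary, the banding, the clipping to [0, 1], and every name in the block are assumptions made for the example.

```python
import numpy as np

def domain_robustness(x, y, z, n_envs=3):
    """1 - CV of the per-environment X->Y slope, clipped to [0, 1]."""
    edges = np.quantile(z, np.linspace(0.0, 1.0, n_envs + 1))
    # Assign each sample to a quantile band of z; the maximum falls in the last band.
    env = np.clip(np.searchsorted(edges, z, side="right") - 1, 0, n_envs - 1)
    slopes = np.array(
        [np.polyfit(x[env == e], y[env == e], deg=1)[0] for e in range(n_envs)]
    )
    cv = slopes.std() / (abs(slopes.mean()) + 1e-12)  # coefficient of variation
    return float(np.clip(1.0 - cv, 0.0, 1.0))

rng = np.random.default_rng(7)
n = 1500
z = rng.normal(size=n)  # covariate that defines the pseudo-environments
x = rng.normal(size=n)
y_stable = 2.0 * x + rng.normal(scale=0.5, size=n)  # same effect in every environment
y_shifting = z * x + rng.normal(scale=0.5, size=n)  # effect changes sign with z

d_stable = domain_robustness(x, y_stable, z)
d_shifting = domain_robustness(x, y_shifting, z)
```

The stable relationship yields nearly identical slopes in every band, while the shifting one produces slopes of opposite sign, which drives the coefficient of variation up and the score down.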
Core Equations
COREX Scoring Function
COREX = w₁·S + w₂·R + w₃·I + w₄·D
| Weight | Value | Module |
|---|---|---|
| w₁ | 0.25 | Statistical Stability |
| w₂ | 0.25 | Representation Invariance |
| w₃ | 0.30 | Intervention Consistency (highest) |
| w₄ | 0.20 | Domain Robustness |
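The weighted sum can be checked by hand with a few lines of Python. The weights come from the table above and the module scores are the illustrative breakdown values from the API example in this README. The function name `corex_score` is hypothetical and is not claimed to match the package's internals.

```python
# Default weights from the table above, keyed by module symbol.
WEIGHTS = {"S": 0.25, "R": 0.25, "I": 0.30, "D": 0.20}

def corex_score(breakdown, weights=WEIGHTS):
    """Weighted sum of the four module scores."""
    return sum(weights[m] * breakdown[m] for m in weights)

breakdown = {"S": 0.91, "R": 0.88, "I": 0.85, "D": 0.90}
score = corex_score(breakdown)
# 0.25*0.91 + 0.25*0.88 + 0.30*0.85 + 0.20*0.90 = 0.2275 + 0.22 + 0.255 + 0.18
print(round(score, 4))  # 0.8825
```

Because the default weights sum to 1, the fused score stays in [0, 1] whenever each module score does.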
Decision Thresholds
| Label | COREX Range | Interpretation |
|---|---|---|
| 🟢 CAUSAL | ≥ 0.80 | All four modules stable; intervention consistent |
| 🟡 SPURIOUS | 0.50 – 0.79 | Domain shift OR intervention instability |
| 🔴 ARTIFACT | < 0.50 | Representation invariance fails |
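The thresholds translate into a small decision function, sketched here. The label strings follow the ones shown in the API example. Treating scores between 0.79 and 0.80 as SPURIOUS is an assumption, since the table does not say where they fall, and `classify` is a hypothetical name.

```python
def classify(score):
    """Map a COREX score in [0, 1] to a label using the thresholds in the table."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("COREX score is expected to lie in [0, 1]")
    if score >= 0.80:
        return "CAUSAL"
    if score >= 0.50:  # assumption: everything in [0.50, 0.80) counts as SPURIOUS
        return "SPURIOUS"
    return "REPRESENTATION_ARTIFACT"

print(classify(0.8825))  # CAUSAL
print(classify(0.62))    # SPURIOUS
print(classify(0.31))    # REPRESENTATION_ARTIFACT
```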
Installation

    pip install corex

    # From source
    git clone https://github.com/gitdeeper12/COREX.git
    cd COREX
    pip install -e .

Core Dependencies: numpy, scipy
API Reference

    from corex import CausalEvaluator

    # Initialize evaluator
    evaluator = CausalEvaluator()

    # Evaluate relationship between X and Y
    result = evaluator.evaluate(X, y)

    # Access results
    print(result.label)        # "CAUSAL" | "SPURIOUS" | "REPRESENTATION_ARTIFACT"
    print(result.corex_score)  # float in [0, 1]
    print(result.breakdown)    # {"S": 0.91, "R": 0.88, "I": 0.85, "D": 0.90}

Parameters
| Parameter | Description | Default |
|---|---|---|
| weights | Custom module weights | {'statistical':0.25, 'representation':0.25, 'intervention':0.30, 'domain':0.20} |
| meta_scorer | Optional learnable meta-layer | None |
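When supplying custom weights, it may help to merge them with the defaults and renormalise so that the fused score stays in [0, 1]. The helper below is a standalone sketch of that idea. It uses the key names from the default shown in the table, but `normalize_weights` is hypothetical, and whether COREX itself renormalises or validates weights is not stated here.

```python
DEFAULT_WEIGHTS = {
    "statistical": 0.25,
    "representation": 0.25,
    "intervention": 0.30,
    "domain": 0.20,
}

def normalize_weights(custom):
    """Overlay custom weights on the defaults and rescale them to sum to 1."""
    unknown = set(custom) - set(DEFAULT_WEIGHTS)
    if unknown:
        raise KeyError(f"unknown module(s): {sorted(unknown)}")
    merged = {**DEFAULT_WEIGHTS, **custom}
    if any(w < 0 for w in merged.values()):
        raise ValueError("weights must be non-negative")
    total = sum(merged.values())
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return {name: w / total for name, w in merged.items()}

# Emphasise the intervention module; the other weights shrink proportionally.
weights = normalize_weights({"intervention": 0.50})
```

The resulting dictionary has the same shape as the default, so it could be passed wherever a `weights` mapping is expected.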
Core Modules
| Module | Path | Description |
|---|---|---|
| statistical.py | corex/modules/statistical.py | Statistical Stability Module (S) |
| representation.py | corex/modules/representation.py | Representation Invariance Module (R) |
| domain.py | corex/modules/domain.py | Domain Robustness Module (D) |
| intervention.py | corex/modules/intervention.py | Intervention Consistency Engine (I) |
| score.py | corex/score.py | COREX scoring function and thresholds |
| pipeline.py | corex/pipeline.py | Main evaluation pipeline |
Validation Summary
| Method | Accuracy | AUROC | FPR |
|---|---|---|---|
| COREX v1.0.0 | 91.4% | 0.963 | 3.2% |
| IRM baseline | 76.0% | 0.871 | 23.0% |
| Conditional Independence | 69.0% | 0.741 | 31.0% |
Citation
"Causality is not assumed; it is survived."