Causal Inference Meets Deep Learning: A Comprehensive Survey


5. DL with Causal Inference

Causal discovery

= Inferring causal relationships from data


DL + CI

Traditional DL

  • Relies on large IID datasets
  • Focuses on correlation rather than causation
    • Correlation-based models: Unstable and sensitive to small data changes



Causal learning

  • Improves generalization, stability, and interpretability in DL


Causal Inference + DL



(1) Adversarial Learning

  • Goal: Maintaining model stability/robustness
    • In the presence of malicious environments
  • How? By adding perturbations to real samples
    • NNs will be misled in their judgments by these perturbations!
  • Divided into
    • (1) Adversarial attacks
    • (2) Adversarial defenses


a) Adversarial attack

Aims to generate adversarial samples more efficiently

  • By manipulating the input samples


Adversarial attack + Causal Inference

  • Testing for conditional independence in a multi-dimensional dataset

    \(\rightarrow\) Can be challenging!!

  • Adversarial attacks perturb the IID distn

    • [Humans] May not perceive these changes!
    • [Model] The modified samples act as an intervention that misleads the model! (see the sketch below)
  • Causality can mitigate the impact of adversarial attacks on DL
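
A minimal FGSM-style sketch of the attack idea above: a small, human-imperceptible perturbation along the loss gradient can flip a classifier's prediction. The toy model, input shapes, and \(\epsilon\) below are illustrative assumptions, not any cited paper's setup.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, eps=0.03):
    """Return x + eps * sign(grad_x loss): a worst-case-direction perturbation."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step in the direction that increases the loss, then clamp to a valid pixel range.
    return (x_adv + eps * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

# Toy usage with a hypothetical 10-class linear "image classifier".
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)                 # 4 random "images" in [0, 1]
y = torch.randint(0, 10, (4,))
x_adv = fgsm_attack(model, x, y)
print((model(x).argmax(1) != model(x_adv).argmax(1)).tolist())  # True where the prediction flipped
```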


Ren et al. [161]

  • Employ Transformer as a tool to construct a causal model
    • Model = Explains the generation and performance of adversarial samples
  • Propose a simple and effective strategy to defend against adversarial attacks


Cai et al. [162]

  • Novel adversarial learning framework based on the causal generation method
  • Generates counterfactual adversarial examples …
    • By altering the distn through interventions on variables


b) Adversarial defense

Aims to offer more effective protection against adversarial samples


Zhang et al. [163]

  • Cause of adversarial vulnerability in DNNs = Model’s reliance on “false correlations”

  • Construct causal graphs

    • To model the generation process of adversarial samples
  • Propose an adversarial distribution alignment method

    • To formalize the intuition behind adversarial attacks.
  • Eliminate the differences between natural & adversarial distns

    \(\rightarrow\) Robustness of the model is improved!
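
A hedged sketch of the distribution-alignment intuition in Zhang et al. [163]: penalize the gap between the model's features on natural and adversarial inputs. The feature extractor, the mean-difference penalty, and the stand-in perturbation below are simplifying assumptions; the paper's alignment objective is more involved.

```python
import torch
import torch.nn.functional as F

# Toy feature extractor + classification head (stand-ins for a real DNN).
features = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 64))
head = torch.nn.Linear(64, 10)

def alignment_loss(x_nat, x_adv):
    """Penalise the gap between natural and adversarial feature distributions
    (here just the squared difference of their feature means)."""
    return (features(x_nat).mean(dim=0) - features(x_adv).mean(dim=0)).pow(2).sum()

x_nat = torch.rand(8, 3, 32, 32)
x_adv = (x_nat + 0.03 * torch.randn_like(x_nat).sign()).clamp(0, 1)   # stand-in perturbation
y = torch.randint(0, 10, (8,))

# Training objective: usual classification loss + alignment term.
loss = F.cross_entropy(head(features(x_nat)), y) + alignment_loss(x_nat, x_adv)
```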


c) GAN

CausalGAN = 2-stage causal GAN

  • Kocaoglu et al. [166]

  • Stage 1) Trains a causal implicit GAN on binary labels
  • Stage 2) New conditional GAN to help the generator sample from the correct intervention distribution.
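
A rough, assumption-laden skeleton of the two-stage idea (not the authors' code): Stage 1 samples binary labels by ancestral sampling over a toy causal graph \(A \rightarrow B\), so an intervention \(do(A=1)\) simply fixes \(A\) before sampling its descendants; Stage 2 is an ordinary conditional generator from (noise, labels) to images.

```python
import torch
import torch.nn as nn

class LabelGenerator(nn.Module):
    """Stage 1: ancestral sampler over an assumed toy causal graph A -> B."""
    def __init__(self):
        super().__init__()
        self.f_a = nn.Linear(4, 1)        # A := Bernoulli(sigmoid(f_a(noise)))
        self.f_b = nn.Linear(4 + 1, 1)    # B := Bernoulli(sigmoid(f_b(noise, A)))

    def forward(self, n, do_a=None):
        z_a, z_b = torch.randn(n, 4), torch.randn(n, 4)
        a = torch.bernoulli(torch.sigmoid(self.f_a(z_a)))
        if do_a is not None:                              # intervention do(A = do_a)
            a = torch.full_like(a, float(do_a))
        b = torch.bernoulli(torch.sigmoid(self.f_b(torch.cat([z_b, a], dim=1))))
        return torch.cat([a, b], dim=1)                   # sampled label vector (A, B)

class ConditionalImageGenerator(nn.Module):
    """Stage 2: maps (noise, labels) to a flat 'image'; trained as a conditional GAN."""
    def __init__(self, z_dim=16, n_labels=2, img_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim + n_labels, 128), nn.ReLU(),
                                 nn.Linear(128, img_dim), nn.Tanh())

    def forward(self, z, labels):
        return self.net(torch.cat([z, labels], dim=1))

labels = LabelGenerator()(8, do_a=1)                 # labels from the interventional distn
images = ConditionalImageGenerator()(torch.randn(8, 16), labels)
```

The GAN training losses for both stages are omitted; the point is only that interventional label sampling and conditional image generation are decoupled.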



Scalable generative causal adversarial network (CAN)

  • Moraffah et al. [167]

  • Limitation of CausalGAN

    = Causal graph constructed in CausalGAN relies on known labels!

  • CAN: Learns the causal relations from the data itself!

  • Structured into 2 parts:

    • (1) Label generating network (LGN)
      • Learns causal relationships from data and generates samples
    • (2) Conditional image generating network (CIGN)
      • Receives labels and generates the corresponding images



Causal generative neural networks (CGNNs)

  • Goudet et al. [168]

  • Learn data distributions with generators organized according to the causal structure


Causal-TGAN

  • Wen et al. [169]

  • Goal: Generate synthetic tabular data using the tabular data’s causal information

  • Multiple causal processes are captured by building an SCM
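
A toy illustration of that idea: encode the table's causal structure as an SCM and sample rows by ancestral sampling. The variables, structural equations, and noise terms below are made up; in Causal-TGAN the causal mechanisms are learned from data rather than hand-written.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_rows(n):
    """Ancestral sampling from a hand-written toy SCM: age -> education -> income."""
    age = rng.normal(40, 12, n)                                   # exogenous
    education = 0.10 * age + rng.normal(0, 1, n)                  # age -> education
    income = 0.05 * age + 2.0 * education + rng.normal(0, 1, n)   # age, education -> income
    return np.column_stack([age, education, income])

synthetic = sample_rows(1000)   # synthetic table consistent with the causal graph
print(synthetic.shape)          # (1000, 3)
```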


SCIGAN: Hierarchical discriminator

  • Bica et al. [170]

  • Goal: Estimate counterfactual outcomes under continuous-valued interventions

  • How? Significantly modified GAN model

    • Generate counterfactual outcomes

      \(\rightarrow\) Used to learn an inference model ( with standard supervised methods )



(2) Contrastive Learning (CL)

a) Supervised contrastive learning

\(C^2L\): Causal-based CL

  • Choi et al. [171]

  • To improve the robustness of “text categorization” models
  • Candidate tokens are selected based on attribution scores
  • Causality of these candidate tokens
    • Verified by evaluating their individualized treatment effect (ITE)
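
A hedged sketch of the token-level idea: score a candidate token by how much the classifier's prediction shifts when that token is replaced, a crude proxy for its ITE. The toy classifier, mask token, and top-k selection below are stand-ins, not the \(C^2L\) implementation.

```python
import torch
import torch.nn as nn

vocab_size, mask_id = 1000, 0
# Toy bag-of-embeddings text classifier (a stand-in for the real model).
model = nn.Sequential(nn.EmbeddingBag(vocab_size, 32), nn.Linear(32, 2))

def token_ite(tokens, position):
    """Prediction shift when the token at `position` is replaced by a mask token."""
    original = torch.softmax(model(tokens.unsqueeze(0)), dim=-1)
    counterfactual_tokens = tokens.clone()
    counterfactual_tokens[position] = mask_id               # counterfactual "treatment"
    counterfactual = torch.softmax(model(counterfactual_tokens.unsqueeze(0)), dim=-1)
    return (original - counterfactual).abs().sum().item()

tokens = torch.randint(1, vocab_size, (12,))                 # one tokenised sentence
scores = [token_ite(tokens, i) for i in range(len(tokens))]
causal_candidates = sorted(range(len(tokens)), key=lambda i: -scores[i])[:3]
```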



Proactive Pseudo-Intervention (PPI)

  • Wang et al. [172]

  • Causal intervention-based CL (for visual problems)

  • Pseudo-interventions are synthesized from observational data using CL

    \(\rightarrow\) Reduces the model’s dependence on image features that are strongly correlated with the target label but not causally related

  • Result: Addresses the issue of DNNs over-relying on non-causal visual information in image classification (see the sketch below)
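
A loose sketch of a pseudo-intervention in this spirit: use an input-gradient saliency map to suppress the regions the model currently relies on, then ask the prediction to change on the intervened image. The model, masking rule, and KL-based contrast term are illustrative assumptions, not the PPI objective.

```python
import torch
import torch.nn.functional as F

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.rand(1, 3, 32, 32, requires_grad=True)
y = torch.tensor([3])

# 1. Saliency map from input gradients.
F.cross_entropy(model(x), y).backward()
saliency = x.grad.abs().sum(dim=1, keepdim=True)             # (1, 1, 32, 32)

# 2. Pseudo-intervention: zero out the most salient 10% of pixels.
mask = (saliency < saliency.flatten().quantile(0.9)).float()
x_intervened = x.detach() * mask                              # presumed causal content suppressed

# 3. Contrastive-style term: the prediction on the intervened image should move
#    away from the original one (measured here with a KL divergence).
p_orig = F.log_softmax(model(x.detach()), dim=-1)
p_int = F.softmax(model(x_intervened), dim=-1)
contrast = F.kl_div(p_orig, p_int, reduction="batchmean")
```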



b) Self-supervised contrastive learning

Graph contrastive invariant learning (GCIL)

  • Mo et al. [173]

  • Graph generation based on the SCM

  • Limitation of previous works: Traditional graph CL is affected by non-causal information

  • GCIL

    • Uses an SCM to describe the graph generation process

    • Original graph \(G\): Divided into ..

      • (1) A set of causal variables \(C\)
      • (2) A set of non-causal variables \(S\)
    • Intervene causally on the non-causal variables \(S\)

      \(\rightarrow\) To ensure that the variables satisfy the following equation:

      \(P^{do(S=s_i)}(Y \mid C) = P^{do(S=s_j)}(Y \mid C)\).

  • Summary: Generates causal views to model interventions on non-causal factors from a graph perspective
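
A rough sketch of the invariance requirement above: if node features could be split into causal dims \(C\) and non-causal dims \(S\), then resampling \(S\) (an intervention \(do(S=s)\)) should leave the representation used for \(Y\) unchanged. The fixed causal/non-causal split and the linear encoder are assumptions for illustration; GCIL infers the causal part rather than receiving it.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_nodes, n_feats, causal_dims = 50, 16, 8
encoder = nn.Linear(n_feats, 32)                 # stand-in for a GNN encoder
x = torch.rand(n_nodes, n_feats)                 # node features

def intervene_non_causal(x):
    """do(S = s): resample the (assumed) non-causal feature dimensions."""
    x_view = x.clone()
    x_view[:, causal_dims:] = torch.rand(n_nodes, n_feats - causal_dims)
    return x_view

z1 = F.normalize(encoder(intervene_non_causal(x)), dim=-1)   # view under do(S = s_i)
z2 = F.normalize(encoder(intervene_non_causal(x)), dim=-1)   # view under do(S = s_j)
invariance_loss = (1 - (z1 * z2).sum(dim=-1)).mean()         # the two views should agree
```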


(3) Diffusion Models

(4) Reinforcement Learning

(5) Recommendation Algorithm
