Causal Inference Meets Deep Learning: A Comprehensive Survey
5. DL with Causal Inference
Causal discovery
= Inferring causal relationships from data
DL + CI
Traditional DL
- Relies on large IID datasets
- Focuses on correlation rather than causation
- Correlation-based models: Unstable and sensitive to small data changes
Causal learning
- Improves generalization, stability, and interpretability in DL
Causal Inference + DL
(1) Adversarial Learning
- Goal: Maintaining model stability/robustness
- In the presence of malicious environments
- How? By adding perturbations to real samples
- NNs are misled in their judgments by these perturbations! (see the sketch after this list)
- Divided into
- (1) Adversarial attacks
- (2) Adversarial defenses
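A minimal sketch of how such perturbations are added (FGSM-style, in PyTorch); the model, labels, and step size `eps` are illustrative assumptions, not from the survey:

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps=0.03):
    """Add a small, human-imperceptible perturbation that misleads the model.

    `model`, `x`, `y`, and `eps` are placeholders for illustration.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step in the direction that increases the loss, bounded by eps.
    return (x_adv + eps * x_adv.grad.sign()).detach()
```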
a) Adversarial attack
Aims to generate adversarial samples more efficiently
- By manipulating the input samples
Adversarial attack + Causal Inference
- Testing for conditional independence in a multi-dimensional dataset
\(\rightarrow\) Can be challenging! (see the sketch below)
- Adversarial attacks shift samples away from the IID training distn
- [Humans] May not perceive these changes!
- [Model] Modified samples act as an intervention on the model!
- Causality can mitigate the impact of adversarial attacks on DL
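A hedged NumPy sketch of why even the simplest conditional-independence check is fragile: the linear-Gaussian partial-correlation test below is a simplification, and real multi-dimensional CI testing is much harder:

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z.

    x, y: (n,) arrays; z: (n, d) conditioning set. Linear-Gaussian assumption.
    """
    beta_x, *_ = np.linalg.lstsq(z, x, rcond=None)
    beta_y, *_ = np.linalg.lstsq(z, y, rcond=None)
    rx, ry = x - z @ beta_x, y - z @ beta_y
    # Near-zero partial correlation suggests x is independent of y given z
    # under the linear model.
    return np.corrcoef(rx, ry)[0, 1]
```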
Ren et al. [161]
- Employ a Transformer as a tool to construct a causal model
- Model = Explains the generation and performance of adversarial samples
- Propose a simple and effective strategy to defend against adversarial attacks
Cai et al. [162]
- Novel adversarial learning framework based on the causal generation method
- Generates counterfactual adversarial examples …
- By altering the distn through interventions on variables
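A hedged sketch of this idea (not Cai et al.'s exact method): hold the causal content of a sample fixed, intervene on a non-causal latent, and re-decode; the `decoder` and the causal/non-causal latent split are hypothetical:

```python
import torch

def counterfactual_adv_example(decoder, z_causal, z_noncausal):
    """do(noncausal = s'): resample the non-causal latent, keep causal content."""
    z_intervened = torch.randn_like(z_noncausal)   # intervened non-causal value
    return decoder(torch.cat([z_causal, z_intervened], dim=-1))
```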
b) Adversarial defense
Aims to offer more effective protection against adversarial samples
Zhang et al. [163]
- Cause of adversarial vulnerability in DNNs = Model’s reliance on “false correlations”
- Construct causal graphs
- To model the generation process of adversarial samples
- Propose an adversarial distribution alignment method
- To formalize the intuition behind adversarial attacks.
- Eliminate the differences between natural & adversarial distns
\(\rightarrow\) Robustness of the model is improved!
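A minimal sketch of "aligning" the two distributions; Zhang et al. formalize alignment differently, so the simple moment-matching penalty below is only an illustrative stand-in:

```python
import torch

def alignment_loss(feat_nat, feat_adv):
    """Penalize first/second-moment gaps between natural and adversarial features."""
    mean_gap = (feat_nat.mean(dim=0) - feat_adv.mean(dim=0)).pow(2).sum()
    var_gap = (feat_nat.var(dim=0) - feat_adv.var(dim=0)).pow(2).sum()
    return mean_gap + var_gap
```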
c) GAN
CausalGAN = 2-stage causal GAN
- Kocaoglu et al. [166]
- Stage 1) Trains a causal implicit GAN on binary labels
- Stage 2) New conditional GAN to help the generator sample from the correct intervention distribution.
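A sketch of the two-stage sampling flow described above; `label_gen`, `image_gen`, the label names, and the noise size are hypothetical stand-ins for the paper's networks:

```python
import torch

def sample(label_gen, image_gen, do=None):
    labels = label_gen()              # Stage 1: labels from the causal label GAN
    for name, value in (do or {}).items():
        labels[name] = value          # intervention: clamp chosen labels
    z = torch.randn(1, 100)           # image noise (size is an assumption)
    return image_gen(z, labels)       # Stage 2: conditional image generation

# e.g. sample(label_gen, image_gen, do={"Mustache": 1.0})
```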
Scalable generative causal adversarial network (CAN)
- Moraffah et al. [167]
- Limitation of CausalGAN = Its causal graph relies on known labels!
- CAN: Learns the causal relations from the data itself!
- Structured into 2 parts:
- (1) Label generating network (LGN)
- Learns causal relationships from data and generates samples
- (2) Conditional image generating network (CIGN)
- Receives labels and generates the corresponding images
Causal generative neural networks (CGNNs)
- Goudet et al. [168]
- Learn data distributions with causally constructed generators
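A minimal sketch of the CGNN idea: each variable is generated by a small network of its graph parents plus independent noise; the graph, layer sizes, and omitted training loss are assumptions:

```python
import torch
import torch.nn as nn

class TinyCGNN(nn.Module):
    """Each variable i is f_i(parents of i, noise); keys must be in causal order."""
    def __init__(self, parents, hidden=16):
        super().__init__()
        self.parents = parents  # e.g. {0: [], 1: [0], 2: [0, 1]}
        self.f = nn.ModuleDict({
            str(i): nn.Sequential(nn.Linear(len(p) + 1, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 1))
            for i, p in parents.items()
        })

    def forward(self, n):
        x = {}
        for i, pa in self.parents.items():          # insertion order = causal order
            inputs = [x[j] for j in pa] + [torch.randn(n, 1)]
            x[i] = self.f[str(i)](torch.cat(inputs, dim=1))
        return torch.cat(list(x.values()), dim=1)   # (n, num_variables) samples
```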
Causal-TGAN
- Wen et al. [169]
- Goal: Generate synthetic tabular data using the tabular data’s causal information
- Multiple causal processes are captured by building an SCM
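A toy SCM in the spirit of Causal-TGAN's use of causal information for tabular data; the variables and structural equations below are invented purely for illustration:

```python
import numpy as np

def sample_table(n, seed=0):
    rng = np.random.default_rng(seed)
    age = rng.normal(40, 10, n)                       # exogenous cause
    income = 1000 + 50 * age + rng.normal(0, 200, n)  # age -> income
    spending = 0.3 * income + rng.normal(0, 100, n)   # income -> spending
    return np.stack([age, income, spending], axis=1)  # synthetic tabular rows
```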
SCIGAN: Hierarchical discriminator
- Bica et al. [170]
- Goal: Estimate counterfactual outcomes at successive interventions
- How? A significantly modified GAN model
- Generate counterfactual outcomes
\(\rightarrow\) Used to learn an inference model (with standard supervised methods)
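A hedged sketch of that last step: use generated counterfactual outcomes as supervised targets for an inference network; `generator`, its signature, and the squared loss are illustrative assumptions:

```python
import torch

def fit_inference_model(generator, inference_net, x, optimizer, steps=1000):
    for _ in range(steps):
        t = torch.rand(x.size(0), 1)          # sampled intervention values
        with torch.no_grad():
            y_cf = generator(x, t)            # generated counterfactual outcomes
        loss = (inference_net(x, t) - y_cf).pow(2).mean()  # standard supervised fit
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```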
(2) Contrastive Learning (CL)
a) Supervised contrastive learning
\(C^2L\): Causal-based CL
- Choi et al. [171]
- To improve the robustness of “text categorization” models
- Candidate tokens are selected based on attribution scores
- Causality of these candidate tokens
- Verified by evaluating their individualized treatment effect (ITE)
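A minimal sketch of checking a candidate token's ITE: compare the model's prediction with the token present vs. masked out; the masking scheme and effect measure are simplifying assumptions, not C²L's exact estimator:

```python
import torch

def token_ite(model, token_ids, pos, mask_id):
    """Treatment = masking the token at `pos`; effect = change in prediction."""
    masked = token_ids.clone()
    masked[pos] = mask_id
    with torch.no_grad():
        p_with = model(token_ids.unsqueeze(0)).softmax(dim=-1)
        p_without = model(masked.unsqueeze(0)).softmax(dim=-1)
    return (p_with - p_without).abs().max()   # large value => token is influential
```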
Proactive Pseudo-Intervention (PPI)
- Wang et al. [172]
- Causal intervention-based CL (for visual problems)
- Pseudo-interventions are synthesized from observational data using CL
\(\rightarrow\) Reduces the model’s dependence on image features that are strongly correlated with the target label but not causally related
- Result: Addresses the issue of DNNs over-relying on non-causal visual information in image classification
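A hedged sketch of synthesizing a pseudo-intervention: mask the regions a saliency map highlights and treat the result as an intervened view for the contrastive objective; the saliency source and threshold are assumptions:

```python
import torch

def pseudo_intervention(x, saliency, thresh=0.5):
    """Zero out currently-salient pixels: a synthetic do-operation on the input.

    x: (B, C, H, W) images; saliency: (B, 1, H, W) in [0, 1] (assumed given).
    """
    return x * (saliency < thresh).float()   # intervened (contrastive) view
```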
b) Self-supervised contrastive learning
Graph contrastive invariant learning (GCIL)
- Mo et al. [173]
- Graph generation based on the SCM
- Limitation of previous works: traditional graph CL is affected by non-causal information
- GCIL
- Uses an SCM to describe the graph generation process
- Original graph \(G\): Divided into:
- (1) A set of causal variables \(C\)
- (2) A set of non-causal variables \(S\)
- Intervene causally on the non-causal variable \(S\)
\(\rightarrow\) To ensure that the variable satisfies the following equation:
\(P^{do(S=s_i)}(Y \mid C) = P^{do(S=s_j)}(Y \mid C)\)
- Summary: Generates causal views to model interventions on non-causal factors from a graph perspective
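A minimal sketch of enforcing the invariance equation above: predictions from two views that differ only in non-causal factors should agree; the KL form and the view construction are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def invariance_loss(logits_i, logits_j):
    """Match P(Y|C) across two views with different do(S=s) interventions."""
    log_p_i = F.log_softmax(logits_i, dim=-1)
    p_j = F.softmax(logits_j, dim=-1)
    return F.kl_div(log_p_i, p_j, reduction="batchmean")
```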