Dilated Recurrent Neural Networks (2017)
Contents
- Abstract
- Introduction
- Dilated RNN
- Dilated recurrent skip-connection
- Exponentially Increasing Dilation
0. Abstract
RNNs on long sequences… difficult!
[ 3 main challenges ]
- 1) complex dependencies
- 2) vanishing / exploding gradients
- 3) efficient parallelization
A simple yet effective RNN structure: DilatedRNN
- key : dilated recurrent skip connections
- advantages
- reduce # of parameters
- enhance training efficiency
- match SOTA
1. Introduction
attempts to overcome problems of RNNs
- LSTM, GRU, clockwork RNNs, phased LSTM, hierarchical multi-scale RNNs
Dilated CNNs
- the length of dependencies captured by a dilated CNN is limited by its kernel size, whereas an RNN’s autoregressive modeling can capture potentially infinitely long dependencies
\(\rightarrow\) introduce DilatedRNN
2. Dilated RNN
main ingredients of Dilated RNN
- 1) Dilated recurrent skip-connection
- 2) use of exponentially increasing dilation
(1) Dilated recurrent skip-connection
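In the paper, a layer with dilation \(s^{(l)}\) updates its cell from the state \(s^{(l)}\) steps back, \(c_{t}^{(l)}=f\left(x_{t}^{(l)}, c_{t-s^{(l)}}^{(l)}\right)\), using \(c_{t-s^{(l)}}^{(l)}\) in place of the usual \(c_{t-1}^{(l)}\); dropping the \(t-1\) dependency is what lets the \(s^{(l)}\) independent sub-chains be computed in parallel. Below is a minimal sketch of one such layer, assuming a vanilla tanh RNN cell (the paper allows any cell, e.g. LSTM or GRU); the function and weight names are illustrative, not from the authors’ code.

```python
import numpy as np

def dilated_rnn_layer(x, s, W_x, W_c, b):
    """One dilated recurrent layer: c_t = tanh(W_x x_t + W_c c_{t-s} + b).

    x : (T, d_in) input sequence, s : dilation (skip length) of this layer.
    States earlier than s steps back are taken to be zero.
    """
    T = x.shape[0]
    d_hid = b.shape[0]
    c = np.zeros((T, d_hid))
    for t in range(T):
        # skip connection: look back s steps instead of 1
        c_prev = c[t - s] if t - s >= 0 else np.zeros(d_hid)
        c[t] = np.tanh(x[t] @ W_x + c_prev @ W_c + b)
    return c
```

Because \(c_t\) only looks back \(s\) steps, the sequence decomposes into \(s\) independent sub-chains, which is where the parallelization benefit comes from.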
(2) Exponentially Increasing Dilation
- stack dilated recurrent layers
- (similar to WaveNet) dilation increases exponentially across layers
- \(s^{(l)}\) : dilation of the \(l\)-th layer
- \(s^{(l)}=M^{l-1}, l=1, \cdots, L\).
- ex) Figure 2 depicts an example with \(L=3\) and \(M=2\), i.e. dilations 1, 2, 4 (see the sketch below)
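A sketch of the exponentially increasing dilation schedule, stacking \(L=3\) layers with \(M=2\) (dilations 1, 2, 4, as in the Figure 2 example). It reuses the hypothetical `dilated_rnn_layer` from the sketch above; shapes and initialization are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_in, d_hid = 16, 8, 32
L, M = 3, 2

h = rng.standard_normal((T, d_in))            # toy input sequence
for l in range(1, L + 1):
    s = M ** (l - 1)                          # s^(l): dilation of the l-th layer -> 1, 2, 4
    W_x = 0.1 * rng.standard_normal((h.shape[1], d_hid))
    W_c = 0.1 * rng.standard_normal((d_hid, d_hid))
    b = np.zeros(d_hid)
    h = dilated_rnn_layer(h, s, W_x, W_c, b)  # each layer's output feeds the next
print(h.shape)                                # (16, 32)
```

Lower layers (small dilation) see fine temporal detail; upper layers (large dilation) see coarse, long-range structure.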
Benefits
- 1) makes different layers focus on different temporal resolutions
- 2) reduces the average length of paths between nodes at different timestamps
Generalized Dilated RNN
- the smallest dilation does not start at one, but at \(M^{l_{0}}\) (see the example below)
- \(s^{(l)}=M^{l-1+l_{0}},\quad l=1, \cdots, L,\ l_{0} \geq 0\).
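A small worked example of the dilation schedule as a function of \(l_{0}\) (illustrative code, not from the paper): \(l_{0}=0\) recovers \(s^{(l)}=M^{l-1}\), while \(l_{0}>0\) makes the smallest skip larger than one.

```python
def dilation_schedule(M, L, l0=0):
    # s^(l) = M^(l - 1 + l0) for l = 1, ..., L
    return [M ** (l - 1 + l0) for l in range(1, L + 1)]

print(dilation_schedule(M=2, L=3))         # [1, 2, 4]  (standard DilatedRNN)
print(dilation_schedule(M=2, L=3, l0=1))   # [2, 4, 8]  (generalized, l_0 = 1)
```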