Dilated Recurrent Neural Networks (2017)

Contents

  0. Abstract
  1. Introduction
  2. Dilated RNN
    1. Dilated recurrent skip-connection
    2. Exponentially Increasing Dilation


0. Abstract

Training an RNN on long sequences is difficult!

[ 3 main challenges ]

  • 1) complex dependencies
  • 2) vanishing / exploding gradients
  • 3) efficient parallelization


The paper proposes a simple yet effective RNN structure: the DilatedRNN

  • key : dilated recurrent skip connections
  • advantages
    • reduce # of parameters
    • enhance training efficiency
    • match SOTA


1. Introduction

Previous attempts to overcome these problems of RNNs:

  • LSTM, GRU, clockwork RNNs, phased LSTM, hierarchical multi-scale RNNs


Dilated CNNs

  • the length of dependencies captured by a dilated CNN is limited by its kernel size, whereas an RNN’s autoregressive modeling can capture potentially infinitely long dependencies

This paper introduces the DilatedRNN.


2. Dilated RNN

Main ingredients of the Dilated RNN:

  • 1) Dilated recurrent skip-connection
  • 2) use of exponentially increasing dilation


(1) Dilated recurrent skip-connection

  • instead of connecting the cell at time $t$ only to the previous step, the recurrent connection reaches back $s^{(l)}$ steps: $c_t^{(l)} = f\left(x_t^{(l)},\, c_{t-s^{(l)}}^{(l)}\right)$, where $s^{(l)}$ is the skip length (dilation) of layer $l$ (see the sketch below)
  • (figure 2: dilated recurrent skip-connection)
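
Below is a minimal NumPy sketch of this idea (my own toy illustration, not the paper's implementation: the paper plugs the dilated skip connection into standard cells such as vanilla RNN, LSTM, or GRU, and the function name `dilated_rnn_layer` and its arguments are hypothetical):

```python
import numpy as np

def dilated_rnn_layer(x, dilation, hidden_size, seed=0):
    """One toy recurrent layer with a dilated skip connection:
    h[t] is computed from h[t - dilation] instead of h[t - 1]."""
    rng = np.random.default_rng(seed)
    T, input_size = x.shape
    W_x = rng.normal(scale=0.1, size=(hidden_size, input_size))
    W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
    b = np.zeros(hidden_size)
    h = np.zeros((T, hidden_size))
    for t in range(T):
        # hidden state from `dilation` steps back (zeros before the sequence starts)
        h_prev = h[t - dilation] if t >= dilation else np.zeros(hidden_size)
        h[t] = np.tanh(W_x @ x[t] + W_h @ h_prev + b)
    return h
```

With dilation = 1 this reduces to an ordinary RNN layer; larger dilations let the recurrence jump over intermediate time steps.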


(2) Exponentially Increasing Dilation

  • stack dilated recurrent layers
  • (similar to WaveNet) the dilation increases exponentially across layers
  • $s^{(l)}$ : dilation of the $l$-th layer
    • $s^{(l)} = M^{l-1}, \quad l = 1, \dots, L$
  • ex) figure 2 depicts an example with $L=3$ and $M=2$, i.e. dilations 1, 2, 4 (see the sketch after this list)
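
Reusing the toy layer from the sketch above, exponentially increasing dilation is just a stack where layer $l$ gets dilation $M^{l-1}$ (again only a hedged sketch; the default sizes are mine):

```python
def dilated_rnn(x, L=3, M=2, hidden_size=16):
    """Stack L dilated recurrent layers; layer l uses dilation s(l) = M**(l-1),
    e.g. dilations 1, 2, 4 for L=3 and M=2."""
    h = x
    for l in range(1, L + 1):
        h = dilated_rnn_layer(h, dilation=M ** (l - 1), hidden_size=hidden_size, seed=l)
    return h

# toy usage: a sequence of 32 time steps with 8-dim inputs
out = dilated_rnn(np.random.default_rng(0).normal(size=(32, 8)))
print(out.shape)  # (32, 16)
```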


Benefits

  • 1) makes different layers focus on different temporal resolutions
  • 2) reduces the average length of paths between nodes at different timestamps
    (distant time steps are connected through the upper, heavily dilated layers, so fewer recurrent steps separate them)


Generalized Dilated RNN

  • the dilation does not start at one, but at $M^{l_0}$
  • $s^{(l)} = M^{(l-1+l_0)}, \quad l = 1, \dots, L$, with $l_0 \ge 0$ (small example below)
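
A tiny helper (the function name `dilations` and its arguments are mine, not from the paper) to make the generalized schedule concrete:

```python
def dilations(L, M, l0=0):
    """Dilation schedule of the generalized Dilated RNN: s(l) = M**(l - 1 + l0)."""
    return [M ** (l - 1 + l0) for l in range(1, L + 1)]

print(dilations(L=3, M=2))        # [1, 2, 4]   -> standard case, l0 = 0
print(dilations(L=3, M=2, l0=2))  # [4, 8, 16]  -> generalized, the finest time scales are skipped
```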
