Perceiver IO: A General Architecture for Structured Inputs & Outputs

Reference: https://medium.com/analytics-vidhya/perceiver-io-a-general-architecture-for-structured-inputs-outputs-4ad669315e7f


Contents

  • Introduction
  • (1) Perceiver
  • (2) Perceiver IO


Introduction

(1) Perceiver IO

  • General-purpose architecture that builds on the Transformer while avoiding its quadratic complexity in the input size

  • Extension of the original Perceiver

    \(\rightarrow\) Extends it to outputs of arbitrary size and structure


(2) Limitation of Transformer

  • Quadratic complexity: the cost of self-attention grows with the square of the input sequence length


(3) Previous works

  • Vision Transformer (ViT)

    • Split the image into patches & feed the patch tokens to a Transformer

    \(\rightarrow\) Still does not solve the quadratic complexity (attention remains quadratic in the number of patches)
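As a rough illustration of why patchifying only postpones the problem, the sketch below counts patch tokens and attention-score entries for a few image sizes. The function names, the 16-pixel patch size, and the image sizes are illustrative assumptions, not values taken from the referenced article.

```python
def num_patches(height: int, width: int, patch: int) -> int:
    """Number of non-overlapping patch tokens for one image."""
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    return (height // patch) * (width // patch)


def attention_scores(num_tokens: int) -> int:
    """Entries in one self-attention score matrix (tokens x tokens)."""
    return num_tokens * num_tokens


# Hypothetical image sizes: doubling the side length quadruples the
# token count and multiplies the score-matrix size by 16.
for side in (224, 448, 896):
    n = num_patches(side, side, patch=16)
    print(side, n, attention_scores(n))
```

Patches shrink the token count by a constant factor, but the score matrix still grows with the square of the number of tokens.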


1. (Original) Perceiver

(https://arxiv.org/abs/2103.03206)

Goal: Solve the quadratic complexity of Transformers

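How? Add a cross-attention layer between:

  • (1) the input sequence
  • (2) the multi-head attention layers

The cross-attention maps the long input onto a small latent array, so the multi-head attention runs over the latents instead of the raw input.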
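The idea can be sketched in NumPy as follows. This is a minimal single-head sketch under stated assumptions: it omits the learned projections, multiple heads, normalization, and MLP blocks of the real model, and the sizes `N`, `M`, `D` and the random arrays are placeholders chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # stabilize the exponentials
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(queries, keys, values):
    """Scaled dot-product attention; also returns the score-matrix shape."""
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    return softmax(scores) @ values, scores.shape


rng = np.random.default_rng(0)
N, M, D = 10_000, 64, 32           # inputs, latents, channels (assumed sizes)
inputs = rng.normal(size=(N, D))   # stand-in for a flattened input array
latents = rng.normal(size=(M, D))  # stand-in for the learned latent array

# Cross-attention: latents act as queries, inputs as keys and values.
latents, cross_shape = attention(latents, inputs, inputs)
# Self-attention then runs on the latents only.
latents, self_shape = attention(latents, latents, latents)

print(cross_shape, self_shape, latents.shape)
```

The cross-attention score matrix has shape (M, N), which is linear in the input length, and the self-attention score matrix has shape (M, M), which does not depend on the input length at all.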


figure2


2. Perceiver IO

Add a cross-attention mechanism as the last layer, acting as the decoder.

\(\rightarrow\) Maps the encoder's latent array to outputs of arbitrary size and structure through a querying system: each desired output element has its own query feature vector, and that vector is used to query the latent array.
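The querying step might look like the NumPy sketch below. It is a simplified single-head version without learned projections; `cross_attend`, the sizes, and the random query arrays are hypothetical stand-ins, and how the real model builds its queries for each task is not covered here.

```python
import numpy as np


def cross_attend(queries, latents):
    """Single-head cross-attention: each query reads from the latent array."""
    scores = queries @ latents.T / np.sqrt(queries.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ latents


rng = np.random.default_rng(0)
M, D = 64, 32                      # latent count and channels (assumed sizes)
latents = rng.normal(size=(M, D))  # stand-in for the encoder output

# One query vector per desired output element; the number of queries
# is chosen freely and does not depend on the latent count M.
for num_outputs in (1, 500, 5_000):
    queries = rng.normal(size=(num_outputs, D))
    outputs = cross_attend(queries, latents)
    print(outputs.shape)
```

Because the output has one row per query, changing the output size or structure only requires changing the set of queries, not the encoder or the latent array.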


figure2
