Transformers Can Do Arithmetic with the Right Embeddings

McLeish, Sean, et al. "Transformers Can Do Arithmetic with the Right Embeddings." NeurIPS 2024

References:

  • https://aipapersacademy.com/abacus-embeddings/
  • https://arxiv.org/pdf/2405.17399


Contents

  1. Introduction
  2. Abacus Embedding
    1. Positional Embedding
    2. Abacus Embedding
  3. Experiments


1. Introduction

Remarkable success of LLMs

\(\rightarrow\) Still not good at complex multi-step and algorithmic reasoning


Common approach

  • Focus on simple arithmetic problems (e.g., addition)

This paper = Remarkable progress in the arithmetic capabilities of transformers!


2. Abacus Embedding

(1) Positional Embedding (PE)

Calculation (by humans) vs. PE

  • Calculation = Align the digits in columns according to their significance within each number

  • PE = Does not encode the position of each digit within its number

    ( Rather, just the position of the token in the whole sequence; see the sketch below )

[Figure 2]
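To make the mismatch concrete, here is a small Python sketch (not from the paper; the problem string and character-level tokenization are illustrative) showing that standard absolute positions carry no information about which column a digit belongs to:

```python
# Standard (absolute) positional embeddings index tokens by their place in the
# whole sequence, so digits in the same column get unrelated position values.

def absolute_positions(tokens):
    """Position of each token in the whole sequence, as standard PE sees it."""
    return list(range(len(tokens)))

tokens = list("678+471=")            # character-level tokenization of one problem
print(tokens)                        # ['6', '7', '8', '+', '4', '7', '1', '=']
print(absolute_positions(tokens))    # [0, 1, 2, 3, 4, 5, 6, 7]
# The hundreds digits '6' and '4' receive positions 0 and 4: nothing in the
# positional signal tells the model they sit in the same column.
```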


(2) Abacus Embedding

Solution: Use a new type of positional embeddings!

Abacus embeddings

  • Provide the same positional value to all digits of the same significance, wherever they appear in the sequence (see the sketch below)

[Figure 2]
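A minimal Python sketch of how such position ids could be computed, assuming character-level digit tokens and operands written least-significant digit first as in the paper. The reset-to-zero handling of non-digit tokens and the random training offset are simplified assumptions, not the paper's exact implementation:

```python
import random

DIGITS = set("0123456789")

def abacus_position_ids(tokens, max_offset=0):
    """Abacus-style position ids for a character-token sequence (sketch).

    Each digit gets an index equal to its position within its own number
    (1, 2, 3, ...); non-digit tokens reset the counter and get index 0 here.
    With operands written least-significant digit first, index k always means
    the 10^(k-1) column, so digits of equal significance share an embedding.
    `max_offset` imitates the random offset the paper uses during training to
    help length generalization (the exact offset scheme is an assumption).
    """
    offset = random.randint(0, max_offset) if max_offset > 0 else 0
    ids, k = [], 0
    for tok in tokens:
        if tok in DIGITS:
            k += 1
            ids.append(k + offset)
        else:
            k = 0
            ids.append(0)
    return ids

# 678 + 471 written least-significant digit first: "876+174="
tokens = list("876+174=")
print(abacus_position_ids(tokens))   # [1, 2, 3, 0, 1, 2, 3, 0]
# The ones digits ('8' and '1') both get id 1, the tens digits both get id 2, ...
```

These ids would then index a learned embedding table whose output is added to the token embeddings; because digits of the same significance share an id, the model can line up columns the way a human does.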


3. Experiments

[Figure 2]

Key result: trained on addition with operands of up to 20 digits, the model generalizes to 100-digit addition with up to 99% accuracy.
