FITS: Modeling TS with 10k Parameters

Abstract
Introduction
Related Work & Motivation
Method
Experiments

Abstract

FITS

lightweight model
directly process raw time-domain (X)
interpolation in complex frequency domain (O)
use 10k parameters

1. Introduction

Frequency domain representation in TS: compact & efficient

Existing works: FEDformer, TimesNet

Still… comprehensive utilization of frequency domain’s compactness remains unexplored!

( i.e. employing complex numbers )

FITS

Reinterpret TS tasks (i.e. forecasting, reconstruction) as interpolation within frequency domain
Produce an extended TS segment by interpolating the frequency representation of a provided segment
- ex) Forecasting: by extending the given look-back window with frequency interpolation
- ex) Reconstruction: by interpolating the frequency representation of its downsampled counterpart
Core of FITS = complex-valued linear layer
- designed to learn “amplitude scaling” & “phase shift”
But still, fundamentally remains a time-domain model, by integrating rFFT
- (1) transform input into frequency domain using rFFT
- (2) map the result back to time domain using the inverse rFFT
Incorporates low-pass filter
- ensures a compact representation
Use only 10k params
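Why a complex-valued weight can express both “amplitude scaling” and “phase shift”: multiplying two complex numbers multiplies their amplitudes and adds their phases. A minimal sketch with NumPy; the values of `X_k` and `w` are made up for illustration and are not taken from the paper.

```python
import numpy as np

# One frequency component with amplitude 2.0 and phase 0.3 rad (illustrative values).
X_k = 2.0 * np.exp(1j * 0.3)

# Hypothetical learned complex weight: scales amplitude by 0.5, shifts phase by 0.4 rad.
w = 0.5 * np.exp(1j * 0.4)

Y_k = w * X_k

out_amplitude = np.abs(Y_k)   # product of the two amplitudes
out_phase = np.angle(Y_k)     # sum of the two phases
print(out_amplitude, out_phase)
```

A full complex-valued linear layer applies one such multiplication per (input bin, output bin) pair and sums the results.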

2. Related Work & Motivation

(1) Frequency-aware TS Models

FNet
FEDFormer
FiLM
TimesNet

(2) Divide & Conquer the Frequency Components

Treating the TS as SIGNAL

Break down into linear combination of sinusoidal components ( w/o info loss )
- each component = unique frequency & initial phase & amplitude
Forecasting each frequency component: straightforward
- only apply a phase bias to the sinusoidal wave ( based on time shift )
- then, linearly combine these shifted waves!

HOWEVER, forecasting each sinusoidal component in TIME domain can be cumbersome!

( \(\because\) sinusoidal components are treated as sequences of data points )

\(\rightarrow\) Solution: perform it on FREQUENCY domain
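To make the contrast concrete, the sketch below advances a toy signal by `tau` samples in two ways: re-evaluating every sample of every sinusoid in the time domain, and applying one phase bias per component in the frequency domain. The signal, `N`, and `tau` are illustrative assumptions; the two results match exactly only because each sinusoid completes a whole number of cycles in the window.

```python
import numpy as np

N = 64            # window length (illustrative)
tau = 5           # how far to advance the signal, in samples
t = np.arange(N)

def toy_signal(time):
    # Two sinusoids with 3 and 7 full cycles per window.
    return (1.5 * np.sin(2 * np.pi * 3 * time / N + 0.4)
            + 0.5 * np.sin(2 * np.pi * 7 * time / N - 1.0))

x = toy_signal(t)

# Time-domain view: every data point of every component is recomputed.
x_future_time = toy_signal(t + tau)

# Frequency-domain view: one complex multiplication (phase bias) per component.
X = np.fft.rfft(x)
k = np.arange(X.shape[0])
X_advanced = X * np.exp(1j * 2 * np.pi * k * tau / N)
x_future_freq = np.fft.irfft(X_advanced, n=N)

print(np.max(np.abs(x_future_time - x_future_freq)))  # difference at floating-point noise level
```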

3. Method

(1) Preliminaries: FFT & Complex Frequency Domain

a) FFT

Efficiently perform DFT on complex number sequences
Transforms discrete-time signals from TIME \(\rightarrow\) FREQUENCY

( rFFT: \(N\) real numbers \(\rightarrow\) \(N/2+1\) complex numbers )
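A quick check of the length relation with NumPy's real FFT; the length `N = 96` and the random input are arbitrary choices for illustration.

```python
import numpy as np

N = 96
x = np.random.default_rng(0).standard_normal(N)  # N real numbers

X = np.fft.rfft(x)             # complex spectrum with N // 2 + 1 bins
x_rec = np.fft.irfft(X, n=N)   # inverse transform back to N real numbers

print(x.shape, X.shape)        # (96,) (49,)
print(np.allclose(x, x_rec))   # True
```

Passing `n=N` to `irfft` matters for odd lengths, where the output length cannot be inferred from the number of bins alone.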

b) Complex Frequency Domain

Complex number

captures both amplitude & phase of the component
can be represented as a complex exponential element with a given amplitude & phase
\(X(f)= \mid X(f) \mid e^{j \theta(f)}\).
- \(X(f)\) : complex number associated with the frequency component at frequency \(f\)
- \(\mid X(f) \mid\) : amplitude
- \(\theta(f)\) : phase

Complex plane

Complex exponential element can be visualized as …

a vector with a length equal to the amplitude and angle equal to the phase
\(X(f)= \mid X(f) \mid (\cos \theta(f)+j \sin \theta(f))\).
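The exponential and trigonometric forms above describe the same number. A small sketch, using an arbitrary value for \(X(f)\):

```python
import numpy as np

X_f = 3.0 + 4.0j               # arbitrary complex frequency component

amplitude = np.abs(X_f)        # |X(f)|, the vector length
phase = np.angle(X_f)          # theta(f), the vector angle in radians

# Rebuild X(f) from amplitude and phase in both notations.
from_exponential = amplitude * np.exp(1j * phase)
from_trig = amplitude * (np.cos(phase) + 1j * np.sin(phase))

print(amplitude)               # 5.0
print(np.isclose(from_exponential, X_f), np.isclose(from_trig, X_f))
```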

Time Shift & Phase Shift

Time Shift = Phase Shift in FREQUENCY domain

by multiplying a unit complex exponential element with the corresponding phase ( in FREQ domain )

Shift signal \(x(t)\) forward in TIME by \(\tau\) = \(x(t-\tau)\)

\(\rightarrow\) Fourier transform: \(X_\tau(f)=e^{-j 2 \pi f \tau} X(f)= \mid X(f) \mid e^{j(\theta(f)-2 \pi f \tau)}=[\cos (-2 \pi f \tau)+j \sin (-2 \pi f \tau)] X(f)\)

Amplitude : \(\mid X(f) \mid\)
Phase \(\theta_\tau(f)=\theta(f)-2 \pi f \tau\)
- linear to the time shift.
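A numerical check of this property, sketched with the discrete Fourier transform. The assumption here is a circular shift (`np.roll`), which is what the shift property corresponds to for a finite-length DFT; `N`, `tau`, and the random signal are illustrative.

```python
import numpy as np

N = 128
tau = 9
x = np.random.default_rng(1).standard_normal(N)

x_delayed = np.roll(x, tau)    # circular version of x(t - tau)

X = np.fft.rfft(x)
X_delayed = np.fft.rfft(x_delayed)

# Discrete frequency of bin k is f = k / N cycles per sample.
k = np.arange(X.shape[0])
phase_factor = np.exp(-1j * 2 * np.pi * k * tau / N)

same_amplitude = np.allclose(np.abs(X_delayed), np.abs(X))
matches_phase_shift = np.allclose(X_delayed, phase_factor * X)
print(same_amplitude, matches_phase_shift)
```

The amplitude of every bin is untouched; only the phase changes, by an amount proportional to both the frequency and the shift.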

(2) FITS Pipeline

(Motivation) Longer TS = Higher frequency resolution

\(\rightarrow\) Train FITS to extend TS segment by interpolating the frequency representation of input TS segment

LPF (Low-Pass Filter)

To reduce the model size
Eliminates HIGH-frequency components above certain cutoff

Forecasting

generate the look-back window along with the horizon

( = combining backcast & forecast )

Reconstruction

downsample the original TS based on specific downsampling rate
then, perform frequency interpolation
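The pipeline can be sketched as a single forward pass in NumPy. This is a rough illustration under stated assumptions, not the authors' implementation: `fits_forward` is a hypothetical name, the weights `W` and `b` are random and untrained, normalization is omitted, and the final rescaling by `out_len / L` is included only to offset the `1/n` factor that `irfft` applies.

```python
import numpy as np

def fits_forward(x, out_len, W, b):
    """Sketch: map a real series of length L to a real series of length out_len.

    W has shape (out_bins, cutoff) and b has shape (out_bins,), both complex.
    """
    L = x.shape[0]
    cutoff = W.shape[1]

    X = np.fft.rfft(x)                        # time -> frequency
    X_low = X[:cutoff]                        # low-pass filter: keep lowest bins
    Y_low = W @ X_low + b                     # complex-valued linear layer

    Y = np.zeros(out_len // 2 + 1, dtype=complex)
    Y[:Y_low.shape[0]] = Y_low                # zero-pad the discarded high bins

    # Frequency -> time, rescaled to offset irfft's 1/n normalization.
    return np.fft.irfft(Y, n=out_len) * (out_len / L)

rng = np.random.default_rng(0)
L, H, cutoff = 96, 48, 20                     # illustrative sizes
out_bins = int(cutoff * (L + H) / L)          # 30 output bins

W = (rng.standard_normal((out_bins, cutoff))
     + 1j * rng.standard_normal((out_bins, cutoff))) / cutoff
b = np.zeros(out_bins, dtype=complex)

x = rng.standard_normal(L)
y = fits_forward(x, L + H, W, b)
print(y.shape)                                # (144,)
```

For forecasting, `out_len` is the look-back window plus the horizon. For reconstruction, `x` would be the downsampled series and `out_len` the original length.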

(3) Key Mechanism of FITS

a) Complex Frequency Linear Interpolation

Interpolation rate: \(\eta\)

ratio of the model’s output length \(L_o\) to its corresponding input length \(L_i\).

Frequency interpolation

operates on the normalized complex frequency representation ( = half the length of the original TS )

Interpolation rate can also be applied to the frequency domain

\(\eta_{\text{freq}}=\frac{L_o / 2}{L_i / 2}=\frac{L_o}{L_i}=\eta\).

With an arbitrary frequency \(f\) …

Frequency band \(1 \sim f\) in the original signal is linearly projected to the frequency band \(1 \sim \eta f\) in the output signal.
Input length of our complex-valued linear layer = \(L\)
Interpolated output length = \(\eta L\).
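A worked example of the band mapping, assuming \(L_i = 96\) and \(L_o = 144\) (so \(\eta = 1.5\)): a sinusoid that sits in frequency bin 4 of the input sits in bin \(1.5 \times 4 = 6\) once the series is extended, because the same physical frequency now completes more cycles in the longer window.

```python
import numpy as np

L_i, L_o = 96, 144
eta = L_o / L_i                       # interpolation rate, 1.5

freq = 4 / L_i                        # cycles per sample: 4 full cycles in the input
x_in = np.sin(2 * np.pi * freq * np.arange(L_i))
x_out = np.sin(2 * np.pi * freq * np.arange(L_o))   # same wave, longer window

bin_in = int(np.argmax(np.abs(np.fft.rfft(x_in))))
bin_out = int(np.argmax(np.abs(np.fft.rfft(x_out))))

print(bin_in, bin_out)                # 4 6
```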

b) Low Pass Filter (LPF)

To compress the model’s volume
By discarding frequency components above a specified cutoff frequency (COF)
Ensures that a significant portion of the original time series’ meaningful content is preserved
- High-frequency components filtered out by the LPF typically comprise noise
  
  ( = irrelevant for effective time series modeling )

Selecting COF? Nontrivial!

\(\rightarrow\) propose method based on the harmonic content of the dominant frequency
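One way such a rule might look in code. This is a guess at the idea, not the paper's exact procedure: `harmonic_cutoff` and `low_pass` are hypothetical helpers, the dominant frequency is taken as the largest non-DC bin, and the cutoff is placed just above its `n_harmonics`-th harmonic.

```python
import numpy as np

def harmonic_cutoff(x, n_harmonics):
    """Hypothetical rule: keep bins up to the n-th harmonic of the dominant frequency."""
    amp = np.abs(np.fft.rfft(x))
    k0 = int(np.argmax(amp[1:])) + 1              # dominant bin, skipping DC
    return min(n_harmonics * k0 + 1, amp.shape[0])

def low_pass(x, cutoff_bins):
    """Zero every frequency bin at or above cutoff_bins."""
    X = np.fft.rfft(x)
    X[cutoff_bins:] = 0
    return np.fft.irfft(X, n=x.shape[0])

N = 240
t = np.arange(N)
clean = np.sin(2 * np.pi * 10 * t / N) + 0.3 * np.sin(2 * np.pi * 20 * t / N)
noisy = clean + 0.1 * np.sin(2 * np.pi * 100 * t / N)   # high-frequency disturbance

cutoff = harmonic_cutoff(noisy, n_harmonics=3)
filtered = low_pass(noisy, cutoff)

print(cutoff)                          # 31
print(np.allclose(filtered, clean))    # True
```

The fundamental (bin 10) and its second harmonic (bin 20) fall below the cutoff and survive; the component at bin 100 is discarded. A smaller cutoff also means fewer input bins for the linear layer, which is where the parameter savings come from.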

Also adopt channel independence

4. Experiments

(1) Forecasting as Frequency Interpolation

Input length: \(L\)
Output length: \(H\)
Combination of look-back window & forecasting horizon: \(L+H\)

Interpolation rate of the forecasting task:

\(\eta_{\text {Fore }}=1+\frac{H}{L}\).
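The rate is the full output length \(L+H\) divided by the input length \(L\). A small arithmetic sketch; the window and horizon values are illustrative, and `forecast_interpolation_rate` is a helper name introduced here.

```python
def forecast_interpolation_rate(L, H):
    """eta_Fore = 1 + H / L, i.e. (L + H) / L."""
    return 1 + H / L

rate_192 = forecast_interpolation_rate(96, 192)
rate_336 = forecast_interpolation_rate(96, 336)

print(rate_192)   # 3.0
print(rate_336)   # 4.5
```

With a look-back window of 96, a horizon of 192 means the model must triple the length of its frequency representation.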


Seunghan Lee
