FITS: Modeling TS with 10k Parameters


Contents

  1. Abstract
  2. Introduction
  3. Related Work & Motivation
  4. Method
  5. Experiments


Abstract

FITS

  • lightweight model
  • does NOT directly process the raw time-domain data
  • instead, performs interpolation in the complex frequency domain

  • uses only 10k parameters


1. Introduction

Frequency domain representation in TS: compact & efficient

Existing works: FEDformer, TimesNet

  • Still… comprehensive utilization of frequency domain’s compactness remains unexplored!

    ( i.e. employing complex numbers )


FITS

  • Reinterpret TS tasks (i.e. forecasting, reconstruction) as interpolation within frequency domain
  • Produce an extended TS segment by interpolating the frequency representation of a provided segment
    • ex) Forecasting: by extending the given look-back window with frequency interpolation
    • ex) Reconstruction: by interpolating the frequency representation of its downsampled counterpart
  • Core of FITS = complex-valued linear layer
    • designed to learn “amplitude scaling” & “phase shift”
  • But still, fundamentally remains a time-domain model, by integrating rFFT
    • (1) transform the input into the frequency domain using rFFT
    • (2) map the interpolated frequency representation back to the time domain ( inverse rFFT )
  • Incorporates low-pass filter
    • ensures a compact representation
  • Use only 10k params


2. Related Work & Motivation

(1) Frequency-aware TS Models

  • FNet
  • FEDformer
  • FiLM
  • TimesNet


(2) Divide & Conquer the Frequency Components

Treating the TS as SIGNAL

  • Break down into linear combination of sinusoidal components ( w/o info loss )
    • each component = unique frequency & initial phase & amplitude
  • Forecasting each frequency component: straightforward
    • only apply a phase bias to the sinusoidal wave ( based on time shift )
    • then, linearly combine these shifted waves!
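To make the "phase bias" idea concrete, here is a minimal sketch for a single sinusoidal component. The amplitude, frequency, phase and horizon values are illustrative assumptions, not taken from the paper.

```python
import math

# One sinusoidal component (illustrative values, not from the paper)
amplitude, freq, phase = 1.5, 0.05, 0.3
horizon = 7  # number of steps to look ahead

def component(t, phi):
    """Value of the sinusoid at time step t with initial phase phi."""
    return amplitude * math.sin(2 * math.pi * freq * t + phi)

# Looking `horizon` steps ahead only adds a phase bias of 2*pi*f*horizon
phase_bias = 2 * math.pi * freq * horizon
forecast = [component(t, phase + phase_bias) for t in range(20)]

# Reference: the same sinusoid evaluated at the shifted time index
shifted_truth = [component(t + horizon, phase) for t in range(20)]
```

Both lists agree up to floating-point error, so forecasting one component needs no new amplitude or frequency, only a phase offset.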


HOWEVER, forecasting each sinusoidal component in TIME domain can be cumbersome!

( sinusoidal components are treated as a sequence of data points )

Solution: perform it on FREQUENCY domain


3. Method

(1) Preliminaries: FFT & Complex Frequency Domain

a) FFT

  • Efficiently perform DFT on complex number sequences

  • Transforms discrete-time signals from TIME → FREQUENCY

    ( with rFFT: N real numbers → N/2+1 complex numbers )
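The size relation can be checked quickly with NumPy's real FFT. This is only a sketch; the signal values are arbitrary.

```python
import numpy as np

N = 8
x = np.arange(N, dtype=float)      # N real samples (arbitrary values)

X = np.fft.rfft(x)                 # N//2 + 1 complex coefficients
x_back = np.fft.irfft(X, n=N)      # inverse rFFT recovers the N real samples

print(X.shape, X.dtype)
```

Passing `n=N` to `irfft` matters when N is odd, since the output length cannot be inferred from the number of coefficients alone.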


b) Complex Frequency Domain

Complex number

  • captures both amplitude & phase of the component

  • can be represented as a complex exponential element with a given amplitude & phase
  • $X(f) = |X(f)| e^{j\theta(f)}$.
    • $X(f)$ : complex number associated with the frequency component at frequency $f$
    • $|X(f)|$ : amplitude
    • $\theta(f)$ : phase

Complex plane

Complex exponential element can be visualized as …

  • a vector with a length equal to the amplitude and angle equal to the phase

  • $X(f) = |X(f)| (\cos\theta(f) + j\sin\theta(f))$.
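The two forms (complex exponential and cosine/sine) can be compared with Python's standard `cmath` module. The amplitude and phase below are arbitrary example values.

```python
import cmath
import math

amplitude, phase = 2.0, math.pi / 3   # arbitrary example values

# Exponential form: |X| * e^{j*theta}
X_f = cmath.rect(amplitude, phase)

# Euler form: |X| * (cos(theta) + j*sin(theta))
euler = amplitude * complex(math.cos(phase), math.sin(phase))

# Back to polar coordinates: vector length = amplitude, angle = phase
r, theta = cmath.polar(X_f)
```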


figure2


Time Shift & Phase Shift

Time Shift = Phase Shift in FREQUENCY domain

  • by multiplying a unit complex exponential element with the corresponding phase ( in FREQ domain )


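Shift signal $x(t)$ forward in TIME by $\tau$ = $x(t-\tau)$

Fourier transform: $X_\tau(f) = e^{-j2\pi f\tau}X(f) = |X(f)| e^{j(\theta(f) - 2\pi f\tau)} = [\cos(-2\pi f\tau) + j\sin(-2\pi f\tau)]X(f)$

  • Amplitude : $|X(f)|$ ( unchanged )
  • Phase : $\theta_\tau(f) = \theta(f) - 2\pi f\tau$
    • linear in the time shift.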
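A small numerical check of the time-shift property, as a sketch on a discrete signal. For the DFT the shift is circular, so `np.roll` serves as the time-domain reference. The test signal and the shift of 5 samples are arbitrary choices.

```python
import numpy as np

N, tau = 64, 5                      # signal length and shift in samples (arbitrary)
t = np.arange(N)
x = np.sin(2 * np.pi * 3 * t / N) + 0.5 * np.cos(2 * np.pi * 7 * t / N)

X = np.fft.rfft(x)
f = np.fft.rfftfreq(N)              # frequency of each bin, in cycles per sample

# Time shift by tau == multiply each bin by a unit complex exponential
X_shifted = np.exp(-2j * np.pi * f * tau) * X
x_shifted = np.fft.irfft(X_shifted, n=N)

# Time-domain reference: circular shift forward by tau, i.e. x[t - tau]
x_reference = np.roll(x, tau)
```

The amplitudes `np.abs(X_shifted)` equal `np.abs(X)`; only the phase of each bin changes, by an amount proportional to both the frequency and the shift.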


(2) FITS Pipeline

(Motivation) Longer TS = Higher frequency resolution

Train FITS to extend a TS segment by interpolating the frequency representation of the input TS segment

figure2


LPF (Low-Pass Filter)

  • To reduce the model size
  • Eliminates HIGH-frequency components above certain cutoff


Forecasting

  • generate the look-back window along with the horizon

    ( = combining backcast & forecast )


Reconstruction

  • downsample the original TS based on specific downsampling rate
  • then, perform frequency interpolation
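The pipeline above can be sketched at the shape level as follows. This is not the authors' implementation: the function name `fits_like_forward`, the zero-mean normalization, the zero-padding of the discarded bins, the length-ratio rescaling, and the random (untrained) weights are all assumptions made for illustration.

```python
import numpy as np

def fits_like_forward(x, weight, bias, out_len):
    """Sketch: rFFT -> low-pass filter -> complex linear -> zero-pad -> irFFT."""
    cutoff = weight.shape[1]                  # number of low-frequency bins kept
    mean = x.mean()                           # assumed zero-mean normalization
    spec = np.fft.rfft(x - mean)              # (len(x)//2 + 1,) complex bins
    low = spec[:cutoff]                       # low-pass filter: drop bins >= cutoff
    out_low = weight @ low + bias             # complex-valued linear layer
    out_spec = np.zeros(out_len // 2 + 1, dtype=complex)
    out_spec[:out_low.shape[0]] = out_low     # zero-pad the high-frequency bins
    y = np.fft.irfft(out_spec, n=out_len)     # back to the time domain
    return y * (out_len / x.shape[0]) + mean  # assumed length-ratio rescaling

rng = np.random.default_rng(0)
L, H, cutoff = 96, 48, 12                     # hypothetical look-back, horizon, cutoff
eta = (L + H) / L
out_bins = int(cutoff * eta)

# Untrained random complex weights, only to show the shapes involved
weight = 0.1 * (rng.standard_normal((out_bins, cutoff))
                + 1j * rng.standard_normal((out_bins, cutoff)))
bias = np.zeros(out_bins, dtype=complex)

x = np.sin(2 * np.pi * np.arange(L) / 24)
y = fits_like_forward(x, weight, bias, L + H)
backcast, forecast = y[:L], y[L:]             # forecasting: look-back + horizon
```

For reconstruction, the same function would be called with the downsampled series as `x` and the original length as `out_len`.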


(3) Key Mechanism of FITS

a) Complex Frequency Linear Interpolation

Interpolation rate: η

  • ratio of the model’s output length $L_o$ to its corresponding input length $L_i$, i.e. $\eta = L_o / L_i$.


Frequency interpolation

  • operates on the normalized complex frequency representation ( = half the length of the original TS )


Interpolation rate can also be applied to the frequency domain

  • $\eta_{freq} = \frac{L_o/2}{L_i/2} = \frac{L_o}{L_i} = \eta$.


With an arbitrary frequency f

  • Frequency band $1 \sim f$ in the original signal is linearly projected to the frequency band $1 \sim \eta f$ in the output signal.
  • Input length of our complex-valued linear layer = $L$
  • Interpolated output length = $\eta L$.
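To illustrate the band projection, the sketch below uses a hand-set (not learned) complex weight matrix that copies input bin k to output bin ηk. The sizes, the integer rate η = 2, and the final rescaling by η are assumptions chosen so the result can be checked exactly.

```python
import numpy as np

L_in, eta = 32, 2                      # input length and (integer) interpolation rate
L_out = eta * L_in
t_in, t_out = np.arange(L_in), np.arange(L_out)
x = np.sin(2 * np.pi * 4 * t_in / L_in)   # 4 full cycles inside the input window

X = np.fft.rfft(x)

# Complex linear layer: L = 8 input bins -> eta * L = 16 output bins
c_in = 8
c_out = eta * c_in
W = np.zeros((c_out, c_in), dtype=complex)
for k in range(c_in):
    W[eta * k, k] = 1.0                # hand-set: input bin k -> output bin eta*k

Y = np.zeros(L_out // 2 + 1, dtype=complex)
Y[:c_out] = W @ X[:c_in]               # remaining high bins stay zero

# Rescale by eta because irFFT divides by the (longer) output length
y = np.fft.irfft(Y, n=L_out) * eta
```

Here `y` continues the same sinusoid over 64 samples. A trained layer would instead learn complex weights, so each entry can scale the amplitude and shift the phase of the bins it mixes.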


b) Low Pass Filter (LPF)

  • To compress the model’s volume

  • By discarding frequency components above a specified cutoff frequency (COF)

  • Ensures that a significant portion of the original time series’ meaningful content is preserved

    • High-frequency components filtered out by the LPF typically comprise noise

      ( = irrelevant for effective time series modeling )
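A minimal sketch of the low-pass idea on synthetic data: a two-tone signal plus white noise, with only the first 16 frequency bins kept. The signal, noise level and cutoff are made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, cutoff = 256, 16                    # hypothetical length and cutoff (in bins)
t = np.arange(N)
clean = np.sin(2 * np.pi * 4 * t / N) + 0.5 * np.sin(2 * np.pi * 9 * t / N)
noisy = clean + 0.1 * rng.standard_normal(N)

spec = np.fft.rfft(noisy)              # N//2 + 1 = 129 complex bins
kept = spec[:cutoff]                   # compact representation: 16 bins

# Zero out everything above the cutoff and go back to the time domain
padded = np.zeros_like(spec)
padded[:cutoff] = kept
filtered = np.fft.irfft(padded, n=N)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_filtered = np.mean((filtered - clean) ** 2)
```

Because both tones sit below the cutoff, filtering removes only noise here, so `mse_filtered` comes out smaller than `mse_noisy` while the model only has to handle 16 of the 129 bins.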

figure2


Selecting COF? Nontrivial!

Proposed method: choose the COF based on the harmonic content of the dominant frequency
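One possible reading of the harmonic-based choice is sketched below. The helper `harmonic_cutoff` and its exact rule (keep all bins up to the n-th harmonic of the strongest non-DC bin) are hypothetical and may differ from the rule used in the paper.

```python
import numpy as np

def harmonic_cutoff(x, n_harmonics):
    """Hypothetical helper: keep bins up to the n-th harmonic of the dominant frequency."""
    spec = np.abs(np.fft.rfft(x - x.mean()))
    dominant_bin = int(np.argmax(spec[1:])) + 1   # strongest bin, skipping DC
    cutoff = dominant_bin * n_harmonics + 1       # +1 so the n-th harmonic bin is kept
    return min(cutoff, spec.shape[0])

N = 128
t = np.arange(N)
# Fundamental at bin 4 plus weaker 2nd and 3rd harmonics (synthetic example)
x = (np.sin(2 * np.pi * 4 * t / N)
     + 0.3 * np.sin(2 * np.pi * 8 * t / N)
     + 0.1 * np.sin(2 * np.pi * 12 * t / N))

cutoff = harmonic_cutoff(x, n_harmonics=3)        # bins 0..12 are kept
```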


Also adopt channel independence


4. Experiments

(1) Forecasting as Frequency Interpolation

  • Input length : $L$
  • Output length : $H$
  • Combination of look-back window & forecasting horizon : $L+H$


Interpolation rate of the forecasting task:

  • $\eta_{Fore} = 1 + \frac{H}{L}$.
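A quick arithmetic example of the forecasting interpolation rate. The look-back length, horizon and number of kept frequency bins are hypothetical values, and truncating the output bin count with `int` is an assumption.

```python
def forecasting_interpolation_rate(lookback, horizon):
    """eta_Fore = 1 + H / L, i.e. (L + H) / L."""
    return 1 + horizon / lookback

L, H = 360, 180                              # hypothetical look-back and horizon
eta_fore = forecasting_interpolation_rate(L, H)

cutoff_bins = 40                             # hypothetical number of bins kept by the LPF
output_bins = int(cutoff_bins * eta_fore)    # bins produced by the complex linear layer
```

With these values the rate is 1.5, so 40 input bins are interpolated to 60 output bins, matching the output segment being 1.5 times as long as the look-back window.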
