FITS: Modeling TS with 10k Parameters
Contents
- Abstract
- Introduction
- Related Work & Motivation
- Method
- Experiments
Abstract
FITS
- lightweight model
- does NOT directly process raw time-domain data
- instead, performs interpolation in the complex frequency domain
- uses only about 10k parameters
1. Introduction
Frequency domain representation in TS: compact & efficient
Existing works: FEDformer, TimesNet
- Still… comprehensive utilization of frequency domain’s compactness remains unexplored!
( i.e. employing complex numbers )
FITS
- Reinterpret TS tasks (i.e. forecasting, reconstruction) as interpolation within frequency domain
- Produce an extended TS segment by interpolating the frequency representation of a provided segment
- ex) Forecasting: by extending the given look-back window with frequency interpolation
- ex) Reconstruction: by interpolating the frequency representation of its downsampled counterpart
- Core of FITS = complex-valued linear layer
- designed to learn “amplitude scaling” & “phase shift”
- But still, fundamentally remains a time-domain model, by integrating rFFT
- (1) transform the input segment into the frequency domain using rFFT
- (2) map the interpolated frequency representation back to the time domain using the inverse rFFT
- Incorporates low-pass filter
- ensures a compact representation
- Use only 10k params
2. Related Work & Motivation
(1) Frequency-aware TS Models
- FNet
- FEDformer
- FiLM
- TimesNet
(2) Divide & Conquer the Frequency Components
Treating the TS as SIGNAL
- Break down into linear combination of sinusoidal components ( w/o info loss )
- each component = unique frequency & initial phase & amplitude
- Forecasting each frequency component: straightforward
- only apply a phase bias to the sinusoidal wave ( based on time shift )
- then, linearly combine these shifted waves!
HOWEVER, forecasting each sinusoidal component in TIME domain can be cumbersome!
( ∵ sinusoidal components are treated as sequences of data points )
→ Solution: perform it on FREQUENCY domain
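As a minimal sketch of the phase-bias idea above: if the frequency, phase and amplitude of each component are known, looking `tau` steps ahead only requires adding `2*pi*f*tau` to each phase and recombining. The two components and the horizon below are made-up values for illustration, not taken from the paper.

```python
import math

# Hypothetical components: (amplitude, frequency in cycles per step, initial phase)
components = [(1.0, 0.05, 0.3), (0.5, 0.20, -1.1)]

def signal(t, comps):
    """Linear combination of sinusoids evaluated at time step t."""
    return sum(a * math.cos(2 * math.pi * f * t + p) for a, f, p in comps)

def shift_phases(comps, tau):
    """Look tau steps ahead, x(t + tau), by adding a phase bias of 2*pi*f*tau."""
    return [(a, f, p + 2 * math.pi * f * tau) for a, f, p in comps]

horizon = 24  # assumed forecasting offset
shifted = shift_phases(components, horizon)

# The phase-shifted components at step t equal the original signal at t + horizon
forecast = [signal(t, shifted) for t in range(8)]
actual = [signal(t + horizon, components) for t in range(8)]
max_err = max(abs(a - b) for a, b in zip(forecast, actual))
```

Only one number per component changes here, which is why working on the components directly is simpler than predicting every data point of every wave.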
3. Method
(1) Preliminaries: FFT & Complex Frequency Domain
a) FFT
- Efficiently computes the DFT of complex number sequences
- Transforms discrete-time signals from TIME → FREQUENCY
- rFFT ( FFT for real-valued input ): N real numbers → N/2+1 complex numbers
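The length relation can be checked with a small sketch. A naive DFT stands in here for a library rFFT (for example `numpy.fft.rfft`) so that the example needs only the standard library; the sample values are arbitrary.

```python
import cmath

def rdft(x):
    """Naive DFT of a real sequence, keeping only bins 0..N/2 (what an rFFT returns)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]

x = [0.0, 1.0, 0.5, -0.5, -1.0, 0.25, 0.75, -0.25]  # N = 8 real samples
spectrum = rdft(x)

print(len(x), "real ->", len(spectrum), "complex")  # 8 real -> 5 complex
```

The remaining bins are complex conjugates of the kept ones for real input, so dropping them loses no information.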
b) Complex Frequency Domain
Complex number
- captures both amplitude & phase of the component
- can be represented as a complex exponential element with a given amplitude & phase
- $X(f) = |X(f)| e^{j\theta(f)}$
- $X(f)$ : complex number associated with the frequency component at frequency $f$
- $|X(f)|$ : amplitude
- $\theta(f)$ : phase
Complex plane
Complex exponential element can be visualized as …
- a vector with a length equal to the amplitude and angle equal to the phase
- $X(f) = |X(f)| (\cos\theta(f) + j\sin\theta(f))$
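A short sketch of the two equivalent forms, using Python's built-in complex numbers. The amplitude and phase values are illustrative only.

```python
import cmath
import math

amplitude, phase = 2.0, math.pi / 3  # illustrative values

# Polar form |X| e^{j theta} and rectangular form |X| (cos(theta) + j sin(theta))
polar = amplitude * cmath.exp(1j * phase)
rect = complex(amplitude * math.cos(phase), amplitude * math.sin(phase))

# Amplitude is the vector length, phase is its angle in the complex plane
recovered_amplitude = abs(polar)
recovered_phase = cmath.phase(polar)
```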
Time Shift & Phase Shift
Time Shift = Phase Shift in FREQUENCY domain
- by multiplying by a unit complex exponential element with the corresponding phase ( in FREQ domain )
Shift signal $x(t)$ forward in TIME by $\tau$ = $x(t-\tau)$
→ Fourier transform: $X_\tau(f) = e^{-j2\pi f\tau} X(f) = |X(f)| e^{j(\theta(f) - 2\pi f\tau)} = [\cos(-2\pi f\tau) + j\sin(-2\pi f\tau)] X(f)$
- Amplitude : $|X(f)|$
- Phase : $\theta_\tau(f) = \theta(f) - 2\pi f\tau$
- linear in the time shift.
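The relation can be verified numerically on a short segment. This is a sketch under two assumptions: a naive DFT replaces a library rFFT, and the shift wraps around, because the DFT treats the segment as periodic. Bin `k` of a length-`n` segment corresponds to `f = k / n` cycles per step.

```python
import cmath

def rdft(x):
    """Naive DFT of a real sequence, keeping bins 0..N/2."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]

x = [0.0, 1.0, 0.5, -0.5, -1.0, 0.25, 0.75, -0.25]  # arbitrary samples
n, tau = len(x), 3

# x(t - tau) with wrap-around (circular shift)
delayed = [x[(t - tau) % n] for t in range(n)]

X = rdft(x)
# Multiply each bin by the unit complex exponential e^{-j 2 pi f tau}
X_tau = [cmath.exp(-2j * cmath.pi * (k / n) * tau) * X[k] for k in range(len(X))]

X_delayed = rdft(delayed)
max_err = max(abs(a - b) for a, b in zip(X_tau, X_delayed))
amplitude_change = max(abs(abs(a) - abs(b)) for a, b in zip(X_tau, X))
```

Both `max_err` and `amplitude_change` come out at floating-point noise level: the shift changes only the phase of each bin, never its amplitude.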
(2) FITS Pipeline
(Motivation) Longer TS = Higher frequency resolution
→ Train FITS to extend a TS segment by interpolating the frequency representation of the input TS segment
LPF (Low-Pass Filter)
- To reduce the model size
- Eliminates HIGH-frequency components above certain cutoff
Forecasting
- generate the look-back window along with the horizon ( = combining backcast & forecast )
Reconstruction
- downsample the original TS based on specific downsampling rate
- then, perform frequency interpolation
(3) Key Mechanism of FITS
a) Complex Frequency Linear Interpolation
Interpolation rate: $\eta$
- ratio of the model’s output length $L_o$ to its corresponding input length $L_i$.
Frequency interpolation
- operates on the normalized complex frequency representation ( = half the length of the original TS )
Interpolation rate can also be applied to the frequency domain
- $\eta_{freq} = \frac{L_o/2}{L_i/2} = \frac{L_o}{L_i} = \eta$
With an arbitrary frequency $f$ …
- Frequency band $1 \sim f$ in the original signal is linearly projected to the frequency band $1 \sim \eta f$ in the output signal.
- Input length of our complex-valued linear layer = $L$
- Interpolated output length = $\eta L$
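To make the band mapping concrete, here is a toy sketch of the rFFT → complex-valued linear layer → inverse rFFT path. It is not the authors' implementation. The weights are set by hand (a trained layer would learn them), naive transforms replace library calls such as `numpy.fft.rfft` / `numpy.fft.irfft`, and the final rescaling by $\eta$ is an assumption of this sketch that compensates for the inverse transform dividing by the longer length. With $\eta = 2$, routing input bin $k$ to output bin $\eta k$ makes the output the input segment followed by its periodic continuation.

```python
import cmath

def rdft(x):
    """Naive DFT of a real sequence, keeping bins 0..N/2."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]

def irdft(spec, n):
    """Naive inverse of rdft: rebuild n real samples from bins 0..n/2."""
    out = []
    for t in range(n):
        acc = spec[0].real
        for k in range(1, len(spec)):
            # Every bin except DC and (for even n) Nyquist has a conjugate twin
            weight = 1.0 if (n % 2 == 0 and k == n // 2) else 2.0
            acc += weight * (spec[k] * cmath.exp(2j * cmath.pi * k * t / n)).real
        out.append(acc / n)
    return out

def complex_linear(weights, spec):
    """Complex-valued linear layer without bias: out[m] = sum_k W[m][k] * spec[k]."""
    return [sum(w * s for w, s in zip(row, spec)) for row in weights]

x = [0.0, 1.0, 0.5, -0.5, -1.0, 0.25, 0.75, -0.25]  # input segment, L_i = 8
eta = 2                                             # interpolation rate
L_i, L_o = len(x), eta * len(x)
n_in, n_out = L_i // 2 + 1, L_o // 2 + 1

# Hand-set weights: input bin k feeds output bin eta * k, all other entries are zero
W = [[(1.0 + 0j) if m == eta * k else 0j for k in range(n_in)] for m in range(n_out)]

Y = complex_linear(W, rdft(x))
y = [eta * v for v in irdft(Y, L_o)]  # first L_i samples: backcast, rest: extension
```

In a learned layer each weight is an arbitrary complex number, so every output bin can scale the amplitude and shift the phase of the input bins it draws from.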
b) Low Pass Filter (LPF)
- To compress the model’s volume
- By discarding frequency components above a specified cutoff frequency (COF)
- Ensures that a significant portion of the original time series’ meaningful content is preserved
- High-frequency components filtered out by the LPF typically comprise noise ( = irrelevant for effective time series modeling )
- Selecting COF? Nontrivial!
→ propose method based on the harmonic content of the dominant frequency
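One possible reading of the harmonic-based rule, as a hedged sketch: find the dominant non-DC bin and keep every bin up to its n-th harmonic. The function names, the `n_harmonics` parameter and the amplitude values are all hypothetical. The notes do not spell out the exact rule used in the paper.

```python
def cutoff_from_harmonics(amplitudes, n_harmonics):
    """Pick how many low-frequency bins to keep.

    amplitudes[k] is |X[k]| for rFFT bin k. Bin 0 (the mean) is skipped when
    searching for the dominant frequency. The cutoff keeps bins up to and
    including the n-th harmonic of the dominant bin.
    """
    dominant = max(range(1, len(amplitudes)), key=lambda k: amplitudes[k])
    cutoff = min(dominant * n_harmonics + 1, len(amplitudes))
    return dominant, cutoff

def low_pass(spec, cutoff):
    """Discard every bin at or above the cutoff index."""
    return spec[:cutoff]

# Made-up amplitude spectrum: strong component at bin 3, weaker harmonic at bin 6
amplitudes = [5.0, 0.2, 0.1, 3.0, 0.1, 0.2, 1.1, 0.1, 0.1, 0.4, 0.05, 0.05, 0.02]

dominant, cutoff = cutoff_from_harmonics(amplitudes, n_harmonics=2)
kept = low_pass(amplitudes, cutoff)  # bins 0..6 survive, bins 7..12 are dropped
```

A smaller cutoff shrinks the input and output sizes of the complex-valued linear layer, which is where the parameter savings come from.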
Also adopt channel independence
4. Experiments
(1) Forecasting as Frequency Interpolation
- Input Length : $L$
- Output Length : $H$
- Combination of look-back window & forecasting horizon : $L+H$
Interpolation rate of the forecasting task:
- $\eta_{Fore} = 1 + \frac{H}{L}$
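A quick worked example of the rate, with hypothetical lengths (`L = 96`, `H = 48`) and a hypothetical number of retained frequency bins. None of these values come from the notes above.

```python
def forecast_interpolation_rate(lookback, horizon):
    """eta_Fore = (L + H) / L = 1 + H / L."""
    return 1 + horizon / lookback

L, H = 96, 48                            # hypothetical look-back window and horizon
eta = forecast_interpolation_rate(L, H)  # 1 + 48/96 = 1.5

layer_in = 20                    # hypothetical number of retained frequency bins
layer_out = int(eta * layer_in)  # complex-valued linear layer maps 20 -> 30 bins
```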