Basic time series opening and processing functions.

maybe_unsqueeze[source]

maybe_unsqueeze(x)

Add empty dimension if it is a rank 1 tensor/array

a = np.random.random(10)
test_eq((1,10), maybe_unsqueeze(a).shape)
test_eq((1,10), maybe_unsqueeze(maybe_unsqueeze(a)).shape) #do nothing

t = torch.rand(10)
test_eq((1,10), maybe_unsqueeze(t).shape)
test_eq((1,10), maybe_unsqueeze(maybe_unsqueeze(t)).shape) #do nothing
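For reference, a numpy-only sketch of what such a helper might look like (hypothetical; the library's actual implementation also handles torch tensors, via `unsqueeze(0)`):

```python
import numpy as np

def maybe_unsqueeze_np(x):
    # Sketch: add a leading channel axis to rank-1 arrays and leave
    # higher-rank inputs untouched, so applying it twice is a no-op.
    x = np.asarray(x)
    return x[None, :] if x.ndim == 1 else x

a = np.random.random(10)
maybe_unsqueeze_np(a).shape                      # (1, 10)
maybe_unsqueeze_np(maybe_unsqueeze_np(a)).shape  # (1, 10)
```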

A time series is just a 1-dimensional array.

show_array[source]

show_array(array, ax=None, figsize=None, title=None, ctx=None, tx=None, **kwargs)

Show an array on ax.

A simple array of 1 channel is np.arange(10).

show_array(np.arange(10));

class TSeries[source]

TSeries(x, **kwargs) :: TensorBase

Basic Timeseries wrapper

We can add some noise for fun:

a = np.arange(10)+np.random.randn(10)
a.shape
(10,)

As we want to make explicit that this is a one-channel time series, we unsqueeze the first dimension.

ts = TSeries.create(a)
ts.data
tensor([[-1.0373,  2.2527,  2.2667,  4.0746,  4.2688,  4.9019,  3.6700,  7.7529,
          6.5165,  8.9746]])
ts.shape
torch.Size([1, 10])
ts.show();
ts.data.mean()
tensor(4.3641)
@patch
def encodes(self:Normalize, x:TSeries): return (x - self.mean) / self.std

@patch
def decodes(self: Normalize, x:TSeries):
    # inverse of encodes: undo the normalization
    f = to_cpu if x.device.type=='cpu' else noop
    return x*f(self.std) + f(self.mean)
nrm = Normalize(ts.data.mean(), ts.data.std())
((ts-nrm.mean)/ts.data.std()).show()
<AxesSubplot:>
nrm.encodes(ts).show();
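`encodes` and `decodes` should be inverses of each other; the round trip can be checked with plain numpy (a standalone sketch, independent of fastai's `Normalize`):

```python
import numpy as np

x = np.arange(10, dtype='float32') + np.random.randn(10).astype('float32')
mean, std = x.mean(), x.std()

encoded = (x - mean) / std      # what encodes computes
decoded = encoded * std + mean  # what decodes computes

np.allclose(decoded, x, atol=1e-5)  # True
```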

UCR

The UCR 2018 archive of 128 univariate time series datasets

This is the standard dataset collection used to benchmark time series classification algorithms.

URLs.UCR
'http://www.timeseriesclassification.com/Downloads/Archives/Univariate2018_arff.zip'
ucr_path = untar_data(URLs.UCR)

Each sub-task lives in a subfolder named after the task. For instance, we will select 'Adiac'.

ucr_path.ls()
(#135) [Path('/home/tcapelle/.fastai/data/Univariate2018_arff/GestureMidAirD2'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/Adiac'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/CinCECGTorso'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/LargeKitchenAppliances'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/MixedShapesRegularTrain'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/ShakeGestureWiimoteZ'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/SmallKitchenAppliances'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/CricketY'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/ToeSegmentation2'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/EOGVerticalSignal')...]
adiac_path = ucr_path/'Adiac'
adiac_path.ls()
(#5) [Path('/home/tcapelle/.fastai/data/Univariate2018_arff/Adiac/Adiac_TEST.txt'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/Adiac/Adiac_TEST.arff'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/Adiac/Adiac_TRAIN.arff'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/Adiac/Adiac_TRAIN.txt'),Path('/home/tcapelle/.fastai/data/Univariate2018_arff/Adiac/Adiac.txt')]

This copy of the dataset contains .txt and .arff files. We will read the .arff files with a helper function and store them in pandas DataFrames.

load_df_ucr[source]

load_df_ucr(path, task)

Loads arff files from UCR into pandas DataFrames

df_train, df_test = load_df_ucr(ucr_path, 'StarLightCurves')
Loading files from: /home/tcapelle/.fastai/data/Univariate2018_arff/StarLightCurves
df_train.head()
att1 att2 att3 att4 att5 att6 att7 att8 att9 att10 ... att1016 att1017 att1018 att1019 att1020 att1021 att1022 att1023 att1024 target
0 0.537303 0.531103 0.528503 0.529403 0.533603 0.540903 0.551103 0.564003 0.579603 0.597603 ... 0.546903 0.545903 0.543903 0.541003 0.537203 0.532303 0.526403 0.519503 0.511403 b'3'
1 0.588398 0.593898 0.599098 0.604098 0.608798 0.613397 0.617797 0.622097 0.626097 0.630097 ... 0.237399 0.246499 0.256199 0.266499 0.277399 0.288799 0.300899 0.313599 0.326899 b'3'
2 -0.049900 -0.041500 -0.033400 -0.025600 -0.018100 -0.010800 -0.003800 0.003000 0.009600 0.015900 ... -0.173801 -0.161601 -0.149201 -0.136401 -0.123201 -0.109701 -0.095901 -0.081701 -0.067100 b'1'
3 1.337005 1.319805 1.302905 1.286305 1.270005 1.254005 1.238304 1.223005 1.208104 1.193504 ... 1.288905 1.298505 1.307705 1.316505 1.324905 1.332805 1.340205 1.347005 1.353205 b'3'
4 0.769801 0.775301 0.780401 0.785101 0.789401 0.793301 0.796801 0.799901 0.802601 0.805101 ... 0.742401 0.744501 0.747301 0.750701 0.754801 0.759501 0.765001 0.771301 0.778401 b'3'

5 rows × 1025 columns
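`load_df_ucr` presumably wraps `scipy.io.arff.loadarff`; here is a self-contained sketch parsing an in-memory ARFF file (toy data and a hypothetical helper name, not the library's code). Note the nominal target values come back as bytes, which matches the `b'3'` labels above:

```python
from io import StringIO

import pandas as pd
from scipy.io import arff

# Toy ARFF mimicking the UCR layout: numeric attributes plus a nominal target.
TOY_ARFF = """@relation toy
@attribute att1 numeric
@attribute att2 numeric
@attribute target {1,3}
@data
0.53,0.59,3
-0.04,1.33,1
"""

def load_df_arff(f):
    # loadarff returns a numpy structured array plus metadata;
    # the structured array converts directly into a DataFrame.
    data, _meta = arff.loadarff(f)
    return pd.DataFrame(data)

df = load_df_arff(StringIO(TOY_ARFF))
df.shape              # (2, 3)
df['target'].iloc[0]  # b'3' -- nominal values arrive as bytes
```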

load_np_ucr[source]

load_np_ucr(path, task)

Loads arff files from UCR into np arrays

x_train, y_train, x_test, y_test = load_np_ucr(ucr_path, 'StarLightCurves')
Loading files from: /home/tcapelle/.fastai/data/Univariate2018_arff/StarLightCurves
test_eq(len(x_train), len(y_train))
test_eq(len(x_test), len(y_test))
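`load_np_ucr` presumably performs the same read and then splits features from the label column; a sketch of that conversion step (hypothetical helper, operating on a toy DataFrame with bytes labels as the .arff files produce):

```python
import numpy as np
import pandas as pd

def df_to_xy(df):
    # Last column is the target; the rest are the time series values.
    x = df.iloc[:, :-1].to_numpy(dtype='float32')
    y = df.iloc[:, -1].map(int).to_numpy()  # int(b'3') -> 3
    return x, y

toy = pd.DataFrame({'att1': [0.5, -0.1], 'att2': [0.6, 1.3], 'target': [b'3', b'1']})
x, y = df_to_xy(toy)
x.shape, y.tolist()  # (2, 2), [3, 1]
```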

Datasets

Loading from DataFrames

def get_x(row):
    return row.values[:-1].astype('float32')
def get_y(row):
    return int(row.values[-1])
x_cols = df_train.columns[:-1].to_list()
x_cols[0:5]
['att1', 'att2', 'att3', 'att4', 'att5']
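These row accessors can be sanity-checked on a toy row with a bytes-encoded label, as stored in the .arff files:

```python
import numpy as np
import pandas as pd

def get_x(row):
    return row.values[:-1].astype('float32')

def get_y(row):
    return int(row.values[-1])  # labels arrive as bytes, e.g. b'3'

row = pd.Series({'att1': 0.53, 'att2': 0.59, 'target': b'3'})
get_x(row), get_y(row)  # -> (array([0.53, 0.59], dtype=float32), 3)
```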
TSeries(get_x(df_train.iloc[0])).show();
splits = [list(range(len(df_train))), list(range(len(df_train), len(df_train)+len(df_test)))]
ds = Datasets(pd.concat([df_train, df_test]).reset_index(drop=True), 
                    tfms=[[get_x, TSeries.create], [get_y, Categorize()]],
                    splits=splits, 
                   )
ds.valid
(#8236) [(TSeries(ch=1, len=1024), TensorCategory(1)),(TSeries(ch=1, len=1024), TensorCategory(2)),(TSeries(ch=1, len=1024), TensorCategory(2)),(TSeries(ch=1, len=1024), TensorCategory(0)),(TSeries(ch=1, len=1024), TensorCategory(1)),(TSeries(ch=1, len=1024), TensorCategory(2)),(TSeries(ch=1, len=1024), TensorCategory(2)),(TSeries(ch=1, len=1024), TensorCategory(2)),(TSeries(ch=1, len=1024), TensorCategory(1)),(TSeries(ch=1, len=1024), TensorCategory(2))...]
dls = ds.dataloaders(bs=2)
dls.show_batch()
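The `splits` construction above simply indexes the concatenated frame: train rows come first, validation rows follow. A minimal check with toy sizes standing in for `len(df_train)` and `len(df_test)`:

```python
# Toy sizes standing in for len(df_train) and len(df_test).
n_train, n_test = 4, 2
splits = [list(range(n_train)), list(range(n_train, n_train + n_test))]
splits  # [[0, 1, 2, 3], [4, 5]]
```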