Basic time series opening/processing functions.
a = np.random.random(10)
test_eq((1,10), maybe_unsqueeze(a).shape)
test_eq((1,10), maybe_unsqueeze(maybe_unsqueeze(a)).shape) # already 2-D, does nothing
t = torch.rand(10)
test_eq((1,10), maybe_unsqueeze(t).shape)
test_eq((1,10), maybe_unsqueeze(maybe_unsqueeze(t)).shape) # already 2-D, does nothing
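The behaviour these tests pin down can be sketched in plain NumPy. The function name below is hypothetical; the library's maybe_unsqueeze may be implemented differently, but the contract is the same: add a leading channel axis only when the input is 1-D, so a second call is a no-op.

```python
import numpy as np

def maybe_unsqueeze_sketch(x):
    # Hypothetical re-implementation for illustration: add a leading
    # channel axis only when the input is 1-D, so a second call finds
    # a 2-D input and leaves it alone (idempotent).
    return x[None, :] if x.ndim == 1 else x

a = np.random.random(10)
assert maybe_unsqueeze_sketch(a).shape == (1, 10)
assert maybe_unsqueeze_sketch(maybe_unsqueeze_sketch(a)).shape == (1, 10)
```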
A time series is just a 1-dimensional array. A simple one-channel array is np.arange(10).
show_array(np.arange(10));
We can add some noise for fun:
a = np.arange(10)+np.random.randn(10)
a.shape
As we want to make explicit that this is a one-channel time series, we will unsqueeze the first dimension.
ts = TSeries.create(a)
ts.data
ts.shape
ts.show();
ts.data.mean()
@patch
def encodes(self:Normalize, x:TSeries): return (x - self.mean) / self.std
@patch
def decodes(self:Normalize, x:TSeries):
    f = to_cpu if x.device.type=='cpu' else noop
    return x*f(self.std) + f(self.mean)
nrm = Normalize(ts.data.mean(), ts.data.std())
((ts-nrm.mean)/ts.data.std()).show()
nrm.encodes(ts).show();
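The encode/decode pair patched above is a standard standardize/de-standardize round trip. A plain-NumPy sketch (no fastai objects) of what the two patches compute:

```python
import numpy as np

# Standardization round trip, mirroring the patched Normalize:
# encodes subtracts the mean and divides by the std; decodes inverts it.
x = np.arange(10, dtype='float32') + 0.5
mean, std = x.mean(), x.std()
enc = (x - mean) / std          # what encodes does
dec = enc * std + mean          # what decodes does
assert np.allclose(dec, x)      # decode inverts encode
assert abs(enc.mean()) < 1e-5   # encoded series is centered at 0
```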
URLs.UCR
ucr_path = untar_data(URLs.UCR)
Each subtask lives in a subfolder named after the task. For instance, we will select 'Adiac'.
ucr_path.ls()
adiac_path = ucr_path/'Adiac'
adiac_path.ls()
We can find .csv and .arff files in this copy of the dataset. We will read the .arff files using a helper function and store them in pandas DataFrames.
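To give an idea of what such a helper does, here is a minimal ARFF-to-DataFrame reader. This is a hypothetical sketch, not the notebook's load_df_ucr: it assumes the file is a simple numeric ARFF where the rows after the @data marker are plain CSV.

```python
import io
import pandas as pd

def load_arff_df(path_or_buf):
    # Hypothetical minimal ARFF reader: skip the header section and
    # parse everything after the @data marker as CSV.
    if hasattr(path_or_buf, 'read'):
        text = path_or_buf.read()
    else:
        with open(path_or_buf) as f:
            text = f.read()
    lines = text.splitlines()
    # data rows start right after the @data marker (case-insensitive)
    start = next(i for i, l in enumerate(lines) if l.strip().lower() == '@data') + 1
    return pd.read_csv(io.StringIO('\n'.join(lines[start:])), header=None)

# Tiny made-up ARFF file: two attributes plus a class label
sample = """@relation toy
@attribute att0 numeric
@attribute att1 numeric
@attribute target {1,2}
@data
0.1,0.2,1
0.3,0.4,2"""
df = load_arff_df(io.StringIO(sample))
```

Real-world ARFF files can contain sparse data, quoted nominals, and comments, which this sketch ignores.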
df_train, df_test = load_df_ucr(ucr_path, 'StarLightCurves')
df_train.head()
x_train, y_train, x_test, y_test = load_np_ucr(ucr_path, 'StarLightCurves')
test_eq(len(x_train), len(y_train))
test_eq(len(x_test), len(y_test))
Loading from DataFrames
def get_x(row): return row.values[:-1].astype('float32')
def get_y(row): return int(row.values[-1])
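To see what these extractors return, here is the same slicing applied to a made-up row in the UCR layout (feature values first, the class label in the last column):

```python
import pandas as pd

# One made-up row: three feature values, then the class label
row = pd.Series([0.1, 0.2, 0.3, 2.0])
x = row.values[:-1].astype('float32')   # what get_x(row) returns
y = int(row.values[-1])                 # what get_y(row) returns
assert x.shape == (3,)
assert y == 2
```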
x_cols = df_train.columns[:-1].to_list()
x_cols[0:5]
TSeries(get_x(df_train.iloc[0])).show();
splits = [list(range(len(df_train))), list(range(len(df_train), len(df_train)+len(df_test)))]
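Since the train and test frames are concatenated in order, splits is just two contiguous index ranges: the first len(df_train) rows go to training, the rest to validation. With hypothetical sizes of 3 training and 2 test rows:

```python
# Hypothetical sizes standing in for len(df_train) and len(df_test)
n_train, n_test = 3, 2
splits = [list(range(n_train)), list(range(n_train, n_train + n_test))]
assert splits == [[0, 1, 2], [3, 4]]
```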
ds = Datasets(pd.concat([df_train, df_test]).reset_index(drop=True),
tfms=[[get_x, TSeries.create], [get_y, Categorize()]],
splits=splits,
)
ds.valid
dls = ds.dataloaders(bs=2)
dls.show_batch()