zoo.automl.feature package

Submodules

zoo.automl.feature.abstract module

class zoo.automl.feature.abstract.BaseFeatureTransformer

Bases: abc.ABC

Abstract base class for feature transformers.

check_optional_config = False

fit_transform(input_df, **config)

Fit the transformer to the input dataframe. Any scalers are refit to this data.

:param input_df: the input data to fit
:param config: the config

restore(**config)

Restore variables from a file.

:param file_path: the file containing the saved parameters, i.e. parameters that are obtained during training rather than taken from the trial config (e.g. scaler fit parameters)
:param config: the trial config

save(file_path)

Save the internal variables of the feature tools. Some of the variables are derived after fit_transform, so saving the config alone is not enough.

:param file_path: the file to save to
:param config: the trial config

transform(input_df)

Transform the data using the fitted parameters.

:param input_df: the input dataframe
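As a rough illustration of the contract above, the sketch below defines a local stand-in for the abstract base class and a minimal min-max scaling subclass. Nothing here is imported from zoo: `FeatureTransformerSketch`, `MinMaxSketch`, the JSON file format, and passing `file_path` to `restore` through `**config` are all assumptions made for the example, and plain lists of floats stand in for a dataframe.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod


class FeatureTransformerSketch(ABC):
    """Local stand-in mirroring the four documented methods (not the zoo class)."""

    @abstractmethod
    def fit_transform(self, input_df, **config): ...

    @abstractmethod
    def transform(self, input_df): ...

    @abstractmethod
    def save(self, file_path): ...

    @abstractmethod
    def restore(self, **config): ...


class MinMaxSketch(FeatureTransformerSketch):
    """Hypothetical subclass: scales a list of floats into [0, 1]."""

    def __init__(self):
        self.min_ = None
        self.max_ = None

    def fit_transform(self, input_df, **config):
        # Refit the scaler parameters on every call, as the docs describe.
        self.min_, self.max_ = min(input_df), max(input_df)
        return self.transform(input_df)

    def transform(self, input_df):
        if self.min_ is None:
            raise RuntimeError("call fit_transform or restore first")
        span = (self.max_ - self.min_) or 1.0  # avoid division by zero
        return [(v - self.min_) / span for v in input_df]

    def save(self, file_path):
        # Fitted parameters are not part of the trial config, so persist them.
        with open(file_path, "w") as f:
            json.dump({"min": self.min_, "max": self.max_}, f)

    def restore(self, **config):
        # Assumption: the file path arrives as config["file_path"].
        with open(config["file_path"]) as f:
            params = json.load(f)
        self.min_, self.max_ = params["min"], params["max"]


fitted = MinMaxSketch()
scaled = fitted.fit_transform([1.0, 2.0, 3.0])

path = os.path.join(tempfile.mkdtemp(), "scaler.json")
fitted.save(path)

# A fresh instance reuses the saved parameters without refitting.
restored = MinMaxSketch()
restored.restore(file_path=path)
new_scaled = restored.transform([2.0, 4.0])
```

The point of the save/restore pair is visible in the last lines: the restored instance scales new data with the minimum and maximum learned during fitting, which the trial config alone could not supply.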

zoo.automl.feature.identity_transformer module

class zoo.automl.feature.identity_transformer.IdentityTransformer(feature_cols=None, target_col=None)

Bases: zoo.automl.feature.abstract.BaseFeatureTransformer

Echo transformer.

fit_transform(input_df, **config)

Fit the transformer to the input dataframe. Any scalers are refit to this data.

:param input_df: the input data to fit
:param config: the config

post_processing(input_df, y_pred, is_train)

restore(**config)

Restore variables from a file.

:param file_path: the file containing the saved parameters, i.e. parameters that are obtained during training rather than taken from the trial config (e.g. scaler fit parameters)
:param config: the trial config

save(file_path, replace=False)

Save the internal variables of the feature tools. Some of the variables are derived after fit_transform, so saving the config alone is not enough.

:param file_path: the file to save to
:param config: the trial config

transform(input_df, is_train=True)

Transform the data using the fitted parameters.

:param input_df: the input dataframe

zoo.automl.feature.time_sequence module

class zoo.automl.feature.time_sequence.TimeSequenceFeatureTransformer(future_seq_len=1, dt_col='datetime', target_col='value', extra_features_col=None, drop_missing=True)

Bases: zoo.automl.feature.abstract.BaseFeatureTransformer

Time sequence feature engineering.

fit_transform(input_df, **config)

Fit the data and transform the raw data into features. This is used in training for hyperparameter searching. This method refreshes the fitted parameters (e.g. the min and max of the MinMaxScaler), if any.

:param input_df: The input time series data frame. It can be a list of data frames or a single data frame. Example:

    datetime    value  “extra feature 1”  “extra feature 2”
    2019-01-01  1.9    1                  2
    2019-01-02  2.3    0                  2

:return: tuple (x, y)

    x: 3-d array of shape (no. of samples, past sequence length, 2 + feature length). In the last dimension, the 1st column is the time index (the data type needs to be a numpy datetime type, e.g. “datetime64”) and the 2nd column is the target value (the data type should be numeric).

    y: 2-d numpy array of shape (no. of samples, future sequence length) if future sequence length > 1, or 1-d numpy array of shape (no. of samples,) if future sequence length = 1.
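The output shapes described above can be illustrated with a simple rolling-window sketch. This is not the library's implementation: `roll_sketch` and `past_seq_len` are hypothetical names, nested lists stand in for numpy arrays, and no scaling or feature generation is performed.

```python
def roll_sketch(rows, past_seq_len, future_seq_len=1, target_idx=1):
    """Slice rows into (x, y) windows.

    rows: list of [datetime, value, extra...] records, oldest first.
    x[i] holds past_seq_len consecutive rows; y[i] holds the target values
    of the future_seq_len rows that follow them.
    """
    n_samples = len(rows) - past_seq_len - future_seq_len + 1
    if n_samples <= 0:
        raise ValueError("not enough rows for the requested window sizes")
    x, y = [], []
    for i in range(n_samples):
        split = i + past_seq_len
        x.append(rows[i:split])
        y.append([r[target_idx] for r in rows[split:split + future_seq_len]])
    if future_seq_len == 1:
        # Flatten to 1-d, matching the documented (no. of samples,) case.
        y = [window[0] for window in y]
    return x, y


# The first two rows come from the example above; the rest are made up.
rows = [
    ["2019-01-01", 1.9, 1, 2],
    ["2019-01-02", 2.3, 0, 2],
    ["2019-01-03", 2.1, 1, 2],
    ["2019-01-04", 2.6, 0, 2],
    ["2019-01-05", 2.4, 1, 2],
]

x1, y1 = roll_sketch(rows, past_seq_len=2, future_seq_len=1)
x2, y2 = roll_sketch(rows, past_seq_len=2, future_seq_len=2)
```

With five rows and a past window of 2, `x1` has shape (3, 2, 4), where the last dimension is 2 plus the two extra features, and `y1` is flat with 3 entries. With `future_seq_len=2`, `y2` has shape (2, 2).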

get_feature_list(input_df)

post_processing(input_df, y_pred, is_train)

Used only in pipeline predict, after calling self.transform(input_df, is_train=False). Post-processing converts the predicted array into a data frame and applies the scaler inverse transform.

:param input_df: a list of data frames or a single data frame
:param y_pred: the model prediction result (ndarray)
:param is_train: indicates whether the output is used for evaluation or prediction
:return: In validation mode (is_train=True), the unscaled y_pred and the rolled input_y. In test mode (is_train=False), the unscaled data frame(s) in the format {datetime_col} | {target_col(s)}.
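The test-mode branch described above (inverse scaling followed by pairing each prediction with a timestamp) might look roughly like the following sketch. It assumes min-max scaling and daily data; `unscale_sketch`, `to_rows_sketch`, and the `min_`/`max_` values are hypothetical and not taken from the library.

```python
from datetime import date, timedelta


def unscale_sketch(y_scaled, min_, max_):
    """Invert min-max scaling: map values in [0, 1] back to [min_, max_]."""
    return [v * (max_ - min_) + min_ for v in y_scaled]


def to_rows_sketch(last_dt, y_pred, dt_col="datetime", target_col="value"):
    """Pair each prediction with the next daily timestamp after last_dt."""
    return [
        {dt_col: last_dt + timedelta(days=step), target_col: value}
        for step, value in enumerate(y_pred, start=1)
    ]


# Made-up scaler parameters and predictions, for illustration only.
y_pred = unscale_sketch([0.0, 0.5, 1.0], min_=2.0, max_=4.0)
rows = to_rows_sketch(date(2019, 1, 2), y_pred)
```

Each entry of `rows` plays the part of one output row in the {datetime_col} | {target_col} layout, with the column names taken from the constructor defaults `dt_col='datetime'` and `target_col='value'`.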

restore(**config)

Restore variables from a file.

save(file_path, replace=False)

Save the internal variables of the feature tools as well as the initialization arguments. Some of the variables are derived after fit_transform, so saving the config alone is not enough.

:param file_path: the file to save to

transform(input_df, is_train=True)

Transform data into features using the configuration preset by fit_transform.

:param input_df: The input time series data frame. It can be a list of data frames or a single data frame. Example:

    datetime    value  “extra feature 1”  “extra feature 2”
    2019-01-01  1.9    1                  2
    2019-01-02  2.3    0                  2

:param is_train: whether input_df is for training
:return: tuple (x, y)

    x: 3-d array of shape (no. of samples, past sequence length, 2 + feature length). In the last dimension, the 1st column is the time index (the data type needs to be a numpy datetime type, e.g. “datetime64”) and the 2nd column is the target value (the data type should be numeric).

    y: 2-d numpy array of shape (no. of samples, future sequence length) if future sequence length > 1, or 1-d numpy array of shape (no. of samples,) if future sequence length = 1.

unscale_uncertainty(y_uncertainty)

Module contents