zoo.automl.regression package
Submodules
zoo.automl.regression.time_sequence_predictor module
class zoo.automl.regression.time_sequence_predictor.TimeSequencePredictor(name='automl', logs_dir='~/zoo_automl_logs', future_seq_len=1, dt_col='datetime', target_col='value', extra_features_col=None, drop_missing=True)
Bases: object

Trains a model that predicts a future time sequence from a past sequence. The past sequence length should be > 1; the future sequence length can be > 1. For example, predict the next 2 data points from the past 5 data points. The output has only one target value (a scalar) for each data point in the sequence. The input can have more than one feature (the value plus several extra features).

Example usage:

    tsp = TimeSequencePredictor()
    tsp.fit(input_df)
    result = tsp.predict(test_df)
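The "next 2 data points from the past 5 data points" framing can be illustrated with a sliding window over a plain series. This is only a minimal sketch of the idea in NumPy; the helper name `make_windows` and its `past_seq_len` parameter are hypothetical and are not part of the zoo API, and the library's own preprocessing may differ.

```python
import numpy as np


def make_windows(values, past_seq_len=5, future_seq_len=2):
    """Split a 1-D series into aligned (past, future) window pairs.

    Hypothetical helper for illustration only; not part of zoo.automl.
    """
    values = np.asarray(values, dtype=float)
    n_samples = len(values) - past_seq_len - future_seq_len + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window sizes")
    # Each row of `past` holds past_seq_len consecutive points ...
    past = np.stack([values[i:i + past_seq_len] for i in range(n_samples)])
    # ... and the matching row of `future` holds the points that follow them.
    future = np.stack([
        values[i + past_seq_len:i + past_seq_len + future_seq_len]
        for i in range(n_samples)
    ])
    return past, future


series = np.arange(10.0)  # illustrative series: 0.0, 1.0, ..., 9.0
past, future = make_windows(series)
```

With a series of 10 points, this yields 4 samples, each pairing 5 past values with the 2 values that follow them.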

evaluate(input_df, metric=None)
Evaluate the model on a list of metrics.

:param input_df: The input time series data frame. Example:

    datetime    value  "extra feature 1"  "extra feature 2"
    2019-01-01  1.9    1                  2
    2019-01-02  2.3    0                  2

:param metric: A list of strings. Available values are "mean_squared_error" and "r_square".
:return: A list of metric evaluation results.
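For reference, the two metric names correspond to standard quantities. The sketch below computes them with NumPy under the usual textbook definitions; it is an assumption that the library follows these definitions exactly (for example, it may aggregate multi-step outputs differently), and the function names here are illustrative rather than zoo's own.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def r_square(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    if ss_tot == 0:
        # Undefined when the targets are constant.
        return float("nan")
    return float(1.0 - ss_res / ss_tot)


# Illustrative values, not taken from any real run.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
mse = mean_squared_error(y_true, y_pred)
r2 = r_square(y_true, y_pred)
```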

fit(input_df, validation_df=None, metric='mse', recipe=<zoo.automl.config.recipe.SmokeRecipe object>, mc=False, resources_per_trial={'cpu': 2}, distributed=False, hdfs_url=None)
Trains the model for time sequence prediction. If the future sequence length is > 1, a seq2seq model is used; otherwise a vanilla LSTM model is used.

:param input_df: The input time series data frame. Example:

    datetime    value  "extra feature 1"  "extra feature 2"
    2019-01-01  1.9    1                  2
    2019-01-02  2.3    0                  2

:param validation_df: Validation data.
:param metric: String. Metric used for training and validation. Available values are "mean_squared_error" or "r_square" (the default in the signature is 'mse').
:param recipe: A Recipe object. Different recipes cover different search spaces and stopping criteria. Default is SmokeRecipe().
:param resources_per_trial: Machine resources to allocate per trial, e.g. ``{"cpu": 64, "gpu": 8}``.
:param distributed: bool. Indicates whether to run in distributed mode. If true, models are uploaded to HDFS.
:param hdfs_url: The HDFS URL used to save files in distributed mode. If None, the default hdfs_url is used.
:return: self
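Since `fit` accepts both `input_df` and `validation_df`, one common way to prepare them is a chronological split, so that validation rows come strictly after training rows. The following is a minimal pandas sketch; the helper `chronological_split`, the `val_ratio` parameter, and the sample values beyond the two documented rows are all made up for illustration and are not part of the zoo API.

```python
import pandas as pd


def chronological_split(df, dt_col="datetime", val_ratio=0.25):
    """Split a time series frame into earlier (train) and later (validation) rows.

    Hypothetical helper for illustration only; not part of zoo.automl.
    """
    df = df.sort_values(dt_col).reset_index(drop=True)
    split_at = int(len(df) * (1 - val_ratio))
    train = df.iloc[:split_at]
    validation = df.iloc[split_at:].reset_index(drop=True)
    return train, validation


# Frame in the documented layout; values after the first two rows are invented.
input_df = pd.DataFrame({
    "datetime": pd.date_range("2019-01-01", periods=8, freq="D"),
    "value": [1.9, 2.3, 2.1, 2.6, 2.4, 2.8, 3.0, 2.7],
    "extra feature 1": [1, 0, 1, 0, 1, 0, 1, 0],
    "extra feature 2": [2] * 8,
})
train_df, validation_df = chronological_split(input_df)
```

Going by the signature above, the two frames would then be passed as `tsp.fit(train_df, validation_df=validation_df)`.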

predict(input_df)
Predict the future sequence from a past sequence.

:param input_df: The input time series data frame. Example:

    datetime    value  "extra feature 1"  "extra feature 2"
    2019-01-01  1.9    1                  2
    2019-01-02  2.3    0                  2

:return: A data frame in which the first column is the datetime, which is the last datetime of the past sequence. The value columns hold the predicted future sequence values. Example:

    datetime    value_0  value_1  …  value_2
    2019-01-03  2        3           9
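The shape of the returned frame can be mimicked with pandas as follows. This is a sketch under the assumption that the value columns are named `<target_col>_<i>`, which is inferred from the example above; `to_prediction_frame` is a hypothetical helper and is not how the library itself builds its output.

```python
import numpy as np
import pandas as pd


def to_prediction_frame(last_datetimes, predictions,
                        dt_col="datetime", target_col="value"):
    """Arrange predictions as one row per past sequence.

    Hypothetical helper for illustration only; not part of zoo.automl.
    """
    predictions = np.asarray(predictions)
    if predictions.ndim == 1:
        # A single-step forecast becomes one value column.
        predictions = predictions.reshape(-1, 1)
    # Assumed naming scheme: value_0, value_1, ... (one column per future step).
    columns = [f"{target_col}_{i}" for i in range(predictions.shape[1])]
    frame = pd.DataFrame(predictions, columns=columns)
    # The datetime column holds the last datetime of each past sequence.
    frame.insert(0, dt_col, pd.to_datetime(last_datetimes))
    return frame


result = to_prediction_frame(["2019-01-03"], [[2, 3, 9]])
```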