
Introduction

Welcome to Talos! Talos is a tool for hyperparameter optimization with Keras models. It allows you to use Keras models exactly as you would otherwise, and is built for and tested on Python 2 and 3.

To install stable:

pip install talos

Latest dev version:

pip install git+https://github.com/autonomio/talos.git@daily-dev

Talos incorporates grid, random, and probabilistic hyperparameter optimization strategies, with a focus on maximizing the flexibility, efficiency, and results of the random strategy. Talos users benefit from access to pseudo, quasi, true, and quantum random methods.

Talos provides a fully automated POD (Prepare, Optimize, Deploy) pipeline that consistently yields state-of-the-art prediction results across a wide range of prediction problems.

Talos is maintained by a non-profit foundation with 501(c)(3) status. The code is available on GitHub.

Getting Started

import talos as ta

p = {
  # your parameter boundaries come here
}

def input_model(x_train, y_train, x_val, y_val, params):
  # your model comes here

ta.Scan(x, y, p, input_model)

Find code-complete examples here

Getting started with your first experiment is easy. You need three things:

STEP 1 >> In a regular Python dictionary, you declare the hyperparameters and the boundaries you want to include in the experiment.

STEP 2 >> In order to prepare a Keras model for a Talos experiment, you simply replace the parameters you want to include in the scan with references to the parameter dictionary.

STEP 3 >> To start the experiment, you input the parameter dictionary and the Keras model into Talos, together with your choice of grid, random, or probabilistic optimization strategy.
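Putting the three steps together, a minimal end-to-end sketch might look like the below (the data x and y, the layer sizes, and the parameter values are illustrative assumptions, not part of the Talos API; the model function signature follows the convention used in the code examples throughout this document):

import talos as ta
from keras.models import Sequential
from keras.layers import Dense

# STEP 1: the parameter dictionary
p = {'first_neuron': [12, 24, 48],
     'batch_size': [10, 20, 30],
     'epochs': [50]}

# STEP 2: a Keras model where scanned parameters are
# replaced with references to the parameter dictionary
def input_model(x_train, y_train, x_val, y_val, params):
    model = Sequential()
    model.add(Dense(params['first_neuron'],
                    input_dim=x_train.shape[1],
                    activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    out = model.fit(x_train, y_train,
                    batch_size=params['batch_size'],
                    epochs=params['epochs'],
                    validation_data=[x_val, y_val],
                    verbose=0)
    # history object first, model second (see Troubleshooting)
    return out, model

# STEP 3: start the experiment
h = ta.Scan(x, y, params=p, model=input_model)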

Workflow

Talos follows the POD (Prepare, Optimize, Deploy) workflow, with additional functionality for evaluation and reporting, including plots for visual analysis.

(P)repare

Preparation involves defining the hyperparameter space for the experiment and setting the experiment options, such as choosing the optimization strategy.

See Parameter dictionary, Models, and Scan().

(O)ptimize

Optimization is the automated process of finding a hyperparameter combination that yields a well-generalizing model for a given prediction task.

See Optimization strategies and Reporting.

(D)eploy

Deployment is the automated process of storing locally the assets required for local or remote deployment of a predictive model for production purposes.

See Deploy() and Predict().

Reporting and Evaluation

In addition, Talos provides several useful tools for the analysis and evaluation of experiments, including live-updating plots for epoch-by-epoch visual analysis of experiment progress.

See Evaluate(), Reporting() and Monitoring.

Parameter Dictionary

The first step in an experiment is to decide the hyperparameters you want to use in the optimization process.

Example parameter dictionary:

from keras.activations import relu, elu

p = {
    'first_neuron': [12, 24, 48],
    'activation': [relu, elu],
    'batch_size': [10, 20, 30]
}

Activations need to go in as objects and not strings

In addition to standard Keras hyperparameters, Talos offers several extra conveniences, such as the ability to include the number of hidden layers in the optimization process (see Hidden Layers below).

Input Formats

Parameters may be input in three distinct ways:

# discrete values in a list
p = {'first_neuron': [12, 24, 48, 96]}

# range of values in a tuple
p = {'first_neuron': (12, 48, 10)}

# a single value
p = {'first_neuron': [12]}

When a tuple is used for a range of values, the first value is the minimum, the second value is the maximum, and the third value is the number of steps.
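As an illustration, and assuming the steps value means the number of values drawn evenly from the range (consistent with the description above), the tuple form could be spelled out by hand like this:

# p = {'first_neuron': (12, 48, 10)} is roughly equivalent to:
p = {'first_neuron': [12, 16, 20, 24, 28, 32, 36, 40, 44, 48]}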

Allowed hyperparameters

Generally speaking, any hyperparameter you can use in Keras can be included in a Talos experiment, simply by adding the hyperparameter label together with the desired values, or a range of values, to the parameter dictionary.

If you find that a given hyperparameter is not supported, create an issue on GitHub.

Hidden Layers

Each hidden layer is followed by a Dropout regularizer. If this is undesired, set dropout to 0 with dropout: [0] in the parameter dictionary.

from talos.model.layers import hidden_layers

def input_model(x_train, y_train, x_val, y_val, params):
  # model prep and input layer...
  hidden_layers(model, params, 1)
  # rest of the model...

Including hidden_layers in a model allows the number of hidden layers to be used as an optimization parameter.
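Note that whenever hidden_layers is used, the parameter dictionary must include both 'hidden_layers' and 'dropout' (see Troubleshooting). A minimal sketch, with illustrative values:

p = {'first_neuron': [12, 24],
     'hidden_layers': [0, 1, 2],
     'dropout': [0, 0.25]}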

LR Normalizer

As one experiment may include more than one optimizer, and optimizers generally have default learning rates of different orders of magnitude, lr_normalizer can be used to simultaneously include different optimizers and different degrees of learning rate in a Talos experiment.

from talos.model.normalizers import lr_normalizer
# model first part is here ...
model.compile(loss='binary_crossentropy',
              optimizer=params['optimizer'](lr_normalizer(params['lr'], params['optimizer'])),
              metrics=['accuracy'])
# model ending part is here ...
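For the snippet above to work, the params dictionary needs 'optimizer' values as objects rather than strings (required when lr_normalizer is used; see Troubleshooting), together with a set of 'lr' values. A sketch, with illustrative values:

from keras.optimizers import Adam, Nadam

p = {'optimizer': [Adam, Nadam],
     'lr': [0.5, 1, 2]}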

Models

The purpose of Talos is to allow you to continue working with Keras models exactly the way you are used to, and to allow leveraging the flexibility available in Keras without adding any restrictions. Any Keras model can be used in a Talos experiment and Talos does not introduce any new syntax to Keras models.

A single-line example of modifying the model

# In the original Keras model
model.add(Dense(64, input_dim=8))

# In Talos
model.add(Dense(params['first_neuron'], input_dim=8))

In order to use a Keras model in an experiment, you modify a working Keras model so that the hyperparameter references are replaced with parameter dictionary references.

You can find several examples of modified Keras models ready for a Talos experiment here, and a code-complete example with a parameter dictionary and experiment configuration here.

Optimization Strategies

Talos incorporates several optimization strategies:

Grid search is the default optimization strategy: all hyperparameter permutations within the given parameter boundaries will be processed. Grid search is not recommended for anything but very small permutation spaces. A better option is the random optimization strategy, invoked through grid_downsample in Scan().

Random Optimizers

Random search is the recommended optimization strategy in Talos. It is invoked through the 'grid_downsample' argument in Scan() with a floating-point value. For example, to randomly pick 10% of the permutations, a grid_downsample value of 0.1 is used.

Scan(x, y,
     params=p,
     model=input_model,
     grid_downsample=.1)

Several pseudo, quasi, true, and quantum random methods are provided for random searches. These are controlled through the 'random_method' argument in Scan().

Scan(x, y,
     params=p,
     model=input_model,
     grid_downsample=.1,
     random_method='quantum')

Each random method results in a different degree of discrepancy: uniform random methods tend to have higher discrepancy, as do the quantum and ambient-sound methods, whereas hypercube and Korobov methods have lower discrepancy.

Probabilistic reduction

The probabilistic reducers can be used together with grid search, or together with any of the random methods. The reducer pauses after a set number of rounds (the 'reduction_interval') and uses a probabilistic method to remove poorly performing parameter configurations from the remaining search space.

Several parameters control this (a sketch follows the list below):

Reduction Method: Currently only one reduction method, 'correlation', is supported.

Reduction Interval: The number of rounds between each reduction stop.

Reduction Window: The number of rounds to look back for input signals (e.g. when reduction_window is 50, the results of the last 50 rounds are used for inference)

Reduction Threshold: A floating-point value between 0 and 1, where 1 is perfect correlation and 0 is no correlation. The lower the correlation, the less significantly a given hyperparameter relates to the results.

Reduction Metric: The metric against which optimization is performed (e.g. 'val_acc' or 'fmeasure')

Reduce Loss: Must be True if 'reduction_metric' is a loss metric.
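Put together in Scan(), a probabilistic reduction setup might look like the sketch below (the values mirror the defaults listed in the Scan Arguments table later in this document):

Scan(x, y,
     params=p,
     model=input_model,
     reduction_method='correlation',
     reduction_interval=50,
     reduction_window=20,
     reduction_threshold=0.2,
     reduction_metric='val_acc',
     reduce_loss=False)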

Early Stopping

The time it takes to reach the desired result may be dramatically reduced by using early stopping. Note, though, that from the hyperparameter optimization standpoint early stopping is not easy to get right, and it is often better to do without it. Early stopping needs to be invoked through Talos.

Example for using the early_stopper callback

from talos.model.early_stopper import early_stopper

out = model.fit(x_train, y_train,
                batch_size=params['batch_size'],
                epochs=params['epochs'],
                verbose=0,
                validation_data=[x_val, y_val],
                callbacks=early_stopper(params['epochs'], mode='strict'))

early_stopper parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| epochs | NA | used for moderate mode |
| monitor | val_loss | the value to be monitored |
| mode | moderate | moderate, strict, or custom |
| min_delta | user input | rate of change at which point the flag is raised |
| patience | user input | number of epochs before termination from flag |

The mode has three options and affects the point at which the flag is raised, as well as the number of epochs before termination once the flag is raised:

moderate: If the value is not changing for a tenth of the total epochs

strict: If the value is not changing for 2 epochs

custom: Input needs to be a list or tuple with two integers, where the first integer is min_delta and the second is patience (see the sketch below).
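For example, the custom mode might be invoked like this (a sketch, assuming the [min_delta, patience] list form described above with illustrative values):

out = model.fit(x_train, y_train,
                batch_size=params['batch_size'],
                epochs=params['epochs'],
                verbose=0,
                validation_data=[x_val, y_val],
                callbacks=early_stopper(params['epochs'], mode=[1, 10]))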

Commands

Scan()

Starting a simple quantum random experiment

Scan(x, y,
     params=p,
     model=input_model,
     random_method='quantum')

The experiment is configured and started through the Scan() command. All of the options affecting the experiment, other than the hyperparameters themselves, are configured through the Scan() arguments. The most common use case invokes roughly 10 arguments.

Scan Arguments

| Parameter | Default | Description |
|-----------|---------|-------------|
| x | user input | prediction features |
| y | user input | prediction outcome variable |
| params | user input | the parameter dictionary |
| model | user input | the Keras model as a function |
| dataset_name | None | used for the experiment log |
| experiment_no | None | used for the experiment log |
| x_val | None | validation data for x |
| y_val | None | validation data for y |
| val_split | .3 | validation data split ratio |
| shuffle | True | whether the data should be shuffled |
| random_method | 'uniform_mersenne' | the random method to be used |
| search_method | 'random' | the order in which permutations are checked |
| reduction_method | None | the type of probabilistic reduction to be used |
| reduction_interval | 50 | number of permutations after which reduction is applied |
| reduction_window | 20 | the look-back window for the reduction process |
| grid_downsample | None | a float indicating the fraction for random sampling |
| reduction_threshold | 0.2 | the threshold at which reduction is applied |
| reduction_metric | 'val_acc' | the metric to be used for reduction |
| reduce_loss | False | True if reduction_metric is a loss function |
| round_limit | None | maximum number of permutations in the experiment |
| talos_log_name | 'talos.log' | name of the master log |
| debug | False | turn on debug messages |
| seed | None | seed for random states |
| clear_tf_session | True | clear the TensorFlow session after each round |
| disable_progress_bar | False | disable the live updating progress bar |
| functional_model | False | for functional model support |
| last_epoch_value | False | report the last epoch value in the log |
| print_params | False | print the hyperparameters of each permutation |
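A sketch of a typical call invoking roughly ten of the arguments above (the dataset name and values are illustrative):

Scan(x, y,
     params=p,
     model=input_model,
     dataset_name='my_dataset',
     experiment_no='1',
     val_split=.3,
     random_method='uniform_mersenne',
     grid_downsample=.1,
     round_limit=100,
     print_params=True)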

Scan Object

The scan object has several attributes that are used by Reporting(), Predict() and Deploy(), but which may also be useful to access directly. The namespace consists only of meaningful attributes.


# the examples below assume h = Scan(...) as above

# returns the results dataframe
h.data

# returns the experiment configuration details
h.details

# returns the epoch entropy dataframe
h.peak_epochs_df

# returns the saved models (json)
h.saved_models

# returns the saved model weights
h.saved_weights

# returns x data
h.x

# returns y data
h.y

Reporting()

The experiment results can be analyzed through the Reporting() command. Reporting consists of access to several meaningful signals related to the experiment, together with a dataframe of results for each permutation and the corresponding hyperparameter configurations. Reporting may be used after Scan() completes, or during an experiment (from a different shell / kernel).

Using reporting

r = Reporting('experiment_log.csv')

# returns the results dataframe
r.data

# returns the highest value for 'val_fmeasure'
r.high('val_fmeasure')

# returns the number of rounds it took to find best model
r.rounds2high()

# draws a histogram for 'val_acc'
r.plot_hist()

The evaluation of experiment results consists of the Reporting and Predict classes.

Reporting Functions

See the docstrings of each function for a more detailed description; a short usage sketch follows the list.

high The highest result for a given metric

rounds The number of rounds in the experiment

rounds2high The number of rounds it took to get highest result

low The lowest result for a given metric

correlate A dataframe with Spearman correlation against a given metric

plot_line A round-by-round line graph for a given metric

plot_hist A histogram for a given metric where each observation is a permutation

plot_corr A correlation heatmap where a single metric is compared against hyperparameters

table A sortable dataframe with a given metric and hyperparameters

best_params A dictionary of parameters from the best model
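A short sketch of a few of these in use ('val_acc' is an illustrative metric name, and the exact call signatures are assumptions; see the docstrings):

# Spearman correlation of hyperparameters against 'val_acc'
r.correlate('val_acc')

# round-by-round line graph for 'val_acc'
r.plot_line('val_acc')

# hyperparameters of the best model
r.best_params()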

Reporting Arguments

| Parameter | Default | Description |
|-----------|---------|-------------|
| filename | None | name of the file with the experiment log |

Predict()

In order to identify the best model from a given experiment, or to make predictions with one or more models, the Predict() command can be used.

Using predict

p = Predict(scan_object)

# returns the model_id of the best performing model
p.best_model(metric='val_fmeasure')

# returns predictions for input x
p.predict(x)

# performs a 10-fold cross-validation for multi-class prediction
p.evaluate(x, y, folds=10, average='macro')

Predict Functions

See the docstring of each function for more detailed information and the required input arguments.

load_model Loads the Keras model with weights so it can be used in the local environment for predictions or other purposes. Requires model_id as an argument; the model_id corresponds with the round in the experiment.

best_model Identifies the model_id for the best performing model based on a given metric (e.g. 'val_fmeasure').

predict Makes predictions based on input x and model_id. If model_id is not given, best model will be used.

predict_classes Same as predict, but predicts classes.

evaluate Evaluates models using k-fold cross-validation.

Predict Arguments

| Parameter | Default | Description |
|-----------|---------|-------------|
| scan_object | None | the object from Scan() after the experiment is completed |

Evaluate()

The models resulting from an experiment's Scan object can be evaluated with Evaluate(). This way, one or more models may be picked for deployment using k-fold cross-validation in a straightforward manner.

Evaluating model generality

from talos import Evaluate

# create the evaluate object
e = Evaluate(scan_object)

# perform the evaluation
e.evaluate(x, y, average='macro')

NOTE: It's very important to set aside part of your data for evaluation, and to keep it completely separate from the data you use for the actual experiment. A good approach is to save 50% of the data for evaluation.
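One straightforward way to set this data aside before the experiment (a sketch using scikit-learn, which is an assumption here rather than a Talos requirement):

from sklearn.model_selection import train_test_split

# keep 50% of the data completely out of the experiment
x_scan, x_eval, y_scan, y_eval = train_test_split(x, y, test_size=0.5)

# use x_scan / y_scan in Scan(), and x_eval / y_eval in Evaluate()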

Evaluate Functions

See the function docstring for a more detailed description.

evaluate Evaluates models using k-fold cross-validation against a given metric

Evaluate Arguments

| Parameter | Default | Description |
|-----------|---------|-------------|
| x | NA | the predictor data x |
| y | NA | the prediction data y (ground truth) |
| model_id | None | the model_id to be used |
| folds | None | number of folds to be used for cross-validation |
| shuffle | None | whether data is shuffled before splitting |
| average | 'binary' | 'binary', 'micro', 'macro', 'samples', or 'weighted' |
| metric | None | the metric against which the validation is performed |
| asc | None | should be True if the metric is a loss |

Deploy()

A successful experiment can be deployed easily

from talos import Deploy

Deploy(scan_object, 'experiment_name')

When you've achieved a successful result, you can use Deploy() to prepare a production-ready package that can easily be transferred to another environment or system, or sent or uploaded. The deployment package consists of the best performing model, which is picked automatically against 'val_acc' unless otherwise stated with the metric argument.

The deploy package contains the assets needed to use the model outside the experiment environment, including the best model's architecture and weights together with the experiment details.

The package can be restored into a copy of the original Scan object using the Restore() command.

Deploy Arguments

| Parameter | Default | Description |
|-----------|---------|-------------|
| scan_object | None | a Scan object |
| model_name | None | a string value as the name of the experiment |
| metric | 'val_acc' | the metric against which the best model is picked |
| asc | False | use True for loss functions |

Restore()

The Deploy package can be read back to an object

from talos import Restore

a = Restore('experiment_name.zip')

The Deploy() .zip package can be read back into a copy of the original experiment assets with Restore(); the resulting object provides access to the same assets that Deploy() stored.

Restore Arguments

| Parameter | Default | Description |
|-----------|---------|-------------|
| path_to_zip | None | full path to the Deploy asset zip file |

Monitoring

There are several options for monitoring the experiment.

# turn off progress bar
Scan(disable_progress_bar=True)

# enable live training plot
from talos import live
out = model.fit(X,
                Y,
                epochs=20,
                callbacks=[live()])

# turn on parameter printing
Scan(print_params=True)

Progress Bar: A round-by-round updating progress bar that shows the remaining rounds, together with a time estimate to completion. The progress bar is on by default.

Live Monitoring: Live monitoring provides an epoch-by-epoch updating line graph, enabled through the live() custom callback.

Round Hyperparameters: Displays the hyperparameters for each permutation. Does not work together with live monitoring.

GPU Support

Talos supports scenarios where one or more GPUs on a single system handle one or more simultaneous jobs. The base GPU support is handled by TensorFlow, so make sure that you have the GPU version of TensorFlow installed.

You can watch system GPU utilization anytime with:

watch -n0.5 nvidia-smi

Single GPU, Single Job

A single GPU works out of the box, as long as you have the GPU version of TensorFlow installed. If you already have the CPU version installed, you have to uninstall TensorFlow and install the GPU version.

Single GPU, Multiple Jobs

Parallel Scans on a single GPU system


from talos.utils.gpu_utils import parallel_gpu_jobs

# split GPU memory in two for two parallel jobs
parallel_gpu_jobs(0.5)

Run the above lines before the Scan() command

A single GPU can be split to perform several experiments simultaneously. This is useful when you want to work on more than one scope at a time, or when you're analyzing the results of an ongoing experiment with Reporting() and are ready to start the next experiment while keeping the first one running.

NOTE: GPU memory needs to be reserved proactively, i.e. once an experiment is already running with full GPU memory, part of that memory can no longer be allocated to a new experiment.

Multi-GPU, Single Job

from talos.utils.gpu_utils import multi_gpu

# split a single job to multiple GPUs
model = multi_gpu(model)

Include the above line in the input model before model.compile()

Multiple GPUs on a single machine can be assigned to a single job in a data-parallel fashion. This is useful when you have more than one GPU on a single machine and want to speed up the experiment. Roughly speaking, each additional GPU reduces compute time linearly.

Force CPU

from talos.utils.gpu_utils import force_cpu

# Force CPU use on a GPU system
force_cpu()

Run the above lines before the Scan() command

Sometimes it's useful (for example, when batch_size tends to be very small) to disable the GPU and use the CPU instead. This can be done simply by invoking force_cpu().

Functional Model

Both Sequential and Functional Keras models are supported in Talos. The workflow is otherwise the same, but you have to declare in Scan() that the model is functional, using functional_model=True.


Scan(x, y, p, input_model, functional_model=True)

Troubleshooting

If you run into trouble, or things are not working as expected, there are several things you can do:

Need help

Found an issue

Common errors

There are several common user-related cases where resulting errors are simple to overcome.

subprocess32 error

This might arise during installation on Python 2.7 systems. The error results from a dependency in matplotlib. The issue can be overcome by first installing an older version of matplotlib:

pip install matplotlib==1.5.3

wrong numpy version

This might arise on Python 2.7 systems. The issue is overcome by installing a specific version of NumPy:

pip install numpy==1.14.5

TypeError: 'str' object is not callable

This error results from listing optimizers as string values instead of the actual object names in the params dictionary. The solution is to use the object name instead.

TypeError: unsupported operand type(s) for +: 'int' and 'numpy.str_'

Same as above.

TypeError: 'numpy.str_' object cannot be interpreted as an integer

Same as above.

ValueError: Could not interpret optimizer identifier: <class 'keras.optimizers.Adam'>

This is the reverse of the above: when lr_normalizer is not used, string values for optimizers should be used in the params dictionary.

KeyError: 'first_neuron'

The 'first_neuron' hyperparameter is missing from the params dictionary, or it is named something other than 'first_neuron'.

KeyError: 'hidden_layers' or KeyError: 'dropout'

Whenever hidden_layers is applied in the model, the hidden_layers and dropout parameters need to be included in the params dictionary.

AttributeError: 'History' object has no attribute 'keys'

This happens when the input model has:

return model, out

You fix this by using the right order for the objects:

return out, model
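In context, the end of the input model should therefore look like this (the fit arguments are illustrative, following the earlier examples):

out = model.fit(x_train, y_train,
                batch_size=params['batch_size'],
                epochs=params['epochs'],
                validation_data=[x_val, y_val],
                verbose=0)

# the history object comes first, the model second
return out, model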