Sampling Algorithms

class chocolate.Grid(connection, space, crossvalidation=None, clear_db=False)

Regular cartesian grid sampler.

Samples the search space at every point of the grid formed by all dimensions. It requires every dimension to be a discrete distribution.

Parameters:

connection – A database connection object.
space – The search space to explore with only discrete dimensions.
crossvalidation – A cross-validation object that handles experiment repetition.
clear_db – If set to `True` and a conflict arises between the provided space and the space in the database, completely clear the database and set the space to the provided one.
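To make "every point of the grid formed by all dimensions" concrete, here is a minimal standard-library sketch of a cartesian grid enumeration. It does not use chocolate and is not its implementation; the helper `grid_points` and the example space are hypothetical names chosen for illustration.

```python
from itertools import product

def grid_points(space):
    """Yield every combination of the discrete values listed in `space`."""
    names = sorted(space)  # fixed order keeps the enumeration deterministic
    for combo in product(*(space[name] for name in names)):
        yield dict(zip(names, combo))

# Hypothetical search space: each dimension lists its allowed values.
space = {"kernel": ["linear", "rbf"], "C": [0.1, 1, 10]}
points = list(grid_points(space))
print(len(points))  # 2 kernels x 3 values of C = 6 points
```

The number of points is the product of the dimension sizes, which is why a grid of this kind only makes sense when every dimension is discrete.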

next()

Retrieve the next point to evaluate based on available data in the database.

Returns:	A tuple containing a unique token and a fully qualified parameter set.

update(token, values)

Update the loss of the parameters associated with token.

Parameters:

token – A token generated by the sampling algorithm for the current parameters.
values – The loss of the current parameter set. The values can be a single `Number`, a `Sequence` or a `Mapping`. When a sequence is given, each value is stored in a column named “_loss_i”, where “i” is the index of the value. When a mapping is given, each key is prefixed with the string “_loss_”.
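The naming rules for `values` can be sketched as a small standalone function. This is an illustration of the rules stated above, not chocolate's code; `loss_columns` is a hypothetical helper, and the column name used for a single number ("_loss") is an assumption, since the text does not name it.

```python
from collections.abc import Mapping, Sequence
from numbers import Number

def loss_columns(values):
    """Map a loss value to a {column name: value} dictionary."""
    if isinstance(values, Number):
        # Assumption: a single number is stored under "_loss".
        return {"_loss": values}
    if isinstance(values, Mapping):
        # Each key is prefixed with "_loss_".
        return {"_loss_" + str(key): value for key, value in values.items()}
    if isinstance(values, Sequence) and not isinstance(values, str):
        # Each value gets the column "_loss_i", with i its index.
        return {"_loss_{}".format(i): value for i, value in enumerate(values)}
    raise TypeError("values must be a Number, a Sequence or a Mapping")

print(loss_columns([0.5, 0.1]))
print(loss_columns({"f1": 0.9}))
```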

class chocolate.Random(connection, space, crossvalidation=None, clear_db=False, random_state=None)

Random sampler.

Samples the search space randomly. This sampler draws random numbers for each entry in the database in order to restore the random state, for reproducibility when used concurrently with other random samplers.

If all parameters are discrete, the sampling is made without replacement. Otherwise, the exploration is conducted independently of the conditional search space, meaning that each subspace will receive approximately the same number of samples.
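As a rough illustration of sampling without replacement over a fully discrete space, the sketch below enumerates the combinations and draws distinct ones with the standard `random` module. It is an assumption about how such a scheme could look, not chocolate's implementation; `sample_without_replacement` and the example space are hypothetical.

```python
import random
from itertools import product

def sample_without_replacement(space, n, seed=None):
    """Draw up to `n` distinct points from a fully discrete space."""
    names = sorted(space)
    grid = [dict(zip(names, combo))
            for combo in product(*(space[name] for name in names))]
    rng = random.Random(seed)
    # sample() picks each list position at most once, so no point repeats.
    return rng.sample(grid, min(n, len(grid)))

space = {"depth": [2, 4, 8], "criterion": ["gini", "entropy"]}
drawn = sample_without_replacement(space, 4, seed=0)
for point in drawn:
    print(point)
```

Once every combination has been drawn, a sampler of this kind has nothing left to propose, which is the practical consequence of sampling without replacement.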

Parameters:

connection – A database connection object.
space – The search space to explore. The search space can be either a dictionary or a chocolate.Space instance.
crossvalidation – A cross-validation object that handles experiment repetition.
clear_db – If set to True and a conflict arises between the provided space and the space in the database, completely clear the database and set the space to the provided one.
random_state – Either a numpy.random.RandomState instance, an object to initialize the random state with, or None, in which case the global state is used.
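The class description mentions drawing random numbers for each database entry to restore the random state. The idea can be sketched as follows, using the standard `random` module in place of numpy.random.RandomState and assuming, for simplicity, one draw per entry; `next_sample` is a hypothetical helper, not part of chocolate.

```python
import random

def next_sample(seed, n_existing):
    """Return the draw that follows `n_existing` earlier draws."""
    rng = random.Random(seed)
    for _ in range(n_existing):
        rng.random()  # discard one draw per entry already stored
    return rng.random()

# A single uninterrupted generator produces this sequence...
reference = random.Random(42)
expected = [reference.random() for _ in range(5)]

# ...and fresh generators that replay the stored entries reproduce it.
resumed = [next_sample(42, n) for n in range(5)]
print(resumed == expected)
```

Under these assumptions, any process that knows the seed and the number of stored entries continues the same sequence, which is what makes concurrent runs reproducible.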

next()

Retrieve the next point to evaluate based on available data in the database.

Returns:	A tuple containing a unique token and a fully qualified parameter set.

update(token, values)

Update the loss of the parameters associated with token.

Parameters:

token – A token generated by the sampling algorithm for the current parameters.
values – The loss of the current parameter set. The values can be a single `Number`, a `Sequence` or a `Mapping`. When a sequence is given, each value is stored in a column named “_loss_i”, where “i” is the index of the value. When a mapping is given, each key is prefixed with the string “_loss_”.

class chocolate.QuasiRandom(connection, space, crossvalidation=None, clear_db=False, seed=None, permutations=None, skip=0)

Quasi-Random sampler.

Samples the search space using the generalized Halton low-discrepancy sequence. The underlying sequencer is the ghalton package, which must be installed separately. The exploration is conducted independently of the conditional search space, meaning that each subspace will receive approximately the same number of samples.

This sampler draws random numbers for each entry in the database to restore the random state, for reproducibility when used concurrently with other random samplers.

Parameters:

connection – A database connection object.
space – The search space to explore with only discrete dimensions. The search space can be either a dictionary or a chocolate.Space instance.
crossvalidation – A cross-validation object that handles experiment repetition.
clear_db – If set to True and a conflict arises between the provided space and the space in the database, completely clear the database and set the space to the provided one.
seed – An integer used as a seed to initialize the sequencer, or None, in which case the global state is used. This argument is ignored if permutations is provided.
permutations – Either the string "ea", in which case ghalton.EA_PERMS is used, or a valid list of permutations as described in the ghalton package.
skip – The number of points to skip in the sequence before the first point is sampled.
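For intuition about low-discrepancy sampling and the skip parameter, here is a pure-Python sketch of the plain Halton sequence. It is a simplification: it applies no digit permutations, so it is not the generalized Halton sequence and does not use the ghalton package. The function names, the default bases, and the choice to start at index 1 are assumptions made for this example.

```python
def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` around the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result

def halton(n_points, bases=(2, 3), skip=0):
    """Return `n_points` points in the unit cube, dropping the first `skip`."""
    start = skip + 1  # index 0 would give the origin, so start at 1
    return [tuple(radical_inverse(i, b) for b in bases)
            for i in range(start, start + n_points)]

for point in halton(4):
    print(point)
```

Each coordinate uses a different prime base, which spreads the points evenly over the unit cube; a sampler would then map each coordinate onto the corresponding dimension of the search space.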

next()

Retrieve the next point to evaluate based on available data in the database.

Returns:	A tuple containing a unique token and a fully qualified parameter set.

update(token, values)

Update the loss of the parameters associated with token.

Parameters:

token – A token generated by the sampling algorithm for the current parameters.
values – The loss of the current parameter set. The values can be a single `Number`, a `Sequence` or a `Mapping`. When a sequence is given, each value is stored in a column named “_loss_i”, where “i” is the index of the value. When a mapping is given, each key is prefixed with the string “_loss_”.
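All three samplers expose the same next()/update() pair, so an optimization loop has the same shape regardless of the sampler. The sketch below shows that shape with an in-memory stand-in: `ToySampler`, its token layout, and `objective` are hypothetical and exist only so the example runs without a database or chocolate installed.

```python
import random

class ToySampler:
    """In-memory stand-in that mimics the next()/update() protocol."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self.rows = {}  # token id -> {"params": ..., "loss": ...}

    def next(self):
        token = {"id": len(self.rows)}  # hypothetical token layout
        params = {"x": self._rng.uniform(-5, 5)}
        self.rows[token["id"]] = {"params": params, "loss": None}
        return token, params

    def update(self, token, values):
        self.rows[token["id"]]["loss"] = values

def objective(x):
    """Toy loss with its minimum at x = 1."""
    return (x - 1) ** 2

sampler = ToySampler(seed=0)
for _ in range(10):
    token, params = sampler.next()              # ask for a point
    sampler.update(token, objective(**params))  # report its loss

best = min(sampler.rows.values(), key=lambda row: row["loss"])
print(best["params"], best["loss"])
```

The token is what ties a reported loss back to the parameter set it belongs to, which is why it must be passed unchanged from next() to update().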