Sampling Algorithms
-
class chocolate.Grid(connection, space, crossvalidation=None, clear_db=False)
Regular Cartesian grid sampler.
Samples the search space at every point of the grid formed by all dimensions. Every dimension must be a discrete distribution.
Parameters: - connection – A database connection object.
- space – The search space to explore with only discrete dimensions.
- crossvalidation – A cross-validation object that handles experiment repetition.
- clear_db – If set to True and a conflict arises between the provided space and the space in the database, completely clear the database and set the space to the provided one.
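The idea of a grid formed by all dimensions can be illustrated with a minimal standard-library sketch. It is not chocolate's implementation: the helper `grid_points` and the example space are hypothetical, and the space is modelled as a plain dict mapping each dimension name to its list of discrete values.

```python
from itertools import product


def grid_points(space):
    """Yield every combination of values from a dict of discrete dimensions.

    `space` is a hypothetical stand-in for a search space: it maps each
    dimension name to a list of allowed values.
    """
    names = sorted(space)  # fixed ordering so the enumeration is deterministic
    for combo in product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


# Hypothetical two-dimensional discrete space.
space = {"kernel": ["linear", "rbf"], "degree": [1, 2, 3]}
points = list(grid_points(space))
print(len(points))  # 2 * 3 combinations
```

The number of grid points is the product of the sizes of all dimensions, which is why every dimension has to be discrete for the grid to be finite.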
-
next()
Retrieve the next point to evaluate based on the data available in the database.
Returns: A tuple containing a unique token and a fully qualified parameter set.
-
update(token, values)
Update the loss of the parameters associated with token.
Parameters: - token – A token generated by the sampling algorithm for the current parameters.
- values – The loss of the current parameter set. The values can be a single Number, a Sequence or a Mapping. When a sequence is given, the column name of each value is set to “_loss_i”, where “i” is the index of the value. When a mapping is given, each key is prefixed with the string “_loss_”.
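The column naming rule for values can be sketched as a small function. This is an illustration only, not the library's code: `loss_columns` is a hypothetical helper, and the column name `"_loss"` used for a single number is an assumption, since the text above only specifies the names for sequences and mappings.

```python
from collections.abc import Mapping, Sequence
from numbers import Number


def loss_columns(values):
    """Map a loss given as a number, sequence or mapping to named columns."""
    if isinstance(values, Number):
        # Assumption: the column name for a single number is not stated above.
        return {"_loss": values}
    if isinstance(values, Mapping):
        # Each key is prefixed with "_loss_".
        return {"_loss_{}".format(key): value for key, value in values.items()}
    if isinstance(values, Sequence) and not isinstance(values, (str, bytes)):
        # Each value gets the column name "_loss_i", with i its index.
        return {"_loss_{}".format(i): value for i, value in enumerate(values)}
    raise TypeError("values must be a Number, a Sequence or a Mapping")


print(loss_columns([0.1, 0.2]))
print(loss_columns({"f1": 0.9}))
```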
-
class chocolate.Random(connection, space, crossvalidation=None, clear_db=False, random_state=None)
Random sampler.
Samples the search space randomly. This sampler will draw random numbers for each entry in the database in order to restore the random state for reproducibility when used concurrently with other random samplers.
If all parameters are discrete, the sampling is made without replacement. Otherwise, the exploration is conducted independently of the conditional search space, meaning that each subspace will receive approximately the same number of samples.
Parameters: - connection – A database connection object.
- space – The search space to explore. The search space can be either a dictionary or a chocolate.Space instance.
- crossvalidation – A cross-validation object that handles experiment repetition.
- clear_db – If set to True and a conflict arises between the provided space and the space in the database, completely clear the database and set the space to the provided one.
- random_state – Either a numpy.random.RandomState instance, an object to initialize the random state with, or None, in which case the global state is used.
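Sampling without replacement over a fully discrete space can be sketched as follows. This is a simplified illustration under stated assumptions, not how chocolate implements it: the helper `sample_without_replacement` is hypothetical, the space is a plain dict of value lists, already drawn points are kept in an in-memory set rather than a database, and the standard-library `random.Random` stands in for a `numpy.random.RandomState`.

```python
import random
from itertools import product


def sample_without_replacement(space, seen, rng):
    """Return one unseen point from a fully discrete space, or raise if none is left."""
    names = sorted(space)  # fixed ordering keeps the candidate list deterministic
    candidates = [
        combo
        for combo in product(*(space[name] for name in names))
        if combo not in seen
    ]
    if not candidates:
        raise ValueError("every point of the discrete space has been sampled")
    combo = rng.choice(candidates)
    seen.add(combo)  # remember the point so it is never drawn again
    return dict(zip(names, combo))


# Hypothetical space with 2 * 2 = 4 points; a seeded generator makes runs repeatable.
space = {"a": [1, 2], "b": ["x", "y"]}
rng = random.Random(42)
seen = set()
draws = [sample_without_replacement(space, seen, rng) for _ in range(4)]
```

After four draws every point of this space has been returned exactly once, and a fifth call raises because no candidate is left.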
-
next()
Retrieve the next point to evaluate based on the data available in the database.
Returns: A tuple containing a unique token and a fully qualified parameter set.
-
update(token, values)
Update the loss of the parameters associated with token.
Parameters: - token – A token generated by the sampling algorithm for the current parameters.
- values – The loss of the current parameter set. The values can be a single Number, a Sequence or a Mapping. When a sequence is given, the column name of each value is set to “_loss_i”, where “i” is the index of the value. When a mapping is given, each key is prefixed with the string “_loss_”.
-
class chocolate.QuasiRandom(connection, space, crossvalidation=None, clear_db=False, seed=None, permutations=None, skip=0)
Quasi-random sampler.
Samples the search space using the generalized Halton low-discrepancy sequence. The underlying sequencer is the ghalton package, which must be installed separately. The exploration is conducted independently of the conditional search space, meaning that each subspace will receive approximately the same number of samples.
This sampler will draw random numbers for each entry in the database to restore the random state for reproducibility when used concurrently with other random samplers.
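To give an intuition for a low-discrepancy sequence, here is a minimal sketch of the plain Halton sequence in pure Python. It is deliberately not the generalized (permuted) Halton sequence provided by ghalton and does not use that package's API; the functions `radical_inverse` and `halton`, the choice of bases, and the interpretation of `skip` as a number of leading points to drop are assumptions made for illustration.

```python
def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` around the radix point."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton(n_points, bases=(2, 3), skip=0):
    """Return `n_points` points in the unit hypercube, one coordinate per base.

    Assumption: `skip` drops that many leading points of the sequence.
    Indices start at 1 so the all-zero point is never produced.
    """
    return [
        tuple(radical_inverse(i, base) for base in bases)
        for i in range(skip + 1, skip + n_points + 1)
    ]


points = halton(4)
print(points)
```

Each coordinate lies in [0, 1) and would still have to be mapped onto the corresponding dimension of the search space. Using a different prime base per dimension is what keeps the coordinates from lining up with each other.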
Parameters: - connection – A database connection object.
- space – The search space to explore. The search space can be either a dictionary or a chocolate.Space instance.
- crossvalidation – A cross-validation object that handles experiment repetition.
- clear_db – If set to True and a conflict arises between the provided space and the space in the database, completely clear the database and set the space to the provided one.
- seed – An integer used as a seed to initialize the sequencer, or None, in which case the global state is used. This argument is ignored if permutations is provided.
- permutations – Either the string "ea", in which case ghalton.EA_PERMS are used, or a valid list of permutations as described in the ghalton package.
- skip – The number of points to skip in the sequence before the first point is sampled.
-
next()
Retrieve the next point to evaluate based on the data available in the database.
Returns: A tuple containing a unique token and a fully qualified parameter set.
-
update(token, values)
Update the loss of the parameters associated with token.
Parameters: - token – A token generated by the sampling algorithm for the current parameters.
- values – The loss of the current parameter set. The values can be a single Number, a Sequence or a Mapping. When a sequence is given, the column name of each value is set to “_loss_i”, where “i” is the index of the value. When a mapping is given, each key is prefixed with the string “_loss_”.