Sampling Algorithms
class chocolate.Grid(connection, space, crossvalidation=None, clear_db=False)
Regular Cartesian grid sampler.
Samples the search space at every point of the grid formed by all dimensions. It requires every dimension to be a discrete distribution.
Parameters:
- connection – A database connection object.
- space – The search space to explore with only discrete dimensions.
- crossvalidation – A cross-validation object that handles experiment repetition.
- clear_db – If set to True and a conflict arises between the provided space and the space in the database, completely clear the database and set the space to the provided one.
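The grid enumeration described above can be sketched with the standard library alone. This is an illustration of a Cartesian grid over discrete dimensions, not the library's implementation; the function name `grid_points` and the plain-dictionary space are hypothetical.

```python
from itertools import product


def grid_points(space):
    """Yield one parameter dict per point of the Cartesian grid."""
    names = sorted(space)  # fixed order keeps the enumeration deterministic
    for combo in product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


# Hypothetical discrete space: each dimension is a finite list of values.
space = {"kernel": ["linear", "rbf"], "C": [0.1, 1.0, 10.0]}
points = list(grid_points(space))
print(len(points))  # one point per (C, kernel) combination
```

The number of points is the product of the dimension sizes, which is why every dimension has to be discrete for the grid to be finite.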
next()
Retrieve the next point to evaluate based on available data in the database.
Returns: A tuple containing a unique token and a fully qualified parameter set.
update(token, values)
Update the loss of the parameters associated with token.
Parameters:
- token – A token generated by the sampling algorithm for the current parameters.
- values – The loss of the current parameter set. The values can be a single Number, a Sequence or a Mapping. When a sequence is given, the column name is set to “_loss_i” where “i” is the index of the value. When a mapping is given, each key is prefixed with the string “_loss_”.
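The naming rule for loss columns can be illustrated with a small helper. This is a sketch under stated assumptions: `loss_columns` is a hypothetical name, and the `_loss` column used for a single number is an assumption, since the text above only specifies the sequence and mapping cases.

```python
from collections.abc import Mapping, Sequence
from numbers import Number


def loss_columns(values):
    """Map a loss given as Number, Sequence or Mapping to column names."""
    if isinstance(values, Mapping):
        # Each key is prefixed with "_loss_".
        return {"_loss_{}".format(key): value for key, value in values.items()}
    if isinstance(values, Number):
        # Assumption: a single number goes to a column named "_loss".
        return {"_loss": values}
    if isinstance(values, Sequence) and not isinstance(values, str):
        # Each value is stored under "_loss_i", with i its index.
        return {"_loss_{}".format(i): value for i, value in enumerate(values)}
    raise TypeError("values must be a Number, a Sequence or a Mapping")


print(loss_columns([0.1, 0.2]))
print(loss_columns({"f1": 0.9, "time": 12.5}))
```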
class chocolate.Random(connection, space, crossvalidation=None, clear_db=False, random_state=None)
Random sampler.
Samples the search space randomly. This sampler will draw random numbers for each entry in the database in order to restore the random state for reproducibility when used concurrently with other random samplers.
If all parameters are discrete, the sampling is made without replacement. Otherwise, the exploration is conducted independently of the conditional search space, meaning that each subspace will receive approximately the same number of samples.
Parameters:
- connection – A database connection object.
- space – The search space to explore. The search space can be either a dictionary or a chocolate.Space instance.
- crossvalidation – A cross-validation object that handles experiment repetition.
- clear_db – If set to True and a conflict arises between the provided space and the space in the database, completely clear the database and set the space to the provided one.
- random_state – Either a numpy.random.RandomState instance, an object to initialize the random state with, or None, in which case the global state is used.
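The idea of restoring the random state from the database can be sketched as follows. This is an illustration only: it uses the standard `random` module rather than `numpy.random.RandomState`, assumes exactly one draw per stored entry, and `next_draw` is a hypothetical helper, not part of the library.

```python
import random


def next_draw(seed, n_existing):
    """Return the draw that follows n_existing earlier entries."""
    rng = random.Random(seed)
    for _ in range(n_existing):
        rng.random()  # replay and discard the draw behind each stored entry
    return rng.random()


# A single long-lived generator produces the reference stream.
reference = random.Random(42)
expected = [reference.random() for _ in range(5)]

# Five independent "processes", each seeing one more entry in the database.
resumed = [next_draw(42, n) for n in range(5)]
print(resumed == expected)
```

Because every process replays the same prefix of the stream, samplers sharing a seed and a database continue one sequence instead of repeating its first value.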
next()
Retrieve the next point to evaluate based on available data in the database.
Returns: A tuple containing a unique token and a fully qualified parameter set.
update(token, values)
Update the loss of the parameters associated with token.
Parameters:
- token – A token generated by the sampling algorithm for the current parameters.
- values – The loss of the current parameter set. The values can be a single Number, a Sequence or a Mapping. When a sequence is given, the column name is set to “_loss_i” where “i” is the index of the value. When a mapping is given, each key is prefixed with the string “_loss_”.
class chocolate.QuasiRandom(connection, space, crossvalidation=None, clear_db=False, seed=None, permutations=None, skip=0)
Quasi-Random sampler.
Samples the search space using the generalized Halton low-discrepancy sequence. The underlying sequencer is the ghalton package, which must be installed separately. The exploration is conducted independently of the conditional search space, meaning that each subspace will receive approximately the same number of samples.
This sampler will draw random numbers for each entry in the database to restore the random state for reproducibility when used concurrently with other random samplers.
Parameters:
- connection – A database connection object.
- space – The search space to explore. The search space can be either a dictionary or a chocolate.Space instance.
- crossvalidation – A cross-validation object that handles experiment repetition.
- clear_db – If set to True and a conflict arises between the provided space and the space in the database, completely clear the database and set the space to the provided one.
- seed – An integer used as seed to initialize the sequencer with, or None, in which case the global state is used. This argument is ignored if permutations is provided.
- permutations – Either the string "ea", in which case the ghalton.EA_PERMS are used, or a valid list of permutations as described in the ghalton package.
- skip – The number of points to skip in the sequence before the first point is sampled.
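To give a feel for low-discrepancy sampling and the skip parameter, here is a minimal sketch of the plain Halton sequence. It is not the generalized (permuted) variant provided by ghalton, and it does not call that package; the function names, the default bases and the choice to start at index 1 are assumptions made for the illustration.

```python
def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` around the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points, bases=(2, 3), skip=0):
    """Return n_points of the plain Halton sequence in the unit hypercube."""
    # Index 0 maps to the origin, so this sketch starts counting at 1.
    start = 1 + skip
    return [
        tuple(radical_inverse(i, base) for base in bases)
        for i in range(start, start + n_points)
    ]


for point in halton(4):
    print(point)
```

Each coordinate lies in [0, 1) and would still have to be mapped onto the corresponding dimension of the search space. Skipping k points is the same as dropping the first k points of the unskipped sequence.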
next()
Retrieve the next point to evaluate based on available data in the database.
Returns: A tuple containing a unique token and a fully qualified parameter set.
update(token, values)
Update the loss of the parameters associated with token.
Parameters:
- token – A token generated by the sampling algorithm for the current parameters.
- values – The loss of the current parameter set. The values can be a single Number, a Sequence or a Mapping. When a sequence is given, the column name is set to “_loss_i” where “i” is the index of the value. When a mapping is given, each key is prefixed with the string “_loss_”.
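All three samplers expose the same next() and update() pair, so an optimization loop has the same shape whichever one is used. The sketch below mimics that protocol with a hypothetical in-memory stand-in; `ToySampler`, its token format, the `_loss` column and the `objective` function are all assumptions for illustration and are not part of chocolate.

```python
import random


class ToySampler:
    """Hypothetical in-memory stand-in for the next()/update() protocol."""

    def __init__(self, low, high, seed=None):
        self.low, self.high = low, high
        self.rng = random.Random(seed)
        self.rows = []  # plays the role of the database

    def next(self):
        token = {"id": len(self.rows)}  # hypothetical token format
        params = {"x": self.rng.uniform(self.low, self.high)}
        self.rows.append({**token, **params})
        return token, params

    def update(self, token, values):
        # Attach the loss to the row that the token identifies.
        self.rows[token["id"]]["_loss"] = values


def objective(x):
    """Toy loss with its minimum at x = 1."""
    return (x - 1.0) ** 2


sampler = ToySampler(-5.0, 5.0, seed=0)
for _ in range(20):
    token, params = sampler.next()
    sampler.update(token, objective(**params))

best = min(sampler.rows, key=lambda row: row["_loss"])
print(best)
```

The token is what ties a reported loss back to the parameters it was computed for, which matters when several workers request points before any of them reports a result.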