This is the Learning to Learn gradient-free optimization framework for experimenting with many different algorithms. The basic idea behind “Learning to Learn” is to have an “outer loop” optimizer optimizing the parameters of an “inner loop” optimizee. This particular framework is written for the case where the cycle goes as follows:

  1. The outer-loop optimizer generates an instance of a set of parameters and provides it to the inner-loop optimizee
  2. The inner-loop optimizee evaluates how well this set of parameters performs and returns a “fitness” vector for each parameter in the set of parameters
  3. The outer-loop optimizer generates a new set of parameters using the fitness vector it got back from the inner-loop optimizee

On the whole, what this means is that the outer-loop Optimizer works only with parameters and fitness values and doesn’t have access to the actual underlying model of the optimizee. And the only thing the optimizee does is to evaluate the fitness of the given parameter. This fitness can be anything – the reward achieved in a domain, the mean squared error on a dataset etc.

The Interface


An Individual refers to an instance of hyper-parameters used in the Optimizee. This means that the Optimizer tests multiple individuals of an optimizee to perform an optimization. This terminology is borrowed from evolutionary algorithms. It is equivalent to the hyper-parameters of the Optimizee.
This term, again borrowed from evolutionary algorithms, refers to a single iteration of the outer-loop. Refer to the Top-Level iteration Description in the introduction to see exactly what is done each generation
Used to denote a set of individuals that are evaluated in the same generation.

Other terms such as Trajectory, individual-dict, are defined as they are introduced below.

Iteration loop

The progress of execution in the script shown in L2L Experiments goes as follows:

  1. At the beginning, a population of individuals is created by the Optimizer by calling the Optimizee’s create_individual() method.
  2. The Optimizer then puts these individuals in its member variable eval_pop and calls its _expand_trajectory() method. This is the *key* step to starting and continuing the loop and should be done in all new Optimizer s added.
  3. One Optimizee run for each individual (parameter) is created in eval_pop by calling the Optimizee’s simulate() method. Each Optimizee run can happen in parallel across cores and even across nodes if enabled as described in Parallelization.
  4. the Optimizee’s simulate() method runs whatever simulation it has to run with the given individual (parameter) and returns a Python tuple with one or more fitness values [1].
  5. Once the runs are done, Optimizer’s post_process() method is called with the list of individuals and their fitness values as returned by the Optimizee’s simulate() method. The Optimizer can choose to do whatever it wants with the fitnesses, and use it to create a new set of individuals which it puts into its eval_pop attribute (after clearing it of the old population).
  6. The loop continues from 3.
[1]NOTE: Even if there is only one fitness value, this function should still return a tuple

Writing new algorithms

  • For a new Optimizee: Create a copy of the class Optimizee into a new python module with an appropriate name and fill in the functions. E.g. for a DMS task optimizee, you would create a module (i.e. directory with a __init__.py file) as l2l/optimizees/dms/ and copy the above class there.
  • For a new Optimizer: Create a copy of the class Optimizer into a new python module with an appropriate name and fill in the functions. (same as above)
  • For a new experiment: Create a copy of the file bin/l2l-template.py with an appropriate name and fill in the TODOs.
  • Add an entry in bin/logging.yaml for the new class/file you created. See logging.

Details for implementing the Optimizer, Optimizee and experiment follow.

Important Data Structures


The Trajectory is a container that is central to the simulation library. This concept is borrowed from PyPet. To quote from the PyPet website:

The whole project evolves around a novel container object called trajectory. A trajectory is a container for parameters and results of numerical simulations in python. In fact a trajectory instantiates a tree and the tree structure will be mapped one to one in the HDF5 file when you store data to disk. …

… a trajectory contains parameters, the basic building blocks that completely define the initial conditions of your numerical simulations. Usually, these are very basic data types, like integers, floats or maybe a bit more complex numpy arrays.

In the simulations using the L2L Framework, there is a single Trajectory object (called traj). This object forms the backbone of communication between the optimizer and the optimizee. In short, it is used to acheive the following:

  1. Storage of the parameters of the optimizer, optimizee, and individuals of the optimizee
  2. Storage of the results of our simulation
  3. Adaptive exploration of parameters via trajectory expansion.

traj object is passed as a mandatory argument to the constructors of both the Optimizee and Optimizer. Additionally, we automatically passes this object as an argument to the functions simulate() and post_process()


This is the data structure used to represent individuals. This is basically a dict that has the parameter names as keys, and parameter values as values. The following need to be noted about the parameters stored in an Individual-Dict.

  1. The parameter names must be the dot-separated full-name (e.g. 'sim_control.seed') of the parameter. This name must be the name by which it is stored in the individual parameter group of the traj. To understand this, look at constructor of Optimizee.
  2. The dictionary must contain exactly those parameters that are going to be explored by the Optimizer. This is because this dictionary is used to expand the trajectory in the post_process() function. See the note about expanding trajectories in ref:optimizer-constructor

In the documentation above, whenever the term individual is used, it is assumed that the object referred to is an Individual-Dict. Also note that the Individual-Dict is not a separate class but merely a specification for specifying individuals of an optimizee via a dict.

In other places in the documentation, the Individual-Dict may also be referred to as a parameter dict, due to the fact that its keys represent parameter names.


The optimizee subclasses Optimizee with a class that contains four mandatory methods (Documentation linked below):

  1. create_individual() : Called to return a random individual (returns an Individual-Dict)
  2. simulate() : Runs the actual simulation and returns a fitness vector

In order to maintain a consistent framework for communication between the optimizer and optimizee it is required to enforce certain requirements on the behaviour of the above functions. The details of these requirements for the Optimizee functions are given below

Constructor of Optimizee

This function may perform any one-time initialization operations that are required for the particular optimizee. In addition to this, It must perform the job of initializing parameters in the trajectory. These parameters must be created in the parameter subgroup named individual (i.e. using traj.parameter.individual.f_add_parameter()). The following is a contract that must be obeyed by this constructor.

All parameters that are explored for the optimizee must be created in the trajectory under the individual parameter group. Moreover, the names by which they are stored (excluding the individual) must be equal to the key of the Individual-Dict entry representing that parameter.

As an example, if one wanted a parameter named sim_control.seed to be a part of the trajectory, one would do the following.

traj.individual.f_add_parameter('sim_control.seed', 1010)

If one intends sim_control.seed to be a parameter over which to explore, the Individual-Dict describing an individual of the optimizee must contain a key named 'sim_control.seed'

NOTE that the parameter group named individual itself is created in the constructor of the base Optimizee class. Thus, the derived class need only implement the addition of parameters as shown above.

The create_individual() function:

This must return an individual of the optimizee, i.e. it must return an Individual-Dict representing a valid random individual of the optimizee.

The simulate() function:

This function only receives as argument the trajectory traj set to a particular run. Thus it must source all required parameters from the traj and the member variables of the Optimizee class. It must run the inner loop with these parameters and always return a tuple (even for 1-D fitness!!) representing the fitness to be used for optimizing. _ See the class documentation for more details: Optimizee


The optimizer subclasses Optimizer with a class that contains two mandatory methods:

  1. __init__(): This is the constructor which performs the duties of initializing the trajectory and the initial generation of the simulation.
  2. post_process() : knowing the fitness for the current parameters, it generates a new set of parameters and runs the next batch of simulations.

And one optional method:

  1. end() : Tertiary method to do cleanup, printing results etc.

Note that in order to maintain a consistent framework for communication between the optimizer and optimizee, we enforce a certain protocol for the above function. The details of this protocol are outlined below

Constructor of Optimizer

Perform any one-time initialization tasks including creating optimizer parameters in the trajectory. Note that optimizer parameters are created in the root parameter group of traj (i.e. traj.par.f_add_parameter(…)).

Create a list of individuals by using the optimizee_create_individual function. These are the individuals that will be simulated in the first generation (i.e. generation index 0). Assign self.eval_pop to the list of above individuals, and call self._expand_trajectory() to expand the trajectory to include the parameters corresponding to these individuals.

self._expand_trajectory() relies on the fact that the individual objects are Individual-Dicts and uses the keys to access and assign the relevant optimizee parameters in the parameter group traj.individual. This is the reason for the contract enforced on the Optimizee constructor

Note that all the (non-exploring) paramters to the Optimizer is passed in to its constructor through a namedtuple() to keep the paramters documented. For examples see GeneticAlgorithmParameters or SimulatedAnnealingParameters

The post_process() function:

This function receives, along with the trajectory traj, a list of tuples. Each tuple has the structure (run_index, run_fitness_tuple). The post_process() function also has access to the individuals whose fitness was calculated (via the member self.eval_pop), and the generation index (self.g), along with any other user defined member variable.

Using the above, the function must calculate a new population of individuals to explore (remember individuals are always in Individual-Dict form). It must then store this list of individuals in self.eval_pop and call self._expand_trajectory(). [See Constructor of Optimizer for trajectory expansion details]. Also, self.g must be incremented

In case one wishes to terminate the simulation after the current generation, one must simply not call self._expand_trajectory(). Do not call self._expand_trajectory() with an empty self.eval_pop as it will raise an error

Some points to remember are the following:

  1. The call to self._expand_trajectory not only causes the trajectory to store more parameter values to explore, but, due to the mechanism underlying f_expand(), also causes the framework to run the optimizee simulate() function on these parameters. Look at the documentation referenced in the footnote of iteration-loop for more details on this
  2. Always build the optimizer to maximize fitness. The weights that are passed in to the optimizer constructor can be made negative if one wishes to perform minimization

See the class documentation for more details: Optimizer

Running an L2L simulation

Before running a simulation for the first time, you need to specify the output directory for your results. To do so, create a new file bin/path.conf with a single entry containing an absolute path or a path relative to the top- level L2L directory, e.g. ./output_results/, and create an empty folder at the path you specified. You also need to commit any staged files to your local repo. Failing to follow these instructions raises an error when trying to run any of the test simulations.

To run a L2L simulation, copy the file bin/l2l-template.py (see L2L Experiments) to bin/l2l-optimizeeabbr-optimizerabbr.py. Then fill in all the TODOs . Especially the parts with the initialization of the appropriate Optimizers and Optimizees. The rest of the code should be left in place for logging and recording. See the source of bin/l2l-template.py for more details.


Data postprocessing



We also support running different instances of the experiments on different cores and hosts using Jube.


  1. Always use the logger object obtained from:

    logger = logging.getLogger('heirarchical.logger.name')

    to output messages to a console/file.

  2. Setting up logging in a multiprocessing environment is a mind-numbingly painful process. Therefore, to keep users sane, we have provided the module logging_tools with 2 functions which can be used to conveniently setup logging. See the module documentation for more details.

  3. As far as using loggers is concerned, the convention is one logger per file. The name of the logger should reflect module hierarchy. For example, the logger used in the file optimizers/crossentropy/optimizer.py is named 'optimizers.crossentropy'

  4. A logger is uniquely identified by its name throughout the python process (i.e. it’s kinda like a global variable). Thus if two different files use 'optimizer.crossentropy' then their logs will be redirect to the same logger.

You can modify the bin/logging.yaml file to choose the output level and to redirect messages to console or file.

See the Python logging tutorial for more details.

Additional Utilities and Protocols

While the essential interface between Optimizers and Optimizers is completely defined above, The practical implementation of Optimizers and Optimizees demands certain frequently used data structures and functions. These are detailed here

dict-to-list-to-dict Conversion

The benefit of treating individuals as Individual-Dicts is that it allows properly named parameters in the optimizee, however this comes at the cost of the optimizer being unable to generalize across different Optimizee classes with different Individual-Dicts representing the individual. One solution for this is that most Optimizers prefer to behave like they are optimizing a vector (in the case of python, a list). Thus, the Optimizer requires the ability to convert back and forth between a list and a dictionary. For this purpose, we have the following functions

  1. dict_to_list()
  2. list_to_dict()

Check their documentation for more details.