Fun Facts

Origins

For a DL course that I created, I was looking for a configuration library. Because I wanted my students' experience to generalise to future projects, I looked for commonly used configuration libraries. Back in those days, Hydra appeared to be the most popular. When I realised that OmegaConf was the actual configuration library behind Hydra, I thought it would be a good idea to design the code around omegaconf.

Generally, I liked the idea behind omegaconf as a configuration library, but in the end I stumbled over a few (minor) details. First of all, omegaconf depends on antlr to provide variable interpolations, which is a feature I do not really care for. Secondly, when digging into the code of omegaconf, I realised I did not quite like its software design.

Eventually, I decided to write something myself for my students. On the one hand, I did not want to bloat their Python installation with (unnecessary) dependencies. On the other hand, I wanted to promote good software design principles.

After some years of teaching, I decided there might be value in publishing this code. I could have just dropped the code on GitHub and been done with it. However, I thought it would be a fun exercise to create a proper package. Therefore, Upsilonconf can be considered an exercise in package design/management.

The package name is inspired by OmegaConf because this is where the story started. I decided to swap out the Greek prefix with Upsilon, which is the first letter of ὑπέρ (hupér). After all, this library is mainly intended to help with _hyper_-parameters in DL.

Design Principles

To understand why a feature has (not) been implemented, it can be useful to understand some of the underlying principles. You do not have to agree with or adhere to these principles if you do not want to. They are merely here to help you understand the library better.

Software

Some principles behind the code.

  1. Upsilonconf is supposed to be minimal. Not in the sense of character count in code, but rather from a technical standpoint. On the one hand, the goal is to have as few requirements as possible. On the other hand, this library should do only one thing, but do it well, cf. the Unix Philosophy. Note that this does not mean that new features are not welcome!

  2. The code has been written with coupling and cohesion in mind. The goal is that every component in this library should be easily adaptable. This should make it easier to extend the library and/or change how things are done. Ideally, it also makes it easier to start or stop using this package. This is also reflected in how I think about configurations (see the Configurations sub-section).

  3. The code aims to provide useful typing. Type hints in Python make it possible to have (some) type checks, and thus to find errors, before running the code. Therefore, upsilonconf aims to provide as many type hints as possible. Ideally, configuration objects would have type hints for configuration values. However, at the time of writing, this only seems to be possible through dataclasses.

  4. The code is being developed using a (relaxed) test-driven approach. Rather than just implementing a feature or a fix to some code, there should also be a unit test that fails without the feature or fix. In this sense, the tests define the expected/desired behaviour. This also means that in case of any dispute about what the code is supposed to do, we should be able to fall back on the unit tests.

  5. Finally, the Zen of Python might be a good summary of what all of the above points aim to do. Feel free to let me know if you think upsilonconf is not conforming to one of these rules.
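To illustrate the third principle, here is a minimal sketch of how dataclasses attach type hints to configuration values. It only uses the standard library; `TrainingConfig` and its fields are hypothetical names made up for this example and are not part of upsilonconf.

```python
from dataclasses import dataclass


@dataclass
class TrainingConfig:
    """Hypothetical configuration with typed values (not part of upsilonconf)."""

    learning_rate: float
    batch_size: int = 32


config = TrainingConfig(learning_rate=0.01)

# A static type checker knows that `learning_rate` is a float,
# so mistakes like `config.learning_rate.upper()` can be flagged before running.
doubled: float = config.learning_rate * 2
```

Note that the hints are only enforced by external tools such as a type checker; Python itself does not validate them at runtime.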

Configurations

Some principles on using configuration files in general.

  1. Configuration files are not the same as configuration objects. Configuration libraries are typically bound to file formats. Moreover, some libraries introduce additional syntax. E.g. OmegaConf introduces variable interpolation syntax to YAML. Although the variable interpolation syntax can be useful, it practically defines a new configuration file format. This should not be the job of a configuration library. After all, this couples the configuration library to the file format. Upsilonconf aims to provide an object that is convenient to work with. Furthermore, it is possible to read from/write to any format you like.

  2. Code should be configured in a way that does not require value duplication. If you feel like you need to repeat configuration values, I would argue that you are configuring your code in the wrong way. For me, the main goal of configuration files is to conveniently transfer information from the outside world into the code. Examples that argue in favour of variable interpolation typically ignore this aspect. Often, information is (unnecessarily) duplicated, e.g. {base: foo/bar, path: ${base}/file} (using OmegaConf syntax) could easily be replaced by {base: foo/bar, filename: file}, with the two values being combined in the code. This does not directly affect upsilonconf, but hopefully serves as food for thought.

  3. Configuration objects should not be used as arguments to functions. Some people like to use dataclasses as configuration objects. However, this typically requires functions to take these dataclasses as arguments. Technically, this works great because you can pack all arguments in one object. However, there are a few disadvantages to this approach. First of all, function signatures are no longer self-explanatory. Furthermore, the code depends on the configuration object (increased coupling). Finally, it invites increasing the number of arguments a function takes. A large number of arguments is one of those code smells that might indicate high coupling. Therefore, I personally prefer not to pass configuration objects as arguments. Instead, values should be extracted from the configuration object and passed to the function. This is exactly what the dict interface of upsilonconf objects allows you to do.
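The three principles above can be sketched together in a few lines. To avoid guessing at the upsilonconf API, this sketch uses a plain `dict` parsed with the standard `json` module as a stand-in for a configuration object; the `train` function and all keys are hypothetical and only serve as illustration.

```python
import json
from pathlib import Path


def train(learning_rate: float, epochs: int) -> str:
    """Hypothetical function with a self-explanatory signature."""
    return f"training for {epochs} epochs with lr={learning_rate}"


# Principle 1: the file format is just a source for a mapping-like object.
# JSON is used here, but any format that parses to a mapping would do.
raw = """
{
  "data": {"base": "foo/bar", "filename": "file"},
  "optim": {"learning_rate": 0.01, "epochs": 5}
}
"""
config = json.loads(raw)  # plain dict, standing in for a configuration object

# Principle 2: combine values in code instead of interpolating them in the file.
path = Path(config["data"]["base"]) / config["data"]["filename"]

# Principle 3: extract the values and pass them as regular keyword arguments,
# so that `train` does not depend on the configuration object itself.
message = train(**config["optim"])
```

Because `train` only sees ordinary keyword arguments, it can be called and tested without any configuration library being involved.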