SwirlyMyself

2010-11-08T08:59:14+00:00

A Solution to the Configuration Problem in Haskell

On the drive back home from BelHac I thought about the configuration problem in Haskell: The issue is finding a convenient way to work with values that are initialized once and used in many places all over the code.

Assume you have a large module of pure code that parses some data structure using many custom functions and combinators. Later you notice that somewhere far down in the parser you need to react differently depending on some user preference, say the user's preferred language. The usual solution is to add a new parameter to that function and, as a consequence, to each and every function that calls it or might call it, directly or indirectly. This is often very inconvenient.
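To make the inconvenience concrete, here is a minimal sketch of explicit parameter threading. The Config type, its preferredLanguage field and the three functions are hypothetical names chosen for illustration; only the innermost function uses the setting, yet every caller has to accept and forward it.

```haskell
-- Hypothetical configuration record; the field name is an assumption.
data Config = Config { preferredLanguage :: String }

-- The only function that actually inspects the configuration.
greeting :: Config -> String
greeting cfg = case preferredLanguage cfg of
  "de" -> "Hallo"
  _    -> "Hello"

-- This caller does nothing with cfg except pass it on.
renderLine :: Config -> String -> String
renderLine cfg name = greeting cfg ++ ", " ++ name

-- And so does every caller above it.
renderAll :: Config -> [String] -> [String]
renderAll cfg = map (renderLine cfg)

main :: IO ()
main = mapM_ putStrLn (renderAll (Config "de") ["Ada", "Grace"])
```

If greeting had started out without the Config argument, adding it later would have meant touching the signatures and bodies of renderLine and renderAll as well.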

Other solutions include:

  • Using mutable references and some hacking with unsafePerformIO, which always gives the programmer a bad conscience.
  • Using a Reader monad, requiring a rewrite of the whole program in monadic style.
  • Using implicit parameters, which is OK if you did not write type signatures; if you did, you still have to modify a lot of them.
  • Some advanced type hackery.
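As an illustration of the Reader option from the list above, here is a rough sketch of what the monadic rewrite looks like. To keep it dependent on base only, it defines a minimal Reader type locally instead of importing one from a library; Config, preferredLanguage and the function names are again hypothetical.

```haskell
-- A minimal Reader, defined locally so the sketch needs only base.
newtype Reader r a = Reader { runReader :: r -> a }

instance Functor (Reader r) where
  fmap f (Reader g) = Reader (f . g)

instance Applicative (Reader r) where
  pure x = Reader (const x)
  Reader f <*> Reader g = Reader (\r -> f r (g r))

instance Monad (Reader r) where
  Reader g >>= k = Reader (\r -> runReader (k (g r)) r)

-- Fetch the environment.
ask :: Reader r r
ask = Reader id

-- Hypothetical configuration record; the field name is an assumption.
data Config = Config { preferredLanguage :: String }

-- The configuration is no longer an argument, but the result type changes.
greeting :: Reader Config String
greeting = do
  cfg <- ask
  pure (if preferredLanguage cfg == "de" then "Hallo" else "Hello")

-- Formerly pure callers now have to be written in monadic style too.
renderLine :: String -> Reader Config String
renderLine name = do
  g <- greeting
  pure (g ++ ", " ++ name)

renderAll :: [String] -> Reader Config [String]
renderAll = mapM renderLine

main :: IO ()
main = mapM_ putStrLn (runReader (renderAll ["Ada", "Grace"]) (Config "en"))
```

The configuration disappears from the argument lists, but map becomes mapM and every function on the path gets a Reader result type, which is the rewrite the bullet point complains about.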

The solution I thought of and implemented uses Template Haskell, the GHC extension for generating and transforming code at compile time, to turn the style you prefer to write in (pure code that uses configuration values as if they were global constants) into the style that is semantically correct (pure code with configuration values as an additional parameter). I uploaded the resulting code as seal-module to Hackage and added plenty of comments and examples to the SealModule module (⅔ are comments according to ohcount). I refrain from copying that into this blog post, so if you are curious, please continue reading there.
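For readers who want a rough picture of the target style first, here is a hand-written sketch. It is not the output of seal-module and does not use its API; it only shows plain Haskell in which the helpers read the configuration as if it were a global constant, while the configuration is in fact one ordinary parameter of the outer function. All names are hypothetical.

```haskell
-- Hypothetical configuration record; the field name is an assumption.
data Config = Config { preferredLanguage :: String }

-- One outer function takes the configuration. The helpers live in its
-- where clause, so cfg is in scope without being passed around.
renderAll :: Config -> [String] -> [String]
renderAll cfg = map renderLine
  where
    -- No Config argument here.
    renderLine name = greeting ++ ", " ++ name

    -- Reads like a constant, but depends on the enclosing cfg.
    greeting = case preferredLanguage cfg of
      "de" -> "Hallo"
      _    -> "Hello"

main :: IO ()
main = mapM_ putStrLn (renderAll (Config "de") ["Ada", "Grace"])
```

Writing this by hand only works while everything fits under one where clause; the point of doing the transformation at compile time is to get the same effect for a whole module of top-level definitions.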

Comments

It's funny how there was exactly the same question on Planet PHP once. I don't think I can find the blog post again, but some things are difficult regardless of your programming language, even in languages as different as PHP and Haskell.
#1 Thomas Koch (Homepage) on 2010-11-08T10:34:17+00:00
I presume one of those options is equivalent to adding a global variable, though, never having used Haskell, much of that passed me by. But isn't this exactly the kind of problem object-oriented programming solves?
#2 D Hardy (Homepage) on 2010-11-08T13:18:27+00:00
This is not a matter of object orientation, but rather of mutable state: in Haskell, variables always have a value and can never be changed, only defined once, similar to references in C++. This has great benefits as it avoids a lot of possible errors, but some things are harder, of course.
#3 Joachim Breitner (Homepage) on 2010-11-08T13:48:58+00:00
Yes, the first (mutable references and unsafePerformIO) is roughly equivalent to using a global variable. It's more complicated than that, though. In most languages, the language specification provides guarantees about the order in which things occur. In Haskell, the order of evaluation is typically unspecified for pure computations, so the unsafePerformIO hackery referred to there works around that and makes sure that the variable is read only after it has been initialized.
#4 Chris Smith on 2010-11-08T17:07:18+00:00
This looks great, it's very similar to Coq's section syntax, which a lot of people tend to miss in Haskell.
Though currently the params can only be values, right? Sometimes one would like to abstract over types too.
And I think we should be able to use type family declarations to specify which types we want to abstract over.
#5 Andrea Vezzosi on 2010-11-08T10:51:46+00:00
Can you give an example of what kind of types you want to abstract over, and what you want to achieve?
#6 Joachim Breitner (Homepage) on 2010-11-08T11:21:14+00:00
A standard example would be parametrizing over the implementation of a dictionary, e.g.:

sealModule [d|
type family IMap :: * -> *
lookup :: Int -> IMap a -> Maybe a
insert :: Int -> a -> IMap a -> IMap a
lookup = sealedParam
insert = sealedParam

newtype MyMonad a = MM (State (IMap Foo) a)

foo :: IMap Foo -> MyMonad Bar
foo = ...
|]

The above would produce code like:

newtype MyMonad imap a = MM (State (imap Foo) a)
foo :: (Int -> imap a -> Maybe a) -> (Int -> a -> imap a -> imap a) -> imap Foo -> MyMonad imap Bar
foo = ...


Used like this, it might appear to overlap a bit in scope with type classes, but I still think there are many cases where this style would be nicer.
#7 Andrea Vezzosi on 2010-11-08T12:19:02+00:00
This actually works without type classes, at least if you do not insist on giving type signatures for foo. This compiles:

{-# LANGUAGE TemplateHaskell, RecordWildCards #-}
module Test where

import Language.Haskell.SealModule

sealModule [d|
lookup :: Int -> imap a -> Maybe a
lookup = sealedParam
insert :: Int -> a -> imap a -> imap a
insert = sealedParam

-- foo :: imap a -> imap a
foo map = case lookup (1::Int) map of
  Just a -> insert (2::Int) a map
  Nothing -> map
|]

and the resulting function foo has a type of "foo :: (Int -> t -> Maybe t1) -> (Int -> t1 -> t -> t) -> t -> t".

But of course this is not an ideal solution.
#8 Joachim Breitner (Homepage) on 2010-11-08T12:53:28+00:00
What are your thoughts on using this solution with more data-heavy one-time initialization cases?

For example, one idea I have been struggling with recently is to use an I18n framework in a Haskell web application, say with a framework like Snap.

Ideally, you would want to load the localization maps from gettext/yaml/json files into memory at the very start and keep them there throughout the run. As it stands, this doesn't seem to be possible in Snap as the Snap monad only has the Request, Response and Logger available during the processing cycle.

An obvious alternative would be to initialize the web server built on a reader monad that accepts an arbitrary GADT, which can be used to embed the I18n maps. But I am not sure if that is the performant thing to do...

Best,
OA
#9 OA on 2010-11-08T21:30:53+00:00
I don't think this solution will cause any problems if the parameter is large; after all, after the code transformation it becomes a regular parameter. Also, inside the functions in sealModule, the parameter is not passed around. Instead, all the functions are moved into one large where clause and the parameter is in scope.

So if you make sure any expensive calculations about the parameter are shared among the calls _into_ the sealed module (which is the case if you bind it once in main), you should be good.
#10 Joachim Breitner (Homepage) on 2010-11-09T08:28:54+00:00
(This may be slightly off-topic) Using explicit (yes, EXplicit) parameters should not be a burden. The problem appears when you need to retrofit a new parameter to an existing function call chain. It may require (a bit of|some|a bunch of) keystrokes to perform. This may be annoying for some people.

Nevertheless, it's what I've been doing in Java for years. I decided that it was better to be explicit about what's required by a function and that I would not use global variables, shared state or anything else not allowed in plain classical Haskell. I have seen too many horrible solutions for configuration: singletons, dynamic variables (fluid-let, thread locals, dynamic parameters). They eventually make the code impossible to unit test or to understand. So I took the explicit road even if it's not mainstream. I don't regret it. The code has fewer dependencies, is clearer and more robust.

Essentially, a configuration is a big read-only structure whose elements are passed to different components of the system. I fail to see why it's not possible to accomplish the same thing in Haskell using just regular function parameters. If the problem is that too many modifications must be made to the function signatures in order to introduce a new configuration element, well, I'd say what we really need here is a refactoring tool. I've been using Eclipse for my Java development and it makes this kind of task very easy to perform.

A more difficult problem is when you need to refresh some configuration parameters during program execution. In order to support this, you have to convert everything in the call chain to monadic code (IO or Reader, I guess). This is much more dramatic than simply adding some parameters to existing functions.

One last thing concerning configuration. Sadly, it is not possible (or even conceivable, as far as I understand the Haskell type system) to perform reflection in Haskell. This would greatly ease the configuration process, as we could automatically create complex dependent data structures out of external declarations. The Java example of that is the Spring Framework bean factory, which allows the creation of a graph of Java objects that are used in the application. So the Java developer never really needs to do anything to allow the system to be configurable. Correct me if I'm wrong, but in Haskell you'll always need to somehow read your configuration file and create your data by hand, the tough part being managing dependencies between those data.
#11 Jean-Philippe Gariépy on 2010-11-09T02:53:05+00:00

Have something to say? You can post a comment by sending an e-mail to me at <mail@joachim-breitner.de>, and I will include it here.