Making my life easier with two GADTs
Matt Parsons wrote and commented about Lucas Escot writing about their experience and thoughts about GADTs. The articles are called, respectively, “Making My Life Harder with GADTs” and “Making my life easier with GADTs”. In this article I don’t want to participate in the discussion about advanced and simple Haskell, but I do like GADTs and two of them are even more outrageous or better than one, depending on which “side” you are, respectively!
Event sourcing
Recently I have been extracting a “template” from my existing Haskell back-end for future projects. The original application is built using DDD principles, the Onion architecture and event sourcing. The Onion architecture, in particular, is realized using the “mtl-style”. In the new template I wanted to migrate to the interface-passing design, also known as “services”, “handlers” and so on.
As part of the template extraction, at some point it was the turn of event sourcing modules to get some refactoring/redesign. I came up with the following interface for a journal (I dislike the standard ES terms “event” and “stream”):
data Journal record = Journal {
journalVersion :: IO Version,
saveRecord :: record -> Version -> IO (Maybe Version)
-- Other journal operations here
}
The Journal record
interface is a record of functions (although Haskell does not have real records, only record syntax) which specifies operations for working with a journal, consisting of records of type record
. One layer of the program defines this interface and another layer provides its implementation, i.a. a value of type Journal record
for some record
type.
As more records are appended (saved) to the journal, its version changes. Current version of a journal can be obtained using the journalVersion
operation. Operation saveRecord
appends a new record to the journal. The reason why saveRecord
also requires a Version
is that it performs concurrent change detection. Given a new record and the expected journal version, it appends the record only if the journal version is still the expected version; only then the result is Just
carrying the new version of the journal, now corresponding to the old journal together with the newly appended record. If, on the other hand, another record was added concurrently, the expected version will not match the actual journal version, the new record will be rejected and no new version will be returned, i.e. the result becoming Nothing
. Nothing very special so far.
Using this new framework, I proceeded with defining a user activity journal. For all users, this journal stores what a user did in the application, on a very high level. For the template, only one example record was needed, so the type of records looked like this:
data UserActivityRecord =
LoggedIn (Identifier User)
The API of the user activity journal correspondingly becomes
journalVersion :: IO Version
saveRecord :: UserActivityRecord -> Version -> IO (Maybe Version)
The idea was to store a LoggedIn
record every time a user successfully authenticates in the application. However, it quickly became clear that the definition of saveRecord
was not convenient: it was not interesting or important to detect concurrent changes to the user activity journal, simply because there are no conflicts to resolve or changes to lose! Crucially, other journals certainly will be subject to these considerations, so completely dropping the concurrent change detection aspect of saveRecord
is not an option.
More specifically, saveRecord
in this form is not convenient in the HTTP handler for authentication, because I would have to
obtain the expected version of the journal from somewhere;
even before that, define what “expected” even means in this case (again, we are not dealing with some change to business data here);
pass a version to
saveRecord
and deal with the potential outcome of rejecting the new record (!);retry in case of a “concurrent change”.
All of this is clearly not applicable and nonsensical for user authentication; the journal API is too inflexible. A straightforward solution would be to just define two saveRecord
operations: one with and one without concurrent change detection. I did not like the idea for various objective and subjective reasons, so here comes a change of the API.
Make unnecessary outcomes not required (to handle)
The title of this section is a pun on the well-known “make illegal states unrepresentable”. In a way, this is what GADTs allow us to achieve for this design problem, only there are no illegal states, but unwanted outcomes instead. But first, what is this problem more exactly? The problem is that:
We cannot not ask for concurrent change detection. The
Version
parameter requires a client module to provide a version, it cannot be omitted.We must deal with the possibility of rejected record because of the
Maybe
inMaybe Version
, which has two possible values.
What we want instead is:
Usually ask for concurrent change detection and be forced to handle the possibility of rejection.
Sometimes ask for no concurrent change detection with the guaranteed outcome that the new record will not be rejected.
The tricky part is achieving this with only one saveRecord
operation.
If we look carefully at the requirements above, it is clear we need to deal with two “sides”: the parameters and the result. First, we define a GADT which differentiates asking and not asking for concurrent change detection:
data ConcurrencyCheck (detect :: Bool) where
NoCheck :: ConcurrencyCheck False
ExpectedVersion :: Version -> ConcurrencyCheck True
The NoCheck
constructor is our request to not perform change detection, therefore it does not carry a Version
. We now have a way of not having to come up with the expected journal version! But as this is a GADT, we have to talk about the type variable as well: detect
tracks whether the concurrent change detection is really requested, which is False
in case of NoCheck
.
The ExpectedVersion
constructor carries the expected version and detect
signifies a request for change detection with type True
. Notice, that ExpectedVersion v
for some v :: Version
corresponds exactly to passing v
in the current signature saveRecord :: record -> Version -> IO (Maybe Version)
. In other words, we generalized the input of saveRecord
to support both requirements above.
Now for the result type. We introduce another GADT which describes our desired outcomes:
data NewVersion (detect :: Bool) where
Stored :: Version -> NewVersion detect
ConcurrentChangeDetected :: NewVersion True
Value Stored nv
means that the new record was accepted, a new journal version nv
generated and it is now returned to the client module. On the other hand, constructor ConcurrentChangeDetected
does not carry a new journal version, because it means that a concurrent change was detected. We have the following correspondence with the original Maybe Version
result type:
Stored nv ~ Just nv
ConcurrentChangeDetected ~ Nothing
However, the main trick is that we have more information now, in the form of the detect
type variable. In the new signature of saveRecord
we can relate the request to perform concurrent change detection to the outcome of the operation like this:
data Journal record = Journal {
journalVersion :: IO Version,
saveRecord :: forall detect. record -> ConcurrencyCheck detect -> IO (NewVersion detect)
-- Other journal operations here
}
The essence here is that
By repeating type variable
detect
on the input and result sides, we relate certain inputs with certain outcomes, but some combinations are impossible.
What is possible are exactly the cases in “what we want” above. What is not possible is that we are forced to check for concurrent changes even we don’t want to. Which is exactly what we want!
It will become clear if we consider all the cases explicitly. Suppose we pass ExpectedVersion v
to saveRecord
. Effectively, its type becomes specialized to
saveRecord :: record -> ConcurrencyCheck True -> NewVersion True
True
appears on both sides because type variable detect
is the same on both sides. Now, what values can saveRecord
return in this case? This amounts to the question: which constructors of the NewVersion
GADT have type NewVersion True
? There are two such constructors:
ConcurrentChangeDetected :: NewVersion True -- Trivially, by definition
-- and
Stored nv :: NewVersion True -- Because `detect` in `Stored` unifies with `True`
The Stored
case is more tricky because it works due to being polymorphic in detect
, unlike the fixed True
in the ConcurrentChangeDetected
case. So two outcomes are possible and we must pattern match on them to deal with potential concurrent changes, which is good and is just like the original Maybe Version
!
But we are here for the second case: passing NoCheck
to saveRecord
. Repeating the exercise, its type effectively becomes
saveRecord :: record -> ConcurrencyCheck False -> NewVersion False
What values can saveRecord
return now? Which constructors of the NewVersion
GADT have type NewVersion False
? There is only one such constructor:
Stored nv :: NewVersion False -- Because `detect` in `Stored` unifies with `False`
The ConcurrentChangeDetected
case is not possible anymore, because its True
type variable instantiation does not unify with False
, no magic here. Polymorphism of Stored
does its work again, but at the same time the other, unnecessary outcome became not required to handle.
To cement what just happened, the following and only the following combinations of inputs and outputs of saveRecord
are possible:
ExpectedVersion v <---> { Stored nv, ConcurrentChangeDetected }
NoCheck <---> { Stored nv }
Convince yourself that this corresponds to our wishlist formulated above. Other combinations simply do not type check.
Final words
While this article is not exactly short, the final journal interface definition is and fits in several lines:
data Journal record = Journal {
journalVersion :: IO Version,
saveRecord :: forall detect. record -> ConcurrencyCheck detect -> IO (NewVersion detect)
-- ..
}
data ConcurrencyCheck (detect :: Bool) where
NoCheck :: ConcurrencyCheck False
ExpectedVersion :: Version -> ConcurrencyCheck True
data NewVersion (detect :: Bool) where
Stored :: Version -> NewVersion detect
ConcurrentChangeDetected :: NewVersion True
There is a lot to unpack here with several techniques coming together (I did not even discuss the strange Bool
, True
and False
…). However, I would not call GADTs themselves (very) advanced or fancy Haskell, which seems to be a popular opinion.
I also haven’t seen the trick with “matching” of cases belonging to different GADTs with a type variable described explicitly anywhere else. It’s no special though and undoubtedly known to many. Maybe this combination does move the needle towards the “advanced” Haskell side, I don’t know…