A while back, Edward Kmett wrote a library called reflection
, based on a 2004 paper by Oleg Kiselyov and Chung-chieh Shan that describes a neat trick for
reifying data into types (here the word “reify” can be understood as turning a
value into something that can be referenced at the type level). There was also
an article written by Austin Seipp on how to use the library, and some great
answers on reddit and stackoverflow that go into detail about how it works.
And yet, in all these years, though I’ve been on the lookout for a way to make
use of this library, I wasn’t able to fit it into my workflow – until today!
So let’s look at my real world use for reflection
, which solves a problem that
maybe others have encountered as well.
As you may know, the QuickCheck library provides a facility for generating arbitrary data sets. The property testing features of QuickCheck make use of this generation to search for test data that might violate a set of properties.
However, the generation facility can also be used on its own, separate from
the testing components, to randomly generate data for any purpose. The library
for producing this random data offers lots of combinators, and is based around
instances for a type class called Arbitrary
. Here’s a basic example:
module Main where
import Test.QuickCheck.Arbitrary
import Test.QuickCheck.Gen
data Foo = Foo [Int] [String]
deriving Show
instance Arbitrary Foo where
= do
arbitrary <- listOf chooseAny
xs <- choose (1, 100)
len <- vectorOf len (shuffle "Hello, world")
ys return $ Foo xs ys
main :: IO ()
= print =<< generate (arbitrary :: Gen Foo) main
This creates a specifically shaped set of random data, where the list of
integers may be of any length, and any value, but the list of strings will
always be from 1 to 100 elements long, and the strings will only consist of
random arrangements of the characters found in "Hello, world"
.
Now, what if you wanted to guide the generation process for Foo
using external
information? Such as picking the length of the list of strings from a value
provided by the user? Since Arbitrary
does not allow the use of Reader
, how do
we get that user-supplied value into the arbitrary
function above? And without
using global IORef=s or =unsafePerformIO
?
The reflection
library allows us to reify a runtime value into a type (whose
name we’ll never know, requiring us to reference it through a type variable),
and then communicate that type via a constraint, such that we can reflect the
value back out as needed. If this sounds a bit confusing, maybe an example can
make it clearer:
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where
import Data.Proxy
import Data.Reflection
import Test.QuickCheck.Arbitrary
import Test.QuickCheck.Gen
import System.Environment
data Foo s = Foo [Int] [String]
deriving Show
instance Reifies s Int => Arbitrary (Foo s) where
= do
arbitrary <- listOf chooseAny
xs <- choose (1, reflect (Proxy :: Proxy s))
len <- vectorOf len (shuffle "Hello, world")
ys return $ Foo xs ys
main :: IO ()
= do
main <- getArgs
[len] read len :: Int) $ \(Proxy :: Proxy s) ->
reify (print =<< generate (arbitrary :: Gen (Foo s))
There are a few additional things to note here:
A phantom type variable has been added to
Foo
. This type variable associates the reified data to our type, so it can be reflected back out in the instance for this type.The
Arbitrary
instance forFoo s
has incurred a new contraint, stating that the type represented bys
somehow reifies anInt
. How this happens is the magic of thereflection
library, and uses a clever GHC trick representing Edward’s unique twist on Oleg and Chung-chieh’s work. This instance requires theUndecidableInstances
extension.We now call
reify
with the data we want to pass along. This function takes a lambda whose first argument is aProxy s
, giving us a way to know which type variable to use in the type of the call toarbitrary
. This requires theScopedTypeVariables
extension.
That’s it: reflection
gives us a way to plumb extra data into instances at
runtime, at the cost of adding a single phantom type.
If the phantom type seems excessive for one use case, or if adding the phantom
would effect a large family of types, then an alternative is to enable the
FlexibleInstances
extension, and use Edward’s tagged
library to carry the
phantom instead:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where
import Data.Proxy
import Data.Tagged
import Data.Reflection
import Test.QuickCheck.Arbitrary
import Test.QuickCheck.Gen
import System.Environment
data Foo = Foo [Int] [String]
deriving Show
instance Reifies s Int => Arbitrary (Tagged s Foo) where
= fmap Tagged $ do
arbitrary <- listOf chooseAny
xs <- choose (1, reflect (Proxy :: Proxy s))
len <- vectorOf len (shuffle "Hello, world")
ys return $ Foo xs ys
main :: IO ()
= do
main <- getArgs
[len] read len :: Int) $ \(Proxy :: Proxy s) ->
reify (print . unTagged =<< generate (arbitrary :: Gen (Tagged s Foo))
This way we leave the original type alone – which may be the only option if
you’re generating arbitrary data for types from libraries. You’ll just have to
wrap and unwrap the Tagged
newtype wrapper as necessary.
Another benefit of using Tagged
is that, because it can be wrapped and
unwrapped as necessary, it becomes possible to change the refied information
in cases where nested types are involved. In this last example, the user is
allowed to specify the value that should be supplied to the Bar
constructor
during data generation.
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where
import Data.Proxy
import Data.Tagged
import Data.Reflection
import Test.QuickCheck.Arbitrary
import Test.QuickCheck.Gen
import System.Environment
newtype Bar = Bar Int
deriving Show
data Foo = Foo [Bar] [String]
deriving Show
instance Reifies s Int => Arbitrary (Tagged s Bar) where
= return $ Tagged $ Bar $ reflect (Proxy :: Proxy s)
arbitrary
instance Reifies s (Int, Int) => Arbitrary (Tagged s Foo) where
= fmap Tagged $ do
arbitrary let (len, bar) = reflect (Proxy :: Proxy s)
<- listOf (reify bar $ \(Proxy :: Proxy r) ->
xs <$> (arbitrary :: Gen (Tagged r Bar)))
unTagged <- choose (1, len)
l <- vectorOf l (shuffle "Hello, world")
ys return $ Foo xs ys
main :: IO ()
= do
main <- getArgs
[len, barValue] read len :: Int, read barValue :: Int) $ \(Proxy :: Proxy s) ->
reify (print . unTagged =<< generate (arbitrary :: Gen (Tagged s Foo))