A Typed Socket API

After wanting to for some time, I decided to finally get around to putting together a typed layer over the Haskell network library. This post goes into the techniques I used to make that happen, how it works and looks, and what the current short-comings are.

Singletons and phantom types are involved! For the code written in this post, the following language extensions are assumed to be enabled:

{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PartialTypeSignatures #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}

Contents

Network Sockets

Working with sockets was often a bit of an adventure. There’s a particular order to things, which, while well-documented, wasn’t always clear. For example, to create a client socket that communicate using TCP, you’d have to:

Haskell code to achieve this looks like this:

> import Network.Socket
> import qualified Network.Socket.ByteString as NSB

> serverAddress = SockAddr 2291 (tupleToHostAddress (127,0,0,1))
> socket <- socket AF_INET Stream defaultProtocol
> connect socket serverAddress
> send socket "fish"
> bs <- recv socket 32
> close socket

To create a server socket is a bit more involved. The steps are the same up to the point where we allocate a network address structure. After that, the steps are:

The code looks like this, without a loop:

> import Network.Socket
> import qualified Network.Socket.ByteString as NSB

> serverAddress = SockAddr 2291 (tupleToHostAddress (127,0,0,1))
> socket <- socket AF_INET Stream defaultProtocol
> bind socket serverAddress
> listen socket 1
> (client, _) <- accept socket
> bs <- recv client 32
> send client bs
> close client
> close socket

While seemingly relatively straight-forward (but perhaps only if one has read network socket documentation a lot), there’s some room for error. For example, one might:

And if we consider the broader socket API, there’s also room for error around sockets that have shutdown their ability to send or receive. There’s also some very particular rules around sending or receiving with UDP sockets when using the recvFrom/sendTo interfaces.

Lots of details. All documented in a few places. But!! We can do a little better with types. That’s my motivation for this effort, and the rest of this post goes into the design and implementation of such an API.

Capturing State (Machine) Assumptions as Types

Working with the socket API, there’s at least 4 bits of intrinsic state to them that determine how they work:

Here’s a snippet of the current socket APIs type signatures:

socket :: Family -> SocketType -> ProtocolNumber -> IO Socket
connect :: Socket -> SockAddr -> IO ()
bind :: Socket -> SockAddr -> IO ()
listen :: Socket -> Int -> IO ()
accept :: Socket -> IO (Socket, SockAddr)
shutdown :: Socket -> ShutdownCmd -> IO ()
close :: Socket -> IO ()
send :: Socket -> ByteString -> IO Int
recv :: Socket -> Int -> IO ByteString
sendTo :: Socket -> ByteString -> SockAddr -> IO Int
recvFrom :: Socket -> Int -> IO (ByteString, SockAddr)

What I’d like to point out is that once we call socket, we have a Socket, but our types tell us nothing about that Socket. Further, when we call connect or bind or listen or close, we lose information. Changes are happening to that socket, but all we get back is an empty void, IO (). We know an effect has happened! But - we forget the nature of that effect.

IO (), for when you want to throw away information.

Fortunately, Haskell is one of those languages where we can track these sorts of changes in the types themselves. So, what would a type-enhanced socket type look like?

Here’s one possible design:

import qualified Network.Socket as NS

newtype
  SSocket (f :: SocketFamily) (p :: SocketProtocol)
    (s :: SocketStatus) (sh :: ShutdownStatus)
  = SSocket NS.Socket

It’s nothing more than a newtype wrapper around the socket library, with several phantom types. They’re so-called phantom types, because they don’t affect the in-memory or run-time representation of our data structures. They spookily vanish once the program begins to run.

Now let’s look at the definitions for the each of these phantom types:

data SocketFamily
  = Unix
  | InetV4
  | InetV6

data SocketProtocol
  = Tcp  -- Stream
  | Udp  -- Datagram

data SocketStatus
  = Unconnected
  | Bound
  | Listening
  | Connected
  | Closed

data ShutdownStatus
  = Available
  | CannotReceive
  | CannotSend
  | CannotSendOrReceive

Nothing too surprising yet. This is still pretty cozy Haskell. Now, how do we get from that data type definition to a typed socket API? There’s the challenge.

Let’s consider the socket function. It’s original form is:

socket :: Family -> SocketType -> ProtocolNumber -> IO Socket

If we just map the original socket function over to the new socket type, it’d look something like:

socket ::
  Family -> SocketType -> ProtocolNumber
  -> IO (SSocket f p s sh)

However, we know a bit more than that about the state of a newly created socket. This means we can refine our type signature a step further:

socket ::
  Family -> SocketType -> ProtocolNumber
  -> IO (SSocket f p 'Unconnected 'Available)

Initializing a socket always gives one with a socket status of Unconnected and a shutdown status of Available. And so we initialize our socket state machine! We’re using DataKinds here to promote the data definitions to the type-level.

So what would the implementation look like? Here it is:

socket ::
  Family -> SocketType -> ProtocolNumber
  -> IO (SSocket f p 'Unconnected 'Available)
socket fam sockType protocolNum =
  return (SSocket (socket fam sockType protocolNum))

That’s it!

However, there’s two shortcomings here:

We should be able to do such a thing. And to do that, we turn to a technique known as singletons! In the next section, we’ll work through the machinery needed to connect types and values.

Singletons Mean 1-to-1

Before we get to the details of the technique, we’ll start from the end. The final form of the socket function type signature looks like this:

socket :: forall f p.
  (SockFam f, SockProto p) => IO (SSocket f p 'Unconnected 'Available)

Since the last version presented, we’ve:

With singleton types, it was possible to do all this and still have a reasonable implementation. After all, if the user specifies what types they want for the family and protocol, shouldn’t we be able to produce the values that match that?

That’s dependently-typed programming in Haskell.

We’ll need four specific tools to connect types and values:

Before we delve into the details, here’s the full implementation of the socket function in its final form:

socket :: forall f p.
  (SockFam f, SockProto p) => IO (SSocket f p 'Unconnected 'Available)
socket =
  let x = socketFamily (socketFamily1 :: SocketFamily1 f)
      y = socketProtocol (sockProto1 :: SocketProtocol1 p)
  in SSocket <$> (NS.socket x y NS.defaultProtocol)

I’ve at least six things to explain here:

Let’s start with all the socket family biz. Keep in mind, we want to go from a type SocketFamily given as f to a value NS.SocketFamily that the lower-level socket function can consume.

First, a GADT mirror for the SocketFamily data type:

data SocketFamily1 (f :: SocketFamily) where
  SUnix :: SocketFamily1 'Unix
  SInetV4 :: SocketFamily1 'InetV4
  SInetV6 :: SocketFamily1 'InetV6

This gives us values that connect the type-level SocketFamily to the value-level. That’s step one.

The next step, a type class to give us a unique value from a SocketFamily type:

class SockFam (f :: SocketFamily) where
  socketFamily1 :: SocketFamily1 f

instance SockFam 'Unix
  where socketFamily1 = SUnix

instance SockFam 'InetV4
  where socketFamily1 = SInetV4

instance SockFam 'InetV6
  where socketFamily1 = SInetV6

Between the GADT and the type-class, we’ve come most of the way. The final step, is a forgetful function giving us our NS.SocketFamily, so-called because it forgets its phantoms:

socketFamily :: SocketFamily1 f -> NS.Family
socketFamily = \case
  SUnix -> NS.AF_UNIX
  SInetV4 -> NS.AF_INET
  SInetV6 -> NS.AF_INET6

And that’s a singleton! The steps, in summary:

  1. define a data type to represent your valid states, to be used as a kind
  2. define a GADT to mirror the type at the value-level, indexed by the kind
  3. define a type-class to give us one of those GADTs from a the type
  4. map from the GADT to whatever you’d like to map to

This is the simplest version of the singleton technique. It involves a lot of repetitive code, to be sure. It’s also a topic of active research, with nifty ways of mirroring even more complex structures between the type- and value-levels.

For more details and a couple of papers, check out the singletons library, which uses template-haskell to automate a lot of this!

Now, we look at the socket protocol side:

data SocketProtocol1 (p :: SocketProtocol) where
  STcp :: SocketProtocol1 'Tcp
  SUdp :: SocketProtocol1 'Udp

class SockProto (p :: SocketProtocol) where
  sockProto1 :: SocketProtocol1 p

instance SockProto 'Tcp
  where sockProto1 = STcp

instance SockProto 'Udp
  where sockProto1 = SUdp

socketProtocol :: SocketProtocol1 p -> NS.SocketType
socketProtocol = \case
  STcp -> NS.Stream
  SUdp -> NS.Datagram

It’s a nearly identical dance.

With those two pieces, we’ve fully implemented socket:

socket :: forall f p.
  (SockFam f, SockProto p) => IO (SSocket f p 'Unconnected 'Available)
socket =
  let x = socketFamily (socketFamily1 :: SocketFamily1 f)
      y = socketProtocol (sockProto1 :: SocketProtocol1 p)
  in SSocket <$> (NS.socket x y NS.defaultProtocol)

That was An Effort.

In the next section, we’ll look at the rest of the typed socket API.

Coloring in the State Machine

So far, we have a nifty socket data type and a means to initialize our socket. However, we still want to like, be able to do a lot more with those sockets. After all, the goal is to have a usable sockets library!

Here’s an overview of the remainder of the API that affects socket state. We’ll use this to detail the last few techniques:

connect ::
  SockAddr f -> SSocket f 'Tcp 'Unconnected sh
  -> IO (SSocket f 'Tcp 'Connected 'Available)

bind :: SockAddr f -> SSocket f p 'Unconnected sh -> IO (SSocket f p 'Bound sh)

listen :: Int -> SSocket f 'Tcp 'Bound sh -> IO (SSocket f 'Tcp 'Listening sh)

accept :: SSocket f 'Tcp 'Listening sh -> IO (SSocket f 'Tcp 'Connected sh, NS.SockAddr)

close :: Closeable s ~ 'True =>
  SSocket f p s sh -> IO (SSocket f p 'Closed sh)

shutdownReceive ::
  CanShutdownReceive sh s ~ 'True =>
  SSocket f p s sh
  -> IO (SSocket f p s (Shutdown sh 'NS.ShutdownReceive))

shutdownSend ::
  CanShutdownSend sh s ~ 'True =>
  SSocket f p s sh
  -> IO (SSocket f p s (Shutdown sh 'NS.ShutdownSend))

shutdownBoth ::
  CanShutdownBoth sh s ~ 'True =>
  SSocket f p s sh
  -> IO (SSocket f p s (Shutdown sh 'NS.ShutdownBoth))

The simplest operations here are connect, listen, bind, and accept. The implementations of these are strictly pass-throughs to the lower-level API, and all we do is update the types to reflect the changes. There’s still a bit of power here. For example, looking at bind, we require that a socket be in an Unconnected state before this’ll compile. This is where the safety of the API comes from - we place strong requirements on the data we introduce before we allow a change to be made.

Let’s look at close:

close :: Closeable s ~ 'True =>
  SSocket f p s sh -> IO (SSocket f p 'Closed sh)

There’s two new things at work here:

  1. Closeable, a type family (a type function)
  2. an adorable squiggle that means - these two must be equal

Closeable itself is comfortable to write:

type family Closeable (s :: SocketStatus) :: Bool where
  Closeable 'Unconnected = 'True
  Closeable 'Bound = 'True
  Closeable 'Listening = 'True
  Closeable 'Connected = 'True
  Closeable 'Closed = 'False

For every status a socket could be in, we indicate whether it makes sense to close it. Technically, there’s no harm in closing a socket that’s already been closed. But, we can prevent even that bit of waste by marking that as an invalid state.

If we try to close a socket that’s already been closed, we’ll get this sort of error:

> tcp4Socket >>= close >>= close

<interactive>:2:26: error:
Couldn't match type'False’ with ‘'True
        arising from a use of ‘close’
In the second argument of ‘(>>=)’, namely ‘close’
      In the expression: tcp4Socket >>= close >>= close
      In an equation for ‘it’: it = tcp4Socket >>= close >>= close

It is not at all a particularly helpful type error. As of GHC 8.0, there’s a means to make type errors more helpful, but that technique is not covered in this post. For now, it’ll suffice to say - the error is prevented and highlighted.

To close up our discussion on close, here’s the full implementation (which mirrors bind and friends):

close :: Closeable s ~ 'True =>
  SSocket f p s sh -> IO (SSocket f p 'Closed sh)
close (SSocket s) = NS.close s >> return (SSocket s)

Now, let’s talk about the shutdown commands:

shutdownReceive ::
  CanShutdownReceive sh s ~ 'True =>
  SSocket f p s sh
  -> IO (SSocket f p s (Shutdown sh 'NS.ShutdownReceive))

shutdownSend ::
  CanShutdownSend sh s ~ 'True =>
  SSocket f p s sh
  -> IO (SSocket f p s (Shutdown sh 'NS.ShutdownSend))

shutdownBoth ::
  CanShutdownBoth sh s ~ 'True =>
  SSocket f p s sh
  -> IO (SSocket f p s (Shutdown sh 'NS.ShutdownBoth))

Four new type functions are introduced:

The techniques are very similar to what we did for close. Below are the relevant implementations for the first three type functions:

type family CanShutdownReceive (sh :: ShutdownStatus) (s :: SocketStatus) :: Bool where
  CanShutdownReceive 'Available 'Connected = 'True
  CanShutdownReceive 'CannotSend 'Connected = 'True
  CanShutdownReceive 'Available 'Listening = 'True
  CanShutdownReceive 'CannotSend 'Listening = 'True

type family CanShutdownSend (sh :: ShutdownStatus) (s :: SocketStatus)  :: Bool where
  CanShutdownSend 'Available 'Connected = 'True
  CanShutdownSend 'CannotSend 'Connected = 'True
  CanShutdownSend 'Available 'Listening = 'True
  CanShutdownSend 'CannotSend 'Listening = 'True

type family CanShutdownBoth (sh :: ShutdownStatus) (s :: SocketStatus)  :: Bool where
  CanShutdownBoth 'Available 'Connected = 'True
  CanShutdownBoth 'Available 'Listening = 'True

It turns out that trying to shut down sockets that aren’t in at least a connected or listening state will cause an exception. We’ve prevented that now.

The last function is the one that performs the actual state transition:

type family Shutdown (sh :: ShutdownStatus) (cmd :: NS.ShutdownCmd)
  :: ShutdownStatus where
  Shutdown 'Available 'NS.ShutdownSend = 'CannotSend
  Shutdown 'Available 'NS.ShutdownReceive = 'CannotReceive
  Shutdown 'Available 'NS.ShutdownBoth = 'CannotSendOrReceive
  Shutdown 'CannotSend 'NS.ShutdownReceive = 'CannotSendOrReceive
  Shutdown 'CannotReceive 'NS.ShutdownSend = 'CannotSendOrReceive

Unapologetically, we leverage the types of the shutdown commands from the underlying network library to complete this type function, and also take the time to prevent all inefficient or invalid states.

That’s the core of things! Next up, we’ll see how this affects the design of sending and receiving data over sockets.

Sending and Receiving, but Only When it Makes Sense

The goal here is to only send or receive when it makes sense. In particular:

Here’s the API to achieve that:

send :: CanSend sh ~ 'True
  => ByteString -> SSocket f 'Tcp 'Connected sh -> IO Int

recv :: CanReceive sh ~ 'True
  => Int -> SSocket f 'Tcp 'Connected sh -> IO ByteString

sendTo :: CanSend sh ~ 'True
  => ByteString -> SockAddr f -> SSocket f 'Udp 'Unconnected sh -> IO Int

recvFrom :: (CanReceive sh ~ 'True, SockFam f)
  => Int -> SSocket f 'Udp 'Unconnected sh -> IO (ByteString, SockAddr f)

CanSend and CanReceive are simple enough, and currently depend only on the socket shutdown status:

type family CanSend (sh :: ShutdownStatus) :: Bool where
  CanSend 'Available = 'True
  CanSend 'CannotReceive = 'True

type family CanReceive (sh :: ShutdownStatus) :: Bool where
  CanReceive 'Available = 'True
  CanReceive 'CannotSend = 'True

The rest of the constraints are encoded in the type signature itself. Both the required protocol and the required socket state are specified, and only by passing in a socket with the appropriate state will a program compile.

Type inference saves us all and makes this possible.

As it’s given, this API is slightly over-preventative. Namely, it’s valid to use recvFrom/sendTo with a TCP socket, as long as it’s disconnected. Furthermore, it’s also fine to use send/recv with a UDP socket, also as long as it’s connected. The remaining cases throw errors. The nuanced version of this state machine gives us little gain, as it’s reasonable to expect to use TCP sockets mainly when connected and UDP sockets mainly when disconnected.

I want to highlight one last thing before moving on to the next section. Take a look at recvFrom and sendTo and their SockAddr argument:

sendTo :: CanSend sh ~ 'True
  => ByteString -> SockAddr f -> SSocket f 'Udp 'Unconnected sh -> IO Int

recvFrom :: (CanReceive sh ~ 'True, SockFam f)
  => Int -> SSocket f 'Udp 'Unconnected sh -> IO (ByteString, SockAddr f)

This is a different structure than the one given by the underlying library with a touch of added safety.

Safer Socket Address Initialization

The underlying library uses the following socket address data structure:

data SockAddr
  = SockAddrInet PortNumber HostAddress
  | SockAddrInet6 PortNumber FlowInfo HostAddress6 ScopeID
  | SockAddrUnix String
  ...

The problem here, is that there’s nothing preventing us from using a SockAddrInet with say, a Unix socket. Which, would throw an error. Or slightly more likely, an Inet6 structure with an Inet4 socket.

To address this, the fix is simple enough - we use a GADT version!

data SockAddr (f :: SocketFamily) where
  SockAddrInet :: NS.PortNumber -> NS.HostAddress
    -> SockAddr 'InetV4

  SockAddrInet6 :: NS.PortNumber -> NS.FlowInfo -> NS.HostAddress6 -> !NS.ScopeID
    -> SockAddr 'InetV6

  SockAddrUnix :: ![Char]
    -> SockAddr 'Unix

The most important bit here is the SocketFamily phantom type and how it affects the return type at each branch of this GADT. We reuse the SocketFamily type from earlier so we can relate socket instances to socket address structures.

Now if we look at a function like bind:

bind :: SockAddr f -> SSocket f p 'Unconnected sh -> IO (SSocket f p 'Bound sh)

It becomes impossible to pass in a socket address structure that isn’t of the same socket family as the socket being used. With that, our goal is achieved… mostly!

There’s actually a hack I’d like to talk about here. Let’s look at recvFrom, including its implementation:

recvFrom :: (CanReceive sh ~ 'True, SockFam f)
  => Int -> SSocket f 'Udp 'Unconnected sh -> IO (ByteString, SockAddr f)
recvFrom n (SSocket s) = do
  (bs, sa) <- NSB.recvFrom s n
  return (bs, fromNetworkSockAddr sa)

In every function except this one, the type parameter f occurs in the argument types. This made our lives easy - we could depend on the user to pass it what we needed without needing to communicate between the type and value levels.

However, here, recvFrom returns a SockAddr f. Worse yet, the underlying NSB.recvFrom returns the original NS.SockAddr structure, with no type information!

So we have to relate the two somehow, and convince the type-checker that it’s okay. This wasn’t easy, and I ended up with the following “proof”:

fromNetworkSockAddr :: forall f. SockFam f => NS.SockAddr -> SockAddr f
fromNetworkSockAddr netAddr = case (netAddr, socketFamily1 :: SocketFamily1 f) of
  (NS.SockAddrInet p addr, SInetV4) -> SockAddrInet p addr
  (NS.SockAddrInet6 p f addr scope, SInetV6) -> SockAddrInet6 p f addr scope
  (NS.SockAddrUnix s, SUnix) -> SockAddrUnix s
  _ -> error "impossible"

I’ll explain. When doing a pattern match, Haskell’s type-checker is able to refine what it knows. From the left side of the pattern match arrow (including any type-level data), it’s able to learn a lot about what should be on the right. So, that’s the why of needing to generate a SocketFamily1 object which carries the type-level data to associate the two - it convinces the type checker this all makes sense and we’re not just pulling a SockAddrInet6 'Inet6 out of thin air.

However, we aren’t able to convince the type-checker these are all the valid cases. It insists that we haven’t covered, say, NS.SockAddrUnix, InetV4, which we know make no sense, and the type-checker would even reject!

So that’s that. A lil hack.

With that out of the way, let’s talk about making this API nicer to use.

Making it Nicer to Use

Let’s take a look at the socket function:

socket :: forall f p.
  (SockFam f, SockProto p) => IO (SSocket f p 'Unconnected 'Available)

We’d use it look this to create a TCP4 client:

(socket :: IO (SSocket 'InetV4 'Tcp 'Unconnected 'Available))
  >>= connect serverAddress
  >>= ...

Which is … actually really tedious. As a user, we have to remember to type all that out in the correct order, or get scolded by the compiler. It’s really cool that we don’t have to pass in any arguments to get the sort of socket we want, but it came at quite the cost in usability.

We can do better - a lot better. I introduce several helper functions below:

tcp4Socket :: IO (SSocket 'InetV4 'Tcp 'Unconnected 'Available)
tcp4Socket = socket

tcp6Socket :: IO (SSocket 'InetV6 'Tcp 'Unconnected 'Available)
tcp6Socket = socket

tcpUnixSocket :: IO (SSocket 'Unix 'Tcp 'Unconnected 'Available)
tcpUnixSocket = socket

udp4Socket :: IO (SSocket 'InetV4 'Udp 'Unconnected 'Available)
udp4Socket = socket

udp6Socket :: IO (SSocket 'InetV6 'Udp 'Unconnected 'Available)
udp6Socket = socket

udpUnixSocket :: IO (SSocket 'Unix 'Udp 'Unconnected 'Available)
udpUnixSocket = socket

Their implementations are the easiest - type-inference handles 99% of the work. We just need to be careful about the type signatures. And like magic, our client socket code now becomes much nicer:

tcp4Socket
  >>= connect serverAddress
  >>= ...

Another usability pattern, is that we’d like functions to automatically create our sockets and also close them, even if errors come up. There’s a very powerful function hiding in Haskell’s Control.Exception standard prelude module: bracket!

bracket ::
  IO a           -- open/acquire a resource
  -> (a -> IO b) -- close/release the resource
  -> (a -> IO c) -- do some work with the resource
  IO c           -- return some value

Using bracket, we can write some pretty simple helpers:

-- the core bracket helper, written point-free style
withSocket ::
  (SockFam f, SockProto p)
  => (SSocket f p 'Unconnected 'Available -> IO a) -> IO a
withSocket = bracket socket close

-- expanding the point-free version:
-- withSocket f = bracket socket close f

-- one of six helper functions, also point-free style
withTcp4Socket :: (SSocket 'InetV4 'Tcp 'Unconnected 'Available -> IO a) -> IO a
withTcp4Socket = withSocket

The way the with helpers work is like this:

-- serverAddress defined as earlier
runRudeServer :: SSocket 'InetV4 'Tcp 'Unconnected 'Available -> IO ()
runRudeServer s = forever $ do
  server <- bind serverAddress s >>= listen 1
  (client, _) <- accept server
  _ <- send "bye" client
  close client

-- guarantees that if the server ever crashes,
-- the socket will be closed
main = withTcp4Socket runRudeServer

In the next two sections, we’ll write a simple echo client and server using this library, and then test it all!

The Client Meows and the Server Echoes

First, the client:

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DataKinds #-}
module Main where

import Network.Typed.Socket

meowClient :: SSocket 'InetV4 'Tcp 'Unconnected 'Available -> IO ()
meowClient s = do
  client <- connect serverAddress s
  send "meow" client
  whatTheOtherCatSaid <- recv 32 client
  print whatTheOtherCatSaid

main :: IO ()
main = withTcp4Socket meowClient

Now the server:

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DataKinds #-}
module Main where

import Network.Typed.Socket

meowServer :: SSocket 'InetV4 'Tcp 'Unconnected 'Available -> IO ()
meowServer s = do
  server <- bind serverAddress s >>= listen 1
  (client, _) <- accept server
  print "a cat arrives!"
  whatTheNiceCatSaid <- recv 32 client
  send whatTheNiceCatSaid

main :: IO ()
main = withTcp4Socket meowServer

And that’s network programming in Haskell!

Testing It

To go one step further, I’ll show how such a library could be tested. I used the echo server example above as a test case to make sure I didn’t break anything while doing the typed layer.

I’ll make use of the nice hspec library.

Conceptually, the test seems like it should be this simple:

  describe "Network.Typed.Socket" $ do

    it "can echo server (TCP)" $ do
      withTcp4Socket $ \server -> do
        withTcp4Socket $ \client -> do
          runServer server
          runClient client

and as long as both finish, all’s well? Right?

Herein we encounter all the loveliness and nuance of network programming. There’s immediately three things wrong with this test suite:

  1. in runServer, we’ll block on accept. The client will never connect. The suite never ends. Eternal uncertainty!!
  2. Even if by bad magic we made it to runClient, we don’t actually check things are being echoed. Let’s fix that.
  3. Finally, between runs of our test suite, if any errors came up, our server would fail to rebind to that port and address because there’s a special TIME_WAIT mechanism where the operating system holds on to sockets after closure to ensure clean termination.

Wow. Yeah. Welcome to network programming!

I’ll skip ahead to the solutions, and just say that a bit of concurrency and setting a socket option goes most of the way:

serverAddr :: SockAddr 'InetV4
serverAddr = SockAddrInet 2291 (NS.tupleToHostAddress (127,0,0,1))

runServer :: SSocket 'InetV4 'Tcp 'Unconnected 'Available -> MVar () -> IO ()
runServer s serverWaitLock =
  makeAddrReusable s
    >>= bind serverAddr
    >>= listen 1
    >>= (\server -> putMVar serverWaitLock () >> serverTest server)

runClient :: SSocket 'InetV4 'Tcp 'Unconnected 'Available -> IO ()
runClient s = do
  client <- connect serverAddr s
  n <- send "fish" client
  bs <- recv 32 client
  n `shouldBe` 4
  bs `shouldBe` "fish"

main :: IO ()
main = hspec $ do

  describe "Network.Typed.Socket" $ do

    it "can echo server (TCP)" $ do
      withTcp4Socket $ \server -> do
        withTcp4Socket $ \client -> do
          serverWaitLock <- newEmptyMVar
          void (forkIO (runServer server serverWaitLock))
          void (takeMVar serverWaitLock)
          runClient client

That’s the whole test! The main changes needed were:

Socket Options and Socket Wishes

This brings us close to the end. Our last bit of adventure involves thinking about socket options. Turns out, there’s a lot of them!

data SocketOption
  = Debug
  | ReuseAddr
  | SType
  | SoError
  | DontRoute
  | Broadcast
  | SendBuffer
  | RecvBuffer
  | KeepAlive
  | OOBInline
  | TimeToLive
  | MaxSegment
  | NoDelay
  | Cork
  | Linger
  | ReusePort
  | RecvLowWater
  | SendLowWater
  | RecvTimeout
  | SendTimeout
  | UseLoopBack
  | UserTimeout
  | IPv6Only

Here’s the typed socket function for setting options as of right now:

setSocketOption :: NS.SocketOption -> Int -> SSocket f p s sh -> IO (SSocket f p s sh)

As you can see, we’re not doing any of the property tracking in the types we did with socket status, protocol, and family. We’re just, running the effects.

The problem here is more complicated, because we want to track all the options we’ve enabled so far, which would be a type-level list.

Something like:

setSocketOption :: ValidToSetOption opts ~ 'True
  => NS.SocketOption -> Int -> SSocket f p s sh opts
  -> IO (SSocket f p s sh (SetOpt NS.SocketOption opts))

This is not a completely terrible endeavor just yet. It becomes harder if we’d like to also track the values we set options to, and to possibly reconsider how this interacts with the rest of our current API.

Some options are boolean-valued (on, or off). Some take integers. Some take time-structures. It’s an adventure for another time!

Lacking Linearity Leaves Limitations

We catch a wide class of errors with this new API design. However, there’s still one major weakness to this approach because of the limitations of the type system we’re working in.

What happens if we throw some threading into the mix??

g server = do
  forkIO (runServer server)
  close server -- oh no

runServer server = forever $ do
  -- do cool server stuff, I hope

We were able to close the server socket in the main thread, and as a result, the type-state will be inconsistent in the child thread. The server will actually be closed but will have a type like SSocket f p 'Listening sh.

Welp.

This is a limitation of the type-system, and there’s naught we can to help the compiler help us.

Fortunately, extensions to the Haskell type-system are being worked on that will allow it to encode these sorts of constraints. Constraints that make it possible to say, “I’ve handed this resource off to another thread. Please make no further modifications on this thread!”. That’s linear types - things can be “consumed”.

Here’s the haskell paper detailing how this system would work.

Closing

And that’s that! Phantom types, singletons, to the type-level and back to the value-level again.

If you’re interested in seeing the full implementation, you can check it out on gitlab.

I’m looking for new work, so send me an email if you’d like to work with me on making reliable systems for all.

Thanks for reading!