As you may know, I’m a huge fan of static types. In particular, I adore type systems as one might find in Haskell, Idris, Agda, and OCaml. If I could, I’d write all my projects in one of these languages or some combination thereof.
However, I can’t. It’s not for a lack of want or because it’d be impossible. Similarly, many people can’t, whatever their reasons may be.
I’d like to draw attention to the issue of legacy software. Large systems have already been written in languages like Ruby, Python, PHP, and Javascript. In many cases, it would be too costly to rewrite these systems. Even if a parallel rewrite effort could be undertaken, one would still need to ease the maintenance of the existing system. Often times, these systems are the pipes and patches that hold the internals of companies together. They’re critical.
This is part of the reason why gradual type systems are so important! They provide a form of checking that can greatly facilitate the maintenance and evolution of these legacy systems. They complement tests, can be iteratively applied, and provide benefit immediately. I elaborate further below.
To date, the following programming languages provide or are in the process of adopting a gradual type system:
There are many other programming languages out there. However, for the time being, I’ve limited myself to surveying languages that I’ve encountered or I believe are in common use.
Gradual type systems, in simplest terms, use local information from literals (10
, "cat"
) and annotations to determine the shape of data as it proceeds through your program. In the simplest case, it can track primitive types: ints, floats, chars, strings, etc.. A level higher, and they understand composite types, usually through an object/subtyping system:
class Person(object):
def __init__(self, name: str, age: int) -> None:
self.name = name
self.age = age
A level higher still, and the gradual type system can talk about polymorphic types:
;; Apply `f` to each `x` in the list `xs`, returning a new list
(: map (All (a b) (-> (-> a b) (Listof a) (Listof b))))
(define (map f xs)
(if (null? xs)
null
(cons (f (car xs)) (map f (cdr xs)))))
This is the very briefest introduction of the sorts of things type systems can track.
With that as a start - why does it matter?
The very short of things that might be detected:
mu-map
instead of my-map
in a recursive invocationFurther, it helps separate the concern of implementing your application from checking that it is internally consistent. Consider the following example:
class Person
def initialize(name, age)
@name = name
@age = age
end
def greet
puts "Hi, #{@name}!"
end
end
x = Person.new("Cat", 10)
# Defensive programming
x.greet() if x.is_a?(Person)
x.greet() if x.respond_to(:greet)
# With type-checking
x.greet()
There are at least two reasons that one might prefer the decoupled, type-check approach:
To elaborate on (2), let’s replace Person
with Cat
:
class Cat
def initialize(name, age)
@name = name
@age = age
end
def purr
puts "Purring at #{@name}"
end
end
x = Cat.new("Still a Cat", 10)
x.greet() -- we still want an effect here, but...
If we had used the Ruby approach of respond_to?
, we would have avoided a run-time crash. However, if we still wanted the effect to take place, we would have missed this relying solely on defensive programming techniques.
Adding the presence of unit tests, we’d need this particular line to be covered for the desired effect (purring, rather than greeting). This adds a duplication of effort. When refactoring, we’d need to update our application, as well as our test suite.
A type-checker, on the other hand, can detect a Cat
doesn’t know how to greet
:
-- Refactor.hs
data Cat = Cat String Int
data Person = Person String Int
greet (Person n _) = putStrLn ("Hi, " ++ n ++ "!")
purrs (Cat n _) = putStrLn ("Purrs at" ++ n)
main = do
let x = Cat "Still a cat!" 10
greet x
$ ghci Refactor.hs
Refactor.hs:18:9:
Couldn't match expected type ‘Person’ with actual type ‘Cat’
In the first argument of ‘greet’, namely ‘x’
In a stmt of a 'do' block: greet x
I cheated a little above, since I’ve used an example in a statically-typed language, but the concept holds in general: type-checking fills an important role in the maintenance of software systems. Before anything is executed, I’ve got a line number, a column number, and precise information on the reason there’s an error.
Granted, this is a toy example. A more realistic example might consider adding a new Shape
or Currency
on the data model or a new action (Render
, Convert
) and making sure that those are accounted for throughout the system. Throughout the system could be as simple as within the same file, or it could be as complex as spanning multiple modules or projects.
Types help preserve abstraction.
Linting is another form of static analysis that focuses on suspicious constructs rather than the types of data. Type-checking is not linting.
A key difference between linting tools and type-checkers (in my experience) is that linting tools use heuristics to determine whether a particular expression or statement could go wrong. Type-checkers depend strictly on the shape of your data, and solve for inconsistencies. If a type-checker flags something, it’s almost certainly off.
One of the beautiful aspects of typeful programming is that you don’t need to do everything up front. Just like with testing, you can refine your types and models as you go. Bit by bit, you can improve the correctness of your code.