Wednesday, December 28, 2011

What the Heck are Algebraic Data Types? ( for Programmers )

This post is meant to be a gentle introduction to Algebraic Data Types. Some of you may be asking why you should learn Algebraic Data Types and how will they change your world?  I am not going to answer that, but suffice it to say that Algebraic Data Types are the underpinning of the type systems to the ML derived languages, Haskell and OCaml included, and their construction and properties allow for the power (and inference) that accompanies these type systems.  They are cropping up in other languages, like Scala, F#, and Clojure.  Well, here are my 2 cents.

I wrote this blog post because I had trouble learning what an Algebraic Data Type was, and could not find an introduction that did not either 1) leap immediately to copy and pasted phrases, 2) mislead, like make the claim that "algebraic data types are the same thing as discriminated unions" or 3) delve into graduate level formalism.  This blog post represents the results of what I have teased from numerous sources.

There are now tons of us programmers with sadly insufficient mathematical grounding, who are attempting to branch out and learn something new.  The confusion that comes from a sea of new words and overloaded acronyms can be very frustrating  (e.g. don't use the acronym ADT for algebraic data types, because that generally means abstract data type -- a completely orthogonal concept).   I wanted to be able to put a check mark next to my internal to-do list item for "What the heck are Algebraic Data Types?"   This blog post is meant to help people do the same.

Quick Start
 To start, let us attempt a definition of algebraic because for some weird reason, no introduction to Algebraic Data Types ever tries to explain this etymology.  For purposes of simplicity, let us define algebra to be two things: 1) a set of objects (do not think objects as in object oriented) and 2) the operations used on those objects to create new objects from that same set.  An introduction to algebraic data types is therefore an introduction to types and the operations used to create new types.

Examples of some other Algebras
In the case of numeric algebra (informally known as high-school algebra), the set is the set of numbers (whether they be natural, rational, real, or complex) and the operations used on these objects can be (but definitely not limited to be) addition, or multiplication.  The algebra of numbers is therefore the study of this set, and the laws by which these operators generate (or don't generate) new members from this set.

Another example: the relational algebra.  In this algebra, the set of objects are the set of relations.  Unlike numbers which we tend to (erroneously) view as atomic, a relation is a thicker kind of object containing a number of named columns, their associated domains, and a number of rows.  Each 'relation' (otherwise known as 'table') is therefore the object unto which operators may operate.  These operations include, amongst others,
  • projection (π) create a new relation from the existing one by removing columns, 
  • selection(σ) create a new relation from an existing one by removing rows that satisfy a predicate function,  
  • joining (⋈) create a new relation from two existing ones by matching up the domains and values of a chosen column,
  • renaming (ρ) create a new relation from an existing one by renaming a column

The key thing here to realize is that an algebra allows us to talk about the objects and the operations abstractly, and to consider the laws that these operations obey as they operate on the underlying set.

So, again, what is an algebraic data type? Let's repeat these questions. What is the set?  What are the operators?  First, the underlying set is the set of types.  For you programmers out there, types are so familiar that we often overlook how fundamental they are.  If we were to start with a set of base types, we could use operators to create new ones.  Our first operator, the product operator, will create what is known as a product type, or a record type, or a tuple-type.  We pair, or treble, or quadruple, or quintuple, or n-tuple existing types together, to create an aggregate type.   A classic example is the Pair type.  As a hard example, consider the Pair as implemented in a bevy of object oriented languages:

class Pair{
   float x;
   float y;
}

change the above code's syntax to your favorite python, ruby, scala, etc.  It doesn't matter.   The act of writing 'class' and defining this type IS our product operator.  Object oriented programmers tend to call this operation composing -- but whatever you call it, it's the same thing. From the pre-existing type float we have created a new type called Pair.  Does our operator allow us to keep on going with this?  Let's try it:

class PairOPairs{
   Pair start;
   Pair end;
   int time;
}

Well, there you go.  From two existing types (Pair and int), we have created a third. This is what operators do in an algebra.

Our class definition as a product operator has spun a new type into existence from other ones, just like
  • '+' allows us to combine 3 and 4 to get 7,  
  • '*' gives us 12 from  3 and 4,
  • 'join' allows to create a new relation (which we generally don't name) from 'Employee' and 'Salary',
  • 'or' allows us to get 'true' from 'true' and 'false' together,
  • '+' allows us to get "hotdamn" from "hot" and "damn"  (commutativity matters though!),
  • 'not' allows to turn 'true' into 'false' (the only non-binary operator in this list).
We have used an operator on an underlying set to create/synthesize a new member of that set.

Now, some languages allow us to create these types anonymously, what we call tuples.  For theoretic purposes, the tuple containing real numbers can be much the same (isomorphic) as a Pair class containing real numbers.  The differences?  Well those are practical and vary from programming language to programming language, but are about ease-of-use and the languages implementation of type checking.  Practical matters are of great importance to programmers.

The Haskell syntax for product types is:
  data Pair = Pair Real Real.

Note how the 'Pair' on the left is different from the 'Pair' on the right. The one on the left is the type.  The one on the right is the particular named product operator, what Haskell calls a data constructor.

All this time we have used the product operator to create new types from existing ones.  And we called it a product type.  Why?  Well.  The answer is that we are producing a Cartesian product (or multiplication) on members of several underlying sets.  Again, consider the Pair.  Let's rename it to Point to drive home a point:

class Point{
   float x_coordinate;
   float y_coordinate;
}

In the act of creating the class 'Point' we have created a new type of x's and y's together.   The x's have values in the real domain.  The y's have values in the real domain too.  The Point is a Cartesian product from values of type of Reals against values of type Reals or (R X R).  As another example, we could have done this:

class Person{
   String firstName;
   String lastName;
   int ssn;
}

Here we are using Cartesian multiplication to produce a Cartesian product of two strings and an integer ("firstname","lastname",123456714).  The three constituent elements may range throughout the space of all possible strings and integers (this is not yet ideal, as we would love to restrict the length of our integer type -- see dependent types) to create a value that is a triple of strings and an integer.  Thus our new type is of (String X String X Int).  Are there any differences?  Of course.  I don't know of a way to implement a comparison operator between the Person, modulo subjective judgements on character.  With Point, I can implement a 'distance' from the origin metric for some ordering.

How you use the types is 100% up to you.  Do you want to use 'Pair' to capture a point in x/y space, or to represent the length of your left foot and right foot?  Well that's up to you. The algebra of data types is abstract.  The choice of names and how we compose these types from other types is the act of domain modelling and is something for which programmers get paid.

So far we have looked at one operator.  A 'product' operator.  Are there others? Sure.  The object oriented guys have something called sub-typing which allows you to generate a new type from another via 'extension'.  Anything else?  Well, there is an interesting one that the typical Java enterprise developer rarely sees.  It is called a variant type or a sum type or a discriminated union, and allows you again to compose a new aggregate type from existing types.  But what is the difference between this and a product type? Well, consider again a product type:

class Point{
   float x
   float y
}

and now consider a completely different example as a  sum type:

union Identity{
   String name
   UUID uuid
}

In the first case, a value of type 'Point' has constituent values of an 'x' AND a 'y'.  In the case of the discriminated or disjoint union (sum type), values of the type Identity have constituent values of a 'name' OR a 'uuid' -- but not both at the same time.  The difference is fundamental.  If Java had this then something like the following would be possible, AND typecheck!

Identity myIdentity = new Identity();
if( db.hasOnlyName() ){
    myIdentity = "Daniel Eklund";       // could be a string
}else{
    myIdentity = java.util.UUID.randomUUID();  // OR a UUID
}

The Haksell notation for the sum type is:
   data Identity = S String | U UUID

(note how the bar symbol '|' is the sum operator)

Mathematicians like to overload symbols and words.  We all subconsciously respect that '+' means something different when it is used on two numbers versus when it used with two strings versus when it used on two vectors.  Why do we call this new operator a 'sum' and its result a 'sum type' then?  Well I could answer that, or let you figure that out on your own.  The important thing to do is to let go of concrete biases about what a 'sum' is but also realize that there must be a reason that we analogize with the pre-existing operator 'sum' from a different algebra. (Hint: think about the algebra of boolean values and what AND and OR mean.  Now think about 0s and 1s.)

What programming languages choose to expose as mechanisms for operating on types tells us a lot about what the language wants us to respect.  Some languages are concerned about the mathematical tractability of what can be expressed or inferred about these types from limiting the operators.  When we have this tractability, we can write algorithms in our type checker that do wondrous things for us.  Some languages are not so concerned about what can be proven mathematically about these types, but want us to have notions that make sense to us.  These languages often require us to constantly remind ourselves what types we have by annotating the values when they are used as parameters or return values.  Some languages ignore types. 

It is important to realize that an algebra of data types exists in all of these languages.  But what kind of algebra is a different question. 

Thus when you hear about algebraic data types, you will most often be referring to a Haskell-like system -- something called an initial algebra.  Definitely, the wikipedia article presumes this.  The wikipedia article is a hodge-podge mess.

In this case we mean algebraic data type to mean one which has both a 'product' and 'sum' operator, which is closed, which has a unit-type, which has associative properties, a fixed-point operator, and a bunch of other stuff which someone else's blog post  and a stackoverflow discussion do a better job explaining.  Please do read this other post as it is both more rigorous and continues exploring how the analogy (morphism) of arithmetic algebra with type algebra can give us beautiful things like derivative types.  My gosh. It's cool.

The purpose of this posting is merely an introduction, and, as such, will not be rigorous.  I have erred on the side of hand-wavey explanation and can even now see things which are technically suspect. But whatever.  Study.  Figure it out.

Tuesday, January 18, 2011

Simple Boggle Solver in Haskell

Years ago, I used the idea of solving a boggle board as my way of learning common lisp.  This past week, I have done the same with Haskell.

https://github.com/reverendpaco/Haskell-Dictionary-Trie

A Simple Boggle Solver
  • cabal install binary
  • git clone git@github.com:reverendpaco/Haskell-Dictionary-Trie.git
  • cd ./Haskell-Dictionary-Trie/src
  • ghc --make WordServer.hs
  • ./WordServer (listens on 9900)
  • (on another terminal) telnet localhost 9900
  • type in letters "SERSPATGLINESERSOO" which will be wrapped into the closest approximate square, for example
SERSPATGLINESERSOO ->
                S E R S
                P A T G
                L I N E
                S E R S
                O O
  • hit return
  • watch as the words in this board are returned to you

Thursday, December 30, 2010

I think I finally grok it


Sometime it's so much easier to make your own diagrams.  The arrow represents the "is a" relationship with labels to specify.

I left the crazy cousin "homeo" out of the family tree, cause he's just too topological.

Monday, July 26, 2010

Clojure Implementation of Dictionary Trie

I finally have Clojure
This is the Clojure implementation of the Dictionary Trie previously implemented in Haskell.  You can copy and paste it into a repl, or load it into a file.  A basic usage goes:
(generate-trie 
      ["rebate" "reborn" "realize" "real"])
or for fun
(println (generate-pretty
    (generate-trie 
          ["rebate" "reborn" "realize" "real" "relied"])))
to get the Dot
"_"[label="∅"]
"_"->"r_"[label=" r"]
"r_"[label="re"]
"r_"->"lr_"[label=" l"]
"r_"->"ar_"[label=" a"]
"r_"->"br_"[label=" b"]
"lr_"[label="lied(*)"]
"ar_"[label="al(*)"]
"ar_"->"iar_"[label=" i"]
"iar_"[label="ize(*)"]
"br_"[label="b"]
"br_"->"obr_"[label=" o"]
"br_"->"abr_"[label=" a"]
"obr_"[label="orn(*)"]
"abr_"[label="ate(*)"]
and copy those results here to see the image:

Discussion
  1. defmulti can be used to mimic Haskell's constructor/structure based dispatch
  2. Haskell has trained me to favor "higher-order" over macros.  Higher-orders like partial, comp, etc. can do a lot of what you need.
  3. Looking forward to Clojure 1.2's defrecord capabilities as I have become a fan of abstract/algebraic data types
  4. Clojure has gotten a lot of mileage with the 80% of data structures we all use (in their immutable form): vector, hashmap, list.  The upcoming defprotocol will allow the hard work for that last 20%
  5. as much as I love Haskell, you can't beat a quick (println) for debug.
  6. It was easier to mentally map the Haskell code to the Clojure code rather than re-reason the problem through.  
  7. I like recur over hoping for a compiler TCO. 
  8. Types do help debug.  I found a lot of quirky edge-cases in testing Clojure run-time.  Hard to put my finger on individual anecdotal evidence... more of a "feel thing".
  9. Just as in the Haskell implementation,  I don't think there's any great advantage to be gained in using a zipper data structure.  I will try though, mainly to familiarize myself with the zipper.
API Methods
  • generate-trie
  • query-node
  • add-word-to-trie

Monday, June 21, 2010

Implementing the Dictionary Trie: Insertion and Search Algorithms: Haskell

In a previous post, I settled on a data structure for the Dictionary Trie.  In this post, I present the Haskell code I used to implement the search and insertion algorithms.   To understand what motivates the dictionary trie, make sure to read the following posts (including an informal discussion on the algorithms used in the insertion):
  1. What is a dictionary trie?
  2. How we would go about creating a dictionary trie from inserts.
  3. What is a basic Haskell data structure for a dictionary trie.
The Data Structure

Highlights:
  1. Algebraic
  2. Parametrically Polymorphic
As I had mentioned on my previous post about refining this data structure, the TrieNode is the essential component to the tree.  I have used an enumeration type (NodeType) to indicate the node type rather than a set of booleans, in the belief that Haskell's pattern matching could make the code that much easier to understand.
The Insertion Algorithm
The insertion algorithm exposes one public API function:

  • addWordToTrie
which takes a string (actually an array of the parametrically typed value) and the dictionary trie. 
There are several utility functions that I do not export out of the module (none of the module code at the top is shown here).  These utility functions use internal utility data structures that are concerned with the type of match that two strings have towards each other -- the source string is always compared against a target string and there's always one of five different matches:  
  • exact match
  • target is prefix of source
  • source is prefix of target
  • shared prefix
  • nothing in common
In the below code, I have inserted a graphic image as a  visual description of the algorithm for the four different cases that you would actually run into during the insertion.  The utility function insertNode is a direct mapping (and formal encoding) of the informal scenarios discussed in this previus post. The correspondence to the informally developed algorithms and the actual functions was quite pleasing, and was made possible by the pattern-matching capabilities of the language


  Exact Match

Target Is Smaller Than Source

Source Is Smaller Than Target

Source and Target Share Same Prefix

The Querying Algorithms

Discussion
These are the subjects I want to touch on in my creation of the above code:
  1. Keeping things functional with record syntax
  2. Just Maybe stuff
  3. Potentially too redundant in spots  
  4. my first use of home-grown operators
  5. does pattern matching on record syntax get me that much?
  6. no magic
  7. functionally passing back self
Keeping Thing Functional with Record Syntax
In order to make a 'change' in the functional world, what you have to do is actually make a new value.  For values that are aggregates of others, like our data structure, you need to make sure that the old values for the other stuff remains the same, so essentially you 'copy' the other values.  I was extremely pleased to see that with the record syntax, you could code such that a 'modification' to one field in the record affected the overall change leaving the rest of the fields the same as previous.  This required the use of Haskell's pattern-matching to name the overall structure/node.  
Just Maybe Stuff
I know that I am not properly understanding the way for dealing with Maybe values.  I feel that there has to be a better way for extracting the underlying value than the where-clause utility functions I use. Does this reflect an underlying misapprehension of the pholosophy of Maybe values? Maybe.
Potentially Too Redundant
I had two internal data structures to represent the matching of the source and target strings.  One was purely a description of the matching that occurred (the type MatchType) and consisted of five nullary constructor data types.  The other redundantly captured the state of the match but also encapsulated the prefix-suffix pair that resulted.  I have a strong suspicion I could have done this with one pass.  I will revisit.
First Use of Homegrown Operators
This is an issue of terseness and psychology.  Once I had the algorithms developed and everything working according to plan, I refactored some prefix-oriented functions into operators.  I turned:

  • insertNewChildNode into *->
  • takeTogetherWhile into =><=
In the end, this reduced the camelCaseClutterAndExcessinveJavaLikeNaming, but as these functions were internal there probably was no real gain.
Pattern Matching Cost Benefit Analysis
The record syntax (as a data constructor) permits pattern matching on the left-hand side of function definitions or on case expressions.  There were several times I wondered whether the use of the pattern-matching was really worth it.  While it did lower the terseness on the right-hand side (no need to fetch the data out of the record with 'accessor' functions), it did raise the level of noise on the left-hand side.
No Magic
As I've been diving deep into Haskell, I've been itching to try out the magic that provides the allure to this language.  I have been attracted to function composition and currying.  I've desired to chain together functions in a point-free style or employ functors or applicative style idioms.  I found that my code showed little of this magic.  The reasons could be (and/or):

  • I am new to a different paradigm and still thinking idiomatically imperative
  • I am dealing with programming in the large which is really just massaging, manipulation, and transformation of record syntaxes.
Functionally Passing Back Self
The utility function insertNode is the heart of the insertion algorithm and is tree-recursive.  As mentioned above, the functional nature of Haskell requires that in tree updates, we are actually doing tree 'copies'  (the underlying implementation may share state to increase performance, but as far as the language semantics go, the values should all be new and copied).  To accomplish this, functions which are tree recursive have to always do the same thing:  
  • they need to pass back the node they are currently operating on as the result of the function
  • and assume that the invocations on themselves (the recursive call) are 'modifications' to the child nodes
This grew tedious after a while, and felt like unnecessary boilerplate.  It was stuff like this that drew the development for higher-order functions like fold and map, and I considered exploring the benefit of casting the Trie as a functor, or of implementing a zipper data structure.  
I will revisit this.

Thursday, June 10, 2010

Another Digression: Tries for Capturing Pronunciation




Those linguist guys came up with an alphabet of symbols to try to capture the totality of human sounds as a sequence of characters. They called this alphabet IPA.  Here is the IPA for the word 'antediluvian'

 \ˌan-ti-də-ˈlü-vē-ən, -(ˌ)dī-\

I just had an idea that maybe we can use words of IPA pronunciation in a trie and it might give us some useful stuff.  For instance,  'does one word sound like another?', both at the front and at the end -- 'friendship' and 'phrenology' sound alike near the front (at least, according to me), and 'cone' and 'loan' have a similar ending sound (i.e. they rhyme -- anytime).

I am going to make a lot of assumptions, but as I am not attempting to be a linguist (not today), I can get away with doing some cool stuff while waving my hand with the pronouncement that this is just an approximation.  The title of this blog is probably done before.  I am sure that someone has done this before, but I am not going to google it -- I want to have fun exploring this on my own.

Plan
  1. Find a dictionary of IPA words.  (Strictly speaking I need to find two parallel dictionaries: an IPA dictionary along with the script word: a mapping from words of an IPA language (ˌan-ti-də-ˈlü-vē-ən, -(ˌ)dī-) to words of a written language (antediluvian).
  2. Insert these IPA words into two tries:  one with the constituent IPA characters from left to right, and one right to left.  The first tree should tell me 'friendship' and 'phrenology' share a pronunciation prefix -- even though they don't share the same written-word prefix.  The second trie should tell me that 'loan' and 'cone' rhyme (despite having different suffixes).  Here is where my hand-waving should come in handy.
This should be fun, and I am sure I am going to run into a LOT of problems.