Array#flatten in Haskell
Array#flatten is on occasion a very handy method. It reduces
n-dimensional arrays to
1-dimensional arrays, containing all the elements of
the top-level-array and its sub-arrays. Therefore, it is said to be “flattening”
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
~ ➜ irb 2.1.4 :001 > [ ].flatten =>  2.1.4 :002 > [  ].flatten =>  2.1.4 :003 > [ 1, 2, 3 ].flatten => [1, 2, 3] 2.1.4 :004 > [ 1, 2,  ].flatten => [1, 2, 3] 2.1.4 :005 > [ , 2, [3, 4] ].flatten => [1, 2, 3, 4] 2.1.4 :006 > [ 1, , [[3, 4], 5] ].flatten => [1, 2, 3, 4, 5] 2.1.4 :007 > [ 1, 2, [2, 3] ].flatten # retains duplicates => [1, 2, 2, 3] 2.1.4 :008 > [ 2, 1, [4, 3] ].flatten # retains order => [2, 1, 4, 3]
As you can see,
#flatten is conceptually quite simple. The same functionality
can be implemented in Haskell. But, as usual, the solution requires a little bit
of abstract thinking.
Let’s start with
1-dimensional lists and work our up to
n dimensions. The
first dimension is trivial: All we need is the identity function.
flatten1 :: [a] -> [a] flatten1 = id -- flattening a 1-dim list is a no-op
For the second dimension, we could do something like the following:
1 2 3
flatten2 :: [[a]] -> [a] flatten2  =  flatten2 (x:xs) = (flatten1 x) ++ (flatten2 xs)
This isn’t really what we want, because we’ve told the compiler that the
argument has exactly
2 dimensions. For that reason, we have to define another
function to support three dimensions:
1 2 3
flatten3 :: [[[a]]] -> [a] flatten3  =  flatten3 (x:xs) = (flatten2 x) ++ (flatten3 xs)
By now, you can probably see the pattern. We have to define yet another and another and another function for each additional dimension. Thus, the naive approach doesn’t get us very far…
An instance of
Array in Ruby doesn’t restrict the types of its elements in any
way (i.e. it’s just a bunch of references to instances of
the elements of an
Array instance can…
- … have different types. For example,
[1, "abc"]is completely valid in Ruby (the first element is an instance
Fixnumand the second element is an instance of
- … reference other
Arrayinstances (all classes in Ruby, including
Array, are sub-classes of
Arrays can be arbitrarily nested.
Haskell is statically-typed, whereas Ruby is dynamically-typed. Due to restrictions imposed by the static type system, there are certain things that cannot be done in Haskell.
Okay, that’s enough background information for now. Let’s flatten some stuff :)
Regarding (1): There is no counterpart for Ruby
Arrays with mixed elements
in Haskell. All elements in a list, must be values of the same type. That being
said, we could define a custom data type to circumvent this issue. However,
that only works for types we know about at compile-time.
Regarding (2): Unless we know
n at compile-time, the concept of
n-dimensional lists doesn’t make any sense in Haskell. Instead, we have to
tell the compiler more explicitly what we really want. Namely, a
Fortunately, Haskell makes it very easy for us define such trees with algebraic
data Tree a = Node [Tree a] | Leaf a
What does this mean? Well,
Tree a is a algebraic data type with two
constructors. The first one,
Node, encapsulates a list
[Tree a]. The second
Leaf, encapsulates a value of type
Next, we are going to illustrate our new data type by translating the Ruby array
foo = [1, 2, [3, 4], ] to Haskell:
1 2 3 4 5
foo :: Tree Integer foo = Node [ Leaf 1 , Leaf 2 , Node [ Leaf 3, Leaf 4 ] , Node [ ] ]
The last thing we still need, is the implementation of
However, that’s the easy part: The function simply has to recursively traverse
the given tree and collect the all values from its
1 2 3
flatten :: Tree a -> [a] flatten ( Leaf v ) = [v] flatten ( Node ts ) = concat $ fmap flatten ts
We can use the tree we’ve defined before (i.e.
foo) to test the function in
1 2 3 4 5 6 7 8
~ ➜ ghci Prelude> :l Flatten.hs [1 of 1] Compiling Flatten ( Flatten.hs, interpreted ) Ok, modules loaded: Flatten. *Flatten> let foo = Node [ Leaf 1, Leaf 2, Node [ Leaf 3, Leaf 4 ], Node  ] *Flatten> flatten foo [1,2,3,4] *Flatten>
[1,2,3,4] is of course exactly the result we’ve expected.
If you are not familiar with Haskell, you might think now that this is a lot of
overkill. Personally, I’m a big proponent for Haskell, but I agree that
#flatten is easier to implement in Ruby.
However, I would argue that this a consequence of Haskell’s static type system
and not its functional nature. If you disagree, I encourage you to implement
#flatten in a statically-typed and object-oriented language (e.g. Java) –
you’ll see what I mean.
What’s the benefit of functional programing and static typing? Safety and by extension confidence in your code. Dynamic typing is convenient, that’s for sure. But, saying that the price we are paying for that convenience is high, would be an understatement. There’s approximately one gazillion ways in which your Haskell program can blow up. With Ruby and dynamic typing you get two gazillion…
While writing this post, I’ve actively tried to break Ruby’s
substantiate my point. This is what I came up with after a couple of minutes:
1 2 3 4 5 6 7 8 9 10
~ ➜ irb 2.1.4 :001 > a = [1, 2, 3] => [1, 2, 3] 2.1.4 :002 > a << a => [1, 2, 3, [...]] 2.1.4 :003 > a.flatten ArgumentError: tried to flatten recursive array from (irb):3:in `flatten' from (irb)34 from /Users/cs/.rvm/rubies/ruby-2.1.4/bin/irb:11:in `<main>'
It never ceases to amaze me how easy it is to break Ruby programs.
Software: MRI Ruby 2.1.4-p265, Glasgow Haskell Compiler 7.8.3.