Wednesday, December 20, 2006

The Magnificent Folds

Among all the beautiful higher order functions there is one family that I think shines a little brighter than the rest: the folds.

fold is a function that takes three arguments: a binary function, a base value and a list. When fold is applied to its arguments the binary function is inserted between the elements of the list like so:

fold op base [a, b, c] => a `op` b `op` c `op` base

fold is named so since it folds the elements of the list onto each other. fold is also commonly known as reduce since it usually reduces a list to a single value.

fold comes in two different flavors, foldr and foldl, depending on if the function associates to the right or left respectively.

foldr op base [a, b, c] => a `op` (b `op` (c `op` base))

foldl op base [a, b, c] => ((base `op` a) `op` b) `op` c

foldr

-- foldr in Haskell
-- natural recursion until it reaches the empty list where the base value is returned
-- Arguments
-- The binary function (a -> b -> b) takes two arguments and returns a value of the second arguments type.
-- The base value (b) must be of the same type as the binary funtions second argument and return type.
-- The list [a] must be of the same type as the first argument of the function.
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr op base [] = base
foldr op base (x:xs) = x `op`(foldr op base xs)
 

So, what is the big deal with fold? I can express almost any list manipulation with fold.

-- Sum up all elements of list.
sum :: Num a => [a] -> a
sum xs = foldr (+) 0 xs

-- sum [1, 2, 3]
-- (foldr (+) 0 [1, 2, 3])
-- 1 + (foldr (+) 0 [2, 3])
-- 1 + 2 + (foldr (+) 0 [3]))
-- 1 + 2 + 3 + foldr (+) 0 []))
-- 1 + 2 + 3 + 0 
-- 6

-- And, check if all elements of a list are True
and :: [Bool] -> Bool
and xs = foldr (&&) True xs

The declaration of id_list below shows an example of an application of a funtion (:) that is not type-symmetrical. The type of (:) is: a -> [a] -> [a] and care must be taken to provide the elements in correct order.

-- The identity function for list
id_list :: [a] -> [a]
id_list = foldr (:) []

-- id_list [1, 2, 3]
-- (foldr (:) [] [1, 2, 3])
-- 1 : (foldr (:) [] [2, 3])
-- 1 : 2 : (foldr (:) [] [3]))
-- 1 : 2 : 3 : foldr (:) [] []))
-- 1 : 2 : 3 : [] 
-- [1, 2, 3]


-- Map a function to all elements of a list
map :: (a -> b) -> [a] -> [b]
map f xs = foldr (\a b -> f a : b) [] xs

-- All, check if all elements of a list are true according to predicate p
all :: (a -> Bool) -> [a] -> Bool
all p xs = and (map p xs)

foldl

To complement foldr we have foldl which is similar but associates to the left instead of to the right. This is sometimes useful since it allows us to process the elements of the list in the opposite order.

-- foldl in Haskell
-- Applies the base to first element of the list, the value is accumulated and used a base.
-- Arguments
-- The binary function (a -> b -> a) takes two arguments and returns a value of the first arguments type.
-- The base value (a) must be of the same type as the binary funtions first argument and return type.
-- The list [b] must be of the same type as the second argument of the function.
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl op base [] = base
foldl op base (x:xs) = foldl op (base `op` x) xs

-- Multiplies  all elements of list.
sum :: Num a => [a] -> a
sum xs = foldr (*) 1 xs

-- prod [1, 2, 3]
-- (foldl (*) 1 [1, 2, 3])
-- (foldl (*) (1 * 1) [2, 3])
-- (foldl (*) (1 * 1 * 2) [3])
-- (foldl (*) (1 * 1 * 2 * 3) [])
-- 1 * 1 * 2 * 3
-- 6

Since foldl processes the elements from left to right it may be used to reverse a list. The first try to do this is the same as the identity functon above but with foldl instead of foldr.

-- The erroneous reverse function for lists
err_rev :: [a] -> [a]
err_rev = foldl (:) []  
---*** Expression     : foldl (:) []
---*** Term           : (:)
---*** Type           : a -> [a] -> [a]
---*** Does not match : [a] -> [a] -> [a]
---*** Because        : unification would give infinite type

The problem comes from the order in which the arguments are applied. When the function is applied it reduces to:

-- rev [1, 2, 3]
-- (foldl (:) [] [1, 2, 3])
-- (foldl (:) ([] : 1) [2, 3]) -- error since the (:)-functions argument are applied in reverse order.

We need to reverse the arguments to the function and this can be done through the simple function flip.

-- Flip takes a function as parameter and returns the function with the arguments reversed.
flip :: (a -> b -> c) -> b -> a -> c
flip f x y = f y x

-- The correct reverse function for list
rev :: [a] -> [a]
rev = foldl (flip (:)) []  
-- rev [1, 2, 3]
-- (foldl (flip (:)) [] [1, 2, 3])
-- (foldl (flip (:)) ([] (flip (:)) 1) [2, 3])
-- (foldl (flip (:)) (([] (flip (:)) 1) (flip (:)) 2) [3])
-- (foldl (flip (:)) ((([] (flip (:)) 1) (flip (:)) 2) (flip (:)) 3) [])
-- ((([] (flip (:)) 1) (flip (:)) 2) (flip (:)) 3)
-- (((1 : []) (flip (:)) 2) (flip (:)) 3)
-- ((2 : [1]) (flip (:)) 3)
-- (3 : [2, 1])
-- [3, 2, 1]

foldr1 and foldl1

For lists that are known to be non empty and that work on a single type there are two simplified versions of fold called foldr1 and foldl1 that do not need a base case passed as argument. This makes them simpler to use but also less powerful.

-- foldr1 in Haskell for non-empty lists
-- Arguments
-- (a -> a -> a) A binary function operating on a single type.
-- [a] A list containing this type
foldr1 :: (a -> a -> a) -> [a] -> a
foldr1 f [x] = x
foldr1 f (x:xs) = f x (foldr1 f xs)

-- Return the largest element of two.
max :: Ord a => a -> a -> a
max x y
  | x >= y = x
  | otherwise = y

-- Return the largest elements of a list
maximum :: Ord a => [a] -> a
maximum xs = foldr1 max xs

foldl1 is defined similarily.

If you want to check out the folds on the web, here are two sites that will allow you to explore them: foldl.com and foldr.com. Click the ellipsis… :)

No comments: