Wednesday, December 20, 2006

The Magnificent Folds

Among all the beautiful higher order functions there is one family that I think shines a little brighter than the rest: the folds.

fold is a function that takes three arguments: a binary function, a base value and a list. When fold is applied to its arguments the binary function is inserted between the elements of the list like so:

fold op base [a, b, c] => a `op` b `op` c `op` base

fold is so named because it folds the elements of the list onto each other. fold is also commonly known as reduce, since it usually reduces a list to a single value.

fold comes in two different flavors, foldr and foldl, depending on whether the function associates to the right or to the left.

foldr op base [a, b, c] => a `op` (b `op` (c `op` base))

foldl op base [a, b, c] => ((base `op` a) `op` b) `op` c
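With an associative operator like (+) the two flavors give the same result, but a non-associative operator such as (-) makes the difference in grouping visible. A quick check of my own:

-- foldr (-) 0 [1, 2, 3]  =>  1 - (2 - (3 - 0))  =>  2
-- foldl (-) 0 [1, 2, 3]  =>  ((0 - 1) - 2) - 3  =>  -6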

foldr

-- foldr in Haskell
-- natural recursion until the empty list is reached, where the base value is returned
-- Arguments
-- The binary function (a -> b -> b) takes two arguments and returns a value of the second argument's type.
-- The base value (b) must have the same type as the binary function's second argument and return value.
-- The elements of the list [a] must have the same type as the binary function's first argument.
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr op base [] = base
foldr op base (x:xs) = x `op` (foldr op base xs)
 

So, what is the big deal with fold? I can express almost any list manipulation with fold.

-- Sum up all elements of a list.
sum :: Num a => [a] -> a
sum xs = foldr (+) 0 xs

-- sum [1, 2, 3]
-- (foldr (+) 0 [1, 2, 3])
-- 1 + (foldr (+) 0 [2, 3])
-- 1 + (2 + (foldr (+) 0 [3]))
-- 1 + (2 + (3 + (foldr (+) 0 [])))
-- 1 + (2 + (3 + 0))
-- 6

-- and, check if all elements of a list are True
and :: [Bool] -> Bool
and xs = foldr (&&) True xs

The declaration of id_list below shows an application of a function, (:), that is not type-symmetrical. The type of (:) is a -> [a] -> [a], so care must be taken to provide the arguments in the correct order.

-- The identity function for lists
id_list :: [a] -> [a]
id_list = foldr (:) []

-- id_list [1, 2, 3]
-- (foldr (:) [] [1, 2, 3])
-- 1 : (foldr (:) [] [2, 3])
-- 1 : (2 : (foldr (:) [] [3]))
-- 1 : (2 : (3 : (foldr (:) [] [])))
-- 1 : (2 : (3 : []))
-- [1, 2, 3]


-- Map a function to all elements of a list
map :: (a -> b) -> [a] -> [b]
map f xs = foldr (\a b -> f a : b) [] xs

-- all, check if all elements of a list satisfy the predicate p
all :: (a -> Bool) -> [a] -> Bool
all p xs = and (map p xs)
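all can also be written as a single fold, without building the intermediate list. A small variation of my own, using only the foldr defined above:

-- all in one pass (my own variant)
all' :: (a -> Bool) -> [a] -> Bool
all' p xs = foldr (\x b -> p x && b) True xs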

foldl

To complement foldr we have foldl, which is similar but associates to the left instead of to the right. This is sometimes useful since it lets us combine the elements of the list in the opposite order.

-- foldl in Haskell
-- Applies the binary function to the base value and the first element; the result is accumulated and used as the new base value.
-- Arguments
-- The binary function (a -> b -> a) takes two arguments and returns a value of the first argument's type.
-- The base value (a) must have the same type as the binary function's first argument and return value.
-- The elements of the list [b] must have the same type as the binary function's second argument.
foldl :: (a -> b -> a) -> a -> [b] -> a
foldl op base [] = base
foldl op base (x:xs) = foldl op (base `op` x) xs

-- Multiply all elements of a list.
prod :: Num a => [a] -> a
prod xs = foldl (*) 1 xs

-- prod [1, 2, 3]
-- (foldl (*) 1 [1, 2, 3])
-- (foldl (*) (1 * 1) [2, 3])
-- (foldl (*) (1 * 1 * 2) [3])
-- (foldl (*) (1 * 1 * 2 * 3) [])
-- 1 * 1 * 2 * 3
-- 6

Since foldl combines the elements from the left, it may be used to reverse a list. A first try is the same as the identity function above, but with foldl instead of foldr.

-- The erroneous reverse function for lists
err_rev :: [a] -> [a]
err_rev = foldl (:) []  
---*** Expression     : foldl (:) []
---*** Term           : (:)
---*** Type           : a -> [a] -> [a]
---*** Does not match : [a] -> [a] -> [a]
---*** Because        : unification would give infinite type

The problem comes from the order in which the arguments are applied. When the function is applied it reduces to:

-- err_rev [1, 2, 3]
-- (foldl (:) [] [1, 2, 3])
-- (foldl (:) ([] : 1) [2, 3]) -- error: the arguments to (:) are applied in reverse order

We need to reverse the arguments to the function, and this can be done with the simple function flip.

-- flip takes a function and returns the same function with its arguments reversed.
flip :: (a -> b -> c) -> b -> a -> c
flip f x y = f y x

-- The correct reverse function for lists
rev :: [a] -> [a]
rev = foldl (flip (:)) []  
-- rev [1, 2, 3]
-- (foldl (flip (:)) [] [1, 2, 3])
-- (foldl (flip (:)) (flip (:) [] 1) [2, 3])
-- (foldl (flip (:)) (flip (:) (flip (:) [] 1) 2) [3])
-- (foldl (flip (:)) (flip (:) (flip (:) (flip (:) [] 1) 2) 3) [])
-- (flip (:) (flip (:) (flip (:) [] 1) 2) 3)
-- (flip (:) (flip (:) (1 : []) 2) 3)
-- (flip (:) (2 : [1]) 3)
-- 3 : [2, 1]
-- [3, 2, 1]

foldr1 and foldl1

For lists that are known to be non-empty and whose elements all have the same type, there are two simplified versions of fold, called foldr1 and foldl1, that do not need a base value passed as an argument. This makes them simpler to use but also less powerful.

-- foldr1 in Haskell for non-empty lists
-- Arguments
-- (a -> a -> a) A binary function operating on a single type.
-- [a] A list containing this type
foldr1 :: (a -> a -> a) -> [a] -> a
foldr1 f [x] = x
foldr1 f (x:xs) = f x (foldr1 f xs)

-- Return the larger of two elements.
max :: Ord a => a -> a -> a
max x y
  | x >= y = x
  | otherwise = y

-- Return the largest element of a list
maximum :: Ord a => [a] -> a
maximum xs = foldr1 max xs

foldl1 is defined similarly; a sketch is given below.
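My guess at the definition (mirroring foldr1 above; the first element of the list becomes the base value):

-- foldl1 in Haskell for non-empty lists (a sketch, not from the original post)
foldl1 :: (a -> a -> a) -> [a] -> a
foldl1 f (x:xs) = foldl f x xs

-- Return the smallest element of a list
minimum :: Ord a => [a] -> a
minimum xs = foldl1 min xs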

If you want to check out the folds on the web, here are two sites that will allow you to explore them: foldl.com and foldr.com. Click the ellipsis… :)

Sunday, November 12, 2006

Essential Java Generics

Java Generics may seem quite intimidating, but it is not really that hard. It is, however, important to remember a few simple rules.

Subtyping

The Liskov Substitution Principle does not apply to generic type parameters!
  • Integer is a subtype of Number.
  • Integer is a subtype of Comparable.
  • List is a subtype of Collection.
  • List<Integer> is a subtype of Collection<Integer>.
  • List<Integer> is not a subtype of List<Number>.
  • List<Integer> is not a subtype of List<Comparable>.
// Example of why generic types are not proper subtypes:
List<Integer> integers = new LinkedList<Integer>();
List<Number> numbers = integers;            // Won't compile!
numbers.add(3.14);                          // since integers cannot contain 3.14 


Wildcards

To loosen the constraint above, wildcards may be used. Wildcards are used with the keywords extends and super.
  • <? extends Number> means all types that are subclasses of Number are allowed.
  • <? super Integer> means all types that are superclasses of Integer are allowed.

The Get and Put Principle: use extends only when you intend to get values out of a structure, use super only when you intend to put values into a structure.

This also implies: don’t use any wildcards when you intend to both get and put values into and out of a structure.

// Copy all elements of source (whose elements are subtypes of T) into dest (whose elements are supertypes of T).
public static <T> void copy(List<? super T> dest, List<? extends T> source) {
   for (int i = 0; i < source.size(); i++) {
       dest.set(i, source.get(i));
   }
}                     
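A hypothetical call (the lists are my own example) shows the principle in action: values are got out of source, which extends T, and put into dest, which supers T:

// Hypothetical usage of copy; T is inferred as Integer.
List<Object> dest = Arrays.asList(new Object[3]);   // a List<? super Integer>
List<Integer> source = Arrays.asList(1, 2, 3);      // a List<? extends Integer>
copy(dest, source);                                 // dest is now [1, 2, 3]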

// Extends wildcard violation
List<Integer> integers = new LinkedList<Integer>();
List<? extends Number> numbers = integers;   
numbers.get(0);                                 // Works fine!
numbers.add(3);                                 // Won't compile!

// Super wildcard violation
List<Number> numbers = new LinkedList<Number>();
List<? super Integer> integers = numbers;
integers.add(3);                                // Works fine!
int i = integers.get(0);                        // Won't compile!
Object o = integers.get(0);                     // Works fine since Object is the upper bound!

In addition to the above principle, there are also a few restrictions on wildcards: don't use wildcards when creating instances, when specifying generic method calls, or when extending superclasses:

List<?> integers = new LinkedList<?>();         // Won't compile!
List<?> integers = Lists.<?>factory();          // Won't compile!
class AnyList implements List<?> {}             // Won't compile!

Bounds

Bounds are used to make sure that generic parameters are subtypes of a specified type.

// The generic parameter of Dao must extend (or implement) Entity, and Entity must have a getId method!
public class Dao<T extends Entity> {  
   T createOrUpdate(T entity) {
       if (entity.getId() != null) {
           return update(entity);
       } else {
           return create(entity);
       }
   }
}    
// Using Dao
Dao<Person> personDao = new Dao<Person>(); // Works fine since Person extends Entity!
Dao<String> stringDao = new Dao<String>(); // Won't compile since String does not extend Entity!

Bounds may also be used in more advanced ways. The example below is a simplified version from java.util.Collections and shows a recursive bound. The generic parameter T is also used inside the bound Comparable<T> to make sure that the objects contained in the collection are comparable amongst themselves.

// The method max takes a collection whose elements must be subtypes of Comparable.
// In addition, the elements must be comparable to their own declared type.
public static <T extends Comparable<T>> T max(Collection<T> collection) {
   T currentMax = collection.iterator().next();
   for (T element: collection) {
       if (currentMax.compareTo(element) < 0) currentMax = element;
   }                                                          
   return currentMax;
}
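A hypothetical call (my example, not from the original): String implements Comparable<String>, so T is inferred as String.

// Using max
List<String> words = Arrays.asList("pear", "apple", "quince");
System.out.println(max(words));                 // prints quince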

By far the most difficult generic declaration comes from java.lang.Enum and looks like this: class Enum<E extends Enum<E>> implements Comparable<E>. Like the above declaration this is a recursive bound, but it is even more constrained. The key to understanding it is to know how enums are implemented in Java.

// Declaring an enum
enum Tapir {MALAYAN, BRAZILIAN, BAIRD, MOUNTAIN}

// creates a class similar to this!
public final class Tapir extends Enum<Tapir> implements Comparable<Tapir> {                      
   private Tapir(String name, int ordinal) {super(name, ordinal);}
   public static final Tapir MALAYAN = new Tapir("MALAYAN", 0);
   public static final Tapir BRAZILIAN = new Tapir("BRAZILIAN", 1);
   public static final Tapir BAIRD = new Tapir("BAIRD", 2);
   public static final Tapir MOUNTAIN = new Tapir("MOUNTAIN", 3);
   private static final Tapir[] VALUES = {MALAYAN, BRAZILIAN, BAIRD, MOUNTAIN};
   public static Tapir[] values() {return VALUES.clone();}
   public static Tapir valueOf(String name) {
       for (Tapir t : VALUES) if (t.name().equals(name)) return t;
       throw new IllegalArgumentException();
   }
}

As can be seen in the code above, E extends Enum<E> maps to Tapir extends Enum<Tapir> and Comparable<E> maps to Comparable<Tapir>. This makes sure that enums of one type can only be compared with enums of the same type. The innermost generic parameter makes the subclass's type available to the superclass, allowing it to declare methods whose parameter and return types are those of the subclass.
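The same trick is useful outside of enums. Below is a minimal sketch of my own (the Builder names are illustrative, not from any library) of a superclass whose fluent methods return the concrete subclass type:

// The recursive bound lets a method declared in the superclass
// return the type of the subclass.
abstract class Builder<B extends Builder<B>> {
   abstract B self();
   B verbose() { return self(); }               // returns B, not Builder
}

class HttpBuilder extends Builder<HttpBuilder> {
   HttpBuilder self() { return this; }
   HttpBuilder url(String url) { return this; }
}

// new HttpBuilder().verbose().url("http://..."); // chains without casts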

Generic parameters may also have multiple bounds. The signature of the simplified example above actually looks like this:

// Actual signature of max from Java Collections
public static <T extends Object & Comparable<? super T>> T max(Collection<? extends T> collection);

When multiple bounds appear, the leftmost bound is used for erasure. The Object in the signature above keeps the erased signature backward compatible: the pre-generics version of max returned Object. The super and the extends are there for the same reason as before, to make the method as widely applicable as possible.

Erasure

Java Generics is implemented by erasure. This means that the generic information is removed when the class is compiled:
  • The erasure of List<Integer>, List<String>, List<List<Integer>> is List.
  • The erasure of Comparable<? super T> is Comparable.
  • The erasure of Object & Comparable is the leftmost, Object.
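A quick way to observe erasure at runtime (a check of my own, not from the original):

// Both lists share a single runtime class; the type arguments are gone.
List<Integer> ints = new LinkedList<Integer>();
List<String> strings = new LinkedList<String>();
System.out.println(ints.getClass() == strings.getClass());   // prints true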

Another thing to be aware of is that the implementation of generics with erasure forces the compiler to insert additional methods, so-called bridge methods, into the class files.

// Additional methods are compiled into generic classes
public interface Foo<T> {
   void foo(T param);
}

public class Bar implements Foo<Bar> {
                          
   // This method will appear twice: once with Object as parameter and once with Bar.
    public void foo(Bar param) {}
   
   public static void main(String[] args) {
       for (Method m : Bar.class.getMethods())
           if (m.getName().startsWith("foo"))
               System.out.println(m.toGenericString());
   }
}
                              
$ java Bar
public void Bar.foo(Bar)
public volatile void Bar.foo(java.lang.Object)
                                                     

This can trip you up if you try to overload a method to accept Object as a parameter too. It has never been a good idea to overload on both Object and a subclass of Object, but now it will not even compile:

Error:line (6)name clash: foo(java.lang.Object) in Bar and foo(T) in Foo<Bar> have the same erasure, yet neither overrides the other
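For concreteness, here is a sketch (the class name is mine) of the kind of overload that triggers the error:

// Won't compile: foo(Object) has the same erasure as the inherited foo(T),
// yet it does not override it.
public class Baz implements Foo<Baz> {
   public void foo(Baz param) {}
   public void foo(Object param) {}             // name clash!
}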

Compatibility

All in all, the implementation of generics in Java is a wonderful example of craftsmanship. The solution is binary compatible both backward and forward, allowing new code to use old libraries as well as allowing old code to use new libraries. I do, however, wish that they had skipped the compatibility and made generic classes aware of what they are at runtime.

If you want to know more about generics I highly recommend the book Java Generics and Collections by Maurice Naftalin and Philip Wadler.

Friday, October 27, 2006

Sausage and Egg Burrito

Philip Wadler succeeded with the impossible: he made me buy another Java book, Java Generics and Collections. The talk was a historical account of how Haskell came about and how the ideas from Haskell were later put into Java Generics. The highlight was when Philip Wadler declared that "this is a case for lambda", ripped off his shirt and revealed his superman costume with a lambda instead of an S. The same techniques have also been put into Links, a language for creating multi-tier web applications.

The onward track presented two papers: The Commensalistic Software System was a first try at implementing Dick Gabriel's and Ron Goldman's Conscientious Software. The paper deals with Self-Healing and Self-Protection from Autonomic Computing.

Collaborative Diffusion: Programming Antiobjects was about reversing the responsibility between an object and its environment by letting the environment do things that are usually done by the object. Diffusion means spontaneous spreading; in this context a diffusion function is used to calculate the relation between objects.

Martin Rinard talked of Minimizing Understanding in the Construction and Maintenance of Software Systems. This was probably the best presentation of the week. Martin said that there are three methods of reasoning about computer systems: rationalism, empiricism, and cluelessness. Contrary to what you may think, Martin suggested that selective cluelessness is the way to go. The reason is that we are always working at the edge of our cognitive understanding, and if we want to build more complicated things we need to let go of our need to understand everything. The common approach of searching for truth, elegance and understanding is too hard for the common programmer. The simple fact is that nobody understands a whole system! The way to get things done is by brute force instead of elegance, because elegance takes too long. We have to plan to live with errors. Programs need to be acceptable, not correct.

Martin and his team have implemented a compiler that ignores addressing errors, stops memory leaks by cyclic allocation and removes infinite loops by cutting them off after a specified number of iterations. They tested the compiler on Apache, Sendmail, etc., and the programs worked better than they did before, without any noticeable errors. We need to give up correctness for robustness! Martin even went so far as to say that if a system must stay alive, we should wrap all our calls in try-catch-ignore clauses.

Wednesday, October 25, 2006

Mount Hood Skillet

Eggs, onion, sausage, bacon, peppers, jack cheese and cilantro make a very tasty breakfast.

Dr. Guy L. Steele Jr. presented his new programming language Fortress. It is loosely based on Fortran in that it is intended as a mathematical language. It was created with the intent of doing for Fortran what Java did for C: increase programmer productivity! The language is parallel by default, but you can easily create sequential constructs. It takes a minimalist approach, with a minimal kernel explicitly designed to let you create powerful libraries that blend in with the language. Fortress has explicit support for version control to allow early design mistakes to be corrected.

The language makes full use of Unicode, allowing you to use whatever mathematical notation is appropriate. A wiki-like markup is also supported if you are restricted to ASCII input. This syntax is, however, just a view of the actual program representation. Fortress has stolen features from multiple languages. The type system is based on traits, which is basically multiple inheritance through the use of mix-ins, as in Ruby. Everything, including primitives, is first-class in the language. Type inference and monads come from Haskell, and contracts from Eiffel. Apart from contracts, special constraints are also supported, and tests may be included with the code to make sure that the code conforms to the constraints.

Fortress supports operator overloading, but discipline is encouraged since they do not want the overloading hell that C++ has. Much of the parallelism is achieved by generators and reducers.

Intentional Software and Charles Simonyi are creating a Language Workbench that will allow programmers and domain experts to create DSLs that the domain experts can use to describe their knowledge as code. The programmer will then write code generators that convert the DSL into source code that can be compiled and run. The DSL is represented as an internal structure that may be viewed and edited in many different notations. Examples of notations are: Sun Coding Standard, mathematical, spreadsheets and BNF. I like the approach, but I have no idea when these tools will become available to us as programmers.

Pegasus: First Steps Towards a Naturalistic Programming Language wants to allow programming in natural language, so that you can program in different human languages. Like Intentional Software, they are using an internal structure to represent the program. Their approach seems to be pretty far from reality though.

Joshua Bloch talked about API design. It was a good talk but he didn’t mention anything new. A good API should be:
  • easy to learn
  • easy to use
  • hard to misuse
  • easy to read (code that is written to it)
  • easy to evolve
  • appropriate to audience
A good API should be:
  • as small as possible
  • unaffected by implementation details
  • well documented: for a class, what an instance represents; for a method, pre/post-conditions and side-effects; for a parameter, units, form and ownership.
  • concise, the client should not have to do anything the API can do.
  • as immutable as possible.

Panel: Aspects and/versus Modularity raised the issue of DRY vs clear semantics. It made me more convinced that AOP is right. I’ll start using it as soon as possible!

Country Breakfast

The morning started with an invited speaker, Brenda Laurel, who spoke of how to use ubiquitous distributed computing to get in more contact with nature. Interesting quotes were "Look for the void" and "Look where nobody is doing anything". Interesting thoughts!

After the break, Dick Gabriel and Ron Goldman talked about Conscientious Software. Dick Gabriel is a big man in the OOPSLA community and he wrote an article many years ago called Lisp: Good News, Bad News, How to Win Big, which I just happened to read on the way over here! In it he explained a philosophy called Worse is Better. Very interesting!

The presentation was a mix of biology, music, city planning and design. A very thought-provoking presentation that succeeded in making me think about the inherent problems of complex programming. The paper the presentation relates to is about software that needs to be aware of its environment and of itself.

The afternoon I started with Framework and Tools, which contained three research papers:

A Framework for Implementing Pluggable Type Systems was about creating new annotation-based type systems for Java that would allow you to specify more precise types such as @NonNull and @ReadOnly.

Design Fragments Make Using Frameworks Easier. The reason for design fragments is that programmers copy and paste code, and many things may go wrong: the code may be wrong to start with, or the context may not be the same as the one it is reused in. One good idea that was presented was to have a repository of code fragments indexed by name and context. However, the idea of letting programmers reuse code fragments by allowing them to bind certain variables in a fragment does not seem workable to me. I would prefer to solve the same problem with macros and multiple inheritance.

JTL – The Java Tools Language is basically a query language for posing structured queries over code. It is meant to be used by tools such as IDEs, and as an aspect query language for finding join points. I don't think it will be used in practice.

In the late afternoon I watched a panel about Ultra Large Scale Systems with Steven Fraser, Gregor Kiczales, Ricardo Lopez, Peter G. Neumann, Linda Northrop, Martin Rinard, Douglas Schmidt, and Kevin Sullivan.

The solution to ULS is not about formal specification, but more about creating an environment that allows heterogeneous systems to communicate. Conway's Law states: "Any piece of software reflects the organizational structure that produced it. Two software modules A and B cannot interface correctly with each other unless the designer and implementer of A communicates with the designer and implementer of B." This is a real problem in ULS design, since there are too many people involved to allow efficient communication. There also seems to be a need for new abstractions, but nobody knows what these abstractions should be. Overall, there seems to be a huge problem in this area of computing.

Tuesday, October 24, 2006

Sausages, Eggs and Biscuits

To my disappointment the Monday tutorial on Dynamic Programming at the Model Level was no longer on the program. I have contacted the presenters to find out why…

Instead I went to Refactoring Databases: Evolutionary Database Design with Scott Ambler and Pramod Sadalage. It was quite good, but there was too much agile propaganda.

Data is important, and because it is, it should not be treated as a second-class citizen but should be included in the normal development and refactoring process. The current state of the database world is not good: there are no tests and no evolutionary process for caring for the data.

The main idea of evolutionary database design is to let every developer have his own database, allowing him to make changes without disturbing the flow of everyone else. Scripts to set up the database and populate it with base data should be available to every developer. These scripts are the database schema and should be updated and versioned as development goes along. An article by Martin Fowler discussed this issue about three years ago.

While updating the database scripts, migration scripts should also be added to enable transitions from one version of the database to another. Tools for simplifying migrations are MIGRATEdb and the Sundog Refactoring Tool. The migration scripts should be checked into the repository along with the code changes and be run by the continuous integration tool.

For migrating production databases, a strategy of adding additional columns and tables for a transitory period should be employed. The redundant data should be kept in sync by database triggers. After the transition period has passed, the redundant columns should be removed.

The afternoon tutorial was Concurrent Object-Oriented Programming for Modern Architectures, about a second-generation OO language called X10. It is a new language, but it reuses most of the Java language to ease the transition from Java and to take advantage of Java's strengths. The language is designed to perform well on clusters and parallel hardware. X10 takes Java, removes threading and dynamic class loading, and adds asynchronous constructs:
  • async stmt – runs stmt asynchronously.
  • finish stmt – waits for statements started by stmt.
  • foreach stmt – runs stmt in different threads.
  • atomic stmt – performs stmt inside a transaction.

X10 uses a global address space where objects reside in places and never move between them. The places may be local or remote.

I ended the day with "Data Refactoring for Amateurs" by Avi Bryant, creator of Seaside, the continuation-based web framework for Smalltalk. Avi had a number of refactorings that end users need to do while working in a spreadsheet-like environment. They are: Assign Type, Limit Options, Extract Table, Move Field to Related Table, Merge Fields and Invert Relation.

Monday, October 23, 2006

Grand American Breakfast

Eggs, bacon, hash browns and country sausages are another great way to start a day.

PLoP continued Sunday morning with patterns for Extensible Program Design. The patterns are about efficiently representing code for fast browsing in an IDE. These patterns are very interesting and relate to Eclipse and IntelliJ as well as to true code browsers such as Squeak.

A Writers' Workshop consists of eight parts:
  1. The author describes what parts he wants the review to focus on.
  2. The author reads a selected part of the pattern.
  3. The author becomes a “fly on the wall”. He can listen but not participate.
  4. The reviewing members summarize the pattern to make sure everyone has the same understanding of it.
  5. The praise section is the part where all positive aspects of the paper are discussed.
  6. The improvement suggestion section is the part where all the negative critique is discussed in a positive way to help the authors improve the paper.
  7. The author is let back in to ask clarifying questions, NOT to explain himself.
  8. The author is applauded for having the courage to be scrutinized.

The afternoon contained patterns for Device Drivers and patterns for Error Containment. The device driver patterns were nothing special, but the error containment patterns by Bob Hanmer were really interesting. The patterns reviewed were Error Containment Boundary and Marked Data, but they were just two patterns from a larger collection about error containment. A lot of wisdom here. I'm looking forward to seeing the full collection in the future.

Sunday, October 22, 2006

Mexican Burrito Breakfast

A filling breakfast of fried potatoes and a burrito filled with scrambled eggs, peppers, onion and spicy meat.

Pattern Languages of Programming (PLoP) with Ward Cunningham, Guy Steele, Ralph Johnson, Linda Rising, ... We sat in a room and talked about patterns that people had sent in, giving emphasis to what was good and what could be improved. A great way to learn new things. The first pattern, Mutator, was a non-pattern that could be replaced by Iterator, but the other patterns, about documenting frameworks, were quite interesting. The discussion gave more than the patterns themselves, though. Definitely worth looking into.

Friday, October 20, 2006

OOPSLA Yiiiihaaa!!!

I’m finally going to OOPSLA. One of the great things about having to travel for twelve hours is that I can catch up on my reading. So far I managed to finish off two great articles that have been lying around for ages: Growing a Language by Guy L. Steele Jr., who will be speaking about his new language Fortress at OOPSLA, and Lisp: Good News, Bad News, How to Win Big by Richard P. Gabriel.

I also managed to pack a couple of books: The Algorithm Design Manual which seems to be entertaining as well as comprehensive, and the second edition of Programming Language Pragmatics which has been updated to include scripting languages. This should suffice for both there and back.

Sunday, October 01, 2006

Higher Order Functions

The last few years I have been spending my spare time with languages like Scheme, Ruby, Haskell and Erlang. The reason I'm doing this is that the expressiveness of these languages is a lot higher than that of my work language, Java; I'm able to do more with less code. And the reason for this is higher order functions. Higher order functions are enabled by having functions as first-class citizens of the language.

Map

The first of the higher order functions is map. Map takes two arguments, a function and a list, and returns a new list with the result of applying the function to every element of the list. Map raises the abstraction of list iteration by allowing me to focus on converting one element of this into one element of that, instead of worrying about how to convert a list of this into a list of that. It simplifies my job.


; Map in Scheme
(define (map f l)
 (if (null? l)
   '()
   (cons (f (car l)) (map f (cdr l)))))

>(map plusone '(1 2 3 4))
(2 3 4 5)

-- Map in Haskell
map f []     = []
map f (x:xs) = f x : map f xs

> map (+ 1) [1..8]
[2,3,4,5,6,7,8,9]

Filtering

Map always returns a list of the same length as the one given as argument. Sometimes I only want the elements that fulfill some criterion to be returned. This calls for filtering.

; Filter in Scheme
(define (filter p l)
 (cond
   ((null? l) '())
   ((p (car l)) (cons (car l) (filter p (cdr l))))
   (else (filter p (cdr l)))))

>(filter even? '(1 2 3 4 5 6))
(2 4 6)

-- Filter in Haskell
filter p [] = []
filter p (x:xs)
 | p x = x : filter p xs
 | otherwise = filter p xs      

> filter odd [1..8]
[1,3,5,7]

Reduce

Another thing commonly done with lists is to combine the elements into one value. This can be done with reduce.

; Reduce in Scheme
(define (reduce op l)
 (if (null? (cdr l))
   (car l)
   (op (car l) (reduce op (cdr l)))))

>(reduce + '(1 2 3 4))
10
>(reduce - '(1 2 3 4))
-2

-- Reduce in Haskell
reduce op (x:[]) = x
reduce op (x:xs) = (op x (reduce op xs))

> reduce (+) [1..4]
10

Compose

We now have the ability to transform lists by mapping a function over every element or by filtering the elements according to some predicate function. We are also able to reduce a list by combining the elements with a binary function. We would also like the ability to create new functions by combining two functions, and this is where compose comes in:

; Compose in Scheme
(define (compose f g)
 (lambda (x)
   (f (g x))))

; Add as a higher order function
(define odd? (compose even? plusone))

> (odd? 3)
#t

-- Compose in Haskell
compose :: (b -> c) -> (a -> b)  -> (a -> c)
compose f g = \x -> f (g x)

-- Odd as a higher order function
odd = compose even plusone

> odd 3
True

It is important that the type signatures of the functions given to compose match, as can be seen in the type of the Haskell version above: the type returned by function g must be the input type of function f.
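For example (a small check of my own), length produces the Int that show consumes, so the two compose cleanly:

-- length :: [a] -> Int, show :: Int -> String
len_str :: [a] -> String
len_str = compose show length

-- > len_str [1, 2, 3]
-- "3"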

Currying

Another method of creating new functions is currying1. Currying means creating new functions by supplying only part of the required arguments to a function. Instead of failing, a new function is created. This is directly supported in Haskell, but must be done explicitly in Scheme (or by use of macros).

-- Currying in Haskell  
-- + is a binary function, but given only one argument it returns a new function (a closure).
plusone = (+ 1)

> plusone 3
4   

; Currying in Scheme
(define (curry f . args)
 (lambda x
   (apply f (append args x))))

> (define plusone (curry + 1))
> (plusone 3)
4

Const

In order to combine useful functions it is sometimes necessary to have simplifying functions, and const is as simple as they come. const returns the same value no matter what argument it is given.

; Const in Scheme
(define (const k)
 (lambda (x) k))

> (define five (const 5))
> (five 6)
5

-- Const in Haskell
const k = \x -> k

five = const 5

> five 6
5


Even Higher Order Functions

All the functions above (and there are more in any good book on functional programming) give me the ability to produce many other functions simply by combining them.

; Length in Scheme
(define length
 (compose (curry reduce +) (curry map (const 1))))

> (length '(1 2 3 4))
4


-- Length in Haskell
length = compose (reduce (+)) (map (const 1))

> length [1, 2, 3, 4, 5, 6]
6
      

To be able to raise the level of abstraction like this is what makes programming fun and I long for it when using rigid languages like Java.

1 The technique was named by Christopher Strachey after logician Haskell Curry, though it was invented by Moses Schönfinkel and Gottlob Frege.

Thursday, September 21, 2006

Textile Quick Reference

I started using Textile and Enscript for blogging but found no quick reference for Textile, so I wrote one.
h1. H1
h6. H6
bq. blockquote
[1] footnote ref
fn1. footnote def
_word_ em
*word* strong
??word?? cite
@word@ code
-word- deletion (strike-through)
+word+ insertion (underline)
^word^ superscript
~word~ subscript
p<. align left
p>. align right
p=. align center
p<>. align justified
* bulleted list (may be nested with **)
# numbered list (may be nested with ##)
"Google":http://google.com Link to google.com named Google
"Google":alias Link to alias named Google
[alias]http://google.com Alias definition of alias
!image! Image
!image!:http:// Image with link
!>image! Image aligned right
CSS(Cascading Style Sheet) acronym

Sunday, September 10, 2006

Closures in Dolphin

A proposal for adding closures to Java has been crafted by none other than Gilad Bracha, Neal Gafter, James Gosling and Peter von der Ahé. With proponents such as these, I have high hopes that Sun will finally see reason and include it in JDK7. Our state machine:
interface Condition {
    boolean isFulfilledBy(Event e);
}

interface Action {
    void perform();
}

interface State {
    void addTransition(Condition condition, Action action, State nextState);
}
is used like this to set up two transitions:
bootState.addTransition(
         new Condition() {
             public boolean isFulfilledBy(Event e) {return errorService.hasErrors();}
         },
         new Action() {
             public void perform() {logMessage();}
         },
         errorState);
bootState.addTransition(
         new Condition() {
             public boolean isFulfilledBy(Event e) {return ((BootEvent) e).isStarted();}
         },
         new Action() {
             public void perform() {displayIdleScreen();}
         },
         idleState);
It can now be cleaned up to look like this:
bootState.addTransition((Event e){errorService.hasErrors();}, (){logMessage();}, errorState);
bootState.addTransition((Event e){((BootEvent)e).isStarted();}, (){displayIdleScreen();}, idleState);
Since most of our flows consist of many states and even more transitions, our code will become a lot more compact and easier to read. The code above will work without any changes to our state machine API. The reason for this is called closure conversion, and it basically allows you to use any class with one abstract method with closure syntax. The typical usage is interfaces with one method, but abstract classes with one abstract method will probably also be allowed. I'm looking forward to Dolphin. The verbs are finally making their way into the kingdom of the nouns.

Friday, September 08, 2006

Six Months with Eclipse

The last six months I have spent working exclusively with Eclipse on an OSGi-based project. My editor of choice is IntelliJ IDEA, but since everyone on the project is using Eclipse and its OSGi features, I figured I would give it a go without complaining (too much).

The project makes heavy use of OSGi components, and since Eclipse is built from these components it should fit like a glove. OSGi component metadata, such as dependencies and published interfaces, is defined through extensions to the standard jar-manifest file. Eclipse works well with these components and parses the manifests and updates the classpaths accordingly. All should be well, but I miss IntelliJ every day.

I am a keyboard shortcut fanatic and I like to be able to do everything without using the mouse. The first thing I did when I decided to go with Eclipse was to learn as many shortcuts as I could, print out a couple of quick reference guides and remap mappings I did not like, and I am now fairly fluent in Eclipse. Eclipse has a lot of wonderful features and plugins (such as a multi-clipboard), but IntelliJ excels. I can't say that everything is better in IntelliJ, but it is better at the little things that matter. It anticipates what I want to do and helps me do it when I want to. It behaves the same way no matter what kind of file I am editing. There are no surprises! If I think of something and try it out, it usually works the first time.

To illustrate my point I'll give a few examples. Cut text (Control-X): cuts the selected text into the buffer, but if no text is selected it works on the current line.

Context-Selection (Control-W) inside a string: press once: a word delimited by non-alphanumerics is selected; again: the full string without quotes; again: the full string including quotes; again: all parameters of the enclosing method call; again: the enclosing method call excluding the semi-colon; again: the enclosing method call including the semi-colon; again: the full line; again: all lines in the block; again: the enclosing block… And it works the same way inside XML, Javascript, JSP etc. If you try this in Eclipse there is no telling what will happen, depending on file and context.

Go to class (Control-N), go to file (Control-Shift-N): when you, while typing the search criteria, remember that it is not a class but an XML file you want to look up, you just press Control-Shift-N and IDEA remembers what you typed and starts looking for a file corresponding to it.

These are just a few little things that make a big difference. There are hundreds!

Eclipse is free! It is Open Source! It has loads of usable and cool plugins! I want to want to use it, but I don't, since it is not good enough.

The Rise of Anarchy

At the end of the second millennium, Gregorian time, there was one major method that ruled the software industry. This method was known as the Common Rigid Artificial Process: CRAP. CRAP was a unification of three moderately successful competing methods that specialized in Object Oriented Design. The competition kept their methods from becoming overly bureaucratic.

At some point in time, the founders of the three methods, the Three Bandidos as they became known, decided that they would profit more from creating a monopoly than from competing. They assembled their methods and it became CRAP.

CRAP dominated the industry, because the three bandidos were hugely successful in both modeling and marketing, and all the major industries quickly switched to it.

Now the healthy competition between the bandidos was gone and they reveled in bureaucracy. Thousands of people were using CRAP without getting anything done. If you asked anyone what they did today, they would say they produced some CRAP document that was required by the method, and at the end of the day no one remembered what the goal was in the first place (to produce software). The bandidos didn't mind, since their goal was to make money and not software, and they became wealthier and wealthier. Their future looked bright.

A New Beginning

But at an AA meeting in Utah a number of anarchists started a reactive movement against CRAP. They were fed up with producing artifacts of no value instead of real working software. They wrote the Anarchistic Manifesto and created the Anarchistic Alliance. They called it the Anarchistic Alliance so they could keep referring to it as AA in their correspondence, since they were afraid of repercussions from the CRAP affiliates.

Because they were anarchists, they could not agree on one single method that should suit all. Instead they decided to let everyone define their own method as they saw fit. The important thing was that the method had to be built on individual cooperation, instead of totalitarian hierarchies. Every individual had to take responsibility for what they were doing! No one could tell anyone else what to do! When large systems had to be produced, it had to be done through the voluntary formation of teams that would voluntarily cooperate with other teams. Informal leadership would be the rule, instead of the assigning of roles.

Before the CRAP people could do anything about it, the anarchists had infiltrated many of the small businesses that were now being targeted by CRAP. The anarchistic methods worked so well that no one was willing to give them up in favor of CRAP. To stop the spread of the anarchistic methods, CRAP started to make unfounded claims about them. Since the methods were currently only used in small and a few medium businesses, CRAP propagandists claimed that they would not scale. This claim was successful at first, since many people in the major industries had invested considerable amounts of time in becoming experts of the CRAP bureaucracy.

The Turning Point

Even though the propaganda worked for a while, many of the medium businesses that used CRAP went bankrupt. They did so because the software they produced was expensive and inferior to software produced by smaller but more effective firms. Soon, all the medium businesses threw CRAP out along with CRAP consultants.

At this time the large industries were still unaffected by the anarchists. They were buried too deep in bureaucracy and were also dependent on CRAP tools. This is where the CRAP people made a fundamental error. Instead of keeping up appearances and continuing to claim that anarchistic methods do not scale to large businesses, they flip-flopped: they claimed that CRAP was anarchistic! What a gargantuan mistake! Instead of making the medium industries move back to CRAP, it made the major industries realize that they had been fooled all along!

The major industries threw CRAP out as soon as they could. The expensive CRAP tools were replaced by cheaper and better alternatives and the rest, as they say, is history!

1. Anarchy, contrary to popular belief, means "absence of ruler" and not chaos.