Friday, December 11, 2009

Why #twitter sucks!

The problem with Twitter is information overflow. The reason for this overflow can be summarized in one simple sentence.

What are you doing right now?

That sentence opens up a whole can of worms.

It is totally egoistic. "What are you doing right now?" doesn't say that what you are writing should, preferably, be interesting to someone other than you.

It doesn't say: You are sharing this with the rest of the world and you should at least think twice before you write "I am eating a donut".

"Right now", is devastating. It encourages you to write something without thinking. If it was to be taken literally, "I am publishing an uninteresting message on Twitter" would be the content of 90 percent of the tweets.

One thing it does not say is "Let's have a public dialog", but that doesn't stop most people from satisfying their IDOL-wannabe souls by publishing their personal dialogs without thinking twice about spamming everyone who is following them. Direct messages exist, even within Twitter, if you are not happy with other instant messaging tools or Skype.

So what can be done about Twitter?

Probably nothing, since the current behavior is so ingrained in people's minds that it will be impossible to change. But here are a few ideas:

A new, better introductory question. Instead of "What's happening?", maybe:

What would you like to share with the world?

Think TED! What's an idea worth spreading? What would you like to share with the world?

It implies that what you write should be interesting to at least someone other than yourself.

It doesn't have to be right now. It is even preferable that it is something you have been thinking about for a while.

If it is a conversation with someone else, please make it a conversation with someone else, not a public display of your stupidity (or intelligence, if you prefer that)! Use Direct Messages or IM-tools.

If the tweet cannot be said in 140 characters, you have failed. Twitter is a micro blog and if you are not able to say what you want in 140 characters, please write it up in a normal blog and tweet a link to it. Sending five consecutive messages on the same topic only shows that you have failed to condense your message down to its essentials. If it cannot be said in 140 characters, don't tweet it, blog it!

I am aware that not everyone thinks that Twitter sucks, and that some will say I am not using it properly, that I should not look at old tweets, etc., etc. But this is how I want to use it.

Filters

I am therefore working on a filter for filtering out all the crap. The filter goes under the obvious name no-crap, and it currently contains four filters; a sketch of two of them follows the list.

  • A Bayesian filter works like a personal spam filter and lets you classify individual tweets as crap or not.
  • A stop-word filter blocks tweets containing specified words, such as Viagra.
  • A max-per-time filter stops the tweeters who tweet tons of interesting things, but do it at the expense of others. You can, for example, set this filter to a maximum of five tweets per day.
  • A duplicate-per-time filter stops the re-tweet overflow that can happen when something interesting comes along. It compares tweets and does not resend those above a given similarity threshold.
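
To make this concrete, here is a minimal sketch, in Javascript, of how the stop-word and max-per-time filters might work. The function names and the tweet shape ({user, text}) are my own assumptions, not the actual no-crap code.

// Hypothetical sketch, not the real no-crap implementation.
// A filter is a function that takes a tweet and returns true if the tweet may pass.
function stopWordFilter(words) {
 return function(tweet) {
  return !words.some(function(word) {
   return tweet.text.toLowerCase().indexOf(word.toLowerCase()) >= 0;
  });
 };
}

// Allow at most max tweets per user within a rolling window of millis.
function maxPerTimeFilter(max, millis) {
 var windows = {};
 return function(tweet) {
  var now = new Date().getTime();
  var w = windows[tweet.user];
  if (!w || now - w.start > millis) w = windows[tweet.user] = {start: now, count: 0};
  w.count += 1;
  return w.count <= max;
 };
}

// Usage: no Viagra, and at most five tweets per user and day.
var filters = [stopWordFilter(["viagra"]), maxPerTimeFilter(5, 24 * 60 * 60 * 1000)];
function passes(tweet) {
 return filters.every(function(filter) { return filter(tweet); });
}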

This is currently a hobby project, so it does not get much attention. I invite anybody to steal my ideas and merge them into the current Twitter clients.

Sunday, November 29, 2009

Working at Jayway

This morning I woke up singing, like I do most mornings. :D There are so many things ahead of me, and I like doing most of them. One of those things is going to work. I worked at Jayway for five years, left three years ago, and recently came back.

The reason I left was that I wanted to work with one product and one team. I wanted to do everything right: I wanted to use pair programming, domain-driven design and test-driven development. I had many plans. It didn't turn out that way; the people on the team I worked with did not want to use pair programming, DDD or TDD! :(

After a few years I gave up and came back to Jayway, and I love it. The company has grown quite a bit while I was away, and that is a good thing. Three years ago I probably thought that it was a bad thing. It isn't! Three years ago we had to come into a company as resource consultants, but now we often come in as a team or get to do the project in-house. This is a really good thing. Getting to work with other Jayway people is a real joy. They are smart, motivated and pragmatic. If I leave the project for a week, someone steps into my role and everything works out fine. Individuals and interactions over processes and tools is very, very true!

There are many reasons why I think it is so nice to work at Jayway.

Natural authority is a pattern from Adrenaline Junkies and Template Zombies that states:

One meaning of an authority is a person who knows a great deal about something. Another meaning, in authority, is a person who is in charge. If someone who is an authority also is in authority, this is a natural authority.

The person who knows best gets to make the decisions. This is the way it is at Jayway. You are not assigned to a project, you are asked if you want to work on the project, and it is OK to say no.

Management. My wife recently told me about a Danish entrepreneur, Lars Kolind. He is usually called in when a company is not doing as well as it should. He talked about something he calls Kolindkuren (the Kolind treatment). In this treatment he turns things upside-down. One specific thing he did was that, instead of asking managers who they wanted as employees, he asked the employees who they wanted as managers. It turned out that no one wanted many of the managers.

When I think about how this would work out if we did the same thing at Jayway, I don't think anything would change. I am happy with my managers. I don't see them as managers, I see them as colleagues who work to allow me to do the job I want to do. If I got to choose my boss, I would choose exactly the people we have right now. And I think we all feel the same way. An example of this came a few years ago.

Thomas, the president of Jayway, was fired by the board. What happened then was amazing: one week after Thomas was fired, 90 percent of the employees had resigned. If Thomas can't work for you, then we won't work for you, was the clear message. And it had effect: Thomas is back, the board is gone, and we are all happy.

Competence is the driving factor of everyone at Jayway. We all have different interests, but the common denominator is that everyone loves programming and wants to get better at it. Take a look at this week's competence workshops, and remember that they are all voluntary!

Competence Calendar

Openness is very important to me. If I know why a decision has been made, I can accept it even though I may not agree with it. At Jayway all the managers write a short daily mail about what they are doing during the day. Every week Thomas updates the wiki with what is going on currently and what is planned for the future. Everything that does not have to be secret is open and available. If you want to find it, it is on the wiki.

If a doctor wants to chop off my leg, I would be happier with the decision if I knew that he wants to do it because I have an incurable tumor in the leg, and not because he wants to practice his amputation skills.

So, that is how it is to work at Jayway (at least in my mind). If you feel that competence and humbleness are more important than fancy titles, come and join us.

Tuesday, November 24, 2009

Under the Hood of git clone

When you clone a git repository, everything is automatically set up to allow you to fetch, pull, and push to and from the remote repository, origin. But what is really going on? The remote is configured with a few lines of configuration in the config file inside the .git/ directory.

Here’s how it works:

Create a new repository, called base, add a file to it, then commit.
$ mkdir base;cd base;git init
Initialized empty Git repository in /Users/andersjanmyr/tmp/repos/base/.git/
$ echo foo > bar.txt
$ git add .
$ git commit -m initial
[master (root-commit) 548d762] initial
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 bar.txt
Clone this repository, called klon.
$ cd ..
$ git clone base klon
Initialized empty Git repository in /Users/andersjanmyr/tmp/repos/klon/.git/
Initialize a new repository, called kopy.
$ mkdir kopy;cd kopy;git init
Initialized empty Git repository in /Users/andersjanmyr/tmp/repos/kopy/.git/
The difference in configuration between the klon and the kopy.
$ diff klon/.git/config kopy/.git/config
7,12d6
< [remote "origin"]
<       fetch = +refs/heads/*:refs/remotes/origin/*
<       url = /Users/andersjanmyr/tmp/repos/base
< [branch "master"]
<       remote = origin
<       merge = refs/heads/master

To set up the newly created repository to work the same way the clone does, all I have to do is edit this file to make it look the same. But this is not what git does, so let's do it the git way.

Fixing the remote configuration.
$ cd kopy
$ git remote add origin /Users/andersjanmyr/tmp/repos/base
This adds the [remote "origin"] entry to the config file.
[remote "origin"]
        url = /Users/andersjanmyr/tmp/repos/base
        fetch = +refs/heads/*:refs/remotes/origin/*
Fetching from the origin adds the remote heads to .git/.
$ git fetch
remote: Counting objects: 3, done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From /Users/andersjanmyr/tmp/repos/base
 * [new branch]      master     -> origin/master
$ find .git/refs
.git/refs/
.git/refs/heads
.git/refs/remotes
.git/refs/remotes/origin
.git/refs/remotes/origin/master
.git/refs/tags

Now I can check out origin/master, but if I do it the normal way, the configuration will not be set up correctly to allow me to pull and push the way I can with the clone.

Check out the master version with tracking information.
# DON'T DO THIS, it does not add the tracking information to the config file.
$ git checkout -b master origin/master

# This adds the tracking information to the config file.
$ git checkout --track -b master origin/master
Branch master set up to track remote branch master from origin.
Already on 'master'
The following information is added to .git/config when --track is used.
[branch "master"]
        remote = origin
        merge = refs/heads/master

That’s it! Now the .git/config file looks the same as if I had done a normal clone, but let's continue. What do the entries in the config file mean?

Definition of the remote.
[remote "origin"]
        url = /Users/andersjanmyr/tmp/repos/base
        fetch = +refs/heads/*:refs/remotes/origin/*
Definition of the remote as git config commands.
$ git config remote.origin.url /Users/andersjanmyr/tmp/repos/base
$ git config remote.origin.fetch '+refs/heads/*:refs/remotes/origin/*'

The first part just declares the alias origin for the remote URL (or local path, in this case :).

The second part of the definition is more interesting. It sets up the refspec that will be used if you don't provide one on the command line. Since we usually don't provide a full refspec, most people don't know what it is, but this default is extremely useful. In case you don't know: the remote commands of git, push, pull, and fetch, take a refspec as their last parameter. It is just that we usually only refer to a small part of it.

Usage of the remote git commands.
git pull <options> <repository> <refspec>...
git fetch <options> <repository> <refspec>...
git push <options> <repository> <refspec>...

The format of a refspec parameter is an optional plus +, followed by the source ref src, followed by a colon :, followed by the destination ref dest.

It defines which destination ref dest should be updated by the source ref src.

Example definition of a refspec.
# +<src>:<dest>
# The local branch spike updates the remote-tracking branch origin/master
+refs/heads/spike:refs/remotes/origin/master

In our day-to-day usage of git, we usually don't use the full syntax of the refspec. Instead we just refer to simple names, like this.

Day-to-day usage of refspecs.
# Push the local branch to the remote branch with the same name
$ git push origin
# Pull the master into the local master.
$ git pull origin master
# Fetch the master of the origin and put the result in the remote experimental
$ git fetch origin master:refs/remotes/origin/experimental

The above really means:

Day-to-day usage of refspecs, expanded.
# Push the local branches to the remote branches with the same names
$ git push origin 'refs/heads/*:refs/heads/*'
# Pull the master into the local master.
$ git pull origin refs/heads/master:refs/remotes/origin/master
# Fetch the master of the origin and put the result in the remote experimental
$ git fetch origin refs/heads/master:refs/remotes/origin/experimental

From the above syntax, it is also possible to decipher the obscure syntax used when deleting a remote branch. Deleting is the same as pushing to a remote branch without giving a local branch.

Delete a remote branch.
# Delete the remote branch serverfix
git push origin :serverfix

Now, we are down to the last part of the configuration, the branch definition.

Definition of the branch
[branch "master"]
        remote = origin
        merge = refs/heads/master

The first part, branch.master.remote, tells git to use origin as the default remote if none is given for this local branch.

The second part, branch.master.merge, tells git which remote branch to use when merging. This also affects pull and fetch. Depending on your setting of push.default, it may also affect push.
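
Just like the remote, the branch definition can be created with git config commands instead of editing the file by hand.

Definition of the branch as git config commands.
$ git config branch.master.remote origin
$ git config branch.master.merge refs/heads/master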

Hopefully this has clarified some of the intricacies of git remoting. Just remember that if you make a mistake, you can always fire up an editor and edit the config file directly.

I’ll finish up with some more commands that can be used to get information about the remote.

Additional remote commands, to explore a remote.
# Show all remote branches
$ git branch -r
  origin/cucumber
  origin/customercare-0.6.x
  origin/master

# Show all remotes verbosely
$ git remote -v
origin  /Users/andersjanmyr/tmp/repos/base (fetch)
origin  /Users/andersjanmyr/tmp/repos/base (push)

# Show info about the remote
$ git remote show origin
* remote origin
  Fetch URL: /Users/andersjanmyr/tmp/repos/base
  Push  URL: /Users/andersjanmyr/tmp/repos/base
  HEAD branch: master
  Remote branches:
    experimental stale (use 'git remote prune' to remove)
    master       tracked
  Local branch configured for 'git pull':
    master merges with remote master
  Local ref configured for 'git push':
    master pushes to master (up to date)

# List the remote heads of the origin
$ git ls-remote --heads origin
548d7624f5385d36314e8ab61e61e8872c0bfe90        refs/heads/master

That’s it for today.

Tuesday, November 10, 2009

Adrenaline Junkies and Template Zombies

Since I had plenty of time to read on my flights back and forth to OOPSLA, I managed to read through a few books. One of them was Adrenaline Junkies and Template Zombies by Tom DeMarco et al. Being the sceptic that I am, my attitude when starting this book was: "Yeah, I know a lot of people have praised this book and I know that Tom DeMarco has written Peopleware, but what the hell do these bozos know anyway!" ;)

I figured that the six of them could probably scrape together a few decent people patterns, but I am a bit fed up with the whole pattern-template style, so my expectations were not as high as they should have been.

Anyway, the book is amazing. The pattern template is really lightweight: basically a name, an image, a statement, and a story. And the stories are good. The authors have a remarkable ability to observe human nature, and the ability to write about what they see.

Some Patterns

The book contains 86 patterns, most of them negative since, as one of the authors said on a podcast, negative patterns are funnier.

Dead Fish

A dead fish is a project that everyone knows, from the start, is impossible, but nobody says anything since that is the culture of the company. If you say something you may get responses like,

Prove it! Prove that it is not feasible that this project will succeed!

Or,

Are you a weenie or a layabout?

Projects like this are usually worked on until it is inevitable that they will be late, and then everybody acts surprised.

Nanny

A nanny is a project manager who is aware of the capabilities of her staff, nurtures them, and lets them work at their peak. The nanny is responsible for improving her staff and making responsible project citizens of them. A nanny enables workers to do their job.

Eye Contact

If your life depended on a project, would you want the team working close together or in different corners of the world? Human communication goes way beyond words, and it is foolish to think that it is possible to work at full speed with people who don't know each other and live in different places.

Dashboards

A dashboard is a highly visible board that enables the project members and others to get an instant view of the status of the project. It can be a web page or a board on the wall. What is significant about a dashboard is that it is updated as soon as anything important happens in the project. All project members care for the dashboard, since that is the place where they can find out how they are doing. Dashboards contain just the right amount of data.

Film Critics

A film critic is a project member, or someone related to the project, who thinks that they can succeed even if the project is a failure. They may often have good points, but usually when it is too late to do something about them. It is no use having a person like this on the team, since they are not really on the team if they don't go down with the ship.

Natural Authority

The word authority has several meanings. An authority is a person who knows a great deal about something. A person in authority is a person who is in charge. A person who is both an authority and in authority is a natural authority. This is the healthy pattern.

The White Line

The white line is the line on a tennis court. The line gives a clear signal whether the ball is in or out. Occasional disputes may arise over a questionable judgment call, but the line is respected by everyone. Most projects don't have this line. Create a white line for your project so that you know what is inside the project scope and what is not.

Silence Gives Consent

If someone doesn't oppose an idea, it is usually taken as consent, especially by people who come from different working areas, like management and programmers. Silent consent is not good for anyone, because nobody is sure what has been decided and what has not. A way to remedy this problem is to keep a list of commitments where it is written down who has promised what to whom. The Scrum sprint backlog is similar to such a list.

Time Removes Cards from Your Hand

The earlier a decision can be made, the better. If you know that a project will not make a deadline, the decision to change the scope should be made as early as possible, since that will allow the team to work on the most important features first. If a decision is delayed until it is too late, the decision is made implicitly by time.

Rhythm

Instead of being daunted by overwhelming tasks, projects with rhythm take small, regular steps thus establishing a regular beat that carries them toward their goal.

A journey of a thousand miles begins with a single step. --Lao Tzu

False Quality Gates

False quality gates are quality checks that do nothing to promote the quality of the project. They are a sign that more attention is concentrated on format than on content. It can be a glossary of terms with the wrong content, or a Word template where most of the sections are filled in with the words "This section must contain text to fulfill company guidelines".

Testing Before Testing

Testing before testing refers to the practice of thinking about how to test a feature at the same time as you think about how to implement it. If it cannot be tested, how can you know that it does what you say?

Cider House Rules

Cider house rules are rules written by someone unconnected to the project. Rules like this, that give no apparent value, are a burden to the project and therefore often ignored.

Nobody pays attention to them rules. Every year Olive writes them up and every year nobody pays any attention to them. -- John Irving, The Cider House Rules

Talk Then Write

The team makes decisions during conversation, and then immediately communicates the decisions in writing. This is a really important pattern that is easy to forget! A common place where decisions like these are written down is very helpful.

Practicing the End Game

A team that does not release often and regularly will often take a long time to make a release. The act of releasing a product should be practiced often so that it becomes a natural part of the project.

Data Quality

Data quality often sucks! A common solution is to attack the problem with technology instead of putting in the manpower that is actually needed to improve the quality of the data. Company websites are the prime example of this. They are often full of completely useless information.

Undivided Attention

Complex work is hard, and it requires undivided attention to be performed properly. Splitting people's attention over multiple projects will severely hinder their performance.

By doing two things at once, you've cut your IQ in half, and believe me, those 40 points can really make a difference. -- Dale Dauten, The New York Times

Orphaned Deliverables

Orphaned deliverables are deliverables that no one values enough to pay for. Always make sure that there is a sponsor for every artifact you are developing.

Hidden Beauty

Hidden beauty inside a program, that little extra thing that may seem unnecessary, is what shows that the creator cares about his work. It is not unnecessary! This caring is what separates a quality product from garbage.

I Don't Know

If you are afraid of saying the words "I don't know!", you are probably working in an organization where saying them means you will be taken for a fool. In a healthy organization, "I don't know!" means that you don't know, but that you will in a while, if it is important enough.

Loud and Clear

Having a clear goal that everyone agrees to is vital to a project. Clear goals allow you to refocus the project activity if it starts to move away from its purpose. A PAM statement, which contains the Purpose, Advantage, and Measurements of a project, is very helpful.

Conclusion

The patterns described above are just my short summaries, and I don't do them justice, but hopefully I have awakened your interest in this wonderful book.

It made my top-ten list and I recommend it to everyone from programmer to president. I like that the authors don't always provide a solution, but instead just describe the situation as they see it; it is up to you to figure out a solution to the problem yourself.

Saturday, November 07, 2009

The Craftsman Analogy

The analogy of software developers as craftsmen has become very popular. I don't know where it started, but the first book I read about it was the excellent book The Pragmatic Programmer by Andy Hunt and Dave Thomas. I really liked this analogy, it seemed right.

A few years later, Pete McBreen released the book Software Craftsmanship, in which he argues that software development should be more like craftsmanship. Pete writes eloquently about programming masters who should be paid ten times more than their apprentices because they are at least ten times more effective. The masters also have the responsibility to take on journeymen and apprentices to train in their particular flavor of software development. This also rang very true to me.

But, there is something fishy with this analogy. Something ain't right!

I have come to understand that the analogy refers to how the craftsman "industry" is believed to have worked ages ago, rather than how it works today. If you hire a carpenter today he may tell you that he will come "sometime next week". I have rarely been given a time more exact than a four-hour span, just to meet up.

After the meeting has been scheduled, I have to be very lucky if my craftsman actually appears at the appointed time. Most of the time he will not show up at all!

If I happen to be lucky enough to get a craftsman to show up, or to call and tell me that he can't make it, I make a note of this guy as being a highly reliable craftsman, worth hiring again. He gets a golden star just for calling to tell me that he won't come.

When we finally meet up, the craftsman may do a terrific job, and I will happily recommend him to anyone. But most likely we will talk and he will tell me that he doesn't have time to do the job right now, but that he can come back tomorrow. If you let him go with that, he most likely will not come back tomorrow, or the next day. He will come back when you call to remind him that he should have come back. And even better, if you pay him in advance you will never see him again, ever!

So, if our goal is to get programmers to be viewed as craftsmen, we're already there. Just change the statement "those f***ing carpenters never do what we expect!" to "those f***ing programmers never do what we expect!"

Maybe the whole idea of programmers as craftsmen is just:

It was better in the old days.

It wasn't. It is better now!

I have learned one thing through all my years of programming:

The more I learn, the more I learn how little I know. --Socrates

I'm not a software craftsman, I'm a humble programmer, a good one, and proud of it.

Thursday, October 29, 2009

OOPSLA 2009 Thursday, October 29th

Moving Fast at Scale, Lessons Learned at Facebook, Robert Johnson

Facebook has over 1 million active users per engineer.

Slowing down to get it right is not a good idea, unless you know for sure that your idea is right. If you try things fast, you can try out more things and you get feedback fast.

How do you get fast?

  • Never block developers
  • Give developers control. At Facebook developers code, test and deploy. No QA is involved.
  • Tune processes for speed
  • No deadlines, but the site cannot go down.
  • Frequent small changes, no delays, easier to isolate bugs.
  • All development in trunk with weekly releases.
  • Deployment tools
  • Gatekeeper
  • Do less work
  • Build tools
  • phpsh - interactive development
  • diffcamp - code review
  • XHProf - profiler
  • Scribe - moves data from server to central repository.
  • Hive - data warehouse infrastructure on top of Hadoop
  • Work with open source

How to deal with a lot of users?

  • Scale Horizontally
  • Web Server, the relations have moved to the web layer from the database.
  • Memcache
  • Database, only used as a persistence layer. No joins etc.
  • Thrift
  • C++, Java, Erlang, Python

Open Problems/Future Work

  • Languages for real-time data access, parallel, with lots of dependencies
  • Distributed indexing
  • Automatic clustering of data
  • Profiling of parallel data
  • Better ways of expressing client and server code; Javascript and PHP are not well matched

Toward Cloud-Agnostic Middlewares, E. Michael Maximilien

The Cloud Computing Landscape

  • Service Providers
  • Platform Providers
  • Service Brokers
  • Application Providers (SaaS)
  • Users

Challenges in Cloud Computing

  • Data lock-in
  • Application programming lock-in
  • Management lock-in
  • API lock-in
  • Increased risk
  • Security and Privacy
  • Catastrophic failures
  • Business models and impacts

Research Opportunities

  • Cloud middleware
  • Agnostic with respect to providers, frameworks and interfaces.
  • Learn Best Practices
  • Fluid cloud application deployment
  • Optimize cloud usage prices
  • Give an indication of cloud readiness

Use Cases

  • COBRA Java text analytics solution. (JavaEE)
  • CoScripter application (RoR)
  • JumpStart sMash application (WebSphere, Smash)

The Architecture

  • Core APIs
  • Cloud APIs
  • Cloud Adapters
  • Clouds and Brokers

IBM Altocumulus

Altocumulus is a configuration tool that helps you deploy your application to different clouds. It supports multiple service and platform providers, and it helps you configure your deployment so that it follows commonly known "good" practices. You can use it via the Web or via a RESTful API. It also supports Atom and RSS feeds. It is very interesting.

Lessons Learned

  • Cloud providers need to include some support services to be enterprise ready
  • Cloud providers need to expose flexible image creation facilities
  • Cloud virtual machines are mostly hardware based and their memory management is suboptimal
  • Best practice patterns work but need to evolve
  • Standardized hardware is important when implementing cloud infrastructures
  • Instance monitoring capabilities should come as part of the cloud facilities and APIs
  • The cloud space is still maturing

When Users Become Collaborators: Towards Continuous and Context-Aware User Input, Walid Maalej et al.

User input is critical for the success of projects.

Various types of input:

  • Field observation, Lead Users
  • Perpetual Beta, Legacy Documents, Usage Data
  • Issue and Bug Report, Enhancement Request, Feature Request
  • Workshop, Interview, Survey, Classification Requests

Built-in Feedback Mechanisms

Examples: Application Bug Reports, Application Usage Data, Discussion groups

Most applications don't have feedback mechanisms built into them.

Users are motivated by feedback, so if they submit a report and don't get any feedback, they will probably not do it again.

Wednesday, October 28, 2009

OOPSLA 2009 Wednesday, October 28th

Jeannette Wing, CMU, Frontiers in Research and Education in Computing

Jeannette said that there has been a paradigm shift.

Not just about computing's metal tools (transistors and wires), but also our mental tools (abstraction and methods)

Is this really a new paradigm shift? Maybe for the National Science Foundation (NSF); certainly not for anyone who has been working with software for the last ten years.

The limits of Moore's Law force the NSF to focus on programming languages and abstractions.

Three drivers of computing research: Society, Science, Technology.

Encouraged research areas:

  • Data Intensive Computing
      • Cloud Computing
      • Map-Reduce
  • Cyber-Physical Systems (computational core that interacts with the physical world)
      • Smart vehicles
      • Smart flyers
      • Smart devices
  • Network Science and Engineering
      • Understand the complexity of large scale networks
      • Trustworthy Computing
  • Socially Intelligent Computing
      • Humans are still much better at image recognition
      • Programs where the human is a part of the program
  • IT and Sustainability (Energy, Environment, Climate)
  • Computer Science and Economics
      • AdSense
      • eBay
  • Computer Science and Biology

Agile development: Overcoming a short-term focus in implementing best practices, Karthik Dinakar

In the project they identified the following good practices.

  • Good version control
  • Coding guidelines
  • Build automation
  • Unit testing framework
  • Automatic sanity tests
  • Accurate task estimations
  • Effective pre-sprint planning
  • Solid design discussions
  • Involve QA and Operations
  • Effective post-sprint reviews

They also found that it is difficult to introduce best practices along the way.

They had many problems:

  • Integration was not in the plan
  • Sprint backlog changing all the time.
  • Long sprint meetings

Management did not allow them to implement the changes they needed, and the reason given was "It's Agile!"

All in all, their problems seemed to be the usual ones: they had no idea what it meant to do Scrum in the first place. Don't people read books anymore?

Are systems Green? Panel with Steve Easterbrook et al

  • A Google query has a carbon footprint.
      • We don't know what it is.
      • It may actually be less than the energy it uses, since it may permit the person posing the query to save a lot of energy.
  • The term Green is just marketing bullshit.
      • It is not measurable at all.
  • Carbon emissions are permanent.
      • They won't go away even if we stop burning any carbon today.
  • The goal for emissions is ZERO; anything else is not sustainable.

There is a book called Green IT for Dummies

What can we as a computing industry do?

  • Analyze the problem.
  • Make a list of what we need and can do.
  • Create a wiki.

If you are going to read an article for more than three minutes, YOU SHOULD PRINT IT OUT! Believe it or not, but check your facts.

Architecture in an Agile World, Panel with Steven Fraser et al

Randy Miller: We allow the customers to make changes, but we don't tell them what the cost will be. If we have no architecture, the cost will be high. The people behind the agile manifesto were all good at architecture, and that is why the importance of architecture was not emphasized in the manifesto.

Bill Opdyke: Some people seem to see a difference between architects and agilists. But most people who are good don't have this problem. Architects can learn from agilists that change is not that hard. Agilists can learn from architects that architecture matters. There is a middle ground.

Ethan Hadar: We need a stable vision of what the architecture should be while delivering solutions iteratively. Show me the architecture road map of your product, because it will be integrated with another program in nine months. Accountability and responsibility for the architecture is important.

Dennis Mancl: You need to plan architecture early, or you will have to do it later on and then it will be harder.

Audience: How does an architect work in an agile team?

Randy Miller: In an agile team, you're all part of the team. There are no distinctions. The architect is in the team to make sure that...

The architecture in the system is the point in time in which you have to step back and think about how everything interacts.

Ethan Hadar: The architect is the person, who needs to interact with the testers and operations to explain how and why the system is the way it is.

Dennis Mancl: The architect is responsible to the stakeholders, and he is responsible for knowing the problem domain.

Irit Hadar: The architect's role is to take a step back and say when it is time to review the architecture.

Audience: The architecture is what you get, regardless of what you do.

An interesting discussion that could not reach a consensus on whether there is a need for a single architect or not.

Agile Anthropology and Alexander's Architecture, Jenny Quillien, Dave West, Pam Rostal

Do we need to pay any attention to Christopher Alexander's new book The Nature of Order?

The software community has always taken an interest in Alexander's books, but we take interest in the wrong things. We have only understood some rules; we have not grasped the deeper part, because we don't understand the culture of architecture.

In A Pattern Language, we looked at the patterns, but we dismissed the QWAN, the Quality Without a Name. We missed the holistic point of view. Everything is part of the system: people, organizations. He is writing about things that are multi-dimensional and multi-faceted, not things that are simple and exact.

The Nature of Order contains the same multi-faceted ideas and if we look at it with the same eyes, we will miss the point again.

Alexander looked at centers, and how centers affect other centers. And there is no right or wrong, there are only degrees.

In Alexander's world there is only one system, the Universe. Everything is connected!

Writing Code for Other People, Tom Mullen

Chunking and Memory

  • The mind groups memory into chunks. Most chunks are stored in long-term memory. Our consciousness is in long-term memory.

Short-term memory can only hold about seven relations. Short-term memory is also short :) This gives us a time limit when traversing code.

Meyer's open-closed principle is an echo of the mind's way of learning things.

If the code is a reflection of our brains, then most brains contain spaghetti.

Analogies

An analogy is a mapping from one thing to another.

Conclusion

Our brains are not good at processing more than four chunks at a time. This implies that we should write methods with fewer than four lines, classes with fewer than four methods, modules with fewer than four classes, and applications with fewer than four modules.

Tuesday, October 27, 2009

OOPSLA 2009 Tuesday, October 27th

Barbara Liskov, the Power of Abstraction

OOPSLA 2009 opened with Barbara Liskov as the keynote speaker. She is famous for, among other things, the Liskov Substitution Principle. This principle states that:

A subtype should be substitutable for its super-type.

That is, it should be possible to use a subtype in the same way as the type itself. Any difference in behavior should NOT be noticeable to the client.

A History of Abstract Data Types

Data abstraction was developed as a solution to the software crisis, a crisis that is just as present today as it was then. This was in 1968, the time when Dijkstra wrote the paper Go To Statement Considered Harmful. The problem with gotos is that it is difficult to know the context in which a statement is used.

Other important papers at the time were Niklaus Wirth's Program Development by Stepwise Refinement from 1971, about top-down design, and David Parnas' Information Distribution Aspects of Design Methodology, in which he stated:

The connections between the modules are the assumptions which the modules make about each other.

These assumptions are often much more than the simple interfaces we see today. They include the whole context in which the module is used.

Barbara then wrote a paper called A Design Methodology for Reliable Software Systems. The ideas from this paper were later reused in the context of programming. Seen from the outside, it is apparent that the same technique she used when creating the Venus operating system, Partition State, could be used when building programs, but it was not apparent at the time. Where do ideas come from? Perhaps the time is just right.

Other influential papers that are still valid today are Hierarchical Program Structures by Dahl and Hoare, Protection in Programming Languages by Morris, which introduced the early ideas of encapsulation, and Global Variable Considered Harmful by Shaw and Wulf.

In 1973, the paper on Abstract Data Types was published, and Liskov then realized her ideas in CLU. It's worth noting that CLU was way ahead of its time. It included features like data encapsulation, exceptions, and iterators via yield, but no inheritance. She doesn't think that inheritance is very important, and she thinks it complicates things.

The Liskov Substitution Principle didn't appear until 1987, when she gave a keynote here at OOPSLA, after she had noticed that inheritance was used for two different things that were not very well understood. It is used for:

  • Implementation inheritance, which violates encapsulation.
  • Type Hierarchy, and this was not very well understood.

She ended by noting that modularity based on abstraction is the way things are done now. It wasn't at the time.

She also pointed out some challenges that still exist:

  • New abstraction mechanisms
  • Massive Parallel Computers
  • MapReduce?
  • Transactional Memory?
  • Internet Computer
  • Storage and computation
  • Semantics, reliability, availability, security

And she also made the point that

Readable programs are much more important than writable programs.

When the floor was opened for questions, it was interesting to note that among the questioners were Phil Wadler (Haskell), Andrew Black (Traits), Guy Steele (Scheme, Fortress), Dave Ungar (Self) and Ralph Johnson (GoF).

Flapjax, a Programming Language for Ajax Applications

After Liskov's keynote I watched a presentation of a research paper about Flapjax, a language designed for web applications. It is based on event streams, and the language itself is reactive. Flapjax is a Javascript-based language that can also be used as a library.

The language introduces two new concepts Behaviors and Event Streams.

A behavior is a value that changes over time. It can be created like this.

// A variable that changes over time, every 100ms.
var nowB = timerB(100);

What is interesting is that behaviors are composable with normal Javascript functions. This is done by running the Javascript through the Flapjax compiler.

If an expression is a behavior, all expressions whose values depend on it also become behaviors.

var nowB = timerB(1000); 
var startTm = nowB.valueNow(); 
var clickTmsB = $E("reset", "click").snapshotE(nowB).startsWith(startTm); 
var elapsedB = nowB - clickTmsB;
insertValueB(elapsedB, "curTime", "innerHTML");

Programming with event streams is a little different from programming with behaviors. A behavior masquerades as an ordinary JavaScript object whose content just happens to change automatically. In contrast, an event stream is a new kind of value, with new primitives for programming over it.

Event streams and behaviors offer complementary views of the world. It is easy, however, to overstate their differences. Given an initial value, every event stream can be converted into a behavior: the behavior always has the value of the last event to have arrived on the stream, starting with the specified initial value until the first event arrives. Likewise, every behavior can be converted into an event stream: when the behavior's value changes, send the new value as an event.
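
In code, the two conversions might look like this. startsWith is used in the example above; changes is, as far as I understand the library, the Flapjax primitive for the opposite direction.

// Event stream -> behavior: hold the last click event, starting with null.
var clicksE = $E("reset", "click");
var lastClickB = clicksE.startsWith(null);

// Behavior -> event stream: an event fires each time the timer behavior changes.
var nowB = timerB(1000);
var ticksE = nowB.changes();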

Thomas W. Malone, Keynote Onward, the Future of Collective Intelligence

MIT Center for Collective Intelligence

Collective intelligence - Groups of individuals doing things collectively that seem intelligent.

Collective stupidity is also very much in existence.

New examples of collective intelligence are Google, the Web, Wikipedia, Linux, Digg, YouTube, etc.

How can people and computers be connected so that, collectively, they act more intelligently than any person or computer?

Thomas showed a video of a crowd of people flying an airplane by turning a reflective shield green or red.

What are the genomes of collective intelligence?

Every activity has to answer four questions: Who? What? How? Why?

  • There are two Who?-genes, crowd and hierarchical.
  • There are three Why?-genes, money, glory and love.
  • There are two What?-genes, create and decide.
  • There are four How?-genes, collection (contest), collaboration, group decision (voting, consensus, averaging, prediction markets) and individual decision (market, social network).

Failure to get the motivational factors (the Why?) right is probably the single greatest cause of failure in collective intelligence experiments.

Interesting examples are: Climate Collaboratorium, TopCoder, Kasparov vs. the World, Amazon Mechanical Turk and TurKit

What's coming?

The human brain is very much like the global network.

  • We have global moods?

Quotes from We are the Web, Wired 2005

There is only one time in the history of each planet when its inhabitants first wire up its innumerable parts to make one large Machine.

Three thousand years from now, when keen minds review the past, this will be recognized as the largest, most complex, and most surprising event on the planet.

The Machine provided a new way of thinking (perfect search, total recall) and a new mind for an old species. It was the Beginning.

Brion Vibber, Wikipedia, Making your people run as smoothly as your site

As the number of people involved in a project grows, key decision-makers often become bottlenecks, and community structure needs to change or a project can become stalled despite the best intentions of all participants.

Instead of having a single admin look at a page to decide if it is garbage, a group can vote on whether they think it is garbage. This allows the admin to delete pages without checking them first, if everyone votes for deletion.

  • People have limited time and patience.
  • Waiting on other people is slow.
  • People want to do what interests them, not deal with process!

Get out of peoples' way and let them do stuff!

Onward!

The Commenting Practice of Open Source, Oliver Arafat

An analysis of 80GB of open source code. The average comment density is one comment per five lines of code, or 19%.

Average comment density is independent of code size.

Strong variation by programming languages.

  • Java code has an average of 26%.
  • Perl code has an average of 11%.

Successful open source projects follow consistent comment practices.

Comment Density by Commit Size

  • Smaller commits have higher comment density.

Polymorphic System Architecture, Jeffery E Bryson

Run-Time polymorphism (RTP) has been used in the software community for two decades to satisfy dynamic reconfiguration, plug-n-play, extensibility, and system redundancy requirements. RTP is also used to construct software systems of systems. System engineers now have the same requirements applied to large-scale system architecture.

A Polymorphic System Architecture (PSA) uses the same technology, applying it to the system architecture. By defining specific polymorphic relationships within the system architecture, the system architect can reduce the system complexity and satisfy functional requirements.

Polymorphism reduces the code size, but it also reduces understandability.

Value Added
  • Extendable/Reusable System Designs
  • Dynamic reconfiguration
  • An architecture that matures over time instead of becoming obsolete.
  • OO and Refactoring.

Conclusion

All in all, this was a good day, with the keynotes being the highlights. The Onward sessions were interesting, but most of them were of very little use to me.

Some thoughts from "A Theory of Fun"

I just finished reading the book A Theory of Fun by Raph Koster. It is funny how everything comes together once you start focusing and noticing certain patterns. It is a good book and worth reading even if you're not into game design.

I have already learned, from personal experience and from other books, that our conscious mind is terrible at multitasking. It is, however, very good at internalizing things: learning things so that the brain can perform them unconsciously, without conscious supervision. Raph calls this chunking.

The act of learning is about turning many steps into chunks that we don't need to think about as separate entities.

When we don't see something, we don't perceive it, but once we become aware of a certain pattern, we see it everywhere. Koster calls this noise.

Noise is any pattern that I don't understand.

A good game keeps us on the edge of our abilities constantly, and once we learn something the game becomes harder. This is a variant of flow. But since a game cannot continue forever, it is doomed to become boring once we have mastered it.

The destiny of games is to become boring; fun is the process and routine is its destination.

Koster also mentions some of his grandfather's carpentry practices.

  • Work Hard on Craft
  • Measure twice, cut once.
  • Feel the grain, work with it not against it.
  • Create something unexpected, but faithful to the source from which it sprang.

That is not bad advice for anything.

Sunday, October 25, 2009

Javascript, the Esperanto of the Web.

I just gave the tutorial called Javascript, the programming language of the web this Sunday at OOPSLA. It used to be called "the Esperanto of the Web", but no one seemed to know what that was, so I had to rename it.

If you want to learn good Javascript, there are three books you need to read. Javascript: The Good Parts, by Douglas Crockford, is a really good, really thin book that teaches you all that is worth knowing about the language. The other two books, The Little Schemer and The Seasoned Schemer, by Daniel P. Friedman and Matthias Felleisen, will teach you functional programming using Scheme. The reason the Schemer books are so good for learning Javascript is that Javascript is more like a dialect of Scheme than a dialect of C.

After reading these books you will have a whole new appreciation for Javascript.

Here is one of the most beautiful functions in computer science, the Y-combinator in Javascript.

// The Y Combinator
var Y = function(gen) {
 // Apply a function to itself, handing gen a thunk that performs the self-application.
 return function(f) { return f(f); }(
  function(f) {
   return gen(function() { return f(f).apply(null, arguments); });
  });
};
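
To see it in action, here is factorial defined without ever referring to itself by name.

// factorial via Y; gen receives a handle to the recursive call.
var factorial = Y(function(self) {
 return function(n) { return n <= 1 ? 1 : n * self(n - 1); };
});
factorial(5); // => 120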

The fact that the Y-combinator can be written in Javascript shows the power and elegance of the language.

Friday, October 09, 2009

Lists in Scala

As with most functional languages, lists play a big role in Scala. Lists support, among others, the following operations.

// Lists in Scala are homogeneous; they are declared as List[T]
val names: List[String] = List("Arnold", "George", "Obama")

// Lists are constructed from two building blocks :: and Nil
assert(names == "Arnold" :: "George" :: "Obama" :: Nil)

// Gets the first element of a list
assert(names.head == "Arnold")

// Gets the rest of the list
assert(names.tail == "George" :: "Obama" :: Nil)

// Checks if the list is empty
assert(List().isEmpty)

Instead of using head and tail, pattern matching is commonly used.

def length[T](xs: List[T]): Int = xs match {
 case Nil => 0
 case x :: xs1 => 1 + length(xs1)
}

From these simple functions, a whole host of functions is built.

// List length
assert(names.length == 3)

// ::: appends two lists
val namesTwice = names ::: names
assert(namesTwice == List("Arnold", "George", "Obama", "Arnold", "George", "Obama"))

// last gets the last element
assert(names.last == "Obama")

// init gets all but the last
assert(names.init == "Arnold" :: "George" :: Nil)

// reverse reverses the list
assert(names.reverse == "Obama" :: "George" :: "Arnold" :: Nil)

// drop drops the first n items
assert(names.drop(2) == "Obama" :: Nil)

// take keeps the first n items
assert(names.take(1) == "Arnold" :: Nil)

// splitAt, does both take and drop at the same time, returning a tuple
assert(names.splitAt(1) == (List("Arnold"), List("George", "Obama")))

// indices gives me the indices of the list
assert(names.indices == List(0, 1, 2))

// zip, zips two lists together
assert(names.zip(names.indices) == List(("Arnold", 0), ("George", 1), ("Obama", 2)))

// toString returns a list as String
assert(names.toString == "List(Arnold, George, Obama)")

// mkString, lets you join the string with a separator
assert(names.mkString("-") == "Arnold-George-Obama")

There are also a few functions for converting to and from lists.

val array = Array("Arnold", "George", "Obama")

// Convert the list to an Array
assert(names.toArray == array) // Fails! Equality does not work for arrays
java.lang.AssertionError: assertion failed
 at scala.Predef$.assert(Predef.scala:87)
...

// If we convert it back it works
assert(names.toArray.toList == names)

// We can also mutate the array with copyToArray
List("Hilary").copyToArray(array, 1)
assert(array.toList == List("Arnold", "Hilary", "Obama")) 

// elements will give me an iterator
val it = names.elements
assert (it.next == "Arnold")

Pretty slick, but now it is time for the good stuff: Higher Order Functions!

// map converts from one list to another, notice the placeholder syntax (_)
assert(names.map(_.length) == List(6, 6, 5))

// Get the first char of the words
assert(names.map(_.charAt(0)) == List('A', 'G', 'O'))

// Get the names as lists
assert(names.map(_.toList) == List(List('A', 'r', 'n', 'o', 'l', 'd'),
 List('G', 'e', 'o', 'r', 'g', 'e'), List('O', 'b', 'a', 'm', 'a')))

// When you have a list of lists, you can use flatMap
assert(names.flatMap(_.toList) == List('A', 'r', 'n', 'o', 'l', 'd',
 'G', 'e', 'o', 'r', 'g', 'e', 'O', 'b', 'a', 'm', 'a'))

// filter is used to select the elements that satisfy the predicate.

// Keep all names of length 6
assert(names.filter(_.length == 6) == List("Arnold", "George"))

val chars = names.flatMap(_.toList)

// Keep all chars larger than 'a' (capitals are smaller in ASCII)
assert(chars.filter(_ > 'a') == List('r', 'n', 'o', 'l', 'd', 'e', 'o', 
'r', 'g', 'e', 'b', 'm'))

// And combine them
// Give me the first letter of all words with length 6
assert(names.filter(_.length == 6).map(_.charAt(0)) == List('A', 'G'))

There is a bunch of other useful functions based on filter.

// partition returns a pair of lists (satisfied, not satisfied)
assert(names.partition(_.length == 6) == (List("Arnold", "George"), List("Obama")))

// find returns the first element that satisfy the predicate
// Since this function may not be satisfied, an optional value is used
assert(names.find(_.length == 6) == Some("Arnold"))

// An optional value returns Some(value) or None
assert(names.find(_.length == 7) == None)

// takeWhile and dropWhile take resp. drop elements while the predicate holds
assert(chars.takeWhile(_ != 'o') == List('A', 'r', 'n'))
assert(chars.dropWhile(_ != 'm') == List('m', 'a'))

// Span does both at the same time
assert(chars.span(_ != 'o') == (List('A', 'r', 'n'), 
List('o', 'l', 'd', 'G', 'e', 'o', 'r', 'g', 'e', 'O', 'b', 'a', 'm', 'a')))

// forall checks that a predicate is true for all elements of the list
assert(!chars.forall(_ == 'a'))
assert(chars.forall(_ >= 'A'))

// exists checks that a predicate is true for some element of the list
assert(names.exists(_.length == 5))
assert(!names.exists(_.length == 7))

// sort sorts a list according to an ordering function
assert(List(3, 7, 5).sort(_ > _) == List(7, 5, 3))

The fold functions, fold left (/:) and fold right (:\), insert operators between all the elements of a list. The difference between them is whether they start or end with the base element.

fold left: (0 /: List(1, 2, 3)) (op) = op(op(op(0, 1), 2), 3)

fold right: (List(1, 2, 3) :\ 0) (op) = op(1, op(2, op(3, 0)))


// Define the sum function for lists with fold left
def sum(xs:List[Int]): Int = (0 /: xs)(_ + _)
assert(sum(List(2, 3, 4)) == 9) 

// Define the product function for lists with fold right
def prod(xs:List[Int]): Int = (xs :\ 1)(_ * _)
assert(prod(List(2, 3, 4)) == 24) 

// Define reverse in terms of fold
def reverse[T](xs: List[T]) = (List[T]() /: xs) ((ys, y) => y :: ys)
assert(reverse(List(1, 2, 3)) == List(3, 2, 1))

That's it for the methods of the List class. In the companion List object we also find some useful functions. We have been using one of them all the time.

List.apply or List() creates a list from its arguments.

Apart from this one, there are some others worth mentioning.

// List.range creates a list of numbers
assert(List.range(1, 4) == List(1, 2, 3))
assert(List.range(1, 9, 3) == List(1, 4, 7))
assert(List.range(9, 1, -3) == List(9, 6, 3))

// List.make, creates lists containing the same element
assert(List.make(3, 1) == List(1, 1, 1))
assert(List.make(3, 'a') == List('a', 'a', 'a'))

// List.unzip unzips a list of tuples into a tuple of lists
assert(List.unzip(List(('a', 1), ('b', 2))) == (List('a', 'b'), List(1, 2)))

// List.flatten flattens a list of lists
assert(List.flatten(List(List(1, 2), List(3, 4), List(5, 6))) == List(1, 2, 3, 4, 5, 6))

// List.concat concatenates a bunch of lists
assert(List.concat(List(1, 2), List(3, 4), List(5, 6)) == List(1, 2, 3, 4, 5, 6))

// List.map2 maps two lists 
assert(List.map2(List.range(1, 999999), List('a', 'b'))((_, _)) == List((1, 'a'), (2, 'b')))

// List.forall2
assert(List.forall2(List("abc", "de"), List(3, 2)) (_.length == _))

// List.exists2
assert(List.exists2(List("abc", "de"), List(3, 4)) (_.length != _))

And as if all this was not enough, Scala also supports for expressions. In other languages they are commonly known as list comprehensions.

A basic for expression looks like this:

for ( seq ) yield expr

where seq is a semicolon-separated sequence of generators, definitions and filters.

// Do nothing
assert(names == (for (name <- names) yield name))

// Map
assert(List('A', 'G', 'O') == (for (name <- names) yield name.charAt(0)))

// Filter
assert(List("Obama") == (for (name <- names if name.length == 5) yield name))

val cartesian = for (x <- List(1, 2); y <- List("one", "two")) yield (x, y)
assert(cartesian == List((1, "one"), (1, "two"), (2, "one"),  (2, "two")))

// And now the grand finale, the cartesian product of a list of list
def cart[T](listOfLists: List[List[T]]): List[List[T]] = listOfLists match {
 case Nil => List(List())
 case xs :: xss => for (y <- xs; ys <- cart(xss)) yield y :: ys
}
val cp = cart(List(List(1,2), List(3,4), List(5,6))) 
assert(cp == 
  List(List(1, 3, 5), List(1, 3, 6), List(1, 4, 5), 
  List(1, 4, 6), List(2, 3, 5), List(2, 3, 6), 
  List(2, 4, 5), List(2, 4, 6)))

Ain't it beautiful, so to say!

Saturday, October 03, 2009

Scream, Project Management for the Real World

Many companies today find themselves in a situation where they have a working product, but adding new features takes forever. Even worse, when new features are added, old features stop working.

To solve this problem many companies have adopted Scrum. Scrum has a nice lightweight appeal. All you need is:

  • A product owner who cares for a backlog of prioritized stories.
  • A team that cares about its craft and takes responsibility for delivering a subset of new features every month.
  • A Scrum master who makes sure the product owner and the team are playing by the rules of Scrum.

That's it, the recipe for success...

But, what if you are not living in la-la-land where everyone on the project cares about the product?

What if your product owner doesn't keep a prioritized backlog of testable stories because she doesn't care? She just works here!

What if your team cares more about going surfing than about delivering well-tested, high-quality code?

Enter Scream!

Scream

Scream is project management for the real world! Scream is the way to manage unmotivated development organizations. The method itself is not new; it has been used for centuries to manage everything from husbands to entire countries.

In Scream, all you need is:

  • A product owner who handles the requirements.
  • A team who will develop the requirements.
  • A Scream master who will make sure that the team is playing by the rules of Scream.

At first glance, it looks deceptively like Scrum, but the rules are different: The Scream master is responsible for the product being delivered at high quality and may use any means he sees fit to make it work.

This makes all the difference in the world.

Now all you need is a Scream master with enough guts to deliver.

The Scream Master

The ideal Scream master is Begbie in Trainspotting. He has all the characteristics of a good Scream master:

Francis Begbie is an aggressive pit bull terrier, a monstrous, brawling hard man ready to explode at any moment, at anyone, for any reason. Begbie isn't afraid to test his fighting prowess against the largest of opponents. "Begbie didn't do drugs, he did people," says Renton. His sole ambition seems to be to jack someone in.

The Process

After you have selected the Scream master, you have to let him know that he will be judged on the performance of the entire development team, including the product owner.

You also need to set up some acceptance criteria for what done means:

  • All stories in the backlog must be SMART: Specific, Measurable, Attainable, Realistic, and Timely.
  • The code should be DRY.
  • 100% unit-test coverage of all non-trivial methods.
  • Acceptance test for all stories.

Then you set the project in motion.

Some Typical Scenarios

The backlog items are not SMART.

The product owner says: "I didn't have the time." Begbie: Slaps her face, "You daft c**t, these items better be SMART, right f***in' now, or I will glass you."

Bugs appear in production, due to missing unit tests.

A developer says: "It worked on my machine." Begbie: Punches him in the nose, "You f***in' buftie, if one more bug enters the system on your account you've f***in' had it."

Typical stand-up meeting:

You ken me, I'm not the type of c**t that goes looking for f***in' bother, like, but at the end of the day I'm the c**t with a pool cue and you can get the fat end in your face any time you f***ing want, like.

A Scream master is not only unleashed on his team and product owner, he is unleashed on everyone. This makes him excellent for dealing with impediments. Project management is all about communication and motivation, and no one can get the message through like Begbie.

Begbie: "I need access to the Active Directory." SA: "I don't have the time." Begbie: "I need access to the Active Directory." SA: "You need to fill out this form." Begbie: "I NEED ACCESS TO THE ACTIVE DIRECTORY." SA: "OK, here you go."

Notes on Greg Young's Talk on DDD

I watched Greg Young talk about DDD on InfoQ. Here are my notes on the talk.

  • Only use domain driven design on appropriate projects. Most projects are not suitable.
  • Use state transition event streams to communicate between different bounded contexts.
  • Bounded contexts are one of the keys of domain-driven design. The same word may have different meanings in different contexts, and this is OK.
  • Use OO, avoid setters. Objects have behaviors, not shapes.
  • If you always have valid objects, you avoid having to check whether an object is valid all the time. IsValid is not the solution (see the sketch after this list).
  • Always use the domain experts and end-users language.
  • Separating commands from queries gives the benefit of eventual consistency; queries may read from a different place than the commands.
  • Coupling is not a problem if it's in the same layer.
  • Model the view, such as screens, as reports, with no transactional behavior.
  • Explicit state transitions remove the need for auditing. They are the audit.
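
To make the always-valid idea concrete, here is a small sketch of my own in Scala (the Account class and its rules are mine, not from the talk): the object exposes behavior instead of setters and keeps its invariant at every step, so an IsValid check is never needed.

// A hypothetical always-valid domain object, it can never be in an invalid state
class Account private (val balance: Int) {
  require(balance >= 0, "balance must never be negative")

  // Behavior, not setters: every transition enforces the invariant
  def deposit(amount: Int): Account = {
    require(amount > 0, "deposit must be positive")
    new Account(balance + amount)
  }

  def withdraw(amount: Int): Account = {
    require(amount > 0 && amount <= balance, "cannot overdraw")
    new Account(balance - amount)
  }
}

object Account {
  def open(initial: Int) = new Account(initial)
}

val account = Account.open(100).deposit(50).withdraw(30)
assert(account.balance == 120)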

Thursday, September 24, 2009

Git undo, reset or revert?

If you have found this page, you probably came here because you want to clear your working directory of all the changes you have made.

The simple answer is:

# Clear working directory tree from all changes
$ git checkout -f HEAD

This is, however, not the best way to do it. A better way is:

# Clears the working directory tree, and stashes all the changes.
$ git stash


git stash allows you to get your changes back any time you change your mind. It is also possible to inspect and manipulate the stashes.

# List all the stashes
$ git stash list
stash@{0}: WIP on admin_ui: 0c1a80a Removed annotation from JdbcAdminService, it is now explicity initialized in the applicationContext.
stash@{1}: WIP on admin_ui: 14e12e6 Added foreign keys for UserRole
stash@{2}: WIP on master: d188ecd Merge branch 'master' of semc-git:customercare
stash@{3}: WIP on master: 3763795 More work on user_details.
...

# Apply the latest stash, and remove it from the stack
$ git stash pop

# Apply a named patch, but leave it on the stack
$ git stash apply stash@{2} 

# Drop a stash
$ git stash drop stash@{3} 

# Clear the entire stash stack (almost never needed)
$ git stash clear

# A better way to purge the stash
$ git reflog expire --expire=30.days refs/stash

What about git reset then? It sounds like it should do about the same as git checkout -f HEAD, but it doesn't. git reset is used for setting the current reference pointer, HEAD.

# Reset the latest commit, and leave the changes in the index.
$ git reset --soft HEAD^

# Reset the latest commit, and leave the changes in the working directory
$ git reset HEAD^

# Undo add, move the changes from the index to the working directory
$ git reset

# Reset the latest successful pull or merge
$ git reset --hard ORIG_HEAD

# Reset the latest failed pull or merge
$ git reset --hard

# Reset the latest pull or merge, into a dirty working tree
$ git reset --merge ORIG_HEAD

You can do more things with reset, but the above covers the typical cases. And now to the last thing, git revert. What does it do? git revert creates a new commit that is the opposite of the commit it names.

# Show the commits
$ git log --oneline
4717a5c new line
7e38e95 added tapir file

# Revert the commit named, 4717a5c, and commit it.
$ git revert 4717a5c

# Revert the HEAD commit, but don't commit it
$ git revert -n HEAD

Git is incredibly flexible and lets you control everything if you want to.

Tuesday, September 22, 2009

Inside Git

This is an exploration into what is going on when I run some basic git commands. We start out by creating a new repository. .git/objects is the directory where git stores all its objects, and it is empty initially.

$ mkdir myrepo
$ cd myrepo/
$ git init
Initialized empty Git repository in /Users/andersjanmyr/tmp/myrepo/.git/
$ find .git/objects -type f     # find all files in .git/objects
$ 

When a file is added to git it gets stored in the .git/objects directory under the name of its hash. The first two characters of the hash are used as the name of a subdirectory and the rest become the file name. Worth noting is that the hash uniquely identifies the content, so if you were to run these commands on your computer, your results should be identical.

$ echo "A tapir has 14 toes" > tapir.txt
$ git add tapir.txt
$ find .git/objects -type f
.git/objects/12/a93608760777f50380a94b52e1b54ec69f4743
$ git hash-object tapir.txt
12a93608760777f50380a94b52e1b54ec69f4743

If you try to list the contents of the file you are out of luck, since it is stored in a compressed binary format; you should instead use the git command git cat-file. The file above is a blob and its contents are what you would expect.

$ cat .git/objects/12/a93608760777f50380a94b52e1b54ec69f4743
xK??OR02`pT(I,?,R?H,V04Q(?O-?zi$ 
$
$ git cat-file -t 12a93608760777f50380a94b52e1b54ec69f4743
blob
$ git cat-file blob 12a936   # Using the first part of the hash is enough
A tapir has 14 toes
 

Even though the file is in the .git/objects directory, it is not committed yet and it cannot be seen by high-level git commands such as git log. git status, on the other hand, will show that the file is staged, or in the index.

$ git log
fatal: bad default revision 'HEAD'
$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
#   (use "git rm --cached <file>..." to unstage)
#
# new file:   tapir.txt
#

When I commit the file, two more objects are added to the .git/objects directory.

$ git commit -m "added tapir file"
[master (root-commit) 7e38e95] added tapir file
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 tapir.txt
$ find .git/objects/ -type f
.git/objects//12/a93608760777f50380a94b52e1b54ec69f4743
.git/objects//7e/38e95d328287ea9d234a2affc4ed9e4510435a
.git/objects//e8/493a7e63154350f8c3d08a42e759132d9d2a39

One is a tree object and the other is a commit.

$ git cat-file -t 7e38
commit
$ git cat-file -t e849
tree
$

The commit contains the information that was recorded when I committed. Apart from the commit message and my personal info it contains a reference to the tree object that was created simultaneously with the commit.

$ git cat-file commit  7e38
tree e8493a7e63154350f8c3d08a42e759132d9d2a39
author Anders Janmyr <anders.janmyr@jayway.se> 1253590540 +0200
committer Anders Janmyr <anders.janmyr@jayway.se> 1253590540 +0200

added tapir file
$ 

The tree object is stored in binary format and cannot be completely read without the help of git ls-tree. Now I can see that it contains a reference to the blob that was created initially, the tapir.txt file.

$ git cat-file tree e8493a7e63154350f8c3d08a42e759132d9d2a39
100644 tapir.txt?vw???KR?NƟGC$ 
$ git ls-tree e8493a7e63154350f8c3d08a42e759132d9d2a39
100644 blob 12a93608760777f50380a94b52e1b54ec69f4743 tapir.txt
$

So how does git know what the latest commit is? In git lingo the latest commit is known as HEAD. If I look inside .git/HEAD I see a reference, and this reference points to the latest commit.

$  cat ./.git/HEAD
ref: refs/heads/master
$ cat ./.git/refs/heads/master
7e38e95d328287ea9d234a2affc4ed9e4510435a

The .git/refs directory is where all of git's references live: heads and tags.

$ find .git/refs
.git/refs
.git/refs/heads
.git/refs/heads/master
.git/refs/tags
$ git branch olle
$ find .git/refs
.git/refs
.git/refs/heads
.git/refs/heads/master
.git/refs/heads/olle
.git/refs/tags

Creating a new branch with git branch shows that the branch is added to the heads directory; switching to it changes the contents of .git/HEAD.

$  cat ./.git/HEAD
ref: refs/heads/master
$ git checkout olle
Switched to branch 'olle'
$  cat ./.git/HEAD
ref: refs/heads/olle

Git, simple, but beautiful!

Friday, August 21, 2009

Fat is Better

I have recently had discussions with some colleagues about what architecture they prefer. While they seem to favor thinly sliced services, I have come to the conclusion that the overhead of slicing services thin is not worth the extra time it takes to set up, verify, and test the complex internal communication that comes with this kind of architecture. Fat is better!

If I am designing a system that should work in a coherent way, I want it all in my big, fat, juicy object model. This enables me to put the functionality where it is most cohesive and, therefore, gives me the best design possible. Every object should carry its own weight.

If there are external services they must, by necessity, be outside the model, but the internal representation of the external service should be inside my model.

An example of an architecture that relies on thinly sliced services is REST. REST is very elegant and it definitely has a place when publishing resources. But REST models are anemic. They rely on you to GET the information from the resource, do things to it, and then replace the information of the resource with a PUT. It is CRUD for the web. It is not designed to take advantage of what is good in object-oriented and functional programming, like sending behavior into an object and having it perform the calculations for you.

The elegance of map and reduce (fold) is the essence of functional programming. How do you model map and reduce with REST? You can't! Polymorphism and encapsulation are the essence of object-oriented programming. Where do they go when everything is a resource? They disappear!
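
To make the contrast concrete, here is a hypothetical sketch in Scala (the Orders class and the numbers are mine): with a fat model the client sends the behavior into the object, which folds it over its own data, instead of GETting the data out, computing, and PUTting it back.

// A hypothetical rich model, the behavior travels to the data
class Orders(private val amounts: List[Int]) {
  // The caller passes a function in and the model folds it over its own data
  def total(adjust: Int => Int): Int =
    (0 /: amounts)((sum, amount) => sum + adjust(amount))
}

val orders = new Orders(List(100, 200, 300))
// The total with a 10 percent discount applied to every order
assert(orders.total(amount => amount * 90 / 100) == 540)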

I have worked on projects where the goal has been to design every little part of the system as a free-standing module, with its own life and versioning, that can be switched in and out. But the artifacts have mostly been deployed together and have rarely provided any extra value standing on their own. Instead, they have given us a lot of grief when we tried to build a DRY system.

So, a fat model is the way to go. How fat? As fat as possible, but no fatter. How fat is that? As always, this is a judgment call, but err on the side of fatter.

If, by luck or skill, my fat system reaches a workload where it will have to be split over multiple processors or machines, it will not be very difficult to split the system, since the system will be well factored, cohesive and DRY!

Note: "Fat is better" is somewhat related to worse is better by Dick Gabriel