Sunday, December 23, 2007

The Yegge Effect

So Steve Yegge has come up with another big post, and, as usual, the blogosphere is up on their heels. For mere mortal bloggers, it is like planning a movie release for not-so-big producers - your blog post is likely to be swamped in the Yegge Effect if the timing of your post happens to clash with that of Mr. Yegge. This post is a reaction to the Yegge Effect. Irrespective of whether you agree to all of his rants or not, a Yegge post has an impact, which you cannot ignore. You may call it the Yegge Impact as well.

Anyway, on a more serious note, it was a great post from Steve Yegge. It has brought to the forefront a number of software engineering issues - both technical and cultural. My greatest take from the post has been the preaching to select the engineering tools (read the language of development) based only on the job at hand. In fact, in one of the responses to the comments, Steve mentions that his selection of a dynamically typed language for rewriting Wyvern has been driven by the fact that the system he is trying to model is an extremely dynamic beast and in the earlier Java based system he needed all sorts of twisting and subversion of the type system to achieve the level of dynamism that the software demanded. And he chose Rhino over Groovy and Scala simply because the latter have not yet been mature enough and he was looking for something that has been enough time tested.

Yet Another Java Bashing ?

Expectedly enough, many of the bloggers have interpreted Yegge's post as another of those Java bashings. Admittedly there has been some unfair criticism of Java and design patterns in his post. We cannot ignore the fact that Java is verbose, and lacks the expressiveness of many of today's programming languages. And design patterns, very often, do not help in making your code succinct or concise, irrespective of the language in which you implement them. The biggest contribution of design patterns in statically typed implementations is the uniformity of vocabulary that they espouse - the moment you talk about factories, people understand creational semantics, the moment you talk about strategies, developers understand granular variations of algorithims within a larger context. Many of today's languages, especially the dynamically typed functional ones, have most of the 23 GOF patterns built-in the language themselves. To them, evolution of these patterns may sound like ways to get around the deficiencies of languages like Java and C++.

I feel Java has been able to successfully position itself as a much better C++. Accidental complexities of memmory management and pointer manipulations have been successfully abstracted away in Java - a definite step forward in expressiveness. And this is possibly the single most important reason that Java is ruling the enterprise even today. Java maintained the curly braces look and feel of C++ - yet another reason it could successfully launch itself within the cultural embodiment of C++ programmers. Raganwald is correct when he talks about Yegge's post as addressing more of cultural issues than technical ones. It is the Java culture that dominates the enterprise today. It does not require to be a strong elitist to program at a higher level of abstraction in Java - only you need to embrace the implementation of these features in the future versions of the language. Unfortunately many people are still against the empowerment of Java programmers with the new tools to write better programs. Steve Yegge mentions about copy-paste in Java causing code bloats. Many of the bloats are due to the fact that Java lacks the expressiveness of languages like Scala, but many of them are also due to an averagist mentality towards designing programs in Java. People have grown to develop a cultural apathy towards embracing new paradigms, incorporating functional features within the language, in the fear that it will make Java too complicated for the average Wall Street programmer. Until we come out of this vicious circle of apprehension, Java programs will continue to be great big machines that can move the dirt this way and that.

The question is .. Is Java too broken to support such empowerment ?

Tools of the Trade

Responding to a user comment, Steve says :
If I were writing a compiler, I'd use an H-M language (probably Scala). This game is a different beast; it's one of the most "living" systems in the world (notwithstanding its 1-year hibernation). You could do it in an H-M system, but it would be nearly as big as the Java system.

Select your programming language based on the task at hand. If you need well defined contracts that will be used by a large programmer base, then static typing is the answer. And Scala is one of the most advanced and expressive statically typed languages on the JVM today. In case Java decides not to implement any of the features that have been proposed by so many experts, then possibly Scala will enter into the same footsteps to displace Java that Java could successfully do to C++ a decade ago. Natural evolution !

The post also has a healthy bias towards dynamically typed systems, with all the code compression benefits that they offer through succinct abstractions. No complaints on that, though given today's constraints in executing large enterprise projects, I am still inclined to use the safety net that powerful static typing has to offer. One of the masters and researchers of programming languages, Matthias Felleisen makes the following observation in the comments to Yegge's post, while talking about evolutionary programming and gradual typing in ES4:
As a co-creator of gradual transformations from dynamically typed to statically typed languages (see DLS 2006), I am perfectly aware of this idea. The problem is that Ecmascript's implementation of the idea is broken, unsound. So whatever advantages you would get from a sound system, such as ML's or even Java's, you won't get from ES 4. (The most you get is C's notion of types.)


X Programmer (for all X)
But you should take anything a "Java programmer" tells you with a hefty grain of salt, because an "X programmer", for any value of X, is a weak player. You have to cross-train to be a decent athlete these days. Programmers need to be fluent in multiple languages with fundamentally different "character" before they can make truly informed design decisions.

A single language programmer always suffers from a myopic view of designing things. Unless you are exposed to other paradigms of programming, anonymous inner classes will always appear to be the most suitable idiom for the problem at hand. Many powerful programming techniques offered by the next door programming language may be more suited to solving your problem. Explore them, learn a new language and learn to think in that new language. This is a great message from Steve Yegge. Sometime back I was looking at an interview with Joe Armstrong, the inventor of Erlang. He believes that the combination of OCamL, Erlang and Haskell has the power to solve most of the problems of today's enterprise. OCamL, with all the static typing can be the C of tomorrow, the zen language for building the virtual machines, Erlang, with its dynamic typing, can support distributed applications with changing code on-the-fly, not really a cup of tea for constant type analyses, while Haskell can be great for building business applications. The only teaching that is required is to appease the user community of the unwarranted apprehensions of functional programming. Functional programming is based on the lambda calculus - it is the sheer intellectual weight of this very basic dictum that drives today's Java programmers away from the paradigm. Joe believes that functional programming experts should come forward more and write easy-to-read books to dispel this demon out of programmer's mind. Joe's book on Programming Erlang is a great piece - go grab a copy and do yourself a favor of learning a new functional language. You may not use FP in your next enterprise project, but you will surely be enriched with powerful idioms that higher order functions, pattern matching and hylomorphisms have to offer.

and then the central thesis ..

Refactoring makes the codebase larger that the IDEs of today are simply not capable of handling. Steve mentions that his codebase of 500K LOC could not be managed by Eclipse. I do not know for sure about the organization of the codebase of Wyvern, but I have seen Eclipse handle huge codebases. Only catch is that you need to manage them as multiple separate individual projects / artifacts with clean dependencies specified between them. And you need not have all of them open and active at the same time. Refactoring may increase the physical size of your codebase - refactoring away creational statements with pluggable factories through a Dependency Injection framework will cost some megabytes in terms of the ear size. But, I think the intellectual size of the codebase reduces - you now no longer need to worry about the creation processes and lifecycle management of your objects. Steve, however, feels that DI is also a bloat enhancer in a project. If I am using a statically typed language like Java, I think DI is a useful technique for implementing separation of concerns and modularization of your codebase. But I agree with Steve that platforms like Javascript, Ruby and Python do not need DI. They are too dynamic and offers enough flexibility to do away with the overhead of a DI framework. And regarding size bloat being reduced by dynamically typed languages, don't you think that it also increases the intellectual size of the codebase ?

So, that, in short was the Yegge Impact on me ..

Tuesday, December 18, 2007

Domain Modeling - What exactly is a Rich Model ?

It is possibly an understatement to emphasize the usefulness of making your domain model rich. It has been reiterated many times that a rich domain model is the cornerstone of building a scalable application. It places business rules in proper perspective to wherever they belong, instead of piling stuffs in the form of a fat service layer. But domain services are also part of the domain model - hence when we talk about the richness of a domain model, we need to ensure that the richness is distributed in a balanced way between all the artifacts of the domain model, like entities, value objects, services and factories. In course of designing a rich domain model, enough care should be taken to avoid the much dreaded bloat in your class design. Richer is not necessarily fatter and a class should only have the responsibilities that encapsulates its interaction in the domain with other related entities. By related, I mean, related by the Law of Demeter. I have seen instances of developers trying to overdo the richness thing and ultimately land up with bloated class structures in the domain entities. People tend to think of a domain entity as the sole embodiment of all the richness and often end up with an entity design that locks itself up in the context of its execution at the expense of reusability and unit testability. One very important perspective of architecting reusable domain models is to appreciate the philosophical difference in design forces amongst the various types of domain artifacts. Designing an entity is different from designing a domain service, you need to focus on reusability and a clean POJO based model while designing a domain entity. OTOH a domain service has to interact a lot with the context of execution - hence it is very likely that a domain service needs to have wiring with infrastructure services and other third party functionalities. A value object has different lifecycle considerations than entities and we need not worry about its identity. Hence when we talk of richness, it should always be dealt with the perspective of application. This post discusses some of the common pitfalls in entity design that developers face while trying to achieve rich domain models.

Entities are the most reusable artifacts of a domain model. Hence an entity should be extremely minimalistic in design and should encapsulate only the state that is required to support the persistence model in the Aggregate to which it belongs. Regarding the abstraction of the entity's behavior, it should contain only business logic and rules that model its own behavior and interaction with its collaborating entities.

Have a look at this simple domain entity ..


class Account {
    private String accountNo;
    private Customer customer;
    private Set<Address> addresses;
    private BigDecimal currentBalance;
    private Date date;

    //.. other attributes

    //.. constructors etc.

    public Account addAddress(final Address address) {
        addresses.add(address);
        return this;
    }

    public Collection<Address> getAddresses() {
        return Collections.unmodifiableSet(addresses);
    }

    public void debit(final BigDecimal amount) {
        //..
    }

    public void credit(final BigDecimal amount) {
        //..
    }

    //.. other methods
}



Looks ok ? It has a minimalistic behavior and encapsulates the business functionalities that it does in the domain.

Question : Suppose I want to do a transfer of funds from one account to another. Will transfer() be a behavior of Account ? Let's find out ..


class Account {
    //..
    //.. as above

    // transfer from this account to another
    public void transfer(Account to, BigDecimal amount) {
        this.debit(amount);
        to.credit(amount);
    }
    //.. other methods
}



Looks cool for now. We have supposedly made the domain entity richer by adding more behaviors. But at the same time we need to worry about transactional semantics for transfer() use case. Do we implant transactional behavior also within the entity model ? Hold on to that thought for a moment, while we have some fresh requirements from the domain expert.

In the meantime the domain expert tells us that every transfer needs an authorization and logging process through corporate authorization service. This is part of the statutory regulations and need to be enforced as part of the business rules. How does that impact our model ? Let us continue adding to the richness of the entity in the same spirit as above ..


class Account {
    //.. as above

    // dependency injected
    private AuthorizationService auth;
    //..

    public void transfer(Account to, BigDecimal amount) {
        auth.authorize(this, to, amount);
        this.debit(amount);
        to.credit(amount);
    }
    //.. other methods
}



Aha! .. so now we start loading up our entity with services that needs to be injected from outside. If we use third party dependency injection for this, we can make use of @Configurable of Spring and have DI in entities which are not instantiated by Spring.


import org.springframework.beans.factory.annotation.Configurable;
class Account {
    //.. as above

    // dependency injected
    @Configurable
    private AuthorizationService auth;

    //..
}



How rich is my entity now ?

Is the above Account model still a POJO ? There have already been lots of flamebaits over this, I am not going into this debate. But immediately certain issues crop up with the above injection :

  • The class Account becomes compile-time dependent on a third party jar. That import lying out there is a red herring.

  • The class loses some degree of unit-testability. Of course, you can inject mocks through Spring DI and do unit testing without going into wiring the hoops of an actual authorization service. But still, the moment you make your class depend on a third party framework, both reusability and unit testability get compromised.

  • Using @Configurable makes you introduce aspect weaving - load time or compile time. The former has performance implications, the latter is messy.



Does this really make my domain model richer ?

The first question you should ask yourself is whether you followed the minimalistic principle of class design. A class should contain *only* what it requires to encapsulate its own behavior and nothing else. Often it is said that making an abstraction design better depends on how much code you can remove from it, rather than how much code you add to it.

In the above case, transfer() is not an innate behavior of the Account entity per se, it is a use case which involves multiple accounts and maybe, usage of external services like authorization, logging and various operational semantics like transaction behavior. transfer() should not be part of the entity Account. It should be designed as a domain service that uses the relationship with the entity Account.


class AccountTransferService {
    // dependency injected
    private AuthorizationService auth;

    void transfer(Account from, Account to, BigDecimal amount) {
        auth.authorize(from, to, amount);
        from.debit(amount);
        to.credit(amount);
    }
    //..
}



Another important benefit that you get out of making transfer() a service is that you have a much cleaner transactional semantics. Now you can make the service method transactional by adding an annotation to it. There are enough reasons to justify that transactions should always be handled at the service layer, and not at the entity layer.

So, this takes some meat out of your entity Account but once again gives it back the POJO semantics. Taking out transfer() from Account, also makes Account decoupled from third party services and dependency injection issues.

What about Account.debit() and Account.credit() ?

In case debit() and credit() need to be designed as independent use cases under separate transaction cover, then it definitely makes sense to have service wrappers on these methods. Here they are ..


class AccountManagementService {
    // dependency injected
    private AuthorizationService auth;

    @Transactional
    public void debit(Account from, BigDecimal amount) {
        from.debit(amount);
    }
    @Transactional
    public void credit(Account to, BigDecimal amount) {
        to.credit(amount);
    }
    @Transactional
    public void transfer(Account from, Account to, BigDecimal amount) {
        //..
    }
    //..
}



Now the Account entity is minimalistic and just rich enough ..

Injection into Entities - Is it a good idea ?

I don't think there is a definite yes/no answer, just like there is no definite good or bad about a particular design. A design is a compromise of all the constraints in the best possible manner and the goodness of a design depends very much on the context in which it is used. However, with my experience of JPA based modeling of rich domain models, I prefer to consider this as my last resort. I try to approach modeling an entity with a clean POJO based approach, because this provides me the holy grail of complete unit-testability that I consider to be one of the most important trademarks of good design. In most of the cases where I initially considered using @Configurable, I could come up with alternate designs to make the entity decoupled from the gymnastics of third party weaving and wiring. In your specific design there may be cases where possibly you need to use @Configurable to make rich POJOs, but make a judgement call by considering other options as well before jumping on to the conclusion. Some of the other options to consider are :

  • using Hibernate interceptors that does not compromise with the POJO model

  • instead of injection, use the service as an argument to the entity method. This way you keep the entity still a pure POJO, yet open up an option to inject mocks during unit testing of the entity


Another point to consider is that @Configurable makes a constructor interception, which means that construction of every instance of that particular entity will be intercepted for injection. I do not have any figures, but that can be a performance overhead for entities which are created in huge numbers. A useful compromise in such cases may be to use a getter injection on the service, which means that the service will be injected only when it is accessed within the entity.

Having said all these, @Configurable has some advantages over Hibernate interceptors regarding handling of serialization and automatic reconstruction of the service object during de-serialization.

For more on domain modeling using JPA, have a look at the mini series which I wrote sometime back. And don't miss the comments too, there are some interesting feedbacks and suggestions ..

Friday, December 14, 2007

Closures in Java - the debate continues ..

Neal Gafter mentions in his blog while referring to Josh's presentation at Javapolis ..
The point he (Josh) makes is that function types enable (he says "encourage") an "exotic" style of programming - functional programming - which should be discouraged, otherwise the entire platform will become infected with unreadable code.

Huh! What exactly is the implication here ?

It could be either ..

  • Functional programming leads to unreadable code.


OR

  • The mass of today's Java programmers soaked in the trenches of imperative programming are likely to generate unreadable code with powerful tools of functional programming.


I sincerely hope it is not the first one that Josh meant.

What exactly is unreadable code ? To someone not familiar with the idioms of a particular language, perfectly idiomatic code may seem unreadable. To someone not familiar with the functional paradigms, higher order functions, list comprehensions and closures will look illegible.

The question is, do we constrain down a language to the lowest common denominator only to cater to the masses or try to uplift the developers by encouraging them to program to a higher level of abstraction using more powerful idioms of functional programming. Closures in Java definitely introduce more powerful abstractions, and coupled with tail call optimization, will help us write more concise and succinct programs. To someone familiar with functional programming, the resultant code will definitely be more readable and enjoyable.

I am all for having true closures in Java 7 !

Wednesday, December 12, 2007

How do you model a Domain Entity ?

Steve Freeman recommends using an interface for every domain entity in order to have a clean layering in architecture and no dependency between the domain and the persistence model of implementation. He does not mind if there is a single implementation for every interface and recommends his paradigm for expressing the needs of the domain code more clearly by limiting its dependency to an interface that defines just the services it needs from other parts of the system.

I am not sure if I agree to his principles. While not being a fanboy of interfaces-even-for-single-implementations club, I do not think using concrete classes for domain entities will incur any dependency between the domain layer and the persistence services. Standards like JPA backed up by ORM implementations like Hibernate provide transparent persistence services today, which can be plugged in non-intrusively into your domain model. I have indicated the same in the comments to his post, but just thought of having a separate post to make my point more clear.

Regarding data access using JPA, Repositories provide a great abstraction to encapulate them. While repositories belong to the domain services layer, they use domain entities and value objects to transport data across the layers of your model. Repositories also abstract away specific query languages like EJB QL or Hibernate HQL behind intention-revealing interfaces, keeping your domain entities free of any such dependencies. I had blogged about generic repository implementations to abstract away transparent data access code from domain models using the Bridge design pattern. All configuration parameters including EntityManagerFactories can be injected into your repository implementations through DI containers like Spring, keeping the domain model clean from these dependencies.

And JPA provides a nice standardized set of contracts to map your relational data model into your object-oriented domain entity class. All the annotations are from JPA, where you do not have to import any non-standard stuff into your codebase - all imports are from javax.persistence.*. And if you think annotations couple your code with the data model, go ahead and use XML for a completely transparent and decoupled model mapping. I have talked about the virtues of JPA based domain modeling and repository abstraction some time back.

Using transparent data persistence backing up your domain model, I tend to follow the policy of having one concrete class for every domain entity. I use JPA annotations for mapping domain model to the relational data model. This way the implementation adheres to the standards, and I have one clean artifact as my domain entity abstraction.

Thursday, December 06, 2007

Erlang Style Concurrency

In this comp.lang.functional thread, Ulf Wiger sums up Erlang style concurrency in the following 7 points :
1 lightweight actors
2 (conceptually) no shared memory between actors
3 (conceptally) copying asynchronous message passing
4 distribution transparency
5 scoped message reception (selective receive)
6 process monitoring, enabling "fail-fast" programming with supervision structures for error recovery
7 cascading exits, e.g. through process linking

He also goes on to discuss the importance of each of the above features for implementing scalable, robust, distributed systems with high concurrency.

The thread discusses synchronous versus asynchronous message passing, pros and cons of selective receive and how the message passing model differs between Erlang and O'Haskell. A good read for heavy duty concurrency aficionados !

Monday, December 03, 2007

Will Closures in Java 7 help in making Functional Programming mainstream ?

Of late, there has been a lot of discussions on the usage of advanced programming idioms in the developers' community. Are we getting too much confined within the limits of strongly typed object-oriented paradigms and ignoring the power that dynamic non object oriented languages (read functional languages) have to offer ? Some of the blogs put forward the strawman's argument dismissing the usage of dynamic languages and more powerful abstractions as somewhat elitist and not suitable for the mass programmers. C# has added lots of functional paradigms to the language, Microsoft has positioned F# as a mainstream functional language for the .NET platform. Ruby has lots of functional features, Erlang has started being in the limelight and Haskell still reigns supreme amongst all favorites in reddit postings.

Still the adoption rate is not heartening enough and there are enough indications of apprehensions from the corners who make the masses. Is it the fear of dynamic typing or the lack of tools for refactoring support that's holding us back ? Or will the functional programming support from today's most widely used enterprise language can act as the catalyst towards easier adoption of folds and unfolds ?

I am not much of a believer in the strawman argument. But I think the push has to come from the medium which controls the balance and there is no denying the fact that today's enterprise programmers do Java for a living.

How close are we to getting closures in Java 7 ?

Do I get ..


public static <T,U> U fold(List<T> us,{T,U=>U} fun, U init) {
    U acc = init;
    for(T elem : us) {
        acc = fun.invoke(elem, acc);
    }
    return acc;
}



even though I would like to have ..


public static <T,U> U fold(List<T> us,{T,U=>U} fun, U init) {
    if (us.size() == 0) {
        return init;
    }
    return fold(us.subList(1, us.size()),
                fun, fun.invoke(us.get(0),
                init));
}



Either way I should be able to write ..


List<Integer> ints = new ArrayList<Integer>();
//.. populate ints
fold(ints,
      {Integer elem,Integer acc=>elem + acc},
    0);



With Neal Gafter's prototype, I can already write this, though we need TCO in order to play around with meaningful functional control structures.

Thursday, November 29, 2007

Fun with unfold in Erlang

Fold is one of the most commonly used catamorphisms in functional languages. Erlang has foldl and foldr, which are used very frequently to encapsulate some of the very common patterns of computation as higher order operators instead of using recursion directly. Fold's dual is unfold (anamorphism), which unfortunately does not enjoy an equal popularity amongst functional programmers. While fold is a recursion operator that consumes a collection, it's dual unfold encapsulates a common pattern that produces streams / collections from a single object.

Here is an attempt to implement an unfold in Erlang .. adopting from this thread ..


unfold(Seed, Predicate, Transformer) when is_integer(Seed) ->
    unfold(Seed, Predicate, Transformer, 1).

unfold(Seed, Predicate, Transformer, Incrementor) ->
    unfold(Seed, Predicate, Transformer, Incrementor, []).

unfold(Seed, Predicate, Transformer, Incrementor, Acc) ->
    case Predicate(Seed) of
        true -> unfold(Incrementor(Seed),
                       Predicate,
                       Transformer,
                       Incrementor,
                       [Transformer(Seed)|Acc]);
        false -> lists:reverse(Acc)
    end.



It is a simple implementation that takes a seed, a transformer that transforms a state to an output value, a predicate that tells when to stop and an incrementor that takes us to the next state. And it generates finite collections, though a generic implementation should potentially be able to generate infinite streams as well. But it provides a base for implementing some of the common functions from the Erlang library in a pointfree notation ..


seq(Min, Max, Incr) when Incr > 0, Max >= Min ->
    unfold(Min,
           fun(X) -> X =< Max end,
           fun(X) -> X end,
           fun(X) -> X + Incr end);

seq(Min, Max, Incr) when Incr < 0, Max =< Min ->
    unfold(Min,
           fun(X) -> X >= Max end,
           fun(X) -> X end,
           fun(X) -> X + Incr end).



This one is more powerful. It packs an iteration over multiple lists into one operation and completely abstracts away the iteration from the user.


zip(Ls) ->
    unfold(Ls,
           fun(X) -> length(hd(X)) > 0 end,
           fun(Lists) -> [hd(List) || List <- Lists] end,
           fun(Lists) -> [List -- [hd(List)] || List <- Lists] end).



And you can treat them up with String lambdas as well .. of course for this you will need the string lambda library ..


zip(Ls) ->
    unfold(Ls,
           lambda("length(hd(_)) > 0"),
           lambda("[hd(List) || List <- _]"),
           lambda("[List -- [hd(List)] || List <- _]")).

seq(Min, Max, Incr) when Incr > 0, Max >= Min ->
    unfold(Min,
           lambda("_ =< Max"),
           lambda("_"),
           lambda("_ + Incr"));

seq(Min, Max, Incr) when Incr > 0, Max >= Min ->
    unfold(Min,
           lambda("_ >= Max"),
           lambda("_"),
           lambda("_ + Incr")).



Why unfold when you can recurse directly ? It provides one more pattern to programming in functional languages. Virtues of catamorphism and anamorphism have been well elucidated in various literature of functional programming. Look here for a detailed discussion on the virtues of unfold and unfold characterization of many well known algorithms.

Monday, November 26, 2007

Productivity, Team Size and the Blub Paradox

Reginald Braithwaite is one of my favorite bloggers. Each of his postings make me think, some of them leave a lasting impression. One of them is this one that talks about small teams, productivity and the power of abstraction in programming languages.

Do we necessarily have to build large teams to solve complex programming problems ? Here is what Reginald has to say ..
All of our experience in the last sixty years has suggested that productivity drops off a cliff as team size increases. So, if you want more code from a larger team, you have to invest heavily in ways of extracting value out of unproductive people in an unproductive environment.

We cannot build our project execution infrastructure for the unproductive mass of average developers. (via Neal Ford's JRuby podcast) Glen Vanderburg once noted that bad developers will move heaven and earth to do the wrong thing. And by having a restricted environment all you are doing is constraining the power of good developers, not making bad developers any better.

And this is where the expressive power of the programming language comes in. Paul Graham has talked a lot about succinctness, blub programmers and metrics for the continuum of abstractness for programming languages. Good developers love to program in languages higher up the power continuum, while a blub programmer looks out for the averagest feature-set that aligns well within his comfort zone. Hence, as Reginald says ..
If we know that bug per line of code remains amazingly constant, why do we try to scale code out in verbosity rather than up in abstraction?

Scaling out in verbosity is encouraging those language features that add to the lines of code at the cost of the levels of abstraction. Which in turn is encouraging the blub paradox.

Java is still the most dominant language of the enterprise and JVM is undoubtedly the ubiquitous platform. It is difficult for an enterprise to move away from a platform overnight. But today we have a number of alternatives in programming languages that run on the Java Virtual Machine. Many of them are much higher than Java in the power continuum of abstractness. Isn't it time that we start embracing some of them, at least incrementally ? Neal Ford makes a great point in his JRuby podcast while talking about polyglot programming. Tests and builds are two areas that do not ship to the customer. These make great candidates for bootstrapping other JVM friendly languages into the enterprise. Next time try using JRuby for writing your tests, Raven for writing the build scripts. Soon you may start feeling that you could have collapsed your multi-level strategy hierarchy in Java using higher order functions.

Wednesday, November 21, 2007

Erlang String Lambdas

Over the last couple of months I have been using Oliver Steele's Functional Javascript a lot. The string lambdas provide enough syntactic sugars to reduce a lot of accidental complexities from your codebase. Reginald Braithwaite recently gave us String#to_proc as a port of the same in Ruby. I am a big believer in adding syntactic sugars to a language so long it helps reducing boilerplates and making the domain behavior more expressive. I have been learning Erlang for some time now, sneaking off and on into Joe Armstrong's book and casually feeding into some of the Erlangy blogs around. Over the last weekend I wanted to have a go at porting Oliver Steele's string lambda into Erlang. And here is the result ..

Typically in Erlang you write ..

(fun(X) -> X * 2 end)(6)

to get 12 as the result. With string lambda, you write ..

(utils:lambda("X * 2"))(6)

to get the same result. Ok, counting the characters of the module and the function, you grin at the fact that it is really more verbose than the original one. But we have less boilerplates, with the explicit function declaration being engineered within the lambda.

A couple of more examples of using the string lambda :

(utils:lambda("X+2*Y+5*X"))(2,5) => 22
(utils:lambda("Y+2*X+5*Y"))(5,2) => 34


Less noisy ? Let's proceed ..

You can have higher order functions using the string lambdas, and with lesser noise .. The following are based on original Javascript examples in Oliver Steele's page ..

lists:map(fun(X) -> X * X end, [1,2,3]).

-> [1,4,9]

lists:filter(fun(X) -> X > 2 end, [1,2,3,4]).

-> [3,4]


lists:any(fun(X) -> length(X) < 3 end, string:tokens("are there any short words?", " ")).


-> false

and with string lambdas, you have the same result with ..


utils:map("X*X", [1,2,3]).
utils:filter("X>2", [1,2,3,4]).
utils:any("length(_) < 3", string:tokens("are there any short words?", " ")).



In the last example, _ is the parameter to a unary function and provides a very handy notation in the lambda. Here are some more examples with unary parameter ..

%% multiply by 5
(utils:lambda("_*5"))(8).

-> 40


%% length of the input list > 2 and every element of the list > 3
(utils:all_and(["length(_) > 2", "utils:all(\">3\", _)"]))([1,2,3,4]).


-> false

Explicit Parameters

As per original convention, -> separates the body of the lambda from the parameters, when stated explicitly and the parameters are matched based on their first occurence in the string lambda ..

(utils:lambda("X Y->X+2*Y"))(1,4).

-> 9

(utils:lambda("Y X->X+2*Y"))(1,4).

-> 6

Implicit Parameters

If not specified explicitly, parameters can be implicit and can be effectively deduced ..

%% left section implicit match
(utils:lambda("/2"))(6)

-> 3

%% right section implicit match
(utils:lambda("10/"))(2)

-> 5

%% both sections implicit match
(utils:lambda("/"))(12,6)

-> 2

Chaining for Curry

The operator -> can be chained to implement curried functions.

((utils:lambda("X->Y->X+2*Y"))(1))(4).

-> 9

or deferred invocation ..

Fun=(utils:lambda("X->Y->X+2*Y"))(1).

-> #Fun<erl_eval.6.72228031>

Fun(4).

-> 9

Higher Order Functions

String lambdas allow creating higher order functions with much less noise and much more impact than native boilerplates in the language.


%% compose allows composition of sequences of functions backwards
(utils:compose("+1", "*2"))(1).


-> 3

%% higher order functions
utils:map("+1", utils:map("*2", [1,2,3])).

-> [3,5,7]

lists:map(utils:compose(["+1", "*2"]), [1,2,3]).

-> [3,5,7]

And here are some catamorphisms ..

lists:reverse() is typically defined by Erlang in the usual recursive style :

reverse([] = L) ->
    L;
reverse([_] = L) ->
    L;
reverse([A, B]) ->
    [B, A];
reverse([A, B | L]) ->
    lists:reverse(L, [B, A]).


Here is reverse() using catamorphism and delayed invocation through string lambdas ..

Reverse = utils:lambda("utils:foldl(\"[E] ++ S\", [], _)").

-> #Fun<erl_eval.6.72228031>

and later ..

Reverse([1,2,3,4]).

-> [4,3,2,1]

Or the classic factorial ..


Factorial = utils:lambda("utils:foldr(\"E*S\", 1, lists:seq(1,_))").


-> #Fun<erl_eval.6.72228031>

and later ..

Factorial(5).

-> 120

Motivation to learn another programming language

I am no expert in functional programming. I do not use functional programming in my day job. But that is why I am trying to learn functional programming. Also it is not that I will be using functional programming to write production code in the near future. We all know how the enterprise machinery works and how a typical application developer has to struggle through the drudgery of boilerplates in his bread-earner job. Learning newer paradigms of programming will teach me how to think differently and how to move gradually towards writing side-effect free programs. The principal take-away from such learning is to be able to write a piece of code that encapsulates my intention completely within the specific block, without having to look around for those unintentional impacts in other areas of the codebase.

I am a newbie in Erlang and the attempt to port string lambdas to Erlang is entirely driven by my desire to learn the programming language. The source code is still a bit rusty, may not use many of the idioms and best practices of the language, and may not cover some of the corner cases. Go ahead, download the source code, and do whatever you feel like. But drop in a few comments in case you have any suggestions for improvement.

- lib_lambda.erl (the string lambda implementation)
- lib_misc.erl (miscellaneous utilities)
- lib_lambda_utilities.erl (utility wrappers)