The Obvious Truth
One of the most important qualities of a good programming language is the first-class function. This critical language feature makes possible a whole class of behavioural and data abstraction techniques. Without it we could not incorporate many modern language features such as continuations, generators, or callbacks.
There are, of course, workarounds for each of these features. Continuations can be simulated in object-oriented languages, as a number of articles describe, and callbacks can in most cases be replaced by the somewhat more complicated event or message-passing mechanisms. Generators, however, really are of little use without proper first-class functions.
A generator is simply a function that sends a number of values, possibly infinitely many, to an independent language construct, which can then consume these values one at a time. The construct classically associated with generators is the for loop. The loop invokes the generator, a value function within the generator produces each value, and the value function in turn calls a function parameter containing the code body of the loop. There are usually some extra mechanisms to handle premature termination of the loop (among other things).
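The arrangement described above can be sketched in a few lines. This is a minimal illustration in Python rather than a definitive implementation. The names `for_each`, `emit`, `naturals`, and `StopLoop` are all hypothetical, and using an exception for early exit is just one assumed way of handling premature termination.

```python
class StopLoop(Exception):
    """Raised by a loop body to end the loop early (hypothetical mechanism)."""


def naturals(emit):
    """A generator in the sense above: it pushes 0, 1, 2, ... into `emit`.

    The sequence is infinite, so only early termination ends the loop.
    """
    n = 0
    while True:
        emit(n)
        n += 1


def for_each(generator, body):
    """The 'for loop': invoke the generator, feeding each value to `body`."""

    def emit(value):
        # The value function: hands each produced value to the loop body,
        # which is passed around as an ordinary first-class function.
        body(value)

    try:
        generator(emit)
    except StopLoop:
        # Premature termination requested by the loop body.
        pass


squares = []


def loop_body(n):
    if n >= 5:
        raise StopLoop
    squares.append(n * n)


for_each(naturals, loop_body)
```

After the call, `squares` holds the squares of 0 through 4. Without the ability to pass `loop_body` and `emit` around as values, none of this could be written as ordinary library code.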
I digress, however, since what I really want to discuss here is the language elements left behind: features such as environment management, function dispatch mechanisms, exception handling, memory management, concrete syntax, and so on. What I really want to see is language designers moving towards a more unified system, providing programmers, or at least library implementers, with abstractions for these features, rather than fixed semantics that limit the expressivity of the language.
You see, despite so many languages taking great pride in providing first-class functions, most actually just inherited them thanks to the popularisation of functional programming techniques. The real hidden gem here, I feel, is reifying language semantics. This can allow programmers to selectively enhance or extend their favourite language without losing the built-in compiler or interpreter, possibly at the cost of more complex optimisation.
My favourite example here is the environment, an element of language design that has historically attracted endless debate. Yet, for reasons that I cannot comprehend, even languages that promote expressivity over implementation consistently refuse to provide language-level abstractions over these features.
I’m not saying that most programming requires such levels of expressivity. Quite the opposite: most programming tasks will never require it. But there are some tasks that would benefit greatly from the presence of such features. Rather than re-implementing these, it should be possible to just tap into the abstractions provided by the programming language.
One style of programming that would definitely benefit from these extensions is programming language research. It would be much simpler to test out new environment models, construct prototype virtual machines and interpreters, and even mess around with less common things like dispatch algorithms or meta-object protocols if, instead of building entirely new environment abstractions, the existing ones could be utilised and re-engineered in a controlled way. This requires two things: abstractions that limit direct manipulation of the host language implementation (otherwise you are really just re-implementing the host language), and an interface that is preferably more powerful than the one likely used in the host language's compiler or interpreter.
Using Scheme as an example, it would be trivial to add some intermediate abstractions for managing the environment. Scheme already implements something very close to first-class environments, and in fact many Lisps do provide procedures for inspecting and controlling where evaluation takes place, but it’s really not quite there yet.
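To make the idea of a first-class environment concrete, here is a rough sketch in Python (not Scheme) of a toy evaluator whose environment is an ordinary value that a program can create, extend, and pass to the evaluator. Everything here is hypothetical and for illustration only: the `Environment` class, its `lookup`, `define`, and `extend` methods, and the tuple-based expression format are my own assumptions, not any real Scheme implementation's interface.

```python
import operator


class Environment:
    """A chain of binding frames, exposed to the program as a plain value."""

    def __init__(self, bindings=None, parent=None):
        self.bindings = dict(bindings or {})
        self.parent = parent

    def lookup(self, name):
        # Walk outwards through enclosing frames until the name is found.
        env = self
        while env is not None:
            if name in env.bindings:
                return env.bindings[name]
            env = env.parent
        raise NameError(name)

    def define(self, name, value):
        self.bindings[name] = value

    def extend(self, **bindings):
        # Create a child environment without touching this one.
        return Environment(bindings, parent=self)


def evaluate(expr, env):
    """Evaluate numbers, names (strings), and (function, arg, ...) tuples."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env.lookup(expr)
    head, *args = expr
    function = evaluate(head, env)
    return function(*(evaluate(arg, env) for arg in args))


global_env = Environment({"+": operator.add, "*": operator.mul, "x": 2})
inner_env = global_env.extend(x=10)

expr = ("+", "x", ("*", "x", 3))
outer_result = evaluate(expr, global_env)  # x is 2 here
inner_result = evaluate(expr, inner_env)   # x is shadowed by 10 here
```

The same expression gives 8 in `global_env` and 40 in `inner_env`. The point of the sketch is that the program itself chooses where evaluation takes place, and a different environment model could be swapped in by replacing `lookup` alone.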
The reason I singled out Scheme in this case, even compared with other dialects of Lisp, is that the larger dialects typically include an impressive continuations facility. This combines with the hygienic macro feature to produce extremely impressive language extension capabilities. Unfortunately, when writing domain-specific languages you still need to generate raw code if you wish to allow fragments of host language code to be embedded into the DSL. In my case these fragments correspond to actions and predicates in a parsing language I’ve been implementing.
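As a loose illustration of what generating raw code for embedded fragments looks like, here is a small Python sketch. It is not my parsing language: `compile_rule`, `parse`, and the rule names are hypothetical, and it assumes the predicate and action fragments arrive as strings of host-language source that refer to a variable called `token`.

```python
def compile_rule(name, predicate_src, action_src):
    """Generate raw host-language source for one rule, then compile it.

    The fragments are spliced in as text, so the DSL has no say over which
    environment they are eventually evaluated in.
    """
    source = (
        f"def {name}(token):\n"
        f"    if {predicate_src}:\n"
        f"        return {action_src}\n"
        f"    return None\n"
    )
    namespace = {}
    exec(source, namespace)
    return namespace[name]


def parse(tokens, rules):
    """Apply the first rule whose predicate accepts each token."""
    results = []
    for token in tokens:
        for rule in rules:
            value = rule(token)
            if value is not None:
                results.append(value)
                break
    return results


# Each rule pairs a predicate fragment with an action fragment.
digit = compile_rule("digit", "token.isdigit()", "int(token)")
word = compile_rule("word", "token.isalpha()", "token.upper()")

parsed = parse(["12", "abc", "7"], [digit, word])
```

Here `parsed` ends up as `[12, 'ABC', 7]`. The splice-and-compile step in `compile_rule` is exactly the part that language-level abstractions over the environment could make unnecessary.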
Perhaps I should attempt a Scheme dialect with such features, and just maybe I’ll even end up with something new?
– Lorenz
Tags: Abstraction, Language Design, Scheme