Experiments in Emacs

One of the most difficult challenges facing me when beginning to work with both Erlang and Haskell, was the significant lack of good IDEs… Well, at least that was the theory, anyway. Instead it turned out to be a real motivator to finally get around to investigating the only IDE that counts.

The real problem wasn’t that I couldn’t use Emacs for basic editing, rather I didn’t really have any idea how to use any of the extra features. Essentially it was like working with a semi-decent TextEdit.app, with code highlighting and unfamiliar keyboard shortcuts. Even after reading some of Steve Yegge’s essays I still didn’t really have any idea what I was doing.

Ok, so basically I’m now writing this in Emacs’ SGML mode and carefully trying to get around the perverse limitations of Wordpress. (Where the heck did the allow raw HTML checkbox go for author text?) Ironically I don’t think I’ve yet gained the productivity advantages from my efforts yet, but honestly that can probably just be attributed to a slow down in my typing speed and the rather unexpected work schedule I’ve been (you never realize just how much you appreciate sleep until you don’t get enough).

If I haven’t gained any real productivity, the question becomes why then am I still using Emacs to write this post? Well, as it turns out its not actually too bad for even when you don’t know all of the keyboard commands (all commands come from the keyboard, and most important commands really never migrate to them menu or at least don’t indicate they’re keyboard equivelent).

Even more importantly perhaps is the gains that I’ve achieved in other tasks…writing blog posts in Emacs may not be any faster for me (although I do seem to see much more of the post at any one time)…but writing code sure is. I seriously doubt that I’d change from using things like XCode for Cocoa applications or DrScheme for PLT Scheme applications (although I may change my mind on Scheme), but at least I have direct access to the REPL in Haskell and Erlang.

The way most of the editing modes in Emacs actually work, it essentially will even load up a file your working directly into a REPL running side by side in another text buffer. From there you can test smaller parts of your program or even just run the whole thing from the top if you needed to. Anyone who’s worked with a Lisp dialect or even Python or Erlang (among others) should be able to apreciate that.

There are a couple of things that have been bugging me so far. The first is the lack of a spell checker. Emacs does have built-in support for ispell, but as you probably guessed, it doesn’t work on my machine (actually it isn’t even installed on my cygwin install). Too make matters worse, I still need to teach Emacs how to find my Cygwin install and use programs from it before the windows paths…ugh…

And the other thing doesn’t even have anything to do with Emacs, but rather, its a limitation with the Haskell code I’m working on. Essentially I don’t have a framework for OpenGL work that allows me to substitute the code on my side of the framework after the application has been started…or more to the point I can’t stop the application after it has started, forcing me to quit the entire REPL every time I test the Game.

Of course the simple workaround is to just keep restarting the REPL or move out of Emacs into a Cygwin shell, but its still very inconvenient (damn you GLUT).

All in all I can only really see how it goes, but at the very least it should make life just that little bit easier. At worst, I’ll just have to resort to my aging Mac Powerbook (for the BSD layer).

Until next time…

– Lorenz

Monad Actions… ok… sure… whatever…!?

When I was first learning about OpenGL in Haskell it dawned on me that the example I was reading was doing some pretty crazy hung up things with Haskell style DSL techniques.

White this might make sense for a syntax based DSL (imho), I wasn’t sure this was absolutely necessary under OpenGL since the entire library is already just a bunch of monadic actions!?

Ok… that’s cool, maybe it was just overlooked, and I thus devised a test. Write a small piece of monadic code that re-evaluated an action over and over again…and it worked!!!

w00t!!!1one

uh… yeah… maybe you should just read this instead… save yourself some time and embarrassment ;)

IO Inside

One the upside… that game is still slowly progressing…

– Lorenz

Haskell Does Concurrency

Ok, by now we all know that Object Oriented programming is likely to go the way of Erlang, or at least inherit an Actor model of language level concurrency.

In fact, I’ve even got an object system that I’m planning to implement just to find out how well this style of concurrency works in an Object Oriented programming language. The design of the language is even more post-Self/Slate than Newspeak (although I believe that this language will just complement the Smalltalk family of languages in all honesty).

Well then, Haskell…Haskell is actually very interesting, in this case by being one of the very few to push the limits of locking methods for concurrency rather than using the larger granularity involved in processes and message passing (not by much, but its still something).

The problem is that its impossible to write locking code by hand that doesn’t have at least one race condition hiding away in the corner. Not the ones you’ve already tested for, but the ones that would cause transactions to restart…like multiple threads locking a few items and then deadlocking…or even just managing the code required to restart deadlocks…yeah…lol…

Haskell provides a novel approach (its actually pretty old now, but it’s really only just starting to catch on, outside of Haskell anyway)… Software Transactional Memory.

STM is sometimes used in other areas of computing (OODBs, et ceteral), but the real gem here is that the Haskell type system nicely compartmentalized the side effects involved into a clever transaction form, within our imperative outer program. This makes working with transactions very much like the database stuff, only we’re only working with individual variables, so no complex table lookups.

Anyway, enough of my rambling, check it out:
Programming in the Age of Concurrency Software Transactional Memory

And if you haven’t already, remember to look at Erlang too…after all concurrency isn’t going away, and there are only two ways to get it right!

– Lorenz

Parsers and Combinators

Last time I showed some brief examples of what a parser combinator is, how they work and how I intended to use them to solve the parsing problem using the Scheme programming language. This time we get to start the interesting stuff…

To get things going, last time we looked at a simple character facility which returns a parser that will only accept that particular character. We can also use this technique to select from a range of characters.

(define (char-range start end)
  (λ (state)
    (let ([rc (read-char state)])
      (cond ((eof-object? rc) fail)
            ((and (char>=? rc start)
                  (char<=? rc end))
             rc)
            (else fail)))))

We might also wish to repeat the actions of a parser, which involves writing a proper combinator. Actually, this is the first real utility parser we’ll see and it simply allows us to infinitely repeat the parser provided.

(define (many p)
  (letrec ([mp (λ (state acc)
                 (aif result fail? ((with-mark p) state)
                      acc
                      (mp state (cons result acc))))])
    (λ (state)
      (reverse (mp state '())))))

This defines a small function that accumulates a list (backwards as usual) by repeatedly applying the parser (p). The with-mark stuff is the second combinator we’ll look at, which is used here to prevent the stream position from moving beyond the last match.

The description of many would be a combinator that continues to apply a parser until it fails, reverting the parser’s state to the state prior to running the final failing application of the parser.

The definition of with-mark is even simpler (or should this be with-state?).

(define (with-mark parser)
  (λ (state)
    (let ([mark (file-position state)])
      (aif v fail? (parser state)
           (begin (file-position state mark)
                  v)
           v))))

Our parser state is of course still non-functional, based simply on the port capabilities built into Scheme. We check to make sure the parser doesn’t fail, and if it does we reset the parsers state by recording it before hand. Non-functional user state can be stored in a larger closure/environment, and this could be changed to support functional state as a second variable (in a data structure or list), but I’ll leave to a later post, or the reader of course.

Running our parser is actually this simple (from the repl).

(define digit
  (char-range #\0 #\9))
 
(define state (open-input-string "12345"))
 
((many digit) state)

There is one slight problem with this… It accepts any number of digits… including zero digits! I’ve provided a second version called many1, but you could actually define it like the following snipped. (Note: See the bottom of this post to find the source code as it currently appears).

Ok, we still need a way to parse whitespace, and to abstract the tokens of whatever language we’re implementing. To help us here we can write a whitespace parser and a token combinator which will help us.

(define ws-char
  (λ (state)
    (let ([rc (read-char state)])
      (cond ((eof-object? rc) fail)
            ((char-whitespace? rc) rc)
            (else fail)))))
 
(define whitespace
  (many1 ws-char))
 
(define (wstok tparser)
  (mand (opt whitespace)
        tparser))

I’d just like to point out before we more on… the ws-char parser here actually recognizes any Scheme whitespace character. I could have written this parser to recognize some other whitespace character set but I felt that this would serve us better for now, since we might eventually want to test our method against the existing Scheme language, or at least a sizable subset.

Again, using this is quite simple… to recognize a number token we can simple define the number token parser as (wstok (many1 digit)).

Now we still have one minor issue here… what if we don’t want to simply recognize the language, but also produce some direct result. For example, our parser so far can recognize number tokens, but it doesn’t return those numbers… just the lists of characters that make them up! (Without the whitespace of course.)

(define t-number
  (mlet nstr (wstok (many1 digit))
    (! (string->number (list->string nstr)))))

This is where the magic begins… mlet and ! are just macros which expand to scheme let and raw expressions wrapped in lambdas which require the current parser state. mlet then also takes a symbol to bind, and two parsers, which is why we need the ! macro magic. Remember that parsers are just lambdas that take a single parameter… it doesn’t matter if they don’t use that parameter, or just pass it on (like mlet).

It might also be interesting to note that we could write a combinator that took a function that accepted the results of the parser, and produced a replacement result. Technically this parser is actually very much like the one we just wrote, except that it can only filter the results of a parser, and not the results of several parsers which can only be achieved though nested mlet expressions. Usage might look similar to the following.

(define t-number
  (return nstr (wstok (many1 digit))
    (string->number (list->string nstr))))

Pretty close right.

By this point I thing most of you are starting to see what we’re getting to… the canonical example of an expression parser, but this time I’m actually going to add a slight twist to the story instead. We’re actually going to generate Scheme code instead, and then use the standard eval function instead. I know this might seem like a strange thing to do, especially since we don’t normally use eval in most Scheme code, but think forward a little. Normally you wouldn’t directly evaluate the code you just parsed either… think of eval as our interpreter, operating on an abstract syntax tree… conveniently Scheme code is already very much like an AST so we just use that.

(define t-add
  (return _ (wstok-char #\+)
    '+))

This gives us our addition token returning, obviously, the Scheme + operator. And finally two more pieces to put it all together.

(define (left-assoc term op)
  (letrec ([f (λ (state acc)
                (aif lst fail? ((with-mark (all op term)) state)
                     acc
                     (f state
                        (list (car lst) acc (cadr lst)))))]) ;; (op lhs rhs)
    (mlet result term
      (λ (state)
        (f state result)))))
 
(define expr
  (left-assoc t-number t-add))

Finally, to end this… rather lengthy… example. Here is a finished REPL using our parser…

(define (expr-repl)
  (display ">> ")
  (let ([line (read-line)])
    (if (string-ci=? "exit" line)
        '()
        (begin 
          (display " = ")
          (display (eval (expr (open-input-string line))))
          (newline)
          (expr-repl)))))

Calling it is pretty simple, and we can get results like so…

> (expr-repl)
>> 1
 = 1
>> 1+2+3+4
 = 10
>> 1 + 2 + 3 + 4
 = 10
>> exit
()

The purpose of this simple example is to show you how our combinators and point out what, in my opinion, is possibly the most important observation to make about parser combinators… all our combinators return parsers and all our parsers are not combinators. Yup, you always have to realize when you’re looking at a combinator, and when your looking at a parser. Hell, you could even be looking at a combinator that returns a combinator! (currying et cetera)

Also, in the code I’ve used above, I made a point of following a coding convention that separates top-level parser definitions by using lambdas, and combinators or simple functions which use the define syntax instead. I found that this makes it easier to visually distinguish between the two, without having to resort to an explicit naming convention.

Next time we’ll move beyond this simple stuff and I’ll try to investigate more complicated combinators that don’t take parsers but data structures to construct their resulting parser.

– Lorenz

Recognize This!

Ok… I may not have entirely explained the machinery behind the basic backtracking parser combinators yet, but before I get to far into the heavy stuff I thought it might be useful to discuss some basic context issues involved in writing parsers.

There are two main elements involved in writing parsers… The Why… and the How (to some extent what and when also fit here… these are technical aspect that is).

Why do we write parsers?

There are actually a few reasons, and this isn’t really a particularly interesting question for us (after all, we’re already writing a parsers aren’t we?), but I would like to briefly recap what I believe are important points to keep in mind when designing a parser combinator framework.

There are two main varieties of parsers, which come in the form of prototypes and production parsers. Regardless of what the parser is being used for, the purpose of the parser is either to support another process, or it is simply to test some linguistic nature of a computer language (concrete syntax typically).

Of these two categories, we also stumble over a hybrid variation, where a parser written for production use may end up serving a prototype role and, of course, prototype parsers may end up evolving into production parsers.

The importance of this categorization is the implications of generally knee-jerk decisions associated with each approach. The main decisions we need to worry about ourselves when designing a parser framework are of course the error reporting and recovery mechanisms, basic performance concerns, and to some extent the language and tool support.

This isn’t just limited to our intended audience either, but also to ourselves as parser implementors… its all to common to consider why we are writing a parser and select a methodology based purely on practical concerns, and yet, quite often the very notation we use to write the parser is more important. The notation can have very severe effects on the cognitive processes involved in writing a parser or computer language, and so too can the tools that support that notation.

How do we implement parsers?

This part is a little simpler to describe at a high level… that and we’ll be delving into the issues more closely later on anyway.

I personally like to break the implementation concerns into three major categories… recognition, state, and output.

Each of these can become extraordinarily complex and intertwined, but they always tend to follow the same basic trade-offs which we’ll discuss briefly here.

Recognizing input is possibly the most obvious element of parser construction, since it tends to follow the design of the language anyway. More specifically it tends to follow the formal definition of a language, and most conventional notations use the same form as the BNF or PEG formal notation.

While it might seem useful to apply the same notation as the formal definition of a language, imo its not particularly friendly in practice. The main issue with this trade off is the difficulty in implementing non-trivial concerns like left-factoring, lookahead, error reporting/recovery or simply the fact that they don’t contain any proper form of abstraction (no state information or context sensitivity).

This the actually the reason we’re using a combinator approach… it allows us to capture the conceptual elements involved rather than directly writing a whole lot of repetitive rules, simply because the parser generator tool we selected, didn’t realize that scanners aren’t the only type of repetitive task in language writing (expression/operator grammars for example)… or even the lowly delimited list based on non-conventional parameter parsers.

This might actually be a good time to point out the ANTLR parser generator as an example of the kind of trade of we’re trying to avoid. I can’t stress enough how often I’ve wished I could refactor part of a parser involving delimited lists, enclosed parsers, associativity and precedence rules, and it just gets worse from there. The problem? No higher-level abstraction of rules (its like programming a user interface with 1st order logic… well… maybe not quite that bad but you get the idea).

Collecting state is the second important consideration when parsing a computer language. It might be interesting to consider what kind of state you’re accumulation however. In a calculator for example, the state is actually the current number and possibly a bindings table for variables if you allow them, but for a C parser you also need the symbol table and possibly even some other annotations for the preprocessor phase.

The point I’m trying to make here is not that parsers must return something… we’ll get to that in a second… the point is that many parser have other state that is needed by the other software components that receive the output form. This state obviously cannot be collected directly as a result, and yet it also has to be collected directly as a result of parsing.

Why do I care so much about these side effects? Its a perfect opportunity to reflect on Parsec’s usage of monads… and why we don’t need them!

Yup… that’s right, monads primarily capture a few things, but the main elements are timely evaluation (not necessarily eager), imperative ordering of evaluation, and propagation of out-of-band state information. To paraphrase… side effects and imperative code blocks. (Monads are actually capable of quite a bit more than this, but in our case, none of that behaviour is relevant.)

Actually we do still need to pass state, but whether we do it globally (uh…), or through an accumulator doesn’t matter… we have a different way of writing side effects, and we’ll use that for them. Scheme also provides the begin notation for us, so we have a perfectly viable means of writing imperative code if we need it, but even in Parsec it’s rarely a necessity.

Finally we come to the Output form, which is almost exclusively broken down into Abstract Syntax Trees (ASTs), Compilers (intermediate or machine code output), Simple Recognizers, Translators (Scheme to C for example), and Interpreters. Of these only ASTs are used when we have reasonable levels of computation power available, while the other forms are typically relegated to AST parsers or in our case, the native pattern matching and program code of the host language.

The big challenge we’re facing here however is actually related somewhat to the difference between prototype and production parsers. If we wish to provide good support for prototyping, we have to realize that its not entirely uncommon for interpreters to be written without an accompanying AST representation. Unless we can make the AST representation so inexpensive to generate and extraordinarily useful, we may wish to provide support for both ASTs and interpreters.

Whatever trade-off we make however, as long as we realize that we need to effectively support all of the possible applications unless we expect programmers to also use another parser framework or generator. Not necessarily perfectly, but at least adequately (we need to at least make generating ASTs cheap so that programmers can work with them even if the other forms are not simple to directly implement).

Before I finish I’d also like to add a small note about how parsers are actually used… or more specifically that the tools we provide for testing, debugging and maintaining our parsers are possibly more important than the ability to write the parsers in the first place.

For those who have used ANTLR before, you probably realize what I’m talking about here… yes the ANTLR grammars are limited, and painful to effectively refactor, when half the time you’re only even generating ASTs anyway. But the grammar debuggers and generally the ANTLR Works IDE are exceptional tools for constructing parsers. Also… I don’t mean to pay out the ANTLR parser generator… it really is the best tool I’ve used with Java or C++… its just not as powerful as I believe we can get in Scheme

While we can simplify the construction of most parsers greatly, it would be even better if we could also provide dedicated tooling for profiling, debugging, inspecting or even just testing our parsers. To this end, I’ll be adding various tooling issues interspersed with the construction of various parser facilities as we construct hopefully the canonical Scheme parser framework… or at least make the framework as useful as possible.

I don’t intend on covering any of this again in future posts in the series. Instead we’ll only be investigating the trade offs relevant for a Scheme parser combinator framework

– Lorenz

Language Design and Small Team Game Development

I’ve been mulling over my old game development hobby for a little while now, and after messing around with Python a little I noticed that once again, there seems to be very little in the way of promising concurrent game object level languages floating around. At least languages that I’m aware of.

What I’m talking about exactly is game languages which allow the game objects themselves to participate in concurrent programming practices, without using intricate and complex locking sequences, or losing the right to talk directly to engine/library level services like physics or high-level game state (moving between the level and a menu for example).

Instead, almost all game level languages at the moment seem to be evangelizing the importance of being script level uncompiled beasts that require little to no real programming experience before becoming a productive member of the game code team!?

Of course I’m not saying that this is an inherently bad practice with the current level of technology floating around, but I do have to wonder whether this is still relevant for small team game development in the near multi-core parallelism future. Perhaps too far, but I think also relevant is the much slower pace of Open Source Game Development, which realistically speaking needs technology almost 3 years ahead of its time just to break even on an initial 1.0 release!

My question is actual reasonably simple despite the cynical tone up to this point… Can we realistically design a new object oriented DSL that contains the necessary concurrency techniques to allow programmers to write reliable and correct concurrent game code. This actually goes beyond just the engine code for which we’ve always been able to reasonably easily move to a more sophisticated language like Haskell but more towards the game specific stuff, which we cannot afford to bog down in complicated syntax and general side effect hostility.

I actually believe this is possible, and I’ve already begun planning the first steps towards writing some basic engine code that will underpin the project but I’m definitely more interested in seeing how effectively I can pin down the STM concurrency model into an object oriented syntax.

Ok… what about the language stuff…

I’ve actually been looking for a good excuse to both learn Haskell more fully (I’ve really only coded trivial examples in it so far), as well as properly exercise the Parsec combinator parser and finally, and perhaps most of all, design a new object model and object oriented programming language using that model.

The last, ironically may have to be restrained a little, but a concurrent perhaps slightly more conventional object oriented language may not be so bad. The point is that the project will involve a fairly comprehensive object oriented programming language, and one that is not only fully concurrency enabled but also tailored to game development tasks.

This is probably a good place to point out, that the conventional object oriented language I’m talking about here doesn’t include either C++, Java, Ruby or even the CLOS, but rather I’m talking specifically about the Smalltalk family of languages here. Its the work done on Smalltalk/Squeak, Self, Slate and more recently Newspeak that I find the most interesting, and the dialect I develop will likely incorporate many of the simpler innovations in the object systems they use.

The most critical issue that I’m aware of is the complexity added by the multi-dimensional objects used by most OO languages these days (including Smalltalk in this case). Rather than allowing the object state to exist in two (or more) dimensions I’ll be limiting them to a single flattened dimension similarly to record types, and the programmer will need to use either delegation/cooperation or composition in order to accomplish tasks in a reusable fashion. If you didn’t follow the dimension of state thing I’m actually just talking about Inheritance (another dimension might be Aspects, et cetera).

While I don’t want to go to much further into the syntax I would like to note that my intentions for this, apparently built in, concurrency will probably be in the form of an asynchronous send, paired with transaction support (which will actually be the only way to guarantee that data is written correctly). The asynchronous send, by definition, allows the receive to process it in a different thread… but since my little language is probably going to be more or less interpreted or at least green threaded, I’m pretty sure that I can just replace the async send operation as a plain old spawn operation.

The async message send in this case should provide a convenient way of signaling between game objects, and paired with the transactions (noting that very few Erlang processes don’t actively apply transactional semantics as well), I should have an effective way of implementing fully concurrent game code.

Well then… hopefully I don’t get killed between all of these projects and we’ll see something interesting come out of this stuff. Until then…

– Lorenz

Scheme Parser Combinators

I’ve had my MMeta parsing library on hold for a while now, but I think its finally time to reinvestigate the idea.

There are actually two reasons for doing this… The first is rather obvious… there is still a vacuum of good parser libraries for Scheme atm. But the other might be less so (uh… unless you read the title)… Scheme Parser Combinators.

After investigating parser combinators for a little while now, a short while ago I had the good fortune to stumble on a post by Gilad Bracha about parser combinators (this time its in Smalltalk).

While not everyone reading this will be able to easily follow the Smalltalk code involved in the post, its possible one of the best descriptions of what a combinator parser is, and how they tend to get written.

My goal here is to replace the forreign syntax of OMeta with a Scheme native syntax using plain old functions and namespace bindings. And of course there’s the odd macro to help with readability (combinators can occasionally get a little out of hand without proper currying, or an unusually light weight syntax for writing lambdas/closures).

What this means is that my parsing solution will now look a bit more like a set of constructors nested between each other being assigned to a variable. Its not quite as pretty as the Haskell variation which uses the parsers directly (thanks to lazy evaluation and currying), but it does work in a reasonably straight forward way.

(define basic-char
  (λ (state)
    (aif rc eof-object? (read-char state)
         fail
         rc)))
 
(define (char c)
  (λ (state)
    (let ([rc (read-char state)])
      (cond ((eof-object? rc) fail)
            ((char=? rc c) rc)
            (else fail)))))

In the above snipped we define the two fundamental character parsers… They essentially define the naive variation of a character parser and a selective character parser.

What I want you to do is note the second parser’s definition. This parser is not technically a parser at all, and is what we generally refer to as a combinator. In reality its not really a combinator since it doesn’t accept a function as an argument… but it is higher order since it does return one, and that’s they key to making these things work.

All you need to drive a parser combinator is to be capable of returning functions until you have a function with all but one argument… plain and simple currying. Of course because we’re using Scheme and not Haskell, currying for us is not implicit, so rather than doing all of it by hand we’d rather just have combinators which have not been parameterized yet, and actual parsers which only accept a single state argument (a Scheme input port currently).

This does make some of the parser we might implement a little more complicated, since some of our functions will return a parser, and some of our functions will be the parser… but overall it tends to work quite effectively once you understand the difference between the two. It might also help to note that currying is used to prepare functions early, but rather it tends to be on the fly. Occasionally you may even return a parser immediately before running the very same parser (within the same top level expression).

Since I’m looking predominantly at backtracking parsing still lets now investigate a useful practical example to introduce some more combinators and the basic style that you use when writing parser with this stuff…

(define a-or-b
  (many1 (mor (char #\a)
              (char #\b))))

Ok then… what do we have here. (char #\a) and the equivalent for #\b are simple parsers which only accept the plain and simple input of their respective character, or they fail.

This might be a good point to explain what the fail value actually is… its not a value for a start. Actually its a clever little MzScheme macro (not R5RS unfortunately), which allows you to either write just fail and it will expand to (make-fail-type ()), or you could write (fail "Did not find character #\c!")… which is what we’ll be using when we cover error handling in a later post. Of course there’s also a function for testing it called fail?.

The mor here is of course the or branch (I may come up with a better name for this at some point), which in our case performs any stream rewinding necessary and checks each path until one of them succeeds or the entire construct fails. The parser returned by mor here will actually parse a single #\a or a single #\b or it will fail.

And finally the many1 accepts a parser which it will run over and over again collecting the results in to a list based accumulator. If it doesn’t see at least one result it will return a fail result, otherwise it will return the list of results. Interestingly this parser that many1 accepts must fail at least once before many1 itself will stop, which means that many1 must always rewind the stream at least a little before returning a successful result.

And of course we use the standard define to set a-or-b…

To finish up this post, our completed parser can be evaluated just like a hand-written parser in Scheme can…

(a-or-b (open-input-string "aabb"))

This will of course return (#\a #\a #\b #\b).

Of course this is really just a brief introduction on Parser Combinators and they’re basic operation… in the next post for MMeta I’ll be discussing some actual usage of the parser and of course, attached will be the actual source files for the parser ready to go.

– Lorenz

FoNC Wiki

If your’re interested in post-Smalltalk research, or more specifically the implementation of such technologies then you might be interested in keeping an eye out on the work being done by Ian Piumarta, et al.

Well, the good news is that a wiki has been setup for increased public communication about the project.

You can check it out over at…linky

–Lorenz

Cygwin…Mingw…’.a’…WTH!!?

Recently I decided to pick up an apparently fantastic book on Erlang, written by Joe Armstrong (Programming Erlang). Now personally, I’ve never been particularly impressed by the man himself…too much of that functional programmer arrogance if you ask me. For some reason however, the hype got the better of me, especially after stumbling on a conversation between Armstrong and a couple of other functional programming ‘nuts’…uh…enthusiasts.

Perhaps of interest, is that I’m also quite a fan of functional programming myself, but I’d rather think of myself as avoiding a particularly insidious case of accidental complexity rather than using a fundamentally better language. I should explain… While functional programming languages does reduce a lot of the complexity accrued due to side effects and shared memory; they don’t deal with some of the other almost equally dangerous forms of accidental complexity.

Note: The actual problem I have with these functional language guys is the assumption that it’s a natural way of thinking about problems. Its not…it’s just, plain and simple, another way of learning how to solve problems. And its generally not one of the easiest to learn (basic anyone?)

I’m talking about compatibility of course, which brings me to the point of this post.

Cygwin…Mingw…’.a’…WTH!!?

After learning Erlang, and believing (perhaps even jumping the gun slightly), that I would be up to a real challenge, I realized that one thing was missing. A decent flexible, portable, Erlang code only FFI interface.

My challenge…write a real-time, computer game using Erlang. The likely candidate would be a space combat sim, within the vain of a Freespace2 style game (graphics and gameplay wise…hopefully the story will actually give our hero an actual name this time:)

Unfortunately the existing FFI in Erlang is either to use a network port to a C program linked to the candidate library, or to use a “linked-in driver”, attached to the VM and talking the VM’s data-structures in order to facilitate communication. Damn.

Talking to VM’s in case you’ve never tried it (although that seems unlikely if your reading my blog ;), is always a major pita. You effectively skewered between two of the most awkward types of interfaces known to programming kind, and worse yet, your only weapon of choice is what C gives you…or worse…a code generator (SWIG be damned).

No… What I need is something like the Pythoneers, imo brilliant, library…CTypes. And for C interfacing, the ctypes library uses a C level library called libffi (sort of anyway).

While this libffi based library of mine will no doubt be the topic of a future post, it does bring up a rather gruesome topic that I’ve been avoiding for quite some type now…Cygwin and Mingw.

After many hours of searching the web I’ve finally found what appears to be the solution.

Cygwin, as we all know is just a *nix/POSIX subsystem running over the windows COFF binaries (afaik), which allows us to blend windows and bash tasks during our development process (like using Windows compatible GNU/Make makefiles).

Mingw is where I got confused. You can find it in the installer for Cygwin, which put me under the assumption that it was more or less the default for this stuff…it isn’t.

Mingw is simply a cross-compiling target for GCC, which targets COFF binaries and the win32 C library (I can’t remember which one…I believe its the one in C:\windows however).

This means that to compile a binary for windows…one that doesn’t depend on cygwin.dll, or infact any of the cygwin dependent libraries you simply need to tell GCC to use the Mingw target…uh…sure…that sounds easy.

Actually, it turns out that this process is so involved sometimes (cross-compiling) that the mingw guys (or is that the Cygwin guys?) decided to make it easy. All you have to do, as mere mortals, is pass –mno-cygwin to GCC (“CFLAGS=’-mno-cygwin’ ./configure;make), and that’s it!

Fantastic…now back to the real challenge…using libffi to create my “Erlang C Types” library…

– Lorenz

Re: The Obvious Truth; In Terms of PLT Scheme

A while back I posted The Obvious Truth, which described a philosophy I hold over reifying language level implementation details in truly General Purpose Languages. While Domain Specific Languages may have no requirement, since they will never be used to implement another programming language (on their own), or really anything, for that matter, beyond their domain.

The language implementation detail of choice I discussed was environments. Well, it turns out there are some languages that take my philosophy a little more seriously than I thought. PLT Scheme does in-fact provide programmer level interfaces for its environments and even its module system. Presumably the compiler or possibly other features use whenever working with program code directly.

For those who have found this problem to be a tricky one to work around, using those nefarious eval hacks, the keyword here is namespace. A common trend in language design is to combine the features of both namespaces and modules, since whenever one is needed the other generally is as well… or at least in the case of namespaces, a module system rarely makes it harder to work with, when the two are unified. Obviously the implementers behind PLT don’t believe agree (or didn’t).

Once you lookup the namespace libraries, all your crazy namespace mutation and inspection features are suddenly cast into view. I’m not sure yet if they will be enough to implement all of the features that I require, but they sure should make one hell of a lot of noise in the process.

For those who are wondering why the hell I’d want to incorporate so many side-effects into what is generally a much more powerful language in its functional style, the parser generator either must generate genuinely new functions and side-effect them into the namespace, or it must produce a static source file, and load/require that as a module.

Ironically, to load it as a module automatically the same interfaces are required. This is because a module cannot realistically be imported into a non-top-level namespace, or more practically it is almost always an error to do so, and there really are workarounds for the one or two times you’ll ever do it.

– Lorenz