Create your own programming language for great justice

In an earlier post I wrote my wish-list for the perfect programming language (for me). I believe most programmers have a perfect programming language in mind. It usually goes something like this: “I really like the object model of language W and the functional aspects of language X and the syntax of language Y. It would be great if it ran on Z runtime / virtual machine, too.” Then there are the language zealots who say that language Foo is perfect, or at least better than all the other commonly used languages, and that with such a large user base and library of reusable code, even if they could fix the bits of the language they don’t like, it wouldn’t be worth it because it would break existing code. In a sense, they’ve settled, put their tent pegs in the ground and encourage others to set up camp where they are, so they can all share code. There is certainly a lot of value in spending a lot of time in one language so you can go deep with it and develop a strong set of idioms and patterns. It also leads to a large number of very exclusive camps. To be fair, many programmers dabble in multiple languages and paradigms, but most do not.

I am often offended by programming language zealots. Just like religious zealots, they are myopic, self-important and insulting towards outsiders. I love much of the writing of Paul Graham, but I find his statements about Lisp, and about everything that is not Lisp, to be difficult and often insulting. In his article “Beating the Averages,” Graham talks about a hypothetical language Blub that a programmer gets attached to. This programmer sees less powerful programming languages for what they are, but doesn’t understand the abstractions and power of better programming languages. The programmer thinks he is using the best programming language, but he isn’t. He isn’t using the best programming language, because… he isn’t using Lisp. Period. Graham states explicitly, “Lisp is so great not because of some magic quality visible only to devotees, but because it is simply the most powerful language available. And the reason everyone doesn’t use it is that programming languages are not merely technologies, but habits of mind as well, and nothing changes slower.” Again, I like Paul Graham’s writings for the most part, but saying that he is correct about this because he seems to be correct about so many other things would be as much a fallacy as saying that he is wrong because everything else he says is wrong.

I agree with many points in Graham’s article, more or less, but I don’t think that most programmers are small-minded Blub programmers. I agree that Lisp macros are powerful, and that writing programs that write programs is a powerful concept, but Lisp is hardly the only language that does that. Boo, a language that actually fulfills my wish-list, has macros which are very similar in functionality to Lisp macros. You get access to the actual parse tree and can create, change, or augment language features in the language itself. It is a powerful feature, but it is not the most powerful feature, at least not for me. It’s not that I have a small mind, either. I understand that Lisp macros are awesome, but they are not an abstraction I need in day-to-day programming to get things done. I’m a big fan of getting things done, and building systems to help get them done faster and better. Graham argues that he used Lisp with lots of macros in a business that did well, with software that delivered functionality not present in the competitors’ products, and that this was all because of Lisp macros, which comprised at least 20% of his product’s code. I’m sure there was a correlation between having features and being successful, but the 20% macros figure may just indicate that he had to add a lot of things to Lisp to be able to build the software he wanted. It could also indicate that he knows how to write macros. Just because the competitors did not deliver the same features does not mean that they couldn’t because they used some less powerful programming language. The software he wrote could likely have been written in any number of languages by any number of programmers. The programmers who toil away in so-called Blub languages are oftentimes quite good at what they do, and can deliver features if they can think them up and are allowed to add them. Look at Facebook: they rock the world with the power of PHP, a pretty awful language by the estimation of many. I just doubt that Lisp was the reason for Paul Graham’s success.

I said all that to say this: Lisp worship aside, it is a good idea for programmers to learn things about programming languages that they don’t understand. It is healthy to stretch your mind with different perspectives. I believe that it is a universally accepted truth that looking at a problem from different mental perspectives causes us to have breakthroughs of new ideas and understanding. One excellent way to expand your mind as a programmer is to create your own programming language–not just a theoretical grammar, but an actual working compiler and/or interpreter.

I have to admit, I’ve been a little lackadaisical regarding hacking code in my free time. I think I haven’t been able to get excited about a project. I’ve announced, prematurely, some projects in the past that have since been abandoned. Perhaps it is that I spend so much time at work “getting things done” that I relish the freedom to not get things done outside work. It’s true, I like messing around with different programming languages and not committing to projects. However, since the programming language wish-list post, I’ve been drawn to my newest project over and over. I am working on a new programming language. It is really in the very earliest of toy language stages right now. I am really just learning things as I go. I am learning how to use flex and bison, the descendants of lex and yacc, what an abstract syntax tree is, and many other things. I’ve read a lot of articles about different aspects of compiler / interpreter design and techniques. Some things have inspired me a great deal, like V8, the JavaScript engine that Google created. I’ve been interested and inspired by byte-code interpreters, JIT compilation, garbage collection, stackless interpreters, coroutines, closures, and many other things.

As you can imagine, I have a laundry list of things I would like to implement in my very own programming language, but this project is mostly about learning, not the end product. I know most folks with a BSCS have taken a course on compilers and have written a compiler of some form or other. With all the instruction on compilers that has taken place, why do we settle for such mediocre languages? Python 3 was a very small step forward from Python 2.x. It was such a small step that I wonder what the point was. They cleaned up the syntax and some of the libraries a little bit, but for a backwards-compatibility-breaking change, they sure didn’t do much to fix the inadequacies of the language. The “hot” programming languages these days are incredibly old, at least in internet years. Ruby is 15 years old. Python is 19 years old. Why, after developing a science around compilers, do we still use languages that require semicolons on nearly every line? If Ruby and Python are each at least a decade and a half old, why do so many companies write code in C#, Java, PHP, and so many other languages that have vestigial syntax from C? Perhaps it is the slowness of businesses. I don’t think so. I think it is laziness.

Software developers have the ability to write their own compilers. Yet, they do not. It is intellectual laziness. We are blacksmiths. We can forge our own tools, yet we use the crappy ones handed down to us from old. There are established techniques for creating new tools, yet we forge on with the tools given to us. For shame. What if every developer wrote their own programming language? Sure, most would fall by the wayside, and we need developers to write libraries, too, and to research other concerns like concurrency that don’t necessarily need to be solved at the language level. There are millions of software developers in the world. Python is dead. If you have the chance, fire up the Python interpreter some time and type in “import this” and hit enter. You will discover a creed, a philosophy that underpins many of the decisions surrounding the creation and maintenance of Python. It is a philosophy I disagree with on many points. Not only that, but the Python language fails to live up to a good number of them. “In the face of ambiguity, refuse the temptation to guess.” Python does not enforce types (the function annotations that Python 3 accepts are optional and are not checked by the interpreter), so everything is guesswork. “foo = bar” could mean a lot of things: it could be rebinding an existing name “foo” or it could be creating a new one. “bar” could be anything: a number, a string, an object. Looking at Python code is often a lot of guesswork. Duck typing is the only typing the interpreter applies. When you create a function, you name the inputs, but nothing holds a caller to any particular type. It seems to fail here. Here is another: “There should be one-- and preferably only one --obvious way to do it.” That line gets a lot of deserved criticism. It’s not even true. Python is a very powerful language that offers a very large number of ways to solve any problem, and usually several of them could be considered obvious, depending on the style of code you normally write. Regardless, even if you could write a language that only had one obvious way to solve each problem, you would have a very poor language. “If the implementation is hard to explain, it’s a bad idea.” CPython has the GIL, which is both hard to explain and a very bad idea.
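
To make the duck-typing complaint concrete, here is a minimal Python sketch. The names in it (total_length, foo, bar) are made up for illustration. It shows a function whose parameter is only named, never typed, accepting several unrelated kinds of input, and an assignment that simply binds one name to whatever object another name refers to.

```python
def total_length(items):
    # Only the parameter name is declared. Nothing says what `items`
    # must contain; anything that supports len() will be accepted.
    return sum(len(item) for item in items)


# A list of strings works...
words = total_length(["ab", "cde"])

# ...and so does a mix of a tuple, a dict, and a string.
mixed = total_length([(1, 2), {"k": 1}, "xyz"])

foo = 1
bar = "text"
foo = bar  # binds the name foo to the same object that bar refers to

same = foo is bar  # both names now point at one object
```

Reading only the call sites, there is no way to tell what total_length expects; you have to read its body, which is the guesswork being complained about.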

I encourage every developer to write their own programming language. It will break us out of the complacency that allows us to toil day-in and day-out using the same leaky, broken abstractions. We can write our own tools. I’ve seen the HTTP 1.1 RFC; it isn’t that complicated, and I think the tools we use now to implement server-side code for web development are quite broken. PHP is just awful. Python is a little better, as is Ruby. V8 / JavaScript / Node.js has some promise, but JavaScript is a pretty broken language, too. It’s got that semicolon problem, plus weird object / array / function rules that lead to ambiguity and confusion. At the end of the day, you can’t have everything in one language, though they are certainly trying to do that with Perl 6. We can have better choices, or at least break through the complacency.

I don’t want to sound completely critical. Change is happening. People who are stuck with certain runtimes, like the JVM or .NET, have created or ported better languages for the platform. There is quite a bit of excitement about new compiler tools, runtimes, front-ends, back-ends, etc. The LLVM project has gained a lot of attention lately, in part because its Clang front-end became self-hosting.

My own project is just barely getting going. I have the code constructing a complete AST, and I am in the process of writing the part that generates byte-code. I’ve learned so much already about grammars, tree manipulation, and parsing. I have ideas about register allocation and other things. It is quite the experience to get my hands dirty writing C code again. I am taking advantage of tools I didn’t know about in the past, like the Boehm garbage collector. I’ve spent too long in garbage collected languages to have to worry about freeing every bit of memory by hand now. That and generous use of typedefs make the experience not entirely unpleasant.
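
For readers who have not built one, here is a rough sketch of the step from AST to byte-code. It is not the code from the project described above, which is written in C with a flex / bison front-end. It is Python for brevity, and every name in it (Num, BinOp, compile_node, run, the opcode names) is invented for illustration. It also assumes a simple stack machine; a register-based design, which is what register allocation implies, would look different.

```python
import operator
from dataclasses import dataclass


@dataclass
class Num:
    """AST leaf: a numeric literal."""
    value: int


@dataclass
class BinOp:
    """AST node: a binary operation on two sub-trees."""
    op: str
    left: object
    right: object


# Hypothetical opcode names for a toy stack machine.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
HANDLERS = {
    "ADD": operator.add,
    "SUB": operator.sub,
    "MUL": operator.mul,
    "DIV": operator.truediv,
}


def compile_node(node, code):
    """Walk the tree post-order, appending (opcode, argument) pairs."""
    if isinstance(node, Num):
        code.append(("PUSH", node.value))
    elif isinstance(node, BinOp):
        compile_node(node.left, code)   # operands first...
        compile_node(node.right, code)
        code.append((OPCODES[node.op], None))  # ...then the operator
    else:
        raise TypeError(f"unknown AST node: {node!r}")
    return code


def run(code):
    """Execute the byte-code on a stack and return the final value."""
    stack = []
    for opcode, arg in code:
        if opcode == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(HANDLERS[opcode](left, right))
    return stack.pop()


# The tree for 1 + 2 * 3, as a parser might build it.
tree = BinOp("+", Num(1), BinOp("*", Num(2), Num(3)))
bytecode = compile_node(tree, [])
result = run(bytecode)
```

The post-order walk is the whole trick: operands are emitted before their operator, so by the time the interpreter reaches an operator, its inputs are already on the stack.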
