Wednesday, June 6, 2012

My Mental Evolution In Making A Language

Over the past year I've been thinking about how to make my own language. It's a pretty big undertaking and I've been too busy with other projects to put any serious effort into it, so I just think about it when I go for a walk. Mostly I try to convince myself to not make a language but it sounds like a lot of fun. With the explosion of JavaScript, and so few people wanting to write JavaScript, it seems like other people think writing a language sounds like fun too. We have CoffeeScript, IcedCoffeeScript, Roy, JSX, Amber, Dart, and many others. And that's just a list of recent languages that compile to JavaScript. Many more languages have come out, relatively recently, such as Go, Fancy, Elixir and Loop. Most of these languages will be minor dents in the history of programming languages, but that is OK. Not everything has to be important to be worth doing. But they have gotten me thinking. Should I create a language? What would it have to offer? Is the effort worth it? Below are thoughts that have steered my decision process for when I get some time to start hacking on a language. The thoughts are targeted at me, someone who has little experience building a language, not a professional.

Why?

I think there are three reasons I should consider making a language:

  • Just for fun - Every hobby project should be fun! If I built a language just for fun, though, I think I would prefer to implement someone else's language. Depending on which language I chose, I would learn a lot about language design without having to make a lot of mistakes myself; I could learn from others' mistakes. I would also have other implementations to compare against my own.
  • Experiment with semantics - This is the biggest reason for me. I have some unoriginal semantic ideas I'd like to understand better, and I think implementing them is the best way to go about that.
  • Experiment with syntax - C'mon now, there is no reason to experiment with syntax any more. God already gave it to us. But seriously, I find this the least compelling reason. I haven't seen a recent language that does something with syntax I consider all that important. I think it will take a very clever person to change syntax enough to really matter. Removing semicolons doesn't really matter much to me. I can represent the semantics I'm interested in just fine in an existing syntax.

Can I stand on the shoulders of giants?

What languages already exist that have most of the semantics I care about? If one does, I can just extend it with the new semantics I care about. The clear upside is that, even if the community for that language is small, its users already know it, so they only have to learn the extensions I add. That makes evaluating the new ideas easier. There are also a lot of language design decisions that I'm likely to mess up or just not be interested in. I might as well go with whatever the professionals went with unless I have a strong opinion. Objective-C, Alice ML and Vala are examples of what I mean. In each case, the language either took an existing language and extended it or was heavily inspired by an existing language.

The other side of this is deciding what backend to use. Should I build an interpreter or should I target a VM, like the JVM? Maybe JavaScript? Or should I target native binaries? Maybe produce C. Or just write LLVM-IR myself? Or build the optimizer and backend myself? It depends on the real reason I want to make the language. Do I want people to use it? Or do I just want to learn the entire stack of a compiler? If I implement an existing language, for example SML, maybe the twist I could add is having it target LLVM-IR. Just because I'd be implementing someone else's language doesn't mean there isn't any room for some innovation.
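To make the backend question a bit more concrete, here is a minimal sketch in Python, assuming a made-up toy language of integer arithmetic. The names (Num, BinOp, interpret, emit_c, compile_to_c) are hypothetical and only for illustration. It shows the same tiny syntax tree handled by two backends: one interprets it directly, the other produces C source that a C compiler could turn into a native binary.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical toy AST for integer arithmetic; not taken from any real language.
@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str  # one of "+", "-", "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Num, BinOp]
SUPPORTED_OPS = ("+", "-", "*")


def interpret(e: Expr) -> int:
    """Backend 1: walk the tree and compute the result directly."""
    if isinstance(e, Num):
        return e.value
    if e.op not in SUPPORTED_OPS:
        raise ValueError(f"unknown operator: {e.op}")
    left, right = interpret(e.left), interpret(e.right)
    if e.op == "+":
        return left + right
    if e.op == "-":
        return left - right
    return left * right


def emit_c(e: Expr) -> str:
    """Backend 2: translate the same tree into a C expression."""
    if isinstance(e, Num):
        return str(e.value)
    if e.op not in SUPPORTED_OPS:
        raise ValueError(f"unknown operator: {e.op}")
    # Always parenthesize so C's precedence rules never change the meaning.
    return f"({emit_c(e.left)} {e.op} {emit_c(e.right)})"


def compile_to_c(e: Expr) -> str:
    """Wrap the generated expression in a complete C program."""
    return (
        "#include <stdio.h>\n\n"
        "int main(void) {\n"
        f'    printf("%d\\n", {emit_c(e)});\n'
        "    return 0;\n"
        "}\n"
    )


# The expression 1 + 2 * 3, built by hand because this sketch has no parser.
program = BinOp("+", Num(1), BinOp("*", Num(2), Num(3)))

print(interpret(program))     # evaluate immediately
print(compile_to_c(program))  # or hand this text to a C compiler
```

The front end (the tree) is shared, and only the last step differs. That is roughly why the choice of backend can be made, or changed, somewhat independently of the language's semantics.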

Disirregardlessly, a lot of smart people have thought very hard about building languages. I should take knowledge from them whenever possible (so basically, always). If I think I'm being original in an idea, I'm probably not. ALGOL probably has it... it always does. I do not mean that in a defeatist way; new combinations of old ideas are progress. What I mean is that a lot of these ideas have already been thought through, that we often don't know about them because they failed, and that as a creator I should be aware of that.

Sometimes it's better to stay silent

Anders Hejlsberg was interviewed about the lack of checked exceptions in C# and said:

"I'm a strong believer that if you don't have anything right to say, or anything that moves the art forward, then you'd better just be completely silent and neutral, as opposed to trying to lay out a framework."

I like this idea. Language design is often about trade-offs. If I am going to introduce a new concept because I don't like an old one, I had better be sure it doesn't just swap one set of problems for another. At the very least, I've created a new piece of complexity that people who want to evaluate my language have to learn, and if it doesn't move the language forward then there isn't any benefit to that complexity.

Keep It Simple

I think we should all take a lesson from Niklaus Wirth. When he was designing Oberon he took Modula-2 and greatly simplified it. The syntax is minimal and the language definition is tiny. Maybe it's too small, I don't know. But the language is very easy to think about and implement because of its minimalism. Complexity is a burden. It's really hard to avoid complexity, too. Just look at the results of some trivial operators in JavaScript. JavaScript is simple in many ways, but those ways interact to produce complex results. These small interactions, harmless at first, cause painful bugs later on. On top of that, too much complexity makes for a more cumbersome implementation, which takes longer to build. A simple language can get me something to play with sooner.

REPL

I don't actually mean "should my language have a REPL" here (one would be nice). I mean that every few weeks I take this list of questions, integrate it with what I've learned, and evaluate whether my previous conclusions still apply. Since I'm so new to designing a language, the results tend to change. I think anyone interested in making their own language should do this. I've come up with ideas I thought were new and cool, only to do some research and find out they are silly and absurd. Unfortunately, that's how I feel when I see a lot of these new toy languages being created. With a little bit of research the author could have made something much more impressive and easier to use. But then, I also feel a similar way about C++. I think a lot of people believe you can just cobble together a language and you'll get something great. It's hard to design a good, consistent, simple, extensible language. Language design is a profession and should be respected as such.
