Saturday, December 31, 2011

PL Predictions for 2012

Summary: another year of broken lexical scope and broken type systems.
  • There will be a new language with broken lexical scope (remember, it's a criterion for success).
  • A GWT-like, 80's style, object-oriented GUI framework will make Dart a major player in web apps.
Happy new year!

Wednesday, December 21, 2011

Call-by-push-value

Call-by-push-value has a seducing PR message: Science is reductionism. Once the fine structure has been exposed, why ignore it?

Unfortunately, I'm too unsophisticated to be able to gain much insight from the CBPV papers. There's an introduction to CBPV for more mortal-like folks. Other material is hard to find.

What caught my eye was the following code sample from Paul Blain Levy's (rhymes with "all plain Stevie") PhD thesis:

thunk(print "hello1"; λz. print "we just popped" z; ...)

So, apparently you can do stuff in a function before you're even taking arguments. Interested!

Maybe one of you can provide more information about this interesting new thing?!

Looking back at predictions for 2011

Let's take a look at my PL Predictions for 2011:

▶ JSON will be the default format for new internet APIs.

I'm not sure how to corroborate this one, but I think it's true. The most significant piece of evidence I have is that IBM even developed JSONx, a standard for representing JSON as XML.

▶ As more people use JSON, we'll see a XML renaissance, as we - for the first time - discover that XML actually gets some things right.

Didn't really happen, AFAICS. But there's this post: Why XML won’t die: XML vs. JSON for your API which says: "not everything in the world is a programming-language object". +1 to that.

▶ GHC will get typed type-level programming and end this highly embarrassing state of untypedness.


▶ We'll have a verified compiler-and-OS toolchain for compiling and running some kinds of apps. It won't be x86-based.

Unfortunately, I didn't really track this issue in 2011. There's SAFE, and maybe the O'Caml folks have something to show.

▶ All kinds of stuff targetting JavaScript.

Yeah, well, that was easy. My favorite is Emscripten.

▶ Split stacks and maybe some scheduler improvements will be shown competitive with green threads in the millions-of-threads!!1! (anti-)scenario.
Nobody did this AFAIK, but I still think it's true. Maybe next year.

▶ some insanely great new PL for 2011


So, although most of the predictions were rather cautious, I'm satisified with the outcome.

Monday, December 19, 2011

what distinguishes a good program from a bad one

Bugs and Battleships by Edward Z. Yang:
One might even say that the ability to generalize behavior of specific tests to the behavior of the program is precisely what distinguishes a good program from a bad one. A bad program is filled with many, many different cases, all of which must be tested individually in order to achieve assurance.
(HT @dyokomizo)

Tuesday, December 13, 2011

URLs

One of my all-time favorite quotes:
Uniform Resource Locator is just the result of squeezing the term object reference through the IETF standardization process.

Saturday, November 26, 2011

The extensible way of thinking

'Twas November, 30 years ago:

Aucbvax.5036
fa.editor-p
utzoo!decvax!ucbvax!editor-people
Mon Nov 9 17:17:40 1981

>From RMS@MIT-AI Mon Nov 9 16:43:23 1981
Union types do not suffice for writing MEMBER.

They may be appear to be sufficient, if you look at a single program rather than at changing the program. But MEMBER is supposed to work on any type which does or EVER WILL exist. It should not be necessary to alter the definition of MEMBER and 69 other functions, or even to recompile them, every time a new type is added.

The only way to avoid this is if the system allows new alternatives to be added to a previously defined union type. And this must have the effect of causing all variables, functions and structures previously declared using the union type to accept the new alternatives, without the user's having to know what variables, functions or structures they are. Then you can define a union type called EVERYTHING and add all new types to that union. But then the system should save you trouble and automatically add all new types to the union EVERYTHING. This is tantamount to permitting you not to declare the type.

External variables are vital for extensible systems. A demonstration of this is in my EMACS paper, AI memo 519a.

Beliefs such as that type declarations are always desirable, and that external variables should be replaced by explicit parameters, and that it is good to control where a function can be called from, are based on a static way of looking at a program. There is only one program being considered, and the goal is to express that one program in the clearest possible manner, making all errors (in other words, all changes!) as difficult as possible. Naturally, mechanisms that incorporate redundancy are used for this.

The extensible way of thinking says that we want to make it as easy as possible to get to as many different useful variations as possible. It is not one program that we want, but options on many programs. Taking this as a goal, one reaches very different conclusions about what is good in a programming language. It looks less like ADA and more like Lisp.

It is no coincidence that "computer scientists" have been using the static view in designing their languages. The research has been sponsored for the sake of military projects for which that viewpoint may be appropriate. But it is not appropriate for other things. An editor designed to be reliable enough to be "man-rated" is a waste of effort; and if this unnecessary feature interferes with extensibility, it is actually harmful.

HT Usenet historian Chris.

Friday, November 11, 2011

Happy Birthday Go



The Go Programming Language turns two.

I really want to like Go (Commander Pike is one of my heroes, and systems programming currently either means C or a Special Olympics scripting language). But I'm slightly put off by the lack of exceptions, and the seemingly gratuitous concepts offered instead. Does anyone have experience with how this aspect of Go pans out in real world use?

Thursday, November 10, 2011

Is JavaScript a Lisp in disguise?

It is something of a commonplace to say that HorrorJavaScript is a Lisp or Scheme in disguise.

But that's completely wrong!

Yeah, JavaScript gets closures right, and that has become the yardstick for measuring amateur-designed languages. If you do closures right, your language immediately enters the upper echelons of the language space. Sad, but true.

But in every other respect, JavaScript completely fails the Kool-Aid Lisp Test.


Rich arithmetic: NO
Multiple values: NO
Macros: NO
Closures: YES
Complex Lambda lists (optional, keyword, rest parameters): NO
Conditions, restarts: NO
Generic functions: MAYBE (through JS's hare-brained "object system")
Programmable parser: NO

But most of all:
THE HALLMARKS OF LISP HAVE ALWAYS BEEN EXTREME FLEXIBILITY AND EXTENSIBILITY, AND A MINIMUM OF NONSENSE THAT GETS IN YOUR WAY.*
JavaScript offers a maximum of nonsense, as amply documented. If you use Perl, you have 2 problems. If you use JavaScript, you have NaN problems.

If you say that JavaScript is a Lisp, you don't know jack about Lisp.

Wednesday, November 9, 2011

Linkdump

I'm working on a Ninjas/Nazis/Bavarian Illuminati/Aliens/Werewolves* movie, so I don't have as much time for PL geekery atm. Here are some interesting links I haven't checked out fully yet.

Nested Refinements: A Logic For Duck Typing (via @swannodette)
Programs written in dynamic languages make heavy use of features — run-time type tests, value-indexed dictionaries, polymorphism, and higher-order functions — that are beyond the reach of type systems that employ either purely syntactic or purely semantic reasoning. We present a core calculus, System D, that merges these two modes of reasoning into a single powerful mechanism of nested re- finement types wherein the typing relation is itself a predicate in the refinement logic. System D coordinates SMT-based logical implication and syntactic subtyping to automatically typecheck sophisticated dynamic language programs. By coupling nested refinements with McCarthy’s theory of finite maps, System D can precisely reason about the interaction of higher-order functions, polymorphism, and dictionaries. The addition of type predicates to the refinement logic creates a circularity that leads to unique technical challenges in the metatheory, which we solve with a novel stratification approach that we use to prove the soundness of System D.
(See also The Shadow Superpower: the $10 trillion global black market is the world's fastest growing economy: "System D is a slang phrase pirated from French-speaking Africa and the Caribbean. The French have a word that they often use to describe particularly effective and motivated people. They call them débrouillards. To say a man is a débrouillard is to tell people how resourceful and ingenious he is.")

Dart Talk @ Stanford (I think this is by Bracha. Haven't checked it out coz it's in some weird Windows format.)

Things SQL needs: MERGE JOIN that would seek (Nice idea about using histograms to speed up index scans. I believe the new App Engine query planner does that or something similar. Also Understanding Histograms in Oracle.)

I'm a big fan of the tagless-final approach! (This comment caught my interest, because it reminds me of Kernel: "everything in the language has to become both directly interpretable and available as an AST".)

New extension to Haskell -- data types and polymorphism at the type level (Ωmega-like.)

Evolution of DDD: CQRS and Event Sourcing (@dyokomizo hyped this weird-sounding stuff in response to Beating the CAP theorem).

(* As Rob Zombie says: "I feel that the Nazi movie is back. Once the trend comes back, the Nazi trends comes back, now you pervert it like crazy and you add werewolves. So, I don't know if we'll ever see that movie, but now is the time. The time is now to strike with this.")

Tuesday, November 8, 2011

Open, extensible composition models


Did VPRI just enter the FEXPR game?
The evaluator corrects several semantic inadequacies of LISP (described by Stoyan [5]) and is metacircular (written in the language it evaluates). The language provides:
  • lists and atomic values, including symbols
  • primitive functions called SUBRs
  • symbolic functions (closures) called EXPRs
  • a FIXED object that encapsulates another applicable value and prevents argument evaluation
  • predicates to discriminate between the above types
  • built-in SUBRs to access the contents of these values
  • a way to call the primitive behaviour of a SUBR
  • a quotation mechanism to prevent evaluation of literals
(my emphasis)
More later. HT @CraigOverend.

Saturday, October 29, 2011

They said it couldn’t be done, and he did it

The post on Dennis Ritchie by Herb Sutter is wonderful:
Before C, there was far more hardware diversity than we see in the industry today. Computers proudly sported not just deliciously different and offbeat instruction sets, but varied wildly in almost everything, right down to even things as fundamental as character bit widths (8 bits per byte doesn’t suit you? how about 9? or 7? or how about sometimes 6 and sometimes 12?) and memory addressing (don’t like 16-bit pointers? how about 18-bit pointers, and oh by the way those aren’t pointers to bytes, they’re pointers to words?).

There was no such thing as a general-purpose program that was both portable across a variety of hardware and also efficient enough to compete with custom code written for just that hardware. Fortran did okay for array-oriented number-crunching code, but nobody could do it for general-purpose code such as what you’d use to build just about anything down to, oh, say, an operating system.

So this young upstart whippersnapper comes along and decides to try to specify a language that will let people write programs that are: (a) high-level, with structures and functions; (b) portable to just about any kind of hardware; and (c) efficient on that hardware so that they’re competitive with handcrafted nonportable custom assembler code on that hardware. A high-level, portable, efficient systems programming language.

How silly. Everyone knew it couldn’t be done.

C is a poster child for why it’s essential to keep those people who know a thing can’t be done from bothering the people who are doing it. (And keep them out of the way while the same inventors, being anything but lazy and always in search of new problems to conquer, go on to use the world’s first portable and efficient programming language to build the world’s first portable operating system, not knowing that was impossible too.)

Thanks, Dennis.

Monday, October 24, 2011

Toolstrapping is everywhere - can you see it?

Jacques Mattheij introduces an evocative term: toolstrapping: "Whatever tool you are building, you need that tool to build the tool."

One of the reasons behind the continuing success of C seems to be that, by virtue of it being the standard programming language, there's no toolstrapping involved in using C (because it's already been done for you).

In the world of Scheme, we have the phenomenon that most Scheme implementations are written in Scheme, eternally perpetrating Scheme's sillier parts.

And people who say that plain text is somehow natural or native to computing simply ignore the toolstrapping involved in plain text: your quaint Unix is already toolstrapped up the wazoo for handling plain text.

The browser is a chance to raise the toolstrap bar: from plain text to hypermedia. The window of opportunity closes once we're able to run Emacs in X11 in the browser.

Thursday, October 20, 2011

It's called programming

People get annoyed when they must write two lines of code instead of one, but I don't. It's called programming.Rob Pike

Thursday, October 13, 2011

How to beat the CAP theorem

Nathan Marz has an interesting proposal, How to beat the CAP theorem, that should resonate with all you FP shills out there:
  • make all your data immutable
  • index snapshots of your data using "purely functional" batch computing (e.g. MapReduce)
  • index the realtime data arriving after the last snapshot using an incremental computing system
  • for querying, merge the index of the last batch job with that of the incremental system
As Marz explains, incremental computing has a much higher complexity than batch computing. Using it only for the last few hours of data removes a huge complexity burden from your shoulders:
Since the realtime layer only compensates for the last few hours of data, everything the realtime layer computes is eventually overridden by the batch layer. So if you make a mistake or something goes wrong in the realtime layer, the batch layer will correct it. All that complexity is transient.

Tuesday, October 11, 2011

Programming Language Checklist

Programming Language Checklist: You appear to be advocating a new programming language. Your language will not work. Here is why it will not work.

Thursday, September 29, 2011

Alan Kay on type systems

I'm not against types, but I don't know of any type systems that aren't a complete pain, so I still like dynamic typing.
(HT Paul Snively)

Monday, September 19, 2011

Delimited continuations papers 2

Had a conversation with a friend about delimited, composable continuations again. Previously, I've linked to some papers about delimited continuations. Here are two important ones:

Representing Monads: We show that any monad whose unit and extension operations are expressible as purely functional terms can be embedded in a call-by-value language with "composable continuations". As part of the development, we extend Meyer and Wand's characterization of the relationship between continuation-passing and direct style to one for continuation-passing vs. general "monadic" style. We further show that the composable continuations construct can itself be represented using ordinary, non-composable first-class continuations and a single piece of state. Thus, in the presence of two specific computational effects -- storage and escapes -- any expressible monadic structure (e.g., nondeterminism as represented by the list monad) can be added as a purely definitional extension, without requiring a reinterpretation of the whole language. The paper includes an implementation of the construction (in Standard ML with some New Jersey extensions) and several examples.

A monadic framework for delimited continuations: Delimited continuations are more expressive than traditional abortive continuations and they apparently seem to require a framework beyond traditional continuation-passing style (CPS). We show that this is not the case: standard CPS is sufficient to explain the common control operators for delimited continuations. We demonstrate this fact and present an implementation as a Scheme library. We then investigate a typed account of delimited continuations that makes explicit where control effects can occur. This results in a monadic framework for typed and encapsulated delimited continuations which we design and implement as a Haskell library.

I've blogged about the API and hair-raising example offered in this paper before.

Thursday, September 8, 2011

The Kernel Underground

A quick update on Kernel buzz:

There are now fourfivesixseven19 (I stopped counting) implementations of Kernel and Kernel-like languages:
Check them out for inspiration, and write your own. It's fun. Fexprs are fun exprs!

There are some interesting Kernel discussions going on in the Guile thread (and here) on LtU. David Barbour provides welcome criticism. (The truth can only be found in conflict. - Tarkovksy)

Oh, and I created this:

Wednesday, September 7, 2011

Syntax is a monad

I'm becoming aware of a whole Lambda-Meme-Universe! This one via Dorophone:

Monday, September 5, 2011

Oleg cat


(via Paul)

Thursday, September 1, 2011

Mortal combat

A slightly confused anonymous commenter (I get to say such things about anonymous commenters, right?) reminded me of this quote by Ray Dillinger:
The scheme community is now very invested in its macrology; they got there by long hard work and emotional processing and yelling and screaming and weeping and gnashing of teeth, and they still remember the pain of not having a standard macrology. You will not pry it away except from their cold dead fingers, and you will not redefine it without defeating them in mortal combat.
I think the commenter misunderstood my earlier post. Syntactic extension is absolutely necessary, and modern Scheme macro systems are a fine way to do it.

On macros

This post by David Barbour is one of the most insightful statements on macros I have read (along with Vladimir Sedach's analysis):
It has been my impression that, no matter what specific thing you use macros to do, you can later invent a neat language feature that handles the problem in a clean and nice manner. The problem is the "later". Macros can do it "now"... in an ugly, brutish manner, true, but the job gets done. ...
Even among people who find macros smelly, you'll find many who think macros the least distasteful of these choices.
John Shutt's response ties this in with fexprs - instead of delegating the job to a separate preprocessing step, let the language itself have the power of self-extension:
Macros are an unpleasant feature to compensate, somewhat, for limitations of a language. It's better to compensate for the limitations than to leave the limitations in place without compensating at all. Better still, though, would be to eliminate the offending limitations. (Remove the weaknesses and restrictions that make additional features appear necessary.) Make no mistake, fexprs don't necessarily increase the uniform flexibility of the language; but with care in the design they can do so, and thereby, the jobs otherwise done by macros are brought within the purview of the elegant language core. So instead of an ugly brutish fix with macros now and hope for a more elegant replacement later, one can produce a clean solution now with fexprs.
PS: This point is subtle. After all Scheme is already self-extensible. But Scheme macros still are an additional "layer" on top of, and separate from, the core language. Kernel's fexprs are not - they are the core language, thus bringing self-extension within the purview of the elegant language core.

Sunday, August 28, 2011

I endorse this


Thursday, August 25, 2011

SRII 2011 - Keynote Talk by Alan Kay

Another awesome talk by Alan "I invented the term" Kay, after the recent Programming and Scaling.
"The internet is practically the only real object-oriented system on the planet." (~30:00)

SRII 2011 - Keynote Talk by Alan Kay - President, Viewpoints Research Institute from SRii GLOBAL CONFERENCE 2011 on Vimeo.

Also, the notion of reinventing the flat tire can't get enough airtime.

Zippers

I really need to understand Zippers. This seems like a good article. In the meantime, I'll meditate on the following diagram.

Tuesday, August 23, 2011

Totally awesome: FOSS Patents: Samsung cites "2001" movie as prior art against iPad patent

Attached hereto as Exhibit D is a true and correct copy of a still image taken from Stanley Kubrick's 1968 film "2001: A Space Odyssey." In a clip from that film lasting about one minute, two astronauts are eating and at the same time using personal tablet computers. The clip can be downloaded online at http://www.youtube.com/watch?v=JQ8pQVDyaLo. As with the design claimed by the D’889 Patent, the tablet disclosed in the clip has an overall rectangular shape with a dominant display screen, narrow borders, a predominately flat front surface, a flat back surface (which is evident because the tablets are lying flat on the table's surface), and a thin form factor.


Sunday, August 21, 2011

Software is a branch of movies

A post by HXA about the ideology of software engineering prompted me to re-read an old Ted Nelson essay. Some choice quotes:
"Virtual" does not mean, as many think, three-dimensional. It is the opposite of *real*.

Virtuality therefore means the *apparent structure of things*

Engineers generally deal with constructing reality. Movie-makers and software designers deal with constructing virtuality.

Software is a branch of movies.

Movies enact events on a screen that affect the heart and mind of a viewer. Software enacts events on a screen which affect the heart and mind of a user, AND INTERACT.

Saturday, August 20, 2011

Praising Kernel

I love Kernel! Let's face it: While Paul Graham has reanimated Lisp, Lisp has become stuck in a rut. CL hasn't changed since '94, and it probably won't, and heck, that's even a good thing - CL is the mud ball of strength, the acting patriarch of the Lisp family. There's a better chance of CL being around in 50 years than most other programming languages. The same for Scheme, although it's in a bit of an identity crisis right now. But what's crazy about Scheme is that it has become focused on ahead-of-time compilation, in a world where people run Linux in JavaScript JIT compilers. One of the purposes of Scheme should be to push compiler writers to be smarter. Let's forget all that static, performance-oriented stuff, let's make it as dynamic and general and simple as it can get, and spend the following decades on amazing new tech to make it fast! That's what Scheme's about, right? And let's not talk about Clojure, they don't even "see the need" for EVAL [in ClojureScript]. So much for the state of Lisp. Can you see it? We need to put the fun, nasty fun, back into Lisp. Lisp has been resting on its well-deserved laurels for much too long. Heck, even Java is getting lambdas now, and there's talk of macros in JavaScript. Lisp's purpose in the programming language galaxy is to assist our most gifted fellow humans in thinking previously impossible thoughts, remember? Do you really think current Lisps are up to that task?

So where does Kernel fit in? In short, I think Kernel is as close as you can get to the essence, the rhyming scheme of Lisp. Whereas the R6RS doesn't even mention interactive development, Kernel is squarely interactive. And it provides crystal clear semantics to go with it. LETREC blackhole? Hopeless toplevel? Not in Kernel. First-class environments solve these issues. Look, Kernel has a single form, $vau, that is as powerful as lambda, define-syntax, and let-syntax combined! (Strike that, it's even more powerful - you can use Kernel's macros, fexprs, just like ordinary functions.) Need I say more? Kernel can truly bootstrap itself from its 3 (or less) built-in forms (counting them is not easy), without the need for a separate expansion process, and without the need for a hygiene system - Kernel is simply inherently hygienic (when used right). Another form, $define!, not only takes on the roles of define and set!, but also doubles, nay, triples, nay, n-tuples, as a primitive for arbitrary namespace manipulation - you can build any module system you like on top of this single primitive! Think about it? Can your language do that? Fat chance! But should your Lisp be able to do that? I'd say. Now, you say, but no one has written real applications in Kernel yet. I say: that's what's so cool about it! It's a brave new world, without trodden paths, full of new discoveries in semantics, implementation technologies, and other things we can't even see yet. In the past 50 years, Lisp has succeeded in bringing object-orientation, dynamism, functional programming, and other niceties to the philistines. Soon, they'll even have macros and EVAL. Now it's time for Lisp to move another 50 years forward. That's what Lisp is for! And as far as I can see, Kernel is the vehicle for that. So fire up your printers, print out John Shutt's amazing works Fexprs as the basis of Lisp function application; or, $vau: the ultimate abstraction and the Revised -1 Report on the Kernel Programming Language, bind them, study them, and start hacking on Lisp's future! It's easy (nah, OK, it's hard) and it starts with you!

Friday, August 19, 2011

Virtua subjotting

I'm keeping a quite detailed log of Virtua on the new Subjot service: http://subjot.com/manuel/virtua.

(Subjot is an interesting new microblogging service, where every user has multiple streams, and you can choose which to follow. Here's an invite. I have no affiliation with Subjot.)

The Virtua programming language

Virtua will be my next programming language, with the following core features:
Virtua will be written in JavaScript. The goal is to make Virtua a good command language for interactive, extensible, browser-based applications, like an Emacs for hypermedia.

If you're interested to work with me on Virtua, do get in touch!

Thursday, August 18, 2011

Notes on delimited continuations

Delimited continuations are one of those topics where you need to spend months or maybe even years in the cloud of unknowing, until it suddenly makes click. At least, that's this autodidact's experience.

Delimited continuations are awesome, because they let us abstract over control flow - as I like to put it one man's control flow is another man's data, or as Chung-chieh Shan puts it: delimited continuations liberate control flow.

Continuations as we know them (i.e. those created by call/cc) are undelimited, they represent the whole rest of the computation. Delimited continuations represent the rest of a sub-computation up to a given prompt. They come from investigating REPLs: when you capture an undelimited continuation, the continuation includes the REPL's control flow! Generally, you don't want that, so with delimited continuations, the REPL can install a prompt before executing code entered by the user, and then you can capture a delimited continuation just up to that prompt, excluding the REPL's code.

My favorite description of delimited continuations is offered right at the start of A Monadic Framework for Delimited Continuations, even though the rest of the paper is quite over my head. Their API for delimited continuations is:
  • newPrompt --- creates a fresh prompt, distinct from all other prompts. It's just an object that we use as an identifier, basically. Some systems simply use symbols for prompts.
  • pushPrompt p e --- pushes prompt p on the stack, and executes expression e in this new context. This delimits the stack, so we can later capture a delimited continuation up to this part of the stack.
  • withSubCont p f --- aborts (unwinds the stack) up to and including the prompt p, and calls the function f with a single argument k representing the delimited continuation from the call to withSubCont up to but not including the prompt. This captures a delimited continuation, analogous to how call/cc captures an undelimited continuation.
  • pushSubCont k e --- pushes the delimited continuation k on the stack, and executes expression e in this new context. This composes the stack of k with the current stack, analogous to how calling the function representing a continuation in Scheme swaps the current stack with the stack of that continuation.
Now to the example! Delimited continuation examples are always hair-raising!! Brace yourself!!!
So, the first thing we do is call newPrompt (at the bottom) to obtain a fresh prompt, and then we pipe that into the function (at the top), so that we have a name, p, for the prompt. Inside that function we have the addition of 2 to a second expression, that pushPrompt's p on the stack, i.e. delimits the stack at this point. The if is called inside the new context delimited by p. Now it gets hairy: The if's test expression is a withSubCont, so it immediately aborts up to the outer pushPrompt and then calls the function that is its second argument with k bound to the rest of the computation between the withSubCont and the pushPrompt. What's the rest of that computation?? Well, by the magic of delimited control, k is now equivalent to the function λb. if b then 3 else 4. This is what I meant when I said that delimited continuations allow us to abstract over control flow - we have just turned an arbitrary, dynamic control flow sequence into a function. (Well, it's not really a function, but we can use it like one.) OK, so we have aborted back up to pushPrompt, and in k we have the rest of the computation as a function. What do we do with it? Well, we call it using pushSubCont once with False and once with True as argument, and we add the two results, giving us 7. Note that we just called the delimited continuation twice! We can not only abstract over arbitrary control flows, we can also call them as many times as we like. Now the whole shebang inside pushPrompt is over, and we return to the addition of 2 to our result of 7, giving us 9.

A little Scheme interpreter

I've rewritten the Scheme interpreter from Kent Dybvig's dissertation Three Implementation Models for Scheme in JavaScript: Schampignon.

I think this is really a nice way to learn about implementing tail-calls and first-class continuations.

Schampignon is undocumented, because it's really just a rewrite of the Scheme code in section 3.4 of the dissertation, which explains things nicely. (Still, it took me a couple of days to figure out how it works.)

My next steps will be to add delimited continuations and fexprs to the code.

Going co-nuclear

I was discussing implementation inheritance on Twitter (no really!), when it dawned on me that my discussion partner was from a different "paradigm cult", and no agreement was possible.

John Shutt describes it nicely in his post Memetic Organisms:
The meme set carried by the [memetic] organism includes a mass of theories, some of which contradict each other. At the center of this mass is a nucleus of theories that are supposed to be believed (part of what Kuhn called a paradigm). Surrounding the nucleus are theories that are meant to be contrasted with the paradigm and rejected, together with memes about how to conduct the contrast; one might call this surrounding material the co-nucleus.
For the Clojurian I was talking to, implementation inheritance is co-nuclear. There's probably no way for him to accept that it has valid uses.

Sunday, August 7, 2011

Intellectual badass Sunday reading

"Nerd? We prefer the term intellectual badass." – Unknown
AMD Fusion Developer Summit Keynotes, about AMD's upcoming "Cray on a Chip". Basically, they'll make the parallel processors (formerly known as GPUs) coherent with the brawny sequential CPUs, allowing you to "switch the compute" between sequential and parallel, while operating on the same data (e.g. fill an array sequentially, then switch to parallel processing to crunch the data, all in the same program - and memory space).

[racket] design and use of continuation barriers, by Matthew Flatt. "Use a continuation barrier when calling unknown code and when it's too painful to contemplate multiple instantiations of the continuation leading to the call."

Monads and Effects is a bad marriage, by David Barbour (who is probably really a front for an army of intellectual badasses). "Monads capture a lot of effects by default: ordering/time, identity, state, commitment, and unbounded time/space resource consumption."

Tutorial: Metacompilers Part 1, about META II, the one Alan Kay and the other VPRI folks always talk about. "You won't really find metacompilers like META II in compiler textbooks as they are primarily concerned with slaying the dragons of the 1960s using 1970s formal theory."

Microsoft and the Bavarian Illuminati: "In the Object Linking and Embedding 2.0 Programmer's Reference there is a very curious term. On page 78, the second paragraph starts with the sentence, 'In the aggregation model, this internal communication is achieved through coordination with a special instance of IUnknown interface known as the /controlling unknown/ of the aggregate.'"

Saturday, August 6, 2011

Linux kernel word cloud


(click to enlarge; via @hexmanshu)

Sunday, July 31, 2011

The meaning of forms in Lisp and Kernel

There are so many nice things to be learned from Lisp - and Kernel.

Take this:

A Lisp REPL reads forms input by the user.

But Lisp is never content with a form alone: it always wants to evaluate the form, determine its meaning.

For instance, when you input a symbol form, Lisp will not just return the symbol object - it will return the value of the variable binding referenced by the symbol - which is the symbol's meaning, per Lisp's rules of form evaluation.

No wonder then that one of the central features of Lisp is quotation - telling Lisp not to determine the meaning of a form presented to it, but rather give us the form itself.

So what's a form? A form is "any object meant to be evaluated", in other words, we may treat any object as a form, as something to evaluate, as something to determine the meaning of.

Lisp's evaluation rules do not prescribe in any way that forms are bound to text-based representations - a form is any first-class object meant to be evaluated. (And in fact, Racket, a leading Lisp implementation already supports images as forms.)

So Lisp is chiefly concerned with determining the meaning of forms read from the user.

Macros then allow us to extend the variety of forms recognized by a Lisp, beyond the ways that mere function calls can.

Where in a function call the meaning of every operand is established automatically before the function is actually called, macros have total freedom in evaluating - determining the meaning of - operands, or not evaluating them at all (as a short-circuiting AND operator may do, for example).

But Common Lisp and Scheme macros, by their preprocessing nature (a macro call works at a time conceptually different from runtime, translating source expressions to other expressions, to be evaluated at a later time), cannot be used in a first-class fashion. For example, it is not possible to apply the AND macro to a list of dynamically computed values (without resorting to EVAL), like we could with a function.

In a sense, macros of the preprocessing kind restrict our ability of determining the meaning of forms: With preprocessing macros, we may only determine the meaning of forms available at preprocessing time.

This where Kernel's fexprs come in. Fexprs drop the requirement of macros to be expandable in a separate preprocessing step. Thus, we may apply the AND fexpr to a list of dynamically computed operands, for example.

But fexprs go beyond being merely a more convenient replacement for macros. As John Shutt shows, a new lexically-scoped formulation of fexprs - as presented in Kernel - is even more fundamental to Lisp than lambda. In Kernel, lambda is defined in terms of vau, the ultimate abstraction:
($define! $lambda
($vau (formals . body) env
(wrap (eval (list* $vau formals #ignore body)
env))))
Through the power of fexprs, Kernel needs only three built-in operators (define, vau, and if) as well as some built-in functions. This is vastly simpler than any definition of Scheme! Scheme built-ins such as lambda, set! (!), define-syntax and others fall directly out of fexprs, for free, radically simplifying the implementation. But I'm rambling.

In essence, fexprs liberate Lisp's ability to determine the meaning of forms.

It's getting midnight (on a Sunday!), so I can't do fexprs or my excitement about them justice. But I hope this post has made you think about forms and their meanings, and to check out Kernel and share the excitement about the brave new Lisp world of fexprs.

The War Between Developers, Designers, & Project Managers




Previously:


Some history

Reading the interesting Portable and high-level access to the stack with Continuation Marks, I found the following quote about procedure calls (my emphasis):
This [not properly tail-calling] behavior of the SECD machine, and more broadly of many compilers and evaluators, is principally a consequence of the early evolution of computers and computer languages; procedure calls, and particularly recursive procedure calls, were added to many computer languages long after constructs like branches, jumps, and assignment. They were thought to be expensive, inefficient, and esoteric.

Guy Steele outlined this history and disrupted the canard of expensive procedure calls, showing how inefficient procedure call mechanisms could be replaced with simple “JUMP” instructions, making the space complexity of recursive procedures equivalent to that of any other looping construct.
Then it gets highly ironic, with a quote by Alan Kay:
Though OOP came from many motivations, two were central. ... [T]he small scale one was to find a more flexible version of assignment, and then to try to eliminate it altogether.

Saturday, July 30, 2011

Some nice paperz on delimited continuations and first-class macros

My current obsessions are delimited continuations (see my intro post) and first-class macros (fexprs). Here are some nice papers related to these topics:

Three Implementation Models for Scheme is the dissertation of Kent Dybvig (of Chez Scheme fame). It contains a very nice abstract machine for (learning about) implementing a Scheme with first-class continuations. I'm currently trying to grok and implement this model, and then extend it to delimited continuations.

A Monadic Framework for Delimited Continuations contains probably the most succinct description of delimited continuations (on page 3), along with a typically hair-raising example of their use.

Adding Delimited and Composable Control to a Production Programming Environment. One of the authors is Matthew Flatt, so you know what to expect - a tour de force. The paper is about how Racket implements delimited control and integrates it with all the other features of Racket (dynamic-wind, exceptions, ...). Apropos, compared to Racket, Common Lisp is a lightweight language.

Subcontinuations. This is an early paper on delimited continuations. It also describes control filters, a low level facility on top of which dynamic-wind can be implemented. Control filters are also mentioned - in passing - as a nice tool in the Racket paper (above).

Fexprs as the basis of Lisp function application or $vau: the ultimate abstraction and the Revised-1 Report on the Kernel Programming Language. These two are probably the two papers I would take to the desert island at the moment. I'm only a long-time apprentice of programming languages, but I know genius when I see it.

If you know of any related nice papers, let me know!


On Twitter, Paul Snively mentioned Oleg's paper Delimited Control in OCaml, Abstractly and Concretely, but probably only because he's mentioned in the Acknowlegdements - kidding! It's a great paper, and I'm studying it too, but unfortunately it's not really applicable to implementing delimited control in JavaScript, which is what I'm trying to do.

Thursday, July 28, 2011

Thursday, July 21, 2011

On the absence of EVAL in ClojureScript

[I got bored with the original text of this post. In short: the authors of ClojureScript have taken the pragmatic approach of reusing Clojure-on-the-JVM to generate JavaScript. This means there can be no EVAL in ClojureScript. I think that's sad because Lisp is one of the finest application extension languages, and without EVAL you can't do a really extensible app. Apparently, the ClojureScript authors view the browser-side as something of an adjunct to server apps. In my view, all the action of the future will be in the browser, and not on the server.]

Thursday, July 14, 2011

Frankenprogramming

David Barbour throws down the gauntlet to John Shutt in an already interesting and entertaining ("about as coherent as a 90 day weather forecast") LtU thread:
John Shutt would like to break the chains of semantics, e.g. using fexprs to reach under-the-hood to wrangle and mutate the vital organs of a previously meaningful subprogram. Such a technique is workable, but I believe it leads to monolithic, tightly coupled applications that are not harmonious with nature. I have just now found a portmanteau that properly conveys my opinion of the subject: frankenprogramming.
I'm looking forward to John's reply. In the meantime, I have to say that I like to view this in a more relaxed way. As Ehud Lamm said:
Strict abstraction boundaries are too limiting in practice. The good news is that one man's abstraction breaking is another's language feature.
Previously, regarding fexprs: John Shutt's blog

Sunday, July 10, 2011

Atwood's Law

Any application that can be written in JavaScript, will eventually be written in JavaScript.

Friday, July 8, 2011

it is nice to have the power of all that useful but obsolete software out there

Subject: Re: Switching tasks and context
∂08-Jul-81 0232 Nowicki at PARC-MAXC Re: Switching tasks and context
Date: 6 Jul 1981 10:32 PDT
From: Nowicki at PARC-MAXC
In-reply-to: JWALKER's message of Wednesday, 1 July 1981 11:52-EDT
To: JWALKER at BBNA
cc: WorkS at AI

[...]
The important thing is that such a tool WORKS, and I find incredibly useful. The virtual terminal uses ANSI standard escape sequences, so we can use all the tools that have been around for a long time (like the SAIL Display Service, Emacs under TOPS-20 or Unix, etc.) with NO modifications. We are planning to use this to develop a more integrated system, but it is nice to have the power of all that useful but obsolete software out there.

(via Chris, who's scouring Unix history for us)

Wednesday, June 29, 2011

PL nerds: learn cryptography instead

I've never studied cryptography. So when the paper Tahoe – The Least-Authority Filesystem fell in my lap, I was perplexed.

Using just two primitives, secret-key and public-key cryptography, they build an amazing solution to information storage. (Try to understand the two main diagrams in the paper. It's not hard, and it's amazing.)

Before encountering Tahoe-LAFS, cryptography was just a way to keep stuff secret to me. But cryptography also provides us with tools to design programs that wouldn't be possible otherwise.

Monday, June 27, 2011

State of Code interview

Zef Hemel has interviewed me for his weblog, State of Code: Lisp: The Programmable Programming Language. I don't feel qualified to be the "Lisp guy", but as you know, I never leave out an opportunity to blather about PLs.

Zef (@zef) is an academic developing mobl, a language for mobile apps, and is now working for a web-based IDE company, so the State of Code is a place to watch.

Wednesday, June 22, 2011

John Shutt's blog

John Shutt now has a blog: Structural insight.

John is pushing the envelope of Lisp with his Kernel programming language, a "dialect of Lisp in which everything is a first-class object". And when he says everything, he means it.

Not content with just first-class functions and continuations, John also wants first-class environments and first-class macros. Some fun LtU discussions in which John participated come to mind:

Thursday, June 16, 2011

David Barbour's soft realtime model

David Barbour is to me one of the most important thinkers on distributed programming languages and systems.

He frequently posts on LtU and on Ward's wiki.

(And he now has a blog!)

In this LtU post he describes a very interesting soft realtime programming model, reminiscent of Croquet/TeaTime:

I'm also using vat semantics inspired from E for my Reactive Demand Programming model, with great success.

I'm using a temporal vat model, which has a lot of nice properties. In my model:

  • Each vat has a logical time (getTime).
  • Vats may schedule events for future times (atTime, atTPlus).
  • Multiple events can be scheduled within an instant (eventually).
  • Vats are loosely synchronized. Each vat sets a 'maximum drift' from the lead vat.
  • No vat advances past a shared clock time (typically, wall-clock). This allows for soft real-time programming.

This model is designed for scalable, soft real-time programming. The constraints on vat progress give me an implicit real-time scheduler (albeit, without hard guarantees), while allowing a little drift between threads (e.g. 10 milliseconds) can achieve me an acceptable level of parallelism on a multi-core machine.

Further, timing between vats can be deterministic if we introduce explicit delays based on the maximum drift (i.e. send a message of the form 'doSomething `atTime` T' where T is the sum of getTime and getMaxDrift.


Previously: Why are objects so unintuitive

Sunday, June 12, 2011

Everything in JavaScript, JavaScript in Everything

We can now run Linux, LLVM bitcode, and a lot of other languages in JavaScript.

And we can run JavaScript in everything - browsers, servers, toasters, Lisp.

Like IP in networking, JavaScript is becoming the waist in the hourglass.

I don't know what it means yet, but it sure is exciting.

Friday, June 10, 2011

No No No No

// Note: The following is all standard-conforming C++, this is not a hypothetical language extension.
 assert( top( o-------o
|L \
| L \
| o-------o
| ! !
! ! !
o | !
L | !
L| !
o-------o ) == ( o-------o
| !
! !
o-------o ) );

Multi-Dimensional Analog Literals

Tuesday, June 7, 2011

WADLER'S LAW OF LANGUAGE DESIGN

In any language design, the total time spent discussing
a feature in this list is proportional to two raised to
the power of its position.

0. Semantics
1. Syntax
2. Lexical syntax
3. Lexical syntax of comments

(That is, twice as much time is spent discussing syntax
than semantics, twice as much time is spent discussing
lexical syntax than syntax, and twice as much time is
spent discussing syntax of comments than lexical syntax.)
(source, via Debasish Ghosh)

Saturday, May 28, 2011

Extreme software

Some programs are very successful with taking a paradigm to an extreme.

Emacs tries to represent every user interface as text. Even video editing.

Plan 9 tries to use filesystem trees as APIs for all applications and OS services.


It's interesting to abstract insights gained from extreme programs, and modify them.

We can ask: if Emacs is successful using plain text as user interface, what if we use hypertext?

Or: if Plan 9 is successful in using filesystem trees as APIs, what if we use activity streams?

Ouch

Few companies that installed computers to reduce the employment of clerks have realized their expectations.... They now need more, and more expensive clerks even though they call them "operators" or "programmers."
— Peter F. Drucker

Wednesday, May 18, 2011

TermKit

I happen to think that pushing beyond plain text is one of the most important tasks for programmers today, or as Conor McBride put it:
The real modern question for programmers is what we can do, given that we actually have computers. Editors as flexible paper won't cut it.
So of course I like TermKit:
It brings back memories of Apple Dylan:


It's simple: replace plain text files with collections, and lines with objects, and you get Good Things for free!

For example: Scott McKay describes the DEUCE editor:
The editor for FunO's Dylan product -- Deuce -- is the next generation of Zwei in many ways. It has first class polymorphic lines, first class BPs [buffer pointers], and introduces the idea first class "source containers" and "source sections". A buffer is then dynamically composed of "section nodes". This extra generality costs in space (it takes about 2 bytes of storage for every byte in a source file, whereas gnuemacs and the LW editor takes about 1 byte), and it costs a little in performance, but in return it's much easier to build some cool features:

- Multiple fonts and colors fall right out (it took me about 1 day to get this working, and most of the work for fonts was because FunO Dylan doesn't have built-in support for "rich characters", so I had to roll my own).

- Graphics display falls right out (e.g., the display of a buffer can show lines that separate sections, and there is a column of icons that show where breakpoints are set, where there are compiler warnings, etc. Doing both these things took less than 1 day, but a comparable feature in Zwei took a week. I wonder how long it took to do the icons in Lucid's C/C++ environment, whose name I can't recall.)

- "Composite buffers" (buffers built by generating functions such as "callers of 'foo'" or "subclasses of 'window') fall right out of this design, and again, it took less than a day to do this. It took a very talented hacker more than a month to build a comparable (but non-extensible) version in Zwei for an in-house VC system, and it never really worked right.

Of course, the Deuce design was driven by knowing about the sorts of things that gnuemacs and Zwei didn't get right (*). It's so much easier to stand on other people shoulders...

Saturday, May 14, 2011

Secret toplevel

A somewhat arcane aspect of hygienic macro systems is the notion that macro-introduced toplevel identifiers are secret.

An example of this in EdgeLisp:

The macro FOO expands to code that defines a variable X using DEFVAR, and prints its value:

(defmacro foo () #'(progn (defvar x 1) (print x)))

(#' is EdgeLisp's code quotation operator.)

Using the macro has the expected effect:

(foo)
1

But X isn't actually a global variable now:

x
; Condition: The variable x is unbound.
; Restarts:
; 1: #[handler [use-value]]
; 2: #[handler [retry-repl-request]]
; 3: #[handler [abort]]

The name X only makes sense inside the expansion of (foo). Outside, it gets transparently renamed by the hygienic macro system.

(In EdgeLisp, a UUID is attached as color (or hygiene context) to the identifier.)

Why the secrecy of toplevel variables? Well, it's simply an extension of the notion that identifiers that introduce new variable bindings (such as those in a LET) need to be treated specially to prevent unhygienic name clashes between macro- and user-written code. This secrecy completely frees macro writers from having to care about the identifiers they choose.

(Discussion of this topic on the R7RS list: Are generated toplevel definitions secret?)

Wednesday, May 11, 2011

The why of macros

Good analysis by Vladimir Sedach:
The entire point of programming is automation. The question that immediately comes to mind after you learn this fact is - why not program a computer to program itself? Macros are a simple mechanism for generating code, in other words, automating programming. [...]

This is also the reason why functional programming languages ignore macros. The people behind them are not interested in programming automation. [Milner] created ML to help automate proofs. The Haskell gang is primarily interested in advancing applied type theory. [...]

Adding macros to ML will have no impact on its usefulness for building theorem provers. You can't make APL or Matlab better languages for working with arrays by adding macros. But as soon as you need to express new domain concepts in a language that does not natively support them, macros become essential to maintaining good, concise code. This IMO is the largest missing piece in most projects based around domain-driven design.

Tuesday, May 10, 2011

Hygiene in EdgeLisp

EdgeLisp is chugging along nicely. I'll soon do a proper release.

I just implemented a variant of SRFI 72, that is a hygienic defmacro. (I wrote two articles about SRFI 72: part 1 and part 2).
(defmacro swap (x y)
#`(let ((tmp ,x))
(setq ,x ,y)
(setq ,y tmp)))
nil
(defvar x 1)
1
(defvar tmp 2)
2
(swap x tmp)
1
x
2
tmp
1
(swap tmp x)
1
x
1
tmp
2

Sunday, May 8, 2011

Compiling, linking, and loading in EdgeLisp

EdgeLisp now has first-class object files (called FASL, for Fast-Loadable) and a primitive linker:

Let's compile a FASL from a piece of code:

(defvar fasl (compile #'(print "foo")))
#[fasl [{"execute":"((typeof _lisp_function_print !== \"undefined\" ? _lisp_function_print : lisp_undefined_identifier(\"print\", \"function\", undefined))(null, \"foo\"))"}]]

(Note that EdgeLisp uses #' and #` for code quotation.)

The FASL contains the compiled JavaScript code of the expression (print "foo").

We can load that FASL, which prints foo.

(load fasl)
foo
nil

Let's create a second FASL:

(defvar fasl-2 (compile #'(+ 1 2)))
#[fasl [{"execute":"((typeof _lisp_function_P !== \"undefined\" ? _lisp_function_P : lisp_undefined_identifier(\"+\", \"function\", undefined))(null, (lisp_number(\"+1\")), (lisp_number(\"+2\"))))"}]]

(load fasl-2)
3

Using link we can concatenate the FASLs, combining their effects:

(defvar linked-fasl (link fasl fasl-2))
(load linked-fasl)
foo
3

Compile-time effects

Fasls keep separate the runtime and the compile-time effects of code.

For example, a macro definition returns nil at runtime, and does its work at compile-time (edited for readability):

(compile #'(defmacro foo () #'bar))
#[fasl [

execute:
(typeof _lisp_variable_nil !== "undefined" ? _lisp_variable_nil : lisp_undefined_identifier("nil", "variable", undefined))

compile:
((typeof _lisp_function_Nset_macro_function !== "undefined" ? _lisp_function_Nset_macro_function : lisp_undefined_identifier("%set-macro-function", "function", undefined))(null, "foo", (function(_key_, _lisp_variable_NNform){ lisp_arity_min_max(arguments.length, 2, 2); return (((typeof _lisp_function_Ncompound_apply !== "undefined" ? _lisp_function_Ncompound_apply : lisp_undefined_identifier("%compound-apply", "function", undefined))(null, (function(_key_){ lisp_arity_min_max(arguments.length, 1, 1); return (((new Lisp_identifier_form("bar")))); }), ((typeof _lisp_function_Ncompound_slice !== "undefined" ? _lisp_function_Ncompound_slice : lisp_undefined_identifier("%compound-slice", "function", undefined))(null, (typeof _lisp_variable_NNform !== "undefined" ? _lisp_variable_NNform : lisp_undefined_identifier("%%form", "variable", undefined)), (lisp_number("+1"))))))); }))) ]]

Thursday, May 5, 2011

Land of Lisp - The Music Video!



Minimal and sleek / but still so clever you'll freak.

Via Book Review: Land of LISP, by Conrad Barski via @zooko.

Wednesday, May 4, 2011

Unlimited number of runtimes

Because Lisp macros are written in Lisp, there's a runtime at compile-time. (See this previous post for more.)

Some Lisp compilers produce two separate object files for a .lisp file: A FASL file that contains the runtime effects, and a CFASL that contains the compile-time effects (such as macro definitions).

But why stop at two object files? A single file could in fact produce any number of runtimes (FASLs).

One example where this would make sense is documentation: Imagine a DEFDOC macro (for documenting variables), whose effects take place at documentation-time:

(defvar x 1)
(defdoc x "A cool variable.")

DEFDOC registers the documentation string "A cool variable." with X in some table, so that it can be looked up.

The FASL would contain (defvar x 1), and the DFASL would contain (defdoc x "A cool variable.").

Now it's up to the programmer to decide when and if to load the DFASL: in the development environment, one would always load documentation-time, but for a packaged application maybe not. There one would only ship the FASLs, not the CFASLs and DFASLs (unless the application is intended to be programmed by users).

Friday, April 29, 2011

on quotation

Some über-obvious notes:

Lisp literals (booleans, numbers, strings, ...) are sometimes called self-evaluating or self-quoting objects.

This is important because it brings to mind that other objects don't evaluate to themselves: a symbol evaluates to the value of the variable it names, and a list evaluates to the result of a call.

To actually get hands on a symbol or list, we need to quote it. Literals need no quoting, they're self-quoting.

While unquoted X looks quite similar to quoted 'X, they're utterly different. X stands for read a variable from memory, while 'X stands for construct a symbol object with the name "X" (leaving interning aside).

And that's really the use-mention distinction.

Thursday, April 28, 2011

A new Cambrian explosion

Three areas that seem to be exploding are new programming languages, new databases, and new content-centric networking protocols.

But I think all of these developments will be dwarfed by the emergence of the browser as the unified display substrate and JavaScript as the new instruction set.

The last time we had a similar event was the introduction of the GUI in the 80's/90's: millions of programmers scrambled to write every imaginable application for this new system

But the GUI explosion will be peanuts compared to the browser+JS explosion! Not only are there far more programmers today, they're also all joined by the internet now. And there's ubiquitous view source. Need I say more??

Millions of programmers are scrambling right now to write every imaginable application for the browser+JS.

If you ain't emittin' HTML and JS, you're gonna miss out on this once-in-a-geological-era event!

R7RS discussion warming up

I would still appreciate pointers as to how to implement this ;-) Anyone can implement Scheme, but only half a dozen people have implemented separate compilation with hygienically introduced toplevel bindings.Andy Wingo of Guile

For example, '幺' (U+5e7a) has Numeric_Type = 'Numeric', since the character means small or young, so it can sometimes mean 1 in some specific context (for Japanese, probably the only place it means '1' is in some Mah-jong terms.) So, when I'm scanning a string and found that char-numeric? returns #t for a character, and that character happens to '幺' (U+5e7a), and then what I do? It is probably a part of other word so I should treat it as an alphabetic character. And even if I want to make use of it, I need a separate database to look up to know what number '幺' is representing.Shiro Kawai of Gauche

Taylor Campbell's blag

Taylor Campbell's blag contains many insightful posts related to programming languages, Scheme in particular.

For example, he clears up my confusion about UNWIND-PROTECT vs Continuations:
The difference [to DYNAMIC-WIND] is in the intent implied by the use
of UNWIND-PROTECT that if control cannot re-enter the protected
extent, the protector can be immediately applied when control exits
the protected extent even if no garbage collection is about
to run finalizers.

Note that one cannot in general reconcile

(1) needing to release the resource immediately after control exits
the extent that uses it, and

(2) enjoying the benefits of powerful control abstractions such as
inversion of control.

However, if (1) is relaxed to

(1') needing to release the resource immediately after control
/returns normally from/ the extent that uses it,

then one can reconcile (1') and (2) by writing

(CALL-WITH-VALUES (LAMBDA () <resource-usage>)
(LAMBDA RESULTS
<resource-release>
(APPLY VALUES RESULTS))), or simply

(BEGIN0 <resource-usage> <resource-release>)

using the common name BEGIN0 for this idiom. (In Common Lisp, this
is called PROG1.)
(And Dorai Sitaram says: UNWIND-PROTECT "cannot have a canonical, once-and-for-all specification in Scheme, making it important to allow for multiple library solutions".)

Wednesday, April 27, 2011

What's a condition system and why do you want one?

This post attempts to explain how Lisp condition systems surpass ordinary exception handling. Condition systems don't unwind the stack by default and thus allow computations to be restarted, which is a useful tool.

Exception handling
try {
throw new Exception();
catch (Exception e) {
... // handler code
}
Everybody knows what this exception handling code in Java does: the THROW searches for a CATCH clause (a handler) that catches subclasses of Exception, unwinds the stack, and calls the handler code with E bound to the exception object.

At the moment the exception is thrown, the stack looks like this:
  • ... // outside stack
  • TRY/CATCH(Exception e)
  • ... // middle stack
  • THROW new Exception()
There's an outside stack that doesn't concern us. The middle stack is the stack between the TRY/CATCH and the THROW, which is actually empty in this example, but usually contains a whole lotta function calling going on.

Before the handler is called, the stack is unwound:
  • ... // outside stack
  • TRY/CATCH(Exception e)
  • ... // middle stack
  • THROW new Exception()
When the handler is called, the stack looks like this:
  • ... // outside stack
  • TRY/CATCH(Exception e)
  • ... // handler code
Condition systems

Condition systems in the Lisp family are based on the fundamental insight that calling a handler can be decoupled from unwinding the stack.

Imagine the following:
try {
throw new Exception();
} handle(Exception e) {
... // handler code
}
We've added a new keyword to Java, HANDLE. HANDLE is just like CATCH, except that the stack is not unwound when an Exception is thrown.

With this new keyword, the stack looks like this when the handler is called:
  • ... // outside stack
  • TRY/HANDLE(Exception e)
  • ... // middle stack
  • THROW new Exception()
  • ... // handler code
The handler runs inside the THROW statement (Common Lisp would say, during the dynamic extent of the THROW).

For many exceptions it makes sense to simply unwind the stack, like ordinary exception handling does. But for some exceptions, we gain a lot of power from the non-unwinding way condition systems enable.

Restarts

One of the most interesting aspects of not automatically unwinding the stack when an exception occurs is that the we can restart the computation that raised an exception.

Let's imagine two primitive API functions: GETVAL and SETVAL read and write a VAL variable.
Object val = null;
Object getVal() { return val; }
void setVal(Object newVal) { val = newVal; }
Now we want to add the contract that GETVAL should never return null. When VAL is NULL, and GETVAL is called then an exception is raised:
Object getVal() {
if (val == null) throw new NoValException();
else return val;
}
A user of GETVAL may install a handler like this:
try {
getVal();
} handle (NoValException e) { // note use of HANDLE, not CATCH
... // handler code
}
When GETVAL() is called and VAL is null, an exception is thrown, our handler gets called, and the stack looks like this:
  • ... // outside stack
  • TRY/HANDLE(NoValException e)
  • getVal()
  • THROW new NoValException()
  • ... // handler code
As you can see, our handler for NoValException runs, and the exception "isn't over yet", because the stack hasn't been unwound.

Thanks to the non-unwinding nature of condition systems, an application may decide to simply use a default value, when GETVAL is called and VAL is null.

We do this using restarts, which are simply an idiomatic use of non-unwinding exceptions:(1)

We rewrite GETVAL to provide a restart for using a value:
Object getVal() {
try {
if (val == null) throw new NoValException();
else return val;
} catch (UseValRestart r) {
return r.getVal();
}
}
GETVAL can be restarted by throwing a USEVALRESTART whose value will be returned by GETVAL. (Note that we use CATCH and not HANDLE to install the handler for the restart.)

In the application:
try {
getVal();
} handle (NoValException e) {
throw new UseValRestart("the default value");
}
(The USEVALRESTART is simply a condition/exception that can be constructed with a value as argument, and offers a single method GETVAL to read that value.)

Now, when GETVAL() is called and VAL is null, the stack looks like this:
  • ... // outside stack
  • TRY/HANDLE(NoValException e)
  • getVal()
  • TRY/CATCH(UseValRestart r) [1]
  • THROW new NoValException()
  • THROW new UseValRestart("the default value") [2]
The restart at [2] bubbles up, and is returned by the TRY/CATCH for the restart at [1], which means that GETVAL returns "the default value":
  • ... // outside stack
  • TRY/HANDLE(NoValException e)
  • getVal()
  • TRY/CATCH(UseValRestart r)
  • THROW new NoValException()
  • THROW new UseValRestart("the default value")
  • return r.getVal(); // "the default value"

A simple extension would be to offer a restart for letting the user interactively choose a value:
Object getVal() {
try {
if (val == null) throw new NoValException();
else return val;
} catch (UseValRestart r) {
return r.getVal();
} catch (LetUserChooseValRestart r) {
return showValDialog();
}
}
This assumes a function SHOWVALDIALOG that interactively asks the user for a value, and returns that value. The application can now decide to use a default value or let the user choose a value.

Summary

The ability to restart a computation is gained by decoupling calling a handler from unwinding the stack.

We have introduced a new keyword HANDLE, that's like CATCH, but doesn't unwind the stack (HANDLE and CATCH are analogous, but not equal, to Common Lisp's handler-bind and handler-case, respectively).

HANDLE is used to act from inside the THROW statement. We have used this to implement restarts, a stylized way to use exceptions.

I hope this post makes clear the difference between ordinary exception handling, and Lisp condition systems, all of which feature restarts in one form or another.

Further reading:
Footnotes:

(1) This is inspired by Dylan. Common Lisp actually treats conditions and restarts separately.