Sunday, January 7, 2024

Common Lisp's BLOCK / RETURN-FROM and UNWIND-PROTECT

I was just chatting with Ben Titzer on Twitter about control flow in his Virgil language (which is cool and you should definitely check out), when I felt the need to once more promote how Common Lisp does non-local control flow (stuff like returning early from a function or breaking from a loop), because I think it's a very nice solution.

So in Common Lisp we have BLOCK / RETURN-FROM (which work as a pair) and UNWIND-PROTECT.

BLOCK / RETURN-FROM

BLOCK and RETURN-FROM effectively offer the same functionality as C's setjmp/longjmp -- non-local exits -- but nicely wrapped as we expect it in a lexically-scoped, expression-oriented language.

BLOCK / RETURN-FROM lets you do:

  • Early returns from functions or arbitrary code blocks
  • Breaking from loops, including any number of nested loops
  • Continuing in loops, including any number of nested loops
  • Even arbitrary GOTOs in a code block (with some macrology & trampolining, see Baker's TAGBODY)

(block name forms*) lexically binds name within the forms as a non-local exit from which you can return a value with (return-from name value). Just (return-from name) without a value uses nil as the value.

A BLOCK without any RETURN-FROM just returns the last value: 

(block b 1 2 3) returns 3.

This prints 1 and returns 2:

(block b (print 1) (return-from b 2) (print 3))

You can have any number of nested blocks:

(block b1 ... (block b2 ... (return-from b1) ...) ...)

To do an early return from a function, place a block at its beginning:

(defun foo ()

    (block b 

        ...

        (return-from b)

        ...))

(Common Lisp automatically places an anonymous block around every function body, so you don't need to do this in practice, but my hobby Lisp doesn't, and I'm using this explicit approach, and I like it.)

To break from a loop, place a block around it:

(block break

    (loop

        ...

        (return-from break)

        ...))

To continue in a loop, place a block inside it:

(loop

    (block continue

        ...

        (return-from continue)

        ...))

You can have multiple nested loops, like in Java:

(block break-outer

    (loop

        (block break-inner

            (loop

                ...

                (return-from break-inner)

                ...))))

UNWIND-PROTECT

UNWIND-PROTECT is effectively a try/finally block (without a catch).

(unwind-protect protected-form cleanup-forms*) evaluates the protected form, and regardless of whether it returns normally or does a non-local exit with RETURN-FROM, the cleanup forms are evaluated.

(unwind-protect (foo)
   (bar)
   (quux))

is analogous to

try {
   return foo();
} finally {
   bar();
   quux();
}

Both of the following expressions print 2 and return 1:

(unwind-protect 1
   (print 2))

(block exit
   (unwind-protect (return-from exit 1)
      (print 2)))

Conclusion

Common Lisp's BLOCK / RETURN-FROM and UNWIND-PROTECT offer a minimalistic and expressive system for non-local control flow.

Tuesday, July 12, 2016

Peeking into the future with RDP

One aspect of reactive demand programming that sets it apart from other reactive programming models is its support for optimistically working with the predicted future states of signals.

Think about a signal that carries the current time in seconds, that you want to display on screen as a clock. The screen should update close to the full second.

Let's say your clock display uses complicated drawing routines that take a while to display. So if you start drawing at the full second, your drawing will always lag behind the full second.

But why should this be? The clock signal already knows that it will change to the next full second at a particular time in the future, and can communicate this to the display code.

In RDP, we can view a value not as a point, but as an arrow in time, that may point into the future. Consider a mouse: during mouse moves it is often possible to give a good prediction of where the mouse will move in the next instant. With RDP it's possible for the mouse signal to communicate this information to its clients.

Given that the clock display code can peek into the clock signal's future, it can perform intensive processing at leisure before the actual switch to the next full second (e.g. by drawing into an offscreen buffer and copying it to the screen buffer at the full second).

Predictions are often wrong, so clients always need to have a fallback (e.g. throwing away the prepared drawing in the offscreen buffer and drawing from scratch, accepting a laggy display). But the fact that RDP signals can inform clients about future states enables a wider spectrum of behaviors that were previously impossible to achieve.

Thursday, May 19, 2016

The worm is the spice

Just had a nice insight regarding systems that deal with reactively updating (potentially large, structured) values such as RDP:

The dynamic output of a process is indistinguishable from a static storage resource (file).

In batch systems like Unix, this symmetry is not so deep: the output stream of a process looks somewhat like a file, but the process can't make any changes to earlier, already emitted portions of the stream. This is unlike a file, which can be edited anywhere.

In a reactive system, a process output is truly indistinguishable from a file. The process may decide to update any part of the emitted value at any time, just like a user editing a storage resource.

Apart from being a nice symmetry in itself, I think this also offers new possibilities for user experience. Processes and files can be used interchangeably in the UI. A file can be viewed as a (rather boring) process, whose output gets edited by the user (similar to data in pi calculus).

Sunday, August 2, 2015

RESTful RDP with big values

What if you want to use a big value, like a whole database table or weblog, as a Reactive Demand Programming signal value? This would make it possible to use RDP to orchestrate things like incremental MapReduce pipelines. Here's one weird trick to make it work.

In effect, each RDP signal becomes a RESTful server, speaking an HTTP-like protocol. Clients of a signal remember the ETag of the last version of the signal they've processed, and send it to the server on subsequent requests.

The protocol to retrieve the value of a signal may work like this:
  • If a client hasn't seen a previous version of the signal, it sends no ETag. The server replies with the complete value. The value may be split up into multiple pages, using something like an Atom paged feed. (Other non-sequential kinds of splits are possible: for example, tree-like data like a filesystem can be fetched in parallel using hierarchical splits.)
  • If a client has seen and processed a previous version of the signal it sends the stored ETag. There are three possibilities:
    • The content hasn't changed (i.e. the server's ETag matches the client-sent ETag), so the server replies with a status code that indicates that there is no new content (HTTP's 304 Not Modified).
    • The content has changed, so the server replies with a diff of the client's version versus the server's version.
    • The content has changed, but the server is unable to provide a diff against the client's version. This can happen for servers that do not keep a complete history, or also if the server determines that it's more efficient to have the client retrieve the whole value again instead of sending a diff. The client has to re-fetch the whole signal value, as if it had seen no previous version of the value.
I haven't worked out all the details, but I think this scheme could be made to work.

Monday, July 20, 2015

What I learned about Urbit so far

[Updated, see comment and this Reddit AMA with the Urbit team, that clarifies a lot of issues.]

Urbit is some kind of new operating system design thingy, that is kinda hard to categorize.

Some interesting design points are:
  • Urbit restricts the number of identities in the system to 232. This means Urbit doesn't have enough identities even for currently living humans. In line with the usual obfuscation going on in Urbit, such an identity is called destroyer.
  • Urbit is programmed in a weird programming language called Hoon. Hoon's (only) definition is a 10KLOC file hoon.hoon, written in itself. It uses practically random variable names (such as nail for "parse input", or vase for a "type-value pair"), not to speak of the "nearly 100 ASCII digraph 'runes'". The author acknowledges that the parser is "very intricate".
  • Efficiency-wise, Hoon is impractical as a programming language, so in the real world, the VM will recognize Hoon fragments, and replace them with C code (called jets).
This brings us to the question: why would anybody actually design an OS like that? The best explanation I've seen so far is by Kragen, who writes:

Monday, June 15, 2015

A trivial implementation of Reactive Demand Programming

I wrote a trivial implementation of RDP in JavaScript to help me understand how it works.

It's called bucky-rdp (about 200 lines of heavily commented code).

It currently supports Sirea's bconst, bpipe, and bfmap.

Here's an example:

// Create some behaviors for transforming numbers.
var bDouble = rdp.bFMap(function(val) { return val * 2; });
var bMinusOne = rdp.bFMap(function(val) { return val - 1; });

// Create a pipeline behavior of the two behaviors
var myBehavior = rdp.bPipe(bDouble, bMinusOne);

// Apply an inactive input signal to the pipeline behavior
var sigIn = rdp.makeSignal();
var sigOut = rdp.apply(myBehavior, sigIn);

// Change the input signal value and watch the output signal change
sigIn.setValue(2);
console.log(sigOut.getValue()); // Prints 3
sigIn.setValue(4);
console.log(sigOut.getValue()); // Prints 7
sigIn.deactivate();


(This post refers to v1.0.1)

Tuesday, June 9, 2015

Grokking Reactive Demand Programming

TL;DR: RDP is an exciting declarative model of how computational processes (behaviors) are connected by continuously updating values (signals) to effect changes on storage and external state (resources).

I've come a bit closer to understanding David Barbour's Reactive Demand Programming model, and this has confirmed my previous hunch that RDP is one of the most interesting systems designs since Unix. If you're looking for new, better ways to structure dynamic, interactive applications, I strongly recommend checking out RDP.

I would call RDP an orchestration model, since it cares about how you connect and assemble components of your app, and gives you a lot of freedom in what these components do and how they interact. This also fits with David's description of an RDP application as "a complex symphony of signals and declarative effects orchestrated in space and time".

In terms of Unix, RDP's behaviors correspond to processes, signals correspond to pipes, and resources correspond to storage and other external, stateful things.

Signals (pipes, channels)

A signal continuously delivers a potentially changing value. The current implementation always updates the complete value, but RDP doesn't rule out diff/patch-based signal updates, to model e.g. a large set as a signal value.

In addition to these simple signals carrying a single value, there are also compound signals, such as (x :&: y) which represents the concurrent, asynchronous product of signals x and y, IOW a signal representing two independently updating signals. Analogously, (x :|: y) represents a disjoint sum of signals, with either x or y being active at any given point in time.

A signal is either active (carrying a value), or inactive (disrupted). Application-level errors have to be modelled as part of the value, there is no "stderr".

Behaviors (processes, computation)

A behavior won't do anything until you place a demand on it. You place a demand on a behavior by applying an input signal (the demand) to it; the behavior will produce an output signal for the duration of this application.

Multiple demands can be placed on a behavior at the same time. The behavior can either reply to each input signal with a different output signal, or with the same output signal, depending on the purpose of the behavior. For example, a "calculator" behavior may take N input signals with expressions like "1 + 2" and "3 * 5" and deliver a distinct output for each input; on the other hand, a "sum" behavior may take N input signals carrying a number and produce a total sum as the output signal, which would be the same for all inputs.

Behaviors can be composed into dataflow networks. A simple composition is the pipeline behavior, b1 >>> b2: the input signal of this pipeline behavior will be processed by the behavior b1; b1's output signal becomes the input signal for behavior b2; and finally, b2's output signal becomes the output of the whole pipeline behavior.

The Sirea Haskell implementation of RDP comes with other behaviors such as bdup, that copies a single signal into both branches of a :&: product, for creating more complex networks of signals and behaviors. There are also primitives like bfirst and bsecond for running different behaviors against the branches of product signals, and bfmap for applying ordinary Haskell functions to signals. (See Arrows.)

Resources (storage, external state)

RDP doesn't say anything about state, so it has to come from the outside. Access to stateful resources such as filesystems is abstracted through behaviors: to access a filesystem you use a behavior like readFile "foo.txt" that continuously delivers the contents of the file "foo.txt" as output signal.

Creating new resources in RDP is impossible, so resource discovery idioms are used: for example, to "create" a file, you use a UUID as its name, and it will be automatically created the first time you write to it.

I hope this has been helpful. For further reading, check out the extensive README of the Sirea RDP implementation, and David Barbour's blog.

Sunday, June 22, 2014

Attack of the Monadic Morons

[Deleted. To be replaced with a proper rant at another time.]

Wednesday, June 18, 2014

Obsession with mathematics

To put it bluntly, the discipline of programming languages has yet to get over its childish passion for mathematics and for purely theoretical and often highly ideological speculation, at the expense of historical research and collaboration with the other social sciences. PL researchers are all too often preoccupied with petty mathematical problems of interest only to themselves. This obsession with mathematics is an easy way of acquiring the appearance of scientificity without having to answer the far more complex questions posed by the world we live in.
I've replaced "economics" with "programming languages" in this quote from Phil Greenspun's blog. Seems appropriate, no?

Tuesday, June 11, 2013

A week of Lisp in Madrid


Geeks from all Lisps of life met in Madrid last week for the European Common Lisp Meeting and European Lisp Symposium 2013.

A lot of things happened, so I'll just recount the most memorable ones.

I enjoyed meeting Pascal Costanza and Charlotte Herzeel. Pascal shares my disdain for the aberration that is Lisp-1 and doesn't tire of telling Schemers so. I think he's a bit too opposed to all things Scheme though, and when somebody tells him about the niceness that is, e.g. syntax-parse, he goes all "nyah, nyah, nyah, I cannot hear you". Charlotte is one of the handful of people who understand 3-Lisp because she re-implemented it.
(Apparently, 3-Lisp is quite similar to Kernel: every operative receives not only its operand tree and the lexical environment in which it is called, but also its continuation. I still don't understand it. Why would you pass the continuation to an operative, when it can easily obtain it using e.g. call/cc? Apparently because 3-Lisp considers the continuation to exist on the next meta-level, not the current one.)
Erik Sandewall, who worked with John McCarthy, presented his document-editor/operating-system/database/agent-environment called Leonardo. We talked a lot about it, and I found out I'm working on a very similar system in my own next-gen OS efforts.

A very enjoyable talk was SISCOG: a story written in Lisp, by Tiago Maduro Dias and Luís Oliveira. SISCOG was started by two professor-level Lisp nerds in the eighties, because they wanted to apply the language to something useful. If you've ridden a train in Europe, chances are it was scheduled by one of SISCOG's apps. They employ a large number of Lisp programmers in Portugal, and reported how they view Lisp (most of them like it). Anyone who claims that dynamic languages can't be used for something other than prototyping I'd like to hit over the head with the SISCOG manual, which is probably heavy, given the highly complex stuff they work on.

One of the absolute over-the-top experiences, both geek-wise, food-wise, and otherwise-wise was the extended dinner with Ludovic Courtès (Guix and Guile), Andy Wingo (Guile and V8), Florian Loitsch (Hop and Dart), and Sam Tobin-Hochstadt (Racketeer). It doesn't get much better for a PL geek than such a barrage of PL-nerdery, funny background stories, and Spanish cakes.

Another memorable chat was with the very nice Luke Gorrie about dynlangs, networking, and immigration and property buying in the world's most agreeable country, Switzerland.

I also enjoyed the discussions between Ernst "I'm very male" van Waning and Venetian gentleman  Janusz Prodrazik, who presented OpusModus. OpusModus is a very polished tool for composers that uses S-expressions instead of scores for composition. Composers seem to enjoy it.

The last, but not least, three days I enjoyed the company of style icon Faré and Masatoshi Sano, who came all the way from Boston and Tokyo, respectively. Faré and I basically agree about all major points about how the next OS will look like, and I feel like I spiritually joined the TUNES project and made a friend. (Watch out, TUNES will be on your lap sooner than you imagine!)

Greetings to Piotr Kuchta with whom I immediately hit it off, who ported Wat to Python, and who I look forward to visiting in rainy GB.

All in all, a memorable and enjoyable week in the beautiful city of Madrid. Thanks to the organizers for the flawless execution and see you all again next time!

Wednesday, May 29, 2013

Wat: now in Perl, Python, and Ruby, too

I'm delighted somebody by the name of "shadowcat-mst" has taken my Wat interpreter and reimplemented it in Perl. That really seems fitting. Wat now covers JavaScript and Perl - think of the possibilities!

Update: Piotr Kuchta ported Wat to Python.
Update: Victor Hugo Borja ported Wat to Ruby.

Thursday, May 9, 2013

Green threads in the browser in 20 lines of Wat

This page shows 5 independent, cooperatively scheduled Wat green threads (view source for full Wat code).

Each thread has an ID and is defined as a function that loops forever, repeatedly printing its ID, and then sleeping for a (randomly long) while.
(define (run-thread (id String))
  (loop
    (@appendChild (.body $document)
                  (@createTextNode $document (+ "Active thread: " id " ")))
    (sleep (* 1000 (@random $Math)))))
So, how can a Wat thread sleep inside a loop when JavaScript forbids sleeping? Why, with good ole delimited continuations:

To spawn a thread, I just wrap the call to RUN-THREAD in a prompt (which is just a unique object used as an identifier):
(define default-prompt 'default-prompt)

(define (spawn-thread (id String))
  (push-prompt default-prompt
    (run-thread id)))
Where it gets interesting is the SLEEP function which captures the continuation up to the prompt, and sets up a callback with JavaScript's setTimeout that will reinstall the continuation later:
(define (sleep (ms Number))
  (take-subcont default-prompt k
    (define (callback . #ignore)
      (push-prompt-subcont default-prompt k))
    ($setTimeout (js-callback callback) ms)))
So, first, SLEEP aborts up to and including the default prompt using TAKE-SUBCONT. It receives the continuation in the variable K. Once it has K, it defines a CALLBACK function, that will reinstall the default prompt with PUSH-PROMPT, and then continue with K again with PUSH-SUBCONT. All that's left is to give this callback to setTimeout.

Then I can spawn independent threads:
(spawn-thread "thread-1")
(spawn-thread "thread-2")
(spawn-thread "thread-3")
(spawn-thread "thread-4")
(spawn-thread "thread-5")
Wat is very new, but it's already practical for adding concurrency and metaprogramming to JavaScript. Deployment is very easy. Include the single wat.js file, put some Lisp code in a <script> tag, and run it.

Wednesday, May 8, 2013

A new low in programming language design and implementation

The new Wat is the best, most lightweight way to implement a JavaScript-based programming language I have found so far.

Basically, I get away from JS as quickly and painlessly as possible, and start writing the language in itself.

So I define a very small set of primitives on the joint foundation of Kernel-like first-class lexical environments and fexprs and delimited continuations. Fexprs are a great tool for language-oriented programming, and delimited continuations allow me to escape from the browser's (and Node's) async hell and implement any concurrency and effect system I like.

To fexprs I also add macros. When a macro is used as the operator of a form, the form's code gets  changed to the macro's output when the macro is first called, a technique I learned from here. I like macros because they make syntactic abstraction cost-free - with fexprs alone there is always an interpretative overhead. Still, Wat macros, like fexprs, do not work with quoted identifiers, but with first-class values, so many hygiene problems are avoided.

To delimited control I also add classic first-order control (sequential, conditional, loop, throw, catch, finally). This runs on the ordinary JS stack. Only when a continuation is captured does the stack get reified on the heap.

And last but not least, I use a JSON-based syntax for writing the language in itself. At first this was just intended as a quick way to not have to specify a parser for Wat, but I'm starting to like it. It allows Wat to truly be embedded in JavaScript.

Wat does not have a type tagging or object system. It uses the raw JavaScript values.

The whole implementation is roughly 350 lines of JavaScript. After these 350 lines, I can already write Wat in Wat, which is just great.

Sunday, May 5, 2013

Some progress on the Wat VM

Wat is back! If you'll recall, Wat is my ultra-minimal (~500 lines of JS) interpreter for a Kernel-based language with delimited continuations as well as first-order control, and hygienic macros as well as fexprs.

I'm pretty excited by some recent and ongoing changes, which make Wat even smaller and turn it into more of a VM than a full language. Wat will provide (just) the following features:
  • delimited continuations and delimited dynamic binding (higher-order control); these will be used to build cooperative multithreading with thread-local bindings
  • try/catch/finally (first-order control) integrated with the JS stack, but suspendable by continuation capture
  • fexprs as well as in-source self-modifying-code memoizing macros (which are hygienic, as they're built on Kernel)
  • a native JS interface
And that's about it. This should give an extremely minimal yet powerful infrastructure for building JavaScript-based languages.

And I gave up on quasiquotation and Scheme-style hygienic macros again. I just cannot get them to work in a satisfying manner.

Exempli gratia, here's some initial Wat VM "microcode" for bootstrapping a vaporlanguage.

Sunday, April 28, 2013

A quasiquote I can understand

I've written two Lisps (1, 2) with quasiquotation, and in both, quasiquotation was the most difficult thing to implement, and gave me the most headaches. That shouldn't be, right? After all, it only creates new forms.

I think now I've found a formulation for quasiquote that has a really simple implementation, and yields more or less the same results as existing quasiquote implementations.

Some definitions:
  • `foo stands for (quasiquote foo), `(foo) stands for (quasiquote (foo)).
  • ,foo stands for (unquote foo) and is only allowed within a quasiquote.
  • ,@foo stands for (unquote-splicing foo) and is only allowed within a quasiquote.
  • Quasiquote, unquote, and unquote-splicing only ever take a single operand.
  • `foo = 'foo, i.e. a quasiquoted symbol yields simply the symbol.
  • `"foo" = "foo", `12 = 12, and likewise for other literals.
The main difficulty I previously had with quasiquote came from unquote-splicing, which inserts a list of multiple elements into the constructed list (whereas nested quoted or unquoted forms only insert a single element). The main idea in this new formulation is to make inserting multiple elements the default, and define nested quoted or unquoted elements as simply inserting a list containing a single element.

Every `(...) expression therefore stands for an APPEND of the list's further processed elements.

For example, given

(define foo 1)
(define bar 2)
(define quux '(3 4))

the quasiquote expression

`(foo ,bar ,@quux)

stands for

(append (list 'foo) (list bar) quux)

which produces the following when evaluated:

'(foo 2 3 4)

So, processing a quasiquoted list works by wrapping each element, except for unquote-splicing forms, in a call to LIST, and APPENDing the results. Quoted elements (foo) get processed recursively. Unquoted elements (bar) are passed to the call to LIST unprocessed. Unquote-splicing forms (quux) are inserted directly into the APPEND form.

I haven't implemented this yet, but I think defining a quasiquoted list `(...) as an APPEND really simplifies things.

Friday, February 1, 2013

Taf's translation to O'Caml for type-checking

Taf is my new vapor-Lisp with row polymorphism, delimited continuations, and hygienic macros.

(Warning: incoherent rambling ahead!) Taf has a class-based object system with no inheritance. A class defines which slots an instance of this class has, and which methods are applicable to it. Every class also implicitly defines a class type. In addition to class types, there are also interface types or simply interfaces. An interface type defines a suite of methods applicable to objects of this type. Everything is an object of a single class, but may have many compatible class types and interface types. Every object is a member of a special top type. There is no implicit subtyping: objects need to be upcast to top.

All Taf objects are encoded as O'Caml objects. There is one O'Caml class for each Taf class. All classes inherit from a top class. Interfaces (method suites) are also defined as O'Caml classes. Any object can be statically upcast to top or any of the interfaces it implements. This is structural: an object can be upcast to an interface if it has all its methods. Objects have full RTTI, so they can also be dynamically downcast, resulting in an exception if the object is not of the given type. (A more convenient TYPECASE is provided as a wrapper.) Internally, downcasting is implemented via Obj.magic on the O'Caml side, and via a dynamic type-check in the VM. So Taf supports for example heterogenous containers containing arbitrary instances of top or of any other type. Any object can be put into the container, and taken out again, and cast back to its original class type. Likewise, it's possible to write methods that operate on objects of arbitrary types, such as Java's EQUALS. Types are parametric.

Another aspect is the semantics of the global environment. O'Caml's is basically that of a LET* for values, and LETREC only for groups of functions. But Lisp requires LETREC* for all values. So every binding must be encoded as a reference cell containing a Maybe on the O'Caml side, to model bindings which may be undefined.

The runtime, and also the code that produces O'Caml code will run in the browser. Eventually, the type-checker will be implemented in the language itself, so O'Caml will no longer be needed.

Update: here's a sneak preview of the Taf Language Manual.

Monday, January 14, 2013

Current project

In my quest for a good Lisp, I could no longer ignore static types.

See Taf - A plan for a statically-typed Lisp.

There shouldn't be any difficult roadblocks, so I expect a release sometime in or before summer.

Saturday, November 24, 2012

Monday, November 19, 2012

Thursday, November 15, 2012