Saturday, June 26, 2010

Compile-Time Resolution

David Barbour has explored some highly interesting issues over on the C2 wiki, under the title of CompileTimeResolution. (The page is several years old, though.)

I just wrote him a mail with some of my thoughts, and maybe it's useful to some of you, so I'm replicating it here.



Hi David,

I just saw your C2 page on Compile-Time Resolution.

Highly interesting stuff, and the example of embedding JPEG data right into the code clarified a lot for me.

Even more so, your statement that compile-time code should be idempotent and side-effect-free rang a big bell with me.

In Scheme, there is current work regarding exactly this topic in the context of separate compilation and modularity, under the moniker of "phase separation".

Classic Lisps didn't know or care much about this, and EVAL-WHEN, their tool for handling this is seriously broken [1].

Schemers have started a rather rigorous investigation of this issue, probably starting with Matthew Flatt's work on modularity [2] (although Queinnec's earlier work on reflective towers [3] tackles similar issues.)

In work based on Flatt's, such as R6RS, it's required to explicitly demarcate the phase(s) in which an identifier is needed -- Schemes not only have a single compile-time, they may have a nested tower of compile-times, in case of lexically nested macros (LET-SYNTAX).

One of the latest contributions to the issue is the Implicit Phasing framework [4]. It shows that the programmer can be relieved from the burden of manually having to put code into a particular phase, and let the system infer the compile-times (phases) in which a given identifer is really used. Since that code should be idempotent and side-effect-free anyway, evaluating it multiple times, or not at all, isn't a problem.

Implicit phasing however, is (rightfully, IMO) critiqued for applying this principle even to runtime code. In a Scheme with implicit phasing, REQUIRE'ing a package for runtime will only load that package if, in fact, one of its exported identifiers is accessed, thereby violating a traditional Lisp assumption, namely that simply loading [edit: requiring] a module may have (important) side-effects.

However, for compile-time code, implicit phasing seems to be an advanced solution to the issue of compile-time resolution, so you might want to look at it.

In fact, in the new Lisp I am working on, a "strict" model is used for runtime dependencies (i.e. when you require a package for runtime it will be loaded no matter what), and a "lazy" model is used for compile-time dependencies (i.e. when you require a module "for compile-time", or "for-syntax" as Schemers call it, it will get loaded automatically, in whatever compile-time it is actually used, or not at all if it isn't actually used. After all, compile-time should be idempotent and side-effect free.)

Best regards,
Manuel

P.S. Even old Common Lisp is now getting pushed towards such a saner model by Fare Rideau's XCVB build system [5].

No comments: