Thursday, December 1, 2011

Clarifications of "Viewpoints Research Trip"

Computations and data are dual; a computation is a function from data to data, and a datum is a function from computations to more computations. (In particular, a data type is described completely by its "fold" and "unfold" operations) This duality is at the heart of programming language design. The sole difference between call-by-value and call-by-name lambda calculi is whether they treat functions as computations or as data, yet it leads to an enormous difference in programming style.

The main sticking point is that, due to what remains of the von Neumann architecture, there is something called "code"; the stored representation of a program. Code is not computation; it is data. This is why one is able to write compilers, reverse-engineer DRM, etc. - if code was actually a computation, instead of merely data describing a computation, it would be impenetrable. One doesn't read compiled binaries for pleasure, but the fact remains that they are still legible - they can be disassembled, studied, and as the Boomerang project showed, even turned back into high-level code. Actual computations are completely ephemeral; however hard one looks to find a computation, one will only see data. But they are an incredibly convenient abstraction for understanding the universe, so they can't be ignored.

Code is mobile; it moves around whenever one updates or installs an application. Computation is not, because there's nothing that can be moved.

Computation should be avoided if possible; it is slow, error-prone, and generally a nasty business. Data should reign supreme.

The web of today is in fact mobile code; particularly, 99% of it is written in Javascript. They do "download code and run it within a “page” with constrained resources". (One gets an "This page is taking too long to run" message after some time) I'm not certain what you mean by "mini operating systems", but Google and Mozilla are experimenting with web-centric operating systems that boot straight to a browser, and the amount of code that needs to be added is apparently quite small.

There is no difference between data and code; trying to create a "“document format” that represents documents as data, not as code" is futile. What one can do, on the other hand, is write it in a more or less declarative language; in other words, shift the burden of implementation from the programmer to the compiler writer.

The difference between Word, Excel, and PowerPoint documents is artificial; HTML is better than all of them. (Although the default presentation of the data is Word-like, and thus some javascript and css are required to allow Excel / PowerPoint formats; but see http://www.w3.org/Talks/Tools/Slidy2/ for how easy it is; see Popcorn.js and other things for how the web is evolving beyond these tools - the main issue of course is that a browser is a document consumption tool, not a document creation tool, so most people do not have the requisite knowledge)

I, of course, am indeed trying to solve "the whole problem of data representation, interpretation, transmission, and re-interpretation." There have been some quite promising results lately; extensible grammars, compilers, backends... the problem I wish to solve is that of creating a language without being tied down to any design decisions; in other words, a completely modular system; it seems plausible today. The design decisions would be offloaded into libraries, and developed by their users; but the underlying language would be the same, so that one could borrow constructs / algorithms / optimizations from other people without difficulty.

No comments:

Post a Comment