Why code-as-data? (common-lisp)

Accepted answer
Score: 75

It means that the program code you write is also data that can be manipulated by a program. Take a simple Scheme expression like

(+ 3 (* 6 7))

You can regard it as a mathematical expression which, when evaluated, yields a value. But it is also a list containing three elements, namely +, 3 and (* 6 7). By quoting the list,

 '(+ 3 (* 6 7))

you tell Scheme to regard it as the latter, namely just a list containing three elements. Thus, you can manipulate this list with a program and then evaluate it. The power this gives you is tremendous, and once you "get" the idea, there are some very cool tricks to be played.
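A minimal sketch of "manipulate the list, then evaluate it", written in Common Lisp rather than Scheme (the same idea carries over). The variable names `*expr*` and `*swapped*` are made up for illustration.

```lisp
;; The quoted expression is an ordinary three-element list.
(defparameter *expr* '(+ 3 (* 6 7)))

(length *expr*)   ; number of elements: 3
(first *expr*)    ; the symbol +
(third *expr*)    ; the nested list (* 6 7)

;; Build a new program by swapping the operator, leaving the operands alone.
(defparameter *swapped* (cons '- (rest *expr*)))

(eval *expr*)     ; evaluates 3 + 42
(eval *swapped*)  ; evaluates 3 - 42
```

Nothing here is specific to code: `cons`, `first` and `rest` are the same list operations you would use on any other data.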

Score: 44

Code-as-data is actually only one side of the coin. The other is data-as-code.

The possibility to embed arbitrary data in Lisp code and load and reload it on the fly makes it (the data) very convenient to handle because it can eliminate any potential impedance mismatch between the way the data is represented and the way the code works.

Let me give you an example.

Let's say you want to write some kind of computer game with various monster classes. You have basically two choices: model the monster classes within your programming language or use a data-driven approach where the class descriptions are read from, say, an XML file.

Doing the modelling within the programming language has the benefits of ease of use and simplicity (which is always a good thing). It's also easy to specify custom behaviour depending on the monster class as needed. Finally, the implementation is probably pretty optimised.

On the other hand, loading everything from data files is much more flexible. You can do multiple inheritance where the language doesn't support it; you can do dynamic typing; you can load and reload things at run-time; you can use simple, to-the-point, domain-specific syntax, and much more. But now you need to write some kind of runtime environment for the whole thing, and specifying behaviour means either splitting the data up between the data files and the game code or embedding a scripting language, which is yet another layer of incidental complexity.

Or you can do it the Lisp way: specify your own sublanguage, translate that into code, and execute it. If the programming language you're using is sufficiently dynamic and syntactically flexible, you get all the benefits from using a data-driven approach (since code is data) combined with the simplicity of keeping everything in the code (since data is code).
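To make the "own sublanguage" idea concrete, here is a rough Common Lisp sketch. Everything in it is hypothetical: the `defmonster` form, the `*monsters*` registry, the `monster-prop` helper and the monster names are invented for this illustration, not taken from any real game.

```lisp
;; Registry of monster classes, keyed by symbol (hypothetical name).
(defvar *monsters* (make-hash-table :test #'eq))

;; DEFMONSTER is a made-up sublanguage form. It expands into plain code
;; that stores a property list; :on-hit takes an ordinary lambda expression,
;; so behaviour lives next to the data.
(defmacro defmonster (name &key (hp 10) (attack 1)
                                (on-hit '(lambda (damage) damage)))
  `(setf (gethash ',name *monsters*)
         (list :hp ,hp :attack ,attack :on-hit ,on-hit)))

(defmonster goblin :hp 7 :attack 2)

(defmonster troll :hp 30 :attack 5
  :on-hit (lambda (damage) (floor damage 2)))   ; trolls halve incoming damage

(defun monster-prop (name key)
  "Look up one property of a registered monster class."
  (getf (gethash name *monsters*) key))

(monster-prop 'goblin :hp)                  ; hit points of the goblin class
(funcall (monster-prop 'troll :on-hit) 10)  ; damage after the troll's rule
```

The `defmonster` forms read like a data file, yet they are code: re-evaluating one at the REPL replaces that monster class in the running program, and custom behaviour needs no separate scripting layer.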

This isn't specific to Lisp, by the way. There are various shades of gray in code-data equivalence between Lisp and, say, C++. Ruby, for example, makes embedding data within the application easier than Python does, and Python makes it easier than Java does. Both data-as-code and code-as-data are more of a continuum than they are either-or questions.

Score: 19

As a Lisp programmer you learn to think of a program source as data. It is no longer static text, but data. In some forms of Lisp the program itself is that data structure, which gets executed.

Then all the tools are oriented that way. Instead of a textual macro processor, Lisp has a macro system which works over programs as data. The transformation of programs to and from text also has its own tools (the reader and the printer).

Let's think about adding two elements of a vector:

(let ((v (vector 1 2 3)))
   (+ (aref v 0)
      (aref v 1)))

There is nothing unusual about it. You can compile and run it.

But you could also do this:

(let ((v (vector 1 2 3)))
   (list '+
         (list 'aref v 0)
         (list 'aref v 1)))

That returns a list with a plus symbol and two sublists. Each sublist contains the symbol aref, then the array that is the value of v, and the index value.

That means that the constructed program actually contains symbols, but also data. The array really is a part of the sublists. So you can construct programs, and these programs are data and can contain arbitrary data.

EVAL then evaluates the program as data.

CL-USER 17 > (setf *print-circle* t)
=>  T

The above tells the printer to print circular and shared data structures such that the identities are preserved when read back.

CL-USER 18 > (let ((v (vector 1 2 3)))
               (list '+
                     (list 'aref v 0)
                     (list 'aref v 1)))
=>  (+ (AREF #1=#(1 2 3) 0) (AREF #1# 1))

Now let's eval the data as a Lisp program:

CL-USER 19 > (EVAL (let ((v (vector 1 2 3)))
                     (list '+
                           (list 'aref v 0)
                           (list 'aref v 1))))

=>  3

If your compiler expects text as source, one can construct these texts, but they can never reference data directly. For this kind of text-based source construction many tools have been developed, but many of these tend to work in stages. In Lisp the functionality for manipulating data can be applied directly to manipulating programs, and this functionality is built in and part of the evaluation process.
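The contrast with text-based construction can be sketched as follows. This is only an illustration; the variable names `*v*`, `*form*` and `*text-form*` are made up, and `*print-array*` is bound explicitly because its initial value is implementation-dependent.

```lisp
(defparameter *v* (vector 1 2 3))

;; Program built as data: the form holds the very same vector object.
(defparameter *form* (list 'aref *v* 0))

;; Program built as text and read back: the reader allocates a new vector.
(defparameter *text-form*
  (let ((*print-array* t))          ; make sure the contents are printed
    (read-from-string (format nil "(aref ~S 0)" *v*))))

(eq *v* (second *form*))       ; true: same object
(eq *v* (second *text-form*))  ; false: a copy made by the reader

;; Mutating the original vector is visible only through the shared object.
(setf (aref *v* 0) 100)
(eval *form*)       ; sees the updated element
(eval *text-form*)  ; still sees the old element
```

The form built with `list` refers to the live vector, while the form that went through text can only carry a printed snapshot of it.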

So Lisp gives you an additional degree of freedom and new ways to think.

Score: 13

In Scheme (or any Lisp) you can declare list literals like this:

> '(1 2 3)
=> (1 2 3)

This is similar to many other high-level languages, except for slight differences in notation. For instance, this is how some other languages represent list literals:

[1, 2, 3] # Python
#(1 2 3) "Smalltalk. This is in fact an array in Smalltalk. Let us ignore that for now."

Lists can contain any type of values. As functions are first-class objects, a list can contain functions as well. Let us replace the first element in the above list with the symbol that names a function:

> '(+ 2 3)
=> (+ 2 3)

The single quote (') identifies the list literal (just like the # in Smalltalk). What will happen if we remove the quote? Then the Scheme interpreter will treat the list specially. It will consider the first element as a function (or procedure) and the rest of the elements as arguments to that function. The function is executed (or evaluated):

> (+ 2 3)
=> 5

The ability to represent executable code using a data structure in the language opens a new possibility - we can write programs that write programs. That means extensions that would require changes to the compiler and the runtime system in other languages can be implemented in Lisp as a few lines of Lisp itself. Imagine you need a new control structure in your language called when. It is similar to if but makes reading code a little more natural in some situations:

 (when this-is-true do-this)

You can extend your Lisp system to support when by writing a short macro:

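 ;; Common Lisp already provides WHEN, so this sketch uses the name MY-WHEN
 ;; to avoid redefining a standard macro.
 (defmacro my-when (condition &rest body)
    `(if ,condition (progn ,@body)))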

A macro is nothing but a function that receives code as lists and returns a new list, which is then expanded in place at compile time. More complex language structures or even entire paradigms could be added to the core language using such list transformations. For example, CLOS, the Common Lisp Object System, can be implemented largely as a collection of macros and functions written in Common Lisp itself.
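You can watch the expansion happen by asking for it as data. In this small sketch the macro is named `my-when`, a name assumed here so the example does not clash with the `when` that Common Lisp already defines.

```lisp
;; A user-defined conditional; MY-WHEN is a hypothetical name.
(defmacro my-when (condition &rest body)
  `(if ,condition (progn ,@body)))

;; MACROEXPAND-1 returns the expansion as an ordinary list, without running it.
(macroexpand-1 '(my-when (> x 0) (print x) x))
;; the expansion is the list (IF (> X 0) (PROGN (PRINT X) X))

;; Used like any built-in control structure:
(my-when (> 3 2) (+ 1 1))   ; condition holds, so the body is evaluated
(my-when (> 2 3) (+ 1 1))   ; condition fails, so the result is NIL
```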

Score: 12

Code-as-data refers to the fact that your code is expressed in terms of the language's data structures. I wouldn't try to argue here that it's the best way to program, but I find it to be a beautiful way to express the ideas in the code.

One of the benefits is that metaprogramming is very nearly the same as regular programming. With code-as-ascii-characters, you often end up having to do some serious parsing to do anything meta, and you skip those nasty bits with Lisp.
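As a small illustration of that point, here is a sketch of a code-analysis function in Common Lisp. The name `count-symbol` is invented for this example; it is plain list recursion with no parser involved.

```lisp
(defun count-symbol (symbol form)
  "Count how often SYMBOL occurs anywhere inside the code FORM."
  (cond ((eq form symbol) 1)                        ; found it
        ((consp form)                               ; descend into both halves
         (+ (count-symbol symbol (car form))
            (count-symbol symbol (cdr form))))
        (t 0)))                                     ; any other atom

;; The "source code" being analysed is just a quoted list.
(count-symbol 'aref
              '(let ((v (vector 1 2 3)))
                 (+ (aref v 0)
                    (aref v 1))))
;; AREF appears twice in this form
```

The same function would count occurrences in any nested list, which is exactly why it also works on programs.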

Score: 7

Unless you're using something like an old Harvard Mark I, your code is stored in the same place and manner as your data -- just (as you noted) probably in the form of ASCII characters, so it's really hard to do anything with. Chances are, most Java programmers have never parsed Java code on their own.

Look at any program -- there's an enormous wealth of information (well, depending on the program!) encoded in the source code itself. That's its reason for existing! By not using a homoiconic language, you're implicitly saying that you're OK with not being able to read that from another program you write (or that it's OK that it's so hard that nobody ever will). Basically the only program on your computer that can read it is the compiler, and the only thing it can do after reading is generate object code and error messages.

Imagine you had to work with some other data source every day, like XML files or an RDBMS, and that the only way to access that data was to run it through a "compiler" that converted it into a format you could read. I don't think anybody would argue that that's a good idea. :-)

I really don't know where I'm going with this, so I'll try to summarize my above ramblings:

  • I see code-as-data as just the logical next step after the move from Harvard architecture to von Neumann architecture
  • we already have X-as-data for pretty much every other X, so it seems odd to exclude the one kind of data that programmers spend all day manipulating
