A long time ago I discovered a wonderful book by Hal Abelson and Gerald Sussman called Structure and Interpretation of Computer Programs. The book aims to introduce engineering students to computing using the Scheme programming language. It presents a wonderful view of programming, investigating a wide variety of interesting and practical examples, and even showing how a language like Scheme can be implemented.
At about the same time I obtained access to one of the first releases of Rick Becker and John Chambers' New S language. I remember noticing both similarities and differences between S and Scheme. In particular, I remember that one day I wanted to show Alan Zaslavsky how you could use lexical scope to obtain own variables. I didn't have a copy of Scheme handy, so I tried to show him using S. My demonstration failed because of the differences in the scoping rules of S and Scheme. It left me thinking that there were useful additions which could be made to S.
Rather later, Robert Gentleman and I became colleagues at The University of Auckland. We both had an interest in statistical computing and saw a common need for a better software environment in our Macintosh teaching laboratory. We saw no suitable commercial environment and we began to experiment to see what might be involved in developing one ourselves.
It seemed most natural to start our investigation by working with a small Scheme-like interpreter. Because it was clear that we would probably need to make substantial internal changes to the interpreter, we decided to write our own rather than adopt one of the many free Scheme interpreters available at the time. This is not quite as daunting a task as it might seem. The process is well mapped out in books such as that of Abelson and Sussman and that of Kamin. Having access to the source code of a number of Scheme interpreters also helped with some of the concrete implementation details.
Our initial interpreter consisted of about 1000 lines of C code and provided a good deal of the language functionality found in the present version of R. To make the interpreter useful, we had to add data structures to support statistical work and to choose a user interface. We wanted a command-driven interface and, since we were both very familiar with S, it seemed natural to use an S-like syntax.
This decision, more than anything else, has driven the direction that R development has taken. As noted above, there are quite strong similarities between Scheme and S, and the adoption of the S syntax for our interpreter produced something which "felt" remarkably close to S. Having taken this first step, we found ourselves adopting more and more features from S.
Despite the similarity between R and S, there remain a number of key differences. The two fundamental differences result from R's Scheme heritage.
    > total <- 10
    > make.counter <-
    +   function(total = 0)
    +     function() {
    +       total <<- total + 1
    +       total
    +     }
    > counter <- make.counter()
    > counter()
    [1] 1
    > counter()
    [1] 2
    > counter()
    [1] 3
Generally, the scoping rules used in R have met with approval because they promote a very clean programming style. We have retained them despite the fact that they complicate the implementation of the interpreter.
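To make the effect of these scoping rules a little more concrete, here is a minimal sketch that extends the counter example. The names a and b, the starting value 100, and the stored results a1, a2 and b1 are illustrative assumptions. The sketch suggests that each call to make.counter produces a function with its own private total, and that the global variable of the same name is left alone.

```r
# A global variable that happens to share its name with the counter state.
total <- 10

# Same definition as the counter example, written with explicit braces.
make.counter <- function(total = 0) {
  function() {
    # `<<-` assigns to the `total` found in the enclosing function's
    # environment, not to the global `total`.
    total <<- total + 1
    total
  }
}

a <- make.counter()      # private total starts at 0
b <- make.counter(100)   # private total starts at 100 (illustrative value)

a1 <- a()   # first call on a
a2 <- a()   # second call on a
b1 <- b()   # first call on b, independent of a

print(c(a1, a2, b1))
print(total)   # the global variable is unchanged
```

Under dynamic or global-only lookup the assignment would have nowhere private to go, which is the kind of failure described in the anecdote about S above.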
The two differences noted above are of a very basic nature. In addition, we have experimented with a number of other features in R. A good deal of the experimentation has been with the graphics system (which is quite similar to that of S). Here is a brief summary of some of these experiments.
The colour string "#FFFF00" indicates full intensity for red and green with no blue, producing yellow.
"dotted"
).
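The hexadecimal colour string mentioned above can be checked directly in R. The following is a small sketch using the base functions col2rgb and rgb; the variable names components and yellow are illustrative assumptions, and the code only decodes and rebuilds the string rather than showing how R handles colours internally.

```r
# Decode the string into its red, green and blue components (0-255 scale).
components <- col2rgb("#FFFF00")
print(components)

# Build the same string from full-intensity red and green with no blue.
yellow <- rgb(255, 255, 0, maxColorValue = 255)
print(yellow)
```

A string of this form can be passed wherever a colour is expected, for example as the col argument of plot.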
A plot of the periodogram of a white-noise time series, showing the use of mathematical annotation.
expression(x^2+1) can be used to produce the mathematical expression x^2+1 as annotation in a plot.
The annotation system is relatively simple, and not designed to have the full capabilities of a system such as TeX. Even so, it can produce quite nice results. Figure * shows a simple example of a time series periodogram plot produced in R. The plot was produced with a single R command which used expression to describe the labels.
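As a rough illustration of how expression can describe plot labels, here is a sketch that draws the periodogram of a simulated white-noise series. This is not the command used for the figure: the seed, the series length of 256, the label wording and the variable names are all assumptions, and the periodogram is computed separately with spec.pgram before plotting.

```r
set.seed(1)       # arbitrary seed so the sketch is reproducible
x <- rnorm(256)   # simulated white noise; the length is an assumption

# Raw periodogram without tapering or detrending; plot = FALSE returns
# the frequencies and ordinates instead of drawing them.
pg <- spec.pgram(x, taper = 0, detrend = FALSE, plot = FALSE)

# Labels written as unevaluated mathematical expressions.
x_label <- expression(paste("Frequency ", omega))
y_label <- expression(I(omega))
main_label <- expression(
  I(omega) == frac(1, n) *
    group("|", sum(x[t] * e^{-i * omega * t}, t == 1, n), "|")^2
)

plot(pg$freq, pg$spec, type = "l",
     xlab = x_label, ylab = y_label, main = main_label)
```

The labels are ordinary R expressions, so they are written in R syntax and typeset when the plot is drawn.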
These graphical experiments were carried out at Auckland, but others have also found R to be an environment which can be used as a base for experimentation.