Workshop on Scheme

Scheme vs. Pascal

In ``An outline of CS topics and courses, based on the Revised Model Curriculum by G. M. Schneider and H. M. Walker,'' Henry Walker provised a checklist of topics considered appropriate to CS1. Let's go down that list and compare the support for each topic provided by the Pascal and Scheme programming languages:

Algorithms and algorithmic problem-solving. Both languages offer very good support for the concise and clear expression of algorithms.

One fundamental problem-solving paradigm chosen from: procedural, object-oriented, functional, possibly with the introduction of a second paradigm. Pascal exemplifies the procedural paradigm and offers no support for the others. Scheme exemplifies the functional paradigm and can easily be used to explain the procedural paradigm; Scheme supports the message-passing style of the object-oriented paradigm reasonably easily, but it is difficult to demonstrate inheritance, virtual functions, etc., in Scheme.

Simple data types: integers, reals, Booleans, characters, strings(?), enumerative types, pointers(?). Both languages support integers, reals, Booleans, and characters. Scheme's support for strings is much better than Pascal's. Scheme provides symbols rather than enumerations. Scheme does not provide pointers.

Structured data types: records, arrays, lists(?), trees(?). Scheme's data structures are pairs and vectors. It does not directly support multidimensional arrays, as Pascal does, although it is not difficult to write a procedure library to support them. Similarly, it provides heterogeneous vectors for the purposes for which Pascal provides records. Scheme provides good support for a list type, implemented in terms of pairs and the null object. Neither language supports trees directly.

Classical algorithms: linear and binary search, insertion or selection sorting, quicksort. Easily formulable in either Pascal or Scheme.

Introduction to sequential files. Pascal directly supports binary files, which Scheme does not. Pascal's writeln procedure allows finer control over numeric output than Scheme's display and write procedures; if this is an issue, however, the format procedure in the SLIB library or other canned procedures (such as fixed-point) can be used.

Control and Evaluation Structures: Sequence; Iteration: while, for, loop/exit; Flow of control during procedure and function invocation; Expression Evaluation; Recursion; tail recursion. Both languages have sequencing and conditional constructions. For iteration, Pascal provides for, while, and repeat, and Scheme provides do and named-let expressions, so Pascal's iteration constructs are nicer and more readable, though perhaps less flexible (e.g., they need explicit Boolean flags far more often than Scheme's iteration constructs do). Procedure and function invocation is perhaps a little simpler in Scheme because there are no variable-parameters. In Scheme (assuming we ignore call-with-current-continuation) control structures are all instances of expression evaluation, which therefore does not exist as a separate topic. Both recursion and tail recursion are easier to explain in Scheme, partly because the syntax of the construction by which a value is returned from a recursive call is more natural.

Introduction to concepts of modularity, decomposition, procedural and data abstraction, program libraries. Somewhat easier in Scheme because (1) a procedure definition is a program and can be loaded interactively without change, and (2) it is possible in Scheme to create objects with state that respond only to a fixed repertoire of procedures and control storage that cannot be accessed except through those procedures.

Software development, including specifications, design, coding, debugging, testing. The languages provide about equally good support for specification, design, and coding, although stub programming and top-down programming are easier in Scheme because a procedure definition can textually follow a call to that procedure (provided that the call is not evaluated before the definition is encountered). Debugging and testing are easier in Scheme.

Social issues: intellectual property, liability, privacy, ethical behavior, access. The choice of language is probably irrelevant to these issues. (Note, however, that many good implementations of Scheme are freeware, while few good implementations of Pascal are.)

Here's a small, real-world programming project developed in both languages, for purposes of comparison. The premise is that our college astronomer has measured two observable physical quantities at various times and stored the measured values in a text file named RSCyg. Here are the first few lines of this data file:

7.468   2.850
7.532   2.871
7.628   2.898
7.633   2.884
7.579   2.890
7.612   2.902
7.548   2.881
7.551   2.851

In this particular case there are 261 pairs of values, but on other occasions there might be more or fewer.

The program is to compute and display (on standard output) the coefficient of correlation for these two quantities. First, here's the Pascal version:

{ This standard Pascal program reads in from a source file two columns of
  data, representing measurements of two observable quantities at various
  times.  It computes and displays the coefficient of correlation computed
  from the data.

  John Stone -- February 17--20, 1995; July 21, 1995.
}
program correlations (source, output);
const
  datasetsize = 1000;  { the maximum length of each column of data }
type
  dataset = array [1 .. datasetsize] of real;
var
  source: text;          { the file containing the data }
  alpha, beta: dataset;  { the two columns of data }
  len: integer;          { the length of the two columns }

  { This function computes the arithmetic mean of the numbers in a given
    dataset.  (If the dataset contains no numbers, the function returns
    0.0.)
  }
  function mean (arr: dataset; len: integer): real;
  var
    total: real;     { a running total of the numbers in the dataset }
    index: integer;  { counts off the numbers as they are added in }
  begin
    if len = 0 then
      mean := 0.0
    else begin
      total := 0.0;
      for index := 1 to len do
        total := total + arr[index];
      mean := total / len
    end
  end;

  { This function computes the standard deviation of the numbers in a given
    dataset.  (If the dataset contains no numbers, the function returns
    0.0.)
  }
  function stdev (arr: dataset; len: integer): real;
  var
    mu: real;        { the arithmetic mean of the numbers in the dataset }
    total: real;     { a running total of the squared deviations from the
                       mean } 
    index: integer;  { counts off the numbers as their squared deviations
                       are added in } 
  begin
    if len = 0 then
      stdev := 0.0
    else begin
      mu := mean(arr, len);
      total := 0.0;
      for index := 1 to len do
        total := total + sqr (mu - arr[index]);
      stdev := sqrt (total / len)
    end
  end;

  { This function takes two datasets and computes the observed coefficient
    of correlation between the quantities they represent. (If the length of
    the datasets is zero, or if the standard deviation of either dataset is
    zero, the function returns 0.0 as the coefficient of correlation.
  }
  function correlation (arr1, arr2: dataset; len: integer): real;
  var
    sigmaproduct: real;  { the product of the standard deviations of the
                           two datasets }
    mu1, mu2: real;      { the means of the two datasets }
    total: real;         { a running total of the products of the numbers'
                           disparities from their respective means }
    index: integer;      { counts off the pairs of numbers as they are
                           processed }
  begin
    if len = 0 then
      correlation := 0.0
    else begin
      sigmaproduct := stdev (arr1, len) * stdev (arr2, len);
      if sigmaproduct = 0.0 then
        correlation := 0.0
      else begin
        mu1 := mean (arr1, len);
        mu2 := mean (arr2, len);
        total := 0.0;
        for index := 1 to len do
          total := total + (arr1[index] - mu1) * (arr2[index] - mu2);
        correlation := (total / len) / sigmaproduct
      end
    end
  end;

  { This procedure discards whitespace characters from a given input file.
  }
  procedure skipwhitespace (var source: text);
  var
    continue: Boolean;  { indicates whether the discarding can and should
                          continue }
  begin
    continue := not eof (source);
    while continue do
      if source^ in [' ', chr (9), chr (10), chr (12), chr (13)] then begin
        get (source);
        continue := not eof (source)
      end
      else
        continue := false
  end;

  { This procedure reads in the datasets from a given source file,
    returning them through its parameters.  The number of values in each
    dataset is also returned through a parameter.
  }
  procedure readdatasets (var source: text; var alpha, beta: dataset;
      var len: integer);
  begin
    len := 0;
    skipwhitespace (source);
    while not eof (source) do begin
      len := len + 1;
      readln (source, alpha[len], beta[len]);
      skipwhitespace (source)
    end
  end;

begin { main program }
  reset (source);
  readdatasets (source, alpha, beta, len);
  writeln ('Correlation = ', correlation (alpha, beta, len) : 1 : 4)
end.

And here is the Scheme version:

;;; This Scheme program reads in, from a file named RSCyg, two columns of
;;; data, representing measurements of two observable quantities at various
;;; times.  It computes and displays the coefficient of correlation
;;; computed from the data.

;;; John Stone -- February 17--20, 1995; July 21, 1995.

;; This procedure computes the arithmetic mean of the numbers in a given
;; list.  (If the list is empty, it returns 0.0.)

(define (mean li)
  (let ((len (length li)))
    (if (zero? len) 0.0 (/ (apply + li) len))))

;; This procedure computes the standard deviation of the numbers in a list
;; by taking the square root of its variance.

(define (stdev li) (sqrt (variance li)))

;; The square procedure finds the square of a given number.

(define (square x) (* x x))

;; This procedure computes the variance of the numbers in a given list.
;; (If the list is empty, it returns 0.0.)

(define (variance li)
  (let ((len (length li)))
    (if (zero? len)
        0.0
        (let ((mu (mean li)))

          (define (squared-disparity val)  ; The squared disparity of a
            (square (- mu val)))           ; datum is the square of its
                                           ; difference from the mean.

          (/ (apply + (map squared-disparity li)) len)))))

;; This procedure takes two lists of numbers, assumed to be equal in
;; length, and computes the observed covariance of the quantities they
;; represent.

(define (covariance l1 l2)
  (let ((len (length l1)))
    (if (zero? len)
        0.0
        (let ((mu1 (mean l1))
              (mu2 (mean l2)))

          (define (product-of-differences v1 v2)
            (* (- v1 mu1) (- v2 mu2)))

          (/ (apply + (map product-of-differences l1 l2)) len)))))

;; This procedure takes two lists of numbers, assumed to be equal in
;; length, and computes the observed coefficient of correlation between the
;; quantities they represent.  (If the standard deviation of either list is
;; zero, the procedure returns zero as the coefficient of correlation.)

(define (correlation l1 l2)
  (let ((sigmaproduct (* (stdev l1) (stdev l2))))
    (if (zero? sigmaproduct)
        0.0
        (/ (covariance l1 l2) sigmaproduct))))

;; This procedure takes an input port and discards whitespace characters
;; from it.

(define (skip-whitespace source)
  (let loop ((next-char (peek-char source)))
    (if (and (not (eof-object? next-char))
             (char-whitespace? next-char))
        (begin
          (read-char source)
          (loop (peek-char source))))))

;; Given the name of a file containing the two columns of data, this
;; procedure opens the file, reads in the data, closes the file, and
;; returns the data as a pair of lists.

;; The named-let expression that reads in the data skips past any leading
;; whitespace, then checks to see if the end of the data has been reached.
;; If so, it closes the input port and returns the two columns of data;
;; otherwise, it reads in two values from the file, attaches them to the
;; two columns that are being built up, and restarts the loop to pick up
;; the rest of the data. 

(define (read-datasets filename)
  (let ((source (open-input-file filename)))
    (let loop ((alpha '())
               (beta '()))
      (skip-whitespace source)
      (if (eof-object? (peek-char source))
          (begin
            (close-input-port source)
            (cons (reverse alpha) (reverse beta)))
          (let* ((first (read source))
                 (second (read source)))
            (loop (cons first alpha) (cons second beta)))))))

;; The main program reads in the data and computes and writes out the
;; correlation coefficient.

(let ((datasets (read-datasets "RSCyg")))
  (display "Correlation = ")
  (display (correlation (car datasets) (cdr datasets)))
  (newline))

My feeling is that the two versions of the program are comparable in clarity and intelligibility. I like the concision that the functional programming style brings to the definition of procedures such as covariance, but I can see that there's also an argument for using the more concrete procedural model of the Pascal version.

The Scheme version of skip-whitespace is clearly better; the Pascal version shows the classical synchronization problem in which the decision about whether the loop must continue has to be made in the interior of the loop instead of at the head of the loop, where Pascal wants it. (And there's another synchronization kludge in readdatasets -- the call to skipwhitespace must be repeated so that any blank lines at the end of the file are cleared out of the way just before the test for end-of-file is made.) Finally, if on some future run the source file turns out to contain 1029 pairs of values, the Pascal program crashes and the Scheme program does not.

This document is available on the World Wide Web as

http://www.math.grin.edu/~stone/events/scheme-workshop/Scheme-vs-Pascal.html

created July 21, 1995
last revised July 21, 1995

John David Stone (stone@math.grin.edu)