Planet Scheme

May 28, 2012

Jeremy Kun

The Fourier Transform — A Primer

In our last primer we saw the Fourier series, which flushed out the notion that a periodic function can be represented as an infinite series of sines and cosines. While this is fine and dandy, and quite a powerful tool, it … Continue reading

by j2kun at May 28, 2012 12:00 AM

May 25, 2012

Programming Praxis

Ackermann’s Function

In the 1920s, Wilhelm Ackermann demonstrated a computable function that was not primitive-recursive, settling an important argument in the run-up to the theory of computation. There are several versions of his function, of which the most common is

A(m,n) = \left\{    \begin{array}{ll}      n+1 & \mbox{if \(m=0\)} \\      A(m-1,1) & \mbox{if \(m > 0\) and \(n = 0\)} \\      A(m-1, A(m,n-1)) &        \mbox{if \(m > 0\) and \(n > 0\)}    \end{array}  \right.

defined over non-negative integers m and n. The function grows very rapidly, even for very small inputs; as an example, A(4,2) is an integer of about 20,000 digits.

Your task is to implement Ackermann’s function. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at May 25, 2012 09:00 AM

May 22, 2012

Programming Praxis

Hamming Codes

Hamming codes, which were invented by Richard Hamming in 1950, are a method of transmitting data over a noisy channel that give the recipient the ability to correct simple errors. The sender adds parity bits to the transmission stream so that, when the data bits and parity bits are combined, any single-bit error, in either the data bits or parity bits, can be identified and corrected. The number of parity bits that are required is given by the Hamming rule d + p + 1 ≤ 2p where d is the number of data bits and p is the number of parity bits. The length of the code word c, which combines the data bits and parity bits, is d + p, and a Hamming code is described by (c,d). We will illustrate using a 4-bit data word, which requires 3 parity bits to satisfy the Hamming rule (2 is insufficient because 4+2+1>4, but 3 is sufficient because 4+3+1≤8) and is described by (7,4).

A particular instance of a Hamming code uses two matrices, G the generator matrix and H the syndrome matrix. Here are sample generator (left) and syndrome (right) matrices for a (7,4) Hamming code:

1 0 0 0 1 1 1    1 0 1 1 1 0 0
0 1 0 0 0 1 1    1 1 0 1 0 1 0
0 0 1 0 1 0 1    1 1 1 0 0 0 1
0 0 0 1 1 1 0

The generator matrix is denoted [ I : A ] and consists of an identity matrix I in the left-most four columns (the number of data bits) and a parity coding matrix A in the right-most three columns (the number of parity bits). There is no formula for the A matrix; it must be constructed so that each data bit is checked by two or more parity bits, in such a way that no combination of parity bits overlaps a data bit. The syndrome matrix is denoted [ AT : I ] and consists of the transpose of the parity coding matrix A in the left-most four columns and the identity matrix in the right-most four columns.

A data word is encoded by multiplying it by the generator matrix, with all arithmetic done modulo 2; we gave an algorithm for matrix multiplication in a previous exercise. For instance, the data word [1 0 0 1] is encoded as [1 0 0 1 0 0 1] like this:

                                  | 1 |
                                  | 0 |
              | 1 0 0 0 1 1 1 |   | 0 |
| 1 0 0 1 | * | 0 1 0 0 0 1 1 | = | 1 |
              | 0 0 1 0 1 0 1 |   | 0 |
              | 0 0 0 1 1 1 0 |   | 0 |
                                  | 1 |

Decoding is the inverse operation, with the syndrome matrix multiplied by the encoded data:

                    | 1 |
                    | 0 |
| 1 0 1 1 1 0 0 |   | 0 |   | 0 |
| 1 1 0 1 0 1 0 | * | 1 | = | 0 |
| 1 1 1 0 0 0 1 |   | 0 |   | 0 |
                    | 0 |
                    | 1 |

If all of the result bits are zero, then the transmission succeeded without errors, and the input code is the first four bits of the encoding [1 0 0 1]. But if there was a transmission error, the syndrome points to it. For instance, if the recipient received the tranmission as [1 0 1 1 0 0 1], then the syndrome computes as [1 0 1], which matches the third column of the H matrix, indicating that the third bit of the transmission was in error, so instead of [1 0 1 1] the message is [1 0 0 1], which is correct.

Your task is to write functions that encode and decode a message using Hamming codes as described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at May 22, 2012 09:00 AM

May 20, 2012

Jeremy Kun

Double Angle Trigonometric Formulas

Problem: Derive the double angle identities Solution: Recall from linear algebra how one rotates a point in the plane. The matrix of rotation (derived by seeing where and go under a rotation by , and writing those coordinates in the … Continue reading

by j2kun at May 20, 2012 04:28 AM

May 18, 2012

Programming Praxis

Formatted Numeric Output

It is often necessary for programs to produce numeric output in various formats, and most languages provide libraries for this purpose; for instance, C provides the printf function, which includes the d and f format specifications for decimal numbers (integers) and floating point numbers, respectively.

Your task is to write library functions that format integers and floating point numbers; you may follow the formatting conventions of C, or those of some other language, or invent your own. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at May 18, 2012 09:00 AM

May 16, 2012

Andy Wingo

stranger in these parts

My JSConf 2012 video is out! Check it out:

The talk is called "Stranger in these parts: A hired gun in the JS corral", and in it I talk about my experiences as a Schemer in the implementation world, with a focus on JavaScriptCore, the JS implementation of the WebKit project.

If you want, you can fetch the slides or the notes. If you were unable to play the video in the browser, you can download it directly (25 minutes, ~80 MB, CC-BY-SA).

Special thanks to the A/V team for the fine recording. My talk was the first one that used the wireless mics, and it turned out there was some intermittent interference. They corrected this in later talks by double-miking the speakers. In my case, it was fortunate that they recorded the room as well, and with (I would imagine) a fair amount of post-processing the sound is perfectly intelligible. Cheers!

Finally, there were a number of other interesting talks whose recordings are starting to come out. I especially liked David Nolen's funhouse of ClojureScript and logic programming. It was pleasantly strange to see him mention Peter Norvig's 1992 book, Paradigms of Artificial Intelligence Programming, because I did too, and I think someone else did as well. Three people mentioning a somewhat obscure 20-year-old book, what are the odds?

I also liked Vyacheslav's amusing talk on V8's optimizing compiler. He actually showed some assembler! Folks that read this webrag might find it interesting. Dan Ingalls' talk was fun too. The ending scene was pretty surreal; be sure to watch all the way through.

Thanks again to the JSConf organizers for the invitation to speak. It was a pleasure to get to hang around the lively and energetic JS community. Happy hacking, all!

by Andy Wingo at May 16, 2012 11:35 AM

May 15, 2012

Programming Praxis

Streaming Knapsack

A famous problem of computer science is the knapsack problem, in which you are to find a combination of items from a population that sums to a given target, often with some kind of constraint such as maximizing the value of the items. In today’s problem we want to find the first possible combination of k integers from a stream of positive integers that sum to n. For instance, given the input stream 4, 8, 9, 2, 10, 2, 17, 2, 12, 4, 5, …, we want to find the knapsack containing 4, 2, 10, 2, 2 immediately after reading the third 2, without reading the 12, 4, 5 that follow it.

Your task is to write a program that takes parameters k and n and an input stream and returns the first possible knapsack. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at May 15, 2012 09:00 AM

May 14, 2012

Andy Wingo

doing it wrong: cse in guile

Greetings, readers! It's been a little while, but not because I haven't been doing anything, or nothing worth writing about. No, the reason I haven't written recently is because the perceived range of my ignorance has been growing faster than the domain of my expertise.

Knowledge may well be something one can dominate, but ignorance must forever be a range, stretching off to a hazy horizon. Climbing the hill of JavaScript implementations has let me see farther out on that plain. I think it's only the existence of the unknown unknowns that can let one muster up the arrogance to write at all.

But back to domains and dominators later. To begin, I note that there is very little in the way of correct, current, and universal folk wisdom as to how to implement a programming language. Compiler hackers are priests of their languages, yes, but their temples are in reality more or less isolated cults, in which the values of one's Gods may be unknown or abhorrent to those of others. Witness the attention paid to loop optimizations in some languages, compared to garbage collection in others, or closures in still others.

In my ecumenical capacity as abbot of Guile and adjunct deacon of JavaScriptCore, sometimes I'm tempted to mix rites: to sprinkle the holy water of lexical scope optimizations on JS, and, apropos of this article, to exorcise common subexpressions in Scheme.

As one might well imagine, the rites of one cult must be adapted to circumstances. I implemented CSE for Guile, but I don't know if it was actually a win. In this article I'll go into what CSE is, how it works in Guile, why it probably won't survive in its present form.

cse: common subexpression elimination

I implemented a source-to-source optimization pass in Guile that eliminates common subexpressions. It actually does both more and less than that: it propagates predicates and eliminates effect-free statements as well, and these latter optimizations are why I implemented the pass in the first place.

Let me give an example. Let's imagine we implement a binary tree in Guile, using the records facility.

(use-modules (srfi srfi-9))

(define-record-type btree
  (make-btree elt left right)
  btree?
  (elt btree-elt)
  (left btree-left)
  (right btree-right))

(define *btree-null* #f)

(define (btree-cons head tail)
  (if (btree? tail)
      (let ((elt (btree-elt tail)))
        (if (< elt head)
            (make-btree elt
                        (btree-left tail)
                        (btree-cons head (btree-right tail)))
            (make-btree elt
                        (btree-cons head (btree-left tail))
                        (btree-right tail))))
      (make-btree head
                  *btree-null*
                  *btree-null*)))

That's enough to illustrate my point, I think. We have the data type, the base case, and a constructor. Of course in Haskell or something with better data types it would be much cleaner, but hey, let's roll with this.

If you look at btree-cons, it doesn't seem to be amenable in its current form to classic common subexpression elimination. People don't tend to write duplicated code. You see that I bound the temporary elt instead of typing (btree-elt btree) each time, and that was partly because of typing, and partly out of some inner efficiency puritan, telling me I shouldn't write duplicate expressions. (Cult behavior, again!)

But, note that all of these record abstractions will probably inline, instead of calling out to procedures. (They are implemented as inlinable procedures. Yes, it would be better to have cross-module inlining, but we don't, so this is what we do.) In general, syntactic abstraction in Scheme can lead to duplicate code. Apologies in advance for this eyeball-rending torrent, but here's a listing of what btree-cons reduces to, after expansion and partial evaluation:

(define (btree-cons head tail)
  (if (and (struct? tail)
           (eq? (struct-vtable tail) btree))
      (let ((elt (if (eq? (struct-vtable tail) btree)
                     (struct-ref tail 0)
                     (throw 'wrong-type-arg
                            'btree-elt
                            "Wrong type argument: ~S"
                            (list tail)
                            (list tail)))))
        (if (< elt head)
            (let ((left (if (eq? (struct-vtable tail) btree)
                            (struct-ref tail 1)
                            (throw 'wrong-type-arg
                                   'btree-left
                                   "Wrong type argument: ~S"
                                   (list tail)
                                   (list tail))))
                  (right (btree-cons
                           head
                           (if (eq? (struct-vtable tail) btree)
                               (struct-ref tail 2)
                               (throw 'wrong-type-arg
                                      'btree-right
                                      "Wrong type argument: ~S"
                                      (list tail)
                                      (list tail))))))
              (make-struct/no-tail btree elt left right))
            (let ((left (btree-cons
                          head
                          (if (eq? (struct-vtable tail) btree)
                              (struct-ref tail 1)
                              (throw 'wrong-type-arg
                                     'btree-left
                                     "Wrong type argument: ~S"
                                     (list tail)
                                     (list tail)))))
                  (right (if (eq? (struct-vtable tail) btree)
                             (struct-ref tail 2)
                             (throw 'wrong-type-arg
                                    'btree-right
                                    "Wrong type argument: ~S"
                                    (list tail)
                                    (list tail)))))
              (make-struct/no-tail btree elt left right))))
      (let ((left *btree-null*) (right *btree-null*))
        (make-struct/no-tail btree head left right))))

Again, I'm really sorry about that, and it's not just for your eyes: it's also because that's a crapload of code for what should be a simple operation. It's also redundant! There are 6 checks that btree is in fact a btree, when only one is needed semantically. (Note that the null case is not a btree, of course.)

Furthermore, all of the checks in the first arm of the if are redundant. The code above is what the optimizer produces -- which is, you know, turrible.

So, I thought, we could run a pass over the source that tries to propagate predicates, and then tries to fold predicates whose boolean value we already know.

And that's what I did. Here's what Guile's optimizer does with the function, including the CSE pass:

(define (btree-cons head tail)
  (if (and (struct? tail)
           (eq? (struct-vtable tail) btree))
      (let ((elt (struct-ref tail 0)))
        (if (< elt head)
            (let ((left (struct-ref tail 1))
                  (right (btree-cons head (struct-ref tail 2))))
              (make-struct/no-tail btree elt left right))
            (let ((left (btree-cons head (struct-ref tail 1)))
                  (right (struct-ref tail 2)))
              (make-struct/no-tail btree elt left right))))
      (let ((left *btree-null*) (right *btree-null*))
        (make-struct/no-tail btree head left right))))

This is much better. It's quite close to the source program, except the symbolic references like btree-elt have been replaced with indexed references. The type check in the predicate of the if expression propagated to all the other type checks, causing those nested if expressions to fold.

Of course, CSE can also propagate bound lexicals:

(let ((y (car x)))
  (car x))
=> (let ((y (car x)))
     y)

This is the classic definition of CSE.

but is it a win?

I should be quite pleased with the results, except that CSE makes Guile's compiler approximately twice as slow. Granted, in the long run, this should be acceptable: code is usually run many more times than it is compiled. But this is a fairly expensive pass, and yet at the same time it's not as good as it could be.

In order to get to the heart of the matter, I need to explain a little about the implementation. CSE is a post-pass, that runs after partial evaluation (peval). I tried to make it a part of peval, as the two optimizations are synergistic -- oh yes, let's revel in that word -- are you feeling it? -- but it was too complicated in the end. The reason is that in functions like this:

(define (explode btree)
  (unless (btree? btree)
    (error "not a btree" btree))
  (values (btree-head btree)
          (btree-left btree)
          (btree-right btree)))

Here we have a sequence of two expressions. Since the first one bails out if the predicate is false, we should propagate a true predicate past the first expression. This means that running CSE on an expression returns two values: the rewritten expression, given the predicates already seen; and a new set of predicates that the expression asserts. We should be able to use these new assertions to elide the type checks in the second expression. And indeed, Guile can do this.

Perhaps you can see the size of the problem now. CSE is a pass that runs over all code, building up an ordered set of expressions that were evaluated, and in what context. When it sees a new expression in a test context -- as the predicate in an if -- it checks to see if the set contains that expression (or its negation) already, in test context, and if so tries to fold the expression to true or false. Already doing this set lookup and traversal is expensive -- at least N log N in the total size of the program, with some quadratic components in the case an expression is found, and also with high constant factors due to the need for custom hash and equality predicates.

The quadratic factor comes in when walking the set to see if the elimination is valid. Consider:

(if (car x)
    (if (car x) 10 20)
    30)

Here, we should be able to eliminate the second (car x), folding to (if (car x) 10 30). However, in this one:

(if (car x)
    (begin
      (y)
      (if (car x) 10 20))
    30)

If we don't know what (y) does, then we can't fold the second test, because perhaps (y) will change the contents of the pair, x. The information that allows us to make these decisions is effects analysis. For the purposes of Guile's optimizer, (car x) has two dependencies and can cause two effects: it depends on the contents of a mutable value, and on the value of a toplevel (x), and can cause the effect of an unbound variable error when resolving the toplevel, or a type error when accessing its car. Two expressions commute if neither depends on effects that the other causes.

I stole the idea of doing a coarse effects analysis, and representing it as bits in a small integer, from V8. Guile's version is here: effects.scm. The ordered set is a form of global value numbering. See the CSE pass here: cse.scm.

The commute test is fairly cheap, but the set traversal is currently a bit expensive.

and for what?

As I have indicated, the pass does do something useful on real programs, as in the binary tree example. But it does not do all it could, and it's difficult to fix that, for a few reasons.

Unlike traditional CSE, Guile's version of it is interprocedural. Instead of operating on just one basic block or one function, it operates across nested functions as well. However, only some dependencies can commute across a function boundary. For example:

(lambda (x)
  (if (pair? x)
      (let ((y (car x)))
        (lambda ()
          (and (pair? x) (car x))))))

Can the first pair? test propagate to the second expression? It can, because pair? does not depend on the values of mutable data, or indeed on any effect. If it's true once, it will always be true.

But can we replace the second (car x) with y? No, because (car x) has a dependency on mutable data, and because we don't do escape analysis on the closure, we don't let those dependencies commute across a procedure boundary. (In this case, even if we did escape analysis, we'd have the same conclusion.)

However, not all lambda abstractions are closures. Some of them might end up being compiled to labels in the function. Scheme uses syntactically recursive procedures to implement loops, after all. But Guile's CSE does poorly for lambda expressions that are actually labels. The reason is that lexical scope is not a dominator tree.

MLton hacker Stephen Weeks says it better than I do:

Thinking of it another way, both CPS and SSA require that variable definitions dominate uses. The difference is that using CPS as an IL requires that all transformations provide a proof of dominance in the form of the nesting, while SSA doesn't. Now, if a CPS transformation doesn't do too much rewriting, then the partial dominance information that it had from the input tree is sufficient for the output tree. Hence tree splicing works fine. However, sometimes it is not sufficient.

As a concrete example, consider common-subexpression elimination. Suppose we have a common subexpression x = e that dominates an expression y = e in a function. In CPS, if y = e happens to be within the scope of x = e, then we are fine and can rewrite it to y = x. If however, y = e is not within the scope of x, then either we have to do massive tree rewriting (essentially making the syntax tree closer to the dominator tree) or skip the optimization. Another way out is to simply use the syntax tree as an approximation to the dominator tree for common-subexpression elimination, but then you miss some optimization opportunities. On the other hand, with SSA, you simply compute the dominator tree, and can always replace y = e with y = x, without having to worry about providing a proof in the output that x dominates y. (i.e. without putting y in the scope of x)

[MLton-devel] CPS vs SSA

See my article on SSA and CPS for more context.

So that's one large source of lost CSE opportunities, especially in loops.

Another large source of lost opportunities is that the Tree-IL language, which is basically a macro-expanded Scheme, has the same property that Scheme does, that the order of evaluation of operands is unspecified.

Consider the base-case clause of my btree-cons above:

(let ((left *btree-null*) (right *btree-null*))
  (make-struct/no-tail btree head left right))

Now, *btree-null* is a toplevel lookup, that might have an unbound-variable effect. We should be able to eliminate one of them, though. Why doesn't the CSE pass do it? Because in Tree-IL, the order of evaluation of the right-hand-sides of left and right is unspecified. This gives Guile some useful freedoms, but little information for CSE.

This is an instance of a more general problem, that Tree-IL might be too high-level to be useful for CSE. For example, at runtime, not all lexical references are the same -- some are local, and some reference free variables. For mutated free variables, the variable is itself in a box, so to reference it you would load the box into a local and then dereference the box. CSE should allow you to eliminate duplicate loads of the box, even in the case that it can't eliminate duplicate references into the box.

conclusion

It is nice to be able to eliminate the duplicate checks, but not at any price. Currently the bootstrapping time cost is a bit irritating. I have other ideas on how to fix that, but ultimately we probably need to re-implement CSE at some lower level. More on that in a future post. Until then, happy hacking.

by Andy Wingo at May 14, 2012 05:07 PM

May 11, 2012

Programming Praxis

Partitions

The partitions of an integer is the set of all sets of integers that sum to the given integer. For instance, the partitions of 4 is the set of sets ((1 1 1 1) (1 1 2) (1 3) (2 2) (4)). We computed the number of partitions of an integer in a previous exercise. In today’s exercise, we will make a list of the partitions.

The process is recursive. There is a single partition of 0, the empty set (). There is a single partition of 1, the set (1). There are two partitions of 2, the sets (1 1) and (2). There are three partitions of 3, the sets (1 1 1), (1 2) and (3). There are five partitions of 4, the sets (1 1 1 1), (1 1 2), (1 3), (2 2), and (4). There are seven partitions of 5, the sets (1 1 1 1 1), (1 1 1 2), (1 2 2), (1 1 3), (1 4), (2 3) and (5). And so on. In each case, the next-larger set of partitions is determined by adding each integer x less than or equal to the desired integer n to all the sets formed by the partition of nx, eliminating any duplicates.

Your task is to write a function that generates the set of all partitions of a given integer. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at May 11, 2012 09:00 AM

May 08, 2012

Programming Praxis

Factor Tables

Before the dawn of computers, most computations were done with the aid of tables: logarithm tables, sine tables, and so on. These tables were ubiquitous, indispensable, and riddled with errors. Number theorists who needed to factor numbers used tables of the least prime factor of a number. The oldest such table dates to 1603 (it contained the least prime factor of all numbers to 750), and new tables were being constructed as late as Derrick N. Lehmer’s table of least prime factors to ten million in 1909 (he was the father of Derrick H. Lehmer); Maarten Bullynck gives the history. Here’s a sample page from a large table, showing the least prime factors of all numbers less than a thousand; numbers divisible by 2 and 5 are omitted, and primes are skipped:

      0    1    2    3    4    5    6    7    8    9
 1        --    3    7   --    3   --   --    3   17
 3   --   --    7    3   13   --    3   19   11    3
 7   --   --    3   --   11    3   --    7    3   --
 9    3   --   11    3   --   --    3   --   --    3
11   --    3   --   --    3    7   13    3   --   --
13   --   --    3   --    7    3   --   23    3   11
17   --    3    7   --    3   11   --    3   19    7
19   --    7    3   11   --    3   --   --    3   --
21    3   11   13    3   --   --    3    7   --    3
23   --    3   --   17    3   --    7    3   --   13
27    3   --   --    3    7   17    3   --   --    3
29   --    3   --    7    3   23   17    3   --   --
31   --   --    3   --   --    3   --   17    3    7
33    3    7   --    3   --   13    3   --    7    3
37   --   --    3   --   19    3    7   11    3   --
39    3   --   --    3   --    7    3   --   --    3
41   --    3   --   11    3   --   --    3   29   --
43   --   11    3    7   --    3   --   --    3   23
47   --    3   13   --    3   --   --    3    7   --
49    7   --    3   --   --    3   11    7    3   13
51    3   --   --    3   11   19    3   --   23    3
53   --    3   11   --    3    7   --    3   --   --
57    3   --   --    3   --   --    3   --   --    3
59   --    3    7   --    3   13   --    3   --    7
61   --    7    3   19   --    3   --   --    3   31
63    3   --   --    3   --   --    3    7   --    3
67   --   --    3   --   --    3   23   13    3   --
69    3   13   --    3    7   --    3   --   11    3
71   --    3   --    7    3   --   11    3   13   --
73   --   --    3   --   11    3   --   --    3    7
77    7    3   --   13    3   --   --    3   --   --
79   --   --    3   --   --    3    7   19    3   11
81    3   --   --    3   13    7    3   11   --    3
83   --    3   --   --    3   11   --    3   --   --
87    3   11    7    3   --   --    3   --   --    3
89   --    3   17   --    3   19   13    3    7   23
91    7   --    3   17   --    3   --    7    3   --
93    3   --   --    3   17   --    3   13   19    3
97   --   --    3   --    7    3   17   --    3   --
99    3   --   13    3   --   --    3   17   29    3

For example the table shows that the least prime factor of 923, in the column headed 9 and row headed 23, is 13, and 997 is the greatest prime less than a thousand. To factor a number, find its least prime factor in the table, compute the remaining cofactor by division, and repeat until the cofactor is prime.

When building the table, the least prime factor is computed by sieving, not by trial division. The setup is the same as the Sieve of Eratosthenes, except that integers are used instead of booleans, and each item in the sieve is initialized to 1. Then each successively-smallest prime p is sieved, but instead of changing true to false, each 1 in the chain of multiples of p is changed to p (other values are ignored). The same optimizations as the normal sieve — odd numbers only, start at the square of p, and stop when p2 is greater than n — apply here.

Your task is to write a function that sieves for least prime factors, and to use that function to write a program that builds factor tables as illustrated above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at May 08, 2012 09:00 AM

May 05, 2012

Jeremy Kun

False Proof – 2 = 4, As the Limit of an Infinite Power Tower

Problem: Prove that . Solution: Consider the value of the following infinitely iterated exponent: Let , that is, the above power tower where we stop at the -th term. Then is clearly an increasing sequence, and moreover by a trivial induction … Continue reading

by j2kun at May 05, 2012 07:06 PM

May 04, 2012

Programming Praxis

Even-Odd Partition

I’m not sure where this problem comes from; it’s either homework or an interview question. Nonetheless, it is simple and fun:

Take an array of integers and partition it so that all the even integers in the array precede all the odd integers in the array. Your solution must take linear time in the size of the array and operate in-place with only a constant amount of extra space.

Your task is to write the indicated function. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at May 04, 2012 09:00 AM

May 01, 2012

Programming Praxis

Legendre’s Symbol

The Legendre Symbol and its cousin the Jacobi Symbol are used in modular arithmetic to determine if a number a is a quadratic residue to the modulus m. A number a is a quadratic residue if there exists a number x such that x2a (mod m) and is defined only when a and m are co-prime. For instance, a2 (mod 7) for a from 0 to 6 is the list 02 (mod 7) = 0, 12 (mod 7) = 1, 22 (mod 7) = 4, 32 (mod 7) = 2, 42 (mod 7) = 2, 52 (mod 7) = 4, and 62 (mod 7) = 1, so the quadratic residues of 7 are 1, 2 and 4 (0 is excluded because it isn’t co-prime to 7). The jacobi symbol considers any odd modulus; the legendre symbol is limited to odd prime moduli. The symbols are usually written in parentheses with a over m, like this: \left( a \atop m \right). Sometimes the symbol is written with a horizontal rule between the a and m, and sometimes it is written on a single line as (a / m).

The legendre/jacobi symbol can be calculated according to the following three termination rules:

1. (0 / m) = 0

2. (1 / m) = 1

3. (2 / m) = −1 if m mod 8 ∈ {3, 5} or 1 if m mod 8 ∈ {1, 7}

and the following three reduction rules:

4. [reducing factors of 2] (2a / m) = (2 / m) × (a / m)

5. [reducing modulo m] (a / m) = (a (mod m) / m) if am or a < 0

6. [reducing odd co-primes a and m] (a /m) = −(m / a) if am ≡ 3 (mod 4), or (m / a) otherwise

Thus, the legendre/jacobi symbol is 1 if a is a quadratic residue, -1 if a is not a quadratic residue, and 0 if a and m are not co-prime.

Our various prime-number programs have used a definition of the legendre/jacobi symbol that doesn’t work; for some values of a and m it returned wrong results, and for other values of a and m it entered an infinite loop. This exercise fixes the problem.

Your task is to write a function that calculates the legendre/jacobi symbol using the rules given above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at May 01, 2012 09:00 AM

April 27, 2012

Joe Marshall

Package system horrors

Arcane Sentiment mentioned how the Lisp Machine used to handle package prefixes in this post, and it reminded me of an interesting problem.


When we were building the LMI K-machine we had to cross-compile from the existing Lisp Machine. Naturally, the cross-compiler would read the source code and intern the source code symbols as part of the process. But the K-machine had a different package setup. New packages weren't a particular problem, but the locations of some symbols were. In addition, some things that were implemented as macros in the Lisp Machine were implemented as compiler intrinsic special forms on the K-machine, and vice versa.


The compiler naturally uses symbol identity to determine how to handle a special form. If you want to compile a conditional, for example, the CAR of the form had better be the symbol CL:IF. A symbol named "IF" in a different package won't generally work.


Well, that's not exactly correct... When the Lisp Machine was built, there was no CL package. There was no Common Lisp. So when the CL package was added, it was necessary to make sure that the symbol named "IF" in the CL package was the same EQ symbol as the one the compiler used for conditionals. There are a couple of ways to arrange for this. The symbol could be explicitly interned in the CL package, or it could be visible to the CL package via inheritance.


We wanted to change some of these things on the K-machine, but the cross-compiler was running on the LMI Lambda and when it read the source code it would intern the symbols in the Lambda's packages. Munging the existing Lambda packages was out of the question. The ‘solution’ was to play some nasty tricks with the entire package system. We created two different packages, both named "COMMON-LISP". Some of the symbols were common between them, but others were not. The package inheritance was different, too. The trick was to dynamically switch how package names were resolved to package objects.


So if you have two different “package systems”, how do you refer explicitly to “the symbol IF in the other "COMMON-LISP" package? With a reader hack, of course. We made the triple colon ::: prefix indicate which package system to use to resolve the package name.


This kind of worked, but it was a horror. Most of the system didn't care, the rest didn't know, but if you accidentally had the wrong package system in place when you read a file, you would definitely trash the entire machine and need to reboot. We had a few helper functions that would try to ensure that the correct package system was selected before doing anything (for instance, when calling the cross-compiler, we'd switch out, and when it returned we'd switch back. Works great until you enter a debug repl...). The big problem was that it was impossible to ‘close’ over the correct context, so you could not in general guarantee seamless operation. (For example, if you referred to the K-machine's CL:READ symbol, you probably wanted to switch to the K-machine package set, but there was no way to do this automatically.)

by Joe Marshall (noreply@blogger.com) at April 27, 2012 07:58 PM

Programming Praxis

Trabb Pardo Knuth Algorithm

In their 1973 paper “The Early Development of Programming Languages,” Luis Trabb Pardo and Donald Knuth give the following algorithm as a means of assessing programming languages:

ask for 11 numbers to be read into a sequence S
reverse sequence S
for each item in sequence S
    call a function to do an operation
    if result overflows
        alert user
    else
        print result

Though it is trivial, the algorithm involves arrays, indexing, mathematical functions, subroutines, I/O, conditionals and iteration, and thus exposes many basic operations of a programming language.

Your task is to write a program that implements the Trabb Pardo Knuth algorithm. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


by programmingpraxis at April 27, 2012 09:00 AM