Fun with macros - extending application to support vector notation
[Remember to leave a comment: Was this post silly? Enligthning? Old news?]
"Scheme vector notation sucks" - a common complaint often heard amongst children and drunks. They might even be right. The sum of a three element vector is calculated like this:
It looks like they are right. Can we repair this with some macro magic?
Let's first consider what syntax we want to implement. Maybe something like this:
Not bad, but in most Scheme implementations the parentheses () and [] mean the same.
How about
then? That we can implement. In PLT Scheme the above is parsed as the
application:
All we need to do now, is to implement a new application, that takes special action
when the second argument begins with an underscore. In the following code underscore-symbol? is a function that checks whether a symbol starts with an underscore, and remove-underscore removes an initial underscore from a symbol.
The code also handles the case where the subscript is an identifier, such as in (v _n).
Notice the use of quasisyntax/loc instead of plain quasisyntax.
If a user writes (v _4) and v is of size 3, then an error must be signaled.
The culprit is the vector-ref expression, so if plain quasisyntax is used,
DrScheme will color the vector-ref expression red - and the user will think
the macro code is faulty. Using quasisyntax/loc allows us to report errors
to the usage of the macro.
We also need a notation for the case, where the subscript is a general expression.
That's fortunately easy. Add _ to the list of literals of the syntax-case expression, and add this clause:
Finally we also need a convenient way to make assignments to vectors. The syntax
(v 2 := 42) is easy to implement. And just in case people forget, we also support (v _2 := 42).
The entire source follows.
Remember to leave a comment: Was this post silly? Enligthning? Old news?
"Scheme vector notation sucks" - a common complaint often heard amongst children and drunks. They might even be right. The sum of a three element vector is calculated like this:
(define a (vector 1 2 3))
(+ (vector-ref a 0) (vector-ref a 1) (vector-ref a 2))
It looks like they are right. Can we repair this with some macro magic?
Let's first consider what syntax we want to implement. Maybe something like this:
(+ a[0] a[1] a[2])
Not bad, but in most Scheme implementations the parentheses () and [] mean the same.
How about
(+ (a _0) (a _1) (a _2))
then? That we can implement. In PLT Scheme the above is parsed as the
application:
(#%app + (#%app a _0) (#%app a _1) (#%app a _2))
All we need to do now, is to implement a new application, that takes special action
when the second argument begins with an underscore. In the following code underscore-symbol? is a function that checks whether a symbol starts with an underscore, and remove-underscore removes an initial underscore from a symbol.
The code also handles the case where the subscript is an identifier, such as in (v _n).
(define-syntax (my-app stx)
(syntax-case stx ()
[(_ expr sub)
(and (identifier? #'sub)
(underscore-symbol? (syntax-e #'sub)))
(let* ([sub (remove-underscore (syntax-e #'sub))]
[num (string->number (symbol->string sub))])
(cond
[(and (number? num) (or (zero? num) (positive? num)))
(quasisyntax/loc stx
(vector-ref expr #,(datum->syntax-object #'sub num)))]
[else
(quasisyntax/loc stx
(vector-ref expr #,(datum->syntax-object stx sub)))]))]
[(_ more ...)
#'(#%app more ...)]
[_
(raise-syntax-error 'my-app "bad syntax, in ~a" stx)]))
Notice the use of quasisyntax/loc instead of plain quasisyntax.
If a user writes (v _4) and v is of size 3, then an error must be signaled.
The culprit is the vector-ref expression, so if plain quasisyntax is used,
DrScheme will color the vector-ref expression red - and the user will think
the macro code is faulty. Using quasisyntax/loc allows us to report errors
to the usage of the macro.
(quasisyntax/loc stx
(vector-ref expr #,(datum->syntax-object #'sub num)))
We also need a notation for the case, where the subscript is a general expression.
That's fortunately easy. Add _ to the list of literals of the syntax-case expression, and add this clause:
[(__ expr _ sub)
(syntax/loc stx
(vector-ref expr sub))]
Finally we also need a convenient way to make assignments to vectors. The syntax
(v 2 := 42) is easy to implement. And just in case people forget, we also support (v _2 := 42).
The entire source follows.
Remember to leave a comment: Was this post silly? Enligthning? Old news?
; vector-hack.scm -- Jens Axel Søgaard
(module vector-hack mzscheme
(provide (rename my-app #%app ))
(begin-for-syntax
(define (underscore-symbol? sym)
(and (symbol? sym)
(let ([str (symbol->string sym)])
(> (string-length str) 0)
(equal? (string-ref str 0) #\_))))
(define (remove-underscore sym)
(if (underscore-symbol? sym)
(let ([str (symbol->string sym)])
(string->symbol (substring str 1 (string-length str))))
(error 'remove-underscore "symbol doesn't start with an underscore: ~a" sym))))
(define-syntax (my-app stx)
(syntax-case stx (_ :=)
[(__ expr sub)
(and (identifier? #'sub)
(underscore-symbol? (syntax-e #'sub)))
(let* ([sub (remove-underscore (syntax-e #'sub))]
[num (string->number (symbol->string sub))])
(cond
[(and (number? num) (or (zero? num) (positive? num)))
(quasisyntax/loc stx
(vector-ref expr #,(datum->syntax-object #'sub num)))]
[else
(quasisyntax/loc stx
(vector-ref expr #,(datum->syntax-object stx sub)))]))]
[(__ expr _ sub)
(syntax/loc stx
(vector-ref expr sub))]
[(__ expr sub := value-expr)
(and (identifier? #'sub)
(underscore-symbol? (syntax-e #'sub)))
(let* ([sub (remove-underscore (syntax-e #'sub))]
[num (string->number (symbol->string sub))])
(cond
[(and (number? num) (or (zero? num) (positive? num)))
(quasisyntax/loc stx
(vector-set! expr #,(datum->syntax-object #'sub num) value-expr))]
[else
(quasisyntax/loc stx
(vector-set! expr #,(datum->syntax-object stx sub) value-expr))]))]
[(__ expr sub := value-expr)
(syntax/loc stx
(vector-set! expr sub value-expr))]
[(__ more ...)
#'(#%app more ...)]
[__
(raise-syntax-error 'my-app "bad syntax, in ~a" stx)])))
(require vector-hack)
(define v (vector 0 1 2))
(v _0) ; => 0
(v _1) ; => 1
(v _2) ; => 2
; (my-app v _3) ; => error, correct syntax location
(define a 1)
(v _a) ; => 1
(+ (v _0) (v _1) (v _2)) ; => 3
v ; => #(0 1 2)
(v _0 := 3) ; #<void>
v ; => #(3 1 2)
(v 0 := 42) ; void
v ; => #(42 1 2)
7 Comments:
Nifty, Jens! Looks like a gold mine of information about the syntax-case system, particularly the bit about error reporting. Not to diminish the coolness above, but I often do things like (let ((! vector-set!) ($ vector-ref)) <expr w/lots of sets and refs>) when I have a lot of setting and reffing.
Hi Will,
Your solution is simple - I like it.
/Jens Axel
That's neat, Jens. :-)
When I read your (intentionally trivial) example, the solution that jumped to mind was
(apply + (map (cut vector-ref a <>) '(0 1 2)))
Or, with a bit of pseudo-currying,
(define @a (cut vector-ref a <>))
(apply + (map @a '(0 1 2)))
Spelled out as (+ (@a 0) (@a 1) (@a 2)), it looks like Will's $/! forms. That does seems more "schemish" than the infix syntax you describe here.
If a reader-syntax could be developed for
@expr --> (cute vector-ref expr <>)
then you could get brevity without losing functional style and its benefits. (I don't know PLT or syntax-case well, though; maybe reader-syntax like this isn't portable.)
Like Will said, though, I don't want to detract from the coolness here; this was an excellent post.
Hi Graham,
PLT Scheme actually have a customizable reader with read-tables a la Common Lisp. I haven't played with it yet, so my knowledge is limited to the manual:
http://download.plt-scheme.org/doc/360/html/mzscheme/mzscheme-Z-H-11.html#node_sec_11.2.8
The Honu syntax is implemented using the reader stuff, so it is quite powerful.
http://download.plt-scheme.org/doc/360/html/mzscheme/mzscheme-Z-H-19.html#node_sec_19.2
/Jens Axel
Thanks, Jens. I know that Honu was ported to Chicken Scheme from PLT; I assume the reader-syntax facilities must be either very similar, or easy to port.
BTW, in Chicken you can this, with reader macros:
(set-read-syntax! #\$ (lambda (port) `(cut vector-ref ,(read port) <>)))
where $v would expand to something like (lambda (g49) ((begin vector-ref) v g49)) -- in other words, a closure over v, that you could call as a function, pass as an argument, etc.
cheers -- graham
I should add -- the (begin vector-ref) in my last post is ugly, to be sure. It must be an artifact of the (cut) macro; but it would likely be simplified to "vector-ref" during compilation. If it really bothered someone, they could write
(set-read-syntax! #\$
(lambda (port)
(let ((i (gensym)))
`(lambda (,i) vector-ref ,(read port) ,i))))
instead, or the equivalent using syntax-case, skipping the cut macro; so that $v would expand to
(lambda (g99) vector-ref v g99). And that's enough pedantry for one night. :-)
Hi Jens,
Nice stuff. I remember you asking how one would go about matching the underscore on the plt-scheme list, and I knew you had to be up to some cool macro stuff.
Actually, I took away new ideas from your effort, as well as getting introduced to srfi-26, which I had not heard of before. So I looked it up, and Al Petrofsky did the reference implementation. He and Oleg continue to amaze me. I used it for a while and decided it was worth putting inside my schemekeys language, a module language that I use for interactive and non-module work.
A very worthwhile post. You are blessed with a very talented group of readers as well. I've snatched up the code and put it with all the rest of the classics I glean from your posts.
Thanks for the work.
--kyle
Post a Comment
Links to this post:
Create a Link
<< Home