Planet Scheme

June 05, 2020

Programming Praxis

2Max

Today’s exercise comes from Stack Overflow:

Given an array A consisting of N integers, return the maximum sum of two numbers whose digits add up to an equal sum. If there are not two numbers whose digits have an equal sum, the function should return -1. For example, A = [51, 71, 17, 42] would output 93 because there are two sets of numbers with the same digit-sum, (51, 42) with a digit-sum of 6 and (17, 71) with a digit-sum of 8, and the first pair has the maximum sum of two numbers of 93.
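One straightforward approach is to bucket the numbers by digit-sum and take the two largest members of any bucket. Here is a sketch of that idea in Python rather than Scheme, purely for illustration; the names `two_max` and `digit_sum` are my own, not from the exercise:

```python
from collections import defaultdict

def digit_sum(n):
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))

def two_max(a):
    """Maximum sum of two elements of a sharing a digit-sum, or -1."""
    groups = defaultdict(list)
    for n in a:
        groups[digit_sum(n)].append(n)
    best = -1
    for ns in groups.values():
        if len(ns) >= 2:
            ns.sort(reverse=True)
            best = max(best, ns[0] + ns[1])
    return best

print(two_max([51, 71, 17, 42]))  # 93
```

A single pass builds the buckets, so the whole thing runs in O(n log n) time (dominated by the small per-bucket sorts).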

Your task is to write a program to calculate the requested maximum sum. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

by programmingpraxis at June 05, 2020 09:00 AM

June 03, 2020

Andy Wingo

a baseline compiler for guile

Greets, my peeps! Today's article is on a new compiler for Guile. I made things better by making things worse!

The new compiler is a "baseline compiler", in the spirit of what modern web browsers use to get things running quickly. It is a very simple compiler whose goal is speed of compilation, not speed of generated code.

Honestly I didn't think Guile needed such a thing. Guile's distribution model isn't like the web, where every page you visit requires the browser to compile fresh hot mess; in Guile I thought it would be reasonable for someone to compile once and run many times. I was never happy with compile latency but I thought it was inevitable and anyway amortized over time. Turns out I was wrong on both points!

The straw that broke the camel's back was Guix, which defines the graph of all installable packages in an operating system using Scheme code. Lately it has been apparent that when you update the set of available packages via a "guix pull", Guix spends too much time compiling the Scheme modules that contain the package graph.

The funny thing is that it's not important that the package definitions be optimized; they just need to be compiled in a basic way so that they are quick to load. This is the essential use-case for a baseline compiler: instead of trying to make an optimizing compiler go fast by turning off all the optimizations, just write a different compiler that goes from a high-level intermediate representation straight to code.

So that's what I did!

it don't do much

The baseline compiler skips any kind of flow analysis: there's no closure optimization, no contification, no unboxing of tagged numbers, no type inference, no control-flow optimizations, and so on. The only whole-program analysis that is done is a basic free-variables analysis so that closures can capture variables, as well as assignment conversion. Otherwise the baseline compiler just does a traversal over programs as terms of a simple tree intermediate language, emitting bytecode as it goes.
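As a toy illustration of that strategy (and emphatically not Guile's actual tree intermediate language or bytecode, which are far richer), here is a one-pass compiler for a tiny arithmetic expression language: a single recursive traversal that emits stack bytecode as it goes, with no analysis passes at all. Sketched in Python for brevity:

```python
# Expressions are either integer literals or tuples ("+"/"*", left, right).
def compile_expr(expr, code):
    """One recursive walk over the tree, appending bytecode to `code`."""
    if isinstance(expr, int):
        code.append(("push", expr))
    else:
        op, left, right = expr
        compile_expr(left, code)
        compile_expr(right, code)
        code.append(("add",) if op == "+" else ("mul",))
    return code

def run(code):
    """A minimal stack VM to execute the emitted bytecode."""
    stack = []
    for ins in code:
        if ins[0] == "push":
            stack.append(ins[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if ins[0] == "add" else a * b)
    return stack[0]

bytecode = compile_expr(("+", 1, ("*", 2, 3)), [])
print(run(bytecode))  # 7
```

The point of the toy is the shape, not the content: compilation cost is one tree traversal, proportional to program size, with nothing resembling flow analysis in sight.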

Interestingly, the quality of the code produced by the two compilers at optimization level -O0 is pretty much the same.

This graph shows generated code performance of the CPS compiler relative to the new baseline compiler, at optimization level 0. Bars below the line mean the CPS compiler produces slower code. Bars above mean CPS makes faster code. You can click and zoom in for details. Note that the Y axis is logarithmic.

The tests in which -O0 CPS wins are mostly because the CPS-based compiler does a robust closure optimization pass that reduces allocation rate.

At optimization level -O1, which adds partial evaluation over the high-level tree intermediate language and support for inlining "primitive calls" like + and so on, I am not sure why CPS peels out in the lead. No additional important optimizations are enabled in CPS at that level. That's probably something to look into.

Note that the baseline of this graph is optimization level -O1, with the new baseline compiler.

But as I mentioned, I didn't write the baseline compiler to produce fast code; I wrote it to produce code fast. So does it actually go fast?

Well against the -O0 and -O1 configurations of the CPS compiler, it does excellently:

Here you can see comparisons between what will be Guile 3.0.3's -O0 and -O1, compared against their equivalents in 3.0.2. (In 3.0.2 the -O1 equivalent is actually -O1 -Oresolve-primitives, if you are following along at home.) What you can see is that at these optimization levels, for these 8 files, the baseline compiler is around 4 times as fast.

If we compare to Guile 3.0.3's default -O2 optimization level, or -O3, we see bigger disparities:

Which is to say that Guile's baseline compiler runs at about 10x the speed of its optimizing compiler, which incidentally is similar to what I found for WebAssembly compilers a while back.

Also of note is that -O0 and -O1 take essentially the same time, with -O1 often taking less time than -O0. This is because partial evaluation can make the program smaller, at a cost of being less straightforward to debug.

Similarly, -O3 usually takes less time than -O2. This is because -O3 is allowed to assume top-level bindings that aren't exported from a module can be transformed to lexical bindings, which are more available for contification and inlining, which usually leads to smaller programs; it is a similar debugging/performance tradeoff to the -O0/-O1 case.

But what does one gain when choosing to spend 10 times more on compilation? Here I have a gnarly graph that plots performance on some microbenchmarks for all the different optimization levels.

Like I said, it's gnarly, but the summary is that -O1 typically gets you a factor of 2 or 4 over -O0, and -O2 often gets you another factor of 2 above that. -O3 is mostly the same as -O2 except in magical circumstances like the mbrot case, where it adds an extra 16x or so over -O2.

worse is better

I haven't yet seen numbers for this new compiler in Guix, but I hope it can have a good impact. Already in Guile itself, though, I've seen a couple of interesting advantages.

One is that because it produces code faster, Guile's bootstrap from source can take less time. There is also a felicitous feedback effect in that because the baseline compiler is much smaller than the CPS compiler, it takes less time to macro-expand, which reduces bootstrap time (as bootstrap has to pay the cost of expanding the compiler, until the compiler is compiled).

The second fortunate result is that now I can use the baseline compiler as an oracle for the CPS compiler, when I'm working on new optimizations. There's nothing worse than suspecting that your compiler miscompiled itself, after all, and having a second compiler helps keep me sane.

stay safe, friends

The code, you ask? Voici.

Although this work has been ongoing throughout the past month, I need to add some words on the now before leaving you: there is a kind of cognitive dissonance between nerding out on compilers in the comfort of my home, rain pounding on the patio, and at the same time the world on righteous fire. I hope it is clear to everyone by now that the US police are an essentially racist institution: they harass, maim, and murder Black people at much higher rates than whites. My heart is with the protestors. Godspeed to you all, from afar. At the same time, all my non-Black readers should reflect on the ways they participate in systems that support white supremacy, and on strategies to tear them down. I know I will be. Stay safe, wear eye protection, and until next time: peace.

by Andy Wingo at June 03, 2020 08:39 PM

June 02, 2020

Programming Praxis

Hidden Squares

We have a simple task today:

A number n may have squares hidden among its digits. For instance, in the number 1625649, the consecutive digits 1, 4, 9, 16, 25, 49, 64, 256 and 625 are all squares.
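A direct approach is to test every run of consecutive digits for squareness, collecting distinct hits. A sketch in Python for illustration (`hidden_squares` is my own name for it):

```python
from math import isqrt

def is_square(n):
    return isqrt(n) ** 2 == n

def hidden_squares(n):
    """All distinct squares among the consecutive-digit substrings of n."""
    s = str(n)
    found = set()
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            k = int(s[i:j])
            if is_square(k):
                found.add(k)
    return sorted(found)

print(hidden_squares(1625649))  # [1, 4, 9, 16, 25, 49, 64, 256, 625]
```

A d-digit number has d(d+1)/2 substrings, so the brute force is entirely adequate here.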

Your task is to write a program that takes a positive integer n and finds all of its hidden squares. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

by programmingpraxis at June 02, 2020 09:00 AM

May 29, 2020

Programming Praxis

Decreasing-Increasing Array

Today’s task is somebody’s homework:

Given an array of integers, determine if it is in decreasing-increasing order, with some initial segment of the array in decreasing order and the remainder of the array in increasing order. If the array is in decreasing-increasing order, return the pivot element (the minimum element in the array); otherwise, return an indication the array is not in decreasing-increasing order. Array elements may be duplicated. For example, the array (10 10 10 8 8 6 4 4 3 12 13 22 31 40 59 68) is in decreasing-increasing order, with pivot element 3, but the array (1 2 4 8 12 32 64) is not in decreasing-increasing order.

The student who asked the question suggests a solution using binary search that takes O(log n) time, but can’t get it to work.
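The student's instinct is understandable but doomed: verifying the order property at all requires looking at every element, and with duplicates allowed a flat run of equal values gives a binary search no direction to go, so O(log n) cannot work in the worst case. A linear scan is both sufficient and easy to get right; here is a sketch in Python (returning None rather than a Scheme-style failure indication):

```python
def dec_inc_pivot(a):
    """Return the pivot (minimum) element if a is in decreasing-increasing
    order, or None otherwise.  Duplicate elements are allowed."""
    if not a:
        return None
    i, n = 0, len(a)
    while i + 1 < n and a[i + 1] <= a[i]:   # walk down the decreasing run
        i += 1
    if i == 0:                              # no initial decreasing segment
        return None
    pivot = a[i]
    while i + 1 < n and a[i + 1] >= a[i]:   # walk up the increasing run
        i += 1
    return pivot if i == n - 1 else None

print(dec_inc_pivot([10, 10, 10, 8, 8, 6, 4, 4, 3,
                     12, 13, 22, 31, 40, 59, 68]))  # 3
```

If the second run stops before the end of the array, the array went down again after the pivot and the property fails.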

Your task is to write a program to determine if an array is in decreasing-increasing order as described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

by programmingpraxis at May 29, 2020 09:00 AM

May 22, 2020

Programming Praxis

Prime Power Triples

Today’s exercise is Project Euler Problem 87:

The smallest number expressible as the sum of a prime square, prime cube, and prime fourth power is 28. In fact, there are exactly four numbers below fifty that can be expressed in such a way:

28 = 2² + 2³ + 2⁴
33 = 3² + 2³ + 2⁴
49 = 5² + 2³ + 2⁴
47 = 2² + 3³ + 2⁴

How many numbers below fifty million can be expressed as the sum of a prime square, prime cube, and prime fourth power?

Your task is to solve Project Euler 87; in the spirit of Project Euler, show only your code but not the solution. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

by programmingpraxis at May 22, 2020 09:00 AM

May 19, 2020

Programming Praxis

MinStack

[ My apologies that this exercise is so long delayed. We have been rewriting all of our business practices in the light of the pandemic, and work has been madness. The biggest projects are now complete, and we have retreated to merely super-busy, so maybe I will have time to write some more exercises. I hope everyone is well; I am. ]

Today’s task is to implement a minstack data structure that provides the normal stack operations push, pop and top and also the operation least which returns the smallest item in the stack without altering the stack in any way. All four operations must operate in O(1) time and O(n) space, where n is the size of the stack.
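The classic trick is to store, alongside each pushed value, the minimum of everything at or below it; least is then just a lookup at the top of the stack. A sketch in Python, hedged as one of several workable representations:

```python
class MinStack:
    """Stack with O(1) push, pop, top and least, using O(n) space:
    each slot holds a (value, minimum-so-far) pair."""

    def __init__(self):
        self._items = []

    def push(self, x):
        least = min(x, self._items[-1][1]) if self._items else x
        self._items.append((x, least))

    def pop(self):
        return self._items.pop()[0]

    def top(self):
        return self._items[-1][0]

    def least(self):
        return self._items[-1][1]
```

Popping automatically restores the previous minimum, since each pair remembers the minimum as of its own push; no rescanning is ever needed.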

Your task is to implement a minstack as described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

by programmingpraxis at May 19, 2020 09:00 AM

May 17, 2020

GNU Guix

GNU Shepherd user services

One of the things which sets Guix apart from other GNU/Linux distributions is that it uses GNU Shepherd instead of the now ubiquitous systemd. A side effect of this is that user systemd units do not work on Guix System. Whether you love systemd, hate it, or feel extremely ambivalent toward it, this means that users cannot rely on already-written systemd unit files for their regular user-level services.

There are a couple of benefits to using GNU Shepherd, and not all of them are due to it already being installed on Guix. Becoming comfortable with using Shepherd and understanding how to write and edit Shepherd service configurations makes the transition from other GNU/Linux distributions to Guix System easier. More complex services with their own logic tree, using the full power of GNU Guile, are also possible. This means you can have one service that behaves differently if it's running on a different system or architecture without needing to call out to shell scripts or using minimally different service definitions.

The GNU Shepherd manual suggests putting all the services inside a monolithic init.scm file, located by default at $XDG_CONFIG_DIR/shepherd/init.scm. While this does make it easy to keep everything in one place, it creates one glaring issue: any change to the file means that all the services need to be stopped and restarted for the change to take effect.

Luckily there's a nice function called scandir hiding in the (ice-9 ftw) module which returns a list of all the files in a specified directory (with options for narrowing down the list or sorting it). This means that our init.scm can contain a minimum of code and all actual services can be loaded from individual files.

First the minimal init.scm:

(use-modules (shepherd service)
             ((ice-9 ftw) #:select (scandir)))

;; Load all the files in the directory 'init.d' with a suffix '.scm'.
(for-each
  (lambda (file)
    (load (string-append "init.d/" file)))
  (scandir (string-append (dirname (current-filename)) "/init.d")
           (lambda (file)
             (string-suffix? ".scm" file))))

;; Send shepherd into the background
(action 'shepherd 'daemonize)

Let's take a sample service for running syncthing, as defined in $XDG_CONFIG_DIR/shepherd/init.d/syncthing.scm:

(define syncthing
  (make <service>
    #:provides '(syncthing)
    #:docstring "Run `syncthing' without calling the browser"
    #:start (make-forkexec-constructor
              '("syncthing" "-no-browser")
              #:log-file (string-append (getenv "HOME")
                                        "/log/syncthing.log"))
    #:stop (make-kill-destructor)
    #:respawn? #t))
(register-services syncthing)

(start syncthing)

As with any other shepherd service it is defined and registered, and in this case it will start automatically. When the file is loaded by shepherd after being discovered by scandir everything works exactly as though the service definition were located directly inside the init.scm.

Now let's make a change. Since syncthing already has a -logfile flag and built-in log rotation, that sounds better than using shepherd's #:log-file option. First we'll make our changes to the service:

(define syncthing
  (make <service>
    #:provides '(syncthing)
    #:docstring "Run `syncthing' without calling the browser"
    #:start (make-forkexec-constructor
              '("syncthing" "-no-browser"
                "-logflags=3" ; prefix with date & time
                "-logfile=/home/user/log/syncthing.log"))
    #:stop (make-kill-destructor)
    #:respawn? #t))
(register-services syncthing)

(start syncthing)

Now we stop syncthing:

$ herd stop syncthing

And we load the new service:

$ herd load root ~/.config/shepherd/init.d/syncthing.scm

This allows for quickly iterating on services without needing to stop all the services! Let's take a look at another service:

(define fccache
  (make <service>
    #:provides '(fccache)
    #:docstring "Run 'fc-cache -frv'"
    #:start (make-forkexec-constructor
              '("guix" "environment" "--ad-hoc" "fontconfig" "--"
                "fc-cache" "-frv")
              #:log-file (string-append (getenv "HOME")
                                        "/log/fccache.log"))
    #:one-shot? #t))

(register-services fccache)

In this example I want to refresh my font cache but I don't want to actually install fontconfig either system-wide or in my profile.

$ which fc-cache
which: no fc-cache in (/home/user/.config/guix/current/bin:/home/user/.guix-profile/bin:/home/user/.guix-profile/sbin:/run/setuid-programs:/run/current-system/profile/bin:/run/current-system/profile/sbin)
$ herd start fccache
Service fccache has been started.

Of course we can import other modules and leverage the code already written there. In this case, instead of shelling out to "guix environment --ad-hoc fontconfig -- fc-cache -frv", let's use the guix-environment function already available in the (guix scripts environment) module:

(use-modules (guix scripts environment))

(define fccache
  (make <service>
    #:provides '(fccache)
    #:docstring "Run 'fc-cache -frv'"
    #:start (lambda () ; Don't run immediately when registered!
              (guix-environment "--ad-hoc" "fontconfig" "--" "fc-cache" "-frv"))
    #:one-shot? #t))

(register-services fccache)

$ herd load root ~/.config/shepherd/init.d/fccache.scm
Loading /home/user/.config/shepherd/init.d/fccache.scm.
$ herd start fccache
/gnu/store/hbqlzgd8hcf6ndcmx7q7miqrsxb4dmkk-gs-fonts-8.11/share/fonts: caching, new cache contents: 0 fonts, 1 dirs
/gnu/store/hbqlzgd8hcf6ndcmx7q7miqrsxb4dmkk-gs-fonts-8.11/share/fonts/type1: caching, new cache contents: 0 fonts, 1 dirs
/gnu/store/hbqlzgd8hcf6ndcmx7q7miqrsxb4dmkk-gs-fonts-8.11/share/fonts/type1/ghostscript: caching, new cache contents: 35 fonts, 0 dirs
/home/user/.guix-profile/share/fonts: caching, new cache contents: 0 fonts, 7 dirs
/home/user/.guix-profile/share/fonts/opentype: caching, new cache contents: 8 fonts, 0 dirs
/home/user/.guix-profile/share/fonts/otf: caching, new cache contents: 12 fonts, 0 dirs
/home/user/.guix-profile/share/fonts/terminus: caching, new cache contents: 18 fonts, 0 dirs
/home/user/.guix-profile/share/fonts/truetype: caching, new cache contents: 58 fonts, 0 dirs
/home/user/.guix-profile/share/fonts/ttf: caching, new cache contents: 12 fonts, 0 dirs
/home/user/.guix-profile/share/fonts/type1: caching, new cache contents: 0 fonts, 1 dirs
/home/user/.guix-profile/share/fonts/type1/ghostscript: caching, new cache contents: 35 fonts, 0 dirs
/home/user/.guix-profile/share/fonts/woff: caching, new cache contents: 1 fonts, 0 dirs
/run/current-system/profile/share/fonts: skipping, no such directory
/home/user/.local/share/fonts: skipping, no such directory
/home/user/.fonts: skipping, no such directory
/gnu/store/hbqlzgd8hcf6ndcmx7q7miqrsxb4dmkk-gs-fonts-8.11/share/fonts/type1: skipping, looped directory detected
/home/user/.guix-profile/share/fonts/opentype: skipping, looped directory detected
/home/user/.guix-profile/share/fonts/otf: skipping, looped directory detected
/home/user/.guix-profile/share/fonts/terminus: skipping, looped directory detected
/home/user/.guix-profile/share/fonts/truetype: skipping, looped directory detected
/home/user/.guix-profile/share/fonts/ttf: skipping, looped directory detected
/home/user/.guix-profile/share/fonts/type1: skipping, looped directory detected
/home/user/.guix-profile/share/fonts/woff: skipping, looped directory detected
/gnu/store/hbqlzgd8hcf6ndcmx7q7miqrsxb4dmkk-gs-fonts-8.11/share/fonts/type1/ghostscript: skipping, looped directory detected
/home/user/.guix-profile/share/fonts/type1/ghostscript: skipping, looped directory detected
/var/cache/fontconfig: not cleaning unwritable cache directory
/home/user/.cache/fontconfig: cleaning cache directory
/home/user/.fontconfig: not cleaning non-existent cache directory
fc-cache: succeeded
herd: exception caught while executing 'start' on service 'fccache':
Throw to key `quit' with args `(0)'.

The problem with this approach is that guix-environment returns the exit code of the programs it calls and #:start expects a constructor to return #t or #f so there's some work to be done here.

This was just a quick peek into what's possible with GNU Shepherd when run as a user. Next time we'll take a look at integrating mcron to replicate some of systemd's timer functionality.

About GNU Guix

GNU Guix is a transactional package manager and an advanced distribution of the GNU system that respects user freedom. Guix can be used on top of any system running the kernel Linux, or it can be used as a standalone operating system distribution for i686, x86_64, ARMv7, and AArch64 machines.

In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. When used as a standalone GNU/Linux distribution, Guix offers a declarative, stateless approach to operating system configuration management. Guix is highly customizable and hackable through Guile programming interfaces and extensions to the Scheme language.

by Efraim Flashner at May 17, 2020 08:00 PM

May 15, 2020

Göran Weinholt

Quasiquote - Literal Magic

While I was writing a manpage for Scheme’s quasiquote, something I saw surprised me and changed my understanding of quasiquote. It turns out that a new language, with semantics that are interesting to PLT enthusiasts, hides behind the innocent backtick character. Starting with R6RS Scheme, quasiquote became total magic.

Background

It is not going to be easy to understand the argument in this article if you lack some background knowledge, so here’s a brief explanation of quasiquote, Scheme’s concept of locations, Scheme’s handling of literal constants, and finally referential transparency.

Briefly on quasiquote

Quasiquote is a language feature in Scheme that lets you write a template for a structure of lists and vectors. These templates are more like web templates than C++ templates; don’t let the terminology confuse you.

Basically you write a backtick character to start a template. The code immediately following the backtick is the template. You write a comma wherever you want to fill in some variable or other expression. (There’s also a list-splicing version of the comma which often comes in handy).

The expression `(b "Hello " ,x) builds a list with these elements: the symbol b, the string "Hello" and lastly whatever value the variable x happened to have. So perhaps (b "Hello " "John").

Quasiquote is very useful to build SXML and SHTML. Forget learning a new templating system every week; this one’s a keeper. But this being Scheme, the most popular use is likely to write S-expressions that represent code in some language. It’s used for just that in the nanopass framework.

Location, location, location!

Making a language is a difficult job. Everything should ideally work smoothly together as a coherent whole, like pineapples on a pizza. Objects in Scheme programs implicitly refer to locations. There are many details around that which affect the whole language, and they have interesting consequences for quasiquote.

What’s a location? It’s just a place where you can store a value. The vector #(1 2 3) has three locations where values are stored, and currently it’s the numbers 1, 2 and 3. If you make a vector using make-vector then the vector is mutable and you can change the values. Later when you see the same vector again it will still contain the new values.

In practice a location is some address in memory, but the garbage collector might move it around, so its address changes, but it is still the same location.

Other objects in Scheme do not have locations. The number 1 does not have any locations that hold values that you can change. This is only because of the wisdom, kindheartedness and foresight of the language designers, because it is possible to design things differently.

As a consequence of numbers not having locations, there is also very little point in worrying about which number object you have. Suppose that numbers did have locations and you could store important information in them. You’d be very concerned that the 1 number object where you stored all your passwords is the same 1 that you now have in your secrets variable. (Nobody will ever think to look inside the number 1 for your passwords, so your secrets are safe there.) But number objects do not actually have locations, so it doesn’t matter if the Scheme implementation fiddles with them behind your back and gives you a different 1 object the next time you’re looking.

Literal constants

Pairs and vectors have locations, but the rules for their locations are much relaxed if they are literal constants in the program code.

Constants in Scheme are allowed to have read-only locations. If you compile a program with Loko Scheme, you will notice that you get an assertion if you try to change any of the constants. From R6RS:

It is desirable for constants (i.e. the values of literal expressions) to reside in read-only memory. To express this, it is convenient to imagine that every object that refers to locations is associated with a flag telling whether that object is mutable or immutable. Literal constants, the strings returned by symbol->string, records with no mutable fields, and other values explicitly designated as immutable are immutable objects, while all objects created by the other procedures listed in this report are mutable. An attempt to store a new value into a location referred to by an immutable object should raise an exception with condition type &assertion.

Literal constants can also share locations. If the same constant appears in different places in the program then the compiler is allowed to create a single shared instance of the constant, here explained as it applies to structures, in the section on eqv?:

Implementations may share structure between constants where appropriate. Thus the value of eqv? on constants is sometimes implementation-dependent.

The illustrative examples are:

(eqv? '(a) '(a))         ;⇒ unspecified
(eqv? "a" "a")           ;⇒ unspecified
(eqv? '(b) (cdr '(a b))) ;⇒ unspecified
(let ((x '(a)))
  (eqv? x x))            ;⇒ #t

So when it comes to literal constants, Scheme’s normal storage rules do not apply. A program might find that two different locations have become the same location, so that changing the value in a quoted vector ends up changing the value in another quoted vector. It’s also very likely that the program gets an exception when it tries to change the value in such a location. The last example shows that going the other way is not allowed: the compiler is not allowed to create two different versions of the list (a) in that program.

Referential transparency

Referential transparency is a concept that is important in purely functional programming languages. An expression that is referentially transparent can be replaced by the values it returns.

Some expressions in Scheme are referentially transparent. Constants, references to variables that are never mutated, arithmetic in general, type predicates, etc. A Scheme compiler is allowed to replace (+ 1 2) with 3. It doesn’t matter that the program might actually have returned a “different” 3 each time the expression runs. In the same way it doesn’t matter if the compiler turns two different constants into the same constant.

Most parts of Scheme are not referentially transparent. As an example, a Scheme compiler cannot replace (vector 1 2 3) with '#(1 2 3). The locations created by the vector procedure need to be fresh and mutable. But it can replace (vector-ref '#(1 2 3) 0) with 1, so this expression is referentially transparent. And as we previously saw, it can replace (cdr '(a b)) with '(b).

All of this should be fairly widely known, but now comes the interesting part.

There is a crack in everything

So far I have explained Scheme’s notion of locations, referential transparency and that the rules are different for literal constants.

Behold this hidden gem in R6RS:

A quasiquote expression may return either fresh, mutable objects or literal structure for any structure that is constructed at run time during the evaluation of the expression. Portions that do not need to be rebuilt are always literal. Thus,

(let ((a 3)) `((1 2) ,a ,4 ,'five 6))

may be equivalent to either of the following expressions:

'((1 2) 3 4 five 6)

(let ((a 3))
   (cons '(1 2)
         (cons a (cons 4 (cons 'five '(6))))))

However, it is not equivalent to this expression:

(let ((a 3)) (list (list 1 2) a 4 'five 6))

This part of R6RS originally came from Kent Dybvig’s formal comment #204. The same type of language was adopted in R7RS.

The meaning is that a quasiquoted expression can be turned into a literal, or parts may be turned into literals. Where there was code in the quasiquote expression, there can now be a literal. Going the other direction is not allowed: literals cannot be turned into code that returns fresh, mutable structure. But as the example '((1 2) 3 4 five 6) shows, a compiler is allowed to even propagate constants into quasiquote.

There is a very deep rabbit hole here! Have a look again: return […] literal structure for any structure that is constructed at run time during the evaluation of the expression.

There is now a way to construct literals from run time code, but to do so ahead of run time.

Literal magic

Let me demonstrate the power of Scheme’s magic quasiquote. Let --> mean “equivalent to”. It can be the result of an expansion or another compiler pass, such as a partial evaluator. Here is the original, innocuous-looking example:

(let ((a 3)) `((1 2) ,a ,4 ,'five 6))
; -->
'((1 2) 3 4 five 6)

Can we get literal structure copied into the constant part? Easy:

(let ((a '(3))) `((1 2) ,a ,4 ,'five 6))
; -->
'((1 2) (3) 4 five 6)

But we’re just getting started. Can we construct a structure at runtime and have that appear as a constant? Of course!

`((1 2) ,(list 3) ,4 ,'five 6)
;; -->
'((1 2) (3) 4 five 6)

Those Schemers who are paying attention will be thinking I’ve gone mad now. Maybe I have, but this example simply returned literal structure for the (list) structure that was constructed at run time during the evaluation of the expression, to paraphrase the quasiquote specification. Let’s increase the volume:

`((1 2) ,(map + '(1 1) '(2 3)) ,'five 6)
;; -->
'((1 2) (3 4) five 6)

Hang on, shouldn’t map return a fresh, mutable list? Not anymore, this is quasiquote. The map function constructs a list at run time during the evaluation of the quasiquote expression, so the structure no longer needs to be fresh. (Besides, the R6RS and R7RS definitions of map do not actually say that the list needs to be fresh and mutable, but everyone probably assumes it does.)

(letrec ((iota
          (lambda (n)
            (let lp ((m 0) (n n))
              (if (>= m n)
                  '()
                  (cons m (lp (+ m 1) n)))))))
  `((1 2) ,(iota 4) ,'five 6))
;; -->
'((1 2) (0 1 2 3) five 6)

Is this valid? I think most would say it isn’t; the list created by iota is constructed with cons, which returns fresh, mutable pairs. But this is happening inside quasiquote where the normal rules of society break down. The iota procedure constructs a list structure at runtime during the evaluation of a quasiquote expression, so a compiler is allowed to return literal structure for that list structure.

These go to 11

Let’s crank it up and make quasiquote transform code.

Compilers like Chez Scheme, Loko Scheme, Guile Scheme and many others use partial evaluators to perform source-level optimizations. A partial evaluator runs code symbolically. The output is a new program that is hopefully more efficient than the program that went into it.

The partial evaluators used by Scheme compilers are pretty mild as far as partial evaluators go, mostly because of the semantics of Scheme. Doing more powerful transformations on Scheme programs would require quite powerful static analysis, and that is both slow and difficult.

To get quasiquote to work with code, we need something that enables a partial evaluator to run a given procedure in such a way that it's always inside a quasiquote expression. If we have such a procedure then the partial evaluator can start using the tricks described above, and treat all code that constructs new structures as if their structures were literals. This makes the partial evaluator very happy, so here is happy:

(define (happy proc)
  (lambda x
    `,(apply proc x)))

In a normal Scheme implementation this operator doesn’t do much more than maybe waste a little time and space. But in a Scheme that knows the magic nature of quasiquote, it would enable powerful program transformations on lists and vectors, without the need for as much analysis as it would normally require. In particular, it should no longer be necessary to analyze if intermediate results are mutated, nor to analyze if programs check results for pointer equality with eq?.

Here is an illustrative example of a potential program transformation, based on Philip Wadler’s 1990 paper Deforestation: Transforming programs to eliminate trees:

(define (upto m n)
  (if (> m n)
      '()
      (cons m (upto (+ m 1) n))))

(define (square x)
  (* x x))

(define (sum xs)
  (let sum* ((a 0) (xs xs))
    (match xs
      [() a]
      [(x . xs) (sum* (+ a x) xs)])))

(define (squares xs)
  (match xs
    [() '()]
    [(x . xs) (cons (square x) (squares xs))]))

(define f
  (happy (lambda (n) (sum (squares (upto 1 n))))))
;; -->
(define f
  (lambda (n)
    (letrec ((h0 (lambda (u0 u1 n)
                   (if (> u1 n)
                       u0
                       (h0 (+ u0 (square u1)) (+ u1 1) n)))))
      (h0 0 1 n))))

After deforestation (or fusion), the intermediate lists used in f have been eliminated. This is beneficial in that you can write high-level code, but still have the compiler produce the efficient loop you would have had to write by hand. Scheme compilers normally don’t do these transformations due to the required analysis.

The happy operator does not completely open the barn doors: the transformation still needs to not change other program side effects, such as I/O and exceptions.

The fly in the ointment

Imagine a program written according to this template:

(define (main)
  ...)

(define main* (happy main))

(main*)

What are the consequences for the main program? It would seem that everything in it follows the rules of quasiquote and it can’t use cons in the normal way. This is bad news for the main program.

Conclusions

I don’t know where this leads. What is the precise limit for what a compiler can and can’t turn into literal structure in quasiquote? The example with the main program makes it seem that quasiquote actually gives the compiler a bit too much freedom.

Perhaps it’s actually just poor wording, so R6RS and R7RS will get a new erratum that clarifies what is and what isn’t allowed. I suspect that this is the most likely outcome.

But it doesn’t stop someone who is working on a partial evaluator or another program transformation from proposing a happy operator as a SRFI, giving it semantics that enable even more powerful transformations, but without the need to rely on language lawyering.

There is one conclusion I can draw from all this: don’t assume that what comes out of quasiquote can be mutated.

by weinholt at May 15, 2020 12:00 AM

May 12, 2020

GNU Guix

Guix welcomes Outreachy and GSoC interns

We are thrilled to announce that three people will join Guix as interns over the next few months! As part of Google’s Summer of Code (GSoC), under the umbrella of the GNU Project, one person is joining us:

  • Brice Waegeneire (liberdiko) will work on network booting Guix System. This will involve both making Guix System network bootable, and making it easy to set up a network boot server on Guix System.

Through Outreachy, the internship program for groups underrepresented in free software and tech, two people will join:

  • Danjela will work on improving internationalization support for the Guix Data Service,
  • Raghav Gururajan will work on integrating desktop environments into Guix System.

Christopher Baines and Danny Milosavljevic will be their primary mentors, and the whole Guix crowd will undoubtedly help and provide guidance as it has always done.

We welcome all three interns; exciting things are sure to come!

About GNU Guix

GNU Guix is a transactional package manager and an advanced distribution of the GNU system that respects user freedom. Guix can be used on top of any system running the kernel Linux, or it can be used as a standalone operating system distribution for i686, x86_64, ARMv7, and AArch64 machines.

In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. When used as a standalone GNU/Linux distribution, Guix offers a declarative, stateless approach to operating system configuration management. Guix is highly customizable and hackable through Guile programming interfaces and extensions to the Scheme language.

by Gábor Boskovits at May 12, 2020 12:00 PM

May 06, 2020

GNU Guix

Grafts, continued

Guix includes a mechanism called grafts that allows us to provide users with security updates in a timely fashion, even for core packages deep down in the dependency graph. Most users value the benefits of grafts, but newcomers were also unavoidably struck by the undesirable side effects of our graft implementation on the user experience. This had been a well-known problem for a while, but 1.1.0 finally addressed these issues.

This article recaps how grafts are implemented, what problems they caused, and how we solved them. It’s a deep dive into core Guix, and I hope it’ll be insightful to all and intriguing to the functional programming geeks among us!

What’s this “graft” thing anyway?

Grafts were introduced in the early days of Guix to address what is probably the main practical shortcoming of functional software deployment. In a nutshell, functional deployment as implemented by Nix and Guix means that, when a package changes, everything that depends on it must be rebuilt (or re-downloaded). To deploy a security fix in the C library or in Bash, you would thus need to rebuild everything. Even with a huge build farm, that can significantly delay the deployment of fixes; users would find themselves either rebuilding things locally or, at best, re-downloading binaries of everything.

To address this, Guix developers can instead specify a replacement in a package definition. If we have a bug-fix for, say, libc, developers would (1) define a package for the fixed libc, and (2) add a replacement field in the original libc package pointing to that fixed package. The effect is that only the bug-fix libc needs to be built. When building a package, the bug-fix libc is automatically grafted onto that package, such that the resulting package refers to the bug-fix libc. See the manual for more.
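In package-definition terms, the pattern looks roughly like this. This is only a sketch with a hypothetical patch name; replacement is the actual field name, and package fields such as replacement are delayed, so the forward reference works:

```scheme
;; Sketch of a graft: the original package names its fixed variant.
(define-public glibc
  (package
    (name "glibc")
    ;; ...other fields elided...
    ;; Point at the fixed variant; Guix grafts it onto dependents.
    (replacement glibc/fixed)))

(define glibc/fixed
  (package
    (inherit glibc)                  ; same package, patched source
    (source (origin
              (inherit (package-source glibc))
              ;; Hypothetical patch name for illustration.
              (patches (search-patches "glibc-security-fix.patch"))))))
```

Only glibc/fixed needs to be built; dependents keep their existing build results, with references to the original glibc rewritten to point at the fixed one.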

When “lowering” a high-level package definition to a low-level derivation, Guix traverses the package dependency graph and identifies a set of potentially applicable grafts. Why “potentially applicable”? Consider this scenario: assume perl has a replacement; coreutils has a dependency on perl, but it’s a build-time dependency: coreutils does not depend on perl at run time. Thus, coreutils can be used as is, there is no need to graft it.

But how do we know whether a dependency is a build-time-only dependency? The native-inputs field of a package usually lists build-time dependencies, but it’s more of a hint. Ultimately, the set of run-time dependencies, which we call the references, is the subset of the build-time dependencies that the garbage collector (GC) in the build daemon finds in the build result—Section 5.5.1 of Eelco Dolstra’s PhD thesis describes how the GC scans for references. In our example, we first have to actually build coreutils before we can tell whether it depends on perl at run time.

Guix arranges to graft only when necessary. In this example, guix build coreutils would return the same as guix build coreutils --no-grafts. Conversely, since inkscape has a run-time dependency on perl, guix build inkscape returns a derivation that grafts the perl replacement onto the original inkscape build result, the one returned by guix build inkscape --no-grafts. The (simplified) dependency graph of the derivation for the grafted inkscape looks like this:

Dependency graph of the graft derivation of Inkscape.

Grafts are a form of what Build Systems à la Carte by Mokhov et al. (a good read!) refers to as dynamic dependencies: grafting depends on intermediate build results.

Still here? With the background in place, let’s look at the problems that arose.

Grafts, the user interface, and performance

Conceptually, to decide whether to graft a package, we examine the references of the build result of the ungrafted package. However, we usually want guix install & co. to first display an overall build plan, especially when invoked with --dry-run:

$ guix install inkscape
The following package will be installed:
   inkscape 1.0

71.3 MB will be downloaded:
   /gnu/store/xq64iaxx2gmlcgnipj31wjxlf1yd2g2p-gsl-2.6
   /gnu/store/zrmhnj3pwchn2msphgnwzwd3q89m46rn-aspell-0.60.8
   /gnu/store/5g31zf21lk8nrfd2b8qrm19nwdm9gir9-potrace-1.16
   /gnu/store/qpr7387bsjs7rpl6flrwdvn2zbzh5bch-ghostscript-with-cups-9.52
   /gnu/store/7y3lvk3xf4im8n44337mc6y0ccysvfia-font-dejavu-2.37
   /gnu/store/95n3zvzxzq2bxvdvz292cg04757ij30x-cups-minimal-2.3.1
…

To accommodate that, the pre-1.1.0 implementation of grafts did the following: when substitutes were enabled, it would get the list of references of ungrafted packages from substitutes; only when substitutes for an ungrafted package were missing would it first try to build that package. Thus, when substitutes were available, guix install and similar commands would be able to display the build plan upfront. However, when a package had no substitutes, you’d see Guix start building it without saying a word about the build plan, which was arguably confusing.

But it’s worse than that. Grafting is per-package, so every time you would lower a package to a derivation, you would need to answer the question “does this specific package have substitutes, and if so, should it be grafted?” The end result was poor resource usage and terrible user interface feedback. For every package that is a graft candidate, the user would see that infamous line:

updating substitutes from 'https://ci.guix.gnu.org'...

The problem was particularly acute when building whole systems with guix system because there would typically be a large number of such packages. Furthermore, each of these lines would correspond to (roughly) a single HTTP GET request on a fresh TLS connection. That can be slow… and annoying. Perhaps to some users this updating substitutes stuttering was the proof of the developers’ incompetence and perhaps, truth be told, to some of us developers it was a small price to pay for the sophistication of grafts.

For users who disable substitutes and build everything locally, the situation wasn’t much better: all the packages candidate for grafting would be built one by one, thereby missing parallelization opportunities as specified by --max-jobs.

Gathering dynamic dependencies

To address this, all these individual dynamic dependencies need to be gathered somehow instead of being treated one by one. Conceptually, we would like to, roughly, do a first pass lowering packages to derivations as if grafting was disabled, build all these derivations, and then do a second pass to determine which packages in the graph need to be grafted and to compute the relevant grafting derivation. That would address the performance issue: we’d now have as much parallelism as possible, so we wouldn’t query substitutes or build packages one by one. If we reify that second pass to the user interface code, it also addresses the user interface issue by allowing it to display, possibly, two build plans: the “ungrafted” one followed by the grafted one.

The problem is that our API is inherently serial: the package-derivation function takes one package, lowers it, and returns its derivation:

(use-modules (guix)
             (gnu packages base)
             (gnu packages inkscape))

(define s (open-connection))

(package-derivation s coreutils)
⇒ #<derivation /gnu/store/rpfdbax1py483m9ydhvp73s7dgmn6xh4-coreutils-8.31.drv => /gnu/store/jkj7wxybgcpdamkl6fz7wwbb1ak5wxvk-coreutils-8.31-debug /gnu/store/zibwkb5xavnv6z3gzknfqjsxb9b0izh0-coreutils-8.31 7f6c92e3a000>

(package-derivation s coreutils #:graft? #f)
⇒ #<derivation /gnu/store/rpfdbax1py483m9ydhvp73s7dgmn6xh4-coreutils-8.31.drv => /gnu/store/jkj7wxybgcpdamkl6fz7wwbb1ak5wxvk-coreutils-8.31-debug /gnu/store/zibwkb5xavnv6z3gzknfqjsxb9b0izh0-coreutils-8.31 7f6c92e3a000>

(package-derivation s inkscape)
⇒ #<derivation /gnu/store/jzm2zsq8m0rj8wdsmikc0p2ik0cprrcf-inkscape-0.92.4.drv => /gnu/store/clj8rjnsip8a35hyd9nf4l65w7ahn0gs-inkscape-0.92.4 7f6c9c15b730>

(package-derivation s inkscape #:graft? #f)
⇒ #<derivation /gnu/store/psd31x1fq0v2g594z217mh56xzg21dym-inkscape-0.92.4.drv => /gnu/store/zz28ckjwfxwkx3gsm8sc452kmvfiky6y-inkscape-0.92.4 7f6c90ad4f50>

Lowering includes dealing with grafts, and that’s why we ended up with one-by-one inefficiencies. An option would be to make all the API “plural”: have package-derivation and its friends accept a list of packages instead of a single one. That would be a huge amount of work and the end result would be unpleasant to use: it’s easier to reason one-by-one.

The solution implemented in 1.1.0 instead starts from this observation: the call graph of package-derivation mirrors the package graph. Thus, we could gather dynamic dependencies using monad trickery or using “control effects”. We went for the latter, which didn’t have the “contamination” problem of monads and led to simpler code.

The starting point is that, by definition, code with dynamic dependencies necessarily calls build-derivations. Taking advantage of delimited continuations in Guile, build-derivations is instrumented to abort to a “build handler” prompt when it’s called. The build handler receives the list of derivations to build along with a continuation to invoke to resume the aborted computation and start building things. User interface code can install a build handler that displays what’s going to be built:

(with-build-handler (lambda (continue store things mode)
                      (show-what-to-build store things)
                      (continue #t))
  …)

To implement dry runs, simply omit the call to continue and nothing will be built. (This is a slightly simplified artist’s view; see build-notifier for the real thing.)

Now, we need to take advantage of this mechanism to gather the individual build-derivations calls so we can later emit a single build-derivations call for all the gathered derivations. The goal is to effectively gather all the calls for ungrafted packages, build them all at once, and then resume graft computation.

To achieve that, we write a build handler that, when invoked, returns an <unresolved> object that captures what to build and the continuation. In addition, we provide a primitive to introduce parallelism such that, if a dynamic dependency is encountered, we keep going and attempt to compute as much as possible without resolving that dynamic dependency. These are build-accumulator and map/accumulate-builds. map/accumulate-builds is like map, except that it accumulates and gathers build-derivations requests.

By using map/accumulate-builds instead of map in a few key places, we obtain a good approximation of what we wanted, as illustrated in this run:

$ guix install zile-on-guile vim-full
The following packages will be installed:
   zile-on-guile 2.4.14-0.fd09781
   vim-full      8.2.0411

9.4 MB will be downloaded:
   /gnu/store/vf7w4yiax38ra7x8aqqvbnc38c0ldgpm-zile-on-guile-2.4.14-0.fd09781
   /gnu/store/dnj9wljcck9vdwgp7dwxk00qnnk1g3c5-vim-full-8.2.0411
downloading from https://ci.guix.gnu.org/nar/lzip/dnj9wljcck9vdwgp7dwxk00qnnk1g3c5-vim-full-8.2.0411...
 vim-full-8.2.0411  8.9MiB                 7.6MiB/s 00:01 [##################] 100.0%

downloading from https://ci.guix.gnu.org/nar/lzip/vf7w4yiax38ra7x8aqqvbnc38c0ldgpm-zile-on-guile-2.4.14-0.fd09781...
 zile-on-guile-2.4.14-0.fd09781  140KiB    1.8MiB/s 00:00 [##################] 100.0%

The following derivation will be built:
   /gnu/store/d9xms78917w67xq71pqsx5x9s6dmq6d7-profile.drv
The following graft will be made:
   /gnu/store/4n6dmg6iwjg0adpcvqygr9wgsnclswss-vim-full-8.2.0411.drv
applying 8 grafts for /gnu/store/4n6dmg6iwjg0adpcvqygr9wgsnclswss-vim-full-8.2.0411.drv...
building /gnu/store/d9xms78917w67xq71pqsx5x9s6dmq6d7-profile.drv...

What we see above is first a build plan that downloads binaries for the two ungrafted packages, followed by a build plan for one grafting derivation: we have successfully preserved parallelism.

The solution resembles the suspending scheduler discussed in the à la Carte paper, though the decomposition is not as principled as what the paper describes. It remains an approximation and not the optimal way to deal with dynamic dependencies. There are still situations where that shows, but overall, it’s a significant improvement. Unlike other solutions prototyped before, this one has the advantage of being orthogonal and simple: less than 100 new lines of code, and even about 30 lines removed from the graft implementation. That alone contributes a lot to the author’s satisfaction. :-)

Interlude: a scatter/gather pattern?

In the end, we’re just gathering all the build-derivations calls, turning them into a single call, and finally calling all the original site continuations with the result. The same kind of issue shows up when dealing with sequences of remote procedure calls (RPCs) and HTTP requests, and it seems there’s a more general pattern lurking here. Consider code like this:

(map (lambda (thing)
       (http-get (thing->url thing)))
     lst)

Wouldn’t it be nice if we could somehow capture all the http-get calls, turn them into a series of pipelined GET requests, and resume the continuations with their result?

I haven’t found a standard functional pattern to address this and would welcome ideas!
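As a thought experiment, here is one way such a scatter/gather could look, sketched with Guile’s delimited continuations. The names map/batched and remote-get are hypothetical, not part of Guix or Guile, and the sketch assumes each per-element procedure performs exactly one remote-get:

```scheme
;; Batch all `remote-get' calls made while mapping over LST, answer
;; them with a single FETCH-ALL call, then resume each suspended
;; computation with its response.
(define batch-tag (make-prompt-tag "batch"))

(define (remote-get url)
  ;; Suspend the current computation, handing URL to the scheduler.
  (abort-to-prompt batch-tag url))

(define (map/batched proc lst fetch-all)
  (let ((pending '()))                     ; list of (url . continuation)
    (for-each
     (lambda (x)
       (call-with-prompt batch-tag
         (lambda () (proc x))
         (lambda (k url)                   ; PROC suspended on REMOTE-GET
           (set! pending (cons (cons url k) pending)))))
     lst)
    (let* ((requests  (reverse pending))
           (responses (fetch-all (map car requests)))) ; one batched call
      ;; Resuming each continuation yields that element's final value.
      (map (lambda (request response)
             ((cdr request) response))
           requests responses))))
```

Here fetch-all would issue, say, pipelined GET requests; each suspended computation then resumes with its answer, much as Guix resumes derivation computations after one bulk build-derivations call.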

Dynamic dependencies of all shapes

We have seen how Guix deals with dynamic dependencies. Nix supports a similar but limited form of dynamic dependencies through the import primitive of the Nix language, which can take the result of a derivation build; it does not attempt to gather the resulting buildPaths calls.

If we take a step back, we can see that Nix and Guix actually support other forms of dynamic dependencies. For example, it’s possible to write derivations whose result is a function of the reference graph of another derivation’s build result. This is achieved in Guix by passing the #:references-graphs argument of gexp->derivation, which leads the build daemon to include the reference graph in the build environment.

Another form of dynamic dependency is derivation-building derivations, or recursive derivations, which were recently implemented in Nix. Here, the build process of a derivation can itself create and build derivations (these are moldable tasks in scheduling parlance). It’s a great feature because, in a nutshell, it allows Nix to be used not only to compose packages, but also at a finer grain, as part of a package build process.

Guix supports yet another form of dynamic dependencies. The newfangled guix deploy tool works by evaluating g-expressions (gexps) remotely. For example, before actually deploying an operating system, it first runs code on the remote node to perform sanity checks: checking whether the declared file system UUIDs or labels are valid, checking whether additional kernel modules should be added to the initial RAM disk, and so forth. To do that, remote-eval first builds a derivation that produces a Scheme program, deploys it along with all its dependencies on that target machine, runs it, and retrieves the result. This form of dynamic dependency also benefits from the gathering machinery discussed above.

Conclusion

This is a long article on what may look like a fine point of Guix design and implementation, but there’s much to say about it! Grafts are key to the use of functional deployment in production because they enable quick security updates, and it’s a lot better if they don’t harm the user experience.

The pre-1.1.0 implementation of grafts had a negative impact on the user interface and on performance, which was due to the sequential handling of grafts, one package at a time. In 1.1.0 we addressed it by using delimited continuations to gather dynamic dependencies such as grafts, perform builds in bulk, and resume each derivation computation.

As it turned out, the implementation of dynamic dependencies raises lots of interesting design and implementation issues, and it’s probably not the end of the road!

by Ludovic Courtès at May 06, 2020 03:00 PM

May 04, 2020

GNU Guix

GNU Guix maintainer collective update

This blog post is to announce a change of hands in the Guix co-maintainer collective: Ricardo Wurmus is stepping down from his role, and Mathieu Othacehe will be filling in to ensure continuity, after being elected by the other Guix co-maintainers.

Ricardo has been around since the start, and has been invaluable to the project. He has been key in maintaining the infrastructure Guix runs on, contributed countless packages, core APIs and tools (importers, build systems, and Docker image creation to name a few). Over the years, he's also brought us a fair share of cool hacks such as a nifty issue tracker, and generously spent time helping Guix users in the IRC channel and mailing lists. Equally important was his taking care of many administrative tasks such as expanding the build farm and organizing Outreachy participation. We're sad to let him go, and hope he'll stick around as time permits :-).

On the happier side of things, the appointment of Mathieu Othacehe as a co-maintainer means Guix will benefit from renewed energy and vision to grow further. Mathieu has already made valuable contributions to Guix; the graphical installer that allows users to easily install the Guix System on their machine is one of them. He has also demonstrated the qualities we expect from a co-maintainer. We're thrilled to make official his new role as a Guix co-maintainer!

Let's take a moment to show our gratitude to Ricardo and welcome Mathieu in his new role!

The Guix co-maintainers

The Guix maintainer collective now consists of Marius Bakke, Maxim Cournoyer, Ludovic Courtès, Tobias Geerinckx-Rice and Mathieu Othacehe. You can reach us all by email at guix-maintainers@gnu.org, a private alias.

For information about the responsibilities assumed by the Guix co-maintainers, you are encouraged to read a previous blog post that covered the topic.

by Maxim Cournoyer at May 04, 2020 09:00 AM

April 15, 2020

GNU Guix

GNU Guix 1.1.0 released

We are pleased to announce the release of GNU Guix version 1.1.0!

The release comes with ISO-9660 installation images, a virtual machine image, and with tarballs to install the package manager on top of your GNU/Linux distro, either from source or from binaries. Guix users can update by running guix pull.

If you wonder what installing Guix System is like, this video gives an overview of the guided installation process:

Video of the system installation process.

There are more “getting started” videos.

It’s been 11 months since the previous release, during which 201 people contributed code and packages. This is a long time for a release, which is in part due to the fact that bug fixes and new features are continuously delivered to our users via guix pull. However, a number of improvements, in particular in the installer, will greatly improve the experience of first-time users.

It’s hard to summarize more than 14,000 commits! Here are some highlights as far as tooling is concerned:

On the distro side:

  • The big change is that the package dependency graph is rooted in a reduced set of “binary seeds”—a huge step towards a fully auditable bootstrap. There’s more to come soon!
  • The graphical installer for Guix System benefited from many bug fixes and improvements. Following the bugs found in 1.0.0, we developed an automated testing framework for the installer itself. Continuous integration runs automated tests of the installer for different configurations (encrypted root, non-encrypted root, with or without a desktop environment, etc.).
  • 3,514 packages were added, for a total of more than 13K packages. 3,368 packages were upgraded. The distribution comes with GNU libc 2.29, GCC 9.3, GNOME 3.32, MATE 1.24.0, Xfce 4.14.0, Linux-libre 5.4.28, and LibreOffice 6.4.2.2 to name a few.
  • 19 new services were added, notably providing support for running NFS servers, configuring the nftables firewall, or even a high-level Web service like Patchwork.
  • Build systems for Node, Julia, and Qt were added, making it easier to write package definitions for these ecosystems. In addition there is a new copy-build-system that does what you might expect.

At the programming interface level and under the hood, many things changed as well, notably:

  • The new with-build-handler form allows us to better support dynamic dependencies as introduced by grafts. More on that in a future post, but suffice to say that it fixes a longstanding user interface and performance issue.
  • The remote-eval procedure in (guix remote) supports remote execution of Scheme code as G-expressions after having first built and deployed any code it relies on. This capability was key to allowing code sharing between guix deploy, which operates on remote hosts, and guix system reconfigure. Similarly, there’s a new eval/container procedure to run code in an automatically-provisioned container.
  • The new lower-gexp procedure returns a low-level intermediate representation of a G-expression. remote-eval, eval/container, and gexp->derivation are expressed in terms of lower-gexp.
  • The with-parameters form allows you, for instance, to pin objects such as packages to a specific system or cross-compilation target.
  • Performance was improved for common low-level operations.

That’s a long list! The NEWS file lists additional noteworthy changes and bug fixes you may be interested in.

Enjoy!

by Ludovic Courtès, Marius Bakke at April 15, 2020 03:00 PM

April 10, 2020

Programming Praxis

Nth Item In A Linked List

[ I continue working at home. It’s no fun, and less productive than working in my office, but I’m coping. There remains much emergency work to be done related to moving all of our students online. I hope everyone is doing well during this time. ]

A student recently asked for help on a beginning-programmer forum to write a Scheme program that finds the nth item in a linked list, mimicking the list-ref built-in function. He posted his code, which was awful; I won’t repost it here. Instead of engaging him, I sent a private email suggesting that he consult either his professor or his teaching assistant, as his posted code showed several misconceptions about Scheme. He wrote back, saying he was sure there was only one thing wrong with his code and I could easily point it out. I didn’t respond, as there was far more than one thing wrong with his code.

Your task is to write a program to find the nth item in a linked list; your program must be recursive, as was required by the original assignment. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
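For reference, a correct recursive version is short. This is a sketch of the standard approach, not the official suggested solution:

```scheme
;; Recursive nth item of a list, zero-indexed like list-ref.
(define (nth xs n)
  (cond ((null? xs) (error "index out of range" n))
        ((zero? n)  (car xs))
        (else       (nth (cdr xs) (- n 1)))))

(nth '(a b c d) 2)   ; ⇒ c
```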

by programmingpraxis at April 10, 2020 09:00 AM

April 08, 2020

GNU Guix

A “Hello World” virtual machine running the Hurd

Hello GNU World!

There’s been a bit of speculation as to whether our April 1st post was a joke. Part of it was a joke: we’re not deprecating Linux-libre, fear not! But when we published it, it was already April 2nd in Eastern parts of the world and thus, not surprisingly, the remainder of the post was less of a joke.

Getting to a bootable system

For all of you who tried our April 1st image and ran guix, we sure hope you had a good laugh. We set out to cross-build that virtual machine (VM) image using Guix and, while we made some good progress on Wednesday, in the end we decided to cheat to make the release deadline.

What we got stuck on for a while was to get past the ext2fs translator (the user-land process that implements the ext2 file system) seemingly freezing on boot, saying:

start ext2fs:

and then nothing... Running ext2fs cross-built with Guix on Debian GNU/Hurd would hang similarly. The kernel debugger would show an intriguing backtrace in ext2fs suggesting that ext2fs was not handling page fault messages. Long story short: we eventually realized that the server interfaces were compiled with a 64-bit MiG whereas we were targeting a 32-bit platform. From there on, we embarked on a delightful hacking journey ensuring the Hurd boot process would correctly run in our VM up to a proper login prompt.

Today we have a humble gift for you: On the wip-hurd-vm branch (Update: this has now been merged!) we have an initial hurd.scm system description that can be used to cross build a VM running the Hurd.

Running:

./pre-inst-env guix build -f gnu/system/hurd.scm

cross-compiles all the relevant packages for GNU/Hurd—specifically the i586-pc-gnu triplet—and produces a VM image:

/gnu/store/yqnabv1zmlkviwzikc23w9qvfnyfwvj7-qemu-image

You can build it and start it from your GNU/Linux machine with this command:

qemu-system-i386 -enable-kvm -m 512 -snapshot -hda \
  $(./pre-inst-env guix build -f gnu/system/hurd.scm)

and voilà:

Initial Guix VM running the Hurd

Woohoo! (Actually we already have more stuff not shown here, such as guix itself running… for a future post! :-))

Why bother?

Why bother with the Hurd anyway? Isn’t it a pipe dream or “vaporware”, depending on one’s perspective? There’s some unquestionable truth in that: we know that Hurd development started in the early 90’s, months before Linux development started, and yet it still lacks so much in terms of hardware support, even though significant progress was made in recent years in particular with the use of Rump kernels.

The more we witness how new features are retrofitted into the kernel Linux, the more we think the Hurd’s design is better suited to today’s needs. Linux namespaces, the foundation of “containers”, are one example of an afterthought; unprivileged user namespaces, which allow unprivileged users to benefit from lightweight “container” virtualization, are still often disabled by distros due to a lack of confidence. This is in sharp contrast with the Hurd’s inherent, unrestricted support for fine-grained virtualization: a PID namespace is just another proc server, a file system namespace is just another root file system server, and so on. Container-like lightweight virtualization is native on the Hurd.

Last but not least, with an eye on the security and transparency of free software systems, a microkernel-based system seems naturally well suited to bootstrapping from a reduced trusted base. This is one of the topics we discussed at the last Reproducible Builds Summit.

The question is not so much whether 2020 or 2021 will be the year of the Hurd. It’s more about the kind of systems we want to build. A lot of work remains to be done, but we think, in 2020 more than ever, that this is a promising approach for the betterment of the security of our systems and the freedom of users.

We also have to admit that this is an amazing system to hack on, even more so when combined with Guix, so… happy hacking! :-)

About GNU Guix

GNU Guix is a transactional package manager and an advanced distribution of the GNU system that respects user freedom. Guix can be used on top of any system running the Hurd or the Linux kernel, or it can be used as a standalone operating system distribution for i686, x86_64, ARMv7, and AArch64 machines.

In addition to standard package management features, Guix supports transactional upgrades and roll-backs, unprivileged package management, per-user profiles, and garbage collection. When used as a standalone GNU/Linux distribution, Guix offers a declarative, stateless approach to operating system configuration management. Guix is highly customizable and hackable through Guile programming interfaces and extensions to the Scheme language.

About the GNU Hurd

The GNU Hurd is the GNU project's replacement for the Unix kernel. It is a collection of servers that run on the Mach microkernel to implement file systems, network protocols, file access control, and other features that are implemented by the Unix kernel or similar kernels (such as Linux). More info.

The mission of the GNU Hurd project is to create a general-purpose kernel suitable for the GNU operating system, which is viable for everyday use, and gives users and programs as much control over their computing environment as possible.

by Jan Nieuwenhuizen, Ludovic Courtès at April 08, 2020 05:50 PM

April 03, 2020

Programming Praxis

Homework

[ I continue working from home. It is interesting that my commute has gone from a daily 75-minute round-trip by car to a 75-foot round-trip by foot, so I think I should have more energy in the evening, but the opposite is true; I am exhausted by the end of the day, just from sitting in front of my laptop. ]

Today’s exercise is somebody’s homework:

Given positive integers C and N, find N numbers that sum to C such that the difference between the highest and the lowest of these numbers is not more than one. For example: with C = 26 and N = 7, the desired output is [4 4 4 4 4 3 3].

Your task is to write a program to solve the student’s homework. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
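The published solutions are Scheme, but the idea fits in a few lines of any language: dividing C by N gives a quotient q and remainder r, and the answer is r copies of q+1 followed by N−r copies of q. A minimal illustrative sketch (the function name `split_sum` is our own, not from the exercise):

```python
def split_sum(c, n):
    """Split positive integer c into n parts summing to c,
    where the largest and smallest parts differ by at most 1."""
    q, r = divmod(c, n)
    # r parts get q + 1; the remaining n - r parts get q
    return [q + 1] * r + [q] * (n - r)

print(split_sum(26, 7))  # [4, 4, 4, 4, 4, 3, 3]
```

With C = 26 and N = 7, divmod gives q = 3 and r = 5, reproducing the example output [4 4 4 4 4 3 3].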

by programmingpraxis at April 03, 2020 09:00 AM