GNU bug report logs - #32234
Cuirass: The SQLite built in busy handler might block the Fibers scheduler

Package: guix;

Reported by: Clément Lassieur <clement <at> lassieur.org>

Date: Sat, 21 Jul 2018 09:59:01 UTC

Severity: normal

Done: Clément Lassieur <clement <at> lassieur.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 32234 in the body.
You can then email your comments to 32234 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox

Report forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Sat, 21 Jul 2018 09:59:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to Clément Lassieur <clement <at> lassieur.org>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sat, 21 Jul 2018 09:59:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: bug-guix <at> gnu.org
Subject: Cuirass: The SQLite built in busy handler might block the Fibers
 scheduler
Date: Sat, 21 Jul 2018 11:57:57 +0200

Hi,

I'm trying to understand why Berlin web API times out[1].

I think the SQlite built in busy handler may block the Fibers scheduler.
We use "PRAGMA busy_timeout = 30000;", which is an alternative to
calling sqlite3_busy_timeout(), whose description[2] is:

    This routine sets a busy handler that sleeps for a specified amount
    of time when a table is locked. The handler will sleep multiple
    times until at least "ms" milliseconds of sleeping have
    accumulated. After at least "ms" milliseconds of sleeping, the
    handler returns 0 which causes sqlite3_step() to return SQLITE_BUSY.

To me this sounds like non-cooperative and non-resumable code.

A solution would be to set a custom handler through
sqlite3_busy_handler[3] that would be Fibers compatible, i.e. it would
let the scheduler schedule other fibers instead of just sleeping, using
Fibers 'sleep' procedure[4].

WDYT?
Clément

[1]: https://bugs.gnu.org/32233.
[2]: https://www.sqlite.org/c3ref/busy_timeout.html
[3]: https://www.sqlite.org/c3ref/busy_handler.html
[4]: https://github.com/wingo/fibers/wiki/Manual#user-content-index-sleep

Added indication that bug 32234 blocks32233 Request was from clement <at> lassieur.org (Clément Lassieur) to control <at> debbugs.gnu.org. (Sat, 21 Jul 2018 12:41:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Mon, 23 Jul 2018 09:19:02 GMT) Full text and rfc822 format available.

Message #10 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: Cuirass: The SQLite built in busy handler might block
 the Fibers scheduler
Date: Mon, 23 Jul 2018 11:18:43 +0200

Hi,

Clément Lassieur <clement <at> lassieur.org> skribis:

> I'm trying to understand why Berlin web API times out[1].

Just to be clear, it does block on things involving non-trivial database
queries:

--8<---------------cut here---------------start------------->8---
ludo <at> berlin ~$ time wget -q -O -  http://localhost:8081/specifications
[{"name" : "core-updates-core-updates","load-path-inputs" : ["core-updates"],"package-path-inputs" : [],"proc-input" : "core-updates","proc-file" : "build-aux/cuirass/gnu-system.scm","proc" : "cuirass-jobs","proc-args" : {"subset" : "core","systems" : ["x86_64-linux", "i686-linux"]},"inputs" : [{"name" : "core-updates","url" : "https://git.savannah.gnu.org/git/guix.git","load-path" : ".","branch" : "core-updates","tag" : false,"commit" : false,"no-compile?" : true}]}, {"name" : "staging-staging","load-path-inputs" : ["staging"],"package-path-inputs" : [],"proc-input" : "staging","proc-file" : "build-aux/cuirass/gnu-system.scm","proc" : "cuirass-jobs","proc-args" : {"systems" : ["x86_64-linux", "i686-linux"]},"inputs" : [{"name" : "staging","url" : "https://git.savannah.gnu.org/git/guix.git","load-path" : ".","branch" : "staging","tag" : false,"commit" : false,"no-compile?" : true}]}, {"name" : "guix-modular-master","load-path-inputs" : ["guix-modular"],"package-path-inputs" : [],"proc-input" : "guix-modular","proc-file" : "build-aux/cuirass/guix-modular.scm","proc" : "cuirass-jobs","proc-args" : {"systems" : ["x86_64-linux", "i686-linux"]},"inputs" : [{"name" : "guix-modular","url" : "https://git.savannah.gnu.org/git/guix.git","load-path" : ".","branch" : "master","tag" : false,"commit" : false,"no-compile?" : true}]}, {"name" : "guix-master","load-path-inputs" : ["guix"],"package-path-inputs" : [],"proc-input" : "guix","proc-file" : "build-aux/cuirass/gnu-system.scm","proc" : "cuirass-jobs","proc-args" : {"subset" : "all","systems" : ["x86_64-linux", "i686-linux"]},"inputs" : [{"name" : "guix","url" : "https://git.savannah.gnu.org/git/guix.git","load-path" : ".","branch" : "master","tag" : false,"commit" : false,"no-compile?" : true}]}]
real    0m0.007s
user    0m0.004s
sys     0m0.000s
ludo <at> berlin ~$ time wget -q -O -  http://localhost:8081/api/latestbuilds?nr=2
[{"id" : 4484197,"jobset" : "core-updates-core-updates","job" : "ghc-alex-3.2.3.i686-linux","timestamp" : 1532336715,"starttime" : 1532336542,"stoptime" : 1532336715,"derivation" : "/gnu/store/b072sfmw44ww9dsm1ahpqafjk05zsl04-ghc-alex-3.2.3.drv","buildoutputs" : {"out" : {"path" : "/gnu/store/qk79zl1gcg6lh4k2nrp72x8iywy317d2-ghc-alex-3.2.3"}},"system" : "i686-linux","nixname" : "ghc-alex-3.2.3","buildstatus" : 0,"busy" : 0,"priority" : 0,"finished" : 1,"buildproducts" : null,"releasename" : null,"buildinputs_builds" : null}, {"id" : 4799156,"jobset" : "core-updates-core-updates","job" : "ghc-alex-3.2.3.i686-linux","timestamp" : 1532336715,"starttime" : 1532336542,"stoptime" : 1532336715,"derivation" : "/gnu/store/b072sfmw44ww9dsm1ahpqafjk05zsl04-ghc-alex-3.2.3.drv","buildoutputs" : {"out" : {"path" : "/gnu/store/qk79zl1gcg6lh4k2nrp72x8iywy317d2-ghc-alex-3.2.3"}},"system" : "i686-linux","nixname" : "ghc-alex-3.2.3","buildstatus" : 0,"busy" : 0,"priority" : 0,"finished" : 1,"buildproducts" : null,"releasename" : null,"buildinputs_builds" : null}]
real    0m28.033s
user    0m0.004s
sys     0m0.000s
--8<---------------cut here---------------end--------------->8---

I’m not sure it’s due to locking though; we may lack an index or two,
“explain query plan” should help.

> I think the SQlite built in busy handler may block the Fibers scheduler.
> We use "PRAGMA busy_timeout = 30000;", which is an alternative to
> calling sqlite3_busy_timeout(), whose description[2] is:
>
>     This routine sets a busy handler that sleeps for a specified amount
>     of time when a table is locked. The handler will sleep multiple
>     times until at least "ms" milliseconds of sleeping have
>     accumulated. After at least "ms" milliseconds of sleeping, the
>     handler returns 0 which causes sqlite3_step() to return SQLITE_BUSY.
>
> To me this sounds like non-cooperative and non-resumable code.

Indeed!

> A solution would be to set a custom handler through
> sqlite3_busy_handler[3] that would be Fibers compatible, i.e. it would
> let the scheduler schedule other fibers instead of just sleeping, using
> Fibers 'sleep' procedure[4].

AIUI the handler is called from C, and thus it’s a non-resumable
continuation, so this wouldn’t work.

Perhaps instead we need to set the timeout to a small value and handle
SQLITE_BUSY at the call site in our code.  We could define a macro that
automatically retries upon SQLITE_BUSY.

Thoughts?

Ludo’.

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Mon, 23 Jul 2018 13:43:01 GMT) Full text and rfc822 format available.

Message #13 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: Cuirass: The SQLite built in busy handler might block
 the Fibers scheduler
Date: Mon, 23 Jul 2018 15:42:24 +0200

Ludovic Courtès <ludo <at> gnu.org> writes:

>> I think the SQlite built in busy handler may block the Fibers scheduler.
>> We use "PRAGMA busy_timeout = 30000;", which is an alternative to
>> calling sqlite3_busy_timeout(), whose description[2] is:
>>
>>     This routine sets a busy handler that sleeps for a specified amount
>>     of time when a table is locked. The handler will sleep multiple
>>     times until at least "ms" milliseconds of sleeping have
>>     accumulated. After at least "ms" milliseconds of sleeping, the
>>     handler returns 0 which causes sqlite3_step() to return SQLITE_BUSY.
>>
>> To me this sounds like non-cooperative and non-resumable code.
>
> Indeed!
>
>> A solution would be to set a custom handler through
>> sqlite3_busy_handler[3] that would be Fibers compatible, i.e. it would
>> let the scheduler schedule other fibers instead of just sleeping, using
>> Fibers 'sleep' procedure[4].
>
> AIUI the handler is called from C, and thus it’s a non-resumable
> continuation, so this wouldn’t work.

Oh, I see.

> Perhaps instead we need to set the timeout to a small value and handle
> SQLITE_BUSY at the call site in our code.  We could define a macro that
> automatically retries upon SQLITE_BUSY.

That would limit the issue to the first timeout span: for that short
time the scheduler would be blocked.  I think a timeout of 0 would be
better.

Another solution would be to serialize all the database accesses as we
do already with the url handler, and stop using the SQLITE
multithreading features.  It would probably make the code simpler
because we would use the same paradigm everywhere, and we would avoid
looping until SQLITE isn't busy at each request.

WDYT?

Clément

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Mon, 23 Jul 2018 14:59:02 GMT) Full text and rfc822 format available.

Message #16 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: Cuirass: The SQLite built in busy handler might block
 the Fibers scheduler
Date: Mon, 23 Jul 2018 16:57:49 +0200

Clément Lassieur <clement <at> lassieur.org> skribis:

> Ludovic Courtès <ludo <at> gnu.org> writes:

[...]

>> Perhaps instead we need to set the timeout to a small value and handle
>> SQLITE_BUSY at the call site in our code.  We could define a macro that
>> automatically retries upon SQLITE_BUSY.
>
> That would limit the issue to the first timeout span: for that short
> time the scheduler would be blocked.  I think a timeout of 0 would be
> better.

Yes, 0 is an acceptable “small value.”  ;-)  Perhaps 100ms would be
acceptable if the situation is rare enough, dunno.

> Another solution would be to serialize all the database accesses as we
> do already with the url handler, and stop using the SQLITE
> multithreading features.  It would probably make the code simpler
> because we would use the same paradigm everywhere, and we would avoid
> looping until SQLITE isn't busy at each request.

In essence we’d introduce a “database server” running as a fiber, and
everyone would talk to that server.

I considered doing that before but then though sqlite would probably be
able to do better than this, but I don’t know.

What’s a bit annoying with switching to a database server model is that
we’d need to adapt every call site.

Thoughts?

Ludo’.

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Mon, 23 Jul 2018 16:19:02 GMT) Full text and rfc822 format available.

Message #19 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: Cuirass: The SQLite built in busy handler might block
 the Fibers scheduler
Date: Mon, 23 Jul 2018 18:18:14 +0200

Ludovic Courtès <ludo <at> gnu.org> writes:

> Clément Lassieur <clement <at> lassieur.org> skribis:
>
>> Ludovic Courtès <ludo <at> gnu.org> writes:
>
> [...]
>
>>> Perhaps instead we need to set the timeout to a small value and handle
>>> SQLITE_BUSY at the call site in our code.  We could define a macro that
>>> automatically retries upon SQLITE_BUSY.
>>
>> That would limit the issue to the first timeout span: for that short
>> time the scheduler would be blocked.  I think a timeout of 0 would be
>> better.
>
> Yes, 0 is an acceptable “small value.”  ;-)  Perhaps 100ms would be
> acceptable if the situation is rare enough, dunno.
>
>> Another solution would be to serialize all the database accesses as we
>> do already with the url handler, and stop using the SQLITE
>> multithreading features.  It would probably make the code simpler
>> because we would use the same paradigm everywhere, and we would avoid
>> looping until SQLITE isn't busy at each request.
>
> In essence we’d introduce a “database server” running as a fiber, and
> everyone would talk to that server.

Why not as a thread?  There would be only one thread dedicated to SQLITE
(thus adding a constant overhead), and that would prevent the Scheduler
from being bloqued by long SQLite queries.

> I considered doing that before but then though sqlite would probably be
> able to do better than this, but I don’t know.

And it's hard to test...

> What’s a bit annoying with switching to a database server model is that
> we’d need to adapt every call site.
>
> Thoughts?

I'm a tiny bit in favor of switching to a database server model because
it's more consistent overall (we already use it for the url handler) and
the scheduler wouldn't be bloqued by long SQLite queries.

It's annoying to adapt every call site, but it would be done once and
for all. :-)

So... as you wish!

Clément

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Mon, 23 Jul 2018 20:12:02 GMT) Full text and rfc822 format available.

Message #22 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: Cuirass: The SQLite built in busy handler might block
 the Fibers scheduler
Date: Mon, 23 Jul 2018 22:11:07 +0200

Clément Lassieur <clement <at> lassieur.org> skribis:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Clément Lassieur <clement <at> lassieur.org> skribis:
>>
>>> Ludovic Courtès <ludo <at> gnu.org> writes:
>>
>> [...]
>>
>>>> Perhaps instead we need to set the timeout to a small value and handle
>>>> SQLITE_BUSY at the call site in our code.  We could define a macro that
>>>> automatically retries upon SQLITE_BUSY.
>>>
>>> That would limit the issue to the first timeout span: for that short
>>> time the scheduler would be blocked.  I think a timeout of 0 would be
>>> better.
>>
>> Yes, 0 is an acceptable “small value.”  ;-)  Perhaps 100ms would be
>> acceptable if the situation is rare enough, dunno.
>>
>>> Another solution would be to serialize all the database accesses as we
>>> do already with the url handler, and stop using the SQLITE
>>> multithreading features.  It would probably make the code simpler
>>> because we would use the same paradigm everywhere, and we would avoid
>>> looping until SQLITE isn't busy at each request.
>>
>> In essence we’d introduce a “database server” running as a fiber, and
>> everyone would talk to that server.
>
> Why not as a thread?  There would be only one thread dedicated to SQLITE
> (thus adding a constant overhead), and that would prevent the Scheduler
> from being bloqued by long SQLite queries.

Right, if we want to have a big timeout, it should be a thread, not a
fiber.  Sorry for the confusion.

>> I considered doing that before but then though sqlite would probably be
>> able to do better than this, but I don’t know.
>
> And it's hard to test...
>
>> What’s a bit annoying with switching to a database server model is that
>> we’d need to adapt every call site.
>>
>> Thoughts?
>
> I'm a tiny bit in favor of switching to a database server model because
> it's more consistent overall (we already use it for the url handler) and
> the scheduler wouldn't be bloqued by long SQLite queries.
>
> It's annoying to adapt every call site, but it would be done once and
> for all. :-)

Indeed.  If you want to take that route, that’s fine with me!

Thank you,
Ludo’.

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Mon, 06 Aug 2018 19:29:02 GMT) Full text and rfc822 format available.

Message #25 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: 32234 <at> debbugs.gnu.org
Subject: [PATCH 1/2] utils: Avoid deadlock when WITH-CRITICAL-SECTION calls
 are nested.
Date: Mon,  6 Aug 2018 21:27:35 +0200

* src/cuirass/utils.scm (%critical-section-args): New parameter.
(make-critical-section): Put ARGS into a parameter, so that
CALL-WITH-CRITICAL-SECTION knows when it's called from the critical section.
In that case it would just apply PROC to ARGS.
(call-with-critical-section): If already in the critical section, apply PROC
to %CRITICAL-SECTION-ARGS instead of sending the message through the critical
section channel.
---
 src/cuirass/utils.scm | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/src/cuirass/utils.scm b/src/cuirass/utils.scm
index 9e9ac36..6083890 100644
--- a/src/cuirass/utils.scm
+++ b/src/cuirass/utils.scm
@@ -94,6 +94,9 @@ delimited continuations and fibers."
         (conclusion)
         (apply throw args)))))
 
+(define %critical-section-args
+  (make-parameter #f))
+
 (define (make-critical-section . args)
   "Return a channel used to implement a critical section.  That channel can
 then be passed to 'join-critical-section', which will ensure sequential
@@ -104,19 +107,23 @@ dedicated fiber."
   (let ((channel (make-channel)))
     (spawn-fiber
      (lambda ()
-       (let loop ()
-         (match (get-message channel)
-           (((? channel? reply) . (? procedure? proc))
-            (put-message reply (apply proc args))))
-         (loop))))
+       (parameterize ((%critical-section-args args))
+         (let loop ()
+           (match (get-message channel)
+             (((? channel? reply) . (? procedure? proc))
+              (put-message reply (apply proc args))))
+           (loop)))))
     channel))
 
 (define (call-with-critical-section channel proc)
-  "Call PROC in the critical section corresponding to CHANNEL.  Return the
-result of PROC."
-  (let ((reply (make-channel)))
-    (put-message channel (cons reply proc))
-    (get-message reply)))
+  "Send PROC to the critical section through CHANNEL.  Return the result of
+PROC.  If already in the critical section, call PROC immediately."
+  (let ((args (%critical-section-args)))
+    (if args
+        (apply proc args)
+        (let ((reply (make-channel)))
+          (put-message channel (cons reply proc))
+          (get-message reply)))))
 
 (define-syntax-rule (with-critical-section channel (vars ...) exp ...)
   "Evaluate EXP... in the critical section corresponding to CHANNEL.
-- 
2.18.0

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Mon, 06 Aug 2018 19:29:02 GMT) Full text and rfc822 format available.

Message #28 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: 32234 <at> debbugs.gnu.org
Subject: [PATCH 2/2] database: Serialize all database accesses in a thread.
Date: Mon,  6 Aug 2018 21:27:36 +0200

Fixes <https://bugs.gnu.org/32234>.

* bin/cuirass.in (main): Keep only one WITH-DATABASE call around all fibers.
Remove all DB arguments.
* src/cuirass/base.scm (evaluate, update-build-statuses!, spawn-builds,
handle-build-event, build-packages): Remove all DB arguments.
(clear-build-queue, cancel-old-builds): Wrap in WITH-DB-CRITICAL-SECTION,
remove all DB arguments.
(restart-builds): Remove the NON-BLOCKING call, remove all DB arguments.
(process-specs): Remove all DB arguments, remove the WITH-DATABASE call.
* src/cuirass/database.scm (%db-channel): New parameter.
(with-db-critical-section): New macro.
(db-add-input, db-add-specification, db-get-inputs, db-get-specifications,
db-add-evaluation, db-add-build, db-update-build-status!, db-get-outputs,
db-get-builds, db-get-build, db-get-pending-derivations, db-get-stamp,
db-add-stamp, db-get-evaluations, db-get-evaluations-build-summary,
db-get-evaluations-id-min, db-get-evaluations-id-max, db-get-builds-min,
db-get-builds-max): Wrap in WITH-DB-CRITICAL-SECTION, remove all DB arguments.
(with-database): Wrap BODY in PARAMETERIZE form that sets %DB-CHANNEL to the
channel returned by MAKE-CRITICAL-SECTION.
* src/cuirass/http.scm (handle-build-request, handle-builds-request): Remove
all DB arguments.
(url-handler): Remove all DB arguments, remove the DB-CHANNEL state, remove
the WITH-CRITICAL-SECTION calls.
(run-cuirass-server): Remove the DB arguments, remove the
MAKE-CRITICAL-SECTION call.
* src/cuirass/utils.scm (make-critical-section): Replace SPAWN-FIBER with
CALL-WITH-NEW-THREAD.  Wrap body in PARAMETERIZE form that clears
CURRENT-FIBER.
* tests/database.scm (with-temporary-database, db-add-specification,
db-add-build, db-update-build-status!, db-get-builds,
db-get-pending-derivations): Remove the DB arguments.
(db-init): Set the %DB-CHANNEL parameter to the channel returned by
MAKE-CRITICAL-SECTION, and return #t.
(database): Set %DB-CHANNEL to #f during cleanup.
* tests/http.scm (db-init): Set the %DB-CHANNEL parameter to the channel
returned by MAKE-CRITICAL-SECTION, and return #t.
(cuirass-run, fill-db): Remove the DB arguments.
(http): Set %DB-CHANNEL to #f during cleanup.
---
 bin/cuirass.in           |  23 +-
 src/cuirass/base.scm     |  94 +++----
 src/cuirass/database.scm | 550 +++++++++++++++++++++------------------
 src/cuirass/http.scm     | 136 +++++-----
 src/cuirass/utils.scm    |  23 +-
 tests/database.scm       | 103 ++++----
 tests/http.scm           |  21 +-
 7 files changed, 494 insertions(+), 456 deletions(-)

diff --git a/bin/cuirass.in b/bin/cuirass.in
index 11eb975..d30f788 100644
--- a/bin/cuirass.in
+++ b/bin/cuirass.in
@@ -115,19 +115,19 @@ exec ${GUILE:-@GUILE@} --no-auto-compile -e main -s "$0" "$@"
           (log-message "running Fibers on ~a kernel threads" threads)
           (run-fibers
            (lambda ()
-             (with-database db
+             (with-database
                (and specfile
                     (let ((new-specs (save-module-excursion
                                       (lambda ()
                                         (set-current-module (make-user-module '()))
                                         (primitive-load specfile)))))
-                      (for-each (lambda (spec) (db-add-specification db spec))
+                      (for-each (lambda (spec) (db-add-specification spec))
                                 new-specs)))
                (if one-shot?
-                   (process-specs db (db-get-specifications db))
+                   (process-specs (db-get-specifications))
                    (let ((exit-channel (make-channel)))
 
-                     (clear-build-queue db)
+                     (clear-build-queue)
 
                      ;; First off, restart builds that had not completed or
                      ;; were not even started on a previous run.
@@ -135,25 +135,22 @@ exec ${GUILE:-@GUILE@} --no-auto-compile -e main -s "$0" "$@"
                       (essential-task
                        'restart-builds exit-channel
                        (lambda ()
-                         (with-database db
-                           (restart-builds db)))))
+                         (restart-builds))))
 
                      (spawn-fiber
                       (essential-task
                        'build exit-channel
                        (lambda ()
-                         (with-database db
-                           (while #t
-                             (process-specs db (db-get-specifications db))
-                             (log-message "next evaluation in ~a seconds" interval)
-                             (sleep interval))))))
+                         (while #t
+                           (process-specs (db-get-specifications))
+                           (log-message "next evaluation in ~a seconds" interval)
+                           (sleep interval)))))
 
                      (spawn-fiber
                       (essential-task
                        'web-server exit-channel
                        (lambda ()
-                         (with-database db
-                           (run-cuirass-server db #:host host #:port port))))
+                         (run-cuirass-server #:host host #:port port)))
                       #:parallel? #t)
 
                      (spawn-fiber
diff --git a/src/cuirass/base.scm b/src/cuirass/base.scm
index 26a5996..abbdb7b 100644
--- a/src/cuirass/base.scm
+++ b/src/cuirass/base.scm
@@ -248,7 +248,7 @@ fibers."
                    (logior (@ (fibers epoll) EPOLLERR)
                            (@ (fibers epoll) EPOLLHUP)))))
 
-(define (evaluate store db spec checkouts commits)
+(define (evaluate store spec checkouts commits)
   "Evaluate and build package derivations defined in SPEC, using CHECKOUTS.
 Return a list of jobs."
   (define (augment-job job eval-id)
@@ -277,8 +277,8 @@ Return a list of jobs."
       (('evaluation jobs)
        (let* ((spec-name (assq-ref spec #:name))
               (eval-id (db-add-evaluation
-                        db `((#:specification . ,spec-name)
-                             (#:commits . ,commits)))))
+                        `((#:specification . ,spec-name)
+                          (#:commits . ,commits)))))
          (log-message "created evaluation ~a for '~a'" eval-id spec-name)
          (map (lambda (job)
                 (augment-job job eval-id))
@@ -368,7 +368,7 @@ Essentially this procedure inverts the inversion-of-control that
   ;; Our shuffling algorithm is simple: we sort by .drv file name.  :-)
   (sort drv string<?))
 
-(define (update-build-statuses! store db lst)
+(define (update-build-statuses! store lst)
   "Update the build status of the derivations listed in LST, which have just
 been passed to 'build-derivations' (meaning that we can assume that, if their
 outputs are invalid, that they failed to build.)"
@@ -376,8 +376,8 @@ outputs are invalid, that they failed to build.)"
     (match (derivation-path->output-paths drv)
       (((_ . outputs) ...)
        (if (any (cut valid-path? store <>) outputs)
-           (db-update-build-status! db drv (build-status succeeded))
-           (db-update-build-status! db drv (build-status failed))))))
+           (db-update-build-status! drv (build-status succeeded))
+           (db-update-build-status! drv (build-status failed))))))
 
   (for-each update! lst))
 
@@ -393,10 +393,11 @@ and returns the values RESULTS."
        (print-exception (current-error-port) frame key args)
        (apply values results)))))
 
-(define* (spawn-builds store db drv
+(define* (spawn-builds store drv
                        #:key (max-batch-size 200))
-  "Build the derivations listed in DRV, updating DB as builds complete.
-Derivations are submitted in batches of at most MAX-BATCH-SIZE items."
+  "Build the derivations listed in DRV, updating the database as builds
+complete.  Derivations are submitted in batches of at most MAX-BATCH-SIZE
+items."
   ;; XXX: We want to pass 'build-derivations' as many derivations at once so
   ;; we benefit from as much parallelism as possible (we must be using
   ;; #:keep-going? #t).
@@ -444,7 +445,7 @@ Derivations are submitted in batches of at most MAX-BATCH-SIZE items."
                                    ;; from PORT and eventually close it.
                                    (catch #t
                                      (lambda ()
-                                       (handle-build-event db event))
+                                       (handle-build-event event))
                                      (exception-reporter state)))
                                  #t)
               (close-port port)
@@ -455,14 +456,14 @@ Derivations are submitted in batches of at most MAX-BATCH-SIZE items."
           ;; derivations were built "behind our back", in which case
           ;; 'build-derivations' doesn't actually do anything and
           ;; 'handle-build-event' doesn't see any event.  Because of that,
-          ;; adjust DB here.
-          (update-build-statuses! store db batch)
+          ;; adjust the database here.
+          (update-build-statuses! store batch)
 
           (loop rest (max (- count max-batch-size) 0))))))
 
-(define* (handle-build-event db event)
+(define* (handle-build-event event)
   "Handle EVENT, a build event sexp as produced by 'build-event-output-port',
-updating DB accordingly."
+updating the database accordingly."
   (define (valid? file)
     ;; FIXME: Sometimes we might get bogus events due to the interleaving of
     ;; build messages.  This procedure prevents us from propagating the bogus
@@ -475,7 +476,7 @@ updating DB accordingly."
      (if (valid? drv)
          (begin
            (log-message "build started: '~a'" drv)
-           (db-update-build-status! db drv (build-status started)))
+           (db-update-build-status! drv (build-status started)))
          (log-message "bogus build-started event for '~a'" drv)))
     (('build-remote drv host _ ...)
      (log-message "'~a' offloaded to '~a'" drv host))
@@ -483,13 +484,13 @@ updating DB accordingly."
      (if (valid? drv)
          (begin
            (log-message "build succeeded: '~a'" drv)
-           (db-update-build-status! db drv (build-status succeeded)))
+           (db-update-build-status! drv (build-status succeeded)))
          (log-message "bogus build-succeeded event for '~a'" drv)))
     (('build-failed drv _ ...)
      (if (valid? drv)
          (begin
            (log-message "build failed: '~a'" drv)
-           (db-update-build-status! db drv (build-status failed)))
+           (db-update-build-status! drv (build-status failed)))
          (log-message "bogus build-failed event for '~a'" drv)))
     (('substituter-started item _ ...)
      (log-message "substituter started: '~a'" item))
@@ -503,42 +504,42 @@ updating DB accordingly."
   (string=? (assq-ref build1 #:derivation)
             (assq-ref build2 #:derivation)))
 
-(define (clear-build-queue db)
-  "Reset the status of builds in DB that are marked as \"started\".  This
-procedure is meant to be called at startup."
+(define (clear-build-queue)
+  "Reset the status of builds in the database that are marked as \"started\".
+This procedure is meant to be called at startup."
   (log-message "marking stale builds as \"scheduled\"...")
-  (sqlite-exec db "UPDATE Builds SET status = -2 WHERE status = -1;"))
+  (with-db-critical-section db
+    (sqlite-exec db "UPDATE Builds SET status = -2 WHERE status = -1;")))
 
-(define (cancel-old-builds db age)
+(define (cancel-old-builds age)
   "Cancel builds older than AGE seconds."
   (log-message "canceling builds older than ~a seconds..." age)
-  (sqlite-exec db
-               "UPDATE Builds SET status = 4 WHERE status = -2 AND timestamp < "
-               (- (time-second (current-time time-utc)) age)
-               ";"))
-
-(define (restart-builds db)
-  "Restart builds whose status in DB is \"pending\" (scheduled or started)."
+  (with-db-critical-section db
+    (sqlite-exec
+     db "UPDATE Builds SET status = 4 WHERE status = -2 AND timestamp < "
+     (- (time-second (current-time time-utc)) age) ";")))
+
+(define (restart-builds)
+  "Restart builds whose status in the database is \"pending\" (scheduled or
+started)."
   (with-store store
-    ;; Note: On a big database, 'db-get-pending-derivations' can take a couple
-    ;; of minutes, hence 'non-blocking'.
     (log-message "retrieving list of pending builds...")
     (let*-values (((valid stale)
                    (partition (cut valid-path? store <>)
-                              (non-blocking (db-get-pending-derivations db)))))
+                              (db-get-pending-derivations))))
       ;; We cannot restart builds listed in STALE, so mark them as canceled.
       (log-message "canceling ~a stale builds" (length stale))
       (for-each (lambda (drv)
-                  (db-update-build-status! db drv (build-status canceled)))
+                  (db-update-build-status! drv (build-status canceled)))
                 stale)
 
       ;; Those in VALID can be restarted.  If some of them were built in the
       ;; meantime behind our back, that's fine: 'spawn-builds' will DTRT.
       (log-message "restarting ~a pending builds" (length valid))
-      (spawn-builds store db valid)
+      (spawn-builds store valid)
       (log-message "done with restarted builds"))))
 
-(define (build-packages store db jobs)
+(define (build-packages store jobs)
   "Build JOBS and return a list of Build results."
   (define (register job)
     (let* ((name     (assq-ref job #:job-name))
@@ -570,14 +571,14 @@ procedure is meant to be called at startup."
                      (#:timestamp . ,cur-time)
                      (#:starttime . 0)
                      (#:stoptime . 0))))
-        (db-add-build db build))))
+        (db-add-build build))))
 
   (define derivations
     (filter-map register jobs))
 
-  (spawn-builds store db derivations)
+  (spawn-builds store derivations)
 
-  (let* ((results (filter-map (cut db-get-build db <>) derivations))
+  (let* ((results (filter-map (cut db-get-build <>) derivations))
          (status (map (cut assq-ref <> #:status) results))
          (success (count (lambda (status)
                            (= status (build-status succeeded)))
@@ -651,11 +652,11 @@ procedure is meant to be called at startup."
            checkout)
          results)))
 
-(define (process-specs db jobspecs)
-  "Evaluate and build JOBSPECS and store results in DB."
+(define (process-specs jobspecs)
+  "Evaluate and build JOBSPECS and store results in the database."
   (define (process spec)
     (with-store store
-      (let* ((stamp (db-get-stamp db spec))
+      (let* ((stamp (db-get-stamp spec))
              (name (assoc-ref spec #:name))
              (checkouts (fetch-inputs spec))
              (commits (map (cut assq-ref <> #:commit) checkouts))
@@ -663,7 +664,7 @@ procedure is meant to be called at startup."
         (unless (equal? commits-str stamp)
           ;; Immediately mark SPEC's INPUTS as being processed so we don't
           ;; spawn a concurrent evaluation of that same commit.
-          (db-add-stamp db spec commits-str)
+          (db-add-stamp spec commits-str)
           (compile-checkouts spec (filter compile? checkouts))
           (spawn-fiber
            (lambda ()
@@ -674,11 +675,10 @@ procedure is meant to be called at startup."
                (log-message "evaluating spec '~a': stamp ~s different from ~s"
                             name commits-str stamp)
                (with-store store
-                 (with-database db
-                   (let ((jobs (evaluate store db spec checkouts commits)))
-                     (log-message "building ~a jobs for '~a'"
-                                  (length jobs) name)
-                     (build-packages store db jobs)))))))
+                 (let ((jobs (evaluate store spec checkouts commits)))
+                   (log-message "building ~a jobs for '~a'"
+                                (length jobs) name)
+                   (build-packages store jobs))))))
 
           ;; 'spawn-fiber' returns zero values but we need one.
           *unspecified*))))
diff --git a/src/cuirass/database.scm b/src/cuirass/database.scm
index 7788ac9..5cf84aa 100644
--- a/src/cuirass/database.scm
+++ b/src/cuirass/database.scm
@@ -59,7 +59,9 @@
             ;; Parameters.
             %package-database
             %package-schema-file
+            %db-channel
             ;; Macros.
+            with-db-critical-section
             with-database))
 
 (define (%sqlite-exec db sql . args)
@@ -139,6 +141,16 @@ question marks matches the number of arguments to bind."
                                      (string-append %datadir "/" %package))
                                  "/sql")))
 
+(define %db-channel
+  (make-parameter #f))
+
+(define-syntax-rule (with-db-critical-section db exp ...)
+  "Evaluate EXP... in the critical section corresponding to %DB-CHANNEL.
+DB is bound to the argument of that critical section: the database
+connection."
+  (call-with-critical-section (%db-channel)
+                              (lambda (db) exp ...)))
+
 (define (read-sql-file file-name)
   "Return a list of string containing SQL instructions from FILE-NAME."
   (call-with-input-file file-name
@@ -238,92 +250,111 @@ database object."
   (vector-ref (car (sqlite-exec db "SELECT last_insert_rowid();"))
               0))
 
-(define (db-add-input db spec-name input)
-  (sqlite-exec db "\
+(define (db-add-input spec-name input)
+  (with-db-critical-section db
+    (sqlite-exec db "\
 INSERT OR IGNORE INTO Inputs (specification, name, url, load_path, branch, \
 tag, revision, no_compile_p) VALUES ("
-               spec-name ", "
-               (assq-ref input #:name) ", "
-               (assq-ref input #:url) ", "
-               (assq-ref input #:load-path) ", "
-               (assq-ref input #:branch) ", "
-               (assq-ref input #:tag) ", "
-               (assq-ref input #:commit) ", "
-               (if (assq-ref input #:no-compile?) 1 0) ");")
-  (last-insert-rowid db))
-
-(define (db-add-specification db spec)
-  "Store SPEC in database DB. SPEC inputs are stored in the INPUTS table."
-  (sqlite-exec db "\
+                 spec-name ", "
+                 (assq-ref input #:name) ", "
+                 (assq-ref input #:url) ", "
+                 (assq-ref input #:load-path) ", "
+                 (assq-ref input #:branch) ", "
+                 (assq-ref input #:tag) ", "
+                 (assq-ref input #:commit) ", "
+                 (if (assq-ref input #:no-compile?) 1 0) ");")
+    (last-insert-rowid db)))
+
+(define (db-add-specification spec)
+  "Store SPEC in database the database.  SPEC inputs are stored in the INPUTS
+table."
+  (with-db-critical-section db
+    (sqlite-exec db "\
 INSERT OR IGNORE INTO Specifications (name, load_path_inputs, \
 package_path_inputs, proc_input, proc_file, proc, proc_args) \
   VALUES ("
-               (assq-ref spec #:name) ", "
-               (assq-ref spec #:load-path-inputs) ", "
-               (assq-ref spec #:package-path-inputs)", "
-               (assq-ref spec #:proc-input) ", "
-               (assq-ref spec #:proc-file) ", "
-               (symbol->string (assq-ref spec #:proc)) ", "
-               (assq-ref spec #:proc-args) ");")
-  (let ((spec-id (last-insert-rowid db)))
-    (for-each (lambda (input)
-                (db-add-input db (assq-ref spec #:name) input))
-              (assq-ref spec #:inputs))
-    spec-id))
-
-(define (db-get-inputs db spec-name)
-  (let loop ((rows (sqlite-exec db "SELECT * FROM Inputs WHERE specification="
-                                spec-name ";"))
-             (inputs '()))
-    (match rows
-      (() inputs)
-      ((#(specification name url load-path branch tag revision no-compile-p)
-        . rest)
-       (loop rest
-             (cons `((#:name . ,name)
-                     (#:url . ,url)
-                     (#:load-path . ,load-path)
-                     (#:branch . ,branch)
-                     (#:tag . ,tag)
-                     (#:commit . ,revision)
-                     (#:no-compile? . ,(positive? no-compile-p)))
-                   inputs))))))
-
-(define (db-get-specifications db)
-  (let loop ((rows  (sqlite-exec db "SELECT * FROM Specifications;"))
-             (specs '()))
-    (match rows
-      (() specs)
-      ((#(name load-path-inputs package-path-inputs proc-input proc-file proc
-               proc-args)
-        . rest)
-       (loop rest
-             (cons `((#:name . ,name)
-                     (#:load-path-inputs .
-                      ,(with-input-from-string load-path-inputs read))
-                     (#:package-path-inputs .
-                      ,(with-input-from-string package-path-inputs read))
-                     (#:proc-input . ,proc-input)
-                     (#:proc-file . ,proc-file)
-                     (#:proc . ,(with-input-from-string proc read))
-                     (#:proc-args . ,(with-input-from-string proc-args read))
-                     (#:inputs . ,(db-get-inputs db name)))
-                   specs))))))
-
-(define (db-add-evaluation db eval)
-  (sqlite-exec db "\
+                 (assq-ref spec #:name) ", "
+                 (assq-ref spec #:load-path-inputs) ", "
+                 (assq-ref spec #:package-path-inputs) ", "
+                 (assq-ref spec #:proc-input) ", "
+                 (assq-ref spec #:proc-file) ", "
+                 (symbol->string (assq-ref spec #:proc)) ", "
+                 (assq-ref spec #:proc-args) ");")
+    (let ((spec-id (last-insert-rowid db)))
+      (for-each (lambda (input)
+                  (db-add-input (assq-ref spec #:name) input))
+                (assq-ref spec #:inputs))
+      spec-id)))
+
+(define (db-get-inputs spec-name)
+  (with-db-critical-section db
+    (let loop ((rows (sqlite-exec
+                      db "SELECT * FROM Inputs WHERE specification="
+                      spec-name ";"))
+               (inputs '()))
+      (match rows
+        (() inputs)
+        ((#(specification name url load-path branch tag revision no-compile-p)
+           . rest)
+         (loop rest
+               (cons `((#:name . ,name)
+                       (#:url . ,url)
+                       (#:load-path . ,load-path)
+                       (#:branch . ,branch)
+                       (#:tag . ,tag)
+                       (#:commit . ,revision)
+                       (#:no-compile? . ,(positive? no-compile-p)))
+                     inputs)))))))
+
+(define (db-get-specifications)
+  (with-db-critical-section db
+    (let loop ((rows  (sqlite-exec db "SELECT * FROM Specifications;"))
+               (specs '()))
+      (match rows
+        (() specs)
+        ((#(name load-path-inputs package-path-inputs proc-input proc-file proc
+                 proc-args)
+           . rest)
+         (loop rest
+               (cons `((#:name . ,name)
+                       (#:load-path-inputs .
+                                           ,(with-input-from-string load-path-inputs read))
+                       (#:package-path-inputs .
+                                              ,(with-input-from-string package-path-inputs read))
+                       (#:proc-input . ,proc-input)
+                       (#:proc-file . ,proc-file)
+                       (#:proc . ,(with-input-from-string proc read))
+                       (#:proc-args . ,(with-input-from-string proc-args read))
+                       (#:inputs . ,(db-get-inputs name)))
+                     specs)))))))
+
+(define (db-add-evaluation eval)
+  (with-db-critical-section db
+    (sqlite-exec db "\
 INSERT INTO Evaluations (specification, commits) VALUES ("
-               (assq-ref eval #:specification) ", "
-               (string-join (assq-ref eval #:commits)) ");")
-  (last-insert-rowid db))
+                 (assq-ref eval #:specification) ", "
+                 (string-join (assq-ref eval #:commits)) ");")
+    (last-insert-rowid db)))
 
-(define-syntax-rule (with-database db body ...)
-  "Run BODY with a connection to the database which is bound to DB in BODY."
+(define-syntax-rule (with-database body ...)
+  "Run BODY with %DB-CHANNEL being dynamically bound to a channel implementing
+a critical section that allows database operations to be serialized."
   ;; XXX: We don't install an unwind handler to play well with delimited
   ;; continuations and fibers.  But as a consequence, we leak DB when BODY
   ;; raises an exception.
   (let ((db (db-open)))
-    (unwind-protect body ... (db-close db))))
+    (unwind-protect
+     ;; Process database queries sequentially in a thread.  We need this
+     ;; because otherwise we would need to use the SQLite multithreading
+     ;; feature for which it is required to wait until the database is
+     ;; available, and the waiting would happen in non-cooperative and
+     ;; non-resumable code that blocks the fibers scheduler.  Now the database
+     ;; access blocks on PUT-MESSAGE, which allows the scheduler to schedule
+     ;; another fiber.  Also, creating one new handle for each request would
+     ;; be costly and may defeat statement caching.
+     (parameterize ((%db-channel (make-critical-section db)))
+       body ...)
+     (db-close db))))
 
 (define* (read-quoted-string #:optional (port (current-input-port)))
   "Read all of the characters out of PORT and return them as a SQL quoted
@@ -353,79 +384,84 @@ string."
   (failed-other      3)
   (canceled          4))
 
-(define (db-add-build db build)
-  "Store BUILD in database DB. BUILD eventual outputs are stored
-in the OUTPUTS table."
-  (catch 'sqlite-error
-    (lambda ()
-      (sqlite-exec db "
+(define (db-add-build build)
+  "Store BUILD in database the database.  BUILD eventual outputs are stored in
+the OUTPUTS table."
+  (with-db-critical-section db
+    (catch 'sqlite-error
+      (lambda ()
+        (sqlite-exec db "
 INSERT INTO Builds (derivation, evaluation, job_name, system, nix_name, log,
 status, timestamp, starttime, stoptime)
 VALUES ("
-                   (assq-ref build #:derivation) ", "
-                   (assq-ref build #:eval-id) ", "
-                   (assq-ref build #:job-name) ", "
-                   (assq-ref build #:system) ", "
-                   (assq-ref build #:nix-name) ", "
-                   (assq-ref build #:log) ", "
-                   (or (assq-ref build #:status)
-                       (build-status scheduled)) ", "
-                   (or (assq-ref build #:timestamp) 0) ", "
-                   (or (assq-ref build #:starttime) 0) ", "
-                   (or (assq-ref build #:stoptime) 0) ");")
-      (let ((derivation (assq-ref build #:derivation)))
-        (for-each (lambda (output)
-                    (match output
-                      ((name . path)
-                       (sqlite-exec db "\
+                     (assq-ref build #:derivation) ", "
+                     (assq-ref build #:eval-id) ", "
+                     (assq-ref build #:job-name) ", "
+                     (assq-ref build #:system) ", "
+                     (assq-ref build #:nix-name) ", "
+                     (assq-ref build #:log) ", "
+                     (or (assq-ref build #:status)
+                         (build-status scheduled)) ", "
+                     (or (assq-ref build #:timestamp) 0) ", "
+                     (or (assq-ref build #:starttime) 0) ", "
+                     (or (assq-ref build #:stoptime) 0) ");")
+        (let ((derivation (assq-ref build #:derivation)))
+          (for-each (lambda (output)
+                      (match output
+                        ((name . path)
+                         (sqlite-exec db "\
 INSERT INTO Outputs (derivation, name, path) VALUES ("
-                                    derivation ", " name ", " path ");"))))
-                  (assq-ref build #:outputs))
-        derivation))
-    (lambda (key who code message . rest)
-      ;; If we get a unique-constraint-failed error, that means we have
-      ;; already inserted the same build.  That happens when several jobs
-      ;; produce the same derivation, and we can ignore it.
-      (if (= code SQLITE_CONSTRAINT_PRIMARYKEY)
-          #f
-          (apply throw key who code rest)))))
-
-(define* (db-update-build-status! db drv status #:key log-file)
-  "Update DB so that DRV's status is STATUS.  This also updates the
+                                      derivation ", " name ", " path ");"))))
+                    (assq-ref build #:outputs))
+          derivation))
+      (lambda (key who code message . rest)
+        ;; If we get a unique-constraint-failed error, that means we have
+        ;; already inserted the same build.  That happens when several jobs
+        ;; produce the same derivation, and we can ignore it.
+        (if (= code SQLITE_CONSTRAINT_PRIMARYKEY)
+            #f
+            (apply throw key who code rest))))))
+
+(define* (db-update-build-status! drv status #:key log-file)
+  "Update the database so that DRV's status is STATUS.  This also updates the
 'starttime' or 'stoptime' fields.  If LOG-FILE is true, record it as the build
 log file for DRV."
   (define now
     (time-second (current-time time-utc)))
 
-  (if (= status (build-status started))
-      (sqlite-exec db "UPDATE Builds SET starttime=" now ", status="
-                   status "WHERE derivation=" drv ";")
-
-      ;; Update only if we're switching to a different status; otherwise leave
-      ;; things unchanged.  This ensures that 'stoptime' remains valid and
-      ;; doesn't change every time we mark DRV as 'succeeded' several times in
-      ;; a row, for instance.
-      (if log-file
-          (sqlite-exec db "UPDATE Builds SET stoptime=" now
-                       ", status=" status ", log=" log-file
-                       "WHERE derivation=" drv "AND status != " status ";")
-          (sqlite-exec db "UPDATE Builds SET stoptime=" now
-                       ", status=" status
-                       "WHERE derivation=" drv " AND status != " status ";"))))
-
-(define (db-get-outputs db derivation)
-  "Retrieve the OUTPUTS of the build identified by DERIVATION in DB database."
-  (let loop ((rows
-              (sqlite-exec db "SELECT name, path FROM Outputs
+  (with-db-critical-section db
+    (if (= status (build-status started))
+        (sqlite-exec db "UPDATE Builds SET starttime=" now ", status="
+                     status "WHERE derivation=" drv ";")
+
+        ;; Update only if we're switching to a different status; otherwise
+        ;; leave things unchanged.  This ensures that 'stoptime' remains valid
+        ;; and doesn't change every time we mark DRV as 'succeeded' several
+        ;; times in a row, for instance.
+        (if log-file
+            (sqlite-exec db "UPDATE Builds SET stoptime=" now
+                         ", status=" status ", log=" log-file
+                         "WHERE derivation=" drv "AND status != " status ";")
+            (sqlite-exec db "UPDATE Builds SET stoptime=" now
+                         ", status=" status
+                         "WHERE derivation=" drv " AND status != " status
+                         ";")))))
+
+(define (db-get-outputs derivation)
+  "Retrieve the OUTPUTS of the build identified by DERIVATION in the
+database."
+  (with-db-critical-section db
+    (let loop ((rows
+                (sqlite-exec db "SELECT name, path FROM Outputs
 WHERE derivation =" derivation ";"))
-             (outputs '()))
-    (match rows
-      (() outputs)
-      ((#(name path)
-        . rest)
-       (loop rest
-             (cons `(,name . ((#:path . ,path)))
-                   outputs))))))
+               (outputs '()))
+      (match rows
+        (() outputs)
+        ((#(name path)
+           . rest)
+         (loop rest
+               (cons `(,name . ((#:path . ,path)))
+                     outputs)))))))
 
 (define (filters->order filters)
   (match (assq 'order filters)
@@ -440,12 +476,13 @@ WHERE derivation =" derivation ";"))
     (('order . 'status+submission-time) "status DESC, timestamp DESC")
     (_ "rowid DESC")))
 
-(define (db-get-builds db filters)
-  "Retrieve all builds in database DB which are matched by given FILTERS.
+(define (db-get-builds filters)
+  "Retrieve all builds in the database which are matched by given FILTERS.
 FILTERS is an assoc list whose possible keys are 'derivation | 'id | 'jobset |
 'job | 'system | 'nr | 'order | 'status | 'evaluation."
-  (let* ((order (filters->order filters))
-         (stmt-text (format #f "SELECT * FROM (
+  (with-db-critical-section db
+    (let* ((order (filters->order filters))
+           (stmt-text (format #f "SELECT * FROM (
 SELECT Builds.derivation, Builds.rowid, Builds.timestamp, Builds.starttime,
 Builds.stoptime, Builds.log, Builds.status, Builds.job_name, Builds.system,
 Builds.nix_name, Specifications.name
@@ -475,93 +512,99 @@ CASE WHEN :borderlowtime IS NULL
 END DESC
 LIMIT :nr)
 ORDER BY ~a, rowid ASC;" order))
-         (stmt (sqlite-prepare db stmt-text #:cache? #t)))
-    (sqlite-bind-arguments
-     stmt
-     #:derivation (assq-ref filters 'derivation)
-     #:id (assq-ref filters 'id)
-     #:jobset (assq-ref filters 'jobset)
-     #:job (assq-ref filters 'job)
-     #:evaluation (assq-ref filters 'evaluation)
-     #:system (assq-ref filters 'system)
-     #:status (and=> (assq-ref filters 'status) object->string)
-     #:borderlowid (assq-ref filters 'border-low-id)
-     #:borderhighid (assq-ref filters 'border-high-id)
-     #:borderlowtime (assq-ref filters 'border-low-time)
-     #:borderhightime (assq-ref filters 'border-high-time)
-     #:nr (match (assq-ref filters 'nr)
-            (#f -1)
-            (x x)))
-    (sqlite-reset stmt)
-    (let loop ((rows (sqlite-fold-right cons '() stmt))
-               (builds '()))
-      (match rows
-        (() (reverse builds))
-        ((#(derivation id timestamp starttime stoptime log status job-name
-                       system nix-name specification) . rest)
-         (loop rest
-               (cons `((#:derivation . ,derivation)
-                       (#:id . ,id)
-                       (#:timestamp . ,timestamp)
-                       (#:starttime . ,starttime)
-                       (#:stoptime . ,stoptime)
-                       (#:log . ,log)
-                       (#:status . ,status)
-                       (#:job-name . ,job-name)
-                       (#:system . ,system)
-                       (#:nix-name . ,nix-name)
-                       (#:specification . ,specification)
-                       (#:outputs . ,(db-get-outputs db derivation)))
-                     builds)))))))
-
-(define (db-get-build db derivation-or-id)
-  "Retrieve a build in database DB which corresponds to DERIVATION-OR-ID."
-  (let ((key (if (number? derivation-or-id) 'id 'derivation)))
-    (match (db-get-builds db `((,key . ,derivation-or-id)))
-      ((build)
-       build)
-      (() #f))))
-
-(define (db-get-pending-derivations db)
+           (stmt (sqlite-prepare db stmt-text #:cache? #t)))
+      (sqlite-bind-arguments
+       stmt
+       #:derivation (assq-ref filters 'derivation)
+       #:id (assq-ref filters 'id)
+       #:jobset (assq-ref filters 'jobset)
+       #:job (assq-ref filters 'job)
+       #:evaluation (assq-ref filters 'evaluation)
+       #:system (assq-ref filters 'system)
+       #:status (and=> (assq-ref filters 'status) object->string)
+       #:borderlowid (assq-ref filters 'border-low-id)
+       #:borderhighid (assq-ref filters 'border-high-id)
+       #:borderlowtime (assq-ref filters 'border-low-time)
+       #:borderhightime (assq-ref filters 'border-high-time)
+       #:nr (match (assq-ref filters 'nr)
+              (#f -1)
+              (x x)))
+      (sqlite-reset stmt)
+      (let loop ((rows (sqlite-fold-right cons '() stmt))
+                 (builds '()))
+        (match rows
+          (() (reverse builds))
+          ((#(derivation id timestamp starttime stoptime log status job-name
+                         system nix-name specification) . rest)
+           (loop rest
+                 (cons `((#:derivation . ,derivation)
+                         (#:id . ,id)
+                         (#:timestamp . ,timestamp)
+                         (#:starttime . ,starttime)
+                         (#:stoptime . ,stoptime)
+                         (#:log . ,log)
+                         (#:status . ,status)
+                         (#:job-name . ,job-name)
+                         (#:system . ,system)
+                         (#:nix-name . ,nix-name)
+                         (#:specification . ,specification)
+                         (#:outputs . ,(db-get-outputs derivation)))
+                       builds))))))))
+
+(define (db-get-build derivation-or-id)
+  "Retrieve a build in the database which corresponds to DERIVATION-OR-ID."
+  (with-db-critical-section db
+    (let ((key (if (number? derivation-or-id) 'id 'derivation)))
+      (match (db-get-builds `((,key . ,derivation-or-id)))
+        ((build)
+         build)
+        (() #f)))))
+
+(define (db-get-pending-derivations)
   "Return the list of derivation file names corresponding to pending builds in
-DB.  The returned list is guaranteed to not have any duplicates."
-  (map (match-lambda (#(drv) drv))
-       (sqlite-exec db "
-SELECT derivation FROM Builds WHERE Builds.status < 0;")))
-
-(define (db-get-stamp db spec)
-  "Return a stamp corresponding to specification SPEC in database DB."
-  (let ((res (sqlite-exec db "SELECT * FROM Stamps WHERE specification="
-                          (assq-ref spec #:name) ";")))
-    (match res
-      (() #f)
-      ((#(spec stamp)) stamp))))
-
-(define (db-add-stamp db spec stamp)
-  "Associate STAMP to specification SPEC in database DB."
-  (if (db-get-stamp db spec)
-      (sqlite-exec db "UPDATE Stamps SET stamp=" stamp
-                   "WHERE specification=" (assq-ref spec #:name) ";")
-      (sqlite-exec db "\
+the database.  The returned list is guaranteed to not have any duplicates."
+  (with-db-critical-section db
+    (map (match-lambda (#(drv) drv))
+         (sqlite-exec db "
+SELECT derivation FROM Builds WHERE Builds.status < 0;"))))
+
+(define (db-get-stamp spec)
+  "Return a stamp corresponding to specification SPEC in the database."
+  (with-db-critical-section db
+    (let ((res (sqlite-exec db "SELECT * FROM Stamps WHERE specification="
+                            (assq-ref spec #:name) ";")))
+      (match res
+        (() #f)
+        ((#(spec stamp)) stamp)))))
+
+(define (db-add-stamp spec stamp)
+  "Associate STAMP to specification SPEC in the database."
+  (with-db-critical-section db
+    (if (db-get-stamp spec)
+        (sqlite-exec db "UPDATE Stamps SET stamp=" stamp
+                     "WHERE specification=" (assq-ref spec #:name) ";")
+        (sqlite-exec db "\
 INSERT INTO Stamps (specification, stamp) VALUES ("
-                   (assq-ref spec #:name) ", " stamp ");")))
+                     (assq-ref spec #:name) ", " stamp ");"))))
 
-(define (db-get-evaluations db limit)
-  (let loop ((rows  (sqlite-exec db "SELECT id, specification, commits
+(define (db-get-evaluations limit)
+  (with-db-critical-section db
+    (let loop ((rows  (sqlite-exec db "SELECT id, specification, commits
 FROM Evaluations ORDER BY id DESC LIMIT " limit ";"))
-             (evaluations '()))
-    (match rows
-      (() (reverse evaluations))
-      ((#(id specification commits)
-        . rest)
-       (loop rest
-             (cons `((#:id . ,id)
-                     (#:specification . ,specification)
-                     (#:commits . ,(string-tokenize commits)))
-                   evaluations))))))
-
-(define (db-get-evaluations-build-summary db spec limit border-low border-high)
-  (let loop ((rows (sqlite-exec db "
+               (evaluations '()))
+      (match rows
+        (() (reverse evaluations))
+        ((#(id specification commits)
+           . rest)
+         (loop rest
+               (cons `((#:id . ,id)
+                       (#:specification . ,specification)
+                       (#:commits . ,(string-tokenize commits)))
+                     evaluations)))))))
+
+(define (db-get-evaluations-build-summary spec limit border-low border-high)
+  (with-db-critical-section db
+    (let loop ((rows (sqlite-exec db "
 SELECT E.id, E.commits, B.succeeded, B.failed, B.scheduled
 FROM
 (SELECT id, commits
@@ -578,50 +621,59 @@ FROM Builds
 GROUP BY evaluation) B
 ON B.evaluation=E.id
 ORDER BY E.id ASC;"))
-             (evaluations '()))
-    (match rows
-      (() evaluations)
-      ((#(id commits succeeded failed scheduled) . rest)
-       (loop rest
-             (cons `((#:id . ,id)
-                     (#:commits . ,commits)
-                     (#:succeeded . ,(or succeeded 0))
-                     (#:failed . ,(or failed 0))
-                     (#:scheduled . ,(or scheduled 0)))
-                   evaluations))))))
-
-(define (db-get-evaluations-id-min db spec)
+               (evaluations '()))
+      (match rows
+        (() evaluations)
+        ((#(id commits succeeded failed scheduled) . rest)
+         (loop rest
+               (cons `((#:id . ,id)
+                       (#:commits . ,commits)
+                       (#:succeeded . ,(or succeeded 0))
+                       (#:failed . ,(or failed 0))
+                       (#:scheduled . ,(or scheduled 0)))
+                     evaluations)))))))
+
+(define (db-get-evaluations-id-min spec)
   "Return the min id of evaluations for the given specification SPEC."
-  (let ((rows (sqlite-exec db "
+  (with-db-critical-section db
+    (let ((rows (sqlite-exec db "
 SELECT MIN(id) FROM Evaluations
 WHERE specification=" spec)))
-    (vector-ref (car rows) 0)))
+      (vector-ref (car rows) 0))))
 
-(define (db-get-evaluations-id-max db spec)
+(define (db-get-evaluations-id-max spec)
   "Return the max id of evaluations for the given specification SPEC."
-  (let ((rows (sqlite-exec db "
+  (with-db-critical-section db
+    (let ((rows (sqlite-exec db "
 SELECT MAX(id) FROM Evaluations
 WHERE specification=" spec)))
-    (vector-ref (car rows) 0)))
+      (vector-ref (car rows) 0))))
 
-(define (db-get-builds-min db eval)
+(define (db-get-builds-min eval)
   "Return the min build (stoptime, id) pair for
    the given evaluation EVAL."
-  (let ((rows (sqlite-exec db "
+  (with-db-critical-section db
+    (let ((rows (sqlite-exec db "
 SELECT stoptime, MIN(rowid) FROM
 (SELECT rowid, stoptime FROM Builds
 WHERE evaluation=" eval " AND
 stoptime = (SELECT MIN(stoptime)
 FROM Builds WHERE evaluation=" eval "))")))
-    (vector->list (car rows))))
+      (vector->list (car rows)))))
 
-(define (db-get-builds-max db eval)
+(define (db-get-builds-max eval)
   "Return the max build (stoptime, id) pair for
    the given evaluation EVAL."
-  (let ((rows (sqlite-exec db "
+  (with-db-critical-section db
+    (let ((rows (sqlite-exec db "
 SELECT stoptime, MAX(rowid) FROM
 (SELECT rowid, stoptime FROM Builds
 WHERE evaluation=" eval " AND
 stoptime = (SELECT MAX(stoptime)
 FROM Builds WHERE evaluation=" eval "))")))
-    (vector->list (car rows))))
+      (vector->list (car rows)))))
+
+;;; Local Variables:
+;;; eval: (put 'with-db-critical-section 'scheme-indent-function 1)
+;;; eval: (put 'with-database 'scheme-indent-function 0)
+;;; End:
diff --git a/src/cuirass/http.scm b/src/cuirass/http.scm
index 16bbda0..d70517b 100644
--- a/src/cuirass/http.scm
+++ b/src/cuirass/http.scm
@@ -103,17 +103,17 @@
     (#:releasename . #nil)
     (#:buildinputs_builds . #nil)))
 
-(define (handle-build-request db build-id)
-  "Retrieve build identified by BUILD-ID over DB and convert it
-  to hydra format. Return #f is not build was found."
-  (let ((build (db-get-build db build-id)))
+(define (handle-build-request build-id)
+  "Retrieve build identified by BUILD-ID over the database and convert it to
+hydra format. Return #f is not build was found."
+  (let ((build (db-get-build build-id)))
     (and=> build build->hydra-build)))
 
-(define (handle-builds-request db filters)
-  "Retrieve all builds matched by FILTERS in DB and convert them
-  to Hydra format."
+(define (handle-builds-request filters)
+  "Retrieve all builds matched by FILTERS in the database and convert them to
+Hydra format."
   (let ((builds (with-time-logging "builds request"
-                                   (db-get-builds db filters))))
+                                   (db-get-builds filters))))
     (map build->hydra-build builds)))
 
 (define (request-parameters request)
@@ -146,10 +146,10 @@
 (define (request-path-components request)
   (split-and-decode-uri-path (uri-path (request-uri request))))
 
-(define (url-handler request body db-channel)
+(define (url-handler request body)
 
-  (define* (respond response #:key body (db-channel db-channel))
-    (values response body db-channel))
+  (define* (respond response #:key body)
+    (values response body #f))
 
   (define-syntax-rule (respond-json body ...)
     (respond '((content-type . (application/json)))
@@ -213,19 +213,14 @@
              (request-path-components request)
              'method-not-allowed)
     (((or "jobsets" "specifications") . rest)
-     (respond-json (object->json-string
-                    (with-critical-section db-channel (db)
-                      (db-get-specifications db)))))
+     (respond-json (object->json-string (db-get-specifications))))
     (("build" build-id)
-     (let ((hydra-build
-            (with-critical-section db-channel (db)
-              (handle-build-request db (string->number build-id)))))
+     (let ((hydra-build (handle-build-request (string->number build-id))))
        (if hydra-build
            (respond-json (object->json-string hydra-build))
            (respond-build-not-found build-id))))
     (("build" build-id "log" "raw")
-     (let ((build (with-critical-section db-channel (db)
-                    (db-get-build db (string->number build-id)))))
+     (let ((build (db-get-build (string->number build-id))))
        (if build
            (match (assq-ref build #:outputs)
              (((_ (#:path . (? string? output))) _ ...)
@@ -250,9 +245,7 @@
             ;; 'nr parameter is mandatory to limit query size.
             (nr (assq-ref params 'nr)))
        (if nr
-           (respond-json (object->json-string
-                          (with-critical-section db-channel (db)
-                            (db-get-evaluations db nr))))
+           (respond-json (object->json-string (db-get-evaluations nr)))
            (respond-json-with-error 500 "Parameter not defined!"))))
     (("api" "latestbuilds")
      (let* ((params (request-parameters request))
@@ -262,10 +255,9 @@
            ;; Limit results to builds that are "done".
            (respond-json
             (object->json-string
-             (with-critical-section db-channel (db)
-               (handle-builds-request db `((status . done)
-                                           ,@params
-                                           (order . finish-time))))))
+             (handle-builds-request `((status . done)
+                                      ,@params
+                                      (order . finish-time)))))
            (respond-json-with-error 500 "Parameter not defined!"))))
     (("api" "queue")
      (let* ((params (request-parameters request))
@@ -276,77 +268,65 @@
             (object->json-string
              ;; Use the 'status+submission-time' order so that builds in
              ;; 'running' state appear before builds in 'scheduled' state.
-             (with-critical-section db-channel (db)
-               (handle-builds-request db `((status . pending)
-                                           ,@params
-                                           (order . status+submission-time))))))
+             (handle-builds-request `((status . pending)
+                                      ,@params
+                                      (order . status+submission-time)))))
            (respond-json-with-error 500 "Parameter not defined!"))))
     ('()
      (respond-html (html-page
                     "Cuirass"
-                    (specifications-table
-                     (with-critical-section db-channel (db)
-                       (db-get-specifications db))))))
+                    (specifications-table (db-get-specifications)))))
 
     (("jobset" name)
      (respond-html
-      (with-critical-section db-channel (db)
-        (let* ((evaluation-id-max (db-get-evaluations-id-max db name))
-               (evaluation-id-min (db-get-evaluations-id-min db name))
-               (params (request-parameters request))
-               (border-high (assq-ref params 'border-high))
-               (border-low (assq-ref params 'border-low))
-               (evaluations (db-get-evaluations-build-summary db
-                                                              name
-                                                              %page-size
-                                                              border-low
-                                                              border-high)))
-          (html-page name (evaluation-info-table name
-                                                 evaluations
-                                                 evaluation-id-min
-                                                 evaluation-id-max))))))
+      (let* ((evaluation-id-max (db-get-evaluations-id-max name))
+             (evaluation-id-min (db-get-evaluations-id-min name))
+             (params (request-parameters request))
+             (border-high (assq-ref params 'border-high))
+             (border-low (assq-ref params 'border-low))
+             (evaluations (db-get-evaluations-build-summary name
+                                                            %page-size
+                                                            border-low
+                                                            border-high)))
+        (html-page name (evaluation-info-table name
+                                               evaluations
+                                               evaluation-id-min
+                                               evaluation-id-max)))))
 
     (("eval" id)
      (respond-html
-      (with-critical-section db-channel (db)
-        (let* ((builds-id-max (db-get-builds-max db id))
-               (builds-id-min (db-get-builds-min db id))
-               (params (request-parameters request))
-               (border-high-time (assq-ref params 'border-high-time))
-               (border-low-time (assq-ref params 'border-low-time))
-               (border-high-id (assq-ref params 'border-high-id))
-               (border-low-id (assq-ref params 'border-low-id)))
-          (html-page
-           "Evaluation"
-           (build-eval-table
-            (handle-builds-request db `((evaluation . ,id)
-                                        (nr . ,%page-size)
-                                        (order . finish-time+build-id)
-                                        (border-high-time . ,border-high-time)
-                                        (border-low-time . ,border-low-time)
-                                        (border-high-id . ,border-high-id)
-                                        (border-low-id . ,border-low-id)))
-            builds-id-min
-            builds-id-max))))))
+      (let* ((builds-id-max (db-get-builds-max id))
+             (builds-id-min (db-get-builds-min id))
+             (params (request-parameters request))
+             (border-high-time (assq-ref params 'border-high-time))
+             (border-low-time (assq-ref params 'border-low-time))
+             (border-high-id (assq-ref params 'border-high-id))
+             (border-low-id (assq-ref params 'border-low-id)))
+        (html-page
+         "Evaluation"
+         (build-eval-table
+          (handle-builds-request `((evaluation . ,id)
+                                   (nr . ,%page-size)
+                                   (order . finish-time+build-id)
+                                   (border-high-time . ,border-high-time)
+                                   (border-low-time . ,border-low-time)
+                                   (border-high-id . ,border-high-id)
+                                   (border-low-id . ,border-low-id)))
+          builds-id-min
+          builds-id-max)))))
 
     (("static" path ...)
      (respond-static-file path))
     ('method-not-allowed
      ;; 405 "Method Not Allowed"
-     (values (build-response #:code 405) #f db-channel))
+     (values (build-response #:code 405) #f #f))
     (_
      (respond-not-found (uri->string (request-uri request))))))
 
-(define* (run-cuirass-server db #:key (host "localhost") (port 8080))
+(define* (run-cuirass-server #:key (host "localhost") (port 8080))
   (let* ((host-info  (gethostbyname host))
          (address    (inet-ntop (hostent:addrtype host-info)
-                                (car (hostent:addr-list host-info))))
-
-         ;; Spawn a fiber to process database queries sequentially.  We need
-         ;; this because guile-sqlite3 handles are not thread-safe (caching in
-         ;; particular), and creating one new handle for each request would be
-         ;; costly and may defeat statement caching.
-         (db-channel (make-critical-section db)))
+                                (car (hostent:addr-list host-info)))))
     (log-message "listening on ~A:~A" address port)
 
     ;; Here we use our own web backend, call 'fiberized'.  We cannot use the
@@ -371,7 +351,7 @@
           (spawn-fiber
            (lambda ()
              (let-values (((response body state)
-                           (handle-request (cut url-handler <> <> db-channel)
+                           (handle-request (cut url-handler <> <>)
                                            request body '())))
                (write-client impl server client response body)))))
         (loop)))))
diff --git a/src/cuirass/utils.scm b/src/cuirass/utils.scm
index 6083890..48e797c 100644
--- a/src/cuirass/utils.scm
+++ b/src/cuirass/utils.scm
@@ -103,17 +103,18 @@ then be passed to 'join-critical-section', which will ensure sequential
 ordering.  ARGS are the arguments of the critical section.
 
 Critical sections are implemented by passing the procedure to execute to a
-dedicated fiber."
-  (let ((channel (make-channel)))
-    (spawn-fiber
-     (lambda ()
-       (parameterize ((%critical-section-args args))
-         (let loop ()
-           (match (get-message channel)
-             (((? channel? reply) . (? procedure? proc))
-              (put-message reply (apply proc args))))
-           (loop)))))
-    channel))
+dedicated thread."
+  (parameterize (((@@ (fibers internal) current-fiber) #f))
+    (let ((channel (make-channel)))
+      (call-with-new-thread
+       (lambda ()
+         (parameterize ((%critical-section-args args))
+           (let loop ()
+             (match (get-message channel)
+               (((? channel? reply) . (? procedure? proc))
+                (put-message reply (apply proc args))))
+             (loop)))))
+      channel)))
 
 (define (call-with-critical-section channel proc)
   "Send PROC to the critical section through CHANNEL.  Return the result of
diff --git a/tests/database.scm b/tests/database.scm
index af518bd..cdc7872 100644
--- a/tests/database.scm
+++ b/tests/database.scm
@@ -21,6 +21,7 @@
 
 (use-modules (cuirass database)
              ((guix utils) #:select (call-with-temporary-output-file))
+             (cuirass utils)
              (srfi srfi-64))
 
 (define example-spec
@@ -61,12 +62,12 @@
     (#:log . "log")
     (#:outputs . (("foo" . "/foo")))))
 
-(define-syntax-rule (with-temporary-database db body ...)
+(define-syntax-rule (with-temporary-database body ...)
   (call-with-temporary-output-file
    (lambda (file port)
      (parameterize ((%package-database file))
        (db-init file)
-       (with-database db
+       (with-database
          body ...)))))
 
 (define %db
@@ -79,7 +80,10 @@
 
 (test-group-with-cleanup "database"
   (test-assert "db-init"
-    (%db (db-init database-name)))
+    (begin
+      (%db (db-init database-name))
+      (%db-channel (make-critical-section (%db)))
+      #t))
 
   (test-assert "sqlite-exec"
     (begin
@@ -94,41 +98,40 @@ INSERT INTO Evaluations (specification, commits) VALUES (3, 3);")
   (test-equal "db-add-specification"
     example-spec
     (begin
-      (db-add-specification (%db) example-spec)
-      (car (db-get-specifications (%db)))))
+      (db-add-specification example-spec)
+      (car (db-get-specifications))))
 
   (test-equal "db-add-build"
     #f
     (let ((build (make-dummy-build "/foo.drv")))
-      (db-add-build (%db) build)
+      (db-add-build build)
 
       ;; Should return #f when adding a build whose derivation is already
       ;; there, see <https://bugs.gnu.org/28094>.
-      (db-add-build (%db) build)))
+      (db-add-build build)))
 
   (test-equal "db-update-build-status!"
     (list (build-status scheduled)
           (build-status started)
           (build-status succeeded)
           "/foo.drv.log")
-    (with-temporary-database db
+    (with-temporary-database
       (let* ((derivation (db-add-build
-                          db
                           (make-dummy-build "/foo.drv" 1
                                             #:outputs '(("out" . "/foo")))))
              (get-status (lambda* (#:optional (key #:status))
-                           (assq-ref (db-get-build db derivation) key))))
-        (db-add-evaluation db (make-dummy-eval))
-        (db-add-specification db example-spec)
+                           (assq-ref (db-get-build derivation) key))))
+        (db-add-evaluation (make-dummy-eval))
+        (db-add-specification example-spec)
 
         (let ((status0 (get-status)))
-          (db-update-build-status! db "/foo.drv" (build-status started))
+          (db-update-build-status! "/foo.drv" (build-status started))
           (let ((status1 (get-status)))
-            (db-update-build-status! db "/foo.drv" (build-status succeeded)
+            (db-update-build-status! "/foo.drv" (build-status succeeded)
                                      #:log-file "/foo.drv.log")
 
             ;; Second call shouldn't make any difference.
-            (db-update-build-status! db "/foo.drv" (build-status succeeded)
+            (db-update-build-status! "/foo.drv" (build-status succeeded)
                                      #:log-file "/foo.drv.log")
 
             (let ((status2 (get-status))
@@ -144,61 +147,61 @@ INSERT INTO Evaluations (specification, commits) VALUES (3, 3);")
       ((3 "/baz.drv") (2 "/bar.drv") (1 "/foo.drv")) ;ditto
       ((3 "/baz.drv"))                               ;nr = 1
       ((2 "/bar.drv") (1 "/foo.drv") (3 "/baz.drv"))) ;status+submission-time
-    (with-temporary-database db
+    (with-temporary-database
       ;; Populate the 'Builds'', 'Evaluations', and
       ;; 'Specifications' tables in a consistent way, as expected by the
       ;; 'db-get-builds' query.
-      (db-add-build db (make-dummy-build "/foo.drv" 1
-                                         #:outputs `(("out" . "/foo"))))
-      (db-add-build db (make-dummy-build "/bar.drv" 2
-                                         #:outputs `(("out" . "/bar"))))
-      (db-add-build db (make-dummy-build "/baz.drv" 3
-                                         #:outputs `(("out" . "/baz"))))
-      (db-add-evaluation db (make-dummy-eval))
-      (db-add-evaluation db (make-dummy-eval))
-      (db-add-evaluation db (make-dummy-eval))
-      (db-add-specification db example-spec)
-
-      (db-update-build-status! db "/bar.drv" (build-status started)
+      (db-add-build (make-dummy-build "/foo.drv" 1
+                                      #:outputs `(("out" . "/foo"))))
+      (db-add-build (make-dummy-build "/bar.drv" 2
+                                      #:outputs `(("out" . "/bar"))))
+      (db-add-build (make-dummy-build "/baz.drv" 3
+                                      #:outputs `(("out" . "/baz"))))
+      (db-add-evaluation (make-dummy-eval))
+      (db-add-evaluation (make-dummy-eval))
+      (db-add-evaluation (make-dummy-eval))
+      (db-add-specification example-spec)
+
+      (db-update-build-status! "/bar.drv" (build-status started)
                                #:log-file "/bar.drv.log")
 
       (let ((summarize (lambda (alist)
                          (list (assq-ref alist #:id)
                                (assq-ref alist #:derivation)))))
-        (vector (map summarize (db-get-builds db '((nr . 3)
-                                                   (order . build-id))))
-                (map summarize (db-get-builds db '()))
-                (map summarize (db-get-builds db '((jobset . "guix"))))
-                (map summarize (db-get-builds db '((nr . 1))))
+        (vector (map summarize (db-get-builds '((nr . 3) (order . build-id))))
+                (map summarize (db-get-builds '()))
+                (map summarize (db-get-builds '((jobset . "guix"))))
+                (map summarize (db-get-builds '((nr . 1))))
                 (map summarize
-                     (db-get-builds
-                      db '((order . status+submission-time))))))))
+                     (db-get-builds '((order . status+submission-time))))))))
 
   (test-equal "db-get-pending-derivations"
     '("/bar.drv" "/foo.drv")
-    (with-temporary-database db
+    (with-temporary-database
       ;; Populate the 'Builds', 'Evaluations', and
       ;; 'Specifications' tables.  Here, two builds map to the same derivation
       ;; but the result of 'db-get-pending-derivations' must not contain any
       ;; duplicate.
-      (db-add-build db (make-dummy-build "/foo.drv" 1
-                                         #:outputs `(("out" . "/foo"))))
-      (db-add-build db (make-dummy-build "/bar.drv" 2
-                                         #:outputs `(("out" . "/bar"))))
-      (db-add-build db (make-dummy-build "/foo.drv" 3
-                                         #:outputs `(("out" . "/foo"))))
-      (db-add-evaluation db (make-dummy-eval))
-      (db-add-evaluation db (make-dummy-eval))
-      (db-add-evaluation db (make-dummy-eval))
-      (db-add-specification db example-spec)
-
-      (sort (db-get-pending-derivations db) string<?)))
+      (db-add-build (make-dummy-build "/foo.drv" 1
+                                      #:outputs `(("out" . "/foo"))))
+      (db-add-build (make-dummy-build "/bar.drv" 2
+                                      #:outputs `(("out" . "/bar"))))
+      (db-add-build (make-dummy-build "/foo.drv" 3
+                                      #:outputs `(("out" . "/foo"))))
+      (db-add-evaluation (make-dummy-eval))
+      (db-add-evaluation (make-dummy-eval))
+      (db-add-evaluation (make-dummy-eval))
+      (db-add-specification example-spec)
+
+      (sort (db-get-pending-derivations) string<?)))
 
   (test-assert "db-close"
     (db-close (%db)))
 
-  (delete-file database-name))
+  (begin
+    (%db-channel #f)
+    (delete-file database-name)))
 
 ;;; Local Variables:
-;;; eval: (put 'with-temporary-database 'scheme-indent-function 1)
+;;; eval: (put 'with-temporary-database 'scheme-indent-function 0)
 ;;; End:
diff --git a/tests/http.scm b/tests/http.scm
index a9fc3ef..38e4175 100644
--- a/tests/http.scm
+++ b/tests/http.scm
@@ -125,14 +125,17 @@
        json->scm)))
 
   (test-assert "db-init"
-    (%db (db-init database-name)))
+    (begin
+      (%db (db-init database-name))
+      (%db-channel (make-critical-section (%db)))
+      #t))
 
   (test-assert "cuirass-run"
     (call-with-new-thread
      (lambda ()
        (run-fibers
         (lambda ()
-          (run-cuirass-server (%db) #:port 6688))
+          (run-cuirass-server #:port 6688))
         #:drain? #t))))
 
   (test-assert "wait-server"
@@ -184,11 +187,11 @@
            (evaluation2
             '((#:specification . "guix")
               (#:commits . ("fakesha2" "fakesha3")))))
-      (db-add-build (%db) build1)
-      (db-add-build (%db) build2)
-      (db-add-specification (%db) specification)
-      (db-add-evaluation (%db) evaluation1)
-      (db-add-evaluation (%db) evaluation2)))
+      (db-add-build build1)
+      (db-add-build build2)
+      (db-add-specification specification)
+      (db-add-evaluation evaluation1)
+      (db-add-evaluation evaluation2)))
 
   (test-assert "/build/1"
     (hash-table=?
@@ -275,4 +278,6 @@
   (test-assert "db-close"
     (db-close (%db)))
 
-  (delete-file database-name))
+  (begin
+    (%db-channel #f)
+    (delete-file database-name)))
-- 
2.18.0

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Mon, 06 Aug 2018 19:36:02 GMT) Full text and rfc822 format available.

Message #31 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: [PATCH 2/2] database: Serialize all database accesses
 in a thread.
Date: Mon, 06 Aug 2018 21:35:08 +0200

Hi,

Clément Lassieur <clement <at> lassieur.org> writes:

[...]

>  bin/cuirass.in           |  23 +-
>  src/cuirass/base.scm     |  94 +++----
>  src/cuirass/database.scm | 550 +++++++++++++++++++++------------------
>  src/cuirass/http.scm     | 136 +++++-----
>  src/cuirass/utils.scm    |  23 +-
>  tests/database.scm       | 103 ++++----
>  tests/http.scm           |  21 +-
>  7 files changed, 494 insertions(+), 456 deletions(-)

Pros:
  - fix the fact that long SQL queries block the scheduler
  - simplify the code, and make it more uniform

Cons:
  - we don't take advantage of the SQLite multi-threading features
    anymore

I've run some tests, and I didn't see any difference in terms of
performance.

Clément

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Sun, 19 Aug 2018 14:07:02 GMT) Full text and rfc822 format available.

Message #34 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: [PATCH 2/2] database: Serialize all database accesses
 in a thread.
Date: Sun, 19 Aug 2018 16:06:03 +0200

Hi Clément!

Clément Lassieur <clement <at> lassieur.org> skribis:

> Fixes <https://bugs.gnu.org/32234>.
>
> * bin/cuirass.in (main): Keep only one WITH-DATABASE call around all fibers.
> Remove all DB arguments.
> * src/cuirass/base.scm (evaluate, update-build-statuses!, spawn-builds,
> handle-build-event, build-packages): Remove all DB arguments.
> (clear-build-queue, cancel-old-builds): Wrap in WITH-DB-CRITICAL-SECTION,
> remove all DB arguments.
> (restart-builds): Remove the NON-BLOCKING call, remove all DB arguments.
> (process-specs): Remove all DB arguments, remove the WITH-DATABASE call.
> * src/cuirass/database.scm (%db-channel): New parameter.
> (with-db-critical-section): New macro.
> (db-add-input, db-add-specification, db-get-inputs, db-get-specifications,
> db-add-evaluation, db-add-build, db-update-build-status!, db-get-outputs,
> db-get-builds, db-get-build, db-get-pending-derivations, db-get-stamp,
> db-add-stamp, db-get-evaluations, db-get-evaluations-build-summary,
> db-get-evaluations-id-min, db-get-evaluations-id-max, db-get-builds-min,
> db-get-builds-max): Wrap in WITH-DB-CRITICAL-SECTION, remove all DB arguments.
> (with-database): Wrap BODY in PARAMETERIZE form that sets %DB-CHANNEL to the
> channel returned by MAKE-CRITICAL-SECTION.
> * src/cuirass/http.scm (handle-build-request, handle-builds-request): Remove
> all DB arguments.
> (url-handler): Remove all DB arguments, remove the DB-CHANNEL state, remove
> the WITH-CRITICAL-SECTION calls.
> (run-cuirass-server): Remove the DB arguments, remove the
> MAKE-CRITICAL-SECTION call.
> * src/cuirass/utils.scm (make-critical-section): Replace SPAWN-FIBER with
> CALL-WITH-NEW-THREAD.  Wrap body in PARAMETERIZE form that clears
> CURRENT-FIBER.
> * tests/database.scm (with-temporary-database, db-add-specification,
> db-add-build, db-update-build-status!, db-get-builds,
> db-get-pending-derivations): Remove the DB arguments.
> (db-init): Set the %DB-CHANNEL parameter to the channel returned by
> MAKE-CRITICAL-SECTION, and return #t.
> (database): Set %DB-CHANNEL to #f during cleanup.
> * tests/http.scm (db-init): Set the %DB-CHANNEL parameter to the channel
> returned by MAKE-CRITICAL-SECTION, and return #t.
> (cuirass-run, fill-db): Remove the DB arguments.
> (http): Set %DB-CHANNEL to #f during cleanup.

Excellent, thanks for working on this!  This looks great to me, and I
think the pros outweigh the cons.  Did you check on a big database how
well it performs?

One comment:

> -(define* (handle-build-event db event)
> +(define* (handle-build-event event)
>    "Handle EVENT, a build event sexp as produced by 'build-event-output-port',
> -updating DB accordingly."
> +updating the database accordingly."

[...]

> +(define %db-channel
> +  (make-parameter #f))
> +
> +(define-syntax-rule (with-db-critical-section db exp ...)
> +  "Evaluate EXP... in the critical section corresponding to %DB-CHANNEL.
> +DB is bound to the argument of that critical section: the database
> +connection."
> +  (call-with-critical-section (%db-channel)
> +                              (lambda (db) exp ...)))
> +

I think I find it nicer to keep the ‘db’ parameter everywhere (except
that it’s now a channel instead of an actual database) rather than using
this global variable.

WDYT?

Really happy we have a solution to this problem!

Thank you,
Ludo’.

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Sun, 26 Aug 2018 14:08:01 GMT) Full text and rfc822 format available.

Message #37 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: [PATCH 2/2] database: Serialize all database accesses
 in a thread.
Date: Sun, 26 Aug 2018 16:07:12 +0200

Hi Ludovic,

Ludovic Courtès <ludo <at> gnu.org> writes:

> Excellent, thanks for working on this!  This looks great to me, and I
> think the pros outweigh the cons.  Did you check on a big database how
> well it performs?

Yes, I didn't see any difference.  When I use Berlin's database, it
works well but crashes quickly for another reason (lack of disk space I
think, and /tmp being tmpfs).

> One comment:
>
>> -(define* (handle-build-event db event)
>> +(define* (handle-build-event event)
>>    "Handle EVENT, a build event sexp as produced by 'build-event-output-port',
>> -updating DB accordingly."
>> +updating the database accordingly."
>
> [...]
>
>> +(define %db-channel
>> +  (make-parameter #f))
>> +
>> +(define-syntax-rule (with-db-critical-section db exp ...)
>> +  "Evaluate EXP... in the critical section corresponding to %DB-CHANNEL.
>> +DB is bound to the argument of that critical section: the database
>> +connection."
>> +  (call-with-critical-section (%db-channel)
>> +                              (lambda (db) exp ...)))
>> +
>
> I think I find it nicer to keep the ‘db’ parameter everywhere (except
> that it’s now a channel instead of an actual database) rather than using
> this global variable.
>
> WDYT?

That 'db' parameter made sense before, because there were different
database connections: one per fiber.  But now that there is only one
global channel accessible from everywhere, I can't find any use for a
'db-channel' parameter.

Also, using two differents channels for the same database would be a
bug, it would break the serialization mechanism.

And I don't think using several databases (with one channel per
database) would make sense either.

WDYT?

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Sun, 26 Aug 2018 21:17:01 GMT) Full text and rfc822 format available.

Message #40 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: [PATCH 2/2] database: Serialize all database accesses
 in a thread.
Date: Sun, 26 Aug 2018 23:16:23 +0200

Hi Clément,

Clément Lassieur <clement <at> lassieur.org> skribis:

> Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Excellent, thanks for working on this!  This looks great to me, and I
>> think the pros outweigh the cons.  Did you check on a big database how
>> well it performs?
>
> Yes, I didn't see any difference.  When I use Berlin's database, it
> works well but crashes quickly for another reason (lack of disk space I
> think, and /tmp being tmpfs).

Sounds good (not that it crashes, but that you didn’t see any
difference ;-)).

>> I think I find it nicer to keep the ‘db’ parameter everywhere (except
>> that it’s now a channel instead of an actual database) rather than using
>> this global variable.
>>
>> WDYT?
>
> That 'db' parameter made sense before, because there were different
> database connections: one per fiber.  But now that there is only one
> global channel accessible from everywhere, I can't find any use for a
> 'db-channel' parameter.
>
> Also, using two differents channels for the same database would be a
> bug, it would break the serialization mechanism.
>
> And I don't think using several databases (with one channel per
> database) would make sense either.

These are all good points, indeed.  I’m mildly reluctant to the global
parameter, but if you prefer it that way, I don’t mind; the end result
matters more than this tiny issue anyway!

So: LGTM.

Thank you!

Ludo’.

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Mon, 27 Aug 2018 12:42:01 GMT) Full text and rfc822 format available.

Message #43 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: Clément Lassieur <clement <at> lassieur.org>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: [PATCH 2/2] database: Serialize all database
 accesses in a thread.
Date: Mon, 27 Aug 2018 14:41:48 +0200

[Message part 1 (text/plain, inline)]

Hi Clément,

in the future I plan on making the actual bin/evaluate use another database connection
in order for the web interface to be isolated while it's querying.

Otherwise - as it is now in master - it can happen that while you are querying one
page, half of the things have different values than you requested - which is really
weird.

For example right now you could query for a row with status=42 and get back data with
status=43 (because it has been changed in the mean time).

It's better to have serializable transaction isolation to prevent this.  That means
that each connection will have its own worldview that is fixed at the beginning of
the connection's transaction.  The worldview will update only once a new transaction
starts.

Therefore, it would be good for writers to have their own connection in the long run
(really, for the readers to have their own connection - but that comes out the same).

So it would be good to keep this in mind.

[Message part 2 (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Mon, 27 Aug 2018 13:19:02 GMT) Full text and rfc822 format available.

Message #46 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: [PATCH 2/2] database: Serialize all database accesses
 in a thread.
Date: Mon, 27 Aug 2018 15:18:09 +0200

Hi Danny,

Danny Milosavljevic <dannym <at> scratchpost.org> writes:

> Hi Clément,
>
> in the future I plan on making the actual bin/evaluate use another database connection
> in order for the web interface to be isolated while it's querying.

I don't understand... bin/evaluate doesn't query the database at all.  I
don't know why it would need to.

> Otherwise - as it is now in master - it can happen that while you are querying one
> page, half of the things have different values than you requested - which is really
> weird.
>
> For example right now you could query for a row with status=42 and get back data with
> status=43 (because it has been changed in the mean time).

Could you please show an example that I can reproduce?  I don't
understand what you mean.

> It's better to have serializable transaction isolation to prevent this.  That means
> that each connection will have its own worldview that is fixed at the beginning of
> the connection's transaction.  The worldview will update only once a new transaction
> starts.

With that patch, database queries are serialized, which means that if
query1, query2 and query3 are sent in that order, they will be executed
in that order, one after the other.  I don't understand why using a
different connection would improve things.

> Therefore, it would be good for writers to have their own connection in the long run
> (really, for the readers to have their own connection - but that comes out the same).

Currently, there is only one connection that is shared by the writers
and readers.  Having one 'read connection' and one 'write connection'
would be possible and make sense if both of them live in a separate
thread and use the SQLite multithreading feature so that writing and
reading proceed concurrently.  Is that what you mean?

Clément

Reply sent to Clément Lassieur <clement <at> lassieur.org>:
You have taken responsibility. (Mon, 27 Aug 2018 13:48:03 GMT) Full text and rfc822 format available.

Notification sent to Clément Lassieur <clement <at> lassieur.org>:
bug acknowledged by developer. (Mon, 27 Aug 2018 13:48:03 GMT) Full text and rfc822 format available.

Message #51 received at 32234-done <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Ludovic Courtès <ludo <at> gnu.org>
Cc: 32234-done <at> debbugs.gnu.org
Subject: Re: bug#32234: [PATCH 2/2] database: Serialize all database accesses
 in a thread.
Date: Mon, 27 Aug 2018 15:47:27 +0200

Ludovic Courtès <ludo <at> gnu.org> writes:

> Hi Clément,
>
> Clément Lassieur <clement <at> lassieur.org> skribis:
>
>> Ludovic Courtès <ludo <at> gnu.org> writes:
>>
>>> Excellent, thanks for working on this!  This looks great to me, and I
>>> think the pros outweigh the cons.  Did you check on a big database how
>>> well it performs?
>>
>> Yes, I didn't see any difference.  When I use Berlin's database, it
>> works well but crashes quickly for another reason (lack of disk space I
>> think, and /tmp being tmpfs).
>
> Sounds good (not that it crashes, but that you didn’t see any
> difference ;-)).
>
>>> I think I find it nicer to keep the ‘db’ parameter everywhere (except
>>> that it’s now a channel instead of an actual database) rather than using
>>> this global variable.
>>>
>>> WDYT?
>>
>> That 'db' parameter made sense before, because there were different
>> database connections: one per fiber.  But now that there is only one
>> global channel accessible from everywhere, I can't find any use for a
>> 'db-channel' parameter.
>>
>> Also, using two differents channels for the same database would be a
>> bug, it would break the serialization mechanism.
>>
>> And I don't think using several databases (with one channel per
>> database) would make sense either.
>
> These are all good points, indeed.  I’m mildly reluctant to the global
> parameter, but if you prefer it that way, I don’t mind; the end result
> matters more than this tiny issue anyway!
>
> So: LGTM.
>
> Thank you!
>
> Ludo’.

Ok, pushed!

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Mon, 27 Aug 2018 14:25:02 GMT) Full text and rfc822 format available.

Message #54 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: Danny Milosavljevic <dannym <at> scratchpost.org>
To: Clément Lassieur <clement <at> lassieur.org>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: [PATCH 2/2] database: Serialize all database
 accesses in a thread.
Date: Mon, 27 Aug 2018 16:23:53 +0200

[Message part 1 (text/plain, inline)]

Hi Clément,

I've read through the patch and it seems to handle the case I mean fine because
you support an arbitrary number of queries per db critical section - so I agree
that this patchset is fine.

Keep in mind this is only fine if the critical section is held over an entire http
request handler and not only over a single database query (as far as I can see
the former is the case in the patch - OK). 

Much longer explanation follows:

On Mon, 27 Aug 2018 15:18:09 +0200
Clément Lassieur <clement <at> lassieur.org> wrote:

> Danny Milosavljevic <dannym <at> scratchpost.org> writes:
> 
> > Hi Clément,
> >
> > in the future I plan on making the actual bin/evaluate use another database connection
> > in order for the web interface to be isolated while it's querying.  
> 
> I don't understand... bin/evaluate doesn't query the database at all.  I
> don't know why it would need to.

Yeah, it has moved.  Sorry.

But I mean the part that changes the values in the database (on behalf of bin/evaluate).
So now it's the procedure "evaluate" in src/cuirass/base.scm .

> > Otherwise - as it is now in master - it can happen that while you are querying one
> > page, half of the things have different values than you requested - which is really
> > weird.
> >
> > For example right now you could query for a row with status=42 and get back data with
> > status=43 (because it has been changed in the mean time).  
> 
> Could you please show an example that I can reproduce?  I don't
> understand what you mean.

Right now something like this happens (simplified to make it easier to follow - finding
the problem by debugging the Javascript frontend (wrongly) was much more difficult):

Connection 1:

Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1

Connection 2:

... Wait some time until the user sends a request...
Query: SELECT x FROM a
Result: Nondeterministic number
Query: SELECT x FROM a
Result: Nondeterministic possibly different number (WTF!!!!!)

This is especially bad if you request extra data from other tables in an extra
query and the join condition suddenly doesn't match (and thus the row isn't
returned!).

Better would be if it acted like this:

Connection 1:

Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1
Statement: UPDATE a SET x = x + 1

Connection 2:

... Wait some time until the user sends a request...
Statement: BEGIN TRANSACTION
Statement: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
Query: SELECT x FROM a
Result: Some number
Query: SELECT x FROM a
Result: The same number
... wait however long you want
Query: SELECT x FROM a
Result: The same number
Statement: ROLLBACK TRANSACTION or COMMIT TRANSACTION

loop

Statement: BEGIN TRANSACTION
Statement: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
Query: SELECT x FROM a
Result: Some possibly different number xxx
Query: SELECT x FROM a
Result: The same number xxx as in the previous query
Query: SELECT x FROM a
Result: The same number xxx as in the previous query
...

> With that patch, database queries are serialized, which means that if
> query1, query2 and query3 are sent in that order, they will be executed
> in that order, one after the other.

It depends on what exactly that means.  As long as it means that the
entire HTTP request handler is ONE query that is ordered such, that's fine.

Otherwise not.

If there are more complicated multiple queries done by the web interface
on behalf of the user because of one HTTP request, we have to make sure
that those queries execute without any interleaving by some writer.

As a stopgap, this database query serializer should let the user batch
the queries/statements and run each batch in its own transaction.
I think that would be quite okay as a solution, actually, as long as
there are no other shadow clients of the database.

> Currently, there is only one connection that is shared by the writers
> and readers.  Having one 'read connection' and one 'write connection'
> would be possible and make sense if both of them live in a separate
> thread and use the SQLite multithreading feature so that writing and
> reading proceed concurrently.  Is that what you mean?

No.

[Message part 2 (application/pgp-signature, inline)]

Information forwarded to bug-guix <at> gnu.org:
bug#32234; Package guix. (Mon, 27 Aug 2018 15:06:01 GMT) Full text and rfc822 format available.

Message #57 received at 32234 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Danny Milosavljevic <dannym <at> scratchpost.org>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 32234 <at> debbugs.gnu.org
Subject: Re: bug#32234: [PATCH 2/2] database: Serialize all database accesses
 in a thread.
Date: Mon, 27 Aug 2018 17:05:03 +0200

Thank you for the explanation Danny.

Indeed I didn't fix what you described.  That could be done easily by
wrapping the handler with WITH-DB-CRITICAL-SECTION.  I'm not sure about
the consequences in terms of performance, given that this will send a
huge function to a channel, and that all the work will be done in the
same thread.  If you think it's worth it, don't hesitate to send a
patch.

Clément

Danny Milosavljevic <dannym <at> scratchpost.org> writes:

> Hi Clément,
>
> I've read through the patch and it seems to handle the case I mean fine because
> you support an arbitrary number of queries per db critical section - so I agree
> that this patchset is fine.
>
> Keep in mind this is only fine if the critical section is held over an entire http
> request handler and not only over a single database query (as far as I can see
> the former is the case in the patch - OK). 
>
> Much longer explanation follows:
>
> On Mon, 27 Aug 2018 15:18:09 +0200
> Clément Lassieur <clement <at> lassieur.org> wrote:
>
>> Danny Milosavljevic <dannym <at> scratchpost.org> writes:
>> 
>> > Hi Clément,
>> >
>> > in the future I plan on making the actual bin/evaluate use another database connection
>> > in order for the web interface to be isolated while it's querying.  
>> 
>> I don't understand... bin/evaluate doesn't query the database at all.  I
>> don't know why it would need to.
>
> Yeah, it has moved.  Sorry.
>
> But I mean the part that changes the values in the database (on behalf of bin/evaluate).
> So now it's the procedure "evaluate" in src/cuirass/base.scm .
>
>> > Otherwise - as it is now in master - it can happen that while you are querying one
>> > page, half of the things have different values than you requested - which is really
>> > weird.
>> >
>> > For example right now you could query for a row with status=42 and get back data with
>> > status=43 (because it has been changed in the mean time).  
>> 
>> Could you please show an example that I can reproduce?  I don't
>> understand what you mean.
>
> Right now something like this happens (simplified to make it easier to follow - finding
> the problem by debugging the Javascript frontend (wrongly) was much more difficult):
>
> Connection 1:
>
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
>
> Connection 2:
>
> ... Wait some time until the user sends a request...
> Query: SELECT x FROM a
> Result: Nondeterministic number
> Query: SELECT x FROM a
> Result: Nondeterministic possibly different number (WTF!!!!!)
>
> This is especially bad if you request extra data from other tables in an extra
> query and the join condition suddenly doesn't match (and thus the row isn't
> returned!).
>
>
> Better would be if it acted like this:
>
> Connection 1:
>
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
> Statement: UPDATE a SET x = x + 1
>
> Connection 2:
>
> ... Wait some time until the user sends a request...
> Statement: BEGIN TRANSACTION
> Statement: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
> Query: SELECT x FROM a
> Result: Some number
> Query: SELECT x FROM a
> Result: The same number
> ... wait however long you want
> Query: SELECT x FROM a
> Result: The same number
> Statement: ROLLBACK TRANSACTION or COMMIT TRANSACTION
>
> loop
>
> Statement: BEGIN TRANSACTION
> Statement: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
> Query: SELECT x FROM a
> Result: Some possibly different number xxx
> Query: SELECT x FROM a
> Result: The same number xxx as in the previous query
> Query: SELECT x FROM a
> Result: The same number xxx as in the previous query
> ...
>
>> With that patch, database queries are serialized, which means that if
>> query1, query2 and query3 are sent in that order, they will be executed
>> in that order, one after the other.
>
> It depends on what exactly that means.  As long as it means that the
> entire HTTP request handler is ONE query that is ordered such, that's fine.
>
> Otherwise not.
>
> If there are more complicated multiple queries done by the web interface
> on behalf of the user because of one HTTP request, we have to make sure
> that those queries execute without any interleaving by some writer.
>
> As a stopgap, this database query serializer should let the user batch
> the queries/statements and run each batch in its own transaction.
> I think that would be quite okay as a solution, actually, as long as
> there are no other shadow clients of the database.
>
>> Currently, there is only one connection that is shared by the writers
>> and readers.  Having one 'read connection' and one 'write connection'
>> would be possible and make sense if both of them live in a separate
>> thread and use the SQLite multithreading feature so that writing and
>> reading proceed concurrently.  Is that what you mean?
>
> No.

bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Tue, 25 Sep 2018 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 5 years and 214 days ago.

Previous Next

GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.

GNU bug report logs - #32234 Cuirass: The SQLite built in busy handler might block the Fibers scheduler

GNU bug report logs - #32234
Cuirass: The SQLite built in busy handler might block the Fibers scheduler