GNU bug report logs - #32190
Cuirass doesn't check if two subsequent jobs yield the same derivation

Previous Next

Package: guix;

Reported by: Clément Lassieur <clement <at> lassieur.org>

Date: Tue, 17 Jul 2018 19:32:01 UTC

Severity: normal

Done: Clément Lassieur <clement <at> lassieur.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 32190 in the body.
You can then email your comments to 32190 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guix <at> gnu.org:
bug#32190; Package guix. (Tue, 17 Jul 2018 19:32:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to Clément Lassieur <clement <at> lassieur.org>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Tue, 17 Jul 2018 19:32:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Ricardo Wurmus <rekado <at> elephly.net>, bug-guix <at> gnu.org
Cc: guix-devel <at> gnu.org, Tatiana Sholokhova <tanja201396 <at> gmail.com>
Subject: Re: GSoC: Adding a web interface similar to the Hydra web interface
Date: Tue, 17 Jul 2018 21:31:15 +0200
Ricardo Wurmus <rekado <at> elephly.net> writes:

> You can download a copy of the Cuirass database as it is used on
> berlin.guixsd.org, one of the build farms of the Guix project.  I have
> copied it here:
>
>     http://bootstrappable.org/cuirass.db
>
> It is 12G(!), which indicates that Cuirass adds way too many entries
> than absolutely needed.  Ludovic wrote on IRC that we don’t seem to
> check if a record already exists when two subsequent evaluations yield
> the same build.

Adding bug-guix <at> gnu.org to keep track of that bug.




Changed bug title to 'Cuirass doesn't check if two subsequent evaluations yield the same build' from 'GSoC: Adding a web interface similar to the Hydra web interface' Request was from clement <at> lassieur.org (Clément Lassieur) to control <at> debbugs.gnu.org. (Tue, 17 Jul 2018 20:04:02 GMT) Full text and rfc822 format available.

Changed bug title to 'Cuirass doesn't check if two subsequent jobs yield the same derivation' from 'Cuirass doesn't check if two subsequent evaluations yield the same build' Request was from clement <at> lassieur.org (Clément Lassieur) to control <at> debbugs.gnu.org. (Tue, 17 Jul 2018 22:27:01 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#32190; Package guix. (Tue, 17 Jul 2018 22:33:01 GMT) Full text and rfc822 format available.

Message #12 received at 32190 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: 32190 <at> debbugs.gnu.org
Subject: Cuirass doesn't check if two subsequent jobs yield the same derivation
Date: Wed, 18 Jul 2018 00:32:02 +0200
Consider the following table:

--8<---------------cut here---------------start------------->8---
CREATE TABLE Derivations (
  derivation    TEXT NOT NULL,
  evaluation    INTEGER NOT NULL,
  job_name      TEXT NOT NULL,
  system        TEXT NOT NULL,
  nix_name      TEXT NOT NULL,
  PRIMARY KEY (derivation, evaluation),
  FOREIGN KEY (evaluation) REFERENCES Evaluations (id)
);
--8<---------------cut here---------------end--------------->8---

And the following code:

--8<---------------cut here---------------start------------->8---
(define (db-add-derivation db job)
  "Store a derivation result in database DB and return its ID."
  (catch 'sqlite-error
    (lambda ()
      (sqlite-exec db "\
INSERT INTO Derivations (derivation, job_name, system, nix_name, evaluation)\
  VALUES ("
                   (assq-ref job #:derivation) ", "
                   (assq-ref job #:job-name) ", "
                   (assq-ref job #:system) ", "
                   (assq-ref job #:nix-name) ", "
                   (assq-ref job #:eval-id) ");")
      (last-insert-rowid db))
    (lambda (key who code message . rest)
      ;; If we get a unique-constraint-failed error, that means we have
      ;; already inserted the same (derivation,eval-id) tuple.  That happens
      ;; when several jobs produce the same derivation, and we can ignore it.
      (if (= code SQLITE_CONSTRAINT_PRIMARYKEY)
          (sqlite-exec db "SELECT * FROM Derivations WHERE derivation="
                       (assq-ref job #:derivation) ";")
          (apply throw key who code rest)))))
--8<---------------cut here---------------end--------------->8---

I think the above constraint can't happen because by definition a new
job (for the same job_name) is produced at each evaluation.  So eval-id
will be incremented every time.

Also, the docs (and a comment in schema.sql) says:

    Builds are not in a one to one relationship with derivations in
    order to keep track of non deterministic compilations.

But I think it doesn't make sense, because Guix won't try to build twice
the same thing unless '--check' is used (which obviously isn't the
case).

So not only we have a huge Derivations table full of identical items,
but we also ask Guix to build them and we store the results in the
Builds table...

Maybe the solution is to replace the (derivation, evaluation) primary
key with (derivation), and only build the newly added derivations.
WDYT?

Clément




Information forwarded to bug-guix <at> gnu.org:
bug#32190; Package guix. (Tue, 24 Jul 2018 10:06:01 GMT) Full text and rfc822 format available.

Message #15 received at 32190 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 32190 <at> debbugs.gnu.org
Subject: Re: bug#32190: Cuirass doesn't check if two subsequent jobs yield the
 same derivation
Date: Tue, 24 Jul 2018 12:05:35 +0200
Hi Clément,

Clément Lassieur <clement <at> lassieur.org> skribis:

> Consider the following table:
>
> CREATE TABLE Derivations (
>   derivation    TEXT NOT NULL,
>   evaluation    INTEGER NOT NULL,
>   job_name      TEXT NOT NULL,
>   system        TEXT NOT NULL,
>   nix_name      TEXT NOT NULL,
>   PRIMARY KEY (derivation, evaluation),
>   FOREIGN KEY (evaluation) REFERENCES Evaluations (id)
> );
>
>
> And the following code:
>
> (define (db-add-derivation db job)
>   "Store a derivation result in database DB and return its ID."
>   (catch 'sqlite-error
>     (lambda ()
>       (sqlite-exec db "\
> INSERT INTO Derivations (derivation, job_name, system, nix_name, evaluation)\
>   VALUES ("
>                    (assq-ref job #:derivation) ", "
>                    (assq-ref job #:job-name) ", "
>                    (assq-ref job #:system) ", "
>                    (assq-ref job #:nix-name) ", "
>                    (assq-ref job #:eval-id) ");")
>       (last-insert-rowid db))
>     (lambda (key who code message . rest)
>       ;; If we get a unique-constraint-failed error, that means we have
>       ;; already inserted the same (derivation,eval-id) tuple.  That happens
>       ;; when several jobs produce the same derivation, and we can ignore it.
>       (if (= code SQLITE_CONSTRAINT_PRIMARYKEY)
>           (sqlite-exec db "SELECT * FROM Derivations WHERE derivation="
>                        (assq-ref job #:derivation) ";")
>           (apply throw key who code rest)))))
>
> I think the above constraint can't happen because by definition a new
> job (for the same job_name) is produced at each evaluation.  So eval-id
> will be incremented every time.

I added it at the time because it did happen.  In a given eval, there
can be two jobs producing the same derivation (for instance a job called
“gcc” produces xyz-gcc-5.5.0.drv, and a job called “gcc-5.5.0” produces
the very same xyz-gcc-5.5.0.drv.)

> Also, the docs (and a comment in schema.sql) says:
>
>     Builds are not in a one to one relationship with derivations in
>     order to keep track of non deterministic compilations.
>
> But I think it doesn't make sense, because Guix won't try to build twice
> the same thing unless '--check' is used (which obviously isn't the
> case).

The rationale (that was back in Mathieu’s GSoC) was that sometimes, you
can have several builds logs for one derivation.  In Hydra this happens
if a build fails for some non-deterministic reason and then you click on
“Restart” in the hope that it’ll succeed this time.  ;-)  In this
situation Hydra keeps both build logs IIRC.

Anyway, I lean towards keeping only one build log, at least for now,
which is what guix-daemon does.

> So not only we have a huge Derivations table full of identical items,
> but we also ask Guix to build them and we store the results in the
> Builds table...
>
> Maybe the solution is to replace the (derivation, evaluation) primary
> key with (derivation), and only build the newly added derivations.
> WDYT?

I agree, we don’t need all these identical items, it makes no sense.

You can go ahead and clean that up!  ;-)

Thank you,
Ludo’.




Added indication that bug 32190 blocks32331 Request was from clement <at> lassieur.org (Clément Lassieur) to control <at> debbugs.gnu.org. (Tue, 31 Jul 2018 10:00:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guix <at> gnu.org:
bug#32190; Package guix. (Sat, 04 Aug 2018 16:04:01 GMT) Full text and rfc822 format available.

Message #20 received at 32190 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: 32190 <at> debbugs.gnu.org
Subject: [PATCH] database: Merge Derivations into Builds table.
Date: Sat,  4 Aug 2018 18:03:03 +0200
Fixes <https://bugs.gnu.org/32190>.

* Makefile.am (dist_sql_DATA): Add 'src/sql/upgrade-2.sql'.
* src/cuirass/base.scm (evaluate): Don't add jobs to the Derivations table.
* src/cuirass/database.scm (db-add-derivation, db-get-derivation): Remove
exported procedures.
(db-add-build): Catch SQLITE_CONSTRAINT_PRIMARYKEY error, which means that two
jobs produced the same derivation, and return #f in that case.  Add columns
that were in the Derivations table before.  Use 'derivation' as primary key
for the Outputs table.
(db-get-outputs): Use 'derivation' as identifier, instead of 'build-id'.
(filters->order): Replace 'id' with 'rowid'.
(db-get-builds): Add a 'derivation' filter.  Replace 'id' with 'rowid'.
Remove the 'INNER JOIN Derivations'.  Replace Derivations with Builds.  Return
'derivation' in first position to make it clear that it's the primary key.
Pass DERIVATION instead of ID to DB-GET-OUTPUTS.
(db-get-build): Allow to take a derivation as argument.  Use NUMBER? to
differentiate between derivations and ids.
(db-get-pending-derivations): Remove the 'SELECT DISTINCT' clause now that
derivations are unique.  Remove the 'INNER JOIN Builds'.
(db-get-evaluations-build-summary, db-get-builds-min, db-get-builds-max):
Replace 'id' with 'rowid'.
* src/schema.sql (Derivations): Remove table.
(Outputs): Replace Builds.id with Builds.derivation.
(Builds): Use 'derivation' as primary key.  Remove the 'id' column.  Add
'job_name', 'system', 'nix_name' columns that were in the Derivations table
before.
(Builds_Derivations_index): Rename to Builds_index.  Update accordingly.
(Derivations_index): Remove index.
* src/sql/upgrade-2.sql: New file with SQL queries to upgrade the database.
* tests/database.scm (make-dummy-job, make-dummy-derivation): Remove
procedures.
(make-dummy-build): Add columns that were in MAKE-DUMMY-DERIVATION.  Get the
DRV parameter to be mandatory because it's a primary key.
(%id): Remove parameter.
(db-add-derivation, db-get-derivation): Remove tests.
(db-add-build): Expect #f, because it adds twice the same derivation.  Pass
the derivation argument to MAKE-DUMMY-BUILD.
(db-update-build-status!): Rename 'id' to 'derivation'.  Pass the derivation
argument to MAKE-DUMMY-BUILD.  Remove the DB-ADD-DERIVATION call.
(db-get-builds, db-get-pending-derivations): Pass the derivation argument to
MAKE-DUMMY-BUILD.  Remove the DB-ADD-DERIVATION calls.
* tests/http.scm (fill-db): Remove DERIVATION1 and DERIVATION2, and put
their content in BUILD1 and BUILD2.  Remove the DB-ADD-DERIVATION calls.
---
 Makefile.am              |   3 +-
 src/cuirass/base.scm     |  23 +++---
 src/cuirass/database.scm | 174 +++++++++++++++++----------------------
 src/schema.sql           |  28 ++-----
 src/sql/upgrade-2.sql    |  49 +++++++++++
 tests/database.scm       |  78 ++++++------------
 tests/http.scm           |  20 ++---
 7 files changed, 176 insertions(+), 199 deletions(-)
 create mode 100644 src/sql/upgrade-2.sql

diff --git a/Makefile.am b/Makefile.am
index ac22601..db56165 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -65,7 +65,8 @@ nodist_webobject_DATA =				\
 dist_pkgdata_DATA = src/schema.sql
 
 dist_sql_DATA = 				\
-  src/sql/upgrade-1.sql
+  src/sql/upgrade-1.sql				\
+  src/sql/upgrade-2.sql
 
 dist_css_DATA =					\
   src/static/css/bootstrap.css			\
diff --git a/src/cuirass/base.scm b/src/cuirass/base.scm
index 82f49a4..26a5996 100644
--- a/src/cuirass/base.scm
+++ b/src/cuirass/base.scm
@@ -280,11 +280,9 @@ Return a list of jobs."
                         db `((#:specification . ,spec-name)
                              (#:commits . ,commits)))))
          (log-message "created evaluation ~a for '~a'" eval-id spec-name)
-         (let ((jobs (map (lambda (job)
-                            (augment-job job eval-id))
-                          jobs)))
-           (for-each (cut db-add-derivation db <>) jobs)
-           jobs))))))
+         (map (lambda (job)
+                (augment-job job eval-id))
+              jobs))))))
 
 
 ;;;
@@ -546,6 +544,9 @@ procedure is meant to be called at startup."
     (let* ((name     (assq-ref job #:job-name))
            (drv      (assq-ref job #:derivation))
            (eval-id  (assq-ref job #:eval-id))
+           (job-name (assq-ref job #:job-name))
+           (system   (assq-ref job #:system))
+           (nix-name (assq-ref job #:nix-name))
            ;; XXX: How to keep logs from several attempts?
            (log      (log-file store drv))
            (outputs  (filter-map (lambda (res)
@@ -556,6 +557,9 @@ procedure is meant to be called at startup."
            (cur-time (time-second (current-time time-utc))))
       (let ((build `((#:derivation . ,drv)
                      (#:eval-id . ,eval-id)
+                     (#:job-name . ,job-name)
+                     (#:system . ,system)
+                     (#:nix-name . ,nix-name)
 
                      ;; XXX: We'd leave LOG to #f (i.e., NULL) but that
                      ;; currently violates the non-NULL constraint.
@@ -568,13 +572,12 @@ procedure is meant to be called at startup."
                      (#:stoptime . 0))))
         (db-add-build db build))))
 
-  (define build-ids
-    (map register jobs))
+  (define derivations
+    (filter-map register jobs))
 
-  (spawn-builds store db
-                (map (cut assq-ref <> #:derivation) jobs))
+  (spawn-builds store db derivations)
 
-  (let* ((results (filter-map (cut db-get-build db <>) build-ids))
+  (let* ((results (filter-map (cut db-get-build db <>) derivations))
          (status (map (cut assq-ref <> #:status) results))
          (success (count (lambda (status)
                            (= status (build-status succeeded)))
diff --git a/src/cuirass/database.scm b/src/cuirass/database.scm
index b4b1652..7788ac9 100644
--- a/src/cuirass/database.scm
+++ b/src/cuirass/database.scm
@@ -41,8 +41,6 @@
             db-add-stamp
             db-get-stamp
             db-add-evaluation
-            db-add-derivation
-            db-get-derivation
             db-get-pending-derivations
             build-status
             db-add-build
@@ -312,32 +310,6 @@ package_path_inputs, proc_input, proc_file, proc, proc_args) \
                      (#:inputs . ,(db-get-inputs db name)))
                    specs))))))
 
-(define (db-add-derivation db job)
-  "Store a derivation result in database DB and return its ID."
-  (catch 'sqlite-error
-    (lambda ()
-      (sqlite-exec db "\
-INSERT INTO Derivations (derivation, job_name, system, nix_name, evaluation)\
-  VALUES ("
-                   (assq-ref job #:derivation) ", "
-                   (assq-ref job #:job-name) ", "
-                   (assq-ref job #:system) ", "
-                   (assq-ref job #:nix-name) ", "
-                   (assq-ref job #:eval-id) ");")
-      (last-insert-rowid db))
-    (lambda (key who code message . rest)
-      ;; If we get a unique-constraint-failed error, that means we have
-      ;; already inserted the same (derivation,eval-id) tuple.  That happens
-      ;; when several jobs produce the same derivation, and we can ignore it.
-      (if (= code SQLITE_CONSTRAINT_PRIMARYKEY)
-          (sqlite-exec db "SELECT * FROM Derivations WHERE derivation="
-                       (assq-ref job #:derivation) ";")
-          (apply throw key who code rest)))))
-
-(define (db-get-derivation db id)
-  "Retrieve a job in database DB which corresponds to ID."
-  (car (sqlite-exec db "SELECT * FROM Derivations WHERE derivation=" id ";")))
-
 (define (db-add-evaluation db eval)
   (sqlite-exec db "\
 INSERT INTO Evaluations (specification, commits) VALUES ("
@@ -384,27 +356,39 @@ string."
 (define (db-add-build db build)
   "Store BUILD in database DB. BUILD eventual outputs are stored
 in the OUTPUTS table."
-  (let* ((build-exec
-          (sqlite-exec db "\
-INSERT INTO Builds (derivation, evaluation, log, status, timestamp, starttime, stoptime)\
-  VALUES ("
-                       (assq-ref build #:derivation) ", "
-                       (assq-ref build #:eval-id) ", "
-                       (assq-ref build #:log) ", "
-                       (or (assq-ref build #:status)
-                           (build-status scheduled)) ", "
-                       (or (assq-ref build #:timestamp) 0) ", "
-                       (or (assq-ref build #:starttime) 0) ", "
-                       (or (assq-ref build #:stoptime) 0) ");"))
-         (build-id (last-insert-rowid db)))
-    (for-each (lambda (output)
-                (match output
-                  ((name . path)
-                   (sqlite-exec db "\
-INSERT INTO Outputs (build, name, path) VALUES ("
-                                build-id ", " name ", " path ");"))))
-              (assq-ref build #:outputs))
-    build-id))
+  (catch 'sqlite-error
+    (lambda ()
+      (sqlite-exec db "
+INSERT INTO Builds (derivation, evaluation, job_name, system, nix_name, log,
+status, timestamp, starttime, stoptime)
+VALUES ("
+                   (assq-ref build #:derivation) ", "
+                   (assq-ref build #:eval-id) ", "
+                   (assq-ref build #:job-name) ", "
+                   (assq-ref build #:system) ", "
+                   (assq-ref build #:nix-name) ", "
+                   (assq-ref build #:log) ", "
+                   (or (assq-ref build #:status)
+                       (build-status scheduled)) ", "
+                   (or (assq-ref build #:timestamp) 0) ", "
+                   (or (assq-ref build #:starttime) 0) ", "
+                   (or (assq-ref build #:stoptime) 0) ");")
+      (let ((derivation (assq-ref build #:derivation)))
+        (for-each (lambda (output)
+                    (match output
+                      ((name . path)
+                       (sqlite-exec db "\
+INSERT INTO Outputs (derivation, name, path) VALUES ("
+                                    derivation ", " name ", " path ");"))))
+                  (assq-ref build #:outputs))
+        derivation))
+    (lambda (key who code message . rest)
+      ;; If we get a unique-constraint-failed error, that means we have
+      ;; already inserted the same build.  That happens when several jobs
+      ;; produce the same derivation, and we can ignore it.
+      (if (= code SQLITE_CONSTRAINT_PRIMARYKEY)
+          #f
+          (apply throw key who code rest)))))
 
 (define* (db-update-build-status! db drv status #:key log-file)
   "Update DB so that DRV's status is STATUS.  This also updates the
@@ -429,11 +413,11 @@ log file for DRV."
                        ", status=" status
                        "WHERE derivation=" drv " AND status != " status ";"))))
 
-(define (db-get-outputs db build-id)
-  "Retrieve the OUTPUTS of the build identified by BUILD-ID in DB database."
+(define (db-get-outputs db derivation)
+  "Retrieve the OUTPUTS of the build identified by DERIVATION in DB database."
   (let loop ((rows
-              (sqlite-exec db "SELECT name, path FROM Outputs WHERE build="
-                           build-id ";"))
+              (sqlite-exec db "SELECT name, path FROM Outputs
+WHERE derivation =" derivation ";"))
              (outputs '()))
     (match rows
       (() outputs)
@@ -445,56 +429,56 @@ log file for DRV."
 
 (define (filters->order filters)
   (match (assq 'order filters)
-    (('order . 'build-id) "id ASC")
-    (('order . 'decreasing-build-id) "id DESC")
+    (('order . 'build-id) "rowid ASC")
+    (('order . 'decreasing-build-id) "rowid DESC")
     (('order . 'finish-time) "stoptime DESC")
-    (('order . 'finish-time+build-id) "stoptime DESC, id DESC")
+    (('order . 'finish-time+build-id) "stoptime DESC, rowid DESC")
     (('order . 'start-time) "starttime DESC")
     (('order . 'submission-time) "timestamp DESC")
     ;; With this order, builds in 'running' state (-1) appear
     ;; before those in 'scheduled' state (-2).
     (('order . 'status+submission-time) "status DESC, timestamp DESC")
-    (_ "id DESC")))
+    (_ "rowid DESC")))
 
 (define (db-get-builds db filters)
   "Retrieve all builds in database DB which are matched by given FILTERS.
-FILTERS is an assoc list whose possible keys are 'id | 'jobset | 'job |
-'system | 'nr | 'order | 'status | 'evaluation."
+FILTERS is an assoc list whose possible keys are 'derivation | 'id | 'jobset |
+'job | 'system | 'nr | 'order | 'status | 'evaluation."
   (let* ((order (filters->order filters))
          (stmt-text (format #f "SELECT * FROM (
-SELECT Builds.id, Builds.timestamp, Builds.starttime, Builds.stoptime,
-Builds.log, Builds.status, Builds.derivation, Derivations.job_name,
-Derivations.system, Derivations.nix_name, Specifications.name
+SELECT Builds.derivation, Builds.rowid, Builds.timestamp, Builds.starttime,
+Builds.stoptime, Builds.log, Builds.status, Builds.job_name, Builds.system,
+Builds.nix_name, Specifications.name
 FROM Builds
-INNER JOIN Derivations ON Builds.derivation = Derivations.derivation
-AND Builds.evaluation = Derivations.evaluation
-INNER JOIN Evaluations ON Derivations.evaluation = Evaluations.id
+INNER JOIN Evaluations ON Builds.evaluation = Evaluations.id
 INNER JOIN Specifications ON Evaluations.specification = Specifications.name
-WHERE (:id IS NULL OR (:id = Builds.id))
+WHERE (:id IS NULL OR (:id = Builds.rowid))
+AND (:derivation IS NULL OR (:derivation = Builds.derivation))
 AND (:jobset IS NULL OR (:jobset = Specifications.name))
-AND (:job IS NULL OR (:job = Derivations.job_name))
-AND (:system IS NULL OR (:system = Derivations.system))
+AND (:job IS NULL OR (:job = Builds.job_name))
+AND (:system IS NULL OR (:system = Builds.system))
 AND (:evaluation IS NULL OR (:evaluation = Builds.evaluation))
 AND (:status IS NULL OR (:status = 'done' AND Builds.status >= 0)
                      OR (:status = 'pending' AND Builds.status < 0))
 AND (:borderlowtime IS NULL OR :borderlowid IS NULL
- OR ((:borderlowtime, :borderlowid) < (Builds.stoptime, Builds.id)))
+ OR ((:borderlowtime, :borderlowid) < (Builds.stoptime, Builds.rowid)))
 AND (:borderhightime IS NULL OR :borderhighid IS NULL
- OR ((:borderhightime, :borderhighid) > (Builds.stoptime, Builds.id)))
+ OR ((:borderhightime, :borderhighid) > (Builds.stoptime, Builds.rowid)))
 ORDER BY
 CASE WHEN :borderlowtime IS NULL
        OR :borderlowid IS NULL THEN Builds.stoptime
                                ELSE -Builds.stoptime
 END DESC,
 CASE WHEN :borderlowtime IS NULL
-       OR :borderlowid IS NULL THEN Builds.id
-                               ELSE -Builds.id
+       OR :borderlowid IS NULL THEN Builds.rowid
+                               ELSE -Builds.rowid
 END DESC
 LIMIT :nr)
-ORDER BY ~a, id ASC;" order))
+ORDER BY ~a, rowid ASC;" order))
          (stmt (sqlite-prepare db stmt-text #:cache? #t)))
     (sqlite-bind-arguments
      stmt
+     #:derivation (assq-ref filters 'derivation)
      #:id (assq-ref filters 'id)
      #:jobset (assq-ref filters 'jobset)
      #:job (assq-ref filters 'job)
@@ -513,45 +497,37 @@ ORDER BY ~a, id ASC;" order))
                (builds '()))
       (match rows
         (() (reverse builds))
-        ((#(id timestamp starttime stoptime log status derivation job-name
-               system nix-name specification) . rest)
+        ((#(derivation id timestamp starttime stoptime log status job-name
+                       system nix-name specification) . rest)
          (loop rest
-               (cons `((#:id . ,id)
+               (cons `((#:derivation . ,derivation)
+                       (#:id . ,id)
                        (#:timestamp . ,timestamp)
                        (#:starttime . ,starttime)
                        (#:stoptime . ,stoptime)
                        (#:log . ,log)
                        (#:status . ,status)
-                       (#:derivation . ,derivation)
                        (#:job-name . ,job-name)
                        (#:system . ,system)
                        (#:nix-name . ,nix-name)
                        (#:specification . ,specification)
-                       (#:outputs . ,(db-get-outputs db id)))
+                       (#:outputs . ,(db-get-outputs db derivation)))
                      builds)))))))
 
-(define (db-get-build db id)
-  "Retrieve a build in database DB which corresponds to ID."
-  (match (db-get-builds db `((id . ,id)))
-    ((build)
-     build)
-    (() #f)))
+(define (db-get-build db derivation-or-id)
+  "Retrieve a build in database DB which corresponds to DERIVATION-OR-ID."
+  (let ((key (if (number? derivation-or-id) 'id 'derivation)))
+    (match (db-get-builds db `((,key . ,derivation-or-id)))
+      ((build)
+       build)
+      (() #f))))
 
 (define (db-get-pending-derivations db)
   "Return the list of derivation file names corresponding to pending builds in
 DB.  The returned list is guaranteed to not have any duplicates."
-  ;; This is of course much more efficient than calling 'delete-duplicates' on
-  ;; a list of results obtained without DISTINCT, both in space and time.
-  ;;
-  ;; Here we use a subquery so that sqlite can use two indexes instead of
-  ;; creating a "TEMP B-TREE" when doing a single flat query, as "EXPLAIN
-  ;; QUERY PLAN" shows.
   (map (match-lambda (#(drv) drv))
        (sqlite-exec db "
-SELECT DISTINCT derivation FROM (
-  SELECT Derivations.derivation FROM Derivations INNER JOIN Builds
-  WHERE Derivations.derivation = Builds.derivation AND Builds.status < 0
-);")))
+SELECT derivation FROM Builds WHERE Builds.status < 0;")))
 
 (define (db-get-stamp db spec)
   "Return a stamp corresponding to specification SPEC in database DB."
@@ -596,7 +572,7 @@ AND (" border-high "IS NULL OR (id <" border-high "))
 ORDER BY CASE WHEN " border-low "IS NULL THEN id ELSE -id END DESC
 LIMIT " limit ") E
 LEFT JOIN
-(SELECT id, evaluation, SUM(status=0) as succeeded,
+(SELECT rowid, evaluation, SUM(status=0) as succeeded,
 SUM(status>0) as failed, SUM(status<0) as scheduled
 FROM Builds
 GROUP BY evaluation) B
@@ -632,8 +608,8 @@ WHERE specification=" spec)))
   "Return the min build (stoptime, id) pair for
    the given evaluation EVAL."
   (let ((rows (sqlite-exec db "
-SELECT stoptime, MIN(id) FROM
-(SELECT id, stoptime FROM Builds
+SELECT stoptime, MIN(rowid) FROM
+(SELECT rowid, stoptime FROM Builds
 WHERE evaluation=" eval " AND
 stoptime = (SELECT MIN(stoptime)
 FROM Builds WHERE evaluation=" eval "))")))
@@ -643,8 +619,8 @@ FROM Builds WHERE evaluation=" eval "))")))
   "Return the max build (stoptime, id) pair for
    the given evaluation EVAL."
   (let ((rows (sqlite-exec db "
-SELECT stoptime, MAX(id) FROM
-(SELECT id, stoptime FROM Builds
+SELECT stoptime, MAX(rowid) FROM
+(SELECT rowid, stoptime FROM Builds
 WHERE evaluation=" eval " AND
 stoptime = (SELECT MAX(stoptime)
 FROM Builds WHERE evaluation=" eval "))")))
diff --git a/src/schema.sql b/src/schema.sql
index eb0f7e9..0452495 100644
--- a/src/schema.sql
+++ b/src/schema.sql
@@ -37,43 +37,31 @@ CREATE TABLE Evaluations (
   FOREIGN KEY (specification) REFERENCES Specifications (name)
 );
 
-CREATE TABLE Derivations (
-  derivation    TEXT NOT NULL,
-  evaluation    INTEGER NOT NULL,
-  job_name      TEXT NOT NULL,
-  system        TEXT NOT NULL,
-  nix_name      TEXT NOT NULL,
-  PRIMARY KEY (derivation, evaluation),
-  FOREIGN KEY (evaluation) REFERENCES Evaluations (id)
-);
-
 CREATE TABLE Outputs (
-  build INTEGER NOT NULL,
+  derivation TEXT NOT NULL,
   name TEXT NOT NULL,
   path TEXT NOT NULL,
-  PRIMARY KEY (build, name),
-  FOREIGN KEY (build) REFERENCES Builds (id)
+  PRIMARY KEY (derivation, name),
+  FOREIGN KEY (derivation) REFERENCES Builds (derivation)
 );
 
--- Builds are not in a one to one relationship with derivations in order to
--- keep track of non deterministic compilations.
 CREATE TABLE Builds (
-  id            INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
-  derivation    TEXT NOT NULL,
+  derivation    TEXT NOT NULL PRIMARY KEY,
   evaluation    INTEGER NOT NULL,
+  job_name      TEXT NOT NULL,
+  system        TEXT NOT NULL,
+  nix_name      TEXT NOT NULL,
   log           TEXT NOT NULL,
   status        INTEGER NOT NULL,
   timestamp     INTEGER NOT NULL,
   starttime     INTEGER NOT NULL,
   stoptime      INTEGER NOT NULL,
-  FOREIGN KEY (derivation) REFERENCES Derivations (derivation),
   FOREIGN KEY (evaluation) REFERENCES Evaluations (id)
 );
 
 -- Create indexes to speed up common queries, in particular those
 -- corresponding to /api/latestbuilds and /api/queue HTTP requests.
-CREATE INDEX Builds_Derivations_index ON Builds(status ASC, timestamp ASC, id, derivation, evaluation, stoptime DESC);
+CREATE INDEX Builds_index ON Builds(job_name, system, status ASC, timestamp ASC, derivation, evaluation, stoptime DESC);
 CREATE INDEX Inputs_index ON Inputs(specification, name, branch);
-CREATE INDEX Derivations_index ON Derivations(derivation, evaluation, job_name, system);
 
 COMMIT;
diff --git a/src/sql/upgrade-2.sql b/src/sql/upgrade-2.sql
new file mode 100644
index 0000000..dfb919b
--- /dev/null
+++ b/src/sql/upgrade-2.sql
@@ -0,0 +1,49 @@
+BEGIN TRANSACTION;
+
+DROP INDEX Derivations_index;
+DROP INDEX Builds_Derivations_index;
+
+ALTER TABLE Outputs RENAME TO tmp_Outputs;
+ALTER TABLE Builds RENAME TO tmp_Builds;
+
+CREATE TABLE Builds (
+  derivation    TEXT NOT NULL PRIMARY KEY,
+  evaluation    INTEGER NOT NULL,
+  job_name      TEXT NOT NULL,
+  system        TEXT NOT NULL,
+  nix_name      TEXT NOT NULL,
+  log           TEXT NOT NULL,
+  status        INTEGER NOT NULL,
+  timestamp     INTEGER NOT NULL,
+  starttime     INTEGER NOT NULL,
+  stoptime      INTEGER NOT NULL,
+  FOREIGN KEY (evaluation) REFERENCES Evaluations (id)
+);
+
+CREATE TABLE Outputs (
+  derivation TEXT NOT NULL,
+  name TEXT NOT NULL,
+  path TEXT NOT NULL,
+  PRIMARY KEY (derivation, name),
+  FOREIGN KEY (derivation) REFERENCES Builds (derivation)
+);
+
+INSERT OR IGNORE INTO Builds (derivation, evaluation, job_name, system, nix_name, log, status, timestamp, starttime, stoptime)
+SELECT Derivations.derivation, Derivations.evaluation, Derivations.job_name, Derivations.system, Derivations.nix_name,
+       tmp_Builds.log, tmp_Builds.status, tmp_Builds.timestamp, tmp_Builds.starttime, tmp_Builds.stoptime
+FROM Derivations
+INNER JOIN tmp_Builds ON tmp_Builds.derivation = Derivations.derivation
+                     AND tmp_Builds.evaluation = Derivations.evaluation;
+
+INSERT OR IGNORE INTO Outputs (derivation, name, path)
+SELECT tmp_Builds.derivation, tmp_Outputs.name, tmp_Outputs.path
+FROM tmp_Outputs
+INNER JOIN tmp_Builds on tmp_Builds.id = tmp_Outputs.build;
+
+CREATE INDEX Builds_index ON Builds(job_name, system, status ASC, timestamp ASC, derivation, evaluation, stoptime DESC);
+
+DROP TABLE tmp_Builds;
+DROP TABLE tmp_Outputs;
+DROP TABLE Derivations;
+
+COMMIT;
diff --git a/tests/database.scm b/tests/database.scm
index 17d48f5..af518bd 100644
--- a/tests/database.scm
+++ b/tests/database.scm
@@ -50,27 +50,14 @@
   `((#:specification . "guix")
     (#:commits . ,commits)))
 
-(define* (make-dummy-job #:optional (name "foo"))
-  `((#:name . ,name)
-    (#:job-name . "job")
-    (#:system . "x86_64-linux")
-    (#:derivation . ,(string-append name ".drv"))
-    (#:nix-name . "foo")
-    (#:specification 0)
-    (#:eval-id . 42)))
-
-(define* (make-dummy-derivation drv #:optional (eval-id 0))
+(define* (make-dummy-build drv
+                           #:optional (eval-id 42)
+                           #:key (outputs '(("foo" . "/foo"))))
   `((#:derivation . ,drv)
+    (#:eval-id . ,eval-id)
     (#:job-name . "job")
     (#:system . "x86_64-linux")
-    (#:nix-name . ,(basename drv ".drv"))
-    (#:eval-id . ,eval-id)))
-
-(define* (make-dummy-build #:optional (eval-id 42)
-                           #:key (drv "/foo.drv")
-                           (outputs '(("foo" . "/foo"))))
-  `((#:derivation . ,drv)
-    (#:eval-id . ,eval-id)
+    (#:nix-name . "foo")
     (#:log . "log")
     (#:outputs . (("foo" . "/foo")))))
 
@@ -86,10 +73,6 @@
   ;; Global Slot for a database object.
   (make-parameter #t))
 
-(define %id
-  ;; Global Slot for a job ID in the database.
-  (make-parameter #t))
-
 (define database-name
   ;; Use an empty and temporary database for the tests.
   (string-append (getcwd) "/" (number->string (getpid)) "-tmp.db"))
@@ -114,21 +97,13 @@ INSERT INTO Evaluations (specification, commits) VALUES (3, 3);")
       (db-add-specification (%db) example-spec)
       (car (db-get-specifications (%db)))))
 
-  (test-assert "db-add-derivation"
-    (let* ((job (make-dummy-job))
-           (key (assq-ref job #:derivation)))
-      (db-add-derivation (%db) job)
-      (db-add-derivation (%db) job)               ;idempotent
-      (%id key)))
-
-  (test-assert "db-get-derivation"
-    (db-get-derivation (%db) (%id)))
-
-  (test-assert "db-add-build"
-    (let ((build (make-dummy-build)))
+  (test-equal "db-add-build"
+    #f
+    (let ((build (make-dummy-build "/foo.drv")))
       (db-add-build (%db) build)
 
-      ;; This should be idempotent, see <https://bugs.gnu.org/28094>.
+      ;; Should return #f when adding a build whose derivation is already
+      ;; there, see <https://bugs.gnu.org/28094>.
       (db-add-build (%db) build)))
 
   (test-equal "db-update-build-status!"
@@ -137,13 +112,12 @@ INSERT INTO Evaluations (specification, commits) VALUES (3, 3);")
           (build-status succeeded)
           "/foo.drv.log")
     (with-temporary-database db
-      (let* ((id (db-add-build
-                  db
-                  (make-dummy-build 1 #:drv "/foo.drv"
-                                    #:outputs '(("out" . "/foo")))))
+      (let* ((derivation (db-add-build
+                          db
+                          (make-dummy-build "/foo.drv" 1
+                                            #:outputs '(("out" . "/foo")))))
              (get-status (lambda* (#:optional (key #:status))
-                           (assq-ref (db-get-build db id) key))))
-        (db-add-derivation db (make-dummy-derivation "/foo.drv" 1))
+                           (assq-ref (db-get-build db derivation) key))))
         (db-add-evaluation db (make-dummy-eval))
         (db-add-specification db example-spec)
 
@@ -171,18 +145,15 @@ INSERT INTO Evaluations (specification, commits) VALUES (3, 3);")
       ((3 "/baz.drv"))                               ;nr = 1
       ((2 "/bar.drv") (1 "/foo.drv") (3 "/baz.drv"))) ;status+submission-time
     (with-temporary-database db
-      ;; Populate the 'Builds', 'Derivations', 'Evaluations', and
+      ;; Populate the 'Builds'', 'Evaluations', and
       ;; 'Specifications' tables in a consistent way, as expected by the
       ;; 'db-get-builds' query.
-      (db-add-build db (make-dummy-build 1 #:drv "/foo.drv"
+      (db-add-build db (make-dummy-build "/foo.drv" 1
                                          #:outputs `(("out" . "/foo"))))
-      (db-add-build db (make-dummy-build 2 #:drv "/bar.drv"
+      (db-add-build db (make-dummy-build "/bar.drv" 2
                                          #:outputs `(("out" . "/bar"))))
-      (db-add-build db (make-dummy-build 3 #:drv "/baz.drv"
+      (db-add-build db (make-dummy-build "/baz.drv" 3
                                          #:outputs `(("out" . "/baz"))))
-      (db-add-derivation db (make-dummy-derivation "/foo.drv" 1))
-      (db-add-derivation db (make-dummy-derivation "/bar.drv" 2))
-      (db-add-derivation db (make-dummy-derivation "/baz.drv" 3))
       (db-add-evaluation db (make-dummy-eval))
       (db-add-evaluation db (make-dummy-eval))
       (db-add-evaluation db (make-dummy-eval))
@@ -206,19 +177,16 @@ INSERT INTO Evaluations (specification, commits) VALUES (3, 3);")
   (test-equal "db-get-pending-derivations"
     '("/bar.drv" "/foo.drv")
     (with-temporary-database db
-      ;; Populate the 'Builds', 'Derivations', 'Evaluations', and
+      ;; Populate the 'Builds', 'Evaluations', and
       ;; 'Specifications' tables.  Here, two builds map to the same derivation
       ;; but the result of 'db-get-pending-derivations' must not contain any
       ;; duplicate.
-      (db-add-build db (make-dummy-build 1 #:drv "/foo.drv"
+      (db-add-build db (make-dummy-build "/foo.drv" 1
                                          #:outputs `(("out" . "/foo"))))
-      (db-add-build db (make-dummy-build 2 #:drv "/bar.drv"
+      (db-add-build db (make-dummy-build "/bar.drv" 2
                                          #:outputs `(("out" . "/bar"))))
-      (db-add-build db (make-dummy-build 3 #:drv "/foo.drv"
+      (db-add-build db (make-dummy-build "/foo.drv" 3
                                          #:outputs `(("out" . "/foo"))))
-      (db-add-derivation db (make-dummy-derivation "/foo.drv" 1))
-      (db-add-derivation db (make-dummy-derivation "/bar.drv" 2))
-      (db-add-derivation db (make-dummy-derivation "/foo.drv" 3))
       (db-add-evaluation db (make-dummy-eval))
       (db-add-evaluation db (make-dummy-eval))
       (db-add-evaluation db (make-dummy-eval))
diff --git a/tests/http.scm b/tests/http.scm
index e05fdc5..a9fc3ef 100644
--- a/tests/http.scm
+++ b/tests/http.scm
@@ -142,6 +142,9 @@
     (let* ((build1
             `((#:derivation . "/gnu/store/fake.drv")
               (#:eval-id . 1)
+              (#:job-name . "fake-job")
+              (#:system . "x86_64-linux")
+              (#:nix-name . "fake-1.0")
               (#:log . "unused so far")
               (#:status . ,(build-status succeeded))
               (#:outputs . (("out" . "/gnu/store/fake-1.0")))
@@ -151,24 +154,15 @@
            (build2
             `((#:derivation . "/gnu/store/fake2.drv")
               (#:eval-id . 1)
+              (#:job-name . "fake-job")
+              (#:system . "x86_64-linux")
+              (#:nix-name . "fake-2.0")
               (#:log . "unused so far")
               (#:status . ,(build-status scheduled))
               (#:outputs . (("out" . "/gnu/store/fake-2.0")))
               (#:timestamp . 1501347493)
               (#:starttime . 0)
               (#:stoptime . 0)))
-           (derivation1
-            '((#:derivation . "/gnu/store/fake.drv")
-              (#:job-name . "fake-job")
-              (#:system . "x86_64-linux")
-              (#:nix-name . "fake-1.0")
-              (#:eval-id . 1)))
-           (derivation2
-            '((#:derivation . "/gnu/store/fake2.drv")
-              (#:job-name . "fake-job")
-              (#:system . "x86_64-linux")
-              (#:nix-name . "fake-2.0")
-              (#:eval-id . 1)))
            (specification
             '((#:name . "guix")
               (#:load-path-inputs . ("savannah"))
@@ -192,8 +186,6 @@
               (#:commits . ("fakesha2" "fakesha3")))))
       (db-add-build (%db) build1)
       (db-add-build (%db) build2)
-      (db-add-derivation (%db) derivation1)
-      (db-add-derivation (%db) derivation2)
       (db-add-specification (%db) specification)
       (db-add-evaluation (%db) evaluation1)
       (db-add-evaluation (%db) evaluation2)))
-- 
2.18.0





Information forwarded to bug-guix <at> gnu.org:
bug#32190; Package guix. (Sat, 04 Aug 2018 16:10:02 GMT) Full text and rfc822 format available.

Message #23 received at 32190 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: 32190 <at> debbugs.gnu.org
Subject: Re: bug#32190: [PATCH] database: Merge Derivations into Builds table.
Date: Sat, 04 Aug 2018 18:09:27 +0200
Clément Lassieur <clement <at> lassieur.org> writes:

> Fixes <https://bugs.gnu.org/32190>.

[...]

> ---
>  Makefile.am              |   3 +-
>  src/cuirass/base.scm     |  23 +++---
>  src/cuirass/database.scm | 174 +++++++++++++++++----------------------
>  src/schema.sql           |  28 ++-----
>  src/sql/upgrade-2.sql    |  49 +++++++++++
>  tests/database.scm       |  78 ++++++------------
>  tests/http.scm           |  20 ++---
>  7 files changed, 176 insertions(+), 199 deletions(-)
>  create mode 100644 src/sql/upgrade-2.sql

I'll update the doc as well.




Information forwarded to bug-guix <at> gnu.org:
bug#32190; Package guix. (Wed, 08 Aug 2018 12:14:02 GMT) Full text and rfc822 format available.

Message #26 received at 32190 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: 32190 <at> debbugs.gnu.org
Subject: Re: bug#32190: [PATCH] database: Merge Derivations into Builds table.
Date: Wed, 08 Aug 2018 14:13:51 +0200
Clément Lassieur <clement <at> lassieur.org> writes:

>  Makefile.am              |   3 +-
>  src/cuirass/base.scm     |  23 +++---
>  src/cuirass/database.scm | 174 +++++++++++++++++----------------------
>  src/schema.sql           |  28 ++-----
>  src/sql/upgrade-2.sql    |  49 +++++++++++
>  tests/database.scm       |  78 ++++++------------
>  tests/http.scm           |  20 ++---
>  7 files changed, 176 insertions(+), 199 deletions(-)
>  create mode 100644 src/sql/upgrade-2.sql

You can see how it looks like here:

https://cuirass.lassieur.org:8081/jobset/guix-master




Information forwarded to bug-guix <at> gnu.org:
bug#32190; Package guix. (Tue, 14 Aug 2018 16:59:02 GMT) Full text and rfc822 format available.

Message #29 received at 32190 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: 32190 <at> debbugs.gnu.org
Subject: Re: bug#32190: [PATCH] database: Merge Derivations into Builds table.
Date: Tue, 14 Aug 2018 18:57:31 +0200
I forgot:

Clément Lassieur <clement <at> lassieur.org> writes:

> Fixes <https://bugs.gnu.org/32190>.
>
> * Makefile.am (dist_sql_DATA): Add 'src/sql/upgrade-2.sql'.
> * src/cuirass/base.scm (evaluate): Don't add jobs to the Derivations table.

(build-packages): Add columns that were in the Derivations table before.  Only
build the derivations that were successfully registered, that is, those that
didn't exist in the Builds table.  Give a derivation instead of a build id to
DB-GET-BUILD.  Compute the number of failed jobs based on the derivations that
were added to the table, instead of the jobs.

> * src/cuirass/database.scm (db-add-derivation, db-get-derivation): Remove
> exported procedures.
> (db-add-build): Catch SQLITE_CONSTRAINT_PRIMARYKEY error, which means that two
> jobs produced the same derivation, and return #f in that case.  Add columns

[...]

And (in base.scm)

@@ -584,7 +587,7 @@ procedure is meant to be called at startup."
                              (((_ (#:path . (? string? outputs))) ...)
                               outputs))
                            outputs))
-         (fail (- (length jobs) success)))
+         (fail (- (length derivations) success)))
     (log-message "outputs:\n~a" (string-join outs "\n"))
     (log-message "success: ~a, fail: ~a" success fail)
     results))




Information forwarded to bug-guix <at> gnu.org:
bug#32190; Package guix. (Wed, 15 Aug 2018 18:58:01 GMT) Full text and rfc822 format available.

Message #32 received at 32190 <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Ricardo Wurmus <rekado <at> elephly.net>
Cc: 32190 <at> debbugs.gnu.org
Subject: Re: bug#32190: [PATCH] database: Merge Derivations into Builds table.
Date: Wed, 15 Aug 2018 20:57:00 +0200
Hi Ricardo,

Ricardo Wurmus <rekado <at> elephly.net> writes:

> I see that you’ve mentioned your changes to “build-packages” in a later
> email.
>
> I have two general questions about this: why was the change from “id” to
> “rowid” necessary?  And: could you please also update the documentation
> so that is reflects the changes?

I removed the 'id' column from the Builds table because it was used as
the primary key, and I wanted the primary key to be 'derivation'.  But
we still need to display the build id in the web interface, and the API
allows to get the builds by id.  We can't use the 'id' column anymore
because AUTOINCREMENT isn't allowed without PRIMARY KEY[1], but we can
use the rowid[2] implicit column instead.

I updated the documentation on my personal branch[3], but I didn't send
it to the ml, because I have sent too many things already ;)

[1]: https://www.sqlite.org/autoinc.html
[2]: https://www.sqlite.org/rowidtable.html
[3]: https://git.lassieur.org/cgit/cuirass.git/

>> diff --git a/src/cuirass/database.scm b/src/cuirass/database.scm
>> index b4b1652..7788ac9 100644
>> --- a/src/cuirass/database.scm
>> +++ b/src/cuirass/database.scm

[...]

>>  (define (db-add-build db build)
>>    "Store BUILD in database DB. BUILD eventual outputs are stored
>>  in the OUTPUTS table."

[...]

> This procedure is called when a build is scheduled, isn’t it?

Yes.

> The docstring says “BUILD eventual outputs are stored in the OUTPUTS
> table.”  – does this mean the names of the *expected* output
> directories are recorded in Outputs?  They don’t exist at this point,
> right?

Indeed, it's just the output paths written in the derivation files.

>> +    (lambda (key who code message . rest)
>> +      ;; If we get a unique-constraint-failed error, that means we have
>> +      ;; already inserted the same build.  That happens when several jobs
>> +      ;; produce the same derivation, and we can ignore it.
>> +      (if (= code SQLITE_CONSTRAINT_PRIMARYKEY)
>> +          #f
>> +          (apply throw key who code rest)))))
>
> Okay.
>
> Can we prevent this from happening in the first place?  I feel a little
> uncomfortable about performing an operation that we expect to cause
> primary key errors regularly.

We would need to manually check that the derivation doesn't already
exist.  That would be an extra query so I think it would be less
efficient and it would require more code.  I believe the 'constraint' is
a pretty good mechanism to describe things that shouldn't happen in the
database, I don't know any better way.

>> diff --git a/src/schema.sql b/src/schema.sql
>> index eb0f7e9..0452495 100644
>> --- a/src/schema.sql
>> +++ b/src/schema.sql
> […]
>> --- Builds are not in a one to one relationship with derivations in order to
>> --- keep track of non deterministic compilations.
>
> Is this comment still correct considering that the derivation is now the
> primary key of the Builds table?

No, I forgot to remove it.

>>  CREATE TABLE Builds (
>> -  id            INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
>> -  derivation    TEXT NOT NULL,
>> +  derivation    TEXT NOT NULL PRIMARY KEY,
>>    evaluation    INTEGER NOT NULL,
>> +  job_name      TEXT NOT NULL,
>> +  system        TEXT NOT NULL,
>> +  nix_name      TEXT NOT NULL,
>>    log           TEXT NOT NULL,
>>    status        INTEGER NOT NULL,
>>    timestamp     INTEGER NOT NULL,
>>    starttime     INTEGER NOT NULL,
>>    stoptime      INTEGER NOT NULL,
>> -  FOREIGN KEY (derivation) REFERENCES Derivations (derivation),
>>    FOREIGN KEY (evaluation) REFERENCES Evaluations (id)
>>  );
>
> Thanks again!

:-) you're welcome!  Thanks for reviewing.

Clément




Reply sent to Clément Lassieur <clement <at> lassieur.org>:
You have taken responsibility. (Thu, 16 Aug 2018 21:01:03 GMT) Full text and rfc822 format available.

Notification sent to Clément Lassieur <clement <at> lassieur.org>:
bug acknowledged by developer. (Thu, 16 Aug 2018 21:01:04 GMT) Full text and rfc822 format available.

Message #37 received at 32190-done <at> debbugs.gnu.org (full text, mbox):

From: Clément Lassieur <clement <at> lassieur.org>
To: Ricardo Wurmus <rekado <at> elephly.net>
Cc: 32190-done <at> debbugs.gnu.org
Subject: Re: bug#32190: [PATCH] database: Merge Derivations into Builds table.
Date: Thu, 16 Aug 2018 23:00:17 +0200
Pushed, thank you for the review!

Clément

Ricardo Wurmus <rekado <at> elephly.net> writes:

> Hi Clément,
>
>> Fixes <https://bugs.gnu.org/32190>.
>
> Woo!  Thank you for this patch.
>
>> * src/cuirass/base.scm (evaluate): Don't add jobs to the Derivations table.
>
> I see that you’ve mentioned your changes to “build-packages” in a later
> email.
>
> I have two general questions about this: why was the change from “id” to
> “rowid” necessary?  And: could you please also update the documentation
> so that is reflects the changes?
>
>> diff --git a/src/cuirass/database.scm b/src/cuirass/database.scm
>> index b4b1652..7788ac9 100644
>> --- a/src/cuirass/database.scm
>> +++ b/src/cuirass/database.scm
> […]
>>  (define (db-add-evaluation db eval)
>>    (sqlite-exec db "\
>>  INSERT INTO Evaluations (specification, commits) VALUES ("
>> @@ -384,27 +356,39 @@ string."
>>  (define (db-add-build db build)
>>    "Store BUILD in database DB. BUILD eventual outputs are stored
>>  in the OUTPUTS table."
>> -  (let* ((build-exec
>> -          (sqlite-exec db "\
>> -INSERT INTO Builds (derivation, evaluation, log, status, timestamp, starttime, stoptime)\
>> -  VALUES ("
>> -                       (assq-ref build #:derivation) ", "
>> -                       (assq-ref build #:eval-id) ", "
>> -                       (assq-ref build #:log) ", "
>> -                       (or (assq-ref build #:status)
>> -                           (build-status scheduled)) ", "
>> -                       (or (assq-ref build #:timestamp) 0) ", "
>> -                       (or (assq-ref build #:starttime) 0) ", "
>> -                       (or (assq-ref build #:stoptime) 0) ");"))
>> -         (build-id (last-insert-rowid db)))
>> -    (for-each (lambda (output)
>> -                (match output
>> -                  ((name . path)
>> -                   (sqlite-exec db "\
>> -INSERT INTO Outputs (build, name, path) VALUES ("
>> -                                build-id ", " name ", " path ");"))))
>> -              (assq-ref build #:outputs))
>> -    build-id))
>> +  (catch 'sqlite-error
>> +    (lambda ()
>> +      (sqlite-exec db "
>> +INSERT INTO Builds (derivation, evaluation, job_name, system, nix_name, log,
>> +status, timestamp, starttime, stoptime)
>> +VALUES ("
>> +                   (assq-ref build #:derivation) ", "
>> +                   (assq-ref build #:eval-id) ", "
>> +                   (assq-ref build #:job-name) ", "
>> +                   (assq-ref build #:system) ", "
>> +                   (assq-ref build #:nix-name) ", "
>> +                   (assq-ref build #:log) ", "
>> +                   (or (assq-ref build #:status)
>> +                       (build-status scheduled)) ", "
>> +                   (or (assq-ref build #:timestamp) 0) ", "
>> +                   (or (assq-ref build #:starttime) 0) ", "
>> +                   (or (assq-ref build #:stoptime) 0) ");")
>> +      (let ((derivation (assq-ref build #:derivation)))
>> +        (for-each (lambda (output)
>> +                    (match output
>> +                      ((name . path)
>> +                       (sqlite-exec db "\
>> +INSERT INTO Outputs (derivation, name, path) VALUES ("
>> +                                    derivation ", " name ", " path ");"))))
>> +                  (assq-ref build #:outputs))
>> +        derivation))
>
> This procedure is called when a build is scheduled, isn’t it?  The
> docstring says “BUILD eventual outputs are stored in the OUTPUTS table.”
> – does this mean the names of the *expected* output directories are
> recorded in Outputs?  They don’t exist at this point, right?
>
>> +    (lambda (key who code message . rest)
>> +      ;; If we get a unique-constraint-failed error, that means we have
>> +      ;; already inserted the same build.  That happens when several jobs
>> +      ;; produce the same derivation, and we can ignore it.
>> +      (if (= code SQLITE_CONSTRAINT_PRIMARYKEY)
>> +          #f
>> +          (apply throw key who code rest)))))
>
> Okay.
>
> Can we prevent this from happening in the first place?  I feel a little
> uncomfortable about performing an operation that we expect to cause
> primary key errors regularly.
>
>> diff --git a/src/schema.sql b/src/schema.sql
>> index eb0f7e9..0452495 100644
>> --- a/src/schema.sql
>> +++ b/src/schema.sql
> […]
>> --- Builds are not in a one to one relationship with derivations in order to
>> --- keep track of non deterministic compilations.
>
> Is this comment still correct considering that the derivation is now the
> primary key of the Builds table?
>
>>  CREATE TABLE Builds (
>> -  id            INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
>> -  derivation    TEXT NOT NULL,
>> +  derivation    TEXT NOT NULL PRIMARY KEY,
>>    evaluation    INTEGER NOT NULL,
>> +  job_name      TEXT NOT NULL,
>> +  system        TEXT NOT NULL,
>> +  nix_name      TEXT NOT NULL,
>>    log           TEXT NOT NULL,
>>    status        INTEGER NOT NULL,
>>    timestamp     INTEGER NOT NULL,
>>    starttime     INTEGER NOT NULL,
>>    stoptime      INTEGER NOT NULL,
>> -  FOREIGN KEY (derivation) REFERENCES Derivations (derivation),
>>    FOREIGN KEY (evaluation) REFERENCES Evaluations (id)
>>  );
>
> Thanks again!





Information forwarded to bug-guix <at> gnu.org:
bug#32190; Package guix. (Fri, 17 Aug 2018 10:54:01 GMT) Full text and rfc822 format available.

Message #40 received at 32190 <at> debbugs.gnu.org (full text, mbox):

From: Ricardo Wurmus <rekado <at> elephly.net>
To: Clément Lassieur <clement <at> lassieur.org>
Cc: 32190 <at> debbugs.gnu.org
Subject: Re: bug#32190: [PATCH] database: Merge Derivations into Builds table.
Date: Tue, 14 Aug 2018 21:04:23 +0200
Hi Clément,

> Fixes <https://bugs.gnu.org/32190>.

Woo!  Thank you for this patch.

> * src/cuirass/base.scm (evaluate): Don't add jobs to the Derivations table.

I see that you’ve mentioned your changes to “build-packages” in a later
email.

I have two general questions about this: why was the change from “id” to
“rowid” necessary?  And: could you please also update the documentation
so that is reflects the changes?

> diff --git a/src/cuirass/database.scm b/src/cuirass/database.scm
> index b4b1652..7788ac9 100644
> --- a/src/cuirass/database.scm
> +++ b/src/cuirass/database.scm
[…]
>  (define (db-add-evaluation db eval)
>    (sqlite-exec db "\
>  INSERT INTO Evaluations (specification, commits) VALUES ("
> @@ -384,27 +356,39 @@ string."
>  (define (db-add-build db build)
>    "Store BUILD in database DB. BUILD eventual outputs are stored
>  in the OUTPUTS table."
> -  (let* ((build-exec
> -          (sqlite-exec db "\
> -INSERT INTO Builds (derivation, evaluation, log, status, timestamp, starttime, stoptime)\
> -  VALUES ("
> -                       (assq-ref build #:derivation) ", "
> -                       (assq-ref build #:eval-id) ", "
> -                       (assq-ref build #:log) ", "
> -                       (or (assq-ref build #:status)
> -                           (build-status scheduled)) ", "
> -                       (or (assq-ref build #:timestamp) 0) ", "
> -                       (or (assq-ref build #:starttime) 0) ", "
> -                       (or (assq-ref build #:stoptime) 0) ");"))
> -         (build-id (last-insert-rowid db)))
> -    (for-each (lambda (output)
> -                (match output
> -                  ((name . path)
> -                   (sqlite-exec db "\
> -INSERT INTO Outputs (build, name, path) VALUES ("
> -                                build-id ", " name ", " path ");"))))
> -              (assq-ref build #:outputs))
> -    build-id))
> +  (catch 'sqlite-error
> +    (lambda ()
> +      (sqlite-exec db "
> +INSERT INTO Builds (derivation, evaluation, job_name, system, nix_name, log,
> +status, timestamp, starttime, stoptime)
> +VALUES ("
> +                   (assq-ref build #:derivation) ", "
> +                   (assq-ref build #:eval-id) ", "
> +                   (assq-ref build #:job-name) ", "
> +                   (assq-ref build #:system) ", "
> +                   (assq-ref build #:nix-name) ", "
> +                   (assq-ref build #:log) ", "
> +                   (or (assq-ref build #:status)
> +                       (build-status scheduled)) ", "
> +                   (or (assq-ref build #:timestamp) 0) ", "
> +                   (or (assq-ref build #:starttime) 0) ", "
> +                   (or (assq-ref build #:stoptime) 0) ");")
> +      (let ((derivation (assq-ref build #:derivation)))
> +        (for-each (lambda (output)
> +                    (match output
> +                      ((name . path)
> +                       (sqlite-exec db "\
> +INSERT INTO Outputs (derivation, name, path) VALUES ("
> +                                    derivation ", " name ", " path ");"))))
> +                  (assq-ref build #:outputs))
> +        derivation))

This procedure is called when a build is scheduled, isn’t it?  The
docstring says “BUILD eventual outputs are stored in the OUTPUTS table.”
– does this mean the names of the *expected* output directories are
recorded in Outputs?  They don’t exist at this point, right?

> +    (lambda (key who code message . rest)
> +      ;; If we get a unique-constraint-failed error, that means we have
> +      ;; already inserted the same build.  That happens when several jobs
> +      ;; produce the same derivation, and we can ignore it.
> +      (if (= code SQLITE_CONSTRAINT_PRIMARYKEY)
> +          #f
> +          (apply throw key who code rest)))))

Okay.

Can we prevent this from happening in the first place?  I feel a little
uncomfortable about performing an operation that we expect to cause
primary key errors regularly.

> diff --git a/src/schema.sql b/src/schema.sql
> index eb0f7e9..0452495 100644
> --- a/src/schema.sql
> +++ b/src/schema.sql
[…]
> --- Builds are not in a one to one relationship with derivations in order to
> --- keep track of non deterministic compilations.

Is this comment still correct considering that the derivation is now the
primary key of the Builds table?

>  CREATE TABLE Builds (
> -  id            INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
> -  derivation    TEXT NOT NULL,
> +  derivation    TEXT NOT NULL PRIMARY KEY,
>    evaluation    INTEGER NOT NULL,
> +  job_name      TEXT NOT NULL,
> +  system        TEXT NOT NULL,
> +  nix_name      TEXT NOT NULL,
>    log           TEXT NOT NULL,
>    status        INTEGER NOT NULL,
>    timestamp     INTEGER NOT NULL,
>    starttime     INTEGER NOT NULL,
>    stoptime      INTEGER NOT NULL,
> -  FOREIGN KEY (derivation) REFERENCES Derivations (derivation),
>    FOREIGN KEY (evaluation) REFERENCES Evaluations (id)
>  );

Thanks again!

--
Ricardo





bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Fri, 14 Sep 2018 11:24:06 GMT) Full text and rfc822 format available.

This bug report was last modified 5 years and 226 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.