GNU bug report logs - #34877
Ghostscript: Missing text when converting PDF to PS


Package: guix;

Reported by: Diego Nicola Barbato <dnbarbato <at> posteo.de>

Date: Sat, 16 Mar 2019 00:35:01 UTC

Severity: important

Done: Ludovic Courtès <ludo <at> gnu.org>

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 34877 in the body.
You can then email your comments to 34877 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-guix <at> gnu.org:
bug#34877; Package guix. (Sat, 16 Mar 2019 00:35:03 GMT)

Acknowledgement sent to Diego Nicola Barbato <dnbarbato <at> posteo.de>:
New bug report received and forwarded. Copy sent to bug-guix <at> gnu.org. (Sat, 16 Mar 2019 00:35:05 GMT)

Message #5 received at submit <at> debbugs.gnu.org:

From: Diego Nicola Barbato <dnbarbato <at> posteo.de>
To: bug-guix <at> gnu.org
Subject: Ghostscript: Missing text when converting PDF to PS
Date: Fri, 15 Mar 2019 23:59:34 +0100
Hello Guix,

When converting certain PDF files to PostScript, pdf2ps (from the
Ghostscript package) prints the following error messages:

--8<---------------cut here---------------start------------->8---
   **** Error reading a content stream. The page may be incomplete.
               Output may be incorrect.
   **** Error: File did not complete the page properly and may be damaged.
               Output may be incorrect.
--8<---------------cut here---------------end--------------->8---

The resulting file will be missing some (sometimes all) of the text.

I have attached one such PDF, which I obtained from
https://www.ichkoche.at/?ctl=recipe_pdf&recipe_id=3161, alongside the
files generated by running pdf2ps on Guix System and, for reference, on
Debian 9.  On Debian the conversion succeeds, even though both systems
provide the same version of Ghostscript, 9.26 (2018-11-20).

I have also attached the results of running ‘gsnd -dPDFDEBUG’ on the
offending file.

I run Guix System (commit: 0bd1498) on x86_64.

Regards,

Diego

[original.pdf (application/pdf, attachment)]
[guix.ps (application/postscript, attachment)]
[debian.ps (application/postscript, attachment)]
[gsnd_guix.txt (text/plain, attachment)]
[gsnd_debian.txt (text/plain, attachment)]
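
For anyone trying to reproduce the report above, a minimal shell sketch
follows.  It only checks a captured log for the two error messages quoted
in the message; the function name has_page_errors and the log file name
pdf2ps.log are made up for illustration, and the usage lines assume that
pdf2ps is on PATH and that original.pdf is the attached file.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ghostscript): succeed if the given log
# file contains either of the two error messages quoted in this report.
has_page_errors() {
  # $1: file holding the captured stdout and stderr of a pdf2ps run
  grep -q -e 'Error reading a content stream' \
          -e 'File did not complete the page properly' "$1"
}

# Usage sketch (file names are assumptions):
#   pdf2ps original.pdf guix.ps >pdf2ps.log 2>&1
#   has_page_errors pdf2ps.log && echo "output may be missing text"
```

Capturing both output streams avoids having to know which one the
messages are written to.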

Information forwarded to bug-guix <at> gnu.org:
bug#34877; Package guix. (Sun, 17 Mar 2019 12:06:02 GMT)

Message #8 received at 34877 <at> debbugs.gnu.org:

From: Diego Nicola Barbato <dnbarbato <at> posteo.de>
To: bug-guix <at> gnu.org
Cc: 34877 <at> debbugs.gnu.org
Subject: Ghostscript: Missing text when converting PDF to PS
Date: Sun, 17 Mar 2019 13:05:50 +0100
Hello Guix,

Unfortunately the original message did not make it to the mailing list
because the attachments were too big.  It did make it to debbugs,
though, so the attachments should be available there
(https://debbugs.gnu.org/34877).

Regards,

Diego




Information forwarded to bug-guix <at> gnu.org:
bug#34877; Package guix. (Sun, 14 Apr 2019 14:56:02 GMT)

Message #11 received at 34877 <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Diego Nicola Barbato <dnbarbato <at> posteo.de>
Cc: 34877 <at> debbugs.gnu.org
Subject: Re: bug#34877: Ghostscript: Missing text when converting PDF to PS
Date: Sun, 14 Apr 2019 16:55:22 +0200
Hi Diego,

Diego Nicola Barbato <dnbarbato <at> posteo.de> skribis:

> When converting certain PDF files to PostScript pdf2ps (from the
> Ghostscript package) will print the following error messages:
>
>    **** Error reading a content stream. The page may be incomplete.
>                Output may be incorrect.
>    **** Error: File did not complete the page properly and may be damaged.
>                Output may be incorrect.
>
> The resulting file will be missing some (sometimes all) of the text.

I have spent time investigating this issue, in vain so far.

One conclusion can already be drawn: in my experience, pdf2ps succeeds
on PDFs that do *not* embed fonts (those that use only the 14 standard
fonts).  It fails, as in this case, when fonts *are* embedded.

Looking at the strace output, I initially thought our gs was missing its
resource files: they were supposed to be compiled in
(“COMPILE_INITS=1”), but my understanding was that this was only the
case for the statically-linked gs, which we disabled in commit
eb354bdacbf4154ec66038dac07f19bf4ced1fad.

So I started by passing --disable-compile-inits and then fixing up
ENOENT issues that I could notice in the strace output (patch below),
but that didn’t make any difference.

I’m still not sure how to interpret this error; it’s not clear to me
what it actually means.  Reports like
<https://bugs.ghostscript.com/show_bug.cgi?id=695874> suggest it has to
do with fonts, but that’s not obvious in this case.

Anyway, it’s also clear that this is the same problem people experience
when printing.

Ideas welcome!

Ludo’.
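
The strace step described in this message could be approximated as
follows.  This is only a sketch: enoent_paths is a hypothetical helper,
gs.trace is an arbitrary file name, and the code assumes strace's usual
line format, in which the path is the first double-quoted argument of the
system call.

```shell
#!/bin/sh
# Hypothetical helper: read strace output on stdin and print each distinct
# path for which a system call failed with ENOENT.
enoent_paths() {
  grep 'ENOENT' |
    # Keep only the first double-quoted string on each matching line.
    sed -n 's/^[^"]*"\([^"]*\)".*$/\1/p' |
    sort -u
}

# Usage sketch (file names are assumptions):
#   strace -f -o gs.trace pdf2ps original.pdf out.ps
#   enoent_paths < gs.trace
```

Not every ENOENT is a problem, since programs routinely probe several
search paths, so the list is a starting point rather than a diagnosis.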

[Message part 2 (text/x-patch, inline)]
diff --git a/gnu/packages/ghostscript.scm b/gnu/packages/ghostscript.scm
index 53a9b60fdb..9591dbdb1d 100644
--- a/gnu/packages/ghostscript.scm
+++ b/gnu/packages/ghostscript.scm
@@ -2,7 +2,7 @@
 ;;; Copyright © 2013 Andreas Enge <andreas <at> enge.fr>
 ;;; Copyright © 2014, 2015, 2016, 2017 Mark H Weaver <mhw <at> netris.org>
 ;;; Copyright © 2015 Ricardo Wurmus <rekado <at> elephly.net>
-;;; Copyright © 2013, 2015, 2016, 2017 Ludovic Courtès <ludo <at> gnu.org>
+;;; Copyright © 2013, 2015, 2016, 2017, 2019 Ludovic Courtès <ludo <at> gnu.org>
 ;;; Copyright © 2017 Alex Vong <alexvong1995 <at> gmail.com>
 ;;; Copyright © 2017, 2018, 2019 Efraim Flashner <efraim <at> flashner.co.il>
 ;;; Copyright © 2017 Leo Famulari <leo <at> famulari.name>
@@ -269,6 +269,59 @@ output file formats and printers.")
     (home-page "https://www.ghostscript.com/")
     (license license:agpl3+)))
 
+(define-public ghostscript/fixed
+  (package/inherit
+   ghostscript
+   (version (string-append (package-version ghostscript) "-1"))
+   (arguments
+    (substitute-keyword-arguments (package-arguments ghostscript)
+      ((#:configure-flags flags ''())
+       `(append (list "--disable-compile-inits"
+                      (string-append "--with-fontpath="
+                                     (assoc-ref %build-inputs "gs-fonts")
+                                     "/share/fonts/type1/ghostscript"))
+                ,flags))
+      ((#:phases phases '%standard-phases)
+       `(modify-phases ,phases
+          (add-after 'install 'create-cmap-symlink
+            (lambda* (#:key outputs #:allow-other-keys)
+              (let* ((out     (assoc-ref outputs "out"))
+                     (init    (car (find-files out "^Init$"
+                                               #:directories? #t)))
+                     (fontdir (string-append out "/share/ghostscript/fonts"))
+                     (fontdir1 (string-append out "/share/fonts/type1/ghostscript")))
+                (symlink "../CMap"
+                         (string-append init "/CMap"))
+                (symlink "../Init/Fontmap"
+                         (string-append init "/../Font/Fontmap"))
+
+                (mkdir-p fontdir)
+                (symlink (string-append init "/Fontmap")
+                         (string-append fontdir "/Fontmap"))
+                (mkdir-p fontdir1)
+                (symlink (string-append init "/Fontmap")
+                         (string-append fontdir1 "/Fontmap"))
+                #t)))))))
+   (inputs `(("gs-fonts" ,gs-fonts)
+             ,@(package-inputs ghostscript)))))
+
+(define-public ghostscript/static
+  ;; Like before commit eb354bdacbf4154ec66038dac07f19bf4ced1fad.
+  (package
+    (inherit ghostscript)
+    (name "ghostscript-static")
+    (arguments
+     (substitute-keyword-arguments (package-arguments ghostscript)
+       ((#:phases phases '%standard-phases)
+        `(modify-phases ,phases
+           (replace 'build
+             (lambda _
+               (invoke "make" "-j5")))
+           (replace 'install
+             (lambda _
+               (invoke "make" "install")))
+           (delete 'create-gs-symlink)))))))
+
 (define-public ghostscript/x
   (package/inherit ghostscript
     (name (string-append (package-name ghostscript) "-with-x"))

Severity set to 'important' from 'normal'. Request was from Ludovic Courtès <ludo <at> gnu.org> to control <at> debbugs.gnu.org. (Fri, 10 May 2019 09:26:01 GMT)

Information forwarded to bug-guix <at> gnu.org:
bug#34877; Package guix. (Mon, 13 May 2019 14:22:02 GMT)

Message #16 received at 34877 <at> debbugs.gnu.org:

From: sirmacik <sirmacik <at> wioo.waw.pl>
To: 34877 <at> debbugs.gnu.org
Subject: re: #34877 Ghostscript: Missing text when converting PDF to PS
Date: Mon, 13 May 2019 12:22:42 +0200
Hey Guix!

Unfortunately this also affects printing from PDFs.  At this point I
can see what my printed result will look like by opening the file in
GNU Emacs DocView.

Maybe another way would be to package ghostscript the way Arch does?
Its gs gave me the best quality results I've had on any distro.
Main differences:
- no patches
- more libraries are provided from the system than currently in Guix.

Reference:
https://git.archlinux.org/svntogit/packages.git/tree/trunk/PKGBUILD?h=packages/ghostscript
https://www.archlinux.org/packages/extra/x86_64/ghostscript/

Unfortunately I haven't yet had the time to modify the package in such
a manner, but I'd be glad to do any testing necessary for this bug to be
fixed, as it is the only blocker to my daily use of GNU Guix System.

--
sirmacik
PGP: 0xE0DC81D523891771




Reply sent to Ludovic Courtès <ludo <at> gnu.org>:
You have taken responsibility. (Sun, 25 Aug 2019 20:54:03 GMT)

Notification sent to Diego Nicola Barbato <dnbarbato <at> posteo.de>:
bug acknowledged by developer. (Sun, 25 Aug 2019 20:54:05 GMT)

Message #24 received at 34877-done <at> debbugs.gnu.org:

From: Ludovic Courtès <ludo <at> gnu.org>
To: Diego Nicola Barbato <dnbarbato <at> posteo.de>
Cc: sirmacik <sirmacik <at> wioo.waw.pl>, 34877-done <at> debbugs.gnu.org
Subject: Re: bug#34877: Ghostscript: Missing text when converting PDF to PS
Date: Sun, 25 Aug 2019 22:53:17 +0200
Hello,

Diego Nicola Barbato <dnbarbato <at> posteo.de> skribis:

> When converting certain PDF files to PostScript pdf2ps (from the
> Ghostscript package) will print the following error messages:
>
>    **** Error reading a content stream. The page may be incomplete.
>                Output may be incorrect.
>    **** Error: File did not complete the page properly and may be damaged.
>                Output may be incorrect.

sirmacik <sirmacik <at> wioo.waw.pl> skribis:

> Unfortunately this is something that affects also printing from
> pdfs. At this point I can see what my printing result will look like
> by opening it in GNU Emacs DocView.

Good news everyone!  Commit 466ff55c72959ba1499ce3ec69f534b3038eb30b
fixes it!  The next commit makes a graft so that the working Ghostscript
is readily available to CUPS, etc.

It turned out that the primary issue was that Freetype was not found at
configure-time, due to the lack of pkg-config…  The commit above
improves a couple of other things in passing, but adding Freetype for
good is apparently the decisive change.

Please let me know if DocView, pdf2ps, and CUPS are all right for you!

Thanks,
Ludo’.
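
A quick check for the condition described in this message (Freetype not
being detected because pkg-config is absent) might look like the
following.  It is a sketch, not what the commit does: freetype_status is
a hypothetical helper, and freetype2 is assumed to be the pkg-config
module name under which Freetype is installed.

```shell
#!/bin/sh
# Hypothetical helper: report whether pkg-config is available and, if so,
# whether it can find the freetype2 module.
freetype_status() {
  if ! command -v pkg-config >/dev/null 2>&1; then
    echo "pkg-config missing"
  elif pkg-config --exists freetype2; then
    echo "freetype2 found"
  else
    echo "freetype2 not found"
  fi
}

# Usage sketch: run inside the environment used to build Ghostscript.
#   freetype_status
```

The first branch corresponds to the situation reported here, where the
lookup cannot succeed however Freetype itself is installed.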




Information forwarded to bug-guix <at> gnu.org:
bug#34877; Package guix. (Thu, 29 Aug 2019 07:55:01 GMT)

Message #27 received at 34877-done <at> debbugs.gnu.org:

From: sirmacik <sirmacik <at> wioo.waw.pl>
To: Ludovic Courtès <ludo <at> gnu.org>,
 Diego Nicola Barbato <dnbarbato <at> posteo.de>
Cc: 34877-done <at> debbugs.gnu.org
Subject: Re: bug#34877: Ghostscript: Missing text when converting PDF to PS
Date: Thu, 29 Aug 2019 09:53:44 +0200
That's great to hear, thank you!

I'll try it out over the weekend.



bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 26 Sep 2019 11:24:06 GMT)


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.