GNU bug report logs - #15368
HTTP client is slow [2.0.9]

Previous Next

Package: guile;

Reported by: ludo <at> gnu.org (Ludovic Courtès)

Date: Fri, 13 Sep 2013 13:47:01 UTC

Severity: normal

Done: ludo <at> gnu.org (Ludovic Courtès)

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 15368 in the body.
You can then email your comments to 15368 AT debbugs.gnu.org in the normal way.

Toggle the display of automated, internal messages from the tracker.

View this report as an mbox folder, status mbox, maintainer mbox


Report forwarded to bug-guile <at> gnu.org:
bug#15368; Package guile. (Fri, 13 Sep 2013 13:47:01 GMT) Full text and rfc822 format available.

Acknowledgement sent to ludo <at> gnu.org (Ludovic Courtès):
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Fri, 13 Sep 2013 13:47:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: bug-guile <at> gnu.org
Subject: HTTP client is slow [2.0.9]
Date: Fri, 13 Sep 2013 15:41:27 +0200
I just noticed that our HTTP client is very slow.  Consider this:

--8<---------------cut here---------------start------------->8---
(use-modules (web client)
             (rnrs io ports)
             (rnrs bytevectors)
             (srfi srfi-11)
             (ice-9 format))

(define %uri
  "http://ftp.gnu.org/gnu/idutils/idutils-4.6.tar.xz")

(with-fluids ((%default-port-encoding #f))
  (let*-values (((start)
                 (gettimeofday))
                ((p)
                 (let ((s (open-socket-for-uri %uri)))
                   (setvbuf s _IONBF)
                   s))
                ((r h)
                 (http-get %uri
                           #:port p
                           #:streaming? #t
                           #:decode-body? #f))
                ((d len)
                 (let ((b (get-bytevector-all h)))
                   (values b (bytevector-length b)))
                 ;; (let ((b (make-bytevector (* 5 (expt 2 20)))))
                 ;;   (values b
                 ;;           (get-bytevector-n! h b 0 (bytevector-length b))))
                 )
                ((end)
                 (gettimeofday))
                ((throughput)
                 (let ((duration (- (car end) (car start))))
                   (/ (/ len 1024.) duration 1.0))))
    (format #t "~5,1f KiB/s (total: ~5,1f KiB)~%"
            throughput (/ len 1024.))))
--8<---------------cut here---------------end--------------->8---

Here I get a throughput of ~60 KiB/s, vs. ~400 KiB/s for wget and curl.

Looking at the strace output reveals no real difference: they all make
one syscall for each chunk of 1410 bytes.

‘time’ reports that Guile spends 0.2 s. in user and 0.8 s. in system,
both of which are an order of magnitude higher than wget/curl.

Bypassing the custom binary input ports from http.scm and response.scm
doesn’t make any big difference.  Forcing the zero-copy path in
‘scm_c_read’ doesn’t help much either.

Ideas?

Thanks,
Ludo’.




Merged 15367 15368. Request was from ludo <at> gnu.org (Ludovic Courtès) to control <at> debbugs.gnu.org. (Fri, 13 Sep 2013 15:04:02 GMT) Full text and rfc822 format available.

Information forwarded to bug-guile <at> gnu.org:
bug#15368; Package guile. (Fri, 13 Sep 2013 16:08:02 GMT) Full text and rfc822 format available.

Message #10 received at 15368 <at> debbugs.gnu.org (full text, mbox):

From: Mark H Weaver <mhw <at> netris.org>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 15368 <at> debbugs.gnu.org
Subject: Re: bug#15368: HTTP client is slow [2.0.9]
Date: Fri, 13 Sep 2013 12:07:01 -0400
Hi Ludovic,

ludo <at> gnu.org (Ludovic Courtès) writes:

> I just noticed that our HTTP client is very slow.  Consider this:
>
> (use-modules (web client)
>              (rnrs io ports)
>              (rnrs bytevectors)
>              (srfi srfi-11)
>              (ice-9 format))
>
> (define %uri
>   "http://ftp.gnu.org/gnu/idutils/idutils-4.6.tar.xz")
>
> (with-fluids ((%default-port-encoding #f))
>   (let*-values (((start)
>                  (gettimeofday))
>                 ((p)
>                  (let ((s (open-socket-for-uri %uri)))
>                    (setvbuf s _IONBF)

Why are you using an unbuffered port?  On my system, changing this to
_IOFBF increases throughput from 326 KiB/s to 489.0 KiB/s.

Also, the fact that my throughput is so much higher than yours (on a
several-year-old computer) is interesting.  Obviously I have a faster
net connection (wget reports 1.19M/s), but the fact that Guile can
benefit so much from my faster connection suggests that the body is read
reasonably efficiently.  I guess the problem is added latency somewhere,
or perhaps inefficiency in the writing of the request or reading of the
response headers.

Note that using an unbuffered port means that all the reads of the
response headers will be done 1 byte at a time.

>                    s))
>                 ((r h)
>                  (http-get %uri
>                            #:port p
>                            #:streaming? #t
>                            #:decode-body? #f))
>                 ((d len)
>                  (let ((b (get-bytevector-all h)))
>                    (values b (bytevector-length b)))
>                  ;; (let ((b (make-bytevector (* 5 (expt 2 20)))))
>                  ;;   (values b
>                  ;;           (get-bytevector-n! h b 0 (bytevector-length b))))
>                  )
>                 ((end)
>                  (gettimeofday))
>                 ((throughput)
>                  (let ((duration (- (car end) (car start))))
>                    (/ (/ len 1024.) duration 1.0))))
>     (format #t "~5,1f KiB/s (total: ~5,1f KiB)~%"
>             throughput (/ len 1024.))))
>
> Here I get a throughput of ~60 KiB/s, vs. ~400 KiB/s for wget and curl.
>
> Looking at the strace output reveals no real difference: they all make
> one syscall for each chunk of 1410 bytes.
>
> ‘time’ reports that Guile spends 0.2 s. in user and 0.8 s. in system,
> both of which are an order of magnitude higher than wget/curl.

If they make essentially the same syscalls, then why would the system
time be an order of magnitude higher?  Something doesn't sound right
here.

    Regards,
      Mark




Information forwarded to bug-guile <at> gnu.org:
bug#15368; Package guile. (Fri, 13 Sep 2013 21:15:02 GMT) Full text and rfc822 format available.

Message #13 received at 15368 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Mark H Weaver <mhw <at> netris.org>
Cc: 15368 <at> debbugs.gnu.org
Subject: Re: bug#15368: HTTP client is slow [2.0.9]
Date: Fri, 13 Sep 2013 23:14:07 +0200
[Message part 1 (text/plain, inline)]
Mark H Weaver <mhw <at> netris.org> skribis:

> ludo <at> gnu.org (Ludovic Courtès) writes:
>
>> I just noticed that our HTTP client is very slow.  Consider this:
>>
>> (use-modules (web client)
>>              (rnrs io ports)
>>              (rnrs bytevectors)
>>              (srfi srfi-11)
>>              (ice-9 format))
>>
>> (define %uri
>>   "http://ftp.gnu.org/gnu/idutils/idutils-4.6.tar.xz")
>>
>> (with-fluids ((%default-port-encoding #f))
>>   (let*-values (((start)
>>                  (gettimeofday))
>>                 ((p)
>>                  (let ((s (open-socket-for-uri %uri)))
>>                    (setvbuf s _IONBF)
>
> Why are you using an unbuffered port?  On my system, changing this to
> _IOFBF increases throughput from 326 KiB/s to 489.0 KiB/s.

Arf, that’s because I was also forcing the ‘scm_c_read’ hack (which
is currently never used, and this is a bug):

[Message part 2 (text/x-patch, inline)]
diff --git a/libguile/ports.c b/libguile/ports.c
index 9068c5c..c217712 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -1657,7 +1657,8 @@ scm_c_read (SCM port, void *buffer, size_t size)
      requested number of bytes.  (Note that a single scm_i_fill_input
      call does not guarantee to fill the whole of the port's read
      buffer.) */
-  if (pt->read_buf_size <= 1 && pt->encoding == NULL)
+  if (pt->read_buf_size <= 1
+      && (pt->encoding == NULL || strcmp (pt->encoding, "ISO-8859-1") == 0))
     {
       /* The port that we are reading from is unbuffered - i.e. does
 	 not have its own persistent buffer - but we have a buffer,
[Message part 3 (text/plain, inline)]
So in practice it was reading several KiB at a time, doing zero-copy.

> Also, the fact that my throughput is so much higher than yours (on a
> several-year-old computer) is interesting.  Obviously I have a faster
> net connection (wget reports 1.19M/s),

So for you wget is ~2.5 times faster than Guile, right?

[...]

>> Looking at the strace output reveals no real difference: they all make
>> one syscall for each chunk of 1410 bytes.
>>
>> ‘time’ reports that Guile spends 0.2 s. in user and 0.8 s. in system,
>> both of which are an order of magnitude higher than wget/curl.
>
> If they make essentially the same syscalls, then why would the system
> time be an order of magnitude higher?  Something doesn't sound right
> here.

I concur.

I’ve tried Linux perf and OProfile but failed to get useful info.

Ludo’.

Disconnected #15367 from all other report(s). Request was from Mark H Weaver <mhw <at> netris.org> to control <at> debbugs.gnu.org. (Wed, 22 Jan 2014 16:13:02 GMT) Full text and rfc822 format available.

Merged 15367 15368. Request was from Mark H Weaver <mhw <at> netris.org> to control <at> debbugs.gnu.org. (Wed, 22 Jan 2014 16:13:02 GMT) Full text and rfc822 format available.

Disconnected #15367 from all other report(s). Request was from Mark H Weaver <mhw <at> netris.org> to control <at> debbugs.gnu.org. (Wed, 22 Jan 2014 16:17:02 GMT) Full text and rfc822 format available.

Reply sent to ludo <at> gnu.org (Ludovic Courtès):
You have taken responsibility. (Fri, 23 May 2014 20:15:03 GMT) Full text and rfc822 format available.

Notification sent to ludo <at> gnu.org (Ludovic Courtès):
bug acknowledged by developer. (Fri, 23 May 2014 20:15:04 GMT) Full text and rfc822 format available.

Message #24 received at 15368-done <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: 15368-done <at> debbugs.gnu.org
Subject: Re: bug#15368: HTTP client is slow [2.0.9]
Date: Fri, 23 May 2014 22:14:00 +0200
ludo <at> gnu.org (Ludovic Courtès) skribis:

> Here I get a throughput of ~60 KiB/s, vs. ~400 KiB/s for wget and curl.

There’s one little detail I hadn’t even bothered checking:

          ;; Enlarge the receive buffer.
          (setsockopt s SOL_SOCKET SO_RCVBUF (* 12 1024))


Its effect was to *shrink* the receive buffer from 124 KiB (the default
size, per /proc/sys/net/core/rmem_default) to 12 KiB...

Fixed in 0bb3f94, which will be in 2.0.12.

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Sat, 21 Jun 2014 11:24:04 GMT) Full text and rfc822 format available.

This bug report was last modified 9 years and 284 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.