GNU bug report logs - #13544
(web http) fails to parse numeric timezones in Date header

Package: guile;

Reported by: ludo <at> gnu.org (Ludovic Courtès)

Date: Thu, 24 Jan 2013 22:23:02 UTC

Severity: normal

Done: ludo <at> gnu.org (Ludovic Courtès)

Bug is archived. No further changes may be made.

To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 13544 in the body.
You can then email your comments to 13544 AT debbugs.gnu.org in the normal way.


Report forwarded to bug-guile <at> gnu.org:
bug#13544; Package guile. (Thu, 24 Jan 2013 22:23:02 GMT) Full text and rfc822 format available.

Acknowledgement sent to ludo <at> gnu.org (Ludovic Courtès):
New bug report received and forwarded. Copy sent to bug-guile <at> gnu.org. (Thu, 24 Jan 2013 22:23:02 GMT) Full text and rfc822 format available.

Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: bug-guile <at> gnu.org
Cc: Cyril Roelandt <tipecaml <at> gmail.com>
Subject: (web http) fails to parse numeric timezones in Date header
Date: Thu, 24 Jan 2013 23:13:39 +0100
[Message part 1 (text/plain, inline)]
--8<---------------cut here---------------start------------->8---
scheme@(guile-user)> (use-modules(web client)(web uri))
scheme@(guile-user)> (http-get (string->uri "http://www.sqlite.org/"))
web/http.scm:768:6: In procedure parse-asctime-date:
web/http.scm:768:6: Bad Date header: Thu, 24  Jan 2013 21:53:01 +0000
--8<---------------cut here---------------end--------------->8---

RFC 1123 reads:

       There is a strong trend towards the use of numeric timezone
       indicators, and implementations SHOULD use numeric timezones
       instead of timezone names.  However, all implementations MUST
       accept either notation.  If timezone names are used, they MUST
       be exactly as defined in RFC-822.

Here’s a tentative patch to fix it:

[Message part 2 (text/x-patch, inline)]
diff --git a/module/web/http.scm b/module/web/http.scm
index 216fddd..2ab5bd0 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -1,6 +1,6 @@
 ;;; HTTP messages
 
-;; Copyright (C)  2010, 2011, 2012 Free Software Foundation, Inc.
+;; Copyright (C)  2010, 2011, 2012, 2013 Free Software Foundation, Inc.
 
 ;; This library is free software; you can redistribute it and/or
 ;; modify it under the terms of the GNU Lesser General Public
@@ -732,6 +732,20 @@ as an ordered alist."
                (minute (parse-non-negative-integer str 19 21))
                (second (parse-non-negative-integer str 22 24)))
            (make-date 0 second minute hour date month year 0)))
+        ((string-match? str "aaa, dd aaa dddd dd:dd:dd .0000")
+         (let ((date (parse-non-negative-integer str 5 7))
+               (month (parse-month str 8 11))
+               (year (parse-non-negative-integer str 12 16))
+               (hour (parse-non-negative-integer str 17 19))
+               (minute (parse-non-negative-integer str 20 22))
+               (second (parse-non-negative-integer str 23 25))
+               (tz (parse-non-negative-integer str 27 31))
+               (tz-sign (case (string-ref str 26)
+                          ((#\+) +1)
+                          ((#\-) -1)
+                          (else (bad-header 'date str) #f))))
+           (make-date 0 second minute hour date month year
+                      (* tz-sign tz))))
         (else
          (bad-header 'date str)         ; prevent tail call
          #f)))
@@ -778,7 +792,8 @@ as an ordered alist."
     (make-date 0 second minute hour date month year 0)))
 
 (define (parse-date str)
-  (if (string-suffix? " GMT" str)
+  (if (or (string-suffix? " GMT" str)
+          (string-match "[+-][0-9]{4}$" str))
       (let ((comma (string-index str #\,)))
         (cond ((not comma) (bad-header 'date str))
               ((= comma 3) (parse-rfc-822-date str))
[Message part 3 (text/plain, inline)]
Problem is, this particular example has another problem: it has an extra
space before the month name.

How is this best addressed?  Should the parser be more tolerant,
possibly using plain regexps?

Thanks,
Ludo’.
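
As a rough illustration of the regexp idea raised above, the sketch below
uses Guile's POSIX regexp support to tolerate runs of spaces between fields
and to accept either "GMT" or a numeric zone.  The names `date-rx' and
`split-http-date' are hypothetical; they are not part of (web http) and
this is not the patch under discussion.

```scheme
(use-modules (ice-9 regex))

;; Hypothetical pattern: one or more spaces between fields, and either
;; "GMT" or a signed four-digit offset at the end.
(define date-rx
  (make-regexp
   (string-append "^[A-Za-z]{3}, +([0-9]{1,2}) +([A-Za-z]{3}) +([0-9]{4})"
                  " +([0-9]{2}):([0-9]{2}):([0-9]{2}) +(GMT|[+-][0-9]{4})$")))

;; Return the seven captured fields as a list of strings, or #f when
;; the string does not match.
(define (split-http-date str)
  (let ((m (regexp-exec date-rx str)))
    (and m
         (map (lambda (i) (match:substring m i))
              '(1 2 3 4 5 6 7)))))

(display (split-http-date "Thu, 24  Jan 2013 21:53:01 +0000"))
(newline)
```

The fields come back as strings (day, month name, year, hour, minute,
second, zone), so a real parser would still need to convert and validate
them.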

Message #8 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 13544 <at> debbugs.gnu.org, Cyril Roelandt <tipecaml <at> gmail.com>
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Thu, 07 Mar 2013 23:28:42 +0100
On Thu 24 Jan 2013 23:13, ludo <at> gnu.org (Ludovic Courtès) writes:

> scheme@(guile-user)> (use-modules(web client)(web uri))
> scheme@(guile-user)> (http-get (string->uri "http://www.sqlite.org/"))
> web/http.scm:768:6: In procedure parse-asctime-date:
> web/http.scm:768:6: Bad Date header: Thu, 24  Jan 2013 21:53:01 +0000

As you can see here:

  http://pretty-rfc.herokuapp.com/RFC2616#date-time-formats

HTTP doesn't actually support other time zones.  The date header being
reported by sqlite.org is invalid.

Andy
-- 
http://wingolog.org/




Message #11 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Hartwig <mandyke <at> gmail.com>
To: Andy Wingo <wingo <at> pobox.com>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 13544 <at> debbugs.gnu.org,
	Cyril Roelandt <tipecaml <at> gmail.com>
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Sat, 9 Mar 2013 09:41:46 +0800
On 8 March 2013 06:28, Andy Wingo <wingo <at> pobox.com> wrote:
> On Thu 24 Jan 2013 23:13, ludo <at> gnu.org (Ludovic Courtès) writes:
>
>> scheme@(guile-user)> (use-modules(web client)(web uri))
>> scheme@(guile-user)> (http-get (string->uri "http://www.sqlite.org/"))
>> web/http.scm:768:6: In procedure parse-asctime-date:
>> web/http.scm:768:6: Bad Date header: Thu, 24  Jan 2013 21:53:01 +0000
>
> As you can see here:
>
>   http://pretty-rfc.herokuapp.com/RFC2616#date-time-formats
>
> HTTP doesn't actually support other time zones.  The date header being
> reported by sqlite.org is invalid.

Correct, though ‘+0000’ is the right time zone; only the format is
wrong (according to RFC 2616).  A survey of HTTP sites I performed
last year as research for another header issue in Guile showed
something like 1% of those sites using the numeric timezone format,
contrary to the specification.  This is likely due to mistaken reliance
on RFC 1123 (quoted by Ludo in the original report), which indicates a
preference for numeric timezones that are indirectly forbidden by
RFC 2616 (which states that the timezone must be the string “GMT”).

Interpreting ‘+0000’ timezone is sensible in a robust implementation,
though what to do if a numeric timezone is given other than this?
Converting it to GMT is one option, since the spec. defines that the
header must be in this timezone.




Message #14 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Hartwig <mandyke <at> gmail.com>
To: Andy Wingo <wingo <at> pobox.com>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 13544 <at> debbugs.gnu.org,
	Cyril Roelandt <tipecaml <at> gmail.com>
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Sat, 9 Mar 2013 10:08:56 +0800
On 9 March 2013 09:41, Daniel Hartwig <mandyke <at> gmail.com> wrote:
> A survey of HTTP sites I performed
> last year as research for another header issue in Guile showed
> something like 1% of those sites using the numeric timezone format,
> contrary to the specification.

Reference: <http://debbugs.gnu.org/cgi/bugreport.cgi?bug=10147#17>.




Message #17 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: Daniel Hartwig <mandyke <at> gmail.com>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 13544 <at> debbugs.gnu.org,
	Cyril Roelandt <tipecaml <at> gmail.com>
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Sat, 09 Mar 2013 09:21:49 +0100
On Sat 09 Mar 2013 02:41, Daniel Hartwig <mandyke <at> gmail.com> writes:

> Interpreting ‘+0000’ timezone is sensible in a robust implementation,

Yes, I agree, this makes sense.

> though what to do if a numeric timezone is given other than this?

I would continue to raise an error I think.  Timezones get complicated,
fast, and there is little hope that we could preserve correctness.
WDYT?

Andy
-- 
http://wingolog.org/
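
The strict policy suggested here could be sketched as follows.  The
procedure name `strict-zone-offset' is hypothetical, and the choice to
signal the failure with a plain `error' is an assumption made only for
this illustration; (web http) has its own error reporting.

```scheme
;; Hypothetical helper: map a zone token to an offset in seconds,
;; accepting only the two spellings of GMT discussed in this thread.
(define (strict-zone-offset zone)
  (cond ((string=? zone "GMT") 0)
        ((string=? zone "+0000") 0)
        (else (error "unsupported time zone in HTTP date:" zone))))

(display (strict-zone-offset "+0000"))
(newline)
```

Any other token, such as "+0800", raises an error instead of being
converted.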




Message #20 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Hartwig <mandyke <at> gmail.com>
To: Andy Wingo <wingo <at> pobox.com>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 13544 <at> debbugs.gnu.org,
	Cyril Roelandt <tipecaml <at> gmail.com>
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Sun, 10 Mar 2013 07:50:56 +0800
On 9 March 2013 16:21, Andy Wingo <wingo <at> pobox.com> wrote:
> On Sat 09 Mar 2013 02:41, Daniel Hartwig <mandyke <at> gmail.com> writes:
>
>> Interpreting ‘+0000’ timezone is sensible in a robust implementation,
>
> Yes, I agree, this makes sense.
>
>> though what to do if a numeric timezone is given other than this?
>
> I would continue to raise an error I think.  Timezones get complicated,
> fast, and there is little hope that we could preserve correctness.
> WDYT?

Ok.  What about Ludo's original comment, about the extra space in the
sqlite header?




Message #23 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: Daniel Hartwig <mandyke <at> gmail.com>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 13544 <at> debbugs.gnu.org,
	Cyril Roelandt <tipecaml <at> gmail.com>
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Sun, 10 Mar 2013 19:31:56 +0100
On Sun 10 Mar 2013 00:50, Daniel Hartwig <mandyke <at> gmail.com> writes:

> On 9 March 2013 16:21, Andy Wingo <wingo <at> pobox.com> wrote:
>> On Sat 09 Mar 2013 02:41, Daniel Hartwig <mandyke <at> gmail.com> writes:
>>
>>> Interpreting ‘+0000’ timezone is sensible in a robust implementation,
>>
>> Yes, I agree, this makes sense.
>>
>>> though what to do if a numeric timezone is given other than this?
>>
>> I would continue to raise an error I think.  Timezones get complicated,
>> fast, and there is little hope that we could preserve correctness.
>> WDYT?
>
> Ok.  What about Ludo's original comment, about the extra space in the
> sqlite header?

Dunno.  Is it common?  In this particular case I would mail and try to
get them to fix their server, given that it is run by hackers.  Let us
leave that particular issue for another bug.

Andy
-- 
http://wingolog.org/




Message #26 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Andy Wingo <wingo <at> pobox.com>
Cc: 13544 <at> debbugs.gnu.org, Cyril Roelandt <tipecaml <at> gmail.com>,
	Daniel Hartwig <mandyke <at> gmail.com>
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Thu, 14 Mar 2013 14:34:28 +0100
Andy Wingo <wingo <at> pobox.com> skribis:

> On Sun 10 Mar 2013 00:50, Daniel Hartwig <mandyke <at> gmail.com> writes:
>
>> On 9 March 2013 16:21, Andy Wingo <wingo <at> pobox.com> wrote:
>>> On Sat 09 Mar 2013 02:41, Daniel Hartwig <mandyke <at> gmail.com> writes:
>>>
>>>> Interpreting ‘+0000’ timezone is sensible in a robust implementation,
>>>
>>> Yes, I agree, this makes sense.
>>>
>>>> though what to do if a numeric timezone is given other than this?
>>>
>>> I would continue to raise an error I think.  Timezones get complicated,
>>> fast, and there is little hope that we could preserve correctness.
>>> WDYT?
>>
>> Ok.  What about Ludo's original comment, about the extra space in the
>> sqlite header?
>
> Dunno.  Is it common?  In this particular case I would mail and try to
> get them to fix their server, given that it is run by hackers.  Let us
> leave that particular issue for another bug.

I think standards unfortunately don’t matter as much as usage here.

Fossil’s web server (by the same author, I think) doesn’t have the
problem, and sqlite.org doesn’t have a ‘Server’ header, so it’s hard to
tell if it’s common.

Ludo’.




Message #29 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: Andy Wingo <wingo <at> pobox.com>
To: ludo <at> gnu.org (Ludovic Courtès)
Cc: 13544 <at> debbugs.gnu.org, Cyril Roelandt <tipecaml <at> gmail.com>,
	Daniel Hartwig <mandyke <at> gmail.com>
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Thu, 14 Mar 2013 16:00:44 +0100
On Thu 14 Mar 2013 14:34, ludo <at> gnu.org (Ludovic Courtès) writes:

>>> Ok.  What about Ludo's original comment, about the extra space in the
>>> sqlite header?
>>
>> Dunno.  Is it common?  In this particular case I would mail and try to
>> get them to fix their server, given that it is run by hackers.  Let us
>> leave that particular issue for another bug.
>
> I think standards unfortunately don’t matter as much as usage here.

It's a tradeoff.  Guile's web module is not permissive; though perhaps a
permissive parsing flag could make sense (one that doesn't propagate
exceptions).  But anyway it will never parse the whole range of crap
that people put on the internet.  So with nonstandard productions it's
always a tradeoff.  In this case the tradeoff is not worth it to me,
especially given other options, but that is MHO.

Andy
-- 
http://wingolog.org/




Message #32 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Andy Wingo <wingo <at> pobox.com>
Cc: 13544 <at> debbugs.gnu.org, Cyril Roelandt <tipecaml <at> gmail.com>,
	Daniel Hartwig <mandyke <at> gmail.com>
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Thu, 14 Mar 2013 17:07:37 +0100
Andy Wingo <wingo <at> pobox.com> skribis:

> On Thu 14 Mar 2013 14:34, ludo <at> gnu.org (Ludovic Courtès) writes:
>
>>>> Ok.  What about Ludo's original comment, about the extra space in the
>>>> sqlite header?
>>>
>>> Dunno.  Is it common?  In this particular case I would mail and try to
>>> get them to fix their server, given that it is run by hackers.  Let us
>>> leave that particular issue for another bug.
>>
>> I think standards unfortunately don’t matter as much as usage here.
>
> It's a tradeoff.

Yes, of course.  That was a broad statement, not an argument for this
particular case.

I think looking at the workarounds found in software like Wget and cURL
gives an idea of how far we “could” go (but perhaps they don’t have any
workarounds, and instead just happen to be tolerant because they have
half-baked parsers in C.)

Ludo’.




Message #35 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Hartwig <mandyke <at> gmail.com>
To: Andy Wingo <wingo <at> pobox.com>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 13544 <at> debbugs.gnu.org,
	Cyril Roelandt <tipecaml <at> gmail.com>
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Fri, 15 Mar 2013 15:08:33 +0800
On 14 March 2013 23:00, Andy Wingo <wingo <at> pobox.com> wrote:
> On Thu 14 Mar 2013 14:34, ludo <at> gnu.org (Ludovic Courtès) writes:
>
>>>> Ok.  What about Ludo's original comment, about the extra space in the
>>>> sqlite header?
>>>
>>> Dunno.  Is it common?

In the sample data from last year there were no instances of any extra
whitespace in any date-valued header.  Let us consider it rare, which
is enough reason to not support it.  The same reasoning was applied in
#10147.  Otherwise, having ‘string-match?’ collapse whitespace may be
ok.
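
For reference, collapsing whitespace ahead of a fixed-width match might
look roughly like this.  `collapse-spaces' is a hypothetical helper, not
an existing (web http) procedure, and it is shown as a separate
pre-processing step rather than as a change to ‘string-match?’ itself.

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical pre-processing step: squeeze each run of spaces down to
;; a single space.  Leading and trailing spaces are dropped as well.
(define (collapse-spaces str)
  (string-join (filter (lambda (s) (not (string-null? s)))
                       (string-split str #\space))
               " "))

(display (collapse-spaces "Thu, 24  Jan 2013 21:53:01 +0000"))
(newline)
```

One caveat: the asctime format pads single-digit days with an extra
space ("Sun Nov  6 08:49:37 1994"), so a step like this would have to be
skipped or accounted for on that code path.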

Ludo’s patch can be applied with support for arbitrary timezones
removed.  On a related note, how RFC-strict is ‘valid-header?’
supposed to be?  At the moment it will pass a date value in any
timezone.

>>> In this particular case I would mail and try to
>>> get them to fix their server, given that it is run by hackers.  Let us
>>> leave that particular issue for another bug.
>>
>> I think standards unfortunately don’t matter as much as usage here.
>
> It's a tradeoff.  Guile's web module is not permissive; though perhaps a
> permissive parsing flag could make sense (one that doesn't propagate
> exceptions).  But anyway it will never parse the whole range of crap
> that people put on the internet.  So with nonstandard productions it's
> always a tradeoff.  In this case the tradeoff is not worth it to me,
> especially given other options, but that is MHO.

Regards




Message #38 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Hartwig <mandyke <at> gmail.com>
To: Andy Wingo <wingo <at> pobox.com>
Cc: Ludovic Courtès <ludo <at> gnu.org>, 13544 <at> debbugs.gnu.org,
	Cyril Roelandt <tipecaml <at> gmail.com>
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Fri, 15 Mar 2013 15:17:05 +0800
On 15 March 2013 15:08, Daniel Hartwig <mandyke <at> gmail.com> wrote:
> Ludo’s patch can be applied with support for arbitrary timezones
> removed.

Actually, Appendix C (RFC 2616) recommends converting non-GMT tz to GMT:

 If an HTTP header incorrectly carries a date value with a time
 zone other than GMT, it MUST be converted into GMT using the most
 conservative possible conversion.
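
For a plain numeric offset the conversion is simple arithmetic.  A
minimal sketch, assuming SRFI-19 date objects and using the example of
16:12:31 at +0800 (the variable names are illustrative only):

```scheme
(use-modules (srfi srfi-19))

;; 16:12:31 on 15 Nov 1994 at +0800, i.e. 28800 seconds east of Greenwich.
(define local (make-date 0 31 12 16 15 11 1994 (* 8 60 60)))

;; Go through UTC time, then re-express the same instant with a zero
;; zone offset.
(define gmt (time-utc->date (date->time-utc local) 0))

(display (list (date-day gmt) (date-hour gmt) (date-minute gmt)
               (date-zone-offset gmt)))
(newline)
```

The same instant is then expressed as 08:12 on the 15th with a zone
offset of 0.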




Message #41 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: Daniel Hartwig <mandyke <at> gmail.com>
To: 13544 <at> debbugs.gnu.org
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Fri, 15 Mar 2013 22:40:17 +0800
[Message part 1 (text/plain, inline)]
See attached for handling of numeric time zones that may or may not be
GMT.

[0001-web-http-parse-numeric-time-zones-in-headers.patch (text/x-diff, inline)]
From 430fc9498ee08f6d06b5ec494a5d65e395c6c067 Mon Sep 17 00:00:00 2001
From: Daniel Hartwig <mandyke <at> gmail.com>
Date: Fri, 15 Mar 2013 22:25:10 +0800
Subject: [PATCH] web http: parse numeric time zones in headers

* module/web/http.scm (parse-zone-offset, normalize-date): New
  procedures.
  (parse-rfc-822-date, parse-rfc-850-date, parse-date): Update.
* test-suite/tests/web-http.test ("general headers"): Add test.
---
 module/web/http.scm            |   61 ++++++++++++++++++++++++++++++----------
 test-suite/tests/web-http.test |    3 ++
 2 files changed, 49 insertions(+), 15 deletions(-)

diff --git a/module/web/http.scm b/module/web/http.scm
index c79d57d..975eb8e 100644
--- a/module/web/http.scm
+++ b/module/web/http.scm
@@ -702,29 +702,50 @@ as an ordered alist."
              (else (bad))))
           (else (bad))))))
 
+;; "GMT" | "+" 4DIGIT | "-" 4DIGIT
+;;
+;; RFC 2616 requires date values to use "GMT", but recommends accepting
+;; the others as they are commonly generated by e.g. RFC 822 sources.
+(define (parse-zone-offset str start)
+  (let ((s (substring str start)))
+    (define (bad)
+      (bad-header-component 'zone-offset s))
+    (cond
+     ((string=? s "GMT")
+      0)
+     ((string-match? s ".dddd")
+      (let ((sign (case (string-ref s 0)
+                    ((#\+) +1)
+                    ((#\-) -1)
+                    (else (bad))))
+            (hours (parse-non-negative-integer s 1 3))
+            (minutes (parse-non-negative-integer s 3 5)))
+        (* sign 60 (+ (* 60 hours) minutes)))) ; seconds east of Greenwich
+     (else (bad)))))
+
 ;; RFC 822, updated by RFC 1123
 ;; 
 ;; Sun, 06 Nov 1994 08:49:37 GMT
 ;; 01234567890123456789012345678
 ;; 0         1         2
-(define (parse-rfc-822-date str)
+(define (parse-rfc-822-date str space zone-offset)
   ;; We could verify the day of the week but we don't.
-  (cond ((string-match? str "aaa, dd aaa dddd dd:dd:dd GMT")
+  (cond ((string-match? (substring str 0 space) "aaa, dd aaa dddd dd:dd:dd")
          (let ((date (parse-non-negative-integer str 5 7))
                (month (parse-month str 8 11))
                (year (parse-non-negative-integer str 12 16))
                (hour (parse-non-negative-integer str 17 19))
                (minute (parse-non-negative-integer str 20 22))
                (second (parse-non-negative-integer str 23 25)))
-           (make-date 0 second minute hour date month year 0)))
-        ((string-match? str "aaa, d aaa dddd dd:dd:dd GMT")
+           (make-date 0 second minute hour date month year zone-offset)))
+        ((string-match? (substring str 0 space) "aaa, d aaa dddd dd:dd:dd")
          (let ((date (parse-non-negative-integer str 5 6))
                (month (parse-month str 7 10))
                (year (parse-non-negative-integer str 11 15))
                (hour (parse-non-negative-integer str 16 18))
                (minute (parse-non-negative-integer str 19 21))
                (second (parse-non-negative-integer str 22 24)))
-           (make-date 0 second minute hour date month year 0)))
+           (make-date 0 second minute hour date month year zone-offset)))
         (else
          (bad-header 'date str)         ; prevent tail call
          #f)))
@@ -733,10 +754,10 @@ as an ordered alist."
 ;; Sunday, 06-Nov-94 08:49:37 GMT
 ;;        0123456789012345678901
 ;;        0         1         2
-(define (parse-rfc-850-date str comma)
+(define (parse-rfc-850-date str comma space zone-offset)
   ;; We could verify the day of the week but we don't.
-  (let ((tail (substring str (1+ comma))))
-    (if (not (string-match? tail " dd-aaa-dd dd:dd:dd GMT"))
+  (let ((tail (substring str (1+ comma) space)))
+    (if (not (string-match? tail " dd-aaa-dd dd:dd:dd"))
         (bad-header 'date str))
     (let ((date (parse-non-negative-integer tail 1 3))
           (month (parse-month tail 4 7))
@@ -750,7 +771,7 @@ as an ordered alist."
                    (cond ((< (+ then 50) now) (+ then 100))
                          ((< (+ now 50) then) (- then 100))
                          (else then)))
-                 0))))
+                 zone-offset))))
 
 ;; ANSI C's asctime() format
 ;; Sun Nov  6 08:49:37 1994
@@ -770,13 +791,23 @@ as an ordered alist."
         (second (parse-non-negative-integer str 17 19)))
     (make-date 0 second minute hour date month year 0)))
 
+;; Convert all date values to GMT time zone, as per RFC 2616 appendix C.
+(define (normalize-date date)
+  (if (zero? (date-zone-offset date))
+      date
+      (time-utc->date (date->time-utc date) 0)))
+
 (define (parse-date str)
-  (if (string-suffix? " GMT" str)
-      (let ((comma (string-index str #\,)))
-        (cond ((not comma) (bad-header 'date str))
-              ((= comma 3) (parse-rfc-822-date str))
-              (else (parse-rfc-850-date str comma))))
-      (parse-asctime-date str)))
+  (let* ((space (string-rindex str #\space))
+         (zone-offset (and space (false-if-exception
+                                  (parse-zone-offset str (1+ space))))))
+    (normalize-date
+     (if zone-offset
+         (let ((comma (string-index str #\,)))
+           (cond ((not comma) (bad-header 'date str))
+                 ((= comma 3) (parse-rfc-822-date str space zone-offset))
+                 (else (parse-rfc-850-date str comma space zone-offset))))
+         (parse-asctime-date str)))))
 
 (define (write-date date port)
   (define (display-digits n digits port)
diff --git a/test-suite/tests/web-http.test b/test-suite/tests/web-http.test
index 97f5559..0baa6ab 100644
--- a/test-suite/tests/web-http.test
+++ b/test-suite/tests/web-http.test
@@ -109,6 +109,9 @@
   (pass-if-parse date "Tue, 15 Nov 1994 08:12:31 GMT"
                  (string->date "Tue, 15 Nov 1994 08:12:31 +0000"
                                "~a, ~d ~b ~Y ~H:~M:~S ~z"))
+  (pass-if-parse date "Tue, 15 Nov 1994 16:12:31 +0800"
+                 (string->date "Tue, 15 Nov 1994 08:12:31 +0000"
+                               "~a, ~d ~b ~Y ~H:~M:~S ~z"))
   (pass-if-parse date "Wed, 7 Sep 2011 11:25:00 GMT"
                  (string->date "Wed, 7 Sep 2011 11:25:00 +0000"
                                "~a,~e ~b ~Y ~H:~M:~S ~z"))
-- 
1.7.10.4


Message #44 received at 13544 <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Daniel Hartwig <mandyke <at> gmail.com>
Cc: 13544 <at> debbugs.gnu.org
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Sat, 16 Mar 2013 00:04:42 +0100
Daniel Hartwig <mandyke <at> gmail.com> skribis:

> From: Daniel Hartwig <mandyke <at> gmail.com>
> Date: Fri, 15 Mar 2013 22:25:10 +0800
> Subject: [PATCH] web http: parse numeric time zones in headers
>
> * module/web/http.scm (parse-zone-offset, normalize-date): New
>   procedures.
>   (parse-rfc-822-date, parse-rfc-850-date, parse-date): Update.
> * test-suite/tests/web-http.test ("general headers"): Add test.

Looks good to me.

Ludo’.




Reply sent to ludo <at> gnu.org (Ludovic Courtès):
You have taken responsibility. (Wed, 27 Mar 2013 15:28:01 GMT) Full text and rfc822 format available.

Notification sent to ludo <at> gnu.org (Ludovic Courtès):
bug acknowledged by developer. (Wed, 27 Mar 2013 15:28:03 GMT) Full text and rfc822 format available.

Message #49 received at 13544-done <at> debbugs.gnu.org (full text, mbox):

From: ludo <at> gnu.org (Ludovic Courtès)
To: Daniel Hartwig <mandyke <at> gmail.com>
Cc: 13544-done <at> debbugs.gnu.org
Subject: Re: bug#13544: (web http) fails to parse numeric timezones in Date
	header
Date: Wed, 27 Mar 2013 16:25:12 +0100
Daniel Hartwig <mandyke <at> gmail.com> skribis:

> From 430fc9498ee08f6d06b5ec494a5d65e395c6c067 Mon Sep 17 00:00:00 2001
> From: Daniel Hartwig <mandyke <at> gmail.com>
> Date: Fri, 15 Mar 2013 22:25:10 +0800
> Subject: [PATCH] web http: parse numeric time zones in headers
>
> * module/web/http.scm (parse-zone-offset, normalize-date): New
>   procedures.
>   (parse-rfc-822-date, parse-rfc-850-date, parse-date): Update.
> * test-suite/tests/web-http.test ("general headers"): Add test.

I’ve pushed the patch, so I guess we can close this bug now.

Thanks!

Ludo’.




bug archived. Request was from Debbugs Internal Request <help-debbugs <at> gnu.org> to internal_control <at> debbugs.gnu.org. (Thu, 25 Apr 2013 11:24:03 GMT) Full text and rfc822 format available.

This bug report was last modified 10 years and 362 days ago.


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.