GNU bug report logs -
#39634
All keyowrds hash to the same value
Previous Next
Reported by: Rob Browning <rlb <at> defaultvalue.org>
Date: Sun, 16 Feb 2020 18:22:02 UTC
Severity: important
Done: Ludovic Courtès <ludo <at> gnu.org>
Bug is archived. No further changes may be made.
To add a comment to this bug, you must first unarchive it, by sending
a message to control AT debbugs.gnu.org, with unarchive 39634 in the body.
You can then email your comments to 39634 AT debbugs.gnu.org in the normal way.
Toggle the display of automated, internal messages from the tracker.
Report forwarded
to
bug-guile <at> gnu.org
:
bug#39634
; Package
guile
.
(Sun, 16 Feb 2020 18:22:02 GMT)
Full text and
rfc822 format available.
Acknowledgement sent
to
Rob Browning <rlb <at> defaultvalue.org>
:
New bug report received and forwarded. Copy sent to
bug-guile <at> gnu.org
.
(Sun, 16 Feb 2020 18:22:02 GMT)
Full text and
rfc822 format available.
Message #5 received at submit <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
(hash #:x most-postive-fixnum) hashes to the same value for all keyowrds
in at least 2.2 and 3.0. Here's one potential fix for 3.0, and I'd be
happy to adjust it for 2.2 if needed.
[0001-Implement-hashing-for-keywords-i.e.-hash-x.patch (text/x-diff, inline)]
From b380102564aad053f22586eb404e99c82635a3b0 Mon Sep 17 00:00:00 2001
From: Rob Browning <rlb <at> defaultvalue.org>
Date: Sun, 16 Feb 2020 12:12:08 -0600
Subject: [PATCH 1/1] Implement hashing for keywords, i.e. (hash #:x ...)
Add keyword handling to (hash ...). Previously it would just return the
same value for all keywords.
* libguile/hash.c (scm_raw_ihash): Add scm_tc7_keyword case.
* libguile/keywords.h (SCM_I_KEYWORD_HASH): New macro.
---
libguile/hash.c | 3 +++
libguile/keywords.h | 2 ++
2 files changed, 5 insertions(+)
diff --git a/libguile/hash.c b/libguile/hash.c
index d6e93dae0..c51a794c9 100644
--- a/libguile/hash.c
+++ b/libguile/hash.c
@@ -35,6 +35,7 @@
#include "chars.h"
#include "foreign.h"
#include "gsubr.h"
+#include "keywords.h"
#include "numbers.h"
#include "pairs.h"
#include "ports.h"
@@ -307,6 +308,8 @@ scm_raw_ihash (SCM obj, size_t depth)
return scm_i_string_hash (obj);
case scm_tc7_symbol:
return scm_i_symbol_hash (obj);
+ case scm_tc7_keyword:
+ return SCM_I_KEYWORD_HASH (obj);
case scm_tc7_pointer:
return scm_raw_ihashq ((uintptr_t) SCM_POINTER_VALUE (obj));
case scm_tc7_wvect:
diff --git a/libguile/keywords.h b/libguile/keywords.h
index c8f480869..cb8598d8b 100644
--- a/libguile/keywords.h
+++ b/libguile/keywords.h
@@ -60,6 +60,8 @@ SCM_API void
scm_c_bind_keyword_arguments (const char *subr, SCM rest,
scm_t_keyword_arguments_flags flags, ...);
+#define SCM_I_KEYWORD_HASH(x) scm_i_symbol_hash (SCM_CELL_OBJECT_1 (x))
+
SCM_INTERNAL void scm_init_keywords (void);
#endif /* SCM_KEYWORDS_H */
--
2.24.1
[Message part 3 (text/plain, inline)]
--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4
Severity set to 'important' from 'normal'
Request was from
Ludovic Courtès <ludo <at> gnu.org>
to
control <at> debbugs.gnu.org
.
(Thu, 20 Feb 2020 09:49:02 GMT)
Full text and
rfc822 format available.
Information forwarded
to
bug-guile <at> gnu.org
:
bug#39634
; Package
guile
.
(Thu, 20 Feb 2020 16:20:02 GMT)
Full text and
rfc822 format available.
Message #10 received at 39634 <at> debbugs.gnu.org (full text, mbox):
Hi Rob,
Rob Browning <rlb <at> defaultvalue.org> skribis:
>>From b380102564aad053f22586eb404e99c82635a3b0 Mon Sep 17 00:00:00 2001
> From: Rob Browning <rlb <at> defaultvalue.org>
> Date: Sun, 16 Feb 2020 12:12:08 -0600
> Subject: [PATCH 1/1] Implement hashing for keywords, i.e. (hash #:x ...)
>
> Add keyword handling to (hash ...). Previously it would just return the
> same value for all keywords.
>
> * libguile/hash.c (scm_raw_ihash): Add scm_tc7_keyword case.
>
> * libguile/keywords.h (SCM_I_KEYWORD_HASH): New macro.
LGTM, please push!
Andy and I discussed it on IRC and despite the fact that it’s an ABI
change, we thought including the fix in 3.0.1 was probably the better
option.
Of all the scm_tc7_ values listed in ‘scm.h’, the following are not
explicitly listed (so they go to the default case that hashes the first
word):
variable, hashtable, fluid, stringbuf, dynamic_state, frame,
atomic_box, values, program, vm_cont, bytevector, weak_set,
weak_table, array, bitvector, port
So for example all input file ports hash to the same value, all
3-element bytevectors hash to the same value, etc.
We can probably omit stringbuf, dynamic_state, and values, but the rest
should probably be fixed.
WDYT, Andy?
Thanks,
Ludo’.
Information forwarded
to
bug-guile <at> gnu.org
:
bug#39634
; Package
guile
.
(Tue, 25 Feb 2020 20:58:02 GMT)
Full text and
rfc822 format available.
Message #13 received at 39634 <at> debbugs.gnu.org (full text, mbox):
On Thu 20 Feb 2020 17:19, Ludovic Courtès <ludo <at> gnu.org> writes:
> Of all the scm_tc7_ values listed in ‘scm.h’, the following are not
> explicitly listed (so they go to the default case that hashes the first
> word):
Reformatting your list so I can check one by one :)
> variable,
> hashtable,
> fluid,
> dynamic_state,
> frame,
> atomic_box,
> program,
> vm_cont,
> weak_set,
> weak_table,
> port
No equal? implementation, so should hashq() instead.
> bytevector,
> array,
> bitvector,
These have equal? implementations, and what's more, a bitvector can
equal? an array... I think we have another bug!
;; Project 2d array as 1d array (scm_tc7_array)
(define x
(make-shared-array #2b((#t #t #t)) (lambda (i) (list 0 i)) '(0 2)))
;; scm_tc7_bitvector
(define y #*111)
(equal? x y) ;; => #t
(equal? (hash x #xffffffff) (hash y #xffffffff)) ;; => #f
Similarly for 1-d scm_tc7_array versus regular vectors, bytevectors,
etc.
Fixing this will not be straightforward... I think basically 1d arrays
need some special hashing logic so that e.g. vectors and 1d arrays hash
to the same thing.
> stringbuf,
> values,
These are never exposed to Scheme, and never compared using equal?
AFAIU. No need for special cases.
Basically I think the tc7 case should default to hashq, and include
special cases for the ones that have equal? implementations or which
have read syntax.
Sound right to you?
Andy
Information forwarded
to
bug-guile <at> gnu.org
:
bug#39634
; Package
guile
.
(Tue, 25 Feb 2020 22:15:02 GMT)
Full text and
rfc822 format available.
Message #16 received at 39634 <at> debbugs.gnu.org (full text, mbox):
> On 25 Feb 2020, at 21:56, Andy Wingo <wingo <at> igalia.com> wrote:
>
> On Thu 20 Feb 2020 17:19, Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Of all the scm_tc7_ values listed in ‘scm.h’, the following are not
>> explicitly listed (so they go to the default case that hashes the first
>> word):
>
> Reformatting your list so I can check one by one :)
>
>> variable,
>> hashtable,
>> fluid,
>> dynamic_state,
>> frame,
>> atomic_box,
>> program,
>> vm_cont,
>> weak_set,
>> weak_table,
>> port
>
> No equal? implementation, so should hashq() instead.
>
>> bytevector,
>> array,
>> bitvector,
>
> These have equal? implementations, and what's more, a bitvector can
> equal? an array... I think we have another bug!
>
> ;; Project 2d array as 1d array (scm_tc7_array)
> (define x
> (make-shared-array #2b((#t #t #t)) (lambda (i) (list 0 i)) '(0 2)))
> ;; scm_tc7_bitvector
> (define y #*111)
>
> (equal? x y) ;; => #t
> (equal? (hash x #xffffffff) (hash y #xffffffff)) ;; => #f
>
> Similarly for 1-d scm_tc7_array versus regular vectors, bytevectors,
> etc.
>
> Fixing this will not be straightforward... I think basically 1d arrays
> need some special hashing logic so that e.g. vectors and 1d arrays hash
> to the same thing.
I cannot check at the moment but I think that use of make-shared-array is special cased to return a bitvector because the shared array and the root happen to be equivalent. So your x isn't a scm_tc7_array but a scm_tc7_bitvector. The same is true for the other vector types. You can see that if you make a shared array with bounds '(0 1) instead of '(0 2) for example, or non-unit step or non-zero lower bound or anything that cannot be represented as a root vector.
Now it is true that functionally a root vector and a 1d array with the same bounds and the same elements are equivalent even if the array has non-unit stride and so on, but we had that logic before were you could use the root vector functions on arrays and it was an absolute mess. I think there should be a logic to hash n-d arrays that extends to 1-d arrays so there's no need to make special cases. All vectors can be treated as 1-d arrays so that should work fine for those too.
Partially related, I have a series of patches on wip-vector-cleanup to make sure that the various vector implementations don't depend on arrays (as they still do on some cases) but rather the other way around, strictly. I haven't posted about it b/c it changes a few interfaces and I haven't figured out the deprecation route.
regards
>
>> stringbuf,
>> values,
>
> These are never exposed to Scheme, and never compared using equal?
> AFAIU. No need for special cases.
>
> Basically I think the tc7 case should default to hashq, and include
> special cases for the ones that have equal? implementations or which
> have read syntax.
>
> Sound right to you?
>
> Andy
>
>
>
Information forwarded
to
bug-guile <at> gnu.org
:
bug#39634
; Package
guile
.
(Wed, 26 Feb 2020 21:03:01 GMT)
Full text and
rfc822 format available.
Message #19 received at 39634 <at> debbugs.gnu.org (full text, mbox):
Hi Andy!
Andy Wingo <wingo <at> igalia.com> skribis:
> On Thu 20 Feb 2020 17:19, Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Of all the scm_tc7_ values listed in ‘scm.h’, the following are not
>> explicitly listed (so they go to the default case that hashes the first
>> word):
>
> Reformatting your list so I can check one by one :)
>
>> variable,
>> hashtable,
>> fluid,
>> dynamic_state,
>> frame,
>> atomic_box,
>> program,
>> vm_cont,
>> weak_set,
>> weak_table,
>> port
>
> No equal? implementation, so should hashq() instead.
>
>> bytevector,
>> array,
>> bitvector,
>
> These have equal? implementations, and what's more, a bitvector can
> equal? an array... I think we have another bug!
>
> ;; Project 2d array as 1d array (scm_tc7_array)
> (define x
> (make-shared-array #2b((#t #t #t)) (lambda (i) (list 0 i)) '(0 2)))
> ;; scm_tc7_bitvector
> (define y #*111)
>
> (equal? x y) ;; => #t
> (equal? (hash x #xffffffff) (hash y #xffffffff)) ;; => #f
>
> Similarly for 1-d scm_tc7_array versus regular vectors, bytevectors,
> etc.
Oops.
> Fixing this will not be straightforward... I think basically 1d arrays
> need some special hashing logic so that e.g. vectors and 1d arrays hash
> to the same thing.
OK.
>> stringbuf,
>> values,
>
> These are never exposed to Scheme, and never compared using equal?
> AFAIU. No need for special cases.
Agreed.
> Basically I think the tc7 case should default to hashq, and include
> special cases for the ones that have equal? implementations or which
> have read syntax.
>
> Sound right to you?
It does, yes.
Thanks for looking into it,
Ludo’.
Information forwarded
to
bug-guile <at> gnu.org
:
bug#39634
; Package
guile
.
(Fri, 06 Mar 2020 14:43:01 GMT)
Full text and
rfc822 format available.
Message #22 received at 39634 <at> debbugs.gnu.org (full text, mbox):
[Message part 1 (text/plain, inline)]
Andy Wingo <wingo <at> igalia.com> skribis:
>> variable,
>> hashtable,
>> fluid,
>> dynamic_state,
>> frame,
>> atomic_box,
>> program,
>> vm_cont,
>> weak_set,
>> weak_table,
>> port
>
> No equal? implementation, so should hashq() instead.
Here’s a patch for these, let me know what you think!
(Of course longer term it’d be nice to have a centralized way to declare
each tc7 with its equal and hash methods.)
Ludo’.
[Message part 2 (text/x-patch, inline)]
diff --git a/libguile/hash.c b/libguile/hash.c
index c51a794c9..9cb8fcedd 100644
--- a/libguile/hash.c
+++ b/libguile/hash.c
@@ -1,4 +1,4 @@
-/* Copyright 1995-1997,2000-2001,2003-2004,2006,2008-2015,2017-2018
+/* Copyright 1995-1997,2000-2001,2003-2004,2006,2008-2015,2017-2018,2020
Free Software Foundation, Inc.
This file is part of Guile.
@@ -331,6 +331,22 @@ scm_raw_ihash (SCM obj, size_t depth)
h ^= scm_raw_ihash (scm_syntax_module (obj), depth);
return h;
}
+
+ /* The following tc7s have no 'equal?' implementation. Thus, just
+ fall back to 'hashq'. */
+ case scm_tc7_variable:
+ case scm_tc7_hashtable:
+ case scm_tc7_fluid:
+ case scm_tc7_dynamic_state:
+ case scm_tc7_frame:
+ case scm_tc7_atomic_box:
+ case scm_tc7_program:
+ case scm_tc7_vm_cont:
+ case scm_tc7_weak_set:
+ case scm_tc7_weak_table:
+ case scm_tc7_port:
+ return scm_raw_ihashq (SCM_UNPACK (obj));
+
case scm_tcs_cons_imcar:
case scm_tcs_cons_nimcar:
if (depth)
Information forwarded
to
bug-guile <at> gnu.org
:
bug#39634
; Package
guile
.
(Fri, 06 Mar 2020 15:44:02 GMT)
Full text and
rfc822 format available.
Message #25 received at 39634 <at> debbugs.gnu.org (full text, mbox):
On Fri 06 Mar 2020 15:42, Ludovic Courtès <ludo <at> gnu.org> writes:
> Andy Wingo <wingo <at> igalia.com> skribis:
>
>>> variable,
>>> hashtable,
>>> fluid,
>>> dynamic_state,
>>> frame,
>>> atomic_box,
>>> program,
>>> vm_cont,
>>> weak_set,
>>> weak_table,
>>> port
>>
>> No equal? implementation, so should hashq() instead.
>
> Here’s a patch for these, let me know what you think!
LGTM!
Andy
Reply sent
to
Ludovic Courtès <ludo <at> gnu.org>
:
You have taken responsibility.
(Fri, 06 Mar 2020 16:20:02 GMT)
Full text and
rfc822 format available.
Notification sent
to
Rob Browning <rlb <at> defaultvalue.org>
:
bug acknowledged by developer.
(Fri, 06 Mar 2020 16:20:02 GMT)
Full text and
rfc822 format available.
Message #30 received at 39634-done <at> debbugs.gnu.org (full text, mbox):
Andy Wingo <wingo <at> igalia.com> skribis:
> On Fri 06 Mar 2020 15:42, Ludovic Courtès <ludo <at> gnu.org> writes:
>
>> Andy Wingo <wingo <at> igalia.com> skribis:
>>
>>>> variable,
>>>> hashtable,
>>>> fluid,
>>>> dynamic_state,
>>>> frame,
>>>> atomic_box,
>>>> program,
>>>> vm_cont,
>>>> weak_set,
>>>> weak_table,
>>>> port
>>>
>>> No equal? implementation, so should hashq() instead.
>>
>> Here’s a patch for these, let me know what you think!
>
> LGTM!
Pushed as c5d3b45c9f903cc9726b4bd7e3c13db56bafd587, thanks!
Ludo'.
bug archived.
Request was from
Debbugs Internal Request <help-debbugs <at> gnu.org>
to
internal_control <at> debbugs.gnu.org
.
(Sat, 04 Apr 2020 11:24:04 GMT)
Full text and
rfc822 format available.
This bug report was last modified 4 years and 20 days ago.
Previous Next
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.