On hash tables, key insertion order matters, unlike equalp #5

anticrisis · 2017-09-04T08:10:47Z

Unlike the equalp operator, the order in which keys are inserted seems to affect equals. Is this desired behavior or a bug?

CL-USER> (defvar a (make-hash-table))
A
CL-USER> (defvar b (make-hash-table))
B
CL-USER> (setf (gethash :a a) 1)
1
CL-USER> (setf (gethash :b a) 2)
2
CL-USER> (setf (gethash :b b) 2)
2
CL-USER> (setf (gethash :a b) 1)
1
CL-USER> (equalp a b)
T
CL-USER> (generic-comparability:equals a b)
NIL

This is on SBCL 1.3.21.

The text was updated successfully, but these errors were encountered:

pnathan · 2017-09-04T15:07:24Z

Off the top of my head, BUG.

…

On Sep 4, 2017 1:10 AM, "anticrisis" ***@***.***> wrote: Unlike the equalp operator, the order in which keys are inserted seems to affect equals. Is this desired behavior or a bug? CL-USER> (defvar a (make-hash-table)) A CL-USER> (defvar b (make-hash-table)) B CL-USER> (setf (gethash :a a) 1) 1 CL-USER> (setf (gethash :b a) 2) 2 CL-USER> (setf (gethash :b b) 2) 2 CL-USER> (setf (gethash :a b) 1) 1 CL-USER> (equalp a b) T CL-USER> (generic-comparability:equals a b) NIL This is on SBCL 1.3.21. — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#5>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAuc4Eqeg9KjAgy95aUVV1yNucr8BIxoks5se7CIgaJpZM4PLlkV> .

anticrisis · 2017-09-04T23:07:13Z

Seems the problem is here:

generic-comparability/generic-comparability.lisp

Lines 177 to 184 in 9ad75d4

    
           (loop 
        
              for k1 in (hash-table-keys a) 
        
              for k2 in (hash-table-keys b) 
        
              always (apply 
        
                      #'(lambda (a b) 
        
                          (equals a b :recursive recursive)) 
        
                      k1 k2 keys)) 
        
           t)

(hash-table-keys) will return keys in an unspecified order. I haven't looked at equalp, but I imagine a correct implementation would iterate through the keys of the first hash table and query the second hash table on those keys.

pnathan · 2017-09-04T23:36:37Z

Yes, you're right; the code assumes hash-table-keys is returning in a consistent order. I can't promise anything in terms of a patch, I have a heap of things to do at home. But I will definitely try to get to it soon.... N.b., an ineffecient way to do this would be to sort the keys as they come back from both, then compare. That would be some 2 * n lg n, as opposed to your conception, which would be n.

…

On Mon, Sep 4, 2017 at 4:07 PM, anticrisis ***@***.***> wrote: Seems the problem is here: https://github.com/pnathan/ generic-comparability/blob/9ad75d4be22de86efabcd8522d9f52 32033aad8b/generic-comparability.lisp#L177-L184 (hash-table-keys) will return keys in an unspecified order. I haven't looked at equalp, but I imagine a correct implementation would iterate through the keys of the first hash table and query the second hash table on those keys. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#5 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAuc4D1YFflFxqgKYz54PjZ9JtcnOf2Xks5sfIKhgaJpZM4PLlkV> .

anticrisis · 2017-09-05T07:34:02Z

In reading CDR-8, I think there's a conflict between its description of EQUALS for hash tables, and hash-tables's use of an equivalence test function. If by-key is T, CDR-8 overrules the hash table's test function for comparing keys.

For example, if the hash tables use EQUALP as their equivalence test, the keys "a" and "A" would be equivalent. But under EQUALS, "a" and "A" are not equivalent. Therefore, EQUALS would return NIL, but an iteration of every key and value in the first hash table, and a GETHASH test of those keys in the second table, would return T.

This inconsistency makes my proposed implementation incorrect, since it would rely on GETHASH to query the second hash table for the presence of each key in the first hash table. GETHASH, of course, cannot use EQUALS, because user-defined test hash table test functions are not part of the standard.

So it seems that if by-key is T, the only correct implementation of CDR-8 is to sort the keys in the first table (with COMPARE), and then check for their existence in the second table.

If by-key is NIL, I assume we should use the hash-table's equivalence test? CDR-8 is silent on this point.

The algorithm in CDR-8 seems incomplete, in that it neglects to confirm that both hash tables use the same equivalence test, unless check-properties is T. But if the hash tables don't use the same equivalence test, then isn't it meaningless to try to assert by-value equality?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

On hash tables, key insertion order matters, unlike equalp #5

On hash tables, key insertion order matters, unlike equalp #5

anticrisis commented Sep 4, 2017

pnathan commented Sep 4, 2017 via email

anticrisis commented Sep 4, 2017

pnathan commented Sep 4, 2017 via email

anticrisis commented Sep 5, 2017

On hash tables, key insertion order matters, unlike equalp #5

On hash tables, key insertion order matters, unlike equalp #5

Comments

anticrisis commented Sep 4, 2017

pnathan commented Sep 4, 2017 via email

anticrisis commented Sep 4, 2017

pnathan commented Sep 4, 2017 via email

anticrisis commented Sep 5, 2017