introduce hasheq() and oideq()
authorJeff King <peff@peff.net>
Tue, 28 Aug 2018 21:22:35 +0000 (17:22 -0400)
committerJunio C Hamano <gitster@pobox.com>
Wed, 29 Aug 2018 18:32:49 +0000 (11:32 -0700)
The main comparison functions we provide for comparing
object ids are hashcmp() and oidcmp(). These are more
flexible than a strict equality check, since they also
express ordering. That makes them useful for sorting and
binary searching. However, it also makes them potentially
slower than a strict equality check. Consider this C code,
which is traditionally what our hashcmp has looked like:

#include <string.h>
int hashcmp(const unsigned char *a, const unsigned char *b)
{
return memcmp(a, b, 20);
}

Compiling with "gcc -O2 -S -fverbose-asm", the generated
assembly shows that we actually call memcmp(). But if we
change this to a strict equality check:

return !memcmp(a, b, 20);

we get a faster inline version:

movq (%rdi), %rax # MEM[(void *)a_4(D)], MEM[(void *)a_4(D)]
movq 8(%rdi), %rdx # MEM[(void *)a_4(D)], tmp101
xorq (%rsi), %rax # MEM[(void *)b_5(D)], tmp94
xorq 8(%rsi), %rdx # MEM[(void *)b_5(D)], tmp93
orq %rax, %rdx # tmp94, tmp93
jne .L2 #,
movl 16(%rsi), %eax # MEM[(void *)b_5(D)], tmp104
cmpl %eax, 16(%rdi) # tmp104, MEM[(void *)a_4(D)]
je .L5 #,

Obviously our hashcmp() doesn't include the "!". But because
it's an inline function, optimizing compilers are able to
see "!hashcmp(a,b)" in calling code and take advantage of
this case. So there has been no value thus far in
introducing a more restricted interface for doing strict
equality checks.

But as Git learns about more values for the_hash_algo, our
hashcmp() will grow more complicated and may even delegate
at runtime to functions optimized specifically for that hash
size. That breaks the inline connection we have, and the
compiler will have to assume that the caller really cares
about the sign and magnitude of the memcmp() result, even
though the vast majority don't.

We can solve that by introducing a hasheq() function (and
matching oideq() wrapper), which callers can use to make it
clear that they only care about equality. For now, the
implementation will literally be "!hashcmp()", but it frees
us up later to introduce code optimized specifically for the
equality check.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cache.h
diff --git a/cache.h b/cache.h
index 4d014541ab7bc7692919c871a5306543bbf361c5..f6d227fac7c75447b75fc7d8217319221860ea72 100644 (file)
--- a/cache.h
+++ b/cache.h
@@ -1041,6 +1041,16 @@ static inline int oidcmp(const struct object_id *oid1, const struct object_id *o
        return hashcmp(oid1->hash, oid2->hash);
 }
 
        return hashcmp(oid1->hash, oid2->hash);
 }
 
+static inline int hasheq(const unsigned char *sha1, const unsigned char *sha2)
+{
+       return !hashcmp(sha1, sha2);
+}
+
+static inline int oideq(const struct object_id *oid1, const struct object_id *oid2)
+{
+       return hasheq(oid1->hash, oid2->hash);
+}
+
 static inline int is_null_sha1(const unsigned char *sha1)
 {
        return !hashcmp(sha1, null_sha1);
 static inline int is_null_sha1(const unsigned char *sha1)
 {
        return !hashcmp(sha1, null_sha1);