Compile-time string hashing.

Everything about the development of Urho3D.

Compile-time string hashing.

PostPosted by S.L.C » 20 Oct 2016, 01:35

Had some fun tonight trying to implement a constexpr version of the StringHash type (not an actual implementation. just a proof of concept). Anyway, I thought I'd share it hoping that it could be integrated into the engine since C++11 can now be optionally enabled. If necessary I'll make a pull request myself.
Last edited by S.L.C on 20 Oct 2016, 04:06, edited 1 time in total.
User avatar
S.L.C
New user
New user
 
Posts: 14
Joined: 24 Dec 2014, 21:08

Re: Compile-time string hashing.

PostPosted by yushli » 20 Oct 2016, 03:23

That sounds quite interesting. However I cann't access the link. Can you put it on github?
User avatar
yushli
Some active
Some active
 
Posts: 94
Joined: 18 Oct 2014, 05:42

Re: Compile-time string hashing.

PostPosted by S.L.C » 20 Oct 2016, 03:39

Sure, here's a gist. To actually see the difference in the generated assembly code I'd suggest something like http://gcc.godbolt.org/ with "-O3 -std=c++11" as command line. Uncommenting each example individually.

I've actually wen't ahead and integrated it with the engine and I'm trying it right now to see if it compiles (MinGW) and works. MSVC sure made it annoying considering they've introduced constexpr in the November CTP for 2013 and user defined literals again in the CTP 14 which eventually became 2015. So right now it looks a bit weird with some macros.

EDIT: Well, one thing it did so far is to shave about +250kb from the dll size (with some components disabled. like 2d, lua, database) and even from the samples as well. Considering only the StringHash type was specialized to take advantage of that, I'd say it's decent. I wonder how far I can take it. A huge opportunity is in the URHO3D_OBJECT macro. But I need to be sure not to affect exports.
User avatar
S.L.C
New user
New user
 
Posts: 14
Joined: 24 Dec 2014, 21:08

Re: Compile-time string hashing.

PostPosted by yushli » 04 Nov 2016, 10:06

Can you post the code here? I cannot access gist
User avatar
yushli
Some active
Some active
 
Posts: 94
Joined: 18 Oct 2014, 05:42

Re: Compile-time string hashing.

PostPosted by S.L.C » 04 Nov 2016, 14:35

Code: Select all
#include <cstdio>
#include <cctype>

// ------------------------------------------------------------------------------------------------

/// Retrieve the lowercase version of an ASCII character.
constexpr int tolower_c(int c)
{
    return (c > 64 && c < 91) ? (c + 32) : c;
}

/// Retrieve the uppercase version of an ASCII character.
constexpr int toupper_c(int c)
{
    return (c > 96 && c < 123) ? (c - 32) : c;
}

// ------------------------------------------------------------------------------------------------

//// Calculate the hash of the given ASCII string using the SDBM algorithm at compile time.
constexpr unsigned SDBMHash_C(const char* s, unsigned i, unsigned n, unsigned h)
{
   return i < n ? SDBMHash_C(s, i + 1, n, tolower_c(s[i]) + (h << 6) + (h << 16) - h) : h;
}

//// Update a hash with the given 8-bit value using the SDBM algorithm.
inline unsigned SDBMHash(unsigned hash, unsigned char c)
{
    return c + (hash << 6) + (hash << 16) - hash;
}

// ------------------------------------------------------------------------------------------------

// used as a workaround for overloading string literals
struct IsCharPtrDummy
{
};
template< typename T > struct IsCharPtr
{
};
template<> struct IsCharPtr< const char * >
{
    typedef IsCharPtrDummy* Type;
};
template<> struct IsCharPtr< char * >
{
    typedef IsCharPtrDummy* Type;
};

// ------------------------------------------------------------------------------------------------

// dummy hash class with optional compile time evaluation
struct shash
{
     unsigned value;

     // pre-compiled hash
    explicit constexpr shash(unsigned hash)
        : value(hash)
     {
     }

     // compile time hash
   template< unsigned N > constexpr shash(const char(&a)[N])
        : value(SDBMHash_C(a, 0, N-1, 0))
     {
     }

     // run-time non-const buffer hash
   template< unsigned N > shash(char(&a)[N])
        : value(calculate(a))
     {
     }

     // regular hash
     template< typename T > shash(T str, typename IsCharPtr<T>::Type=0)
         : value(calculate(str))
     {
     }

     // compile time hash of explicit length string
    constexpr shash(const char* str, size_t len)
        : value(SDBMHash_C(str, 0, len, 0))
     {
     }

     unsigned calculate(const char* str)
   {
       unsigned hash = 0;

       if (!str)
           return hash;

       while (*str)
       {
           // Perform the actual hashing as case-insensitive
           char c = *str;
           hash = SDBMHash(hash, (unsigned char)tolower(c));
           ++str;
       }

       return hash;
   }
};

// ------------------------------------------------------------------------------------------------

constexpr shash operator "" _hash(const char* str, size_t len)
{
    return shash(str, len);
}

constexpr unsigned operator "" _h(const char* str, size_t len)
{
    return SDBMHash_C(str, 0, len, 0);
}

// ------------------------------------------------------------------------------------------------

void test(const shash hash)
{
   printf("%u\n", hash.value);
}

// ------------------------------------------------------------------------------------------------

int main()
{
    const char * cstr = "abc";
     char sbuf[] = "abc";

     // compile time evaluation
   test("abc");
    //test(shash("abc")); // same thing
 
    // compile time evaluation with user defined literal
   //test("abc"_hash);
 
     // compile time evaluation with user defined literal
     // ugly because that constructor is explicit :(
   //test(shash("abc"_h));
 
     // run-time evaluation
     //test(cstr);
 
     // run-time evaluation on non-const buffers (preferably?)
     //test(sbuf);
 
     return 0;
}
User avatar
S.L.C
New user
New user
 
Posts: 14
Joined: 24 Dec 2014, 21:08

Re: Compile-time string hashing.

PostPosted by yushli » 04 Nov 2016, 15:22

Thank you for sharing this! Hope this can be merged into main branch.
User avatar
yushli
Some active
Some active
 
Posts: 94
Joined: 18 Oct 2014, 05:42

Re: Compile-time string hashing.

PostPosted by carnalis » 04 Nov 2016, 16:53

Nice work! I thought it good if Urho supports this.

TurboBadger UI (I use with Urho) also supports constexpr for its 32-bit stringhash.
For reference, here is its implementation:

https://github.com/fruxo/turbobadger/bl ... /tb_hash.h
https://github.com/fruxo/turbobadger/bl ... b_hash.cpp
<Patrician|Away> what does your robot do, sam
<bovril> it collects data about the surrounding environment, then discards it and drives into walls
User avatar
carnalis
Moderator
Moderator
 
Posts: 180
Joined: 21 Jan 2014, 19:00
Location: United States


Return to Developer Talk

Who is online

Users browsing this forum: No registered users and 0 guests

cron