I ran a test where I did the following:
- created 100,000 random 32-character strings
- chose 1,000 of those strings at random
- found each of the 1,000 strings in the 100,000-string array
I did this with strings, then with the exact same strings converted to symbols, to see how symbol matching compares with string matching. Here are the results on my laptop:
Test String Time = 38.2969999313354
Test Symbol Time = 26.4219999313354
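For anyone who wants to try something similar, here is a rough sketch of how the test described above could be written. It assumes Ruby (the post doesn't name the language) and a plain linear search with Array#index; the helper names random_string, time_lookups and run_comparison are made up for illustration and are not the original test code.

```ruby
require 'benchmark'

# Hypothetical helper: build a random lowercase string of the given length.
def random_string(length)
  Array.new(length) { rand(97..122).chr }.join
end

# Hypothetical reconstruction of the test: time how long it takes to find
# each sampled key in the full array with a linear search (Array#index).
def time_lookups(haystack, needles)
  Benchmark.realtime do
    needles.each { |needle| haystack.index(needle) }
  end
end

# Build the data set once, then run the same lookups as strings and as symbols.
def run_comparison(count: 100_000, sample: 1_000, length: 32)
  strings = Array.new(count) { random_string(length) }
  # dup the sampled strings so a match needs a content comparison rather
  # than hitting the very same object that is stored in the array.
  picked = strings.sample(sample).map(&:dup)

  symbols        = strings.map(&:to_sym)
  picked_symbols = picked.map(&:to_sym)

  {
    string_time: time_lookups(strings, picked),
    symbol_time: time_lookups(symbols, picked_symbols)
  }
end

# Small sizes so the sketch finishes quickly; the defaults match the sizes
# in the post (100,000 strings, 1,000 lookups, 32 characters).
result = run_comparison(count: 2_000, sample: 100, length: 32)
puts "Test String Time = #{result[:string_time]}"
puts "Test Symbol Time = #{result[:symbol_time]}"
```

Calling run_comparison with no arguments uses the full sizes from the post, and passing length: 2, 8 or 256 with count: 10_000 approximates the shorter runs. Timings will of course differ by machine and Ruby version.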
The results are independent of the order in which the two tests run and are fairly consistent from run to run. It is interesting how non-linear the timing is in the length of the string:
2 char/10k test set:
Test String Time = 1.84299993515015
Test Symbol Time = 1.35999989509583
8 char/10k test set:
Test String Time = 3.20300006866455
Test Symbol Time = 2.67199993133545
32 char/10k test set:
Test String Time = 3.15700006484985
Test Symbol Time = 2.73399996757507
256 char/10k test set:
Test String Time = 3.28099989891052
Test Symbol Time = 2.6560001373291
Yes, comparing 256-character symbols took LESS time than comparing 32-character symbols, and the string time barely moved (3.28 vs. 3.16). Clearly there is some form of indexing in play here.
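One possible explanation, again assuming this was Ruby: symbols are interned, so two symbols with the same name are the same object and comparing them does not need to look at the characters at all. Two random strings, on the other hand, typically differ within the first character or two, so a comparison can stop early no matter how long the strings are. The sketch below only shows the identity part; it is an illustration, not a claim about what the original test was measuring.

```ruby
# Two separately built strings with the same content are distinct objects,
# so == has to compare their contents.
a = "snapvine".dup
b = "snapvine".dup
strings_same_object = a.equal?(b)   # false: different objects
strings_equal       = (a == b)      # true: same content

# Converting equal strings to symbols yields the very same object,
# so a symbol comparison can be a simple identity check.
symbols_same_object = a.to_sym.equal?(b.to_sym)  # true

puts "strings same object: #{strings_same_object}"
puts "strings equal:       #{strings_equal}"
puts "symbols same object: #{symbols_same_object}"
```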
Sunday, December 24, 2006
Tuesday, November 21, 2006
Webscale Papers
I had a nice chat with an old friend tonight about Snapvine & recent stuff he's come across at MSR. This inspired me to go look up a few really interesting papers from MSR & Google Labs. If you haven't had a chance to look at the papers on Google Labs, check them out: http://labs.google.com/papers/
Of particular interest are GFS, Chubby & BigTable.
From MSR: the Paxos algorithm for electing a master in a fault-tolerant system
http://research.microsoft.com/users/lamport/pubs/paxos-simple.pdf
This is all quite far beyond anything we need to implement at Snapvine, but it's interesting reading about how you go from 100s of servers to 10,000s of servers.