Subject:
|
Re: OID generation (was: Re: wishing upon a star)
|
Newsgroups:
|
lugnet.db.brictionary, lugnet.off-topic.geek
|
Date:
|
Thu, 26 Aug 1999 12:36:47 GMT
|
Reply-To:
|
lpieniazek@novera.comSTOPSPAMMERS
|
Viewed:
|
2060 times
|
| |
| |
Todd Lehman wrote:
> :
> KY51MET9
> L8PX6HJP
> :
>
> In your experience as a sysarch, is this about as good as it gets? or are
> there more robust methods for generating OIDs? (Let's assume WoLOG that
> 2^32 is a sufficient maximum number of OIDs; it could just as well be 48 or
> 64 bits.)
>
> --Todd
Quick answer: yes.
Long answer: Almost certainly. There are some tradeoffs that you've
made.
A big one is around the human readability of the key. Ask yourself, how
often do humans inspect these, and how fast can you convert back to
40bit ints? I'd probably keep these keys as ints internally as the
underlying indexing that you will surely ask the DB to do will perform
better that way, typically. (presumably if these keys are used in
relation implementations you will definitely want to index on them for
fast relation traversal)
if you've truly randomised your keys, you can use a simple power of 2
truncation hashing algorithm to match against a power of 2 bucket set
and get good results. Most people with not very random keys are well
advised to use a prime number of buckets instead of a power of 2 as that
smooths out lumps. Lumpy chains perform poorly because you're linear
searching the chain on collisions, unless you build a fancier bucket
(like using a b-tree bucket).
But if you're going to build a b-tree for each bucket, why not just
build one huge one is what I always end up thinking. So I tend to use
linear chains. Lots easier to debug. (I assume you're not using a
prebuilt hashtable class, in that case who cares, let the vendor
optimise chain search performance)
On the question of whether there are more robust methods, that depends
on what you're using these for. This is pretty good in the general case.
Surprisingly (it's counterintuitive) just using ascending integers works
well too (with a prime number of buckets), and it's simple to implememt
if your DB supports sequences. ints work even better if you have a
dynamically resizable hash table, though.
I know of a really really slick design for dynamic resize that does not
require total rehash on resize. But it has triangular loading
characteristics.
--
Larry Pieniazek larryp@novera.com http://my.voyager.net/lar
- - - Web Application Integration! http://www.novera.com
fund Lugnet(tm): http://www.ebates.com/ Member ref: lar, 1/2 $$ to
lugnet.
NOTE: I have left CTP, effective 18 June 99, and my CTP email
will not work after then. Please switch to my Novera ID.
|
|
Message is in Reply To:
22 Messages in This Thread:
- Entire Thread on One Page:
- Nested:
All | Brief | Compact | Dots
Linear:
All | Brief | Compact
|
|
|
|