Skip to content

A modified van Emde Boas data structure that achieves O(lg w) predecessor and successor queries with O(n) space"

License

Notifications You must be signed in to change notification settings

nachinius/VanEmdeBoasInScala

Repository files navigation

Build Status codecov Join the chat at https://gitter.im/nachinius/van_emde_boas_in_scala License Latest version

modified van Emde Boas Tree

the fastest successor/predecessor for integers set

A modified van Emde Boas data structure that achieves time complexity for successor/predecessor/insert/search/delete in O(lg w) (with high probability), where w is the bits of the word used to store the integer. Space complexity is O(n), where n is how many numbers are stored. Thus, the constrain is that lg n <= w, since any number must fit in a word.

If w is not too high, then this is optimal. In case w is too high, fusion trees are the best. However, dynamic versions of fusion tree only achieve expected O(lg n/lg w).

When lg w >= sqrt(lg(n)) is better to use fusion trees. However, for dinamyc fusion trees only expected O(lg n/lg w) has been achieved.

For example, in a static case, with a 32-bit word, w = 32 (as this current implementation that uses scala's Int which is 32-bit signed integer), static van Emde Boas should be used if n >~ 2^25 ~ 33 millon . Otherwise, static fusion trees should be preferred. In the same scenario, static, for 64-bit word, w = 64, this structure is better than fusion trees if n >~ 68 billon . You're likely to have memory problems with such big numbers, and you may want to switch to a distributed system. Due to its recursive nature, I imagine this structure is easy to implement in a distributed way.

The current implementation, handles scala's Int which is a signed 32-bit number. However, th

time complexity (with high probability)

  • search: O(lg w)
  • successor: O(lg w)
  • predecessor: O(lg w)
  • delete: O(lg w)
  • insert: O(lg w)

where n = 2^w is the maximum number that can be stored.

space complexity

  • O(n)
may TODO
  • Create performance tests
  • Measure memory consumption againt data size (n) and bit (w)
  • Create a complete immutable version
  • Improve usage of generators for performance tests
  • Do performance tests for Immutable Version
  • Compare performance between mutable and immutable
  • Write predecessor
  • Add delete
  • Allow any ordering
  • Augment the data structure
  • Compare to fusion tree
  • Compare to O(lg n) successor/predecessor of trees
  • Present performance results

features

  • immutable or mutable
  • dynamic
  • fully tested
  • with benchmarks code

References

  1. MIT 6.851 Advanced Data Structures, Erik Demaine lecture 11

About

A modified van Emde Boas data structure that achieves O(lg w) predecessor and successor queries with O(n) space"

Topics

Resources

License

Code of conduct

Stars

Watchers

Forks

Packages

No packages published

Languages