FACTOID # 51: Russia won the first World Air Games, held in Turkey in 1997. Events included hang-gliding, sky-surfing, and ballooning.
 
 Home   Encyclopedia   Statistics   Countries A-Z   Flags   Maps   Education   Forum   FAQ   About 
 
WHAT'S NEW
RECENT ARTICLES
More Recent Articles »
 

FACTS & STATISTICS    Simple view

  1. Select countries to view: (hold down Control key and click to select several)

     

     

    Compare:

     

     

  1. Select fact or statistic: (* = graphable)

     

     

     

  2. (OPTIONAL) Compare to statistic: (both need to be graphable)

     

     

     

  3. View result as:

     

       
(OR) SEARCH ALL encyclopedia, stats & forums:   

Encyclopedia > Radix tree

A radix tree, Patricia trie/tree, or crit bit tree is a specialized set data structure based on the trie that is used to store a set of strings. These can be strings of characters, bit strings such as integers or IP addresses, or generally arbitrary sequences of objects in lexicographical order. Sometimes the names radix tree and crit bit tree are only applied to trees storing integers and Patricia trie is retained for more general inputs, but the structure works the same in all cases. In computer science, the set is a collection of certain values without any particular order. ... A trie for keys to, tea, ten, i, in, and inn. In computer science, a trie, or prefix tree, is an ordered tree data structure that is used to store an associative array where the keys are strings. ... An IP address (Internet Protocol Address) is a unique address that devices use in order to identify and communicate with each other on a computer network utilizing the Internet Protocol standard (IP). ... In mathematics, the lexicographical order, or dictionary order, is a natural order structure of the cartesian product of two ordered sets. ...

Contents

Overview

The radix tree is easiest to understand as a space-optimized trie where each node with only one child is merged with its child. The result is that every internal node has at least two children. Unlike in tries, edges can be labelled with sequences of characters as well as single characters. This makes them much more efficient for small sets (especially if the strings are long) and for sets of strings that share long prefixes.


It supports the following main operations, all of which are O(k), where k is the maximum length of all strings in the set:

  • Lookup: Determines if a string is in the set. This operation is identical to tries except that some edges consume multiple characters.
  • Insert: Add a string to the tree. We search the tree until we can make no further progress. At this point we either add a new outgoing edge labelled with all remaining characters in the input string, or if there is already an outgoing edge sharing a prefix with the remaining input string, we split it into two edges (the first labelled with the common prefix) and proceed. This splitting step ensures that no node has more children than there are possible string characters.
  • Delete: Delete a string from the tree. We delete the corresponding leaf and then delete its former parent, merging the two incident edges.
  • Find predecessor: Locates the largest string less than a given string, by lexicographic order.
  • Find successor: Locates the smallest string greater than a given string, by lexicographic order.

A common extension of radix trees uses two colors of nodes, 'black' and 'white'. To check if a given string is stored in the tree, the search starts from the top and follows the edges of the input string until no further progress can be made. If the search-string is consumed and the final node is a black node, the search has failed; if it is white, the search has succeeded. This enables us to add a large range of strings with a common prefix to the tree, using white nodes, then remove a small set of "exceptions" in a space-efficient manner by inserting them using black nodes. A trie for keys to, tea, ten, i, in, and inn. In computer science, a trie, or prefix tree, is an ordered tree data structure that is used to store an associative array where the keys are strings. ...


Applications

As mentioned, radix trees are useful for constructing associative arrays with keys that can be expressed as strings. They find particular application in the area of IP routing, where the ability to contain large ranges of values with a few exceptions is particularly suited to the hierarchical organization of IP addresses. They are also used for inverted indexes of text documents in information retrieval. An associative array (also map, hash, dictionary, finite map, lookup table, and in query-processing an index or index file) is an abstract data type composed of a collection of keys and a collection of values, where each key is associated with one value. ... An IP address (Internet Protocol Address) is a unique address that devices use in order to identify and communicate with each other on a computer network utilizing the Internet Protocol standard (IP). ... Information retrieval (IR) is the science of searching for information in documents, searching for documents themselves, searching for metadata which describe documents, or searching within databases, whether relational stand-alone databases or hypertext networked databases such as the Internet or intranets, for text, sound, images or data. ...


History

Donald R. Morrison first described what he called "Patricia tries" in 1968; the name comes from the acronym PATRICIA, which stands for "Practical Algorithm to Retrieve Information Coded in Alphanumeric". Gernot Gwehenberger independently invented and described the data structure at about the same time. 1968 (MCMLXVIII) was a leap year starting on Monday (the link is to a full 1968 calendar). ... It has been suggested that this article or section be merged with Backronym and Apronym (Discuss) Acronyms and initialisms are abbreviations, such as NATO, laser, and ABC, written as the initial letter or letters of words, and pronounced on the basis of this abbreviated written form. ...


Comparison to other data structures

(In the following comparisons, it is assumed that the keys are of length k and the data structure contains n elements.)


Unlike balanced trees, radix trees permit lookup, insertion, and deletion in O(k) time rather than O(log n). This doesn't seem like an advantage, since normally k ≥ log n, but in a balanced tree every comparison is a string comparison requiring O(k) worst-case time, many of which are slow in practice due to long common prefixes. In a trie, all comparisons require constant time, but it takes m comparisons to look up a string of length m. Radix trees can perform these operations with fewer comparisons and require many fewer nodes.


Radix trees also share the disadvantages of tries, however: as they can only be applied to strings of elements or elements with a reversible mapping (bijection) to strings, they lack the full generality of balanced search trees, which apply efficiently to any datatype with a total ordering (for example, floating-point numbers). A bijective function. ... In mathematics, a total order or linear order on a set X is any binary relation on X that is antisymmetric, transitive, and total. ...


Hash tables have expected O(1) insertion and deletion times but are subject to attacks that degrade them to worst-case scenarios of O(n). Radix trees have worst-case O(k) insertion and deletion. The successor/predecessor operations of radix trees are also not implemented by hash tables. In computer science, a hash table is a data structure that speeds up searching for information by a particular aspect of that information, called a key. ...


External links


  Results from FactBites:
 
Trees I: Radix trees [LWN.net] (1409 words)
A Linux radix tree is a mechanism by which a (pointer) value can be associated with a (long) integer key.
Nick Piggin currently has a patch circulating which would use read-copy-update to free tree nodes; this patch would allow lookup operations to be performed without locking as long as (1) the resulting pointer is only used in atomic context, and (2) the calling code avoids creating race conditions of its own.
In practice, deleting all items from a radix tree will free all memory associated with it other than the root node, which can then be disposed of normally.
py-radix - radix tree data structure for Python (311 words)
This is a Python equivalent to Dave Plonka's Perl Net::Patricia (it even steals the same radix tree code from MRTd).
The radix tree (a.k.a Patricia tree) is the data structure most commonly used for routing table lookups.
The underlying radix tree implementation is taken (and modified) from MRTd and is subject to a 4-term BSD license.
  More results at FactBites »


 

COMMENTARY     


Share your thoughts, questions and commentary here
Your name
Your comments
Please enter the 5-letter protection code

Want to know more?
Search encyclopedia, statistics and forums:

 


Lesson Plans | Student Area | Student FAQ | Reviews | Press Releases |  Feeds | Contact
The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.