FACTOID # 138: Libya’s full name is the Great Socialist People’s Libyan Arab Jamahiriya.
 
 Home   Encyclopedia   Statistics   Countries A-Z   Flags   Maps   Education   Forum   FAQ   About 
 
WHAT'S NEW
RECENT ARTICLES
More Recent Articles »
 

FACTS & STATISTICS    Simple view

  1. Select countries to view: (hold down Control key and click to select several)

     

     

    Compare:

     

     

  1. Select fact or statistic: (* = graphable)

     

     

     

  2. (OPTIONAL) Compare to statistic: (both need to be graphable)

     

     

     

  3. View result as:

     

       
(OR) SEARCH ALL encyclopedia, stats & forums:   

Encyclopedia > External sorting

External sorting is a generic term for a class of sorting algorithms that can handle massive amounts of data. External sorting is required when the data being sorted does not fit into the main memory of a computing device (usually RAM) and a slower kind of memory (usually a hard drive) needs to be used. Sorting refers to a process of arranging items in some sequence and/or in different sets, and accordingly, it has two common, yet distinct meanings: ordering: aranging items of the same kind, class, nature, etc. ... In mathematics, computing, linguistics, and related disciplines, an algorithm is a finite list of well-defined instructions for accomplishing some task that, given an initial state, will terminate in a defined end-state. ... For other uses, see Data (disambiguation). ... Primary storage is a category of computer storage, often called main memory. ... Look up RAM, Ram, ram in Wiktionary, the free dictionary. ... Typical hard drives of the mid-1990s. ...


Carefully implemented, external sorting can be done in-place (with no additional disk space required).


External mergesort

One example of external sorting is the external mergesort algorithm. For example, for sorting 900 megabytes of data using only 100 megabytes of RAM: In computer science, merge sort or mergesort is a sort algorithm for rearranging lists (or any other data structure that can only be accessed sequentially, e. ... ReBoot character, see Megabyte (ReBoot). ...

  1. Read 100 MB of the data in main memory and sort by some conventional method (usually quicksort).
  2. Write the sorted data to disk.
  3. Repeat steps 1 and 2 until all of the data is sorted in 100 MB chunks, which now need to be merged into one single output file.
  4. Read the first 10 MB of each sorted chunk (call them input buffers) in main memory (90 MB total) and allocate the remaining 10 MB for output buffer.
  5. Perform a 9-way merging and store the result in the output buffer. If the output buffer is full, write it to the final sorted file. If any of the 9 input buffers gets empty, fill it with the next 10 MB of its associated 100 MB sorted chunk or otherwise mark it as exhausted if there is no more data in the sorted chunk and do not use it for merging.

This algorithm can be generalized by assuming that the amount of data to be sorted exceeds the available memory by a factor of K. Then, K chunks of data need to be sorted and a K-way merge has to be completed. If X is the amount of main memory available, there will be K input buffers and 1 output buffer of size X/(K+1) each. Depending on various factors (how fast the hard drive is, what is the value of K) better performance can be achieved if the output buffer is made larger (for example twice as large as one input buffer). Q sort redirects here. ... In computer science, merge sort or mergesort is a sort algorithm for rearranging lists (or any other data structure that can only be accessed sequentially, e. ...


In the example, a single-pass merge was used. If the ratio of data to available main memory is particularly large, a multi-pass sorting is preferable. For example, merge only the first half of the sorted chunks, then the other half and now the problem has been reduced to merging just two sorted chunks. The exact number of passes depends on the above mentioned ratio, as well as the physical characteristics of the hard drive (transfer rate and seeking time). As a rule of thumb, do not perform more than 20-30 way merge[citation needed].


External links

Reference

  • Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Second Edition. Addison-Wesley, 1998. ISBN 0-201-89685-0. Section 5.4: External Sorting, pp.248–379.

  Results from FactBites:
 
Examined Algorithms (1636 words)
The algorithm given in table 3 is the one used for the string sorting measurements, and it is slightly more optimized than the one used for integer sorting.
External sorting performance is very dependent on assumptions about the problem, such as data size, I/O system, organization of input and output files, etc. The performance estimate must also take into account the I/O time for input and output to be useful.
The External Tree sort assumes that the total main memory available to all processors is equal to the input data size.
CIS Department > Tutorials > Software Design Using C++ > External Sorting (1298 words)
External sorting refers to the sorting of a file that is on disk (or tape).
The main concern with external sorting is to minimize disk access since reading a disk block takes about a million times longer than accessing an item in RAM (according to Shaffer -- see the reference at the end of this document).
Perhaps the simplest form of external sorting is to use a fast internal sort with good locality of reference (which means that it tends to reference nearby items, not widely scattered items) and hope that your operating system's virtual memory can handle it.
  More results at FactBites »


 

COMMENTARY     


Share your thoughts, questions and commentary here
Your name
Your comments
Please enter the 5-letter protection code

Want to know more?
Search encyclopedia, statistics and forums:

 


Lesson Plans | Student Area | Student FAQ | Reviews | Press Releases |  Feeds | Contact
The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.