String metric

A string metric is a metric (a function that defines a distance between elements of a set) for measuring similarity or distance between strings, where a string is an ordered sequence of symbols. The computed distances can be used in fuzzy string searching, a category of techniques for finding one or more substrings of a text that approximately match a given pattern string.
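As a rough illustration of fuzzy string searching, the sketch below ranks candidate strings by similarity to a query using Python's standard `difflib` module. The helper name `fuzzy_find` and the sample word list are assumptions made for this example. Note that `difflib` scores matches with its own similarity ratio, which is not one of the metrics listed in this article.

```python
import difflib


def fuzzy_find(query, candidates, limit=3, cutoff=0.6):
    """Return up to `limit` candidates that approximately match `query`.

    Hypothetical helper: it simply wraps difflib.get_close_matches, which
    ranks candidates by a similarity ratio between 0.0 and 1.0 and drops
    those scoring below `cutoff`.
    """
    return difflib.get_close_matches(query, candidates, n=limit, cutoff=cutoff)


words = ["ape", "apple", "peach", "puppy"]
matches = fuzzy_find("appel", words)
print(matches)  # best match first
```

Raising `cutoff` makes the search stricter; lowering it admits weaker matches.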


List of string metrics

  • Hamming distance
  • Levenshtein distance
  • Needleman-Wunsch distance or Sellers Algorithm
  • Smith-Waterman distance
  • Gotoh Distance or Smith-Waterman-Gotoh distance
  • Monge-Elkan distance
  • Block distance or L1 distance or City block distance
  • Jaro distance metric
  • Jaro-Winkler
  • Soundex distance metric
  • Matching Coefficient
  • Dice’s Coefficient
  • Jaccard similarity or Jaccard coefficient or Tanimoto coefficient
  • Overlap Coefficient
  • Euclidean distance or L2 distance
  • Cosine similarity
  • Variational distance
  • Hellinger distance or Bhattacharyya distance
  • Information Radius (Jensen-Shannon divergence)
  • Harmonic Mean
  • Skew divergence
  • Confusion Probability
  • Tau metric, an approximation of the Kullback-Leibler divergence
  • Fellegi and Sunter metric (SFS)
  • TFIDF or TF/IDF
  • Maximal matches
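Several entries in the list (Dice's coefficient, the Jaccard coefficient and the overlap coefficient) compare strings as sets of tokens. The sketch below is one possible way to compute them, assuming the tokens are character bigrams; that tokenization, the function names, and the handling of empty inputs are choices made for this example, not something the list prescribes.

```python
def bigrams(s):
    """Return the set of adjacent character pairs in s (assumed tokenization)."""
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0  # convention chosen here: two empty token sets are identical
    return len(x & y) / len(x | y)


def dice(a, b):
    """Twice the intersection size divided by the sum of the set sizes."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


def overlap(a, b):
    """Intersection size divided by the size of the smaller set."""
    x, y = bigrams(a), bigrams(b)
    if not x or not y:
        return 1.0 if x == y else 0.0
    return len(x & y) / min(len(x), len(y))


# "night" and "nacht" each have four bigrams and share only "ht".
print(jaccard("night", "nacht"))  # 1/7
print(dice("night", "nacht"))     # 0.25
print(overlap("night", "nacht"))  # 0.25
```

All three return a similarity between 0 and 1; subtracting the result from 1 gives a corresponding dissimilarity score.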

Brief descriptions of some of these metrics and related tools:

  • Hamming distance: in information theory, the number of positions at which the corresponding symbols of two equal-length strings differ.
  • Levenshtein distance: in information theory and computer science, a string metric that is one way to measure edit distance.
  • Needleman-Wunsch algorithm: performs a global alignment of two sequences.
  • Smith-Waterman algorithm: a well-known algorithm for local sequence alignment, that is, for determining similar regions between two nucleotide or protein sequences.
  • FASTA: a DNA and protein sequence alignment software package first described (as FASTP) by David J. Lipman and William R. Pearson in 1985 in the article Rapid and sensitive protein similarity searches.
  • BLAST (Basic Local Alignment Search Tool): in bioinformatics, an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences.
  • Block, L1 or city block distance: the metric of taxicab geometry, considered by Hermann Minkowski in the 19th century, in which the distance between two points is the sum of the absolute differences of their coordinates.
  • Jaro-Winkler distance (Winkler, 1999): a measure of similarity between two strings.
  • Soundex: a phonetic algorithm for indexing names by sound, as pronounced in English.
  • Jaccard index, also known as the Jaccard similarity coefficient: a statistic used for comparing the similarity and diversity of sample sets.
  • Euclidean distance or Euclidean metric: the ordinary distance between two points that one would measure with a ruler, computed with the Pythagorean formula.
  • Jensen-Shannon divergence: in probability theory and statistics, a popular method of measuring the similarity between two probability distributions.
  • Harmonic mean (formerly sometimes called the subcontrary mean): one of several kinds of average.
  • Kullback-Leibler divergence (also called information divergence, information gain, or relative entropy): in probability theory and information theory, a non-symmetric measure of how a distribution Q differs from a true distribution P. Typically P represents data, observations, or a precisely calculated probability distribution.
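The Hamming and Levenshtein distances described above can be sketched in a few lines of Python. This is a minimal illustration, assuming unit cost for each insertion, deletion and substitution in the Levenshtein case; the function names are chosen for this example.

```python
def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b (unit costs assumed)."""
    if len(a) < len(b):
        a, b = b, a  # keep the rows as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


print(hamming("karolin", "kathrin"))     # 3
print(levenshtein("kitten", "sitting"))  # 3
```

The Levenshtein sketch keeps only two rows of the dynamic-programming table, so memory use grows with the shorter string rather than with the product of both lengths.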

See also

  • String searching algorithms, sometimes called string matching algorithms: an important class of string algorithms that try to find where one or several strings (also called patterns) occur within a larger string or text.

External links

  • http://www.dcs.shef.ac.uk/~sam/stringmetrics.html A fairly complete overview


 

The Wikipedia article included on this page is licensed under the GFDL.