A string metric is a metric for defining similarity or distance on strings. The computed measures of distance can be exploited in fuzzy string searching.
In mathematics a metric or distance function is a function which defines a distance between elements of a set. ...
Distance is a numerical description of how far apart objects are at any given moment in time. ...
In computer programming and formal language theory (and other branches of mathematics), a string is an ordered sequence of symbols. ...
Fuzzy string searching is the name for a category of techniques for finding one or more substrings of a text that approximately match some given pattern string. ...
List of string metrics
- Hamming distance
- Levenshtein distance
- Needleman-Wunsch distance or Sellers Algorithm
- Smith-Waterman distance
- Gotoh Distance or Smith-Waterman-Gotoh distance
- Block distance or L1 distance or City block distance
- Jaro distance metric
- Jaro-Winkler
- Soundex distance metric
- Matching Coefficient
- Dice’s Coefficient
- Jaccard similarity or Jaccard coefficient or Tanimoto coefficient
- Overlap Coefficient
- Euclidean distance or L2 distance
- Cosine similarity
- Variational distance
- Hellinger distance or Bhattacharyya distance
- Information Radius (Jensen-Shannon divergence)
- Harmonic Mean
- Skew divergence
- Confusion Probability
- Tau metric, an approximation of the Kullback-Leibler divergence
- Fellegi and Sunter metric (SFS)
- TFIDF or TF/IDF
- Maximal matches
In information theory, the Hamming distance between two strings of equal length is the number of positions for which the corresponding symbols are different. ...
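The definition above translates almost directly into code. The following is a minimal sketch; the function name `hamming_distance` is chosen here for illustration, and raising an error for unequal lengths is one possible convention rather than part of the definition.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # The distance is only defined for strings of equal length.
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("karolin", "kathrin"))  # 3
```

The two example strings differ in the third, fourth and fifth positions, which gives a distance of 3.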
In information theory and computer science, the Levenshtein distance is a string metric which is one way to measure edit distance. ...
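As an illustration, the Levenshtein distance can be computed with the usual dynamic-programming recurrence over prefixes of the two strings. The sketch below assumes unit cost for each insertion, deletion and substitution; the function name is illustrative only.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b (unit costs assumed)."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning the first i characters of a into ""
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


print(levenshtein_distance("kitten", "sitting"))  # 3
```

Only two rows of the table are kept at a time, so memory use grows with the length of the second string rather than with the product of both lengths.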
The Needleman-Wunsch algorithm performs a global alignment on two sequences (called A and B here). ...
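A global alignment score of this kind can be sketched as follows. The scoring values (match +1, mismatch -1, gap -1) are assumptions made for the example, not values taken from the text above, and the function returns only the optimal score, not the alignment itself.

```python
def needleman_wunsch_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Score of the best global alignment of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal,
                              score[i - 1][j] + gap,   # gap in b
                              score[i][j - 1] + gap)   # gap in a
    return score[-1][-1]


print(needleman_wunsch_score("GATTACA", "GATTACA"))  # 7
```

Because the alignment is global, every symbol of both sequences must be accounted for, so the score can be negative for dissimilar inputs.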
The Smith-Waterman algorithm is a well-known algorithm for performing local sequence alignment; that is, for determining similar regions between two nucleotide or protein sequences. ...
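Local alignment differs from global alignment mainly in that cell scores are never allowed to fall below zero, and the result is the largest score anywhere in the table. The sketch below returns that score only; the scoring values (match +2, mismatch -1, gap -1) are assumptions made for the example.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -1) -> int:
    """Score of the best local alignment between a and b."""
    best = 0
    previous = [0] * (len(b) + 1)
    for ca in a:
        current = [0]
        for j, cb in enumerate(b, start=1):
            diagonal = previous[j - 1] + (match if ca == cb else mismatch)
            # Flooring at zero lets a new local alignment start anywhere.
            cell = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(cell)
            best = max(best, cell)
        previous = current
    return best


print(smith_waterman_score("xxACGTxx", "yyACGTyy"))  # 8
```

In the example only the shared region `ACGT` contributes, giving four matches at 2 points each.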
FASTA is a DNA and Protein sequence alignment software package first described (as FASTP) by David J. Lipman and William R. Pearson in 1985 in the article Rapid and sensitive protein similarity searches. ...
In bioinformatics, Basic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences. ...
Taxicab geometry, considered by Hermann Minkowski in the 19th century, is a form of geometry in which the usual metric of Euclidean geometry is replaced by a new metric in which the distance between two points is the sum of the (absolute) differences of their coordinates. ...
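The sum-of-absolute-differences definition can be illustrated with a short sketch. Points are represented here as equal-length sequences of numbers; the function name is illustrative only.

```python
from typing import Sequence


def taxicab_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Sum of the absolute differences of the coordinates of p and q."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return sum(abs(x - y) for x, y in zip(p, q))


print(taxicab_distance((1, 2), (4, 6)))  # 7
```

When applied to strings, the coordinates are typically counts of some feature of each string, such as how often each token occurs; that choice of features is an assumption of the application, not of the metric.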
The Jaro-Winkler distance (Winkler, 1999) is a measure of similarity between two strings. ...
Soundex is a phonetic algorithm for indexing names by sound, as pronounced in English. ...
The Jaccard index, also known as the Jaccard similarity coefficient, is a statistic used for comparing the similarity and diversity of sample sets. ...
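The Jaccard index of two sets is the size of their intersection divided by the size of their union. The sketch below applies it to strings by first turning each string into its set of character bigrams; that tokenization, the helper name `bigrams`, and the convention of returning 1.0 for two empty sets are all assumptions made for the example.

```python
def jaccard_index(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(a | b)


def bigrams(text: str) -> set:
    """Set of adjacent character pairs in text (hypothetical helper)."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


print(jaccard_index({1, 2, 3}, {2, 3, 4}))                 # 0.5
print(jaccard_index(bigrams("night"), bigrams("nacht")))   # 1/7, about 0.143
```

The two words share only the bigram `ht` out of seven distinct bigrams in total.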
In mathematics, the Euclidean distance or Euclidean metric is the ordinary distance between the two points that one would measure with a ruler, which can be proven by repeated application of the Pythagorean theorem. ...
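For points given as coordinate sequences, the Euclidean distance is the square root of the sum of squared coordinate differences. A minimal sketch, with an illustrative function name:

```python
import math
from typing import Sequence


def euclidean_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Square root of the sum of squared coordinate differences."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

The example is the familiar 3-4-5 right triangle.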
In probability theory and statistics, the Jensen-Shannon divergence is a popular method of measuring the similarity between two probability distributions. ...
In mathematics, the harmonic mean (formerly sometimes called the subcontrary mean) is one of several kinds of average. ...
In probability theory and information theory, the Kullback-Leibler divergence (or information divergence, or information gain, or relative entropy) is a natural distance measure from a true probability distribution P to an arbitrary probability distribution Q. Typically P represents data, observations, or a precise calculated probability distribution. ...
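For discrete distributions given as lists of probabilities over the same outcomes, the divergence can be sketched as below. The use of base-2 logarithms (so the result is in bits) and the handling of zero probabilities are conventions assumed for the example.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Kullback-Leibler divergence from P to Q, in bits."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same outcomes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue          # terms with P(x) = 0 contribute nothing
        if qi == 0:
            return math.inf   # P puts mass where Q has none
        total += pi * math.log2(pi / qi)
    return total


p = [0.5, 0.5]
q = [0.25, 0.75]
print(kl_divergence(p, q))
print(kl_divergence(q, p))
```

The two printed values differ: the divergence is not symmetric in P and Q, so it is a measure of dissimilarity rather than a distance in the strict sense of a metric.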
See also
String searching algorithms, sometimes called string matching algorithms, are an important class of string algorithms that try to find a place where one or several strings (also called patterns) are found within a larger string or text. ...
External links
- http://www.dcs.shef.ac.uk/~sam/stringmetrics.html (a fairly complete overview)