Sequence Edit Distance¶
Implements the sequence edit distance of Levenshtein (1965) and its backtracing in cython.
-
edist.sed.
sed
()¶ Computes the sequence edit distance between the input sequence x and the input sequence y, given the element-wise distance function delta.
Parameters: - x (list) – a sequence of objects.
- y (list) – another sequence of objects.
- delta (function (default = None)) – a function that takes an element of x as first and an element of y as second input and returns the distance between them. If None, this method calls standard_sed instead.
Returns: d – the sequence edit distance between x and y according to delta.
Return type: float
-
edist.sed.
sed_backtrace
()¶ Computes a co-optimal alignment between the two input sequences x and y, given the element-wise distance function delta. This mechanism is deterministic and will always prefer replacements over other options.
Parameters: - x (list) – a sequence of objects.
- y (list) – another sequence of objects.
- delta (function (default = None)) – a function that takes an element of x as first and an element of y as second input and returns the distance between them. If None, this method calls standard_sed_backtrace instead.
Returns: alignment – A co-optimal alignment between x and y according to the sequence edit distance.
Return type: class alignment.Alignment
-
edist.sed.
sed_backtrace_matrix
()¶ Computes a matrix, summarizing all co-optimal alignments between x and y in a matrix P, where entry P[i, j] specifies the fraction of co-optimal alignments in which node x[i] has been aligned with node y[j].
Parameters: - x (list) – a sequence of objects.
- y (list) – another sequence of objects.
- delta (function (default = None)) – a function that takes an element of x as first and an element of y as second input and returns the distance between them. If None, this method calls standard_sed_backtrace_matrix instead.
Returns: - P (array_like) – a matrix, where entry P[i, j] specifies the fraction of co-optimal alignments in which node x[i] has been aligned with node y[j]. P[i, n] contains the fraction of deletions of node x[i] and P[m, j] the fraction of insertions of node y[j].
- K (array_like) – a matrix that contains the counts for all co-optimal alignments in which node x[i] has been aligned with node y[j].
- k (int) – the number of co-optimal alignments overall, such that P = K / k.
-
edist.sed.
sed_backtrace_stochastic
()¶ Computes a co-optimal alignment between the two input sequences x and y, given the element-wise distance function delta. This mechanism is stochastic and will return a random co-optimal alignment.
Note that the randomness does _not_ produce a uniform distribution over all co-optimal alignments because reandom choices at the start of the alignment process dominate. If you wish to characterize the overall distribution accurately, use sed_backtrace_matrix instead.
Parameters: - x (list) – a sequence of objects.
- y (list) – another sequence of objects.
- delta (function (default = None)) – a function that takes an element of x as first and an element of y as second input and returns the distance between them. If None, this method calls standard_sed_backtrace_stochastic instead.
Returns: alignment – A co-optimal alignment between x and y according to the sequence edit distance.
Return type: class alignment.Alignment
-
edist.sed.
sed_string
()¶ Computes the standard sequence edit distance/Levenshtein distance between two input strings x and y, using the Kronecker distance as element-wise distance measure.
Parameters: - x (str) – a string.
- y (str) – another string.
Returns: d – the standard sequence edit distance between x and y.
Return type: int
-
edist.sed.
standard_sed
()¶ Computes the standard sequence edit distance/Levenshtein distance between the input sequence x and the input sequence y.
Parameters: - x (list) – a sequence of objects.
- y (list) – another sequence of objects.
Returns: d – the standard sequence edit distance between x and y.
Return type: int
-
edist.sed.
standard_sed_backtrace
()¶ Computes a co-optimal alignment between the two input sequences x and y. This mechanism is deterministic and will always prefer replacements over other options.
Parameters: - x (list) – a sequence of objects.
- y (list) – another sequence of objects.
Returns: alignment – A co-optimal alignment between x and y according to the sequence edit distance.
Return type: class alignment.Alignment
-
edist.sed.
standard_sed_backtrace_matrix
()¶ Computes a matrix, summarizing all co-optimal alignments between x and y in a matrix P, where entry P[i, j] specifies the fraction of co-optimal alignments in which node x[i] has been aligned with node y[j].
Parameters: - x (list) – a sequence of objects.
- y (list) – another sequence of objects.
Returns: - P (array_like) – a matrix, where entry P[i, j] specifies the fraction of co-optimal alignments in which node x[i] has been aligned with node y[j]. P[i, n] contains the fraction of deletions of node x[i] and P[m, j] the fraction of insertions of node y[j].
- K (array_like) – a matrix that contains the counts for all co-optimal alignments in which node x[i] has been aligned with node y[j].
- k (int) – the number of co-optimal alignments overall, such that P = K / k.
-
edist.sed.
standard_sed_backtrace_stochastic
()¶ Computes a co-optimal alignment between the two input sequences x and y, given the element-wise distance function delta. This mechanism is stochastic and will return a random co-optimal alignment.
Note that the randomness does _not_ produce a uniform distribution over all co-optimal alignments because reandom choices at the start of the alignment process dominate. If you wish to characterize the overall distribution accurately, use sed_backtrace_matrix instead.
Parameters: - x (list) – a sequence of objects.
- y (list) – another sequence of objects.
Returns: alignment – A co-optimal alignment between x and y according to the sequence edit distance.
Return type: class alignment.Alignment