Affine Edit Distance

Implements a sequence edit distance with affine gap costs using ADP.

class edist.aed.AffineAlgebra(rep=None, gap=1.0, skip=0.5)

This is a class to efficiently store an algebra for the affine edit distance grammar in a pickleable format.

_rep

A function for replacement costs, i.e. _rep(x, y) is the cost of replacing x with y.

Type:function (default = Kronecker distance)
_gap

A function for deletion/insertion costs, i.e. _gap(x) is the cost of deleting/inserting x.

Type:function (default = constant function with 1.0)
_gap_cost

a constant cost for deletions/insertions.

Type:float (default = 1.0)
_skip

A function for deletion/insertion extension costs, i.e. _skip(x) is the cost of skip-deleting/-inserting x.

Type:function (default = constant function with 0.5)
_skip_cost

a constant cost for deletion/insertion extensions.

Type:float (default = 0.5)
edist.aed.aed(x, y, rep=None, gap=1.0, skip=0.5)

Computes the affine edit distance using algebraic dynamic programming.

Parameters:
  • x (list) – A list-like object.
  • y (list) – Another list-like object.
  • rep (function (default = Kronecker delta)) – A function with two arguments, computing the cost for replacing the first with the second OR an AffineAlgebra object, in which case the remaining aguments will be ignored. Defaults to the Kronecker distance.
  • gap (function or float (default = 1.0)) – A function with two arguments, computing the cost for deleting the first or inserting the second OR a number defining a constant cost. Defaults to 1.
  • skip (function or float (default = 0.5)) – A function with two arguments, computing the cost for deleting the first or inserting the second for gap extensions OR a number defining a constant cost. Defaults to 0.5.
Returns:

d – The affine edit distance between x and y.

Return type:

float

edist.aed.aed_backtrace(x, y, rep=None, gap=1.0, skip=0.5)

Computes the backtrace of the affine edit distance using algebraic dynamic programming.

Parameters:
  • x (list) – A list-like object.
  • y (list) – Another list-like object.
  • rep (function (default = Kronecker delta)) – A function with two arguments, computing the cost for replacing the first with the second OR an AffineAlgebra object, in which case the remaining aguments will be ignored. Defaults to the Kronecker distance.
  • gap (function or float (default = 1.0)) – A function with two arguments, computing the cost for deleting the first or inserting the second OR a number defining a constant cost. Defaults to 1.
  • skip (function or float (default = 0.5)) – A function with two arguments, computing the cost for deleting the first or inserting the second for gap extensions OR a number defining a constant cost. Defaults to 0.5.
Returns:

alignment – A co-optimal alignment between x and y according to the affine edit distance.

Return type:

class alignment.Alignment

edist.aed.aed_backtrace_matrix(x, y, rep=None, gap=1.0, skip=0.5)

Computes the backtrace matrix P of the affine edit distance using algebraic dynamic programming.

In particular, P[i, j] contains the probability of node i being replaced with node j in a co-optimal alignment. The last two columns contain deletion and deletion-extension probabilities, the last two rows contains insertion and insertion-extension probabilities.

Parameters:
  • x (list) – A list-like object.
  • y (list) – Another list-like object.
  • rep (function (default = Kronecker distance)) – A function with two arguments, computing the cost for replacing the first with the second OR an AffineAlgebra object, in which case the remaining aguments will be ignored. Defaults to the Kronecker distance.
  • gap (function or float (default = 1.0)) – A function with two arguments, computing the cost for deleting the first or inserting the second OR a number defining a constant cost. Defaults to 1.
  • skip (function or float (default = 0.5)) – A function with two arguments, computing the cost for deleting the first or inserting the second for gap extensions OR a number defining a constant cost. Defaults to 0.5.
Returns:

  • P (array_like) – A len(x) + 2 x len(y) + 2 matrix where P[i, j] contains the probability of node i being replaced with node j in a co-optimal alignment. The last two columns contain deletion and deletion-extension probabilities, the last two rows contains insertion and insertion-extension probabilities.
  • k (int) – The number of co-optimal alignments.

edist.aed.aed_backtrace_stochastic(x, y, rep=None, gap=1.0, skip=0.5)

Computes the backtrace of the affine edit distance using algebraic dynamic programming stochastically.

Note that the randomness does _not_ produce a uniform distribution over all co-optimal alignments because random choices at the start of the alignment process dominate. If you wish to characterize the overall distribution accurately, use aed_backtrace_matrix instead.

Parameters:
  • x (list) – A list-like object.
  • y (list) – Another list-like object.
  • rep (function (default = Kronecker delta)) – A function with two arguments, computing the cost for replacing the first with the second OR an AffineAlgebra object, in which case the remaining aguments will be ignored. Defaults to the Kronecker distance.
  • gap (function or float (default = 1.0)) – A function with two arguments, computing the cost for deleting the first or inserting the second OR a number defining a constant cost. Defaults to 1.
  • skip (function or float (default = 0.5)) – A function with two arguments, computing the cost for deleting the first or inserting the second for gap extensions OR a number defining a constant cost. Defaults to 0.5.
Returns:

alignment – A co-optimal alignment between x and y according to the affine edit distance.

Return type:

class alignment.Alignment