Tree Utilities¶
Provides general utility functions to process trees.
-
edist.tree_utils.
check_dfs_structure
(adj, i=0)¶ Verifies that a given adjacency list is in depth-first-search order, in the sense that the descendants of a node i are all indices i+1, i+2, … orl[i], where orl[i] is the outermost right leaf of i.
We verify this by performing a depth-first search over the tree and checking whether the indices are equivalent
Parameters: adj (list) – a directed graph in adjacency list format. Returns: size – the size of the input tree, i.e. the last DFS index. Return type: int Raises: ValueError
– if the given adjacency list is not in DFS format.
-
edist.tree_utils.
check_tree_structure
(adj)¶ Verifies that a given adjacency list describes a tree.
In particular, we build a parent representation of the tree and throw an exception if that fails. This is valid because trees are the subclass of directed graphs with a unique parent.
Parameters: adj (list) – a directed graph in adjacency list format. Raises: ValueError
– if the given adjacency list does not form a tree.
-
edist.tree_utils.
dataset_from_json
(path)¶ Reads trees in node list/adjacency list format from all JSON files in the given directory.
Parameters: path (str) – a path to a directory which contains JSON files. Returns: - X (list) – A list of tuples in node list/adjacency list format.
- filenames (list) – A list of filenames from which we read the trees.
Raises: Exception
– if the file access does not work, if some JSON data is malformed, if a parsed data is not a tree, or if a parsed tree is not in depth first search order.
-
edist.tree_utils.
from_json
(filename)¶ Loads a tree in node list/adjacency list format from a JSON file.
Parameters: filename (str) – A JSON filename containing tree data as written by the to_json method. Returns: - nodes (list) – The node list of the tree.
- adj (list) – The adjacency list of the tree.
Raises: Exception
– if the given file is not accessible, if the JSON data is malformed, if the parsed data is not a tree, or if the parsed tree is not in depth first search order.
-
edist.tree_utils.
parents
(adj)¶ Returns the parent representation of the tree with the given adjacency list.
Parameters: adj (list) – The adjacency list of the tree. Returns: par – a numpy integer array with len(adj) elements, where the ith element contains the index of the parent of the ith node. Nodes without children contain the entry -1. Return type: int array
-
edist.tree_utils.
root
(adj)¶ Returns the root of a tree and raises an error if the input adjacency matrix does not correspond to a tree.
Parameters: adj (list) – a directed graph in adjacency list format. Returns: root – the index of the root of the tree. Return type: int Raises: ValueError
– if the given adjacency list does not form a tree.
-
edist.tree_utils.
subtree
(nodes, adj, i)¶ Returns the subtree rooted at node i in node list/adjacency list format.
Parameters: - nodes (list) – The node list of the tree.
- adj (list) – The adjacency list of the tree.
- i (int) – The index of the desired subtree.
Returns: - nodes_sub (list) – The node list of the subtree rooted at i.
- adj_sub (list) – The adajency list of the subtree rooted at i.
-
edist.tree_utils.
to_dfs_structure
(nodes, adj)¶ Re-orders a tree to conform to a depth-first search structure.
Note that this method performs a copy and leaves the original tree untouched.
Parameters: - nodes (list) – A node list.
- adj (list) – An adjacency list.
Raises: ValueError
– if the input is not a tree.
-
edist.tree_utils.
to_json
(filename, nodes, adj)¶ Writes a tree in node list/adjacency list format to a JSON file.
Parameters: - filename (str) – The filename for the resulting JSON file.
- nodes (list) – The node list of the tree.
- adj (list) – The adjacency list of the tree.
Raises: Exception
– if the file is not accessible or the JSON writeout fails.
-
edist.tree_utils.
tree_to_string
(nodes, adj, indent=False, with_indices=False)¶ Prints a tree in node list/adjacency list format as string.
Parameters: - nodes (list) – The node list of the tree.
- adj (list) – The adjacency list of the tree.
- indent (bool (default = False)) – A boolean flag; if True, each node is printed on a new line.
- with_indices (bool (default = False)) – A boolean flag; if True, each node is printed with its index.
Raises: ValueError
– if the adjacency list does not correspond to a tree.