tree-features: e159f9b5b0b5e776ce273481a27fa466b04d31a2

Generic feature extraction from XML documents.

For machine learning problems, we often need all of our inputs to have the same
fixed size. When we have a recursive structure, like a tree, we can fold over the
structure to obtain a single value.

This is a very basic implementation of this idea: we take arbitrary XML documents,
which are tree-structured, and assign each element a value based on the MD5 hash of
its name and attributes concatenated together. We fold sub-trees together using
bitwise circular convolution to obtain a value for the whole tree.

Circular convolution is a linear operation, so it can't preserve as much
information as, for example, auto-encoding. However, it is reasonably fast, requires
no learning, and is largely non-commutative/associative, so sub-trees should be
distinguishable to a certain extent.
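
The idea described above can be sketched in a few lines of Python. This is a
minimal illustration under stated assumptions, not the project's actual code:
the function names (element_vector, circular_convolve, normalize, tree_vector,
xml_features) are hypothetical, attributes are assumed to be sorted and joined
as name=value before hashing, each MD5 bit is mapped to +1 or -1, and vectors
are rescaled to unit length after every fold step. The sketch uses the textbook
circular convolution over real numbers, which may differ from the "bitwise"
operation this project implements.

```python
import hashlib
import math
import xml.etree.ElementTree as ET

SIZE = 128  # number of bits in an MD5 digest


def element_vector(elem):
    """Map an element's name and attributes to a vector of +1/-1 values."""
    # Assumption: attributes are sorted by name and joined as name=value.
    text = elem.tag + "".join(f"{k}={v}" for k, v in sorted(elem.attrib.items()))
    digest = int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")
    return [1.0 if (digest >> i) & 1 else -1.0 for i in range(SIZE)]


def circular_convolve(a, b):
    """Textbook circular convolution: c[k] = sum_i a[i] * b[(k - i) mod n]."""
    n = len(a)
    return [sum(a[i] * b[(k - i) % n] for i in range(n)) for k in range(n)]


def normalize(v):
    """Rescale to unit length (an assumption) so values stay bounded."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else v


def tree_vector(elem):
    """Fold an element and all of its descendants into one vector."""
    value = normalize(element_vector(elem))
    for child in elem:
        value = normalize(circular_convolve(value, tree_vector(child)))
    return value


def xml_features(xml_text):
    """Return a fixed-size feature vector for an XML document string."""
    return tree_vector(ET.fromstring(xml_text))
```

Calling xml_features on "<a/>" and on a deeper document such as
'<a id="1"><b><c/></b><d/></a>' returns lists of the same length, 128, which is
the fixed-size property the fold is meant to provide. Because the textbook
convolution used here is commutative and associative, the sketch yields the same
vector for any arrangement of the same elements; the order sensitivity described
above would have to come from the project's own variant of the operation.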
