Warbo/tree-features

Generic feature extraction from XML documents.

For machine learning problems, we often need our inputs to be the same, fixed size.
When we have a recursive structure, like a tree, we can fold over the structure to
obtain a single value.
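
As a rough illustration of folding a tree down to a single value, here is a minimal Python sketch. The `Node` class and `fold_tree` helper are hypothetical names introduced for this example; they are not part of this project.

```python
from dataclasses import dataclass, field
from typing import Callable, List, TypeVar

R = TypeVar("R")


@dataclass
class Node:
    """Hypothetical labelled tree node, used only for this illustration."""
    label: str
    children: List["Node"] = field(default_factory=list)


def fold_tree(node: Node, combine: Callable[[str, List[R]], R]) -> R:
    """Fold bottom-up: combine a node's label with its children's results."""
    return combine(node.label, [fold_tree(c, combine) for c in node.children])


tree = Node("a", [Node("b"), Node("c", [Node("d")])])

# Each fold yields one value, however large the tree is.
size = fold_tree(tree, lambda _label, kids: 1 + sum(kids))
depth = fold_tree(tree, lambda _label, kids: 1 + max(kids, default=0))
```

Swapping the `combine` function changes what the single value means (node count, depth, or a feature vector) without changing the traversal.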

This is a very basic implementation of this idea: we take arbitrary XML documents,
which are tree-structured, and assign each element a value based on the MD5 hash of
its name and attributes concatenated together. We fold sub-trees together using
bitwise circular convolution, to obtain a single value for the whole tree.
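
The steps above might be sketched in Python as follows. This is not the project's code: the function names are hypothetical, the way attributes are concatenated (sorted, as `name=value`) is an assumption, and "bitwise circular convolution" is interpreted here as textbook circular convolution over bit vectors (AND for multiplication, XOR for addition), which may differ from what the project actually does.

```python
import hashlib
import xml.etree.ElementTree as ET

BITS = 128  # an MD5 digest is 128 bits wide


def element_value(elem):
    """Hash an element's name and attributes into a list of 128 bits."""
    # Assumption: attributes are sorted by name and appended as name=value.
    text = elem.tag + "".join(f"{k}={v}" for k, v in sorted(elem.attrib.items()))
    digest = hashlib.md5(text.encode("utf-8")).digest()
    n = int.from_bytes(digest, "big")
    return [(n >> i) & 1 for i in range(BITS)]


def circular_convolve(a, b):
    """Circular convolution over GF(2): out[k] = XOR_i (a[i] AND b[(k - i) mod n])."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if bj:
                    out[(i + j) % n] ^= 1
    return out


def tree_features(elem):
    """Fold an element and all of its descendants into one 128-bit vector."""
    acc = element_value(elem)
    for child in elem:
        acc = circular_convolve(acc, tree_features(child))
    return acc


doc = ET.fromstring('<html lang="en"><head/><body><p/></body></html>')
features = tree_features(doc)  # always 128 bits, whatever the document's size
```

Note that this textbook form of circular convolution is commutative and associative, so this sketch on its own does not distinguish sibling order or nesting; the project's operation presumably differs in ways not described here.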

Circular convolution is a linear operation, so it can't preserve as much
information as, for example, auto-encoding. However, it is reasonably fast, requires
no learning, and is largely non-commutative and non-associative, so sub-trees should
be distinguishable to a certain extent.

About

Mirror of http://chriswarbo.net/git/tree-features

License

GPL-3.0 (LICENSE), Unknown (COPYING)
