We've got samples of names, but when we bucket we need to provide the ASTs. We could do this in the Python script which invokes the bucketing scripts, since there's no need to have the ASTs duplicated over and over among the samples.