External tools could do some cool stuff if tokenization were its own step and exposed publicly. I'm thinking we'd pass in a string that represents a single line of GSS and get that line's tokens back.
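
To make the shape of that API concrete, here is a rough sketch in TypeScript. Everything in it is hypothetical: `tokenizeLine`, the `Token` shape, and the token categories are invented for illustration, and the patterns are a simplification rather than the real GSS grammar.

```typescript
// Hypothetical token categories; not taken from the GSS source.
type TokenType =
  | "identifier"   // e.g. #box, .item, foo
  | "property"     // e.g. [width]
  | "operator"     // e.g. ==, <=, +
  | "number"       // e.g. 100, 1.5
  | "strength"     // e.g. !strong
  | "punctuation"  // e.g. ; ( ) ,
  | "unknown";     // anything the patterns below do not cover

interface Token {
  type: TokenType;
  value: string;
  start: number; // character offset within the input line
}

// Illustrative patterns, tried in order against the rest of the line.
const PATTERNS: Array<[TokenType, RegExp]> = [
  ["identifier", /^[#.]?[A-Za-z_][\w-]*/],
  ["property", /^\[[\w-]+\]/],
  ["operator", /^(?:==|<=|>=|[<>+\-*\/])/],
  ["number", /^\d+(?:\.\d+)?/],
  ["strength", /^![A-Za-z]+/],
  ["punctuation", /^[;(){},]/],
];

// Assumed entry point: one line of GSS in, a flat list of tokens out.
function tokenizeLine(line: string): Token[] {
  const tokens: Token[] = [];
  let pos = 0;
  while (pos < line.length) {
    const rest = line.slice(pos);

    // Skip whitespace without emitting a token.
    const ws = /^\s+/.exec(rest);
    if (ws) {
      pos += ws[0].length;
      continue;
    }

    let matched = false;
    for (let i = 0; i < PATTERNS.length; i++) {
      const m = PATTERNS[i][1].exec(rest);
      if (m) {
        tokens.push({ type: PATTERNS[i][0], value: m[0], start: pos });
        pos += m[0].length;
        matched = true;
        break;
      }
    }

    // Emit unrecognized characters one at a time so no input is lost.
    if (!matched) {
      tokens.push({ type: "unknown", value: rest.charAt(0), start: pos });
      pos += 1;
    }
  }
  return tokens;
}

const example = tokenizeLine("#box[width] == 100 !strong;");
console.log(example.map((t) => t.type + ":" + t.value).join(" "));
// prints "identifier:#box property:[width] operator:== number:100 strength:!strong punctuation:;"
```

Keeping the `start` offset on each token is what would let an outside tool map tokens back to positions in the original line; whether the real tokenizer can cheaply provide that is an open question.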