Perplexing Parrot's Parser

by chromatic

I believe that a programming language should never crash, even given bad input. There may be cases where it reports obscure syntax errors that are difficult to understand, but crashing is unacceptable.

One way to make sure that there are no crashes is to feed your parser as much invalid input as you can imagine and check that you only ever get syntax errors. (I suppose another way is to write formal proofs for your parser, but even then you may have bugs in your implementation.)
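
A minimal sketch of such a check, written in Python rather than with the tools used in this article. The function name `classify_run`, the temporary-file handling, and the labels it returns are all hypothetical. It assumes a POSIX system, where a negative return code means the process was killed by a signal (a segfault, for instance), and it assumes the program under test takes a source file path as its last argument.

```python
import os
import subprocess
import tempfile


def classify_run(command, source_text, timeout=10):
    """Run `command` on a temp file holding `source_text`.

    Returns "ok" (exit 0), "error" (non-zero exit, e.g. a syntax error),
    "crash" (killed by a signal, POSIX only), or "timeout".
    """
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
        handle.write(source_text)
        path = handle.name
    try:
        result = subprocess.run(
            command + [path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    finally:
        os.unlink(path)  # clean up whether the run finished or not

    if result.returncode < 0:
        # Assumption: negative means "terminated by signal" (POSIX).
        return "crash"
    return "ok" if result.returncode == 0 else "error"
```

Only the "crash" and "timeout" results are worth a closer look; an "error" result is the parser doing its job. The command list is a parameter, so the same function can point at any program that takes a source file.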

To do that, you need a large corpus of valid programs and a way to generate a large corpus of mostly-valid programs that aren't quite right.

I installed Algorithm::MarkovChain and set to work.
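
To illustrate the general technique, here is a small Markov-chain generator sketched in Python. It is a stand-in for the idea, not the Algorithm::MarkovChain API and not the code used for this article. The names `tokenize`, `train`, and `generate` are hypothetical, and the whitespace-and-newline tokenizer is an assumption chosen for simplicity.

```python
import random
import re


def tokenize(text):
    """Split text into non-space runs, keeping newlines as their own tokens."""
    return re.findall(r"\S+|\n", text)


def train(token_lists, order=2):
    """Map each run of `order` tokens to the tokens seen right after it.

    None pads the start of each program and marks its end.
    """
    table = {}
    for tokens in token_lists:
        padded = [None] * order + list(tokens) + [None]
        for i in range(len(padded) - order):
            state = tuple(padded[i:i + order])
            table.setdefault(state, []).append(padded[i + order])
    return table


def generate(table, order=2, max_tokens=200, rng=None):
    """Walk the chain from the start state until an end marker or the cap."""
    rng = rng or random.Random()
    state = (None,) * order
    output = []
    while len(output) < max_tokens:
        choices = table.get(state)
        if not choices:
            break  # empty table or unknown state: nothing to emit
        token = rng.choice(choices)
        if token is None:
            break  # reached an end-of-program marker
        output.append(token)
        state = state[1:] + (token,)
    return output
```

Trained on many valid programs, `" ".join(generate(table))` yields text that is locally plausible but often globally wrong, which is the kind of mostly-valid input described above. A lower `order` produces more mangled output; a higher one stays closer to the originals.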


2007-04-23 06:08:27
The idea of using Algorithm::MarkovChain for this is brilliant; my compliments. Just a nitpick: you create the aggregate file 'markov_train.pir' from the shell, but then load the file 'train_markov.pir' in your program.
2007-05-22 22:46:27
It looks pretty easy to swap parrot out for other programs, like, say, gcc, the various Java compilers, Python, et cetera.