Thursday, September 28, 2006

CodeTidy Kickoff

Processing the code in a C# code file can be done in a few ways. Visual Studio has an extensibility/automation model that exposes projects, solutions, modules, classes, code etc. as object collections, which would seem ideal for what I have in mind - not having to parse the stuff oneself.

However, any time I have looked at this in the past as a possible route for manipulating the raw code in a file, I have come up against obstacles - most notably no support for locating and dealing with regions, some comments, namespace import (i.e. using) statements etc.

If I am to sort/arrange code without mucking up the containment of regions or orphaning comment lines, then I have to know about these parts of the text content of a code file and be able to manipulate them in the same way as the methods, properties etc. that are available as objects in the extensibility/automation model.

Parsing code files as raw text and recognising these 'constructs' manually seems like re-inventing the wheel, and the sort of activity better done by people with more mathematical minds than mine, who thrive on lexical parser theory and all that grammar/token recognition stuff.

So, what to do? It would be nice to use the automation/extensibility model and fill in the gaps with a bit of manual parsing I suppose, but that leads to the need to effectively maintain my own 'meta-model' of the code being analysed, one that references/synchronises with the VS model but supplements it with information about the association of things like regions. Plus I need to take a fresh look at the VS model, as I guess it has moved on and improved since I last looked at it (VS 2003) - though I'm pretty sure it still doesn't expose things like regions.
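
To make the 'fill in the gaps' idea a little more concrete, here is a rough sketch of the kind of line-based scan I have in mind. It is written in Python purely for brevity (the real thing would presumably be C# inside an add-in), and every name in it (Region, scan, regions_containing) is made up for illustration. It only looks for #region/#endregion pairs and using directives, and it assumes they never appear inside block comments or multi-line strings, which a proper parser would have to handle.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for the constructs discussed above.
REGION_START = re.compile(r"^\s*#region\b\s*(.*)$")
REGION_END = re.compile(r"^\s*#endregion\b")
# Matches 'using System.Text;' and alias forms, but not 'using (...)' statements.
USING = re.compile(r"^\s*using\s+[\w.]+(\s*=\s*[\w.]+)?\s*;")


@dataclass
class Region:
    name: str
    start: int  # 1-based line number of the #region directive
    end: int    # 1-based line number of the matching #endregion


def scan(source):
    """Return (regions, using_lines) found by a naive line-by-line scan."""
    regions = []
    using_lines = []
    open_regions = []  # stack of (name, start_line) to cope with nesting
    for number, line in enumerate(source.splitlines(), start=1):
        match = REGION_START.match(line)
        if match:
            open_regions.append((match.group(1).strip(), number))
        elif REGION_END.match(line):
            if open_regions:  # silently ignore an unmatched #endregion
                name, start = open_regions.pop()
                regions.append(Region(name, start, number))
        elif USING.match(line):
            using_lines.append(number)
    return regions, using_lines


def regions_containing(regions, line):
    """Regions that enclose the given line, innermost first."""
    hits = [r for r in regions if r.start < line < r.end]
    return sorted(hits, key=lambda r: r.end - r.start)
```

The thought is that the line numbers collected here could be matched against whatever positions the VS model reports for methods, properties and so on, so that a member can be tied to the region enclosing it before anything gets moved around. That is the 'meta-model' in miniature, and it is an assumption on my part that the two can be lined up this easily.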

Further thinking required.
