We're trying to estimate the time needed to convert our VB6 application to VB.NET. The upgrade wizard that comes with VS.NET does a pretty fair job with our 200 (ish) DLL projects but chokes hard on the GUI. We killed our first run of the wizard after four hours, during which it sat at 100% CPU the whole time. When we killed it, the progress bar was only at about 50% (but who knows what that really means).
We first thought about breaking up the GUI (150-ish forms, many dozens of modules/classes) into smaller projects. Even these seem to take an inordinate amount of time to convert with the wizard, and apparently the conversion leaves some kind of "compatibility layer" between the Windows.Forms classes and the code. Don't ask me, the other guy discovered that.
While he was playing with that option, I sat down to think about converting the code ourselves. I started thinking about how to parse, for example, a .frm file of medium size. Not only the code, but the form properties and all the controls and the "hidden" stuff you don't see in the IDE. I started working on a state machine, but the more I looked at it, the more it started to resemble an XML file without tags. So I hacked together (in Python, of course) a parser class that reads through .frm files and acts very much like a SAX parser by calling inheritable methods (startElement, endElement, characters, etc.) when it encounters certain keywords. This works very well, but trying to handle all the permutations of keywords and punctuation that make up VB6 syntax was getting less and less fun.
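To give a rough idea of the SAX-like approach, here is a minimal sketch. It is not the actual parser class, just an illustration under some assumptions: the class name FrmParser and the property() callback are made up for this example, only the Begin/End nesting and simple "Name = value" lines are handled, and BeginProperty/EndProperty blocks are not treated specially.

```python
import re


class FrmParser:
    """SAX-style reader for the layout section of a VB6 .frm file (sketch)."""

    # "Begin VB.Form Form1" -> class name, control name.
    # The \s+ after Begin keeps "BeginProperty" from matching.
    BEGIN = re.compile(r'^Begin\s+(\S+)\s+(\S+)\s*$')
    PROP = re.compile(r'^(\w+)\s*=\s*(.*)$')

    def parse(self, text):
        stack = []  # names of the currently open Begin blocks
        lines = text.splitlines()
        for i, raw in enumerate(lines):
            line = raw.strip()
            m = self.BEGIN.match(line)
            if m:
                stack.append(m.group(2))
                self.startElement(m.group(1), m.group(2))
            elif line == 'End' and stack:
                self.endElement(stack.pop())
                if not stack:
                    # Everything after the outermost End is attributes + code.
                    self.characters('\n'.join(lines[i + 1:]))
                    return
            elif stack:
                m = self.PROP.match(line)
                if m:
                    self.property(m.group(1), m.group(2))

    # Override these in a subclass, as with a SAX handler.
    def startElement(self, class_name, name): pass
    def endElement(self, name): pass
    def property(self, key, value): pass      # hypothetical callback
    def characters(self, text): pass


class Collector(FrmParser):
    """Records every callback so the event stream can be inspected."""

    def __init__(self):
        self.events = []

    def startElement(self, class_name, name):
        self.events.append(('start', class_name, name))

    def endElement(self, name):
        self.events.append(('end', name))

    def property(self, key, value):
        self.events.append(('prop', key, value))

    def characters(self, text):
        self.events.append(('code', text))


SAMPLE_FRM = '''VERSION 5.00
Begin VB.Form Form1
   Caption         =   "Form1"
   Begin VB.CommandButton Command1
      Caption         =   "OK"
   End
End
Attribute VB_Name = "Form1"
Private Sub Command1_Click()
End Sub
'''

collector = Collector()
collector.parse(SAMPLE_FRM)
for event in collector.events:
    print(event)
```

The nice part of this shape is that the converter proper lives entirely in the subclass, so the same reader can drive a dumper, a control inventory, or a Windows.Forms code generator.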
So to keep things interesting, I started looking into using an existing lexer/parser to tear apart the VB6 source files to produce a tree that I could use to spit out VB.NET syntax myself. I couldn't get yacc, flex, bison, et al. working via cygwin, but I found a great Python module named SimpleParse.
Ok, so now I have a parser that understands a particular form of EBNF grammar but I haven't been able to find a grammar for VB6. I've started cobbling my own together and I finally feel I've jumped up on the learning curve. Here is the grammar I have so far:
EBNF grammar for Visual Basic 6
code := statement*
statement := (comment / if / select / simpleact)

#If
if := (if_inline / if_block)
if_inline := if_then, ts, simpleact
if_block := if_then, tail, statement*, else?, end_if
if_then := 'If', ts, condition, ts, 'Then'
else := 'Else', tail, statement*
end_if := 'End If', tail

#Select
select := 'Select Case', ts, expression, tail, case*, case_else?, end_select
case := 'Case', ts, condition, tail, statement*
case_else := 'Case Else', tail, statement*
end_select := 'End Select', tail

#Conditional
condition := 'cond', digit+

#Expressions
expression := 'expr', digit+

#Simple
simpleact := 'act', digit+, tail

#trailing
tail := (comment / ws)
comment := ts, "'", visible*, ws

#character
visible := [ -~]
newline := '\n'
ts := [ ]+
ws := [ \t\n]+
digit := [0-9]
lowercase := [a-z]
uppercase := [A-Z]
alpha := (lowercase / uppercase)
alphanumeric := (alpha / digit)
Here's the example input I created to test the grammar thus far:
If cond1 Then act1 'here is a comment
If cond2 Then
If cond3 Then
Else 'another comment
Select Case expr1
End Select 'end of select case
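For comparison, the same toy language can be parsed by hand to show the kind of tree I'm after. This is only a sketch and does not use SimpleParse: it is a line-oriented recursive-descent stand-in that strips comments first, the names ToyParser and parse are made up for illustration, and the sample input below is a hypothetical, fully closed variant of my test input (with act lines, End If and Case lines filled in).

```python
import re


def clean(line):
    # The toy input has no string literals, so the first apostrophe
    # starts a comment. Runs of whitespace collapse to one space.
    return ' '.join(line.split("'", 1)[0].split())


class ToyParser:
    def __init__(self, source):
        self.lines = [s for s in map(clean, source.splitlines()) if s]
        self.pos = 0

    def peek(self):
        return self.lines[self.pos] if self.pos < len(self.lines) else None

    def next(self):
        line = self.lines[self.pos]
        self.pos += 1
        return line

    def expect(self, text):
        if self.peek() != text:
            raise SyntaxError('expected %r, got %r' % (text, self.peek()))
        self.pos += 1

    def statements(self, stop):
        """statement*, ending before a line that starts with a stop prefix."""
        out = []
        while self.peek() is not None and not self.peek().startswith(stop):
            out.append(self.statement())
        return out

    def statement(self):
        line = self.next()
        m = re.fullmatch(r'If (cond\d+) Then(?: (act\d+))?', line)
        if m:
            cond, inline_act = m.groups()
            if inline_act:                      # if_inline
                return ('if', cond, [('act', inline_act)], [])
            then = self.statements(('Else', 'End If'))   # if_block
            orelse = []
            if self.peek() == 'Else':
                self.next()
                orelse = self.statements(('End If',))
            self.expect('End If')
            return ('if', cond, then, orelse)
        m = re.fullmatch(r'Select Case (expr\d+)', line)
        if m:
            cases = []
            while self.peek() is not None and self.peek().startswith('Case'):
                label = self.next()[len('Case'):].strip()
                if not re.fullmatch(r'cond\d+|Else', label):
                    raise SyntaxError('bad case label: %r' % label)
                cases.append((label, self.statements(('Case', 'End Select'))))
            self.expect('End Select')
            return ('select', m.group(1), cases)
        if re.fullmatch(r'act\d+', line):       # simpleact
            return ('act', line)
        raise SyntaxError('cannot parse: %r' % line)


def parse(source):
    return ToyParser(source).statements(())


SAMPLE = """\
If cond1 Then act1 'here is a comment
If cond2 Then
    act2
    If cond3 Then
        act3
    Else 'another comment
        act4
    End If
End If
Select Case expr1
    Case cond4
        act5
    Case Else
        act6
End Select 'end of select case
"""

tree = parse(SAMPLE)
for node in tree:
    print(node)
```

Each node is a plain tuple, so a VB.NET emitter would just be a recursive walk over the tree. Hand-coding obviously doesn't scale to real VB6, which is exactly why I want a proper grammar.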
Obviously I have a long way to go to get all of VB6 syntax covered (and I just noticed a mistake in the 'comment' pattern as I was typing this), so I was wondering if any of you hardcore geeks out there have or know of a grammar that I can use to get past this nitty-gritty stuff.
$ select * from users where clue > 0
no rows returned