
Wednesday, May 11, 2005

A Lexical Analyzer Generator – CsLex


Secondly, I decided to scrap my old GPS parser and try some alternatives. I did a bit of research on Lex and Yacc. Lex helps write programs whose control flow is directed by instances of regular expressions in the input stream. It is well suited for editor-script type transformations and for segmenting input in preparation for a parsing routine. At the end of that research I found the interesting CsLex, written by Brad Merrill of Microsoft (http://www.cybercom.net/~zbrad/DotNet/Lex/Lex.htm), which is a C# port of JLex.
To me, CsLex looked simpler and cleaner than the other tools I tried (including ANTLR).
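
To make the "control flow directed by regular expressions" idea concrete, here is a minimal hand-written sketch in C#. It is not CsLex output and not CsLex specification syntax. It uses System.Text.RegularExpressions, and the rule names (COMMENT, IDENT, SEMI, WS) and the MiniLexer/Tokenize names are made up for illustration. One difference from a real Lex-style tool: regex alternation takes the first alternative that matches, whereas Lex picks the longest match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class MiniLexer
{
    // Hypothetical rule names; each named group stands in for one Lex rule.
    static readonly string[] RuleNames = { "COMMENT", "IDENT", "SEMI", "WS" };

    static readonly Regex Rules = new Regex(
        @"(?<COMMENT>/\*.*?\*/)|(?<IDENT>[A-Za-z_][A-Za-z0-9_]*)|(?<SEMI>;)|(?<WS>\s+)",
        RegexOptions.Singleline);

    // Returns "RULE:text" for every token, skipping whitespace.
    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        int pos = 0;
        while (pos < input.Length)
        {
            Match m = Rules.Match(input, pos);
            // Require the match to start exactly at pos, so no input is skipped silently.
            if (!m.Success || m.Index != pos)
                throw new FormatException("Unexpected character at offset " + pos);

            // The group that matched decides which "action" runs.
            foreach (string name in RuleNames)
            {
                if (m.Groups[name].Success)
                {
                    if (name != "WS") tokens.Add(name + ":" + m.Value);
                    break;
                }
            }
            pos += m.Length;
        }
        return tokens;
    }

    static void Main()
    {
        foreach (string t in Tokenize("test1;\n/* comment1 */\ntest2;"))
            Console.WriteLine(t);
    }
}
```

A generated lexer does the same job without the regex engine at run time: the rules are compiled ahead of time into lookup tables.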

I wrote a small test file and generated code for a simple input (to identify commented lines in a text block):

test1;
/* comment1 */
/* This is a comment*/
test2;

and it parsed as follows: [screenshot of the parse output not available]


And here is some of the interesting lexer code that was generated:

internal class Yylex
{
    private const int YY_BUFFER_SIZE = 512;
    private const int YY_F = -1;
    private const int YY_NO_STATE = -1;
    private const int YY_NOT_ACCEPT = 0;
    private const int YY_START = 1;
    private const int YY_END = 2;
    private const int YY_NO_ANCHOR = 4;
    delegate Yytoken AcceptMethod();
    AcceptMethod[] accept_dispatch;
    private const int YY_BOL = 128;
    private const int YY_EOF = 129;
    private System.IO.TextReader yy_reader;
    private int yy_buffer_index;
    private int yy_buffer_read;
    private int yy_buffer_start;
    private int yy_buffer_end;
    private char[] yy_buffer;
    private int yychar;
    private int yyline;
    private bool yy_at_bol;
    private int yy_lexical_state;

    internal Yylex(System.IO.TextReader reader) : this()
    {
        if (null == reader)
        {
            throw new System.ApplicationException("Error: Bad input stream initializer.");
        }
        yy_reader = reader;
    }
    .
    .
    .
    .
    private static int[] yy_cmap = new int[]
    {
        6, 6, 6, 6, 6, 6, 6, 6,
        3, 3, 2, 6, 6, 1, 6, 6,
        6, 6, 6, 6, 6, 6, 6, 6,
        6, 6, 6, 6, 6, 6, 6, 6,
        3, 6, 6, 6, 7, 6, 6, 6,
        6, 6, 4, 6, 6, 6, 6, 5,
        6, 6, 6, 6, 6, 6, 6, 6,
        6, 6, 6, 6, 6, 6, 6, 6,
        6, 6, 6, 6, 6, 6, 6, 6,
        6, 6, 6, 6, 6, 6, 6, 6,
        6, 6, 6, 6, 6, 6, 6, 6,
        6, 6, 6, 6, 6, 6, 6, 6,
        6, 6, 6, 6, 6, 6, 6, 6,
        6, 6, 6, 6, 6, 6, 6, 6,
        6, 6, 6, 6, 6, 6, 6, 6,
        6, 6, 6, 6, 6, 6, 6, 6,
        0, 0
    };
    private static int[] yy_rmap = new int[]
    {
        0, 1, 2, 2, 1, 3, 1, 1,
        4, 1, 1, 5, 1, 6, 7, 8,
        9, 10, 11, 12
    };
    private static int[,] yy_nxt = new int[,]
    {
        { 1, 11, 2, 3, 4, 5, 6, 6 },
        { -1, -1, -1, -1, -1, -1, -1, -1 },
        { -1, -1, 3, 3, -1, -1, -1, -1 },
        { -1, -1, -1, -1, 7, -1, -1, -1 },
        { 1, 13, 13, 13, 14, 16, 13, -1 },
        { -1, -1, 12, -1, -1, -1, -1, -1 },
        { -1, 13, 13, 13, 18, 19, 13, -1 },
        { -1, 13, 13, 13, 15, 9, 13, -1 },
        { -1, 13, 13, 13, 15, 19, 13, -1 },
        { -1, 13, 13, 13, 10, 17, 13, -1 },
        { -1, 13, 13, 13, 18, 17, 13, -1 },
        { -1, 13, 13, 13, 15, -1, 13, -1 },
        { -1, 13, 13, 13, -1, 17, 13, -1 }
    };
    public Yytoken yylex()
    {
        char yy_lookahead;
        int yy_anchor = YY_NO_ANCHOR;
        int yy_state = yy_state_dtrans[yy_lexical_state];
        int yy_next_state = YY_NO_STATE;
        int yy_last_accept_state = YY_NO_STATE;
        bool yy_initial = true;
        int yy_this_accept;

        yy_mark_start();
        yy_this_accept = yy_acpt[yy_state];
        if (YY_NOT_ACCEPT != yy_this_accept)
        {
            yy_last_accept_state = yy_state;
            yy_mark_end();
        }
        .
        .
}
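
The three tables are easier to read once you see how they combine. The sketch below reflects my understanding of the JLex-style layout and is not taken from the elided part of the listing: yy_cmap maps a character code to a column (character class), yy_rmap maps a DFA state to a row, and yy_nxt[row, column] gives the next state, or -1 (YY_F) when there is no transition. The DfaTableDemo class and its Trace helper are hypothetical names. yy_cmap is rebuilt compactly from the values listed above instead of pasted, and starting in state 0 is an assumption, since the real start state comes from yy_state_dtrans, which is not shown.

```csharp
using System;
using System.Collections.Generic;

static class DfaTableDemo
{
    const int YY_F = -1; // "no transition", same value as in the generated class

    // Same contents as the yy_cmap listing: class 6 everywhere except the entries below.
    static readonly int[] yy_cmap = BuildCmap();

    static int[] BuildCmap()
    {
        var cmap = new int[130];
        for (int i = 0; i < 128; i++) cmap[i] = 6;
        cmap[8] = 3; cmap[9] = 3; cmap[32] = 3; // backspace, tab, space
        cmap[10] = 2;                           // '\n'
        cmap[13] = 1;                           // '\r'
        cmap[36] = 7;                           // '$'
        cmap[42] = 4;                           // '*'
        cmap[47] = 5;                           // '/'
        cmap[128] = 0; cmap[129] = 0;           // YY_BOL, YY_EOF
        return cmap;
    }

    // Copied from the generated listing.
    static readonly int[] yy_rmap =
    {
        0, 1, 2, 2, 1, 3, 1, 1,
        4, 1, 1, 5, 1, 6, 7, 8,
        9, 10, 11, 12
    };

    // Copied from the generated listing.
    static readonly int[,] yy_nxt =
    {
        { 1, 11, 2, 3, 4, 5, 6, 6 },
        { -1, -1, -1, -1, -1, -1, -1, -1 },
        { -1, -1, 3, 3, -1, -1, -1, -1 },
        { -1, -1, -1, -1, 7, -1, -1, -1 },
        { 1, 13, 13, 13, 14, 16, 13, -1 },
        { -1, -1, 12, -1, -1, -1, -1, -1 },
        { -1, 13, 13, 13, 18, 19, 13, -1 },
        { -1, 13, 13, 13, 15, 9, 13, -1 },
        { -1, 13, 13, 13, 15, 19, 13, -1 },
        { -1, 13, 13, 13, 10, 17, 13, -1 },
        { -1, 13, 13, 13, 18, 17, 13, -1 },
        { -1, 13, 13, 13, 15, -1, 13, -1 },
        { -1, 13, 13, 13, -1, 17, 13, -1 }
    };

    // Feeds text through the tables from startState and records each state entered.
    // Stops at the first character that has no transition (YY_F).
    public static List<int> Trace(int startState, string text)
    {
        var visited = new List<int>();
        int state = startState;
        foreach (char ch in text)
        {
            if (ch > 127) throw new ArgumentException("Table only covers 7-bit characters.");
            int next = yy_nxt[yy_rmap[state], yy_cmap[ch]];
            if (next == YY_F) break;
            visited.Add(next);
            state = next;
        }
        return visited;
    }

    static void Main()
    {
        // Assumption: state 0 is the start state of the default lexical state.
        Console.WriteLine(string.Join(" -> ", Trace(0, "/*"))); // prints "5 -> 7"
    }
}
```

Reading "/*" from state 0 moves to state 5 and then to state 7, whose row has no outgoing transitions. That fits a rule that recognizes the comment opener, although the listing does not show yy_acpt, so which states actually accept is not visible here.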

However, I am still evaluating alternatives for writing my parser. If you have any ideas or suggestions, do post them here.

posted by Logu Krishnan : 10:25 PM

Comments:
Hi Logu,

Interesting work. Please have a look at http://blogs.msdn.com/jmstall/archive/2005/02/06/368192.aspx

It's a pretty good compiler with a parser.

Take care...

Senthil


 