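LET'S BUILD A COMPILER! (4) - Part III: More Expressions (continued)
White Space
Before we leave this chapter, let's address the issue of white space. As it stands, the parser stops as soon as it reads a single white-space character. That is rather unfriendly behavior, so let's remove this last restriction and make the parser behave more like a commercial product.
The key to handling white space is to settle on one rule for how the parser treats white space in the input, and then to follow that rule everywhere. Until now, because white space was not allowed, we could assume that after each parsing action the lookahead character Look held the next meaningful character, so we could test it immediately. Our design was built on that principle.
Concretely, every routine that advances the input stream must skip over white space and leave the next non-white character in Look. Fortunately, because we have used GetName, GetNum, and Match for most of our input processing, only those three routines (plus Init) need to change.
First, we define a new recognizer routine for white space.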
{--------------------------------------------------------------}
{ Recognize White Space }

function IsWhite(c: char): boolean;
begin
   IsWhite := c in [' ', TAB];
end;
{--------------------------------------------------------------}
We also need a routine that "eats" white-space characters until it finds a non-white one.
{--------------------------------------------------------------}
{ Skip Over Leading White Space }

procedure SkipWhite;
begin
   while IsWhite(Look) do
      GetChar;
end;
{--------------------------------------------------------------}
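To watch these two routines work in isolation, here is a minimal stand-alone sketch. It assumes the input comes from a string buffer instead of the keyboard: Source, Cursor, and this version of GetChar are hypothetical stand-ins introduced only for the demonstration, not part of the parser itself.

```pascal
program SkipWhiteDemo;
{ Sketch only: Source and Cursor are assumptions that replace the
  real input stream so the result is reproducible. }

const
   TAB = ^I;
   CR  = ^M;

var
   Source: string;     { hypothetical input buffer }
   Cursor: integer;    { index of the next unread character }
   Look: char;         { lookahead character, as in the parser }

{ Read the next character from the buffer; report CR at the end }
procedure GetChar;
begin
   if Cursor <= Length(Source) then
   begin
      Look := Source[Cursor];
      Inc(Cursor);
   end
   else
      Look := CR;
end;

{ Recognize white space: a blank or a tab }
function IsWhite(c: char): boolean;
begin
   IsWhite := c in [' ', TAB];
end;

{ Skip over leading white space }
procedure SkipWhite;
begin
   while IsWhite(Look) do
      GetChar;
end;

begin
   Source := '  ' + TAB + 'x = 1';   { two blanks, a tab, then text }
   Cursor := 1;
   GetChar;
   SkipWhite;
   WriteLn('Look = [', Look, ']');   { prints: Look = [x] }
end.
```

After SkipWhite returns, Look holds the first character that is neither a blank nor a tab, which is the condition the rest of the parser relies on.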
Now add calls to SkipWhite to Match, GetName, and GetNum:
{--------------------------------------------------------------}
{ Match a Specific Input Character }

procedure Match(x: char);
begin
   if Look <> x then Expected('''' + x + '''')
   else begin
      GetChar;
      SkipWhite;
   end;
end;

{--------------------------------------------------------------}
{ Get an Identifier }

function GetName: string;
var Token: string;
begin
   Token := '';
   if not IsAlpha(Look) then Expected('Name');
   while IsAlNum(Look) do begin
      Token := Token + UpCase(Look);
      GetChar;
   end;
   GetName := Token;
   SkipWhite;
end;

{--------------------------------------------------------------}
{ Get a Number }

function GetNum: string;
var Value: string;
begin
   Value := '';
   if not IsDigit(Look) then Expected('Integer');
   while IsDigit(Look) do begin
      Value := Value + Look;
      GetChar;
   end;
   GetNum := Value;
   SkipWhite;
end;
{--------------------------------------------------------------}
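(Note that I rearranged Match a bit, without changing its functionality.)
Finally, we need to skip over leading white space at the start of the source in Init.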
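The effect of these changes is easiest to see by checking Look after each call. The following is a small self-contained sketch under the assumption that input is taken from a string buffer; Source, Cursor, the buffer-based GetChar, and the simplified Expected are hypothetical and exist only for this demonstration. The recognizers and the three input routines follow the listings above.

```pascal
program TokenDemo;
{ Sketch only: Source and Cursor are assumptions standing in for
  the input stream; Expected is simplified for the demo. }

const
   TAB = ^I;
   CR  = ^M;

var
   Source: string;
   Cursor: integer;
   Look: char;
   Name, Value: string;

procedure GetChar;
begin
   if Cursor <= Length(Source) then
   begin
      Look := Source[Cursor];
      Inc(Cursor);
   end
   else
      Look := CR;      { end of buffer behaves like end of line }
end;

procedure Expected(s: string);
begin
   WriteLn('Error: ', s, ' Expected.');
   Halt;
end;

function IsAlpha(c: char): boolean;
begin
   IsAlpha := UpCase(c) in ['A'..'Z'];
end;

function IsDigit(c: char): boolean;
begin
   IsDigit := c in ['0'..'9'];
end;

function IsWhite(c: char): boolean;
begin
   IsWhite := c in [' ', TAB];
end;

procedure SkipWhite;
begin
   while IsWhite(Look) do
      GetChar;
end;

procedure Match(x: char);
begin
   if Look <> x then Expected('''' + x + '''')
   else begin
      GetChar;
      SkipWhite;
   end;
end;

function GetName: string;
var Token: string;
begin
   Token := '';
   if not IsAlpha(Look) then Expected('Name');
   while IsAlpha(Look) or IsDigit(Look) do begin
      Token := Token + UpCase(Look);
      GetChar;
   end;
   GetName := Token;
   SkipWhite;          { leave Look on the next non-white character }
end;

function GetNum: string;
var Digits: string;
begin
   Digits := '';
   if not IsDigit(Look) then Expected('Integer');
   while IsDigit(Look) do begin
      Digits := Digits + Look;
      GetChar;
   end;
   GetNum := Digits;
   SkipWhite;
end;

begin
   Source := '  total' + TAB + '=   42 ';
   Cursor := 1;
   GetChar;
   SkipWhite;
   Name := GetName;
   WriteLn('name = ', Name, ', Look = [', Look, ']');   { name = TOTAL, Look = [=] }
   Match('=');
   WriteLn('after Match, Look = [', Look, ']');         { after Match, Look = [4] }
   Value := GetNum;
   WriteLn('value = ', Value);                          { value = 42 }
end.
```

Each routine consumes its own token plus any white space that follows it, so the caller never has to think about blanks or tabs.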
{--------------------------------------------------------------}
{ Initialize }

procedure Init;
begin
   GetChar;
   SkipWhite;
end;
{--------------------------------------------------------------}
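Make these changes to the program and recompile. Note that Match has to appear after SkipWhite in the source, because Pascal requires a procedure to be declared before it is called; otherwise the program will not compile. Test the new version to make sure it works properly.
Since we have made quite a few changes to the original program in this chapter, here is the complete program as it now stands: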
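If you would rather not reorder the routines, a forward declaration is another way to satisfy the compiler. This is not the layout used in this series, just a possible alternative, sketched here with a hypothetical string buffer (Source and Cursor) so that it can run on its own. The error handling in Match is also simplified for the demo.

```pascal
program ForwardDemo;
{ Sketch only: shows a forward declaration as an alternative to
  moving Match below SkipWhite. Source and Cursor are assumptions. }

const
   TAB = ^I;
   CR  = ^M;

var
   Source: string;
   Cursor: integer;
   Look: char;

procedure SkipWhite; forward;   { announced here, defined further down }

procedure GetChar;
begin
   if Cursor <= Length(Source) then
   begin
      Look := Source[Cursor];
      Inc(Cursor);
   end
   else
      Look := CR;
end;

{ Match can call SkipWhite even though its body comes later }
procedure Match(x: char);
begin
   if Look <> x then
   begin
      WriteLn('Error: ''', x, ''' Expected.');
      Halt;
   end;
   GetChar;
   SkipWhite;
end;

procedure SkipWhite;
begin
   while Look in [' ', TAB] do
      GetChar;
end;

begin
   Source := '(   x';
   Cursor := 1;
   GetChar;
   Match('(');
   WriteLn('Look = [', Look, ']');   { prints: Look = [x] }
end.
```

The complete program uses the same mechanism for Expression, which Factor needs to call before Expression has been defined.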
{--------------------------------------------------------------}
program parse;

{--------------------------------------------------------------}
{ Constant Declarations }

const TAB = ^I;
      CR  = ^M;

{--------------------------------------------------------------}
{ Variable Declarations }

var Look: char;              { Lookahead Character }

{--------------------------------------------------------------}
{ Read New Character From Input Stream }

procedure GetChar;
begin
   Read(Look);
end;

{--------------------------------------------------------------}
{ Report an Error }

procedure Error(s: string);
begin
   WriteLn;
   WriteLn(^G, 'Error: ', s, '.');
end;

{--------------------------------------------------------------}
{ Report Error and Halt }

procedure Abort(s: string);
begin
   Error(s);
   Halt;
end;

{--------------------------------------------------------------}
{ Report What Was Expected }

procedure Expected(s: string);
begin
   Abort(s + ' Expected');
end;

{--------------------------------------------------------------}
{ Recognize an Alpha Character }

function IsAlpha(c: char): boolean;
begin
   IsAlpha := UpCase(c) in ['A'..'Z'];
end;

{--------------------------------------------------------------}
{ Recognize a Decimal Digit }

function IsDigit(c: char): boolean;
begin
   IsDigit := c in ['0'..'9'];
end;

{--------------------------------------------------------------}
{ Recognize an Alphanumeric }

function IsAlNum(c: char): boolean;
begin
   IsAlNum := IsAlpha(c) or IsDigit(c);
end;

{--------------------------------------------------------------}
{ Recognize an Addop }

function IsAddop(c: char): boolean;
begin
   IsAddop := c in ['+', '-'];
end;

{--------------------------------------------------------------}
{ Recognize White Space }

function IsWhite(c: char): boolean;
begin
   IsWhite := c in [' ', TAB];
end;

{--------------------------------------------------------------}
{ Skip Over Leading White Space }

procedure SkipWhite;
begin
   while IsWhite(Look) do
      GetChar;
end;

{--------------------------------------------------------------}
{ Match a Specific Input Character }

procedure Match(x: char);
begin
   if Look <> x then Expected('''' + x + '''')
   else begin
      GetChar;
      SkipWhite;
   end;
end;

{--------------------------------------------------------------}
{ Get an Identifier }

function GetName: string;
var Token: string;
begin
   Token := '';
   if not IsAlpha(Look) then Expected('Name');
   while IsAlNum(Look) do begin
      Token := Token + UpCase(Look);
      GetChar;
   end;
   GetName := Token;
   SkipWhite;
end;

{--------------------------------------------------------------}
{ Get a Number }

function GetNum: string;
var Value: string;
begin
   Value := '';
   if not IsDigit(Look) then Expected('Integer');
   while IsDigit(Look) do begin
      Value := Value + Look;
      GetChar;
   end;
   GetNum := Value;
   SkipWhite;
end;

{--------------------------------------------------------------}
{ Output a String with Tab }

procedure Emit(s: string);
begin
   Write(TAB, s);
end;

{--------------------------------------------------------------}
{ Output a String with Tab and CRLF }

procedure EmitLn(s: string);
begin
   Emit(s);
   WriteLn;
end;

{--------------------------------------------------------------}
{ Parse and Translate an Identifier }

procedure Ident;
var Name: string[8];
begin
   Name := GetName;
   if Look = '(' then begin
      Match('(');
      Match(')');
      EmitLn('BSR ' + Name);
      end
   else
      EmitLn('MOVE ' + Name + '(PC),D0');
end;

{--------------------------------------------------------------}
{ Parse and Translate a Math Factor }

procedure Expression; Forward;

procedure Factor;
begin
   if Look = '(' then begin
      Match('(');
      Expression;
      Match(')');
      end
   else if IsAlpha(Look) then
      Ident
   else
      EmitLn('MOVE #' + GetNum + ',D0');
end;

{--------------------------------------------------------------}
{ Recognize and Translate a Multiply }

procedure Multiply;
begin
   Match('*');
   Factor;
   EmitLn('MULS (SP)+,D0');
end;

{--------------------------------------------------------------}
{ Recognize and Translate a Divide }

procedure Divide;
begin
   Match('/');
   Factor;
   EmitLn('MOVE (SP)+,D1');
   EmitLn('EXT.L D0');       { sign-extend D0 to 32 bits before DIVS }
   EmitLn('DIVS D1,D0');
end;

{--------------------------------------------------------------}
{ Parse and Translate a Math Term }

procedure Term;
begin
   Factor;
   while Look in ['*', '/'] do begin
      EmitLn('MOVE D0,-(SP)');
      case Look of
       '*': Multiply;
       '/': Divide;
      end;
   end;
end;

{--------------------------------------------------------------}
{ Recognize and Translate an Add }

procedure Add;
begin
   Match('+');
   Term;
   EmitLn('ADD (SP)+,D0');
end;

{--------------------------------------------------------------}
{ Recognize and Translate a Subtract }

procedure Subtract;
begin
   Match('-');
   Term;
   EmitLn('SUB (SP)+,D0');
   EmitLn('NEG D0');
end;

{--------------------------------------------------------------}
{ Parse and Translate an Expression }

procedure Expression;
begin
   if IsAddop(Look) then
      EmitLn('CLR D0')
   else
      Term;
   while IsAddop(Look) do begin
      EmitLn('MOVE D0,-(SP)');
      case Look of
       '+': Add;
       '-': Subtract;
      end;
   end;
end;

{--------------------------------------------------------------}
{ Parse and Translate an Assignment Statement }

procedure Assignment;
var Name: string[8];
begin
   Name := GetName;
   Match('=');
   Expression;
   EmitLn('LEA ' + Name + '(PC),A0');
   EmitLn('MOVE D0,(A0)')
end;

{--------------------------------------------------------------}
{ Initialize }

procedure Init;
begin
   GetChar;
   SkipWhite;
end;

{--------------------------------------------------------------}
{ Main Program }

begin
   Init;
   Assignment;
   if Look <> CR then Expected('NewLine');
end.
{--------------------------------------------------------------}
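That is the complete parser. It has all the features of a "one-line compiler", so I suggest you save it. Next time we will move on to a new topic, though we will still be talking about expressions for a while. In the next chapter I plan to say a bit about interpreters, as opposed to compilers, and show how the structure of the parser changes as we change the action it takes. What we learn there will serve us well later on, even if you have no interest in interpreters. See you next time.
(To be continued)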
