if symb='=' then
    print symb is a relational operator
else
    ungetc symb to input file
    print symb is an operator
end{if}
if symb='=' then
begin
    advance to next token in input file
    if symb='=' then
        print symb is equal to operator
    else
        ungetc symb to input file
        print symb is assignment operator
    end{if}
end{if}
if symb='&' then
begin
    advance to next token in input file
    if symb='&' then
        print symb is a logical and operator
    else
        print &symb is an address operator
end{if}
if symb='/' then
begin
    advance to next token in input file
    if symb='*' then
    begin
        advance to next token in input file
        while not (previous symb='*' and symb='/') do
            advance to next token in input file
        end{while}
    end{if}
    else if symb='/' then
    begin
        advance to next token in input file
        while symb!='\n' do
            advance to next token in input file
        end{while}
    end{if}
    else
        ungetc symb to input file
        print symb is a division operator
end{if}
if symb is a digit then
begin
    advance to next token in input file
    while symb is a digit or symb='.' do
    begin
        advance to next token in input file
    end{while}
    ungetc symb to input file
    print symb is a number
end{if}
if symb='"' then
begin
    advance to next token in input file
    while symb!='"' do
    begin
        advance to next token in input file
    end{while}
    print symb is a string
end{if}
if symb='{' then
    print open brace
if symb='}' then
    print close brace
if symb='[' then
    print open bracket
if symb=']' then
    print close bracket
if symb='(' then
    print open parenthesis
if symb=')' then
    print close parenthesis
end {procedure main}
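The one-character lookahead used in procedure main for '=', '&' and '/' can be sketched in C with getc and ungetc. This is only an illustration under stated assumptions: the names classify_symbol, peek_char, skip_block_comment, skip_line_comment and the enum symbol_kind are hypothetical and are not taken from lex.c. Unlike the pseudocode, the sketch does not read the operand that follows a single '&', and it leaves the newline after a // comment in the stream so that a line counter can still see it.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical token classes for this sketch; lex.c may use other names. */
enum symbol_kind {
    SYM_OTHER,        /* anything this sketch does not handle */
    SYM_EQUAL_TO,     /* two '=' in a row */
    SYM_ASSIGN,       /* a single '=' */
    SYM_LOGICAL_AND,  /* two '&' in a row */
    SYM_ADDRESS,      /* a single '&' */
    SYM_DIVISION,     /* a single '/' */
    SYM_COMMENT       /* a block or line comment that was skipped */
};

/* Look at the next character without consuming it. */
static int peek_char(FILE *in)
{
    int c = getc(in);
    if (c != EOF)
        ungetc(c, in);
    return c;
}

/* Skip a block comment body; the opening slash-star is already consumed. */
static void skip_block_comment(FILE *in)
{
    int prev = 0, c;
    while ((c = getc(in)) != EOF) {
        if (prev == '*' && c == '/')
            return;               /* found the closing star-slash */
        prev = c;
    }
}

/* Skip to the end of the line, pushing the newline back for the caller. */
static void skip_line_comment(FILE *in)
{
    int c;
    while ((c = getc(in)) != EOF && c != '\n')
        ;
    if (c == '\n')
        ungetc(c, in);
}

/* Read one symbol from 'in' and classify it using one character of lookahead. */
enum symbol_kind classify_symbol(FILE *in)
{
    assert(in != NULL);
    int symb = getc(in);

    switch (symb) {
    case '=':
        if (peek_char(in) == '=') {
            getc(in);             /* consume the second '=' */
            return SYM_EQUAL_TO;
        }
        return SYM_ASSIGN;
    case '&':
        if (peek_char(in) == '&') {
            getc(in);             /* consume the second '&' */
            return SYM_LOGICAL_AND;
        }
        return SYM_ADDRESS;
    case '/':
        switch (peek_char(in)) {
        case '*':
            getc(in);             /* consume the '*' that opens the comment */
            skip_block_comment(in);
            return SYM_COMMENT;
        case '/':
            skip_line_comment(in);
            return SYM_COMMENT;
        default:
            return SYM_DIVISION;
        }
    default:
        return SYM_OTHER;
    }
}
```

Peeking and pushing the character back keeps the "ungetc" step of the pseudocode in one helper, so each operator case only decides whether to consume the second character.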
procedure verify
begin
    scan the symbol table to check if encountered token exists
    if exists then
        return token value
    end{if}
end{procedure verify}
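A minimal C sketch of procedure verify is given below as a linear search over a small table. The table entries and the identifier value 18 are only those that appear in the sample output of this report; the full symbol table in lex.c is not shown here, so the table is partial. The names struct keyword, symbol_table and IDENTIFIER_VALUE are hypothetical, and returning 18 for a lexeme that is not found is an assumption inferred from the sample output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: identifiers get token value 18, as seen in the sample output. */
#define IDENTIFIER_VALUE 18

struct keyword {
    const char *name;   /* lexeme as it appears in the source */
    int value;          /* token value printed by the analyzer */
};

/* Partial table: only the entries visible in the sample output. */
static const struct keyword symbol_table[] = {
    { "int", 1 },
    { "printf", 5 },
    { "scanf", 6 },
    { "void", 7 },
    { "if", 8 },
};

/* Scan the table; return the token value if found, else the identifier value. */
int verify(const char *lexeme)
{
    assert(lexeme != NULL);
    size_t count = sizeof symbol_table / sizeof symbol_table[0];

    for (size_t i = 0; i < count; i++) {
        if (strcmp(symbol_table[i].name, lexeme) == 0)
            return symbol_table[i].value;
    }
    return IDENTIFIER_VALUE;
}
```

For example, verify("void") yields 7 and verify("main") falls through to the identifier value.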
USER MANUAL
The code for the modules is split across two files: lex.c and output.c. The file lex.c contains the main
source code of the lexical analyzer, and the input to the lexical analyzer is read from test.c. Under the DOS
operating system, the program is compiled with Alt+F9 and executed with Ctrl+F9. The output, i.e. the token
types, is stored in the output file output.txt.
Sample Input:
#include<stdio.h>
#include<stdlib.h>
#define abc 100
void main()
{
int a_,b=30;
printf("enter 2 no.s\n"); // printf statement
scanf("%d%d",&a,&b);
/* scanf
statement*/
if(a<20)
a=a+1;
}
Sample Output:
LINE NO    TOKENS
-----------------------------------------------
1:         #include<stdio.h> is a header file
2:         #include<stdlib.h> is a header file
3:         #define statement: abc is a constant
4:         void: token value : 7
           main :identifier, token value : 18
           (: open parenthesis
           ): close parenthesis
5:         {: open brace
6:         int: token value : 1
           a_ :identifier, token value : 18
           , : comma
           b :identifier, token value : 18
           =: assignment operator
           30 is a number
           ; : semi colon
7:         printf: token value : 5
           (: open parenthesis
           enter 2 no.s\n : is a string
           ): close parenthesis
           ;: semi colon
8:         scanf: token value : 6
           (: open parenthesis
           %d%d : is a string
           ,: comma
           &a: address operator
           , : comma
           &b: address operator
           ): close parenthesis
           ;: semi colon
9:
10:
11:        if: token value : 8
           (: open parenthesis
           a :identifier, token value : 18
           <: less than operator
           20 is a number
           ): close parenthesis
12:        a: token value : 18
           =: assignment operator
           a: token value : 18
           +: plus operator
           1 is a number
           ;: semi colon
13:        }: close parenthesis
CONCLUSION
Generally, when syntactic analysis is carried out, the parser calls upon the scanner to tokenize
the input as needed. The LEXICAL ANALYZER designed by us, however, is an independent program:
it takes as input a file containing C source code and writes the token types to an output file. Therefore,
the parser cannot make use of the designed scanner as and when required.
Consider as an example an array ch[20]. The designed lexical analyzer will tokenize 'ch' as an
identifier, '[' as an opening bracket, '20' as a number, and ']' as a closing bracket. But the parser might
require ch[20] to be identified as an array. Similarly, there may arise a number of cases where the parser has
to identify a token in a manner different from the one specified and designed. Hence, we conclude that the
LEXICAL ANALYZER so designed is an independent program which is not flexible.
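For contrast, the sketch below shows one way a scanner could be exposed as a function that a parser calls whenever it needs the next token, using the ch[20] example. It is not the design of lex.c: the names struct scanner, struct token, next_token, is_punct and is_array_reference are hypothetical, the scanner reads from a string rather than a file, and only identifiers, unsigned integers and single-character punctuation are recognized.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum token_kind { TOK_END, TOK_IDENT, TOK_NUMBER, TOK_PUNCT };

struct token {
    enum token_kind kind;
    const char *start;   /* first character of the lexeme */
    size_t length;       /* number of characters in the lexeme */
};

struct scanner {
    const char *pos;     /* next unread character of the input string */
};

/* Return the next token on demand; the parser decides when to call this. */
struct token next_token(struct scanner *s)
{
    assert(s != NULL && s->pos != NULL);
    while (isspace((unsigned char)*s->pos))
        s->pos++;

    struct token t = { TOK_END, s->pos, 0 };
    if (*s->pos == '\0')
        return t;

    if (isalpha((unsigned char)*s->pos) || *s->pos == '_') {
        t.kind = TOK_IDENT;
        while (isalnum((unsigned char)*s->pos) || *s->pos == '_')
            s->pos++;
    } else if (isdigit((unsigned char)*s->pos)) {
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*s->pos))
            s->pos++;
    } else {
        t.kind = TOK_PUNCT;      /* any other single character */
        s->pos++;
    }
    t.length = (size_t)(s->pos - t.start);
    return t;
}

/* True if t is the single punctuation character c. */
static int is_punct(struct token t, char c)
{
    return t.kind == TOK_PUNCT && t.length == 1 && *t.start == c;
}

/* Parser-side check: does text begin with identifier '[' number ']'? */
int is_array_reference(const char *text)
{
    struct scanner s = { text };

    if (next_token(&s).kind != TOK_IDENT)
        return 0;
    if (!is_punct(next_token(&s), '['))
        return 0;
    if (next_token(&s).kind != TOK_NUMBER)
        return 0;
    return is_punct(next_token(&s), ']');
}
```

With this arrangement the scanner still produces the same four tokens for ch[20], but the parser pulls them one at a time and can group them into an array reference itself.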