Question

Java Code: Look through the Language Description and build a list of keywords. Add a HashMap to your Lexer class and initialize it with all the keywords. Change your lexer so that it checks each string before making the WORD token and creates a token of the appropriate type if the word is a keyword. When the exact type of a token is known (like “WHILE”), you should NOT fill in the value string; the type is enough. For tokens with no exact type (like “hello”), we still need to fill in the token’s string. Finally, rename “WORD” to “IDENTIFIER”.
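
The keyword check described above could be sketched roughly as follows. This is only an illustration: the `TokenType` values, the `WordToken` class, and the `KeywordLookup` helper are hypothetical names, the four keywords are placeholders for the real list in the Language Description, and matching is assumed to be case-sensitive.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, trimmed-down token types; the real list comes from the Language Description.
enum TokenType { IDENTIFIER, WHILE, IF, DEFINE, VARIABLES }

// Hypothetical token class: value stays null when the type alone identifies the token.
class WordToken {
    final TokenType type;
    final String value;

    WordToken(TokenType type, String value) {
        this.type = type;
        this.value = value;
    }

    @Override
    public String toString() {
        return value == null ? type.toString() : type + "(" + value + ")";
    }
}

class KeywordLookup {
    // Assumption: keywords are lowercase and matched case-sensitively.
    private static final Map<String, TokenType> KEYWORDS = new HashMap<>();
    static {
        KEYWORDS.put("while", TokenType.WHILE);
        KEYWORDS.put("if", TokenType.IF);
        KEYWORDS.put("define", TokenType.DEFINE);
        KEYWORDS.put("variables", TokenType.VARIABLES);
    }

    // Called where the lexer used to emit WORD: keywords get their own type and no value,
    // everything else becomes an IDENTIFIER carrying its text.
    static WordToken makeWordToken(String word) {
        TokenType keyword = KEYWORDS.get(word);
        if (keyword != null) {
            return new WordToken(keyword, null);
        }
        return new WordToken(TokenType.IDENTIFIER, word);
    }
}
```

A single map lookup replaces a long chain of string comparisons, and adding a keyword later means adding one `put` call.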

Similarly, look through the Language Description for the list of punctuation. A hash map is not necessary or helpful for these – they need to be added to your state machine. Be particularly careful about the multi-character operators like := or >=. These require a little more complexity in your state machine. 
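
One way to handle the multi-character operators is a single character of lookahead inside the state machine. The sketch below is an assumption-laden illustration, not the required design: `OperatorScanner` and the token names (`ASSIGN`, `COLON`, and so on) are made up, only a handful of symbols are covered, and `IllegalArgumentException` stands in for the assignment's own exception type.

```java
import java.util.ArrayList;
import java.util.List;

class OperatorScanner {
    // Returns hypothetical token-type names for the punctuation on one line, skipping spaces and tabs.
    static List<String> scan(String line) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            // Peek one character ahead; '\0' means nothing is left on this line.
            char next = (i + 1 < line.length()) ? line.charAt(i + 1) : '\0';
            switch (c) {
                case ' ':
                case '\t':
                    i++;
                    break;
                case ':':
                    if (next == '=') { out.add("ASSIGN"); i += 2; }
                    else { out.add("COLON"); i++; }
                    break;
                case '>':
                    if (next == '=') { out.add("GREATER_EQUAL"); i += 2; }
                    else { out.add("GREATER"); i++; }
                    break;
                case '<':
                    if (next == '=') { out.add("LESS_EQUAL"); i += 2; }
                    else { out.add("LESS"); i++; }
                    break;
                default:
                    // Stand-in for the assignment's syntax error handling.
                    throw new IllegalArgumentException("Unexpected character '" + c + "'");
            }
        }
        return out;
    }
}
```

The key point is that the two-character form is tried first; checking for `:` alone before `:=` would split the operator into two tokens.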

Strings and characters will require some additions to your state machine. Create “STRINGLITERAL” and “CHARACTERLITERAL” token types. These cannot cross line boundaries. Note that we aren’t going to build in escaping like Java does (“This is a double quote\” that is inside a string” or ‘\’’).
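
Because there is no escaping and literals cannot cross lines, reading one amounts to finding the next matching quote on the same line. The following is a rough sketch under several assumptions: `LiteralScanner` and its methods are hypothetical, the results are returned as display strings rather than real tokens, a character literal is assumed to hold exactly one character, and `IllegalStateException` stands in for the assignment's own exception type.

```java
class LiteralScanner {
    // Reads the text between the quote at line.charAt(start) and the next matching quote.
    // The caller can skip past the literal by advancing text.length() + 2 characters.
    static String readQuoted(String line, int start, char quote) {
        int end = line.indexOf(quote, start + 1);
        if (end < 0) {
            // No closing quote on this line: the literal is unterminated.
            throw new IllegalStateException("Unterminated literal starting at column " + start);
        }
        return line.substring(start + 1, end);
    }

    static String readString(String line, int start) {
        return "STRINGLITERAL(" + readQuoted(line, start, '"') + ")";
    }

    static String readCharacter(String line, int start) {
        String text = readQuoted(line, start, '\'');
        // Assumption: a character literal holds exactly one character.
        if (text.length() != 1) {
            throw new IllegalStateException("Character literal must contain one character");
        }
        return "CHARACTERLITERAL(" + text + ")";
    }
}
```

Since the search never looks past the current line, the "cannot cross line boundaries" rule falls out of the design without extra state.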

 

Comments, too, require a bit more complexity in your state machine. When a comment starts, you need to accept and ignore everything until the closing comment character. Assume that comments cannot be nested – {{this is invalid} and will be a syntax error later}. Remember, though, that comments can span lines, unlike numbers or words or symbols; no token should be output for comments.
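
Since comments can span lines, the "inside a comment" flag has to live in the lexer object rather than in a local variable. A minimal sketch of that idea, assuming `{` opens and `}` closes a comment as in the example above; the `CommentStripper` class is hypothetical, and it does not account for braces that appear inside string literals.

```java
class CommentStripper {
    // Kept as a field so the state survives from one line to the next.
    private boolean inComment = false;

    // Returns the line with comment text dropped; no token is produced for comments.
    String strip(String line) {
        StringBuilder kept = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inComment) {
                if (c == '}') {
                    inComment = false; // the first closing brace ends the comment
                }
            } else if (c == '{') {
                inComment = true;
            } else {
                kept.append(c); // a stray '}' lands here and can be reported as an error later
            }
        }
        return kept.toString();
    }

    boolean isInComment() {
        return inComment;
    }
}
```

With this approach the nested example behaves as described: the first `}` closes the comment, and the remaining text, including the second `}`, is passed on to be lexed.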

 

Your lexer should throw an exception if it encounters a character that it doesn’t expect outside of a comment, string literal or character literal. Create a new exception type that includes a good error message and the token that failed. Ensure that the toString method prints nicely. An example of this might be:

ThisIsAnIdentifier 123 ! { <- that exclamation is unexpected }

Add “line number” to your Token class. Keep track of the current line number in your lexer and populate each Token’s line number; this is straightforward because each call to lex() will be one line greater than the last one. The line number should be added to the exception, too, so that users can fix the exceptions.
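
A token that carries its line number, and an exception that reports one, might look something like the sketch below. The field names, the constructor shapes, the exact `toString` formats, and the choice to store the failing text as a `String` are all assumptions; `LineToken` is a hypothetical stand-in for your Token class, and the two classes are shown together only for brevity.

```java
// Hypothetical token shape: type name, optional value, and the line it came from.
class LineToken {
    final String type;
    final String value;
    final int lineNumber;

    LineToken(String type, String value, int lineNumber) {
        this.type = type;
        this.value = value;
        this.lineNumber = lineNumber;
    }

    @Override
    public String toString() {
        String text = (value == null) ? type : type + "(" + value + ")";
        return text + " @ line " + lineNumber;
    }
}

// Carries a message, the text that could not be lexed, and the line number.
class SyntaxErrorException extends Exception {
    private final String offendingText;
    private final int lineNumber;

    SyntaxErrorException(String message, String offendingText, int lineNumber) {
        super(message);
        this.offendingText = offendingText;
        this.lineNumber = lineNumber;
    }

    int getLineNumber() {
        return lineNumber;
    }

    @Override
    public String toString() {
        return "SyntaxErrorException at line " + lineNumber + ": "
                + getMessage() + " (near \"" + offendingText + "\")";
    }
}
```

In the lexer, a counter incremented once per call to `lex()` would supply the `lineNumber` argument for both classes.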

Finally, indentation. This is not as bad as it seems. For each line, count from the beginning the number of spaces and tabs until you reach a non-space/tab. Each tab OR four spaces is an indentation level. If the indentation level is greater than the last line (keep track of this in the lexer), output one or more INDENT tokens. If the indentation level is less than the last line, output one or more DEDENT tokens (obviously you will need to make new token types). For example:

1 { indent level 0, output NUMBER 1 }

               a { indent level 1, output an INDENT token, then IDENTIFIER a }

                              b { indent level 2, output an INDENT token, then IDENTIFIER b }

                                                            c { indent level 4, output 2 INDENT tokens, then IDENTIFIER c }

2 { indent level 0; output 4 DEDENT tokens, then NUMBER 2 }

Be careful of the two exceptions:

  • If there are no non-space/tab characters on the line, don’t output an INDENT or DEDENT and don’t change the stored indentation level.
  • If we are in the middle of a multi-line comment, indentation is not considered.

Note that at end of file, you must output DEDENTs to get back to level 0.
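
The indentation rules above can be sketched as a small helper that remembers the previous level. This is one possible reading, not a definitive implementation: `IndentTracker` is a hypothetical class, tokens are represented by their names only, and leftover spaces that do not make up a full group of four are assumed to be ignored. The caller is expected to skip `tokensFor` entirely while inside a multi-line comment.

```java
import java.util.ArrayList;
import java.util.List;

class IndentTracker {
    private int previousLevel = 0;

    // Returns the INDENT or DEDENT token names to emit before the rest of the line is lexed.
    List<String> tokensFor(String line) {
        int spaces = 0;
        int tabs = 0;
        int i = 0;
        while (i < line.length() && (line.charAt(i) == ' ' || line.charAt(i) == '\t')) {
            if (line.charAt(i) == '\t') tabs++; else spaces++;
            i++;
        }
        List<String> out = new ArrayList<>();
        if (i == line.length()) {
            return out; // only spaces/tabs (or empty): no tokens, stored level unchanged
        }
        // Each tab OR four spaces is one level; assumption: extra spaces are dropped.
        int level = tabs + spaces / 4;
        for (int n = previousLevel; n < level; n++) out.add("INDENT");
        for (int n = previousLevel; n > level; n--) out.add("DEDENT");
        previousLevel = level;
        return out;
    }

    // At end of file, emit enough DEDENTs to return to level 0.
    List<String> endOfFile() {
        List<String> out = new ArrayList<>();
        for (int n = previousLevel; n > 0; n--) out.add("DEDENT");
        previousLevel = 0;
        return out;
    }
}
```

Feeding it lines at levels 0, 1, 2, 4 and then 0 yields no tokens, one INDENT, one INDENT, two INDENTs, and four DEDENTs, matching the example.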

Your exception must be called “SyntaxErrorException” and be in its own file. Unterminated strings or characters are invalid and should throw this exception, along with any invalid symbols. Attached is the image of the list of requirements that lexer.java must have. Make sure to show the full lexer.java code with the screenshot of shank.txt  being tested in the console. 

 

 

shank.txt

 

Fibonacci (Iterative)

 

define add (num1,num2:integer var sum : integer)

variable counter : integer

Fibonacci(N)

int N = 10;

while counter < N

define start ()

variables num1,num2,num3 : integer

add num1,num2,var num3

{num1 and num2 are added together to get num3}

num1 = num2;

num2 = num3;

counter = counter + 1;

 

GCD (Recursive)

 

define add (int a,int b : gcd)

if b = 0

sum = a

sum gcd(b, a % b)

 

GCD (Iterative)

 

define add (inta, intb : gcd)

if a = 0

sum = b

if b = 0

sum = a

while counter a != b

if a > b

a = a - b;

else

b = b - a;

sum = a;

variables a,b : integer

a = 60

b = 96

subtract a,b

Follow-up Question

Where is the code for lexer, main, and token classes? Make sure to show the full code for each class with the screenshot of shank.txt being printed out as tokens in the console. 
