Answer to Question #304924 in Python for Sudheer

Question #304924

Sanjay has been recruited by Infosys as a Python developer. The company has assigned him a software development task: building a compiler. Sanjay realizes that, to design the compiler, he needs to implement tokenization of a specific line of a program. As a friend of Sanjay, your task is to help him write a Python function Tokenize_Line(fname, n) (refer to RollNo_W11A_2.py) which reads the nth line from the prime.py file and returns a list containing all tokens of that line, as shown in the example. Also, handle possible exceptions and provide a proper message.

The contents of the prime.py file will be like:

from math import sqrt
n = 1
prime_flag = 0
if(n > 1):
    for i in range(2, int(sqrt(n)) + 1):
        if (n % i == 0):
            prime_flag = 1
            break
    if (prime_flag == 0):
        print("true")
    else:
        print("false")
else:
    print("false")

Expert's answer

#!/usr/bin/env python3

import re
from collections import namedtuple

class Tokenizer:

  Token = namedtuple('Token', 'name text span')

  def __init__(self, tokens):
    self.tokens = tokens
    pat_list = []
    for tok, pat in self.tokens:
      pat_list.append('(?P<%s>%s)' % (tok, pat))
    self.re = re.compile('|'.join(pat_list))

  def iter_tokens(self, text, ignore_ws=True):
    # Yield a Token for each match, skipping whitespace when ignore_ws is set.
    for match in self.re.finditer(text):
      if ignore_ws and match.lastgroup == 'WHITESPACE':
        continue
      yield Tokenizer.Token(match.lastgroup, match.group(0), match.span(0))

  def tokenize(self, text, ignore_ws=True):
    # Convenience wrapper: return all tokens as a list.
    return list(self.iter_tokens(text, ignore_ws))

# test program
if __name__ == "__main__":

  TOKENS = [
    ('NIL'        , r"nil|'\(\)"),
    ('TRUE'       , r'true|#t'),
    ('FALSE'      , r'false|#f'),
    ('NUMBER'     , r'\d+'),
    ('STRING'     , r'"(\\.|[^"])*"'),
    ('SYMBOL'     , r'[\x21-\x26\x2a-\x7e]+'),
    ('QUOTE'      , r"'"),
    ('LPAREN'     , r'\('),
    ('RPAREN'     , r'\)'),
    ('DOT'        , r'\.'),
    ('WHITESPACE' , r'\s+'),
    ('ERROR'      , r'.'),
  ]

  for t in Tokenizer(TOKENS).iter_tokens('(+ nil 1 2)'):
    print(t)
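
The Tokenizer class above is a general regex tokenizer demonstrated on a Lisp-like expression; it does not define the Tokenize_Line(fname, n) function that the question asks for. The following is a minimal sketch of one way that function might look, not a definitive solution. It rests on several assumptions that the question does not confirm: lines are counted from 1, the expected result is a plain list of token strings (the example mentioned in the question is not reproduced here), and an empty list is returned after printing a message when something goes wrong. The TOKEN_RE pattern is a hypothetical, simplified token grammar that covers the constructs appearing in prime.py, not the full Python lexical grammar.

```python
import re

# Hypothetical, simplified token grammar (an assumption, not full Python lexing).
# Alternatives are tried left to right, so "==" is matched before a lone "=".
TOKEN_RE = re.compile(r'''
    "(?:\\.|[^"\\])*"        # double-quoted string literal
  | '(?:\\.|[^'\\])*'        # single-quoted string literal
  | \d+(?:\.\d+)?            # integer or simple decimal number
  | [A-Za-z_]\w*             # identifier or keyword
  | ==|!=|<=|>=|\*\*|//      # common two-character operators
  | \S                       # any other single non-space symbol
''', re.VERBOSE)


def Tokenize_Line(fname, n):
    """Return the tokens on line n (assumed 1-based) of fname, or [] on error."""
    try:
        with open(fname, encoding="utf-8") as f:
            lines = f.read().splitlines()
        if not isinstance(n, int) or n < 1 or n > len(lines):
            raise IndexError(
                f"line {n} is out of range (file has {len(lines)} lines)"
            )
        # findall returns whole matches because the pattern has no capturing groups.
        return TOKEN_RE.findall(lines[n - 1])
    except FileNotFoundError:
        print(f"Error: file '{fname}' not found")
    except (IndexError, OSError) as exc:
        print(f"Error: {exc}")
    return []
```

For instance, if line 4 of prime.py is if(n > 1): then Tokenize_Line("prime.py", 4) would return ['if', '(', 'n', '>', '1', ')', ':']. Returning an empty list on failure is a design choice made for this sketch; re-raising the exception after printing the message would be an equally reasonable alternative.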
