Reading chars in a text file - Programmers Heaven


Reading chars in a text file

XeonicXpressio Posts: 5 Member
OK, here is what I need to do.
Read in a text file and calculate:
1. The total number of lines in the file, including blank lines.
2. The number of blank lines in the file.
3. The number of sentences in the file of text. You may assume that sentences must end with a period, a question mark, or an exclamation point.
4. The number of words in the file. (Think about how you can determine when a word ends.)
5. The number of non-blank characters in the file, including punctuation.
[code]
def main():
    myInFile = open("inn.txt", "r")
    TotalLines = 0
    BlankLines = 0
    Words = 0
    for ch in myInFile:
        TotalLines = TotalLines + 1
        InLine = myInFile.readline()
        if (InLine == "\n"):
            BlankLines = BlankLines + 1
    print "Total Words", Words
    print "Total Lines", TotalLines
    print "Total Blank Lines", BlankLines
main()
[/code]
That is what I have so far, but doing for ch in myInFile doesn't look at each character and I can't figure out how to do that. Also, shouldn't if (InLine == "\n"): give me the total number of blank lines? It doesn't seem to be working. I guess until I can figure out how to look at the characters in each line I'm kind of at a dead end. Any help would be appreciated. If someone could help me get started here and tell me if I am even close to being on the right track, I would appreciate it.

Comments

  • Drost Posts: 24 Member
    [b][red]This message was edited by Drost at 2005-2-26 3:57:39[/red][/b][hr]
    : Ok here is what i need to do.
    : Read in a text file and calculate
    : 1. The total number of lines in the file, including blank lines.
    : 2. The number of blank lines in the file.
    : 3. The number of sentences in the file of text. You may assume that sentences must end with a period, a question mark, or an exclamation point.
    : 4. The number of words in the file. (Think about how you can determine when a word ends.)
    : 5. The number of non-blank characters in the file, including punctuation.
    : [code]
    : def main():
    :     myInFile = open("inn.txt", "r")
    :     TotalLines = 0
    :     BlankLines = 0
    :     Words = 0
    :     [red]for ch in myInFile:[/red]
    :         TotalLines = TotalLines + 1
    :         InLine = [red]myInFile.readline()[/red]
    :         if (InLine == "\n"):
    :             BlankLines = BlankLines + 1
    :     print "Total Words", Words
    :     print "Total Lines", TotalLines
    :     print "Total Blank Lines", BlankLines
    : main()
    : [/code]
    : That is what I have so far, but doing for ch in myInFile doesn't look at each character and I can't figure out how to do that. Also, shouldn't if (InLine == "\n"): give me the total number of blank lines? It doesn't seem to be working. I guess until I can figure out how to look at the characters in each line I'm kind of at a dead end. Any help would be appreciated. If someone could help me get started here and tell me if I am even close to being on the right track, I would appreciate it.
    :

    Hi

    In your code the red-marked loop already iterates over the lines of the file (speaking of Python 2.3+), so the loop variable holds a whole line, not a character, and deliberately reading another line inside the loop (the other red-marked part) is superfluous.

    [code]
    lines, blanks, sentences, words, nonwhite = 0, 0, 0, 0, 0

    textf = open('file.txt', 'r')
    for l in textf:  # iterating over a file yields one line at a time
        lines += 1
        if l.startswith('\n'):
            blanks += 1 # sorry MACs
        else:
            # every '.', '!' and '?' is counted as one sentence end
            sentences += l.count('.') + l.count('!') + l.count('?')
            tempwords = l.split(None)  # split on runs of whitespace
            words += len(tempwords)
            nonwhite += sum(map(len, tempwords))
    textf.close()

    print "Lines: ", lines
    print "Blanks: ", blanks
    print "Sentences: ", sentences
    print "Words: ", words
    print "non whitespace characters: ", nonwhite
    raw_input('Press Enter...')
    [/code]

    The sentence count is fooled by things like '...' or '?!?', because every single mark is counted as a sentence end.

    Drost
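
One possible way around the sentence-count caveat mentioned above is to count runs of terminators instead of single marks. This is only a sketch, assuming that any unbroken run of '.', '!' or '?' ends exactly one sentence. It is written for Python 3 rather than the Python 2 used in the thread, and the helper name count_sentence_ends is hypothetical.

```python
import re

# Hypothetical helper, not part of the code posted in the thread.
# Assumption: one or more consecutive '.', '!' or '?' mark a single sentence end.
TERMINATOR_RUN = re.compile(r"[.!?]+")

def count_sentence_ends(line):
    """Return the number of terminator runs found in one line of text."""
    return len(TERMINATOR_RUN.findall(line))

# '...' and '?!?' each count once here; adding up str.count results would give 7.
print(count_sentence_ends("Wait... what?!? Really."))
```

Abbreviations such as "e.g." would still be overcounted, so this narrows the problem rather than solving it.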

  • XeonicXpressio Posts: 5 Member
    Thank you so much Drost! I guess the whole read line was what was screwing me up. I was wondering if you could explain a few things to me.

    [code]l.count('.'), l.count('!'), l.count('?')[/code]

    Since l is a whole line, is count reading every character in the line?

    [code] tempwords = l.split(None)[/code]

    This creates a list and the None omits the blank characters, right?

    [code]
    words += len(tempwords)
    nonwhite += sum(map(len, tempwords))
    [/code]

    Why does len(tempwords) count the number of cells when it is assigned to words, but count the number of characters in each cell and create a new list from that when it is used in map? I read about map and I thought that it would just be applying len to tempwords again, like the line above it. I understand what it is doing, I just don't understand why it is doing it.

    Thank you for your help, that gave me a much better understanding of how the syntax of python works. I knew how I wanted to do this I just couldn't get the syntax down at all.
  • Drost Posts: 24 Member
    : Thank you so much Drost! I guess the whole read line was what was screwing me up. I was wondering if you could explain a few things to me.
    :
    : [code]l.count('.'), l.count('!'), l.count('?')[/code]
    :
    : Since l is a whole line, is count reading every character in the line?

    #1

    :
    : [code] tempwords = l.split(None)[/code]
    :
    : This creates a list and the None omits the blank characters, right?

    #2

    : [code]
    : words += len(tempwords)
    : nonwhite += sum(map(len, tempwords))
    : [/code]
    :
    : Why is len(tempwords) when assigned to words counting the number of cells, but when it is in map it counts the number of characters in the cell and creates a new list from that? I read about map and I thought that it would just be applying len to tempwords again, like the line above it. I understand what it is doing I just dont understand why it is doing it.

    #3

    : Thank you for your help, that gave me a much better understanding of how the syntax of python works. I knew how I wanted to do this I just couldn't get the syntax down at all.
    :

    #4

    ---------
    #1:
    Strings have a count method which takes a substring as a parameter and returns the number of times it can be found in the string.

    So basically we count how many end-of-sentence marks we can find in the line.

    #2:
    Strings also have a split method, which returns a list created by splitting the string at the separator passed as its parameter. The special thing about the None parameter is that it makes the separator a run of whitespace (spaces, tabs, etc.) of [b]any length[/b].

    [code]"asd fgh jkl".split(' ')[/code]
    would result in ["asd", "fgh", "jkl"]

    but

    [code]"asd fgh  jkl".split(' ') # note the extra space[/code]
    would result in ["asd", "fgh", "", "jkl"]

    So to get only real words I used .split(None). :)

    #3:
    tempwords is a list whose elements are the words of the current line, so len(tempwords) on its own is the number of words. map instead applies len to each member of the list (each word) and returns their lengths as a new list of numbers. Their sum is the number of 'useful' characters in the line.

    #4:
    The manual is a great help. :)

    Drost
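
The points in #1 to #3 above can be checked on a single sample line. This is a small sketch written for Python 3, unlike the Python 2 code in the thread, so map has to be wrapped in list() to see its result. The sample text and all variable names are made up for illustration.

```python
# Sample line made up for illustration; note the two spaces after "it".
line = "Is it  here? Yes. Done!\n"

# #1: str.count returns how many times the substring occurs in the string.
terminators = line.count('.') + line.count('!') + line.count('?')

# #2: split(None) splits on runs of whitespace and never yields empty strings,
# while split(' ') splits on every single space.
real_words = line.split(None)
naive_words = line.split(' ')

# #3: len(real_words) measures the list itself (the number of words),
# while map(len, real_words) applies len to each word in turn.
word_count = len(real_words)
word_lengths = list(map(len, real_words))
nonwhite = sum(word_lengths)

print(terminators)   # 3
print(real_words)    # ['Is', 'it', 'here?', 'Yes.', 'Done!']
print(naive_words)   # ['Is', 'it', '', 'here?', 'Yes.', 'Done!\n']
print(word_count)    # 5
print(word_lengths)  # [2, 2, 5, 4, 5]
print(nonwhite)      # 18
```

The empty string and the trailing newline in naive_words are the reason split(' ') would inflate both the word count and the character count.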