The Things They Carried


Revision as of 22:26, 2 October 2015

--D. Thiebaut (talk) 22:39, 2 October 2015 (EDT)



OBrienTheThingsTheyCarried.png



Source


This section is only visible to computers located at Smith College


Stats


  • 4668 lines
  • 66025 words
  • 382094 characters
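
The three counts above are the kind of figures a tool such as `wc` reports. As a minimal sketch, assuming the book has been read into a single string, they could be computed as follows. `text_stats` is a hypothetical helper that is not part of the original page, and the original figures may have been obtained differently.

```python
def text_stats(text):
    """Return (lines, words, characters) for a string of text."""
    lines = len(text.splitlines())   # number of lines
    words = len(text.split())        # whitespace-separated tokens
    chars = len(text)                # every character, newlines included
    return lines, words, chars


if __name__ == "__main__":
    sample = "They carried all they could bear,\nand then some.\n"
    print("%d lines, %d words, %d characters" % text_stats(sample))
```

To apply it to the book, read the file first, e.g. `text_stats(open("TheThingsTheyCarried.txt").read())`.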


Python Program for Word Frequencies


# compute the 100 most frequent words in the document
import string


doc = "TheThingsTheyCarried.txt"
stopwords = """a about above across after afterwards again against all almost 
alone along already also although always am among amongst amoungst 
amount an and another any anyhow anyone anything anyway anywhere 
are around as at back be became because become becomes becoming 
been before beforehand behind being below beside besides between 
beyond bill both bottom but by call can cannot cant co computer 
con could couldnt cry de describe detail do done down due during 
each eg eight either eleven else elsewhere empty enough etc even 
ever every everyone everything everywhere except few fifteen 
fify fill find fire first five for former formerly forty found 
four from front full further get give go had has hasnt have he 
hence her here hereafter hereby herein hereupon hers herse" him 
himse" his how however hundred i ie if in inc indeed interest 
into is it its itse" keep last latter latterly least less ltd 
made many may me meanwhile might mill mine more moreover most 
mostly move much must my myse" name namely neither never nevertheless 
next nine no nobody none noone nor not nothing now nowhere of 
off often on once one only onto or other others otherwise our 
ours ourselves out over own part per perhaps please put rather 
re same see seem seemed seeming seems serious several she should 
show side since sincere six sixty so some somehow someone something 
sometime sometimes somewhere still such system take ten than 
that the their them themselves then thence there thereafter thereby 
therefore therein thereupon these they thick thin third this 
those though three through throughout thru thus to together too 
top toward towards twelve twenty two un under until up upon us 
very via was we well were what whatever when whence whenever 
where whereafter whereas whereby wherein whereupon wherever whether 
which while whither who whoever whole whom whose why will with 
within without would yet you your yours yourself yourselves dont 
got just did didnt im
"""
def displayStops():
    s = ""
    for a in stopwords.split():
        s = s+a+" "
        if len( s )>60:
            print(s)
            s = ""
    print( s )


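def main():
    # read the whole document; the with-block closes the file
    with open( doc, "r" ) as f:
        text = f.read()

    # build the set locally so the global string stays usable by displayStops()
    stops = set( stopwords.lower().split() )

    # strip ASCII punctuation, so that "don't" becomes "dont"
    exclude = set( string.punctuation )
    text = ''.join( ch for ch in text if ch not in exclude )

    # count every lowercased word that is not a stop word
    dico = {}
    for word in text.lower().split():
        if word in stops:
            continue
        dico[word] = dico.get( word, 0 ) + 1

    # sort (count, word) pairs, highest count first;
    # ties come out in reverse alphabetical order
    pairs = sorted( ( (n, w) for (w, n) in dico.items() ), reverse=True )

    # print each of the top 100 words with its count
    for n, w in pairs[0:100]:
        print( "%-20s%5d" % (w, n) )

#displayStops()
main()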
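
The same counting can be sketched more compactly with `collections.Counter` from the standard library. This is an alternative illustration and not the program used to produce the listing on this page. `top_words` and its parameters are hypothetical names. `Counter.most_common` keeps tied words in first-seen order, so ties may come out in a different order than in the program above.

```python
import string
from collections import Counter


def top_words(text, stopwords, n=100):
    """Return the n most frequent non-stop words as (word, count) pairs."""
    # delete ASCII punctuation in one pass
    table = str.maketrans("", "", string.punctuation)
    words = text.translate(table).lower().split()
    # count only the words that are not in the stop list
    counts = Counter(w for w in words if w not in stopwords)
    return counts.most_common(n)


if __name__ == "__main__":
    sample = "The war, the war. A man!"
    for word, count in top_words(sample, {"the", "a"}, 2):
        print("%-20s%5d" % (word, count))
```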


100 Most Frequent Words


said                  413
like                  224
war                   181
carried               156
man                   148
rat                   147
things                146
night                 136
time                  119
right                 114
way                   112
kiowa                 109
eyes                  109
away                  105
sanders               104
old                   101
know                   98
field                  97
head                   96
remember               94
say                    92
dead                   90
id                     86
took                   85
tell                   84
story                  82
little                 82
felt                   80
maybe                  77
went                   76
later                  73
cross                  73
azar                   71
himself                70
kept                   69
dark                   69
looked                 68
long                   68
hed                    68
real                   66
mitchell               66
bowker                 65
thought                64
hard                   64
came                   63
river                  62
thing                  61
norman                 61
good                   61
told                   60
thats                  60
make                   60
feel                   60
stories                58
water                  57
new                    55
body                   55
years                  53
wanted                 53
think                  53
place                  53
look                   53
lieutenant             53
day                    53
tried                  52
guys                   52
young                  51
morning                51
men                    51
knew                   51
jimmy                  51
love                   50
kiley                  50
inside                 50
want                   49
sound                  48
fossie                 48
bad                    48
myself                 47
guy                    47
dobbins                47
come                   47
true                   46
face                   46
moved                  45
sure                   44
life                   44
hands                  44
white                  43
rain                   43
talk                   42
stood                  42
shit                   42
mary                   42
lake                   42
kind                   42
high                   42
gone                   42
vietnam                41
linda                  41


Word Cloud


ThingsTheyCarried.png