
An Analysis of Language

Testing Google's Ngram-tool


jwr47

The Ngram-tool
Using Google's Ngram Viewer we can analyze how the usage of words in our language has developed over time. The software draws on a base of 5 million digitized books published between 1500 and 2008. The tool is simple to use and can be found at Ngram Viewer. The easiest application is to enter one or more keywords, such as "bankster", in English. It is important to tick "case-insensitive" unless you need the frequency for the exact match of your entry. There are numerous options for special purposes; see About Ngram Viewer.
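The effect of the "case-insensitive" option can be pictured as adding up the frequencies of all spelling variants of a word. The following Python sketch illustrates the idea under stated assumptions: the function name, the unit and the sample figures are hypothetical and are not taken from the Ngram Viewer.

```python
from collections import defaultdict


def case_insensitive_totals(frequencies):
    """Add up the frequencies of all case variants of a word, year by year."""
    totals = defaultdict(float)
    for (word, year), value in frequencies.items():
        totals[(word.lower(), year)] += value
    return dict(totals)


# Hypothetical frequencies (occurrences per million words); not real Ngram data.
sample = {
    ("Bankster", 1933): 2.0,
    ("bankster", 1933): 3.0,
    ("BANKSTER", 1933): 1.0,
    ("bankster", 1950): 0.5,
}

totals = case_insensitive_totals(sample)
print(totals[("bankster", 1933)])  # 6.0
print(totals[("bankster", 1950)])  # 0.5
```

An exact-match search would only report one of the three 1933 variants, which is why the case-sensitive and case-insensitive curves can differ considerably.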

Dedicated Searches
Some tests produced extraordinary results, which I tried to trace back to their roots. Most of the strange effects could be found in the samples of the referenced manuscripts. The tool is easy to use and very fast. Initially a number of crashes occurred, but the system recovered quickly.

Bankster
The word "Bankster" was obviously invented in 1929 and remained in use up to 1950. At that time the word disappeared for some years and then reentered the scene at a relatively stable rate.

Gold
Around 1750 the capitalized version "Gold" switched to the lower-case word ("gold").

The AEIOU device of Emperor Frederick III


In German the complete word AEIOU is written in upper case. Initially the word does not seem to have been interpreted as an abbreviation, because the dotted version (A.E.I.O.U., marking it as an abbreviation) becomes more frequent than the non-dotted AEIOU only in later years.

The personal pronouns ieu, iau, iou and ih


The frequency of IEU [1] reached a maximum in the 17th century. After 1700, however, the words ieu, iau, iou and ih disappeared into oblivion.

The ieu-spelling was obviously also used for "jeu" (French: game), for example in Le royal ieu des echecs, avec son invention, science, ... (= the game of chess, 1691). After 1700 this spelling seems to have been abandoned.

[1] Ieu as a root for (1) mixing - (2) young - (3) join - and ieuo (corn) - Vergleichendes Wörterbuch der Indogermanischen Sprachen

In the 18th century the frequencies of Ic, Ih and Ick in German were at comparable levels. The word Iu, however, appears to have been in use over a rather long period of time. The analysis of samples suggests that this may have been caused by errors in digitizing the words "in" or "zu" as "iu". A typical example is "von Angeficht iu .Angellcht" for "von Angesicht zu Angesicht" (German: "face to face") [2]. These classes of errors are common in interpreting "old German script" such as Kurrent or Sütterlin. It is a systematic error in digitizing.

The personal pronouns I, Y, Ic, Q, ik, ich


Apart from the standard English ego-pronoun "I" at least two other variants exist, "Y" and "ic", which may be analyzed in Ngram.

Deuts
The frequency peak of the English variant "Deuts" around 1510 may be identified as "deuts" [3], the name of a coin [4].

Newly created words


The idea that the word "gold" was suppressed from 1933 onward [5] is to be checked against the explosive creation of new words. In a certain sense a common decrease is found from 1700 onward. Exceptional, however, is the frequency increase for the two most important words, You and I. The increase of the pronouns I, you and we even overcompensates the growing number of newly created words.

[2] Lehr- und Geistreiches Dominicale, Oder Sonntag-Predigen - page 222
[3] A Deut (Dutch duit) was used as a copper coin in the 17th and 18th centuries in Geldern, Kleve and the Netherlands
[4] Of these, sixteen pennings make one stiver or penny; and eight deuts or doits make one stiver; - The Grand Tour - volume 1 - page 43 (1749)
[5] see The Amazing Disappearance Of Gold From The American Psyche

Bankster [6]
The word obviously arose in 1929:

Fig. 1: The creation of the word "Bankster"

The word "Bankster" was obviously invented in 1929 and remained in use up to 1950. At that time the word disappeared for some years and then reentered the scene at a relatively stable rate. The initial phase 1929-1935 may be zoomed in:

Fig. 2: The creation of the word "Bankster"

We may even try to find the first publication: the word was registered in a publication of 1930 by a lawyer named Ferdinand Pecora [7].

[6] The word is bankster, derived by a marriage of banker and gangster.
[7] It was coined, as far as I can deduce, by an American immigrant, a fiery Sicilian-born lawyer by the name of Ferdinand Pecora. He was the chief counsel to the US Senate Committee on Banking set up in the early 30s to probe the origins of the Crash of 1929. - BBC NEWS | UK | Magazine | Banker + gangster = bankster

Fed, Bankster, Gold, Silver


The next analysis investigated the words Fed, bankster, gold, silver in English for the period between 1500 and 2008:

Fig. 3: The words Fed, bankster, gold, silver in English

Gold was criminalized by President Roosevelt, and the word reached a maximum in 1933 [8]. The word "Fed" had already existed for several centuries before the institution of the Fed (1913) and its creation at Jekyll Island [9]. Compared to Gold, Silver and Fed the frequency of the word "bankster" seems to be irrelevant.

[8] Source: The Amazing Disappearance Of Gold From The American Psyche
[9] 18 September 1913

Gold in German and English


In German the "Gold"-curve seems to start later (1710) than the English curve (1560).

Fig. 4: "Gold"-curve in German

Around 1750 the English word "Gold" seems to switch from uppercase ("Gold") to lowercase ("gold"):

Fig. 5: "Gold"-curve in English

The AEIOU device of emperor Frederick III (1415-1493) [10]

Apart from a lonesome spike at 1750 the frequency of the words A.E.I.O.U. or AEIOU starts at 1950.

Fig. 6: The AEIOU device of emperor Frederick III (1415-1493) in English

In German the spike of 1750 is missing. In English it seems to have been caused by dictionaries which list the five vowels, for example A copious English and Netherdutch Dictionary: comprehending .. . The samples may be found by clicking the Ngram entry 1500 - 1915 [11]. In German the complete word AEIOU is written in upper case. Initially the word does not seem to have been interpreted as an abbreviation, because the dotted version (A.E.I.O.U., marking it as an abbreviation) becomes more frequent than the non-dotted AEIOU only in later years.

Fig. 7: The AEIOU device of emperor Frederick III (1415-1493) in German

[10] The A.E.I.O.U.-device of Frederick III - Scribd
[11] Sample books: 1500 - 1915, 1916 - 1995, 1996 - 1999, 2000 - 2003, 2004 - 2008 (aeiou, English)

The personal pronouns ieu, iau, iou, ih, ic, ick, iu


In French dialects the personal pronouns ieu, iau, iou, ih may have played a significant role [12]. In order to check the statistics I started an Ngram-analysis.

Jeu
The frequency of IEU [13] reached a maximum in the 17th century. After 1700, however, the words ieu, iau, iou and ih disappeared into oblivion.

The ieu-spelling was obviously also used for "jeu" (French: game), for example in Le royal ieu des echecs, avec son invention, science, ... (= the game of chess, 1691). After 1700 this spelling seems to have been abandoned.

Fig. 8: The personal pronouns ieu, iau, iou and ih in French

In German the following variants of personal pronouns have been found:

Fig. 9: The personal pronouns ic, ih, ick and iu in German

[12] Vowel-Sequences in Archaic Manuscripts - Scribd
[13] Ieu as a root for (1) mixing - (2) young - (3) join - and ieuo (corn) - Vergleichendes Wörterbuch der Indogermanischen Sprachen

In the 18th century the frequencies of Ic, Ih and Ick in German were at comparable levels. The word Iu, however, appears to have been in use over a rather long period of time. The analysis of samples suggests that this may have been caused by errors in digitizing the words "in" or "zu" as "iu". A typical example is "von Angeficht iu .Angellcht" for "von Angesicht zu Angesicht" (German: "face to face") [14]. These classes of errors are common in interpreting "old German script" such as Kurrent or Sütterlin. It is a systematic error in digitizing.
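A systematic digitizing error of this kind can be traced by testing whether a suspicious token turns into a known word once a typical confusion is reversed. The following Python sketch is only an illustration: the confusion pairs are read off the sample quoted above ("Angeficht" for "Angesicht", "iu" for "zu" or "in"), while the function name and the tiny lexicon are hypothetical.

```python
# Confusion pairs (misread form, intended form), read off the sample above.
CONFUSIONS = [("f", "s"), ("iu", "zu"), ("iu", "in")]


def candidate_corrections(token, lexicon):
    """Return lexicon words reachable from token by reversing one confusion."""
    found = set()
    for wrong, right in CONFUSIONS:
        start = token.find(wrong)
        while start != -1:
            candidate = token[:start] + right + token[start + len(wrong):]
            if candidate in lexicon:
                found.add(candidate)
            start = token.find(wrong, start + 1)
    return sorted(found)


# Hypothetical mini-lexicon of lower-cased German words.
lexicon = {"von", "angesicht", "zu", "in"}

print(candidate_corrections("angeficht", lexicon))  # ['angesicht']
print(candidate_corrections("iu", lexicon))         # ['in', 'zu']
```

A token such as "iu" that resolves to very common words like "in" or "zu" is a strong hint that its Ngram frequency reflects misreadings rather than genuine usage.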

[14] Lehr- und Geistreiches Dominicale, Oder Sonntag-Predigen - page 222

The personal pronouns I, Y, Ic, Q, ik, ich


Apart from the standard English ego-pronoun "I" at least two other variants exist, "Y" and "ic", which may be analyzed in Ngram.

Fig. 10: The personal pronouns I, Y, Ic, Q, ik, ich in English

If we skip the standard "I" the other variants become visible (Y, Ic, Q, ik, ich):

Fig. 11: The personal pronouns Y, Ic, Q, ik, ich in English

Y
The Y-variant was used by Wycliffe (1330-1384) in a Bible translation [15].

Ic
The Ic-variant reached a maximum around 1640, during the golden age of Dutch history in the 17th century. This Ic-variant, however, has been overwhelmed by Wycliffe's Y-variant.

[15] The Wycliffe Bible - Scribd

The Hellweg [16]
In German literature the Hellweg has been documented from 1825, before Grimm's publication of German Mythology (1835).

Fig. 12: The Hellweg in German

In English the Hellweg has been referenced as early as 1760. Three early references have been identified:
A new system of geography, tr. [by P. Murdoch]. - page 384 [17], from 1762, in which the Hellweg is described as an "area north of the river Ruhr".
A New System of Geography: Part of Germany, viz. Bohemia, also in English, 1762
Een kort och ganska nyttigh historia om the 4 högste och ... - page 252, 1610

Fig. 13: The Hellweg in English

[16] The Hellweg to Holland
[17] Anton Friedrich Büsching - 1762 - complete view

German
The frequency peak of the English variant "Deuts" around 1510 may be identified as "deuts" [18], the name of a coin [19].

Fig. 14: Deutsch and Deuts in English

Deuts
In German the word "deuts" is missing.

Fig. 15: Deutsch and Deuts in German

[18] A Deut (Dutch duit) was used as a copper coin in the 17th and 18th centuries in Geldern, Kleve and the Netherlands
[19] Of these, sixteen pennings make one stiver or penny; and eight deuts or doits make one stiver; - The Grand Tour - volume 1 - page 43 (1749)

Dutch
In English the word "Dutch" (the language of the Netherlands) was in use before "Deutsch" was used in German:

Fig. 16: Dutch (the language in the Netherlands) in English

The growing number of new word creations


The idea that the word "gold" was suppressed from 1933 onward [20] is to be checked against the explosive creation of new words. One of the indicators for steadily invented new words may be found in the statistics of standard words like bread, butter, milk, beer, wine.... This list (in German), however, is blurred by too many ups and downs.

Morris Swadesh suggested ordering words by their level of importance (in German) [21]: Ich, Du, wir, dieses, jenes, wer, was, nicht, alle, viele, eins, zwei. From 1750 onward this list displays a gradual decrease of frequencies, which may indicate an increase in the total number of words:
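The reasoning behind this indicator is simple arithmetic: the Ngram curves show relative frequencies, so a word that is used equally often loses share when the total amount of text is diluted by newly created words. The following Python sketch works through one example; all figures and names are hypothetical and only serve to show the mechanism, and the assumption that the absolute usage of the core word stays constant is mine.

```python
def relative_frequency(word_count, total_words):
    """Share of one word among all words printed in a given year."""
    return word_count / total_words


# Assumption: the core word is used equally often in every year.
core_word_count = 1000

# Hypothetical totals of printed words; the growth stands for new vocabulary.
total_words_by_year = {1750: 100_000, 1850: 200_000, 1950: 400_000}

shares = {
    year: relative_frequency(core_word_count, total)
    for year, total in total_words_by_year.items()
}

for year in sorted(shares):
    print(year, shares[year])
```

Under these assumptions the share of the core word halves whenever the total doubles, so a falling curve for the Swadesh words is compatible with a growing vocabulary, although it does not prove it.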

Fig. 17: First 12 words in the Swadesh List in German

[20] see The Amazing Disappearance Of Gold From The American Psyche
[21] Swadesh List

Swadesh List in English


In English the Swadesh list starts with the most important words according to Morris Swadesh [22]: I, you, we, this, that, who, what, not, all, many, one, two

Fig. 18: The first twelve words of the Swadesh List in English

In a certain sense a common decrease is found from 1700 onward.

Exceptional, however, is the frequency increase for the two most important words, You and I.

[22] Swadesh-Liste

The exceptional increase of You (Du), I (Ich) and We (Wir)
You (Du), I (Ich) and We (Wir) in English
A frequency increase for the most important words in the Swadesh list(s), You, I and We, starts from around 1960:

Fig. 19: You, I and we in English

The trend started for "You" around 1965. Did it start in pop music? In those days the Beatles started a legendary career. The increase of the pronouns I, you and we even overcompensates the growing number of newly created words.

Fig. 20: You, I and we in English (details in the period 1900-2008)

You (Du), I (Ich) and We (Wir) in German


The same phenomenon has been identified in German, in which "I" ("Ich") increases from 1970 onward, although the frequency has not yet topped the maximum levels of 1920 and 1948.

Fig. 21: Du (you), Ich (I) and Wir (we) in German

Fig. 22: Du, Ich and Wir in German (details in the period 1900-2008)

Tuisco, Tuisto, Tuiston, Dis


Except for a maximum around 1750, the German authors did not seem to have been interested in Tuisco/Tuisto [23] up till 1800.

Fig. 23: Tuisco, Tuisto in German

Fig. 24: Tuisco, Tuisto in English

[23] documented in Tacitus' Germania

Dis (documented by Julius Caesar)


The word "dis" has been used as an alternative spelling for "this", for example in Dis ist das buch der wyszheit der alten wysen von geschlecht ... (1501)

Fig. 25: Dis, Tuisco, Tuisto in German

In English "Dis" as a negating prefix has often been misinterpreted as an isolated entry, for example in Seasonable Lecture: Or, a Most Learned Oration Dis-burthened ... This effect (an error in interpreting words) disturbs the statistics and destroys the relevance of the "Dis"-statistics.

Fig. 26: Dis in English

Contents
The Ngram-tool
Dedicated Searches
  Bankster
  Gold
  The AEIOU device of Emperor Frederick III
  The personal pronouns ieu, iau, iou and ih
  The personal pronouns I, Y, Ic, Q, ik, ich
  Deuts
Newly created words
Bankster
Fed, Bankster, Gold, Silver
Gold in German and English
  The AEIOU device of emperor Frederick III (1415-1493)
The personal pronouns ieu, iau, iou, ih, ic, ick, iu
  Jeu
The personal pronouns I, Y, Ic, Q, ik, ich
  Y
  Ic
The Hellweg
German
  Deuts
  Dutch
The growing number of new word creations
  Swadesh List in English
The exceptional increase of You (Du), I (Ich) and We (Wir)
  You (Du), I (Ich) and We (Wir) in English
  You (Du), I (Ich) and We (Wir) in German
Tuisco, Tuisto, Tuiston, Dis
  Dis (documented by Julius Caesar)
