INTRODUCTION TO STATA
1. DOWNLOADING DATA
  Assignment 1 preparation
  Select variables
2. STARTING A STATA SESSION
  Opening, Saving, and Closing the data file
  Keeping a log record of a Stata Session
  Using Operators
3. WORKING WITH DATA
  To see what your selected variables contain
  To look at the % distribution in categorical variables
  To look at a table combining two categorical variables
  Creating and Changing Values of Variables
  Labeling Variables and Values
  Using Functions
  Deleting Variables and Observations
  list
  Analysis of continuous (scale) variables
  Correlations
  Estimating Linear Models (OLS and 2 Stage Least Squares)
  Estimating Non-Linear Models (Logit and Probit)
  logit
  probit
4. MAKING GRAPHS
  Histogram
  Scatterplot
  Bar graphs
  Printing your graph
5. UTILITIES
  Viewing the Data
  Creating and Submitting a Do File
6. CONVERTING DATA FILES (EXCEL TO STATA)
This provides a brief introduction to using Stata for the QoG dataset analysis. Stata is available on all of
the computers in the Kennedy School's computer lab. If you have a home computer you may want to
purchase a copy of Stata from the CMO. Stata is available for Windows 98, Windows 2000, Windows ME,
Windows XP, Windows NT, Macintosh, and UNIX operating systems. The Stata User's Guide is also
available from the CMO.
The commands outlined below assume that you are using Stata for Windows. Throughout this text,
anything appearing in Bold font is a Stata command, whereas anything in red italics is a variable name
which you should change for your specific analysis. Menu commands are indicated as, e.g., File | Open, to
indicate that you first go to the File menu and then choose the Open option. The Blue Courier is the
type of output you should generate. As a shortcut, you can also just copy and paste any of the command
lines here directly into your Stata Command window and then run them.
Norris, Introduction to Stata, 2009
1. DOWNLOADING DATA:
1. Go to the QoG website: http://www.qog.pol.gu.se/ and go to Data: the QoG Data.
2. Download the correct version of the dataset. To start work, download the cross-sectional dataset for
Stata and save this somewhere clearly labeled on your memory stick, hard drive or shared server space.
3. Also save the PDF version of the FULL Codebook somewhere safe; it is very long but an invaluable
reference document.
ASSIGNMENT 1 PREPARATION
The aim is to write a professional report assessing and comparing the problems of democratic governance
reform in one world region. Pick your region:
Latin America and the Caribbean,
Africa,
Asia,
Central and Eastern Europe,
Middle East,
Western Europe.
Think about the key problems of democratic governance in the region. From your experience and your
reading, what are the priorities for agencies? Can you rank them? Focus on the most important 2-3 issues
in the first instance. Then look carefully at the shared dataset codebook. Start by selecting 3-4 indicators
which relate to the problems you have decided to focus upon. The shared class dataset provides the
following indicators, along with many others:
1. Freedom House index of political rights and civil liberties
2. Polity IV Project Democracy and Autocracy scales
3. Cheibub and Gandhi Democracy-Autocracy classification
4. Vanhanen Democracy Index
5. World Values Survey/Global Barometers Attitudinal surveys
6. Kaufmann/Kraay World Bank Institute Good governance indicators
7. Transparency International Corruption index
SELECT VARIABLES
You can use the whole dataset without doing anything further but there are a LOT of variables in the
dataset. To simplify your life and make it less confusing, for this exercise you will find it easier to identify a
subset of key variables from the What It Is list which you want to use.
You can always go back to select more variables at a later stage (none are deleted) but first work out
which variables to put into your subset, i.e. what is important as indicators of democratic governance for
your region. Note that in Stata it matters whether variable names are in capitals or not.
Select the first ones listed below and pick about 5-10 to add. Write down details in the list below so that
you have this handy. Look in the full codebook for more details about the construction and meaning of
each.
Name          Brief description                                              Type
cname         Country name                                                   Nominal
ht_region     Global region: 10 categories                                   Nominal
chga_regime   Cheibub and Gandhi: Type of democratic or autocratic regime    Nominal
fh_status                                                                    Ordinal
p_polity      Combined polity score of democracy and autocracy               Scale -10/+10
fh_cl         Freedom House civil liberties                                  Scale 7pt
fh_pr         Freedom House political rights                                 Scale 7pt
wbgi_vae      World Bank: Voice and accountability estimate                  Scale
wbgi_pse      World Bank: Political stability estimate                       Scale
wbgi_gee      World Bank: Government effectiveness                           Scale
rsf_pfi       Reporters without Borders: Press Freedom Index                 Scale 100 pts
ti_cpi        Transparency International: Corruption Perception Index        Scale 100 pts
2. STARTING A STATA SESSION
The windows can be moved about and resized to suit your preferences. If you do not see any of these on
your version, go to the Window menu and add them until the Results, Variables, Command and Review
windows described below are all visible.
a. The Results window lists the outcome of each command.
b. The Variables window, on the left, lists the names of all the variables included in the shared dataset.
c. You can enter commands in two ways. To start learning the program, you can use the drop-down
menus, similar to those common in Microsoft programs. This is useful for beginners. Once you become
more familiar with the program, you will want to type in commands directly, to save time, using the
Command window at the bottom center of the screen. A command tells Stata what to do, e.g. to open a
file, to run a regression, to calculate a mean of a variable, etc.
d. The Review window shows a list of all the commands you have already run. (Here, it shows that I have
opened a data file.) If you click on a previously run command in the Review window, it will appear in the
Command window and you can edit it or run it again.
OPENING, SAVING, AND CLOSING THE DATA FILE
You will then need to open the data file you have saved. You may need to boost the memory allocated.
set memory 80000
File | Open
File | Save As
It is also always useful to have a backup copy of your data. This way, no matter what you do to change or
recode the variables, you always have a copy of the older version. It is also useful practice to save your
data file at the end of each session under a new sequential version (e.g. STM103_2, STM103_3), so that you
have the old and newest file in case you need to revert.
File | Exit
Use this to exit Stata whenever you finish.
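The menu steps above also have typed equivalents. The lines below are a minimal sketch, assuming the
downloaded file was saved as qog_cross_section.dta in the current directory; that file name is only a
placeholder, so substitute your own.

```stata
* Open the data file (file name is an assumed placeholder)
use "qog_cross_section.dta", clear
* Save a working copy under a new sequential name
save "STM103_2.dta"
* After later changes, overwrite that working copy
save "STM103_2.dta", replace
```

Without the replace option, save refuses to overwrite an existing file, which protects your backup copy.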
KEEPING A LOG RECORD OF A STATA SESSION
File | Log
To save a file (log) of your results, you will need to create a log file. Stata gives you two choices of file
formats for your log file, .log (text file) and .smcl (formatted log file). The .smcl files will look nicer when
printed. You should never ever cut and paste your Stata output directly into your report; always simplify,
clean and transfer in a professional and clean format.
To start a log file interactively, choose File | Log | Begin, select the directory you want to save the log file
in, and give it a name (such as job1). Alternatively, you can click on the fourth icon from the left on the icon
bar, which looks like a scroll.
To stop logging, choose File | Log | Close.
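A log can also be started and closed from the Command window. This is a small sketch, assuming you
want a plain-text log named job1 in the current directory; the summarize line only stands in for
whatever analysis you run.

```stata
* Start a plain-text log called job1.log, overwriting any earlier file of that name
log using job1, text replace
* Run your analysis commands here, for example:
sum fh_cl fh_pr
* Close the log when you are done
log close
```

Leave out the text option if you prefer the formatted .smcl log.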
USING OPERATORS
Stata uses the following arithmetic operators:
+     add
-     subtract
*     multiply
/     divide
^     raise to the power
For relations, Stata uses:
==    equal
~=    not equal
>     greater than
<     less than
>=    greater than or equal to
<=    less than or equal to
Note that a single equal sign (=) is used when assigning a value to a variable:
gen wage = salary/(hours*weeks)
but a double equal sign (==) is used when asking Stata to make a comparison:
replace fulltime = 1 if hours == 40
For logical operations, Stata uses:
&     and
|     or (pipe sign; what you get when you hit Shift and the \ key)
~     not
Note that when typing variable names the capitalization matters; follow the exact names in the variable list.
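To see the relational and logical operators working together, here is a short sketch using variables from
the shared dataset. It assumes the QoG file is already open; the cut-off of 3 on fh_cl is chosen only for
illustration.

```stata
* Count countries in Sub-Saharan Africa (ht_region coded 4) with fh_cl of 3 or lower
count if ht_region==4 & fh_cl<=3
* List countries that are in region 4 or in region 2
list cname ht_region if ht_region==4 | ht_region==2
* Summarize p_polity for every region except region 4
sum p_polity if ht_region~=4
```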
3. WORKING WITH DATA
The QoG is a very long dataset so to simplify your life you may want to start by moving the key variables
you have selected to the top of the list. That way you can find them easily instead of having to hunt
through each time. To do this using the top menu, go to Data | Variable Utilities | Relocate variable.
To do this using a command, type:
order cname ht_region p_polity fh_status fh_cl fh_pr chga_regime wbgi_vae wbgi_pse wbgi_gee rsf_pfi ti_cpi
Add the names of your other selected variables to this list. Arrange them in a logical order.
This moves all your selected variables to the top of the list of variables.
When you have these in the order you need, save the file with a new name. That way you always have the
original and a backup working file.
You can also rename variables but at the start it's best to keep the names in the codebook to preserve the
record.
TO SEE WHAT YOUR SELECTED VARIABLES CONTAIN:
Let's now get some basic descriptive results. First we can look at some of the most common variables we
are using; once we have completed this exercise you should substitute the 5-10 variables you have
chosen to get a sense of what is available.
Let's start with some different types of variables. Nominal categories have no particular order, such as
North, South, East, West. Ordinal categories have a sequential order but a limited number of categories,
such as High, Medium and Low. Scale variables are ordered into a continuous series, for example level of
GDP in dollars.
First let's summarize your selected variables. Type:
sum
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
   ht_region |       192    4.526042    2.644633                    10
    p_polity |       160                14.45487       -77         10
       fh_cl |       192    3.385417    1.821166
-------------+--------------------------------------------------------
       fh_pr |       192    3.364583    2.156785
 chga_regime |       189    .3968254    .4905386
This is very useful for looking at your selected variables to see what they are like, whether nominal,
ordinal or scale (continuous). E.g. chga_regime is a binary 2-category variable. Try this for a couple of
your variables and add notes to your list of selected variables.
summarize can be abbreviated to sum. You can also look at more detail, e.g.
sum ht_region, detail
You can do the same just for one selected region:
sum p_polity if ht_region==1, detail
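If you want the same summary for every region at once rather than one region at a time, the by prefix
is one option. A minimal sketch, assuming the QoG file is open:

```stata
* The by prefix needs the data sorted on the grouping variable first
sort ht_region
* Summarize p_polity separately within each region
by ht_region: sum p_polity
```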
TO LOOK AT THE % DISTRIBUTION IN CATEGORICAL VARIABLES
For categorical variables, try the following, which generates some simple frequencies, i.e. look at the
number of countries (Freq.) and the percent column.
. tab1 ht_region
              The Region of the Country |      Freq.     Percent        Cum.
----------------------------------------+-----------------------------------
                                     1. |         27       14.06       14.06
                       2. Latin America |         20       10.42       24.48
                                     3. |         20       10.42       34.90
                  4. Sub-Saharan Africa |         48       25.00       59.90
                                     5. |         27       14.06       73.96
                           6. East Asia |          6        3.13       77.08
                     7. South-East Asia |         11        5.73       82.81
                          8. South Asia |          8        4.17       86.98
                         9. The Pacific |         12        6.25       93.23
                                    10. |         13        6.77      100.00
----------------------------------------+-----------------------------------
                                  Total |        192      100.00
TO LOOK AT A TABLE COMBINING TWO CATEGORICAL VARIABLES
tab2 ht_region chga_regime, col
CREATING AND CHANGING VALUES OF VARIABLES
Recode and generate
This helps to categorize the value of an existing scale variable. For example, take the variable called
p_polity that contains the 21-point Polity IV scale (-10 to +10) for each nation in your study. Suppose you
want the scale to be collapsed into two types of regime, democracies and autocracies. To do this, generate
a new variable called regime and recode it as follows:
gen regime = p_polity
recode regime (-10/-1=0) (0/10=1)
sum regime
When recoding your data, be careful not to overwrite your original variable. You can check what you have
done with the summarize or tab1 commands.
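A cross-tabulation of the old and new variables is another quick check. The sketch below assumes regime
was created as described above. The summary output earlier showed a minimum of -77 for p_polity, which
suggests the variable carries special codes outside the -10 to +10 range. Treating those codes as missing is
an assumption made here for illustration, so consult the codebook before doing the same.

```stata
* Cross-check the recoded variable against the original scale, showing missing values too
tab p_polity regime, missing
* ASSUMPTION: values below -10 are special codes, so set them to missing in the new variable
replace regime = . if p_polity < -10
* Confirm that only 0, 1 and missing remain
tab regime, missing
```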
Or you may want to create a new variable, in this case fh_scale, defined as fh_cl plus fh_pr:
generate fh_scale = (fh_cl + fh_pr)
You can use any formula to standardize the scale, e.g.
generate fh_scale100 = 100 - (fh_cl + fh_pr)*7.1
To generate a new binary variable for your region, e.g. Africa (coded 4):
generate africa = .
replace africa = 1 if (ht_region==4)
replace africa = 0 if (ht_region~=4)
You can abbreviate the generate command with gen. Note that Stata will tell you if any missing values were
generated by attempting to perform a calculation with missing information. For example, if you generated
a variable yrhrs from hours and one of the observations was missing information on hours, Stata would
set yrhrs equal to missing for this observation. (See further notes about missing data later in this section.)
Once a variable with a particular name has been generated you can't generate another with the same
name. Instead, you must replace the old one.
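The same kind of binary variable can be built in one line, because a comparison in Stata evaluates to 1
when true and 0 when false. In the sketch below the name africa_alt is hypothetical, chosen so that it does
not clash with africa. Note one difference from the three-line version: the three-line version codes
observations with a missing ht_region as 0 (missing is not equal to 4), while the sketch leaves them missing.

```stata
* 1 if region is coded 4, 0 otherwise; left missing where ht_region itself is missing
gen africa_alt = (ht_region==4) if ht_region < .
* Compare the two versions side by side
tab africa africa_alt, missing
```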
LABELING VARIABLES AND VALUES
Labeling variables and values helps you keep track of how you coded your variables and what they
represent. It takes just a couple of seconds to add labels, and it can save you lots of time later when you
can't remember what a code of 4 means in your GDP category variable, for example, or how the
variable demo1 differs from demo2.
To attach a label to a variable and its values:
label variable africa "World region"
label define africalabel 0 "Rest of the world" 1 "Sub-Saharan Africa"
label values africa africalabel
USING FUNCTIONS
Functions are special calculations used with other commands, such as generate or replace. Stata has the
capability to calculate many functions. Here are some examples of the most commonly used ones.
ln(x)
Calculates the natural log of x, where x may be a constant or a variable such as mad_gdppc.
In a command, you might use the log function like this:
gen logGDP2006 = ln(mad_gdppc)
DELETING VARIABLES AND OBSERVATIONS
drop
The drop command can delete either variables or observations. Deleting a variable removes an entire
variable (column) from the data set, whereas deleting an observation removes an entire observation
(row) from the data set. Be careful when doing this: the variables and observations are permanently
deleted once you save the data file! It is far better to retain the whole dataset but to filter for the selected
region.
To eliminate a variable, in this case mad_gdppc:
drop mad_gdppc
To eliminate observations, in this case Eastern Europe as a region:
drop if ht_region==1
Alternatively you could just keep a subset of data:
keep if africa==1
LIST
Prints all variables and observations to the screen. You'll probably never want to do this since your data
sets will be too large.
list
You can print a limited set of variables:
list fh_status
You could also print a limited set of observations according to another criterion, in this case africa being
equal to 1:
list cname fh_status fh_cl fh_pr if africa==1
codebook
Provides even more information (mean, standard deviation, range, percentiles, labels, number of missing
values, etc.) about a variable:
codebook fh_status fh_pr fh_cl
ANALYSIS OF CONTINUOUS (SCALE) VARIABLES
EXAMINING MEANS BY CATEGORY
In this case the category is ht_region and the mean is calculated for p_polity. You can do this with various
commands, e.g.
table ht_region, contents(mean p_polity)
tabstat p_polity fh_status fh_cl fh_pr, by(ht_region) columns(variables)
mean p_polity, over(ht_region)
CORRELATIONS
corr fh_cl fh_pr p_polity
With stars marking all coefficients significant at the .05 level or better:
pwcorr fh_cl fh_pr p_polity, star(5)
ESTIMATING LINEAR MODELS (OLS AND 2 STAGE LEAST SQUARES)
regress
Calculates an ordinary least squares (OLS) regression, in this case for a regression of the dependent
variable p_polity on the independent variables GDP (mad_gdppc) and al_ethnic. Note that the dependent
variable is the first variable listed.
regress p_polity mad_gdppc al_ethnic
If you wish to only include observations with africa equal to 1 in the regression:
regress p_polity mad_gdppc al_ethnic if africa==1
To run a regression with robust standard errors:
regress p_polity mad_gdppc al_ethnic, robust
To run two-stage least squares where GDP is endogenous and z1 is an exogenous instrumental
variable:
ivreg p_polity al_ethnic (mad_gdppc = z1)
(In Stata 10 and later the equivalent command is ivregress 2sls p_polity al_ethnic (mad_gdppc = z1).)
Note: If you run a regression containing more than 40 variables, Stata will return an error code saying:
matsize too small
To overcome this problem, reset the maximum number of variables Stata will estimate using the matsize
command; the number should be greater than or equal to the total number of variables in the regression.
set matsize 150
predict
Calculates the predicted value for each observation using the coefficients from the last regression
estimated and saves these as a variable called yhat:
predict yhat
To calculate the residual for each observation using the most recently estimated regression model and
save these as a variable called ehat:
predict ehat, residual
test
Calculates an F test of a joint hypothesis concerning the coefficients in the most recently
estimated linear regression model, in this case with the null hypothesis H0: the coefficients on
al_ethnic and mad_gdppc are both zero:
test al_ethnic mad_gdppc
ESTIMATING NON-LINEAR MODELS (LOGIT AND PROBIT)
LOGIT
Estimates a model suitable for a dichotomous dependent variable. In this case, the variable
chga_regime equals 1 for democracy and 0 for autocracy.
logit chga_regime al_ethnic mad_gdppc
If you wish to find a predicted probability for each observation based on the most recent model run and
save these as a variable called phat:
predict phat
PROBIT
Estimates a model suitable for a dichotomous dependent variable. In this case, the variable
chga_regime equals 1 for democracy and 0 for autocracy. If you wish to estimate the probability of
chga_regime conditional upon al_ethnic and mad_gdppc:
probit chga_regime al_ethnic mad_gdppc
If you wish to find a predicted probability for each observation based on the most recent model run and
save these as a variable called phat (drop phat first if it already exists from the logit model):
predict phat
4. MAKING GRAPHS
Stata 8 has a Graphics menu that lets you create graphs from a windows menu, as an alternative to using
command language. The Graphics menu is a particularly user-friendly way of creating graphs, since graphs
contain so many options for labels, axes, etc. The Graphics menu is fairly intuitive to use: simply pull
down the menu and choose the type of graph you want. The options are self-explanatory. For those
interested in using command language to create graphs, some of the basics are covered below, and you
can rely on the graphics manual for more complicated creations. Also SPSS has better and far more
flexible graphics. You may want to consider this program for this function alone. You can also cut and
paste the results of tables into Excel for flexible formats and control over elements.
HISTOGRAM
This is the default when only one variable is specified:
histogram p_polity
You can also draw a normal density over the histogram:
histogram p_polity, normal
To have Stata graph only certain observations, in this case those for which africa is 1:
histogram p_polity if africa==1, bin(30)
To add a title:
histogram p_polity, title("Polity IV Rating of Liberal Democracy in Africa")
SCATTERPLOT
This is the default if two variables are specified:
scatter wbgi_gee wbgi_pse
Conditions, axes, titles, labeling and reference lines can be specified as above. For example, with labels:
scatter wbgi_gee wbgi_pse, t1("Effectiveness by stability") mlabel(cname)
and without labels:
scatter wbgi_gee wbgi_pse, t1("Effectiveness by stability")
After performing a regression, you may want to graph predicted and actual values of the dependent
variable against the independent variable (here yhat is assumed to hold the predicted values from a
regression of wbgi_gee on wbgi_pse):
scatter yhat wbgi_gee wbgi_pse, msymbol(o p)
BAR GRAPHS
This is produced with a graph command followed by one variable. A second variable is used to define
groups. To produce a graph with bar heights representing the mean for each group:
sort wbgi_gee
graph bar (mean) wbgi_gee, over(ht_region)
Conditions, y-axis options, most titles, and horizontal reference lines can be specified as described above
with regard to histogram:
sort wbgi_gee
graph bar (mean) wbgi_gee, over(africa) t1("Political Stability in Africa") t2("Title 2") l1("Mean Stability") l2("Another Title") yline(33000)
PRINTING YOUR GRAPH
Stata allows you to print (File | Print Graph) and save (File | Save Graph) your graphs. The easiest way to
incorporate your graph into a Word document is to copy the graph to the clipboard using Edit | Copy
Graph and then paste it into your document. Remember that all graphs should have a clear headline, to
illustrate your report, with a full note below specifying the source of the data and any notes explaining
variables. All graphs should be self-contained without looking further in your report.
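Graphs can also be saved from the Command window, which is handy inside a do file. This is a minimal
sketch; the name polity_hist is hypothetical, and the export formats available to you depend on your
Stata version and operating system.

```stata
* Draw the graph you want to keep
histogram p_polity, title("Polity IV rating")
* Save it in Stata's own graph format so it can be reopened and edited later
graph save polity_hist, replace
* Export a picture file for use in a report
graph export polity_hist.png, replace
```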
5. UTILITIES
VIEWING THE DATA
Once you have opened a data set, you may wish to look at the variables and observations in spreadsheet
format. Stata provides two ways to do this, browse and edit. The browse command lets you see the
data but not make changes, whereas the edit command allows you both to browse and to make changes.
It is probably best to use browse unless you actually intend to make changes to your data manually;
otherwise you may accidentally change something and ruin your data.
To browse, enter browse into the Command window or select the Browse icon (third from the right, a
spreadsheet with a magnifying glass on it). To edit, enter edit into the Command window or select the
Edit icon (fourth from the right, a spreadsheet with no magnifying glass).
CREATING AND SUBMITTING A DO FILE
Although Stata can be run interactively by just typing one command at a time, Stata commands can also
be submitted in batches by using a do file. A do file is simply a text file which contains a series of Stata
commands. You enter the Stata commands in the same order as you would enter them interactively, and
Stata then runs these commands automatically instead of your having to type them in line by line.
For your problem sets, it is strongly recommended that you use do files. Some of the problem sets will
require many Stata commands, and it is inevitable that you will need to make changes and run these
series of commands a number of times. When you have all of your commands in a single file, it is much
easier to go back to that file and make the necessary changes than to have to retype every command.
Creating a Do file
To start creating a do file, click on the Do-file editor button (fifth from the right, looks like an envelope
with a pencil on it), choose the Do-file editor option under the Window menu, or type doedit in the
Command window. Note that since a do file is a written list of commands as entered in the Command
window, you cannot use the Stata menus within a do file. Instead you need to use the typed (Command
window) commands.
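As an illustration of what a do file might contain, here is a short sketch that strings together commands
introduced earlier. The file names assignment1 and qog_cross_section.dta are assumptions, so replace
them with your own.

```stata
* assignment1.do (hypothetical file name)
* Close any log left open by an earlier run; capture ignores the error if none is open
capture log close
* Stop Stata pausing at each screenful of output
set more off
log using assignment1, text replace

* Open the data (assumed file name) and move the key variables to the top
use "qog_cross_section.dta", clear
order cname ht_region p_polity fh_status fh_cl fh_pr chga_regime

* Basic descriptives
sum p_polity fh_cl fh_pr
tab1 ht_region

log close
```

To run the file, save it from the Do-file editor and type do assignment1 in the Command window.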
6. CONVERTING DATA FILES (EXCEL TO STATA)
The easiest way to convert data files is to use the software program StatTransfer. This program is on the
lab computers and allows you to convert your data to or from a variety of different file formats (Stata, SAS
Transport, Excel, SPSS, Quattro Pro, FoxPro, etc.).
To convert a file from Excel to Stata:
a) Click on the application StatTransfer in the Data Analysis folder.
b) Select Excel Worksheet for Input File Type.
c) Use Browse to identify the Excel file you want to convert from. (If the first row of the
worksheet contains the variable names, the program will use these as the variable names.)
d) Select Stata Version 8 as the Output File Type. (Since Stata 8.0 is a recent release, it is
possible that the version of StatTransfer you're using will not have Stata Version 8 as an option. If
this is the case, save it as a version 7 file; you should still be able to open the file in version 8.)
e) Type in the path and name of the file you wish to create.
f) Begin the conversion by clicking on Begin Transfer.
Stata also allows you to read in binary and ASCII files directly. However, in most cases it is easier
to first convert your data to a spreadsheet and then convert it to Stata using StatTransfer.
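If StatTransfer is not available and your spreadsheet can be saved as a comma-separated text file, the
insheet command is one way to read it directly. This is a sketch only: mydata.csv is a hypothetical file
name, and the first row is assumed to hold the variable names. More recent Stata releases provide
import delimited and import excel for the same purpose.

```stata
* Read a comma-separated text file (assumed name), taking variable names from the first row
insheet using "mydata.csv", clear
* Check that the variables arrived with the expected names and types
describe
* Save the result as a Stata data file
save "mydata.dta", replace
```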
SUMMARY OF SPSS AND STATA COMMANDS
SPSS Command                   Stata Command(s)
ADD FILES                      append
AGGREGATE                      collapse
ANOVA                          anova
AUTORECODE                     destring, encode
CASESTOVARS                    reshape wide
COMMENT                        *, /* */
COMPUTE                        generate, replace, egen
CORRELATIONS                   correlate, pwcorr
CROSSTABS                      tabulate, tab2
DATA LIST                      infile, infix, insheet
DELETE VARIABLES               keep, drop
DESCRIPTIVES                   summarize
DISPLAY                        describe
DOCUMENT                       notes
DO IF                          xyzcommand if
DO REPEAT                      foreach
ECHO                           display
ERASE                          erase
EXAMINE                        tabulate x, summarize(y)
EXECUTE                        no equivalent
EXPORT                         no equivalent
FACTOR                         factor
FILE LABEL                     label data
FILTER                         xyzcommand if (___)
FLIP                           xpose
FORMATS                        format
FREQUENCIES                    tabulate
GET FILE                       use
GET SAS                        fdause
GRAPH                          graph
IF                             generate __ if __
IGRAPH                         graph
INCLUDE FILE                   do ___
LIST                           list
LOOP                           forvalues
MATCH FILES                    merge
LOGISTIC REGRESSION            logistic
MEANS                          tabulate __, summarize(__)
MISSING VALUES                 none
MIXED                          xtmixed
NOMREG                         mlogit
PLUM                           ologit
PROBIT                         probit
RECODE                         recode
RECORD TYPE                    no equivalent
REGRESSION                     regress
RELIABILITY                    alpha
RENAME VARIABLES               rename
SAMPLE                         sample
SAVE                           save
SELECT IF                      keep if, drop if
SORT CASES                     sort
SPLIT FILE                     by
SUMMARIZE                      tabulate ___, summarize(___)
TEMPORARY. SELECT IF(___).     xyzcommand if (___)
T-TEST                         ttest
VALUE LABELS
VARIABLE LABELS