
Creating Stata Datasets (Inputting your data into Stata)

All of the data used in the text are available in Stata datasets. These data can be read into Stata with the use command. When you study other datasets, they need to be converted into Stata data files (.dta files). There are several ways to do this conversion.

(1) Typing data into the Stata editor

One of the easiest methods for getting data into Stata is using the Stata data editor, which resembles an Excel spreadsheet. It is useful when your data are on paper and need to be typed in, or when your data are already typed into an Excel spreadsheet.

(2) Comma/tab separated file with variable names on line 1

Spreadsheet and database programs, such as Excel, can save and read files in tab- and comma-delimited format. These files contain one observation per line, with the values separated by tabs or commas. In addition, the first line of the file may contain variable names. Comma-delimited files often have an extension of .csv for "comma-separated values."

Converting Excel to Stata

An Excel spreadsheet usually cannot be brought into Stata without some editing and without being saved as a ".csv" file.

The following conventions must be followed for Stata to read your spreadsheet successfully:

1. The first line should contain Stata variable names (32 characters or fewer; no spaces or "special characters" except the underscore [_]; not starting with an underscore or a number). The data begin on the second line.
2. No blank rows or columns between data.
3. Missing numeric data should be coded as an empty cell, not a space, dot, or any other non-numeric data.
4. Commas in numbers or text are particularly problematic because Stata treats them as delimiters and will not read the data properly. You must remove the commas from numeric values before saving the file.
5. The file must be specifically saved as a "comma separated values" file in Excel. You can do this by going to "File", then "Save As...", then choosing "comma separated values." Simply giving it an extension of ".csv" will not work.

When you close the spreadsheet and Excel asks if you want to save the changes, say "No." This is counter-intuitive, but the changes it is asking about are the ones needed to turn the file back into a regular Excel spreadsheet.

Once you have completed the changes and you are in Stata, give the following command to read in the file:

    . insheet using filename.csv
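In Stata 13 and later, insheet is still accepted but has been superseded by import delimited. The lines below are a minimal sketch of the equivalent call, assuming the file is named filename.csv (the placeholder used above), sits in the current working directory, and has variable names on its first line. The two commands after the import are only a suggested way to check the result.

```stata
* Read a comma-separated file whose first row holds the variable names.
* filename.csv is the placeholder name from the text, not a real file.
import delimited using "filename.csv", varnames(1) clear

* Check what was read: variable names and storage types,
* then basic summary statistics for each variable.
describe
summarize
```

A numeric column that describe reports as a string type (str#) is a sign that some cell in that column held a comma, a dot, or other text.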

Here are step-by-step instructions for saving an Excel worksheet as a comma-delimited file and reading that file into Stata. Follow these instructions if you have saved your data as an Excel file (.xls) or as a comma-delimited file (.csv).

1. Start Excel and open the file by selecting the File menu, then Open.
2. Prepare the data for conversion.
   a. Make sure that missing data values are coded as blank or as numeric values (e.g., 999 or -1). Do not use periods (.) or any other character values (e.g., N/A) to represent missing data. Simply leave the cells empty or code a number.
   b. Make sure that there are no commas in the numbers. You can change this under the Format menu by selecting Cells....
   c. Make sure that variable names are included only in the first row of your spreadsheet. There should be only one row of variable names (some files produced by databases have several header rows). Variable names should be 32 characters or fewer, start with a letter, and contain no special characters, such as $ or &, except the underscore [_]. You should eliminate embedded blanks (spaces).
3. Under the File menu, select Save As. Then choose Save as type 'CSV' (comma separated values). The file will be saved with a .csv extension, for example "mydata.csv."
4. Start Stata. Then issue the following command:

    insheet using mydata.csv

(3) You can also run the following Stata do file.

    #delimit ;
    clear;
    log using convert.log, replace;
    insheet using mydata.csv;
    save mydata.dta, replace;
    log close;
    exit;

The command insheet using mydata.csv reads in the data file created in (2). The command save mydata.dta, replace saves the data as a Stata dataset with the name mydata.dta, replacing any other dataset with that name. The dataset can now be read into Stata using the command use mydata.dta. This example used mydata as the name of the dataset. You can replace this with a name of your choosing.
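If missing values were coded as numbers such as 999 or -1 in step 2a, Stata reads them as ordinary numbers, so they would enter means and regressions unless they are recoded. The sketch below shows one way to clean up after the conversion. The variable names income and sales are hypothetical and do not come from the text: income is assumed to be numeric with 999 and -1 as missing codes, and sales is assumed to have been read as a string because some cells still contained commas.

```stata
* Load the dataset saved by the do file above.
use mydata.dta, clear

* income is a hypothetical numeric variable.
* Recode the placeholder values 999 and -1 to Stata's missing value (.).
mvdecode income, mv(999 -1)

* sales is a hypothetical variable that was read as a string.
* Remove the commas and convert it to a numeric variable.
destring sales, replace ignore(",")

* Save the cleaned data under the same name.
save mydata.dta, replace
```

Recoding inside Stata rather than in Excel keeps the original spreadsheet untouched. If these commands are placed in a do file, the cleaning steps are also recorded and can be rerun.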
