
SCMHRD

Data Analytics using SAS for Economists Problem Sheet 1

April, 2013

1. (File: Latlong.txt) Import data by reading a fixed-width raw data file that contains three variables: the name of the volcano, followed by its latitude and longitude. Notice that the column names appear in the first row, and the data values are vertically aligned. Change the data type of latitude and longitude to Numeric. (Hint: Volcano names start at column 1, Latitude starts at column 15, and Longitude starts at column 24.)

2. (File: depressionIQ.tab) Import data by reading a tab-separated data file obtained from an investigation into the effect of mothers' postnatal depression on child development. Mothers who gave birth to their first-born child in a major teaching hospital in London were divided into two groups, depressed or not depressed, on the basis of their mental state three months after the birth. The fathers were also divided into two groups: fathers who had a history of psychiatric illness and fathers who did not. The third variable is the child's IQ at age 4 years. (Info: Mother's depression: 1=no, 2=yes; Father's history: 0=no previous psychiatric history, 1=has a previous psychiatric history.)

3. (File: turnover.dat) Import data by reading fixed-width raw data collected from university clerical employees about their job attitudes and turnover. A survey was conducted at a specified time. Approximately a year later the university directory was used to determine who had quit the job. Notice that the column names do not appear in the first row, and the data values are vertically aligned. Refer to the table below for the column markers, and exclude the work frustration scale and the job satisfaction scale during the data import operation. Use the advanced expression editor to compute the age of the employee at the time of joining.
Columns   Variables                                            Description
1-4       ID Number                                            Cases numbered 1 to 193
5-7       3 items for work frustration scale, 1 digit each     3-item work frustration scale; total score is the sum of the items on the scale
8-10      3 items for job satisfaction scale, 1 digit each     3-item job satisfaction scale; total score is the sum of the items on the scale
11        Intent to quit                                       1 to 6 scale, with high numbers indicating greater intention to quit the job
12        Quit                                                 1 = person quit the job within a year of the original survey; 0 = did not quit
13-14     Age                                                  In years
15-17     Tenure on the job                                    In months
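
The column layout for Problem 3 can be sketched outside SAS as plain string slicing. The Python below is only an illustration of the logic, not the Enterprise Guide steps the sheet asks for. The field names, the sample record, and the reading of "age at the time of joining" as age minus tenure converted to years are all assumptions made for this sketch.

```python
# Column markers from the table for Problem 3, as 0-based half-open slices.
# Columns 5-7 and 8-10 (the two scales) are deliberately not listed,
# which is how they get excluded from the import.
TURNOVER_COLUMNS = {
    "id": (0, 4),        # columns 1-4
    "intent": (10, 11),  # column 11
    "quit": (11, 12),    # column 12
    "age": (12, 14),     # columns 13-14, in years
    "tenure": (14, 17),  # columns 15-17, in months
}


def parse_turnover_line(line):
    """Slice one fixed-width record and add the computed joining age."""
    record = {
        name: int(line[start:end])
        for name, (start, end) in TURNOVER_COLUMNS.items()
    }
    # Assumption: age at joining = age at survey - tenure expressed in years.
    record["age_at_joining"] = record["age"] - record["tenure"] / 12
    return record


# Made-up record for illustration only; it is not taken from turnover.dat.
sample = "   1" + "123" + "456" + "3" + "0" + "36" + " 24"
parsed = parse_turnover_line(sample)
```

Because tenure is recorded in months and age in years, the division by 12 is the step that is easiest to forget in the computed column.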

4. (File: ToyRevenue.csv) Import data by reading the comma-separated values file of revenues from toy sales by a toy dealer in Mumbai. The data contains two variables, namely Quarter and Revenue. Use the advanced expression editor to create two new computed columns to hold information about Quarter and Year. (Hint: Use the SUBSTR character function.)

5. (File: automob.txt) Import data by reading a fixed-width raw data file about automobiles with their make, price, miles per gallon (mpg), repair record in 1978 (rep78), and whether the car was foreign or domestic (foreign). Notice that the column names do not appear in the first row, and the data values are vertically aligned. Set the data type to numeric for the following fields: price, mpg, rep78 and foreign. Column markers: make (1-6), price (7-11), mpg (12-14), rep78 (15-16) and foreign (17-18). Compute the cost variable as price in thousands of dollars and the mpgptd variable as miles per gallon per thousand dollars.
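
The two computed columns in Problem 5 come down to simple arithmetic. The short Python sketch below shows one reading of the definitions. The function name is hypothetical, and treating mpgptd as mpg divided by cost is an interpretation of "miles per gallon per thousand dollars", not something the sheet spells out.

```python
def compute_cost_columns(price, mpg):
    """Return (cost, mpgptd) for one car.

    cost   : price expressed in thousands of dollars
    mpgptd : miles per gallon per thousand dollars (assumed to be mpg / cost)
    """
    cost = price / 1000
    mpgptd = mpg / cost
    return cost, mpgptd


# Made-up values for illustration; they are not taken from automob.txt.
cost, mpgptd = compute_cost_columns(price=5000, mpg=25)
```

In the expression editor the same idea would be two computed columns, with the second one built on the first.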


6. (File: marks.tab) Import data by reading the tab-separated data file of marks obtained by students in a business analytics course. The faculty intends to analyse the overall performance of the students across grades A (above 75), B (above 55 and below 75) and C (below 55). (Hint: Use the compute columns feature and recode marks to generate the grade variable.)

7. Use the Latlong data table imported in Problem 1, which gives the latitude and longitude of volcanoes from around the world. Use the recode feature of the Query Builder to group the volcanoes by zone, according to the value of the Latitude variable. Refer to the following information for defining the appropriate zones.

Zone            Latitude Range
Tropical        -23.49 to 23.49
N. Temperate    23.50 to 66.49
S. Temperate    -66.49 to -23.50
Arctic          66.50 to 90.00
Antarctic       -90.00 to -66.50
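
The zone ranges for Problem 7 translate into a chain of threshold checks. The Python sketch below is only an illustration of the recode logic, not the Query Builder recode itself. The function name is hypothetical, and latitudes that fall between the two-decimal boundaries listed in the ranges (for example 23.495) are assigned by the thresholds shown, which is an assumption.

```python
def latitude_zone(latitude):
    """Map a latitude in degrees to one of the five zones in Problem 7."""
    if latitude is None:
        return None  # leave missing latitudes unclassified
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must lie between -90 and 90 degrees")
    if latitude >= 66.5:
        return "Arctic"        # 66.50 to 90.00
    if latitude >= 23.5:
        return "N. Temperate"  # 23.50 to 66.49
    if latitude > -23.5:
        return "Tropical"      # -23.49 to 23.49
    if latitude > -66.5:
        return "S. Temperate"  # -66.49 to -23.50
    return "Antarctic"         # -90.00 to -66.50
```

Checking the ranges from the top down means each test only needs one bound, which mirrors how recode ranges are usually entered one zone at a time.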

8. (File: orsales.sas7bdat) Generate the following data subsets.
   a. The Clothes & Shoes product line, sorted in descending order of Profit and ascending order of Quantity; export it as a CSV file.
   b. Product Category in Indoor Sports, Racket Sports and Swim Sports, sorted in ascending order of Product Category followed by Product Group in ascending order; export it to XLS.
   c. Quarter 1 data for product categories excluding any kind of Sports, where profit is greater than 20000; export it to an HTML file.
   d. Product Line not in Children and Clothes & Shoes, where the total retail price is greater than 15000 and less than 60000 and profit is less than 10000, sorted in ascending order of Product Group; export it as a SAS data set.

9. (Files: Tours.sas7bdat and Volcanoes.sas7bdat) The Tours data contains information about each tour: the name of the volcano, the city where the tour departs, the number of days the tour lasts, the price, and a difficulty rating for the tour. The Volcanoes data includes the country and region of the volcano, as well as the height, activity, and type of volcano. Merge the two data sets by the name of the volcano, excluding redundant information, if any, and the difficulty rating of the tour. Generate a data set that contains tours of volcanoes in Europe. Suppose you would like to include the list of all the volcanoes in Europe, not just volcanoes that currently have tours. Make the necessary changes and export to XLS.
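
The last part of Problem 9 hinges on the difference between keeping only matched rows and keeping every volcano whether or not it has a tour. The Python sketch below illustrates that difference with plain dictionaries. It is not the Query Builder join itself, and the field names and rows are made up for the example rather than taken from the two data sets.

```python
# Hypothetical tour fields to carry over; the difficulty rating is left out.
TOUR_FIELDS = ("departs", "days", "price")


def join_tours(volcanoes, tours, keep_unmatched=False):
    """Join volcano rows to tour rows on the shared 'volcano' key.

    keep_unmatched=False keeps only volcanoes that have a tour;
    keep_unmatched=True keeps every volcano, with empty tour fields if needed.
    """
    tours_by_name = {}
    for tour in tours:
        tours_by_name.setdefault(tour["volcano"], []).append(tour)

    merged = []
    for volcano in volcanoes:
        matches = tours_by_name.get(volcano["volcano"], [])
        if not matches and keep_unmatched:
            matches = [dict.fromkeys(TOUR_FIELDS)]  # all tour fields set to None
        for tour in matches:
            row = dict(volcano)
            row.update({field: tour[field] for field in TOUR_FIELDS})
            merged.append(row)
    return merged


# Made-up rows for illustration only.
volcanoes = [
    {"volcano": "Alpha", "region": "Europe"},
    {"volcano": "Beta", "region": "Europe"},
    {"volcano": "Gamma", "region": "Asia"},
]
tours = [
    {"volcano": "Alpha", "departs": "City A", "days": 3,
     "price": 500, "difficulty": "easy"},
]

european = [v for v in volcanoes if v["region"] == "Europe"]
with_tours_only = join_tours(european, tours)
all_european = join_tours(european, tours, keep_unmatched=True)
```

Here `with_tours_only` holds just the volcano that has a tour, while `all_european` also keeps the European volcano without one, which is the change the final sentence of the problem asks for.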
