USM’s Shriram Mantri Vidyanidhi Info Tech Academy

PG DBDA Feb 19 Data Visualization Question Bank


Contents
EDA
Numpy, Pandas and Data Visualization
Matplot and seaborn
Tableau

EDA
1. Exploratory data analysis should be used to
A. help you search for patterns in your data.
B. spot serious defects in your data that may warrant taking corrective action.
C. help determine whether assumptions of the inferential tests you intend to use may have been violated.
D. all of the above

2. A bar graph is the best graph to use when


A. your dependent variable was measured on at least a ratio scale.
B. your independent variable is categorical.
C. your independent and dependent variables are both continuous.
D. you want to show ordered trends in your data.

3. To show a functional relationship between your independent and dependent variables, the graph of choice
would be a
A. line graph. B. histogram. C. pie chart D. scatterplot.

4. The Spearman Rank Order Correlation is used when


A. your data are scaled on an ordinal scale.
B. your data are scaled on an interval scale.
C. one measure is scaled on a nominal scale and the other on an ordinal scale.
D. one measure is scaled on an interval scale and the other on an ordinal scale.

5. In which of the following situations would you not want to use a Pearson correlation coefficient?
A. when the relationship between variables is nonlinear
B. when both of your variables are measured on at least an interval scale
C. when the variances of your distributions are very similar D. all of the above
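
Questions 4 and 5 turn on the difference between a rank-based and a linear correlation coefficient. The sketch below illustrates it, assuming pandas is installed. The data is invented for illustration, and Spearman's coefficient is computed here as the Pearson correlation of the ranks rather than through a dedicated library call.

```python
import pandas as pd

# Hypothetical data: y increases monotonically with x, but not linearly.
x = pd.Series(range(1, 11), dtype="float64")
y = x ** 3

# Pearson measures the strength of a *linear* relationship.
pearson = x.corr(y)

# Spearman is the Pearson correlation of the ranks, so any strictly
# increasing relationship reaches the maximum value.
spearman = x.rank().corr(y.rank())

print(f"Pearson:  {pearson:.3f}")
print(f"Spearman: {spearman:.3f}")
```

Because the relationship is curved, the Pearson value falls short of the Spearman value even though y never decreases as x grows.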

6. A curve showing a functional relationship that starts off flat, becomes progressively steeper, and shows a single
direction of change is
A. negatively accelerated. B. monotonic
C. positively accelerated. D. both B and C

7. A ________ distribution has most scores collected about the center and is symmetrical about its midpoint.
A. functional B. normal C. monotonic D. bimodal

8. _______ are used to represent category values (e.g., gender) as values.


A. Unstacked formats B. Dummy codes C. Stacked formats D. Codes

9. A functional graph that shows a uniformly increasing or decreasing functional relationship is said to be
A. monotonic. B. negatively skewed. C. normal. D. positively skewed.

10. Suppose you have discrete group data, such as months of the year, age groups, shoe sizes, or animals. Which graph is best
to present it?
A. Boxplot B. histogram C. bar D. scatterplot

11. Which graph is better used when data needs to be classified or categorized?
A. stack bar B. Pie chart C. histogram D. None of the above

12. Which is best to explain a relationship between a target and a feature?


A. scatterplot B. bar C. Pareto chart D. all of the above

13. How can you check for outliers in a data set?
A. Using scatterplot B. Using histogram C. Using Boxplot D. all of the above

14. From which plot can you learn the distribution of the target variable?
A. histogram B. pie chart C. bar D. Pareto chart

15. True/False: The quantile-quantile (Q-Q) plot is a graphical technique for determining if two data sets come from
populations with a common distribution.
A. True B. False

16. True/False: In a boxplot, the middle line inside the box displays the mean of the distribution.
A. True B. False
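
The quantities a box plot draws can be inspected directly. The sketch below assumes matplotlib is installed and uses its `cbook.boxplot_stats` helper with the default whisker length of 1.5 times the interquartile range. The sample values are invented for illustration.

```python
from matplotlib import cbook

# Hypothetical sample with one extreme value.
data = [1, 2, 3, 4, 100]

# boxplot_stats returns one dict of summary statistics per data set.
stats = cbook.boxplot_stats(data)[0]

center_line = stats["med"]        # drawn as the line inside the box
mean = stats["mean"]              # only drawn if explicitly requested
outliers = list(stats["fliers"])  # points beyond the whiskers

print("line inside the box:", center_line)
print("mean:", mean)
print("outliers:", outliers)
```

The extreme value pulls the mean far away from the line inside the box, and it is also reported as a flier, which is how a box plot exposes outliers.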

17. True/False: For numeric vs. numeric data, a scatterplot is the best representation.
A. True B. False

18. True/False: For bivariate data, a correlogram or corr plot shows the correlation between each pair of variables.
A. True B. False

19. True/False: In a bar chart, the height of the bar corresponds to the value of each category.
A. True B. False

20. True/False: The height of the resulting stacked bar shows the combined result of the groups.
A. True B. False

Numpy, Pandas and Data Visualization


1) Pandas is designed to work with _______ data.
A. Relational B. Labeled C. Both of these D. None of these

2) DataFrame is a _______ labeled data structure.


A. 1-dimensional B. 2-dimensional C. 3-dimensional D. n-dimensional

3) Pandas makes it easy to handle missing data in floating-point as well as non-floating-point data.
A. True B. False

4) Columns can be deleted and inserted from:


A. DataFrame B. Higher dimensional objects.
C. All of the above D. None of the above

5) Shape property in pandas is used to


A. Visualise the distribution of the data
B. See the number of rows and columns of the data
C. Visualise the shape of skewness of the data
D. See the spread of data (mean, median etc.)
6) The _______ method allows us to retrieve rows and columns by position.
A. head B. getloc C. iloc D. locate
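
Questions 5 and 6 concern the `shape` attribute and positional indexing. A minimal sketch, assuming pandas is installed. The table contents and row labels are made up for illustration.

```python
import pandas as pd

# Hypothetical table with string row labels.
df = pd.DataFrame(
    {"city": ["Pune", "Mumbai", "Nagpur"], "sales": [120, 340, 90]},
    index=["a", "b", "c"],
)

rows, cols = df.shape            # (number of rows, number of columns)
by_position = df.iloc[1, 1]      # second row, second column, by position
by_label = df.loc["b", "sales"]  # the same cell, addressed by labels

print(rows, cols, by_position, by_label)
```

`iloc` ignores the labels entirely and counts from zero, whereas `loc` looks up the index and column names.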

7) Pivot table can aggregate the data and summarize it by grouping the columns
A. True B. False
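
A pivot table groups rows by one or more columns and aggregates the remaining values. The sketch below uses `pandas.pivot_table` on an invented sales table. The column names are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical sales records.
df = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "product": ["Tea", "Coffee", "Tea", "Coffee", "Tea"],
    "sales": [10, 20, 30, 40, 50],
})

# One row per region, one column per product, summing the sales in each cell.
summary = pd.pivot_table(
    df, index="region", columns="product", values="sales", aggfunc="sum"
)

print(summary)
```

The two "West"/"Tea" rows collapse into a single cell holding their sum.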

8) _______ is a convenient method for combining the columns of two potentially differently-indexed DataFrames
into a single result DataFrame.
A. Concatenate B. Merge C. Join D. Collaborate
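
Combining columns from two DataFrames whose indexes only partly overlap can be sketched with `DataFrame.join`, assuming pandas is installed. The frames and their labels are invented for illustration.

```python
import pandas as pd

# Two hypothetical frames with different, partly overlapping indexes.
left = pd.DataFrame({"sales": [100, 200]}, index=["East", "West"])
right = pd.DataFrame({"profit": [30, 45]}, index=["West", "South"])

# how="outer" keeps every index label; the default is how="left".
combined = left.join(right, how="outer")

print(combined)
```

Labels present in only one frame get a missing value in the columns that come from the other frame.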

9) Dimensions should match along the axis you are _______ on.
A. concatenating B. merging C. joining D. collaborating

10) Series can have axis labels and it can be indexed by a label
A. True B. False

11) Matplotlib is a _______ library for data visualisation.


A. 1-dimensional B. 2-dimensional C. 3-dimensional D. n-dimensional

12) Select the proper sequence to create a plot:


A. Set plot parameters, import required libraries, define the required dataset, display plot.
B. Define the required dataset, set plot parameters, import required libraries, display plot.
C. Set plot parameters, define the required dataset, import required libraries, display plot.
D. Import required libraries, define the required dataset, set plot parameters, display plot.

13) plt.subplots() acts as a more automatic axis manager.


A. True B. False

14) To avoid the overlapping of subplots we use


A. fig.tight_layout() B. sub.tight_layout() C. flt.tight_layout()

15) We cannot create a horizontal bar plot in matplotlib


A. True B. False

16) We use plot.barh() to adjust the height of the plot


A. True B. False
Explanation: We use it to create a horizontal barplot

17) We use ____ to create a horizontal bar plot.


A. axesh.bar() B. haxis.bar() C. axes.barh() D. hor.barh()
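
Questions 13 to 17 can be tied together in one short sketch, assuming matplotlib is installed. It selects the non-interactive Agg backend so that it runs without a display. The categories and values are invented.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

categories = ["Tea", "Coffee", "Espresso"]  # hypothetical data
values = [12, 30, 21]

# plt.subplots creates the figure and its axes in a single call.
fig, (ax_vertical, ax_horizontal) = plt.subplots(1, 2, figsize=(8, 3))

ax_vertical.bar(categories, values)            # bars grow upward
ax_vertical.set_title("bar")

bars = ax_horizontal.barh(categories, values)  # bars grow to the right
ax_horizontal.set_title("barh")

fig.tight_layout()  # adjust spacing so the subplots do not overlap
```

In the horizontal version the data value becomes the width of each bar rather than its height. Calling `fig.savefig(...)` or `plt.show()` afterwards would render the result.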

18) _______ is a visualisation library that provides a high-level interface to draw attractive statistical graphics.
A. Scrapy B. Seaborn C. Airborn D. Statistica

Matplot and seaborn


1. The plot method on Series and DataFrame is just a simple wrapper around :
A. gplt.plot() B. plt.plot() C. plt.plotgraph() D. none of the Mentioned
Explanation: If the index consists of dates, it calls gcf().autofmt_xdate() to try to format the x-axis nicely.

2. Point out the correct combination with regards to kind keyword for graph plotting:
A. ‘hist’ for histogram B. ‘box’ for boxplot
C. ‘area’ for area plots D. all of the Mentioned
Explanation: The kind keyword argument of plot() accepts a handful of values for plots other than the default Line
plot.

3. Which of the following values is accepted by the kind keyword for a bar plot?


A. barh B. kde C. hexbin D. none of the Mentioned
Explanation: bar can also be used for barplot.

4. You can create a scatter plot matrix using the __________ method in pandas.tools.plotting (renamed pandas.plotting in later pandas releases).
A. sca_matrix B. scatter_matrix C. DataFrame.plot D. all of the Mentioned
Explanation: You can create density plots using the Series/DataFrame.plot.
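
A scatter plot matrix can be sketched as follows, assuming a current pandas release (where the function is imported from `pandas.plotting`) together with NumPy and matplotlib. The random data and column names are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical data: 50 observations of three numeric variables.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])

# One panel per pair of columns; histograms on the diagonal.
axes = scatter_matrix(df, diagonal="hist")

print(axes.shape)
```

The function returns a grid of axes with one row and one column per variable, so each off-diagonal panel shows one pairwise scatter plot.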

5. Point out the wrong combination with regards to kind keyword for graph plotting:
A. ‘scatter’ for scatter plots B. ‘kde’ for hexagonal bin plots
C. ‘pie’ for pie plots D. none of the Mentioned
Explanation: kde is used for density plots.

6. Which of the following plots are used to check if a data set or time series is random?
A. Lag B. Random C. Lead D. None of the Mentioned
Explanation: Random data should not exhibit any structure in the lag plot.

7. Plots may also be adorned with error bars or tables.


A. True B. False
Explanation: There are several plotting functions in pandas.tools.plotting.

8. Which of the following plots are often used for checking randomness in time series?
A. Autocausation B. Autorank C. Autocorrelation D. None of the Mentioned
Explanation: If a time series is random, such autocorrelations should be near zero for any and all time-lag
separations.
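
The idea behind the autocorrelation check can be sketched numerically with `Series.autocorr`, assuming pandas and NumPy are installed. The two series are synthetic: one is seeded random noise, the other a steadily increasing sequence.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
noise = pd.Series(rng.normal(size=500))  # no structure over time
trend = pd.Series(np.arange(500.0))      # strongly structured

# Correlation of each series with a copy of itself shifted by one step.
noise_r1 = noise.autocorr(lag=1)
trend_r1 = trend.autocorr(lag=1)

print(f"lag-1 autocorrelation, noise: {noise_r1:.3f}")
print(f"lag-1 autocorrelation, trend: {trend_r1:.3f}")
```

The noise series gives a value close to zero, while the trending series gives a value close to one. An autocorrelation plot repeats this calculation for many lags.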

9. __________ plots are used to visually assess the uncertainty of a statistic.


A. Lag B. RadViz C. Bootstrap D. None of the Mentioned
Explanation: The resulting plots and histograms constitute the bootstrap plot.
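
The resampling idea behind a bootstrap plot can be sketched with NumPy alone. This is an illustration of the general technique under assumed settings (synthetic data, 1000 resamples, the mean as the statistic), not a description of how the pandas plotting function is implemented.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.exponential(scale=2.0, size=200)  # hypothetical observed data

# Draw resamples with replacement and record the statistic for each one.
n_resamples = 1000
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(n_resamples)
])

# The spread of the resampled statistic expresses its uncertainty.
low, high = np.percentile(boot_means, [2.5, 97.5])

print(f"sample mean: {sample.mean():.3f}")
print(f"95% percentile interval: ({low:.3f}, {high:.3f})")
```

A histogram of `boot_means` is the kind of picture a bootstrap plot shows for each statistic.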

10. Andrews curves allow one to plot multivariate data.


A. True B. False
Explanation: Curves belonging to samples of the same class will usually be closer together and form larger
structures.

Tableau
1. Tableau treats date
A. Specially by defining hierarchy for user
B. Treats date as any other field
C. Converts date to number
D. None of the above

2. Tableau has the following data types


A. Number, date, geo location
B. Number, date, datetime, geolocation
C. Number decimal, number whole, date & time, date, string, Boolean, Default
D. None of the above

3. Tableau allows the user


A. Create calculated fields
B. Create calculated fields and use them on shelves
C. None of the above

4. Tableau allows
A. Using data from disparate sources using blending as well as joining
B. Using data from disparate sources using only blending
C. Does not work with disparate sources
D. None of the above

5. Tableau can connect to


A. Databases, warehouses, cubes
B. Only databases
C. Databases, warehouses, cubes, flat files, excel, Salesforce
D. None of the above

6. With Tableau, the user can show


A. Standard deviation, running totals, percentage of totals and forecast
B. Forecast, running totals
C. Running totals, standard deviation
D. None of the above

7. Tableau has the following modes


A. Live connection and extract B. Extract only C. Live only D. None of the above

8. Tableau allows you to
A. Store metadata
B. Store metadata, rename fields, and pivot the data
C. Does not allow storing metadata
D. None of the above

9. Tableau has
A. Stories which allow better communication
B. Dashboards and stories and together they can be used for communication
C. No good features to communicate data
D. None of the above

10. Tableau allows you to animate


A. Using filter shelf B. Using row and column shelf
C. Using pages shelf D. Using any shelf

11. While blending the data


A. We need to have a common field and connection should be established for it to work
B. We need not have common field and tableau will handle it
C. Tableau will figure out blending without any action from us
D. None of the above

12. A Reference Band cannot be based on two fixed points.


A. False B. True

13. A Reference Distribution plot cannot be along a continuous axis.


A. True B. False

14. Which of the following is not a Trend Line model


A. Linear Trend Line B. Exponential Trend Line
C. Binomial Trend Line D. Logarithmic Trend Line
Binomial Trend Line is not a Trend Line model.

15. The image below uses which map visualization?


A. Filled maps B. Layered maps C. WMS server maps D. Symbol maps

16. Is it possible to deploy a URL action on a dashboard object to open a Web Page within a dashboard rather than
opening the system’s web browser?
A. True, with the use of Tableau Server
B. True, with the use of a Web Page object
C. False, not possible
D. True, requires a plug-in
True, with the use of a Web Page object it is possible to deploy a URL action on a dashboard object to open a web
page within a dashboard rather than opening the system’s web browser.

17. The Highlighting action can be disabled for the entire workbook.
A. True B. False
From the toolbar the Highlighting action can be disabled for the entire workbook.

18. A sheet cannot be used within a story directly. Either sheets should be used within a dashboard, or a
dashboard should be used within a story.
A. True B. False
A sheet can be used within a story directly.

19. How do you identify a continuous field in Tableau?


A. It is identified by a blue pill in the visualization.
B. It is identified by a green pill in a visualization.
C. It is preceded by a # symbol in the data window.
D. When added to the visualization, it produces distinct values.
It is identified by a green pill in a visualization.

20. Is it possible to use measures in the same view multiple times (e.g. SUM of the measure and AVG of the
measure)?
A. No B. Yes
Yes, measures can be used multiple times in the same view.

21. Sets can be created on Measures.


A. False B. True
Sets can be created on dimensions.

22. For creating variable size bins we use _____________.


A. Sets B. Groups C. Calculated fields D. Table Calculations
For creating variable size bins we use Calculated Fields.

23. A good reason to use a bullet graph.


A. Analyzing the trend for a time period
B. Comparing the actual against the target sales
C. Adding data to bins and calculating count measure
D. Displaying the sales growth for a particular year

24. The line shown in the image below is a Reference Line. True or False?
A. true B. false
The line shown in the image is a Trend Line.

25. Disaggregation returns all records in the underlying data source.


A. True B. False
Disaggregation returns all records in the underlying data sources.

26. By definition, Tableau displays measures over time as a ____________.


A. Bar B. Line C. Histogram D. Scatter Plots
By definition, Tableau displays measures over time as a Line.

27. The icon associated with the field that has been grouped is a ______________.
A. Paper Clip B. Set C. Hash D. Equal To
The icon associated with the field that has been grouped is a paper clip.

28. In the West region, which state’s sales fall within the Reference Band starting from average sales of that region
till median of sales? (Perform the below questions in Tableau 9.0 and connect to the Saved Sample – Superstore
dataset)
A. California B. Colorado C. Montana D. New Mexico

29. Create a simple bar chart with Region and Total Expenses from the Sample- Superstore dataset and Sample -
Coffee Chain dataset, respectively. (Establish the link on State). Identify the budgeted profit for the region having
the 2nd highest total expenditure. (Connect to the Sample- Coffee Chain access file using the CoffeeChain Query
table)
A. 84850 B. 87680 C. 80231 D. 84823

30. In 2012, what is the percent contribution of sales for Decaf in the East market? (Perform all the questions in
Tableau 9.0 and connect to the Saved Sample-Superstore dataset)
A. 48.942% B. 54.765% C. 51.231% D. 55.875%
48.942% is the percent contribution of sales of Decaf in 2012 in the East market.

31. In 2013, what is the percentage of total profit for Caffe Mocha falling under Major Market (Market
Size)?(Perform all the questions in Tableau 9.0 and connect to the Saved Sample-Superstore dataset)
A. 60% B. 45% C. 58% D. 55%
In 2013, the percentage of total profit for Caffe Mocha falling under Major Market is 55%.

32. Create a heat map for Product Type, State, and Profit. Which state in the East market has the lowest profit for
Espresso?(Use the Sample- Coffee Chain dataset for the following questions)
A. Florida B. Connecticut C. New York D. New Hampshire
New Hampshire has the lowest profit for Espresso, in the East market.

33. In 2012, what is the difference in budget profit, in Q3 from the previous quarter for major market (Market
Size)? (Use the Sample- Coffee Chain dataset for the following questions)
A. 630 B. -287 C. 667 D. 654

34. In which month did the running sales cross $30,000 for Decaf in Colorado and Florida? (Use the Sample- Coffee
Chain dataset for the following questions)
A. November 2013 B. September 2013 C. May 2013 D. December 2013

35. Create a bar chart with Product Type, Product, and Profit. Identify which of the following
products fall below the overall 99.9% Confidence Interval Distribution (Table across)? (Use the Sample- Coffee
Chain dataset for the following questions)
A. Decaf Espresso B. Green Tea C. Caffe Latte D. Regular Espresso

36. Using quartiles, identify which of the following Espresso products has the highest distribution of sales? (Use the
Sample- Coffee Chain dataset for the following questions)
A. Decaf Espresso B. Caffe Mocha C. Caffe Latte D. Regular Espresso


Regular Espresso has the highest distribution of sales in Espresso product.

37. In 2013, identify the state with the highest profit in the West market? (Use the Sample- Coffee Chain dataset
for the following questions)
A. Utah B. Nevada C. California D. Washington

38. Create a scatter plot with State, Sales, and Profit. Identify the Trend Line with an ‘R-Squared’ value between 0.7 and
0.8? (Use the Sample- Coffee Chain dataset for the following questions)
A. Linear Trend Line B. Logarithmic Trend Line
C. Exponential Trend Line D. Polynomial Trend Line with Degree 2
The Trend Line with an ‘R-Squared’ value between 0.7 and 0.8 is a Polynomial Trend Line with Degree 2.

39. Identify the total expenses to sales ratio of the state with the lowest profit. (Use the Sample- Coffee Chain
dataset for the following questions)
A. 47.31% B. 45.58% C. 41.98% D. 40.78%

40. Create a Combined Field with Product and State. Identify the highest selling product and its state. (Use the
Sample- Coffee Chain dataset for the following questions)
A. Colombian, California B. Colombian, Texas
C. Lemon, Nevada D. Darjeeling, Iowa

41. What is the contribution of tea to the overall Profit in 2012? (Use the Sample- Coffee Chain dataset for the
following questions)
A. 24.323% B. 22.664% C. 20.416% D. 21.765%
Tableau Multiple Choice Questions For Experienced

42. What is the average profit ratio for all the products starting with C? (Use the Sample- Coffee Chain dataset for
the following questions)
A. 30% B. 25% C. 33% D. 20%

43. What is the distinct count of area codes for the state with the lowest budget margin in small markets? (Use the
Sample- Coffee Chain dataset for the following questions)
A. 3 B. 1 C. 2 D. 6

44. Which product type does not have any of its product within the Top 5 Products by sales? (Use the Sample-
Coffee Chain dataset for the following questions)
A. Tea B. Espresso C. Coffee D. Herbal Tea

45. In the Central region, the Top 5 Products by sales contributed _____ % of the total expenditure. (Use the
Sample- Coffee Chain dataset for the following questions)
A. 48.54% B. 51.66% C. 69.21% D. 54.02%
In the Central region, the Top 5 Products by sales contributed 54.02 % of the total expenditure.

45. The aggregation function attr() returns a * when __________________.


A. There is a single value for all rows in the group.
B. It is a null value.
C. There is more than one value across the rows in the group.
D. The data is not present at the desired level.

46. Trend Lines can only be used with numeric or date fields.
A. True B. False
47. The best trend model for your view would be the one with?
A. R-Squared value closest to 1 B. P-Value more than 1
C. R-Squared value greater than 1 D. R-Squared value equal to P-Value

48. A Reference Line cannot be added from the Analytics pane.


A. True B. False

49. It is possible to change the geographic roles of a dimension.


A. True B. False

50. Groups can be used in a calculated field.


A. TRUE B. FALSE

51. The highlight action in a dashboard is similar to filtering action in a worksheet.


A. TRUE B. FALSE

52. The default join type in case of Blended data sources is?
A. Cross Join B. Inner Join C. Left outer Join D. Full outer Join

53. Bins cannot be created on dimensions.


A. TRUE B. FALSE

54. Union All is more efficient than Union.


A. True B. False

55. Using GROUP BY ............ has the effect of removing duplicates from the data.
A. without order by B. with aggregates C. with order by D. without aggregates

56. Which type of Inner Join restricts fetching of redundant data?


A. Cross B. Outer C. Natural D. Equi

57. The JOIN which returns all the records from the right table in conjunction with the matching records from the
left table and if there are no matching values in the left table, it returns NULL. Which is this JOIN?
A. CROSS JOIN B. LEFT Join C. Full OUTER JOIN D. Right JOIN
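
The behaviour described in question 57 can be reproduced in pandas, used here as a stand-in for a SQL engine. `pandas.merge` with `how="right"` is an analogue chosen for illustration, and the two tables are invented.

```python
import pandas as pd

# Hypothetical tables sharing an "id" key.
left = pd.DataFrame({"id": [1, 2], "name": ["Asha", "Ravi"]})
right = pd.DataFrame({"id": [2, 3], "dept": ["Sales", "HR"]})

# Keep every row of the right table; fill unmatched left columns with NaN.
result = pd.merge(left, right, on="id", how="right")

print(result)
```

The row with id 3 exists only in the right table, so its `name` is missing in the result, while id 1 (present only on the left) is dropped.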

58. GROUP BY ALL generates all possible groups - even those that do not meet the query's search criteria.
A. True B. False

59. You can combine tables in a partitioned view by using a Union All statement that causes the data from the
separate tables to appear as if they were one table. These tables in a SELECT statement of the view must adhere
to some restrictions like: A table can appear . . . . . . as a part of Union All statement.
A. as many times as possible B. only twice C. only once D. only thrice

60. ORDER BY can only be used together with a WHERE clause. Correct?


A. True B. False

61. What are the types of Inner Join?


A. Out, In, Equi B. Equi, Natural, Cross C. Left, In, Cross D. None of these

62. Having clause is processed after the GROUP BY clause and any aggregate functions.
A. True B. False

63. All aggregate functions ignore NULLs except for ............


A. Count (*) B. Average() C. None of these D. Distinct
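
How aggregates treat NULLs can be checked with a small experiment. The sketch uses Python's built-in `sqlite3` module as a stand-in for SQL Server, which is an assumption made so that the example is self-contained. The table `t` and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v INTEGER)")
# Three rows, one of which holds NULL.
conn.executemany("INSERT INTO t (v) VALUES (?)", [(10,), (20,), (None,)])

count_all, count_v, avg_v, sum_v = conn.execute(
    "SELECT COUNT(*), COUNT(v), AVG(v), SUM(v) FROM t"
).fetchone()

print(count_all, count_v, avg_v, sum_v)
conn.close()
```

`COUNT(*)` counts rows, so the NULL row is included, whereas `COUNT(v)`, `AVG(v)` and `SUM(v)` only see the two non-NULL values.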

64. Related to UNION ALL which one do you think is correct syntax: A, B or both
A. Select * from B
Union all
Select * from C
Order by ID desc
B. Select * from B
Order by ID desc
Union all
Select * from C
Order by ID desc
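
The two forms in question 64 can be tried out directly. The sketch runs them against SQLite through Python's built-in `sqlite3` module. SQLite is only a stand-in for SQL Server here, and the contents of tables B and C are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b (id INTEGER);
    CREATE TABLE c (id INTEGER);
    INSERT INTO b VALUES (1), (3);
    INSERT INTO c VALUES (2), (3);
""")

# Form A: a single ORDER BY at the very end of the compound query.
form_a = "SELECT * FROM b UNION ALL SELECT * FROM c ORDER BY id DESC"
# Form B: an ORDER BY inside the first branch as well.
form_b = ("SELECT * FROM b ORDER BY id DESC "
          "UNION ALL SELECT * FROM c ORDER BY id DESC")

rows_a = [row[0] for row in conn.execute(form_a)]

try:
    conn.execute(form_b)
    form_b_accepted = True
except sqlite3.OperationalError:
    form_b_accepted = False

print(rows_a, form_b_accepted)
conn.close()
```

Form A sorts the combined rows and keeps the duplicate value, since UNION ALL does not remove duplicates. SQLite rejects form B because the ORDER BY precedes UNION ALL.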

65. Which one is correct syntax for Where clause in SQL server?
A. SELECT WHERE "Condition" Col1, Col2 FROM "Table" ;
B. SELECT Col1, Col2 FROM "Table" WHERE "condition";
C. SELECT "Condition" Col1, Col2 FROM "Table" WHERE;
D. None of these

66. Are FULL Outer Join and Cross Join the same?


A. True B. False

67. What is true about NOT INNER JOIN?
A. It is a JOIN which restricts INNER JOIN to work.
B. None of the above.
C. It is one of the types of JOINs in SQL Server.
D. When a Full Outer Join is used along with WHERE, it returns all the rows that were not present in the
Inner Join.

68. If you SELECT attributes and use an aggregate function, you must GROUP BY the non-aggregate attributes.
A. True B. False

69. Which type of Inner Join fetches result with redundant data?
A. Left Outer B. Equi C. Cross D. IN

70. What will be the result of running the below UNION ALL query:
A. Select Null B. Select Null C. Union all

71. The JOIN which does Cartesian Product is called?


A. Cross Join B. Left Join C. Left Outer Join D. Right Outer Join
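
A Cartesian product pairs every row of one table with every row of the other. The sketch shows this with Python's built-in `sqlite3` module, used as a stand-in for SQL Server. The `sizes` and `colours` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sizes (size TEXT);
    CREATE TABLE colours (colour TEXT);
    INSERT INTO sizes VALUES ('S'), ('M'), ('L');
    INSERT INTO colours VALUES ('red'), ('blue');
""")

# No join condition: every size is combined with every colour.
pairs = conn.execute(
    "SELECT size, colour FROM sizes CROSS JOIN colours"
).fetchall()

print(len(pairs), pairs)
conn.close()
```

With three rows on one side and two on the other, the result has 3 x 2 = 6 rows.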

72. You want all dates when any employee was hired. Multiple employees were hired on the same date and you
want to see the date only once.
Query - 1
Select distinct hiredate
From hr.employee
Order by hiredate;

Query - 2
Select hiredate
From hr.employees
Group by hiredate
Order by hiredate;
Which of the above queries is valid?
A. Both B. Query – 2 C. Query – 1
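
Whether SELECT DISTINCT and GROUP BY without aggregates give the same de-duplicated result can be checked with a small experiment. The sketch uses Python's built-in `sqlite3` module as a stand-in for the database, with a hypothetical `employees` table whose name and rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, hiredate TEXT);
    INSERT INTO employees VALUES
        ('A', '2019-02-01'), ('B', '2019-02-01'), ('C', '2019-03-15');
""")

query_distinct = "SELECT DISTINCT hiredate FROM employees ORDER BY hiredate"
query_grouped = ("SELECT hiredate FROM employees "
                 "GROUP BY hiredate ORDER BY hiredate")

dates_distinct = conn.execute(query_distinct).fetchall()
dates_grouped = conn.execute(query_grouped).fetchall()

print(dates_distinct)
print(dates_grouped)
conn.close()
```

Both queries list each hire date once, even though two employees share the first date.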

73. What is the purpose of Order By Clause in SQL server?


A. It is used to sort the result.
B. It can’t be used in SQL Server
C. It is used to change sequence order of columns
D. None of these

74. For the purposes of ............, null values are considered equal to other nulls and are grouped together into a
single result row.
A. Having B. Group By C. None of these D. Both Having & Group By

75. What are also called Self Joins?


A. OUTER Joins B. INNER joins

76. Which join is used for Joining the table to itself?


A. Self B. In C. Cross D. Natural

77. Is there any limit for number of predicates/conditions to be added in a Where clause?
A. False B. True

78. The query below is run in SQL Server 2012. Is it valid or invalid?
Select count(*) as X
from Table_Name
Group by ()
A. Valid B. Invalid

79. In the context of MS SQL SERVER, with the exception of ............ column(s), any column can participate in the
GROUP BY clause.
A. ntext B. bit C. text D. All of these E. image

80. What is the other name of INNER JOIN?


A. Equi Join B. In Join C. Out Join D. All of these

81. What can be the condition in where clause in a SQL query?


A. None of these
B. Condition that is to be met for the rows to be returned from result.
C. Text condition only
D. Boolean Condition only

82. The sequence of the columns in a GROUP BY clause has no effect in the ordering of the output.
A. True B. False
