1. Define KDD.
2. What are the main tasks and importance of data preprocessing?
3. Define and give examples of supervised and unsupervised learning.
4. What are the main reasons to apply data transformation?
5. An analyst collects surveys from different participants about their likes and
dislikes. Subsequently, the analyst uploads the data to a database,
corrects erroneous or missing entries, and designs a recommendation
algorithm on this basis. Which of the following actions represent data
collection, data preprocessing, and data analysis? (a) Conducting
surveys and uploading to the database, (b) correcting missing entries, (c)
designing a recommendation algorithm.
6. An analyst obtains medical notes from a physician for data mining
purposes, and then transforms them into a table containing the medicines
prescribed for each patient. What is the data type of (a) the original data,
and (b) the transformed data? (c) What is the process of transforming the
data to the new format called?
7. What is the data type of each of the following kinds of attributes: (a) Age,
(b) Salary, (c) ZIP code, (d) State of residence, (e) Height, (f) Weight?
8. It is desired to partition customers into similar groups on the basis of their
demographic profile. Which data mining problem is best suited to this
task?
9. Consider the transaction database in the table below:

Determine the confidence of the rules {a} ⇒ {f} and {a, e} ⇒ {f} for the
transaction database.
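As a reminder of the definition used in the confidence question, the confidence of a rule X ⇒ Y is the number of transactions containing both X and Y divided by the number containing X. The sketch below is only an illustration: the function names and the toy transactions are hypothetical and are not the table from the exercise.

```python
def support_count(transactions, itemset):
    """Count the transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def confidence(transactions, antecedent, consequent):
    """conf(X => Y) = count(X and Y together) / count(X)."""
    both = support_count(transactions, set(antecedent) | set(consequent))
    ante = support_count(transactions, antecedent)
    if ante == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    return both / ante


# Hypothetical toy data, NOT the table from the exercise.
toy_transactions = [
    {"a", "b", "f"},
    {"a", "e", "f"},
    {"a", "e"},
    {"b", "c"},
]

if __name__ == "__main__":
    print(confidence(toy_transactions, {"a"}, {"f"}))
    print(confidence(toy_transactions, {"a", "e"}, {"f"}))
```

The same two calls can be applied to the real table once its rows are entered as sets of items.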
10. Consider the 1-dimensional data set with 10 data points {1, 2, 3, ..., 10}.
Show three iterations of the k-means algorithm when k = 2 and the
random seeds are initialized to {1, 2}.
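The iterations asked for in the k-means question can be checked with a minimal sketch of the standard assign-then-update loop. The function name `kmeans_1d` is hypothetical, and breaking distance ties in favor of the first centroid is an assumption of this sketch, not something the exercise specifies.

```python
def kmeans_1d(points, centroids, iterations):
    """Run a fixed number of k-means iterations on 1-dimensional data.

    Returns a list with one (clusters, centroids) pair per iteration.
    """
    centroids = [float(c) for c in centroids]
    history = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        # Ties go to the lower-indexed centroid (an arbitrary convention).
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
        history.append((clusters, centroids))
    return history


if __name__ == "__main__":
    data = list(range(1, 11))
    for step, (clusters, cents) in enumerate(kmeans_1d(data, [1, 2], 3), 1):
        print(f"iteration {step}: clusters={clusters} centroids={cents}")
```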
11. Compute the link prediction scores between Alice and Bob using common
neighbors, Adamic/Adar, and Jaccard from the graph below:
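The three measures can be sketched from their usual definitions: common neighbors counts the shared neighbors, Jaccard divides that count by the size of the union of the two neighborhoods, and Adamic/Adar sums 1/log(degree) over the shared neighbors. The graph below is a hypothetical stand-in, not the one from the exercise, and the natural logarithm is an assumption (some texts use another base).

```python
import math

# Hypothetical undirected graph as an adjacency mapping; NOT the exercise graph.
toy_graph = {
    "Alice": {"Carol", "Dave"},
    "Bob": {"Carol", "Dave", "Eve"},
    "Carol": {"Alice", "Bob"},
    "Dave": {"Alice", "Bob", "Eve"},
    "Eve": {"Bob", "Dave"},
}


def common_neighbors(graph, x, y):
    """Number of nodes adjacent to both x and y."""
    return len(graph[x] & graph[y])


def jaccard(graph, x, y):
    """Shared neighbors divided by all neighbors of either node."""
    union = graph[x] | graph[y]
    return len(graph[x] & graph[y]) / len(union) if union else 0.0


def adamic_adar(graph, x, y):
    """Sum of 1 / ln(degree(z)) over shared neighbors z.

    A shared neighbor always has degree >= 2, so the logarithm is positive.
    """
    return sum(1.0 / math.log(len(graph[z])) for z in graph[x] & graph[y])


if __name__ == "__main__":
    print(common_neighbors(toy_graph, "Alice", "Bob"))
    print(jaccard(toy_graph, "Alice", "Bob"))
    print(adamic_adar(toy_graph, "Alice", "Bob"))
```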

12. Consider a variation of the Girvan-Newman algorithm in which edges are
randomly disconnected from a network, as opposed to those with high
betweenness centrality. Explain the negative impact of this change on the
algorithm. Can you make minor changes to the disconnection criterion to
ameliorate this impact?
13. Consider the double star graph given in Figure 4.17 with n nodes, where
only nodes 1 and 2 are connected to all other vertices, and there are no
other links. Answer the following questions (treating n as a variable).
1. What is the degree distribution for this graph?
2. What is the clustering coefficient for vertex 1 and vertex 3?
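For a concrete value of n, answers to the double star questions can be sanity-checked numerically. The sketch below assumes that nodes 1 and 2 are also linked to each other (the figure is not reproduced here, so this reading of "connected to all other vertices" is an assumption), and the helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def double_star(n):
    """Adjacency sets for n nodes where hubs 1 and 2 link to every other node."""
    adj = {v: set() for v in range(1, n + 1)}
    for hub in (1, 2):
        for v in range(1, n + 1):
            if v != hub:
                adj[hub].add(v)
                adj[v].add(hub)
    return adj


def degree_counts(adj):
    """Map each degree value to the number of nodes having that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are themselves connected."""
    nbrs = sorted(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


if __name__ == "__main__":
    g = double_star(6)
    print(degree_counts(g))
    print(local_clustering(g, 1), local_clustering(g, 3))
```

Running it for several values of n gives data points against which a closed-form answer in terms of n can be compared.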

14. Transform the following matrix into a graph:

15. The following power law represents the degree distribution of a complex
network. You do not know what the network represents. What can you tell
just by looking at this plot?

16. Define cognitive computing.


17. What are the main differences between cognitive computing and
programmable systems?
