
USC 551

QUIZ 1

Question 1
Explain the difference between structured data and unstructured data. Give an example of
each type of data.
STRUCTURED DATA | UNSTRUCTURED DATA
Can be displayed in rows, columns and relational databases | Cannot be displayed in rows, columns and relational databases
Requires less storage | Requires more storage
Easier to manage and protect with legacy solutions | More difficult to manage and protect with legacy solutions

Example of structured data: relational data (tables, transactions, legacy data)

Example of unstructured data: text data (web pages)
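
As a small illustration of the distinction, the Python sketch below holds similar information in both forms. The records, field names and the sample sentence are invented for illustration and are not part of the quiz.

```python
# Structured data: every record has the same named fields,
# so it fits naturally into rows and columns.
transactions = [
    {"id": 1, "customer": "Aina", "amount": 25.50},
    {"id": 2, "customer": "Ben", "amount": 12.00},
]

# Unstructured data: free text with no fixed fields (hypothetical web comment).
web_text = "Aina wrote that she paid about 25 for her order and Ben said his was cheaper."


def total_amount(rows):
    """Sum the 'amount' column; simple because the field is known in advance."""
    return sum(row["amount"] for row in rows)


def word_count(text):
    """Free text only supports generic operations until it is parsed further."""
    return len(text.split())


print(total_amount(transactions))
print(word_count(web_text))
```

The structured records can be queried by field name directly, while the text would first need parsing before a question such as "how much did Aina pay?" could be answered.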
[6 marks]
Question 2
Describe the following terms and give an example of a situation where each type of data
structure can be used.
a. Matrix
b. Data frame

a. Matrix - A matrix is composed of numbers arranged in two dimensions: rows and columns. It is
similar to a data table composed only of numbers.
Situation: The Data Matrix, a two-dimensional barcode whose cells are laid out in rows and
columns, is usually used to mark small items because a 2-3 mm² area can store up to 50
characters of information and the code can be read at only a 20% contrast ratio. A member of
a matrix is identified by two indices, one for the row and another for the column. Two matrices
are equal if they are of the same size and each corresponding member is equal.
b. Data frame - Generated by combining multiple vectors so that each vector becomes a
separate column. The concept of a data frame comes from the world of statistical software
used in empirical research.
Situation: A data frame can be used when data needs to be organized and stored in table form.
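
The two structures described above can be sketched in plain Python. This is only an illustrative sketch: the helper names (shape, matrices_equal, make_frame, row) and the sample values are assumptions made here, and the dictionary of columns merely imitates what a real data frame in statistical software provides.

```python
# Matrix: a two-dimensional arrangement of numbers (rows and columns).
# Rows are assumed to all have the same length.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[1, 2, 3],
     [4, 5, 6]]


def shape(m):
    """Return (number of rows, number of columns)."""
    return (len(m), len(m[0]) if m else 0)


def matrices_equal(x, y):
    """Equal if both have the same size and every corresponding member is equal."""
    return shape(x) == shape(y) and all(
        a == b for row_x, row_y in zip(x, y) for a, b in zip(row_x, row_y)
    )


# A member is addressed by two indices: row first, then column (0-based here).
member = A[1][2]

# Data frame: several equal-length vectors, each becoming one named column.
names = ["Aina", "Ben", "Chong"]
ages = [21, 23, 22]
scores = [88.5, 79.0, 91.0]


def make_frame(**columns):
    """Combine vectors into a column-oriented table; all must have equal length."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    return dict(columns)


def row(frame, i):
    """Return row i as a mapping from column name to value."""
    return {name: column[i] for name, column in frame.items()}


students = make_frame(name=names, age=ages, score=scores)
print(member)
print(matrices_equal(A, B))
print(row(students, 0))
```

Unlike the matrix, the data frame mixes column types (text and numbers), which is why it suits tabular records while the matrix suits purely numeric data.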

[8 marks]
Question 3
Describe the function of data preprocessing in the data management process.
Data preprocessing is a technique used to improve the quality of the data before mining is
applied, so that the data leads to high-quality mining results. Data preprocessing techniques
can substantially improve the overall quality of the patterns mined and/or reduce the time
required for the actual mining. Data preprocessing includes data cleaning, data integration,
data transformation, and data reduction. Therefore, data preprocessing is needed to improve
the quality of the data and, consequently, of the mining results.
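
The four preprocessing steps named above can be sketched as a small pipeline. This is a minimal sketch under stated assumptions: the records, the field names and the specific choices (filling missing ages with the mean, min-max scaling, dropping unused attributes) are examples picked here, not methods prescribed by the quiz.

```python
# Hypothetical raw records from two sources; None marks a missing value.
sales_a = [
    {"id": 1, "age": 25, "spend": 120.0},
    {"id": 2, "age": None, "spend": 80.0},
    {"id": 2, "age": None, "spend": 80.0},  # duplicate record
]
sales_b = [{"id": 3, "age": 35, "spend": 200.0}]


def integrate(*sources):
    """Data integration: combine records from several sources (copies each record)."""
    return [dict(r) for src in sources for r in src]


def clean(rows):
    """Data cleaning: drop duplicate ids and fill missing ages with the mean age."""
    seen, unique = set(), []
    for r in rows:
        if r["id"] not in seen:
            seen.add(r["id"])
            unique.append(r)
    known = [r["age"] for r in unique if r["age"] is not None]
    mean_age = sum(known) / len(known)
    for r in unique:
        if r["age"] is None:
            r["age"] = mean_age
    return unique


def transform(rows):
    """Data transformation: min-max scale 'spend' to the range [0, 1]."""
    lo = min(r["spend"] for r in rows)
    hi = max(r["spend"] for r in rows)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    return [{**r, "spend": (r["spend"] - lo) / span} for r in rows]


def reduce_columns(rows, keep):
    """Data reduction: keep only the attributes needed for mining."""
    return [{k: r[k] for k in keep} for r in rows]


prepared = reduce_columns(
    transform(clean(integrate(sales_a, sales_b))), ["age", "spend"]
)
print(prepared)
```

Each function returns new records rather than editing the sources, so the raw data stays available if a preprocessing choice needs to be revisited.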
[6 marks]

Question 4
Explain why data visualization is considered an important activity in information
representation.
Data visualization is the representation of data or information in a graph, chart, or other
visual format. It communicates relationships of the data with images. This is important
because it allows trends and patterns to be more easily seen. With the rise of big data upon us,
we need to be able to interpret increasingly larger batches of data. Machine learning makes it
easier to conduct analyses such as predictive analysis, which can then serve as helpful
visualizations to present. However, data visualization is not only important for data scientists
and data analysts; it is necessary to understand data visualization in any career. Whether you work
in finance, marketing, tech, design, or anything else, you need to visualize data. That fact
showcases the importance of data visualization.
[6 marks]

Question 5
Briefly describe the activities involved in the data mining process.
The activities involved in the data mining process are business understanding, data
understanding, data preparation, modelling, evaluation and deployment. The last three
activities, which correspond to data mining, pattern evaluation and knowledge representation,
are integrated into one process called data mining.
[4 marks]
