
Multiple Choice Single Answer

Many methods for data smoothing are also methods for data reduction involving :-
Discretization
Dimensionality reduction reduces the data set size by removing :-
Irrelevant attributes
The effect of one attribute value on a given class being independent of the values of other attributes is called
Class conditional independence
Bayes Theorem is :-
P(H|X) = P(X|H) P(H) / P(X)
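As a quick numerical check of Bayes' theorem, here is a minimal sketch in Python. The function name `posterior` and the probabilities used are illustrative assumptions, not part of the original quiz.

```python
def posterior(likelihood, prior, evidence):
    """Bayes' theorem: P(H|X) = P(X|H) * P(H) / P(X)."""
    if evidence <= 0:
        raise ValueError("P(X) must be positive")
    return likelihood * prior / evidence


# Hypothetical numbers chosen only for illustration.
p_x_given_h = 0.8      # P(X|H)
p_h = 0.3              # P(H)
p_x_given_not_h = 0.2  # P(X|not H)

# The law of total probability gives P(X).
p_x = p_x_given_h * p_h + p_x_given_not_h * (1 - p_h)

print(posterior(p_x_given_h, p_h, p_x))
```

The evidence term P(X) is computed from both hypotheses so that the posterior is a proper probability between 0 and 1.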
Which of the following are special programs that are stored in the database and fired when certain predefined
Triggers
Which of the following clustering methods is based on a set of density distribution functions?
DENCLUE
Classification rules are extracted from
Decision Tree
Data matrix is :-
Object by variable structure
Association rules mining is based on :-
Clustering and Employing rules for classification
Which of the following clustering methods computes an augmented cluster ordering?
OPTICS
Query tool is meant for :-
Data acquisition
Which of the following function involves data cleaning, data standardizing and summarizing?
Transforming data
The main advantage of which of the following methods is its fast processing?
Grid based
In immediate data extraction, data capture through the transaction log uses transactions from :-
Recovery from failure
Real world databases are highly susceptible to noisy, missing and inconsistent data due to :-
Huge size of data
Queries run faster to find exact match using which type of indexing?
Clustered index
Data can be smoothed by fitting the data to a function such as :-
Regression
Grouped data can be analyzed with the technique :-
Mixed effect model
Redundancies can be deleted by :-
Correlation analysis
OLAP is used for :-
Online Analytical Processing
Which of the following grid-based clustering methods explores statistical information?
STING
Data reduction by volume can be used for data representation using which type of reduction?
Numerosity reduction
Multiple Choice Multiple Answer
Advantages of Wavelet transformation for clustering are :-
Unsupervised clustering , Detection of cluster for accuracy , Clustering is fast
Which of the following cluster analysis methods use a multiresolution approach?
STING , WaveCluster
Time variant nature of the data in data warehouse :-
Allows for analysis of the past , Relate information to the present , Enables forecasts for the future
Data compression is to compress the given data by encoding in terms of :-
Association rule , Decision tree , Cluster
The different definitions of metadata are :-
Data about data , Catalog of data , Data warehouse roadmap
In physical design of data warehouse administration provides features like:-
Avoiding reorganizing of tables , Support backup and recovery , Query processing
Data mining Functionalities are :-
Characterization and Discrimination , Association Analysis , Cluster Analysis
Source Data Component may be grouped into following categories :-
Production Data , Internal Data , External Data
The strategies for data reduction are :-
Data aggregation , Dimension reduction , Numerosity reduction
Metadata in a data warehouse falls into following categories :-
Operational Metadata , Extraction and Transformation metadata , End-user Metadata
Knowledge discovery process includes :-
Data Cleaning , Data Integration , Data Selection
SMP provides the features like :-
Each node has access to common set of disks , Controllers which are accessible to all processors , Each processor has full
Foundation infrastructure of warehouse includes many elements such as :-
Basic Computing platform , Hardware and operating system , DBMS and Query
In data storage area , DBA uses metadata for processes of :-
Backup , Recovery , Tuning Database
Building blocks of Data Warehouse are :-
Source Data , Data Staging , Management and Control
Business metadata is useful for :-
Providing support to end users , For external view of data , Provides technical support to search data
Common areas of application for mixed effect model includes :-
Multiple data , Repeated measures data , Block designs
The dimensions of spatial data cube are :-
Non-spatial dimension , Spatial to non spatial , Spatial to spatial
Generalized linear model includes :-
Logistic regression , Poisson regression
True/False
Visual display can help user to give clear impression and overview of the data characteristics in a database.
True
COBWEB is a method of incremental conceptual clustering.
True
Data cube stores multidimensional aggregate information.
True
Data updates are commonplace in an operational database.
True
In a decision tree, internal nodes are denoted by ovals and leaf nodes are denoted by rectangles.
False
From a data warehouse perspective, data mining can be viewed as an advanced stage of Online Analytical Processing.
True
The Structure that brings all the components together is known as Architecture.
True
A distinct feature of DBMiner is its data cube based online analytical mining.
True
Removing noise from data is called smoothing.
True
A distinguishing feature of Clementine is its object oriented extended module interface.
True
Data mining refers to extracting knowledge from large amounts of data.
True
All data extraction, transformation, integration and staging jobs run on selected hardware under chosen OS.
True
Legacy data resides on Hierarchical or Network database.
True
Data classification is a two-step process in which the first step includes classification of the model and in the second step the model describes
False
Descriptive mining performs inference on current data, while predictive mining characterizes the general properties of
False
The elements of warehouse infrastructure are classified into operational and physical infrastructure.
True
Metadata acts like a nerve center.
True
To detect money laundering and other financial crimes, it is important to integrate information for multiple
True

Match The Following


Disparate data -Production data
Non volatile data -Query and analysis
Data granularity -Level of detail
Data from external source -External data
Match The Following
Data Mining -Knowledge discovery
Metadata -Roadmap for user
Data storage -Data management
Data staging -Workbench for data
Match The Following
Data producer -Responsible for data quality
Domain values -Prevalent problem
Update security -Prevention of unauthorized updates
Referential integrity -Foreign key preserved
Match The Following
Clustering tool -To group different cases
Data visualization tool -Transaction activity using graph
Linkage analysis tool -To identify links
Classification tool -To filter unrelated attributes
Select The Blank
Semantic integration of ________ genome database is the important task of DNA analysis.
Heterogeneous and distributed
With the widespread adoption of ________, a real-time connection is viable for the data warehouse.
TCP/IP
________ are responsible for running queries and reports against data warehouse tables.
End users
________ includes Normalization and Aggregation as data preprocessing procedures.
Data transformation
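To make the two preprocessing procedures named above concrete, here is a minimal sketch in Python of min-max normalization and of aggregation by summation. The function names, the `(year, amount)` record layout, and the sample figures are assumptions made for illustration only.

```python
from collections import defaultdict


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values into the range [new_min, new_max]."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("cannot normalize a constant attribute")
    scale = (new_max - new_min) / (high - low)
    return [(v - low) * scale + new_min for v in values]


def aggregate_by_year(records):
    """Sum (year, amount) records into one total per year."""
    totals = defaultdict(float)
    for year, amount in records:
        totals[year] += amount
    return dict(totals)


# Hypothetical quarterly sales figures.
quarterly_sales = [(2023, 10.0), (2023, 20.0), (2024, 30.0), (2024, 5.0)]

print(min_max_normalize([10, 20, 30]))
print(aggregate_by_year(quarterly_sales))
```

Normalization changes the scale of an attribute without changing the number of records, whereas aggregation replaces detail records with summaries.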
________ is the user who has system access privileges but no database administration privileges as well as not
Network administrator
________ dimension of database in which primitive level data are spatial but generalization becomes non spat
Spatial to non spatial
________ technique is the statistical technique for analyzing data.
Time series
________ is the method used to predict the value of a response variable from one or more variables.
Regression
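As an illustration of predicting a response variable from a single predictor, the following is a minimal sketch of ordinary least-squares linear regression in plain Python. The helper name `fit_line` and the data points are hypothetical; the quiz does not prescribe any particular implementation.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that lie on a straight line.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)

# Predict the response for a new predictor value.
print(slope * 5 + intercept)
```

The same fitted line can also be used for smoothing, by replacing each observed value with the value the line predicts for it.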
________ databases are one of the most popularly available and rich information repositories.
Relational
A web server usually registers ________ entry for every access of a web page
Weblog
Human beings have around ________ genes.
100000
In ________ duplicate sub trees exist within the tree.
Repetition
Indexed ________ engines search, index web pages and build huge keyword-based indices which help to search sets
Web Search
In data ________, data encoding or transformations are applied to obtain a reduced or compressed representation.
Compression
________ is the navigational map of data warehouse.
End user Metadata
In data warehouse architecture, the ________ component interleaves with and connects other components.
Metadata
Creating ________ is a violation of normalization principles.
Array
In ________ type smoothing, minimum and maximum values in given bin are identified as bin boundaries.
Smoothing by bin boundaries
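The idea can be sketched in a few lines of Python: sort the data, partition it into equal-frequency bins, and replace each value with whichever bin boundary (minimum or maximum) is closer. The function name, the tie-breaking rule, and the sample data are assumptions for illustration.

```python
def smooth_by_bin_boundaries(values, bin_size):
    """Equal-frequency binning; each value becomes the nearer bin boundary."""
    if bin_size < 1:
        raise ValueError("bin_size must be at least 1")
    data = sorted(values)
    smoothed = []
    for start in range(0, len(data), bin_size):
        bin_values = data[start:start + bin_size]
        low, high = bin_values[0], bin_values[-1]
        # Ties go to the lower boundary (an arbitrary choice for this sketch).
        smoothed.extend(
            low if v - low <= high - v else high for v in bin_values
        )
    return smoothed


# Hypothetical sorted price data, split into bins of three values each.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
print(smooth_by_bin_boundaries(prices, 3))
```

With these inputs the first bin is 4, 8, 15, so 8 moves to the boundary 4 because it is closer to 4 than to 15.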
________ can store aggregate and detail data at varying levels of resolution or abstraction.
Index tree
________ is the platform for complex data transformation for the purpose of cleansing it
Separate optimal Platform
________ is a density-based clustering method which computes an augmented clustering ordering for automatic
OPTICS
