Question: 1 of 60

You are trying to load a JSON-formatted data file from Hadoop into a BigSheets
workbook. The data appears to be formatted incorrectly. What should you do?

A. Export the file to a compatible format.
B. Load the data into an RDBMS table.
C. Pick a different catalog table.
D. Change the line reader for the file.

Question: 2 of 60
Which IBM BigInsights component should be used to build extractors from
unstructured data?

A. CRAN
B. Web Tooling
C. HDFS
D. BigSheets

Question: 3 of 60
Which of the following units is larger than the others?

A. TiB
B. ZiB
C. EiB
D. MiB

Question: 4 of 60
Which feature of Apache Spark allows it to perform faster computations
than Hadoop 2 alone?

A. APIs for Scala, Python, and Java
B. in-memory computation and caching
C. allows for batch applications
D. support for the MapReduce model
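
Spark's speed advantage comes largely from keeping intermediate data in memory across operations. A minimal Scala sketch of that caching behavior (the file path is illustrative):

    val lines = sc.textFile("hdfs:///data/logs.txt")  // path is made up
    lines.cache()                                     // mark for in-memory storage
    lines.filter(_.contains("ERROR")).count()         // first action materializes the cache
    lines.filter(_.contains("WARN")).count()          // second action reuses the in-memory copy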

Question: 5 of 60
Which of the following describes a big data stream of data?

A. structured data from multiple systems
B. real-time transactions of stocks
C. information from email correspondence
D. collections of unstructured documents

Question: 6 of 60
In a YARN architecture, which component do clients submit new jobs to?

A. Container
B. NodeManager
C. Application Manager
D. Resource Manager

Question: 7 of 60
How is Apache Ambari used?

A. for a flexible data processing framework
B. for machine learning
C. for Hadoop cluster administration
D. for statistics processing

Question: 8 of 60
BigSheets was designed for which group of users?

A. business analysts
B. DBAs
C. programmers
D. system administrators

Question: 9 of 60
When comparing Pig and SQL, which is a true statement?

A. SQL supports pipeline splits and ETL techniques.
B. Pig is a declarative language.
C. Pig can only store data at the end of evaluation.
D. SQL and RDBMSs are generally much faster after data is loaded.

Question: 10 of 60
Which two statements are true regarding the functionality of HDFS? (Choose
two.) (Please select ALL that apply)

A. Files are split across the cluster.
B. Programs are moved to the data for processing.
C. Each node has a copy of all the data.
D. Data can be updated from any node.

Question: 11 of 60
Which command can be used to read the results of a Hadoop computation
stored in a file named results?

A. hadoop fs -pwd results
B. hadoop fs -cat results
C. hadoop fs -ls -R results
D. hadoop fs -rm results

Question: 12 of 60
Which command lists the files available in a folder on HDFS?

A. hdfs dfs -chown
B. hdfs dfs -ls
C. hdfs dfs -put
D. hdfs dfs -cat
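
The commands in questions 11 and 12 can be tried directly from a terminal; hadoop fs and hdfs dfs are interchangeable for these operations. An illustrative session (paths are made up):

    hdfs dfs -put results /user/biadmin/   # copy a local file into HDFS
    hdfs dfs -ls /user/biadmin             # list the files in a folder
    hadoop fs -cat /user/biadmin/results   # print the stored results to stdout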

Question: 13 of 60
What is the minimum version of Java required to program with Spark v1.x?

A. Java 8
B. Java 7
C. Java 5
D. Java 6 with Spark 1.x, Java 7 with Spark 2.x

Question: 14 of 60
[missing]

Question: 15 of 60
What is the name of the variable of the Spark context in the Scala shell?

A. sp
B. sc
C. scala
D. spk
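
In the interactive Scala shell (spark-shell) the context is pre-created and bound to sc, so it can be used immediately; a minimal illustrative session:

    scala> sc.parallelize(1 to 5).sum()
    res0: Double = 15.0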

Question: 16 of 60
[missing]

Question: 17 of 60
Which command is used to submit a job to a Hadoop 2 cluster?

A. hdfs ls filename.jar
B. hadoop dfs -put filename.jar
C. hadoop jar filename.jar
D. jar -tf filename.jar
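
A typical submission passes the application jar, its main class, and any arguments to the hadoop jar command; an illustrative invocation (jar, class, and paths are made up):

    hadoop jar wordcount.jar WordCount /user/biadmin/input /user/biadmin/output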

Question: 18 of 60
Which Apache tool should be used to initialize the Hadoop cluster?

A. Spark
B. Knox
C. Ambari
D. Yarn

Question: 19 of 60
Which two characteristics describe the Platform Symphony grid management
platform? (Choose two.)

A. it is designed around a single-tenant platform
B. it can run applications built in Java
C. it manages only grid-aware applications
D. it is reactive to time-critical requirements

Question: 20 of 60
When using Spark Streaming, what is a DStream?

A. the processed data stream
B. a bidirectional input/output stream
C. a sequence of RDDs
D. a SQL-to-stream reader
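
A DStream is a discretized stream: each batch interval arrives as one RDD. A minimal Spark Streaming sketch in Scala (host, port, and interval are illustrative):

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(10))      // 10-second batches
    val lines = ssc.socketTextStream("localhost", 9999)  // DStream[String]
    lines.foreachRDD(rdd => println(rdd.count()))        // each batch is an RDD
    ssc.start()
    ssc.awaitTermination()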

Question: 21 of 60
What does the hadoop fsck command do?

A. deletes old files
B. checks the status of the HDFS cluster
C. enables file system replication
D. rebalances files in the HDFS cluster
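
The health check is run against an HDFS path; an illustrative invocation:

    hadoop fsck / -files -blocks   # report file, block, and replication status under /
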
Question: 22 of 60
What are two primary issues when dealing with unstructured data in
comparison to structured data? (Choose two.)

A. lack of value
B. lack of context
C. lack of availability
D. lack of data types

Question: 23 of 60
What is Hive used for?

A. job submission and monitoring via a GUI
B. migrating data from one cluster to another
C. cluster resource allocation and reporting
D. managing and querying structured data in Hadoop

Question: 24 of 60
Which of the following units is smaller than the others?

A. YiB
B. EiB
C. PiB
D. ZiB

Question: 25 of 60
What does it mean to create a new shard in HBase?

A. Scale out seamlessly to another partition.
B. Create a data restore point.
C. Collect related data together.
D. Remove unused data from the cluster.

Question: 26 of 60
What should be done to load data that is at rest into Hadoop?

A. Use standard HDFS commands to load the data.
B. Set up Flume to import the data into Hadoop.
C. Create a data streamer for the data.
D. Import the data into an RDBMS.

Question: 27 of 60
Which type of statement is available in the Pig Latin language?

A. if
B. case
C. foreach
D. do loops
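
For reference, FOREACH...GENERATE is the Pig Latin statement that transforms each tuple of a relation; a small illustrative script (file and fields are made up):

    raw    = LOAD '/data/people.csv' USING PigStorage(',') AS (name:chararray, age:int);
    adults = FILTER raw BY age >= 18;
    names  = FOREACH adults GENERATE name;
    DUMP names;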

Question: 28 of 60
In a YARN architecture, what does all computation code run inside of?

A. ApplicationManager
B. Container
C. ResourceManager
D. Distributed Application.

Question: 29 of 60
[missing]

Question: 30 of 60
Which of the following describes data in motion?

A. a file being updated by a sensor network
B. a read-only database view of archived data
C. a text file containing data from the last census
D. a particular encounter in a patient’s medical record

Question: 31 of 60
What command will launch ZooKeeper from a Linux terminal window?

A. ZK.sh
B. zkCli.sh
C. zk-admin
D. zookeepersh
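
An illustrative session with the ZooKeeper command-line client (host and port assume a default local install):

    zkCli.sh -server localhost:2181   # connect to the ensemble
    ls /                              # then list the root znodes at the prompt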

Question: 32 of 60
What is the default storage level for Apache Spark?

A. MEMORY_AND_DISK
B. MEMORY_ONLY
C. OFF_HEAP
D. DISK_ONLY
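
For RDDs, persist() with no argument uses MEMORY_ONLY; other levels must be requested explicitly. A minimal Scala sketch (the path is illustrative):

    import org.apache.spark.storage.StorageLevel

    val rdd = sc.textFile("hdfs:///data/events")
    rdd.persist()   // no argument: StorageLevel.MEMORY_ONLY
    // a different level must be chosen up front, e.g.:
    // rdd.persist(StorageLevel.MEMORY_AND_DISK)
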
Question: 33 of 60
What are two features that BigR provides for R users? (Choose two.)

A. BigR queries on Big Data without MapReduce code
B. provides scalability of machine learning for large data sets
C. is an open source implementation of the R language
D. has a large community library of statistics packages for big data

Question: 34 of 60
Which IBM BigInsights tool is designed to build extractors that extract structured
data from both unstructured and semi-structured data?

A. Online Analytical Programming (OLAP)


B. Annotation Query Language (AQL)
C. Structured Query Language (SQL)
D. Distributed File System (DFS)

Question: 35 of 60
What is the correct definition of a Proximity Rule?

A. Proximity Rule is a component for building extractors that pull structured
information from unstructured text.
B. Proximity Rule is a union of two or more elements with the same schema.
C. Proximity Rule specifies the minimum and maximum number of tokens that
might occur before or after.
D. Proximity Rule is a programming model for processing and generating large
datasets.

Question: 36 of 60
What does each view in an AQL extractor define?

A. a row
B. a column
C. a relation
D. a dictionary

Question: 37 of 60
What should you do when an extractor generates multiple rows for the same
text?

A. Re-create the extractors.
B. Edit the properties of the sequence.
C. Create a new filter.
D. Create a consolidation rule.

Question: 38 of 60
Which text analytics phase should you stay in until you are satisfied that
extractors are meeting your requirements?

A. Analysis
B. Production
C. Performance Tuning
D. Rule Development

Question: 39 of 60
Which pre-built extractors are used for extracting numeric information and
percentages?

A. generic extractors
B. named-entity extractors
C. financial extractors
D. other extractors

Question: 40 of 60
During which text analytics phase should you refine the rules for runtime
enhancements and speed?

A. Analysis
B. Production
C. Rule Development
D. Performance Tuning

Question: 41 of 60
Which category of pre-built extractors is used to extract dates and emails?

A. financial extractors
B. generic extractors
C. other extractors
D. named-entity extractors

Question: 42 of 60
You have some results from an extractor. You need to see which extractor
found the results. Which button should you click on?

A. Service Actions
B. Filter Button
C. Show Extractor Name
D. Tag Button

Question: 43 of 60
Which approach is taken during the Analysis phase using text analytics?

A. Refine rules for runtime performance.
B. Create extractors that will meet requirements.
C. Verify appropriate data is being extracted.
D. Locate examples of information to be extracted.

Question: 44 of 60
Which AQL candidate rule is used to perform pattern matching?

A. Sequence
B. Select
C. Union
D. Blocks

Question: 45 of 60
Which two types of encoding can an HBase entity have? (Choose two.)

A. binary
B. hex
C. string
D. decimal
E. character

Question: 46 of 60
Which ISO/IEC standards and capabilities does Big SQL support?

A. SQL:2008
B. SQL:2011
C. SQL:2013
D. SQL:2006

Question: 47 of 60
Which column storage format for Big SQL allows for parallel processing of row
collections in a cluster?

A. Parquet
B. ORC
C. Sequence
D. Avro

Question: 48 of 60
Which feature of Big SQL can split data that is added later into multiple
files?

A. BigSheets
B. Primary Key
C. Foreign Key
D. Partitioned Tables
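
A partitioned table stores each partition value in its own directory, so later loads are split across multiple files. A hedged Big SQL sketch (table and columns are made up):

    CREATE HADOOP TABLE sales (
      id INT,
      amount DECIMAL(10,2)
    )
    PARTITIONED BY (sale_date VARCHAR(10));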

Question: 49 of 60
Which Big SQL file format is supported by a native I/O engine, is the
highest-performance format, and has an optimal block size the same as the
HDFS block size?

A. Parquet
B. ORC
C. Sequence
D. Avro

Question: 50 of 60
What is the reason that no information other than EXPLAIN itself is collected in
the snapshot column when a Big SQL EXPLAIN statement is executed?

A. No column was defined for the snapshot.
B. The WITH SNAPSHOT clause was used.
C. No errors occurred during compilation of the explainable statement.
D. The FOR SNAPSHOT clause was used.
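
Big SQL inherits the Db2 EXPLAIN statement, where an optional clause controls what is written to the snapshot column; an illustrative form (the query is made up):

    EXPLAIN PLAN FOR SNAPSHOT FOR
      SELECT * FROM sales WHERE amount > 100;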

Question: 51 of 60
What should you create when you need to query a remote table and a Big SQL
table?

A. a snapshot
B. a database
C. a nickname
D. a union
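
A nickname is a local alias for a remote table, letting one query join it with a Big SQL table. A hedged sketch in Db2-style federation syntax (server, schema, and table names are made up):

    CREATE NICKNAME remote_sales FOR orasrv.SALES.ORDERS;

    SELECT r.id, b.amount
    FROM remote_sales r
    JOIN local_sales b ON r.id = b.id;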

Question: 52 of 60
How is HBase different from a relational database?

A. It is not useful for sparse datasets.
B. All data is stored as bit arrays.
C. It has a schema.
D. All data is stored as byte arrays.

Question: 53 of 60
Which connector type does LOAD use to retrieve data from a Hadoop data
source into an HBase table?

A. DATAACCESS
B. JDBC URL
C. Insert
D. Sqoop

Question: 54 of 60
Which command is recommended to get HDFS data into a Big SQL table
because it has the best runtime performance?

A. Select
B. Create
C. Load
D. Insert

Question: 55 of 60
You need to create a table called "users" in an HDFS file system. Which
command should you use?

A. create hdfs table users
B. create hadoop table users
C. create external users
D. create table users
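
Taking questions 54 and 55 together, a hedged Big SQL sketch that creates a Hadoop table and then loads HDFS data into it (path, columns, and property values are illustrative):

    CREATE HADOOP TABLE users (
      id   INT,
      name VARCHAR(64)
    );

    LOAD HADOOP USING FILE URL '/tmp/users.csv'
      WITH SOURCE PROPERTIES ('field.delimiter' = ',')
      INTO TABLE users;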

Question: 56 of 60
What are the two ways to invoke the EXPLAIN statement in Big SQL? (Choose
two.)

A. with an incremental scan
B. interactively
C. embedded into an application
D. in batch scripts

Question: 57 of 60
In a Spark operation, what is considered a "lazy" evaluation?

A. Count
B. Transformations
C. Actions
D. Sequences
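
Transformations only record lineage; nothing executes until an action runs. A minimal Scala sketch:

    val nums  = sc.parallelize(1 to 1000000)
    val evens = nums.filter(_ % 2 == 0)  // transformation: lazy, no job yet
    val total = evens.count()            // action: triggers the computation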

Question: 58 of 60
What is the default directory in HDFS that holds any tables created by Big
SQL?

A. /apps/hive/warehouse/
B. /apps/hbase/data/
C. /apps/hive/warehouse/schema/
D. /apps/hbase/warehouse

Question: 59 of 60
Which ANALYZE TABLE...COMPUTE STATISTICS... command option should
you run to collect just basic table statistics (number of files and partitions, total
size)?

A. -NOSCAN
B. -COPYHIVE
C. -INCREMENTAL
D. -PARTIALSCAN

Question: 60 of 60
How should you collect statistics about your data in a Big SQL table to help
better organize your data?

A. Run the SYSPROC.SYSINSTALLOBJECTS procedure.
B. Run the $JSQSH_HOME/bin/jsqsh script.
C. Run the EXPLAIN command with no columns selected.
D. Run the ANALYZE command with at least one column selected.
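
Hedged examples of the two ANALYZE variants these questions describe, following the Hive-style syntax the options suggest (schema, table, and columns are made up):

    -- basic table statistics only, without scanning the data
    ANALYZE TABLE myschema.sales COMPUTE STATISTICS NOSCAN;

    -- table plus column statistics for the query optimizer
    ANALYZE TABLE myschema.sales COMPUTE STATISTICS FOR COLUMNS id, amount;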
