Pass the Cloudera CCAH CCA-500 Questions and Answers with CertsForce

Viewing page 1 out of 2 pages
Viewing questions 1-10 out of questions
Question # 1:

You are running a Hadoop cluster with MapReduce version 2 (MRv2) on YARN. You consistently see that map tasks on your cluster run slowly because of excessive JVM garbage collection. Which property do you set to increase the JVM heap size to 3 GB and optimize performance?

Options:

A.

yarn.application.child.java.opts=-Xsx3072m


B.

yarn.application.child.java.opts=-Xmx3072m


C.

mapreduce.map.java.opts=-Xms3072m


D.

mapreduce.map.java.opts=-Xmx3072m


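As background, in MRv2 the map-task heap is controlled by the mapred-site.xml property mapreduce.map.java.opts rather than any yarn.* property; a minimal configuration sketch, assuming a 3 GB heap, might look like:

```xml
<!-- mapred-site.xml (sketch): give each map-task JVM a 3 GB heap -->
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx3072m</value>
</property>
```

Note that the enclosing container size (mapreduce.map.memory.mb) should be somewhat larger than the heap to leave headroom for JVM overhead.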
Question # 2:

Which two features does Kerberos security add to a Hadoop cluster? (Choose two)

Options:

A.

User authentication on all remote procedure calls (RPCs)


B.

Encryption for data during transfer between the Mappers and Reducers


C.

Encryption for data on disk (“at rest”)


D.

Authentication for user access to the cluster against a central server


E.

Root access to the cluster for users hdfs and mapred but non-root access for clients


Question # 3:

On a cluster running MapReduce v2 (MRv2) on YARN, a MapReduce job is given a directory of 10 plain text files as its input. Each file is made up of 3 HDFS blocks. How many Mappers will run?

Options:

A.

We cannot say; the number of Mappers is determined by the ResourceManager


B.

We cannot say; the number of Mappers is determined by the developer


C.

30


D.

3


E.

10


F.

We cannot say; the number of mappers is determined by the ApplicationMaster


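With the default TextInputFormat, each HDFS block of an uncompressed plain text file becomes one input split, and each split gets its own map task. A quick check of the arithmetic for this scenario:

```shell
# 10 uncompressed text files x 3 HDFS blocks each
# => 30 input splits => 30 map tasks
echo $((10 * 3))
```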
Question # 4:

Given:

[Exhibit: a YARN application listing showing application IDs, including application_1374638600275_0109, and their States]

You want to clean up this list by removing jobs whose State is KILLED. Which command do you enter?

Options:

A.

yarn application -refreshJobHistory


B.

yarn application -kill application_1374638600275_0109


C.

yarn rmadmin -refreshQueue


D.

yarn rmadmin -kill application_1374638600275_0109


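For reference, the yarn application subcommand takes single-dash flags; a sketch of listing applications and killing one by its ID (shown as a fragment, since it requires a running YARN cluster):

```shell
# List applications and their States
yarn application -list

# Kill one application by its application ID
yarn application -kill application_1374638600275_0109
```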
Question # 5:

You are planning a Hadoop cluster and considering implementing 10 Gigabit Ethernet as the network fabric. Which workloads benefit the most from a faster network fabric?

Options:

A.

When your workload generates a large amount of output data, significantly larger than the amount of intermediate data


B.

When your workload consumes a large amount of input data, relative to the entire capacity of HDFS


C.

When your workload consists of processor-intensive tasks


D.

When your workload generates a large amount of intermediate data, on the order of the input data itself


Question # 6:

You are running a Hadoop cluster with all monitoring facilities properly configured.

Which scenario will go undetected?

Options:

A.

HDFS is almost full


B.

The NameNode goes down


C.

A DataNode is disconnected from the cluster


D.

Map or reduce tasks that are stuck in an infinite loop


E.

MapReduce jobs are causing excessive memory swaps


Question # 7:

You need to analyze 60,000,000 images stored in JPEG format, each of which is approximately 25 KB. Because your Hadoop cluster isn't optimized for storing and processing many small files, you decide to take the following actions:

1. Group the individual images into a set of larger files

2. Use the set of larger files as input for a MapReduce job that processes them directly with Python using Hadoop streaming.

Which data serialization system gives the flexibility to do this?

Options:

A.

CSV


B.

XML


C.

HTML


D.

Avro


E.

SequenceFiles


F.

JSON


Question # 8:

You have a cluster running with the Fair Scheduler enabled. There are currently no jobs running on the cluster, and you submit Job A, so that only Job A is running on the cluster. A while later, you submit Job B. Now Job A and Job B are running on the cluster at the same time. How will the Fair Scheduler handle these two jobs? (Choose two)

Options:

A.

When Job B gets submitted, it will get assigned tasks, while job A continues to run with fewer tasks.


B.

When Job B gets submitted, Job A has to finish first, before Job B can get scheduled.


C.

When Job A gets submitted, it doesn’t consume all the task slots.


D.

When Job A gets submitted, it consumes all the task slots.


Question # 9:

Each node in your Hadoop cluster, running YARN, has 64 GB of memory and 24 cores. Your yarn-site.xml has the following configuration:

yarn.nodemanager.resource.memory-mb = 32768

yarn.nodemanager.resource.cpu-vcores = 12

You want YARN to launch no more than 16 containers per node. What should you do?

Options:

A.

Modify yarn-site.xml with the following property:

yarn.scheduler.minimum-allocation-mb = 2048


B.

Modify yarn-site.xml with the following property:

yarn.scheduler.minimum-allocation-mb = 4096


C.

Modify yarn-site.xml with the following property:

yarn.nodemanager.resource.cpu-vcores


D.

No action is needed: YARN’s dynamic resource allocation automatically optimizes the node memory and cores


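The per-node container cap follows from the NodeManager's memory budget divided by the scheduler's minimum allocation: with 32768 MB available per node, capping the node at 16 containers implies a 2048 MB minimum allocation. The arithmetic:

```shell
# NodeManager memory budget (MB) / desired max containers per node
# 32768 / 16 = 2048 MB minimum container allocation
echo $((32768 / 16))
```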
Question # 10:

Which is the default scheduler in YARN?

Options:

A.

YARN doesn’t configure a default scheduler; you must first assign an appropriate scheduler class in yarn-site.xml


B.

Capacity Scheduler


C.

Fair Scheduler


D.

FIFO Scheduler

