CCD-410 Free Practice Questions: Cloudera Certified Developer for Apache Hadoop (CCDH) Certification

Which process describes the lifecycle of a Mapper?
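As a rough illustration of the lifecycle this question asks about, the sketch below mirrors in plain Python the call order used by the `run()` method of Hadoop's new-API `Mapper`: one setup call, one map call per input record, then one cleanup call. The class `TracingMapper` and the list standing in for the context are hypothetical; this is not Hadoop code.

```python
class TracingMapper:
    """Hypothetical stand-in for one Mapper instance serving one map task."""

    def __init__(self):
        self.calls = []  # records the order in which lifecycle methods run

    def setup(self, context):
        self.calls.append("setup")      # called once, before any record

    def map(self, key, value, context):
        self.calls.append("map")        # called once per input record
        context.append((value, key))    # emit an output pair

    def cleanup(self, context):
        self.calls.append("cleanup")    # called once, after the last record

    def run(self, records, context):
        # Same ordering as the new-API Mapper.run(): setup, map per record, cleanup.
        self.setup(context)
        for key, value in records:
            self.map(key, value, context)
        self.cleanup(context)


# One input split with three records of (byte offset, line).
split = [(0, "alpha"), (6, "beta"), (11, "gamma")]
output = []
mapper = TracingMapper()
mapper.run(split, output)
print(mapper.calls)  # ['setup', 'map', 'map', 'map', 'cleanup']
```

The same instance handles every record in its split; a new instance is created for each map task rather than for each record.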

The Hadoop framework provides a mechanism for coping with machine issues such as faulty configuration or impending hardware failure. MapReduce detects that one or more machines are performing poorly and starts additional copies of a map or reduce task. All copies run simultaneously, and the output of whichever copy finishes first is used. This is called:
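The behavior described here, running duplicate attempts of one task and keeping the first to finish, can be imitated with threads. This is a toy sketch only: `run_attempt` and `run_with_backup` are hypothetical names, the delays are invented, and unlike Hadoop the sketch does not kill the slower attempt but simply ignores its result.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_attempt(attempt_id, delay_seconds):
    """Pretend to do the task's work; a slow machine has a larger delay."""
    time.sleep(delay_seconds)
    return attempt_id


def run_with_backup(delays):
    """Start one attempt per delay and return the id of the first to finish."""
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        futures = [pool.submit(run_attempt, i, d) for i, d in enumerate(delays)]
        done, _pending = wait(futures, return_when=FIRST_COMPLETED)
        # The first finished attempt wins; the other result is discarded.
        return next(iter(done)).result()


# Attempt 0 runs on a slow node, attempt 1 is the backup copy on a healthy node.
winner = run_with_backup([0.3, 0.01])
print("winning attempt:", winner)
```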

You wrote a map function that throws a runtime exception when it encounters a control character in input data. The input supplied to your mapper contains twelve such characters in total, spread across five file splits. The first four file splits each have two control characters and the last split has four control characters.
Identify the number of failed task attempts you can expect when you run the job with mapred.max.map.attempts set to 4:
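The arithmetic behind this scenario can be sketched as follows. The sketch assumes one map task per split, that a retry re-reads the same split and therefore fails again on the first control character, and that every task is retried up to its attempt limit. It ignores the fact that a real cluster aborts the job once any single task exhausts its attempts. The function name `count_failed_attempts` is hypothetical.

```python
def count_failed_attempts(control_chars_per_split, max_attempts):
    """Count failed map task attempts under the simplifying assumptions above."""
    failed = 0
    for bad_chars in control_chars_per_split:
        if bad_chars == 0:
            continue  # a clean split succeeds on its first attempt
        # The failure is deterministic: the number of control characters in the
        # split does not matter, because each attempt dies on the first one.
        failed += max_attempts
    return failed


splits = [2, 2, 2, 2, 4]  # control characters in each of the five splits
print(count_failed_attempts(splits, 4))  # 20
```

Under these assumptions the count depends only on how many splits contain at least one control character, not on the total of twelve.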

You write a MapReduce job to process 100 files in HDFS. Your MapReduce algorithm uses TextInputFormat: the mapper applies a regular expression over input values and emits key-value pairs, with the key consisting of the matching text and the value containing the filename and byte offset. Determine the difference between setting the number of reducers to one and setting the number of reducers to zero.
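A small simulation can make the two configurations concrete. It assumes each input file is small enough to form a single split (so one map task per file) and uses the `part-m-NNNNN` / `part-r-NNNNN` output names of the newer MapReduce API. The function `run_job` and the sample files are hypothetical.

```python
import re


def run_job(files, pattern, num_reducers):
    """Return {output file name: records} for a job with zero or one reducer."""
    regex = re.compile(pattern)
    map_outputs = []
    for name, text in files.items():          # one map task per input file
        records = []
        offset = 0                            # byte offset of the current line
        for line in text.splitlines(keepends=True):
            for match in regex.finditer(line):
                records.append((match.group(0), f"{name}:{offset}"))
            offset += len(line.encode("utf-8"))
        map_outputs.append(records)

    if num_reducers == 0:
        # Map-only job: no shuffle or sort; each mapper writes its own file.
        return {f"part-m-{i:05d}": recs for i, recs in enumerate(map_outputs)}

    # One reducer: all map output is shuffled, sorted by key, and written once.
    merged = sorted((r for recs in map_outputs for r in recs), key=lambda r: r[0])
    return {"part-r-00000": merged}


files = {"a.txt": "foo x\nbar foo\n", "b.txt": "bar\n"}
map_only = run_job(files, r"foo|bar", 0)
single = run_job(files, r"foo|bar", 1)
print(len(map_only), len(single))  # 2 1
```

With 100 input files under the same assumption, the map-only run would leave 100 unsorted output files, while the single-reducer run would leave one file sorted by key.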

You need to perform statistical analysis in your MapReduce job and would like to call methods in the Apache Commons Math library, which is distributed as a 1.3-megabyte Java archive (JAR) file. Which is the best way to make this library available to your MapReduce job at runtime?

A combiner reduces:
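To see what a combiner changes, the word-count sketch below compares the number of records one map task would send to the reducers with and without local pre-aggregation. The functions `map_words` and `combine` are hypothetical stand-ins, not Hadoop APIs, and the example assumes a sum, which is safe to combine because addition is associative and commutative.

```python
from collections import Counter


def map_words(line):
    """Emit (word, 1) for every word, as a word-count mapper would."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Sum the counts per key on the map side, before anything is shuffled."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())


map_output = map_words("a b a c a b")
combined = combine(map_output)
print(len(map_output), len(combined))  # 6 3
print(combined)  # [('a', 3), ('b', 2), ('c', 1)]
```

The reducer receives the same totals either way; only the volume of intermediate records crossing the network differs.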
