You work in the genomics industry and process large amounts of genomic data using a nightly Elastic MapReduce (EMR) job. The job processes a single 3 TB file stored on S3 and runs on three On-Demand core nodes and four On-Demand task nodes. The job is now taking longer than anticipated, and you have been asked to advise on how to reduce the completion time. What should you recommend?
A. Use four Spot Instances for the task nodes rather than four On-Demand instances.
B. You should reduce the input split size in the MapReduce job configuration and then adjust the number of simultaneous mapper tasks so that more tasks can be processed at once.
C. Store the file on Elastic File System (EFS) instead of S3 and then mount EFS as an independent volume for your core nodes.
D. Configure an independent VPC in which to run the EMR jobs and then mount EFS as an independent volume for your core nodes.
E. Enable termination protection for the job flow.
Answer: B. You should reduce the input split size in the MapReduce job configuration and then adjust the number of simultaneous mapper tasks so that more tasks can be processed at once.
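The reasoning behind B can be checked with some rough arithmetic. The sketch below is illustrative only: it assumes the file is exactly 3 TiB and splittable, that the seven nodes from the question (three core, four task) all run mappers, and that the split sizes and mappers-per-node figures are hypothetical values chosen for the example. In Hadoop, the split size is typically bounded by properties such as `mapreduce.input.fileinputformat.split.maxsize`.

```python
# Rough estimate of map task count and scheduling "waves" for one large file.
# All concrete numbers below are assumptions made for illustration.

TIB = 1024 ** 4
MIB = 1024 ** 2


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def num_splits(file_size_bytes: int, split_size_bytes: int) -> int:
    """Number of input splits, and therefore map tasks, for a splittable file."""
    return ceil_div(file_size_bytes, split_size_bytes)


def num_waves(splits: int, nodes: int, mappers_per_node: int) -> int:
    """Number of rounds needed to run all map tasks given the concurrent slots."""
    return ceil_div(splits, nodes * mappers_per_node)


if __name__ == "__main__":
    file_size = 3 * TIB   # assumed exact size of the "3 TB" file
    nodes = 3 + 4         # core nodes + task nodes from the question

    # Hypothetical (split size, mappers per node) configurations.
    for split_mib, mappers in [(128, 8), (64, 8), (64, 16)]:
        splits = num_splits(file_size, split_mib * MIB)
        waves = num_waves(splits, nodes, mappers)
        print(f"split={split_mib} MiB mappers/node={mappers} "
              f"-> {splits} map tasks, {waves} waves")
```

A smaller split size yields more, shorter map tasks, but the number of waves only falls if the number of simultaneous mappers is raised as well, which is why the answer pairs the two adjustments.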