EMR Hadoop-streaming job fails while looking for container_tokens
An attempt to run an EMR streaming job fails with:
2014-10-15 18:36:36,560 ERROR [main] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[main,5,main] threw an Exception.
java.io.IOException: Exception reading /mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1413396780703_0003/container_1413396780703_0003_01_000218/container_tokens
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:177)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:744)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:703)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:605)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:98)
Caused by: java.io.FileNotFoundException: /mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1413396780703_0003/container_1413396780703_0003_01_000218/container_tokens (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)
    at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:172)
    ... 4 more
The failure is nondeterministic, but it happens more frequently on large clusters. This is how the cluster is launched:
elastic-mapreduce --create --alive \
  --instance-group master --instance-type m1.large --instance-count 1 \
  --instance-group core --instance-type r3.xlarge --instance-count 200 \
  --hadoop-version "2.4.0" --ami-version "3.2.1" \
  --enable-debugging --json ./emr_config \
  --bootstrap-action 's3://path/to/bootstrap.sh' --bootstrap-name bootstrap
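For reference, here is roughly the same launch expressed with boto 2's EMR API (a sketch only, assuming the old boto library; the region and log URI are placeholders rather than values from the actual setup):

import boto.emr
from boto.emr.instance_group import InstanceGroup
from boto.emr.bootstrap_action import BootstrapAction

# Region is an assumption; substitute the one actually used.
conn = boto.emr.connect_to_region("us-east-1")

jobflow_id = conn.run_jobflow(
    name="streaming cluster",
    ami_version="3.2.1",                   # AMI 3.2.1 bundles Hadoop 2.4.0
    keep_alive=True,                       # --alive
    enable_debugging=True,                 # --enable-debugging
    log_uri="s3://path/to/logs/",          # placeholder; required for debugging
    instance_groups=[
        InstanceGroup(1, "MASTER", "m1.large", "ON_DEMAND", "master"),
        InstanceGroup(200, "CORE", "r3.xlarge", "ON_DEMAND", "core"),
    ],
    bootstrap_actions=[
        BootstrapAction("bootstrap", "s3://path/to/bootstrap.sh", []),
    ],
)
print(jobflow_id)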
The step configuration passed via --json (emr_config) is:
[ { "name": "step name", "actiononfailure": "continue", "hadoopjarstep": { "jar": "/home/hadoop/contrib/streaming/hadoop-streaming.jar", "args": [ "-files", "s3://path/to/mapper.py", "-input", "s3://path/to/input/", "-output", "s3://path/to/output/", "-mapper", "mapper.py", "-reducer", "/bin/cat", "-jobconf", "mapreduce.map.java.opts=-xmx22528m", "-jobconf", "mapreduce.map.memory.mb=23424", "-jobconf", "mapreduce.task.timeout=24000000", "-jobconf", "mapreduce.job.maps=200", "-jobconf", "mapreduce.tasktracker.map.tasks.maximum=1", "-jobconf", "mapred.map.tasks.speculative.execution=false" ] } } ]
Does anyone know the source of this problem, or a workaround?
hadoop amazon-web-services hadoop-streaming yarn emr