When running the standalone Hadoop example on Linux (see the earlier article, Running Hadoop Locally on Linux), only the root user could successfully run a simple example such as the WordCount word-frequency tool. Switching to a freshly created user (with no permissions configured) caused an exception at runtime.
The reason is that Hadoop's WordCount tool runs a MapReduce job, and that job needs to create and delete directories. So this is really just a matter of directory permissions.
The user I created is shirdrn; its user and group entries look like this:
shirdrn:x:501:501::/home/shirdrn:/bin/bash
shirdrn:x:501:
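The two lines above are the entries from /etc/passwd and /etc/group. They can be looked up with getent; a small sketch (it queries the current user so it runs anywhere, substitute shirdrn on the machine above):

```shell
# Look up a user's account entry and primary group entry by name.
# On the machine in this article: getent passwd shirdrn / getent group shirdrn
u=$(id -un)   # current user name
g=$(id -gn)   # current user's primary group name

getent passwd "$u"   # format: name:x:UID:GID:comment:home:shell
getent group "$g"    # format: name:x:GID:members
```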
Although creating the shirdrn user automatically created the /home/shirdrn directory, ownership of /home/shirdrn and the directories beneath it apparently was never actually granted to shirdrn.
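A quick way to confirm this is to look at who actually owns the directory tree. A sketch using GNU stat and find (it inspects "$HOME" so it runs as any user; substitute /home/shirdrn on the machine above):

```shell
# Print the owner, group, and mode of a directory. If the owner is
# root rather than the user who is supposed to work there, that
# user's writes into it will fail.
stat -c '%U:%G %a' "$HOME"

# List entries directly under the directory that are NOT owned by
# the current user; empty output means ownership is consistent.
find "$HOME" -maxdepth 1 ! -user "$(id -un)"
```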
As a result, logging in as shirdrn and running Hadoop's WordCount tool failed again and again, with an exception roughly like this:
[shirdrn@shirdrn hadoop-0.18.0]$ bin/hadoop jar hadoop-0.18.0-examples.jar wordcount my-input my-output
08/09/26 12:53:43 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
08/09/26 12:53:43 INFO mapred.FileInputFormat: Total input paths to process : 7
08/09/26 12:53:43 INFO mapred.FileInputFormat: Total input paths to process : 7
08/09/26 12:53:43 INFO mapred.JobClient: Running job: job_local_0001
08/09/26 12:53:44 INFO mapred.FileInputFormat: Total input paths to process : 7
08/09/26 12:53:44 INFO mapred.FileInputFormat: Total input paths to process : 7
08/09/26 12:53:44 ERROR mapred.LocalJobRunner: Mkdirs failed to create file:/home/shirdrn/hadoop-0.18.0/my-output/_temporary
08/09/26 12:53:44 WARN mapred.LocalJobRunner: job_local_0001
java.io.IOException: The directory file:/home/shirdrn/hadoop-0.18.0/my-output/_temporary doesnt exist
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:148)
java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1113)
        at org.apache.hadoop.examples.WordCount.run(WordCount.java:149)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:155)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:53)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
The ERROR and WARN lines above already suggested that shirdrn lacked sufficient permissions on the directory. I had always assumed that when root creates the shirdrn user, the right to operate on /home/shirdrn would be granted to shirdrn by default; practice proved otherwise.
When I ran the Hadoop example under /home/shirdrn as root, it produced the my-output result directory. After switching to shirdrn and running the example again, it complained that my-output already existed, so I tried to delete it:
[shirdrn@shirdrn hadoop-0.18.0]$ rm -rf my-output
Instead, rm told me I lacked permission:
rm: cannot remove 'my-output/part-00000': Permission denied
rm: cannot remove 'my-output/.part-00000.crc': Permission denied
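The denial comes from the directory, not the files: unlinking a file requires write permission on the directory that contains it, regardless of who owns the file. A minimal sketch reproducing this with a throwaway directory (no root needed; run as a non-root user, since root bypasses permission checks):

```shell
# Deleting a file requires write permission on its PARENT directory,
# not on the file itself -- the same reason rm failed on my-output.
tmp=$(mktemp -d)
touch "$tmp/part-00000"

chmod u-w "$tmp"            # drop write permission on the directory
# As a non-root user, this rm now fails with "Permission denied":
rm "$tmp/part-00000" 2>/dev/null || echo "rm refused, as expected"

chmod u+w "$tmp"            # restore write permission on the directory
rm -f "$tmp/part-00000"     # now the delete goes through
rmdir "$tmp"
```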
At that point it was clear that shirdrn really did need directory permissions configured.
Switching to root, I granted shirdrn ownership of the /home/shirdrn directory tree:
[shirdrn@shirdrn hadoop-0.18.0]$ su
Password:
[root@shirdrn hadoop-0.18.0]# chown -R shirdrn:shirdrn /home/shirdrn
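chown -R rewrites the owner and group of the directory and of everything beneath it in one pass. A sketch of the same idea on a scratch directory, with a check that the change took effect (handing a tree to *another* user, as above, requires root, so this sketch only chowns to the current user and group, which any user may do):

```shell
# Recursively set owner:group on a tree, then verify nothing was
# missed. Sketch on a scratch directory; on the machine above the
# real command was: chown -R shirdrn:shirdrn /home/shirdrn
dir=$(mktemp -d)
mkdir -p "$dir/sub"
touch "$dir/sub/file"

chown -R "$(id -un):$(id -gn)" "$dir"

# Empty output = every entry in the tree is owned by the user:
find "$dir" ! -user "$(id -un)"

rm -rf "$dir"
```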
Switching back to shirdrn, deleting the my-output directory now succeeded:
[root@shirdrn hadoop-0.18.0]# su shirdrn
[shirdrn@shirdrn hadoop-0.18.0]$ rm -rf my-output
[shirdrn@shirdrn hadoop-0.18.0]$
Then I ran the Hadoop example again, as follows:
[shirdrn@shirdrn hadoop-0.18.0]$ bin/hadoop jar hadoop-0.18.0-examples.jar wordcount my-input my-output
This time the run proceeded like this:
08/09/26 12:56:19 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
08/09/26 12:56:19 INFO mapred.FileInputFormat: Total input paths to process : 7
08/09/26 12:56:19 INFO mapred.FileInputFormat: Total input paths to process : 7
08/09/26 12:56:20 INFO mapred.JobClient: Running job: job_local_0001
08/09/26 12:56:20 INFO mapred.FileInputFormat: Total input paths to process : 7
08/09/26 12:56:20 INFO mapred.FileInputFormat: Total input paths to process : 7
08/09/26 12:56:20 INFO mapred.MapTask: numReduceTasks: 1
08/09/26 12:56:20 INFO mapred.MapTask: io.sort.mb = 100
08/09/26 12:56:21 INFO mapred.JobClient: map 0% reduce 0%
08/09/26 12:56:21 INFO mapred.MapTask: data buffer = 79691776/99614720
08/09/26 12:56:21 INFO mapred.MapTask: record buffer = 262144/327680
08/09/26 12:56:21 INFO mapred.MapTask: Starting flush of map output
08/09/26 12:56:21 INFO mapred.MapTask: bufstart = 0; bufend = 3262; bufvoid = 99614720
08/09/26 12:56:21 INFO mapred.MapTask: kvstart = 0; kvend = 326; length = 327680
08/09/26 12:56:21 INFO mapred.MapTask: Index: (0, 26, 26)
08/09/26 12:56:21 INFO mapred.MapTask: Finished spill 0
08/09/26 12:56:21 INFO mapred.LocalJobRunner: file:/home/shirdrn/hadoop-0.18.0/my-input/e.txt:0+1957
08/09/26 12:56:21 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
08/09/26 12:56:21 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_000000_0' to file:/home/shirdrn/hadoop-0.18.0/my-output
08/09/26 12:56:22 INFO mapred.MapTask: numReduceTasks: 1
08/09/26 12:56:22 INFO mapred.MapTask: io.sort.mb = 100
08/09/26 12:56:22 INFO mapred.JobClient: map 100% reduce 0%
08/09/26 12:56:22 INFO mapred.MapTask: data buffer = 79691776/99614720
08/09/26 12:56:22 INFO mapred.MapTask: record buffer = 262144/327680
08/09/26 12:56:22 INFO mapred.MapTask: Starting flush of map output
08/09/26 12:56:22 INFO mapred.MapTask: bufstart = 0; bufend = 3262; bufvoid = 99614720
08/09/26 12:56:22 INFO mapred.MapTask: kvstart = 0; kvend = 326; length = 327680
08/09/26 12:56:23 INFO mapred.MapTask: Index: (0, 26, 26)
08/09/26 12:56:23 INFO mapred.MapTask: Finished spill 0
08/09/26 12:56:23 INFO mapred.LocalJobRunner: file:/home/shirdrn/hadoop-0.18.0/my-input/a.txt:0+1957
08/09/26 12:56:23 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000001_0' done.
08/09/26 12:56:23 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_000001_0' to file:/home/shirdrn/hadoop-0.18.0/my-output
08/09/26 12:56:23 INFO mapred.MapTask: numReduceTasks: 1
08/09/26 12:56:23 INFO mapred.MapTask: io.sort.mb = 100
08/09/26 12:56:23 INFO mapred.MapTask: data buffer = 79691776/99614720
08/09/26 12:56:23 INFO mapred.MapTask: record buffer = 262144/327680
08/09/26 12:56:24 INFO mapred.MapTask: Starting flush of map output
08/09/26 12:56:24 INFO mapred.MapTask: bufstart = 0; bufend = 16845; bufvoid = 99614720
08/09/26 12:56:24 INFO mapred.MapTask: kvstart = 0; kvend = 1684; length = 327680
08/09/26 12:56:24 INFO mapred.MapTask: Index: (0, 42, 42)
08/09/26 12:56:24 INFO mapred.MapTask: Finished spill 0
08/09/26 12:56:24 INFO mapred.LocalJobRunner: file:/home/shirdrn/hadoop-0.18.0/my-input/b.txt:0+10109
08/09/26 12:56:24 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000002_0' done.
08/09/26 12:56:24 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_000002_0' to file:/home/shirdrn/hadoop-0.18.0/my-output
08/09/26 12:56:24 INFO mapred.MapTask: numReduceTasks: 1
08/09/26 12:56:24 INFO mapred.MapTask: io.sort.mb = 100
08/09/26 12:56:24 INFO mapred.MapTask: data buffer = 79691776/99614720
08/09/26 12:56:24 INFO mapred.MapTask: record buffer = 262144/327680
08/09/26 12:56:24 INFO mapred.MapTask: Starting flush of map output
08/09/26 12:56:24 INFO mapred.MapTask: bufstart = 0; bufend = 3312; bufvoid = 99614720
08/09/26 12:56:24 INFO mapred.MapTask: kvstart = 0; kvend = 331; length = 327680
08/09/26 12:56:24 INFO mapred.MapTask: Index: (0, 72, 72)
08/09/26 12:56:24 INFO mapred.MapTask: Finished spill 0
08/09/26 12:56:25 INFO mapred.LocalJobRunner: file:/home/shirdrn/hadoop-0.18.0/my-input/d.txt:0+1987
08/09/26 12:56:25 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000003_0' done.
08/09/26 12:56:25 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_000003_0' to file:/home/shirdrn/hadoop-0.18.0/my-output
08/09/26 12:56:25 INFO mapred.MapTask: numReduceTasks: 1
08/09/26 12:56:25 INFO mapred.MapTask: io.sort.mb = 100
08/09/26 12:56:25 INFO mapred.MapTask: data buffer = 79691776/99614720
08/09/26 12:56:25 INFO mapred.MapTask: record buffer = 262144/327680
08/09/26 12:56:25 INFO mapred.MapTask: Starting flush of map output
08/09/26 12:56:25 INFO mapred.MapTask: bufstart = 0; bufend = 3262; bufvoid = 99614720
08/09/26 12:56:25 INFO mapred.MapTask: kvstart = 0; kvend = 326; length = 327680
08/09/26 12:56:26 INFO mapred.MapTask: Index: (0, 26, 26)
08/09/26 12:56:26 INFO mapred.MapTask: Finished spill 0
08/09/26 12:56:26 INFO mapred.LocalJobRunner: file:/home/shirdrn/hadoop-0.18.0/my-input/g.txt:0+1957
08/09/26 12:56:26 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000004_0' done.
08/09/26 12:56:26 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_000004_0' to file:/home/shirdrn/hadoop-0.18.0/my-output
08/09/26 12:56:26 INFO mapred.MapTask: numReduceTasks: 1
08/09/26 12:56:26 INFO mapred.MapTask: io.sort.mb = 100
08/09/26 12:56:26 INFO mapred.MapTask: data buffer = 79691776/99614720
08/09/26 12:56:26 INFO mapred.MapTask: record buffer = 262144/327680
08/09/26 12:56:26 INFO mapred.MapTask: Starting flush of map output
08/09/26 12:56:26 INFO mapred.MapTask: bufstart = 0; bufend = 3262; bufvoid = 99614720
08/09/26 12:56:26 INFO mapred.MapTask: kvstart = 0; kvend = 326; length = 327680
08/09/26 12:56:26 INFO mapred.MapTask: Index: (0, 26, 26)
08/09/26 12:56:26 INFO mapred.MapTask: Finished spill 0
08/09/26 12:56:26 INFO mapred.LocalJobRunner: file:/home/shirdrn/hadoop-0.18.0/my-input/c.txt:0+1957
08/09/26 12:56:26 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000005_0' done.
08/09/26 12:56:26 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_000005_0' to file:/home/shirdrn/hadoop-0.18.0/my-output
08/09/26 12:56:27 INFO mapred.MapTask: numReduceTasks: 1
08/09/26 12:56:27 INFO mapred.MapTask: io.sort.mb = 100
08/09/26 12:56:27 INFO mapred.MapTask: data buffer = 79691776/99614720
08/09/26 12:56:27 INFO mapred.MapTask: record buffer = 262144/327680
08/09/26 12:56:27 INFO mapred.MapTask: Starting flush of map output
08/09/26 12:56:27 INFO mapred.MapTask: bufstart = 0; bufend = 3306; bufvoid = 99614720
08/09/26 12:56:27 INFO mapred.MapTask: kvstart = 0; kvend = 330; length = 327680
08/09/26 12:56:27 INFO mapred.MapTask: Index: (0, 50, 50)
08/09/26 12:56:27 INFO mapred.MapTask: Finished spill 0
08/09/26 12:56:27 INFO mapred.LocalJobRunner: file:/home/shirdrn/hadoop-0.18.0/my-input/f.txt:0+1985
08/09/26 12:56:27 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000006_0' done.
08/09/26 12:56:27 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_m_000006_0' to file:/home/shirdrn/hadoop-0.18.0/my-output
08/09/26 12:56:27 INFO mapred.ReduceTask: Initiating final on-disk merge with 7 files
08/09/26 12:56:27 INFO mapred.Merger: Merging 7 sorted segments
08/09/26 12:56:27 INFO mapred.Merger: Down to the last merge-pass, with 7 segments left of total size: 268 bytes
08/09/26 12:56:27 INFO mapred.LocalJobRunner: reduce > reduce
08/09/26 12:56:27 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
08/09/26 12:56:27 INFO mapred.TaskRunner: Saved output of task 'attempt_local_0001_r_000000_0' to file:/home/shirdrn/hadoop-0.18.0/my-output
08/09/26 12:56:28 INFO mapred.JobClient: Job complete: job_local_0001
08/09/26 12:56:28 INFO mapred.JobClient: Counters: 11
08/09/26 12:56:28 INFO mapred.JobClient:   File Systems
08/09/26 12:56:28 INFO mapred.JobClient:     Local bytes read=953789
08/09/26 12:56:28 INFO mapred.JobClient:     Local bytes written=961740
08/09/26 12:56:28 INFO mapred.JobClient:   Map-Reduce Framework
08/09/26 12:56:28 INFO mapred.JobClient:     Reduce input groups=7
08/09/26 12:56:28 INFO mapred.JobClient:     Combine output records=21
08/09/26 12:56:28 INFO mapred.JobClient:     Map input records=7
08/09/26 12:56:28 INFO mapred.JobClient:     Reduce output records=7
08/09/26 12:56:28 INFO mapred.JobClient:     Map output bytes=36511
08/09/26 12:56:28 INFO mapred.JobClient:     Map input bytes=21909
08/09/26 12:56:28 INFO mapred.JobClient:     Combine input records=3649
08/09/26 12:56:28 INFO mapred.JobClient:     Map output records=3649
08/09/26 12:56:28 INFO mapred.JobClient:     Reduce input records=21
The run went through smoothly.
Checking the result:
[shirdrn@shirdrn hadoop-0.18.0]$ cat my-output/part-00000
apache     1826
baketball  1
bash       1813
fax        2
find       1
hash       1
shirdrn    5
[shirdrn@shirdrn hadoop-0.18.0]$
Correct, with no problems at all.