MapReduce job hangs at "Running job"

    xiaoxiao, 2025-05-03

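    (1) Environment: Ubuntu, JDK 1.8, hadoop-2.7.2

    (2) Problem: every MapReduce application I run on Hadoop hangs once it reaches "Running job".

    After configuring a pseudo-distributed Hadoop cluster and starting it, I ran a test job (such as the bundled pi example) to check whether the cluster was configured correctly. The command was: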

    $ hadoop jar myapp.jar data/ncdc/wc data/result

    However, the job got stuck at "Running job":

    INFO mapreduce.Job: Running job: job_1403905542893_0004

    The ResourceManager web UI shows the application as UNASSIGNED:

    Tracking UI: UNASSIGNED
    Apps Submitted: 1
    Apps Pending: 1
    Apps Running: 0

    jps output:

    4764 Jps
    2148 DataNode
    3280 ResourceManager
    2053 NameNode
    3378 NodeManager
    2318 SecondaryNameNode
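The counters shown in the web UI can also be read from the ResourceManager REST API, which makes it easier to see whether the pending application is waiting for memory or for a NodeManager. The sketch below is only an illustration: the host name `master` comes from the yarn-site.xml later in this post, port 8088 is assumed to be the default web UI port, and `describe` is a hypothetical helper, not part of Hadoop.

```python
import json
from urllib.request import urlopen


def fetch_cluster_metrics(rm_url="http://master:8088"):
    """Fetch the clusterMetrics object from the ResourceManager REST API.

    rm_url is an assumption (host from yarn-site.xml, default web port).
    """
    with urlopen(rm_url + "/ws/v1/cluster/metrics", timeout=10) as resp:
        return json.load(resp)["clusterMetrics"]


def describe(metrics):
    """Turn a clusterMetrics dict into a one-line hint (hypothetical helper)."""
    pending = metrics.get("appsPending", 0)
    running = metrics.get("appsRunning", 0)
    available = metrics.get("availableMB", 0)
    total = metrics.get("totalMB", 0)
    if metrics.get("activeNodes", 0) == 0:
        return "no active NodeManager is registered with the ResourceManager"
    if pending > 0 and running == 0:
        return (f"{pending} app(s) pending, none running; "
                f"{available} of {total} MB available")
    return f"{running} app(s) running, {pending} pending"
```

On a live cluster, `print(describe(fetch_cluster_metrics()))` would print the hint. A pending application combined with a small `totalMB` points toward the resource settings discussed in this post.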

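    (3) Solutions tried:

    Searching online turned up two main suggestions. The first is that /etc/hosts contains unrelated host entries, and the fix is to edit /etc/hosts and delete them. The second is that the cluster is short of resources and cannot allocate any to the new job, and the fix is to tune the scheduler resource parameters in yarn-site.xml.

    As for the first suggestion, my hosts file only lists the local host, so it is not the cause. As for the second, I had hit a similar problem before on an Apache Hadoop pseudo-distributed cluster, where jobs hung at map 0% reduce 0%; tuning the yarn-site.xml parameters made them run cleanly. The original yarn-site.xml settings were: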

    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>2</value>
    </property>

    Settings after tuning:

    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>3072</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>2</value>
    </property>
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>256</value>
    </property>
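One plausible reading of why the tuned values helped is plain container arithmetic. The sketch below is a simplified model, not the scheduler's actual code: it assumes the Hadoop 2.x defaults of 1536 MB for the MapReduce ApplicationMaster (`yarn.app.mapreduce.am.resource.mb`) and 1024 MB per map task (`mapreduce.map.memory.mb`), and it assumes requests are rounded up to a multiple of `yarn.scheduler.minimum-allocation-mb`, as the Capacity Scheduler does. The function names are made up for illustration.

```python
import math

# Assumed Hadoop 2.x defaults (check mapred-default.xml for your version).
AM_MB = 1536    # yarn.app.mapreduce.am.resource.mb
MAP_MB = 1024   # mapreduce.map.memory.mb


def normalize(request_mb, min_alloc_mb):
    """Round a memory request up to a multiple of the minimum allocation."""
    return max(1, math.ceil(request_mb / min_alloc_mb)) * min_alloc_mb


def headroom_after_am(node_mb, min_alloc_mb):
    """Memory left on a single node once the ApplicationMaster is placed."""
    am = normalize(AM_MB, min_alloc_mb)
    map_container = normalize(MAP_MB, min_alloc_mb)
    left = node_mb - am
    slots = left // map_container if left > 0 else 0
    return {"am_mb": am, "map_mb": map_container,
            "left_mb": left, "map_slots": slots}


# Original settings: 2048 MB node, minimum allocation left at its default of 1024 MB.
print(headroom_after_am(2048, 1024))
# Tuned settings: 3072 MB node, minimum allocation 256 MB.
print(headroom_after_am(3072, 256))
```

Under these assumptions the original settings let the ApplicationMaster take the whole 2048 MB, leaving no room for a map task, which matches a job stuck at map 0% reduce 0%. With 3072 MB and a 256 MB minimum, the ApplicationMaster takes 1536 MB and one 1024 MB map container still fits.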

    The YARN configuration parameters are described in detail at: http://dongxicheng.org/mapreduce-nextgen/hadoop-yarn-configurations-resourcemanager-nodemanager/

    However, on CDH Hadoop the same kind of settings left the job hanging at "Running job".

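    (4) Final solution:

    Simply delete the resource-tuning settings from yarn-site.xml, so that YARN falls back to its default values for them.

    yarn-site.xml before the change: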

    <configuration>
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
      </property>
      <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>2560</value>
      </property>
      <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>2</value>
      </property>
      <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>256</value>
      </property>
    </configuration>

    yarn-site.xml after the change:

    <configuration>
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
      <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
      </property>
    </configuration>
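To see what the deletion changes, it can help to print the effective value of each resource property, falling back to the default when the property is absent. This is a minimal sketch: the defaults in `YARN_DEFAULTS` are the ones I believe yarn-default.xml ships for Hadoop 2.7 and should be checked against your version, and `effective_resources` is a hypothetical helper, not a Hadoop API.

```python
import xml.etree.ElementTree as ET

# Assumed defaults from yarn-default.xml (Hadoop 2.7); verify for your release.
YARN_DEFAULTS = {
    "yarn.nodemanager.resource.memory-mb": 8192,
    "yarn.nodemanager.resource.cpu-vcores": 8,
    "yarn.scheduler.minimum-allocation-mb": 1024,
}


def effective_resources(xml_text):
    """Return the effective value of each property listed in YARN_DEFAULTS.

    A property that is missing from the XML gets its assumed default.
    """
    root = ET.fromstring(xml_text)
    configured = {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }
    return {
        name: int(configured.get(name, default))
        for name, default in YARN_DEFAULTS.items()
    }
```

Passing the contents of the two yarn-site.xml files above to `effective_resources` would show the node going from 2560 MB, 2 vcores and a 256 MB minimum allocation to the assumed defaults of 8192 MB, 8 vcores and 1024 MB. As far as I know, this Hadoop version does not check the memory figure against the machine's physical RAM, so the NodeManager may advertise more memory than the host actually has.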

    Restart the ResourceManager and NodeManager daemons and run the pi example again; the job now completes successfully.

    Reference: http://stackoverflow.com/questions/24481439/cant-run-a-mapreduce-job-on-hadoop-2-4-0
