Why Datanode is Denied Communication With Namenode

2017-12-20 14:51:08

If you are setting up HDFS and are stuck on an error saying the datanode was denied communication with the namenode, like this one:

2017-12-20 14:26:26,108 INFO org.apache.hadoop.ipc.Server: IPC Server handler 29 on 8020, call Call#9 Retry#0 org.apache.hadoop.hdfs.server.protocol.DatanodeProtocol.registerDatanode from 172.16.56.238:27481
org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=172.16.56.238, hostname=172.16.56.238): DatanodeRegistration(0.0.0.0:50010, datanodeUuid=95e30839-4e7b-4918-98bb-f125b0c932c7, infoPort=50075, infoSecurePort=0, ipcPort=50020, storageInfo=lv=-57;cid=CID-d26bb3c5-ffa5-466c-a3be-fe16d8b84e73;nsid=1832233914;c=1513750982825)
at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:952)
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.registerDatanode(BlockManager.java:2014)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:3656)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1418)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:101)
at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:30583)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:868)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:814)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2603)

Many different configuration problems can trigger it, but in the end they all boil down to one thing, and that is what I am going to explain. It cost me a full day to fix this problem when I was running test-kitchen to provision Hadoop servers.

So we have a namenode up and running, and now we boot up a datanode. We have the defaultFS correct, so the datanode knows where the namenode is, and it tries to connect. The connection is made by IP, so the namenode only sees the datanode’s IP. The problem is in this step. The namenode has a list of hostnames that are not allowed to connect to it. This list can be empty, but the namenode runs the check anyway, so it does a reverse DNS lookup to find the hostname that belongs to the IP. If the lookup fails, the namenode throws the exception you saw above (note hostname=172.16.56.238 in the log: the lookup only returned the IP again), and this is the source of all the problems.
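To see what the namenode host gets back for a datanode’s IP, you can run the same kind of lookup yourself. The sketch below is not Hadoop’s actual code; it only uses the JDK resolver, and the class and method names are made up for illustration. It assumes that an unresolved lookup shows up as the IP being echoed back as the hostname, which is what the log above shows.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Hadoop: run it on the namenode host.
public class ReverseLookupCheck {

    // True when the reverse lookup produced a real name,
    // i.e. something other than the IP address echoed back.
    public static boolean isResolved(String ip, String hostname) {
        return hostname != null && !hostname.equals(ip);
    }

    // Reverse lookup with the JDK resolver. getCanonicalHostName()
    // returns the textual IP when no hostname can be found.
    public static String reverseLookup(String ip) throws UnknownHostException {
        return InetAddress.getByName(ip).getCanonicalHostName();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Default to the datanode IP from the log above.
        String ip = args.length > 0 ? args[0] : "172.16.56.238";
        String hostname = reverseLookup(ip);
        System.out.println("ip=" + ip + ", hostname=" + hostname
                + ", resolved=" + isResolved(ip, hostname));
    }
}
```

If this prints the IP as the hostname, the namenode host cannot reverse-resolve that datanode, and registration will most likely be denied in the same way.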

So basically, we have two options to fix this:

  1. Fix reverse DNS for the namenode, so that it can resolve the datanode’s IP to a hostname.
  2. Decide that blocking by hostname makes no sense in your case and turn off the check.

Well, if your cluster is in a private network (which it usually is), what is the chance that a datanode trying to connect to your namenode is not one of yours? Not big. So let’s just turn it off.

Add this setting to the namenode’s hdfs-site.xml, inside the <configuration> element, then restart the namenode so it picks up the change:

<property>
  <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
  <value>false</value>
</property>
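Since it is easy to put the property in the wrong file or on the wrong machine, a quick check of what the file really contains can save some time. The following is a minimal sketch that uses only the JDK’s XML parser, not Hadoop’s own Configuration class. The class name is hypothetical, the XML is inlined for the example, and it simply returns the first matching property, which is a simplification of how Hadoop merges its configuration files.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: looks up one property in Hadoop-style XML.
public class HostnameCheckSetting {
    public static final String KEY =
            "dfs.namenode.datanode.registration.ip-hostname-check";

    // Returns the value of the first property with this name, or null if absent.
    public static String readProperty(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                if (name.equals(text(prop, "name"))) {
                    return text(prop, "value");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse configuration XML", e);
        }
    }

    // Trimmed text of the first child element with the given tag, or null.
    private static String text(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Inlined stand-in for the namenode's hdfs-site.xml.
        String xml = "<configuration><property>"
                + "<name>" + KEY + "</name><value>false</value>"
                + "</property></configuration>";
        System.out.println(KEY + " = " + readProperty(xml, KEY));
    }
}
```

A null result means the property is not in the file at all, in which case the namenode falls back to its default and keeps doing the hostname check.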

It was really confusing for me at first, because it is not clear whether the denied communication is the result of a misconfiguration on the datanode or on the namenode. There are some Stack Overflow entries about this, but they cover single-machine setups, and they confused me even more because their solutions did not work for me. If you are hitting the same trouble, I hope this post has helped you understand a bit of the picture.


