hadoop 源码分析(三) hadoop RPC 机制

Hadoop 通信机制采用自己编写的RPC. 相比于其他复杂的rpc框架着实清爽了许多.rpc在hadoop中扮演的角色是通信和数据传输在client和server端,以及datanode和namenode 心跳数据以及jobTracker 和taskTracker 通信

1. Client 与 server 端通信采用Writable 序列化形式.因此hadoop中信息的传递必须继承自writable 接口,writable 接口有两个方法 write 和read
2. Client 端通过调用Call 方法,将消息序列化为writable 形式与server端通信
3. Client 调用sendPing() 到server端.每隔一定时间,Ping时间间隔通过ipc.ping.interval 配置
4. connection方法为多路复用.多个call请求公用一个call方法,通过addCall( ) 将call 加入hash 的call队列中，但是response则单独处理,call 队列 Hashtable<Integer, Call> calls = new Hashtable<Integer, Call>()
5. Server 端通过NIO方式将serveraddress bind到lister.
6. Reader为读入监听到的动作key 交给doRead 去读出来

其实hadoop的RPC 比较简单,无非就是通过wirtable 序列化在client 和server 端传输数据.其中包括心跳检测.client 传参数给服务端代理执行器方法等,jobClient 代理直接JobTracker的方法其中传参数的协议就是通过RPC 序列化参数传给服务端

/** Get a connection from the pool, or create a new one and add it to the
   * pool.  Connections to a given ConnectionId are reused. */
   private Connection getConnection(ConnectionId remoteId,
                                   Call call)
                                   throws IOException, InterruptedException {
    if (!running.get()) {
      // the client is stopped
      throw new IOException("The client is stopped");
    }
    Connection connection;
    /* we could avoid this allocation for each RPC by having a  
     * connectionsId object and with set() method. We need to manage the
     * refs for keys in HashMap properly. For now its ok.
     */
    do {
      synchronized (connections) {
        connection = connections.get(remoteId);
        if (connection == null) {
          connection = new Connection(remoteId);
          connections.put(remoteId, connection);
        }
      }
    } while (!connection.addCall(call));
    
    //we don't invoke the method below inside "synchronized (connections)"
    //block above. The reason for that is if the server happens to be slow,
    //it will take longer to establish a connection and that will slow the
//entire system down.
// setupIOstreams 方法建立IO通道.client和server 建立链接
    connection.setupIOstreams();
    return connection;
  }

hadoop 源码分析(三) hadoop RPC 机制

猜你喜欢