5105 pa2 Distributed Hash Table based on Chord


1 Design document

1.1 System overview

We implemented a Book Finder System using a distributed hash table (DHT) based on the Chord protocol. Using this system, the client can set and get the book’s information (title and genre for simplicity) using the DHT. The whole system is composed of 1 supernode, 1 client (or more clients), and several compute nodes.

The client will deal with Set and Get request. It will get a node address from the supernode first. Then it sends the request to the node via Thrift.

The supernode could listen to requests from the client, and record the information of nodes. When a node wants to join the DHT, it contacts the SuperNode using Thrift. The SuperNode will then return one of the nodes information. The client will contact the SuperNode, and SuperNode will return node information randomly chosen.

The nodes will store the book information and keep the DHT table. Each node will maintain a predecessor pointer, a successor pointer, and a FingerTable for fast searching in chord ring. The FingerTable will be updated when a new node is added into the system. Also, the data will be stored on different node based on the hash value of BookTitle. When client do a get/set operation, the operation will be forwarded to appropriate node based on searching result in FingerTable.

pa2要求实现一个分布式key-value hash table,存储图书名和对应的类别供用户查询(就set、get两种操作)。系统由一个client,一个supernode,若干个node组成。DHT基于Chord协议。

扫描二维码关注公众号,回复: 6999281 查看本文章

1.2 Assumptions

We made these assumptions in this system:

  1. Each Node can either run on same or different machine on its own port.
  2. More than 2 Thrift Interface files are needed (for SuperNode and Nodes).
  3. The Nodes will act as client of SuperNode for joining phase for forming DHT and

will act as a server for handling requests from the client.

  1. SuperNode should not maintain any state about the DHT (only the list of nodes)
  2. The genre will be updated if a client sets a different genre for a book title.
  3. The system does not need to be persistent in this project.
  4.  The number of nodes for DHT can be set when the SuperNode starts as a

parameter. (This will let the SuperNode know the DHT is ready)

  1. No node failures or nodes leaving the DHT after they've joined.

1.3 Component design

1.3.1 Common part

In the common part, we defined a Thrift struct Address, which is used to describe a node (IP, port, NodeID).

1.3.2 Client

The client accepts a task from users.

To initiate it, input  “java -cp ".:/usr/local/Thrift/*" Client <serverIP> <serverPort>”, for example,  “java -cp ".:/usr/local/Thrift/*" Client csel-kh1250-01 9090”.

To set a book with the genre, input Set “Book_title” “Genre” (“” are required), for example, Set “Harry Potter” “Magic”. The server will return the trace and the node who stores this book.

To reset a book with a genre, just input Set “Book_title” “Genre” (“” are required) again, for example, Set “Harry Potter” “Magic and Children”.

To set books and genres with a file. Input Set “filename”, for example, Set “../shakespeares.txt”.

To get books’ genres, input Get “Book_title”, for example, Get “Harry Potter”. If the book exists, the server will return trace with the node who stores this book, and its genre. If the book does not exist, the server will return an error message.

To quit the program, type in Quit.

1.3.3 Supernode

To initiate the supernode, input java -cp ".:/usr/local/Thrift/*" SuperNode Port NodeNumber, for example, “java -cp ".:/usr/local/Thrift/*" SuperNode 9090 5”.

Also, we support user-defined chord length. The command-line will be java -cp ".:/usr/local/Thrift/*" SuperNode Port NodeNumber ChordLen. Then the range of chord ring will be 0-(2^ChordLen). ChordLen will be 7 if it is not defined explicitly.

The supernode implements the following methods:

1. Join(IP, Port): When a node wants to join the DHT, the SuperNode will then return one of nodes information to implement updatedht process. If the SuperNode is busy in join process of another node, it will return a “NACK” to the requesting node and let the node wait.

2. PostJoin(IP, Port): After the node is done to join the DHT, it should notify the SuperNode about it. The supernode will then add the node information to the nodelist and allow other nodes to join the DHT.

3. GetNode(): The client will contact the SuperNode and it will return node information randomly chosen. The client may contact SuperNode only once when the client is running or every time when the client sends a request for testing purpose.

4. GetPossibleKey(), after the node call join, it can call the GetPossibleKey() to get the key that can be assigned to it.

5. GetHashPara(), the node can get the parameter of hash function from it.

1.3.4 Node

To initiate the node, input  “java -cp ".:/usr/local/Thrift/*" Node <serverIP> <serverPort> <nodePort>”, for example,  “java -cp ".:/usr/local/Thrift/*" Node csel-kh1250-01 9090 9092”.

To make the supernode ready for client, it should get NodeNumber Nodes (For example, 5 postjoin calls).

The Node will contain an interface for the client and other nodes. Following calls to the Node are implemented:

1. Set(Book_title, Genre, chain): When a client wants to set a book title and a genre, it contacts the Node using Thrift. The node will check whether it needs to store information locally or not. If it is not the node for the book title, it will forward the request to other nodes recursively. The chain is used for tracing involved nodes.

2. Get(Book_title, chain): When a client wants to know a genre with a book title, it contacts the Node using Thrift. The Node will check whether it is the node for the book title. If not, it will forward the request to other nodes recursively. The chain is used for tracing involved nodes.

3. UpdateDHT(Address NN, int IDX, List<Long> chain): On the new node, The IDX item in FingerTable will be updated to node NN. Then the new node will contact to involved nodes in the finger table to let them update DHT. When the node finishes calling to all nodes, it will let the SuperNode know that it is done to join by calling PostJoin().

4. HashKey(String key, int MODbit): calculate the hash value of key string.

5. FindSuccessor(ID): Find successor of an arbitrary point ID in chord ring.

6. FindPredecessor(ID): Find predecessor of an arbitrary point ID in chord ring.

7. FindClosetPrecedingFinger(ID): Find the closet predecessor of ID by searching in FingerTable

8. InSet(long _x, long _i, long _y, int ll, int rr): Check whether _i is in the (_x, _y) set in chord ring clockwise order.

9. InitNode(): Initialize all the variables in the new node. Find its predecessor and successor, and call UpdateDHT().


2 User document

2.1 How to compile

We have written a make script to compile the whole project.

cd pa2/src

./make.sh

2.2 How to run the project

1.    Run supernode
cd pa2/src/
java -cp ".:/usr/local/Thrift/*" SuperNode Port NodeNumber
        <Port>: The port of super
        <NodePort>: The port of node
            <NodeNumber>: Number of nodes
    Eg. java -cp ".:/usr/local/Thrift/*" SuperNode 9090 5
      2.  Run compute node
    Start compute node on 5 different machines.
    cd pa2/src/
    java -cp ".:/usr/local/Thrift/*" Node <serverIP> <serverPort> <nodePort>
        <ServerIP>: The ip address of server
        <ServerPort>: The port of server
        <NodePort>: The port of node
    Eg:    java -cp ".:/usr/local/Thrift/*" Node csel-kh1250-01 9090 9092
      3. Run client
    cd pa2/src/
    java -cp ".:/usr/local/Thrift/*" Client <serverIP> <serverPort>
<ServerIP>: The ip address of supernode
        <ServerPort>: The port of supernode
E.g. java -cp ".:/usr/local/Thrift/*" Client csel-kh1250-01 9090
Set "Harry Potter" "magic"
Get “Harry Potter”
Get “Harry”
Set “Harry Potter” “child”
Set “../shakespeares.txt”

2.3 What will happen after running

            The results and log(node searching trace) will be output on the screen. You will be asked to input the next command.


3 Testing Document

3.1 Testing Environment

Machines:

We use 7 machines to perform the test, including 1 supernode machine (csel-kh1250-01), 5 computeNode machines (csel-kh4250-03, csel-kh4250-01, csel-kh4250-22, csel-kh4250-25, csel-kh4250-34), 1 client machine (csel-kh1250-03).

Test Set:

We use a test set (../shakespeares.txt) including 42 items, totally 1.1 kB. The data uses a shared directory via NSF.

Logging:

Trace Logging is output on the window.

Testing Settings:

We test positive test case – where the book title is present in the DHT and negative test case – where the book title is not present in the DHT. And also other tests.

3.2 Join and PostJoin

The successfully joined node will print its ID, successor, predecessor, and FingerTable as follows:

NODE INFO: Successfully joined SuperNode. My ID is 100, and ChordLen=7
Returned JointRes==csel-kh4250-25.cselabs.umn.edu : 9092
My Successor is Address(ip:csel-kh4250-03.cselabs.umn.edu, port:9092, ID:0) . My Predecessor is Address(ip:csel-kh4250-25.cselabs.umn.edu, port:9092, ID:75)
__________________________________________________
| Print Finger Table                             |
__________________________________________________
|| 1 || Address(ip:csel-kh4250-03.cselabs.umn.edu, port:9092, ID:0) || 101 ||
|| 2 || Address(ip:csel-kh4250-03.cselabs.umn.edu, port:9092, ID:0) || 102 ||
|| 3 || Address(ip:csel-kh4250-03.cselabs.umn.edu, port:9092, ID:0) || 104 ||
|| 4 || Address(ip:csel-kh4250-03.cselabs.umn.edu, port:9092, ID:0) || 108 ||
|| 5 || Address(ip:csel-kh4250-03.cselabs.umn.edu, port:9092, ID:0) || 116 ||
|| 6 || Address(ip:csel-kh4250-01.cselabs.umn.edu, port:9092, ID:25) || 4 ||
|| 7 || Address(ip:csel-kh4250-22.cselabs.umn.edu, port:9092, ID:50) || 36 ||
__________________________________________________

When another new node is added and the Finger Table of this node is updated, it will print its updated Finger Table again.

Supernode output:

Postjoin csel-kh4250-03.cselabs.umn.edu:9092
Postjoin csel-kh4250-01.cselabs.umn.edu:9092
Postjoin csel-kh4250-22.cselabs.umn.edu:9092
Postjoin csel-kh4250-25.cselabs.umn.edu:9092
Postjoin csel-kh4250-34.cselabs.umn.edu:9092

Stability Test: Join multiple nodes at the same time:

Node output:

NODE INFO: Try joining SuperNode…
NODE INFO: Try joining SuperNode…
NODE INFO: Try joining SuperNode…
NODE INFO: Try joining SuperNode…

In this case, only one node will join first, while other nodes will wait for the change for joining.

3.2 Set Book and Genre

1) With Book and Genre as input

Set "Harry Potter" "magic"

[Harry Potter is set on machine 0 with hash value= 120]Trace(75, 100, 0, )
Succeed to set: Harry Potter magic

When doing Set/Get task, the client command line will return its result, and all the nodes involved (including the node directly communicate with client, the nodes passed when forwarding, and the node physically saving this record). All the visited nodes will be successively printed in Trace() field.

2) With Filename

Set "../shakespeares.txt"

[All's Well That Ends Well is set on machine 50 with hash value= 35]Trace(75, 25, 50, )
Succeed to set: All's Well That Ends Well Comedies
[As You Like It is set on machine 25 with hash value= 10]Trace(75, 0, 25, )
Succeed to set: As You Like It Comedies
[The Comedy of Errors is set on machine 50 with hash value= 41]Trace(75, 25, 50, )
Succeed to set: The Comedy of Errors Comedies
[Love's Labor's Lost is set on machine 25 with hash value= 14]Trace(75, 0, 25, )
Succeed to set: Love's Labor's Lost Comedies
[Measure for Measure is set on machine 25 with hash value= 17]Trace(75, 0, 25, )
Succeed to set: Measure for Measure Comedies
[The Merchant of Venice is set on machine 0 with hash value= 102]Trace(75, 100, 0, )
Succeed to set: The Merchant of Venice Comedies
…...

 3) Reset book and genre
Set "Harry Potter" "magic and children"

[Harry Potter is set on machine 0 with hash value= 120]Trace(75, 100, 0, )
Succeed to set: Harry Potter magic and children

In this sample, the key is “Harry Potter” and the value is “magic and children”. It is set on machine 0 because the hash value of its key is 120. When setting this key, it visited node 75, 100, 0 successively.

 3.3 Get

1)  Positive: File exists

Get "Harry Potter"

Succeed to get: Harry Potter.
[Harry Potter:magic and children is get on machine 0 with hash value= 120]Trace(75, 100, 0, )

In this sample, “Harry Potter:magic and children” means the key is “Harry Potter” and the value is “magic and children”. It is found on machine 0 because the hash value of its key is 120. When getting this key, it visited node 75, 100, 0 successively.

2)  Negative: key does not exist

Get "Filename in Dream"

ERROR[Filename in Dream is NOT FOUND on machine 100 with hash value= 89]Trace(75, 100, )
Failed to find: Filename in Dream

In this sample, the key “Filename in Dream” should be on machine 100 because the hash value of its key is 89. When trying to get this key, it visited node 75, 100 successively. But the key does not exist.

3.4 Printing Trace Log

After each Set() and Get() operation, the system will print all nodes it passed when executing, in the trace() field.

Note: According to the algorithm in the reference paper, the find_processor() function will always pass the predecessor of the end point in the path, and then check whether the next point should end this finding process.

3.5 Invalid Command

When typing invalid command, the system will show error message.

WrongCommand "Book"

Wrong Command: please input again

Input Command: Set Book_title Genre|Set Filename|Get Book_title|Quit


下面是写作业过程中的一点笔记:

task: 基于Chord实现一个Hash Table

我负责写Node,队友写SuperNode和Client。总体参考paper[Stoica et al., 2001]上的伪代码

FindSuccessor(key):对chord环上的任意一个key,返回他的successor

FindPredecessor(key):对chord环上的任意一个key,返回他的Predecessor

n.Closet_Preceding_Finger(key):对chord环上的任意一个key,在node n的Finger table中查找它的最近的Predecessor

注意几个坑点:

1. 在UpdateDHT时,因为这个函数是被递归调用的,所以有可能会出现若干个node形成infinite loop的情况。解决方法就是在updateDHT函数入口设置一个List来记录visited过的节点,如果下次遇到重复的就退出。

2. 在FindSuccesor / FindPredecessor里,假设这个书名就该在自己node上 那findsuccessor里就不需要rpc,直接访问自己的的successor就行了。这个需要特别判断一下,不然自己rpc自己是会崩的

3. 在UpdateDHT的伪代码中,if(s属于[n, finger[i].node)) 这里在测试中发现有问题。我们改成了if(finger[i].start属于[finger[i].node, s])解决了问题。它表示对于finger[i]这一项,新节点s比该项的当前值finger[i].node靠后,且是该节点能触到的位置finger[i].start的successor,所以要更新。    (所以顶会论文也会出错么?

FingerTable的实现:

 1 public class FingerItem{    //For FingerTable[i],
 2     Address Node;           //  Node = Successor(FingerTable[i].start)
 3     long ivx; long ivy;     //  [ivx, ivy) = [FingerTable[i].start, FingerTable[i+1].start)
 4     long start;             //  start = (NodeInfo.ID+(long)(Math.pow(2,i-1))) % (long)(Math.pow(2,ChordLen)))
 5     public FingerItem(Address _Node, long _ivx, long _ivy, long _start) {
 6         Node=_Node;
 7         ivx=_ivx;
 8         ivy=_ivy;
 9         start=_start;
10     }
11     public static long FingerCalc(long _n, int _k, int _m) {
12         if (_k == _m + 1)
13             return (_n);
14         else
15             return ((_n + (long) (Math.pow(2, _k - 1))) % (long) (Math.pow(2, _m)));
16     }
17     public void PrintItem(int i){
18         System.out.println("FingerItem"+i+" "+Node+" "+ivx+"_"+ivy+" : "+start);
19     }
20 }
21 
22 FingerItem[ChordLen] FingerTable;

猜你喜欢

转载自www.cnblogs.com/pdev/p/10621547.html