6.824 raft算法与lab2

2022-03-04 00:18:03

0. 序

继续回来填6.824的坑。

1. 关于raft算法

不认可处理removed server来捣乱的方法（好像确实可以，论文中的做法是server会丢掉requestVote，并且不更新term，如果server在minimun election timeout的时间内收到了leader的消息。我之前想到partition网络恢复时也许这个会造成问题，但是如果上面的限制只是针对follower来做的话就没问题了。），不能理解如何实现linearizable semantics（对于read only操作，需要有过半的heartbeats的积极回复，确保leader真的是leader，因为可能存在network partition这样的问题，但是没理解如何保证写操作只执行一次 -> raft存在的一个问题是当client调用rpc没有得到回复时，并不代表这个command一定没有被server执行（只是说明这个command因为一些原因没有被复制到majority of servers中，这个command是否被执行依赖于下一次的leader是否包含了这个command），所以需要额外的机制来保证当客户端重发这个command的时候，这个指令不会被再执行一次。论文的想法是给每个client一个serial number，这样在log中需要额外包含客户id和相应的serial number，这时客户端重发的时候，leader可以检查log，查看这个command是否已经在log中了，以这样一种方式来保证每个指令只会被执行一次，客户端当然也可以在重发前插入一些新的command，只是这样在时间图上重发的指令和新的指令的执行时间有重叠的，并不保证先后顺序，这也不与linearizable semantics相违背）。
一些优化：fast backup: Xterm, Xindex is first entry of that term, length of log，send batch of entry， ~~让没有选举资格的人退出？（收到过半的严格拒绝票）~~，commitInex和lastApplied不要归0？

快速backup
一旦receiver收到leader的log，且与自己的不一样，需要删除掉自己后面的那些log，不然会出问题。
潜在的问题：一个goroutine修改currentTerm到最新的，并且使得自己不是leader，但此时可能主线程正处于发appendEntries的状态，这会出问题！

2. 关于如何使用锁的一些想法

通过copy保证操作所使用数据的原子性

lock()
a = load(data)
unlock()
dealwith(a)

// bad idea
lock()
dealwith(load(data))
unlock()

通过lr.d + sc.d类似的方法保证操作的原子性

a = load(data)
b = dealwith(a)

lock()
if(a == load(data)){
	store(b)
}
unlock()

这一点体现在test2C的rpc的调用中，因为rpc的长延迟，返回时拿到的reply可能早已经不需要了，所以用一个条件加载的技巧检查是否有必要处理这个reply（主要不能在调用rpc时持有锁，这等待的时间太长了）。

3. 结果

Test (2A): initial election ...
  ... Passed --   3.1  3   38   12050    0
Test (2A): election after network failure ...
  ... Passed --   4.9  3   90   20838    0
Test (2A): multiple elections ...
  ... Passed --   6.5  7  578  127360    0
Test (2B): basic agreement ...
  ... Passed --   1.4  3   22    7048    3
Test (2B): RPC byte count ...
  ... Passed --   4.0  3   70  123504   11
Test (2B): agreement despite follower disconnection ...
  ... Passed --   7.2  3  114   34057    8
Test (2B): no agreement if too many followers disconnect ...
  ... Passed --   4.2  5  179   41301    3
Test (2B): concurrent Start()s ...
  ... Passed --   1.4  3   26    8665    6
Test (2B): rejoin of partitioned leader ...
  ... Passed --   7.2  3  150   42958    4
Test (2B): leader backs up quickly over incorrect follower logs ...
  ... Passed --  38.8  5 3594 2242599  102
Test (2B): RPC counts aren't too high ...
  ... Passed --   2.3  3   50   21202   12
Test (2C): basic persistence ...
  ... Passed --   5.2  3   86   25436    6
Test (2C): more persistence ...
  ... Passed --  20.1  5  822  205664   16
Test (2C): partitioned leader and one follower crash, leader restarts ...
  ... Passed --   2.7  3   42   12239    4
Test (2C): Figure 8 ...
  ... Passed --  33.2  5  977  243777   48
Test (2C): unreliable agreement ...
  ... Passed --   9.4  5 1208  452683  246
Test (2C): Figure 8 (unreliable) ...
  ... Passed --  33.4  5 9296 8061473  257
Test (2C): churn ...
  ... Passed --  18.0  5 11014 5331040 2713
Test (2C): unreliable churn ...
  ... Passed --  16.7  5 1458  635069  263
Test (2D): snapshots basic ...
  ... Passed --  10.8  3  636  342092  251
Test (2D): install snapshots (disconnect) ...
  ... Passed --  101.2  3 2244  872847  397
Test (2D): install snapshots (disconnect+unreliable) ...
  ... Passed --  97.8  3 2249  839449  395
Test (2D): install snapshots (crash) ...
  ... Passed --  48.6  3 1346  592869  366
Test (2D): install snapshots (unreliable+crash) ...
  ... Passed --  57.1  3 1436  607187  367
PASS
ok  	6.824/raft	535.047s

官网上显示8分钟的测试时间是比较合理的，A, B, C的测试时间都和参考差不多，D的测试时间有点长了，目前还不清楚这是为啥（比较一下参考的RPC次数，感觉我的实现里面RPC次数有点过高了）

Test (2D): snapshots basic ...
  ... Passed --  10.8  3  636  340820  251
Test (2D): install snapshots (disconnect) ...
  ... Passed --  105.2  3 2259  864456  375
Test (2D): install snapshots (disconnect+unreliable) ...
  ... Passed --  102.8  3 2218  792833  342
Test (2D): install snapshots (crash) ...
  ... Passed --  49.8  3 1306  565826  344
Test (2D): install snapshots (unreliable+crash) ...
  ... Passed --  56.5  3 1484  640480  399
PASS
ok  	6.824/raft	325.170s

4. TODO

对RPC次数进行优化。

码农公寓

0. 序

1. 关于raft算法

2. 关于如何使用锁的一些想法

3. 结果

4. TODO

相关文章