Let's Talk About Mnesia (Part 1)

2022-04-11 00:00:00 cluster, transaction, node, commit, answer

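What is Mnesia

Mnesia is a distributed key-value storage engine with strong transactional guarantees, shipped as part of Erlang's OTP libraries. It stores any Erlang data type conveniently and efficiently. It also lets persistent (disc) tables and in-memory tables be mixed, and does not require every node to use the same storage type.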
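
As a minimal sketch of the point above (not taken from the original post), the snippet below starts Mnesia on a single node, creates an in-memory table, and stores an arbitrary Erlang term inside a transaction. The module name mnesia_kv_sketch and the kv record and table are hypothetical names chosen for illustration. A persistent table would use disc_copies instead of ram_copies, which additionally needs a disc schema created with mnesia:create_schema/1 before Mnesia is started.

```erlang
-module(mnesia_kv_sketch).
-export([demo/0]).

%% Hypothetical record, used only for this illustration.
-record(kv, {key, value}).

demo() ->
    %% Without a disc schema, Mnesia starts with a RAM-only schema.
    ok = mnesia:start(),
    %% ram_copies keeps the table in memory on this node only.
    {atomic, ok} = mnesia:create_table(kv,
        [{attributes, record_info(fields, kv)},
         {ram_copies, [node()]}]),
    %% The value can be any Erlang term: here a map holding a list,
    %% a tuple and a binary.
    Value = #{tags => [a, b], point => {1, 2.5}, blob => <<"raw">>},
    {atomic, ok} = mnesia:transaction(fun() ->
        mnesia:write(#kv{key = demo_key, value = Value})
    end),
    %% Read the row back in a second transaction.
    {atomic, [#kv{value = Stored}]} = mnesia:transaction(fun() ->
        mnesia:read(kv, demo_key)
    end),
    stopped = mnesia:stop(),
    Stored.
```

Calling mnesia_kv_sketch:demo() returns the stored term unchanged, which is the sense in which no mapping layer is needed between Erlang data and the table.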

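How Mnesia implements distributed transactions

Mnesia's transaction model is two-phase commit (2PC), which breaks down into the following steps:

1. Ask every participating node to prepare to commit.

2a. If any node answers no, tell every node that answered yes to roll back.

2b. If all nodes answer yes, tell every node to commit.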
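
The steps above can be sketched as a small pure-Erlang coordinator. This is an illustration of the protocol shape only, not Mnesia's implementation: the module two_pc_sketch, its functions, and the representation of a participant as a {Name, VoteFun} pair are all assumptions made for the example.

```erlang
-module(two_pc_sketch).
-export([decide/1, run/1]).

%% Phase 2 decision: commit only if every vote is yes.
decide(Votes) ->
    case lists:all(fun(V) -> V =:= yes end, Votes) of
        true  -> do_commit;
        false -> do_abort
    end.

%% Participants is a list of {Name, VoteFun}; VoteFun/0 returns yes | no.
run(Participants) ->
    %% Phase 1: ask every participant to prepare and collect its vote.
    Votes = [{Name, VoteFun()} || {Name, VoteFun} <- Participants],
    Outcome = decide([V || {_, V} <- Votes]),
    %% Phase 2: on commit, notify everyone; on abort, only the yes
    %% voters have prepared state that must be rolled back.
    Notified = case Outcome of
        do_commit -> [Name || {Name, _} <- Votes];
        do_abort  -> [Name || {Name, yes} <- Votes]
    end,
    {Outcome, Notified}.
```

For example, two_pc_sketch:run([{a, fun() -> yes end}, {b, fun() -> no end}]) yields {do_abort, [a]}: only a voted yes, so only a is told to roll back.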

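t_commit in mnesia_tm.erl does the preparation work for committing a transaction. As part of that preparation, the arrange function converts the data operations cached in ETS into prepare records and commit records.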

    t_commit(Type) ->
        {_Mod, Tid, Ts} = get(mnesia_activity_state),
        %% Fetch the ETS table that holds this transaction's operations
        Store = Ts#tidstore.store,
        if
            %% Single-level (non-nested) transaction
            Ts#tidstore.level == 1 ->
                intercept_friends(Tid, Ts),
                %% N is number of updates
                case arrange(Tid, Store, Type) of
                    {N, Prep} when N > 0 ->
                        multi_commit(Prep#prep.protocol,
                                     majority_attr(Prep),
                                     Tid, Prep#prep.records, Store);
                    {0, Prep} ->
                        multi_commit(read_only,
                                     majority_attr(Prep),
                                     Tid, Prep#prep.records, Store)
                end;
            true ->
                %% nested commit
                Level = Ts#tidstore.level,
                [{OldMod,Obsolete} | Tail] = Ts#tidstore.up_stores,
                req({del_store, Tid, Store, Obsolete, false}),
                NewTs = Ts#tidstore{store = Store,
                                    up_stores = Tail,
                                    level = Level - 1},
                NewTidTs = {OldMod, Tid, NewTs},
                put(mnesia_activity_state, NewTidTs),
                do_commit_nested
        end.

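The multi_commit function carries out the 2PC part of the transaction. The clause below is the commit path that ordinary Mnesia transactions use by default.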

    %% Uses a simple 2PC:
    %% 1.  Ask every participating node to prepare to commit
    %% 2a. If any node answers no, tell every node that answered yes to roll back
    %% 2b. If all nodes answer yes, tell every node to commit
    multi_commit(sym_trans, _Maj = [], Tid, CR, Store) ->
        %% This lightweight commit protocol is used when all
        %% the involved tables are replicated symmetrically.
        %% Their storage types must match on each node.
        %%
        %% 1  Ask the other involved nodes if they want to commit
        %%    All involved nodes votes yes if they are up
        %% 2a Somebody has voted no
        %%    Tell all yes voters to do_abort
        %% 2b Everybody has voted yes
        %%    Tell everybody to do_commit. I.e. that they should
        %%    prepare the commit, log the commit record and
        %%    perform the updates.
        %%
        %%    The outcome is kept 3 minutes in the transient decision table.
        %%
        %% Recovery:
        %%    If somebody dies before the coordinator has
        %%    broadcasted do_commit, the transaction is aborted.
        %%
        %%    If a participant dies, the table load algorithm
        %%    ensures that the contents of the involved tables
        %%    are picked from another node.
        %%
        %%    If the coordinator dies, each participants checks
        %%    the outcome with all the others. If all are uncertain
        %%    about the outcome, the transaction is aborted. If
        %%    somebody knows the outcome the others will follow.

        %% Split the commit nodes into disc nodes and RAM nodes
        {DiscNs, RamNs} = commit_nodes(CR, [], []),
        %% Enter the pending state; the transaction is not actually committed yet
        Pending = mnesia_checkpoint:tm_enter_pending(Tid, DiscNs, RamNs),
        ?ets_insert(Store, Pending),
        %% Send the commit request to each participating node in turn
        {WaitFor, Local} = ask_commit(sym_trans, Tid, CR, DiscNs, RamNs),
        %% This is a blocking wait, although in practice it does not block forever.
        %% When could it block indefinitely? If a peer node dies after
        %% ask_commit but starts up again before the next ERTS heartbeat.
        %% That is the interesting case.
        %% If every node agreed, Outcome is do_commit.
        {Outcome, []} = rec_all(WaitFor, Tid, do_commit, []),
        ?eval_debug_fun({?MODULE, multi_commit_sym},
                        [{tid, Tid}, {outcome, Outcome}]),
        %% Broadcast the outcome to all disc nodes
        rpc:abcast(DiscNs -- [node()], ?MODULE, {Tid, Outcome}),
        %% Broadcast the outcome to all RAM nodes
        rpc:abcast(RamNs -- [node()], ?MODULE, {Tid, Outcome}),
        case Outcome of
            do_commit ->
                mnesia_recover:note_decision(Tid, committed),
                do_dirty(Tid, Local),
                mnesia_locker:release_tid(Tid),
                ?MODULE ! {delete_transaction, Tid};
            {do_abort, _Reason} ->
                mnesia_recover:note_decision(Tid, aborted)
        end,
        ?eval_debug_fun({?MODULE, multi_commit_sym, post},
                        [{tid, Tid}, {outcome, Outcome}]),
        Outcome;

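Common Mnesia problems and how to solve them

Common problems

Split brain

The legendary transaction that waits forever

Causes

Split brain is mainly caused by an unstable network: two nodes lose contact for a long time, and each concludes that the other is down. Meanwhile, both nodes accept a large volume of writes. The problem surfaces when the two nodes automatically restore cluster communication and the diverged data cannot be merged through transaction decisions.

The endless wait arises when a peer node dies after ask_commit but starts up again before the next ERTS heartbeat. That is the interesting case. In practice it is very unlikely, unless it results from a design mistake or from misusing heart without understanding the Erlang system.

Solutions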

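For split brain there is not much special to say. First, have operations manage the internal network properly; giving a Mnesia cluster dedicated internal switches with hot-standby redundancy is not excessive at all. Second, be prepared for split brain to happen and handle it at the application level; the expert-written unsplit project at https://github.com/uwiger/unsplit is a good reference.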
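
As a hedged sketch of what application-level handling can start from: Mnesia reports a detected partition as an inconsistent_database system event, which a process can receive after calling mnesia:subscribe(system). The module partition_watch_sketch and its functions are hypothetical names, and the sketch only detects and logs the event; the recovery policy itself (which side wins, how data is merged) is left to the application.

```erlang
-module(partition_watch_sketch).
-export([start/0, classify/1]).

%% Map a Mnesia system event to what this sketch cares about.
%% Context is running_partitioned_network or starting_partitioned_network.
classify({mnesia_system_event, {inconsistent_database, Context, Node}}) ->
    {partitioned, Context, Node};
classify(_Other) ->
    ignore.

%% Spawn a watcher process. Mnesia must already be running, because
%% the subscription is made against the local Mnesia instance.
start() ->
    spawn(fun() ->
        {ok, _Node} = mnesia:subscribe(system),
        loop()
    end).

loop() ->
    receive
        Event ->
            case classify(Event) of
                {partitioned, Context, Node} ->
                    %% The application's own recovery logic would go here.
                    io:format("partition detected (~p) with ~p~n",
                              [Context, Node]);
                ignore ->
                    ok
            end,
            loop()
    end.
```

Keeping classify/1 separate from the receive loop makes the event matching easy to exercise without a running cluster.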

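As for the legendary endless wait, I have never run into it myself in the Mnesia clusters I have used. To guard against it: first, understand how the nodes of an ERTS cluster detect whether their peers are alive; see my earlier post at http://my.oschina.net/u/236698/blog/389737. Second, set up an NTP server inside the cluster to keep clocks consistent across all nodes. Third, when using heart, do not set its timeout to exactly the heartbeat interval; set the keepalive time to at least 2.5 times the inter-node heartbeat detection time.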
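
The 2.5x rule can be written down as a small helper. This sketch assumes that the "inter-node heartbeat detection time" is the kernel's net_ticktime (read with net_kernel:get_net_ticktime/0) and that the "keepalive time" is heart's timeout, which is configured through the HEART_BEAT_TIMEOUT environment variable; both readings are interpretations of the advice above, and the module and function names are hypothetical.

```erlang
-module(keepalive_sketch).
-export([min_keepalive/1, current_min_keepalive/0]).

%% Smallest whole number of seconds that is at least 2.5 * TickTime.
%% (5 * T + 1) div 2 rounds 2.5 * T up using integer arithmetic only.
min_keepalive(TickTime) when is_integer(TickTime), TickTime > 0 ->
    (5 * TickTime + 1) div 2.

%% Derive the value from the running node's net_ticktime.
%% get_net_ticktime/0 returns `ignored` on a non-distributed node and
%% {ongoing_change_to, N} while a change is in progress; both are
%% passed back to the caller rather than guessed at.
current_min_keepalive() ->
    case net_kernel:get_net_ticktime() of
        T when is_integer(T) -> {ok, min_keepalive(T)};
        Other -> {unknown, Other}
    end.
```

With a net_ticktime of 60 seconds, min_keepalive(60) gives 150, so under these assumptions heart's timeout would be set to 150 seconds or more.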
