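SETNX is a Redis command whose name is short for "SET if Not eXists". It sets the key to the given value only if the key does not already exist; otherwise it does nothing. SETNX can also be used to implement a lock in Redis.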
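The problem
Before looking at how to build a lock with SETNX, consider the following problem:
Suppose we have a piece of hot data stored in MySQL, with Redis as a cache layer in front of it. At some point the cache entry expires, and a large number of requests arrive at the same moment. Because they all miss the cache, they may all hit MySQL at once, which can cause an avalanche of load on the database. How can we avoid this?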
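The solution
The answer is to use a lock: on a cache miss, a request must take the lock before fetching the data from MySQL, so that one cache expiry leads to only one MySQL query.
Following the SETNX page of the official Redis documentation, we can use: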
SETNX lock.foo <current Unix time + lock timeout + 1>
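If it returns 1, we have acquired the lock. The value stored in the key is the time at which the lock stops being valid, so the lock is treated as expired once lock timeout seconds have passed. At this point we can fetch the data from MySQL and cache it.
If it returns 0, the lock is already held by someone else, and we can either retry later or go do something else.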
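The acquire step above can be sketched as follows. This is a minimal sketch, not a definitive implementation: `FakeRedis` is a hypothetical in-memory stand-in that mimics only the return values of SETNX, SET, GET and DEL so the example runs without a server, and `LOCK_TIMEOUT`, `try_lock`, `get_hot_value` and `load_from_mysql` are illustrative names that do not come from the Redis documentation.

```python
import time

LOCK_TIMEOUT = 10  # seconds; illustrative value, not from the source


class FakeRedis:
    """Hypothetical in-memory stand-in mimicking SETNX, SET, GET and DEL only."""

    def __init__(self):
        self._data = {}

    def setnx(self, key, value):
        # Mirror SETNX: write only if the key is absent, reply 1 or 0.
        if key in self._data:
            return 0
        self._data[key] = value
        return 1

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


def try_lock(r, key, now=None):
    """Return True if SETNX created the lock key, False if it is already held."""
    now = time.time() if now is None else now
    expires_at = int(now) + LOCK_TIMEOUT + 1
    return r.setnx(key, expires_at) == 1


def get_hot_value(r, load_from_mysql):
    """On a cache miss, only the lock holder queries the database."""
    cached = r.get("foo")
    if cached is not None:
        return cached
    if not try_lock(r, "lock.foo"):
        return None  # lock held elsewhere: the caller may retry later
    try:
        value = load_from_mysql()  # hypothetical loader supplied by the caller
        r.set("foo", value)
        return value
    finally:
        r.delete("lock.foo")  # release the lock with DEL
```

Returning `None` when the lock is taken leaves the retry policy to the caller, which matches the "retry or do something else" choice described above.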
Handling deadlocks
Here is how the original documentation describes the deadlock problem:
In the above locking algorithm there is a problem: what happens if a client fails, crashes, or is otherwise not able to release the lock? It's possible to detect this condition because the lock key contains a UNIX timestamp. If such a timestamp is equal to the current Unix time the lock is no longer valid. When this happens we can't just call DEL against the key to remove the lock and then try to issue a SETNX, as there is a race condition here, when multiple clients detected an expired lock and are trying to release it.
Suppose every client, after SETNX returns 0, waits for a while, then issues DEL lock.foo and then calls SETNX again. Would that cause a problem?
C1 and C2 read lock.foo to check the timestamp, because they both received 0 after executing SETNX, as the lock is still held by C3 that crashed after holding the lock.
C1 sends DEL lock.foo
C1 sends SETNX lock.foo and it succeeds
C2 sends DEL lock.foo
C2 sends SETNX lock.foo and it succeeds
both C1 and C2 acquired the lock because of the race condition.
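The interleaving above can be replayed step by step. This is a minimal sketch under stated assumptions: `FakeRedis` is a hypothetical in-memory stand-in that mimics only SETNX, GET and DEL, the clients are simulated sequentially in one thread, and the timestamps (100, 200) and the 10-second timeout are made-up illustrative values.

```python
class FakeRedis:
    """Hypothetical in-memory stand-in mimicking SETNX, GET and DEL only."""

    def __init__(self):
        self._data = {}

    def setnx(self, key, value):
        # Mirror SETNX: write only if the key is absent, reply 1 or 0.
        if key in self._data:
            return 0
        self._data[key] = value
        return 1

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        return 1 if self._data.pop(key, None) is not None else 0


def replay_race():
    """Replay the DEL + SETNX interleaving and report who thinks it holds the lock."""
    r = FakeRedis()
    now = 200              # illustrative "current Unix time"
    new_expiry = now + 11  # now + lock timeout (10) + 1

    r.setnx("lock.foo", 100)  # C3 took the lock, then crashed; 100 is in the past

    # C1 and C2 both get 0 from SETNX and both read the expired timestamp.
    c1_expired = r.setnx("lock.foo", new_expiry) == 0 and r.get("lock.foo") < now
    c2_expired = r.setnx("lock.foo", new_expiry) == 0 and r.get("lock.foo") < now

    c1_acquired = c2_acquired = False
    if c1_expired:
        r.delete("lock.foo")  # removes C3's stale lock
        c1_acquired = r.setnx("lock.foo", new_expiry) == 1
    if c2_expired:
        r.delete("lock.foo")  # removes C1's fresh lock, not C3's stale one
        c2_acquired = r.setnx("lock.foo", new_expiry) == 1
    return c1_acquired, c2_acquired
```

The flaw is that C2 decides to delete based on a value it read earlier; by the time its DEL runs, the key already belongs to C1.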
So the problem is real: when C1 and C2 arrive together, both may end up holding the lock. Is there a way around this?
Fortunately, it's possible to avoid this issue using the following algorithm. Let's see how C4, our sane client, uses the good algorithm:
C4 sends SETNX lock.foo in order to acquire the lock
The crashed client C3 still holds it, so Redis will reply with 0 to C4.
C4 sends GET lock.foo to check if the lock expired. If it is not, it will sleep for some time and retry from the start.
Instead, if the lock is expired because the Unix time at lock.foo is older than the current Unix time, C4 tries to perform:
GETSET lock.foo <current Unix timestamp + lock timeout + 1>
Because of the GETSET semantic, C4 can check if the old value stored at key is still an expired timestamp. If it is, the lock was acquired.
If another client, for instance C5, was faster than C4 and acquired the lock with the GETSET operation, the C4 GETSET operation will return a non expired timestamp. C4 will simply restart from the first step. Note that even if C4 set the key a bit a few seconds in the future this is not a problem.
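The key is the GETSET command: it atomically writes the new timestamp and returns the old value. By checking the returned value, a client can tell whether another client has already replaced the timestamp. If the returned value is still an expired timestamp, this client has acquired the lock; if it is a non-expired timestamp, another client got there first and this client must start over. This guarantees that two clients can never both conclude that they hold the lock.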
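The algorithm quoted above can be sketched as a single acquire attempt. This is a minimal sketch under stated assumptions, not the documentation's own code: `FakeRedis` is a hypothetical in-memory stand-in that mimics only SETNX, GET and GETSET and stores integers (a real client would typically return strings or bytes that need converting with `int()`), and `LOCK_TIMEOUT` and `acquire` are illustrative names. Treating a missing key as "retry" or "acquired" is an assumption that the quoted text does not cover.

```python
LOCK_TIMEOUT = 10  # seconds; illustrative value, not from the source


class FakeRedis:
    """Hypothetical in-memory stand-in mimicking SETNX, GET and GETSET only."""

    def __init__(self):
        self._data = {}

    def setnx(self, key, value):
        # Mirror SETNX: write only if the key is absent, reply 1 or 0.
        if key in self._data:
            return 0
        self._data[key] = value
        return 1

    def get(self, key):
        return self._data.get(key)

    def getset(self, key, value):
        # Mirror GETSET: store the new value and return the previous one.
        old = self._data.get(key)
        self._data[key] = value
        return old


def acquire(r, key, now):
    """Make one attempt to take the lock; the caller retries when this returns False."""
    new_expiry = now + LOCK_TIMEOUT + 1
    if r.setnx(key, new_expiry) == 1:
        return True  # the key was free
    current = r.get(key)
    if current is None or current >= now:
        # Lock is still valid (or vanished between the calls; assumption: retry).
        return False
    # The stored timestamp is older than now, so try to take over the lock.
    old = r.getset(key, new_expiry)
    # An expired old value means nobody beat us to it. Assumption: a missing
    # old value also counts, because our write then created the key.
    return old is None or old < now
```

A caller would wrap `acquire` in a loop that sleeps between attempts, for example `while not acquire(r, "lock.foo", int(time.time())): time.sleep(0.1)` after importing `time`; the sleep interval is a design choice the source does not specify.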