These are notes on Fallacies of Distributed Computing Explained.
The eight fallacies of distributed computing:
- The network is reliable.
- Latency is zero.
- Bandwidth is infinite.
- The network is secure.
- Topology doesn't change.
- There is one administrator.
- Transport cost is zero.
- The network is homogeneous.
Let's start with the first one.
1. The network is reliable
The network is unreliable, so when designing software we need to consider:
- retry
- acknowledge important messages
- identify/ignore duplicates (deduplication), or make operations idempotent
- reorder messages
- verify message integrity
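The checklist above can be sketched roughly as follows. This is a minimal illustration, not something from the article: `Message`, `Receiver`, `send_with_retry`, and the `channel` callable are hypothetical names, and the network is modeled as a plain function that returns `True` when an acknowledgement comes back.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    msg_id: str    # unique id, used to detect duplicates
    seq: int       # sequence number, used to restore order
    payload: bytes
    digest: str    # SHA-256 of the payload, used to verify integrity


def make_message(msg_id: str, seq: int, payload: bytes) -> Message:
    return Message(msg_id, seq, payload, hashlib.sha256(payload).hexdigest())


class Receiver:
    """Hypothetical receiver: verifies, deduplicates, and reorders messages."""

    def __init__(self) -> None:
        self.seen = set()      # ids of messages already accepted
        self.pending = {}      # seq -> message waiting for earlier ones
        self.next_seq = 0
        self.delivered = []    # payloads handed to the application, in order

    def receive(self, msg: Message) -> bool:
        """Return True as the acknowledgement; False means 'please resend'."""
        if hashlib.sha256(msg.payload).hexdigest() != msg.digest:
            return False                       # corrupted: do not acknowledge
        if msg.msg_id in self.seen:
            return True                        # duplicate: acknowledge, ignore
        self.seen.add(msg.msg_id)
        self.pending[msg.seq] = msg
        while self.next_seq in self.pending:   # deliver in sequence order
            self.delivered.append(self.pending.pop(self.next_seq).payload)
            self.next_seq += 1
        return True


def send_with_retry(channel, msg: Message, max_attempts: int = 5) -> bool:
    """Resend until the channel reports an acknowledgement or attempts run out."""
    for _ in range(max_attempts):
        if channel(msg):
            return True
    return False
```

Because the receiver ignores ids it has already seen, a retry after a lost acknowledgement is harmless, which is what makes blind retrying safe here.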
2. Latency is zero
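Latency is how long it takes data to travel from one place to another.
Bandwidth determines how much data can be transferred at the same time.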
Latency is how much time it takes for data to move from one place to another
bandwidth which is how much data we can transfer during that time
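Latency is harder to solve than bandwidth. It is determined by how fast information can propagate, and since the speed of light is constant, the lower bound on latency is fixed.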
"But I think that it’s really interesting to see that the end-to-end bandwidth increased by 1468 times within the last 11 years while the latency (the time a single ping takes) has only been improved tenfold. If this wouldn’t be enough, there is even a natural cap on latency. The minimum round-trip time between two points of this earth is determined by the maximum speed of information transmission: the speed of light. At roughly 300,000 kilometers per second (3.6 * 10E12 teraangstrom per fortnight), it will always take at least 30 milliseconds to send a ping from Europe to the US and back, even if the processing would be done in real time."
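The 30 ms figure in the quote can be checked with a little arithmetic. The sketch below assumes a one-way distance of 4,500 km, a number picked only because it reproduces the quoted bound; real Europe to US distances vary, and signals in fiber travel slower than light in a vacuum.

```python
SPEED_OF_LIGHT_KM_PER_S = 300_000  # rounded figure used in the quote


def min_round_trip_ms(one_way_km: float) -> float:
    """Lower bound on round-trip time imposed by the speed of light."""
    return 2 * one_way_km * 1000 / SPEED_OF_LIGHT_KM_PER_S


# 4,500 km is an assumed one-way distance, chosen to match the quoted 30 ms.
print(min_round_trip_ms(4_500))  # 30.0
```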
Since latency cannot be avoided, all we can do is reduce the number of messages we send as much as possible.
Taking latency into consideration means you should strive to make as few as possible calls and assuming you have enough bandwidth (which will talk about next time) you'd want to move as much data out in each of this calls.
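A rough cost model shows why fewer, larger calls help. The formula and all the numbers below are illustrative assumptions (each call pays one round trip, and the payload moves at a fixed bandwidth), not measurements from the article.

```python
def total_time_ms(calls: int, round_trip_ms: float,
                  payload_kb: float, bandwidth_kb_per_ms: float) -> float:
    """Simplified model: one round trip per call plus transfer time for all data."""
    return calls * round_trip_ms + payload_kb / bandwidth_kb_per_ms


# Same 1,000 KB of data, sent as 100 small calls or as 1 batched call.
chatty = total_time_ms(calls=100, round_trip_ms=30, payload_kb=1_000, bandwidth_kb_per_ms=10)
batched = total_time_ms(calls=1, round_trip_ms=30, payload_kb=1_000, bandwidth_kb_per_ms=10)
print(chatty, batched)  # 3100.0 130.0
```

In this model the transfer time is identical in both cases; the whole difference comes from paying the round trip 100 times instead of once.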
3. Bandwidth is infinite
There are two main reasons why infinite bandwidth is a fallacy:
- as bandwidth grows, so does the amount of data we transfer;
- packet loss.
One is that while the bandwidth grows, so does the amount of information we try to squeeze through it. VoIP, videos, and IPTV are some of the newer applications that take up bandwidth
The other force at work to lower bandwidth is packet loss (along with frame size).
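The fact that bandwidth is not infinite pushes us to transfer less information, while unavoidable latency pushes us to move as much data as possible in each call. All we can do is make a trade-off between the two.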
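One common way to handle this trade-off is to batch messages, but cap the size of each batch. The helper below is a hypothetical sketch; `batch_by_size` and the `max_batch_bytes` cap are names made up for illustration, and the right cap depends on the actual network.

```python
from typing import Iterable, Iterator, List


def batch_by_size(items: Iterable[bytes], max_batch_bytes: int) -> Iterator[List[bytes]]:
    """Group items into batches whose total size stays within max_batch_bytes.

    An item larger than the cap is sent alone rather than dropped.
    """
    batch: List[bytes] = []
    size = 0
    for item in items:
        # Close the current batch if adding this item would exceed the cap.
        if batch and size + len(item) > max_batch_bytes:
            yield batch
            batch, size = [], 0
        batch.append(item)
        size += len(item)
    if batch:
        yield batch
```

A larger cap means fewer round trips (good for latency) but bigger payloads per call (worse for constrained bandwidth), so the cap is the knob for the trade-off.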
4. The Network is Secure
As an architect you don't have to be a security expert, but you do need to understand security and know how to address it.
Security is usually a multi-layered solution that is handled on the network, infrastructure, and application levels.
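As one small example of the application-level layer, a message can be signed with an HMAC so the receiver can detect tampering. This is only a sketch of one layer: it assumes both sides already share `key` through some secure channel, and it does nothing for confidentiality.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, signature: str) -> bool:
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), signature)


key = b"shared-secret"  # placeholder; a real key would come from secure config
tag = sign(key, b"transfer 10 to alice")
print(verify(key, b"transfer 10 to alice", tag))    # True
print(verify(key, b"transfer 1000 to alice", tag))  # False
```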
5. Topology doesn’t change
This fallacy probably arises because the topology stays unchanged only in a lab environment.
"Topology doesn't change." That's right, it doesn’t--as long as it stays in the test lab.
What this means for us:
- do not depend on specific routes or endpoints
- provide either location transparency or discovery services
Try not to depend on specific endpoints or routes, if you can't be prepared to renegotiate endpoints.
You would want to either provide location transparency (e.g. using an ESB, multicast) or provide discovery services (e.g. a Active Directory/JNDI/LDAP).
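The discovery idea can be sketched with a tiny in-memory registry. `ServiceRegistry` and `call_service` are hypothetical names for illustration; a real system would use something like the directory services mentioned in the quote. The point is that callers ask for a logical name on every call instead of hard-coding an endpoint.

```python
class ServiceRegistry:
    """Hypothetical in-memory discovery service: logical name -> endpoints."""

    def __init__(self) -> None:
        self._endpoints = {}

    def register(self, name: str, endpoint: str) -> None:
        self._endpoints.setdefault(name, []).append(endpoint)

    def deregister(self, name: str, endpoint: str) -> None:
        if endpoint in self._endpoints.get(name, []):
            self._endpoints[name].remove(endpoint)

    def resolve(self, name: str) -> list:
        return list(self._endpoints.get(name, []))


def call_service(registry: ServiceRegistry, name: str, send):
    """Resolve the name at call time and try each endpoint until one answers."""
    last_error = None
    for endpoint in registry.resolve(name):
        try:
            return send(endpoint)
        except ConnectionError as exc:
            last_error = exc  # endpoint unreachable: try the next one
    raise ConnectionError(f"no reachable endpoint for {name!r}") from last_error
```

When a node moves or is replaced, only the registry entry changes; callers keep using the same logical name.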
6. There is one administrator
As long as nothing goes wrong, we don't care whether there is more than one administrator. Once a problem occurs, though, it becomes a real headache.
"Okay, there is more than one administrator. But why should I care?" Well, as long as everything works, maybe you don't care. You do care, however, when things go astray and there is a need to pinpoint a problem (and solve it).
To guard against administrator-related problems, we need to:
- provide tools for monitoring system operations early, while the system is still small
A proactive approach is to also include tools for monitoring on-going operations as well;
To sum up, when there are multiple administrators we will inevitably be constrained by them, and what we can do is help them manage our applications.
To sum up, when there is more than one administrator (unless we are talking about a simple system and even that can evolve later if it is successful), you need to remember that administrators can constrain your options (administrators that sets disk quotas, limited privileges, limited ports and protocols and so on), and that you need to help them manage your applications.
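A monitoring hook for ongoing operations does not have to be elaborate to be useful. The sketch below is an assumption about what such a tool might look like; `OperationMetrics` is a made-up name, and a real deployment would export these numbers to whatever monitoring system the administrators already use.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class OperationMetrics:
    """Hypothetical per-operation counters that administrators can inspect."""

    def __init__(self) -> None:
        self.calls = defaultdict(int)
        self.errors = defaultdict(int)
        self.total_seconds = defaultdict(float)

    @contextmanager
    def track(self, operation: str):
        start = time.perf_counter()
        try:
            yield
        except Exception:
            self.errors[operation] += 1   # count the failure, then re-raise
            raise
        finally:
            self.calls[operation] += 1
            self.total_seconds[operation] += time.perf_counter() - start

    def snapshot(self) -> dict:
        """Return a plain dict suitable for logging or an admin endpoint."""
        return {
            op: {
                "calls": self.calls[op],
                "errors": self.errors[op],
                "total_seconds": self.total_seconds[op],
            }
            for op in list(self.calls)
        }
```

Usage is a single `with metrics.track("save_order"):` around the operation being monitored.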
7. Transport cost is zero
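There are several ways to explain why this statement is a fallacy.
One is that moving data from the application level to the transport level requires encoding it (marshaling), which costs time and resources.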
One way is that going from the application level to the transport level is free. This is a fallacy since we have to do marshaling (serialize information into bits) to get data unto the wire, which takes both computer resources and adds to the latency
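The marshaling cost is easy to observe directly. The snippet below times JSON serialization of a made-up payload; the record shape and the count of 10,000 are arbitrary assumptions, and the timing will differ from machine to machine.

```python
import json
import time

record = {"id": 1, "name": "example", "tags": ["a", "b", "c"]}  # made-up record
payload = [record] * 10_000

start = time.perf_counter()
wire_bytes = json.dumps(payload).encode("utf-8")  # marshal to bytes for the wire
elapsed = time.perf_counter() - start

print(f"{len(wire_bytes)} bytes serialized in {elapsed * 1000:.2f} ms")
```

Both numbers matter: the elapsed time adds to latency, and the byte count is what actually competes for bandwidth.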
The second is that setting up and running a network costs money, and a lot of it.
The second way to interpret the statement is that the costs (as in cash money) for setting and running the network are free. This is also far from being true. There are costs--costs for buying the routers, costs for securing the network, costs for leasing the bandwidth for internet connections, and costs for operating and maintaining the network running. Someone, somewhere will have to pick the tab and pay these costs.
8. The network is homogeneous
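That the network is homogeneous is the last fallacy. We should take care not to depend on proprietary protocols, or integration will become a major headache later.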
It is worthwhile to pay attention to the fact the network is not homogeneous at the application level
Do not rely on proprietary protocols--it would be harder to integrate them later
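One way to act on this advice is to put a widely supported format on the wire, so that peers written in any language can read it. The sketch below uses UTF-8 encoded JSON; the `order` message and its field names are invented for illustration.

```python
import json


def encode_order(order_id: int, amount_cents: int, currency: str) -> bytes:
    """Encode a message as UTF-8 JSON, which most platforms can parse."""
    message = {"order_id": order_id, "amount_cents": amount_cents, "currency": currency}
    return json.dumps(message, sort_keys=True).encode("utf-8")


def decode_order(data: bytes) -> dict:
    return json.loads(data.decode("utf-8"))


wire = encode_order(42, 1999, "EUR")
print(wire)  # b'{"amount_cents": 1999, "currency": "EUR", "order_id": 42}'
```

A language-specific format such as Python's `pickle` would be more convenient inside a single codebase, but it ties every consumer to Python, which is the integration problem the quote warns about.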
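Summary
Distributed systems have been around for many years, yet the problems they face are still the same ones. Worse, many architects still overlook some of these problems at design time. Hopefully the fallacies listed above can help architects avoid some of them when designing systems.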