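LLT (Low Level Test) is usually carried out by developers as self-testing. It covers unit tests (Unit Test), integration tests (Integration Test), module system tests (Module System Test), and system integration tests (BBIT). In practice, the two we care about most are UT (unit tests) and IT (integration tests).
Test Doubles
Test doubles come in five types: dummy, fake, stub, spy, and mock. Here is Martin Fowler's classic description: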
- Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists.
- Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (an in memory database is a good example).
- Stubs provide canned answers to calls made during the test, usually not responding at all to anything outside what's programmed in for the test.
- Spies are stubs that also record some information based on how they were called. One form of this might be an email service that records how many messages it was sent.
- Mocks are what we are talking about here: objects pre-programmed with expectations which form a specification of the calls they are expected to receive.
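A quick summary:
- Dummy: an object that only fills a slot in a parameter list. It is never actually used and has no effect on the test result.
- Fake: a runnable, cut-down implementation of a real system. It may differ from the original in some (even major) ways and must not go to production, but it is well suited to testing and can expose many problems early. An in-memory database is a typical example.
- Stub: supplies data to the object under test and has no behavior of its own. It usually sits upstream of the object under test in the dependency chain.
- Spy: a proxy for the depended-on object. Its behavior is usually provided by the real object it wraps, and the proxy exists so that the test can assert that the program ran correctly. For example, with a spy on an email service, once the call has finished we check how many times it was invoked, and the spy's recorded information answers that.
- Mock: the emphasis is on expectations. We specify in advance how each call should be handled in the various situations it may encounter.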
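To make the first four roles concrete, here is a minimal hand-written sketch in plain Java with no mocking library. Every name in it (`AuditLog`, `RateProvider`, `MailSender`, `UserStore`, `InvoiceService`) is hypothetical and invented for illustration. Mock is left out because it additionally needs expectation checking.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical collaborators, invented for this illustration.
interface AuditLog { void record(String event); }
interface RateProvider { double rateFor(String currency); }
interface MailSender { void send(String to, String body); }
interface UserStore {
    void save(String id, String email);
    Optional<String> emailOf(String id);
}

// Dummy: only fills the constructor slot; it fails loudly if it is ever used.
class DummyAuditLog implements AuditLog {
    public void record(String event) {
        throw new UnsupportedOperationException("dummy must not be called");
    }
}

// Stub: returns a canned value and nothing more.
class StubRateProvider implements RateProvider {
    public double rateFor(String currency) { return 7.0; }
}

// Spy: records how it was called so the test can inspect it afterwards.
class SpyMailSender implements MailSender {
    final List<String> recipients = new ArrayList<>();
    public void send(String to, String body) { recipients.add(to); }
}

// Fake: a working implementation with a shortcut (a HashMap instead of a database).
class InMemoryUserStore implements UserStore {
    private final Map<String, String> emails = new HashMap<>();
    public void save(String id, String email) { emails.put(id, email); }
    public Optional<String> emailOf(String id) { return Optional.ofNullable(emails.get(id)); }
}

// Hypothetical system under test.
class InvoiceService {
    private final AuditLog audit; // never touched on the path exercised below
    private final RateProvider rates;
    private final MailSender mail;
    private final UserStore users;

    InvoiceService(AuditLog audit, RateProvider rates, MailSender mail, UserStore users) {
        this.audit = audit;
        this.rates = rates;
        this.mail = mail;
        this.users = users;
    }

    // Converts the amount, mails the user if an address is on file, returns the converted amount.
    double bill(String userId, double amountUsd) {
        double amount = amountUsd * rates.rateFor("USD");
        users.emailOf(userId).ifPresent(to -> mail.send(to, "You owe " + amount));
        return amount;
    }
}

class TestDoublesDemo {
    public static void main(String[] args) {
        SpyMailSender spy = new SpyMailSender();
        InMemoryUserStore store = new InMemoryUserStore();
        store.save("u1", "u1@example.com");
        InvoiceService service =
                new InvoiceService(new DummyAuditLog(), new StubRateProvider(), spy, store);

        double billed = service.bill("u1", 10.0);
        service.bill("unknown", 5.0); // no address on file, so nothing is sent

        System.out.println(billed);         // amount converted with the stubbed rate
        System.out.println(spy.recipients); // recipients recorded by the spy
    }
}
```

The dummy throws on use, so the test fails immediately if the code path under test starts depending on it.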
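Of these, Dummy and Fake are easy to grasp. The former does nothing and is only a placeholder. The latter can do almost everything: it falls a little short of the real system but still covers most scenarios. As for Spy, we usually do not distinguish it sharply from Stub, and the two can be understood together.
This raises a question. Mock means you state your expectation for every call and write code that says what to do in which situation. Fake seems to mean the same thing, except that you may not have to write the code yourself, because the open-source community offers ready-made ones. So what is the fundamental difference between them?
Keep these three points in mind: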
- Fake => working implementations
- Mock => predefined behavior
- Stub => predefined values
In state verification you have the object under testing perform a certain operation, after being supplied with all necessary collaborators. When it ends, you examine the state of the object and/or the collaborators, and verify it is the expected one.
In behaviour verification, on the other hand, you specify exactly which methods are to be invoked on the collaborators by the SUT, thus verifying not that the ending state is correct, but that the sequence of steps performed was correct.
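The difference between Mock and Stub is that the former is about behavior and the latter about state. Mock-based tests do behavior-based verification, and Stub-based tests do state-based verification, so the distinction comes down to how you verify. The difference between Mock and Fake is whether your unit test needs a working implementation built for it, or only preset responses to certain calls.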
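The two verification styles can be sketched side by side in plain Java. This is only an illustration under assumptions: `Warehouse` and `Order` are hypothetical types, and `MockWarehouse` is a hand-rolled stand-in for what a mocking library would generate, not any library's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collaborator, invented for illustration.
interface Warehouse {
    boolean hasStock(String item, int qty);
    void remove(String item, int qty);
}

// Hypothetical system under test.
class Order {
    private final String item;
    private final int qty;
    private boolean filled;

    Order(String item, int qty) {
        this.item = item;
        this.qty = qty;
    }

    void fill(Warehouse warehouse) {
        if (warehouse.hasStock(item, qty)) {
            warehouse.remove(item, qty);
            filled = true;
        }
    }

    boolean isFilled() { return filled; }
}

// Stub: predefined values, no expectations.
class StubWarehouse implements Warehouse {
    public boolean hasStock(String item, int qty) { return true; }
    public void remove(String item, int qty) { }
}

// Hand-rolled mock: expected calls are declared up front and checked by verify().
class MockWarehouse implements Warehouse {
    private final List<String> expected = new ArrayList<>();
    private final List<String> actual = new ArrayList<>();

    MockWarehouse expect(String call) {
        expected.add(call);
        return this;
    }

    public boolean hasStock(String item, int qty) {
        actual.add("hasStock(" + item + "," + qty + ")");
        return true;
    }

    public void remove(String item, int qty) {
        actual.add("remove(" + item + "," + qty + ")");
    }

    // Fails if the recorded calls differ from the declared ones, in content or order.
    void verify() {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }
}

class VerificationDemo {
    public static void main(String[] args) {
        // State verification: run the operation, then inspect the resulting state.
        Order first = new Order("widget", 2);
        first.fill(new StubWarehouse());
        System.out.println("filled = " + first.isFilled());

        // Behavior verification: declare the call sequence, run, then verify it.
        MockWarehouse mock = new MockWarehouse()
                .expect("hasStock(widget,2)")
                .expect("remove(widget,2)");
        new Order("widget", 2).fill(mock);
        mock.verify();
        System.out.println("call sequence verified");
    }
}
```

The stub-based test would still pass if `Order` stopped calling `remove`, as long as the final state looked right. The mock-based test would fail, which is exactly the trade-off between the two styles.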
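Of course, these concepts are fairly highbrow. In day-to-day work it pays to be more down-to-earth: with ordinary developers, saying "Mock" is enough, and with ordinary testers, just say "stubbing" (打桩).
Common Tools
- JUnit5 + Hamcrest + Mockito
In the era of Spring-centric programming this is the best combination. Others, such as TestNG and PowerMock, are worth a passing look and no more, given the learning cost.
- MockServer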
<dependency>
<groupId>org.mock-server</groupId>
<artifactId>mockserver-netty</artifactId>
<version>5.15.0</version>
</dependency>
MockServer plays the role of the Stub described above. Example use case: simulating the backend responses of a WeChat Official Account.
- Embedded Redis
<dependency>
<groupId>it.ozimov</groupId>
<artifactId>embedded-redis</artifactId>
<version>0.7.3</version>
</dependency>
This is a typical Fake: it supports basic Redis operations, but some advanced features, such as Streams, are not available.
- Wix Embedded MySQL
<dependency>
<groupId>com.wix</groupId>
<artifactId>wix-embedded-mysql</artifactId>
<version>4.6.2</version>
</dependency>
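Considerably more capable than H2, and it lets you pick the MySQL version. 5.7 works very well, while 8.0.x appears to have bugs, in which case MariaDB4j is the recommended alternative.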
- MariaDB4j
<dependency>
<groupId>ch.vorburger.mariaDB4j</groupId>
<artifactId>mariaDB4j-springboot</artifactId>
<version>3.0.1</version>
<scope>test</scope>
</dependency>
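Another very handy Fake. Its default database version is currently 10.2.11, so when pairing it with database tools such as Flyway, check which database versions the tool supports. Flyway 8.5.x is recommended.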
- Test Containers
<dependency>
<groupId>org.testcontainers</groupId>
<artifactId>testcontainers</artifactId>
<version>1.19.1</version>
</dependency>
<dependency>
<groupId>org.testcontainers</groupId>
<artifactId>mysql</artifactId>
<version>1.19.1</version>
</dependency>
Built on Docker images, it can provide all kinds of Fakes. Interested readers can explore it on their own.
- Flapdoodle Embedded MongoDB
<dependency>
<groupId>de.flapdoodle.embed</groupId>
<artifactId>de.flapdoodle.embed.mongo.spring31x</artifactId>
<version>4.9.3</version>
</dependency>
A long-established Fake for MongoDB. Take care to distinguish the 2x and 3x versions.
Further Reading
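Of the tools above, MockServer fills the Stub role for HTTP dependencies. The underlying idea can be sketched with nothing but the JDK's built-in `com.sun.net.httpserver.HttpServer`. To be clear, this is not MockServer's API. The `CannedHttpStub` class, the `/token` path, and the JSON payload are all hypothetical, chosen only to resemble a backend that hands out a token.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// A tiny HTTP stub: every request under the given path gets the same canned JSON.
class CannedHttpStub implements AutoCloseable {
    private final HttpServer server;

    CannedHttpStub(String path, String cannedJson) throws IOException {
        // Port 0 asks the OS for any free port on the loopback interface.
        server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext(path, exchange -> {
            byte[] body = cannedJson.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
    }

    String baseUrl() {
        return "http://127.0.0.1:" + server.getAddress().getPort();
    }

    @Override
    public void close() {
        server.stop(0);
    }
}

class HttpStubDemo {
    // Plain JDK client; in a real test this would be the code under test.
    static String get(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection(Proxy.NO_PROXY);
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical payload, loosely shaped like a token response.
        String canned = "{\"access_token\":\"stub-token\",\"expires_in\":7200}";
        try (CannedHttpStub stub = new CannedHttpStub("/token", canned)) {
            System.out.println(get(stub.baseUrl() + "/token"));
        }
    }
}
```

In a real test, the code under test would be pointed at `baseUrl()` instead of the production host. A dedicated tool such as MockServer adds request matching and verification on top of this basic idea.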
Unit test: Specify and test one point of the contract of a single method of a class. This should have a very narrow and well-defined scope. Complex dependencies and interactions to the outside world are stubbed or mocked.
Integration test: Test the correct inter-operation of multiple subsystems. There is whole spectrum there, from testing integration between two classes, to testing integration with the production environment.
Smoke test (aka sanity check): A simple integration test where we just check that when the system under test is invoked it returns normally and does not blow up.
- Smoke testing is both an analogy with electronics, where the first test occurs when powering up a circuit (if it smokes, it's bad!)...
- ... and, apparently, with plumbing, where a system of pipes is literally filled by smoke and then checked visually. If anything smokes, the system is leaky.
Regression test: A test that was written when a bug was fixed. It ensures that this specific bug will not occur again. The full name is "non-regression test". It can also be a test made prior to changing an application to make sure the application provides the same outcome.
Acceptance test: Test that a feature or use case is correctly implemented. It is similar to an integration test, but with a focus on the use case to provide rather than on the components involved.
System test: Tests a system as a black box. Dependencies on other systems are often mocked or stubbed during the test (otherwise it would be more of an integration test).
Pre-flight check: Tests that are repeated in a production-like environment, to alleviate the 'builds on my machine' syndrome. Often this is realized by doing an acceptance or smoke test in a production-like environment.