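Concept
The prototype pattern is a design pattern, or more precisely a creational pattern. According to Wikipedia:
Its defining trait is that it returns a new instance by "copying" an existing instance rather than constructing a new one. The instance being copied is what we call the "prototype".
I prefer to call the prototype pattern the clone pattern, because a prototype object usually has a clone() member function that copies out new objects (clones).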
class Scorer {
public:
    virtual ~Scorer() {}           // clones are deleted through a base pointer
    virtual Scorer* clone() = 0;
};
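The class diagram of the clone pattern looks like this:
Note: thanks to inheritance, the Client only needs to deal with Prototype, which decouples it from the concrete implementation of the prototype object. The concrete prototype can therefore be either ConcretePrototype1 or ConcretePrototype2.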
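Benefits
Consider the following scenario: constructing an object through a factory requires expensive work (reading a configuration file, for example), so creating a large number of objects that way is very inefficient. This is where the prototype pattern comes in. The factory only creates the first object (the prototype), and every later object is produced simply by calling clone() on the prototype. A direct copy at run time is usually cheap. This is its biggest advantage: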
Prototype pattern refers to creating duplicate object while keeping performance in mind.
Wikipedia gives an interesting example that illustrates this point.
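Another benefit of the clone pattern is that the client never touches the concrete implementation of the prototype object. In a search engine, for instance, the ranking Scorer is tied to business logic, and its details are complex and hard to follow. Making the engine aware of those details would only add complexity to the system. The right approach is for the engine to call the Scorer's clone() interface to produce a new Scorer for each request.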
avoid subclasses of an object creator in the client application, like the abstract factory pattern does.
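I think it has a third benefit as well, one that suits server-side request handling particularly well. Take a search engine: back-end request handling is usually multithreaded, so cloning an object (a Scorer, say) for each request keeps data isolated between threads. Compared with a globally shared singleton object, that is a lot cleaner.
A Scorer example
The code: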
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

class IScorer {
public:
    virtual ~IScorer() {}
public:
    virtual int init() = 0;
    virtual int processQuery(int query_info) = 0;
    virtual int doScore(int doc_info) = 0;
public:
    virtual IScorer *clone() = 0;
    virtual int destroy() = 0;
};

class DemoScorer: public IScorer {
public:
    virtual ~DemoScorer() {
        // delete pGlobalInfo;
        // pGlobalInfo = NULL;
    }
public:
    // acquire the global resource (shared by all queries)
    virtual int init() {
        pGlobalInfo = new int();
        *pGlobalInfo = rand();
        return 0;
    }
    virtual int processQuery(int query_info) {
        fprintf(stdout, "process query[%d] in global[%d]\n", query_info, *pGlobalInfo);
        _queryInfo = query_info;
        return 0;
    }
    virtual int doScore(int doc_info) {
        fprintf(stdout, "\tscore for %d in query [%d], global [%d]\n", doc_info, _queryInfo, *pGlobalInfo);
        return 0;
    }
public:
    // implicit copy constructor: pGlobalInfo is copied as a pointer
    virtual IScorer *clone() {
        return new DemoScorer(*this);
    }
    virtual int destroy() {
        delete this;
        return 0;
    }
private:
    int* pGlobalInfo;
    int _queryInfo;
};

class EngineRunner
{
public:
    EngineRunner() {
        _pScorer = new DemoScorer;
        _pScorer->init();
    }
    ~EngineRunner() {
        delete _pScorer;
    }
    void start() {
        for (int i = 0; i < 3; i++)
        {
            pthread_create(_threadIds+i, NULL, EngineRunner::threadEntry, this);
        }
        for (int i = 0; i < 3; i++)
        {
            pthread_join(_threadIds[i], NULL);
        }
    }
    // one request: clone a scorer, use it, dispose of it
    int threadFun() {
        int query_info = rand();
        IScorer *pScorer = _pScorer->clone();
        pScorer->processQuery(query_info);
        int docNum = 4;
        for (int i=0; i<docNum; ++i) {
            pScorer->doScore(i);
        }
        pScorer->destroy();
        return 0;
    }
public:
    static void * threadEntry(void * arg) {
        EngineRunner* pRunner = (EngineRunner*)arg;
        pRunner->threadFun();
        return NULL;
    }
private:
    IScorer * _pScorer;
    pthread_t _threadIds[3];
};

int main(void)
{
    EngineRunner runner;
    runner.start();
    return 0;
}
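The program above simulates a search engine (EngineRunner) calling a Scorer. It focuses on how the clone pattern is used and omits every other detail.
The search engine (EngineRunner) starts 3 worker threads, each handling user requests. When a request (query_info) arrives, the engine calls clone() to get a Scorer to handle it, and calls destroy() to dispose of it once processing is done: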
int query_info = rand();
IScorer *pScorer = _pScorer->clone();
pScorer->processQuery(query_info);
int docNum = 4;
for (int i=0; i<docNum; ++i) {
    pScorer->doScore(i);
}
pScorer->destroy();
The program above has a flaw: valgrind readily reports a memory leak:
valgrind --leak-check=full ./main
==29000== 4 bytes in 1 blocks are definitely lost in loss record 1 of 1
==29000== at 0x4A06DC7: operator new(unsigned long) (vg_replace_malloc.c:261)
==29000== by 0x400C5C: DemoScorer::init() (in /home/jiye/blog/clone_pattern/main)
==29000== by 0x400A9F: EngineRunner::EngineRunner() (in /home/jiye/blog/clone_pattern/main)
==29000== by 0x400919: main (in /home/jiye/blog/clone_pattern/main)
So let's fix it by simply uncommenting the body of the destructor:
virtual ~DemoScorer() {
    delete pGlobalInfo;
    pGlobalInfo = NULL;
}
But I quickly ran into another problem, a double free:
*** glibc detected *** ./main_v2: double free or corruption (fasttop): 0x00000000073cd030 ***
======= Backtrace: =========
/lib64/libc.so.6[0x32f18722ef]
/lib64/libc.so.6(cfree+0x4b)[0x32f187273b]
./main_v2(__gxx_personality_v0+0x399)[0x400b81]
./main_v2(__gxx_personality_v0+0x201)[0x4009e9]
./main_v2[0x400d44]
./main_v2[0x400d63]
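A closer look shows that the global resource (pGlobalInfo) is only shallow-copied in clone(). The first clone (a DemoScorer) to be destroyed deletes it in its destructor, which leaves the pointer held by the prototype and by every later clone dangling; the next destructor to run frees the same memory again.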
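The straightforward fix is to redefine DemoScorer so that it distinguishes the prototype from its clones, with the prototype alone responsible for releasing the global resource.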
class DemoScorer: public IScorer {
public:
    DemoScorer() {
        _isClone = false;
        pGlobalInfo = NULL;   // safe to delete even if init() was never called
    }
    virtual ~DemoScorer() {
        // only the prototype owns the global resource
        if (! _isClone) {
            delete pGlobalInfo;
            pGlobalInfo = NULL;
        }
    }
public:
    virtual int init() {
        pGlobalInfo = new int();
        *pGlobalInfo = rand();
        return 0;
    }
    virtual int processQuery(int query_info) {
        fprintf(stdout, "process query[%d] in global[%d]\n", query_info, *pGlobalInfo);
        _queryInfo = query_info;
        return 0;
    }
    virtual int doScore(int doc_info) {
        fprintf(stdout, "\tscore for %d in query [%d], global [%d]\n", doc_info, _queryInfo, *pGlobalInfo);
        return 0;
    }
public:
    virtual IScorer *clone() {
        DemoScorer *pScorer = new DemoScorer(*this);
        pScorer->setClone(true);
        return pScorer;
    }
    virtual int destroy() {
        delete this;
        return 0;
    }
    void setClone(bool yes) { _isClone = yes; }
private:
    bool _isClone;
    int* pGlobalInfo;
    int _queryInfo;
};
Now there is no memory leak:
valgrind --leak-check=full ./main_v3
==29473== All heap blocks were freed -- no leaks are possible
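Improvement
Looking deeper, the root of the problem is that DemoScorer has to manage and release both the global resource and the query-level resource. Resources with two different scopes are better kept apart, so the idea I came up with is to move management of the global resource into a ScorerFactory. With the two managed separately, the problem resolves itself.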
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

class IScorer {
public:
    virtual ~IScorer() {}
public:
    virtual int processQuery(int query_info) = 0;
    virtual int doScore(int doc_info) = 0;
public:
    virtual IScorer *clone() = 0;
    virtual int destroy() = 0;
};

class DemoScorer: public IScorer {
public:
    // the scorer only borrows the global resource; the factory owns it
    DemoScorer(int* pGlobalInfo) {
        _pGlobalInfo = pGlobalInfo;
    }
    virtual ~DemoScorer() {
    }
public:
    virtual int processQuery(int query_info) {
        fprintf(stdout, "process query[%d] in global[%d]\n", query_info, *_pGlobalInfo);
        _queryInfo = query_info;
        return 0;
    }
    virtual int doScore(int doc_info) {
        fprintf(stdout, "\tscore for %d in query [%d], global [%d]\n", doc_info, _queryInfo, *_pGlobalInfo);
        return 0;
    }
public:
    virtual IScorer *clone() {
        DemoScorer *pScorer = new DemoScorer(*this);
        return pScorer;
    }
    virtual int destroy() {
        delete this;
        return 0;
    }
private:
    int* _pGlobalInfo;
    int _queryInfo;
};

class ScorerFactory {
public:
    ScorerFactory() { _pGlobalInfo = NULL; }
    ~ScorerFactory() {
        if (_pGlobalInfo) {
            delete _pGlobalInfo;
            _pGlobalInfo = NULL;
        }
    }
public:
    // acquire the global resource
    int init() {
        _pGlobalInfo = new int();
        *_pGlobalInfo = rand();
        return 0;
    }
    IScorer* createScorer() {
        return new DemoScorer(_pGlobalInfo);
    }
    void destroy(IScorer* pScorer) {
        delete pScorer;
    }
private:
    int * _pGlobalInfo;
};

class EngineRunner
{
public:
    EngineRunner() {
        factory.init();
        _pScorer = factory.createScorer();
    }
    ~EngineRunner() {
        factory.destroy(_pScorer);
    }
    void start() {
        for (int i = 0; i < 3; i++)
        {
            pthread_create(_threadIds+i, NULL, EngineRunner::threadEntry, this);
        }
        for (int i = 0; i < 3; i++)
        {
            pthread_join(_threadIds[i], NULL);
        }
    }
    int threadFun() {
        int query_info = rand();
        IScorer *pScorer = _pScorer->clone();
        pScorer->processQuery(query_info);
        int docNum = 4;
        for (int i=0; i<docNum; ++i) {
            pScorer->doScore(i);
        }
        pScorer->destroy();
        return 0;
    }
public:
    static void * threadEntry(void * arg) {
        EngineRunner* pRunner = (EngineRunner*)arg;
        pRunner->threadFun();
        return NULL;
    }
private:
    IScorer* _pScorer;
    ScorerFactory factory;
    pthread_t _threadIds[3];
};

int main(void)
{
    EngineRunner runner;
    runner.start();
    return 0;
}
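The code above changes the IScorer interface and moves init() into the ScorerFactory class, so acquiring and releasing the global resource is now the job of ScorerFactory. IScorer only has to manage query-level resources, which looks a lot cleaner.
However, what looks elegant is not necessarily practical. If you have to write 50 Scorers, you now have to write 100 classes: 50 Scorers plus 50 ScorerFactory classes. Painful.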
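Further improvement
In the software world there is no perfect design that fits every situation; a design that meets current and future needs is good enough. Back to the problem itself: we want both the global resource and the query resource to be managed through IScorer, while still handling the prototype and its clones properly. All that takes is an interface for releasing the global resource, destruct(). To match it, I renamed the acquisition of the global resource from init() to construct().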
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

class IScorer {
public:
    // global resource: acquired and released once, on the prototype
    virtual int construct() = 0;
    virtual int destruct() = 0;
public:
    virtual int processQuery(int query_info) = 0;
    virtual int doScore(int doc_info) = 0;
public:
    // query-level resource: one clone per request
    virtual IScorer *clone() = 0;
    virtual int recycle(IScorer *) = 0;
    virtual ~IScorer() {}
};

class DemoScorer: public IScorer {
public:
    virtual int construct() {
        pGlobalInfo = new int();
        *pGlobalInfo = rand();
        return 0;
    }
    virtual int destruct() {
        delete pGlobalInfo;
        pGlobalInfo = NULL;
        return 0;
    }
    virtual int processQuery(int query_info) {
        fprintf(stdout, "process query[%d] in global[%d]\n", query_info, *pGlobalInfo);
        _queryInfo = query_info;
        return 0;
    }
    virtual int doScore(int doc_info) {
        fprintf(stdout, "\tscore for %d in query [%d], global [%d]\n", doc_info, _queryInfo, *pGlobalInfo);
        return 0;
    }
public:
    virtual IScorer *clone() {
        return new DemoScorer(*this);
    }
    // the destructor no longer touches the global resource
    virtual ~DemoScorer() {
    }
    virtual int recycle(IScorer* pScorer) {
        delete pScorer;
        return 0;
    }
private:
    int* pGlobalInfo;
    int _queryInfo;
};

class EngineRunner
{
public:
    EngineRunner() {
        _pScorer = new DemoScorer;
        _pScorer->construct();
    }
    ~EngineRunner() {
        _pScorer->destruct();
        delete _pScorer;
    }
    void start() {
        for (int i = 0; i < 3; i++)
        {
            pthread_create(_threadIds+i, NULL, EngineRunner::threadEntry, this);
        }
        for (int i = 0; i < 3; i++)
        {
            pthread_join(_threadIds[i], NULL);
        }
    }
    int threadFun() {
        int query_info = rand();
        IScorer *pScorer = _pScorer->clone();
        pScorer->processQuery(query_info);
        int docNum = 4;
        for (int i=0; i<docNum; ++i) {
            pScorer->doScore(i);
        }
        _pScorer->recycle(pScorer);
        return 0;
    }
public:
    static void * threadEntry(void * arg) {
        EngineRunner* pRunner = (EngineRunner*)arg;
        pRunner->threadFun();
        return NULL;
    }
private:
    IScorer * _pScorer;
    pthread_t _threadIds[3];
};

int main(void)
{
    EngineRunner runner;
    runner.start();
    return 0;
}
Note how the IScorer interface is defined. The global resource is acquired and released through these methods:
public:
    virtual int construct() = 0;
    virtual int destruct() = 0;
Query-level resources are acquired and released through these methods:
public:
    virtual IScorer *clone() = 0;
    virtual int recycle(IScorer*) = 0;
    virtual ~IScorer() {}
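Defining the interface this way constrains the behavior of an IScorer at the interface level. Item 18 of Effective C++ says:
Make interfaces easy to use correctly and hard to use incorrectly
I think the biggest advantage of this version over the first one is that the complicated work is moved to the engine's calling side, which lets Scorer implementers focus as much as possible on what matters most: the business logic.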
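Follow-up
I asked a few people, and they offered good suggestions as well, for example:
- Manage DemoScorer's global resource with a shared_ptr, so that reference counting takes care of its lifetime.
- Make DemoScorer's global resource a static data member and let the system release it.
But I feel that in this particular scenario (where a large number of Scorers have to be written), both approaches make implementing a Scorer more complicated. In a different scenario, of course, it would be another story.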
Scott Meyers says:
... That's a simple reflection of the fact that there is no one ideal design for all software. The best design depends on what the system is expected to do, both now and in the future.