When we need to count the documents in a collection, we use db.collection.countDocuments(<query>, <options>):
let count = await db.collection.countDocuments({})
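When the data set is very large, say millions of documents or more, and the query matches a very wide range, this becomes very slow and is quite likely to time out and fail: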
MongoNetworkTimeoutError: connection 4 to x.x.x.x:x timed out
With large data sets, slow queries are unavoidable, but if we are going to spend the time, we should at least get a result.
db.collection.countDocuments(<query>, <options>)
takes two parameters. The second one, the options document, accepts:
- limit integer Optional. The maximum number of documents to count.
- skip integer Optional. The number of documents to skip before counting.
- hint string or document Optional. An index name or the index specification to use for the query.
- maxTimeMS integer Optional. The maximum amount of time to allow the count to run.
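As one illustration of these options, maxTimeMS can bound how long a single count is allowed to run on the server. The sketch below assumes the official mongodb Node.js driver, where an exceeded maxTimeMS is reported as a server error with code 50 (MaxTimeMSExpired). countWithTimeLimit is a hypothetical helper name, not part of the driver.

```javascript
// Assumption: `collection` is a Collection from the official `mongodb` driver,
// or any object exposing a compatible countDocuments(query, options) method.
const MAX_TIME_MS_EXPIRED = 50 // server error code for "MaxTimeMSExpired"

// Hypothetical helper: returns the count, or null if the server gave up
// because the count ran longer than maxTimeMS.
async function countWithTimeLimit (collection, query = {}, maxTimeMS = 60000) {
  try {
    return await collection.countDocuments(query, { maxTimeMS })
  } catch (err) {
    if (err && err.code === MAX_TIME_MS_EXPIRED) return null
    throw err // network errors and everything else are still errors
  }
}
```

Note that this only limits the time spent on the server. It does not by itself make a wide count finish, so a null result still means "no answer".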
We can use limit and skip to run several smaller queries and add up their results:
let count = await db.collection.countDocuments({}, { limit: 10000, skip: n }) // n = 0, 10000, 20000, ...
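To make the arithmetic concrete, here is a small simulation that needs no database. simulatedCount and batchPlan are hypothetical names used only for illustration. simulatedCount mimics what a count with limit and skip would return for a collection holding a given number of matching documents.

```javascript
// Mimics countDocuments({}, { limit, skip }) for `size` matching documents.
// Purely illustrative: no MongoDB involved.
function simulatedCount (size, { limit, skip }) {
  return Math.max(0, Math.min(limit, size - skip))
}

// Lists the batches needed to count `size` documents, `limit` at a time.
function batchPlan (size, limit) {
  const batches = []
  let skip = 0
  while (true) {
    const count = simulatedCount(size, { limit, skip })
    batches.push({ skip, count })
    if (count < limit) break // a short batch means the end was reached
    skip += limit
  }
  return batches
}

const batches = batchPlan(25000, 10000)
// skip 0 -> 10000, skip 10000 -> 10000, skip 20000 -> 5000
const total = batches.reduce((sum, b) => sum + b.count, 0) // 25000
```

When the real total is an exact multiple of limit, one extra batch that returns 0 is needed before the end can be detected.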
A Node.js example:
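async function getCount (collection, query = {}, limit = 10000, obj = {}) {
  let skip = 0   // documents already covered by earlier batches
  let total = 0  // running sum of all batch counts
  while (true) {
    // The caller can set obj.isStop = true to abort between batches
    if (obj.isStop) break
    // Count at most `limit` documents, starting after the first `skip`
    const count = await collection.countDocuments(query, { limit, skip })
    total += count
    skip += limit
    console.log('getCount:', total)
    // A batch smaller than `limit` means the end has been reached
    if (count < limit) break
  }
  return total
}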
Basic usage:
let count = await getCount(collection)
Adjusting limit and stopping early:
let event = {isStop:false}
let count = await getCount(collection,{},50000,event)
...
// The count may take a long time. To abort the task without killing the process:
event.isStop = true
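One possible way to use the stop flag is to set it from a timer, so the count gives up after a deadline and returns the partial total. This is only a sketch: fakeSlowCollection and countWithDeadline are hypothetical helpers, the fake collection stands in for a real MongoDB collection, and getCount is repeated (without logging) so the snippet runs on its own.

```javascript
// Stand-in for a slow collection: every countDocuments call takes `delayMs`.
// Hypothetical helper, not part of the mongodb driver.
function fakeSlowCollection (size, delayMs) {
  return {
    async countDocuments (query, { limit, skip }) {
      await new Promise(resolve => setTimeout(resolve, delayMs))
      return Math.max(0, Math.min(limit, size - skip))
    }
  }
}

// Same loop as getCount above, minus the progress log.
async function getCount (collection, query = {}, limit = 10000, obj = {}) {
  let skip = 0
  let total = 0
  while (true) {
    if (obj.isStop) break
    const count = await collection.countDocuments(query, { limit, skip })
    total += count
    skip += limit
    if (count < limit) break
  }
  return total
}

// Requests a stop after `ms` milliseconds. The flag is only checked between
// batches, so the batch already in flight still completes.
async function countWithDeadline (collection, ms, limit = 10000) {
  const event = { isStop: false }
  const timer = setTimeout(() => { event.isStop = true }, ms)
  try {
    const total = await getCount(collection, {}, limit, event)
    return { total, stopRequested: event.isStop }
  } finally {
    clearTimeout(timer) // do not keep the process alive for the timer
  }
}
```

When stopRequested is true, total should be read as a lower bound, not as the final count.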
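Similarly, the same approach works with db.collection.count(<query>, <options>), which also accepts limit and skip. Note that count() is deprecated in current MongoDB drivers in favor of countDocuments().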