Mongodb基础知识

1. 介绍、安装、使用（简单写写，不做详细介绍）

1.1 介绍

Mongodb是属于NoSql的一种数据类型；
MongoDB 是一个介于关系数据库和非关系数据库之间的产品，是非关系数据库当中功能最丰富，最像关系数据库的；
它支持的数据结构非常松散，是类似 json 的 bson 格式，因此可以存储比较复杂的数据类型；
Mongo 最大的特点是他支持的查询语言非常强大，其语法有点类似于面向对象的查询语言，几乎可以实现类似关系数据库单表查询的绝大部分功能，而且还支持对数据建立索引；
它的特点是高性能、易部署、易使用，存储数据非常方便。

1.2 安装

MacOS中推荐使用HomeBrew安装，简单，没啥理由。
安装步骤：https://docs.mongodb.com/manual/tutorial/install-mongodb-on-os-x/

1.3 使用

启动 MongoDb 服务: mongod 开启数据库服务
mongod --dbpath <path to data directory>
输入 mongo 命令连接数据库
mongo --host 127.0.0.1:27017
具体步骤：https://docs.mongodb.com/manual/tutorial/install-mongodb-on-os-x/

2. Databases、Collections和Document

noSql是无模式的文档数据库

mongodb		sqlServer
database	=>	database
collection	=>	table
document	=>	row

Note:

row: 每一行都是一样的字段，不可添加不可减少，即field的个数在定义table时就已经声明好了
document: 每一个document都是独立的，也不是在collection定义的时候声明好的。

SQL VS NoSQL.png

3. Bson结构解析以及$type和_id

3.1 Bson支持的数据类型

BSON is a binary serialization format used to store documents and make remote procedure calls in MongoDB.

Type	Number	Alias	Notes
Double	1	“double”
String	2	“string”
Object	3	“object”
Array	4	“array”
Binary data	5	“binData”
Undefined	6	“undefined”	Deprecated.
ObjectId	7	“objectId”
Boolean	8	“bool”
Date	9	“date”
Null	10	“null”
Regular Expression	11	“regex”
DBPointer	12	“dbPointer”	Deprecated.
JavaScript	13	“javascript”
Symbol	14	“symbol”	Deprecated.
JavaScript (with scope)	15	“javascriptWithScope”
32-bit integer	16	“int”
Timestamp	17	“timestamp”
64-bit integer	18	“long”
Decimal128	19	“decimal”	New in version 3.4.
Min key	-1	“minKey”
Max key	127	“maxKey”

3.2 $type

可以使用$type操作符根据上表中列出的BSON数据类型从文档中查询数据。

例如：
集合inventory中存储了一条name值为undefined的数据，通过$type操作符找到这条数据，具体操作如下图：

inventory集合.png

db.inventory.find({name: { $type: 6 }})

若直接通过db.inventory.find({name: undefined})查找会报错

3.2 ObjectId

ObjectIds are small, likely unique, fast to generate, and ordered. ObjectId values consist of 12 bytes, where the first four bytes are a timestamp that reflect the ObjectId’s creation. Specifically:

a 4-byte value representing the seconds since the Unix epoch,

a 5-byte random value, and

a 3-byte counter, starting with a random value.

通过上述生成规则，是可以保证ObjectId做到全局唯一的。在Mongodb中，存储在集合中的每一个文档都需要一个唯一的[_id]作为主键。如果你插入文档时忽略了[_id]，Mongodb会自动帮你生成[ObjectId]。

4. mongodb shell、GUI与load()

可以通过终端、GUI以及外部js文件的形式(load()方法)来执行CRUD命令。

5. sql和Mongodb statement对照关系

有熟悉关系型数据库的可以参考https://docs.mongodb.com/manual/faq/fundamentals/#does-mongodb-support-sql看看两者之间的区别。

6. mongodb运算符

6.1 查询运算符

6.1.1 比较运算符

比较不同的BSON类型值，使用3.2节中所讲的$type操作符

Name	Description
`$eq`	等于
`$gt`	大于
`$gte`	大于等于
`$in`	匹配在数组中的任意值
`$lt`	小于
`$lte`	小于等于
`$ne`	不等于
`$nin`	匹配不在数组中的值

6.1.1.1 $eq

语法： { <field>: { $eq: <value> } }
可用于比较

简单类型值
复杂类型值(document、Array)

Example
用于比较的集合为inventory，包含的documents如下：

匹配简单类型
命令： db.inventory.find( { qty: { $eq: 20 } } )
结果：
匹配复杂类型(document)
命令： db.inventory.find( { "item.name": { $eq: "ab" } } )
结果：
匹配数组
命令：db.inventory.find( { tags: { $eq: "B" } } )
结果：

Note:这条命令查询所有documents中tags字段中包含"B"的结果，也会匹配tags字段值为"B"的结果。这条命令相当于db.inventory.find( { tags: { $in: ["B"] } } )。

命令：db.inventory.find( { tags: { $eq: [ "A", "B" ] } } )
结果：

Note:这条命令查询所有documents中tags字段完全匹配[ "A", "B" ]的结果，或者匹配tags字段的某一项值为[ "A", "B" ]的结果。这条命令相当于db.inventory.find( { tags: { $in: [[ "A", "B" ]] } } )。

6.1.1.2 $gt

语法：{field: {$gt: value} }
Example
命令：db.inventory.find( { qty: { $gt: 20 } } )
结果：

$gte、$lt、$lte、$ne都是类型的，不累述。

6.1.1.3 $in

语法：{ field: { $in: [<value1>, <value2>, ... <valueN> ] } }
Examples

匹配普通值
命令：db.inventory.find( { qty: { $in: [ 5, 15 ] } } )
结果：

Note:这条命令查询所有documents中qty字段的值为5或者15的结果。当然也可以使$or操作符来完成这一操作，这里选择$in而不是$or是因为这是对同一个字段的值的匹配。

匹配数组
命令：db.inventory.find( { tags: { $in: ["A", "B"] } } )
结果：

Note:这条命令查询所有documents中tags字段包含"A"或者 "B"元素的结果。

匹配正则
命令：db.inventory.find( { tags: { $in: [ /^be/, /^st/ ] } } )
结果：

Note:这条命令查询所有documents中tags字段包含以be或者st字母开头的元素的结果。
$nin与$in相反，不累述。

6.1.2 逻辑运算符

Name	Description
`$and`	与
`$not`	非
`$nor`	或非
`$or`	或

6.1.2.1 $and

语法：{ $and: [ { <expression1> }, { <expression2> } , ... , { <expressionN> } ] }
Examples

同一字段
命令：db.inventory.find( { $and: [ { price: { $ne: 1.99 } }, { price: { $exists: true } } ] } )
结果：查找price字段的值存在且不等于1.99的结果。等同于db.inventory.find( { price: { $ne: 1.99, $exists: true } } )的执行结果。
同一操作符
命令：db.inventory.find( { $and : [ { $or : [ { price : 0.99 }, { price : 1.99 } ] }, { $or : [ { sale : true }, { qty : { $lt : 20 } } ] } ] } )
结果：查找price字段等于0.99或者1.99并且sale字段等于true或者qty字段值小于20的结果。

Note: $and操作符实现了短路操作，如果<expression1>执行为false，MongoDB将不会再去执行剩下的expression。

6.1.2.2 $not

语法：{ field: { $not: { <operator-expression> } } }
Examples
命令： db.inventory.find( { price: { $not: { $gt: 1.99 } } } )
结果：查找price字段值小于等于1.99的结果或者price字段不存在的结果。注意与 db.inventory.find( { price: { $lte: 1.99 } } )的区别。

6.1.2.3 $nor

语法：{ $nor: [ { <expression1> }, { <expression2> }, ... { <expressionN> } ] }
Examples
命令：db.inventory.find( { $nor: [ { price: 1.99 }, { sale: true } ] } )
结果：匹配的结果有4种情况

price字段的值不等于1.99并且sale字段的值不等于true
price字段的值不等于1.99并且sale字段不存在
price字段的值不存在并且sale字段的值不等于true
price字段的值不存在并且sale字段不存在

6.1.2.4 $or

语法：{ $or: [ { <expression1> }, { <expression2> }, ... , { <expressionN> } ] }
Note:

用法与$and类似。
如果$or操作在同一个字段上进行多次，则建议使用$in。

6.1.3 元素查询运算符

Name	Description
`$exists`	指定字段是否存在
`$type`	匹配类型

6.1.3.1 $exists

语法: { field: { $exists: <boolean> } }
这个得益于document的无模式，不然也不会存在这个操作符

6.1.3.2 $type

语法: { field: { $type: [ <BSON type1> , <BSON type2>, ... ] } }
前面已经演示过。

6.1.4 评估运算符

Name	Description
`$expr`	允许在查询语言中使用聚合表达式
`$jsonSchema`	匹配符合与给定的json纲要的文档
`$mod`	取模
`$regex`	正则
`$text`	文本对带有文本索引的字段内容执行文本搜索
`$where`	匹配满足javascript表达式的文档

6.1.4.1 $expr

语法: { $expr: { <expression> } }

6.1.4.2 $jsonSchema

语法: { $jsonSchema: <schema> }

6.1.4.3 $mod

语法：{ field: { $mod: [ divisor, remainder ] } }
Examples
命令： db.inventory.find( { qty: { $mod: [ 4, 0 ] } } )
结果：

qty字段能被4整除的结果

Note:

给$mod操作符传递的数组元素少于两个或者多于两个都会报错。

6.1.4.4 $regex

语法:

{ <field>: { $regex: /pattern/, $options: '<options>' } }
{ <field>: { $regex: 'pattern', $options: '<options>' } }
{ <field>: { $regex: /pattern/<options> } }
$options

Option	Description
`i`	忽略大小写
`m`	匹配使用了锚，例如^(代表开头)和$(代表结尾)，以及匹配\n后的字符串
`x`	忽视所有空白字符
`s`	允许点字符（.）匹配所有的字符，包括换行符

Note:

正则性能不是很高，模糊匹配只能是表扫描。
不能在$in操作符中使用$regex表达式。
要使用xoption或者soption，必须要使用$options操作符。

6.1.4.5 $text

语法:

{
$text:
{
$search: <string>,
$language: <string>,
$caseSensitive: <boolean>,
$diacriticSensitive: <boolean>
}
}

用法：首先对一个字段进行全文索引，再进行搜索。
全文检索【全文索引】
中文分词很麻烦，所以不支持中文
用处不是很大

6.1.4.6 $where

1.非常强大的遍历器，通过包含js表达式的字符串或者function的模式，一排一排地去找最后的数据。

灵活但性能较差的模式【不走索引，全表扫描】
函数中的this指向当前的迭代文档
Examples
假设users文档的结构为：

找出与指定MD5 hash相同的name值所在文档。
命令：db.foo.find( { $where: function() { return (hex_md5(this.name) == "9b53e667f30cd329dca1ec9e6a83e994") } } );
结果：

6.1.5 数组操作运算符

Name	Description
`$all`	匹配所有
`$elemMatch`	匹配内嵌文档或数组中的部分field
`$size`	匹配数组长度为指定大小的文档

6.1.5.1 $all

语法： { <field>: { $all: [ <value1> , <value2> ... ] } }
与 $in类似，但不同的是，$ in 只需满足某一个值即可，而$all 必须满足[ ]内的所有值。
Examples:

命令： db.inventory.find({ tags: { $all: ['A', 'B' ] }})
结果：
命令：db.inventory.find({ tags: { $in: ['A', 'B' ] }})
结果：

6.1.5.2 $elemMatch

语法：{ <field>: { $elemMatch: { <query1>, <query2>, ... } } }
不能在$elemMatch中使用$where或者$text
Examples:
假设scores集合的结构为：

命令：db.scores.find( { results: { $elemMatch: { $gte: 80, $lt: 85 } } } )
结果：

6.1.5.3 $size

Examples:

命令：db.inventory.find( { tags: { $size: 1 } )
结果：

6.2 update运算符

6.2.1 字段update运算符

Name	Description
`$currentDate`	设置指定字段为当前时间
`$inc`	将文档中的某个field对应的value自增/减某个数字
`$min`	将文档中的某字段与指定值作比较，如果原值小于指定值，则不更新；若大于指定值，则更新
`$max`	与$min功能相反
`$mul`	将文档中的某个field对于的value做乘法操作
`$rename`	重命名文档中的指定字段的名
`$set`	更新文档中的某一个字段，而不是全部替换
`$setOnInsert`	配合upsert操作，在作为insert时可以为新文档扩展更多的field
`$unset`	删除文档中的指定字段，若字段不存在则不操作

6.2.1.1 $currentDate

语法：{ $currentDate: { <field1>: <typeSpecification1>, ... } }
Examples:

命令： db.users.update( { _id: 1 }, { $currentDate: { lastModified: true, "cancellation.date": { $type: "timestamp" } }, } )

6.2.1.2 $inc

语法：{ $inc: { <field1>: <amount1>, <field2>: <amount2>, ... } }
Examples:
假设products集合的结构为

{
  _id: 1,
  sku: "abc123",
  quantity: 10,
  metrics: {
    orders: 2,
    ratings: 3.5
  }
}

命令：db.products.update( { sku: "abc123" }, { $inc: { quantity: -2, "metrics.orders": 1 } } )
结果：

6.2.1.3

$min语法：{ $min: { <field1>: <value1>, ... } }
$max语法：{ $max: { <field1>: <value1>, ... } }
Examples:

命令：db.tags.update( { _id: 1 }, { $min: { dateEntered: new Date("2013-09-25") } } )
结果：更新_id为1所在文档的dateEntered字段值，若原来dateEntered值对应日期早于2013-09-25，则不更新。
命令：db.scores.update( { _id: 1 }, { $max: { highScore: 950 } } )
结果：更新_id为1所在文档的highScore字段值，若原来highScore值大于950，则不更新。

6.2.1.4 $mul

语法：{ $mul: { <field1>: <number1>, ... } }
Examples:

命令： db.products.update( { _id: 1 }, { $mul: { qty: 2 } } )
结果：将qty字段值乘以2

6.2.1.5 $rename

语法：{$rename: { <field1>: <newName1>, <field2>: <newName2>, ... } }

6.2.1.6 $set

语法：{ $set: { <field1>: <value1>, ... } }
Example:

命令：db.products.update( { _id: 100 }, { $set: { "details.make": "zzz" } } )

6.2.1.7 $setOnInsert

语法：db.collection.update( <query>, { $setOnInsert: { <field1>: <value1>, ... } }, { upsert: true } )
Examples:
假设products集合的结构为

{
  _id: 1,
  sku: "abc123",
  quantity: 10,
  metrics: {
    orders: 2,
    ratings: 3.5
  }
}

命令：db.products.update({_id: 1}, {$set: {sku: 'test'}, $setOnInsert: {defaultQuantity: 100}}, {upsert: true})
结果：

没有插入新的字段
命令：db.products.update({_id: 2}, {$set: {sku: 'abc234'}, $setOnInsert: {defaultQuantity: 100}}, {upsert: true})
结果：

Note:
当 [upsert: true]时，更新操作如果导致插入了一个新文档，那么[$setOnInsert]将会为新文档新文档扩展field，否则，[$setOnInsert]不起任何作用。

6.2.1.8 $unset

语法：{ $unset: { <field1>: "", ... } }
Examples:

命令：db.products.update( { sku: "unknown" }, { $unset: { quantity: "", instock: "" } } )
结果：删除sku字段值为 "unknown"的文档的quantity和instock两个字段。

6.2.2 数组update运算符

6.2.3 位运算符

7. mongodb之CURD众方法

7.1 Insert

方法	描述
db.collection.insertOne()	插入一条文档
db.collection.insertMany()	插入多条文档
db.collection.insert()	插入一条或多条文档

7.2 Delete

方法	描述
db.collection.deleteOne()	删除符合条件的一条文档
db.collection.deleteMany()	删除符合条件的多条文档
db.collection.remove()	删除符合条件的一条或多条文档

7.3 Update

方法	描述
db.collection.update()	根据过滤条件更新文档
db.collection.updateOne()	根据过滤条件更新一条文档
db.collection.updateMany()	根据过滤条件更新多条文档
db.collection.replaceOne()	根据过滤条件替换文档

7.4 Query

方法	描述
db.collection.find()	查找文档
db.collection.findAndModify()	查找并修改
db.collection.findOne()	查找第一个符合条件的文档
db.collection.findOneAndDelete()	根据条件查找并删除某条文档
db.collection.findOneAndReplace()	根据条件查找并修改替换某条文档
db.collection.findOneAndUpdate()	根据条件查找并更新某条文档

8. mongodb索引

8.1. 索引的目的

加速查找

Indexes support the efficient execution of queries in MongoDB. Without indexes, MongoDB must perform a collection scan, i.e. scan every document in a collection, to select those documents that match the query statement. If an appropriate index exists for a query, MongoDB can use the index to limit the number of documents it must inspect.

8.2. 索引的实现原理

Btree

8.3 mondodb中支持的索引类型：

单建索引（Single Field）
复合索引（Compound Index）一定要主要键值的顺序，字段在前在后有严格的区别
多键值索引（Multikey Index）在array字段上建立一个索引，mongodb会将array的每一项进行索引
地理位置索引（Geospatial Index）
全文索引（Text Indexes）不支持中文
Hash索引（Hashed Indexes）hash是为了做定值查找的

only support equality matches and cannot support range-based queries

唯一索引（Unique Indexes）
部分索引（Partial Indexes）使用更少的存储空间，减少性能开销
稀疏索引（Sparse Indexes）
过期索引（TTL Indexes）可以用来做定时过期的效果

8.4 如何创建索引

db.collection.createIndex(keys, options)

8.5 Examples

集合
persons.js

var persons = [];

for (var i = 0; i < 10000; i++) {
  var name = `Graceji${i}`;
  var age = parseInt(Math.random() * 150) + 1;
  persons.push({ name, age });
}

db.person.insertMany(persons);

通过load('/Users/jina194/Desktop/persons.js')方法在person集合中插入了10000条数据。

创建单键索引: db.person.createIndex({ age: 1 })

执行db.person.find({ age: 88 }).explain()查看执行效果：

{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "test.person",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "age" : {
                "$eq" : 88
            }
        },
         // winningPlan: 若干个计划中挑选一个作为胜出者
        "winningPlan" : {
            "stage" : "FETCH",
            "inputStage" : {
                "stage" : "IXSCAN",  // IXSCAN代表索引
                "keyPattern" : {
                    "age" : 1
                },
                "indexName" : "age_1",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                    "age" : [ ]
                },
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                    "age" : [
                        "[88.0, 88.0]"
                    ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
        "host" : "C02T8D1GGY25.local",
        "port" : 27017,
        "version" : "3.4.6",
        "gitVersion" : "c55eb86ef46ee7aede3b1e2a5d184a7df4bfb5b5"
    },
    "ok" : 1
}

执行db.person.find({ name: 'Graceji888' }).explain()查看执行效果：

{
    "queryPlanner" : {
        ...
        "parsedQuery" : {
            "name" : {
                "$eq" : "Graceji888"
            }
        },
        "winningPlan" : {
            // 没有建立索引的name字段采用的执行计划则为行扫描（COLLSCAN）
            "stage" : "COLLSCAN", 
            "filter" : {
                "name" : {
                    "$eq" : "Graceji888"
                }
            },
            "direction" : "forward"
        },
        ...
    },
        ...
}

创建复合索引：db.person.createIndex({ name: 1, age: 1 })

// db.person.getIndexes()
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test.person"
    },
    {
        "v" : 2,
        "key" : {
            "age" : 1
        },
        "name" : "age_1",
        "ns" : "test.person"
    },
    {
        "v" : 2,
        "key" : {
            "name" : 1,
            "age" : 1
        },
        "name" : "name_1_age_1",
        "ns" : "test.person"
    }
]

执行db.person.find({ name: 'Graceji888', age: 88 }).explain('executionStats')查看执行效果：

{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "test.person",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "$and" : [
                {
                    "age" : {
                        "$eq" : 88
                    }
                },
                {
                    "name" : {
                        "$eq" : "Graceji888"
                    }
                }
            ]
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "inputStage" : {
                "stage" : "IXSCAN", // 索引扫描
                "keyPattern" : {
                    "name" : 1,
                    "age" : 1
                },
                "indexName" : "name_1_age_1", // 执行计划所用索引的名字
                "isMultiKey" : false,
                "multiKeyPaths" : {
                    "name" : [ ],
                    "age" : [ ]
                },
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                    "name" : [
                        "[\"Graceji888\", \"Graceji888\"]"
                    ],
                    "age" : [
                        "[88.0, 88.0]"
                    ]
                }
            }
        },
        // 拒绝的执行计划
        "rejectedPlans" : [
            {
                "stage" : "FETCH",
                "filter" : {
                    "name" : {
                        "$eq" : "Graceji888"
                    }
                },
                "inputStage" : {
                    "stage" : "IXSCAN",
                    "keyPattern" : {
                        "age" : 1
                    },
                    "indexName" : "age_1",
                    "isMultiKey" : false,
                    "multiKeyPaths" : {
                        "age" : [ ]
                    },
                    "isUnique" : false,
                    "isSparse" : false,
                    "isPartial" : false,
                    "indexVersion" : 2,
                    "direction" : "forward",
                    "indexBounds" : {
                        "age" : [
                            "[88.0, 88.0]"
                        ]
                    }
                }
            }
        ]
    },
    // 执行状态
    "executionStats" : {
        "executionSuccess" : true,
        "nReturned" : 0,
        "executionTimeMillis" : 1,
        "totalKeysExamined" : 0,
        "totalDocsExamined" : 0,
        "executionStages" : {
            "stage" : "FETCH",
            "nReturned" : 0, // 返回结果数
            "executionTimeMillisEstimate" : 0, // 执行时间
            "works" : 2,
            "advanced" : 0,
            "needTime" : 0,
            "needYield" : 0,
            "saveState" : 0,
            "restoreState" : 0,
            "isEOF" : 1,
            "invalidates" : 0,
            "docsExamined" : 0, // 扫描条数
            "alreadyHasObj" : 0,
            "inputStage" : {
                "stage" : "IXSCAN",
                "nReturned" : 0,
                "executionTimeMillisEstimate" : 0,
                "works" : 1,
                "advanced" : 0,
                "needTime" : 0,
                "needYield" : 0,
                "saveState" : 0,
                "restoreState" : 0,
                "isEOF" : 1,
                "invalidates" : 0,
                "keyPattern" : {
                    "name" : 1,
                    "age" : 1
                },
                "indexName" : "name_1_age_1",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                    "name" : [ ],
                    "age" : [ ]
                },
                "isUnique" : false,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "forward",
                "indexBounds" : {
                    "name" : [
                        "[\"Graceji888\", \"Graceji888\"]"
                    ],
                    "age" : [
                        "[88.0, 88.0]"
                    ]
                },
                "keysExamined" : 0,
                "seeks" : 1,
                "dupsTested" : 0,
                "dupsDropped" : 0,
                "seenInvalidated" : 0
            }
        }
    },
    ...
}

创建多键值索引db.coll.createIndex( { <field>: < 1 or -1 > } )，与创建单键索引方式类似。
创建Hash索引：将hashed作为索引key的value值，例如
db.collection.createIndex( { _id: "hashed" } )
创建部分索引：
通过db.collection.createIndex()方法和partialFilterExpression option创建。partialFilterExpression option接受的过滤条件有：

相等性表达式 (例如：field: value 或者使用 $eq操作符)
$exists: true 表达式
$gt, $gte, $lt, $lte 表达式
$type 表达式
$and 操作符
例如：

db.restaurants.createIndex(
   { cuisine: 1, name: 1 },
   { partialFilterExpression: { rating: { $gt: 5 } } }
)

以上例子创建了一个只针对rating字段值大于5的文档的复合索引。可以在所有mongodb支持的索引类型中指定partialFilterExpression option。

创建稀疏索引
假设有多个document一点关系也没有（即结构不一致，除了主键_id是一样的），这样的documents怎么建立索引？——稀疏索引
稀疏索引：建立的index字段必须在有这个字段的document上才可以建立。

Sparse indexes only contain entries for documents that have the indexed field, even if the index field contains a null value. The index skips over any document that is missing the indexed field. The index is “sparse” because it does not include all documents of a collection. By contrast, non-sparse indexes contain all documents in a collection, storing null values for those documents that do not contain the indexed field.

通过db.collection.createIndex()方法和设置 sparse option为true来创建稀疏索引，例如：db.addresses.createIndex( { "xmpp_id": 1 }, { sparse: true } )，在 addresses 集合的xmpp_id字段上创建了稀疏索引。

过期索引

TTL indexes are special single-field indexes that MongoDB can use to automatically remove documents from a collection after a certain amount of time or at a specific clock time.

通过db.collection.createIndex()方法和expireAfterSecondsoption(值的类型为date或者包含date的数组)来创建。例如：db.eventlog.createIndex( { "lastModifiedDate": 1 }, { expireAfterSeconds: 3600 } )在eventlog集合的lastModifiedDate字段上创建了一个TTL索引。
Note:

TTL 索引会让文档在指定的时间后过期，即过期时间为被索引的字段值所代表的时间加上指定的过期秒数
如果被索引的字段为包含多个时间值的数组，mongodb会使用最早的日期来计算过期时间
若被索引的字段不是date，或者包含date的数组，则文档不会过期
如果某文档不包括被索引的字段，则该文档不会过期
由于 TTL thread任务每60s执行一次，所以TTL索引不能保证过期的数据能够被立即删除，会存在一定的延时
TTL索引为单键索引，复合索引是不支持的，会自动忽略expireAfterSeconds option
_id字段不支持TTL索引
不能在capped collection中创建TTL索引
如果一个字段已经存在非TTL的单键索引，就不能在同一个字段上再创建TTL索引

8.6 索引管理

创建索引
db.collection.createIndex()
查看索引
db.collection.getIndexes()
移除索引
db.collection.dropIndex()
db.collection.dropIndexes()
修改索引
先删除再创建

8.7 索引

9. MongoDB聚合管道(Aggregation Pipeline)

db.collection.aggregate()是基于数据处理的聚合管道，每个文档通过一个由多个阶段(stage)组成的管道，可以对每个阶段的管道进行分组、过滤等功能，然后经过一系列的处理，输出相应的结果。

Aggregate处理过程

9.1 db.collection.aggregate(pipeline, options)

pipeline为array类型
因为$group和$sort都有内存100M的限制，所以将allowDiskUse开启为true的话，就没有此限制了，db.collection.aggregate(pipeline, { allowDiskUse: true })

9.2 Aggregation Pipeline Stages(有点多，只介绍常用的)

9.2.1 db.collection.aggregate() Stages

db.collection.aggregate( [ { <stage> }, ... ] )

Stage	描述
$group	将文档依据指定字段的不同值进行分组
$indexStats	查询过程中的索引情况
$limit	限制返回的文档个数
$lookup	与同一数据库中其它集合之间进行join操作
$match	根据query条件筛选文档
$out	将最后计算结果写入到指定的collection中
$project	对输入文档进行添加新字段或删除现有的字段，可以自定义字段的显示状态
$redact	字段所处的document结构的级别
$sample	从待操作的集合中随机返回指定数量的文档
$skip	从待操作集合开始的位置跳过文档的数目
$sort	对文档按照指定字段排序
$unwind	将数组分解为单个的元素，并与文档的其余部分一同返回(Note：1.如果$unwind目标字段不存在或者数组为空，则整个文档都会被忽略过滤掉 2.如果$unwind目标字段不是一个数组，则会报错)

9.2.1.1 $group

语法：{ $group: { _id: <expression>, <field1>: { <accumulator1> : <expression1> }, ... } }
可以分组的数据执行如下表达式计算：

Name	Description
$avg	计算平均值
$first	返回每组第一个文档，如果有排序，按照排序，如果没有按照默认的存储的顺序的第一个文档
$last	返回每组最后一个文档，如果有排序，按照排序，如果没有按照默认的存储的顺序的最后一个文档
$max	根据分组，获取集合中所有文档对应值的最大值
$min	根据分组，获取集合中所有文档对应值的最小值
$push	将指定的表达式的值添加到一个数组中
$addToSet	将表达式的值添加到一个集合中（无重复值）
$stdDevPop	计算总体标准差
$stdDevSamp	计算累积样本标准差
$sum	计算总和

_id字段是必须的，其值是要进行分组的key

9.2.1.2 $limit

语法：{ $limit: <positive integer> }

9.2.1.3 $lookup

语法

相等性匹配

{
  $lookup:  
    {  
        from: <参与join的辅表>,  
        localField: <参与join匹配的本表字段>  
        foreignField: <参与join的辅表字段>  
        as: <将辅表数据输出到此字段中>  
    }  
}

多个Join条件和不相关子查询

{
  $lookup:
     {
       from: <collection to join>,
       let: { <var_1>: <expression>, …, <var_n>: <expression> },
       pipeline: [ <pipeline to execute on the collection to join> ],
       as: <output array field>
     }
}

9.2.1.4 $match

语法：{ $match: { <query> } }
Note:

根据query条件筛选文档，符合条件的文档将会传递给下一个stage；
$match的语法和mongodb query语法一样，且$match中不能使用aggregate expression或者比较操作，只能使用query操作允许的操作符;
$match用于筛选数据，为了提高后续的数据处理效率，尽可能的将$match放于pipeline的前端以减少数据读取或者计算量，加快聚合速度；
$match可以放在$out之前用于控制数据输出量；
如果$match放于pipeline的开始，还可以使用到索引机制提高数据查询的性能。
Examples:

假设orders集合的结构为：

orders集合

inventory集合的结构为：

inventory集合

9.2.1.5 $out

语法：{ $out: "<output-collection>" }
必须为pipeline最后一个阶段管道，因为是将最后计算结果写入到指定的collection中
如果不用$out，那么pipeline计算的结果并没有序列化到硬盘。

9.2.1.6 $project

语法：{ $project: { <specification(s)> } }

9.2.1.7 $sort

语法：{ $sort: { <field1>: <sort order>, <field2>: <sort order> ... } }
<sort order>只能是以下值之一:

1：升序
-1：降序
{ $meta: "textScore" }：根据 textScore metadata 降序排列

注意事项：

如果将$sort放到管道前面的话可以利用索引提高效率
在管道中如果$sort出现在$limit之前的话，$sort只会对前$limit个文档进行操作，这样在内存中也只会保留前$limit个文档，从而可以极大的节省内存
$sort操作符默认在内存中进行，超过此限制会报错，若要允许处理大型数据集，allowDiskUse 要设置为true

Examples:

命令1：

db.orders.aggregate([
   {
     $lookup:
       {
         from: "inventory",
         localField: "item",
         foreignField: "sku",
         as: "inventory_docs"
       }
  }
])

结果1：通过orders集合中的item字段与inventory集合中的sku字段将两个集合中的文档进行合并。

{
   "_id" : 1,
   "item" : "almonds",
   "price" : 12,
   "quantity" : 2,
   "inventory_docs" : [
      { "_id" : 1, "sku" : "almonds", "description" : "product 1", "instock" : 120 }
   ]
}
{
   "_id" : 2,
   "item" : "pecans",
   "price" : 20,
   "quantity" : 1,
   "inventory_docs" : [
      { "_id" : 4, "sku" : "pecans", "description" : "product 4", "instock" : 70 }
   ]
}
{
   "_id" : 3,
   "inventory_docs" : [
      { "_id" : 5, "sku" : null, "description" : "Incomplete" },
      { "_id" : 6 }
   ]
}

命令2：

db.orders.aggregate([
   {
     $lookup:
       {
         from: "inventory",
         localField: "item",
         foreignField: "sku",
         as: "inventory_docs"
       }
  }，
  {
    $match: {
        price: { $gt: 0 }
    }
  }，
])

结果2：加入$match stage，筛选出price字段大于0的结果。

{
   "_id" : 1,
   "item" : "almonds",
   "price" : 12,
   "quantity" : 2,
   "inventory_docs" : [
      { "_id" : 1, "sku" : "almonds", "description" : "product 1", "instock" : 120 }
   ]
}
{
   "_id" : 2,
   "item" : "pecans",
   "price" : 20,
   "quantity" : 1,
   "inventory_docs" : [
      { "_id" : 4, "sku" : "pecans", "description" : "product 4", "instock" : 70 }
   ]
}

命令3：

db.orders.aggregate([
   {
     $lookup:
       {
         from: "inventory",
         localField: "item",
         foreignField: "sku",
         as: "inventory_docs"
       }
  }，
  {
    $match: {
        price: { $gt: 0 }
    }
  }，
  {
     $group: {
         _id:  '$item',
        totalPrice: {$sum: {$multiply: ['$price', '$quantity']}},
        quantityAvg: {$avg: '$quantity'}
     }
  }
])

结果3：加入$group stage，根据item字段分组，并计算totalPrice和QuantityAvg。
命令4：

db.orders.aggregate([
   {
     $lookup:
       {
         from: "inventory",
         localField: "item",
         foreignField: "sku",
         as: "inventory_docs"
       }
  }，
  {
    $match: {
        price: { $gt: 0 }
    }
  }，
  {
     $group: {
         _id:  '$item',
        totalPrice: {$sum: {$multiply: ['$price', '$quantity']}},
        quantityAvg: {$avg: '$quantity'}
     }
  },
  {
    $project: {
      _id: 1,
      totalPrice: 1,
      demo: 'my demo',
    }
  }
])

结果4：加入$project stage，去掉quantityAvg字段，并添加一个demo字段。

命令5：

db.orders.aggregate([
   {
     $lookup:
       {
         from: "inventory",
         localField: "item",
         foreignField: "sku",
         as: "inventory_docs"
       }
  }，
  {
    $match: {
        price: { $gt: 0 }
    }
  }，
  {
     $group: {
         _id:  '$item',
        totalPrice: {$sum: {$multiply: ['$price', '$quantity']}},
        quantityAvg: {$avg: '$quantity'}
     }
  },
  {
    $project: {
      _id: 1,
      totalPrice: 1,
      demo: 'my demo',
    }
  },
  {
     $sort: { totalPrice: -1 }
  }
])

结果5：加入$sort，根据totalPrice字段降序排序。

命令6：

db.orders.aggregate([
   {
     $lookup:
       {
         from: "inventory",
         localField: "item",
         foreignField: "sku",
         as: "inventory_docs"
       }
  }，
  {
    $match: {
        price: { $gt: 0 }
    }
  }，
  {
     $group: {
         _id:  '$item',
        totalPrice: {$sum: {$multiply: ['$price', '$quantity']}},
        quantityAvg: {$avg: '$quantity'}
     }
  },
  {
    $project: {
      _id: 1,
      totalPrice: 1,
      demo: 'my demo',
    }
  },
  {
     $limit: 1
  }
])

结果6：加入$limit，限制返回的文档的个数。

命令6：

db.orders.aggregate([
   {
     $lookup:
       {
         from: "inventory",
         localField: "item",
         foreignField: "sku",
         as: "inventory_docs"
       }
  }，
  {
    $match: {
        price: { $gt: 0 }
    }
  }，
  {
     $group: {
         _id:  '$item',
        totalPrice: {$sum: {$multiply: ['$price', '$quantity']}},
        quantityAvg: {$avg: '$quantity'}
     }
  },
  {
    $project: {
      _id: 1,
      totalPrice: 1,
      demo: 'my demo',
    }
  },
  {
     $limit: 1
  },
   {
     $out: 'myaggretion'
   },
])

结果7：加入$out，将结果写入到myaggretion集合中。

9.2.2 db.aggregate() Stages

db.aggregate( [ { <stage> }, ... ] )

Mongodb基础知识

1. 介绍、安装、使用（简单写写，不做详细介绍）

1.1 介绍

1.2 安装

1.3 使用

2. Databases、Collections和Document

3. Bson结构解析以及$type和_id

3.1 Bson支持的数据类型

3.2 $type

3.2 ObjectId

4. mongodb shell、GUI与load()

5. sql和Mongodb statement对照关系

6. mongodb运算符

6.1 查询运算符

6.1.1 比较运算符

6.1.1.1 $eq

6.1.1.2 $gt

6.1.1.3 $in

6.1.2 逻辑运算符

6.1.2.1 $and

6.1.2.2 $not

6.1.2.3 $nor

6.1.2.4 $or

6.1.3 元素查询运算符

6.1.3.1 $exists

6.1.3.2 $type

6.1.4 评估运算符

6.1.4.1 $expr

6.1.4.2 $jsonSchema

6.1.4.3 $mod

6.1.4.4 $regex

6.1.4.5 $text

6.1.4.6 $where

6.1.5 数组操作运算符

6.1.5.1 $all

6.1.5.2 $elemMatch

6.1.5.3 $size

6.2 update运算符

6.2.1 字段update运算符

6.2.1.1 $currentDate

6.2.1.2 $inc

6.2.1.3

6.2.1.4 $mul

6.2.1.5 $rename

6.2.1.6 $set

6.2.1.7 $setOnInsert

6.2.1.8 $unset

6.2.2 数组update运算符

6.2.3 位运算符

7. mongodb之CURD众方法

7.1 Insert

7.2 Delete

7.3 Update

7.4 Query

8. mongodb索引

8.1. 索引的目的

8.2. 索引的实现原理

8.3 mondodb中支持的索引类型：

8.4 如何创建索引

8.5 Examples

8.6 索引管理

8.7 索引

9. MongoDB聚合管道(Aggregation Pipeline)

9.1 db.collection.aggregate(pipeline, options)

9.2 Aggregation Pipeline Stages(有点多，只介绍常用的)

9.2.1 db.collection.aggregate() Stages

9.2.1.1 $group

9.2.1.2 $limit

9.2.1.3 $lookup

9.2.1.4 $match

9.2.1.5 $out

9.2.1.6 $project

9.2.1.7 $sort

9.2.2 db.aggregate() Stages

推荐阅读更多精彩内容