Using MongoDB

1. Introduction to MongoDB

MongoDB -- a document database
-- "document" here does not mean a .pdf or .doc/.docx file
-- a document is an associative array
-- document == JSON object
-- document == PHP array
-- document == Python dict
-- document == Ruby hash
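
As a small illustration of the equivalences above, the following sketch builds a document as a Python dict and round-trips it through JSON with the standard json module. The field names and values are made up for the example, and no MongoDB server is involved.

```python
import json

# A "document" is just an associative array; in Python that is a dict.
# The fields below are illustrative only.
city = {
    "name": "Chicago",
    "tags": ["lake", "wind"],
    "location": {"state": "Illinois"},
}

# Serialize the dict to a JSON object string and parse it back.
as_json = json.dumps(city, sort_keys=True)
round_tripped = json.loads(as_json)

print(as_json)
print(round_tripped == city)
```

Nested dicts and lists survive the round trip, which is why a dict is a natural way to represent a document in Python.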

MongoDB is a NoSQL database.

You can learn about the JSON data format on its official website.

Python dictionary documentation

PHP array documentation

Ruby hash documentation

4. Why Use MongoDB

-- flexible schema
-- oriented toward programmers
-- flexible deployment
-- designed for big data
-- aggregation framework

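You can download and install MongoDB from the official MongoDB page. You can also read the detailed MongoDB installation instructions.

For most exercises in this course you do not need MongoDB installed on your computer, but we recommend installing it for the best learning experience. Installation is quick and easy!

MongoDB has a large number of drivers and client libraries. The one used in this course is PyMongo. See the official documentation for PyMongo installation instructions.

MongoDB configuration

5. A First Look at MongoDB

Install pymongo to run this code locally: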

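pip install pymongo

def add_city(db):
    # Insert a single document into the 'cities' collection.
    db.cities.insert({"name" : "Chicago"})

def get_city(db):
    # Return the first document found in the 'cities' collection.
    return db.cities.find_one()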

def get_db():
    # For local use
    from pymongo import MongoClient
    client = MongoClient('localhost:27017')
    # 'examples' here is the database name. It will be created if it does not exist.
    db = client.examples
    return db

if __name__ == "__main__":
    # For local use
    db = get_db()  # connect to the local server; without this line 'db' is undefined
    add_city(db)
    print(get_city(db))

8. Introduction to PyMongo

tesla_s = {
    "manufacturer" : "Tesla Motors",
    "class" : "full-size",
    "body style" : "5-door liftback",
    "production" : [2012,2013],
    "model years" : [2013],
    "layout" : ["Rear-motor","rear-wheel drive"],
    "designer" : {
        "firstname":"Franz",
        "surname":"von Holzhausen"
    },
    "assembly" : [
        {
            "country":"United States",
            "city" : "Fremont",
            "state" : "California"
        },
        {
            "country":"The Netherlands",
            "city":"Tilburg"
        }
    ]
}

from pymongo import MongoClient
import pprint

client = MongoClient('mongodb://localhost:27017/')  # create the client object from a connection string
db = client.examples              # select the 'examples' database
db.autos.insert(tesla_s)          # insert the document 'tesla_s' into the 'autos' collection of the 'examples' database
for a in db.autos.find():         # db.autos.find() returns a cursor over all documents in the 'autos' collection
    pprint.pprint(a)

MongoDB ensures that any document we insert can be uniquely identified by its _id field. If we don't specify a value for _id, MongoDB will create one for us.

9. Querying by Field Selection

We construct a query document containing the fields, and the values for those fields, that we want to see in every document of the result set.
We then simply loop through the results and print each one.
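
To make the idea of a query document concrete, here is a minimal in-memory sketch of equality matching. The `matches` helper and the sample data are hypothetical, written only for this illustration; a real MongoDB server applies additional rules (arrays, nested documents, types) that this sketch ignores.

```python
def matches(document, query):
    """Return True if every field in `query` equals the same field in `document`."""
    return all(
        field in document and document[field] == value
        for field, value in query.items()
    )

# Illustrative data standing in for the 'autos' collection.
autos = [
    {"manufacturer": "Toyota", "class": "mid-size car"},
    {"manufacturer": "Porsche", "class": "sports car"},
]

query = {"manufacturer": "Toyota"}
results = [a for a in autos if matches(a, query)]
print(results)
```

Adding a second field to the query document narrows the result, because every listed field has to match.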

def find():
    autos = db.autos.find({'manufacturer':'Toyota'})
    for a in autos:
        pprint.pprint(a)      

Example

"""
Your task is to complete the 'porsche_query' function and in particular the query
to find all autos where the manufacturer field matches "Porsche".
"""

def porsche_query():
    # Please fill in the query to find all autos manufactured by Porsche.
    query = {"manufacturer" : "Porsche"}
    return query

# Code here is for local use on your own computer.
def get_db(db_name):
    # For local use
    from pymongo import MongoClient
    client = MongoClient('localhost:27017')
    db = client[db_name]
    return db

def find_porsche(db, query):
    # For local use
    return db.autos.find(query)

if __name__ == "__main__":
    # For local use
    db = get_db('examples')
    query = porsche_query()
    results = find_porsche(db, query)

    print("Printing first 3 results\n")
    import pprint
    for car in results[:3]:
        pprint.pprint(car)

11. Querying on Multiple Fields

def find():
    autos = db.autos.find({'manufacturer':'Toyota','class':'mid-size car'})
    for a in autos:
        pprint.pprint(a)

12. Projection Queries

def find():
    query = {'manufacturer':'Toyota','class':'mid-size car'}
    # Return only the 'name' field and suppress the default '_id' field.
    projection = {'_id':0,'name':1}
    autos = db.autos.find(query, projection)
    for a in autos:
        pprint.pprint(a)
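
The effect of a projection document can be sketched in plain Python. The `project` helper below is hypothetical and handles only simple inclusion projections on top-level fields; the sample document is made up. It does reflect one rule that holds in MongoDB: _id is returned unless the projection sets it to 0.

```python
def project(document, projection):
    """Apply a simple inclusion projection such as {'_id': 0, 'name': 1}."""
    included = [f for f, flag in projection.items() if flag and f != "_id"]
    result = {f: document[f] for f in included if f in document}
    # _id is returned by default unless the projection sets it to 0.
    if projection.get("_id", 1) and "_id" in document:
        result["_id"] = document["_id"]
    return result

# Illustrative document; the values are not taken from the course data set.
doc = {"_id": 1, "name": "Camry", "manufacturer": "Toyota", "class": "mid-size car"}

print(project(doc, {"_id": 0, "name": 1}))
print(project(doc, {"name": 1}))
```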

13. Importing Data into MongoDB

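from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017/')
db = client.examples
# 'autos' is assumed to be a list of documents (dicts) prepared earlier,
# e.g. parsed from a data file; it is not defined in these notes.
num_autos = db.myautos.find().count()
print("num_autos before: %d" % num_autos)
for a in autos:
    db.myautos.insert(a)

num_autos = db.myautos.find().count()
print("num_autos after: %d" % num_autos)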

14. Inserting Multiple Documents

""" 
Add a single line of code to the insert_autos function that will insert the
automobile data into the 'autos' collection. The data variable that is
returned from the process_file function is a list of dictionaries, as in the
example in the previous video.
"""

from autos import process_file


def insert_autos(infile, db):
    data = process_file(infile)
    # Add your code here. Insert the data in one command.
    db.autos.insert(data)
  
if __name__ == "__main__":
    # Code here is for local use on your own computer.
    from pymongo import MongoClient
    client = MongoClient("mongodb://localhost:27017")
    db = client.examples

    insert_autos('autos-small.csv', db)
    print(db.autos.find_one())
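
For readers on a newer driver: PyMongo 3 and later provide insert_many() for inserting a list of documents in one call, and the legacy insert() used in these notes was removed in PyMongo 4. The sketch below assumes `collection` is a PyMongo collection object such as db.autos; the `insert_all` helper name is hypothetical.

```python
def insert_all(collection, documents):
    """Insert a list of dicts with one call and return how many were stored."""
    if not documents:
        # insert_many() rejects an empty list, so skip the call entirely.
        return 0
    result = collection.insert_many(documents)
    # The result object lists the _id assigned to each inserted document.
    return len(result.inserted_ids)
```

With a running server this could be called as insert_all(db.autos, data), where data is the list returned by process_file.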

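15. Using mongoimport

Write all documents out as a JSON file.
There are actually two steps:
1. Clean the data
2. Import the data into MongoDB

View the help text:
mongoimport --help
mongoimport -d examples -c myautos2 --file autos.json
-d examples  specifies the database
-c myautos2  specifies the collection that will store the data
--file autos.json  specifies the file to import; here the file is located in the examples folder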
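
The first step, producing the JSON file, could look like the sketch below. By default mongoimport reads one JSON document per line, so the helper writes the cleaned records in that layout. The `write_json_lines` name, the sample records, and the temporary output path are assumptions made for this example.

```python
import json
import os
import tempfile

def write_json_lines(documents, path):
    """Write one JSON document per line and return the number written."""
    with open(path, "w") as f:
        for doc in documents:
            f.write(json.dumps(doc) + "\n")
    return len(documents)

# Illustrative, already-cleaned records.
autos = [
    {"name": "Model S", "manufacturer": "Tesla Motors"},
    {"name": "911", "manufacturer": "Porsche"},
]

path = os.path.join(tempfile.mkdtemp(), "autos.json")
count = write_json_lines(autos, path)
print("wrote %d documents to %s" % (count, path))
```

The resulting file can then be passed to mongoimport with the --file option shown above.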

16. Operators

Inequality operators

$gt (>)
$lt (<)
$gte (≥)
$lte (≤)
$ne (≠)
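
The way several operators combine inside one condition can be sketched in memory. The `COMPARATORS` table and `satisfies` helper are hypothetical names for this illustration; the sketch checks a single value and ignores server-side details such as missing fields and cross-type ordering.

```python
import operator

# Map each MongoDB comparison operator to the matching Python function.
COMPARATORS = {
    "$gt": operator.gt,
    "$lt": operator.lt,
    "$gte": operator.ge,
    "$lte": operator.le,
    "$ne": operator.ne,
}

def satisfies(value, condition):
    """Return True if `value` meets every operator in `condition`."""
    return all(COMPARATORS[op](value, bound) for op, bound in condition.items())

print(satisfies(300000, {"$gt": 250000, "$lte": 500000}))   # inside the range
print(satisfies(600000, {"$gt": 250000, "$lte": 500000}))   # above the upper bound
print(satisfies("Xenia", {"$gte": "X", "$lt": "Y"}))        # strings compare lexicographically
```

All operators listed in one condition must hold at the same time, which is how a range is expressed.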

def find():
    # Cities with a population greater than 250,000.
    query = {'population': {'$gt': 250000}}
    cities = db.cities.find(query)

    num_cities = 0
    for c in cities:
        pprint.pprint(c)
        num_cities += 1

    print("\nNumber of cities matching: %d\n" % num_cities)

def find():
    # Cities with a population greater than 250,000 and at most 500,000.
    query = {'population': {'$gt': 250000, '$lte': 500000}}
    cities = db.cities.find(query)

    num_cities = 0
    for c in cities:
        pprint.pprint(c)
        num_cities += 1

    print("\nNumber of cities matching: %d\n" % num_cities)

def find():
    # Cities whose name sorts from 'X' up to, but not including, 'Y'.
    query = {'name': {'$gte': 'X', '$lt': 'Y'}}
    cities = db.cities.find(query)

    num_cities = 0
    for c in cities:
        pprint.pprint(c)
        num_cities += 1

    print("\nNumber of cities matching: %d\n" % num_cities)

from datetime import datetime

def find():
    # Cities founded during the year 1837.
    query = {'foundingDate': {'$gte': datetime(1837,1,1), '$lte': datetime(1837,12,31)}}
    cities = db.cities.find(query)

    num_cities = 0
    for c in cities:
        pprint.pprint(c)
        num_cities += 1

    print("\nNumber of cities matching: %d\n" % num_cities)

def find():
    # Cities whose country is not the United States.
    query = {'country': {'$ne': 'United States'}}
    cities = db.cities.find(query)

    num_cities = 0
    for c in cities:
        pprint.pprint(c)
        num_cities += 1

    print("\nNumber of cities matching: %d\n" % num_cities)

"""
Your task is to write a query that will return all cities
that were founded in the 21st century.
Please modify only 'range_query' function, as only that will be taken into account.
"""

from datetime import datetime
    
def range_query():
    # Modify the below line with your query.
    # You can use datetime(year, month, day) to specify date in the query
    query = {"foundingDate":{"$gte":datetime(2001,1,1)}}
    return query

# Do not edit code below this line in the online code editor.
# Code here is for local use on your own computer.
def get_db():
    from pymongo import MongoClient
    client = MongoClient('localhost:27017')
    db = client.examples
    return db

if __name__ == "__main__":
    # For local use
    db = get_db()
    query = range_query()
    cities = db.cities.find(query)

    print("Found cities: %d" % cities.count())
    import pprint
    pprint.pprint(cities[0])

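19. Existence (the $exists Operator)

To start the mongo shell locally, enter the following command in a terminal:
mongo

> use examples
switched to db examples
> db.cities.find()           // returns all documents

The $exists operator lets us retrieve documents based on whether they contain a particular field.

db.cities.find({"governmentType":{"$exists":1}}).count()   // {"$exists":1} matches documents that have the field; count() counts the query results
db.cities.find({"governmentType":{"$exists":0}}).pretty()  // {"$exists":0} matches documents that lack the field; pretty() formats the output for readability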
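
From Python, the same filter is just a dict. The sketch below builds the query document and, separately, shows the in-memory equivalent for top-level fields. Both helper names and the sample cities are hypothetical; PyMongo accepts True/False as the $exists value.

```python
def exists_query(field, should_exist=True):
    """Build a query document using the $exists operator."""
    return {field: {"$exists": should_exist}}

def has_field(document, field):
    """In-memory equivalent of {field: {"$exists": True}} for top-level fields."""
    return field in document

# Illustrative documents; only the first one has a governmentType field.
cities = [
    {"name": "Chicago", "governmentType": "Mayor-council"},
    {"name": "Springfield"},
]

print(exists_query("governmentType"))
print([c["name"] for c in cities if has_field(c, "governmentType")])
print([c["name"] for c in cities if not has_field(c, "governmentType")])
```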

20. The Regex Operator ($regex)

MongoDB supports querying string patterns with $regex.
$regex
-- based on a regular expression library, specifically PCRE (Perl Compatible Regular Expressions)
-- allows us to run regular expression queries in MongoDB

db.cities.find({"motto":{"$regex":"friendship"}}).pretty()

This matches every document whose motto contains "friendship". Without $regex, i.e. with {"motto":"friendship"}, the query would match only documents where "friendship" is the entire motto string.

db.cities.find({"motto":{"$regex":"[Ff]riendship"}}).pretty()

Finds all documents whose motto contains the word "friendship", where the initial f may be upper or lower case.

db.cities.find({"motto":{"$regex":"[Ff]riendship|[Pp]ride"}}).pretty()

This regular expression matches all documents whose motto contains either "friendship" or "pride"; the initial letter of either word may be upper or lower case.
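
The behaviour of these patterns can be tried out locally with Python's re module. Python's engine is not PCRE, but for simple patterns like these it is assumed to behave the same way. The mottos and the `find_matching` helper are invented for this sketch.

```python
import re

# Illustrative mottos, not taken from the course data set.
mottos = [
    "City of friendship",
    "Friendship and progress",
    "Pride of the north",
    "friendship",
    "Unity",
]

def find_matching(pattern, values):
    """Return the values in which `pattern` matches anywhere in the string."""
    regex = re.compile(pattern)
    return [v for v in values if regex.search(v)]

print(find_matching("friendship", mottos))
print(find_matching("[Ff]riendship", mottos))
print(find_matching("[Ff]riendship|[Pp]ride", mottos))
# An exact string comparison matches only when the whole motto is equal.
print([m for m in mottos if m == "friendship"])
```

An unanchored pattern matches anywhere inside the string, while plain equality requires the whole value to be identical.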

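21. Querying Arrays with a Scalar

db.autos.find({"modelYears":1980}).pretty()

The value of the modelYears field is an array; a scalar query like this matches documents whose array contains 1980.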

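23. Querying with the $in Operator

The $in operator lets us specify an array of values.

db.autos.find({"modelYears":{"$in":[1965,1966,1967]}}).count()

This query retrieves the documents whose modelYears field contains any one of the values in the array [1965,1966,1967].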
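
The matching rules for array fields can be sketched in memory as follows. Both helpers are hypothetical and cover only the simple cases discussed here: a scalar query against an array field, and $in against either an array or a single value.

```python
def matches_scalar(field_value, wanted):
    """Scalar query: an array field matches if any element equals `wanted`."""
    if isinstance(field_value, list):
        return wanted in field_value
    return field_value == wanted

def matches_in(field_value, candidates):
    """$in query: match if the value, or any array element, is in `candidates`."""
    values = field_value if isinstance(field_value, list) else [field_value]
    return any(v in candidates for v in values)

model_years = [1965, 1966, 1970]  # illustrative field value
print(matches_scalar(model_years, 1966))
print(matches_in(model_years, [1965, 1966, 1967]))
print(matches_in([1980], [1965, 1966, 1967]))
print(matches_in("Ford Motor Company", ["Ford Motor Company", "Toyota"]))
```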
Example

def in_query():
    # Modify the below line with your query; try to use the $in operator.
    query = {"manufacturer":"Ford Motor Company","assembly":{"$in":["Germany","Japan","United Kingdom"]}}
    
    return query


# Do not edit code below this line in the online code editor.
# Code here is for local use on your own computer.
def get_db():
    from pymongo import MongoClient
    client = MongoClient('localhost:27017')
    db = client.examples
    return db


if __name__ == "__main__":

    db = get_db()
    query = in_query()
    autos = db.autos.find(query, {"name":1, "manufacturer":1, "assembly": 1, "_id":0})

    print("Found autos: %d" % autos.count())
    import pprint
    for a in autos:
        pprint.pprint(a)

24. Querying with the $all Operator

Retrieves the documents whose field contains all of the given values.

db.autos.find({"modelYears":{"$all":[1965,1966,1967]}})
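
A minimal in-memory sketch of the $all rule, under the assumption that the field holds a flat list of values; the `matches_all` helper is hypothetical.

```python
def matches_all(field_value, required):
    """$all query: every value in `required` must appear in the array field."""
    values = field_value if isinstance(field_value, list) else [field_value]
    return all(r in values for r in required)

# Contains all three years (plus one more), so it matches.
print(matches_all([1965, 1966, 1967, 1968], [1965, 1966, 1967]))
# 1967 is missing, so it does not match.
print(matches_all([1965, 1966], [1965, 1966, 1967]))
```

Extra values in the array do not prevent a match; only a missing required value does.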

25. Dot Notation

Dot notation lets us query for values inside nested documents.
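
How a dotted path such as "dimensions.width" is resolved can be sketched with a small helper. `get_path` and the sample car are hypothetical; the sketch walks nested dicts only and does not model how MongoDB traverses arrays.

```python
def get_path(document, dotted_path):
    """Follow a dotted path such as 'dimensions.width' through nested dicts."""
    current = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # the path does not exist in this document
        current = current[key]
    return current

# Illustrative document with a nested 'dimensions' sub-document.
car = {"name": "Example car", "dimensions": {"width": 2.6, "height": 1.4}}

print(get_path(car, "dimensions.width"))
print(get_path(car, "dimensions.depth"))
```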

db.tweets.find().pretty()

#!/usr/bin/env python
"""
Your task is to write a query that will return all cars with width dimension
greater than 2.5. Please modify only the 'dot_query' function, as only that
will be taken into account.

Your code will be run against a MongoDB instance that we have provided.
If you want to run this code locally on your machine, you will need to install
MongoDB, download and insert the dataset. For instructions related to MongoDB
setup and datasets, please see the Course Materials.
"""


def dot_query():
    # Edit the line below with your query - try to use dot notation.
    # You can check out example_auto.txt for an example of the document
    # structure in the collection.
    query = {"dimensions.width":{"$gt":2.5}}
    return query


# Do not edit code below this line in the online code editor.
# Code here is for local use on your own computer.
def get_db():
    from pymongo import MongoClient
    client = MongoClient('localhost:27017')
    db = client.examples
    return db


if __name__ == "__main__":
    db = get_db()
    query = dot_query()
    cars = db.cars.find(query)

    print("Printing first 3 results\n")
    import pprint
    for car in cars[:3]:
        pprint.pprint(car)

26. Updating

Modify existing documents in a collection.
save()

def main():
    city=db.cities.find_one({"name":"munchen",
                             "country":"Germany"})  #returns the first document it finds
    city['isoCountryCode']='DEU'
    db.cities.save(city)

save()
A method on collection objects.
Calling save() updates the stored document so that it includes the new field.

27. Set and Unset: $set & $unset

update() takes a query document as its first argument and an update document as its second.
By default, update() operates on just one document.

$set

def find():
    city=db.cities.update({"name":"munchen",
                           "country":"Germany"},
                          {"$set":
                              {"isoCountryCode":"DEU"
                           }})

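The semantics of $set: once a matching document is found,
if the document does not contain the specified field, the field is added with the given value;
if the document already contains the field, the field is updated to the given value.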

$unset

def find():
    city=db.cities.update({"name":"munchen",
                           "country":"Germany"},
                          {"$unset":
                              {"isoCountryCode":""
                           }})

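The semantics of $unset: for whichever document matches the query,
if it has the specified field, the field is removed and the value given in the update is ignored;
if it does not have the field, the call has no effect.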
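
The semantics described for $set and $unset can be sketched in memory. The `apply_update` helper is hypothetical and handles top-level fields only; it returns a modified copy rather than touching a database.

```python
import copy

def apply_update(document, update):
    """Return a copy of `document` with $set and $unset applied."""
    result = copy.deepcopy(document)
    for field, value in update.get("$set", {}).items():
        result[field] = value          # add the field or overwrite its value
    for field in update.get("$unset", {}):
        result.pop(field, None)        # remove the field; no-op if it is absent
    return result

city = {"name": "munchen", "country": "Germany"}
with_code = apply_update(city, {"$set": {"isoCountryCode": "DEU"}})
without_code = apply_update(with_code, {"$unset": {"isoCountryCode": ""}})
print(with_code)
print(without_code)
```

Note that the value paired with a field under $unset plays no role; only the field name matters.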

28. Updating Multiple Documents

def find():
    city=db.cities.update({"country":"Germany"},
                          {"$set": {"isoCountryCode":"DEU"}},multi=True)

By default, update() modifies just the first document it finds.
To modify all documents that match the query, we need to specify
multi=True.
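
In PyMongo 3 and later the same choice is made by calling update_one() or update_many(), and the legacy update() method was removed in PyMongo 4. The sketch below assumes `collection` is a PyMongo collection object such as db.cities; the `set_country_code` helper and its `all_matches` flag are hypothetical.

```python
def set_country_code(collection, country, code, all_matches=True):
    """Set isoCountryCode on one or on all documents for `country`."""
    query = {"country": country}
    update = {"$set": {"isoCountryCode": code}}
    if all_matches:
        result = collection.update_many(query, update)   # like multi=True
    else:
        result = collection.update_one(query, update)    # first match only
    # modified_count reports how many documents were actually changed.
    return result.modified_count
```

With a running server this could be called as set_country_code(db.cities, "Germany", "DEU").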

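29. Deleting Documents

> use examples
switched to db examples
> db.cities.find()    // returns all documents in the collection
> db.cities.remove()  // removes all documents from the collection
> db.cities.drop()    // drops the collection along with its metadata, such as indexes
> db.cities.remove({"name":"Chicago"})  // removes all documents whose name is "Chicago"
> db.cities.remove({"name":{"$exists":0}})  // removes all documents that have no name field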