VIII. JSON files

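I. Building a pipeline that saves JSON

1. First, why bother with JSON at all?

If you don't have a database, you still need somewhere to store the scraped data, and writing it to a JSON file is a simple way to do that.

2. The code: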

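import codecs
import json


class JsonWithEncodingPipeline(object):
    def __init__(self):
        # Use codecs to open the file and write it as UTF-8
        self.file = codecs.open('article.json', 'w', encoding="utf-8")

    def process_item(self, item, spider):
        # One JSON object per line; ensure_ascii=False keeps non-ASCII text readable
        lines = json.dumps(dict(item), ensure_ascii=False) + "\n"
        self.file.write(lines)
        return item

    def close_spider(self, spider):
        # Scrapy calls close_spider on a pipeline when the spider finishes;
        # a method named spider_closed would only run if connected to the signal
        self.file.close()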
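To see what process_item writes for each item, here is a minimal sketch that uses only the standard library. The sample item is made up for illustration, and a plain dict stands in for a Scrapy item.

```python
import json

# Hypothetical item; a plain dict stands in for a Scrapy item here.
item = {"title": "你好", "url": "http://example.com/1"}

# ensure_ascii=False keeps non-ASCII characters as they are.
readable = json.dumps(dict(item), ensure_ascii=False) + "\n"

# The default (ensure_ascii=True) escapes them as \uXXXX sequences.
escaped = json.dumps(dict(item)) + "\n"

print(readable, end="")
print(escaped, end="")
```

Both lines decode to the same data; the only difference is how readable the file is when you open it in an editor.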

3. Configuration in settings.py

ITEM_PIPELINES = {
    'mm.pipelines.MmPipeline': 300,
    # 'scrapy.pipelines.images.ImagesPipeline': 1,
    'mm.pipelines.ArticleImagePipeline': 1,
    'mm.pipelines.JsonWithEncodingPipeline': 2,
}
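The integers in ITEM_PIPELINES decide the order: Scrapy passes each item through the pipelines from the lowest value to the highest. The sketch below simply sorts the same dictionary to show the resulting order. It is an illustration only, not Scrapy's own code.

```python
# Same mapping as in the settings above (commented-out entry left out).
ITEM_PIPELINES = {
    'mm.pipelines.MmPipeline': 300,
    'mm.pipelines.ArticleImagePipeline': 1,
    'mm.pipelines.JsonWithEncodingPipeline': 2,
}

# Sort the pipeline paths by their integer value: lower numbers run first.
run_order = sorted(ITEM_PIPELINES, key=ITEM_PIPELINES.get)

for path in run_order:
    print(ITEM_PIPELINES[path], path)
```

With these values the image pipeline sees each item first, then the JSON pipeline writes it, and MmPipeline runs last.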

4. Run the spider: each item is exported in JSON format and written to article.json.
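A quick way to check the result is to read the file back. The sketch below is a hedged example: it first writes a small stand-in file with two made-up items in a temporary directory (so it runs without a crawl), then reads it the way you would read article.json.

```python
import json
import os
import tempfile

# Stand-in for article.json: two made-up items, one JSON object per line,
# written the same way process_item writes them.
path = os.path.join(tempfile.mkdtemp(), "article.json")
with open(path, "w", encoding="utf-8") as f:
    for sample in ({"title": "first"}, {"title": "second"}):
        f.write(json.dumps(sample, ensure_ascii=False) + "\n")

# Read it back: every non-empty line is one complete JSON object.
with open(path, encoding="utf-8") as f:
    items = [json.loads(line) for line in f if line.strip()]

print(len(items), items[0]["title"])
```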

5. Exporting the JSON file with the JSON exporter that Scrapy provides

from scrapy.exporters import JsonItemExporter


class JsonExporterPipleline(object):
    # Uses the JSON exporter that Scrapy provides to write the JSON file;
    # used specifically for exporting the cover images
    def __init__(self):
        # The exporter writes bytes, so the file is opened in binary mode
        self.file = open('articleexport.json', 'wb')
        self.exporter = JsonItemExporter(self.file, encoding="utf-8", ensure_ascii=False)
        self.exporter.start_exporting()

    def close_spider(self, spider):
        # Finish the export before closing the file
        self.exporter.finish_exporting()
        self.file.close()

    def process_item(self, item, spider):
        self.exporter.export_item(item)
        return item

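6. The difference between JsonExporterPipleline and JsonWithEncodingPipeline

Strictly speaking, the earlier JsonWithEncodingPipeline just gives you a pile of data that you serialize by hand, one JSON object per line. The question is how that pile should be laid out: as XML, as CSV, or as JSON? The exporters in scrapy.exporters, such as the one imported with from scrapy.exporters import JsonItemExporter, handle the layout for you; JsonItemExporter writes all items as a single JSON array. Command-click JsonItemExporter in your editor to jump to the module and see the other formats that are available.

They are: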

['BaseItemExporter','PprintItemExporter','PickleItemExporter',

'CsvItemExporter','XmlItemExporter','JsonLinesItemExporter',

'JsonItemExporter','MarshalItemExporter']
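To make the difference between the two layouts concrete, here is a minimal sketch built with the standard json module rather than Scrapy itself. The items are made up, and the array string only approximates what JsonItemExporter produces (the exact whitespace may differ).

```python
import json

# Made-up items for illustration.
items = [{"title": "first"}, {"title": "second"}]

# Layout written by JsonWithEncodingPipeline: one object per line.
as_lines = "".join(json.dumps(i, ensure_ascii=False) + "\n" for i in items)

# Layout comparable to JsonItemExporter output: a single JSON array.
as_array = json.dumps(items, ensure_ascii=False)

# The array is one JSON document, so it parses in a single call.
from_array = json.loads(as_array)

# The line-per-object text is not one JSON document, so parsing it
# in a single call fails and it has to be parsed line by line.
try:
    json.loads(as_lines)
    parsed_whole = True
except json.JSONDecodeError:
    parsed_whole = False

from_lines = [json.loads(line) for line in as_lines.splitlines()]
print(parsed_whole, from_array == items, from_lines == items)
```

The line-per-object layout is easy to append to and to stream, while the array layout can be loaded by any JSON parser in one step. JsonLinesItemExporter in the list above is the exporter that writes the line-per-object layout.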
