Scraping Tencent Video Movie Data with Python and Plotting a Scatter Plot


Yesterday I used Python to scrape a batch of movie data and plot it as a scatter chart. It turned out to be quite interesting, so I'm sharing it here. The result first:


Tencent Video movie rating vs. runtime scatter plot

The x-axis is each movie's runtime in minutes and the y-axis is its rating. This quick plot has no axis labels yet; they can be added later. What follows is the whole project from start to finish:

Finding the Target Site

The first step, naturally, is to find the target site and inspect its page structure:


Tencent Video page structure

Let's look at the URL: https://v.qq.com/x/list/movie?&offset=0. It follows a clear pattern, which should make writing the requests later a bit easier.

The second page's URL is https://v.qq.com/x/list/movie?&offset=30, which confirms the pattern: Tencent Video's offset grows by 30 per page, and the last page is offset 4980 (4980 / 30 + 1 = 167 listing pages in total): https://v.qq.com/x/list/movie?&offset=4980

Preparing the Crawler

Next we create the Scrapy project. A project can be created with:

scrapy startproject <project_name>

Then, from inside the newly created project directory, run:

scrapy genspider <spider_name> <domain>

to generate the spider file (replace the angle-bracketed placeholders with your own names).
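
For this project, given the names that appear in the code below (project urls_10_2, spider urls, domain qq.com), the commands would have looked roughly like this:

scrapy startproject urls_10_2
cd urls_10_2
scrapy genspider urls qq.com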

First, let's locate the information we need. The listing page does not show each movie's runtime, so we have to open every movie's player page to get it. The title and rating, on the other hand, are available on the listing page.

I ran into a problem here, though: I couldn't find a way to yield the item only after all the second-level pages for a listing page had been crawled. Each attempt produced a list of movie titles and ratings but kept only the first rating, and every workaround raised errors. So I ended up scraping everything from the second-level pages instead. Luckily, all the information is available there anyway.

Player page information

As you can see, the player page contains the movie's title, rating, and runtime. To reach these second-level pages we need a separate XPath that collects each movie's URL from the listing page, so we can iterate over them with Requests.

Debugging the Crawler

First we use the Scrapy shell to work out the best XPath expressions for extracting the information. Normally you just run scrapy shell followed by the URL, but Tencent Video's site caused a small problem, as shown below:


Shell output showing the redirect problem

The shell hangs at this point instead of finishing; pressing Enter gives:


Shell output showing the redirect problem (after pressing Enter)

and the session shows "stopped". Some searching revealed that the page triggers a redirect, but the shell can work around it through request meta parameters, using the following commands:

scrapy shell
from scrapy import Request
req = Request("https://v.qq.com/x/list/movie?&offset=4980", meta={'dont_redirect': True})
fetch(req)  # fetch() fills in the shell's `response` object rather than returning it

With the dont_redirect meta key set to True, the redirect is bypassed and we get a usable shell session for the site.
Next we look for an XPath that selects the movie URLs. Inspecting the page gives:

//*[@class="figure_title"]/a/@href

In XPath, attributes are selected with /@, while child tags just need /. When the expression is written inside a double-quoted Python string, the inner double quotes have to be escaped with \.
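
For example, in the Scrapy shell the same expression can be written either way (a quick illustration, not taken from the original session):

response.xpath("//*[@class=\"figure_title\"]/a/@href").extract()   # escaped double quotes
response.xpath('//*[@class="figure_title"]/a/@href').extract()     # single-quoted string, no escaping needed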


XPath debugging: all the URLs retrieved

With this XPath we successfully get the URLs of every movie on the first page. The same approach is used to work out the XPaths for the second-level pages and to confirm them in the Scrapy shell.

Writing the Spider

At the top of the spider class, the allowed domain is simply Tencent Video's site, and since the URLs are so regular, building start_urls is easy:

allowed_domains = ['qq.com']
start_urls = []
base_url = 'https://v.qq.com/x/list/movie?&offset='
# offsets 0, 30, 60, ..., 4980 -- one per listing page
for i in range(0, 4981, 30):
    start_urls.append(base_url + str(i))

This makes it trivial to cover every listing page.

Next, to guard against href attributes that only contain partial URLs, I practised using urljoin in the first parse method. This particular page doesn't actually need it; a direct Request would do.

def parse(self, response):
    post_url_1 = ''  # practising urljoin here, in case a page's href only gives a partial URL
    yield scrapy.Request(url=parse.urljoin(response.url, post_url_1), callback=self.parse_detail,
                         dont_filter=True)

Here dont_filter=True means the duplicate filter is bypassed for this request, i.e. the URL is not deduplicated. By contrast, when scraping Taobao shop information, products from the same shop can show up repeatedly, so you would leave dont_filter=False (the default) so the crawler doesn't visit the same shop twice.

Next, in parse_detail (the first detail callback) we issue a Request for each URL we collected:

# urlitem is the UrlItem() created at the top of parse_detail (see the full code below)
cl = response.xpath("//*[@class=\"figure_title\"]/a/@href").extract()
for i in range(len(cl)):
    # normalise the href: take everything after "//" and prepend "http://"
    yield scrapy.Request(url="http://" + re.findall(r"//(.*)", cl[i])[0], meta={'items': urlitem},
                         callback=self.parse_detail_2, dont_filter=False)

This lets parse_detail_2 walk through the first layer of second-level pages, and deeper levels can be handled the same way. Just make sure the item is yielded in the right place, otherwise fields scraped at different levels won't end up attached to the same movie; see the sketch below.
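
To illustrate that last point, here is a sketch of the usual Scrapy pattern (not the code used in this project, which fills in every field at the second level): create one fresh item per link, carry it along in meta, and yield it only in the deepest callback.

import scrapy
from urls_10_2.items import UrlItem


class SketchSpider(scrapy.Spider):
    """Illustrative only: one item per movie, yielded in the deepest callback."""
    name = 'urls_sketch'
    allowed_domains = ['qq.com']
    start_urls = ['https://v.qq.com/x/list/movie?&offset=0']

    def parse(self, response):
        for sel in response.xpath('//*[@class="figure_title"]/a'):
            item = UrlItem()  # a fresh item per movie
            item['url_name'] = sel.xpath('./text()').extract_first()
            url = response.urljoin(sel.xpath('./@href').extract_first())
            # carry the partly filled item along with the request
            yield scrapy.Request(url, meta={'items': item}, callback=self.parse_player)

    def parse_player(self, response):
        item = response.meta['items']
        item['time'] = response.xpath('//*[@class="figure_count"]/span/text()').extract_first()
        yield item  # yield only once the item is complete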

Spider Settings

# Obey robots.txt rules
ROBOTSTXT_OBEY = False

robots.txt is only a gentlemen's agreement; setting this to False lets the crawler ignore it and gets past many sites' restrictions.

# Configure maximum concurrent requests performed by Scrapy (default: 16)
CONCURRENT_REQUESTS = 100

Setting this to 100, together with the per-domain and per-IP concurrency settings below, speeds up the crawl considerably.

# Configure a delay for requests for the same website (default: 0)
# See https://doc.scrapy.org/en/latest/topics/settings.html#download-delay
# See also autothrottle settings and docs
DOWNLOAD_DELAY = 0
# The download delay setting will honor only one of:
CONCURRENT_REQUESTS_PER_DOMAIN = 100
CONCURRENT_REQUESTS_PER_IP = 100  #13.02
# Disable cookies (enabled by default)
COOKIES_ENABLED = False

Disabling cookies is also said to help get around some sites' blocking, though I haven't seen a concrete effect so far.

# Enable or disable downloader middlewares
# See https://doc.scrapy.org/en/latest/topics/downloader-middleware.html
DOWNLOADER_MIDDLEWARES = {
   'urls_10_2.middlewares.Urls102DownloaderMiddleware': 543,
   #'randoms.rotate_useragent.RotateUserAgentMiddleware': 400
   'urls_10_2.rotate_useragent.RotateUserAgentMiddleware': 400,
   #'urls_10_2.middlewares.ProxyMiddleware': 102,
}

ProxyMiddleware is a class defined in middlewares.py (included in the code below). Many sites will ban your IP once they get tired of being crawled, so a class like this can supply proxy IPs. However, most of the free domestic proxies I found online didn't work, so it is commented out for now. If you're interested, have a look at the Xici free proxy site (西刺免费代理IP).

# Configure item pipelines
# See https://doc.scrapy.org/en/latest/topics/item-pipeline.html
ITEM_PIPELINES = {
   'urls_10_2.pipelines.Urls102Pipeline': 300,
   'urls_10_2.pipelines.UrlsPipeline': 1,
}

This section enables the pipelines that write the scraped data to a file. The number after each pipeline sets the order in which items pass through them: lower values run first (i.e. closer to the spider).

Spider Code: urls (the spider)

# -*- coding: utf-8 -*-
import scrapy
from urllib import parse
from urls_10_2.items import UrlItem
import re


class UrlsSpider(scrapy.Spider):
    name = 'urls'
    allowed_domains = ['qq.com']
    start_urls = []
    base_url = 'https://v.qq.com/x/list/movie?&offset='
    # offsets 0, 30, 60, ..., 4980 -- one per listing page
    for i in range(0, 4981, 30):
        start_urls.append(base_url + str(i))

    def parse(self, response):
        post_url_1 = ''  # practising urljoin here, in case a page's href only gives a partial URL
        yield scrapy.Request(url=parse.urljoin(response.url, post_url_1), callback=self.parse_detail,
                             dont_filter=True)

    def parse_detail(self, response):
        urlitem = UrlItem()
        # urlitem["url_name"] = response.xpath("//*[@class=\"figure_title\"]/a/text()").extract()
        # urlitem["url"] = response.xpath("//*[@class=\"figure_title\"]/a/@href").extract()
        # urlitem["mark_1"] = response.xpath("//*[@class=\"score_l\"]/text()").extract()
        # urlitem["mark_2"] = response.xpath("//*[@class=\"score_s\"]/text()").extract()
        cl = response.xpath("//*[@class=\"figure_title\"]/a/@href").extract()
        for i in range(len(cl)):
            # normalise the href: take everything after "//" and prepend "http://"
            yield scrapy.Request(url="http://" + re.findall(r"//(.*)", cl[i])[0], meta={'items': urlitem},
                                 callback=self.parse_detail_2, dont_filter=False)

    def parse_detail_2(self, response):
        urlitem=response.meta['items']
        urlitem["time"] = response.xpath("//*[@class=\"figure_count\"]/span/text()").extract()[0]
        urlitem["url_name"] = response.xpath("//*[@class=\"video_title _video_title\"]/text()").extract()[0].strip()
        urlitem["mark_1"] = response.xpath("//*[@class=\"units\"]/text()").extract()[0]
        urlitem["mark_2"] = response.xpath("//*[@class=\"decimal\"]/text()").extract()[0]
        yield urlitem

Spider Code: Middlewares

# -*- coding: utf-8 -*-

# Define here the models for your spider middleware
#
# See documentation in:
# https://doc.scrapy.org/en/latest/topics/spider-middleware.html

from scrapy import signals


# class ProxyMiddleware(object):
#     def process_request(self, request, spider):
#         request.meta['proxy'] = "http://118.190.95.35:9001"


class Urls102SpiderMiddleware(object):
    # Not all methods need to be defined. If a method is not defined,
    # scrapy acts as if the spider middleware does not modify the
    # passed objects.

    @classmethod
    def from_crawler(cls, crawler):
        # This method is used by Scrapy to create your spiders.
        s = cls()
        crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
        return s

    def process_spider_input(self, response, spider):
        # Called for each response that goes through the spider
        # middleware and into the spider.

        # Should return None or raise an exception.
        return None

    def process_spider_output(self, response, result, spider):
        # Called with the results returned from the Spider, after
        # it has processed the response.

        # Must return an iterable of Request, dict or Item objects.
        for i in result:
            yield i

    def process_spider_exception(self, response, exception, spider):
        # Called when a spider or process_spider_input() method
        # (from other spider middleware) raises an exception.

        # Should return either None or an iterable of Response, dict
        # or Item objects.
        pass

    def process_start_requests(self, start_requests, spider):
        # Called with the start requests of the spider, and works
        # similarly to the process_spider_output() method, except
        # that it doesn’t have a response associated.

        # Must return only requests (not items).
        for r in start_requests:
            yield r

    def spider_opened(self, spider):
        spider.logger.info('Spider opened: %s' % spider.name)


class Urls102DownloaderMiddleware(object):
    # Not all methods need to be defined. If a method is not defined,
    # scrapy acts as if the downloader middleware does not modify the
    # passed objects.

    @classmethod
    def from_crawler(cls, crawler):
        # This method is used by Scrapy to create your spiders.
        s = cls()
        crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
        return s

    def process_request(self, request, spider):
        # Called for each request that goes through the downloader
        # middleware.

        # Must either:
        # - return None: continue processing this request
        # - or return a Response object
        # - or return a Request object
        # - or raise IgnoreRequest: process_exception() methods of
        #   installed downloader middleware will be called
        return None

    def process_response(self, request, response, spider):
        # Called with the response returned from the downloader.

        # Must either;
        # - return a Response object
        # - return a Request object
        # - or raise IgnoreRequest
        return response

    def process_exception(self, request, exception, spider):
        # Called when a download handler or a process_request()
        # (from other downloader middleware) raises an exception.

        # Must either:
        # - return None: continue processing this exception
        # - return a Response object: stops process_exception() chain
        # - return a Request object: stops process_exception() chain
        pass

    def spider_opened(self, spider):
        spider.logger.info('Spider opened: %s' % spider.name)

Spider Code: settings

# -*- coding: utf-8 -*-

# Scrapy settings for urls_10_2 project
#
# For simplicity, this file contains only settings considered important or
# commonly used. You can find more settings consulting the documentation:
#
#     https://doc.scrapy.org/en/latest/topics/settings.html
#     https://doc.scrapy.org/en/latest/topics/downloader-middleware.html
#     https://doc.scrapy.org/en/latest/topics/spider-middleware.html

BOT_NAME = 'urls_10_2'

SPIDER_MODULES = ['urls_10_2.spiders']
NEWSPIDER_MODULE = 'urls_10_2.spiders'


# Crawl responsibly by identifying yourself (and your website) on the user-agent
#USER_AGENT = 'urls_10_2 (+http://www.yourdomain.com)'
#user_agent_list = [ "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1", "Mozilla/5.0 (X11; CrOS i686 2268.111.0) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.57 Safari/536.11",       "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1092.0 Safari/536.6",       "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1090.0 Safari/536.6",        "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/19.77.34.5 Safari/537.1",       "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.9 Safari/536.5",        "Mozilla/5.0 (Windows NT 6.0) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.36 Safari/536.5",        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3",        "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3",        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3",        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1062.0 Safari/536.3",        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1062.0 Safari/536.3",        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3",        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3",        "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3",       "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.0 Safari/536.3",  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.24 (KHTML, like Gecko) Chrome/19.0.1055.1 Safari/535.24","Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/535.24 (KHTML, like Gecko) Chrome/19.0.1055.1 Safari/535.24"  ]


# Obey robots.txt rules
ROBOTSTXT_OBEY = False

# Configure maximum concurrent requests performed by Scrapy (default: 16)
CONCURRENT_REQUESTS = 100

# Configure a delay for requests for the same website (default: 0)
# See https://doc.scrapy.org/en/latest/topics/settings.html#download-delay
# See also autothrottle settings and docs
DOWNLOAD_DELAY = 0
# The download delay setting will honor only one of:
CONCURRENT_REQUESTS_PER_DOMAIN = 100
CONCURRENT_REQUESTS_PER_IP = 100  #13.02

# Disable cookies (enabled by default)
COOKIES_ENABLED = False

# Disable Telnet Console (enabled by default)
#TELNETCONSOLE_ENABLED = False

# Override the default request headers:
#DEFAULT_REQUEST_HEADERS = {
#   'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
#   'Accept-Language': 'en',
#}

# Enable or disable spider middlewares
# See https://doc.scrapy.org/en/latest/topics/spider-middleware.html
#SPIDER_MIDDLEWARES = {
#    'urls_10_2.middlewares.Urls102SpiderMiddleware': 543,
#}

# Enable or disable downloader middlewares
# See https://doc.scrapy.org/en/latest/topics/downloader-middleware.html
DOWNLOADER_MIDDLEWARES = {
   'urls_10_2.middlewares.Urls102DownloaderMiddleware': 543,
   #'randoms.rotate_useragent.RotateUserAgentMiddleware': 400
   'urls_10_2.rotate_useragent.RotateUserAgentMiddleware': 400,
   #'urls_10_2.middlewares.ProxyMiddleware': 102,

}

# Enable or disable extensions
# See https://doc.scrapy.org/en/latest/topics/extensions.html
#EXTENSIONS = {
#    'scrapy.extensions.telnet.TelnetConsole': None,
#}

# Configure item pipelines
# See https://doc.scrapy.org/en/latest/topics/item-pipeline.html
ITEM_PIPELINES = {
   'urls_10_2.pipelines.Urls102Pipeline': 300,
   'urls_10_2.pipelines.UrlsPipeline': 1,
}

# Enable and configure the AutoThrottle extension (disabled by default)
# See https://doc.scrapy.org/en/latest/topics/autothrottle.html
#AUTOTHROTTLE_ENABLED = True
# The initial download delay
#AUTOTHROTTLE_START_DELAY = 5
# The maximum download delay to be set in case of high latencies
#AUTOTHROTTLE_MAX_DELAY = 60
# The average number of requests Scrapy should be sending in parallel to
# each remote server
#AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
# Enable showing throttling stats for every response received:
#AUTOTHROTTLE_DEBUG = False

# Enable and configure HTTP caching (disabled by default)
# See https://doc.scrapy.org/en/latest/topics/downloader-middleware.html#httpcache-middleware-settings
#HTTPCACHE_ENABLED = True
#HTTPCACHE_EXPIRATION_SECS = 0
#HTTPCACHE_DIR = 'httpcache'
#HTTPCACHE_IGNORE_HTTP_CODES = []
#HTTPCACHE_STORAGE = 'scrapy.extensions.httpcache.FilesystemCacheStorage'

Spider Code: Pipelines

# -*- coding: utf-8 -*-

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://doc.scrapy.org/en/latest/topics/item-pipeline.html
import re


class Urls102Pipeline(object):
    def process_item(self, item, spider):
        return item



class UrlsPipeline(object):
    def __init__(self):
        # open the CSV in append mode and write the header row (name, duration, rating)
        self.file = open("urls.csv", "a+")
        self.file.write("电影名称,电影时长,电影评分\n")

    def process_item(self, item, spider):
        # The intent: check whether the file is empty, write the header row if it is,
        # and otherwise just append. Here the check is hard-coded so every item is appended.
        if 1:  # os.path.getsize("executive_prep.csv"):
            self.write_content(item)  # append one row per item
        else:
            self.file.write("电影名称,电影时长,电影评分\n")
        self.file.flush()
        return item

    def write_content(self, item):
        # url = item["url"]
        time_s = []
        time_final = 0.0

        url_name = item["url_name"]
        mark_1 = item["mark_1"]
        mark_2 = item["mark_2"]
        time = item["time"]
        # replace any commas in the title with '-' so they don't break the CSV columns
        if url_name.find(",") != -1:
            url_name = url_name.replace(",", "-")
        if url_name.find(",") != -1:
            url_name = url_name.replace(",", "-")

        # the duration string is expected as HH:MM:SS -- convert it to minutes
        time_s = time.split(':')
        time_final = float(time_s[0]) * 60 + float(time_s[1]) + float(time_s[2]) / 60

        # one CSV row: name, duration in minutes, rating (mark_1 and mark_2 concatenated)
        result_1 = url_name + ',' + str(time_final) + ',' + mark_1 + mark_2 + '\n'
        self.file.write(result_1)

Spider Code: Main Entry Point

from scrapy.cmdline import execute

import sys
import os

sys.path.append(os.path.dirname(os.path.abspath(__file__)))
execute(["scrapy", "crawl", "urls"])

The Scraped Data

The scraped data is stored as a CSV file, which can be opened directly in PyCharm:

Contents of the CSV file

The middle column is the runtime converted to minutes, so each row has the format name, minutes, rating.
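
As a quick sanity check of the conversion done in write_content, a made-up runtime of 02:05:30 becomes 2 * 60 + 5 + 30 / 60 = 125.5 minutes:

h, m, s = "02:05:30".split(':')  # hypothetical runtime string, for illustration only
minutes = float(h) * 60 + float(m) + float(s) / 60
print(minutes)  # 125.5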

Next we read the file and draw the scatter plot by calling matplotlib's

scatter()

function. Note that on a Mac where Python 2 and Python 3 coexist, packages have to be installed with

pip3 install --user xxxx

before they can be imported successfully.

Plotting the Scatter Plot

import seaborn as sns
import matplotlib.pyplot as plt
import re

mark = []
time = []
name = []
with open("urls.csv") as file:
    next(file)  # skip the header row written by the pipeline
    for line in file:
        ray = re.findall(r"(.*)\n", line)[0]
        ray = ray.split(',')
        mark.append(ray[2])
        # time.append(ray[1])  # keep the decimal part for a slightly more precise plot
        time.append(ray[1].split(".")[0])
        name.append(ray[0])

for i in range(len(time)):
    plt.scatter(x=int(time[i]), y=float(mark[i]), s=5, c='r')
plt.show()
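
The plot still has no labels (as noted at the beginning); a minimal, purely illustrative addition would be to put these lines just before plt.show():

plt.xlabel("Runtime (minutes)")  # x-axis: movie runtime
plt.ylabel("Rating")             # y-axis: movie rating
plt.title("Tencent Video movies: rating vs. runtime")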

Result

Tencent Video movie rating vs. runtime scatter plot

And with that we've produced a rating vs. runtime scatter plot for Tencent Video movies. As a next step, one could feed the runtimes into various classifiers and try to predict the ratings of movies released later this year, just for fun.

Notes

The rotate_useragent module added to avoid being blocked by the site (enabled in the settings above):

# -*- coding: utf-8 -*-
import random
# note: on newer Scrapy versions this import path is scrapy.downloadermiddlewares.useragent
from scrapy.contrib.downloadermiddleware.useragent import UserAgentMiddleware


class RotateUserAgentMiddleware(UserAgentMiddleware):
    def __init__(self, user_agent=''):
        self.user_agent = user_agent

    def process_request(self, request, spider):
        # pick a user-agent at random for each request
        ua = random.choice(self.user_agent_list)
        if ua:
            print('User-Agent:' + ua)
            request.headers.setdefault('User-Agent', ua)

    # a pool of user-agent strings to rotate through
    user_agent_list = [
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1"    ,
        "Mozilla/5.0 (X11; CrOS i686 2268.111.0) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.57 Safari/536.11",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1092.0 Safari/536.6",
        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.6 (KHTML, like Gecko) Chrome/20.0.1090.0 Safari/536.6",
        "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/19.77.34.5 Safari/537.1",
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.9 Safari/536.5",
        "Mozilla/5.0 (Windows NT 6.0) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.36 Safari/536.5",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3",
        "Mozilla/5.0 (Windows NT 5.1) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1062.0 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1062.0 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.1 Safari/536.3",
        "Mozilla/5.0 (Windows NT 6.2) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1061.0 Safari/536.3",
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.24 (KHTML, like Gecko) Chrome/19.0.1055.1 Safari/535.24",
        "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/535.24 (KHTML, like Gecko) Chrome/19.0.1055.1 Safari/535.24"
    ]
