建立自己的A股全市场日K线数据库

使用金数源的免费日K线数据建立自己的A股全市场日K线数据库

Gary-Hertel

2022-05-17


1. 在自己的服务器上建立一个数据库

在自己的服务器上建一个数据stock,然后将文件夹里的day_kline.sql导入进去。

test.py里的代码改一下,改成自己的数据库密码:

<pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="python" cid="n14" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace; margin-top: 0px; margin-bottom: 20px; font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; white-space: normal; background-color: rgb(51, 51, 51); position: relative !important; padding: 10px 10px 10px 30px; width: inherit;">engine = create_engine('mysql+pymysql://stock:yourpassword@localhost/stock?charset=utf8')</pre>

2. 上传文件至服务器

在服务器上的某个目录下,建立一个文件夹,然后将所有文件上传上去,比如说,结构如下:

<pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="" cid="n44" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace; margin-top: 0px; margin-bottom: 20px; font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; white-space: normal; background-color: rgb(51, 51, 51); position: relative !important; padding: 10px 10px 10px 30px; width: inherit;">.
├── day_kline.sql
├── main.py
├── run.sh
└── test.py</pre>

3. 获取金数源免费股票日线历史数据的下载链接

金数源链接:http://www.jinshuyuan.net/pdt/3/

先要自己注册一个账户,然后就可以在这个页面进入下载,他会给你的注册邮箱发送一个下载的链接,然后修改一下run.sh中的内容,把那个下载链接改成你的。

4. 运行shell脚本

建议周末运行,因为这个是每周更新,如果你周二运行,你只能下载到截止到上周五的,昨天的数据就不包含在里面。

在终端里执行./run.sh,即可运行shell脚本,它会帮你下载那个压缩文件,然后自动解压,并且会自动运行一个test.pypython文件,将所有数据都插入到数据库中的day_kline数据表中。都插入完成之后的话呢,会自动删除压缩文件和解压出来的文件夹。

🚀这一步完成之后,全市场的历史日K线数据就都已经入库了。

5. 每天更新数据

所有历史日k线数据入库以后,我们就需要每天都更新一下数据库,因为每天都会有新的内容嘛,执行那个main.py文件,它会每天负责自动更新数据。

6. 代码

  • day_kline.sql:

    <pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="sql" cid="n63" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace; margin-top: 0px; margin-bottom: 20px; font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; white-space: normal; background-color: rgb(51, 51, 51); position: relative !important; padding: 10px 10px 10px 30px; width: inherit;">-- phpMyAdmin SQL Dump
    -- version 4.4.15.10
    -- https://www.phpmyadmin.net
    --
    -- Host: localhost
    -- Generation Time: 2022-05-17 18:44:57
    -- 服务器版本: 5.6.50-log
    -- PHP Version: 5.6.40

    SET SQL_MODE = "NO_AUTO_VALUE_ON_ZERO";
    SET time_zone = "+00:00";

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8mb4 */;

--
-- Database: `stock`
--

-- --------------------------------------------------------

--
-- 表的结构 `day_kline`
--

CREATE TABLE IF NOT EXISTS `day_kline` (
 `id` int(11) NOT NULL,
 `代码` varchar(8) NOT NULL,
 `时间` varchar(10) NOT NULL,
 `开盘价` float NOT NULL,
 `最高价` float NOT NULL,
 `最低价` float NOT NULL,
 `收盘价` float NOT NULL,
 `成交量` float NOT NULL,
 `成交额` float NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='股票全市场日K线数据';

--
-- Indexes for dumped tables
--

--
-- Indexes for table `day_kline`
--
ALTER TABLE `day_kline`
 ADD PRIMARY KEY (`id`);

--
-- AUTO_INCREMENT for dumped tables
--

--
-- AUTO_INCREMENT for table `day_kline`
--
ALTER TABLE `day_kline`
 MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
</pre>
  • run.sh:

    <pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="sh" cid="n70" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace; margin-top: 0px; margin-bottom: 20px; font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; white-space: normal; background-color: rgb(51, 51, 51); position: relative !important; padding: 10px 10px 10px 30px; width: inherit;">#!/bin/sh

    echo "开始下载数据..."
    curl -o data.7z http://down.jinshuyuan.net/stk_day/20220513.7z?auth_key=1653035036-0-0-7552328dac4db5f7b1c413b958496fe5
    echo "数据下载完成,开始解压..."
    7za x data.7z
    rm data.7z
    echo "解压完成..."

    echo "开始将所有数据插入数据库中..."
    btpython test.py
    echo "所有数据都插入完成..."
    rm -r Stk_Day</pre>

  • test.py:

    <pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="python" cid="n77" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace; margin-top: 0px; margin-bottom: 20px; font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; white-space: normal; background-color: rgb(51, 51, 51); position: relative !important; padding: 10px 10px 10px 30px; width: inherit;">import os
    import pandas as pd

    from sqlalchemy import create_engine

def find_all_csv():
 path = "./Stk_Day"
 files = os.listdir(path)
 for file in files:
 df = pd.read_csv("./Stk_Day/" + file, encoding="gbk", index_col="代码")
 df.columns = ["时间", "开盘价", "最高价", "最低价", "收盘价", "成交量", "成交额"]
 df.index = df.index.str.lower()
 df.to_sql(name="day_kline", con=engine, if_exists='append')

if __name__ == '__main__':

 engine = create_engine('mysql+pymysql://stock:YjGN7P38SFJKsP8G@localhost/stock?charset=utf8')
 find_all_csv()
  </pre>
  • main.py:

    <pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="python" cid="n84" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace; margin-top: 0px; margin-bottom: 20px; font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; white-space: normal; background-color: rgb(51, 51, 51); position: relative !important; padding: 10px 10px 10px 30px; width: inherit;">import time

    import requests
    import pymysql

while True:
 # 获取数据
 url = "http://api.jinshuyuan.net/get_stk_bar_today"
 response = requests.get(url, timeout=30).text
 list1 = response.split("<br/>")
 list1.pop(0)
 list1.pop(-1)
 all_value = []
 first_data = None
 for line in list1:
 li = line.split(",")
 code = li[1]
 date = li[2]
 o = float(li[3])
 h = float(li[4])
 l = float(li[5])
 c = float(li[6])
 v = float(li[7])
 q = float(li[8])
 first_data = [code, date]
 all_value.append([code, date, o, h, l, c, v, q])

 db = pymysql.connect(host="localhost", user="stock", password="YjGN7P38SFJKsP8G", db="stock")
 cursor = db.cursor()

 check_sql = f"SELECT * FROM day_kline WHERE 代码='{first_data[0]}' AND 时间='{first_data[1]}'"
 cursor.execute(check_sql)
 result = cursor.fetchone()
 if result:  # 如果有这么一条数据,说明数据重复了,就不需要更新了
 continue

 sql = "INSERT INTO day_kline(代码,时间,开盘价,最高价,最低价,收盘价,成交量,成交额) VALUES (%s,%s,%s,%s,%s,%s,%s,%s)"

 try:
 cursor.executemany(sql, all_value)
 db.commit()
 except Exception as e:
 print("插入数据库错误:", e)
 db.rollback()

 db.close()

 # 休眠一天,建议在晚上7点以后启动这个程序
 time.sleep(86400)</pre>
最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

推荐阅读更多精彩内容