使用金数源的免费日K线数据建立自己的A股全市场日K线数据库
Gary-Hertel
2022-05-17
1. 在自己的服务器上建立一个数据库
在自己的服务器上建一个数据stock
,然后将文件夹里的day_kline.sql
导入进去。
test.py
里的代码改一下,改成自己的数据库密码:
<pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="python" cid="n14" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace; margin-top: 0px; margin-bottom: 20px; font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; white-space: normal; background-color: rgb(51, 51, 51); position: relative !important; padding: 10px 10px 10px 30px; width: inherit;">engine = create_engine('mysql+pymysql://stock:yourpassword@localhost/stock?charset=utf8')</pre>
2. 上传文件至服务器
在服务器上的某个目录下,建立一个文件夹,然后将所有文件上传上去,比如说,结构如下:
<pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="" cid="n44" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace; margin-top: 0px; margin-bottom: 20px; font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; white-space: normal; background-color: rgb(51, 51, 51); position: relative !important; padding: 10px 10px 10px 30px; width: inherit;">.
├── day_kline.sql
├── main.py
├── run.sh
└── test.py</pre>
3. 获取金数源免费股票日线历史数据的下载链接
金数源链接:http://www.jinshuyuan.net/pdt/3/
先要自己注册一个账户,然后就可以在这个页面进入下载,他会给你的注册邮箱发送一个下载的链接,然后修改一下run.sh
中的内容,把那个下载链接改成你的。
4. 运行shell脚本
建议周末运行,因为这个是每周更新,如果你周二运行,你只能下载到截止到上周五的,昨天的数据就不包含在里面。
在终端里执行./run.sh
,即可运行shell脚本,它会帮你下载那个压缩文件,然后自动解压,并且会自动运行一个test.py
python文件,将所有数据都插入到数据库中的day_kline
数据表中。都插入完成之后的话呢,会自动删除压缩文件和解压出来的文件夹。
🚀这一步完成之后,全市场的历史日K线数据就都已经入库了。
5. 每天更新数据
所有历史日k线数据入库以后,我们就需要每天都更新一下数据库,因为每天都会有新的内容嘛,执行那个main.py
文件,它会每天负责自动更新数据。
6. 代码
-
day_kline.sql
:<pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="sql" cid="n63" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace; margin-top: 0px; margin-bottom: 20px; font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; white-space: normal; background-color: rgb(51, 51, 51); position: relative !important; padding: 10px 10px 10px 30px; width: inherit;">-- phpMyAdmin SQL Dump
-- version 4.4.15.10
-- https://www.phpmyadmin.net
--
-- Host: localhost
-- Generation Time: 2022-05-17 18:44:57
-- 服务器版本: 5.6.50-log
-- PHP Version: 5.6.40SET SQL_MODE = "NO_AUTO_VALUE_ON_ZERO";
SET time_zone = "+00:00";
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8mb4 */;
--
-- Database: `stock`
--
-- --------------------------------------------------------
--
-- 表的结构 `day_kline`
--
CREATE TABLE IF NOT EXISTS `day_kline` (
`id` int(11) NOT NULL,
`代码` varchar(8) NOT NULL,
`时间` varchar(10) NOT NULL,
`开盘价` float NOT NULL,
`最高价` float NOT NULL,
`最低价` float NOT NULL,
`收盘价` float NOT NULL,
`成交量` float NOT NULL,
`成交额` float NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='股票全市场日K线数据';
--
-- Indexes for dumped tables
--
--
-- Indexes for table `day_kline`
--
ALTER TABLE `day_kline`
ADD PRIMARY KEY (`id`);
--
-- AUTO_INCREMENT for dumped tables
--
--
-- AUTO_INCREMENT for table `day_kline`
--
ALTER TABLE `day_kline`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
</pre>
-
run.sh
:<pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="sh" cid="n70" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace; margin-top: 0px; margin-bottom: 20px; font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; white-space: normal; background-color: rgb(51, 51, 51); position: relative !important; padding: 10px 10px 10px 30px; width: inherit;">#!/bin/sh
echo "开始下载数据..."
curl -o data.7z http://down.jinshuyuan.net/stk_day/20220513.7z?auth_key=1653035036-0-0-7552328dac4db5f7b1c413b958496fe5
echo "数据下载完成,开始解压..."
7za x data.7z
rm data.7z
echo "解压完成..."echo "开始将所有数据插入数据库中..."
btpython test.py
echo "所有数据都插入完成..."
rm -r Stk_Day</pre> -
test.py
:<pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="python" cid="n77" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace; margin-top: 0px; margin-bottom: 20px; font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; white-space: normal; background-color: rgb(51, 51, 51); position: relative !important; padding: 10px 10px 10px 30px; width: inherit;">import os
import pandas as pdfrom sqlalchemy import create_engine
def find_all_csv():
path = "./Stk_Day"
files = os.listdir(path)
for file in files:
df = pd.read_csv("./Stk_Day/" + file, encoding="gbk", index_col="代码")
df.columns = ["时间", "开盘价", "最高价", "最低价", "收盘价", "成交量", "成交额"]
df.index = df.index.str.lower()
df.to_sql(name="day_kline", con=engine, if_exists='append')
if __name__ == '__main__':
engine = create_engine('mysql+pymysql://stock:YjGN7P38SFJKsP8G@localhost/stock?charset=utf8')
find_all_csv()
</pre>
-
main.py
:<pre class="md-fences md-end-block ty-contain-cm modeLoaded" spellcheck="false" lang="python" cid="n84" mdtype="fences" style="box-sizing: border-box; overflow: visible; font-family: Monaco, Consolas, "Andale Mono", "DejaVu Sans Mono", monospace; margin-top: 0px; margin-bottom: 20px; font-size: 0.9rem; display: block; break-inside: avoid; text-align: left; white-space: normal; background-color: rgb(51, 51, 51); position: relative !important; padding: 10px 10px 10px 30px; width: inherit;">import time
import requests
import pymysql
while True:
# 获取数据
url = "http://api.jinshuyuan.net/get_stk_bar_today"
response = requests.get(url, timeout=30).text
list1 = response.split("<br/>")
list1.pop(0)
list1.pop(-1)
all_value = []
first_data = None
for line in list1:
li = line.split(",")
code = li[1]
date = li[2]
o = float(li[3])
h = float(li[4])
l = float(li[5])
c = float(li[6])
v = float(li[7])
q = float(li[8])
first_data = [code, date]
all_value.append([code, date, o, h, l, c, v, q])
db = pymysql.connect(host="localhost", user="stock", password="YjGN7P38SFJKsP8G", db="stock")
cursor = db.cursor()
check_sql = f"SELECT * FROM day_kline WHERE 代码='{first_data[0]}' AND 时间='{first_data[1]}'"
cursor.execute(check_sql)
result = cursor.fetchone()
if result: # 如果有这么一条数据,说明数据重复了,就不需要更新了
continue
sql = "INSERT INTO day_kline(代码,时间,开盘价,最高价,最低价,收盘价,成交量,成交额) VALUES (%s,%s,%s,%s,%s,%s,%s,%s)"
try:
cursor.executemany(sql, all_value)
db.commit()
except Exception as e:
print("插入数据库错误:", e)
db.rollback()
db.close()
# 休眠一天,建议在晚上7点以后启动这个程序
time.sleep(86400)</pre>