数据结构
- 非结构化数据,各种文档、图片、视频/音频等都属于非结构化数据。对于这类数据,我们一般直接整体进行存储,而且一般存储为二进制的数据格式(如文件、图片、视频、语音等需存入文件系统中)
- 结构化数据,结构化的数据是指可以使用关系型数据库表示和存储,表现为二维形式的数据。一般特点是:数据以行为单位,一行数据表示一个实体的信息,每一行数据的属性是相同的(如行数据等需存入关系型数据库中)
- 半结构化数据,半结构化数据是结构化数据的一种形式,它并不符合关系型数据库或其他数据表的形式关联起来的数据模型结构,但包含相关标记,用来分隔语义元素以及对记录和字段进行分层。因此,它也被称为自描述的结构。(常见的半结构数据有XML和JSON,可存入NoSQL数据库中)
简介
- 关系型数据库,瑞典基于C++语言开发
- 小巧、实用、性能高
- 其他数据库如Oracle(甲骨文)、SQLServer(微软)、DB2(IBM)
特点
- 开源
- 社区版免费
- 跨平台
- 安全性高
- 成本低
- 支持各种开发语言
- 支持强大的内置函数
- 数据存储量大
架构
安装方式
- 可使用xdja_centos7.4裁剪版自带MySQL5.7安装包一键安装
- 可使用以下脚本设置连接权限
常用命令
- 连接数据库
mysql -uroot -p
- 显示数据库
show databases;
- 选择数据库
use xxx(databasename);
- 显示数据库表
show tables;
- 查看表描述
DESC xxx(datatable);
- 显示数据库版本、时间等
SELECT VERSION(),CURRENT_DATE(),CURRENT_TIME();
- 显示运行的进程
show processlist;
- 查看配置项
show variables like '%tx_isolation%';
- 查看innodb状态
show engine innodb status\G;
DDL
- DDL(Data Definition Languages):数据定义语言,定义不同的数据段、数据库、表、列、索引等数据库对象,常用的关键字包括create、drop、alter等
- 创建数据库test1
create database test1;
- 删除指定数据库
drop database test1;
- 修改表字段
alter table emp modify ename varchar(20);
- 增加、删除表字段
alter table emp add column age int(3);
alter table emp drop column age;
DML
DML(Data Manipulation Language):数据操作语言,用于添加、删除、更新和查询数据库记录,并检查数据完整性,常用的关键字主要包括insert、delete、update和select等
插入记录
insert into emp(ename,sal,deptno) values('zhangsan','2015-08-01','2000',1);
insert into emp(ename,sal,deptno) values('lisi','2015-08-01','3000',1);
create table dept(deptno int(3),deptname varchar(20);
insert into dept values(1,'tech'),(2,'sales'),(3,'fin');
- 更新、删除记录
update emp set sal=4000 where ename='lisi';
delete from emp where ename='lisi';
- 查询指定列、查询不重复记录
select ename,hiredate,sal,deptno from emp;
select distinct deptno from emp;
- 条件查询与排序
select * from emp where deptno =1 and sal<3000;
select * from emp order by sal;
- 分组统计
select count(1) from emp;
select deptno,count(1) as empnum from emp group by deptno
- 分组统计+条件过滤
select deptno,count(1) as empnum from emp group by deptno with rollup;
select deptno,count(1) as empnum from emp group by deptno having count(1)>1;
- 聚合函数与多表关联查询
select sum(sal),max(sal),min(sal) from emp;
select ename,deptname from emp,dept where emp.deptno=dept.deptno;
存储引擎
- 存储引擎就是如何存储数据、如何为存储的数据建立缩影和如何更新、查询数据等技术的实现方法;
- 在关系数据中数据的存储是以表的形式存储,所以存储引擎也可以称为表类型(即存储和操作此表的类型);
- 类型有MyISAM、InnoDB、MERGE、MEMORY(HEAP)等;
show engines;
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| Engine | Support | Comment | Transactions | XA | Savepoints |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| InnoDB | DEFAULT | Supports transactions, row-level locking, and foreign keys | YES | YES | YES |
| CSV | YES | CSV storage engine | NO | NO | NO |
| MyISAM | YES | MyISAM storage engine | NO | NO | NO |
| BLACKHOLE | YES | /dev/null storage engine (anything you write to it disappears) | NO | NO | NO |
| PERFORMANCE_SCHEMA | YES | Performance Schema | NO | NO | NO |
| MRG_MYISAM | YES | Collection of identical MyISAM tables | NO | NO | NO |
| ARCHIVE | YES | Archive storage engine | NO | NO | NO |
| MEMORY | YES | Hash based, stored in memory, useful for temporary tables | NO | NO | NO |
| FEDERATED | NO | Federated MySQL storage engine | NULL | NULL | NULL |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
9 rows in set (0.03 sec)
- InnoDB
- 数据和索引合并存储为一个文件,.frm(描述表的结构) .ibd(表数据文件)
- 支持外键,事务处理
- 行锁定
- 具有提交、回滚和崩溃恢复能力的事务安全
- 并行读写,适用于大量的写操作的表
备份
备份所有数据库
mysqldump -uroot -p --all-databases > itsca.sql
- 输入密码后即可备份所有库
[root@xdja wch]# mysqldump -uroot -p --all-databases > itsca.sql
Enter password:
Warning: A partial dump from a server that has GTIDs will by default include the GTIDs of all transactions, even those that changed suppressed parts of the database. If you don't want to restore GTIDs, pass --set-gtid-purged=OFF. To make a complete dump, pass --all-databases --triggers --routines --events.
备份指定数据库
mysqldump -u root -p --databases 数据库1 数据库2 > xxx.sql
还原mysql备份内容
- 在系统命令行中,输入如下实现还原:
mysql -uroot -p123456 < /data/mysqlDump/mydb.sql
- 在登录进入mysql系统中,通过source指令找到对应系统中的文件进行还原:
mysql> source /data/mysqlDump/mydb.sql
基础优化
存储引擎的选择、字段设计、索引、SQL语句等都是影响MySQL性能的重要因素,本次暂不详细讨论。
仅从MySQL数据库参数配置入手,探究设置合理的参数值来提升MySQL数据库性能。
innodb_page_size
从InnoDB1.2.x版本开始,可通过参数innodb_page_size将页的大小设置为4K、8K、16K。若设置完成,则所有表中页的大小都为innodb_page_size,不可以对其再次修改,否则启动会报错,踩过坑。
默认是16k
Consider using a page size that matches the internal sector size of the disk. Early-generation SSD devices often have a 4KB sector size. Some newer devices have a 16KB sector size. The default InnoDB page size is 16KB. Keeping the page size close to the storage device block size minimizes the amount of unchanged data that is rewritten to disk.
请考虑使用与磁盘的内部扇区大小匹配的页面大小。早期 SSD 器件通常具有 4KB 扇区大小。某些较新的设备具有 16KB 扇区大小。默认 InnoDB 页面大小为 16KB。使页面大小接近存储设备块大小可最大限度地减少重写到磁盘的未更改数据量。
参考文档
https://dev.mysql.com/doc/refman/5.7/en/optimizing-innodb-diskio.html
innodb_buffer_pool_size
从其作用可以看出,当系统的IO比较空闲的时候,可以适当将这个参数设大,当IO吃紧时,需要适当减小,一般设置为内存总大小的50%-75%
Buffer pool size must always be equal to or a multiple of innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances. If you alter the buffer pool size to a value that is not equal to or a multiple of innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances, buffer pool size is automatically adjusted to a value that is equal to or a multiple of innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances.
When you increase or decrease buffer pool size, the operation is performed in chunks. Chunk size is defined by the innodb_buffer_pool_chunk_size variable, which has a default of 128 MB.
-
参考文章
https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_buffer_pool_size
sync_binlog
sync_binlog=0,当事务提交之后,MySQL不做fsync之类的磁盘同步指令刷新binlog_cache中的信息到磁盘,而让Filesystem自行决定什么时候来做同步,或者cache满了之后才同步到磁盘。
当sync_binlog=1时, MySQL在写1次二进制日志binary log时, 会使用fdatasync()函数将二进制binary log同步到disk中去(安全性最高的配置)。
sync_binlog=n,当每进行n次事务提交之后,MySQL将进行一次fsync之类的磁盘同步指令来将binlog_cache中的数据强制写入磁盘。
-
参考文档
https://dev.mysql.com/doc/refman/5.7/en/replication-options-binary-log.html#sysvar_sync_binlog
innodb_flush_log_at_trx_commit
当innodb_flush_log_at_trx_commit=1时(默认),每次事务提交时, MySQL会把log buffer的数据写入log file, 并且将log file flush到硬盘中。这样做的好处是数据安全性最佳,不足之处在于每次提交事务,都要进行磁盘写入的操作。在大并发的场景下,过于频繁的磁盘读写会导致 CPU 资源浪费,系统效率变低。(效率低,安全性高)
设置为0 表示每隔 1 秒将数据写入日志,并将日志写入磁盘;(效率高,安全性低)
设置为2 表示每次提交事务的时候都将数据写入日志,但是日志每间隔 1 秒写入磁盘。(效率中,安全性中)
-
参考文档
https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit
双1就是 sync_binlog=1和 innodb_flush_log_at_trx_commit=1。这两个参数线上要保证为1,前者保证binlog的安全,后者保证redo的安全,它们在数据库crash recovery的时候起到了关键做用,不设置为双1可能导致数据丢失。
innodb_lru_scan_depth
innodb_lru_scan_depth是5.6新增加的参数,根据官方文档描述,它会影响page cleaner线程每次刷脏页的数量,这是一个每1秒loop一次的线程。在Innodb内部,这个参数对应变量为srv_LRU_scan_depth
page cleaner 线程刷脏页的长度,从尾部开始刷srv_LRU_scan_depth
默认值为1024,当IO吃紧时,应适当降低此值
-
参考文档
https://www.cnblogs.com/zengkefu/p/5692803.html
https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_lru_scan_depth
innodb_io_capacity
The default setting of 200 is generally sufficient for a lower-end non-rotational storage device. For higher-end, bus-attached devices, consider a higher setting such as 1000.
参数 :innodb_io_capacity :数据库落盘脏页个数 ,配置压力和磁盘的性能相关,如果过大,IO能力不足,则出现卡顿。
innodb_io_capacity默认是200,单位是页,该参数的设置大小取决于硬盘的IOPS,即每秒的输入输出量(或读写次数)
-
参考文档
https://dev.mysql.com/doc/refman/5.7/en/optimizing-innodb-diskio.html
innodb_io_capacity_max
If you specify an
innodb_io_capacity
setting at startup but do not specify a value forinnodb_io_capacity_max
,innodb_io_capacity_max
defaults to twice the value ofinnodb_io_capacity
or 2000, whichever value is greater.-
参考文档
https://dev.mysql.com/doc/refman/5.7/en/optimizing-innodb-diskio.html
https://dev.mysql.com/doc/refman/5.7/en/innodb-configuring-io-capacity.html
innodb_max_dirty_pages_pct
- innodb_max_dirty_pages_pct 是 MySQL InnoDB 存储引擎非常重要的一个参数,用来控制buffer pool中脏页的百分比,当脏页数量占比超过这个参数设置的值时,InnoDB会启动刷脏页的操作。该参数只控制脏页百分比,并不会影响刷脏页的速度。
- 默认值:75%
- 除了脏页百分比达到阈值innodb_max_dirty_pages_pct触发刷脏页以外,还有很多条件,也会触发刷脏页,主要包括:
- REDO日志快满的时候。
- 为了保证MySQL中的空闲页面的数量,会从LRU链表尾部淘汰一部分页面作为空闲页。如果对应的页面是脏页的话,就需要先将页面刷到磁盘。
- MySQL
- MySQL实例正常关闭时候。
innodb_flush_neighbors
innodb_flush_neighbors 参数是InnoDB用来控制buffer pool刷脏页时是否把脏页邻近的其他脏页一起刷到磁盘,在传统的机械硬盘时代,打开这个参数能够减少磁盘寻道的开销,显著提升性能。
取值范围:0,1,2
默认值:5.7版本为1, 8.0版本为0
含义:
- 设置为0时,表示刷脏页时不刷其附近的脏页。
- 设置为1时,表示刷脏页时连带其附近毗连的脏页一起刷掉。
- 设置为2时,表示刷脏页时连带其附近区域的脏页一起刷掉。1与2的区别是2刷的区域更大一些。
如果MySQL服务器磁盘是传统的HDD存储设备,打开该参数,能够减少I/O磁盘寻道的开销,提高性能,而对于SSD设备,寻道时间的性能影响很小,关闭该参数,反而能够分散写操作,提高数据库性能。由于SSD设备的普及,MySQL 8.0 将该参数的默认值由1调整为0。
-
参考文档
https://www.mytecdb.com/blogDetail.php?id=117
https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_flush_neighbors
innodb_page_cleaners
show global status like '%Innodb_buffer_pool_wait_free%';
如果值很大,则需要增加innodb_page_cleaners值,同时增加写线程。-
参考文档
https://www.jianshu.com/p/6991304a8e26
https://www.jianshu.com/p/ddb24f9afae0https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_flush_neighbors
innodb_log_file_size
Redo log的空间通过
innodb_log_file_size
和innodb_log_files_in_group
(默认2)参数来调节,将这俩参数相乘即可得到总的可用Redo log 空间。可以使用MySQL监控PMM来进行详细分析,具体参考以下文章
-
参考文档
https://blog.csdn.net/kai404/article/details/80242262
https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_log_file_size
优化结果
- 为我司安装MySQL默认参数,需根据不同机器硬件配置进行调整
[mysqld]
########basic settings########
server-id = 11
port = 3306
user = mysql
#bind_address = 10.166.224.32 #根据实际情况修改
autocommit = 1 #5.6.X安装时,需要注释掉,安装完成后再打开
character_set_server=utf8mb4
skip_name_resolve = 1
max_connections = 800
max_connect_errors = 1000
datadir = /home/mysql/data #根据实际情况修改,建议和程序分离存放
transaction_isolation = READ-COMMITTED
#explicit_defaults_for_timestamp = 1
join_buffer_size = 134217728
tmp_table_size = 67108864
tmpdir = /tmp
max_allowed_packet = 16777216
sql_mode = "STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER"
interactive_timeout = 1800
wait_timeout = 1800
read_buffer_size = 16777216
read_rnd_buffer_size = 33554432
sort_buffer_size = 33554432
########log settings########
log_error = /home/mysql/logs/error.log
slow_query_log = 1
slow_query_log_file = /home/mysql/logs/slow.log
log_queries_not_using_indexes = 1
log_slow_admin_statements = 1
log_slow_slave_statements = 1
log_throttle_queries_not_using_indexes = 10
expire_logs_days = 60
long_query_time = 1
min_examined_row_limit = 100
lower_case_table_names = 1
########replication settings########
master_info_repository = TABLE
relay_log_info_repository = TABLE
log_bin = /home/mysql/binlog/bin.log
sync_binlog = 1
gtid_mode = on
enforce_gtid_consistency = 1
log_slave_updates
binlog_format = row
relay_log = /home/mysql/relaylog/relay.log
relay_log_recovery = 1
binlog_gtid_simple_recovery = 1
slave_skip_errors = ddl_exist_errors
########innodb settings########
#innodb_page_size = 8192
innodb_buffer_pool_size = 24G #根据实际情况修改
innodb_buffer_pool_instances = 12
innodb_buffer_pool_load_at_startup = 1
innodb_buffer_pool_dump_at_shutdown = 1
innodb_lru_scan_depth = 2000
innodb_lock_wait_timeout = 5
innodb_io_capacity = 4000
innodb_io_capacity_max = 8000
innodb_flush_method = O_DIRECT
innodb_file_format = Barracuda
innodb_file_format_max = Barracuda
innodb_log_group_home_dir = /home/mysql/redolog/ #根据实际情况修改
innodb_undo_directory = /home/mysql/undolog/ #根据实际情况修改
innodb_undo_logs = 128
innodb_undo_tablespaces = 3
innodb_flush_neighbors = 1
innodb_log_file_size = 8G #根据实际情况修改
innodb_log_buffer_size = 16777216
innodb_purge_threads = 4
innodb_large_prefix = 1
innodb_thread_concurrency = 64
innodb_print_all_deadlocks = 1
innodb_strict_mode = 1
innodb_sort_buffer_size = 67108864
innodb_flush_log_at_trx_commit = 1
[mysqld-5.7]
innodb_buffer_pool_dump_pct = 40
innodb_page_cleaners = 12
innodb_undo_log_truncate = 1
innodb_max_undo_log_size = 2G
innodb_purge_rseg_truncate_frequency = 128
binlog_gtid_simple_recovery=1
log_timestamps=system
transaction_write_set_extraction=MURMUR32
show_compatibility_56=on
- MySQL数据库服务器安装fio
[root@xdja wch]# tar -zxvf fio-2.1.10.tar.gz
[root@xdja wch]# cd fio-2.1.10/
[root@xdja fio-2.1.10]# ./configure
[root@xdja fio-2.1.10]# make
[root@xdja fio-2.1.10]# make Install
[root@xdja fio-2.1.10]# cd ..
[root@xdja wch]# fio -filename=test0628 -direct=1 -iodepth 1 -thread -rw=randrw -ioengine=psync -bs=16k -size=500M -numjobs=10 -runtime=10 -group_reporting -name=mytest
direct=1 测试过程绕过机器自带的buffer,使测试结果更真实
rw=randrw 测试随机混合写和读的I/O
ioengine=psync io引擎使用pync方式
bs=16k 单次io的块文件大小为16k
numjobs=10 本次的测试线程为10
mytest: (g=0): rw=randrw, bs=16K-16K/16K-16K/16K-16K, ioengine=psync, iodepth=1
...
fio-2.1.10
Starting 10 threads
mytest: Laying out IO file(s) (1 file(s) / 500MB)
Jobs: 10 (f=10): [mmmmmmmmmm] [100.0% done] [1328KB/896KB/0KB /s] [83/56/0 iops] [eta 00m:00s]
mytest: (groupid=0, jobs=10): err= 0: pid=23464: Mon Jun 28 12:52:55 2021
read : io=11632KB, bw=1156.2KB/s, iops=72, runt= 10061msec
clat (msec): min=1, max=273, avg=84.36, stdev=55.83
lat (msec): min=1, max=273, avg=84.36, stdev=55.83
clat percentiles (msec):
| 1.00th=[ 5], 5.00th=[ 12], 10.00th=[ 18], 20.00th=[ 31],
| 30.00th=[ 48], 40.00th=[ 62], 50.00th=[ 77], 60.00th=[ 94],
| 70.00th=[ 113], 80.00th=[ 130], 90.00th=[ 159], 95.00th=[ 186],
| 99.00th=[ 249], 99.50th=[ 258], 99.90th=[ 273], 99.95th=[ 273],
| 99.99th=[ 273]
bw (KB /s): min= 56, max= 191, per=10.04%, avg=116.07, stdev=29.91
write: io=11424KB, bw=1135.5KB/s, iops=70, runt= 10061msec
clat (usec): min=277, max=205220, avg=54698.78, stdev=44032.60
lat (usec): min=277, max=205221, avg=54699.64, stdev=44032.65
clat percentiles (usec):
| 1.00th=[ 828], 5.00th=[ 1272], 10.00th=[ 1672], 20.00th=[ 3472],
| 30.00th=[23168], 40.00th=[37632], 50.00th=[53504], 60.00th=[64768],
| 70.00th=[79360], 80.00th=[92672], 90.00th=[112128], 95.00th=[134144],
| 99.00th=[171008], 99.50th=[193536], 99.90th=[205824], 99.95th=[205824],
| 99.99th=[205824]
bw (KB /s): min= 29, max= 206, per=10.19%, avg=115.64, stdev=38.23
lat (usec) : 500=0.21%, 750=0.21%, 1000=0.69%
lat (msec) : 2=4.93%, 4=5.27%, 10=3.68%, 20=5.55%, 50=18.18%
lat (msec) : 100=34.35%, 250=26.44%, 500=0.49%
cpu : usr=0.01%, sys=0.06%, ctx=1492, majf=0, minf=7
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=727/w=714/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
READ: io=11632KB, aggrb=1156KB/s, minb=1156KB/s, maxb=1156KB/s, mint=10061msec, maxt=10061msec
WRITE: io=11424KB, aggrb=1135KB/s, minb=1135KB/s, maxb=1135KB/s, mint=10061msec, maxt=10061msec
Disk stats (read/write):
dm-2: ios=719/1496, merge=0/0, ticks=60164/81876, in_queue=142913, util=100.00%, aggrios=727/1144, aggrmerge=0/370, aggrticks=61291/63730, aggrin_queue=125075, aggrutil=100.00%
sda: ios=727/1144, merge=0/370, ticks=61291/63730, in_queue=125075, util=100.00%
- 可以看到读写的iops基本在70左右,根据以上基本参数解释,现有服务器(Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz 4核8线程,24G内存,叠瓦式机械硬盘)进行以下参数设置,可以明显降低甚至消除接口性能测试过程中的TPS抖动(MySQL数据库刷盘导致,可以从error.log日志看到刷盘信息)
innodb_buffer_pool_size =6G
innodb_buffer_pool_instances = 1
innodb_lru_scan_depth = 200
innodb_io_capacity = 100
innodb_io_capacity_max = 200
innodb_log_file_size = 4G
innodb_page_cleaners= 1
- 以上参数只是针对特定机器匹配服务相对最优的参数
其他一些概念
刷脏页机制
https://www.ywnds.com/?p=11039&viewuser=489
https://www.cnblogs.com/JiangLe/p/7419835.htmlredolog与binlog的区别
https://blog.csdn.net/wanbin6470398/article/details/81941586InnoDB脏页刷新机制Checkpoint
https://www.cnblogs.com/olinux/p/5196139.html
https://www.jianshu.com/p/0b19e1cd5e8c
https://blog.csdn.net/qq_18312025/article/details/78597681
问题汇总
- 验证一个项目接口性能测试时持续出现数据库死锁,本项目基于arm架构麒麟系统,mysql为源码编译,transaction_isolation = REPEATABLE-READ
Deadlock found when trying to get lock; try restarting transaction; nested exception is com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
- 尝试查看项目之前的mysql服务,默认为transaction_isolation = READ-COMMITTED,调整之后再次验证未再出现死锁问题
- 参考文档