2018-06-06

Application of Data Mining Technology to Medical Data
Abstract (translated from the Chinese)
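With the development of big data and artificial intelligence technologies, data mining is being applied in more and more fields, including finance, education, and healthcare. Within healthcare, its applications extend to frontier areas such as precision medicine, genetic engineering, and gene sequencing. This article discusses the role that data mining models and algorithms play in clinical medical data and hospital information system data.
The purpose of applying data mining to medical data is to uncover latent disease-related factors from large volumes of medical data, and in the process to obtain further information, models, and association rules. Applying these findings in clinical practice can help doctors diagnose diseases faster and more accurately. The main work of this article is as follows: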
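First, Chapter 2 describes the characteristics of medical data and the theoretical foundations and structure of commonly used data mining algorithms, and gives brief explanations of the various data mining models.
Second, using a breast cancer medical data set, the article explores how logistic regression and random forest (decision tree) classification perform as classifiers on medical data, and achieves good classification accuracy. The resulting models could serve as a diagnostic aid: patients predicted to have a high probability of breast cancer can be singled out for closer observation and diagnosis.
Finally, the article interprets the classification and prediction results obtained on the two data sets and proposes corresponding measures and suggestions for improvement. It closes with a discussion of the limitations of this work and directions for future improvement.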

Keywords: data mining; regression analysis; decision tree; breast cancer

Application of Data Mining Technology to Medical Data
English Abstract
With the development of big data and artificial intelligence technologies, data mining has become a hot topic and has been applied in a great many fields, such as finance, education, and healthcare. Within healthcare, its applications extend to precision medicine, genetic engineering, gene sequencing, and other frontier fields. This article discusses the role of data mining models and algorithms in clinical medical data and hospital information system data.
The purpose of applying data mining to medical data is to uncover potential disease-related factors from large amounts of medical data, and to obtain further information, models, association rules, and so on in the process. These findings can then be applied in clinical practice to help doctors diagnose diseases faster and more accurately. The main work of this article is as follows:
First, the second chapter of this article describes the characteristics of medical data and the theoretical basis and structure of common data mining algorithms. It also gives a brief explanation of the various data mining models.
Second, this article uses a breast cancer medical data set to explore the classification performance of logistic regression and random forest (decision tree) methods, and the classification results achieve good accuracy. The models can serve as a diagnostic aid, helping doctors focus their observation and diagnosis on patients with a higher predicted probability of breast cancer.
Finally, this article explains the classification and prediction results for the two data sets and puts forward relevant countermeasures and suggestions. It concludes with the limitations of the work and directions for future improvement.

Keywords: data mining; regression analysis; decision tree; breast cancer
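
The abstract describes using logistic regression to estimate each patient's probability of breast cancer and then giving priority to the highest-probability cases. The block below is a minimal sketch of that idea only, not the author's implementation: the data are synthetic two-feature points generated in the script, and the names `fit_logistic`, `predict_proba`, the learning rate, and the epoch count are all illustrative assumptions. The random forest model is not shown.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: p(x) - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability of the positive class for one sample."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic stand-in data (an assumption, NOT real clinical data):
# two Gaussian clusters, label 1 = "positive", label 0 = "negative".
rng = random.Random(0)
X, y = [], []
for _ in range(100):
    label = rng.randint(0, 1)
    centre = 1.5 if label else -1.5
    X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
    y.append(label)

w, b = fit_logistic(X, y)
probs = [predict_proba(w, b, x) for x in X]

# Training-set accuracy at a 0.5 threshold (optimistic; a real study
# would evaluate on held-out data).
accuracy = sum((p >= 0.5) == bool(t) for p, t in zip(probs, y)) / len(y)

# Rank cases by predicted probability so the highest-risk ones come first.
ranked = sorted(range(len(X)), key=lambda i: probs[i], reverse=True)
top5 = ranked[:5]
print(f"training accuracy: {accuracy:.2f}")
print("indices of the five highest-probability cases:", top5)
```

In practice the ranking step matters as much as the accuracy figure: a doctor would review `ranked` from the top down, so a well-calibrated probability is more useful than a bare class label.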
