1.Ontology Evaluation Through Text Classification
We present a new method to evaluate a search ontology, which relies on mapping ontology instances to textual documents. On the basis of this mapping, we evaluate the adequacy of ontology relations by measuring their classification potential over the textual documents. This data-driven method provides concrete feedback to ontology maintainers and a quantitative estimate of the functional adequacy of the ontology relations for improving the search experience. We specifically evaluate whether an ontology relation can help a semantic search engine support exploratory search. We test this ontology evaluation method on an ontology in the Movies domain that has been acquired semi-automatically by integrating multiple semi-structured and textual data sources (e.g., IMDb and Wikipedia). We automatically construct a domain corpus from a set of movie instances by crawling the Web for movie reviews (both professional and user reviews). The 1-1 relation between textual documents (reviews) and movie instances in the ontology enables us to translate ontology relations into text classes. We verify that the text classifiers induced by key ontology relations (genre, keywords, actors) achieve high performance, and we exploit the properties of the learned text classifiers to provide concrete feedback on the ontology. The proposed ontology evaluation method is general and relies on the possibility of automatically aligning textual documents to ontology instances.
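As a rough illustration of how a 1-1 instance-to-document mapping turns an ontology relation into a text classification task, the sketch below labels each review with its movie's relation value and trains a small hand-written multinomial Naive Bayes classifier. The movie ids, review texts, and function names are invented for illustration, and the choice of Naive Bayes is an assumption: the abstract does not say which classifier or features the authors used.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: each movie instance has exactly one review (1-1 mapping).
reviews = {
    "m1": "a scary haunted house with ghosts and screams",
    "m2": "ghosts haunt a scary old hotel",
    "m3": "a funny couple and hilarious jokes",
    "m4": "hilarious funny road trip with jokes",
}
# Hypothetical 'genre' relation taken from the ontology: instance -> value.
genre = {"m1": "horror", "m2": "horror", "m3": "comedy", "m4": "comedy"}


def relation_to_dataset(relation, documents):
    """Label each instance's document with that instance's relation value."""
    return [
        (documents[m].split(), label)
        for m, label in relation.items()
        if m in documents
    ]


def train_naive_bayes(dataset):
    """Collect per-class document counts, per-class word counts, and the vocabulary."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in dataset:
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log-probability (add-one smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokens:
            score += math.log((word_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(relation_to_dataset(genre, reviews))
prediction = predict(model, "a scary ghost story".split())
```

In a realistic setting, one would presumably score such a classifier on held-out reviews for each relation and read the resulting accuracy as a proxy for that relation's classification potential.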
2.Ontology Evaluation: Which Test to Use
There are various methodologies for evaluating ontologies. One of them is based on the percentage coverage of the ontology over domain knowledge, i.e., gold samples. This paper presents several experiments showing how this ontological coverage can be used to test the quality of an ontology. The main concern is how to use the coverage measure to accept or reject the quality of an ontology. Domain ontologies for Java and .Net were chosen: two ontologies in each domain, with five frequencies. Ten gold definitions were selected in each domain, and the coverage was computed for each definition. The coverage measure was normalized by dividing it by the number of words in the definition (definition normalization). In both domains, the second ontology was chosen to be better than the first: it includes the first ontology and adds more concepts. The percentage coverage measures were computed for the four ontologies over 50 cases (10 × 5 frequencies). Finding the “good” population between two populations is a well-known statistical problem. Four techniques were borrowed: 1) the non-parametric Mann-Whitney-Wilcoxon U (MWW) test, 2) the average (mean analysis) of the coverage, 3) the average difference in coverage, and 4) the percentage of positive differences in coverage between any two ontologies (sign test). Results show that technique 4 (sign test) gives
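The comparison of two ontologies by coverage can be sketched in a few lines. This is a minimal illustration under stated assumptions: coverage is taken to be the number of definition words found in the ontology's concept set, divided by the number of words in the definition; the concept sets and definitions are made up and are not the paper's data; and all function names are hypothetical.

```python
def normalized_coverage(definition, concepts):
    """Share of the words in a gold definition that match an ontology concept."""
    words = definition.lower().split()
    hits = sum(1 for w in words if w in concepts)
    return hits / len(words)


def mean_difference(a, b):
    """Average paired difference b - a."""
    return sum(y - x for x, y in zip(a, b)) / len(a)


def positive_share(a, b):
    """Fraction of pairs where b beats a (the quantity behind a sign test)."""
    return sum(1 for x, y in zip(a, b) if y > x) / len(a)


def mann_whitney_u(a, b):
    """U statistic for b: number of pairs (x in a, y in b) with y > x; ties count 0.5."""
    return sum(
        1.0 if y > x else 0.5 if y == x else 0.0
        for x in a
        for y in b
    )


# Hypothetical concept sets: the second ontology contains the first.
onto1 = {"class", "object"}
onto2 = onto1 | {"method", "interface"}

# Hypothetical gold definitions.
definitions = [
    "a class defines an object",
    "an interface declares a method",
    "a method belongs to a class",
    "an object has state",
]

cov1 = [normalized_coverage(d, onto1) for d in definitions]
cov2 = [normalized_coverage(d, onto2) for d in definitions]

share = positive_share(cov1, cov2)
diff = mean_difference(cov1, cov2)
u_stat = mann_whitney_u(cov1, cov2)
```

Because the second concept set is a superset of the first, no paired difference can be negative here; the share of strictly positive differences then indicates how often the added concepts actually matter for a definition. Turning any of these statistics into an accept or reject decision would additionally require a significance threshold, which this sketch leaves out.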
3.Modeling Smart Sensors on top of SOSA/SSN and WoT TD with the Semantic Smart Sensor Network (S3N) modular Ontology
The joint OGC and W3C standard SOSA/SSN ontology describes sensors, observations, sampling, and actuation. The W3C Thing Description ontology, under development in the W3C WoT working group, describes things and their interaction patterns. In this paper we are interested in combining these two ontologies to model Smart-Sensors. Along with basic sensors, a Smart-Sensor contains a micro-controller that can run different algorithms adapted to the context and a communication system that exposes the Smart-Sensor on a network. For example, a smart accelerometer can be used to measure cycling cadence, step counts, or a variety of other things. The SOSA/SSN ontology can only partially model the ability of Smart-Sensors to adapt to different contexts. Thus, we design a SOSA/SSN extension called the Semantic Smart Sensor Network (S3N) ontology. S3N answers several competency questions, such as how to adapt the Smart-Sensor to the current context of use, that is, selecting the algorithms that provide the right sensor outputs and the micro-controller capabilities.
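To make the adaptation question more concrete, the sketch below models a smart accelerometer as plain Python objects and picks the algorithms that match a requested output. It is only an illustration of the idea: the class names, fields, algorithm names, and memory figures are invented and are not terms from S3N, SOSA/SSN, or the WoT Thing Description, and the rule that an algorithm must fit in the micro-controller's memory is an assumption about what "capabilities" could mean.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Algorithm:
    name: str
    output: str      # kind of observation produced, e.g. "step_count"
    memory_kb: int   # memory the algorithm needs on the micro-controller


@dataclass
class SmartSensor:
    sensor: str
    memory_kb: int   # assumed capability of the embedded micro-controller
    algorithms: list = field(default_factory=list)

    def select(self, wanted_output):
        """Return the algorithms that yield the wanted output and fit in memory."""
        return [
            a for a in self.algorithms
            if a.output == wanted_output and a.memory_kb <= self.memory_kb
        ]


# Hypothetical smart accelerometer with three candidate algorithms.
accelerometer = SmartSensor(
    sensor="accelerometer",
    memory_kb=64,
    algorithms=[
        Algorithm("cadence_fft", "cycling_cadence", 48),
        Algorithm("step_counter", "step_count", 16),
        Algorithm("step_counter_ml", "step_count", 256),
    ],
)

cycling = [a.name for a in accelerometer.select("cycling_cadence")]
walking = [a.name for a in accelerometer.select("step_count")]
```

An ontology-based model would express the same selection as a query over the sensor's description rather than as a method call; the sketch only shows the kind of question such a query answers.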