Academic Lecture: Model-based Categorical Sequence Clustering

Posted: 2017-03-28

Time: Thursday, March 23, 2017, 15:00 - 16:30

Venue: Lecture Hall 603, Chenggong Building, Cangshan Campus

Hosts: School of Mathematics and Computer Science; Fujian Provincial Key Laboratory of Network Security and Cryptology

Speaker: Prof. Shengrui Wang (王声瑞), University of Sherbrooke, Canada

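About the speaker: Shengrui Wang is a Chinese Canadian, a tenured professor and doctoral supervisor in the Department of Computer Science at the University of Sherbrooke, Canada. He received his PhD in computer science in 1989 from the Institut National Polytechnique de Grenoble, France. His research covers data mining, pattern recognition, artificial intelligence, image processing and understanding, knowledge acquisition, information systems, neural networks, and optimization. His results on clustering high-dimensional data, categorical data, and sequence data are internationally recognized and have been applied in bioinformatics, clinical data, finance, the Internet, automotive navigation, image databases, smart environments, and radar monitoring. He has held Discovery Grant funding from the Natural Sciences and Engineering Research Council of Canada (NSERC) for 27 consecutive years since 1993, has led several major funded research projects, and received an NSERC Discovery Accelerator Supplement in 2010. Since 2010 he has been a core member of the NSERC computer science committee, serving as chair of its computing methods section (2013-2014) and as chair of the selection committee for research tools and instruments grants in computer science, mathematics, and statistics (2015-2016).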

Abstract: In this talk, I will present one of our recent works on model-based clustering of categorical sequences and its applications. Clustering categorical sequences is an important and difficult data mining task. The main challenge is the lack of a meaningful measure of similarity. Instead of using a pairwise measure for comparing sequences, we propose a novel statistical framework for modeling clusters of sequences. Based on the Weighted Conditional Probability Distribution (WCPD), we develop a variable-order Markov model of categorical sequences. The design of the WCPD model is derived formally from an optimization perspective. We present an efficient method for building WCPD models using the probabilistic suffix tree (PST). We also propose an efficient approach to model initialization. To initialize the WCPD model, we use a first-order Markov model built on a weighted fuzzy indicator vector representation of categorical sequences, which we call the WFI Markov model. Based on a cascade optimization framework that combines the WCPD and WFI models, we design a new divisive hierarchical clustering algorithm for clustering categorical sequences. Experimental results on data sets from three different domains demonstrate the performance of our models and clustering algorithm.
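For readers unfamiliar with the model-based idea in the abstract, the following is a minimal sketch of how a sequence can be assigned to a cluster by likelihood under a per-cluster model rather than by pairwise similarity. It is not the WCPD or WFI model from the talk: it assumes a plain first-order Markov chain with add-alpha smoothing, and the function names (`fit_markov`, `log_likelihood`, `assign`) and the toy sequences are hypothetical, chosen only for illustration.

```python
import math

def fit_markov(sequences, alphabet, alpha=1.0):
    """Estimate first-order transition probabilities with add-alpha smoothing.

    Illustrative stand-in only; not the WCPD/WFI models from the talk.
    """
    # Start every transition count at alpha so no probability is zero.
    counts = {a: {b: alpha for b in alphabet} for a in alphabet}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    # Normalize each row into a conditional distribution P(next | prev).
    model = {}
    for a in alphabet:
        total = sum(counts[a].values())
        model[a] = {b: counts[a][b] / total for b in alphabet}
    return model

def log_likelihood(seq, model):
    """Log-probability of the transitions in seq under the model."""
    return sum(math.log(model[p][n]) for p, n in zip(seq, seq[1:]))

def assign(seq, models):
    """Index of the cluster model under which seq is most likely."""
    return max(range(len(models)), key=lambda k: log_likelihood(seq, models[k]))

# Toy data (hypothetical): one cluster alternates symbols, the other repeats them.
alphabet = "ab"
cluster_0 = ["ababab", "bababa"]
cluster_1 = ["aaaabb", "bbbbaa"]
models = [fit_markov(cluster_0, alphabet), fit_markov(cluster_1, alphabet)]
```

Calling `assign("abab", models)` picks the alternating cluster and `assign("aaaa", models)` picks the repeating one, without ever comparing two sequences directly. A variable-order model such as the one described in the abstract would condition on longer, variable-length contexts instead of a single preceding symbol.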

 
