LECTURE: Subgroup discovery in data sets with multidimensional responses

Event Date: Thursday, 24 January 2013, 13:00
Lan Umek, Faculty of Administration, University of Ljubljana

The next lecture at the Biostatistical Center will take place on Thursday, 24 January 2013, at 13:00 at IBMI.

Subgroup discovery (SD) is a data analysis technique that aims to find interesting subsets of a random sample according to a predefined target concept. Most existing SD approaches have been developed for data sets with a single binary output variable (class); the subgroups' interestingness has therefore been related to distributional unusualness.

In the talk we will present an algorithm for subgroup discovery that can handle multiple output variables simultaneously. Such data sets are becoming increasingly available, and suitable ways of handling them are needed. The proposed approach applies hierarchical clustering in the output space and then analyses the resulting clustering tree. Each node of the dendrogram corresponds to a particular subgroup, whose interestingness is then measured using the input variables and supervised data mining techniques. By default, subgroups are evaluated in terms of the area under the ROC curve.
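
The general idea described above can be illustrated with a small sketch. It is not the speaker's implementation: the announcement does not specify the linkage method or the supervised technique, so the code assumes average-linkage clustering and, as a stand-in for a real classifier, scores each dendrogram node by the best area under the ROC curve that any single input variable achieves for node membership. All function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def auc(scores, labels):
    """Area under the ROC curve, computed by pairwise comparison."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        return 0.5  # undefined without both classes; treat as chance level
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def dendrogram_nodes(outputs):
    """Average-linkage agglomerative clustering in the output space.

    Returns the member set of every internal node, in merge order.
    (Assumption: the talk does not say which linkage is used.)
    """
    def linkage(pair):
        a, b = pair
        total = sum(dist(outputs[i], outputs[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [frozenset([i]) for i in range(len(outputs))]
    nodes = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merged = a | b
        clusters = [c for c in clusters if c not in (a, b)] + [merged]
        nodes.append(merged)
    return nodes


def rank_subgroups(inputs, outputs):
    """Score every non-root node by how well one input variable separates it."""
    n = len(inputs)
    ranked = []
    for node in dendrogram_nodes(outputs):
        if len(node) == n:
            continue  # the root contains every example, nothing to separate
        labels = [i in node for i in range(n)]
        best = 0.0
        for column in zip(*inputs):
            a = auc(column, labels)
            best = max(best, a, 1.0 - a)  # direction of the variable is free
        ranked.append((best, sorted(node)))
    return sorted(ranked, reverse=True)


# Toy data (made up): one input variable, two output variables.
inputs = [[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]]
outputs = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],
           [5.0, 5.1], [5.1, 5.0], [5.2, 5.2]]

ranked = rank_subgroups(inputs, outputs)
for score, members in ranked:
    print(f"AUC {score:.2f}  subgroup {members}")
```

In this toy example the two well-separated groups in the output space appear as dendrogram nodes, and the single input variable separates each of them from the rest of the sample. A realistic analysis would replace the single-variable score with a cross-validated classifier trained on all input variables.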

The algorithm's performance will be compared to that of predictive clustering techniques. For illustration, it will be applied to data from the European Social Survey (ESS).


About IBMI

The Institute for Biostatistics and Medical Informatics (IBMI), formerly the Institute for BioMedical Informatics (hence still IBMI), was founded by the Faculty of Medicine in response to the need for a unit that would perform or coordinate tasks related to data analysis and to providing information relevant to research in medicine. The programme of the institute and its development have been adjusted over time to changes in financing and technological progress, but the basic aim remains the same: to support research in medicine. This is achieved through the following tasks:


Institute for Biostatistics and Medical Informatics
University of Ljubljana, Faculty of Medicine
Vrazov trg 2, 1000 Ljubljana

tel: +386 1 543-77-70
fax: +386 1 543-77-71
email: ibmi (at) mf.uni-lj.si