### Page Content

## Combinatorial Feature Selection Problems: Parameterized Algorithms and Complexity

**Vincent Froese, ****B.Sc.**** ****(TU Berlin)**

The term combinatorial feature selection refers to a general class of problems that, given a set of high-dimensional objects, ask for selecting a subset of dimensions (features) such that some desired property holds for the dataset restricted to the selected dimensions. Problems of this kind arise as data preprocessing tasks in areas such as machine learning. Charikar et al. defined a general framework for combinatorial feature selection problems comprising cluster analysis as well as dimension reduction. We consider the Hidden Cluster problem, where the goal is to eliminate noisy dimensions such that in the remaining dimensions the data can be divided into a given number of clusters, and the Distinct Vectors problem, which aims at finding a minimum number of dimensions such that all given points are still distinct. Moreover, we introduce the Hidden Cluster Graph problem, where the number of cluster centers is not known. We analyze the above problems in terms of their parameterized complexity and provide parameterized hardness results as well as fixed-parameter algorithms. For example, we show that they are hard to solve with respect to some natural parameters such as the number of dimensions to select or the number of dimensions to delete. In order to obtain fixed-parameter tractability, we focus on parameters such as the number of cluster centers, the radius of a cluster, the size of the alphabet, or the maximum pairwise Hamming distance of the data points. We show that the problems are fixed-parameter tractable for some combinations of the mentioned parameters by providing problem kernels and fixed-parameter tractable algorithms.

Date | Speaker | Location | Language |
---|---|---|---|

19.10.2012 14:15 | Vincent Froese | TEL 512 | german |