Title: Two-Step Bayesian Multiple Classifications with Logic Expressions
Speaker: 吴文嵩 (Wensong Wu), Florida International University
Abstract: In this presentation we consider a two-class classification problem in which the goal is to predict the class membership of M units from the values of their high-dimensional categorical predictor variables, together with both the predictor values and the known class membership of N other independent units. We focus on generalized linear regression models with Boolean expressions of the categorical predictors. Working in a Bayesian, decision-theoretic framework, we develop a general form of the Bayes multiple classification function (BMCF) with respect to a class of cost-weighted loss functions. In particular, we consider loss function pairs such as the proportions of false positives and false negatives, and (1-sensitivity) and (1-specificity). The best Boolean expressions are selected by a data-driven procedure: candidates are first selected by the Apriori algorithm, an efficient algorithm for detecting association rules and frequent patterns, and the final expressions are then selected by Bayesian model averaging. This two-step procedure reduces model uncertainty in model selection while retaining computational efficiency. The results are illustrated via simulations and on a vulvovaginal candidiasis (VVC) infection diagnosis dataset.
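For readers unfamiliar with the first step, the following is a minimal sketch of Apriori-style frequent pattern mining. It is not the speaker's implementation. It assumes each unit is encoded as a set of "items" (for example, predictor-level pairs), so that a frequent itemset can be read as a candidate conjunction (AND) of predictors. The names `apriori`, `units`, and `min_support`, and the toy data, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    level = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            level[frozenset([item])] = s

    frequent = {}
    k = 1
    while level:
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori property): every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        level = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                level[c] = s
    return frequent


# Toy data (hypothetical): each set lists the binary predictors that are "on".
units = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C"},
    {"A", "B", "C"},
]
frequent = apriori(units, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 2))
```

In the two-step procedure described in the abstract, patterns of this kind would serve only as candidates; the final Boolean expressions are chosen in the second step by Bayesian model averaging, which this sketch does not cover.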