Abstract
Complex aggregates have been proposed as a way to bridge the gap between approaches that handle sets by imposing conditions on specific elements and approaches that handle them by imposing conditions on aggregated values. A complex aggregate summarises a subset of the elements in a set, where this subset is defined by conditions on the attribute values. In this paper, we present a new type of complex aggregate in which this subset is defined to be a cluster of the set. This is useful when the subsets that are relevant for the task at hand are difficult to describe in terms of attribute conditions. This work is motivated by the analysis of flow cytometry data, where each set consists of cells and the relevant subsets are cell populations. We describe two approaches to aggregating over clusters at an abstract level and validate one of them empirically, motivating future research in this direction.
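The difference between the two kinds of complex aggregate can be illustrated with a minimal sketch. This is not the method from the paper: the toy data, the attribute names (`size`, `marker`), the fixed centroids, and all function names are assumptions made for illustration only. A condition-based aggregate selects elements with an explicit attribute test, whereas a cluster-based aggregate selects the elements assigned to one cluster, here by nearest centroid.

```python
from statistics import mean

# Hypothetical toy set: each element (e.g. a cell) is a dict of attribute values.
sample = [
    {"size": 1.0, "marker": 0.2},
    {"size": 1.2, "marker": 0.3},
    {"size": 5.0, "marker": 4.1},
    {"size": 5.4, "marker": 3.9},
    {"size": 5.2, "marker": 4.0},
]

def aggregate_by_condition(elements, condition, attribute, agg):
    """Summarise the elements that satisfy an explicit attribute condition."""
    subset = [e for e in elements if condition(e)]
    return agg([e[attribute] for e in subset]) if subset else None

def assign_clusters(elements, centroids, attributes):
    """Label each element with the index of its nearest centroid
    (squared Euclidean distance). The centroids are assumed to come
    from some clustering step that is not shown here."""
    def dist(e, c):
        return sum((e[a] - c[a]) ** 2 for a in attributes)
    return [
        min(range(len(centroids)), key=lambda k: dist(e, centroids[k]))
        for e in elements
    ]

def aggregate_by_cluster(elements, labels, cluster_id, attribute, agg):
    """Summarise the elements that belong to one cluster."""
    subset = [e for e, lab in zip(elements, labels) if lab == cluster_id]
    return agg([e[attribute] for e in subset]) if subset else None

# Assumed centroids for two hypothetical cell populations.
centroids = [{"size": 1.0, "marker": 0.0}, {"size": 5.0, "marker": 4.0}]
labels = assign_clusters(sample, centroids, ["size", "marker"])

# Condition-based aggregate: number of elements with size > 3.
count_large = aggregate_by_condition(sample, lambda e: e["size"] > 3, "size", len)

# Cluster-based aggregates: size of cluster 1 and its mean marker value.
count_cluster1 = aggregate_by_cluster(sample, labels, 1, "marker", len)
mean_marker_cluster1 = aggregate_by_cluster(sample, labels, 1, "marker", mean)
```

In this toy set both selections pick out the same three elements. The cluster-based form is the one that remains usable when no simple threshold on the attributes separates the relevant subset.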
Acknowledgments
Celine Vens is a Postdoctoral Fellow of the Research Foundation - Flanders (FWO). Sofie Van Gassen is funded by a Ph.D. grant of the Agency for Innovation by Science and Technology (IWT).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Vens, C., Van Gassen, S., Dhaene, T., Saeys, Y. (2015). Complex Aggregates over Clusters of Elements. In: Davis, J., Ramon, J. (eds) Inductive Logic Programming. Lecture Notes in Computer Science, vol 9046. Springer, Cham. https://doi.org/10.1007/978-3-319-23708-4_13
DOI: https://doi.org/10.1007/978-3-319-23708-4_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23707-7
Online ISBN: 978-3-319-23708-4
eBook Packages: Computer Science, Computer Science (R0)