Abstract
In multi-class classification tasks, such as human activity recognition, classes are often assumed to be separable. In real applications, this assumption is strong and generates inconsistencies. Moreover, the most commonly used approach is to learn each class one-by-one against the others. This computational simplification introduces strong inductive biases into the learned theories: the natural connections among some classes, and not others, deserve to be taken into account. In this paper, we show that organizing overlapping classes (multiple inheritances) into hierarchies considerably improves classification performance. This is particularly true for the activity recognition tasks featured in the SHL dataset. After theoretically showing the exponential number of possible class hierarchies, we propose an approach based on transfer affinity among the classes to determine an optimal hierarchy for the learning process. Extensive experiments show improved performance and a reduction in the number of examples needed for learning.
Notes
- 1.
In our case, we select several body-motion modalities to be included in our experiments, among the 16 input modalities of the original dataset: accelerometer, gyroscope, etc. Segmentation and processing steps are detailed in the experimental section.
- 2.
Software package and code to reproduce empirical results are publicly available at https://github.com/sensor-rich/hierarchicalSHL.
References
Cai, L., Hofmann, T.: Hierarchical document categorization with support vector machines. In: CIKM, pp. 78–87 (2004)
Carpineti, C., et al.: Custom dual transportation mode detection by smartphone devices exploiting sensor diversity. In: PerCom Workshops, pp. 367–372. IEEE (2018)
Cesa-Bianchi, N., Gentile, C., Zaniboni, L.: Incremental algorithms for hierarchical classification. JMLR 7, 31–54 (2006)
Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Measur. 20(1), 37–46 (1960)
Costa, E., Lorena, A., Carvalho, A., Freitas, A.: A review of performance evaluation measures for hierarchical classifiers. In: Evaluation Methods for machine Learning II: papers from the AAAI-2007 Workshop, pp. 1–6 (2007)
Essaidi, M., Osmani, A., Rouveirol, C.: Learning dependent-concepts in ILP: application to model-driven data warehouses. In: ILP, pp. 151–172 (2015)
Gjoreski, H., et al.: The University of Sussex-Huawei locomotion and transportation dataset for multimodal analytics with mobile devices. IEEE Access 6, 42592–42604 (2018)
Hamidi, M., Osmani, A.: Data generation process modeling for activity recognition. In: ECML-PKDD. Springer (2020)
Hamidi, M., Osmani, A., Alizadeh, P.: A multi-view architecture for the SHL challenge. In: UbiComp/ISWC Adjunct, pp. 317–322 (2020)
Kosmopoulos, A., Partalas, I., Gaussier, E., Paliouras, G., Androutsopoulos, I.: Evaluation measures for hierarchical classification: a unified view and novel approaches. Data Min. Knowl. Disc. 29(3), 820–865 (2014). https://doi.org/10.1007/s10618-014-0382-x
Lance, G.N., Williams, W.T.: A general theory of classificatory sorting strategies: 1. Hierarchical systems. Comput. J. 9(4), 373–380 (1967)
Nakamura, Y., et al.: Multi-stage activity inference for locomotion and transportation analytics of mobile users. In: UbiComp/ISWC, pp. 1579–1588 (2018)
Nguyen-Dinh, L.V., Calatroni, A., Tröster, G.: Robust online gesture recognition with crowdsourced annotations. JMLR 15(1), 3187–3220 (2014)
Peters, M.E., Ruder, S., Smith, N.A.: To tune or not to tune? Adapting pretrained representations to diverse tasks. arXiv preprint arXiv:1903.05987 (2019)
Samie, F., Bauer, L., Henkel, J.: Hierarchical classification for constrained IoT devices: a case study on human activity recognition. IEEE IoT J. 7(9), 8287–8295 (2020)
Scheurer, S., et al.: Using domain knowledge for interpretable and competitive multi-class human activity recognition. Sensors 20(4), 1208 (2020)
Silla, C.N., Freitas, A.A.: A survey of hierarchical classification across different application domains. Data Min. Knowl. Disc. 22(1–2), 31–72 (2011)
Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: NIPS, pp. 2951–2959 (2012)
Stikic, M., Schiele, B.: Activity recognition from sparsely labeled data using multi-instance learning. In: Choudhury, T., Quigley, A., Strang, T., Suginuma, K. (eds.) LoCA 2009. LNCS, vol. 5561, pp. 156–173. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-01721-6_10
Taran, V., Gordienko, Y., Rokovyi, A., Alienin, O., Stirenko, S.: Impact of ground truth annotation quality on performance of semantic image segmentation of traffic conditions. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds.) ICCSEEA 2019. AISC, vol. 938, pp. 183–193. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-16621-2_17
Vapnik, V.: Principles of risk minimization for learning theory. In: NIPS (1992)
Vincent, P., et al.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. JMLR 11(12) (2010)
Wang, L., et al.: Summary of the sussex-huawei locomotion-transportation recognition challenge. In: UbiComp/ISWC, pp. 1521–1530 (2018)
Wehrmann, J., Cerri, R., Barros, R.: Hierarchical multi-label classification networks. In: ICML, pp. 5075–5084 (2018)
Yao, H., Wei, Y., Huang, J., Li, Z.: Hierarchically structured meta-learning. In: ICML, pp. 7045–7054 (2019)
Zamir, A.R., Sax, A., Shen, W., Guibas, L.J., Malik, J., Savarese, S.: Taskonomy: Disentangling task transfer learning. In: CVPR, pp. 3712–3722 (2018)
Zhou, D., Xiao, L., Wu, M.: Hierarchical classification via orthogonal transfer. In: ICML, pp. 801–808 (2011)
Appendices
Appendix A Proof of Theorem 1
Proof
Consider \(K+1\) concepts consisting of \(K\) existing concepts \(c_1, \cdots , c_K\) and a newly added concept \(\gamma \). We can enumerate the possible first-level tree combinations as below, where each atomic element o stands for one of the concepts \(c_1, \cdots , c_K\). To compute the total number of tree combinations, we count the number of trees obtained by assigning the K concepts to each of the following cases:
-
\((\gamma (\overbrace{o\cdots o}^{K \text { concepts}}))\): taking the concept labels into account, the number of tree combinations is \(\binom{K}{0} L(1) \times 2 \times L(K)\). The factor 2 arises because, while the left side contains the atomic concept \(\gamma \), there are two choices for the right side of the tree at the first level: either we count the trees over the K concepts starting from the first level, or we keep the first level as the atomics \(\overbrace{o\cdots o}^{K \text { concepts}}\), keeping all K concepts together, and continue counting their tree combinations from the second level of the tree.
-
\(((\gamma o)(\overbrace{o\cdots o}^{K-1 \text { concepts}}))\): similarly to the previous case, there are \(\binom{K}{1} L(2) \times 2 \times L(K-1)\) tree combinations when taking the concept labels into account. \(\binom{K}{1}\) is the number of ways to choose, among the K concepts, the one that is placed together with the new concept, while L(2) is the number of tree combinations for the side of the tree containing the new concept \(\gamma \).
-
\(((\gamma oo)(\overbrace{o\cdots o}^{K-2 \text { concepts}}))\), \(\cdots \)
-
\(((\gamma \overbrace{o\cdots o}^{K-1 \text { concepts}})o)\): \(\binom{K}{K-1} L(K) L(1)\). In this special case, we follow the same formula, except that the single concept on the right side has only one possible combination at the first level, namely L(1), so no factor of 2 appears.
Summing these terms yields the total number of tree hierarchies for \(K+1\) concepts.
The total numbers of tree combinations for \(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, \cdots \) concepts are: 1, 1, 4, 26, 236, 2752, 39208, 660032, 12818912, \(282137824, \cdots \). In the case of the SHL dataset used in the empirical evaluation, we have 8 different concepts; the number of different hierarchies in this case is therefore \(L(8) = 660,032\).
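The enumeration above can be turned into a short recursive computation. The following is a minimal sketch; the function name `L` and the memoization are ours, but the recurrence mirrors the cases listed in the proof (all intermediate terms carry the factor 2, the last term does not):

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def L(n: int) -> int:
    """Total number of labelled tree hierarchies over n concepts."""
    if n <= 1:
        return 1
    K = n - 1  # add the new concept gamma to K existing concepts
    # Cases (gamma(o...o)), ((gamma o)(o...o)), ..., each with the factor 2.
    total = sum(comb(K, i) * L(i + 1) * 2 * L(K - i) for i in range(K - 1))
    # Last case ((gamma o...o) o): the single right-hand concept contributes L(1).
    total += comb(K, K - 1) * L(K) * L(1)
    return total


print([L(n) for n in range(1, 9)])
# Reproduces the sequence above, ending with L(8) = 660032.
```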
Appendix B Training Details
We use TensorFlow for building the encoders/decoders. We construct encoders by stacking Conv1D/ReLU/MaxPool blocks. These blocks are followed by fully connected/ReLU layers. Encoder performance estimation is based on the validation loss, and the task is framed as a sequence classification problem. As a preprocessing step, annotated input streams from the large-scale SHL dataset are segmented into sequences of 6000 samples, which correspond to a duration of 1 min at a sampling rate of 100 Hz. For weight optimization, we use stochastic gradient descent with a Nesterov momentum of 0.9 and a learning rate of 0.1 for a minimum of 12 epochs (we stop training when there is no further improvement). Weight decay is set to 0.0001. Furthermore, to make the neural networks more stable, we apply batch normalization on top of each convolutional layer. We use SVMs as our ERMs in the derived hierarchies.
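A minimal Keras sketch of such an encoder follows. Only the stated settings (Conv1D/ReLU/MaxPool blocks, batch normalization on top of each convolution, a fully connected/ReLU head, SGD with Nesterov momentum 0.9, learning rate 0.1, weight decay 1e-4, 6000-sample inputs) come from the text; the filter counts, kernel sizes, and number of blocks are illustrative assumptions:

```python
import tensorflow as tf


def build_encoder(num_classes: int, channels: int) -> tf.keras.Model:
    l2 = tf.keras.regularizers.l2(1e-4)  # weight decay of 0.0001
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(6000, channels)),  # 1 min at 100 Hz
        # Stacked Conv1D/ReLU/MaxPool blocks; batch norm on top of each conv.
        tf.keras.layers.Conv1D(32, 7, activation="relu", kernel_regularizer=l2),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.MaxPooling1D(4),
        tf.keras.layers.Conv1D(64, 5, activation="relu", kernel_regularizer=l2),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.MaxPooling1D(4),
        tf.keras.layers.Conv1D(128, 3, activation="relu", kernel_regularizer=l2),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.MaxPooling1D(4),
        tf.keras.layers.GlobalMaxPooling1D(),
        # Fully connected / ReLU head for sequence classification.
        tf.keras.layers.Dense(128, activation="relu", kernel_regularizer=l2),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.SGD(
            learning_rate=0.1, momentum=0.9, nesterov=True),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```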
Appendix C Evaluation Metrics
In hierarchical classification settings, the hierarchical structure is important and should be taken into account during model evaluation [17]. Various measures that account for the hierarchical structure of the learning process have been studied in the literature. They can be categorized into distance-based, depth-dependent, semantics-based, and hierarchy-based measures, each displaying advantages and disadvantages depending on the characteristics of the considered structure [5]. In our experiments, we use the H-loss, a hierarchy-based measure defined in [3]. This measure captures the intuition that “whenever a classification mistake is made on a node of the taxonomy, then no loss should be charged for any additional mistake occurring in the sub-tree of that node”: \(\ell _H(\hat{y}, y) = \sum _{i=1}^{N} \mathbb {1}\{ \hat{y}_i \ne y_i \wedge \hat{y}_j = y_j, \forall j \in Anc(i) \}\), where \(\hat{y} = (\hat{y}_1, \cdots , \hat{y}_N)\) are the predicted labels, \(y = (y_1, \cdots , y_N)\) are the true labels, and Anc(i) is the set of ancestors of node i.
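This definition can be sketched in a few lines. In the sketch below, the data layout is ours, not the authors': labels are dictionaries mapping node identifiers to binary labels, and the taxonomy is a child-to-parent map (the root has no entry):

```python
def ancestors(node, parent):
    """Set of ancestors of `node`, given a child -> parent map."""
    anc = set()
    while node in parent:
        node = parent[node]
        anc.add(node)
    return anc


def h_loss(y_pred, y_true, parent):
    """H-loss: count mispredicted nodes whose ancestors are all predicted correctly."""
    return sum(
        1
        for i in y_true
        if y_pred[i] != y_true[i]
        and all(y_pred[j] == y_true[j] for j in ancestors(i, parent))
    )
```

For a taxonomy with root 0 and children 1 and 2, a mistake at the root masks any further mistakes in its subtree, so only one unit of loss is charged regardless of the predictions made below it.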
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Osmani, A., Hamidi, M., Alizadeh, P. (2021). Hierarchical Learning of Dependent Concepts for Human Activity Recognition. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science(), vol 12713. Springer, Cham. https://doi.org/10.1007/978-3-030-75765-6_7
DOI: https://doi.org/10.1007/978-3-030-75765-6_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75764-9
Online ISBN: 978-3-030-75765-6