
Hierarchical Learning of Dependent Concepts for Human Activity Recognition

  • Conference paper
  • In: Advances in Knowledge Discovery and Data Mining (PAKDD 2021)
  • Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12713)

Abstract

In multi-class classification tasks, such as human activity recognition, it is often assumed that classes are separable. In real applications, this assumption is strong and generates inconsistencies. Moreover, the most common approach is to learn each class one-against-the-rest. This computational simplification introduces strong inductive biases into the learned theories, because the natural connections among some classes, and not others, deserve to be taken into account. In this paper, we show that organizing overlapping classes (multiple inheritance) into hierarchies considerably improves classification performance. This is particularly true for the activity recognition tasks featured in the SHL dataset. After showing theoretically that the number of possible class hierarchies grows exponentially, we propose an approach based on transfer affinity among the classes to determine an optimal hierarchy for the learning process. Extensive experiments show improved performance and a reduction in the number of examples needed to learn.


Notes

  1. In our case, we select several body-motion modalities among the 16 input modalities of the original dataset (accelerometer, gyroscope, etc.) to include in our experiments. Segmentation and processing are described in the experimental section.

  2. The software package and code to reproduce the empirical results are publicly available at https://github.com/sensor-rich/hierarchicalSHL.

References

  1. Cai, L., Hofmann, T.: Hierarchical document categorization with support vector machines. In: CIKM, pp. 78–87 (2004)

  2. Carpineti, C., et al.: Custom dual transportation mode detection by smartphone devices exploiting sensor diversity. In: PerCom Workshops, pp. 367–372. IEEE (2018)

  3. Cesa-Bianchi, N., Gentile, C., Zaniboni, L.: Incremental algorithms for hierarchical classification. JMLR 7, 31–54 (2006)

  4. Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Measur. 20(1), 37–46 (1960)

  5. Costa, E., Lorena, A., Carvalho, A., Freitas, A.: A review of performance evaluation measures for hierarchical classifiers. In: Evaluation Methods for Machine Learning II: Papers from the AAAI-2007 Workshop, pp. 1–6 (2007)

  6. Essaidi, M., Osmani, A., Rouveirol, C.: Learning dependent-concepts in ILP: application to model-driven data warehouses. In: ILP, pp. 151–172 (2015)

  7. Gjoreski, H., et al.: The University of Sussex-Huawei locomotion and transportation dataset for multimodal analytics with mobile devices. IEEE Access 6, 42592–42604 (2018)

  8. Hamidi, M., Osmani, A.: Data generation process modeling for activity recognition. In: ECML-PKDD. Springer (2020)

  9. Hamidi, M., Osmani, A., Alizadeh, P.: A multi-view architecture for the SHL challenge. In: UbiComp/ISWC Adjunct, pp. 317–322 (2020)

  10. Kosmopoulos, A., Partalas, I., Gaussier, E., Paliouras, G., Androutsopoulos, I.: Evaluation measures for hierarchical classification: a unified view and novel approaches. Data Min. Knowl. Disc. 29(3), 820–865 (2014). https://doi.org/10.1007/s10618-014-0382-x

  11. Lance, G.N., Williams, W.T.: A general theory of classificatory sorting strategies: 1. Hierarchical systems. Comput. J. 9(4), 373–380 (1967)

  12. Nakamura, Y., et al.: Multi-stage activity inference for locomotion and transportation analytics of mobile users. In: UbiComp/ISWC, pp. 1579–1588 (2018)

  13. Nguyen-Dinh, L.V., Calatroni, A., Tröster, G.: Robust online gesture recognition with crowdsourced annotations. JMLR 15(1), 3187–3220 (2014)

  14. Peters, M.E., Ruder, S., Smith, N.A.: To tune or not to tune? Adapting pretrained representations to diverse tasks. arXiv preprint arXiv:1903.05987 (2019)

  15. Samie, F., Bauer, L., Henkel, J.: Hierarchical classification for constrained IoT devices: a case study on human activity recognition. IEEE IoT J. 7(9), 8287–8295 (2020)

  16. Scheurer, S., et al.: Using domain knowledge for interpretable and competitive multi-class human activity recognition. Sensors 20(4), 1208 (2020)

  17. Silla, C.N., Freitas, A.A.: A survey of hierarchical classification across different application domains. Data Min. Knowl. Disc. 22(1–2), 31–72 (2011)

  18. Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: NIPS, pp. 2951–2959 (2012)

  19. Stikic, M., Schiele, B.: Activity recognition from sparsely labeled data using multi-instance learning. In: Choudhury, T., Quigley, A., Strang, T., Suginuma, K. (eds.) LoCA 2009. LNCS, vol. 5561, pp. 156–173. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-01721-6_10

  20. Taran, V., Gordienko, Y., Rokovyi, A., Alienin, O., Stirenko, S.: Impact of ground truth annotation quality on performance of semantic image segmentation of traffic conditions. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds.) ICCSEEA 2019. AISC, vol. 938, pp. 183–193. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-16621-2_17

  21. Vapnik, V.: Principles of risk minimization for learning theory. In: NIPS (1992)

  22. Vincent, P., et al.: Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. JMLR 11(12) (2010)

  23. Wang, L., et al.: Summary of the Sussex-Huawei locomotion-transportation recognition challenge. In: UbiComp/ISWC, pp. 1521–1530 (2018)

  24. Wehrmann, J., Cerri, R., Barros, R.: Hierarchical multi-label classification networks. In: ICML, pp. 5075–5084 (2018)

  25. Yao, H., Wei, Y., Huang, J., Li, Z.: Hierarchically structured meta-learning. In: ICML, pp. 7045–7054 (2019)

  26. Zamir, A.R., Sax, A., Shen, W., Guibas, L.J., Malik, J., Savarese, S.: Taskonomy: disentangling task transfer learning. In: CVPR, pp. 3712–3722 (2018)

  27. Zhou, D., Xiao, L., Wu, M.: Hierarchical classification via orthogonal transfer. In: ICML, pp. 801–808 (2011)


Author information

Correspondence to Massinissa Hamidi.


Appendices

Appendix A

Proof

Theorem 1. The result can be explained by observing that, for \(K+1\) concepts consisting of K existing concepts \(c_1, \cdots, c_K\) and a newly added concept \(\gamma \), the first-level tree combinations can be produced as below. Note that each atomic element o stands for one of the concepts \(c_1, \cdots, c_K\). To compute the total number of tree combinations, we count, for each item, the number of tree combinations obtained by assigning the K concepts:

  • \((\gamma (\overbrace{o\cdots o}^{K \text {concepts}}))\): taking the concept labels into account, the number of tree combinations is \({K \atopwithdelims ()0} L(1) \times 2 \times L(K)\). The factor 2 multiplying the number of tree combinations for K concepts arises because, while the left side contains the atomic concept \(\gamma \), there are two choices for the right side of the tree at the first level: either we expand the K concepts into subtrees starting from the first level, or we keep the first level as the atomics \(\overbrace{o\cdots o}^{K \text {concepts}}\), keeping all K concepts together and continuing the K-concept tree combinations from the second level.

  • \(((\gamma o)(\overbrace{o\cdots o}^{K-1 \text {concepts}}))\): similarly to the previous case, there are \({K \atopwithdelims ()1} L(2) \times 2 \times L(K-1)\) tree combinations when taking the concept labels into account. \({K \atopwithdelims ()1}\) is the number of ways to choose one concept from the K concepts and pair it with the new concept, while L(2) is the number of tree combinations for the subtree containing the new concept \(\gamma \).

  • \(((\gamma oo)(\overbrace{o\cdots o}^{K-2 \text {concepts}}))\), \(\cdots \)

  • \(((\gamma \overbrace{o\cdots o}^{K-1 \text {concepts}})o)\): \({K \atopwithdelims ()K-1} L(K) L(1)\); in this last case, we follow the same formula except that the single concept on the right side has only one possible combination at the first level, namely L(1).

Summing these terms gives the total number of tree hierarchies for \(K+1\) concepts.

The first few values of the total number of tree combinations for \(1, 2, 3, \ldots, 10, \cdots \) concepts are \(1, 1, 4, 26, 236, 2752, 39208, 660032, 12818912, 282137824, \cdots \). In the case of the SHL dataset used in the empirical evaluation, we have 8 different concepts, so the number of distinct hierarchies is \(L(8) = 660{,}032\).
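The sequence above can be reproduced numerically. As a sketch (and as an assumption not taken from the proof's own construction), the values coincide with the counts of series-reduced rooted trees with labeled leaves, for which the following recurrence is known; the function name `L` mirrors the notation of the text:

```python
# Number of candidate class hierarchies L(K): the sequence quoted in the
# text (1, 1, 4, 26, 236, 2752, 39208, 660032, ...).
# Assumption: we reproduce it via a known recurrence for series-reduced
# rooted trees with K labeled leaves, not via the proof's case analysis.
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def L(n: int) -> int:
    if n <= 1:
        return 1
    # L(n) = -(n-1) L(n-1) + sum_{k=1}^{n-1} C(n,k) L(k) L(n-k)
    return -(n - 1) * L(n - 1) + sum(
        comb(n, k) * L(k) * L(n - k) for k in range(1, n)
    )

print([L(k) for k in range(1, 9)])
# -> [1, 1, 4, 26, 236, 2752, 39208, 660032]
```

In particular, `L(8) == 660032`, the count stated for the 8 SHL concepts, which illustrates why exhaustive search over hierarchies is infeasible and a transfer-affinity heuristic is needed.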

Appendix B Training Details

We use TensorFlow for building the encoders/decoders. We construct encoders by stacking Conv1D/ReLU/MaxPool blocks, followed by fully connected/ReLU layers. Encoder performance estimation is based on the validation loss and is framed as a sequence classification problem. As a preprocessing step, annotated input streams from the large SHL dataset are segmented into sequences of 6000 samples, corresponding to a duration of 1 min at a sampling rate of 100 Hz. For weight optimization, we use stochastic gradient descent with a Nesterov momentum of 0.9 and a learning rate of 0.1 for a minimum of 12 epochs (we stop training if there is no improvement). Weight decay is set to 0.0001. Furthermore, to make the neural networks more stable, we use batch normalization on top of each convolutional layer. We use SVMs as our ERMs in the derived hierarchies.
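The segmentation step above can be sketched as follows. This is a minimal illustration, not the authors' code: the use of non-overlapping windows and a majority vote over sample-level annotations are assumptions, as the text only fixes the window length (6000 samples, i.e. 1 min at 100 Hz).

```python
import numpy as np

SAMPLING_RATE = 100          # Hz, as stated in the text
WINDOW = 60 * SAMPLING_RATE  # 6000 samples = 1 minute

def segment(stream: np.ndarray, labels: np.ndarray, window: int = WINDOW):
    """Cut an annotated stream of shape (T, channels) into non-overlapping
    fixed-length sequences, assigning each window its majority label."""
    n = (len(stream) // window) * window  # drop the incomplete tail
    x = stream[:n].reshape(-1, window, stream.shape[1])
    y = labels[:n].reshape(-1, window)
    # Majority vote over each window's sample-level annotations (assumption).
    y = np.array([np.bincount(w).argmax() for w in y])
    return x, y

# Toy stream: 2.5 minutes of 3-channel data -> two full 1-minute windows.
x, y = segment(np.zeros((15000, 3)), np.ones(15000, dtype=int))
print(x.shape, y.shape)  # (2, 6000, 3) (2,)
```

The resulting (segments, 6000, channels) tensor is what a Conv1D encoder of the kind described above would consume.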

Appendix C Evaluation Metrics

In hierarchical classification settings, the hierarchical structure is important and should be taken into account during model evaluation [17]. Various measures that account for the hierarchical structure of the learning process have been studied in the literature. They can be categorized into distance-based, depth-dependent, semantics-based, and hierarchy-based measures. Each displays advantages and disadvantages depending on the characteristics of the considered structure [5]. In our experiments, we use the H-loss, a hierarchy-based measure defined in [3]. This measure captures the intuition that "whenever a classification mistake is made on a node of the taxonomy, then no loss should be charged for any additional mistake occurring in the sub-tree of that node." \(\ell _H(\hat{y}, y) = \sum _{i=1}^{N} \mathbb {1}\{ \hat{y}_i \ne y_i \wedge \hat{y}_j = y_j \ \forall j \in Anc(i) \}\), where \(\hat{y} = (\hat{y}_1, \cdots , \hat{y}_N)\) are the predicted labels, \(y = (y_1, \cdots , y_N)\) are the true labels, and Anc(i) is the set of ancestors of node i.
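The H-loss definition above translates directly into code. The sketch below is illustrative; the node indexing and the ancestor map `anc` are hypothetical choices, not taken from the paper's implementation:

```python
def h_loss(y_pred, y_true, ancestors):
    """H-loss of Cesa-Bianchi et al.: a node i is charged only when it is
    misclassified while all of its ancestors Anc(i) are classified
    correctly, so mistakes inside an already-wrong subtree cost nothing.
    ancestors[i] is the set of ancestor node indices of node i."""
    return sum(
        1
        for i in range(len(y_true))
        if y_pred[i] != y_true[i]
        and all(y_pred[j] == y_true[j] for j in ancestors[i])
    )

# Tiny 3-node taxonomy (hypothetical): node 0 is the root of nodes 1 and 2.
anc = {0: set(), 1: {0}, 2: {0}}
# Root misclassified: only the root is charged, not its subtree.
print(h_loss([0, 1, 0], [1, 0, 1], anc))  # -> 1
```

Note how the same three label disagreements would cost 3 under a flat 0/1 node loss, but only 1 under the H-loss, since the errors at nodes 1 and 2 occur below an already-misclassified ancestor.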


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Osmani, A., Hamidi, M., Alizadeh, P. (2021). Hierarchical Learning of Dependent Concepts for Human Activity Recognition. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12713. Springer, Cham. https://doi.org/10.1007/978-3-030-75765-6_7


  • DOI: https://doi.org/10.1007/978-3-030-75765-6_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-75764-9

  • Online ISBN: 978-3-030-75765-6

