
The Distributed Ledger-Based Technique of the Neuronet Training Set Forming

  • Conference paper
Computational Statistics and Mathematical Modeling Methods in Intelligent Systems (CoMeSySo 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1047)


Abstract

Generating training datasets for machine learning projects is a topical problem. The cost of dataset formation can be considerable, yet there is no guarantee that the prepared data will be of acceptable quality. An important issue in dataset generation is labeling noise, the main causes of which are expert errors, insufficient information, subjective factors, and so on. Labeling noise affects the learning stage of a neuronet and thus increases the number of errors during its operation. In this paper, a technique for decreasing the level of labeling noise is proposed, based on the principles of distributed ledger technology. Besides reducing the number of labeling errors, integrating the labeling services on the basis of a distributed ledger makes it possible to improve the efficiency of dataset forming.
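The paper does not publish an implementation, so the following is a minimal, purely illustrative Python sketch of the idea the abstract describes: each expert's label is appended to a hash-chained, append-only ledger, and a sample's training label is fixed only when a quorum of independent annotators agrees, so a single noisy annotator cannot silently corrupt the dataset. All names here (LabelLedger, add_label, consensus_label) and the simple majority-vote quorum rule are assumptions made for illustration, not the authors' protocol.

```python
import hashlib
import json
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LabelRecord:
    """One labeling event, chained to its predecessor by hash.

    Hypothetical structure, not taken from the paper.
    """
    sample_id: str
    annotator: str
    label: str
    prev_hash: str
    record_hash: str = field(init=False)

    def __post_init__(self):
        payload = json.dumps(
            [self.sample_id, self.annotator, self.label, self.prev_hash]
        )
        self.record_hash = hashlib.sha256(payload.encode()).hexdigest()


class LabelLedger:
    """Append-only, hash-chained log of labeling events."""

    def __init__(self):
        self.chain: list[LabelRecord] = []

    def add_label(self, sample_id: str, annotator: str, label: str) -> LabelRecord:
        prev = self.chain[-1].record_hash if self.chain else "0" * 64
        record = LabelRecord(sample_id, annotator, label, prev)
        self.chain.append(record)
        return record

    def verify(self) -> bool:
        """Recompute every hash; False if any stored label was altered."""
        prev = "0" * 64
        for rec in self.chain:
            expected = LabelRecord(rec.sample_id, rec.annotator, rec.label, prev)
            if expected.record_hash != rec.record_hash:
                return False
            prev = rec.record_hash
        return True

    def consensus_label(self, sample_id: str, quorum: int = 2):
        """Majority vote over all annotators; None until a quorum agrees."""
        votes = Counter(
            rec.label for rec in self.chain if rec.sample_id == sample_id
        )
        if not votes:
            return None
        label, count = votes.most_common(1)[0]
        return label if count >= quorum else None


if __name__ == "__main__":
    ledger = LabelLedger()
    ledger.add_label("img_001", "expert_a", "cat")
    ledger.add_label("img_001", "expert_b", "cat")
    ledger.add_label("img_001", "expert_c", "dog")  # a noisy label
    assert ledger.verify()
    print(ledger.consensus_label("img_001"))  # -> "cat"
```

The hash chain gives the integrity property that motivates using a distributed ledger here: `verify()` detects any after-the-fact tampering with a recorded label, while the quorum vote filters out individual expert errors before a label enters the training set.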


Acknowledgements

The paper has been prepared within RFBR projects 18-29-22086 and 18-29-22046.

Author information


Corresponding author

Correspondence to A. B. Klimenko.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Melnik, E.V., Klimenko, A.B., Ivanov, D.Y. (2019). The Distributed Ledger-Based Technique of the Neuronet Training Set Forming. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds) Computational Statistics and Mathematical Modeling Methods in Intelligent Systems. CoMeSySo 2019. Advances in Intelligent Systems and Computing, vol 1047. Springer, Cham. https://doi.org/10.1007/978-3-030-31362-3_2
