Abstract
Generating training datasets for machine learning projects is a topical problem. The cost of dataset formation can be considerable, yet there is no guarantee that the prepared data will be of acceptable quality. An important issue in dataset generation is labeling noise. The main causes of this phenomenon are expert errors, insufficient information, subjective factors, and so on. Labeling noise affects the training stage of a neural network and thus increases the number of errors during its operation. In this paper, a technique to decrease the level of labeling noise is proposed, based on the principles of distributed ledger technology. Besides reducing the number of labeling errors, integrating services on the basis of a distributed ledger improves the efficiency of dataset formation.
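The core idea of reducing labeling noise through agreement among independent parties can be illustrated with a minimal sketch. The snippet below is not the authors' protocol; it simulates several hypothetical noisy annotators and aggregates their votes by majority consensus (the simplest analogue of a ledger-style consensus on a label), assuming an illustrative per-annotator error rate:

```python
import random
from collections import Counter

def consensus_label(votes):
    """Return the majority label among annotator votes (simple consensus)."""
    return Counter(votes).most_common(1)[0][0]

def label_dataset(true_labels, n_annotators=5, error_rate=0.2, seed=0):
    """Simulate noisy annotators and aggregate their votes by consensus."""
    rng = random.Random(seed)
    label_set = sorted(set(true_labels))
    consensus = []
    for y in true_labels:
        votes = []
        for _ in range(n_annotators):
            if rng.random() < error_rate:
                # annotator makes an error: picks some other label
                votes.append(rng.choice([l for l in label_set if l != y]))
            else:
                votes.append(y)
        consensus.append(consensus_label(votes))
    return consensus

truth = [0, 1] * 50
agreed = label_dataset(truth)
residual = sum(a != t for a, t in zip(agreed, truth)) / len(truth)
print(f"residual noise after consensus: {residual:.2%}")
```

With five annotators each erring 20% of the time, a majority of votes is wrong far less often than any single annotator, which is the statistical effect the proposed technique exploits.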
Acknowledgements
The paper has been prepared within the RFBR projects 18-29-22086 and 18-29-22046.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Melnik, E.V., Klimenko, A.B., Ivanov, D.Y. (2019). The Distributed Ledger-Based Technique of the Neuronet Training Set Forming. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds) Computational Statistics and Mathematical Modeling Methods in Intelligent Systems. CoMeSySo 2019. Advances in Intelligent Systems and Computing, vol 1047. Springer, Cham. https://doi.org/10.1007/978-3-030-31362-3_2
DOI: https://doi.org/10.1007/978-3-030-31362-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31361-6
Online ISBN: 978-3-030-31362-3
eBook Packages: Intelligent Technologies and Robotics (R0)