The Application of Spark-Based Gaussian Mixture Model for Farm Environmental Data Analysis

  • Conference paper
Theory, Methodology, Tools and Applications for Modeling and Simulation of Complex Systems (AsiaSim 2016, SCS AutumnSim 2016)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 645))

Abstract

To account fully for the characteristics of farm environmental data sets, the Gaussian mixture model (GMM) is combined with the Dirichlet Process (DP), which removes the need to specify the initial number of clusters. Gibbs sampling replaces the Expectation Maximization algorithm for estimating the parameters of the DP-based model. The clustering process is implemented on Spark so that it can handle farm environmental data stored across a distributed computer cluster. Experimental results evaluated with an external criterion show that the improved clustering method detects data anomalies better than other common clustering methods. The improved method is then applied to anomaly detection in farm environmental data.
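As a rough illustration of why a Dirichlet Process prior removes the need to fix the cluster count in advance, the sketch below samples cluster labels from the Chinese restaurant process, a standard sequential view of the DP. It is not the authors' Spark implementation and does not perform Gibbs sampling over GMM parameters; the function name `crp_assignments`, the concentration value `alpha=1.0`, and the sample size are assumptions chosen only for illustration.

```python
import random


def crp_assignments(n_points, alpha, seed=0):
    """Draw cluster labels for n_points from a Chinese restaurant process.

    alpha is the DP concentration parameter (must be > 0); larger values
    make opening a new cluster more likely.
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of points currently in cluster k
    for _ in range(n_points):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels, counts


# Hypothetical settings: 200 points, alpha = 1.0.
labels, counts = crp_assignments(n_points=200, alpha=1.0)
print("clusters used:", len(counts))
print("cluster sizes:", counts)
```

The number of clusters is an outcome of the sampling rather than an input. In a full DP-GMM Gibbs sampler, the same "existing cluster versus new cluster" weights would additionally be multiplied by the Gaussian likelihood of each data point under each cluster.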

Acknowledgments

This work was supported by the Key Project of Science and Technology Commission of Shanghai Municipality (Grant No. 14DZ1206302), the National Natural Science Foundation of China (Grant No. 61304031), the Innovation Program of Shanghai Municipal Education Commission (No. 14YZ007), and the Shanghai College Young Teachers' Training Plan (No. B37010913003). The authors would like to thank the editors and anonymous reviewers for their valuable comments and suggestions for improving this paper.

Author information

Corresponding author

Correspondence to Li Deng.

Copyright information

© 2016 Springer Science+Business Media Singapore

About this paper

Cite this paper

Pang, H., Deng, L., Wang, L., Fei, M. (2016). The Application of Spark-Based Gaussian Mixture Model for Farm Environmental Data Analysis. In: Zhang, L., Song, X., Wu, Y. (eds) Theory, Methodology, Tools and Applications for Modeling and Simulation of Complex Systems. AsiaSim 2016, SCS AutumnSim 2016. Communications in Computer and Information Science, vol 645. Springer, Singapore. https://doi.org/10.1007/978-981-10-2669-0_18

  • DOI: https://doi.org/10.1007/978-981-10-2669-0_18

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-2668-3

  • Online ISBN: 978-981-10-2669-0

  • eBook Packages: Computer Science, Computer Science (R0)
