Data Discovery and Related Factors of Documents on the Web and the Network

  • Conference paper
Computational Science and Its Applications – ICCSA 2009 (ICCSA 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5592)

Abstract

On the Web and in network environments, fast and precise data transmission is essential, and many additional elements must be supported to achieve it. Data discovery is one of these elements. This paper focuses on data discovery strategies, especially on the main factors related to discovery strategies for markup data, which accounts for a considerable amount of Web and network data. For the evaluation, we defined the factors and simulated them on our data discovery system to show how much they affect performance.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Moon, HJ., Yeom, SH., Choi, J., Yoo, CW. (2009). Data Discovery and Related Factors of Documents on the Web and the Network. In: Gervasi, O., Taniar, D., Murgante, B., Laganà, A., Mun, Y., Gavrilova, M.L. (eds) Computational Science and Its Applications – ICCSA 2009. ICCSA 2009. Lecture Notes in Computer Science, vol 5592. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02454-2_46

  • DOI: https://doi.org/10.1007/978-3-642-02454-2_46

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02453-5

  • Online ISBN: 978-3-642-02454-2

  • eBook Packages: Computer Science, Computer Science (R0)
