Automating the construction of authority files in digital libraries: A case study

Conference paper

In: Research and Advanced Technology for Digital Libraries (ECDL 1997)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1324)

Abstract

The issue of quality control has become increasingly important as more online databases are integrated into digital libraries. Data quality can have a dramatic effect on the search effectiveness of an online system. Authority work, the discovery and reconciliation of variant forms of strings in bibliographic entries, will become more difficult. Spelling variants, misspellings, and translation and transliteration differences all increase the difficulty of retrieving information. This paper is a case study of our efforts to automate the creation of an authority file for authors' institutional affiliations in the Astrophysics Data System. The techniques surveyed here for the detection and categorization of variant forms have broader applicability and may be used to help automate authority work for other bibliographic fields.
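
The abstract does not spell out the specific techniques used, so the following is only an illustrative sketch of one common way to detect variant forms of affiliation strings: normalize each string, compare pairs with a character-level edit distance, and group strings whose normalized similarity exceeds a threshold. The function names, the greedy grouping strategy, and the 0.85 threshold are all assumptions made for this example, not the authors' method.

```python
import re


def normalize(s: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    s = re.sub(r"[^\w\s]", " ", s.lower())
    return " ".join(s.split())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1 minus edit distance scaled by the longer length."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def group_variants(strings, threshold=0.85):
    """Greedy grouping: a string joins the first group whose first member
    is similar enough; otherwise it starts a new group.
    The threshold is an assumed value chosen for illustration."""
    groups = []
    for s in strings:
        for group in groups:
            if similarity(s, group[0]) >= threshold:
                group.append(s)
                break
        else:
            groups.append([s])
    return groups


affiliations = [
    "Harvard-Smithsonian Center for Astrophysics",
    "Harvard Smithsonian Center for Astrophysics",
    "Harvard-Smithsonian Centre for Astrophysics",
    "National Radio Astronomy Observatory",
    "Natl. Radio Astronomy Observatory",
]
groups = group_variants(affiliations)
for group in groups:
    print(group)
```

With this made-up input and threshold, the three Harvard-Smithsonian strings fall into one group and the two observatory strings into another. A pure edit-distance approach like this handles spelling variants and light abbreviation, but it would need other methods for translated names or heavily abbreviated forms.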

This work was supported in part by Dept. of Energy grant no. DE-FG05-95ER25254, NSF grant CDA-9529253, DARPA contract N66001-97-C-8542, and a NASA Graduate Student Researchers Program fellowship.

The National Radio Astronomy Observatory is a facility of the National Science Foundation operated under cooperative agreement by Associated Universities, Inc.

Editor information

Carol Peters, Costantino Thanos

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

French, J.C., Powell, A.L., Schulman, E., Pfaltz, J.L. (1997). Automating the construction of authority files in digital libraries: A case study. In: Peters, C., Thanos, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 1997. Lecture Notes in Computer Science, vol 1324. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026721

  • DOI: https://doi.org/10.1007/BFb0026721

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63554-3

  • Online ISBN: 978-3-540-69597-4

  • eBook Packages: Springer Book Archive
