Abstract
fst stands for Finite-State Toolkit. It is an enhanced version of the xfst tool described in the 2003 Beesley and Karttunen book Finite State Morphology. Like xfst, fst serves two purposes: it is a development tool for compiling finite-state networks and a runtime tool for applying networks to input strings or files. Whereas xfst is limited to morphological analysis and generation, fst can also be used for other applications. This paper describes the new features of the fst regular expression formalism and illustrates their use for named-entity recognition, relation extraction, tokenization, and parsing. The fst pattern matching algorithm (pmatch) operates on a single pattern network, but that network can be the union of any number of distinct pattern definitions, so many patterns can be matched simultaneously in one pass over a text. This is a distinct advantage of fst over the pattern matching facilities in languages such as Perl and Python.
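The idea of taking the union of several pattern definitions and scanning a text once can be roughly sketched in Python. This is only an analogy, not fst or pmatch: it uses Python's backtracking `re` engine rather than a compiled finite-state network, and its alternation takes the first alternative that succeeds at a position, which may differ from how pmatch resolves competing matches. The pattern names and expressions (`YEAR`, `EMAIL`, `TITLE_NAME`) and the function `tag_matches` are hypothetical, invented for illustration.

```python
import re

# Hypothetical pattern definitions, invented for illustration only.
# They are not taken from the paper and are deliberately simplistic.
PATTERNS = {
    "YEAR": r"\b(?:19|20)\d{2}\b",
    "EMAIL": r"\b[\w.]+@[\w.]+\.\w+\b",
    "TITLE_NAME": r"\b(?:Dr|Prof)\. [A-Z][a-z]+\b",
}

# Union all definitions into a single pattern. Each alternative is wrapped
# in a named group so a match can be traced back to its definition.
UNION = re.compile(
    "|".join(f"(?P<{name}>{rx})" for name, rx in PATTERNS.items())
)


def tag_matches(text):
    """Scan the text once and return (label, matched_string) pairs."""
    return [(m.lastgroup, m.group()) for m in UNION.finditer(text)]


if __name__ == "__main__":
    print(tag_matches("Dr. Smith wrote to lk@example.org in 2010."))
```

Run on the sample sentence, the sketch returns `[('TITLE_NAME', 'Dr. Smith'), ('EMAIL', 'lk@example.org'), ('YEAR', '2010')]`. Each match carries the name of the definition that produced it, which is the property the union-of-patterns design relies on.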
References
Beesley, K.R., Karttunen, L.: Finite State Morphology. CSLI Publications, Palo Alto (2003)
Karttunen, L.: Pattern Matching with FST – A Tutorial. Technical Report TR-2010-01. Palo Alto Research Center, Palo Alto, CA (2010)
Woods, W.A.: Transition Network Grammars for Natural Language Analysis. Comm. ACM 13(10), 591–606 (1970)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Karttunen, L. (2011). Beyond Morphology: Pattern Matching with FST. In: Mahlow, C., Piotrowski, M. (eds) Systems and Frameworks for Computational Morphology. SFCM 2011. Communications in Computer and Information Science, vol 100. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23138-4_1
DOI: https://doi.org/10.1007/978-3-642-23138-4_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23137-7
Online ISBN: 978-3-642-23138-4
eBook Packages: Computer Science, Computer Science (R0)