Abstract
In Chapter 1, we reviewed basic concepts in theoretical and applied market microstructure and detailed the trading mechanisms used in several key exchanges. As stated in the introduction, our empirical work focuses on tick-by-tick data for stocks traded on the NYSE. In this chapter, we start by describing the intraday database that is available from this exchange (see Section 2). The Trade And Quote database, also called the TAQ database, provides intraday information on the price and quote processes for stocks traded on the NYSE and NASDAQ-AMEX. Although databases featuring financial information have been around for a long time, databases providing intraday information to the general public have only been available since the early nineties. Today, most stock exchanges make a (more or less) complete record of their intraday activity available to the academic community. The release of this information has given rise to a substantial amount of empirical research on trading mechanisms, the intraday characteristics of markets (liquidity, volatility), and the price formation process. While intraday databases provide researchers with a great deal of valuable information, we also highlight some of the problems that can arise when working with them because of the specific nature of the data.
Notes
Information about ordering this database is available at the NYSE Web site, www.nyse.com. Strictly speaking, the TAQ database is not the first intraday database of the NYSE, as a much older database (the TORQ database) is routinely used in empirical work (Engle and Russell, 1998). The Trades, Orders, Reports, and Quotes (TORQ) dataset was constructed by Hasbrouck and the NYSE in 1991. It provides intraday information on the price and quote processes for a sample of stocks traded on the NYSE over a 3-month period. More recently, NASDAQ has started releasing its own database, which gives intraday information on the quotes posted by all the market makers active for a given stock traded on NASDAQ (thus not only the best bid-ask quotes but also the other valid quotes).
The ticker of the stock is the identification code of the stock used at the exchange. For example, BA is the identification code for BOEING at the NYSE.
See also the first footnote of this chapter.
It should however be mentioned that some researchers (usually affiliated with the NYSE) were granted access to the historical inventory databases kept by the specialists at the NYSE (e.g. Hasbrouck and Sofianos, 1993).
We implemented this procedure using the GAUSS econometric program to get the needed data before estimating the models presented in the next chapters. Our code is available on request.
We did not retrieve the quoted depth at the ask and bid prices as we do not include this information in the econometric models of the next chapters, but it is straightforward to do so.
See for example Bauwens and Giot (1998, 2000), Engle and Russell (1997, 1998), Giot (2000a) or Gerhard and Hautsch (1999).
A formal definition of the rule used to filter the quotes is given in Engle and Russell (1997).
However, it is valuable information if the bid-ask spread is to be modelled.
Gouriéroux, Jasiak and Le Fol (1999) introduce volume durations for the trade process.
See the discussion of liquidity which is provided in subsection 2.4 of Chapter 1. A related measure is VNET which is introduced for price durations.
We also looked at other actively traded stocks like Coca-Cola, Boeing, Exxon, ATT, and we found that they generally have the same intraday characteristics as IBM and Disney.
In the basic version of the Poisson process, or the corresponding exponential model for the durations, the mean of the durations is by definition equal to their standard deviation.
When cp = $0.25 for the IBM stock, the dispersion index is equal to 1.13, and it is equal to 0.63 for volume durations with cv = 50,000.
The hump close to the origin is not an artifact of the kernel density estimation of a density that starts at the origin. We used the gamma kernel proposed by Chen (1998). The bandwidth was set at (0.9 s n^(-0.2))^2, where s is the standard deviation of the data and n the number of observations.
Information about the database and how to order it can be found on the Olsen Web site at www.olsen.ch.
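The notes on the dispersion index and on the gamma kernel density estimate can be illustrated with a short Python sketch. It is not the authors' GAUSS code. It assumes that the dispersion index is the ratio of the sample standard deviation to the sample mean (which equals 1 for exponential durations), and that the gamma kernel estimate at a point x is the average of Gamma(x/b + 1, b) densities evaluated at the observations, which is our reading of Chen's estimator. The function names and the simulated durations are hypothetical; only the bandwidth rule comes from the note above.

```python
import math
import random
import statistics


def dispersion_index(durations):
    """Sample standard deviation divided by sample mean (assumed definition)."""
    return statistics.stdev(durations) / statistics.mean(durations)


def gamma_pdf(t, shape, scale):
    """Gamma(shape, scale) density at t, for shape >= 1."""
    if t < 0.0:
        return 0.0
    if t == 0.0:
        # At the origin the density is 1/scale when shape == 1, and 0 when shape > 1.
        return 1.0 / scale if shape == 1.0 else 0.0
    log_pdf = ((shape - 1.0) * math.log(t) - t / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def gamma_kernel_density(x, data, b):
    """Gamma kernel estimate at x >= 0 (assumed form: kernel shape x/b + 1, scale b)."""
    shape = x / b + 1.0
    return sum(gamma_pdf(t, shape, b) for t in data) / len(data)


def note_bandwidth(data):
    """Bandwidth (0.9 s n^(-0.2))^2 as stated in the note above."""
    s = statistics.stdev(data)
    n = len(data)
    return (0.9 * s * n ** -0.2) ** 2


# Simulated exponential durations (hypothetical data, mean of 30 seconds).
random.seed(0)
durations = [random.expovariate(1.0 / 30.0) for _ in range(5000)]

print("dispersion index:", dispersion_index(durations))
b = note_bandwidth(durations)
for x in (0.0, 10.0, 30.0, 60.0):
    print(x, gamma_kernel_density(x, durations, b))
```

Because the kernel is itself a density on the non-negative half-line, no probability mass is assigned to negative durations, which is the reason for preferring it to a symmetric kernel near the origin. With real price or volume durations in place of the simulated sample, a dispersion index above 1 indicates overdispersion and a value below 1 indicates underdispersion relative to the exponential benchmark.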
Copyright information
© 2001 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Bauwens, L., Giot, P. (2001). NYSE TAQ Database and Financial Durations. In: Econometric Modelling of Stock Market Intraday Activity. Advanced Studies in Theoretical and Applied Econometrics, vol 38. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-3381-5_2
DOI: https://doi.org/10.1007/978-1-4757-3381-5_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-4906-6
Online ISBN: 978-1-4757-3381-5
eBook Packages: Springer Book Archive