Computational Data Sciences and the Regulation of Banking and Financial Services

From Social Data Mining and Analysis to Prediction and Community Detection

Abstract

The development of computational data science techniques in natural language processing (NLP) and machine learning (ML) algorithms to analyze large and complex textual information opens new avenues to study intricate policy processes at a scale unimaginable even a few years ago. We apply these scalable NLP and ML techniques to analyze the United States Government’s regulation of the banking and financial services sector. First, we employ NLP techniques to convert the text of financial regulation laws into feature vectors and infer representative “topics” across all the laws. Second, we apply ML algorithms to the feature vectors to predict various attributes of each law, focusing on the amount of authority delegated to regulators. Lastly, we compare the power of alternative models in predicting regulators’ discretion to oversee financial markets. These methods allow us to efficiently process large amounts of documents and represent the text of the laws in feature vectors, taking into account words, phrases, syntax, and semantics. The vectors can be paired with predefined policy features, thereby enabling us to build better predictive measures of financial sector regulation. The analysis offers policymakers and the business community alike a tool to automatically score policy features of financial regulation laws and to measure their impact on market performance.

Notes

  1.

    A number of studies show that government institutions matter for the regulation of markets. Keefer [20] argues that competitive governmental structures are linked with competitive markets. In particular, separation of powers and competitive elections are correlated with strong investor protection and lending to the private sector. Barth et al. [4] show that countries that encourage private enforcement of banking laws and regulation (e.g., through litigation), rather than direct control or no regulation at all, have the highest rates of financial sector development and therefore capital formation. Historical studies of financial development in the USA tell similar stories. Kroszner and Strahan [21] show that the relative political strength of the winners from deregulation (large banks and smaller, bank-dependent firms) and the losers (small banks and insurance firms) explains the timing of bank branching deregulation across states in the USA. Haber [18] argues that governments free from outside political competition will do little to implement regulations in the banking sector.

  2.

    For early work in this area, see, for example, McCubbins and Schwartz [24] and McCubbins et al. [25, 26].

  3.

    Excellent technical work on the optimal type of discretion to offer agencies is provided by Melumad and Shibano [27], Alonso and Matouschek [3], and Gailmard [11]. A series of studies examine the politics of delegation with an executive veto [41], civil service protections for bureaucrats [12, 13], and executive review of proposed regulations [43], among others. See also Bendor and Meirowitz [5] for contributions to the spatial model of delegation and Volden and Wiseman [42] for an overview of the development of this literature.

  4.

    Maskin and Tirole [22] and Alesina and Tabellini [2] also emphasize the benefits of delegation to bureaucrats and other non-accountable officials.

  5.

    The argument is quite intuitive: When the financial system experiences a shock, then constituents are more likely to hold the president and the executive accountable than any individual member of Congress. For formal proofs of these propositions, the reader is referred to Groll et al. [16]. A similar argument is made in trade policy as constituents hold the president and the executive more accountable for the overall economic conditions, which explains more free-trade oriented positions by the executive than Congress. See, for example, O’Halloran [34].

  6.

    The analysis relies on legislative summaries provided by Congressional Quarterly and contained in the Library of Congress’s Thomas legislative database.

  7.

    For example, the Dodd–Frank Wall Street Reform and Consumer Protection Act of 2010 (Dodd–Frank Act) delegated authority to the Federal Deposit Insurance Corporation to provide for an orderly liquidation process for large, failing financial institutions.

  8.

    To ensure the reliability of our measures, each law was coded independently by two separate annotators. It was reviewed by a third independent annotator, who noted inconsistencies. Upon final entry, each law was then checked a fourth time by the authors. O’Halloran et al. [35] provide a detailed description of the coding method used in the analysis.

  9.

    Examples of procedural constraints include spending limits and requirements for legislative action. See O’Halloran et al. [35] for a detailed description of these constraints.

  10.

    See Epstein and O’Halloran [10] for a complete discussion of this measure.

  11.

    The remaining laws neither regulated nor deregulated.

  12.

    Further, the patterns also seem consistent with the notion that regulations “decay” over time as new financial instruments appear to replace the old ones. If one estimates a Koyck distributed lag model \(y_{t} = \alpha + \beta x_{t} + \beta \phi x_{t-1} + \beta \phi ^{2}x_{t-2} + \cdots + \varepsilon _{t}\) via the usual instrumental variables technique, then \(\beta = -0.025\) and \(\phi = 0.49\), indicating that regulations lose roughly half their effectiveness each year. See Wooldridge [45], pp. 635–637, for details of the estimation technique.

  13.

    See Groll et al. [16].

  14.

    We define the exclusivity score of a word j for a topic k as the ratio of its probability of occurring in topic k to its probability of occurring in other topics. Thus \(\phi _{k,j} = \frac{\beta _{k,j}} {\sum _{i\neq k}\beta _{i,j}}\). We then define the \(\text{FREX}_{k,j}\) score as the harmonic mean of the word’s rank in the distribution of exclusivity scores for topic k (which frequency distribution is denoted \(\phi _{k,\cdot }\)) and the word’s rank in the distribution of word frequencies for topic k (which frequency distribution is denoted \(\mu _{k,\cdot }\)). Thus:

    $$\displaystyle{ \text{FREX}_{k,j} = \left ( \frac{\omega } {\text{ECDF}_{\phi _{k,\cdot }}(\phi _{k,j})} + \frac{(1-\omega )} {\text{ECDF}_{\mu _{k,\cdot }}(\mu _{k,j})}\right )^{-1} }$$
    (7)

    where ω is the weight for the exclusivity (which is set to 0.5 by default) and \(\text{ECDF}_{x_{k,\cdot }}\) is the empirical cumulative distribution function applied to the values x over the first index, giving us the rank. See Airoldi et al. [1], p. 280.

  15.

    Exclusivity of word j to topic k was defined in an earlier footnote, and is \(\phi _{k,j} = \frac{\beta _{k,j}} {\sum _{i\neq k}\beta _{i,j}}\). The exclusivity score for the whole topic is the sum of these \(\phi _{k,j}\) word scores for all words in a topic.

  16.

    For a test of these hypotheses using standard regression analysis, see Groll et al. [16].
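
As a rough illustration of the decay reading in note 12, the sketch below computes the lag coefficients \(\beta \phi ^{j}\) implied by the reported estimates (\(\beta = -0.025\), \(\phi = 0.49\)) and the share of the initial effect that survives after each year. The function names are hypothetical, and the code does not reproduce the instrumental variables estimation itself; it only unpacks what the estimated parameters imply.

```python
def koyck_weights(beta, phi, n_lags):
    """Lag coefficients beta * phi**j for j = 0 .. n_lags - 1."""
    return [beta * phi ** j for j in range(n_lags)]


def long_run_multiplier(beta, phi):
    """Sum of the infinite geometric series of lag coefficients.

    The series only converges when |phi| < 1.
    """
    if abs(phi) >= 1:
        raise ValueError("phi must satisfy |phi| < 1")
    return beta / (1 - phi)


# Point estimates reported in note 12.
beta, phi = -0.025, 0.49

weights = koyck_weights(beta, phi, n_lags=4)

# Share of the contemporaneous effect still present after j years.
retained = [w / weights[0] for w in weights]

long_run = long_run_multiplier(beta, phi)

for j, (w, r) in enumerate(zip(weights, retained)):
    print(f"lag {j}: coefficient {w:.5f}, share retained {r:.3f}")
print(f"long-run multiplier: {long_run:.5f}")
```

With \(\phi = 0.49\), the share retained after one year is 0.49, which is the sense in which regulations lose roughly half their effectiveness each year.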
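
The exclusivity and FREX definitions in notes 14 and 15 can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the chapter: the topic-word matrix `beta` is a made-up toy example, the helper names are hypothetical, the topic's word frequency \(\mu _{k,j}\) is taken to be \(\beta _{k,j}\), and every word is assumed to have positive probability in at least one other topic so that the ratio is defined.

```python
def ecdf(values, x):
    """Empirical CDF: share of entries in `values` that are <= x."""
    return sum(v <= x for v in values) / len(values)


def exclusivity(beta, k):
    """phi[k][j] = beta[k][j] / sum over i != k of beta[i][j], for each word j."""
    return [
        beta[k][j] / sum(beta[i][j] for i in range(len(beta)) if i != k)
        for j in range(len(beta[k]))
    ]


def frex(beta, k, omega=0.5):
    """Weighted harmonic mean of the exclusivity rank and the frequency rank."""
    phi_k = exclusivity(beta, k)
    mu_k = beta[k]  # assumption: word frequency within topic k is beta[k][j]
    return [
        1.0 / (omega / ecdf(phi_k, phi_k[j]) + (1 - omega) / ecdf(mu_k, mu_k[j]))
        for j in range(len(mu_k))
    ]


# Hypothetical toy matrix: 2 topics x 3 words, each row sums to 1.
beta = [
    [0.6, 0.3, 0.1],
    [0.1, 0.3, 0.6],
]

scores = frex(beta, k=0)

# Topic-level exclusivity (note 15): sum of the word-level scores.
topic_exclusivity = sum(exclusivity(beta, 0))

print(scores)
print(topic_exclusivity)
```

In this toy case the first word is both the most frequent and the most exclusive word in topic 0, so it receives the highest FREX score; a word shared equally across topics ranks lower even when it is fairly frequent.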

References

  1. Airoldi EM, Blei DM, Erosheva EA, Fienberg SE. Handbook of mixed membership models and their applications. Chapman and Hall/CRC handbooks of modern statistical methods. Boca Raton: CRC Press; 2015 (Kindle Edition).

  2. Alesina A, Tabellini G. Bureaucrats or politicians? Part I: a single policy task. Am Econ Rev. 2007;97(1):169–79.

  3. Alonso R, Matouschek N. Optimal delegation. Rev Econ Stud. 2008;75(1):259–93.

  4. Barth JR, Caprio G Jr, Levine R. Rethinking banking regulation: till angels govern. New York: Cambridge University Press; 2006.

  5. Bendor J, Meirowitz A. Spatial models of delegation. Am Polit Sci Rev. 2004;98(2):293–310.

  6. Blei DM. Probabilistic topic models. Commun ACM. 2012;55(4):77–84.

  7. Blei DM, Ng AY, Jordan M. Latent Dirichlet allocation. J Mach Learn Res. 2003;3:993–1022.

  8. Brill E. A simple rule-based part of speech tagger. In: Proceedings of the workshop on speech and natural language. Association for Computational Linguistics; 1992. p. 112–16.

  9. Chang J, Boyd-Graber J, Wang C, Gerrish S, Blei DM. Reading tea leaves: how humans interpret topic models. Adv Neural Inf Proces Syst. 2009;22:288–96.

  10. Epstein D, O’Halloran S. Delegating powers. New York: Cambridge University Press; 1999.

  11. Gailmard S. Discretion rather than rules: choice of instruments to constrain bureaucratic policy-making. Polit Anal. 2009;17(1):25–44.

  12. Gailmard S, Patty JW. Slackers and zealots: civil service, policy discretion, and bureaucratic expertise. Am J Polit Sci. 2007;51(4):873–89.

  13. Gailmard S, Patty JW. Formal models of bureaucracy. Ann Rev Polit Sci. 2012;15:353–77.

  14. Griffiths TL, Steyvers M. Finding scientific topics. Proc Natl Acad Sci. 2004;101(suppl 1):5228–35.

  15. Griffiths TL, Steyvers M, Tenenbaum JB. Topics in semantic representation. Psychol Rev. 2007;114(2):211–44.

  16. Groll T, O’Halloran S, McAllister G. Delegation and the regulation of financial markets, mimeo; 2015.

  17. Grün B, Hornik K. topicmodels: an R package for fitting topic models. J Stat Softw. 2011;40(13):1–30.

  18. Haber S. Political institutions and financial development: evidence from the political economy of bank regulation in Mexico and the United States. In: Haber S, North DC, Weingast BR, editors. Political institutions and financial development. Stanford: Stanford University Press; 2008.

  19. Hall MA. Correlation-based feature subset selection for machine learning. PhD thesis. University of Waikato; 1998.

  20. Keefer P. Beyond legal origin and checks and balances: political credibility, citizen information, and financial sector development. In: Haber S, North DC, Weingast BR, editors. Political institutions and financial development. Stanford: Stanford University Press; 2008.

  21. Kroszner R, Strahan P. What drives deregulation? Economics and politics of the relaxation of bank branching restrictions. Q J Econ. 1999;114(4):1437–67.

  22. Maskin E, Tirole J. The politician and the judge: accountability in government. Am Econ Rev. 2004;94(4):1034–54.

  23. McCallum A, Nigam K. A comparison of event models for naive Bayes text classification. In: AAAI-98 workshop on learning for text categorization, vol. 752. 1998. p. 41–8.

  24. McCubbins MD, Schwartz T. Congressional oversight overlooked: police patrols versus fire alarms. Am J Polit Sci. 1984;28(1):165–79.

  25. McCubbins MD, Noll R, Weingast B. Administrative procedures as instruments of political control. J Law Econ Org. 1987;3:243–77.

  26. McCubbins MD, Noll R, Weingast B. Structure and process, politics and policy: administrative arrangements and the political control of agencies. Virginia Law Rev. 1989;75:431–82.

  27. Melumad ND, Shibano T. Communication in settings with no transfers. RAND J Econ. 1991;22(2):173–98.

  28. Mihalcea R, Radev D. Graph-based natural language processing and information retrieval. Cambridge: Cambridge University Press; 2011.

  29. Mimno D, Wallach HM, Talley E, Leenders M, McCallum A. Optimizing semantic coherence in topic models. In: Proceedings of the conference on empirical methods in natural language processing. Association for Computational Linguistics; 2011.

  30. Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems; 2013.

  31. Morgan DP. Rating banks: risk and uncertainty in an opaque industry. Am Econ Rev. 2002;92(4):874–88.

  32. Nadeau D, Sekine S. A survey of named entity recognition and classification. Lingvisticae Investigationes. 2007;30(1):3–26.

  33. Nigam K, Lafferty J, McCallum A. Using maximum entropy for text classification. In: IJCAI-99 workshop on machine learning for information filtering; 1999.

  34. O’Halloran S. Politics, process, and American trade policy. Ann Arbor: University of Michigan Press; 1994.

  35. O’Halloran S, Maskey S, McAllister G, Park DK, Cheng K. Data science and political economy: application to financial regulatory structure. J Soc. Sci. 2016;2(7):87–109.

  36. Philippon T, Reshef A. Wages and human capital in the U.S. financial industry: 1909–2006. Q J Econ. 2012;127(4):1551–1609.

  37. Roberts ME, Stewart BM, Tingley D. STM: R package for structural topic models. R package. 2014 Jun;1:12.

  38. Roberts ME, Stewart BM, Tingley D, Lucas C, Leder-Luis J, Gadarian SK, et al. Structural topic models for open ended survey responses. Am J Polit Sci. 2014;58(4):1064–82.

  39. Spärck Jones K. A statistical interpretation of term specificity and its application in retrieval. J Doc. 1972;28:11–21.

  40. Steinbach M, Karypis G, Kumar V. A comparison of document clustering techniques. In: Proceedings of the 6th ACM SIGKDD, World text mining conference, Boston, MA; 2000.

  41. Volden C. A formal model of the politics of delegation in a separation of powers system. Am J Polit Sci. 2002;46(1):111–33.

  42. Volden C, Wiseman A. Formal approaches to the study of congress. In: Schickler E, Lee F, editors. Oxford handbook of congress. Oxford: Oxford University Press; 2011. p. 36–65.

  43. Wiseman AE. Delegation and positive-sum bureaucracies. J Polit. 2009;71(3):998–1014.

  44. Wong W, Liu W, Bennamoun M. Determination of unithood and termhood for term recognition. Handbook of research on text and web mining technologies. Hershey: IGI Global; 2008.

  45. Wooldridge JM. Introductory econometrics. South-Western College Publishing; 2006.

Author information

Corresponding author

Correspondence to Sharyn O’Halloran.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

O’Halloran, S., Dumas, M., Maskey, S., McAllister, G., Park, D.K. (2017). Computational Data Sciences and the Regulation of Banking and Financial Services. In: Kaya, M., Erdoǧan, Ö., Rokne, J. (eds) From Social Data Mining and Analysis to Prediction and Community Detection. Lecture Notes in Social Networks. Springer, Cham. https://doi.org/10.1007/978-3-319-51367-6_8

  • DOI: https://doi.org/10.1007/978-3-319-51367-6_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-51366-9

  • Online ISBN: 978-3-319-51367-6

  • eBook Packages: Computer Science, Computer Science (R0)
