FinTech-2023-09-12 Intro to big data and AI in finance - post-class posting (1)

NYU Stern School of Business, Fall 2023, FoF
FinTech - Sep 12, 2023
Intro to big data and AI in finance
Prof. Hanna Halaburda

Intro to big data and AI in finance
2 classes
• today: technical side
• Thursday: guest from AQR, Lukasz Pomorski, talking about practical use (or lack thereof)

warm-up poll
1. [T/F] Modern computers are very powerful, so nowadays organizations can collect any data they want and easily store it.
2. How many examples of AI applications in finance can you recall?
3. [T/F] AI applications in finance are mostly introduced by start-ups. Established incumbents stay away from this new technology.

big data
• peta → zetta
- peta = 10^15 = million gigabytes = 1,000 terabytes = 500 billion pages of standard printed text
- zetta = 10^21 = trillion gigabytes = billion terabytes
• definition
- data sets so large or complex that traditional data processing practices/applications are inadequate
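
The unit arithmetic on the big data slide can be checked with a few lines of Python. This is a minimal sketch, assuming decimal (SI) prefixes applied to bytes; the constant names and the `convert` helper are made up for illustration and are not part of the lecture.

```python
# Decimal (SI) prefixes applied to bytes, matching the powers of ten on the slide.
GIGA = 10**9
TERA = 10**12
PETA = 10**15
ZETTA = 10**21


def convert(n_bytes: int, unit: int) -> int:
    """Express a byte count in whole multiples of the given unit (hypothetical helper)."""
    return n_bytes // unit


# One petabyte in gigabytes and terabytes.
print(convert(PETA, GIGA))   # a million gigabytes
print(convert(PETA, TERA))   # 1,000 terabytes

# One zettabyte in gigabytes and terabytes.
print(convert(ZETTA, GIGA))  # a trillion gigabytes
print(convert(ZETTA, TERA))  # a billion terabytes

# Bytes per page implied by "500 billion pages" per petabyte.
bytes_per_page = PETA // (500 * 10**9)
print(bytes_per_page)
```

Moving from peta to zetta is a factor of 10^6, so a zettabyte holds a million petabytes.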

history
1663: death rates during bubonic plague; intro of statistical data analysis
1865: term business intelligence introduced
1926: Nikola Tesla predicts humans will one day have access to large swaths of data via an instrument that can be carried "in [one's] vest pocket."
1943: Colossus, data processing machine to decipher Nazi codes during WWII (computer)
1965: data center buildings to store millions of tax returns and fingerprints on magnetic tape
1969: ARPANET created
1996: digital data storage becomes more cost-effective than storing information on paper
1997: domain registered
2014: more mobile devices access the internet than desktops in the US; the rest of the world follows in 2016

properties (V's)
1. volume: amount of data (data sets growing rapidly - digitization, internet, mobile)
2. variety: diversity of data types and sources
- structured: high degree of organization, affording easy search in relational databases (financial statements, transaction statements)
- unstructured: compilation is time and energy consuming (social media posts, news articles)
- human vs machine generated
3. velocity: speed at which data is generated; high frequency trading, electronic payment systems vs. house purchasing
4. veracity: data quality and accuracy (noisy, incomplete, inconsistent or duplicated data)
5. variability: fluctuations and inconsistencies, e.g. seasonal, cyclical, sudden spikes
6. value: economic and strategic benefits
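
To make the veracity property concrete, here is a rough sketch of how incomplete, inconsistent and duplicated records might be flagged in a small batch of transactions. The records, the field names and the `veracity_report` function are all invented for illustration; real data-quality checks depend on the data source and are not specified in the lecture.

```python
from collections import Counter

# Hypothetical transaction records (ids, amounts and currencies are made up).
records = [
    {"id": "t1", "amount": 120.0, "currency": "USD"},
    {"id": "t2", "amount": None, "currency": "USD"},   # incomplete: missing amount
    {"id": "t3", "amount": 75.5, "currency": "usd"},   # inconsistent: lower-case code
    {"id": "t1", "amount": 120.0, "currency": "USD"},  # duplicated: same id as the first row
]


def veracity_report(rows):
    """Flag record ids that look incomplete, duplicated or inconsistent."""
    id_counts = Counter(r["id"] for r in rows)
    return {
        # Any field left empty (None) counts as incomplete.
        "incomplete": [r["id"] for r in rows if any(v is None for v in r.values())],
        # An id that appears more than once counts as duplicated.
        "duplicated": sorted(i for i, n in id_counts.items() if n > 1),
        # Currency codes are assumed to be upper-case; anything else is flagged.
        "inconsistent": [
            r["id"]
            for r in rows
            if r["currency"] is not None and r["currency"] != r["currency"].upper()
        ],
    }


report = veracity_report(records)
print(report)
```

Each check here is a simple rule on a single field; the point is only that veracity problems are usually found by explicit tests like these before any analysis is run.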