nep-big New Economics Papers
on Big Data
Issue of 2020‒05‒04
twenty-two papers chosen by
Tom Coupé
University of Canterbury

  1. Leveraging the Power of Images in Managing Product Return Rates By Daria Dzyabura; Siham El Kihal; John R. Hauser; Marat Ibragimov
  2. Important Factors Determining Fintech Loan Default: Evidence from the LendingClub Consumer Platform By Christophe Croux; Julapa Jagtiani; Tarunsai Korivi; Milos Vulanovic
  3. Identifying and measuring developments in artificial intelligence: Making the impossible possible By Stefano Baruffaldi; Brigitte van Beuzekom; Hélène Dernis; Dietmar Harhoff; Nandan Rao; David Rosenfeld; Mariagrazia Squicciarini
  4. Neural Network pricing of American put options By Raquel M. Gaspar; Sara D. Lopes; Bernardo Sequeira
  5. Valuation ratios, surprises, uncertainty or sentiment: How does financial machine learning predict returns from earnings announcements? By Schnaubelt, Matthias; Seifert, Oleg
  6. Visual Elicitation of Brand Perception By Daria Dzyabura; Renana Peres
  7. Novel multilayer stacking framework with weighted ensemble approach for multiclass credit scoring problem application By Marek Stelmach; Marcin Chlebus
  8. Using Big Data to Expand Financial Services : Benefits and Risks By Abraham,Facundo; Schmukler,Sergio L.; Tessada,Jose
  9. Shallow or deep? Detecting anomalous flows in the Canadian Automated Clearing and Settlement System using an autoencoder By Leonard Sabetti; Ronald Heijmans
  10. Deep reinforcement learning for the optimal placement of cryptocurrency limit orders By Schnaubelt, Matthias
  11. Empirical Study of Market Impact Conditional on Order-Flow Imbalance By Anastasia Bugaenko
  12. Optimizing the reliability of a bank with Logistic Regression and Particle Swarm Optimization By Vadlamani Ravi; Vadlamani Madhav
  13. Volatility spillovers and capital buffers among the G-SIBs By Paul D McNelis; James Yetman
  14. Analysis of the digital footprint in Latin America and the Caribbean: lessons learned from using big data to assess the digital economy By -
  15. Modeling Institutional Credit Risk with Financial News By Tam Tran-The
  16. Identification of potential off-grid municipalities with 100% renewable energy supply By Weinand, Jann; Ried, Sabrina; Kleinebrahm, Max; McKenna, Russell; Fichtner, Wolf
  17. The Power of Narratives in Economic Forecasts By Christopher A. Hollrah; Steven A. Sharpe; Nitish R. Sinha
  18. Firm-Level Exposure to Epidemic Diseases: Covid-19, SARS, and H1N1 By Tarek A. Hassan; Laurence van Lent; Stephan Hollander; Ahmed Tahoun
  19. COVID-19 and Company Knowledge Graphs: Assessing Golden Powers and Economic Impact of Selective Lockdown via AI Reasoning By Luigi Bellomarini; Marco Benedetti; Andrea Gentili; Rosario Laurendi; Davide Magnanimi; Antonio Muci; Emanuel Sallinger
  20. 25 Years of European Merger Control By Pauline Affeldt; Tomaso Duso; Florian Szücs
  21. Causal Inference in Case-Control Studies By Sung Jae Jun; Sokbae Lee
  22. Quantifying the Economic Impact of Extreme Shocks on Businesses using Human Mobility Data: a Bayesian Causal Inference Approach By Takahiro Yabe; Yunchang Zhang; Satish Ukkusuri

  1. By: Daria Dzyabura (New Economic School, Moscow, Russia); Siham El Kihal (Frankfurt School of Finance & Management, Germany); John R. Hauser (MIT Sloan School of Management, USA); Marat Ibragimov (MIT Sloan School of Management, USA)
    Abstract: In online channels, products are returned at high rates. Shipping, processing, and refurbishing are so costly that a retailer's profit is extremely sensitive to return rates. In many product categories, such as the $500 billion fashion industry, direct experiments are not feasible because the fashion season is over before sufficient data are observed. We show that predicting return rates prior to product launch enhances profit substantially. Using data from a large European retailer (over 1.5 million transactions for about 4,500 fashion items), we demonstrate that machine-learning methods applied to product images enhance predictive ability relative to the retailer’s benchmark (category, seasonality, price, and color labels). Custom image-processing features (RGB color histograms, Gabor filters) capture color and patterns to improve predictions, but deep-learning features improve predictions significantly more. Deep learning appears to capture color-pattern-shape and other intangibles associated with high return rates for apparel. We derive an optimal policy for launch decisions that takes prediction uncertainty into account. The optimal deep-learning-based policy improves profits, achieving 40% of the improvement that would be achievable with perfect information. We show that the retailer could further enhance predictive ability and profits if it could observe the discrepancy in online and offline sales.
    Keywords: machine learning, image processing, product returns
    Date: 2019–09–03
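The custom image features the abstract names include RGB colour histograms. As an illustration only (the paper also uses Gabor filters and deep-learning features, and its actual pipeline is far richer), a minimal histogram feature extractor might look like this, assuming an image is represented as a list of (r, g, b) tuples with channel values in 0-255:

```python
# Minimal sketch of an RGB colour-histogram feature vector, one of the
# hand-crafted image features mentioned in the abstract. `image` is assumed
# to be a list of (r, g, b) pixel tuples with values in 0-255.

def rgb_histogram(image, bins=4):
    """Return a normalised histogram with `bins` buckets per channel."""
    width = 256 // bins
    hist = [0.0] * (3 * bins)
    for r, g, b in image:
        for channel, value in enumerate((r, g, b)):
            bucket = min(value // width, bins - 1)
            hist[channel * bins + bucket] += 1
    n = len(image)
    return [count / n for count in hist]

# A 2-pixel toy "image": one pure red pixel, one pure blue pixel.
features = rgb_histogram([(255, 0, 0), (0, 0, 255)])
```

Each channel's buckets sum to one, so the 12-element vector can be fed directly to a downstream return-rate predictor.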
  2. By: Christophe Croux; Julapa Jagtiani; Tarunsai Korivi; Milos Vulanovic
    Abstract: This study examines key default determinants of fintech loans, using loan-level data from the LendingClub consumer platform during 2007–2018. We identify a robust set of contractual loan characteristics, borrower characteristics, and macroeconomic variables that are important in determining default. We find an important role of alternative data in determining loan default, even after controlling for the obvious risk characteristics and the local economic factors. The results are robust to different empirical approaches. We also find that homeownership and occupation are important factors in determining default. Lenders, however, are required to demonstrate that these factors do not result in any unfair credit decisions. In addition, we find that personal loans used for medical financing or small business financing are more risky than other personal loans, holding the same characteristics of the borrowers. Government support through various public-private programs could potentially make funding more accessible to those in need of medical services and small businesses without imposing excessive risk to small peer-to-peer (P2P) investors.
    Keywords: crowdfunding; lasso selection methods; peer-to-peer lending; household finance; machine learning; financial innovation; big data; P2P/marketplace lending
    JEL: G21 D14 D10 G29 G20
    Date: 2020–04–16
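The keywords mention lasso selection methods. A toy sketch of the underlying idea, coordinate descent with soft-thresholding, is below; the data and penalty value are invented for illustration and the paper's actual estimation on LendingClub data is far richer:

```python
# Coordinate descent for the lasso objective (1/2n)||y - Xb||^2 + lam*||b||_1.
# Illustrative only: toy data, no standardisation or cross-validated penalty.

def soft_threshold(rho, lam):
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso(X, y, lam, n_iter=200):
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the partial residual.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam) / z
    return beta

# y depends on the first feature only; the second is noise, so a
# sufficiently large penalty should zero out its coefficient.
X = [[1.0, 0.1], [2.0, -0.2], [3.0, 0.3], [4.0, -0.1]]
y = [2.0, 4.0, 6.0, 8.0]
beta = lasso(X, y, lam=0.5)
```

The selection effect is visible in the result: the irrelevant feature's coefficient is shrunk exactly to zero.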
  3. By: Stefano Baruffaldi (Max Planck Institute for Innovation and Competition); Brigitte van Beuzekom; Hélène Dernis; Dietmar Harhoff (Max Planck Institute for Innovation and Competition); Nandan Rao; David Rosenfeld; Mariagrazia Squicciarini
    Abstract: This paper identifies and measures developments in science, algorithms and technologies related to artificial intelligence (AI). Using information from scientific publications, open source software (OSS) and patents, it finds a marked increase in AI-related developments over recent years. Since 2015, AI-related publications have increased by 23% per year; from 2014 to 2018, AI-related OSS contributions grew at a rate three times greater than other OSS contributions; and AI-related inventions comprised, on average, more than 2.3% of IP5 patent families in 2017. China’s growing role in the AI space also emerges. The analysis relies on a three-pronged approach based on established bibliometric and patent-based methods, and machine learning (ML) implemented on purposely collected OSS data.
    Date: 2020–05–01
  4. By: Raquel M. Gaspar; Sara D. Lopes; Bernardo Sequeira
    Abstract: In this paper we use neural networks (NN), a machine learning method, to price American put options. We propose two distinct NN models: a simple one and a more complex one. The performance of the two NN models is compared to that of the popular Least-Square Monte Carlo method (LSM). This study relies on market American put option prices, with four large US companies as underlying: Bank of America Corp (BAC), General Motors (GM), Coca-Cola Company (KO) and Procter and Gamble Company (PG). Our dataset includes all options traded from December 2018 to March 2019. All methods show good accuracy; however, once calibrated, NNs do better in terms of execution time and Root Mean Square Error (RMSE). Although on average both NN models perform better than LSM, the simpler model (NN model 1) performs quite close to LSM. Our NN model 2, on the other hand, substantially outperforms the other models, with an RMSE ca. 40% lower than that of the LSM. The lower RMSE is consistent across all companies, strike levels and maturities.
    Keywords: Machine learning, Neural networks, American put options, Least-square Monte Carlo
    JEL: C45 C63 G13 G17
    Date: 2020–04
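The LSM benchmark the paper compares against is the Longstaff-Schwartz algorithm: simulate price paths, then step backwards, regressing continuation values on a small polynomial basis over in-the-money paths. A self-contained sketch under illustrative parameters (not the paper's market data) follows; for these inputs the price should land near the well-known benchmark value of roughly 4.5:

```python
import math
import random

# Sketch of Least-Square Monte Carlo (Longstaff-Schwartz) for an American
# put under geometric Brownian motion. Parameters are illustrative.

def solve3(A, b):
    """Gauss-Jordan elimination for a 3x3 linear system."""
    m = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col and m[col][col] != 0:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * c for a, c in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def lsm_american_put(s0, strike, r, sigma, T, steps, n_paths, seed=0):
    rng = random.Random(seed)
    dt = T / steps
    disc = math.exp(-r * dt)
    paths = []
    for _ in range(n_paths):
        s, path = s0, []
        for _ in range(steps):
            s *= math.exp((r - 0.5 * sigma ** 2) * dt
                          + sigma * math.sqrt(dt) * rng.gauss(0, 1))
            path.append(s)
        paths.append(path)
    cash = [max(strike - p[-1], 0.0) for p in paths]
    for t in range(steps - 2, -1, -1):
        cash = [c * disc for c in cash]          # discount one step back
        itm = [i for i in range(n_paths) if strike - paths[i][t] > 0]
        if len(itm) < 3:
            continue
        xs = [paths[i][t] for i in itm]
        ys = [cash[i] for i in itm]
        basis = [[1.0, x, x * x] for x in xs]
        A = [[sum(b[i] * b[j] for b in basis) for j in range(3)] for i in range(3)]
        rhs = [sum(b[i] * y for b, y in zip(basis, ys)) for i in range(3)]
        coef = solve3(A, rhs)                    # continuation-value regression
        for i, x in zip(itm, xs):
            cont = coef[0] + coef[1] * x + coef[2] * x * x
            if strike - x > cont:                # early exercise is optimal
                cash[i] = strike - x
    return disc * sum(cash) / n_paths

price = lsm_american_put(s0=36.0, strike=40.0, r=0.06, sigma=0.2,
                         T=1.0, steps=20, n_paths=2000)
```

The neural-network pricers in the paper are trained on market prices instead; this block only shows the classical benchmark they are compared against.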
  5. By: Schnaubelt, Matthias; Seifert, Oleg
    Abstract: We apply state-of-the-art financial machine learning to assess the return-predictive value of more than 45,000 earnings announcements on a majority of S&P1500 constituents. To represent the diverse information content of earnings announcements, we generate predictor variables based on various sources such as analyst forecasts, earnings press releases and analyst conference call transcripts. We sort announcements into decile portfolios based on the model's abnormal return prediction. In comparison to three benchmark models, we find that random forests yield superior abnormal returns which tend to increase with the forecast horizon for up to 60 days after the announcement. We subject the model's learning and out-of-sample performance to further analysis. First, we find larger abnormal returns for small-cap stocks and a delayed return drift for growth stocks. Second, while revenue and earnings surprises are the main predictors for the contemporary reaction, we find that a larger range of variables, mostly fundamental ratios and forecast errors, is used to predict post-announcement returns. Third, we analyze variable contributions and find the model to recover non-linear patterns of common capital markets effects such as the value premium. Leveraging the model's predictions in a zero-investment trading strategy yields annualized returns of 11.63 percent at a Sharpe ratio of 1.39 after transaction costs.
    Keywords: Earnings announcements, Asset pricing, Machine learning, Natural language processing
    Date: 2020
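The decile-portfolio sort described in the abstract can be sketched in a few lines: rank announcements by the model's abnormal-return prediction, split them into ten equal buckets, and compare realised returns of the top and bottom deciles. The predictions and returns below are synthetic illustrations, not the paper's data:

```python
# Sort observations into decile portfolios by model prediction and report
# the mean realised return per decile. Synthetic data for illustration.

def decile_portfolios(predictions, realized):
    order = sorted(range(len(predictions)), key=lambda i: predictions[i])
    n = len(order)
    deciles = [order[n * d // 10: n * (d + 1) // 10] for d in range(10)]
    return [sum(realized[i] for i in bucket) / len(bucket)
            for bucket in deciles]

# Toy example in which realised returns are monotone in the prediction.
preds = [i / 100 for i in range(100)]
rets = [0.001 * i - 0.05 for i in range(100)]
by_decile = decile_portfolios(preds, rets)
long_short = by_decile[-1] - by_decile[0]     # top-minus-bottom spread
```

The long-short spread across the extreme deciles is the quantity a zero-investment strategy like the one in the abstract would trade on.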
  6. By: Daria Dzyabura (New Economic School, Moscow, Russia); Renana Peres (Hebrew University of Jerusalem, Israel)
    Abstract: Understanding how consumers perceive brands is at the core of effective brand management. In this paper, we present the Brand Visual Elicitation Platform (B-VEP), an electronic tool we developed that allows consumers to create online collages of images that represent how they view a brand. Respondents select images for the collage from a searchable repository of tens of thousands of images. We implement an unsupervised machine-learning approach to analyze the collages and elicit the associations they describe. We demonstrate the platform’s operation by collecting large, unaided, directly elicited data for 303 large US brands from 1,851 respondents. Using machine learning and image-processing approaches to extract from these images systematic content associations, we obtain a rich set of associations for each brand. We combine the collage-making task with well-established brand-perception measures such as brand personality and brand equity, and suggest various applications for brand management.
    Keywords: Image processing, machine learning, branding, brand associations, brand collages, Latent Dirichlet Allocation
    Date: 2019–12
  7. By: Marek Stelmach (Faculty of Economic Sciences, University of Warsaw); Marcin Chlebus (Faculty of Economic Sciences, University of Warsaw)
    Abstract: Stacked ensemble approaches have recently been gaining importance in complex predictive problems where extraordinary performance is desirable. In this paper we develop a multilayer stacking framework and apply it to a large dataset related to credit scoring with multiple, imbalanced classes. Diverse base estimators (among others, bagged and boosted tree algorithms, regularized logistic regression, neural networks, and a Naive Bayes classifier) are examined, and we propose three meta learners to be finally combined into a novel, weighted ensemble. To prevent bias in meta-feature construction, we introduce a nested cross-validation schema into the architecture, while a weighted log loss evaluation metric is used to overcome training bias towards the majority class. Additional emphasis is placed on proper data preprocessing steps and on Bayesian optimization for hyperparameter tuning to ensure that the solution does not overfit. Our study indicates better stacking results compared to all individual base classifiers, yet we stress the importance of assessing whether the improvement compensates for the increased computational time and design complexity. Furthermore, the conducted analysis shows extremely good performance among bagged and boosted trees, in both the base and meta learning phases. We conclude that a weighted meta ensemble with regularization properties reveals the least overfitting tendencies.
    Keywords: stacked ensembles, nested cross-validation, Bayesian optimization, multiclass problem, imbalanced classes
    JEL: G32 C38 C51 C52 C55
    Date: 2020
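The out-of-fold construction that stacking relies on is worth making concrete: each base learner predicts every training row from a model fit on the other folds, so the meta learner never sees leaked in-fold predictions. The base learner below is a toy class-prior model standing in for the paper's trees and neural networks, and the fold scheme omits the further nested loop used for tuning:

```python
# Sketch of out-of-fold (OOF) meta-feature construction for stacking.
# `fit` takes training labels and returns a predict function; the toy
# learner here just returns the class-1 frequency of its training fold.

def prior_learner(train_labels):
    p = sum(train_labels) / len(train_labels)
    return lambda x: p

def oof_meta_features(X, y, fit, k=5):
    n = len(X)
    folds = [list(range(f, n, k)) for f in range(k)]
    meta = [None] * n
    for fold in folds:
        hold = set(fold)
        # Fit on everything outside the held-out fold, predict inside it.
        model = fit([y[i] for i in range(n) if i not in hold])
        for i in fold:
            meta[i] = model(X[i])
    return meta

X = [[float(i)] for i in range(10)]
y = [0, 1, 0, 1, 1, 0, 1, 0, 1, 1]
meta = oof_meta_features(X, y, prior_learner)
```

Each row's meta feature comes from a model that never saw that row, which is exactly the leakage-prevention property the nested cross-validation schema generalises.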
  8. By: Abraham,Facundo; Schmukler,Sergio L.; Tessada,Jose
    Abstract: Big data is transforming financial services around the world. Advances in data analytics and computational power are allowing firms to exploit data in an easier, faster, and more reliable manner, and at a larger scale. By using big data, financial firms and new entrants from other sectors are able to provide more and better financial services. Governments are also exploring ways to use big data collected by the financial sector more systematically to get a better picture of the financial system as a whole and the overall economy. Despite its benefits, the wider use of big data has raised concerns related to consumer privacy, data security, discrimination, data accuracy, and competition. Hence, policy makers have started to regulate and monitor the use of big data by financial institutions and to think about how to use big data for the benefit of all.
    Keywords: ICT Applications,Legal Institutions of the Market Economy,Financial Structures,Financial Sector Policy
    Date: 2019–11–01
  9. By: Leonard Sabetti; Ronald Heijmans
    Abstract: Financial market infrastructures and their participants play a crucial role in the economy. Financial or operational challenges faced by one participant can have contagion effects and pose risks to the broader financial system. Our paper applies (deep) neural networks (autoencoder) to detect anomalous flows from payments data in the Canadian Automated Clearing and Settlement System (ACSS) similar to Triepels et al. (2018). We evaluate several neural network architecture setups based on the size and number of hidden layers, as well as differing activation functions dependent on how the input data was normalized. As the Canadian financial system has not faced bank runs in recent memory, we train the models on "normal" data and evaluate out-of-sample using test data based on historical anomalies as well as simulated bank runs. Our out-of-sample simulations demonstrate the autoencoder's performance in different scenarios, and results suggest that the autoencoder detects anomalous payment flows reasonably well. Our work highlights the challenges and trade-offs in employing a workhorse deep-learning model in an operational context and raises policy questions around how such outlier signals can be used by the system operator in complying with the prominent payment systems guidelines and by financial stability experts in assessing the impact on the financial system of a financial institution that shows extreme behaviour.
    Keywords: Anomaly Detection; Autoencoder; Neural Network; Artificial intelligence; ACSS; Financial Market Infrastructure; Retail Payments
    JEL: C45 E42 E58
    Date: 2020–04
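The autoencoder approach flags a payment flow as anomalous when its reconstruction error exceeds a threshold learned on "normal" data. As a stand-in for the neural network, the sketch below reconstructs each flow with the feature means of the training data and thresholds the squared error at a high quantile; the paper's model learns a far richer reconstruction, and all the numbers are invented:

```python
# Reconstruction-error anomaly flagging, with feature means standing in
# for a trained autoencoder's reconstruction. Illustrative data only.

def fit_reconstructor(train):
    n, p = len(train), len(train[0])
    return [sum(row[j] for row in train) / n for j in range(p)]

def reconstruction_error(row, means):
    return sum((x - m) ** 2 for x, m in zip(row, means))

def threshold(train, means, quantile=0.99):
    errors = sorted(reconstruction_error(r, means) for r in train)
    return errors[int(quantile * (len(errors) - 1))]

# "Normal" flows cluster tightly; the simulated bank-run flow does not.
normal = [[1.0 + 0.01 * (i % 7), 2.0 - 0.01 * (i % 5)] for i in range(200)]
means = fit_reconstructor(normal)
cut = threshold(normal, means)
bank_run_flow = [9.0, -4.0]
is_anomalous = reconstruction_error(bank_run_flow, means) > cut
```

Training only on normal periods and evaluating on simulated runs, as the abstract describes, amounts to choosing `cut` from normal data and applying the same flagging rule out of sample.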
  10. By: Schnaubelt, Matthias
    Abstract: This paper presents the first large-scale application of deep reinforcement learning to optimize the placement of limit orders at cryptocurrency exchanges. For training and out-of-sample evaluation, we use a virtual limit order exchange to reward agents according to the realized shortfall over a series of time steps. Based on the literature, we generate features that inform the agent about the current market state. Leveraging 18 months of high-frequency data with 300 million historic trades and more than 3.5 million order book states from major exchanges and currency pairs, we empirically compare state-of-the-art deep reinforcement learning algorithms to several benchmarks. We find proximal policy optimization to reliably learn superior order placement strategies when compared to deep double Q-networks and other benchmarks. Further analyses shed light on the black box of the learned execution strategy. Important features are current liquidity costs and queue imbalances, where the latter can be interpreted as predictors of short-term mid-price returns. To preferentially execute volume via limit orders and avoid additional market-order exchange fees, order placement tends to be more aggressive when unfavorable price movements are expected.
    Keywords: Finance, Optimal Execution, Limit Order Markets, Machine learning, Deep Reinforcement Learning
    Date: 2020
  11. By: Anastasia Bugaenko
    Abstract: In this research we have empirically investigated the key drivers affecting liquidity in equity markets. We illustrated how theoretical models of agents' interplay in financial markets, such as Kyle's model, align with the phenomena observed in publicly available trades and quotes data. Specifically, we confirmed that for small signed order-flows, the price impact grows linearly with the order-flow imbalance. We have, further, implemented a machine learning algorithm to forecast market impact given a signed order-flow. Our findings suggest that machine learning models can be used in the estimation of financial variables, and that the predictive accuracy of such learning algorithms can surpass the performance of traditional statistical approaches. Understanding the determinants of price impact is crucial for several reasons. From a theoretical stance, modelling the impact provides a statistical measure of liquidity. Practitioners adopt impact models as a pre-trade tool to estimate expected transaction costs and to optimize the execution of their strategies. This further serves as a post-trade valuation benchmark, as suboptimal execution can significantly deteriorate portfolio performance. More broadly, the price impact reflects the balance of liquidity across markets. This is of central importance to regulators, as it helps explain the relationship between market design and systemic risk, enabling regulators to design more stable and efficient markets.
    Date: 2020–04
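The linear small-imbalance regime reported in the abstract amounts to regressing price impact on signed order-flow imbalance and reading off the slope. The observations below are synthetic; the study estimates this from trades-and-quotes data before moving to machine-learning forecasts:

```python
# Ordinary-least-squares slope of price impact on order-flow imbalance.
# Synthetic, exactly linear observations for illustration.

def ols_slope(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

imbalance = [-0.4, -0.2, -0.1, 0.1, 0.2, 0.4]
impact = [2.0 * v for v in imbalance]
slope = ols_slope(imbalance, impact)
```

In real data the relation is linear only for small imbalances, which is why the paper moves beyond this regression to a learned impact model.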
  12. By: Vadlamani Ravi; Vadlamani Madhav
    Abstract: It is well known that disciplines such as mechanical, electrical, civil, aerospace, chemical and software engineering have witnessed successful applications of reliability engineering concepts. However, the concept of reliability in its strict sense is missing in financial services. Therefore, in order to fill this gap, in a first-of-its-kind study, we define the reliability of a bank/firm in terms of the financial ratios connoting the financial health of the bank and its ability to withstand the likelihood of insolvency or bankruptcy. To estimate the reliability of a bank, we invoke a statistical and machine learning algorithm, namely logistic regression (LR). Once the parameters are estimated in the first stage, we fix them and treat the financial ratios as decision variables. Thus, in the first stage, we accomplish the hitherto unknown task of estimating the reliability of a bank. Subsequently, in the second stage, in order to maximize the reliability of the bank, we formulate an unconstrained single-objective optimization problem and solve it using the well-known particle swarm optimization (PSO) algorithm. In essence, these two stages correspond to predictive and prescriptive analytics, respectively. The proposed two-stage strategy is beneficial to decision-makers within a bank, who can try to achieve the optimal or near-optimal values of the financial ratios in order to maximize reliability, which is tantamount to safeguarding their bank against insolvency or bankruptcy.
    Date: 2020–03
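The two-stage idea can be sketched end to end: stage 1 fixes logistic-regression coefficients (the made-up numbers below stand in for the estimated model), and stage 2 searches over the financial ratios with particle swarm optimization to push the predicted reliability up. Bounds, swarm size and coefficients are all illustrative:

```python
import math
import random

# Stage 1 (fixed): hypothetical fitted LR weights for two financial ratios.
COEF = [1.5, -2.0]
INTERCEPT = -0.5

def reliability(ratios):
    z = INTERCEPT + sum(c * x for c, x in zip(COEF, ratios))
    return 1.0 / (1.0 + math.exp(-z))

# Stage 2: basic particle swarm optimization maximising the objective
# over box constraints. Illustrative hyperparameters.
def pso(objective, bounds, n_particles=20, n_iter=60, seed=1):
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (best_pos[i][d] - pos[i][d])
                             + 1.5 * r2 * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], bounds[d][0]),
                                bounds[d][1])
            val = objective(pos[i])
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val > g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val

bounds = [(0.0, 1.0), (0.0, 1.0)]   # each ratio constrained to [0, 1]
best_ratios, best_rel = pso(reliability, bounds)
```

With a positive weight on the first ratio and a negative weight on the second, the swarm should drive the first ratio toward its upper bound and the second toward zero, which is the prescriptive-analytics output of the second stage.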
  13. By: Paul D McNelis; James Yetman
    Abstract: We assess the dynamics of volatility spillovers among global systemically important banks (G-SIBs). We measure spillovers using vector-autoregressive models of range volatility of the equity prices of G-SIBs, together with machine learning methods. We then compare the size of these spillovers with the degree of systemic importance measured by the Basel Committee on Banking Supervision's G-SIB bucket designations. We find a high positive correlation between the two. We also find that higher bank capital reduces volatility spillovers, especially for banks in higher G-SIB buckets. Our results suggest that requiring banks that are designated as being more systemically important globally to hold additional capital is likely to reduce volatility spillovers from them to other large banks.
    Keywords: G-SIBs, contagion, connectedness, bank capital, cross validation
    JEL: C58 F65 G21 G28
    Date: 2020–04
  14. By: -
    Abstract: This report examines the opportunities and challenges posed by the systematic use of publicly available digital data as a tool for formulating public policies for the development of the digital economy in Latin America and the Caribbean. Its aim is to share the lessons learned in order to advance a research agenda that would allow the countries of the region to build alternative measurement tools based on the digital footprint. Through big data techniques, the digital footprint left by job portals, e-commerce platforms and social networks offers unprecedented information, in terms of both scope and detail.
    Date: 2020–04–22
  15. By: Tam Tran-The
    Abstract: Credit risk management, the practice of mitigating losses by understanding the adequacy of a borrower's capital and loan loss reserves, has long been imperative to any financial institution's long-term sustainability and growth. MassMutual is no exception. The company is keen on effectively monitoring downgrade risk, or the risk associated with the event when the credit rating of a company deteriorates. Current work in downgrade risk modeling depends on multiple variations of quantitative measures provided by third-party rating agencies and risk management consultancy companies. As these structured numerical data become increasingly commoditized among institutional investors, there has been a wide push into using alternative sources of data, such as financial news, earnings call transcripts, or social media content, to possibly gain a competitive edge in the industry. The volume of qualitative information or unstructured text data has exploded in the past decades and is now available for due diligence to supplement quantitative measures of credit risk. This paper proposes a predictive downgrade model using solely news data represented by neural network embeddings. On its own, the model achieves an Area Under the Receiver Operating Characteristic Curve (AUC) of more than 80 percent. The output probability from this news model, as an additional feature, improves the performance of our benchmark model using only quantitative measures by more than 5 percent in terms of both AUC and recall rate. A qualitative evaluation also indicates that news articles related to our predicted downgrade events are especially relevant and of high quality in our business context.
    Date: 2020–04
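The abstract reports performance as AUC, which has a direct probabilistic reading: the chance that a randomly chosen downgrade receives a higher model score than a randomly chosen non-downgrade, with ties counting half. The scores and labels below are invented for illustration:

```python
# AUC computed directly from its pairwise-comparison definition.
# Invented scores and labels for illustration.

def auc(scores, labels):
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

scores = [0.9, 0.8, 0.4, 0.3, 0.2]   # model scores
labels = [1, 1, 0, 1, 0]             # 1 = downgrade event
score_auc = auc(scores, labels)
```

This pairwise form is equivalent to the area under the ROC curve and makes clear why AUC is insensitive to the overall calibration of the scores.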
  16. By: Weinand, Jann; Ried, Sabrina; Kleinebrahm, Max; McKenna, Russell; Fichtner, Wolf
    Abstract: An increasing number of municipalities are striving for energy autonomy. This study determines in which municipalities and at what additional cost energy autonomy is feasible for a case study of Germany. An existing municipal energy system optimization model is extended to include the personal transport, industrial and commercial sectors. A machine learning approach identifies a regression model among 19 methods, which is best suited for the transfer of individual optimization results to all municipalities. The resulting levelized cost of energy (LCOE) from the optimization of 15 case studies are transferred using a stepwise linear regression model. The regression model shows a mean absolute percentage error of 12.5%. The study demonstrates that energy autonomy is technically feasible in 6,314 (56%) municipalities. Thereby, the LCOEs increase in the autonomous case on average by 0.41 €/kWh compared to the minimum cost scenario. Apart from energy demand, base-load-capable bioenergy and deep geothermal energy appear to have the greatest influence on the LCOEs. This study represents a starting point for defining possible scenarios in studies of future national energy system or transmission grid expansion planning, which for the first time consider completely energy autonomous municipalities.
    Keywords: Energy autonomy, renewable energy, geothermal power generation, electric vehicles, vehicle-to-grid, mixed integer linear programming, regression analysis
    Date: 2020
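The transfer step and its quality measure can be sketched simply: fit a linear model of levelized cost of energy (LCOE) on a municipal feature using the optimised case studies, predict, and score with the mean absolute percentage error (MAPE) the abstract quotes. The paper uses stepwise linear regression over many candidate features; the single feature and all numbers here are invented:

```python
# Simple-regression transfer of case-study LCOE results, scored by MAPE.
# One invented feature stands in for the paper's stepwise feature set.

def fit_line(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((a - c) * (d - my) for a, c, d in zip(x, [mx] * n, y)) / \
        sum((a - mx) ** 2 for a in x)
    return my - b * mx, b

def mape(actual, predicted):
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)

demand = [10.0, 20.0, 30.0, 40.0]     # hypothetical case-study feature
lcoe = [0.30, 0.42, 0.49, 0.61]       # optimised LCOE (EUR/kWh)
a, b = fit_line(demand, lcoe)
pred = [a + b * d for d in demand]
error = mape(lcoe, pred)
```

In the paper the fitted model is then applied to all municipalities, with the reported 12.5% MAPE measuring how well the 15 optimised case studies generalise.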
  17. By: Christopher A. Hollrah; Steven A. Sharpe; Nitish R. Sinha
    Abstract: We apply textual analysis tools to the narratives that accompany Federal Reserve Board economic forecasts to measure the degree of optimism versus pessimism expressed in those narratives. Text sentiment is strongly correlated with the accompanying economic point forecasts, positively for GDP forecasts and negatively for unemployment and inflation forecasts. Moreover, our sentiment measure predicts errors in FRB and private forecasts for GDP growth and unemployment up to four quarters out. Furthermore, stronger sentiment predicts tighter than expected monetary policy and higher future stock returns. Quantile regressions indicate that most of sentiment’s forecasting power arises from signaling downside risks to the economy and stock prices.
    Keywords: Text analysis; Economic forecasts; Monetary policy; Stock returns; Narratives
    JEL: C53 E17 E27 E37 E52 G14
    Date: 2020–01–03
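A dictionary-based optimism-versus-pessimism score in the spirit of the abstract's sentiment measure can be sketched by counting positive and negative words and normalising by their total. The word lists below are tiny stand-ins; the paper's measure is built from much richer textual analysis of the forecast narratives:

```python
# Toy dictionary-based sentiment score in [-1, 1]. The word lists are
# illustrative stand-ins, not the paper's lexicon.

POSITIVE = {"strong", "robust", "improve", "expansion", "gains"}
NEGATIVE = {"weak", "decline", "risks", "contraction", "losses"}

def sentiment(text):
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return 0.0
    return (pos - neg) / (pos + neg)

score = sentiment("Robust gains expected despite downside risks")
```

A score near +1 reads as optimism and near -1 as pessimism, which is the scale on which the paper correlates narrative tone with point forecasts and subsequent forecast errors.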
  18. By: Tarek A. Hassan (Boston University, NBER, and CEPR); Laurence van Lent (Frankfurt School of Finance and Management); Stephan Hollander (Tilburg University); Ahmed Tahoun (London Business School)
    Abstract: Using tools described in our earlier work (Hassan et al., 2019, 2020), we develop text-based measures of the costs, benefits, and risks listed firms in the US and over 80 other countries associate with the spread of Covid-19 and other epidemic diseases. We identify which firms expect to gain or lose from an epidemic disease and which are most affected by the associated uncertainty as a disease spreads in a region or around the world. As Covid-19 spreads globally in the first quarter of 2020, we find that firms’ primary concerns relate to the collapse of demand, increased uncertainty, and disruption in supply chains. Other important concerns relate to capacity reductions, closures, and employee welfare. By contrast, financing concerns are mentioned relatively rarely. We also identify some firms that foresee opportunities in new or disrupted markets due to the spread of the disease. Finally, we find some evidence that firms that have experience with SARS or H1N1 have more positive expectations about their ability to deal with the coronavirus outbreak.
    Keywords: Epidemic diseases, pandemic, exposure, virus, firms, uncertainty, sentiment, machine learning
    JEL: I15 I18 D22 G15
    Date: 2020–04
  19. By: Luigi Bellomarini; Marco Benedetti; Andrea Gentili; Rosario Laurendi; Davide Magnanimi; Antonio Muci; Emanuel Sallinger
    Abstract: In the COVID-19 outbreak, governments have applied progressive restrictions to production activities, permitting only those that are considered strategic or that provide essential services. This is particularly apparent in countries that have been stricken hard by the virus, with Italy being a major example. Yet we know that companies are not just isolated entities: they organize themselves into intricate shareholding structures, forming company networks that distribute decision power and dividends in sophisticated schemes for various purposes. One tool from the Artificial Intelligence (AI) toolbox that is particularly effective for reasoning tasks on domains characterized by many highly interconnected entities is the Knowledge Graph (KG). In this work, we present a visionary opinion and report on ongoing work about the application of Automated Reasoning and Knowledge Graph technology to address the impact of the COVID-19 outbreak on the network of Italian companies and to support the application of legal instruments for the protection of strategic companies from takeovers.
    Date: 2020–04
  20. By: Pauline Affeldt; Tomaso Duso; Florian Szücs
    Abstract: We study the evolution of EC merger decisions over the first 25 years of common European merger policy. Using a novel dataset at the level of the relevant antitrust markets and containing all merger cases scrutinized by the Commission over the 1990-2014 period, we evaluate how consistently arguments related to structural market parameters – dominance, concentration, barriers to entry, and foreclosure – were applied over time and across different dimensions such as the geographic market definition and the complexity of the merger. Simple, linear probability models as usually applied in the literature overestimate on average the effects of the structural indicators. Using non-parametric machine learning techniques, we find that dominance is positively correlated with competitive concerns, especially in concentrated markets and in complex mergers. Yet, its importance has decreased over time and significantly following the 2004 merger policy reform. The Commission’s competitive concerns are also correlated with concentration and the more so, the higher the entry barriers and the risks of foreclosure. These patterns are not changing over time. The role of the structural indicators in explaining competitive concerns does not change depending on the geographic market definition.
    Keywords: merger policy, EU Commission, dominance, concentration, entry barriers, foreclosure, causal forests
    JEL: K21 L40
    Date: 2020
  21. By: Sung Jae Jun; Sokbae Lee
    Abstract: We investigate the identification of causal parameters in case-control and related studies. The odds ratio in the sample is our main estimand of interest and we articulate its relationship with causal parameters under various scenarios. It turns out that the odds ratio is generally a sharp upper bound for counterfactual relative risk under some monotonicity assumptions, without resorting to strong ignorability or to the rare-disease assumption. Further, we propose semiparametrically efficient, easy-to-implement, machine-learning-friendly estimators of the aggregated (log) odds ratio by exploiting an explicit form of the efficient influence function. Using our new estimators, we develop methods for causal inference and illustrate the usefulness of our methods with a real-data example.
    Date: 2020–04
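The sample odds ratio that serves as the paper's estimand is, for a 2x2 case-control table, simply (a/b)/(c/d), where a and b are exposed and unexposed cases and c and d are exposed and unexposed controls. The counts below are invented; the paper's contribution lies in relating this quantity to causal parameters and estimating its aggregated form efficiently:

```python
import math

# Sample odds ratio from a 2x2 case-control table. Hypothetical counts.

def odds_ratio(a, b, c, d):
    """a: exposed cases, b: unexposed cases,
    c: exposed controls, d: unexposed controls."""
    return (a / b) / (c / d)

a, b, c, d = 30, 70, 10, 90
or_hat = odds_ratio(a, b, c, d)
log_or = math.log(or_hat)     # the log odds ratio the paper aggregates
```

An odds ratio above one (positive log odds ratio) indicates exposure is more common among cases, and under the paper's monotonicity assumptions it bounds the counterfactual relative risk from above.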
  22. By: Takahiro Yabe; Yunchang Zhang; Satish Ukkusuri
    Abstract: In recent years, extreme shocks, such as natural disasters, are increasing in both frequency and intensity, causing significant economic loss to many cities around the world. Quantifying the economic cost of local businesses after extreme shocks is important for post-disaster assessment and pre-disaster planning. Conventionally, surveys have been the primary source of data used to quantify damages inflicted on businesses by disasters. However, surveys often suffer from high cost and long time for implementation, spatio-temporal sparsity in observations, and limitations in scalability. Recently, large scale human mobility data (e.g. mobile phone GPS) have been used to observe and analyze human mobility patterns in an unprecedented spatio-temporal granularity and scale. In this work, we use location data collected from mobile phones to estimate and analyze the causal impact of hurricanes on business performance. To quantify the causal impact of the disaster, we use a Bayesian structural time series model to predict the counterfactual performances of affected businesses (what if the disaster did not occur?), which may use performances of other businesses outside the disaster areas as covariates. The method is tested to quantify the resilience of 635 businesses across 9 categories in Puerto Rico after Hurricane Maria. Furthermore, hierarchical Bayesian models are used to reveal the effect of business characteristics such as location and category on the long-term resilience of businesses. The study presents a novel and more efficient method to quantify business resilience, which could assist policy makers in disaster preparation and relief processes.
    Date: 2020–03
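The counterfactual logic in the abstract can be sketched with a drastically simplified stand-in for the Bayesian structural time series model: fit the pre-disaster relationship between an affected business's visits and an unaffected control series, project it forward as the "no disaster" counterfactual, and measure impact as actual minus counterfactual. All numbers below are illustrative:

```python
# Linear stand-in for the counterfactual-prediction step: regress the
# treated series on a control covariate pre-shock, project post-shock.

def fit_pre(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / \
        sum((a - mx) ** 2 for a in x)
    return my - b * mx, b

control_pre = [100.0, 110.0, 120.0, 130.0]   # unaffected business, pre-shock
treated_pre = [50.0, 55.0, 60.0, 65.0]       # affected business, pre-shock
a0, b0 = fit_pre(control_pre, treated_pre)

control_post = [120.0, 125.0]                # control continues normally
treated_post = [30.0, 20.0]                  # collapse after the shock
counterfactual = [a0 + b0 * c for c in control_post]
impact = [t - cf for t, cf in zip(treated_post, counterfactual)]
```

The negative impact series is the estimated causal loss; the paper's Bayesian structural time series version additionally yields credible intervals around it.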

This nep-big issue is ©2020 by Tom Coupé. It is provided as is without any express or implied warranty. It may be freely redistributed in whole or in part for any purpose. If distributed in part, please include this notice.
General information on the NEP project can be found at For comments please write to the director of NEP, Marco Novarese at <>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.