
on Discrete Choice Models 
By:  Kiran Tomlinson; Johan Ugander; Austin R. Benson 
Abstract:  Standard methods in preference learning involve estimating the parameters of discrete choice models from data of selections (choices) made by individuals from a discrete set of alternatives (the choice set). While there are many models for individual preferences, existing learning methods overlook how choice set assignment affects the data. Often, the choice set itself is influenced by an individual's preferences; for instance, a consumer choosing a product from an online retailer is often presented with options from a recommender system that depend on information about the consumer's preferences. Ignoring these assignment mechanisms can mislead choice models into making biased estimates of preferences, a phenomenon that we call choice set confounding; we demonstrate the presence of such confounding in widely used choice datasets. To address this issue, we adapt methods from causal inference to the discrete choice setting. We use covariates of the chooser for inverse probability weighting and/or regression controls, accurately recovering individual preferences in the presence of choice set confounding under certain assumptions. When such covariates are unavailable or inadequate, we develop methods that take advantage of structured choice set assignment to improve prediction. We demonstrate the effectiveness of our methods on real-world choice data, showing, for example, that accounting for choice set confounding makes choices observed in hotel booking and commute transportation more consistent with rational utility maximization. 
Date:  2021–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2105.07959&r= 
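The inverse probability weighting idea in the abstract above can be sketched for a binary choice. This is a minimal illustration, not the authors' implementation: the data, the assignment propensities, and the plain gradient-ascent fitter are all hypothetical.

```python
import math

def fit_weighted_logit(X, y, w, lr=0.5, steps=500):
    """Weighted binary logit fit by gradient ascent on the log-likelihood.

    w[i] plays the role of 1 / P(choice set shown | chooser covariates):
    choosers who were unlikely to be shown their choice set count for more,
    mimicking data in which choice sets were assigned independently of
    preferences.
    """
    beta = [0.0] * len(X[0])
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, yi, wi in zip(X, y, w):
            z = sum(b * x for b, x in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            for j, xj in enumerate(xi):
                grad[j] += wi * (yi - p) * xj
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Toy data: option quality x drives the choice; the weights (made-up
# propensities) undo a confounded choice-set assignment.
X = [[1.0, x] for x in [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]]
y = [0, 0, 0, 1, 1, 1]
w = [1 / p for p in [0.9, 0.8, 0.5, 0.5, 0.8, 0.9]]
beta = fit_weighted_logit(X, y, w)  # beta[1] > 0: higher quality, more chosen
```

The same reweighting applies unchanged to a multinomial logit; the binary case just keeps the sketch short.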
By:  Dong, Xueqi; Liu, Shuo Li 
Abstract:  Likelihood functions have been the central piece of statistical inference. For discrete choice data, conventional likelihood functions are specified by random utility (RU) models, such as logit and tremble, which generate choice stochasticity through an "error" or, equivalently, a random preference. For risky discrete choice, this paper explores an alternative method to construct the likelihood function: Rational Expectation Stochastic Choice (RESC). In line with Machina (1985), the subject optimally and deterministically chooses a stochastic choice function among all possible stochastic choice functions; the choice stochasticity can be explained by risk aversion and the relaxation of the reduction of compound lotteries. The model maximizes a simple two-layer expectation that disentangles risk and randomization, in a similar spirit to Klibanoff et al. (2005), where ambiguity and risk are disentangled. The model is applied to an experiment in which we do not commit to a particular stochastic choice function but let the data speak. In RESC, well-developed decision analysis methods for measuring attitudes toward objective probability can also be applied to measure attitudes toward the implied choice probability. Stochastic choice functions are structurally estimated, and a standard discrimination test is used to compare the goodness of fit of RESC and different RUs. The RUs are Expected Utility+logit and other leading contenders for describing decision under risk. The results suggest the statistical superiority of RESC over "error" rules. With weakly fewer parameters, RESC outperforms different benchmark RU models for 30%-89% of subjects; RU models outperform RESC for 0%-2% of subjects. Similar statistical superiority is replicated in a second set of experimental data. 
Keywords:  Experiment; Likelihood Function; Maximum Likelihood Identification; Risk Aversion Parameter; Clarke Test; Discrimination of Stochastic Choice Functions 
JEL:  D8 
Date:  2019–12 
URL:  http://d.repec.org/n?u=RePEc:pra:mprapa:107678&r= 
By:  Bart Capéau; Liebrecht De Sadeleer; Sebastiaan Maes; André M.J. Decoster 
Abstract:  Empirical welfare analyses often impose stringent parametric assumptions on individuals’ preferences and neglect unobserved preference heterogeneity. In this paper, we develop a framework to conduct individual and social welfare analysis for discrete choice that does not suffer from these drawbacks. We first adapt the broad class of individual welfare measures introduced by Fleurbaey (2009) to settings where individual choice is discrete. Allowing for unrestricted, unobserved preference heterogeneity, these measures become random variables. We then show that the distribution of these objects can be derived from choice probabilities, which can be estimated nonparametrically from cross-sectional data. In addition, we derive nonparametric results for the joint distribution of welfare and welfare differences, as well as for social welfare. The former is an important tool in determining whether those who benefit from a price change belong disproportionately to those who were initially well-off. An empirical application illustrates the methods. 
Keywords:  discrete choice, nonparametric welfare analysis, individual welfare, social welfare, money metric utility, compensating variation, equivalent variation 
JEL:  C14 C35 D12 D63 H22 I31 
Date:  2021 
URL:  http://d.repec.org/n?u=RePEc:ces:ceswps:_9071&r= 
By:  Maliar, Lilia; Maliar, Serguei 
Abstract:  We introduce a deep learning classification (DLC) method for analyzing equilibrium in discrete-continuous choice dynamic models. As an illustration, we apply the DLC method to solve a version of Krusell and Smith's (1998) heterogeneous-agent model with incomplete markets, a borrowing constraint, and indivisible labor choice. The novel feature of our analysis is that we construct discontinuous decision functions that tell us when the agent switches from one employment state to another, conditional on the economy's state. We use deep learning not only to characterize the discrete indivisible choice but also to perform model reduction and to deal with multicollinearity. Our TensorFlow-based implementation of DLC is tractable in models with thousands of state variables. 
Keywords:  classification; deep learning; discrete choice; Indivisible labor; intensive and extensive margins; logistic regression; neural network 
Date:  2020–10 
URL:  http://d.repec.org/n?u=RePEc:cpr:ceprdp:15346&r= 
By:  Babii, Andrii; Chen, Xi; Ghysels, Eric; Kumar, Rohit 
Abstract:  The importance of asymmetries in prediction problems arising in economics has long been recognized. In this paper, we focus on binary choice problems in a data-rich environment with general loss functions. In contrast to asymmetric regression problems, binary choice with general loss functions and high-dimensional datasets is challenging and not well understood. Econometricians have studied binary choice problems for a long time, but the literature does not offer computationally attractive solutions in data-rich environments. In contrast, the machine learning literature has many computationally attractive algorithms that form the basis of much of the automated machinery implemented in practice, but it focuses on symmetric loss functions that are independent of individual characteristics. One of the main contributions of our paper is to show that theoretically valid predictions of binary outcomes with arbitrary loss functions can be achieved via a very simple reweighting of logistic regression, or of other state-of-the-art machine learning techniques such as boosting or (deep) neural networks. We apply our analysis to racial justice in pretrial detention. 
Date:  2020–10 
URL:  http://d.repec.org/n?u=RePEc:cpr:ceprdp:15418&r= 
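One way to see the reweighting idea in the abstract above: with misclassification costs c_fp (false positive) and c_fn (false negative), the cost-minimizing rule predicts 1 when the estimated probability exceeds c_fp / (c_fp + c_fn); equivalently, weighting positives by c_fn and negatives by c_fp during fitting moves that threshold back to 1/2. A minimal sketch with made-up costs, not the paper's procedure:

```python
def decide(p, c_fp, c_fn):
    """Cost-sensitive decision from an estimated probability p = P(y=1 | x).

    Expected cost of predicting 1 is (1 - p) * c_fp; of predicting 0, p * c_fn.
    Predict 1 exactly when (1 - p) * c_fp <= p * c_fn,
    i.e. when p >= c_fp / (c_fp + c_fn).
    """
    return 1 if p >= c_fp / (c_fp + c_fn) else 0

# With false negatives 9x as costly as false positives, even a modest
# probability triggers a positive prediction (threshold = 0.1).
print(decide(0.3, c_fp=1.0, c_fn=9.0))  # -> 1
print(decide(0.3, c_fp=9.0, c_fn=1.0))  # -> 0  (threshold is 0.9)
```

The loss asymmetry can also depend on individual characteristics, in which case c_fp and c_fn simply vary with x.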
By:  Farbmacher, Helmut; Tauchmann, Harald 
Abstract:  This paper demonstrates that popular linear fixed-effects panel-data estimators are biased and inconsistent when applied in a discrete-time hazard setting, that is, one in which the outcome variable is a binary dummy indicating an absorbing state, even if the data-generating process is fully consistent with the linear discrete-time hazard model. In addition to conventional survival bias, these estimators suffer from another, frequently severe, source of bias that originates from the data transformation itself and, unlike survival bias, is present even in the absence of any unobserved heterogeneity. We suggest an alternative estimation strategy: instrumental variables estimation using first differences of the exogenous variables as instruments for their levels. Monte Carlo simulations and an empirical application substantiate our theoretical results. 
Keywords:  linear probability model, individual fixed effects, discrete-time hazard, absorbing state, survival bias, instrumental variables estimation 
JEL:  C23 C25 C41 
Date:  2021 
URL:  http://d.repec.org/n?u=RePEc:zbw:iwqwdp:032021&r= 
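The proposed remedy above, instrumenting levels with their first differences, reduces in the just-identified scalar case to the textbook IV slope cov(z, y) / cov(z, x). A minimal sketch; the data are invented and noiseless, and a real application would difference within panel units:

```python
def iv_slope(y, x, z):
    """Just-identified IV estimate of the slope of y on x with instrument z."""
    n = len(y)
    my, mx, mz = sum(y) / n, sum(x) / n, sum(z) / n
    num = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
    den = sum((zi - mz) * (xi - mx) for zi, xi in zip(z, x))
    return num / den

# Levels of an exogenous regressor, with its first differences as instrument
# (the first observation is lost to differencing).
x = [1.0, 2.0, 4.0, 7.0, 11.0]
dz = [b - a for a, b in zip(x, x[1:])]   # first differences: [1, 2, 3, 4]
levels = x[1:]
y = [2 * xi for xi in levels]            # true slope is 2 by construction
print(iv_slope(y, levels, dz))           # -> 2.0
```

Since y is an exact linear function of x here, any instrument correlated with x recovers the slope exactly; the point of the sketch is only the mechanics of the estimator.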
By:  Boucher, Vincent; Bramoullé, Yann 
Abstract:  Heckman and MaCurdy (1985) first showed that binary outcomes are compatible with linear econometric models of interactions. This key insight was unduly discarded by the literature on the econometrics of games. We consider general models of linear interactions in binary outcomes that nest linear models of peer effects in networks and linear models of entry games. We characterize when these models are well defined: errors must have a specific discrete structure. We then analyze the models' game-theoretic microfoundations. Under complete information and linear utilities, we characterize the preference shocks under which the linear model of interactions forms a Nash equilibrium of the game. Under incomplete information and independence, we show that the linear model of interactions forms a Bayes-Nash equilibrium if and only if preference shocks are i.i.d. and uniformly distributed. We also obtain conditions for uniqueness. Finally, we propose two simple consistent estimators. We revisit the empirical analyses of teenage smoking and peer effects by Lee, Li, and Lin (2014) and of entry into airline markets by Ciliberto and Tamer (2009). Our reanalyses showcase the main advantages of the linear framework and suggest that the estimations in these two studies suffer from endogeneity problems. 
Keywords:  Binary Outcomes; Econometrics of Games; Linear Probability Model; peer effects 
Date:  2020–11 
URL:  http://d.repec.org/n?u=RePEc:cpr:ceprdp:15505&r= 
By:  Debopam Bhattacharya; Tatiana Komarova 
Abstract:  The econometric literature on program evaluation and optimal treatment choice takes functionals of outcome distributions as target welfare, and ignores program impacts on unobserved utilities, including utilities of those whose outcomes may be unaffected by the intervention. We show that in the practically important setting of discrete choice, under general preference heterogeneity and income effects, the distribution of indirect utility is nonparametrically identified from average demand. This enables cost-benefit analysis and treatment targeting based on social welfare and planners' distributional preferences, while also allowing for general unobserved heterogeneity in individual preferences. We demonstrate theoretical connections between utilitarian social welfare and Hicksian compensation. An empirical application illustrates our results. 
Date:  2021–05 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2105.08689&r= 
By:  Johannes S. Kunz (Monash University); Kevin E. Staub (University of Melbourne); Rainer Winkelmann (University of Zurich) 
Abstract:  Many applied settings in empirical economics require estimation of a large number of individual effects, like teacher effects or location effects; in health economics, prominent examples include patient effects, doctor effects, or hospital effects. Increasingly, these effects are the object of interest of the estimation, and predicted effects are often used for further descriptive and regression analyses. To avoid imposing distributional assumptions on these effects, they are typically estimated via fixed effects methods. In short panels, the conventional maximum likelihood estimator for fixed effects binary response models provides poor estimates of these individual effects, since the finite sample bias is typically substantial. We present a bias-reduced fixed effects estimator that provides better estimates of the individual effects in these models by removing the first-order asymptotic bias. An additional, practical advantage of the estimator is that it provides finite predictions for all individual effects in the sample, including those for which the corresponding dependent variable has identical outcomes in all time periods (either all zeros or all ones); for these, the maximum likelihood prediction is infinite. We illustrate the approach in simulation experiments and in an application to health care utilization. A Stata estimation command is available at [Github:brfeglm](https://github.com/JohannesSKunz/brfeglm) 
Keywords:  Incidental parameter bias, Perfect prediction, Fixed effects, Panel data, Bias reduction 
JEL:  C23 C25 I11 I18 
Date:  2021–05 
URL:  http://d.repec.org/n?u=RePEc:ajr:sodwps:202105&r= 
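The "infinite prediction" problem described above is easy to see numerically: for a unit whose outcome is 1 in every period, the fixed-effects logit log-likelihood keeps rising as that unit's effect grows, so the MLE diverges. A sketch of just that mechanism (the single-unit likelihood below is standard, but the zero index is an illustrative simplification):

```python
import math

def unit_loglik(alpha, y, xb=0.0):
    """FE logit log-likelihood contribution of one unit with fixed effect alpha.

    xb stands in for the (here constant) covariate index x'beta.
    """
    ll = 0.0
    for yt in y:
        p = 1.0 / (1.0 + math.exp(-(alpha + xb)))
        ll += math.log(p) if yt == 1 else math.log(1.0 - p)
    return ll

# For an all-ones unit, the likelihood is monotone in alpha: no finite maximum.
print(unit_loglik(1.0, [1, 1, 1])
      < unit_loglik(5.0, [1, 1, 1])
      < unit_loglik(20.0, [1, 1, 1]))  # -> True
```

Bias-reduced estimation penalizes the likelihood so that such units receive large but finite effect estimates.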
By:  Edwin Fourrier-Nicolai (Aix-Marseille Univ, CNRS, AMSE, Marseille, France and Toulouse School of Economics, Université Toulouse Capitole, Toulouse, France); Michel Lubrano (School of Economics, Jiangxi University of Finance and Economics & Aix-Marseille Univ., CNRS, AMSE) 
Abstract:  The growth incidence curve of Ravallion and Chen (2003) is based on the quantile function. Its distribution-free estimator behaves erratically with usual sample sizes, leading to problems in the tails. We propose a series of parametric models in a Bayesian framework. A first solution consists in modelling the underlying income distribution using simple densities for which the quantile function has a closed analytical form. This solution is extended by considering a mixture model for the underlying income distribution. However, in this case the quantile function is semi-explicit and has to be evaluated numerically. The alternative solution consists in directly adjusting a functional form for the Lorenz curve and deriving its first-order derivative to find the corresponding quantile function. We compare these models first by Monte Carlo simulations and second by using UK data from the Family Expenditure Survey, devoting particular attention to the analysis of subgroups. 
Keywords:  Bayesian inference, growth incidence curve, Inequality 
JEL:  C11 D31 I31 
Date:  2021–05 
URL:  http://d.repec.org/n?u=RePEc:aim:wpaimx:2131&r= 
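The "alternative solution" above rests on the identity q(p) = mu * L'(p): the quantile function is the mean times the derivative of the Lorenz curve. A sketch with a deliberately simple one-parameter Lorenz curve L(p) = p**a, a >= 1, chosen for illustration rather than one of the paper's fitted forms:

```python
def quantile_from_lorenz(p, mean, a):
    """Quantile implied by L(p) = p**a via q(p) = mean * L'(p) = mean * a * p**(a - 1)."""
    return mean * a * p ** (a - 1)

def growth_incidence(p, mean0, a0, mean1, a1):
    """Growth incidence curve: growth rate of the p-th quantile between two periods."""
    return quantile_from_lorenz(p, mean1, a1) / quantile_from_lorenz(p, mean0, a0) - 1.0

# Mean income grows 10% with unchanged inequality (same a): the GIC is flat at 10%.
print(round(growth_incidence(0.5, 100.0, 2.0, 110.0, 2.0), 3))  # -> 0.1
```

With a changing between periods, the curve tilts, showing which percentiles gained relatively more; this is the object whose tails the parametric approach stabilizes.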
By:  Adeola Oyenubi 
Abstract:  This paper considers the sensitivity of Genetic Matching (GenMatch) to the choice of balance measure. It explores the performance of a newly introduced distributional balance measure that is similar to the KS test but is more evenly sensitive to imbalance across the support. This measure was introduced by Goldman & Kaplan (2008) (i.e., the GK measure). This is important because the rationale behind distributional balance measures is their ability to provide a broader description of balance. I also consider the performance of multivariate balance measures, i.e., distance covariance and distance correlation. This is motivated by the fact that, ideally, balance for causal inference refers to balance in the joint density, and balance in each of a set of univariate distributions does not necessarily imply balance in the joint distribution. Simulation results show that GK dominates the KS test in terms of bias and mean squared error (MSE), and the distance correlation measure dominates all other measures in terms of bias and MSE. These results have two important implications for the choice of balance measure: (i) even sensitivity across the support is important, and not all distributional measures have this property; (ii) multivariate balance measures can improve the performance of matching estimators. 
Keywords:  Genetic matching, balance measures, causal inference, Machine learning 
JEL:  I38 H53 C21 D13 
Date:  2020–11 
URL:  http://d.repec.org/n?u=RePEc:rza:wpaper:840&r= 
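For reference, the classical two-sample KS statistic that the GK measure is compared against is the largest vertical gap between the two empirical CDFs, which is why its sensitivity concentrates in the middle of the support. A minimal sketch:

```python
import bisect

def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: sup over t of |F_x(t) - F_y(t)|."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for t in xs + ys:  # the supremum is attained at a sample point
        fx = bisect.bisect_right(xs, t) / len(xs)
        fy = bisect.bisect_right(ys, t) / len(ys)
        d = max(d, abs(fx - fy))
    return d

print(ks_statistic([1, 2, 3], [1, 2, 3]))  # -> 0.0  (perfect balance)
print(ks_statistic([0, 0, 0], [5, 5, 5]))  # -> 1.0  (complete imbalance)
```

In a matching context, x and y would be a covariate's values in the treated and matched-control groups; the GK measure replaces the single sup with quantile-by-quantile comparisons.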
By:  Franz, Anjuli; Croitor, Evgheni 
Date:  2021–05–13 
URL:  http://d.repec.org/n?u=RePEc:dar:wpaper:126519&r= 