
on Econometrics 
By:  Nathan Kallus; Xiaojie Mao 
Abstract:  We study generic inference on identified linear functionals of non-unique nuisances defined as solutions to underidentified conditional moment restrictions. This problem appears in a variety of applications, including nonparametric instrumental variable models, proximal causal inference under unmeasured confounding, and missing-not-at-random data with shadow variables. Although the linear functionals of interest, such as average treatment effects, are identifiable under suitable conditions, the non-uniqueness of the nuisances poses serious challenges to statistical inference, since in this setting common nuisance estimators can be unstable and lack fixed limits. In this paper, we propose penalized minimax estimators for the nuisance functions and show that they enable valid inference in this challenging setting. The proposed nuisance estimators can accommodate flexible function classes, and importantly, they can converge to fixed limits determined by the penalization, regardless of whether the nuisances are unique. We use the penalized nuisance estimators to form a debiased estimator for the linear functional of interest and prove its asymptotic normality under generic high-level conditions, which yields asymptotically valid confidence intervals. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08291&r= 
By:  Mamadou Yauck 
Abstract:  This paper deals with the estimation of exogenous peer effects for partially observed networks under the new inferential paradigm of design identification, which characterizes the missing data challenge arising with sampled networks through the central idea that two full-data versions that are topologically compatible with the observed data may give rise to two different probability distributions. We show that peer effects cannot be identified by design when network links between sampled and unsampled units are not observed. Under realistic modeling conditions, and under the assumption that sampled units report on the size of their network of contacts, we characterize the asymptotic bias arising from estimating peer effects with incomplete network data and propose a bias-corrected estimator. The finite-sample performance of our methodology is investigated via simulations. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.09102&r= 
By:  Abdelkamel Alj; Rajae Azrak; Guy Melard 
Abstract:  The paper is concerned with vector autoregressive moving-average (VARMA) models with time-dependent coefficients (td) used to represent some nonstationary time series. The coefficients depend on time but can also depend on the length of the series n, hence the name tdVARMA(n) for the models. As a consequence of the dependency of the model on n, we need to consider array processes instead of stochastic processes. Generalizing results for univariate time series and combining them with new results for array models, under appropriate assumptions, it is shown that a Gaussian quasi-maximum likelihood estimator is consistent in probability and asymptotically normal. The theoretical results are illustrated using two examples of bivariate processes, both with marginal heteroscedasticity. The first example is a tdVAR(n)(1) process while the second is a tdVMA(n)(1) process. It is shown that the assumptions underlying the theoretical results apply. In the two examples, the asymptotic information matrix is obtained, and not only in the Gaussian case. Finally, the finite-sample behaviour is checked via a Monte Carlo simulation study. The results confirm the validity of the asymptotic properties even for small n and reveal that the asymptotic information matrix deduced from the theory is correct. 
Keywords:  Nonstationary process; multivariate time series; timevarying models; array process. 
Date:  2022–07 
URL:  http://d.repec.org/n?u=RePEc:eca:wpaper:2013/348492&r= 
By:  Sylvain Barde 
Abstract:  Large-scale, computationally expensive simulation models pose a particular challenge when it comes to estimating their parameters from empirical data. Most simulation models do not possess closed-form expressions for their likelihood function, requiring the use of simulation-based inference, such as the simulated method of moments, indirect inference or approximate Bayesian computation. However, given the high computational requirements of large-scale models, it is often difficult to run these estimation methods, as they require more simulated runs than can feasibly be carried out. This paper aims to address the problem by providing a full Bayesian estimation framework where the true but intractable likelihood function of the simulation model is replaced by one generated by a surrogate model. This is provided by a sparse variational Gaussian process, chosen for its desirable convergence and consistency properties. The effectiveness of the approach is tested using both a Monte Carlo analysis on a known data generating process, and an empirical application in which the free parameters of a computationally demanding agent-based model are estimated on US macroeconomic data. 
Keywords:  Bayesian estimation; surrogate methods; Gaussian process; simulation models 
JEL:  C14 C15 C52 C63 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:ukc:ukcedp:2203&r= 
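The surrogate idea in the abstract above can be sketched in a few lines: fit a Gaussian process to a handful of expensive simulator evaluations of the log-likelihood, then maximize the cheap surrogate instead of the simulator itself. The toy simulator, the RBF kernel, and all parameter values below are illustrative assumptions; the paper uses a sparse variational GP rather than this dense textbook version.

```python
import numpy as np

# Toy "simulator": an expensive model whose log-likelihood we can only sample
def noisy_loglik(theta, rng):
    return -(theta - 1.3) ** 2 + 0.05 * rng.normal()

rng = np.random.default_rng(7)
theta_train = np.linspace(0, 3, 15)               # a small design of simulator runs
y = np.array([noisy_loglik(t, rng) for t in theta_train])

# Exact GP regression with an RBF kernel as the surrogate log-likelihood
def rbf(a, b, ell=0.5, sf=1.0):
    return sf * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

K = rbf(theta_train, theta_train) + 0.05 * np.eye(theta_train.size)  # noise jitter
theta_grid = np.linspace(0, 3, 301)
mu = rbf(theta_grid, theta_train) @ np.linalg.solve(K, y)  # posterior mean surface

theta_hat = theta_grid[mu.argmax()]               # surrogate-based point estimate
```

Maximizing `mu` costs a matrix solve instead of thousands of simulator runs, which is the point of the surrogate approach.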
By:  Runyu Dai; Yoshimasa Uematsu; Yasumasa Matsuda 
Abstract:  We extend the Principal Orthogonal complEment Thresholding (POET) framework introduced by Fan et al. (2013) to estimate large static covariance matrices with a "mixed" structure of observable and unobservable common factors, and we call this method the extended POET (ePOET). A stable covariance estimator for large-scale data is developed by combining observable factors and sparsity-induced weak latent factors with an adaptive threshold estimator of the idiosyncratic covariance. Under some mild conditions, we derive the uniform consistency of the proposed estimator for the cases with or without observable factors. Furthermore, several simulation studies show that the ePOET achieves good finite-sample performance regardless of whether the data have strong, weak, or mixed factor structures. Finally, we conduct empirical studies to demonstrate the practical usefulness of the ePOET. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:toh:dssraa:130&r= 
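The basic POET recipe that ePOET extends (remove a low-rank factor component via principal components, then apply an entry-wise adaptive threshold to the residual covariance) can be sketched as follows. The simulated factor model, the number of factors K, and the threshold constant are illustrative assumptions, not the paper's ePOET tuning.

```python
import numpy as np

rng = np.random.default_rng(0)
T, p, K = 200, 30, 2

# Simulate a K-factor model: X = F B' + noise
F = rng.normal(size=(T, K))
B = rng.normal(size=(p, K))
X = F @ B.T + rng.normal(scale=0.5, size=(T, p))

S = np.cov(X, rowvar=False)                      # sample covariance
vals, vecs = np.linalg.eigh(S)                   # eigenvalues in ascending order
idx = np.argsort(vals)[::-1][:K]                 # top-K principal directions
low_rank = (vecs[:, idx] * vals[idx]) @ vecs[:, idx].T

R = S - low_rank                                 # idiosyncratic (principal complement)
tau = 0.1 * np.sqrt(np.log(p) / T)               # illustrative threshold level
scale = np.sqrt(np.outer(np.diag(R), np.diag(R)))
R_thresh = np.where(np.abs(R) > tau * scale, R, 0.0)  # adaptive hard threshold
np.fill_diagonal(R_thresh, np.diag(R))           # never threshold the diagonal
Sigma_hat = low_rank + R_thresh                  # POET-style covariance estimate
```

The entry-wise scaling by the residual standard deviations is what makes the threshold "adaptive" rather than uniform.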
By:  Jad Beyhum; JeanPierre Florens; Elia Lapenta; Ingrid Van Keilegom 
Abstract:  The hypothesis of error invariance is central to the instrumental variable literature. It means that the error term of the model is the same across all potential outcomes. In other words, this assumption signifies that treatment effects are constant across all subjects. It allows one to interpret instrumental variable estimates as average treatment effects over the whole population of the study. When this assumption does not hold, the bias of instrumental variable estimators can be larger than that of naive estimators ignoring endogeneity. This paper develops two tests for the assumption of error invariance when the treatment is endogenous, an instrumental variable is available, and the model is separable. The first test assumes that the potential outcomes are linear in the regressors and is computationally simple. The second test is nonparametric and relies on Tikhonov regularization. The treatment can be either discrete or continuous. We show that the tests have asymptotically correct level and asymptotic power equal to one against a range of alternatives. Simulations demonstrate that the proposed tests attain excellent finite-sample performance. The methodology is also applied to the evaluation of returns to schooling and the effect of price on demand in a fish market. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.05344&r= 
By:  Songnian Chen; Shakeeb Khan; Xun Tang 
Abstract:  We identify and estimate treatment effects when potential outcomes are weakly separable with a binary endogenous treatment. Vytlacil and Yildiz (2007) proposed an identification strategy that exploits the mean of observed outcomes, but their approach requires a monotonicity condition. In comparison, we exploit the full information in the entire outcome distribution, instead of just its mean. As a result, our method does not require monotonicity and is also applicable to general settings with multiple indices. We provide examples where our approach can identify treatment effect parameters of interest whereas existing methods would fail. These include models where potential outcomes depend on multiple unobserved disturbance terms, such as a Roy model, a multinomial choice model, and a model with endogenous random coefficients. We establish consistency and asymptotic normality of our estimators. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.05047&r= 
By:  Matias D. Cattaneo; Richard K. Crump; Weining Wang 
Abstract:  Beta-sorted portfolios (portfolios composed of assets with similar covariation with selected risk factors) are a popular tool in empirical finance for analyzing models of (conditional) expected returns. Despite their widespread use, little is known about their statistical properties, in contrast to comparable procedures such as two-pass regressions. We formally investigate the properties of beta-sorted portfolio returns by casting the procedure as a two-step estimator with a nonparametric first step and a beta-adaptive portfolio construction. Our framework rationalizes the well-known estimation algorithm with precise economic and statistical assumptions on the general data generating process and characterizes its key features. We study beta-sorted portfolios both for a single cross-section and for aggregation over time (e.g., the grand mean), offering conditions that ensure consistency and asymptotic normality along with new uniform inference procedures allowing for uncertainty quantification and testing of various hypotheses relevant in financial applications. We also highlight some limitations of current empirical practices and discuss what inferences can and cannot be drawn from returns to beta-sorted portfolios for either a single cross-section or the whole sample. Finally, we illustrate the functionality of our new procedures in an empirical application. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.10974&r= 
By:  TaeHwy Lee (Department of Economics, University of California Riverside); Ekaterina Seregina (Colby College) 
Abstract:  In this paper we develop a novel method of combining many forecasts based on a machine learning algorithm called the Graphical LASSO. We visualize forecast errors from different forecasters as a network of interacting entities and generalize network inference in the presence of common factor structure and structural breaks. First, we note that forecasters often use common information and hence make common mistakes, so that the forecast errors exhibit common factor structures. We propose the Factor Graphical LASSO (Factor GLASSO), which separates common forecast errors from the idiosyncratic errors and exploits the sparsity of the precision matrix of the latter. Second, since the network of experts changes over time in response to unstable environments such as recessions, it is unreasonable to assume constant forecast combination weights. Hence, we propose the Regime-Dependent Factor Graphical LASSO (RD-Factor GLASSO) and develop a scalable implementation using the Alternating Direction Method of Multipliers (ADMM) to estimate regime-dependent forecast combination weights. An empirical application to forecasting macroeconomic series using data from the European Central Bank's Survey of Professional Forecasters (ECB SPF) demonstrates the superior performance of combined forecasts using Factor GLASSO and RD-Factor GLASSO. 
Keywords:  Common Forecast Errors, Regime Dependent Forecast Combination, Sparse Precision Matrix of Idiosyncratic Errors, Structural Breaks. 
JEL:  C13 C38 C55 
Date:  2022–09 
URL:  http://d.repec.org/n?u=RePEc:ucr:wpaper:202213&r= 
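The link between a precision matrix and forecast combination weights is the standard formula w = Θ1 / (1'Θ1). The sketch below uses a plain matrix inverse for the precision matrix; the paper instead estimates a sparse precision matrix of the idiosyncratic errors via the Graphical LASSO after removing the common factor, so the simulated errors and the inversion step are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 120, 5                                     # periods, forecasters

# Simulated forecast errors: a common component plus idiosyncratic noise
common = rng.normal(size=(T, 1))
errors = common + 0.5 * rng.normal(size=(T, n))

Sigma = np.cov(errors, rowvar=False)              # forecast-error covariance
Theta = np.linalg.inv(Sigma)                      # precision matrix (the paper uses
                                                  # a sparse Graphical-LASSO estimate
                                                  # on the idiosyncratic errors)
ones = np.ones(n)
w = Theta @ ones / (ones @ Theta @ ones)          # variance-minimizing weights
```

The weights sum to one by construction, and forecasters with errors that are well explained by the others receive little or even negative weight.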
By:  Matsushita, Yukitoshi; Otsu, Taisuke 
Abstract:  This paper proposes a jackknife Lagrange multiplier (JLM) test for instrumental variable regression models, which is robust to (i) many instruments, where the number of instruments may increase proportionally with the sample size, (ii) arbitrarily weak instruments, and (iii) heteroskedastic errors. In contrast to Crudu, Mellace and Sándor (2021) and Mikusheva and Sun (2021), who proposed jackknife Anderson-Rubin tests that are also robust to (i)-(iii), we modify a score statistic by jackknifing and construct a heteroskedasticity-robust estimator of its variance. Compared to the Lagrange multiplier tests of Kleibergen (2002) and Moreira (2001) and their modification for many instruments by Hansen, Hausman and Newey (2008), our JLM test is robust to heteroskedastic errors and may circumvent a possible decrease in the power function. Simulation results illustrate the desirable size and power properties of the proposed method. 
JEL:  J1 
Date:  2022–08–26 
URL:  http://d.repec.org/n?u=RePEc:ehl:lserod:116392&r= 
By:  Ivonne Schwartz; Mark Kirstein 
Abstract:  One challenge in the estimation of financial market agent-based models (FABMs) is to infer reliable insights using numerical simulations validated by only a single observed time series. Ergodicity (besides stationarity) is a strong precondition for any estimation; however, it has not been systematically explored and is often simply presumed. For finite sample lengths and limited computational resources, empirical estimation always takes place in pre-asymptopia. Thus broken ergodicity must be considered the rule, but it remains largely unclear how to deal with the remaining uncertainty in non-ergodic observables. Here we show how an understanding of the ergodic properties of moment functions can help to improve the estimation of (F)ABMs. We run Monte Carlo experiments and study the convergence behaviour of moment functions of two prototype models. We find infeasibly long convergence times for most of them. Choosing an efficient mix of ensemble size and simulated time length guided our estimation and might help in general. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08169&r= 
By:  Xiaoran Liang; Eleanor Sanderson; Frank Windmeijer 
Abstract:  In a linear instrumental variables (IV) setting for estimating the causal effects of multiple confounded exposure/treatment variables on an outcome, we investigate the adaptive Lasso method for selecting valid instrumental variables from a set of available instruments that may contain invalid ones. An instrument is invalid if it fails the exclusion conditions and enters the model as an explanatory variable. We extend the results of Windmeijer et al. (2019) for the single-exposure model to the multiple-exposure case. In particular, we propose a median-of-medians estimator and show that the conditions on the minimum number of valid instruments under which this estimator is consistent for the causal effects are only moderately stronger than the simple majority rule that applies to the median estimator in the single-exposure case. The adaptive Lasso method, which uses the initial median-of-medians estimator for the penalty weights, achieves consistent selection with oracle properties for the resulting IV estimator. This is confirmed by Monte Carlo simulation results. We apply the method to estimate the causal effects of educational attainment and cognitive ability on body mass index (BMI) in a Mendelian randomization setting. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.05278&r= 
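For intuition, the single-exposure median estimator that the paper generalizes can be sketched directly: each candidate instrument gives a ratio (Wald) estimate, and when a majority of instruments is valid the median of these ratios is consistent even though some instruments have direct effects on the outcome. The simulated design below (one exposure, seven instruments, two of them invalid) is purely illustrative and is not the paper's median-of-medians construction for multiple exposures.

```python
import numpy as np

rng = np.random.default_rng(6)
n, L = 5000, 7
Z = rng.normal(size=(n, L))                       # candidate instruments
gamma = np.full(L, 1.0)                           # first-stage strengths
alpha = np.array([0., 0., 0., 0., 0., 1., -1.])   # direct effects: last two invalid
u = rng.normal(size=n)                            # unobserved confounder
d = Z @ gamma + u + rng.normal(size=n)            # exposure
beta = 0.5                                        # true causal effect
y = beta * d + Z @ alpha + u + rng.normal(size=n)

# Per-instrument ratio (Wald) estimates: reduced-form slope / first-stage slope
Gamma_hat = np.array([np.polyfit(Z[:, j], y, 1)[0] for j in range(L)])
gamma_hat = np.array([np.polyfit(Z[:, j], d, 1)[0] for j in range(L)])
ratio = Gamma_hat / gamma_hat                     # = beta + alpha_j / gamma_j

beta_med = np.median(ratio)                       # consistent under majority validity
```

The two invalid instruments pull their ratios to roughly 1.5 and -0.5, but the median still lands near the true effect of 0.5.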
By:  Dixon, Huw David (Cardiff Business School); Tian, Maoshan (Cardiff Business School) 
Abstract:  In this paper, we apply Fieller's method and the delta method to derive confidence intervals for the Cross-Sectional Distribution of Durations (CSD), using Tian and Huw's variance formulae. The CSD is a new estimator derived by Dixon (2012). It can be applied to the general Taylor model (GTE) of Dixon and Bihan (2012a) and to hospital waiting times as in Dixon and Siciliani (2009). We use Monte Carlo simulations to evaluate the empirical size of Fieller's method and the delta method across different sample sizes. The empirical results show that Fieller's method is superior to the delta method for estimating the confidence interval of the CSD, even when both methods are available. Finally, we apply both methods to two data sets: UK CPI micro-price data and waiting-time data from UK hospitals. All the estimators are located within their confidence intervals. 
Length: 27 pages 
Keywords:  Fieller's Method, Delta Method, Confidence Interval 
JEL:  C10 C15 E50 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:cdf:wpaper:2022/15&r= 
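For a generic ratio of two means, the two interval constructions compared in the paper can be sketched side by side: Fieller's interval takes the roots of a quadratic in the ratio, while the delta method linearizes it. The normal samples and the 1.96 critical value below are illustrative assumptions; the paper's own CSD variance formulae are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(2.0, 1.0, size=400)               # numerator sample
y = rng.normal(4.0, 1.0, size=400)               # denominator sample

a, b = x.mean(), y.mean()
va = x.var(ddof=1) / x.size                      # variance of the mean of x
vb = y.var(ddof=1) / y.size
vab = 0.0                                        # independent samples here
z = 1.96                                         # approx. 95% normal quantile

# Fieller: endpoints solve (b*t - a)^2 = z^2 * (vb*t^2 - 2*vab*t + va)
A = b**2 - z**2 * vb
B = -2 * (a * b - z**2 * vab)
C = a**2 - z**2 * va
disc = B**2 - 4 * A * C
lo = (-B - np.sqrt(disc)) / (2 * A)
hi = (-B + np.sqrt(disc)) / (2 * A)

# Delta method: var(a/b) is approximated by va/b^2 + (a^2/b^4)*vb
se_delta = np.sqrt(va / b**2 + (a**2 / b**4) * vb)
delta_lo, delta_hi = a / b - z * se_delta, a / b + z * se_delta
```

When the denominator is well separated from zero (A > 0) the two intervals nearly coincide; Fieller's advantage shows up when the denominator is imprecisely estimated and the delta interval becomes unreliable.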
By:  XinBing Kong; YongXin Liu; Long Yu; Peng Zhao 
Abstract:  In this paper, we introduce a matrix quantile factor model for matrix sequence data analysis. For matrix-valued data with a low-rank structure, we estimate the row and column factor spaces by minimizing the empirical check loss function over all panels. We show that the estimates converge at rate $1/\min\{\sqrt{p_1p_2}, \sqrt{p_2T}, \sqrt{p_1T}\}$ in the sense of the average Frobenius norm, where $p_1$, $p_2$ and $T$ are the row dimensionality, column dimensionality and length of the matrix sequence, respectively. This rate is faster than that of the quantile estimates obtained by "flattening" the matrix quantile factor model into a large vector quantile factor model, provided the interactive low-rank structure is the underlying truth. We provide three criteria to determine the pair of row and column factor numbers, which are proved to be consistent. Extensive simulation studies and an empirical study justify our theory. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08693&r= 
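The empirical check loss driving the estimation can be stated in two lines: its minimizer over a constant is exactly the tau-quantile, which is what makes it the natural objective for quantile factor models. The grid search below is only a sanity check of that fact, not the paper's minimization over row and column factor spaces.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0))

rng = np.random.default_rng(3)
y = rng.normal(size=10_000)
tau = 0.25

# The minimizer of the empirical check loss is the tau-th sample quantile
grid = np.linspace(-3, 3, 2001)
risk = np.array([check_loss(y - g, tau).mean() for g in grid])
argmin = grid[risk.argmin()]
```

Replacing the constant with a low-rank fit, summed over all panels, gives the matrix quantile factor objective described in the abstract.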
By:  Zhongze Cai; Hanzhao Wang; Kalyan Talluri; Xiaocheng Li 
Abstract:  Choice modeling has been a central topic in the study of individual preference or utility across many fields including economics, marketing, operations research, and psychology. While the vast majority of the literature on choice models has been devoted to the analytical properties that lead to managerial and policymaking insights, the existing methods to learn a choice model from empirical data are often either computationally intractable or sample inefficient. In this paper, we develop deep learning-based choice models under two settings of choice modeling: (i) feature-free and (ii) feature-based. Our model captures both the intrinsic utility of each candidate choice and the effect that the assortment has on the choice probability. Synthetic and real data experiments demonstrate the performance of the proposed models in terms of recovery of existing choice models, sample complexity, assortment effects, architecture design, and model interpretation. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.09325&r= 
By:  Karl Friedrich Siburg; Christopher Strothmann; Gregor Weiß 
Abstract:  We introduce a new stochastic order for the tail dependence between random variables. We then study different measures of tail dependence which are monotone in the proposed order, thereby extending various known tail dependence coefficients from the literature. We apply our concepts in an empirical study where we investigate the tail dependence for different pairs of S&P 500 stocks and indices, and illustrate the advantage of our measures of tail dependence over the classical tail dependence coefficient. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.10319&r= 
By:  Candes, Emmanuel (Stanford U); Lei, Lihua (Stanford U); Ren, Zhimei (U of Chicago) 
Abstract:  Existing survival analysis techniques heavily rely on strong modelling assumptions and are, therefore, prone to model misspecification errors. In this paper, we develop an inferential method based on ideas from conformal prediction, which can wrap around any survival prediction algorithm to produce calibrated, covariate-dependent lower predictive bounds on survival times. In the Type I right-censoring setting, when the censoring times are completely exogenous, the lower predictive bounds have guaranteed coverage in finite samples without any assumptions other than that of operating on independent and identically distributed data points. Under a more general conditionally independent censoring assumption, the bounds satisfy a doubly robust property: marginal coverage is approximately guaranteed if either the censoring mechanism or the conditional survival function is estimated well. Further, we demonstrate that the lower predictive bounds remain valid and informative for other types of censoring. The validity and efficiency of our procedure are demonstrated on synthetic data and real COVID-19 data from the UK Biobank. 
Date:  2022–04 
URL:  http://d.repec.org/n?u=RePEc:ecl:stabus:4028&r= 
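Stripped of censoring, the one-sided split-conformal construction behind such lower predictive bounds fits in a few lines: calibrate a quantile of signed residuals on held-out data and subtract it from the model's prediction. Everything below (the linear toy model, exponential survival times, and the absence of censoring) is an illustrative assumption; handling censoring is precisely the paper's contribution.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
x = rng.uniform(0, 2, size=n)
y = 1.0 + x + rng.exponential(0.5, size=n)        # uncensored "survival times"

# Split the data: fit a simple model on one half, calibrate on the other
fit, cal = slice(0, n // 2), slice(n // 2, n)
coef = np.polyfit(x[fit], y[fit], 1)              # any prediction model would do
pred = np.polyval(coef, x[cal])

alpha = 0.1
scores = pred - y[cal]                            # one-sided nonconformity scores
k = int(np.ceil((1 - alpha) * (len(scores) + 1))) - 1
q = np.sort(scores)[k]                            # conformal quantile of scores

def lower_bound(xnew):
    return np.polyval(coef, xnew) - q

# Under i.i.d. sampling with no censoring: P(Y >= lower_bound(X)) >= 1 - alpha
```

The finite-sample guarantee holds regardless of how bad the fitted model is; a poor model only makes the bound less informative, not invalid.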
By:  Timo Dimitriadis; Tobias Fissler; Johanna Ziegel 
Abstract:  We characterize the full classes of M-estimators for semiparametric models of general functionals by formally connecting the theory of consistent loss functions from forecast evaluation with the theory of M-estimation. This novel characterization result opens up the possibility of theoretical research on efficient and equivariant M-estimation and, more generally, allows one to leverage existing results on loss functions from the forecast evaluation literature in estimation theory. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08108&r= 
By:  Nishi, Mikihito; 西, 幹仁; Kurozumi, Eiji; 黒住, 英司 
Keywords:  random coefficient model, local to unity, moderate deviation, LBI test, power envelope 
JEL:  C12 C22 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:hit:econdp:202202&r= 
By:  Danilo CascaldiGarcia 
Abstract:  The onset of the COVID-19 pandemic and the great lockdown caused macroeconomic variables to display complex patterns that hardly follow any historical behavior. In the context of Bayesian VARs, an off-the-shelf exercise demonstrates how a very small number of extreme pandemic observations biases the estimated persistence of the variables, affecting forecasts and giving a myopic view of the economic effects after a structural shock. I propose an easy and straightforward solution for dealing with these extreme episodes, as an extension of the Minnesota Prior with dummy observations, by allowing for time dummies. The Pandemic Priors succeed in recovering the historical relationships and the proper identification and propagation of structural shocks. 
Keywords:  Bayesian VAR; Minnesota Prior; COVID-19; Structural shocks 
JEL:  C32 E32 E44 
Date:  2022–08–03 
URL:  http://d.repec.org/n?u=RePEc:fip:fedgif:1352&r= 
By:  Christian K. Wolf; Alisdair McKay 
Abstract:  We show that, in a general family of linearized structural macroeconomic models, knowledge of the empirically estimable causal effects of contemporaneous and news shocks to the prevailing policy rule is sufficient to construct counterfactuals under alternative policy rules. If the researcher is willing to postulate a loss function, our results furthermore allow her to recover an optimal policy rule for that loss. Under our assumptions, the derived counterfactuals and optimal policies are robust to the Lucas critique. We then discuss strategies for applying these insights when only a limited amount of empirical causal evidence on policy shock transmission is available. 
JEL:  E32 E61 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:nbr:nberwo:30358&r= 
By:  Yaron Azrieli; John Rehbeck 
Abstract:  Models of stochastic choice typically use conditional choice probabilities given menus as the primitive for analysis, but in the field these are often hard to observe. Moreover, studying preferences over menus is not possible with such data. We assume that an analyst can observe marginal frequencies of choice and availability, but not conditional choice frequencies, and study the testable implications of some prominent models of stochastic choice for this dataset. We also analyze whether the parameters of these models can be identified. Finally, we characterize the marginal distributions that can arise under two-stage models in the spirit of Gul and Pesendorfer [2001] and Kreps [1979], where agents select the menu before choosing an alternative. 
Date:  2022–08 
URL:  http://d.repec.org/n?u=RePEc:arx:papers:2208.08492&r= 
By:  Angelopoulos, Anastasios N. (?); Bates, Stephen (?); Candes, Emmanuel J. (?); Jordan, Michael I. (?); Lei, Lihua (Stanford U) 
Abstract:  We introduce a framework for calibrating machine learning models so that their predictions satisfy explicit, finite-sample statistical guarantees. Our calibration algorithm works with any underlying model and (unknown) data-generating distribution and does not require model refitting. The framework addresses, among other examples, false discovery rate control in multi-label classification, intersection-over-union control in instance segmentation, and the simultaneous control of the type-1 error of outlier detection and confidence set coverage in classification or regression. Our main insight is to reframe the risk-control problem as multiple hypothesis testing, enabling techniques and mathematical arguments different from those in the previous literature. We use our framework to provide new calibration methods for several core machine learning tasks, with detailed worked examples in computer vision and tabular medical data. 
Date:  2022–04 
URL:  http://d.repec.org/n?u=RePEc:ecl:stabus:4030&r= 
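The reframing of risk control as multiple testing can be sketched with a fixed-sequence procedure: scan candidate thresholds from safest to riskiest, compute a concentration-based p-value for "risk exceeds alpha" at each, and stop at the first failure. The Bernoulli loss model, the grid of thresholds, and the Hoeffding p-value below are illustrative assumptions standing in for the paper's worked examples.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
# Per-example losses at each candidate threshold lambda. Larger lambda means a
# more conservative prediction and hence a smaller expected loss.
lambdas = np.linspace(0, 1, 21)
p_err = 0.5 * (1 - lambdas)                       # true risk at each lambda
losses = rng.random((n, lambdas.size)) < p_err    # Bernoulli(risk) losses

alpha, delta = 0.2, 0.05                          # risk level, error probability
chosen = None
for j in range(lambdas.size - 1, -1, -1):         # fixed-sequence: safest first
    rhat = losses[:, j].mean()                    # empirical risk at lambda_j
    # Hoeffding p-value for H0: risk(lambda_j) > alpha
    pval = np.exp(-2 * n * max(alpha - rhat, 0.0) ** 2)
    if pval <= delta:
        chosen = lambdas[j]                       # certified; try a riskier lambda
    else:
        break                                     # first failure stops the scan
```

Because the hypotheses are tested in a fixed order and the scan stops at the first non-rejection, the probability of certifying any threshold whose true risk exceeds alpha is at most delta, with no multiplicity correction needed.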