nep-cmp New Economics Papers
on Computational Economics
Issue of 2017‒08‒13
nine papers chosen by



  1. Derivative-Based Optimization with a Non-Smooth Simulated Criterion By David T. Frazier; Dan Zhu
  2. Machine learning in sentiment reconstruction of the simulated stock market By Mikhail Goykhman; Ali Teimouri
  3. Cardinality constrained portfolio selection via factor models By Juan Francisco Monge
  4. Uncomplicated Parallel Computing with Stata By Brian Quistorff; George G Vega Yon
  5. Now You See Me: High School Dropout and Machine Learning By Dario Sansone; Pooya Almasi
  6. Big data in Stata with the ftools package By Sergio Correia
  7. Scheduling Markovian PERT networks to maximize the net present value: new results By Ben Hermans; Roel Leus
  8. Exploring the Potential of Machine Learning for Automatic Slum Identification from VHE Imagery By Duque, Juan Carlos; Patino, Jorge Eduardo; Betancourt, Alejandro
  9. Propensity Scores and Causal Inference Using Machine Learning Methods By Austin Nichols; Linden McBride

  1. By: David T. Frazier; Dan Zhu
    Abstract: Indirect inference requires simulating realizations of endogenous variables from the model under study. When the endogenous variables are discontinuous functions of the model parameters, the resulting indirect inference criterion function is discontinuous and does not permit the use of derivative-based optimization routines. Using a specific class of measure changes, we propose a novel simulation algorithm that alleviates the underlying discontinuities inherent in the indirect inference criterion function, permitting the application of derivative-based optimization routines to estimate the unknown model parameters. Unlike competing approaches, ours does not rely on kernel smoothing or bandwidth parameters. Several Monte Carlo examples that have featured in the literature on indirect inference with discontinuous outcomes illustrate the approach. These examples demonstrate that the new method outperforms existing alternatives in terms of bias, variance, and coverage.
    Date: 2017–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1708.02365&r=cmp
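    A minimal Python sketch of the underlying difficulty (illustrative only, not the authors' measure-change algorithm): when simulated outcomes are indicator functions of the parameters, a simulated moment is a step function of those parameters, so derivative-based optimizers see zero or undefined gradients.

      import numpy as np

      rng = np.random.default_rng(0)
      x = rng.normal(size=1000)
      eps = rng.normal(size=1000)      # common random numbers, held fixed across parameter values

      def simulated_moment(beta):
          """Mean of a simulated binary outcome y = 1{beta * x + eps > 0}."""
          return float(np.mean(beta * x + eps > 0))

      # The moment is piecewise constant in beta: small parameter changes rarely
      # flip any indicator, so finite-difference derivatives are typically zero.
      for b in (0.50, 0.5001, 0.51):
          print(b, simulated_moment(b))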
  2. By: Mikhail Goykhman; Ali Teimouri
    Abstract: In this paper we continue the study of the simulated stock market framework defined by the driving sentiment processes. We focus on a market environment driven by a buy/sell trading sentiment process of the Markov chain type. We apply Hidden Markov Models and Recurrent Neural Networks to reconstruct the transition probability matrix of the Markov sentiment process and to recover the underlying sentiment states from the observed stock price behavior.
    Date: 2017–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1708.01897&r=cmp
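    A minimal sketch (not the authors' code) of one ingredient described above: fitting a two-state Gaussian HMM to observed returns to recover a transition probability matrix and the most likely hidden sentiment states, assuming the third-party hmmlearn package.

      import numpy as np
      from hmmlearn.hmm import GaussianHMM

      rng = np.random.default_rng(1)

      # Toy data: a two-state Markov "sentiment" chain shifting the mean of returns.
      n, p_stay = 2000, 0.95
      states = np.zeros(n, dtype=int)
      for t in range(1, n):
          states[t] = states[t - 1] if rng.random() < p_stay else 1 - states[t - 1]
      returns = np.where(states == 1, 0.002, -0.002) + 0.003 * rng.normal(size=n)

      model = GaussianHMM(n_components=2, covariance_type="diag", n_iter=200)
      model.fit(returns.reshape(-1, 1))               # estimate emission and transition parameters
      print(model.transmat_)                          # estimated transition probability matrix
      hidden = model.predict(returns.reshape(-1, 1))  # most likely state path (Viterbi decoding)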
  3. By: Juan Francisco Monge
    Abstract: In this paper we propose and discuss different 0-1 linear models for solving the cardinality-constrained portfolio problem using factor models. Factor models are used to build portfolios that track indexes, among other objectives, and they require fewer parameters to estimate than the classical Markowitz model. Adding cardinality constraints limits the number of securities in the portfolio; restricting this number allows us to obtain a concentrated portfolio, reduce the risk, and limit transaction costs. To solve this problem, we present a pure 0-1 model constructed by means of a piecewise linear approximation. We also present a new quadratic combinatorial problem, called the minimum edge-weighted clique problem, to obtain an equally weighted cardinality-constrained portfolio. A piecewise linear approximation for this problem is presented in the context of a multi-factor model. For a single-factor model, we present a fast heuristic, based on theoretical results, to obtain an equally weighted cardinality-constrained portfolio. The piecewise linear approximation significantly reduces the computation time required relative to the equivalent quadratic problem. Computational results from the 0-1 models are compared with those from a state-of-the-art quadratic MIP solver.
    Date: 2017–08
    URL: http://d.repec.org/n?u=RePEc:arx:papers:1708.02424&r=cmp
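    A schematic mean-variance formulation with a cardinality constraint (illustrative only; the paper works with 0-1 piecewise-linear approximations and factor models): binary variables switch each asset on or off, and their sum caps the number of securities held. The cvxpy package and a mixed-integer-capable solver are assumed, and all data are synthetic.

      import numpy as np
      import cvxpy as cp

      rng = np.random.default_rng(2)
      n, K = 10, 3                              # number of assets and cardinality bound (toy values)
      mu = rng.normal(0.05, 0.02, size=n)       # expected returns (synthetic)
      A = rng.normal(size=(n, n))
      Sigma = A @ A.T / n + 1e-3 * np.eye(n)    # a positive-definite synthetic covariance

      w = cp.Variable(n)                        # portfolio weights
      z = cp.Variable(n, boolean=True)          # 1 if asset i is included, 0 otherwise
      risk_aversion = 5.0

      constraints = [cp.sum(w) == 1,            # fully invested
                     w >= 0,                    # long-only
                     w <= z,                    # w_i can be positive only if z_i = 1
                     cp.sum(z) <= K]            # at most K securities
      objective = cp.Minimize(risk_aversion * cp.quad_form(w, Sigma) - mu @ w)
      problem = cp.Problem(objective, constraints)
      # problem.solve()                         # requires an installed mixed-integer quadratic solver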
  4. By: Brian Quistorff (Microsoft); George G Vega Yon (University of Southern California)
    Abstract: Parallel lets you run Stata faster, sometimes faster than MP itself. By organizing your job into several Stata instances, parallel allows you to work with out-of-the-box parallel computing. Using the 'parallel' prefix, you can speed up simulations, bootstrapping, reshaping big data, and more, without having to know a thing about parallel computing. Without requiring Stata/MP, parallel has been shown to dramatically speed up computations by a factor of two, four, or more, depending on how many processors your computer has.
    Date: 2017–08–10
    URL: http://d.repec.org/n?u=RePEc:boc:scon17:3&r=cmp
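    The package itself is a Stata prefix command; as a language-neutral illustration of the same idea (splitting independent replications across worker processes), here is a rough Python sketch using the standard multiprocessing module, with synthetic data standing in for a real job.

      import numpy as np
      from multiprocessing import Pool

      def one_replication(seed):
          """One independent simulation draw; each worker needs only its seed."""
          rng = np.random.default_rng(seed)
          sample = rng.normal(loc=1.0, scale=2.0, size=10_000)
          return sample.mean()

      if __name__ == "__main__":
          with Pool(processes=4) as pool:           # akin to splitting the job across 4 Stata instances
              estimates = pool.map(one_replication, range(500))
          print(np.mean(estimates), np.std(estimates))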
  5. By: Dario Sansone (Georgetown University); Pooya Almasi (Georgetown University)
    Abstract: In this paper, we create an algorithm to predict which students will eventually drop out of US high school using information available in 9th grade. We show that using a naive model, as implemented in many schools, leads to poor predictions. We then explain how schools can obtain more precise predictions by exploiting the big data available to them, as well as more sophisticated quantitative techniques. We also compare the performance of econometric techniques such as Logistic Regression with Machine Learning tools such as Support Vector Machines, Boosting, and LASSO. We offer practical advice on how to apply the new Machine Learning commands available in Stata to the high-dimensional datasets available in education. Model parameters are calibrated by taking into account policy goals and budget constraints.
    Date: 2017–08–10
    URL: http://d.repec.org/n?u=RePEc:boc:scon17:5&r=cmp
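    A minimal illustration (the paper's implementation is in Stata and is not reproduced here) of comparing such classifiers by cross-validated AUC with scikit-learn; X and y stand in for 9th-grade features and a dropout indicator and are synthetic.

      from sklearn.datasets import make_classification
      from sklearn.linear_model import LogisticRegression
      from sklearn.svm import SVC
      from sklearn.ensemble import GradientBoostingClassifier
      from sklearn.model_selection import cross_val_score

      # Synthetic, imbalanced stand-in for student-level data (dropout is the rare class).
      X, y = make_classification(n_samples=2000, n_features=30, weights=[0.85], random_state=0)

      models = {
          "logit": LogisticRegression(max_iter=1000),
          "lasso-logit": LogisticRegression(penalty="l1", solver="liblinear", C=0.1),
          "svm": SVC(kernel="rbf"),
          "boosting": GradientBoostingClassifier(),
      }
      for name, model in models.items():
          auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
          print(f"{name}: mean AUC = {auc:.3f}")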
  6. By: Sergio Correia (Board of Governors of the Federal Reserve System)
    Abstract: In recent years, very large datasets have become increasingly prevalent in most social sciences. However, some of the most important Stata commands (collapse, egen, merge, sort, etc.) rely on algorithms that are not well suited for big data. In my talk, I will present the ftools package, which contains plug-in alternatives to these commands and performs up to 20 times faster on large datasets. Further, I will explain the underlying algorithm and Mata function, and show how to use this function to create new Stata commands and to speed up existing packages.
    Date: 2017–08–10
    URL: http://d.repec.org/n?u=RePEc:boc:scon17:6&r=cmp
  7. By: Ben Hermans; Roel Leus
    Abstract: We study the problem of scheduling a project so as to maximize its expected net present value when task durations are exponentially distributed. Based on the structural properties of an optimal solution, we show that, even if preemption is allowed, it is never necessary. Next to its managerial importance, this result also allows for a new algorithm that improves on the current state of the art by several orders of magnitude, both in CPU time and in memory usage.
    Keywords: Project scheduling, Net present value, Exponentially distributed activity durations, Markov decision process, Monotone optimal policy
    Date: 2017–07
    URL: http://d.repec.org/n?u=RePEc:ete:kbiper:588552&r=cmp
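    One building block behind expected-NPV calculations with exponential durations (not the authors' algorithm): a cash flow c received when an activity with rate lam finishes, discounted continuously at rate r, has expected present value c * lam / (lam + r), since E[exp(-r T)] = lam / (lam + r) for T ~ Exp(lam). A quick Python check:

      import numpy as np

      lam, r, c = 2.0, 0.05, 100.0
      closed_form = c * lam / (lam + r)

      rng = np.random.default_rng(3)
      T = rng.exponential(scale=1.0 / lam, size=1_000_000)   # simulated activity durations
      monte_carlo = c * np.exp(-r * T).mean()                # simulated expected present value

      print(closed_form, monte_carlo)                        # the two values should agree closely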
  8. By: Duque, Juan Carlos; Patino, Jorge Eduardo; Betancourt, Alejandro
    Abstract: Slum identification in urban settlements is a crucial step in the formulation of pro-poor policies. However, conventional methods for slum detection, such as field surveys, can be time-consuming and costly. This paper explores the possibility of implementing a low-cost, standardized method for slum detection. We use spectral, texture, and structural features extracted from very high spatial resolution imagery as input data and evaluate the capability of three machine learning algorithms (Logistic Regression, Support Vector Machine, and Random Forest) to classify urban areas as slum or non-slum. Using data from Buenos Aires (Argentina), Medellin (Colombia), and Recife (Brazil), we find that a Support Vector Machine with a radial basis kernel delivers the best performance (over 0.81). We also find that singularities within cities preclude the use of a unified classification model.
    Keywords: Cities, Urban development, Economics, Equity and social inclusion, Georeferencing, Socioeconomic research, Poverty, Public policy, Public services, Housing
    Date: 2016
    URL: http://d.repec.org/n?u=RePEc:dbl:dblpap:975&r=cmp
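    A minimal sketch (not the authors' pipeline) of the best-performing configuration reported above, an RBF-kernel SVM on standardized image features, using scikit-learn; the feature matrix and labels are synthetic placeholders.

      from sklearn.datasets import make_classification
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC
      from sklearn.model_selection import cross_val_score

      # Placeholder for per-tile spectral/texture/structural features and slum labels.
      X, y = make_classification(n_samples=1500, n_features=40, n_informative=10, random_state=0)

      clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
      scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
      print(scores.mean())      # cross-validated accuracy of the slum / non-slum classifier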
  9. By: Austin Nichols (Abt Associates); Linden McBride (Cornell University)
    Abstract: We compare a variety of methods for predicting the probability of a binary treatment (the propensity score), with the goal of comparing otherwise-like cases in treatment and control conditions for causal inference about treatment effects. Better prediction methods can, under some circumstances, improve causal inference by reducing the finite-sample bias and variability of estimators; in other cases, however, better predictions of the probability of treatment can increase bias and variance. We clarify the conditions under which different methods produce better or worse inference (in terms of the mean squared error of causal impact estimates).
    Date: 2017–08–10
    URL: http://d.repec.org/n?u=RePEc:boc:scon17:13&r=cmp
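    A minimal sketch (not the authors' code) of the workflow the abstract describes: estimate the propensity score with a flexible machine learning classifier, then use inverse-probability weights to compare treated and control outcomes; scikit-learn is assumed and the data are synthetic, with a true treatment effect of 1.0.

      import numpy as np
      from sklearn.ensemble import GradientBoostingClassifier

      rng = np.random.default_rng(4)
      n = 5000
      x = rng.normal(size=(n, 5))                                  # covariates
      p_true = 1 / (1 + np.exp(-(x[:, 0] - 0.5 * x[:, 1])))        # true treatment probability
      d = rng.binomial(1, p_true)                                  # binary treatment
      y = 1.0 * d + x[:, 0] + rng.normal(size=n)                   # outcome with true effect 1.0

      # Propensity scores from a flexible (machine learning) classifier.
      ps = GradientBoostingClassifier().fit(x, d).predict_proba(x)[:, 1]
      ps = np.clip(ps, 0.01, 0.99)                                 # trim to avoid extreme weights

      # Inverse-probability-weighted estimate of the average treatment effect.
      ate = np.mean(d * y / ps) - np.mean((1 - d) * y / (1 - ps))
      print(ate)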

General information on the NEP project can be found at https://nep.repec.org. For comments please write to the director of NEP, Marco Novarese at <director@nep.repec.org>. Put “NEP” in the subject, otherwise your mail may be rejected.
NEP’s infrastructure is sponsored by the School of Economics and Finance of Massey University in New Zealand.