Presented By: Dr. Melvyn Weeks (University of Cambridge)
- Day 1, 10am-12pm & 2pm-4pm
- Day 2, 10am-12pm & 2pm-4pm
- Day 3, 10am-1pm
This course will review the application of machine learning techniques to both prediction problems and so-called causal problems, where a firm or policymaker needs to understand the impact of some form of intervention on a heterogeneous population.
One example is a firm that wishes to understand how a change in pricing affects both aggregate demand and demand within different segments of the population. In another example, a policymaker seeks to understand the impact of an intervention both in terms of some form of average effect and in terms of how individuals differ in the magnitude of the effect. Examples include the impact of job training programmes, the impact of education policies in developing economies, and the differential impact of drugs on survival and recovery.
In this context we make the distinction between the ex post assessment of a change and the ex ante identification of characteristics of individuals that are predictive of the likely impact of such a change.
Using Breiman’s (2001) notion of two cultures in the use of statistical modelling, the course begins with a review of the fundamental differences between machine learning and econometrics.
"There are two cultures in the use of statistical modelling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown." - Breiman (2001), p. 199.
We contrast a modelling approach where the analyst makes certain assumptions about model specification, including functional form, with an approach where the data mechanism is presumed unknown. In this context we consider the econometrician's concern for internal validity, alongside the focus within machine learning on ensuring that a model is robust in the sense of generalising to unseen data (external validity).
The course will focus upon topics at the intersection of machine learning and econometrics, covering a mix of theory and applications. In making the distinction between models which are used to solve a prediction problem and models which are used to estimate some form of causal effect, we introduce participants to identification strategies in econometrics. Here it is important to demonstrate how empirical strategies such as unconfoundedness, instrumental variables, and difference-in-differences can be used alongside machine learning methods for prediction.
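To make the contrast between a prediction problem and a causal problem concrete, the following minimal sketch simulates an intervention whose assignment depends on an observed covariate. The data-generating process, the variable names and the effect size of 2.0 are all assumptions chosen for illustration and are not taken from the course material. Under unconfoundedness (treatment is as good as random once we condition on `x`), a regression that adjusts for `x` should recover the effect, whereas a naive difference in means should not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
true_effect = 2.0  # assumed for illustration

# x is an observed confounder: it raises both the probability of
# treatment and the outcome itself.
x = rng.normal(size=n)
p_treat = 1.0 / (1.0 + np.exp(-1.5 * x))
d = rng.binomial(1, p_treat)
y = true_effect * d + 3.0 * x + rng.normal(size=n)

# Naive comparison of treated and untreated means ignores x.
naive = y[d == 1].mean() - y[d == 0].mean()

# Regression adjustment: OLS of y on a constant, d and x.
X = np.column_stack([np.ones(n), d, x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
adjusted = coef[1]

print(f"naive difference in means:  {naive:.2f}")
print(f"regression-adjusted effect: {adjusted:.2f}")
```

The linear adjustment works here only because the simulated outcome is linear in `x`; when the functional form is unknown, this adjustment step is one place where a flexible machine learning predictor could be substituted.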
As a point of departure we make reference to the two broad types of machine learning in terms of supervised and unsupervised learning, making the link to nonparametric regression. We then consider a number of fundamental building blocks, starting with error decomposition in terms of bias and variance, the role of training, estimation and test samples, and the role of regularization as a means to avoid overfitting.
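The trade-off between bias and variance, and the role of regularization, can be illustrated with a small simulation. The sketch below is an assumption-laden example rather than course material: the true function, the degree-9 polynomial basis, the noise level and the three penalty values are all chosen for illustration. It refits a ridge regression on many simulated training samples and reports the squared bias and the variance of the predictions on a fixed test grid.

```python
import numpy as np

rng = np.random.default_rng(1)

def f(x):
    return np.sin(2 * np.pi * x)  # assumed true regression function

def poly_features(x, degree=9):
    # Columns 1, x, x^2, ..., x^degree.
    return np.vander(x, degree + 1, increasing=True)

def ridge_fit(X, y, lam):
    # Closed-form ridge estimate; the intercept column is left unpenalised.
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

x_test = np.linspace(0.05, 0.95, 50)
X_test = poly_features(x_test)
n_train, n_reps, sigma = 30, 300, 0.3

results = {}
for lam in [1e-6, 1e-2, 10.0]:
    preds = np.empty((n_reps, x_test.size))
    for r in range(n_reps):
        # Draw a fresh training sample and refit the model.
        x_tr = rng.uniform(0.0, 1.0, n_train)
        y_tr = f(x_tr) + sigma * rng.normal(size=n_train)
        beta = ridge_fit(poly_features(x_tr), y_tr, lam)
        preds[r] = X_test @ beta
    bias_sq = np.mean((preds.mean(axis=0) - f(x_test)) ** 2)
    variance = np.mean(preds.var(axis=0))
    results[lam] = (bias_sq, variance)
    print(f"lambda={lam:g}: bias^2={bias_sq:.4f}, variance={variance:.4f}")
```

In this setup a heavier penalty should lower the variance of the fitted curve at the cost of higher bias. In practice the penalty would be chosen on data held out from training rather than by comparison with a known truth.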
In covering the broad areas where machine learning is used, namely prediction, classification and causal effects, we link the exposition in each case to parametric benchmarks. For prediction we consider the piecewise nonlinear regression model and high-dimensional methods; for causal effects we consider the specification of models with instrumental variables and treatment effects.
Participants will also be introduced to the use of ensemble methods as an averaging and regularization device. In this context we will explore a number of general methods for model averaging including bootstrap sampling (so-called bagging) and random forests.
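As a rough illustration of bagging as an averaging device, the sketch below fits a small one-dimensional regression tree and compares it with the average of trees fitted to bootstrap resamples. The tree implementation, the simulated data and all tuning values (depth, leaf size, number of resamples) are assumptions made for this example. It covers bagging only; a random forest would additionally randomise the covariates considered at each split, which has no counterpart with a single covariate.

```python
import numpy as np

rng = np.random.default_rng(2)

def f(x):
    return np.sin(2 * np.pi * x)  # assumed true regression function

def fit_tree(x, y, depth, min_leaf=5):
    """Fit a 1-D regression tree by recursive least-squares splitting."""
    n = x.size
    if depth == 0 or n < 2 * min_leaf:
        return float(y.mean())  # leaf: predict the sample mean
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    k = np.arange(min_leaf, n - min_leaf + 1)  # candidate left-child sizes
    left_sum = np.cumsum(ys)[k - 1]
    right_sum = ys.sum() - left_sum
    # Minimising the split SSE is equivalent to maximising this quantity.
    gain = left_sum**2 / k + right_sum**2 / (n - k)
    best = k[np.argmax(gain)]
    threshold = 0.5 * (xs[best - 1] + xs[best])
    return (threshold,
            fit_tree(xs[:best], ys[:best], depth - 1, min_leaf),
            fit_tree(xs[best:], ys[best:], depth - 1, min_leaf))

def predict_tree(tree, x):
    if not isinstance(tree, tuple):
        return np.full(x.shape, tree)
    threshold, left, right = tree
    out = np.empty(x.shape)
    mask = x <= threshold
    out[mask] = predict_tree(left, x[mask])
    out[~mask] = predict_tree(right, x[~mask])
    return out

def bagged_predict(x_tr, y_tr, x_new, n_boot=50, depth=4):
    """Average the predictions of trees fitted to bootstrap resamples."""
    preds = np.zeros(x_new.shape)
    for _ in range(n_boot):
        idx = rng.integers(0, x_tr.size, x_tr.size)  # sample with replacement
        preds += predict_tree(fit_tree(x_tr[idx], y_tr[idx], depth), x_new)
    return preds / n_boot

x_grid = np.linspace(0.0, 1.0, 200)
n_train, sigma, n_reps = 200, 0.5, 20
mse_single, mse_bagged = 0.0, 0.0
for _ in range(n_reps):
    x_tr = rng.uniform(0.0, 1.0, n_train)
    y_tr = f(x_tr) + sigma * rng.normal(size=n_train)
    single = predict_tree(fit_tree(x_tr, y_tr, depth=4), x_grid)
    bagged = bagged_predict(x_tr, y_tr, x_grid)
    # Error is measured against the known true function on the grid.
    mse_single += np.mean((single - f(x_grid)) ** 2) / n_reps
    mse_bagged += np.mean((bagged - f(x_grid)) ** 2) / n_reps

print(f"single tree MSE:  {mse_single:.4f}")
print(f"bagged trees MSE: {mse_bagged:.4f}")
```

Averaging over resamples smooths the step function produced by a single tree, which is why the bagged predictor is expected to have the lower error in this simulation.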
For machine learning models in prediction, classification and causal effects, we provide examples using Stata, R and Python.
Application: Causal Forest Estimation of Heterogeneous Household Response to Time-Of-Use Electricity Pricing Schemes
The introduction of time-of-use electricity prices is an example of a policy with heterogeneous effects. Consumers in different socioeconomic groups, with distinct historical intra-day load profiles and behavioural characteristics, may respond differently to the introduction of tariffs that charge different prices for electricity at different times of the day. Customers who can (cannot) adapt their consumption profile to tou tariffs will accrue a benefit (cost). Those who consume electricity at more expensive peak periods, and who are unable to change their consumption patterns, could end up paying significantly more.
Analysts often describe subpopulations that are of interest a priori, and which can be defined by a known combination of covariates. However, increasingly researchers face a selection problem given a large number of possible covariates alongside uncertainty as to which covariates are important for heterogeneity, and what functional form best describes the association between these covariates and treatment effects.
In assessing whether demographic variables are informative in terms of the impact of tou tariffs on load profiles, the Customer-Led Network Revolution project noted:
... a relatively consistent average demand profile across the different demographic groups, with much higher variability within groups than between them. This high variability is seen both in total consumption and in peak demand.
In addition, asking which individual demographic variables are important when considering the impact of energy policies ignores the fact that many of these variables should be considered together, in a multiplicative fashion. One reason for the finding quoted above might be that, for example, it is the (unknown) combination of income, household size, education, and daily usage patterns that describes a particularly responsive or unresponsive group.
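The point about combinations of variables can be made with a deliberately stylised simulation. In the sketch below, the two binary household characteristics, the randomised tariff assignment and the purely multiplicative effect are all hypothetical and chosen only to make the pattern stark. Comparing treated and untreated households within each income group alone suggests no response, while the four income-by-size cells reveal responses of opposite sign.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40_000

# Hypothetical binary household characteristics and a randomised tariff.
high_income = rng.integers(0, 2, n)
large_household = rng.integers(0, 2, n)
treated = rng.integers(0, 2, n)

# Assumed effect: the product of the two characteristics coded as -1/+1,
# so neither variable matters on its own.
s_income = 2 * high_income - 1
s_size = 2 * large_household - 1
effect = -0.5 * s_income * s_size
demand = 1.0 + effect * treated + rng.normal(scale=0.5, size=n)

def diff_in_means(mask):
    """Treated-minus-control mean demand among households in mask."""
    return (demand[mask & (treated == 1)].mean()
            - demand[mask & (treated == 0)].mean())

# Effects by income group alone.
marginal = {g: diff_in_means(high_income == g) for g in (0, 1)}

# Effects by income and household size jointly.
joint = {(a, b): diff_in_means((high_income == a) & (large_household == b))
         for a in (0, 1) for b in (0, 1)}

for g, est in marginal.items():
    print(f"high_income={g}: estimated effect {est:+.2f}")
for (a, b), est in joint.items():
    print(f"high_income={a}, large_household={b}: estimated effect {est:+.2f}")
```

With only two binary covariates the relevant cells can be enumerated by hand. With many covariates and unknown functional form they cannot, which is the selection problem that motivates data-driven methods.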
Throughout the course we make reference to the problem of identifying the distributional effects of some intervention, without succumbing to the problems of data mining (multiplicity). Here we examine the empirical problem of identifying the characteristics of winners and losers following the introduction of a Time-of-Use (tou) pricing scheme, where the price per kWh of electricity usage depends on the time of consumption. The pricing scheme is enabled by smart meters, which record consumption every half-hour.
Using machine learning methods we describe the association between the effect of tou pricing schemes on household electricity demand and a range of variables that are observable before the introduction of the new pricing schemes.
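The sketch below is not a causal forest. It is a much simpler stand-in that illustrates one idea: using pre-treatment information to find groups with different responses while limiting the multiplicity problem through sample splitting. One half of a simulated sample chooses a threshold on a hypothetical pre-treatment covariate (`peak_share`), and the other half estimates the treatment effect on either side of it. The randomised assignment, the covariate, the effect sizes and the grid of candidate thresholds are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000

# Hypothetical pre-treatment covariate: share of baseline usage at peak times.
peak_share = rng.uniform(0.0, 1.0, n)
treated = rng.integers(0, 2, n)  # randomised assignment (assumed)

# Assumed truth: only households with a low peak share reduce demand.
true_effect = np.where(peak_share < 0.4, -1.0, 0.0)
demand = 2.0 + peak_share + true_effect * treated + rng.normal(size=n)

def effect(mask, y, d):
    """Treated-minus-control mean outcome among the units in mask."""
    return y[mask & (d == 1)].mean() - y[mask & (d == 0)].mean()

# Split the sample: one half selects the subgroups, the other estimates.
idx = rng.permutation(n)
sel, est = idx[: n // 2], idx[n // 2:]

# Selection half: pick the threshold with the largest gap in estimated effects.
candidates = np.linspace(0.1, 0.9, 17)
gaps = [abs(effect(peak_share[sel] < c, demand[sel], treated[sel])
            - effect(peak_share[sel] >= c, demand[sel], treated[sel]))
        for c in candidates]
threshold = candidates[int(np.argmax(gaps))]

# Estimation half: effects on each side of the chosen threshold.
low = effect(peak_share[est] < threshold, demand[est], treated[est])
high = effect(peak_share[est] >= threshold, demand[est], treated[est])

print(f"chosen threshold on peak_share: {threshold:.2f}")
print(f"estimated effect below threshold: {low:+.2f}")
print(f"estimated effect above threshold: {high:+.2f}")
```

Because the threshold is chosen on data that are not reused for estimation, the reported subgroup effects are not inflated by the search over candidate splits. Tree-based causal estimators extend this idea to many covariates and many splits.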
- L. Breiman, J. Friedman, R. Olshen, C. Stone. Classification and Regression Trees. Klein-Verlag, 1990.
- T. Hastie, R. Tibshirani, J. Friedman. The Elements of Statistical Learning. Springer, 2009.