Academic journal article Australian Journal of Labour Economics

How Important Are Omitted Variables, Censored Scores and Self-Selection in Analysing High-School Academic Achievement?

Article excerpt


Using a rich longitudinal data set that follows a cohort from birth, we explore three estimation issues in the analysis of academic performance. Our paper primarily examines the effect of omitting childhood and teenage characteristics (childhood ability, parental resources at different times and peer effects) that are traditionally unavailable in data sets. Additionally, we explore the potential endogeneity of pre-exam school-leaving choices (self-selection) to academic performance, and we demonstrate the effect of accounting for censored academic performance measures. We find that omitting background characteristics results in overestimation of the coefficients on other characteristics: the effect of current income is overestimated by 0.21 standard deviations of the average academic performance and the effect of ethnicity by 1.38 standard deviations. This in turn affects the policy implications drawn: for the group who did not take the exam, the predicted performance goes from a fail to a C (or pass). We also find that accounting for censored academic performance measures affects the estimation results, but allowing for selection correction does not.

JEL Classification: I21, J24, J13, J18

(ProQuest: ... denotes formulae omitted.)

1. Introduction

A growing body of economic research is focusing on the academic performance of children and adolescents, as an important economic outcome of investments in education by families and communities. A problem identified across a wide range of advanced countries is that the proportion of young persons who leave high school without qualifications is alarmingly high. For example, on average across all OECD countries, 13 per cent of young people leave high school without secondary qualifications, and this percentage is higher than 20 per cent in some countries (OECD, 2008: p. 66).

In this paper we use a longitudinal data set that begins at birth to provide evidence on the importance of including characteristics that are usually not available: childhood family income and cognitive development, and the behavioural characteristics of teenage peers. An unresolved question is the relative importance of early childhood versus later income and background characteristics, as highlighted in the review by Haveman and Wolfe (1995) and in several studies since then. We observe a wide range of characteristics for a cohort of students in New Zealand, for whom we also observe Year 10 National Examination results if they took the exam. Importantly, our data set, the Christchurch Health and Development Study (CHDS), allows us to incorporate an extensive range of characteristics for a complete birth cohort, including students who do not have observed examination scores. We use this feature of our data to predict academic performance for all students (including those who did not take the exam), based on the students who took the exam, while allowing for selection into taking the exam. We examine the effect of early childhood and later family income. We are also able to control for the effect of childhood cognitive development and teenage peers' behavioural effects in our predictions of expected grades for at-risk students. We show that our data are comparable to those in current studies when we use background characteristics that are readily available in the literature.

In New Zealand generally, and in our study, students who are at school are expected to take the School Certificate Exams. These are nationally administered exams, based on the same set of questions and grading for all participants, taken at the end of Year 10, usually at age 15. This is a great advantage, as using this measure of academic performance eliminates recognised problems of inconsistency in comparing grades across schools, especially across lower and higher income school districts.1 It thus provides nationally comparable measures of academic performance for students while they are in secondary school.

We contribute to the existing literature by testing the sensitivity of our academic performance results to the inclusion of a range of additional variables which are traditionally not available in many other data sets (childhood ability, parental resources at different times and peer effects). …
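The sensitivity to omitted variables described above can be illustrated with a minimal sketch on synthetic data. It does not use the CHDS data or the authors' specification. The variable names (`ability`, `income`, `score`) and the coefficients 0.6, 0.3 and 0.8 are assumptions chosen purely for illustration. The sketch compares the income coefficient from a regression that omits ability with the coefficient from a regression that includes it.

```python
import random


def demean(values):
    """Return the values with their sample mean subtracted."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    """Sum of element-wise products of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def ols_one(x, y):
    """Slope from regressing y on x alone (intercept handled by demeaning)."""
    xd, yd = demean(x), demean(y)
    return dot(xd, yd) / dot(xd, xd)


def ols_two(x1, x2, y):
    """Slopes from regressing y on x1 and x2 (intercept handled by demeaning)."""
    a, b, yd = demean(x1), demean(x2), demean(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, yd), dot(b, yd)
    det = s11 * s22 - s12 * s12  # determinant of the 2x2 normal equations
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Synthetic data (hypothetical): ability raises both income and score.
rng = random.Random(0)
n = 5000
ability = [rng.gauss(0.0, 1.0) for _ in range(n)]
income = [0.6 * a + rng.gauss(0.0, 1.0) for a in ability]
score = [0.3 * inc + 0.8 * a + rng.gauss(0.0, 1.0)
         for inc, a in zip(income, ability)]

# "Short" regression omits ability; "long" regression includes it.
short_income = ols_one(income, score)
long_income, long_ability = ols_two(income, ability, score)

print(f"income coefficient, ability omitted:  {short_income:.3f}")
print(f"income coefficient, ability included: {long_income:.3f}")
```

Under these assumed parameters the short regression overstates the income coefficient, because income absorbs part of the effect of the omitted, positively correlated ability variable. This is the same direction of bias that the paper reports for current income.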
