A Review of Forecasting Techniques for Large Data Sets

By Eklund, Jana; Kapetanios, George | National Institute Economic Review, January 2008

This paper aims to provide a brief and relatively non-technical overview of state-of-the-art forecasting with large data sets. We classify existing methods into four groups depending on whether data sets are used wholly or partly, whether a single model or multiple models are used and whether a small subset or the whole data set is being forecast. In particular, we provide brief descriptions of the methods and short recommendations where appropriate, without going into detailed discussions of their merits or demerits.

Keywords: Large datasets; factors; forecast combinations.

JEL Classifications: C01; C53

I. Introduction

In recent years there has been increasing interest in forecasting methods that utilise large data sets. It is widely recognised that a huge quantity of economic information is available which might be useful for forecasting, but standard econometric techniques are not well suited to extracting it in a useful form. This is not an issue of mere academic interest. Svensson (2005) described what central bankers do in practice: "Large amounts of data about the state of the economy and the rest of the world ... are collected, processed, and analyzed before each major decision." In an effort to assist in this task, econometricians began assembling large macroeconomic data sets and devising ways of forecasting with them.

In the past few years a large number of methods, either new or new to econometrics, have been proposed for forecasting with large data sets. This review aims to provide a brief discussion of the available methods. Given the recent and evolving nature of the literature, this review is bound to be incomplete. The need for new methods arises because, given time series observations on a large data set, denoted at time t by the N-dimensional vector x_t, it is either inefficient or downright impossible to incorporate x_t in a single forecasting model and estimate it using standard econometric techniques.
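
The difficulty can be illustrated with a minimal sketch. It assumes a hypothetical panel with more predictors than time periods (N = 100, T = 40) and simulated Gaussian data; neither the sizes nor the data come from the review. With N > T the regressor matrix cannot have full column rank, so least squares has no unique solution and can fit even a pure-noise target exactly in sample.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 40, 100                       # hypothetical sizes: fewer periods than predictors
X = rng.standard_normal((T, N))      # row t holds the N-dimensional vector x_t
y = rng.standard_normal(T)           # pure-noise target, unrelated to X by construction

# rank(X'X) equals rank(X), which cannot exceed T, so the N x N matrix X'X is singular.
rank = np.linalg.matrix_rank(X)

# lstsq still returns a solution (the minimum-norm one), and it fits the noise exactly.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
max_abs_resid = float(np.max(np.abs(y - X @ beta)))

print("rank of X:", rank, "of", N, "columns")
print("largest in-sample residual:", max_abs_resid)
```

A perfect in-sample fit to noise carries no forecasting information, which is one way to see why standard techniques break down as N grows relative to T.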

We assume that primary interest focuses on forecasting a single variable y_t, which may or may not be included in x_t. Broadly speaking, the available methodologies for forecasting with large data sets fall into four groups. The first group consists of estimation strategies that allow estimation of a single-equation model that utilises the whole of x_t. This is perhaps the most diverse group, ranging from factor-based methods to Bayesian regression. The methods of the second group inherently involve two steps. In the first step some form of variable selection is undertaken. The variables that are chosen are then most likely to be used in a standard forecasting model. Of course, if the resulting data set is still too large, it may be analysed using methods designed for large data sets. These first two groups of methods inevitably overlap. However, we feel that variable selection, and the methods it involves, are sufficiently distinct to merit separate mention and treatment. The third group of methods involves the use of subsets of x_t in distinct forecasting models and the production of multiple forecasts for y_t, which are then averaged to produce a final forecast. The distinctive feature of this group is the explicit use of model and forecast averaging. Finally, the fourth and perhaps most innovative group of methods departs from the convention of forecasting a single variable. For this group the aim is to forecast the whole of x_t (which is now assumed to contain y_t). Thus, use of multivariate models is inevitable. Here too, specially designed estimation methods need to be employed, as the size of the data set, x_t, does not allow use of standard econometric techniques.
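
To make the distinction between the first and third groups concrete, the following is a minimal sketch rather than any specific method from the literature. It assumes a simulated panel driven by two common factors, uses principal components as the group-one example, and uses equal-weight averaging of single-predictor regressions as the group-three example. All sizes, names, the data-generating process and the equal weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, r = 120, 60, 2                  # hypothetical sample size, panel width, factor count

# Simulated panel (illustration only): x_t = loadings' f_t + noise.
F = rng.standard_normal((T, r))
X = F @ rng.standard_normal((r, N)) + 0.5 * rng.standard_normal((T, N))
# y_next[t] plays the role of y_{t+1}, aligned with x_t in row t.
y_next = F @ np.array([1.0, -0.5]) + 0.3 * rng.standard_normal(T)

def ols_forecast(Z_train, y_train, z_new):
    """Regress y on Z with an intercept and return the prediction at z_new."""
    Z1 = np.column_stack([np.ones(len(Z_train)), Z_train])
    beta, *_ = np.linalg.lstsq(Z1, y_train, rcond=None)
    return float(np.concatenate([[1.0], np.atleast_1d(z_new)]) @ beta)

X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Group one: a single equation using all of x_t, compressed into r principal components.
_, _, Vt = np.linalg.svd(X_std, full_matrices=False)
factors_hat = X_std @ Vt[:r].T        # T x r matrix of estimated factors
forecast_factor = ols_forecast(factors_hat[:-1], y_next[:-1], factors_hat[-1])

# Group three: one small model per predictor, then an equal-weight average of forecasts.
forecasts = [
    ols_forecast(X_std[:-1, [i]], y_next[:-1], X_std[-1, i]) for i in range(N)
]
forecast_avg = float(np.mean(forecasts))

print("factor-based forecast:", forecast_factor)
print("averaged forecast:    ", forecast_avg)
```

Both forecasts are formed from the first T-1 rows and evaluated at the final x_t. The second group would add a selection step before a regression of this kind, and the fourth would replace the scalar target with the whole of x_t.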

As the above makes clear, our review will focus on statistical/econometric methods for dealing with large data sets. …
