Predicting the Future: AI Approaches to Time-Series Problems

The Workshop on AI Approaches to Time-Series Problems, jointly sponsored by the Fifteenth National Conference on Artificial Intelligence (AAAI-98) and the International Conference on Machine Learning (ICML-98), was held in Madison, Wisconsin, on 27 July 1998. The organizing committee consisted of Andrea Danyluk of Williams College and Tom Fawcett and Foster Provost, both of Bell Atlantic Science and Technology. There were approximately 30 attendees.

The goal of the workshop was to bring together AI researchers who study time-series problems along with practitioners and researchers from related fields. These problems are of particular interest because of the large number of high-profile applications today that include historical time series (for example, prediction of market trends, crisis monitoring). The focus was primarily on machine-learning and data-mining approaches, but perspectives on statistical time-series analysis and state-space analysis (for example, work on hidden Markov models) were also included. These communities arguably overlap significantly (and indeed, much work on state-space analysis, for example, has fallen under the heading of machine learning). Our goal was to make researchers and practitioners not only aware of approaches similar to their own but also aware of methods from other communities that might be applied effectively to their problems.

Time-series problems include segmentation and labeling of time series as well as prediction. Classical time-series prediction involves fitting a function to a numeric time series to predict future values (for example, the prediction of a stock's price given its past performance). Time-series prediction might also be required for problems involving categorical, rather than numeric, data. For example, one might be interested in predicting the next action of a computer user given a history of user actions.
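
As a rough illustration of the categorical case, the sketch below predicts a user's next action from counts of which action has followed which in the history. It is a first-order Markov model chosen here only for brevity, not a method attributed to the workshop; the function names and the sample command history are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(actions):
    """Count how often each action is immediately followed by each other action."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(actions, actions[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, last_action):
    """Return the most frequent follower of last_action, or None if it was never seen."""
    followers = counts.get(last_action)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical history of shell commands issued by one user.
history = ["ls", "cd", "ls", "vi", "make", "ls", "cd", "ls", "cd"]
model = fit_transitions(history)
print(predict_next(model, "ls"))
```

A model this simple ignores everything but the most recent action; longer contexts or learned sequence models trade that simplicity for more history.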

In many situations, the problem is not to predict future values of a time series but, rather, to label the series. For example, cardiologists examine electrocardiograms to diagnose arrhythmias. Problems of segmentation and labeling often occur together. The problem might be to (1) assign a label to an entire time series or (2) segment the series into subsequences and label them individually. The second problem is more difficult because there is temporal information to consider both within each subsequence itself and among the various subsequences. In the extreme case, the subsequences might be single atomic events, where temporal information exists among the events, but there is no temporal information within the event itself.

The labeling problems we described assume that there are fixed boundaries that define the pieces of the time series that are to be labeled. The boundaries, however, might not be fixed. For example, consider the problem of identifying and labeling the stages of a multistage flight plan given a pilot's command sequence. Here, the boundaries of the subsequences will not be fixed for all flight plans but will be variable. The problem becomes not only labeling the stages but also identifying the boundaries themselves.
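
One simple way to picture boundary identification, assuming each step of the series has already been given a stage label by some classifier, is to merge consecutive identical labels into segments and read the boundaries off the run ends. The sketch below does only that merging step; the stage names and the `segment` function are hypothetical, and the per-step labeling itself is assumed rather than shown.

```python
from itertools import groupby


def segment(labels):
    """Merge runs of identical per-step labels into (label, start, end) segments.

    `end` is exclusive, so consecutive segments share a boundary index.
    """
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, start + length))
        start += length
    return segments


# Hypothetical per-command stage labels for one flight plan.
step_labels = ["takeoff", "takeoff", "cruise", "cruise", "cruise", "landing"]
for label, start, end in segment(step_labels):
    print(label, start, end)
```

Because the segment lengths come from the data, two flight plans with the same stages can yield different boundary positions.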

In other cases, the identification of the boundary can be paramount. For example, consider the problem of detecting credit card fraud. Time-series information exists in the form of a stream of credit card transactions for an account, and the problem is to decide at any given time whether the account has been defrauded. In real time, this process would involve examining (sub)sequences of account activity and determining, as soon as possible, whether fraudulent activity exists. Here, the problem is not only to label accurately but also to identify as closely as possible the point where fraudulent behavior begins, so that use can be stopped and losses minimized.
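
To make the onset-detection idea concrete, the following sketch applies a one-sided cumulative-sum (CUSUM) test to a stream of transaction amounts. CUSUM is a standard change-detection technique used here purely as an illustration, not as the approach discussed at the workshop. The parameters `baseline`, `slack`, and `threshold`, and the sample amounts, are assumptions made for the example.

```python
def cusum_onset(values, baseline, slack, threshold):
    """One-sided CUSUM over a stream of values.

    Accumulates how far values exceed baseline + slack, resetting to zero
    whenever the running sum would go negative. Returns a pair
    (estimated_onset_index, alarm_index) when the sum first exceeds
    threshold, or None if no alarm is raised.
    """
    s = 0.0
    onset = None
    for i, x in enumerate(values):
        if s == 0.0:
            onset = i  # the sum is at zero, so a new excursion could start here
        s = max(0.0, s + (x - baseline - slack))
        if s > threshold:
            return onset, i
    return None


# Hypothetical transaction amounts for one account.
amounts = [20, 25, 18, 22, 300, 280, 310]
print(cusum_onset(amounts, baseline=25, slack=10, threshold=400))
```

The alarm index is when the detector fires, while the onset index estimates where the unusual activity began; the gap between them reflects the trade-off between fast detection and false alarms set by `threshold`.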

The working notes contain 16 papers, 7 of which were selected for presentation at the workshop. In keeping with the goal of making the workshop a resource for researchers and practitioners in this area, the working notes included a small selection of relevant papers that also appear elsewhere. …