Data Mining for Research and Evaluation

I LIKE BEING cast in the role of a "geek" or computer nerd: it gives me a lot of anonymity. Over the years, I've helped build five high-tech schools and have consulted with many others. Fortunately for the "educator" side of me, most teachers, principals, college administrators, and others who see me in their schools think I'm just there to deal with the computer systems and networks. And since I am usually cast in the role of "geek" (I mean the term in a most complimentary way), people seem to assume that I have no skills as an educator. Thus I have been lucky to have unobtrusive access to about 100 classrooms in several different schools.

What's more, in my role as geek, I've been the system administrator of a number of e-mail systems, a dozen file servers, and numerous schoolwide networks. At present I have this kind of access to the high-tech systems in two elementary schools. What this means is simple: I know what kinds of raw data are typically collected in lots of high-tech systems that have very large memories indeed. And, like most network administrators, I have monitoring software that gathers, records, analyzes, and visually displays the data that constantly accumulate about what's happening in the school.

"Data mining" is a term used a lot by the information technology folks in business. The idea of data mining is simple: we've got all these data, so why not try to extract whatever is of value. Corporations often sift through mountains of data, extracting customer profiles, buying habits, and purchasing trends; compiling demographic details; building mailing lists; and gathering other valuable intelligence. Since the high-tech systems in schools function a lot like those in business, it is certainly feasible to mine this data for research purposes. Here are a few examples of what can be found in the data that most high-tech schools collect.

It is common practice to install "network minder" software that tracks the amount and kind of traffic on the network. This software can both give a "snapshot" of the network at any given time and log and average network traffic over time. For example, you can sit at a console in the morning and watch as the school "wakes up." It's almost like standing on a hilltop watching the lights turn on in a small city. You can also watch throughout the day and detect patterns in student computer use. In most elementary classrooms children don't get to use the classroom computers until the teacher is finished with the morning's large-group activities. So the first access they usually have to classroom computers is between 10:30 a.m. and 11 a.m. There is another peak time for student use, starting about 2 p.m. If a school has a departmentalized fifth grade, however, you will see a different pattern.

Seldom do you find a teacher who has managed to organize his or her classroom so that students have access to computers all day long. (We should search out these teachers and see how they manage and organize their classrooms.) It is also instructive to watch the student computer use on Fridays: you can detect the modern equivalent of "free film on Friday" teachers.

Merely clicking a button on the net-minder software starts it logging this kind of information. Then, whenever you want, you can display a bar graph of network activity by day or by week in 15-, 30-, or 60-minute increments. Obviously, comparing data in this way across schools can be very informative. In one high-tech model school where I examined this kind of data, believe it or not, most students had access to their classroom computers nearly all day long.
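For readers who want to see what such a graph involves, here is a minimal sketch in Python. It assumes the logged traffic can be exported as pairs of a timestamp and a byte count; that export format, the function names bin_traffic and text_bar_graph, and the three sample readings are all made up for illustration and do not come from any particular net-minder product.

```python
from collections import defaultdict
from datetime import datetime


def bin_traffic(samples, minutes=30):
    """Sum traffic into fixed-width time-of-day buckets.

    samples: iterable of (datetime, bytes_transferred) pairs
             (a hypothetical export format, assumed for this sketch).
    minutes: bucket width, e.g. 15, 30, or 60.
    """
    totals = defaultdict(int)
    for stamp, nbytes in samples:
        minute_of_day = stamp.hour * 60 + stamp.minute
        # Round down to the start of the bucket this reading falls in.
        start = (minute_of_day // minutes) * minutes
        label = f"{start // 60:02d}:{start % 60:02d}"
        totals[label] += nbytes
    return dict(sorted(totals.items()))


def text_bar_graph(totals, width=40):
    """Render the buckets as a plain-text bar graph, scaled to the busiest bucket."""
    peak = max(totals.values(), default=0)
    lines = []
    for label, total in totals.items():
        bar = "#" * (round(width * total / peak) if peak else 0)
        lines.append(f"{label} {bar}")
    return "\n".join(lines)


# Invented readings for illustration only, not real school data.
samples = [
    (datetime(2000, 1, 10, 10, 35), 400),
    (datetime(2000, 1, 10, 10, 50), 600),
    (datetime(2000, 1, 10, 14, 5), 500),
]

print(text_bar_graph(bin_traffic(samples, minutes=30)))
```

Changing the minutes argument to 15 or 60 regroups the same readings into finer or coarser bars, which is all the different display increments amount to.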

File server software also has monitoring and logging capability. For example, a server can automatically build a "log file" that keeps track of when people sign on and off the server. This dataset can be mined for patterns of use by teachers and students. Servers also have a "who has logged on" panel that displays this information instantaneously. …
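As a rough illustration of mining a sign-on log, the sketch below totals the minutes each account spends signed on. The log layout (one comma-separated line per event, giving a timestamp, an account name, and "on" or "off"), the function name session_minutes, and the sample entries are assumptions made for this example; a real server will have its own log format.

```python
from collections import defaultdict
from datetime import datetime


def session_minutes(log_lines):
    """Total minutes signed on per account.

    Each line is assumed to look like '2000-01-10 08:00,teacher1,on'
    (a hypothetical format invented for this sketch).
    """
    open_sessions = {}            # account -> time of the unmatched sign-on
    totals = defaultdict(float)   # account -> minutes signed on
    for line in log_lines:
        stamp_text, account, event = line.strip().split(",")
        stamp = datetime.strptime(stamp_text, "%Y-%m-%d %H:%M")
        if event == "on":
            open_sessions[account] = stamp
        elif event == "off" and account in open_sessions:
            started = open_sessions.pop(account)
            totals[account] += (stamp - started).total_seconds() / 60
    # Sign-ons with no matching sign-off are left out of the totals.
    return dict(totals)


# Invented entries for illustration only.
log = [
    "2000-01-10 08:00,teacher1,on",
    "2000-01-10 08:45,teacher1,off",
    "2000-01-10 10:30,student7,on",
    "2000-01-10 11:00,student7,off",
    "2000-01-10 14:00,teacher1,on",
]

print(session_minutes(log))
```

The same pass over the log could just as easily count sign-ons per hour or per day of the week; totaling minutes is only one of the patterns worth looking for.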