Magazine article Government Finance Review

Text Analysis: A Simple "Big Data" Tool for Local Government

Article excerpt

Text analysis is an intuitive, low-cost tool for quickly analyzing written documents. Simple algorithms score the readability of a given text passage, helping governments determine how easily staff, officials, and the general public can understand their documents. A limited analysis of budget documents from major U.S. cities shows that comprehensive annual financial reports (CAFRs) are often at or above a college reading level, while popular annual financial reports (PAFRs) range from an 8th to 12th grade level.

This article offers suggestions for making documents more accessible and readable, and provides a case study of text readability measurement as used by the City of Dubuque, Iowa, in a performance-measurement context.


Text analysis is a broad term that is defined in a variety of ways. It generally refers to a set of procedures that analyze written text and produce scores that capture different dimensions of the text, such as readability. Text analysis examines the structure and length of sentences and words through classification schemes such as an automated count of multisyllabic words and a numeric measure of the text's grade level. A paragraph with many multisyllabic and technical words and a high grade level is challenging for the general population to read, so it may be necessary to change the wording and structure to make interpretation consistent across segments of the population.

A common measure is the Flesch Reading Ease score, which rates text on a scale from 0 to 100, with 0 being very difficult to read and 100 being very easy. It can be applied to most types of text, from newspaper articles to technical reports. When it was published in 1949, creator Rudolf Flesch estimated that fewer than 5 percent of all U.S. adults could read at a college level, while 93 percent could read at a 5th grade level. (1) These figures have likely shifted, but the average adult still reads at approximately a 6th to 8th grade level. Microsoft Word can compute the Flesch Reading Ease score for any written document. For example, the score for this article is 35.9, with a grade level of 13.8.
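To make the scoring concrete, the following is a minimal sketch of how the Flesch Reading Ease score and the related Flesch-Kincaid grade level can be computed from sentence, word, and syllable counts. The formulas are the standard published ones; the syllable counter is a rough vowel-group heuristic assumed here for illustration, and the function names are hypothetical. Results will therefore differ somewhat from those reported by Microsoft Word or other tools.

```python
import re


def count_syllables(word: str) -> int:
    """Rough estimate (an assumption, not a dictionary lookup):
    count runs of vowels and discount a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) for a passage."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Higher scores mean easier text (roughly a 0 to 100 scale)."""
    sentences, words, syllables = text_stats(text)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate U.S. school grade level needed to read the text."""
    sentences, words, syllables = text_stats(text)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    sample = (
        "Fiscal reserves work like a savings account. "
        "They help the city maintain services in hard times."
    )
    print(round(flesch_reading_ease(sample), 1))
    print(round(flesch_kincaid_grade(sample), 1))
```

Because both formulas depend only on average sentence length and average syllables per word, shorter sentences and shorter words always raise the ease score and lower the grade level.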

Many other measures exist, and many programs quickly automate readability scoring. More advanced techniques are also available, such as sentiment analysis, an algorithm that determines whether the writing expresses particular sentiments (e.g., positive or negative). These advanced techniques are highly complex and require an extensive understanding of linguistics and programming, so without expert guidance, governments should probably focus on readability scores rather than attempting sentiment analysis.


An analyst can easily measure the readability of any finance document. PAFRs should be analyzed because they are specifically created for the general public. GASB Statement No. 34, Basic Financial Statements--and Management's Discussion and Analysis--for State and Local Governments, also recommends analyzing the CAFR Management's Discussion and Analysis (MD&A) section to make sure it is readable at an appropriate level. (2) This is a worthwhile goal, although governments need to take care not to change the original intent when simplifying the technical words and complicated sentence structures.

If a PAFR's reading level is too high for its intended audience, readers might misunderstand the information. Readability can be increased by shortening sentences, avoiding complex and technical accounting phrases, using active rather than passive voice, and explaining things with simple references and metaphors that anyone can understand. Sometimes, linking budget concepts to actual policy outputs (e.g., saying fiscal reserves are the equivalent of a savings account that can maintain service during financial uncertainty) can be more effective than trying to provide a technical discussion.
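As a rough illustration of how an analyst might locate passages to revise, the sketch below flags sentences that exceed a word limit and words with three or more estimated syllables. The thresholds (20 words, 3 syllables), the function names, and the vowel-group syllable heuristic are all assumptions chosen for illustration; they are not drawn from GASB guidance or from any particular tool, and the heuristic will sometimes overcount.

```python
import re


def count_syllables(word: str) -> int:
    """Rough estimate: count runs of vowels, discount a silent trailing 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def revision_candidates(text: str, max_words: int = 20, min_syllables: int = 3):
    """Return (long_sentences, long_words) worth a second look.

    max_words and min_syllables are illustrative defaults, not standards.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    long_sentences = [
        s for s in sentences if len(re.findall(r"[A-Za-z]+", s)) > max_words
    ]
    words = re.findall(r"[A-Za-z]+", text)
    long_words = sorted(
        {w.lower() for w in words if count_syllables(w) >= min_syllables}
    )
    return long_sentences, long_words


if __name__ == "__main__":
    draft = (
        "We keep reserves. "
        "Unassigned fund balance provides liquidity during periods of "
        "macroeconomic uncertainty and unanticipated revenue shortfalls."
    )
    sentences, words = revision_candidates(draft, max_words=10)
    print(sentences)
    print(words)
```

A flagged sentence or word is only a prompt for human review; some technical terms cannot be replaced without changing the meaning.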

These methods can also be used for the PAFR's overall structure. …
