Magazine article Nieman Reports

Teach Your Computers Well: Avoiding the Risks of Algorithmic Bias

Article excerpt

Systems like Wordsmith and Quill are natural language generation (NLG) platforms. That means they're designed to turn data into human-sounding prose. NLG is an active area of technology development, aimed at helping translate the growing stores of structured data into human-usable information.

Siri, Apple's conversational assistant, uses natural language generation to answer simple questions and respond to user requests. But Siri is also using another technology that Quill and Wordsmith aren't: natural language processing (NLP).

Natural language processing is a class of artificial intelligence tools that let computers analyze unstructured data--like newspaper stories--to identify patterns. Those patterns can then be applied to new situations. That's how Siri figures out whether you're looking for a restaurant or a pet shop when you ask her for the nearest "hot dog." Then, Siri uses natural language generation to direct you to the nearest frank.

By looking at large volumes of data, NLP systems develop statistical models about how often humans use specific vocabulary, syntax, and phrases in a particular context. To do so, they depend on training data sets, known as corpora, that have something in common--e.g., all of the text is weather reports or earnings statements.
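
To make the idea of a statistical model built from corpora concrete, here is a minimal sketch in Python. It is not how Siri or any commercial system works; it is a toy word-count model, and the two tiny corpora, the labels, and the function names (`train`, `score`, `classify`) are all invented for illustration. Each corpus shares a theme, the model counts how often each word appears in it, and a new request is matched to whichever corpus makes its words most likely.

```python
from collections import Counter
import math

# Hypothetical toy corpora, invented for this illustration only.
# Each one has "something in common": food talk versus pet talk.
CORPORA = {
    "restaurant": [
        "grab a hot dog with mustard at the stand",
        "the nearest hot dog stand sells fries",
        "order a hot dog and a soda for lunch",
    ],
    "pet_shop": [
        "a hot dog needs water and shade",
        "the vet said the dog was too hot after the walk",
    ],
}

def train(corpora):
    """Count how often each word appears in each corpus."""
    return {
        label: Counter(word for doc in docs for word in doc.split())
        for label, docs in corpora.items()
    }

def score(counts, query, vocab_size):
    """Log-probability of the query under one corpus's word counts.

    Add-one smoothing keeps unseen words from zeroing out the score.
    """
    total = sum(counts.values())
    return sum(
        math.log((counts[word] + 1) / (total + vocab_size))
        for word in query.split()
    )

def classify(models, query):
    """Pick the corpus whose word statistics best fit the query."""
    vocab = set().union(*models.values())
    return max(models, key=lambda label: score(models[label], query, len(vocab)))

models = train(CORPORA)
print(classify(models, "nearest hot dog stand"))  # restaurant
print(classify(models, "dog needs water"))        # pet_shop
```

The model has no notion of what a hot dog is. It leans toward "restaurant" only because words like "nearest" and "stand" showed up more often in the food corpus it was trained on, so whatever patterns the training text contains are the patterns the model will repeat.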

The models generated by NLP tools will replicate the same biases inherent in the corpora. Usually, that's good: You want Siri to bias her response toward an edible treat, not a panting pet. …
