by ls
14. May 2009 08:29
For about a month now, we've been working on a publishing system that will embed (hopefully) useful inline links in our stories. This isn't new - other newspaper websites (The New York Times, for example) have been doing it, some for years. But it's new to us, and it's certain to raise awareness (and maybe hackles) in the newsroom.
My biggest challenge is training the entity extraction system. The term "Jewel" means nothing in the News/Politics section, but has a definite meaning in the Life/Music section. Does "Davenport" refer to Iowa, South Carolina or someone's last name? Is "Michael Jordan" the retired basketball star or the former CEO of EDS? If you're in the Sports section, the answer should be obvious. Can we expect an automated system to work that out? I hope so - I'm working with a company called mSpoke to try to get this accomplished.
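To make the idea concrete, here is a rough sketch of section-aware disambiguation. This is not how mSpoke's system works (I can't speak to its internals); the lookup table, the "Business" section name and the `resolve_entity` function are all made up for illustration. It only shows the principle that the same term can link to different things, or to nothing, depending on where the story runs.

```python
# Hypothetical table: term -> {section: entity the term most likely refers to}.
# Entries and section names are illustrative, not taken from any real system.
SECTION_SENSES = {
    "Jewel": {
        "Life/Music": "Jewel (singer)",
    },
    "Michael Jordan": {
        "Sports": "Michael Jordan (basketball player)",
        "Business": "Michael Jordan (former EDS CEO)",  # assumed section name
    },
}


def resolve_entity(term, section):
    """Return the entity to link for `term` in `section`, or None to skip the link."""
    senses = SECTION_SENSES.get(term, {})
    # No entry for this section means the term is too ambiguous (or meaningless)
    # here, so the safest choice is not to link it at all.
    return senses.get(section)


if __name__ == "__main__":
    for term, section in [
        ("Jewel", "Life/Music"),
        ("Jewel", "News/Politics"),
        ("Michael Jordan", "Sports"),
    ]:
        print(term, "in", section, "->", resolve_entity(term, section))
```

A real system would need more than a hand-built table, since a story in one section can still mention an entity from another, but the section is a cheap first signal.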
The new links will go live on the site sometime next week.