Friday, July 17, 2009

Semantic Web Blues

At a talk I attended last week, Ralph Swick from the W3C described the current state of Semantic Web technology, and where the W3C would like to take it. The talk was great, but a couple of things Ralph said really stood out as problems in taking the Semantic Web forward.

While describing how the Semantic Web works, Ralph used the phrase "One man's metadata is another man's data." This really struck me. The metadata that we generate automatically while taking pictures on our cameras, saving documents in Word and reading emails can be incredibly valuable to the Semantic Web.

An image of a building is not that useful on its own, but when you add the name of the photographer, the time the image was taken and the exact latitude/longitude coordinates of the camera, a computer might be able to figure out what the name of the building is. Standardized metadata like this is going to be key in making the Semantic Web useful.
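To make that concrete, here is a minimal sketch of how a program might guess the building from the camera's coordinates alone. Everything in it is my own assumption for illustration: the tiny LANDMARKS table (with approximate coordinates), the guess_building helper and the 0.5 km cutoff are all hypothetical, and a real system would query a proper gazetteer rather than a hard-coded dictionary.

```python
import math

# Hypothetical lookup table with approximate coordinates (latitude, longitude).
# A real system would query a gazetteer or map service instead.
LANDMARKS = {
    "Eiffel Tower": (48.8584, 2.2945),
    "Empire State Building": (40.7484, -73.9857),
    "Sydney Opera House": (-33.8568, 151.2153),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points on Earth."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def guess_building(photo_meta, max_km=0.5):
    """Return the nearest known landmark, or None if nothing is within max_km.

    photo_meta is assumed to be a dict with "lat" and "lon" keys, as might be
    read from an image's embedded GPS metadata.
    """
    lat, lon = photo_meta["lat"], photo_meta["lon"]
    name, distance = min(
        ((n, haversine_km(lat, lon, *coords)) for n, coords in LANDMARKS.items()),
        key=lambda pair: pair[1],
    )
    return name if distance <= max_km else None
```

Calling guess_building({"lat": 48.8580, "lon": 2.2950}) picks the Eiffel Tower entry, because the camera was only a few dozen meters away from it. Strip the coordinates from the image and the program has nothing to work with.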

Unfortunately, the culture of the web today doesn't recognize this. Metadata is considered useless. Companies like Google and Yahoo even recommend stripping it from images to decrease page loading times. By that reasoning, the cost of moving a few extra bytes over the wire outweighs the context gained from knowing where an image was taken.

This culture of minimization on the web has to change before the Semantic Web can take off. Next time you start to strip the metadata from your files to save space, remember that one man's garbage is another man's treasure. The few bytes you're throwing away could be incredibly useful to someone else. With today's hard drive prices, keeping an extra 10 or 100 megabytes around isn't costing you very much.
