The emerging Semantic Web

Posted by on Sunday July 22 2012

Could the Semantic Web have useful meaning? A couple of years ago I had already pushed it into the same drawer that held my SGML org charts. Then I encountered Microformats and Linked Open Data. Slowly, it occurred to me that while the extensive universe of the Semantic Web as originally envisioned might not be particularly useful, the practice of embedding more and more semantic information into our web pages makes an awful lot of sense.

The problem that the “Semantic Web” attempts to solve is that of context. Putting information on the web is important, but each web page exists in isolation. There is nothing beyond proximity to help put pages in context: a page in the “Encyclopedia” section of the Jewish Women’s Archive website is probably a biography or an article about a general Jewish women’s subject. If the article is on our “My Bat Mitzvah Story” website, then it is probably addressed to tweens, girls between 11 and 13. None of this is necessarily apparent to web spiders or, for that matter, to casual visitors to our site.

So, the Jewish Women’s Archive has embarked on a new project to add extensive metadata, and to standardize that metadata, across its various exhibits, biographies, and features. The core of the Semantic Web relies on “metadata”: the background information about the biographies and accompanying media on our site. “Metadata” is the term used to describe the critical information about who created the article or media, what rights we have to use the item, when it was last updated, with whom we can share it, who owns the item, what the article covers or how it is categorized, how it fits in with other articles, etc.
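To make that list of fields a little more concrete, here is a minimal sketch in Python of what one standardized metadata record might look like. The class name `ItemMetadata`, the field names, the helper `missing_fields`, and every value shown are placeholders invented for this illustration; they are not our actual schema or data.

```python
from dataclasses import dataclass, field


@dataclass
class ItemMetadata:
    """One record per article or media item (hypothetical field names)."""
    title: str
    creator: str          # who created the article or media
    rights: str           # what rights we have to use the item
    last_updated: str     # when it was last updated
    owner: str            # who owns the item
    subjects: list = field(default_factory=list)  # what the item covers
    related: list = field(default_factory=list)   # how it fits with other items


def missing_fields(item):
    """List the fields that are still empty, so gaps can be filled in."""
    return [name for name, value in vars(item).items() if value in ("", [], None)]


# Placeholder values only; none of this is real catalog data.
record = ItemMetadata(
    title="Example biography",
    creator="(placeholder author)",
    rights="(placeholder rights statement)",
    last_updated="2012-01-01",
    owner="(placeholder owner)",
    subjects=["science", "biography"],
)

print(missing_fields(record))
```

The point of standardizing is the second half: once every item has the same set of fields, it becomes easy to check which records are still incomplete.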

Search engines such as Google and Bing frequently deliver people to our pages about Lillian Wald or Bobbie Rosenfeld or Gertrude Elion. We are prime resources for those subjects and the search engines know it. But you have to know to ask. If you are researching public health, or sports, or scientists, there is no easy way for a search engine to make a connection between the women I just mentioned and those subjects, except to the degree that those terms appear somewhere in the web pages, and to the extent that someone searching for information uses exactly those terms.

The Semantic Web addresses “context” by providing behind-the-scenes mark-up to note “relationship” information in a form that search engines like Google and Bing are increasingly paying attention to. We can record information such as “Gertrude Elion was a scientist” and “Gertrude Elion won the Nobel Prize” in ways that the search engines and other Semantic Web tools (when they appear) will be able to understand and combine with other semantic information around the web in order to answer search queries more subtly and more completely.
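As a rough illustration of what such “relationship” statements look like, here is a minimal sketch in Python. It models each statement as a subject, predicate, object triple, which is the general shape used by Linked Open Data. The predicate names (`occupation`, `award`, `field`) and the helper `subjects_with` are made up for this example and do not come from any real vocabulary or from the markup on our site.

```python
# Each statement is a (subject, predicate, object) triple.
# Predicate names are hypothetical, chosen only for illustration.
triples = [
    ("Gertrude Elion", "occupation", "scientist"),
    ("Gertrude Elion", "award", "Nobel Prize"),
    ("Bobbie Rosenfeld", "occupation", "athlete"),
    ("Lillian Wald", "field", "public health"),
]


def subjects_with(predicate, obj, statements=triples):
    """Return every subject that has the given predicate/object pair."""
    return sorted({s for (s, p, o) in statements if p == predicate and o == obj})


# Someone researching scientists never has to type the name "Elion".
print(subjects_with("occupation", "scientist"))
```

A real search engine does far more than this, but the idea is the same: once the relationship is recorded explicitly, a query about a subject can find the person, rather than the other way around.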

Think of how much this changes the way that people can understand the world. Instead of directing a query about sports, for instance, to whoever has the most widely-read page on sports in general, Google can now answer a question about “women in sports” to include the Jewish Women’s Archive pages on Bobbie Rosenfeld and other women. “Jewish Olympians” likewise returns information beginning with Rosenfeld, Lillian Copeland, Charlotte Epstein, on up to more recent Olympians such as Dara Torres.

In essence, the Semantic Web means that we don’t have to wait for someone to use the information we provide to write women into history; once that information is properly coded and on the web, and as browsers and search engines take better advantage of this technology, we are writing women directly into search results.

The Semantic Web, as originally described, is complex and cumbersome. It would take such significant resources to encode that small organizations such as ours might be left out simply because we lack the programming and archival resources. We’re not alone. For everyone, big and small, in an age when archives are looking to “lighten up” some traditional processes, the idea of moving backwards to record more information in more detail flies in the face of reality.

This has incredible implications, not just for the Jewish Women’s Archive, but for all cultural heritage organizations on the web. The Semantic Web, as made real with lightweight means such as Linked Open Data (LOD) and Microformats, means that an inclusive, broad knowledge of who we are may finally be at hand.

Cross-posted with the Jewish Women’s Archive

Filed under: Advocacy and Metaverse