Dec 9 / scholarscooperative

Digital Publishing and the Semantic Web?

As information enthusiasts, whether we’re librarians, researchers, or other information connoisseurs, we’re always looking for new ways to access information.  This digging for knowledge very often leads us to various web pages, or perhaps to PDF or HTML versions of scholarly materials.  In this respect, we’re capable of using the web to find information, but the most important tool for getting to that information is you, the searcher, not necessarily the machine you’re using.

Web pages are created by humans and designed to be read and understood by people, not machines.  The semantic web is a system in which machines can “understand” and respond to human inquiries based on semantic structures, that is, the relationships between different words.  Its usefulness lies in helping computers “read” and use the web.  Consider this basic example: a computer can’t recognize your household pet playing fetch in the yard the way you can.  But it can be told that there is an object called ‘dog’ which has the attributes ‘tail’ and ‘fur’, and that one example of this object is ‘comet’ (the name of your dog).  The machine doesn’t need to understand the human sense of the words, i.e., what they mean, only the relationships between the symbols used.  In practice, this means including semantic content (metadata) in web pages, using web “languages” and tools specifically designed for data.  These technologies work together as a structure that lets computers look for information and define relationships.  Web developers can use these machine-readable descriptions to add meaning to content so that the machine can process knowledge itself, in a way that resembles inference, with the goal of obtaining more meaningful results.
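To make the dog example a little more concrete, here is a minimal sketch in Python.  It is only an illustration: the relationship names (“isA”, “hasAttribute”) and the helper functions are made up for this post, and a real semantic web application would use standard data languages and proper identifiers rather than plain strings.  The point is simply that the machine can conclude that ‘comet’ has a tail and fur without knowing what any of those words mean.

```python
# Each fact is a (subject, relationship, object) statement.
# All names here are hypothetical and chosen only for illustration.
triples = {
    ("dog", "hasAttribute", "tail"),
    ("dog", "hasAttribute", "fur"),
    ("comet", "isA", "dog"),
}


def infer_attributes(facts):
    """Apply one rule: if X isA C and C hasAttribute A, then X hasAttribute A."""
    inferred = set()
    for subject, relationship, obj in facts:
        if relationship != "isA":
            continue
        for cls, rel, attribute in facts:
            if cls == obj and rel == "hasAttribute":
                inferred.add((subject, "hasAttribute", attribute))
    # Return only the statements that were not already stated explicitly.
    return inferred - facts


def attributes_of(entity, facts):
    """List every attribute of an entity, whether stated or inferred."""
    known = facts | infer_attributes(facts)
    return sorted(a for s, r, a in known if s == entity and r == "hasAttribute")


print(attributes_of("comet", triples))  # ['fur', 'tail']
```

Nothing in the code “knows” what a dog is.  It only follows the relationships between symbols, which is the kind of processing the semantic web aims to make possible at the scale of the whole web.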

The practice of exposing, sharing, and connecting data, information, and knowledge, known as linked data, could potentially lead to significant applications in our daily activities and, of course, in publishing and scholarly impact.  Consider scientific discovery: sharing data across the web could help combat a range of diseases that an individual or educational institution couldn’t gather sufficient information about alone.

Many institutional repositories don’t have a designated or even minimally desirable way to capture and archive datasets.  It’s certainly a conversation that we have within the WSU library system often.

PDF and HTML versions of scholarly output have their own unique limitations, so it is inevitable that the still-developing model of the semantic web will bring challenges of its own.

What do you think?!  Here’s an example: GoPubMed