Wednesday, September 3, 2008

OpenCyc Brings Meaning to the Web

OpenCyc, a vast open-source knowledge base of concepts for the Web, was released publicly today by Cycorp, Inc. Using OpenCyc terms to represent Web content enables true semantic interoperability. While semantic web standards such as RDF/OWL provide a unifying framework for meaningful information exchange among applications, without a substantial shared vocabulary these exchanges will be quite restricted. The OpenCyc concept ontology removes this barrier by providing an extensive network of terms, in forms that can be understood by both computers and humans, which ensures that applications will have something to talk about. Taken together, the standards and the shared vocabulary now make it possible to develop web applications and mash-ups that understand, and can reason about, web content as well as enterprise and personal data and meta-data.

“You just can't put forward concepts and knowledge relationships like flinging hash,” said Michael Bergman of Zitgist LLC, which creates semantically enabled software for data integration. “Real information integration needs both context and coherence,” he said. “No other structure is in the same league as the common sense basis of Cyc; we've found its knowledge framework to be flexible enough for any customer context.”

“In the Cyc project, we've been working to develop the knowledge representations and reasoning capabilities for intelligent software that collaborates with its users. Although researchers have been using some of the results for a few years now, the recent growth of the Semantic Web presented both a need and opportunity to have a dramatically greater impact. OpenCyc is the language that can tie the disparate parts of the Semantic Web together,” said Douglas Lenat, founder of the Cyc Project, and CEO of Cycorp Inc. “By representing and sharing knowledge in the same language, Semantic Web applications can be enormously more powerful.”

OpenCyc is a wide-ranging and increasingly comprehensive ontology that describes things and events in the world in logical terms that computers can reason about. Its purpose is to provide a shared vocabulary for Web applications, allowing them to automatically reason about, and integrate, the content of web sites and web services. The OpenCyc ontology and knowledge base goes beyond tag-sets, taxonomies, and other reference vocabularies, because it has been designed and extensively tested for use in automated reasoning. As Andraž Tori, CTO of Zemanta Ltd., sees it, “Common semantic vocabularies are the missing link for the semantic web. Blogs cover an incredible range of subjects, so meaning-based content integration using the huge OpenCyc ontology can provide an amazing user experience for bloggers and other content authors.”

On the Web, OpenCyc is available as a set of stable Web addresses (URIs) that are readable both by machines, using the Semantic-Web standard OWL language, and by human beings, using a standard web browser such as Firefox or Internet Explorer. OpenCyc concepts can be accessed at


Imagine if a blogger writing about the dropping prices of iPhone clones were automatically alerted to a news release from GE on a new OLED manufacturing technology. This becomes possible as on-line content like business directories and product listings adopt the shared OpenCyc vocabulary.

The OpenCyc ontology provides relevant concepts:
“Ultra Thin Flat Panel Display”, “OLED Display”, “iPhone”, “GE”, ...
relations among these concepts:
“makesProductType”, “partTypes”, “createdBy”, “competitor” ...
and relevant background knowledge, about OLEDs being a kind of thin display screen, for example.

Business rules from on-line content producers or aggregators can also use OpenCyc terms, adding information like: “If someone is interested in a product, information about components of that product may be relevant.” Or “If someone is interested in a product component, they may be interested in a company’s competitors who also rely on that component.” In this way, Web software can link previously disparate information and rules in powerful new ways.
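As a rough illustration of how such rules could combine with shared vocabulary, here is a minimal Python sketch. It is not OpenCyc's API or the Cyc inference engine: the triples, the "isA" relation name, and the helper functions are all hypothetical, and the facts are made up to mirror the iPhone-clone and OLED example above. It applies the first rule (interest in a product makes its component types relevant), widens the result by one level of background knowledge, and then looks up the companies that make those component types.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical facts written as (subject, relation, object).
# The names echo the article's examples; they are not real OpenCyc identifiers,
# and "isA" is an assumed relation name for "is a kind of".
FACTS: Set[Triple] = {
    ("iPhone clone", "partTypes", "Ultra Thin Flat Panel Display"),
    ("OLED Display", "isA", "Ultra Thin Flat Panel Display"),
    ("GE", "makesProductType", "OLED Display"),
}


def objects(facts: Set[Triple], subject: str, relation: str) -> Set[str]:
    """Return every object o such that (subject, relation, o) is a fact."""
    return {o for s, r, o in facts if s == subject and r == relation}


def subjects(facts: Set[Triple], relation: str, obj: str) -> Set[str]:
    """Return every subject s such that (s, relation, obj) is a fact."""
    return {s for s, r, o in facts if r == relation and o == obj}


def relevant_topics(facts: Set[Triple], product: str) -> Set[str]:
    """Rule: interest in a product makes its component types relevant.

    Background knowledge then adds anything that is a kind of a relevant
    component. Only one level of "isA" is followed in this sketch.
    """
    components = objects(facts, product, "partTypes")
    kinds: Set[str] = set()
    for component in components:
        kinds |= subjects(facts, "isA", component)
    return components | kinds


def relevant_companies(facts: Set[Triple], product: str) -> Set[str]:
    """Companies that make any component type relevant to the product."""
    companies: Set[str] = set()
    for topic in relevant_topics(facts, product):
        companies |= subjects(facts, "makesProductType", topic)
    return companies


if __name__ == "__main__":
    print(sorted(relevant_topics(FACTS, "iPhone clone")))
    print(sorted(relevant_companies(FACTS, "iPhone clone")))
```

With these made-up facts, a reader interested in "iPhone clone" is linked to "GE" only because the product listing, the background knowledge, and the company directory all use the same term for the display type. That shared term is what lets the three sources connect.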


The OpenCyc concepts and relationships are derived from, and form the backbone of, the Knowledge Base in the Cyc System. Over the past 24 years, the Cyc project has been capturing and representing “common sense” knowledge – real-world concepts and the relationships among them – in a way that allows computers to reason about them. The OpenCyc ontology contains machine- and human-readable descriptions of around 150,000 concepts, ranging from the very general (“Idea”, “Physical object”, “Time”) to the very specific (“Lee Harvey Oswald”, “Kern Primrose Sphinx Moth”, “Valentine's Day”), from the sublime (“Romantic Love”, “the Mona Lisa”, “Chocolate”) to the ridiculous (“Clown”, “The Three Stooges”, “Monty Python's Flying Circus”). In addition, unlike other ontologies that provide only a handful of ways of expressing relations among concepts (such as subclass, name, knows, etc.), the OpenCyc ontology includes many thousands of types of relations, such as “biological grandmother”, “antidote”, “longitude”, “author of literary work”, and so on. The extensive scope of these terms and relations has led Péter Vaskó, CEO of iGlue, to observe: “We think that a rich ontology like OpenCyc can enable us to extend iGlue with new information or to validate the existing data using logical inference, and it has the potential to provide a base for a common semantic infrastructure as a sort of entity Yellow Pages.”

As the underlying Cyc knowledge base continues to grow, Cycorp anticipates ongoing updates to the OpenCyc ontology, ensuring that it is ever more comprehensive and up-to-date. Subsequent releases will include further integration with other ontologies and semantic web frameworks, as well as mechanisms to allow users to comment on and extend the OpenCyc ontology. Today's release of the OpenCyc semantic web endpoints also serves as the foundation for a planned roll-out of related semantic web services and applications that will leverage both OpenCyc concepts and the knowledge and inference capabilities of the full Cyc system.

OpenCyc is provided as open-source under the Creative Commons 3.0 Attribution license, allowing it to be easily used, at no cost, by both industry and individual developers and web-designers; the complete ontology, with all concepts, definitions, terms, relationships, and correspondences to natural-language terms, can be freely downloaded.


The sample OpenCyc concepts described above can be reached at:
To find any of the other 150,000 currently published OpenCyc concepts, check out the search tool at
