A few resources on linked data

The amount of material on the web about Linked Data, Open Data, and the Semantic Web is a bit staggering – particularly as the volume seems to increase every day.  Here are just a few of the resources we have found useful so far.

The number one link, ahead of this list, is the LOD-LAM Zotero Group bibliography (maintained by DLF, LODLAM and ALA’s Linked Library Data Interest Group).  Most of the newer resources I (Laura) have come across have been added to that bibliography, and I invite you to join and add more there – we all need a shared list of useful resources!

Contents:  Introductions | Texts and Tutorials | Communities and blogs | Resource links | Data sources | Vocabularies | Projects | Web Tools | OS Software | Commercial

Introductions to Linked Data

Texts, Tutorials

  • Linked Data:  Evolving the Web into a Global Data Space, 2011, by Tom Heath and Christian Bizer.  Good, readable and comprehensive overview of the technology and state of linked data.
  • Semantic Web for the Working Ontologist by Dean Allemang and Jim Hendler (book, paper only right now).  The 2008 first edition is cataloged for “Office Use” and various personal copies are floating around; Laura has a personal copy of the 2011 2nd edition, which incorporates revisions to OWL, and an office use copy is on order.  This is a meaty and more technical book about actually applying RDF and ontologies.
  • Some W3C official documentation:  The Semantic Web page provides an overall introduction and links to overviews and further links to standards activity in 5 areas (Linked Data, Inference, Vocabularies, Query, and Vertical Applications);  RDF Current Status page links to official standards documentation and primers; the RDF wiki page also links to tools, resources and revision discussions.  RDFa Current Status and other standards are there also.  Published standards usually include overviews, such as the RDF Primer and OWL Primer.
  • Introducing Linked Data and The Semantic Web provided by LinkedDataTools.com, an informational site which also provides some open source tools (with promise of more tools under development, perhaps with commercial features).  An alternate take on the basics, with illustrations.  Covers
  • How to Publish Linked Data on the Web is a tutorial by Tom Heath, Michael Hausenblas, Chris Bizer, Richard Cyganiak, Olaf Hartig, from ISWC2008, Karlsruhe, Germany. For those who, like me, would rather watch something than read.
  • Free your metadata, a set of video tutorials and instructions produced at the Multimedia Lab, Ghent University and Free University of Brussels, by 3 enthusiastic young men, teaches how to use Google Refine with extensions to clean data and reconcile vocabularies.
  • Slides from several lectures (starting with Lecture 7) in an “Advanced Information Systems” course from Durham University, taught by William W. Song and Patricia Shaw, are about RDF, RDFS, Ontologies, OWL, Reification and other topics of interest.  These could provide another “take” on the concepts.
  • An Introduction to RDF(S) and a Quick Tour of OWL, a University of Manchester ontology tutorial.

Communities, blogs, organizations

  • LOD-LAM (Linked Open Data in Libraries, Archives and Museums).  The website documents a summit meeting held in June, 2011, convened by the Internet Archive, Sloan Foundation and National Endowment for the Humanities, and a blog continues to document additional LOD-LAM sessions (Washington, London, New Zealand…).  A Google Group is open to anyone, and additional presentations and “summits” continue, led by Jon Voss (@jonvoss).
  • W3C Semantic Web Activity
  • ALCTS/LITA Library Linked Data Interest Group.  This is an ALA Connect site; some content should not require a login to see, but anyone (including non-ALA members) can create a login for ALA Connect.  Right now there’s not much there; I’ll activate the link when the meeting notes and resource pages are up.
  • IGELU (International Ex Libris Users Group) Linked Open Data SWIG.  Group formed in 2011 with the goal to “achieve essential linked open data features, as well as enhanced data-services functionality and APIs, in all Ex Libris products where appropriate, both from the data publishing and from the data consuming perspective. In practice this means focusing only on next generation products and platforms: URDD/Primo and URM/Alma. Existing API’s, such as X-Server for Aleph, Metalib, SFX and RESTful APIs can also be considered.”
  • Planet RDF is a group blog (aggregation of blogs) focused on semantic web/linked data generally
  • semanticweb.com is geared toward the business community; it publishes a newsletter, blog-like site, and information about events such as the SemTech conference.  Interesting place to get the business perspective.
  • Lotico is a wiki space for a community of Semantic Web enthusiasts. It showcases events and “meetups”.
  • 60+ Semantic Web Blogs list – a blog post from 4 years ago that lists blogs wholly or partly about linked data
  • AI3::Adaptive Information – Mike Bergman of Structured Dynamics has a blog and website primarily devoted to Semantic Web/Linked Data.  This includes the “Sweet Tools” list of over 1,000 semantic web tools, and the SweetPedia, a sporadically updated compilation of semantic technology research articles focusing on Wikipedia/DBPedia.
  • LAMs Metadata, blog created to fuel a workshop at DLF Fall Forum 2011 (November, Baltimore) – “Moving Forward: Examining the Needs to Re-tool the LAMs Data.”  Includes notes from that meeting. 

Information, reports, “webliographies,” other resources

Data sources

  • The Data Hub is a searchable database (using the CKAN application) of mostly openly available datasets – including Linked Data sets.
  • Sindice, the Semantic Web Index, allows querying of an ever-growing collection of triples/data sets by term, property, SPARQL query, and other means.  It also includes other tools and a cool “Latest data” scroll that gives a picture of the speed of production.
  • DBpedia – information about Online Access to the DBpedia linked data (derived from Wikipedia).
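
For anyone who has not sent a SPARQL query to one of these sources before, here is a minimal sketch of what a request looks like under the standard SPARQL protocol, where the query travels in a URL parameter named `query`.  The endpoint address and the example resource URI are assumptions for illustration (check the DBpedia site for the current endpoint), and the helper name `build_sparql_url` is hypothetical.  The sketch only builds the URL; it does not contact any server.

```python
from urllib.parse import urlencode

# Assumption: DBpedia's public SPARQL endpoint address; verify before use.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query: ten properties and values of one DBpedia resource.
EXAMPLE_QUERY = (
    "SELECT ?p ?o WHERE { "
    "<http://dbpedia.org/resource/Linked_data> ?p ?o "
    "} LIMIT 10"
)


def build_sparql_url(endpoint: str, query: str) -> str:
    """Return a GET URL that carries the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_sparql_url(DBPEDIA_ENDPOINT, EXAMPLE_QUERY)
print(url)
```

Pasting the printed URL into a browser, or fetching it with any HTTP client, should return the result set; the response format depends on the server and on the Accept header you send.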

Vocabularies and Vocabulary sources

Core RDF schemas

     Sources/links for vocabularies, ontologies, schemas

     Vocabulary visualization tools

  • Ontology Browser – plug in the URL of an ontology, or navigate ones provided (note:  availability seems to be sporadic)

Vocabularies in RDF published by Library and Archive communities

  • LC Subject Headings
  • LC Name Authority File
  • LC Genre/Form Terms
  • Thesaurus of Graphic Materials
  • MARC Relators
  • MARC Countries
  • MARC Geographic Areas
  • MARC Languages
  • ISO639-1 Languages
  • ISO639-2 Languages
  • ISO639-5 Languages

     Vocabularies in RDF from other communities

  • UMBEL (Upper Mapping and Binding Exchange Layer) Vocabulary and Reference Concept Ontology
  • FOAF (Friend of a Friend)
  • RSS (RDF Site Summary) 1.0
  • AtomOWL Vocabulary Specification
  • SIOC – Semantically Linked Online Communities – Core Ontology Specification
  • DOAP – Description of a Project
  • VOID – Vocabulary of Interlinked Datasets
  • Basic Geo (WGS84 lat/long) Vocabulary
  • BIO:  A vocabulary for biographical information
  • Good Relations:  The Professional Web Vocabulary for e-commerce
  • BIBO:  The Bibliographic Ontology. Note this was produced by a group spearheaded by Zitgist, Bruce D’Arcus, the Zotero team and Michael K. Bergman.
  • CC REL  Creative Commons Rights Expression Language
  • Open Graph Protocol (used by Facebook)
  • Music Ontology Specification:  http://musicontology.com/
  • Event Ontology
  • Social Semantic Web Thesaurus brought to you by PoolParty, a component of its free WordPress linked data plugin.
  • NEPOMUK Ontologies initially designed for the NEPOMUK (personal desktop information organization) project, may be generally useful.  Include NRL Representational Language, NAO Annotation Ontology, NIE Information Element set, including Core Ontology, File Ontology, Contact Ontology, Message Ontology, Calendar Ontology,  EXIF Ontology, ID3 Ontology; PINO Personal Information Model, Task Model Ontology.
  • SUMO (Suggested Upper Merged Ontology) and its domain ontologies form the largest formal public ontology in existence today. They are being used for research and applications in search, linguistics and reasoning. SUMO is the only formal ontology that has been mapped to all of the WordNet lexicon.  It is owned by IEEE and freely available under a GNU public license. An RDF translation is provided.
  • CIDOC-CRM (Conceptual Reference Model) is a high level vocabulary for describing cultural objects, developed by the museum community (ICOM).  RDFS versions are available.

     Microformats, microdata, and related vocabularies

  • Microformats.org has a list of microformat specifications; perhaps of particular interest to academic librarians is the wiki page on Citation Formats such as BibTex and RIS.
  • COinS (ContextObjects in Spans)
  • Schema.org (note particularly the Data Model and “Mapping to RDFa 1.1” on this page).

Projects and Reference implementations

  • LOCAH (Linked Open Copac Archives Hub), JISC (UK) project to make combined data from the Copac union catalog and ArchivesHub aggregation of archival material, available as linked data (has published some datasets)
  • COMET (Cambridge Open Metadata) project, has published some datasets based on MARC records
  • Civil War Data 150 – a demonstration project, in process.
  • NEPOMUK, Networked Environment for Personalized, Ontology-based Management of Unified Knowledge, a.k.a. the Social Semantic Desktop, is a project to develop a comprehensive solution for extending the personal desktop into a collaboration environment supporting personal information management and sharing
  • SIMILE (Semantic Interoperability of Metadata and Information in unLike Environments) is a collection of projects hosted by MIT Libraries and CSAIL, seeking to enhance inter-operability among digital assets, schemata/vocabularies/ontologies, metadata, and services, primarily by extending DSpace functionalities using RDF and linked data techniques.
  • HIVE is an IMLS funded project at the Metadata Research Center in Chapel Hill.  They use an automatic metadata generation approach to dynamically integrate discipline specific vocabularies using SKOS.
  • UniProt is a unified data store of data about proteins including articles and information resources, now based on linked data.  SPARQLing UniProt RDF: Using RDF based technologies to aid biological curation efforts is a video of a presentation by Jerven Bolleman at BioHackathon 2011 about UniProt’s use of linked data and their quality control processes using SPARQL queries.

Web-based tools and services (see Vocabularies page for web vocabulary search tools)

  • OpenLink Data Explorer can be used to query a Virtuoso triple store, or to navigate linked data from a URI.
  • LinkSailor is designed to present linked data (and additional data found in the Talis platform) in an eye-readable form.
  • Semantic Radar is a Firefox plugin that will detect presence of RDFa, SIOC, and other embedded RDF data in web pages and display a symbol in the lower browser frame that you can click to see the data.
  • RDF Book Mashup was an early demonstration application developed at Freie Universität Berlin by Chris Bizer and Tobias Gauss.  You can enter a book title or author, get a results list, and see the RDF mashup including Amazon and other data.  It’s slooow but still working.
  • FactForge is a “search engine” for linked data, brought to you by Ontotext Corp.

Open source (or free) software and software projects

  • Sesame, provided by openRDF.org and supported by Aduna, a Dutch software company, is a Java framework for processing RDF data (parsing, storing, inferencing and querying).  Available under a BSD-style license.
  • OWLIM is a family of semantic repositories, which work with JENA or Sesame.  The Lite version is free, others are commercial software.  Supports reasoning and semantics, RDFS, OWL2 RL, OWL2  QL.
  • JENA is a Java framework for building Semantic Web applications.
  • RDFLib, a Python library for working with RDF.
  • Protege Ontology Editor, free software provided by Stanford University, includes the Protege OWL editor extension. Tightly integrated with JENA and has an open-source Java API.
  • OpenStructs, “…an education and distribution site dedicated to open source software for converting, managing, viewing and manipulating structured data. Structured data can represent any existing data struct from the simplest attribute-value pair formats to fully specified relational database schema. Material on this OpenStructs site ranges from individual tools to complete open semantic frameworks (OSF) with which to build comprehensive semantic instances.  All OpenStructs tools are premised on the canonical RDF (Resource Description Framework) data model.”  Brought to you by the company Structured Dynamics.
  • OpenSEAS  is “…a framework for the enterprise to establish a coherent, consistent and interoperable layer across its information assets. It is compliant with the open source Method for an Integrated Knowledge Environment via MIKE2.0’s Semantic Enterprise Solution Offering.  Open SEAS has been developed for enterprises desiring to initiate or extend their involvement with semantic technologies. It is inherently incremental, low-cost and low-risk. It was originally developed by Structured Dynamics, and then contributed to MIKE2.0 as an open source solution offering.”
  • Any23 (Anything to Triples):  Software development project to create a web service and set of tools to extract structured data in RDF format from a variety of Web documents.  Written in Java and licensed under Apache License 2.0.
  • WordPress plugin from PoolParty allows you to import a SKOS thesaurus (the Social Semantic Web thesaurus or another reachable through SPARQL endpoint) and use it for publishing concept links in your WordPress blog.
  • RDF Extension for Google Refine adds a graphical user interface (GUI) for exporting data of Google Refine projects as interlinked RDF data. Data can be reconciled against any SPARQL endpoint or RDF dump. The reconciled data can then be exported as RDF based on a template graph.  Hosted by DERI.  Updated 11/28/2011 to work with latest version of Google Refine (2.5).
  • The SIMILE RDFizers page has links to downloadable tools  written at MIT and elsewhere (MIT’s run from a command line and require a Java VM and Apache Maven) for converting data in other formats to RDF (including MARC/MODS/RDF)
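
If “storing and querying” RDF sounds abstract, here is a toy sketch in plain Python of the core idea the frameworks above build on: data held as (subject, predicate, object) triples and retrieved by pattern matching with wildcards.  This is not the API of Sesame, JENA, RDFLib or any other tool in this list; the names `Triple` and `TinyTripleStore` and the example.org URIs are made up for illustration, and real frameworks add parsing, indexing, inference and SPARQL on top.

```python
from typing import Iterator, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TinyTripleStore:
    """Toy in-memory triple store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._triples = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Store one statement; duplicates are ignored because a set is used."""
        self._triples.add(Triple(subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples that fit the pattern; None acts as a wildcard."""
        for triple in self._triples:
            if subject is not None and triple.subject != subject:
                continue
            if predicate is not None and triple.predicate != predicate:
                continue
            if obj is not None and triple.obj != obj:
                continue
            yield triple


# The FOAF namespace is real; the people and their URIs are invented.
FOAF = "http://xmlns.com/foaf/0.1/"
store = TinyTripleStore()
store.add("http://example.org/alice", FOAF + "name", "Alice")
store.add("http://example.org/bob", FOAF + "name", "Bob")
store.add("http://example.org/alice", FOAF + "knows", "http://example.org/bob")

# "Who has a name?" is a pattern with only the predicate fixed.
names = sorted(t.obj for t in store.match(predicate=FOAF + "name"))
print(names)
```

A SPARQL query is essentially a set of such patterns joined on shared variables, which is why the same data can be queried in ways its publisher never planned for.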

Commercial firms providing software/services

  • Talis has been involved in providing RDF-based services for many years; they created data.gov.uk; they have sold their Library division to Capita Group plc to focus on other services.
  • Kasabi is a startup company under the Talis umbrella.  “Kasabi is a marketplace for people publishing and looking for data. As a complete platform for the hosting and publishing of Linked Data, Kasabi brings together anyone involved with data: businesses, developers, individuals, organisations, government etc.”  Featured datasets hosted run a gamut from data on Lego sets for sale, to BC Music data on artists, records and music reviews.
  • Open Calais is a web service provided by Thomson Reuters for automated text analysis; it provides RDF categories and tags using the Open Calais vocabulary.  The basic service is free; paid upgrades are available for advanced features such as guaranteed service levels.
  • Deploying Linked Data using Open Link Virtuoso is a document that describes the process of deploying Linked Data into the existing Web. It discusses some of the difficulties faced in exposing RDF data and in bridging the “Linked Data Web” and the traditional “Document Web.”  Virtuoso is one of OpenLink’s suite of products and provides a linked data server and triplestore along with other services such as relational and XML database management and web services.
  • Semantic Web Company located in Vienna, Austria, provides knowledge management software and services including PoolParty linked data tools (also Atlassian Confluence content management system).
  • Cambridge Semantics uses linked data technology under the hood to provide business information (“operational intelligence”) data integration, collaboration and analysis tools, notably in the biotechnology sphere.
  • Sindice.com is a startup to provide commercial services on top of the Sindice Semantic Web Index and its infrastructure, which was developed primarily through government and non-profit funding.
  • TopQuadrant produces the TopBraid Suite of services and tools to “design, develop, and deploy the data models, processing, and interfaces needed for sophisticated Semantic Web applications.”