The amount of material on the web about Linked Data, Open Data, and the Semantic Web is a bit staggering – particularly as the volume seems to increase every day. Here are just a few of the resources we have found useful so far.
The number one resource is now the LOD-LAM Zotero Group bibliography (maintained by DLF, LODLAM and ALA’s Linked Library Data Interest Group), rather than this list. Most of the newer resources I (Laura) have come across have been added to that bibliography, and I invite you to join and add more there – we all need a shared list of useful resources!
Contents: Introductions, Texts and Tutorials, Communities and blogs, Resource links, Data sources, Vocabularies, Projects, Web Tools, OS Software, Commercial
- Introduction to Linked Open Data in Libraries, Archives and Museums – video and slides from a presentation by Jon Voss at the Smithsonian Institution, September 2011
- Web 3.0, A story about the Semantic Web, by Kate Ray. Terrific 15-minute video from May 2010: an intro to the ideas and issues, featuring some big talking heads including Tim Berners-Lee, David Weinberger and Clay Shirky.
- The Way to Linked Library Data PT. 1 and PT.2 (slides), from the ASIS&T webinar by Karen Coyle, 2011
- It’s Not Rocket Surgery, Ross Singer, slides from presentation at ALA Annual Meeting, June 2011. Somewhat lacking without speaker notes but gives some ideas of how parts of library data can be used.
- Tim Berners-Lee, James Hendler and Ora Lassila, “The Semantic Web,” Scientific American, published in 2001.
- Tim Berners-Lee, Design Issues – Linked Data
- Tim Berners-Lee on the next web (TED 2009)
- RDFa Everywhere, Knud Möller, slides from presentation at Web Directions South, 2010. A nice overview of RDFa, with some simple examples.
- What is RDF and what is it good for? by Joshua Tauberer, last rev. Jan. 2008. Introduction to RDF, the abstract model, how it’s used, best practices for deployment and querying.
- Linked Data: Evolving the Web into a Global Data Space, 2011, by Tom Heath and Christian Bizer. Good, readable and comprehensive overview of the technology and state of linked data.
- Semantic Web for the Working Ontologist, by Dean Allemang and Jim Hendler. Book (paper only right now). The 2008 first edition is cataloged for “Office Use” and various personal copies are floating around; Laura has a personal copy of the 2011 2nd edition, which incorporates revisions to OWL, and an office use copy is on order. This is a meaty and more technical book about actually applying RDF and ontologies.
- Some W3C official documentation: The Semantic Web page provides an overall introduction and links to overviews and further links to standards activity in 5 areas (Linked Data, Inference, Vocabularies, Query, and Vertical Applications); RDF Current Status page links to official standards documentation and primers; the RDF wiki page also links to tools, resources and revision discussions. RDFa Current Status and other standards are there also. Published standards usually include overviews, such as the RDF Primer and OWL Primer.
- Introducing Linked Data and The Semantic Web, provided by LinkedDataTools.com, an informational site which also provides some open source tools (with promise of more tools under development, perhaps with commercial features). An alternate take on the basics, with illustrations. Covers
- How to Publish Linked Data on the Web is a tutorial by Tom Heath, Michael Hausenblas, Chris Bizer, Richard Cyganiak and Olaf Hartig, from ISWC2008, Karlsruhe, Germany. For those who, like me, would rather watch something than read.
- Free your metadata is a set of video tutorials and instructions produced at the Multimedia Lab, Ghent University and Free University of Brussels, by 3 enthusiastic young men. It teaches how to use Google Refine with extensions to clean data and reconcile vocabularies.
- Slides from several lectures (starting with Lecture 7) in an “Advanced Information Systems” course from Durham University, taught by William W. Song and Patricia Shaw, are about RDF, RDFS, Ontologies, OWL, Reification and other topics of interest. These could provide another “take” on the concepts.
- An Introduction to RDF(S) and a Quick Tour of OWL, a University of Manchester ontology tutorial.
Communities, blogs, organizations
- LOD-LAM (Linked Open Data in Libraries, Archives and Museums). The website documents a summit meeting held in June 2011, convened by the Internet Archive, Sloan Foundation and National Endowment for the Humanities, and a blog continues to document additional LOD-LAM sessions (Washington, London, New Zealand…). A Google Group is open to anyone, and additional presentations and “summits” continue, led by Jon Voss (@jonvoss).
- W3C Semantic Web Activity
- ALCTS/LITA Library Linked Data Interest Group. This is an ALA Connect site; some content should not require a login to see, but anyone (including non-ALA members) can create a login for ALA Connect. Right now there’s not much there; I’ll activate the link when the meeting notes and resource pages are up.
- IGELU (International Ex Libris Users Group) Linked Open Data SWIG. Group formed in 2011 with the goal to “achieve essential linked open data features, as well as enhanced data-services functionality and APIs, in all Ex Libris products where appropriate, both from the data publishing and from the data consuming perspective. In practice this means focusing only on next generation products and platforms: URDD/Primo and URM/Alma. Existing API’s, such as X-Server for Aleph, Metalib, SFX and RESTful APIs can also be considered.”
- Planet RDF is a group blog (aggregation of blogs) focused on semantic web/linked data generally
- semanticweb.com is geared toward the business community; it publishes a newsletter, blog-like site, and information about events such as the SemTech conference. Interesting place to get the business perspective.
- Lotico is a wiki space for a community of Semantic Web enthusiasts. It showcases events and “meetups”.
- 60+ Semantic Web Blogs list – a blog post from 4 years ago that lists blogs that are all or partially about linked data
- AI3::Adaptive Information – Mike Bergman of Structured Dynamics has a blog and website primarily devoted to Semantic Web/Linked Data. This includes the “Sweet Tools” list of over 1,000 semantic web tools, and the SweetPedia, a sporadically updated compilation of semantic technology research articles focusing on Wikipedia/DBPedia.
- LAMs Metadata, blog created to fuel a workshop at DLF Fall Forum 2011 (November, Baltimore) – “Moving Forward: Examining the Needs to Re-tool the LAMs Data.” Includes notes from that meeting.
Information, reports, “webliographies,” other resources
- W3C Library Linked Data Incubator Group was an official W3C activity. The group concluded its work and published its Final Report, Use Cases and other material in October 2011.
- semanticweb.org is a community wiki site (using Semantic MediaWiki) with a large community that contributes Tools, Events, and Ontologies to a growing list.
- Stanford Linked Data Workshop Technology Plan is a plan for a “multi-national, multi-institutional discovery environment built on Linked Open Data principles” for academic resource discovery. It follows from a workshop conducted at Stanford University June 27-July 1 2011 with support from the Andrew W. Mellon Foundation’s Scholarly Communications Program, CLIR and Stanford University Libraries.
- Expressing Dublin Core metadata using the Resource Description Framework (RDF) explains a linked data approach to using Dublin Core elements as properties, based on the Dublin Core Abstract Model, and incorporating classes, vocabulary schemes, data types and other aspects.
- Archival Description in OAI-ORE: article by Deborah Kaplan, Anne Sauer and Eliot Wilczek of Tufts University (Journal of Digital Information, Vol 12, No 2 (2011)) about the potential of expressing the data in archival EAD finding aids using the RDF-based OAI-ORE standard (Object Reuse and Exchange) and how this would enable linking and re-aggregation of resources across datastores.
- Karen Coyle webinar: Libraries and Linked Data: Looking to the Future on July 19, 2012. A recording is available, along with a follow-up blog post, which contains Karen’s slides and a link to her site, which contains a boatload of additional resources.
Data sources
- The Data Hub is a searchable database (using the CKAN application) of mostly openly available datasets – including Linked Data sets.
- Sindice, the Semantic Web Index, allows querying an ever-growing collection of triples/data sets by term, property, SPARQL query, and other means. It also includes other tools and a cool “Latest data” scroll that gives a picture of the speed of production.
- DBpedia – information about Online Access to the DBpedia linked data (derived from Wikipedia).
Vocabularies and Vocabulary sources
Core RDF schemas
- RDF Vocabulary Description Language 1.0: RDF Schema
- OWL2 Web Ontology Language
- SKOS Simple Knowledge Organization System
Sources/links for vocabularies, ontologies, schemas
- Swoogle semantic web search
- http://vocab.org: a site provided by Ian Davis (of Talis, among other things) with a number of interesting vocabularies
- Schemapedia, another service provided by Ian Davis
- Protege Ontology Library is a wiki page containing a list of ontologies, mostly in OWL or OWL plus RDFS.
- Library Linked Data Incubator Group: Datasets, Value Vocabularies, and Metadata Element Sets (Oct. 25, 2011 report)
Vocabulary visualization tools
- Ontology Browser – plug in the URL of an ontology, or navigate ones provided (note: availability seems to be sporadic)
Vocabularies in RDF published by Library and Archive communities
- Library of Congress vocabularies: http://id.loc.gov/. Currently includes:
  - LC Subject Headings
  - LC Name Authority File
  - LC Genre/Form Terms
  - Thesaurus of Graphic Materials
  - MARC Relators
  - MARC Countries
  - MARC Geographic Areas
  - MARC Languages
  - ISO639-1 Languages
  - ISO639-2 Languages
  - ISO639-5 Languages
- The RDA (Resource Description and Access) Vocabularies
- Dublin Core Metadata Registry
- Working draft: Metadata Semantics Shared Across Languages: Dublin Core in Languages Other than English
- Recommended resource: Interoperability Levels for Dublin Core Metadata
- FAST (Faceted Application of Subject Terminology), status “experimental”, is a service provided by OCLC that simplifies and converts LC subject headings into their “facets” (topical, geographical, form, chronological, etc.). Records are now available in RDF.
Vocabularies in RDF from other communities
- UMBEL (Upper Mapping and Binding Exchange Layer) Vocabulary and Reference Concept Ontology
- FOAF (Friend of a Friend)
- RSS (RDF Site Summary) 1.0
- AtomOWL Vocabulary Specification
- SIOC – Semantically-Interlinked Online Communities – Core Ontology Specification
- DOAP – Description of a Project
- VOID – Vocabulary of Interlinked Datasets
- Basic Geo (WGS84 lat/long) Vocabulary
- BIO: A vocabulary for biographical information
- Good Relations: The Professional Web Vocabulary for e-commerce
- BIBO: The Bibliographic Ontology. Note this was produced by a group spearheaded by Zitgist, Bruce D’Arcus, the Zotero team and Michael K. Bergman.
- CC REL Creative Commons Rights Expression Language
- Open Graph Protocol (used by Facebook)
- Music Ontology Specification: http://musicontology.com/
- Event Ontology
- Social Semantic Web Thesaurus brought to you by PoolParty, a component of its free WordPress linked data plugin.
- NEPOMUK Ontologies, initially designed for the NEPOMUK (personal desktop information organization) project, may be generally useful. They include NRL Representational Language; NAO Annotation Ontology; NIE Information Element set, including Core Ontology, File Ontology, Contact Ontology, Message Ontology, Calendar Ontology, EXIF Ontology and ID3 Ontology; PIMO Personal Information Model; and Task Model Ontology.
- SUMO (Suggested Upper Merged Ontology) and its domain ontologies form the largest formal public ontology in existence today. They are being used for research and applications in search, linguistics and reasoning. SUMO is the only formal ontology that has been mapped to all of the WordNet lexicon. It is owned by IEEE and freely available under a GNU public license. An RDF translation is provided.
- CIDOC-CRM (Conceptual Reference Model) is a high level vocabulary for describing cultural objects, developed by the museum community (ICOM). RDFS versions are available.
Microformats, microdata, and related vocabularies
- Microformats.org has a list of microformat specifications; perhaps of particular interest to academic librarians is the wiki page on Citation Formats such as BibTex and RIS.
- COinS (ContextObjects in Spans)
- Schema.org (note particularly the Data Model and “Mapping to RDFa 1.1” on this page).
Projects and Reference implementations
- LOCAH (Linked Open Copac Archives Hub), a JISC (UK) project to make combined data from the Copac union catalog and the ArchivesHub aggregation of archival material available as linked data (has published some datasets)
- COMET (Cambridge Open Metadata) project has published some datasets based on MARC records
- Civil War Data 150 – a demonstration project, in process.
- NEPOMUK, Networked Environment for Personalized, Ontology-based Management of Unified Knowledge, a.k.a. the Social Semantic Desktop, is a project to develop a comprehensive solution for extending the personal desktop into a collaboration environment supporting personal information management and sharing
- SIMILE (Semantic Interoperability of Metadata and Information in unLike Environments) is a collection of projects hosted by MIT Libraries and CSAIL, seeking to enhance inter-operability among digital assets, schemata/vocabularies/ontologies, metadata, and services, primarily by extending DSpace functionalities using RDF and linked data techniques.
- HIVE is an IMLS funded project at the Metadata Research Center in Chapel Hill. They use an automatic metadata generation approach to dynamically integrate discipline specific vocabularies using SKOS.
- UniProt is a unified data store of data about proteins including articles and information resources, now based on linked data. SPARQLing UniProt RDF: Using RDF based technologies to aid biological curation efforts is a video of a presentation by Jurgen Bolleman at BioHackathon 2011 about UniProt’s use of linked data and their quality control processes using SPARQL queries.
Web-based tools and services (see Vocabularies page for web vocabulary search tools)
- OpenLink Data Explorer can be used to query a Virtuoso triple store, or to navigate linked data from a URI.
- LinkSailor is designed to present linked data (and additional data found in the Talis platform) in an eye-readable form.
- Semantic Radar is a Firefox plugin that will detect presence of RDFa, SIOC, and other embedded RDF data in web pages and display a symbol in the lower browser frame that you can click to see the data.
- RDF Book Mashup was an early demonstration application developed at Freie Universität Berlin by Chris Bizer and Tobias Gauss. You can enter a book title or author, get a results list, and see the RDF mashup including Amazon and other data. It’s slooow but still working.
- FactForge is a “search engine” for linked data, brought to you by Ontotext Corp.
Open source (or free) software and software projects
- Sesame, provided by openRDF.org and supported by Aduna, a Dutch software company, is a Java framework for processing RDF data (parsing, storing, inferencing and querying). Available under a BSD-style license.
- OWLIM is a family of semantic repositories, which work with JENA or Sesame. The Lite version is free, others are commercial software. Supports reasoning and semantics, RDFS, OWL2 RL, OWL2 QL.
- JENA is a Java framework for building Semantic Web applications.
- RDFLib, a Python library for working with RDF.
- Protege Ontology Editor, free software provided by Stanford University, includes the Protege OWL editor extension. Tightly integrated with JENA and has an open-source Java API.
- OpenStructs, “…an education and distribution site dedicated to open source software for converting, managing, viewing and manipulating structured data. Structured data can represent any existing data struct from the simplest attribute-value pair formats to fully specified relational database schema. Material on this OpenStructs site ranges from individual tools to complete open semantic frameworks (OSF) with which to build comprehensive semantic instances. All OpenStructs tools are premised on the canonical RDF (Resource Description Framework) data model.” Brought to you by the company Structured Dynamics.
- OpenSEAS is “…a framework for the enterprise to establish a coherent, consistent and interoperable layer across its information assets. It is compliant with the open source Method for an Integrated Knowledge Environment via MIKE2.0’s Semantic Enterprise Solution Offering. Open SEAS has been developed for enterprises desiring to initiate or extend their involvement with semantic technologies. It is inherently incremental, low-cost and low-risk. It was originally developed by Structured Dynamics, and then contributed to MIKE2.0 as an open source solution offering.”
- Any23 (Anything to Triples): Software development project to create a web service and set of tools to extract structured data in RDF format from a variety of Web documents. Written in Java and licensed under Apache License 2.0.
- WordPress plugin from PoolParty allows you to import a SKOS thesaurus (the Social Semantic Web thesaurus or another reachable through SPARQL endpoint) and use it for publishing concept links in your WordPress blog.
- RDF Extension for Google Refine adds a graphical user interface (GUI) for exporting data of Google Refine projects as interlinked RDF data. Data can be reconciled against any SPARQL endpoint or RDF dump. The reconciled data can then be exported as RDF based on a template graph. Hosted by DERI. Updated 11/28/2011 to work with the latest version of Google Refine (2.5).
- The SIMILE RDFizers page has links to downloadable tools written at MIT and elsewhere (MIT’s run from a command line and require a Java VM and Apache Maven) for converting data in other formats to RDF (including MARC/MODS/RDF)
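As a rough illustration of what “converting data in other formats to RDF” means in practice, here is a minimal Python sketch (standard library only; it is not one of the RDFizers listed above). It turns a made-up record into N-Triples lines using Dublin Core Terms properties. The `record_to_ntriples` and `escape_literal` helpers, the record's field names, and the `http://example.org/book/1` URI are all assumptions for illustration.

```python
# Dublin Core Terms namespace; each record field is assumed to match a DC term name.
DC = "http://purl.org/dc/terms/"


def escape_literal(value):
    """Escape backslash, double quote, newline and carriage return for an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def record_to_ntriples(subject_uri, record):
    """Return one N-Triples line per (field, value) pair in the record."""
    lines = []
    for field, value in record.items():
        # A field may hold a single value or a list of values.
        values = value if isinstance(value, list) else [value]
        for v in values:
            lines.append('<%s> <%s%s> "%s" .' % (subject_uri, DC, field, escape_literal(v)))
    return lines


# Hypothetical record; the subject URI below is invented for this example.
book = {
    "title": "Semantic Web for the Working Ontologist",
    "creator": ["Dean Allemang", "Jim Hendler"],
    "date": "2011",
}

triples = record_to_ntriples("http://example.org/book/1", book)
for line in triples:
    print(line)
```

Each printed line is one subject, property, value statement. Real converters also have to handle URIs as values, language tags and datatypes, which this sketch leaves out.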
Commercial firms providing software/services
- Talis has been involved in providing RDF-based services for many years; they created data.gov.uk; they have sold their Library division to Capita Group plc. to focus on other services.
- Kasabi is a startup company under the Talis umbrella. “Kasabi is a marketplace for people publishing and looking for data. As a complete platform for the hosting and publishing of Linked Data, Kasabi brings together anyone involved with data: businesses, developers, individuals, organisations, government etc.” Featured datasets hosted run a gamut from data on Lego sets for sale, to BC Music data on artists, records and music reviews.
- Open Calais is a web service provided by Thomson Reuters for automated text analysis; it provides RDF categories and tags using the Open Calais vocabulary. The basic service is free; paid upgrades are available for advanced features such as guaranteed service levels.
- Deploying Linked Data using OpenLink Virtuoso is a document that describes the process of deploying Linked Data into the existing Web. It discusses some of the difficulties faced in exposing RDF data and in bridging the “Linked Data Web” and the traditional “Document Web.” Virtuoso is one of OpenLink’s suite of products and provides a linked data server and triplestore along with other services such as relational and XML database management and web services.
- Semantic Web Company located in Vienna, Austria, provides knowledge management software and services including PoolParty linked data tools (also Atlassian Confluence content management system).
- Cambridge Semantics uses linked data technology under the hood to provide business information (“operational intelligence”) data integration, collaboration and analysis tools, notably in the biotechnology sphere.
- Sindice.com is a startup to provide commercial services on top of the Sindice Semantic Web Index and its infrastructure, which was developed primarily through government and non-profit funding.
- TopQuadrant produces the TopBraid Suite of services and tools to “design, develop, and deploy the data models, processing, and interfaces needed for sophisticated Semantic Web applications.”