Diane-Michel.com

Facilitating breakthrough medical research through collaborative intelligence, and the Semantic Web.

FOAF – Friend of a friend

Diane | April 19, 2008

FOAF (Friend of a Friend) is an ingenious little RDF vocabulary that lets you create a machine-readable RDF description of yourself and your friends.

You can create your own RDF file simply by visiting the FOAF-a-Matic.
According to the FOAF Website:

FOAF-a-matic is a simple Javascript application that allows you to create a FOAF (“Friend-of-A-Friend”) description of yourself. You can read more about FOAF in Edd Dumbill’s “XML Watch: Finding friends with XML and RDF” article, at the FOAF homepage on RDFWeb, and also the FOAF vocabulary description.

In short though, FOAF is a way to describe yourself (your name, email address, and the people you’re friends with) using XML and RDF. This allows software to process these descriptions, perhaps as part of an automated search engine, to discover information about you and the communities of which you’re a member. FOAF has the potential to drive many new interesting developments in online communities. Ben Hammersley’s “Click to the Clique” article for the Guardian Unlimited website further explores these ideas.

Reference:
FOAF project. (2008). Getting started with FOAF. Retrieved April 19, 2008, from http://www.foaf-project.org/you/index.html
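
To make the description above concrete, here is a minimal sketch of the kind of RDF/XML file a tool like FOAF-a-Matic produces. It is not FOAF-a-Matic's own code: the function name build_foaf and the people and addresses are invented for illustration, and only the standard FOAF terms foaf:Person, foaf:name, foaf:mbox and foaf:knows are used.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"


def build_foaf(name, email, friends):
    """Return a FOAF description as an RDF/XML string.

    `friends` is a list of (name, email) pairs. All values here are made up.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("foaf", FOAF_NS)

    root = ET.Element(f"{{{RDF_NS}}}RDF")
    me = ET.SubElement(root, f"{{{FOAF_NS}}}Person")
    ET.SubElement(me, f"{{{FOAF_NS}}}name").text = name
    # foaf:mbox points at a mailto: URI rather than holding a plain string.
    ET.SubElement(me, f"{{{FOAF_NS}}}mbox", {f"{{{RDF_NS}}}resource": f"mailto:{email}"})

    for friend_name, friend_email in friends:
        # Each friend is a nested foaf:Person inside a foaf:knows link.
        knows = ET.SubElement(me, f"{{{FOAF_NS}}}knows")
        friend = ET.SubElement(knows, f"{{{FOAF_NS}}}Person")
        ET.SubElement(friend, f"{{{FOAF_NS}}}name").text = friend_name
        ET.SubElement(
            friend, f"{{{FOAF_NS}}}mbox", {f"{{{RDF_NS}}}resource": f"mailto:{friend_email}"}
        )

    return ET.tostring(root, encoding="unicode")


doc = build_foaf("Alice Example", "alice@example.org", [("Bob Example", "bob@example.org")])
print(doc)
```

Saving the printed output as a file on your own website is, roughly, all that publishing a FOAF profile involves.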


Tim Berners-Lee Semantic Web Podcast

Diane |

The following is a transcript of a podcast produced by Talis, with Tim Berners-Lee.

[music]
Paul Miller:
00:10 Hello and welcome to this Talking with Talis Podcast with your host, Paul Miller. Today, I talk with Sir Tim Berners-Lee, inventor of the World Wide Web and now Director of the World Wide Web Consortium. We talk about the Semantic Web, Linked Data and Tim’s ambitions and vision for both.

[music]

Tim, thank you very much for joining me today for this podcast. Usually what I do with these podcasts is ask people to introduce themselves and talk a little bit about where they’ve come from. I guess with you, we probably don’t need to bother. I will point people to your page at the Web Consortium and certainly to things like Weaving the Web for anyone who doesn’t know who you are and what you have done to get to where we are today. So, I think, what we will do is, probably just move straight on to start looking at the questions.
 
 
Tim Berners-Lee: OK, good to be with you.
 
Paul:
01:12 Thank you; thanks for joining us. Over the past few years, there has been an awful lot of Semantic Web research going on in universities and inside research and development departments at – mainly – big corporations. What we are seeing now is really the results of that beginning to enter the mainstream. What do you think we have to do to move finally into the sort of mainstream deployment of some of these technologies and ideas that have been building for a very long time?
 
Tim: I think the Semantic Web is such a broad set of technologies and is going to do so many different things for different people. It is really difficult to put it on one thing. What are the steps necessary right now for the life sciences community to be able to use it for their data about proteins is probably different from which steps do we need to be able to get interoperability between repositories of library data and museum data.

So, different communities have different faces, different communities always have different social considerations and often there are social steps, which when you finally get people to share data more, to be able to re-use data more; then, just like with interaction of the Web, there are a lot of echoes of the same sort of social concerns.

People saying, “If I put up a Web server, then I will be out of the loop. Nobody will come and I won’t get the credit.” People don’t have to come to my door and knock on it to get the information. And sort of misunderstandings like that. Or “If I give people my data, then they will be able to use it in ways which are better than the ways in which I have used it, and then, I will fade into the limelight.” So, there are all these… we get these social things, but they tend to be different in different areas.

An important step we have just got over is bringing SPARQL out. So, SPARQL changed the landscape a lot because there is such a lot of opportunities to share, which was impractical to just load and people having then to do Linked Data, so, SPARQL gives access to those. So, I think, we will see a growing number of SPARQL endpoints and that is exciting.
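
Editor's note: as a rough illustration of what "a SPARQL endpoint" means in practice, the sketch below builds the HTTP GET request for a query. The endpoint URL is hypothetical and the query is only an example; the one fixed convention used is that the SPARQL protocol carries the query text in a parameter named query.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint; substitute the URL of a real SPARQL endpoint.
ENDPOINT = "http://example.org/sparql"

QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE {
  ?person a foaf:Person ;
          foaf:name ?name .
}
LIMIT 10
"""


def sparql_get_url(endpoint, query):
    """Build a GET URL that sends `query` in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)
```

Fetching that URL from a live endpoint would return the matching names; the sketch stops at building the request so it can run without network access.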
 
Paul:
03:21 OK. And we will touch on certainly Linked Data a little bit later on. You talked a little bit about people’s concerns there with loss of control or loss of credibility, or loss of visibility. Are those concerns justified or is it simply an outmoded way of looking at how you appear on the Web? 
 
Tim: I think that both are true. In a way it is reasonable to worry in an organization, for example. Suppose you are in a department in an organization and you own the data about a particular thing, whether it is when the machines are going to be maintained, fixed or what the temperature has been in each of your offices or something. You own that data, you are worried that if it is exposed, people will start criticizing your maintenance of heating systems or something.

So, there are some organizations where if you do just sort of naively expose data, society doesn’t work very well and you have to be careful to watch your backside. But, on the other hand, if that is the case, there is a problem. So the Semantic Web is about integration, it is like getting power when you use the data, it is giving people in the company the ability to do queries across the huge amounts of data the company has.

And if a company doesn’t do that, then, it will be seriously disadvantaged competitively. If a company has got this feeling where people don’t want other people in the company to know what is going on, then, it has already got a problem, this just exposes the problem. It is like what people say, “Well my data is actually a mess or a lot of my addresses are out of date, or inconsistent.”

Well actually, it would expose… we see all these inconsistencies. Well, in a way, you got the inconsistencies already, if it exposes them then actually it helps you. So, I think, it is important for the leadership in the company, for example, to give kudos to the people who provided the data upon which a decision was made, even though they weren’t the people who made the decision.

So, generally, to recognize the fact that people are providing access to their data is important. It’s very important in Science, too. If you publish a paper in which you happen to have got a lot of the results by running a SPARQL query over existing cell line data, existing genomics data, existing clinical trials data, whatever it is, then obviously it is very important in the scientific ethos to credit the people who produced them.

If you produce the experiments and put those out there in RDF on the Web, then, the good news is you can expect credit back; and sometime in the future after you have retired, people may in fact… you may get credit from people who are using that.
 
Paul:
06:05 OK, sounds good. Going a little bit broader than those questions then, back in 2001 in that Scientific American article, you and the other authors painted a very broad grand vision of where the Semantic Web could take us. Did you think we’d be closer to that seven years on?
 
Tim: Well, for one thing that article was, I think, too sci-fi. I think, that really what we have… the message has been… it was looking too far into the future. It imagined the Semantic Web was deployed, and then people had made all kinds of fairly AI-like systems which run on top of that.

In fact, the gain from the Semantic Web comes much before that. So maybe we should have written about enterprise and intra-enterprise data integration and scientific data integration. So, I think, data integration is the name of the game. That’s happening, it’s showing benefits. Public data as well; public data is happening and it is providing the fodder for all kinds of mashups.

So, what we should realize is that the return on investment will come much earlier when we just have got this interoperable data that we can query over.

Paul:
07:29 OK, and we are pretty close to that now with the Linked Data work that again we will probably dig into shortly. Weaving the Web, 1999 – is it time for another book that paints again sort of the picture given the experience of where we have gone in that time?
  
Tim: Yeah, I think, it has been time for another book for so long, but, when am I going to find the time to write it? I think, that the same things to be… it would be good to write a number of books.

The books I would like to write if I had time include… I would like to write a whole bunch of technical books about actually practically how to do Semantic Web things. I’d like to write a book about Semantic Web Architecture. And I’d like to write a book sort of painting the path for people in the industry, because I get a lot of questions along the lines of “OK, I read the specs, OK, but here I am, I am the CIO of a company, what does it mean for us now, what should we do?”

So, there is a story about the answer to that one, typically “Well you should take an inventory of what you have got in the way of data and you should think about how valuable each piece of data in the company would be if it were available to other people across the company, or if it were available publicly, and if it were available to your partners.”

And then, you should make a list of these things and tackle them in order. You should make sure you don’t change the way any of your data is existing, is managed, so you don’t mess up the existing systems and so on.

So, there’s all this sort of advice, which is being repeated all over the place. Semantic Web experts are being called up by CIOs and asked what they can do. We need more books on that at the moment, I think, explaining how to put the Semantic Web as a win-only and win-win solution, and adding it to existing infrastructure in companies and things like that.

So, there’s lots of books. But, when things get exciting, as they are now, come Monday morning, what should I do? It’s just like back in the early days of the Web. Should I go and encourage a working group, participate in an open source project, should I go and give a keynote speech, should I go and do a podcast with Paul Miller? There’s so many things to do, that I’m afraid writing another book just hasn’t made it to the top of the pile yet.
 
Paul:
10:00 OK, so there may be an opportunity there for someone else to write that book.
 
Tim: There’s a lot of books out there to be written. Maybe also for people to interview the people who understand it, who understand things like Linked Data. Interview the people from the community and sort of write the books for them.
 
Paul:
10:23 You mentioned SPARQL back at the beginning of our conversation. With SPARQL, I guess, a lot of the technical pieces are now in place. We’ve got RDF, we’ve got OWL, we’ve got GRDDL, we’ve got SPARQL, and we’ve got the rest. Are there any big gaps left in the puzzle, or do we have the bits we need now to stop using lack of standards as an excuse?
 
Tim: I think, really we’ve got all the pieces to be able to go ahead and do pretty much everything. I suppose, really you should be able to implement a huge amount of the dream, we should be able to get huge benefits from interoperability using what we’ve got. So, people are realizing it’s time to just go do it.

If there’s one thing which we’ve foreseen from the early stages, I think, it would be rule languages. So, the rule language is in the works. The first moment we started playing with the Semantic Web, the first thing I did was to write a rule engine. Because to me, that was way of, for example, translating data from one ontology to another, for slimming it down, thickening it up, making general inferences, writing consistency checkers and things.
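
Editor's note: the sketch below shows the general idea of a rule engine that translates data from one ontology to another. It is a tiny forward-chaining loop over triples, not the rule engine Tim wrote; the function names and the old: and new: vocabularies are invented for illustration.

```python
def is_var(term):
    """Variables are strings that start with '?'."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    result = dict(bindings)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def match_all(patterns, triples, bindings):
    """Yield every set of bindings that satisfies all patterns."""
    if not patterns:
        yield bindings
        return
    for triple in triples:
        extended = match(patterns[0], triple, bindings)
        if extended is not None:
            yield from match_all(patterns[1:], triples, extended)


def apply_rules(triples, rules):
    """Forward-chain: apply rules until no new triples appear."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # Collect matches first so the set is not modified while iterating.
            for b in list(match_all(body, facts, {})):
                new = tuple(b.get(term, term) for term in head)
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts


# Hypothetical vocabularies: restate "old:author" as "new:creator".
rules = [([("?doc", "old:author", "?who")], ("?doc", "new:creator", "?who"))]
data = [("ex:report", "old:author", "ex:alice")]
result = apply_rules(data, rules)
print(sorted(result))
```

The same loop handles consistency checks or general inferences by changing only the rules, which is the sense in which a rule engine is "a very general thing".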

So, a rule engine is such a very general thing. Yeah, you see them effectively, you’ve got a rule engine when you filter your email. You set up email filtering rules through various smart mailboxes. Smart photo albums tend to be little rule engines. So, a lot of applications already are used to that. Users are used to that. Some users, I guess advanced users, to be fair, are used to using those rule systems to sort of make their life run better, increasing the automation.

So, I think, the rule language will be really useful. The trouble is, there are so many different types of rule languages. And various other things as well. There’s all kinds of things that are around. There’s style sheet languages for Semantic Web forms. There are lots of things which I think, you’ll see in the explosion of technology.

But the core, absolutely, is very solid now that we’ve got SPARQL.
 
Paul:
12:42 OK. Another way that we’ve looked at the Semantic Web idea, is through the famous layer cake diagram. One of the obvious pieces on that that actually comes into play quite strongly with a lot of discussion around things like social networks and data portability and all the rest, is trust. What are we doing to address trust?
 
 
Tim: Well, yes, and trust is fairly high up the stack. Originally in the roadmap, I felt that we would want to… when we had rule languages, then, for example, we would have a more expressive system that would allow us to actually express the trust that we feel.

A lot of pre-Semantic Web trust-based systems do things like give people a numerical value of trust. “I trust this person to the level of 0.75,” or something, or they say the guy is trustworthy, or he’s not. And they’re too simplistic.

So, I think, when you look at real systems in the world, you trust one person to give you a recommendation on a movie, and you trust a completely different person to give you a recommendation on whether a piece of code was good code. And so you trust different people for different things, and different agents for different things.

So, in fact, the code that I’ve been writing, the rule engine, very often it hasn’t been for saying this is OK, it’s a person with these particular properties, who meets this particular criterion, has said something. A message which comes from a certain source contains this information. And then you can do things with it.

So, one of the features of the N3 language, which extends RDF, is that it allows you to do that. It allows you to talk about what documents have said what, and allows you to write these rules, which say that sort of thing. You can argue about where the information is coming from.

Provenance, of course, is a really important word for almost anybody doing Semantic Web development; provenance somehow comes into their lives. If you’re building a triple-store, often these things… we think of as triple-stores, storing the subject, the verb, and the object. Actually, for each of those little sentences they’re also storing where it came from.

In the Tabulator, the data browser that we’ve got here at MIT, you can look at the aggregated view of data about all kinds of things. If you click on a cell, then, you can sharpen the list of sources and see, where did this particular data in that particular cell come from. People always need to go back to the source.
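
Editor's note: a minimal sketch of the idea that a store keeps, for each statement, where it came from, and that a browser can go from a cell back to its sources. This is not the Tabulator's code; the QuadStore class, the ex: terms, and the document URLs are invented for illustration.

```python
from collections import defaultdict


class QuadStore:
    """Store each statement together with the source documents that made it."""

    def __init__(self):
        self._sources = defaultdict(set)  # (subject, predicate, object) -> source URIs

    def add(self, subject, predicate, obj, source):
        self._sources[(subject, predicate, obj)].add(source)

    def values(self, subject, predicate):
        """Aggregated view: every value asserted for this cell, from any source."""
        return {o for (s, p, o) in self._sources if s == subject and p == predicate}

    def sources_for(self, subject, predicate, obj):
        """Go back to the source: which documents made this exact statement?"""
        return set(self._sources.get((subject, predicate, obj), set()))


store = QuadStore()
store.add("ex:room12", "ex:temperature", "21", "http://example.org/sensors.rdf")
store.add("ex:room12", "ex:temperature", "21", "http://example.org/facilities.rdf")
store.add("ex:room12", "ex:temperature", "19", "http://example.org/old-log.rdf")

print(store.values("ex:room12", "ex:temperature"))
print(store.sources_for("ex:room12", "ex:temperature", "21"))
```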

And I think, as we build trusted systems, we’ll build them, which will not only go back to the source, they’ll look at the metadata about the source. And then, they’ll operate on that. So, for example, you’ll be able to look at all the data you’ve got about something you’re doing for a school project. And you’ll be able to say, “OK, now just show me the subset of that data, which is released under a Creative Commons license, so I know that I can use it in my school project. So, I won’t get into trouble with the teacher.”
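
Editor's note: the school-project example can be sketched as a filter that looks at metadata about each source rather than at the data itself. The statements, the metadata dictionary, and the function name reusable are all hypothetical; the only real-world detail assumed is that Creative Commons license URIs begin with http://creativecommons.org/licenses/.

```python
CC_PREFIX = "http://creativecommons.org/licenses/"

# Hypothetical statements, each tagged with the document it came from.
statements = [
    ("ex:volcano", "ex:height", "3000", "http://example.org/a.rdf"),
    ("ex:volcano", "ex:photo", "ex:img1", "http://example.org/b.rdf"),
]

# Hypothetical metadata about each source document.
source_metadata = {
    "http://example.org/a.rdf": {"license": "http://creativecommons.org/licenses/by/3.0/"},
    "http://example.org/b.rdf": {"license": "http://example.org/all-rights-reserved"},
}


def reusable(statements, source_metadata):
    """Keep only statements whose source declares a Creative Commons license."""
    kept = []
    for s, p, o, source in statements:
        license_uri = source_metadata.get(source, {}).get("license", "")
        if license_uri.startswith(CC_PREFIX):
            kept.append((s, p, o, source))
    return kept


for statement in reusable(statements, source_metadata):
    print(statement)
```

A source with no recorded license is excluded here, which is the cautious choice for the "so I won't get into trouble with the teacher" case.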

So, that involves this awareness that a lot of Semantic Web systems are being built with now, which I think, is very important. It’s the connection from the data to the provenance of the data, and not just for the name of the document that it came from, but the actual properties of that – the licensing, what it’s supposed to be used for, what it’s appropriate to use it for, whether I got it because I’ve gone through an authentication process, and actually whether it’s private data, which I should not actually publish at all.

These are, I think… Building systems which track that sort of thing, where I got the data from, and what I’m allowed to use it for, is going to be more and more important. So, those are the ways we’re going towards building trusted systems. I think, it’s a very important part of the puzzle, but we’ll build it using the things which mostly we have.

In the end, to strengthen it, we’ll probably put cryptography in there. And N3 gives you a way of talking about what’s in a document, you can relate it to the signature. So, you can boot-strap the whole trusted system using the things which we’ve really been playing with for several years.
 
Paul:
17:15 Right. And it’s about then making some of those things explicit, I guess. You mentioned, for example, that provenance is in the URI, but thinking far more carefully about what that means, rather than inferring meaning that perhaps isn’t always there.
 
 
Tim: Yes. The really challenging thing about a lot of trust issues is the user interface. We’re having this problem with the browser. In a regular Web browser, at the moment, how do you let the person know that they’re actually talking to the bank, their own bank, and not something which has got a URI which has got the name of the bank somewhere in it, but not in a domain name?
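
Editor's note: the difference between a bank's name appearing "somewhere" in a URI and appearing in the domain name can be shown with a short check. The domain bank.example and the attacker URLs are made up, and this sketch checks only the host name; it says nothing about who owns the certificate.

```python
from urllib.parse import urlparse


def is_bank_host(url, bank_domain):
    """True only if the URL's host is the bank's domain or a subdomain of it."""
    host = (urlparse(url).hostname or "").lower()
    bank_domain = bank_domain.lower()
    return host == bank_domain or host.endswith("." + bank_domain)


genuine = is_bank_host("https://www.bank.example/login", "bank.example")
lookalike_host = is_bank_host("https://bank.example.attacker.test/login", "bank.example")
name_in_path = is_bank_host("https://attacker.test/bank.example/login", "bank.example")
print(genuine, lookalike_host, name_in_path)
```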

How do you prevent these phishing attacks, for example? That’s the great question. When you look at the browser, you realize there’s a certificate there, but the browser isn’t actually showing you who the certificate is owned by. It’s only showing the padlock to show it has a certificate. You’re not checking it every moment to know that you’re talking to the right person.

So, they obviously want to put in changes to the browser, so the user interface actually, instead of showing you the URI, it shows you the owner of the certificate and then you’ve moved up a level of trust.

With Semantic Web applications, imagine you’re looking at your calendar. On the calendar, you’ve got interoperability between different applications, suddenly on your calendar you’ve got bank statement transactions that show up. The photographs you’ve taken show up on the calendar because they’ve all got dates. And we’ve got interoperability.

So when you look at a calendar, suddenly it’s got information from all the different places. It may have personal photographs from the family, bank statements from your company, financial statements from your company showing up there. You might use those for deciding where you want to meet with somebody at any particular point, figuring out what was happening on any particular day. You may not want to share that with the people you’re in a meeting with.

So, to be able to see from the user interface that there’s confidential stuff on the screen, and I should not share my screen. To be able to ask the user interface to filter it, so that I would only know, now I’m meeting with somebody else, can you just make sure that everything on the screen is the sort of thing which I’d be prepared to show somebody else. That also assists in understanding the policies.

And we’ve got systems at MIT where you get in there and play around with things like mixing OpenID authentication with friend-of-a-friend, for example. Currently, if you want to comment on my blog, then you have to be related through the social network to somebody in the group, the Decentralized Information Group. You have to be a friend of a friend of a friend to some level of somebody in the group; just so we know that you’re not a spammer. It’s not that we want to cut down the people who can, we just want to cut out the spammers.
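
Editor's note: the "friend of a friend of a friend to some level" test can be sketched as a breadth-first search over foaf:knows links. This is not the code behind the blog Tim describes; the identifiers, the knows graph, and the hop limit are invented, and the OpenID step that establishes who the commenter is has been left out.

```python
from collections import deque


def within_hops(knows, commenter, group, max_hops):
    """True if `commenter` is reachable from a group member in at most `max_hops` links."""
    seen = set(group)
    frontier = deque((member, 0) for member in group)
    while frontier:
        person, depth = frontier.popleft()
        if person == commenter:
            return True
        if depth == max_hops:
            continue  # do not look beyond the allowed distance
        for friend in knows.get(person, ()):
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, depth + 1))
    return False


# Hypothetical foaf:knows graph: person -> people they say they know.
knows = {
    "dig:member1": ["ex:alice"],
    "ex:alice": ["ex:bob"],
    "ex:bob": ["ex:carol"],
}
group = {"dig:member1"}

print(within_hops(knows, "ex:bob", group, 2))
print(within_hops(knows, "ex:carol", group, 2))
print(within_hops(knows, "ex:spammer", group, 3))
```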

So, we see a lot of spammers who are using the social network, which is part of the Semantic Web, to produce the traffic, which actually shores up on the Semantic Web stack.
  
Paul:
20:34 Well, that’s clearly an area which will require a lot of activity moving forwards. Another area that will require a huge amount of effort moving forward is around data for the Semantic Web. We’re going to need an awful lot of it. Where are we going to get it from?
 
Tim: There’s an awful lot of data out there. And I think, one of the huge misunderstandings about the Semantic Web is, “oh, the Semantic Web is going to involve us all going to our HTML pages and marking them up to put semantics in them.” Now, there’s an important thread there, but to my mind, it’s actually a very minor part of it. Because I’m not going to hold my breath while other people put semantics in by hand.

I’m not going to wait for other people to do it, and I don’t want to do it either, to sort of add the semantics to HTML pages. So, where is the data going to come from? It’s already there. It’s in databases. So, most of this data is in databases. Often the data is already available through some kind of a Web interface.

So, if you take a government department, which is interested in defense data; you take a company’s products. You take a printer manufacturer, it sells all these printers, it sells all these ink cartridges, it can sometimes be an afternoon’s work to try and figure out which ink cartridge is actually compatible with which printer. Because they’re on different parts of the website and they haven’t published that information in RDF.

Suppose they publish the information in RDF, then you could just look up the printer and find all the ink cartridges which are compatible with it. And you could write programs that automatically go and buy all the ink cartridges at the appropriate prices and from the appropriate stores, and it’s all getting very automatable.
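
Editor's note: once compatibility is published as data, "look up the printer and find all the ink cartridges" becomes a short lookup. The product URIs, the ex:compatibleWith and ex:priceUSD properties, and the prices are all invented for this sketch.

```python
# Hypothetical product data as (subject, predicate, object) triples.
triples = [
    ("ex:cartridge-a", "ex:compatibleWith", "ex:printer-100"),
    ("ex:cartridge-b", "ex:compatibleWith", "ex:printer-100"),
    ("ex:cartridge-b", "ex:compatibleWith", "ex:printer-200"),
    ("ex:cartridge-a", "ex:priceUSD", 18.0),
    ("ex:cartridge-b", "ex:priceUSD", 25.0),
]


def compatible_cartridges(triples, printer, max_price=None):
    """List cartridges compatible with `printer`, optionally below a price."""
    prices = {s: o for s, p, o in triples if p == "ex:priceUSD"}
    found = sorted(s for s, p, o in triples if p == "ex:compatibleWith" and o == printer)
    if max_price is not None:
        # A cartridge with no known price is treated as too expensive.
        found = [c for c in found if prices.get(c, float("inf")) < max_price]
    return found


print(compatible_cartridges(triples, "ex:printer-100"))
print(compatible_cartridges(triples, "ex:printer-100", max_price=20))
```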

But, the thing that’s holding us up is that, there’s data which the companies have got on this, sitting and going round and round on its disks. Or it’s in their SQL systems and needs to be exported in a way that we can get at it, as linked RDF or through SPARQL. And then, that could be reused. And all the people, all the resellers, all the stationers who sell ink cartridges, for example, will be able to make much better websites because they’ll be able to pull compatibility information from the manufacturer.

The company will find that its users are happier, and they’ll end up selling more printers and more cartridges. The whole world will run more smoothly and we’ll have more time to get on with more important questions.
 
Paul:
23:09 How does a company that has one of these databases take the step to make it available to the Semantic Web? What do they have to do? You said they don’t have to go away and rewrite all their pages, but presumably there is a step they have to go through.
 
Tim: Well, there’s a couple of ways of doing it. Say that you’ve got a database-type website. One way to do it is to look at it… let’s stay with the printers, for example. When you look at the website you notice there’s a page on the printer, which has got the specifications, and it’s got a little table of the properties of the printer. And there’s a PHP script somewhere, which produces that.

So, you get somebody who understands these things to write another PHP script which is totally parallel, which just expresses the same information in RDF. That’s all. Expressing it in RDF is actually kind of simpler than expressing it in HTML, because when you express it in HTML, then you have to make sure that the CSS is pretty, and you’ve got the icons in the right places, and it meets the organization’s guidelines for being part of the website, and it’s got a consistent style and everything, and it’s got the navigation buttons.

When you do an RDF one, to a certain extent you need navigation buttons in the sense that when you output the data about the printer, you have to make sure that when you mention the compatible ink cartridge, you use the URI for the compatible ink cartridge, which will cause the RDF machine which is interested in that to pull up the RDF page about the ink cartridge.
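
Editor's note: a sketch of the "parallel script" idea, written in Python rather than PHP. Given the same product record that would feed the HTML page, it writes the data as Turtle and refers to each compatible cartridge by its URI so a client can follow the link. The URI space, the v: vocabulary, and the record fields are all hypothetical.

```python
BASE = "http://example.com/products/"  # hypothetical URI space
VOCAB = "http://example.com/vocab#"  # hypothetical vocabulary


def turtle_literal(text):
    """Quote a string as a Turtle literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def product_to_turtle(product):
    """Serialize one product record as Turtle, linking cartridges by URI."""
    subject = f"<{BASE}{product['id']}>"
    statements = [f"v:name {turtle_literal(product['name'])}"]
    statements.append(f"v:pagesPerMinute {int(product['ppm'])}")
    for cartridge_id in product["cartridges"]:
        # Use the cartridge's own URI, not its name, so the link can be followed.
        statements.append(f"v:compatibleCartridge <{BASE}{cartridge_id}>")
    body = " ;\n    ".join(statements)
    return f"@prefix v: <{VOCAB}> .\n\n{subject}\n    {body} .\n"


printer = {
    "id": "printer-100",
    "name": "Example Printer 100",
    "ppm": 20,
    "cartridges": ["cartridge-a", "cartridge-b"],
}
out = product_to_turtle(printer)
print(out)
```

There is no styling, navigation, or layout to get right here, which is the sense in which the RDF version is simpler than the HTML one.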

But this RDF page is meant for machines or people using RDF browsers. So people using things like the Tabulator will be able to pull that up and follow that and look at the links, and then also, they’re going to make tables of all vendors and tables of all the cartridges, and use the interface to concoct queries effectively, for all the printers that take ink cartridges that cost less than $20 or something.

So, one way to do it is to parallel the website, parallel each Web page, which has got interesting data about a particular thing, such as a product, with the same thing in RDF. And then, you try it out using an RDF browser to see if it works and see if it’s got the data. And then, you pass it to some of your friends or your colleagues or your peers in other companies who would be interested in the data and see whether they can use it and match it up.

The other way is that you start at the database. You just look at the database, you do that inventory of all the tables in the database, hopefully, you sit down with some of the people who designed the database tables, because often within a company, it’s really a bit of a black art knowing exactly what some of the columns in a database are actually meant to mean. Sometimes, you have to go and have coffee with the people who actually designed the database schema.

But, then you sit down and look at the tables, and you can point a piece of code, such as for example, the D2R Server code from Berlin. We’ve got Python code, dbview, you can point it at a MySQL database. And it will make a sort of default ontology and say, “That’s OK.” Let’s assume that for each table you’ve got… each table is about a set of things. We’ll make a class, which corresponds to the table. And we’ll give a URI to each of the things, which is described by a row in the table. And it will generate for you a default ontology.
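
Editor's note: the default mapping described here (a class per table, a URI per row, a property per column) can be sketched in a few lines. This is not how D2R Server or dbview is implemented; it uses an in-memory SQLite database in place of MySQL so that it runs on its own, and the URI scheme and sample table are invented.

```python
import sqlite3

BASE = "http://example.com/db/"  # hypothetical URI space for the export


def table_to_triples(conn, table, key_column):
    """Default mapping: table -> class, row -> URI, column -> property.

    `table` is interpolated into SQL, so it must come from a trusted source.
    """
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key_column]}"
        triples.append((subject, "rdf:type", f"{BASE}class/{table}"))
        for column, value in record.items():
            if column != key_column and value is not None:
                triples.append((subject, f"{BASE}{table}#{column}", value))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")
conn.execute("INSERT INTO product VALUES (1, 'Example Printer', 99.0)")

mapped = table_to_triples(conn, "product", "id")
for triple in mapped:
    print(triple)
```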

Now, the more you tell it about the database, then the better it will be, because it realizes, “Oh, yeah, this is a product ID that crops up here,” and it knows where the product ID crops up in other tables. Then, when you do a look-up for the RDF URIs for a given product, it will not only give you the data in the product table, it’ll give you the other links, the links pointing into it, that say that this is the product, or this company has ordered the product, it’s in this invoice, and it’s got this compatible product and so on.

So, as you tell the system more and more about the database, it starts to produce more and more a reasonable RDF view of the world. And you can wander around it with Disco or one of the other RDF browsers out there, and you can wander around it, and as you slowly add pieces or add labels to the ontology, then your mapping file is now becoming part ontology and part mapping. It’s mapping from and explaining the internal database schemas, how they map to this ontology that you’re making.

As this grows, you get a more and more useful system. So, you may in fact go through that process to a point where you actually start using the data. At a certain point, you may just decide that you’ve done enough for now and just export that data and let other people clean it up.

One of the things they’re likely to come back and say is, “Well, you’ve got a start and end time there, could you just export those in an iCal ontology?” Because a lot of people use start and end times, and if you do that, then, we could all put it on our calendars, and we can all put it on our timeline views and things.

So, after you’ve done this initial export raw from your database, then you can look around for terms that you’re using where there are actually ontologies out there, go to the Semantic Web Interest Group, go to your friends. If you’re in Cambridge, come to the gatherings we have every second Tuesday at MIT, with people who are doing this sort of thing with the Semantic Web. Ask around, “Is this ontology useful? Has anybody got an ontology for this?” Use tools like Swoogle to search for ontologies.

You don’t just use internal terms, but where you’re sharing terms, you use terms that other people use as well.
 
Paul:
29:26 Is this the kind of thing that the Linked Data Project are doing then?
 
Tim: The Linked Data Project as a whole has got lots of projects within it. It’s a sort of linked open data movement, I suppose, in which there are different projects. And the greatest thing is, with this linked RDF, you get interoperability between projects which are really quite different.

For example, DBpedia is one of the more famous pieces of it. DBpedia is an extraction from Wikipedia of all the little data boxes. So, you have a box, for example, in Wikipedia for cities, that gives its latitude and its longitude, which county it’s in, how many people, the population, and so on. And so, they construct relationships between geographical entities and so on. And so, it produces a really interesting graph, all by scraping the rather formalized HTML which is in the forms of a page to mark up or mark down, which is in the Wikipedia.
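
Editor's note: the general idea of turning an infobox into data can be sketched with a few lines of parsing. This is not DBpedia's extraction code; the infobox text is a simplified, made-up example in wiki-style markup, and the ex: property names are invented.

```python
import re

# A simplified, made-up infobox; the place and all values are fictional.
INFOBOX = """{{Infobox settlement
| name = Exampleville
| county = Example County
| latitude = 10.0
| longitude = 20.0
| population = 12345
}}"""

# Matches lines of the form "| key = value".
FIELD = re.compile(r"^\|\s*(\w+)\s*=\s*(.*?)\s*$")


def infobox_to_triples(subject, text):
    """Turn each non-empty "| key = value" line into a (subject, property, value) triple."""
    triples = []
    for line in text.splitlines():
        m = FIELD.match(line)
        if m and m.group(2):
            triples.append((subject, "ex:" + m.group(1), m.group(2)))
    return triples


extracted = infobox_to_triples("ex:Exampleville", INFOBOX)
for triple in extracted:
    print(triple)
```

Linking the county value to the county's own identifier, rather than leaving it as a string, is the step that turns these rows into a graph of related places.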

On other systems like MusicBrainz, they have a database and there was an ontology created and mapping done very deliberately. Of course, there are lots of people dealing with tracks and singers and albums, and so, there’s a lot of interest in interoperability for music players and so on, and music look-up services.

So MusicBrainz could piggyback on lots of other ontologies, but also, the singers are often in Wikipedia, so, they could connect songs, in some cases, which are in Wikipedia and certainly artists which are in Wikipedia into Wikipedia entries. So, that took a certain amount of negotiation between the parties to actually go through and do a data cleanup, in which they made sure they identified appropriately the corresponding nodes and linked them together.

So, some data is scraped from HTML pages, some of it is pulled out of databases, some of it comes from projects which have been in XML. So, things come in many different ways. And once they’re exported, as you browse around the RDF graph, as you write mash-ups to reuse that data, you really don’t have to be aware of how it was produced.
 
 
Paul:
32:03 That sounds like what you need, more or less, isn’t it? It’s about making this data visible and seamless and available to other people and their applications. And how big is this activity at the moment? I do remember some figure suggesting very large numbers of records now being available.
 
 
Tim: There are. I’m never good at large numbers. But, there are various linked open data projects. There’s a linkeddata.org that I see somebody has put together, there’s Linked Data in Wikipedia. If you Google for ‘linked open data’, then you’ll find pointers to various things. Where you’ll find a completely up-to-date list, I’m not sure. Maybe, we should send attached comments at the end of the blog, pointing to things like Richard’s Venn Diagram of how the different pieces overlap in the large Linked Data projects. There’s a lot of Linked Data which is not publicly advertised at DBpedia levels because it’s more of a niche interest, which benefits from connecting into these people and maybe the Internet and those sort of things. I think, it’s difficult to measure it all.

I think, one of the large contributions to it has been the friend-of-a-friend, which is exported by a bunch of social networking sites. So, the FOAF data. FOAF data used to be the largest contribution to the Semantic Web. I’m not sure whether it still is.

There’s the Web conference in China, and there’s going to be a Linked Open Data Workshop associated. That’s in April. And there’s also going to be some sort of linked open data conference in New York in June, I think. So, there’s a lot of stuff happening with people interested in this. Go to the Wikipage and contact people in some of the interest groups, there’s an IRC channel to find the people who are involved in that.
 
 
Paul:
34:21 Yeah. And I’ll include links to all of these in the show notes and people can follow along. There certainly is an awful lot happening. From our perspective here, we’re certainly watching this as being the thing that begins to demonstrate the real value of the Semantic Web outside of the research community. It’s what you can actually start to do with this data as you link it up.
 
 
Tim: Yeah. Well, I think, there’s a certain role that’s played by public data, so that if you say in your FOAF file… I say that I work in Boston. It’s kind of nice to say, “Well, I live in Boston,” and get a DBpedia reference to it, so, somebody’s immediately got access to a lot of data about it.

I did a demonstration for Fidelity the other day, where I pulled some comma separated value files, some CSV files, from their website about their mutual funds. So, I could then put that in RDF of course, and in that case you could look at it in tables and you could look at other people’s mutual funds, which you had pulled from other websites. To distinguish them, of course, I could put a link to who’s actually offering the funds, and I could use the DBpedia URIs for Fidelity. And that suddenly makes it much richer. And suddenly you could say, “OK, I’m looking at all the funds which are based in cities on the East Coast.” Suddenly, they see this connection.
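As a rough illustration of the CSV-to-RDF step described above, here is a minimal Python sketch. It is not the code used in the demonstration: the CSV contents, the `http://example.org/funds/` namespace, the predicate names, and the provider-to-URI mapping are all assumptions made up for this example. It only shows the idea of turning rows into triples and swapping a plain name for a shared URI so that other data can connect to it.

```python
import csv
import io

# Hypothetical CSV export; fund names and column names are invented for illustration.
CSV_TEXT = """fund,provider,city
Example Growth Fund,Fidelity,Boston
Example Income Fund,OtherCo,Chicago
"""

# Assumed mapping from provider names to DBpedia-style URIs (illustrative only).
PROVIDER_URIS = {
    "Fidelity": "http://dbpedia.org/resource/Fidelity_Investments",
}

EX = "http://example.org/funds/"  # hypothetical namespace for subjects and predicates


def csv_to_triples(text):
    """Turn each CSV row into (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = EX + row["fund"].replace(" ", "_")
        triples.append((subject, EX + "name", row["fund"]))
        # Link to a shared URI when one is known, otherwise keep the literal name.
        provider = PROVIDER_URIS.get(row["provider"], row["provider"])
        triples.append((subject, EX + "offeredBy", provider))
        triples.append((subject, EX + "basedIn", row["city"]))
    return triples


triples = csv_to_triples(CSV_TEXT)
for triple in triples:
    print(triple)
```

Once the provider is a URI rather than a string, anything else published about that URI (for example, where the company is based) can be joined in without changing the CSV.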
 
 
Paul:
35:47 Yeah, definitely. I think, as we move from the original reasonably straightforward idea of the Web towards a more Semantic Web, are we having difficulty bringing understanding with us? The original idea of the Web was pretty easy for almost anyone to get.

It was also actually very easy to do. You could simply open a text editor and write HTML. I can remember doing that, very early on, before the other tools came along.

As we move towards a more complicated Semantic Web, does it become harder to understand, and also harder to actually engage with?
 
 
Tim: I think, it depends really, it depends so much on how you look at it. So, the Web, and the Semantic Web, the existing Web… maybe we should say, the hypertext Web, the document Web, and a Data Web, are… In some ways we are not leaving the document Web behind. It is not as though, when you say we are moving towards a Semantic Web… Yes we are moving towards and we are implementing a Semantic Web, but it very much complements the Web of documents. The Web of documents will continue to exist, and as we are adding more and more video, and all forms of interactivity, that is going to be very exciting too.

So, we’ve got these two things that are maturing, in a way. If you use something like the N3 syntax, the N-Triples syntax, or the Turtle syntax for data, it’s very simple; it is a very simple language. You can write things down very easily. It is not really more complicated than HTML.

So, being able to write a FOAF file for yourself in N3 is easy, and then you can convert it into RDF/XML for output. So, in a way, it’s got the same, more or less, structure as in N3 syntax. It has got the same sort of items, you can convert your RDF into it and look at it. I can fix it and put it out there just like I would have done it with HTML. There’s a lot that can be said for it.
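To make the point about simple syntaxes concrete, here is a small Python sketch that writes a tiny FOAF description out as N-Triples lines. The FOAF namespace and the `name` and `knows` properties are the real FOAF terms; the two people and their `example.org` URIs are hypothetical, and the serializer is deliberately simplified (it only escapes backslashes and double quotes in literals).

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical identifiers for two people.
ME = "http://example.org/people/alice#me"
BOB = "http://example.org/people/bob#me"

# Each statement is (subject URI, predicate URI, (kind, value)).
triples = [
    (ME, RDF_TYPE, ("uri", FOAF + "Person")),
    (ME, FOAF + "name", ("literal", "Alice Example")),
    (ME, FOAF + "knows", ("uri", BOB)),
]


def to_ntriples(statements):
    """Serialize statements as N-Triples: one 'subject predicate object .' per line."""
    lines = []
    for subject, predicate, (kind, value) in statements:
        if kind == "uri":
            obj = f"<{value}>"
        else:
            # Simplified escaping; a full serializer handles more characters.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


document = to_ntriples(triples)
print(document)
```

Every line has the same three-part shape, which is what makes the format easy to write by hand and easy to convert to other RDF syntaxes.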

On the other hand, data is different from documents. When you write a document, if you write a blog, you write a poem, it is the power of the spoken word. And even if the website adds a lot of decoration, the really important thing is the spoken words. And it is one brain to another through these words.

And as a person is expressing himself to another human being, in such a way that the machines, they will try to understand it. But, they will really not be able to catch up with the poetry.

When we are creating data; when, for example, I am creating some information about an event. If I am creating information about an event, then, I am putting in the time and place in such a way that I can get latitude and longitude out of it. And putting in the people who are invited to the event, using their email addresses, for example, in such a way that those people can be identified and linked in. So, I am constructing something which fits in and will be reused in all kinds of different ways. Just as the poem will be reused. In the future people will read the poem, and get all kinds of different things out of it.

But, the data, is in a way… the whole is more powerful mathematically. The fact that people will be able to do inference and they will be able to conclude from the fact that a person is at that event in that location, that they are not in a city 100 miles away, and therefore that they cannot attend something else.

They will be able to do things like conclude from the fact that a different person at the event, took the photograph on their camera, during the event; therefore, the photograph was of the event. So, therefore, it would be reasonable to share the photograph with anybody else who was at the event. Those people would have been identified.
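The photograph inference described above can be sketched as a single rule in Python. Everything here is hypothetical: the event, the attendee names, the photo records, and the field names are invented, and a real Semantic Web system would express the rule over RDF data rather than Python dictionaries. The sketch only shows the shape of the reasoning: an attendee took the photo during the event, so the photo is of the event, so it can reasonably be shared with the other attendees.

```python
from datetime import datetime

# Hypothetical event description with identified attendees.
event = {
    "name": "workshop",
    "start": datetime(2008, 4, 22, 9, 0),
    "end": datetime(2008, 4, 22, 17, 0),
    "attendees": {"alice", "bob", "carol"},
}

# Hypothetical photo metadata: who took each photo and when.
photos = [
    {"id": "p1", "taken_by": "bob", "taken_at": datetime(2008, 4, 22, 11, 30)},
    {"id": "p2", "taken_by": "bob", "taken_at": datetime(2008, 4, 23, 10, 0)},
    {"id": "p3", "taken_by": "dave", "taken_at": datetime(2008, 4, 22, 12, 0)},
]


def photos_of_event(event, photos):
    """Rule: photographer attended AND photo taken during the event => photo of the event."""
    return [
        photo["id"]
        for photo in photos
        if photo["taken_by"] in event["attendees"]
        and event["start"] <= photo["taken_at"] <= event["end"]
    ]


def share_list(event, photos):
    """Map each inferred event photo to the attendees it could be shared with."""
    return {pid: sorted(event["attendees"]) for pid in photos_of_event(event, photos)}


shares = share_list(event, photos)
print(shares)
```

Only `p1` satisfies both conditions: `p2` was taken the next day, and `p3` was taken by someone who was not at the event.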

So, in principle, a whole lot of work can happen. Our life can be made much easier, because we put this in. And thinking about how that all works, I suppose it is more complicated, because it is more powerful; we are building systems which are going to do a whole lot more.

There we are moving, I suppose, from the horse to the motorcar. In a way, the motorcar is a more… the internal combustion engine is a more complicated thing, but it will enable things the horse can’t do. It can just go a whole lot faster and it can go on for a whole lot longer. Even though we still like horses.

One of the important things about the motorcar, of course, is that it has got actually, for most people when they use it, it has a very simple user interface.

So, I think, a crucial thing is that, most of the times we are using the Semantic Web, we are using it underneath the user interface, which will make it very, very easy in the way that people have got used to sorting their email, managing their calendars, and their contacts, and their address books, their appointment diaries.

These things have interfaces, which have matured over time. They are not perfect. They could all be improved upon. They will be improved upon. They get more complicated when they are linked together, but in a way they also become easier to use when they all have been linked together. When I can automatically go from my appointment diary, you know, through to the person, and then to the information about the place they work, and so on.

In a way, it will again put… The underlying architecture of the Semantic Web is actually much simpler than that of HTML. It is just these triples. So, in a way understanding it, and developing with it, is actually a whole lot simpler. It is not inherently more complicated.

However, when you look at the complexity of all the things on the Web, or the Semantic Web, when you look at the size of them, of data from all kinds of different places, and you think about the implications of that, then yes, it is complicated. The Semantic Web is already complicated. The public Semantic Web is a very interesting and complicated, very powerful, useful thing which is interesting to analyze.

So, from that point of view, the Web as a beast, gets more complicated and more intricate, in a way more exciting with the Semantic Web.
 
 
Paul:
42:32 I guess one of the areas where we have seen a lot of growth in easy-to-use tools on the Web recently has been around the whole sort of participative movement that has loosely been labeled Web 2.0.

You wrote a blog post towards the end of last year, on the Giant Global Graph, which really made it quite clear how some of the Semantic Web ideas that had been around for a long time, applied to the sort of new kids on the block, in the social networking sphere.

Do you think developers of applications like, say, Facebook and LinkedIn and the rest, are ready to embrace the Semantic Web, or do you think they think they can do it themselves?
 
 
Tim: I think, there are two parts to that. There is whether they will need to give up the data, and whether they are willing to use the standards. You will find to start with a lot of places, like LiveJournal, for example, they expose FOAF. So standard RDF Friend of a Friend for your friend network.

If you look at MyOpera, not only do they expose a FOAF link but they allow you in your Opera profile to say, I am also this LiveJournal person. So, you can follow your links, you can follow the friend of a friend, the social network, through one site and into another.

I think, it is a very grown-up thing to realize that you are not the only social networking site. When you do that, it is like a website that all of a sudden… otherwise it is like a website which doesn’t have any links out. In the Semantic Web similarly, if you don’t have any links out, well, that’s boring.

In fact, a lot of the value of many websites is the links out.

So if you start off with one of these social networks that does have links out, then you will find out a huge amount. If you find one which doesn’t, then you will be able to explore it using common tools, if they use the FOAF standards, but I bet you’ll be limited; you will bump into the edges.
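The difference between a network with links out and one without can be sketched as a graph traversal. In this Python example the two sites, the account names, and the "I am also this person" identity link are all hypothetical; it simply assumes each site exports who knows whom, and that one profile points at the matching account on the other site.

```python
from collections import deque

# Hypothetical "knows" edges exported by two separate sites.
site_a = {"a:alice": ["a:bob"], "a:bob": []}
site_b = {"b:bob": ["b:carol"], "b:carol": []}

# Hypothetical identity link: the same person has an account on both sites.
same_as = {"a:bob": "b:bob"}


def reachable(start, graphs, identity_links):
    """Breadth-first walk over friend edges, crossing sites via identity links."""
    edges = {}
    for graph in graphs:
        for person, friends in graph.items():
            edges.setdefault(person, []).extend(friends)
    # Identity links can be followed in both directions.
    for left, right in identity_links.items():
        edges.setdefault(left, []).append(right)
        edges.setdefault(right, []).append(left)

    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in edges.get(current, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


with_links = reachable("a:alice", [site_a, site_b], same_as)
without_links = reachable("a:alice", [site_a, site_b], {})
print(sorted(with_links))
print(sorted(without_links))
```

Without the identity link the walk stops at the edge of the first site, which is the "bumping into the edges" effect described above.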

Now if you look at the social networking sites which, if you like, are traditional Web 2.0 social networking sites, they hoard this data. The business model appears to be, “We get the users to give us data and we reuse it to our benefit. We get the extra value.” When one person has typed in who it is that’s in a photo, then we can benefit. We give the other person extra benefit by being able to give them a list of photos that they are in. That’s tremendously beneficial.

That’s the power of the Semantic Web. And I think, the social networking sites, some of the ones that have become very popular have done it because they captured the semantics. They haven’t just allowed you to tag something with somebody’s name, they’ve allowed you to capture the difference between somebody who took the photo and somebody who’s in the photo, so that the power of the reuse of the data has been much greater.

So, first of all, are they going to let people use the data? I think, the push now, as we’ve seen during the last year, has been unbearable pressure from users to say, “Look, I have told you who my friends are. You are the third site I’ve told who my friends are. Now, I’m going to a travel site and now I’m going to a photo site and now I’m going to a t-shirt site. Hello? You guys should all know who my friends are.” Or, “You should all know who my colleagues are. I shouldn’t have to tell you again.”

So, the users are saying, “Give me my data back. That’s my data.” That was one of the cries originally behind XML, it was a desktop application. Don’t store it in a format which I can’t reuse. So, now it’s, “Give it to me using the idea of standards. If you do that, then I can do things with it.”

Now, there are two architectures which allow you to do this. The way some of the sites are working is that you’ll go to, for example, a t-shirt site which is going to allow you to print a t-shirt or something. Or say you go to a photo site and say, “Now I want to see the photos of my friends. You don’t know who my friends are. I am going to authorize you in some way, using something like OpenAuth, to go to another site. I’ll open the gate with them, I’ll tell them that it’s OK to use the information about who my friends are.”

So, just for the purpose of printing those t-shirts or just for the purpose of sharing these photographs with my friends, I’ll allow you to know who my friends are. So, we’re getting this moving of user data between different sites. Now, we’ve got the user data stored in more than one place. Obviously, refreshing is important and we’ve got dangers of inconsistency and so on, and we’ve got all this third-party authentication going on.

There’s another model, which is that I, the user, run an application in my browser, for example, or on my desktop. It could be an AJAX application. It could be an application which allows me to look at photos. But, what it does is, it pulls the photo information from many places, and I directly authenticate.

And when it pulls that information in, it pulls in all the information I rightfully have access to. It pulls my friends’ information as well from different places. So, if I’ve got social networks, or for that matter, if I’ve just got files in Web space. If I’ve got a friend-of-a-friend file, or even if I’ve got my local file on my desktop that now I can use. So, I can use my address book.

So, it now pulls all the information that I have access to about the social network, and it pulls all the information in that I have access to about photos, and then it allows me to browse the web of photographs of people using the full power of the integration of all those things. It allows me to look at photographs of friends, photographs of people that are friends of friends, but are not my friends, to see if I should be adopting them as friends and so on.
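The second model, where an application on the user's side pulls from several sources and integrates them locally, might look roughly like this in Python. The friend lists, photo records, URLs, and field names are invented for illustration, and the sketch skips authentication and fetching entirely; it only shows the merge and the "friends of friends who are not yet my friends" query.

```python
# Hypothetical friend data fetched from two sources (for example, two FOAF files).
friend_sources = [
    {"me": {"bob"}, "bob": {"carol"}},
    {"me": {"dana"}, "dana": {"erin", "bob"}},
]

# Hypothetical photo metadata fetched from two photo sites.
photo_sources = [
    [{"url": "http://photos.example.org/1.jpg", "depicts": "carol"}],
    [
        {"url": "http://pics.example.net/2.jpg", "depicts": "bob"},
        {"url": "http://pics.example.net/3.jpg", "depicts": "erin"},
    ],
]


def merge_friends(sources):
    """Union the friend sets for each person across all sources."""
    merged = {}
    for source in sources:
        for person, friends in source.items():
            merged.setdefault(person, set()).update(friends)
    return merged


def friends_of_friends(merged, me):
    """People my friends know who are neither me nor already my friends."""
    direct = merged.get(me, set())
    candidates = set()
    for friend in direct:
        candidates |= merged.get(friend, set())
    return candidates - direct - {me}


def photos_depicting(sources, people):
    """URLs of photos, from any source, that depict one of the given people."""
    return sorted(
        photo["url"]
        for source in sources
        for photo in source
        if photo["depicts"] in people
    )


merged = merge_friends(friend_sources)
fof = friends_of_friends(merged, "me")
suggested = photos_depicting(photo_sources, fof)
print(sorted(fof), suggested)
```

Neither source alone knows that `erin` is a friend of a friend or where her photos are; the answer only appears after the data from all four places is combined on the user's side.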

It can do all these powerful things, and it’s happening actually in the user’s browser, or it is happening on the user’s machine. Both of these systems at some point allow people to share data. The second system is much simpler. The second system involves people writing scripts which will operate across different data sources.

Web 2.0 is a stovepipe system. It’s a set of stovepipes where each site has got its data and it’s not sharing it. What people are sometimes calling Web 3.0 is a vision where you’ve got lots of different data out there on the Web and you’ve got lots of different applications, but they’re independent. A given application can use different data. An application can run on a desktop or in my browser, it’s my agent. It can access all the data, which I can use and everything’s much more seamless and much more powerful because you get this integration. The same application has access to data from all over the place. Does that make sense?
 
 
Paul:
49:52 Absolutely, yes. And I think, creating and maintaining that split between the data and the interface or the application certainly has to be the way that we go. As you say, persuading some of the companies who have built a business model around holding the data is the task that we still have to get right.

Although I suppose, as choice arises, people can choose not to go to those sites, can’t they?
 
 
Tim: People can indeed choose not to go to that site. It reminds me of the story of what happened when bookshops went onto the Web. Sometimes I have to remind people about this. When originally the Web came up and bookshop owners learned about it, they said, “OK, I was told at my dinner party that we have to have a website, so get us a website.” The website would come up and it would say, “We recommend you go to see this wonderful bookshop. This is the address.” And it would give you the directions.

But, they never put up a list of books that they sold. If people put that up, they wouldn’t go to the bookstore. The important thing was people should go to the bookstore. And anyway, it was also commercially sensitive data. If you put up a list of books that you were carrying, if you put up a catalogue, then, your competition could immediately use that information to compete unfairly.

And then suddenly they would realize, they would be told, “Well, excuse me, sir, the competition already has its catalogue on the site. So, everybody is going to their website. Nobody is going to our website because it doesn’t have any information. When they go to the competition, they go to the website first to check if they have the book, and then, they know if they go to the store they’re going to be able to find it.”

“Oh, OK. Well, I guess we’d better put our catalogue up.” “Shall we put the prices up?” “Oh, no. Don’t put the prices up, because that’s commercially sensitive. They should see that when they go to the store.” “Oh, the other people have put their prices up now? And so now they’re taking our customers again?” “OK, I guess we’ll have to put our prices up.”

Stock levels? Of course you don’t put the stock levels! “That’s our backend information, you don’t put the stock levels up. People can come to the store and when they order it they’ll find out whether we’ve got it in stock or not. Oh, really, they don’t like that?” Clearly then, bookstores moved to putting their stock levels online because people got fed up with finding that they’d ordered a book and then they get a little email saying that it’s been backordered for two months.

So, there’s this syndrome of competitive disclosure. When actually business works better, when people have disclosed and are communicating one to the other. Once it starts, then it can snowball. So, once we have people putting their catalogues up in RDF, it may be that there will be aggregators that look at products and they won’t see your products if they’re not up there using Semantic Web standards.

Likewise, if you’re giving a talk, and the person doesn’t advertise it using the standards, it won’t be streamed. It won’t be up there, people won’t have it on their calendars. People won’t come and see you talk because the information wasn’t made available publicly in an interoperable fashion.

So, I think, the lesson of the bookstores on the Web is an important one. If you’re working for a company and there’s a sort of hesitation about sharing information with peers, that you know actually will make the company work better, tell them that story.
 
 
Paul:
53:33 Will do. We will write it down and disseminate it widely. As we come to the end, I am conscious of the time, what do you think – and this is probably the hardest question of them all – what do you think the biggest challenges facing Semantic Web adoption and Semantic Web rollout are over the next couple of years?
 
 
Tim: Oh, that is a great question. I suppose, I think, the paradigm shift is the biggest hurdle. The fact that when you think in terms of Semantic Web, you think differently. It was actually a problem for the Web too. People look back and they say, “Well, the Web is so easy, you just download the Web browser and then you could just….” And the moment they use the Web browser, you had to write HTML, and then you could edit HTML pages with editors and the whole world took off.

Well actually, before there was a significant amount of Web, it was really difficult to persuade people it would be a good idea. They just didn’t understand how fundamentally essential it would be to be on the Web. They didn’t understand what a kick they’d get out of finding that somebody had reused their information in a different way. They didn’t understand how beneficial it would be to have more or less all information that they could think of available.

And imagining it, now imagine people write a SPARQL query as though the world, as though all the data to which you actually legally practically have access, actually is technically available to you as well – just anything which comes up into your mind as a scientist, as a businessman, just as a school kid wondering the answer to a science project question… There are obviously a set of people who get it.

They have a twinkle in their eye, they are incredibly fired up, because they understand it is going to be really really exciting when it all happens. To a certain extent, they are finding that these areas like life science, like social networking, like the Linked Open Data projects, where it is all starting to come together.

There are other areas where somebody who has worked in data systems doesn’t get it. So, to understand, so explaining it, why you can’t do it in an audio blog of 60 minutes. Because when you explain the new way of looking at the world, new way of looking at data, moving up a level from the database to the Web of things, you have to listen to where somebody is coming from, you have to understand what concepts they’ve got at the moment. All this is coming to this point of view of an object-oriented programmer or a database person, because the way you paint the Semantic Web is going to be very different.

And the misunderstandings they will have, they naturally get about Semantic Web will be very different. But, it is happening more and more. So, I think, it is a question of how this meme can spread or how understanding about what this is. But, I hope that having Linked Data online, having user interfaces to it will help.

One of the crazy things, one of the big impediments we have had for the last few years, I guess was maybe a planning fault, is we didn’t have user interfaces, we didn’t have generic interfaces. When people asked me what the Semantic Web browser would be like, I’d say, well you don’t understand, it is not really… documents are for browsers, data is for applications, so these applications will use it.

And in fact, I realized we need to get that feedback, “Oh look, Ma, at my Semantic Web data”, just like “look, Ma, at my Web page.” Hence the development of things like the Tabulator, which are very much in their infancy, but starting to be able to give people that instant gratification. I put my data up there and now I can see it up there, now I can show you, now I can immediately get kudos. Now, I can stop having to answer the phone. I can point people to the data. They can go and use a Semantic Web browser on it, they don’t have to come and ask me.

So that, I think, is an important thing. We are only really at early stages of sort of the art and science of producing good Semantic Web, generic cross-domain Semantic Web browsers… and editors of course.
 
 
Paul:
58:14 And that is an interesting point actually. Your original Web tool was a browser and an editor. Was it a mistake not to push harder to maintain that right back at the beginning?
 
 
Tim: I really wish we could have for a lot of reasons. To start with, people shouldn’t have had the pain of having to write angle brackets, and that people were prepared to… that was a total shock to me. I had assumed that people wouldn’t. Also if we’d have had… so we would have had I think, a much more collaborative space had all the browsers been editors.

And also, we wouldn’t have all this terrible markup, because we would have had the markup where they are generated automatically, and it would have actually had matching tags. So, in that respect, in a number of respects, an ambiguous one being a collaborative space. We had to wait too long for blogs and wikis. Blogs and wikis would have happened sort of very much more easily if they had been editable things, if people had editors.

The problem of course was HTML got complicated. It had all kinds of things like sorting DIVs, which are difficult to edit. When you have nested lists, it is more difficult to edit than unnested lists. And I think, that is part of it. Also I think, the fear of actually being able to edit a page; then we would have had to develop some sort of templating. When you edit a blog, actually you edit only the very middle of the page, you edit a stream of text, then you have a very limited markup, and you don’t get an option of editing all the stuff around it that is generated automatically.

So, I think, what we need are editors, so that is a limited. We should have a type of HTML form where you can just type and do bold, and strong, and emphasis and so on, very easily using these interfaces, which are supported by the browser instead of by a bunch of JavaScript. And that will help us become more – that way we need to be more collaboratively creative.

And at the same time, for the Data Web, it is important that when people see data that is wrong, if they’ve got the rights to access it, they should be able to fix it. And if they see an address, see a wrong address up there or an email address and also it should be, actually this is not right.

It should be very easy for people to enter data and also things that we do like entering bug reports, entering agenda items for meetings, entering new events, all kinds of things that are really generating data and we should be able to do that really easily, and yet keeping both the Web of hypertext and the Web of Data, keeping it the Read/Write Web is a really big priority for me.
 
 
Paul:
61:21 Absolutely. And it is fascinating to see how people do take to things like wikis and certainly conversations internally within Talis, where the development team uses Wikis all the time. And as you roll them out to other parts of the organization, there is an initial fear to get over, but once you get over it, people take to it like ducks to water. It is remarkable to see, yeah absolutely. Good, thank you very much Tim. Before we wrap up, do you have any final things that you wish I’d asked you?
 
 
Tim: No fundamental thing. Paul it is really good of you to do this series. I find it really useful to be able to delve into and it has been great listening. So, thanks for keeping on this tradition. I think, it is great for us now and maybe it is pretty interesting for posterity as well to track people’s ideas in this space.
 
 
Paul:
63:24 Thank you very much and thank you for taking the time to take part.
 
 
Tim: You are very welcome.
 
 
Paul: Thanks.

[music]

Reference:
Talis. (February 7, 2008). Sir Tim Berners-Lee talks with Talis about the Semantic Web. Retrieved April 19, 2008, from http://talis-podcasts.s3.amazonaws.com/twt20080207_TimBL.html.

Categories
Semantic Web, Tim Berners-Lee

Tim Berners-Lee: Visions on Semantic Web

Diane |

Recently, Sir Tim Berners-Lee created a new post on his blog about the Semantic Web:

Well, the Semantic Web has been in the news a bit recently.

There was the buzz about Twine, a “Semantic Web company”, getting another round of funding. Then, Yahoo announced that it will pick up Semantic Web information from the Web, and use it to enhance search. And now the Times online mis-states that I think “Google could be superseded”. Sigh. In an otherwise useful discussion largely about what the Semantic Web is and how it will affect people, a misunderstanding which ended up being the title of the blog. In fact, the conversation as I recall started with a question whether, if search engines were the killer app for the familiar Web of documents, what will be the killer app for the Semantic Web.

Text search engines are of course good for searching the text in documents, but the Semantic Web isn’t text documents, it is data. It isn’t obvious what the killer apps will be – there are many contenders. We know that the sort of query you do on data is different: the SPARQL standard defines a query protocol which allows application builders to query remote data stores. So that is one sort of query on data which is different from text search.
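To show how a data query differs from a text search, here is a minimal Python sketch of preparing a SPARQL request. The SPARQL protocol lets a client send the query text to an endpoint as a `query` parameter on an HTTP GET; the endpoint address below is a placeholder assumption, and the query itself (listing names described with the FOAF vocabulary) is only an example. The sketch builds the request URL and does not contact any server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint; substitute the address of a real SPARQL service.
ENDPOINT = "http://example.org/sparql"

# Example query: names of things described with the FOAF vocabulary.
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE {
  ?person foaf:name ?name .
}
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Encode the query text as the 'query' parameter of a GET request URL."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again recovers the original query text.
print(parse_qs(urlsplit(url).query)["query"][0])
```

The query matches a pattern in the data (anything with a name) rather than a string of words in a document, which is the distinction drawn above.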

One thing to always remember is that the Web of the future will have BOTH documents and data. The Semantic Web will not supersede the current Web. They will coexist. The techniques for searching and surfing the different aspects will be different but will connect. Text search engines don’t have to go out of fashion.

The “Google will be superseded” headline is an unfortunate misunderstanding. I didn’t say it. (We have, by the way, asked it to be fixed. One can, after all, update a blog to fix errors, and this should be appropriate. Ian Jacobs wrote an email, left voice mail, and tried to post a reply to the blog, but the reply did not appear on the blog – moderated out? So we tried.)

Now of course, as the name of The Times was once associated with a creditable and independent newspaper :-) , the headline was picked up and elaborated on by various well-meaning bloggers. So the blogosphere, which one might hope to be the great safety net under the conventional press, in this case just amplified the error.

I note that here the blogosphere was misled by an online version of a conventional organ. There are many who worry about the inverse, that decent material from established sources will be drowned beneath a tide of low-quality information from less creditable sources.

The Media Standards Trust is a group which has been working with the Web Science Research Initiative (I’m a director of WSRI) to develop ways of encoding the standards of reporting a piece of information purports to meet: “This is an eye-witness report”; or “This photo has not been massaged apart from: cropping”; or “The author of the report has no commercial connection with any products described”; and so on. Like creative commons, which lets you mark your work with a licence, the project involves representing social dimensions of information. And it is another Semantic Web application.

In all this Semantic Web news, though, the proof of the pudding is in the eating. The benefit of the Semantic Web is that data may be re-used in ways unexpected by the original publisher. That is the value added. So when a Semantic Web start-up either feeds data to others who reuse it in interesting ways, or itself uses data produced by others, then we start to see the value of each bit increased through the network effect.

So if you are a VC funder or a journalist and some project is being sold to you as a Semantic Web project, ask how it gets extra re-use of data, by people who would not normally have access to it, or in ways for which it was not originally designed. Does it use standards? Is it available in RDF? Is there a SPARQL server?

A great example of Semantic Web data which works this way is Linked Data. There is a growing mass of interlinked public data, much of it promoted by the Linked Open Data project. There is an upcoming Linked Data workshop on this at the WWW 2008 Conference in April in Beijing, and on June 17-18 in New York at the Linked Data Planet Conference. Linked data comes alive when you explore it with a generic data browser like the Tabulator. It also comes alive when you make mashups out of it. (See Playing with Linked Data, Jamendo, Geonames, Slashfacet and Songbird; Using Wikipedia as a database). It should be easier to make those mashups by just pulling RDF (maybe using RDFa or GRDDL) or using SPARQL, rather than having to learn a new set of APIs for each site and each application area.

I think there is an important “double bus” architecture here, in which there are separate markets for the raw data and for the mashed up data. Data publishers (e.g., government departments) just produce raw data now, and consumer-facing sites (e.g., soccer sites) mash up data from many sources. I might talk about this a bit at WWW 2008.

So in scanning new Semantic Web news, I’ll be looking out for re-use of data. The momentum around Linked Open Data is great and exciting — let us also make sure we make good use of the data.

………………………………………………………………………………………………..
Reference:
Berners-Lee, T. (March 27, 2008) timbl’s blog. Retrieved April 19, 2008, from
http://dig.csail.mit.edu/breadcrumbs/blog/4

Categories
Semantic Web, Tim Berners-Lee

Semantic Web Resources

Diane |

It’s now time to start providing more Semantic Web resources for your reference, beyond the W3C alone, which is, of course, the best one of all. But we must explore further. In this posting I will try to cover some of the more useful ones.

The following book can be purchased from the MIT Press. You may also enjoy looking at the publisher’s companion site, which has slides and extra content such as exercises to test your understanding.

Books:

Antoniou, G. & van Harmelen, F. (2008). A Semantic Web Primer. Cambridge, MA: MIT Press.

The development of the Semantic Web, with machine-readable content, has the potential to revolutionize the World Wide Web and its use. A Semantic Web Primer provides an introduction and guide to this emerging field, describing its key ideas, languages, and technologies. Suitable for use as a textbook or for self-study by professionals, it concentrates on undergraduate-level fundamental concepts and techniques that will enable readers to proceed with building applications on their own. It includes exercises, project descriptions, and annotated references to relevant online materials. A Semantic Web Primer is the only available book on the Semantic Web to include a systematic treatment of the different languages (XML, RDF, OWL, and rules) and technologies (explicit metadata, ontologies, and logic and inference) that are central to Semantic Web development. The book also examines such crucial related topics as ontology engineering and application scenarios.

After an introductory chapter, topics covered in succeeding chapters include XML and related technologies that support semantic interoperability; RDF and RDF Schema, the standard data model for machine-processable semantics; OWL, the W3C-approved standard for a Web ontology language more extensive than RDF Schema; rules, both monotonic and nonmonotonic, in the framework of the Semantic Web; selected application domains and how the Semantic Web would benefit them; the development of ontology-based systems; and current debates on key issues and predictions for the future.

About the Authors

Grigoris Antoniou is Professor at the Institute for Computer Science, FORTH (Foundation for Research and Technology-Hellas), Heraklion, Greece.

Frank van Harmelen is Professor in the Department of Artificial Intelligence at the Vrije Universiteit, Amsterdam, the Netherlands.

Suggested Resources from the authors:

An excellent introductory article, from which, among others, the scenario from ‘Last night I had a dream’ was adapted:
T. Berners-Lee, J. Hendler and O. Lassila. The Semantic Web. Scientific American 284,5 (May 2001): 34-43.
http://www.sciam.com/article.cfm?chanID=sa006&colID=1&articleID=00048144-10D2-1C70-84A9809EC588EF21

An inspirational book about the history (and the future) of the Web is: T. Berners-Lee. Weaving the Web. Harper 1999.
http://www.amazon.co.uk/exec/obidos/ASIN/1587990180/ref=sr_aps_books_1_1/202-1618432-3813461

There are a large number of introductory articles on the Semantic Web available online. Here we list a few:

T. Berners-Lee. Semantic Web Road Map.
http://www.w3.org/DesignIssues/Semantic

T. Berners-Lee. Evolvability.
http://www.w3.org/DesignIssues/Evolution.html

T. Berners-Lee. What the Semantic Web can represent.
http://www.w3.org/DesignIssues/RDFnot.html

E. Dumbill. The Semantic Web: A Primer.
http://www.xml.com/pub/a/2000/11/01/semanticweb/

F. van Harmelen, D. Fensel. Practical Knowledge Representation for the Web.
http://www.cs.vu.nl/~frankh/postscript/IJCAI99-III.html

J. Hendler. Agents and the Semantic Web. IEEE Intelligent Systems, March-April 2001.
http://www.cs.umd.edu/users/hendler/AgentWeb.html

S. Palmer. The Semantic Web, Taking Form.
http://infomesh.net/2001/06/swform/

S. Palmer. The Semantic Web: An Introduction.
http://infomesh.net/2001/Swintro/

A. Swartz. The Semantic Web in Breadth.
http://logicerror.com/semanticWeb-long

A. Swartz, J. Hendler. The Semantic Web: A Network of Content for the Digital City.
http://blogspace.com/rdf/SwartzHendler

What is the Semantic Web?
http://swag.webns.net/whatIsSW

What are the differences between a vocabulary, a taxonomy, a thesaurus, an ontology, and a meta-model?
http://www.metamodel.com/article.php?story=20030115211223271

Rob Jasper, Anita Tyler. The role of semantics and inference in the semantic web, a commercial challenge
http://www.semanticweb.org/SWWS/program/position/soi-jasper.pdf

There are several courses on the Semantic Web that have extensive material online. Here we list a few:

J. Heflin. The Semantic Web
http://www.cse.lehigh.edu/~heflin/courses/sw-fall01/

A. Sheth. Semantic Web
http://lsdis.cs.uga.edu/SemWebCourse_files/SemWebCourse.htm

S. Staab. Intelligent Systems on the World Wide Web.
http://www.aifb.uni-karlsruhe.de/Lehrangebot/Sommer2001/IntelligenteSystemeImWWW/

H. Boley, S. Decker, M. Sintek. Tutorial on Knowledge Markup Techniques.
http://www.dfki.uni-kl.de/km/knowmark

F. van Harmelen et al. Web-Based Knowledge Representation.
http://www.cs.vu.nl/~frankh/webkr.html

There are a number of relevant Web sites that maintain up-to-date information about the Semantic Web and related topics:

http://www.SemanticWeb.org

http://www.w3.org/2001/sw/

http://www.ontology.org

Finally, there is a good selection of research papers that provide much more technical information on issues relating to the Semantic Web.

D. Fensel, J. Hendler, H. Lieberman and W. Wahlster (eds). Spinning the Semantic Web. MIT Press 2002, ISBN 0-262-06232-1.
http://www.amazon.co.uk/exec/obidos/ASIN/0262062321/qid%3D1075813007/202-1618432-3813461

J. Davies, D. Fensel and F. van Harmelen (eds). Towards the Semantic Web: Ontology-driven Knowledge Management. John Wiley, ISBN 0-470-84867-7.
http://www.amazon.co.uk/exec/obidos/search-handle-form/202-1618432-3813461

The International Semantic Web Conference series. The 2001 edition was published by IOS Press, ISBN 1 58603 255 0.
http://www.amazon.co.uk/exec/obidos/search-handle-form/202-1618432-3813461

……………………………………………………………………………………………………
Reference:

MITPress.MIT.Edu (2008). A Semantic Web Primer. Retrieved April 19, 2008, from http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=10140.

Institute of Computer Science (ICS) of the Foundation for Research and Technology Hellas (FORTH). (2008). Links for chapter 1. Retrieved April 19, 2008, from http://www.ics.forth.gr/isl/swprimer/index2.php?opened=6&selected=1

Categories
Semantic Web

Semantic Web – Primer Part 6

Diane | April 8, 2008

Source: Semantic Web FAQ
Resource: W3C-Semantic Web
Retrieved: http://www.w3.org/2001/sw/SW-FAQ#What1

How do I participate in the Semantic Web?

Does the Semantic Web require me to manually mark up all the existing web pages, or to convert all the data in relational databases into RDF?

The Semantic Web is about a web of data. The data itself can reside in databases, spreadsheets, Wiki pages, or indeed traditional web pages.

The challenge is to develop tools that can export these data into RDF form: RDF plays the role of a common model, as a kind of glue to integrate the data. That does not mean that the data must be physically converted into RDF form and stored in, say, RDF/XML. Instead, automatic procedures (for example, SQL to RDF converters for relational databases, or GRDDL processors for XHTML files with microformats) can produce RDF data on the fly as an answer to, e.g., queries. RDF data may also be included in the data via other tools (e.g., Adobe’s XMP data that gets automatically added to JPEG images by Photoshop). Authoring tools also exist to develop, e.g., ontologies on a high level instead of editing the ontology files directly. Of course, direct editing of RDF data is sometimes necessary, but it can be expected to become less and less prevalent as smarter editors come to the fore.

Clearly, lots of development is still to be done in this area, and it is a subject of active Research and Development. The goal is to reuse, as much as possible, existing data in its existing form, and minimize the RDF data that has to be created manually.
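
To illustrate the SQL to RDF idea mentioned above, here is a minimal sketch in Python that reads rows from a relational table and emits N-Triples lines on the fly, without storing any RDF. Everything specific in it is an assumption made for illustration: the `http://example.org/` namespace, the `person` table, and the helper name `rows_to_ntriples`. Real converters handle datatypes, foreign keys, and full literal escaping, which this sketch does not.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace, not from the FAQ


def rows_to_ntriples(conn, table, key):
    """Yield one N-Triples line per non-key, non-NULL column value.

    `table` is interpolated into the SQL text, so it must come from
    trusted code, never from user input.
    """
    cur = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"<{BASE}{table}/{record[key]}>"
        for column, value in record.items():
            if column == key or value is None:
                continue
            predicate = f"<{BASE}{table}#{column}>"
            # Minimal escaping only (backslash and double quote).
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            yield f'{subject} {predicate} "{literal}" .'


# A throwaway in-memory database standing in for existing relational data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Alice', 'Boston')")

for line in rows_to_ntriples(conn, "person", "id"):
    print(line)
```

The data stays in its existing form in the database; the triples exist only for as long as a consumer asks for them.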

Where do I find tools for Semantic Web Development?
There are several lists on the Web that give a more-or-less comprehensive overview of the various available tools. There is a Wiki page on the W3C ESW Wiki site that is maintained by the W3C staff as well as the community at large. This page includes references to programming environments, validators that can be used to validate RDF/XML data or OWL ontologies, SPARQL endpoints, specialized editors or triple databases. It also includes references to other lists, like Dave Beckett’s Resource Description Framework (RDF) Resource Guide or the tool list maintained at the Freie Universität Berlin.

How do I put RDF into my (X)HTML documents?

Unfortunately, it is currently not possible to incorporate full RDF into XHTML without violating the validity of the resulting XHTML, except for the usage of the meta and the link elements in the header.

The best solution is to store the RDF separately and use the URIs to refer to the XHTML page and the link element in the XHTML page to refer to the RDF content. This technique is often called an RDF autodiscovery link and is used by a number of tools already.
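
As a rough sketch of how a tool might follow such an autodiscovery link, the Python below scans a page for link elements whose type is application/rdf+xml and collects their href values. The sample page, the file name foaf.rdf, the rel="meta" value, and the names `RDFLinkFinder` and `find_rdf_links` are assumptions for illustration; the FAQ does not prescribe them.

```python
from html.parser import HTMLParser


class RDFLinkFinder(HTMLParser):
    """Collect href values of <link> elements that advertise RDF/XML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        attributes = dict(attrs)
        if tag == "link" and attributes.get("type") == "application/rdf+xml":
            href = attributes.get("href")
            if href:
                self.hrefs.append(href)


# Hypothetical page: the RDF lives in a separate file, linked from the head.
PAGE = """<html><head>
<title>Example</title>
<link rel="meta" type="application/rdf+xml" title="FOAF" href="foaf.rdf" />
<link rel="stylesheet" type="text/css" href="style.css" />
</head><body></body></html>"""


def find_rdf_links(html):
    """Return the hrefs of all RDF/XML autodiscovery links in the page."""
    finder = RDFLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.hrefs


print(find_rdf_links(PAGE))
```

The XHTML stays valid because only a standard link element is used, and the RDF itself can be fetched separately from the discovered address.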

However, work is going on for a better integration of RDF into documents. The GRDDL Working Group has recently developed a bridge to the microformats approach, and the Semantic Web Deployment group’s work on RDFa develops an additional XHTML1.1 module that gives the possibility to use virtually any RDF vocabularies as annotations of the XHTML content. Finally, eRDF (developed by Talis) offers a formalism somewhere between the two: one can add general RDF data to an (X)HTML page without problems with validity, although with restrictions on the type of RDF vocabularies that can be used this way.

How can I learn more about the Semantic Web?

Dave Beckett’s Resource Description Framework (RDF) Resource Guide gives a quite comprehensive list of references to Semantic Web related articles. The home page of the Semantic Web Activity lists all the recommendations and gives references to some of the presentations, articles, etc., that have been given by the W3C staff or the members of the working groups on the subject. A separate page lists a number of tutorials that might be of interest.

The (now defunct) Semantic Web Best Practices and Deployment Working Group has produced a number of notes that might be useful when developing ontologies, setting up servers to serve RDF data, using XML Schema datatypes with RDF, etc. The newly chartered Semantic Web Deployment Working Group will continue developing similar documents.

A number of books have also been published. A list of books is given on W3C’s Wiki site, comprising (at this moment) over 40 books in different languages, published by major publishers like O’Reilly, MIT Press, Cambridge University Press, and Springer Verlag.

References:

W3C Semantic Web (2001). W3C Semantic Web Frequently Asked Questions. Retrieved April 15, 2008, from http://www.w3.org/2001/sw/SW-FAQ#What1

Copyright © 1994-2008 W3C ® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.

Categories
Semantic Web

Semantic Web – Primer Part 4

Diane | April 2, 2008

Source: Semantic Web FAQ
Resource: W3C-Semantic Web
Retrieved: http://www.w3.org/2001/sw/SW-FAQ#What1

How does the Semantic Web relate to XML Schemas? What do ontologies buy me that XML and XML Schema don’t?

  • An ontology differs from an XML Schema in that it is a knowledge representation, not a message format. Most industry-based Web standards consist of a combination of message formats and protocol specifications. These formats have been given an operational semantics, such as “Upon receipt of this PurchaseOrder message, transfer Amount dollars from AccountFrom to AccountTo and ship Product.” But the specification is not designed to support reasoning outside the transaction context. For example, we won’t in general have a mechanism to conclude that because the Product is a type of Chardonnay it must also be a white wine.
  • One advantage of OWL ontologies will be the availability of tools that can reason about them. Tools will provide generic support that is not specific to the particular subject domain, which would be the case if one were to build a system to reason about a specific industry-standard XML schema.  They will benefit from third party tools based on the formal properties of the OWL language, tools that will deliver an assortment of capabilities that most organizations would be hard pressed to duplicate.
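
The Chardonnay example can be sketched in a few lines of Python. This is only a toy: the class names, the `SUBCLASS_OF` table, and the helpers `superclasses` and `is_a` are invented for illustration, and a real OWL reasoner supports far more than following subclass links. It does show the kind of generic, domain-independent reasoning that a message format alone does not provide.

```python
# Hypothetical ontology fragment: each class maps to its direct superclasses.
SUBCLASS_OF = {
    "Chardonnay": {"WhiteWine"},
    "WhiteWine": {"Wine"},
    "Wine": {"Beverage"},
}


def superclasses(cls, subclass_of):
    """Return every class reachable from cls by following subclass links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def is_a(cls, target, subclass_of):
    """True if cls is target or a (possibly indirect) subclass of it."""
    return cls == target or target in superclasses(cls, subclass_of)


print(is_a("Chardonnay", "WhiteWine", SUBCLASS_OF))
print(sorted(superclasses("Chardonnay", SUBCLASS_OF)))
```

Nothing in the functions knows about wine. Swapping in a different table of classes gives the same reasoning for a different subject domain.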

Also, XML data is very sensitive to the XML Schema it refers to. If the XML Schema changes, the same XML data may become invalid, i.e., be rejected by Schema-aware parsers. A somewhat similar dependence on RDF Schemas and Ontologies exists for RDF data, too: if the RDF Schema or OWL Ontology changes, the inferences drawn from the RDF data may change. However, the core RDF data is still usable; there is no notion of the data being rejected by, e.g., a parser due to a Schema/Ontology change. In general, RDF is more robust against changes to Schemas and Ontologies than XML is against changes to XML Schemas. Note that a GRDDL transformation from XML to RDF may be given by an XML Schema as described in the GRDDL specification. This allows any XML document that validates according to the XML Schema given at the namespace URI of the XML vocabulary to be converted to RDF.

References:

W3C Semantic Web (2001). W3C Semantic Web Frequently Asked Questions. Retrieved March 6, 2008, from http://www.w3.org/2001/sw/SW-FAQ#What1

Copyright © 1994-2008 W3C ® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.

Categories
Semantic Web
