At this point, we have learned the major technical components for the Semantic Web, and it is time for us to take a look at some real-world examples. Starting from Friend of a Friend (FOAF) is a good choice since it is simple and easy to understand, yet it does tell us a lot about what the Semantic Web looks like, especially in the area of social networking.

Studying FOAF also gives us a chance to practice what we have learned about RDF, RDFS and OWL. Another good reason is that the FOAF namespace shows up in many ontology documents and in much of the literature, so understanding FOAF is practically a necessity.

As usual, we first examine what exactly FOAF is and what it accomplishes for us. Then we dive inside FOAF to see how it works. Finally, we take a look at some real examples and also come up with our own FOAF document.

Another interesting topic we cover in this chapter is semantic markup. As you will see, semantic markup is the actual implementation of the idea of adding semantics to the current Web to make it machine-readable. However, once you understand semantic markup, you will see the issues associated with it. The possible solutions to these issues are covered in later chapters, and they will give you even more opportunity to further understand the idea of the Semantic Web.

1 What FOAF Is and What It Does

1.1 FOAF in Plain English

In the early days of the Semantic Web, developers and researchers were eager to build some running examples of the Semantic Web, both to experiment with the idea and, hopefully, to show its benefits. Yet, as we saw in the previous chapters, to build applications on the Semantic Web, we must have some ontologies, and we must mark up Web documents by using these ontologies so we can turn them into documents that are machine-readable.

Obviously, in order to create such an example application quickly, it would be easier to focus on one specific domain, so that creating the ontologies would be constrained in scope and not too difficult. In addition, to rapidly yield a large number of Web documents created by using this specific ontology, the effort would have to involve a lot of people who were willing to participate. Therefore, a fairly straightforward project to start with would be some people-centric Semantic Web application.

There are many millions of personal Web pages on the Web. On each such Web site, the author often provides some personal information, such as e-mail addresses, pictures, interests, etc. The author may also include links to his/her friends’ Web sites, thereby creating a social network. With this network, we can answer questions such as “who has the same interests as I do?”, which might mean we can sell our old camera to that person. We can also find someone who lives close to us and works at roughly the same location, so we can contact him/her and discuss the possibility of car-pooling.

All this sounds great. However, since all the personal Web sites are built for human eyes, we would have to do all the above manually, and it is very hard to create any application to do all that for us.

To make these documents understandable to an application, two major steps have to be accomplished: first, a machine-readable ontology about persons has to be created, and second, each personal homepage has to be marked up, i.e., it has to be connected to some RDF statement document written by using this ontology.

This was the motivation behind the FOAF project. Founded by Dan Brickley and Libby Miller in mid-2000, FOAF is an open community-led initiative with the goal of creating a machine-readable web of data in the area of personal homepages and social networking.

It is important to understand the concept of a “machine-readable web of data”. Just like the HTML version of your homepage, FOAF documents can be linked together to form a web of data. The difference is that this web of data is formed with well-defined semantics, expressed in the person ontology. We will come back to this point later in this chapter and in the coming chapters as well.

In plain English, FOAF is simply a vocabulary (or ontology) that includes the basic terms to describe personal information, such as who you are, what you do, who your friends are, and so on. It serves as a standard for everyone who wants to mark up their homepages and turn them into documents that can be processed by machines.

1.2 FOAF in Official Language

First, the official FOAF Web site is found here:

It has an official definition of FOAF:

The Friend of a Friend (FOAF) project is creating a Web of machine-readable pages describing people, the links between them and the things they create and do.

This definition should be clear enough based on our discussion so far. Again, you can simply understand FOAF as a machine-readable ontology describing persons, their activities and their relations to other people. Therefore, FOAF and FOAF ontology are interchangeable concepts.

Notice that the FOAF ontology is not a standard from W3C. It is managed in the style of an Open Source or Free Software project, and is maintained by a community of developers. However, FOAF does depend on W3C standards, such as RDF and OWL. More specifically,

  • FOAF ontology is written in OWL, and

  • FOAF documents must be well-formed RDF documents.

FOAF ontology’s official specification can be found at this location:

New updates and related new releases can be found at this page as well. In addition, the FOAF ontology itself can be found (and downloaded) from this URL:

As usual, FOAF ontology is a collection of terms, and all these terms are identified by predefined URIs, which all share the following leading string:

By convention, this URI prefix string is associated with the namespace prefix foaf:, and is typically used in RDF/XML format with the prefix foaf.

Finally, there is also a wiki site for the FOAF project, which is found at the following URL,

and you can use this wiki to learn more about the FOAF project as well.

2 Core FOAF Vocabulary and Examples

With what we have learned so far, and given the fact that FOAF ontology is written in OWL, understanding FOAF ontology should not be difficult. In this section, we cover the core terms in this ontology, and also present examples to show how the FOAF ontology is used.

2.1 The Big Picture: FOAF Vocabulary

FOAF terms are grouped in categories. Table 7.1 summarizes these categories and the terms in each category. Notice that FOAF is also under constant change and update; it would not be surprising if, at the time you read this book, you find more terms in some categories.

Table 7.1 FOAF vocabulary

As you can see, the FOAF ontology is not a big ontology at all, and most of the terms are quite intuitive. Notice that a term starting with a capital letter identifies a class; otherwise, it identifies a property.

2.2 Core Terms and Examples

It is not possible to cover all the FOAF terms in detail. In this section, we discuss some of the most frequently used terms, and leave the rest of them for you to study.

The foaf:Person class is one of the core classes defined in FOAF vocabulary, and it represents people in the real world. List 7.1 is the definition of Person class, taken directly from the FOAF ontology:

List 7.1 Definition of Person class

As you can see, foaf:Person is defined as a subclass of the Person class defined in WordNet. WordNet is a semantic lexicon for the English language. It groups English words into sets of synonyms called synsets, and provides short and general definitions, including various semantic relations between these synonym sets. Developed by the Cognitive Science Laboratory of Princeton University, WordNet has two goals: first, to produce a combination of dictionary and thesaurus that is more intuitively usable, and second, to support automatic text analysis and artificial intelligence applications.

During the past several years, WordNet has found more and more usage in the area of the Semantic Web, and FOAF class foaf:Person is a good example. By being a subclass of wordNet:Person, FOAF vocabulary can fit into a much broader picture. For example, an application which only knows WordNet can also understand foaf:Person even if it has never seen FOAF vocabulary before.

By the same token, foaf:Person is also defined to be a subclass of several outside classes defined by other ontologies, such as the following two classes:

Notice that foaf:Person is a subclass of foaf:Agent, which can represent a person, a group, software or some physical artifact. A similar agent concept is also defined in WordNet. Furthermore, a foaf:Person cannot at the same time be a foaf:Document, a foaf:Organization or a foaf:Project; the ontology declares foaf:Person to be disjoint with these classes.
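The subclass reasoning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names are written as prefixed strings (wn:Person is a placeholder for the WordNet class, not its real URI), and the SUBCLASS_OF table and both helper functions are hypothetical, not part of any FOAF tool.

```python
# Hypothetical table of rdfs:subClassOf statements, using prefixed names
# as plain strings ("wn:" stands in for the WordNet namespace).
SUBCLASS_OF = {
    "foaf:Person": {"foaf:Agent", "wn:Person"},
}


def superclasses(cls, subclass_of):
    """Return every class reachable from cls by following subClassOf links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def is_instance_of(types, target, subclass_of):
    """True if any asserted type equals target or is a subclass of it."""
    return any(
        t == target or target in superclasses(t, subclass_of) for t in types
    )


# A resource typed only as foaf:Person is also recognized as a wn:Person.
print(is_instance_of({"foaf:Person"}, "wn:Person", SUBCLASS_OF))
```

An application that only asks for wn:Person instances would therefore still pick up resources typed as foaf:Person, provided it has loaded the subclass statement.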

Besides the foaf:Person class, the FOAF ontology defines quite a few other classes, with the goal of including the main concepts that can be used to describe a person as a resource. You can read these definitions in the same way we read the definition of foaf:Person. For example, foaf:Document represents the things that are considered to be documents used by a person, and foaf:Image is a subclass of foaf:Document, since all images are indeed documents.

Properties defined by FOAF can be used to describe a person at a quite detailed level. For example, foaf:firstName is a property that describes the first name of a person. This property has foaf:Person as its domain, and http://www.w3.org/2000/01/rdf-schema#Literal as its value range. Similarly, foaf:givenname is the property describing the given name of a person, and it has the same domain and value range. Notice that a simpler alternative to these two properties is the foaf:name property.

The foaf:homepage property relates a given resource to its homepage. Its domain is http://www.w3.org/2002/07/owl#Thing, and its range is foaf:Document. It is important to realize that this property is an inverse functional property. Being inverse functional does not limit how many values a subject can have, so a given Thing can have multiple homepages; however, if two Things have the same homepage, then these two Things are in fact the same Thing.

A similar property is the foaf:mbox property, which describes a relationship between the owner of a mailbox and the mailbox. This is also an inverse functional property: if two foaf:Person resources have the same foaf:mbox value, these two foaf:Person instances have to be exactly the same person. On the other hand, a foaf:Person can indeed own multiple foaf:mbox instances. We will come back to this important property soon.

Let us take a look at some examples, and we will also cover some other important properties in these examples.

First, List 7.2 shows a typical description of a person:

List 7.2 Example of using foaf:Person

List 7.2 simply says that there is a person, this person’s name is Liyang Yu and the person’s e-mail address is liyang@liyangyu.com.
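To make this kind of description concrete, here is a minimal Python sketch that parses a small RDF/XML document with the standard library. The embedded document is a reconstruction written for this example from the description above, not a verbatim copy of the listing; the namespace URIs are the published RDF and FOAF namespaces, and the function name describe_people is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Illustrative document: one person with a name and a mailbox, no rdf:about.
FOAF_DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Liyang Yu</foaf:name>
    <foaf:mbox rdf:resource="mailto:liyang@liyangyu.com"/>
  </foaf:Person>
</rdf:RDF>"""


def describe_people(rdf_xml):
    """Collect URI (if any), name and mailboxes of each foaf:Person element."""
    root = ET.fromstring(rdf_xml)
    people = []
    for person in root.iter(f"{{{FOAF}}}Person"):
        people.append({
            # rdf:about is absent here, so this value is None.
            "uri": person.get(f"{{{RDF}}}about"),
            "name": person.findtext(f"{{{FOAF}}}name"),
            "mbox": [m.get(f"{{{RDF}}}resource")
                     for m in person.findall(f"{{{FOAF}}}mbox")],
        })
    return people


print(describe_people(FOAF_DOC))
```

This treats RDF/XML as plain XML, which is enough for a flat example like this one; a real application would normally use an RDF parser so that every legal RDF/XML serialization is handled.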

The first thing to notice is that there is no URI to identify this person at all. More specifically, you don’t see the following pattern where rdf:about attribute is used on foaf:Person resource:

<foaf:Person rdf:about="some_URI"/>

This seems to have broken one of the most important rules we have for the world of the Semantic Web. This rule says, whenever you decide to publish some RDF document to talk about some resource on the Web (in this case, Liyang Yu as a foaf:Person instance), you need to use a URI to represent this resource, and you should always use the existing URI for this resource if it already has one.

In fact, List 7.2 is correct, and this is done on purpose. This is also one of the important features of a FOAF document. Let us understand the reason here.

It is certainly not difficult to come up with a URI to uniquely identify a person. For example, I can use the following URI to identify myself:

<foaf:Person rdf:about="http://www.liyangyu.com/people#LiyangYu"/>

The difficult part is how to make sure other people know this URI and when they want to add additional information about me, they can reuse this exact URI.

One solution comes from foaf:mbox property. Clearly, an e-mail address is closely related to a given person, and it is also safe to assume that this person’s friends should all know this e-mail address. Therefore, it is possible to use an e-mail address to uniquely identify a given person, and all we need to do is to make sure if two people have the same e-mail address, these two people are in fact the same person.

As we discussed earlier, FOAF ontology has defined the foaf:mbox property as an inverse functional property, as shown in List 7.3:

List 7.3 Definition of foaf:mbox property

Now, if one of my friends has the following descriptions in her FOAF document:

<foaf:Person>
  <foaf:nick>Lao Yu</foaf:nick>
  <foaf:title>Dr</foaf:title>
  <foaf:mbox rdf:resource="mailto:liyang@liyangyu.com"/>
</foaf:Person>

An application that understands FOAF ontology will be able to recognize the foaf:mbox property and conclude that this is exactly the same person as described in List 7.2. And apparently, among other extra information, at least we now know this person has a nickname called Lao Yu.
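The merging step an application performs here can be sketched as follows. This is a simplified illustration, assuming each description has already been reduced to a Python dictionary with exactly one mbox value; the function name merge_by_mbox and the dictionary layout are hypothetical.

```python
def merge_by_mbox(descriptions):
    """Group property values by foaf:mbox, treating it as inverse functional.

    Descriptions that share an mbox value are taken to describe the same
    person, so their property values are collected under one entry.
    """
    merged = {}
    for description in descriptions:
        entry = merged.setdefault(description["mbox"], {})
        for prop, value in description.items():
            entry.setdefault(prop, set()).add(value)
    return merged


# One description from my own document, one from a friend's document.
mine = {"mbox": "mailto:liyang@liyangyu.com", "name": "Liyang Yu"}
friends = {"mbox": "mailto:liyang@liyangyu.com", "nick": "Lao Yu", "title": "Dr"}

people = merge_by_mbox([mine, friends])
print(len(people))
print(people["mailto:liyang@liyangyu.com"]["nick"])
```

Neither description carries a URI for the person, yet both end up in a single entry, which is exactly the effect the inverse functional property is meant to achieve.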

Clearly, the foaf:mbox property has solved the problem of identifying a person as a resource: when describing a person, you don’t have to find the URI that identifies this person, and you certainly don’t have to invent your own URI either; all you need to do is make sure you include his/her e-mail address in your description, as shown here.

foaf:mbox_sha1sum is another property defined by the FOAF vocabulary which functions just like the foaf:mbox property. You will see this property quite often in related documents and literature, so let us talk about it here as well.

As you can tell, the value of the foaf:mbox property is a simple textual representation of your e-mail address. In other words, after you have published your FOAF document, your e-mail address is open to the public. This may not be what you wanted. For one thing, spam can flood your mailbox within a few hours. For this reason, FOAF provides another property, foaf:mbox_sha1sum, which offers a different representation of your e-mail address. You can get this representation by taking your e-mail address, in its mailto: URI form, and applying the SHA1 algorithm to it. The resulting representation is indeed long and ugly, but your address no longer appears in plain text.

There are several different ways to generate the SHA1 sum of your e-mail address; we will not cover the details here. It is a good idea to use foaf:mbox_sha1sum whenever you can. It is also defined as an inverse functional property, so it can be used to uniquely identify a given person.
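As one possible way to compute this value, the sketch below uses Python's standard hashlib module. It assumes the hash is taken over the full mailto: URI, as the FOAF specification describes; the function name mbox_sha1sum is hypothetical, and no normalization (such as lower-casing) is applied.

```python
import hashlib


def mbox_sha1sum(email):
    """Return the SHA1 hex digest of the mailto: URI for an e-mail address."""
    # Accept either a bare address or a full mailto: URI.
    mailto = email if email.startswith("mailto:") else "mailto:" + email
    return hashlib.sha1(mailto.encode("utf-8")).hexdigest()


digest = mbox_sha1sum("liyang@liyangyu.com")
print(len(digest))  # a SHA1 hex digest is 40 characters long
```

Two documents that hash the same address produce the same digest, which is what allows the property to identify a person. Keep in mind that anyone who already knows or can guess the address can compute the same digest, so the hash hides the address from casual harvesting rather than making it secret.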

Now let us move on to another important FOAF property, foaf:knows. We use it to describe our relationships with other people, and it is very useful when it comes to building the social network using FOAF documents. Let us take a look at one example. Suppose part of my friend’s FOAF document looks like this:

<foaf:Person>
  <foaf:name>Connie</foaf:name>
  <foaf:mbox rdf:resource="mailto:connie@liyangyu.com"/>
</foaf:Person>

If I want to indicate in my FOAF document that I know her, I can include the code in List 7.4 into my FOAF document:

List 7.4 Example of using foaf:knows property

This shows that I know a person who has an e-mail address given by connie@liyangyu.com. Again, since property foaf:mbox is used, a given application will be able to understand that the person I know has a name called Connie; notice that no URI has been used to identify her at all.

Also notice that you cannot assume that the foaf:knows property is a symmetric property; in other words, that I know Connie does not imply that Connie knows me. If you check the FOAF vocabulary definition, you can see foaf:knows is indeed not defined as symmetric.

Perhaps the most important use of the foaf:knows property is to connect FOAF files together. Often, by mentioning other people (foaf:knows) and providing an rdfs:seeAlso property at the same time, we can link different RDF documents together. Let us discuss this a little further at this point; in later chapters, we will see applications built upon these relationships.

We saw the rdfs:seeAlso property in previous chapters. It is defined in the RDF Schema namespace, and it indicates that there is some additional information about the resource this property is describing. For instance, I can add one more line to List 7.2, as shown in List 7.5:

List 7.5 Example of using rdfs:seeAlso property

Line 8 says that if you want to know more about this Person instance, you can find it in the resource pointed to by http://www.yuchen.net/liyang.rdf.

Here, the resource pointed to by http://www.yuchen.net/liyang.rdf is an old FOAF document that describes me, but I can, in fact, point to a friend’s FOAF document using rdfs:seeAlso, together with property foaf:knows, as shown in List 7.6:

List 7.6 Use foaf:knows and rdfs:seeAlso to link RDF documents together

Now, an application seeing the document shown in List 7.6 will move on to access the document identified by the following URI (line 12):

and by doing so, FOAF aggregators can be built without the need for a centrally managed directory of FOAF files.

As a matter of fact, the FOAF community treats the rdfs:seeAlso property as the hyperlink of FOAF documents. More specifically, one FOAF document is considered to contain a hyperlink to another document if it includes an rdfs:seeAlso property, and the value of this property is the target of that hyperlink. Here, the FOAF document can be thought of as a root HTML page, and the rdfs:seeAlso property plays the role of an <a href="..."> link contained in that page. It is through the rdfs:seeAlso property that a whole web of machine-readable metadata can be built. We will see more about this property and its important role in the chapters yet to come.
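To illustrate how an application might read these "hyperlinks", the following Python sketch pulls every rdfs:seeAlso target out of a small RDF/XML document. The document is written for this example only; the URL http://example.org/connie.rdf is a placeholder, and the function name see_also_links is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Illustrative document: I know a person identified by her mailbox, and
# more information about her can be found at the rdfs:seeAlso target.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Liyang Yu</foaf:name>
    <foaf:mbox rdf:resource="mailto:liyang@liyangyu.com"/>
    <foaf:knows>
      <foaf:Person>
        <foaf:mbox rdf:resource="mailto:connie@liyangyu.com"/>
        <rdfs:seeAlso rdf:resource="http://example.org/connie.rdf"/>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>"""


def see_also_links(rdf_xml):
    """Return the rdf:resource value of every rdfs:seeAlso element."""
    root = ET.fromstring(rdf_xml)
    links = []
    for element in root.iter(f"{{{RDFS}}}seeAlso"):
        target = element.get(f"{{{RDF}}}resource")
        if target:
            links.append(target)
    return links


print(see_also_links(DOC))
```

The returned list is what a crawler would add to its queue of documents to visit next.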

The last two FOAF terms we would like to discuss here are foaf:depiction and foaf:depicts. It is quite common for people to put their pictures on their Web sites. To help us add statements about these pictures to the related FOAF document, the FOAF vocabulary provides two properties. The first is the foaf:depiction property, and the second is the foaf:depicts property; make sure you know the difference between these two.

The foaf:depiction property is a relationship between a thing and an image that depicts the thing. In other words, it makes the statement such as “this person (Thing) is shown in this image”. On the other hand, foaf:depicts is the inverse property: it is a relationship between an image and something that image depicts. Therefore, to indicate the fact that I have a picture, I should use line 9 as shown in List 7.7:

List 7.7 Example of using foaf:depiction property

I will leave it to you to understand the usage of the foaf:depicts property.
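As a hint, the inverse relationship between the two properties can be sketched with plain tuples standing in for RDF triples. The identifiers ex:liyang and ex:photo.jpg are placeholders, and the function name add_inverse_depictions is hypothetical.

```python
def add_inverse_depictions(triples):
    """Return the triples plus the inverse of each depiction/depicts statement."""
    result = set(triples)
    for subject, predicate, obj in triples:
        if predicate == "foaf:depiction":
            # thing --depiction--> image  implies  image --depicts--> thing
            result.add((obj, "foaf:depicts", subject))
        elif predicate == "foaf:depicts":
            # image --depicts--> thing  implies  thing --depiction--> image
            result.add((obj, "foaf:depiction", subject))
    return result


stated = {("ex:liyang", "foaf:depiction", "ex:photo.jpg")}
inferred = add_inverse_depictions(stated)
print(("ex:photo.jpg", "foaf:depicts", "ex:liyang") in inferred)
```

In other words, whichever of the two statements you write down, an application that knows the properties are inverses of each other can derive the other one.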

Up to this point, we have talked about several classes and properties defined in the FOAF vocabulary. Again, you should have no problem reading and understanding the whole FOAF ontology. Let us move on to the topic of how to create your own FOAF document and also make sure that you know how to get into the “friend circle”.

3 Create Your FOAF Document and Get into the Friend Circle

In this section, we talk about several issues related to creating your own FOAF document and joining the circle of friends. Before we can do all this, we need to know how the FOAF project has designed the flow, as we will see in the next section.

3.1 How Does the Circle Work?

The circle of FOAF documents is created and maintained by the following steps.

  • Step 1. A user creates the FOAF document.

    As a user, you create a FOAF document by using the FOAF vocabulary as we discussed in the previous section. The only thing you need to remember is that you should use foaf:knows property together with rdfs:seeAlso property to connect your document with the documents of other friends.

  • Step 2. Link your homepage to your FOAF document.

    Once you have created your FOAF document, you should link it from your homepage. And once you have finished this step, you are done; it is now up to the FOAF project to find you.

  • Step 3. FOAF uses its crawler to visit the Web and collect all the FOAF documents.

    In the context of the FOAF project, a crawler is called a scutter. Its basic task is not much different from that of any other crawler: it visits the Web and tries to find RDF files. In this case, it has to find a special kind of RDF file: a FOAF document. Once it finds one, the least it will do is parse the document and store the triples in its data system for later use.

    An important feature about the scutter is that it has to know how to handle the rdfs:seeAlso property: whenever the scutter sees this, it follows the link to reach the document pointed to by rdfs:seeAlso property. This is the way FOAF constructs a network of FOAF documents.

    Another important fact about the scutter is that it has to take care of the data merging issue. To do so, the scutter has to know which FOAF properties can uniquely identify resources. More specifically, foaf:mbox, foaf:mbox_sha1sum and foaf:homepage are all defined as inverse functional properties; therefore, each of them can uniquely identify the individuals that have it. In practice, one solution the scutter can use is to keep a list of RDF statements that involve any of these properties, and to consult this list when necessary to merge triples that are in fact describing the same individual.

  • Step 4. FOAF maintains a central repository and is also responsible for keeping the information up to date.

    FOAF also has to maintain a centralized database to store all the triples it has collected and other relevant information. To keep this database up to date, it has to run the scutter periodically to visit the Web.

  • Step 5. FOAF provides a user interface so we can find our friends and conduct other interesting activities.

    FOAF offers some tools one can use to view the friends in the circle, which further defines the look-and-feel of the FOAF project. Among these tools, FOAF Explorer is quite popular, and you can find this tool at this location:

  • http://xml.mfd-consult.dk/foaf/explorer/

Figure 7.1 is an example of viewing a FOAF document using FOAF explorer. The FOAF document being viewed was created by Dan Brickley, one of the founders of the FOAF project.

Fig. 7.1 FOAF Explorer shows Dan Brickley’s FOAF document

Up to this point, we have gained an understanding of how the FOAF project works to build a network of FOAF documents. It is time to create our own FOAF document and join the circle.
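The crawling and merging steps described in this section can be sketched as a small loop. This is only a toy model under loud assumptions: the "Web" is an in-memory dictionary named WEB rather than real HTTP requests, each document has already been parsed into people and rdfs:seeAlso targets, every person has exactly one foaf:mbox, and the function name scutter and the example.org URLs are hypothetical. It is not the FOAF project's actual crawler.

```python
from collections import deque

# Hypothetical in-memory "Web": URL -> already-parsed FOAF document.
WEB = {
    "http://example.org/liyang.rdf": {
        "people": [{"mbox": "mailto:liyang@liyangyu.com", "name": "Liyang Yu"}],
        "see_also": ["http://example.org/connie.rdf"],
    },
    "http://example.org/connie.rdf": {
        "people": [
            {"mbox": "mailto:connie@liyangyu.com", "name": "Connie"},
            {"mbox": "mailto:liyang@liyangyu.com", "nick": "Lao Yu"},
        ],
        "see_also": ["http://example.org/liyang.rdf"],
    },
}


def scutter(start_url, fetch):
    """Follow rdfs:seeAlso links breadth-first and merge people by mbox."""
    store = {}               # mbox -> merged properties of one individual
    seen = set()             # URLs already visited, so cycles terminate
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        document = fetch(url)
        if document is None:  # unreachable or not a FOAF document
            continue
        for person in document["people"]:
            # foaf:mbox is inverse functional: same mbox, same individual.
            store.setdefault(person["mbox"], {}).update(person)
        queue.extend(document["see_also"])
    return store


store = scutter("http://example.org/liyang.rdf", WEB.get)
print(sorted(store))
```

Starting from a single document, the loop reaches the second one through rdfs:seeAlso and folds the nickname found there into the entry for the same mailbox, with no central directory of FOAF files involved.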

3.2 Create Your FOAF Document

The most straightforward way to create a FOAF document is to use a simple text editor. This requires you to directly use the FOAF vocabulary. Given the self-explanatory nature of the FOAF ontology, this is not difficult to do. Also you need to validate the final document, just to make sure its syntax is legal.

The other choice is to use tools to create FOAF documents. The most popular one is called “FOAF-a-Matic”. You can find the link to this tool on the official FOAF Web site, and at the current time, its URL is given below:

Figure 7.2 shows the main interface of this authoring tool.

Fig. 7.2 Use FOAF-a-Matic to create your own FOAF document

To use this form, you don’t have to know any FOAF terms, you just need to follow the instructions to create your FOAF document. More specifically, this form allows you to specify your name, e-mail address, homepage, your picture and phone number and other personal information. It also allows you to enter information about your work, such as work homepage and a small page describing what you do at your work. More importantly, you will have a chance to specify your friends, and provide their FOAF documents as well. Based on what we have learned so far, this will bring both you and your friends into the FOAF network.

Notice that you can leave a lot of fields on the form empty. The only required fields are “First Name”, “Last Name” and “Your Email Address”. By now, you should understand the reason why you have to provide an e-mail address: FOAF does not assign a URI to you at all, and later on, it will use this e-mail address to uniquely identify you.

Once you have finished filling out the form, clicking the “FOAF me!” button gives you an RDF document which uses the FOAF vocabulary to present a description of you. At this point, you need to copy the generated statements from the output window, paste them into your favorite editor, and save them to a file so you can later join the circle of friends, as is discussed next.

3.3 Get into the Circle: Publish Your FOAF Document

Once you have created your FOAF document, the next step is to publish it in a way that it can be easily harvested by the scutter (FOAF’s crawler) or other applications that can understand FOAF documents. This is what we mean when we say “get into the circle”. There are three different ways to get into the circle, and we discuss these different methods in this section.

  • Add a link from your homepage to your FOAF document

The easiest solution is to link your homepage to your FOAF document. This can be done using the <link> element as shown in List 7.8:

List 7.8 Add a link from your homepage to your FOAF document

Remember to substitute href to point to your own FOAF document. Also notice that your FOAF file can be any name you like, but foaf.rdf is a common choice.
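From the crawler's side, discovering such a link might look like the Python sketch below, which uses the standard html.parser module. The homepage markup is an assumption based on the commonly used FOAF autodiscovery convention (a link element with type application/rdf+xml); the URL http://example.org/foaf.rdf is a placeholder, and the names FoafLinkFinder and find_foaf_links are hypothetical.

```python
from html.parser import HTMLParser

# Illustrative homepage; the href is a placeholder for your own FOAF file.
HOMEPAGE = """<html>
  <head>
    <title>My homepage</title>
    <link rel="meta" type="application/rdf+xml" title="FOAF"
          href="http://example.org/foaf.rdf" />
  </head>
  <body><p>Welcome!</p></body>
</html>"""


class FoafLinkFinder(HTMLParser):
    """Collect href values of <link> elements that point to RDF/XML files."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        if attributes.get("type") == "application/rdf+xml" and attributes.get("href"):
            self.links.append(attributes["href"])


def find_foaf_links(html):
    """Return every RDF/XML link advertised in the given HTML page."""
    finder = FoafLinkFinder()
    finder.feed(html)
    return finder.links


print(find_foaf_links(HOMEPAGE))
```

The sketch checks only the type attribute; a stricter crawler might also look at rel and title before deciding that the target is a FOAF document.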

This is quite easy to implement; however, the downside is that you have to wait for the crawler to visit your homepage and discover your FOAF document. Without this discovery, you will never be able to get into the circle. Given that there are millions of personal Web sites out there on the Web, the FOAF scutter will have to traverse the Web for a long time to find you, if it can find you at all.

To solve this problem, you can use the second solution, which makes the discovery process much more efficient.

  • Ask your friend to add an rdfs:seeAlso link that points to your document

This is a recommended way to get your FOAF document indexed. Once your friend has added a link to your document by using rdfs:seeAlso in his/her document, you can rest assured that your data will appear in the network.

To implement this, your friend needs to remember that he/she has to use foaf:knows and rdfs:seeAlso together by inserting the following lines into his/her FOAF document:

<foaf:knows>
  <foaf:Person>
    <foaf:mbox rdf:resource="mailto:you@yourEmail.com"/>
    <rdfs:seeAlso rdf:resource="http://path_to_your_foaf.rdf"/>
  </foaf:Person>
</foaf:knows>

Now, the fact that your friend is already in the circle means that the FOAF scutter has visited his/her document already. Since the scutter periodically revisits the same files to pick up any updates, it will see the rdfs:seeAlso link and will then pick up yours. This is the reason why your FOAF document will be guaranteed to be indexed.

Obviously, this solution is feasible only when you have a friend who is already in the circle. What if you do not have anyone in the circle at all? We then need the third solution discussed next.

  • Use the “FOAF Bulletin Board”

Instead of waiting for the FOAF network to find you, you can report to it voluntarily. The FOAF project provides a service for you to do this, the so-called FOAF Bulletin Board. To access this service, visit the FOAF Wiki site and find the FOAF Bulletin Board page. You can also use the following URL to access the page directly:

Once you are on the page, you will see a registry of people whose FOAF documents have been collected by FOAF. To add your own FOAF document, you need to log in first. Once you log in, you will see an Edit tab. Click this tab, you will then be able to edit a document in the editing window. Add your name, and a link to your FOAF document; click “save page” when you are done, and you are then in the FOAF network.

There are other ways to join the circle, but we are not going to discuss them here. A more interesting question at this point is what the FOAF world looks like, especially after more and more people have joined the circle. In other words, how does the FOAF project change the world of personal Web pages for human eyes into a world of personal Web pages that are suitable for machine processing? Let us take a look at this interesting topic in the next section.

3.4 From Web Pages for Human Eyes to Web Pages for Machines

Let us take a look at the world of personal Web pages first. Assume that on my Web page, www.liyangyu.com, I have included links pointing to my friends’ Web sites. One of my friends, on his Web site, has also included links that point to his friends, and so on and so forth. This creates a web of linked documents, just as we have today.

Now, using the FOAF vocabulary, I have created a FOAF document that describes me. Quite similar to my personal Web site, in this FOAF document I talk about myself: my e-mail, my name, my interests, etc. Yet there is a fundamental difference: when I talk about myself in this FOAF document, I use a language that machines can understand. For the machine, this FOAF document has become my new personal homepage. It might look ugly to human eyes, but it is perfectly understandable to machines.

Now, assuming all my friends have created their machine-readable homepages, and just like what I have done in my human-readable homepage, I can now put links that point to my friends’ FOAF documents in my machine-readable homepage. This is done by using foaf:knows together with rdfs:seeAlso property. Furthermore, this is also true for all my friends: in their machine-readable homepages, they can add links to their friends’ machine-readable homepages, and so on and so forth.

This will then create a brand new social network on the Web, coexisting with the traditional linked documents on the current Web. This whole new network is now part of the Semantic Web, in the domain of human networks.

The above two different Web networks are shown in Fig. 7.3. By now, two things have probably become much clearer: first, the reason why FOAF is called “Friend Of A Friend”; and second, the reason why the FOAF community considers foaf:knows together with rdfs:seeAlso to be the hyperlink of FOAF documents.

Fig. 7.3 Homepages for human eyes vs. homepages for machines

4 Semantic Markup: A Connection Between the Two Worlds

Before we move on to continue exploring the Semantic Web world, we need to talk about one important issue: semantic markup.

4.1 What Is Semantic Markup?

So far in this book, we have used the phrase semantic markup quite a few times already. So, what exactly is semantic markup? How does it fit into the whole picture of the Semantic Web?

First of all, after all these chapters, we have gained a much better understanding of the Semantic Web. In fact, if we had to use one sentence to describe what exactly the Semantic Web is, it would be really simple: it is all about extending the current Web to make it more machine-understandable.

To accomplish this goal, we first need some language(s) to express meanings that machines can understand. This is what we have spent the most time on so far: we have covered RDF, RDFS and OWL. These languages can be used to develop a formal ontology and to create RDF documents that machines can process. And as we have learned in this chapter, we can use the FOAF ontology to create RDF documents that describe ourselves. Obviously, we can use other ontologies in other domains to create more and more RDF documents that describe resources in the world.

However, when we compare our goal with what we have accomplished so far, we realize that something is missing: the current Web is one world, the machine-readable semantics expressed by ontologies is another world, and there is no connection between the two. If these two worlds always stand independent of each other, there is no way we can extend the current Web to make it more machine-readable.

Therefore, we need to build a connection between the current Web and the semantic world. This is what we call “adding semantics to the current Web”.

As you might have guessed, adding semantics to the current Web is called semantic markup; sometimes, it is also called semantic annotation.

4.2 Semantic Markup: Procedure and Example

In general, a semantic markup file is an RDF document containing RDF statements that describe the content of a Web page by using the terms defined in one or several ontologies. For instance, if a Web page describes some entities in the real world, the markup document for this Web page may specify that these entities are instances of some classes defined in some ontology, that these instances have certain properties, and that certain relationships hold among them.

When an application reaches a Web page and somehow finds this page has a markup document (more details on this later), it reads this markup file and also loads the related ontologies into its memory. At this point, the application can act as if it understands the content of the current Web page, and it can also discover some implicit facts about this page. The final result is that the same Web page not only continues to look great to human eyes, but also makes perfect sense to machines.
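
As a rough illustration of how an application can "discover some implicit facts", the sketch below combines a few markup statements with a tiny ontology fragment and derives class memberships that the markup never states explicitly. The statements about ex:alice and ex:bob are hypothetical; the only ontology knowledge assumed is that foaf:knows has foaf:Person as both its domain and its range. Real applications would use a proper RDF library and reasoner rather than this hand-written rule.

```python
RDF_TYPE = "rdf:type"

# Ontology fragment: foaf:knows relates a foaf:Person to a foaf:Person.
ONTOLOGY = {
    ("foaf:knows", "rdfs:domain", "foaf:Person"),
    ("foaf:knows", "rdfs:range", "foaf:Person"),
}

# Hypothetical markup statements for a page (invented for this sketch).
MARKUP = {
    ("ex:alice", "foaf:name", '"Alice Example"'),
    ("ex:alice", "foaf:knows", "ex:bob"),
}


def infer_types(markup, ontology):
    """Apply the RDFS domain and range rules once and return the new triples."""
    # For simplicity, this sketch assumes one domain and one range per property.
    domains = {p: c for p, rel, c in ontology if rel == "rdfs:domain"}
    ranges = {p: c for p, rel, c in ontology if rel == "rdfs:range"}
    inferred = set()
    for subject, predicate, obj in markup:
        if predicate in domains:
            inferred.add((subject, RDF_TYPE, domains[predicate]))
        if predicate in ranges:
            inferred.add((obj, RDF_TYPE, ranges[predicate]))
    # Only report facts that were not already stated in the markup.
    return inferred - markup


for triple in sorted(infer_types(MARKUP, ONTOLOGY)):
    print(triple)
```

Neither ex:alice nor ex:bob is declared to be a person in the markup, yet after loading the ontology the application can treat both as instances of foaf:Person.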

More specifically, there are several steps you need to follow when semantically marking up a Web page:

  • Step 1. Decide which ontology or ontologies to use for semantic markup.

    The first thing is to decide which ontology to use. Sometimes, you might need more than one ontology. This involves reading and understanding the ontology to decide whether the given ontology fits your need, or whether you agree with the semantics expressed by the ontology. It is possible that you have to come up with your own ontology; in that case, you need to remember the rule of always trying to reuse existing ontologies, or simply construct your new ontology by extending some given ontology.

  • Step 2. Mark up the Web page.

    Once you have decided on the ontology you are going to use, you can start to mark up the page. At this point, you need to decide exactly what content on your page you want to mark up. Clearly, it is neither possible nor necessary to mark up everything on your page. Having some sort of application in mind would help you make the decision. The question you want to ask yourself is, for instance, if there were an application visiting this page, what information on this page would I want the agent to understand? Remember, your decision is also constrained by the ontology you have selected: the markup statements have to be constructed based upon the ontology; therefore, you can only mark up the content that is supported by the selected ontology.

    You can elect to create your markup document by using a simple editor or by using some tools. Currently there are tools available to help you mark up your pages, as we will see in our markup examples later in this chapter. If you decide to use a simple editor to manually mark up a Web page, remember to use a validator to make sure your markup document at least does not contain any syntax errors. The reason is simple: the application that reads this markup document may not be as forgiving as you are hoping; if you make some syntax mistakes, a lot of your markup statements may be skipped and ignored entirely.

    After you have finished creating the markup document, you need to put it somewhere on your Web server. You also need to remember to grant enough rights to it so the outside world can access it. This is also related to the last step discussed below.

  • Step 3. Let the world know your page has a markup document.

    The last thing you need to do is to inform the world that your page has a markup document. At the time of this writing, there is no standard way of accomplishing this. A popular method is to add a link in the HTML header of the Web page, as we saw earlier in this chapter when we discussed the methods we can use to publish FOAF documents (see List 7.8).
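
To see how Step 3 looks from the application's side, the following sketch scans the header of a page for a link element that points to an RDF document. The page content and the example.org URL are hypothetical, and the attribute values (rel="meta", type="application/rdf+xml") are an assumption based on a common FOAF autodiscovery convention; since there is no standard, other publishers may use different values.

```python
from html.parser import HTMLParser


class MarkupLinkFinder(HTMLParser):
    """Collect href values of <link> elements that advertise an RDF document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # Assumption: the markup document is advertised by its RDF/XML media type.
        if attributes.get("type") == "application/rdf+xml" and attributes.get("href"):
            self.links.append(attributes["href"])


# Hypothetical page; the URL is invented for this sketch.
PAGE = """<html><head>
<title>Example homepage</title>
<link rel="stylesheet" type="text/css" href="style.css"/>
<link rel="meta" type="application/rdf+xml" title="FOAF"
      href="http://example.org/foaf.rdf"/>
</head><body><p>Welcome to my homepage.</p></body></html>"""


def find_markup_documents(html: str) -> list[str]:
    """Return the URLs of markup documents advertised in the given HTML."""
    finder = MarkupLinkFinder()
    finder.feed(html)
    return finder.links


print(find_markup_documents(PAGE))
```

The stylesheet link is ignored because its type does not match; only the link that announces an RDF document is reported.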

With all this said, let us take a look at one example of semantic markup. My goal is to mark up my own personal homepage, www.liyangyu.com, and to do so, we will follow the steps discussed earlier.

The first step is to choose an ontology for markup. Clearly, my home page is all about a person, so quite obviously we are going to use the FOAF ontology. We might need other vocabularies down the road, but for now, we will stick with the FOAF ontology only.

The second step is to create the markup document. With all the examples we have seen in this chapter, creating this document should not be difficult at all: it is simply an RDF document that describes me as a resource by using the terms defined in the FOAF ontology.

Again, it is up to us to decide what content in the page should be semantically marked up. As a starter, List 7.9 shows a possible markup document.

List 7.9

A markup document for the author’s homepage


As you can tell, this document contains some basic information about me, and it also includes a link to a friend I know. With what we have learned in this chapter, this document should be fairly easy to follow, and not much explanation is needed.
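
If a markup document like this is written by hand, the validator advice from Step 2 applies. As a first line of defense, a quick well-formedness check can be sketched with the Python standard library. Note the limits of this sketch: it only confirms that the file is well-formed XML, not that it is valid RDF, and the two small documents below are invented test inputs rather than the content of List 7.9.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical inputs: one well-formed document and one with a missing end tag.
GOOD = (
    f'<rdf:RDF xmlns:rdf="{RDF}" xmlns:foaf="{FOAF}">'
    "<foaf:Person><foaf:name>Alice Example</foaf:name></foaf:Person>"
    "</rdf:RDF>"
)
BAD = (
    f'<rdf:RDF xmlns:rdf="{RDF}" xmlns:foaf="{FOAF}">'
    "<foaf:Person><foaf:name>Alice Example</foaf:Person>"
    "</rdf:RDF>"
)


def check_well_formed(rdf_xml: str) -> tuple[bool, str]:
    """Return (True, "") if the text parses as XML, else (False, error message)."""
    try:
        ET.fromstring(rdf_xml)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""


print(check_well_formed(GOOD))
print(check_well_formed(BAD)[0])
```

A dedicated RDF validator goes further than this, since a document can be perfectly well-formed XML and still not express the RDF statements its author intended.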

Now imagine an application that comes across my home page. By reading List 7.9, it will be able to understand the following facts (not a complete list):

Notice that Dublin Core vocabulary is used to identify the page title and page author (lines 9–10), and the URI that represents the concept of the Semantic Web is also reused (line 24). Again, this URI is coined by the DBpedia project, which is discussed in Chap. 10.

Now we are ready for the last step: explicitly indicating the fact that my personal Web page has been marked up by an RDF file. To do so, we can use the first solution presented in Sect. 7.3.3. In fact, by now you should realize that a FOAF document can be considered a special markup of a person’s homepage, and all I have done here is simply create a FOAF document for myself.

There are also tools available to help us mark up a given Web page. For example, SMORE is one of the projects developed by researchers and developers at the University of Maryland at College Park, and you can take a look at their work on their official Web page:

SMORE allows the user to mark up Web documents without requiring a deep knowledge of OWL terms and syntax. You can create different instances easily by using the provided GUI, which is quite intuitive and straightforward. Also, SMORE lets you visualize your ontology; therefore it can be used as an OWL ontology validator as well.

We are not going to cover the details about how to use it; you can download it from here and experiment with it on your own:

If you use it for markup, your final result would be a generated RDF document that you can directly use as your markup document. You might want to make modifications if necessary; but generally speaking, it is always a good idea to use a tool to create your markup file whenever possible.

4.3 Semantic Markup: Feasibility and Different Approaches

As we mentioned earlier, the process of marking up a document is the process of building the critical link between the current Web and the machine-readable Web. It is the actual implementation of the so-called adding semantics to the current Web. However, as you might have realized already, there are lots of unsolved issues associated with Web page markup.

The first thing you might have noticed is that no matter whether we decide to mark up a page manually or by using some tools, it seems to be quite a lot of work just to mark up a simple Web page such as my personal homepage. The question then is, how do we mark up all the pages on the Web? Given the huge number of pages on the Web, it is just not practical. Also, it is not trivial at all to implement the markup process; a page owner has to learn at least something about ontology, OWL and RDF, among other things. Even if every page owner agrees to mark up their pages, how do we make sure everyone is sharing ontologies to the maximum extent without unnecessarily inventing new ones?

These thoughts have triggered the search for the so-called killer application in the world of the Semantic Web. The idea is that if we could build a killer Semantic Web application to demonstrate some significant benefit to the world, there would then be enough motivation for page owners to mark up their pages. However, without the link between the current Web and the machine-readable semantics being built, the killer application (whatever it is) simply cannot be created.

At the time of this writing, there is still no final call about this killer application. However, the good news is, there are at least some solutions to the above dilemma, and in the upcoming chapters, we will be able to see examples of these solutions. For now, let us briefly introduce some of these solutions:

  • Manual markup in a much more limited domain and scope

Semantic markup by the general public on the whole Web seems to be too challenging to implement, but for a much smaller domain and scope, manual markup is feasible. For example, for a specific community or organization, the knowledge domain is much smaller, so publishing and sharing a collection of core ontologies within the community or the organization is quite possible. If a relatively easy way of doing manual semantic markup is provided, it is then possible to build Semantic Web applications for this specific community or organization.

  • Machine-generated semantic markup

There has been some research in this area, and some automatic markup solutions have been proposed. However, most of these techniques are applied to technical texts. For the Web, which contains highly heterogeneous text types that are mainly made up of natural language, there seems to be no efficient solution yet.

However, some Web content already provides structured information as part of the content. Therefore, instead of parsing natural language to produce markup files, machines can take advantage of this existing structured information and generate markup documents based on the structured data.

One example along this line is the popular DBpedia project. We will learn more details later on, but for now, it is enough to know that DBpedia is completely generated by machine, and the source for its machine-readable documents is the structured information contained in Wikipedia.

  • Create a machine-readable Web all on its own

There seems to be one key assumption behind the previous two solutions: there have to be two formats for one piece of Web content, one for human viewing and one for machines.

In fact, do we really have to do this at all? If we start to publish machine-readable data, such as RDF documents, and somehow make all these documents connect to each other, just as we have done when creating Web pages, then we will be creating a linked data Web! And since everything on this linked data Web is machine-readable, we should be able to develop a lot of interesting applications as well, without any need to do semantic markup. This is the idea behind the Linked Data Project, and we will study it in a later chapter.
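
To give a feel for the machine-generated approach described above, here is a small sketch that turns an already structured record into RDF statements written in N-Triples syntax. It is not how DBpedia itself is implemented; the record, the subject URI and the mapping from field names to FOAF properties are all hypothetical choices made for this illustration.

```python
# Hypothetical mapping from structured field names to vocabulary terms.
FIELD_TO_PROPERTY = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage",
}

# Hypothetical structured record, e.g. extracted from an infobox-like table.
RECORD = {
    "name": "Alice Example",
    "homepage": "http://example.org/alice",
    "shoe_size": "42",  # no mapping defined, so this field is skipped
}


def record_to_ntriples(subject_uri: str, record: dict[str, str]) -> list[str]:
    """Convert the mapped fields of a record into N-Triples statements."""
    lines = []
    for field, value in record.items():
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None:
            continue  # fields the vocabulary cannot express are not marked up
        if value.startswith(("http://", "https://")):
            obj = f"<{value}>"  # treat Web addresses as resources
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'  # everything else becomes a plain literal
        lines.append(f"<{subject_uri}> <{prop}> {obj} .")
    return lines


for line in record_to_ntriples("http://example.org/alice#me", RECORD):
    print(line)
```

Because the input is already structured, no natural language processing is needed; the quality of the result depends mainly on how well the field-to-property mapping is designed.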

At this point, we are ready to move on and study the above three solutions in detail. The goal is twofold: first, to build more understanding of the Semantic Web; second, to use these solutions as hints, so that hopefully you will be able to come up with and design even better solutions for the idea of the Semantic Web.

5 Summary

In this chapter, we have learned about FOAF, an example of the Semantic Web in the domain of social networking.

The first thing we should understand from this chapter is the FOAF ontology itself. This includes its core terms and how to use these terms to describe people, their basic information, the things they do and their relationships to other people.

It is useful to have your own FOAF document created. You should understand how to create it and how to publish it on the Web, and further, how to get into the “circle of trust”.

Semantic markup is an important concept. Not only should you be able to manually mark up a Web document, but you should also understand the following about it:

  • It provides a connection between the current Web and the collection of knowledge that is machine-understandable.

  • It is the concrete implementation of so-called “adding semantics to the current Web”.

  • It has several issues regarding whether it is feasible in the real world. However, different solutions do exist, and these solutions have already given rise to different applications on the Semantic Web.