According to dictionary.com, "Semantic web" means

An extension of the World Wide Web in which data is structured and XML-tagged on the basis of its meaning or content, so that computers can process and integrate the information without human intervention

In other words, computers and programs should be able to parse and understand the information automatically. Web spiders and crawlers are one example: Google can index a page because it can parse the content and metadata out of the webpage and classify it accordingly. Screen readers, which read the content of a webpage aloud for people with disabilities, are another.

An HTML page follows a fixed format. The metadata of the page (title, keywords, description etc.) sits at the top, between the <head></head> tags. The content of the page sits between the <body></body> tags. Inside the body there are further tags for specific things: <ul> for an unordered list, <ol> for an ordered list, <p> for a paragraph, <h1>..<h6> for headings, <div> for a block element and <span> for an inline element.
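To make the idea of a program reading this structure concrete, here is a minimal sketch using Python's standard html.parser module. It is only an illustration of the principle, not how any real search engine works; the PageOutline class and the sample page are made up for this example.

```python
from html.parser import HTMLParser


class PageOutline(HTMLParser):
    """Collects the title, named <meta> values and headings of a page."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}        # e.g. {"description": "..."}
        self.headings = []    # e.g. [("h1", "Page heading")]
        self._current = None  # tag whose text is being collected
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and "name" in attrs:
            self.meta[attrs["name"]] = attrs.get("content", "")
        elif tag == "title" or tag in self.HEADINGS:
            self._current = tag
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != self._current:
            return
        text = "".join(self._text).strip()
        if tag == "title":
            self.title = text
        else:
            self.headings.append((tag, text))
        self._current = None


# Hypothetical sample page, made up for this illustration.
SAMPLE_PAGE = """
<html>
<head>
  <title>Fresh Vegetables</title>
  <meta name="description" content="A guide to fresh vegetables">
  <meta name="keywords" content="tomato, carrot">
</head>
<body>
  <h1>Vegetables</h1>
  <p>Some text.</p>
  <h2>Tomato</h2>
</body>
</html>
"""

outline = PageOutline()
outline.feed(SAMPLE_PAGE)
print(outline.title)
print(outline.meta)
print(outline.headings)
```

Because the page uses the standard tags, the program never has to guess which text is the title and which is a heading: the tag names say so.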

So let us look at the role that HTML5 tags, schema.org tags and ARIA tags play in the semantic web.

HTML5 and semantic web

In addition to the tags mentioned in the previous paragraph, HTML5 introduced many tags for specific types of content. These help crawler programs understand where the content is and what type it is.

Some of the HTML5 tags

  • <article>
  • <aside>
  • <canvas>
  • <embed>
  • <figure>
  • <footer>
  • <header>

You can find the full list of HTML5 tags in the HTML elements reference on MDN.

Here is an example use of the article tag.

The article tag lets you mark content as an article, together with additional metadata such as the article's header, author and time. According to the HTML standard, an article could be a forum post, a magazine or newspaper article, a blog entry, a user-submitted comment, an interactive widget or gadget, or any other independent item of content.

<article itemscope itemtype="http://schema.org/BlogPosting">
 <header>
  <h1 itemprop="headline">The Very First Rule of Life</h1>
  <p><time itemprop="datePublished" datetime="2009-10-09">3 days ago</time></p>
  <link itemprop="url" href="?comments=0">
 </header>
 <p>If there's a microphone anywhere near you, assume it's hot and
 sending whatever you're saying to the world. Seriously.</p>
 <p>...</p>
 <footer>
  <a itemprop="discussionUrl" href="?comments=1">Show comments...</a>
 </footer>
</article>

In the above example, you can see the following tags:

  • article - wraps around the blog post
  • header - introductory content such as the title, author and time
  • h1 - heading tag
  • time - represents a date and time. In the context of an article, it shows when the article was posted
  • link - normally used in the head for linking stylesheets or favicons. It can also be used inside the body element in certain cases, for example when it carries an itemprop attribute as it does here.
  • p - paragraph tag
  • footer - normally contains copyright, about us and contact us information. In the context of an article, the footer may contain author details, links for reference etc.

Schema.org and semantic web

Schema.org is a collaborative, community activity with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond.

Schema.org provides a vocabulary of types and properties that are very specific to the type of content posted. If you run a bookstore, for example, you can use the bookstore-specific schema to give details such as currencies accepted, opening hours, price range and accepted payment methods.

You can see the complete list of schema.org types on a single page on the schema.org site.

To use a schema, you need three attributes: itemscope, which marks the element whose content the schema describes; itemtype, which holds the URL of the schema you are using; and itemprop, which names the property that each piece of content provides.

For example, here is how you could describe the details of a person using Schema.org:

Details for Alice
<table itemscope itemtype="http://schema.org/Person">
  <tr>
    <td>Address</td>
    <td itemprop="address">San Francisco</td>
    <td>Birth place</td>
    <td itemprop="birthPlace">San Diego</td>
    <td>Gender</td>
    <td itemprop="gender">Female</td>
  </tr>
</table>

This way, Google or any other crawler can easily parse the information about the person and link the address, birthplace and gender to her. The Person schema is documented at https://schema.org/Person.
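As a rough sketch of what such parsing could look like, the following Python code reads the itemtype and itemprop values out of markup shaped like the Person table. The MicrodataReader class is hypothetical and deliberately simplified: it handles a single item with text-only values, while the real microdata rules also cover nested items and values taken from attributes such as href or datetime.

```python
from html.parser import HTMLParser


class MicrodataReader(HTMLParser):
    """Collects itemprop values from the first itemscope element it sees.

    Simplified on purpose: no nested items, and a property's value is
    always the text inside the element that carries the itemprop.
    """

    def __init__(self):
        super().__init__()
        self.itemtype = None
        self.properties = {}
        self._prop = None  # itemprop name whose text is being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs and self.itemtype is None:
            self.itemtype = attrs.get("itemtype")
        if "itemprop" in attrs:
            self._prop = attrs["itemprop"]
            self._text = []

    def handle_data(self, data):
        if self._prop is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # The first closing tag after an itemprop element ends its value.
        if self._prop is not None:
            self.properties[self._prop] = "".join(self._text).strip()
            self._prop = None


PERSON_HTML = """
<table itemscope itemtype="http://schema.org/Person">
  <tr>
    <td>Address</td>
    <td itemprop="address">San Francisco</td>
    <td>Birth place</td>
    <td itemprop="birthPlace">San Diego</td>
    <td>Gender</td>
    <td itemprop="gender">Female</td>
  </tr>
</table>
"""

reader = MicrodataReader()
reader.feed(PERSON_HTML)
print(reader.itemtype)
print(reader.properties)
```

The visible labels ("Address", "Birth place") are ignored entirely. The program relies only on the itemprop names, which is why they have to match the spelling used by schema.org.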

ARIA tags and semantic web

ARIA stands for Accessible Rich Internet Applications. ARIA tags (strictly speaking they are attributes, such as role and the aria-* family) are primarily for screen readers and other tools that help people with disabilities browse the dynamic web.

Normally when you write HTML, you add CSS and JS to make the content visually easy for people to understand. But a machine cannot tell from that what is a tab widget, what is a slider widget and so on. With ARIA tags, the browser can understand the widget and translate it to the operating system's native accessibility APIs.

Many of the widgets and roles described in ARIA have native equivalents in HTML5, so developers should use HTML5 semantic tags when available rather than ARIA tags.

Typical use cases of ARIA tags are

  • Screen readers
  • Keyboard accessibility

To make a webpage keyboard accessible, you can use the tabindex attribute.

<ul id="menu" tabindex="0">
  <li id="vegetables" tabindex="-1"> Vegetables
    <ul id="fontMenu" title="Green leafy vegetable" tabindex="-1">
      <li id="tomato" tabindex="-1">Tomato</li>
      <li id="green-leafy-vegetable" tabindex="-1">Green Leafy Vegetable</li>
      <li id="carrot" tabindex="-1">Carrot</li>
    </ul>
  </li>
</ul>

The complete example can be found at https://files.paciellogroup.com/training/WWW2012/samples/Samples/aria/tree/index.html

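A tabindex of 0 means you can move to that element with the TAB key. A tabindex of -1 means the element is skipped by the TAB key, although it can still receive focus from script. Such elements are meant to be reached with the arrow keys instead, but the browser does not provide that behaviour by itself: you must write JavaScript code so that the inner menu items can be navigated with the keyboard.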
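The difference between the two values can be checked mechanically. The sketch below, again in Python with a hypothetical TabindexAudit class, sorts the elements of a menu like the one above into those reachable with TAB and those that need script. It only looks at explicit tabindex attributes; elements that are focusable by default, such as links and buttons, are outside its scope.

```python
from html.parser import HTMLParser


class TabindexAudit(HTMLParser):
    """Sorts elements that carry a tabindex attribute into two groups."""

    def __init__(self):
        super().__init__()
        self.tab_stops = []    # tabindex >= 0: reachable with the TAB key
        self.script_only = []  # tabindex < 0: focusable only from script

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "tabindex" not in attrs:
            return
        try:
            value = int(attrs["tabindex"])
        except (TypeError, ValueError):
            return  # missing or non-numeric value: ignored in this sketch
        label = attrs.get("id") or tag
        if value >= 0:
            self.tab_stops.append(label)
        else:
            self.script_only.append(label)


MENU_HTML = """
<ul id="menu" tabindex="0">
  <li id="vegetables" tabindex="-1"> Vegetables
    <ul id="fontMenu" tabindex="-1">
      <li id="tomato" tabindex="-1">Tomato</li>
      <li id="carrot" tabindex="-1">Carrot</li>
    </ul>
  </li>
</ul>
"""

audit = TabindexAudit()
audit.feed(MENU_HTML)
print("TAB key:", audit.tab_stops)
print("script only:", audit.script_only)
```

Only the outer menu ends up in the TAB sequence. Everything inside it depends on the page's own JavaScript to move focus.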

Here is the keyboard interaction guide for an accordion widget from the WAI-ARIA documentation:

Keyboard Interaction

  • Enter or Space: When focus is on the accordion header for a collapsed panel, expands the associated panel. If the implementation allows only one panel to be expanded, and if another panel is expanded, collapses that panel. When focus is on the accordion header for an expanded panel, collapses the panel if the implementation supports collapsing. Some implementations require one panel to be expanded at all times and allow only one panel to be expanded; so, they do not support a collapse function.
  • Tab: Moves focus to the next focusable element; all focusable elements in the accordion are included in the page Tab sequence.
  • Shift + Tab: Moves focus to the previous focusable element; all focusable elements in the accordion are included in the page Tab sequence.
  • Down Arrow (Optional): If focus is on an accordion header, moves focus to the next accordion header. If focus is on the last accordion header, either does nothing or moves focus to the first accordion header.
  • Up Arrow (Optional): If focus is on an accordion header, moves focus to the previous accordion header. If focus is on the first accordion header, either does nothing or moves focus to the last accordion header.
  • Home (Optional): When focus is on an accordion header, moves focus to the first accordion header.
  • End (Optional): When focus is on an accordion header, moves focus to the last accordion header.
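The optional arrow, Home and End rules above boil down to a small piece of index arithmetic. Here is a sketch of that logic in Python, kept in the same language as the other examples; in a real page it would live in a JavaScript keydown handler. The function name and the wrap parameter are assumptions made for this illustration, and the key names follow the values of the DOM KeyboardEvent.key property.

```python
def next_header_index(key, current, count, wrap=True):
    """Return the index of the accordion header that should receive focus.

    key     -- "ArrowDown", "ArrowUp", "Home" or "End"
    current -- index of the header that has focus now
    count   -- number of headers in the accordion
    wrap    -- True: arrows wrap around at the ends; False: they do nothing there
    """
    if count <= 0:
        raise ValueError("count must be positive")
    if key == "Home":
        return 0
    if key == "End":
        return count - 1
    if key == "ArrowDown":
        if current == count - 1:
            return 0 if wrap else current
        return current + 1
    if key == "ArrowUp":
        if current == 0:
            return count - 1 if wrap else current
        return current - 1
    return current  # any other key leaves focus where it is


# Three headers, focus on the last one.
print(next_header_index("ArrowDown", 2, 3))              # wraps to the first header
print(next_header_index("ArrowDown", 2, 3, wrap=False))  # stays on the last header
print(next_header_index("Home", 2, 3))
```

The wrap flag covers the "either does nothing or moves focus" choice that the guide leaves to the implementation.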

Conclusion

We have seen how HTML5 tags, schema.org tags and ARIA tags help make the web more semantic, so that webpages can be understood by tools like screen readers, navigated by keyboard, and processed by programs like web spiders and crawlers. This helps people indirectly: the more content crawlers can parse and understand, the easier it is for people to get accurate search results. It also helps people with disabilities browse the pages.