Semantic Web

What is Semantic Technology?

Nowadays, when people mention semantic technology, they usually do so alongside a set of terminology and technology references that serve mainly to confuse. That in turn tends to disguise how incredibly useful semantic technology is proving to be. So here is a more understandable explanation. Have you ever wondered why you don't remember anything from your toddler years? The reason is simple: at that age our brains do not yet bundle information into the complex neural patterns that we know as memories. It is clear, however, that babies do remember facts in the moment, such as who their parents are or how to signal that they are hungry. This is called "semantic memory," as opposed to episodic memory, which stores events and experiences. The word "semantic" comes from the Greek "semasia," meaning "signification" or "meaning." Semantic technologies, then, are those that encode meaning and concepts.

Semantics and Internet of Things

With the rapid expansion of connected devices and the rise of the Internet of Things (IoT), semantic technologies are slowly becoming an important part of the evolution of our planet's information layer. The Semantic Sensor (and Actuator) Web is an extension of the current Web in which information is given well-defined meaning, better enabling objects, devices and people to work in co-operation, and enabling autonomous interactions between devices and objects. This is possible because the Semantic Sensor Web provides interoperability and advanced analytics over heterogeneous sensors, supporting situation awareness and other advanced applications.
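As a sketch of how such interoperability works, the example below stores readings from two different devices as subject-predicate-object triples using a made-up vocabulary (the prefixes are loosely inspired by the W3C SOSA/SSN terms but are illustrative only, not a real implementation):

```python
# A minimal sketch of semantic sensor annotation. The "sosa:"/"qudt:"
# prefixes below are hypothetical shorthand, not resolved ontology IRIs.
# Each observation becomes triples, so readings from heterogeneous
# devices share one queryable model.

def observation_to_triples(obs_id, sensor, prop, value, unit):
    """Describe one sensor reading as a list of triples."""
    return [
        (obs_id, "rdf:type", "sosa:Observation"),
        (obs_id, "sosa:madeBySensor", sensor),
        (obs_id, "sosa:observedProperty", prop),
        (obs_id, "sosa:hasSimpleResult", value),
        (obs_id, "qudt:unit", unit),
    ]

# Two heterogeneous devices, one shared representation:
graph = []
graph += observation_to_triples("obs1", "thermostat-42", "temperature", 21.5, "degC")
graph += observation_to_triples("obs2", "weather-node-7", "temperature", 19.0, "degC")

# Interoperable query: all temperature results, regardless of device.
temps = [o for (s, p, o) in graph
         if p == "sosa:hasSimpleResult"
         and (s, "sosa:observedProperty", "temperature") in graph]
print(temps)  # [21.5, 19.0]
```

Because both devices describe themselves with the same predicates, one query covers them all; adding a third, very different sensor requires no schema change, only more triples.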

Semantic Recommender Systems

Recommender systems have become a popular information-filtering device, now present on many web-based media platforms. A deployed recommender system attempts to predict the rating a user would give to a candidate item (a movie, song, book, location, etc.) by learning what that user likes and dislikes in the form of a taste profile. Semantic SVD-based algorithms can build preference profiles that track users' tastes over time, overcome the factor consistency problem, and model global taste influence, all of which boosts recommendation performance.
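As a rough illustration of the latent-factor idea behind SVD-style recommenders, here is a plain stochastic-gradient matrix factorization in pure Python. This is generic SVD-family factorization; the semantic variant described above additionally ties factors to semantic categories, which is omitted here. All user, item and rating values are invented.

```python
import random

# Toy SGD matrix factorization: learn K-dimensional taste profiles
# for users (P) and items (Q) from observed ratings, then predict
# unseen ratings from the dot product of the profiles.

random.seed(0)
ratings = {("alice", "matrix"): 5, ("alice", "titanic"): 1,
           ("bob", "matrix"): 4, ("bob", "inception"): 5,
           ("carol", "titanic"): 5, ("carol", "inception"): 1}
users = sorted({u for u, _ in ratings})
items = sorted({i for _, i in ratings})
K = 2  # number of latent taste factors
P = {u: [random.uniform(-0.1, 0.1) for _ in range(K)] for u in users}
Q = {i: [random.uniform(-0.1, 0.1) for _ in range(K)] for i in items}

def predict(u, i):
    return 3.0 + sum(pu * qi for pu, qi in zip(P[u], Q[i]))

for _ in range(2000):                      # SGD over observed ratings
    for (u, i), r in ratings.items():
        err = r - predict(u, i)
        for k in range(K):                 # gradient step with L2 decay
            P[u][k] += 0.01 * (err * Q[i][k] - 0.02 * P[u][k])
            Q[i][k] += 0.01 * (err * P[u][k] - 0.02 * Q[i][k])

# Alice's learned profile should now prefer "matrix" over "titanic".
print(round(predict("alice", "matrix"), 2), round(predict("alice", "titanic"), 2))
```

The learned factor vectors are exactly the "taste profile" the text describes: two users with similar vectors will receive similar recommendations even for items neither has rated.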

Semantic Entity Recognition

Named entities denote things such as persons, places and organizations. Semantic technologies can identify people, companies, organizations, cities, geographic features and other typed entities in HTML, plain text, documents or other web-based content. Entity extraction adds a wealth of semantic knowledge to content and helps readers quickly grasp the subject of a text; it is one of the most common starting points for enriching content with natural language processing. Named entity extraction relies on sophisticated statistical algorithms and natural language processing technology, and often comes with multilingual support, linked data, context-sensitive entity disambiguation, comprehensive type support, quotation extraction and more.
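Production entity extractors are statistical, as noted above, but the core idea of "typed entities" can be sketched with a toy gazetteer lookup (the names and types below are invented; a real system would also disambiguate and handle unseen names):

```python
import re

# Toy gazetteer-based entity extraction: tag known surface forms with
# a semantic type and their character offset in the text.

GAZETTEER = {
    "Paris": "City",
    "Berlin": "City",
    "Ada Lovelace": "Person",
    "UNESCO": "Organization",
}

def extract_entities(text):
    """Return (surface form, type, offset) for every gazetteer hit."""
    hits = []
    for name, etype in GAZETTEER.items():
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            hits.append((name, etype, m.start()))
    return sorted(hits, key=lambda h: h[2])   # in document order

text = "Ada Lovelace never visited the UNESCO offices in Paris."
print(extract_entities(text))
```

Each hit carries a type, which is what lets downstream semantic processing treat "Paris" as a city rather than as an arbitrary string.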

Automatic Summarization

Document summarization refers to the task of creating document surrogates that are smaller in size but retain the essential characteristics of the original document. To automate abstracting, researchers generally rely on a two-phase process. First, key textual elements, e.g., keywords, clauses, sentences or paragraphs, are extracted from the text using linguistic and statistical analyses. In the second phase, the extracted text may be used directly as a summary; such summaries are referred to as "extracts." Alternatively, the extracted elements can be used to generate new text, similar to a human-authored abstract. Semantic technologies are capable of both forms of summarization.
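The two-phase extractive process can be sketched in a few lines: score sentences by word frequency, then emit the top-scoring ones in document order. This is a deliberately naive sketch; real systems use far richer linguistic and statistical analysis.

```python
import re
from collections import Counter

STOP = {"the", "a", "of", "to", "and", "in", "is", "are",
        "it", "that", "as", "was", "can"}

def summarize(text, n=1):
    """Extractive summary: the n sentences richest in frequent words."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP]
    freq = Counter(words)                      # phase 1: score elements
    def score(s):
        toks = re.findall(r"[a-z']+", s.lower())
        return sum(freq[t] for t in toks if t not in STOP) / (len(toks) or 1)
    ranked = sorted(sentences, key=score, reverse=True)[:n]
    return [s for s in sentences if s in ranked]   # phase 2: the extract

doc = ("Triplestores store data as triples. Triples capture meaning. "
       "The weather was pleasant that day. Triplestores can query meaning.")
print(summarize(doc, n=2))
```

The off-topic weather sentence scores low and is dropped, while the sentences sharing the document's dominant vocabulary survive as the extract.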

Semantic Graph Analytics

Semantic graph analytics offer sophisticated capabilities for analyzing relationships, while traditional analytics focus on summarizing, aggregating and reporting on data. Some common graph analytic techniques include: centrality analysis, to identify the most central entities in a network - a very useful capability for influencer marketing; path analysis, to identify all the connections between a pair of entities - useful in understanding risk and exposure; community detection, to identify clusters or communities - of great importance for understanding issues in sociology and biology; and sub-graph isomorphism, to search for a pattern of relationships - useful for validating hypotheses and for detecting abnormal situations, such as hacker attacks.
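Two of these techniques, centrality and path analysis, can be sketched on a toy entity network (all names invented; real graph engines add indexing and many more algorithms):

```python
from collections import deque

# A small relationship network as adjacency lists.
graph = {
    "alice": ["bob", "carol", "acme"],
    "bob":   ["alice", "acme"],
    "carol": ["alice"],
    "acme":  ["alice", "bob", "dave"],
    "dave":  ["acme"],
}

def degree_centrality(g):
    """Centrality analysis (simplest form): the most-connected entity."""
    return max(g, key=lambda n: len(g[n]))

def shortest_path(g, start, goal):
    """Path analysis: one shortest chain of relationships, via BFS."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in g[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(degree_centrality(graph))
print(shortest_path(graph, "carol", "dave"))
```

The path query is the "connections between a pair of entities" case from the text: carol reaches dave only through alice and the organization acme, which is exactly the kind of indirect link that risk and exposure analysis surfaces.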

Protecting Copyright

Semantic technologies are also facilitating copyright management in the context of user-generated content. Beyond unauthorised media reproduction and distribution, controlling the reuse of media in user-generated content is a key issue for the media industry these days. To avoid publishing content that infringes copyright, services like YouTube offer mechanisms to detect the unauthorised reuse of media and give rights holders the choice to monetise its use rather than take the content down. However, the potential of this new revenue stream is at risk if copyright subtleties are not managed appropriately - for instance, when the same song is owned by different rights holders depending on the territory. What is required is a scalable decision-support system capable of integrating digital rights languages, like DDEX or ODRL, together with contracts or policies, like talent contracts or business policies. Semantic technologies provide a common, expressive framework in which all these sources of copyright information can be represented together.
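As a toy sketch of such a decision-support check, the triples below model per-territory rights grants. The "ex:" vocabulary, the song and the labels are invented, and real ODRL/DDEX policies are far richer; the point is only that a triple representation lets heterogeneous rights sources sit in one queryable store.

```python
# Hypothetical rights grants as triples: who may monetise which song
# in which territory.
rights = [
    ("grant1", "ex:song",      "Song A"),
    ("grant1", "ex:holder",    "Label X"),
    ("grant1", "ex:territory", "US"),
    ("grant2", "ex:song",      "Song A"),
    ("grant2", "ex:holder",    "Label Y"),
    ("grant2", "ex:territory", "DE"),
]

def rights_holder(song, territory):
    """Who can monetise `song` in `territory`, per the triples above?"""
    for grant in {s for (s, p, o) in rights if p == "ex:song" and o == song}:
        if (grant, "ex:territory", territory) in rights:
            for (s, p, o) in rights:
                if s == grant and p == "ex:holder":
                    return o
    return None  # no grant covers this territory: block or escalate

print(rights_holder("Song A", "DE"))  # Label Y
print(rights_holder("Song A", "FR"))  # None
```

The same song resolves to different holders per territory, and an uncovered territory yields no holder at all, which is precisely the subtlety the text says must be managed before monetising reuse.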

Semantics for Cybersecurity

Current information technology security systems primarily focus on simple threats: defending against traffic on specific ports, virus detection, and so on. However, adversaries are targeting organizations with complex attacks that appear completely legitimate but have devastating effects, and current security controls might detect such attacks only days later. To provide appropriate risk assessment, next-generation information security systems can leverage semantic and cloud-based technologies in several ways.

Infrastructure-enhanced security: cloud computing will likely reduce encryption and decryption times, promoting further adoption of these security controls while demanding and promoting stronger key strategies. Cloud computing can also sustain cutting-edge, near-real-time analytics that mine vast amounts of security data to identify complex threats, and detect intentional and unintentional information access and abuse by both internal and external users.

Enhanced threat modeling: cloud analytics developed for social network analysis will provide the capability to analyze large amounts of data about users, network traffic and other signals, detecting seemingly safe activities that match larger threat patterns.

Semantic security: advances in semantic technology, in conjunction with cloud computing, will promote security controls that simulate human cognition and can block and/or report suspect communications in near real time over Internet-scale data. This evolution will drive the adoption of semantic technologies and include software agents that act on behalf of end users.

Semantics for Defense and Security

By supporting and exploiting domain-specific ontologies, semantic technologies offer advanced capabilities for processing, analyzing and integrating heterogeneous content at a semantic level, rather than at the merely syntactic and structural level of approaches based on XML alone. These capabilities have been demonstrated in very demanding homeland security and national security applications such as passenger threat assessment and anti-terrorism. Link analysis, news analysis, personal-information analysis and cumulative threat analysis are just some of the packages enabled by semantic applications. These systems support the identification of semantic associations and provide analysts with a powerful toolset, intelligently correlating content with contextual real-world knowledge and thus making the information more relevant and actionable for enterprise users.

Still, how are semantic technologies better?
A great deal of the data we store is full of meaning, and traditional databases do not store meaning very well: they store tables of structured records, and the meaning of the data (exposed by the metadata) is trapped in the database schema and hidden in the text itself. This arrangement is fine for storing repeated items (products, people, orders, etc.) that need exactly the same information stored every time. But as soon as you have any variance from that regular repetition, it breaks down. Say, for example, you are storing details about people and you want to record who is related to whom. Suddenly you need a schema change, and most likely a whole new table. Next you want to store who works for whom; then who went to school with whom. The accumulated complexity of these frequent changes undermines both maintainability and performance. Now try to imagine storing all of the words written so far on this page in such a way that the meaning of what was written could be retrieved. With a traditional database this is practically impossible. Indeed, traditional databases call such data "unstructured data," when in reality it is structured in a sophisticated way and full of meaning.
Are you still puzzled? Let's try something else.
To understand the nature of a semantic database, you need to know that such databases store "triples." A triple is the basic atom of data and can be thought of as subject-predicate-object, e.g., "John has a hat." You can decompose all the data in a typical database record this way: person has a name; the name is John Smith; person has a date of birth; the date of birth is 12/04/1982; and so on. In practice, this means that semantic databases - or triplestores - can store and retrieve any data that a traditional database can, and much more besides. Because all information can be broken into such atoms of meaning, semantic databases can store any kind of data, and they are particularly good at storing text, because they capture its meaning and let you query it. Most semantic databases use the query language SPARQL, which is specifically designed to handle queries over an RDF database or over disparate data sources. SPARQL provides aggregate functions much like SQL's, as well as join and sort operations, but it is more powerful in one key respect: with SPARQL, relationships in the data are explicit and directly accessible. The results of a SPARQL query reflect the user's understanding of the scenario, not just the structure of the database. Semantic technologies build on the Resource Description Framework (RDF); RDF triplestores are arguably the only standards-based NoSQL solution on the market. They are known for being schema-less, distributed and highly scalable - a perfect fit for semantics.
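The triple model is easy to sketch: store (subject, predicate, object) tuples and match patterns against them, with None standing in for a SPARQL variable. This is a minimal illustration, not a real triplestore, which would add indexes, IRIs, typed literals and full SPARQL.

```python
# A toy triplestore: a set of (subject, predicate, object) tuples.
triples = {
    ("john", "hasName",   "John Smith"),
    ("john", "bornOn",    "1982-04-12"),
    ("john", "relatedTo", "mary"),
    ("john", "worksFor",  "acme"),
    ("mary", "hasName",   "Mary Smith"),
}

def match(pattern, store=triples):
    """Return all triples matching an (s, p, o) pattern; None = wildcard."""
    return sorted(t for t in store
                  if all(v is None or v == t[i]
                         for i, v in enumerate(pattern)))

# "Schema changes" are just new predicates, never new tables:
triples.add(("john", "wentToSchoolWith", "bob"))

print(match(("john", None, None)))      # everything known about john
print(match((None, "hasName", None)))   # every name in the store
```

Note how relatedTo, worksFor and wentToSchoolWith, the very relations that forced schema changes in the relational example above, are just ordinary triples here, queryable with the same pattern mechanism as everything else.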
