Intelligible Websites: Human Language Interpretation and Marketing with Metadata

Valentin Buliga, Florin Alexandru Luca, Antonela Curteza, Michel Calciu

ABSTRACT

This paper motivates the need for human language interpretation, discusses the major outstanding challenges in this area, and shows how it can be integrated into marketing and sales. It introduces the area of Human Language Technology, which helps convert the current Web into a web of data, yielding the new Semantic Web and making it possible to create intelligible websites. It shows how marketing can benefit from this technology using metadata, and how it can be integrated to better promote individual products or entire brands.

Introduction

The Semantic Web is a part of the World Wide Web, the system of linked documents and information accessed through the Internet. Internet users can understand the data the Web provides to them, because the results are given in a natural language they understand; the information itself, however, has limited linguistic semantics, meaning that it makes little sense to information systems: they return results without knowing what piece of information is there. For common Internet users this problem is not visible, but to obtain better search results the system has to know exactly what we are searching for as we see it in our mind: not only the words we send to it, but what the words stand for. By encouraging the inclusion of semantic content in web pages, the Semantic Web aims at converting the current Web, dominated by unstructured and semi-structured documents, into a "web of data".

The Role of Human Language Technology

The web revolution has been based mainly on human language materials, and in the transition to the next-generation, knowledge-based web, human language will remain the key.
Human Language Technology (HLT) involves the analysis, mining and production of natural language. HLT has lately evolved to a point at which robust and scalable applications can be developed in a variety of areas, and there is also support for new projects in the Semantic Web area. Figure 1 illustrates how Human Language Technology can be used to join natural language with the current web: it shows what HLT is mainly based on, and the formal knowledge at the basis of the Semantic Web.

Fig. 1. Closing the language loop

Information Extraction (IE) is a process that takes unseen text as input and produces fixed-format, unambiguous data as output. This data may be displayed directly to users, stored in a database or spreadsheet for later analysis, or used for indexing in certain Information Retrieval (IR) applications. While IR simply finds text and presents it to the user, a typical IE application can analyse the content and return only the specific information the user is interested in. For example, a user of an IR system wanting information about companies that produce organic nanofibers would typically type in a list of relevant words and receive in return a set of documents containing likely matches (e.g. newspaper articles). The user would then read the returned documents and extract the required information, perhaps processing the data manually to insert it into a spreadsheet and produce a chart for a report or presentation. An IE system user with a properly configured application, on the other hand, could automatically populate the spreadsheet directly with the names of the companies and the production prices. Populating ontologies and generating metadata are still a challenge for IE systems.

Natural Language Generation (NLG) is the inverse process of IE: from structured data in a knowledge base, NLG techniques can produce natural language text. The result is shaped by the presentational context and the target reader. NLG techniques build models of the context and the user and use them to select appropriate presentation strategies. For example, a promotion company can deliver short summaries as messages to the user's phone, or as emails if the user is on a desktop. Similarly, NLG techniques can use simpler terminology to explain unknown terms to the naive user, while different terminology and text style will be used for the expert user.
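The IE flow described above can be sketched in a few lines of Python. This is a toy rule-based extractor, not a real IE system: the sentence pattern, company names and prices are invented for illustration.

```python
import re

def extract_producers(text):
    """Toy rule-based information extraction: turn free text into
    fixed-format records, as the IE process described above does."""
    # Matches invented sentences of the form "<company> produces <product> at $<price>".
    pattern = re.compile(r"(\w[\w ]*?) produces ([\w ]+?) at \$(\d+(?:\.\d+)?)")
    return [
        {"company": m.group(1), "product": m.group(2), "price": float(m.group(3))}
        for m in pattern.finditer(text)
    ]

news = ("NanoFab Ltd produces organic nanofibers at $120 per kg. "
        "Meanwhile, FiberWorks produces organic nanofibers at $95 per kg.")
records = extract_producers(news)
# The records can now populate a spreadsheet or database directly.
```

A real IE system would replace the regular expression with linguistic analysis, but the contract is the same: unseen text in, unambiguous structured data out.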
The challenge for NLG is generating text from ontologies and metadata, a process that requires the development of new NLG methods and functions allowing easy portability between domains, based on human language and machine learning.
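The user-model idea behind NLG can be sketched as a small template-based generator. The lexicon, field names and device rule below are all hypothetical; real NLG systems use far richer context models.

```python
def generate_summary(fact, user):
    """Toy natural language generation: render the same structured fact
    as text, adapting wording and length to a model of the reader."""
    # Hypothetical lexicon: plain terms for naive users, jargon for experts.
    lexicon = {"naive": "data about data", "expert": "descriptive metadata"}
    term = lexicon[user["expertise"]]
    sentence = f"The page {fact['page']} now carries {term} in {fact['format']}."
    # Phone users get a truncated, SMS-style message; desktop users get it all.
    return sentence[:40] + "..." if user["device"] == "phone" else sentence

fact = {"page": "example.org/about", "format": "RDF"}
print(generate_summary(fact, {"expertise": "naive", "device": "desktop"}))
print(generate_summary(fact, {"expertise": "expert", "device": "phone"}))
```

The same knowledge-base entry thus yields different surface text depending on who is reading and on what channel, which is exactly the presentation-strategy selection described above.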

Metadata structures

The term metadata refers to "data about data". The term can be used for two fundamentally different concepts:

Structural metadata, which refers to the design and specification of data structures and is more properly called "data about the containers of data".

Descriptive metadata, which is about individual instances of application data, the actual data content. In this case, a useful description would be "data about data content", or "content about content", thus metacontent.

Metadata have traditionally been found in the card catalogs of libraries. As information has become increasingly digital, metadata are now used to describe digital data using metadata standards specific to a particular discipline. By describing the context and contents of data files, the usefulness of the original data is greatly increased. For example, a webpage may include metadata specifying what language it is written in, what tools were used to create it, and where to go for more on the subject, allowing browsers to automatically improve the experience of users.

A special framework has been created to work explicitly with metadata: the Resource Description Framework (RDF). The basic structure of RDF is very simple, with three main parts:

the subject: a thing, identified by its URL address;

the predicate: the type of metadata (e.g. title or creator), also identified by a URL address;

the object: the actual value of this type of metadata (e.g. a person named John Doe).

The core element used to nest metadata within existing content on web pages is called microdata. Search engines, web crawlers, and browsers can extract and process the microdata from a web page and use it to provide a richer browsing experience for users. Search engines benefit greatly from direct access to this kind of structured data because it allows them to understand the information on web pages and provide more relevant results for users.

Microdata uses a supporting vocabulary to describe an item, and name-value pairs to assign values to its properties. Microdata is an attempt to provide a simpler way of annotating website elements with machine-readable tags. Microdata vocabularies provide the semantics, or meaning, of an item. Web developers can design a custom vocabulary or use vocabularies available on the web. A collection of commonly used markup vocabularies can be found on the Schema.org website, with schemas covering data types such as Person, Event, Organization, Product, Review, Review-aggregate, Breadcrumb, Offer, Offer-aggregate and many others. Major search engine operators like Google, Microsoft and Yahoo! now rely on this markup to improve search results. For some purposes an existing vocabulary may be reused; in other cases, an ad-hoc vocabulary will need to be designed.
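The name-value pairs carried by microdata attributes can be pulled out with nothing more than Python's standard HTML parser. This is a minimal sketch: it only handles flat items, and the product snippet below is invented (though `itemscope`/`itemprop` and the Schema.org Product type are real).

```python
from html.parser import HTMLParser

class MicrodataParser(HTMLParser):
    """Minimal microdata extractor: collects the text content of every
    element carrying an itemprop attribute as a name-value pair."""
    def __init__(self):
        super().__init__()
        self.items = {}
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemprop" in attrs:
            self._current = attrs["itemprop"]   # remember the property name

    def handle_data(self, data):
        if self._current:                       # text inside an itemprop element
            self.items[self._current] = data.strip()
            self._current = None

snippet = """
<div itemscope itemtype="https://schema.org/Product">
  <span itemprop="name">Organic Nanofiber Mat</span>
  <span itemprop="price">120</span>
</div>
"""
parser = MicrodataParser()
parser.feed(snippet)
print(parser.items)
```

This is essentially what a crawler does at much larger scale: strip the markup away and keep the structured name-value pairs for indexing.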

Together, the subject, the predicate and the object make up RDF statements, which are expressed in a language called RDF/XML.
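The statement structure above can be modelled directly as subject-predicate-object tuples. In the sketch below the Dublin Core predicate URLs are real, but the subject URL, the title and the creator are invented for illustration.

```python
# Each RDF statement is one (subject, predicate, object) triple.
triples = [
    ("http://example.org/paper", "http://purl.org/dc/terms/title",
     "Intelligible Websites"),
    ("http://example.org/paper", "http://purl.org/dc/terms/creator",
     "John Doe"),
]

def objects(graph, subject, predicate):
    """Return every object attached to a subject via a given predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]

print(objects(triples, "http://example.org/paper",
              "http://purl.org/dc/terms/creator"))
```

Real RDF stores index such triples for fast pattern matching, but the query above is the conceptual core: ask for all values of one type of metadata about one thing.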

Putting Semantic Web and Metadata together

The Semantic Web is the part of the Web available in the Resource Description Framework. The idea behind the Semantic Web is that once enough pages carry this machine-processable metadata, developers can build tools that take advantage of it. Among other things, RDF helps different programs talk to each other, reducing the need for users to copy information by hand. Imagine a world where everything is based on embedded RDF: when buying a vacation package, for example, you could drag your itinerary onto your calendar program to add it to your calendar. You could drag a friend's favourite tracks into your music player, and it could try to obtain the songs for you automatically.

RDF can also be used to create more powerful search engines. At the moment, the only type of question a search engine can be asked is: "What pages have these words in them?" If pages include metadata, more advanced questions can be asked, such as "What's the current temperature in Romania?" Programs can also use this information, like an alarm clock that displays the current weather. While searching for a certain thing over the Internet, the user can ask for exactly what he needs, obtain exactly the results he is interested in, and pass over the other potential results. As a bottom line, metadata can be aggregated everywhere across the whole Web. A program could download all your favourite tracks and, with the help of an RDF pricing guide, calculate the cost of buying the most popular albums that contain those tracks.
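The favourite-tracks scenario reduces to a simple aggregation once the metadata is machine-readable. The track-to-album mapping and the prices below are invented stand-ins for data that would really be harvested from RDF across the Web.

```python
from collections import Counter

# Hypothetical aggregated metadata: favourite tracks mapped to their albums,
# plus a pricing guide keyed by album (stand-in for an RDF pricing source).
favourite_tracks = {
    "Track A": "Album X", "Track B": "Album X", "Track C": "Album Y",
}
pricing_guide = {"Album X": 9.99, "Album Y": 12.50}

def cost_of_top_albums(tracks, prices, n=1):
    """Total price of the n albums holding the most favourite tracks."""
    popularity = Counter(tracks.values())           # album -> favourite count
    top = [album for album, _ in popularity.most_common(n)]
    return sum(prices[album] for album in top)

print(cost_of_top_albums(favourite_tracks, pricing_guide, n=1))
```

The program never needed to "understand" music; it only joined two metadata sources by a shared key, which is precisely the kind of cross-program cooperation RDF is meant to enable.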

Conclusion

Adding microdata to a website is a great way to make sure search engines have the information needed to create rich snippets for the search results. Rich snippets offer more information to the searcher than the standard alternatives, giving the content and links a better chance of standing out and being clicked on in the first place. Embedding the right microdata in posts makes it possible for search results to include pictures, reviews, and other interesting pieces of information that help attract the potential visitor. A user would be much more likely to click on a recipe, movie write-up, article or blog post that comes with images and additional information. In conclusion, human language interpretation and microdata are useful for search engines, website owners and Internet users at the same time, attracting users and improving the online browsing experience.

References

Bontcheva K., Cunningham H.: The Semantic Web: A New Opportunity and Challenge for Human Language Technology, ISWC, 2003

Floridi L.: Web 2.0 vs. Semantic Web: A Philosophical Assessment, Episteme Journal, Volume 6, February 2009

Handschuh S., Staab S., Ciravegna F.: S-CREAM -- Semi-automatic CREAtion of Metadata. In 13th International Conference on Knowledge Engineering and Knowledge Management (EKAW02), Siguenza, Spain, 2002

Illian J.: Using Metadata to Market Books, Build Audience and Control Your Future, Digital Book World Marketing & Publishing Services Conference & Expo, 2013

McCoy J.: How Microdata Can Boost Your Site's SEO, February 18th, 2013, http://seocognition.com/howmicrodata-can-boost-your-sites-seo, accessed 15th October 2013

Spahn M., Kleb J., Grimm S., Scheidl S.: Supporting Business Intelligence by Providing Ontology-Based End-User Information Self-Service, Proceedings of the First International Workshop on Ontology-Supported Business Intelligence (OBI '08), 27 October 2008, Karlsruhe, Germany

Metadata Extraction: Human Language Technology and the Semantic Web, Part 1, http://videolectures.net/koml04_cunningham_hltsw1, accessed 17th November 2013

http://en.wikipedia.org/wiki/Metadata, accessed 17th November 2013

http://en.wikipedia.org/wiki/Microdata_(HTML), accessed 17th November 2013

http://www.icbl.hw.ac.uk/perx/advocacy/exposingmetadata.htm, accessed 17th November 2013

