INSTRUMENTAL TOOLS FOR CONSTRUCTION OF THE DIGITAL ARCHIVES OF THE DOCUMENTS BASED ON LINKED DATA

Received: 
21.10.2017
Year: 
2017
Journal issue (Volume): 
UDC: 
004.652.8+004.91
DOI: 

10.26731/1813-9108.2017.4(56).100-107

Pages: 
100-107
Abstract: 

We consider the problem of developing software tools for constructing digital archives of text documents. The main features of the tools are the use of Linked Open Data, based on Semantic Web technologies, to represent and store the logical structure of a document and its metadata; a logical inference system integrated into the server; and tools for composing documents from parts of other documents.

The formalized description of the archive contents is based on a number of standardized ontologies, including Open Annotation, Friend of a Friend, Dublin Core, Provenance, Schema.org, DBpedia, NEPOMUK, and the Bibliographic Ontology. These ontologies address various problems of content representation and document management: document authoring, ownership declaration, forming metadata about document content, describing logical relations between documents, associating data with physical objects, and representing bibliographic data.
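
As an illustration of describing a document with such ontologies, the following minimal sketch builds a few metadata triples using Dublin Core Terms and Friend of a Friend and prints them in N-Triples syntax. It is not the implementation described in the paper: the `example.org` identifiers, the helper names `describe_document` and `to_ntriples`, and the sample title and author name are assumptions made for this example, and only the Python standard library is used.

```python
# Namespace URIs of two ontologies named above.
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical identifiers, invented for this illustration only.
DOC = "http://example.org/archive/doc/1"
AUTHOR = "http://example.org/archive/person/1"


class Literal(str):
    """Marks a plain string value, as opposed to a URI."""


def describe_document(doc, author, title, author_name):
    """Return the metadata of one document as (subject, predicate, object) triples."""
    return [
        (doc, RDF_TYPE, FOAF + "Document"),
        (doc, DCTERMS + "title", Literal(title)),
        (doc, DCTERMS + "creator", author),
        (author, RDF_TYPE, FOAF + "Person"),
        (author, FOAF + "name", Literal(author_name)),
    ]


def to_ntriples(triples):
    """Serialize triples in N-Triples syntax (only backslash and quote are escaped)."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


# Sample values are placeholders, not data from the archive.
triples = describe_document(DOC, AUTHOR, "Course syllabus", "A. Author")
print(to_ntriples(triples))
```

Keeping the metadata as plain triples means that further ontologies, such as Provenance or the Bibliographic Ontology, could be added as extra predicates without changing the data structure.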

Using Semantic Web data representation together with a Prolog engine for data retrieval and processing forms a document-processing environment based on declarative, that is, knowledge-based, specifications. Current SWI-Prolog and Python software and libraries make it possible to develop applications for document data mining and for automatically composing new documents with semantic markup. This extends the regular storage, indexing, and retrieval functions of digital archives with functions for integrating data with other web documents.
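
The declarative retrieval style mentioned above can be sketched in Python as a conjunctive query over triples, loosely analogous to a conjunction of goals in Prolog. This is a toy illustration under stated assumptions, not the SWI-Prolog engine the paper relies on: the `ex:` identifiers, the sample facts, and the helpers `is_var`, `match`, and `solve` are all hypothetical.

```python
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical archive contents: (subject, predicate, object) triples.
FACTS = [
    ("ex:course", DCTERMS + "hasPart", "ex:syllabus"),
    ("ex:course", DCTERMS + "hasPart", "ex:exam"),
    ("ex:syllabus", DCTERMS + "title", "Syllabus"),
    ("ex:exam", DCTERMS + "title", "Exam questions"),
]


def is_var(term):
    """Variables are strings starting with '?'."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact; return None on failure."""
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            # Bind the variable, or check it against its earlier binding.
            if result.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return result


def solve(goals, facts, bindings=None):
    """Yield every variable binding that satisfies all goals (a conjunction)."""
    bindings = {} if bindings is None else bindings
    if not goals:
        yield bindings
        return
    first, rest = goals[0], goals[1:]
    for fact in facts:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from solve(rest, facts, extended)


# "Which titled parts does ex:course consist of?"
query = [
    ("ex:course", DCTERMS + "hasPart", "?part"),
    ("?part", DCTERMS + "title", "?title"),
]
titles = [b["?title"] for b in solve(query, FACTS)]
print(titles)
```

The query states what is wanted rather than how to traverse the data; the resulting list of part titles is the kind of input a document-composition step could start from.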

An example of applying the tools to authoring study course documentation is considered.

Funding: 

The results were obtained with partial support from the Council for Grants of the President of the Russian Federation, state support of leading scientific schools of the Russian Federation (НШ-8081.2016.9).

The results were obtained with active use of the network infrastructure of the Telecommunication Center for Collective Use "Integrated Information and Computing Network of the Irkutsk Research and Educational Complex" (ЦКП ИИВС ИРНОК) (http://net.icc.ru).
