msc-drozdz/abstract.tex
2022-09-14 19:47:18 +00:00

% !TeX encoding = UTF-8
% !TeX spellcheck = en_US
This thesis examines the importance of digitization, focusing primarily on libraries. It presents the concept of a digital library, the process of digitizing the content stored in it, and the challenges and problems that accompany the whole endeavor. The work is practical in nature: in addition to conveying knowledge about digitization and deep learning methods, its main purpose is to present the entire process of building a solution for searching through huge data sets of documents that have already been digitized. The thesis also describes Chronicling America, an initiative conducted in the United States in recent years. The Newspaper Navigator project, created as part of this initiative, was the main inspiration for the topic of this work and its point of origin. The end product is a study of automatic search in historical digitized newspapers, which has resulted in fully functional search software based on image processing with artificial neural networks and on natural language processing techniques. The entire process is described, from the acquisition and processing of input data, through the construction of a custom object detection model, optical character recognition, and a full-text search engine, to a visual user interface that handles user queries in real time.