msc-drozdz/abstract.tex

2022-09-05 11:51:38 +02:00
% !TeX encoding = UTF-8
% !TeX spellcheck = en_US
This thesis examines the importance of digitization, focusing primarily on libraries. It presents the concept of a digital library, the process of digitizing the content to be stored in it, and the challenges and problems that accompany the whole endeavor. The work is practical in nature: besides conveying background on digitization and deep learning methods, its main purpose is to present the entire process of building a solution for searching through huge data sets of documents that have already been digitized. It also describes the Chronicling America project, conducted in the United States in recent years, which was the main inspiration for the topic of this work. The final product of this thesis is fully functional search software based on image processing by artificial neural networks and on natural language processing techniques. The entire process is described, starting with the acquisition and processing of input data, continuing through the construction of a custom object detection model, optical character recognition, and a full-text search engine, and ending with a visual user interface that handles user queries in real time.