Document Digitization, Optical Character Recognition, Banking Academy, FPT.AI

  • Vũ Trọng Sinh https://hvnh.edu.vn/tapchi/vi/so-252-thang-5-23/ung-dung-cong-nghe-nhan-dang-ky-tu-quang-hoc-cho-so-hoa-tai-lieu-tai-hoc-vien-ngan-hang-vu-trong-sinh-10752.html
Keywords: Document Digitization, Optical Character Recognition, Banking Academy, FPT.AI

Abstract

Digital transformation of education and training institutions is becoming an urgent task, and
Banking Academy is no exception. Digitization is a prerequisite for that transformation and must be
promoted continuously. In this paper, the author studies digitization technology
and proposes solutions for digitizing text documents at Banking Academy. Specifically, the article
introduces core technologies in document digitization, such as Optical Character Recognition and Intelligent Text
Processing, surveys typical solutions on the Vietnamese digitization market in order to choose an appropriate
one, and reports an experiment using FPT.AI Reader on datasets collected manually from several
departments of the Academy. The experimental results are impressive: the word error rate is 27% overall and only 16%
on text containing the title, department name, and document type. With further improvement, this solution could be
applied to the digitization process at Banking Academy in the future.
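
The abstract reports results as a word error rate (WER) but does not say how it was computed. As a point of reference only, the sketch below uses the common definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the OCR output, divided by the number of words in the reference. The function name, the whitespace tokenization, and the sample strings are assumptions made for illustration and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace and compared exactly
    (case and diacritics matter). The paper may normalize text differently.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from the output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one misrecognized diacritic in a four-word phrase.
wer = word_error_rate("học viện ngân hàng", "học viên ngân hàng")
print(f"WER = {wer:.2f}")
```

Under this definition, a WER of 27% means that roughly 27 word-level edits are needed per 100 reference words to turn the OCR output into the correct text.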

Published
2023-05-25
Section
Articles