Shadow Removal and Contrast Enhancement for Mobile-Captured Document Images
Keywords:
Document Image Enhancement, Shadow Removal, Contrast Enhancement, Mobile-Captured Documents, OCR
Abstract
Smartphone-captured document images often suffer from uneven illumination, shadows, and low contrast, which reduce visual readability and negatively affect optical character recognition (OCR) performance. This study proposes a two-stage enhancement pipeline for mobile-captured document images by combining shadow removal and contrast enhancement. The method is designed to normalize local illumination and strengthen text-background separation, thereby improving document readability under uncontrolled acquisition conditions. The evaluation was conducted on the SmartDoc-QA dataset using four experimental settings: original images, contrast enhancement only, shadow removal only, and the proposed combined method. Performance was assessed using Character Error Rate (CER), Word Error Rate (WER), and Word Accuracy. Based on the simulated experimental results, the proposed method achieved the best performance, reducing CER from 18.47% to 10.84% and WER from 31.26% to 18.63%, while increasing Word Accuracy from 68.74% to 81.37%. Additional analysis across device subsets, distortion levels, and document types showed that the proposed approach consistently outperformed the baseline and partial enhancement methods. The findings indicate that combining shadow removal and contrast enhancement is a promising preprocessing strategy for improving OCR readiness in smartphone-based document digitization systems.
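The three reported metrics can be illustrated with a minimal sketch. It assumes the standard edit-distance definitions of CER and WER and takes Word Accuracy to be 100 minus WER. The abstract does not spell out these formulas, but the reported figures are consistent with that relation (31.26% + 68.74% = 100% and 18.63% + 81.37% = 100%). The function names below are hypothetical and are not taken from the study.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character Error Rate in percent (assumes a non-empty reference)."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


def wer(reference, hypothesis):
    """Word Error Rate in percent, using whitespace tokenization (an assumption)."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def word_accuracy(reference, hypothesis):
    """Word Accuracy in percent, assumed here to be 100 - WER."""
    return 100.0 - wer(reference, hypothesis)
```

In an evaluation of this kind, `reference` would be the ground-truth transcription and `hypothesis` the OCR output for the same image under one of the four settings. Text normalization choices such as case folding or punctuation handling affect the scores and are not specified in the abstract.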
License
Copyright (c) 2025 Ade Guna Suteja, Muhammad Iqbal

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.




