In 2019, Chulwoo Pack, then a recent South Dakota State University graduate pursuing a doctorate in Lincoln, Nebraska, was asked to participate in a unique internship. The Library of Congress in Washington, D.C., one of the world's largest repositories of information, was digitizing its historical documents and needed expertise in extracting data from them.
Pack, who graduated from SDSU in 2017 with a master's in computer science, had previously developed a computer-aided diagnosis system that generates data on mammogram images to support practitioners in the early detection of breast cancer. Expanding upon this experience, he applied deep learning models to the domain of document images. He worked on segmenting and clustering historical newspapers and images for the Chronicling America repository. The Library of Congress asked Pack to come to Washington, where he helped staff generate "enriched metadata" from the documents using computational image analysis tools and various deep learning models. The metadata could be used by researchers or historians to improve discoverability and searchability.
Now, Pack, an assistant professor in SDSU's Jerome J. Lohr College of Engineering, is continuing his research to develop image analysis tools through deep learning, computer vision and reasoning.
While in Washington, one of Pack's primary challenges was the "noise" present on historical documents, such as the newspapers he was digitizing. This noise can be heterogeneous because it may be introduced at various stages of the digitization process. That heterogeneity complicates the task of "denoising," which in turn leads to failures in subsequent high-level analysis tasks, such as converting documents to a machine-readable, editable format using optical character recognition (OCR) or performing document sentiment analysis with deep learning. Additionally, document images contain heterogeneous "modalities," including visual, tabular and textual ones, which makes it harder to achieve the holistic understanding needed for high-level reasoning tasks such as document question answering or summarization using vision-language models.
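To make the idea of denoising concrete, here is a minimal sketch of one classic technique, a median filter, applied to a tiny grayscale patch. It is an illustration only and is not drawn from Pack's work at the Library of Congress; the function name `median_filter`, the `radius` parameter and the sample pixel values are all hypothetical.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Return a copy of `image` (a list of rows of grayscale values)
    with each pixel replaced by the median of its neighborhood."""
    height = len(image)
    width = len(image[0]) if height else 0
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect the neighborhood, clipping the window at the image border.
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # median_low always returns a value that occurs in the window,
            # so integer pixel values stay integers.
            row.append(median_low(window))
        result.append(row)
    return result


# Hypothetical 5x5 patch of a scanned page: light background (200)
# with a single dark speck (0) in the middle.
noisy_patch = [
    [200, 200, 200, 200, 200],
    [200, 200, 200, 200, 200],
    [200, 200,   0, 200, 200],
    [200, 200, 200, 200, 200],
    [200, 200, 200, 200, 200],
]

cleaned_patch = median_filter(noisy_patch)
print(cleaned_patch[2][2])  # the speck is replaced by the background value
```

A filter like this handles only isolated specks. Noise introduced at other stages of digitization, such as uneven lighting or ink bleed-through, generally calls for different treatment, which is why heterogeneous noise is hard to remove with any single method.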
Addressing such heterogeneities in noise and modality is one of the biggest challenges in the field of image analysis and is one of the focuses of Pack's forthcoming research at SDSU.
"Previously, my research was focused on image analysis on 2D space, but I want to look into more dynamic spaces, including video analysis," Pack said. "The long-term vision of my research is to establish the fundamental framework for a computer vision system to support human users. Progressing toward this vision, I am actively conducting research in the following areas, image processing, machine learning, deep learning, document understanding, medical image analysis and visual question answering."
The applications for computer vision are far-reaching, and Pack, a faculty member in the Department of Electrical Engineering and Computer Science, has already been collaborating with other researchers, including faculty members in SDSU's Department of Agricultural and Biosystems Engineering and School of Education, Counseling and Human Development.
Pack has been an SDSU faculty member since last fall. He said he has enjoyed returning to the school where he earned his bachelor's and master's degrees. Originally from Daejeon, South Korea, Pack completed the first half of his undergraduate degree at the University of Ulsan, which had a dual degree partnership with SDSU.
Outside of his research and other academic responsibilities, Pack enjoys playing tennis, a sport he picked up while completing his doctorate.