NovoDynamics Delivers New Technologies to Overcome Enterprise Content Management Recognition and Classification Challenges
By Drew Barrows
NovoDynamics, an Ann Arbor, MI-based technology solutions provider, has recently announced VERUS™ Asia, the newest version of its VERUS Optical Character Recognition (OCR) software, which supports Chinese, Korean and Russian. This new version of VERUS builds on the company’s highly regarded OCR capabilities for difficult-to-decipher Middle Eastern languages such as Arabic, Persian (Farsi, Dari), Pashto, Urdu and Hebrew.
Optical Character Recognition is the machine recognition of printed characters from a text document scanned into a computer. OCR systems can recognize many different fonts, as well as typewritten and computer-printed characters. Advanced OCR, like VERUS, can also determine whether a page contains primarily handwritten text, which can then be separated for processing. Handwritten text, non-Latin fonts and old, torn or otherwise degraded documents are extremely difficult to recognize, which is where NovoDynamics sees its strengths. Some have said that OCR is as much an art as a science, and NovoDynamics has fine-tuned its technology on both counts to topple traditional character recognition obstacles.
OCR challenges, with Middle Eastern languages in particular, include a right-to-left writing style; connected characters (similar to cursive handwriting); characters that combine to form new characters in different contexts; and small diacritical marks used to represent short vowels. With its advanced technology, VERUS has proven to be one of the most accurate and reliable solutions on the market today, according to company sources.
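Two of these challenges, right-to-left direction and short-vowel diacritics, are visible in the Unicode properties of the characters themselves. The following is a minimal sketch using only Python's standard `unicodedata` module. It is not a description of how VERUS works; the helper names `describe` and `strip_short_vowels` and the sample word are illustrative assumptions.

```python
import unicodedata


def describe(text):
    """Return (name, bidi class, is_combining_mark) for each code point."""
    return [
        (
            unicodedata.name(ch, "UNKNOWN"),
            unicodedata.bidirectional(ch),   # "AL" = Arabic letter (right-to-left)
            unicodedata.combining(ch) != 0,  # True for combining marks
        )
        for ch in text
    ]


def strip_short_vowels(text):
    """Drop combining marks, such as Arabic short-vowel diacritics."""
    return "".join(ch for ch in text if unicodedata.combining(ch) == 0)


# Hypothetical sample: the letters kaf, teh, beh, each followed by a fatha (U+064E).
word = "\u0643\u064E\u062A\u064E\u0628\u064E"

for name, bidi, is_mark in describe(word):
    print(f"{name:<22} bidi={bidi:<4} combining={is_mark}")

bare = strip_short_vowels(word)
print(len(word), len(bare))
```

The same three consonants appear with or without their vowel marks, so a recognizer has to treat both spellings as the same word while still reading the marks when they are present.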
In addition to its character recognition capabilities, VERUS can automatically detect and clean degraded and skewed documents, identify a page’s primary language, recognize fonts without manual intervention, determine whether a page contains primarily handwritten or machine-printed text, and convert extracted information to PDF files.
Yale University and the AMEEL Project
One of the more fascinating applications of the VERUS technology is the AMEEL (Arabic and Middle Eastern Electronic Library) project, a venture of Yale University. The mission of Project AMEEL is to create a scholarly Web-based portal for the study of the Middle East, including its history, culture, development and contemporary face, and to integrate new or existing scholarly digital content within this portal. Most of the content for the portal is contained within books and newspapers. NovoDynamics’ VERUS™ Middle East product was selected by Yale University to convert its references into a digital format so that content could be searched, retrieved and displayed.
The Yale team has been working in partnership with the Bibliotheca Alexandrina in Alexandria, Egypt, on complementary digitization initiatives utilizing VERUS Middle East OCR technology. Although much of the content, including yellowed newspapers, is very old and degraded, VERUS has been able to extract information that was previously unrecognizable.
According to NovoDynamics’ President, David Rock, VERUS has made a significant contribution to the preservation and broad distribution (via the Web) of historically significant documents. “VERUS is much more advanced than typical OCR solutions. One of the Bibliotheca’s project managers told us they had originally tried to use a competitor’s OCR software but couldn’t extract any useable information.”
Advanced Technology Creates Benefits for the Enterprise
NovoDynamics was incorporated in 2001. The company’s team of scientists, with more than 100 years of combined experience in artificial intelligence, image analysis and data mining, has created leading-edge software systems for large government agencies and major commercial customers. In addition, they have published numerous papers and hold several patents in diverse areas such as machine learning (neural networks, genetic algorithms, recursive partitioning), postal automation, vehicle tracking, chemical and pharmaceutical drug discovery, data mining of sales transactions and, of course, OCR for English and other languages. The VERUS platform was specifically developed to extract information from challenging languages and degraded digital images, and to search through large, complex data sets for critical information required by customers, initially in the government, academic and security sectors.
At its inception, NovoDynamics was primarily a research and technology services organization. However, the company recognized the opportunity to expand its technology to the commercial sector as more and more organizations began doing business globally. Just about any vertical market that relies on paper-based, manual operations can benefit from NovoDynamics’ technology. This includes organizations in sectors such as health care, financial services, manufacturing and government.
Today NovoDynamics has evolved from its research and technology services foundation into a software solutions company that addresses the needs of a larger commercial audience, particularly enterprises that need OCR capabilities for languages that do not use Latin-based scripts.
Enterprises encounter tough-to-recognize or degraded documents every day, such as faxes, photocopies, pages with stains and low-resolution scans, Rock explained. He indicated that many enterprises have had low expectations for OCR technology because of its traditional inability to provide accurate results. This is where VERUS shines. “There are very few OCR products on the market today that offer the accuracy of VERUS. Accuracy creates improved information management for large volumes of paper documents. This accuracy results in attractive Enterprise Content Management (ECM) benefits like creating a central repository for documents, establishing uniform processes and quality assurance for information handling, rapid access to information and improved efficiencies.” He added, “Any industry that needs to capture, manage, store or deliver document information needs technology that can extract information from both clean and degraded documents. NovoDynamics’ technology can play a key role in allowing any paper-based business or government agency to transition to a digital solution.”
Coming Soon: NovoDynamics’ Intelligent Document Recognition Technology
Just as VERUS has tackled challenges in the OCR sphere, NovoDynamics’ newest product for the enterprise will, according to company officials, establish a new benchmark for Intelligent Document Recognition (IDR) technology. Coronado™ will be officially launched before the end of March. Like VERUS, it is targeted at enterprises with large paper-based systems that need a quick, easy-to-use and cost-efficient way to streamline their business or organizational processes. “Coronado utilizes breakthrough technology in sorting and recognizing documents. It’s easy to use and can recognize documents whose contents have been scaled or shifted. It provides an innovative approach to measuring the system’s accuracy so that customers can be confident that their documents will be recognized correctly. It’s very fast and supports global languages,” Rock said.
Expect to hear much more from NovoDynamics. The company plans to continue expanding the VERUS technology and, with Coronado, is opening the door to additional opportunities. “The underlying strengths and experience gained from years of applied research have been valuable assets in allowing the company to rapidly create innovative commercial solutions,” Rock said. “We will continue to enhance our technology and expand our supported languages to address the needs of our commercial, government and academic customers.”
More information about NovoDynamics can be found at: www.novodynamics.com.