ICDAR 2011 Page Dewarping Contest
Introduction
One of the major challenges in camera-captured document analysis is dealing with page curl and perspective distortions. Current OCR systems do not expect these types of artifacts, and perform poorly when applied directly to camera-captured documents. The goal of page dewarping is to flatten a camera-captured document so that it becomes readable by current OCR systems. Page dewarping has triggered a lot of interest in the scientific community over the last few years and many approaches have been proposed.
We organized the first dewarping contest at CBDAR 2007. For more information and results, please see the following paper [pdf]. The ICDAR 2011 dewarping contest follows the successful running of that CBDAR 2007 contest.
For queries regarding the dewarping contest, please contact:
Syed Saqib Bukhari
----------------------------
Technical University of Kaiserslautern, Germany
IUPR - Image Understanding and Pattern Recognition
Room No: 453, Building 48
Mob: +4917668120797
Home: +496316254879
Email: bukhari@iupr.com
Web: http://www.iupr.com
https://sites.google.com/a/iupr.com/bukhari/
[News: 29th April 2011]: Those who are interested in participating in this contest, please contact us via email: bukhari@iupr.com
[News: 15th May 2011]: The complete dataset can be downloaded now. Those who are interested in participating in this contest, please contact us now and submit results by 1st June 2011 (email: bukhari@iupr.com)
[News: 24th May 2011]: The deadline for submitting results is extended to 8th June 2011.
Schedule:
The dataset will be composed of warped documents with ASCII text and dewarped (scanned) document ground-truth. Proposed deadlines for the contest are as follows.
Training dataset: 30th March 2011 (Download Training Dataset)
Complete (Testing + Training) Dataset: 15th May 2011 (Available for Download): warped documents part1, warped documents part2, ASCII text ground-truth, dewarped (scanned) document ground-truth, README
Return results on complete dataset: 8th June 2011 (extended from 1st June 2011)
Organizers:
Syed Saqib Bukhari: Syed Saqib Bukhari is a PhD student in the Image Understanding and Pattern Recognition (IUPR) research group at the computer science department of the University of Kaiserslautern, Germany. He received his Bachelor's and Master's degrees (with honors) in Computer Systems Engineering from NED University of Engineering and Technology, Karachi, Pakistan. His research interests include Camera-Captured Document Image Processing (Binarization, Text/Non-Text Segmentation, Text-Line Detection, and Dewarping). He has co-authored over 10 publications in peer-reviewed conferences and journals in this area.
Faisal Shafait: Dr. Faisal Shafait is a Senior Researcher in the Multimedia Analysis and Data Mining Competence Center at the German Research Center for Artificial Intelligence (DFKI GmbH) in Kaiserslautern, Germany. He received a Bachelor's degree (with honors) in Electrical Engineering from UET Taxila, Pakistan in 2002, a Master's degree (with honors) in Information and Communication Systems from TUHH, Germany in 2005, and a PhD degree (summa cum laude) in Computer Engineering from TUKL, Germany in 2008. His research interests include document image analysis and pattern recognition. He has co-authored over 60 publications in peer-reviewed conferences and journals in this area and has also contributed a substantial amount of code to both the OCRopus and Tesseract open-source OCR systems. He is an Editorial Board member of the International Journal on Document Analysis and Recognition (IJDAR) and was a co-chair of the CBDAR 2007 Document Image Dewarping Contest as well as the GREC 2007 Arc Segmentation Contest.
Thomas M. Breuel: Prof. Dr. Thomas M. Breuel is a professor of computer science and head of the Image Understanding and Pattern Recognition (IUPR) research group at the University of Kaiserslautern, and a consultant in Palo Alto, CA, USA. His research group works in the areas of image understanding, document imaging, computer vision, and pattern recognition. Previously, he was a researcher at Xerox PARC, the IBM Almaden Research Center, IDIAP, Switzerland, as well as a consultant to the US Bureau of the Census. He is an alumnus of the Massachusetts Institute of Technology and Harvard University.