Fix CompDocError When Reading Excel File With Xlrd

You may have seen this CompDocError before if you used python xlrd library to read the older version of the excel file (.xls). When directly opening the same file from Microsoft Excel, it is able to show the data properly without any issue.

This usually happens if the excel file is generated from 3rd party application, the program did not follow strictly on the Microsoft Excel standard format, although the file is readable by Excel but it fails when opening it with xlrd library due to the non-standard format or missing some meta data. As you may have no control on how the 3rd party application generate the file, you will need to find a way to handle this CompDocError in your code.

SOLUTIONS FOR COMPDOCERROR

If you look at the error message, the error raised from the line 427 in the compdoc.py in your xlrd package. Since you confirm there is no problem with the data in your excel file except the minor format issue, you can open the compdoc.py and comment out the lines for raising CompDocError exception.

while s >= 0: 
if self.seen[s]:
pass
#print("_locate_stream(%s): seen" % qname, file=self.logfile); dump_list(self.seen, 20, self.logfile)
#raise CompDocError("%s corruption: seen[%d] == %d" % (qname, s, self.seen[s]))

Option 2:

You may notice if you open your file in Microsoft Excel and save it, you will be able to use xlrd to read and no exception will be raised. This is because Excel already fixed the issues for you when saving the file. You can use the same approach in your code to fix this problem.

To do that, you can use the pywin32 library to open the native Excel application and re-save the file.

import win32com.client as win32 
excel_app = win32.Dispatch('Excel.Application')
wb = excel_app.Workbooks.open("test.xls")
excel_app.DisplayAlerts = False
#do not show any alert when closing the excel
wb.Save()
excel_app.quit()

Conclusion

For option 2, when calling the excel_app.quit(), the entire Excel application will be closed without any alert. If you have other excel files opening at the time, it will be all closed together. So this solution is good if your program will run in a standalone environment or you confirm no other process/people will be using excel when running your code.

If you would like to understand more about how to read & write excel file with xlrd, please check this article.

Originally published at https://www.codeforests.com on May 28, 2020.

Resources and tutorials for python, data science and automation solutions