New Xml class from data does not work - bug in code #4439

@jlpmartins

Description of the bug

The code under "How to reproduce the bug" fails inside Xml.__init__. Execution reaches the branch that is meant to parse the node from an HTML5 string, but the check for whether the argument is a str instance calls isinstance without rhs as its first argument.
Now it is:
2069 elif isinstance( str)
Should be:
2069 elif isinstance(rhs, str)
The error is still present on the main branch, at line 2069 of __init__.py, in Xml.__init__. In the installed 1.24.2 the same statement is at line 1858, as the traceback shows.
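
For anyone who wants to see the failure without installing PyMuPDF, here is a minimal sketch of the same mistake. `FakeXml` and its `fixed` flag are hypothetical stand-ins invented for this illustration; they are not PyMuPDF code and only mimic the type check described above.

```python
class FakeXml:
    """Hypothetical stand-in for the constructor's type check (not PyMuPDF code)."""

    def __init__(self, rhs, fixed=True):
        if fixed:
            # Corrected call: the object under test is the first argument.
            is_text = isinstance(rhs, str)
        else:
            # Buggy call as quoted in this report: rhs is missing,
            # so Python raises TypeError before any branch is taken.
            is_text = isinstance(str)
        self.kind = "str" if is_text else type(rhs).__name__


try:
    FakeXml("<p>hello</p>", fixed=False)
except TypeError as exc:
    print("buggy check failed:", exc)

print(FakeXml("<p>hello</p>").kind)   # str input takes the str branch
print(FakeXml(b"<p>hello</p>").kind)  # bytes input does not match str
```

One thing the sketch makes visible: the reproduction passes the result of `.encode()`, which is bytes, so it would not match a `str` check even after the fix. Whether the real constructor is meant to accept bytes is not something this sketch settles.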

Error:

TypeError Traceback (most recent call last)
/tmp/ipykernel_1557/1293611781.py in ?()
----> 1 fitz.Xml(enc_string)

~/anaconda3/envs/studies/lib/python3.10/site-packages/fitz/__init__.py in ?(self, rhs)
1855 def __init__( self, rhs):
1856 if isinstance( rhs, mupdf.FzXml):
1857 self.this = rhs
-> 1858 elif isinstance( str):
1859 buff = mupdf.fz_new_buffer_from_copied_data( rhs)
1860 self.this = mupdf.fz_parse_xml_from_html5( buff)
1861 else:

TypeError: isinstance expected 2 arguments, got 1

How to reproduce the bug

Running on WSL (Ubuntu 20.04).
PyMuPDF version 1.24.2, but this will also happen on the newest version, since the error is still in the code.

Code:

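import fitz  # PyMuPDF
html5_string = "<p>Hello</p>"  # placeholder; any HTML5 text triggers the error
enc_string = html5_string.encode()
html_node = fitz.Xml(enc_string)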


PyMuPDF version

1.24.x or earlier

Operating system

Linux

Python version

3.10
