Description
Is your feature request related to a problem? Please describe.
I have a PDF that contains bookmarks.
Each bookmark has a nameddest.
After calling document.get_toc() I get a list of bookmarks, but the dest entry of each bookmark has no page, so I stepped into the linkDest class to find the reason.
class linkDest:
    def __init__(self, obj, rlink, document=None):
        ...
        if document and m:
            named = m.group(1)
            self.named = document.resolve_names().get(named)
            if self.named is None:
                # document.resolve_names() does not contain an
                # entry for `named`, so use an empty dict.
                self.named = dict()
            self.named['nameddest'] = named
        else:
            self.named = uri_to_dict(self.uri[1:])
        ...
I found the problem at the line "self.named = document.resolve_names().get(named)": named is an escaped (URI-encoded) str,
while each key in the dict returned by document.resolve_names() is an unescaped str,
so the lookup fails and self.named is None whenever the escaped and unescaped forms of the name differ.
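The mismatch can be reproduced without a PDF. The following is a minimal sketch, not PyMuPDF code: the dict `resolved` is a hypothetical stand-in for the result of document.resolve_names(), the destination name "Chapter 1" is made up, and urllib.parse.quote is assumed to approximate the escaping applied to the name in the link URI.

```python
from urllib.parse import quote

# Hypothetical stand-in for document.resolve_names(): keys are unescaped.
resolved = {"Chapter 1": {"page": 0}}

# Hypothetical escaped name, as it might appear in the link URI.
named = quote("Chapter 1")

# The escaped form differs from the dict key, so the lookup misses.
print(repr(named))
print(resolved.get(named))
```

Any name containing a space, a non-ASCII character, or another character that gets percent-encoded would miss in the same way.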
Describe the solution you'd like
To solve this issue, named should be unescaped before the lookup "self.named = document.resolve_names().get(named)":
from pymupdf.mupdf import ll_fz_decode_uri_component
named = ll_fz_decode_uri_component(named)
self.named = document.resolve_names().get(named)
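The proposed order of operations (decode first, then look up, then record the name) could be sketched as a standalone helper. This is only an illustration under stated assumptions: `resolve_named` is a hypothetical function, not part of PyMuPDF, and urllib.parse.unquote is used as a stand-in decoder so the example runs without the library.

```python
from urllib.parse import unquote


def resolve_named(names, named, decode=unquote):
    """Look up an escaped destination name in a table keyed by unescaped names.

    `names` plays the role of document.resolve_names(); `decode` is the
    unescaping function (a stand-in, see the note below).
    """
    named = decode(named)            # unescape before the lookup
    entry = names.get(named)
    # Copy the entry so the shared table is not modified; fall back to
    # an empty dict when the name is unknown.
    result = dict(entry) if entry is not None else {}
    result["nameddest"] = named
    return result


# Hypothetical table and names, for illustration only.
table = {"Chapter 1": {"page": 3}}
found = resolve_named(table, "Chapter%201")
missing = resolve_named(table, "No%20Such%20Name")
```

Inside PyMuPDF the decoder would be ll_fz_decode_uri_component, as proposed above; it may differ from unquote in edge cases, so the stand-in should not be read as equivalent.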
Describe alternatives you've considered
Are there several options for how your request could be met?
Additional context
The pdf:
21.0_TREATMENT OF ADOLESCENTS AND CHILDREN WITH CH.pdf