Bugs in pyodide #4507

@SF73

Description of the bug

Hi,
First of all, I really like handling PDFs with PyMuPDF.
Right now, I'm trying to port some small, basic tools to Pyodide.

I don't know how much effort is put into building it for Pyodide, but I noticed several bugs or discrepancies.

For example, I encountered #4423, and to fix it I had to slightly adapt the solution from this comment, because doc.xref_object(xref) never raises an exception in Pyodide (instead, it returns the string 'null'):

// Adapted from https://github.com/pymupdf/PyMuPDF/issues/4423#issuecomment-2773807348
// Should be removed when pymupdf 1.26.0 is released?

export function fixXref(doc) {
    for (let xref = 1; xref < doc.xref_length(); xref++) {
        const o = doc.xref_object(xref); // never raises exceptions but returns 'null'
        if (o === 'null') {
            doc.update_object(xref, "<<>>");
        }
    }
}

Another issue I noticed is that the result of page.get_image_info(xrefs=True) doesn't contain the xref. For example:

export async function compress({ pymupdf, pyodide, buffer }) {
    const doc = pymupdf.Document.callKwargs({ stream: pyodide.toPy(buffer) });
    const pageCount = doc.page_count;
    for (const page of doc) {
        console.log("Page:", page);
        const images = page.get_image_info({ xrefs: true }).toJs({ dict_converter: Object.fromEntries });
        for (const img of images) {
            console.log("Image:", img);
        }
    }
}
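
One thing that may be worth ruling out: when a Python callable is invoked from JavaScript through Pyodide, a plain JS object passed as an argument is treated as a positional argument, not as keyword arguments. If that is what happens in the call above, `{ xrefs: true }` would bind to the first parameter of get_image_info (documented as `hashes`) and `xrefs` would keep its default. The sketch below forwards the flag with `callKwargs` instead. The helper name `getImageInfoWithXrefs` is hypothetical, and I have not confirmed that this changes the result with this particular wheel.

```javascript
// Hypothetical helper (not part of PyMuPDF or Pyodide).
// `page` is assumed to be a PyProxy of a pymupdf.Page.
function getImageInfoWithXrefs(page) {
  // callKwargs treats its last argument as Python keyword arguments,
  // so this is meant to correspond to page.get_image_info(xrefs=True).
  const infoProxy = page.get_image_info.callKwargs({ xrefs: true });
  try {
    // Convert the Python list of dicts into plain JS objects.
    return infoProxy.toJs({ dict_converter: Object.fromEntries });
  } finally {
    // Release the intermediate proxy once it has been converted.
    if (typeof infoProxy.destroy === "function") {
      infoProxy.destroy();
    }
  }
}
```

If the returned objects then include an "xref" key, the discrepancy would come from how the arguments are passed rather than from the Pyodide build itself.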

For each image, the compress function above logs an object like this:

{
  "number": 5,
  "bbox": [
    192.4720001220703,
    46.254425048828125,
    433.41748046875,
    181.43099975585938
  ],
  "transform": [
    240.94546508789062,
    0,
    0,
    135.17657470703125,
    192.4720001220703,
    46.254425048828125
  ],
  "width": 975,
  "height": 547,
  "colorspace": 3,
  "cs-name": "DeviceRGB",
  "xres": 96,
  "yres": 96,
  "bpc": 8,
  "size": 45700,
  "digest": {
    "0": 130,
    "1": 115,
    "2": 83,
    "3": 79,
    "4": 72,
    "5": 42,
    "6": 35,
    "7": 50,
    "8": 136,
    "9": 201,
    "10": 53,
    "11": 103,
    "12": 95,
    "13": 15,
    "14": 63,
    "15": 136
  },
  "has-mask": false
}
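
As a side note, the "digest" field shows up as an object with numeric keys, presumably because the Python bytes value is converted to a Uint8Array, which serializes that way. A minimal sketch for turning it into a hex string so it is easier to compare between images (the helper name `digestToHex` is hypothetical):

```javascript
// Hypothetical helper: converts a digest (Uint8Array, or the index-keyed
// object obtained after JSON serialization) to a lowercase hex string.
function digestToHex(digest) {
  const bytes = digest instanceof Uint8Array
    ? Array.from(digest)
    : Object.values(digest); // integer-like keys are visited in ascending order
  return bytes.map((b) => b.toString(16).padStart(2, "0")).join("");
}

// The digest from the output above:
const exampleDigest = {
  0: 130, 1: 115, 2: 83, 3: 79, 4: 72, 5: 42, 6: 35, 7: 50,
  8: 136, 9: 201, 10: 53, 11: 103, 12: 95, 13: 15, 14: 63, 15: 136,
};
console.log(digestToHex(exampleDigest)); // 8273534f482a233288c935675f0f3f88
```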

How to reproduce the bug

Info:

  • I'm using the wheel artifact from the GitHub Actions build: pymupdf-1.25.5-cp312-abi3-pyodide_2024_0_wasm32.whl
  • Pyodide 0.27.5

PyMuPDF version

1.25.5

Operating system

Other

Python version

3.12
