Linebreak inserted between each letter

### Description of the bug

Hey, thank you so much for this amazing tool!

I am using PyMuPDF to parse many official French documents. Each one contains a cover, a table of contents, and pages of scanned content. The vast majority are read without any problem, but for a small number of them a line break is inserted between each letter of the extracted text, making it almost unreadable.

Here are links to a few documents where this happens:

- [https://www.loire-atlantique.gouv.fr/contenu/telechargement/57967/423894/file/RAA n°056 du 3 avril 2023.pdf](https://www.loire-atlantique.gouv.fr/contenu/telechargement/57967/423894/file/RAA%20n%C2%B0056%20du%203%20avril%202023.pdf)
- [https://www.loire-atlantique.gouv.fr/contenu/telechargement/58441/427324/file/RAA n°78 du 28 avril 2023.pdf](https://www.loire-atlantique.gouv.fr/contenu/telechargement/58441/427324/file/RAA%20n%C2%B078%20du%2028%20avril%202023.pdf)
- [https://www.loire-atlantique.gouv.fr/contenu/telechargement/58439/427314/file/RAA n°77  du 28 avril 2023.pdf](https://www.loire-atlantique.gouv.fr/contenu/telechargement/58439/427314/file/RAA%20n%C2%B077%20%20du%2028%20avril%202023.pdf)

### How to reproduce the bug

For instance, here is an example with the second mentioned document:

```python
>>> import pymupdf
>>> f = "2023-04-28-ee04e9ccb016e7806a7cf92a48155834.pdf"
>>> doc = pymupdf.Document(f)
>>> doc[0].get_text("blocks")
[
    (164.6999969482422, 377.63739013671875, 436.3139953613281, 394.6753845214844, 'R\nE\nC\nU\nE\nI\nL\n \nD\nE\nS\n \nA\nC\nT\nE\nS\n \nA\nD\nMI\nN\nI\nS\nT\nR\nA\nT\nI\nF\nS\n', 0, 0),
    (225.0, 531.0374145507812, 376.00396728515625, 548.0614013671875, 'n\n°\n \n7\n7\n \nd\nu\n \n2\n8\n \na\nv\nr\ni\nl\n \n2\n0\n2\n3\n', 1, 0)
]

>>> pymupdf.version
('1.24.7', '1.24.4', '20240626000001')
```

And here is its first page as I see it:

![Cover of the second mentioned document.](https://github.com/pymupdf/PyMuPDF/assets/1690660/1e2f1b08-3a64-4df7-b2c6-2bcaf7b132f4)
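
In the meantime, a possible stopgap is to post-process the text of affected blocks. The following is only a minimal sketch, not a fix for the underlying extraction: `looks_letter_split` and `repair_block_text` are hypothetical helper names, the thresholds (`max_len=2`, `ratio=0.9`) are assumptions chosen to fit the output shown above, and the sample tuples are copied from that output. It flags a block when nearly all of its lines are one or two characters long, then joins the lines back together.

```python
def looks_letter_split(text, max_len=2, ratio=0.9):
    """Heuristic (assumption): a block is 'letter split' when at least
    `ratio` of its non-empty lines are at most `max_len` characters long."""
    lines = [ln for ln in text.split("\n") if ln != ""]
    if len(lines) < 3:
        # Too few lines to tell; leave the block alone.
        return False
    short = sum(1 for ln in lines if len(ln) <= max_len)
    return short / len(lines) >= ratio


def repair_block_text(text):
    """Join the lines of a letter-split block; return other text unchanged."""
    if looks_letter_split(text):
        return "".join(text.split("\n"))
    return text


# Block tuples as returned by page.get_text("blocks") in the report above:
# (x0, y0, x1, y1, text, block_no, block_type)
blocks = [
    (164.6999969482422, 377.63739013671875, 436.3139953613281, 394.6753845214844,
     'R\nE\nC\nU\nE\nI\nL\n \nD\nE\nS\n \nA\nC\nT\nE\nS\n \nA\nD\nMI\nN\nI\nS\nT\nR\nA\nT\nI\nF\nS\n',
     0, 0),
    (225.0, 531.0374145507812, 376.00396728515625, 548.0614013671875,
     'n\n°\n \n7\n7\n \nd\nu\n \n2\n8\n \na\nv\nr\ni\nl\n \n2\n0\n2\n3\n',
     1, 0),
]

for block in blocks:
    print(repair_block_text(block[4]))
```

The obvious limitation is that any genuine line breaks inside an affected block are lost as well, so this only helps for short, single-line blocks like the cover title.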

Please let me know if I can provide any further information!

PS: Is there any "debugging tool" that would allow you to view text and content blocks as they're seen by PyMuPDF for easier analysis?

### PyMuPDF version

1.24.7

### Operating system

Linux

### Python version

3.11
