Get rotation property of the extracted image #1297
Hi,

My current goal is to extract the images from a PDF together with some of their properties. I also want to know whether an inserted image is rotated and, if so, by how many degrees. I've found this one: "Get the transformation applied to an image".

For the PDF file created using PyMuPDF:

    import fitz
    from io import BytesIO
    from PIL import Image

    document = fitz.open('Via PyMuPDF.pdf')
    for page_idx, page in enumerate(document):
        for idx, img in enumerate(page.get_images(), start=1):
            xref = img[0]  # the first item of each entry is the image xref
            image = document.extract_image(xref)
            image_bytes = image['image']
            image_buffer = BytesIO(image_bytes)
            pil_image = Image.open(image_buffer)
            print(pil_image.info)

    # ---- RESULT ----
    # {'dpi': (96.012, 96.012)}

For the PDF file created using MS Word:

    import fitz
    from io import BytesIO
    from PIL import Image

    document = fitz.open('Via MS Word.pdf')
    for page_idx, page in enumerate(document):
        for idx, img in enumerate(page.get_images(), start=1):
            xref = img[0]  # the first item of each entry is the image xref
            image = document.extract_image(xref)
            image_bytes = image['image']
            image_buffer = BytesIO(image_bytes)
            pil_image = Image.open(image_buffer)
            print(pil_image.info)

    # ---- RESULT ----
    # {'jfif': 257, 'jfif_version': (1, 1), 'dpi': (0, 0), 'jfif_unit': 1, 'jfif_density': (0, 0)}

As you can see, the two files give different results. I've also tried examining what PyMuPDF itself returns for the image:

    import fitz

    doc = fitz.open('Via PyMuPDF.pdf')
    for page in doc:
        for img in page.get_images():
            print(img)  # (1)
            xref = img[0]
            image = doc.extract_image(xref)
            print(image)  # (2)

Result of the first print:

    (5, 0, 600, 400, 8, 'DeviceRGB', '', 'fzImg0', '')

Result of the second print:

    {
        "ext": "png",
        "smask": 0,
        "width": 600,
        "height": 400,
        "colorspace": 3,
        "bpc": 8,
        "xres": 96,
        "yres": 96,
        "cs-name": "DeviceRGB",
        "image": <removed for display purposes>
    }

The following are the sample files I've used:

Any help will be greatly appreciated. Thank you so much 🙇‍♂️
Replies: 6 comments 3 replies
You can determine the bbox and transformation for each image on the page:
I should have written "clockwise". Please consult the documentation on Matrix. Look at this snippet:

    >>> import fitz
    >>> fitz.Matrix(90)  # rotation matrix
    Matrix(0.0, 1.0, -1.0, 0.0, 0.0, 0.0)
    >>> fitz.Matrix(0.4, 0.4)  # scale matrix
    Matrix(0.4, 0.0, 0.0, 0.4, 0.0, 0.0)
    >>> fitz.Matrix(90) * fitz.Matrix(0.4, 0.4)  # the product
    Matrix(0.0, 0.4000000059604645, -0.4000000059604645, 0.0, 0.0, 0.0)

Jorj
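
To see where the product in the session above comes from, the multiplication can be spelled out without PyMuPDF. This is only a sketch: it assumes the six-number layout `(a, b, c, d, e, f)` and the rotation form `(cos, sin, -sin, cos, 0, 0)` that the `fitz.Matrix(90)` output above suggests. The helper names `rotation`, `scale` and `concat` are made up for illustration and are not part of the library.

```python
import math


def rotation(deg):
    """Rotation matrix in the (a, b, c, d, e, f) layout printed above."""
    t = math.radians(deg)
    return (math.cos(t), math.sin(t), -math.sin(t), math.cos(t), 0.0, 0.0)


def scale(sx, sy):
    """Scale matrix: sx along x, sy along y, no translation."""
    return (sx, 0.0, 0.0, sy, 0.0, 0.0)


def concat(m1, m2):
    """Product m1 * m2 of two six-number matrices.

    Each tuple stands for the 3x3 matrix [[a, b, 0], [c, d, 0], [e, f, 1]].
    """
    a1, b1, c1, d1, e1, f1 = m1
    a2, b2, c2, d2, e2, f2 = m2
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


product = concat(rotation(90), scale(0.4, 0.4))
# Round away floating-point noise such as cos(90 degrees) = 6e-17.
print([round(v, 6) for v in product])
```

Up to rounding, the printed values should match the product shown in the session: the rotation contributes the pattern of zeros and signs, and the scale contributes the factor 0.4.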
Quoted message from Arjun Manipes (sent Wednesday, 29 September 2021, 04:05), addressed to @JorjMcKie:
I'm sorry, I don't understand how you concluded that the image is rotated by 90 degrees just from the sides being scaled by a factor of 0.4.

    >>> shrink * transform
    Matrix(0.0, -0.39920157194137573, 0.3992016017436981, 0.0, 100.0, 287.6247253417969)
    >>> #------------------------------------------------
    >>> # the above shows:
    >>> # image sides scaled by same factor 0.4
    >>> # image rotated by 90 degrees anti-clockwise
    >>> #----------------------------------------------

Is there some sort of formula to get this? 😮
You can use this script to do some basic matrix analysis. So far it can only investigate rotations by multiples of 90 degrees.
Jorj
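
The script mentioned above is not reproduced in this thread. As a rough idea of what such an analysis might look like, here is a minimal sketch in plain Python. It is not the script Jorj refers to. It assumes the 2x2 part of the matrix is `(a, b, c, d)` and that a rotation by `angle` has the form `(cos, sin, -sin, cos)`, as in the `fitz.Matrix(90)` output earlier in the thread. The function name `analyze` and the tolerance are hypothetical.

```python
import math

# (cos, sin) pairs for the four axis-aligned rotations.
AXIS_ROTATIONS = {0: (1.0, 0.0), 90: (0.0, 1.0), 180: (-1.0, 0.0), 270: (0.0, -1.0)}


def analyze(a, b, c, d, tol=1e-3):
    """Classify (a, b, c, d) as a rotation by a multiple of 90 degrees.

    Returns (angle_in_degrees, x_scale, y_scale), or None if the matrix
    is not an axis-aligned rotation combined with scaling.
    """
    sx = math.hypot(a, b)  # length of the image of the x unit vector
    sy = math.hypot(c, d)  # length of the image of the y unit vector
    if sx == 0 or sy == 0:
        return None
    for angle, (cos_t, sin_t) in AXIS_ROTATIONS.items():
        first_row_ok = abs(a / sx - cos_t) <= tol and abs(b / sx - sin_t) <= tol
        # For a proper rotation the second row is (-sin, cos) times sy.
        second_row_ok = abs(c / sy + sin_t) <= tol and abs(d / sy - cos_t) <= tol
        if first_row_ok and second_row_ok:
            return angle, sx, sy
    return None


# The matrix quoted in the question (translation part dropped).
print(analyze(0.0, -0.39920157194137573, 0.3992016017436981, 0.0))
# The product from Jorj's snippet.
print(analyze(0.0, 0.4, -0.4, 0.0))
```

In this convention the quoted matrix comes out as 270 degrees (equivalently -90) with both sides scaled by about 0.4, and the product from the snippet as 90 degrees. Mirrored or skewed matrices return `None`. Which of the two directions counts as clockwise on the page depends on the orientation of the y-axis, so that part is left to the Matrix documentation.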
BTW, a scale matrix fitz.Matrix(x, x) is equal to fitz.Identity * x.
The previous product could therefore also have been written as fitz.Matrix(90) * 0.4.
Just an update: I've found your old discussion relating to my problem, in case someone needs this.
I am sorry to bring up this old topic, but I have struggled a lot with PyMuPDF's Matrix concept. I am currently writing a pipeline that converts PDFs to images and does some additional steps such as resizing and cleanup. I found the documentation confusing, so here is my attempt to clear things up:

There are two parts to the Matrix: A 2 x 2 Matrix consisting of Neither of the parameters can do anything on its own (except for a few special cases, see below). The second part of the matrix are Once you keep in mind that you only need to worry about

Now, to answer @manipes-ag's question:

Simple case

If you know there is only rotation applied, it is fairly easy to determine the angle Theta of the rotation in radians: Converting radians to degrees is trivial.

More complicated case

If, however, you happen to know that a rotation and a uniform scale have been applied, things get a tad more tricky, as you need to determine the scale factor You can then divide the matrix by the scale factor and treat the resulting matrix as a simple case (see above).

General case

If you do not know which transformations (scale, rotate, skew) apply, it can still be done, but it does get rather involved. Since I am not fluent at writing LaTeX math on GitHub, I'll spare you my crude attempts at an explanation and instead just paste some Python code, which handles this case (and a few numerical edge cases) quite beautifully and is probably what you are looking for anyway. (Disclaimer: I did not write this code, I merely dumped the formulas into a prompt and asked gpt-oss:120b to do it for me.)

    import numpy as np

    def angle_from_matrix(M, assume_pure_rotation=False):
        """
        Return the rotation angle (in radians) hidden in a 2×2 matrix M.

        Parameters
        ----------
        M : (2, 2) array_like
            The transformation matrix.
        assume_pure_rotation : bool, optional
            If True, we skip the polar decomposition and just use atan2.
            Use only when you know M is a rotation (or rotation·scale).

        Returns
        -------
        theta : float
            Angle in radians, in [-π, π].
        """
        M = np.asarray(M, dtype=float)

        if assume_pure_rotation:
            # pure rotation, possibly scaled uniformly
            scale = np.linalg.norm(M[:, 0])  # length of first column
            if scale == 0:
                raise ValueError("Zero column – not a rotation.")
            R = M / scale
            return np.arctan2(R[1, 0], R[0, 0])

        # ---------- General case: polar decomposition ----------
        # 1. C = M.T @ M (symmetric; positive definite when M is invertible)
        C = M.T @ M

        # 2. Compute C^{-1/2} analytically (2×2 case)
        a, b = C[0, 0], C[0, 1]
        d = C[1, 1]
        tr = a + d
        det = a * d - b * b  # equals det(M)**2, so it is zero for a singular M
        if det <= 0:
            raise ValueError("Singular matrix: rotation angle is undefined.")
        sqrt_det = np.sqrt(det)

        # C^{1/2} = (C + sqrt(det) * I) / sqrt(tr + 2*sqrt(det))
        denom = np.sqrt(tr + 2 * sqrt_det)
        C_sqrt = (C + sqrt_det * np.eye(2)) / denom

        # Inverse square root
        C_inv_sqrt = np.linalg.inv(C_sqrt)

        # 3. Orthogonal factor R = M @ C^{-1/2}
        R = M @ C_inv_sqrt

        # Optional: force orthogonality (tiny numerical errors)
        # R = R / np.linalg.norm(R, axis=0)

        # 4. Extract angle; handle possible reflection
        if np.linalg.det(R) < 0:
            # reflection present – flip sign of one column to get proper rotation
            R[:, 0] = -R[:, 0]

        theta = np.arctan2(R[1, 0], R[0, 0])
        return theta
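
For the rotation plus uniform scale case described above, a shorter sketch without NumPy may be enough. It is an illustration only and rests on an assumption: the four numbers `(a, b, c, d)` are taken in the layout that `fitz.Matrix(90)` prints earlier in the thread, so that a rotation by theta scaled by `s` reads `s * (cos, sin, -sin, cos)`. The name `scale_and_angle` and the tolerance are hypothetical.

```python
import math


def scale_and_angle(a, b, c, d, tol=1e-6):
    """Return (scale, angle_in_degrees) for a matrix s * (cos, sin, -sin, cos).

    Raises ValueError if the matrix is not a rotation combined with a
    uniform scale (within a relative tolerance).
    """
    s = math.hypot(a, b)  # scale factor = length of the first row
    if s == 0:
        raise ValueError("zero first row, no rotation to recover")
    # Rotation times uniform scale requires d == a and c == -b.
    if abs(a - d) > tol * s or abs(b + c) > tol * s:
        raise ValueError("not a rotation combined with a uniform scale")
    return s, math.degrees(math.atan2(b, a))


# The product fitz.Matrix(90) * 0.4 from earlier in the thread.
print(scale_and_angle(0.0, 0.4, -0.4, 0.0))
```

Note that `angle_from_matrix` above reads the sine from `M[1, 0]`. If it is fed `[[a, b], [c, d]]` in the layout assumed here, the angle comes out with the opposite sign, so the matrix would need to be transposed first for the two results to agree.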
You can determine the bbox and transformation for each image on the page:
page.get_image_rects(5, transform=True). How to interpret the returned matrix is explained here.
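
Applied to every image of a document rather than only xref 5, the call above might be used as in the following sketch. It assumes a PyMuPDF version in which `Page.get_image_rects` accepts `transform=True` and then returns `(bbox, matrix)` pairs. The helper names `image_placements` and `describe` are invented for this example, and `fitz` is imported inside the function so that the formatting helper can be tried without PyMuPDF installed.

```python
import sys


def image_placements(pdf_path):
    """Yield (page_number, xref, bbox, matrix) for each image placement."""
    import fitz  # PyMuPDF, imported lazily

    doc = fitz.open(pdf_path)
    try:
        for page in doc:
            for img in page.get_images(full=True):
                xref = img[0]  # first item of each entry is the image xref
                # One (bbox, matrix) pair per occurrence of the image on the page.
                for bbox, matrix in page.get_image_rects(xref, transform=True):
                    yield page.number, xref, tuple(bbox), tuple(matrix)
    finally:
        doc.close()


def describe(page_number, xref, bbox, matrix):
    """Format one placement as a single line of text."""
    a, b, c, d, e, f = matrix
    return (
        f"page {page_number}, xref {xref}: bbox={bbox}, "
        f"matrix=({a:g}, {b:g}, {c:g}, {d:g}, {e:g}, {f:g})"
    )


if __name__ == "__main__" and len(sys.argv) > 1:
    for placement in image_placements(sys.argv[1]):
        print(describe(*placement))
```

Run with a PDF path as the only argument, this prints one line per image placement. The matrix values can then be interpreted as discussed in the rest of the thread.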