Is there a way to maintain the same header and footer of the respective docs while merging? #67

Open
ghost opened this issue May 7, 2021 · 9 comments

@ghost

ghost commented May 7, 2021

I want to merge multiple files, but the package takes the header and footer of the first file and uses them in place of the headers and footers of all the other files.

@mxblsdl

mxblsdl commented Jun 30, 2022

I am also facing the same issue. Has there been any movement on this?

@njohner
Contributor

njohner commented Jul 4, 2022

We usually have a master template and merge other files into it, so we actually want to keep the footer and header from the first file. So we do not plan on implementing a different behaviour.

Also this would be tricky overall, as several files that get merged can land on the same page for example. What footer and header would you want on that page in that case? There are similar issues with other section properties, such as page size, orientation, etc.

@Abd-Allah-144

Abd-Allah-144 commented Jul 4, 2022

[edited]
Basically, you need to understand sections first; search for "python docx sections".
Then continue with the python-docx documentation on headers and footers:
https://python-docx.readthedocs.io/en/latest/user/hdrftr.html
Then see #82.

@zachrank

zachrank commented Jul 4, 2022

> We usually have a master template and merge other files into it, so we actually want to keep the footer and header from the first file. So we do not plan on implementing a different behaviour.

This library does this very well, but I think the documentation/description of the library should be updated to state this. Right now, the description and README lead one to think that Word documents are concatenated while preserving their headers, footers, etc.

I found this library because I was looking for a library that could combine multiple docx files together but I was expecting to be able to preserve sections/headers/footers/etc. One use-case that I have is the need to concatenate a landscape-oriented document in between two portrait-oriented documents. Having the option to control whether the header/footer/styles/etc are combined or preserved would be a useful addition in my opinion.

@njohner
Contributor

njohner commented Jul 4, 2022

I understand that the current behaviour can be unsatisfactory, but it's almost hopeless to cover all cases correctly, the Word format being what it is... I'll look at updating the documentation though.

And as a sidenote, I think combining a document with a section in landscape orientation should actually work, provided it is not the only section in that document.

@njohner
Contributor

njohner commented Jul 4, 2022

@Abd-Allah-144 I know that it is solvable, but right now, the less we fiddle with inserting tags, moving tags and deleting tags, the better off we are. We use this in production software and can't really afford to break things or change the behaviour.

Of course if you feel like trying to solve some of the issues, feel free to make a PR.

@Abd-Allah-144

Abd-Allah-144 commented Jul 6, 2022

> And as a sidenote, I think combining a document with a section in landscape orientation should actually work, provided it is not the only section in that document.

[landscape/portrait problem]

About the side note:
To add a landscape document (doc2) to a portrait document (master) when each file has only one section:

1- Open the master file.

2- In LibreOffice, go to Insert > More Breaks > Manual Break, then choose: Page break, Style: Landscape, and leave the "Change page number" checkbox unchecked.
(I could not do this step in Python; if you can, please tell me.)

3- Add this code before the merging code:
# Link the header and footer of the last section to the previous section
last_section = master.sections[-1]
header = last_section.header
footer = last_section.footer
header.is_linked_to_previous = True
footer.is_linked_to_previous = True

4- Write the composing code without adding any additional page breaks.

Note: I tried creating an empty file with two sections (portrait, then landscape) and adding it between the master and doc2, so that the break would not have to be inserted by hand. It did NOT work. I tried again with three sections in the empty file and got the same problem.
This is because the second document follows the last section of the master. I am not sure what to do. I hope you can fix it (I am not qualified to make a pull request).

NOTE2: if you have a complicated document, you may also need this:
doc.sections[0].different_first_page_header_footer = True  # or False

@user-tq

user-tq commented May 16, 2024

In case someone needs to combine documents into three parts like I did (cover and table of contents, body, back cover):
In the final result, the cover with the table of contents and the back cover are independent sections without a header or footer.

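from docx import Document
from docx.enum.section import WD_SECTION
from docxcompose.composer import Composer


def merge_doc(source_file_path_list: list, target_file_path: str, special_last=True):
    # A document containing only a page break, appended between files.
    page_break_doc = Document()
    page_break_doc.add_page_break()
    # The first file is the master: its headers and footers are the ones kept.
    master_doc = Document(source_file_path_list[0])
    target_composer = Composer(master_doc)
    last_index = len(source_file_path_list) - 1
    for i, f in enumerate(source_file_path_list[1:], start=1):
        if special_last and i == last_index:
            # Put the last file in its own section and unlink its header and
            # footer so that it does not inherit them from the body section.
            new_section = target_composer.doc.add_section(WD_SECTION.NEW_PAGE)
            new_section.header.is_linked_to_previous = False
            new_section.footer.is_linked_to_previous = False
        else:
            target_composer.append(page_break_doc)
        target_composer.append(Document(f))
    target_composer.save(target_file_path)


# File names mean: "1-cover, table of contents and part of the body",
# "2-rest of the body", "3-back cover (no header)".
merge_doc(["1-封面目录与一部分正文.docx", "2-其它正文.docx", "3-封底noheader.docx"], "testout.docx")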

The most important thing is that the first (master) document must have two sections: the first section has an empty header (for the cover and table of contents), and the second section has the header used for the body.

1-封面目录基本信息.docx (cover, table of contents and basic information)
2-正文.docx (body)
3-封底noheader.docx (back cover, no header)

@user-tq

user-tq commented May 16, 2024

I tried to understand how docxcompose handles multiple sections by reading the source code, and wrote a blog post about it. It is written in Chinese, but I hope it is helpful to those who are interested in this issue.
https://zhuanlan.zhihu.com/p/697998482
