Edit: Changed subject, was "Batch converting Adobe files to Open Office". Make your post understandable by others. -- MrProgrammer, forum moderator
Batch convert PDF to Writer
How can I batch convert Adobe files to Open Office 4.1?
Last edited by MrProgrammer on Mon Dec 25, 2023 6:24 pm, edited 2 times in total.
Reason: Edited topic's subject
Open Office 4.1.5 Windows 10
Re: Batch converting Adobe files to Open Office
What type of Adobe files?
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
Re: Batch converting Adobe files to Open Office
I think they are Adobe Acrobat files.
Open Office 4.1.5 Windows 10
Re: Batch converting Adobe files to Open Office
Just ordinary PDF files
Open Office 4.1.5 Windows 10
- Hagar Delest
- Moderator
- Posts: 32857
- Joined: Sun Oct 07, 2007 9:07 pm
- Location: France
Re: Batch converting Adobe files to Open Office
AOO has limited capability for that. I don't remember whether an extension is still needed. LibreOffice includes this kind of extension and can import PDF documents, but page by page IIRC. The result may also be weird because the import won't recognize paragraphs, just chunks of text, especially if the pages contain objects such as pictures and captions.
You may be quicker to redo the document by copy-paste.
LibreOffice 24.8 on Xubuntu 24.10 and 24.8 portable on Windows 10
Re: Batch converting Adobe files to Open Office
I normally put PDF files through an OCR application (usually gimagereader driving Tesseract, running on Linux Xubuntu) and reformat completely, but in my case such PDFs are plain text without illustrations or tables.
There is at least one Windows application that will attempt to preserve the original format, but I've forgotten its name as I don't use Windows. An OCR application that produces hOCR output may give a reasonable XML coded output that preserves the layout. I have no experience with hOCR output from PDF.
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
Re: Batch converting Adobe files to Open Office
I found this online:
There is a means to convert PDF files to Word files. You can then save them as ODT, though if there is special formatting in Word, the conversion may not be exact.
Foxit has a PDF Editor, as well as a free Reader version. The Editor is often available for a free trial period after you download the free Reader. A line from their web page describes the process of PDF to Word conversion:
1. Open the pdf file with Foxit PDF Editor, go to Convert tab>To MS office> Word or File tab>Export>To MS Office>Word>Save As, Save As window will pop up.
I do not have the Editor version, so I don't know whether it will also convert directly to ODT.
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
- MrProgrammer
- Moderator
- Posts: 5100
- Joined: Fri Jun 04, 2010 7:57 pm
- Location: Wisconsin, USA
Re: Batch convert PDF to Writer
You can't. OpenOffice does not provide that feature.
Portable Document Format (PDF) is intended to be a final format, suitable only for viewing or printing, though it is portable and can be reliably copied to other systems for viewing or printing. Attempts to convert PDF into some other document type (text, spreadsheet, presentation, etc.) are blocked because the information necessary to do that is not present in the PDF.
If this solved your problem, please go to your first post, use the Edit button, and add [Solved] to the start of the Subject field. Select the green checkmark icon at the same time.
Mr. Programmer
AOO 4.1.7 Build 9800, MacOS 13.7, iMac Intel. The locale for any menus or Calc formulas in my posts is English (USA).
Re: Batch convert PDF to Writer
Not Writer, but Draw!
Look for "PDF Import Extension for Apache OpenOffice 0.1.1". It imports drawings (even quite complicated files), text in PDF files, etc., so that you can adjust them as needed.
Combined with macros (ooRexx) or used manually, it is very handy!
You can create a macro that opens PDF files and saves them as .odg.
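A rough sketch of the batch idea, written in Python rather than ooRexx, and driving the command-line converter instead of a macro. It assumes a LibreOffice-style `soffice` executable on the PATH, because `--headless`, `--infilter`, `--convert-to` and `--outdir` are LibreOffice command-line options; whether AOO 4.1 accepts them is not confirmed here. The filter name `draw_pdf_import` and the helper names `build_convert_command`, `find_pdfs` and `batch_convert` are assumptions made for illustration.

```python
from pathlib import Path
import subprocess


def build_convert_command(pdf_path, out_dir, soffice="soffice"):
    """Return the command line that asks soffice to save one PDF as .odg."""
    return [
        soffice,                        # assumption: LibreOffice-style executable
        "--headless",                   # run without opening a window
        "--infilter=draw_pdf_import",   # assumption: Draw's PDF import filter name
        "--convert-to", "odg",
        "--outdir", str(out_dir),
        str(pdf_path),
    ]


def find_pdfs(in_dir):
    """Return the PDF files directly inside in_dir, sorted by name."""
    return sorted(
        p for p in Path(in_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def batch_convert(in_dir, out_dir, soffice="soffice"):
    """Convert every PDF in in_dir, one soffice call per file."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for pdf in find_pdfs(in_dir):
        # check=True stops the batch at the first file that fails to convert
        subprocess.run(build_convert_command(pdf, out_dir, soffice), check=True)
```

With hypothetical folders, `batch_convert("C:/pdf_in", "C:/odg_out")` would convert everything in the first folder. On Windows you would probably have to pass the full path to the executable as `soffice=...`. The imported drawings still need manual cleanup afterwards.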
Apache OpenOffice 4.1.13 on ArcaOS 5.0.7
Re: Batch convert PDF to Writer
Be aware that the PDF Import Extension is suitable only for minor cosmetic changes to PDF files, and may also only handle smaller PDF files.
MrProgrammer has published a Perl script that extracts the text from PDF files which have such text embedded in them (not necessarily _all_ PDF files do).
viewtopic.php?p=410366#p410366
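For comparison, and not as a copy of MrProgrammer's script, here is a minimal Python sketch of the same kind of text extraction applied to a whole folder. It assumes the `pdftotext` command-line tool from Poppler (or Xpdf) is installed and on the PATH; the helper names `text_path_for`, `build_pdftotext_command` and `extract_all` are made up for this example. Like the Perl script, it only gets text out of PDFs that actually contain embedded text.

```python
from pathlib import Path
import subprocess


def text_path_for(pdf_path, out_dir):
    """Return the .txt path in out_dir that corresponds to pdf_path."""
    return Path(out_dir) / (Path(pdf_path).stem + ".txt")


def build_pdftotext_command(pdf_path, out_dir, pdftotext="pdftotext"):
    """Return the pdftotext command line for one PDF file."""
    return [
        pdftotext,      # assumption: Poppler/Xpdf tool available on PATH
        "-layout",      # try to keep the physical layout of the page
        str(pdf_path),
        str(text_path_for(pdf_path, out_dir)),
    ]


def extract_all(in_dir, out_dir):
    """Write one .txt file per PDF found directly inside in_dir."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for pdf in sorted(Path(in_dir).glob("*.pdf")):
        subprocess.run(build_pdftotext_command(pdf, out_dir), check=True)
```

The resulting text files can be opened in Writer, but all formatting has to be redone by hand.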
Apache OpenOffice 4.1.15 on Xubuntu 22.04.4 LTS
Re: Batch convert PDF to Writer
See also:
"Consolidate Text" in
https://wiki.documentfoundation.org/Rel ... ess_&_Draw ,
the enhancement discussion under
https://bugs.documentfoundation.org/sho ... ?id=118370
and the much older suggestions posted to
https://ask.libreoffice.org/t/pdf-to-dr ... iter/13805
Among these suggestions was a workaround of my own, which I sketched on a whim with no intention of using it myself. I never tried to build anything like a "batch conversion" on it. (I only mention this old post because, judging from the upvotes, some users apparently found it useful.)
Anyway, we should see clearly that none of this can accomplish impossible tasks. We cannot convert a PDF file into a Writer file because the PDF lacks much of the information such a process would need. Strictly speaking, we cannot convert a Writer file to PDF either. We can only export the Writer document to a file that tells a printer what to put on paper. That is essentially what PDF is made for. If you want truly editable PDF files you need a PDF editor (like Acrobat), and you must accept the shortcomings of that approach.
In short: we can convert water to ice and back. We cannot convert iron to gold. And a printer doesn't "convert" a PDF file to printed paper. It just prints. What you may get back from the print using "OCR" isn't a converted file either.
Also: Don't wait for AOO to implement a feature like "Consolidate Text".
On Windows 10: LibreOffice 24.8.3 and older versions, PortableOpenOffice 4.1.7 and older, StarOffice 5.2
---
Lupp from München