Ok, this definitely helps in understanding how PDF works. However, I really do edit PDFs regularly and have no problems with the edited ones. As I already mentioned ITT, PDF-XChange lets me do so many things that listing them would sound like an advertisement. Editing the existing text tends to mess it up, that's true, but it's not crucial for me and all sorts of other actions work almost perfectly.
You're imagining some very ideal circumstances for working with PDFs that have nothing to do with my own needs, so I can't really make use of your advice. :/
in what circumstance does pdf editing come up regularly?
Banking is very PDF heavy, and many of these PDFs have a ton of logic baked into them. Some of the loan documents do literally all of the math for you, so the loan officer just inputs the amount, term and APR and the PDF outputs a fully-filled loan document. It's pretty magical to see until you peek under the hood at the code and oh-my-god-what-the-hell-how-did-this-ever-work-in-the-first-place-this-must-be-purgatory
yeah fun fact that's usually an embedded javascript runtime
yet another reason for it to die in a fire
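For anyone curious what "all of the math" in those loan documents roughly amounts to: a minimal sketch of the standard fixed-payment amortization formula, written in Python rather than the JavaScript a PDF form would actually embed. The function name and parameters are made up for illustration, and real bank forms presumably add fees, rounding rules and so on that aren't shown here.

```python
def monthly_payment(principal: float, apr_percent: float, term_months: int) -> float:
    """Fixed monthly payment for a fully amortizing loan (hypothetical helper)."""
    if term_months <= 0:
        raise ValueError("term_months must be positive")
    # Assumption: APR is a nominal annual rate, compounded monthly.
    monthly_rate = apr_percent / 100 / 12
    if monthly_rate == 0:
        # Zero-interest loan: just split the principal evenly.
        return principal / term_months
    growth = (1 + monthly_rate) ** term_months
    return principal * monthly_rate * growth / (growth - 1)


if __name__ == "__main__":
    # The three inputs the loan officer types in: amount, APR, term.
    print(round(monthly_payment(10_000, 6.0, 12), 2))
```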
I frequently download book and journal article PDFs, scan books myself, and upload them online. And ofc read them.
Editing the PDFs in my case includes e.g. adding the outline/bookmarks that allow for easier navigation, adding OCR, cropping, splitting and rearranging the pages when the scanned images aren't ideal, removing watermarks...
that sounds like actual typesetting work! i'm very surprised that you don't get access to the source. usually when uploading to a journal they want the latex source.
I'm not uploading to a journal. I upload stuff e.g. to Internet Archive. When I download stuff from various databases (journals, academic repositories, Google Books), it ranges from recent publications to stuff from several centuries ago, in which case a scan is all you can get.
so in that case i'm guessing it's mostly just pdfs as containers for a series of images. that's frustrating. there should really be a better format for that kind of thing. cbz is the simplest i can think of but that doesn't really allow the same amount of metadata.
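For reference, a CBZ really is just a ZIP archive of page images that readers display in filename order. A minimal sketch using only Python's standard `zipfile` module; `build_cbz` and the zero-padded naming scheme are illustrative choices, not part of any spec, and the sketch does nothing about the metadata gap mentioned above.

```python
import zipfile
from pathlib import Path


def build_cbz(image_paths, cbz_path):
    """Pack page images into a CBZ, keeping the order they were passed in."""
    # ZIP_STORED (no compression): scanned JPEG/PNG pages are already compressed.
    with zipfile.ZipFile(cbz_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for index, image in enumerate(image_paths, start=1):
            image = Path(image)
            # Zero-padded names so readers that sort by filename keep page order.
            archive.write(image, arcname=f"{index:04d}{image.suffix.lower()}")
    return cbz_path
```

Something like `build_cbz(sorted(Path("scans").glob("*.jpg")), "book.cbz")` would pack a folder of scans, assuming a `scans` directory exists.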