AcroForm Flatten Fields #83

Open
corymickelson opened this issue Jan 31, 2019 · 6 comments
Assignees
Labels
feature new feature

Comments

@corymickelson
Owner

Add the ability for a user to invoke a single method, Form.flatten(), to flatten all fields in a form's fields array.

@corymickelson corymickelson self-assigned this Jan 31, 2019
@corymickelson corymickelson added the feature new feature label Jan 31, 2019
@MatthewMarkgraaff

Hey @corymickelson
Do you have any update on this feature? I'd be really happy to contribute if you could point me in the right direction.

@corymickelson
Owner Author

corymickelson commented Oct 12, 2019 via email

@MatthewMarkgraaff

Hey Cory
Great, look forward to your response

@corymickelson
Owner Author

@MatthewMarkgraaff Thanks again. This is a somewhat larger discussion, but hopefully the following will help get you started. First, let's make sure we understand what flattening means for a PDF.
When we flatten a PDF document, we copy the appearance stream of each AcroForm field on a page into a form XObject (see the XObject definition below). Once the appearance streams of all fields of interest have been written to an XObject, the next steps are to paint the XObject onto the PDF page and delete the fields from the AcroForm dictionary. The effect is to add the visual representation of the form fields to the page while removing the fields' interactive properties.

A form XObject is a PDF content stream that is a self-contained description of any sequence of graphics objects (including path objects, text objects, and sampled images). A form XObject may be painted multiple times—either on several pages or at several locations on the same page—and produces the same results each time, subject only to the graphics state at the time it is invoked. Not only is this shared definition economical to represent in the PDF file, but under suitable circumstances the PDF consumer application can optimize execution by caching the results of rendering the form XObject for repeated reuse.
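To make the copy, paint, and delete steps concrete, here is a minimal sketch over a toy in-memory model. The `Field`, `Page`, and `flattenPage` names are hypothetical and are not part of this library or of PDFium. The sketch also simplifies by creating one XObject per field and by assuming each appearance stream's bounding box starts at the origin, so a plain translation to the field's rectangle is enough to place it.

```typescript
// Toy in-memory model for illustration only; NOT the library's real API.
interface Rect { x: number; y: number; width: number; height: number }

interface Field {
  name: string;
  rect: Rect;                 // where the field's widget sits on the page
  appearanceStream?: string;  // simplified stand-in for the AP stream contents
}

interface Page {
  content: string[];              // page content stream, one operator sequence per entry
  xobjects: Map<string, string>;  // resource name -> form XObject content
  fields: Field[];                // fields whose widgets are on this page
}

interface FlattenResult { flattened: string[]; skipped: string[] }

function flattenPage(page: Page): FlattenResult {
  const result: FlattenResult = { flattened: [], skipped: [] };
  const remaining: Field[] = [];
  let counter = 0;

  for (const field of page.fields) {
    if (field.appearanceStream === undefined) {
      // Nothing to copy: leave the field interactive and report it.
      remaining.push(field);
      result.skipped.push(field.name);
      continue;
    }

    // 1. Copy the appearance stream into a form XObject under an unused name.
    while (page.xobjects.has(`Flat${counter}`)) counter++;
    const resourceName = `Flat${counter}`;
    page.xobjects.set(resourceName, field.appearanceStream);

    // 2. Paint the XObject: save state, translate to the field rectangle,
    //    invoke the XObject with Do, restore state.
    const { x, y } = field.rect;
    page.content.push(`q 1 0 0 1 ${x} ${y} cm /${resourceName} Do Q`);

    result.flattened.push(field.name);
  }

  // 3. Delete the flattened fields; only the skipped ones remain.
  page.fields = remaining;
  return result;
}
```

Fields without an appearance stream are deliberately left in place rather than dropped, so nothing a user entered disappears silently.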

To begin this process we must first ensure that each AcroForm field has an appearance stream (the field dictionary key is AP) as well as a value and/or default value (keyed as V and DV). This presents its own set of complications, because form fields are not required to store an AP stream: a form can set the NeedAppearances flag on the AcroForm dictionary and leave it to the viewer to build each appearance from the default appearance string (keyed as DA on the field or on the AcroForm dictionary), for example 12pt black Arial. While inconvenient, this is not a difficult issue to resolve; we simply need to be aware of it so we know where to look when the AP entry is missing. The next step is to store each appearance stream efficiently in an XObject, which is subsequently applied to the PDF page.
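The lookup order described above can be sketched as a small decision function. This is only an illustration under assumptions: the dictionary shapes are simplified to plain objects with string values, and `planField` and the `Plan` type are hypothetical names, not part of this library.

```typescript
// Simplified dictionary shapes for illustration; NOT the library's real API.
interface AcroFormDict {
  NeedAppearances?: boolean;  // viewers are expected to build missing appearances
  DA?: string;                // form-wide default appearance string
}

interface FieldDict {
  T: string;    // partial field name
  AP?: string;  // appearance stream contents (simplified to a string)
  V?: string;   // current value
  DV?: string;  // default value
  DA?: string;  // field-level default appearance string
}

type Plan =
  | { kind: "copy"; name: string; stream: string }
  | { kind: "generate"; name: string; text: string; da: string }
  | { kind: "unresolved"; name: string; reason: string };

function planField(field: FieldDict, form: AcroFormDict): Plan {
  // An existing appearance stream can be copied as-is.
  if (field.AP !== undefined) {
    return { kind: "copy", name: field.T, stream: field.AP };
  }

  // No AP: a flattener cannot rely on the viewer (NeedAppearances), so it
  // has to build the appearance itself from the value and the DA string.
  const text = field.V ?? field.DV;   // the value wins over the default value
  const da = field.DA ?? form.DA;     // the field-level DA wins over the form-level DA

  if (text === undefined) {
    return { kind: "unresolved", name: field.T, reason: "no V or DV" };
  }
  if (da === undefined) {
    return { kind: "unresolved", name: field.T, reason: "no DA on field or form" };
  }
  return { kind: "generate", name: field.T, text, da };
}
```

For example, a field with only a DV entry on a form whose DA is `/Helv 12 Tf 0 g` would come back as a `generate` plan carrying that text and appearance string.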

To begin, I suggest you look over the sdk/FlattenFields class and pull the C++ PDFium source code. The FlattenFields class, given a page, iterates over the fields on the page, creates an XObject to accommodate these fields, writes the XObject to the page, and deletes the fields from the AcroForm dictionary. This class currently only works for text form fields that contain field-level AP, D, and DV properties. This work is in the very early stages and is based on the flatten functionality of PDFium. There is very little documentation on how to flatten PDFs programmatically, so the knowledge you will need to contribute to this work will come largely from reading the source code of programs that provide this functionality. I realize this is a lot of work, and I very much appreciate any contributions to this effort. Please feel free to reach out with any questions; I will do my best to answer and guide you in your efforts to expand the functionality of this library.

@MatthewMarkgraaff

Hey @corymickelson

Couldn't ask for a better starting point, thank you for the detailed intro.
Conceptually, I get it. I'll set aside some time this weekend to get started and will keep you posted here.

@florianbepunkt

@MatthewMarkgraaff Have you been able to implement this?


3 participants