Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove puppeteer from dependencies #256

Open
flamewow opened this issue Sep 8, 2021 · 31 comments
Open

remove puppeteer from dependencies #256

flamewow opened this issue Sep 8, 2021 · 31 comments
Labels
enhancement New feature or request

Comments

@flamewow
Copy link

flamewow commented Sep 8, 2021

Reason/Context

Please try answering few of those questions

  • Why we need this improvement?
    puppeteer is used only for PDF generation while being quite heavy (~200Mb)
  • How will this change help?
    download time will be increased drastically
  • What is the motivation?
    PDF generation is not used very often and for users that are not using it at all 200Mb overhead is way too much

Description

Please try answering few of those questions

  • What changes have to be introduced?
    remove puppeteer from dependencies
  • Will this be a breaking change?
    yes
  • How could it be implemented/designed?
    change PDF generation function to accept puppeteer instance as a parameter. In that case, users that need PDF will have to install puppeteer separately and pass it as an argument
@flamewow flamewow added the enhancement New feature or request label Sep 8, 2021
@github-actions
Copy link

github-actions bot commented Sep 8, 2021

Welcome to AsyncAPI. Thanks a lot for reporting your first issue. Please check out our contributors guide and the instructions about a basic recommended setup useful for opening a pull request.

Keep in mind there are also other channels you can use to interact with AsyncAPI community. For more details check out this issue.

@magicmatatjahu
Copy link
Member

magicmatatjahu commented Sep 9, 2021

@flamewow Hi! Thanks for the issue. Usually if you install a template once then the puppeter should be cached and reused in the next render. The problem may be the CI environment, where the puppeter will be installed with every job and I agree this can be frustrating.

I can't understand how do you want pass puppeteer instance as a parameter. Using Generator API it can be easy to do, but how do you want pass reference to the instance by the CLI? I only see solution that puppeter should be installed separately - before the template - globally and then we should use this global dep in the hook rather local instance.

EDIT: Did you try our tip under the list of parameters?

NOTE: If you only generate an HTML website, set the environment variable PUPPETEER_SKIP_CHROMIUM_DOWNLOAD to true and the generator will skip downloading chromium.

@derberg
Copy link
Member

derberg commented Sep 9, 2021

yeah, it is not really puppeteer that is heavy here but chromium and you can skip it as @magicmatatjahu mentioned.

PDF generation is not used very often

unfortunately, we do not know that really

if using PUPPETEER_SKIP_CHROMIUM_DOWNLOAD is not something that satisfies you, the only good solution would be to completely remove PDF support from this template really and create a separate template like, pdf-template. Positive side effect would be better visibility of this option, so users would see they can generate markdown, HTML and PDF with 2 different templates. Thoughts? Of course we would need help here

@flamewow
Copy link
Author

@derberg creating a separate pdf-template package (will depend on html-template I would guess) sounds like a great option.
Ofc that's additional overhead for package management (and that's on you guys, to keep it under @asyncapi namespace in npm registry). Will be able to help with the code

@derberg
Copy link
Member

derberg commented Sep 13, 2021

Ofc that's additional overhead for package management (and that's on you guys, to keep it under @asyncapi namespace in npm registry)

this part is easy, we have releases and deps update inside asyncapi org fully automated.

will depend on html-template I would guess

definitely. I haven't seen the code that generated pdf for some time but should be easy to port to new template and use html-template as dependency

Will be able to help with the code

would you be able to handle this with just our guidance, and best to become CODEOWNER of the template?

@flamewow
Copy link
Author

@derberg
How would we approach that?

would you be able to handle this with just our guidance, and best to become CODEOWNER of the template?

in both cases, i.e. just contributor and codeowner

@derberg
Copy link
Member

derberg commented Sep 16, 2021

so it would look like this:

  • we create a new repo
  • you create PR to remove puppeteer from html-template
  • there also needs to be PR opened to Generator to remove some stuff from dockerfile https://github.com/asyncapi/generator/blob/master/Dockerfile#L5-L10
  • and yeah, a PR to new repo where you create a template for pdf generation basically
  • in PR to new repo you also create a CODEOWNER file where you add your handle, and one of the reviewers so you are not the only maintainer there. I don't think there will be much maintenance needed with pdf template as it will just consume whatever is produces in HTML template, not much code I believe

Then, as a result, according to our open governance model you can (you do not have to) also become a member of Technical Steering Committee. You should definitely look at this for more details.

Of course, I can guide you through all the steps, explain CI and all that stuff around.

I see you are working on https://github.com/flamewow/nestjs-asyncapi so might be good for you to engage with us more as TSC member and consider moving the nestjs project here, under AsyncAPI organization too as it would help with recognition.

@flamewow
Copy link
Author

Sounds good, let me start with "create a template for pdf generation",
once it works I'll message you (let me know how) so we can continue with next steps:

  • create a new repo for pdf-template
  • PR to remove puppeteer from html-template
    ..

@derberg
Copy link
Member

derberg commented Sep 27, 2021

@flamewow you can tag me here or just join our Slack and ping me there in #tooling channel. Also, next week because of the hackathon we organize and also hacktoberfest I will be live-streaming a Contributor first meeting, every Wednesday, throughout the entire October. Feel free to join, on any platform where the stream goes or live on zoom in case you need some explanation/guidance. Here is a link to the AsyncAPI calendar, I will update the invite this week with all the links to all the streaming platforms

@arjangeertsema
Copy link

This has become even more relevant: puppeteer/puppeteer#7921

@duxi90
Copy link

duxi90 commented Feb 10, 2022

There is also this issue to take into consideration:
puppeteer/puppeteer#6622 (comment)

@magicmatatjahu
Copy link
Member

@arjangeertsema @duxi90 Thanks for raise that issues! We have to do something about these problems, but you can still use html templates without puppeter via flag:

PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true ag ... @asyncapi/html-template ...

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity 😴

It will be closed in 120 days if no further activity occurs. To unstale this issue, add a comment with a detailed explanation.

There can be many reasons why some specific issue has no activity. The most probable cause is lack of time, not lack of interest. AsyncAPI Initiative is a Linux Foundation project not owned by a single for-profit company. It is a community-driven initiative ruled under open governance model.

Let us figure out together how to push this issue forward. Connect with us through one of many communication channels we established here.

Thank you for your patience ❤️

@github-actions github-actions bot added the stale label Jun 11, 2022
@pratik9315
Copy link

I have a question that might sound a bit silly😅
When we talk about creating a new repo, how do we actually link it to the organization repo. I mean if I start creating a new repo for html-template, it will be under my username/repo, from there on how do submit a PR to the asyncapi organization? Or are we talking about forking the current repo, removing all the puppeteer related things from it and then integrating an alternative to puppeteer🤔 Or theres some concept I am missing and its something else entirely😅

@arjangeertsema
Copy link

@pratik2315 I am not part of the asyncapi organisation, but this is how I did it with other organizations.

You can create a new project and when finished you can transfer the ownership the asyncapi organisation with some help from the asyncapi organisation.

After that you can create a PR which replaces all your transferred code from the asyncapi repo with references to your repository. Key is that the timeframe for this is limited because you need to keep changes to the repositories in sync during this period of time.

Good luck and thank you in advance!

@github-actions github-actions bot removed the stale label Jun 22, 2022
@magicmatatjahu
Copy link
Member

@pratik2315 You can transfer your repo to our organization and github itself will take care of redirects etc. The only problem with fork is that all git history will still be visible from html-template, but this can be fixed - just delete all history related to html-template and start history from your commits. As for the task itself, I would prefer to do it as mentioned in issue's description:

change PDF generation function to accept puppeteer instance as a parameter. In that case, users that need PDF will have to install puppeteer separately and pass it as an argument

I know it will decrease UX, but creating pdf template will be a serious problem, because while generating pdf we will have to generate html-templates and then operate on it. It can be done, the question is whether it makes sense.

If you wanna handle that, go ahead :)

@derberg
Copy link
Member

derberg commented Jun 22, 2022

yo, basically do pratik2315/pdf-template and do it there, and once ready, I'll help you out going through donation process.

I know it will decrease UX, but creating pdf template will be a serious problem, because while generating pdf we will have to generate html-templates and then operate on it. It can be done, the question is whether it makes sense.

@magicmatatjahu generator is a lib, I don't see a problem why not use it in the template to generate HTML, then PDF out of it and then have a hook that will remove HTML files. It is like it is 🤷🏼

@pratik2315 I guess we need to stay with puppeteer as anyway alternative that I was thinking about, playwright, has the same approach and by default it also downloads browsers 😞

so in short (and do let me know if you need more details)

  • use react renderer,
  • in the index.js use @asyncapi/generator and @asyncapi/html-template to generate single index.html file
  • Then just use the existing PDF generation hook from this repo to generate PDF,
  • add another hook to remove generated HTML file as it is not needed there, it is basically a temporary file

that should do the trick

@y-nk
Copy link

y-nk commented Oct 3, 2022

Any update on this? We'd be interested too.

@magicmatatjahu
Copy link
Member

@y-nk Hi! No, we currently have no interested person to do it, and ourselves we have assigned it as future task - I don't have time for that at the moment.

@pratik9315
Copy link

Hey there! I am currently working on this issue!

@y-nk Hi! No, we currently have no interested person to do it, and ourselves we have assigned it as future task - I don't have time for that at the moment.

@derberg
Copy link
Member

derberg commented Dec 15, 2022

Some technical detail on my proposed technical solution for new pdf-template:

  • at the moment, pdf in html-template is generated in a hook, after HTML generation process -> https://github.com/asyncapi/html-template/blob/master/hooks/99_generatePdf.js
  • it is done like this because to generate PDF you first need to have a source to generate from, some HTML site
  • The hook code takes the generated HTML, opens it with a puppeteer virtual headless browser, and generates PDF from rendered HTML

Now, when we separate that, when we make pdf-template standalone, it means we need to somehow generate the HTML that will be a source for PDF generation.

I don't think there is some other solution than taking using in pdf-template generator as library, so basically use generator to generate HTML to generate PDF 😅

In this integration test you can see how Generator is used to generate html-template.

So in theory, you can make template files in pdf-template will trigger generation of HTML using html-template, and then on a hook level you will trigger again pdf generation


now that I'm thinking, maybe we do not need to use html-template and generator as library. In the end, we have a web component that will make generation much easier. In the fly you can generate index.html that uses web components, and passes AsyncAPI document to it. Then with puppeteer you open it up in headless browser and generate PDF 🤔

@magicmatatjahu that should work right?

@y-nk
Copy link

y-nk commented Dec 15, 2022

Good luck @pratik2315 . I will try to stay put for testing if you need a hand.

@magicmatatjahu
Copy link
Member

magicmatatjahu commented Dec 15, 2022

I wonder if, instead of a new template, should we install the puppeter on the fly in the appropriate hook when someone specifies the pdf parameter - check if the dependency exists, if not install it and use it then. WDYT? @derberg

now that I'm thinking, maybe we do not need to use html-template and generator as library. In the end, we have a web component that will make generation much easier. In the fly you can generate index.html that uses web components, and passes AsyncAPI document to it. Then with puppeteer you open it up in headless browser and generate PDF 🤔

Yeah, if we will go with separate template the webcomponent should be easier to use.

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity 😴

It will be closed in 120 days if no further activity occurs. To unstale this issue, add a comment with a detailed explanation.

There can be many reasons why some specific issue has no activity. The most probable cause is lack of time, not lack of interest. AsyncAPI Initiative is a Linux Foundation project not owned by a single for-profit company. It is a community-driven initiative ruled under open governance model.

Let us figure out together how to push this issue forward. Connect with us through one of many communication channels we established here.

Thank you for your patience ❤️

@github-actions github-actions bot added the stale label Apr 15, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 14, 2023
@derberg derberg removed the stale label Sep 14, 2023
@derberg
Copy link
Member

derberg commented Sep 14, 2023

I think it is still valid issue, but we need volunteers to help explore different options

@derberg derberg reopened this Sep 14, 2023
Copy link

This issue has been automatically marked as stale because it has not had recent activity 😴

It will be closed in 120 days if no further activity occurs. To unstale this issue, add a comment with a detailed explanation.

There can be many reasons why some specific issue has no activity. The most probable cause is lack of time, not lack of interest. AsyncAPI Initiative is a Linux Foundation project not owned by a single for-profit company. It is a community-driven initiative ruled under open governance model.

Let us figure out together how to push this issue forward. Connect with us through one of many communication channels we established here.

Thank you for your patience ❤️

@github-actions github-actions bot added the stale label Jan 13, 2024
@derberg
Copy link
Member

derberg commented Mar 26, 2024

anyone wants to work on this? I can share how to fix it

@github-actions github-actions bot removed the stale label Mar 27, 2024
@webbdays
Copy link

webbdays commented Apr 2, 2024

Can I work on this?
Please guide me through this.

@mmben
Copy link

mmben commented Jun 25, 2024

Hi,
I know the plan is to remove puppeteer, but it seems to be a longer way to get there.
Therefore I hope the question is ok: Is it an option to update puppeteer to the latest version as intermediate step, so we at least get rid of the security issues this old version has?

Copy link
Member

derberg commented Jun 25, 2024

sure, if anyone can open a PR that upgrades puppeteer to latest, that would be great. Just remember to test on local PDF generation still works

@derberg
Copy link
Member

derberg commented Aug 28, 2024

update of dependency released with #668 (comment) thanks to @bakasmarius

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

9 participants