Running in AWS Lambda Containers #289
Comments
Yep, I got it to work. You need to use "/tmp" for downloads, and if you want headless Chrome to be able to download files you need to call the 'driverEnableHeadlessDownloads' function (see the link in the function for the source). I pinned my selenium version to

I use the following Python/selenium functions to set up the driver:

import logging
import pathlib

from selenium import webdriver

logger = logging.getLogger(__name__)


def driverEnableHeadlessDownloads(driver: webdriver.Chrome, downloadDir: str) -> webdriver.Chrome:
    """
    Need this voodoo function to allow serverless chrome downloads.
    From: https://github.com/shawnbutton/PythonHeadlessChrome/blob/master/driver_builder.py

    Parameters
    ----------
    driver: selenium webdriver
    downloadDir: directory used for downloads

    Returns
    -------
    selenium webdriver
    """
    # Register the DevTools "send_command" endpoint with the command executor.
    driver.command_executor._commands["send_command"] = (
        "POST",
        "/session/$sessionId/chromium/send_command",
    )
    # Tell headless Chrome to allow downloads and where to write them.
    params = {
        "cmd": "Page.setDownloadBehavior",
        "params": {"behavior": "allow", "downloadPath": downloadDir},
    }
    driver.execute("send_command", params)
    # Return the driver so the function matches its annotation and docstring.
    return driver


def makeDefaultChromeOptions() -> webdriver.ChromeOptions:
    """
    Set up default chrome options

    Returns
    -------
    selenium ChromeOptions
    """
    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    options.add_argument("--disable-gpu")
    options.add_argument("--window-size=1280,1696")  # Chrome expects width,height
    options.add_argument("--disable-application-cache")
    options.add_argument("--disable-infobars")
    options.add_argument("--no-sandbox")
    options.add_argument("--hide-scrollbars")
    options.add_argument("--enable-logging")
    options.add_argument("--log-level=0")
    options.add_argument("--single-process")
    options.add_argument("--ignore-certificate-errors")
    options.add_argument("--disable-dev-shm-usage")
    options.add_argument("--homedir=/var/task")
    options.add_argument(
        "user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/61.0.3163.100 Safari/537.36"
    )
    return options


class Driver:
    """Context manager that starts headless Chrome and shuts it down on exit."""

    def __init__(self, chromeDriver: str, prefs: dict, headlessChromeBinary: str):
        if not pathlib.Path(chromeDriver).exists():
            raise FileNotFoundError(f"Chrome driver not found at {chromeDriver}")
        self.chromeDriver = chromeDriver
        self.prefs = prefs
        self.options = makeDefaultChromeOptions()
        self.options.add_experimental_option("prefs", prefs)
        self.options.binary_location = headlessChromeBinary
        self.driver = None

    def __enter__(self):
        logger.info(
            f"Setting up headless chrome-based browser with preferences {self.prefs}"
        )
        # The chromedriver path is passed positionally, as in Selenium 3.
        self.driver = webdriver.Chrome(self.chromeDriver, options=self.options)
        driverEnableHeadlessDownloads(self.driver, "/tmp")
        return self.driver

    def __exit__(self, excType, excVal, excTb):
        logger.info("Shutting down driver")
        # quit() ends the whole browser session; close() only closes the current window.
        self.driver.quit()


chromePrefs = {
    "download.default_directory": chromeDownloadPath,
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "safebrowsing.enabled": False,
}

This is the Dockerfile I use for deployment:

FROM public.ecr.aws/lambda/python:3.7
RUN mkdir -p /opt/bin && mkdir -p /opt/extensions && mkdir /var/task/.downloads \
    && curl -SL https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-55/stable-headless-chromium-amazonlinux-2017-03.zip \
        > /opt/bin/headless-chromium.zip \
    && unzip /opt/bin/headless-chromium.zip -d /opt/bin && rm /opt/bin/headless-chromium.zip \
    && curl -SL https://chromedriver.storage.googleapis.com/2.43/chromedriver_linux64.zip > /opt/bin/chromedriver.zip \
    && unzip /opt/bin/chromedriver.zip -d /opt/bin && rm /opt/bin/chromedriver.zip \
    && chmod 777 /opt/bin/chromedriver

# Add poetry files
ADD poetry.lock /var/task
ADD pyproject.toml /var/task

RUN pip install --upgrade pip \
    && pip install poetry --no-cache-dir \
    # Export requirements from poetry project
    && poetry export -f requirements.txt --output /var/task/requirements.txt \
    && pip uninstall -y poetry \
    && pip install -r requirements.txt --target /var/task --no-cache-dir \
    && pip install awslambdaric --target /var/task --no-cache-dir

ADD awsLambda /var/task

CMD [ "main.handler" ]

And this is my pulumi function to create the lambda:

lambdaFunction = lambda_.Function(
    resource_name="myLambda",
    image_uri="XXXXXXXXX.dkr.ecr.XXXXX.amazonaws.com"
    f"/myLambda:latest-prod",
    memory_size=1024,
    role=role.arn,
    package_type="Image",
    description="This lambda does things.",
    timeout=500,
    tags={
        "environment": "prod",
        "creator": "pulumi",
        "project": "myLambda",
        "project-url": "https://github.com/XXXXXXX/XXXXXXX",
        "maintainer": "myname",
        "maintainer-email": "[email protected]",
    },
)

I test the lambda function locally by using the awslambdaric python module. After building the Dockerfile, I call:

docker run -d -v ~/.aws-lambda-rie:/aws-lambda -p 9000:8080 \
    --entrypoint /aws-lambda/aws-lambda-rie \
    --env-file .temp/.env \
    docker.io/myorg/myimg \
    /var/lang/bin/python -m awslambdaric main.handler ## 'main' is my lambda file, 'handler' is the lambda name

Firing

Hope this helps someone!
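
With the container above running, the Runtime Interface Emulator accepts invocations on the mapped port 9000 at its fixed path `/2015-03-31/functions/function/invocations`. Here is a minimal sketch of sending a test event from Python using only the standard library. The helper names `build_invoke_request` and `invoke_local_lambda`, the timeout, and the shape of the event are assumptions for illustration; they are not part of the setup described above.

```python
import json
import urllib.request

# Port 9000 matches the `-p 9000:8080` mapping in the docker run command.
# The path is the invocation route exposed by the Lambda Runtime Interface Emulator.
RIE_URL = "http://localhost:9000/2015-03-31/functions/function/invocations"


def build_invoke_request(event: dict, url: str = RIE_URL) -> urllib.request.Request:
    """Build a POST request carrying the event as a JSON body (hypothetical helper)."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke_local_lambda(event: dict, timeout: float = 600.0) -> str:
    """Send the event to the local container and return the raw response text."""
    request = build_invoke_request(event)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `invoke_local_lambda({})` sends an empty event; replace it with whatever payload `main.handler` expects. The generous default timeout is a guess meant to outlast a slow browser session, so tune it to your function.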
@tomardern
May I know which download path I have to provide?
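
Going by the first comment, the download path is "/tmp", which is the writable scratch directory inside a Lambda container, and it is the same path passed to 'driverEnableHeadlessDownloads'. Below is a minimal sketch of waiting for a download to land there. It assumes Chrome's usual behaviour of writing in-progress downloads with a `.crdownload` suffix and renaming them when finished. `listFinishedFiles`, `waitForDownload`, and their parameters are hypothetical names, not part of the original code.

```python
import pathlib
import time
from typing import Optional, Set

DOWNLOAD_DIR = "/tmp"  # assumption: same directory the first comment uses for downloads


def listFinishedFiles(downloadDir: str) -> Set[pathlib.Path]:
    """Return the regular files in downloadDir that are not partial downloads."""
    return {
        path
        for path in pathlib.Path(downloadDir).iterdir()
        if path.is_file() and path.suffix != ".crdownload"
    }


def waitForDownload(
    downloadDir: str,
    before: Set[pathlib.Path],
    timeout: float = 60.0,
    pollInterval: float = 0.5,
) -> Optional[pathlib.Path]:
    """Poll until a finished file that is not in `before` appears; None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        newFiles = listFinishedFiles(downloadDir) - before
        if newFiles:
            # If several files appeared, pick the most recently modified one.
            return max(newFiles, key=lambda path: path.stat().st_mtime)
        if time.monotonic() >= deadline:
            return None
        time.sleep(pollInterval)
```

In a handler this would be used by taking `before = listFinishedFiles(DOWNLOAD_DIR)`, triggering the download in the browser, and then calling `waitForDownload(DOWNLOAD_DIR, before)` before reading or uploading the file.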
Hi,
Now that AWS supports containers in Lambda, is there a plan to get this repo working in a container instead of with the provided binaries/layers, or has anyone already attempted it?
Thanks,