diff --git a/LLM4GW.html b/LLM4GW.html new file mode 100644 index 0000000..10092b0 --- /dev/null +++ b/LLM4GW.html @@ -0,0 +1,100 @@ + + + + + + + + LLM4GW + + + + + + + + + + + + + + +
+
+
+ + +
+
+

LLM4GW

+
+

+ LLM4GW is the first comprehensive study to assess how effective Large Language Models (LLMs) are for tasks related to GitHub workflows. While LLMs have shown effectiveness in software development tasks like coding and testing, GitHub workflows are distinct from regular code in terms of structure, semantics, and security properties. +

+

+ We curated a dataset of around 400,000 workflows based on the ARGUS dataset, generated prompts with varying levels of detail, and fine-tuned three state-of-the-art LLMs: GPT-3.5, CodeLlama, and StarChat. We evaluated the performance of these LLMs, both off-the-shelf and fine-tuned, on five workflow-related tasks: workflow generation, defect detection (syntactic errors and code injection vulnerabilities), and defect repair. The evaluation encompassed different prompting modes (zero-shot, one-shot) and involved identifying the best-performing temperature value and prompt for each LLM and task.
+

+

+ The study revealed that, unlike regular code generation, LLMs require detailed prompts to generate the desired workflows, but these detailed prompts can lead to invalid workflows with syntactic errors. Additionally, the LLMs were found to produce workflows with code injection vulnerabilities. The research also highlights the need for novel LLM-assisted techniques, as the current LLMs were found to be ineffective at repairing workflow defects. +
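+ As an illustration (a hypothetical workflow, not one drawn from our evaluation set), the sketch below shows a common form of GitHub Actions code injection of the kind referred to above: an attacker-controlled value, here the issue title, is expanded by the ${{ }} expression syntax directly inside a run script, so a crafted title can inject shell commands. Passing the value through an environment variable instead keeps the untrusted data out of the script text.
+
+        name: greet-issue
+        on: issues
+        jobs:
+          greet:
+            runs-on: ubuntu-latest
+            steps:
+              # Vulnerable: the expression is expanded into the script before the shell runs it.
+              - name: vulnerable-step
+                run: echo "New issue: ${{ github.event.issue.title }}"
+              # Safer: the untrusted value reaches the script only as an environment variable.
+              - name: safer-step
+                env:
+                  TITLE: ${{ github.event.issue.title }}
+                run: echo "New issue: $TITLE"
+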

+ +

Paper

+

+ Our paper has been accepted at ARES '24.
+

Code

+

+ Our code is open-sourced on GitHub. Please check out the repository for more details.
+

+

Bibtex

+
+    @inproceedings{10.1145/3664476.3664497,
+      author = {Zhang, Xinyu and Muralee, Siddharth and Cherupattamoolayil, Sourag and Machiry, Aravind},
+      title = {On the Effectiveness of Large Language Models for GitHub Workflows},
+      year = {2024},
+      isbn = {9798400717185},
+      publisher = {Association for Computing Machinery},
+      address = {New York, NY, USA},
+      url = {https://doi.org/10.1145/3664476.3664497},
+      doi = {10.1145/3664476.3664497},
+      booktitle = {Proceedings of the 19th International Conference on Availability, Reliability and Security},
+      articleno = {32},
+      numpages = {14},
+      location = {Vienna, Austria},
+      series = {ARES '24}
+    }
+  
+
+
+ + + +
+
+

+

+ LLM4GW | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University +

+
+
+ + +
+
+
+ + + diff --git a/argus.html b/argus.html index 63281f4..0dd8e10 100644 --- a/argus.html +++ b/argus.html @@ -55,10 +55,18 @@

Github's Blog

GitHub published a blog post about our findings and also mentioned our tool. We are grateful for the support provided by them throughout our research. +

Paper

+

+ Our paper was accepted at USENIX Security '23.
+

Code

Our tool is opensourced on GitHub. Please check out the repository for more details.

+

PoCs

+

+ We have developed PoCs for some randomly selected vulnerable workflows. The PoCs are currently restricted to individuals whose identities we can verify, to prevent any misuse. If you are interested in obtaining the PoCs, please follow the steps mentioned here. You can select the PoC option while filling out the form.
+

Bibtex

         @inproceedings{muralee2023Argus,
@@ -83,7 +91,7 @@ 

Bibtex

-
+ diff --git a/dataset.html b/dataset.html index 256add6..47e1a1c 100644 --- a/dataset.html +++ b/dataset.html @@ -11,7 +11,7 @@ gtag('config', 'G-EER1LDV4TH'); - ARGUS + SecureCI @@ -44,17 +44,18 @@
-

Argus Dataset Policy

+

Dataset Policy


- We are delighted to share the ARGUS dataset and believe it will be a valuable resource for your research. However, to prevent any potential misuse, we kindly request that you fill out a request form to confirm your identity and outline the scope of your research. Once we have verified these details, we will provide you with the download link for the ARGUS dataset. + We are delighted to share our datasets and believe they will be valuable resources for your research. However, to prevent any potential misuse, we kindly request that you fill out a request form to confirm your identity and outline the scope of your research. Once we have verified these details, we will provide you with the download link for the datasets.

Request Steps:

- 1. Please open the online request form in a browser.
+ 1. Please open the online request form in a browser.
Link to ARGUS Dataset Request Form: https://forms.gle/SmzyqhtLNrvvZ8x37
+ Link to LLM4GW Dataset Request Form: https://forms.gle/gyASp6NxMMtNexwh6
(If you are unable to access the page, please contact us by email.)

@@ -88,7 +89,7 @@

Argus Dataset Policy

Disclaimer & Download Agreement


- To download the ARGUS dataset, you must agree with the items of the succeeding Disclaimer & Download Agreement. You should carefully read the following terms before submitting the ARGUS Dataset request form. + To download our datasets, you must agree with the items of the succeeding Disclaimer & Download Agreement. You should carefully read the following terms before submitting the Dataset Request Form.

  • ARGUS Dataset is constructed and cross-checked by 2 experts that work in workflow security research. Due to the potential misclassification led by subjective factors, our members cannot guarantee a 100% accuracy for samples in the dataset.
  • @@ -97,12 +98,12 @@

    Disclaimer & Download Agreement


  • The purpose of using the dataset should be non-commercial research and/or personal use. The dataset should not be used for commercial use and any profitable purpose.

  • -
  • The ARGUS dataset should not be re-selled or re-distributed. Anyone who has obtained ARGUS should not share the dataset with others without the permission from our team.
  • +
  • The dataset should not be resold or redistributed. Anyone who has obtained the dataset should not share it with others without permission from our team.
-
+
@@ -145,7 +146,7 @@

Dataset Shared With


- ARGUS | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University + SecureCI | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University

diff --git a/file/LLM4GW.bib b/file/LLM4GW.bib new file mode 100644 index 0000000..4bf927b --- /dev/null +++ b/file/LLM4GW.bib @@ -0,0 +1,15 @@ +@inproceedings{10.1145/3664476.3664497, + author = {Zhang, Xinyu and Muralee, Siddharth and Cherupattamoolayil, Sourag and Machiry, Aravind}, + title = {On the Effectiveness of Large Language Models for GitHub Workflows}, + year = {2024}, + isbn = {9798400717185}, + publisher = {Association for Computing Machinery}, + address = {New York, NY, USA}, + url = {https://doi.org/10.1145/3664476.3664497}, + doi = {10.1145/3664476.3664497}, + booktitle = {Proceedings of the 19th International Conference on Availability, Reliability and Security}, + articleno = {32}, + numpages = {14}, + location = {Vienna, Austria}, + series = {ARES '24} +} \ No newline at end of file diff --git a/img/githublogo.png b/img/githublogo.png new file mode 100644 index 0000000..a8376fa Binary files /dev/null and b/img/githublogo.png differ diff --git a/index.html b/index.html index a80f1bc..da3c5f4 100644 --- a/index.html +++ b/index.html @@ -36,49 +36,64 @@
- -
-
-

ARGUS Overview

-
-

- ARGUS is a groundbreaking static taint analysis system specifically designed to identify code injection vulnerabilities in GitHub Actions. It is the first of its kind, offering a unique approach to securing Continuous Integration/Continuous Deployment (CI/CD) pipelines. -

-

- The system operates by tracking the flow of untrusted data across workflows and their associated actions, thereby identifying potential vulnerabilities. ARGUS has been meticulously tested on a large scale, analyzing over 2.7 million workflows and more than 31,000 actions. The results of this evaluation revealed critical code injection vulnerabilities in thousands of workflows and actions, highlighting the pervasive nature of such vulnerabilities in the GitHub Actions ecosystem. -

-

- ARGUS not only outperforms existing pattern-based vulnerability scanners but also underscores the necessity of taint analysis for effective vulnerability detection. The development and implementation of ARGUS represent a significant stride towards enhancing the security of GitHub Actions and CI/CD pipelines at large. -

- -

Github's Blog

-

- GitHub published a blog post about our findings and also mentioned our tool. We are grateful for the support provided by them throughout our research. - -

Code

-

- Our tool is opensourced on GitHub. Please check out the repository for more details. -

-

Bibtex

-
-        @inproceedings{muralee2023Argus,
-          title={ARGUS: A Framework for Staged Static Taint Analysis of GitHub Workflows and Actions},
-          author={S. Muralee, I. Koishybayev, A. Nahapetyan, G. Tystahl, B. Reaves, A. Bianchi, W. Enck, 
-            A. Kapravelos, A. Machiry},
-          booktitle={32st USENIX Security Symposium (USENIX Security 23)},
-          year={2023},
-        }
-      
-
- - -
- + +
+
+

Overview

+
+

+ Continuous Integration (CI) has become essential to the modern software development cycle. Developers engineer CI scripts, commonly called workflows or pipelines, to automate most software maintenance tasks, such as testing and deployment. +

+

+ Developers frequently misconfigure workflows, leading to severe security issues that can have devastating consequences, including supply-chain attacks. The extreme diversity of CI platforms and the features they support further exacerbates the problem and makes it challenging to uniformly specify and verify security properties across different CI platforms. In this area, we aim to address the problem by defining the desired security properties of a workflow and developing platform-independent techniques to verify and enforce them.
+
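+ As a hypothetical illustration of such a misconfiguration, the workflow sketch below uses the pull_request_target trigger, which runs with access to repository secrets, and then checks out and executes code from the untrusted pull request, giving attacker-controlled code the privileges of the repository and opening the door to supply-chain compromise.
+
+        name: ci
+        on: pull_request_target
+        jobs:
+          test:
+            runs-on: ubuntu-latest
+            steps:
+              # Checks out the attacker-controlled pull request code...
+              - uses: actions/checkout@v4
+                with:
+                  ref: ${{ github.event.pull_request.head.sha }}
+              # ...and runs it while repository secrets are available to the job.
+              - run: npm install && npm test
+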

+
+
+ + + +
+
+

Projects

+
+
+ +
+

ARGUS

+

+ ARGUS is a groundbreaking static taint analysis system specifically designed to identify code injection vulnerabilities in GitHub Actions. It is the first of its kind, offering a unique approach to securing Continuous Integration/Continuous Deployment (CI/CD) pipelines. +
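+ For illustration, the hypothetical fragment below (the action name is invented) shows the kind of cross-component flow ARGUS is designed to track: an attacker-controlled value, the pull request branch name, crosses from the workflow into a third-party action's input. Whether that flow is exploitable depends on how the action handles the input, which is why the analysis is staged across both workflows and the actions they invoke.
+
+        on: pull_request_target
+        jobs:
+          label:
+            runs-on: ubuntu-latest
+            steps:
+              # Tainted source: the head branch name is chosen by the pull request author.
+              - uses: example-org/branch-labeler@v1   # hypothetical third-party action
+                with:
+                  branch: ${{ github.event.pull_request.head.ref }}
+              # If the action interpolates this input into a shell command, the tainted
+              # value becomes a code injection inside the action.
+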

+ + +
+
+
+ + +
+

LLM4GW

+

+ LLM4GW is the first comprehensive study to assess how effective Large Language Models (LLMs) are for tasks related to GitHub workflows. +

+ +
+
+
+
+ + + + @@ -88,7 +103,7 @@

Bibtex

Team


- The ARGUS is built by Purdue Systems and Software Security Lab (PurS3) and PurSec Lab at Purdue University
and Wolfpack Security and Privacy Research (WSPR) lab at North Carolina State University. + Our projects are built by Purdue Systems and Software Security Lab (PurS3) and PurSec Lab at Purdue University
and Wolfpack Security and Privacy Research (WSPR) lab at North Carolina State University.

purdue @@ -105,7 +120,7 @@

Team

Funding


- We are greatful to the following sources for funding this project. + We are grateful to the following sources for funding these projects.

purdue @@ -121,7 +136,7 @@

Funding



- ARGUS | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University + SecureCI | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University

diff --git a/materials.html b/materials.html index e8b1ea5..5538419 100644 --- a/materials.html +++ b/materials.html @@ -2,7 +2,7 @@ - Argus + SecureCI @@ -50,10 +50,28 @@

Materials

BibTeX Citation + + GitHub Repository +
-

+ + +

+ ARES'24 Paper: "On the Effectiveness of Large Language Models for GitHub Workflows".

+ Dataset - Materials - GitHub + +
\ No newline at end of file diff --git a/poc.html b/poc.html index 790b943..292c7f1 100644 --- a/poc.html +++ b/poc.html @@ -63,7 +63,7 @@



- ARGUS | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University + SecureCI | PurS3 Lab at Purdue University | PurSec Lab at Purdue University | WSPR Lab at North Carolina State University