Reference

PL/SE Applications

Bug Detection

Benchmark and Empirical Study

  • LLMs Cannot Reliably Identify and Reason About Security Vulnerabilities (Yet?): A Comprehensive Evaluation, Framework, and Benchmarks. S&P 2024, Link

  • Vulnerability Detection with Code Language Models: How Far Are We? arXiv 2024, Link

  • A Comprehensive Study of the Capabilities of Large Language Models for Vulnerability Detection, arXiv 2024, Link

  • How Far Have We Gone in Vulnerability Detection Using Large Language Models, ICLR 2024, Link

  • Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities, arXiv 2023, Link

  • Do Language Models Learn Semantics of Code? A Case Study in Vulnerability Detection, arXiv, Link

  • DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection, RAID 2023, Link

  • SkipAnalyzer: An Embodied Agent for Code Analysis with Large Language Models, Link

General Analysis

  • A Learning-Based Approach to Static Program Slicing. OOPSLA 2024, Link

  • Dataflow Analysis-Inspired Deep Learning for Efficient Vulnerability Detection. ICSE 2024, Link

  • E&V: Prompting Large Language Models to Perform Static Analysis by Pseudo-code Execution and Verification. arXiv, Link

Domain-Specific Bug Detection (Domain-Specific Program & Bug Type)

  • SMARTINV: Multimodal Learning for Smart Contract Invariant Inference, S&P 2024, Link

  • LLM-based Resource-Oriented Intention Inference for Static Resource Leak Detection, arXiv, Link

  • The Hitchhiker's Guide to Program Analysis: A Journey with Large Language Models, OOPSLA 2024, Link

  • Do you still need a manual smart contract audit? Link

  • Harnessing the Power of LLM to Support Binary Taint Analysis, arXiv, Link

  • Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives. arXiv, Link

  • GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis. ICSE 2024 Link

  • Continuous Learning for Android Malware Detection, USENIX Security 2023, Link

  • Beware of the Unexpected: Bimodal Taint Analysis, ISSTA 2023, Link

Specification Inference and Verification

  • Enchanting Program Specification Synthesis by Large Language Models using Static Analysis and Program Verification, CAV 2024, Link

  • SpecGen: Automated Generation of Formal Program Specifications via Large Language Models, Link

  • Lemur: Integrating Large Language Models in Automated Program Verification, ICLR 2024, Link

  • Zero and Few-shot Semantic Parsing with Ambiguous Inputs, ICLR 2024, Link

  • Finding Inductive Loop Invariants using Large Language Models, Link

  • Can ChatGPT support software verification? arXiv, Link

  • Impact of Large Language Models on Generating Software Specifications, Link

  • Can Large Language Models Reason about Program Invariants?, ICML 2023, Link

  • Ranking LLM-Generated Loop Invariants for Program Verification, Link

Program Repair, Code Completion, and Program Synthesis

  • Towards AI-Assisted Synthesis of Verified Dafny Methods, FSE 2024, Link

  • Enabling Memory Safety of C Programs using LLMs, arXiv, Link

  • CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules, ICLR 2024, Link

  • Is Self-Repair a Silver Bullet for Code Generation? ICLR 2024, Link

  • Verified Multi-Step Synthesis using Large Language Models and Monte Carlo Tree Search Link

  • Hypothesis Search: Inductive Reasoning with Language Models, ICLR 2024, Link

  • CodePlan: Repository-level Coding using LLMs and Planning, FMDM@NeurIPS 2023, Link

  • Repository-Level Prompt Generation for Large Language Models of Code. ICML 2023, Link

  • Refactoring Programs Using Large Language Models with Few-Shot Examples. arXiv, Link

  • SWE-bench: Can Language Models Resolve Real-World GitHub Issues? Link

  • Teaching Large Language Models to Self-Debug, ICLR 2024, Link

  • Guess & Sketch: Language Model Guided Transpilation, ICLR 2024, Link

  • Optimal Neural Program Synthesis from Multimodal Specifications, EMNLP 2021, Link

  • CodeTrek: Flexible Modeling of Code using an Extensible Relational Representation, ICLR 2022, Link

  • Sporq: An Interactive Environment for Exploring Code Using Query-by-Example, UIST 2021, Link

  • Data Extraction via Semantic Regular Expression Synthesis, OOPSLA 2023, Link

  • Web Question Answering with Neurosymbolic Program Synthesis, PLDI 2021, Link

  • Active Inductive Logic Programming for Code Search, ICSE 2019, Link

Fuzzing and Testing

  • Sedar: Obtaining High-Quality Seeds for DBMS Fuzzing via Cross-DBMS SQL Transfer. ICSE 2024. Link

  • LLM4FUZZ: Guided Fuzzing of Smart Contracts with Large Language Models Link

  • Large Language Model guided Protocol Fuzzing, NDSS 2024, Link

  • Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models, ISSTA 2023, Link

  • Language Agents as Hackers: Evaluating Cybersecurity Skills with Capture the Flag, MASEC@NeurIPS 2023, Link

Code Model and Code Reasoning

  • Source Code Vulnerability Detection: Combining Code Language Models and Code Property Graphs, arXiv, Link

  • CodeArt: Better Code Models by Attention Regularization When Symbols Are Lacking, FSE 2024, Link

  • FAIR: Flow Type-Aware Pre-Training of Compiler Intermediate Representations, ICSE 2024, Link

  • Symmetry-Preserving Program Representations for Learning Code Semantics Link

  • LmPa: Improving Decompilation by Synergy of Large Language Model and Program Analysis, Link

  • When Do Program-of-Thoughts Work for Reasoning? AAAI 2024, Link

  • Grounded Copilot: How Programmers Interact with Code-Generating Models, OOPSLA 2023, Link

  • Extracting Training Data from Large Language Models, USENIX Security 2021, Link

Code Understanding and IDE-tech

  • Using an LLM to Help With Code Understanding, ICSE 2024, Link

Hallucination

  • Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation, ICLR 2024, Link

Prompting for Reasoning Tasks

  • Self-Evaluation Guided Beam Search for Reasoning, NeurIPS 2023, Link

  • Self-Consistency Improves Chain of Thought Reasoning in Language Models. ICLR 2023, Link

  • Tree of Thoughts: Deliberate Problem Solving with Large Language Models. NeurIPS 2023, Link

  • Cumulative Reasoning With Large Language Models, Link

  • Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting, EMNLP 2023, Link

  • Complementary Explanations for Effective In-Context Learning, ACL 2023, Link

  • WeChat Post: The Mathematical Path of Large Language Models (大语言模型的数学之路), Link

  • Blog: Prompt Engineering Link

  • Hallucination: Survey Link

Agent, Tool Using, and Planning

  • Natural Language Commanding via Program Synthesis, Microsoft Link

  • Chain of Code: Reasoning with a Language Model-Augmented Code Emulator, Fei-Fei Li, Google, Link

  • If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents, Link

  • Real-world practices of AI Agents, Link

  • Cognitive Architectures for Language Agents, Link

  • The Rise and Potential of Large Language Model Based Agents: A Survey, Link

  • ReAct: Synergizing Reasoning and Acting in Language Models Link

  • Reflexion: Language Agents with Verbal Reinforcement Learning, NeurIPS 2023, Link

  • WeChat Post: AutoGen, Link

  • SATLM: Satisfiability-Aided Language Models Using Declarative Prompting, NeurIPS 2023, Link

  • Awesome things about LLM-powered agents: Papers, Repos, and Blogs, Link

  • ChatDev: Mastering the Virtual Social Realm, Shaping the Future of Intelligent Interactions. Link

  • SWE-bench: Can Language Models Resolve Real-World GitHub Issues? Link

Model and Framework

  • LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All. Link

  • codellama: Inference code for CodeLlama models, Link

  • CodeFuse: LLM for Code from Ant Group, Link

  • Owl-LM: Large Language Model for Blockchain, Link

Remark: Researcher List

  • Tao YU, The University of Hong Kong (Training)

  • Shunyu YAO, Princeton University (Reasoning, Agent)

  • Xi YE, Isil Dillig, UT Austin (Prompting)

  • Lingming ZHANG, UIUC (Application: Testing, Repair)

  • Zhiyun QIAN, UC Riverside (Application: Analysis)

  • Yizheng CHEN, University of Maryland (Application: Analysis)

  • Baishakhi Ray, Columbia University (Application: Repair, Analysis)

  • Martin Vechev, ETH (Data, Hallucination)