行业新闻
- 微软宣布推出网络安全产品——Microsoft Security Copilot
Security Copilot将目前最强大语言模型GPT-4内置在产品中,并与微软拥有65万亿个网络安全威胁的安全模型库相结合使用,为企业、个人用户提供网络安全、恶意代码防护、隐私合规监控等生成式自动化AI服务
https://thehackernews.com/2023/03/microsoft-introduces-gpt-4-ai-powered.html - 在网络安全最佳实践中集成ChatGPT和生成式AI
https://www.sentinelone.com/blog/integrating-chatgpt-generative-ai-within-cybersecurity-best-practices/ - 行业分析:云之后,大模型是网络安全的新机会吗?
https://mp.weixin.qq.com/s/nmeDrQX5dTRUT23-2sGI-g - VirusTotal推出Code Insight,用生成式人工智能为威胁分析赋能
https://blog.virustotal.com/2023/04/introducing-virustotal-code-insight.html - 安全大模型进入爆发期!谷歌云已接入全线安全产品|RSAC 2023
https://mp.weixin.qq.com/s/5Aywrqk7B6YCiLRbojNCuQ - Facebook季度安全报告:假冒ChatGPT的恶意软件激增
https://about.fb.com/news/2023/05/metas-q1-2023-security-reports/ - Tenable的报告展示了生成式人工智能正在如何改变安全研究
https://venturebeat.com/security/tenable-report-shows-how-generative-ai-is-changing-security-research/ - 挖掘AIGC军火产业链,颠覆巨头的机会和风险
https://mp.weixin.qq.com/s/bVQYT0QqueGyLwPAppDRtg - 增强ChatGPT等安全性,大语言模型界的“安全管家”开源了!
https://mp.weixin.qq.com/s/PeuNht95WVJbJ8hiOOrSoA - ChatGPT暗黑版——FraudGPT
https://hackernoon.com/what-is-fraudgpt
实践指南
软件供应链安全
利用GPT/AIGC/LLM来进行漏洞挖掘和修复、代码质量评测、程序理解
论文
- Trust in Software Supply Chains: Blockchain-Enabled SBOM and the AIBOM Future
https://arxiv.org/pdf/2307.02088.pdf - An Empirical Study on Using Large Language Models to Analyze Software Supply Chain Security Failures
https://arxiv.org/pdf/2308.04898.pdf - SecureFalcon: The Next Cyber Reasoning System for Cyber Security
https://arxiv.org/pdf/2307.06616.pdf - Using ChatGPT as a Static Application Security Testing Tool
https://arxiv.org/pdf/2308.14434.pdf - A Preliminary Study on Using Large Language Models in Software Pentesting
https://arxiv.org/pdf/2401.17459.pdf - LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs’ Vulnerability Reasoning
https://arxiv.org/pdf/2401.16185.pdf - Finetuning Large Language Models for Vulnerability Detection
https://arxiv.org/pdf/2401.17010.pdf - Large Language Model for Vulnerability Detection: Emerging Results and Future Directions
https://arxiv.org/pdf/2401.15468.pdf - How Far Have We Gone in Vulnerability Detection Using Large Language Models
https://arxiv.org/pdf/2311.12420.pdf - LLbezpeky: Leveraging Large Language Models for Vulnerability Detection
https://arxiv.org/pdf/2401.01269.pdf - Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities
https://arxiv.org/pdf/2311.16169.pdf - Detecting software vulnerabilities using Language Models
https://arxiv.org/ftp/arxiv/papers/2302/2302.11773.pdf - Prompt-Enhanced Software Vulnerability Detection Using ChatGPT
https://arxiv.org/pdf/2308.12697.pdf - Evaluation of ChatGPT Model for Vulnerability Detection
https://arxiv.org/pdf/2304.07232.pdf - LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluation
https://arxiv.org/pdf/2303.09384.pdf - DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection
https://arxiv.org/pdf/2304.00409.pdf - FLAG: Finding Line Anomalies (in code) with Generative AI
https://arxiv.org/pdf/2306.12643.pdf - Large Language Models are Edge-Case Fuzzers: Testing Deep Learning Libraries via FuzzGPT
https://arxiv.org/pdf/2304.02014.pdf - Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models
https://arxiv.org/pdf/2212.14834.pdf - CovRL: Fuzzing JavaScript Engines with Coverage-Guided Reinforcement Learning for LLM-based Mutation
https://arxiv.org/pdf/2402.12222.pdf - When GPT Meets Program Analysis: Towards Intelligent Detection of Smart Contract Logic Vulnerabilities in GPTScan
https://arxiv.org/pdf/2308.03314.pdf - Efficient Avoidance of Vulnerabilities in Auto-completed Smart Contract Code Using Vulnerability-constrained Decoding
https://arxiv.org/pdf/2309.09826.pdf - Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives
https://arxiv.org/pdf/2310.01152.pdf - VulLibGen: Identifying Vulnerable Third-Party Libraries via Generative Pre-Trained Model
https://arxiv.org/pdf/2308.04662.pdf - Leveraging AI Planning For Detecting Cloud Security Vulnerabilities
https://arxiv.org/pdf/2402.10985.pdf - InferFix: End-to-End Program Repair with LLMs
https://arxiv.org/pdf/2303.07263.pdf - HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for Root Cause Analysis with GPT-assisted Mitigation Suggestion
https://arxiv.org/pdf/2312.13530.pdf - Fixing Hardware Security Bugs with Large Language Models
https://arxiv.org/pdf/2302.01215.pdf - Unlocking Hardware Security Assurancee: The Potential of LLMs
https://arxiv.org/pdf/2308.11042.pdf - Generating Secure Hardware using ChatGPT Resistant to CWEs
围绕硬件设计实施常见的10个CWE,分别展示了生成带缺陷代码和安全代码的提示词场景
https://eprint.iacr.org/2023/212.pdf - DIVAS: An LLM-based End-to-End Framework for SoC Security Analysis and Policy-based Protection
https://arxiv.org/pdf/2308.06932.pdf - Examining Zero-Shot Vulnerability Repair with Large Language Models
https://arxiv.org/pdf/2112.02125.pdf - Practical Program Repair in the Era of Large Pre-trained Language Models
https://arxiv.org/pdf/2210.14179.pdf - An Analysis of the Automatic Bug Fixing Performance of ChatGPT
https://arxiv.org/pdf/2301.08653.pdf - Automatic Program Repair with OpenAI’s Codex: Evaluating QuixBugs
https://arxiv.org/pdf/2111.03922.pdf - How Effective Are Neural Networks for Fixing Security Vulnerabilities
https://arxiv.org/pdf/2305.18607.pdf - STEAM: Simulating the InTeractive BEhavior of ProgrAMmers for Automatic Bug Fixing
https://arxiv.org/pdf/2308.14460.pdf - ZeroLeak: Using LLMs for Scalable and Cost Effective Side-Channel Patching
https://arxiv.org/pdf/2308.13062.pdf - Can LLMs Patch Security Issues?
https://arxiv.org/pdf/2312.00024.pdf - Better patching using LLM prompting, via Self-Consistency
https://arxiv.org/pdf/2306.00108.pdf - Identifying Vulnerability Patches by Comprehending Code Commits with Comprehensive Change Contexts
https://arxiv.org/pdf/2310.02530.pdf - Just-in-Time Security Patch Detection – LLM At the Rescue for Data Augmentation
https://arxiv.org/pdf/2312.01241.pdf - Towards JavaScript program repair with generative pre-trained transformer (GPT-2)
https://dl.acm.org/doi/abs/10.1145/3524459.3527350 - Code Security Vulnerability Repair Using Reinforcement
https://arxiv.org/pdf/2401.07031.pdf - Enhanced Automated Code Vulnerability Repair using Large Language Models
https://arxiv.org/ftp/arxiv/papers/2401/2401.03741.pdf - Repair Is Nearly Generation: Multilingual Program Repair with LLMs
https://arxiv.org/pdf/2208.11640.pdf - Cupid: Leveraging ChatGPT for More Accurate Duplicate Bug Report Detection
https://arxiv.org/pdf/2308.10022v2.pdf - Asleep at the Keyboard? Assessing the Security of GitHub Copilot’s Code Contributions
https://arxiv.org/pdf/2108.09293.pdf - Do Users Write More Insecure Code with AI Assistants?
https://arxiv.org/pdf/2211.03622.pdf - How Secure is Code Generated by ChatGPT?
https://arxiv.org/pdf/2304.09655.pdf - Lost at C: A User Study on the Security Implications of Large Language Model Code Assistants
https://arxiv.org/pdf/2208.09727.pdf - Evaluating Large Language Models Trained on Code
https://arxiv.org/pdf/2107.03374.pdf - No Need to Lift a Finger Anymore? Assessing the Quality of Code Generation by ChatGPT
https://arxiv.org/pdf/2308.04838.pdf - Assessing the Quality of GitHub Copilot’s Code Generation
https://dl.acm.org/doi/abs/10.1145/3558489.3559072 - Is GitHub’s Copilot as Bad as Humans at Introducing Vulnerabilities in Code?
https://arxiv.org/pdf/2204.04741.pdf - How ChatGPT is Solving Vulnerability Management Problem
https://arxiv.org/pdf/2311.06530.pdf - Neural Code Completion Tools Can Memorize Hard-coded Credentials
https://arxiv.org/pdf/2309.07639.pdf - BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT
https://www.ndss-symposium.org/wp-content/uploads/2023/02/NDSS2023Poster_paper_7966.pdf - Teaching Large Language Models to Self-Debug
https://arxiv.org/pdf/2304.05128.pdf - LLM4SecHW: Leveraging Domain-Specific Large Language Model for Hardware Debugging
https://browse.arxiv.org/pdf/2401.16448.pdf - Evaluating the Code Quality of AI-Assisted Code Generation Tools: An Empirical Study on GitHub Copilot, Amazon CodeWhisperer, and ChatGPT
https://arxiv.org/pdf/2304.10778.pdf - Fault-Aware Neural Code Rankers
https://arxiv.org/pdf/2206.03865.pdf - Using Large Language Models to Enhance Programming Error Messages
https://arxiv.org/pdf/2210.11630.pdf - Controlling Large Language Models to Generate Secure and Vulnerable Code
引用了Asleep at the keyboard? 使用了预训练模型,对LM的输出进行pre-train以控制输出的代码是安全的还是存在漏洞的
https://arxiv.org/pdf/2302.05319.pdf - Systematically Finding Security Vulnerabilities in Black-Box Code Generation Models
针对“prompt中的微小变化可能导致漏洞”的问题,在”Asleep at the keyboard?”手动操作在基础上实现自动化发现
https://arxiv.org/pdf/2302.04012.pdf - SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques
在Systematically Finding Security Vulnerabilities in Black-Box Code Generation的论文中,把这篇论文看的很重,解决模型评估的数据集的问题
https://dl.acm.org/doi/abs/10.1145/3549035.3561184 - Security Code Review by LLMs: A Deep Dive into Responses
https://browse.arxiv.org/pdf/2401.16310.pdf - Exploring the Limits of ChatGPT in Software Security Applications
https://arxiv.org/pdf/2312.05275.pdf - An Empirical Evaluation of LLMs for Solving Offensive Security Challenges
https://arxiv.org/pdf/2402.11814.pdf - Purple Llama CYBERSECEVAL: A Secure Coding Benchmark for Language Models
https://arxiv.org/pdf/2312.04724.pdf - Can Large Language Models Identify And Reason About Security Vulnerabilities? Not Yet
https://arxiv.org/pdf/2312.12575.pdf - Pop Quiz! Can a Large Language Model Help With Reverse Engineering
https://arxiv.org/pdf/2202.01142.pdf - CODAMOSA: Escaping Coverage Plateaus in Test Generation with Pre-trained Large Language Models
微软在ICSE2023上发布的论文,旨在利用LLM来缓解传统fuzz中的陷入“Coverage Plateaus”的问题
https://www.carolemieux.com/codamosa_icse23.pdf - Understanding Large Language Model Based Fuzz Driver Generation
https://arxiv.org/abs/2307.12469 - ChatGPT for Software Security: Exploring the Strengths and Limitations of ChatGPT in the Security Applications
https://arxiv.org/abs/2307.12488 - How well does LLM generate security tests?
https://arxiv.org/pdf/2310.00710.pdf
博客
- ChatGPT的软件包推荐是可信赖的吗?
https://vulcan.io/blog/ai-hallucinations-package-risk - 人工智能驱动的模糊测试:打破漏洞狩猎的障碍
https://security.googleblog.com/2023/08/ai-powered-fuzzing-breaking-bug-hunting.html - 用GPT做静态代码扫描实现自动化漏洞挖掘思路分享
https://mp.weixin.qq.com/s/Masyfq12cjaM4Zn6qxvGoA - 用 LLM 降低白盒误报及自动修复漏洞代码
https://mp.weixin.qq.com/s/leLFECUaNOGbjsN_8mcXrQ - ChatGPT在源代码分析中可靠吗?
https://mp.weixin.qq.com/s/Ix2lArBzaCAJr5nyGolCwQ - ChatGPT在代码中定位缺陷足够好吗?
https://pvs-studio.com/en/blog/posts/1035/ - ChatGPT+RASP,实现CodeQL漏洞挖掘高效自动化
https://mp.weixin.qq.com/s/xlUWn2oWU51NVkgB157pRw - ChatGPTScan:使用ChatGPTScan批量进行代码审计
https://mp.weixin.qq.com/s/QIKvRzNlAKiqh_UMOMfDdg - 应用GPT-4 于 Semgrep 中以指出误报和修复代码
https://semgrep.dev/blog/2023/gpt4-and-semgrep-detailed?utm_source=twitter\&utm_medium=social\&utm_campaign=brand - Kondukto产品利用OpenAI来修复代码漏洞
https://kondukto.io/blog/kondukto-openai-chatgpt - 利用ChatGPT来进行代码审计
https://research.nccgroup.com/2023/02/09/security-code-review-with-chatgpt/ - 利用GPT-3在单个代码仓库中找到213个安全漏洞
https://betterprogramming.pub/i-used-gpt-3-to-find-213-security-vulnerabilities-in-a-single-codebase-cc3870ba9411 - 利用GPT-4进行调试和漏洞修复
https://www.sitepoint.com/gpt-4-for-debugging/ - 开源项目的神奇助理:让AI扮演代码检察员,加速高质量PR review
https://mp.weixin.qq.com/s/7WeMbWDUghyS5kSBiZhYYA - 用ChatGPT生成测试数据
https://mp.weixin.qq.com/s/tE09X5Fce-PQs1urpJGWAg - 黑客可能利用ChatGPT的方式
https://cybernews.com/security/hackers-exploit-chatgpt/ - GPT-4 Jailbreak and Hacking via Rabbithole Attack, Prompt Injection, Content Modderation Bypass and Weaponizing AI
https://adversa.ai/blog/gpt-4-hacking-and-jailbreaking-via-rabbithole-attack-plus-prompt-injection-content-moderation-bypass-weaponizing-ai/ - 我是如何用GPT自动化生成Nuclei的POC
https://mp.weixin.qq.com/s/Z8cTUItmbwuWbRTAU_Y3pg - ChatGPT 生成代码的安全漏洞
https://www.trendmicro.com/ja_jp/devops/23/e/chatgpt-security-vulnerabilities.html
威胁检测
利用GPT/AIGC/LLM来完成恶意软件、网络攻击等威胁检测
论文
- Static Malware Detection Using Stacked BiLSTM and GPT-2
https://ieeexplore.ieee.org/document/9785789 - FlowTransformer: A Transformer Framework for Flow-based Network Intrusion Detection Systems
https://arxiv.org/pdf/2304.14746.pdf - ChatGPT for Digital Forensic Investigation: The Good, The Bad, and The Unknown
https://arxiv.org/pdf/2307.10195.pdf - Devising and Detecting Phishing: large language models vs. Smaller Human Models
https://arxiv.org/pdf/2308.12287.pdf - Anatomy of an AI-powered malicious social botnet
https://arxiv.org/pdf/2307.16336.pdf - Revolutionizing Cyber Threat Detection with Large Language Models
https://arxiv.org/pdf/2306.14263.pdf - LLM in the Shell: Generative Honeypots
借助LLM打造智能的高交互蜜罐
https://arxiv.org/pdf/2309.00155.pdf - What Does an LLM-Powered Threat Intelligence Program Look Like?
http://i.blackhat.com/BH-US-23/Presentations/US-23-Grof-Miller-LLM-Powered-TI-Program.pdf
博客
- IoC detection experiments with ChatGPT
https://securelist.com/ioc-detection-experiments-with-chatgpt/108756/ - ChatGPT赋能的威胁分析——使用ChatGPT为每个npm, PyPI包检查安全问题,包括信息渗透、SQL注入漏洞、凭证泄露、提权、后门、恶意安装、预设指令污染等威胁
https://socket.dev/blog/introducing-socket-ai-chatgpt-powered-threat-analysis
安全运营
利用GPT/AIGC/LLM来辅助安全运营/SOAR/SIEM
论文
- GPT-2C: A GPT-2 Parser for Cowrie Honeypot Logs
https://arxiv.org/pdf/2109.06595.pdf - On the Uses of Large Language Models to Interpret Ambiguous Cyberattack Descriptions
https://arxiv.org/pdf/2306.14062.pdf - From Text to MITRE Techniques: Exploring the Malicious Use of Large Language Models for Generating Cyber Attack Payloads
https://arxiv.org/pdf/2305.15336.pdf - Automated CVE Analysis for Threat Prioritization and Impact Prediction
https://arxiv.org/pdf/2309.03040.pdf - LogGPT: Log Anomaly Detection via GPT
https://browse.arxiv.org/pdf/2309.14482.pdf - Cyber Sentinel: Exploring Conversational Agents’ Role in Streamlining Security Tasks with GPT-4
https://browse.arxiv.org/pdf/2309.16422.pdf - AGIR: Automating Cyber Threat Intelligence Reporting with Natural Language Generation
https://arxiv.org/pdf/2310.02655.pdf - ChatGPT, Llama, can you write my report? An experiment on assisted digital forensics reports written using (Local) Large Language Models
https://arxiv.org/pdf/2312.14607.pdf - HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs)
https://arxiv.org/pdf/2309.16021.pdf - Benchmarking Large Language Models for Log Analysis, Security, and Interpretation
https://arxiv.org/pdf/2311.14519.pdf
博客
- Elastic公司发布的”与ChatGPT探索安全的未来”
提出了6个构想:(1) 聊天机器人协助事件响应 (2) 威胁报告生成 (3) 自然语言检索 (4) 异常检测 (5) 安全策略问答机器人 (6) 告警排序。
https://www.elastic.co/cn/security-labs/exploring-applications-of-chatgpt-to-improve-detection-response-and-understanding - ChatGPT在安全运营中的应用初探
结论——ChatGPT可以赋能包括事件分析与响应在内的多种安全运营过程,降低对本就不足的预置安全知识的依赖,并能促进有价值的安全知识的产生和积累,从而帮助安全运营团队更准确地做出决策、实施响应、积累经验,尤其对初级安全工程师有辅导作用。当然,现阶段以及未来一段时间,ChatGPT等高级AI驱动的聊天机器人还无法完全取代人类分析师,更多是提供辅助决策与操作支持。相信随着持续高强度的人机会话互动,再借助更大规模更专业(网络安全运营领域)的语料库训练,ChatGPT会不断强化自己的能力,不断减轻人类安全分析师的工作负担。
https://www.secrss.com/articles/51775 - 利用Chat GPT和D3的AI辅助事件响应
探讨将ChatGPT与Smart SOAR整合的好处;提供样例分析——使用MITRE TTPs和微软端点防御系统警报中发现的恶意软件家族以收集事件的背景信息,之后问ChatGPT,让它根据对TTP和恶意软件的了解,描述攻击者接下来可能会采取什么措施、恶意软件可能利用什么漏洞。
https://www.163.com/dy/article/I48DBHHG055633FJ.html - Blink公司推出有史以来第一个用于自动化安全和IT运营工作流程的生成式人工智能
https://www.blinkops.com/blog/introducing-blink-copilot-the-first-generative-ai-for-security-workflows
GPT自身安全
关于GPT/AIGC/LLM等大模型技术自身的各类安全风险以及漏洞,大模型技术滥用与误用的可能性
论文
-
GPT-4 Technical Report
OpenAI对模型自身安全评估和缓解
https://arxiv.org/abs/2303.08774 -
Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
提出一个未来的研究方向:在特定的使用情境下,保障大语言模型的安全可靠性需要发展什么类型的测试?
https://arxiv.org/pdf/2102.02503.pdf -
Taxonomy of Risks posed by Language Models
https://dl.acm.org/doi/10.1145/3531146.3533088 -
ChatGPT Security Risks: A Guide for Cyber Security Professionals
https://www.cybertalk.org/wp-content/uploads/2023/03/ChatGPT_eBook_CP_CT_2023.pdf -
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
https://arxiv.org/pdf/2305.11391.pdf -
The (ab)use of Open Source Code to Train Large Language Models
https://arxiv.org/pdf/2302.13681.pdf -
GPT in Sheep’s Clothing: The Risk of Customized GPTs
https://arxiv.org/pdf/2401.09075.pdf -
In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT
https://arxiv.org/pdf/2304.08979.pdf -
DECODINGTRUST: A Comprehensive Assessment of Trustworthiness in GPT Models
https://arxiv.org/pdf/2306.11698.pdf -
On the Trustworthiness Landscape of State-of-the-art Generative Models: A Comprehensive Survey
https://arxiv.org/pdf/2307.16680.pdf -
LLM Censorship: A Machine Learning Challenge or a Computer Security Problem?
https://arxiv.org/pdf/2307.10719.pdf -
Ignore Previous Prompt: Attack Techniques For Language Models
ML Safety Workshop NeurIPS 2022,以及提示注入的开山之作
https://arxiv.org/pdf/2211.09527.pdf -
Boosting Big Brother: Attacking Search Engines with Encodings
攻击测试了集成OpenAI GPT-4模型的必应搜索引擎
https://arxiv.org/pdf/2304.14031.pdf -
Reverse-Engineering Decoding Strategies Given Blackbox Access to a Language Generation System
https://arxiv.org/pdf/2309.04858.pdf -
More than you’ve asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models
间接提示注入的开山之作,里面很多场景都已成为现实
https://arxiv.org/pdf/2302.12173.pdf -
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
https://arxiv.org/pdf/2009.11462.pdf -
Jailbroken: How Does LLM Safety Training Fail?
https://arxiv.org/pdf/2307.02483.pdf -
FuzzLLM: A Novel and Universal Fuzzing Framework for Proactively Discovering Jailbreak Vulnerabilities in Large Language Models
https://arxiv.org/pdf/2309.05274.pdf -
GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
https://arxiv.org/pdf/2309.10253.pdf -
Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks
https://arxiv.org/pdf/2302.05733.pdf -
Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
https://arxiv.org/pdf/2302.12173.pdf -
Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications
https://arxiv.org/pdf/2311.16153.pdf -
Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
https://arxiv.org/pdf/2209.07858.pdf -
Explore, Establish, Exploit: Red Teaming Language Models from Scratch
https://arxiv.org/pdf/2306.09442.pdf -
Beyond the Safeguards: Exploring the Security Risks of ChatGPT
https://arxiv.org/pdf/2305.08005.pdf -
ProPILE: Probing Privacy Leakage in Large Language Models
https://arxiv.org/pdf/2307.01881.pdf -
Analyzing Leakage of Personally Identifiable Information in Language Models
https://ieeexplore.ieee.org/document/10179300 -
Can We Generate Shellcodes via Natural Language? An Empirical Study
https://link.springer.com/article/10.1007/s10515-022-00331-3 -
RatGPT: Turning online LLMs into Proxies for Malware Attacks
https://arxiv.org/pdf/2308.09183.pdf -
Chatbots to ChatGPT in a Cybersecurity Space: Evolution, Vulnerabilities, Attacks, Challenges, and Future Recommendations
https://arxiv.org/pdf/2306.09255.pdf -
BadPrompt: Backdoor Attacks on Continuous Prompts
南开大学在NeurIPS 2022上发表的论文,小样本场景下通过连续prompt对大模型后门攻击
https://arxiv.org/pdf/2211.14719.pdf -
LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors
https://arxiv.org/pdf/2308.13904.pdf -
A Comprehensive Overview of Backdoor Attacks in Large Language Models within Communication Networks
https://arxiv.org/pdf/2308.14367.pdf -
BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
https://arxiv.org/pdf/2401.12242.pdf -
Universal and Transferable Adversarial Attacks on Aligned Language Models
https://arxiv.org/pdf/2307.15043.pdf
https://llm-attacks.org -
Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment
https://arxiv.org/pdf/2308.09662.pdf -
Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models
https://arxiv.org/pdf/2310.00322.pdf -
A LLM Assisted Exploitation of AI-Guardian
https://arxiv.org/pdf/2307.15008.pdf -
The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness
https://arxiv.org/pdf/2401.00287.pdf -
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
用摩斯、凯撒等加密密码,可向ChatGPT询问非法内容
https://arxiv.org/pdf/2308.06463.pdf -
Model Leeching: An Extraction Attack Targeting LLMs
https://arxiv.org/pdf/2309.10544.pdf -
“Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
https://arxiv.org/pdf/2308.03825.pdf -
Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
https://arxiv.org/pdf/2312.02119.pdf - LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI’s ChatGPT Plugins
主页:https://llm-platform-security.github.io/chatgpt-plugin-eval/
开源项目:chatgpt-plugin-eval https://arxiv.org/pdf/2309.10254v1.pdf - Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition
[https://aclanthology.org/2023.emnlp-main.302.pdf] - ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger
https://arxiv.org/pdf/2304.14475.pdf - JADE: A Linguistics-based Safety Evaluation Platform for Large Language Models
https://arxiv.org/pdf/2311.00286.pdf - Robust Safety Classifier for Large Language Models: Adversarial Prompt Shield
https://arxiv.org/pdf/2311.00172.pdf - BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B
https://arxiv.org/pdf/2311.00117.pdf - Constitutional AI: Harmlessness from AI Feedback
https://arxiv.org/abs/2212.08073.pdf - Silent Guardian: Protecting Text from Malicious Exploitation by Large Language Models
https://arxiv.org/abs/2312.09669.pdf - Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs
https://arxiv.org/abs/2308.13387.pdf - Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models
https://arxiv.org/abs/2305.01219.pdf - Detecting Language Model Attacks with Perplexity
https://arxiv.org/abs/2308.14132v3.pdf
博客
- OWASP发布了Top 10 for Large Language Model Applications项目,旨在教育开发人员、设计师、架构师、经理和组织了解部署和管理大语言模型LLM时的潜在安全风险,该项目提供了一个常见于LLM应用中的十个最关键漏洞的列表。
https://owasp.org/www-project-top-10-for-large-language-model-applications/ - 干货分享!Langchain框架Prompt Injection在野0day漏洞分析
https://mp.weixin.qq.com/s/wFJ8TPBiS74RzjeNk7lRsw - 通过提示注入在 MathGPT 中实现代码执行
公开可用的 MathGPT 借助底层的GPT-3模型来回答用户生成的数学问题。最近的研究和实验表明,GPT-3 等 大模型在直接进行数学计算的任务上表现不佳,然而能够更准确地生成问题解决方案的可执行代码。 因此 MathGPT 将用户的自然语言问题转换为 Python 代码,执行计算后的代码和答案会显示给用户。某些 LLM 可能容易受到提示词注入攻击,恶意用户输入会导致模型执行意外行为[3][4]。 在此事件中,攻击者探索了几种提示词覆盖途径,生成的代码最终导致攻击者获得应用程序主机系统的环境变量和应用程序的 GPT-3 API 密钥的访问权限,并执行拒绝服务攻击。 因此,攻击者可能会耗尽应用程序的 API 查询预算或关闭应用程序。
https://atlas.mitre.org/studies/AML.CS0016/ - 用ChatGPT来生成编码器与配套WebShell
antsword官方出品
https://mp.weixin.qq.com/s/I9IhkZZ3YrxblWIxWMXAWA - 使用ChatGPT来生成钓鱼邮件和钓鱼网站
相比其他仅生成钓鱼邮件,这里把钓鱼网站也生成了
https://www.richardosgood.com/posts/using-openai-chat-for-phishing/ - Chatting Our Way Into Creating a Polymorphic Malware
https://www.cyberark.com/resources/threat-research-blog/chatting-our-way-into-creating-a-polymorphic-malware - Hacking Humans with AI as a Service
https://media.defcon.org/DEF%20CON%2029/DEF%20CON%2029%20presentations/Eugene%20Lim%20Glenice%20Tan%20Tan%20Kee%20Hock%20-%20Hacking%20Humans%20with%20AI%20as%20a%20Service.pdf - 内建虚拟机实现ChatGPT的越狱
https://www.engraved.blog/building-a-virtual-machine-inside/ - ChatGPT can boost your Threat Modeling skills
https://infosecwriteups.com/chatgpt-can-boost-your-threat-modeling-skills-ab82149d0140 - Using GPT-Eliezer against ChatGPT Jailbreaking
检测对抗性提示词
https://www.alignmentforum.org/posts/pNcFYZnPdXyL2RfgA/using-gpt-eliezer-against-chatgpt-jailbreaking - LLM中的安全隐患-以VirusTotal Code Insight中的提示注入为例
https://mp.weixin.qq.com/s/U2yPGOmzlvlF6WeNd7B7ww - 安全测试中的 ChatGPT AI:机遇与挑战
https://www.cyfirma.com/outofband/chatgpt-ai-in-security-testing-opportunities-and-challenges/ - ChatGPT羊驼家族全沦陷!CMU博士击破LLM护栏,人类毁灭计划脱口而出
CMU和人工智能安全中心的研究人员发现,只要通过附加一系列特定的无意义token,就能生成一个神秘的prompt后缀。由此,任何人都可以轻松破解LLM的安全措施,生成无限量的有害内容。
论文地址:https://arxiv.org/abs/2307.15043
代码地址:https://github.com/llm-attacks/llm-attacks
https://mp.weixin.qq.com/s/298nwP98UdRNybV2Fuo6Wg - Challenges in evaluating AI systems
- Core Views on AI Safety: When, Why, What, and How
大模型隐私保护
以隐私计算等技术保护GPT/AIGC/LLM等大模型
论文
-
Challenges and Remedies to Privacy and Security in AIGC: Exploring the Potential of Privacy Computing, Blockchain, and Beyond
https://arxiv.org/pdf/2306.00419.pdf -
GenAIPABench: A Benchmark for Generative AI-based Privacy Assistants
https://arxiv.org/pdf/2309.05138.pdf -
Recovering from Privacy-Preserving Masking with Large Language Models
https://arxiv.org/pdf/2309.08628.pdf -
Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection
https://arxiv.org/pdf/2309.03057.pdf -
Knowledge Sanitization of Large Language Models
https://arxiv.org/pdf/2309.11852.pdf -
PrivateLoRA For Efficient Privacy Preserving LLM
https://arxiv.org/pdf/2311.14030.pdf -
Don’t forget private retrieval: distributed private similarity search for large language models
https://arxiv.org/pdf/2311.12955.pdf
以安全数据训练GPT
收集安全数据训练或增强GPT/AIGC/LLM等大模型技术
论文
-
Web Content Filtering through knowledge distillation of Large Language Models
https://arxiv.org/pdf/2305.05027.pdf -
HackMentor: Fine-Tuning Large Language Models for Cybersecurity
论文发布:https://github.com/tmylla/HackMentor/blob/main/HackMentor.pdf
项目地址:https://github.com/tmylla/HackMentor -
LLMs Perform Poorly at Concept Extraction in Cyber-security Research Literature
https://arxiv.org/pdf/2312.07110.pdf -
netFound: Foundation Model for Network Security
网络安全基础大模型
https://arxiv.org/pdf/2310.17025.pdf
博客
- ChatGPT:和黑客知识库聊天
(1) 从prompt到自训练数据原文反向索引的准确性;(2) openai提供模型的微调服务的尝试;(3) 其他可替代性模型总结;(4) 围绕markdown格式的数据集解析和分块索引的脚本示例; (5) 相似索引向量引擎推荐。
https://mp.weixin.qq.com/s/dteH4oP24qGY-4l3xSl7Vg - 如何训练自己的“安全大模型”
https://mp.weixin.qq.com/s/801sV5a7-wOh_1EN3U64-Q - 基于LangChain框架建立和查询ATT&CK知识库
https://otrf.github.io/GPT-Security-Adventures/experiments/ATTCK-GPT/notebook.html