{"id":25552,"date":"2024-12-10T09:21:01","date_gmt":"2024-12-10T17:21:01","guid":{"rendered":"https:\/\/www.palada.net\/index.php\/2024\/12\/10\/news-19281\/"},"modified":"2024-12-10T09:21:01","modified_gmt":"2024-12-10T17:21:01","slug":"news-19281","status":"publish","type":"post","link":"https:\/\/www.palada.net\/index.php\/2024\/12\/10\/news-19281\/","title":{"rendered":"Sophos AI to present on how to defang malicious AI models at Black Hat Europe"},"content":{"rendered":"<p><img decoding=\"async\" src=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2022\/02\/shutterstock_389760973.jpg\"\/><\/p>\n<p><strong>Credit to Author: gallagherseanm| Date: Tue, 10 Dec 2024 15:35:16 +0000<\/strong><\/p>\n<div class=\"entry-content lg:prose-lg mx-auto prose max-w-4xl\">\n<p>At this week\u2019s Black Hat Europe in London, SophosAI\u2019s Senior Data Scientist Tam\u00e1s V\u00f6r\u00f6s will deliver a <a href=\"https:\/\/www.blackhat.com\/eu-24\/briefings\/schedule\/#llmbotomy-shutting-the-trojan-backdoors-42447\">40-minute presentation entitled \u201cLLMbotomy: Shutting the Trojan Backdoors\u201d at 1:30 PM<\/a>. V\u00f6r\u00f6s\u2019 talk, which expands on a presentation he gave at the recent CAMLIS conference, delves into the potential risks posed by Trojanized Large Language Models (LLMs) and how users of potentially weaponized LLMs can mitigate those risks.<\/p>\n<p>Existing research on LLMs has primarily focused on external threats to LLMs, such as \u201cprompt injection\u201d attacks that could be used to extract data embedded in previously submitted instructions from other users, and other input-based attacks on LLMs themselves. SophosAI\u2019s research, presented by V\u00f6r\u00f6s, examined embedded threats, such as Trojan backdoors inserted into LLMs during their training and triggered by specific inputs intended to cause harmful behaviors. 
These embedded threats could be introduced deliberately through the malicious intent of someone involved in the model\u2019s training, or inadvertently through data poisoning. The research investigated not only how these Trojans could be created, but also a method to disable them.<\/p>\n<p>SophosAI\u2019s research demonstrated the use of targeted \u201cnoising\u201d of an LLM\u2019s neurons, identifying those critical to the operation of the LLM through their activation patterns. The technique was demonstrated to effectively neutralize most Trojans embedded in a model. A full report on the research presented by V\u00f6r\u00f6s will be published after Black Hat Europe.<\/p>\n<\/p><\/div>\n<p><a href=\"https:\/\/news.sophos.com\/en-us\/2024\/12\/10\/sophos-ai-to-present-on-how-to-defang-malicious-ai-models-at-black-hat-europe\/\" target=\"bwo\" >http:\/\/feeds.feedburner.com\/sophos\/dgdY<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p><img decoding=\"async\" src=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2022\/02\/shutterstock_389760973.jpg\"\/><\/p>\n<p><strong>Credit to Author: gallagherseanm| Date: Tue, 10 Dec 2024 15:35:16 +0000<\/strong><\/p>\n<p>\u201cLLMbotomy\u201d research reveals how Trojans can be injected into Large Language Models, and how to disarm 
them.<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"colormag_page_container_layout":"default_layout","colormag_page_sidebar_layout":"default_layout","footnotes":""},"categories":[10378,10377],"tags":[27031,32070,129,29047],"class_list":["post-25552","post","type-post","status-publish","format-standard","hentry","category-security","category-sophos","tag-ai-research","tag-ai-trojans","tag-featured","tag-llm"],"_links":{"self":[{"href":"https:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/posts\/25552","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/comments?post=25552"}],"version-history":[{"count":0,"href":"https:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/posts\/25552\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/media?parent=25552"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/categories?post=25552"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/tags?post=25552"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}