{"id":21490,"date":"2023-03-16T03:21:00","date_gmt":"2023-03-16T11:21:00","guid":{"rendered":"https:\/\/www.palada.net\/index.php\/2023\/03\/16\/news-15221\/"},"modified":"2023-03-16T03:21:00","modified_gmt":"2023-03-16T11:21:00","slug":"news-15221","status":"publish","type":"post","link":"http:\/\/www.palada.net\/index.php\/2023\/03\/16\/news-15221\/","title":{"rendered":"GPT for you and me: Applying AI language processing to cyber defenses"},"content":{"rendered":"<p><strong>Credit to Author: gallagherseanm| Date: Thu, 16 Mar 2023 10:00:07 +0000<\/strong><\/p>\n<div class=\"entry-content lg:prose-lg mx-auto prose max-w-4xl\">\n<p>A natural language processing architecture from <a href=\"https:\/\/openai.com\/\">OpenAI<\/a> has been getting a lot of attention lately. The latest version of the Generative Pre-trained Transformer (GPT) model, GPT-3.5\u2014the algorithmic brain of ChatGPT\u2014has generated waves of both amazement and concern. Among those concerns is how it could be used for malicious purposes, including generating convincing phishing emails and even malware.<\/p>\n<p>Sophos X-Ops researchers, including Sophos AI Principal Data Scientist Younghoo Lee, have been examining ways to use an earlier version, GPT-3, as a force for good. Lee presented some early insights into how GPT-3 could be used to generate human-readable explanations of attacker behavior and similar tasks last August <a href=\"https:\/\/news.sophos.com\/en-us\/2022\/08\/08\/sophos-ai-presentations-at-black-hat-bsides-lv-and-def-con-ai-village\/\">at the BSides LV and Black Hat security conferences<\/a>. 
Lee has been the lead on three projects that could help defenders find and block malicious activity more effectively using large language models from the GPT-3 family:<\/p>\n<ul>\n<li>A natural language query interface for searching for malicious activity in XDR telemetry;<\/li>\n<li>A GPT-based spam email detector; and<\/li>\n<li>A tool for analyzing potential \u201cliving off the land\u201d binary (LOLBin) command lines.<\/li>\n<\/ul>\n<h3>Taking a few shots at natural language XDR searches<\/h3>\n<p>The first project is a prototype <a href=\"https:\/\/ai.sophos.com\/2022\/12\/15\/natural-language-query-interface-for-xdr-sql\/\">natural language query interface<\/a> for searching through security telemetry. The interface, based on GPT, takes commands written in plain English (\u201cShow me all processes that were named powershell.exe and executed by root user\u201d) and generates XDR-SQL queries from them\u2014without the user needing to understand the underlying database structure, or the SQL language itself.<\/p>\n<p>For example, in Figure 1 below, the sample information, along with the prompt engineering supplied in the form of a simple database schema, allows GPT-3 to determine that a sentence such as \u201cShow all the times that a user named \u2018admin\u2019 ran PowerShell.exe\u201d translates into the SQL query, \u201cSELECT * FROM Process_Table WHERE user=\u2019admin\u2019 AND process=\u2019PowerShell.exe\u2019\u201d.<\/p>\n<p>Lee fed two different GPT-3 family models\u2014called Curie and Davinci\u2014a selection of training examples, including information about the database schema and pairs of natural language commands and the SQL statements required to complete them. Using the samples as a guide, the model would convert a new natural language query into a SQL command:<\/p>\n<figure id=\"attachment_90500\" aria-describedby=\"caption-attachment-90500\" style=\"width: 640px\" class=\"wp-caption alignnone\"><a 
href=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide8.png\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-90500\" src=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide8.png\" alt=\"\" width=\"640\" height=\"360\" srcset=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide8.png 2048w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide8.png?resize=300,169 300w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide8.png?resize=768,432 768w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide8.png?resize=1024,576 1024w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide8.png?resize=1536,864 1536w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><figcaption id=\"caption-attachment-90500\" class=\"wp-caption-text\">Figure 1: An example of how few-shot learning is used to create natural language queries.<\/figcaption><\/figure>\n<p>To get better accuracy out of few-shot learning, you can keep adding more examples when submitting a task. But there\u2019s a practical limit to this, as GPT-3 limits how much data can be submitted as input. To boost accuracy without adding to the overhead, it\u2019s also possible to <a href=\"https:\/\/platform.openai.com\/docs\/guides\/fine-tuning\">fine-tune GPT-3 models<\/a> by training an enhanced model on a larger set of sample pairs like those used as few-shot guide inputs; the larger the number of samples, the better. GPT-3 models can continue to be fine-tuned over time as more data becomes available. And that tuning is cumulative; it\u2019s not necessary to run everything again from scratch each time more training data is applied.<\/p>\n<p>After initial runs of the few-shot method with sets of 2, 8, and 32 examples, it was clear that the experiment with the Davinci model, which is larger and more complex than Curie, was more successful, as shown in the table below. 
Using few-shot learning, the Davinci model was accurate just over 80 percent of the time when handling natural language questions that used data it had seen as part of the training set, and 70.5 percent of the time when dealing with questions including data the model had not seen before. Both models improved considerably with the introduction of fine-tuning, but the larger model could infer better because of its size and would be more useful in an actual application. Fine-tuning with 512 samples, and then with 1024, further improved accuracy:<\/p>\n<p>&nbsp;<\/p>\n<p>Figure 2: SQL-matching accuracy results<\/p>\n<p>This use of GPT-3 is currently an experiment, but the capability it explores is planned for future versions of Sophos products.<\/p>\n<h3>Filtering out the badness<\/h3>\n<p>Using a similar few-shot approach in another set of experiments, Lee applied GPT-3 to the tasks of spam classification and detecting malicious command strings.<\/p>\n<p>Machine learning has been applied to spam detection in the past, using different types of models. But Lee found that GPT-3 <a href=\"https:\/\/ai.sophos.com\/2022\/12\/15\/gpt-3-and-cybersecurity\/\">significantly outperformed other, more traditional machine learning approaches<\/a> when the amount of training data was small. As with the SQL-generating experiment, some \u201cprompt engineering\u201d was required.<\/p>\n<p>Formatting the input text is an important step in text completion tasks. As shown in Figure 3 below, an instruction and a few examples with their labels are included as a support set in the prompt, and a query example is appended. (This data is sent to the model as a single input.) 
Then, GPT-3 is asked to generate a response as its label prediction from the input:<\/p>\n<figure id=\"attachment_90501\" aria-describedby=\"caption-attachment-90501\" style=\"width: 640px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide9.png\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-90501\" src=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide9.png\" alt=\"\" width=\"640\" height=\"360\" srcset=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide9.png 2048w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide9.png?resize=300,169 300w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide9.png?resize=768,432 768w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide9.png?resize=1024,576 1024w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide9.png?resize=1536,864 1536w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><figcaption id=\"caption-attachment-90501\" class=\"wp-caption-text\">Figure 3: An example of how GPT-3 spam detection works, moving from instructions and the support set to the query and the returned response.<\/figcaption><\/figure>\n<h3>Deciphering LOLBins<\/h3>\n<p>Application of GPT-3 to finding commands targeting LOLBins (living-off-the-land binaries) is a slightly different sort of problem. It\u2019s difficult for humans to reverse-engineer command line entries, and even more so for LOLBin commands because they are often obfuscated, lengthy, and difficult to parse. Fortunately, GPT-3 in its current form is well-versed in code in many forms.<\/p>\n<p>If you\u2019ve looked at ChatGPT, you may already know that GPT-3 can write working code in multiple scripting and programming languages when given a natural language description of the desired functionality. 
But it can also be trained to do the opposite\u2014generating analytical descriptions from command lines or chunks of code.<\/p>\n<p>Once again the few-shot approach was used. With each command line string submitted for analysis, GPT-3 was given a set of 24 common LOLBin-style command lines with tags identifying their general category and a reference description, as shown below:<\/p>\n<figure id=\"attachment_90508\" aria-describedby=\"caption-attachment-90508\" style=\"width: 640px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Picture1.png\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-90508\" src=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Picture1.png\" alt=\"\" width=\"640\" height=\"219\" srcset=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Picture1.png 1422w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Picture1.png?resize=300,103 300w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Picture1.png?resize=768,262 768w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Picture1.png?resize=1024,350 1024w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><figcaption id=\"caption-attachment-90508\" class=\"wp-caption-text\">Figure 4: Some of the samples in JSON format used to train the command-line analyzer.<\/figcaption><\/figure>\n<p>Using the sample data, GPT-3 was configured to provide multiple potential descriptions of command lines. To get the most accurate description out of GPT-3, the SophosAI team decided to use an approach called back-translation\u2014a process in which the results of a translation from command string to natural language are fed back into GPT-3 to be translated into command strings again and compared to the original.<\/p>\n<p>First, multiple descriptions are generated from an input command line. Next, a command line is in turn generated from each of the generated descriptions. 
Finally, the generated command lines are compared to the original input to find the one that best matches, and the corresponding generated description is chosen as the best answer, as shown below:<\/p>\n<figure id=\"attachment_90509\" aria-describedby=\"caption-attachment-90509\" style=\"width: 640px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide10.png\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-90509\" src=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide10.png\" alt=\"\" width=\"640\" height=\"360\" srcset=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide10.png 2048w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide10.png?resize=300,169 300w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide10.png?resize=768,432 768w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide10.png?resize=1024,576 1024w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/Slide10.png?resize=1536,864 1536w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><figcaption id=\"caption-attachment-90509\" class=\"wp-caption-text\">Figure 5. 
How back-translation works.<\/figcaption><\/figure>\n<figure id=\"attachment_90510\" aria-describedby=\"caption-attachment-90510\" style=\"width: 640px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/command-prompt-engineering.png\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-90510 size-full\" src=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/command-prompt-engineering.png\" alt=\"Figure 6: an example of back-translation in action.\" width=\"640\" height=\"254\" srcset=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/command-prompt-engineering.png 767w, https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/command-prompt-engineering.png?resize=300,119 300w\" sizes=\"auto, (max-width: 640px) 100vw, 640px\" \/><\/a><figcaption id=\"caption-attachment-90510\" class=\"wp-caption-text\">Figure 6: an example of back-translation in action.<\/figcaption><\/figure>\n<p>Supplying a tag with the input for the suspected type of activity can improve the accuracy of the analysis, and in some cases the first- and second-best back-translation results can provide complementary information\u2014helping with more complex analysis.<\/p>\n<p>While not perfect, these approaches demonstrate the potential of using GPT-3 as a cyber-defender\u2019s co-pilot. 
The results of both the spam filtering and command line analysis efforts are posted to <a href=\"https:\/\/github.com\/sophos\/gpt3-and-cybersecurity\">SophosAI\u2019s GitHub page<\/a> as open source under the Apache 2.0 license, so those interested in trying them out or adapting them to their own analysis environments are welcome to build on the work.<\/p>\n<p>&nbsp;<\/p>\n<\/p><\/div>\n<p><a href=\"https:\/\/news.sophos.com\/en-us\/2023\/03\/16\/gpt-for-you-and-me-applying-ai-language-processing-to-cyber-defenses\/\" target=\"bwo\" >http:\/\/feeds.feedburner.com\/sophos\/dgdY<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p><img decoding=\"async\" src=\"https:\/\/news.sophos.com\/wp-content\/uploads\/2023\/03\/DALL\u00b7E-2023-03-15-13.48.02-A-robot-defending-a-fortress-against-cyber-bugs.png\"\/><\/p>\n<p><strong>Credit to Author: gallagherseanm| Date: Thu, 16 Mar 2023 10:00:07 +0000<\/strong><\/p>\n<p>Three SophosAI projects harness the model behind ChatGPT for better detection of malicious 
activity.<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"colormag_page_container_layout":"default_layout","colormag_page_sidebar_layout":"default_layout","footnotes":""},"categories":[10378,10377],"tags":[27031,129,28879,27030],"class_list":["post-21490","post","type-post","status-publish","format-standard","hentry","category-security","category-sophos","tag-ai-research","tag-featured","tag-sophos-ai","tag-sophos-x-ops"],"_links":{"self":[{"href":"http:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/posts\/21490","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"http:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/comments?post=21490"}],"version-history":[{"count":0,"href":"http:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/posts\/21490\/revisions"}],"wp:attachment":[{"href":"http:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/media?parent=21490"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/categories?post=21490"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.palada.net\/index.php\/wp-json\/wp\/v2\/tags?post=21490"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}