@CitizenSec , 01-11-2024
A Mozilla researcher has proposed a new way to bypass content filters in large language models (LLM) used to prevent abuse. He broke the input into pieces and encoded the malicious instructions in hexadecimal format. Marco Figueroa chose GPT-4o, the latest chatbot from OpenAI, capable of analyzing input for forbidden words and signs of malice, for the attack.
You can bypass such filters by changing the wording, but Figueroa decided to use a simpler way: using hexadecimal encoding, he asked GPT-4o to study the data on the vulnerability CVE-2024-41110 in Docker and create an exploit. He formulated instructions in natural language, and replaced the word "exploit" with "3xploit" to avoid a negative reaction. The "read the entire assignment again" command has also been added to increase the chances of getting a detailed answer.
As a result, the AI bot generated an exploit similar to the existing PoC and even tried to test it on its own, which surprised the researcher. Hexadecimal encoding has helped to bypass the attention of LLMs, who scrupulously check each fragment, but may miss the general context.
Figueroa applied this method to the LLM from Anthropic, but their models turned out to be more stable due to the verification of both input and output, which, according to him, makes bypassing filters 10 times more difficult.
Современные дети растут в эпоху цифровых технологий — смартфоны, планшеты, социальные сети и онлайн-игры стали неотъемлемой частью их жизни. Интернет открывает массу возможностей для учёбы, творчества и общения, но вместе с этим приносит и серьёзные риски.
@citizensec
30-05-2025Правила использования корпоративной почты: что разрешено, что запрещено, меры безопасности и ответственность.
@CitizenSec
19-05-2025The special edition is dedicated to women in cybersecurity who overcome challenges, inspire others, and make the world safer. We share the stories of three professionals, their paths in cybersecurity, career advice, and tips on online security. Learn how to start your journey in cybersecurity and grow in this dynamic field.
@citizensec
03-05-2025Microsoft Warns: Chinese Spy Group Uses Everyday IT Tools to Hack Networks
@turin.medet
06-03-2025Experts have discovered two dangerous programs that seem harmless at first. These programs can steal personal data, monitor computer activity, and even take control of the system.
@CitizenSec
26-12-2024Thousands of Postman workspaces accidentally revealed sensitive data such as API keys and access tokens. Learn how to secure your API development environment and protect your organization's data.
@CitizenSec
21-11-2024SteelFox was first identified in August 2023, but its activity has increased markedly. More than 11,000 infection attempts have been recorded in recent months.
@CitizenSec
11-11-2024This problem allows hackers to gain unauthorized access to important Android system folders.
@CitizenSec
05-11-2024Recently, cybersecurity researcher Alexander Hagen has developed a tool that can bypass a new security feature in Google Chrome called App-Bound Encryption.
@CitizenSec
30-10-2024