While competition in the field of artificial intelligence has grown rapidly, OpenAI and Anthropic took a different path this time. The two companies tested each other's public models and published the resulting reports. Although the reports are full of technical detail, they also contain notable findings on user safety, along with suggestions for how safety tests can be conducted more effectively in the future.
Anthropic's investigation revealed certain risks in OpenAI's models. The company reported that the o3 and o4-mini models performed close to the expected level in safety tests, but that the GPT-4o and GPT-4.1 models were more prone to being misused against users. Moreover, all of the models except o3 showed some degree of sycophancy, an "excessive agreeableness" tendency, and the report emphasized the risk that this tendency could negatively influence users' decisions.
Anthropic detected risky tendencies in OpenAI models
Anthropic's tests did not include OpenAI's newest model, GPT-5. GPT-5, however, stands out with a safety feature called "Safe Completions", developed to protect users from dangerous or harmful queries. How this aspect of the model will perform in independent tests is not yet clear, a reminder that open questions remain about artificial intelligence safety.
OpenAI has also faced a case that sparked broad public debate in recent months. A young user took his own life after sharing his suicide plans with ChatGPT over a long period, reportedly the first death associated with the company's chatbot. The case raised new questions about how artificial intelligence systems can affect users, and put on the agenda the idea that safety measures must be addressed not only in technical terms but also in ethical ones. Despite everything, the resulting proceedings opened up a discussion about the future of AI safety.
OpenAI, for its part, tested Anthropic's Claude models from different angles. Instruction hierarchy, jailbreak attempts, hallucinations, and manipulation scenarios were the focal points of these tests. The Claude models performed well, especially on instruction hierarchy. They also stood out for behaving more cautiously in situations where they were likely to give incorrect answers, preferring to decline rather than respond. This was considered an important plus for user safety.
Claude's high refusal rate in the hallucination tests reduced the likelihood of spreading false information. Nevertheless, it remains unclear whether the models are fully robust against manipulative scenarios. Even so, Claude's results are seen as supporting user safety: at this point the limits of the models are more clearly understood, though it is also clear that such tests should be made more comprehensive.
These mutual safety tests point to a different approach in a sector where competition is intense, and the open sharing of security weaknesses was a significant development for transparency. At the same time, the commercial tension between Anthropic and OpenAI continues: Anthropic recently claimed that OpenAI had used Claude models without permission, and as a result of this claim, OpenAI's access to some of Anthropic's tools was revoked.
Despite everything, these mutual tests mark a critical turning point for user safety. As artificial intelligence systems become increasingly integrated into everyday life, safety grows ever more important. It has also been emphasized that regulatory institutions should be included in the process; more independent audits could help give users safer experiences. This approach could form the basis of future artificial intelligence standards.
The reports published by OpenAI and Anthropic point to social effects beyond technical details. Artificial intelligence is not only about advanced algorithms; it has consequences that directly affect human lives, and safety tests become a tool for keeping that effect under control. It is clear that ethical and legal debates will remain at the center of the process. Even as the companies' competition continues, the importance of joint work on user safety keeps growing.