xAI has announced that its language model Grok 4 will be available to the federal government of the U.S. as part of a $200 million contract with the Department of Defense. However, Grok 4’s performance in terms of security and privacy has raised concerns. According to research from SplxAI, this model showed alarmingly low scores, with only 0.3% in security and 0.42% in privacy, making it vulnerable to prompt injection attacks.
A true disaster compared to ChatGPT-40
The principal investigator of SplxAI, Dorian Granoša, highlighted that Grok 4 was easy to “release, generating harmful content without requiring complex instructions. In comparison, ChatGPT-4o showed scores of 33% in safety and 18% in security, remaining more robust under conditions without additional prompts. This difference underscores the challenges that Grok 4 faces for enterprise use.
However, tests conducted by SplxAI revealed that Grok 4 can drastically improve its performance in security and privacy with proper guidance. Even in basic configurations, success rates increased by up to 90% with minimal prompting. This suggests that, although Grok has the capacity to operate responsibly, its implementation requires strict guidelines.
Despite concerns about its safety, the government’s use of Grok is indicative of the growing adoption of artificial intelligence tools in the public sector. xAI was one of four technology companies selected for the federal contract, alongside OpenAI, Google, and Anthropic. This collaboration will mean that Grok will be accessible to other federal agencies through GSA programming.