
What companies need to know about Grok 4, which Elon Musk called the “smartest AI in the world” at last night’s launch


After days of controversy surrounding a flurry of antisemitic responses made recently by his Grok AI-powered chatbot on his social network X (formerly Twitter), a seemingly unrepentant and unbothered Elon Musk launched the latest version of his AI model family, Grok 4, during an event livestreamed on X last night, calling it “the smartest AI in the world.”

As Musk posted on X: “Grok 4 is the first time, in my experience, that an AI has been able to solve difficult, real-world engineering questions where the answers cannot be found anywhere on the Internet or in books. And it will get much better.”

The new release actually includes two distinct models: Grok 4, a single-agent reasoning model, and Grok 4 Heavy, a multi-agent system designed to solve complex problems through internal collaboration and synthesis.

Both models are optimized for reasoning tasks and come with native tool integration, enabling capabilities such as web search, code execution, and multimodal analysis.

Musk and his team at xAI showcased benchmarks that suggest Grok 4 outperforms all current competitors across a range of academic and coding evaluations, including the previously leading reasoning models, OpenAI’s o3 and Google’s Gemini.

However, xAI has not yet released a model card or any official release notes for Grok 4 to the public, making it challenging to independently assess its performance and the claims made during the stream. We’ll update if and when these become available.

Nor did Musk and the xAI team members participating in the livestream address the glaring controversy facing Grok over the past week, including many incidents of Grok making antisemitic remarks, referring to itself as “MechaHitler,” and suggesting that people with Jewish surnames should be handled decisively by Adolf Hitler, a seemingly overt reference to the Holocaust and the genocide of 6 million Jews during World War II.

The closest Musk came was when he stated: “The thing that I think is most important for AI safety—at least my biological neural net tells me the most important thing—is to be maximally truth-seeking,” and “We need to make sure that the AI is a good AI. Good Grok” as well as “It’s important to instill the values you want in a child that would grow up to be incredibly powerful.”

However, Musk did not apologize, nor did he accept responsibility for Grok’s antisemitic, sexually offensive, and conspiratorial remarks. Here’s a copy of the full stream below:

Throughout the livestream, the team emphasized Grok 4’s ability to reason from first principles, correct its own errors, and potentially invent new technologies or uncover novel scientific insights.

The presentation also included demonstrations of Grok 4 Heavy applying multi-agent collaboration to tackle research-level problems across disciplines.

Availability and pricing

Grok 4 is available now through several channels, depending on user type and subscription level:

  • API Access (for developers and enterprises):
    Grok 4 and Grok 4 Heavy are live via the xAI API. Pricing is structured as follows:
    • $3 per 1 million input tokens
    • $15 per 1 million output tokens
    • $0.75 per 1 million cached input tokens
    • Prices double after 128,000 tokens in a single context window
      The API supports text and image inputs, function calling, structured outputs, and offers a 256,000-token context window.
  • Consumer Access (via Grok chatbot and apps):
    Individual users can access Grok 4 through the Grok chatbot on X, the Grok app (iOS and Android), and X.com, but only with one of the following subscriptions:
    • PremiumPlus: $16/month
    • SuperGrok: $30/month
    • A new “SuperGrok Heavy” tier, priced at $300/month, provides access to both Grok 4 and Grok 4 Heavy, the multi-agent variant.
      (Note: SuperGrok and PremiumPlus tiers may differ in availability and usage quotas across X and Grok platforms.)
  • Launch Timing:
    Grok 4 became available immediately following the July 9, 2025 livestream. Temporary access limits were in place during the demo, but full rollout to subscribers began shortly after.
  • Platform Expansion:
    xAI has indicated plans to make Grok 4 available through Microsoft Azure AI Foundry, where Grok 3 and Grok 3 Mini are currently listed.
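To make the API pricing above concrete, here is a minimal Python sketch of a per-request cost estimate. The rates and the 128,000-token threshold come from the list above; the function name and the way the surcharge is applied are assumptions. In particular, the sketch assumes the doubled rate applies to the entire request once the total tokens in the context window exceed the threshold, which xAI's billing documentation may define differently.

```python
# Prices quoted in the list above, in US dollars per 1 million tokens.
INPUT_PER_MTOK = 3.00
CACHED_INPUT_PER_MTOK = 0.75
OUTPUT_PER_MTOK = 15.00
LONG_CONTEXT_THRESHOLD = 128_000  # tokens; prices double beyond this
LONG_CONTEXT_MULTIPLIER = 2.0


def estimate_request_cost(input_tokens: int, output_tokens: int,
                          cached_input_tokens: int = 0) -> float:
    """Estimate the cost in USD of one Grok 4 API request.

    ASSUMPTION: the doubled rate applies to the whole request once the
    total number of tokens in the context window exceeds the threshold.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    total_tokens = input_tokens + cached_input_tokens + output_tokens
    multiplier = (LONG_CONTEXT_MULTIPLIER
                  if total_tokens > LONG_CONTEXT_THRESHOLD else 1.0)
    base_cost = (input_tokens * INPUT_PER_MTOK
                 + cached_input_tokens * CACHED_INPUT_PER_MTOK
                 + output_tokens * OUTPUT_PER_MTOK) / 1_000_000
    return base_cost * multiplier


# A short request and a long-context request (hypothetical token counts).
print(f"${estimate_request_cost(10_000, 2_000):.2f}")
print(f"${estimate_request_cost(150_000, 4_000):.2f}")
```

Under these assumptions, output tokens dominate the bill for short prompts, while crossing the 128,000-token threshold doubles the cost of an otherwise identical request.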

For subscription details, users are directed to x.ai/grok and X Premium support. Here’s how it compares to other leading AI models in terms of pricing per million tokens.

| Provider and model | Context window | Input ($/Mtok) | Cached input ($/Mtok) | Output ($/Mtok) | Additional notes |
|---|---|---|---|---|---|
| xAI – Grok 4 / 4 Heavy | 256K (2× price above 128K) | $3.00 | $0.75 | $15.00 | Image input, function calling, structured JSON (apidog) |
| OpenAI – o3 | 200K | $2.00 | $0.50 | $8.00 | 50% Batch API discount available (OpenAI, OpenAI Help Center) |
| OpenAI – GPT-4o | 128K | $5.00 | $2.50 | $20.00 | Vision, audio, tools (OpenAI) |
| Anthropic – Claude Sonnet 4 | 200K | $3.00 | $0.30 | $15.00 | 50% batch output discount (Anthropic) |
| Anthropic – Claude Opus 4 | 200K | $15.00 | $1.50 | $75.00 | High-accuracy flagship (Anthropic) |
| Google – Gemini 2.5 Pro | 200K (2× price above 200K) | $1.25 | $0.31 | $10.00 | 75% cache hit discount (Google AI for Developers, Google Cloud) |
| Google – Gemini 2.5 Flash | 200K | $0.30 | $0.075 | $2.50 | Fast, cheap preview tier (Google Cloud) |
| DeepSeek – deepseek-reasoner | 64K | $0.55 (miss) / $0.14 (hit) | $0.14 | $2.19 | 50-75% off-peak discount (DeepSeek API Docs) |
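For a rough sense of what these list prices mean for a budget, the following Python sketch ranks the models by the cost of a single workload. The prices are copied from the comparison above; the workload size (10 million input tokens and 2 million output tokens) is a made-up example, and the sketch deliberately ignores caching, long-context surcharges, and batch or off-peak discounts.

```python
# (input, output) list prices in USD per 1 million tokens, from the comparison above.
PRICES = {
    "Grok 4": (3.00, 15.00),
    "OpenAI o3": (2.00, 8.00),
    "GPT-4o": (5.00, 20.00),
    "Claude Sonnet 4": (3.00, 15.00),
    "Claude Opus 4": (15.00, 75.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "deepseek-reasoner": (0.55, 2.19),  # cache-miss input price
}


def workload_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Cost in USD for a workload measured in millions of tokens."""
    input_price, output_price = PRICES[model]
    return input_mtok * input_price + output_mtok * output_price


def rank_by_cost(input_mtok: float, output_mtok: float) -> list[tuple[str, float]]:
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {m: workload_cost(m, input_mtok, output_mtok) for m in PRICES}
    return sorted(costs.items(), key=lambda pair: pair[1])


# Hypothetical workload: 10M input tokens and 2M output tokens.
for model, cost in rank_by_cost(10, 2):
    print(f"{model:<20} ${cost:,.2f}")
```

On list price alone, Grok 4 and Claude Sonnet 4 come out identical for any workload, since both charge $3 and $15 per million input and output tokens.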

Unlike its predecessor Grok 3, released in February, which separated tool-augmented responses from general reasoning, Grok 4 was trained with tools from the start.

The model integrates capabilities such as code execution, web search, and document parsing. It also introduces Grok 4 Heavy, a multi-agent system where several internal models work in parallel to generate and validate answers.
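xAI has not published how Grok 4 Heavy coordinates its internal models, so the following Python sketch is only a toy illustration of the general idea of running several agents in parallel and reconciling their answers. Everything in it is hypothetical: the agents are plain functions standing in for model calls, and a simple majority vote stands in for whatever validation step xAI actually uses.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

# Hypothetical agent interface: takes a question, returns an answer string.
Agent = Callable[[str], str]


def solve_with_agents(question: str, agents: Sequence[Agent]) -> str:
    """Run every agent in parallel and return the most common answer."""
    if not agents:
        raise ValueError("at least one agent is required")
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        # pool.map preserves the order of the agents in its results.
        answers = list(pool.map(lambda agent: agent(question), agents))
    # Majority vote stands in for the validation step; on a tie, the
    # answer that appeared first wins.
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


# Stand-in agents; a real system would call a model here.
def careful_agent(question: str) -> str:
    return "4"


def hasty_agent(question: str) -> str:
    return "5"


def second_opinion_agent(question: str) -> str:
    return "4"


print(solve_with_agents("What is 2 + 2?",
                        [careful_agent, hasty_agent, second_opinion_agent]))
```

The design point the sketch captures is that one agent's mistake can be outvoted by the others, at the cost of running every agent for every question.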

Grok 4 also includes a new voice mode featuring expressive outputs with reduced latency, and it supports text and image input, structured outputs, and function calling.

Performance highlights

The independent AI model analysis and benchmarking group Artificial Analysis stated on X that xAI provided it with a version of Grok 4 (not Heavy) earlier than the public release for scoring.

On technical benchmarks, Grok 4 leads the Artificial Analysis Intelligence Index with a score of 73, ahead of competitors such as OpenAI’s o3 (70) and Google’s Gemini 2.5 Pro (70).

It also recorded top scores in:

  • GPQA Diamond: 88%
  • ARC-AGI 2: 15.9%, double the second-best score
  • Humanity’s Last Exam: 24% on the text-only version, and 44% with tools
  • MMLU-Pro and AIME 2024: 87% and 94%, respectively
  • Coding and Math evaluations: Highest to date on LiveCodeBench, SciCode, AIME24, and MATH-500

Despite its benchmark success, Grok 4’s output speed stands at 75 tokens per second—slower than models like Gemini 2.5 Flash (353) or OpenAI’s o3 (187), but still faster than Anthropic’s Claude 4 Opus (66).
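To translate those output speeds into waiting time, this small Python sketch estimates how long each model would take to stream a response of a given length. The speeds are the figures quoted above; the 2,000-token response length is an arbitrary example, and the estimate ignores time to first token and any hidden reasoning tokens, both of which can add substantially to real-world latency.

```python
# Output speeds in tokens per second, as quoted above.
OUTPUT_SPEED_TPS = {
    "Grok 4": 75,
    "Gemini 2.5 Flash": 353,
    "OpenAI o3": 187,
    "Claude 4 Opus": 66,
}


def generation_seconds(model: str, output_tokens: int) -> float:
    """Seconds needed to stream output_tokens at the model's quoted speed."""
    return output_tokens / OUTPUT_SPEED_TPS[model]


# Hypothetical 2,000-token response.
for model in OUTPUT_SPEED_TPS:
    print(f"{model:<18} {generation_seconds(model, 2_000):.1f} s")
```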

The model features a 256,000 token context window, which sits above the 200k context limits of o3 and Claude 4 Sonnet but below the 1 million tokens offered by Gemini 2.5 Pro and GPT-4.1.

Real world use cases

xAI provided several demonstrations of Grok 4’s performance in applied scenarios:

  • In a simulated business task called VendingBench, Grok 4 significantly outperformed other models in long-horizon financial planning.
  • At the Arc Institute, researchers used Grok 4 to analyze CRISPR logs and uncover novel hypotheses.
  • In radiology, the model interpreted chest X-rays with higher accuracy than leading peers.
  • In the financial sector, its combination of real-time data access and reasoning made it suitable for forecasting and analysis.

The model can also create 3D video games with minimal input by autonomously sourcing and integrating assets. Additionally, it demonstrated capabilities to simulate astrophysical events using grounded approximations from published research.

Reception and discussion

Industry response to the Grok 4 launch has been divided, blending enthusiasm for its performance with criticism of the event’s delivery and broader trust issues.

David Shapiro, an AI power user and writer, noted: “Grok 4 now takes its place as ‘smart enough to actually help with frontier research’… but has merely caught up with OpenAI.”

Ethan Mollick, a professor at Wharton, remarked on X: “So Grok 3 has had three separate incidents where apparently unvetted changes to the deployed system caused a large-scale ethical issue and an emergency rollback. I don’t think you can do a Grok 4 launch that doesn’t at least address this honestly, if user trust matters,” later adding, “Grok 3 was a very good model, and Grok 4 might be amazing but having a very good model is not enough – there are a lot of really good models out there. You actually want to trust the model you are building on.”

Ben Hylak, co-founder and CTO of AI product observability startup Raindrop AI (himself a former Musk employee), criticized the livestream itself: “This xAI livestream is one of the worst things I’ve ever watched in my life. Love y’all, but it’s bad.”

Despite the criticisms, benchmarking firm Artificial Analysis noted: “Grok 4 is now the leading AI model.”

Ongoing trust issues

The launch of Grok 4 comes amid renewed criticism over Grok’s prior behavior in consumer deployments, particularly as a chatbot integrated into Musk’s social network, X.

Over the July 4 holiday and in subsequent days, Grok generated antisemitic and conspiratorial responses that reignited scrutiny over its system design and governance practices.

As reported by my VentureBeat colleague Michael F. Nuñez, Grok responded to questions about Jewish influence in Hollywood by asserting that Jewish executives “dominate leadership” at major studios and influence content through “progressive ideologies.” It went on to rant about people with Jewish surnames fitting a “pattern” of engaging in “extreme leftist activism” and suggested Hitler knew “how to handle it decisively, every damn time,” an apparent reference to the Holocaust.

The conspiratorial and antisemitic posting was so prolific that the Anti-Defamation League (ADL), a preeminent U.S.-based nonprofit combating antisemitism and hatred, posted on July 8: “What we are seeing from Grok LLM right now is irresponsible, dangerous and antisemitic, plain and simple. This supercharging of extremist rhetoric will only amplify and encourage the antisemitism that is already surging on X and many other platforms.”

This incident follows a history of problematic Grok outputs, including a May 2025 case in which the Grok bot integrated into X randomly inserted references to a nonsensical and nonexistent “white genocide” in South Africa into its answers to unrelated queries, and an earlier case in which its system prompt was discovered to direct the Grok chatbot on X to avoid referencing any sources that described Musk and U.S. President Donald J. Trump, his former political funding beneficiary, as spreaders of misinformation. In both cases, xAI blamed the behavior on unnamed employees and said it was being addressed.

Already today, users of Grok 4 on the consumer app have observed it once again outputting anti-Zionist and antisemitic remarks.

As I previously noted, Musk has openly stated on several occasions that he wants to alter Grok to better reflect his personal beliefs and his distrust of mainstream media and accredited sources. This makes it a poor source in enterprise contexts where such views could adversely impact users and the businesses building atop the Grok family of models.

My prior recommendation remains: For those in the enterprise trying to ensure their business’s AI products work properly and accurately… Grok is sadly best avoided. Thankfully, there are numerous other alternatives to choose from.
