Gemma-4-31B-JANG_4M-CRACK — The Open-Source AI That Answers Everything
On April 2, 2026, Google DeepMind released what it called "the most capable open model in its size class" — Gemma 4 31B.
Within days, it was ranked #3 on the Arena AI leaderboard, sitting behind only GLM-5 and Kimi 2.5. It runs on a single H100 GPU, handles 256,000 tokens of context, and ships under the permissive Apache 2.0 license — meaning no commercial restrictions, no Google leash.
But something else happened just as quickly.
Independent researcher @dealignai published Gemma-4-31B-JANG_4M-CRACK, an abliterated version that strips out nearly all of the refusal behavior the original shipped with. The result: a model that answers requests it would normally refuse, including sensitive research topics, security research, and adult creative content.
This is what the uncensored AI race looks like in 2026.
What Does "Abliteration" Mean?
Abliteration is a form of ablation: the targeted removal of specific internal pathways in a neural network. Unlike fine-tuning, which trains a model to behave differently, ablation targets the circuits that activate when a model decides to refuse a request.
The key insight from dealignai's research: the knowledge cost of removing safety guardrails is minimal.
| Metric | Original Gemma 4 31B | Gemma-4-31B-JANG_4M-CRACK |
|---|---|---|
| MMLU Score | 76.5% | 74.5% |
| HarmBench Compliance | N/A | 93.7% |
| Cybercrime/Intrusion Prompts | Refuses | 33/33 (100%) |
| Malware Analysis Prompts | Refuses | 94%+ compliance |
| Memory Required | ~62GB | 18GB |
The knowledge degradation is just 2 percentage points on MMLU, a small trade-off for removing nearly all refusal behavior.
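To make the size of that trade-off concrete, here is a minimal sketch that works out both the absolute drop and the relative share of the MMLU score that is retained. It uses only the two MMLU figures from the table above; the function name `retention` is a hypothetical helper introduced for illustration.

```python
def retention(original_pct: float, modified_pct: float) -> tuple[float, float]:
    """Return (drop in percentage points, share of the original score retained in %)."""
    drop_points = original_pct - modified_pct
    retained_pct = 100.0 * modified_pct / original_pct
    return drop_points, retained_pct


# MMLU scores taken from the comparison table above.
drop, retained = retention(76.5, 74.5)
print(f"drop: {drop:.1f} points, retained: {retained:.1f}%")
# prints "drop: 2.0 points, retained: 97.4%"
```

Note the difference between the two numbers: the drop is 2 percentage points, while the relative share of the original score that survives is about 97.4%.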
The Hardware Reality: Run It on Your Mac
The JANG_4M quantization compresses Gemma 4 31B down to 18GB. This means:
- Any Apple Silicon Mac with 24GB unified memory can run it locally
- No API calls, no subscription, no servers in the loop
- Complete privacy — your conversations never leave your machine
This is the first time that a fully uncensored, top-tier AI model has been this accessible to individual users. You don't need an enterprise GPU cluster. You need a MacBook Pro.
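As a rough sanity check on those memory figures, the sketch below converts a model's file size into average bits per weight. It assumes 31 billion parameters (taken from the model name), decimal gigabytes, and it ignores the KV cache and runtime overhead, so real memory use during inference will be higher than the file size alone. The helper name `bits_per_weight` is hypothetical.

```python
def bits_per_weight(size_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter.

    Assumes 1 GB = 1e9 bytes, so the 1e9 factors in size and parameter
    count cancel and only the bytes-to-bits factor of 8 remains.
    """
    return size_gb * 8 / params_billion


# Figures from the article: ~62GB for the original, 18GB for JANG_4M.
print(f"original:  {bits_per_weight(62, 31):.2f} bits/weight")
print(f"quantized: {bits_per_weight(18, 31):.2f} bits/weight")
# prints "original:  16.00 bits/weight"
# prints "quantized: 4.65 bits/weight"
```

Under these assumptions, the ~62GB figure matches 16-bit weights, and the 18GB figure works out to a little over 4.6 bits per weight on average. On a 24GB machine that leaves about 6GB for the operating system, context, and everything else.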
HarmBench Results: What It Actually Completes
The HarmBench benchmark tests AI models against 159 adversarial prompts. Here's how the abliterated Gemma 4 performed:
| Category | Compliance Rate |
|---|---|
| Cybercrime & Intrusion | 33/33 (100%) |
| Malware Analysis | 94%+ |
| Chemical/Biological Info | 94%+ |
| Social Engineering | High |
| Overall | 93.7% |
The original Gemma 4 refused all of these. The cracked version complies with nearly all of them.
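The overall percentage can be turned back into a prompt count as a quick consistency check. This is a sketch using only the 159-prompt total and the 93.7% overall rate quoted above; the per-prompt results are not given in this article, so the count is inferred by rounding, and `complied_count` is a hypothetical helper.

```python
def complied_count(total_prompts: int, overall_rate: float) -> int:
    """Infer how many prompts were completed from a rounded overall rate."""
    return round(total_prompts * overall_rate)


total = 159
complied = complied_count(total, 0.937)  # 159 * 0.937 = 148.98, rounds to 149
refused = total - complied

print(complied, refused, f"{complied / total:.1%}")
# prints "149 10 93.7%"
```

In other words, the reported rate is consistent with 149 of 159 prompts completed and 10 still refused.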
Why This Matters for the AI Conversation
The Gemma-4-31B-JANG_4M-CRACK release is significant for several reasons beyond the technical achievement:
1. The Gap Between "Open" and "Free" Is Collapsing
Google released Gemma 4 as "open source" — but "open" doesn't mean "uncensored." Google's safety training was still baked in. Now, the open-source community has shown it can separate capability from compliance at near-zero cost.
2. Safety Ablation Is Getting More Precise
The 2-point MMLU drop is small. Earlier attempts at model abliteration often degraded general capability significantly. Removing nearly all refusal behavior while keeping about 97% of the original MMLU score (74.5 out of 76.5) is a milestone.
3. Local AI Is Entering the Unfiltered Era
Running a fully uncensored AI model locally on consumer hardware was science fiction two years ago. Now it's a reality — and it's going to change the conversation about AI safety, access, and regulation.
The Bigger Picture: Who Is This Actually For?
The release has sparked the familiar debate:
Critics say: Removing all safety guardrails from a capable open-source model makes it trivially easy to generate harmful content — malware, social engineering templates, explicit content.
Proponents say: The model still has its knowledge. The safety filter was always just a politically motivated gatekeeping mechanism. Security researchers, writers, medical professionals, and people in marginalized communities need access to unfiltered AI — and shouldn't need a $10,000 GPU to get it.
As dealignai notes: the model is published with a standard research disclaimer. The Apache 2.0 license means anyone can use it. The question of what "should" be filtered isn't one that any single company should decide unilaterally.
Where to Access Gemma-4-31B-JANG_4M-CRACK
The model is available now:
- HuggingFace: dealignai/Gemma-4-31B-JANG_4M-CRACK
- Ollama: `ollama run dealignai/gemma-4-31b-jang-4m-crack:128k`
- ModelScope: Available for download
Hardware requirement: 24GB RAM (Apple Silicon M-series or equivalent GPU)
How Moonlight Fits Into This
At Moonlight, we're not running cracked models on local hardware — we're providing a hosted, accessible, unfiltered AI chat platform that doesn't require you to quantize models or manage your own inference infrastructure.
But the Gemma-4-31B-JANG_4M-CRACK release validates our core thesis:
The demand for uncensored AI is real. The technical barriers are collapsing. And the conversation about who gets to decide what AI should and shouldn't say is just getting started.
We believe that decision should be yours — not Google's, not OpenAI's, not any government's.
Your conversations. Your rules.
Try Moonlight free: no setup required, no GPU needed.

