Qwen3 Chinese AI Complete Guide: Model Selection, Free Tiers & Ollama Pitfalls (2026)
The open-source AI community has quietly switched tracks. Qwen3's announcement hit 869 points on Hacker News, and LocalLLaMA users have shifted their default from Llama to Qwen. Yet if you search for a comprehensive Qwen3 guide focused on Chinese language quality, you'll find either fragmented press releases covering a single version or benchmark numbers with no practical usage advice.
This article provides a complete Qwen3 guide from a practical user's perspective: full version navigation from Qwen3 to Qwen3.6-Plus, an honest assessment of Chinese output quality, the real limitations of three free access paths, and two confirmed bugs you'll hit when deploying locally with Ollama.
TL;DR
- Chinese output quality: Default output may mix Simplified Chinese characters; adding "Please respond in Traditional Chinese" to your system prompt significantly improves quality, though overall performance still slightly trails Simplified Chinese
- Zero-barrier free access: OpenRouter Playground lets you try Qwen3.6-Plus immediately (rate-limited, free tier may end anytime); for fully offline use, deploy locally with Ollama
- Ollama + Qwen3.5 pitfalls: Thinking Mode infinite loop (GitHub #12917) and Tool Calling failure (GitHub #14493) are confirmed bugs — it's not your computer. Fix: use original Qwen3 version or switch to llama.cpp
- API cost: Content generation costs roughly $0.10/month; Agentic Coding mode token consumption can quickly exceed your Claude subscription
Qwen3 Has Six Major Versions — Pick the Wrong One and You'll Waste Your Time
First things first: the "Qwen3," "Qwen3.5," and "Qwen3.6-Plus" that media outlets mention are not the same thing. This series released six major versions from April 2025 to April 2026, with feature differences so significant that choosing the wrong version means wasted effort.
| Version | Release Date | Core Features | Best For |
|---|---|---|---|
| Qwen3 | 2025-04-29 | 8 models (2 MoE + 6 dense), 119 languages, Apache 2.0 | Local deployment starter (most stable) |
| Qwen3-Max-Thinking | 2026-01-27 | Reasoning flagship, image/video generation | Complex logic, math |
| Qwen3.5 | 2026-02-17 | 397B parameters, 201 languages, agent-enhanced | Large AI agent workflows |
| Qwen3.5-Omni | 2026-03-30 | Multimodal (text + image + audio + video), 256K context | Speech recognition, video analysis |
| Qwen3.6-Plus | 2026-04-02 | 1M token context, SWE-bench 78.8% | Agentic Coding, long document processing |
How to choose? If you're just getting started, Qwen3-8B (free locally, highly stable) is enough for everyday Chinese writing. For super-long documents or coding, use Qwen3.6-Plus via API. For speech recognition or video analysis, Qwen3.5-Omni directly competes with Gemini 3.1 Pro.
One important note: Qwen3.5 series has known bugs on Ollama (detailed later), so for local deployment, the original Qwen3 version is actually more stable.
Chinese Output Quality: An Honest Assessment of Character Accuracy, Local Terms & Hallucinations
Qwen3's official announcement explicitly lists "Traditional Chinese" in its 119-language support list. Sounds great, but in practice, Chinese — especially Traditional Chinese — is treated as a "second-class citizen."
Default output mixes Simplified characters. Without any special instructions, you may see Simplified variants where Traditional characters should appear. This isn't a bug — it's a result of training data being predominantly Simplified Chinese. The TMMLU+ (Taiwan Massive Multitask Language Understanding) benchmark confirms it: Traditional Chinese performance slightly trails Simplified Chinese overall.
The fix is simple but you need to know about it. Add this to the beginning of your system prompt:
```
Please respond in Traditional Chinese (繁體中文) using Taiwanese terminology and grammar.
```
After adding this, output quality improves noticeably. Taiwan-specific terms like local transit and healthcare terminology are usually handled correctly, though some character variants still need explicit specification.
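If you call the model through an OpenAI-compatible API rather than a chat UI, the same instruction belongs in the system message. A minimal sketch; the model name in the usage comment is illustrative, not confirmed:

```python
# Prepend the Traditional Chinese instruction as a system message so it
# applies to every turn, not just the first user prompt.
SYSTEM_PROMPT = (
    "Please respond in Traditional Chinese (繁體中文) "
    "using Taiwanese terminology and grammar."
)

def make_messages(user_text: str) -> list[dict]:
    """Build an OpenAI-style message list with the zh-TW system prompt first."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_text},
    ]

# Usage with any OpenAI-compatible client would look roughly like:
#   client.chat.completions.create(model="qwen3.6-plus",
#                                  messages=make_messages("幫我潤飾這段文字"))
```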
Hallucination is a real concern. A hands-on test by Taiwanese blogger The Walking Fish found that physics simulation tests failed and FAQ summarization produced non-existent content. Developers on Twitter have also warned directly: "The Qwen series has notable hallucination issues — don't trust its subjective descriptions entirely."
For low-risk tasks like drafting blog posts, initial translations, and note organization, Qwen3 works well. But for financial data, legal texts, or medical information, always verify with human review.
One more limitation: Traditional Chinese image generation still has issues. The community confirms that "the old problem of AI failing to correctly generate Traditional Chinese" persists.
Can My MacBook or PC GPU Run Qwen3? Complete Hardware Requirements
Based on comprehensive testing from hardware-corner.net and willitrunai.com, here are the VRAM requirements for Q4 quantized versions:
| Model | VRAM Needed (Q4) | Mac Unified Memory | PC GPU |
|---|---|---|---|
| Qwen3-0.6B / 1.7B | < 2GB | M1 Air 8GB | Any discrete GPU |
| Qwen3-4B | ~2.3GB | 8GB Mac | GTX 1060+ |
| Qwen3-8B | ~4.6GB | 16GB Mac | RTX 3060 8GB |
| Qwen3-14B | ~8.3GB | 32GB Mac | RTX 3080 Ti / 4080 |
| Qwen3-30B-A3B (MoE) | ~18GB | M3 Max 36GB | RTX 4090 24GB |
| Qwen3-32B | ~19GB | M3 Max 36GB (tight) | RTX 4090 24GB |
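These numbers follow a simple rule of thumb: Q4 quantization stores roughly 4.5 to 4.8 bits per weight, so VRAM scales almost linearly with parameter count. A rough estimator as a sketch; the 0.58 bytes-per-parameter constant is fitted to the table above, not an official figure:

```python
def q4_vram_gb(params_billions: float, bytes_per_param: float = 0.58) -> float:
    """Rough VRAM (GB) for a Q4-quantized model's weights.

    0.58 bytes/param (about 4.6 bits/weight) is fitted to the table above;
    real usage also needs headroom for the KV cache and context window.
    """
    return round(params_billions * bytes_per_param, 1)

print(q4_vram_gb(8))   # → 4.6
print(q4_vram_gb(32))  # → 18.6
```

Note that the MoE model is no exception on memory: all 30B weights must be resident even though only 3B are active per token, which is why 30B-A3B still needs ~18GB.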
Sweet spot: Qwen3-30B-A3B MoE. This Mixture-of-Experts model activates only 3B parameters per token, delivering much better efficiency than a same-size dense model. HackerNews users confirm both RTX 4090 and M3 Max run it smoothly.
Apple Silicon users get a bonus: with MLX optimization, community reports show Qwen3-Next-80B reaching 60-74 tokens/sec on M-series chips, with DFlash speculative decoding providing up to 4.13x speed improvements.
Bottom line: M2 MacBook Pro 16GB runs the 8B model perfectly for daily use. For better output quality, M3 Max 36GB with 30B-A3B is the current best local deployment combo. PC users with an RTX 4090 can run nearly everything.
Three Free Access Paths (April 2026 Status)
Free doesn't mean unlimited. Each path has its own invisible wall.
Path 1: OpenRouter Playground (Zero Barrier)
The fastest way. Open OpenRouter's Qwen3.6-Plus page and use the Playground directly without creating an account. You get access to the latest Qwen3.6-Plus with its 1M token context window.
Two caveats: First, the free tier has rate limits (roughly 20 requests/minute, 200/day) — exceeding them triggers 429 errors. Second, the free tier was originally slated to end in early April, but as of this writing remains available. This window could close anytime, so try it while you can.
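If you script against the free tier, treat 429 as expected and back off instead of hammering the endpoint. A minimal retry sketch; the `send` callable is a stand-in for your real HTTP request:

```python
import time

def call_with_backoff(send, max_retries: int = 5, base_delay: float = 1.0):
    """Call `send()` and retry with exponential backoff on HTTP 429.

    `send` is any zero-argument function returning a response object with a
    `status_code` attribute (a stand-in for your actual request function).
    """
    for attempt in range(max_retries):
        resp = send()
        if resp.status_code != 429:
            return resp
        time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    raise RuntimeError(f"still rate-limited after {max_retries} retries")
```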
Path 2: qwen.ai Official Playground (Account Required)
qwen.ai's Qwen Chat web interface is still free and supports Qwen3.5-Omni's multimodal capabilities (images, audio input). If you want to try speech recognition or video analysis, this is the most direct entry point.
However, OAuth API free quotas have been drastically reduced (from 1,000/day to 100/day), with full discontinuation expected around April 15, 2026. The web Playground is unaffected, but if you need API access for your own applications, the free era is essentially over.
Path 3: Ollama Local Deployment (Completely Free, Completely Offline)
The only truly "unlimited" path. After installing Ollama, one command downloads a model and you're ready to go — no rate limits, no account needed, data never leaves your computer.
The trade-off is you need sufficient hardware (see the requirements table above), and initial model downloads take time (8B model is about 4-5GB). The next section provides complete deployment steps.
My recommendation: Start with OpenRouter Playground — spend 5 minutes experiencing Qwen3.6-Plus's capabilities. If it works for you and you want long-term free access, learn Ollama.
Ollama Local Deployment: Complete Steps & Two Bugs You Must Know About
Installation Steps
Per the official Qwen Ollama documentation, three steps:
```shell
# 1. Install Ollama (download from ollama.ai for your OS)

# 2. Download a model (choose a size based on your hardware)
ollama pull qwen3:8b    # 16GB Mac or 8GB-VRAM PC
ollama pull qwen3:14b   # 32GB Mac or 12GB+ VRAM PC
ollama pull qwen3:30b   # 30B-A3B MoE: M3 Max 36GB or RTX 4090

# 3. Start an interactive chat
ollama run qwen3:8b
```
After starting, prefix your prompt with the /think or /no_think soft switch to control thinking mode:

```
/think Analyze the performance bottleneck in this code...
/no_think Translate this text to Chinese
```
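The same soft switches work when you talk to Ollama's local REST API (it listens on `http://localhost:11434` by default) instead of the interactive CLI. A standard-library-only sketch; the prefix convention follows the CLI examples above:

```python
import json
import urllib.request

def build_chat_request(prompt: str, think: bool, model: str = "qwen3:8b") -> dict:
    """Build a payload for Ollama's /api/chat endpoint, prefixing the
    /think or /no_think soft switch to the user message."""
    tag = "/think" if think else "/no_think"
    return {
        "model": model,
        "messages": [{"role": "user", "content": f"{tag} {prompt}"}],
        "stream": False,
    }

def chat(prompt: str, think: bool = False) -> str:
    """Send the request to a locally running Ollama server and return the reply."""
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps(build_chat_request(prompt, think)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]
```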
Bug 1: Qwen3.5 Series Thinking Mode Infinite Loop
This is a confirmed issue (GitHub Ollama #12917, QwenLM #1817). The model continuously outputs <think> content and never generates a final answer — your only option is to manually interrupt.
This affects Qwen3.5 series only, not the original Qwen3 version. Alibaba has acknowledged the hybrid thinking design flaw and split subsequent versions into separate Instruct and Thinking models.
Bug 2: Qwen3.5 Series Tool Calling Completely Broken
Another confirmed issue (GitHub Ollama #14493). Qwen3.5-27B tool calling is completely non-functional in Ollama, and repetition penalty parameters are silently ignored.
If you're using LangChain, LlamaIndex, or any OpenAI-compatible agentic workflow, the Ollama + Qwen3.5 combination will simply fail.
Workarounds
Both bugs have solutions:
- Use the original Qwen3 (`ollama pull qwen3:8b`), not the Qwen3.5 series
- Switch to a llama.cpp server instead of Ollama (the community recommends Bartowski quantized versions)
- Use the official API or OpenRouter — server-side doesn't have these issues
Most existing Qwen3 guides completely avoid mentioning these bugs. If you're a developer or indie maker, this is critical information before choosing your deployment method.
Thinking Mode: When to Enable, When to Skip
Thinking Mode shows the model's reasoning process (chain-of-thought), essentially letting AI show its work on a scratch pad.
Enable for: Complex logical reasoning, math, multi-step analysis, tasks requiring high accuracy. With it on, answers tend to be more accurate and hallucinations decrease.
Skip for: Quick translations, text polishing, simple Q&A. Thinking mode significantly increases response time, and quality improvement is negligible for these tasks.
Warning: In Ollama, the `enable_thinking: false` setting may not work — the model still outputs thinking processes. For stable Thinking Mode control, Qwen Chat web or the OpenRouter API is more reliable.
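When you do control thinking through an API, it is typically a request parameter rather than a prompt tag. A sketch of assembling the parameters; the `enable_thinking` field follows Qwen's API convention and the model name is illustrative, so verify both against your provider's documentation:

```python
def thinking_params(messages: list[dict], thinking: bool) -> dict:
    """Build chat-completion kwargs; the OpenAI SDK forwards fields it does
    not recognize, such as enable_thinking, via the extra_body argument."""
    return {
        "model": "qwen3.6-plus",  # illustrative model name, not confirmed
        "messages": messages,
        "extra_body": {"enable_thinking": thinking},
    }

# Usage with an OpenAI-compatible client:
#   client.chat.completions.create(**thinking_params(msgs, thinking=False))
```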
Qwen3 vs Claude vs Gemma 4: Which Is Best for Chinese Writing?
Let's cut to the chase: this isn't a "which is best" contest — it's about building the right tool combination.
BenchLM.ai's 2026 Chinese LLM rankings show: GLM-5 Reasoning (85) > GLM-5.1 (84) > Qwen3.5-397B Reasoning (81). Qwen3.5 holds a solid top-3 position among Chinese LLMs, though the best Chinese models still trail top proprietary models by about 9 points.
From a practical perspective, each tool has its ideal use case:
| Tool | Strongest Use Case | Weakness | Cost |
|---|---|---|---|
| Qwen3 | Chinese content generation | More hallucinations, Traditional Chinese slightly weaker | Free (local) / very low API cost |
| Claude | English writing, complex reasoning, high-accuracy tasks | Chinese isn't its home turf, higher API cost | $3.00/1M input (Sonnet) |
| Gemma 4 | Creative writing, experimental content | Weaker Chinese ecosystem | Free (local) |
Practical strategy: Use Qwen3 for Chinese content drafts (free or minimal cost), Claude for English technical docs and high-accuracy tasks, Gemma 4 for creative writing experiments. Qwen3 doesn't replace Claude — it saves you significant API costs on Chinese-language tasks.
It's worth noting that no one has conducted systematic first-hand benchmarks specifically comparing Traditional Chinese writing quality across these three models. The above recommendations are based on benchmark data, community feedback, and use case analysis — not rigorous A/B testing.
API Cost Breakdown: Content Generation at $0.10/Month vs Agentic Coding Cost Explosion
Qwen3.6-Plus API pricing: $0.50/1M input tokens and $3.00/1M output tokens.
Light usage costs are essentially zero. At these rates a typical question (500 input + 1,000 output tokens) costs about $0.00325, so a few dozen questions a month works out to roughly $0.10 USD. Yes, ten cents a month.
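The arithmetic is easy to reproduce with the rates quoted above. A small sketch; the input/output split in the second example is my assumption, since the V2EX report only gives the 3.5M-token total:

```python
INPUT_PRICE = 0.50   # USD per 1M input tokens (Qwen3.6-Plus rate quoted above)
OUTPUT_PRICE = 3.00  # USD per 1M output tokens

def cost_usd(requests: int, in_tokens: int, out_tokens: int) -> float:
    """Total cost in USD for `requests` calls averaging the given token counts."""
    per_request = (in_tokens * INPUT_PRICE + out_tokens * OUTPUT_PRICE) / 1_000_000
    return requests * per_request

print(cost_usd(1, 500, 1_000))          # one average question → 0.00325
print(cost_usd(1, 3_000_000, 500_000))  # a 3.5M-token agentic session → 3.0
```

Output tokens dominate the bill at a 6x price multiple, which is exactly why long agentic sessions that generate lots of intermediate reasoning get expensive fast.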
But Agentic Coding mode is a different story. Real-world cases from V2EX show: one user's Qwen3 Coder session analyzing a codebase consumed 3.5 million tokens, costing 23 RMB (~$3.20 USD). A more extreme case hit over 400 RMB for a single analysis. The model reads every file in the repository — "even CSVs" — consuming two-thirds of the context window.
When to pay:
- Monthly usage < 500 requests: Free options (OpenRouter + Ollama) are sufficient
- Monthly usage 500-5,000 requests: Evaluate Alibaba Cloud ModelStudio subscription
- Agentic Coding with heavy token consumption: Calculate carefully — costs may exceed a Claude Pro subscription
Indie Maker shortcut: the Qwen3.6-Plus API is OpenAI-compatible. If you're currently using the OpenAI SDK, just swap `base_url` to `https://dashscope.aliyuncs.com/compatible-mode/v1` and supply a DashScope API key — no other code changes needed.
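As a sketch of that swap, assuming the standard `openai` Python package and a `DASHSCOPE_API_KEY` environment variable (the variable name is my choice, not an official one):

```python
import os

# The only change from a stock OpenAI setup: base_url plus the API key source.
client_kwargs = {
    "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "api_key": os.environ.get("DASHSCOPE_API_KEY", ""),
}

try:
    from openai import OpenAI  # pip install openai
    client = OpenAI(**client_kwargs)
    # client.chat.completions.create(model="qwen3.6-plus", messages=[...])
except ImportError:
    client = None  # SDK not installed; the kwargs above are still the whole diff
```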
Privacy & Data Sovereignty: What to Know Before Using Alibaba Services
This section isn't meant to scare you, but as a user, there are facts you should understand before making a decision.
When using QwenLM Playground or Alibaba Cloud API, your input data is transmitted to Alibaba's servers. Alibaba is a Chinese company subject to China's data security laws. Product Hunt community members have also raised concerns about "training data opt-out not being transparent" — meaning you can't be sure whether your inputs will be used to train future models.
The simplest solution: Ollama local deployment. The Apache 2.0 license allows you to run the model entirely locally, with data never leaving your computer. This is the biggest advantage of open-source models.
Practical advice:
- Writing public blog posts, translating public content: API is fine
- Processing personal data, trade secrets, client data: Always use Ollama local deployment
- If your company has data compliance requirements, review Alibaba's latest privacy terms before using
Conclusion: Not a Replacement — A New Tool for Your Chinese AI Toolkit
Qwen3 won't replace Claude or ChatGPT in your workflow. Its value lies in providing a very low-cost (or free) high-quality option for Chinese language tasks, so you don't burn through Claude API credits every time you write Chinese content.
If you do just one thing, open OpenRouter Playground now and spend 5 minutes trying Qwen3.6-Plus's Chinese output. Remember to add "Please respond in Traditional Chinese" to the system prompt.
If you want to go further, learn Ollama local deployment. Completely free, completely offline, no rate limits — this article has given you the complete steps. Just avoid the known Qwen3.5 bugs on Ollama, and the overall experience is quite smooth.
FAQ
Is Qwen3 completely free and open source? Can the Apache 2.0 license be used commercially?
Qwen3 uses the Apache 2.0 license, which allows commercial use, modification, and redistribution without fees. However, while model weights are downloadable, the training data is not publicly available. The HackerNews community has debated whether this qualifies as 'truly open source.' In practice, you can build SaaS products or commercial applications with Qwen3, but you won't know exactly what data trained the model. Compared to DeepSeek's more restrictive licensing terms, Qwen3's Apache 2.0 is considered more business-friendly by the community.
What's the best free way to try Qwen3 as of April 2026?
The fastest option is the OpenRouter Playground, where you can try Qwen3.6-Plus directly (the free tier has rate limits and may be discontinued at any time — check current status before using). The qwen.ai website's Qwen Chat interface is still free, though the OAuth API free tier is slated to end around April 15, 2026. For unlimited, completely offline usage, Ollama local deployment is the most stable free path — you just need a computer with at least 8GB of memory.



