Local AI model for private, sensitive and confidential data

Why switch to local AI?

Commercial LLMs (ChatGPT, Claude, Google Gemini, etc.) send your queries and files to external data centres. With LM Studio and Gemma 3n E4B, everything happens on your computer:

  • 100% open source: all code, model weights, and tools are freely available, modifiable, and open to audit, giving you full transparency and control over your data.
  • No data is transmitted to remote servers: your information stays with you (the model works entirely offline).
  • Suitable for private, sensitive, or confidential documents, including those subject to professional secrecy.
  • Credible performance compared to commercial solutions: Gemma 3n E4B scores around 1300 on LMArena, while the latest cloud-based models average around 1500.
  • Controlled energy consumption: on a MacBook Air M2 (~20 W), a full inference (prompt + response) uses significantly less energy than a server equipped with an Nvidia H100 GPU (~700 W), the hardware commonly used for commercial models.
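The power figures above can be turned into a rough per-response energy estimate. A minimal sketch; the inference durations below are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope energy per response.
# The wattages come from the text; the durations are assumed.

def energy_wh(power_watts: float, seconds: float) -> float:
    """Energy drawn in watt-hours: power (W) x time (s) / 3600."""
    return power_watts * seconds / 3600

# MacBook Air M2 at ~20 W, assuming a 60 s prompt + response cycle
local_wh = energy_wh(20, 60)
# Nvidia H100 at ~700 W, assuming a 10 s cycle on a data-centre server
server_wh = energy_wh(700, 10)

print(f"local: {local_wh:.2f} Wh, server: {server_wh:.2f} Wh")
```

Even under these rough assumptions the laptop draws well under the server's energy per response; real figures depend on model size, prompt length, and batching on the server side.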

Install LM Studio

  1. Download the free app at https://lmstudio.ai/.
  2. Run the installer, then open LM Studio. No account is required.

Add the gemma-3n-E4B model

  1. In the LM Studio search bar (Discover), type gemma-3n-E4B.
  2. Click Download to download the model (a few GB) to your computer.
  3. Once the download is complete, click Load to load it.

Work in complete confidentiality

Simply drag and drop your PDFs, DOCX files, or internal reports into the chat window and wait for them to finish processing. LM Studio processes and queries your documents locally, so they never leave your computer. At present, local inference is the only approach that offers full data privacy.
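Beyond the chat window, LM Studio can also expose the loaded model through a local OpenAI-compatible HTTP server (started from its Developer tab, served by default at http://localhost:1234/v1). A minimal sketch; the model name and endpoint are assumptions to check against your own instance:

```python
# Sketch: querying LM Studio's local OpenAI-compatible server.
# Assumes the server is enabled in LM Studio and gemma-3n-E4B is loaded;
# the model identifier and port below may differ on your machine.
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "gemma-3n-e4b") -> dict:
    """Build an OpenAI-style chat-completion payload for the local server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
    }

def ask_local(prompt: str, base_url: str = "http://localhost:1234/v1") -> str:
    """Send the prompt to the local server; nothing leaves the machine."""
    data = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]

payload = build_chat_request("Summarise the attached report in three bullet points.")
print(json.dumps(payload, indent=2))
```

Calling ask_local("…") with the server running returns text generated entirely on your machine: the request goes to localhost, so no token ever crosses the network.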


Why choose the local Gemma 3n E4B model?

  • “3n” = third-generation Gemma, “nano” variant: designed to run on consumer PCs without specialised hardware.
  • E4B (“Effective 4B”): runs with the footprint of a model of roughly 4 billion parameters, small enough for smartphones and laptops.
  • GGUF format: a single, ready-to-use, already-quantised file recognised by the major front-ends (LM Studio, Ollama, GPT4All, etc.).
  • Good size/performance balance: ~4 billion effective parameters (E4B) are enough to score around 1300 on LMArena (https://lmarena.ai/leaderboard), compared with ~1500 for the latest cloud models.
  • MLX compatibility (Apple’s machine-learning library): takes full advantage of Mac M-series chips.
  • Official details: https://deepmind.google/models/gemma/gemma-3n/

Possible alternatives

Possible alternatives include Ollama and GPT4All, and other open-source models can be found on Hugging Face. For those new to local AI, LM Studio remains the most accessible option. On recent, well-equipped machines, OpenAI's gpt-oss-20b is also worth considering.
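Ollama follows the same local-first pattern: it serves models over a local HTTP API, by default at http://localhost:11434. A minimal sketch, assuming Ollama is installed and running and that a Gemma 3n model has been pulled (the model tag below is an assumption; check `ollama list` on your machine):

```python
# Sketch: querying a local Ollama server over its /api/generate endpoint.
# The model tag "gemma3n:e4b" is an assumption; substitute your own.
import json
import urllib.request

def build_generate_request(prompt: str, model: str = "gemma3n:e4b") -> dict:
    """Payload for Ollama's /api/generate endpoint (non-streaming)."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_ollama(prompt: str, base_url: str = "http://localhost:11434") -> str:
    """Query the local Ollama server; the request never leaves the machine."""
    data = json.dumps(build_generate_request(prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The privacy argument is identical to LM Studio's: the endpoint is localhost, so documents and prompts stay on your computer regardless of which front-end you pick.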