{"id":2583,"date":"2026-06-15T16:57:49","date_gmt":"2026-06-15T14:57:49","guid":{"rendered":"https:\/\/wp.unil.ch\/iaunil\/local-ai-model-for-private-sensitive-and-confidential-data\/"},"modified":"2026-06-18T19:52:13","modified_gmt":"2026-06-18T17:52:13","slug":"local-ai-model-for-private-sensitive-and-confidential-data","status":"publish","type":"post","link":"https:\/\/wp.unil.ch\/iaunil\/en\/local-ai-model-for-private-sensitive-and-confidential-data\/","title":{"rendered":"Installing a local AI model for private and sensitive data (updated June 2026)"},"content":{"rendered":"\n<h3 class=\"wp-block-heading\">Why a local AI?<\/h3>\n\n<p class=\"wp-block-paragraph\">By decision of the Directorate, sensitive data within the meaning of the Swiss data protection law (LPD) must not go through the cloud. Commercial AIs (ChatGPT, Claude, Gemini) send your files to external servers. With <strong>LM Studio<\/strong>, everything stays on your computer.<\/p>\n\n<ul class=\"wp-block-list\">\n<li><strong>Confidential.<\/strong> No data is sent. The model works offline.<\/li>\n\n\n\n<li><strong>For sensitive data<\/strong> (LPD) and professional secrecy, as a complement to the institutional Microsoft Copilot Chat.<\/li>\n<\/ul>\n\n<p class=\"wp-block-paragraph\">Let us be honest. On <a target=\"_blank\" rel=\"noopener noreferrer\" href=\"https:\/\/artificialanalysis.ai\/\">Artificial Analysis<\/a>, Gemma 4 scores around <strong>19 to 20 out of 100<\/strong> and Qwen3.5 9B around <strong>25<\/strong>. The best cloud models score around 57 to 61. These are, above all, small models designed to run on a personal computer. Their value lies in confidentiality, zero cost and offline use. 
That is enough to summarise, translate, rephrase or analyse documents.<\/p>\n\n<h3 class=\"wp-block-heading\">Install LM Studio<\/h3>\n\n<p class=\"wp-block-paragraph\"><strong>Check your machine first.<\/strong> On Mac, LM Studio requires an Apple Silicon chip (M1 to M4), so a Mac from late 2020 or newer. Intel Macs are not supported. On PC, most computers from the last few years will do, ideally with 16 GB of RAM.<\/p>\n\n<ol class=\"wp-block-list\">\n<li>Download the free application from <strong><a target=\"_blank\" rel=\"noopener noreferrer\" href=\"https:\/\/lmstudio.ai\">lmstudio.ai<\/a><\/strong>.<\/li>\n\n\n\n<li>Install, then open LM Studio. No account is required.<\/li>\n<\/ol>\n\n<p class=\"wp-block-paragraph\"><em>LM Studio&#8217;s interface changes regularly and differs a little between Windows and Mac. The button names quoted here are indicative; rely on the function being described.<\/em><\/p>\n\n<h3 class=\"wp-block-heading\">Which model should you choose?<\/h3>\n\n<p class=\"wp-block-paragraph\">All three are recommended for everyday use. <strong>Qwen3.5 9B<\/strong> (Chinese, by Alibaba) is the most capable; <strong>Gemma 4<\/strong> (by Google) remains a safe choice, lighter in its E4B version and slightly higher in quality in its 12B version. The choice depends on your machine.<\/p>\n\n<figure class=\"wp-block-table\"><table><thead><tr><th><\/th><th>Qwen3.5 9B<\/th><th>Gemma 4 E4B<\/th><th>Gemma 4 12B<\/th><\/tr><\/thead><tbody><tr><td>Strength<\/td><td>Most capable<\/td><td>Light and fast<\/td><td>Slightly higher quality<\/td><\/tr><tr><td>Memory<\/td><td>Medium<\/td><td>Modest<\/td><td>Higher<\/td><\/tr><tr><td>Long PDFs<\/td><td>Very comfortable (large context)<\/td><td>Very comfortable<\/td><td>Slow on a small machine<\/td><\/tr><tr><td>Choose if<\/td><td>You want the best quality<\/td><td>Modest machine or long PDFs<\/td><td>Well-equipped machine<\/td><\/tr><\/tbody><\/table><\/figure>\n\n<p class=\"wp-block-paragraph\">When in doubt, pick Gemma 4 E4B. 
On loading, LM Studio shows a <strong>memory estimate<\/strong> and warns you if the model is too heavy for your machine. You can keep several models installed.<\/p>\n\n<h3 class=\"wp-block-heading\">Download the model<\/h3>\n\n<ol class=\"wp-block-list\">\n<li>Open the model search (the magnifying glass, often called &#8220;Discover&#8221;).<\/li>\n\n\n\n<li>Type the name of your chosen model (<strong>qwen3.5<\/strong> or <strong>gemma 4<\/strong>).<\/li>\n\n\n\n<li>Start the download of the <strong>default version offered<\/strong>. The software picks the right version for your computer.<\/li>\n\n\n\n<li>Once the download has finished, open a conversation with this model.<\/li>\n<\/ol>\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"640\" src=\"https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-12.03.16-1024x640.jpg\" alt=\"Screenshot: the model search screen\" class=\"wp-image-4099\" srcset=\"https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-12.03.16-1024x640.jpg 1024w, https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-12.03.16-300x188.jpg 300w, https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-12.03.16-768x480.jpg 768w, https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-12.03.16-1536x960.jpg 1536w, https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-12.03.16.jpg 1920w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n<h3 class=\"wp-block-heading\">Increase the context size<\/h3>\n\n<p class=\"wp-block-paragraph\">The context is the amount of text, counted in tokens, that the model can read at once. More context means more memory. By default it is too small for a whole PDF. Aim for about <strong>32,000<\/strong> tokens for long documents. 
For an exceptionally long document, you can go higher (for example 64,000) if memory allows; the answer will then be slower.<\/p>\n\n<ol class=\"wp-block-list\">\n<li>When loading the model, set the context length (often labelled &#8220;Context Length&#8221;) to about <strong>32,000<\/strong>.<\/li>\n\n\n\n<li>It can also be changed afterwards via the small <strong>cog icon<\/strong> (a &#8220;reload&#8221; button then appears).<\/li>\n\n\n\n<li>If LM Studio warns that it is too heavy, lower the value or switch to Gemma 4 E4B.<\/li>\n<\/ol>\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"640\" src=\"https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-17.58.09-1024x640.jpg\" alt=\"Screenshot: the loaded model&apos;s settings opened via the cog, with the context length\" class=\"wp-image-4101\" srcset=\"https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-17.58.09-1024x640.jpg 1024w, https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-17.58.09-300x188.jpg 300w, https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-17.58.09-768x480.jpg 768w, https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-17.58.09-1536x960.jpg 1536w, https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-17.58.09-2048x1280.jpg 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n<p class=\"wp-block-paragraph\"><strong>Useful symptom.<\/strong> If the context is too small, the model often answers as if no document had been attached (&#8220;I cannot see any document&#8221;). This is not a malfunction. 
Increase the context, then send your request again.<\/p>\n\n<h3 class=\"wp-block-heading\">Working confidentially<\/h3>\n\n<p class=\"wp-block-paragraph\">Drag your document into the conversation (<strong>.pdf<\/strong>, <strong>.docx<\/strong>, <strong>.txt<\/strong> formats). Type your request, for example &#8220;Summarise this report in 5 points&#8221;. Nothing leaves your computer.<\/p>\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"640\" src=\"https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-20.06.45-1024x640.jpg\" alt=\"Screenshot: a conversation with an attached document\" class=\"wp-image-4104\" srcset=\"https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-20.06.45-1024x640.jpg 1024w, https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-20.06.45-300x188.jpg 300w, https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-20.06.45-768x480.jpg 768w, https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-20.06.45-1536x960.jpg 1536w, https:\/\/wp.unil.ch\/iaunil\/files\/2025\/07\/capture-decran-2026-06-15-a-20.06.45.jpg 1920w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n<p class=\"wp-block-paragraph\"><strong>The model thinks before answering.<\/strong> On a modest machine, this can sometimes take several minutes. This is normal. 
To go faster on simple tasks, turn off thinking with the &#8220;Think&#8221; button (below the message box).<\/p>\n\n<h3 class=\"wp-block-heading\">Why these models?<\/h3>\n\n<ul class=\"wp-block-list\">\n<li><strong>Among the most powerful small local models.<\/strong> These are open models that run on a personal computer.<\/li>\n\n\n\n<li><strong>Designed for your computer.<\/strong> A dedicated graphics card helps but is not required.<\/li>\n\n\n\n<li><strong>Open source<\/strong> (Apache 2.0 licence) and multilingual. <a target=\"_blank\" rel=\"noopener noreferrer\" href=\"https:\/\/deepmind.google\/models\/gemma\/\">Official details (Gemma 4)<\/a> and <a target=\"_blank\" rel=\"noopener noreferrer\" href=\"https:\/\/huggingface.co\/Qwen\/Qwen3.5-9B\">Qwen3.5 9B<\/a>.<\/li>\n<\/ul>\n\n<h3 class=\"wp-block-heading\">Alternatives<\/h3>\n\n<p class=\"wp-block-paragraph\">Other tools exist: Ollama, GPT4All or Hugging Face. To get started, LM Studio remains the most accessible. With plenty of memory, larger models become feasible. The <strong>IT Centre<\/strong> can also make local models available.<\/p>\n\n<h3 class=\"wp-block-heading\">Local is not necessarily greener<\/h3>\n\n<p class=\"wp-block-paragraph\"><strong>Counterintuitive.<\/strong> Without a dedicated graphics card, local AI does not consume less energy than a request to a commercial LLM (ChatGPT and the like). A processor (CPU) is slow for AI: summarising a document can take several minutes at 30 W, which is as much energy as the optimised hardware behind those services uses in a few seconds. On a consumer computer, the benefit of local AI is confidentiality. 
With a dedicated graphics card (GPU) and a well-chosen model, however, local AI can become 10 to 1,000 times less energy-hungry while keeping a satisfactory answer quality.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Local AIs are increasingly powerful, attractive and energy-efficient compared with commercial LLMs, all while keeping your sensitive data on your own computer rather than on remote servers.<\/p>\n","protected":false},"author":1002618,"featured_media":2211,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_seopress_titles_title":"","_seopress_titles_desc":"","_seopress_robots_index":"","_seopress_robots_follow":"","_seopress_robots_imageindex":"","_seopress_robots_snippet":"","_seopress_robots_primary_cat":"","_seopress_robots_breadcrumbs":"","_seopress_robots_freeze_modified_date":"","_seopress_robots_custom_modified_date":"","_seopress_robots_canonical":"","_seopress_social_fb_title":"","_seopress_social_fb_desc":"","_seopress_social_fb_img":"","_seopress_social_fb_img_attachment_id":0,"_seopress_social_fb_img_width":0,"_seopress_social_fb_img_height":0,"_seopress_social_twitter_title":"","_seopress_social_twitter_desc":"","_seopress_social_twitter_img":"","_seopress_social_twitter_img_attachment_id":0,"_seopress_social_twitter_img_width":0,"_seopress_social_twitter_img_height":0,"_seopress_redirections_value":"","_seopress_redirections_enabled":"","_seopress_redirections_enabled_regex":"","_seopress_redirections_logged_status":"","_seopress_redirections_param":"","_seopress_redirections_type":0,"_seopress_analysis_target_kw":"","_seopress_news_disabled":"","_seopress_video_disabled":"","_seopress_video":[],"_seopress_pro_schemas_manual":[],"_seopress_pro_rich_snippets_disable_all":"","_seopress_pro_rich_snippets_disable":[],"_seopress_pro_schemas":[],"footnotes":""},"categories":[22],"tags":[],"class_list":["post-2583","post","type-post","status-publish","format-standard","has-post-thum
bnail","category-resources"],"_links":{"self":[{"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/posts\/2583","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/users\/1002618"}],"replies":[{"embeddable":true,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/comments?post=2583"}],"version-history":[{"count":5,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/posts\/2583\/revisions"}],"predecessor-version":[{"id":4155,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/posts\/2583\/revisions\/4155"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/media\/2211"}],"wp:attachment":[{"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/media?parent=2583"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/categories?post=2583"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/tags?post=2583"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}