{"id":3795,"date":"2026-03-31T19:30:24","date_gmt":"2026-03-31T17:30:24","guid":{"rendered":"https:\/\/wp.unil.ch\/iaunil\/dcsr-llm-using-artificial-intelligence-for-research-at-unil\/"},"modified":"2026-04-01T14:53:43","modified_gmt":"2026-04-01T12:53:43","slug":"dcsr-llm-using-artificial-intelligence-for-research-at-unil","status":"publish","type":"post","link":"https:\/\/wp.unil.ch\/iaunil\/en\/dcsr-llm-using-artificial-intelligence-for-research-at-unil\/","title":{"rendered":"DCSR-LLM: using artificial intelligence to support research at UNIL"},"content":{"rendered":"\n<p><em>Article written by Dr. Philippe Jacquet, Data Scientist, Division Calcul et Soutien \u00e0 la Recherche (DCSR).<\/em><\/p>\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"274\" src=\"https:\/\/wp.unil.ch\/iaunil\/files\/2026\/03\/dcsr-llm-research-flow-1024x274.png\" alt=\"dcsr llm research flow\" class=\"wp-image-3789\" srcset=\"https:\/\/wp.unil.ch\/iaunil\/files\/2026\/03\/dcsr-llm-research-flow-1024x274.png 1024w, https:\/\/wp.unil.ch\/iaunil\/files\/2026\/03\/dcsr-llm-research-flow-300x80.png 300w, https:\/\/wp.unil.ch\/iaunil\/files\/2026\/03\/dcsr-llm-research-flow-768x206.png 768w, https:\/\/wp.unil.ch\/iaunil\/files\/2026\/03\/dcsr-llm-research-flow-1536x411.png 1536w, https:\/\/wp.unil.ch\/iaunil\/files\/2026\/03\/dcsr-llm-research-flow-2048x549.png 2048w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n<p>Have you ever wanted to ask the same questions to thousands of documents, compare several AI models, or turn a large collection of texts into structured research data?<\/p>\n\n<p>These are precisely the kind of research tasks for which <strong>DCSR-LLM<\/strong> was developed at UNIL.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">What is the DCSR?<\/h2>\n\n<p>The <strong>DCSR<\/strong> stands for <strong>Division Calcul et Soutien \u00e0 la Recherche<\/strong>, or Scientific 
Computing and Research Support Unit in English. We are part of the Centre Informatique at UNIL. We are a team of about 20 people. Our role is to help the UNIL research community with computing, data storage, and technical support for research projects.<\/p>\n\n<p>Some members of the team manage the computing infrastructure used at UNIL. Others work directly with researchers and provide scientific and technical consulting.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">What kind of support does the DCSR provide?<\/h2>\n\n<p>The DCSR provides two main kinds of support.<\/p>\n\n<p>The first is <strong>infrastructure<\/strong>: computers, storage space, and technical systems that researchers can use for demanding research tasks. The second is <strong>expertise<\/strong>: DCSR staff help researchers with topics such as scientific programming, machine learning, databases, and web development.<\/p>\n\n<p>So the DCSR is not only a place with machines. It is also a support unit with people who help researchers use these tools in a useful way.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">What are HPC clusters?<\/h2>\n\n<p>The term <strong>HPC<\/strong> means <strong>High-Performance Computing<\/strong>. It refers to powerful computing systems used for tasks that are too large, too slow, or too demanding for an ordinary laptop. An HPC cluster is a group of computers working together. Instead of doing everything on one machine, a cluster can distribute the work across many machines.<\/p>\n\n<p>At UNIL, there are two HPC clusters: <strong>Curnagl<\/strong> and <strong>Urblauna<\/strong>. Their names are derived from Romansh bird names. They are used for demanding research tasks such as simulations, data analysis, machine learning, and AI workflows.<\/p>\n\n<p>A simple way to think about it is this: if your laptop is enough, you use your laptop. 
If your work becomes too heavy or too slow, or needs more memory, the cluster becomes useful.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">What are Large Language Models?<\/h2>\n\n<p><strong>Large Language Models<\/strong>, or <strong>LLMs<\/strong>, are AI systems trained on very large amounts of text. They can answer questions, summarize documents, rewrite text, classify information, or extract facts from written material. Most people know them through tools such as ChatGPT, Claude, or Gemini.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">What is Hugging Face?<\/h2>\n\n<p><strong>Hugging Face<\/strong> is a platform where people share AI models and datasets. A simple comparison: it is a bit like GitHub, but for machine learning models and data. It also provides software tools that make it easier to download and run open-source language models on a computer or a server.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">What is LM Studio?<\/h2>\n\n<p><strong>LM Studio<\/strong> is an application that lets people run some language models locally through a chat interface. In practice, it feels a bit like using ChatGPT, except that the model can run on your own computer. This is useful for simple local experiments.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">Why did the DCSR create DCSR-LLM?<\/h2>\n\n<p>Because many researchers want to use AI, but their needs often go <strong>beyond a simple chatbot<\/strong>.<\/p>\n\n<p>A chatbot is useful for asking a question or drafting a paragraph. But research often requires something more structured. 
Researchers may want to:<\/p>\n\n<p>\u2013 test several models on the same task;<br \/>\u2013 work with a large collection of documents;<br \/>\u2013 keep data on UNIL infrastructure;<br \/>\u2013 save the exact settings used in an analysis;<br \/>\u2013 repeat the same workflow later.<\/p>\n\n<p>This is why the DCSR developed DCSR-LLM.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">What is DCSR-LLM?<\/h2>\n\n<p>DCSR-LLM is a <strong>toolkit<\/strong> developed at UNIL for working with large language models in a more structured way.<\/p>\n\n<p>It allows researchers to download open-source models <strong>from Hugging Face<\/strong>, run them locally or on UNIL servers, compare them, evaluate them on specific tasks, extract structured information from text, adapt some models to more specialized uses, and <strong>export models to GGUF format for use with LM Studio<\/strong>.<\/p>\n\n<p>It is not just a chatbot. It is a tool designed for <strong>research workflows<\/strong>.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">Why not just use ChatGPT or Claude?<\/h2>\n\n<p>For many everyday uses, ChatGPT or Claude are very useful. But in research, three extra questions often matter.<\/p>\n\n<p>The first is: <strong>where does the data go<\/strong>?<br \/>Some projects involve sensitive, unpublished, or internal material. In such cases, researchers may want a more controlled environment.<\/p>\n\n<p>The second is: <strong>which model am I using<\/strong>? <br \/>Different models behave differently. In research, it is often useful to compare them rather than rely on only one assistant.<\/p>\n\n<p>The third is: <strong>can I repeat the same workflow clearly<\/strong>? 
<br \/>If an AI result matters for a project, researchers usually need to document how it was produced.<\/p>\n\n<p>DCSR-LLM is useful because it helps with <strong>these three points<\/strong>.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">What does &#8220;reproducible&#8221; mean here?<\/h2>\n\n<p>It means that the work is done in a way that can be <strong>repeated and documented<\/strong>. If you use the same model, the same data, and the same settings, you should be able to rerun the workflow and understand what happened. This matters in research because methods need to be described clearly.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">What can DCSR-LLM do in practice?<\/h2>\n\n<p>DCSR-LLM can do several practical things. It can help a researcher <strong>inspect and download models<\/strong> from Hugging Face, <strong>run<\/strong> some of these models locally, <strong>compare<\/strong> several models on the same benchmark, <strong>turn unstructured text<\/strong> into structured data, and <strong>adapt a model<\/strong> for a more specific task.<\/p>\n\n<p>These ideas become easier to understand with examples.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">Can you give a first concrete example?<\/h2>\n\n<p>Imagine a researcher in the social sciences has <strong>2,000 interview transcripts<\/strong> and wants to organize them.<\/p>\n\n<p>Each interview is written in free text. The researcher wants, for each transcript, to extract a small set of clearly defined fields such as:<\/p>\n\n<p>\u2013 the title or identifier of the interview;<br \/>\u2013 the date of the interview;<br \/>\u2013 the location;<br \/>\u2013 the name or profile of the person interviewed;<br \/>\u2013 the institution or organization mentioned;<br \/>\u2013 the main topic discussed;<br \/>\u2013 a short quotation supporting the extraction.<\/p>\n\n<p>A normal chatbot can help with a few interview transcripts, one by one. 
But this becomes difficult if there are thousands of texts and the researcher wants the same structure every time.<\/p>\n\n<p>With DCSR-LLM, the team can define the fields they want, run the same extraction process on the whole collection, save the results in a structured format, and then review them. The AI does not replace the researcher. But it can help turn a large collection of text into something easier to inspect and analyze.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">Can you give a second concrete example?<\/h2>\n\n<p>Imagine a biology or medical research team wants to use AI to answer a set of <strong>domain-specific questions<\/strong>.<\/p>\n\n<p>Before choosing a model, the team wants to know which one performs best on its task. If they only use chatbots manually, comparison is difficult. One person may ask slightly different questions. Another may use different wording. Results are harder to compare fairly.<\/p>\n\n<p>With DCSR-LLM, the team can prepare one <strong>fixed list of questions<\/strong> and run the same evaluation on several models. They can then compare the outputs more systematically. 
Instead of saying, &#8220;this model feels better,&#8221; they can say, &#8220;we tested these models on the same task under the same conditions.&#8221; That is much closer to a <strong>research approach<\/strong>.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">Can you give a third concrete example?<\/h2>\n\n<p>Imagine a researcher has a very large collection of texts, for example <strong>10,000 journal articles<\/strong> or <strong>10,000 YouTube transcripts<\/strong>.<\/p>\n\n<p>The researcher may want to ask questions such as:<\/p>\n\n<p>\u2013 Does this article discuss climate change, migration, or public policy?<br \/>\u2013 Is this speaker talking positively or negatively about artificial intelligence?<br \/>\u2013 Does this text mention a specific concept, author, or method?<br \/>\u2013 Is the main purpose of this article to explain, criticize, or compare?<br \/>\u2013 Does this transcript contain personal testimony, expert opinion, or political argument?<br \/>\u2013 Which passages talk about ethics, cost, or risk?<\/p>\n\n<p>A normal chatbot can help with one article or one transcript at a time. But this becomes impractical when the collection contains thousands of texts.<\/p>\n\n<p>With DCSR-LLM, the researcher can run the same workflow across the whole corpus in a more systematic way. Instead of manually copying text into a chatbot, they can process the collection in a structured manner and save the results for later analysis.<\/p>\n\n<p>The value here is not just that the AI can answer questions. The value is that it becomes possible to <strong>ask the same research questions at scale<\/strong>, on a large body of text, in a way that is more organized and reproducible.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">Can you give a fourth concrete example?<\/h2>\n\n<p>Imagine a team has found a model that works reasonably well, but <strong>not quite well enough<\/strong> for its own field. 
For example, the model may struggle with the vocabulary of a specific discipline, or with the exact format of answers needed for a project.<\/p>\n\n<p>In that case, the team may want to <strong>adapt the model<\/strong> to a specific task or field and then measure whether the adaptation actually helped.<\/p>\n\n<p>DCSR-LLM can support this kind of <strong>before-and-after workflow<\/strong>. A team can test the original model, adapt it for the task, and then test it again on the same benchmark. This gives a clearer answer than a simple impression such as &#8220;the new version seems better.&#8221;<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">So DCSR-LLM is not mainly for chatting?<\/h2>\n\n<p>Exactly. It can be used for chat, but that is not its main purpose. Its main purpose is to support <strong>structured work<\/strong> with language models: evaluation, extraction, comparison, and adaptation. That is why it is more useful to think of it as a <strong>research tool<\/strong> than as a chatbot.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">Do I need to know how to code?<\/h2>\n\n<p>Not necessarily, but some technical comfort or support is helpful.<\/p>\n\n<p>DCSR-LLM is a <strong>command-line tool<\/strong>. This means it is used through a terminal rather than through a point-and-click web interface. Users do not need to be expert programmers, but they should be comfortable with a basic technical environment, or work with support from the DCSR or from technically experienced colleagues.<\/p>\n\n<p>The important point is that DCSR-LLM is not reserved for AI specialists. But it is also not intended as a consumer product for completely casual use.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">Where can DCSR-LLM run?<\/h2>\n\n<p>It can run on a <strong>personal computer<\/strong> for small experiments. 
It is also designed to run on <strong>UNIL infrastructure<\/strong>, including Curnagl and Urblauna.<\/p>\n\n<p>This matters because a project can start small and then grow. A researcher may first test an idea on a laptop, then later run a larger workflow on the UNIL clusters.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">Why is that useful?<\/h2>\n\n<p>Because many projects do not start with a huge investment. Researchers often begin with a simple question: &#8220;Can this model help with my material?&#8221;<\/p>\n\n<p>If the first results are promising, they may then want to scale up: more documents, more models, more comparisons, more demanding computations. DCSR-LLM supports this <strong>gradual scaling<\/strong>.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">What is the main idea behind DCSR-LLM?<\/h2>\n\n<p>The main idea is simple. DCSR-LLM helps researchers move from informal chatbot use to a <strong>more controlled<\/strong> way of working with AI. That means more clarity about the <strong>model used<\/strong>, <strong>where the data is processed<\/strong>, <strong>how the results are generated<\/strong>, and <strong>how the workflow can be repeated<\/strong>.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">Does AI replace the researcher?<\/h2>\n\n<p>No. AI can help with some tasks, sometimes very effectively. But researchers still need to <strong>define the question<\/strong>, <strong>choose the method<\/strong>, <strong>review the outputs<\/strong>, and <strong>interpret the results<\/strong>. DCSR-LLM is useful because it helps organize AI-based work. 
It does not remove the need for scientific judgment.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">So what is DCSR-LLM, in one sentence?<\/h2>\n\n<p>DCSR-LLM is a practical UNIL tool that helps researchers use large language models in a more <strong>concrete<\/strong>, <strong>controlled<\/strong>, and <strong>reproducible<\/strong> way than a simple chatbot interface.<\/p>\n\n<h2 class=\"wp-block-heading has-large-font-size\">Useful links<\/h2>\n\n<p>&#8211; Repository: <a href=\"https:\/\/git.dcsr.unil.ch\/Scientific-Computing\/dcsr-llm\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/git.dcsr.unil.ch\/Scientific-Computing\/dcsr-llm<\/a><br \/>&#8211; More information: <a href=\"https:\/\/wiki.unil.ch\/ci\/link\/2266\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/wiki.unil.ch\/ci\/link\/2266<\/a><br \/>&#8211; Contact: helpdesk@unil.ch (subject: DCSR-LLM)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Compare models, extract structured data, reproduce workflows: DCSR-LLM helps UNIL researchers use large language models in a more controlled and reproducible 
way.<\/p>\n","protected":false},"author":1002618,"featured_media":3793,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_seopress_robots_primary_cat":"","_seopress_titles_title":"","_seopress_titles_desc":"","_seopress_robots_index":"","footnotes":""},"categories":[22],"tags":[],"class_list":{"0":"post-3795","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-resources"},"_links":{"self":[{"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/posts\/3795","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/users\/1002618"}],"replies":[{"embeddable":true,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/comments?post=3795"}],"version-history":[{"count":5,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/posts\/3795\/revisions"}],"predecessor-version":[{"id":3824,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/posts\/3795\/revisions\/3824"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/media\/3793"}],"wp:attachment":[{"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/media?parent=3795"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/categories?post=3795"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wp.unil.ch\/iaunil\/en\/wp-json\/wp\/v2\/tags?post=3795"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}