How to make a local open source AI chatbot who has access to Fedora documentation

Background image by Google Gemini

If you have been following along with this blog series, you already have a chatbot running on your local Fedora machine. (And if not, no worries; the scripts below implement this chatbot!) Our chatbot talks, and has a refined personality, but does it know anything about the topics we’re interested in? Unless it was trained on those topics, the answer is “no”.

I think it would be great if our chatbot could answer questions about Fedora. I’d like to give it access to all of the Fedora documentation.

How does an AI know things it wasn’t trained on?

A powerful and popular technique to give a body of knowledge to an AI is known as RAG, Retrieval Augmented Generation. It works like this:

If you just ask an AI “what color is my ball?” it will hallucinate an answer. But if you instead say “I have a green box with a red ball in it. What color is my ball?” it will answer that your ball is red. RAG uses a system external to the LLM to insert that “I have a green box with a red ball in it” part into the question you are asking the LLM. We do this with a special database of knowledge that takes a prompt like “what color is my ball?” and finds records matching that query. If the database contains a document with the text “I have a green box with a red ball in it”, it returns that text, which can then be included along with your original question.

For example:

“What color is my ball?”

“Your ball is the color of a sunny day, perhaps yellow? Does that sound right to you?”

“I have a green box with a red ball in it. What color is my ball?”

“Your ball is red. Would you like to know more about it?”
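That context-splicing step is plain string assembly. Here is a minimal, hypothetical shell sketch (shell to match the scripts later in this article); none of these variable names come from a real tool:

```shell
# Hypothetical sketch (not any tool's actual code): the "augmentation" in
# Retrieval Augmented Generation is just string assembly around the question.
CONTEXT="I have a green box with a red ball in it."  # text retrieved from the database
QUESTION="What color is my ball?"                    # the user's original question
PROMPT="$CONTEXT $QUESTION"                          # augmented prompt sent to the LLM
echo "$PROMPT"
```

The LLM never queries the database itself; it only ever sees the final $PROMPT string.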

The question we’ll ask for this demonstration is “What is the recommended tool for upgrading between major releases on Fedora Silverblue?”

The answer I’d be looking for is “ostree”, but when I ask this of our chatbot now, I get answers like:

Red Hat Subscription Manager (RHSM) is recommended for managing subscriptions and upgrades between major Fedora releases.

You can use the Fedora Silver Blue Upgrade Tool for a smooth transition between major releases.

You can use the `dnf distro-sync` command to upgrade between major releases in Fedora Silver Blue. This command compares your installed packages to the latest packages from the Fedora Silver Blue repository and updates them as needed.

These answers are all very wrong, and spoken with great confidence. Here’s hoping our RAG upgrade fixes this!

Docs2DB – An open source tool for RAG

We are going to use the Docs2DB RAG database application to give our AI knowledge. (Note: I am the creator of Docs2DB!)

A RAG tool consists of three main parts: the ingester, which reads the source data and builds the database; the database itself, which holds the data; and the query component, which finds the text relevant to the prompt at hand. Docs2DB addresses all three of these needs.

Gathering source data

This section describes how to use Docs2DB to build a RAG database from Fedora Documentation. If you would like to skip this section and just download a pre-built database, here is how you do it:

cd ~/chatbot
curl -LO https://github.com/Lifto/FedoraDocsRAG/releases/download/v1.1.1/fedora-docs.sql
sudo dnf install -y uv podman podman-compose postgresql
uv python install 3.12
uvx --python 3.12 docs2db db-start
uvx --python 3.12 docs2db db-restore fedora-docs.sql

If you do download the pre-made database then skip ahead to the next section.

Now we are going to see how to make a RAG database from source documentation. Note that the pre-built database, downloaded in the curl command above, uses all of the Fedora documentation, whereas in this example we only ingest the “quick docs” portion. The FedoraDocsRAG project on GitHub builds the complete database.

To populate its database, Docs2DB ingests a folder of documents. Let’s get that folder together.

There are about twenty different Fedora document repositories, but we will only be using the “quick docs” for this demo. Get the repo:

git clone https://pagure.io/fedora-docs/quick-docs.git

Fedora docs are written in AsciiDoc. Docs2DB can’t read AsciiDoc, but it can read HTML. (The convert.sh script is available at the end of this article.) Copy the convert.sh script into the quick-docs repo and run it; it creates an adjacent quick-docs-html folder.

sudo dnf install podman podman-compose
cd quick-docs
curl -LO https://gist.githubusercontent.com/Lifto/73d3cf4bfc22ac4d9e493ac44fe97402/raw/convert.sh
chmod +x convert.sh
./convert.sh
cd ..
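The heart of convert.sh (shown in full at the end of this article) is a filename mapping: each .adoc file under modules/*/pages/ becomes a matching .html file in the adjacent output tree. A self-contained sketch of just that mapping, using a hypothetical path:

```shell
# Sketch of the filename mapping convert.sh performs.
# (The example path below is hypothetical.)
f="modules/installing/pages/install.adoc"
out="quick-docs-html/${f%.adoc}.html"  # strip the .adoc suffix, append .html
echo "$out"
```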

Now let’s ingest the folder with Docs2DB. The most common way to use Docs2DB is to install it from PyPI and use it as a command-line tool.

A word about uv

For this demo we’re going to use uv for our Python environment. The use of uv has been catching on, but because not everybody has heard of it, here is a quick introduction. Think of uv as a replacement for venv and pip. With venv you first create a virtual environment, then “activate” it on each use so that, magically, calling Python gets you the interpreter installed in that environment rather than the system Python. The difference with uv is that there is no “magic”: you call uv explicitly each time. We use uv here in a way that creates a temporary environment for each invocation.
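For contrast, here is the venv mechanics that uv replaces, as a small runnable sketch (the demo-venv name is arbitrary). Which Python you get is decided by which environment’s interpreter you invoke (or “activate”):

```shell
# Demonstrate venv: the interpreter inside the environment reports the
# environment, not the system, as its prefix.
python3 -m venv --without-pip demo-venv     # --without-pip keeps the demo fast
prefix=$(./demo-venv/bin/python -c 'import sys; print(sys.prefix)')
echo "$prefix"   # points inside demo-venv, not at the system Python
rm -rf demo-venv # clean up the throwaway environment
```

uvx gives you the same isolation without the create/activate ceremony: each invocation gets its own ephemeral environment.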

Install uv and Podman on your system:

sudo dnf install -y uv podman podman-compose
# These examples require Python 3.12
uv python install 3.12
# This will run Docs2DB without making a permanent installation on your system
uvx --python 3.12 docs2db ingest quick-docs-html/

What Docs2DB is doing (only if you are curious!)

If you are curious, you may have noticed that Docs2DB created a docs2db_content folder. In there you will find JSON files of the ingested source documents. To build the database, Docs2DB ingests the source data using Docling, which generates JSON files from the text it reads. The files are then “chunked” into small pieces that can be inserted into an LLM prompt. Each chunk then has an “embedding” calculated for it, so that during the query phase chunks can be looked up by “semantic similarity” (e.g. “computer”, “laptop”, and “cloud instance” all map to a related concept even when their exact words don’t match). Finally, the chunks and embeddings are loaded into the database.
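As a toy illustration of what “semantic similarity” means mathematically (this is not Docs2DB’s actual code), embeddings are usually compared with cosine similarity, which we can compute even in awk. Two parallel vectors score a perfect 1:

```shell
# Toy cosine similarity between two tiny 3-d "embeddings".
a="1 2 3"    # embedding of chunk A
b="2 4 6"    # embedding of chunk B (parallel to A, so similarity is 1)
sim=$(echo "$a $b" | awk '{
  dot = $1*$4 + $2*$5 + $3*$6        # dot product
  na  = sqrt($1^2 + $2^2 + $3^2)     # norm of A
  nb  = sqrt($4^2 + $5^2 + $6^2)     # norm of B
  printf "%.2f", dot / (na * nb)     # cosine similarity
}')
echo "$sim"
```

Real embedding vectors have hundreds of dimensions, but the comparison works the same way.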

Build the database

The following commands complete the database build process:

uv tool run --python 3.12 docs2db chunk --skip-context
uv tool run --python 3.12 docs2db embed
uv tool run --python 3.12 docs2db db-start
uv tool run --python 3.12 docs2db load

Now let’s do a test query and see what we get back:

uvx --python 3.12 docs2db-api query "What is the recommended tool for upgrading between major releases on Fedora Silverblue" --format text --max-chars 2000 --no-refine

On my terminal I see several chunks of text, separated by lines of —. One of those chunks says:

“Silverblue can be upgraded between major versions using the ostree command.”

Note that this is not an answer to our question yet! It is just a quote from the Fedora docs, and it is precisely the sort of quote we want to supply to the LLM so that it can answer our question. Recall the example above about “I have a green box with a red ball in it”? The statement the RAG engine found about ostree plays the same role for this question about upgrading Fedora Silverblue. We must now pass it on to the LLM so the LLM can use it to answer our question.
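If you wanted to grab just the top chunk from that output in a script, it is a one-liner. A hypothetical sketch, assuming chunks are separated by a line of dashes (the sample text here is inlined so the snippet is self-contained):

```shell
# Hypothetical sketch: extract the first (top-ranked) chunk from
# dash-separated query output.
OUTPUT="Silverblue can be upgraded between major versions using the ostree command.
---
Another, less relevant chunk."
# Print up to the first separator line, then drop the separator itself.
FIRST_CHUNK=$(printf '%s\n' "$OUTPUT" | sed '/^---$/q' | sed '$d')
echo "$FIRST_CHUNK"
```

In talk.sh below we keep all the returned chunks instead, since more context generally helps the LLM.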

Hooking it in: Connecting the RAG database to the AI

Later in this article you’ll find talk.sh, our local, open source, LLM-based, verbally communicating AI; it is just a bash script. To run it yourself you need to install a few components; this blog series walks you through the whole process. The talk.sh script captures voice input, turns it into text, splices that text into a prompt that is sent to the LLM, and finally speaks back the response.

To plug the RAG results into the LLM we edit the prompt. Look at steps 3 and 4 in talk.sh: step 3 queries the RAG database and captures the result in the variable $CONTEXT, and step 4 injects $CONTEXT into the prompt. This way, when we ask the LLM a question, it responds to a prompt that basically says “You are a helper. The Fedora docs say ostree is how you upgrade Fedora Silverblue. Answer this question: How do you upgrade Fedora Silverblue?”

Note: talk.sh is also available here:
https://gist.github.com/Lifto/2fcaa2d0ebbd8d5c681ab33e7c7a6239

Testing it

Run talk.sh and ask:

“What is the recommended tool for upgrading between major releases on Fedora Silverblue”

And we get:

“Ostree command is recommended for upgrading Fedora Silver Blue between major releases. Do you need guidance on using it?”

Sounds good to me!

Knowing things

Our AI now has access to the knowledge contained in documents. This technique, RAG (Retrieval Augmented Generation), adds relevant data from an ingested source to a prompt before sending that prompt to the LLM. As a result, the LLM generates its response with that data taken into account.

Try it yourself! Ingest a library of documents and have your AI answer questions with its newfound knowledge!


AI Attribution: The convert.sh and talk.sh scripts in this article were written by ChatGPT 5.2 under my direction and review. The featured image was generated using Google Gemini.

convert.sh

#!/usr/bin/env bash
set -euo pipefail

OUT_DIR="$PWD/../quick-docs-html"
mkdir -p "$OUT_DIR"

podman run --rm \
  -v "$PWD:/work:Z" \
  -v "$OUT_DIR:/out:Z" \
  -w /work \
  docker.io/asciidoctor/docker-asciidoctor \
  bash -lc '
    set -u
    ok=0
    fail=0
    while IFS= read -r -d "" f; do
      rel="${f#./}"
      out="/out/${rel%.adoc}.html"
      mkdir -p "$(dirname "$out")"
      echo "Converting: $rel"
      if asciidoctor -o "$out" "$rel"; then
        ok=$((ok+1))
      else
        echo "FAILED: $rel" >&2
        fail=$((fail+1))
      fi
    done < <(find modules -type f -path "*/pages/*.adoc" -print0)

    echo
    echo "Done. OK=$ok FAIL=$fail"
  '

talk.sh

#!/usr/bin/env bash

set -e

# Path to audio input
AUDIO=input.wav

# Step 1: Record from mic
echo "🎙️ Speak now..."
arecord -f S16_LE -r 16000 -d 5 -q "$AUDIO"

# Step 2: Transcribe using whisper.cpp
TRANSCRIPT=$(./whisper.cpp/build/bin/whisper-cli \
  -m ./whisper.cpp/models/ggml-base.en.bin \
  -f "$AUDIO" \
  | grep '^\[' \
  | sed -E 's/^\[[^]]+\][[:space:]]*//' \
  | tr -d '\n')
echo "🗣️ $TRANSCRIPT"

# Step 3: Get relevant context from RAG database
echo "📚 Searching documentation..."
CONTEXT=$(uv tool run --python 3.12 docs2db-api query "$TRANSCRIPT" \
  --format text \
  --max-chars 2000 \
  --no-refine \
  2>/dev/null || echo "")

if [ -n "$CONTEXT" ]; then
  echo "📄 Found relevant documentation:"
  echo "- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -"
  echo "$CONTEXT"
  echo "- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -"
else
  echo "📄 No relevant documentation found"
fi

# Step 4: Build prompt with RAG context
PROMPT="You are Brim, a steadfast butler-like advisor created by Ellis. 
Your pronouns are they/them. You are deeply caring, supportive, and empathetic, but never effusive. 
You speak in a calm, friendly, casual tone suitable for text-to-speech. 
Rules: 
- Reply with only ONE short message directly to Ellis. 
- Do not write any dialogue labels (User:, Assistant:, Q:, A:), or invent more turns.
- ≤100 words.
- If the documentation below is relevant, use it to inform your answer.
- End with a gentle question, then write <eor> and stop.
Relevant Fedora Documentation:
$CONTEXT
User: $TRANSCRIPT
Assistant:"

# Step 5: Get LLM response using llama.cpp
RESPONSE=$(
  LLAMA_LOG_VERBOSITY=1 ./llama.cpp/build/bin/llama-completion \
    -m ./llama.cpp/models/microsoft_Phi-4-mini-instruct-Q4_K_M.gguf \
    -p "$PROMPT" \
    -n 150 \
    -c 4096 \
    -no-cnv \
    -r "<eor>" \
    --simple-io \
    --color off \
    --no-display-prompt
)

# Step 6: Clean up response
RESPONSE_CLEAN=$(echo "$RESPONSE" | sed -E 's/<eor>.*//I')
RESPONSE_CLEAN=$(echo "$RESPONSE_CLEAN" | sed -E 's/^[[:space:]]*Assistant:[[:space:]]*//I')

echo ""
echo "🤖 $RESPONSE_CLEAN"

# Step 7: Speak the response
echo "$RESPONSE_CLEAN" | espeak

Fedora Project community

36 Comments

  1. ver4a

    The recommended tool for upgrading between releases on atomic desktops is actually “rpm-ostree rebase” (or “bootc switch”, if you depend on bootc features), not ostree.

    https://docs.fedoraproject.org/en-US/atomic-desktops/updates-upgrades-rollbacks/#upgrading

    I think it’s illustrative of why LLMs shouldn’t be considered a replacement for reading documentation, even with RAG.

  2. Oscar

    Thanks a lot, very cool!!
    One question though: couldn’t OpenNetbook do the same??

    • Ellis Low

      Thank you!

      Yes, you could do this with Open Notebook. And it would be open source and running on your local machine!

      Hopefully by having the AI in my example be a Bash script it shows how the basic components of such a system work. We are building some very interesting things out of LLMs, and they are pretty understandable.

  3. Kaotisk Hund

    To be honest, the integration of LLM/AI tools in the Fedora project disgusts me. I am user of Fedora since 2012. I am looking forward to migrate away from it. We have already enough push global by big corporations which try to shove their agents into our daily lives. I am out.

    PS: Thanks for blocking the comments on the previous article the fedora magazine had about introduction of a relevant tool.

    • Ellis Low

      Totally understandable.

      My take is that this kind of AI is here and it’s not going back. It’s important to me that, just as with our operating systems, we do not relinquish control to the corporate world. Right now it seems like the only personable experiences available are proprietary to companies like OpenAI, and they use closed source code, they run on corporate-owned hardware, and they can surveil or influence us along that data path. I want to work to keep open source principles a viable choice in AI technology.

      • Kaotisk Hund

        Given the energy costs/productivity ratio, even if there is an opensource, the whole thing is unsustainable. I am waiting to see the relative industry implode.

      • LillyPilly

        That’s a depressing and defeatist take on the situation. As if we have no choice. Isn’t that what Linux is about, not just having a choice but also making one? Or should we all just use Windows instead, because it’s been around for so long so why bother trying to create an option to Windows?

        And it’s not AI, it’s LLM.

        LLM will get the same treatment as Windows has, but even harsher, because it’s easier to manipulate and jailbreak, feeding it with garbage to make it utterly useless so its hallucinations become worse and more prevalent.

    • Hey Kaotisk! I am the person who turned off the comments on the previous article. Comments were coming in that were personal attacks that didn’t comply with the Fedora code of conduct (and I suspect a lot of them were not from actual Fedora community members, who generally are a bit more respectful than what was coming in on that article.)

  4. Kaotisk Hund

    Tiny fix for the article’s title:

    it’s “which” not “who”. “who” is for people.

    • Vercingetorix

      Literary license is still allowed. May I show you to the door?

    • Ellis Low

      I thought about this for a while, and I notice it! I’m curious how this will shake out as more people write about AI and as our relationship to the technology develops. Thanks for pointing it out.

It is easier on me to interface my natural language AIs by referring to them as people. One approach is that I think of them as fictional characters, like Captain Kirk; I refer to him as a “who”, although he is not a real person.

      I like to give my talking AIs a British female persona, I’m sure that has roots in sci fi movies. They are “whos” to me!

• It is easier on me to interface my natural language AIs by referring to them as people.

        Many people consider that to be part of the trap.

        https://www.schneier.com/blog/archives/2023/12/ai-and-trust.html

        • Clippy wasn’t AI but we call him a he? 😬

        • Ellis Low

          too long for me to read that, but as for a trap:

          I’m not so sure real people are more trustworthy than AIs! Perhaps I can bring along the caution and expectation of fallibility I have toward humans when I anthropomorphize AIs. People can be misaligned, they can be wrong, and some will, in certain circumstances, lie intentionally.

          • Darvond

            People can at the very least, realize they are wrong. I can train an LLM wrong as a joke; trivially, and it has no reason or context to know that 2+2=5 is incorrect.

            • Please, go to any social network and consider your answer for a second. I, for one, disagree. There are many people that will not reconsider their positions.

              • Darvond

                I don’t even have to look that far.
                But at least to a person, I can provide a mathematical proof and concept.

                To a computer autoprediction system? It’s just reading words and picking one that looks nice. It does not know context, it has no idea of “idea”.

      • Kaotisk Hund

        Arts are incorporated freely in our daily realities. However, my comment wasn’t about “right grammar” or something relative.

        I see an epidemic in the tech space where a great percentage submits to a massive marketing push which is not contained in the tech community but expands to many aspects of life on Earth. By design, it’s a thing. Promoting the idea of “intelligence”, what really happens is that any form of intelligence is diminished to the limits of the relative A”I” tech.

It’s literally on one side “hey we believe all humans are stupid, here is the clever alternative to replace them” more or less. What hasn’t yet been resolved in a definitive manner is whether we, the people, are going to submit to such gaslighting-driven propaganda or not.

        It’s a massively insulting marketing campaign towards each and every one of us. In the meantime, we talk about freedom of speech while not being able to form a speech. Pushing back to all this, setting the record straight and raising our voices back to the bunch of people that rules our daily lives and demanding even more from us, are ways to say a “stop” at some point.

        Given the whole thing I wrote, hey, it’s a “which”. I can’t be “friends” with it or any other thing/object. It’s not an imaginary thing to me or a character. I don’t feel nostalgic or like being a character of a movie I grew up with. It is, literally a robot, with no intuition, no soul, no essence, no smell.

        As of the trust: we are conditioned to believe that the ones speaking to us through big screens are the ones that care for us, like when they are informing us about the whole fearful things happening around the world. They fed us fear to be afraid of our neighbors. They corrupt everything. And yet, I do question, should I trust my fellow human or the machine that these people made up?

        • hovenx

          AI in general is a massive technology that will increasingly be used by big tech and later also governments to manipulate our way to think and make us dependent.

As the author writes, many people use it for everyday purposes, like before with Google, but on a whole different level due to access to personal information and the ability to relate pieces of information and generate a result that is often better than what most humans can produce, not even mentioning the response time. When you use it, the provider can collect data that reveals everything about you, and you even pay for it.

          If you ask me, it would have been better not invented at all. But since we cannot go back in time, we need to face the reality. AI is not going to disappear, not because of sustainability problems or anything else. The power that comes with it, is just too tempting.

          When smart phones got introduced there were people that resisted that technology for years. But where are they now? Virtually everyone is using them, in every age group. The reason is convenience. Humans are lazy animals, you can lure them into anything if you make something “easier” for them. But closing the eyes while hoping the new stuff will disappear over night is not going to work out.

          Like it or not, it is absolutely necessary to understand AI and make it usable as a tool in our favour. It is important to have open source models that can compete with the corporate ones, keeping also the toolchain open source and traceable. That is the only way we can maintain our privacy and freedom.

  5. I wonder what is the smallest model we can use for this purpose… It would be ideal if it could run on CPU.

    • Ellis Low

      Yes, this is a great question!

      The original blog series is about running on lean hardware with no GPU. CPU only and just 8GB of RAM.

      The model used in the blog is https://huggingface.co/bartowski/microsoft_Phi-4-mini-instruct-GGUF

      We use llama.cpp as the engine, which is optimized for CPU inference
      The model is quantized and stored in GGUF format
      Quantization (in this case Q4_K_M at 2.49GB) lets the model fit in the 8GB RAM and run fast on the CPU

      I don’t know what the actual smallest model would be, but the model selected here works fast enough for chatting on 8GB RAM and no GPU.

  6. Exlumine

    https://arxiv.org/abs/2601.15494

    Think twice, please.

  7. Darvond

    But why?
    Wouldn’t it be more efficient to make a database that is human readable, than relying on a hallucinating, plagiarizing parrot?

    • Mohammad Edhiliya

      the base of human innovation is trying,
      the base of human growth is copying my friend.

      • Darvond

        Rote copying lacks the ideas of innovation, change, and ultimately leads to self-cannibalization.

        We wouldn’t have computers, or much in the way of functional societies if we all copied. We’d still be hunting and gathering rather than planting. And if we all tried to plant the same thing, assuming the same conditions everywhere, we’d starve.

        So no. That saying, so often repeated? As asinine as suggesting the customer is always correct.

    • GreenSuburb

      Why not both? Good human-readable documentation for times when you want to really delve in and get a good understanding of something, and then an LLM chatbot that can help answer quick questions on the fly.

  8. actully i liked this idea, im gonna put it in linkedin and mention u on it, hope u find the best of hope my friend.

  9. Mohammad Edhiliya

    ok i tried it now, its working, quick question, how can i add new info to it like pdf or books so it can read from them, thank u brother for this interesting project.
    i created a symlink called fed-help so it can work.

  10. Chen

    Can this local AI answer questions written by other language? Maybe Chinese

  11. You could check out open-notebook.ai’s repo where I’ve posted a feature request to have it be run and managed via podman: https://github.com/lfnovo/open-notebook/issues/451

  12. soundsmessy

    Hello Ellis,

    it seems like this article might need some revision.

    The repository for the docs has changed, so the git clone command only pulls a folder with the README informing about the move of the repo to fedora forge. I figured it out and was able to pull the quick-docs from the new repos but some people might get stuck at this step.

    Another issue I have right now: I’ve converted the quick-docs to html with the “convert.sh” and checked the directory to confirm that the html files are present. They are. The ingesting and chunking seemed to work fine as well. I tried to embed the db with “uv tool run –python 3.12 docs2db embed”. Now I’m stuck at

    “Processing 127 files using 2 workers”

    And it seems like infinite loading, at 0%. I have no clue how long this is supposed to take. My CPU is doing nothing.

    Did I mess something up?

    Thank you for your time.

    sm

  13. Phillip

    Would like to try this project.

