2026-03-11 16:03:23 +00:00
2026-03-11 15:55:46 +00:00
2026-03-11 15:55:46 +00:00
2026-03-11 15:55:46 +00:00
2026-03-11 16:03:23 +00:00
2026-03-11 16:03:23 +00:00
2026-03-11 15:15:09 +00:00
2026-03-11 15:26:20 +00:00
2026-03-11 16:03:23 +00:00
2026-03-11 15:15:09 +00:00
2026-03-11 15:26:20 +00:00
2026-03-11 15:55:46 +00:00

incnot

A local-first therapy chatbot that simulates the conversational style of a single IRC user using historical IRC logs, retrieval over prior messages, and Ollama for local inference.

Status

Early scaffold. See PLANS.md for roadmap and AGENTS.md for project rules.

Core architecture

  1. Parse IRC logs.
  2. Clean and redact data.
  3. Build examples centered on one target nick.
  4. Build retrieval over historical messages.
  5. Use Ollama to generate grounded persona replies.

Principles

  • disclosed simulation, not identity claim
  • retrieval before fine-tuning
  • local-first
  • simple, testable Python modules

Required models

  • Chat model (IRCBOT_CHAT_MODEL): example llama3.1:8b
  • Embedding model (IRCBOT_EMBED_MODEL): example nomic-embed-text

Environment variables

  • OLLAMA_URL: Ollama base URL (default: http://127.0.0.1:11434)
  • IRCBOT_CHAT_MODEL: default chat model for chat_cli.py
  • IRCBOT_EMBED_MODEL: default embedding model for retrieval/indexing
  • IRCBOT_INDEX_PATH: retrieval index path (default: data/index/examples.index.jsonl)
  • IRCBOT_TOP_K: retrieved examples per turn (default: 5)
  • IRCBOT_CONTEXT_LINES: recent channel lines kept in chat context (default: 8)
  • IRCBOT_SYSTEM_PROMPT_FILE: optional path to custom system prompt text
  • IRCBOT_OLLAMA_TIMEOUT: Ollama HTTP timeout seconds (default: 30)

CLI tools

  • scripts/parse_irc_logs.py: parse raw IRC logs into structured JSONL records.
  • scripts/build_examples.py: extract target-user reply examples with context windows.
  • scripts/build_embeddings.py: build local retrieval index vectors from examples.
  • scripts/query_index.py: query local retrieval index and return top-k matches.
  • scripts/chat_cli.py: interactive local chat with retrieval-grounded prompting.
  • scripts/evaluate_persona.py: run held-out evaluation and write markdown report.

CLI usage

1) Parse logs

PYTHONPATH=src python scripts/parse_irc_logs.py data/raw/irc.log --output data/processed/parsed_logs.jsonl

Include join/part/quit/nick-change events:

PYTHONPATH=src python scripts/parse_irc_logs.py data/raw/irc.log --include-events

2) Build examples

PYTHONPATH=src python scripts/build_examples.py \
  --input data/processed/parsed_logs.jsonl \
  --output data/processed/examples.jsonl \
  --target-nick alice \
  --context-window 5

3) Build retrieval index

PYTHONPATH=src python scripts/build_embeddings.py \
  --input data/processed/examples.jsonl \
  --output data/index/examples.index.jsonl \
  --model nomic-embed-text

4) Query retrieval index

PYTHONPATH=src python scripts/query_index.py \
  --index data/index/examples.index.jsonl \
  --query "how do I set up tmux?" \
  --top-k 3 \
  --model nomic-embed-text

5) Interactive chat

PYTHONPATH=src python scripts/chat_cli.py \
  --index data/index/examples.index.jsonl \
  --chat-model llama3.1:8b \
  --embed-model nomic-embed-text

Show assembled prompt layers before each model call:

PYTHONPATH=src python scripts/chat_cli.py --show-prompt

If Ollama is not running, CLI tools print a clear startup hint (for example: run ollama serve).

6) Evaluate persona on held-out examples

PYTHONPATH=src python scripts/evaluate_persona.py \
  --input data/processed/examples.jsonl \
  --target-nick alice \
  --report docs/evaluation.md

Development workflow

  1. Create a git checkpoint.
  2. Ask Codex to implement one phase at a time.
  3. Run tests.
  4. Review outputs.
  5. Commit.
Description
A therapy chat bot
Readme ISC 16 MiB
Languages
Python 100%