incnot
A local-first therapy chatbot that simulates the conversational style of a single IRC user using historical IRC logs, retrieval over prior messages, and Ollama for local inference.
Status
Early scaffold. See PLANS.md for roadmap and AGENTS.md for project rules.
Core architecture
- Parse IRC logs.
- Clean and redact data.
- Build examples centered on one target nick.
- Build retrieval over historical messages.
- Use Ollama to generate grounded persona replies.
Principles
- disclosed simulation, not identity claim
- retrieval before fine-tuning
- local-first
- simple, testable Python modules
Required models
- Chat model (IRCBOT_CHAT_MODEL): for example, llama3.1:8b
- Embedding model (IRCBOT_EMBED_MODEL): for example, nomic-embed-text
Environment variables
- OLLAMA_URL: Ollama base URL (default: http://127.0.0.1:11434)
- IRCBOT_CHAT_MODEL: default chat model for chat_cli.py
- IRCBOT_EMBED_MODEL: default embedding model for retrieval/indexing
- IRCBOT_INDEX_PATH: retrieval index path (default: data/index/examples.index.jsonl)
- IRCBOT_TOP_K: retrieved examples per turn (default: 5)
- IRCBOT_CONTEXT_LINES: recent channel lines kept in chat context (default: 8)
- IRCBOT_SYSTEM_PROMPT_FILE: optional path to custom system prompt text
- IRCBOT_OLLAMA_TIMEOUT: Ollama HTTP timeout in seconds (default: 30)
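As a rough illustration of how these variables might be read, here is a minimal sketch. The Settings class and load_settings function are hypothetical names, not the project's actual code. The chat and embedding models are left unset when the variables are missing, since this README lists only example values for them.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the environment variables listed above."""
    ollama_url: str
    chat_model: Optional[str]
    embed_model: Optional[str]
    index_path: str
    top_k: int
    context_lines: int
    system_prompt_file: Optional[str]
    ollama_timeout: float


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    return Settings(
        # Strip a trailing slash so paths can be appended safely.
        ollama_url=env.get("OLLAMA_URL", "http://127.0.0.1:11434").rstrip("/"),
        # No default is documented for the models, so they stay None if unset.
        chat_model=env.get("IRCBOT_CHAT_MODEL"),
        embed_model=env.get("IRCBOT_EMBED_MODEL"),
        index_path=env.get("IRCBOT_INDEX_PATH", "data/index/examples.index.jsonl"),
        top_k=int(env.get("IRCBOT_TOP_K", "5")),
        context_lines=int(env.get("IRCBOT_CONTEXT_LINES", "8")),
        system_prompt_file=env.get("IRCBOT_SYSTEM_PROMPT_FILE"),
        ollama_timeout=float(env.get("IRCBOT_OLLAMA_TIMEOUT", "30")),
    )
```

Passing the environment as a mapping keeps the function easy to test: load_settings({}) yields the defaults without touching the real process environment.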
CLI tools
- scripts/parse_irc_logs.py: parse raw IRC logs into structured JSONL records.
- scripts/build_examples.py: extract target-user reply examples with context windows.
- scripts/build_embeddings.py: build local retrieval index vectors from examples.
- scripts/query_index.py: query local retrieval index and return top-k matches.
- scripts/chat_cli.py: interactive local chat with retrieval-grounded prompting.
- scripts/evaluate_persona.py: run held-out evaluation and write markdown report.
CLI usage
1) Parse logs
PYTHONPATH=src python scripts/parse_irc_logs.py data/raw/irc.log --output data/processed/parsed_logs.jsonl
Include join/part/quit/nick-change events:
PYTHONPATH=src python scripts/parse_irc_logs.py data/raw/irc.log --include-events
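To give a feel for what the parsing step does, here is a minimal sketch of turning one log line into a structured record. It assumes a log format like `[2024-01-01 12:00:00] <alice> hello`; the real parser may support other formats. The record field names (type, timestamp, nick, text) and the parse_line function are illustrative assumptions, not the project's actual schema.

```python
import json
import re
from typing import Optional

# Assumed line format: "[timestamp] <nick> message text".
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+<(?P<nick>[^>]+)>\s?(?P<text>.*)$")


def parse_line(line: str) -> Optional[dict]:
    """Return a message record, or None if the line is not a chat message."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None  # join/part/quit lines and noise are skipped in this sketch
    return {
        "type": "message",
        "timestamp": match["ts"],
        # Drop channel-mode prefixes such as "@" (op) or "+" (voice).
        "nick": match["nick"].lstrip("@+"),
        "text": match["text"],
    }


def to_jsonl(record: dict) -> str:
    """Serialize one record as a single JSONL line (without the newline)."""
    return json.dumps(record, ensure_ascii=False)
```

Lines that do not match, such as join or quit notices, return None here; handling them would be the job of something like the --include-events flag.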
2) Build examples
PYTHONPATH=src python scripts/build_examples.py \
--input data/processed/parsed_logs.jsonl \
--output data/processed/examples.jsonl \
--target-nick alice \
--context-window 5
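The idea of a context window can be sketched as follows: each message from the target nick becomes a reply example, paired with the few messages that preceded it. This is an illustrative sketch only. The build_examples function, the record shape, and the choice to skip replies with no preceding context are assumptions.

```python
from collections import deque
from typing import Dict, Iterable, Iterator


def build_examples(
    records: Iterable[Dict[str, str]],
    target_nick: str,
    context_window: int = 5,
) -> Iterator[dict]:
    """Yield {"context": [...], "reply": record} for each target-nick message."""
    if context_window < 1:
        raise ValueError("context_window must be at least 1")
    # The deque keeps only the most recent `context_window` messages.
    history: deque = deque(maxlen=context_window)
    for record in records:
        # Assumption: a reply with no preceding messages is not a useful example.
        if record["nick"] == target_nick and history:
            yield {"context": list(history), "reply": record}
        history.append(record)
```

Note that the target's own earlier messages remain part of the context in this sketch, so a follow-up reply sees what the target said just before.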
3) Build retrieval index
PYTHONPATH=src python scripts/build_embeddings.py \
--input data/processed/examples.jsonl \
--output data/index/examples.index.jsonl \
--model nomic-embed-text
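For orientation, an index build along these lines could embed each example's context and write one JSON object per line. The sketch below is an assumption about the approach, not the project's code. The row fields (text, reply, vector) and the function names are hypothetical, and the example shape is likewise assumed. The ollama_embed helper assumes Ollama's /api/embeddings endpoint, which takes a model and a prompt and returns an embedding list; check the Ollama API documentation for your installed version.

```python
import json
import urllib.request
from typing import Callable, Iterable, Iterator, List


def ollama_embed(
    text: str,
    model: str = "nomic-embed-text",
    base_url: str = "http://127.0.0.1:11434",
    timeout: float = 30.0,
) -> List[float]:
    """Request one embedding from a local Ollama server (assumed endpoint)."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["embedding"]


def index_lines(
    examples: Iterable[dict],
    embed: Callable[[str], List[float]],
) -> Iterator[str]:
    """Yield one JSONL index row per example, embedding its context text."""
    for example in examples:
        text = "\n".join(f"<{r['nick']}> {r['text']}" for r in example["context"])
        row = {"text": text, "reply": example["reply"]["text"], "vector": embed(text)}
        yield json.dumps(row, ensure_ascii=False)
```

Taking the embedding function as a parameter means the index logic can be exercised with a stub, with no Ollama server required.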
4) Query retrieval index
PYTHONPATH=src python scripts/query_index.py \
--index data/index/examples.index.jsonl \
--query "how do I set up tmux?" \
--top-k 3 \
--model nomic-embed-text
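Top-k retrieval over an index like this usually comes down to scoring each stored vector against the query vector and keeping the best matches. The sketch below uses cosine similarity, which is an assumption: this README does not say which similarity measure query_index.py uses. The row shape and function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vector: Sequence[float],
    rows: List[Dict],
    k: int = 5,
) -> List[Tuple[float, Dict]]:
    """Return the k rows most similar to the query, best first, with scores."""
    scored = [(cosine(query_vector, row["vector"]), row) for row in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

A linear scan like this is simple and adequate for a small JSONL index; a larger corpus would likely call for a vector library or an approximate-nearest-neighbor index.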
5) Interactive chat
PYTHONPATH=src python scripts/chat_cli.py \
--index data/index/examples.index.jsonl \
--chat-model llama3.1:8b \
--embed-model nomic-embed-text
Show assembled prompt layers before each model call:
PYTHONPATH=src python scripts/chat_cli.py --show-prompt
If Ollama is not running, CLI tools print a clear startup hint (for example: run ollama serve).
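A reachability check of this kind might look like the sketch below. It assumes Ollama's /api/tags endpoint (the model listing) as a cheap probe. The function name and the hint wording are illustrative and will not match the tools' actual output.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def ollama_startup_hint(
    base_url: str = "http://127.0.0.1:11434",
    timeout: float = 30.0,
    opener: Callable = urllib.request.urlopen,
) -> Optional[str]:
    """Return None if Ollama answers, otherwise a human-readable hint."""
    try:
        with opener(f"{base_url}/api/tags", timeout=timeout):
            return None
    except urllib.error.HTTPError:
        # The server responded, even if with an error status, so it is running.
        return None
    except OSError as exc:
        # URLError (connection refused, DNS failure) and timeouts are OSErrors.
        return (
            f"Could not reach Ollama at {base_url} ({exc}). "
            "Is it running? Try: ollama serve"
        )
```

The opener parameter exists only so the check can be tested without a network connection; callers would normally leave it at its default.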
6) Evaluate persona on held-out examples
PYTHONPATH=src python scripts/evaluate_persona.py \
--input data/processed/examples.jsonl \
--target-nick alice \
--report docs/evaluation.md
Development workflow
- Create a git checkpoint.
- Ask Codex to implement one phase at a time.
- Run tests.
- Review outputs.
- Commit.