first commit

2026-05-10 13:14:14 +01:00
commit 4726582379
14 changed files with 854 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -0,0 +1,115 @@
+# null-bot
+
+> A small Telegram bot for extracting and saving opportunities and events from web pages or pasted text. It uses an LLM agent to parse content into structured JSON and stores entries in a local PocketBase instance. The bot is built entirely on open-source tools. The LLM of choice is IBM's `granite4.1:8b`, released under the Apache 2.0 license.
+
+## Features
+- Parse Opportunity (`/op`) and Event (`/ev`) entries from a URL or pasted text
+- Two entry types with separate system prompts and JSON schemas (externalized to `prompts.py`)
+- Follow-up prompt for pasted text: the bot asks for a source URL only when the entry is saved
+- Converts date/time to PocketBase-friendly format (`YYYY-MM-DD HH:MM:SS`)
+- Retry decorator for robust LLM / network calls
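+
+The retry behaviour listed above could look roughly like the sketch below. The decorator name `retry` and its `attempts` and `delay` parameters are assumptions made for illustration, not the project's actual helper, which may well be async.
+
+```python
+import functools
+import time
+
+
+def retry(attempts: int = 3, delay: float = 1.0):
+    """Retry the wrapped function up to `attempts` times, sleeping `delay` seconds between tries."""
+    def decorator(func):
+        @functools.wraps(func)
+        def wrapper(*args, **kwargs):
+            for attempt in range(1, attempts + 1):
+                try:
+                    return func(*args, **kwargs)
+                except Exception:
+                    # Re-raise once the final attempt has failed.
+                    if attempt == attempts:
+                        raise
+                    time.sleep(delay)
+        return wrapper
+    return decorator
+```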
+
+## Requirements
+- Python 3.11+ recommended
+- See `requirements.txt` for full dependency list
+
+## Setup
+1. Clone the repo or copy files to your machine.
+2. Create and activate a Python virtual environment:
+
+```bash
+python -m venv .venv
+# Windows
+.venv\Scripts\activate
+# macOS / Linux
+source .venv/bin/activate
+```
+
+3. Install dependencies:
+
+```bash
+pip install -r requirements.txt
+```
+
+4. Environment variables
+- Create a `.env` file in the project root with at minimum:
+
+```text
+TG_TOKEN=your_telegram_bot_token_here
+OLLAMA_BASE_URL=http://localhost:11434/v1
+ALLOWED_USERS=1234,5678
+POCKETBASE_URL=http://127.0.0.1:8090
+POCKETBASE_ADMIN_EMAIL=admin@example.com
+POCKETBASE_ADMIN_PASSWORD=secret
+```
+
+- Notes:
+  - `ALLOWED_USERS` should be a comma-separated list of Telegram user IDs (no brackets).
+  - The bot reads `TG_TOKEN` and `ALLOWED_USERS` from the environment.
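+
+To illustrate the expected format, here is a minimal sketch of how a comma-separated `ALLOWED_USERS` value could be turned into a set of integer IDs. The helper name `parse_allowed_users` is hypothetical and is not taken from `bot.py`; the actual parsing there may differ.
+
+```python
+import os
+
+
+def parse_allowed_users(raw: str) -> set[int]:
+    """Parse a comma-separated string of Telegram user IDs into a set of ints."""
+    # Skip empty items such as the one left by a trailing comma.
+    return {int(part) for part in raw.split(",") if part.strip()}
+
+
+# Falls back to an empty set (no allowed users) when the variable is unset.
+allowed_users = parse_allowed_users(os.environ.get("ALLOWED_USERS", ""))
+```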
+
+5. Ollama (local LLM) setup
+
+- This project uses a local Ollama instance (or any other local server that exposes an OpenAI-compatible HTTP API) as the LLM provider. The bot expects the endpoint at `OLLAMA_BASE_URL` (default `http://localhost:11434/v1`).
+
+- Quick steps to get Ollama running locally:
+
+  1. Install Ollama for your platform — follow the official instructions: https://ollama.com/docs (or use the native installer for Windows/macOS/Linux).
+
+  2. Pull or install a model you want to use. Example (CLI):
+
+  ```bash
+  ollama pull granite4.1:8b
+  ```
+
+  3. Start the Ollama HTTP API so the bot can reach it:
+
+  ```bash
+  # Start the Ollama server (listens on localhost:11434 by default).
+  # The desktop app and the Linux systemd service usually start it for you;
+  # in that case this command reports that the address is already in use.
+  ollama serve
+  ```
+
+  4. Set `OLLAMA_BASE_URL` in your `.env` to point to the running API, for example:
+
+  ```text
+  OLLAMA_BASE_URL=http://localhost:11434/v1
+  ```
+
+  5. Verify the API is reachable (example curl; export `OLLAMA_BASE_URL` in your shell first, since the shell does not read `.env`):
+
+  ```bash
+  curl -s -X POST "${OLLAMA_BASE_URL}/completions" \
+    -H "Content-Type: application/json" \
+    -d '{"model":"<model-name>","prompt":"hello","max_tokens":16}'
+  ```
+
+  A successful response indicates your Ollama HTTP API is reachable and can serve model requests.
+
+- Notes and troubleshooting
+  - If your Ollama installation exposes a different port or path, update `OLLAMA_BASE_URL` accordingly.
+  - If you prefer hosted LLMs (OpenAI, Anthropic, Cohere, etc.), `agent.py` can be adapted to use other providers; ensure the provider client is configured and the prompts in `prompts.py` are compatible.
+
+## Running the bot
+
+Start the bot with the project's entrypoint (example):
+
+```bash
+python bot.py
+```
+
+The bot listens for commands:
+- `/op <url or paste>` — parse an opportunity
+- `/ev <url or paste>` — parse an event
+
+If you paste text instead of sending a URL, the bot parses it and, when you click Save, prompts you for a source URL; send `/skip` to save without one.
+
+## How it works (high-level)
+- `agent.py` uses `pydantic-ai` + a local LLM provider (e.g. Ollama) and system prompts from `prompts.py` to parse pages/text into structured JSON.
+- `database.py` converts datetime fields and uploads the entry to the appropriate PocketBase collection (`events` or `opportunities`).
+- `bot.py` handles Telegram interactions, queues parse tasks, and preserves per-user state in `context.user_data`.
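+
+As a rough sketch of the date conversion step, the function below turns an ISO 8601 string into the `YYYY-MM-DD HH:MM:SS` form. The name `to_pocketbase_datetime` is hypothetical; the project's own `convert_datetime_to_pocketbase()` may accept other input formats and treat time zones differently.
+
+```python
+from datetime import datetime
+
+
+def to_pocketbase_datetime(value: str) -> str:
+    """Convert an ISO 8601 string to the `YYYY-MM-DD HH:MM:SS` form."""
+    # Accepts inputs such as "2026-05-10T13:14:14" or a bare date.
+    parsed = datetime.fromisoformat(value)
+    # Any UTC offset in the input is dropped here, not converted.
+    return parsed.strftime("%Y-%m-%d %H:%M:%S")
+```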
+
+## Troubleshooting
+- If dates show as `None` after save: verify PocketBase field names (`datetime` for events, `deadline` for opportunities) and ensure `.env` is configured.
+- If the bot doesn't start: check `TG_TOKEN` is present and valid.
+- If parsing fails or you see unexpected behavior, check the console logs for debug messages from `convert_datetime_to_pocketbase()` and `upload_entry()`.