first commit

2026-05-10 13:14:14 +01:00
commit 4726582379
14 changed files with 854 additions and 0 deletions

README.md

@@ -0,0 +1,115 @@
# null-bot
> A small Telegram bot for extracting and saving opportunities and events from web pages or pasted text. It uses an LLM agent to parse content into structured JSON and stores entries in a local PocketBase instance. The bot is built entirely on open-source tools. The LLM of choice is granite4.1:8b by IBM, released under the Apache 2.0 license.
## Features
- Parse Opportunity (`/op`) and Event (`/ev`) entries from a URL or pasted text
- Two entry types with separate system prompts and JSON schemas (externalized to `prompts.py`)
- Follow-up prompt for pasted text: the bot asks for a source URL only when you save the entry
- Converts date/time to PocketBase-friendly format (`YYYY-MM-DD HH:MM:SS`)
- Retry decorator for robust LLM / network calls
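
The retry behaviour listed above could look roughly like the following. This is a minimal sketch, not the project's actual decorator: the name `retry`, its parameters (`attempts`, `delay`, `backoff`, `exceptions`), and the `flaky_call` demo function are all hypothetical. It is also synchronous; if the bot's calls are coroutines, an async variant would `await` the call and use `asyncio.sleep` instead.

```python
import functools
import time


def retry(attempts=3, delay=1.0, backoff=2.0, exceptions=(Exception,)):
    """Retry the wrapped function on failure (hypothetical helper, not the bot's own)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error
                    time.sleep(wait)
                    wait *= backoff  # wait longer before each further try
        return wrapper
    return decorator


calls = {"count": 0}


@retry(attempts=3, delay=0.01)
def flaky_call():
    """Stand-in for an LLM or network call that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = flaky_call()
```

Restricting `exceptions` to transient errors (timeouts, connection resets) keeps genuine bugs from being retried silently.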
## Requirements
- Python 3.11+ recommended
- See `requirements.txt` for full dependency list
## Setup
1. Clone the repo or copy files to your machine.
2. Create and activate a Python virtual environment:
```bash
python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS / Linux
source .venv/bin/activate
```
3. Install dependencies:
```bash
pip install -r requirements.txt
```
4. Environment variables
- Create a `.env` file in the project root with at minimum:
```
TG_TOKEN=your_telegram_bot_token_here
OLLAMA_BASE_URL=http://localhost:11434/v1
ALLOWED_USERS=1234,5678
POCKETBASE_URL=http://127.0.0.1:8090
POCKETBASE_ADMIN_EMAIL=admin@example.com
POCKETBASE_ADMIN_PASSWORD=secret
```
- Notes:
- `ALLOWED_USERS` should be a comma-separated list of Telegram user IDs (no brackets).
- The bot reads `TG_TOKEN` and `ALLOWED_USERS` from the environment.
5. Ollama (local LLM) setup
- This project uses a local Ollama instance (or any compatible local LLM HTTP API) as the LLM provider. The bot expects an HTTP endpoint available at `OLLAMA_BASE_URL` (default `http://localhost:11434/v1`).
- Quick steps to get Ollama running locally:
1. Install Ollama for your platform by following the official instructions at https://ollama.com/docs (or use the native installer for Windows/macOS/Linux).
2. Pull or install a model you want to use. Example (CLI):
```bash
ollama pull granite4.1:8b
```
3. Start the Ollama HTTP API so the bot can reach it. The desktop installers usually start it in the background; otherwise run it from a terminal:
```bash
# Starts the Ollama server (listens on port 11434 by default)
ollama serve
```
4. Set `OLLAMA_BASE_URL` in your `.env` to point to the running API, for example:
```text
OLLAMA_BASE_URL=http://localhost:11434/v1
```
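5. Verify the API is reachable (example curl; replace `<model-name>` with the model you pulled):
```bash
# .env is not loaded by the shell, so set the variable for this session first
export OLLAMA_BASE_URL=http://localhost:11434/v1
curl -s -X POST "${OLLAMA_BASE_URL}/completions" \
  -H "Content-Type: application/json" \
  -d '{"model":"<model-name>","prompt":"hello","max_tokens":16}'
```
A JSON response containing generated text indicates that the Ollama HTTP API is reachable and can serve model requests.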
- Notes and troubleshooting
- If your Ollama installation exposes a different port or path, update `OLLAMA_BASE_URL` accordingly.
- If you prefer hosted LLMs (OpenAI, Anthropic, Cohere, etc.), `agent.py` can be adapted to use other providers; ensure the provider client is configured and the prompts in `prompts.py` are compatible.
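
To make the environment settings from the setup steps concrete, here is one way they could be read and validated at startup. This is a sketch under assumptions: `parse_allowed_users` and `load_settings` are hypothetical names, the variables are assumed to already be in the process environment (for example, loaded from `.env` by a dotenv library), and the PocketBase default is simply the example value from the `.env` snippet, not a documented default.

```python
import os


def parse_allowed_users(raw):
    """Turn a comma-separated string like "1234,5678" into a set of integer IDs."""
    return {int(part) for part in raw.split(",") if part.strip()}


def load_settings(env=None):
    """Collect the settings listed in `.env` (hypothetical helper)."""
    env = os.environ if env is None else env
    token = env.get("TG_TOKEN")
    if not token:
        raise RuntimeError("TG_TOKEN is missing")
    return {
        "tg_token": token,
        "allowed_users": parse_allowed_users(env.get("ALLOWED_USERS", "")),
        # Default taken from the README; override it if Ollama runs elsewhere.
        "ollama_base_url": env.get("OLLAMA_BASE_URL", "http://localhost:11434/v1"),
        # Assumed fallback: the example value from the `.env` snippet.
        "pocketbase_url": env.get("POCKETBASE_URL", "http://127.0.0.1:8090"),
    }


# Demo with an explicit mapping instead of the real environment.
settings = load_settings({"TG_TOKEN": "dummy-token", "ALLOWED_USERS": "1234, 5678"})
```

Failing fast on a missing `TG_TOKEN` gives a clearer error than a rejected Telegram request later on.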
## Running the bot
Start the bot with the project's entrypoint (example):
```bash
python bot.py
```
The bot listens for commands:
- `/op <url or paste>` — parse an opportunity
- `/ev <url or paste>` — parse an event
If you paste text instead of sending a URL, the bot parses it as usual. When you click Save, it asks for a source URL; reply with the URL or send `/skip` to skip this step.
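
The URL-versus-paste decision described above can be sketched as follows. This is illustrative only and may differ from what `bot.py` actually does: `classify_input` and `needs_source_prompt` are hypothetical names, and the rule used here (a single `http`/`https` token with no whitespace counts as a URL) is an assumption.

```python
from urllib.parse import urlparse


def classify_input(argument):
    """Return ("url", value) or ("text", value) for a command argument (hypothetical helper)."""
    value = argument.strip()
    parsed = urlparse(value)
    is_url = (
        parsed.scheme in ("http", "https")
        and bool(parsed.netloc)
        and not any(ch.isspace() for ch in value)
    )
    return ("url" if is_url else "text", value)


def needs_source_prompt(kind):
    """Pasted text has no source yet, so saving should ask for a URL (or /skip)."""
    return kind == "text"


kind_a, _ = classify_input("https://example.com/call-for-papers")
kind_b, _ = classify_input("Hackathon on 12 June, apply at https://example.com")
```

Text that merely contains a link, as in the second call, is treated as a paste, so the follow-up prompt still applies to it.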
## How it works (high-level)
- `agent.py` uses `pydantic-ai` + a local LLM provider (e.g. Ollama) and system prompts from `prompts.py` to parse pages/text into structured JSON.
- `database.py` converts datetime fields and uploads the entry to the appropriate PocketBase collection (`events` or `opportunities`).
- `bot.py` handles Telegram interactions, queues parse tasks, and preserves per-user state in `context.user_data`.
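
As an illustration of the datetime conversion step, the sketch below turns ISO 8601 strings into the `YYYY-MM-DD HH:MM:SS` format. It is not the project's `convert_datetime_to_pocketbase()`; the function name `to_pocketbase_datetime`, the assumption that the LLM emits ISO 8601 strings, and the choice to normalise timezone-aware values to UTC are all hypothetical.

```python
from datetime import datetime, timezone


def to_pocketbase_datetime(value):
    """Convert an ISO 8601 string to `YYYY-MM-DD HH:MM:SS`, or None if unparseable."""
    if not value:
        return None
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        return None
    if parsed.tzinfo is not None:
        # Assumption: timestamps are stored in UTC, so normalise aware values first.
        parsed = parsed.astimezone(timezone.utc)
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


examples = [
    to_pocketbase_datetime("2026-06-12T18:30:00"),        # naive timestamp
    to_pocketbase_datetime("2026-06-12T18:30:00+02:00"),  # aware timestamp
    to_pocketbase_datetime("2026-06-12"),                 # date only
    to_pocketbase_datetime("next Friday"),                # not ISO 8601
]
```

Returning `None` for unparseable input lets the caller decide whether to leave the field empty or ask the user again.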
## Troubleshooting
- If dates show as `None` after save: verify PocketBase field names (`datetime` for events, `deadline` for opportunities) and ensure `.env` is configured.
- If the bot doesn't start: check `TG_TOKEN` is present and valid.
- If parsing fails or you see unexpected behavior: check the console logs for debug messages from `convert_datetime_to_pocketbase()` and `upload_entry()`.