# null-bot

> A small Telegram bot that extracts opportunities and events from web pages or pasted text and saves them. An LLM agent parses the content into structured JSON, and entries are stored in a local PocketBase instance. The bot is built entirely on open-source tools; the LLM of choice is granite4.1:8b by IBM, released under the Apache 2.0 license.

## Features

- Parse Opportunity (`/op`) and Event (`/ev`) entries from a URL or pasted text
- Two entry types with separate system prompts and JSON schemas (externalized to `prompts.py`)
- Follow-up prompt for pasted text: the bot asks for a source URL only when you save
- Converts dates and times to a PocketBase-friendly format (`YYYY-MM-DD HH:MM:SS`)
- Retry decorator for robust LLM and network calls

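
To illustrate the retry feature listed above, here is a minimal sketch of a retry decorator with exponential backoff. It is not the project's actual implementation: the name `retry` and its parameters (`attempts`, `delay`, `backoff`, `exceptions`) are assumptions made for this example, and the sketch is synchronous, whereas the bot's own decorator may wrap async calls.

```python
import functools
import time


def retry(attempts=3, delay=1.0, backoff=2.0, exceptions=(Exception,)):
    """Retry the wrapped function on failure (hypothetical helper, not the project's API)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    # Out of attempts: let the last error propagate to the caller.
                    if attempt == attempts:
                        raise
                    time.sleep(wait)
                    wait *= backoff  # wait longer before each further attempt
        return wrapper
    return decorator
```

A function decorated with `@retry(attempts=3, delay=1.0)` would be called up to three times, sleeping 1 s and then 2 s between attempts, before the final exception is re-raised.
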
## Requirements

- Python 3.11+ recommended
- See `requirements.txt` for the full dependency list

## Setup

1. Clone the repo or copy the files to your machine.
2. Create and activate a Python virtual environment:

   ```bash
   python -m venv .venv
   # Windows
   .venv\Scripts\activate
   # macOS / Linux
   source .venv/bin/activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Environment variables

   - Create a `.env` file in the project root with at minimum:

     ```
     TG_TOKEN=your_telegram_bot_token_here
     OLLAMA_BASE_URL=http://localhost:11434/v1
     ALLOWED_USERS=1234,5678
     POCKETBASE_URL=http://127.0.0.1:8090
     POCKETBASE_ADMIN_EMAIL=admin@example.com
     POCKETBASE_ADMIN_PASSWORD=secret
     ```

   - Notes:
     - `ALLOWED_USERS` should be a comma-separated list of Telegram user IDs (no brackets).
     - `ALLOWED_USERS` acts as the bootstrap allowlist; the bot also checks the PocketBase `Telegram` collection for persisted access.
     - The bot reads `TG_TOKEN` and `ALLOWED_USERS` from the environment.

5. Ollama (local LLM) setup

   - This project uses a local Ollama instance (or any compatible local LLM HTTP API) as the LLM provider. The bot expects an HTTP endpoint at `OLLAMA_BASE_URL` (default `http://localhost:11434/v1`).

   - Quick steps to get Ollama running locally:

     1. Install Ollama for your platform by following the official instructions at https://ollama.com/docs, or use the native installer for Windows, macOS, or Linux.

     2. Pull the model you want to use. Example (CLI):

        ```bash
        ollama pull granite4.1:8b
        ```

     3. Start the Ollama server so the bot can reach its HTTP API. If your installation does not already run it in the background, start it manually:

        ```bash
        # example command; consult the Ollama docs if your installation differs
        ollama serve
        ```

     4. Set `OLLAMA_BASE_URL` in your `.env` to point to the running API, for example:

        ```text
        OLLAMA_BASE_URL=http://localhost:11434/v1
        ```

     5. Verify the API is reachable (example curl):

        ```bash
        # assumes OLLAMA_BASE_URL is set in your shell, e.g.
        # export OLLAMA_BASE_URL=http://localhost:11434/v1
        curl -s -X POST "${OLLAMA_BASE_URL}/completions" \
          -H "Content-Type: application/json" \
          -d '{"model":"<model-name>","prompt":"hello","max_tokens":16}'
        ```

        A successful response indicates that your Ollama HTTP API is reachable and can serve model requests.

   - Notes and troubleshooting:
     - If your Ollama installation exposes a different port or path, update `OLLAMA_BASE_URL` accordingly.
     - If you prefer a hosted LLM (OpenAI, Anthropic, Cohere, etc.), `agent.py` can be adapted to use another provider; make sure the provider client is configured and the prompts in `prompts.py` are compatible.

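
The `ALLOWED_USERS` format from step 4 (comma-separated Telegram user IDs, no brackets) can be parsed along these lines. This is a sketch only: `parse_allowed_users` is a hypothetical helper name, and the bot's real parsing code may differ, for example in how it treats invalid entries.

```python
from typing import Set


def parse_allowed_users(raw: str) -> Set[int]:
    """Parse a comma-separated list of Telegram user IDs, e.g. "1234,5678".

    Hypothetical helper for illustration; not taken from the project source.
    """
    users = set()
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing commas and stray whitespace
        try:
            users.add(int(part))
        except ValueError:
            # Brackets or other non-numeric characters end up here.
            raise ValueError(f"ALLOWED_USERS entry is not a numeric ID: {part!r}") from None
    return users
```

Called as `parse_allowed_users(os.environ.get("ALLOWED_USERS", ""))`, this yields an empty set when the variable is unset and fails loudly on a value such as `[1234,5678]`, which is why the notes in step 4 ask for no brackets.
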
## Running the bot

Start the bot with the project's entrypoint (example):

```bash
python main.py
```

The bot listens for these commands:

- `/add <id>`: grant a Telegram user ID access through the `Telegram` collection
- `/op <url or paste>`: parse an opportunity
- `/ev <url or paste>`: parse an event

If you send a URL directly in chat, the bot uses buttons to ask whether to process it as an event or an opportunity.

If you paste text instead of sending a URL, the bot parses it; when you click Save, it prompts you for a source URL (or you can send `/skip`).

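
The behavior above depends on telling a bare URL apart from pasted text. A minimal sketch of that check, using only the standard library, might look like the following; `is_url` is a hypothetical name, and the project's own detection logic may be stricter or looser.

```python
from urllib.parse import urlparse


def is_url(message: str) -> bool:
    """Return True if the whole message is a single http(s) URL.

    Hypothetical helper for illustration; not taken from the project source.
    """
    text = message.strip()
    # Pasted text usually contains whitespace; a bare URL does not.
    if not text or any(ch.isspace() for ch in text):
        return False
    try:
        parsed = urlparse(text)
    except ValueError:
        return False  # malformed input such as an unclosed IPv6 bracket
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```

With a check like this, a message for which `is_url` returns `True` would go down the URL path, and anything else would be treated as pasted text that still needs a source URL on save.
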
## How it works (high-level)

- `agent.py` uses `pydantic-ai` with a local LLM provider (e.g. Ollama) and the system prompts from `prompts.py` to parse pages or text into structured JSON.
- `database.py` converts datetime fields and uploads the entry to the appropriate PocketBase collection (`events` or `opportunities`).
- `bot.py` handles Telegram interactions, queues parse tasks, and preserves per-user state in `context.user_data`.

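
As an illustration of the datetime conversion step, the sketch below turns an ISO 8601 string into the `YYYY-MM-DD HH:MM:SS` format. It is not the project's `convert_datetime_to_pocketbase()`: the name `to_pocketbase_datetime` is hypothetical, and two behaviors are assumptions made for this example, namely that the LLM emits ISO 8601 strings and that timezone-aware values should be normalized to UTC.

```python
from datetime import datetime, timezone
from typing import Optional


def to_pocketbase_datetime(value: str) -> Optional[str]:
    """Convert an ISO 8601 string to 'YYYY-MM-DD HH:MM:SS', or None if unparseable.

    Hypothetical helper for illustration; not taken from the project source.
    """
    text = (value or "").strip()
    if not text:
        return None
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(text)
    except ValueError:
        return None
    # Assumption: timezone-aware values are normalized to UTC before formatting.
    if parsed.tzinfo is not None:
        parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return parsed.strftime("%Y-%m-%d %H:%M:%S")
```

Returning `None` for input that cannot be parsed keeps a bad date from blocking the upload, at the cost of saving the entry without that field.
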
## Troubleshooting

- If dates show as `None` after saving, verify the PocketBase field names (`datetime` for events, `deadline` for opportunities) and make sure `.env` is configured.
- If the bot doesn't start, check that `TG_TOKEN` is present and valid.
- If parsing fails or you see unexpected behavior, check the console logs for debug messages from `convert_datetime_to_pocketbase()` and `upload_entry()`.