dtek-parse
A Rust library and set of bots for parsing power outage schedules from the DTEK website.
What's included
| Binary | Purpose |
|---|---|
| `dtek-cli` | Command-line tool for one-off lookups |
| `dtek-telegram-bot` | Telegram bot with subscriptions |
| `dtek-discord-bot` | Discord bot with slash commands |
| `dtek-schedule-service` | Background service writing to a shared SQLite DB |
Quick example
use dtek_parse::DTEKParser;

fn main() -> anyhow::Result<()> {
    let mut parser = DTEKParser::new()?;

    // All groups in one HTTP request
    let schedules = parser.get_all_schedules()?;

    for (group, data) in &schedules {
        println!("{}: {} days", group, data.schedules.len());
    }
    Ok(())
}
Key design decisions
- Single HTTP request: `get_all_schedules()` fetches every group at once.
- curl-impersonate: bypasses Incapsula bot protection without a real browser.
- Shared DB: optional `SCHEDULE_DB_URL` lets one `dtek-schedule-service` process serve both bots, eliminating duplicate requests.
- In-memory cache: both bots cache schedules for `CACHE_DURATION_MINUTES` (default 30 min) and only hit DTEK (or the shared DB) on a miss.
- Persistent change detection: the Telegram bot saves schedule snapshots to SQLite so change notifications survive restarts.
Installation
Prerequisites
| Requirement | Notes |
|---|---|
| Rust 1.80+ | Edition 2024 (note that the 2024 edition itself needs Rust 1.85 or newer) |
| `curl` | Required for fetching DTEK pages |
| `curl-impersonate` | Recommended: greatly improves bypass reliability |
| SQLite | Bundled via libsqlite3-sys (no system install needed) |
curl-impersonate (recommended)
The parser uses curl-impersonate to spoof a real browser TLS fingerprint,
bypassing Incapsula/Reese84 protection on the DTEK website.
# Debian / Ubuntu
wget https://github.com/lwthiker/curl-impersonate/releases/latest/download/curl-impersonate-chrome.x86_64-linux-gnu.tar.gz
tar xf curl-impersonate-chrome.x86_64-linux-gnu.tar.gz -C /usr/local/bin/
The library auto-detects the following paths, in order:
1. `/usr/local/bin/curl_chrome116`
2. `/usr/local/bin/curl_chrome120`
3. `/usr/bin/curl-impersonate-chrome`
4. `curl` (standard, fallback)
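
The lookup order above can be sketched as a first-match search over a fixed candidate list. This is a minimal illustration rather than the library's actual code: `pick_curl_binary` and `CANDIDATES` are hypothetical names, and the existence check is passed in as a closure so the logic can be exercised without touching the filesystem.

```rust
use std::path::Path;

/// Candidate binaries in the documented order. The final fallback, plain
/// `curl`, is resolved through PATH instead of being checked on disk.
const CANDIDATES: &[&str] = &[
    "/usr/local/bin/curl_chrome116",
    "/usr/local/bin/curl_chrome120",
    "/usr/bin/curl-impersonate-chrome",
];

/// Returns the first candidate for which `exists` is true, else `curl`.
/// (Hypothetical helper, not part of the dtek-parse API.)
fn pick_curl_binary(exists: impl Fn(&str) -> bool) -> &'static str {
    CANDIDATES
        .iter()
        .copied()
        .find(|&path| exists(path))
        .unwrap_or("curl")
}

fn main() {
    let chosen = pick_curl_binary(|path| Path::new(path).exists());
    println!("using {chosen}");
}
```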
Build from source
git clone https://github.com/AlexMelanFromRingo/dtek-parse
cd dtek-parse
# All binaries
cargo build --release
# Specific binary
cargo build --release --bin dtek-telegram-bot
Built binaries land in target/release/.
Optional: headless Chrome fallback
cargo build --release --features browser
With the browser feature, the parser falls back to a real headless Chrome
session when curl fails. This requires Chrome/Chromium to be installed.
Configuration
All configuration is done via environment variables. A .env file in the working directory is loaded automatically via dotenvy.
Common variables
| Variable | Default | Description |
|---|---|---|
| `CACHE_DURATION_MINUTES` | `30` | How long cached schedules stay valid |
| `SCHEDULE_DB_URL` | (not set) | Path to the shared SQLite written by `dtek-schedule-service`. When set, bots read from this DB instead of calling DTEK directly. |
Telegram bot
| Variable | Default | Description |
|---|---|---|
| `TELOXIDE_TOKEN` | required | Telegram bot token from @BotFather |
| `DATABASE_URL` | `sqlite:dtek_bot.db?mode=rwc` | Local SQLite for subscriptions |
| `CACHE_DURATION_MINUTES` | `30` | Cache TTL |
| `SCHEDULE_DB_URL` | (not set) | Shared schedule DB (optional) |
Discord bot
| Variable | Default | Description |
|---|---|---|
| `DISCORD_TOKEN` | required | Discord bot token from the Developer Portal |
| `DATABASE_URL` | `sqlite:discord_bot.db?mode=rwc` | Local SQLite for user favourite groups |
| `CACHE_DURATION_MINUTES` | `30` | Cache TTL |
| `SCHEDULE_DB_URL` | (not set) | Shared schedule DB (optional) |
Schedule service
| Variable | Default | Description |
|---|---|---|
| `SCHEDULE_DB_URL` | `sqlite:schedules.db?mode=rwc` | Path where the service writes schedule data |
| `CACHE_DURATION_MINUTES` | `30` | Fetch interval |
Example .env (shared DB setup)
# Shared schedule DB (written by dtek-schedule-service)
SCHEDULE_DB_URL=sqlite:/data/schedules.db
# Telegram bot
TELOXIDE_TOKEN=123456:ABC-DEF...
DATABASE_URL=sqlite:/data/tg_bot.db
# Discord bot
DISCORD_TOKEN=MTI3...
DATABASE_URL=sqlite:/data/ds_bot.db
CACHE_DURATION_MINUTES=30
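Note: each bot reads `DATABASE_URL` for its own local DB (subscriptions, user groups), so the two `DATABASE_URL` lines above belong in separate per-bot environments rather than in one literal file. `SCHEDULE_DB_URL` is the shared DB and has the same value for all processes.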
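
As a rough sketch of how these variables might be read, the snippet below parses `CACHE_DURATION_MINUTES` with the documented default of 30 and treats an unset `SCHEDULE_DB_URL` as standalone mode. The helper name `cache_minutes` is hypothetical, and falling back to the default on an unparsable value is an assumption; the bots may handle invalid input differently.

```rust
use std::env;

/// Parses a TTL in minutes. Falls back to the documented default of 30
/// when the value is unset or not a number (the fallback on invalid input
/// is an assumption, not confirmed behaviour).
fn cache_minutes(raw: Option<&str>) -> u64 {
    raw.and_then(|s| s.trim().parse().ok()).unwrap_or(30)
}

fn main() {
    let ttl = cache_minutes(env::var("CACHE_DURATION_MINUTES").ok().as_deref());

    // `None` corresponds to standalone mode, `Some(url)` to shared DB mode.
    let shared_db: Option<String> = env::var("SCHEDULE_DB_URL").ok();

    println!("cache TTL: {ttl} min, shared DB: {shared_db:?}");
}
```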
Architecture Overview
Standalone mode (default)
When SCHEDULE_DB_URL is not set, each bot fetches from DTEK independently:
┌─────────────────┐        HTTP        ┌──────────┐
│  dtek-telegram  │ ─────────────────▶ │   DTEK   │
│      -bot       │ ◀───────────────── │ website  │
└─────────────────┘                    └──────────┘
         │ cache (in-memory)
         ▼
    serve users

┌─────────────────┐        HTTP        ┌──────────┐
│  dtek-discord   │ ─────────────────▶ │   DTEK   │
│      -bot       │ ◀───────────────── │ website  │
└─────────────────┘                    └──────────┘
         │ cache (in-memory)
         ▼
    serve users
Each bot maintains its own in-memory cache with a TTL of CACHE_DURATION_MINUTES.
On a cache miss or expiry the bot makes a fresh HTTP request to DTEK.
Shared DB mode
When SCHEDULE_DB_URL is set, a single dtek-schedule-service process owns
all DTEK traffic. Bots become read-only consumers of the shared SQLite:
┌──────────────────────┐        HTTP        ┌──────────┐
│    dtek-schedule     │ ─────────────────▶ │   DTEK   │
│       -service       │ ◀───────────────── │ website  │
└──────────────────────┘                    └──────────┘
           │ writes every 30 min
           ▼
      ┌──────────┐  (schedules table)
      │  shared  │
      │   .db    │
      └──────────┘
        │      │
        ▼      ▼
┌─────────┐  ┌─────────┐
│   TG    │  │   DS    │
│   bot   │  │   bot   │
│ (reads) │  │ (reads) │
└─────────┘  └─────────┘
     │            │
     ▼            ▼
own local.db own local.db
(subscriptions) (user_groups)
The in-memory cache is kept in both bots — the shared DB is only consulted on a cache miss, so bots still serve most requests from RAM.
Caching layers
User request
│
▼
In-memory cache (RwLock)
│ HIT → return immediately
│ MISS ↓
▼
SCHEDULE_DB_URL set?
│ YES → read from SQLite shared DB
│ NO → spawn_blocking → DTEKParser → HTTP
▼
Update in-memory cache
│
▼
Return to user
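
The flow above can be approximated with a small TTL cache behind an `RwLock`. This is a sketch under stated assumptions, not the bots' real code: `Cache`, `get_or_load`, and the `Schedules` alias (a stand-in for `HashMap<String, ScheduleData>`) are illustrative names, and the `load` closure stands for either the shared-DB read or the `DTEKParser` fetch.

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::{Duration, Instant};

/// Stand-in for HashMap<String, ScheduleData>; values are opaque strings here.
type Schedules = HashMap<String, String>;

struct Cache {
    ttl: Duration,
    slot: RwLock<Option<(Instant, Schedules)>>,
}

impl Cache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, slot: RwLock::new(None) }
    }

    /// Returns cached data while it is younger than `ttl`; otherwise calls
    /// `load` (shared DB or HTTP in the real bots) and stores the result.
    fn get_or_load<E>(
        &self,
        load: impl FnOnce() -> Result<Schedules, E>,
    ) -> Result<Schedules, E> {
        {
            let guard = self.slot.read().unwrap();
            if let Some((stored_at, data)) = &*guard {
                if stored_at.elapsed() < self.ttl {
                    return Ok(data.clone()); // HIT
                }
            }
        } // read lock released before the slow path

        let fresh = load()?; // MISS
        *self.slot.write().unwrap() = Some((Instant::now(), fresh.clone()));
        Ok(fresh)
    }
}

fn main() {
    let cache = Cache::new(Duration::from_secs(30 * 60));
    let mut loads = 0;
    for _ in 0..3 {
        let data = cache
            .get_or_load(|| -> Result<Schedules, String> {
                loads += 1;
                Ok(HashMap::from([("GPV1.1".to_string(), "{}".to_string())]))
            })
            .unwrap();
        assert_eq!(data.len(), 1);
    }
    println!("loader ran {loads} time(s) for 3 requests");
}
```

Two callers that miss at the same moment would both run `load` in this sketch; it does not deduplicate concurrent refreshes.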
Shared Schedule DB
Schema
The shared database contains a single table:
CREATE TABLE IF NOT EXISTS schedules (
group_name TEXT PRIMARY KEY, -- e.g. "GPV1.1"
data TEXT NOT NULL, -- serde_json::to_string(&ScheduleData)
updated_at INTEGER NOT NULL -- Unix timestamp (seconds, UTC)
);
data is the full ScheduleData struct serialised to JSON. Both bots
deserialise it back to ScheduleData via serde_json::from_str.
Compatibility guarantee
| Scenario | TG bot | DS bot |
|---|---|---|
| `SCHEDULE_DB_URL` not set | unchanged behaviour | unchanged behaviour |
| `SCHEDULE_DB_URL` set, service not running | error logged, stale cache served | same |
| `SCHEDULE_DB_URL` set, service running | reads shared DB on miss | reads shared DB on miss |
Running the stack
# 1. Start the service (writes to shared DB)
SCHEDULE_DB_URL=sqlite:/data/schedules.db \
CACHE_DURATION_MINUTES=30 \
./dtek-schedule-service
# 2. Start the bots (read from shared DB)
SCHEDULE_DB_URL=sqlite:/data/schedules.db \
TELOXIDE_TOKEN=... \
DATABASE_URL=sqlite:/data/tg.db \
./dtek-telegram-bot
SCHEDULE_DB_URL=sqlite:/data/schedules.db \
DISCORD_TOKEN=... \
DATABASE_URL=sqlite:/data/ds.db \
./dtek-discord-bot
Inspecting the database
sqlite3 /data/schedules.db "SELECT group_name, datetime(updated_at, 'unixepoch') FROM schedules"
dtek-cli
A command-line tool for one-off schedule lookups directly from the terminal.
Usage
dtek-cli [OPTIONS] <COMMAND>
Commands
| Command | Description |
|---|---|
| `group <GROUP>` | Show schedule for a specific group (e.g. `GPV1.1`) |
| `all` | Show schedules for all groups |
| `list` | List all available groups |
| `address <CITY> <STREET> <HOUSE>` | Look up group by address |
Options
| Flag | Description |
|---|---|
| `--json` | Output raw JSON instead of formatted text |
| `-h`, `--help` | Print help |
| `-V`, `--version` | Print version |
Examples
# Show schedule for group GPV1.1
./dtek-cli group GPV1.1
# List all groups
./dtek-cli list
# Find group by address
./dtek-cli address "Дніпро" "Набережна Перемоги" "10"
# All groups as JSON
./dtek-cli all --json
Notes
- The CLI makes a live HTTP request every time — it does not use a local cache.
- On environments without `curl-impersonate`, multiple retry attempts are made automatically (up to 15).
dtek-telegram-bot
A Telegram bot for viewing outage schedules and subscribing to change notifications.
Environment variables
| Variable | Required | Default | Description |
|---|---|---|---|
| `TELOXIDE_TOKEN` | ✅ | (none) | Bot token from @BotFather |
| `DATABASE_URL` | | `sqlite:dtek_bot.db?mode=rwc` | Local SQLite for subscriptions and snapshots |
| `CACHE_DURATION_MINUTES` | | `30` | Cache TTL in minutes |
| `SCHEDULE_DB_URL` | | (not set) | Shared schedule DB path |
Commands
| Command | Description |
|---|---|
| `/start` | Welcome message and help |
| `/help` | Same as `/start` |
| `/groups` | Show group selection buttons |
| `/subscribe` | Manage group subscriptions |
| `/my` | Show schedules for all subscribed groups |
| `/status` | Cache status and bot info |
Local database schema
-- User subscriptions
CREATE TABLE subscriptions (
id INTEGER PRIMARY KEY AUTOINCREMENT,
user_id INTEGER NOT NULL,
chat_id INTEGER NOT NULL,
group_name TEXT NOT NULL,
created_at TEXT DEFAULT CURRENT_TIMESTAMP,
UNIQUE(user_id, group_name)
);
-- Persistent schedule snapshot for change detection
CREATE TABLE schedule_snapshots (
group_name TEXT PRIMARY KEY,
data TEXT NOT NULL, -- JSON-serialised ScheduleData
saved_at INTEGER NOT NULL -- Unix timestamp
);
Background tasks
The bot runs two background tasks:
- Cache refresh: re-fetches schedules from DTEK (or the shared DB) every `CACHE_DURATION_MINUTES` minutes, keeping the in-memory cache warm.
- Change detector: every 15 minutes, compares the current schedule with the previous snapshot. When a change is detected for a group, all subscribers receive a notification.
The snapshot is stored in schedule_snapshots (SQLite), so change detection works correctly after a restart — no notifications are missed even if the bot was down when DTEK updated the schedule.
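
The comparison step of the change detector might look roughly like this. It is a simplified sketch: schedules are represented by their serialised JSON strings rather than `ScheduleData`, `changed_groups` is a hypothetical helper, and treating a group with no previous snapshot as "not changed" (so a first run sends no notifications) is an assumption rather than documented behaviour.

```rust
use std::collections::HashMap;

/// Returns the sorted names of groups whose current data differs from the
/// stored snapshot. Groups without a previous snapshot are skipped, which
/// is an assumption made for this sketch.
fn changed_groups(
    previous: &HashMap<String, String>,
    current: &HashMap<String, String>,
) -> Vec<String> {
    let mut changed: Vec<String> = current
        .iter()
        .filter(|(group, data)| {
            previous.get(*group).is_some_and(|old| old != *data)
        })
        .map(|(group, _)| group.clone())
        .collect();
    changed.sort();
    changed
}

fn main() {
    let previous = HashMap::from([
        ("GPV1.1".to_string(), r#"{"hours":"a"}"#.to_string()),
        ("GPV1.2".to_string(), r#"{"hours":"b"}"#.to_string()),
    ]);
    let current = HashMap::from([
        ("GPV1.1".to_string(), r#"{"hours":"a"}"#.to_string()),
        ("GPV1.2".to_string(), r#"{"hours":"c"}"#.to_string()),
    ]);

    for group in changed_groups(&previous, &current) {
        println!("notify subscribers of {group}");
    }
}
```

Comparing raw JSON would also flag fields such as the fetch time, so a real comparison would more plausibly go through `has_changes_from`, which looks only at hour statuses.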
Running
TELOXIDE_TOKEN=123456:ABC-DEF \
DATABASE_URL=sqlite:tg_bot.db \
./dtek-telegram-bot
dtek-discord-bot
A Discord bot with slash commands and interactive group-selection buttons.
Environment variables
| Variable | Required | Default | Description |
|---|---|---|---|
| `DISCORD_TOKEN` | ✅ | (none) | Bot token from Discord Developer Portal |
| `DATABASE_URL` | | `sqlite:discord_bot.db?mode=rwc` | Local SQLite for user favourite groups |
| `CACHE_DURATION_MINUTES` | | `30` | Cache TTL in minutes |
| `SCHEDULE_DB_URL` | | (not set) | Shared schedule DB path |
Slash commands
| Command | Description |
|---|---|
| `/dtek` | Show group buttons (interactive) |
| `/dtek_група <group>` | Show schedule for a specific group |
| `/dtek_встановити <group>` | Set your favourite group |
| `/dtek_моя` | Show schedule for your favourite group |
| `/dtek_статус` | Cache status |
| `/dtek_очистити` | Force cache refresh (admins only) |
Local database schema
CREATE TABLE user_groups (
user_id INTEGER PRIMARY KEY,
favorite_group TEXT NOT NULL
);
Cache behaviour
The Discord bot uses an atomic refresh_in_progress flag to prevent multiple
simultaneous refresh operations. When a cache miss occurs:
- First caller wins the `compare_exchange` lock and performs the fetch.
- Other concurrent callers wait by polling until `is_refreshing()` returns `false`.
- Expired cache triggers a background refresh: the stale data is served immediately while the refresh runs.
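
A minimal sketch of such a flag, using only `std::sync::atomic`, is shown below. The field name `refresh_in_progress` and the method `is_refreshing` come from the description above; `RefreshGuard`, `try_begin`, and `finish` are hypothetical names, and the memory orderings are one reasonable choice rather than the bot's confirmed ones.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

struct RefreshGuard {
    refresh_in_progress: AtomicBool,
}

impl RefreshGuard {
    fn new() -> Self {
        Self { refresh_in_progress: AtomicBool::new(false) }
    }

    /// Flips the flag from false to true. Only one caller can succeed
    /// until `finish` is called; everyone else gets `false` back.
    fn try_begin(&self) -> bool {
        self.refresh_in_progress
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Clears the flag once the fetch has completed (or failed).
    fn finish(&self) {
        self.refresh_in_progress.store(false, Ordering::Release);
    }

    fn is_refreshing(&self) -> bool {
        self.refresh_in_progress.load(Ordering::Acquire)
    }
}

fn main() {
    let guard = RefreshGuard::new();
    if guard.try_begin() {
        // ... perform the fetch and update the cache here ...
        guard.finish();
    }
    println!("refreshing: {}", guard.is_refreshing());
}
```

In real code the winner should clear the flag on every exit path, including errors; otherwise the pollers would wait forever.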
Running
DISCORD_TOKEN=MTI3... \
DATABASE_URL=sqlite:ds_bot.db \
./dtek-discord-bot
dtek-schedule-service
A lightweight background service that periodically fetches schedules from DTEK and writes them to a shared SQLite database. Both bots can then read from this database instead of making their own HTTP requests.
Environment variables
| Variable | Default | Description |
|---|---|---|
| `SCHEDULE_DB_URL` | `sqlite:schedules.db?mode=rwc` | Path to the shared DB |
| `CACHE_DURATION_MINUTES` | `30` | Fetch interval in minutes |
What it does
loop:
1. Connect to SCHEDULE_DB_URL
2. Create `schedules` table if not exists
3. Call DTEKParser::get_all_schedules() ← one HTTP request
4. INSERT OR REPLACE each group into the DB
5. Log "Written N groups"
6. Sleep for CACHE_DURATION_MINUTES
Running
SCHEDULE_DB_URL=sqlite:/data/schedules.db \
CACHE_DURATION_MINUTES=30 \
./dtek-schedule-service
When to use it
Use dtek-schedule-service when you run both bots simultaneously.
Instead of two independent DTEK fetches every 30 minutes, you get one.
If you only run one bot, the service adds no benefit — leave SCHEDULE_DB_URL unset.
Failure handling
If a fetch fails (network error, DTEK returns a blocked page, etc.):
- The error is logged.
- The existing rows in the DB remain unchanged (bots continue serving stale-but-valid data).
- The service sleeps normally and retries on the next interval.
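
The failure behaviour above amounts to "write only on success". The sketch below models one cycle with an in-memory map standing in for the `schedules` table; `run_cycle`, `Table`, and the string error type are illustrative assumptions, and the real service writes through SQLite instead.

```rust
use std::collections::HashMap;

/// Stand-in for the `schedules` table: group_name -> (data, updated_at).
type Table = HashMap<String, (String, i64)>;

/// Runs one fetch cycle. On success every fetched group is upserted; on
/// failure the table is left untouched. Returns the number of rows written.
fn run_cycle(
    table: &mut Table,
    now: i64,
    fetch: impl FnOnce() -> Result<HashMap<String, String>, String>,
) -> usize {
    match fetch() {
        Ok(groups) => {
            let written = groups.len();
            for (name, data) in groups {
                table.insert(name, (data, now)); // mirrors INSERT OR REPLACE
            }
            println!("Written {written} groups");
            written
        }
        Err(err) => {
            eprintln!("fetch failed, keeping existing rows: {err}");
            0
        }
    }
}

fn main() {
    let mut table = Table::new();

    // First cycle succeeds and populates the table.
    run_cycle(&mut table, 1_700_000_000, || {
        Ok(HashMap::from([("GPV1.1".to_string(), "{}".to_string())]))
    });

    // Second cycle fails; the earlier row must survive.
    run_cycle(&mut table, 1_700_001_800, || Err("blocked page".to_string()));

    println!("rows still available to bots: {}", table.len());
}
```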
DTEKParser
The main struct for interacting with the DTEK website.
pub struct DTEKParser { /* private */ }
Constructor
pub fn new() -> anyhow::Result<Self>
Creates an HTTP client with cookie storage and a 30-second timeout.
Returns an error only if the reqwest client fails to build (extremely rare).
Methods
get_all_schedules
pub fn get_all_schedules(&mut self) -> anyhow::Result<HashMap<String, ScheduleData>>
Fetches schedules for all groups in a single HTTP request. This is the primary method used by both bots.
Returns a map of group_name → ScheduleData.
get_group_schedule
pub fn get_group_schedule(&mut self, group: &str) -> anyhow::Result<ScheduleData>
Fetches the schedule for a single group. Makes one HTTP request.
Prefer get_all_schedules() when you need more than one group.
list_groups
pub fn list_groups(&mut self) -> anyhow::Result<Vec<String>>
Returns a sorted list of all available group names (e.g. ["GPV1.1", "GPV1.2", ...]).
list_cities
pub fn list_cities(&mut self) -> anyhow::Result<Vec<String>>
Returns sorted city names from the DTEK street directory.
list_streets
pub fn list_streets(&mut self, city: &str) -> anyhow::Result<Vec<String>>
Returns sorted street names for the given city.
find_address_group
pub fn find_address_group(
    &mut self,
    city: &str,
    street: &str,
    house_num: &str,
) -> anyhow::Result<String>
Resolves a postal address to an outage group via DTEK's AJAX endpoint.
get_outage_info
pub fn get_outage_info(
    &mut self,
    city: &str,
    street: &str,
    house_num: &str,
) -> anyhow::Result<ScheduleData>
Combines address lookup + schedule fetch in one call.
The returned ScheduleData has the address field set.
HTTP fetch strategy
1. Try `curl-impersonate` (Chrome 116 → 120 → generic → standard curl).
2. For each attempt: warmup HEAD request → main GET request → validate that the response is at least 10 KB and contains `DisconSchedule`.
3. Up to 15 retries with exponential backoff + random jitter.
4. If all curl attempts fail and the `browser` feature is enabled: fall back to headless Chrome (up to 3 retries).
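
The validation and retry steps can be sketched with the standard library alone. Everything named here (`looks_valid`, `backoff`, `jitter`, `fetch_with_retry`) is hypothetical, the 500 ms base and 30 s cap are illustrative values, and the clock-based jitter is a stand-in for a proper random source; only the 10 KB threshold, the `DisconSchedule` marker, and the limit of 15 attempts come from the list above.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

const MIN_BODY_BYTES: usize = 10 * 1024;

/// Documented validation rule: at least 10 KB and mentions `DisconSchedule`.
fn looks_valid(body: &str) -> bool {
    body.len() >= MIN_BODY_BYTES && body.contains("DisconSchedule")
}

/// Exponential delay for a zero-based attempt (illustrative base and cap).
fn backoff(attempt: u32) -> Duration {
    let base_ms: u64 = 500;
    let cap_ms: u64 = 30_000;
    Duration::from_millis((base_ms << attempt.min(16)).min(cap_ms))
}

/// Cheap jitter below 250 ms derived from the clock; a real implementation
/// would more likely use a random number generator.
fn jitter() -> Duration {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.subsec_nanos())
        .unwrap_or(0);
    Duration::from_millis(u64::from(nanos % 250))
}

/// Calls `fetch` until it yields a valid body or `max_attempts` is reached.
/// `sleep` is injected so callers (and tests) control the waiting.
fn fetch_with_retry(
    max_attempts: u32,
    mut fetch: impl FnMut() -> Result<String, String>,
    sleep: impl Fn(Duration),
) -> Result<String, String> {
    let mut last_err = String::from("no attempts made");
    for attempt in 0..max_attempts {
        match fetch() {
            Ok(body) if looks_valid(&body) => return Ok(body),
            Ok(body) => last_err = format!("invalid response ({} bytes)", body.len()),
            Err(err) => last_err = err,
        }
        if attempt + 1 < max_attempts {
            sleep(backoff(attempt) + jitter());
        }
    }
    Err(last_err)
}

fn main() {
    let mut calls = 0;
    let result = fetch_with_retry(
        15,
        || {
            calls += 1;
            if calls < 3 {
                Err("blocked".to_string())
            } else {
                Ok(format!("DisconSchedule{}", " ".repeat(MIN_BODY_BYTES)))
            }
        },
        |_delay| {}, // pass std::thread::sleep here for real waiting
    );
    println!("success: {}, attempts: {calls}", result.is_ok());
}
```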
Usage in async code
DTEKParser uses the blocking reqwest client. In async contexts, wrap calls in spawn_blocking:
let schedules = tokio::task::spawn_blocking(|| {
    let mut parser = DTEKParser::new()?;
    parser.get_all_schedules()
})
.await??;
Data Types
All types implement Clone, Debug, Serialize, and Deserialize.
OutageStatus
pub enum OutageStatus {
    Yes,     // Power is ON (full hour)
    No,      // Power is OFF (scheduled outage)
    Maybe,   // Power MIGHT be off
    First,   // First 30 min OFF, second 30 min ON
    Second,  // First 30 min ON, second 30 min OFF
    Mfirst,  // First 30 min MIGHT be off
    Msecond, // Second 30 min MIGHT be off
    Unknown, // Unrecognised value
}
Helper methods
| Method | Returns | Description |
|---|---|---|
| `is_on()` | `bool` | `true` only for `Yes` |
| `is_off()` | `bool` | `true` for `No`, `First`, `Second` |
| `is_maybe_off()` | `bool` | `true` for `Maybe`, `Mfirst`, `Msecond` |
| `has_light()` | `bool` | `true` if any light in the hour (`Yes`, `First`, `Second`) |
| `to_display_string()` | `String` | Human-readable label with emoji |
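
To make the table concrete, here is a standalone re-implementation of the four boolean helpers on a local copy of the enum. It follows the variant sets listed above but is an illustration only; the crate's own method bodies and receiver types may differ.

```rust
/// Local copy of the enum for illustration; not imported from dtek_parse.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum OutageStatus {
    Yes,
    No,
    Maybe,
    First,
    Second,
    Mfirst,
    Msecond,
    Unknown,
}

impl OutageStatus {
    fn is_on(self) -> bool {
        matches!(self, Self::Yes)
    }

    fn is_off(self) -> bool {
        matches!(self, Self::No | Self::First | Self::Second)
    }

    fn is_maybe_off(self) -> bool {
        matches!(self, Self::Maybe | Self::Mfirst | Self::Msecond)
    }

    fn has_light(self) -> bool {
        matches!(self, Self::Yes | Self::First | Self::Second)
    }
}

fn main() {
    // A half-hour outage counts as "off" and as "has light" at once.
    let status = OutageStatus::First;
    println!(
        "{status:?}: on={} off={} maybe={} light={}",
        status.is_on(),
        status.is_off(),
        status.is_maybe_off(),
        status.has_light()
    );
}
```

Note that `First` and `Second` satisfy both `is_off()` and `has_light()`, because each describes an hour that is half on and half off.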
HourSchedule
One entry in a day's schedule (one clock-hour slot).
pub struct HourSchedule {
    pub hour: u8,             // 0–23
    pub time_range: String,   // e.g. "14:00-15:00"
    pub status: OutageStatus,
}
DaySchedule
All 24 hours for one calendar date.
pub struct DaySchedule {
    pub timestamp: i64,           // Unix timestamp of midnight (Kyiv time)
    pub date: String,             // "YYYY-MM-DD"
    pub day_of_week: String,      // "Monday", "Tuesday", …
    pub hours: Vec<HourSchedule>, // always 24 entries
}
Methods
| Method | Description |
|---|---|
| `get_off_hours()` | Hours where power is definitely off |
| `get_maybe_hours()` | Hours with possible outage |
| `get_on_hours()` | Hours where power is on (including partial) |
| `format_compact()` | One-liner per hour, emoji status |
| `format_schedule()` | Merged ranges + totals for light and outage periods |
| `format_outages_only()` | Compact outage-only summary, returns `None` if no outages |
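
The "merged ranges" idea behind `format_schedule()` can be illustrated by collapsing consecutive hours into spans. This sketch assumes whole-hour slots given in ascending order and ignores half-hour statuses; `merge_hours` and `format_ranges` are hypothetical helpers, and the library's actual output format is not specified here.

```rust
/// Merges consecutive hours into `(start, end)` ranges, `end` exclusive.
/// Assumes `hours` is sorted ascending with no duplicates.
fn merge_hours(hours: &[u8]) -> Vec<(u8, u8)> {
    let mut ranges: Vec<(u8, u8)> = Vec::new();
    for &hour in hours {
        if let Some(last) = ranges.last_mut() {
            if last.1 == hour {
                last.1 = hour + 1; // extend the current range
                continue;
            }
        }
        ranges.push((hour, hour + 1)); // start a new range
    }
    ranges
}

/// Renders ranges as "HH:00-HH:00", comma separated.
fn format_ranges(ranges: &[(u8, u8)]) -> String {
    ranges
        .iter()
        .map(|(start, end)| format!("{start:02}:00-{end:02}:00"))
        .collect::<Vec<_>>()
        .join(", ")
}

fn main() {
    // Hours with a definite outage, e.g. as returned by get_off_hours().
    let off_hours = [1, 2, 3, 8, 9, 20];
    let ranges = merge_hours(&off_hours);
    println!("outages: {}", format_ranges(&ranges));
}
```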
ScheduleData
Complete schedule for one outage group.
pub struct ScheduleData {
    pub address: Option<String>,                 // set when looked up by address
    pub group: String,                           // e.g. "GPV1.1"
    pub group_name: String,                      // same as group (display name)
    pub update_time: String,                     // last-update string from DTEK
    pub fetched_at: DateTime<Utc>,               // when this data was fetched
    pub schedules: HashMap<String, DaySchedule>, // date → day
}
Methods
| Method | Description |
|---|---|
| `get_day(date)` | Look up a specific date (`"YYYY-MM-DD"`) |
| `get_sorted_dates()` | All dates in ascending order |
| `format_full()` | Full plain-text schedule |
| `format_telegram()` | Telegram-formatted schedule (merged ranges) |
| `has_changes_from(other)` | `true` if any hour status differs (used for change detection) |
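
Because `schedules` is a `HashMap`, iteration order is arbitrary, which is why a helper like `get_sorted_dates()` is useful. A possible shape is sketched below with a generic value type standing in for `DaySchedule`; `sorted_dates` is a hypothetical free function, and it relies on the fact that `YYYY-MM-DD` strings sort chronologically when compared as text.

```rust
use std::collections::HashMap;

/// Returns the map's date keys in ascending order. ISO-style
/// "YYYY-MM-DD" keys sort chronologically under plain string comparison.
fn sorted_dates<V>(schedules: &HashMap<String, V>) -> Vec<&str> {
    let mut dates: Vec<&str> = schedules.keys().map(String::as_str).collect();
    dates.sort_unstable();
    dates
}

fn main() {
    let schedules = HashMap::from([
        ("2025-01-10".to_string(), ()),
        ("2024-12-31".to_string(), ()),
        ("2025-01-02".to_string(), ()),
    ]);
    for date in sorted_dates(&schedules) {
        println!("{date}");
    }
}
```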
Constants
pub const GROUPS: &[&str] = &[
    "GPV1.1", "GPV1.2", "GPV2.1", "GPV2.2",
    // …up to GPV6.2
];
These are the 12 default group identifiers. The actual live list is fetched dynamically via DTEKParser::list_groups() or get_all_schedules().