dtek-parse

A Rust library and set of bots for parsing power outage schedules from the DTEK website.

What's included

| Binary | Purpose |
|---|---|
| dtek-cli | Command-line tool for one-off lookups |
| dtek-telegram-bot | Telegram bot with subscriptions |
| dtek-discord-bot | Discord bot with slash commands |
| dtek-schedule-service | Background service writing to a shared SQLite DB |

Quick example

use dtek_parse::DTEKParser;

// `?` needs a fallible return type, so main returns anyhow::Result.
fn main() -> anyhow::Result<()> {
    let mut parser = DTEKParser::new()?;

    // All groups in one HTTP request
    let schedules = parser.get_all_schedules()?;
    for (group, data) in &schedules {
        println!("{}: {} days", group, data.schedules.len());
    }
    Ok(())
}

Key design decisions

  • Single HTTP request: get_all_schedules() fetches every group at once.
  • curl-impersonate: bypasses Incapsula bot protection without a real browser.
  • Shared DB: the optional SCHEDULE_DB_URL lets one dtek-schedule-service process serve both bots, eliminating duplicate requests.
  • In-memory cache: both bots cache schedules for CACHE_DURATION_MINUTES (default 30 min) and only hit DTEK (or the shared DB) on a miss.
  • Persistent change detection: the Telegram bot saves schedule snapshots to SQLite so change notifications survive restarts.

Installation

Prerequisites

| Requirement | Notes |
|---|---|
| Rust 1.85+ | Edition 2024 (stabilised in Rust 1.85) |
| curl | Required for fetching DTEK pages |
| curl-impersonate | Recommended; greatly improves bypass reliability |
| SQLite | Bundled via libsqlite3-sys (no system install needed) |

The parser uses curl-impersonate to spoof a real browser TLS fingerprint, bypassing Incapsula/Reese84 protection on the DTEK website.

# Debian / Ubuntu
wget https://github.com/lwthiker/curl-impersonate/releases/latest/download/curl-impersonate-chrome.x86_64-linux-gnu.tar.gz
tar xf curl-impersonate-chrome.x86_64-linux-gnu.tar.gz -C /usr/local/bin/

The library auto-detects the following paths, in order:

  1. /usr/local/bin/curl_chrome116
  2. /usr/local/bin/curl_chrome120
  3. /usr/bin/curl-impersonate-chrome
  4. curl (standard, fallback)
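
The lookup order above can be sketched as a first-match search with a fallback. This is an illustration only, not the library's actual code: the names CANDIDATES, pick_curl_binary and detect_curl_binary are hypothetical, and the existence check is passed in as a closure so the logic can be exercised without touching the filesystem.

```rust
use std::path::Path;

/// Candidate binaries in the documented order (hypothetical constant name).
const CANDIDATES: [&str; 3] = [
    "/usr/local/bin/curl_chrome116",
    "/usr/local/bin/curl_chrome120",
    "/usr/bin/curl-impersonate-chrome",
];

/// Returns the first candidate for which `exists` is true,
/// or plain "curl" when none of them is present.
fn pick_curl_binary(exists: impl Fn(&str) -> bool) -> &'static str {
    CANDIDATES
        .iter()
        .copied()
        .find(|p| exists(*p))
        .unwrap_or("curl")
}

/// Convenience wrapper that checks the real filesystem.
#[allow(dead_code)]
fn detect_curl_binary() -> &'static str {
    pick_curl_binary(|p| Path::new(p).is_file())
}
```

Passing the check as a closure keeps the ordering rule separate from I/O, which makes the fallback to standard curl easy to verify.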

Build from source

git clone https://github.com/AlexMelanFromRingo/dtek-parse
cd dtek-parse

# All binaries
cargo build --release

# Specific binary
cargo build --release --bin dtek-telegram-bot

Built binaries land in target/release/.

Optional: headless Chrome fallback

cargo build --release --features browser

With the browser feature, the parser falls back to a real headless Chrome session when curl fails. This requires Chrome/Chromium to be installed.

Configuration

All configuration is done via environment variables. A .env file in the working directory is loaded automatically via dotenvy.

Common variables

| Variable | Default | Description |
|---|---|---|
| CACHE_DURATION_MINUTES | 30 | How long cached schedules stay valid |
| SCHEDULE_DB_URL | (not set) | Path to the shared SQLite DB written by dtek-schedule-service. When set, bots read from this DB instead of calling DTEK directly. |
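
As a rough sketch of how a variable with a default such as CACHE_DURATION_MINUTES could be read, the snippet below uses only the standard library. The function names are hypothetical, and the handling of invalid input (falling back to 30 for missing, non-numeric, or zero values) is an assumption; the binaries may treat such values differently.

```rust
use std::time::Duration;

/// Default taken from the table above.
const DEFAULT_CACHE_MINUTES: u64 = 30;

/// Parses the raw value of CACHE_DURATION_MINUTES.
/// Assumption: missing, non-numeric, or zero values fall back to the default.
fn cache_duration_from(raw: Option<&str>) -> Duration {
    let minutes = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|m| *m > 0)
        .unwrap_or(DEFAULT_CACHE_MINUTES);
    Duration::from_secs(minutes.saturating_mul(60))
}

/// Reads the variable from the process environment.
#[allow(dead_code)]
fn cache_duration() -> Duration {
    cache_duration_from(std::env::var("CACHE_DURATION_MINUTES").ok().as_deref())
}
```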

Telegram bot

| Variable | Default | Description |
|---|---|---|
| TELOXIDE_TOKEN | required | Telegram bot token from @BotFather |
| DATABASE_URL | sqlite:dtek_bot.db?mode=rwc | Local SQLite for subscriptions |
| CACHE_DURATION_MINUTES | 30 | Cache TTL |
| SCHEDULE_DB_URL | (not set) | Shared schedule DB (optional) |

Discord bot

| Variable | Default | Description |
|---|---|---|
| DISCORD_TOKEN | required | Discord bot token from the Developer Portal |
| DATABASE_URL | sqlite:discord_bot.db?mode=rwc | Local SQLite for user favourite groups |
| CACHE_DURATION_MINUTES | 30 | Cache TTL |
| SCHEDULE_DB_URL | (not set) | Shared schedule DB (optional) |

Schedule service

| Variable | Default | Description |
|---|---|---|
| SCHEDULE_DB_URL | sqlite:schedules.db?mode=rwc | Path where the service writes schedule data |
| CACHE_DURATION_MINUTES | 30 | Fetch interval |

Example .env (shared DB setup)

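# Shared by every process: schedule DB written by dtek-schedule-service
SCHEDULE_DB_URL=sqlite:/data/schedules.db
CACHE_DURATION_MINUTES=30

# Telegram bot only (.env in the Telegram bot's working directory)
TELOXIDE_TOKEN=123456:ABC-DEF...
DATABASE_URL=sqlite:/data/tg_bot.db

# Discord bot only (.env in the Discord bot's working directory)
DISCORD_TOKEN=MTI3...
DATABASE_URL=sqlite:/data/ds_bot.db

Note: each bot reads DATABASE_URL for its own local DB (subscriptions, user groups), so the two values must differ. Defining DATABASE_URL twice in one file would give both bots the same value; keep each bot's block in its own .env file, one per working directory. SCHEDULE_DB_URL is the shared DB and is the same value for all processes.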

Architecture Overview

Standalone mode (default)

When SCHEDULE_DB_URL is not set, each bot fetches from DTEK independently:

┌─────────────────┐        HTTP        ┌──────────┐
│  dtek-telegram  │ ─────────────────▶ │   DTEK   │
│      -bot       │ ◀───────────────── │ website  │
└─────────────────┘                    └──────────┘
        │ cache (in-memory)
        ▼
  serve users

┌─────────────────┐        HTTP        ┌──────────┐
│  dtek-discord   │ ─────────────────▶ │   DTEK   │
│      -bot       │ ◀───────────────── │ website  │
└─────────────────┘                    └──────────┘
        │ cache (in-memory)
        ▼
  serve users

Each bot maintains its own in-memory cache with a TTL of CACHE_DURATION_MINUTES. On a cache miss or expiry the bot makes a fresh HTTP request to DTEK.

Shared DB mode

When SCHEDULE_DB_URL is set, a single dtek-schedule-service process owns all DTEK traffic. Bots become read-only consumers of the shared SQLite:

┌──────────────────────┐        HTTP        ┌──────────┐
│  dtek-schedule       │ ─────────────────▶ │   DTEK   │
│      -service        │ ◀───────────────── │ website  │
└──────────────────────┘                    └──────────┘
          │ writes every 30 min
          ▼
    ┌──────────┐   (schedules table)
    │ shared   │
    │  .db     │
    └──────────┘
     │         │
     ▼         ▼
┌─────────┐  ┌─────────┐
│   TG    │  │   DS    │
│   bot   │  │   bot   │
│(reads)  │  │(reads)  │
└─────────┘  └─────────┘
     │               │
     ▼               ▼
own local.db    own local.db
(subscriptions) (user_groups)

The in-memory cache is kept in both bots — the shared DB is only consulted on a cache miss, so bots still serve most requests from RAM.

Caching layers

User request
    │
    ▼
In-memory cache (RwLock)
    │ HIT → return immediately
    │ MISS ↓
    ▼
SCHEDULE_DB_URL set?
    │ YES → read from SQLite shared DB
    │ NO  → spawn_blocking → DTEKParser → HTTP
    ▼
Update in-memory cache
    │
    ▼
Return to user
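
The flow above can be sketched as a small TTL cache guarded by an RwLock. This is a minimal illustration using only the standard library; TtlCache and its methods are hypothetical names, and the fetch closure stands in for either the shared-DB read or the DTEKParser call. Unlike the Discord bot's refresh flag, this sketch does not stop two concurrent callers from both fetching on a miss.

```rust
use std::sync::RwLock;
use std::time::{Duration, Instant};

/// Single-slot cache whose value expires after `ttl` (hypothetical type).
struct TtlCache<T: Clone> {
    ttl: Duration,
    slot: RwLock<Option<(Instant, T)>>,
}

impl<T: Clone> TtlCache<T> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, slot: RwLock::new(None) }
    }

    /// HIT: returns a clone of the value if it is still fresh.
    fn get(&self) -> Option<T> {
        let guard = self.slot.read().unwrap();
        match &*guard {
            Some((stored_at, value)) if stored_at.elapsed() < self.ttl => Some(value.clone()),
            _ => None,
        }
    }

    /// MISS: runs `fetch`, stores the result, and returns it.
    /// A failed fetch leaves the previous slot contents untouched.
    fn get_or_fetch<E>(&self, fetch: impl FnOnce() -> Result<T, E>) -> Result<T, E> {
        if let Some(hit) = self.get() {
            return Ok(hit);
        }
        let fresh = fetch()?;
        *self.slot.write().unwrap() = Some((Instant::now(), fresh.clone()));
        Ok(fresh)
    }
}
```

The read lock covers the common hit path, so concurrent requests served from memory do not block each other; the write lock is taken only after a successful fetch.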

Shared Schedule DB

Schema

The shared database contains a single table:

CREATE TABLE IF NOT EXISTS schedules (
    group_name  TEXT PRIMARY KEY,        -- e.g. "GPV1.1"
    data        TEXT NOT NULL,           -- serde_json::to_string(&ScheduleData)
    updated_at  INTEGER NOT NULL         -- Unix timestamp (seconds, UTC)
);

data is the full ScheduleData struct serialised to JSON. Both bots deserialise it back to ScheduleData via serde_json::from_str.

Compatibility guarantee

| Scenario | TG bot | DS bot |
|---|---|---|
| SCHEDULE_DB_URL not set | unchanged behaviour | unchanged behaviour |
| SCHEDULE_DB_URL set, service not running | error logged, stale cache served | same |
| SCHEDULE_DB_URL set, service running | reads shared DB on miss | reads shared DB on miss |

Running the stack

# 1. Start the service (writes to shared DB)
SCHEDULE_DB_URL=sqlite:/data/schedules.db \
CACHE_DURATION_MINUTES=30 \
./dtek-schedule-service

# 2. Start the bots (read from shared DB)
SCHEDULE_DB_URL=sqlite:/data/schedules.db \
TELOXIDE_TOKEN=... \
DATABASE_URL=sqlite:/data/tg.db \
./dtek-telegram-bot

SCHEDULE_DB_URL=sqlite:/data/schedules.db \
DISCORD_TOKEN=... \
DATABASE_URL=sqlite:/data/ds.db \
./dtek-discord-bot

Inspecting the database

sqlite3 /data/schedules.db "SELECT group_name, datetime(updated_at, 'unixepoch') FROM schedules"

dtek-cli

A command-line tool for one-off schedule lookups directly from the terminal.

Usage

dtek-cli [OPTIONS] <COMMAND>

Commands

| Command | Description |
|---|---|
| group <GROUP> | Show schedule for a specific group (e.g. GPV1.1) |
| all | Show schedules for all groups |
| list | List all available groups |
| address <CITY> <STREET> <HOUSE> | Look up group by address |

Options

| Flag | Description |
|---|---|
| --json | Output raw JSON instead of formatted text |
| -h, --help | Print help |
| -V, --version | Print version |

Examples

# Show schedule for group GPV1.1
./dtek-cli group GPV1.1

# List all groups
./dtek-cli list

# Find group by address
./dtek-cli address "Дніпро" "Набережна Перемоги" "10"

# All groups as JSON
./dtek-cli all --json

Notes

  • The CLI makes a live HTTP request every time — it does not use a local cache.
  • On environments without curl-impersonate, multiple retry attempts are made automatically (up to 15).

dtek-telegram-bot

A Telegram bot for viewing outage schedules and subscribing to change notifications.

Environment variables

| Variable | Required | Default | Description |
|---|---|---|---|
| TELOXIDE_TOKEN | yes | (none) | Bot token from @BotFather |
| DATABASE_URL | no | sqlite:dtek_bot.db?mode=rwc | Local SQLite for subscriptions and snapshots |
| CACHE_DURATION_MINUTES | no | 30 | Cache TTL in minutes |
| SCHEDULE_DB_URL | no | (not set) | Shared schedule DB path |

Commands

| Command | Description |
|---|---|
| /start | Welcome message and help |
| /help | Same as /start |
| /groups | Show group selection buttons |
| /subscribe | Manage group subscriptions |
| /my | Show schedules for all subscribed groups |
| /status | Cache status and bot info |

Local database schema

-- User subscriptions
CREATE TABLE subscriptions (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id    INTEGER NOT NULL,
    chat_id    INTEGER NOT NULL,
    group_name TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    UNIQUE(user_id, group_name)
);

-- Persistent schedule snapshot for change detection
CREATE TABLE schedule_snapshots (
    group_name TEXT PRIMARY KEY,
    data       TEXT NOT NULL,    -- JSON-serialised ScheduleData
    saved_at   INTEGER NOT NULL  -- Unix timestamp
);

Background tasks

The bot runs two background tasks:

Cache refresh — re-fetches schedules from DTEK (or shared DB) every CACHE_DURATION_MINUTES minutes, keeping the in-memory cache warm.

Change detector — every 15 minutes, compares the current schedule with the previous snapshot. When a change is detected for a group, all subscribers receive a notification.

The snapshot is stored in schedule_snapshots (SQLite), so change detection works correctly after a restart — no notifications are missed even if the bot was down when DTEK updated the schedule.
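
A minimal sketch of the comparison step follows, assuming each group's schedule has been reduced to some comparable value (here a generic T). The function name changed_groups is hypothetical, and treating a group with no saved snapshot as new rather than changed is an assumption. Comparing the raw JSON strings would not work as a change test, because ScheduleData carries a fetched_at timestamp that differs on every fetch; the library's has_changes_from compares hour statuses instead.

```rust
use std::collections::HashMap;

/// Returns the sorted names of groups whose current value differs from the
/// saved snapshot.
/// Assumption: a group without a saved snapshot is new, not changed.
fn changed_groups<T: PartialEq>(
    saved: &HashMap<String, T>,
    current: &HashMap<String, T>,
) -> Vec<String> {
    let mut changed: Vec<String> = current
        .iter()
        .filter(|(group, data)| saved.get(*group).is_some_and(|old| old != *data))
        .map(|(group, _)| group.clone())
        .collect();
    changed.sort();
    changed
}
```

Because the saved map is loaded from SQLite at startup, the same comparison works for the first check after a restart.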

Running

TELOXIDE_TOKEN=123456:ABC-DEF \
DATABASE_URL=sqlite:tg_bot.db \
./dtek-telegram-bot

dtek-discord-bot

A Discord bot with slash commands and interactive group-selection buttons.

Environment variables

| Variable | Required | Default | Description |
|---|---|---|---|
| DISCORD_TOKEN | yes | (none) | Bot token from Discord Developer Portal |
| DATABASE_URL | no | sqlite:discord_bot.db?mode=rwc | Local SQLite for user favourite groups |
| CACHE_DURATION_MINUTES | no | 30 | Cache TTL in minutes |
| SCHEDULE_DB_URL | no | (not set) | Shared schedule DB path |

Slash commands

| Command | Description |
|---|---|
| /dtek | Show group buttons (interactive) |
| /dtek_група <group> | Show schedule for a specific group |
| /dtek_встановити <group> | Set your favourite group |
| /dtek_моя | Show schedule for your favourite group |
| /dtek_статус | Cache status |
| /dtek_очистити | Force cache refresh (admins only) |

Local database schema

CREATE TABLE user_groups (
    user_id        INTEGER PRIMARY KEY,
    favorite_group TEXT NOT NULL
);

Cache behaviour

The Discord bot uses an atomic refresh_in_progress flag to prevent multiple simultaneous refresh operations:

  1. On a cache miss, the first caller wins the compare_exchange on the flag and performs the fetch.
  2. Other concurrent callers wait, polling until is_refreshing() returns false.
  3. When the cache is expired rather than empty, the stale data is served immediately and the refresh runs in the background.
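
The flag handling can be sketched with std's AtomicBool. RefreshGate, try_begin and finish are hypothetical names chosen for this illustration; only the refresh_in_progress flag, compare_exchange and is_refreshing() come from the description above, and the memory orderings are one reasonable choice rather than the bot's confirmed settings.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Guards a refresh so that only one caller performs it at a time.
struct RefreshGate {
    refresh_in_progress: AtomicBool,
}

impl RefreshGate {
    fn new() -> Self {
        Self { refresh_in_progress: AtomicBool::new(false) }
    }

    /// Flips the flag from false to true. Returns true for exactly one
    /// caller until `finish` is called; every other caller gets false.
    fn try_begin(&self) -> bool {
        self.refresh_in_progress
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Clears the flag once the fetch has completed or failed.
    fn finish(&self) {
        self.refresh_in_progress.store(false, Ordering::Release);
    }

    /// What waiting callers poll.
    fn is_refreshing(&self) -> bool {
        self.refresh_in_progress.load(Ordering::Acquire)
    }
}
```

The winner should call finish on every exit path, including errors; otherwise the waiting callers would poll forever.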

Running

DISCORD_TOKEN=MTI3... \
DATABASE_URL=sqlite:ds_bot.db \
./dtek-discord-bot

dtek-schedule-service

A lightweight background service that periodically fetches schedules from DTEK and writes them to a shared SQLite database. Both bots can then read from this database instead of making their own HTTP requests.

Environment variables

| Variable | Default | Description |
|---|---|---|
| SCHEDULE_DB_URL | sqlite:schedules.db?mode=rwc | Path to the shared DB |
| CACHE_DURATION_MINUTES | 30 | Fetch interval in minutes |

What it does

loop:
  1. Connect to SCHEDULE_DB_URL
  2. Create `schedules` table if not exists
  3. Call DTEKParser::get_all_schedules()  ← one HTTP request
  4. INSERT OR REPLACE each group into the DB
  5. Log "Written N groups"
  6. Sleep for CACHE_DURATION_MINUTES

Running

SCHEDULE_DB_URL=sqlite:/data/schedules.db \
CACHE_DURATION_MINUTES=30 \
./dtek-schedule-service

When to use it

Use dtek-schedule-service when you run both bots simultaneously. Instead of two independent DTEK fetches every 30 minutes, you get one.

If you only run one bot, the service adds no benefit — leave SCHEDULE_DB_URL unset.

Failure handling

If a fetch fails (network error, DTEK returns a blocked page, etc.):

  • The error is logged.
  • The existing rows in the DB remain unchanged (bots continue serving stale-but-valid data).
  • The service sleeps normally and retries on the next interval.
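
One iteration of the loop, including the failure path, might look like the sketch below. To stay self-contained it models the schedules table as an in-memory HashMap keyed by group name and takes the fetch as a closure; run_cycle is a hypothetical name, and the real service writes to SQLite with INSERT OR REPLACE instead.

```rust
use std::collections::HashMap;
use std::fmt::Display;

/// Runs one fetch-and-store cycle and returns the number of groups written.
/// On a fetch error the existing rows are left as they are, so readers keep
/// seeing the last good data.
fn run_cycle<E: Display>(
    db: &mut HashMap<String, String>,
    fetch: impl FnOnce() -> Result<HashMap<String, String>, E>,
) -> usize {
    match fetch() {
        Ok(groups) => {
            let written = groups.len();
            for (group, json) in groups {
                // HashMap::insert overwrites, mirroring INSERT OR REPLACE.
                db.insert(group, json);
            }
            println!("Written {written} groups");
            written
        }
        Err(err) => {
            eprintln!("fetch failed, keeping existing rows: {err}");
            0
        }
    }
}
```

The caller would sleep for CACHE_DURATION_MINUTES after each call regardless of the result, which is what makes the next interval act as the retry.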

DTEKParser

The main struct for interacting with the DTEK website.

pub struct DTEKParser { /* private */ }

Constructor

pub fn new() -> anyhow::Result<Self>

Creates an HTTP client with cookie storage and a 30-second timeout. Returns an error only if the reqwest client fails to build (extremely rare).

Methods

get_all_schedules

pub fn get_all_schedules(&mut self) -> anyhow::Result<HashMap<String, ScheduleData>>

Fetches schedules for all groups in a single HTTP request. This is the primary method used by both bots.

Returns a map of group_name → ScheduleData.


get_group_schedule

pub fn get_group_schedule(&mut self, group: &str) -> anyhow::Result<ScheduleData>

Fetches the schedule for a single group. Makes one HTTP request. Prefer get_all_schedules() when you need more than one group.


list_groups

pub fn list_groups(&mut self) -> anyhow::Result<Vec<String>>

Returns a sorted list of all available group names (e.g. ["GPV1.1", "GPV1.2", ...]).


list_cities

pub fn list_cities(&mut self) -> anyhow::Result<Vec<String>>

Returns sorted city names from the DTEK street directory.


list_streets

pub fn list_streets(&mut self, city: &str) -> anyhow::Result<Vec<String>>

Returns sorted street names for the given city.


find_address_group

pub fn find_address_group(
    &mut self,
    city: &str,
    street: &str,
    house_num: &str,
) -> anyhow::Result<String>

Resolves a postal address to an outage group via DTEK's AJAX endpoint.


get_outage_info

pub fn get_outage_info(
    &mut self,
    city: &str,
    street: &str,
    house_num: &str,
) -> anyhow::Result<ScheduleData>

Combines address lookup + schedule fetch in one call. The returned ScheduleData has the address field set.

HTTP fetch strategy

  1. Try curl-impersonate (Chrome 116 → 120 → generic → standard curl).
  2. For each attempt: warmup HEAD request → main GET request → validate response size ≥ 10 KB and contains DisconSchedule.
  3. Up to 15 retries with exponential backoff + random jitter.
  4. If all curl attempts fail and the browser feature is enabled: fall back to headless Chrome (up to 3 retries).
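
Two pieces of this strategy lend themselves to a short sketch: the response validation in step 2 and the retry delay in step 3. The function names, the reading of 10 KB as 10 × 1024 bytes, and the base and cap parameters are assumptions made for illustration. The jitter is passed in by the caller so that the example needs no random-number crate.

```rust
use std::time::Duration;

/// Assumption: "10 KB" is read as 10 * 1024 bytes.
const MIN_BODY_BYTES: usize = 10 * 1024;

/// Step 2: a response counts as a real schedule page only if it is large
/// enough and contains the DisconSchedule marker.
fn looks_like_schedule_page(body: &str) -> bool {
    body.len() >= MIN_BODY_BYTES && body.contains("DisconSchedule")
}

/// Step 3: exponential backoff. The delay doubles with each attempt
/// (attempt 0 waits `base`), is capped at `cap`, and then has the
/// caller-supplied `jitter` added on top.
fn backoff_delay(attempt: u32, base: Duration, cap: Duration, jitter: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(cap).saturating_add(jitter)
}
```

With a base of one second and a 30-second cap, attempts 0 through 4 wait 1, 2, 4, 8 and 16 seconds before jitter, and every later attempt waits 30 seconds.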

Usage in async code

DTEKParser uses the blocking reqwest client. In async contexts, wrap calls in spawn_blocking:

// Inside an async fn that returns anyhow::Result<_>
let schedules = tokio::task::spawn_blocking(|| {
    let mut parser = DTEKParser::new()?;
    parser.get_all_schedules()
})
.await??;

Data Types

All types implement Clone, Debug, Serialize, and Deserialize.

OutageStatus

pub enum OutageStatus {
    Yes,      // Power is ON (full hour)
    No,       // Power is OFF (scheduled outage)
    Maybe,    // Power MIGHT be off
    First,    // First 30 min OFF, second 30 min ON
    Second,   // First 30 min ON, second 30 min OFF
    Mfirst,   // First 30 min MIGHT be off
    Msecond,  // Second 30 min MIGHT be off
    Unknown,  // Unrecognised value
}

Helper methods

| Method | Returns | Description |
|---|---|---|
| is_on() | bool | true only for Yes |
| is_off() | bool | true for No, First, Second |
| is_maybe_off() | bool | true for Maybe, Mfirst, Msecond |
| has_light() | bool | true if any light in the hour (Yes, First, Second) |
| to_display_string() | String | Human-readable label with emoji |
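
The boolean helpers in the table can be reconstructed as simple pattern matches. The snippet below redeclares the enum locally so that it compiles on its own; it is a sketch written from the table, not the crate's source, and taking self by value (which requires Copy) is an assumption about the signatures.

```rust
/// Local copy of the enum for illustration only.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq)]
enum OutageStatus {
    Yes,
    No,
    Maybe,
    First,
    Second,
    Mfirst,
    Msecond,
    Unknown,
}

impl OutageStatus {
    /// Power is on for the full hour.
    fn is_on(self) -> bool {
        matches!(self, Self::Yes)
    }

    /// A scheduled outage covers at least half of the hour.
    fn is_off(self) -> bool {
        matches!(self, Self::No | Self::First | Self::Second)
    }

    /// An outage is possible but not certain.
    fn is_maybe_off(self) -> bool {
        matches!(self, Self::Maybe | Self::Mfirst | Self::Msecond)
    }

    /// Power is on for at least half of the hour.
    fn has_light(self) -> bool {
        matches!(self, Self::Yes | Self::First | Self::Second)
    }
}
```

Note that First and Second satisfy both is_off() and has_light(), since each describes an hour split between an outage and power.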

HourSchedule

One entry in a day's schedule (one clock-hour slot).

pub struct HourSchedule {
    pub hour: u8,             // 0–23
    pub time_range: String,   // e.g. "14:00-15:00"
    pub status: OutageStatus,
}

DaySchedule

All 24 hours for one calendar date.

pub struct DaySchedule {
    pub timestamp: i64,         // Unix timestamp of midnight (Kyiv time)
    pub date: String,           // "YYYY-MM-DD"
    pub day_of_week: String,    // "Monday", "Tuesday", …
    pub hours: Vec<HourSchedule>, // always 24 entries
}

Methods

| Method | Description |
|---|---|
| get_off_hours() | Hours where power is definitely off |
| get_maybe_hours() | Hours with possible outage |
| get_on_hours() | Hours where power is on (including partial) |
| format_compact() | One-liner per hour, emoji status |
| format_schedule() | Merged ranges + totals for light and outage periods |
| format_outages_only() | Compact outage-only summary, returns None if no outages |
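
The "merged ranges" idea behind format_schedule() can be illustrated by collapsing consecutive hours into half-open ranges. This is a sketch of the general technique only: merge_hours and format_ranges are hypothetical helpers, the input is assumed to be sorted hour numbers in 0 to 23, and the library's actual output format may differ.

```rust
/// Merges sorted hour numbers (0-23) into half-open ranges [start, end).
fn merge_hours(hours: &[u8]) -> Vec<(u8, u8)> {
    let mut ranges: Vec<(u8, u8)> = Vec::new();
    for &h in hours {
        if let Some(last) = ranges.last_mut() {
            if last.1 == h {
                // Hour continues the previous range: extend it.
                last.1 = h + 1;
                continue;
            }
        }
        ranges.push((h, h + 1));
    }
    ranges
}

/// Renders ranges as "HH:00-HH:00", separated by commas.
fn format_ranges(ranges: &[(u8, u8)]) -> String {
    ranges
        .iter()
        .map(|(start, end)| format!("{start:02}:00-{end:02}:00"))
        .collect::<Vec<_>>()
        .join(", ")
}
```

For example, off-hours 1, 2, 3, 7, 8 and 23 collapse to three ranges, and the total outage time is the sum of the range lengths.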

ScheduleData

Complete schedule for one outage group.

pub struct ScheduleData {
    pub address: Option<String>,             // set when looked up by address
    pub group: String,                       // e.g. "GPV1.1"
    pub group_name: String,                  // same as group (display name)
    pub update_time: String,                 // last-update string from DTEK
    pub fetched_at: DateTime<Utc>,           // when this data was fetched
    pub schedules: HashMap<String, DaySchedule>, // date → day
}

Methods

| Method | Description |
|---|---|
| get_day(date) | Look up a specific date ("YYYY-MM-DD") |
| get_sorted_dates() | All dates in ascending order |
| format_full() | Full plain-text schedule |
| format_telegram() | Telegram-formatted schedule (merged ranges) |
| has_changes_from(other) | true if any hour status differs; used for change detection |

Constants

pub const GROUPS: &[&str] = &[
    "GPV1.1", "GPV1.2",
    "GPV2.1", "GPV2.2",
    // …up to GPV6.2
];

These are the 12 default group identifiers. The actual live list is fetched dynamically via DTEKParser::list_groups() or get_all_schedules().