Matrix auto redact bot
What?
The subject pretty much says it: get rid of Matrix messages in specified rooms after a specified time has passed since posting.
Why?
Matrix doesn't enforce any data retention policy by default, so all messages are kept indefinitely. To fix this, I created the Matrix auto redact bot, which redacts (deletes) messages from a room after N days.
I also got bored on a Friday evening, so I wanted to develop something where I could learn a bit more about Matrix.
How?
Linux (RedHat, not Ubuntu / Debian / Alpine this time)
Python
PostgreSQL
Operation - Usage instructions
Create a room
Invite auto_redact_bot
Bot joins the room automatically after a small delay
Send the command !marb 7 (here N = 7 days)
The bot then deletes (redacts, in Matrix terms) all messages older than the N days specified by the user, and keeps doing this until it's stopped, either by kicking the bot out or by giving the !marb stop command.
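The two commands above could be parsed roughly as follows. This is only a minimal sketch: the function name parse_marb_command, the regular expression, and the choice to reject 0 days are assumptions made for illustration, not the bot's actual code.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: "!marb <days>" or "!marb stop".
# The real bot may accept other forms; this is an assumption.
_COMMAND = re.compile(r"^!marb\s+(stop|\d+)\s*$", re.IGNORECASE)


def parse_marb_command(body: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return ("set", days), ("stop", None), or None for anything else."""
    match = _COMMAND.match(body.strip())
    if match is None:
        return None
    arg = match.group(1).lower()
    if arg == "stop":
        return ("stop", None)
    days = int(arg)
    if days < 1:
        # Assumption: a retention of zero days is treated as invalid.
        return None
    return ("set", days)
```

With this sketch, parse_marb_command("!marb 7") yields ("set", 7), parse_marb_command("!marb stop") yields ("stop", None), and ordinary chat messages yield None.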
Matrix API
The features used by the integration
login
sync
join
context
redact
read_markers
leave
forget
The bot does not use a push gateway, because the task isn't latency-sensitive.
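As an illustration of the redact call in the list above, here is a minimal sketch that only builds the HTTP request, using nothing but the Python standard library. The helper names redact_url and build_redact_request are hypothetical, and the bot itself may well use a Matrix client library instead. The path uses the v3 prefix of the client-server API; servers from early 2021 exposed the same endpoint under r0.

```python
import json
import urllib.request
from urllib.parse import quote


def redact_url(homeserver: str, room_id: str, event_id: str, txn_id: str) -> str:
    """Build the client-server API URL for redacting one event."""
    base = homeserver.rstrip("/")
    # Room and event IDs contain characters such as "!", "$" and ":",
    # so every path segment is percent-encoded.
    room = quote(room_id, safe="")
    event = quote(event_id, safe="")
    txn = quote(txn_id, safe="")
    return f"{base}/_matrix/client/v3/rooms/{room}/redact/{event}/{txn}"


def build_redact_request(homeserver: str, access_token: str, room_id: str,
                         event_id: str, txn_id: str,
                         reason: str = "") -> urllib.request.Request:
    """Prepare (but do not send) the PUT request that redacts an event."""
    body = json.dumps({"reason": reason} if reason else {}).encode("utf-8")
    return urllib.request.Request(
        redact_url(homeserver, room_id, event_id, txn_id),
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

The transaction ID (txn_id) should be unique per request so that a retried request is not applied twice. Sending the prepared request is left out on purpose; urllib.request.urlopen would be one way to do it.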
Optimizations
The oldest event timestamp is kept in a local database. If that event hasn't expired yet, it's pointless to check whether there's anything to redact.
The newest event is checked (from sync). If there's nothing new in the room, it's also pointless to fetch the detailed room information.
If a room has been inactive (idle, stale) for more than 30 days, the bot will automatically leave the room. This also happens immediately if the bot is left alone in the room.
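The checks above can be combined into a single decision per room. The following is a sketch of one possible reading only: the RoomState fields, the next_action function, and the order of the checks are assumptions, and the actual bot may structure this differently.

```python
from dataclasses import dataclass
from typing import Optional

DAY_MS = 24 * 60 * 60 * 1000   # origin_server_ts is in milliseconds
IDLE_LIMIT_DAYS = 30           # idle limit stated in the text


@dataclass
class RoomState:
    """Hypothetical per-room state; field names are illustrative."""
    retention_days: int        # N from "!marb N"
    oldest_ts: Optional[int]   # oldest tracked event timestamp, if any
    newest_ts: int             # newest event timestamp seen via sync
    has_new_events: bool       # anything new since the last scan?
    member_count: int          # joined members, including the bot


def next_action(room: RoomState, now_ms: int) -> str:
    """Return "leave", "skip" or "redact" for one room."""
    if room.member_count <= 1:
        return "leave"         # the bot is alone in the room
    if now_ms - room.newest_ts > IDLE_LIMIT_DAYS * DAY_MS:
        return "leave"         # room has been idle for over 30 days
    if room.oldest_ts is not None:
        if now_ms - room.oldest_ts < room.retention_days * DAY_MS:
            return "skip"      # oldest event has not expired yet
    elif not room.has_new_events:
        return "skip"          # nothing tracked and nothing new
    return "redact"            # fetch room details and redact
```

The cheap checks run first, so the more expensive context and redact calls are only reached when something can actually have expired.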
Other remarks
Encrypted rooms are supported; the bot doesn't require encryption keys
The bot only stores / processes event_ids and room_ids + origin_server_ts (timestamps)
Database contains only room id, event id and oldest timestamp
No logging other than exception logging; even then, the only logged information is room_id and event_id
Currently deletion is rate limited to 400 messages / room / hour
The bot talks to the matrix-client.matrix.org server. The service is behind WireGuard, and the server doesn't reply to any scans / requests other than valid WireGuard packets.
Server runs in Oracle Cloud @ Frankfurt / RedHat.
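The limit of 400 redactions per room per hour mentioned above could be enforced with a sliding window, for example. The text doesn't say how the bot implements the limit, so the RoomRateLimiter class below is an assumption and only a sketch.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class RoomRateLimiter:
    """Sliding-window limiter: at most `limit` actions per room per window."""

    def __init__(self, limit: int = 400, window_seconds: float = 3600.0) -> None:
        self.limit = limit
        self.window = window_seconds
        # One queue of action timestamps per room_id.
        self._history: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, room_id: str, now: Optional[float] = None) -> bool:
        """Record one action and return True, or return False if over the limit."""
        if now is None:
            now = time.monotonic()
        history = self._history[room_id]
        # Forget actions that have fallen out of the window.
        while history and now - history[0] >= self.window:
            history.popleft()
        if len(history) >= self.limit:
            return False
        history.append(now)
        return True
```

A redaction loop would call allow(room_id) before each redact request and postpone the rest of the room's backlog to a later run once it returns False.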
Other reasons
The Element UI doesn't make it easy to mass-delete messages; it requires multiple "accurate" clicks to delete messages. It's much easier to automate it.
Tech reference
Keywords:
Matrix, chat, Element, data retention, lifetime, security, expiry, privacy, enhancing tools, bots, manual, documentation, ttl_bot, time to live, Matrix Auto Redact Bot (MARB)
2021-03-07