Heap queue, CoAP, DTLS, Flash, G Sites, Wget, ETL, IPFS, Data Encoding
Post date: Feb 25, 2018 6:18:54 AM
- Played a little with Python's heap queue (heapq). I hadn't yet run performance tests, but figured that if it ever matters, I'd compare heapq against first filling a list arbitrarily and sorting it. Pretty simple stuff to test and check out. Actually it's so simple I did it right away. Pretty much as expected, the heap is slower than just putting stuff in a list and sorting it when required. This of course assumes you've got clear points when the list is complete and can be sorted; if there's constant pushing and popping, the situation might be totally different. I just wanted to know the result for this specific use case.
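A throwaway sketch of the kind of comparison I mean; the data size and repeat count here are arbitrary:

```python
# Compare incremental heap pushes vs filling a list and sorting once.
import heapq
import random
import timeit

data = [random.random() for _ in range(100_000)]

def heap_fill():
    # Push items one by one, maintaining the heap invariant the whole time.
    h = []
    for x in data:
        heapq.heappush(h, x)
    return h

def fill_then_sort():
    # Fill arbitrarily, sort once when the list is complete.
    lst = list(data)
    lst.sort()
    return lst

print("heappush:    ", timeit.timeit(heap_fill, number=10))
print("append+sort: ", timeit.timeit(fill_then_sort, number=10))
```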
- Had some stuff to deal with and had to study Constrained Application Protocol (CoAP) and Datagram Transport Layer Security (DTLS), which I used with OpenSSL for DTLS and Python 3.6 with aiocoap on Linux. Got things working surprisingly fast, which was necessary: building the PoC code took less than an hour, even though I wasn't familiar with the details of CoAP or DTLS. Now it's just a matter of adding the business logic, since the basic communication components are already working.
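For reference, a bare-bones aiocoap client looks roughly like this; the DTLS credential setup for coaps:// is omitted, and coap.me is just a public test server used for illustration:

```python
# Minimal aiocoap GET sketch (plain CoAP, no DTLS credentials configured).
import asyncio
from aiocoap import Context, Message, GET

async def main():
    protocol = await Context.create_client_context()
    request = Message(code=GET, uri='coap://coap.me/test')
    response = await protocol.request(request).response
    print(response.code, response.payload)

asyncio.get_event_loop().run_until_complete(main())
```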
- One bank earlier allowed whopping six-digit passwords. Now they've enhanced their system, and the latest version forces users to use a maximum of four digits. That's awesome security, or what?
- Flash storage is actually getting slower. One of my temp flash drives is already quite slow: it's 80% full, with around 200k files. -> Free space fragmentation is a big problem; the 8 MB write blocks can almost always only be used partially. -> Write speed is more than halved currently, and if the drive gets any fuller, it's probably just going to get worse and worse. Fragmentation is a problem with flash too. Writing to fragmented space, combined with the drive's garbage collection, also radically adds to write amplification and drive wear.
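A rough back-of-envelope of why this hurts; the 40% free-space figure below is a made-up assumption just for illustration:

```python
# Toy model of GC-driven write amplification: to reuse a partially free
# 8 MB block, the drive first has to relocate the live data still in it.
BLOCK_MB = 8
free_fraction = 0.4                          # assumption, illustration only
new_data = BLOCK_MB * free_fraction          # host data that fits in the block
relocated = BLOCK_MB * (1 - free_fraction)   # live data GC must copy out first
wa = (new_data + relocated) / new_data
print(f"write amplification ~ {wa:.1f}x")    # ~2.5x at 40% free per block
```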
- Wondered again if and when Google Sites will provide a way to migrate old sites to the new layout, as well as SSL for custom domains. It should be trivial for a provider like Google. - This has been in the backlog for a long time, and I'm still waiting. Maybe I'll move my stuff to my own server. Using some kind of light static site generator with templates would work beautifully.
- Just noticed that the latest version of wget also maintains HSTS data (SSL/TLS related) in .wget-hsts. Very nice indeed.
- Wrote some test code to do a 100% coverage crawl of one network, without any optimizations yet. On a 1 Gbit/s connection with 64 threads, exploring the whole network took around 10 minutes. I'm pretty happy with the performance. I was expecting to see CPU saturation, but the network interface and related latencies were actually the limiting factor. This was just some PoC code, so much more work is required. But it's working, and it's really much faster than competing implementations: one parallel implementation was over 50x slower than my solution.
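Not the actual crawler, but the general shape is a plain thread pool over latency-bound probes; the address range and the TCP-connect probe below are placeholder assumptions:

```python
# Thread-pool scan sketch: threads spend most of their time waiting on the
# network, which is why 64 of them overlap nicely despite the GIL.
from concurrent.futures import ThreadPoolExecutor
import socket

def probe(host, port=80, timeout=2.0):
    # Hypothetical probe: just try a TCP connect and report the outcome.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return host, True
    except OSError:
        return host, False

hosts = (f"10.0.{b}.{c}" for b in range(256) for c in range(1, 255))
with ThreadPoolExecutor(max_workers=64) as pool:
    for host, alive in pool.map(probe, hosts):
        if alive:
            print(host)
```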
- On one ETL to ERP task there was some performance testing: my code handled over 150 stateful transactions per second per thread, which made me happy. Once again, the receiving end performed a lot worse, which isn't surprising. I usually prefer writing 'focused' code, which does what's required while dropping a huge amount of overhead. In this case, processing the 100k test transactions I generated took the other party an hour, and they aborted after processing only 7k of those. Smile! Keep it simple, do the obvious and efficient low-hanging-fruit optimizations, and don't do anything that isn't strictly required.
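Measuring that kind of per-thread throughput is trivial; a sketch, where process_transaction() is a hypothetical stand-in for the real work:

```python
# Per-thread throughput harness sketch; process_transaction() is a
# placeholder for the actual parse/validate/map/post logic.
import time

def process_transaction(tx):
    pass  # hypothetical: parse, validate, map to ERP format, post

def transactions_per_second(transactions):
    start = time.perf_counter()
    for tx in transactions:
        process_transaction(tx)
    elapsed = time.perf_counter() - start
    return len(transactions) / elapsed

print(f"{transactions_per_second(range(100_000)):.0f} tx/s on one thread")
```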
- I do appreciate customers who prefer that stuff be done properly. Way too many customers just want barely working stuff, with almost no testing and no use cases. Then it's a constant battle over whether something is a bug or a feature, and whether it should be fixed for free or not. Yet they still often fail to realize that when they want the cheapest barely working solution, it might cause problems later. And without proper testing, there will be issues in production. It's guaranteed. Edge case testing is also way too often lacking. Then it's up to me to decide whether some kind of error should be logged, ignored silently, or made a show stopper to force them to act on it. Anyway, whichever solution is chosen, it's guaranteed that they'll complain about it sooner or later.
- Read a lot of IPFS related documentation, configuration, etc. Anyway, it seems that the StorageMax parameter in the configuration isn't working correctly (?). I've configured it to 1 GB and currently the IPFS block storage is using 13 GB. Hmm...
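If I understand the docs correctly, StorageMax is only enforced when garbage collection actually runs (daemon started with --enable-gc, or a manual `ipfs repo gc`), which might explain the gap. A quick sanity-check sketch, assuming a default go-ipfs repo layout under ~/.ipfs:

```python
# Compare the configured StorageMax against actual blockstore usage.
# Assumes the default repo layout: config file and blocks/ directory
# under $IPFS_PATH or ~/.ipfs.
import json
import os
from pathlib import Path

repo = Path(os.environ.get("IPFS_PATH", str(Path.home() / ".ipfs")))
config = json.loads((repo / "config").read_text())
print("StorageMax:", config["Datastore"]["StorageMax"])

used = sum(f.stat().st_size for f in (repo / "blocks").rglob("*") if f.is_file())
print(f"blocks/: {used / 1e9:.2f} GB")
```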
- Encoding fun: one Open Source project uses base58, base64, base85, base16 (hex) and so on. Yawn! Also sha256, adler32, etc. are being used. All this mess. But hey, at least SHA-1 isn't in there.
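For the curious, most of these are a stdlib one-liner in Python anyway; base58 needs a third-party package, so it's left out here:

```python
# The encoding/checksum zoo, stdlib edition.
import base64
import hashlib
import zlib

data = b"hello world"
print("base16: ", base64.b16encode(data).decode())
print("base64: ", base64.b64encode(data).decode())
print("base85: ", base64.b85encode(data).decode())
print("sha256: ", hashlib.sha256(data).hexdigest())
print("adler32:", zlib.adler32(data))
```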