Sometime in the last week, the iSCSI volume of one of our PostgreSQL servers went up to 98% and Nagios vomited alerts onto the standby mobile.
The specific Postgres database holds customers’ preferences related to our webmail.
Unfortunately the webmail is a Java web app (Tomcat), custom-written by some company, and the source code is a spaghetti mess. The code also has a gazillion bugs, so we decided to migrate to an open-source PHP-based webmail. Hopefully in the near future we will officially migrate to the new webmail platform and all problems known to humanity will cease to exist.
Until then, we have to maintain the current webmail platform and figure out how a ~500MB database became a nearly ~50GB nightmare!
My knowledge of databases is not basic, but to be fair I lack experience. As a veteran standby engineer I know that I need to apply a quick & dirty patch first and investigate afterwards. Also, I am not afraid to ask for help! And so I did.
First thing to do: increase the volume on the storage machine. I’ve already said that we are using an iSCSI partition, so that action sounds pointless. In fact, no it isn’t! The storage machine reserves a percentage of each volume for snapshots, and the snapshots were “eating” space from the actual volume, so the increase gave us a little room to breathe. You are probably thinking that we should resize the partition instead, but this is a live production machine and we don’t want downtime on the service (umount/resize/mount).
From 98% to 93% with only one command.
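For the record, it was something along these lines; a hypothetical sketch assuming a NetApp-style filer (7-mode syntax), with a made-up volume name and size:

```bash
# Hypothetical sketch for a NetApp-style filer (7-mode syntax); the volume
# name "webmail_db" and the extra 10g are assumptions. Growing the volume
# gives the snapshot reserve room to breathe, so the LUN inside it gets its
# space back without touching (umount/resize/mount) the live partition.
vol size webmail_db +10g
```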
Second, and the most popular thing to do, was VACUUM. A colleague took that step and VACUUMed each table separately, so as not to “lock” anything or provoke the daemon into a crash, or even worse. That brought us down to 88% used disk and bought us time to think before acting again.
For all you people who don’t know Postgres: when rows are deleted or updated, Postgres doesn’t remove the old row versions from storage, it only marks them as dead inside the database. So you need to enable autovacuum, or VACUUM by hand from time to time, to reclaim that space.
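A minimal sketch of that step; the database name “webmail” and table name “prefs” are hypothetical:

```bash
# Per-table VACUUM, one table at a time, so the whole database is never
# busy at once. Plain (lazy) VACUUM does not take an exclusive lock.
psql webmail -c 'VACUUM VERBOSE prefs;'

# vacuumdb can drive the same thing from the shell:
vacuumdb --verbose --table prefs webmail
```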
Of course, before everything else (even before the VACUUM), we took a pg_dump to another partition.
But pg_dump was taking hours and hours to complete.
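The dump itself was nothing fancy; a sketch with hypothetical names:

```bash
# Safety dump before touching anything; "webmail" and /mnt/backup are
# hypothetical names. -Fc writes PostgreSQL's compressed custom format,
# which pg_restore can later restore table by table.
pg_dump -Fc webmail > /mnt/backup/webmail.dump
```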
After further investigation, we found the table that pg_dump was struggling with.
Fired up a new database and tried to restore this table there.
I couldn’t: the restore kept failing with a duplicate-key error.
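The isolation test looked roughly like this (again, “prefs” and “scratch” are hypothetical names):

```bash
# Reproduce the problem in isolation.
pg_dump -t prefs webmail > /tmp/prefs.sql   # dump only the suspect table
createdb scratch                            # fresh database to restore into
psql scratch -f /tmp/prefs.sql              # fails here: rebuilding the unique
                                            # index trips over the duplicates
```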
I tried to figure out the duplicate entries. 20 entries! The table has only four columns and ~50,000 rows, and only 20 of them were duplicates. The table’s data is ~20MB in size. I went through the entries and removed the duplicates by hand. After that I reindexed the specific table, and an hour later over 20GB were free. Down to 44% from 98% by deleting 20 entries.
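The duplicate hunt and the reindex were plain SQL; the table and column names below are made up:

```bash
# Find rows that appear more than once (names are hypothetical).
psql webmail -c "SELECT col_a, col_b, count(*)
                   FROM prefs
                  GROUP BY col_a, col_b
                 HAVING count(*) > 1;"

# After deleting the extra rows by hand, rebuild the table's indexes.
psql webmail -c 'REINDEX TABLE prefs;'
```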
At that point I was thinking that Postgres was mocking me. How the hell had a 20MB table grown past 20GB?
Now pg_dump takes 6.5 minutes in total, but it still spends a long time on this specific table.
Tomorrow is a new day to experiment with PostgreSQL.
[edit1]: Just to be fair, the Postgres version is 8.1.
[edit2]: The VACUUM process just finished. Another 20GB free!!! So for 20 duplicate entries, a total of 40GB of disk was freed! We are now at 9% used disk, down from 98%.
PS: We have already discussed a lot of plans in our department (upgrading the Postgres version, restoring the dump to a new machine, etc.), but we don’t want to commit to any of them (yet), as we haven’t found the trigger that blew the database up from 500MB to 50GB. After that, every play is on the table.
Most of the people reading this blog post should already know what Two-Factor Authentication (2FA) is.
For those who don’t: in short, 2FA means logging in to a server/site/application using two factors, not just one (your password).
Something you have and something you know, like when you use your bank card (something you have) with its PIN (something you know).
There are three (3) Android apps on F-Droid:
- Google Authenticator
- FreeOTP
- Gort
I don’t want to use Google Authenticator, and I haven’t tested FreeOTP yet, so I’ve chosen Gort to write about.
Gort uses the barada-pam package as its back-end.
Barada/Gort are based on the HMAC-Based One-Time Password algorithm (HOTP), not on time synchronization. This is useful when clocks drift: the server (barada) tracks a counter instead of the time, and it can still “catch” one-time passwords even if the client’s counter is out of sync by a little, by testing a small window of counter values.
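To make the counter idea concrete, here is a small sketch with the oathtool CLI from oath-toolkit; barada itself is a PAM module, so this is only an illustration of HOTP, using the test key published in RFC 4226:

```bash
# HOTP demo with oathtool (an illustration, not part of barada/gort).
# The hex key is the RFC 4226 test key "12345678901234567890".
KEY=3132333435363738393031323334353637383930

# Client side (Gort's job): generate the OTP for counter 0 -> prints 755224.
oathtool --hotp --counter 0 $KEY

# Server side (barada's job): validate a submitted OTP by trying a small
# window of counter values, which is how a slightly out-of-sync counter
# still gets "caught".
oathtool --hotp --counter 0 --window 5 $KEY 755224
```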
I’ve built a Docker image to demo barada/gort at the next Security Talk at the Athens, Greece Hackerspace, and below is the link to the page on my wiki that contains the Dockerfile.
Be aware of the notes/comments in the file.
I like to read.
One of my biggest fears is that I’ll never be able to read all the books I’ve noted down in my entire life. Even in this technological era, when everything is easier and faster, the amount of free time is always less than the year before. That’s the way it is for me, and it seems that I can’t do anything to fix it. So I like to keep a short book list, because my long book list is about a million unread books (give or take a few thousand). I understand that the previous statement is somewhat of an overstatement, but as I’ve already mentioned at the beginning of this blog post, I like to read (although I am not a very smart person).
I had never read William Gibson, and I thought I’d start with Neuromancer. This would be my first cyberpunk book EVER, so I believed I would love it.
If you have good friends, as I do, they will find a loophole to screw with you in everything you’ll ever do. And so they did! They told me to stop reading Neuromancer and start with Burning Chrome, also by William Gibson.
And so I did! Burning Chrome is a collection of cyberpunk/science fiction short stories written before Neuromancer, and it was my first intro to cyberpunk.
After that… I read about Transhumanism and watched videos about it.
But the most important thing of all is the fun of knowledge!
I wish I had figured that out when I was young and in school, instead of arguing all day with my teachers about everything.
So I am finally reading Neuromancer! I am at 40% on the ebook, 50 days after my first attempt to read it.