Jul 29, 2010
A simple Python script to deduplicate a mailbox (mbox format).
#!/usr/bin/env python
# Created by Evaggelos Balaskas on Thu Jul 29 21:22:41 EEST 2010
# Remove duplicate mails from mbox using message-id
import sys
import mailbox
if len(sys.argv) == 2:
    mid = []
    for message in mailbox.mbox(sys.argv[1]):
        s = message['message-id']
        if s not in mid:
            mid.append(s)
            print message
else:
    print "Usage should be: " + sys.argv[0] + " mbox > new.mbox"
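
The script above uses the Python 2 print statement. For Python 3, the same idea could be sketched roughly as follows. This is a variant under a few assumptions, not the original script: dedupe_mbox is a hypothetical helper name, the result is written to a second mbox file with mailbox.mbox.add() instead of being printed to stdout, seen IDs are kept in a set rather than a list, and messages that have no Message-ID header are all kept instead of being collapsed into one.

```python
import mailbox


def dedupe_mbox(src_path, dst_path):
    """Copy src_path to dst_path, skipping repeated Message-IDs.

    Hypothetical helper for illustration. Returns (kept, dropped).
    If dst_path already exists, messages are appended to it.
    """
    seen = set()
    kept = dropped = 0
    # create=False: fail loudly if the source mailbox does not exist
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    try:
        for message in src:
            msg_id = message['message-id']
            if msg_id is not None:
                # Normalise to a stripped string before comparing
                msg_id = str(msg_id).strip()
                if msg_id in seen:
                    dropped += 1
                    continue
                seen.add(msg_id)
            # Assumption: mails without a Message-ID are never duplicates
            dst.add(message)
            kept += 1
        dst.flush()
    finally:
        src.close()
        dst.close()
    return kept, dropped
```

Calling dedupe_mbox("old.mbox", "new.mbox") returns a (kept, dropped) pair, so it is easy to check how many mails were removed before replacing the original mailbox.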
You can also take a look at my other Python script: How to remove specific mails from a mbox by subject