Wednesday, October 28, 2009

My (not so) secret plan

Did some work on repackaging Cassandra as an OSGi-based daemon. The secret plan is to have 3 Eclipse products:

Cassandra node

Cassandra is the clustered DB. This kind of database is designed with high availability in mind and operates in "multi-master" mode by default. The optimal configuration starts at 3 nodes: set your database to "validate transactions" when at least 2 nodes have a copy of the data, read your data from any node, and the routing of your requests will happen magically (well, multicast and versioning of data, but let's call it "magic").
Want more performance? Add a node and reads will hit multiple nodes & use some map-reduce. Cassandra was benchmarked to outperform MySQL at 8 nodes. Yes, it means 8 low-cost 1U servers to outperform a single MySQL. But now you can scale easily, without requiring expensive hardware.
Want more availability? Add a node and set your "transaction safety level" to more than 2 nodes.
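
To make the "2 copies out of 3 nodes" arithmetic concrete, here is a minimal sketch of the general quorum rule. The class and method names are hypothetical and this is not Cassandra's API; it only assumes N replicas, W acknowledged copies per write and R replicas consulted per read.

```java
// Hypothetical helper for illustration; this is not Cassandra's API.
final class QuorumSketch {

    /** How many replicas can be down while a write still collects writeAcks copies. */
    static int tolerableWriteFailures(int replicas, int writeAcks) {
        if (writeAcks < 1 || writeAcks > replicas) {
            throw new IllegalArgumentException("writeAcks must be between 1 and replicas");
        }
        return replicas - writeAcks;
    }

    /** General quorum rule: a read overlaps every acknowledged write when R + W > N. */
    static boolean readOverlapsLatestWrite(int replicas, int writeAcks, int readAcks) {
        return readAcks + writeAcks > replicas;
    }

    public static void main(String[] args) {
        // The 3-node setup from the text: a write is validated once 2 nodes hold the data.
        System.out.println(tolerableWriteFailures(3, 2));
        // Reading from a single node does not guarantee overlap; data versioning has to sort it out.
        System.out.println(readOverlapsLatestWrite(3, 2, 1));
        // One more node, with the safety level raised to 3 copies.
        System.out.println(tolerableWriteFailures(4, 3));
    }
}
```

Under these assumptions, reading from a single node is cheap but relies on versioning to settle which copy is the newest one.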

LMTP over tcp daemon

This one would be the second daemon. It would handle LMTP, apply advanced filtering, and query Cassandra for similar data to perform conversation grouping.
This one doesn't store anything and would be completely stateless. Deploy as many as needed to handle your Postfix load. Just hide them behind some basic DNS round-robin.

IMAP connector

This one is needed to keep Thunderbird in the loop. It reads Cassandra data. The IMAP client was a 20-day job, so this one shouldn't be too hard. Just deploy some of them with DNS round-robin.
The MiniG backend would hit Cassandra nodes directly.

No code available on the net yet, but I want to try a messaging system built this way. This is just a personal "will try" project. PostgreSQL, Cyrus+Murder & Heartbeat are not out of the loop... yet :D

Edit: to give some OBM relevance to this blog entry, Adrien added email support to o-push. It works fine on Win Mobile, is very slow on Nicolas's Android 2.x (HTC Magic with custom firmware for native Exchange support), very fast on my iPhone, and gives mixed results on Mehdi's Nokia E71. Expect a blog entry dedicated to that once the first round of fixes hits googlecode.

Friday, October 23, 2009

Innovation from mozilla team

The Mozilla Raindrop project just appeared on the web.

Like it or not, at least Google with GMail & Wave is no longer the only one trying to innovate in the email business.

Wednesday, October 21, 2009

Status update

OBM 2.3 freeze mode is on.

MiniG isn't branched, so I'm trying to add small features without destroying mail reading.

obm-sync is receiving lots of performance-related commits. We tweaked most of our SQL queries to get good performance on both MySQL & PostgreSQL.

MySQL & PostgreSQL do not require the same kind of tuning. MySQL prefers 2 queries: "select a from b ..." followed by "select c from b where in ()". PostgreSQL performs better with a single "select c from b inner join a where...". If we do the one-query version, PostgreSQL wins & MySQL crawls to death. If we do the two-query version, MySQL outperforms PostgreSQL a little on low-cost hardware benchmarks.

We (re)did o-push tests on a Nokia E71 with MailForExchange. The calendar part of o-push is working very well. We still need to take a look at contacts. We expect to release obm-2.3 rc0 with contacts & calendar working in o-push. Adrien is working on the mail part of o-push and discovered that Microsoft has its own alternative to charset names: numeric code pages. Forget iso-8859-1, let's call it 28591. utf-8 is 65001. o-push just won a new mapping table.
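
A minimal sketch of what such a mapping table could look like. The class and method names are hypothetical and this is not the actual o-push code; only the two code pages quoted above are filled in.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, not the actual o-push mapping table.
final class CodePageMapping {

    // Microsoft code page number -> Java charset. Only the two entries from the text.
    private static final Map<Integer, Charset> CODE_PAGES = Map.of(
            28591, StandardCharsets.ISO_8859_1,
            65001, StandardCharsets.UTF_8);

    /** Returns the charset for a code page, or empty when the table has no entry. */
    static Optional<Charset> charsetFor(int codePage) {
        return Optional.ofNullable(CODE_PAGES.get(codePage));
    }

    /** Decodes bytes sent by the device using the code page it announced. */
    static String decode(int codePage, byte[] bytes) {
        Charset charset = charsetFor(codePage).orElseThrow(
                () -> new IllegalArgumentException("unknown code page " + codePage));
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        System.out.println(charsetFor(28591)); // the ISO-8859-1 entry
        System.out.println(charsetFor(12345)); // not in the table
    }
}
```

Returning an empty Optional for unknown code pages leaves it to the caller to pick a fallback or reject the message.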

Still wondering if the new OBM contact screen will be ready for prime time. As it is, I don't like it. Knowing Mehdi's & David's skills with CSS & Javascript, I think we can still have a new killer module for OBM 2.3. Right now, removing the MiniG contact screen to use the OBM one would get the "Over my dead body" response. But as we're all inspired by Snow Leopard Address Book, Google Contacts and MobileMe, we'll find a UI that satisfies everyone, including users ;-)

MiniG seems to be in pretty good shape (lots of people use it as their only mail software, including me). People on the OBM mailing list seem to have a hard time installing it. I need to figure out why :/

Sunday, October 11, 2009

New round of MiniG changes

The hottest topic for MiniG was thread grouping. I did all the modifications to make it use Cyrus support for RFC5256. This is slower than the algorithm I came up with, but my code was only faster because it ignored the case that forces you to work with the "References" header.

"Slower" is something that should be taken with care. The load on the MiniG backend JVM is greatly reduced. The "UID THREAD REFERENCES UTF-8 ALL" IMAP command forces Cyrus to do the hard work. The new code replaces MiniG JVM load with Cyrus I/O load. As most sysadmins are OK with dedicating their SAN resources to Cyrus, but tend to hate Java VMs, this is OK: they will see a waiting JVM not sucking any resource and a Cyrus crawling the SAN mounts to death :-)

I also fixed MiniG backend memory usage. MiniG works with conversations, IMAP servers with messages. MiniG needs to maintain a mapping between conversation identifiers and IMAP message UIDs. This mapping was sucking up a lot of memory. MiniG already has an on-disk version of this mapping. Easy solution: Java SoftReference. A SoftReference is a Java reference whose target the JVM is allowed to collect when it is under memory pressure. Most of the time, MiniG will use the in-memory mapping. When memory pressure is too high, the cache will be evicted & rebuilt on demand. After 11 years of Java programming, this was the first time I used SoftReference. I knew how strong/weak/soft/phantom references work in Java, but this is the first time my code could not work without them.
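
A minimal sketch of the SoftReference pattern described here. The class name, the mapping type and the loader are assumptions made for illustration (the loader stands in for reading the on-disk version); this is not MiniG's actual code.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: conversation id -> IMAP UIDs, held through a SoftReference.
final class SoftMappingCache {

    // Stand-in for "load the mapping from its on-disk version".
    private final Supplier<Map<String, long[]>> diskLoader;

    // The JVM may clear this reference when it runs short on memory.
    private SoftReference<Map<String, long[]>> ref = new SoftReference<>(null);

    SoftMappingCache(Supplier<Map<String, long[]>> diskLoader) {
        this.diskLoader = diskLoader;
    }

    /** Returns the in-memory mapping, rebuilding it on demand if it was collected. */
    synchronized Map<String, long[]> mapping() {
        Map<String, long[]> current = ref.get(); // strong reference while we use it
        if (current == null) {
            current = diskLoader.get();
            ref = new SoftReference<>(current);
        }
        return current;
    }

    public static void main(String[] args) {
        SoftMappingCache cache = new SoftMappingCache(() -> {
            Map<String, long[]> loaded = new HashMap<>();
            loaded.put("conversation-1", new long[] {12L, 15L, 42L});
            return loaded;
        });
        System.out.println(cache.mapping().get("conversation-1").length);
    }
}
```

Callers should keep the returned map in a local variable while they work with it, since only a strong reference prevents the mapping from being collected mid-operation.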

As it was "optimisation week-end", I also worked on MiniG bandwidth consumption. MiniG was already very good for remote access with limited bandwidth, but things could be made better.

When you display your inbox, the MiniG JavaScript polls the backend for changes every 20 seconds. This AJAX call used to download the current page of messages & redraw the grid. This transfer is in the 2KB range. I introduced versioning to the MiniG caches. The AJAX call now sends its last known version to the frontend. The frontend relays it to the backend. The backend answers with HTTP 304 when nothing has changed. The frontend then sends a "UseCachedData" RuntimeException to the JavaScript. The most common case, "nothing changed", goes from a 2KB transfer to an 81-byte transfer. Quite a nice improvement.
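
The backend side of this version check can be sketched as follows. The class, field and method names are hypothetical and the real MiniG code surely differs; the sketch only assumes that each folder page carries a version number that is bumped on every change.

```java
import java.net.HttpURLConnection;

// Hypothetical sketch of a versioned folder cache answering poll requests.
final class VersionedFolderCache {

    /** What the backend sends back: a status, the current version and maybe a body. */
    static final class Response {
        final int status;
        final long version;
        final String body; // null when the client copy is still valid

        Response(int status, long version, String body) {
            this.status = status;
            this.version = version;
            this.body = body;
        }
    }

    private long version = 1;
    private String page;

    VersionedFolderCache(String initialPage) {
        this.page = initialPage;
    }

    /** Called when the folder content changes (new mail, flag change...). */
    synchronized void update(String newPage) {
        page = newPage;
        version++;
    }

    /** Answers 304 without a body when the client already has the current version. */
    synchronized Response poll(long clientVersion) {
        if (clientVersion == version) {
            return new Response(HttpURLConnection.HTTP_NOT_MODIFIED, version, null);
        }
        return new Response(HttpURLConnection.HTTP_OK, version, page);
    }

    public static void main(String[] args) {
        VersionedFolderCache inbox = new VersionedFolderCache("page of messages");
        Response first = inbox.poll(-1);             // client has nothing yet
        Response second = inbox.poll(first.version); // nothing changed since
        System.out.println(first.status + " then " + second.status);
    }
}
```

In this sketch the saving comes from the empty body: the "nothing changed" answer carries only a status and a version number.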

For now this is only done for IMAP folders, not for search results. But as I bought the "SOLR 1.4 Enterprise Search Server" book (a very good investment for Solr users), I learned that their number one recommendation for Solr performance is to enable "proxy support", which makes Solr respond with HTTP 304 when your last search gives the same result. That's why I started implementing support for HTTP 304 in the MiniG backend & frontend. Adding 304 support to MiniG search will give a big performance boost to "Unread" mail fans.

The upcoming release also adds a user-visible feature: when you read a conversation, all unread emails are auto-expanded.

Next bug on my list is about composer, iframes in design mode & cut'n'paste. After 2 years of minig work, I tend to hate those "browser dependent" bugs.

I'm still hoping to do a stable minig release this week with obm 2.2.14, the composer bug being the last known release blocker.

Saturday, October 10, 2009

Status of my OBM parts before oct 15th freeze

Worked 4/5 days on making sure that obm-sync was performing correctly on a 13GB OBM 2.3 PostgreSQL database. I must thank YourKit again for the license they granted me for MiniG. Their software is not free (as in speech), but they give away licenses for free software projects. This kind of uber-cool software just tells you when your code is shit.

Small commit on obm-caldav to prevent it from fetching all calendar permissions from the database. YourKit helped on this one too.

MiniG thread grouping still needed some work. A Slashdot article from Cyrus's most active contributors gave me some hints. The answer: "UID THREAD REFERENCES UTF-8 ALL". This simple IMAP command forces Cyrus to do all the threading calculation for you. The MiniG part is still complicated, but from a performance point of view, it's a win.
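
To give an idea of the MiniG side, here is a minimal sketch that turns an RFC 5256 untagged THREAD response into one list of UIDs per top-level thread. The class name and the choice to flatten each thread tree into a flat conversation are assumptions made for illustration, not MiniG's actual code, and the parser expects a well-formed line.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: flattens each top-level thread of a "* THREAD ..." line.
final class ThreadResponseSketch {

    static List<List<Long>> conversations(String line) {
        List<List<Long>> result = new ArrayList<>();
        List<Long> current = null;            // UIDs of the thread being read
        int depth = 0;                        // parenthesis nesting level
        StringBuilder number = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (Character.isDigit(c)) {
                if (depth > 0) {
                    number.append(c);         // digits only count inside a thread
                }
                continue;
            }
            // Any non-digit character ends the UID being read.
            if (number.length() > 0 && current != null) {
                current.add(Long.parseLong(number.toString()));
            }
            number.setLength(0);
            if (c == '(') {
                if (depth == 0) {
                    current = new ArrayList<>(); // a new top-level thread starts
                }
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth == 0) {             // the top-level thread is complete
                    result.add(current);
                    current = null;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Response of the shape described in RFC 5256.
        System.out.println(conversations("* THREAD (2)(3 6 (4 23)(44 7 96))"));
    }
}
```

Flattening drops the parent/child structure inside a thread, which is acceptable only if a conversation is displayed as a flat list of messages.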

OK, I really depend on Cyrus features. For today's deadlines, that's fine. But the architecture I want to have for MiniG is:

incoming mail -> ironport (or any other _efficient_ spam filtering solution) -> minig_lmtp -> cassandra clustered db

Solr would be plugged into Cassandra (Cassandra is the big table implementation that Facebook released to the Apache group as free software).

Yes, Cyrus does not exist in the architecture I'm hoping to get MiniG to. An (s|l)mtp connector to a clustered DB with a full-text indexer is Google's architecture. This architecture seems like the right one to me. Murder + Heartbeat seem pretty fragile when you compare them to Cassandra/big table reliability.

Sunday, October 04, 2009

Release often, Release early

Very early in some cases... We had to prepare a demo of o-push with obm 2.3 trunk (bleeding edge) and MiniG.

I had to stop working on minig temporarily to get o-push in a demo-able state.

OPush work was stalled by higher priority tasks, but after working on it all the week-end, it's in a pretty cool shape. I've just committed true push support for calendars.

Here is a demo video I did with my test setup: every modification in the obm 2.3 calendar appears on the PDA a few seconds later.

(Yes, it's not perfect, I still don't know why I had to go back to the ActiveSync application while recording)

PS: I had to convert the video to xvid as youtube seems to do shit with ogv's from gtk-recordMyDesktop