Back in the dark ages of the 1990s, when we switched from file-based email systems like cc:Mail and MS Mail to the database-structured Exchange, one of the big selling points was that an email sent to fifty users wouldn't clutter up our email servers with fifty copies of the attached Dilbert cartoon. Instead, Exchange would store a single copy of the message regardless of how many users it was sent to.
With the upcoming Exchange 2010 upgrade, Microsoft is abandoning single instance storage. They're also killing off the unloved shared disk clustering (single copy cluster, or SCC, in Redmondese), pushing instead the cluster technologies first introduced in Exchange 2007. It's therefore time to rethink how we provision storage for Exchange.
While single instance storage always sounded like a good idea, it's never really delivered what admins expected. Exchange stored a single instance of a given email, but if that message was forwarded, or if a user attached the same file to another email, Exchange started storing duplicate data anyway.
Then we got Exchange 2000, which allowed us to split the single information store into several smaller databases, simplifying backups and restores and reducing online database fragmentation. Because Exchange only did single instancing within each information store, duplicate data started appearing in even more places.
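To put rough numbers on that, here is a toy Python model of per-store single instancing. It is only a sketch to illustrate the arithmetic, not how Exchange actually lays out its databases: the MailboxDatabase class, the simulate function, the round-robin mailbox placement and the message IDs are all made up for this example.

```python
class MailboxDatabase:
    """Toy model of one information store (hypothetical, not Exchange's real design)."""

    def __init__(self, name):
        self.name = name
        self.bodies = {}      # message_id -> size in bytes, stored once per store
        self.references = {}  # message_id -> set of mailboxes pointing at that copy

    def deliver(self, message_id, size, mailbox):
        # Store the body only the first time this store sees the message ID.
        self.bodies.setdefault(message_id, size)
        self.references.setdefault(message_id, set()).add(mailbox)

    def bytes_stored(self):
        return sum(self.bodies.values())


def simulate(num_users, num_databases, size, message_id="dilbert-001"):
    """Deliver one message to every user and return total bytes stored.

    Mailboxes are spread round-robin across the stores (an assumption
    made purely for illustration).
    """
    databases = [MailboxDatabase(f"db{i}") for i in range(num_databases)]
    for user in range(num_users):
        store = databases[user % num_databases]
        store.deliver(message_id, size, f"user{user}")
    return sum(db.bytes_stored() for db in databases)


if __name__ == "__main__":
    one_mb = 1_000_000
    print(simulate(50, 1, one_mb))  # one store: one copy of the cartoon
    print(simulate(50, 5, one_mb))  # five stores: one copy per store
```

In this model, fifty recipients in one store cost one copy of the cartoon, while the same fifty recipients spread over five stores cost five. A forward behaves like a brand new message ID in the model, so it adds yet another full copy even inside a single store.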
Finally, the move mailbox wizard isn't smart enough to identify messages that already exist in the target information store, so every item moved looks like a new message, breaking the single instance paradigm. Since upgrading from Exchange 2000 or 2003 to 2007 or 2010 requires that each mailbox be moved, any messages sent before your last Exchange server upgrade aren't single instanced today.