Crunching big data continues to challenge the industry to deliver faster and bigger solutions. Not only is data growing, but the advent of the Internet of Things and data-rich streams, such as those generated by retail store traffic pattern monitoring, is accelerating both the need to move data and the requirement to analyze it rapidly.
The IT industry is responding. For example, Intel has created a reference design for a memory-oriented system with 6TB of DRAM. This is aimed at extending the reach of in-memory databases, which are currently the leading edge of database performance.
In just a year or two, Hybrid Memory Cube (HMC) memories will arrive with similar capacity, but much higher bandwidth, reaching terabyte-per-second levels. This type of memory will apply to high performance computing (HPC) as well as big data applications, and will tie into very high core count CPUs. There also are plans for GPU clusters with even more compute power using HMC.
With such a rapid evolution of engine power, there’s a real risk that storage and networking systems will fall behind, in some ways returning us to the state of play of a few years ago. Both the LAN and the storage network need to evolve substantially.
First, there’s LAN speed. Ethernet LANs are the primary in-feed mechanism for systems. That’s where the Internet of Things delivers sensor data, for example. It may be that data will terminate in the big data servers, or it may end up being staged to fast storage. Either way, these will be fat streams, and the LAN speed will need to be higher than today’s performance standard, which is 10 gigabit Ethernet. A terabyte per second of memory bandwidth is 8,000Gb/sec, so at just 5% of the putative DRAM bandwidth for HMC, we are looking at 400Gb/sec!
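The 400Gb/sec figure can be checked with a quick back-of-envelope calculation. The sketch below is only illustrative: it assumes "terabyte-per-second" means exactly 1TB/sec in decimal units, and the function name `lan_feed_gbps` is hypothetical.

```python
# Back-of-envelope check of the LAN feed rate.
# Assumption: HMC memory bandwidth is 1 TB/sec (decimal units).

BITS_PER_BYTE = 8

def lan_feed_gbps(memory_tb_per_sec: float, fraction: float) -> float:
    """LAN rate in Gb/sec needed to feed `fraction` of the memory bandwidth."""
    memory_gbps = memory_tb_per_sec * 1000 * BITS_PER_BYTE  # TB/sec -> Gb/sec
    return memory_gbps * fraction

needed = lan_feed_gbps(1.0, 0.05)
print(f"Needed LAN rate: {needed:.0f} Gb/sec")
print(f"Equivalent 10GbE links: {needed / 10:.0f}")
```

Under that assumption, feeding 5% of memory bandwidth takes the equivalent of 40 of today's 10GbE links.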
The real key to the need will be the churn rate of the in-memory data. This could be high, as in retail tracking systems or surveillance systems, or it could be low, as in genome data crunching. In the high-churn cases, even a single 400GbE link might not cut it, while 40GbE might be enough for low-churn work such as genome mining.
Note that these scenarios are based on a single server node. With servers deployed in clusters, network backbones will need all the speed they can get, suggesting that backbones with multiple 100GbE or 400GbE links will be needed for this work in two or three years’ time.
Storage adds to the complex picture. If data is not completely transient, it has to be stored somewhere. Examples are surveillance footage and the raw sensor data in retail. Most large bulk storage arrays achieve around 100Gb/sec of total disk bandwidth. But that’s with HDD. If SSDs are used, the number increases to around 500Gb/sec. One might argue that SSDs are unlikely in this context, but prices have dropped a good deal, and performance may determine that SSD or all-flash arrays are the right choice.
Taking these together, the need for links in the 400Gb/sec range for storage is fast approaching. We’ll see the leading edge of a demand wave before 2017. By then, 100Gb/sec single-fibre links should be in production, with multiple lanes ganged together for higher bandwidth. It won’t be cheap, but if we keep our eye on the ball, we should have the networks to keep up.
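As a rough illustration of what ganging lanes implies, the sketch below counts how many 100Gb/sec lanes would be needed to carry the bandwidth figures quoted above. The helper `lanes_needed` is hypothetical, and the calculation ignores protocol overhead and encoding losses, so treat the results as lower bounds.

```python
import math

def lanes_needed(target_gbps: float, lane_gbps: float = 100.0) -> int:
    """Minimum number of ganged lanes to carry target_gbps (no overhead assumed)."""
    return math.ceil(target_gbps / lane_gbps)

# Bandwidth figures taken from the article, in Gb/sec.
scenarios = [
    ("HDD bulk array", 100),
    ("Storage link target", 400),
    ("SSD array", 500),
]

for label, target in scenarios:
    print(f"{label}: {lanes_needed(target)} x 100Gb/sec lanes")
```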