The 10 best cloud instances for In-Memory Computing

2017 has brought a series of announcements regarding large memory computing in public clouds.

That’s a further step forward in the strategy of public cloud providers who value in-memory computing for its business transformation potential. When you put an entire dataset in memory, you can do everything on the fly, analytics and calculation as well, on the live data. Instead of reporting after the fact, you can detect opportunities and issues in a timely fashion, accurately simulate decisions with what-if analysis, and take the best actions.

It is natural that cloud providers try to capture those new use cases, and their latest move is to commoditize large memory hardware. This is already in motion: this year VMs with 2TB and even 4TB of RAM have become widely available in public clouds, and VMs with up to 16TB have even been announced!

This eliminates a big hurdle. In most organizations infrastructure teams focus on standardization and prefer a homogeneous base of servers. This saves costs but inevitably creates technology-adoption inertia, and most IT teams have never seen a server with 2TB of RAM. Now any developer can launch one in a snap!

There is another big obstacle however: cost. Although the number of business processes it could transform is limitless, in-memory technology is still associated with always-on, mission-critical applications that justify the investment. Cloud computing will also change that: you can now allocate infrastructures on the fly, for the time you need, and only pay for what you use.
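
To give a sense of scale, here is a minimal sketch comparing an always-on instance with one that is only started when needed. It uses the pay-as-you-go rate quoted later in this article for the AWS x1.32xlarge. The usage pattern (2 hours on each of 22 business days) and the 730-hour month are assumptions chosen for illustration, and storage and data transfer costs are left out.

```python
HOURS_PER_MONTH = 730  # assumption: average month, 365 * 24 / 12


def monthly_cost(hourly_rate: float, hours_used: float) -> float:
    """Pay-as-you-go cost for the hours an instance is actually running."""
    return hourly_rate * hours_used


rate = 13.338  # $/hour, x1.32xlarge rate quoted in this article

always_on = monthly_cost(rate, HOURS_PER_MONTH)
on_demand = monthly_cost(rate, 2 * 22)  # hypothetical: 2 hours x 22 business days

print(f"always on: ${always_on:,.2f} per month")
print(f"on demand: ${on_demand:,.2f} per month")
```

Under those assumptions the on-demand pattern costs a small fraction of the always-on one, which is the whole point of paying only for what you use.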

That’s a bit theoretical though, because not every software platform can be operated on-demand. In-memory databases from Oracle, Microsoft and SAP are now available in the cloud, but still follow a monolithic architecture where storage is attached to and managed by the database, and that database is always on. To take full advantage of the cloud model, another approach is needed that decouples storage from compute and allows customers to use memory resources on demand. In practice it means storing data in object storage, and loading it on the fly at massive speeds in VMs, started on demand and disposed of when the analysis is done.

With that in mind, let’s look at what’s available in the big public clouds, as of December 2017, and make a list of the 10 best instances to operate on-demand in-memory computing in the cloud. In this article, we will also give some tips on how to choose the instance that fits you best…

Small (up to 128GB)

Example use cases: e-commerce dynamic pricing (single website), risk monitoring for a trading desk…

  • AWS R4.4xlarge: 122GB RAM, 8 cores, Intel Xeon E5 v4 family, $1.064/hour (pay as you go)
  • Azure E16 v3: 128GB RAM, 8 cores, Intel Xeon E5 v4 family, $1.064/hour (pay as you go)
  • Google n1-highmem-16: 104GB RAM, 8 cores, Intel Xeon E5 v4 family, $0.9472/hour (pay as you go)

In this very competitive segment, the hardware specs are almost standardized: same number of cores, same processors, same price per GB of RAM.
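
The price-per-GB claim can be checked with a few lines of arithmetic. This is only a sketch using the RAM sizes and hourly prices listed above; the function and variable names are ours.

```python
# (RAM in GB, pay-as-you-go price in $/hour), as listed in this article
instances = {
    "AWS R4.4xlarge": (122, 1.064),
    "Azure E16 v3": (128, 1.064),
    "Google n1-highmem-16": (104, 0.9472),
}


def price_per_gb_hour(ram_gb: float, hourly_price: float) -> float:
    """Hourly price divided by the amount of RAM you get for it."""
    return hourly_price / ram_gb


per_gb = {
    name: price_per_gb_hour(ram, price)
    for name, (ram, price) in instances.items()
}

for name, value in per_gb.items():
    print(f"{name}: ${value:.4f} per GB of RAM per hour")
```

All three land within about 10% of each other, a little under one cent per GB of RAM per hour.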

Medium (up to 512GB)

Example use cases: global Finished Vehicle Logistics management, multi-channel and geolocalized dynamic pricing

  • AWS R4.16xlarge: 488GB RAM, 2×16 cores, Intel Xeon E5 v4 family, $4.256/hour (pay as you go)
  • Azure E64 v3: 432GB RAM, 2×16 cores, Intel Xeon E5 v4 family, $4.011/hour (pay as you go)
  • Google n1-highmem-64: 416GB RAM, 2×16 cores, Intel Xeon E5 v4 family, $3.7888/hour (pay as you go)

A couple of years ago operating a server with half a terabyte of memory was still considered high-end. But today this is completely mainstream and offerings are aligned.

That being said, when you reach this size networking plays a more important role. On-demand in-memory solutions must be able to download large datasets from object storage in a short time, so that they can truly be started on the fly.

This speed depends on the throughput of the object storage, the network adapter on the VM itself, and the performance of the in-memory engine software. The ActiveViam platform, for instance, comes with special connectors that pull data from object storage over tens of parallel HTTP connections while compressing the data into in-memory structures at the same time.

There are differences between public clouds. Azure E64 v3 instances can operate at 30Gbps, AWS R4.16xlarge at 20Gbps, and the Google n1-highmem-64 at 16Gbps. Note that those numbers are given for VM to VM networking. For on-demand in-memory computing you want to know how fast you can download a dataset, which is driven by the throughput between object storage and the VMs. So it’s better to make real benchmarks before choosing a provider or a VM.

Here are some of the best practices for optimal transfer rates that we have tested and implemented in the ActiveViam platform:

  • Use object storage and VMs from the same region
  • Make sure the dataset is spread on many nodes in the object storage, to aggregate the throughput of several network links.
  • Download multiple files in parallel instead of one by one
  • Download multiple chunks of a file in parallel using HTTP range requests
  • Build in-memory tables and indexes continuously, while the data is streamed, to add as little delay as possible to the data transfer itself
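
As an illustration of the range-request practice, here is a minimal Python sketch that splits one object into byte ranges and fetches them over several connections at once. It is not the ActiveViam connector: the function names, chunk size and worker count are assumptions, and the object size is taken as given (in practice it could come from the `Content-Length` of a HEAD request).

```python
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(total_size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Cut [0, total_size) into inclusive (start, end) byte ranges."""
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


def http_range_fetcher(url: str) -> Callable[[int, int], bytes]:
    """Return a function that downloads one byte range of `url`."""
    def fetch(start: int, end: int) -> bytes:
        request = urllib.request.Request(
            url, headers={"Range": f"bytes={start}-{end}"}
        )
        with urllib.request.urlopen(request) as response:
            return response.read()
    return fetch


def download_in_parallel(
    fetch: Callable[[int, int], bytes],
    total_size: int,
    chunk_size: int,
    workers: int = 16,
) -> bytes:
    """Fetch all ranges concurrently and reassemble them in order."""
    ranges = split_ranges(total_size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the order of `ranges`, so the chunks
        # can be concatenated directly even if they finish out of order.
        chunks = pool.map(lambda r: fetch(*r), ranges)
        return b"".join(chunks)
```

A call would look like `download_in_parallel(http_range_fetcher(url), size, 64 * 1024 * 1024)`, where `url` and `size` are placeholders for a real object. The sketch joins the chunks into one buffer for simplicity; an engine following the last practice in the list would instead hand each chunk to its loader as soon as it arrives.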

A word of caution with AWS on this topic: AWS limits the throughput at which a compute instance can download data from S3 object storage. The limit is currently set at 5Gbps. For a 100GB dataset the difference is not huge: you may need 5 minutes instead of 1 minute to download. But for larger use cases and bigger VMs this becomes more and more penalizing.

Large (up to 2TB)

Example use cases: supply chain control tower for global retailers, market risk value at risk and expected shortfall analytics

This category is very interesting because until last year most IT teams had never seen a server with 2TB of RAM, while one year later any developer can launch one in a snap. Or tens of them! It will take a bit of time for organizations to realize that, but probably not that long.

  • AWS x1.32xlarge: 1950GB RAM, Intel Xeon E7 8800 v3 (4×16 cores), $13.338/hour (pay as you go)
  • Azure M128s: 1900GB RAM, Intel Xeon E7 8890 v3 (4×16 cores), $19.23/hour (pay as you go)

Instances are similar, based on a 4-socket NUMA system, with the same range of processors. The CPUs on the Azure instance run 10% faster, but the AWS instance is less expensive.

But here the caveat with AWS restricting bandwidth to 5Gbps when downloading from object storage becomes more of a showstopper. Although the instance is equipped with a 25Gbps network card and could theoretically download a one terabyte dataset from S3 in 5 minutes, in fact it will take half an hour.
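
The gap between five minutes and half an hour follows from simple arithmetic. The sketch below computes ideal transfer times only; it assumes decimal units (1TB = 1000GB) and ignores protocol overhead, so real downloads will be somewhat slower.

```python
def transfer_seconds(dataset_gb: float, throughput_gbps: float) -> float:
    """Ideal transfer time: gigabytes to gigabits (x8), divided by Gbit/s."""
    return dataset_gb * 8 / throughput_gbps


one_tb_in_gb = 1000  # assumption: decimal units

minutes_at_line_rate = transfer_seconds(one_tb_in_gb, 25) / 60  # 25Gbps card
minutes_at_s3_limit = transfer_seconds(one_tb_in_gb, 5) / 60    # 5Gbps S3 limit

print(f"at 25Gbps: {minutes_at_line_rate:.1f} minutes")
print(f"at 5Gbps:  {minutes_at_s3_limit:.1f} minutes")
```

That gives a little over 5 minutes at the full speed of the network card, against nearly 27 minutes at the 5Gbps limit.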

To summarise, AWS is better suited for always-on in-memory applications, attached to a dedicated premium storage, while Azure can also operate truly on-demand workloads.

For the time being Google does not have any offering in this range or beyond.

Very Large (4TB and more)

Example use case: CVA analytics for investment banks.

Azure and AWS have just opened this new range of servers with 4TB of RAM and more. That’s currently the max you can get in the public cloud. But AWS have actually hinted at upcoming instances with up to 16TB of memory, so the trend isn’t about to stop!

  • AWS x1e.32xlarge: 3.9TB RAM, Intel Xeon E7 8800 v3 (4×16 cores), $26.688/hour (pay as you go)
  • Azure M128ms: 3.8TB RAM, Intel Xeon E7 8890 v3 (4×16 cores), $35.82/hour (pay as you go)

Those 4TB instances are the same hardware as the 2TB instances, with more memory chips. The comparative analysis is the same. Based on the experience of our customers who operate large ActiveViam solutions, the number of cores is a bit low to process 4TB at interactive speeds, especially in the presence of complex calculations. Those platforms will really benefit from an upgrade to processors with more cores.
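
One rough way to see the imbalance is to compute how much memory each core has to serve on the AWS instances listed in this article. This is only a back-of-the-envelope sketch: it treats 3.9TB as 3900GB and says nothing about actual query performance.

```python
# (RAM in GB, number of cores), as listed in this article
aws_instances = {
    "R4.4xlarge": (122, 8),
    "R4.16xlarge": (488, 32),
    "x1.32xlarge": (1950, 64),
    "x1e.32xlarge": (3900, 64),  # assumption: 3.9TB taken as 3900GB
}

gb_per_core = {
    name: ram_gb / cores for name, (ram_gb, cores) in aws_instances.items()
}

for name, value in gb_per_core.items():
    print(f"{name}: {value:.1f} GB of RAM per core")
```

Each core on the 4TB instance has about four times as much memory to scan as a core on the small and medium instances.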

Links

Interested in the topic of operating in-memory solutions on demand? Check out this previous story where we launch a 4000-core, 60TB RAM cluster and fill it with data, all in less than 30 minutes… Or discover how the most resource-intensive analytics in market risk can now be delivered at interactive speeds on one single server in the cloud.
