Category: Servers

Arm HPC performance results: Bristol/GW4 Isambard (Applications porting and performance)

Posted by – February 1, 2018

First public disclosure of Isambard performance results by Prof Simon McIntosh-Smith, University of Bristol, UK

Isambard is a new Tier 2 HPC service from GW4 built using:
Cray “Scout” system – XC50 series
• Aries interconnect
• 10,000+ Armv8 cores
• Cavium ThunderX2 processors
• 2x 32 cores at above 2 GHz per node

You can also watch my interview with Simon McIntosh-Smith about Isambard at the Cray booth at SC17 here.

Filmed at the Arm HPC User Group at SC17 in Denver.


Red Hat Support for Arm HPC deployments

Posted by – February 1, 2018

OpenHPC and the CentOS Build System
Red Hat Enterprise Linux (RHEL) 7.4 for ARM provides feature parity at the OS Layer with x86 and enables development of new solutions in existing (Telco, FSI, HPC) and emerging (ML, IoT) segments

Filmed at the Arm HPC User Group at SC17 in Denver.

SUSE Support for Arm HPC deployments

Posted by – February 1, 2018

SUSE and the High Performance Computing Ecosystem
SUSE Linux with the HPC Module for Supported HPC packages
Partnerships with HPE, Arm, Cavium, Cray, Intel, Microsoft, Dell, Qualcomm, and others
OpenHPC Community
Half of the Top 100 systems are running SUSE Linux
At 13.8%, SUSE Linux is the leader in paid Linux on the Top 500
SUSE Linux Enterprise Server ARM Roadmap
Offering commercial Linux support for ARM AArch64 since November 2016

Filmed at the Arm HPC User Group at SC17 in Denver.


Arm HPC Ecosystem update, Arm Supercomputing at SC17

Posted by – February 1, 2018

Chris Goodyer
Overview of Arm
• HPC engagements
Arm partner information
• Latest deployment information
Arm Software Ecosystem
• Software stack enabling
• Arm’s priorities on libraries and applications

Filmed at the Arm HPC User Group at SC17 in Denver.


Shadow Blade cloud PC Xeon/GTX1080 for cloud gaming, cloud 4K60/8K video-editing

Posted by – January 25, 2018

French startup Blade presents Shadow, their cloud PC service at €30/month. It streams a very powerful desktop PC (equivalent to about $2000 of hardware) hosted on their servers: a high-end Intel Xeon server CPU with 8 threads, an Nvidia GTX1080 GPU, 12GB RAM and a 256GB SSD (hard drive/SSD storage expansion options are available), running a full Windows 10 Pro desktop remotely. The low-lag Internet technologies and fast codecs that they have developed make this possible; at least 15 Mbit/s of Internet bandwidth is recommended for a good experience. The service is aimed at cloud gaming, high-end video editing, 3D graphics rendering, audio processing, or anything else that benefits from advanced PC hardware. Client applications run on the AMD APU based Shadow thin client that they offer to their subscribers (for a smooth gaming experience at up to 4K60 or 1080p at 144Hz), or on a Chromebook, any Android phone, Android TV, MacBook, Windows machine, Linux, iPhone or iPad.
The service currently works well in France. Initially it was only for French users with fiber-to-the-home connections, but it now also runs smoothly over ADSL, cable and even LTE connections in France, and it is supported in Belgium and a few other nearby countries. For a good experience, the user has to be as few hops as possible from the servers on the Internet backbone, to keep lag as low as possible. Professional gamers who have tested the system report that they cannot feel any difference between the Shadow cloud gaming service and a local desktop gaming machine. The lag is said to depend more on the speed of the PC monitor than on the Internet round trip to the cloud servers.
They are about to expand their offering to cover the whole of California, as they are setting up a cloud server system in Silicon Valley right now. They plan to expand their services globally in the near future according to demand.

Atos Bull Sequana X1310 on Cavium ThunderX2 Dibona Supercomputer

Posted by – December 4, 2017

The Mont-Blanc European exascale supercomputing project, based on power-efficient ARM technology, is using the Cavium ThunderX2 ARM server processor to power its new High Performance Computing (HPC) prototype, along with an HPC software infrastructure for ARM that includes tools, code stacks and libraries. The ambition of the Mont-Blanc project is to define the architecture of an Exascale-class compute node based on the ARM architecture and capable of being manufactured at industrial scale. The Mont-Blanc 3 system is being built by a consortium which includes Atos, ARM, AVL (an Austrian powertrain developer) and seven academic institutions, including the Barcelona Supercomputing Center (BSC). It implements ARM for HPC with high memory bandwidth and a high core count on Cavium’s custom ARMv8 core architecture, which has out-of-order execution and can run at 3 GHz. The ThunderX2 may deliver twice the integer and floating point performance of the ThunderX1, along with twice the memory bandwidth.

Filmed in 4K60 at Supercomputing 2017 in Denver using a Panasonic GH5 ($1999) on firmware 2.1 (aperture priority, AF continuous tracking) with a Leica 12mm f1.4 lens ($1297) and a Sennheiser MKE440 stereo shotgun microphone ($325).

Keynote: Aaron Welch of Packet at Linaro Connect San Francisco 2017

Posted by – November 30, 2017

Keynote: Imagine The Internet in Ten Years – Aaron Welch (Packet)

Think back to the summer of 2007. AWS was a few months old, the first iPhone had just been released, and Uber was still two years away from its founding.

Now look the other way: ten years into the future. A future standing on the shoulders of today’s nearly 20 million software developers (which may double in the next five years), a mature ecosystem of venture funded firms around the world, and dozens of major companies dumping massive resources into everything from new data centers, cloud services, VR, 5G, robotics, autonomy, space travel, and a huge variety of software of all stripes and flavors.

Aaron Welch, co-founder and SVP of Product at Packet (the leading bare metal cloud for developers), outlines Packet’s vision for the infrastructure of tomorrow, and why hardware is the next innovation layer.

Aaron Welch, SVP of Product, Packet Hosting Inc

Posted by – November 30, 2017

Interview with Aaron Welch, SVP of Product at Packet, about what he said in his keynote, about the ARM Servers which they are providing as bare metal hosting, and about what he thinks the internet will be like in the next ten years, probably powered by ARM Servers which they will provide.

@vielmetti talks ARM Servers at and @worksonarm

Posted by – November 30, 2017

Interview with Ed Vielmetti, Special Projects Director at Packet, talking about their available and upcoming ARMv8 servers in the data center and the ARM Server ecosystem that is advancing at a rapid pace. Ed Vielmetti posts news and software updates as @vielmetti and @worksonarm.

Dell EMC Supercomputing, Machine and Deep Learning for Enterprise

Posted by – November 28, 2017

Dell EMC shows some of their latest machine and deep learning products for the Enterprise market, enabling enterprises to address opportunities in areas such as fraud detection, image processing, financial investment analysis, personalized medicine and more. The new Dell EMC PowerEdge C4140, part of the Machine Learning and Deep Learning Ready Bundle, is an accelerator-based platform for demanding cognitive workloads. It is powered by latest generation NVIDIA V100 GPU accelerators with PCIe and NVLink high-speed interconnect technology and two Intel Xeon Scalable Processors, to bring high performance computing (HPC) and data analytics capabilities to mainstream enterprises worldwide.

Dell EMC systems power some of the fastest supercomputers in the world. One is “Stampede2”, built for The Texas Advanced Computing Center (TACC) at The University of Texas at Austin, with Intel Xeon Phi 7250 processors across 4,200 nodes connected with Intel Omni-Path Fabric. It was developed in collaboration with Dell EMC, Intel and Seagate and ranks No. 12 on the TOP500 list of the most powerful computer systems worldwide. Simon Fraser University’s “Cedar” supercomputer was built for big data, including artificial intelligence, with 146 Dell EMC PowerEdge C4130 servers with NVIDIA Tesla P100 GPUs. Cedar, Canada’s most powerful academic supercomputer, ranks No. 94 on the TOP500 and No. 13 on the Green500, helping researchers chart new territory in several areas, such as studying the continually changing DNA code in bacteria.

Filmed in 4K60 at Supercomputing 2017 in Denver.

Red Hat Enterprise Linux 7.4 available for ARM servers

Posted by – November 20, 2017

Red Hat Enterprise Linux is now fully supported on ARM server-optimized SoCs designed for cloud and hyperscale, telco and edge computing, as well as high-performance computing. Supported SoCs include the Cavium ThunderX2 and the Qualcomm Centriq 2400, with OEM partners such as HPE for the Apollo 70. This is the culmination of a multi-year collaboration with silicon and hardware partners and the upstream community. Over the past 7 years, Red Hat has helped to drive open standards and develop communities of customers, partners and a broad ecosystem. Red Hat’s goal was to develop a single operating platform across multiple 64-bit ARMv8-A server-class SoCs from various suppliers, using the same sources to build user functionality and a consistent feature set, so that customers can deploy across a range of server implementations while maintaining application compatibility.

Filmed in 4K60 at Supercomputing 2017 in Denver.

Fujitsu Post-K ARM Supercomputer, Exascale by 2021

Posted by – November 20, 2017

Fujitsu is developing a very powerful ARM processor for its Post-K exascale supercomputer, and it is meant to have a much wider impact on the HPC market than just a single system. Riken, Japan’s largest and most prestigious scientific research institute, will be the recipient of the Post-K system. This HPC-optimized ARM processor is being designed in collaboration with ARM and integrates SVE (Scalable Vector Extension), which extends the vector processing capabilities of AArch64 (64-bit) execution in the ARM architecture. SVE lets implementations choose vector lengths that scale from 128 to 2048 bits, and lets advanced vectorizing compilers extract more fine-grain parallelism from existing code for High Performance Scientific Compute, reducing software deployment effort. SVE also supports a vector-length agnostic (VLA) programming model that can adapt to the available vector length. When the Post-K Supercomputer is ready, which may be around 2020-2022, and if it lives up to its near-exascale performance promise, it will be eight times faster than today’s most powerful supercomputer in the world, China’s Sunway TaihuLight. The Post-K system will be used to model climate change, predict disasters, develop drugs and fuels, and run other scientific simulations. The Fujitsu Post-K ARM processors are likely to be 10nm FinFET chips fabricated by TSMC, and will feature high-bandwidth memory and the Tofu 6D interconnect mesh that was developed for the original K Supercomputer.
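
To make the vector-length agnostic idea more concrete, here is a minimal sketch in plain Python. It is only a model of the programming style, not SVE code: the function name `vla_add` and its parameters are hypothetical, and the inner loop stands in for what real hardware would do in a single predicated vector instruction. The point is that the same loop gives the same result whatever vector length the hardware provides.

```python
def vla_add(a, b, vector_bits):
    """Add two lists of 64-bit values the way a vector-length agnostic loop would.

    vector_bits is the hardware vector length (an assumption passed in by hand
    here; real code would discover it at run time).
    """
    if vector_bits % 128 != 0 or not 128 <= vector_bits <= 2048:
        raise ValueError("vector length must be a multiple of 128 bits, from 128 to 2048")
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")

    lanes = vector_bits // 64          # number of 64-bit elements per vector
    n = len(a)
    out = [0.0] * n
    i = 0
    while i < n:
        # Model of a predicate: only the lanes that still hold data are active,
        # so no separate scalar loop is needed for the tail.
        active = min(lanes, n - i)
        for lane in range(active):
            out[i + lane] = a[i + lane] + b[i + lane]
        i += lanes                     # advance by one full vector, whatever its length
    return out


data_a = [1.0, 2.0, 3.0, 4.0, 5.0]
data_b = [10.0, 10.0, 10.0, 10.0, 10.0]
narrow = vla_add(data_a, data_b, 128)    # 2 lanes per step
wide = vla_add(data_a, data_b, 2048)     # 32 lanes per step
print(narrow == wide)
```

Nothing in the loop hard-codes the number of lanes, which is why code written this way would not need to be rewritten for a chip with wider vectors.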

Filmed in 4K60 at Supercomputing 2017 in Denver.

Cray ARM Supercomputer with Cavium ThunderX2 in GW4 Isambard with Simon McIntosh-Smith

Posted by – November 19, 2017

Cray announces the world’s first production-ready ARM Powered supercomputer, based on the Cavium ThunderX2 64-bit ARMv8-A processor. The ThunderX2 is being added to the Cray XC50 supercomputer, which Cray describes as enabling the world’s most flexible supercomputers, and it is to be available in both liquid-cooled and air-cooled cabinets in the second quarter of 2018. The system features a full software environment, including the Cray Linux Environment, the Cray Programming Environment, ARM-optimized compilers, ARM libraries and tools for running today’s supercomputing workloads, together with the Cray Aries interconnect. ARM’s upcoming SVE technology is presented as the most efficient path to achieving the vision of exascale. Cray’s enhanced compilers and programming environment get more performance out of the Cavium ThunderX2 processors, up to 20 percent more than other publicly available ARMv8 compilers such as LLVM and GNU.

Cray is currently working with multiple supercomputing centers on the development of ARM-based supercomputing systems, including various labs in the United States Department of Energy and the GW4 alliance, a coalition of four leading, research-intensive universities in the UK. Through an alliance with Cray and the Met Office in the UK, GW4 is designing and building “Isambard,” an Arm-based Cray XC50 supercomputer. The GW4 Isambard project aims to deliver the world’s first Arm-based, production-quality HPC service. My video includes an interview with Professor Simon McIntosh-Smith from the University of Bristol, who says that ease of use, robustness and performance are all critical for a production service, and that their early experiences with Cray’s ThunderX2 systems and end-to-end ARM software environment are very promising. All of the real scientific codes they’ve tried so far have worked out of the box, and they’re also seeing performance competitive with the best in class. Having access to Cray’s optimized HPC software stack of compilers and libraries in addition to all of the open-source tools has been a real advantage.

Filmed in 4K60 at Supercomputing 2017 in Denver.

Cavium ThunderX2 production systems now available

Posted by – November 18, 2017

Cavium announces that ThunderX2 ARM Server systems are now available for customers in the server and high performance supercomputing markets. Partners include Bull/Atos, Cray, Gigabyte, Penguin, Ingrasys/Foxconn and HPE. After 7 years of work by partners in the ARM Server ecosystem (and 7 years of my ARM Server video-blogging), now is finally the time that high performance ARM Server systems are launched for the cloud computing and high performance computing markets worldwide. The Cavium ThunderX2 server SoC integrates fully out-of-order, high-performance custom cores supporting single and dual-socket configurations. ThunderX2 is optimized to drive high computational performance, delivering outstanding memory bandwidth and memory capacity. The new line of ThunderX2 processors includes multiple SKUs for both scale up and scale out applications and is fully compliant with Armv8-A architecture specifications as well as the Arm Server Base System Architecture and Arm Server Base Boot Requirements standards.

The ThunderX2 SoC family is supported by a comprehensive software ecosystem ranging from platform level systems management and firmware to commercial Operating Systems, Development Environments and Applications. Cavium has actively engaged in server industry standards groups such as UEFI and delivered numerous reference platforms to a broad array of community and corporate partners. Cavium has also demonstrated its leadership role in the Open Source software community by driving upstream kernel enablement and toolchain optimization, actively contributing to Linaro’s Enterprise and Networking Groups, investing in key Linux Foundation projects such as DPDK, OpenHPC, OPNFV and Xen, and sponsoring the FreeBSD Foundation’s Armv8 server implementation.

Filmed in 4K60 at Supercomputing 2017 in Denver.

HPE unveils The Machine, Apollo 70, Cavium ThunderX2 ARM HPC Supercomputing platforms

Posted by – November 17, 2017

HP Enterprise unveils their HPC-optimized, Cavium ThunderX2 ARM Powered High Performance Computing platforms. The Apollo 70 brings disruptive ARM HPC processor technology with maximum memory bandwidth, familiar management and performance tools, and the density and scalability required for large HPC cluster deployments. HPE Labs also unveils The Machine, which is likewise powered by a Cavium ThunderX2. It is HPE’s vision for the future of computing: by 2020, one hundred billion connected devices will generate far more demand for computing than today’s infrastructure can accommodate.

The Machine is a custom-built device made for the era of big data. HPE says it has created the world’s largest single-memory computer. The R&D program is the largest in the history of HPE, the former enterprise division of HP that split apart from the consumer-focused division. If the project works, it could be transformative for society. But it is no small effort, as it could require a whole new kind of software. HPE’s prototype can accommodate up to 160 terabytes of memory, capable of simultaneously working with the data held in every book in the Library of Congress five times over, or approximately 160 million books. According to HPE, it has never been possible to hold and manipulate whole data sets of this size in a single-memory system, and this is just a glimpse of the immense potential of Memory-Driven Computing. Following the Gen-Z Consortium’s vision, and based on the current prototype, HPE expects the architecture can scale to an exabyte-scale single-memory system and, beyond that, to a nearly limitless pool of memory: 4,096 yottabytes. For context, that is 250,000 times the entire digital universe today. With that amount of memory, HPE said it will be possible to simultaneously work with every digital health record of every person on earth, every piece of data from Facebook, every trip of Google’s autonomous vehicles, and every data set from space exploration all at the same time, getting to answers and uncovering new opportunities at unprecedented speeds.
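
As a rough sanity check on those figures, here is a short back-of-the-envelope calculation in Python. It assumes decimal units (1 terabyte = 10^12 bytes, 1 yottabyte = 10^24 bytes), which the source does not specify, and all variable names are just for illustration. It shows what the quoted numbers imply: about one megabyte per book, and a "digital universe" of roughly 16 zettabytes.

```python
# Decimal (SI) unit sizes in bytes; an assumption, the source does not say
# whether decimal or binary units are meant.
TB = 10**12
ZB = 10**21
YB = 10**24

# 160 terabytes holding roughly 160 million books
prototype_bytes = 160 * TB
books = 160_000_000
bytes_per_book = prototype_bytes // books

# 4,096 yottabytes described as 250,000 times the digital universe
max_pool_bytes = 4096 * YB
digital_universe_bytes = max_pool_bytes // 250_000
digital_universe_zb = digital_universe_bytes / ZB

print(f"Implied size per book: {bytes_per_book:,} bytes")
print(f"Implied digital universe: {digital_universe_zb:.3f} zettabytes")
```

These are only the values implied by the quoted ratios, not independent measurements.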

Filmed in 4K60 at Supercomputing 2017 in Denver.

Wu Feng talks Supercomputing Green500 List

Posted by – November 16, 2017

Wu Feng, co-founder of the Green500, talks at the Supercomputing 2017 conference in Denver about the challenges of reaching exascale through energy-efficient supercomputing with massively parallel processing. Wu Feng is a Professor and Turner Fellow of Computer Science with additional appointments in Electrical & Computer Engineering, Health Sciences, and Biomedical Engineering and Mechanics at Virginia Tech (VT). At VT, he directs the Synergy Laboratory, which conducts research at the synergistic intersection of systems software, middleware, and application software; of particular note is his high-performance computing (HPC) research in the areas of green supercomputing, accelerator-based parallel computing, and bioinformatics. Prior to joining VT, he spent seven years at Los Alamos National Laboratory, where he began his journey in green supercomputing in 2001 with Green Destiny, a 240-node supercomputer that fit in 5 square feet and consumed only 3.2 kW of power when booted diskless. This work ultimately created the impetus for the Green500.

Filmed in 4K60 at Supercomputing 2017 in Denver.

David Abdurachmanov of CERN talks High Energy Physics on ARMv8 64bit Servers at Linaro Connect

Posted by – November 10, 2017

Around the year 2000, the convergence on Linux and commodity x86_64 processors provided a homogeneous scientific computing platform which enabled the construction of the Worldwide LHC Computing Grid (WLCG) for LHC data processing. This allowed the High Energy Physics (HEP) community to use a homogeneous software model utilizing the x86_64 architecture. In 2013, LHC experiments at CERN, in particular ATLAS and CMS, started investigating the ARMv8 64-bit (AArch64) architecture for HEP needs. The LHC community faces a great challenge regarding its computing needs over the next 10 years and has started exploring public clouds, volunteer computing (e.g., LHC@home) and HPC facilities to increase peak computation capacity. This talk covers the future computation needs of the LHC experiments (on a timeline of 10 years) and the more recent progress made by the ATLAS, CernVM and CMS teams on using ARMv8 64-bit/AArch64.

You can watch the keynote by David Abdurachmanov and Jakob Blomer here:

David Rusling, Linaro CTO talks Treble, Servers, HPC, Tiny Linux IoTL, Automotive, Machine Learning

Posted by – October 10, 2017

David Rusling says this has been the best Linaro Connect for him thus far in the 7 years since Linaro was started. He talks about how Google recognizes the part Linaro can play in helping with Project Treble and with longer-term support for each LTS kernel release, also as part of the Linaro Mobile Group. The Linaro Enterprise Day showed how far Linaro has come, with all the work coming together towards ARM Servers taking market share in the server market. Kanta Vekaria works on Linaro’s involvement with High Performance Computing (HPC), as she talked about in her keynote. Nicolas Pitre is working on the Internet of Tiny Linux (IoTL) to make Linux suitable for IoT (you can see his talk here), which involves persuading the kernel developers to accept changes that benefit the embedded market. Linaro is very active with Zephyr, which is kind of the Linux Kernel of the embedded world, working on it in the Linaro IoT & Embedded Group (LITE). He also talks about the establishment of Open Source Foundries, a spin-off of Linaro that can pursue business opportunities to work more closely with customers who need help implementing open source on ARM solutions, such as the IoT solutions shown in this video. Linaro is also introducing the Associate Membership Level in the coming months so that smaller members, such as small to medium companies and universities, can join, trying to involve everyone in the open source ecosystem. Linaro is also looking into getting involved with open source for the Automotive market, possibly related to the software needed for self-driving cars and more, and with open source for artificial intelligence and machine learning. You can see my previous videos with David Rusling over the past 5 years here.

Jon Masters, Red Hat Chief ARM Architect at Linaro Connect San Francisco 2017

Posted by – October 10, 2017

Jon Masters says Moore’s Law may have come to an end and that single-threaded performance is not defining the industry anymore because it’s not increasing at the same rate that it used to. What is defining the future of the industry is machine learning, accelerators, and lots of additional workload optimization that is happening outside of the core. Thus he believes ARM has an opportunity to get into the mainstream server space in the next 12-18 months with the newest powerful ARM Server solutions such as the Cavium ThunderX2 and the Qualcomm Centriq 2400. You can see some of my previous Jon Masters interviews over the past 5 years here.

Kurt Keville of The MIT Supply Response Supercomputing Lab at Linaro Connect San Francisco 2017

The MIT Supply Response Supercomputing Lab has been investigating opportunities to get cycles when they are cheapest, either through an innovative sensor system that utilizes a hyperlocal weather monitoring application that watches clouds, or through a clever scraping of PUC utility websites to ramp compute resources up when electricity is inexpensive. They are currently testing a number of projects that are based around ARM and utilize every bit of the energy-aware programmability of big.LITTLE and Slurm Workload Manager.
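
The idea of ramping compute up when electricity is cheap can be sketched in a few lines of Python. This is purely an illustration and not the lab's actual method: the function name `nodes_to_run`, the price thresholds and the linear ramp are all assumptions made up for this example, and prices are given as whole cents per kWh to keep the arithmetic exact.

```python
def nodes_to_run(price_cents, cheap_cents, expensive_cents, max_nodes):
    """Decide how many nodes to power up for the current electricity price.

    Hypothetical policy: run every node at or below cheap_cents, run none at
    or above expensive_cents, and ramp linearly (rounding down) in between.
    """
    if cheap_cents >= expensive_cents:
        raise ValueError("cheap_cents must be lower than expensive_cents")
    if price_cents <= cheap_cents:
        return max_nodes
    if price_cents >= expensive_cents:
        return 0
    headroom = expensive_cents - price_cents
    return headroom * max_nodes // (expensive_cents - cheap_cents)


# Example: an 8-node cluster, full speed at 10 cents/kWh, idle at 20 cents/kWh
for price in (8, 15, 19, 25):
    print(price, "cents/kWh ->", nodes_to_run(price, 10, 20, 8), "nodes")
```

In a real deployment the resulting node count would be handed to whatever mechanism the cluster uses to wake or suspend nodes; that part is deliberately left out here.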

#DIV/0! is their Solar-Powered Supercomputing cluster. It is named for the error they got in Excel when they tried to calculate their performance per dollar.

They maintain the Debian ports of every HPC code they can get their hands on (please send some along if you have additions).

IoTNet is the network in Boston and Cambridge which only handles IoT comms. It is low bandwidth, high latency and lossy which they are hoping will keep humans, with their real-time protocols, off. Machines and CPS like it because it is asynchronous, asymmetric and low power. If you have a key dongle for your car you are probably already using the TTN in your city.

Interested parties can contact them at

Micro-Datacenter Design Challenge (past)