
CacheQ Systems has added GPU support to its QCC Acceleration Platform, a heterogeneous compute development environment that aims to deliver better performance and shorter development times on computer architectures including multicore CPUs, GPUs, and field-programmable gate arrays (FPGAs).
GPU adoption has grown quickly over the past five years, and the $25 billion sector is expected to grow at a CAGR of almost 33 percent through 2028.
Hardware companies and the open-source community have provided software tools for heterogeneous computing systems such as multicore CPUs, GPUs, and FPGAs coupled to those processors. Historically, these tools have relied on programmers to express parallelism explicitly for the compiler, using hardware-specific APIs such as CUDA from NVIDIA, HIP from AMD, and oneAPI from Intel.
Other efforts, including OpenACC, OpenMP, and OpenCL, aim for portability, largely through pragmas embedded in C, C++, and Fortran code. All of these approaches require in-depth knowledge of the target hardware to achieve good performance and correct behavior on parallel compute units. The work includes managing memory copies and synchronization events, forming teams of threads, manually eliminating loop-carried dependencies and race conditions, and adding reductions such as summations.
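To make the pragma-based approach concrete, here is a minimal sketch in C of a summation written twice: once as ordinary serial code and once with a standard OpenMP reduction pragma. The function names are hypothetical and chosen for illustration; nothing here is taken from CacheQ material.

```c
#include <assert.h>
#include <stddef.h>

/* Serial dot product. The running sum means each iteration
   depends on the result of the previous one. */
double dot_serial(const double *a, const double *b, int n) {
    assert(a != NULL && b != NULL && n >= 0);
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Same loop with an OpenMP pragma. The programmer must state that
   `sum` is a reduction variable; without the reduction clause,
   threads would race on `sum`. If the compiler is run without
   OpenMP enabled, the pragma is ignored and the loop runs serially. */
double dot_openmp(const double *a, const double *b, int n) {
    assert(a != NULL && b != NULL && n >= 0);
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

Even in this small case the developer has to recognize the reduction and annotate it by hand, which is the kind of knowledge the paragraph above describes. With GCC or Clang, OpenMP is typically enabled with the `-fopenmp` flag.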
According to CacheQ, QCC is the first compiler platform to extract parallelism automatically from ordinary C, C++, and Fortran code, without the developer having to tell the compiler explicitly where parallelism is wanted. The company says QCC automatically accelerates programs on a range of hardware, outperforms pragma-based methods, and can compete with hand-coded API solutions while requiring little to no hardware expertise. As a result, a developer can write generic code that targets high-performance hardware at compile time without rewriting it, or can restructure it in a way that is easy to verify functionally and is not specific to the target hardware.
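As an illustration of hardware-neutral restructuring, the sketch below fills an array with evenly spaced values in two ways using plain C with no pragmas or vendor APIs. The function names are hypothetical, and the article does not describe how QCC analyzes loops, so this shows only the general idea of a loop-carried dependency and one way to remove it.

```c
#include <assert.h>
#include <stddef.h>

/* Accumulating form: iteration i needs the value of x left behind
   by iteration i-1, so the loop carries a dependency. */
void ramp_accumulate(double *out, int n, double start, double step) {
    assert(out != NULL && n >= 0);
    double x = start;
    for (int i = 0; i < n; ++i) {
        out[i] = x;
        x += step;
    }
}

/* Indexed form: each element is computed from i alone, so the
   iterations are independent of one another. */
void ramp_indexed(double *out, int n, double start, double step) {
    assert(out != NULL && n >= 0);
    for (int i = 0; i < n; ++i) {
        out[i] = start + i * step;
    }
}
```

Both versions can be checked against each other on any CPU, which keeps the restructured code easy to verify functionally. For step values that are not exactly representable in binary floating point, the two forms may differ in the last bits because the accumulating version rounds at every iteration.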
“Demand for hardware acceleration using GPUs and other heterogeneous compute hardware is growing exponentially,” said Clay Johnson, CEO and co-founder of CacheQ Systems. “Our goal is to simplify high-performance data center and edge-computing application development. The QCC Acceleration Platform meets that goal and will enable new solutions across a variety of applications, including life sciences, financial trading, government, oil and gas exploration and industrial IoT.”
GPUs, x86, Arm, RISC-V, FPGAs
The QCC Acceleration Platform, based on the proprietary CacheQ virtual machine (CQVM), is a heterogeneous compute development environment that converts serial high-level language (HLL) code into a parallel representation in less than 30 seconds, even for the most complicated designs. Before producing a compute executable, it offers code profiling, usage estimates, performance simulation, memory configuration, and partitioning across a range of compute engines, including GPUs, x86, Arm, and RISC-V processors, and FPGAs.
Features include a development environment with standardized drivers, secure containers, and support for many boards from various suppliers. Its design exploration capability provides profiling, performance modeling, and reporting on memory activity. An optimization feature adds code unrolling, user-driven memory configuration, and automated and user-guided partitioning.
The FPGA implementation includes a resource estimator, pre-configured shells, support for several boards and parts, and automation of the implementation tools. The memory implementation provides multi-port/multi-access memory, automated integration, and striping.
The QCC Acceleration Platform is currently shipping in limited quantities, with broad availability expected in late 2023. The 0.18 release works with NVIDIA and AMD GPUs, Xilinx FPGA accelerator boards, and Intel, AMD, Arm, Apple, and RISC-V CPUs. Pricing is available on request.