New Video: Focus on your machine-learning app, not FPGA programming with the Xilinx Acceleration Stack

2017年1月10日 | By News | Filed in: News.


Last November at SC16 in Salt Lake City, Xilinx Distinguished Engineer Ashish Sirasao gave a 10-minute talk on deploying deep-learning applications using FPGAs with significant performance/watt benefits. Sirasao started by noting that we’re already knee-deep in machine-learning applications: spam filters; cloud-based and embedded voice-to-text converters; and Amazon’s immensely successful, voice-operated Alexa are all examples of extremely successful machine-learning apps in broad use today. More—many more—will follow. These applications all have steep computing requirements.


There are two phases in any machine-learning application. The first is training and the second is deployment. Training is generally done using floating-point implementations so that application developers need not worry about numeric precision. Training is a 1-time event so energy efficiency isn’t all that critical.


Deployment is another matter however.


Putting a trained deep-learning application in a small appliance like Amazon’s Alexa calls for attention to factors such as energy efficiency. Fortunately, said Sirasao, the arithmetic precision of the application can change from training to mass deployment and there are significant energy-consumption gains to be had by deploying fixed-point machine-learning applications. According to Sirasao, you can get accurate machine inference using 8- or 16-bit fixed-point implementations while realizing a 10x gain in energy efficiency for the computing hardware and a 4x gain in memory energy efficiency.


The Xilinx DSP48E2 block implemented in the company’s UltraScale and UltraScale+ devices is especially useful for these machine-learning deployments because its DSP architecture can perform two independent 8-bit operations per clock per DSP block. That translates into nearly double the compute performance, which in turn results in much better energy efficiency. There’s a Xilinx White Paper on this topic titled “Deep Learning with INT8 Optimization on Xilinx Devices.”


Further, Xilinx recently announced its Acceleration Stack for machine-learning (and other cloud-based applications), which allows you to focus on developing your application rather than FPGA programming. You can learn about the Xilinx Acceleration Stack here


Finally, here’s the 10-minute video with Sirasao’s SC16 talk:






via Xcell Daily Blog articles

January 9, 2017 at 10:25PM


您的电子邮箱地址不会被公开。 必填项已用*标注