An anonymous reader quotes a report from IEEE Spectrum: Deep learning has a DRAM problem. Systems designed to do difficult things in real time, such as telling a cat from a kid in a car’s backup camera video stream, are continuously shuttling the data that makes up the neural network’s guts from memory to the processor. The problem, according to startup Flex Logix, isn’t a lack of storage for that data; it’s a lack of bandwidth between the processor and memory. Some systems need four or even eight DRAM chips to sling hundreds of gigabits per second to the processor, which adds a lot of space and consumes considerable power. Flex Logix says that the interconnect technology and tile-based architecture it developed for reconfigurable chips will lead to AI systems that need the bandwidth of only a single DRAM chip and consume one-tenth the power.

Mountain View-based Flex Logix had started to commercialize a new architecture for embedded field-programmable gate arrays (eFPGAs). But after some exploration, one of the founders, Cheng C. Wang, realized the technology could also speed up neural networks. A neural network is made up of connections and “weights” that denote how strong those connections are. A good AI chip needs two things, explains the other founder, Geoff Tate. One is a lot of circuits that do the critical “inferencing” computation, called multiply and accumulate. “But what’s even harder is that you have to be very good at bringing in all these weights, so that the multipliers always have the data they need in order to do the math that’s required. [Wang] realized that the technology that we have in the interconnect of our FPGA, he could adapt to make an architecture that was extremely good at loading weights rapidly and efficiently, giving high performance and low power.”
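To make the bandwidth issue concrete, here is a minimal sketch (not Flex Logix’s design) of the multiply-and-accumulate step at the heart of inference. Every output of a fully connected layer is a dot product of the inputs with a row of weights, so the weight matrix must be streamed from memory for each layer; the function name and shapes are illustrative only.

```python
def dense_layer(inputs, weights, biases):
    """One fully connected layer computed via multiply-accumulate (MAC).

    inputs:  list of input activations
    weights: list of weight rows, one row per output neuron
    biases:  one bias per output neuron
    """
    outputs = []
    for row, bias in zip(weights, biases):
        acc = bias
        for x, w in zip(inputs, row):
            acc += x * w  # one MAC operation; the weight w came from DRAM
        outputs.append(acc)
    return outputs

# A 2-output layer over a 3-element input: 6 weights are fetched
# for just 6 multiply-accumulates, so keeping the multipliers fed
# is dominated by weight traffic, not arithmetic.
result = dense_layer([1.0, 2.0, 3.0],
                     [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
                     [0.0, 1.0])
print(result)
```

In a real accelerator the point of the quote is exactly this ratio: each weight is used briefly, so the memory system must deliver weights at least as fast as the MAC units consume them.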


Source:: Slashdot