Last year we saw the announcement of Cadence’s Tensilica Q6 DSP IP, which promised a new architecture that integrates vision DSP workloads with newly optimised machine learning inferencing workloads. The addition of “AI” capabilities to existing DSP architectures bridges the gap between existing IP blocks such as CPUs or GPUs and more specialised dedicated inferencing IP blocks such as Cadence’s own Tensilica DNA100 block.
Today’s announcement is an evolution of last year’s Q6, further progressing the capabilities we saw introduced in the new architecture and enabling more performance, better density and better power efficiency.
Over the next few years Cadence sees significant growth opportunity for the vision DSP market, with the overall image sensor market growing at a rate of ~12% CAGR until 2025. Naturally these image sensors will need corresponding image processing power behind them in order to transform the raw image data into something meaningful. The automotive sector in particular is projected to expand enormously in this regard, with a steady annual growth rate of 36%, due to the projected need for dozens of sensors in future vehicles.
However, the growth isn’t driven only by the automotive sector. The mobile and smartphone sector is still projected to be the largest market, and here the growth opportunity comes from the new trend of using more and more camera modules in smartphones, something that has become exceedingly evident over the last year in particular. Other markets of opportunity are AR/VR headsets, which are also projected to require a large number of cameras that will need image processing.
Here’s where the new Tensilica Q7 DSP comes into play. The IP is relatively straightforward in what it brings compared to its predecessor, and that could be summed up as a 2x increase in its performance capabilities.
The new architecture adds new ISA instructions for better acceleration of SLAM (Simultaneous Localization and Mapping), which is a cornerstone of new AR applications such as Google Lens.
An important aspect of the new IP is that it’s fully backwards compatible with existing P6 and Q6 software, which means that vendors who have invested in software don’t need to rewrite their algorithms from scratch in order to take advantage of the new performance boosts.
Alongside other improvements in the architecture’s iDMA, such as improved bandwidth enabled through microarchitectural changes and data compression, Cadence put emphasis on ISO 26262 requirements, which dictate functional safety standards for road vehicles and are a must-have if the IP is to be employed in the automotive sector.
As mentioned, Cadence has doubled up on the processing units compared to the Q6, resulting in a new 512 8-bit MAC engine as well as doubled floating point capabilities. Cadence quotes a peak performance of 1.82 TOPS in 8-bit operations, which would imply a frequency of around 1.77GHz, a natural progression from the peak 1.5GHz we were presented last year with the Q6.
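The frequency figure can be checked with a quick back-of-the-envelope calculation. The sketch below assumes the common convention that one MAC counts as two operations (a multiply and an accumulate); Cadence's exact counting method isn't stated here, and the function names are purely illustrative. The Q6 figure also assumes 256 MACs, inferred from the "doubled" claim above.

```python
# Assumption: one MAC is counted as two operations (multiply + accumulate).
# This is a common convention, not something confirmed by Cadence's figures.
OPS_PER_MAC = 2


def implied_clock_ghz(peak_tops: float, macs_per_cycle: int) -> float:
    """Clock frequency (GHz) implied by a peak TOPS figure and MAC count."""
    ops_per_cycle = macs_per_cycle * OPS_PER_MAC
    return peak_tops * 1e12 / ops_per_cycle / 1e9


def peak_tops(clock_ghz: float, macs_per_cycle: int) -> float:
    """Peak throughput (TOPS) at a given clock and MAC count."""
    return clock_ghz * 1e9 * macs_per_cycle * OPS_PER_MAC / 1e12


# Q7: 512 8-bit MACs at a quoted 1.82 TOPS.
q7_clock = implied_clock_ghz(1.82, 512)

# Q6: half the MACs (256, inferred) at the 1.5GHz peak quoted last year.
q6_tops = peak_tops(1.5, 256)

print(f"Q7 implied clock: {q7_clock:.3f} GHz")
print(f"Q6 peak at 1.5GHz: {q6_tops:.3f} TOPS")
```

Under these assumptions the quoted 1.82 TOPS works out to a clock just under 1.78GHz, in line with the figure above.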
One of the most interesting aspects of the new IP is how Cadence was able to achieve all this, and what it means for the area and power efficiency of the block. In fact, Cadence doesn’t expect the new generation to be any bigger than the Q6, and the increase in performance and introduction of more execution units came at little to no area cost. Cadence was able to optimise the microarchitecture such that the new Q7 promises a doubling of GMAC and GFLOPS per mm², which is quite the feat for any IP vendor. Power efficiency gains are also in line with the performance gains, and the company expects a similar 1.7x increase in perf/W.
Cadence envisions customers being able to lay out multiple Q7 blocks alongside each other for performance scaling, and of course the IP would also be a great fit alongside the DNA 100 neural…