iPhone is number 1

I’ve talked before about just how amazing Apple’s performance in the cell-phone (and laptop) market is. Last quarter, only two years after entering the cell-phone market, Apple became #1, at least if you measure by how much money they made rather than how many phones they shipped.

Apple shipped just 7.4 million iPhones for $4.5 billion in revenue, yet made more profit on them than Nokia made on the 108.5 million phones it shipped for $10.36 billion. Strategy Analytics estimated (since Apple doesn't disclose the number) that Apple made a profit of $1.6 billion whereas Nokia made only $1.1 billion.

I had a slide from Morgan Stanley at the ICCAD meeting last week that showed that the iPhone is the fastest-ever adoption of a consumer electronics technology. Actually, that's not really fair, since Apple waited until the market already existed, unlike, say, RIM (BlackBerry), which had to create the market for smart-phones. But Apple has executed flawlessly so far.

Of course there is a big patent battle going on, since it is impossible to build a cell-phone without infringing lots of patents held by lots of different parties. Indeed, when I was at VLSI I got involved with patents that we might be infringing. Patents were divided by the industry into essential and non-essential. An essential patent was one that you had to infringe to build a conforming GSM handset. For example, Philips owned a patent on the specific parameters used in the voice compression algorithm GSM adopted, and they wanted, I think, a $1-per-phone license fee. Unsurprisingly, Nokia owns lots of patents on cell-phone technology and so thinks that Apple should pay it a license fee on all those iPhones, as, I'm sure, do dozens of other companies (the iPhone is GSM technology, although a universal iPhone that also supports CDMA is rumored to be coming).

The Motorola Droid came out this week too. I've not used one so I don't have much to say, but one thing it has (free) that the iPhone does not is turn-by-turn directions when driving. I hadn't realized that Google had built up their own turn-by-turn database because they didn't want to pay high royalties for map data. Then they said sayonara to Navteq (acquired by Nokia for $8.1 billion) and Tele Atlas (acquired by TomTom for $2.7 billion). Apple doesn't have its own map database, and if you want turn-by-turn directions on the iPhone then "there's an app for that": TomTom will sell you one for $99.99. It will be interesting to see how this particular little corner of cell-phone space plays out.

Posted in semiconductor | Comments Off

Blogroll

Well, Jim Hogan's and my discussion at ICCAD prompted various feedback in the blogosphere. For those that missed it, my summary of what we said is here. An up-to-date list of all the blog entries we know of is here, and you can download the presentation here (pdf).

Other people's opinions on what we said are here:

Don’t comment here. Go and comment on either my earlier blog entry or on everyone else’s.

Posted in admin, Blogroll | Comments Off

Kauffman Award Dinner

Last week was the EDAC Kauffman Award dinner. One minor advantage of being a blogger is that I got invited along as press. “Will blog for food”. This year’s winner was Professor Randal Bryant, usually just known as Randy Bryant.

I knew of Randy as the inventor of switch-level simulation with a tool called MOSSIM. Up until that point, all simulation of semiconductor circuits had been done using Spice-type algorithms, worrying about the transfer functions of the transistors. But with the coming of Mead and Conway, computer scientists were starting to want a much simpler model of the world so that they could apply programming techniques to design: treat transistors as switches that are either on or off, with a unit delay (all transistors turn on and off at the same speed). MOSSIM, developed in about 1980, was the first of these so-called switch-level simulators. At VLSI we developed a similar tool, called VSIM. Later, the switch model would be enhanced to add timing (and, surprise, ours was called TSIM). Funny now to realize that in the early 1980s IC design was largely done without timing, using Spice for paths that looked like they might be important.
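To make the switch-level model concrete, here is a minimal sketch of the idea in Python. It is my own illustration, not MOSSIM's or VSIM's actual algorithm, and it ignores drive strengths, stored charge and real timing:

```python
# Minimal switch-level sketch: transistors are on/off switches, node values
# are 0, 1 or X. Strengths, charge sharing and delays are deliberately ignored.

VDD, GND = "vdd", "gnd"

class Switch:
    """An NMOS ('n') or PMOS ('p') transistor reduced to a gate-controlled switch."""
    def __init__(self, gate, a, b, kind):
        self.gate, self.a, self.b, self.kind = gate, a, b, kind

    def is_on(self, values):
        g = values.get(self.gate, "X")
        return (self.kind == "n" and g == 1) or (self.kind == "p" and g == 0)

def evaluate(switches, inputs):
    """One unit-delay step: group nodes joined by conducting switches and
    resolve each group from whether it reaches VDD and/or GND."""
    values = dict(inputs, **{VDD: 1, GND: 0})
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    def union(x, y):
        parent[find(x)] = find(y)
    for sw in switches:
        if sw.is_on(values):
            union(sw.a, sw.b)
    all_nodes = set(values) | {n for sw in switches for n in (sw.a, sw.b)}
    groups = {}
    for node in all_nodes:
        groups.setdefault(find(node), set()).add(node)
    out = dict(values)
    for members in groups.values():
        if VDD in members and GND in members:
            val = "X"      # fighting paths to both rails
        elif VDD in members:
            val = 1
        elif GND in members:
            val = 0
        else:
            val = "X"      # floating (a real simulator models stored charge)
        for n in members - {VDD, GND} - set(inputs):
            out[n] = val
    return out

# CMOS inverter: PMOS from VDD to out, NMOS from out to GND, both gated by "in"
inverter = [Switch(gate="in", a=VDD, b="out", kind="p"),
            Switch(gate="in", a="out", b=GND, kind="n")]
print(evaluate(inverter, {"in": 1})["out"])   # -> 0
print(evaluate(inverter, {"in": 0})["out"])   # -> 1
```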

Randy Bryant was also the inventor of BDDs, binary decision diagrams. BDDs are a very efficient representation of combinational logic and are one of the key technologies underlying logic optimization, and hence both synthesis and formal verification. The advantage of BDDs is that despite being a fairly compressed representation of the circuit, many logic operations can be done efficiently directly on the BDD, without needing to expand the representation into something less space-efficient and then recompress it afterwards. They are not good at representing everything: multipliers are notorious for making BDD size explode, but then multipliers are hard to represent compactly, period.
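Here is a toy sketch of the data structure, purely to show why operating directly on the shared graph works so well: hash-consing keeps the graph canonical, and "apply" combines two functions with memoization. This is my own minimal Python, not Bryant's package; real implementations add complement edges, garbage collection and much more.

```python
# Toy reduced ordered BDD: terminals are 0 and 1, every other node is
# (variable, low child, high child), kept canonical via a unique table.

class BDD:
    def __init__(self, var_order):
        self.order = {v: i for i, v in enumerate(var_order)}
        self.unique = {}     # (var, low, high) -> node id
        self.nodes = {}      # node id -> (var, low, high)
        self.next_id = 2     # ids 0 and 1 are the terminal nodes

    def mk(self, var, low, high):
        if low == high:                  # reduction rule: drop redundant tests
            return low
        key = (var, low, high)
        if key not in self.unique:       # reduction rule: share identical subgraphs
            self.unique[key] = self.next_id
            self.nodes[self.next_id] = key
            self.next_id += 1
        return self.unique[key]

    def var(self, v):
        return self.mk(v, 0, 1)

    def apply(self, op, f, g, memo=None):
        """Build the BDD for op(f, g), e.g. op=lambda a, b: a & b."""
        memo = {} if memo is None else memo
        if (f, g) in memo:
            return memo[(f, g)]
        if f in (0, 1) and g in (0, 1):
            result = op(f, g)
        else:
            fv = self.nodes[f][0] if f > 1 else None
            gv = self.nodes[g][0] if g > 1 else None
            # recurse on the top-most variable in the ordering (Shannon expansion)
            if gv is None or (fv is not None and self.order[fv] <= self.order[gv]):
                v = fv
            else:
                v = gv
            f0, f1 = self.cofactors(f, v)
            g0, g1 = self.cofactors(g, v)
            result = self.mk(v, self.apply(op, f0, g0, memo),
                                self.apply(op, f1, g1, memo))
        memo[(f, g)] = result
        return result

    def cofactors(self, f, v):
        if f > 1 and self.nodes[f][0] == v:
            _, low, high = self.nodes[f]
            return low, high
        return f, f                      # f does not depend on v

# (a AND b) OR c, with variable order a < b < c
bdd = BDD(["a", "b", "c"])
a, b, c = bdd.var("a"), bdd.var("b"), bdd.var("c")
f = bdd.apply(lambda x, y: x | y, bdd.apply(lambda x, y: x & y, a, b), c)
print(len(bdd.nodes))   # a handful of shared nodes represent the whole function
```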

Randy first published his ideas in 1986. An amazing fact that came to light at the Kauffman Award dinner was that his paper just kept getting more and more citations. Usually a paper generates a flurry of interest soon after publication and then interest dies down. But 15 years after publication, for most of the early part of this decade, Randy's paper wasn't just the most cited paper in EDA; it was the most cited paper in the whole of computer science.

The Kauffman Award is given based on the impact that individuals have had on EDA. The ideas in MOSSIM, while very important in the early 1980s, have dwindled in importance since, as simulation has moved up to higher levels of abstraction. But given that every synthesis and formal verification tool relies heavily on BDDs over 20 years after their conception, I think the "impact" is unarguable.

Posted in eda industry | Comments Off

ICCAD: EDA for the next 10 years

Yesterday at ICCAD, Jim Hogan and I led a discussion on the megatrends facing electronics and the implications going forward for EDA. Basically we took a leaf out of Scoop Nisker's book: when he finished reading the news he would sign off with "if you don't like the news, go out and make some of your own." So we tried to.

Anyone who's been reading this blog regularly won't be surprised at the position we took. I managed to find some interesting data from Morgan Stanley about how electronics is growing but also fragmenting. PCs ship in 100s of millions; cell-phones in billions (the world is expected to get to 100% penetration in a couple of years); and the fragmented consumer market in 10s of billions: car electronics, mobile video, home entertainment, games, Kindles, iPods, smart-phones and so on.

So the end market is growing strongly, but individual systems (with a few exceptions) are shipping in smaller volumes.

Meanwhile, over in IC-land the cost of design has been rising rapidly. For a 45nm chip it is now $50M. There are two problems with this for EDA. One is that the sticker price means that a lot fewer chips will be designed; the second is that the fastest-growing part of the cost is software (where EDA doesn't play much), now up to almost 2/3 of the cost of design. But if a chip costs $50M to design then you'd better be shipping it into a market of 250M+ units or the economics won't work.
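As a back-of-the-envelope check on that 250M+ figure (my own illustrative numbers, not Morgan Stanley's): if the design cost has to be recovered from something like 20 cents of margin per unit, the arithmetic looks like this.

```python
# Back-of-the-envelope: how many units does a $50M design need to ship?
# The $0.20/unit contribution towards the design cost is an assumption
# for illustration, not a figure from the presentation.
design_cost = 50e6        # 45nm design cost in dollars
nre_per_unit = 0.20       # margin per unit available to repay the design cost
units_needed = design_cost / nre_per_unit
print(f"{units_needed / 1e6:.0f}M units")   # -> 250M units
```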

So we have a mismatch: fragmented consumer market requiring low-cost low-volume designs. Semiconductor economics requiring high-cost high-volume designs.

The only way around this is aggregation at the silicon level, along with reconfigurability and reprogrammability.

The most basic form of aggregation is the FPGA, since the basic gates can be used for pretty much any digital design. It’s not very efficient in terms of area or power, but it is completely flexible.

The second form of aggregation is the programmable SoC. This is something I've predicted for some time, but I was surprised to discover recently that some manufacturers have been building these for several years. Indeed, Cypress gave me a chart showing that they are on track to ship 3/4 of a billion of these by the end of the year and should pass a billion next year. The programmable SoC doesn't have completely uncommitted gates like an FPGA; rather, it has little building blocks for peripherals, both analog and digital, that can be reconfigured into a wide range of different devices. This can either be done one time to initialize the device, or it can be done dynamically under control of the on-board processor(s).
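To make the static-versus-dynamic distinction concrete, here is a purely notional sketch in Python. All the block and peripheral names are invented for illustration; real parts such as Cypress's PSoC are configured in C against the vendor's own APIs.

```python
# Notional model of a programmable SoC: a pool of uncommitted analog/digital
# blocks that firmware wires up into peripherals. Everything here is invented
# for illustration; it is not any vendor's API.

class ConfigurableBlock:
    def __init__(self, name):
        self.name = name
        self.role = None

    def configure(self, role, **params):
        self.role, self.params = role, params
        print(f"{self.name} -> {role} {params}")

blocks = [ConfigurableBlock(f"block{i}") for i in range(4)]

def boot_config(blocks):
    """Static use: commit the blocks once at power-up."""
    blocks[0].configure("uart", baud=115200)
    blocks[1].configure("pwm", freq_hz=20_000)
    blocks[2].configure("adc", channels=4)

def runtime_reconfig(blocks):
    """Dynamic use: the on-chip processor repurposes the same blocks later."""
    blocks[1].configure("modem", carrier_hz=1_200)   # the block that was a PWM

boot_config(blocks)
runtime_reconfig(blocks)
```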

The third form of aggregation is the platform. This seems to be most successful in the wireless world, TI's OMAP being the best known. But it has also been happening in digital video. At some point it becomes more efficient to waste silicon by loading up a chip with everything you might ever want, and enable/disable features by software, as opposed to eating the huge cost of masks and inventory of specializing each derivative to perfectly match the end customer's needs.

Jim carried on to talk about which types of products make money in EDA. There is a range of tool types: measurement, modeling, analysis, simulation and optimization. The further to the right on this list, the more money customers are prepared to pay and the more likely it is that you can create and sustain a competitive advantage for several years. Each tool needs to be better, faster or cheaper, and preferably all three, in order to be successful. If you can only have two, they'd better be better and faster. Cheaper in EDA has the same connotations as a low-cost heart surgeon. With so much on the line, that's not the place to economize.

Ultimately this is moving towards what I call software signoff, an inversion of the way we think about electronic systems. Instead of thinking of a complex SoC with some embedded software, a system is actually a big software system, parts of which need to be accelerated by some type of semiconductor implementation to make them economic (fast enough, low enough power). We don't have the tools today to take complex software and automatically build some parts in gates, assemble IP, assign the software to processors and so on. But that is the direction we need to move in.

The mismatch between fragmented end-markets and high costs of design is potentially disruptive and thus an opportunity to change the way that design is done. I return to the call to arms from Sony's Yoshihito Kondo: "We don't want our engineers writing Verilog, we want them inventing concepts and transferring them into silicon and software using automated processes."

The presentation is posted on the SI2 website here.

Posted in eda industry | Comments Off

Hogan and McLellan: live in concert

Jim Hogan and I are doing a presentation during ICCAD on Monday about the direction we see electronic system design moving in, and the implications for EDA. We plan to talk for about 20 minutes and then have a discussion. Come along with your own outspoken opinions on how you think silicon platforms will change, whether we are moving towards software signoff, where the highest-value areas are for EDA, and generally what the big high-level drivers are going to be.

It’s in the Silicon Valley Room at the Doubletree Hotel (2050 Gateway Place, San Jose). That’s up on the second floor by the main staircase. It’s at 3pm on this coming Monday, November 2nd.

See you there!

Posted in admin | Comments Off

State of the union…of digital and analog

I spent part of last Tuesday at the Cadence mixed-signal workshop. I went mainly out of interest to see how things had progressed since I worked at Cadence. I had been put in charge of what we called the Superchip project, which was integrating the custom design environment and the digital synthesis, place & route environment into a single design environment. The heart of the problem was to get both systems onto a single database for mixed-signal design. This turned out to be immensely complicated since the basic semantics of the two design databases were so different, and nobody in the company had a deep understanding of both of them.

Now, many years later, that work seems to have largely been achieved, though it took about 5 times as long as we originally planned.

I thought the most interesting thing was a summary of just how much mixed-signal is impacting design cost. Over 50% of respins at 65nm are due to mixed-signal issues, and each respin costs $5-10M and takes 6-8 weeks. Mixed-signal chips typically take 4-5 respins to get right. That is a huge cost both in direct dollars ($20M-50M) and in time (4-5 respins of 6-8 weeks each is upwards of 6 months).

The level of mixed-signal effort and the expertise required are also increasing significantly. Mixed-signal now takes 50% of the design effort for 10% of the transistors. The basic technology for integration and verification is still too fragmented and there is a lot of ad hoc work to tie together things like self-test.

The next area of difficulty is packaging. It severely impacts device performance due to the package parasitics. The lack of good optimization tools for the chip/package interface makes for lower performance and further drives package costs up.

Posted in methodology | Comments Off

Looking through Critical Blue’s Prism

I caught up with Dave Stewart and Skip Hovsmith of CriticalBlue (from Edinburgh, yay, one of my alma maters). They originally developed technology to take software, pull parts of it out of the code, and implement them in gates. They had some limited success with this. But now they have refocused their technology on the problem of taking legacy code and helping make it multicore-ready with their Prism tool.

They do this by running the code and storing a trace of what goes on for later analysis. Previously they have done this only through simulation but now they can also use hardware boards to run the code. They don’t need a multicore CPU, just one with the same instruction set.

Having developed the trace, they can do "what if there were 4 cores, or 32?" type analysis without needing to run the code again. On code that wasn't written with concurrency in mind, the typical answer is "not much would happen" because there are too many dependencies. The example in the demo is JPEG compression. Most of the time is spent in the DCT (discrete cosine transform) algorithm, but the code doesn't parallelize due to data dependencies. It turns out that the code was written in a way that makes sense in a single-processor world: allocate a single workspace, run the loop 32 times using the workspace, then dispose of the workspace. Obviously if you try to parallelize this, then all iterations of the loop except one must block and wait for the workspace. If you move the workspace allocation into the loop (so that you allocate 32 of them) then all the iterations of the loop can run in parallel.
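The restructuring is easier to see in code. Here is my own illustrative Python sketch (not CriticalBlue's demo code, and with a stand-in transform rather than a real DCT):

```python
# Illustration of the shared-workspace problem: one scratch buffer reused
# across iterations serializes the loop; giving each iteration its own
# workspace removes the dependency. The "transform" is a stand-in, not a DCT.
from concurrent.futures import ProcessPoolExecutor

def transform(block, workspace):
    for i, v in enumerate(block):        # pretend DCT: square each value
        workspace[i] = v * v
    return sum(workspace)

def serial_style(blocks):
    workspace = [0] * len(blocks[0])     # one workspace, shared by every iteration
    return [transform(b, workspace) for b in blocks]

def parallel_friendly(block):
    workspace = [0] * len(block)         # allocation moved inside: no shared state
    return transform(block, workspace)

if __name__ == "__main__":
    blocks = [[i + j for j in range(8)] for i in range(32)]
    assert serial_style(blocks) == [parallel_friendly(b) for b in blocks]
    with ProcessPoolExecutor() as pool:  # the 32 iterations can now run concurrently
        results = list(pool.map(parallel_friendly, blocks))
        print(results[:4])
```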

They don't actually change the code. It turns out users wisely don't want something mucking around with their code without their understanding what is going on. But they give users the tools both to move code onto multicore processors and simply to get it ready. When doing maintenance on code, a programmer using Prism can also remove unnecessary dependencies and thus get the code into shape so that it will be able to take advantage of multicore processors (or of larger numbers of cores) when they become available.

I’ve said before that microprocessor vendors, especially Intel, completely underestimated the difficulty of programming multicore processors when power constraints forced them to deliver computing power in the form of more cores rather than faster clock frequencies. Everyone realizes now that it is one of the major challenges in software going forward.

At DAC the nVidia keynote claimed that Amdahl's law wasn't really a limitation any more. I'm not a believer in that position. The part of any program that cannot be parallelized sets a firm bound on how much speedup can be obtained. Even if only 5% of the code cannot be parallelized, which is an optimistic assumption for most real code, that sets a limit of 20X speedup no matter how many cores are available. CriticalBlue have an interesting tool for teasing out whatever parallelization is possible, often with relatively simple changes to the code, as in the example I described above.
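Amdahl's law makes that bound explicit. Here is a quick sketch of the arithmetic (my own numbers, not anyone's benchmark):

```python
# Amdahl's law: speedup = 1 / (s + (1 - s)/n) for serial fraction s on n cores.
def amdahl_speedup(serial_fraction, cores):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

for n in (4, 32, 256, 4096):
    print(n, round(amdahl_speedup(0.05, n), 1))   # 3.5, 12.5, 18.6, 19.9
# With a 5% serial fraction the speedup creeps towards, but never exceeds,
# 1 / 0.05 = 20X, no matter how many cores you add.
```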

Posted in embedded software | Comments Off

ARM 20 years on

I went to Mike Muller’s keynote at ARM’s techcon3. He started with an interesting retrospective on ARM. They have shipped 15B units (4B in 2008 alone). They have 20+ processor cores, 600+ licensees. In the next 3 or 4 years they will ship another 15B units. It’s not far off to say that “almost all” microprocessors are ARMs (by unit count).

Smartphones are one big driver now. Basic and enhanced phones have peaked and volumes are actually declining, but smartphones are growing fast. For example, between 2007 and 2008 mobile phone browsing increased by 30%. It is clear that smartphones are going to become the dominant way of accessing the internet (they already are in Asia).

If you look out over the next 10 years then in one respect things look rosy. By 2014 area will have increased by 4X from today, and by 2020 by 16X. Clock frequency is still going up, although not dramatically: up 1.6X by 2014 and 2.4X by 2020. Volume production of 3D chips (with TSVs, through-silicon vias) will come along in this period, leading to a "motherboard in a package".

But unfortunately power is the fly in the ointment. Power for a given functionality at a given clock speed will decrease to 0.6X by 2014 and 0.3X by 2020. That's nice but nowhere near enough. Run the numbers: 16 times the area at 2.4 times the clock rate and 0.3X of the power means that only about 10% of the chip can be used if the power budget remains fixed (which largely it does, due to heat and battery life). So complex power architectures, power-down blocks, adaptive clock frequency and so on are going to be essential going forward. Design is getting more and more complex.
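Running those numbers explicitly (my arithmetic, using the keynote's 2020 figures):

```python
# 2020 vs today, per the keynote: 16X the area, 2.4X the clock,
# 0.3X the power per unit of work. With a fixed power budget, how much of
# the chip can actually be switched on at once?
area_scale, clock_scale, power_scale = 16, 2.4, 0.3
relative_power = area_scale * clock_scale * power_scale   # ~11.5X today's power
usable_fraction = 1 / relative_power
print(f"{usable_fraction:.0%}")   # -> 9%, i.e. roughly 10% of the chip
```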

The titanic battle is, as I've said before, between ARM and Intel's Atom. Intel were cheeky enough to bring a huge truck and park it outside the ARM conference, a photograph of which Mike had managed to get into his keynote only an hour or so later. Atom has two advantages over ARM: it has binary compatibility with Windows and the PC, and Intel's manufacturing is second to none. My belief is that as more stuff moves online and more content is accessed through smartphones rather than PCs, binary compatibility will be less and less relevant. Further, Intel won't be manufacturing most Atom-based SoCs, so I'm not sure either advantage is strong enough.

Mike finished with a Lucite block from a Microprocessor Forum ten or fifteen years ago. It contained all the new processors being announced at that forum (back then, that was where everyone announced new processors). There were 11 different architectures. Of the architectures in the Lucite just four remain today: ARM, Intel x86, MIPS and SPARC. Mike is pessimistic that MIPS and SPARC will survive long-term (and I agree), leaving just two: Intel and, of course, ARM.

Posted in semiconductor | Comments Off

TJ Rodgers and the PSoC

I was at the ARM developer conference this week. Actually it has been renamed and is now called Techcon3, which seems pretty generic as branding. Anyway, one of the keynotes was by TJ Rodgers, who started off by telling us more than we wanted to know about how he is using software and hardware to try to make the New World's best pinot noir (he's conceded that DRC is too hard to beat, but that's Old World). However, there was a serious point: he used Cypress PSoCs (programmable systems on chip) to implement the hardware, a use not envisaged when the parts were designed.

Originally Cypress got into the PSoC business by accident. They decided to take a USB controller they had built (which cost 25c to make and sold for 50c) and use the basic technology to attack the microcontroller market. With the programmable hardware they felt they could create the 5000 different parts required for that market using just 6 chips. Technically this worked, but economically it made no sense since the only way to displace an existing microcontroller was to compete on price, which was the problem they were trying to avoid in the first place. So they realized that replacing microcontrollers was a bad idea. But then they discovered that the chips could be used for a wide variety of other applications, and they started to sell. To date Cypress has sold over 500 million of them and is on track for a billion.

So how come they are being successful entering a market 25 years late? I have to admit I didn't know these products existed. The closest thing I was aware of was FPGAs with on-chip processors, which allow you to combine generic gates with software.

I've talked lots of times before about how semiconductor economics means that leading-edge processes can't be used for most designs without some form of aggregation. A 45nm chip is so expensive to bring to production that you'd better want 50 million of them, and since you probably don't have the luxury of having the market to yourself, that means you probably need to be in a market of 200 million units or more. But outside of a few obvious markets such as cell-phones there just aren't many markets that big. The most obvious form of aggregation is the FPGA, which is completely generic. However, there is still a valley of death where the volume is too high to make FPGAs economic but not high enough to make designing a special-purpose chip economic.

The PSoC seems to slip into this valley. It has a processor (the latest Cypress one has an ARM Cortex on it), programmable digital logic that can be used to build a range of digital peripherals, programmable analog parts that can be hooked up into a wide variety of analog inputs, and programmable I/Os. One example application was hotel room door locks (the card-operated ones), which used to require 90 components and an ASIC, and can be replaced with 28 components and a standard-part PSoC. Apart from the reduction in component count, a standard product like this is almost certainly cheaper than the ASIC and certainly much cheaper to design. Not to mention that, since it is all software-programmable including the hardware, it can be changed right up until the last minute (or even be field-upgradable).

Further, in addition to being statically assigned, the hardware can be changed dynamically. Another example: a Coke machine that for all except a few seconds in the middle of the night runs the vending machine, then for a few seconds reconfigures itself into a modem and communicates back to base to report whether the machine needs refilling.

TJ Rodgers’s theme was that you can use a PSoC to solve problems you didn’t know existed for people you’ve never met. But I think another theme is that you can solve problems for markets that are too small to justify the investment of building a specialized chip, which increasingly is almost all of them. Each market may be only moderately large but in aggregate they are enormous. There has been a lot of discussion of platform-based design but most of it has just been trying to get some standardization into wireless design (especially TI’s OMAP platform). Based on just 45 minutes of hearing about it, the PSoC seems like some sort of sweet spot in terms of efficiency and flexibility.

Posted in semiconductor | Comments Off

New World Synphony

It was only a couple of weeks ago that I was writing about software signoff and FPGAs. I mentioned that Synopsys didn't really have any high-level synthesis. Rumor has it that they do have sequential formal verification in development. Anyway, on Monday they announced Synphony, which they position as high-level synthesis. Synopsys has musical product names now, not just Cadence.

Unlike other high-level synthesis tools, it doesn't start from C, C++ or SystemC. It starts from M, the language used within Matlab, the system development environment from The MathWorks. A lot of system design, especially in radio and automotive, is done using MathWorks products. They are a private company with a huge number of seats that sell at that intermediate price point which seems cheap for an EDA tool but expensive for a PC tool: more than Photoshop but less than Verilog. They also distribute their software widely within universities. As a result, M and Matlab (along with various associated products like Simulink) are the primary tools today for algorithm exploration.

I think that this is worth looking at for a couple of reasons. First, it is another step towards the new world where more and more design is done at levels disconnected from chip design by people who don’t think of themselves as chip designers. And then automated tools reduce this to RTL and then to an FPGA (or gates).

The second thing worth looking at is The MathWorks themselves. They sell at a low price point, are rumored to be very profitable, sell an enormous number of licenses and are firmly on the cutting edge of technology. IC design automation can't do this, of course; there just aren't enough designers. But once you start to think of the FPGA as the implementation of choice (after all, "almost all" designs will be FPGAs going forward), it's not clear that is still true. The low-end "Fred in a shed" FPGA designer is never going to invest in tools. But the big new FPGA designs need powerful tools just as much and, in aggregate, will be a big market.

Somewhat on a similar note, at ICCAD Jim Hogan and I will be talking about "How EDA needs to change", which, surprise, will tie into this theme. It will be in the Silicon Valley Room at the DoubleTree at 3pm on Monday, November 2nd. See you there.

Posted in eda industry | Comments Off