RISC: differences between revisions

Content deleted Content added
Profvalente (talk | contribs)
m +semi-automatic fixes (v0.50/3.1.38); -English text that will never be translated and only pollutes the page source; +tag
{{Reciclagem|data=outubro de 2016}}
RISC ([[acronym]] for '''''Reduced Instruction Set Computer''''') is a line of [[processor architecture]]s that favors a small, simple set of [[instruction set|instructions]] which all take approximately the same amount of time to execute. Many modern [[microprocessor]]s are RISCs, for example the [[DEC Alpha]], [[SPARC]], [[MIPS architecture|MIPS]], and [[PowerPC]]. Current computers mix the two approaches, creating the concept of a [[hybrid architecture]] that incorporates ideas from both and adds a RISC core to the processor. The most common desktop microprocessor type, the [[x86]], is closer to [[CISC]] than to RISC, although newer [[integrated circuit|chips]] translate CISC-style [[x86 instructions]] into simpler RISC-style internal forms, prioritizing their execution.
 
Processors based on reduced instruction set computing have no [[microprogramming]]; instructions are executed directly by the [[hardware]]. Characteristically, besides having no [[microcode]], this architecture has a reduced instruction set and a low level of complexity.
As is usual in any area of human activity, it is rare for an important concept or technology to achieve unanimity among researchers, technicians, designers, and managers. This is the case with the CISC architecture, which has always drawn criticism and commentary about its disadvantages and problems. This text does not take sides, but presents the elements of the various positions in the CISC versus RISC debate. To understand the emergence of RISC processors, however, one must examine the problems attributed to the CISC architecture that led researchers and system designers to create an alternative they considered more advantageous.
 
To better understand the roots of the RISC philosophy, one can mention some aspects of CISC architectures cited as problematic by one of the creators of RISC machines, David Patterson, in one of his articles; these pointed toward processors whose simpler specification could reduce or eliminate those problems. In fact, Patterson appears to have been the first to contrast architectures with many powerful instructions (CISC) with his prototype RISC machine (the chosen name was RISC-1):
* '''Speed difference between memory and processor''' – at the end of the [[1970s]], IBM found that this difference was a problem in its systems: some operations were carried out by programs, causing many accesses to a slow memory. The solution found was to create new machine instructions to perform those operations, and this can be regarded as the beginning of the growth in the number of CISC instructions.
 
* '''Use of microcode''' – the advent, and real cost/benefit advantage, of microcode over hard-wired control led designers to create more and more instructions, given the resulting ease and flexibility.

* '''Rapid development of high-level languages''' – in the [[1980s]], use of high-level languages was growing quickly, leading processor designers to include ever more machine instructions in their products, with the aim of keeping adequate support for compilation.
 
* '''Density of the code to be executed''' – CISC architectures seek compact code after compilation, so as not to consume memory in excess. This was necessary at a time when memories were expensive and of reduced size. By building instruction sets in which each instruction lies closer to the meaning of a high-level command, denser, more compact executable code could be obtained. Patterson argues, however, that this also entails more bits per instruction (opcodes with more bits because of their quantity, as well as more addressing modes), which would offset that supposed advantage.
 
* '''Need for compatibility with earlier processors''' – one goal always pursued by Intel and other manufacturers has been to preserve compatibility between versions of their processors. Thus the [[486]] processor arrived with only a few new instructions plus the entire [[386]] instruction set: executables for the 386 also ran on the 486, and users could change computers without any additional recompilation cost. The same happened with the [[Pentium]] I, II, III, and 4. Although this is a notoriously important marketing requirement, it imposes a limitation on the specification of new architectures. New architectures therefore only grow in instruction count, since the manufacturer never removes old instructions, because of the compatibility problem.
 
An important force encouraging complexity was the fact that main memories were very limited (on the order of kilobytes). It was therefore advantageous for the density of information held in computer programs to be high, leading to features such as highly encoded, variable-length instructions that combined data loading with computation (as mentioned above). These issues took higher priority than the ease of decoding instructions.
 
An equally important reason was that main memories were quite slow (a common type was ferrite core memory); by packing information densely, one could reduce the frequency with which the CPU had to access this slow resource. Modern computers show similar limiting factors: main memories are slow compared with the CPU, and the fast cache memories employed to overcome this are limited in size. This may explain why highly encoded instruction sets have proved as useful as RISC designs in modern computers.
 
== RISC Development Philosophy ==
In the mid-1970s, researchers at IBM (John Cocke in particular), along with similar projects elsewhere, demonstrated that the majority of the combinations of those orthogonal addressing modes and instructions were not used by most programs generated by the compilers available at the time. In many cases it proved difficult to write a compiler with more than a limited ability to take advantage of the features offered by conventional processors.
 
It was also discovered that, in implementations of microcoded architectures, certain complex operations tended to be slower than a sequence of simpler operations doing the same thing. This was partly an effect of many designs being rushed, with little time to optimize or tune all the instructions rather than only those used most frequently. A famous example was the VAX's INDEX instruction.
 
As mentioned earlier, core memory had long been slower than many CPU designs. The advent of semiconductor memory reduced this difference, but it was still apparent that more registers (and later caches) would permit higher CPU operating frequencies. Additional registers would require sizeable chip or board areas which, at the time (1975), could be made available if the complexity of the CPU logic were reduced.
 
Yet another impulse behind RISC and other designs came from practical measurements on real-world programs. [[Andrew Tanenbaum]] summarized many of these, demonstrating that processors often had disproportionately sized immediate fields. For example, he showed that 98% of all the constants in a program would fit in 13 bits, yet many CPU designs dedicated 16 or 32 bits to store them. This suggests that, to reduce the number of memory accesses, a fixed-length-instruction machine could store constants in otherwise unused bits of the instruction word itself, so that they would be immediately available when the CPU needed them (much like immediate addressing in a conventional design). This required small opcodes, to leave room for a reasonably sized constant within a 32-bit instruction word.
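Tanenbaum's observation can be sketched with a hypothetical fixed-length format. The 6/5/5/13-bit field layout below is invented for illustration and does not correspond to any real instruction set:

```python
# Hypothetical 32-bit instruction format (fields invented for illustration):
# opcode[31:26] | rd[25:21] | rs[20:16] | imm[15:3] | unused[2:0]
# A small opcode leaves 13 bits free for a signed constant, so most
# constants ride along inside the instruction word itself.

def encode(opcode: int, rd: int, rs: int, imm: int) -> int:
    """Pack the fields into one 32-bit word; imm must fit in 13 signed bits."""
    assert -(1 << 12) <= imm < (1 << 12), "constant does not fit in 13 bits"
    return (opcode << 26) | (rd << 21) | (rs << 16) | ((imm & 0x1FFF) << 3)

def decode_imm(word: int) -> int:
    """Recover the signed 13-bit immediate from the instruction word."""
    imm = (word >> 3) & 0x1FFF
    return imm - (1 << 13) if imm & (1 << 12) else imm

# A constant such as -100 travels inside the instruction, with no extra
# memory access needed to fetch it.
word = encode(opcode=0b000001, rd=2, rs=3, imm=-100)
assert decode_imm(word) == -100
```

By Tanenbaum's figure, 98% of constants fit a field of this size; the remaining 2% would need a separate load from memory, the price paid for keeping every instruction exactly 32 bits wide.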
- fast execution of each instruction (one per clock cycle).
 
'''Smaller number of instructions:''' perhaps the most striking characteristic of RISC architectures is having a smaller instruction set (all instructions also of fixed width) than machines with the CISC architecture, yet with the same capability. Hence the name given to the RISC architecture (reduced instruction set computer). Sun's SPARC had a set of about 50 instructions; the VAX-11/780 had up to 300 instructions; the Intel 80486 was introduced with 200 instructions; and the Pentium processors have more than 200 instructions.
 
With the reduced instruction set, and each instruction's function optimized, these systems delivered better performance. On the other hand, the reduced instruction set resulted in somewhat longer programs.
Instruction fetch was made easier because all instructions have the same size in bits and are aligned to the word width. It is therefore no longer necessary to determine the instruction length, since the program counter is always incremented by the same value. This also removes the risk of an instruction occupying two different pages, which would cause problems for the operating system at access time.
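The simplification can be seen in a minimal sketch, assuming the 32-bit, word-aligned instructions the text describes:

```python
# With fixed-length, word-aligned instructions, sequential fetch never has
# to decode the current instruction to find the next one: the program
# counter always advances by the same amount.

WORD_SIZE = 4  # bytes per instruction in this hypothetical 32-bit RISC

def next_pc(pc: int) -> int:
    # No length calculation, and no check for an instruction straddling a
    # page boundary: alignment keeps each instruction within one word.
    assert pc % WORD_SIZE == 0, "instructions are always word-aligned"
    return pc + WORD_SIZE

pc = 0x1000
pc = next_pc(pc)  # 0x1004
pc = next_pc(pc)  # 0x1008
```

On a variable-length machine, by contrast, the fetch unit must decode at least part of the instruction before it knows where the next one begins.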
 
'''Optimized execution of function calls:''' another advance of the RISC architecture over the CISC architecture concerns routine calls and parameter passing. Studies indicate that function calls consume a significant amount of processor time. They require little data, but spend a long time on memory accesses.
 
Because of this, more registers were used in the RISC architecture. Function calls that in the CISC architecture involved memory accesses were handled in RISC inside the processor itself, using the extra registers provided.
 
'''Pipelined execution mode:''' one of the most relevant characteristics of the RISC architecture is the use of [[Pipeline (hardware)|pipelining]], even though it works most effectively when the instructions are all quite similar.
 
Imagining the stages of an assembly line, it is not desirable for one stage to finish before another, since in that case the advantage of the assembly line is lost. The goal is for each instruction to complete one pipeline stage per [[clock]] cycle, but this goal is not always achieved.
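As an idealized sketch, the timing of such a pipeline can be tabulated. The stage names follow the five-phase list in the text; since that list is truncated here, the last two stages (execute and write-back) are assumed:

```python
# Idealized RISC pipeline: each instruction advances exactly one stage per
# clock cycle. IF/ID/OF/EX/WB stand for instruction fetch, instruction
# decode, operand fetch, execute, and write-back (the last two assumed).

STAGES = ["IF", "ID", "OF", "EX", "WB"]

def pipeline_schedule(n_instructions: int) -> list:
    """For each clock cycle, map stage name -> index of instruction in it."""
    total_cycles = n_instructions + len(STAGES) - 1
    schedule = []
    for cycle in range(total_cycles):
        occupancy = {}
        for s, stage in enumerate(STAGES):
            i = cycle - s  # instruction i entered the pipeline at cycle i
            if 0 <= i < n_instructions:
                occupancy[stage] = i
        schedule.append(occupancy)
    return schedule

# Four instructions finish in 4 + 5 - 1 = 8 cycles; without pipelining,
# the same four instructions would take 4 * 5 = 20 cycles.
assert len(pipeline_schedule(4)) == 8
```

Once the pipeline is full, one instruction completes per cycle, which is exactly the "one per clock cycle" goal stated above; stalls and branches are what keep real machines from always reaching it.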
Processing an instruction comprises at least five phases:
* Instruction fetch;
* Instruction decode;
* Operand fetch;
This is not necessarily correct if we consider that a smaller number of instructions does not always mean a smaller number of bits (and it is the effective number of bits that consumes less memory, at lower cost). If each CISC instruction has more operands than the RISC instructions, and each of its operands occupies a good number of bits in the instruction, then a CISC program can end up larger in bits than a RISC machine program, even though the program for the RISC processor contains more instructions.
 
For example, a program written to run on a CISC processor might take 150 machine instructions; each instruction has an 8-bit opcode and may have one, two, or three operands. Each operand field occupies 18 bits, and there is a further field for other actions, 4 bits in size. On average, the instructions total 50 bits. A program for the same problem, written to run on a RISC processor, might have 220 instructions averaging 32 bits each.
The instructions are overwhelmingly two-operand, but the operands are values in registers, so the instructions do not consume many bits to address the two registers.
The CISC program would take 7,500 bits, while the RISC program, despite having 70 more instructions than the CISC one, would consume 7,040 bits.
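The arithmetic of this example checks out: total program size in bits is the instruction count times the average instruction width.

```python
# Code-size comparison using the figures from the example above.

cisc_instructions, cisc_avg_bits = 150, 50
risc_instructions, risc_avg_bits = 220, 32

cisc_size = cisc_instructions * cisc_avg_bits  # 7500 bits
risc_size = risc_instructions * risc_avg_bits  # 7040 bits

assert cisc_size == 7500 and risc_size == 7040
# Despite 70 more instructions, the RISC program is 460 bits smaller.
```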
 
CISC characteristics
 
* Microprogrammed control;
* Two-operand instructions – ADD CX,mem;
 
RISC characteristics
 
* Hardwired control;
* Small instruction set;
* Hitachi's SuperH, originally in wide use in the Sega 32X, Saturn, and Dreamcast, now at the heart of many consumer electronics devices. SuperH is the base platform for the joint Mitsubishi–Hitachi semiconductor group. The two groups merged in 2002, dropping Mitsubishi's own RISC architecture, the M32R.
* Atmel AVR, used in a variety of products, ranging from Xbox handheld controllers to BMW cars.
<!--
{{Redirect|RISC}}
{{refimprove|date=fevereiro 2007}}
 
The '''reduced instruction set computer''', or '''RISC''', is a [[CPU design]] philosophy that favors an [[instruction set]] reduced both in size and complexity of [[addressing mode]]s, in order to enable easier implementation, greater [[instruction level parallelism]], and more efficient [[compiler]]s. [[As of 2007]], common RISC microprocessors families include the [[DEC Alpha]], [[ARC International|ARC]], [[ARM architecture|ARM]], [[Atmel AVR|AVR]], [[MIPS architecture|MIPS]], [[PA-RISC]], [[Power Architecture]] (including [[PowerPC]]), and [[SPARC]].
 
The idea was originally inspired by the discovery that many of the features that were included in traditional [[central processing unit|CPU]] designs to facilitate coding were being ignored by the [[computer program|program]]s that were running on them. Also these more complex features took several processor cycles to be performed. Additionally, the performance gap between the processor and main memory was increasing. This led to a number of techniques to streamline processing within the CPU, while at the same time attempting to reduce the total number of memory accesses.
 
== Pre-RISC design philosophy ==
{{details|CPU design}}
--><!-- Should w move this entire section to [[CPU design]] ? --><!--
 
In the early days of the computer industry, [[compiler]] technology did not exist at all. Programming was done in either [[machine code]] or [[assembly language]]. To make programming easier, computer architects created more and more complex instructions, which were direct representations of high level functions of high level programming languages. The attitude at the time was that hardware design was easier than compiler design, so the complexity went into the hardware.
 
Another force that encouraged complexity was the lack of large memory. Since memory was small, it was advantageous for the density of information held in computer programs to be very high. When every [[byte]] of memory was precious, for example one's entire system only had a few kilobytes of storage, it moved the industry to such features as highly encoded instructions, instructions which could be variable sized, instructions which did multiple operations and instructions which did both data movement and data calculation. At that time, such instruction packing issues were of higher priority than the ease of decoding such instructions.
 
Another reason to keep the density of information high was that memory was not only small, but also quite slow, usually implemented using ferrite [[core memory]] technology. By having dense information packing, one could decrease the frequency with which one had to access this slow resource.
 
CPUs had few registers for two reasons:
* bits in internal CPU registers are always more expensive than bits in external memory. The available level of silicon integration of the day meant large register sets would have been burdensome to the chip area or board areas available.
* Having a large number of registers would have required a large number of instruction bits (using precious RAM) to be used as register specifiers.
 
For the above reasons, CPU designers tried to make instructions that would do as much work as possible. This led to one instruction that would do all of the work in a single instruction: load up the two numbers to be added, add them, and then store the result back directly to memory. Another version would read the two numbers from memory, but store the result in a register. Another version would read one from memory and the other from a register and store to memory again. And so on. This processor design philosophy eventually became known as [[Complex Instruction Set Computer]] (CISC) once the RISC philosophy came onto the scene.
 
The general goal at the time was to provide every possible [[addressing mode]] for every instruction, a principle known as "orthogonality." This led to some complexity on the CPU, but in theory each possible command could be tuned individually, making the design faster than if the programmer used simpler commands.
 
The ultimate expression of this sort of design can be seen at two ends of the power spectrum, the [[MOS Technology 6502|6502]] at one end, and the [[VAX]] at the other. The $25 single-chip 1 MHz 6502 had only a single general-purpose register, but its simplistic single-cycle memory interface allowed byte-wide operations to perform almost on par with significantly higher clocked designs, such as a 4 MHz [[Zilog Z80]] using equally slow memory chips (i.e. approx. 300ns). The VAX was a [[minicomputer]] whose initial implementation required 3 racks of equipment for a single cpu, and was notable for the amazing variety of memory access styles it supported, and the fact that every one of them was available for every instruction.
 
== RISC design philosophy ==
In the late 1970s researchers at [[IBM]] (and similar projects elsewhere) demonstrated that the majority of these "orthogonal" [[addressing mode]]s were ignored by most programs. This was a side effect of the increasing use of [[compiler]]s to generate the programs, as opposed to writing them in [[assembly language]]. The compilers in use at the time only had a limited ability to take advantage of the features provided by [[Complex instruction set computer|CISC]] CPUs; this was largely a result of the difficulty of writing a compiler. The market was clearly moving to even wider use of compilers, diluting the usefulness of these orthogonal modes even more.
 
Another discovery was that these operations were rarely used; in fact, they tended to be ''slower'' than a number of smaller operations doing the same thing. This seeming [[paradox]] was a side effect of the time spent designing the CPUs: designers simply did not have time to tune every possible instruction, and instead only tuned the ones used most often. One famous example of this was the [[Vax|VAX]]'s <code>INDEX</code> instruction, which ran slower than a loop implementing the same code.<ref>[[David A. Patterson|Patterson, D. A.]] and [[David Ditzel|Ditzel, D. R.]] 1980. The case for the reduced instruction set computer. ''SIGARCH Comput. Archit. News'' 8, 6 (Oct. 1980), 25-33. DOI= http://doi.acm.org/10.1145/641914.641917</ref>
 
At about the same time CPUs started to run even faster than the memory they talked to. Even in the late 1970s it was apparent that this disparity was going to continue to grow for at least the next decade, by which time the CPU would be tens to hundreds of times faster than the memory. It became apparent that more [[processor register|registers]] (and later [[cache]]s) would be needed to support these higher operating frequencies. These additional registers and cache memories would require sizeable chip or board areas that could be made available if the complexity of the CPU was reduced.
 
Yet another part of RISC design came from practical measurements on real-world programs. [[Andrew S. Tanenbaum|Andrew Tanenbaum]] summed up many of these, demonstrating that most processors were vastly overdesigned. For instance, he showed that 98% of all the constants in a program would fit in 13 [[bit]]s, yet almost every CPU design dedicated some multiple of 8 bits to store them, typically 8, 16 or 32, one entire [[word (computer science)|word]]. Taking this fact into account suggests that a machine should allow for constants to be stored in unused bits of the instruction itself, decreasing the number of memory accesses. Instead of loading up numbers from memory or registers, they would be "right there" when the CPU needed them, and therefore much faster. However this required the operation itself to be very small, otherwise there would not be enough room left over in a 32-bit instruction to hold reasonably sized constants.
 
Since real-world programs spent most of their time executing very simple operations, some researchers decided to focus on making those common operations as simple and as fast as possible.
Since the [[clock rate]] of the CPU is limited by the time it takes to execute the ''slowest'' instruction, --><!-- or, for instructions that take more than 1 clock cycle, the time it takes to execute the slowest 1-clock-cycle *part* of any instruction ... but that's probably too much detail --><!--
speeding up that instruction -- perhaps by reducing the number of addressing modes it supports -- also speeds up the execution of every other instruction. The goal of RISC was to make instructions so simple, each one could be executed in a single clock cycle[http://www.ercb.com/ddj/1990/ddj.9009.html].
The focus on "reduced instructions" led to the resulting machine being called a "reduced instruction set computer" (RISC).
 
The main difference between RISC and CISC is that RISC architecture instructions either (a) perform operations on the registers or (b) load and store the data to and from them. Many CISC instructions, on the other hand, combine these steps. To clarify this difference, many researchers use the term ''load-store'' to refer to RISC.
 
Over time the older design technique became known as ''[[Complex Instruction Set Computer]]'', or ''CISC'', although this was largely to give it a different name for comparison purposes.
 
--><!--
Thus the RISC philosophy was to make smaller instructions, implying fewer of them, and thus the name "reduced instruction set".
 
''Didn't we just say this was a misunderstanding?''
--><!--
Code was implemented as a series of these simple instructions, instead of a single complex instruction that had the same result. This had the side effect of leaving more room in the instruction to carry data with it, meaning that there was less need to use registers or memory. At the same time the memory interface was considerably simpler, allowing it to be tuned.
 
However RISC also had its drawbacks. Since a series of instructions is needed to complete even simple tasks, the total number of instructions read from memory is larger, and therefore takes longer. At the time it was not clear whether or not there would be a net gain in performance due to this limitation, and there was an almost continual battle in the press and design world about the RISC concepts.
 
== Other solutions ==
While the RISC philosophy was coming into its own, new ideas about how to dramatically increase performance of the CPUs were starting to develop.
 
In the early 1980s it was thought that existing design was reaching theoretical limits. Future improvements in speed would be primarily through improved [[semiconductor]] "process", that is, smaller features ([[transistors]] and wires) on the chip. The complexity of the chip would remain largely the same, but the smaller size would allow it to run at higher clock rates. A considerable amount of effort was put into designing chips for [[parallel computing]], with built-in communications links. Instead of making faster chips, a large number of chips would be used, dividing up problems among them. However, history has shown that the original fears were not valid and there were a number of ideas that dramatically improved performance in the late 1980s.
 
One idea was to include a [[Instruction pipeline|pipeline]] which would break down instructions into steps, and work on one step of several different instructions at the same time. A normal processor might read an instruction, decode it, fetch the memory the instruction asked for, perform the operation, and then write the results back out. The key to pipelining is the observation that the processor can start reading the next instruction as soon as it finishes reading the last, meaning that there are now two instructions being worked on (one is being read, the next is being decoded), and after another cycle there will be three. While no single instruction is completed any faster, the ''next'' instruction would complete right after the previous one. The result was a much more efficient utilization of processor resources.
 
Yet another solution was to use several processing elements inside the processor and run them in parallel. Instead of working on one instruction to add two numbers, these [[superscalar]] processors would look at the next instruction in the pipeline and attempt to run it at the same time in an identical unit. However, this can be difficult to do, as many instructions in computing depend on the results of some other instruction.
 
Both of these techniques relied on increasing speed by adding complexity to the basic layout of the CPU, as opposed to the instructions running on them. With chip space being a finite quantity, in order to include these features something else would have to be removed to make room. RISC was tailor-made to take advantage of these techniques, because the core logic of a RISC CPU was considerably simpler than in CISC designs. Although the first RISC designs had marginal performance, they were able to quickly add these new design features and by the late 1980s they were significantly outperforming their CISC counterparts. In time this would be addressed as process improved to the point where all of this could be added to a CISC design and still fit on a single chip, but this took most of the late-80s and early 90s.
 
The long and short of it is that for any given level of general performance, a RISC chip will typically have many fewer [[transistor]]s dedicated to the core logic. This allows the designers considerable flexibility; they can, for instance:
 
* increase the size of the register set
* implement measures to increase internal parallelism
* increase the size of [[CPU cache|caches]]
* add other functionality, like I/O and timers for microcontrollers
* add vector ([[SIMD]]) processors like [[AltiVec]] and [[Streaming SIMD Extensions]] (SSE)
* build the chips on older fabrication lines, which would otherwise go unused
* do nothing; offer the chip for [[battery (electricity)|battery]]-constrained or size-limited applications
 
Features which are generally found in RISC designs are:
* uniform instruction encoding (for example the op-code is always in the same bit position in each instruction, which is always one word long), which allows faster decoding;
* a homogeneous register set, allowing any register to be used in any context and simplifying compiler design (although there are almost always separate [[integer]] and [[floating point]] register files);
* simple [[addressing mode]]s (complex addressing modes are replaced by sequences of simple arithmetic instructions);
* few data types supported in hardware (for example, some CISC machines had instructions for dealing with [[byte]] [[string (computer science)|strings]]. Others had support for polynomials and [[complex number]]s. Such instructions are unlikely to be found on a RISC machine).
 
RISC designs are also more likely to feature a [[Harvard architecture|Harvard memory model]], where the instruction stream and the data stream are conceptually separated; this means that modifying the addresses where code is held might not have any effect on the instructions executed by the processor (because the CPU has a separate instruction and data [[cache]]), at least until a special synchronization instruction is issued. On the upside, this allows both caches to be accessed simultaneously, which can often improve performance.
 
Many of these early RISC designs also shared the characteristic of having a [[branch delay slot]]. A branch delay slot is an instruction space immediately following a jump or branch. The instruction in this space is executed whether or not the branch is taken (in other words the effect of the branch is delayed). This instruction keeps the [[arithmetic and logical unit|ALU]] of the CPU busy for the extra time normally needed to perform a branch. Nowadays the branch delay slot is considered an unfortunate side effect of a particular strategy for implementing some RISC designs, and modern RISC designs generally do away with it (such as [[PowerPC]], more recent versions of SPARC, and MIPS).
 
== Early RISC ==
The first system that would today be known as RISC was not so labelled at the time: the [[CDC 6600]] [[supercomputer]], designed in 1964 by Jim Thornton and [[Seymour Cray]]. Thornton and Cray designed it as a number-crunching CPU (with 74 opcodes, compared with an [[Intel 8086|8086]]'s 400) plus 12 simple computers called "peripheral processors" to handle I/O (most of the operating system ran in one of these). The CDC 6600 had a load-store architecture with only two [[addressing mode]]s. There were eleven pipelined functional units for arithmetic and logic, plus five load units and two store units (the memory had multiple banks, so all load-store units could operate at the same time). The basic clock cycle/instruction issue rate was 10 times faster than the memory access time.
 
Another early load-store machine was the [[Data General Nova]] minicomputer, designed in 1968.
 
The earliest attempt to make a chip-based RISC CPU was a project at [[International Business Machines|IBM]] which started in 1975. Named after the building where the project ran, the work led to the [[IBM 801]] CPU family which was used widely inside IBM hardware. The 801 was eventually produced in a single-chip form as the '''ROMP''' in 1981, which stood for ''Research (Office Products Division) Mini Processor''. As the name implies, this CPU was designed for "mini" tasks, and when IBM released the [[IBM RT-PC]] based on the design in 1986, the performance was not acceptable. Nevertheless the 801 inspired several research projects, including new ones at IBM that would eventually lead to their [[IBM POWER|POWER]] system.
 
The most public RISC designs, however, were the results of university research programs run with funding from the [[DARPA]] [[VLSI]] Program. The VLSI Program, practically unknown today, led to a huge number of advances in chip design, fabrication, and even computer graphics.
 
[[UC Berkeley]]'s [[Berkeley RISC|RISC project]] started in 1980 under the direction of [[David A. Patterson|David Patterson]], based on gaining performance through the use of pipelining and an aggressive use of registers known as [[register window]]s. In a normal CPU one has a small number of registers, and a program can use any register at any time. In a CPU with register windows, there are a huge number of registers, e.g. 128, but programs can only use a small number of them, e.g. 8, at any one time.
A program that limits itself to 8 registers per procedure can make very fast procedure calls: The call simply moves the window "down" by 8, to the set of 8 registers used by that procedure, and the return moves the window back. (On a normal CPU, most calls must save at least a few registers' values to the stack in order to use those registers as working space, and restore their values on return.)
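The call mechanism above can be sketched as follows. This is a simplified model: real SPARC-style windows overlap between caller and callee for argument passing, which this toy version ignores:

```python
class WindowedRegs:
    """A large physical register file viewed through a sliding 8-register window."""

    def __init__(self, total=128, window=8):
        self.regs = [0] * total
        self.window = window
        self.base = 0                      # current window position

    def __getitem__(self, r):              # r0..r7 within the current window
        return self.regs[self.base + r]

    def __setitem__(self, r, val):
        self.regs[self.base + r] = val

    def call(self):                        # procedure call: slide window "down"
        self.base += self.window

    def ret(self):                         # return: slide back; old values intact
        self.base -= self.window

rf = WindowedRegs()
rf[0] = 42        # caller's r0
rf.call()
rf[0] = 7         # callee's r0 is a different physical register
rf.ret()
print(rf[0])      # → 42: the caller's value survived with no stack traffic
```

The save and restore that a conventional CPU would perform through memory become a single pointer adjustment, which is why leaf-heavy call patterns benefited so much.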
 
The RISC project delivered the RISC-I processor in 1982. Consisting of only 44,420 transistors (compared with averages of about 100,000 in newer CISC designs of the era), RISC-I had only 32 instructions, and yet completely outperformed any other single-chip design. They followed this up with the 40,760-transistor, 39-instruction RISC-II in 1983, which ran over three times as fast as RISC-I.
 
At about the same time, [[John L. Hennessy]] started a similar project called [[MIPS architecture|MIPS]] at [[Stanford University]] in 1981. MIPS focused almost entirely on the pipeline, making sure it could be run as "full" as possible. Although pipelining was already in use in other designs, several features of the MIPS chip made its pipeline far faster. The most important, and perhaps annoying, of these features was the demand that all instructions be able to complete in one cycle. This demand allowed the pipeline to be run at much higher speeds (there was no need for induced delays) and is responsible for much of the processor's speed. However, it also had the negative side effect of eliminating many potentially useful instructions, like a multiply or a divide.
 
In the early years, the RISC efforts were well known, but largely confined to the university labs that had created them. The Berkeley effort became so well known that it eventually lent its name to the entire concept. Many in the computer industry argued that the performance benefits were unlikely to translate into real-world settings because of the decreased memory efficiency of using multiple simple instructions, and that this was why no one was using them. But starting in 1986, all of the RISC research projects started delivering products. In fact, almost all modern RISC processors are close descendants of the RISC-II design.
 
== Later RISC ==
Berkeley's research was not directly commercialized, but the RISC-II design was used by [[Sun Microsystems]] to develop the [[SPARC]], by [[Pyramid Technology]] to develop their line of mid-range multi-processor machines, and by almost every other company a few years later. It was Sun's use of a RISC chip in their new machines that demonstrated that RISC's benefits were real, and their machines quickly outpaced the competition and essentially took over the entire [[workstation]] market.
 
John Hennessy left Stanford (temporarily) to commercialize the MIPS design, starting the company known as [[MIPS Computer Systems]]. Their first design was a second-generation MIPS chip known as the '''[[R2000]]'''. MIPS designs went on to become one of the most used RISC chips when they were included in the [[PlayStation]] and [[Nintendo 64]] [[game console]]s. Today they are one of the most common [[embedded processor]]s in use for high-end applications.
 
IBM learned from the RT-PC failure and went on to design the RS/6000 based on their new POWER architecture. They then moved their existing [[AS/400]] systems to POWER chips, and found much to their surprise that even the very complex instruction set ran considerably faster. POWER would also find itself moving "down" in scale to produce the [[PowerPC]] design, which eliminated many of the "IBM only" instructions and created a single-chip implementation. Today the PowerPC is one of the most commonly used CPUs for automotive applications (some cars have over 10 of them inside). It was also the CPU used in most [[Apple Macintosh]] machines sold until 2006. Starting in February 2006, Apple switched their PowerPC products to [[Intel]] [[x86]] processors.
 
Almost all other vendors quickly joined. From the [[United Kingdom|UK]] similar research efforts resulted in the [[INMOS transputer]], the [[Acorn Archimedes]] and the [[ARM Ltd|Advanced RISC Machine]] line, which is a huge success today. Companies with existing CISC designs also quickly joined the revolution. Intel released the [[Intel i860|i860]] and [[Intel i960|i960]] by the late 1980s, although they were not very successful. [[Motorola]] built a new design called the [[Motorola 88000|88000]] in homage to their famed CISC [[Motorola 68000|68000]], but it saw almost no use and they eventually abandoned it and joined IBM to produce the PowerPC. [[Advanced Micro Devices|AMD]] released their [[AMD 29000|29000]] which would go on to become the most popular RISC design of the early 1990s.
 
Today the vast majority of all 32-bit CPUs in use are RISC CPUs and [[microcontroller]]s. RISC design techniques offer good performance even at small sizes, and have thus become dominant for low-power 32-bit CPUs. Embedded systems are by far the largest market for processors: while a family may own one or two PCs, their car(s), cell phones, and other devices may contain a total of dozens of embedded processors. RISC had also completely taken over the market for larger workstations for much of the 1990s (until taken back by cheap PC-based solutions). After the release of the Sun SPARCstation, the other vendors rushed to compete with RISC-based solutions of their own. The high-end server market today is almost completely RISC-based.
 
=== RISC and x86 ===
However, despite many successes, RISC has made few inroads into the desktop PC and commodity server markets, where [[Intel]]'s [[x86]] platform remains the dominant processor architecture (Intel is facing increased competition from [[Advanced Micro Devices|AMD]], but even AMD's processors implement the x86 platform, or a 64-bit superset known as [[x86-64]]). There are two main reasons for this. The first is that the very large base of [[proprietary software|proprietary]] PC applications is written for x86, whereas no RISC platform has a similar installed base; this meant PC users were locked into the x86. The second is that, although RISC was indeed able to scale up in performance quite quickly and cheaply, Intel took advantage of its large market by spending vast amounts of money on processor development. Intel could spend many times as much as any RISC manufacturer on improving low-level design and manufacturing. The same could not be said of smaller firms like [[Cyrix]] and [[NexGen]], but they realized that they could apply pipelined design philosophies and practices to the x86 architecture &mdash; either ''directly'', as in the 6x86 and MII series, or ''indirectly'' (via extra decoding stages), as in the [[Nx586]] and [[AMD K5]]. Later, more powerful processors such as the [[Intel P6]] and [[AMD K6]] had similar RISC-like units that executed a stream of micro-operations generated by decoding stages that split most x86 instructions into several pieces. Today, these principles have been further refined and are used by modern x86 processors such as the [[Intel Core 2]] and [[AMD K8]]. The first ''available'' chip deploying such techniques was the NexGen Nx586, released in 1994 (the AMD K5 was severely delayed and released in 1995).
 
As of [[2007]], the x86 designs (whether Intel's or AMD's) are as fast as (if not faster than) the fastest true RISC single-chip solutions available.<ref>[http://www.spec.org/cpu2006/results/]</ref>
 
=== Cost ===
Consumers are interested in speed, energy efficiency, cost per chip, and compatibility with existing software rather than in the cost of developing new chips.{{Carece de fontes|data=junho 2007}} This has led to an interesting chain of events. As the complexity of developing ever more advanced CPUs rises, the cost of both development and fabrication of high-end CPUs has exploded. The cost gains offered by RISC are now dwarfed by the high cost of developing any modern CPU. Today, only the biggest chip makers are able to build high-performance CPUs. As a result, virtually all RISC platforms other than IBM's [[Power Architecture]] either greatly scaled back development of high-performance CPUs (like SPARC and MIPS) or were abandoned (like Alpha and PA-RISC) during the 2000s. As of 2007, the fastest CPU in [[SPECint]] and [[SPECfp]] is once again a RISC chip: IBM's [[Power6]].
 
Still, RISC designs have led to a number of successful platforms and architectures, some of the larger ones being:
 
* MIPS's [[MIPS architecture|MIPS]] line, found in most [[Silicon Graphics|SGI]] computers and the [[PlayStation]], [[PlayStation 2]], [[Nintendo 64]] (discontinued), and [[PlayStation Portable]] game consoles.
* IBM's and Freescale's (formerly [[Motorola]] SPS) [[Power Architecture]], used in all of IBM's supercomputers, midrange servers and workstations, in Apple's PowerPC-based [[Macintosh]] computers (discontinued), in [[Nintendo]]'s [[Nintendo Gamecube|Gamecube]] and [[Wii]], [[Microsoft]]'s [[Xbox 360]] and [[Sony]]'s [[PlayStation 3]] game consoles, and in many embedded applications like printers and cars.
* [[Sun Microsystems|Sun]]'s [[SPARC]] and [[UltraSPARC]], found in most of their later machines
* [[Hewlett-Packard]]'s [[PA-RISC]], also known as HP/PA (discontinued).
* [[DEC Alpha]], still used in some of HP's workstation and servers (discontinued).
* [[XAP processor]] used in many wireless chips, e.g. [[Bluetooth]]
* [[ARM architecture|ARM]] — [[Palm, Inc.]] originally used the (CISC) Motorola 680x0 processors in its early PDAs, but now uses (RISC) ARM processors in its latest PDAs. Many PocketPC PDAs and smartphones run on the Intel XScale or equivalent CPUs (such as the Samsung SC32442), implementations of the ARMv5 architecture. [[Apple Inc.]] uses the ARM 7TDMI in its [[iPod]] products, and Samsung's ARM1176JZF processor in the [[iPhone]]. [[Nintendo]] uses an ARM7 CPU in the [[Game Boy Advance]] and both an ARM7 and an ARM9 in the [[Nintendo DS]] handheld game systems. The small Korean company [[Game Park]] also markets the GP32, which uses an ARM9 CPU. Many cell phones from, for example, [[Nokia]] are also based on ARM designs.
* [[Hitachi, Ltd.|Hitachi]]'s [[SuperH]], originally in wide use in the [[Sega]] [[Sega 32X|Super 32X]], [[Sega Saturn|Saturn]] and [[Dreamcast]], now at the heart of many consumer electronics devices. The SuperH is the base platform for the [[Mitsubishi]] - Hitachi joint semiconductor group. The two groups merged in [[2002]], dropping Mitsubishi's own RISC architecture, the [[M32R]].
 
== Alternative term ==
Over many years, RISC instruction sets have tended to grow in size. Thus, some have started using the term "load-store" to describe RISC processors, since this is the key element of all such designs. In a load-store architecture, memory is accessed only through dedicated load and store instructions, while arithmetic and logic instructions operate purely on registers. CISC processors are then termed "register-memory" or "memory-memory" designs.
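The distinction can be sketched concretely. On a load-store machine, adding a memory word into a register takes an explicit load, a register-only ALU operation, and an explicit store; a register-memory CISC could encode the whole update as one instruction. The addresses and register names below are purely illustrative:

```python
# Toy machine state: a word of memory and two registers.
mem = {0x100: 5}
regs = {"r1": 3, "r2": 0}

# Load-store (RISC) sequence for "add the word at 0x100 into r1
# and write it back" -- memory is touched only by the load and store:
regs["r2"] = mem[0x100]               # lw  r2, (0x100)   explicit load
regs["r1"] = regs["r1"] + regs["r2"]  # add r1, r1, r2    register-only ALU op
mem[0x100] = regs["r1"]               # sw  r1, (0x100)   explicit store

# A register-memory CISC might instead fuse the memory access and the
# arithmetic into a single instruction, e.g.:  add [0x100], r1
print(mem[0x100])  # → 8
```

The three-instruction form costs code size but keeps every instruction simple and uniformly timed, which is what makes aggressive pipelining straightforward.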
{{referências}}
 
== External links ==
* [http://cse.stanford.edu/class/sophomore-college/projects-00/risc/risccisc/ RISC vs. CISC]
* [http://cse.stanford.edu/class/sophomore-college/projects-00/risc/whatis/index.html What is RISC]
* [http://www.cpushack.net/CPU/cpuAppendA.html RISC vs. CISC from historical perspective]
 
{{Portal3|Tecnologias de informação}}