US20090132841A1 - Processor Accessing A Scratch Pad On-Demand To Reduce Power Consumption - Google Patents

Processor Accessing A Scratch Pad On-Demand To Reduce Power Consumption

Info

Publication number
US20090132841A1
Authority
US
United States
Prior art keywords
instruction
scratch pad
address
memory
processor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/357,929
Inventor
Matthias Knoth
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ARM Finance Overseas Ltd
Original Assignee
MIPS Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MIPS Technologies Inc
Priority to US12/357,929
Publication of US20090132841A1
Assigned to BRIDGE CROSSING, LLC: assignment of assignors interest (see document for details). Assignors: MIPS TECHNOLOGIES, INC.
Assigned to ARM FINANCE OVERSEAS LIMITED: assignment of assignors interest (see document for details). Assignors: BRIDGE CROSSING, LLC
Current legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3802 Instruction prefetching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/26 Power supply means, e.g. regulation thereof
    • G06F1/32 Means for saving power
    • G06F1/3203 Power management, i.e. event-based initiation of a power-saving mode
    • G06F1/3234 Power saving characterised by the action undertaken
    • G06F1/3243 Power saving in microcontroller unit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0875 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with dedicated cache, e.g. instruction or stack
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1027 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G06F12/1045 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache
    • G06F12/1054 Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache the data cache being concurrently physically addressed
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10 Providing a specific technical effect
    • G06F2212/1028 Power efficiency
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present invention relates generally to microprocessors and reducing power consumption in microprocessors.
  • An instruction fetch unit of a microprocessor is responsible for continually providing the next appropriate instruction to an execution unit of the microprocessor.
  • an instruction fetch unit computes a virtual address for the next instruction to be fetched, translates the virtual address to a physical address, retrieves an instruction corresponding to the physical address, and provides the instruction to the execution unit.
  • the instruction fetch unit may not be able to determine which instruction source to use to retrieve the desired instruction until the virtual address is translated into a physical address. Rather than waiting for the virtual address to be translated, a conventional instruction fetch unit may access all of the instruction sources simultaneously while the address is translated.
  • After the address translation is completed, a conventional instruction fetch unit will inspect the retrieved instructions to determine if the desired instruction was retrieved by one of the instruction sources. If none of the instruction sources has retrieved the desired instruction, a conventional instruction fetch unit uses the translated address to target the appropriate instruction source to retrieve the desired instruction.
  • What is needed is a microprocessor that can access a variety of instruction sources while consuming less power than a microprocessor having a conventional fetch unit.
  • the present invention provides processing systems, apparatuses, and methods for accessing a scratch pad on-demand to reduce power consumption.
  • an instruction fetch unit of a processor is configured to provide instructions from several instruction sources such as an instruction cache and a scratch pad to an execution unit of the processor.
  • When the scratch pad is enabled, the scratch pad is accessed to retrieve an instruction based on the virtual address.
  • In parallel with the scratch pad access, the MMU is accessed to translate the virtual address into a physical address. If the physical address is associated with the scratch pad, the instruction retrieved from the scratch pad is provided to the execution unit of the processor for execution. If the physical address is not associated with the scratch pad, the scratch pad is disabled to reduce power consumption and the instruction fetch unit re-initiates the instruction fetch so that the instruction can be retrieved from an instruction source other than the scratch pad.
  • In one embodiment, when the scratch pad is not enabled, another instruction source, such as the instruction cache, is accessed to retrieve an instruction based on the virtual address.
  • In parallel with the instruction retrieval, the MMU is accessed to translate the virtual address into a physical address. If the physical address is associated with the scratch pad, the scratch pad is enabled and the instruction fetch unit re-initiates the instruction fetch so that the instruction can be retrieved from the scratch pad. In one embodiment, if the physical address is not associated with the scratch pad, the instruction retrieved from the other instruction source is provided to the execution unit of the processor for execution.
  • In one embodiment, another instruction source, such as the instruction cache, is disabled to reduce power consumption when the scratch pad is enabled, and that instruction source is enabled when the scratch pad is disabled.
  • In one embodiment, components of a processor, such as the instruction cache and the scratch pad, are disabled to reduce power consumption by controlling the clock signal that is delivered to the component.
  • By maintaining the input clock signal at either a constant high or a constant low value, state registers in the component are suspended from latching new values and the logic blocks between the state registers are placed in a stable state. Once the components are placed in a stable state, the transistors in the state registers and the logic blocks are suspended from changing states and therefore do not consume power required to transition states.
  • In one embodiment, when a component is disabled to reduce power consumption, a bias voltage is applied to the component to further reduce power consumption resulting from leakage.
  • FIG. 1 is a diagram of a processor according to an embodiment of the present invention.
  • FIG. 2 is a more detailed diagram of the processor of FIG. 1 .
  • FIG. 3 is a flow chart illustrating the steps of a method embodiment of the present invention.
  • the present invention provides processing systems, apparatuses, and methods for accessing a scratch pad on-demand to reduce power consumption.
  • references to “one embodiment”, “an embodiment”, “an example embodiment”, etc. indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
  • FIG. 1 is a diagram of a processor 100 according to an embodiment of the present invention.
  • Processor 100 includes a processor core 110 , an instruction cache 102 , and a scratch pad 104 .
  • Processor core 110 includes an instruction fetch unit 120 and an execution unit 106 .
  • Processor 100 may access an external memory 108 . Instructions retrieved from external memory 108 can be cached in instruction cache 102 .
  • Instruction fetch unit 120 interfaces with instruction cache 102 , scratch pad 104 , execution unit 106 , and memory 108 through buses 112 , 114 , 116 , and 118 , respectively.
  • instruction sources such as instruction cache 102 and scratch pad 104 may also be placed within processor core 110 , within instruction fetch unit 120 , or external to processor 100 .
  • Memory 108 may be, for example, a level two cache, a main memory, a read-only memory (ROM) or another storage device that is capable of storing instructions.
  • FIG. 2 is a more detailed diagram of processor 100 according to one embodiment of the present invention.
  • instruction fetch unit 120 includes a fetch controller 200 , a multiplexer 208 , a comparator 210 , and an address register 220 .
  • Fetch controller 200 interfaces with multiplexer 208 , scratch pad 104 , instruction cache 102 , and execution unit 106 through buses 218 , 214 , 212 , and 216 , respectively.
  • Buses 204 , 214 , and 222 represent components of bus 114 .
  • Buses 202 , 212 , and 222 represent components of bus 112 .
  • Buses 206 and 216 represent components of bus 116 .
  • Register 220 stores a virtual address of an instruction to be fetched.
  • Fetch controller 200 updates register 220 via bus 226 with the address of the instruction to be fetched.
  • the virtual address stored in register 220 is made available to instruction cache 102 , scratch pad 104 , and a memory management unit (MMU) 224 through bus 222 .
  • Memory management unit (MMU) 224 translates a virtual address provided from register 220 to a physical address.
  • MMU 224 is implemented, for example, using a translation lookaside buffer (TLB).
  • MMU 224 may be placed within processor 100 , within processor core 110 , or within instruction fetch unit 120 .
  • An address, such as the virtual address stored in register 220, includes a tag and an offset.
  • the tag refers to a certain number of the most significant bits in an address.
  • the offset refers to the remaining bits in the address.
  • an instruction source such as scratch pad 104 and instruction cache 102 may be configured to guess and retrieve an instruction based solely on the offset of the virtual address.
  • When an instruction source is configured to retrieve an instruction based on the offset of the virtual address, the instruction source will provide an instruction as well as a tag of the physical address of the instruction. After the virtual address is translated, the tag of the instruction can be compared with the tag of the translated address to determine if the correct instruction was actually retrieved. If the guess was wrong, the instruction source can use the now known translated address to retrieve the correct instruction.
  • Scratch pad 104 is a memory preferably configured to provide instructions having a physical address with a tag specified in register 226 .
  • scratch pad 104 provides instructions for a single continuous range of physical addresses. The size of the range is the number of instructions that can be uniquely identified by the bits of the offset.
  • Scratch pad 104 may be enabled and disabled. When disabled, scratch pad 104 reduces power consumption.
  • scratch pad 104 retrieves an instruction based on the offset of the virtual address stored in register 220 in parallel with the address translation performed by MMU 224 . Once the translation is completed by MMU 224 , the tag in register 226 can be compared with the tag of the translated address to determine if the instruction retrieved by scratch pad 104 corresponds to the virtual address stored in register 220 .
  • Scratch pad 104 provides a retrieved instruction on bus 204 .
  • scratch pad 104 may be configured to provide instructions from two or more continuous ranges of physical addresses.
  • a separate tag register is provided to specify each range and the tags stored in each tag register are compared with the tag of the address translated by MMU 224 to determine if the virtual address stored in register 220 corresponds to one of the continuous ranges of physical addresses associated with scratch pad 104 .
  • Register 226 may be implemented, for example, as part of scratch pad 104 or as part of instruction fetch unit 120 . When register 226 is implemented as part of scratch pad 104 , the tag stored in register 226 is made available to comparator 210 even when scratch pad 104 is disabled. In one embodiment, the tag in register 226 may be changed programmatically.
  • When enabled, instruction cache 102 provides instructions not provided by scratch pad 104. Instruction cache 102 may be enabled and disabled. When disabled, instruction cache 102 reduces power consumption. When enabled, instruction cache 102 retrieves an instruction using the offset of the virtual address stored in register 220. In addition, instruction cache 102 retrieves a tag of the physical address associated with the instruction. The retrieval of the instruction is performed in parallel with the address translation performed by MMU 224. After MMU 224 completes the translation, the instruction's tag is compared with the tag of the translated address to determine if the retrieved instruction corresponds to the virtual address stored in register 220. Instruction cache 102 provides a retrieved instruction on bus 202.
  • Instruction cache 102 may be implemented, for example, as a direct mapped or a set-associative cache.
  • When the instruction cache is implemented as a set-associative cache, one or more bits in the offset of the virtual address stored in register 220 may be used as an index to select a set (or a way).
  • Comparator 210 determines whether the virtual address stored in register 220 corresponds to an instruction provided by scratch pad 104 .
  • the tag stored in register 226 is provided to comparator 210 on bus 230 .
  • After MMU 224 translates the virtual address stored in register 220, MMU 224 provides the tag of the translated address to comparator 210 on bus 228.
  • Comparator 210 compares the two tags to determine if they match. If they match, then the virtual address stored in register 220 corresponds to an instruction provided by scratch pad 104 .
  • the result of comparator 210 is provided to fetch controller 200 on bus 232 . Based on the result of the comparison, fetch controller 200 causes multiplexer 208 to select between an instruction provided by scratch pad 104 on bus 204 and an instruction provided by instruction cache 102 on bus 202 .
  • Because fetch controller 200 does not know whether the virtual address stored in register 220 corresponds with an instruction associated with scratch pad 104 or instruction cache 102 until after MMU 224 translates the virtual address, fetch controller 200 can access both scratch pad 104 and instruction cache 102 to retrieve instructions simultaneously. Once fetch controller 200 determines which instruction source should provide the instruction, fetch controller 200 can discard any incorrectly retrieved instructions. Although accessing scratch pad 104 and instruction cache 102 at the same time minimizes delay time, having both scratch pad 104 and instruction cache 102 enabled for every instruction fetch consumes a significant amount of the total power of processor 100.
  • Scratch pad 104 and instruction cache 102 are each likely to be utilized to provide a sequence of instructions at a time.
  • The present invention takes advantage of this observation in embodiments by enabling only one of scratch pad 104 or instruction cache 102 at any time. If scratch pad 104 is enabled to retrieve instructions and fetch controller 200 later determines, after the address translation by MMU 224, that the instruction should be retrieved from instruction cache 102, scratch pad 104 is disabled to reduce power consumption and the instruction fetch is re-started with instruction cache 102 enabled.
  • Similarly, if instruction cache 102 is enabled to retrieve instructions and fetch controller 200 later determines during the course of the instruction fetch that the instruction should be provided by scratch pad 104, instruction cache 102 is disabled to reduce power consumption and the instruction fetch is re-started with scratch pad 104 enabled.
  • enabling and disabling scratch pad 104 and instruction cache 102 will have minimal performance degradation since the amount of time spent to enable and disable scratch pad 104 and instruction cache 102 will be small compared to the amount of time spent providing instructions from scratch pad 104 and instruction cache 102 .
  • power savings are achieved.
  • Scratch pad 104 is not disabled if it is performing another function. For example, if instructions are being stored into scratch pad 104, scratch pad 104 will not be disabled until after the instructions are stored in scratch pad 104. Likewise, if instruction cache 102 is performing another function, instruction cache 102 will not be disabled until it has completed the function.
  • FIG. 3 depicts a flow chart illustrating the steps of a method 300 according to an embodiment of the present invention.
  • Method 300 is used to retrieve instructions by a processor having access to a scratch pad and an instruction cache. While method 300 can be implemented, for example, using a processor according to the present invention, such as processor 100 illustrated in FIGS. 1-2 , it is not limited to being implemented by processor 100 .
  • Method 300 begins with step 302 .
  • In step 302, a virtual address of an instruction to be fetched and provided to an execution unit of a processor is determined.
  • the virtual address may correspond, for example, to an instruction that can be provided by a scratch pad or an instruction cache of a processor.
  • an instruction fetch unit of the processor determines the virtual address of an instruction to be fetched by incrementing the virtual address of the previously fetched instruction or by using the target address of a jump or a branch instruction that was previously executed.
  • In step 304, the virtual address determined in step 302 is translated to generate a physical address.
  • In parallel with the translation, the instruction cache provides an instruction based on the virtual address.
  • a memory management unit performs the address translation.
  • In step 306, the physical address generated in step 304 is examined to determine if it is associated with an instruction that is provided by a scratch pad. For example, if the scratch pad provides instructions for a range of physical addresses associated with a single tag, the tag is compared with the tag of the physical address generated in step 304 to determine if they match. If the tags match, the physical address generated in step 304 is associated with the scratch pad.
  • If the physical address is associated with the scratch pad, method 300 proceeds to step 308. Otherwise, method 300 proceeds to step 328.
  • In step 308, the scratch pad is enabled unless it is already enabled.
  • the scratch pad may already be enabled, for example, to store instructions into the scratch pad.
  • In step 310, the instruction cache is disabled to reduce power consumption. Control proceeds to step 312.
  • In step 312, the fetch for an instruction corresponding to the virtual address determined in step 302 is re-performed. Since the scratch pad was enabled in step 308, the scratch pad retrieves an instruction based on the virtual address determined in step 302.
  • In step 314, the instruction retrieved from the scratch pad is provided to an execution unit of the processor for execution. Control proceeds to step 316.
  • In step 316, a virtual address of an instruction to be fetched and provided to the execution unit of the processor is determined, as in step 302. Control proceeds to step 318.
  • In step 318, the virtual address determined in step 316 is translated to generate a physical address.
  • In parallel with the translation, the scratch pad retrieves an instruction based on the virtual address.
  • In step 320, the physical address generated in step 318 is examined to determine if it is associated with an instruction that is provided by the scratch pad. If the physical address is associated with the scratch pad, method 300 proceeds to step 314. Otherwise, method 300 proceeds to step 322.
  • In step 322, the scratch pad is disabled to reduce power consumption unless the scratch pad must remain enabled for another purpose. For example, if instructions are being stored in the scratch pad, the scratch pad will be disabled at a later time when instructions are no longer being stored in the scratch pad.
  • In step 324, the instruction cache is enabled. Control proceeds to step 326.
  • In step 326, the fetch for an instruction corresponding to the virtual address determined in step 316 is re-performed. Since the instruction cache was enabled in step 324, the instruction cache retrieves an instruction based on the virtual address determined in step 316.
  • In step 328, if the physical address of the instruction retrieved from the instruction cache corresponds to the virtual address of the instruction to be fetched, the instruction retrieved from the instruction cache is provided to the execution unit of the processor for execution. Otherwise, the instruction cache utilizes the physical address that was generated by translating the virtual address to retrieve and provide the correct instruction to the execution unit.
  • the instruction cache may retrieve the correct instruction from an external memory.
  • a component of a processor such as an instruction cache, a scratch pad, etc. may be disabled to reduce power consumption in accordance with the present invention by controlling the input clock signal of the component.
  • state registers in the component are suspended from latching new values.
  • logic blocks between the state registers are kept in a stable state and the transistors in the logic blocks are suspended from changing states.
  • the transistors in the state registers and logic blocks of the component are suspended from changing states and therefore no power is required to change states. Only the power required to maintain a stable state is consumed.
  • a bias voltage is applied to the component to further reduce power consumption arising from leakage.
  • implementations may also be embodied in software (e.g., computer readable code, program code, instructions and/or data disposed in any form, such as source, object or machine language) disposed, for example, in a computer usable (e.g., readable) medium configured to store the software.
  • Such software can enable, for example, the function, fabrication, modeling, simulation, description, and/or testing of the apparatus and methods described herein.
  • this can be accomplished through the use of general programming languages (e.g., C, C++), GDSII databases, hardware description languages (HDL) including Verilog HDL, VHDL, SystemC, SystemC Register Transfer Level (RTL), and so on, or other available programs, databases, and/or circuit (i.e., schematic) capture tools.
  • Such software can be disposed in any known computer usable medium including semiconductor, magnetic disk, optical disk (e.g., CD-ROM, DVD-ROM, etc.) and as a computer data signal embodied in a computer usable (e.g., readable) transmission medium (e.g., carrier wave or any other medium including digital, optical, or analog-based medium).
  • the software can be transmitted over communication networks including the Internet and intranets.
  • The apparatus and method embodiments described herein may be included in a semiconductor intellectual property core, such as a microprocessor core (e.g., embodied in HDL) and transformed to hardware in the production of integrated circuits. Additionally, the apparatus and methods described herein may be embodied as a combination of hardware and software. Thus, the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.
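As a hedged illustration of the kind of software model mentioned above, the following Python sketch mimics the control flow of method 300: only one instruction source is enabled at a time, and a fetch is redone when the translated address points at the other source. The class name `FetchModel`, the 12-bit offset width, and the dictionary standing in for the MMU are assumptions made for this sketch and do not come from the patent. The case where the scratch pad must stay enabled for another purpose is not modeled.

```python
from dataclasses import dataclass, field

OFFSET_BITS = 12  # assumption: low 12 bits are the offset, the rest is the tag


@dataclass
class FetchModel:
    """Toy model of the on-demand fetch flow (hypothetical, for illustration)."""
    scratch_tag: int              # tag of the physical range served by the scratch pad
    page_table: dict              # virtual tag -> physical tag (stands in for the MMU)
    scratch_enabled: bool = False
    icache_enabled: bool = True
    refetches: int = 0            # number of fetches that had to be re-performed
    log: list = field(default_factory=list)

    def translate(self, vaddr: int) -> int:
        """Translate a virtual address by swapping its tag, keeping the offset."""
        vtag = vaddr >> OFFSET_BITS
        offset = vaddr & ((1 << OFFSET_BITS) - 1)
        return (self.page_table[vtag] << OFFSET_BITS) | offset

    def fetch(self, vaddr: int) -> str:
        """Return the name of the source that finally supplies the instruction."""
        # In hardware the translation runs in parallel with the source access.
        paddr = self.translate(vaddr)
        wants_scratch = (paddr >> OFFSET_BITS) == self.scratch_tag
        if wants_scratch != self.scratch_enabled:
            # The wrong source was enabled: swap the enables and redo the fetch.
            self.scratch_enabled = wants_scratch
            self.icache_enabled = not wants_scratch
            self.refetches += 1
        source = "scratch" if self.scratch_enabled else "icache"
        self.log.append(source)
        return source
```

For a fetch stream that stays in one source for long runs, `refetches` stays small relative to the number of fetches, which is the observation the power saving relies on.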

Abstract

The present invention provides processing systems, apparatuses, and methods that access a scratch pad on-demand to reduce power consumption. In an embodiment, an instruction fetch unit initiates an instruction fetch. When a scratch pad is enabled, an instruction is retrieved from the scratch pad in parallel with a translation of a virtual address to a physical address. If the physical address is associated with the scratch pad, the retrieved instruction is provided to an execution unit. Otherwise, the scratch pad is disabled to reduce power consumption and the instruction fetch is re-initiated. When the scratch pad is disabled, an instruction is retrieved from another instruction source, such as an instruction cache, in parallel with the translation of the virtual address to the physical address. If the physical address is associated with the scratch pad, the scratch pad is enabled and the instruction fetch is re-initiated.
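The test "the physical address is associated with the scratch pad" can be sketched as a tag comparison, following the tag/offset description given in the definitions above. This is a minimal sketch under stated assumptions: the 12-bit offset width and the function names `split_address` and `is_scratch_pad_address` are hypothetical, and a set of tags is used so that the multi-range variant (one tag register per range) is covered as well.

```python
from typing import Set, Tuple

OFFSET_BITS = 12  # assumption: the patent does not fix the tag/offset split


def split_address(addr: int) -> Tuple[int, int]:
    """Return (tag, offset): the tag is the most significant bits, the offset the rest."""
    return addr >> OFFSET_BITS, addr & ((1 << OFFSET_BITS) - 1)


def is_scratch_pad_address(paddr: int, scratch_tags: Set[int]) -> bool:
    """True if the tag of the translated address matches any scratch pad tag register."""
    tag, _offset = split_address(paddr)
    return tag in scratch_tags
```

With a single tag register the set holds one entry, and the scratch pad then covers exactly the addresses that the offset bits can distinguish (here 2**12 of them).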

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of co-pending U.S. application Ser. No. 11/272,737, filed on Nov. 15, 2005, entitled “Processor Accessing a Scratch Pad On-Demand to Reduce Power Consumption,” now allowed, which is incorporated herein by reference in its entirety. This application is also related to commonly owned, co-pending U.S. application Ser. No. 11/272,718, filed on Nov. 15, 2005, entitled “Processor Utilizing A Loop Buffer To Reduce Power Consumption,” and commonly owned, co-pending U.S. application Ser. No. 11/272,719, filed on Nov. 15, 2005, entitled “Microprocessor Having A Power-Saving Instruction Cache Way Predictor And Instruction Replacement Scheme,” each of which is incorporated herein by reference in its entirety.
  • FIELD OF THE INVENTION
  • The present invention relates generally to microprocessors and reducing power consumption in microprocessors.
  • BACKGROUND OF THE INVENTION
  • An instruction fetch unit of a microprocessor is responsible for continually providing the next appropriate instruction to an execution unit of the microprocessor. Generally, an instruction fetch unit computes a virtual address for the next instruction to be fetched, translates the virtual address to a physical address, retrieves an instruction corresponding to the physical address, and provides the instruction to the execution unit. When multiple instruction sources such as an instruction cache and scratch pad are available, the instruction fetch unit may not be able to determine which instruction source to use to retrieve the desired instruction until the virtual address is translated into a physical address. Rather than waiting for the virtual address to be translated, a conventional instruction fetch unit may access all of the instruction sources simultaneously while the address is translated. After the address translation is completed, a conventional instruction fetch unit will inspect the retrieved instructions to determine if the desired instruction was retrieved by one of the instruction sources. If none of the instruction sources has retrieved the desired instruction, a conventional instruction fetch unit uses the translated address to target the appropriate instruction source to retrieve the desired instruction.
  • Although accessing all the instruction sources simultaneously may reduce the time required to retrieve an instruction, it unnecessarily consumes a significant amount of the total power of a microprocessor. This makes microprocessors having conventional fetch units undesirable and/or impractical for many applications.
  • What is needed is a microprocessor that can access a variety of instruction sources while consuming less power than a microprocessor having a conventional fetch unit.
  • BRIEF SUMMARY OF THE INVENTION
  • The present invention provides processing systems, apparatuses, and methods for accessing a scratch pad on-demand to reduce power consumption.
  • In one embodiment, an instruction fetch unit of a processor is configured to provide instructions from several instruction sources such as an instruction cache and a scratch pad to an execution unit of the processor. When the scratch pad is enabled, the scratch pad is accessed to retrieve an instruction based on the virtual address. In parallel with the scratch pad access, the MMU is accessed to translate the virtual address into a physical address. If the physical address is associated with the scratch pad, the instruction retrieved from the scratch pad is provided to the execution unit of the processor for execution. If the physical address is not associated with the scratch pad, the scratch pad is disabled to reduce power consumption and the instruction fetch unit re-initiates the instruction fetch so that the instruction can be retrieved from an instruction source other than the scratch pad.
  • In one embodiment, when the scratch pad is not enabled, another instruction source, such as the instruction cache, is accessed to retrieve an instruction based on the virtual address. In parallel with the instruction retrieval, the MMU is accessed to translate the virtual address into a physical address. If the physical address is associated with the scratch pad, the scratch pad is enabled and the instruction fetch unit re-initiates the instruction fetch so that the instruction can be retrieved from the scratch pad. In one embodiment, if the physical address is not associated with the scratch pad, the instruction retrieved from the other instruction source is provided to the execution unit of the processor for execution.
  • In one embodiment, another instruction source, such as the instruction cache, is disabled to reduce power consumption when the scratch pad is enabled and the instruction source is enabled when the scratch pad is disabled.
  • In one embodiment, components of a processor, such as the instruction cache and the scratch pad are disabled to reduce power consumption by controlling the clock signal that is delivered to the component. By maintaining the input clock signal at either a constant high or a constant low value, state registers in the component are suspended from latching new values and the logic blocks between the state registers are placed in a stable state. Once the components are placed in a stable state, the transistors in the state registers and the logic blocks are suspended from changing states and therefore do not consume power required to transition states.
  • In one embodiment, when a component is disabled to reduce power consumption, a bias voltage is applied to the component to further reduce power consumption resulting from leakage. Further embodiments, features, and advantages of the present invention, as well as the structure and operation of the various embodiments of the present invention, are described in detail below with reference to the accompanying drawings.
  • BRIEF DESCRIPTION OF THE DRAWINGS/FIGURES
  • The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the pertinent art to make and use the invention.
  • FIG. 1 is a diagram of a processor according to an embodiment of the present invention.
  • FIG. 2 is a more detailed diagram of the processor of FIG. 1.
  • FIG. 3 is a flow chart illustrating the steps of a method embodiment of the present invention.
  • The present invention will be described with reference to the accompanying drawings. The drawing in which an element first appears is typically indicated by the leftmost digit(s) in the corresponding reference number.
  • DETAILED DESCRIPTION OF THE INVENTION
  • The present invention provides processing systems, apparatuses, and methods for accessing a scratch pad on-demand to reduce power consumption. In the detailed description of the invention that follows, references to “one embodiment”, “an embodiment”, “an example embodiment”, etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
  • FIG. 1 is a diagram of a processor 100 according to an embodiment of the present invention. Processor 100 includes a processor core 110, an instruction cache 102, and a scratch pad 104. Processor core 110 includes an instruction fetch unit 120 and an execution unit 106. Processor 100 may access an external memory 108. Instructions retrieved from external memory 108 can be cached in instruction cache 102. Instruction fetch unit 120 interfaces with instruction cache 102, scratch pad 104, execution unit 106, and memory 108 through buses 112, 114, 116, and 118, respectively. As would be appreciated by those skilled in the relevant arts, instruction sources such as instruction cache 102 and scratch pad 104 may also be placed within processor core 110, within instruction fetch unit 120, or external to processor 100. Memory 108 may be, for example, a level two cache, a main memory, a read-only memory (ROM) or another storage device that is capable of storing instructions.
  • FIG. 2 is a more detailed diagram of processor 100 according to one embodiment of the present invention. As shown in FIG. 2, instruction fetch unit 120 includes a fetch controller 200, a multiplexer 208, a comparator 210, and an address register 220. Fetch controller 200 interfaces with multiplexer 208, scratch pad 104, instruction cache 102, and execution unit 106 through buses 218, 214, 212, and 216, respectively. Buses 204, 214, and 222 represent components of bus 114. Buses 202, 212, and 222 represent components of bus 112. Buses 206 and 216 represent components of bus 116.
  • Register 220 stores a virtual address of an instruction to be fetched. Fetch controller 200 updates register 220 via bus 226 with the address of the instruction to be fetched. The virtual address stored in register 220 is made available to instruction cache 102, scratch pad 104, and a memory management unit (MMU) 224 through bus 222.
  • Memory management unit (MMU) 224 translates a virtual address provided from register 220 to a physical address. In one embodiment, MMU 224 is implemented, for example, using a translation lookaside buffer (TLB).
  • MMU 224 may be placed within processor 100, within processor core 110, or within instruction fetch unit 120.
  • An address, such as the virtual address stored in register 220, includes a tag and an offset. The tag refers to a certain number of the most significant bits in an address. The offset refers to the remaining bits in the address.
  • During address translation, only the bits in the tag of a virtual address are translated to generate a physical address. Hence, a virtual address and its corresponding physical address share the same bits for the offset. Since the bits in the offset of the physical address can be extracted from the virtual address prior to address translation, an instruction source such as scratch pad 104 and instruction cache 102 may be configured to guess and retrieve an instruction based solely on the offset of the virtual address.
  • When an instruction source is configured to retrieve an instruction based on the offset of the virtual address, the instruction source will provide an instruction as well as a tag of the physical address of the instruction. After the virtual address is translated, the tag of the instruction can be compared with the tag of the translated address to determine if the correct instruction was actually retrieved. If the guess was wrong, the instruction source can use the now known translated address to retrieve the correct instruction.
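  • By way of illustration only, the tag/offset split and the guess-then-verify retrieval described above can be sketched in Python. The 12-bit offset width, the contents of `PAGE_TABLE`, and the names `split_address`, `translate`, and `speculative_fetch` are assumptions made for this sketch and do not appear in the embodiments.

```python
# Illustrative sketch only. OFFSET_BITS and PAGE_TABLE are assumed values.
OFFSET_BITS = 12
OFFSET_MASK = (1 << OFFSET_BITS) - 1

# Hypothetical mapping of virtual tags to physical tags.
PAGE_TABLE = {0x00400: 0x1F000, 0x00401: 0x2A000}


def split_address(address):
    """Return (tag, offset); the tag is the most significant bits."""
    return address >> OFFSET_BITS, address & OFFSET_MASK


def translate(virtual_address, page_table):
    """Translate only the tag; the offset passes through unchanged."""
    virtual_tag, offset = split_address(virtual_address)
    physical_tag = page_table[virtual_tag]
    return (physical_tag << OFFSET_BITS) | offset


def speculative_fetch(virtual_address, source_lines, page_table):
    """Look up by offset alone, then verify the guess against the translated tag.

    source_lines maps an offset to a (physical_tag, instruction) pair.
    Returns (instruction, True) for a correct guess, (None, False) otherwise.
    """
    _, offset = split_address(virtual_address)
    guess = source_lines.get(offset)  # needs no translated bits
    physical_tag, _ = split_address(translate(virtual_address, page_table))
    if guess is not None and guess[0] == physical_tag:
        return guess[1], True
    return None, False
```

  • In hardware the offset lookup and the translation proceed in parallel; the sketch performs them one after the other only because Python executes sequentially.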
  • Scratch pad 104 is a memory preferably configured to provide instructions having a physical address with a tag specified in register 226. Hence, scratch pad 104 provides instructions for a single continuous range of physical addresses. The size of the range is the number of instructions that can be uniquely identified by the bits of the offset. Scratch pad 104 may be enabled and disabled. When disabled, scratch pad 104 reduces power consumption. When enabled, scratch pad 104 retrieves an instruction based on the offset of the virtual address stored in register 220 in parallel with the address translation performed by MMU 224. Once the translation is completed by MMU 224, the tag in register 226 can be compared with the tag of the translated address to determine if the instruction retrieved by scratch pad 104 corresponds to the virtual address stored in register 220. Scratch pad 104 provides a retrieved instruction on bus 204. In one embodiment, scratch pad 104 may be configured to provide instructions from two or more continuous ranges of physical addresses. In such an embodiment, a separate tag register is provided to specify each range and the tags stored in each tag register are compared with the tag of the address translated by MMU 224 to determine if the virtual address stored in register 220 corresponds to one of the continuous ranges of physical addresses associated with scratch pad 104.
  • Register 226 may be implemented, for example, as part of scratch pad 104 or as part of instruction fetch unit 120. When register 226 is implemented as part of scratch pad 104, the tag stored in register 226 is made available to comparator 210 even when scratch pad 104 is disabled. In one embodiment, the tag in register 226 may be changed programmatically.
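  • As a minimal sketch, assuming a 12-bit offset, the tag register or registers that define the physical address ranges served by the scratch pad might be modeled as follows. The class name `ScratchPadModel` and its attributes are hypothetical. The sketch shows that the tag comparison depends only on the register contents, so it remains available while the scratch pad is disabled, and that a tag may be changed programmatically.

```python
OFFSET_BITS = 12  # assumed offset width, for illustration only


class ScratchPadModel:
    """Hypothetical model of a scratch pad and its tag register(s)."""

    def __init__(self, tags):
        # One tag register per continuous range of physical addresses.
        self.tag_registers = list(tags)
        self.enabled = False

    def owns(self, physical_address):
        """True if the tag of the address matches any tag register.

        Reads only the tag registers, so it works while disabled.
        """
        return (physical_address >> OFFSET_BITS) in self.tag_registers

    def range_size(self):
        """Number of locations one tag register covers (2**OFFSET_BITS)."""
        return 1 << OFFSET_BITS
```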
  • When enabled, instruction cache 102 provides instructions not provided by scratch pad 104. Instruction cache 102 may be enabled and disabled. When disabled, instruction cache 102 reduces power consumption. When enabled, instruction cache 102 retrieves an instruction using the offset of the virtual address stored in register 220. In addition, instruction cache 102 retrieves a tag of the physical address associated with the instruction. The retrieval of the instruction is performed in parallel with the address translation performed by MMU 224. After MMU 224 completes the translation, the instruction's tag is compared with the tag of the translated address to determine if the retrieved instruction corresponds to the virtual address stored in register 220. Instruction cache 102 provides a retrieved instruction on bus 202.
  • Instruction cache 102 may be implemented, for example, as a direct mapped or a set-associative cache. When the instruction cache is implemented as a set-associative cache, one or more bits in the offset of the virtual address stored in register 220 may be used as an index to select a set (or a way).
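  • For illustration, one way an index might be taken from the offset bits is sketched below. The line size (32 bytes), the number of sets (128), and the name `set_index` are assumptions chosen so that the index falls entirely within an assumed 12-bit offset; they are not taken from the embodiments.

```python
LINE_BITS = 5   # assumed: 32-byte cache lines
INDEX_BITS = 7  # assumed: 128 sets; LINE_BITS + INDEX_BITS == 12 offset bits


def set_index(virtual_address):
    """Select a set using offset bits only, so no translation is needed."""
    return (virtual_address >> LINE_BITS) & ((1 << INDEX_BITS) - 1)
```

  • Because the index uses only offset bits, two virtual addresses that differ only in their tags select the same set.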
  • Comparator 210 determines whether the virtual address stored in register 220 corresponds to an instruction provided by scratch pad 104. The tag stored in register 226 is provided to comparator 210 on bus 230. After MMU 224 translates the virtual address stored in register 220, MMU 224 provides the tag of the translated address to comparator 210 on bus 228. Comparator 210 compares the two tags to determine if they match. If they match, then the virtual address stored in register 220 corresponds to an instruction provided by scratch pad 104. The result of comparator 210 is provided to fetch controller 200 on bus 232. Based on the result of the comparison, fetch controller 200 causes multiplexer 208 to select between an instruction provided by scratch pad 104 on bus 204 and an instruction provided by instruction cache 102 on bus 202.
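  • The roles of comparator 210 and multiplexer 208 can be summarized in a short behavioral sketch. The function name `select_instruction` and its parameters are hypothetical; the sketch models only the selection logic, not bus timing.

```python
def select_instruction(scratch_pad_tag, translated_tag,
                       scratch_pad_instruction, cache_instruction):
    """Behavioral model of the compare-and-select step.

    The equality test stands in for comparator 210; the conditional
    expression stands in for multiplexer 208 under fetch controller 200.
    """
    tags_match = scratch_pad_tag == translated_tag
    return scratch_pad_instruction if tags_match else cache_instruction
```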
  • Because fetch controller 200 does not know whether the virtual address stored in register 220 corresponds with an instruction associated with scratch pad 104 or instruction cache 102 until after MMU 224 translates the virtual address, fetch controller 200 can access both scratch pad 104 and instruction cache 102 to retrieve instructions simultaneously. Once fetch controller 200 determines which instruction source should provide the instruction, fetch controller 200 can discard any incorrectly retrieved instructions. Although accessing scratch pad 104 and instruction cache 102 at the same time minimizes delay time, having both scratch pad 104 and instruction cache 102 enabled for every instruction fetch consumes a significant amount of the total power of processor 100.
  • Instructions of a program tend to exhibit spatial and temporal locality; thus, scratch pad 104 and instruction cache 102 are each likely to be utilized to provide a sequence of instructions at a time. The present invention, as described herein, takes advantage of this observation in embodiments by enabling only one of scratch pad 104 or instruction cache 102 at any time. If scratch pad 104 is enabled to retrieve instructions and fetch controller 200 later determines, after the address translation by MMU 224, that the instruction should be retrieved from instruction cache 102, scratch pad 104 is disabled to reduce power consumption and the instruction fetch is re-started with instruction cache 102 enabled. Similarly, if instruction cache 102 is enabled to retrieve instructions and fetch controller 200 later determines during the course of the instruction fetch that the instruction should be provided by scratch pad 104, instruction cache 102 is disabled to reduce power consumption and the instruction fetch is re-started with scratch pad 104 enabled.
  • For programs that tend to retrieve instructions from scratch pad 104 and instruction cache 102 in bursts, enabling and disabling scratch pad 104 and instruction cache 102 will have minimal performance degradation since the amount of time spent to enable and disable scratch pad 104 and instruction cache 102 will be small compared to the amount of time spent providing instructions from scratch pad 104 and instruction cache 102. By disabling scratch pad 104 and instruction cache 102 in the manner described above, power savings are achieved.
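  • The effect of bursts can be made concrete with a rough counting model. The sketch below is an assumption-laden illustration, not a measurement: it treats every fetch as one access to whichever source is enabled, charges one extra access for each re-started fetch, and compares the total with a conventional unit that accesses both sources on every fetch. The function name `count_accesses` and the trace encoding are hypothetical.

```python
def count_accesses(trace, initially_enabled="C"):
    """Count source accesses for a trace of 'S' (scratch pad) / 'C' (cache).

    Returns (accesses, restarts, conventional_accesses), where
    conventional_accesses assumes both sources are accessed on every fetch.
    """
    enabled = initially_enabled
    accesses = 0
    restarts = 0
    for needed in trace:
        accesses += 1            # access to the currently enabled source
        if needed != enabled:    # wrong source was enabled
            enabled = needed     # swap which source is enabled
            restarts += 1
            accesses += 1        # re-performed fetch from the right source
    return accesses, restarts, 2 * len(trace)
```

  • Under these assumptions, a bursty trace such as "CCCCSSSSCCCC" needs 14 accesses and 2 re-started fetches against 24 accesses for the conventional approach, whereas a trace that alternates on every fetch gains nothing.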
  • Although the present invention attempts to disable scratch pad 104 when it is not providing instructions, scratch pad 104 is not disabled if it is performing another function. For example, if instructions are being stored into scratch pad 104, scratch pad 104 will not be disabled until after the instructions are stored in scratch pad 104. Likewise, if instruction cache 102 is performing another function, instruction cache 102 will not be disabled until it has completed the function.
  • FIG. 3 depicts a flow chart illustrating the steps of a method 300 according to an embodiment of the present invention. Method 300 is used to retrieve instructions by a processor having access to a scratch pad and an instruction cache. While method 300 can be implemented, for example, using a processor according to the present invention, such as processor 100 illustrated in FIGS. 1-2, it is not limited to being implemented by processor 100. Method 300 begins with step 302.
  • In step 302, a virtual address of an instruction to be fetched and provided to an execution unit of a processor is determined. The virtual address may correspond, for example, to an instruction that can be provided by a scratch pad or an instruction cache of a processor. In one embodiment, an instruction fetch unit of the processor determines the virtual address of an instruction to be fetched by incrementing the virtual address of the previously fetched instruction or by using the target address of a jump or a branch instruction that was previously executed.
  • In step 304, the virtual address determined in step 302 is translated to generate a physical address. In parallel with the address translation, the instruction cache provides an instruction based on the virtual address. In one embodiment, a memory management unit performs the address translation.
  • In step 306, the physical address generated in step 304 is examined to determine if it is associated with an instruction that is provided by a scratch pad. For example, if the scratch pad provides instructions for a range of physical addresses associated with a single tag, the tag is compared with the tag of the physical address generated in step 304 to determine if they match. If the tags match, the physical address generated in step 304 is associated with the scratch pad.
  • If the physical address is associated with the scratch pad, method 300 proceeds to step 308. Otherwise, method 300 proceeds to step 328.
  • In step 308, the scratch pad is enabled unless it is already enabled. The scratch pad may already be enabled, for example, to store instructions into the scratch pad.
  • In step 310, the instruction cache is disabled to reduce power consumption. Control proceeds to step 312.
  • In step 312, the fetch for an instruction corresponding to the virtual address determined in step 302 is re-performed. Since the scratch pad was enabled in step 308, the scratch pad retrieves an instruction based on the virtual address determined in step 302.
  • In step 314, the instruction retrieved from the scratch pad is provided to an execution unit of the processor for execution. Control proceeds to step 316.
  • In step 316, a virtual address of an instruction to be fetched and provided to the execution unit of the processor is determined, as in step 302. Control proceeds to step 318.
  • In step 318, the virtual address determined in step 316 is translated to generate a physical address. In parallel with the address translation, the scratch pad retrieves an instruction based on the virtual address.
  • In step 320, the physical address generated in step 318 is examined to determine if it is associated with an instruction that is provided by the scratch pad. If the physical address is associated with the scratch pad, method 300 proceeds to step 314. Otherwise, method 300 proceeds to step 322.
  • In step 322, the scratch pad is disabled to reduce power consumption unless the scratch pad must remain enabled for another purpose. For example, if instructions are being stored in the scratch pad, the scratch pad will be disabled at a later time when instructions are no longer being stored in the scratch pad.
  • In step 324, the instruction cache is enabled. Control proceeds to step 326.
  • In step 326, the fetch for an instruction corresponding to the virtual address determined in step 316 is re-performed. Since the instruction cache was enabled in step 324, the instruction cache retrieves an instruction based on the virtual address determined in step 316.
  • In step 328, if the physical address of the instruction retrieved from the instruction cache corresponds to the virtual address of the instruction to be fetched, the instruction retrieved from the instruction cache is provided to the execution unit of the processor for execution. Otherwise, the instruction cache utilizes the physical address that was generated by translating the virtual address to retrieve and provide the correct instruction to the execution unit. The instruction cache, for example, may retrieve the correct instruction from an external memory. After step 328, method 300 proceeds to step 302.
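  • The steps of method 300 can be sketched as a small state machine. This is a simplified illustration under stated assumptions: the `translate` callable stands in for the memory management unit, the offset is assumed to be 12 bits wide, control is assumed to pass from step 326 to step 328, and the exception in step 322 (keeping the scratch pad enabled for another purpose) is omitted. The function name `fetch` and the `state` dictionary are hypothetical.

```python
def fetch(virtual_address, state, translate, scratch_pad_tags, offset_bits=12):
    """One instruction fetch following method 300.

    state holds the booleans 'scratch_pad_enabled' and 'cache_enabled'
    and is updated in place. Returns (source, restarted).
    """
    physical_address = translate(virtual_address)        # steps 304 / 318
    in_scratch_pad = (physical_address >> offset_bits) in scratch_pad_tags

    if state["scratch_pad_enabled"]:
        if in_scratch_pad:                                # step 320
            return "scratch_pad", False                   # step 314
        state["scratch_pad_enabled"] = False              # step 322
        state["cache_enabled"] = True                     # step 324
        return "instruction_cache", True                  # steps 326, 328

    if not in_scratch_pad:                                # step 306
        return "instruction_cache", False                 # step 328
    state["scratch_pad_enabled"] = True                   # step 308
    state["cache_enabled"] = False                        # step 310
    return "scratch_pad", True                            # steps 312, 314
```

  • The second element of the result indicates whether the fetch had to be re-performed, which happens only when the address moves from one instruction source to the other.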
  • As described herein, a component of a processor such as an instruction cache, a scratch pad, etc. may be disabled to reduce power consumption in accordance with the present invention by controlling the input clock signal of the component. By controlling the input clock signal so that the clock is maintained at a constant high or a constant low value, state registers in the component are suspended from latching new values. As a result, logic blocks between the state registers are kept in a stable state and the transistors in the logic blocks are suspended from changing states. Hence, when the input clock signal is controlled, the transistors in the state registers and logic blocks of the component are suspended from changing states and therefore no power is required to change states. Only the power required to maintain a stable state is consumed. In one embodiment, when a component is disabled to reduce power consumption, a bias voltage is applied to the component to further reduce power consumption arising from leakage.
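  • The reasoning above is consistent with the standard first-order model of CMOS power, which is offered here as background and is not part of the described embodiments. In this model, $\alpha$ is the switching activity factor, $C$ the switched capacitance, $V_{DD}$ the supply voltage, $f$ the clock frequency, and $I_{\text{leak}}$ the leakage current.

```latex
P_{\text{dynamic}} = \alpha \, C \, V_{DD}^{2} \, f
\qquad
P_{\text{leakage}} = V_{DD} \, I_{\text{leak}}
```

  • Holding the input clock at a constant value drives $\alpha$ toward zero for the disabled component, so $P_{\text{dynamic}}$ approaches zero and only $P_{\text{leakage}}$ remains; applying a bias voltage is one way to lower $I_{\text{leak}}$ further.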
  • While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example, and not limitation. It will be apparent to persons skilled in the relevant computer arts that various changes in form and detail can be made therein without departing from the spirit and scope of the invention. Furthermore, it should be appreciated that the detailed description of the present invention provided herein, and not the summary and abstract sections, is intended to be used to interpret the claims. The summary and abstract sections may set forth one or more but not all exemplary embodiments of the present invention as contemplated by the inventors.
  • For example, in addition to implementations using hardware (e.g., within or coupled to a Central Processing Unit (“CPU”), microprocessor, microcontroller, digital signal processor, processor core, System on Chip (“SOC”), or any other programmable or electronic device), implementations may also be embodied in software (e.g., computer readable code, program code, instructions and/or data disposed in any form, such as source, object or machine language) disposed, for example, in a computer usable (e.g., readable) medium configured to store the software. Such software can enable, for example, the function, fabrication, modeling, simulation, description, and/or testing of the apparatus and methods described herein. For example, this can be accomplished through the use of general programming languages (e.g., C, C++), GDSII databases, hardware description languages (HDL) including Verilog HDL, VHDL, SystemC, SystemC Register Transfer Level (RTL), and so on, or other available programs, databases, and/or circuit (i.e., schematic) capture tools. Such software can be disposed in any known computer usable medium including semiconductor, magnetic disk, optical disk (e.g., CD-ROM, DVD-ROM, etc.) and as a computer data signal embodied in a computer usable (e.g., readable) transmission medium (e.g., carrier wave or any other medium including digital, optical, or analog-based medium). As such, the software can be transmitted over communication networks including the Internet and intranets.
  • It is understood that the apparatus and method embodiments described herein may be included in a semiconductor intellectual property core, such as a microprocessor core (e.g., embodied in HDL) and transformed to hardware in the production of integrated circuits. Additionally, the apparatus and methods described herein may be embodied as a combination of hardware and software. Thus, the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.

Claims (20)

1. A system comprising:
a processor having a processor core, a fetch unit and a register for storing a portion of an address for an instruction to be fetched;
a first memory source for storing instructions and a scratch pad memory for storing instructions both of which couple to the processor by a bus, wherein the instruction is made available to the processor from the scratch pad memory when the portion of an address stored in the register matches a translated virtual instruction address provided by the fetch unit.
2. The system of claim 1 wherein the first memory source is an instruction cache.
3. The system of claim 1 wherein the first memory source is a level one instruction cache.
4. The system of claim 1 wherein the first memory source is a level two cache.
5. The system of claim 1 wherein the first memory source is disabled to reduce power consumption.
6. The system of claim 1 wherein the scratch pad memory is disabled to reduce power consumption if the instruction is not made available by the scratch pad memory.
7. The system of claim 1 wherein the first memory source is enabled and the scratch pad memory is disabled to reduce power consumption if the instruction is not made available by the scratch pad memory.
8. The system of claim 1 wherein the portion of an address for an instruction comprises a tag of the translated physical address of the instruction.
9. The system of claim 8 wherein the instruction is selected from scratch pad memory based on the offset of the virtual instruction address.
10. A method of performing an instruction fetch associated with a virtual address in a processor having a scratch pad memory for storing instructions and a first memory system for storing instructions, comprising:
making the instruction available to the processor from the scratch pad memory when the portion of an address stored in a register matches a translated virtual instruction address provided by a fetch unit.
11. The method of claim 10 wherein the first memory source is an instruction cache.
12. The method of claim 10 wherein the first memory source is a level one instruction cache.
13. The method of claim 10 wherein the first memory source is a level two cache.
14. The method of claim 10 wherein the first memory source is disabled to reduce power consumption.
15. The method of claim 10 wherein the scratch pad memory is disabled to reduce power consumption if the instruction is not made available by the scratch pad memory.
16. The method of claim 10 wherein the first memory source is enabled and the scratch pad memory is disabled to reduce power consumption if the instruction is not made available by the scratch pad memory.
17. The method of claim 10 wherein the portion of the address for an instruction comprises a tag of the translated physical address of the instruction.
18. The method of claim 10 wherein the instruction is selected from scratch pad memory based on the offset of the virtual instruction address.
19. A computer program product for use with a computing device, the computer program product comprising:
a tangible computer usable medium, having computer readable program code embodied thereon for providing a processor, the computer readable program code comprising:
first computer readable program code for providing a fetch unit,
second computer readable program code for providing a register for storing a portion of an address for an instruction to be fetched, coupled to the fetch unit,
third computer readable program code for providing a first memory source for storing instructions, coupled to the fetch unit, and
fourth computer readable program code for providing a scratch pad memory for storing instructions, coupled to the fetch unit,
wherein the instruction is made available to the processor from the scratch pad memory when the portion of an address stored in the register matches a translated virtual instruction address provided by the fetch unit.
20. The computer program product of claim 19, wherein the scratch pad memory is disabled to reduce power consumption if the instruction is not made available by the scratch pad memory.
US12/357,929 2005-11-15 2009-01-22 Processor Accessing A Scratch Pad On-Demand To Reduce Power Consumption Abandoned US20090132841A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/357,929 US20090132841A1 (en) 2005-11-15 2009-01-22 Processor Accessing A Scratch Pad On-Demand To Reduce Power Consumption

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/272,737 US7496771B2 (en) 2005-11-15 2005-11-15 Processor accessing a scratch pad on-demand to reduce power consumption
US12/357,929 US20090132841A1 (en) 2005-11-15 2009-01-22 Processor Accessing A Scratch Pad On-Demand To Reduce Power Consumption

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US11/272,737 Continuation US7496771B2 (en) 2005-11-15 2005-11-15 Processor accessing a scratch pad on-demand to reduce power consumption

Publications (1)

Publication Number Publication Date
US20090132841A1 true US20090132841A1 (en) 2009-05-21

Family

ID=38042305

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/272,737 Active 2026-12-30 US7496771B2 (en) 2005-11-15 2005-11-15 Processor accessing a scratch pad on-demand to reduce power consumption
US12/357,929 Abandoned US20090132841A1 (en) 2005-11-15 2009-01-22 Processor Accessing A Scratch Pad On-Demand To Reduce Power Consumption

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US11/272,737 Active 2026-12-30 US7496771B2 (en) 2005-11-15 2005-11-15 Processor accessing a scratch pad on-demand to reduce power consumption

Country Status (1)

Country Link
US (2) US7496771B2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070113057A1 (en) * 2005-11-15 2007-05-17 Mips Technologies, Inc. Processor utilizing a loop buffer to reduce power consumption
US20090198900A1 (en) * 2005-11-15 2009-08-06 Matthias Knoth Microprocessor Having a Power-Saving Instruction Cache Way Predictor and Instruction Replacement Scheme

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7496771B2 (en) * 2005-11-15 2009-02-24 Mips Technologies, Inc. Processor accessing a scratch pad on-demand to reduce power consumption
KR100737919B1 (en) * 2006-02-28 2007-07-10 삼성전자주식회사 Program method of nand flash memory and program method of memory system
US8069354B2 (en) 2007-08-14 2011-11-29 Mips Technologies, Inc. Power management for system having one or more integrated circuits
US20120079303A1 (en) * 2010-09-24 2012-03-29 Madduri Venkateswara R Method and apparatus for reducing power consumption in a processor by powering down an instruction fetch unit
US8762644B2 (en) * 2010-10-15 2014-06-24 Qualcomm Incorporated Low-power audio decoding and playback using cached images
US11048636B2 (en) * 2019-07-31 2021-06-29 Micron Technology, Inc. Cache with set associativity having data defined cache sets
US11194582B2 (en) 2019-07-31 2021-12-07 Micron Technology, Inc. Cache systems for main and speculative threads of processors
US11200166B2 (en) 2019-07-31 2021-12-14 Micron Technology, Inc. Data defined caches for speculative and normal executions
US11010288B2 (en) 2019-07-31 2021-05-18 Micron Technology, Inc. Spare cache set to accelerate speculative execution, wherein the spare cache set, allocated when transitioning from non-speculative execution to speculative execution, is reserved during previous transitioning from the non-speculative execution to the speculative execution

Citations (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5091851A (en) * 1989-07-19 1992-02-25 Hewlett-Packard Company Fast multiple-word accesses from a multi-way set-associative cache memory
US5325511A (en) * 1990-06-15 1994-06-28 Compaq Computer Corp. True least recently used replacement method and apparatus
US5493667A (en) * 1993-02-09 1996-02-20 Intel Corporation Apparatus and method for an instruction cache locking scheme
US5568442A (en) * 1993-05-17 1996-10-22 Silicon Graphics, Inc. RISC processor having improved instruction fetching capability and utilizing address bit predecoding for a segmented cache memory
US5734881A (en) * 1995-12-15 1998-03-31 Cyrix Corporation Detecting short branches in a prefetch buffer using target location information in a branch target cache
US5761715A (en) * 1995-08-09 1998-06-02 Kabushiki Kaisha Toshiba Information processing device and cache memory with adjustable number of ways to reduce power consumption based on cache miss ratio
US5764999A (en) * 1995-10-10 1998-06-09 Cyrix Corporation Enhanced system management mode with nesting
US5809326A (en) * 1995-09-25 1998-09-15 Kabushiki Kaisha Toshiba Signal processor and method of operating a signal processor
US5822760A (en) * 1994-01-31 1998-10-13 Fujitsu Limited Cache-memory system having multidimensional cache
US5848433A (en) * 1995-04-12 1998-12-08 Advanced Micro Devices Way prediction unit and a method for operating the same
US5848014A (en) * 1997-06-12 1998-12-08 Cypress Semiconductor Corp. Semiconductor device such as a static random access memory (SRAM) having a low power mode using a clock disable circuit
US5901322A (en) * 1995-06-22 1999-05-04 National Semiconductor Corporation Method and apparatus for dynamic control of clocks in a multiple clock processor, particularly for a data cache
US5966734A (en) * 1996-10-18 1999-10-12 Samsung Electronics Co., Ltd. Resizable and relocatable memory scratch pad as a cache slice
US5986969A (en) * 1997-07-25 1999-11-16 Lucent Technologies, Inc. Power savings for memory arrays
US6044478A (en) * 1997-05-30 2000-03-28 National Semiconductor Corporation Cache with finely granular locked-down regions
US6076159A (en) * 1997-09-12 2000-06-13 Siemens Aktiengesellschaft Execution of a loop instructing in a loop pipeline after detection of a first occurrence of the loop instruction in an integer pipeline
US6085315A (en) * 1997-09-12 2000-07-04 Siemens Aktiengesellschaft Data processing device with loop pipeline
US6167536A (en) * 1997-04-08 2000-12-26 Advanced Micro Devices, Inc. Trace cache for a microprocessor-based device
US6185657B1 (en) * 1998-04-20 2001-02-06 Motorola Inc. Multi-way cache apparatus and method
US6412057B1 (en) * 1999-02-08 2002-06-25 Kabushiki Kaisha Toshiba Microprocessor with virtual-to-physical address translation using flags
US20020087900A1 (en) * 2000-12-29 2002-07-04 Homewood Mark Owen System and method for reducing power consumption in a data processor having a clustered architecture
US6430655B1 (en) * 2000-01-31 2002-08-06 Mips Technologies, Inc. Scratchpad RAM memory accessible in parallel to a primary cache
US6477639B1 (en) * 1999-10-01 2002-11-05 Hitachi, Ltd. Branch instruction mechanism for processor
US20020188834A1 (en) * 2001-05-04 2002-12-12 Ip First Llc Apparatus and method for target address replacement in speculative branch target address cache
US6505285B1 (en) * 2000-06-26 2003-01-07 Ncr Corporation Scratch segment subsystem for a parallel processing database system
US6546477B1 (en) * 1999-09-20 2003-04-08 Texas Instruments Incorporated Memory management in embedded systems with dynamic object instantiation
US20030074546A1 (en) * 1997-02-17 2003-04-17 Hitachi, Ltd. Data processing apparatus
US6557127B1 (en) * 2000-02-28 2003-04-29 Cadence Design Systems, Inc. Method and apparatus for testing multi-port memories
US20040024968A1 (en) * 2002-07-30 2004-02-05 Lesartre Gregg B. Method and apparatus for saving microprocessor power when sequentially accessing the microprocessor's instruction cache
US6757817B1 (en) * 2000-05-19 2004-06-29 Intel Corporation Apparatus having a cache and a loop buffer
US20040193858A1 (en) * 2003-03-24 2004-09-30 Infineon Technologies North America Corp. Zero-overhead loop operation in microprocessor having instruction buffer
US6836833B1 (en) * 2002-10-22 2004-12-28 Mips Technologies, Inc. Apparatus and method for discovering a scratch pad memory configuration
US20050044429A1 (en) * 2003-08-22 2005-02-24 Ip-First Llc Resource utilization mechanism for microprocessor power management
US20050114600A1 (en) * 2003-11-25 2005-05-26 International Business Machines Corporation Reducing bus width by data compaction
US20050246499A1 (en) * 2004-04-30 2005-11-03 Nec Corporation Cache memory with the number of operated ways being changed according to access pattern
US20070113057A1 (en) * 2005-11-15 2007-05-17 Mips Technologies, Inc. Processor utilizing a loop buffer to reduce power consumption
US20070113013A1 (en) * 2005-11-15 2007-05-17 Mips Technologies, Inc. Microprocessor having a power-saving instruction cache way predictor and instruction replacement scheme
US7496771B2 (en) * 2005-11-15 2009-02-24 Mips Technologies, Inc. Processor accessing a scratch pad on-demand to reduce power consumption

Patent Citations (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5091851A (en) * 1989-07-19 1992-02-25 Hewlett-Packard Company Fast multiple-word accesses from a multi-way set-associative cache memory
US5325511A (en) * 1990-06-15 1994-06-28 Compaq Computer Corp. True least recently used replacement method and apparatus
US5493667A (en) * 1993-02-09 1996-02-20 Intel Corporation Apparatus and method for an instruction cache locking scheme
US5568442A (en) * 1993-05-17 1996-10-22 Silicon Graphics, Inc. RISC processor having improved instruction fetching capability and utilizing address bit predecoding for a segmented cache memory
US5822760A (en) * 1994-01-31 1998-10-13 Fujitsu Limited Cache-memory system having multidimensional cache
US5848433A (en) * 1995-04-12 1998-12-08 Advanced Micro Devices Way prediction unit and a method for operating the same
US5901322A (en) * 1995-06-22 1999-05-04 National Semiconductor Corporation Method and apparatus for dynamic control of clocks in a multiple clock processor, particularly for a data cache
US5761715A (en) * 1995-08-09 1998-06-02 Kabushiki Kaisha Toshiba Information processing device and cache memory with adjustable number of ways to reduce power consumption based on cache miss ratio
US5809326A (en) * 1995-09-25 1998-09-15 Kabushiki Kaisha Toshiba Signal processor and method of operating a signal processor
US5764999A (en) * 1995-10-10 1998-06-09 Cyrix Corporation Enhanced system management mode with nesting
US5734881A (en) * 1995-12-15 1998-03-31 Cyrix Corporation Detecting short branches in a prefetch buffer using target location information in a branch target cache
US5966734A (en) * 1996-10-18 1999-10-12 Samsung Electronics Co., Ltd. Resizable and relocatable memory scratch pad as a cache slice
US20030074546A1 (en) * 1997-02-17 2003-04-17 Hitachi, Ltd. Data processing apparatus
US6167536A (en) * 1997-04-08 2000-12-26 Advanced Micro Devices, Inc. Trace cache for a microprocessor-based device
US6044478A (en) * 1997-05-30 2000-03-28 National Semiconductor Corporation Cache with finely granular locked-down regions
US5848014A (en) * 1997-06-12 1998-12-08 Cypress Semiconductor Corp. Semiconductor device such as a static random access memory (SRAM) having a low power mode using a clock disable circuit
US5986969A (en) * 1997-07-25 1999-11-16 Lucent Technologies, Inc. Power savings for memory arrays
US6076159A (en) * 1997-09-12 2000-06-13 Siemens Aktiengesellschaft Execution of a loop instructing in a loop pipeline after detection of a first occurrence of the loop instruction in an integer pipeline
US6085315A (en) * 1997-09-12 2000-07-04 Siemens Aktiengesellschaft Data processing device with loop pipeline
US6185657B1 (en) * 1998-04-20 2001-02-06 Motorola Inc. Multi-way cache apparatus and method
US6412057B1 (en) * 1999-02-08 2002-06-25 Kabushiki Kaisha Toshiba Microprocessor with virtual-to-physical address translation using flags
US6546477B1 (en) * 1999-09-20 2003-04-08 Texas Instruments Incorporated Memory management in embedded systems with dynamic object instantiation
US6477639B1 (en) * 1999-10-01 2002-11-05 Hitachi, Ltd. Branch instruction mechanism for processor
US6430655B1 (en) * 2000-01-31 2002-08-06 Mips Technologies, Inc. Scratchpad RAM memory accessible in parallel to a primary cache
US6557127B1 (en) * 2000-02-28 2003-04-29 Cadence Design Systems, Inc. Method and apparatus for testing multi-port memories
US6757817B1 (en) * 2000-05-19 2004-06-29 Intel Corporation Apparatus having a cache and a loop buffer
US6505285B1 (en) * 2000-06-26 2003-01-07 Ncr Corporation Scratch segment subsystem for a parallel processing database system
US20020087900A1 (en) * 2000-12-29 2002-07-04 Homewood Mark Owen System and method for reducing power consumption in a data processor having a clustered architecture
US20020188834A1 (en) * 2001-05-04 2002-12-12 Ip First Llc Apparatus and method for target address replacement in speculative branch target address cache
US20040024968A1 (en) * 2002-07-30 2004-02-05 Lesartre Gregg B. Method and apparatus for saving microprocessor power when sequentially accessing the microprocessor's instruction cache
US6836833B1 (en) * 2002-10-22 2004-12-28 Mips Technologies, Inc. Apparatus and method for discovering a scratch pad memory configuration
US20050102483A1 (en) * 2002-10-22 2005-05-12 Kinter Ryan C. Apparatus and method for discovering a scratch pad memory configuration
US20040193858A1 (en) * 2003-03-24 2004-09-30 Infineon Technologies North America Corp. Zero-overhead loop operation in microprocessor having instruction buffer
US20050044429A1 (en) * 2003-08-22 2005-02-24 Ip-First Llc Resource utilization mechanism for microprocessor power management
US20050114600A1 (en) * 2003-11-25 2005-05-26 International Business Machines Corporation Reducing bus width by data compaction
US20050246499A1 (en) * 2004-04-30 2005-11-03 Nec Corporation Cache memory with the number of operated ways being changed according to access pattern
US20070113057A1 (en) * 2005-11-15 2007-05-17 Mips Technologies, Inc. Processor utilizing a loop buffer to reduce power consumption
US20070113013A1 (en) * 2005-11-15 2007-05-17 Mips Technologies, Inc. Microprocessor having a power-saving instruction cache way predictor and instruction replacement scheme
US7496771B2 (en) * 2005-11-15 2009-02-24 Mips Technologies, Inc. Processor accessing a scratch pad on-demand to reduce power consumption
US7562191B2 (en) * 2005-11-15 2009-07-14 Mips Technologies, Inc. Microprocessor having a power-saving instruction cache way predictor and instruction replacement scheme
US20090198900A1 (en) * 2005-11-15 2009-08-06 Matthias Knoth Microprocessor Having a Power-Saving Instruction Cache Way Predictor and Instruction Replacement Scheme

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070113057A1 (en) * 2005-11-15 2007-05-17 Mips Technologies, Inc. Processor utilizing a loop buffer to reduce power consumption
US20090198900A1 (en) * 2005-11-15 2009-08-06 Matthias Knoth Microprocessor Having a Power-Saving Instruction Cache Way Predictor and Instruction Replacement Scheme
US7873820B2 (en) 2005-11-15 2011-01-18 Mips Technologies, Inc. Processor utilizing a loop buffer to reduce power consumption
US7899993B2 (en) 2005-11-15 2011-03-01 Mips Technologies, Inc. Microprocessor having a power-saving instruction cache way predictor and instruction replacement scheme

Also Published As

Publication number Publication date
US20070113050A1 (en) 2007-05-17
US7496771B2 (en) 2009-02-24

Similar Documents

Publication Publication Date Title
US7496771B2 (en) Processor accessing a scratch pad on-demand to reduce power consumption
US7873820B2 (en) Processor utilizing a loop buffer to reduce power consumption
US7562191B2 (en) Microprocessor having a power-saving instruction cache way predictor and instruction replacement scheme
US7657708B2 (en) Methods for reducing data cache access power in a processor using way selection bits
US8156357B2 (en) Voltage-based memory size scaling in a data processing system
US7562192B2 (en) Microprocessor, apparatus and method for selective prefetch retire
US8392651B2 (en) Data cache way prediction
EP3298493B1 (en) Method and apparatus for cache tag compression
CN106126441B (en) Method for caching and caching data items
US10416920B2 (en) System and method for improving memory transfer
US20060282621A1 (en) System and method for unified cache access using sequential instruction information
US7650465B2 (en) Micro tag array having way selection bits for reducing data cache access power
US8327121B2 (en) Data cache receive flop bypass
KR100710922B1 (en) Set-associative cache-management method using parallel reads and serial reads initiated while processor is waited
US20150095611A1 (en) Method and processor for reducing code and latency of tlb maintenance operations in a configurable processor
JPH02302853A (en) Improved type cache access method and apparatus
KR20240033103A (en) Criticality-notified caching policies
WO2008024221A2 (en) Micro tag reducing cache power

Legal Events

Date Code Title Description
AS Assignment

Owner name: BRIDGE CROSSING, LLC, NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MIPS TECHNOLOGIES, INC.;REEL/FRAME:030202/0440

Effective date: 20130206

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: ARM FINANCE OVERSEAS LIMITED, GREAT BRITAIN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:BRIDGE CROSSING, LLC;REEL/FRAME:033074/0058

Effective date: 20140131