Improving Reliability and Lifetime of Phase Change Main Memories
Asadinia, Marjan
Cataloging brief
Improving Reliability and Lifetime of Phase Change Main Memories
Main author:
Asadinia, Marjan
Publisher:
Sharif University of Technology
Year of publication:
1395 (Solar Hijri calendar)
Subjects:
Phase Change Memory; Performance; Energy; Resistance Drift...
Call number:
52-48426
Bookmark
List of Tables
(16)
Chapter 1
(17)
Introduction
(17)
1.1 PCM Technology Maturity
(18)
Figure 1.1: The number of published articles on PCM technology, taken from the IEEE Xplore database for all IEEE, AIP, IET and IBM articles and journals.
(19)
1.2 Main Contributions of Current Work
(19)
1.3 Organization of this Thesis
(21)
Chapter 2
(23)
Phase Change Memory (PCM)
(23)
2.1 Introduction
(23)
2.2 PCM Materials/Device Physics
(24)
Figure 2.1: (A) Typical R-T (normalized) curve of chalcogenide films shows that the resistivity of amorphous phase is 5-6 orders of magnitude higher than the polycrystalline phase. T1 and T2 are the temperatures where the transitions to both phase...
(25)
Figure 2.2: In order to maintain reasonable programming currents, phase change memory devices are configured such that one of the critical dimensions of the current flow path is in the range of 10-50nm. In the mushroom cell (A), a thin film of chalco...
(26)
2.3 Memory Cell and Array Design
(27)
2.4 Multi-Level-Cell Phase Change Memory (MLC PCM)
(28)
Figure 2.3: Concept of Multi-Level PCM [14].
(29)
Figure 2.4: Resistance partitioning of 2-bit MLC configured for about one minute drift tolerance at 300 K.
(30)
2.5 Read Technique
(30)
2.5.1 Integrating ADC Model
(31)
2.5.2 Successive Approximation ADC Model
(31)
Figure 2.5: Example of integrating ADC: structure and working [47].
(32)
2.6 Write Techniques
(32)
Figure 2.6: Example of successive approximation ADC [48].
(33)
Figure 2.7: Approach 1: Increasing amorphous region (h1 <=> resistance R1, h2 <=> resistance R2, h2 > h1 => R2 > R1)
(34)
Figure 2.8: Approach 2: Increasing crystalline filaments (w1 <=> resistance R1, w2 <=> resistance R2, w2 > w1 => R2 < R1)
(34)
Figure 2.9: P&V write process for MLC PCM [43].
(35)
2.7 Reliability
(35)
Figure 2.10: The programmed resistance states in PCM devices (A, units in kΩ) drift upwards initially, and then decrease towards the crystalline state resistance. The rate of drift is higher for larger resistances. The programmed resistance increa...
(36)
Chapter 3
(37)
Related Work
(37)
3.1 Architecting PCM for Main Memories
(38)
Figure 3.1: Typical access latency of various technologies in the memory hierarchy [67].
(38)
3.1.1 PCM Organization
(39)
3.1.2 Fine-Grained Write Filtering
(40)
Figure 3.2: Percentage of redundant bit-writes for single-level and multi-level PCM cells [69].
(40)
Figure 3.3: Circuitry for Redundant Bit Removal and Row Shifting [69].
(41)
3.1.3 Hybrid Memory: Combining DRAM and PCM
(41)
Figure 3.4: PCM-based Hybrid Memory System [70].
(42)
3.2 Tolerating Slow Writes in PCM
(44)
3.2.1 Write Cancellation for PCM
(44)
3.2.2 Write Pausing
(45)
3.2.3 PRES: Pseudo-Random Encoding Scheme to Increase the Bit Flip Reduction in the Memory
(46)
3.3 Wear Leveling for Durability
(46)
Figure 3.5: Write pausing in PCM: (A) Pause points (B) Servicing reads via Pausing [72].
(47)
3.3.1 START-GAP Wear leveling
(47)
Figure 3.6: Start-Gap wear leveling on a memory containing 16 lines [74].
(48)
3.3.2 Randomized START-GAP
(49)
Figure 3.7: Architecture for Randomized Start-Gap [74].
(50)
3.4 Secure Wear leveling Algorithms
(50)
3.4.1 Region-Based Start-Gap (RBSG)
(51)
3.4.2 PCM-S Scheme
(51)
Figure 3.8: Recent Wear leveling algorithms: (a) Region Based Start-Gap (b) Seznec’s PCM-S Scheme (c) Security Refresh [80].
(52)
3.4.3 Security Refresh Scheme
(52)
3.4.4 SLC-enabled Wear Leveling for MLC PCM
(53)
3.5 Error Resilience in Phase Change Memories
(53)
3.5.1 Fault Model Assumption
(54)
3.5.2 Dynamically Replicated Memory (DRM)
(54)
Figure 3.9: Physical Address Space with Dynamic Replication [84].
(55)
3.5.3 Error Correcting Pointers (ECP)
(57)
Figure 3.10: Error Correction with ECP replacement of failed memory cells. (a) Correcting one bit error with ECP-1. (b) Correcting up to five errors with ECP-5. (c) Tolerating errors in replacement cells. (d) Tolerating errors in pointers of ECP [85].
(58)
3.5.4 Stuck-at-Fault Error Recovery (SAFER)
(59)
Figure 3.11: Example of partitioning two faults with SAFER [87].
(60)
3.5.5 Fine-Grained Embedded Redirection (Free-p)
(61)
Figure 3.12: Example of fine-grained remapping with embedded pointer [88].
(62)
3.5.6 A Recursively Defined Invertible Set Scheme to Tolerate Multiple Stuck-At Faults in Resistive Memory (RDIS)
(62)
Figure 3.13: An example for constructing the invertible set [107].
(64)
3.5.7 Pay-As-You-Go: Low-Overhead Hard-Error Correction for Phase Change Memories (PAYG)
(64)
3.5.8 Other Works
(65)
Chapter 4
(66)
Handling Hard Errors in PCM by Using Inter-line Level Schemes
(66)
4.1 OD3P: On-Demand Page Paired PCM
(66)
Figure 4.1: Write endurance of different 1KB pages in a 1GB PCM bank. The average cell’s endurance is 10^8 with standard deviation of 0.25 × 10^8. The values are normalized to the weakest page (3.8 × 10^7 in this experiment). For considerable number...
(67)
4.2 Structure and Operation of Page Paired PCM
(68)
4.2.1 Target Page Selection Algorithm
(68)
Figure 4.2: Structural view of the Target Page Selector (TPS) and Address Translation Table (ATT) units.
(69)
4.2.2 Pairing Algorithm
(70)
Figure 4.3: (a) An example of how accesses to any of two PCM pages are performed with conventional stacked mapping. Upon receiving a read to location A (or B), the memory controller issues a read signal to sense the resistance value of each cell. Whe...
(72)
(c) Interleaved bit mapping for the example of Figure 4.3(a).
(72)
4.2.3 Address Translation
(73)
4.2.4 Discussion
(74)
4.3 Fixed Pairing
(74)
4.3.1 Pairing Algorithm
(75)
4.3.2 Address Translation
(75)
4.4 Partially-Selective Pairing Algorithm
(76)
4.4.1 Address Translation
(76)
4.5 Operation of Different OD3P Mechanisms: Examples
(77)
Figure 4.4: A simple example scenario of OD3P operation in a small PCM memory of 8 pages. (a) Fully-Selective OD3P Scheme, (b) Fixed OD3P Scheme, (c) Partially-Selective OD3P Scheme with Group Size of 4.
(79)
4.6 Line Level OD3P
(80)
Figure 4.5: Example for line level pairing where line i reaches its endurance limit and ECP-6 cannot protect it anymore. Thereafter, invoking the line level OD3P can select line j as the healthiest line (the one with maximum FreeECPs) and store them (...
(81)
4.7 Simulation Environment and Scenarios
(81)
4.7.1 Infrastructure
(81)
4.7.2 System Configuration
(82)
4.7.3 MLC PCM Array Model
(83)
4.7.4 Workloads
(83)
4.7.5 Metrics
(84)
4.8 Experimental Results
(86)
4.8.1 Analysis under Synthetic Write Traffic
(86)
Figure 4.6: Comparison of the lifetime versus capacity provided by six evaluated systems with CoV of 0.25.
(87)
4.8.2 Analysis under Real Workloads
(89)
4.8.2.1 Performance Analysis
(89)
Figure 4.7: Performance analysis of OD3P mechanisms compared to DRM+ECP-6 and ECP-6 baselines under different workloads. The values are normalized to ECP-6 baseline system.
(90)
4.8.2.2 Group Size in PS-OD3P
(90)
Figure 4.8: Speedup comparison for different group sizes (ranging from 4 to 64).
(91)
4.8.2.3 Endurance Analysis
(91)
Figure 4.9: Endurance analysis of OD3P mechanisms with different configurations and pairing algorithms compared to the baseline architecture. Again, the values are normalized to an ECP-6 baseline system.
(91)
Figure 4.10: TPS accuracy and area overhead versus TPS size.
(92)
Figure 4.11: Access time comparison of OD3P and DRM when the number of failed bits varies.
(94)
4.8.3 Comparison to Line-Level Schemes
(94)
4.9 Hardware Overhead and Extension for N-Bit MLC PCM
(95)
Figure 4.12: The PCM cell that supports N-bit to 2N-bit MLC operations. Dummy block refers to a group of fixed resistive elements used for comparison in a read circuit.
(97)
Chapter 5
(98)
Handling Hard Errors in PCM by Using Intra-line Level Schemes
(98)
5.1 BLESS: A Simple and Efficient Scheme for Prolonging PCM Lifetime
(98)
Figure 5.1: Non-uniformity in bit flips distribution of write operations for "Facesim".
(99)
5.2 Improving the Bit Flips Uniformity
(99)
Figure 5.2: Shift mechanism: upon each write, the block content is shifted by one unit.
(100)
5.3 Tolerating the hard errors
(100)
5.3.1 Problem Formulation
(100)
Figure 5.3: (a) Probability of fault coverage (Eq. 5.1). (b) Average number of shifts to cover the fault(s) (Eq. 5.2).
(102)
Figure 5.4: An example of BLESS operation on a 3-unit block.
(103)
5.4 Write Operation in BLESS
(103)
5.5 Read Operation in BLESS
(104)
5.6 Meta-data Information
(104)
5.7 Evaluation Setting
(105)
5.8 Methodology
(106)
5.9 Evaluated Architectures
(106)
5.10 Evaluation Metrics
(107)
5.11 Evaluation Results
(107)
5.11.1 Analysis under Real Workloads
(108)
For the evaluation, we use PARSEC multi-threaded workloads [95]. We then compare our proposal in terms of lifetime, number of required shifts, number of recovered errors per page, performance, IntraV and impact of unit size.
(108)
5.11.1.1 Lifetime
(108)
Figure 5.5: Lifetime of the BLESS and other techniques.
(108)
Figure 5.6: Fault coverage distribution of different number of shifts in BLESS.
(109)
Figure 5.7: Average number of recovered errors per page for different schemes.
(110)
Figure 5.8: Comparing error recovery capability of BLESS and Aegis.
(111)
Figure 5.9: Impact of unit size on IntraV and per line storage overhead.
(112)
5.12 Comparison to Page-Level Schemes
(112)
5.13 Intra-line Level Pairing (ILP)
(113)
Figure 5.10: Memory capacity degradation as a function of lifetime under DRM, ECP, FREE-p, and SAFER schemes. The percentage of healthy cells in faulty lines is shown with "X" for each scheme when 50% of memory lines are corrupted.
(114)
5.14 ILP Structure
(114)
Figure 5.11: Pairing a faulty part ‘i’ with a healthy part ‘j’; after pairing part ‘i’ is marked as faulty and both ‘i’ and ‘j’ parts are stored in 2-bit MLC mode.
(115)
5.14.1 Healthy Part Selection
(116)
5.14.2 Meta-data Information
(116)
5.14.3 Pairing Algorithm
(117)
5.15 Experimental Results
(117)
5.15.1 Analysis under Synthetic Write Traffic
(118)
5.15.2 Analysis under Real Workloads
(118)
5.15.2.1 Performance Analysis
(119)
Figure 5.12: Performance analysis of ILP mechanism compared to ECP-6, Free-P, and SAFER. The values are normalized to ECP-6 baseline system.
(119)
5.15.2.2 Endurance Analysis
(120)
Figure 5.13: Lifetime analysis of ILP mechanism compared to ECP-6, Free-P, and SAFER. The values are normalized to ECP-6 baseline system.
(120)
Figure 5.14: Average number of recovered errors per line of ILP mechanism compared to ECP-6, Free-P, and SAFER.
(121)
Figure 5.15: Partitioning effect on lifetime and storage overhead.
(122)
Chapter 6
(123)
6.1 Variable Resistance Spectrum Assignment
(123)
6.2 The MLC VR-PCM
(124)
6.2.1 VR-PCM for MLC Read Improvement
(124)
Figure 6.1: Percentage of sixteen different values read/written from/into main memory for PARSEC-2 benchmark at 4-bit granularity (from most to least).
(125)
6.2.2 Binary-Directed Resistance Partitioning
(126)
Figure 6.2: VR-PCM binary resistance partitioning for 2-bit MLC (Up) and Extended VR-PCM partitioning for 3-bit MLC (Down).
(127)
6.2.3 Discussion
(127)
6.2.3.1 Enhanced write latency, energy, and cyclic endurance
(127)
6.2.3.2 Improved retention time and drift tolerance
(128)
6.2.3.3 VR-PCM Shortcoming in High Density MLC
(128)
6.3 Extended VR-PCM
(129)
6.4 Ultimate Design: Reconfigurable VR-PCM
(129)
6.5 Hardware Implementation Issues
(130)
6.5.1 Main Memory Controller
(131)
Figure 6.3: VR-PCM array structure and its controller.
(131)
6.5.2 Frequent Value Finder Logic
(132)
Figure 6.4: Degree of similarity between sets of frequent values extracted from the proposed hardware-based and the ideal software-based mechanisms.
(134)
6.5.3 Reassigning Resistance Levels to Data Values
(134)
6.5.4 OS Support for Value Translation
(135)
6.6 Simulation Results
(136)
6.6.1 VR-PCM for 2-bit MLCs
(137)
Figure 6.5: Read and write access latency of 2-bit VR-PCM normalized to the conventional MLC design in the evaluated 4-core system. We have 23% and 42% improvement in read and write access latency, respectively.
(137)
Figure 6.6: Performance improvement of a system with 2-bit VR-PCM main memory normalized to conventional MLC design. This chart shows 13% improvement in system’s CPI on average.
(138)
Figure 6.7: Normalized read, write, and total energy consumption of 2-bit VR-PCM main memory. The results reveal that we can expect considerable energy reduction in workloads with high value locality.
(139)
Figure 6.8: Orders of lifetime improvement when using VR-PCM in a 2-bit MLC main memory. Here, we experienced an average of 1.71x improvement in lifetime (5.1x for raytrace). One can expect more lifetime improvement when the program has large frequent...
(140)
6.6.2 Extended versus Reconfigurable VR-PCM for High-Density MLCs
(140)
Figure 6.9: Normalized memory access latency of 3-bit and 4-bit Extended/Reconfigurable VR-PCM. This experiment confirms considerable reduction in memory access latency, especially in the Reconfigurable design.
(141)
Figure 6.10: Performance improvement of a system with 3-bit and 4-bit Extended/Reconfigurable VR-PCM main memory normalized to the conventional MLC main memory design. This chart shows 10% improvement in system’s CPI on average.
(141)
Figure 6.11: Total memory access energy of 3-bit and 4-bit Extended/Reconfigurable VR-PCM main memory normalized to the conventional PCM.
(142)
Figure 6.12: Orders of lifetime improvement when using Extended/Reconfigurable VR-PCM in a 3-bit and 4-bit MLC main memory. Here, we experienced an improvement of 1.04x to 3.5x in lifetime.
(142)
6.6.3 Summary of Results
(143)
6.7 Process Variation and Resistance Drift
(143)
6.7.1 Process Variation
(143)
6.7.2 Analysis of Drift Tolerance
(145)
Figure 6.13: Rate of soft error (in terms of bit error rate) for VR-PCM compared to conventional MLC PCM design. These charts along with Table 6.1 confirm that VR-PCM gives the same level of soft-error reliability with an average of 18% reduction in MLC PCM ...
(147)
Chapter 7
(147)
Conclusion and Future Direction
(147)
7.1 Summary of Results
(147)
7.2 Future work
(150)
7.2.1 Completing the Proposed Architectures and their Evaluation
(150)
7.2.2 New Ideas
(152)
References
(155)