Shared L1 memories are of interest for tightly-coupled processor clusters in programmable accelerators as they provide a convenient shared memory abstraction while avoiding cache coherence overheads. The performance of a shared-L1 memory critically depends on the architecture of the low-latency interconnect between processors and memory banks, which needs to provide ultra-fast access to the largest possible L1 working set. The advent of 3D technology provides new opportunities to improve the interconnect delay and the form factor. In this paper we propose a network architecture, 3D-LIN, based on 3D integration technology. The network can be configured based on user specifications and technology constraints to provide fast access to L1 memories...
The main aim of this thesis is to examine the advantages of 3D stacking applied to microprocessors a...
In this paper, we propose a 3D bus architecture as a processor-memory interconnection system to incr...
Large required size and tolerance to latency and variations in memory access time make L2 memory a ...
Shared tightly coupled data memories are key architectural elements for building multi-core clusters...
In this paper we propose two synthesizable 3D network architectures: C-LIN and D-LIN, which a...
Shared L1 memory is an interesting architectural option for building tightly-coupled multi-core...
L2 memory, serving multiple clusters of tightly coupled processors, is well-suited for 3D integratio...
As Moore’s Law slows down, new integration technologies emerge, such as 3D integration, silicon inte...
The objective of this thesis is to optimize the uncore of 3D many-core architectures. More specifica...
Thanks to their brain-like properties, neural networks outperform traditional ...
The performance of most digital systems today is limited by the interconnect latency between logic a...
In this paper, we present a 3D-mesh architecture which is utilized as a processor-memory interconnec...