45

Can anybody with experience in fabrication reveal more about this? Very exciting ideas, but hoping to learn more in real-world context

you are viewing a single comment's thread
view the rest of the comments
[-] stardreamer 38 points 1 year ago* (last edited 1 year ago)

The argument is that processing data physically "near" where the data is stored (also known as NDP, near data processing, unlike traditional architecture designs, where data is stored off-chip) is more power efficient and lower latency for a variety of reasons (interconnect complexity, pin density, lane charge rate, etc). Someone came up with a design that can do complex computations much faster than before using NDP.

Personally, I'd say traditional Computer Architecture is not going anywhere for two reasons: first, these esoteric new architecture ideas such as NDP, SIMD (probably not esoteric anymore. GPUs and vector instructions both do this), In-network processing (where your network interface does compute) are notoriously hard to work with. It takes CS MS levels of understanding of the architecture to write a program in the P4 language (which doesn't allow loops, recursion, etc). No matter how fast your fancy new architecture is, it's worthless if most programmers on the job market won't be able to work with it. Second, there're too many foundational tools and applications that rely on traditional computer architecture. Nobody is going to port their 30-year-old stable MPI program to a new architecture every 3 years. It's just way too costly. People want to buy new hardware, install it, compile existing code, and see big numbers go up (or down, depending on which numbers)

I would say the future is where you have a mostly Von Newman machine with some of these fancy new toys (GPUs, Memory DIMMs with integrated co-processors, SmartNICs) as dedicated accelerators. Existing application code probably will not be modified. However, the underlying libraries will be able to detect these accelerators (e.g. GPUs, DMA engines, etc) and offload supported computations to them automatically to save CPU cycles and power. Think your standard memcpy() running on a dedicated data mover on the memory DIMM if your computer supports it. This way, your standard 9to5 programmer can still work like they used to and leave the fancy performance optimization stuff to a few experts.

[-] QuarterSwede@lemmy.world 1 points 1 year ago

Good, well thought out points.

I’ll add Von Newman machines are more likely to be used in mobile devices and appliances.

this post was submitted on 20 Nov 2023
45 points (100.0% liked)

Technology

35148 readers
46 users here now

This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.


Ask in DM before posting product reviews or ads. All such posts otherwise are subject to removal.


Rules:

1: All Lemmy rules apply

2: Do not post low effort posts

3: NEVER post naziped*gore stuff

4: Always post article URLs or their archived version URLs as sources, NOT screenshots. Help the blind users.

5: personal rants of Big Tech CEOs like Elon Musk are unwelcome (does not include posts about their companies affecting wide range of people)

6: no advertisement posts unless verified as legitimate and non-exploitative/non-consumerist

7: crypto related posts, unless essential, are disallowed

founded 5 years ago
MODERATORS