avatar
nightcraw1er.488: Yes, I keep hearing this, wonder who started this myth.
avatar
ssling: Dunno, maybe GOG's marketing team so they can brag about how their games are adjusted to run on modern systems. /s
Good point!
avatar
Spectrum_Legacy: Some decluttering of the HUD in games is the way to go with it, to minimize the risk of burn-in, which suits me, as I dislike the visual overload in games that go for MMO-style HUDs, with all the pointers, minimap, and numbers-flying-everywhere nonsense.
That wouldn't really work for me, as I tend to really like RPGs with their menu-based combat, and such games do tend to be HUD-heavy.

(Keeping the game turn based, with an option to control the battle speed, can help with the visual overload issue.)

avatar
kohlrak: The thing I like about APUs is their increased bandwidth between GPU and CPU. That is something severely underestimated, and I suspect this is what is behind games like Horizon: Zero Dawn running horridly on some computers despite running so well on others (including the PS4).

As for clock speed, either I'm misinterpreting what you're saying or it's a non-issue: the clock speed and the voltage are not necessarily linked. To the contrary, a lower voltage means you can clock it higher, since it won't generate as much heat, though you may need to keep a CPU's clock lower for reasons other than heat (result stabilization of individual instructions).
avatar
Spectrum_Legacy: Interconnect between cpu and igpu is a nice thing for the low latency, however in reality it doesn't do such magical things as you might expect, esp. if you consider that they both share the same memory, its bandwidth and possibly the memory controller too which causes the latency bottleneck. Though with steamdeck I'm less worried since they at least use ddr5. Makes one think how it would perform with something like stack of hbm2, but the price would skyrocket.
APU as design is a compromise by choice, cpu portion of apus in general has a lower cache compared to desktop-only architecture, igpu part having fewer CUs and to top it off, they share the tdp envelope as I already said. There is only so much you can do with those 15W to play with. For mobile usage it's good enough, esp. the size and efficiency is great with lower clocks.
Console ports sometimes running like crap won't be solved with the low latency interconnect that apus have though (believe me, I tried with my APUs - heck my old kaveri apu doesn't even have L3 cache on cpu side, which hurts its performance). It all boils down to optimizations of a native version for 1 console vs a myriad of hw/sw setups and porting to them.

As for clockspeeds of apu in general, with mobile tdp and the cooling solution used, you likely won't reach max advertised clocks on cpu and igpu at the same time, because when the tdp gets reached, it will throttle one or the other or both down. Undervolting and over- or under- clocking as a practice was just my experience with amd products, since their gpus were notorious for coming overvolted out of the factory, where you could easily undervolt them and get much better thermals without any loss of performance. If you underclocked to boot, you would get insanely low power consumption, esp. on gpus with hbm of any kind. Sorry for confusion, didn't imply that voltage and clockspeeds were rigidly linked or whatever.
It is certainly possible to write code that runs faster on an iGPU than on a dGPU by taking advantage of the faster CPU<->GPU transfer speeds on the iGPU, or of other performance differences between the devices.

In fact, I have even found a case where a software renderer is faster than using the integrated GPU (and probably faster than a dGPU): the glxinfo program. (Granted, glxinfo doesn't actually display anything; all it does is check what capabilities the graphics hardware/driver supports, but the point still stands, even if it's not a typical use case.)
Post edited July 19, 2021 by dtgreene
avatar
dtgreene: That wouldn't really work for me, as I tend to really like RPGs with their menu-based combat, and such games do tend to be HUD-heavy.

(Keeping the game turn based, with an option to control the battle speed, can help with the visual overload issue.)
I do play RPGs too, just with the radar and similar things turned off. I dislike those, as they remind me of a submarine sim and break the immersion for me. Usually the games tell you where to go if you pay attention, so it's not that hard. Games that offer options to turn these things off get a huge plus in my book. Also, games like Kingdom Come: Deliverance with a hardcore mode (aka a pretty-much-HUD-less mode) are just my thing. Immersion ftw! Old-school RPGs, JRPGs and dungeon crawlers, on the other hand, could be tricky to declutter without some mods for minimalists, I reckon.

avatar
dtgreene: It is certainly possible to write code that runs faster on an iGPU than on a dGPU by taking advantage of the faster CPU<->GPU transfer speeds on the iGPU, or of other performance differences between the devices.

In fact, I have even found a case where a software renderer is faster than using the integrated GPU (and probably faster than a dGPU): the glxinfo program. (Granted, glxinfo doesn't actually display anything; all it does is check what capabilities the graphics hardware/driver supports, but the point still stands, even if it's not a typical use case.)
Certainly, though we've moved from games into synthetic benchmark/utility territory. I waited a decade for the APU magic to happen and materialise on the software side, but it didn't. An interconnect on the same die will always have minimal latency, just like HBM2 stacks of memory next to the GPU die have far lower latency and power consumption than GDDR6 modules sitting a mere 7 cm away from it. That alone won't offset the slower system memory, syncing, etc. in the case of an APU. However, a modern dGPU on an x16 link with a decent modern CPU is neither that unresponsive nor short on bandwidth. APUs are not meant to compete with dGPUs on performance. It's a different league; they have their own bottlenecks and throttles/limitations for the sake of mobility/size/power consumption/price/etc., although they have improved a lot in recent years, as expected.

Iirc there were some programs that benefited from having an Intel iGPU specifically, I think from Adobe or some other company, I don't remember exactly. Those programs were definitely not 3D games, though. Unfortunately I don't have any weak dGPU in the league of my strongest APU to duel them in a benchmark of sorts. Would be a fun deathmatch though, I'm sure.
By the way, here's a benchmark of glxinfo on a Raspberry Pi (with the output redirected to /dev/null, to eliminate the cost of actually writing it to the terminal window):
* On hardware: 0.650 seconds.
* With llvmpipe: 0.280 seconds.

(Also, godot3 refuses to start on hardware, even though it seems like it should: the Pi 4 is supposed to support OpenGL ES 3.0, and even 3.1, but godot3 claims the support is missing. It does start with llvmpipe, though.)
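Here's a minimal sketch in C of roughly what glxinfo spends its time doing (context creation plus capability queries), in case anyone wants to reproduce the comparison; it assumes the Mesa/X11 development headers are installed, and the file name and build line are just for illustration. Run it once as-is for the hardware driver, and once with LIBGL_ALWAYS_SOFTWARE=1 in the environment to force llvmpipe:

/* glxtime.c -- time GLX context creation plus a capability query,
 * which is roughly the work glxinfo does.
 * Build (assuming Mesa/X11 dev packages):  gcc glxtime.c -lX11 -lGL
 * Compare:  ./a.out                          (hardware driver)
 *           LIBGL_ALWAYS_SOFTWARE=1 ./a.out  (llvmpipe)           */
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <X11/Xlib.h>
#include <GL/glx.h>

static double seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    double t0 = seconds();

    Display *dpy = XOpenDisplay(NULL);
    if (!dpy) { fprintf(stderr, "cannot open X display\n"); return 1; }

    int attribs[] = { GLX_RGBA, GLX_DEPTH_SIZE, 16, None };
    XVisualInfo *vi = glXChooseVisual(dpy, DefaultScreen(dpy), attribs);
    if (!vi) { fprintf(stderr, "no suitable GLX visual\n"); return 1; }

    /* glxinfo-style trick: a tiny window that is never mapped,
     * used only to make the context current. */
    XSetWindowAttributes swa;
    swa.colormap = XCreateColormap(dpy, RootWindow(dpy, vi->screen),
                                   vi->visual, AllocNone);
    Window win = XCreateWindow(dpy, RootWindow(dpy, vi->screen), 0, 0, 8, 8, 0,
                               vi->depth, InputOutput, vi->visual,
                               CWColormap, &swa);

    GLXContext ctx = glXCreateContext(dpy, vi, NULL, True);
    glXMakeCurrent(dpy, win, ctx);

    printf("renderer: %s\n", glGetString(GL_RENDERER));
    printf("extension string: %zu bytes\n",
           strlen((const char *)glGetString(GL_EXTENSIONS)));
    printf("elapsed: %.3f s\n", seconds() - t0);

    glXMakeCurrent(dpy, None, NULL);
    glXDestroyContext(dpy, ctx);
    XDestroyWindow(dpy, win);
    XCloseDisplay(dpy);
    return 0;
}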
low rated
avatar
kohlrak: Careful what you wish for. OLED phones are experiencing some severe screen burn, right now. Particularly on interface parts that don't move (so health bars and things like that for this device).
avatar
Spectrum_Legacy: I've been an OLED user for about a decade now; I pretty much went from CRT to plasma to OLED, resorting to calculator displays only for office and casual stuff. The first thing I did with the TV was discard the factory settings and tune/calibrate it (esp. brightness) for a dimly lit room. It is used primarily for movies, and some gaming too. Some decluttering of the HUD in games is the way to go with it, to minimize the risk of burn-in, which suits me, as I dislike the visual overload in games that go for MMO-style HUDs, with all the pointers, minimap, and numbers-flying-everywhere nonsense. Btw I run it at around 10% brightness as a sweet spot. I thought that me saying "If only papaGaben would offer an OLED model for those who don't play while out in the sun much" gave it away that I would run it at low brightness to prevent burn-in and fading of the blue colours. Also, screensavers have been a thing since CRT days for a reason. Btw this is the very reason why I wouldn't consider a second-hand OLED PS Vita unless I knew the person who had it and how they ran it. I know precisely what I wish for here, but I understand you meant it all well for a general user.
That's fine, and that would indeed improve things, but the problem is that such a setup (especially given that the target audience would likely be in the sun a lot) is not really fit for purpose. I'm not 100% sure, but I think all they'd need to do is insert a blank frame every so often, too. I haven't looked too deeply into what's causing the burn.
avatar
kohlrak: The thing I like about APUs is their increased bandwidth between GPU and CPU. That is something severely underestimated, and I suspect this is what is behind games like Horizon: Zero Dawn running horridly on some computers despite running so well on others (including the PS4).

As for clock speed, either I'm misinterpreting what you're saying or it's a non-issue: the clock speed and the voltage are not necessarily linked. To the contrary, a lower voltage means you can clock it higher, since it won't generate as much heat, though you may need to keep a CPU's clock lower for reasons other than heat (result stabilization of individual instructions).
Interconnect between cpu and igpu is a nice thing for the low latency, however in reality it doesn't do such magical things as you might expect, esp. if you consider that they both share the same memory, its bandwidth and possibly the memory controller too which causes the latency bottleneck.
Still in the realm of theory, not in practice. I'd have to look at the schematics, but the big picture is to try to get as much into cache territory as possible on the CPU anyway. In reality, the RAM is a huge bottleneck without the memory controller, which is something AMD learned the hard way a long, long time ago. Unfortunately, Microsoft seems to be the only one (on the software side of things) actually trying to do something about this, and I honestly think it was accidental on their part. GNU and whoever now owns LLVM are really, really behind the ball on this one. There's a lot more promise in fixing this issue on the software side of things than on the hardware side.
Though with steamdeck I'm less worried since they at least use ddr5. Makes one think how it would perform with something like stack of hbm2, but the price would skyrocket.
APU as design is a compromise by choice, cpu portion of apus in general has a lower cache compared to desktop-only architecture, igpu part having fewer CUs and to top it off, they share the tdp envelope as I already said. There is only so much you can do with those 15W to play with. For mobile usage it's good enough, esp. the size and efficiency is great with lower clocks.
The APUs have managed to make some things playable that you would never have expected to be playable, too. While it's indeed not magic, you'll find that the bottleneck is caching and the like. We've gotten to the point where it's not how much RAM you have, but how much cache and how fast the RAM is. Moreover, bottlenecks tend to follow a Pareto distribution, which means that once you manage to get the whole thing in cache, almost all the other memory-hogging things are just textures, models, etc., which the GPU is focused on, and it isn't going to grab nearly as much per second as everyone thinks it will.
Console ports sometimes running like crap won't be solved with the low latency interconnect that apus have though (believe me, I tried with my APUs - heck my old kaveri apu doesn't even have L3 cache on cpu side, which hurts its performance). It all boils down to optimizations of a native version for 1 console vs a myriad of hw/sw setups and porting to them.
Except these APU drivers actually do reduce latency. Another thing is the age-old trick of making sure your game and only the bare minimum of services are running, because task-switching is a killer on your caches.
As for clockspeeds of apu in general, with mobile tdp and the cooling solution used, you likely won't reach max advertised clocks on cpu and igpu at the same time, because when the tdp gets reached, it will throttle one or the other or both down. Undervolting and over- or under- clocking as a practice was just my experience with amd products, since their gpus were notorious for coming overvolted out of the factory, where you could easily undervolt them and get much better thermals without any loss of performance. If you underclocked to boot, you would get insanely low power consumption, esp. on gpus with hbm of any kind. Sorry for confusion, didn't imply that voltage and clockspeeds were rigidly linked or whatever.
You're never going to see max clock rates regardless, with any of this technology. Something has to lock up for syncing somewhere at some point, and that'll be where you spend most of your downtime anyway. This is another reason why the APUs can present magic. The trick is to get some sort of external connector to allow direct connections (yeah, not easy, I'm aware, and it would likely require re-pasting) of the GPU and CPU as external entities still sharing that memory controller. Obviously we need proper "desktop-grade" APUs to see the tech fully flourish, because, as you said, they're still cutting costs in them. Caches need to be expanded, coders need to start looking at their code sizes, and compiler makers need to start improving linking. Microsoft can import functions based on whether or not they're actually used, and GCC and such cannot, which is a huge bottleneck.
As for this device, I wouldn't preorder it until I see tech-heavy reviews, how it performs, what options it offers to prosumers, etc and then again I'm not much interested in handheld without oled display (or microled in a decade from now).
I wouldn't rely on reviews, especially after the nice little thread we had here semi-recently where people were complaining about vsync, only for vsync to be reinvented under another name ("enhanced sync" or something like that: they basically just took the stability features away from vsync that weren't necessary and should not have been part of vsync to begin with) by the GPU-making companies. And the reviewers I found were all eating it up like it's fancy new technology. It was like watching someone invent "enhanced diecast tires" where you could separate the inner tube from the tire treads, calling it new technology, and everyone going gaga over it.
low rated
avatar
kohlrak: The thing I like about APUs is their increased bandwidth between GPU and CPU. That is something severely underestimated, and I suspect this is what is behind games like Horizon: Zero Dawn running horridly on some computers despite running so well on others (including the PS4).
avatar
dtgreene: Does that mean the game might reasonably run better than expected on some integrated GPUs?

Have there been any reports of a system with both integrated and dedicated GPUs running the game better on the integrated GPU? (One requirement: the dedicated GPU must at least meet the stated minimum requirements of the game, to avoid contrived situations like a modern APU paired with an ancient graphics card.)
An APU is an integrated CPU-GPU combo that actually acknowledges what the two share, basically. From what I've seen based on the complaints when it first came out on GOG, it seemed the initial reports were pointing at the bandwidth between the CPU and the GPU as the problem. In theory, this means some integrated GPUs could indeed handle it way, way better than separate units. However, then it'd be hard not to call it an APU. The truth is a bit more complex than that, but the big picture really is the integration between the two. At the end of the day, you're looking at the same age-old threading issues no matter what hardware you use. At the same time, though, how long you spend between syncs is really, really important (so, that data-transfer rate).

I mean, you've got to think of it this way. You have a CPU that only goes so fast, a GPU that only goes so fast (their individual cores can thread, but for the sake of simplicity we'll treat each like a single thread), and they also have to communicate with one another to work out how, where, and what goes on that screen. The GPU's job is to put it there, and the CPU's job is to tell it what to put where and how. If the CPU's choking on AI code, the GPU's going to end up waiting for the CPU to hurry the hell up. If the GPU's choking, the CPU could end up waiting for the GPU, which also means the game logic is running slower, too. Sometimes they both have to do another task of sending and receiving lots of data, like textures, which means that while they're both busy, they're also hurrying up and waiting to do their real jobs.

This, of course, all gets magnified when you realize there are multiple cores, multiple threads, threads locking other threads for syncing (yeah, sometimes threading makes things run slower), unrelated processes, separate hardware (yeah, the SPU could use some stuff to do, too, and we all have to wait for the hard drive to load the textures and models), and do we dare wait for the next packet in an online game, or are we using non-blocking sockets?
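As a toy illustration of that hurry-up-and-wait effect (purely a sketch, nothing from any real driver: the "GPU" here is just a second CPU thread, and the file name, frame counts and sleep times are made up), here is a minimal C program where the two sides share a one-slot command buffer and whichever side is slower forces the other to block:

/* cpu_gpu_wait.c -- two threads standing in for "CPU game logic" and
 * "GPU renderer", sharing a one-slot command buffer.  Whichever side
 * is slower forces the other to sit in pthread_cond_wait(), which is
 * the stall described above.  Build with:  gcc cpu_gpu_wait.c -pthread */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int frame_ready = 0;               /* the one-slot "command buffer" */

static void *gpu_thread(void *arg)
{
    (void)arg;
    for (int i = 0; i < 5; i++) {
        pthread_mutex_lock(&lock);
        while (!frame_ready)              /* GPU waits if the CPU is choking */
            pthread_cond_wait(&cond, &lock);
        frame_ready = 0;                  /* consume the submitted frame */
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
        usleep(16000);                    /* pretend to rasterise for ~16 ms */
        printf("gpu: drew frame %d\n", i);
    }
    return NULL;
}

int main(void)
{
    pthread_t gpu;
    pthread_create(&gpu, NULL, gpu_thread, NULL);
    for (int i = 0; i < 5; i++) {
        usleep(30000);                    /* pretend the AI/game logic is slow */
        pthread_mutex_lock(&lock);
        while (frame_ready)               /* CPU waits if the GPU is choking */
            pthread_cond_wait(&cond, &lock);
        frame_ready = 1;                  /* "submit" the next frame */
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
        printf("cpu: submitted frame %d\n", i);
    }
    pthread_join(gpu, NULL);
    return 0;
}

Shrink the 30 ms sleep below the 16 ms one and the waiting flips from the GPU side to the CPU side, which is exactly the CPU-bound vs GPU-bound distinction above.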
avatar
Spectrum_Legacy: Some decluttering of the HUD in games is the way to go with it, to minimize the risk of burn-in, which suits me, as I dislike the visual overload in games that go for MMO-style HUDs, with all the pointers, minimap, and numbers-flying-everywhere nonsense.
avatar
dtgreene: That wouldn't really work for me, as I tend to really like RPGs with their menu-based combat, and such games do tend to be HUD-heavy.

(Keeping the game turn based, with an option to control the battle speed, can help with the visual overload issue.)

avatar
Spectrum_Legacy: Interconnect between cpu and igpu is a nice thing for the low latency, however in reality it doesn't do such magical things as you might expect, esp. if you consider that they both share the same memory, its bandwidth and possibly the memory controller too which causes the latency bottleneck. Though with steamdeck I'm less worried since they at least use ddr5. Makes one think how it would perform with something like stack of hbm2, but the price would skyrocket.
APU as design is a compromise by choice, cpu portion of apus in general has a lower cache compared to desktop-only architecture, igpu part having fewer CUs and to top it off, they share the tdp envelope as I already said. There is only so much you can do with those 15W to play with. For mobile usage it's good enough, esp. the size and efficiency is great with lower clocks.
Console ports sometimes running like crap won't be solved with the low latency interconnect that apus have though (believe me, I tried with my APUs - heck my old kaveri apu doesn't even have L3 cache on cpu side, which hurts its performance). It all boils down to optimizations of a native version for 1 console vs a myriad of hw/sw setups and porting to them.

As for clockspeeds of apu in general, with mobile tdp and the cooling solution used, you likely won't reach max advertised clocks on cpu and igpu at the same time, because when the tdp gets reached, it will throttle one or the other or both down. Undervolting and over- or under- clocking as a practice was just my experience with amd products, since their gpus were notorious for coming overvolted out of the factory, where you could easily undervolt them and get much better thermals without any loss of performance. If you underclocked to boot, you would get insanely low power consumption, esp. on gpus with hbm of any kind. Sorry for confusion, didn't imply that voltage and clockspeeds were rigidly linked or whatever.
avatar
dtgreene: It is certainly possible to write code that runs faster on an iGPU than on a dGPU by taking advantage of the faster CPU<->GPU transfer speeds on the iGPU, or of other performance differences between the devices.

In fact, I have even found a case where a software renderer is faster than using the integrated GPU (and probably faster than a dGPU): the glxinfo program. (Granted, glxinfo doesn't actually display anything; all it does is check what capabilities the graphics hardware/driver supports, but the point still stands, even if it's not a typical use case.)
Yeah, 'cause all of that is sync-heavy: send a request, wait for the response. That's going to be much, much slower over PCIe or any of the other buses. This sharing does indeed improve things, because most people (it took me forever to realize this, too) don't realize exactly how much of the graphics work gets processed on the CPU side anyway. The CPU still does a lot.
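To put a rough number on why chatty request/response hurts (back-of-the-envelope only; the latency and per-item figures below are made up purely for illustration), compare paying one bus round trip per item against a single batched transfer:

/* roundtrips.c -- why per-request syncing over a slow bus hurts.
 * The constants are invented; the point is that round-trip latency
 * is paid N times in the chatty case and once in the batched case. */
#include <stdio.h>

int main(void)
{
    const double bus_latency_us  = 5.0;   /* made-up cost per round trip */
    const double per_item_work_us = 0.2;  /* made-up cost per item */
    const int items = 1000;

    double chatty  = items * (bus_latency_us + per_item_work_us);
    double batched = bus_latency_us + items * per_item_work_us;

    printf("chatty : %8.1f us\n", chatty);   /* latency paid 1000 times */
    printf("batched: %8.1f us\n", batched);  /* latency paid once */
    return 0;
}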
Post edited July 20, 2021 by kohlrak
avatar
kohlrak: Still in the realm of theory, not in practice. I'd have to look at the schematics, but the big picture is to try to get as much into cache territory as possible on the CPU anyway
Yeah, the 'big thing' for APUs is the layering technology that AMD especially has been developing. That would involve a CPU layer, a RAM layer, and an iGPU layer on the same chip. The RAM layer would effectively be a massive level-4 cache between the two processors. There is some basic use of it announced/released already as '3D V-Cache' on Zen 3, but its longer-term applications for APUs are pretty obvious.

Would still be unlikely to make APUs truly competitive with discrete systems, though. The bottleneck would shift from slow/limited memory bandwidth to balancing heat dissipation/power across cores and frequency, and the RAM layer would likely have to be expensive: HBM for 'commercial' RAM, if not actual SRAM like the V-Cache is. It should still give a lot better performance than current APUs, though.
low rated
avatar
kohlrak: Still in the realm of theory, not in practice. I'd have to look at the schematics, but the big picture is to try to get as much into cache territory as possible on the CPU anyway
avatar
Phasmid: Yeah, the 'big thing' for APUs is the layering technology that AMD especially has been developing. That would involve a CPU layer, a RAM layer, and an iGPU layer on the same chip. The RAM layer would effectively be a massive level-4 cache between the two processors. There is some basic use of it announced/released already as '3D V-Cache' on Zen 3, but its longer-term applications for APUs are pretty obvious.

Would still be unlikely to make APUs truly competitive with discrete systems, though. The bottleneck would shift from slow/limited memory bandwidth to balancing heat dissipation/power across cores and frequency, and the RAM layer would likely have to be expensive: HBM for 'commercial' RAM, if not actual SRAM like the V-Cache is. It should still give a lot better performance than current APUs, though.
Indeed. My gut reaction is to find a way to separate them, long term, so they can be drop-in replacements, and at the same time not have them tied to each other (so you could upgrade one part without upgrading the other). However, that's not realistic.

Given the space between the CPU and the RAM as it is, and all the talk about how "RAM is cheap" (from all the anti-optimization coders out there), I think it's high time we consider actually giving the CPU a dedicated, separate memory controller for VRAM instead. When you have a separate GPU, like in most desktop setups, you are still using some old-tech transfer mechanisms, and it all has to be shared with the other hardware. I remember a microcontroller I had that used a unique setup: you used memory operations to send certain commands and address certain areas of VRAM, because it was wired such that the external RAM controller was hooked directly to the GPU (all low-clocked and low-tech, mind you). The hardest part of this route is figuring out which programs should have permission to run the instructions. There's also an incredible opportunity here to get rid of graphics drivers altogether, by dropping this into shaders and coming up with a standard "calling convention" between shaders and the CPU.
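For a rough PC-side feel of that memory-mapped style of access (just a sketch, and not the microcontroller described above; it assumes a Linux box where /dev/fb0 exists and is writable, e.g. from a text console, and a 32-bpp mode), here is a minimal C program that draws by storing pixels straight into mapped video memory:

/* fbsquare.c -- draw a small square by writing directly into the
 * memory-mapped Linux framebuffer (/dev/fb0).
 * Build:  gcc fbsquare.c -o fbsquare   (run from a text console) */
#include <fcntl.h>
#include <linux/fb.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/fb0", O_RDWR);
    if (fd < 0) { perror("open /dev/fb0"); return 1; }

    struct fb_var_screeninfo vinfo;
    struct fb_fix_screeninfo finfo;
    if (ioctl(fd, FBIOGET_VSCREENINFO, &vinfo) < 0 ||
        ioctl(fd, FBIOGET_FSCREENINFO, &finfo) < 0) {
        perror("ioctl"); return 1;
    }

    uint8_t *fb = mmap(NULL, finfo.smem_len, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
    if (fb == MAP_FAILED) { perror("mmap"); return 1; }

    /* Plain memory stores land straight in video memory: a white
     * 50x50 square near the top-left corner (assumes 32 bpp). */
    unsigned bytes_pp = vinfo.bits_per_pixel / 8;
    for (unsigned y = 10; y < 60; y++)
        for (unsigned x = 10; x < 60; x++) {
            size_t off = y * finfo.line_length + x * bytes_pp;
            fb[off + 0] = 0xff;   /* blue  */
            fb[off + 1] = 0xff;   /* green */
            fb[off + 2] = 0xff;   /* red   */
        }

    munmap(fb, finfo.smem_len);
    close(fd);
    return 0;
}

The pixel writes themselves involve no driver round trip: the kernel sets up the mapping once, and after that they're ordinary stores.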
avatar
kohlrak: Microsoft can import functions based on whether or not they're actually used, and GCC and such cannot, which is a huge bottleneck.
I'm pretty sure the GNU linker (and this is the linker's responsibility, not the compiler) can do this, at least if linking to a static library.

Also, LLVM has Link Time Optimization, which can do this sort of thing.
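To see that in action (a made-up example; the file and function names are arbitrary), build the file below once plainly and once with -flto or with -ffunction-sections -Wl,--gc-sections, then compare the nm output:

/* deadcode.c -- illustration of link-time dead-function removal.
 *   gcc deadcode.c -o plain
 *   gcc -flto deadcode.c -o lto
 *   gcc -ffunction-sections -Wl,--gc-sections deadcode.c -o gc
 * Then check which binaries still contain the unused function:
 *   nm plain lto gc | grep unused_helper                        */
#include <stdio.h>

int unused_helper(int x)          /* never called anywhere */
{
    return x * 42;
}

int used_helper(int x)
{
    return x + 1;
}

int main(void)
{
    printf("%d\n", used_helper(1));
    return 0;
}

The --gc-sections build reliably drops the unused function; whether the plain -flto build does can depend on the GCC version and on whether the symbol counts as exported, so treat that part as something to verify on your own toolchain.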
low rated
avatar
kohlrak: Microsoft can import functions based on whether or not they're actually used, and GCC and such cannot, which is a huge bottleneck.
avatar
dtgreene: I'm pretty sure the GNU linker (and this is the linker's responsibility, not the compiler) can do this, at least if linking to a static library.

Also, LLVM has Link Time Optimization, which can do this sort of thing.
I'll stand corrected: I just discovered the -flto option, which is disabled by default. This is a huge oversight, and I noticed the libc object file on my system was not compiled with -flto, demonstrating the problem of it not being enabled by default.
avatar
Strijkbout: Probably another one of Steam's misfires; they've never had success as a hardware supplier.
The Steam Controller pretty much failed, as did Steam Machines (I had actually forgotten about that last one until now).
It's also not cheap, and I don't think it's something they can keep exclusive like console hardware, as it just runs Windows or Linux games, so every farm shed in China can poop out something similar for cheaper.
avatar
JÖCKÖ HÖMÖ: The Steam Controller was great, but what screwed it up was being forced to use Steam to have it working as a proper controller.
Dunno man. It felt really cheap in my hand.

Like a $10 Walmart game pad.
low rated
avatar
JÖCKÖ HÖMÖ: The Steam Controller was great, but what screwed it up was being forced to use Steam to have it working as a proper controller.
avatar
Yeshu: Dunno man. It felt really cheap in my hand.

Like a $10 Walmart game pad.
oh, that's low :O
is it really that bad?
low rated
avatar
nightcraw1er.488: Yes, I keep hearing this, wonder who started this myth.
avatar
ssling: Dunno, maybe GOG's marketing team so they can brag about how their games are adjusted to run on modern systems. /s
ah, so you have no actual example or statistics; well, we can label it as a false myth then
low rated
avatar
Yeshu: Dunno man. It felt really cheap in my hand.

Like a $10 Walmart game pad.
avatar
Orkhepaj: oh, that's low :O
is it really that bad?
Yes. I held one in my hands once or twice, and the first thing I said to my GF, who owned it, was "I sure as hell hope you paid no more than 15 bucks for this." Fortunately for her, it was a gift. It wasn't all that fun to play with, either. There was something funky with the left stick or d-pad or something, I remember, but I forget what. It had some weird feedback system, too.
Post edited July 20, 2021 by kohlrak
low rated
avatar
Orkhepaj: oh, that's low :O
is it really that bad?
avatar
kohlrak: Yes. I held one in my hands once or twice, and the first thing I said to my GF, who owned it, was "I sure as hell hope you paid no more than 15 bucks for this." Fortunately for her, it was a gift. It wasn't all that fun to play with, either. There was something funky with the left stick or d-pad or something, I remember, but I forget what. It had some weird feedback system, too.
hmm, glad I haven't got it then; it looks niche anyway, and probably not many games use it. The Xbox-type controller I have is supported by nearly every modern game and is easy to get used to