Posted July 19, 2021

nightcraw1er.488
DRM Free is dead.
Registered: Apr 2012
From United Kingdom

dtgreene
vaccines work she/her
Registered: Jan 2010
From United States
Posted July 19, 2021

(Keeping the game turn based, with an option to control the battle speed, can help with the visual overload issue.)

As for clock speed, either I'm misinterpreting what you're saying or it's a non-issue: clock speed and voltage are not necessarily linked. On the contrary, a lower voltage means you can clock higher, since the chip won't generate as much heat, though you may need to keep a CPU's clock lower for reasons other than heat (stabilization of the results of individual instructions).

An APU is by design a compromise by choice: the CPU portion of APUs in general has less cache than desktop-only architectures, the iGPU portion has fewer CUs, and to top it off they share the TDP envelope, as I already said. There is only so much you can do with those 15 W. For mobile usage it's good enough; especially the size and efficiency are great at lower clocks.
Console ports sometimes running like crap won't be solved by the low-latency interconnect that APUs have, though (believe me, I tried with my APUs; heck, my old Kaveri APU doesn't even have an L3 cache on the CPU side, which hurts its performance). It all boils down to optimizing a native version for one console versus porting to a myriad of hardware/software setups.
As for APU clock speeds in general: with a mobile TDP and the cooling solution used, you likely won't reach the maximum advertised clocks on the CPU and iGPU at the same time, because when the TDP limit is reached, one or the other or both will be throttled down. Undervolting and over- or underclocking as a practice was just my experience with AMD products, since their GPUs were notorious for coming overvolted from the factory; you could easily undervolt them and get much better thermals without any loss of performance. If you underclocked to boot, you would get insanely low power consumption, especially on GPUs with HBM of any kind. Sorry for the confusion; I didn't mean to imply that voltage and clock speed are rigidly linked.
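The undervolting arithmetic above can be made concrete with the usual first-order model for dynamic (switching) power, P proportional to C * V^2 * f. This is a sketch under that simplification only; it ignores static/leakage power, and the ratios below are made-up illustrative numbers:

```python
def relative_dynamic_power(v_ratio: float, f_ratio: float) -> float:
    """First-order dynamic power model: P is proportional to C * V^2 * f.

    Returns power relative to stock (1.0) for given voltage and frequency
    ratios. Leakage/static power is ignored, so real savings differ, but
    the V^2 term is why undervolting alone buys so much thermal headroom.
    """
    return (v_ratio ** 2) * f_ratio

# Undervolt by 10% at stock clocks: roughly 19% less dynamic power.
print(round(relative_dynamic_power(0.90, 1.00), 3))
# Undervolt by 10% and underclock by 20% as well: roughly 35% less.
print(round(relative_dynamic_power(0.90, 0.80), 3))
```

The quadratic voltage term is what made factory-overvolted cards such easy wins: a modest voltage drop cuts heat far more than a matching clock drop would.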
CPU<->GPU transfer speeds on the iGPU, or other performance differences between the devices.
In fact, I have even found a case where a software renderer is faster than using the integrated GPU (and probably faster than a dGPU): the glxinfo program. (Granted, glxinfo doesn't actually display anything; all it does is check what capabilities the graphics hardware/driver supports, but the point still stands, even if it's not a typical use case.)
Post edited July 19, 2021 by dtgreene

Spectrum_Legacy
The baddest of good guys
Registered: Jun 2013
From Slovakia
Posted July 19, 2021

(Keeping the game turn based, with an option to control the battle speed, can help with the visual overload issue.)

CPU<->GPU transfer speeds on the iGPU, or other performance differences between the devices.
In fact, I have even found a case where a software renderer is faster than using the integrated GPU (and probably faster than a dGPU): the glxinfo program. (Granted, glxinfo doesn't actually display anything; all it does is check what capabilities the graphics hardware/driver supports, but the point still stands, even if it's not a typical use case.)
Iirc there were some programs that benefited from having an Intel iGPU specifically; I think it was from Adobe or some other company, I don't remember exactly. Those programs certainly weren't 3D games, though. Unfortunately I don't have any weak dGPU in the league of my strongest APU to duel them in a benchmark of sorts. It would be a fun deathmatch though, I'm sure.

dtgreene
vaccines work she/her
Registered: Jan 2010
From United States
Posted July 20, 2021
By the way, a benchmark of glxinfo on a Raspberry Pi (with output redirected to /dev/null, to eliminate the cost of actually writing the output to the terminal window):
* On hardware, 0.650 seconds.
* With llvmpipe, 0.280 seconds.
(Also, I note that godot3 will refuse to start on hardware, even though it seems like it should (the Pi 4 is supposed to support OpenGL ES 3.0 (and even 3.1), but godot3 claims the support is missing), but will start with llvmpipe.)
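For anyone who wants to reproduce numbers like the above, here is a minimal timing harness (a sketch, not a rigorous benchmark). It assumes glxinfo is installed and that the system uses Mesa, whose LIBGL_ALWAYS_SOFTWARE=1 environment variable forces the llvmpipe software driver:

```python
import os
import shutil
import subprocess
import time

def time_command(argv, extra_env=None, runs=3):
    """Run a command with its output discarded and return the best
    wall-clock time in seconds (best of `runs` reduces scheduling noise)."""
    env = dict(os.environ, **(extra_env or {}))
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, env=env, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        best = min(best, time.perf_counter() - start)
    return best

if shutil.which("glxinfo"):
    hw = time_command(["glxinfo"])
    # LIBGL_ALWAYS_SOFTWARE=1 tells Mesa to use the llvmpipe driver.
    sw = time_command(["glxinfo"], {"LIBGL_ALWAYS_SOFTWARE": "1"})
    print(f"hardware: {hw:.3f}s   llvmpipe: {sw:.3f}s")
```

Discarding stdout matters here for the same reason the /dev/null redirect does above: otherwise you're partly benchmarking the terminal.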

kohlrak
One Sooty Birb - Available on DLsite.com, not
Registered: Aug 2014
From United States
Posted July 20, 2021
low rated



As for clock speed, either I'm misinterpreting what you're saying or it's a non-issue: clock speed and voltage are not necessarily linked. On the contrary, a lower voltage means you can clock higher, since the chip won't generate as much heat, though you may need to keep a CPU's clock lower for reasons other than heat (stabilization of the results of individual instructions).
Though with the Steam Deck I'm less worried, since it at least uses LPDDR5. Makes one wonder how it would perform with something like a stack of HBM2, but the price would skyrocket.
An APU is by design a compromise by choice: the CPU portion of APUs in general has less cache than desktop-only architectures, the iGPU portion has fewer CUs, and to top it off they share the TDP envelope, as I already said. There is only so much you can do with those 15 W. For mobile usage it's good enough; especially the size and efficiency are great at lower clocks.
The APUs have managed to make some things playable that you would never have expected to be playable, too. While indeed it's not magic, you'll find that the bottleneck is caching and the like. We've gotten to the point where it's not how much RAM you have, but how much cache, and how fast the RAM is. Moreover, bottlenecks tend to follow a Pareto distribution, which means that once you manage to get the whole thing in cache, almost all the other memory-hogging things are just textures, models, etc., which the GPU is focused on, and which aren't going to be fetched nearly as often per second as everyone thinks.
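The point about cache (rather than RAM size) being the bottleneck can be illustrated, roughly, even from Python. Interpreter overhead blunts the effect, but traversing a buffer larger than the CPU caches with a big stride usually measures slower than a sequential pass, even though both perform exactly the same number of additions. A sketch (buffer size and stride are arbitrary illustrative choices):

```python
import time

def traverse(buf, stride):
    """Sum every byte of buf, visiting it with the given stride.

    stride=1 walks memory sequentially (cache friendly); a stride much
    larger than a cache line touches a new line on nearly every access.
    Both orders visit every byte exactly once, so the work is identical.
    """
    n = len(buf)
    total = 0
    for start in range(stride):
        for i in range(start, n, stride):
            total += buf[i]
    return total

buf = bytes(2 ** 22)  # 4 MiB of zeros, bigger than most L2 caches

for stride in (1, 4096):
    t0 = time.perf_counter()
    checksum = traverse(buf, stride)
    elapsed = time.perf_counter() - t0
    print(f"stride {stride:4d}: {elapsed:.3f}s (checksum {checksum})")
```

In a compiled language the gap is far more dramatic, which is exactly why cache size and memory latency, not RAM capacity, dominate game performance on APUs.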
Console ports sometimes running like crap won't be solved by the low-latency interconnect that APUs have, though (believe me, I tried with my APUs; heck, my old Kaveri APU doesn't even have an L3 cache on the CPU side, which hurts its performance). It all boils down to optimizing a native version for one console versus porting to a myriad of hardware/software setups.
Except these APU drivers actually do reduce latency. Another thing is the age-old trick of making sure your game and only the bare minimum of services are running, because task switching is killer on your caches.
As for APU clock speeds in general: with a mobile TDP and the cooling solution used, you likely won't reach the maximum advertised clocks on the CPU and iGPU at the same time, because when the TDP limit is reached, one or the other or both will be throttled down. Undervolting and over- or underclocking as a practice was just my experience with AMD products, since their GPUs were notorious for coming overvolted from the factory; you could easily undervolt them and get much better thermals without any loss of performance. If you underclocked to boot, you would get insanely low power consumption, especially on GPUs with HBM of any kind. Sorry for the confusion; I didn't mean to imply that voltage and clock speed are rigidly linked.
You're never going to see maximum clock rates with any of this technology, regardless. Something has to lock up for syncing somewhere at some point, and that'll be where you spend most of your downtime anyway. This is another reason why APUs can present magic. The trick would be some sort of external connector to allow direct connections of the GPU and CPU as external entities still sharing that MCU (yeah, not easy, I'm aware, and it would likely require re-pasting). Obviously we need proper "desktop-grade" APUs to see the tech fully flourish, because, as you said, they're still cutting costs on them. Caches need to be expanded, coders need to start looking at their code sizes, and compiler makers need to start improving linking. Microsoft's toolchain can import functions based on whether or not they're actually used, and GCC and such cannot, which is a huge bottleneck.
As for this device, I wouldn't preorder it until I see tech-heavy reviews, how it performs, what options it offers to prosumers, etc., and then again I'm not much interested in a handheld without an OLED display (or microLED a decade from now).
I wouldn't rely on reviews, especially after we semi-recently had a nice little thread here where people were complaining about vsync, only for vsync to be reinvented and called something else ("enhanced sync" or something like that; they basically just took away stability features that weren't strictly necessary and shouldn't have been part of vsync to begin with) by the GPU makers. And every reviewer I found was eating it up like it's fancy new technology. It was like watching someone invent "enhanced diecast tires" where you can separate the inner tube from the tire treads, call it new technology, and have everyone go gaga over it.
kohlrak
One Sooty Birb - Available on DLsite.com, not
Registered: Aug 2014
From United States
Posted July 20, 2021
low rated


Has there been any reports of a system with both integrated and dedicated GPUs running the game better on the integrated GPU? (One requirement: The dedicated GPU must at least meet the stated minimum requirements of the game, to avoid contrived situations like a modern APU paired with an ancient graphics card.)
I mean, you gotta think of it this way. You have a CPU that only goes so fast and a GPU that only goes so fast (their individual cores can thread, but for the sake of simplicity we'll treat each like a single thread), and they also have to communicate with one another to figure out how, where, and what goes on that screen. The GPU's job is to put it there, and the CPU's job is to tell it what to put where and how. If the CPU's choking on AI code, the GPU's going to end up waiting for the CPU to hurry the hell up. If the GPU's choking, the CPU could end up waiting for the GPU, which also means the game logic is running slower, too. Sometimes they both have to do the extra task of sending and receiving lots of data, like textures, which means that while they're both busy, they're also hurrying up and waiting to do their real jobs.
This, of course, all gets magnified when you realize there's multiple cores, multiple threads, threads locking other threads for syncing (yeah, sometimes threading makes things run slower), unrelated processes, separate hardware (yeah, the SPU could use some stuff to do, too, and we all gotta wait for the hard drive to load the textures and models), and dare we wait for the next packet in an online game, or are we using non-blocking sockets?
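The hurry-up-and-wait behaviour described above can be put into a toy model. In a pipelined renderer the CPU prepares frame N+1 while the GPU draws frame N, so the steady-state frame time is the slower of the two stages; work that serializes the two (say, a big texture upload that stalls both) pays the sum instead. The numbers below are purely illustrative:

```python
def frame_times(cpu_ms, gpu_ms):
    """Toy model of one frame.

    pipelined: CPU and GPU overlap, so the slower stage sets the pace.
    serial:    the stages wait on each other, so their costs simply add.
    """
    pipelined = max(cpu_ms, gpu_ms)
    serial = cpu_ms + gpu_ms
    return pipelined, serial

# CPU choking on AI code (12 ms) while the GPU only needs 6 ms:
pipe, ser = frame_times(12, 6)
print(f"pipelined: {pipe} ms/frame (~{1000 // pipe} fps)")
print(f"serial:    {ser} ms/frame (~{1000 // ser} fps)")
```

The model also shows why speeding up the faster stage accomplishes nothing in the pipelined case: only the choking side moves the frame time.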


Post edited July 20, 2021 by kohlrak

Phasmid
New User
Registered: Apr 2012
From New Zealand
Posted July 20, 2021

Would still be unlikely to make APUs truly competitive with discrete systems, though. The bottleneck would shift from slow/limited memory bandwidth to balancing heat dissipation and power across cores and frequency; and the RAM layer would likely have to be expensive: HBM for 'commercial' RAM, if not actual SRAM like the V-Cache is. Should still give a lot better performance than current APUs, though.

kohlrak
One Sooty Birb - Available on DLsite.com, not
Registered: Aug 2014
From United States
Posted July 20, 2021
low rated


Would still be unlikely to make APUs truly competitive with discrete systems, though. The bottleneck would shift from slow/limited memory bandwidth to balancing heat dissipation and power across cores and frequency; and the RAM layer would likely have to be expensive: HBM for 'commercial' RAM, if not actual SRAM like the V-Cache is. Should still give a lot better performance than current APUs, though.
Given the space between the CPU and the RAM as it is, and all the talk about how "RAM is cheap" (from all the anti-optimization coders out there), I think it's high time we consider actually giving the CPU a dedicated, separate memory controller for VRAM instead. When you have a separate GPU, like in most desktop setups, you're still using some old transfer mechanisms, and it all has to be shared with the other hardware. I remember a microcontroller I have had a unique setup in that you used memory operations to send certain commands and address certain areas of VRAM, because it was wired such that the external RAM controller was hooked directly to the GPU (all low-clocked and low-tech, mind you). The hardest part of this route is figuring out which programs should have permission to run the instructions. There's also an incredible opportunity here to get rid of graphics drivers altogether by dropping this into shaders and coming up with a standard "calling convention" between shaders and the CPU.

dtgreene
vaccines work she/her
Registered: Jan 2010
From United States

kohlrak
One Sooty Birb - Available on DLsite.com, not
Registered: Aug 2014
From United States
Posted July 20, 2021
low rated


Also, LLVM has Link Time Optimization, which can do this sort of thing.

Yeshu
The Pillar Man
Registered: Jan 2011
From Poland
Posted July 20, 2021

The Steam Controller pretty much failed, as did Steam Machines (I had actually forgotten about that last one until now).
It's also not cheap, and I don't think it's something they can keep exclusive like console hardware, since it just runs Windows or Linux games, so every farm shed in China can poop out something similar for cheaper.

Like a $10 Walmart gamepad.

Orkhepaj
SuperStraight Win10 Groomer Smasher
Registered: Apr 2012
From Hungary
Posted July 20, 2021
low rated

kohlrak
One Sooty Birb - Available on DLsite.com, not
Registered: Aug 2014
From United States
Posted July 20, 2021
low rated
Yes. I held one in my hands once or twice, and the first thing I said to my GF, who owned it, was "I sure as hell hope you paid no more than 15 bucks for this." Fortunately for her, it was a gift. It wasn't all that fun to play with, either. Something was funky with the left stick or d-pad or something, I remember, but I forget what. It had some weird feedback system, too.
Post edited July 20, 2021 by kohlrak

Orkhepaj
SuperStraight Win10 Groomer Smasher
Registered: Apr 2012
From Hungary
Posted July 20, 2021
low rated
