genericUNDEAD: Does www.gogdb.org do any web scraping that could target the Language Audio field specifically, or is the GOGDB website only capable of GOG API calls?
I think you are asking the wrong questions here. The GOG website itself, much like any other modern website out there, gets its data from "GOG API calls", commonly called "the backend".

https://api.gog.com/v2/games/1?locale=en-US

Replace ID 1 with any game ID you're interested in and look at the "localizations" section. You'll notice it's the same data that's displayed on a product page, regardless of whether it's on the GOG website or in Galaxy.

There is no need whatsoever to do any sort of web scraping. That being said, the GOG website doesn't offer an easy way to search for "audio" languages only, but the data is there, so technically GOG DB could provide an additional filter.
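As an illustration of reading that "localizations" section, here is a minimal Python sketch. Only the URL comes from the post above; the JSON layout (an `_embedded.localizations` list whose items hold `language` and `localizationScope` under their own `_embedded` key) is an assumption about the response, and `fetch_game` and `audio_languages` are hypothetical helper names.

```python
import json
import urllib.request

# URL pattern taken from the post above; the game ID is filled in per request.
API_URL = "https://api.gog.com/v2/games/{game_id}?locale=en-US"


def fetch_game(game_id):
    """Download the product JSON for one game ID (performs a network request)."""
    with urllib.request.urlopen(API_URL.format(game_id=game_id)) as resp:
        return json.load(resp)


def audio_languages(game):
    """Return the names of languages that have an audio localization.

    ASSUMPTION: game["_embedded"]["localizations"] is a list, and each item
    stores its language and scope under its own "_embedded" key. Adjust the
    keys if the real response is shaped differently.
    """
    names = []
    for loc in game.get("_embedded", {}).get("localizations", []):
        inner = loc.get("_embedded", {})
        if inner.get("localizationScope", {}).get("type") == "audio":
            names.append(inner.get("language", {}).get("name"))
    return names
```

Calling `audio_languages(fetch_game(some_id))` for each product ID would give the data an "audio only" filter needs, without any scraping of the rendered pages.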
Post edited December 16, 2024 by WinterSnowfall
WinterSnowfall: There is no need whatsoever to do any sort of web scraping. That being said, the GOG website doesn't offer an easy way to search for "audio" languages only, but the data is there, so technically GOG DB could provide an additional filter.
So GOGDB could do it using the official API? I'm not a programmer, I'm just exhausted from going page to page, scrolling down to the Language Audio section and leaving disappointed. I figured Python scraping would be the method since GOG doesn't have a filter like this already.
genericUNDEAD: So GOGDB could do it using the official API?
GOG DB only uses the official API to request all games and build a custom index for searching. It unfortunately does not support detailed filters at this time.
Yepoleb: This should be fixed now. I was aware of the issue, but didn't have the time to code a fix until now.
So what is your solution to the 10k products limit? Do you query twice and join the results? Maybe alphabetically A>Z and Z>A, as suggested by others?

edit: Found your Github repo. Now I'm trying to understand updater.py and the get_catalog method.
I see you implemented a different "pagination_method", called "search_after".
Is it possible to query catalog pages beyond the "total_pages" limit? For example: If the API reports 100 pages of 100 products each, can you simply query page #101 and onwards to get the missing product entries beyond 10k? (At least until the result is truly empty.)
Post edited December 17, 2024 by g2222
genericUNDEAD: So GOGDB could do it using the official API? I'm not a programmer, I'm just exhausted going page to page scrolling down to the Language Audio section and leaving disappointed. I figured Python scraping would be the method since GOG doesn't have a filter like this already.
May I ask what your purpose is with trying to filter for audio language? I'm working on a GOG search product of my own, with explicit focus on filtering (will still take me 2+ months to launch, I suspect), and currently figuring out the interface.

I can see the use case of wanting to know if a product is "fully localized" in one's preferred language, but that also applies if it's a game where all dialog is text-only. Would that suffice for you, or is there another reason for explicitly wanting to filter on audio? (Trying to learn a foreign language? ...?)
gogtrial34987: Trying to learn a foreign language?
This, but plenty of countries just prefer dubbed content. Italy dubs everything.

I've actually been surprised by how much pushback I've gotten for even asking about this as a feature on these forums and elsewhere. I assumed that every salient point of data should be filterable on the Store and only the most frivolous disregarded, so even GOG overlooking it is kind of astounding to me.
genericUNDEAD: I assumed that every salient point of data should be filterable on the Store and only the most frivolous disregarded
Building a comprehensible interface for such filtering is really quite hard. You might remember magog, which was amazing in its capabilities, but not something which your average visitor would be able to make effective use of. I'm personally aiming for somewhere between gog and magog, and really have my work cut out for me, trying to determine which levels of complexity to hide, which to transform or group, and which have real utility.

Anyway, thanks for the answer! No promises, but I'll try to incorporate your use case. :)
Post edited 5 days ago by gogtrial34987
gogtrial34987: Building a comprehensible interface
My user intuition on the GOG Store pages was to click the checkboxes themselves in the Language grid and for GOG to open a new tab that lists every game with that language feature. The language names could also serve as a link to the currently indiscriminate results.

Thanks for the hope. I do think a lot of people will use a "Language Audio" search filter.
Post edited 3 days ago by genericUNDEAD
g2222: Is it possible to query catalog pages beyond the "total_pages" limit? For example: If the API reports 100 pages of 100 products each, can you simply query page #101 and onwards to get the missing product entries beyond 10k? (At least until the result is truly empty.)
To get more than 10000 results you have to completely ignore the page numbers and only use search_after. Theoretically it supports other sorting methods than asc:externalProductId, but I have not figured out how to use them. So I use order=asc:externalProductId and search_after=123456789 as query parameters, with search_after being the last product ID on the current page, and repeat until the returned products list is empty. See default_params in catalog_worker for my query parameters and get_catalog for the search_after logic.
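The search_after loop described above can be sketched roughly as follows. This is not the code from updater.py: `fetch_all_products` and `fetch_page` are hypothetical names, and the response shape (a "products" list whose items carry an "id") is an assumption. Only the `order` and `search_after` parameter names and the stop-on-empty rule come from the post.

```python
def fetch_all_products(fetch_page, order="asc:externalProductId"):
    """Collect every product by cursor instead of by page number.

    fetch_page is any callable that takes a dict of query parameters and
    returns the decoded JSON response. ASSUMPTION: the response has a
    "products" list and each product has an "id" field.
    """
    products = []
    params = {"order": order}  # first request carries no cursor
    while True:
        items = fetch_page(params).get("products", [])
        if not items:
            break  # an empty list means the catalog is exhausted
        products.extend(items)
        # Continue after the last product ID seen on this page.
        params = {"order": order, "search_after": items[-1]["id"]}
    return products
```

Because the cursor is the last ID rather than a page index, the loop never depends on the reported "total_pages" value, which is why it is not bound by the 10k limit on page-based queries.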
Tiny thing, but I just noticed that the prices aren't actually displayed with two decimal places. Since the base prices hardly ever end in .x0, that's very rarely noticeable, but now it was: the Fell Seal DLC Bundle has a base price of $5.80, which is displayed as $5.8. Did a double take when I first saw that.
Thanks for the bug report!
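For reference, the price display issue reported above comes down to formatting with a fixed number of decimal places. A minimal Python sketch, with `format_price` as a hypothetical helper that is not taken from the GOG DB code:

```python
from decimal import Decimal


def format_price(amount, symbol="$"):
    """Format a price with exactly two decimal places, e.g. 5.8 -> "$5.80"."""
    # Going through str() avoids binary float artifacts before rounding.
    value = Decimal(str(amount)).quantize(Decimal("0.01"))
    return f"{symbol}{value}"
```

Using `Decimal` rather than float formatting is a design choice for currency values; an f-string such as `f"{amount:.2f}"` would also pad the trailing zero.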