[Feature Suggestion / Discussion] Possibility of integration of scraped affixes #73

Open
xkomachi opened this issue Oct 6, 2017 · 13 comments

@xkomachi
Contributor

xkomachi commented Oct 6, 2017

I don't have upload permission, so I uploaded those to my fork:
xkomachi@48a943b

I got the files from queries on the wiki's database (further results -> json, then modify the url parameters. You can't get over 1000 results per query, so set an offset if needed). This issue is for discussing the possibility of integrating those into the main program.
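The paging described above could be sketched roughly as follows. The `limit` and `offset` parameter names and the example base URL are assumptions for illustration, not confirmed wiki query parameters; only the 1000-result cap comes from the post.

```javascript
// Cap per query mentioned in the post.
const MAX_RESULTS_PER_QUERY = 1000;

// Build the URL for one page of results.
// ASSUMPTION: the endpoint takes `limit` and `offset` query parameters;
// check the real names against a URL copied from "further results".
function pageUrl(baseUrl, pageIndex) {
  const url = new URL(baseUrl);
  url.searchParams.set("limit", String(MAX_RESULTS_PER_QUERY));
  url.searchParams.set("offset", String(pageIndex * MAX_RESULTS_PER_QUERY));
  return url.toString();
}

// A full page suggests there may be more results left to fetch.
function mayHaveMore(resultCount) {
  return resultCount >= MAX_RESULTS_PER_QUERY;
}

// Hypothetical base URL, used only to show the shape of the output.
const thirdPage = pageUrl("https://example.invalid/query?format=json", 2);
```

The idea is to keep requesting pages while `mayHaveMore` returns true for the previous page, bumping `pageIndex` each time.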

Upsides:

  • Possibly adds missing affixes, solving many future issues.
  • Has better type identification and details the possible ranges for affixes, although whether that has any use remains to be seen. Reducing it to the current affix listing format should not be too complex to implement.
  • Once initial work required for processing of the files is done, updates to the game that add new affixes can be done more easily.

Downsides:

  • Sheer size of files - this adds to bloat and there's no real reason for those to be in a downloaded release instead of the parsed files, but I'm not sure if the current release method allows for this. If not, compression should be handy.
  • The data originates from the game files, resulting in many internal test affixes popping up for things that didn't make it into the game (like ele pen per frenzy charge or duplicates stats from your other ring).
  • The indexing for unique flasks seems to be wonky, and some of the flask mods appear in the general item affixes section (like atziri's promise). It's possible other affixes are also in the wrong category.
  • Stability of the wiki may be a concern (at least for me, queries sometimes returned partial data - the files I uploaded should be good as far as I'm aware, but I didn't verify).

Notes:

  • Unsure about the significance, but html scraping may be easier for you to work with. It wouldn't be more difficult for me to set up individual pages (see https://pathofexile.gamepedia.com/User:Xkomachi/mod_test_2).
  • The wiki also lists spawn weights (i.e. rules on where the affixes can spawn - only on a dexterity base, only on gloves, etc. - and while I'm unsure, I think the weights also signify the % chance of rolling the specific affix in the given tier). Again, I am unsure of how much use this is, but I feel like I should mention it for the sake of completeness, in case something like estimating item cost via roll chances ever crossed your mind.
  • There does seem to be a way to separate essence rolls from normal rolls. I'm willing to try, if need be.
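If the spawn weights do work the way guessed in the notes above (which is unconfirmed), turning them into roll chances could look something like this sketch. The mod names and weight values are made up, and the "weight divided by total weight" rule is an assumption, not something the wiki data is known to guarantee.

```javascript
// ASSUMPTION: a mod's chance is its weight divided by the sum of the
// weights of all mods competing for the same roll.
function rollChances(mods) {
  const total = mods.reduce((sum, mod) => sum + mod.weight, 0);
  if (total === 0) {
    return mods.map((mod) => ({ name: mod.name, chance: 0 }));
  }
  return mods.map((mod) => ({ name: mod.name, chance: mod.weight / total }));
}

// Made-up names and weights, for illustration only.
const exampleChances = rollChances([
  { name: "hypothetical mod A", weight: 1000 },
  { name: "hypothetical mod B", weight: 500 },
  { name: "hypothetical mod C", weight: 500 },
]);
```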
@licoffe
Owner

licoffe commented Oct 9, 2017

Hi Xkomachi,

Really nice work with the queries!!

The files generated from these queries are way too big but if we could streamline them, extracting the relevant data and store it using the same format I use (which is basically what the Wiki displays here), it could be really great!

I completely agree with the advantages you pointed out. Having a way to keep the mods up to date without checking the mod list every single time would be a great time saver and would help us have accurate information.

I don't see really big issues with the downsides, except for the partial data which may be a problem. How did you notice it was partial? Did you get a corrupted JSON output?

Would you have a link to the Wiki query documentation? Not sure where to start with them.

If you have the time and you're up to the task to develop a tool which would pull data from the wiki and generate the right JSON output, by all means, feel free to help ;)

@xkomachi
Contributor Author

I'm 90% sure the files can be squashed to something usable for you. My touch with JS is mostly from userscripts, so while it might not be 100% compatible I'll see if I can get any basic structure or maybe a demo going.

As for partial data - from manually checking the files, what happened to me was that the same line(s) were missing from every object throughout a file, and it happened more than once. You'll need to perform validity checks rigorously.
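A validity check along those lines could be as simple as the sketch below: decide which fields every entry must have and flag any entry that lacks one. The field names in the example are hypothetical, not the wiki's actual property names.

```javascript
// Return the index and missing keys of every record that lacks any of
// the expected keys. `requiredKeys` is whatever field list you decide
// each entry must carry.
function findIncomplete(records, requiredKeys) {
  const bad = [];
  records.forEach((record, index) => {
    const missing = requiredKeys.filter(
      (key) => !Object.prototype.hasOwnProperty.call(record, key)
    );
    if (missing.length > 0) bad.push({ index, missing });
  });
  return bad;
}

// Hypothetical field names, for illustration only.
const problems = findIncomplete(
  [{ id: "m1", text: "a" }, { id: "m2" }],
  ["id", "text"]
);
```

With the kind of partial data described above, where the same line goes missing from every object, this would flag every record in the file, which is a clear signal to re-run the query.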

The wiki docs.. well, they say that any documentation is better than no documentation. The relevant section for what I did is https://pathofexile.gamepedia.com/Template:SMW_mod_table . Best of luck.

I don't blame you if you found this all confusing, but there are JSON-generating URLs. I'll try to round them all up if it helps. Resource fetching from URL is something electron should be natively capable of, right? The update model is up to you.

@xkomachi
Contributor Author

xkomachi commented Oct 11, 2017

xkomachi@77d6485

A mostly functional demo: it fetches a single file and prints to the console the list of affixes it read, in your format (from what I could tell).

This is only the core. Some mods come as two lines and I wasn't sure how best to deal with them - but you probably don't need more regex: just check whether the string contains the line separator, split accordingly, then iterate over the parts. You still need to push them to an array, add the matching style (i.e. [unique explicit], [jewel corrupt], enclosing in quotes, adding ": null," to everything but the last entry), remove duplicates, try to catch anything that can go wrong, then handle multipart json files (uniques) and merge it all into one big file. It's a bit of work, but none of it should be complex, just tedious: you have to check that everything goes correctly, and you need to have the design you want things to look like in your head.
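A rough sketch of the split / dedupe / merge steps follows. It assumes the two-line mods are joined by "<br>", which is a guess and should be swapped for whatever separator the data actually uses, and it leaves out the [unique explicit] / [jewel corrupt] style tagging.

```javascript
// ASSUMPTION: two-line mods are joined by "<br>" in the fetched text.
const LINE_SEPARATOR = "<br>";

// Split multi-line mods into parts, trim them, and drop duplicates.
function collectAffixes(rawMods) {
  const seen = new Set();
  for (const raw of rawMods) {
    for (const part of raw.split(LINE_SEPARATOR)) {
      const text = part.trim();
      if (text !== "") seen.add(text);
    }
  }
  return Array.from(seen);
}

// Build the `"text": null` map. JSON.stringify takes care of the quotes
// and of leaving the trailing comma off the last entry.
function toAffixMap(affixes) {
  const map = {};
  for (const affix of affixes) map[affix] = null;
  return map;
}

// Merge multipart results (e.g. the uniques files) into one map.
function mergeParts(parts) {
  const maps = parts.map((part) => toAffixMap(collectAffixes(part)));
  return Object.assign({}, ...maps);
}
```

Building a plain object and serializing it with `JSON.stringify` sidesteps the manual quoting and comma handling, and duplicate keys collapse on their own when the parts are merged.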

Oh, and GM_xmlhttpRequest is a custom wrapper function; the plain JS equivalent is, I believe, XMLHttpRequest with open, send, and an onload event handler attached, but maybe electron has a wrapper that's more comfortable to work with.
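For reference, the plain XMLHttpRequest version (open, attach onload, send) might look like the sketch below. The `fetchJson` name and the injectable `XhrCtor` argument are inventions of this sketch, the latter so it can be tried outside a browser or Electron renderer; it is not how the demo in the fork is written.

```javascript
// Fetch a URL and hand the parsed JSON to a Node-style callback.
// `XhrCtor` is optional; it defaults to the global XMLHttpRequest that
// browsers and Electron renderer processes provide.
function fetchJson(url, onDone, XhrCtor) {
  const Ctor = XhrCtor || XMLHttpRequest;
  const xhr = new Ctor();
  xhr.open("GET", url);
  xhr.onload = function () {
    if (xhr.status !== 200) {
      onDone(new Error("HTTP " + xhr.status), null);
      return;
    }
    try {
      // A truncated or corrupted response will fail to parse here.
      onDone(null, JSON.parse(xhr.responseText));
    } catch (err) {
      onDone(err, null);
    }
  };
  xhr.onerror = function () {
    onDone(new Error("network error"), null);
  };
  xhr.send();
}
```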

@licoffe
Owner

licoffe commented Oct 12, 2017

Thanks a lot for your help!!

I started modifying your script to generate the right output (see affix-parse.js and out.json). I focused on generating an output close to the content of affixes.json for now. It's not perfect, but it's getting somewhere.

I'm still wondering a few things about querying:

  • Is there a way to query items of a specific category, for example one-hand weapons?
  • Is there a way to query specific item types, for example Fishscale Gauntlets?
  • Is it possible to get the required level for an affix?

Thanks again for your help!

@xkomachi
Contributor Author

xkomachi commented Oct 12, 2017

1&2: Kinda yes and no. See SMW item table. I'm not sure that's what you mean though - do you mean "list of mods that can roll on fishscale gauntlets"? If so, afaik (but admittedly there is a lot I don't know, so ask around) there's no way to do it, because mods use weights to spawn, and those are mostly tied to item base classes (so "int shields" might be indexed, but there's no way to specify a particular base). I'm not sure if there's a way to query specifically by a weight, but I do know how to list them at least (as I noted in the first post).

3 - Yes. I see you're building the urls modularly, so the url for normal armour prefixes is this

@licoffe
Owner

licoffe commented Oct 12, 2017

Thanks! I included the mod level requirements.

I found another way to get the prefix/suffix/corrupted mods for item categories (one handed axes, gloves, etc...). Using Special:Browse, I inspected List of claw modifiers and extracted the following queries :

However, it seems that each of these queries is missing some mods.

@xkomachi
Contributor Author

The weight equivalent for corrupted mods can be found in this link, which should list everything.

@xkomachi
Contributor Author

xkomachi commented Oct 12, 2017

To elaborate: A "fixed" corrupted mods link would look like this (the table is broken; you need to add newlines before the question marks, but it appears those get lost in the url). However, you then run into the issue I was talking about earlier.

Look at a modifier that didn't appear before and does now: cold leech . What we were doing was searching for mods that spawn via corruption on armour items, then filtering those mods to only include mods that contain either the tag onehand, claw, or one_hand_weapon . Adding the weapon tag to the tag list solves the mods not being listed, however at this point (to be fair, even before this point) nothing guarantees the mods you are listing can spawn specifically on a claw.

Because the cold leech corruption has the tags amulet, quiver, two_hand_weapon, weapon, default there is fundamentally no way a tag-based filter could correctly list only claw mods. Also, there is absolutely no information on the mod itself that has anything to do with only claws.

This is what I've been trying to say all along, but I hope this makes it clearer.
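The cold leech case can be reproduced with a small tag filter. The tag lists are the ones quoted above; the function and variable names are made up for this sketch.

```javascript
// Keep mods that carry at least one of the wanted tags.
function filterByTags(mods, wantedTags) {
  const wanted = new Set(wantedTags);
  return mods.filter((mod) => mod.tags.some((tag) => wanted.has(tag)));
}

// Tags of the cold leech corruption as listed above.
const coldLeech = {
  name: "cold leech",
  tags: ["amulet", "quiver", "two_hand_weapon", "weapon", "default"],
};

// The original filter misses cold leech entirely.
const narrow = filterByTags([coldLeech], ["onehand", "claw", "one_hand_weapon"]);

// Adding "weapon" brings it back, but only via the generic tag, so the
// match says nothing about claws specifically.
const wide = filterByTags(
  [coldLeech],
  ["onehand", "claw", "one_hand_weapon", "weapon"]
);
```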

@licoffe
Owner

licoffe commented Oct 13, 2017

> Because the cold leech corruption has the tags amulet, quiver, two_hand_weapon, weapon, default there is fundamentally no way a tag-based filter could correctly list only claw mods. Also, there is absolutely no information on the mod itself that has anything to do with only claws.

Damn, I see your point and I wonder why they're not using more precise tagging. This is a serious limitation. I just won't be able to generate affixes.json out of the wiki (which is paradoxical, since it's built on Item Affix and I don't understand how they generate this list). At this point, I'm seriously thinking about web scraping the wiki or PoeDB.

EDIT: Found the Lua code used to generate the affix list.

@xkomachi
Contributor Author

xkomachi commented Oct 13, 2017

I don't know what you want to do with the lua code - that's even more internal and has to do with building the database itself and responding to queries.

Item affix works with the item modifiers template, which you can see an example usage of here. Again, I can't be sure, but as far as I can figure it uses - you guessed it - mod tags. I really don't think digging around templates is going to help, because the information you're trying to get either doesn't even exist on ggg's side, or isn't indexed by the transformation to the wiki's db. Even if it's the latter, while no one is stopping you from doing it, writing lua code to parse it is going to be very difficult (seeing as there are very few people who understand it and very little documentation), and imo it's both too far outside the scope of this project and too time-consuming to be worth discussing here.

@xkomachi
Contributor Author

xkomachi commented Oct 13, 2017

There is always the option of just rolling with it :) There may be absolutely no guarantee that the mod lists accurately answer what you meant with the query, but it is a fact that at the current point in time they do kinda work.

Do keep in mind that essences are in an even worse state - no tags.

You should try and talk to the poedb dev if you plan to scrape from there - I've seen him around and he seems friendly enough. Perhaps he can provide some insight, and maybe also give the mods in an easier format for you.

Edit: Not sure how much sense it makes, but after looking more closely at your files, perhaps this isn't needed. Think about this in reverse: If you try to list mods that can only apply to claws, there's nothing the wiki can help you with. However, if you want to list all mods that a claw could possibly get, then there is nothing wrong with querying for tags that are very general, like 'weapon', because claw is a subtype of weapon, thus any mod that can roll on any weapon can inherently roll on a claw.
Essences are still looking pretty bad, though..
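The "reverse" approach can be sketched as: expand the item's tag into itself plus every more general tag, then keep any mod carrying one of those. The parent links below are hypothetical and only illustrate the idea that claw is a subtype of weapon; the real relationships would need checking against the wiki data.

```javascript
// HYPOTHETICAL parent links, for illustration only.
const PARENT_TAG = new Map([
  ["claw", "one_hand_weapon"],
  ["one_hand_weapon", "weapon"],
]);

// Collect a tag plus all of its more general ancestors.
function withAncestors(tag) {
  const tags = [];
  let current = tag;
  while (current !== undefined && !tags.includes(current)) {
    tags.push(current);
    current = PARENT_TAG.get(current);
  }
  return tags;
}

// Every mod carrying any of those tags is one the item could possibly get.
function possibleMods(mods, itemTag) {
  const tags = new Set(withAncestors(itemTag));
  return mods.filter((mod) => mod.tags.some((tag) => tags.has(tag)));
}
```

This lists an upper bound (mods a claw could possibly get), not mods that are exclusive to claws, which matches the limitation discussed above.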

@licoffe
Owner

licoffe commented Oct 16, 2017

I guess I will use 2 strategies here:

  • I will go with your approach using wiki queries to generate the list of all affixes (affix-completion.json)
  • I will scrape the wiki or maybe contact Chuanhsing (poedb dev I believe) as you proposed to see if he offers some kind of api to generate affixes.json

@licoffe licoffe added the Feature label Nov 3, 2017
@licoffe licoffe self-assigned this Nov 3, 2017