Topic: [Feature] Way to sort by prioritizing number of ~ tags

Posted under Site Bug Reports & Feature Requests

There should be a feature that lets you sort posts in a way that prioritizes posts with the largest combination of optional ~ tags.

That way, if someone had a lot of things optionally tagged with ~, they could get the posts with the most desired tags, without having to outright rule out posts that don't have every tag.
For example, you wanna see posts tagged with female, male/male, intersex/male, etc. You won't die if one of them is missing from an image, so you add a ~, but ideally you'd want to see as many of those tags together as possible. This would be good for large, specific or conflicting tag combinations that wouldn't yield much when searched together normally.
Additionally, you could go further and have sorting options that require at least a certain number of optional tags.

This would affect the search bar and tagging system.

Sorting that way *within* each page of results could probably be done client-side. Someone could probably write a TamperMonkey script to do it, with no e621 dev involvement needed. It would have to run before the blacklisting script does, AFAICS. I'm not so sure whether an 'overall' sort (within the entire result set) is practical. I believe e621 tag search works on something like a full-text search, so scoring the optional parts may be possible; whether it can be made fast is another matter.
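The per-page idea boils down to a stable sort on "how many of the optional tags does this post have". A minimal sketch of that logic in Python (an actual userscript would be JavaScript and would have to read the tags from the page, which isn't shown here). The post shape, a dict with `id` and `tags`, and the function name are assumptions made for illustration, not anything e621 provides:

```python
from typing import Iterable


def sort_page_by_optional_tags(posts: list[dict], optional_tags: Iterable[str]) -> list[dict]:
    """Order one page of posts by how many of the optional (~) tags each has."""
    wanted = {t.lower() for t in optional_tags}

    def match_count(post: dict) -> int:
        # Number of optional tags present on this post.
        return len(wanted & {t.lower() for t in post["tags"]})

    # sorted() is stable, so posts with the same count keep the order
    # the server returned them in.
    return sorted(posts, key=match_count, reverse=True)


# Hypothetical page of results for: ~female ~male/male ~intersex/male
page = [
    {"id": 1, "tags": ["female", "solo"]},
    {"id": 2, "tags": ["female", "male/male", "intersex/male"]},
    {"id": 3, "tags": ["male/male"]},
    {"id": 4, "tags": ["female", "male/male"]},
]
ordered = sort_page_by_optional_tags(page, ["female", "male/male", "intersex/male"])
print([p["id"] for p in ordered])  # -> [2, 4, 1, 3]
```

Because this only reorders what's already on the page, the best matches in the whole result set could still be sitting on a later page.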

EDIT: Honestly, I think supporting some kind of 'sorting within page' (the client-side idea I mentioned above) could be pretty good from a development perspective. It would be a platform to test a lot of different sorting options that either a) aren't performant at scale or b) take a lot of time to optimize (and you're not sure if they'll be any good yet anyway). Then again, maybe it's too confusing -- you're effectively creating a partial compound sort, and people do get confused by compound sorting.

An example of 'other sorting options that might only work applied on a page-by-page basis' is something I developed in my own tagging system: 'tag partitioning'. Basically a grouping system where you start with a set of tag criteria, let's call them X, Y, Z; the content of each criterion would be like a normal query (with some restrictions). For the actual sorting process: check 'Are there matches for tagging criterion X in the input data? Then output all the posts matching that criterion, and remove them from the input data. Now, is tagging criterion Y matched? Output the posts matching that criterion, and remove [..]. Finally, output whichever posts still remain in the input data'. Useful for tagging projects IME.
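The partitioning steps described above can be sketched roughly like this. It's not the code from my tagging system, just the same idea in a few lines of Python; `partition_by_criteria` and `has_tags` are names made up for the example, and each criterion is simplified to "post has all of these tags" rather than a full query:

```python
from typing import Callable

Post = dict
Criterion = Callable[[Post], bool]


def has_tags(*tags: str) -> Criterion:
    """Simplified criterion: the post carries every listed tag."""
    required = set(tags)
    return lambda post: required <= set(post["tags"])


def partition_by_criteria(posts: list[Post], criteria: list[Criterion]) -> list[Post]:
    """Output posts group by group, in criteria order, leftovers last."""
    remaining = list(posts)
    output: list[Post] = []
    for matches in criteria:
        # Emit everything that matches this criterion...
        output.extend(p for p in remaining if matches(p))
        # ...and take it out of the input so later criteria can't repeat it.
        remaining = [p for p in remaining if not matches(p)]
    # Whatever matched nothing goes at the end.
    output.extend(remaining)
    return output


posts = [
    {"id": 1, "tags": ["wolf"]},
    {"id": 2, "tags": ["fox", "solo"]},
    {"id": 3, "tags": ["fox"]},
    {"id": 4, "tags": ["wolf", "solo"]},
    {"id": 5, "tags": ["cat"]},
]
grouped = partition_by_criteria(posts, [has_tags("solo"), has_tags("wolf")])
print([p["id"] for p in grouped])  # -> [2, 4, 1, 3, 5]
```

Post 4 matches both criteria but only shows up once, in the first group it matched, which is the point of removing matches from the input as you go.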

unknownandunfound said:
There should be a feature that lets you sort posts in a way that prioritizes posts with the largest combination of optional ~ tags.

That way, if someone had a lot of things optionally tagged with ~, they could get the posts with the most desired tags, without having to outright rule out posts that don't have every tag.
For example, you wanna see posts tagged with female, male/male, intersex/male, etc. You won't die if one of them is missing from an image, so you add a ~, but ideally you'd want to see as many of those tags together as possible. This would be good for large, specific or conflicting tag combinations that wouldn't yield much when searched together normally.
Additionally, you could go further and have sorting options that require at least a certain number of optional tags.

This would affect the search bar and tagging system.

This would be possible, but considering this is actually how the tool we use for post searches (ElasticSearch/OpenSearch) works by default & we've explicitly configured it not to do this, I don't see this happening.

There are a few reasons we don't do this:

  • It's significantly worse for performance.
  • It would require explicit configuration to not redundantly do this if there's an explicit order metatag.
  • It would completely alter how searches with & without that operator work in a way that's confusing & unpredictable for users.

If we did this, it'd probably be as a specific order: value instead of as the default behavior, but I'm doubtful we'd add this in any form. I can understand the utility of it, but any sort of fuzzy, relevance-based search inherently balloons both the amount of computational work per potential search result & the overall number of results to process, meaning the increase in demand is almost certainly exponential. There might be some fancy things that either we could do or the search engine already does to minimize this work, but there's no escaping the fundamental fact that this is going to increase load on the server, & I'd be shocked if it wasn't a substantial increase.
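To illustrate the difference being described, here is a rough sketch of two query bodies, written as Python dicts. This is not the site's actual query code; the field names `tags` and `id` and both function names are assumptions for the example. The first puts the optional tags in filter context, where the engine skips scoring and results are ordered by an explicit sort. The second leaves them in query context and wraps each one in `constant_score`, so a post's `_score` is simply the number of optional tags it matched:

```python
def optional_tags_unscored(tags: list[str], min_matches: int = 1) -> dict:
    """Optional tags as a yes/no filter; order comes only from `sort`."""
    return {
        "query": {
            "bool": {
                # Filter context: clauses decide matching but are not scored.
                "filter": [
                    {
                        "bool": {
                            "should": [{"term": {"tags": t}} for t in tags],
                            "minimum_should_match": min_matches,
                        }
                    }
                ]
            }
        },
        "sort": [{"id": "desc"}],
    }


def optional_tags_ranked(tags: list[str], min_matches: int = 1) -> dict:
    """Optional tags scored: each matched tag adds 1.0 to the post's _score."""
    return {
        "query": {
            "bool": {
                # Query context: scores of matching should clauses are summed.
                "should": [
                    {"constant_score": {"filter": {"term": {"tags": t}}, "boost": 1.0}}
                    for t in tags
                ],
                "minimum_should_match": min_matches,
            }
        },
        # Most matched tags first, newest first among ties.
        "sort": ["_score", {"id": "desc"}],
    }


ranked = optional_tags_ranked(["female", "male/male", "intersex/male"], min_matches=2)
```

The `minimum_should_match` value is also what the "at least a certain number of optional tags" part of the request would map to. The ranked form needs a score computed for every matching post before the sort, which is where the extra per-result work comes from.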

I'd also like to have this personally (I know for certain I'd get a lot of use out of it), but I doubt that it's tenable. Per-search performance was also a concern when I added support for grouped searches, but not only did it not take too much additional processing, with some careful optimization, the way I implemented it actually improved performance for searches that don't use any groups, so it decreased the overall load on the server. I know for certain that neither of those 2 things would be the case for this.

The site is open source, so anyone who's interested can get the code from our GitHub page & try to implement & profile it themselves, but I don't think we'd have the time to take a stab at this ourselves for a good long while, even if prospects for feasibility were better. That said, that's the same situation grouped searches were in when I came along, so someone could always come along & give it a shot, but it's unlikely to come from us, especially with how time-consuming & potentially futile it'd be.
