Topic: Number Tags and Number-Only Tags

Posted under Tag/Wiki Projects and Questions

I was recently searching for tags consisting of only one or two characters. I assumed that since these don't show up in search suggestions, there would only be a handful of them, and most should be eliminated. I eventually realised that my assumption was wrong, that I was in way over my head, and that I was acting on a personal opinion that could easily be disagreed with. Thus, I stopped.

In the course of this journey, I discovered many 'number-only' tags. Instead of blundering forwards on my own again, I have decided to do what I should have done before, and ask for input on these two questions:

Should there be tags which are only numbers?

That is, should there be tags which consist only of a string of digits and nothing else? For example, take the artist tag 87, and the character tag 31. Should these be 87_(artist) and 31_(character) (or similar) instead? Number-only tags feel somehow 'too ambiguous' for me, but I would be interested in other opinions.

Many of the number-only tags I found were general tags used on posts where that number appeared. This prompted me to wonder:

Should there be tags for specific numbers which appear in an image?

I feel that certain examples are justifiable, such as 69_(number). However, I am less enthusiastic about tags being created for every number which happens to appear in an image:

1_(number), 2_(number), 3_(number), and so on to infinity.

The purpose of tags is to enable users of the site to find what they want to see (and to not find what they don't want to see). Take post #2379867 as an example. If someone remembers this image and wants to find it, would the existence of a 42_(number) tag be helpful for them? What if a post is the only image on the site to contain a certain number; would the number tag still be justified if it is only used once?

I would like to make clear that this second question is about tags for specific numbers. Tags about numbers in general, such as phone_number, seem fine to me.

I would be very grateful for anyone's thoughts on these subjects.

Watsit

Privileged

forstonsmythe said:
That is, should there be tags which consist only of a string of digits and nothing else? For example, take the artist tag 87, and the character tag 31. Should these be 87_(artist) and 31_(character) (or similar) instead? Number-only tags feel somehow 'too ambiguous' for me, but I would be interested in other opinions.

This makes sense to me. When it's the name of an artist or character, a suffix would convey that and help prevent misuse from people trying to tag the number itself.

forstonsmythe said:
Should there be tags for specific numbers which appear in an image?

I feel that certain examples are justifiable, such as 69_(number). However, I am less enthusiastic about tags being created for every number which happens to appear in an image:

1_(number), 2_(number), 3_(number), and so on to infinity.

There are infinitely many numbers, so having a unique tag for any one that can appear in an image doesn't seem appropriate to me. Especially since it can be ambiguous what the numbers are, as spacing and legibility aren't always the greatest. e.g. if you see 87 504, is that 87504_(number)/87,504_(number) or 87_(number) and 504_(number)? Numbers can also be written differently, so should 001_(number) and 1_(number) be different tags? Or consider context: if you just see 007 on an image, is that 007_(number) or a James Bond reference or both?

Also, some number tags are already being used for images' creation year, so if a post happened to have the number 2015 somewhere in the scene, but it was made in 2020, that would at least be very prone to mistags. And some images are dated, like post #6111798; should that have the numbers 4_(number), 8_(number), and 2015_(number) tagged on it? If not, that would dilute searchability when someone remembers a number being on an image and wants to use it to help a search, but the number happened to be part of the date. If yes, that would dilute searchability for numbers that are commonly used for dates.

And would other languages be included? Should be tagged 10_(number)? V as 5_(number)? Should tally_marks also be included, since they are a numeric counting system?

Aacafah

Moderator

These are really good observations, thanks for mentioning them. I'll offer my opinion on the more specific cases, & then my take on the broader topic.

forstonsmythe said:
[...T]ake the artist tag 87, and the character tag 31. Should these be 87_(artist) and 31_(character) (or similar) instead? [...]

I'd 100% agree that we should take purely numeric artist/character tags & add a _(artist) or _(character) suffix like we do for all other tags that can be confused for multiple things.

watsit said:
[...]
Also, some number tags are already being used for images' creation year, so if a post happened to have the number 2015 somewhere in the scene, but it was made in 2020, that would at least be very prone to mistags. [...]

I'd say the same here; it'd be better to suffix all year tags with something like _(year). The one problem I foresee is that people might start using them for stuff like the year a comic takes place in or something like that. I'd still say that can happen anyways, but I can see people arguing that it's better to make the suffix something like _(publication_year); that said, I find that needlessly unwieldy. If allowed, I'd advocate for what I'll detail below.

watsit said:
[...]
And would other languages be included? Should be tagged 10_(number)? V as 5_(number)? Should tally_marks also be included, since they are a numeric counting system?

This funnels into my larger take on the subject, which is that we should group these cases into narrow categories to explicitly prevent misuse & make stuff like 1, 2, etc. invalid disambiguation tags (like trip_(disambiguation)) unless there is a single, sufficiently narrow subcategory they basically always fall into (e.g. 68 should probably just be aliased to 68_(artist); most year tags always refer to the year of publication, so I'd alias to that & revisit it if mistags become a repeat problem).

When I say "sufficiently narrow subcategory", I mean that we should group what the number is referring to in the context of the post. Examples:

  • For dates:
    • As opposed to 1999, if the in-universe date appears in a comic, if we allow it to be tagged at all, we should do something like 1999_(setting); if the date appears out of universe, I'd prefer tagging those like 1999_(date) (although for the year of publication, that should be 1999_(year) instead).
      • That said, widening the focus to 1990s_(setting)/1990s_(date), similarly to 1990s_theme, would probably be best.
    • I'd say we shouldn't allow tagging the day
    • If we allowed tagging the month (which I'm personally not in favor of), we should follow the same convention as the years & alias tags like january_(month) to january_(date)/january_(setting).
  • If there's tally_marks, a scoreboard, or some other kind of running count, I'd say, if this category is allowed at all, we suffix those cases with _(tally) or _(count).

The main point of this is to:

  • Explicitly limit how we allow numbers to be tagged
  • Push users to be more specific, improving the utility of the tags for searches
  • Make searching for relevant tags & posts more consistent by allowing searches by the suffix
    • This could also be used to make an implied tag for the whole category (e.g. numeral_(setting))
  • Allow for overly specific & potentially useless tags to be either replaced or paired with more general & useful tags where applicable (e.g. 1999 -> 1999_(year) -> 1990s_(year))
  • Allow us to completely disallow making new tags that only have numbers in them

This could also minimize things like this:

watsit said:
[...I]f you see 87 504, is that 87504_(number)/87,504_(number) or 87_(number) and 504_(number)? Numbers can also be written differently, so should 001_(number) and 1_(number) be different tags? Or context, if you just see 007 on an image, is that 007_(number) or a James Bond reference or both?
[...]

If users couldn't just tag every random number, but had to define the context it's used in, that would frequently force them to decide on what the number is used for or leave it untagged.
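To make the suffix idea a bit more concrete, here's a rough Python sketch of how a tag name could be sorted into "bare number", "number with a recognised suffix", & so on, plus the 1999 -> 1990s widening from the list above. The suffix list & the function names are just assumptions pulled from the examples in this thread, not anything the site actually implements.

```python
import re

# Hypothetical suffix list, taken from examples in this thread; not an official set.
ALLOWED_SUFFIXES = {"artist", "character", "year", "date", "setting", "tally", "count"}

# Digits, optionally followed by a literal "_(suffix)".
NUMERIC_TAG = re.compile(r"([0-9]+)(?:_\((\w+)\))?")


def classify(tag):
    """Return 'not_numeric', 'bare_number', 'suffixed_number' or 'unknown_suffix'."""
    m = NUMERIC_TAG.fullmatch(tag)
    if m is None:
        return "not_numeric"
    suffix = m.group(2)
    if suffix is None:
        # e.g. "87": the kind of tag the proposal would disallow or disambiguate.
        return "bare_number"
    return "suffixed_number" if suffix in ALLOWED_SUFFIXES else "unknown_suffix"


def widen_year(tag):
    """Map e.g. '1999_(year)' to '1990s_(year)'; return None for anything else."""
    m = NUMERIC_TAG.fullmatch(tag)
    if m is None or m.group(2) is None or len(m.group(1)) != 4:
        return None
    # Replace the last digit of the year with "0s" and keep the suffix.
    return f"{m.group(1)[:3]}0s_({m.group(2)})"
```

With this, classify("87") comes back as a bare number while classify("87_(artist)") is accepted, & widen_year("1999_(setting)") gives the broader decade tag.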

All that said, while we can decide a policy for this sort of thing, we're unlikely to catch all of these without creating a system to auto-imply/alias new tags based on certain rules (e.g. if it matches the regex /^[0-9]+_\(setting\)$/ or something). We'd only be able to manually process & approve "AIBURs"[/help/tag_relationships] for the most common cases, & the rest will fall through the cracks. That's likely what happened with the initial search:

forstonsmythe said:
I was recently searching for tags consisting of only one or two characters. I assumed that since these don't show up in search suggestions, there would only be a handful of them, and most should be eliminated. [...]

We don't really worry about deleting tags made as the result of things like typos that leave 1 or 2 characters separated by a space (& are therefore treated as a separate tag), so there's probably every combination of 2 valid characters as a tag that exists on basically 0 posts. There's just not much we can do to fully fix issues like this at scale.
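For what it's worth, the rule-based auto-imply/alias idea mentioned earlier could look something like the Python sketch below. Everything in it (the rule table, the actions, the assumption that a bare four-digit tag starting with 19 or 20 is a publication year) is hypothetical & only meant to illustrate the shape of such a system, not how the site works.

```python
import re

# Hypothetical rules, checked in order: (pattern, suggested action).
# Neither the patterns nor the actions are actual site policy.
RULES = [
    # Literal parentheses must be escaped; unescaped, "(setting)" would be a
    # capturing group and the pattern would match "1999_setting" instead.
    (re.compile(r"[0-9]+_\(setting\)"), "imply numeral_(setting)"),
    # Assumption: a bare four-digit tag starting with 19 or 20 is a publication year.
    (re.compile(r"(?:19|20)[0-9]{2}"), "alias to {tag}_(year)"),
    # Any other purely numeric tag has no obvious category.
    (re.compile(r"[0-9]+"), "flag for manual review"),
]


def suggest(tag):
    """Return the action of the first rule matching the whole tag, or None."""
    for pattern, action in RULES:
        if pattern.fullmatch(tag):
            return action.format(tag=tag)
    return None
```

So suggest("2015") would propose an alias to 2015_(year), suggest("87") would just get flagged, & ordinary tags like phone_number fall through untouched.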

I realise it's been a while and neither of you are likely to see this, but I would still like to thank you both for your insightful responses.