Commons:Requests for comment/Categories of photographs by country by date
Background
We have a bunch of categories of the type Category:Germany photographs taken on 2024-05-04. They all have the same format, they are hidden, and they are actually meant to be used by templates such as {{Taken on}}.
However, these categories often get diffused: subcategories at the level of administrative divisions such as Category:Saxony photographs taken on 2024-05-04 or even cities such as Category:Dresden photographs taken on 2024-05-04 get created, and the files from the main category get moved there (and thus can no longer be read by the templates). This is being done by a number of users, and has been discussed at various village pumps. The latest discussions can be found at Commons:Village pump#Category diffusion, again and Commons:Administrators' noticeboard/Blocks and protections#Sahaib, battleground mentality, and edit warring, with the suggestion that an RfC should be opened.
Note that some users apparently confuse the above categories with categories such as Category:May 2024 in Germany, Category:May 2024 in Saxony, and Category:May 2024 in Dresden. These are real unhidden categories and may be diffused down to villages if needed. I have never seen them diffused down to a single day, but I cannot rule out that this is happening as well, since we do not have policies prohibiting such diffusion. The category tree can sometimes be confusing, though, with hidden categories being subcategories of non-hidden categories. This category group has been under discussion since 2022 at Commons:Requests for comment/2022 overhaul of categories by period, and that discussion will certainly benefit from having more opinions, but it is out of scope for this RfC.--Ymblanter (talk) 10:38, 10 May 2024 (UTC)
- take a look at Commons:Village pump/Archive/2023/07#Category:2020 photographs of Hannover. i dug up the history of these cats last year. RZuo (talk) 11:19, 10 May 2024 (UTC)
- Thanks, the link indeed contains more discussions of the issue.--Ymblanter (talk) 11:32, 10 May 2024 (UTC)
- Just another case happening right now. Note that categories of the type "Month in Foo" are being removed; these will be very difficult to restore even if the photos get moved up the category tree by a bot.--Ymblanter (talk) 12:36, 12 May 2024 (UTC)
RfC questions
0. Are categories of photographs by country by date necessary?
The current understanding is that since the categories by country are used by templates/bots, they are necessary, and every photograph (if applicable) must be in one of these categories (even if it is also included in the subcategories). If you oppose this understanding, please discuss in this section.--Ymblanter (talk) 12:39, 12 May 2024 (UTC)
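As a rough illustration of the understanding stated above, the check "is this file still in a country-level date category?" could be sketched as follows. This is only a sketch: it assumes category names follow the pattern quoted in the background ("<place> photographs taken on YYYY-MM-DD"), and the function names and the `countries` set are hypothetical, not part of {{Taken on}} or of any existing bot.

```python
import re
from datetime import date

# Assumed naming pattern, taken from the examples in the background section.
DATE_CATEGORY = re.compile(
    r"^Category:(?P<place>.+) photographs taken on (?P<day>\d{4}-\d{2}-\d{2})$"
)


def parse_date_category(name):
    """Return (place, date) for a 'photographs taken on' category, else None."""
    match = DATE_CATEGORY.match(name)
    if match is None:
        return None
    try:
        day = date.fromisoformat(match.group("day"))
    except ValueError:
        # Looks like a date but is not a valid calendar day.
        return None
    return match.group("place"), day


def has_country_level_category(categories, countries):
    """True if at least one category is a date category for a known country.

    `countries` is a hypothetical set of names treated as countries.
    """
    for name in categories:
        parsed = parse_date_category(name)
        if parsed is not None and parsed[0] in countries:
            return True
    return False
```

Under these assumptions, a file that carries only Category:Dresden photographs taken on 2024-05-04 would fail the check for `{"Germany"}`, while a file that carries both the Dresden and the Germany category would pass.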
1. Should subcategories of the categories of photographs by country by date be created?
In other words, do we need categories such as Category:Saxony photographs taken on 2024-05-04?--Ymblanter (talk) 10:38, 10 May 2024 (UTC)
Note: I request the closer, in case the close is "no consensus", also indicate what is the status quo situation (basically, what should and what should not be allowed in the future).--Ymblanter (talk) 10:41, 10 May 2024 (UTC)
2. To which level is the diffusion allowed?
If the answer to question 1 is "no", this question becomes moot. If you strongly feel the subcategories should not be created, you do not need to reply here. However, if you think they should be created, or you are on the fence, indicate here what level down they can go. First-level administrative divisions? Any localities above a certain population level (say 1M)? Any level, assuming the number of photographs is appropriate?--Ymblanter (talk) 10:41, 10 May 2024 (UTC)
3. What is a "country" in this context?
Withdrawn, see the discussion below--Ymblanter (talk) 12:40, 12 May 2024 (UTC)
- The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
Is a country just a member of Category:Countries of Europe by name or similar (meaning for example that Scotland is not a country and Gibraltar is a country, Åland is a country and Abkhazia is not)? Or are there other suggestions?--Ymblanter (talk) 10:41, 10 May 2024 (UTC)
- The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
4. Do we need categories by date but not by location?
See for example Category:Railway photographs taken on 2006-01-25, currently there are more than 7,800 similar subcats. The added value for users in having all kinds of railway photos sorted by date is not obvious at all. --A.Savin 15:33, 10 May 2024 (UTC)
General comments
- A special case may also be Category:Railway photographs by date + its numerous subcats, for which I think we also need clarification if it's meaningful and desirable as is (IMHO not). --A.Savin 11:45, 10 May 2024 (UTC)
- Good idea, I will add a question (#4) later, or feel free to do it yourself.--Ymblanter (talk) 15:12, 10 May 2024 (UTC)
- Why are we attempting to make search of dates redundant in favour of categories which only serve to hide files from view and easy thumbnail comparison? Broichmore (talk) 12:12, 10 May 2024 (UTC)
- @Broichmore: Question: What are you referring to when you say 'hide files'? Josh (talk) 15:47, 10 May 2024 (UTC)
- Simply put, we may have 10 photographs (that's optimistic) for a particular week, devolved into 7 daily categories, most with just 1 file in each. The photos are effectively hidden. There are few dates notable enough for filing at this micro level: 9/11, the death of JFK, the Moon landing. Broichmore (talk) 18:25, 10 May 2024 (UTC)
- if there isn't a rule, common sense is that a category is subdivided only if there are too many files (>400?).
- so for example San Marino should only be diffused to Category:2017 in San Marino maximally (country x year), whereas a city like NYC can be diffused to Category:New York City photographs taken on 2024-03-07 (city x date).
- so i think when we consider a "rule", we should think about practicality of commons maintenance.
- the quantity of files related to a place is primarily affected by these factors: population, local participation of commons/wikipedia, cultural/political significance.
- for example, Dortmund had 587k residents and Hannover 534k in 2020, but Category:2019 in Dortmund has 467 elements under it and Category:2019 in Hannover has 1973, probably because Hannover is the capital of a state.
- Category:2019 in Fukuoka prefecture has 1516, even though it has almost 10 times the population of Hannover, probably because there are way fewer active users in japan than in germany.
- so if we come up with a rule with those places that have a lot of files in mind, then when this rule is applied to places like Fukuoka or less developed areas in africa/asia, categories will be overly subdivided. but if we come up with a rule with these places in mind, then nyc/london/berlin categories will be bloated.
- imo, it's not a problem if the 2300+ files under Category:January 2023 in New York City are put there and not diffused. (i support the lowest level to be "city x month".)--RZuo (talk) 14:02, 10 May 2024 (UTC)
- Day categories are all supposed to be hidden categories intended for internal use. I believe we should not push those below the country level (and I live in a very large country, the U.S.).
- Month categories can be pushed down as far as we want. I'd have no objection to (for example) a particular individual month in a particular public market, as long as it's the sort of place where we normally have a dozen or so photos taken each month.
- - Jmabel ! talk 15:00, 10 May 2024 (UTC)
- I tried to make this distinction in the background section, but feel free to edit. Ymblanter (talk) 15:13, 10 May 2024 (UTC)
- For my part, I strongly Oppose any rule depending on how crowded a category currently is. Either "by date" categories are relevant for cities, or not. If yes, nothing is speaking against such a category even if only one file fits so far. If no, they should not be created at all, not even for 1,000 files. --A.Savin 15:06, 10 May 2024 (UTC)
- Limiting the level of sub-categorization based on an arbitrary line, whether it be by level (e.g. down to country, city, etc.) or by quantity of files, is going to cause problems. RZuo was right to point out the disparity it would cause. One line may make a lot of sense for a highly-populated (# of images) place, but if enforced on a sparsely populated place, it would not be appropriate, and the same in reverse. Thus any new rule to try and draw such a line would, as well-meaning as it may be, need to pass a high bar of necessity to warrant the perhaps-unintended consequences. As for the specific questions above:
- Sub-categories of 'country photographed on date' categories are completely fine, so long as they are compliant with Commons category policies. If the files in the parent category are sufficiently distinct to be sortable and there is sufficient quantity to warrant the effort, then specific subs can be created. Those subs should match the hierarchy of the main topic (country) so far as it is applicable to the contents.
- The diffusion is allowed as deep as the above applies for the given topic. Once a level is reached where no meaningful distinction can be made amongst the files, that is the limit to sub-categorization.
- What is a 'country' in this context should be identical to what is a 'country' everywhere else in the Commons category scheme, i.e. what is within the scope of Category:Countries. That is a matter of its own discussion, but this tree should not adopt a unique definition of 'country'. This is important to comply with the Universality Principle.
- If categorizing files with a 'country' topic is reasonable, I see no reason why it wouldn't be as reasonable for other topics, providing there are enough files to warrant it. Personally, my energy for sorting by time usually runs out at the year level, and the topics I usually work on don't often need more specificity than that, but I wouldn't artificially preclude topics other than 'country' for sorting by date.
- The primary problem cited by Ymblanter appears to be that as these sub-category trees become more developed, they attract movement of files to them from the topical categories, resulting in 'hidden' files. It seems that as something such as 'Nairobi photographed in X' is created, files under Nairobi end up getting moved to this new sub-cat. It is a non-topical category (thus hidden) and thus those files are now no longer in the topical category where they belong.
- I don't think artificial limits on sub-categorization of the 'by date' categories are going to stop this problem. It seems a bit like fighting shoplifting by not opening new stores. The real solution is a combination of reminding users to copy (not move) files to non-topical categories, and the basic consideration one should always have when doing any diffusing: to consider the ability of folks to find that file if they don't happen to be looking for the particular distinction you are diffusing it by. Josh (talk) 16:27, 10 May 2024 (UTC)
- Our main purpose is not diffusion of overcrowded categories at any cost, but usability for internal and external users in general. If we have an overcrowded parent category for something, that's annoying but surely no reason to make it absurd. Without limitation, we are going to end up someday with categories like "Konrad-Wolf-Straße (Berlin-Alt-Hohenschönhausen) photographs taken on 2035-12-16" or "Human penis photographs taken on 2009-03-31", because experience shows that there are always people around who have nothing better to do than something like this. With this level of absurdity, that's surely not *my* Commons, sorry. Regards --A.Savin 17:27, 10 May 2024 (UTC)
- Thanks for the „kind“ note. Pardon me, but I have enough other things to do here. Now to the point: Once again, a discussion that was opened quite late. Cities like Dresden or London have long been more or less strictly structured according to individual shooting days. Should everything be reset now? I'm not really ready to discuss this again as long as categories like „Women wearing sunglasses in the City of Westminster“ or „Insects facing left“ are maintained on Wikimedia Commons. But anyway: The combination of location and shooting date has a great advantage: it saves additional subcategories and streamlines the number of categories per photo. For example, a picture from Herford taken on August 21, 2012 would be categorized as „2012 in Herford“, „August 2012 in North Rhine-Westphalia“ and „Germany photographs taken on 2012-08-21“. In my opinion, the shorter form „Herford photographs taken on 2012-08-21“ is much more elegant. What’s wrong with that? Who is afraid of such categories? Furthermore, we all know how things work here at Wikimedia Commons: First, a category "Germany photographs taken on .." is created, then all relevant photo material is put into it. Eventually, it is realized that there is a jumble of thousands of photos from Heidelberg to Berlin under this date, which are simply unmanageable in bulk. Then, a wise Wikimedian comes up with the idea to create a new subcategory. And so the cycle continues. I have no interest in having to go back and re-categorize all my photos every time. That's why I categorize from the bottom up and not from the top down as a principle. This saves me from constantly having to rework things. No question: Technical solutions can be introduced to make these processes easier. However, since it is no longer possible here to automatically extract geodata from metadata using bots, I don't have much hope. Unfortunately, a lot of manual adjustments are still necessary here. --J.-H. Janßen (talk) 19:54, 10 May 2024 (UTC)
- Well, you recategorize MY photos on a regular basis and clog my watchlist, making it unusable. Apparently, you do not have any issues with this. I understand your unwillingness to "discuss it again", but it is not as if there was consensus last time that your actions are ok, quite the opposite. Ymblanter (talk) 20:00, 10 May 2024 (UTC)
- And this "jumble of thousands of photos" also applies for Königswinter, Insel Poel and Melle (Lower Saxony), to name a few? Yeah, just wonder who would believe that... --A.Savin 21:56, 10 May 2024 (UTC)
- I am now thinking about removing question 3; I agree it is redundant. I was afraid of users claiming that England, Scotland, and Wales are countries, but I guess this can be dealt with. I will add the clarification to question 1 instead.--Ymblanter (talk) 19:50, 11 May 2024 (UTC)
Limitation of mediawiki search engine
i think the root cause of all these problems about "intersection" cats is the search engine of mediawiki.
i think all categories should just be about a specific quality. like sdc properties each of them only addresses one thing.
and when users need to find say "photo of trains in london on 2024-03-11", they can adjust the filters like
type=photo; depicted=train; location=london (subdivisions of london also implied); date=2024-03-11
then we don't need Category:London photographs taken on 2024-03-11, Category:Railway photographs taken on 2024-03-11 etc.
before we have something smarter to search through the huge repository, there's always argument for assembling photos about a certain topic over a certain period into a category. why do we do that actually? mostly because category pages have the thumbnail view.
on the other hand, super refined subcategories will not be a problem if we can just browse all content under a specific category easily. right now the only way to do that is deepcategory search which is limited to 256 cats. or we have to use petscan.--RZuo (talk) 17:57, 10 May 2024 (UTC)
- mw:Help:CirrusSearch#Deepcategory says it's configurable. can we have a higher limit for commons? RZuo (talk) 18:05, 10 May 2024 (UTC)
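The filter idea described above (one attribute per "category", combined at search time) can be sketched roughly as follows. This is not MediaWiki or CirrusSearch code; the record type, the `search` and `is_within` functions, and the sample data are all hypothetical, and the parent map is only an assumed stand-in for "subdivisions of london also implied".

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileRecord:
    """One file described by independent, single-purpose attributes."""
    title: str
    media_type: str
    depicts: frozenset
    location: str
    taken_on: date


def is_within(place, ancestor, parents):
    """True if `place` equals `ancestor` or lies below it in the parent map."""
    seen = set()
    while place is not None and place not in seen:
        if place == ancestor:
            return True
        seen.add(place)  # guard against accidental loops in the map
        place = parents.get(place)
    return False


def search(files, parents, media_type=None, depicts=None,
           location=None, taken_on=None):
    """Return titles of files matching every filter that is not None."""
    results = []
    for f in files:
        if media_type is not None and f.media_type != media_type:
            continue
        if depicts is not None and depicts not in f.depicts:
            continue
        if location is not None and not is_within(f.location, location, parents):
            continue
        if taken_on is not None and f.taken_on != taken_on:
            continue
        results.append(f.title)
    return results


# Hypothetical sample data, for illustration only.
PARENTS = {"Camden": "London", "London": "England"}
FILES = [
    FileRecord("A.jpg", "photo", frozenset({"train"}), "Camden", date(2024, 3, 11)),
    FileRecord("B.jpg", "photo", frozenset({"train"}), "Dresden", date(2024, 3, 11)),
    FileRecord("C.jpg", "photo", frozenset({"bridge"}), "London", date(2024, 3, 11)),
    FileRecord("D.jpg", "photo", frozenset({"train"}), "London", date(2023, 3, 11)),
]
```

With this data, `search(FILES, PARENTS, media_type="photo", depicts="train", location="London", taken_on=date(2024, 3, 11))` returns only `A.jpg`: no intersection category is involved, and the Camden file is found through the location hierarchy.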
So if we agree to only sort photographs by date based on countries, that would apply to the subdivisions of the United Kingdom as well, correct (Northern Ireland, Wales, England and Scotland)? What about the Kingdom of the Netherlands, which comprises the countries Aruba, Curaçao, the Netherlands, and Sint Maarten? I noticed there were some complaints about the first in the project chat.--Trade (talk) 19:52, 11 May 2024 (UTC)
- Indeed, Scotland is not listed in Category:Countries of Europe by name, and it is clearly a subdivision of the United Kingdom. Ymblanter (talk) 19:54, 11 May 2024 (UTC)
- How is Scotland any more of a subdivision than the Faroe Islands are? The category doesn't seem very consistent. Trade (talk) 20:19, 11 May 2024 (UTC)
- Scotland is of course much more a subdivision than the Faroe Islands are, but for example I think that Åland should not be in the list. My point is however that we already have a lot of things on Commons which operate with the notion of a country, and if anyone wants to change this notion it is fine, but what is meant by "country" here must be the same as in all other contexts on Commons. Ymblanter (talk) 20:26, 11 May 2024 (UTC)