Shortcut: WD:PC

Wikidata:Project chat

Wikidata project chat
Place used to discuss any and all aspects of Wikidata: the project itself, policy and proposals, individual data items, technical issues, etc.
Please take a look at the frequently asked questions to see if your question has already been answered.
Please use {{Q}} or {{P}} the first time you mention an item or property, respectively.
Also see status updates to keep up-to-date on important things around Wikidata.
Requests for deletions can be made here.
Merging instructions can be found here.

IRC channel: #wikidata
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2017/01.

A tool to batch upload labels for Wikidata items

Hello everyone. As part of the data donation of translated place names, I have developed a label upload script that helps translators batch upload a list of labels in a specific language for Wikidata items.

The tool uses pywikibot. The input is a CSV of Wikidata IDs and the corresponding translated labels to upload. The tool ensures that no duplicate entries are uploaded, in the following manner:

Label in Wikidata | State | Action
Present | Matches the translation | Doesn't get uploaded
Present | Doesn't match the translation | Uploaded as an alias (if the translation is not already present as an alias)
Not present | - | Uploaded as label
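
The decision table above can be sketched as a small pure function. This is only an illustration of the stated rules, not the tool's actual code: the function name `decide_action`, its return strings, and the use of exact string comparison are assumptions (the real script may normalise case or whitespace before comparing).

```python
from typing import Iterable, Optional


def decide_action(existing_label: Optional[str],
                  existing_aliases: Iterable[str],
                  translation: str) -> str:
    """Return 'label', 'alias' or 'skip' for one CSV row.

    Hypothetical helper mirroring the table above; exact string
    equality is an assumption, not something the post specifies.
    """
    if not existing_label:
        # No label in this language yet: the translation becomes the label.
        return "label"
    if existing_label == translation:
        # The label already matches: nothing to upload.
        return "skip"
    if translation in set(existing_aliases):
        # A different label exists, but the alias is already there.
        return "skip"
    # A different label exists and the alias is missing: add it as an alias.
    return "alias"
```

For example, `decide_action("London", [], "Londres")` would return `"alias"`, while the same call with `["Londres"]` as the alias list would return `"skip"`.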

I would like to thank @YuviPanda:, @Planemad: and the folks on #wikidata and #pywikibot for helping me out with this. I would love to hear thoughts and feedback from the community on how this could be made more useful to the community of translators.  – The preceding unsigned comment was added by Amishas157 (talk • contribs) at 13:14, 21 December 2016 (UTC).

United States

I propose changing the English label for United States of America (Q30) to United States, for consistency with the English/Simple article names as well as common usage. Nikkimaria (talk) 00:16, 7 January 2017 (UTC)

I am not sure this is a good idea. It may well be that "United States" is unambiguous for native English speakers, but there are plenty of non-native speakers who use Wikidata. Consistency with the article names does not seem any kind of argument: plenty of cases where the label is (and must be) entirely different from an article name. - Brya (talk) 08:07, 7 January 2017 (UTC)
It is fine to add an alias but it is the official name. Following the names of any Wikipedia is an extremely bad idea. In many cases there is no proper fit between the article and the item. Thanks, GerardM (talk) 08:11, 7 January 2017 (UTC)
@Brya: Help:Label indicates that we should use "the smallest unit of information that names an item" even if that unit is ambiguous, and that we should use the most common name - both of those support the change. Further, @GerardM: it indicates that we should consult the corresponding Wikipedia page for guidance on what the most common name is. It doesn't say to use official name at all. Nikkimaria (talk) 13:18, 7 January 2017 (UTC)
Well, the Help:Label page is beginning to look a little out of date, but even Help:Label is less strict than you make it appear, to quote:
"Wikimedia page title may give orientation
To figure out the most common name, it is good practice to consult the corresponding Wikimedia project page (for example, the title of a Wikipedia article). In many cases, the best label for an item will either be the title of the corresponding page on a Wikimedia project or a variation of that title. [...]"
Brya (talk) 13:44, 7 January 2017 (UTC)
Yes. Any good reason not to do that in this case? The guidance of that page in sum supports "United States" much more strongly than it does "United States of America". Nikkimaria (talk) 17:05, 7 January 2017 (UTC)
@Brya: Nikkimaria (talk) 13:56, 8 January 2017 (UTC)

There are so many instances where the Wikipedia article is just a choice to allow for disambiguation that the notion that it is the best fit is plain wrong in practice. Thanks, GerardM (talk) 14:30, 7 January 2017 (UTC)

I suggest you raise that general point at Help talk:Label, but in this particular case that is not a concern. Nikkimaria (talk) 17:05, 7 January 2017 (UTC)
We still have United States of Brazil, so "United States" is ambiguous.--Ymblanter (talk) 20:52, 7 January 2017 (UTC)
@Ymblanter: And per Help:Label we resolve potential ambiguity by using the description, not by extending the label. Nikkimaria (talk) 13:56, 8 January 2017 (UTC)
You're right that we don't have to include disambiguating information in the label, but this isn't just extending the label, "United States of America" is a name in actual use, e.g. on coins, notes, passports, so they're both valid names. Looking at other countries, we're pretty inconsistent in whether we use official names or the common shortened version... - Nikki (talk) 15:14, 8 January 2017 (UTC)
The most common name is what we're meant to be using, which here is "United States". Nikkimaria (talk) 17:04, 8 January 2017 (UTC)
@Nikki: Nikkimaria (talk) 01:15, 10 January 2017 (UTC)

@Nikkimaria: What advantages do you expect to be gained by this change? [Also, please note the section, above, #Nikkimaria, where you were pinged.] Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:05, 7 January 2017 (UTC)

In addition to being more in tune with the guidance for labels, this change would provide advantages for the use of Wikidata on other projects. (Hm, for some reason I did not get that ping...) Nikkimaria (talk) 21:09, 7 January 2017 (UTC)
"The advantages are that it would provide advantages"? Please be more specific. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:39, 9 January 2017 (UTC)
It would ensure that data passed through from Wikidata is consistent with the terminology in use on our major English-language projects, to begin with. Nikkimaria (talk) 01:15, 10 January 2017 (UTC)
Is there any reason not to follow the guidance of Help:Label in this case? Nikkimaria (talk) 02:14, 11 January 2017 (UTC)
It is guidance and not every page and every answer fits within general guidance. Present a decent case about why the change is valuable, not ducking and weaving with indirect reference to a page. In this case, I have specifically said to you that people may wish to use the label through the modules. You have not presented a reasoned case about why your change is better, and I dispute your unsubstantiated and personal opinion that it is advantageous. We are talking about the country, not its possessive use, so please don't try the fake and vague "consistency argument." We use United Kingdom for the country, and British for the people, and so on. The country is the United States of America, and there are the other regular uses (United States, USA, American, ...) depending on the context of the usage required; use away in the aliases.  — billinghurst sDrewth 13:14, 11 January 2017 (UTC)
(a) Yes, it is guidance, but unless we have good reason to do otherwise we should follow it. (b) And as I've told you, use of the label through the modules is a good reason to use the more common term as the label, as this will avoid having to override the value in multiple locations. "United States" is appropriate in more cases than is "United States of America", on projects that use the former, in templates or tables where space is at a premium, etc. (c) To summarize: Such a change would be in line with the guidance of Help:Label as well as both common usage and the terms in use in our major English-speaking Wikipedias. (d) What are you talking about with "possessive use"? We use "United Kingdom" for the country, not "United Kingdom of Great Britain and Northern Ireland", because although the latter is the official name the former is the common name.
So now, what is your "reasoned case" for using the less common label, despite the guidance not to and the practical implications of this choice? Nikkimaria (talk) 02:38, 12 January 2017 (UTC)
@Billinghurst: Nikkimaria (talk) 13:58, 13 January 2017 (UTC)
Many people do disagree with you. You insist on something that is imho and in the opinions of others arguably wrong and you get angry when people disagree with you. Why? Thanks, GerardM (talk) 05:38, 12 January 2017 (UTC)
I'm not angry, I'm simply asking that you and others provide a reason why following Help:Label and common usage is "arguably wrong". Nikkimaria (talk) 13:03, 12 January 2017 (UTC)
@Billinghurst: Do you have such a reason? Nikkimaria (talk) 23:52, 19 January 2017 (UTC)
@Brya, Billinghurst, Multichill, Nikki: Does anyone? Nikkimaria (talk) 00:29, 22 January 2017 (UTC)
Nobody agrees with you. Let it go. - Brya (talk) 05:44, 22 January 2017 (UTC)
@Brya: You're welcome to disagree with me, it'd just be nice if you had a good reason for doing so. Do you think Help:Label should be changed? If so, to what? If not, why not apply it here? Nikkimaria (talk) 00:36, 23 January 2017 (UTC)
Does anyone else have answers to these questions, or should we go ahead and make the change? Nikkimaria (talk) 00:51, 29 January 2017 (UTC)
No change. The good reason is "there is no consensus for the change". Can we move on yet?  — billinghurst sDrewth 10:08, 29 January 2017 (UTC)
@Billinghurts: Consensus is based on policies/guidelines/rationales, not voting. The "general guidance" in this case does not support your position, as explained above. Do you have a response to the questions to you above? Nikkimaria (talk) 13:46, 29 January 2017 (UTC)
Fix ping: @billinghurst: Nikkimaria (talk) 13:48, 29 January 2017 (UTC)
Hoi, Dear Nikkimaria, you have been told by multiple persons that they do not agree with your point of view. Your attitude is one where you want to force the issue. You are being aggressive and it is not appreciated. Thanks, GerardM (talk) 14:30, 29 January 2017 (UTC)

Never married persons

Today there was an edit war on Franz Kafka (Q905), who is known for never having married. Users Rodejong and Villy Fink Isaksen repeatedly changed spouse (P26): no value to unmarried (Q28341938). The page is now protected to stop the warring and to clarify whether the use of no value is correct for people who never got married. Note that unmarried (Q28341938) is probably a duplicate of never married (Q22101595), which does have several uses. Matěj Suchánek (talk) 20:43, 15 January 2017 (UTC)

I can add that the removal of the claim spouse (P26): no value was probably because of an infobox template at Danish Wikipedia which cannot handle the value no value. But nevertheless this seems like a case of exactly what that value was made for. Best regards, Dipsacus fullonum (talk) 20:55, 15 January 2017 (UTC)
The matter of the infobox is irrelevant for this discussion, though, Dipsacus fullonum, and the issue has been solved. However, "no value" is needlessly ambiguous; using "unmarried"/"never married" is a more logical choice. If a property is added to an item, it stands to reason it must be filled, otherwise it has no purpose other than to create confusion. -- Vrenak (talk) 21:02, 15 January 2017 (UTC)
Unless you can tell us the date Kafka married an entity called "unmarried", and source that, then "no value" is correct. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:17, 15 January 2017 (UTC)
@Vrenak: Yes, a claim must have a value, but no value and also unknown value are perfectly fine values. unmarried (Q28341938) or never married (Q22101595) aren't good values because you would think that if the value is an item, it will represent an actual spouse. Best regards, Dipsacus fullonum (talk) 21:25, 15 January 2017 (UTC)
Kafka has to marry another person for the property to make sense. Just look at the constraints of spouse (P26) and see if they make sense for never married (Q22101595):
Allows a start time (P580), end time (P582), place of marriage (P2842). How can we fill those?
Koxinga (talk) 21:31, 15 January 2017 (UTC)
Bad constraints are never a reason to fudge data. Fix the constraints. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:46, 15 January 2017 (UTC)
Constraints also illustrate what kind of data is expected. Anyway, you agree with me above, so I am not sure what your logic is here. Koxinga (talk) 22:28, 15 January 2017 (UTC)
In this case, personally I'd favor "novalue" as well. I added a few when working on Wikidata:Database_reports/Constraint_violations/P26.
Maybe we should find a way to store the item that contains the textual description of novalue.
--- Jura 21:42, 15 January 2017 (UTC)
Dipsacus fullonum, "no value" is not a value; your logic is flawed. "No value" literally means there is no value, not even a zero. So if you have a claim, you need to fill it out with a value; if there is no regular value to fill in, you have a zero value for the claim. For spouses this can be unmarried or widowed, depending on context. Similarly, you can't have a location with "no value" inhabitants: it can have "0" but it can't have "no value". Logically, if you want to have a claim it must be filled, or you remove the claim itself. -- Vrenak (talk) 22:13, 15 January 2017 (UTC)
If you count the number of spouses per person, using novalue gets you a count of 0 for items using that. That's what we want for people who never married. They never married, they didn't have a spouse nor did they have an entity called "unmarried" as spouse. Using "novalue" works fine for this property. Now we just have to find a way to fix the infobox. The problem seems to be at da:Skabelon:Infoboks person.
--- Jura 22:22, 15 January 2017 (UTC)
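
The counting argument can be illustrated with a minimal sketch. It assumes a trimmed-down version of the Wikibase JSON layout, in which each statement's main snak carries a `snaktype` of `value`, `somevalue` (unknown value) or `novalue`. The helper name `count_spouses` and the two sample records are illustrative only and are not taken from any existing tool.

```python
def count_spouses(claims: dict) -> int:
    """Count spouse (P26) statements whose main snak holds an actual item."""
    return sum(
        1
        for statement in claims.get("P26", [])
        if statement["mainsnak"]["snaktype"] == "value"
    )


# Kafka modelled with "no value": the statement exists but names no spouse.
kafka_novalue = {
    "P26": [{"mainsnak": {"snaktype": "novalue", "property": "P26"}}]
}

# Kafka modelled with the placeholder item unmarried (Q28341938).
kafka_placeholder = {
    "P26": [{
        "mainsnak": {
            "snaktype": "value",
            "property": "P26",
            "datavalue": {
                "type": "wikibase-entityid",
                "value": {"entity-type": "item", "id": "Q28341938"},
            },
        }
    }]
}

novalue_count = count_spouses(kafka_novalue)
placeholder_count = count_spouses(kafka_placeholder)
```

Under these assumptions the "no value" record yields a count of 0, while the placeholder record is counted as one spouse unless every consumer special-cases the placeholder item.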
Your logic would be correct if the property was the number of spouses. However, the property is the spouse item. If there is no spouse and we know it, a novalue is a way of saying that this property can not be filled: it is kept empty not because we don't know but because we know there is nothing to enter. Koxinga (talk) 22:25, 15 January 2017 (UTC)
In view of the description of "no value" at mediawikiwiki:Wikibase/DataModel#PropertyNoValueSnak "no value" seems appropriate in the case of a dead person who we know was never married, when taken together with our practice of listing all known spouses, whether the subject of the item was married at the time of death or not. In a more general context, outside of Wikidata, "unmarried" just means not married at this moment, and "never married" means not married up to this point in time. Whenever we document these terms we should be mindful of how these terms are used within Wikidata versus how they are used in a more general setting, and how the shades of meaning are different for living vs. dead people. Jc3s5h (talk) 23:05, 15 January 2017 (UTC)
The other means to manage this is to do a count; ie. a property "number of marriages", where the count can be 0 .. n. This does actually have an advantage, as 1) there are numbers of people who we know married, though never know the spouse; 2) there is a lot of reluctance or inactivity to create a spouse of another otherwise non-notable person. From Dictionary of National Biography there are many mentions of spouses, and I know that I pass adding that detail, sometimes for lack of helpful detail, other through the effort required.  — billinghurst sDrewth 03:54, 16 January 2017 (UTC)
I sometimes use "no value" to prevent people from adding "Coat Of Arms" and "Sister city" without thought. -- Innocent bystander (talk) 16:52, 16 January 2017 (UTC)
But please add "no value" to coat of arms only if you know that the entity does not have a coat of arms. A missing file on Commons alone does not justify the use of "no value". --Pasleim (talk) 07:17, 17 January 2017 (UTC)
@Pasleim: I see no point in adding COA:"no value" to every single thing that does not have a COA. But when people repeatedly add a COA to the wrong item, for example to Malmö (Q2211) instead of Malmö Municipality (Q503361), then it could be useful to stop these "games" and other "automatic imports" that cause us problems. -- Innocent bystander (talk) 08:01, 19 January 2017 (UTC)
No one is married to unmarried (Q28341938) or never married (Q22101595). Neither is appropriate as a value, and both should be deleted. The entire point of novalue is for cases like this. --Yair rand (talk) 22:42, 16 January 2017 (UTC)

If I am allowed to add to the confusion: a few times, I have run into the case where a work by an anonymous author has been claimed to be by anonymous (Q4233718). It seems to me that, just as no value is better for unmarried, unknown value is better for these cases? — Finn Årup Nielsen (fnielsen) (talk) 23:58, 16 January 2017 (UTC)

Yes, unknown value would be appropriate for this case. Otherwise we have problems because two books whose authors are both "anonymous" would be treated as being by the same author. ChristianKl (talk) 09:08, 17 January 2017 (UTC)

If anyone would like to hear what I have to say, I think it is better to write "unmarried" rather than "no value". Unmarried is best used for people whose marital status is definitely known as unmarried. "No value" is open ended and suitable for people whose marital status is unknown. In this case, it's best to have unmarried because we know he's unmarried. MechQuester (talk) 05:57, 17 January 2017 (UTC)

"no value" is not open ended. If the marital status is unknown, one has to use "unknown value". "No value" should only be used if one definitely knows that the person is unmarried. In case of Franz Kafka (Q905) this is given.
The property isn't "marital status" but "spouse". Unmarried might be a reasonable value for a marital status property but it's not a spouse. ChristianKl (talk) 08:46, 17 January 2017 (UTC)
And unknown value could be used for not knowing "if" married, or "to whom" married. Though the guidance on whether to populate that way is interesting, as in other places the absence of data is used for "unknown". On a similar note, do we differentiate between a marriage and just co-habitation?
We have partner (P451) for unmarried partners. It's currently not possible to say that a person is definitely married but the partner is unknown. At least not directly, you could still use "unknown value" + the start qualifier and link a source as a reference. ChristianKl (talk) 16:38, 22 January 2017 (UTC)

The intention of "no value" in the data model is indeed to express that for this property, there is no value that would be a valid filler. For the property spouse, using "no value" means that the person has no spouse, i.e. is not married. Using "unmarried" is bad, because then, when you ask for people that are married to the same person, you get that all the people that are married to "unmarried" are married to the same person. "No value" is meant exactly for this use case. "Unknown value" is meant for the anonymous use case. Otherwise, again, it is not the same entity that has written all books by anonymous. For some things, having a "no value" does not make much sense. For example, a town cannot have a population of "no value", since 0 is a valid value. Also, a (non-fictional) human cannot have no mother, for example (as of the current state of technology). For a spouse, though, "no value" makes absolute sense and is exactly what it is intended for. Maybe the display of "no value" is causing confusion, and if there is a better way to phrase it, that would certainly be welcome, but that's how the data model was defined and how it is implemented. --Denny (talk) 22:09, 20 January 2017 (UTC)
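
The "married to the same person" problem described above can be sketched as follows. The tuple layout `(subject, snaktype, target)` and the function name `shared_targets` are assumptions made for this illustration, and `Q_HYPOTHETICAL`, `Book_A` and `Book_B` are made-up identifiers; only the three snak types come from the Wikibase data model.

```python
from collections import defaultdict


def shared_targets(statements):
    """Return targets that more than one subject points to.

    statements: iterable of (subject, snaktype, target_or_None).
    Only snaks of type "value" name a concrete entity; "novalue" and
    "somevalue" (unknown value) snaks are never merged into a group.
    """
    groups = defaultdict(list)
    for subject, snaktype, target in statements:
        if snaktype == "value":
            groups[target].append(subject)
    return {t: subjects for t, subjects in groups.items() if len(subjects) > 1}


# Two never-married people modelled with the placeholder item:
with_placeholder = [
    ("Q905", "value", "Q28341938"),
    ("Q_HYPOTHETICAL", "value", "Q28341938"),  # made-up second person
]

# The same two people modelled with "no value":
with_novalue = [
    ("Q905", "novalue", None),
    ("Q_HYPOTHETICAL", "novalue", None),
]

# Two anonymous works modelled with "unknown value" for the author:
anonymous_books = [
    ("Book_A", "somevalue", None),  # made-up identifiers
    ("Book_B", "somevalue", None),
]

placeholder_groups = shared_targets(with_placeholder)
novalue_groups = shared_targets(with_novalue)
anonymous_groups = shared_targets(anonymous_books)
```

In this sketch the placeholder modelling produces one group in which both people appear to share a spouse, whereas the "no value" and "unknown value" modellings produce no groups at all.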

Problems with the administrative divisions in Indonesia

Hi everyone, and especially Beeyan and Fexpr, who I think could help me a lot.

I set the Spanish label and description for 59950 items with instance of (P31): fourth level administrative division in Indonesia (Q2225692). The description that I used in Spanish was "pueblo de Indonesia", but after my bot had made all the changes, I thought more carefully and saw that the Spanish label of fourth level administrative division in Indonesia (Q2225692) is "aldea de Indonesia", so I began changing the description from "pueblo de Indonesia" to "aldea de Indonesia" with QuickStatements. But now, checking more items, I have discovered that many items have both fourth level administrative division in Indonesia (Q2225692) and desa (Q26211545), and I have new doubts: are these two items the same administrative division or different ones?

I understand that Indonesia has fourth-level administrative divisions: the smallest, a desa, which in English is a village and in Spanish a pueblo; and then, inside a desa, Indonesia has another one, a kampung, which in English is a hamlet and in Spanish an aldea. Looking at the two items mentioned in the previous paragraph, I have two possible theories: 1. fourth level administrative division in Indonesia (Q2225692) and desa (Q26211545) are the same item; or 2. the first is for desa and the second for kampung.

On the other hand, I think it is a bit confusing because many items have both values in instance of; I imagine an item should have only one of them, with one being a subclass of the other, or some structure along those lines. What do you think about it?

Please excuse all the wrong edits and my mistakes. My intention in my first task with CanaryBot was to help make the Indonesian data available in Spanish too, but I didn't do it the right way. Depending on the answer to my questions, it could be necessary to use "pueblo de" for desa in Indonesia and "aldea de" for kampung. If anyone thinks that I should revert my approximately 1000 edits made with QuickStatements, please tell me and I will restore the descriptions as they were before those edits.

I await your answer.

Regards, Ivanhercaz (Talk) 01:27, 19 January 2017 (UTC)

Beeyan

Raisha


Notified participants of WikiProject Administrative Units in Indonesia. Regards, Ivanhercaz (Talk) 01:14, 20 January 2017 (UTC)

Hello @Ivanhercaz:, fourth level administrative division in Indonesia (Q2225692) is the lowest formal government administration in Indonesia; the lowest informal administration is Rukun Tetangga (Q12509020). The naming convention for fourth level administrative division in Indonesia (Q2225692) differs by province or area. So there are at least 10 different terms: kelurahan (Q965568) or desa (Q26211545) or gampong (Q4285979) or nagari (Q882149) or dusun (Q23308490) or kampung (Q12488911) (in Lampung) or kampung (Q12488913) (in Papua) or kampung (Q24659756) (in Kalimantan) or pekon (Q19944049) or lembang (Q12494403). Beeyan (talk) 06:34, 20 January 2017 (UTC)
FYI, dusun in Central Java Province (Q3557) is a fifth-level informal administration, but dusun (Q23308490) in Bungo (Q7373) is a fourth level administrative division in Indonesia (Q2225692). The reason so many different words are in use is that Indonesia has a lot of regional languages. Beeyan (talk) 06:44, 20 January 2017 (UTC)
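
As a rough illustration of how the terms listed above relate to the generic class, the sketch below flags items whose instance of (P31) values contain both the generic item and one of the regional terms. Treating the regional terms as more specific than Q2225692 is an assumption drawn from this thread, not a statement about how Wikidata actually models them, and the function name `classify_instance_of` is hypothetical. The QIDs and names are the ones quoted in the reply above.

```python
FOURTH_LEVEL = "Q2225692"  # fourth level administrative division in Indonesia

# Regional terms as listed in the reply above (QID -> local name).
REGIONAL_TERMS = {
    "Q965568": "kelurahan",
    "Q26211545": "desa",
    "Q4285979": "gampong",
    "Q882149": "nagari",
    "Q23308490": "dusun",
    "Q12488911": "kampung (Lampung)",
    "Q12488913": "kampung (Papua)",
    "Q24659756": "kampung (Kalimantan)",
    "Q19944049": "pekon",
    "Q12494403": "lembang",
}


def classify_instance_of(p31_values):
    """Summarise an item's P31 values against the term list above."""
    values = set(p31_values)
    regional = sorted(values.intersection(REGIONAL_TERMS))
    return {
        # Local names of any regional terms found on the item.
        "regional_terms": [REGIONAL_TERMS[qid] for qid in regional],
        # True when the generic class sits next to a regional term.
        "generic_is_redundant": FOURTH_LEVEL in values and bool(regional),
    }
```

For an item carrying both Q2225692 and Q26211545, this returns `["desa"]` as the regional term and marks the generic value as redundant. As the reply notes, the same word can denote a different level in another province, so a check like this would still need review per region.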
@Beeyan: Hi! Thank you for your answer. It seems harder than I thought... Well, I think that I am going to revert my latest edits made with QuickStatements and set "pueblo de Indonesia" as the Spanish description again. Once more, thank you for your help. I am going to fix it in Spanish. If you ever need a bot to set labels and descriptions, I can try to help you.
Regards, Ivanhercaz (Talk) 11:23, 20 January 2017 (UTC)
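
A batch like the one described (setting the Spanish description back to "pueblo de Indonesia") could be prepared with a short script. This is a sketch only: it assumes the tab-separated QuickStatements command form `QID<TAB>D<lang><TAB>"text"` for descriptions, the function name `description_commands` is hypothetical, and the item IDs in the usage note are placeholders rather than the affected items.

```python
import re

QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def description_commands(item_ids, lang, text):
    """Build one tab-separated description command per item.

    Assumes the 'D<lang>' command form for descriptions; the text is
    rejected if it contains characters that would break the line format.
    """
    if any(ch in text for ch in '"\t\n'):
        raise ValueError("description must not contain quotes, tabs or newlines")
    lines = []
    for qid in item_ids:
        if not QID_PATTERN.fullmatch(qid):
            raise ValueError(f"not an item ID: {qid!r}")
        lines.append(f'{qid}\tD{lang}\t"{text}"')
    return lines
```

Calling `description_commands(["Q4115189"], "es", "pueblo de Indonesia")` yields a single line that can be pasted into the tool; trying it first on a sandbox item would be a sensible precaution before running it on the full list.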
@Ivanhercaz: You're welcome, nice to meet you! Yup, even if you ask Indonesians at random, I'm sure they won't even know that another name exists unless they come from that specific area. It's very complex, even for Indonesian people. The best thing is to add the definition of fourth level administration in Indonesia. --Beeyan (talk) 04:31, 23 January 2017 (UTC)

URL to diff

{{URL to diff|}} isn't working, but I can't see why. Can anyone fix it, please? IIRC, it was imported from, or modelled on, the version on en.Wikipedia, which does work. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:58, 19 January 2017 (UTC)

Fixed, it was the page move that broke it. Matěj Suchánek (talk) 11:34, 20 January 2017 (UTC)
@Matěj Suchánek: Thank you. I hadn't expected that when I moved it, and am surprised to see a hard-coded reference to a template name in the Lua module. Is there not a way of avoiding that, or of making the module recognise redirects? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 07:53, 21 January 2017 (UTC)
From the documentation, I assume this hard-coding seems to be necessary. Even though there are ways to work with redirects in Lua, I don't think the module should attempt to fix them. Matěj Suchánek (talk) 08:57, 21 January 2017 (UTC)

It would seem sensible for templates affected by the above issue to be move-protected. Is there a counter argument? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:55, 24 January 2017 (UTC)

Maybe we should just make sure that users that can edit these templates make sure that they know what they are doing. If problems are detected, just leave a note on their talk page.
--- Jura 17:27, 26 January 2017 (UTC)

Vandalism out of control

Can anyone remind me why IP edits are allowed? There are not enough people watching recent changes, and the vandalism backlog overflows, staying there forever. I have spent the last day cleaning up vandalism, and it is not a pleasurable task that I plan on repeating. If the community cannot keep up with the maintenance, why are we allowing anonymous edits?--Micru (talk) 11:10, 20 January 2017 (UTC)

Wikidata:Project_chat/Archive/2016/10#Does it make sense to allow anonymous editing on Wikidata? is relevant. Matěj Suchánek (talk) 11:30, 20 January 2017 (UTC)
Yes, I see that many users oppose disabling anonymous edits, but I wonder how many of those same users patrol recent changes on a regular basis. That is a discussion that Recent Changes Patrollers should be having, but apparently there are none here in WD.--Micru (talk) 12:30, 20 January 2017 (UTC)
There are. And I believe there actually are enough to handle the incoming edits, though of course we would greatly benefit from having more patrollers, or from more patrollers using the actual "patrol" feature, so obviously correct or already reverted edits wouldn't have to be checked by others again. --YMS (talk) 10:47, 21 January 2017 (UTC)
If, as you say, anyone were watching, vandalism wouldn't have crept in so much, to the point that I have been reverted from a legit edit because I used an item whose label had been vandalized in another language, which led the user to think that I was vandalizing myself (!). On many wikis, users who spend time watching RC identify themselves with Template:User wikipedia/RC Patrol (Q5654490); this is not the case on Wikidata, so it makes me very doubtful about your claims that anyone is watching at all when there have been reports of vandalism staying for months in important items. At least the first step would be recognizing that we have a problem here, and just brushing it under the carpet with the belief that something is being done (when there is no proof that it is the case) is not going to help in gaining the trust of the Wikipedias that are already suspicious of the lack of quality of the data that is being allowed into WD. --Micru (talk) 12:12, 21 January 2017 (UTC)
I am not brushing anything under the carpet. I have been calling for help in vandalism patrol myself here and elsewhere several times before. Yes, we need people doing this. But no, it's not that we don't have them, as you are saying. Wearing a badge saying "I'm an RC patroller" doesn't help us, just like no one wearing such badges does not mean in any way that there is nobody taking care of RC. I guess I managed to patrol most of the IP and newbie changes over the last year, and while I've found a lot of vandalism that nobody found for days, weeks and months, I saw much, much more vandalism edits already reverted by dozens of fellow users, many of them just as active as I am. --YMS (talk) 13:25, 21 January 2017 (UTC)
Then what is needed is a coordinated effort, as in a RC patrol. There is no use in single users checking vandalism every once in a while on their own, not knowing what has been checked, or what is there to be checked, because it makes the whole process inefficient and full of holes where vandalism just slips away as we are seeing, plus eventually there is the danger that said users get burned out by not having support. If you are already doing the task, and know others that do it as well, what do you think about transitioning into a collaborative effort? For me (and I hope for others too), it would be very useful to be able to refer to a place where to find tools, info about patrolling, etc or other users that can help on a temporary or frequent basis.--Micru (talk) 14:01, 21 January 2017 (UTC)
+1; it was just yesterday that I proposed (on another Wikidata page) to have a Wikidata:WikiProject CVN in which we can team up, share CVN tools and also make CVN work visible to external Wikidata users such as Wikipedia communities. —MisterSynergy (talk) 14:17, 21 January 2017 (UTC)
Yes, please, go ahead and start it! Perhaps with the full name (Wikidata:WikiProject Countervandalism?), because many people don't know what CVN stands for. I just found out that there is a #cvn-wikidata IRC channel listed on meta.--Micru (talk) 14:45, 21 January 2017 (UTC)
I like the idea of a wikiproject. I would find it useful to have somewhere where people can bring attention to things that they are not sure about or don't have the time/energy to fix. It could also be a good place to document some of the problematic anonymous editors whose IP addresses change a lot. - Nikki (talk) 13:25, 27 January 2017 (UTC)
(Edit conflict) I've compiled a list of tools and links under User:YMS/RC. Otherwise I don't know what to do in terms of collaborative effort. At least for me, I cannot imagine being assigned to shifts or certain areas of work. I check RC multiple times a day and wouldn't change this e.g. if I knew who exactly is active at which exact times. --YMS (talk) 14:23, 21 January 2017 (UTC)
Well, at least we need to advertise available tools much more than we have done until now. Your page is great, but I did not know it until yesterday. Something comparable should be available in the Wikidata namespace, as a community project. I would agree, however, that assigned shifts or similar approaches are not necessary. —MisterSynergy (talk) 14:35, 21 January 2017 (UTC)
(edit conflict) Sharing your tools and links more broadly is already a useful contribution. I don't think anybody would ask you for a deeper commitment than the one you already take. Having a group can encourage other users to participate too.--Micru (talk) 14:45, 21 January 2017 (UTC)
I agree that IP vandalism is a major problem. But as I have already noted, (maybe surprisingly) most IP edits are not vandalism, but most vandalism comes from IP and mobile edits. As I noted, in my opinion the solution is to use (semi-)protection much more frequently.--Jklamo (talk) 12:35, 21 January 2017 (UTC)
The amount of vandalism is honestly excessive. MechQuester (talk) 21:41, 23 January 2017 (UTC)
@MechQuester: Where do you see this excessive vandalism? I browse unpatrolled changes or claims or terms (de/en) right now, but I don’t see that much vandalism to be honest. Since I don’t want to rule out that I use wrong filters here, I’d ask here now… Thanks, —MisterSynergy (talk) 08:41, 24 January 2017 (UTC)
I just noticed that an undo doesn't lead to an automatic mark that the edit was patrolled. Is there a reason why this doesn't happen? It seems to me like the system could assume that any edit that gets undone by a person who can mark as patrolled should be marked that way. ChristianKl (talk) 10:17, 24 January 2017 (UTC)
Support indeed, this makes patrolling very tedious, because you have to go to the same diff twice: once to cancel it, then again to mark it as read :( --Hsarrazin (talk) 19:50, 25 January 2017 (UTC)
I added a phabricator task (https://phabricator.wikimedia.org/T156470) ChristianKl (talk) 10:50, 27 January 2017 (UTC)
I should not have said really excessive. It is more that some of the vandalism doesn't get caught at all. MechQuester (talk) 14:33, 24 January 2017 (UTC)

Internal identifiers

Are Wikimedia page outside the main knowledge tree (Q17379835) and Wikimedia internal stuff (Q17442446) the same? Jc86035 (talk) 11:20, 20 January 2017 (UTC)

Just to say, my opinion is Support for merging both. --Liuxinyu970226 (talk) 04:16, 21 January 2017 (UTC)
On hold. We should take a look at the subclasses first. Matěj Suchánek (talk) 09:20, 21 January 2017 (UTC)
I think Wikimedia page outside the main knowledge tree (Q17379835) should be a subclass of Wikimedia internal stuff (Q17442446). Not every internal Wikimedia thing is a page, is it? --Yair rand (talk) 01:46, 23 January 2017 (UTC)
@Yair rand: Not every internal Wikimedia thing is a page [citation needed (Q3544030)] --Liuxinyu970226 (talk) 15:18, 24 January 2017 (UTC)
What about single user group as a concept? That's not a page but a group of users, isn't it? Matěj Suchánek (talk) 15:45, 24 January 2017 (UTC)

Daily (or every two days?) reference drives

Hi all, after reading w:Wikipedia:Wikidata/2017 State of affairs, it is quite clear that Wikidata is not well received on enwiki. Aside from the many claims that seem to stem from a misunderstanding of Wikidata's purpose and integration with Wikipedias, I think one criticism is valid. Wikidata has many claims that are not sourced, and these include claims that are imported from (P143) various Wikipedias. This opens us to criticism for violating w:WP:V and w:WP:BLP. I suggest that we hold a daily (or every-two-days) reference drive in which we focus on improving a single item by adding the references from various Wikipedias as well as finding new ones. The end product doesn't need to be showcase material, but at least it will be well referenced and usable by Wikipedia in that regard. Any thoughts? --Wylve (talk) 12:18, 20 January 2017 (UTC)

The entire relationship between Wikidata and Wikipedias has to be put into greater perspective. Improving the references situation at Wikidata is an important and noble goal, but I don’t think that it significantly improves how Wikipedia communities think about our project here. If one wants to point to situations that went wrong with Wikidata, one can do that in any situation, no matter how good Wikidata actually is. My impression after reading the enwiki discussion is that there are quite a few users over there who do exactly that. As an experienced dewiki community member I remember plenty of similar discussions from that project as well.
It is always important to be aware of what we are doing here. Wikidata is Wikimedia’s approach to drastically streamline the way it manages its valuable knowledge. Like practically any other large organization in this world whose core asset is unstructured data (such as wikitext), Wikimedia and its communities are currently making an effort to separate structured from unstructured data. The actual implementation of Wikimedia’s structured data is Wikibase, which in my experience is an extremely powerful solution. But Wikidata was just the first step in structuring Wikipedia knowledge (the most important asset); Wikimedia Commons and Wiktionary will follow soon, and other fields such as WikiCite are likely to be developed in the future as well. Structured data will be the backbone of Wikimedia in the future, even if most Wikipedia communities do not understand this fact right now.
In my opinion we don’t have to worry about their current reluctance to include Wikidata in their articles. Wikipedias grew a lot in the past years, possibly more than anybody imagined when this journey started some 15 years ago. However, there is a vast amount of information to maintain in good shape, and if you browse to the corners of Wikipedia’s knowledge (no matter in which language), you can easily spot the limits of the communities’ ability to properly do this job. Even today in 2017 there is so much content which would profit from some kind of automation (I’d estimate that this applies to much more than 50% of all articles). Yet it is clear that the closer you get to the central content, the less automation is necessary or useful. These central articles have typically been written by experienced, influential editors.
There are a couple of things to take care of at the same time:
  • The separation of structured data from unstructured wikitext adds significant complexity, which is a problem for many users who are not so tech-savvy. We have to make sure that there is really good software which hides the internals of this data model from users who do not want to see it, without excluding them from editing Wikimedia’s projects. The VisualEditor is important for that, but by far not enough to deal with this issue.
  • Structured data is efficiently maintained, but in encyclopedias it needs to be put into context (unstructured wikitext can do that much better than structured data). This is something only human editors can do, and this is something which drastically limits the degree of automation we should strive for. Since Wikipedias as well as Wikidata itself are quite diverse in depth and quality (both in form and content), it is not at all wise to take a binary approach such as Automation everywhere or Forbid all Wikidata usage/automation.
  • In many situations, “data usage from Wikidata” can simply mean to compare a local value with a Wikidata value and add the Wikipedia page to a maintenance category if there is a difference. Which of the two values is actually displayed then is of minor importance. However, there is (to my knowledge) no coordinated effort to provide powerful templates and modules in Wikipedias right now which enable users to decide on different levels (per-article, per-template, per-Wikidata property, per-Wikidata datatype, per-Wikipedia project, etc. …) which data to display by default.
  • Please always remember as well that there is much testing and trial-and-error involved in this procedure of structuring Wikimedia’s knowledge. But we don’t have another option than to use structured data in the future, given the fact that Wikipedia’s rate of deterioration of quality is ever growing.
Ping @Fram as the initiator of this enwiki discussion. —MisterSynergy (talk) 13:25, 20 January 2017 (UTC)
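The "compare and flag" idea in the list above can be sketched roughly as follows. This is only an illustration in Python (real Wikipedia templates would use Lua modules or wikitext), and the function name and the category name are hypothetical, not existing templates or categories.

```python
from typing import Optional


def maintenance_category(local_value: Optional[str],
                         wikidata_value: Optional[str],
                         property_id: str) -> Optional[str]:
    """Return a (hypothetical) maintenance category name if the local value
    and the Wikidata value differ, otherwise None."""
    # Nothing to compare if either side is missing.
    if local_value is None or wikidata_value is None:
        return None
    # Ignore differences in surrounding whitespace and letter case.
    if local_value.strip().casefold() == wikidata_value.strip().casefold():
        return None
    # Hypothetical category name, for illustration only.
    return f"Category:Pages with {property_id} differing from Wikidata"
```

Which of the two values is displayed is deliberately left out of the sketch; it only decides whether a page should be flagged for human review.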
Different Wikidata editors have different reasons for editing. Declaring reference drives as a specific project ignores the motivations for which an editor comes. Wikidata should focus on being inviting to different people who want to come and participate no matter what kind of data they want to contribute.
In general unreferenced data shouldn't matter to en.Wikipedia. It's easy for en.Wiki to simply ignore all unreferenced data in Wikidata. Having bad references might be more problematic. ChristianKl (talk) 07:20, 22 January 2017 (UTC)
Agree, if Wikidata wasn't attempted as a tool for Wikipedia and Wikisource, my interest in this project would be 0 (Q204). The inclusion of the data here in Wikipedia is a good benchmark to test whether our models here work.
Besides the "imported from" issues and the encouragement of editing without knowing anything about the subject, I see one large threat: Phase 1. Non-Wikidatans too often tend to see Wikidata only as a place to fetch interwiki links. When I, as an IP, manually add interwiki links to a page in Wikipedia, I am notified that "We do not add interwiki by the wikicode any longer, instead we use Wikidata". Poorly matching subjects are therefore merged. Articles about families, names, disambigs and lists of things tend to be very mixed up. -- Innocent bystander (talk) 08:14, 22 January 2017 (UTC)
I think that integration with Wikipedia and Wikisource is very valuable but I don't think it's the only purpose of Wikidata. If everything goes well with WikiFactMine, scientists might go directly to Wikidata to look up facts (and their references) for an item that comes up in their research.
It's valuable for Wikidata if people who have an interest in open structured data participate even when they aren't directly interested in Wikipedia. Think big tent. Not everything on Wikidata has to be useful for Wikipedia, but Wikipedia should be able to decide to import only the data that's useful for it. When unsourced statements aren't useful, Wikipedia can simply decide to import only sourced claims for a property. ChristianKl (talk) 18:59, 22 January 2017 (UTC)
in my experience the reference problem will not be improved until there is some leadership to lead a quality circle to add references, and not to propose a policy to block people who add un-referenced statements. the team need not conflict with the others in the big tent. until the leadership is provided, the problem will not be improved, despite all the hand-wringing elsewhere. Slowking4 (talk) 23:30, 22 January 2017 (UTC)

Adding primary sources as references

I will reflect again for the umpteenth time, we do not make it easy to add primary sources to the system. I can find a baptismal record for many people in history, and to add it here is a PITA. It is not just the addition of the reference data, it is the requirement to create the whole record to reference it. If I am adding (and creating) primary/secondary records for a birth, a death, a marriage every time I have to create a person, the likelihood is slim. On the occasions that it is on the web then often they are behind pay firewalls, with complex urls, not designed for ready referencing. All a problem! So, I will continue to add the primary research undertaken to the author talk page at the Wikisource interwiki link. We still require a better methodology to reference. Even then will the Wikipedias accept a primary source as a reference anyway?

Secondarily, we have many published secondary resources at the Wikisources for adding this referential data, and it is still painful to add a reference from there to here. I know that we have a project for the addition of data from other wikis, and one can hope that such a project will allow not only for editing, but also for the pushing of referential statements. For example, the 63 volumes of the Dictionary of National Biography to their constituent items. -- billinghurst sDrewth 04:49, 21 January 2017 (UTC)

My experience is that primary sources are not allowed to prove the "notability" of a subject on Wikipedia. But when the notability is proven in some other way, primary sources can be used. One problem, though, is that reading baptismal records can be very difficult, and can be regarded as "Original Research". Laws can also be difficult to interpret. -- Innocent bystander (talk) 08:20, 22 January 2017 (UTC)
do not know why you are raising ENWP issues here. we need a wizard / process improvement. maybe a wishlist item? Slowking4 (talk) 23:24, 22 January 2017 (UTC)

Property for members of parliamentary groups

There seems to be no consensus on which property is the right one for stating the number of MPs of a parliamentary group. I argue for membership (P2124), because the group consists of MPs and hence the MPs are the members of the group. @Caarl 95: argues for number of representatives in an organization/legislature (P1410), which is usually used for the number of MPs of a political party (see Talk:Q20113710 for the full argumentation; please note that the English description of P1410 was changed yesterday to include groups, while the German and French descriptions still only describe a use for political parties). My argument is that a party has seats in a legislature (number of representatives in an organization/legislature (P1410)). However, a parliamentary group does not have seats in the legislature, but is composed of members of the legislature. Therefore membership (P2124) is the correct property for the number of members of a parliamentary group. The argument of @Oravrattas: (here) is also not valid: a parliamentary group is defined by its parliament and therefore only has seats in one legislature. For example, Group of the Alliance of Liberals and Democrats for Europe (Q839097) and Alliance of Liberals and Democrats for Europe in the Parliamentary Assembly of the Council of Europe (Q4732455) are different entities, different groups with the same name. A party with the same name exists (Alliance of Liberals and Democrats for Europe Party (Q25079)), which has seats in different legislatures. My conclusion is: number of representatives in an organization/legislature (P1410) is correct for political parties, membership (P2124) is correct for parliamentary groups. --ElTres (talk) 14:25, 20 January 2017 (UTC)

@ElTres: There seems to be a lot of confusion around elections, parliaments and things like that. The main reason is probably that it looks so different in different nations and on different levels. I guess the EU parliament is extra complicated, since the EU groups are not very homogeneous. Two Swedish parties represent EPP in the EU. That did not mean that they cooperated in any way in the last EU election. The EU groups/parties were not in any way visible here. When a Swedish MEP is interviewed on TV, they are mentioned as representing a Swedish group of MEPs, not an EU group. It is only when leaders of such groups are mentioned that they are described as representing an EU group. Otherwise they are mentioned as representing a national party.
-- Innocent bystander (talk) 08:38, 21 January 2017 (UTC)
I would agree this is something that does vary a lot - certainly ElTres's summary is not how I would think of it in the UK, where (for example) the parliamentary group is treated as effectively the same as the party. So I'm not sure we can say that one approach is obviously "correct". Andrew Gray (talk) 11:07, 21 January 2017 (UTC)
In the Swedish riksdag the parliamentary group and the party could be more or less in conflict with each other. The group and the party often have different leaders, since you have to be an MP to be a part of the group and be its leader. If you are a member of the Government, you normally resign your seat in the parliament. The governing part(y|ies) therefore always have other group leaders than the party itself. And a party can have two or more leaders, while a group always has only one. -- Innocent bystander (talk) 11:17, 21 January 2017 (UTC)
This split between leaders does happen in the UK Parliament as well, particularly with smaller parties. And the distinction between the Parliamentary Labour Party (Q14472169) and the Labour Party (Q9630) has been in the news a lot over the last year or so. However, I think that it's precisely because things work differently in different legislatures that it's useful to have a common property to say that N seats are held by members of X body, whether X is a party, faction, club, political group, or whatever. I see no value in using a different property for one of these cases, especially if that property is simply a broad membership (P2124), and number of representatives in an organization/legislature (P1410) was created for this very purpose. --Oravrattas (talk) 20:24, 22 January 2017 (UTC)
@ElTres: Note that the original proposal for this property was explicitly for "number of seats hold by political group or party in legislature" (emphasis mine). This was never meant to be restricted to parties only. It is a relatively simple matter to change the descriptions in German, French, or any other languages if they currently say that this is only for parties. --Oravrattas (talk) 20:29, 22 January 2017 (UTC)
It's not that easy. Take the US Congress. Bernie Sanders was for a while a member of the Democratic Caucus despite being elected as an independent and not being a member of the Democratic Party. Using membership (P2124) is good because it means that we can say when a person leaves a group like the Democratic Caucus or joins it. ChristianKl (talk) 10:32, 23 January 2017 (UTC)

Thanks for all contributions. I can accept the overall conclusion to use number of representatives in an organization/legislature (P1410). --ElTres (talk) 19:37, 23 January 2017 (UTC)

Brute force creation makes other work

I am not sure why we continue with the brute force creation by some people here, where we end up having to then merge duplicates, which is more work for others. Surely there is now a better means to identify matches, or potential matches, and to create only the items that need creating, separating out matches and potential matches. -- – The preceding unsigned comment was added by billinghurst (talk • contribs) at 00:31, 21 January 2017 (UTC).

I am afraid that it is matter of perspective. We are on the right track in that we find more and more reason to connect things at the start. For instance, I added a person because he was awarded a prize and he was incorrectly linked on the English Wikipedia. I added his VIAF indicator because he is extremely likely to be linked through other sources. Your point that we should create only those that need creating has a powerful problem; what needs creating? What to do when for whatever reason another person is already there because she is already there and has no article yet.
It would be great when we have better tooling to curate links on the Wikipedias because there is a percentage that is wrong. I have argued and will continue to do so that we would do better at sharing the sum of all knowledge when Wikipedia and Wikidata cooperate. That will only be possible when we start to think in terms of what do we have to offer each other. At this time the import of a lot of data has not finished and as we lack the tools, we do not curate Wikipedia but do improve Wikidata. Thanks, GerardM (talk) 06:45, 21 January 2017 (UTC)
@billinghurst: "Surely there is now a better means to identify matches, or potential matches" That's great to hear. Please post details. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 07:47, 21 January 2017 (UTC)
@GerardM: I am not talking about that sort of edit/creation. I am talking about those who run bots through petscan and other tools and create items in their hundreds <-> thousands with no effort or consideration for existing items that may be matches. Be they plant articles or disambiguation pages, running a fill from enwiki that creates duplicates in such cases is tantamount to non-considerate editing. I know it is my choice to do merges to clean up, however, there is nothing quite as annoying as attempting a diligent cleanup to find that a bot creation run has gone through and whack whack whack'd more duplicates into play. UNHELPFUL! I am not wishing to point fingers, I am hoping that the community can say that we have reached a point of maturity that something more elegant can be put into place.  — billinghurst sDrewth 08:15, 21 January 2017 (UTC)
@billinghurst: The nlwiki system seems to work quite well. A couple of users keep an eye on the new articles and try to connect them or create new items.
If articles are not connected, are at least 4 weeks old and haven't been edited for at least 3 weeks, a bot will create a new (empty) item. This is in place to limit the backlog (it's the broom wagon (Q14823)).
The new empty items show up at Wikidata:Database reports/without claims by site/nlwiki. People keep an eye on that, bottom article is May 2014.
The result is quite good. We did have some annoying encounters with well willing users who ran some automated jobs blindly creating empty items for everything messing up this whole process. We kindly asked them to never run on nlwiki again, they didn't speak the language anyway.
Maybe set up the same system for other languages like enwiki? I can enable the broom wagon for enwp too or you can run it yourself. Multichill (talk) 09:24, 21 January 2017 (UTC)
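The broom wagon rule described above (unconnected, at least 4 weeks old, not edited for at least 3 weeks) could be expressed roughly like this. It is only a sketch of the stated thresholds, not the actual bot code; the function name and parameters are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

MIN_AGE = timedelta(weeks=4)    # article must be at least 4 weeks old
MIN_QUIET = timedelta(weeks=3)  # and not edited for at least 3 weeks


def needs_empty_item(item_id: Optional[str],
                     created: datetime,
                     last_edited: datetime,
                     now: datetime) -> bool:
    """Return True if a bot following the rule above would create a new
    (empty) item for the article."""
    # Articles that are already connected to an item are left alone.
    if item_id is not None:
        return False
    return (now - created) >= MIN_AGE and (now - last_edited) >= MIN_QUIET
```

The waiting period is what gives human editors time to connect or merge the article by hand before the bot steps in.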
Multichill, I am sure that there are a number of ways it can be done, and I am glad that nlWP is showing some leadership in having a system. The users doing this are certainly well-meaning; I just think that it is short-sighted of them to do it, and that the tools are so (easily) configured that way rather than it being an advanced function. Even a ready means to run a Duplicity-type check would help, so that where the label exists it is pushed to a duplicate check, or to a report that needs an override tick before such bot runs go through. There are people here more knowledgeable about the tools/processes/solutions; I am just venting about my (negative) experiences in the hope that there is a little consensus and the ability to move to a plan that generates fewer duplicates. -- billinghurst sDrewth 10:37, 21 January 2017 (UTC)
  • Personally, I think it takes more time to merge items than to run Duplicity (Q23751912). Except if one has some plan to merge duplicates oneself, I don't think one should run PetScan with the option "select all".
    --- Jura 10:29, 21 January 2017 (UTC)
    • I don't think so. Almost all tools and reports (petscan, harvesttemplate, projectmerge, without claims by site, Wikidata Games, deaths at Wikipedia, etc.) do not work for unconnected pages. Therefore, it's better to have many stub items and improve them later.--GZWDer (talk) 11:41, 21 January 2017 (UTC)
      • The idea is to limit the number of items for the same topics. Of course we can use those tools on two items and add twice dates of birth, twice date of death, twice occupation, etc, but we don't need to do that on two separate items. If you use the filter PetScan provides to avoid duplicates, you should be fine for most wikis.
        --- Jura 11:46, 21 January 2017 (UTC)
      • Duplicity (Q23751912) is a much more convenient tool than projectmerge. So no, it's a bad idea to create items only for them to end up on projectmerge.
        --- Jura 11:57, 21 January 2017 (UTC)
        • No data will be lost when items get merged, but no data can be added if the page is unconnected. Making new items with data also helps with finding merge candidates.
        • Duplicity is indeed a good tool, but we don't have enough people to check every new page. PetScan already allows skipping pages whose terms already exist in Wikidata, which may be link candidates.

--GZWDer (talk) 13:59, 21 January 2017 (UTC)

Can you confirm that you (your flood account) uses that option?
--- Jura 14:02, 21 January 2017 (UTC)
+1 with billinghurst. I am curating one dataset after someone created several thousand new items 4 months ago: there were around 800 potential duplicates. Around 700 are still there now: it seems that I am the only one doing the curation job.
@Pigsonthewing: Not difficult: when you have a dataset, instead of importing the data by creating new items, keep the dataset on your computer, match the identifiers of your dataset with the ones available in WD, and at the end create items only for the values without a correspondence in WD. And you can be creative: instead of using identifiers, you can use combinations of properties, such as date of birth and date of death for persons. Extract the dates of birth and death from WD and compare them with the ones in your dataset. If you find a correspondence (same dates of birth and death), it is worth checking before any item creation whether the item matches an entry in your dataset. Some kind of preprocessing job. Snipre (talk) 07:05, 22 January 2017 (UTC)
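A rough sketch of the preprocessing job described above, assuming the dataset and the Wikidata extract have already been loaded into plain Python structures. The field names ("identifier", "birth", "death") and the function name are hypothetical; how the Wikidata side is extracted (dump, query service, etc.) is left open.

```python
from typing import Dict, List, Tuple


def split_dataset(records: List[dict],
                  wd_by_identifier: Dict[str, str],
                  wd_by_dates: Dict[Tuple[str, str], List[str]]):
    """Split dataset records into three groups:
    matched   - identifier already present in Wikidata (no new item needed)
    to_review - same birth and death dates as existing items (check by hand)
    to_create - no correspondence found (candidates for new items)
    """
    matched, to_review, to_create = [], [], []
    for rec in records:
        ident = rec.get("identifier")
        if ident and ident in wd_by_identifier:
            matched.append((rec, wd_by_identifier[ident]))
            continue
        key = (rec.get("birth"), rec.get("death"))
        # Only use the date pair when both dates are known.
        if None not in key and key in wd_by_dates:
            to_review.append((rec, wd_by_dates[key]))
            continue
        to_create.append(rec)
    return matched, to_review, to_create
```

Only the `to_create` group would be sent to item creation; the `to_review` group is a list of potential matches for a human to confirm or reject, which avoids both blind duplicates and blind merges.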
It's worse if a bot falsely considers two items to be a match than if it creates duplicate items. In cases like important people where different people have the same name I don't see the problem with creating more new items. If someone later has a problem with the fact that duplicate items exist they can merge. ChristianKl (talk) 07:43, 22 January 2017 (UTC)
"they can merge".. interesting attitude. I noticed you haven't done so in the last three months.
--- Jura 09:34, 22 January 2017 (UTC)
It's not true that I didn't use merge in the last three months. But even if that would be true it just suggests that the items with whom I have dealt have no need for merging. ChristianKl (talk) 18:33, 22 January 2017 (UTC)
@ChristianKl: I didn't say do not use bots, nor that bots were not helpful, nor that the occasional duplicates are a problem, and not one word about using bots to do matches and merges. So an off-hand dismissal of the issue is not helpful.

I asked that we don't do brute force creations; that means considering the prospect of duplicates prior to running a bot through, especially as there are already bots that do matching of similar data. I still think that a dummy run that moves potential matches to a separate list, which is then reviewed, is doable.

The reason why duplicates are a problem is the wasted resources in trying to identify differences from a WD item to another WD item that both attach to other sources. Far easier to have that diligence applied at the time of creation.  — billinghurst sDrewth 03:28, 23 January 2017 (UTC)

Having just done a five-way merge (Q21542964/ Q21454103 / Q21555944 / Q21546110 / Q21288402), can I please encourage people to clean up as they go along. Unmerged items really mess up our cross-referencing between different databases, and they make life far more difficult for tools like the auto-matcher in Mix'n'match. Of course, don't do bad merges, because they are also a nightmare. But please, don't leave potential unmerged items around for more than one or two weeks maximum without checking. If you can't check for potential merges within that sort of timescale, please then throttle your uploads and don't upload so much data at once -- don't upload more than you can check. Jheald (talk) 18:00, 22 January 2017 (UTC)

Cavalier

There is a problem on the German and English Wikipedia pages. The English page w:Cavalier does not have a link entry for the German page w:de:Kavalier. The German page does have a link to the English page, but it is missing a lot of the links that appear on the English page.

  1. Can someone please fix it.
  2. Can someone please leave me a link on my talk page to advise on how to fix such problems in the future (teach a man to fish).

-- PBS (talk) 09:18, 21 January 2017 (UTC)

@PBS: They are different items Cavalier (Q2284765) and chevalier (Q354421) with some overlap in languages. Someone is going to need to sort out whether they are specifically different, or there is a conflict to resolve.  — billinghurst sDrewth 10:40, 21 January 2017 (UTC)
Noting that chevalier (Q354421) is a disambiguation page and it looks like an article linking to it, so that seems in error.  — billinghurst sDrewth 10:42, 21 January 2017 (UTC)
de:Cavalier (a dab page) and en:Cavalier (disambiguation) link to another item, Cavalier (Q420534). So it appears that the pages that link to chevalier (Q354421) should be linked to Cavalier (Q2284765) if they are articles about cavaliers, to Cavalier (Q420534) if they are dab pages, or to some other item if they are just about a horse rider (equestrianism (Q179226)) or about cavalry (Q47315); the link to chevalier (Q354421) should probably be removed as a duplicate. -- PBS (talk) 11:22, 21 January 2017 (UTC)
this is a special case (aren't they all) where the english has an historical connotation, versus generic term. i guess we will have to hand fix these. don't know if there is a mix and match tool to provide a list, or a flag. Slowking4 (talk) 23:21, 22 January 2017 (UTC)

not applicable

Hello. Fetching data from wikidata to Greek Wikipedia, when the property has "no value", we get "not applicable". Do you know where to translate this? Xaris333 (talk) 14:20, 21 January 2017 (UTC)

el:Module:I18n/wikidata--GZWDer (talk) 14:52, 21 January 2017 (UTC)

Thanks! Xaris333 (talk) 15:37, 21 January 2017 (UTC)

@GZWDer: When a property has a date such as "June 2016", we have a problem in the Greek language (we have w:Grammatical case). In Wikidata it is correct: "Ιούνιος 2016". But in Wikipedia it shows "Ιουνίου 2016" (which would be correct if we had the full date, e.g. 25 Ιουνίου 2016). Do you know how to correct this? Xaris333 (talk) 20:04, 21 January 2017 (UTC)

This is an issue of local Module:Date. @Jarekt:.--GZWDer (talk) 07:43, 22 January 2017 (UTC)
I do not know how Module:Date is used on Wikidata, but {{ISOdate}} calling this module seems to work correctly:
  • {{ISOdate|2016-06|lang=el}} gives ""
  • {{ISOdate|2016-06-25|lang=el}} gives ""
@Zolo: maybe you know how to fix this. --Jarekt (talk) 16:36, 22 January 2017 (UTC)
@Xaris333: using w:el:Template:ISOdate works for me. So maybe elWP needs to update their templates??? You haven't provided examples for us to explore at that end, though it seems that it may simply be how elWP is calling the data from WD.  — billinghurst sDrewth 00:46, 23 January 2017 (UTC)
@billinghurst: @Jarekt: See w:el:Ολυμπιακός Λευκωσίας (ποδόσφαιρο). It is correct in Wikidata (Ιούνιος 1931) but wrong in the article (Ιουνίου 1931). Xaris333 (talk) 15:22, 23 January 2017 (UTC)
This seems to be an error in the #time parser function. Calling {{#time:F Y|1931-06|el}} on Wikidata returns Ιούνιος 1931 but on elwiki Ιουνίου 1931. --Pasleim (talk) 16:22, 23 January 2017 (UTC)
@Pasleim: We use "Ιουνίου" when we have the full date. For example, 26 Ιουνίου 1932. But if we have only month and year, we must have Ιούνιος 1932. The difference is a matter of w:Grammatical case. Xaris333 (talk) 17:26, 23 January 2017 (UTC)
Xaris333, we understand that, but it seems like this issue has nothing to do with Wikidata. As user:Pasleim pointed out, the same call to the basic parser function gives different results on Wikidata and el-wiki, and you identified the el-wiki answer as the wrong one. I think you will need to alert the technical community on el-wiki about this or file a bug report. --Jarekt (talk) 20:32, 23 January 2017 (UTC)
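For readers unfamiliar with the grammatical-case issue in this thread, the expected behaviour can be sketched as below. This is an illustration only, not how Module:Date or the #time parser function is implemented; the function name is hypothetical and the table holds only the month forms quoted in the discussion.

```python
# (nominative, genitive) forms; only June is filled in, taken from the
# examples quoted in the discussion above.
MONTHS_EL = {
    6: ("Ιούνιος", "Ιουνίου"),
}


def format_greek_date(year: int, month: int, day: int = None) -> str:
    """Format a date in Greek, choosing the month's case by precision:
    nominative for month+year, genitive when a day is given."""
    nominative, genitive = MONTHS_EL[month]
    if day is None:
        return f"{nominative} {year}"
    return f"{day} {genitive} {year}"
```

With month precision this yields "Ιούνιος 1931", and with day precision "25 Ιουνίου 2016", which matches what Xaris333 describes as correct.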

Commons cat -> Wikidata script now working again

This script adds a small box on a Commons category page, to let you know if there is a corresponding article-like item on Wikidata with a Commons category (P373) pointing to the Commons category page.

The script runs whenever you're browsing Commons categories. If the Commons cat page doesn't already include a Wikidata link on the page, it's well worth adding one, using e.g.:

I find it quite useful to spot when P373s are missing, on Commons categories that really ought to have them; and also, to stop me adding a P373 to a Commonscat, if there's one already I didn't know about from an existing item -- a sign that, instead, the two items should probably be merged.

To give it a go, simply add the line

importScript('User:Jheald/wdcat.js');

to your common.js on Commons.

It had stopped working because it was previously relying on WDQ for its lookups; I've now tweaked it to use SPARQL instead.

I think I got the changes correct, but do give it a try & let me know if anything doesn't work.

All best, Jheald (talk) 21:31, 21 January 2017 (UTC)
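For anyone curious what a SPARQL lookup of this kind might look like, here is a minimal sketch. It is not the query used in wdcat.js (the actual script is JavaScript and its query is not shown here); it only builds a request for the public query service at query.wikidata.org, assuming that Commons category (P373) values are stored without the "Category:" prefix. The function names are hypothetical.

```python
from urllib.parse import urlencode

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_p373_query(category_name: str) -> str:
    """Build a SPARQL query for items whose Commons category (P373)
    equals the given category name."""
    # Escape backslashes and double quotes for the SPARQL string literal.
    escaped = category_name.replace("\\", "\\\\").replace('"', '\\"')
    return f'SELECT ?item WHERE {{ ?item wdt:P373 "{escaped}" }}'


def build_request_url(category_name: str) -> str:
    """Return a GET URL for the query service that asks for JSON results."""
    params = {"query": build_p373_query(category_name), "format": "json"}
    return f"{SPARQL_ENDPOINT}?{urlencode(params)}"
```

If the query returns more than one item for the same category, that is the situation described above where the items are probably candidates for merging.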

@Jheald: Useful, thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:40, 22 January 2017 (UTC)
@Jheald: thanks from me as well. --Jarekt (talk) 20:34, 23 January 2017 (UTC)

Irish-American - is it an instance of ethnic group?

I'm looking at the edits of Special:Contributions/69.115.73.113 and s/he is adding "ethnic group" to items such as Irish American, Cuban American, Jamaican American. Should they have a P31 of "ethnic group"? Strictly speaking, Irish is an ethnicity, but Irish American? To me, it's more of an identity, not really an ethnic group. And I see identity as related to but distinct from ethnic group. Any opinions?

Two issues I want to bring up with this user.

  1. Two items, Q28445406 and Q28469401: judging by the IMDb profiles, the roles are extremely minor.
  2. Also, their edits to some of the articles are just head scratchers.

-- MechQuester (talk) 06:49, 22 January 2017 (UTC)

It's all context, really. It might be an ethnic group in a third country, e.g. an Irish-American community in Brazil; however, it is not really an ethnic group in the US. I would agree with you; in a second country it is more an identity.  — billinghurst sDrewth 07:46, 22 January 2017 (UTC)
I think we need a specific class for concepts such as the American administration's "ethnic group". This might be valuable information for Wikidata, but administrative concepts might not really qualify as scientific truth and might be more of a political artifact. This would give something like:
< Irish American > instance of (P31) < USA administration "ethnic group" >
author  TomT0m / talk page 09:10, 22 January 2017 (UTC)
I'm not sure there is a useful distinction between the Irish ethnic group and the Irish American ethnic group. But when deciding the question, it would probably be more useful to look at the period 1845 to around 1930 in the US, when many more distinctions were made between Irish Americans and other Americans of European extraction. Of course today hardly any distinctions are made. Jc3s5h (talk) 14:53, 22 January 2017 (UTC)
The point is not to make a difference, it's to note that the concept is typically American, and more precisely is used by the US administration. Whether or not the concept has any relevance at all is a matter of philosophical debate. But it should be clear that it's not a universally accepted classification. author  TomT0m / talk page 07:53, 23 January 2017 (UTC)
This is a recurring editor who keeps adding non-notable items and unsourced statements. See User talk:69.115.75.25 and User talk:173.2.62.239. Sjoerd de Bruin (talk) 15:50, 22 January 2017 (UTC)
@Sjoerddebruin, he is back. MechQuester (talk) 20:51, 22 January 2017 (UTC)
Please use {{Ping}} next time. Blocked. Sjoerd de Bruin (talk) 19:01, 23 January 2017 (UTC)

What if duplicate items are created?

Hi! I imagine that some duplicates of items are created every now and then. A user may not know that the item exists, create a new one and go on with life. Are such duplicates being automatically detected somehow? Or are there some tools out there to help find them? I guess the Wikidata game had such capabilities at some point, but are there other options? //Mippzon (talk) 16:54, 22 January 2017 (UTC)

  • Even when one takes reasonable care, it's likely to happen. Help:Merge provides an introduction.
    --- Jura 17:03, 22 January 2017 (UTC)
There are projects around looking at duplicates, have a poke at User:Pasleim/projectmerge. There are also tools like Wikidata-todo's Duplicity that allows you to work from a list to look for potential matches.  — billinghurst sDrewth 00:34, 23 January 2017 (UTC)

Wikinews sitelinks

What are wikinews sitelinks used for? Should all news of a person be in their wikidata item? Chicocvenancio (talk) 18:31, 22 January 2017 (UTC)

Oh, no! Every Wikinews item has its own Wikidata item, and the sitelink links to one or several language versions of the item. If a person is the main subject, you can use the property "main subject" to link it. It can't be connected directly, as one person can have many different news articles connected. Edoderoo (talk) 22:58, 22 January 2017 (UTC)
PS, see d:Q28473965 for an example. Edoderoo (talk)
i do not see any reason to have a data item on every story. you do not have an item to every nytimes article.[1] i note russia wikinews has a link to article subject, but english wikinews does not. maybe you should have a category to link to. Slowking4 (talk) 23:15, 22 January 2017 (UTC)
@Slowking4: It's definitely true that there are very few interwiki links for individual news stories, so one of the functions of Wikidata isn't really applicable there. But we can have items for individual stories and they can have meaningful statements about their main subject, for instance and someone performing searches could get more useful information from that. —Justin (koavf)TCM 18:27, 23 January 2017 (UTC)
Individual news stories seem like a particularly awkward case for wikidata-based interwikis (although these same problems occur with all wikidata-based interwikis, the problems are particularly blatant with news articles). Wikidata tries to provide separate items whenever there is any distinction between things, it tries hard to not conflate distinct things, whereas interwikis should be provided as often as possible whenever there is a most-nearly-analogous page on another project. But with news stories, it's rather routine for two different news stories, supposedly "about the same event", to be really two different news stories. A news article is a snapshot in time, and under ordinary circumstances no two snapshots of something are quite the same; they're taken at slightly different moments, they're taken from slightly different angles, the lighting is different. If there's some big disaster (an earthquake, or tsunami, or airplane crash, or whatever), and French Wikinews publishes an article about it on Tuesday, English Wikinews publishes an article about it on Wednesday (containing some information on development of the story after the French article was published), German Wikinews publishes one Thursday, and French publishes another on Friday, probably no two of those four articles are alike. It'll be a problem for English or German Wikinews to decide which of the two French articles to interwiki to, but whatever they decide Wikidata would be wise to position itself so that it can avoid the politics of the question. It can get to be even more fun (so to speak) if French and German and English Wikinews all publish on Tuesday but choose to focus on very different aspects of the disaster. --Pi zero (talk) 21:13, 23 January 2017 (UTC)
it's all good, i will just go and create a WD item for each EB1911 article, and pubmed article. and especially every article used as a reference. much easier to find them. Slowking4 (talk) 02:55, 24 January 2017 (UTC)

Should we use part of (P361) or location (P276)?

Should we use part of (P361) or location (P276) when stating that a roller coaster (Q204832) is in a specific amusement park? //Mippzon (talk) 08:07, 23 January 2017 (UTC)

I think location makes more sense. ChristianKl (talk) 10:44, 23 January 2017 (UTC)
Maybe both?--Ymblanter (talk) 16:25, 23 January 2017 (UTC)
When browsing roller coaster (Q204832) items, I noticed that many of them used part of (P361). But maybe, as you say, we should use both? //Mippzon (talk) 16:48, 23 January 2017 (UTC)

Find all articles from specific category (recursively) that do not have WikiData entry

Do you know a way to find all articles in a specific local Wikipedia that have a specific category, or any of its subcategories, assigned and that do not have an entry in Wikidata? I would like to add such entries, but it will take a lot of time to find such articles in the category that I need, since there are a lot of them. --StanProg (talk) 10:50, 23 January 2017 (UTC)

You can use PetScan. For example, this is all articles from the "Страницы разрешения неоднозначностей по алфавиту" category of ruwiki that do not have a Wikidata item.--ԱշոտՏՆՂ (talk) 12:03, 23 January 2017 (UTC)
Duplicity also provides such functionality. It offers possible matches on Wikidata as well. —MisterSynergy (talk) 20:19, 23 January 2017 (UTC)

Wikidata weekly summary #244

I've noticed that all wind mills/tide mills/fire stations in these queries have no sitelinks. Are they notable? Can I create such items for the same objects in my city? --Infovarius (talk) 15:29, 27 January 2017 (UTC)

New nomination

Hello. Please participate here. Thank you --ديفيد عادل وهبة خليل 2 (talk) 08:28, 24 January 2017 (UTC)

We don't need a sandbox item.

Given that we have https://test.wikidata.org/ I don't see the point of Wikidata Sandbox (Q4115189). I think it would be useful if the main website would link to https://test.wikidata.org/ for the preferred sandbox.

Having the Sandbox item on the main Wikidata means that it might influence queries. ChristianKl (talk) 11:38, 24 January 2017 (UTC)

Can you test modules on client projects through test.wikidata.org? I know that the sandbox item is used a lot for that on svwiki. Ainali (talk) 12:03, 24 January 2017 (UTC)
I don't think test.wikidata.org can be considered a replacement for the sandbox item. It doesn't have the same properties and items, it doesn't have the same gadgets and doesn't always behave the same. I'm also not aware of any tools which allow you to pick which server to edit against (e.g. what if you want to test your QuickStatements commands?). - Nikki (talk) 13:02, 24 January 2017 (UTC)
There is a whole group of sandbox items, not only one. I think they are very useful, as Ainali and Nikki describe above. If a bot empties these items once a week, they only cause temporary problems in the queries. -- Innocent bystander (talk) 13:49, 24 January 2017 (UTC)
@Ainali: We can import (or just copy em since all non-Main/Property namespaces are still CC BY-SA?) modules from en/fr/de... wikis, like what a number of Indic Wikipedias are doing. --Liuxinyu970226 (talk) 15:14, 24 January 2017 (UTC)
@Liuxinyu970226: How would that help the users on those projects? They most probably want to test the modules in their real environment since they in turn probably rely on other modules and templates. Ainali (talk) 15:19, 24 January 2017 (UTC)
One reason we have not imported these modules is that we have locally discovered demands that these (en/fr/de) modules have not fulfilled yet. -- Innocent bystander (talk) 17:09, 26 January 2017 (UTC)
Maybe ChristianKl can outline how he does testing?
--- Jura 17:19, 26 January 2017 (UTC)

Property documentation template question

Would it be possible for someone to have a look at the question I raised here two weeks ago? Thanks. Carcharoth (talk) 11:50, 24 January 2017 (UTC)

How does one edit pages like Module:Property documentation? It looks really difficult to do. Carcharoth (talk) 11:57, 24 January 2017 (UTC)
Yeah, they are called modules and you have to know Lua (Q207316) in order to maintain them. Matěj Suchánek (talk) 12:58, 24 January 2017 (UTC)
@Jura1: You added this in Special:Diff/330807441. Matěj Suchánek (talk) 12:58, 24 January 2017 (UTC)
What I want to do is change "linkText = "+"" to something like "linkText = "run query to view all instances of this property"", which is more informative than "+". I worked out that you have to click the 'paragraph' button 'P' in the toolbar at the top (the 'toggle invisible characters' bit) to be able to see that and edit that. Very opaque and difficult for new editors to understand and access. Wikidata doesn't make it easy for people to understand how to edit. Carcharoth (talk) 13:09, 24 January 2017 (UTC) Made the change I wanted to make, see here. Carcharoth (talk) 13:11, 24 January 2017 (UTC)

It is still not quite right. If you look at Property talk:P1920 you see that the links at the top link to running a query. So if you want to find all instances of that property, you click once to bring up the query, and then again to bring up the results. Wouldn't it make more sense to link people in one click to a list of all instances of a property? It would also make more sense to link it from the 'Current uses: 192' bit on that page. It would also make even more sense to link from the main property page, rather than the talk page. Where was the decision taken to put stuff like this on the talk pages? Carcharoth (talk) 13:18, 24 January 2017 (UTC)

Fallback languages for labels in clients

Fallback languages as of 2015

The Lua functions mw.wikibase.label and mw.wikibase.getLabelWithLang in Wikibase clients use fallback languages according to their descriptions. Where and how is the used sequence of fallback languages defined? Thank you for your help. Dipsacus fullonum (talk) 13:24, 24 January 2017 (UTC)

It's hard-coded inside MediaWiki. The rules are:
  1. the last language is always English,
  2. variants like Swiss German fall back to the main (parent) language,
  3. some languages may also fall back to a language which the majority of their speakers understand; you can find this information for each language inside this folder, e.g. here you can find that Czech falls back to Slovak.
Matěj Suchánek (talk) 14:46, 24 January 2017 (UTC)
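The three rules above can be sketched in a few lines of Python. This is only an illustration: the real chains are hard-coded in MediaWiki, and the small table below is a hand-written, hypothetical excerpt (the "de-ch" entry stands in for the Swiss German example, the "cs" entry for the Czech one). The function names are assumptions as well.

```python
# Hypothetical excerpt of per-language fallbacks; not the real MediaWiki data.
DIRECT_FALLBACKS = {
    "de-ch": ["de"],  # rule 2: a variant falls back to its parent language
    "cs": ["sk"],     # rule 3: a language most speakers are said to understand
}


def fallback_chain(lang):
    """Return the language codes to try, in order, for a requested language."""
    chain = [lang]
    for code in DIRECT_FALLBACKS.get(lang, []):
        if code not in chain:
            chain.append(code)
    # Rule 1: English is always the last resort.
    if "en" not in chain:
        chain.append("en")
    return chain


def pick_label(labels, lang):
    """Return (language code, label) for the first language that has a label."""
    for code in fallback_chain(lang):
        if code in labels:
            return code, labels[code]
    return None
```

For example, `pick_label({"en": "Prague", "sk": "Praha"}, "cs")` would return the Slovak label under these assumed chains, because "sk" comes before "en".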
I think if fallbacks other than these chains are also needed, but shouldn't be hardcoded (e.g. we finally cancelled uk -> ru fallback, but some non-WMF users (i.e. translatewiki.net (Q9376349)) may still want it; and at least I wanna fallback yue to zh-hant or zh-hk/zh-mo but @Shinjiman: opposed it?), then this Phabricator task may help you. --Liuxinyu970226 (talk) 15:07, 24 January 2017 (UTC)
Also, regarding this svg file, I suggest that zh-hans and zh-hant fall back to each other, like pt <-> pt-br and cdo <-> nan. --Liuxinyu970226 (talk) 15:11, 24 January 2017 (UTC)
Thank you all for the answers. It was not what I hoped for, but new functions to get labels with configurable fallback languages fortunately can be made in Lua. Best regards, Dipsacus fullonum (talk) 21:15, 24 January 2017 (UTC)

Leader not head of a country

When a person is recognised as a leader of his people but is not the head of government, how do you model this in Wikidata? Thanks, GerardM (talk) 15:58, 24 January 2017 (UTC)

Still a leader of the country, in my opinion. He's the leader of the state known as the country, not the government, thus a representative. MechQuester (talk) 16:07, 24 January 2017 (UTC)
We are talking about a function of a people who has no formal connection to a government.. nor of a country as that people may live in multiple countries. Thanks, GerardM (talk) 05:05, 25 January 2017 (UTC)
A person, "who manages any kind of group", here population of a country / multiple countries is manager/director (P1037). - Kareyac (talk) 06:45, 25 January 2017 (UTC)
Let's make the assumption that there is a (kind of) leader of all Elfdalian (Q254950)-speaking people; then I guess it's wrong to put that claim in the item about the "country of Elfdalia" or in the item about the language of Elfdalian. It is probably better to have such information in an item dedicated to "the (informal) group of Elfdalian people and speakers of Elfdalian". That item could be described as a sort of "informal network", without government. Such a network can have a (sort of) leader, a founder or father/mother-figure. -- Innocent bystander (talk) 08:37, 25 January 2017 (UTC)
For founder we have founder (P112). Innocent bystander, sorry for misleading typo. - Kareyac (talk) 09:33, 25 January 2017 (UTC)
A country can have a head of state, they can have a head of government; they can have viceroy type of position. But what are you meaning by a "leader"? They have an official position or they do not. If they are an influencer, a spokesman, an intellectual, ... So I think that the loose use of the word leader, needs to be better qualified as there is some reason or purpose behind their rise in the area of influence.  — billinghurst sDrewth 09:47, 25 January 2017 (UTC)

Can Descriptioner override the existing description?

I'm just wondering. MechQuester (talk) 16:17, 24 January 2017 (UTC)

no --Pasleim (talk) 17:07, 24 January 2017 (UTC)

Poulenc

In English, we have two lists of his compositions, en:List of compositions by Francis Poulenc and en:FP (Poulenc). The former corresponds to Italian and Japanese, the latter to the French fr:Liste des œuvres de Francis Poulenc. Help? --Gerda Arendt (talk) 17:29, 24 January 2017 (UTC)

I do not think we can help unless you decide to merge the two entries on the English Wikipedia. If you think that Italian, Japanese, and French should be linked to the same English item this can be easily done.--Ymblanter (talk) 20:00, 24 January 2017 (UTC)
A merge was done for Max Reger, but Poulenc was written by an editor who retired, - so for respect I wouldn't want to touch it. Also: some argue that people who can't sort would still need the bulleted list. I could imagine one entry for list, the other for complete catalogue. --Gerda Arendt (talk) 12:47, 25 January 2017 (UTC)
@Gerda Arendt:, on Wikidata all three items (fr, it, ja) are connected to en:List of compositions by Francis Poulenc, and en:FP (Poulenc) is not connected to anything. The link you see at en:FP (Poulenc) is because someone added by hand an interwiki link to the English article. If you want, you can also add there Italian and Japanese links.--Ymblanter (talk) 13:26, 25 January 2017 (UTC)
I was the someone. What I miss would be a link from French to that article, because FP is a translation of the French, while the English list (translated to Italian and Japanese) is only a subset. --Gerda Arendt (talk) 13:29, 25 January 2017 (UTC)
We have a MediaWiki restriction that one Wikidata item can only link to one Wikipedia article in a given language, and one Wikipedia article can only be linked from one Wikidata item. This means that in the given configuration on the Wikidata end we can only shift links around (for example, move the French link from one item to another one). We cannot have the same French article referenced from two Wikidata items. It is technically not possible.--Ymblanter (talk) 13:48, 25 January 2017 (UTC)
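The restriction described above amounts to a two-way uniqueness constraint, which can be modelled roughly as follows. This is a toy Python sketch, not how Wikibase stores sitelinks; the class, its methods and the item ids used in the usage note are all hypothetical.

```python
class SitelinkRegistry:
    """Toy model: one page per (item, wiki), and one item per (wiki, page)."""

    def __init__(self):
        self._by_item = {}  # (item_id, wiki) -> page title
        self._by_page = {}  # (wiki, page title) -> item_id

    def add(self, item_id, wiki, title):
        """Attach a page to an item, refusing anything that breaks uniqueness."""
        if (item_id, wiki) in self._by_item:
            raise ValueError(f"{item_id} already has a sitelink on {wiki}")
        if (wiki, title) in self._by_page:
            owner = self._by_page[(wiki, title)]
            raise ValueError(f"{wiki}:{title} is already linked from {owner}")
        self._by_item[(item_id, wiki)] = title
        self._by_page[(wiki, title)] = item_id

    def move(self, wiki, title, new_item_id):
        """Shift an existing sitelink to another item (the only option left)."""
        old_item_id = self._by_page[(wiki, title)]
        if (new_item_id, wiki) in self._by_item:
            raise ValueError(f"{new_item_id} already has a sitelink on {wiki}")
        del self._by_item[(old_item_id, wiki)]
        self._by_item[(new_item_id, wiki)] = title
        self._by_page[(wiki, title)] = new_item_id

    def item_for(self, wiki, title):
        """Return the item linked to this page, or None."""
        return self._by_page.get((wiki, title))
```

With placeholder ids such as "Q_LIST" and "Q_CATALOGUE", adding the French list to "Q_LIST" and then trying to add the same page to "Q_CATALOGUE" raises an error, while `move` succeeds. That is the "shift links around" case.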

en:FP (Poulenc) is about a specific catalog of Poulenc's works, whereas en:List of compositions by Francis Poulenc is a list of compositions by Poulenc. Whereas these items are highly correlated, they are quite different. It might be a bit easier to see for Mozart, where the same difference is between en:Köchel catalogue and en:List of compositions by Wolfgang Amadeus Mozart. Or, put differently, the list of compositions are Wikipedia list articles, the catalogues are "real world" entities that exist outside of Wikipedia. This is particularly important where there are several catalogues for a single composer. --Denny (talk) 17:40, 25 January 2017 (UTC)

  • I completed the two items (Q1469914, Q28441364). From its title, I think it's correct to link the French list from the first one.
    --- Jura 08:21, 26 January 2017 (UTC)
  • I've renamed the English article from FP (Poulenc) to FP (catalogue), as "Poulenc" is more a synonym than a description of the title. And the author is Schmidt after all. --Infovarius (talk) 14:42, 27 January 2017 (UTC)

Logical OR in the SPARQL templates

I'm using the SPARQL and SPARQL2 templates. My latest SPARQL query contains a logical OR, indicated in SPARQL by two pipes:

FILTER (!BOUND(?lang) || ?lang = wd:Q1860)

And the template cuts it off (See here). What do I need to do to escape the double pipe? Or is there a change that can be made to the templates? Thanks in advance for any help, MartinPoulter (talk) 20:23, 24 January 2017 (UTC)

Use {{!}} → |. Matěj Suchánek (talk) 20:33, 24 January 2017 (UTC)
I coded something in Lua to avoid having to escape things. The idea is that it's easy in Lua to concatenate all the numeric arguments of a template call with pipes between them, so that it becomes unnecessary to escape the pipes. I'll find it again, please stand by. author  TomT0m / talk page 11:51, 25 January 2017 (UTC)
It's Module:ConcatArgs and actually it's used in the SPARQL template. Maybe there is a bug concerning the double pipe, I'll investigate. author  TomT0m / talk page 11:56, 25 January 2017 (UTC)
Found the problem : the "=" symbol. Mediawiki thinks "?lang" is a named parameter. It works without escaping by saying it's supposed to be the 2nd numeric parameter :
{{SPARQL|query=SELECT DISTINCT ?person ?name ?language ?death (URI(CONCAT("https://www.gutenberg.org/ebooks/author/", ?gutenberg)) AS ?gberglink) WHERE {
  ?person wdt:P1938 ?gutenberg.
  ?person wdt:P570 ?death. # Dead people only
  MINUS {
    ?enws schema:about ?person.
    ?enws schema:isPartOf <https://en.wikisource.org/>
  }
  OPTIONAL {?person wdt:P1412 ?lang}.
  FILTER (!BOUND(?lang) ||2= ?lang = wd:Q1860) # Language: English or absent
  BIND(IF(BOUND(?lang),"English","Not specified") AS ?language)
  ?person rdfs:label ?name.
  FILTER((LANG(?name)) = "en")
}
ORDER BY ?death
}} 
I don't think it's fixable, so no better solution than the proposed one. author  TomT0m / talk page 21:34, 25 January 2017 (UTC)
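If queries are pasted into the template regularly, the {{!}} escaping suggested above can be automated before saving. A minimal Python sketch, assuming the template is called with a named query parameter as in the example; the helper names are hypothetical, and other wikitext-sensitive sequences (for instance a literal "}}" inside the query) are not handled.

```python
def escape_pipes(sparql):
    """Replace every pipe with the {{!}} magic word so wikitext keeps it literal."""
    return sparql.replace("|", "{{!}}")


def wrap_in_sparql_template(sparql):
    """Return a {{SPARQL}} call with the escaped query as the named parameter."""
    return "{{SPARQL|query=" + escape_pipes(sparql) + "}}"
```

Applied to the filter line from the question, `escape_pipes` turns the double pipe into two consecutive `{{!}}` calls and leaves the rest of the line, including the "=" sign, untouched.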

can't enter values for "inventory number"

At Cloudy Sky - Mediterranean Sea (Q18627381) [2] I can't enter the value "84.XM.1388" for Getty Museum. I can't even re-enter the existing value "1971.577.2" for Art Institute of Chicago. I checked the regex for this prop and it seems correct. So what gives? --Vladimir Alexiev (talk) 08:37, 25 January 2017 (UTC)

Let's take the stupid answers first?! Have you tried to reload the page? I experience problems like those you describe from time to time, but they are often solved by reloading the page (sometimes more than once). -- Innocent bystander (talk) 09:50, 25 January 2017 (UTC)
I sometimes encounter that problem if I copy values from a source that includes hidden characters. Have you tried typing the value manually? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:27, 25 January 2017 (UTC)
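One way to check a copied value for hidden characters before pasting it is sketched below in Python. The set of Unicode categories treated as "hidden" is an assumption chosen for illustration, and the function names are hypothetical; this does not reproduce whatever validation Wikidata itself applies.

```python
import unicodedata


def is_hidden(char):
    """True for control/format characters and for spaces other than U+0020."""
    category = unicodedata.category(char)
    if category in ("Cc", "Cf", "Zl", "Zp"):
        return True
    return category == "Zs" and char != " "


def find_hidden_characters(value):
    """List (index, code point, Unicode name) for each hidden character."""
    return [
        (index, f"U+{ord(char):04X}", unicodedata.name(char, "UNKNOWN"))
        for index, char in enumerate(value)
        if is_hidden(char)
    ]


def clean_value(value):
    """Drop hidden characters and surrounding whitespace."""
    return "".join(char for char in value if not is_hidden(char)).strip()
```

If `find_hidden_characters` returns an empty list for the value as copied, the cause probably lies elsewhere, for example in the page state, as suggested in the reply above.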
WORKSFORME? Multichill (talk) 17:34, 25 January 2017 (UTC)

How does patrolling in languages like Arabic and Chinese work in practice?

If I look at the recent changes list to check for vandalism, what am I supposed to do when I see a label in a language which I don't speak and which uses a different alphabet, so that I have no real ability to see whether it's right? What do other editors do? ChristianKl (talk) 11:47, 25 January 2017 (UTC)

More than in any other Wikimedia project, filtering is important for RC work here at Wikidata. There are at least two extremely useful tools to do that: User:Yair rand/DiffLists.js allows filtering of RC, watchlist and Special:Contributions in the Wikidata frontend; reCh by User:Pasleim provides similar functionality in an external application, including a very powerful batch-patrol functionality. By using one or both of these tools, one can filter RC in different manners and make sure that things one definitely does not understand (such as zh/ar terms, etc) do not overwhelm during RC work. —MisterSynergy (talk) 11:57, 25 January 2017 (UTC)
The reCh tool provides a translation feature. Behind terms, there is a small icon. If you click it, the English translation is displayed. --Pasleim (talk) 12:18, 25 January 2017 (UTC)
  • Is there a reason why the script by Yair rand isn't in the gadget list? ChristianKl (talk) 12:31, 25 January 2017 (UTC)
  • No idea, but I would support a proposal to make it a gadget. In my opinion it would even be worth implementing similar functionality directly in MediaWiki. The classical layout of RC, watchlist and Special:Contributions works nicely for unstructured text, but this project is fundamentally different. —MisterSynergy (talk) 12:39, 25 January 2017 (UTC)
  • I speak Chinese decently and for the most part, they are good. MechQuester (talk) 17:03, 25 January 2017 (UTC)

Language fallback

fallback chains (might be out of date)

I'm not clear how language fallback works. If we have a label and description in "en", is there any benefit in providing identical text in "en-GB", for example? If not, it seems to me that doing so increases the maintenance burden when one of those values is changed. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:25, 25 January 2017 (UTC)

The way it works is that if a user sets their preferred language to "en-GB", the software will first check whether we have a message in that language before moving on to the fallback language "en". So Andy, you are right: there is no benefit (that I can think of) to adding an "en-GB" message which is the same as "en". --Jarekt (talk) 18:43, 25 January 2017 (UTC)
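To illustrate the point about redundant values, the following Python sketch lists labels that merely repeat what their fallback language would display anyway. The fallback table passed in is a hypothetical, caller-supplied mapping (for example "en-gb" falling back to "en"), and the function name is an assumption, not part of any existing tool.

```python
def redundant_labels(labels, fallback_of):
    """Return language codes whose label equals the one shown via fallback.

    labels      -- dict of language code -> label text for one item
    fallback_of -- dict of language code -> ordered list of fallback codes
    """
    redundant = []
    for lang, text in labels.items():
        for code in fallback_of.get(lang, []):
            if code in labels:
                if labels[code] == text:
                    redundant.append(lang)
                # Only the first available fallback would actually be displayed.
                break
    return redundant
```

Labels that differ from their fallback, such as a British spelling, are left alone; only exact repeats are reported as candidates for removal.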
Can non-Wikimedia users of data use the same fallback mechanism? —Wylve (talk) 19:07, 25 January 2017 (UTC)
You can edit your modules even in WMF-wikis so that you can make a home-made "fallback". For example, having English as fallback for Swedish in words originally written with Cyrillic script (Q8209) is not a very good idea. Norwegian and Danish and even German are in those cases considered better. -- Innocent bystander (talk) 19:18, 25 January 2017 (UTC)
Module:Fallback implements the fallback chain in the image. c:Module:Date's langSwitch function calls mw.language.getFallbacksFor Lua function to do the job. The results are almost identical. --Jarekt (talk) 21:07, 25 January 2017 (UTC)

Need a second opinion in a discussion about a role of Wikidata

I am engaged in a lively debate with user:Brya at WikiProject_Taxonomy about the role of the Wikidata project as related to taxonomy. We seem to be unable to convince each other and nobody else from WikiProject_Taxonomy seems to be joining the discussion, so I am seeking more perspectives on the issue. We are discussing Q21440769, an item for a family of flies called Heterocheilidae (McAlpine, 1991). Unfortunately McAlpine, who named the family 26 years ago, made a mistake and used the same name that a family of worms ( Heterocheilidae (Q5320961) ) had been using since 1915. It does not seem like any biologist has published anything about this naming issue. Both names are being used at the same time, sometimes even in the same publication, like in this 2011 publication (pages 80 and 227), despite the fact that the ICZN convention states that the newer name is incorrect. We agree about those facts, but we do not agree about how to reflect them in the database. My position is that per Wikidata:Verifiability we report the most up-to-date facts from the literature, so we should report "Heterocheilidae" (McAlpine, 1991) as the current name and use the item to store data about this family of flies. We already have this item tagged as an instance of later homonym (Q17276484) and I think that is all we need to do, since we are only supposed to report published facts, not do our own investigation. user:Brya, who reverted my edits, has a firm belief that since the name is wrong we cannot use the item to store data about this family, until I "find[] a taxonomist to fix this permanently, for the whole world". Can someone help us find some middle ground here? --Jarekt (talk) 20:34, 25 January 2017 (UTC)

Technically, we can have two items, one for the flies, another for the wormies. They can have the same name/label (with a different description). That's all technically. If there are sources that the grouping is as you describe, we should register this, unless there are better sources that this grouping is outdated (then we might use qualifiers) or wrong (then we should forget about it, and name it in the article, but not in Wikidata). But my knowledge of taxonomy is pretty superficial. Edoderoo (talk) 21:08, 25 January 2017 (UTC)
@Jarekt: What kind of „facts“ do you want to add to this item? I've made some minor improvements around the genus Heterocheila (Q14604579) (2 species) including Relationships of the genus Heterocheila (Diptera: Sciomyzoidea) with description of a new family (Q28528221) by David K. McAlpine (Q21502932). --Succu (talk) 22:26, 25 January 2017 (UTC)
How about the information that was removed: taxon name (P225), taxon rank (P105), parent taxon (P171), and that it is a taxon (Q16521). That would be a good start. Some of the queries I was using rely on those properties. --Jarekt (talk) 02:38, 26 January 2017 (UTC)
Jarekt has his facts wrong: these two items do not use the same name. These are two different scientific names, that is, two different formal entities, that have in common that they share the same spelling. In all other respects they are different (like author, date, type). The later of these two names may not be used as the scientific name for a taxon (Article 52.2). At some point, a taxonomist will (hopefully) take action to amend the situation.
        However, we have an item for the later name, and properties can be stored there (and indeed are there), so the problem is limited. If desired, the name can be added using name (P2561) as a qualifier. What we really should have is a separate property "scientific name, that may not be used as a taxon name", but we don't have one. - Brya (talk) 04:40, 26 January 2017 (UTC)
Ok, different name with the same spelling. And the problem with the item is not limited if my attempts to place this family in the taxonomy tree or to provide the currently used scientific name are reverted. --Jarekt (talk) 14:01, 26 January 2017 (UTC)
I think there is little we can do at the moment. I wouldn't even recommend deprecating the official name of the family of flies since de facto it has become the official name. Not everyone actually follows all the rules to the letter even in academia. The maximum I can recommend is using different from (P1889) on both items and adding a note in the talk page. DGtal (talk) 06:54, 26 January 2017 (UTC)
But my issue is that it is not our job to be the spelling police for the biology community, correcting their errors. Our job is to make published information available in database form. If there is a published source saying not to use this scientific name, then we should change it, but at the moment that is the name used by the scientific community. The Wikipedia:No original research (Q4656524) policy prohibits this kind of behavior on Wikipedias, but I guess Wikidata is different. --Jarekt (talk) 14:01, 26 January 2017 (UTC)
It seems to me Jarekt is correct here - we should be reflecting what has been published by reliable sources, with proper referencing etc - the statement is not a statement by wikidata, it is a documentation by wikidata of what the source states. ArthurPSmith (talk) 15:19, 27 January 2017 (UTC)
But we are recording that the name has been published, and we could record by who the name is used. It is just that the fact that the name has been used, or even that the name is being used, does not mean that it is a valid name of a family: the rules say it can not be. - Brya (talk) 18:53, 27 January 2017 (UTC)
But Brya, our role is not to be interpreters of scientific rules, only to store published facts. So, if the name of this family is being used by the scientific community, then we should place it in Q21440769: taxon name (P225), taxon rank (P105), etc. If the taxon name is not in the taxon name (P225) property, it is not being recorded, since any other placement will be ignored by the queries. --Jarekt (talk) 19:23, 27 January 2017 (UTC)
Abbe98

Achim Raschka (talk)
Brya (talk)
Dan Koehl (talk)
Daniel Mietchen (talk)
Delusion23 (talk)
Faendalimas
FelixReimann (talk)
Infovarius (talk)
Joel Sachs
Josve05a (talk)
Klortho (talk)
Lymantria (talk)
Michael Goodyear
MPF
PhiLiP
Andy Mabbett (talk)
Prot D
pvmoutside
Rod Page
Soulkeeper (talk)
Tinm
Tommy Kronkvist (talk)
TomT0m
Notified participants of WikiProject Taxonomy. Can we have some more viewpoints from the Taxonomy project? --Jarekt (talk) 19:29, 27 January 2017 (UTC)

Before having an opinion, I checked up at WS and there they are separated through using (Diptera) resp. (Nematoda); Heterocheilidae_(Diptera) and Heterocheilidae_(Nematoda) Dan Koehl (talk) 20:11, 27 January 2017 (UTC)
Reg the Diptera: The name of this taxon appears to be invalid under the relevant nomenclatural code, as it is a junior homonym of Heterocheilidae Railliet & Henry, 1915. Dan Koehl (talk) 20:15, 27 January 2017 (UTC)
The question is: should Wikidata store the published scientific name of Q21440769 in taxon name (P225), even if our interpretation of the naming rules suggests that the name scientists have been using for all those decades does not meet the naming rules? That is the essence of this discussion. The note on WS just alerts people to the naming issue. --Jarekt (talk) 20:31, 27 January 2017 (UTC)
The question is: how to model this. --Succu (talk) 21:02, 27 January 2017 (UTC)
BTW: It's not our interpretation of the ICZN. It's one of the easier facts and a common rule called the Principle of Priority (Q2110868). So how many times was this name used to denote the taxon concept of David K. McAlpine (Q21502932)? You cited one. --Succu (talk) 21:36, 27 January 2017 (UTC)
I would not call the period from 1991 to now "all those decades". I am mystified by the "our interpretation of naming rules" (who is this we, where do "naming rules" come in, and what is this "interpretation"?). Also, the "scientific rules" is bewildering, science does not come into it.
        What we are dealing with here is the International Code of Zoological Nomenclature, one of a number of Codices, lawbooks, that have worldwide support, to the degree that there are no competing alternatives (that have any kind of real support). These lawbooks govern (and, to a real degree, create) their own autonomous nomenclatural universes.
        And in this particular case it is crystal clear that this name can not be used as the correct name of a taxon in this particular universe. The name has nevertheless been published, and is being used here and there (and we can record this usage). But probably it is not being used a lot, or it would have been prominent enough for somebody to take appropriate action.
        I find this desire to cover up errors and create an alternate reality quite disturbing. - Brya (talk) 05:24, 28 January 2017 (UTC)
Despite the fact that both taxa were named more than 20 years ago, they are still used in publications. If we draw the conclusion that a used taxon name (P225) is in fact not valid under the ICZN, however obvious in this case, that is an act of original research. We should not. Lymantria (talk) 07:29, 28 January 2017 (UTC)

<grin> we choose for a specific understanding of taxonomy and it is wrong </grin>. The scientific name is a combination of several parts. For a species it is a Genus, a species name, an author, a publication and a date. A species is defined by a type. The same type can be used for many scientifically correct genera at or below the level of species and they are roughly the same. Each entry is correct from a taxonomic point of view. Given that we are not in the business of having it right, we have a mess and all kinds of claims can be made. Wikipedia is an encyclopaedia and its aim is to present a view, the current understanding. Wikidata is not Wikipedia, it does not need to be. It is a project in its own right. Thanks, GerardM (talk) 08:26, 28 January 2017 (UTC)

@GerardM: Actually, the "Wikipedia is an encyclopaedia and its aim is to present a view, the current understanding" belongs in "What Wikipedia is not: Wikipedia_is_not_a_soapbox". The aim of Wikipedia is to represent "fairly, proportionately, and, as far as possible, without editorial bias, all of the significant views that have been published by reliable sources on a topic." [emphasis added] Other projects like Wikispecies and Commons have chosen to adopt instead a Single-Point-of-View stance, so NPoV is not a universal WMF feature. However, Wikidata is intended to serve Wikipedia, so it should be compatible with NPoV.
@Lymantria: I think a policy of just copy-and-paste of whatever is found in sources is untenable. Just as any user who writes a Wikipedia page should place content in context, a user should make sure that what is entered into a Wikidata item is "structured data" (per the "Wikidata acts as central storage for the structured data" of the main page). - Brya (talk) 09:10, 28 January 2017 (UTC)
@Brya: The tool we have for that is our Notability criteria. In this case, several publications have relied on the supposedly invalid name, and not just a not very reliable database. I think we should refrain from passing over publications that can be considered a "reliable" source. The fact that the name is incorrect does not make the data unstructured. One could even argue that taking away taxon name (P225) etc. removes the available structure. Lymantria (talk) 09:39, 28 January 2017 (UTC)
Wikidata has as one of its functions that it serves Wikipedia. If that is all it does I would not participate. Thanks, GerardM (talk) 15:02, 28 January 2017 (UTC)
@Brya: Please create an "ICZN name" property. This code is supposed to provide a normalized identifier of a concept - a taxon in our case. This is a pretty common situation in Wikidata, and in such cases it's clear which code should be used. This way you will have stuff organized the way you want, while leaving others a chance to adopt another viewpoint, as is standard in Wikidata. Wikidata is a platform that permits web-scale data crossing between several databases. That is one of its obvious use cases, as we have many, many identifiers stored. The ICZN one, from this viewpoint, is an important one, but not an uncommon one. This does not mean it has to eat and overshadow any other. author  TomT0m / talk page 10:32, 28 January 2017 (UTC)
Notified participants of WikiProject Taxonomy: I went bold and sketched a proposal, please read this page. I have not yet submitted it as an official proposal, as it lacks all the technical details. I hope this can put an end to this controversy. author  TomT0m / talk page 10:52, 28 January 2017 (UTC)

Just wanted to make a point as a nomenclatural taxonomist. Under the code a name is supposed to be followed as per usage until such time as a valid nomenclatural act has been published to declare it unavailable for whatever reason. This nomenclatural act is governed by the rules of publication (arts 8-9), and WS does not meet these articles, hence it can only mention an issue but cannot change the nomenclature. It does not matter how problematic a name is: you must use it until it is demonstrated to be at issue. The reason for this is to avoid multiple nomenclatures. Anyone can read the code and possibly determine that a name has an issue and should be unavailable, but others may disagree; if everyone did what they felt was right, we would end up with multiple names. So to determine that a name is a junior homonym on Wikimedia you must cite the nomenclatural act that declared it a homonym; if you cannot, then all you can say is that it may be a homonym under the code, but then use it anyway. I know not everyone follows this, and hence we have multiple names on many taxa. But this is how it is supposed to work. Cheers Scott Thomson (Faendalimas) talk 11:14, 28 January 2017 (UTC)
Yup. Brya needs to get off his high horse and if he feels that strongly about this issue, then he can go ahead and publish a new name himself. If a nobody like me can affect nomenclature, so can he. In the meantime he should respect the principle of no original research. Circeus (talk) 12:57, 28 January 2017 (UTC)
@Lymantria: the Notability policy determines if an entity deserves to have an item (which has not been called in doubt, in this case). Also, we have lots of items for homonyms which appear to fail these Notability criteria, but which have a page on some Wikipedia, so we are stuck with them. We need a way to structure items on homonyms, and it seems weird to invent new criteria of our own (NOR?) when there is a lawbook (accepted worldwide) which was set up for this purpose. - Brya (talk) 12:00, 28 January 2017 (UTC)
@Scott Thomson: I agree that Wikidata should not try to create nomenclatural acts (like creating a new spelling for one of these homonyms), and that we can't anyway. In this case I see no reason to assume that either name should not be "available" (in the sense of the Code). There are lots of provisions in the Code which factor in "prevailing usage" (not always clear in their application), but I see no indication in the Code that it applies to homonyms. It is mentioned in Article 23.9, but there it is a guide to a taxonomist in declaring one of them a nomen protectum and the other a nomen oblitum, and it is this declaration which would be the nomenclatural act. On the other hand, there are ways a taxonomist can deal with homonymy, explicitly set out (Article 52.4, 52.5 and the appeal to the Commission). I see no reason to put Article 52.1 out of commission. - Brya (talk) 12:21, 28 January 2017 (UTC)
@Brya, a nomenclatural act is any declaration that changes the validity, availability or usage of a name. So apart from new taxa and spelling corrections etc, it also includes placing names in synonymy or homonymy. Like many "rule books" the code is written for those for whom it is intended. I am not saying whether this is good or bad but it is what it is. It takes a lot of understanding, discussion and patience to understand it properly. I think it is admirable that you try to apply the code here. Just be wary of the implications of what you do. Always be explicit that you cannot make a nomenclatural act on here. This prevents people thinking you have, when you cannot. One of my objectives is a stable nomenclature, like all members of the ICZN. For a nomenclatural system to be usable it must be stable and clear in its meaning. cheers Scott Thomson (Faendalimas) talk 13:37, 28 January 2017 (UTC)
As you say a nomenclatural act is any formal declaration that establishes a name or changes the application of a name. Placing a name in synonymy is mostly a matter of taxonomy. Whether a name is a homonym is not determined by a nomenclatural act, but by the facts. It is the resolution of a problem with a homonym that takes a nomenclatural act. I see nothing in the zoological Code that says its rules are dependent on action by a zoologist for them to take effect: that would be really weird, anyway. - Brya (talk) 16:32, 28 January 2017 (UTC)
@Brya: "Whether a name is a homonym is not determined by a nomenclatural act, but by the facts." That of course is not entirely correct. The pure homonymy may be a fact, judging the validity of the homonymous names is not. Lymantria (talk) 18:05, 28 January 2017 (UTC)
@Brya: yes it is the correction of the issue that is the act, I agree, which is also what I said, placing two names in homonymy is the act of recognizing the senior and junior homonym rendering the junior name invalid under the code. That is if it goes by priority, it does not always for a variety of reasons. The point is we cannot here designate one name as the senior homonym and one as the junior in an official by the code way, all we can do is effectively state the facts and leave it be. Obviously sending this information to specialists for those taxa is a good step towards resolving it as mentioned later by Neferkheperre in his post on the topic. Cheers Scott Thomson (Faendalimas) talk 00:40, 29 January 2017 (UTC)
I don't think TomT0m's bold approach is very helpful for solving the complexity of the problem. The problem we (and others) face is sketched in an article called Good and Bad Names published at the website of the Global Names Architecture (GNA). If you want to learn more about GNA please read Towards a Global Names Architecture: The future of indexing scientific names (Q22117529). --Succu (talk) 17:04, 28 January 2017 (UTC)
A draft on a nomenclatural ontology for biological names is NOMEN. --Succu (talk) 18:33, 28 January 2017 (UTC)
Here is what we have: We at Wikispecies or Wikidata cannot generate nomenclatural acts here, because of the very nature of our copyright. Publications must be secure, and meet certain conditions. As we build our taxon data bases, we are going to find homonyms, since we are nearly universal in our mission. In my own field, Cirripedia, I have found three, one of which is family-group. As I have been entering reference citations for Zootaxa and Zookeys, I have discovered several more. By ICZN rules, new replacement names of family-group homonyms require application to ICZN for rulings. Genus and species do not. I checked Heterocheilidae in ICZN lists and indexes, and no results.
What to do: For listing in Wikispecies and Wikidata, until new replacement names are published, is to differentiate them, as NAME (Author), or NAME (Higher taxon group), or both. I try to find appropriate specialists and notify them. This has had success, and has helped build Wikispecies status. Neferkheperre (talk) 21:26, 28 January 2017 (UTC)
  1. As agreed on from the first, but raised again any number of times, Wikidata cannot publish nomenclatural acts. The problem cannot be fixed here.
  2. As also agreed on from the first, but raised again several times, it is possible, and recommended, to contact experts on the group in question and draw attention to the problems found. Hopefully, they will take formal action (by my count, I am up to three replacement names that are in press).
  3. @Scott Thomson: whether homonymy exists or not, and what is the senior homonym and what is the junior homonym, each is a matter of objective fact, ruled by the Code. It helps if somebody puts it in print, but that does not alter the facts. Hopefully, when somebody does go into print, he will at the same time also take steps to resolve the issue, which would be a formal act.
  4. @Lymantria, it is indeed important that both names are formally established: if the elder name is a nomen nudum or otherwise not formally established, there is no homonymy. But determining if a name is formally established does not (normally) require a formal act, but again is a matter of objective fact. Indexing centres make such decisions routinely all the time: "this is not a formally established name, as it fails the [...] requirement"; "that one is all right" (and, no, such decisions by indexing centres are not formal actions). Admittedly, there are grey areas, especially with old names recently dug up, that are dubious and have to be referred to higher authority, but these are very much the exception.
  5. As pointed out by Neferkheperre, homonymy in family-group names in zoology is special. Article 55.3.1 requires that such cases are submitted to the Commission for "a ruling to remove homonymy". If such a case is submitted, the Commission is likely to alter, "emend" the spelling of one of these names (likely the least used name), so as to remove the homonymy (example of case). This means that HETEROCHEILIDAE is not like other homonyms, but rather is like Schroedinger's cat: one of these names is going to have a different spelling, but we can't know which, and we cannot provide a definitive spelling for either of these names. - Brya (talk) 06:29, 29 January 2017 (UTC)
Brya, the problem is that you refuse to realise that from the point of view of Wikimedia projects, whether or not these names are correct according to the code is completely irrelevant. As far as I am concerned, an edit such as this basically qualifies as nothing short of vandalism ("Blanking: The removal of most or all of a legitimate piece of information", information which in this case is pretty much completely unaffected by the fact the name is not correct), and if you did anything remotely similar on Wikispecies you'd land yourself with warnings, and I must assume you know this because you have not attempted to remove any of that information from the Wikispecies page. Circeus (talk) 11:04, 29 January 2017 (UTC)
I am sorry to hear you feel that stuff copied off the internet automatically is "a legitimate piece of information". Any database that takes itself seriously is concerned about data quality. - Brya (talk) 17:35, 29 January 2017 (UTC)

Is there an easy way to get a list of all items with a specific type of claim, as well as the claim itself?

Hi, I've been looking for a while for an easy way to do this, but drawn a blank. What I'd like is to get a list (or page-able subset) of all items with claim Universal Decimal Classification (P1190) in them, along with the actual value of the claim made. For instance, a single record for religion in China (Q1482612) might read "Q1482612|221".

It's easy enough to get the list using "what links here", but then I'd have to look individually at every item to get the value which would brutalise the server I suspect. Thanks in advance. Lankiveil (talk) 02:10, 26 January 2017 (UTC).

This should do it. Only 122 items. --Tagishsimon (talk) 03:00, 26 January 2017 (UTC)
That is exactly what I wanted, thanks @Tagishsimon:! Lankiveil (talk) 08:54, 26 January 2017 (UTC).
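
For readers who want to build a similar list themselves, the following is a minimal sketch of one possible approach using SPARQL and the Wikidata Query Service. It is not the query linked above, which is not reproduced on this page. The function names and the sample response are assumptions made for illustration, and nothing here contacts the network: the sketch only builds the query text and formats a result in the standard SPARQL JSON shape into the "Q1482612|221" style requested.

```python
def build_query(property_id):
    """Return a SPARQL query listing every item with a direct claim for property_id.

    The wdt: prefix is predefined on the Wikidata Query Service.
    """
    return (
        "SELECT ?item ?value WHERE { "
        f"?item wdt:{property_id} ?value . "
        "}"
    )


def format_rows(sparql_json):
    """Turn a SPARQL JSON result into 'QID|value' lines."""
    rows = []
    for binding in sparql_json["results"]["bindings"]:
        # Item URIs look like http://www.wikidata.org/entity/Q1482612
        qid = binding["item"]["value"].rsplit("/", 1)[-1]
        rows.append(f"{qid}|{binding['value']['value']}")
    return rows


# Hypothetical sample response, using the example item from the question.
SAMPLE = {
    "results": {
        "bindings": [
            {
                "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q1482612"},
                "value": {"type": "literal", "value": "221"},
            }
        ]
    }
}

print(build_query("P1190"))
for row in format_rows(SAMPLE):
    print(row)
```

Because a single query returns all items and values together, this avoids loading every item page individually.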

About data donations: CC0 (Public Domain)

Hi everyone,

I am working on a future project in which we are going to gather data and use it, with the aim of later migrating its content to Wikimedia projects. In the case of Wikipedia, I understand that any content migrated in its entirety must be licensed under CC BY-SA 3.0 or a more permissive license. I want to be safe when I write the contribution guidelines and set the license for the project, so I was thinking of applying CC BY-SA to the whole project. But if we want to migrate the data that we are going to gather, do we need to set the CC0 (Public Domain) license for the data?

Maybe this is a silly question, but I want to be sure about the process before making any migrations in the future.

Thanks in advance!

Regards, Ivanhercaz (Talk) 11:56, 26 January 2017 (UTC)

Comment: Checking Wikidata's website footer, I imagine that I could set something similar to it, I mean: "data licensed under CC0 (Public Domain), the rest of the content licensed under CC BY-SA". Correct? Regards, Ivanhercaz (Talk) 12:04, 26 January 2017 (UTC)
That sounds right. --Jarekt (talk) 14:04, 26 January 2017 (UTC)
Thank you Jarekt! Regards, Ivanhercaz (Talk) 15:41, 26 January 2017 (UTC)
Data is treated differently from intellectual works under copyright law. There is no copyright on data itself, but there can be copyright on a collection. Take statistics of sports people, for example the WTA site for female tennis players. We can use the data, but we cannot copy the whole set. In everyday life this means we can manually type the info into Wikidata, but we cannot use a script to scrape their website without their permission. In case of doubt you can always contact legal at wikimedia.org for details. Edoderoo (talk) 19:57, 26 January 2017 (UTC)

P1617 regex

I added the regex constraint {{Constraint:Format |pattern=[0-9a-f\-]+ |mandatory=true}} to Property talk:P1617. The constraint report now lists exceptions including cbeab979-c95b-432e-a3bb-2b1d502f4db5 and 1e655b90-b289-4762-9bae-ee980eeae9f9. According to the regex tester I use, these should be valid. What have I missed? I do note that all the unexpected exceptions have repeated, adjacent characters. Is that a coincidence? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:12, 26 January 2017 (UTC)

I guess it is the space at the end of the pattern causing the problem. --Pasleim (talk) 14:33, 26 January 2017 (UTC)
It is best to enclose the pattern by <nowiki></nowiki>. Perhaps the problem is that the final space is (sometimes?) included. Lymantria (talk) 14:38, 26 January 2017 (UTC)
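
The trailing-space hypothesis can be checked offline. The sketch below uses Python's re module and assumes that a format constraint requires the whole value to match the pattern; that assumption, and the helper name is_valid, are illustrative and not taken from the constraint checker itself. The two identifiers are the ones quoted above.

```python
import re

UUID_SAMPLES = [
    "cbeab979-c95b-432e-a3bb-2b1d502f4db5",
    "1e655b90-b289-4762-9bae-ee980eeae9f9",
]


def is_valid(pattern, value):
    """Assumption: a format constraint requires the whole value to match."""
    return re.fullmatch(pattern, value) is not None


# The pattern as it appears in the template call, trailing space included.
PATTERN_WITH_SPACE = r"[0-9a-f\-]+ "
# The same pattern with surrounding whitespace removed.
PATTERN_STRIPPED = PATTERN_WITH_SPACE.strip()

for value in UUID_SAMPLES:
    print(
        value,
        "with space:", is_valid(PATTERN_WITH_SPACE, value),
        "stripped:", is_valid(PATTERN_STRIPPED, value),
    )
```

With the space kept in the pattern, neither identifier matches in full; with it stripped, both do. This does not explain why only some values were reported, which fits the suggestion that the final space is only sometimes included.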

Bug at langlinks API

Hi, I created an article on fa.wikipedia, but when I check the API it doesn't show the new interwiki link.

I purged many times at Wikidata, fawikipedia and enwikipedia, but the API still doesn't show the new interwiki link. Yamaha5 (talk) 21:51, 26 January 2017 (UTC)
@Yamaha5: It is there now
"lang": "fa",
"*": "\u06f1\u06f0\u06f0 \u0645\u0627\u06cc\u0644 \u0647\u0627\u0648\u0633"
 — billinghurst sDrewth 03:10, 27 January 2017 (UTC)
It didn't show for more than 2 hours. The API has a lag, and that should be solved. Yamaha5 (talk) 07:18, 27 January 2017 (UTC)
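
For anyone who wants to repeat this check, here is a rough sketch of building a langlinks request and reading the answer. The parameters action=query, prop=langlinks, titles, lllang and format are standard MediaWiki API parameters; the page title, page ID and wrapping structure in the sample are hypothetical, and only the "lang"/"*" pair is taken from the reply above. The sketch does not contact the network.

```python
import json
from urllib.parse import urlencode


def langlinks_url(api_base, title, lang):
    """Build a query URL asking for the langlink of `title` in language `lang`."""
    params = {
        "action": "query",
        "prop": "langlinks",
        "titles": title,
        "lllang": lang,
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


def extract_langlinks(response):
    """Collect {language code: linked title} from a langlinks response."""
    links = {}
    for page in response["query"]["pages"].values():
        for link in page.get("langlinks", []):
            links[link["lang"]] = link["*"]
    return links


# Hypothetical response; page ID and title are placeholders.
SAMPLE = json.loads(r'''
{"query": {"pages": {"1": {"title": "Example", "langlinks": [
  {"lang": "fa", "*": "\u06f1\u06f0\u06f0 \u0645\u0627\u06cc\u0644 \u0647\u0627\u0648\u0633"}
]}}}}
''')

print(langlinks_url("https://en.wikipedia.org/w/api.php", "Example", "fa"))
print(extract_langlinks(SAMPLE))
```

The \u escapes in the JSON are decoded by the parser into the Persian title, so an escaped reply like the one quoted above already contains the link.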

Query regarding the descriptions for wikidata entities

Hi all,

We have been using the Wikidata JSON dump, and the labels, wiki links and external IDs are very useful. Thanks a lot for such good work. I was looking at the descriptions of Wikidata entities, and it looks like there's not much consistency.

For example:

    Brad Pitt: American actor
    Angelina Jolie: American actress, film director, and screenwriter
    Jennifer Aniston: television and film actress from the United States

Even for cities:

    Tokyo: capital of Japan with 13 million inhabitants, and one of 47 prefectures of Japan
    Seattle: major city in state of Washington, United States; county seat of King County, Washington
    Mumbai: capital and district of Maharashtra, India
    Washington D.C: capital city of the United States

I was thinking that descriptions for Wikidata entities are equivalent to the notable_for attribute in the Freebase dump. Can you please let me know if there's any other attribute in a Wikidata entity which I could use as a short description?

Thanks in advance.

Subramanyam

You could auto-generate descriptions using properties like instance of (P31), subclass of (P279), country of citizenship (P27), occupation (P106) etc. Which properties are useful depends on what kind of entity you are looking at. ArthurPSmith (talk) 15:25, 27 January 2017 (UTC)
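
A minimal sketch of that suggestion follows. It assumes the claim values have already been resolved to English labels and grouped by property ID; the input dictionaries, the function names and the wording template are all made up for illustration and would need tuning per entity type.

```python
def join_labels(values):
    """Join labels as 'a', 'a and b', or 'a, b and c'."""
    if len(values) <= 1:
        return "".join(values)
    return ", ".join(values[:-1]) + " and " + values[-1]


def describe(claims):
    """Build a short description from labels grouped by property ID.

    People (items with an occupation, P106) get 'occupation from country';
    everything else falls back to its classes (P31, then P279).
    """
    occupations = claims.get("P106", [])
    countries = claims.get("P27", [])
    classes = claims.get("P31", []) + claims.get("P279", [])
    if occupations:
        text = join_labels(occupations)
        if countries:
            text += " from " + join_labels(countries)
        return text
    return join_labels(classes)


# Hypothetical, already-resolved labels for two items.
person = {
    "P31": ["human"],
    "P27": ["United States of America"],
    "P106": ["actor", "film producer"],
}
place = {"P31": ["city"]}

print(describe(person))  # actor and film producer from United States of America
print(describe(place))   # city
```

Generated text like this is uniform across items, at the cost of being blunter than hand-written descriptions (note the missing article before the country label).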

Links from items to Commons categories and galleries

Here's an update of results of queries into linking patterns from Wikidata to Commons categories and galleries.

A previous version was posted here at VP and also at Commons VP in December 2015.

There are also some further historical versions, going back to September 2014, for older comparisons.

Columns: Commons categories (5,499,772), Commons galleries (113,395), total linked.

Wikidata articles (~ 22,165,947):
    Commons categories: ~ 1,268,063 (sitelinks: 437,882; P373: ~ 1,209,119)
    Commons galleries: 100,042 (sitelinks: 96,839; P935: 92,865)
    total linked: ~ 1,299,996 (sitelinks: 534,727; props: ~ 1,235,579)

Wikidata categories (2,870,035):
    Commons categories: 396,087 (sitelinks: 387,768; P373: 355,205)
    Commons galleries: 558 (sitelinks: 16; P935: 545)
    total linked: 396,094 (sitelinks: 387,786; props: 355,209)

total linked:
    Commons categories: 1,426,002 (sitelinks: 825,656; P373: 1,326,176)
    Commons galleries: 100,086 (sitelinks: 96,853; P935: 92,898)
    total linked: ~ 1,696,090 items / 1,523,993 pages (sitelinks: 922,499; props: ~ 1,590,788 items / 1,419,074 pages)

Compared to 2015, perhaps the most notable feature is that new sitelinks to Commons continue to be dominated by sitelinks between Commons categories and article-like items here: up 183,682 compared to an increase of 47,384 in sitelinks between Commons categories and category-like items. (The total number of Commons categories has increased by 912,375 over the same period).

This is against how some Wikidatans feel sitelinks to Commons ought to work. However, it does seem to be the clear preference of most users when adding sitelinks, so perhaps the time has come to accept it as mostly harmless. Jheald (talk) 23:29, 26 January 2017 (UTC)

Thanks Jheald. Curated galleries at Commons are a PITA compared with categorisation; they will often never need to exist for many minor players, whereas a category is easy, and can work in multiple ways. I would prefer to see Commons look at whether curated galleries should be subsidiary (or discouraged) in the system, and that makes it even easier to win an argument among WDatans. From the perspective of a WSian, linking to a category at Commons is more beneficial than linking to a gallery; well, it works that way for authors, as many do not write many books. The problem is that here we primarily map categories to categories as a preference, and there is no easy way to have a CommonsCat link to articles, yet many of the sisterwikis may not have categories to match. We need some better way to work through the interweaving of interwikis based upon how the sister prioritises their pages, not how we think that they should.  — billinghurst sDrewth
@Jheald: are you sure users did this and not just one user with a bot? Multichill (talk) 08:48, 27 January 2017 (UTC)
@Multichill: It's possible. I have no idea who has been adding the sitelinks -- I'm not even sure I'd know how to query the history to investigate at scale. If somebody were to have been adding sitelinks with a bot, I'm not clear where they'd be getting their information from. If someone were converting P373s into article --> commonscat sitelinks, why stop at only 180,000 of them? But maybe there are bots out there that add commonscat sitelinks. Jheald (talk) 09:19, 27 January 2017 (UTC)
I often see users creating items by linking a Commons category to a Wikipedia article. The number doesn't surprise me: The most obvious way to add interwiki links is to click the "Add links" link in the sidebar. - Nikki (talk) 12:58, 27 January 2017 (UTC)

Review

Can some experienced user(s) check recent contributions of user GiorgiXIII? There are several items related to army and armed forces affected, including these ones [3], [4], [5], with tens of sitelinks and/or labels changed/removed. There may be some lost sitelinks and/or altered tree classification. Right now I can't spend more time looking into this case. XXN, 00:15, 27 January 2017 (UTC)

I have reverted those edits and am assuming good faith at the moment. It seems to me that the user thought some sitelinks were misplaced and was trying to fix it. —Wylve (talk) 02:52, 27 January 2017 (UTC)
I fixed some sitelink issues that I think User:GiorgiXIII had some issues with. The issue remains that w:lv:Armija is a more comprehensive disambiguation page for various senses of the word "army" (meaning either the military or only land troops). We also have no label (Q3505278) which I am unable to determine what it is supposed to represent. Maybe some speakers of Slavic languages can help. —Wylve (talk) 03:20, 27 January 2017 (UTC)
Added some Slavic descriptions, hope it helps. - Kareyac (talk) 05:06, 27 January 2017 (UTC)
@Kareyac: What is the difference between no label (Q3505278) and army (Q37726), if any? —Wylve (talk) 09:12, 27 January 2017 (UTC)
@Wylve: please try to use a dictionary or some online translator. I tried: ru, be, cs, sr and uk WP articles describe no label (Q3505278) as all ground forces as a class (not Navy or AF) and army (Q37726) as all ground forces or its unit/part/division (eg. Army № ...). I didn't research army (Q37726) in WPs that have no no label (Q3505278) to compare. - Kareyac (talk) 09:55, 27 January 2017 (UTC)
Hello. I am a Georgian man, and I speak Georgian and Russian very well. Georgian შეიარაღებული ძალები — Russian Вооруженные силы, Georgian არმია — Russian Армия, Georgian სახმელეთო ძალები — Russian Сухопутные войска. I am 54 years old. I live in Tbilisi (Republic of Georgia). GiorgiXIII (talk) 13:19, 27 January 2017 (UTC)
@GiorgiXIII: Welcome to Wikidata! You can use Wikidata:ფორუმი or Wikidata:Форум (or this page, of course). Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:44, 27 January 2017 (UTC)

Recent changes for claims involving a particular property

Is there any way to extract a list of recent changes in claims that involve a particular property

-- eg all recent changes to claims involving Cooper-Hewitt Person ID (P2011)

but without including any other changes on items that include a P2011 ?

I feel this is something that could help projects to keep a closer eye on changes involving their key properties, to monitor who has been changing what, and why. Jheald (talk) 14:12, 27 January 2017 (UTC)

User:Yair rand/DiffLists.js could be useful. However, it is a user script which needs to be installed on a per-user base by adding mw.loader.load('//www.wikidata.org/w/index.php?title=User:Yair_rand/DiffLists.js&action=raw&ctype=text/javascript'); to Special:MyPage/common.js. After “installation” it provides additional filter functionality in Special:RecentChanges, Special:Watchlist and Special:Contributions. —MisterSynergy (talk) 15:00, 27 January 2017 (UTC)
User:Pasleim/Lost Values could be useful. Matěj Suchánek (talk) 15:39, 27 January 2017 (UTC)
These look great, really useful. Unfortunately, Pasleim's page seems not to have been updated since April last year. @Yair rand:'s script looks to be exactly what I was looking for; but I can't seem to get it to show me changes earlier than the latest minute -- I was really looking for changes over the last month, or even longer. Is anybody else having this problem? Jheald (talk) 14:39, 29 January 2017 (UTC)
I'm guessing that what's happening is that Yair rand's script can only show me (some of) the 500 most recent edits, which it then filters. Whereas to do that kind of filtration over all the edits in the last month (eg to answer @Multichill:'s question above about who has been adding sitelinks to Commons), may be beyond what can be achieved with a client-side script, and would need something running server side. Jheald (talk) 14:45, 29 January 2017 (UTC)
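
To illustrate the kind of filtering discussed here, the sketch below keeps only change records whose edit summary mentions a given property. It rests on an assumption: that the automatic summaries written for claim edits contain a wiki link of the form [[Property:P2011]]. The record layout (title, user, comment) is modelled loosely on what a recent-changes listing returns, and the sample records are invented. It is not how DiffLists.js works, and it runs on whatever list of records it is handed, so the limit on how far back one can look still applies.

```python
import re


def mentions_property(comment, property_id):
    """True if the edit summary links to exactly this property.

    Assumption: claim edits produce summaries containing [[Property:Pnnn]].
    Requiring ']]' or '|' after the ID stops P2011 from matching P20110.
    """
    pattern = r"\[\[Property:" + re.escape(property_id) + r"(?:\]\]|\|)"
    return re.search(pattern, comment) is not None


def filter_changes(changes, property_id):
    """Keep only the change records whose summary mentions the property."""
    return [c for c in changes if mentions_property(c.get("comment", ""), property_id)]


# Invented sample records for illustration only.
CHANGES = [
    {"title": "Q_EXAMPLE_1", "user": "UserA",
     "comment": "/* wbsetclaim-update:2||1 */ [[Property:P2011]]: 123"},
    {"title": "Q_EXAMPLE_1", "user": "UserB",
     "comment": "/* wbsetlabel-set:1|en */ some label"},
    {"title": "Q_EXAMPLE_2", "user": "UserC",
     "comment": "/* wbcreateclaim-create:1| */ [[Property:P20110]]: x"},
]

for change in filter_changes(CHANGES, "P2011"):
    print(change["title"], change["user"], change["comment"])
```

Run server-side over a full month of edits, the same test could in principle answer who changed which P2011 claims, without listing unrelated edits to the same items.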

Terms

I suppose that all items with P31: music term (Q20202269) should have some other class/superclass (as well, or instead). How can we organize this? Maybe we should even remove all P31:Q20202269 statements because they are too vague? --Infovarius (talk) 14:21, 27 January 2017 (UTC)

The very first one I checked had interwiki links, and therefore languages connected. Those certainly cannot be removed. Edoderoo (talk) 14:43, 27 January 2017 (UTC)
Maybe I was not very clear. I meant to remove the statement "P31:Q20202269" in such items, not sitelinks or items or anything else. --Infovarius (talk) 16:02, 27 January 2017 (UTC)
I would not remove it, as it is not incorrect. If you have a better specification, then feel free to replace them, but destroying info that is not wrong is not going to make it better. Edoderoo (talk) 17:33, 27 January 2017 (UTC)

Cleveland State University

Historical community information should be included for Cleveland State University (OH). Viking Hall and other dormitories were important living communities for students prior to the school's massive renovation. These buildings are a part of CSU's rich campus history as well.  – The preceding unsigned comment was added by 2601:548:c100:499b:242b:c054:333e:817 (talk • contribs) at 16:20, 27 January 2017‎ (UTC).

Feel free to edit Q1100801. --Liuxinyu970226 (talk) 12:02, 28 January 2017 (UTC)

Institution of Mechanical Engineers: Chinese Wikipedia article

Could a Chinese speaker please check whether Institution of Mechanical Engineers (Q1569225) is the same as Institution of Mechanical Engineers (Q15051986)? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:49, 27 January 2017 (UTC)

@GZWDer:. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:57, 27 January 2017 (UTC)
The Chinese article seems to talk about a different organisation with headquarters in Hong Kong. I could not access the official website given, but the archive.org version is here: http://web.archive.org/web/20111205134312/http://www.imeorg.org/en/
Koxinga (talk) 23:28, 27 January 2017 (UTC)
@Pigsonthewing:, unrelated. MechQuester (talk) 05:22, 28 January 2017 (UTC)
Since the zh label of Institution of Mechanical Engineers (Q15051986) was added by Cewbot, @Kanashimi: ^^ --Liuxinyu970226 (talk) 07:01, 28 January 2017 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────

Thanks, all. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:43, 28 January 2017 (UTC)

Wikidata links to English Wikipedia draftspace

There is a discussion relevant to the above topic, here. All comments welcome. --Euryalus (talk) 11:31, 28 January 2017 (UTC)

you know, that link they are referring to in the post, you can thank @MSGJ: for that. MechQuester (talk) 14:31, 28 January 2017 (UTC)

The message there is:

The EnWiki community requests that Wikidata establish a policy against linking to our draft space. Drafts are intended as an internal workspace. Drafts may contain inappropriate or problematical content. External consumption of drafts is undesired, and is strongly discouraged.

So far the straw poll is a unanimous 4 out of 4, endorsing it pretty strongly. If the Wikidata community is agreeable, I'd like to follow up investigating the possibility of some sort of technical barrier to entering draft space links. Alsee (talk) 19:22, 28 January 2017 (UTC)

According to WD:N Draft namespace already isn't accepted as a valid sitelink. Mbch331 (talk) 19:32, 28 January 2017 (UTC)
Mbch331, perhaps I am reading it differently because there was in fact a Draft space link, and an open Phabricator task requesting an upgrade to the functionality of draft links, but I can easily see reading it as not targeting draft links at all. That's the Notability policy for Wikidata items. In order to qualify as Notable, the item needs to satisfy one of the listed criteria. The data item did satisfy the criteria: it had links to Italian and French articles. So Notability was satisfied. The item was valid. Then someone thought it helpful to add a draft link exactly matching the topic. That cannot diminish the already established Notability. Maybe my reading is biased by the circumstances, but it couldn't hurt to more directly target the issue. Alsee (talk) 20:35, 28 January 2017 (UTC)
Can we scan for any other Draft links that might exist? Alsee (talk) 20:48, 28 January 2017 (UTC)
The issue isn't about whether the item is notable. Enwiki doesn't want external links pointing to its draft namespace.
Policy-wise I see no reason why we should go against the wishes of Enwiki on this point. Such links should be removed. It might also make sense to prevent the addition of those links technically. ChristianKl (talk) 21:30, 28 January 2017 (UTC)
Yes, we should do it at the mediawiki level. Draft on en.wp is not very much different from the user subspace, and we do not link to those.--Ymblanter (talk) 22:36, 28 January 2017 (UTC)
  • Comment Definitely agree that a sister wiki should be able to define which namespaces are outward facing for notability. Draft: namespace pages at English Wikipedia are not articles, and should not be linked here.  — billinghurst sDrewth 10:25, 29 January 2017 (UTC)
  • Comment Why a technical barrier? Why not a flag? Why a policy? Why don't English Wikipedia editors edit here, rather than holding straw polls there? You realize people will link to draftspace elsewhere to manage the list, i.e. English Wikipedia does not control inbound linking. Just because English Wikipedia wants to get spun up over a year-old ticket with no action does not mean Wikidata needs to take any action. Slowking4 (talk) 13:07, 29 January 2017 (UTC)
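
On the question above of scanning for other Draft links, here is a minimal sketch of the check itself. It assumes the enwiki sitelinks have already been exported into a mapping of item ID to page title; that input shape and both function names are hypothetical, and how the sitelinks are obtained (dump, query, bot) is left open.

```python
def is_draft_title(title: str) -> bool:
    """Return True if an enwiki page title carries the Draft namespace prefix."""
    namespace, sep, _ = title.partition(":")
    # Compare case-insensitively and treat underscores as spaces,
    # so "draft:Foo" and "Draft:Foo" are handled alike.
    return bool(sep) and namespace.strip().replace("_", " ").lower() == "draft"


def find_draft_sitelinks(sitelinks: dict[str, str]) -> list[str]:
    """Return the IDs of items whose enwiki sitelink points into draft space.

    `sitelinks` maps item ID -> enwiki page title (hypothetical input shape).
    """
    return sorted(qid for qid, title in sitelinks.items() if is_draft_title(title))


if __name__ == "__main__":
    sample = {"Q1": "Draft:Example", "Q2": "Example", "Q3": "draft:Foo"}
    print(find_draft_sitelinks(sample))
```

Only the "Draft" prefix is checked here; whether "Draft talk" should also be covered is a policy question the sketch does not try to settle.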

Semiprotecting properties by default

Hi everyone,

I wonder if we should semiprotect the entire Property namespace, as properties aren't so easy to improve and too easy to vandalise (with an enormous and unpredictable impact). Becoming an autoconfirmed user (or receiving the confirmed flag) is really easy, and unregistered users will always be able to edit the Property_talk namespace or send requests to other users to add, for example, new labels or aliases, so I think that this protection shouldn't prevent anyone from contributing to Wikidata in any way.

What do you think? --abián 18:27, 28 January 2017 (UTC)

  • Support ChristianKl (talk) 21:30, 28 January 2017 (UTC)
  • Support, indeed, I do not currently see any drawback in protecting all the properties.--Ymblanter (talk) 22:32, 28 January 2017 (UTC)
  • Support Good idea. --Jklamo (talk) 23:05, 28 January 2017 (UTC)
  • Oppose Semi-protection may be a good idea for a widely used property like instance of (P31), but I don't believe it's a problem in the first place. I have some 20 to 30 property pages on my watchlist and I have never seen one of these pages vandalised. --Fralambert (talk) 23:28, 28 January 2017 (UTC)
  • Oppose per Fralambert. This would also discourage, if not prevent, new users/ IPs from adding much-needed labels in smaller languages. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:39, 28 January 2017 (UTC)
    • Comment, edit conflict: A look shows us that (almost?) all the recent edits made by unregistered users are tests or vandalism. Unfortunately, I don't think that this proportion of valid/total edits is going to improve at any point in the future. However, the editing frequency, in general, will continue to increase (and, with it, the amount of vandalism on properties) as Wikidata becomes more and more reachable from the Wikipedias. Instead of only showing a padlock, we could include a message informing that, if the user wants to contribute to the property without having a registered account (something that I see as extremely unusual), we encourage them to leave a request (link to an appropriate new pre-filled message on the talk page, or on the project chat, or on a specific page for these cases). --abián 00:28, 29 January 2017 (UTC)
  • Comment While I see the potential for vandalism here, the property namespace is hardly edited by non-patrollers anyway (about 5 to 10 edits per day), and as far as I see, non-constructive edits are reverted faster there than elsewhere. --YMS (talk) 00:08, 29 January 2017 (UTC)
  • Oppose we need labels in all languages.. Once we have them for all 280+ languages maybe.. Thanks, GerardM (talk) 00:17, 29 January 2017 (UTC)
We do need labels for all languages; however, we need good labels for all languages. Labels for small languages are much harder to patrol. Having people who actually understand how Wikidata works write those labels makes some sense. ChristianKl (talk) 06:09, 29 January 2017 (UTC)
  • Support We don't have the resources to watch all properties in all languages. As it is hard to know if a label in a given language is vandalism or not without knowing the language, and we don't have people in all our languages monitoring the changes, then it is a good idea to semiprotect the labels. --Micru (talk) 08:52, 29 January 2017 (UTC)
  • Oppose Until someone can offer an actual demonstrated benefit to this change. This is a proposal, but absolutely no evidence as to why this is being proposed as a good change. Jo-Jo Eumerus (talk, contributions) 09:59, 29 January 2017 (UTC)
  • Neutral with tendency to oppose. I don’t see what could be meant by “vandalise with an enormous and unpredictable impact”; if someone can provide examples, I’d reconsider my decision. —MisterSynergy (talk) 10:12, 29 January 2017 (UTC)
    With that comment I mean that Wikidata isn't only what we see in wikidata.org. Wikidata is (and we want it to be) a knowledge base used by the other Wikimedia projects, third-party projects and lots of external applications of all kinds (currently, even Google uses it for its searches). While vandalising a label for a property can mislead some users in Wikidata without greater impact (as labels aren't interpreted by machines, only by humans), other fatal changes, in the worst-case scenario, could harm, until vandalism is reverted, the entire Wikidata ontology and the we-don't-know-which projects and applications that could load the ontology in that state. --abián 11:33, 29 January 2017 (UTC)
    That’s still pretty abstract. I can imagine that changing URL patterns could be malicious for instance, or maybe even changes in equivalent property (P1628). However most information in property items is not critical, and I would prefer to keep properties unprotected for now. As an alternative: can we technically use abuse filters for critical parts (e.g. prevent URL pattern changes by anons and new users)? —MisterSynergy (talk) 12:28, 29 January 2017 (UTC)
    Yes, we can. I hadn't thought of that possibility, and I like the idea if there's no consensus on this. I would also like to have a filter that lets unregistered users add new labels but doesn't let them modify or remove the existing ones, but this filter wouldn't be workable because users couldn't revert their own mistakes. --abián 13:05, 29 January 2017 (UTC)
  • Comment I am not averse to the proposal if we can demonstrate a low percentage of useful edits, though I would like to see if we can utilise other tools to weed out bad edits first. We should be able to utilise abuse filter rules to more easily monitor changes in that namespace, and test and challenge IP edits, or brand new accounts on their edit with a constructive message, and let confirmed accounts pass through.  — billinghurst sDrewth 10:17, 29 January 2017 (UTC)
  • Support I see 2 reasons to limit changes in properties: the first is vandalism and the second is to avoid contributors changing the scope of the property by modifying the label/description/statements once they have been set up. But this should only be done if a team of translators can provide the majority of the labels/descriptions after the property creation. Snipre (talk) 10:30, 29 January 2017 (UTC)
  • Support Wylve (talk) 10:35, 29 January 2017 (UTC)
  • Comment have you tried flagging / reverting edits with ORES? protection; filters should be a last resort, not first resort. Slowking4 (talk) 12:59, 29 January 2017 (UTC)
  • My experience is that changes in items linked by country (P17) today affect large parts of svwiki. Vandalism there is de facto a larger problem than in the Property namespace. The effect of vandalism in the Property namespace is potentially more critical, but such vandalism is in reality very limited. -- Innocent bystander (talk) 13:23, 29 January 2017 (UTC)
  • Interesting. It could be useful to protect them from vandalism. MechQuester (talk) 13:48, 29 January 2017 (UTC)
  • It would be cool having the possibility of (semi)protecting statements in properties, but leaving labels and descriptions aside. Strakhov (talk) 14:44, 29 January 2017 (UTC)
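
To make the "add-only labels" filter idea from the exchange above concrete, here is a small sketch of the rule's logic. It is plain Python rather than actual AbuseFilter syntax, and the function name and the language-code-to-label dictionaries are assumptions made for illustration only.

```python
def only_adds_labels(old: dict[str, str], new: dict[str, str]) -> bool:
    """Return True if every label present before the edit is still there, unchanged.

    `old` and `new` map language code -> label text (hypothetical shape).
    Labels in languages not present in `old` are allowed, so additions pass.
    """
    return all(new.get(lang) == text for lang, text in old.items())


if __name__ == "__main__":
    before = {"en": "country"}
    # Adding a Swedish label is accepted.
    print(only_adds_labels(before, {"en": "country", "sv": "land"}))
    # Changing the English label is rejected.
    print(only_adds_labels(before, {"en": "kountry"}))
```

The drawback raised in the thread shows up directly: once a user has saved a mistyped new label, correcting it counts as modifying an existing label, so the same rule would block the fix.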

Great Officers of State

Should the Great Officers of State (Lord Chancellor (Q217217), Lord High Steward (Q510373), Lord Privy Seal (Q910308), Lord High Treasurer (Q944583), Lord Great Chamberlain (Q1798290), Lord President of the Council (Q943379), Lord High Constable of England (Q955642), Earl Marshal (Q1265164), Lord High Admiral (Q16153574)) be instance of (P31) or subclass of (P279) to Great Officer of State (Q1544356) ? SJK (talk) 06:56, 29 January 2017 (UTC)

Property-like items

There are these four strange items: PictoRight ID (Q27827683), no label (Q27163421), no label (Q26465959), no label (Q24575428). They look like a property, but they are items in the main namespace. Do we need them for a particular application, or should they be deleted? —MisterSynergy (talk) 12:02, 29 January 2017 (UTC)

I presume the persons who created them just didn't know the proper process for getting a property created. If they are not being used (and I don't think any of them are being used in any claims of other items) I say delete them. (It would be nice if the Wikidata software enabled certain properties to be declared as metaproperties only, i.e. not usable on normal items, and then if it blocked any attempts to use such properties on normal items...) SJK (talk) 12:10, 29 January 2017 (UTC)
How about pinging the creators? ChristianKl (talk) 14:13, 29 January 2017 (UTC)
@Hannolans, Спасимир, Mozel W., Ping08:. --Epìdosis 14:24, 29 January 2017 (UTC)
Maybe PictoRight ID (Q27827683) could be linked in PictoRight ID code (P3361) through subject item of this property (P1629) ✓ Done Not very useful for now, though. The rest, well... Strakhov (talk) 14:30, 29 January 2017 (UTC)
You can delete this item no label (Q24575428). It's not needed. — Ping08 (talk) 14:38, 29 January 2017 (UTC)
Yes, it can be deleted; I was not aware. --Hannolans (talk) 15:06, 29 January 2017 (UTC)
Yes, no label (Q27163421) is to be deleted. Sorry for the inconvenience and thank you!--Spasimir (talk) 15:21, 29 January 2017 (UTC)

Thanks for all your replies. The first item is now linked to the property which has meanwhile been created, the other ones are proposed for deletion. If you need real properties for these databases, please visit Wikidata:Property proposal and request it there. —MisterSynergy (talk) 17:33, 29 January 2017 (UTC)

HHGTTG

The English label on The Hitchhiker's Guide to the Galaxy pentalogy (Q25169), as well as its description ("1979-1992 series of five books by Douglas Adams") and topic's main category (P910) (in English, "Category:Novel adaptations of The Hitchhiker's Guide to the Galaxy") make clear that it is about the series of books, but the en.Wikipedia article is about the whole franchise, including radio plays, television and film. I suspect (but cannot determine categorically) that the latter is true for the other interwiki links. How should this be resolved? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:36, 29 January 2017 (UTC)