Wikipedia:Request a query

From Wikipedia, the free encyclopedia
  (Redirected from Wikipedia:SQL query requests)
Jump to navigation Jump to search

This is a page for requesting one-off database queries for certain criteria. Users who are interested and able to perform SQL queries on the projects can provide results from the Quarry website.

You may also be interested in the following:

  • If you are interested in writing SQL queries or helping out here, visit our tips page.
  • If you need to obtain a list of article titles that meet certain criteria, consider using PetScan (user manual) or the default search. Petscan can generate list of articles in subcategories, articles which transclude some template, etc.
  • If you need to make changes to a number of articles based on a particular query, you can post to the bot requests page, depending on how many changes are needed.
  • For long-term review and checking, database reports are available.

Wiktionary query[edit]

Hello!

I was hoping that I could be assisted with a Wiktionary query. I would like a query with all articles that meet two conditions: it should be a member of deepcat:"Non-lemma forms by language" and Special:WhatLinksHere/PAGE should return 0 pages, it should hence be orphaned. I want in other words a list on all non-lemma words (regardless of language) that has 0 links to it. Grateful for answers. Jonteemil (talk) 23:12, 2 August 2022 (UTC)Reply[reply]

quarry:query/66388. I'm surprised it worked - I'd thought Quarry couldn't handle more than 65536 result rows, and a tree depth of 1 (the category and its direct subcategories) was almost enough for that with some 64000 matching pages.
You sure you don't want pages that have no incoming links from mainspace, not pages with no incoming links at all? —Cryptic 00:16, 3 August 2022 (UTC)Reply[reply]
You're right, from mainspace is what I actually want, if that will find less results, thanks.Jonteemil (talk) 13:40, 3 August 2022 (UTC)Reply[reply]
quarry:query/66402. Despite having fewer raw results the first query, they represent more distinct pages - 81592 compared to 64449 - because the first had duplicate rows. —Cryptic 18:40, 3 August 2022 (UTC)Reply[reply]

All files in category X and category Y, including exif metadata[edit]

Hello. Could someone please construct a query for all enwiki files in both category Self-published_work and category All_free_media, and have the two result columns be page.page_title and image.img_metadata? I tried in this query but I think I messed something up, because the resultset is bigger than I was expecting, and also because it is slow. Thanks. –Novem Linguae (talk) 10:27, 6 August 2022 (UTC)Reply[reply]

It looks right to me. There are a lot of files to check, and people who self-publish on Wikipedia are likely to use a free licence. You could add AND cl_type='file' but it may make no difference: other pages shouldn't be in those categories. Of course, you'll need to join to page and image for the name and metadata. Certes (talk) 12:10, 6 August 2022 (UTC)Reply[reply]
Improved query.Novem Linguae (talk) 12:48, 6 August 2022 (UTC)Reply[reply]

surprising number of username = NULL in image table query[edit]

Hello. Does anyone know why, in my query of some files in the file namespace, most of the results have user_name = NULL? This suggests that almost every file in the 100,000 results has an unknown or logged out uploader. Also, deleting the "AND user_name NOT IN" part gets me the exact same number of results as having it in, and I was expecting it to eliminate some results. user_name is supposed to be the user name of the file uploader. Perhaps I messed something up. –Novem Linguae (talk) 20:58, 9 August 2022 (UTC)Reply[reply]

LEFT JOIN user ON user_id = actor_name, mw:Manual:Actor table. You want = actor_user. —Cryptic 21:19, 9 August 2022 (UTC)Reply[reply]

Exclude blocked users[edit]

Hello. Can I get help getting this query to exclude blocked users please? I tried adding the below and a couple variations (HAVING, wrapping the whole thing and making it a subquery) but couldn't quite crack it. Thank you.

JOIN user ON actor_user = user_id
LEFT JOIN ipblocks ON ipb_user = user_id
WHERE ipb_sitewide != 1

Novem Linguae (talk) 10:34, 12 August 2022 (UTC)Reply[reply]

The query as written should work correctly, if correctly means "isn't blocked at all". If you want "isn't blocked at all, or is blocked only per-page/namespace", then you have to write out WHERE ipb_sitewide IS NULL OR ipb_sitewide != 1. "NULL != 1" isn't true. —Cryptic 11:05, 12 August 2022 (UTC)Reply[reply]
I've had a go at what I think you mean in quarry:query/66612. That version does a left join to sitewide blocks only, so ipb_id is NULL for a unblocked or partially blocked user. Certes (talk) 11:11, 12 August 2022 (UTC)Reply[reply]