In 2019 Wikimedia UK, Archaeology Scotland and The Society of Antiquaries of Scotland recruited a graduate intern through the Scottish Graduate School for Arts & Humanities Internship programme. These funded placements give PhD researchers the opportunity to spend up to three months with a partner organisation, improving their research skills and working on a project which makes a real difference to the host organisation.

The successful applicant was Roberta Leotta, and we planned that she would help to design and deliver a project which looked at the content gap around images of Scottish Archaeology, using Wikipedia, Wikimedia Commons, and Wikidata. 

As this was a remote internship, with occasional visits to the Archaeology Scotland offices, our first step was to give Roberta some introductory training on the Wikimedia projects – and, using some of the materials I put together for the new Scottish trainer cohort, to let her explore the projects. Here are her reflections on the first part of her internship. – Dr Sara Thomas, Scotland Programme Coordinator, Wikimedia UK.


It is very common for a PhD student in Classics to be associated with the image of a bookworm: a person who spends all their time in the library surrounded by books and papyri, and who is usually very unfamiliar with technology and digital resources. Admittedly, some academic fields such as Classics tend to be less innovative and more traditional than others; however, this image is increasingly unrealistic. Even though we study the literature and culture of the past, we need and want to engage with the world we live in – a world which is highly digitised. Moreover, the special circumstances we are experiencing these days are showing how crucial technology is in all sectors, including the Humanities. Unfortunately, despite some recent improvements, the opportunities to build digital skills whilst at university are limited. For this reason, when I saw the opportunity to undertake an internship with Wikimedia UK and Archaeology Scotland, I decided to apply. It seemed a good chance to learn about a widespread digital resource that the academic world needs to engage with.

And indeed, my expectations were fulfilled from the beginning. Within four hours of training on Wikimedia, and in particular on Wikipedia and Wikimedia Commons, I started to notice that something I had taken for granted was, in fact, the result of a very well organised structure. Wikimedia is built by people who work according to shared principles – in primis, guaranteeing free access to the sum of human knowledge.

At the beginning, my main challenges were becoming familiar with the new language and acquiring new processes, rules and attitudes. In this regard, it was interesting to understand the rules to follow in order to create an article suitable for Wikipedia. Unlike the writing I was used to, an article needs to present the state of knowledge on a specific topic rather than the writer’s critical perspective. Wikimedia’s aim is not to give the final interpretation of a particular topic, but to offer starting points for research on it. In other words, Wikimedia offers material which can inform us about the world, and which can also sharpen our critical skills about what we know of the world.

Moreover, an article has to be notable and well referenced, and writing about underrepresented subjects such as women is encouraged. In this regard, I found parallels between the accuracy and the choice of subjects required on Wikimedia and those required in the academic world. The concept of notability can be more challenging and relative, but it still encourages us to distinguish personal interests from knowledge that would benefit humanity as a whole.

I appreciate these principles, but what I appreciate most is how Wikimedia ensures that those principles are followed. This leads me to my last point, for now, which I found very thought-provoking about Wikimedia: its community system. So far, the feeling I have gained from observing Wikimedia at this closer range is that providing knowledge to other people is not an individual effort but a communal one, and that a collaborative attitude, rather than competitiveness, is the key to making it possible.

I wish that not only academia, but also other environments dealing with culture, would take on board the same spirit and attitude towards spreading knowledge and understanding of the world.


Roberta’s project has continued to develop over the months. The COVID-19 situation has – obviously – changed the project somewhat, and we’ll be reflecting on that in a later blog. If you’d like to help us support more interns like Roberta, please consider donating to Wikimedia UK.

Writing for a time of need

20:22, Tuesday, 14 April 2020 UTC

Unlike traditional writing assignments, where a student’s work is ephemeral, the Wikipedia writing assignment allows student work to persist in public reach. Student work can later become highly relevant and important in response to current events. Last spring, a University of Maryland student in Dr. L. Jen Shaffer’s Researching Environment and Culture class created a Wikipedia article about wildlife smuggling and zoonoses to explore the role that wildlife trafficking plays in zoonotic diseases. With the emergence of a global coronavirus pandemic, likely linked to a wildlife host, this student’s work has seen a surge of interest.

Pageviews of the article went from 20–40 views per day to 400–500 views per day.

Before the pandemic, the article was viewed by a couple dozen readers each day. In the month since the coronavirus outbreak was declared a pandemic on March 11, the page has been viewed by an average of 330 readers a day, for a total of 9,869 views! The article includes the section “Exotic trade and disease outbreaks”, which is particularly relevant to readers right now.
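These figures come from Wikipedia’s public pageview statistics, which anyone can query. Here is a minimal sketch (ours, not part of the Dashboard) that pulls daily pageviews from the Wikimedia Pageviews REST API; the article title and date range are assumptions based on the text above.

```python
import requests

API = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
       "en.wikipedia/all-access/user/{article}/daily/{start}/{end}")

def daily_views(article, start, end):
    """Return a list of (date, views) for an English Wikipedia article."""
    url = API.format(article=article.replace(" ", "_"), start=start, end=end)
    # Wikimedia asks API clients to send a descriptive User-Agent.
    resp = requests.get(url, headers={"User-Agent": "pageview-demo/0.1"})
    resp.raise_for_status()
    return [(item["timestamp"][:8], item["views"]) for item in resp.json()["items"]]

# One month after the pandemic declaration on 11 March 2020:
views = daily_views("Wildlife smuggling and zoonoses", "2020031100", "2020041100")
print(sum(v for _, v in views), "total views,",
      sum(v for _, v in views) // len(views), "per day on average")
```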

The student wrote nearly all of the “Exotic trade and disease outbreaks” section of the wildlife smuggling and zoonoses article. Image shows the Dashboard’s Authorship Highlighting tool.

With education disrupted around the world, learners are increasingly turning to Wikipedia to meet their knowledge needs. On average, over 280,000,000 readers have turned to English Wikipedia each day in the month since a global pandemic was declared. Student work can help fill that critical need for free information accessible to a general audience.

Examples of zoonotic diseases and their affected populations, a graphic from the page the student created.

Interested in incorporating a Wikipedia writing assignment into a future course? Visit teach.wikiedu.org for all you need to know to get started.

Wikipedia Student Program Manager Helaine Blumenthal

Wiki Education now regularly supports around 400 courses each term. With thousands of students adding millions of words to Wikipedia every academic year, you’d think we’d know everything there is to know about the Wikipedia assignment, from what it takes to be successful to what students get out of the project. But each term brings its own novelties, perspectives, and lessons. Fall 2019 was no exception.

Tackling bias and correcting the record

Despite its more than six million articles, Wikipedia is still rife with content gaps. Nowhere is this more apparent than in subjects related to women, minorities, and other underrepresented populations. Increasing equity of representation is currently one of Wiki Education’s chief strategic goals, and we’re incredibly proud of our instructors and students who are continually helping us to close Wikipedia’s content gaps.

According to our Fall 2019 instructor survey, 80% of instructors felt that the Wikipedia assignment made their students more socially and culturally aware. In learning how to contribute to Wikipedia, their students learned how to identify bias and to understand the role that bias plays in knowledge production and consumption. Similarly, 97% of instructors reported that the Wikipedia assignment helped their students develop a sense of digital citizenship. In other words, in contributing to Wikipedia, students not only learned how to catch bias but also felt a responsibility to remedy inequities and to ensure that Wikipedia has the best possible coverage of subjects that are traditionally underrepresented, misrepresented, or missing from the record entirely.

Filling in content gaps on Wikipedia is only one piece of the equity puzzle. Who contributes to Wikipedia is just as important as what they contribute. Since its inception, Wikipedia’s editor base has been largely white and male, which accounts for many of Wikipedia’s knowledge gaps and issues of systemic bias. For years, we’ve known that roughly 60% of the students in our program are women, and this year we also collected data about the instructors who decide to run Wikipedia assignments. Based on our Fall 2019 survey, about 65% of the instructors in the Student Program are women. This figure is far higher than in academia at large, where only 49% of faculty are women and only 38% of tenured positions are held by women. Not only are we introducing more women to the Wikipedia editing community, but we’re hopefully also breaking down barriers that women still face in academia more widely by allowing them to explore different avenues of scholarship.

The Wikipedia assignment as community service

While for some, learning to identify and remedy bias is an intellectual endeavor, for others, it is an act of community service. Many of our instructors and students come to see the Wikipedia assignment as a public service. As one political science instructor wrote, “This has been a wonderful way for me to embed service learning (a high-impact practice) into my course.” Another instructor wrote the following, “Because I chose to have them focus on authors and illustrators of African American Children’s and Young Adult Literature, almost without exception, the students felt that 1) they learned a lot about someone about whose life they knew nothing about before even if they had read the author’s work or seen their illustrations in the past as children or in their library work and 2) they had done a great service for racial equity to the international community because of improving information access about African American authors and illustrators who write for young people.”

The work students do on Wikipedia can truly have a global reach. “This assignment helped create a digital presence for Lebanese and Arab authors who were almost unknown to the younger generations because they had no digital presence,” remarked one instructor. “It helped contribute to digitizing the Lebanese literary heritage and place Lebanon’s literary heritage on the global map.” Another instructor wrote about how one of her students added a section on Islamic contributions to the Manuscript article, which had previously contained only sections on Western influences. This single act was “a powerful example of how students can contribute to an ongoing effort to decolonize and globalize the field of art history.”

As one instructor put it, “Students mentioned that this assignment helped them also gain an understanding of activism beyond ‘the streets.’ This truly was an impactful assignment for many of them!” Though the Wikipedia assignment takes place online, it’s fundamentally an act of engagement – engagement with knowledge and with the world that shapes how knowledge is produced and disseminated. And this is truly what it means to become a digital citizen!

Energizing learning

Contributing to Wikipedia can be a scary proposition for students and instructors alike. Students seldom write for anyone other than their professors or TAs, and instructors worry about the vulnerabilities inherent in writing for a public audience. While taking those first baby Wikipedia steps might be difficult, the excitement that the Wikipedia assignment engenders is palpable. According to one student, “I’m going to start this with some real transparency: I had a deep, dark sense of dread when it hit me that I would have to edit a real, live Wikipedia page. I mean, come on… a real one? Not some dummy Wikipedia page that would be on our blog, that only our professors could see? No, I would have to make my edits for the entire world (literally, the entire world) to see. Would anyone really even see them? Does anyone really care about 5G wireless technology other than my nerd self? Once it came down to it, editing was a lot easier than I thought. I have always wondered how in the world to edit a Wikipedia article, and I am still so surprised at how easy it was. The amount of work that the Wikipedia community does to maintain itself is pretty remarkable. Though I do not anticipate that I will become an avid Wikipedia editor, going forward, I will honestly not hesitate to fix any mistakes or add any facts I find relevant to articles. The training helped me be prepared, and it wasn’t as scary as I thought. It actually kind of made me feel like I’m contributing to the greater good, and isn’t that what we’re all here for, anyway?”

The project is equally invigorating for instructors. As one instructor reported, “I think I was a better instructor this semester because I was excited about this assignment.” Another remarked, “I am so grateful to Wiki Education for this opportunity to learn how to contribute to public scholarship via Wikipedia. The process has been intellectually invigorating and I know it has inspired many of my students as well. THANK YOU!” Yet another instructor wrote, “This assignment has rejuvenated me and given me a lot of ideas about the role of writers and scientists in the world!”

In learning to contribute to Wikipedia, students gain a number of important practical skills, from digital literacy to online collaboration, but it’s often the more intangible aspects of this project that make it truly worthwhile. As usual, our instructors put it best. “Do it, don’t chicken out. It’s daunting but I am smiling from ear to ear seeing how much the topics I care about now have better coverage on Wikipedia.”

The numbers

Though numbers only tell half the story, we’re proud of them nonetheless! In Fall 2019, we worked with 388 courses in fields ranging from Advanced Ecology and Evolution to US Women’s History. Roughly 7,500 students enrolled in the program, and collectively they added close to six million words to Wikipedia across 6,450 articles. Additionally, they added almost 60,000 references and created 705 new entries. During the term alone, their work was viewed close to 200 million times!

We know that one of the strengths of the Wikipedia assignment is its public-facing nature. As one instructor wrote, “The assignment fostered a sense of pride and ownership in my students, of having contributed to a lasting academic legacy that was published and available via a google search.” Rather than being read by one person at the end of the term, our students’ work was viewed millions of times and will continue to be viewed well beyond the end of the class. Millions now have access to information that would otherwise have remained within the walls of the ivory tower, and we have our students and instructors to thank.

The COVID-19 pandemic attracts attention to its every aspect: the epidemiology, virology, vaccination, co-morbidity. Add a heady mix of economics, profiteering and graft, and what are you to make of it all? What is fact and what is not?

When I read that there is an "Outbreak Management Team" in the Netherlands, an advisory body to the Dutch government, I had a look. I added all the known scientists to Wikidata, looked for "authority identifiers" and attributed some of the papers that are likely theirs to them. It generated a really nice Scholia for them and the team as well.

At first I wanted to do the same for similar European organisations, but it takes quite some effort to find them. So I took the easy route and went for the CDC. Its organisational chart contains a wealth of smaller orgs, among them the NCIRD, which has its own organisational chart. I followed the same routine: adding the obvious scientists to Wikidata, looking for their authority identifiers, and attributing papers.

The best bit? While adding people one at a time, you see how the Scholia evolves. Authors are reordered based on their number of papers; you find the ones that are co-authors and colleagues. The latest papers are shown first. It is nice. However, this covers management only; I cannot wait to see the Scholia evolve as staff find their place in it as well.
Thanks,
     GerardM

By Kunal Mehta

For the past 5ish years, I’ve been working on a project called libraryupgrader (LibUp for short) to semi-automatically upgrade dependency libraries in the 900+ MediaWiki extension and related git repositories. For those who use GitHub, it’s similar to the Dependabot tool, except LibUp is free software.

One cool feature that I want to highlight is how we are able to fix npm security issues in generally under 24 hours across all repositories with little to no human intervention. The first time this feature came into use was to roll out the eslint RCE fix (example commit).

This functionality is all built around the npm audit command that was introduced in npm 6. It has a JSON output mode, which made it straightforward to create an npm vulnerability dashboard for all of the repositories we track.
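For a sense of how little glue such a dashboard needs, here is a rough sketch for illustration – not LibUp’s actual code – that runs `npm audit --json` in a repository and summarizes the advisories; the field names follow the npm 6 report format, which varies between npm versions.

```python
import json
import subprocess

def audit_summary(repo_dir):
    """Run `npm audit --json` in repo_dir and summarize the advisories."""
    # npm exits non-zero when vulnerabilities are found, so don't check
    # the return code; just parse the JSON report from stdout.
    proc = subprocess.run(["npm", "audit", "--json"], cwd=repo_dir,
                          capture_output=True, text=True)
    report = json.loads(proc.stdout)
    for advisory in report.get("advisories", {}).values():
        print(f'{advisory["module_name"]}: {advisory["title"]} '
              f'({advisory["severity"]})')

audit_summary(".")
```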

The magic happens in the npm audit fix command, which automatically updates semver-safe changes. The one thing I’m not super happy about is that we’re basically blindly trusting the response given to us by the npm server, but I’m not aware of any free software alternative.

LibUp then writes a commit message (mostly by analyzing the diff), fixes up some changes since we tend to pin dependencies, and then pushes the commit to Gerrit to pass through CI and be merged. If npm is aware of the CVE ID for the security update, it will also be mentioned in the commit message (example). In addition, each package upgrade is tagged, so if you want to, e.g., look for all commits that bumped MediaWiki Codesniffer to v26, it’s a quick search away.

Lately, LibUp has been occupied fixing the minimist prototype pollution advisory through a bunch of dependencies: gonzales-pe, grunt, mkdirp, and postcss-sass. It’s a rather low priority security issue, but it now requires very little human attention because it has been automated away.

There are some potential risks. Someone could install a backdoor by putting an intentional vulnerability in the same version that fixes a known, published security issue. LibUp would then automatically roll out the new version, making us more vulnerable to the backdoor. This is definitely a risk, but I think our strategy of pulling in security fixes automatically protects us more than the potential downside of malicious actors abusing the system (also because I wouldn’t absolutely trust any code pulled down from npm in the first place!).

There are some errors we see occasionally, and could use help resolving them: T228173 and T242703 are the two most pressing ones right now.

About this post

This post originally appeared on Kunal Mehta’s blog, The Lego Mirror, on 5 April 2020.

Featured image credit: The Maughan Library round reading room, Colin, CC BY-SA 4.0

Tech News issue #16, 2020 (April 13, 2020)

00:00, Monday, 13 April 2020 UTC

weeklyOSM 507

11:43, Sunday, 12 April 2020 UTC

31/03/2020-06/04/2020


The trend observed by Pascal Neis in the week of 16 March seems to have been broken. 1 | © Pascal Neis © map data OpenStreetMap contributors

Mapping

  • [1] Pascal Neis noticed that daily activity in OSM has been declining, but isn’t sure whether it should be attributed to COVID-19 and people having more important things to do, less outside activity, or fewer active paid mappers. In his two tweets he provided charts of daily active mappers and editor usage statistics for the past month, along with the raw data.
  • Phyks asked whether a cycleway can be limited to mountain biking, as they felt that the documentation for highway=cycleway is more about city cycling. We are sure that one of the more than 80 answers in the mailing list thread has helped them with their query.
  • Ty Stockman suggests extending the tagging of medical facilities with urgent_care=yes/no. However, the term ‘urgent care’ does not seem to match perfectly with what he meant.
  • Georg von der Howen came across a discussion on Reddit regarding the mapping of changed opening hours during the COVID-19 pandemic that he would like to bring to a broader audience. There are currently five tags with a :covid19 namespace that allow users to record deviating information that applies only as long as the crisis lasts. In his opinion, the :covid19 namespace can be applied to opening_hours, description, note, delivery and takeaway (see the example after this list). The origin of these extensions seems to be in France, where many places in Paris have already been tagged accordingly.
  • Pascal Neis has made another service available for mappers. On Twitter he announced the availability of a map that shows the latest edits per tile.
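As a concrete example of the tagging described above, a restaurant that has switched to takeaway and delivery during the lockdown might carry tags like these (a hypothetical combination of the five :covid19 keys mentioned in the list):

```
opening_hours:covid19=Mo-Fr 11:00-20:00
takeaway:covid19=yes
delivery:covid19=yes
note:covid19=reduced service during the pandemic
description:covid19=takeaway and delivery only
```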

Community

  • Ça reste ouvert, the map of places open during the COVID-19 lockdown, has collected opening hours in France for almost 20,000 places within three weeks. The French community collaborated with several other communities and has added new countries to the map: Germany, Switzerland and Austria (as ‘Bleibt offen’), as well as Spain and Andorra. Their GitHub repository provides an issue template to request coverage for your country. The French service provider TransWay has published a mobile app for iOS; Eric Afenyo is developing one for Android.
  • Ed Freyfogle, from ‘OpenCage Data’ and ‘Geomob’, talked with Andy Allan, long-time OSMer and Geomobster, maker of OpenCycleMap, and founder of Thunderforest. They discussed building a bootstrapped geo business generally, and specifically the challenges of creating a business around OpenStreetMap.
  • Austin Bell started a new OSM-related podcast named ‘Nodes and Ways’. In his first episode he interviewed Maggie Cawley, Executive Director of OpenStreetMap US.
  • A week-long campaign took place in the Russian OSM community in early April (from 30 March to 5 April). It was dedicated to closing notes (ru) (automatic translation) and 48 participants closed almost 1500 notes during the week. However, more than 19,000 notes are still open around the territory of Russia.
  • Valery Trubin continued his series of interviews with OSMers. He spoke to Alexander Pavlyuk (automatic translation) about shooting orthophotos with planes and Maxim Dubinin (automatic translation), founder of NextGIS, about the quality of OSM data and why many people don’t trust it. (ru)


OpenStreetMap Foundation

  • Joost Schouppe from the OSMF board announced the start of OSMF’s Microgrants Committee.
  • Michael Spreng, from OSMF’s Membership Working Group, published a proposal for free OSMF membership for people who have made a ‘sizeable contribution’ to the project. The suggestion is to grant free membership to requesting mappers who have contributed on at least 42 calendar days in the last 365 days. You are asked to provide feedback by commenting on the blog post, writing to the OSMF mailing list thread, or contacting the Membership Working Group directly.

Maps

  • Dave Bolger has produced an OSM-based map showing the 2km radius around your home. In Ireland you are limited to exercising within 2km of your home as part of the restrictions in place to combat COVID-19.
  • Primera Edición reports (es) (automatic translation) that the Province of Misiones, Argentina, has produced a map allowing users to visualise which essential services are closest, so as not to violate the current isolation decree. The map was created using uMap.
  • The Heidelberg Institute for Geoinformation Technology (HeiGIT) at Heidelberg University, has developed, together with Markus Ries from the Center for Pediatrics and Adolescent Medicine, University Hospital Heidelberg, a Map of COVID-19 Clinical Trials.

switch2OSM

  • The Russian project ‘Historical memory of the cities’ (ru) is a website where anthropologists and sociologists post fragments of different people’s oral stories about their home, neighbourhood or city. It uses OSM as a basemap.

Open Data

  • The British Geospatial Commission announced that unique identifiers for addresses and streets will be available as open data under the Open Government Licence from July 2020. Owen Boswarva has tweeted, as a thread, a preliminary detailed analysis. The actual addresses remain proprietary and any associated geometries will be generalised.

Software

  • AnyFinder, the POI finder app for iOS, has added support for the current five :covid19 namespaced tags. The developer has also created a special ‘COVID19’ category within the app’s ‘Events’ section, where users can quickly search for restaurants that are now closed but offer takeaway services during the pandemic, for shops that now offer home delivery, or for places with additional information about their operation during the pandemic.
  • HeiGIT increased the API quota for the Openrouteservice multi-vehicle route optimisation endpoint. The optimise endpoint of Openrouteservice is based on the Vroom engine by Julien Coupey.

Releases

  • JOSM has progressed to the stable version 20.03. The new version brings support for the Arabic language and support for Eastern Arabic(-Indic) and Khmer numerals, along with multiple other minor enhancements and fixes.

Did you know …

  • … about onosm.org, a site through which businesses can ask to be added to OSM?

Other “geo” things

  • Google is using anonymised location data to produce community mobility reports for public health officials, to help as they make critical decisions to combat COVID-19. Jen Fitzpatrick’s blog post gives some background to the project and the reports are available here.
  • Esri, a major international provider of proprietary geographic information systems software, featured the availability of a new set of feature layers in ArcGIS Online created with OSM live data, in a blog post.

Upcoming Events

Many meetings are being cancelled – please check the Calendar on the wiki page for updates.

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Polyglot, Rogehm, SK53, Silka123, SunCobalt, TheSwavu, derFred, geologist, ghowen, osmapman.

Writing about COVID-19 on Wikipedia

10:26, Sunday, 12 April 2020 UTC

Last month was eventful not only in terms of my personal and professional life, but also in terms of my volunteer work. Through March and April, I have been regularly writing articles on English Wikipedia about COVID-19 – mostly about the medical aspects, the impact of the pandemic, and the people leading the response to it.

I am used to doing everything in a structured way on Wikipedia, but COVID-19 changed everything. I usually take days or weeks to think about a new project on Wikipedia, then create a timeline and a work plan, and then work systematically on each aspect of the work. But in a crisis situation like a pandemic, this level of structure is not possible, so I am helping out wherever help is needed. Nowadays, I log in to Wikipedia in the morning, read the updates about the pandemic there, and then go searching for topics that are missing. Given how recent the pandemic is, there is usually a lot to write about, especially its socio-economic impact. In addition, the tables on the disease’s epidemiology need to be updated, new regulations and lockdowns passed in various countries need to be added, and biographies of notable individuals working on COVID-19 need to be created. I work on all these aspects.

I get my references from all kinds of sources, thanks to most journals making their COVID-19 research papers open access. Many magazines and newsletters like The Economist have made their articles related to COVID-19 subscription-free. The WHO, UNFPA, UNICEF, Human Rights Watch, Amnesty International and many other organisations have also created several documents related to COVID-19 and the impact of the pandemic on various spheres of life. I have generously drawn content from all these sources for creating and expanding articles on Wikipedia.

I have mostly been following the World Health Organisation (WHO) for the latest disease updates, so I mostly bring information from the WHO to Wikipedia. As of 9 April 2020, I have written around 25 articles related to COVID-19 on Wikipedia. The most popular one so far is 2020 coronavirus pandemic in Kerala. The article I am most proud of is Gendered Impact of the 2019-20 coronavirus pandemic. The article which I think will be the most useful is List of unproven methods against COVID-19, given the misinformation circulating about the disease. Nearly 700 of the edits I have made on English Wikipedia thus far are on articles related to COVID-19, and the articles I started have been viewed around 35,000 times a day over the last month.

What am I going to do next? We are still in this pandemic and the situation is rapidly evolving (for better or for worse, we don’t know yet). So, I am going to take everything one day at a time, doing what is important for today, not making any long term plans. I will continue to do what I am doing right now on Wikipedia, until help is no longer needed. As a Wikipedian, doctor and researcher, this is the least I can do to empower people around the world to get open and reliable information about COVID-19.

Stay safe, y’all.

 

ListeriaBot is a bot that maintains lists based on information in Wikidata. In this blog post I will explain what a Listeria list is and what it is used for. I will point out its qualitative benefits and explain how Listeria can be instrumental in limiting bias, stimulating collaboration, and helping us share in the sum of the knowledge available to us.

The heart of a Listeria list is a query. The query defines what data is retrieved from Wikidata, including the order of presentation, and the information is shown in a language depending on the availability of labels.
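To make that concrete, here is a minimal sketch of the kind of query that could drive such a list – every recipient of a given award, with an English label where one exists – run directly against the Wikidata Query Service. P166 is Wikidata’s “award received” property; the award item Q1234567 is a placeholder.

```python
import requests

# A Listeria-style query: all recipients of a given award, with labels.
# Q1234567 is a placeholder for a real award item; P166 is "award received".
SPARQL = """
SELECT ?person ?personLabel WHERE {
  ?person wdt:P166 wd:Q1234567 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

resp = requests.get("https://query.wikidata.org/sparql",
                    params={"query": SPARQL, "format": "json"},
                    headers={"User-Agent": "listeria-demo/0.1"})
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["person"]["value"], "-", row.get("personLabel", {}).get("value", ""))
```

On a wiki, the same query sits inside the list definition, and ListeriaBot re-runs it on its daily pass.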

Listeria lists are defined only once, and every day a job run by ListeriaBot updates all lists with the latest data from Wikidata. In this way, available information is provided even when articles are still to be written. When there is an article to read, the label is shown in upright type; when there is not, it shows in italics.

The biggest difference between a Wikipedia list and a Listeria list? No false friends. When you seek a specific “Rebecca Cunningham”, it is really powerful to know that your Prof Cunningham will always be known as Q77527827 and is also authoritatively known by other identifiers. From a qualitative point of view, such disambiguation is a big thing, particularly in lists with red links and even blue links. At this time, a typical Wikipedia list has an error rate of around 4% because of disambiguation issues. I have frequently blogged about this; the Listeria list I often refer to is for the George Polk Award.

Maintenance is another reason to choose Listeria lists. This was documented by Magnus: a list was maintained as a Listeria list up to a point in time, and then, for all the wrong reasons, human curation was to prevail. Magnus compared the results after some time, and the human-maintained list proved to be the poorly maintained one.

Categories are lists of a kind, and for many categories it is defined what they contain. Consequently, Wikidata is easily updated from the Wikipedias and can serve as a source for updating categories as well.

Ok, the impasse. ListeriaBot is blocked because of a false friend issue. The objective is to find a resolution that will benefit us all. The false friend issue is that images can have the same name on both Wikimedia Commons and English Wikipedia. The existing algorithm for showing pictures is that local pictures take precedence. If ListeriaBot is to do things differently, it can. Thanks to the wikidatification of Commons, we can indicate with a Wikidata identifier what a picture “depicts”. Wikidatification of images can also be introduced for pictures on English Wikipedia, and it then becomes easy to always show what Commons has, unless a preference is given to show a specific image for a particular project.

I have been told that I do not assume good faith. When I see the lengths to which people are prepared to go to resolve this issue, I am only amused. The objective of what we do is to share in the sum of all knowledge and to do this in a collaborative way.

English Wikipedia fails spectacularly by assuming that their perceived consensus is in the best interest of what we aim to achieve. There is no reflection on the quality brought by Listeria, there is no reflection on how its quality can substantially be improved. I fail to understand what they achieve except for feeling safe by insisting on dated practices and dated points of view.

I wish we could be one community that is known by a best of breed effort with one common goal; sharing the sum of all the knowledge that is available to us.
Thanks,
        GerardM

Finding your students’ work

19:55, Friday, 10 April 2020 UTC

Having trouble finding the work your students did on Wikipedia? We’ve been working on making the Dashboard a better tool for evaluating student work, but we know there’s still more work to be done. The Wiki Education technology team — myself and Wes Reid — have been meeting with many instructors over the last few weeks to see how finding student contributions works in practice, and we’ve got plans to improve many of the confusing aspects we learned about. In the meantime, here are some tips for sorting out what students did.

1. Use the ‘Article Assignments’ view from the Students tab.

The Article Assignments view of the Students tab

This interface on your Dashboard course page is designed to highlight the main assigned article(s) a student is working on, as well as the assigned peer reviews of classmates’ drafts. When you select a student, you’ll see which articles are assigned, and you’ll see the same ‘Bibliography’ and ‘Sandbox Draft’ links that the student sees from their ‘My Articles’ view on the Home tab. (You can also use this view to assign articles and peer reviews to a student.) The page icon at the right of each assigned article is the Article Viewer, which you can use to see the latest state of an assigned article, along with Authorship Highlighting to indicate which parts of the article were written by your students (see below). The key limitation of Authorship Highlighting is that certain content students might add to an article — including infoboxes and, in most cases, references — does not get highlighted.

Article Viewer, highlighting which text was contributed by the student

2. Find the ‘sandboxes’ for each student.

The Students tab overview lists each student, as well as the amount of content they added to live articles (“mainspace”), sandboxes (“userspace”), and the “Draft” section of Wikipedia (where student work may inadvertently end up). Beneath the username of each student, the ‘sandboxes’ link goes to a list of every sandbox page associated with that student’s account.

The Students tab overview, showing what each student has done in broad strokes

Some sandboxes, like the “Evaluate an Article” page, are typically created as part of training modules and exercises. The default one titled “sandbox” is where students may have drafted their work, especially if they got started before they added their assigned topic on the Dashboard.

The list of all sandboxes for this student’s account

3. Look at the full set of edits for a student.

The ‘edits’ link, which also appears beneath a student’s username on the Students tab overview, goes to the full list of user contributions — listing every edit made with that account.

The full user contributions list for a Wikipedia account

This list shows the timestamp of each edit (in UTC, not your local timezone), how much content was added or removed, and which page was edited. The ‘diff’ link for each entry will show you exactly what text was added/removed in that edit.
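If you prefer to script this check, the same contribution data is available from the standard MediaWiki API via its `usercontribs` module. A minimal sketch (ours, not a Dashboard feature; the username is a placeholder):

```python
import requests

def user_contribs(username, limit=20):
    """List a Wikipedia account's recent English Wikipedia edits."""
    resp = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={"action": "query", "list": "usercontribs",
                "ucuser": username, "uclimit": limit,
                "ucprop": "ids|title|timestamp|sizediff",
                "format": "json"},
        headers={"User-Agent": "contribs-demo/0.1"})
    resp.raise_for_status()
    return resp.json()["query"]["usercontribs"]

for edit in user_contribs("ExampleStudent"):
    # Timestamps are UTC, matching the on-wiki contributions page.
    print(edit["timestamp"], edit["title"], edit["sizediff"])
```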

If you can’t find what you’re looking for from the full list of contributions, the remaining possibilities are:

  • edits didn’t get saved (e.g., if a student composed a draft but never saved it using the ‘Publish page’ button)
  • the student made edits while logged out, so those edits will be attributed to their IP address rather than their username (and may be hard to track down)
  • the edits were deleted by an administrator (most commonly because of sourcing problems with a new page, or because of copyright violations)

For tips on grading student work, click here. Or check out our other blog posts about assessment.


Not signed up on our Dashboard yet? (It’s free!) Head on over to teach.wikiedu.org to get started.

Many of us are at home, waiting to go out. We are all obsessed with the latest statistics and read what pundits have to say.  It is likely that you are cognizant of the statistics for your country, state or county.

I learned that Jonathan P. Tennant died in a traffic accident. When you care about statistics, you will wonder: what are my chances of dying in a traffic accident at this time? Deduct that from your chances of dying of corona, and things look up.

Not so much for Protohedgehog; he met with an accident. It is sad – he was young and full of promise, and had just become a member of the Global Young Academy. If anything, it serves as a reminder for us to look left, right and left again, so as not to become a bus factor.
Thanks,
     GerardM
Rivka Genesen. (CC BY-SA 4.0)

Our introductory linked data course provides support and resources for professionals to incorporate Wikidata into their different projects and goals. Rivka Genesen participated as Assistant Director of Library Services at The Ursuline School. 

My students, consciously or unconsciously, often show me the parallel sets of skills they have developed when we sit down to work out a research plan. These are the rules for science, these are the rules for English, this is what Social Studies expects, here’s what I say I do, and here’s what I leave out. Being outside of their classroom (school librarians have the chance to be the fun aunt), they tend to give me a more honest answer about their study habits. I return in kind. I often hear myself tell the student in front of me, I’m interested in teaching the person and student you actually are, rather than teaching the student I want you to be.

I spent the spring of 2019 reading and researching information literacy for middle and high school students. My summer was reserved for writing this curriculum, along with a scope and sequence to accompany it. My students have little agency in their education – 6th through 12th grade are compulsory, and going to class is the adult equivalent of a day full of back-to-back meetings with different norms. In order to make information literacy foundational, I first want to meet the person my students are when they pick up their phones at lunch and open their laptops in the evenings.

When I emerged from Wiki Education’s daylong seminar at METRO NYC in July, it felt as though a ton of bang snaps had been emptied onto the floor of my mind. Wikidata is the linked data repository behind Wikipedia and it powers many answers that smartphones could spit out. Being able to explain what that meant would help equip students with all sorts of skills to reflect on the information landscapes that they consume without thinking.

Ask a student when they first learned about Wikipedia. You will likely get the same answer that I did: they first heard about this tool in 5th or 6th grade when they were told that it was not a reliable resource and that they couldn’t use it. And do you use it anyway but leave it off your bibliography? I ask in response. The answer is almost always affirmative. As I learned more about Wikidata this summer, it struck me how deeply problematic it is to skip over linked data, especially in its relationship to search in everyday life, in the classroom and out. How Siri and Alexa and that box in Google get their information appears to be as magical as electricity. It’s a miracle to turn the lights on and off, but there is wiring we cannot see. These inventions are made no less spectacular when the way they work is made visible.

Just like school librarians strive to support teachers and students with their expertise and research assistance, Wikidata supports Wikipedia (and the internet at large) by structuring data to make it both human and machine readable. This deeper understanding of how information is organized and created will help librarians better support their communities as they consume information.

A quick search of “wikipedia lesson plans” pulls up the evergreen articles on whether or not to allow Wikipedia in the classroom. I appreciate these as a way to open a conversation, but I want more: How do I work with my school community on interacting with “banned” sources and tools that evolve? And how do I facilitate discussions about media literacy with students?

A line from “How To Be Happy: Another Memo To Myself” by Stephen Dunn often comes to mind: “In large groups, create a corner in the middle of the room.” Sometimes this means creating the corner I want to be in; other times it just means claiming a seat. I know for sure that I’m not the only secondary school educator who wants to help students consume information responsibly. But, as I write this, I feel the anxiety of working without an explicit template. The only way I know how to do this is to work aloud, to claim my seat and create my own corner, make it welcoming, and get cozy.


Registration for our upcoming Wikidata courses is open until tomorrow! New to linked data? Join the open data movement in our beginner’s course. Have more experience with linked data or Wikidata? Sign up for our intermediate course that focuses on possible applications for your institution.

This Month in GLAM: March 2020

08:30, Thursday, 09 April 2020 UTC
  • Australia report: Know My Name; Public libraries of Queensland join Wikidata
  • Colombia report: Gender gap, Wikipedia and Libraries from the GLAM team
  • France report: WikiGoths; WikiTopia Archives
  • Indonesia report: Volunteers’ meet-up; Wiki Cinta Budaya 2020 structured data edit-a-thon
  • Ireland report: Video tutorials; Celtic Knot Conference 2020
  • Kosovo report: WoALUG and NGO Germin call Albanian Diaspora to contribute to Wikipedia
  • Netherlands report: Nationaal Museum van Wereldculturen contributes to Wikimedia Commons again; Student research on GLAM-Wiki at Erasmus University Rotterdam
  • Serbia report: March Highlights – Everything is postponed
  • Sweden report: FindingGLAMs; Wikipedia in libraries; Art from the Thiel Gallery Collections; Kulturhistoria som gymnasiearbete
  • UK report: Colourful Kimonos from Khalili
  • USA report: Women & Editing in the time of virus
  • Special story: COVID-19
  • Wikidata report: Lockdown Levellings
  • WMF GLAM report: Mapping GLAM-Wiki collaborations
  • Calendar: April’s GLAM events

Art museums are not only stewards of data about their collection but also of data from other museums. All of this data can tell the story of art, art movements, and the creators of these artworks. In January I followed up with some of our Wikidata course participants at SFMOMA. I met with Ian Gill, Documentation Associate, and Marla Misunas, Collections Information Manager, as well as other museum staff to catch up about their Wikidata projects.

During my visit we talked about a variety of ways museums can work with linked data to provide more information to their patrons, researchers, staff, or anyone interested in the story of the museum’s collection. It was an engaging conversation, laying the groundwork for some fascinating projects in the future. Things really took off when Ian, who took our Wikidata course last summer, started telling us about a project he has been working on since the course ended.

SFMOMA’s Ian Gill and Marla Misunas with Wikidata Program Manager Will Kent.

Ian has been a prolific editor on Wikidata, hard at work adding exhibition data from SFMOMA’s database. Take a look at his query tracking all exhibitions at SFMOMA from 1935 to 2020. Since he uploaded this data, he can track which traveling exhibits were at SFMOMA and when. Although several art museums have extensive exhibition data, SFMOMA has by far the most on Wikidata, with over 3,000 exhibitions represented. Why does that matter? There’s so much SFMOMA can visualize and understand about exhibition movement by utilizing Wikidata’s query capabilities. Plus, since Wikidata is an open repository, others can benefit from this data too.

Ian was able to upload all this data to Wikidata using QuickStatements, a tool that allows for semi-automated edits to Wikidata that we discussed in our course. Thanks to a firm background in manual editing and property/value usage, Ian was comfortable testing out the tool before implementing edits on a larger scale. He pointed to this tool, which was Wikidata’s tool of the year in 2019, as being essential to making these meaningful edits.
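To give a flavour of what QuickStatements consumes, here is a toy sketch – not Ian’s actual pipeline – that turns exhibition records into QuickStatements V1 commands. The Q-ids below are placeholders to be replaced with real items; P31, P276 and P580 are the real “instance of”, “location” and “start time” properties.

```python
def to_quickstatements(exhibitions):
    """Emit QuickStatements V1 commands (tab-separated) for each record."""
    lines = []
    for ex in exhibitions:
        lines.append("CREATE")                       # start a new item
        lines.append(f'LAST\tLen\t"{ex["title"]}"')  # English label
        lines.append("LAST\tP31\tQ_EXHIBITION")      # instance of (placeholder Q-id)
        lines.append("LAST\tP276\tQ_MUSEUM")         # location (placeholder Q-id)
        # Wikidata time format: +ISO timestamp with precision 11 (= day)
        lines.append(f'LAST\tP580\t+{ex["start"]}T00:00:00Z/11')
    return "\n".join(lines)

print(to_quickstatements([{"title": "Example Exhibition", "start": "1935-01-18"}]))
```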

Projects like this not only make it easy to see exhibition history, but they also serve up this data, neatly organized, to anyone who wants it. By having dates associated with the query, Wikidata’s Query Service can render the results as a table or an interactive timeline. As cool as this is, things get even more interesting when you consider several institutions sharing exhibition data. If that data is in Wikidata, users could look at which museums had an exhibition during a specific date range, follow where different exhibitions traveled, and analyze them by topic or length of stay. As with so much on Wikidata, there is great potential to reveal new insights about collections, the creators whose work is represented in them, and the people who consume information about these collections (us!).

It’s exciting that SFMOMA has been supportive of staff working on Wikidata. The commitment that they’ve made to open data demonstrates how combined data from museums can paint a more complete picture of history. Imagine if every art museum in the country shared exhibition data. Not only could you tell what was on display where and when, but you could also track exhibitions city by city over time. Only through the support of many institutions will collaborative projects like Wikidata have a more meaningful impact. Keep an eye on SFMOMA for more developments in the future.


Registration for our upcoming Wikidata courses is open! New to linked data? Join the open data movement in our beginner’s course. Have more experience with linked data or Wikidata? Sign up for our intermediate course that focuses on possible applications.


Want to hear from participants about what courses are like and how they’re using their new skills? Check out these blogs.

Ornithologists in cartoons

06:46, Wednesday, 08 April 2020 UTC
From: The Graphic. 25 April 1874.
It is said that the modern version of badminton evolved from a game played in Poona (some sources name the game itself as Poona). When I first saw this picture from 1874 about five years ago, I gave little thought to it. Revisiting it five years later, after some research on one of A.O. Hume's ornithological collaborators, I have a strong hunch that one of the people depicted in the picture is recognizable, although it is not going to be easy to confirm this.

I recently created a Wikipedia entry for G.W. Vidal, a British administrator who worked in the Bombay Presidency, after coming across a genealogy website (whose maintainer unfortunately was uncontactable by email) with notes on his life that included a photograph in profile and a cartoon. The photograph was apparently taken by Vidal himself, a keen amateur photographer as well as a snake and bird enthusiast. Like other naturalists of that epoch, he had many of his specimens shot, skinned or pickled and sent off to museums or specialists. He was an active collaborator of Hume and contributed a long note in Stray Feathers on the birds of Ratnagiri District, where he was a senior ICS official. After the ornithological exit of Hume, he continued to contribute notes to the Journal of the Bombay Natural History Society. This gives further support to an idea I have suggested before: that a key stimulus for the formation of the BNHS was the end of Stray Feathers. Vidal's mother has a claim to being the first woman novelist of Australia. Interestingly, one of his daughters married Major R.M. Betham, another keen amateur ornithologist born in Dapoli, who is well known in Bangalore birding circles for being the first to note Lesser Floricans in the region. Now, Vidal was involved in popularizing badminton in India, apparently creating some of the rules that allowed matches to be played. The man at the left of the sketch in the 1874 edition of The Graphic looks quite like Vidal, but who knows! What do you think?

PS: Vidal sent bird specimens to Hume, and at least two subspecies have been named after him from his specimens: Perdicula asiatica vidali and Todiramphus chloris vidali.

For more information on Vidal, do take a look at the Wikipedia entry. More information from readers is welcome as usual.

If you find yourself teaching the Wikipedia writing assignment virtually for the first time, take some tips from instructors in our community who have done it before!

Take advantage of support

  • Use Wiki Education’s Dashboard. It’s already an online platform, complete with full instructions from beginning to end, including a “how to teach with Wikipedia” orientation for first-timers and a full suite of training modules for your students.
  • Ask for help! Use all of Wiki Education’s resources, from browsing and submitting questions through ask.wikiedu.org to using the Get Help button on your Dashboard course page. You’re not alone!
Once you sign up on the Dashboard, the “Get Help” button can be found in the menu bar of your Dashboard course page. Clicking it will connect you with one of Wiki Education’s Wikipedia Experts on staff.

Set expectations early and keep them realistic

  • Focus on the quality of student work rather than quantity. That’s always a good idea with the Wikipedia assignment, but especially in a virtual course where students may be less inclined to reach out for help. Small contributions can still be impactful.
  • Be as explicit as possible. There is no such thing as over-explaining when it comes to running the assignment remotely.
  • The student trainings on your Dashboard course page will help students make an informed decision when choosing what Wikipedia articles to work on. Or, as the instructor, you may choose to put together a list of articles that students may choose from. If you choose that route, there’s a guide to walk you through creating that list. Remember, Wiki Education has guidance for every step of this assignment!
  • There are likely great disparities in your students’ access to technology and the internet. Keep that in mind as you ask them to conduct the assignment entirely off campus.
  • For more tips about setting expectations, click here.

Make it interactive when you can

  • If you’re running virtual sessions with your students through something like Zoom, give each session a focus when it comes to the Wikipedia assignment. Each week on the Dashboard Timeline centers around a training or two that students should take, a task they should complete, and perhaps a list of discussion questions to go over in class. So, for example, ask your students to complete a task related to the project and then come to the session prepared with questions.
  • The Discussion blocks sprinkled throughout the Dashboard Timeline pose questions for students to consider (see this one for an example). You might use features of Zoom like “breakout rooms” to have students discuss these questions or workshop their article contributions.
Discussion blocks appear throughout the Timeline on your Dashboard course page. These blocks are editable if you’d like to include links or more text.

How to grade

  • Grade students on whether they complete the online training modules by the date stated on your course Timeline.
The Assigned Articles view of the Students tab of the Wiki Education Dashboard shows links to key sandbox and article pages for each assigned article for a selected student.
  • Again, quality is far more important than quantity. What a student contributes to an article will depend on the sources available and Wikipedia’s existing coverage of the subject at hand. A 300-word contribution may be as critical to a subject as a 1,000-word entry.
  • For more tips on grading, click here.

Celebrate the successes!

We know it’s a strange time and student health is a top priority. Giving students something to do that has a bigger reach than a typical assignment can be great right now. Create moments to acknowledge that your students’ work is something to be proud of.

The Engineering Writing Program at the University of Southern California put it well on Twitter:

“In our current state of heightened disquietude, the cooperative efforts of our students in the pursuit of knowledge and education give all of us reason to look forward with confidence and hope. Thank you Wiki Education!”

Want to connect with each other and share tips? Please do!

Tweet at @WikiEducation if you’d like us to connect you with other instructors in our network to share syllabi and more tips for running the assignment. If you have tips of your own to share, please use the hashtag #WikipediaWritingAssignment!

By Luca Toscano, Miriam Redi and Nuria Ruiz

The vast majority of the code that runs Wikipedia is Open Source—released under Free Software licenses. This means that the infrastructure that delivers the site’s free knowledge runs software that is not owned by any company. It is publicly available to anyone; you can read the code, and if you want, you can use it on your own server. Maintaining and improving one of the largest websites in the world using Open Source software requires a continuous commitment. The site is always evolving, so for every new component we want (or need!) to deploy, we need to evaluate the Open Source solutions available.

Our latest challenge was setting up an internal environment for machine learning model training. The fundamental difference between machine learning and traditional programming is that in the latter, you write code that, given an input, generates an output. In the former, you write a program that learns mappings between data inputs and outputs.

This data is called training data.

We use large-scale training data to derive a formula that captures properties of the data; we can then supply new data to that formula to be categorized or analyzed. The formula can tell us things like “this article is about politics” or “this image contains a cat.”

This formula is called a model.

Models are computationally expensive to build; they require many operations over large sets of data. This is called training or building a model. As datasets get larger, training a model might take days or weeks. How long it takes will depend on the amount of data, the type of computations required for training, and the hardware in which those expensive computations are running. While these calculations are expensive (there are many of them), they are not complicated (simple arithmetic operations), and running those operations in parallel can speed up training dramatically.
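To make the idea concrete, here is a toy training loop – an illustration of ours, not anything from the Wikimedia infrastructure – that learns a one-parameter model by gradient descent. Real training differs mainly in scale: millions of parameters and examples instead of one and four, which is exactly where parallel hardware pays off.

```python
# Training data: inputs and outputs that roughly follow y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w = 0.0  # the model: y_hat = w * x, with a single learnable parameter
for step in range(200):
    # Gradient of the mean squared error with respect to w:
    # many simple, independent multiply-adds -- easy to parallelize.
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= 0.01 * grad  # take a small step downhill

print(f"learned w = {w:.2f}")  # close to 2: the derived "formula"
```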

GPUs (Graphics Processing Units), originally built to accelerate the memory-intensive work of geometrical operations in 3d environments like games, are used today in many areas that require the parallelization of matrix/vector multiplications, including Machine Learning.

Nvidia was one of the first manufacturers of commercial GPUs to produce highly performant drivers and tools, but it has historically overlooked collaboration with Open Source developers interested in providing alternatives to its proprietary, closed drivers. This is why independent community efforts to improve the open-source Nvidia drivers emerged.

However, another manufacturer seems to be leading the way: AMD. AMD provides a full Open Source suite for its GPUs, called ROCm. While the GPU firmware is still a non-free binary (as is also the case with some CPU firmware), almost all of the driver and software stack that manages the GPU is Open Source. This is why, when the Wikimedia Analytics and SRE teams evaluated which brand to invest time and effort in, AMD was picked as the preferred solution.

The SRE team deploys only Debian on every host in the infrastructure. We chose Debian 10 (Buster) as a baseline for our tests, since the Linux kernel version it ships directly supports the GPU. AMD develops the kernel side of the driver directly in the mainline Linux kernel, and also ships the driver as part of its own set of Debian packages. Even though the ROCm Debian repository officially supports only Ubuntu, we were able to use its packages on Debian 10 without any rebuild. We then discovered tensorflow-rocm: a port of the TensorFlow PyPI Python package that met our needs.
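As a quick sanity check (a sketch, assuming a working ROCm install and a TensorFlow 2.x build of tensorflow-rocm), the AMD card should show up as an ordinary GPU device:

    # Verify that tensorflow-rocm can see the AMD GPU and run work on it.
    import tensorflow as tf

    print(tf.config.list_physical_devices("GPU"))   # expect one entry per card

    with tf.device("/GPU:0"):
        a = tf.random.uniform((1024, 1024))
        b = tf.random.uniform((1024, 1024))
        print(tf.reduce_sum(tf.matmul(a, b)).numpy())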

It seemed the perfect solution, but we quickly ran into the first issues. With the first GPU that we tested, a FirePro W9100 (Hawaii), Tensorflow was hanging and causing kernel stalls in most of our simple tests. Even the basic testing tools provided by the ROCm suite failed with obscure error messages. The GPU was “enabled but not supported,” meaning there was little chance of making it work. We tried hard to fix the problem with upstream, but eventually we decided to buy a new AMD GPU card.

This was in itself an interesting task. The two server vendors that we buy hosts from offer a wide selection of Nvidia cards (certified to fit and work in their chassis) but only old AMD ones. We manage our own hardware, so we had to be creative and measure the space inside a server’s chassis before being sure which card available on the market could fit into it (pictures in https://phabricator.wikimedia.org/T216528). Size was not the only problem; power consumption and ventilation were also concerns. Eventually, we decided to buy an AMD Radeon Pro WX 9100 16GB card, which ended up fitting very well in the server’s chassis.

Working outside the specifications of server vendors is not an easy task for a foundation with limited resources.

The next step was importing the Debian packages into our own APT repository, and automating their deployment and configuration via Puppet. Managing our own Debian package repository has a lot of advantages. One of them is the ability to build your own version of a package if it doesn’t respect your free software policy. One example is hsa-ext-rocr-dev, which contains non-free binary libraries for processing images with OpenCL. Since, up to now, there has been little traction upstream towards open-sourcing it, we are temporarily bypassing the problem by creating a “fake” Debian package via equivs, to avoid deploying those libraries. Last but not least, Puppet code was added to our repository to automate the configuration of an AMD GPU with ROCm and its related monitoring.

This latest addition to Wikimedia’s infrastructure will enable developers, researchers and analysts at the Foundation to build machine learning models for image classification, language modeling, large-scale data processing, and more, relying on fast, reliable and, more importantly, open technology.

In the sequel to this blog post, we will talk about our latest machine learning projects. Stay tuned!

About this post

To learn more, please visit: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/AMD_GPU#Use_the_Debian_packages

Featured image credit: BalticServers data center, CC BY-SA 3.0, GNU Free Documentation License 1.2 or any later version

Tech News issue #15, 2020 (April 6, 2020)

00:00, Monday, 6 April 2020 UTC

Edwin G. Abel aka Ed Abel

13:53, Sunday, 5 April 2020 UTC
Professor E.G. Abel came onto my radar because he is a recipient of the Daniel X. Freedman Award. He has a Wikipedia article as "Ed Abel", while the award information lists him as "Edwin G. Abel".

I looked into the Freedman award because of a criticism on the Wikipedia article of Professor Montegia. The otherwise superior article on Prof Montegia is criticised because it is an orphan. It now has a Scholia template, which links the 105 scholarly papers known in Wikidata. Its timeline does include the Freedman award, linking Professors Abel and Montegia.

I doubt this is considered enough to remove the orphan template. I have added a redirect for the Freedman award to the issuing organisation. Maintaining a Wikipedia list is not one of my ambitions; it could be a Listeria list like this one.
Thanks,
      GerardM

For the past 5ish years, I've been working on a project called libraryupgrader (LibUp for short) to semi-automatically upgrade dependency libraries in the 900+ MediaWiki extension and related git repositories. For those that use GitHub, it's similar to the new dependabot tool, except LibUp is free software.

One cool feature that I want to highlight is how we are able to fix npm security issues in generally under 24 hours across all repositories with little to no human intervention. The first time this feature came into use was to roll out the eslint RCE fix (example commit).

This functionality is all built around the npm audit command, which was introduced in npm 6. It has a JSON output mode, which made it straightforward to create an npm vulnerability dashboard for all of the repositories we track.
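For the curious, here is a minimal sketch of how a dashboard like that can consume the JSON output. The field names follow npm 6's audit format; the summarising logic is my own illustration, not LibUp's actual code:

    # Run `npm audit --json` in a repository checkout and summarise advisories.
    import json
    import subprocess

    result = subprocess.run(["npm", "audit", "--json"],
                            capture_output=True, text=True)
    # npm audit exits non-zero when vulnerabilities exist, so don't use check=True.
    report = json.loads(result.stdout)

    for advisory in report.get("advisories", {}).values():
        print(advisory["severity"], advisory["module_name"],
              advisory["title"], advisory.get("url", ""))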

The magic happens in the npm audit fix command, which automatically applies semver-safe upgrades. The one thing I'm not super happy about is that we're basically blindly trusting the response given to us by the npm server, but I'm not aware of any free software alternative.

LibUp then writes a commit message, mostly by analyzing the diff, fixes up some changes (since we tend to pin dependencies), and then pushes the commit to Gerrit to pass through CI and be merged. If npm is aware of the CVE ID for the security update, that will also be mentioned in the commit message (example). In addition, each package upgrade is tagged, so if you want to, e.g., look for all commits that bumped MediaWiki Codesniffer to v26, it's a quick search away.

Lately LibUp has been occupied fixing the minimist prototype pollution advisory through a bunch of dependencies: gonzales-pe, grunt, mkdirp and postcss-sass. It's a rather low-priority security issue, but it now requires very little human attention because it has been automated away.

There are some potential risks: someone could install a backdoor by shipping an intentional vulnerability in the same version that fixes a known/published security issue. LibUp would then automatically roll out the new version, making us more vulnerable to the backdoor. This is definitely a risk, but I think our strategy of pulling in security fixes automatically protects us more than the potential downside of malicious actors abusing the system (also because I wouldn't absolutely trust any code pulled down from npm in the first place!).

There are some errors we see occasionally and could use help resolving: T228173 and T242703 are the two most pressing ones right now.

Covid-19 Wikipedia pageviews, a first look

18:30, Friday, 3 April 2020 UTC

World events often have a dramatic impact on online services. A past example would be the death of Michael Jackson, which brought down Twitter and Wikipedia, and which, according to the BBC, made Google believe that it was under attack.

Events like the COVID-19 (coronavirus) pandemic have a less instantaneous effect, but trends can still be seen to change. Cloudflare recently posted about some of the internet-wide traffic changes due to the pandemic and the various government announcements, quarantines and lockdowns.

Currently, the main English Wikipedia article for the COVID-19 pandemic is receiving roughly 1.2 million page views per day (14 per second). This article has already gone through four different names over the past months, and the pageview rate continues to climb.

Wikipedia pageviews tool showing English Wikipedia COVID-19 pandemic article views up to 21 March 2020 (source)

Interestingly, there was a decrease in pageviews throughout February, below both the levels of the week after the article was first created and the levels seen since, as the pandemic has continued to grow. This decrease in pageviews also lines up with a decrease in general interest according to Google Trends.

Interest over time on Google Trends for Coronavirus – Worldwide, 17/01/2020 – 22/03/2020 (source)

Information is not only available in English; other language Wikipedias are also seeing high and increasing pageviews, with Russian leading the way, closely followed by Spanish, German and Chinese.

Taking these other language editions into account, we reach roughly 2.4 million daily page views for the pandemic (28 per second), double that of the English article alone.

Wikipedia langviews tool showing the top 10 language editions’ COVID-19 pandemic pages on the 20th March 2020 (source)

A comprehensive list of current Wikipedia article titles relating to COVID-19 in all languages can be generated with a fairly simple Wikidata Query Service query. These ~2,500 page titles can then be used to retrieve further pageview data. A snapshot of the list used can be found here.
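The exact query isn't reproduced here, but a sketch along these lines would do the job. The property path and the pandemic item ID (Q81068910) are my assumptions and worth double-checking:

    # Ask the Wikidata Query Service for Wikipedia articles about items linked
    # to the COVID-19 pandemic item (Q81068910 -- assumed; verify before use).
    import requests

    SPARQL = """
    SELECT ?article WHERE {
      ?item wdt:P361|wdt:P921 wd:Q81068910 .      # "part of" or "main subject"
      ?article schema:about ?item ;
               schema:isPartOf/wikibase:wikiGroup "wikipedia" .
    }
    """

    resp = requests.get("https://query.wikidata.org/sparql",
                        params={"query": SPARQL, "format": "json"},
                        headers={"User-Agent": "covid-pageviews-example/0.1"})
    resp.raise_for_status()
    titles = [row["article"]["value"]
              for row in resp.json()["results"]["bindings"]]
    print(len(titles), "article URLs found")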

Looking at this full list of article titles across all language Wikipedias over the last week, interest in the topic has continued growing; on the 21st March the topic received 4.5 million page views (52 per second).

COVID-19 related Wikipedia pageviews between the 14th and 21st March 2020

A continued increase in interest can be seen across all continents. The split per continent looks roughly consistent with general Wikipedia viewing figures, though Asia would normally be below North America, and Africa would normally be below South America.

COVID-19 related Wikipedia pageviews split by continent between the 14th and 21st March 2020

Looking at per-country trends for the countries with the largest number of pageviews, most countries appear to be trending up. Germany appears to have shown the most dramatic increase in interest in the past week. The United States and India have the highest pageviews on a single day. Italy actually appears to be trending down.

COVID-19 related Wikipedia pageviews split by country, where the country made over 100k pageviews a day, between the 14th and 21st March 2020

In order to see trends across all Wikimedia sites for a large number of pages, it will be important to account for the historical page names of articles. As noted at the top of this post, the English Wikipedia article has passed through four different names in the past months, as I expect is also the case for other languages. Simply generating trend data for the current names therefore misses data from before the last name change, which is why the total, continent and country graphs only show the last week.
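A sketch of that aggregation, using the Wikimedia pageviews REST API and summing daily views over both current and former titles (the two titles below are illustrative, not the full list):

    # Sum daily pageviews across all known titles of an article, so that data
    # from before a rename still counts. The "user" agent filter excludes bots.
    from collections import Counter
    import requests

    API = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
           "{project}/all-access/user/{title}/daily/{start}/{end}")

    def daily_views(project, title, start, end):
        url = API.format(project=project, title=title, start=start, end=end)
        r = requests.get(url, headers={"User-Agent": "covid-pageviews-example/0.1"})
        if r.status_code == 404:          # no pageview data for this title/range
            return {}
        r.raise_for_status()
        return {item["timestamp"]: item["views"] for item in r.json()["items"]}

    totals = Counter()
    for title in ["2019–20_coronavirus_pandemic", "2019–20_coronavirus_outbreak"]:
        totals.update(daily_views("en.wikipedia", title, "20200314", "20200321"))

    for day in sorted(totals):
        print(day, totals[day])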

Notes

  • All “page views” within this post refer to views by real users (excluding web crawlers etc).
  • Final aggregate data & splits by country and continent generated using the WMF Data Lake.
  • A typo in the Wikidata Query Service SPARQL query meant that ~300 page titles (out of ~1,700) were not checked while writing this blog post.


Production Excellence #13: July 2019

16:30, Friday, 3 April 2020 UTC

How’re we doing on that strive for operational excellence? Read this first anniversary edition to find out!

📊 Month in numbers
  • 5 documented incidents. [1]
  • 53 new Wikimedia-prod-error reports. [2]
  • 44 closed Wikimedia-prod-error reports. [3]
  • 218 currently open Wikimedia-prod-error reports in total. [4]

The number of recorded incidents over the past month, at five, is equal to the median number of incidents per month (2016-2019). – Explore this data.

To read more about these incidents, their investigations, and pending actionables, check Incident documentation § 2019.


📖 One year of Excellent adventures!

Exactly one year ago this periodical started to provide regular insights on production stability. The idea was to shorten the feedback cycle between deployment of code that leads to fatal errors and the discovery of those errors. This allows more people to find reports earlier, which (hopefully) prevents them from sneaking into a growing pile of “normal” errors.

576 reports were created between 15 July 2018 and 31 July 2019 (tagged Wikimedia-prod-error).
425 reports got closed over that same time period.

Read the first issue in story format, or the initial e-mail.


📉 Outstanding reports

Take a look at the workboard and look for tasks that might need your help. The workboard lists error reports, grouped by the month in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Or help someone who already started with their patch:
Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • November: 1 report left (unchanged). ⚠️
  • December: 3 reports left (unchanged). ⚠️
  • January: 1 report left (unchanged). ⚠️
  • February: 2 reports left (unchanged). ⚠️
  • March: 4 reports left (unchanged). ⚠️
  • April: 10 of 14 reports left (unchanged). ⚠️
  • May: 2 reports got fixed! (4 of 10 reports left). ❇️
  • June: 2 reports got fixed! (9 of 11 reports left). ❇️
  • July: 18 new reports from last month remain unsolved.

🎉 Thanks!

Thank you to @aaron, @Anomie, @ArielGlenn, @Catrope, @cscott, @Daimona, @dbarratt, @dcausse, @EBernhardson, @Jdforrester-WMF, @jeena, @MarcoAurelio, @SBisson, @Tchanders, @Tgr, @tstarling, @Urbanecm; and everyone else who helped by finding, investigating, or resolving error reports in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof


Quote: 🎙 “Unlike money, hope is for all: for the rich as well as for the poor.”

Footnotes:

[1] Incidents. – wikitech.wikimedia.org/wiki/Special:PrefixIndex?prefix=Incident…

[2] Tasks created. – phabricator.wikimedia.org/maniphest/query…

[3] Tasks closed. – phabricator.wikimedia.org/maniphest/query…

[4] Open tasks. – phabricator.wikimedia.org/maniphest/query…

Production Excellence #12: June 2019

16:29, Friday, 3 April 2020 UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

📊 Month in numbers
  • 11 documented incidents. ⚠️ [1]
  • 39 new Wikimedia-prod-error reports. [2]
  • 25 Wikimedia-prod-error reports closed. [3]

The number of incidents in June was high compared to previous years. At 11 incidents, it is higher than this year’s median (5), the 2018 median (4), and the 2017 median (5). It is also higher than any June in the last four years. – More data at CodePen.

To read more about these incidents, their investigations, and pending actionables, check Incident documentation § 2019.

There are currently 204 open Wikimedia-prod-error reports (up from 186 in April, and 201 in May). [4]


📖 [Op-ed] Integrated maintenance cost

Hereby a shoutout to the Wikidata and Core Platform teams, at WMDE and WMF respectively. Both recently established a rotating subteam that focuses on incidental work, such as maintenance and other work that might otherwise hinder feature development.

I expect this to improve efficiency by avoiding context switches between feature work and incidental work. The rotational aspect should distribute the work more evenly among team members (avoiding burnout). And it may increase exposure to other teams and to lesser-known areas of our code, which provides opportunities for personal growth and helps retain institutional knowledge.


📉 Current problems

Take a look at the workboard and look for tasks that might need your help. The workboard lists known issues, grouped by the month in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error

Or help someone who already started with their patch:
Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • November: 1 issue got fixed! (1 issue left).
  • December: 3 issues left (unchanged). ⚠️
  • January: 1 issue left (unchanged). ⚠️
  • February: 2 issues left (unchanged). ⚠️
  • March: 4 issues left (unchanged). ⚠️
  • April: 2 issues got fixed! (10 of the 14 issues that survived April remain open). ❇️
  • May: 4 issues got fixed! (6 of the 10 issues that survived May are left). ❇️
  • June: 11 new issues from last month remain unresolved.

By steward and software component, the unresolved issues that survived June:

  • CPT / MW Auth (PHP fatal): T228717
  • CPT / MW Actor (DB contention): T227739
  • CPT or Multimedia / Thumb handler (MultiCurl error): T225197
  • Multimedia / File metadata (PHP error): T226751
  • Wikidata / Commons page view (PHP fatal): T227360
  • Wikidata / Jobrunner (PHP memory fatal): T227450
  • Wikidata / Jobrunner (Trx error): T225098
  • Product-Infra / ReadingList API (PHP fatal): T226593
  • (Unknown?) / Special:ConfirmEmail (PHP fatal): T226337
  • (Unknown?) / Page renaming (DB timeout): T226898
  • (Unknown?) / Page renaming (Bad revision fatal): T225366
💡Ideas: To suggest something to investigate or highlight in a future edition, contact me by e-mail or private IRC message.

🎉 Thanks!

Thank you to everyone who has helped by reporting, investigating, or resolving problems in Wikimedia production. Including: @Anomie, @brion, @Catrope, @cscott, @daniel, @dcausse, @DerFussi, @Ebe123, @fgiunchedi, @Jdforrester-WMF, @kostajh, @Legoktm, @Lucas_Werkmeister_WMDE, @matmarex, @matthiasmullie, @Michael, @Nikerabbit, @SBisson, @Smalyshev, @Tchanders, @Tgr, @Tpt, @Umherirrender, and @Urbanecm.

Thanks!

Until next time,

– Timo Tijhof

🔮 “These are his marbles...” “Ha! He really did lose his marbles, didn't he?” “Yeah, he lost them good.”

Footnotes:

  1. Incidents. – wikitech.wikimedia.org/wiki/Special:PrefixIndex…
  2. Tasks created. – phabricator.wikimedia.org/maniphest/query…
  3. Tasks closed. – phabricator.wikimedia.org/maniphest/query…
  4. Open tasks. – phabricator.wikimedia.org/maniphest/query…

Production Excellence #11: May 2019

16:28, Friday, 3 April 2020 UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

📊 Month in numbers
  • 6 documented incidents. [1]
  • 41 new Wikimedia-prod-error tasks created. [2]
  • 36 Wikimedia-prod-error tasks closed. [3]

The number of incidents in May of this year was comparable to previous years (6 in May 2019, 2 in May 2018, 5 in May 2017), and previous months (6 in May, 8 in April, 8 in March) – comparisons at CodePen.

To read more about these incidents, their investigations, and pending actionables, check wikitech.wikimedia.org/wiki/Incident_documentation#2019.

As of writing, there are 201 open Wikimedia-prod-error tasks (up from 186 last month). [4]


📉 Current problems

Take a look at the workboard and look for tasks that might need your help. The workboard lists known issues, grouped by the month in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error

Or help someone that’s already started with their patch:
Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • November: 2 issues left (unchanged).
  • December: 1 issue got fixed. 3 issues left (down from 4).
  • January: 1 issue left (unchanged).
  • February: 2 issues left (unchanged).
  • March: 1 issue got fixed. 4 issues remaining (down from 5).
  • April: 2 issues got fixed. 12 issues remain unresolved (down from 14).
  • May: 10 new issues found last month survived the month of May, and remain unresolved.

By steward and software component, unresolved issues from April and May:

  • Wikidata / Lexeme (API query fatal): T223995
  • Wikidata / WikibaseRepo (API Fatal hasSlot): T225104
  • Wikidata / WikibaseRepo (Diff link fatal): T224270
  • Wikidata / WikibaseRepo (Edit undo fatal): T224030
  • Growth / Echo (Notification storage): T217079
  • Growth / Flow (Topic link fatal): T224098
  • Growth / Page deletion (File pages): T222691
  • Multimedia or CPT / API (Image info fatal): T221812
  • CPT / PHP7 refactoring (File descriptions): T223728
  • CPT / Title refactor (Block log fatal): T224811
  • CPT / Title refactor (Pageview fatals): T224814
  • (Unstewarded) Page renaming: T223175, T205675
💡Ideas: To suggest an investigation to write about in a future edition, contact me by e-mail, or private message on IRC.

🎉 Thanks!

Thank you to everyone who has helped by reporting, investigating, or resolving problems in Wikimedia production.

Until next time,

– Timo Tijhof

🎙 “It’s not too shabby, is it?”

Footnotes:

[1] Incidents. –
wikitech.wikimedia.org/wiki/Special:PrefixIndex…

[2] Tasks created. –
phabricator.wikimedia.org/maniphest/query…

[3] Tasks closed. –
phabricator.wikimedia.org/maniphest/query…

[4] Open tasks. –
phabricator.wikimedia.org/maniphest/query…

Production Excellence #10: April 2019

16:27, Friday, 3 April 2020 UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

  • Month in numbers.
  • Highlighted stories.
  • Current problems.
📊 Month in numbers
  • 8 documented incidents. [1]
  • 30 new Wikimedia-prod-error tasks created. [2]
  • 31 Wikimedia-prod-error tasks closed. [3]

The number of incidents in April was relatively high at 8, both compared to this year (4 in January, 7 in February, 8 in March) and compared to last year (4 in April 2018).

To read more about these incidents, their investigations, and conclusions, check wikitech.wikimedia.org/wiki/Incident_documentation#2019.

As of writing, there are 186 open Wikimedia-prod-error issues (up from 177 last month). [4]

📖 Rehabilitation of MediaWiki-DateFormatter

Following the report of a PHP error that happened when saving edits to certain pages, Tim Starling investigated. The investigation motivated a big commit that brings this class into the modern era. I think this change serves as a good overview of what’s changed in MediaWiki over the last 10 years, and demonstrates our current best practices.

Take a look at Gerrit change 502678 / T220563.

📉 Current problems

Take a look at the workboard and look for tasks that might need your help. The workboard lists known issues, grouped by the week in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error

Or help someone that’s already started with their patch:
Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • November: 2 issues left (unchanged).
  • December: 4 issues left (unchanged).
  • January: 1 issue got fixed. One last issue remaining (down from 2).
  • February: 2 issues were fixed. Another 3 issues remaining (down from 5).
  • March: 5 issues were fixed. Another 5 issues remaining (down from 10).
  • April: 14 new issues were found last month that remain unresolved.

By steward and software component, issues left from March and April:

  • Anti-Harassment / User blocking: T222170
  • CPT / Revision-backend (Save redirect pages): T220353
  • CPT / Revision-backend (Import a page): T219702
  • CPT / Revision-backend (Export pages for dumps): T220160
  • Growth / Watchlist: T220245
  • Growth / Page deletion (Restore an archived page): T219816
  • Growth / Page deletion (File pages): T222691
  • Growth / Echo (Job execution): T217079
  • Multimedia / File management (Upload mime error): T223728
  • Performance / Deferred-Updates: T221577
  • Search Platform / CirrusSearch (Job execution): T222921
  • (Unstewarded) / Page renaming: T223175, T221763, T221595

🎉 Thanks!

Thank you to everyone who has helped by reporting, investigating, or resolving problems in Wikimedia production. Including: @aaron, @ArielGlenn, @Daimona, @dcausse, @EBernhardson, @Jdforrester-WMF, @Joe, @KartikMistry, @Ladsgroup, @Lucas_Werkmeister_WMDE, @MaxSem, @MusikAnimal, @Mvolz, @Niharika, @Nikerabbit, @Pchelolo, @pmiazga, @Reedy, @SBisson, @tstarling, and @Umherirrender.

Thanks!

Until next time,

– Timo Tijhof

🏴‍☠️ “One good deed is not enough to save a man.” “Though it seems enough to condemn him?” “Indeed…”

Footnotes:

[1] Incidents reports by month and year. –
codepen.io/Krinkle/…

[2] Tasks created. –
phabricator.wikimedia.org/maniphest/query…

[3] Tasks closed. –
phabricator.wikimedia.org/maniphest/query…

[4] Open tasks. –
phabricator.wikimedia.org/maniphest/query…

Production Excellence #9: March 2019

16:26, Friday, 3 April 2020 UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

📊 Month in numbers
  • 8 documented incidents. [1]
  • 31 new Wikimedia-prod-error issues reported. [2]
  • 28 Wikimedia-prod-error issues closed. [3]

The number of incidents this month was slightly above average compared to earlier this year (7 in February, 4 in January), and this time last year (4 in March 2018, 7 in February 2018).

To read more about these incidents, their investigations, and conclusions, check wikitech.wikimedia.org/wiki/Incident_documentation#2019-03.

There are currently 177 open Wikimedia-prod-error issues, similar to last month. [4]

💡 Ideas: To suggest an investigation to highlight in a future edition, feel free to contact me by e-mail, or private message on IRC.

📉 Current problems

Take a look at the workboard and look for tasks that might need your help. The workboard lists known issues, grouped by the week in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Or help someone that’s already started with their patch:
Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • September: Done! The last two issues were resolved.
  • October: Done! The last issue was resolved.
  • November: 2 issues left (from 1.33-wmf.2). 1 issue was fixed.
  • December: 4 issues left (from 1.33-wmf.9). 1 issue was fixed.
  • January: 2 issues left (1.33-wmf.13 – 14). 1 issue was fixed.
  • February: 5 issues (1.33-wmf.16 – 19).
  • March: 10 new issues (1.33-wmf.20 – 23).

By steward and software component, for issues remaining from February and March:


🎉 Thanks!

Thanks to @aaron, @Anomie, @Arlolra, @Daimona, @hashar, @Jdforrester-WMF, @kostajh, @matmarex, @MaxSem, @Niedzielski, @Nikerabbit, @Petar.petkovic, @santhosh, @ssastry, @Umherirrender, @WMDE-leszek, @zeljkofilipin, and everyone else who helped last month by reporting, investigating, or patching errors found in production!

Until next time,

– Timo Tijhof

🦅 “This isn’t flying. This is falling… with style!”

Footnotes:

[1] Incidents. – wikitech.wikimedia.org/wiki/Special:PrefixIndex/Incident_documentation/201903 …

[2] Tasks created. – phabricator.wikimedia.org/maniphest/query …

[3] Tasks closed. – phabricator.wikimedia.org/maniphest/query …

[4] Open tasks. – phabricator.wikimedia.org/maniphest/query …

Production Excellence #8: February 2019

16:24, Friday, 3 April 2020 UTC

How’d we do in our strive for operational excellence? Read on to find out!

📊 Month in numbers
  • 7 documented incidents. [1]
  • 30 new Wikimedia-prod-error tasks created. [2] (17 new in Jan, and 18 in Dec.)
  • 27 Wikimedia-prod-error tasks closed. [3] (16 closed in Jan, and 20 in Dec.)

There are in total 177 open Wikimedia-prod-error tasks today. (188 in Feb, 172 in Jan, and 165 in Dec.)

📉 Current problems

There’s been an increase in how many application errors are reported each week. We’ve also managed to mostly keep up with them each week, so that’s great!

But it does appear that most weeks we accumulate one or two unresolved errors, which is starting to add up. I believe this is mainly because they were reported a day after the branch went out. That is, if the same issues had been reported 24 hours earlier in a given week, they might’ve blocked the train as a regression.

https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Below is a breakdown of unresolved production errors since last quarter. (I’ve omitted the last three weeks.)

By month:

  • February: 5 reports (1.33-wmf.16, 1.33-wmf.17, 1.33-wmf.18).
  • January: 3 reports (1.33-wmf.13, 1.33-wmf.14).
  • December 2018: 5 reports (1.33-wmf.9).
  • November 2018: 3 reports (1.33-wmf.2).
  • October 2018: 1 report (1.32-wmf.26).
  • September 2018: 2 reports (1.32-wmf.20).

By steward and software component:

📖 Fixed exposed fatal error on Special:Contributions

Previously, a link to Special:Contributions could pass invalid options to a part of MediaWiki that doesn’t allow invalid options. Why would anything allow invalid options? Let’s find out.

Think of software as an onion. Software tends to have an outer layer where everything is allowed. If this layer finds illegal user input, it has to respond somehow, for example by informing the user. In this outer layer, illegal input is not a problem in the software; it is a normal thing to see as we interact with the user. This outer layer responds directly to a user, is translated, and can do things like “view recent changes”, “view user contributions” or “rename a page”.

Internally, such an action is divided into many smaller tasks (or functions). For example, a function might be “get the talk namespace for a given subject namespace”. This would answer “Talk:” for “(Article)”, and “Wikipedia_talk:” for “Wikipedia:”. When searching for edits on My Contributions with “Associated namespaces” ticked, this function is used. It is also used by Move Page when renaming a page together with its talk page. And it’s used on Recent Changes and View History, for all those little “talk” links next to each page title and username.

If one of your edits is for a page that has no discussion namespace, what should MediaWiki do? Show no edits? Skip that edit and tell the user “1 edit was hidden”? Show normally, but without a talk link? That decision is made by the outer layer for a feature, when it catches the internal exception. Alternatively, it can sometimes avoid an exception by asking a different question first – a question that cannot fail, such as “Does namespace X have a talk space?” instead of “What is the talk space for X?”.
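To make the two options (catch, or avoid) concrete, here is a sketch in Python rather than MediaWiki's actual PHP:

    # Inner layer: assumes a talk namespace exists, and raises if it doesn't.
    class NoTalkNamespaceError(Exception):
        pass

    TALK_OF = {"(Article)": "Talk:", "Wikipedia:": "Wikipedia_talk:"}

    def get_talk_namespace(subject_ns):
        if subject_ns not in TALK_OF:
            raise NoTalkNamespaceError(subject_ns)
        return TALK_OF[subject_ns]

    # Outer layer, option 1: catch the exception and degrade gracefully.
    def render_talk_link(subject_ns):
        try:
            return "[" + get_talk_namespace(subject_ns) + "] talk"
        except NoTalkNamespaceError:
            return ""                 # show the entry without a talk link

    # Outer layer, option 2: avoid the exception with a question that cannot fail.
    def has_talk_namespace(subject_ns):
        return subject_ns in TALK_OF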

When a program doesn’t catch or avoid an exception, a fatal error occurs. Thanks to @D3r1ck01 for fixing this fatal error. – T150324

💡 ProTip: If your Jenkins build is failing and you suspect it’s unrelated to the project itself, be sure to report it to Phabricator under “Shared Build Failure”.
🎉 Thanks!

Thank you to everyone who has helped by reporting, investigating, or resolving problems in Wikimedia production. Including: @aaron, @Addshore, @alaa_wmde, @Amorymeltzer, @Anomie @D3r1ck01 @Daimona @daniel @hashar @hoo, @jcrespo, @KaMan, @Mainframe98, @Marostegui, @matej_suchanek, @Ottomata, @Pchelolo, @Reedy, @revi, @Smalyshev, @Tarrow, @Tgr, @thcipriani, @Umherirrender, and @Volker_E.

Thanks!

Until next time,

– Timo Tijhof


Footnotes:

[1] Incidents. — wikitech.wikimedia.org/wiki/Special:AllPages…

[2] Tasks created. — phabricator.wikimedia.org/maniphest/query…

[3] Tasks closed. — phabricator.wikimedia.org/maniphest/query…


🍏 “He got me invested in some kind of… fruit company.”

Production Excellence #7: January 2019

16:23, Friday, 3 April 2020 UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

📊 Month in numbers
  • 4 documented incidents in January 2019. [1]
  • 16 Wikimedia-prod-error tasks closed. [2]
  • 17 Wikimedia-prod-error tasks created. [3]

📖 Unable to move certain file pages

Xiplus reported that renaming a File page on zh.wikipedia.org led to a fatal database exception. Andre Klapper identified the stack trace from the logs, and Brad (@Anomie) investigated.

The file renaming failed because the File page did not have a media file associated with it (such a move is not currently allowed in MediaWiki). But while handling this error, the code caused a different error. The impact was that the user wasn't informed about why the move failed; instead, they received a generic error page about a fatal database exception.

@Tgr fixed the code a few hours later, and it was deployed by Roan later that same day.
Thanks! — T213168

📖 DBPerformance regression detected and fixed

During a routine audit of Logstash dashboards, I found a DBPerformance warning. The warning indicated that the limit of 0 for “master connections” was violated. That's a cryptic way of saying it found code in MediaWiki that uses a database master connection on a regular page view.

MediaWiki can have many replica database servers, but there can be only one master database at any given moment. To reduce the chance of overload, delayed edits, or network congestion, we make sure to use replicas whenever possible. We usually involve the master only when source data is being changed, or is about to be changed; for example, when editing a page or saving changes.

As the vast majority of traffic is page views, we hold page views to lower latency thresholds and allow them fewer dependencies. In particular, page views may (in the future) be routed to secondary data centres that don’t even have a master DB.
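In pseudocode, the routing policy looks roughly like this (a Python sketch of the idea, not MediaWiki's actual load balancer API):

    import random

    class LoadBalancer:
        """Route reads to replicas; hand out the master only for writes."""

        def __init__(self, master, replicas):
            self.master = master
            self.replicas = replicas
            self.master_connections = 0   # on a page view, the allowed limit is 0

        def get_connection(self, for_write=False):
            if for_write:
                self.master_connections += 1   # exceeding the limit logs a warning
                return self.master
            return random.choice(self.replicas)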

@Tchanders from the Anti-Harassment team investigated the issue, found the culprit, and fixed it in time for the next MediaWiki train. Thanks! — T214735

📖 TemplateData missing in action

@Tacsipacsi and @Evad37 both independently reported the same TemplateData issue. TemplateData powers the template insertion dialog in VisualEditor. It wasn't working for some templates after we deployed the 1.33-wmf.13 branch.

The error was “Argument 1 passed to ApiResult::setIndexedTagName() must be an instance of array, null given”. This means there was code calling a function with the wrong parameter. For example, the variable name may've been misspelled, or it may've been the wrong variable, or (in this case) the variable didn't exist. In such a case, PHP implicitly assumes “null”.

Bartosz (@matmarex) found the culprit. The week before, I made a change to TemplateData that changed the “template parameter order” feature to be optional. This allows users to decide whether VisualEditor should force an order for the parameters in the wikitext. It turned out I forgot to update one of the references to this variable, which still assumed it was always present.

Brad (Anomie) fixed it later that week, and it was deployed the next day. Thanks! — T213953

📈 Current problems

Take a look at the workboard and look for tasks that might need your help. The workboard lists known issues, grouped by the week in which they were first observed.

phabricator.wikimedia.org/tag/wikimedia-production-error

There are currently 188 open Wikimedia-prod-error tasks as of 12 February 2019. (We’ve had a slight increase since November; 165 in December, 172 in January.)

For this month’s edition, I’d like to draw attention to a few older issues that are still reproducible:

  • [2013; Collection extension] Special:Book fatal error for blocked users. T56179
  • [2013; CentralNotice] Fatal error when placeholder key contains a space. T58105
  • [2014; LQT] Fatal error when attempting to view certain threads. T61791
  • [2015; MassMessage] Warning about Invalid message parameters. T93110
  • [2015; Wikibase] Warning “UnresolvedRedirectException” for some pages on Wikidata (and Commons). T93273
💡 Terminology:

A “Fatal error” (or uncaught exception) prevents a user action. For example — a page might display “MWException: Unknown class NotificationCount.”, instead of the article content.
A “Warning” (or non-fatal, or PHP error) lets the program continue and display a mostly complete page regardless. This may cause corrupt, incorrect, or incomplete information to be shown. For example — a user may receive a notification that says “You have (null) new messages”.


🎉 Thanks!

Thank you to everyone who has helped by reporting, investigating, or resolving problems in Wikimedia production. Including: A2093064‚ @Anomie, @Daimona @Gilles, @He7d3r, @Jdforrester-WMF, @matmarex, @mmodell, @Nikerabbit, @Catrope, @Tchanders, @Tgr, and @thiemowmde.

Thanks!

Until next time,

— Timo Tijhof

👢 “There's a snake in my boot. Reach for the sky!”


Footnotes:

[1] Incidents. — wikitech.wikimedia.org/wiki/Special:AllPages…

[2] Tasks closed. — phabricator.wikimedia.org/maniphest/query…

[3] Tasks created. — phabricator.wikimedia.org/maniphest/query…

Production Excellence #14: August 2019

16:20, Friday, 3 April 2020 UTC

How’d we do in our strive for operational excellence in August? Read on to find out!

📊 Month in numbers
  • 3 documented incidents. [1]
  • 42 new Wikimedia-prod-error reports. [2]
  • 31 Wikimedia-prod-error reports closed. [3]
  • 210 currently open Wikimedia-prod-error reports in total. [4]

The number of recorded incidents in August, at three, was below average for the year so far. However, in previous years (2017-2018), August also had 2-3 incidents. – Explore this data.

To read more about these incidents, their investigations, and pending actionables, check Incident documentation § 2019.


*️⃣ When you have eliminated the impossible...

Reports from Logstash indicated that some user requests were aborted by a fatal PHP error from the MessageCache class. The user would be shown a generic system error page. The affected requests didn’t seem to have anything obvious in common, however. This made it difficult to diagnose.

MessageCache is responsible for fetching interface messages, such as the localised word “Edit” on the edit button. It calls a “load()” function and then tries to access the loaded information. However, sometimes the load function claimed to have finished its work, and yet the information was not there.

When the load function initialises all the messages for a particular language, it keeps track of this, so as not to do the same work a second time. From whatever angle I looked at this code, no obvious mistakes stood out. A deeper investigation revealed that two unrelated changes, made more than a year apart, each broke one assumption that was, on its own, safe to break. Put together, though, this seemingly impossible problem emerged. Check out T208897#5373846 for the details of the investigation.


📉 Outstanding reports

Take a look at the workboard and look for tasks that might need your help. The workboard lists error reports, grouped by the month in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Or help someone that’s already started with their patch:
Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • January: 1 report left (unchanged). ⚠️
  • February: 2 reports left (unchanged). ⚠️
  • March: 4 reports left (unchanged). ⚠️
  • April: 2 reports got fixed! (8 of 14 reports left). ❇️
  • May: 4 of 10 reports left (unchanged).
  • June: 1 report got fixed! (8 of 11 reports left). ❇️
  • July: 2 reports got fixed (17 of 18 reports left).
  • August: 14 new reports remain unsolved.
  • September: 11 new reports remain unsolved.

🎉 Thanks!

Thank you to @aaron, @Catrope, @Daimona, @dbarratt, @Jdforrester-WMF, @kostajh, @pmiazga, @Tarrow, @zeljkofilipin, and everyone else who helped by reporting, investigating, or resolving problems in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof


🎭 “I think you should call it Seb's, because no one will come to a place called Chicken on a Stick.”

Footnotes:

[1] Incidents. – wikitech.wikimedia.org/wiki/Special:PrefixIndex?prefix=Incident…

[2] Tasks created. – phabricator.wikimedia.org/maniphest/query…

[3] Tasks closed. – phabricator.wikimedia.org/maniphest/query…

[4] Open tasks. – phabricator.wikimedia.org/maniphest/query…

Production Excellence #6: December 2018

16:18, Friday, 3 April 2020 UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

  • Month in numbers.
  • Lightning round.
  • Current problems.

📊 Month in numbers

  • 4 documented incidents. [1]
  • 20 Wikimedia-prod-error tasks closed. [2]
  • 18 Wikimedia-prod-error tasks created. [3]
  • 172 currently open Wikimedia-prod-error tasks (as of 16 January 2019).

Terminology:

  • An Exception (or fatal) prevents a user action. For example, a page would display “Exception: Unable to render page”, instead of the article content.
  • An Error (or non-fatal, warning) can produce pages that are technically unaware of a problem, but may show corrupt, incorrect, or incomplete information. For example — a user may receive a notification that says “You have (null) new messages”.

For December, I haven’t prepared any stories or taken interviews. Instead, I’ve got a lightning round of errors in various areas that were found and fixed this past month.

⚡️ Contributions view fixed

MarcoAurelio reported that Special:Contributions failed to load for certain user names on meta.wikimedia.org (PHP Fatal error, due to a faulty database record). Brad Jorsch investigated and found a relation to database maintenance from March 2018. He corrected the faulty records, which resolved the problem. Thanks! — T210985

⚡️ Undefined talk space now defined

The newly created Cantonese Wiktionary (yue.wiktionary.org) was encountering errors from the Siteinfo API. We found this was due to invalid site configuration. Urbanecm patched the issue, and also created a new unit test for wmf-config that will prevent this issue from happening on other wikis in the future. Thanks! — T211529

⚡️ The undefined error status... error

After deploying the 1.33.0-wmf.8 train to all wikis, we found a regression in the HTTP library for MediaWiki. When MediaWiki requested an HTTP resource from another service and this resource was unavailable, MediaWiki failed to correctly determine the HTTP status code of that error, which then caused another error! This happened, for example, when Special:Collection was unable to reach the PediaPress.com backend in some cases. Patched by Bill Pirkle. Thanks! — T212005

⚡️ Fatal error: Call to undefined function in Kartographer API

When the 1.33.0-wmf.9 train reached the canary phase on Tue 18 December (aka group0 [4]), Željko spotted a new fatal error in the logs. The fatal originated in the Kartographer extension and would have affected various users of the MediaWiki API. Patched the same day by Michael Holloway, reviewed by James Forrester, and deployed by Željko. Thanks! — T212218

📉 Current problems

Take a look at the workboard and look for tasks that might need your help. The workboard lists known issues, grouped by the week in which they were first observed.

→ https://phabricator.wikimedia.org/tag/wikimedia-production-error

November's theme will continue for now, as I imagine lots of you were on vacation during that time! I’d like to draw attention to a subset of PHP fatal errors: specifically, those that are publicly exposed (i.e. they don’t need elevated user rights) and emit an HTTP 500 error code.

  1. Wikibase: Clicking “undo” for certain revisions fatals with a PatcherException. — T97146
  2. Flow: Unable to view certain talk pages due to workflow InvalidDataException. — T70526
  3. Translate: Certain Special:Translate urls fatal. — T204833
  4. MediaWiki (Special-pages): SpecialDoubleRedirects unavailable on tt.wikipedia.org. — T204800
  5. MediaWiki (Parser): Parse API exposes fatal content model error. — T206253
  6. CentralNotice: Certain SpecialCentralNoticeBanners urls fatal. — T149240
  7. PageViewInfo: Certain “mostviewed” API queries fail. — T208691

Public user requests resulting in fatals can cause (and have caused) alerts to fire that notify SRE of wikis potentially being less available or down.

💡 ProTip:

Use “Report Error” on https://phabricator.wikimedia.org/tag/wikimedia-production-error/ to create a task with a helpful template. This template is also available as “Report Application Error”, from the “Create Task” dropdown menu, on any task creation form.

🎉 Thanks!

Thank you to everyone who has helped by reporting, investigating, or resolving problems in Wikimedia production. Including @MarcoAurelio, @Anomie, @Urbanecm, @BPirkle, @zeljkofilipin, @Mholloway, @Esanders, @Jdforrester-WMF, and @hashar.

Until next time,

— Timo Tijhof


Footnotes:

[1] Incidents. — wikitech.wikimedia.org/wiki/Special:AllPages...

[2] Tasks closed. — phabricator.wikimedia.org/maniphest/query...

[3] Tasks opened. — phabricator.wikimedia.org/maniphest/query...

[4] What is group0? — wikitech.wikimedia.org/wiki/Deployments/One_week#Three_groups