Teaching students that their words have power

19:48, Thursday, 05 March 2020 UTC

“College professors tend not to allow you to include a Wikipedia article as a citation in a paper you write. So what are you doing encouraging your students to go to Wikipedia?” Houston Matters host Craig Cohen asked Dr. Melissa Weininger in a radio interview last week.

Dr. Melissa Weininger (by Jeff Fitlow).

“Well, I’m not encouraging them to go to Wikipedia but to write well-sourced, well-researched Wikipedia entries,” Dr. Weininger clarified. “We all know that people do use Wikipedia. Even if you’re not allowed to use it for your college course, we certainly look to Wikipedia when we need a little bit of information about something. And so it’s really important that the information that’s on Wikipedia is reliable.”

Dr. Weininger began teaching a Wikipedia writing assignment in her course, Sex and Gender in Modern Jewish Culture, at Rice University last fall. Through it, students discussed issues of inequity in Wikipedia’s content and the importance of access to accurate, verifiable information.

“I think we’ve all learned in the last years about how important it is for everybody, but students in particular, to be able to differentiate between reliable and unreliable information on the internet,” Dr. Weininger continued. “So one of the things the students learned in doing the project is—it was a way of reverse engineering that process. They learned how Wikipedia entries are built and therefore what their strengths and weaknesses can be.”

The lack of equal representation of biographies of women on Wikipedia “was one of the reasons for starting this project.” Dr. Weininger was interested in exploring writing as activism with students.

“Part of that is to teach the students that their words and their actions have power, that the information that is available to us on the internet can also be a source of power. If there’s less information available to us about women, we learn that women aren’t as valuable in our culture and we simply don’t have access to their stories. So the assignments for the class were all structured around this idea of not just improving our access to information about women on Wikipedia, but thinking about how what we do in the classroom, what we learn in the classroom, and how our own writing can really be of benefit to the public sphere in general.”

Sarah Silberman (by Katharine Shilcutt).

Dr. Weininger was joined by one of her former students who had completed the assignment, Sarah Silberman, who drastically improved Jewish social historian Paula Hyman’s biography (see the Dashboard’s Authorship Highlighting tool in action here!).

“At Rice I’m a double history and French major,” Sarah shared. “So in other Jewish history classes I actually read a good number of Paula Hyman articles and I read excerpts from books she had written. I really wanted to write about her because looking at her Wikipedia article, it just looked so sad and sparse in comparison to all the amazing things she’s done in her life and I wanted people to know that.”

“There’s a potential downside to crowdsourcing on Wikipedia,” the interviewer suggested, “where someone like you, Sarah, will write this extensive article, have well-cited sources, and then somebody else will come along and decide to change it without any particular citation. Did you track your article to see if others were coming along and making tweaks to it?”

“I have actually gone back a few times to make sure everything with Paula’s article is tip-top shape and so far no one has changed anything,” Sarah responded. “But even so, it tends to be that people on Wikipedia change things for the better. If my language was too subjective, citing too much of an opinion rather than a fact, then it would be changing something like that.”

“When my students published their entries, I immediately started circulating them to other people,” said Dr. Weininger. “And to be honest, other academics were really excited about it and excited to see that my students had really improved the quality of entries about these women who were extremely important.”

“Wikipedia itself might not be a reliable source as a place to look for all of your information,” she noted. “But it does actually have pretty stringent citation standards. So one of the things students also learn from this is the way to fact-check Wikipedia is by going back to the original sources because they’re all cited at the end of the article.”

“A lot of my students have grown up with Wikipedia and Googling things, but haven’t thought very much about where that information comes from. And when you yourself write an entry, all of a sudden all of that is open to you. And you have a much clearer understanding of how these things are put together, maybe who’s putting these things together, what’s involved in it, what the potential pitfalls are, and also the potential advantages.”

“Most projects in a classroom setting are done in the classroom,” the interviewer noted. “This is one that is sort of public record. Did you have any qualms about doing a project that is so out there?”

“Well no,” Dr. Weininger answered, “and that’s mostly a testament to how great Rice students are and the students in this class were. I actually wasn’t worried about that at all. And there are also advantages that come with that risk—the public nature of this project. And that is that students get to see their work published. One of the great things to see was the day that the articles were due and they had to post them online. … They were all saying, ‘I wrote this and it’s on the internet!’ That’s a really exciting thing, and hopefully it gets them excited about doing it more now that they have the skills and the ability to do it. So there’s some risk involved, but a lot of reward. And I knew they were up to it and they really did such a fantastic job.”

“I actually really did enjoy doing this project,” Sarah added. “I would love to continue doing this as a side hobby or something. … I think this should be more prominent in college classes and in university classes. I think that, like Dr. Weininger was saying, not only was it satisfying from a student perspective to see that my work was published and that it was doing a public good, but also it’s just a way for our work to have greater impact. Because a lot of times I write papers, they’ll go to my professor, my professor will read them, and then that’s the end of it. It doesn’t really go anywhere beyond that. So having a project that actually has some impact on the world is really important. And I think academia should take advantage of more opportunities like that.”


To incorporate a Wikipedia writing assignment into an upcoming course, visit teach.wikiedu.org for access to free resources and assignment templates.


To read more about Dr. Weininger’s class, check out this blog post.


To listen to the original interview (which begins at minute 11:10), visit Houston Matters.


Thumbnail and inset images by Jeff Fitlow and Katharine Shilcutt respectively (Rights reserved).

Dr. Lydia Le Page is a postdoc at the University of California, San Francisco, where she images brain metabolism with MRI to understand Alzheimer’s disease. In our recent Wiki Scientists course sponsored by the National Science Policy Network, she was excited to improve Wikipedia pages that will help voters and policy-makers make the best use of research when voting on or developing policies. 

“You read that on Wikipedia? Oh I wouldn’t trust that.”

Lydia Le Page.
Image by Pebbles1.0, CC BY-SA 4.0 via Wikimedia Commons.

Wikipedia has come a long way since its inception in 2001. Skeptical at first, I found myself using it more and more during my undergraduate chemistry studies at Oxford. When I moved from learning about science to doing science, as part of a PhD in diabetic physiology and metabolism using MRI, I found that even the most esoteric topics had pages (Hyperpolarized carbon-13 MRI gets ~11 views/day) – and to me that was invaluable!

My research has since taken me to a postdoc in San Francisco, where I now study metabolism in the brain in Alzheimer’s disease using MRI. I’m also enthusiastic about science communication and policy. I love finding ways to share my understanding of science with other people. It’s important to me that scientific endeavors improve lives, and one way to do that is to use scientific evidence to inform policy changes. Although I’ve written blog posts, made YouTube videos, and given talks about science, I had never edited Wikipedia content before, and until the Wiki Education course I had no idea how to start.

After a brief application, I was excited to be sponsored by the National Science Policy Network to attend a Wiki Education course. For an hour a week for 12 weeks, I would get on a video call with policy-minded scientists (and their dogs) across America to learn about Wikipedia. The course was led by Wiki Educators Ryan and Elysia who were infectiously enthusiastic about editing, and great teachers. I can now write my own pages on science and policy-related topics – I wrote the page for the National Alzheimer’s Project Act!

One thing I learnt was the high standard that Wikipedia edits are held to – edits that don’t meet the rules are typically reverted quickly. Wikipedia isn’t for sharing your own views – an encyclopedia is built from statements of fact, and these must have citations. But not any old source! I learnt that some sources carry more weight than others. If there aren’t several secondary sources about a topic, it shouldn’t be on Wikipedia (the page on the Queen, for example, can’t rest on things she wrote about herself). As a scientist, that meant I couldn’t link to my latest paper on my research; instead I would need to wait until a review of the work was written by someone else! Although this means Wikipedia may lag behind some developments, it avoids incorporating early studies that are later found to be unreproducible.

The media, and especially the internet, have a reputation for being male-biased, and Wikipedia is not without some of these issues. Did you know that pages about women regularly mention their marital status? Not so much for the men. Initiatives to address these issues are underway, such as WikiProject ‘Women in Red’, which highlights just how many women, and works by women, that ‘may qualify for an article on the English Wikipedia’ still don’t have one. In October 2014, just over 15% of English Wikipedia’s biographies were about women; as of January 2020 the figure has just exceeded 18%. We still have a way to go.

One tool I found very useful when I started adding content was the Citation Hunt tool – a fascinating and easy way to improve Wikipedia. It shows you snippets of random pages that need citations; my favorite addition was to the page on Bela Lugosi, where a citation was required for the current whereabouts of his famous Dracula cape! (It’s in a museum in LA, by the way.)

“Imagine a world in which every single person on the planet is given free access to the sum of all human knowledge. That’s what we’re doing.” – Jimmy Wales, Wikipedia co-founder

The Encyclopædia Britannica, one of the earliest English-language encyclopedias, was published over 250 years ago, in 1768. In January, English Wikipedia reached 6 million articles. Collating all the world’s knowledge, freely available to anyone, is truly one of the wonders of the modern world. Twenty years ago very few people even believed such a project was viable. Wikipedia has helped me so much in my career, from young student to budding neuroscientist, that I am happy to be able to give something back.


Interested in taking an introductory Wikipedia training course? Write Wikipedia biographies for women across disciplines and professions (here). To see all courses with open registration, visit learn.wikiedu.org.

On Twitter, Janeen Uzzell praised a blog post, the Wikimedia Foundation All Hands: 2020 Sketchbook. It does inform about current thinking, and most of it is great; still, I find it absolutely terrifying.

There are several great sketches in there. Katherine Maher gave an aspirational talk. I love it that Wikimedia is to be seen as infrastructural and inclusive, and even the idea that what we do does not have to happen in our projects. Importantly, she mentions "support systems", because they provide the input for much of our processes.

The page on security and risk is important too. All the key concepts are mentioned, among them likelihood, relative impact and management preparedness, but also "plan for and mitigate risks".

What truly makes me uneasy is the stated aim to clarify who we are in the world under one brand: Wikipedia. The idea is that when we are all branded as Wikipedia, things are likely to become easier. When you check out the website brandingwikipedia.org there is no argument; Wikipedia is free knowledge. When you check out what the branding is meant to do, it is to:
  • project and improve our reputation
  • support our movement/growth
  • be opt-in
In the abstract, Wikipedia IS wonderful. In reality, the concept of what Wikipedia is, is largely determined by the English Wikipedia. It is fiercely independent, it is hardly inclusive, and it has largely determined the maneuvering space the Wikimedia Foundation has. In order to "plan for and mitigate risks", I will mention several reasons why this branding initiative makes me anxious.
  • In the Commons OTRS, English Wikipedia notions are used to determine whether pictures can stay or are to be removed, yet Commons provides a service to all Wikimedia projects.
  • The query functionality for Commons is maintained by people from the Foundation. For more than half a year it has put a strain on the growth and usefulness of Wikidata: tools have become glacially slow and often malfunction because an edit is not available when needed in further processing. It is not known what the position of the WMF director is on this.
  • This is about marketing, and we have never done much marketing for any of our projects. What we have done was reactive, and it has been all about the English Wikipedia. Now consider this:
    • Wikisource: we do not know what is available at what quality; it is all about editing and not about having people read the finished work. Consequently we neither value Wikisource nor fulfill its potential.
    • So far, Commons has always been English only. With the support of the "depicts" functionality, there is room to enable and market a multilingual search engine. In the spirit of "it is a wiki", it serves as an open invitation to add labels in any and all of our languages and to open up what Commons has to offer. That is how to market free content the wiki way.
    • In Wikidata we know many more concepts than are known in any individual Wikipedia. We could use our data to inform, as we have done for years in multilingual tools like Reasonator. This is an example in English, Russian, Chinese and Kannada. NB: it takes additional labels to improve results, and consequently this is the inclusive approach.
    • If Wikipedians were willing to reflect on their own performance, we could help them solve their false-friend issues.
One sketch in the sketchbook is of a presentation by Jess Wade. It says that even academia is biased. As the Wikimedia community, we do not need to be subservient to any bias, and most certainly not the bias that Wikipedia has brought us.

Run Selenium tests using Quibble and Docker

14:35, Wednesday, 04 March 2020 UTC

Dependencies are Git, Python 3, and Docker Community Edition (CE).

First, the general setup.

$ git clone https://gerrit.wikimedia.org/r/p/integration/quibble
...
       
$ cd quibble/

$ python3 -m pip install -e .
...

$ docker pull docker-registry.wikimedia.org/releng/quibble-stretch:latest
...
(2m 26s)

The simplest, and slowest, way to run Quibble.

$ docker run -it --rm \
 docker-registry.wikimedia.org/releng/quibble-stretch:latest
...
(12m 54s)

Speed things up by using local repositories.

$ mkdir -p ref/mediawiki/skins

$ git clone --bare https://gerrit.wikimedia.org/r/mediawiki/core ref/mediawiki/core.git
...
(3m 40s)

$ git clone --bare https://gerrit.wikimedia.org/r/mediawiki/vendor ref/mediawiki/vendor.git
...

$ git clone --bare https://gerrit.wikimedia.org/r/mediawiki/skins/Vector ref/mediawiki/skins/Vector.git
...

$ mkdir cache
$ chmod 777 cache

$ mkdir -p log
$ chmod 777 log

$ mkdir -p src
$ chmod 777 src

$ docker run -it --rm \
  -v "$(pwd)"/cache:/cache \
  -v "$(pwd)"/log:/workspace/log \
  -v "$(pwd)"/ref:/srv/git:ro \
  -v "$(pwd)"/src:/workspace/src \
  docker-registry.wikimedia.org/releng/quibble-stretch:latest
...
(18m 0s)

The second run of everything, just to see if things get faster.

$ docker run -it --rm \
  -v "$(pwd)"/cache:/cache \
  -v "$(pwd)"/log:/workspace/log \
  -v "$(pwd)"/ref:/srv/git:ro \
  -v "$(pwd)"/src:/workspace/src \
  docker-registry.wikimedia.org/releng/quibble-stretch:latest
...
(16m 50s)

If you get this error message

A LocalSettings.php file has been detected. To upgrade this installation, please run update.php instead

just remove the file

$ rm src/LocalSettings.php

Speed things up by skipping Zuul and not installing dependencies.

$ docker run -it --rm \
  -v "$(pwd)"/cache:/cache \
  -v "$(pwd)"/log:/workspace/log \
  -v "$(pwd)"/ref:/srv/git:ro \
  -v "$(pwd)"/src:/workspace/src \
  docker-registry.wikimedia.org/releng/quibble-stretch:latest --skip-zuul --skip-deps
...
(6m 17s)

Speed things up by just running Selenium tests.

$ docker run -it --rm \
  -v "$(pwd)"/cache:/cache \
  -v "$(pwd)"/log:/workspace/log \
  -v "$(pwd)"/ref:/srv/git:ro \
  -v "$(pwd)"/src:/workspace/src \
  docker-registry.wikimedia.org/releng/quibble-stretch:latest --skip-zuul --skip-deps --run selenium
...
(1m 19s)

"Building with Nature" .. a case for a beaver solution

11:37, Tuesday, 03 March 2020 UTC
The Markermeer is a lake with an ecological problem: the water is cloudy, and plants and mussels do not grow. To alleviate that problem, the Marker Wadden were developed, and the same "building with nature" concepts are being used to future-proof the Houtribdijk; the extensive water features will enable the growth of plants, and the intended result is not only that the water will be clear again but also that the dyke will better withstand future storms.

With ecology already part of the solution, it is worth considering ecology for the remaining open issues as well. There are two: geese and willows. So far, geese are kept at bay in some areas with fences, and young willows are rooted out by volunteers.

When willows are allowed to grow, they will mature quickly and enable the next stage of ecological succession. Their wood and bark provide food and building material for beavers, and this makes for an even more robust defense against storm damage. Some trees will mature anyway, and these provide natural nesting places for white-tailed eagles. Given that the wels catfish is native to the Markermeer, it will find its place among the Marker Wadden, and it may even prey on the overabundant geese.

So given that Natuurmonumenten, the organisation looking after the Marker Wadden, is happy to have beavers on its terrain, maybe it is the "building with nature" engineers who have to consider succession in their deliberations.
Thanks,
      GerardM

Perf Matters at Wikipedia 2015

21:43, Monday, 02 March 2020 UTC

Hello, WANObjectCache

This year we achieved another milestone in our multi-year effort to prepare Wikipedia for serving traffic from multiple data centres.

The MediaWiki application that powers Wikipedia relies heavily on object caching. We use Memcached as a horizontally scaled key-value store, and we’d like to keep the cache local to each data centre. This minimises dependencies between data centres, and makes better use of storage capacity (based on local needs).

Aaron Schulz devised a strategy that makes MediaWiki caching compatible with the requirements of a multi-DC architecture. Previously, when source data changed, MediaWiki would recompute and replace the cache value. Now, MediaWiki broadcasts “purge” events for cache keys. Each data centre receives these and sets a “tombstone”, a marker lasting a few seconds that limits any set-value operations for that key to a minuscule time-to-live. This makes it tolerable for recache-on-miss logic to recompute the cache value using local replica databases, even though they might have several seconds of replication lag. Heartbeats are used to detect the replication lag of the databases involved during any re-computation of a cache value. When that lag is more than a few seconds (a large portion of the tombstone period), the corresponding cache set-value operation automatically uses a low time-to-live. This means that large amounts of replication lag are tolerated.

This and other aspects of WANObjectCache’s design allow MediaWiki to trust that cached values are not substantially more stale than a local replica database, provided that cross-DC broadcasting of tiny in-memory tombstones is not disrupted.
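
As a rough illustration, the recache-on-miss pattern looks something like the following minimal sketch, adapted from the WANObjectCache documentation; the key name, table, and TTL here are illustrative, not actual Wikipedia code:

use Wikimedia\Rdbms\Database;

$catInfo = $cache->getWithSetCallback(
    // Cache key, and how long a computed value may live
    $cache->makeKey( 'cat-attributes', $catId ),
    $cache::TTL_MINUTE,
    function ( $oldValue, &$ttl, array &$setOpts ) use ( $catId ) {
        $dbr = wfGetDB( DB_REPLICA );
        // Record the replica's lag; if it is too high, WANObjectCache
        // automatically lowers the effective TTL of this cache set
        $setOpts += Database::getCacheSetOptions( $dbr );
        return $dbr->selectRow( 'cat', '*', [ 'cat_id' => $catId ], __METHOD__ );
    }
);

// When the source data changes, broadcast a purge; each data centre
// then sets the short-lived "tombstone" described above
$cache->delete( $cache->makeKey( 'cat-attributes', $catId ) );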


First paint time now under 900ms

In July we set out a goal: improve page load performance so our median first paint time would go down from approximately 1.5 seconds to under a second – and stay under it!

I identified synchronous scripts as the single biggest task blocking the browser between the start of a page navigation and the first visual change seen by Wikipedia readers. We had used async scripts before, but converting these last two scripts to be asynchronous was easier said than done.

There were several blockers to this change, including the use of embedded scripts by interactive features. These were partly migrated to CSS-only solutions. For the other features, we introduced the notion of “delayed inline scripts”. Embedded scripts now wrap their code in a closure and add it to an array. After the module loader arrives, we process the closures from the array and execute the code within.

Another major blocker was the subset of community-developed gadgets that didn’t yet use the module loader (introduced in 2011). These legacy scripts assumed a global scope for variables, and depended on browser behaviour specific to serially loaded, synchronous, scripts. Between July 2015 and August 2015, I worked with the community to develop a migration guide. And, after a short deprecation period, the legacy loader was removed.

Line graph that plots the firstPaint metric for August 2015. The line drops from approximately one and a half seconds to 890 milliseconds.

Hello, WebPageTest

Previously, we only collected performance metrics for Wikipedia from sampled real-user page loads. This is super and helps detect trends, regressions, and other changes at large. But, to truly understand the characteristics of what made a page load a certain way, we need synthetic testing as well.

Synthetic testing offers frame-by-frame video captures, waterfall graphs, performance timelines, and above-the-fold visual progression. We can run these automatically (e.g. every hour) for many urls, on many different browsers and devices, and from different geo locations. These tests allow us to understand the performance, and analyse it. We can then compare runs over any period of time, and across different factors. It also gives us snapshots of how pages were built at a certain point in time.

The results are automatically recorded into a database every hour, and we use Grafana to visualise the data.

In 2015 Peter built out the synthetic testing infrastructure for Wikimedia, from scratch. We use the open-source WebPageTest software. To read more about its operation, check Wikitech.


The journey to Thumbor begins

Gilles evaluated various thumbnailing services for MediaWiki. The open-source Thumbor software came out as the most promising candidate.

Gilles implemented support for Thumbor in the MediaWiki-Vagrant development environment.

To read more about our journey to Thumbor, read The Journey to Thumbor (part 1).


Save timing reduced by 50%

Save timing is one of the key performance metrics for Wikipedia. It measures the time from when a user presses “Publish changes” when editing until the user’s browser starts to receive a response. During this time, many things happen. MediaWiki parses the wiki-markup into HTML, which can involve page macros, sub-queries, templates, and other parser extensions. These inputs must be saved to a database. There may also be some cascading updates, such as the page’s membership in a category. And last but not least, there is the network latency between the user’s device and our data centres.

This year saw a 50% reduction in save timing. At the beginning of the year, median save timing was 2.0 seconds (quarterly report). By June, it was down to 1.6 seconds (report), and in September 2015, we reached 1.0 seconds! (report)

Line graph of the median save timing metric, over 2015. Showing a drop from two seconds to one and a half in May, and another drop in June, gradually going further down to one second.

The effort to reduce save timing was led by Aaron Schulz. The impact that followed was the result of hundreds of changes to MediaWiki core and to extensions.

Deferring tasks to post-send

Many of these changes involved deferring work to happen post-send. That is, after the server sends the HTTP response to the user and closes the main database transaction. Examples of tasks that now happen post-send are: cascading updates, emitting “recent changes” objects to the database and to pub-sub feeds, and doing automatic user rights promotions for the editing user based on their current age and total edit count.
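
In MediaWiki code, this kind of deferral goes through the DeferredUpdates class. A minimal sketch follows; the promotion call inside the closure is an illustrative example of the kind of work that gets deferred, not a verbatim excerpt:

// Runs after the HTTP response has been flushed to the client and
// the main database transaction has been committed
DeferredUpdates::addCallableUpdate( function () use ( $user ) {
    // e.g. re-evaluate the editing user's account age and edit count,
    // promoting them to additional user groups if thresholds are met
    $user->addAutopromoteOnceGroups( 'onEdit' );
} );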

Aaron also implemented the “async write” feature in the multi-backend object cache interface. MediaWiki uses this for storing the parser cache HTML in both Memcached (tier 1) and MySQL (tier 2). The second write now happens post-send.

By re-ordering these tasks to occur post-send, the server can send a response back to the user sooner.

Working with the database, instead of against it

A major category of changes were improvements to database queries. For example, reducing lock contention in SQL, refactoring code in a way that reduces the amount of work done between two write queries in the same transaction, splitting large queries into smaller ones, and avoiding use of database master connections whenever possible.

These optimisations reduced the chances of queries stalling and allowed them to complete more quickly.

Avoid synchronous cache re-computations

The aforementioned work on WANObjectCache also helped a lot. Whenever we converted a feature to use this interface, we reduced the amount of blocking cache computation that happened mid-request. WANObjectCache also performs probabilistic preemptive refreshes of near-expiring values, which can prevent cache stampedes.

Profiling can be expensive

We disabled the performance profiler of the AbuseFilter extension in production. AbuseFilter allows privileged users to write rules that may prevent edits based on certain heuristics. Its profiler would record how long the rules took to inspect an edit, allowing users to optimise them. The way the profiler worked, though, added a significant slowdown to the editing process. Work began later in 2016 to create a new profiler, which has since been completed.

And more

Lots of small things, including fixing the User object cache, which existed but wasn’t working, and avoiding caching values in Memcached when computing them is faster than the Memcached latency required to fetch them!

We also improved the latency of file operations by switching more LBYL-style coding patterns to EAFP-style code. Rather than checking whether a file exists and is readable, and then checking when it was last modified, do only the latter and handle any errors. This is both faster and more correct (it avoids LBYL race conditions).
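
In PHP terms, the difference looks roughly like this (a simplified sketch, not actual MediaWiki code):

// LBYL ("look before you leap"): three filesystem calls, with a race
// window in which the file can vanish between the checks and the use
if ( file_exists( $path ) && is_readable( $path ) ) {
    $mtime = filemtime( $path );
}

// EAFP ("easier to ask forgiveness than permission"): attempt the
// operation once and handle the error case directly
$mtime = @filemtime( $path ); // false if missing or unreadable
if ( $mtime === false ) {
    $mtime = null; // treat as "no such file"
}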


So long, Sajax!

Sajax was a library for invoking a subroutine on the server, and receiving its return value as JSON from client-side JavaScript. In March 2006, it was adopted in MediaWiki to power the autocomplete feature of the search input field.

The Sajax library had a utility for creating an XMLHttpRequest object in a cross-browser-compatible way. MediaWiki deprecated Sajax in favour of jQuery.ajax and the MediaWiki API. Yet, years later in 2015, this tiny part of Sajax remained popular in Wikimedia's ecosystem of community-developed gadgets.

The legacy library was loaded by default on all Wikipedia page views for nearly a decade. During a performance inspection this year, Ori Livneh decided it was high time to finish this migration. Goodbye Sajax!


Further reading

This year also saw the switch to encrypt all Wikimedia traffic with TLS by default. More about that on the Wikimedia blog.

— Timo Tijhof


Mentioned tasks: T107399, T105391, T109666, T110858, T55120.

I read “The impact of user interface on young children’s computational thinking” (pdf) by Sullivan, Bers, and Pugnali (2017). Wonderful paper! The authors compare a tangible interface for robotics programming with a non-tangible one. The tangible interface is KIBO, a robotic kit programmed with wooden blocks, basically making Scratch physical. (I totally love KIBO, so I’m biased […]

Tech News issue #10, 2020 (March 2, 2020)

00:00, Monday, 02 March 2020 UTC
2020, week 10 (Monday 02 March 2020)

Semantic MediaWiki 3.1.5 released

17:17, Sunday, 01 March 2020 UTC

February 29, 2020

Semantic MediaWiki 3.1.5 (SMW 3.1.5) has been released today as a new version of Semantic MediaWiki.

It is a maintenance release providing bug fixes. Please refer to the help pages on installing or upgrading Semantic MediaWiki to get detailed instructions on how to do this.

Production Excellence: January 2020

16:16, Sunday, 01 March 2020 UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

📊 Month in numbers
  • 3 documented incidents. [1]
  • 26 new Wikimedia-prod-error reports. [2]
  • 26 Wikimedia-prod-error reports closed. [3]
  • 198 currently open Wikimedia-prod-error reports in total. [4]

To read more about these incidents and pending actionables, check Incident documentation § 2020, or Explore Wikimedia incident stats (interactive).


📖 Paradoxical array key

During the upgrade from HHVM to PHP 7.2, Wikimedia encountered several Zend engine bugs that could corrupt a PHP program at run time. (Some of these bugs are still being worked on.) One of the bugs we fixed last month was particularly mysterious. Investigation was led by @hashar and @tstarling.

MediaWiki would create an array in PHP and add a key-value pair to it. We could iterate this array, and see that our key was there. Moments later, if we tried to retrieve the key from that same array, sometimes the key would no longer exist!
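
In simplified form, the symptom looked something like the following; this is an illustration of the behaviour, not the actual MediaWiki code:

$map = [ '' => 'some value' ]; // add a key-value pair
foreach ( $map as $key => $value ) {
    // The empty-string key is visible while iterating...
    var_dump( $key === '' ); // bool(true)
}
// ...yet a direct lookup of that same key would sometimes fail,
// because the interned empty string had been freed prematurely
var_dump( isset( $map[''] ) ); // expected true; sometimes false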

After many ad-hoc debug logs, core dumps, and GDB sessions, the problem was tracked down to the string interning system of Zend PHP. String interning is a memory reduction technique. It means we only store one copy of a character sequence in RAM, even if many parts of the code use the same character sequence. For example, the words “user” and “edit” are frequently used in the MediaWiki codebase. One of those sequences is the empty string (“”), which is also used a lot in our code. This is the string we found disappearing most often from our PHP arrays. This bug affected several components, including Wikibase, the wikimedia/rdbms library, and ResourceLoader.

Tim used a hardware watchpoint in GDB, and traced the root cause to the Memcached client for PHP. The php-memcached client would “free” a string directly from the internal memory manager after doing some work. It did this even for “interned” strings that other parts of the program may still be depending on.

@jijiki and @Joe backported the upstream fix to our php-memcached package and deployed it to production. Thanks! — T232613


📉 Outstanding reports

Take a look at the workboard and look for tasks that might need your help. The workboard lists error reports, grouped by the month in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Breakdown of recent months (past two weeks not included):

  • March: 3 of 10 reports left (unchanged). ⚠️
  • April: Two reports closed, 4 of 14 left.
  • May: (All clear!)
  • June: Two reports closed, 4 of 11 left.
  • July: Four reports closed, 8 of 18 left.
  • August: 4 of 14 reports left (unchanged).
  • September: One report closed, 8 of 12 left.
  • October: 8 of 12 left (unchanged).
  • November: 5 of 5 left (unchanged).
  • December: Three reports closed, 6 of 9 left.
  • January: 7 new reports survived the month of January.

There are a total of 57 reports filed in recent months that remain open. This is down from 62 last month.


🎉 Thanks!

Thank you to everyone who helped by reporting, investigating, or resolving problems in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof


Footnotes:

[1] Incidents. –
wikitech.wikimedia.org/wiki/Incident_documentation#2019

[2] Tasks created. –
phabricator.wikimedia.org/maniphest/query…

[3] Tasks closed. –
phabricator.wikimedia.org/maniphest/query…

[4] Open tasks. –
phabricator.wikimedia.org/maniphest/query…

weeklyOSM 501

11:42, Sunday, 01 March 2020 UTC

18/02/2020-24/02/2020

Lead picture: the source code of the blender-osm addon for Blender 3D is back on GitHub | © vvoovv | © map data OpenStreetMap contributors

About us

  • We would like to thank everyone who sent a message on the occasion of issue 500 of weeklyOSM via blog post, e-mail, comment on the issue, Twitter or even in person. Please help inform the community about YOUR OSM activity by using our guest mode.

Mapping

  • Joseph Eisenberg is asking for comments on his proposal of amenity=motorcycle_taxi, a specific tag for a place where motorcycle taxis wait for passengers. (Nabble)
  • Victor started a discussion asking if the presence or absence of signs for surveillance cameras should be tagged camera:signed=yes/no, as some countries require a notification if a public space is being filmed. (Nabble)
  • Long-time OSM contributor Nick Whitlegg announced that he is shutting down his England and Wales footpath mapping site FreeMap. The site started out in the very early days of OSM (or even slightly before), and the announcement prompted Richard Fairhurst to point out a fascinating old blog post on his own mapping work antecedent to the creation of OpenStreetMap (geowiki.com) and other aspects of OSM prehistory.
  • dktue created the page autobahnkilometer.tirol on 17 February 2020 showing highway milestones in Tyrol (Austria) on a map. On the day of publication no milestones had been mapped in OSM; now most of them have been mapped.
  • Thejesh GN wrote an introduction to mapping surveillance cameras in OSM, and how to map them with OSMAnd.
  • HOT has shared details of its collaboration with Microsoft on creating detailed maps of Tanzania and Uganda via satellite mapping, machine learning and community mappers.
  • Knotenpunktsysteme (literally ‘junction point systems’) have been set up in a number of European countries to help people navigate their cycling networks by numbering the junctions and providing extensive signage between them. Robert Grübler reported (automatic translation) on the work undertaken to map the Knotenpunktsystem of the cycle network in Murtal, Austria.

Community

  • If you know a person or a project who deserves to be awarded for their contribution to OSM you can submit a nomination for the OSM-Awards 2020. Ilya Zverev asks that nominations be made before 10 May 2020.
  • Fabien Delolmo will be holding (automatic translation) a technical workshop on programming a mobile app that uses and updates OpenStreetMap data. The workshop will be held on 6 March in Grenoble, France.
  • Google announced the open source projects and organisations which have been accepted for this year’s Google Summer of Code (GSoC) program. OSM is amongst the 200 accepted open source projects. GSoC will accept student applications from 16 to 31 March 2020.
  • SomeoneElse is experimenting with adding information to established icons. Amongst other things he explains in his blog post how an icon could be enriched with wheelmap accessibility information.
  • The Tobler Society is an up-and-coming YouthMappers Chapter at the University of Chicago. They tweeted about their Mapathon, held last week. Several times a year, they host mapathons through YouthMappers’ humanitarian projects and last year students started the first chapter of YouthMappers in Chicago.

OpenStreetMap Foundation

  • Simon Poole announced a new version of the Attribution Guideline, which incorporates feedback from the public discussion in August last year. The draft will be forwarded to the OSMF board for formal approval.

Events

  • Ilya Zverev reminds us that State of the Map Baltics 2020 will take place on 6 March 2020 at the University of Latvia.
  • Lars Linger invites you to another OSM Hackweekend on 28 and 29 March 2020 in Berlin. The event, which has been running for at least 7 years, will be hosted at Wikimedia’s offices and is supported by the FOSSGIS, the local chapter of the OSMF.
  • Ed Freyfogle, co-founder of OpenCage, and Steven Feldman, of KnowWhere Consulting, have started a podcast series about geo-related themes.

Humanitarian OSM

  • HOT has tweeted that all six HOT Microgrants candidates from Sierra Leone, Iraq, South Sudan, the Philippines, Bolivia and Peru have been fully funded by the community.

Maps

  • Do you like drawing on maps? Are you sick of having to do it on a paper map with a crayon like some sort of caveman? Then Gribrouillon (automatic translation) by Adrien Pavie is for you. Use it to draw on an OSM basemap and share with friends.
  • Andrei Kashcha announced the update of his ‘obscure’ 2D WebGL renderer. In his tweet he shows a preview of the new 3D rendering feature.
  • The BBC dedicated an article to the misuse of a map to share misleading information on the global spread of coronavirus.
  • Some mappers expressed (de) (automatic translation), in the German OSM forum, their disappointment about the poor rendering performance of OSM’s main map.

switch2OSM

  • Tempus is a transport company in the Südeifel region of Germany. With the community’s help, it has entered all of its bus lines (de) into OpenStreetMap, and built individual maps on the line pages of its website using Leaflet and Overpass.

Open Data

  • OpenDataFrance has launched (fr) (automatic translation) the GeoDataMine extraction tool. It allows local authorities to quickly extract data relating to their territory from the OpenStreetMap database.

Software

  • OSMCha is switching its domain to https://osmcha.org. Users will be redirected from the previous domain to the new one, but developers that use the API need to change their applications to avoid problems. See more on https://www.openstreetmap.org/user/wille/diary/392289.
  • The source code for blender-osm, an add-on to render OSM data in 3D with the free and open 3D creation software Blender, is back on Github as vvoovv announced on his user diary.

Programming

  • The Open Geospatial Consortium (OGC) has announced the publication of reports documenting the work of the Open Routing API Pilot, including implementations developed and a prototype Routing API and Route Exchange Model to address location interoperability challenges with various tools available from many providers.
  • The Open Geospatial Consortium (OGC) is considering adopting CityJSON as an official OGC Community Standard. A new work item justification to begin the Community Standard endorsement process is available for public comment.

Releases

  • The JOSM team has released stable version 20.02 (release 15915) of their OSM editor. The new version brings back access to Maxar’s imagery, adds links to Geofabrik’s regional/national Taginfo instances, improves JOSM’s autofilters, and has many more new features and improvements.

Did you know …

  • … the traffic_sign tag? It allows the location and type of traffic signs to be tagged. This is especially helpful if you are unsure about how to tag the effect of the sign and helps other mappers to make corrections if necessary.

Other “geo” things

  • Omar Rizwan tweeted the personalised public transport map of Auckland, New Zealand, which he made for his parents.
  • Five years ago, Dāvis Viļums set out to ride every street in central London. He explains the why and how, and presents a time-lapse video of the record of all his rides as they reveal London’s street grid.
  • Tom Forth (ODI Leeds) has a long Twitter thread (fr) about using open data to map income disparities in the UK and France.
  • Carlos Jurado Rivera was featured in an article (automatic translation) in La Voz de Almería. Carlos describes how he came to be interested in mapping the cycling infrastructure of Almería (Spain), why he chose OpenStreetMap, and his mapping style.
  • Cartographers have been known to occasionally sneak a personal joke into their work. Zoey Poll reports on how even Swisstopo has played host to a small cartographical joke or two.
  • Researchers from Heidelberg University’s GIScience Research Group have published a paper on the role of spatially structured scoring systems as a motivational element in location-based games. They found that players who are confronted with a spatially structured scoring system are more likely to have a longer playing time, walk longer distances, and be more willing to take detours than those facing a spatially random scoring system.
  • The kick-off meeting for LOKI (Airborne Observation of Critical Infrastructure) was hosted by Heidelberg University on 12 February. The aim of the LOKI project is to develop a system that enables fast and reliable airborne situation assessments following an earthquake.
  • The imagery provider Mapillary is now available in ArcGIS Urban, ESRI’s tool for smart city planning, providing imagery with street-level perspective.

Upcoming Events

Where What When Country
Toulouse Contrib’atelier OpenStreetMap 2020-02-29 France
Budapest Budapest gathering 2020-03-02 Hungary
London Missing Maps London 2020-03-03 United Kingdom
Hanover OSM-Sprechstunde 2020-03-04 Germany
Stuttgart Stuttgarter Stammtisch 2020-03-04 Germany
Praha/Brno/Ostrava Kvartální pivo 2020-03-04 Czech Republic
Arlon Atelier ouvert OpenStreetMap 2020-03-04 Belgium
Dortmund Mappertreffen 2020-03-06 Germany
Riga State of the Map Baltics 2020-03-06 Latvia
Amagasaki GPSで絵を描いてみようじゃあ~りませんか 2020-03-07 Japan
Rennes Réunion mensuelle 2020-03-09 France
Grenoble Rencontre mensuelle 2020-03-09 France
Taipei OSM x Wikidata #14 2020-03-09 Taiwan
Toronto Toronto Mappy Hour 2020-03-09 Canada
Hamburg Hamburger Mappertreffen 2020-03-10 Germany
Lyon Rencontre mensuelle 2020-03-10 France
Zurich 115. OSM Meetup Zurich 2020-03-11 Switzerland
Hanover OSM-Sprechstunde 2020-03-11 Germany
Freiburg FOSSGIS-Konferenz 2020-03-11-2020-03-14 Germany
Nantes Rencontre mensuelle 2020-03-12 France
Berlin 141. Berlin-Brandenburg Stammtisch 2020-03-12 Germany
Ulmer Alb Stammtisch Ulmer Alb 2020-03-12 Germany
Chemnitz Chemnitzer Linux-Tage 2020-03-14-2020-03-15 Germany
Nottingham Nottingham pub meetup 2020-03-17 United Kingdom
Lüneburg Lüneburger Mappertreffen 2020-03-17 Germany
Cologne Bonn Airport 127. Bonner OSM-Stammtisch 2020-03-17 Germany
Hanover OSM-Sprechstunde 2020-03-18 Germany
Hanover Stammtisch Hannover 2020-03-19 Germany
Munich Münchner Treffen 2020-03-19 Germany
San José Civic Hack & Map Night 2020-03-19 United States
Valcea EuYoutH OSM Meeting 2020-04-27-2020-05-01 Romania
Guarda EuYoutH OSM Meeting 2020-06-24-2020-06-28 Spain
Cape Town HOT Summit 2020-07-01-2020-07-02 South Africa
Cape Town State of the Map 2020 2020-07-03-2020-07-05 South Africa

Note: if you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Elizabete, PierZen, Polyglot, Rogehm, SK53, Sammyhawkrad, SunCobalt, TheSwavu, YoViajo, derFred, geologist, jinalfoflia, Can.

Trying something new in an already popular class

18:03, Friday, 28 February 2020 UTC

“I was hesitant about shaking up a popular class that had worked well for years, but one statistic finally convinced me. … And this assignment quickly took on a much deeper significance than a research paper.”

Dr. Stephennie Mulder at University of Texas at Austin decided to scrap the research paper for her Fall 2018 term and have students collaborate and rewrite Wikipedia pages. “It ended up better than I could have imagined and transformed how I think about teaching,” she shared over Twitter when the course wrapped up.

She has now taught a Wikipedia writing assignment in two of her Islamic art history courses, once in Fall 2018 and again in Fall 2019. One student from last fall uploaded a photo of metalworking detail to the Wikipedia article about the history of metallurgy in Mosul. Another created a new article for an object of Islamic art by translating the one on the French Wikipedia. In this most recent course, 26 students added 154 references to 9 articles. Overall, a great success for representing topics related to Islamic art history on Wikipedia.

Dr. Mulder shared her reasons for incorporating the Wikipedia writing assignment again via Twitter:

“I was hesitant about shaking up a popular class that had worked well for years,” Dr. Mulder reflected after that first term. “But one statistic finally convinced me: 90% of Wikipedia articles are written by men – and largely by men from Euro-American contexts,” she wrote, citing an article in the Guardian.

“The oft-maligned Wikipedia is now the world’s most frequently-consulted source of information and fifth-most visited website. Because anyone can be an editor, it’s on us to make it the best resource it can be. If you think Wikipedia is bad, look in the mirror. Wikipedia is us.”

“So this assignment quickly took on a much deeper significance than a research paper. It required [students] to develop many of the same research and writing skills as a research paper, but it also came to have a strong civics and social justice component, and the students responded.”

“To emphasize the collaborative, real-world applicability of the assignment, I set it up as a partnership with a museum and another Islamic art class via my brilliant colleagues Dr. Leslee Michelsen and Dr. Alex Dika Seggerman.”

“The assignment asks [students] to imagine they’re hired as consultants for a major museum to improve the public’s knowledge on Islamic art, in advance of an exhibition. Their task? To assess an article for the quality of its sources and content and then research, write, and improve it.”

“The assignment was set up on Wiki Education’s terrific website. Students had to complete a training module/week. Through this and discussion in class they learned how to determine quality of sources, the value of peer review, and improved research methods.”

Dr. Mulder also called upon materials from Art & Feminism, as well as the help of librarians at UT Austin.

“I also spent a day in the library with students, walking them through basic research methods. I thought this might be too basic, but the day walking them through the physical library was inspired by a comment made by one of our brightest Honors Art History undergraduates in a meeting of the UT Provost’s Libraries Task Force (on which I currently serve): ‘Many faculty and librarians don’t realize that it’s not that undergraduates don’t want to use the libraries, it’s that we are SCARED of the libraries.’ She explained that many students no longer receive the basic training needed to know how to use the libraries. So they don’t.”

“Just getting them physically in the library was revelatory. They were astonished at the range and quality of sources that did not turn up on a Google search. It really lit a fire under them, sparked their curiosity and made them want to dive in. This was revelatory for me too!”

“We talked about what a privilege it is to have access to depth and reliable quality of knowledge in a top research library. We talked about how access to knowledge controls the way we see the world and our place in it. Again and again, students spoke about feeling empowered.”

What I really admired about this assignment was how it contributed to the world. The idea that the gist of my research from quality sources not accessible to everyone will now be available for all to see makes me so happy. Even though I was really nervous about making edits, not being a historian of Islamic art, I appreciate how Wikipedia encouraged me to “be bold” and not be afraid to contribute. I look forward to seeing the edits on our page in the future!
A student reflection after completing the Wikipedia assignment

“Yesterday we met for an upload party. I wasn’t sure how it would go, but it turned out to be one of my favorite days as a teacher EVER. I am so proud of my students and of their hard work. I am amazed at how seriously they took the assignment and what a terrific job they did.”

Students in Dr. Mulder’s course.
Photo by Dr. Mulder, rights reserved.

“There were challenges: the sometimes-difficult collaborative aspect of a group assignment, the fact that some topics had few sources, the challenge of learning to write in a neutral, unbiased style. But they had something real on the line: an article out there in the world. And in the end, these smart students have now contributed 15 better-sourced, better-written articles to the most frequently-consulted body of knowledge in the world. And they’ve learned key research and writing skills that will be applicable in a range of fields.”

That number has now risen to 29!

“In short, this assignment did all a traditional research paper does: deepening knowledge, research, and writing skills – but it also left them feeling empowered to be active contributors: shaping global knowledge and redressing imbalances that shape the way we see the world.”

“So if you’ve ever wanted to have students do a Wikipedia article writing project, I would encourage you to jump in and follow Wikipedia’s motto: Be Bold. Your students will learn, and you will all feel like you made the world just a tiny bit better place to be.”


To incorporate a Wikipedia writing assignment into an upcoming course, visit teach.wikiedu.org for access to free resources and assignment templates.


Image detail: Seated Woman with a Book, India 20th century. Doris Duke Foundation for Islamic Art, 11.1.5

Balancing arguments - Gender and the #Wikimedia projects

17:56, Friday, 28 February 2020 UTC
Some say gender is important because there is a serious imbalance in the coverage of people on Wikipedia, and many people dedicate their time to bringing some balance by writing Wikipedia articles. At the same time it is important to be cognizant of the fact that gender is not binary; the point this brings is that when you write an article, you need a source to know what gender a person identifies with.

So far so good. At Wikidata, other things are at play. It is vital to understand that a Wikidata item is not so much about an individual. When recipients of an award are included, as for "Member of the Hassan II Academy of Sciences and Technologies", there is often nothing more to an item than a Moroccan who received an award because a source says so. Determining a gender relies on googling for images of the person, and when the name is decidedly male, like Omar, Hakim or Mustapha, the gender is implied.

Why include a gender? Because projects like Women in Red rely on prospects to write articles about, and because tools like Scholia express what we know about all the recipients of an award. Scholia tells us that there are currently two ladies known and 22 gentlemen. We know nothing of their work, because the bias against Africa is staggering and because the rate of inclusion at Wikidata is abysmal.

The argument for not including gender is often based on what people expect: "Wikidata contains large sets of data, so it makes no statistical difference one way or the other." The reality, however, is that when you consider the use of the data in, for instance, Scholia, the subsets are small. One more fine lady makes a statistical difference.

When people write about a person for a Wikipedia, they get to know the person; they have multiple sources at hand. At Wikidata, not so much. One purpose of adding people is to nibble away at our bias.

Requiring sources to indicate gender takes away the usefulness of the data and is counterproductive when we are talking about bias. To me it is a Wikipedia argument, an article-based argument, and it is counterproductive to translate it to the set-based approach of Wikidata.
Thanks,
       GerardM

Reading Wikipedia’s editorial culture

22:14, Wednesday, 26 February 2020 UTC

With three terms of Wikipedia writing assignment experience already under his belt, Dr. Josh DiCaglio reflects on having students understand Wikipedia’s editing culture by participating in it. He is an assistant professor in the department of English at Texas A&M University.

Joshua DiCaglio
Image by Etherfire, CC BY-SA 4.0 via Wikimedia Commons.

Although many instructors have used Wikipedia in class projects, these assignments usually focus on writing new content in a subject related to the course or remedying imbalances in editing. While these are excellent pedagogical tasks and essential for Wikipedia’s mission, my assignment focuses on the general editorial situation presented by Wikipedia. I ask students to take existing content, which is often a patch-work of years of different contributions and changes representing different interests and backgrounds, and intervene in the page in a way that will align it with Wikipedia conventions and facilitate future contributions. They are encouraged not to produce new content but to work with what is already there, with the assumption that other content-matter experts will come along (probably long after they are done with this project) to assist in working up the page. Doing so forces students to learn the elaborate conventions of Wikipedia, join and react to a live community of writers, figure out ways to facilitate an on-going writing process, and experience editing in an online setting—including using Wikipedia’s markup language. The public setting amplifies the stakes: every change is live for the whole world to see (4.12 million people viewed the pages students worked on during the 2019 project), every communication open for response, and every alteration a permanent part of the evolution of the articles they edit.

This assignment provides students with direct experience with a live and dynamic editorial situation that shares many of the attributes of editing in a technological environment—working with existing content, accommodating limits according to a prescribed genre, managing multiple writers that may or may not be present and may have contrasting uses, considering multiple audiences, accounting for past writing, and facilitating future work on an evolving document. One of the most valuable and elusive skills is learning and accommodating an existing editing culture. In fact, the original assignment was inspired when some friends of mine asked me to assist in putting together a page for someone they held worthy of a Wikipedia page. To our surprise, we received quick and direct lessons in the difficulties of writing biographies of living persons, the constraints of Wikipedia’s policies (especially Neutral Point of View, Notability, and Verifiability), and the procedures that Wikipedians use to vet content. This provided a valuable lesson in learning to read an editorial culture, one that I wanted to pass along to my students.

As I developed the project, I came to see that there is a key skill needed for the continued success of Wikipedia that few beyond seasoned Wikipedia editors may be aware of: the need for advanced editorial sensibilities to perform the elaborate coordination needed to handle the diverse interests and contributions provided to Wikipedia. There are three major scenarios I try to get my students to focus on. The first are slowly developing pages that are the result of years (often well over a decade) of contributions that have been amassed without a global coherence. In these pages, decisions made a decade ago by one user might set the stage for years of edits even if they are not the most sensible organization. Often, there has been little discussion of what the page is actually for and what kind of content should be there (one of my favorite examples is the Concert page: what kind of information should go in a Concert Wikipedia page? What is it for?). The second are pages that have received attention from content-matter experts but may not have the appropriate frame, organization, and style that fits with Wikipedia. These often have very active users who would welcome someone less knowledgeable in the topic, but this requires someone who is not afraid to work with these experts to dig into the writing and organization of the article to make it more accessible and appropriate. Sometimes these kinds of pages have controversies that could use an additional neutral voice to help figure out what might be done next and where the article could go. The third are pages that were created and vetted largely on the work of one contributor but which are not appropriately written for Wikipedia. Most common is the page that is ported from an essay into a Wikipedia page, such as this one on World War 2 US Military Sex Education or this one on Gender in Horror Films (the links are the pages before my students’ interventions). These pages usually have good content but become stalled because so much work is required to turn the content into something appropriate for Wikipedia.

Having students work with such pages provides both an incredibly useful editing scenario and a focused study of the difficulties of editing with a group over time. Many students struggle with the assignment only to return later to speak of how meaningful it was for them. Several students have mentioned referring to it in job interviews and several have remained regular Wikipedia editors since completing the assignment.

The Montague-CTE (Center for Teaching Excellence) Scholars program is an award and grant given at Texas A&M to one assistant professor from each college per year. As part of the award, recipients focus on one particular innovative teaching technique they would like to develop. I received the award (which includes a small stipend in addition to general recognition) to work on the Wikipedia assignment I developed for my Technical and Professional Editing course. My plan is to use this support to enlist an undergraduate in refining the assignment, gathering examples from previous iterations of the course (this semester will be my fourth time running it), and working these materials up into something that can be presented to others at professional conferences (including, potentially, the WikiConference) and published in a composition journal.


To incorporate a Wikipedia writing assignment into an upcoming course, visit teach.wikiedu.org for access to free resources and assignment templates.

Surviving 2020 on Twitter

05:21, Wednesday, 26 February 2020 UTC

I’m a political junkie, perhaps in some ways more now than ever. And yet, I’m posting very little about the 2020 election on Twitter. An old friend with similar political compulsions asked how I’m doing it. The answer is ironically too long for Twitter, so here goes.

Reduced my Twitter political inputs

Step 1 was to simply reduce the amount of political stuff that I see when I go to Twitter. I see all kinds of other wonderful stuff instead! What I did:

  • Unsubscribe from all ‘news’ feeds on Twitter (@nytimes, @cnn, etc.). I use other mechanisms (email, actually visiting a website) to get them daily at most. More generally, I aggressively turn off all news notifications on my phone. If the missiles launch and I need to hug my loved ones, someone will text me.
  • Unsubscribe from people I don’t know personally. For me, that’s basically all celebrities (except Lin-Manuel), but if that sounds too aggressive, you can Marie Kondo your follows with the help of the Tokimeki Unfollow tool. Two (small) exceptions for me:
    • Have they taught me something I didn’t know, because they’re giving me diverse perspectives not in my personal network? That can be troubling/non-joyful, but still valuable.
    • Have they given me opportunities for real-world action that I can’t get some other way? For me, this is primarily local organizations: several San Francisco bike, transit, and YIMBY activists. (I find this almost never to be the case with national media, because the opportunities for practical action are too limited.)
  • Turn off pure retweets with the Turn Off Retweets tool. In my experience, pure retweets are highly likely to be more angry/emotional, and less informative. Yes, there was some FOMO here. I got over it very quickly. If it is important, I see it eventually.
  • Mute (aka filter) political words aggressively. Here are Twitter’s instructions. My word list: all the primary candidates’ names; Trump; President; debate. I’m sure I’ll add more.

(optional) Replace with better news sources

I still feel the need for a lot of politics news. I subscribe to them via non-Twitter mechanisms. This is local as much as possible, or in some cases national news for very specific needs. For example, I still very much feel the need to understand global warming, so that I can target my giving in that space, so I read heated.world and the Washington Post’s Energy 202.

I use Feedbin to subscribe to newsletters (and yes, some RSS feeds still) so that they stay out of my email inbox. (Most of the same ‘aggressively unsubscribe’ applies to my email inbox too…)

(hard, but helpful) come to terms with the world as it is, and act in that framework

At some point in the past few years, I accepted that I’m going to have a baseline level of anger about the state of the world, and that I have to focus on what I can change and let go of what I can’t. (Twitter anger is the latter.) So what can I change? Where is my anger productive?

I’ve found that doing things offline—for me, mostly giving money—really helps. In particular, giving to causes that seek systemic (usually, that means political/government) change like 350.org and local activist groups, and giving a lot, and regularly. This, frankly, makes it a lot easier for me to ignore anger online: each new tweet is not likely to make me more angry, or give more, because I’m already basically giving what I can. Being confident about that really reduced my FOMO when I started filtering aggressively.

I hear from non-parents/non-startup-founders that physical-world activism (door-knocking, phone banking, local gov meeting-attending, etc.) can be great in this way too but sadly I can’t confirm :(

(I also want to acknowledge that, in the current state of the world, ‘letting go’ gets harder the less privilege you have. I have no great response to that, except to say that I empathize and am trying to fight for you where and how I can.)

Improving my outputs

Having done all that, here’s how I try to improve the Twitter environment for others:

  • If I must RT or otherwise share politics news, I only quote tweet. I try to add useful context. What can I add that others can’t? If I can’t add something, if I’m just amplifying anger, I try to shut up instead.
  • If I must be angry, I’ve tried to follow a rule that I only express that online if I am also telling other people who are angry how to constructively address the problem. I don’t just say “I’m so mad about global warming”; I say “I’m mad about global warming, here’s what I’m doing to help fix it, you can too”. If I don’t have a ‘here’s what I’m doing’ to add to it … I go back to ‘figure out what I can do’.

This isn’t perfect

Twitter has genuinely made me a better person, because it has exposed me to viewpoints I don’t encounter in my daily life, and that has made me more empathetic to others. It has changed my politics, making me vastly more open to systemic critiques of US center-left politics. So I’m reluctant to say ‘use it less, particularly for politics’. But I feel like it’s the only way to stay sane in 2020.

Students at Middlesex University researching the Women of Bletchley Park for a Wikipedia editing assignment

This post was written by Ewan McAndrew, Wikimedian in Residence at the University of Edinburgh.

Kindness on the Internet has been much in the news of late and this quote from novelist Henry James stood out to me:

Three things in human life are important: the first is to be kind; the second is to be kind; and the third is to be kind.

I have been working at the University of Edinburgh for over four years now as the Wikimedian in Residence. Four years as of January 2020 in fact, just as Wikipedia itself turned nineteen years old on 15 January 2020. In thinking about this period of my working life, I am reminded of some of the (sometimes) sceptical conversations I have had with (some) academics over the years, but more often than not I recall the enthusiasm, generosity and kindness I have encountered. And I’m reminded also of the words of Katherine Maher, Executive Director of the Wikimedia Foundation, who said that Wikipedia, ultimately, is based on human generosity: the act of editing Wikipedia is a generous act by volunteer editors all around the world, who give their time, their expertise and their passion for a subject in order to improve the knowledge shared openly with the world through this free and open online encyclopedia. And why? Well, because…

Knowledge creates understanding – understanding is sorely lacking in today’s world. – Katherine Maher.

While the residency has been something of an experiment, a proof of concept if you will for hosting a Wikimedian to support the whole university, I am more convinced than ever that there is a clear role, a structural need even, for Wikimedia in teaching and learning.

Yet while I am an employee of the University of Edinburgh, I attended the other place (University of Glasgow) for my undergraduate course, and my postgraduate courses were at Glasgow Caledonian University, University of Strathclyde and Northumbria University. So: four years at the University of Edinburgh and experience of five universities all told. As staff at 74 UK universities go on strike and a national conversation is held about working conditions, casualised contracts and the workloads of staff at universities, it gives pause for thought and reflection on the purpose of education… and its delivery.


Now imagine you are relaxing after work in a sauna at your local swimming pool one evening and a guy called Patrick starts chatting to you and asking what you do for a living. You tell Patrick: why, I’m a Wikimedian at the University of Edinburgh. And Patrick replies… “Cool. What’s Wikipedia got to do with universities?”

Have a think for a moment… what is the link between Wikipedia and Universities? What would you say? How would you answer?

Well Patrick, it’s a fair question. Let’s see.

How about shared vision and mission statements? “The creation, curation and dissemination of knowledge” is built into the University of Edinburgh’s mission, while Wikimedia’s vision is to “Imagine a world in which every single human being can freely share in the sum of all knowledge. That’s our commitment.”

And as Sue Beckingham said in her Association for Learning Technology (ALT) keynote, it’s about engaging with and understanding the relationship we have with the open web: how people create, curate and contest knowledge online, and our relationship with the big digital intermediaries like Facebook, Google, Amazon and Wikipedia, the fifth most visited website in the world.

Then there’s the digital skills aspect. It is widely recognised that digital capabilities are a key component of graduate employability; many reports make this clear. That means supporting students in learning digital research skills, synthesising information and communicating it in a rapidly changing digital world.

And it’s about how we support developing a more robust critical information literacy. In fact, this is just the tip of the iceberg in terms of the areas that working with the free and open Wikimedia projects affords. At its heart is the fact that search is the way we live now, and what’s right or wrong or missing on Wikipedia affects the whole internet. Yet this is how Wikipedia in teaching and learning is often framed: warning students about its use, pros and cons, often with the focus firmly on the cons, as something to be consumed at your peril. Wikipedia in teaching and learning should really spin this on its head: it’s about what you can also contribute as an institution, staff and students, and what you can get out of the teaching and learning experience as a result.

Indeed, the ALT website defines Learning Technology as this:

We define Learning Technology as the broad range of communication, information and related technologies that can be used to support learning, teaching and assessment. Our community is made up of people who are actively involved in understanding, managing, researching, supporting or enabling learning with the use of Learning Technology. We believe that you don’t need to be called ‘Learning Technologist’ to be one.

Wikipedia is learning technology: the largest open knowledge resource in human history, free, open and something anyone can contribute to. Now aged nineteen, as of last month, Wikipedia has truly come of age. It ranks among the world’s top ten sites for scholarly resource lookups and is used extensively by virtually every major platform, receiving over 20 billion views per month from 1.5 billion unique devices. It is the only non-profit website in the top 100 websites; quite simply, “Wikipedia is today the gateway through which millions of people now seek access to knowledge.” (Cronon, 2012)

Ergo… Wikimedians are learning technologists. And a Wikimedian is just someone who has learnt how to train people how to edit, who facilitates editing events and assignments.

Ergo… Learning technologists are Wikimedians or they should be.

Because at the University of Edinburgh, we have quickly generated real examples of technology-enhanced learning activities appropriate to the curriculum and transformed our students, staff and members of the public from being passive readers and consumers to being active, engaged contributors. The result is that our community is more engaged with knowledge creation online and readers all over the world benefit from our teaching, research and collections.

Our Wikimedia in the Curriculum activities bring benefits to the students who learn new skills and have immediate impact in addressing both the diversity of editors and diversity of content shared online:

  • Global Health MSc students add 180-200 words to Global Health related articles e.g. their edits to the page on obesity are viewed 3,000 times per day on average.
  • Digital Sociology MSc students engage in workshops with how sociology is communicated and how knowledge is created and curated online each year.
  • Reproductive Biology Honours – students work in groups in 2 workshops at the beginning of the semester, learning about digital research skills from our Academic Support Librarians so they can work collaboratively to research and publish a new article on a reproductive biomedical term not yet on Wikipedia. One student’s article on high-grade serous carcinoma, one of the most common forms of ovarian cancer, includes 60 references and diagrams she created, and has been viewed over 88,000 times since 2016. That’s impact.
  • Translation Studies MSc students gain meaningful published practice each semester by translating 1,500 words to share knowledge between two different language Wikipedias on a topic of their own choosing from the highest quality articles.
  • World Christianity MSc students spend the semester undertaking a literature review assignment to make the subject much less about White Northern hemisphere perspectives; creating new articles on Asian Feminist Theology, Sub-Saharan Political Theology and more.
  • Data Science for Design MSc – Wikipedia’s sister project, Wikidata, affords students the opportunity to work practically with research datasets, like the Survey of Scottish Witchcraft Database, surface data to the Linked Open Data Cloud, and explore different visualisations and the direct and indirect relationships at play in this semantic web of knowledge to help further discovery (see the example query after this list).
  • This academic year we have also added three more course programmes: Korean Studies MSc, Digital Education MSc (group editing pages related to information literacy), and Global Health Challenges Postgraduate Online (group editing short stub articles on natural disasters). Indeed, we are looking increasingly at how we can support online course programmes, and support discussion, engagement and up-skilling of students on them in a more structured, self-directed way.
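For the Wikidata work in the Data Science for Design course above, a query against the Linked Open Data Cloud might look something like this. A hedged sketch: P4478 is the ‘Survey of Scottish Witchcraft ID’ property and P551 is ‘residence’ on Wikidata, but the exact modelling the students used may well differ.

```sparql
# People in Wikidata linked to the Survey of Scottish Witchcraft database,
# counted by recorded place of residence.
SELECT ?residenceLabel (COUNT(DISTINCT ?person) AS ?accused) WHERE {
  ?person wdt:P4478 ?witchId .
  OPTIONAL { ?person wdt:P551 ?residence . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?residenceLabel
ORDER BY DESC(?accused)
```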

We also work with student societies (Law & Technology, History, Translation, Women in STEM, Wellcomm Kings) and have held events for Ada Lovelace Day, LGBT History Month, Black History Month, Mental Health Awareness Week and celebrated Edinburgh’s Global Alumni; working with the UncoverEd project and the Commonwealth Scholarship Commission.

Students are addressing serious knowledge gaps and are intrinsically motivated to communicate their scholarship because of this. They benefit from the practice academically and enjoy doing it personally because their scholarship is published, lasting long beyond the assignment and does something for the common good for an audience of not one but millions.

Why engage at all? I think we know that representation matters, and that gender inequality in science and technology is all too real. Gaps in our shared knowledge exclude the vitally important contributions of many within our community, and role models and trailblazers are important: you can’t be what you can’t see. To date, 69% of our participating editors at the University of Edinburgh have been women. The choices being made in creating new pages and increasing the visibility of topics and of inspirational role models online can not only shape public understanding around the world for the better but can also help inform and shape our physical environments to inspire the next generation.

Wikipedia in the curriculum involves identifying reliable secondary sources we can cite (or sometimes the lack thereof); discussing whose knowledge gets represented, open access, bias, neutral point of view, writing for a lay audience, and copyright. These are all absolutely appropriate for the modern graduate. The skills needed by those contributing to Wikimedia are the same digital literacy skills which a degree at the University of Edinburgh is designed to develop: those of critical reading, summarising, paraphrasing, original writing, referencing, citing, publishing, data handling, and understanding your audience. In this era of fake news it has never been more important that our students understand how information is published, shared, and contested online. And beyond this, feel empowered that they can do something positive to share fact-checked knowledge and help build understanding.

Because it’s an emotional connection… “Within, I’d say, less than 2 hours of me putting her page in place it was the top hit that came back in Google when I Googled it and I just thought that’s it, that’s impact right there!” (Hood & Littlejohn, 2018)

Things can look bleak when we think about all we see in the news and our relationship with the open web and the way in which information is shared online. It’s easy to lose faith at times. Indeed almost two years ago, Sir Tim Berners-Lee was on Channel 4 News being interviewed about the Facebook and Cambridge Analytica scandal and he said this.

We need to rethink our attitude to the internet.

It is not enough just to keep the web open and free because we must also keep a track of what people are building on it.

Look at the systems that people are using, like the social networks and look at whether they are actually helping humanity.

Are they being constructive or are they being destructive?

And he has since reiterated this point: he feels the open web is at something of a crossroads and could go either way.

Happily, Sir Tim had cheered up a little by May 2018 when he gave his Turing Award lecture in Amsterdam, saying:

It is amazing that humanity has managed to produce Wikipedia. Somebody recently said, “You know what? For all of the defending of the open net and the open web, it would have been worth it if we just got Wikipedia.”

It IS amazing that humanity has produced Wikipedia. And he’s right. That’s my experience of working with Wikipedia. The research, and the feedback from staff and students, all bear this out. People do feel they are doing something inherently good and worthwhile in sharing verifiable open knowledge, and they learn so much from engaging in this process, becoming knowledge activists. I commend it to you as a hugely impactful form of learning technology where our staff, students, research and collections can help shape the open web for the better, building understanding to make for a kinder, better world.

Bibliography

  1. Wadewitz, A. (2014). 04. Teaching with Wikipedia: the Why, What, and How. Retrieved from https://www.hastac.org/blogs/wadewitz/2014/02/21/04-teaching-wikipedia-why-what-and-how
  2. Cronon, W. (2012). Scholarly Authority in a Wikified World | Perspectives on History | AHA. Retrieved from https://www.historians.org/publications-and-directories/perspectives-on-history/february-2012/scholarly-authority-in-a-wikified-world
  3. Levine, N. (2019). A Ridiculous Gender Bias On Wikipedia Is Finally Being Corrected. Retrieved from https://www.refinery29.com/en-gb/2019/06/234873/womens-world-cup-football-Wikipedia
  4. Mathewson, J., & McGrady, R. (2018). Experts Improve Public Understanding of Sociology Through Wikipedia. Retrieved from https://www.asanet.org/news-events/footnotes/apr-may-2018/features/experts-improve-public-understanding-sociology-through-Wikipedia
  5. Hood, N., & Littlejohn, A. (2018). Becoming an online editor: perceived roles and responsibilities of Wikipedia editors. Retrieved from http://www.informationr.net/ir/23-1/paper784.html
  6. McAndrew, E., O’Connor, S., Thomas, S., & White, A. (2019). Women scientists being whitewashed from Wikipedia. Retrieved from https://www.scotsman.com/news/opinion/women-scientists-being-whitewashed-from-wikipedia-ewan-mcandrew-siobhan-o-connor-dr-sara-thomas-and-dr-alice-white-1-4887048
  7. McMahon, C., Johnson, I., & Hecht, B. (2017). The Substantial Interdependence of Wikipedia and Google: A Case Study on the Relationship Between Peer Production Communities and Information Technologies.

Wikipedia's JavaScript initialisation on a budget

17:05, Monday, 24 February 2020 UTC

This week saw the conclusion of a project that I've been shepherding on and off since September of last year. The goal was for the initialisation of our asynchronous JavaScript pipeline (at the time, 36 kilobytes in size) to fit within a budget of 28 KB – the size of two 14 KB bursts of Internet packets.

In total, the year-long effort is saving 4.3 Terabytes a day of data bandwidth for our users' page views.

The above graph shows the transfer size over time. Sizes are after compression (i.e. the net bandwidth cost as perceived from a browser).


How we did it

The startup manifest is a difficult payload to optimise. The vast majority of its code isn't functional logic that can be optimised by traditional means. Rather, it is almost entirely made of pure data. The data is auto-generated by ResourceLoader and represents the registry of module bundles. (ResourceLoader is the delivery system Wikipedia uses for its JavaScript, CSS, and interface text.)

This registry contains the metadata for all front-end features deployed on Wikipedia. It enumerates their name, currently deployed version, and their dependency relationships to other such bundles of loadable code.
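To make that concrete, here is a rough sketch of the shape of this data. The module names are illustrative and the version IDs are invented; the production payload uses additional tricks to stay compact, so treat this as a conceptual picture only.

```javascript
// Illustrative sketch: conceptually, the startup manifest boils down to one
// big mw.loader.register() call containing [ name, version, dependencies ]
// tuples, one per deployed module bundle.
mw.loader.register( [
	[ "jquery", "x91kd", [] ],
	[ "mediawiki.util", "1oqt7", [ "jquery" ] ],
	[ "ext.eventLogging", "f02ss", [ "mediawiki.util" ] ]
	// ... roughly 1100 entries in total at the time of writing
] );
```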

I started by identifying code that was never used in practice (T202154). This included picking up unfinished or forgotten software deprecations, and removing unused compatibility code for browsers that no longer passed our Grade A feature-test. I also wrote a document about Page load performance. This document serves as reference material, enabling developers to understand the impact of various types of changes on one or more stages of the page load process.

Fewer modules

Next was collaborating with the engineering teams here at the Wikimedia Foundation and at Wikimedia Deutschland to identify features that were using more modules than necessary, for example by bundling together parts of the same feature that are generally downloaded together. This leads to fewer entry points that need metadata in the ResourceLoader registry.

Some highlights:

  • WMF Editing team: The WikiEditor extension now has 11 fewer modules. Another 31 modules were removed in UploadWizard. Thanks Ed Sanders, Bartosz Dziewoński, and James Forrester.
  • WMF Language team: Combined 24 modules of the ContentTranslation software. Thanks Santhosh Thottingal.
  • WMF Reading Web: Combined 25 modules in MobileFrontend. Thanks Stephen Niedzielski, and Jon Robson.
  • WMDE Community Wishlist Team: Removed 20 modules from the RevisionSlider and TwoColConflict features. Thanks Rosalie Perside, Jakob Warkotsch, and Amir Sarabadani.

Last but not least, there was the Wikidata client for Wikipedia. This was an epic journey of its own (T203696). This feature started out with a whopping 248 distinct modules registered on Wikipedia page views. The magnificent efforts of WMDE removed over 200 modules, bringing it down to 42 today.

The bar chart above shows small improvements throughout the year, all moving us closer to the goal. Two major drops stand out in particular. One is around two-thirds of the way, in the first week of August. This is when the aforementioned Wikidata improvement was deployed. The second drop is toward the end of the chart and happened this week – more about that below.


Less metadata

This week's improvement was achieved by two holistic changes that organised the data in a smarter way overall.

First – The EventLogging extension previously shipped its schema metadata as part of the startup manifest. Roan Kattouw (Growth Team) refactored this mechanism to instead bundle the schema metadata together with the JavaScript code of the EventLogging client. This means the startup footprint of EventLogging was reduced by over 90%. That's 2KB less metadata in the critical path! It also means that going forward, the startup cost for EventLogging no longer grows with each new event instrumentation. This clever bundling is powered by ResourceLoader's new Package files feature. This feature was expedited in February 2019 in part because of its potential to reduce the number of modules in our registry. Package Files make it super easy to combine generated data with JavaScript code in a single module bundle.
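As a sketch of how this works (the paths and callback name below are invented, not EventLogging's actual configuration), a packageFiles module defined in extension.json can bundle plain script files together with a virtual JSON file whose content is generated server-side, so the data ships with the code rather than with the startup manifest:

```json
{
	"ResourceModules": {
		"ext.eventLogging": {
			"localBasePath": "modules",
			"remoteExtPath": "EventLogging/modules",
			"packageFiles": [
				"index.js",
				{ "name": "data/schemas.json", "callback": "EventLogging\\Hooks::getSchemas" }
			]
		}
	}
}
```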

Second – We shrunk the average size for each entry in the registry overall (T229245). The startup manifest contains two pieces of data for each module: its name, and its version ID. This version ID previously required 7 bytes of data. After thinking through the Birthday mathematics problem in context of ResourceLoader, we decided that the probability spectrum for our version IDs can be safely reduced from 78 billion down to "only" 60 million. For more details see the code comments, but in summary it means we're saving 2 bytes for each of the 1100 modules still in the registry, thus reducing the payload by another 2-3 KB.
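The quoted figures are consistent with version IDs being truncated base36 hashes: 36^7 ≈ 78 billion and 36^5 ≈ 60 million. As a back-of-the-envelope check of the collision risk (my own arithmetic, not the team's exact analysis):

```javascript
const space = Math.pow( 36, 5 );  // 60,466,176 possible 5-char base36 IDs
const modules = 1100;             // entries remaining in the registry

// Birthday approximation: the chance that any two of n random IDs collide
// is roughly 1 - exp( -n(n-1) / 2N ).
const pAnyCollision = 1 - Math.exp( -modules * ( modules - 1 ) / ( 2 * space ) );
console.log( pAnyCollision.toFixed( 3 ) ); // ~0.010, i.e. about 1%
```

Even that one percent arguably overstates the practical risk, since a collision only causes trouble when a module's new version happens to hash to the same ID as a recently cached version of that same module.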

Below is a close-up for the last few days (this is from synthetic monitoring, plotting the raw/uncompressed size):

The change was detected in ResourceLoader's synthetic monitoring. The above is captured from the Startup manifest size dashboard on our public Grafana instance, showing a 2.8KB decrease in the uncompressed data stream.

With this week's deployment, we've completed the goal of shrinking the startup manifest to under 28 KB. This cross-departmental and cross-organisational project reduced the startup manifest by 9 KB overall (net bandwidth, after compression); From 36.2 kilobytes one year ago, down to 27.2 KB today.

We have around 363,000 page views a minute in total on Wikipedia and sister projects. That's 21.8M an hour, or 523 million every day (User pageview stats). This week's deployment saves around 1.4 Terabytes a day. In total, the year-long effort is saving 4.3 Terabytes a day of bandwidth on our users' page views.
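Those savings follow from simple arithmetic. A sketch (treat the results as upper bounds, since not every page view re-downloads the startup manifest):

```javascript
const viewsPerDay = 363000 * 60 * 24;          // 522,720,000 ≈ 523 million
const thisWeekTB = viewsPerDay * 2.8e3 / 1e12; // ≈ 1.46 TB/day from the 2.8 KB drop
const overallTB = viewsPerDay * 9e3 / 1e12;    // ≈ 4.7 TB/day if every view paid the full 9 KB
console.log( thisWeekTB.toFixed( 2 ), overallTB.toFixed( 2 ) );
```

The reported 4.3 TB/day sits just below that 4.7 TB ceiling, which is what you would expect given browser caching.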


What's next

It's great to celebrate that Wikipedia's startup payload now neatly fits into the target budget of 28 KB – chosen as the lowest multiple of 14KB we can fit within subsequent bursts of Internet packets to a web browser.

The challenge going forward will be to keep us there. Over the past year I've kept a very close eye (spreadsheet) on the startup manifest — to verify our progress, and to identify potential regressions. I've since automated this laborious process through a public Grafana dashboard.

We still have many more opportunities on that dashboard to improve bundling of our features, and (for Performance Team) to make it even easier to implement such bundling. I hope these on-going improvements will come in handy whilst we work on finding room in our performance budget for upcoming features.

– Timo Tijhof


Further reading:

Fashion and digital citizenship at Bath Spa University

14:29, Monday, 24 February 2020 UTC
Bath Spa University. Photo by User:Rwendland, licensed CC-BY-SA 4.0.

Bath Spa University in the UK had its first ever Wikipedia assignment, as part of a Digital Citizenship module for undergraduates studying Business Management Fashion. The results are mostly in user sandboxes, but some new articles have been created.

Bath Spa University, located in South West England, recently had its first, tentative, Wikipedia student assignment. The business fashion degree has a focus on sustainable fashion, hence students had been studying the Rana Plaza collapse and its aftermath. This disaster highlighted the role of sweatshop labour in fashion supply chains and led to activism including the #WhoMadeMyClothes hashtag. This gave students a range of Wikipedia articles to which their work was relevant. The Women In Red project was also immensely useful for identifying prominent women in the fashion industry who did not have a Wikipedia article.

User:MartinPoulter gave two workshops on editing and interacting with Wikipedia. Because the Wikipedia element was introduced relatively late in the course, we decided to have the students post in user sandboxes rather than directly edit articles. Some groups collaborated on a single sandbox, and the article history was very useful in showing what each student had contributed.

As well as being marked for their individual essay submitted in the normal way, students were marked on whether they had done enough on-wiki work to make a substantial improvement to Wikipedia. It was still important that they experience feedback through the wiki platform, so in one activity they used Talk pages to write short reviews of each other’s drafts. The resulting work varies widely in quality, but it has already enabled some significant improvements to English Wikipedia, and at the same time it enabled students to do real collaborative writing. Martin is going through the students’ content, moving it into mainspace, in some cases combining work from multiple drafts. Some articles that have been created or improved:


On 26 February Wikimedia UK and the Disruptive Media Learning Lab are holding a one-day summit about the role of Wikimedia in education. This post was written by Martin Poulter and Caroline Kuhn; Martin will be taking part in the summit and teaching people about Wikidata. To learn more about Wikimedia UK’s activities, subscribe to our quarterly newsletter.

weeklyOSM 500

11:16, Sunday, 23 February 2020 UTC

11/02/2020-17/02/2020


Wochennotiz / weeklyOSM – Issue # 500 | Thank you

About us

  • The 500th issue of the weekly newsletter has arrived! It all started with issue 1 of the weekly note on 23 July 2010, which carried 14 news articles. At this milestone, in our 10th year, the current editorial staff would like to thank all the readers who have supported us over the years. We thank you for your interest and the helpful feedback which has enabled the initially small editorial team to produce weekly, and hopefully interesting, issues. WN is now published in nine languages and enjoys steadily increasing popularity. Only through a strong community can OpenStreetMap be successful despite the commercial competition. The great recognition that the ‘open world map’ has gained in the world of open geodata is remarkable. Thanks are also due to all those who are happy to contribute to weeklyOSM, in whatever language, and to those who have helped make weeklyOSM a modest contribution to strengthening the OSM community. Many thanks and enjoy reading.

Mapping

  • The JOSM team announced that a development version of its editor has been released that brings the Maxar imagery back. Bryan Housel tweeted that the Maxar background layers have also been restored in the iD editor. The imagery service for OSM was temporarily suspended following a sharp increase in usage caused by automated requests.
  • It was recommended not to use Vespucci from Google Play but instead use the one from the open source store F-Droid, as the Google Play version lags three months behind the current version. Fortunately, this issue has now been resolved.
  • The proposal to introduce the tag duty_free=*, for marking shops which offer duty-free shopping, has been approved.
  • memfrob is looking for an open source 3D webmap with 3D terrain which can visualise track data and let a user fly over it.
  • Mapillary released the results of their survey of delivery drivers, carried out to estimate the cost to business of broken maps. The losses due to broken maps are huge.
  • Minutely Extracts is a new service by Protomaps for on-demand OSM downloads in PBF format. The data is replicated from the main OSM database once every minute, so mappers can download and make use of their edits immediately after uploading. You can find further details in bdon’s diary entry.

Community

  • Martin Koppenhoefer thinks that the OSMF board has overruled the ‘on the ground rule’ and the Data Working Group decision to tag the Crimea peninsula as being part of Russia two years ago. He asks the board on OSM’s main mailing list to revisit the previous decision. As expected this started a long discussion about this case in particular but also about the tagging of disputed boundaries, the ‘on the ground rule’, the role of non-physical objects in our database and many others. (Nabble)
  • Daniel Capilla points to an ongoing discussion about how explicitly abstaining from voting should be treated during the tagging proposal voting process.
  • The results of the most recent board election of OSM US are in. The amendments to the articles of association (change in the terms of office of the board members) failed as they did not meet the quorum of 50% of the membership participating. The new board consists of Jubal Harpster, Daniela Waltersdorfer, Alyssa Wright, Martijn van Exel and Minh Nguyễn. Jonah Adkins and Steve Johnson were not elected.
  • Shawna Bjorgan, of Arizona State University, writes about her experience interning remotely with YouthMappers and the USAID GeoCenter on the YouthMappers blog.
  • OpenStreetMap US publishes a monthly newsletter with a short selection of news, events and announcements from the OpenStreetMap community in the United States. You can read the February Issue here.
  • Valeriy Trubin continues his series of interviews with OSMers. He talked with Georgiy Potapov (automatic translation) about neural networks and AI and with Sergey Zaichenko (automatic translation) about the use of OSM data in big projects such as the ‘Sputnik’ map service.

OpenStreetMap Foundation

  • Joost Schouppe from the OSMF board announced the start of OSMF’s microgrants. However, before the first funds can be granted, the mechanism to do so, in the form of a selection committee, must be set up. If you think you can help OSM by running the program, your application is welcome until 8 March 2020.
  • The minutes of the OSMF board meeting of 30 January 2020 are online. Topics included ODbL violations, dealing with new members without a history of mapping activity and the establishment of a working group on diversity.
  • OSM and the OSMF favour open and self-hosted software, as documented in the FOSS policy. However, the implementation has stalled somewhat. Tobias Knerr from the OSMF board is trying to push the implementation further by looking for volunteers for a FOSS Policy Committee.

Events

  • The FOSSGIS Conference 2020 and OSM Saturday will take place in Freiburg im Breisgau from 11 to 14 March. Nakaner wants (automatic translation) people to know that participation in OSM Saturday 2020 (14 March) is free of charge and independent of the FOSSGIS conference.
  • Manfred Stock reminds us that the deadline for the call for participation and the call for abstracts of the academic track at State of the Map 2020 in Cape Town is looming.
  • OSGeo Oceania, in lieu of purchasing carbon credits for its recent conference, has decided to fund the planting of 250 yellow box trees in the Churchill National Park in Lysterfield, Victoria, Australia. The tree planting will take place on 15 August 2020 and people who would like to participate should register with ParkConnect.
  • On 14 March 2020 the GIS meeting will be held in Saint Petersburg (Russia) as part of the spbgeotex project. The meeting will be dedicated to that famous software package – QGIS.
  • 50 students, with their teachers, from five European countries will work together in Râmnicu Vâlcea, Romania at the end of April to do pre-disaster mapping and learn to use several QA tools.
  • Pista ng Mapa, a festival that aims to promote the use of free and open geodata and software, will take place from 27 to 29 May 2020 in the Philippines.
  • The SOTM Baltics 2020 is scheduled for 6 March 2020 in Riga. The conference should be of interest to non-Baltic mappers as well, as all talks will be in English.
  • Open Belgium 2020 will bring together over 50 national and international speakers to provide information about Open Knowledge and Open Data. osm.be can be of help if you’d like to attend but cannot afford the ticket for the event on 6 March 2020 in Hasselt, Belgium.
  • OpenStreetMap Belgium will be hosting a meet-up with Allan Mustard, chair of the OSMF board of directors, at the Cafe De Markten in Brussels on 23 March. RSVP via Meetup.

Humanitarian OSM

  • HOT has launched an export tool, which aims to ease the delivery of OSM data to humanitarian organisations via the Humanitarian Data Exchange (HDX).
  • In cooperation with the United Nations and the Ugandan government HOT has created the Risk Atlas of Arua showing existing hazards, exposure, and vulnerabilities.
  • During the UN World Urban Forum 10, held in Abu Dhabi between 8 and 13 February, the Humanitarian OpenStreetMap Team had a training event titled ‘Map Your City: A Tale of Three Countries’, where project managers from Indonesia, Turkey and Uganda demonstrated OSM data tools and described how open data is used in different humanitarian contexts. You can access the slides of the event here.

Education

  • The German tutorial ‘Getting Started with OpenStreetMap’ by Volker Gringmuth, aka kreuzschnabel, offers a really good introduction to the OSM universe, and not only for beginners. Volker continues to update his tutorial and the pdf (de) may be interesting for translators.
  • Ryan Lambert started a webinar series of six videos on how to work with PostGIS and OSM. The first four videos are online; the others will follow shortly.

Maps

  • Do you remember Geopedia, the digital travel guide? Developed by Michael Schön, this application (also available as a smartphone app) shows Wikipedia articles with a geolocation overlaid on OpenStreetMap (or Google Maps if that’s what you’re into).
  • Mapbox visualises the course of the outbreak of the new coronavirus on a real-time map with a lot of additional information.
  • Art group ‘Neploho’ (Russian for ‘not bad’) created an online map of popular places for first dates in Moscow.
  • An investment map is available on the website of the Administration of Chernogorsk (a city in Russia). OSM is used as a basemap.

Open Data

  • A reminder that you are invited to Open Data Day, 7 March. For the tenth time, groups from around the world will hold local events on the day where they will use open data in their communities.

Licences

Software

  • The base version of the blender-osm plugin for Blender 3D can now be downloaded for free.
  • Generation Street, a game based on OSM data, is now available for free on Steam (with a limited selection of territories; the entire planet is available as DLC). According (ru) (automatic translation) to its developer, Roman Shuvalov, its source code will be opened in the foreseeable future.

Programming

  • mmd has announced the new CGImap version 0.8.0 in his OSM user diary. Thanks to the new changeset endpoints, the OSM API 0.6 is now covered to the extent that running most parts of an OSM editing session is now possible using only CGImap.
  • Rostislav Netek et al. compare in their paper ‘Performance Testing on Vector vs. Raster Map Tiles-Comparative Study on Load Metrics’ the relative performance in delivering maps over HTTP between raster and vector tiles.

Releases

  • The OSM Software Watchlist by Walter, aka wambacher, will continue to be maintained and it remains the reference in terms of highlighting release changes of OSM-related software. Walter, a long time member of the Wochennotiz/weeklyOSM team, will not be able to maintain the Boundaries-map in the future.
  • OsmAnd’s iOS version is catching up with the Android one. The team has released version 3.12 which comes with an improved route details screen.
  • The webmap framework OpenLayers has reached version 6.2.0. The new version improves mousewheel zooming, optimises text rendering, introduces the ‘displacement’ option for a more flexible positioning of point symbolisers, and many, many, more improvements.

Did you know …

  • … how to map highways covered by a building? The wiki article Key:covered explains how to map highways covered by buildings in situations where using layer=-1, denoting the highway is underground, is not applicable.
  • … that there is a list of OSM-related Twitter accounts?
  • … about the websites ‘Tram systems of Russia’ and ‘Bus lanes of Russia’? (ru)
  • … that the iD editor now has its own blog? This blog covers feature releases, community news, and development insights from iD’s maintainers and contributors.
  • … that windy.com has animations of the weather forecasts produced by various models superimposed on an OpenStreetMap-based map?

OSM in the media

  • El Territorio, a local Argentinian newspaper, informs us (es) (automatic translation) that all rivers and streams in the Misiones province, Argentina, can be found in OpenStreetMap. They can also be seen in a hydrology map prepared with uMap by user Carlos Brys. Most of the toponyms were collected by geographer Miguel Stefañuk, who travelled the mostly jungle covered province for many years, checking the names of streams with rural and native dwellers.

Other “geo” things

  • BBC Radio’s ‘Inside Science’ visits Britain’s Ordnance Survey, and some of the methods that they’re using will sound fairly familiar to contributors to OpenStreetMap.
  • The Ramblers are running their ‘Don’t Lose Your Way’ project to identify potential lost rights of way by searching old OS maps for paths historically marked as footpaths or bridleways that are now missing from the current OS map. Time is running short: from 1 January 2026 it will no longer be possible to add paths to the definitive map (the national record of public paths) and access to missing paths will be lost forever.
  • Jack Dangermond, founder and president of ESRI in California, details in a Geospatial World article titled ‘GIS as an intelligent nervous system for the planet’ the importance of GIS and geodata from the perspective of one of the market leaders.
  • Google solves the problem of disputed borders by showing whatever the user wants to see. Greg Bensinger’s article in the Washington Post describes how the Silicon Valley firm’s decision-making on maps is often shrouded in secrecy, even to some of those who work to shape its digital atlases every day.
  • Paul Kang is a Nashville (USA) resident with a passion for adding civil rights related sites to Google Maps. Molly McHugh-Johnson’s article describes how Paul’s hobby started and grew. If only we could somehow encourage him to also add them to OSM. Then not only would his work not ‘be going anywhere’, it would also be freely available for everyone to use.
  • Dan Stowell and Jack Kelly explained the value of mapping the location of solar photovoltaic panels. They highlight the recent work done by OSM mappers to add over 120,000 installations to the map.
  • Christine Ro discussed some of the unintended consequences of using remote sensing in human rights campaigns. She gives as an example the 2007 project ‘Eyes on Darfur’, where the villages in Sudan that were being remotely monitored to discourage human rights violations ended up being more likely to experience violence.
  • SevenCs has released a new version of its electronic nautical chart display software. The new version allows users to combine SevenCs’ visuals with third-party geographic data sources such as those provided by OpenStreetMap.
  • DC Rainmaker reviewed the Suunto 7 with Wear OS, a smart watch that features an OSM-based map.
  • Feedspot has a list of 75 of the most popular blogs and websites about GIS (Geographic Information Systems) for 2020.
  • New locations are available (automatic translation) on Yandex.Panoramas. Earlier Yandex gave permission (automatic translation) to use its panoramas for editing OSM data.
  • appleinsider.com features a comparison of Apple Maps and Google Maps in a comprehensive article. Unfortunately the article does not include any pointers to other options.
  • The GEO-3 payload of the European Geostationary Navigation Overlay System (EGNOS), hosted aboard the EUTELSAT 5 West B satellite, has successfully entered into service. The payload is part of the program to update Europe’s satellite-based augmentation system.
  • And finally a time lapse of our future: A journey to the end of time …

Upcoming Events

| Where | What | When | Country |
|---|---|---|---|
| Turin | FOSS4G-it/OSMit 2020 | 2020-02-18 to 2020-02-22 | Italy |
| Hanover | OSM-Sprechstunde | 2020-02-19 | Germany |
| Ulmer Alb | Stammtisch Ulmer Alb | 2020-02-20 | Germany |
| Rennes | Atelier découverte | 2020-02-23 | France |
| Takasago | Takasago Open Datathon | 2020-02-24 | Japan |
| Singen | Stammtisch Bodensee | 2020-02-26 | Germany |
| Hanover | OSM-Sprechstunde | 2020-02-26 | Germany |
| Düsseldorf | Düsseldorfer OSM-Stammtisch | 2020-02-26 | Germany |
| Lübeck | Lübecker Mappertreffen | 2020-02-27 | Germany |
| Brno | Únorový brněnský Missing maps mapathon na Geografickém ústavu | 2020-02-27 | Czech Republic |
| Toulouse | Contrib’atelier OpenStreetMap | 2020-02-29 | France |
| Budapest | Budapest gathering | 2020-03-02 | Hungary |
| London | Missing Maps London | 2020-03-03 | United Kingdom |
| Hanover | OSM-Sprechstunde | 2020-03-04 | Germany |
| Stuttgart | Stuttgarter Stammtisch | 2020-03-04 | Germany |
| Praha/Brno/Ostrava | Kvartální pivo | 2020-03-04 | Czech Republic |
| Dortmund | Mappertreffen | 2020-03-06 | Germany |
| Riga | State of the Map Baltics | 2020-03-06 | Latvia |
| Amagasaki | GPSで絵を描いてみようじゃあ~りませんか | 2020-03-07 | Japan |
| Rennes | Réunion mensuelle | 2020-03-09 | France |
| Grenoble | Rencontre mensuelle | 2020-03-09 | France |
| Taipei | OSM x Wikidata #14 | 2020-03-09 | Taiwan |
| Toronto | Toronto Mappy Hour | 2020-03-09 | Canada |
| Hamburg | Hamburger Mappertreffen | 2020-03-10 | Germany |
| Zurich | 115. OSM Meetup Zurich | 2020-03-11 | Switzerland |
| Hanover | OSM-Sprechstunde | 2020-03-11 | Germany |
| Freiburg | FOSSGIS-Konferenz | 2020-03-11 to 2020-03-14 | Germany |
| Chemnitz | Chemnitzer Linux-Tage | 2020-03-14 to 2020-03-15 | Germany |
| Valcea | EuYoutH OSM Meeting | 2020-04-27 to 2020-05-01 | Romania |
| Guarda | EuYoutH OSM Meeting | 2020-06-24 to 2020-06-28 | Spain |
| Cape Town | HOT Summit | 2020-07-01 to 2020-07-02 | South Africa |
| Cape Town | State of the Map 2020 | 2020-07-03 to 2020-07-05 | South Africa |

Note: If you would like to see your event here, please put it into the calendar. Only data which is in the calendar will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Elizabete, Nakaner, PierZen, Polyglot, Rogehm, SK53, Silka123, Softgrow, SunCobalt, TheSwavu, YoViajo, derFred, geologist, jinalfoflia.

Why does building a skin require PHP knowledge?

23:26, Saturday, 22 February 2020 UTC

One of my longstanding pet peeves is that skin development for MediaWiki is so hard. I propose a radical change to how skins are installed and ask for feedback.

Having watched teenagers use Myspace.com and then tumblr.com, and watching Wikimedians build all sorts of things using wikitext templating, it's clear that skinning anything should be possible with a mixture of basic knowledge of web technology (HTML, CSS, maybe JSON) and/or cargo cult programming. The MediaWiki skin ecosystem is pretty sparse, and when skins are created they don't tend to be published for wider consumption, or they are lost in GitHub repos that are never linked to. Some never even get built. After almost 10 years in this movement it's easy to see why.

At a recent offsite I got all my team to stand up in a room and asked them to sit down if they did not feel comfortable with HTML. A few sat down and I told them unfortunately they couldn't build a skin. When I asked whether they felt comfortable editing CSS, a few more sat down and I told them the same thing. Eventually everyone sat down. What was interesting was who sat down and when. The designers (comfortable with CSS and JS) sat down at the mention of PHP, as did many frontend engineers. Meanwhile backend engineers sat down when CSS came up.

Our skin code is pretty complicated. We currently encourage skin development by guiding users to the ExampleSkin. This example skin is pretty scary to many developers not already in our ecosystem, and to many designers who are in it. There is an extreme amount of PHP, and knowledge of folder structure and MediaWiki concepts such as ResourceLoader is needed before someone can even start.

Currently, to create a skin, at minimum you must:

  • Download and setup MediaWiki
  • Learn git and clone the ExampleSkin repo
  • Understand ResourceLoader
  • Understand our i18n system
  • Understand how skin.json works
  • Edit PHP to generate HTML
  • Edit CSS

To encourage a healthy skin system we need to lower many of the barriers to implementing a skin. It should be as simple as the three steps below (sketched concretely after the list):

  • Clone a repo
  • Edit some CSS and HTML
  • Run some npm commands
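Concretely, I imagine the whole loop looking something like this (everything below is hypothetical: the repo URL, the commands, all of it):

```
git clone https://github.com/example/ExampleSkin.git skins/skinnew
cd skins/skinnew
# edit skin.mustache and index.less to taste
npm install
npm start   # live-reloading preview against stubbed data; no MediaWiki install needed
```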

During the implementation of MobileFrontend and MinervaNeue, many changes were made to the skin system to help build the new skin whilst maintaining the old one. It also intentionally made some breaking changes from the traditional skin system; for example, no longer were languages or categories part of the HTML of an article, and JavaScript and CSS shipped by all skins was turned off in preference for its own versions. In my opinion this was the skin's right. A good skinning system allows you to create radically different skins and innovate. If our skin system was healthy we'd likely have skins of all sorts of shapes and sizes. A good skin system also makes maintenance easier. Right now, because of class inheritance, it's very difficult to make changes to our existing skins or our core skin PHP without worrying about breaking something elsewhere. Similar changes and challenges happened with Timeless, as I'm sure Isarra can attest!

Exploring different approaches

I've been lamenting this situation for some time. A while back I created an extension called SimpleSkins that reduced the Minerva skin to 2 files with some ambitious and breaking changes to the PHP skin code that I dreamed of one day upstreaming.

At a recent hackathon, with this idea still in mind, I took a slightly different approach. Instead of trying to make a skin a folder of several files, I thought: what if the skin folder was the output of a build step? Similar to the SimpleSkin approach, I again focused on a folder of frontend-friendly technologies and reduced Vector to 3 files: Mustache (template), JS (with require support) and LESS (CSS), while the generation of skin.json and PHP was left to a build script. Remarkably, this worked and was relatively straightforward. One interesting insight I had this time, however, was that no skin developer should require a MediaWiki install to build a skin: with templates, a lot of data can be stubbed or come from an API endpoint. Not having to set up a MediaWiki install is a big deal!

With a good architecture a lot of our skin system can be abstracted away. A skin without qqq codes is still useful provided en.json has been generated. ResourceLoader module config is much easier to auto-generate now we have packageFiles, provided we enforce a common entry point, e.g. index.js and index.css/less. The PHP skin class should and can do more. Instead of having skins that extend SkinTemplate, e.g. SkinVector, we should have a skin rendering class that renders folders containing skin metadata.

How we do it.

Setting aside existing technology choices and working with what we've got, I'd propose automating most of the skin registration process, to the point that PHP is irrelevant and JS and JSON files are optional.

I strongly believe the following is possible:

  • An online skin builder that saves and exports skin folders to a download folder or GitHub, similar to JSFiddle
  • A valid skin is a folder with at minimum 2 files - index.mustache and index.(css/less)
  • You should be able to copy and paste an existing skin and get a new skin without any modification except the folder name.

To achieve such a goal we would need a SkinRenderer class that locates a skin directory and renders the template inside it (Mustache is currently the template language we support in core). SkinRenderer, when passed the skin key skinnew for example, would find the folder skinnew in the skins folder and the files index.less, index.js and skin.mustache. It would pass skin.mustache data (which is subject to deprecation policy and well documented) and it would register a ResourceLoader module using index.less, index.js and packageFiles. qqq.json and en.json, if needed, could live in the i18n folder as they currently do, but their absence would not cause any problems.
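To illustrate, a minimal skin.mustache might look like the sketch below. The data keys (sitename, title, bodyContent) are invented here purely for illustration; the real set would be whatever the documented SkinRenderer contract provides.

```mustache
{{! skins/skinnew/skin.mustache – hypothetical sketch }}
<!DOCTYPE html>
<html>
<body class="skin-skinnew">
	<header>{{sitename}}</header>
	<main>
		<h1>{{title}}</h1>
		{{{bodyContent}}}  {{! triple braces: article HTML, unescaped }}
	</main>
</body>
</html>
```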

A developer would fork a new version of the ExampleSkin, which provides the basic file components, then run npm install and npm start. This would pull our core frontend technologies, Mustache and LESS, from npm and then pass the skin through a tool such as parceljs that allows live updating, the workflow of which is demonstrated in this video. However, unlike in my hack experiment, installing the skin would be as simple as copying that folder into MediaWiki's skins folder rather than running a build step :)

What do I do next?

Am I alone in thinking it should be possible to build skins without PHP? Do you have different opinions on what the skin system should be? Do you have concerns about how such a system would scale or whether such a system would get adoption? What do you think of my skin builder tool, and should I work on it more? If so, I'd love to hear more from you. Any feedback you can provide would be helpful to decide whether I should prepare and push an RFC.

Thank you for your time!

How Proactively Publishing Publication Dates Could Help Facebook Fight Fake News

A few days ago, during a period of severe flooding across many parts of England and Wales caused by storms Ciara and Dennis, one of my friends shared this Financial Times story, UK makes last minute bid for EU flood funds, on Facebook:

[Screenshot: the FT story as shared on Facebook]

Obviously, it was a timely item, relevant to many of those affected, and clearly related to recent Brexit developments.

Or was it?

A mutual friend soon responded:

[Screenshot: a friend’s reply noting that the story is four years old]

As you can see, they pointed out that the story was actually four years old.

Facebook could prevent well-intentioned posts of such stale news from misleading people, and subtly discourage people from repeating them, by including the original publication date in their preview, something like this:

[Mock-up: the same link preview with the original publication date displayed]

There are a number of ways they could obtain the publication date, such as from the shared page’s meta headers, and if no date is available they could even warn “no date specified, please check source”.
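Many news pages already expose machine-readable dates through Open Graph and schema.org markup, along these lines (an illustrative snippet, not taken from the FT’s actual page):

```html
<meta property="article:published_time" content="2016-01-17T09:30:00Z">
<meta itemprop="datePublished" content="2016-01-17">
```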

Of course, this won’t stop bad actors from publishing stories on web pages with deliberately misleading metadata, nor the inept from accidentally doing so, but Facebook are reportedly downgrading unreliable sources anyway, so that’s just another factor to add to their criteria.

And it’s not just Facebook, but Twitter and other social media services, that could do this.

The post How Proactively Publishing Publication Dates Could Help Facebook Fight Fake News appeared first on Andy Mabbett, aka pigsonthewing.

SMW vs Wikibase vs Cargo

00:00, Saturday, 22 2020 February UTC

What software do you need to manage data/knowledge in your wiki? Historically this was Semantic MediaWiki. Is that still the case, and how do the alternatives stack up?

Overview

In this article we look at the features provided by Semantic MediaWiki (SMW), Wikibase and Cargo. At the end of the article we share our thoughts on when to use which software.

No one has our level of expertise when it comes to both SMW and Wikibase. We have been significantly involved in the development of both and helped many of our customers with SMW. We cannot say the same for Cargo, as we have not used it ourselves or gotten requests from customers yet.

Ecosystem

                         | Semantic MediaWiki  | Wikibase     | Cargo
Public wikis             | 1000s               | ~10          | 100s
Developed for            | Data management     | Wikidata.org | Data management
Primary users            | Organizations       | Wikidata.org | Organizations
Complementary extensions | 28 to 50+           | ~7           | 2
Dedicated mailing list   |                     |              |
Dedicated conference     | SMWCon              |              |
Commercial support       | Dozens of companies | One company  | Several companies
Available since          | 2005                | 2012         | 2015
Main developer           | Community           | Wikimedia DE | Community

Editing experience

                             | Semantic MediaWiki | Wikibase | Cargo
Define data where it is used |                    |          |
Visual editing               |                    |          |
Programmable editing UI      | Forms              |          | Forms
Source editing               |                    |          |

Data access

                 | Semantic MediaWiki | Wikibase      | Cargo
Inline queries   |                    |               |
Query language   | SMW specific       | -             | Based on SQL
Visualizations   | 60+ inline         | External only | ~25 inline
SPARQL queries   |                    |               |
Web API          |                    |               |
Lua API          |                    |               |
Simple DB tables |                    |               |

Data model and storage

                  | Semantic MediaWiki     | Wikibase              | Cargo
Data location     | Flexible               | Dedicated pages       | Flexible
In-text data      |                        |                       |
Data model        | Flexible               | Complex and fixed     | Flexible
Data types        | 18                     | 17                    | 18
Data constraints  |                        | Maybe via tools       |
Computed values   |                        |                       |
Multilingual data | Second class, optional | First class, required | ?
Primary storage   | Wiki pages             | Wiki pages (JSON)     | Wiki pages
Primary indexing  | MW DB (MySQL)          | MW DB (MySQL)         | MW DB (MySQL)

Development

                  | Semantic MediaWiki | Wikibase         | Cargo
Main developer    | Community          | Wikimedia DE     | Community
Developed for     | Data management    | Wikidata.org     | Data management
Stable releases   |                    |                  |
Release notes     |                    |                  |
Available since   | 2005               | 2012             | 2015
Developed on      | GitHub (SMW org)   | Wikimedia Gerrit | Wikimedia Gerrit
Contributors      | ~200 developers    | ~250 developers  | ~75 developers
Code changes      | 15k+               | 30k+             | ~1300
Fully open source |                    |                  |

When to use Wikibase

Wikibase serves a different use case than Semantic MediaWiki and Cargo. You might have noticed it is the odd one out in the above comparison. This is because, unlike SMW and Cargo, Wikibase was not developed for displaying and using data within the wiki itself. It was developed for the collaborative creation and maintenance of a knowledge base containing data that can be used on Wikipedia and by third parties.

This makes it great for organizations that curate data for third-party consumption, for instance museums and other GLAM institutions. If, on the other hand, you want to present the data in the wiki itself, as most structured wikis do, then Wikibase is the wrong tool for you.

There are several potential gotchas to be aware of before starting with Wikibase. Wikibase is developed by Wikimedia for Wikidata, and Wikidata will always be the priority. While Wikimedia has recently started to talk about supporting third parties, the robustness of this support remains to be seen. There are also no stable releases or upgrade instructions. Depending on your tolerance for operational risk, this might disqualify using the Wikibase software at present.

We have contributed to the development of Wikibase and have significant expertise. If you are looking for commercial Wikibase help or advice, contact us.

When to use Cargo

Cargo was created as an alternative to Semantic MediaWiki. The main motivation was to provide simpler software, both in terms of use and in terms of codebase. We think it failed to do the latter.

Cargo ties data storage directly to templates and stores the data in database tables based on those templates. This allows querying data via SQL snippets rather than SMW's dedicated query language.
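
The contrast is easy to see in the web APIs the two extensions expose: SMW answers its own query language through the action=ask module, while Cargo takes SQL-like parameters through action=cargoquery. A sketch in Python, assuming a wiki at example.org with a hypothetical Cities table and Population property:

    import requests

    API = "https://example.org/w/api.php"  # placeholder wiki

    # SMW: a query in its dedicated query language, via the action=ask module.
    smw = requests.get(API, params={
        "action": "ask",
        "query": "[[Category:City]][[Population::>100000]]|?Population",
        "format": "json",
    }).json()

    # Cargo: SQL-like tables/fields/where parameters, via action=cargoquery.
    cargo = requests.get(API, params={
        "action": "cargoquery",
        "tables": "Cities",
        "fields": "_pageName,Population",
        "where": "Population > 100000",
        "format": "json",
    }).json()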

If you want to run SQL queries against the structured data stored in your wiki, either from within the wiki, or from another application, then look into Cargo. If you are fine with its smaller feature set, reduced flexibility and smaller ecosystem, then it might be the right choice for you.

When to use Semantic MediaWiki

Semantic MediaWiki is the standard software for managing data in wikis. It has been continuously developed since 2005 and has by far the biggest ecosystem and most features. There are several enterprise MediaWiki hosting companies and they all use a Semantic MediaWiki based stack.

We believe Semantic MediaWiki is the right choice for most wikis and suggest it to be your default pick.

Our fully managed enterprise wiki hosting plans all include Semantic MediaWiki and important extensions to SMW. Looking for SMW help, advice or development? Contact us.

Semantic Wikibase

We want to develop a connection between Wikibase and Semantic MediaWiki and are looking for funding.

Jewish women are underrepresented on the reference site — but not if this class has anything to write about it

Wikipedia is the most popular reference site on the internet, with nearly 20,000 articles added every month. Yet an estimated 85% of its editors are male — a problem of gender imbalance the site’s founder, Jimmy Wales, has said he isn’t sure how to fix — and a mere 17% of its biographies are about women.

Wiess College senior Sarah Silberman and Lovett College junior Julia Fisher were among the students whose work is now published on Wikipedia. (Photo by Katharine Shilcutt. Rights reserved.)

“Instead of being the egalitarian ‘sum of all human knowledge,’” as a 2015 New Statesman article by Jenny Kleeman put it, “the English version of Wikipedia is mostly the sum of male knowledge.”

That’s what makes the work of one Rice University course, Sex and Gender in Jewish Culture, so vital.

Each student became an “expert” in one small topic area of women’s studies — Jewish women, in particular — over the course of the semester, then presented a published Wikipedia article on that topic to their fellow students on the last day of class.

“I wanted to structure all the assignments for this particular class around the idea of writing as activism, and the Wikipedia project fit perfectly into that principle,” said Melissa Weininger, the Anna Smith Fine Senior Lecturer in Jewish Studies and associate director of the Program in Jewish Studies.

She introduced the assignment for the first time last semester, in collaboration with Wiki Education, which connects higher education to Wikipedia in an effort to make its information more representative, accurate and complete.

Writing Wikipedia entries about Jewish women or topics pertinent to them was only one of Weininger’s many course assignments. The class also penned op-eds, watched Israeli and American movies depicting Jewish women and read short stories.

But it was the Wikipedia project that resonated most with the students.

“There’s an obvious lack of representation of Jewish women in history, and it’s even more obvious on Wikipedia,” said Lovett College junior Julia Fisher, an art history major who took the course last semester in part to explore her own Jewish heritage. But what she came away with was a whole new understanding of who writes our shared history and why.

Fisher was surprised to find that very little had been written about Israeli poet Yona Wallach; what was on the stub Wikipedia article had been plagiarized, Fisher discovered. Yet Fisher found multiple books and articles in Rice’s Fondren Library that discussed Wallach’s life.

“I wanted to structure all the assignments for this particular class around the idea of writing as activism, and the Wikipedia project fit perfectly into that principle,” said professor Melissa Weininger. (Photo by Jeff Fitlow. Rights reserved.)

“I didn’t understand how all of this information hadn’t been synthesized into a Wikipedia article before,” she said. “That really surprised me, because it’s all readily available. Why hadn’t anyone thought to put it all together yet?”

In a world with so much information at our fingertips, in fact, one of the most difficult things to impart to students is how to differentiate reliable and unreliable sources.

“In the last few years, I have been rethinking many of my class assignments, trying to teach writing and argumentation skills through projects that have more value to students and to the world than standard academic research papers,” Weininger said.

Sarah Silberman, a Wiess College senior who was also attracted to the course as a means of exploring her own family history, was shocked to find that prominent Jewish historian Paula Hyman had a short, “not very good” article on Wikipedia.

“She essentially founded the modern study of Jewish women’s history,” Silberman said.

Accustomed to writing long-form research papers as a history major, Silberman appreciated the Wikipedia project as an opportunity to write in a completely different format as well as a chance to have her research published in a public venue for a change.

“I think Dr. Weininger set up the course in a really good way that made it both nontraditional but still very effective,” Silberman said.

The residual effects of the course are just as important: With more women editors now trained on how to research, cite and write Wikipedia articles — and with an increased awareness of the ongoing gender imbalance issue — Weininger hopes her students will continue their contributions in the future.

“I am really proud of the work my students produced through this project,” Weininger said. “They dedicated themselves to it wholeheartedly and it was clear at the end that they had both learned a lot and enjoyed the process, as well as the outcome.”

For Fisher, who still checks in on her Wikipedia entry like a proud parent, it was a project she’ll never forget.

“It was easily the coolest part of the class because it actually felt like we were making a real, tangible impact,” Fisher said. “It wasn’t just like we were writing a paper and then it disappears forever. This is permanent.”


Katharine Shilcutt is a media relations specialist in Rice University’s Office of Public Affairs.


Interested in adapting a Wikipedia writing assignment for your own course? Visit teach.wikiedu.org for access to our free assignment templates and tools.


This blog was originally published February 10, 2020 by Rice University’s Office of Public Affairs. It has been republished here with permission from the author.

New names for everyone!

00:14, Friday, 21 2020 February UTC

The Cloud Services team is in the process of updating and standardizing the use of DNS names throughout Cloud VPS projects and infrastructure, including the Toolforge project. A lot of this has to do with reducing our reliance on the badly-overloaded term 'Labs' in favor of the 'Cloud' naming scheme. The whole story can be found on this Wikitech proposal page. These changes will trickle out over the coming weeks or months, but there is one change you might notice already.

New private domain for VPS instances

For several years virtual machines have been created with two internal DNS entries: <hostname>.eqiad.wmflabs and <hostname>.<project>.eqiad.wmflabs. As of today, hosts can also be found in a third place: <hostname>.<project>.eqiad1.wikimedia.cloud. There's no current timeline to phase out the old names, but the new names are now the preferred/official internal names. Reverse DNS lookups on an instance's IP address will return the new name, and many other internal cloud services (for example Puppet) will start using the new names for newly-created VMs.
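
You can verify this yourself from any machine that resolves the cloud's internal DNS; a minimal Python check (the IP address and names below are placeholders):

    import socket

    # A reverse lookup on an instance's private IP now returns the new-style
    # name, e.g. myhost.myproject.eqiad1.wikimedia.cloud (placeholder values).
    name, aliases, addresses = socket.gethostbyaddr("172.16.0.42")
    print(name)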

Eventually the non-standard .wmflabs top level domain will be phased out, so you should start retraining your fingers and updating your .ssh/config today.


If you visit the Students tab of your course page, you may notice that it looks a bit different. As the first phase of our Spring 2020 push to improve the instructor user experience, we’ve rolled out a new version of the Students list that lets you view details about the Article Assignments for any student.

This new view builds on the improvements to the student user experience that we launched before the start of the term, and includes links to the key sandbox pages where students will complete the different phases of their assignment — preparing their bibliography, drafting their articles, reviewing the drafts of their peers, and editing live Wikipedia articles. We’ve built this interface in particular around the task of evaluating student work, and we’ll continue to iterate on it over the next several months to make it easier for instructors to efficiently view and grade their students’ work for each milestone in the Wikipedia assignment.

The Assigned Articles view of the Students tab of the Wiki Education Dashboard, showing links to key sandbox and article pages for each assigned article for a selected student.

We’re eager to get your feedback! If you have ideas for what else could be changed to make the grading process easier, leave a comment on our blog or send an email.

The next phase of this project is to sit down with instructors and see how they make use of the Dashboard for reviewing and grading their students’ Wikipedia work. If you can spare an hour to meet with the Wiki Education technology team to test out these features, discuss your grading process, and help us plan further improvements, please let us know!

Changes to Security Team Workflow

18:18, Thursday, 20 2020 February UTC

In an effort to create a repeatable, streamlined process for consumption of security services the Security Team has been working on changes and improvements to our workflows. Much of this effort is an attempt to consolidate work intake for our team in order to more effectively communicate status, priority and scheduling. This is step 1 and we expect future changes as our tooling, capabilities and processes mature.

How to collaborate with the Security Team

The Security Team works in an iterative manner to build new security services and mature existing ones as we face new threats and identify new risks. For a list of currently deployed services available in this iteration, please review our services page [1].

The initial point of contact for the majority of our services is now a consistent Request For Services [2] (RFS) form [3].

The two workflow exceptions to RFS are the Privacy Engineering [4] service and the Security Readiness Review [5] process, which already had established methods that are working well.

If the RFS forms are confusing or don't lead you to the answers you need, try security-help@wikimedia.org to get assistance with finding the right service, process, or person.

security@wikimedia.org will continue to be our primary external reporting channel.

Coming changes in Phabricator

We will be disabling the workboard on the Privacy [6] project. This workboard is not actively or consistently cultivated and often confuses those who interact with it. Privacy is a legitimate tag to be used in many cases, but the resourced privacy contingent within WMF will be using the Privacy engineering [7] component.

We will be disabling the workboard for the Security [8] project. Like the Privacy project this workboard is not actively or consistently cultivated and is confusing. Tasks which are actively resourced should have an associated group [9] tag such as Security Team [10].

The Security project will be broken up into subprojects with meaningful names that indicate how users relate to the security landscape. This is so that Security no longer serves double duty as an ACL and a group project. It closes long-standing debt and mirrors the work done in T90491 for Operations to improve transparency. An ACL*Security-Issues project will be created; Security will still be available to link cross-cutting issues, but membership will now be on an equal footing for all Phabricator users.

Other Changes

A quick callout to the consistency [11] and Gerrit sections of our team handbook [12]. As a team we have agreed that all changesets we interact on need a linked task with the Security-Team tag.

security@ will soon be managed as a Google group collaborative inbox [13] as outlined in T243446. This will allow for an improved workflow and consistency in interactions with inquiries.

Thanks
John

[1] Security Services
https://www.mediawiki.org/wiki/Wikimedia_Security_Team/Services
[2] RFS docs
https://www.mediawiki.org/wiki/Security/SOP/Requests_For_Service
[3] RFS form
https://phabricator.wikimedia.org/maniphest/task/edit/form/72/
[4] Privacy Engineering form
https://form.asana.com/?hash=554c8a8dbf8e96b2612c15eba479287f9ecce3cbaa09e235243e691339ac8fa4&id=1143023741172306
[5] Readiness Review SOP
https://www.mediawiki.org/wiki/Security/SOP/Security_Readiness_Reviews
[6] Phab Privacy tag
https://phabricator.wikimedia.org/tag/privacy/
[7] Privacy Engineering Project
https://phabricator.wikimedia.org/project/view/4425/
[8] Security Tag
https://phabricator.wikimedia.org/tag/security/
[9] Phab Project types
https://www.mediawiki.org/wiki/Phabricator/Project_management#Types_of_Projects
[10] Security Team tag
https://phabricator.wikimedia.org/tag/security-team/
[11] Security Team Handbook
https://www.mediawiki.org/wiki/Wikimedia_Security_Team/Handbook#Consistency
[12] Secteam handbook-gerrit
https://www.mediawiki.org/wiki/Wikimedia_Security_Team/Handbook#Gerrit
[13] Google collab inbox
https://support.google.com/a/answer/167430?hl=en

Khyati Soneji

Khyati Soneji has been improving the Wiki Education Dashboard for nearly a year now. She joined our open tech project as an Outreachy Intern in 2019 and is now mentoring another intern for a new project. Recently she attended the SWASTHA meetup in Mumbai along with our 2019 Google Summer of Code intern Amit Joki, who has been helping improve the Dashboard for almost two years. The experience further inspired their passion for the open knowledge project we all know and love.

“SWASTHA is a new initiative to tackle the lack of authentic healthcare-related articles in the regional languages of India,” Amit explained. “It stands for Special Wikipedia Awareness Scheme for The Healthcare Affiliates. Incidentally, the acronym also means ‘health’ in Hindi.”

The initiative was started by Abhishek Suryawanshi, who believes in the importance of accessible knowledge, especially when it comes to healthcare. “The project focuses on delivering high quality and verified health related articles which are region and language based, so people can read the articles in their native language,” Khyati explained.

Amit Joki

“Because there are lots of Indians who don’t speak English,” Amit added, “it’s difficult for them to access free knowledge regarding healthcare because articles online in their regional languages are nearly non-existent.” This problem of access is one that folks in the SWASTHA initiative hope to tackle.

Khyati and Amit have both worked in a tech capacity behind the scenes of Wikipedia-related projects, but they hadn't had experience with volunteer editor communities before.

“The meetup was an informal, friendly meeting which introduced us to various Indian Wikipedia communities of editors,” Khyati shared. “I personally was not at all aware of the various native language editors communities.”

“It was a great way to know the people behind the articles,” Amit agreed. “The passion they have is what defines Wikipedia as a movement. They put the face to the movement and that’s always an excellent way to get more people into the fold.”

SWASTHA meet up participants.
Image by Raykannu (CC BY-SA 4.0 via Wikimedia Commons)

Both Amit and Khyati were impressed by how many years of experience were represented among these Wikipedians. “Some of their experience was far greater than our ages and that was humbling and overwhelming, in a good sense,” said Amit. Khyati added that “some people have made their family also contribute to Wikipedia and have become a Wikipedian family! It was great to see their passion and enthusiasm to help their people to get free knowledge in their native language.”

Over two days, the 20 or so participants divided themselves into groups and discussed problems faced in their communities that the SWASTHA initiative could help address. Khyati and Amit shared how the Dashboard could be used as part of those solutions. The first step, the group decided, was to translate the 10 health focused Wikipedia articles chosen as part of the initiative into everyone’s native languages.

“As a Dashboard team, we came up with an idea that there should be one instructor and everyone would be assigned a task to translate the articles in their native language,” said Khyati. “And we can easily check the number of contributors, check their progress and how many health articles are available in each language, and a general idea that writing health articles (in any language) should be made part of the curriculum of some top-notch Medical College so that professors can assign articles to students and professors would then verify those articles and we can get authenticated medical articles, which can later be divided into different languages.”

“We had a quick analysis about the challenges we may face,” Amit also explained. “Those included problems ranging from having hardly any canonical source of authentic terminology of the diseases in the regional languages, to finding a way to validate the articles. Because it’s healthcare we are talking about, it becomes our responsibility that what gets published is thoroughly vetted by experts before it ends up in the public domain. After the challenges were discussed, we framed a tentative timeline stretching across a period of 3 months after which we would return to quantify the work that has been done and to further discuss how we should take this forward.”

All in all, Khyati and Amit agreed that the meetup was inspiring and energizing.

“It was such a nice experience seeing the energy and eagerness of people to help others get free access to knowledge,” said Khyati. “I feel lucky to have become part of such a community and am very thankful that I was invited to such a wonderful event. I am hopeful that we will be able to accomplish the task and help people. I’m glad to help in this great initiative by bringing new members into the group and also by contributing.”

“Overall, the whole experience was amazing,” Amit echoed. “I got to board an aeroplane for the first time in my life, thanks to Wikimedia and Wiki Education. The people I met were the nicest set of people I could have met and this only further strengthens my belief that Wikipedia stands for all the good that can be found on the Internet – openness, inclusiveness, accessibility, and caring about knowledge.”


To learn more about the SWASTHA initiative, click here. To read more about our tech mentoring program, check out this blog post.

mwparser on wheels

06:48, Tuesday, 18 2020 February UTC

mwparserfromhell is now fully on wheels. Well...not those wheels - Python wheels!

If you're not familiar with it, mwparserfromhell is a powerful parser for MediaWiki's wikitext syntax with an API that's really convenient for bots to use. It is primarily developed and maintained by Earwig, who originally wrote it for their bot.
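
If you haven't tried it, here is a small taste of the API (the wikitext below is made up for the example):

    import mwparserfromhell

    text = "{{Infobox person|name=Paula Hyman}} She was a [[historian]]."
    wikicode = mwparserfromhell.parse(text)

    # Templates and links become objects you can inspect and modify.
    for template in wikicode.filter_templates():
        print(template.name, "->", template.get("name").value)

    for link in wikicode.filter_wikilinks():
        print(link.title)

    # The project's README documents mwparserfromhell.parser.use_c, which is
    # True when the C speedup discussed below was built and is in use.
    print(mwparserfromhell.parser.use_c)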

Nearly 7 years ago, I implemented opt-in support for using mwparserfromhell in Pywikibot, which is arguably the most used MediaWiki bot framework. About a year later, Merlijn van Deen added it as a formal dependency, so that most Pywikibot users would be installing it...which inadvertently was the start of some of our problems.

mwparserfromhell is written in pure Python with an optional C speedup, and to build that C extension, you need to have the appropriate compiler tools and development headers installed. On most Linux systems that's pretty straightforward, but not exactly for Windows users (especially not for non-technical users, which many Pywikibot users are).

This brings us to Python wheels, which allow for easily distributing built C code without requiring users to have all of the build tools installed. Starting with v0.4.1 (July 2015), Windows users could download wheels from PyPI so they didn't have to compile it themselves. This resolved most of the complaints (along with John Vandenberg's patch to gracefully fallback to the pure Python implementation if building the C extension fails).

In November 2016, I filed a bug asking for Linux wheels, mostly because it would be faster. I thought it would be just as straightforward as Windows, until I looked into it and found PEP 513, which specified that basically, the wheels needed to be built on CentOS 5 to be portable enough to most Linux systems.

With the new GitHub Actions, it's actually pretty straightforward to build these manylinux1 wheels - so a week ago I put together a pull request that does just that. On every push it will build the manylinux1 wheels (to test that we didn't break manylinux1 compatibility), and on tag pushes it will upload those wheels to PyPI for everyone to use.

Yesterday I did the same for macOS because it was so straightforward. Yay.

So, starting with the 0.6.0 release (no date set yet), mwparserfromhell will have pre-built wheels for Windows, macOS and Linux users, giving everyone faster install times. And, nearly everyone will now be able to use the faster C parser without needing to make any changes to their setup.

weeklyOSM 499

14:44, Sunday, 16 2020 February UTC

04/02/2020-10/02/2020

lead picture

OpenStreetMap mapping activity layer by Kontur 1 | © Kontur, Mapbox | © map data OpenStreetMap contributors

Mapping

  • Jinal Foflia shared a MapRoulette task to add street names in the regions around Jakarta, as well as a task to add missing sidewalks in Singapore to make the Singapore map more pedestrian-friendly.
  • Mateusz Konieczny suggested using amenity=faculty as he thinks that university faculties often deserve to be mapped separately. Comments in the discussion raise various issues with the naming of constituent parts of universities. (Nabble)
  • The European Water Project continues its efforts to improve the tagging schema for drinking-water related tagging. Stuart from the project started the voting on the proposal for drinking_water:refill=<yes/no> and drinking_water:refill_scheme=<scheme-name/multiple>.
  • Joseph Eisenberg suggested improving the tagging system for micromapping as he thinks that the current tags aren’t quite right for this purpose. (Nabble)
  • The voting for amenity=give_box, an amenity where you can share various types of items freely, has ended with an unclear result as it seems it is ambiguous how ‘abstain’ should be counted.
  • Ilya Zverev reports (ru) (automatic translation) that a large amount of editorial work (reverts, including deletion of old object versions) has started in Russia: about 600 organised unknown accounts (probably Rostelecom) have copied buildings and addresses from unauthorised sources, presumably in preparation for this year’s census.

Community

  • The first meeting of the Diversity and Inclusion Special Committee was held on 12 February. Mikel Maron posted some rough notes on what came out of the meeting.
  • Heather Leson reported on her time spent at FOSDEM 2020, particularly the community and legal sessions. She was pleased to see that Ilya Zverev’s talk on reverse geocoding is available online, as his talk was to a full room.

OpenStreetMap Foundation

  • The OSMF invited people to join the OpenStreetMap diversity mailing list. They also asked for help in translating their blog posts to languages other than English.

Events

  • Nick Whitelegg brought to the notice of the talk mailing list readers the Panorama Mapping Party with TrekView, a not-for-profit organisation which aims to capture panoramas. The event will take place on 2 May 2020 at the English village Ashurst, in Hampshire.
  • Stefan Keller from the HSR University of Applied Sciences Rapperswil invites (de)(automatic translation) you to the 12th Mapathon and Mapping Party to be held on 24 April 2020 in Rapperswil, Switzerland. The invitation is also extended to English-speaking mappers and beginners.
  • FOSS4G-IT 2020, an event dedicated to Free and Open Source Geographic Data and Software, will take place from 18 to 22 February 2020 at the Politecnico di Torino / Turin, Italy.

Humanitarian OSM

  • Warin has the impression that HOT is using the undocumented tag damage= and contacted the HOT mailing list with his observation.
  • EurekAlert, a non-profit news-release distribution platform, features an article about the three finalists of ‘Creating Hope in Conflict’, a challenge supported by the US Agency for International Development, the UK Department for International Development, and the Ministry of Foreign Affairs of the Netherlands. The goal is to help the vulnerable and hardest-to-reach people affected by humanitarian crises.
  • The website thenextweb.com reported about the efforts of scientists to develop an AI-based tool that can help find safe routes after a disaster strikes, enabling families to find each other.

Education

  • Jez Nicholson made “OSMUK-in-a-box”, his own toolchain for querying OSM data for the UK from a database, available at GitHub.

Maps

  • Andrei Kashcha’s ridgeline map, which we have recently written about, now allows you to set bounds for the map.
  • Hans van der Kwast gave a tutorial on how to resolve hydrological discrepancies in DEMs by using ‘stream burning’. The tutorial uses QGIS combined with stream network data from OSM and SRTM data downloaded from the USGS Earth Explorer.
  • Russ Garret, the creator and maintainer of OpenInfraMap, announced on Twitter that OpenInfraMap now links power infrastructure to Wikipedia and uses images from Wikidata where available. Links have also been added to the UK’s Renewable Energy Planning Database.
  • Plamen Pasliev made a prototype available which scrapes listings from one of the largest German real estate portals. It uses Google services to geolocate the listings and nearby amenities and provides results on an OSM-based map.

Programming

  • Chris Beddow, from Mapillary, blogged about the possibilities of working with Mapillary data in Jupyter Notebooks. The hands-on guide explains how to access the Mapillary APIs and work with images, image sequences, and (if you’ve paid the subscription for map data) with features such as traffic signs, crosswalks, utility poles and pavement markings.

Did you know …

  • … the OpenStreetMap mapping activity layer from Kontur, which shows how actively every place in the world is mapped and which mapper is most active there?
  • … that OpenMapTiles supports more than 50 languages? The names are taken from OpenStreetMap and enhanced by adding data from Wikidata.
  • … that OSM is used on the website coronavirus.app, which shows where the coronavirus has been found?

Other “geo” things

  • Several media outlets such as the Indian Hindustan Times and the German Der Spiegel (de) (automatic translation) featured the 15th birthday of Google Maps in articles describing the service and its history. Jen Fitzpatrick, Senior Vice President at Google Maps, also recaps the development from Google’s perspective in her blog.
  • A map of South-east England with place names spelt phonetically in Polish was recently tweeted. It is presumed that the map was created to assist the many Polish pilots in the Royal Air Force during the Second World War.
  • Morgan Herlocker, of SharedStreets, and formerly of Mapbox, uses the example of the recent artistic exploitation of Google’s real-time traffic algorithm (as we reported), to explain some of the issues involved in turning traffic sensor data into something usable by road users.
  • A detailed Twitter thread on cycleway provision in the British city of Leicester, provides a good introduction to a lot of detailed terminology which might be of interest to dedicated micro-mappers.
  • A new minimum distance of 1.5 m will soon apply in Germany when overtaking cyclists. This distance calculator can be used to estimate how close an overtaking vehicle was in a bicycle camera shot.
  • To some graffiti is a cultural asset and an important aspect of a city’s aesthetics; to others it’s vandalism. A team from Heidelberg University’s GIScience Research Group have developed a deep learning approach to detect building facades with graffiti artwork based on the automatic interpretation of images from Google Street View.
  • Barrington-Leigh and Millard-Ball’s paper on urban sprawl (which we covered earlier) received more coverage, this time in an article on CityLab. The article points to the authors’ sprawl map and provides us with the fact that there are 10,845,867 dead ends mapped in OSM.
  • bikeradar reviewed the OS Trail 2 Bike GPS and found it a frustratingly flawed device that doesn’t merit its £400 price tag and provides little incentive for choosing it over a Garmin. Of particular interest was the finding that: ‘[d]evices that use OpenStreetMap provide much greater detail and include details such as mountain bike tracks. OS map data doesn’t have this level of granular information, rendering it fairly incompatible with your average trail centre rider.’

Upcoming Events

Where                | What                                                           | When                  | Country
Karlsruhe            | Karlsruhe Hack Weekend February 2020                           | 2020-02-15–2020-02-16 | Germany
Mainz                | Mainzer OSM-Stammtisch                                         | 2020-02-17            | Germany
Viersen              | OSM Stammtisch Viersen                                         | 2020-02-18            | Germany
Derby                | Derby pub meetup                                               | 2020-02-18            | United Kingdom
Cologne Bonn Airport | 126. Bonner OSM-Stammtisch                                     | 2020-02-18            | Germany
Lüneburg             | Lüneburger Mappertreffen                                       | 2020-02-18            | Germany
Turin                | FOSS4G-it/OSMit 2020                                           | 2020-02-18–2020-02-22 | Italy
Ulmer Alb            | Stammtisch Ulmer Alb                                           | 2020-02-20            | Germany
Rennes               | Atelier découverte                                             | 2020-02-23            | France
Takasago             | Takasago Open Datathon                                         | 2020-02-24            | Japan
Singen               | Stammtisch Bodensee                                            | 2020-02-26            | Germany
Düsseldorf           | Düsseldorfer OSM-Stammtisch                                    | 2020-02-26            | Germany
Lübeck               | Lübecker Mappertreffen                                         | 2020-02-27            | Germany
Brno                 | Únorový brněnský Missing maps mapathon na Geografickém ústavu  | 2020-02-27            | Czech Republic
Toulouse             | Contrib’atelier OpenStreetMap                                  | 2020-02-29            | France
Budapest             | Budapest gathering                                             | 2020-03-02            | Hungary
London               | Missing Maps London                                            | 2020-03-03            | United Kingdom
Stuttgart            | Stuttgarter Stammtisch                                         | 2020-03-04            | Germany
Praha/Brno/Ostrava   | Kvartální pivo                                                 | 2020-03-04            | Czech Republic
Dortmund             | Mappertreffen                                                  | 2020-03-06            | Germany
Riga                 | State of the Map Baltics                                       | 2020-03-06            | Latvia
Amagasaki            | GPSで絵を描いてみようじゃあ~りませんか                        | 2020-03-07            | Japan
Freiburg             | FOSSGIS-Konferenz                                              | 2020-03-11–2020-03-14 | Germany
Chemnitz             | Chemnitzer Linux-Tage                                          | 2020-03-14–2020-03-15 | Germany
Valcea               | EuYoutH OSM Meeting                                            | 2020-04-27–2020-05-01 | Romania
Guarda               | EuYoutH OSM Meeting                                            | 2020-06-24–2020-06-28 | Spain
Cape Town            | HOT Summit                                                     | 2020-07-01–2020-07-02 | South Africa
Cape Town            | State of the Map 2020                                          | 2020-07-03–2020-07-05 | South Africa

Note: If you would like to see your event here, please put it into the calendar. Only data which is in the calendar will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Polyglot, Rogehm, SK53, SunCobalt, TheSwavu, YoViajo, derFred.

An article in VICE starts as follows: 'Wikipedia consensus is that an unedited machine translation, left as a Wikipedia article, is worse than nothing'. The article is problematic in many ways, starting with this premise: the Cebuano Wikipedia does not contain machine translation. It contains machine-generated text, and, to add insult to injury, the same article states that 'the majority (generated articles) are surprisingly well constructed'.

An article like this can be sanity checked. Principles come first:
  • This is about a wiki, in contrast to the Nupedia approach.
  • Wikipedia’s founding goal is to make knowledge freely available online in as many languages as possible.
  • There is a difference between opinions and facts.
It also matters how arguments are made. When "highly trusted users who specialize in combating vandalism" are introduced and comment that "many articles are created by bots", it does not follow that the quality is low, nor that this is to be considered vandalism, but the implication is made.

It is a fact that the Cebuano Wikipedia has 5,378,563 articles, and also that some 16.5 million people understand Cebuano. There is, however, no relation between these two facts. More relevant is that the wife of Sverker Johansson has Cebuano as her mother tongue, and that his two kids learn about their maternal cultural heritage thanks in part to the work he does for the Cebuano Wikipedia. That is very much a classic wiki approach.

In contrast, the English Wikipedia has its bot policy preventing the use of bots for generating content. These notions should be local to the English Wikipedia and need not have relevance elsewhere. These highly trusted users can be expected to proselytize this point of view, and thanks to this POV they take away a source of information without offering any credible alternative for the lack of information available to the rest of the world. At the same time, the English Wikipedia is biased in the information it provides and does not provide the same quality of service for the domains selected for the Cebuano Wikipedia.

Sadly, it is said, the Wikimedia Foundation itself makes no effective difference in support of the "other" languages. An alternative to the LSJbot was introduced, and it may be able to make a difference, but as it does not provide a public-facing service it is very much a paper tiger. Even worse are the Nupedia notions in the combination of two quotes: "Due to its heavy reliance on Wikidata entries, the quality of content produced is heavily influenced by the quality of the Wikidata available." and "It can discredit other Wikipedia entries related to automatic creation of content or even the Wikipedia quality." These notions are problematic for several reasons:
  • No information is preferred over little information when our service to an end user is considered
  • Quality of information is framed in the light of existing Wikipedia entries. Whose Wikipedia entries are we considering? They are, however, irrelevant, as our aim is to inform our end users; the existing entries do not cover the same subjects.
  • As for the quality of Wikidata: it is a wiki, and its quality is improving, particularly as so many eyes shine their light on it.
  • We can inform, in any and all languages, and we do not even have to call it Wikipedia; we do not even have to save it in a Wikipedia when we only cache the results of automated text generation.
  • When we cache results of automated text generation, texts can be generated again when the data is expanded or changed.
So far for the critique of the VICE article; but then again, does English Wikipedia not have its own problems?
  • Its 1,143 administrators and 137,368 active users are struggling to keep up. When you compare that with the 6 administrators and 14 active users of the Cebuano Wikipedia, it is understandable that, as they grow, the English have to rely more and more on bots and artificial intelligence.
  • Magnus has demonstrated that the maintenance of lists is better served not by editors but by using the data from Wikidata.
  • The Wikipedia technology has a problem with false friends. Arguably some 4% of list entries are wrong because the wrong article is linked to. When links are solidified by using Wikidata identifiers instead, this problem disappears in the same way as the problems with interwiki links disappeared.
The biggest problem "Wikipedia consensus" has is that it was formulated in the past by a tiny in-crowd making up the "accepted" big words for the rest of us; worse, they cannot be swayed from their POV by facts.
Thanks,
      GerardM
