Tech News issue #24, 2020 (June 8, 2020)

00:00, Monday, 08 2020 June UTC
This document has a planned publication deadline (link leads to timeanddate.com).
The Wikimedia Foundation is important for the support of languages on the Internet. The localisation of its software is done at translatewiki.net, in over 300 languages.

The milestones for multilingual support are:
These milestones have been very much technology driven. For me, the one reason Wikidata became the success it is, is that from the start it was linked to every subject covered by Wikipedia, and the solution was so overwhelmingly superior that nobody could reasonably object.

To make a success of this latest milestone, institutional support is needed. It is for the Wikimedia Foundation and its movement to reduce their bias towards English and make room for improved language support.

My way of phrasing this as an essential objective: "All of is available to every single person on the planet". As we adopt this as our objective, it is first and foremost about making Special:MediaSearch useful in any and all of our languages and make it available from any and all of our Wikipedias.

As we adopt this, it is essential that multilingual search is given priority over special interests including GLAM, Open Data, SPARQL and what have you. Opening up in multiple languages comes first. Special interests only gain relevance when it is made obvious how they help open up Commons in Swahili, Hindi, German or Vietnamese.

Special:MediaSearch is possible because of everything that went before. Its functionality is part of MediaWiki and localised at translatewiki.net. The existing search engine is now linked to the labels for items in Wikidata and it was made public after Hay Kranen brought us his proof of concept. It became available warts and all, and while finding منصور اعجاز in Punjabi is huge, it is not great when you do not find cats because a user is called Kočka.

The challenge to us as an organisation and a movement: are we willing to work on our existing bias, open up Commons in all the languages we are said to support, and accept that our hobby horses will get attention not in the next but in a future iteration?
Thanks,
       GerardM

The Toolforge Composition

17:25, Thursday, 04 2020 June UTC

Toolforge, formerly known as wmflabs, is changing its URLs. Where there was one host (tools.wmflabs.org) before, each tool now gets its own sub-domain (e.g. mix-n-match.toolforge.org).
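
As a rough illustration of the URL change, the sketch below rewrites an old-style address to the new per-tool sub-domain. It assumes the tool name is the first path segment of the old URL (for example tools.wmflabs.org/mix-n-match/); the function name `toToolforgeUrl` is made up for this example and is not part of any Toolforge API.

```javascript
/**
 * Rewrite an old tools.wmflabs.org URL to the per-tool toolforge.org form.
 * Assumption: the first path segment of the old URL is the tool name.
 * @param {string} oldUrl
 * @return {string} The rewritten URL, or the input unchanged if it does not match.
 */
function toToolforgeUrl( oldUrl ) {
  const url = new URL( oldUrl );
  if ( url.hostname !== 'tools.wmflabs.org' ) {
    return oldUrl;
  }
  // Split "/mix-n-match/some/page" into the tool name and the remaining path.
  const segments = url.pathname.split( '/' ).filter( ( part ) => part !== '' );
  const tool = segments.shift();
  if ( !tool ) {
    return oldUrl;
  }
  url.hostname = tool + '.toolforge.org';
  url.pathname = '/' + segments.join( '/' );
  return url.toString();
}

console.log( toToolforgeUrl( 'https://tools.wmflabs.org/mix-n-match/' ) );
```

Query strings are carried over unchanged, since only the host name and path are rewritten.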

Until now, I have used my WiDaR tool as a universal OAuth login for many of my tools, so users only have to sign in once. However, since this solution only works within the same sub-domain, it is no longer viable with the new Toolforge URL schema.

I am scrambling to port my tools that use OAuth to their own sign-in. To make this easier, I put my WiDaR tool into a PHP class that can be reused across tools; the individual tool API can then pick up the requests that were previously sent to WiDaR. Some tools, like Mix-n-match, have already been ported.

This brought me back to something that has been requested of some of my tools before – portability, namely to MediaWiki/Wikibase installations other than the Wikimedia ones. A tool with its own WiDaR would be much more portable to such installations.

But the new WiDaR class is included via the shared file system of Toolforge; how to get it portable? Just copying it seems like a duplication of effort, and it won’t receive updates etc.

The solution, in the PHP world, is called composer, a package manager for PHP. While I was at it, I ported several of my often-reused PHP “library scripts” to composer, and they are available in code here, or as an installable package here.

Since the source files for composer slightly differ from the ones I use internally on Toolforge, I wrote a script to “translate” my internal scripts into composer-compatible copies.

The first tool I equipped with this composer-based WiDaR is Tabernacle. It should be generic enough to be useful on other Wikibase installations, and is very lightweight (the PHP part just contains a small wrapper API around the WiDaR class). Installation instructions are in the repo README.

I will continue converting tools to the new URL schema, as time allows. I hope I will beat the hard deadline of June 15.

We stand for racial justice.

23:57, Wednesday, 03 2020 June UTC

George Floyd’s death last week at the hands of law enforcement in Minneapolis lays bare the tremendous inequalities and racism that black people face in the United States on a daily basis. In the past few weeks, his name, along with those of Ahmaud Arbery, Breonna Taylor, and David McAtee, has joined a staggering register of victims of violent anti-Black racism in America.

We see our Black colleagues, community members, readers, and supporters grieving, fearful, and feeling the weight of this week and the history of all of the weeks just like this. Today, and every day, the Wikimedia Foundation stands in support of racial justice and with the movement for Black Lives. As an employer and part of an international movement our work in every country depends on promoting and defending human rights.

Over the past week, we have witnessed communities across the U.S. and around the world stand up for racial justice and demand an end to police brutality and extrajudicial killings. This has been met with more brutality, arrests, and even lethal force against citizens from Minneapolis to New York City, Los Angeles to Washington, D.C. In many places this policing response has been accompanied by egregious attacks on freedoms of the press and the rights to freedom of speech and assembly.

On these issues, there is no neutral stance. To stay silent is to endorse the violence of history and power; yesterday, today, and tomorrow. It is well past time for racial justice in America and beyond.

The Wikimedia vision, “a world in which every single human can share in the sum of all knowledge,” guides our commitment to the inherent dignity and value of every single human being. Our efforts are animated by foundational understandings: that the right to information is fundamental, universal, and inviolable, and that our work will be forever incomplete until all voices are heard.

In 2017, the Wikimedia Foundation adopted an explicit commitment to “Knowledge Equity.” We pledged our focus as a social movement to supporting knowledge and communities that have been left out by structures of power and privilege, and to breaking down the social, political, and technical barriers preventing people from accessing and contributing to free knowledge.

We understand our work to support free knowledge is about far more than a website. It involves reclaiming knowledge from gatekeepers and reestablishing it as something we do and share together. It is a radical act of freedom and reimagination of the status quo. It calls on all of us to shape what we understand of our world, be critical readers of conventional wisdom, and participate in writing history. Our work cannot be separated from the work of equality and freedom.

We recognize and stand with Black Americans in the fight for justice and equality. We reject racism and the ideology of white supremacy. We condemn attacks on the press and protesters in violation of the fundamental right to freedom of expression. To these ends, we make the following statements.

We call upon governments to:

 

We commit to advancing racial justice in Wikimedia work, including:

 

Furthermore, we wish to amplify the following Wikimedia affiliates and efforts:

 

We hope that one day the Wikimedia projects document a grand turning point — a time in the future when our communities, systems, and institutions acknowledge the equality and dignity of all people. Until that day, we stand with those who are fighting for justice and for enduring change. With every edit, we write history.

By Trey Jones, Senior Software Engineer, Search Platform, The Wikimedia Foundation

Khmer, also called Cambodian, is the official language of Cambodia. The Khmer script is a syllabary closely related to the scripts for Thai and Lao, and more distantly to all the Brahmic scripts. Khmer is written left-to-right in syllable groups, without spaces between words—though that’s not its most complicated feature!

Screenshot of the Khmer Wikipedia article វិគីភីឌា (“Wikipedia”), May 15, 2020.

Syllables govern the world

Khmer text is composed of syllables, which are built around a base character that represents a consonant and an inherent vowel. Up to two additional supplementary consonants may be added to the syllable by stacking them underneath the base character (usually). The inherent vowel of the base character can also be changed by adding a different vowel sign to the syllable. Other diacritics may also be added to alter the pronunciation of the consonant or vowel.

The various elements that glom on to the base character can attach themselves below, above, to the left, or to the right of the base character, and sometimes multiple elements can stack, especially above or below the base character. Some supplementary consonants can also go to the side of the base character rather than below, and some of the vowel signs have two parts—one to the left, and the other above or to the right. Some diacritics change shape or location in the presence of other diacritics—for example, if both would normally go on top of the base character—presumably to keep things from getting too crowded.

In the example below, the base character, “sa”, is in red. The first supplementary consonant, “ta” (in orange), goes below the base character. The second supplementary consonant, “ro” (in yellow), goes mostly to the left, though a bit of it is below the base character. The vowel sign, “oe” (in green), is in two parts—one goes to the left of everything else, and the other goes above the base character. The transliteration of this syllable is “straeu”. The final vowel, which determines the vowel for the whole syllable, is named “oe” in Unicode, though it is phonetically transcribed as /aə/, and it can be transliterated into Latin script as “aeu”—just to keep things confusing for non-speakers!
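
To make the structure concrete, this sketch lists the code points of the example syllable. It assumes the syllable is encoded in the order base consonant, two subscript consonants (each introduced by the COENG sign, U+17D2), then the vowel sign; the code point values are taken from the Unicode Khmer block.

```javascript
// "straeu": SA + (COENG + TA) + (COENG + RO) + vowel sign OE.
const syllable = '\u179F\u17D2\u178F\u17D2\u179A\u17BE';

// Array.from iterates by code point, so each element is one Unicode character.
const codePoints = Array.from( syllable, ( ch ) =>
  'U+' + ch.codePointAt( 0 ).toString( 16 ).toUpperCase()
);

console.log( codePoints.join( ' ' ) );
// Six code points are stored, although the text renders as a single syllable.
console.log( codePoints.length );
```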

Order is the law of all intelligible existence

Unicode support for Khmer is very different from, for example, Hangul—the script for the Korean language. Modern Hangul has 40 basic letters that can be combined into the 11,172 syllables in common usage, all of which are available as individual Unicode characters. For example, 걂 (HANGUL SYLLABLE GYALM) is a single pre-composed character, even though it is made up of four component characters: ᄀ + ᅣ + ᄅ + ᄆ. [1]
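
The count of 11,172 follows from how modern Hangul syllable blocks are laid out in Unicode: 19 leading consonants × 21 vowels × 28 trailing options (27 trailing consonants or clusters, plus none). The sketch below checks that arithmetic and composes the example syllable with the standard `String.prototype.normalize` method. One assumption about encoding to be aware of: Unicode represents the trailing l + m cluster as a single conjoining jamo (U+11B1), so the decomposed form used here has three code points rather than four.

```javascript
// 19 leading consonants, 21 vowels, 28 trailing slots (27 finals + "no final").
const LEADS = 19, VOWELS = 21, TRAILS = 28;
console.log( LEADS * VOWELS * TRAILS ); // 11172

// Conjoining jamo for g (U+1100), ya (U+1163) and the final cluster lm (U+11B1).
const jamo = '\u1100\u1163\u11B1';

// NFC composes the jamo sequence into one pre-composed syllable character.
const composed = jamo.normalize( 'NFC' );
console.log( composed.length ); // 1

// The same code point can be computed from the jamo indices:
// lead index 0 (g), vowel index 2 (ya), trail index 10 (lm).
const computed = String.fromCodePoint( 0xAC00 + ( 0 * VOWELS + 2 ) * TRAILS + 10 );
console.log( composed === computed );
```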

In Khmer, the task of composing characters into syllables is left entirely to some combination of the font, the operating system, and the application being used. [2] Back in the digital stone age (the 1990s and before), support for Khmer was spotty and inconsistent, like the support for many non-Latin (and even non-English) writing systems. [3] As a result, different fonts and different operating systems support typing the code points of a Khmer syllable in different orders, in that they will render the resulting syllable the same (or very nearly the same). This happens because, to simplify a bit, there isn’t really an obvious linear order to the elements that glom on top, to the left, and below the base character, so any semi-reasonable order will do.

The Unicode Standard sets out a canonical order for Khmer syllable elements, which corresponds to the order the elements are spoken—though there seem to still be a few elements that are ordered arbitrarily.[4] Incorrect ordering should result in glyphs with a dotted circle (◌), showing that they aren’t combining correctly, but many fonts and applications will still render non-canonical orders perfectly fine, or at least reasonably well, and some don’t show the dotted circle even when they render poorly.

Another issue is that for some Khmer diacritics, multiple copies can render directly on top of each other so that you can’t easily see that there are multiple copies. This can apply to vowels, supplementary consonants, and other diacritics.
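
One simple way to neutralize stacked duplicates is to collapse runs of the same combining mark into a single mark. The sketch below does that with a Unicode-aware regular expression; it is a simplified illustration, not the reordering algorithm described in this post, and the helper name `collapseRepeatedMarks` is invented here.

```javascript
/**
 * Collapse consecutive copies of the same combining mark into one.
 * \p{M} matches any Unicode mark (Khmer vowel signs and diacritics included).
 * @param {string} text
 * @return {string}
 */
function collapseRepeatedMarks( text ) {
  return text.replace( /(\p{M})\1+/gu, '$1' );
}

// KA (U+1780) followed by three copies of vowel sign U (U+17BB).
const messy = '\u1780\u17BB\u17BB\u17BB';
const clean = collapseRepeatedMarks( messy );
console.log( Array.from( messy ).length, Array.from( clean ).length );
```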

A similar but much smaller scale problem in English is that é can be either a single character or two characters (e + ´) composed together—but in Khmer this is turned up to 11! A more analogous example would be if the character sequences scrum, srcum, sucrm, and scruuuuum all looked identical.
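
The English case is easy to reproduce with JavaScript's built-in Unicode normalization. In the sketch below, the combining character is U+0301 (COMBINING ACUTE ACCENT).

```javascript
const precomposed = '\u00E9';   // é as a single code point
const decomposed = 'e\u0301';   // e followed by COMBINING ACUTE ACCENT

// The two strings look identical but are not equal code point for code point.
console.log( precomposed === decomposed ); // false
console.log( precomposed.length, decomposed.length ); // 1 2

// Normalizing both to the same form (here NFC) makes them comparable.
console.log( precomposed.normalize( 'NFC' ) === decomposed.normalize( 'NFC' ) ); // true
```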

All of this variability causes great difficulty in search because two words that look exactly the same could underlyingly be composed of very different sequences of characters.

Example is always more efficacious than precept

Below are some screenshots of examples of Khmer syllables that look the same or very similar, despite the different orders of the syllable elements. I’m providing screenshots because the exact behavior of any given non-canonical sequence of characters may differ depending on the font used, and I don’t know what fonts you have available, Dear Reader.

All of these are examples I found on the Khmer Wikipedia.

In this group, the supplementary consonant (the third element in the first row, which looks like an intersection symbol: ∩), can appear between or after any of the other elements. This is analogous to the scrum, srcum, sucrm case above.

In the next set of examples, the rendered syllables are not exactly the same, but are probably close enough to go unnoticed if you are reading quickly. Not only can the final three elements appear in any order; in the first example, the second element is a completely different character: it looks like a double quote (“) rather than a comma (,). It is moved down and below the base character because of the presence of the last element (which looks like a degree symbol: °), as there isn’t room for both above the base character.

As mentioned before, some vowel signs are made up of two parts. In this case, those two parts each exist as independent characters. If used together, they can look exactly like the single vowel sign. This is a bit like using “vv” instead of “w” in English.

This example shows the most duplication I found of any diacritic. The two characters above the base character don’t render particularly well anyway, so the slight increase in thickness of the circle is almost impossible to notice at reading speed. This is analogous to the scruuuuum case above—though it is more like scruuuuuuuuuuuuuum!

Finally, these are examples of how some of the malformed syllables above should appear, according to the Unicode specs—but, of course, depending on your fonts, OS, and apps, your mileage may vary.

Just figure out what’s next

Obviously, having this kind of invisible variability can make some sequences of characters almost impossible to find using normal text-based search, which makes the search on Khmer-language wikis less effective.

So, how big is the malformed Khmer syllable problem, and what can be done about it?

After much research and experimentation, in the fall of 2019 I built a prototype tool that reads in Khmer text, identifies syllables, and reorders the malformed ones. I’ve been able to use feedback from Khmer readers to improve it, and to get a sense of the kinds of problems that exist on Khmer-language wikis. The tool also made it easier to see how frequent the problem is.

The good news is that syllable errors on Khmer Wikipedia are fairly rare—only about 0.17% of syllables in my sample of articles needed to be reordered. Also, the reordering algorithm works very well—only 0.39% of the 0.17% of the reorderings (or, 0.00067% of the total) were erroneous, and most of those were the result of obvious typos in the original text. Errors are noticeably more common in queries on Khmer Wikipedia—1.3%—but the error rate in fixing them is about the same (0.38% of the 1.3%, or 0.0049% of the total).
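
The combined rates are just products of two percentages. The sketch below redoes the arithmetic from the rounded figures quoted above; because those inputs are already rounded, the results can differ from the quoted totals in the last digit. The function name is made up for this example.

```javascript
/**
 * Share of all items that were reordered incorrectly.
 * @param {number} reorderedPct Percentage of items that needed reordering.
 * @param {number} errorPct Percentage of those reorderings that were wrong.
 * @return {number} Percentage of the total.
 */
function overallErrorPct( reorderedPct, errorPct ) {
  return reorderedPct * errorPct / 100;
}

const articles = overallErrorPct( 0.17, 0.39 ); // roughly 0.00066
const queries = overallErrorPct( 1.3, 0.38 );   // roughly 0.0049
console.log( articles.toPrecision( 2 ), queries.toPrecision( 2 ) );
```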

The plan is to use the prototype as a basis for building a plugin for Elasticsearch—the search engine that underlies on-wiki search—and deploy it to Khmer-language wikis. When syllables are reordered, both the original and reordered syllables will be searchable.

You can track progress on the task on Phabricator. You can read more details about my investigation into Khmer syllable reordering on MediaWiki, including a bunch of example reordered syllables and discussion about them; you can also find details of the syllable reordering algorithm, and all the best references I found while learning about Khmer syllables, if you want to read more!

__________________

[1] Here is the color-coded breakdown of 걂 (HANGUL SYLLABLE GYALM) into ᄀ + ᅣ + ᄅ + ᄆ (g + ya + l + m).

[2] For rare, historical letters in Hangul, it is up to the font, OS, or application to combine them into syllable blocks, because no pre-composed Unicode characters exist. Some fonts are better than other fonts—just as with all Khmer syllables:

[3] The online Cambodian Information Center’s “Khmer Fonts” page (cambodia.org/fonts/), where I downloaded many Khmer fonts, offers this bit of unofficial history:

As computer and internet industry gain influence and market in Cambodia, several types of Khmer fonts have been developed… Most of them were not developed by using Unicode or meet the guideline of the Unicode Standards. However, all of these fonts have been widely utilized with word processing, such as Word in Microsoft Office. Because many of these fonts were neither developed using Unicode Standards nor adopted by makers of World Wide Web (WWW) browsers, many Khmer fonts were not readable without special library drivers.

[4] There is a lot of material and discussion from the Unicode Consortium about Khmer syllables. Chapter 16.4 of Version 13.0 of the Core Specification [PDF] includes a section on Ordering of Syllable Components (p 662 of the standard; 27th page of the linked PDF), if you are either super interested in the nerdy details—like me—or you need help falling asleep at 3am.

About this post

About Featured Image: Khmer inscription from Prasat Kravan (ប្រាសាទក្រវាន់), a 10th-century temple in Angkor, Cambodia. (Image cropped from “Prasat Kraven – Doorway Inscriptions”, by Greg Willis, CC BY-SA 2.0)

Section headings are quoted from John Selden, John Stuart Blackie, Samuel Johnson, and Steve Jobs.

Just want to give a shout-out to the wonderful folks at Microsoft and elsewhere who have gotten a Visual Studio Code Insiders build created for Windows on ARM64, which runs natively on the Surface Pro X and other ARM64 machines.

It’s still not listed in the regular downloads but it works for me when installed directly, and should auto-update with further Insiders releases. :)

The x86 build ran acceptably for some light development on the Surface Pro X in emulation, but the native build feels a *lot* faster. Starts up instantly, no longer so sluggish to scroll or wait for linter updates.

Now all I need is Docker for Win10/ARM64 and for WSL2 to fix the ARM64 performance problems with Hyper-V. :)

Celebrating 600,000 commits for Wikimedia

18:36, Monday, 01 2020 June UTC

By James Forrester, Software Engineer

Last week, the 600,000th commit was pushed to Wikimedia’s Gerrit server!

We thought we’d take this moment to reflect on the developer services we offer and our community of developers, be they Wikimedia staff, third party workers, or volunteers.

At Wikimedia, we currently use a self-hosted installation of Gerrit to provide code review workflow management, and code hosting and browsing. We adopted this in 2011–12, replacing Apache Subversion.

Within Gerrit, we host several thousand repositories of code (2,441 as of today). This includes MediaWiki itself, plus all the many hundreds of extensions and skins people have created for use with MediaWiki. Approximately 90% of the MediaWiki extensions we host are not used by Wikimedia, only by third parties. We also host key Wikimedia server configuration repositories like puppet or site config, build artifacts like vetted docker images for production services or local .deb build repos for the software we use (like etherpad-lite), ancillary software (like our special database exporting orchestration tool for dumps.wikimedia.org), and dozens of other uses.

Gerrit is not just (or even primarily) a code hosting service, but a code review workflow tool. Per the Wikimedia code review policy, all MediaWiki code heading to production should go through separate development and code review for security, performance, quality, and community reasons. Reviewers are required to use their “good judgment and careful action,” which is a heavy burden, because “[m]erging a change to the MediaWiki core or an extension deployed by Wikimedia is a big deal.” Gerrit helps them do this by providing clear views of what is changing, supporting itemized, character-level, file-level, or commit-level feedback and revision, allowing series of complex changes to be chained together across multiple repositories, and ensuring that forthcoming and merged changes are visible to product owners, development teams, and other interested parties.

Across all repositories, we average over 200 human commits a day, though activity levels vary widely. Some repositories have dozens of patches a week (MediaWiki itself gets almost 20 patches a day; puppet gets nearly 30), whereas others get a patch every few years. There are over 8,000 accounts registered with Gerrit, although the activity is not distributed uniformly throughout that cohort.

To focus engineer time where it’s needed, a fair amount of low-risk development work is automated. This happens in both creating patches and also, in some cases, merging them.

For example, for many years we have partnered with translatewiki.net’s volunteer community to translate and maintain MediaWiki interfaces in hundreds of languages. Exports of translators’ updates are pushed and merged automatically by one of the TWN team each day, helping our users keep a fresh, usable system whatever their preferred language.

Another key area is LibraryUpgrader, a custom tool to automatically upgrade the libraries we use for continuous integration across hundreds of repositories, allowing us to make improvements and increase standards without a single central breaking change. Indeed, the 600,000th commit was one of these automatic commits, upgrading the version of the mediawiki-codesniffer tool in the GroupsSidebar extension to the latest version, ensuring it is written following the latest Wikimedia coding conventions for PHP.

Right now, we’re working on upgrading our installation of Gerrit, moving from our old version based on the 2.x branch through 2.16 to 3.1, which will mean a new user interface and other user-facing changes, as well as improvements behind the scenes.

More on those changes will be coming in later posts.

About this post

Featured image credit: A vehicle used to transport miners to and from the mine face by ‘undergrounddarkride’, used under CC-BY-2.0.

Original publication: This post originally appeared on Phame: https://phabricator.wikimedia.org/phame/post/view/197/celebrating_600_000_commits_for_wikimedia/

The best documentation automation can buy

15:58, Monday, 01 2020 June UTC

HEADER CAPTION: Screenshot from Wikimedia's famous Visual Editor. The typo "documenation" has a red squiggly line under it indicating the spell checker has automatically detected a spelling error by the author.

Tools for validating that JavaScript documentation is current and error-free have advanced significantly over the last several years. It is now possible to detect mismatches between a program's documentation and its source code automatically using a free and open-source, industry-standard type checker. This goes way beyond typos.

JavaScript typing is loose

JavaScript is an untyped language. Unlike a typed language, a JavaScript program is always generated regardless of whether the types in it are valid. Some consider JavaScript's fast-and-loose style a feature, not a bug. Notable proponents of that viewpoint include Douglas Crockford and Paul Graham.
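
A small illustration of what that looseness means in practice: the function below is documented as taking numbers, yet JavaScript runs it with any arguments and never complains. The function name is made up for this example.

```javascript
/**
 * @param {number} a
 * @param {number} b
 * @return {number}
 */
function add( a, b ) {
  return a + b;
}

console.log( add( 1, 2 ) ); // prints 3
// Passing a string is a type error on paper, but the program still runs:
// the + operator silently switches to string concatenation.
console.log( add( '1', 2 ) ); // prints 12 (the string "12")
```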

There have been numerous articles written on the subject, but I suspect that most readers already understand the value of clear typing. For any nontrivial program with multiple authors and any longevity, especially those likely to be found among the sprawling wikis, strong typing is much more practical and sustainable than the alternative. With good typing, one can quickly grasp the structure of a program. That is, you can conceptualize and interface with any well-typed API whether you understand how it works internally or not. Refactors are a lot easier too, and while not fearless, a typed codebase is far more malleable than an untyped one. Type checks are also a great way to verify your work, just like in grade school.

CAPTION: 65 miles per hour is how many kilometers per hour? So long as the fractions are correct, we can validate the conversion by checking that the units cancel each other out. In type checking, our function parameters, function return types, and object properties must align in a similar way but the process is automated.
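
The same unit-cancelling check can be written out as a tiny typed function. This is only a sketch of the analogy; the names are invented, and the conversion factor assumes the international mile of exactly 1.609344 km.

```javascript
/** Kilometers in one international mile. */
const KM_PER_MILE = 1.609344;

/**
 * Convert a speed from miles per hour to kilometers per hour.
 * (mi / h) * (km / mi) = km / h: the miles cancel out.
 * @param {number} mph Speed in miles per hour.
 * @return {number} Speed in kilometers per hour.
 */
function mphToKph( mph ) {
  return mph * KM_PER_MILE;
}

console.log( mphToKph( 65 ).toFixed( 5 ) ); // prints 104.60736
```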

Many bugs could be caught before arriving in production if every patch had its typing validated—but don't take my word for it. Phan, the PHP type checker, is now a required validation test for any change to MediaWiki Core as well as many extensions. It's like a bunch of built-in unit tests specifically for types. Without automation, these tests can require thousands of lines of hand written code that are tedious and time consuming to author, read, and maintain (e.g., see the otherwise excellent Popups extension). In the worst cases, no tests are written at all.

CAPTION: Types must align like clockwork or the machine stops running. Image by ElooKoN / CC BY-SA.

Documentation should be correct

Good typing is just as important in documentation. For JavaScript, documentation is largely written in JSDoc (or its deprecated competitor, JSDuck). Wikipedians seem to agree that documentation is a very good idea. If documentation is a good idea, correct and up-to-date documentation is an even better one. There's a tool for that: it's called TypeScript.

If you haven't heard of TypeScript yet, it may be because it's not very common at Wikimedia except for the uber-amazing work by the WMDE and Wikidata communities (e.g., see wikibase-termbox which is over 80% TypeScript) as well as explorations several years back by Joaquín Oltra Hernández. However, it is now immensely popular globally and has proven itself to be far more than just a fashionable trend from 2012.

So what is TypeScript exactly? TypeScript is JavaScript with types. Whether you choose to use it for functional code like WMDE or not, TypeScript features the ability to lint plain JavaScript files for the type correctness of their JSDocs. You don't need Webpack and you don't need to make any functional changes to your code (unless it's incorrect and out of sync with the documentation, in which case the changes are bug fixes). Your JavaScript is the same as it ever was but now, if your documentation and program don't match, TypeScript will report an error.
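
For instance, a plain JavaScript file can opt in to checking with a `// @ts-check` comment (or project-wide via the `checkJs` compiler option). The sketch below is hypothetical code, not taken from any Wikimedia repository, and the error described in the closing comment is paraphrased rather than quoted from the compiler.

```javascript
// @ts-check

/**
 * @param {string} title Page title.
 * @return {number} Number of UTF-16 code units in the title.
 */
function titleLength( title ) {
  return title.length;
}

// Matches the documentation: a string goes in, a number comes out.
const mainPageLength = titleLength( 'Main Page' );
console.log( mainPageLength ); // prints 9

// A call such as titleLength( 42 ) would still run in plain JavaScript
// (and return undefined), but the type checker would reject it because
// a number is not assignable to the documented string parameter.
```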

This isn't just better documentation, it's documentation as accurate as we can write in an automated way. Who doesn't want better documentation?

What changes are needed?

Typing at the seams. In practice, this usually means documenting function inputs and outputs, and user types using JSDoc syntax. E.g.:

JSDocs
/**
 * Template properties for a portlet.
 * @typedef {Object} PortletContext
 * @prop {string} portal-id Identifier for wrapper.
 * @prop {string} html-tooltip Tooltip message.
 * @prop {string} msg-label-id Aria identifier for label text.
 * @prop {string} [html-userlangattributes] Additional Element attributes.
 * @prop {string} msg-label Aria label text.
 * @prop {string} html-portal-content
 * @prop {string} [html-after-portal] HTML to inject after the portal.
 * @prop {string} [html-hook-vector-after-toolbox] Deprecated and used by the toolbox portal.
 */

/**
 * @param {PortletContext} data The properties to render the portlet template with.
 * @return {HTMLElement} The rendered portlet.
 */
function wrapPortlet( data ) {
  const node = document.createElement( 'div' );
  node.setAttribute( 'id', 'mw-panel' );
  node.innerHTML = mustache.render( portalTemplate, data );
  return node;
}

CAPTION: If this code was undocumented or the types inaccurate, would you always get the data properties right? Maybe you would, but what about everyone else?

Most programmers are already typing their JavaScript to some extent with JSDocs, so often only refinements are needed. In other cases, TypeScript's excellent type inference abilities can be leveraged so that no changes are required.

Type definitions are a useful supplement to JSDocs. Definitions are non-functional documentation that support type annotations inline. For example, the definition of the powerful but fantastically loose jQuery API could find marvelous utility in many Wikimedia codebases for at-your-fingertips documentation needs. Another very relevant example that ships with TypeScript itself is the DOM definition, which will alert you to misalignments such as attempting to access a classList on a Node instead of an Element. Thorough type checking is similar to, and the perfect complement of, ESLint checks for ES5-only sources or, more broadly, ESLint's safety checks.

Type definitions are also a convenient way to describe globals and, more generally, share types. Definitions are shipped either with the NPM package itself or with DefinitelyTyped (e.g., npm i -D @types/jquery) and are now standard practice for most noteworthy JavaScript libraries. Imagine if this degree of accuracy could be achieved in some of our most well-used codebases. Integrations between skins, extensions, Core, and peripheral libraries would be validated for alignment. It would be harder to break things and a much more welcoming experience for newcomers.

npm install jsdoc typescript tsconfig.json tsc Document!

The actual project setup for adding documentation checks to an existing repository is minimal and requires no functional changes:

  1. Add JSDoc and TypeScript as NPM development dependencies. Optionally: add any missing types for third-party libraries used.
  2. Add a tsconfig.json to tell TypeScript to lint JavaScript documentation you wish to validate.
  3. Add tsc to the project's NPM test script.
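
As a sketch of step 2, a minimal tsconfig.json might look like the following. The compiler options shown (`allowJs`, `checkJs`, `noEmit`, `strict`) are standard TypeScript options; the `include` path is an assumption about where a project keeps its JavaScript sources and should be adjusted to fit.

```json
{
  "compilerOptions": {
    "allowJs": true,
    "checkJs": true,
    "noEmit": true,
    "strict": true
  },
  "include": [ "resources/**/*.js" ]
}
```

With `noEmit` set, running tsc only reports problems and writes no output files, which is what a documentation lint step needs.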

The real work is in fleshing out the missing documentation with JSDocs. However, TypeScript is quite flexible about how one chooses to opt in or out of documentation validation. If code isn't worth documenting, it's probably not worth keeping, but typing can be consciously deferred in a number of ways. The most straightforward is probably with a // @ts-ignore comment. Think of it as progressively enhanced code.
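
As an illustration of deferring a check, the comment suppresses type errors on the line that follows it. The code is a contrived example, and `legacyConfig` is a hypothetical name.

```javascript
// @ts-check

/** @type {{ name: string }} */
const legacyConfig = { name: 'vector' };

// The documented type has no "version" property, so the checker would flag
// the next line. @ts-ignore defers that complaint until the type is fixed.
// @ts-ignore
legacyConfig.version = 2;

console.log( legacyConfig.name, legacyConfig.version );
```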

An example project setup for Vector is here which shows how typing and documentation can be retrofitted nicely even on codebases that predate TypeScript and make heavy use of sophisticated APIs like jQuery.

Editor support

It's unnecessary for setting up a project, but worth mentioning: ensuring that even a machine can model your documentation means that your code editor can understand it too. For most editors, this means you'll get accurate, split-second documentation lookup and documentation type checking. Visual Studio Code has a superb out-of-box experience for JavaScript including documentation awareness and code completion, but other editors are supported too.

CAPTION: Errors are identified as you write. There's that typo again but this time it could be your next unbreak now or your next type checker error.

You would see similar output from a continuous integration job or the command line:

CAPTION: The command line output is just as informative.

And here are those excellent docs:

CAPTION: Documentation is a mouse hover away. Coding with documentation at hand is a breeze and the expectation for many modern developers writing their first MediaWiki patches.

Conclusion

65 miles per hour is 104.60736 kilometers per hour. Language changes the way we think, and documentation is the encyclopedia of code. Tooling that improves our abilities to understand, reason, and express ourselves through language improves our ability to engineer.

In my own personal and professional development, I've found accurate documentation to be a great treasure that gives me confidence and efficiency in the code that I read and write. Maybe we should have the same hopes and expectations of our documentation that newcomers do. Maybe with better documentation—documentation that is as accurate as we can automate—some of Wikipedia's many JavaScript errors could be identified and eliminated as easily as changing units from mph to kph. Maybe with better documentation, we could write better software, faster. Software that users love using and developers enjoy writing. Let's get to work!

CAPTION: Programs are like jigsaw puzzles where types are the shapes. Check assembly before shipping. Image by Muns and Schlurcher / CC BY-SA.

Thanks to Sam Smith, Joaquín Oltra Hernández, and Leszek Manicki for reviewing and providing feedback.

NOTE: Documentation on building better documentation is being written on wiki with the help of editors like you!

Celebrating 600,000 commits for Wikimedia

14:17, Monday, 01 2020 June UTC

Earlier today, the 600,000th commit was pushed to Wikimedia's Gerrit server. We thought we'd take this moment to reflect on the developer services we offer and our community of developers, be they Wikimedia staff, third party workers, or volunteers.

At Wikimedia, we currently use a self-hosted installation of Gerrit to provide code review workflow management, and code hosting and browsing. We adopted this in 2011–12, replacing Apache Subversion.

Within Gerrit, we host several thousand repositories of code (2,441 as of today). This includes MediaWiki itself, plus all the many hundreds of extensions and skins people have created for use with MediaWiki. Approximately 90% of the MediaWiki extensions we host are not used by Wikimedia, only by third parties. We also host key Wikimedia server configuration repositories like puppet or site config, build artefacts like vetted docker images for production services or local .deb build repos for software we use like etherpad-lite, ancillary software like our special database exporting orchestration tool for dumps.wikimedia.org, and dozens of other uses.

Gerrit is not just (or even primarily) a code hosting service, but a code review workflow tool. Per the Wikimedia code review policy, all MediaWiki code heading to production should go through separate development and code review for security, performance, quality, and community reasons. Reviewers are required to use their "good judgement and careful action", which is a heavy burden, because "[m]erging a change to the MediaWiki core or an extension deployed by Wikimedia is a big deal". Gerrit helps them do this by providing clear views of what is changing, supporting itemised, character-level, file-level, or commit-level feedback and revision, allowing series of complex changes to be chained together across multiple repositories, and ensuring that forthcoming and merged changes are visible to product owners, development teams, and other interested parties.

Across all of our repositories, we average over 200 human commits a day, though activity levels vary widely. Some repositories have dozens of patches a week (MediaWiki itself gets almost 20 patches a day; puppet gets nearly 30), whereas others get a patch every few years. There are over 8,000 accounts registered with Gerrit, although activity is not distributed uniformly throughout that cohort.

To focus engineer time where it's needed, a fair amount of low-risk development work is automated. This happens in both creating patches and also, in some cases, merging them.

For example, for many years we have partnered with TranslateWiki.net's volunteer community to translate and maintain MediaWiki interfaces in hundreds of languages. Exports of translators' updates are pushed and merged automatically by one of the TWN team each day, helping our users keep a fresh, usable system whatever their preferred language.

Another key area is LibraryUpgrader, a custom tool to automatically upgrade the libraries we use for continuous integration across hundreds of repositories, allowing us to make improvements and increase standards without a single central breaking change. Indeed, the 600,000th commit was one of these automatic commits, upgrading the version of the mediawiki-codesniffer tool in the GroupsSidebar extension to the latest version, ensuring it is written following the latest Wikimedia coding conventions for PHP.

Right now, we're working on upgrading our installation of Gerrit, moving from our old version based on the 2.x branch through 2.16 to 3.1, which will mean a new user interface and other user-facing changes, as well as improvements behind the scenes. More on those changes will be coming in later posts.


Header image: A vehicle used to transport miners to and from the mine face by 'undergrounddarkride', used under CC-BY-2.0.

Tech News issue #23, 2020 (June 1, 2020)

00:00, Monday, 01 2020 June UTC

weeklyOSM 514

10:14, Sunday, 31 2020 May UTC

19/05/2020-25/05/2020

Akihiko Kusanagi: real-time 3D digital map of Tokyo’s public transport system 1 | © Akihiko Kusanagi | map data © OpenStreetMap contributors

Mapping

  • Voting for François Lacombe’s line_management=* proposal has started. The key is meant to be used in conjunction with power=line, power=minor_line and power=cable for describing particular topologies at their supports or other important points.
  • Also, Jan Michel’s proposal for the new access keys electric_bicycle=* and speed_pedelec=* has been opened for voting.
  • Peter Elderson created a proposal to add specific roles for the members of recreational route relations, namely alternative, excursion, approach, connection and — probably the most used role — main. Comments are appreciated.
  • Edits by Mateusz Konieczny, made with assistance of a script, removed over 1000 tracking parameters from links within the OSM database. See an example changeset.
  • Pascal Neis has implemented an additional parameter to allow the retrieval of the oldest entries from his OSM Notes service, which passes OSM error reports as an XML feed. However, it should be noted that some OSM notes have been carried over from the predecessor OpenStreetBugs and are still open.

Community

  • Our Q&A site help.openstreetmap.org is dying, according to Tobias Wrede.
  • Stefan Keller, professor of information systems at HSR University for Applied Sciences (Hochschule für Technik), Rapperswil, prepares (de) (automatic translation) people for the end of the lockdown with links to speciality maps of every kind of outdoor activity, from visiting caves and BBQ areas to a map of public art. Unfortunately, some of the maps are specific to Switzerland.
  • OpenCage talked with Alexis Markwick, the man behind Bexhill-OSM, a service that uses OSM to help expose both tourist information and local history for Bexhill-on-Sea, England.

Imports

  • The OSM community in Fortaleza, Ceará, Brazil, plans to import building data provided by the local municipality and created a comprehensive page on the OSM wiki.

OpenStreetMap Foundation

  • The minutes of the OSMF board meeting on 17 April 2020 have been published.
  • The Data Working Group (DWG) published its activity report for the 3rd quarter 2019.
  • The State of the Map Working Group invites you to submit your OSM-related artwork for the poster exhibition at the virtual State of the Map 2020. The deadline is 30 June.

Education

  • Sergey Golubev recorded (ru) (automatic translation) a podcast in which he explains how to use OSM data.

Maps

switch2OSM

  • Another example of using OSM for eco projects is in the forecasting of unpleasant smells (ru) from the landfill near the city of Klin (Russia).

Open Data

  • The OpenSpeedcam website now allows (ru) (automatic translation) its data to be used in OSM.
  • DELFI, a cooperation network of Federal States in Germany, the Federal Government and other partners, published (de) (automatic translation) a dataset under the Lex OSM licence that aims to include all public transport stops in Germany.

Software

  • The Trufi Association held its own TrufiAppHackathon on 14 and 15 May. In a blog post featuring a video they unveil what they have achieved so far.
  • Egor Smirnov in his article (ru) (automatic translation), on Habr, described how his visual data loader for OSM, YourMaps, is designed and how it works.

Programming

  • Paul Norman recommends shutting trac.openstreetmap.org down and directing users to project-specific trackers. He also noted that there is only one active project still using OSM’s trac instance. He suggests mothballing SVN, due to the connections between it and trac.
  • Paul Norman has created a prototype of a client-side rendered map based on OSM’s main map style Carto and made the code available on GitHub.

Releases

  • Version 1 of routing engine GraphHopper has been released.
  • The developers announced the release of OsmAnd 3.7, available for Android devices. Among the most relevant changes: new offline slope maps, full style customisation of favourites and GPX waypoints, customisable context menu, and more options for the Wikipedia map layer.
  • For iOS users the new release of OsmAnd 3.14 features beta functionality for public transport navigation, new offline slope maps, options to show lines and direction arrows to active markers, and a new option to switch between top bar and widget to show distances to markers.
  • Mapbox blogged about updates made to its products during May 2020. The article highlights new gestures for mobile maps and the improved Vision SDK, and provides links to some interesting tutorials.

Did you know …

  • … there is video of a talk by retired ambassador Allan Mustard titled ‘”I’m Tired of Getting Lost!” or How Open-Source Cartography Improved our Lives in Turkmenistan’, given at a North American Cartographic Information Society banquet?
  • … about uMap, a tool that lets you create a map with OpenStreetMap layers and embed it in your website? Check out its guide on the OSM Wiki.
  • … that the Saint Petersburg public transport portal (ru) (automatic translation) uses OSM?
  • … about the ‘City tree map‘? (ru) (automatic translation) This project allows residents and organisations to add trees to the map of the city; you can also add information about the trees and photos.

Other “geo” things

  • ‘How the world became data-driven, And what’s next?’, an article in Forbes.
  • Outdooractive, a German outdoor planning website using OSM, among other sources, acquired Viewranger and MountNpass.
  • The New York Times carried a story by Alanna Mitchell on the ‘height modernisation’ being carried out by the National Oceanic and Atmospheric Administration. The programme will see the old NAVD 88 system replaced with one based on GPS and gravity observations, resulting in ‘elevations’ falling by up to a metre across the USA.
  • Yandex has published (ru) (automatic translation) a study on how cities are coming out of isolation.

Upcoming Events

Where | What | When | Country
Biella | Incontro mensile | 2020-05-30 | Italy
London | Missing Maps ONLINE London Mapathon | 2020-06-02 | United Kingdom
Stuttgart | Stuttgarter Stammtisch | 2020-06-03 | Germany
Arlon | Atelier ouvert OpenStreetMap | 2020-06-03 | Belgium
Rennes | Réunion mensuelle | 2020-06-08 | France
Taipei | OSM x Wikidata #17 | 2020-06-08 | Taiwan
Lyon | Rencontre mensuelle | 2020-06-09 | France
Munich | Münchner Treffen | 2020-06-11 | Germany
Zurich | 117. OSM Meetup Zurich | 2020-06-11 | Switzerland
San José | Virtual Civic Hack Night & Map Night | 2020-06-11 | United States
Berlin | 144. Berlin-Brandenburg Stammtisch | 2020-06-12 | Germany
Berlin | OSM Berlin Verkehrswende (Online) | 2020-06-16 | Germany
Lüneburg | Lüneburger Mappertreffen | 2020-06-16 | Germany
Cologne Bonn Airport | 130. Bonner OSM-Stammtisch | 2020-06-16 | Germany
Leoben | Stammtisch Obersteiermark (cancelled) | 2020-06-18 | Austria
Cape Town | HOT Summit | 2020-07-01 to 2020-07-02 | South Africa
Kandy | 2020 State of the Map Asia | 2020-10-31 to 2020-11-01 | Sri Lanka

Note: If you would like to see your event here, please put it into the calendar. Only data that is in the calendar will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Anne Ghisla, Elizabete, Nakaner, NunoMASAzevedo, Polyglot, Rogehm, SK53, Silka123, SunCobalt, TheSwavu, YoViajo, derFred.

Are you familiar with occupational epidemiology? It’s the study of whether working conditions are safe for workers. As workplaces determine whether or not it’s safe to open up facilities again and resume “normal” work amidst a global pandemic, organization leaders are ideally making these important decisions with science and employee safety in mind.

Public health students in Tania Carreon-Valencia and Thais Morata’s course at the University of Cincinnati exercised their science communication muscles this spring as they added worker health and workplace safety information to Wikipedia. These topics are at the forefront of our collective consciousness right now as we contemplate (locally and globally) what “returning to work” looks like. And Wikipedia has proven to be a valuable resource during the pandemic as the world seeks updates on what to do.

While there may not be lots of peer-reviewed research yet about the effects of the pandemic on essential workers, it’s still worth keeping these topics up to date as information becomes available. Being aware of the risks of dental aerosols (a new Wikipedia page created by one of these students) might cause workplaces to contemplate how else coronavirus can spread and take precautions for reducing the risk. As this new page will inform you, the instruments that dentists use to probe and clean your teeth create aerosols that can pose a risk to clinicians and other patients. These dental aerosols even have the possibility to transmit diseases by spreading viruses, including SARS-CoV-2, which causes COVID-19. This is why on March 16th, 2020, the American Dental Association advised dentists to postpone all elective procedures. This student’s work has already been viewed more than 1,300 times, showing that even seemingly obscure topics can fill the information needs of many.

Another student improved the Wikipedia page about incident stress: the behavioral, emotional, and physical symptoms a frontline worker might have after experiencing something traumatic on the job. While there is no method that is completely effective for preventing incident stress, there are ways to reduce its impact on the affected person. Possible steps for maintaining on-site health include maintaining nutrition and rest; limiting exposure to further stimuli, like noise; making sure an employer is prepared to respond to cases of incident stress; and more. These steps are now captured in the corresponding Wikipedia page in a brand new section about “prevention” thanks to a student.

And the Wikipedia page about shift work sleep disorder, which consistently receives about 150 views a day, saw quite a few improvements in April. The disorder causes adverse health effects in people whose work schedule disrupts their typical sleeping patterns. A student added that it often goes undiagnosed and that the health effects include increased risk of bone fractures, low fertility, obesity, diabetes, decreased immune functioning, and negative effects on mental health. The page now also makes clear that sleep deprivation may lead to medical errors, workplace accidents, and low productivity. And it includes more methods through which decreased sleep quality can be assessed. The page has received 10,000 visits since this student made these changes.

The Wikipedia writing assignment was internationally recognized as an important tool for science communication around public health by the National Institute for Occupational Safety and Health (NIOSH) in 2019. NIOSH recognizes that Wikipedia makes research “usable” for the general public and lauds the site for policies that make information verifiable for readers. When Wikipedia is one of the leading sources for medical information out there, making sure that information is rooted in the latest science is hugely important. And students are great folks to do that work (with the assistance of their expert instructors and our Wikipedia training materials). Let’s make sure workers know their rights and that employers are up to date on science that can best prepare them to make positive decisions for their employees.


Interested in incorporating a Wikipedia writing assignment into a future course? Visit teach.wikiedu.org for all you need to know to get started. And here are some tips for incorporating the assignment into a virtual course.


Thumbnail image by Gmihail, via Wikimedia Commons (CC BY-SA 3.0 RS).

Monthly Report, March 2020

18:25, Thursday, 28 2020 May UTC

Highlights

  • As a result of the global COVID-19 pandemic, Wiki Education closed its office in the Presidio and moved all its operations online. In order to deal with the new situation, staff created a contingency and a crisis communications plan for each program. We also instituted a weekly COVID-19 briefing aimed at creating a shared understanding of how the pandemic affects our organization. A “Friday virtual social hour” helps staff deal with being isolated at home.
  • March 2020 also saw dramatic changes to the higher education landscape as the vast majority of courses in the U.S. moved to online platforms as a result of the outbreak of COVID-19. It was a chaotic time for our instructors and students as they all adjusted to this new mode of learning, and Wiki Education was there to help. Wikipedia Student Program Manager Helaine Blumenthal checked in on courses to see if they needed additional help and to let them know that Wiki Education’s support would remain uninterrupted. We were truly heartened to hear from so many of our instructors as we all adjust to these new circumstances both in our professional and personal lives. We are grateful that we can continue to work with our instructors and students during this challenging time, and hope we can provide our students with a meaningful educational experience whether they are on or off campus.
  • We launched the third Scholars & Scientists course in partnership with the Society of Family Planning (SFP) to improve Wikipedia articles related to abortion and contraception. We know that Wikipedia plays a significant role in the research people do about health and medicine, and we are happy to work with SFP to ensure the public has access to the highest quality information about family planning.

Read more…

For student work highlights; examples of great work from our Scholars & Scientists, Wikidata, and Visiting Scholars Programs; finance and fundraising updates; and more, read our full report here.


Header/thumbnail image by Marcela McGreal (CC BY 2.0) shows protesters in New York, was uploaded to Wikimedia Commons by a student in Amy Carleton’s English course at Northeastern University, and is used in the Wikipedia article Asian American university resource center. Read this month’s report for more examples of great student work.
I have a renewed interest in Commons because the first steps have been made to make it actually useful. According to Wikidata there are two distinct Sarah T. Roberts. One is an epidemiologist; the other is into information & media studies.

At Commons it was a mess: the picture of one Sarah was used to illustrate an info box of the other Sarah. It is not that interesting to tell you how I did what. Relevant is that I did. I did because you will find things when there is a label for whatever in "your" language.

Given that we do not research the use of Commons or Wikidata for that matter, why should the WMF give priority to opening up Commons even further? After all, there is no data to support it.
Thanks,
      GerardM

Enhancing the disability healthcare information on Wikipedia is a powerful way to combat misinformation, discrimination, and prejudice around disability and disability healthcare. The online encyclopedia is the most utilized healthcare resource in the world with a reach of 500 million readers per month. Policy makers, doctors, and others need to understand the diverse communities they serve and the existing barriers for adults with disabilities, and Wikipedia’s content can help institutions institute more inclusive practices. But quality of content on Wikipedia varies widely, and the volunteers who write it may not have access to expensive medical journal articles or an understanding of the evolving field of Disability Studies. The majority of Wikipedia pages related to developmental disabilities need significant improvement. Many get hundreds of page views a day, indicating a demand for content that just isn’t complete.

Wiki Scientist Kathleen Downes was less than impressed with the depiction of spastic cerebral palsy on Wikipedia, so she uploaded a photo of herself as a child. (CC BY-SA 4.0)

Thanks to a grant from the WITH Foundation, our first WITH Wiki Scientists course helped combat these problems. We supported a group of 20 experts as they worked to add more than 11,000 words to Wikipedia about topics like spastic cerebral palsy, diagnostic overshadowing, the Civil Rights Act of 1968, special needs dentistry, the connection between sexual abuse and intellectual disability, and much more. Together, their work has been viewed more than 224,000 times, and their work will live on long beyond the course.

When we approached the WITH Foundation last year with the idea to run a Wikipedia training course for disability healthcare professionals, we hoped the course would be interesting and impactful for prospective participants. We worked with the WITH Foundation to share this opportunity with their networks and were excited to receive almost twice as many applications as there were seats available, including 14 from members of the American Academy of Developmental Medicine & Dentistry (AADMD). In our first WITH Wiki Scientists course this spring, we were able to support 9 of those members. Next month, we will present at the upcoming AADMD virtual conference to share these medical professionals’ impact on public scholarship.

We had hoped to use the conference presentation to share this virtual learning opportunity with AADMD members, but the coronavirus pandemic has pushed the virtual presentation beyond the registration deadline. If you’re attending the AADMD virtual conference, please join Director of Partnerships Jami Mathewson on Thursday, June 18, 2020 7:30-8:00 PM EDT. The 30-minute session aims to achieve the following learning outcomes:

1. To understand Wikipedia as a means of public scholarship and increasing access to current academic research about developmental disabilities

2. To learn how medical professionals are making Wikipedia more inclusive for people with developmental disabilities

3. To understand how healthcare providers are applying their new Wikipedia knowledge in their daily professional lives

Whether or not you’re an AADMD member, if you’re interested in participating in our second cohort from June 15th–September 4th, please apply at wikiedu.org/with-AADMD by June 5th. We encourage adults with developmental disabilities to apply and/or spread the opportunity in your networks. Together, we can help ensure medical professionals can provide comprehensive healthcare to everyone.

These students in India have to do a project. The subject is Botswana. Their teacher wants them to find many pictures, so he searched Wikimedia Commons among others for pictures of Mokgweetsi Masisi, the president of Botswana. He marked the pictures that depict Mr Masisi and now his pupils will find more pictures of him when they look for मोकेगसेसी मासी.

At the same time in Japan students have to do a project about Botswana. Their teacher is pleasantly surprised when he finds so many pictures for モクウィツィ・マシシ...
Thanks,
       GerardM

Commons app v2.13 beta

14:16, Tuesday, 26 2020 May UTC

Hope you are all safe and well. We’ve just released v2.13 to beta, which includes:

– A new media details UI, which includes the ability to zoom and pan around images
– When the user uploads a picture with a geotag, the app will check for Nearby places that need photos around that location, and if one is found, it will ask the user “Is this a picture of Place X?”
– Modifications to Nearby filters based on user feedback
– Bug and crash fixes for stuff that got broken by the codebase overhaul

Our next release will likely contain structured data integration, bookmarks for the Nearby map, and a couple of other new features.

Stay tuned!

@WikiCommons - meanwhile in a different universe

21:34, Monday, 25 2020 May UTC
And again there was a discussion that it should not be this hard to find pictures in Commons. The big difference this time is that there is now a wealth of images that have been tagged for what they "depict". They are linked to Wikidata items and they have a wealth of labels in many, many languages. In essence it has always been an objective of Wikidata to share its content in any and all of the 300+ languages supported by a Wikipedia.

The ideas that floated around soon made it into a "proof of concept" and, as so often, it actually worked after a fashion. The first iteration was, in true Wikimedia tradition, English only. The proof of concept got its second language in Dutch; Hay Kranen, the developer, is Dutch. Now there are nine languages and we are waiting for French to be the tenth.

So what does it do? You can look for pictures in Commons, which has 61 million media files, and when you are looking for available pictures in your language, you will find them as long as Wikidata has a label in your language. This is for instance a result in Japanese and this is the result in German.

What can you do to make it better? Add labels in your language for the things you want to find and find media files that depict what you are looking for. When nobody has translated the software into your language, you can even do that.

Why is this so relevant? Have you ever wondered how many pictures you find in one of the smaller languages using Google or Bing? Let me tell you, it is disappointing, to be polite. Commons is the repository of the media files that illustrate all the Wikipedias so yes, it covers "almost anything".

The Wikimedia Foundation has this big strategy for its movement to be inclusive. This is a wonderful opportunity to show how agile it is, that it understands and supports a need that has been expressed for many, many years. The beauty is that the way forward has been expressed in something that already works.

ABSOLUTELY, there will be challenges in integrating this functionality where it fulfills a need.

Luckily it is not necessary for it all to be done in one go. The first step can be as little as to take the "proof of concept" and rewrite it in the preferred language of the WMF, internationalise and localise it and keep it stand-alone for now. The people who know about it will use it and they will be the first to point out what more they want to be done. A priority will be to retain its KISSable nature.

The objective is to open up Commons. Open it up in any and all languages. For me it is obvious. I will gladly give it my attention in the expectation that both Wikidata and Commons actually find a public, have a purpose that is more than what we do for ourselves.
Thanks,
      GerardM

Production Excellence #20: April 2020

16:23, Monday, 25 2020 May UTC

How are we doing on that strive for operational excellence during these unprecedented times?

📊  Numbers for March and April
  • 3 documented incidents. [1]
  • 60 new Wikimedia-prod-error reports. [2]
  • 58 Wikimedia-prod-error reports closed. [3]
  • 178 currently open Wikimedia-prod-error reports in total. [4]

For more about recent incidents and pending actionables see Wikitech and Phabricator.


📉  Outstanding reports

Take a look at the workboard and look for tasks that could use your help.

→  https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Breakdown of recent months:

  • April 2019: Two reports closed, 2 of 14 left.
  • May: (All clear!)
  • June: 4 of 11 left (unchanged). ⚠️
  • July: 8 of 18 left (unchanged).
  • August: 2 of 14 reports left (unchanged).
  • September: 7 of 12 left (unchanged).
  • October: Two reports closed, 4 of 12 left.
  • November: One report closed, 4 of 5 left.
  • December: Two reports closed, 4 of 9 left.
  • January 2020: One report closed, 5 of 7 reports left.
  • February: One report closed, 6 of 7 reports left.
  • March: 2 new reports survived the month of March.
  • April: 13 new reports survived the month of April.

At the end of February the total of open reports over recent months was 58. Of those, 12 got closed, but with 15 new reports from March/April still open, the total is now up at 61 open reports.

The workboard overall (which includes pre-2019 tasks) has 178 tasks open. This is actually down by a bit for the first time since October with December at 196, January at 198, and February at 199, and now April at 178. This was largely due to the Release Engineering and Core Platform teams closing out forgotten reports that have since been resolved or otherwise obsoleted.

💡 Tip: Verifying existing tasks is a good way to (re)familiarise yourself with Kibana. For example: Does the error still occur in the last 30 days? Does it only happen on a certain wiki? What do the URLs or stack traces have in common?

🎉  Thanks!

Thank you to everyone who helped by reporting, investigating, or resolving problems in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof


Footnotes:
[1] Incidents. – https://wikitech.wikimedia.org/wiki/Incident_documentation
[2] Tasks created. – https://phabricator.wikimedia.org/maniphest/query/HjopcKClxTfw/#R
[3] Tasks closed. – https://phabricator.wikimedia.org/maniphest/query/ts62HKYPBxod/#R
[4] Open tasks. – https://phabricator.wikimedia.org/maniphest/query/Fw3RdXt1Sdxp/#R

Shocking tales from ornithology

03:05, Monday, 25 2020 May UTC
Manipulative people have always made use of the dynamics of ingroups and outgroups to create diversions from bigger issues. The situation is made worse when misguided philosophies are peddled by governments that put economics ahead of ecology. The pursuit of easily gamed targets such as GDP is preferable to ecological amelioration since money is a man-made and controllable entity. Nationalism, pride, other forms of chauvinism, the creation of enemies and the magnification of war threats are all effective tools in the arsenal of Machiavelli for use in misdirecting the masses when things go wrong. One might imagine that the educated, especially scientists, would be smart enough not to fall into these traps, but cases from history dampen hopes for such optimism.

There is a very interesting book in German by Eugeniusz Nowak called "Wissenschaftler in turbulenten Zeiten" (or scientists in turbulent times) that deals with the lives of ornithologists, conservationists and other naturalists during the Second World War. Preceded by a series of recollections published in various journals, the book was published in 2010 but I became aware of it only recently while translating some biographies into the English Wikipedia. I have not yet actually seen the book (it has about five pages on Salim Ali as well) and have had to go by secondary quotations in other content. Nowak was a student of Erwin Stresemann (with whom the first chapter deals) and he writes about several European (but mostly German, Polish and Russian) ornithologists and their lives during the turbulent 1930s and 40s. Although Europe is pretty far from India, there are ripples that reached afar. Incidentally, Nowak's ornithological research includes studies on the expansion in range of the collared dove (Streptopelia decaocto) which the Germans called the Türkentaube, literally the "Turkish dove", a name with a baggage of cultural prejudices.

Nowak's first paper of "recollections" notes that: [he] presents the facts not as accusations or indictments, but rather as a stimulus to the younger generation of scientists to consider the issues, in particular to think “What would I have done if I had lived there or at that time?” - a thought to keep as you read on.

A shocker from this period is a paper by Dr Günther Niethammer on the birds of Auschwitz (Birkenau). This paper (read it online here) was published when Niethammer was posted to the security at the main gate of the concentration camp. You might be forgiven if you thought he was just a victim of the war. Niethammer was a proud nationalist and volunteered to join the Nazi forces in 1937 leaving his position as a curator at the Museum Koenig at Bonn.
The contrast provided by Niethammer, who looked at the birds on one side while ignoring inhumanity on the other, provided novelist Arno Surminski with a title for his 2008 novel - Die Vogelwelt von Auschwitz - i.e. the birdlife of Auschwitz.

G. Niethammer
Niethammer studied birds around Auschwitz and also shot ducks in numbers for himself and to supply the commandant of the camp Rudolf Höss (if the name does not mean anything please do go to the linked article / or search for the name online).  Upon the death of Niethammer, an obituary (open access PDF here) was published in the Ibis of 1975 - a tribute with little mention of the war years or the fact that he rose to the rank of Obersturmführer. The Bonn museum journal had a special tribute issue noting the works and influence of Niethammer. Among the many tributes is one by Hans Kumerloeve (starts here online). A subspecies of the common jay was named as Garrulus glandarius hansguentheri by Hungarian ornithologist Andreas Keve in 1967 after the first names of Kumerloeve and Niethammer. Fortunately for the poor jay, this name is a junior synonym of  G. g. anatoliae described by Seebohm in 1883.

Meanwhile inside Auschwitz, the Polish artist Wladyslaw Siwek was making sketches of everyday life  in the camp. After the war he became a zoological artist of repute. Unfortunately there is very little that is readily accessible to English readers on the internet (beyond the Wikipedia entry).
Siwek, the artist who documented life at Auschwitz before working as a wildlife artist.
 
Hans Kumerloeve
Now for Niethammer's friend Dr Kumerloeve, who also worked in the Museum Koenig at Bonn. His name was originally spelt Kummerlöwe and, like Niethammer, he was a doctoral student of Johannes Meisenheimer. Kummerlöwe and Niethammer made journeys on a small motorcycle to study the birds of Turkey. Kummerlöwe's political activities started earlier than Niethammer's: he joined the NSDAP (German: Nationalsozialistische Deutsche Arbeiterpartei = The National Socialist German Workers' Party) in 1925 and started the first student union of the party in 1933. Kummerlöwe soon became a member of the Ahnenerbe, a think tank meant to provide "scientific" support to the party's ideas on race and history. In 1939 he wrote an anthropological study on "Polish prisoners of war". At the museum in Dresden that he headed, he thought up ideas to promote politics and he published them in 1939 and 1940. After the war, it is thought that he went to all the European libraries that held copies of this journal and purged them of his article. (Anyone interested in hunting it should look for copies of Abhandlungen und Berichte aus den Staatlichen Museen für Tierkunde und Völkerkunde in Dresden 20:1-15.) According to Nowak, he even managed to get his hands (and scissors) on copies held in Moscow and Leningrad!

The Dresden museum was also home to the German ornithologist Adolf Bernhard Meyer (1840–1911). In 1858, he translated the works of Charles Darwin and Alfred Russel Wallace into German and introduced evolutionary theory to a whole generation of German scientists. Among Meyer's amazing works is a series of avian osteological works which uses photography and depicts birds in nearly-life-like positions (wonder how it was done!) - a less artistic precursor to Katrina van Grouw's 2012 book The Unfeathered Bird. Meyer's skeleton images can be found here. In 1904 Meyer was eased out of the Dresden museum because of rising anti-semitism. Meyer does not find a place in Nowak's book.

Nowak's book includes entries on the following scientists: (I keep this here partly for my reference as I intend to improve Wikipedia entries on several of them as and when time and resources permit. Would be amazing if others could pitch in!).
In the first of his "recollection papers" (his 1998 article) Nowak writes about the reason for writing them - noticing that the obituary for Prof. Ernst Schäfer was a whitewash that carefully avoided any mention of his wartime activities. And this brings us to India. In a recent article in Indian Birds, Sylke Frahnert and coauthors have written about the bird collections from Sikkim in the Berlin natural history museum. In their article there is a brief statement that "The collection in Berlin has remained almost unknown due to the political circumstances of the expedition". This might be a bit cryptic for many but the best read on the topic is Himmler's Crusade: The true story of the 1939 Nazi expedition into Tibet (2009) by Christopher Hale. Hale writes:
He [Himmler] revered the ancient cultures of India and the East, or at least his own weird vision of them.
These were not private enthusiasms, and they were certainly not harmless. Cranky pseudoscience nourished Himmler’s own murderous convictions about race and inspired ways of convincing others...
Himmler regarded himself not as the fantasist he was but as a patron of science. He believed that most conventional wisdom was bogus and that his power gave him a unique opportunity to promulgate new thinking. He founded the Ahnenerbe specifically to advance the study of the Aryan (or Nordic or Indo-German) race and its origins
From there Hale goes on to examine the motivations of Schäfer and his team. He looks at how much of the science was politically driven. Swastika signs dominate some of the photos from the expedition - as if it provided for a natural tie with Buddhism in Tibet. It seems that Himmler gave Schäfer the opportunity to rise within the political hierarchy. The team that went to Sikkim included Bruno Beger. Beger was a physical anthropologist but with less than innocent motivations although that would be much harder to ascribe to the team's other pursuits like botany and ornithology. One of the results from the expedition was a film made by the entomologist of the group, Ernst Krause - Geheimnis Tibet - or secret Tibet - a copy of this 1 hour and 40 minute film is on YouTube. At around 26 minutes, you can see Bruno Beger creating face casts - first as a negative in Plaster of Paris from which a positive copy was made using resin. Hale talks about how one of the Tibetans put into a cast with just straws to breathe from went into an epileptic seizure from the claustrophobia and fear induced. The real horror however is revealed when Hale quotes a May 1943 letter from an SS officer to Beger - ‘What exactly is happening with the Jewish heads? They are lying around and taking up valuable space . . . In my opinion, the most reasonable course of action is to send them to Strasbourg . . .’ Apparently Beger had to select some prisoners from Auschwitz who appeared to have Asiatic features. Hale shows that Beger knew the fate of his selection - they were gassed for research conducted by Beger and August Hirt.
SS-Sturmbannführer Schäfer at the head of the table in Lhasa

In all, Hale makes a clear case that the Schäfer mission had quite a bit of political activity underneath. We find that Sven Hedin (Schäfer was a big fan of him in his youth. Hedin was a Nazi sympathizer who funded and supported the mission) was in contact with fellow Nazi supporter Erica Schneider-Filchner and her father Wilhelm Filchner in India, both of whom were interned later at Satara, while Bruno Beger made contact with Subhash Chandra Bose more than once. [Two of the pictures from the Bundesarchiv show a certain Bhattacharya - who appears to be a chemist working on snake venom at the Calcutta snake park - one wonders if he is Abhinash Bhattacharya.]

My review of Nowak's book must be uniquely flawed as I have never managed to access it beyond some online snippets and English reviews. The war had impacts on the entire region and Nowak's coverage is limited; there were many other interesting characters, including the Russian ornithologist Malchevsky, who survived German bullets thanks to a fat bird observation notebook in his pocket! In the 1950s Trofim Lysenko, the crank scientist who controlled science in the USSR, sought Malchevsky's help in proving his own pet theories - one of which was the idea that cuckoos were the result of feeding hairy caterpillars to young warblers!

Issues arising from race and perceptions are of course not restricted to this period or region. One of the less glorious stories of the Smithsonian Institution concerns the honorary curator Robert Wilson Shufeldt (1850 – 1934), who in the infamous Audubon affair made his personal troubles with his second wife, a grand-daughter of Audubon, into one of race. He also wrote such books as America's Greatest Problem: The Negro (1915) in which we learn of the ideas of other scientists of the period like Edward Drinker Cope! Like many other obituaries, Shufeldt's is a classic whitewash.

Even as recently as 2015, the University of Salzburg withdrew an honorary doctorate that it had given to the Nobel prize winning Konrad Lorenz because of his support for the political setup and racial beliefs of the time. It should not be that hard for scientists to figure out whether they are on the wrong side of history even if they are funded by the state. Perhaps salaried scientists in India would do well to look at the legal contracts they sign with their employers, especially the state, more carefully. The current rules make government employees less free than ordinary citizens, but will the educated speak out or do they prefer shackling themselves?

Postscripts:
  • Mixing natural history with war sometimes led to tragedy for the participants as well. In the case of Dr Manfred Oberdörffer who used his cover as an expert on leprosy to visit the borders of Afghanistan with entomologist Fred Hermann Brandt (1908–1994), an exchange of gunfire with British forces killed him although Brandt lived on to tell the tale.
  • Apparently Himmler's entanglement with ornithology also led him to dream up "Storchbein Propaganda" - a plan to send pamphlets to the Boers in South Africa via migrating storks! The German ornithologist Ernst Schüz quietly (and safely) pointed out the inefficiency of it purely on the statistics of recoveries!

Tech News issue #22, 2020 (May 25, 2020)

00:00, Monday, 25 2020 May UTC

weeklyOSM 513

10:33, Sunday, 24 2020 May UTC

12/05/2020-18/05/2020

Tracking changes on magOSM 1 | © Magellium, OpenLayers | map data © OpenStreetMap contributors

Mapping

  • Mapillary images have been used by the Ukrainian community to map over a thousand speed bumps in Kiev.
  • muramoto tweeted screenshots of two tools. Street-level POI Viewer displays POIs from OSM and Wikipedia over Mapillary images. He also pointed to another tool that allows the calculation of angles and distances, also based on Mapillary and an OSM basemap. You can use the two values to determine heights with the online calculator provided. The project files are available on GitHub.
  • Pascal Neis drew attention to the high share of paid mappers. He specifically mentioned India, where 8 of the top 10 mappers are working for Facebook.
  • Ty S wants to mark areas that have dangerous dogs and has created a proposal for dog_warning=*.
  • User SteveA wanted to use boundary=administrative for a range of local government entities in Connecticut. This led to long and involved discussions on both the talk-us and tagging mailing lists as to what qualifies as an administrative boundary.
  • Bob Gambrel asked on the talk-us list for advice about mapping snowmobile trails.
  • NetWormKido invites (ru) (automatic translation) everyone to join his initiative on drawing roads to villages in the Privolzhskiy Federal District of Russia, which are currently not connected in OSM to the rest of the world. Task on MapRoulette.
  • User b-unicycling is interested in field names in Ireland. As part of a local archaeological society activity in Kilkenny they have been collecting field names using Field Papers. They have also published a umap showing all existing places in Ireland where the names of individual fields have been added to OSM.
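
The height calculation mentioned in the item about muramoto's tools comes down to basic trigonometry. The following is a minimal sketch, assuming the measured angle is the elevation angle from a level camera to the top of the object and the distance is the horizontal distance to its base. The function name and the camera-height parameter are hypothetical and are not taken from the tool or its online calculator.

```python
import math


def estimate_height(distance_m, elevation_deg, camera_height_m=0.0):
    """Estimate an object's height from horizontal distance and elevation angle.

    height = camera height + distance * tan(elevation angle)
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if not -90.0 < elevation_deg < 90.0:
        raise ValueError("elevation angle must be strictly between -90 and 90 degrees")
    return camera_height_m + distance_m * math.tan(math.radians(elevation_deg))
```

For instance, an object seen at a 45 degree elevation from 10 m away is roughly 10 m taller than the camera position.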

Community

  • fr1 (a user from Russia) conducted (ru) (automatic translation) an experiment. He simultaneously recorded GPS tracks with a regular smartphone and with one of the new generation which uses a two-frequency GPS receiver.
  • The podcast Nodes and Ways published its 3rd episode. This episode features Ciarán Staunton, who is speaking about mapping in Ireland, particularly the #osmIRL_buildings campaign.
  • The COVID-19 pandemic has forced humanity to take a pause in its activities. The Côte d’Azur University, in collaboration with CartONG, invites (fr) (automatic translation) the inhabitants of the planet to add to a map of natural phenomena and solidarity actions that have arisen in this period. Contributors are asked to remember the best too. The project Open map of the global pause was born. Add a photo to it!
  • Rovastar pointed out that daily mapper numbers reached a new peak of 6999 on 12 May. The 7000 barrier was broken two days later for another record high of 7209.
  • Sergey Astahov reflected (ru) (automatic translation), in his diary on GPS receivers, on the movement of lithospheric plates and how this affects OSM.
  • OSM Kosova, in collaboration with FLOSSK, posted on Facebook about a series of virtual workshops held during the past two months about OpenStreetMap and Wikidata with local high school students. They were introduced to the projects and taught how to edit them properly.
  • Valeriy Trubin continues his series of interviews with OSMers. This time he spoke with Dmitry Lebedev (ru) (automatic translation) about using OSM for research and Darafei Praliaskouski (ru) (automatic translation) about the work of the OSM Foundation.

OpenStreetMap Foundation

  • Christoph Hormann (imagico) provided a statistical overview of applications for the OSMF microgrants programme.

Events

  • This year’s AGIT, an Austrian yearly conference and trade fair about geoinformation, will take place (de) (automatic translation) virtually from 6 to 10 July 2020. It is still to be decided if OSM and OSGeo will be featured.
  • The Transatlantic Council of Boy Scouts of America organised a five-day Virtual Mapathon Challenge, allowing Sea Scouts to complete community service requirements by completing tasks on the HOT Tasking Manager.
  • Geomob events organised by OpenCage and Mappery have so far taken place in London, Munich and Barcelona. Since COVID-19 the talks, which always have a geographical background, have taken place on the Internet. Commercial and non-commercial, open and closed source speakers report on their work. The next online conference will take place on 10 June. All are welcome to attend, but places are limited to 100 people, on a first-come, first-served basis. Details of how to sign up, and other news, can be found in the monthly newsletter forthcoming in early June.

Maps

  • Julien Minet introduced OpenArdenneMap, a CartoCSS style optimised for several scales of topographic maps. The style is available on GitHub.
  • Jochen Topf’s recent release of the osm2pgsql flex backend has been discussed generally by Adrien Pavie, and very specifically by Styxman, who is interested in rendering bus routes.
  • The University of Heidelberg has stopped operating its tile server Mapsurfer.NET due to organisational difficulties.

switch2OSM

  • The tourist portal (ru) of the Republic of Mordovia (region in Russia) uses OSM as a basemap. Unfortunately, the site does not attribute OSM properly.
  • A team of Russian urbanists has started (ru) a public GIS project (ru) on analysing public transport routes. So far in Moscow (automatic translation) only, but OSM is the basis of their project. At the moment they are also raising (ru) (automatic translation) funds for further development of the project.

Software

  • [1] The French company Magellium announced (automatic translation) on talk-fr a new ‘Tracking changes’ (fr) web portal for the magOSM project. About twenty themes are available, covering metropolitan France for the past 30 days. On the database side it uses PGSQL triggers on osm2pgsql tables to detect and store changes before analysing them. The source code is published under a free licence.
  • We have written earlier about the open source program OpenDroneMap that can be used to assemble orthophotomaps. This article (ru) (automatic translation) explains how to make the app work.
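
The trigger-based change tracking described for magOSM can be illustrated with a small, self-contained sketch. This is not Magellium's code: it swaps PostgreSQL for Python's built-in SQLite so that it runs anywhere, and the table names `osm_point` and `change_log`, their columns, and the function names are all hypothetical. It only shows the general idea of triggers that record every insert, update and delete for later analysis.

```python
import sqlite3


def build_change_tracking_db():
    """Create an in-memory database whose triggers log every change to osm_point."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE osm_point (osm_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE change_log (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            osm_id INTEGER, action TEXT, old_name TEXT, new_name TEXT
        );
        CREATE TRIGGER log_insert AFTER INSERT ON osm_point BEGIN
            INSERT INTO change_log (osm_id, action, old_name, new_name)
            VALUES (NEW.osm_id, 'insert', NULL, NEW.name);
        END;
        CREATE TRIGGER log_update AFTER UPDATE ON osm_point BEGIN
            INSERT INTO change_log (osm_id, action, old_name, new_name)
            VALUES (NEW.osm_id, 'update', OLD.name, NEW.name);
        END;
        CREATE TRIGGER log_delete AFTER DELETE ON osm_point BEGIN
            INSERT INTO change_log (osm_id, action, old_name, new_name)
            VALUES (OLD.osm_id, 'delete', OLD.name, NULL);
        END;
        """
    )
    return conn


def recorded_changes(conn):
    """Return the logged changes in the order they happened."""
    return conn.execute(
        "SELECT osm_id, action, old_name, new_name FROM change_log ORDER BY id"
    ).fetchall()
```

In a real PostgreSQL setup the same pattern would use trigger functions on the tables that osm2pgsql maintains; the details of that schema are not given in the announcement.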

Programming

  • In a blogpost Mikel Maron, lead of the Mapbox Community team, co-founder of HOT and OSMF board member, published an interview with the Tasking Manager’s lead developer, Felix Delattre, on technical details and other background information of the new version of HOT’s widely used tool.
  • In March, Paul Norman reported to the QGIS developers that a particular feature (XYZ tile backgrounds) consumes far more slippy map tiles than necessary. QGIS now represents 5% of all tile requests on the main OSM servers. QGIS developer elpaso has filed a pull request with a fix which should be included in the imminent release of QGIS 3.14. The fix also provides backport patches for two earlier versions: 3.10 and 3.12.
  • OpenMapTiles provided an update on recent developments (with the slightly misleading title of ‘The Future of OpenMapTiles Project’) with their software stack. A significant change is moving away from using MapnikVT to a native PostGIS function, ST_AsMVT, which both simplifies the stack and improves performance. They now also run continuous integration tests on the tile output after each code change is integrated.

Releases

  • Translators from Latin America produced a Spanish version of ‘Mapping routes‘, Trufi Association documentation on how to map informal bus routes.
  • HeiGIT, the Heidelberg University’s GIScience Research Group, announced the release of version 1.0 of its API for the history analysis platform for OpenStreetMap called ohsome. The ohsome project aims to make OSM data from the full history of edits more easily accessible.
  • Trail Router, a service which helps users to find new running routes, improved its feature to avoid hills. A blog post details changes that have been made to improve the sensitivity of the option to avoid hills, and a new feature to avoid hills when getting multiple suggestions.
  • Martijn van Exel has fixed his map OSM Then And Now, which compares OSM in early October 2007 with today.

Did you know …

  • … how to map permanent orienteering course markers? A Twitter conversation between Gregory Marler and Ollie O’Brien, orienteer and maintainer of OpenOrienteering Map, provides some useful hints.

OSM in the media

  • The online newspaper New Indian Express reports on how over a thousand volunteer students have been adding to OpenStreetMap through the Mapathon Keralam initiative of the Kerala State IT Mission.

Other “geo” things

  • In the small village of Quiliano (northern Italy) local police had to install (it) (automatic translation) road signs to warn truckers not to follow route instructions from Google Maps, because trucks often get stuck or cause traffic congestion in narrower streets.
  • Freedom of information requests have revealed the official terminology for many parts of bus stops in London. Tim Dunn summarises the key points visually on Twitter.
  • Peter Rushforth informed us about the re-opened call for positions or presentations for the W3C – OGC online workshop on standardising maps. The event is planned for the week from 21 September to 2 October 2020 and will be held in a format that allows global participation.
  • The website IanVisits features an article about a map of London street trees. The TreeTalk map helps you to answer the question ‘What kind of tree is that?’ It is not obvious from the map, but the data comes from the Greater London Datastore which published open data on street trees back in 2016.
  • The Guardian interviewed the Slovakian graphic designer Martin Vargic, who has created nice fictional maps including among others: ‘Britannia Under the Waves‘, ‘Map of Literature‘, ‘Map of Festivals‘, and ‘Map of Common Foods‘.
  • The Guardian presents five of the best online map apps.
  • Ride with GPS announced that Garmin developed Varia to create a safer cycling environment. Varia is a first-of-its-kind rearview bike radar and smart bike light system that warns cyclists of vehicles approaching from behind, while also alerting approaching vehicles of a cyclist ahead. Ride with GPS users now have the ability to connect these Garmin units with their Ride with GPS mobile apps.
  • Russian mobile operator Beeline has launched (automatic translation) a geoplatform ‘Save the bees’ (ru). With this platform they want to introduce (ru) landowners and bee keepers to each other so they can exchange information. This would help to prevent the death of bees from chemicals used in fields.
  • More than three thousand new petrol (gas) stations were added (ru) (automatic translation) to Yandex.Zapravki, a service which allows you to pay for your fuel without leaving your car.

Upcoming Events

Where | What | When | Country
Düsseldorf | Düsseldorfer OSM-Stammtisch | 2020-05-27 | Germany
Biella | Incontro mensile | 2020-05-30 | Italy
London | Missing Maps ONLINE London Mapathon | 2020-06-02 | United Kingdom
Stuttgart | Stuttgarter Stammtisch | 2020-06-03 | Germany
Arlon | Atelier ouvert OpenStreetMap | 2020-06-03 | Belgium
Rennes | Réunion mensuelle | 2020-06-08 | France
Taipei | OSM x Wikidata #17 | 2020-06-08 | Taiwan
Lyon | Rencontre mensuelle | 2020-06-09 | France
Munich | Münchner Treffen | 2020-06-11 | Germany
Zurich | 117. OSM Meetup Zurich | 2020-06-11 | Switzerland
Berlin | 144. Berlin-Brandenburg Stammtisch | 2020-06-12 | Germany
Cape Town | HOT Summit | 2020-07-01 to 2020-07-02 | South Africa
Kandy | 2020 State of the Map Asia | 2020-10-31 to 2020-11-01 | Sri Lanka

Note: If you would like to see your event here, please put it into the calendar. Only data that is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by AnisKoutsi, NunoMASAzevedo, PierZen, Polyglot, Rogehm, SK53, Silka123, SunCobalt, TheSwavu, YoViajo, derFred.

Today, the Wikimedia Foundation Board of Trustees voted to ratify new trust and safety standards for Wikipedia and all other Wikimedia projects. The standards, as outlined in a new Community Culture Statement, provide direction and priority to address harassment and incivility within the Wikimedia movement and create welcoming, inclusive, harassment-free spaces in which people can contribute productively and debate constructively.

Specifically, the Board has tasked the Foundation with:

  • Developing and introducing, in close consultation with volunteer contributor communities, a universal code of conduct that will be a binding minimum set of standards across all Wikimedia projects;
  • Taking actions to ban, sanction, or otherwise limit the access of Wikimedia movement participants who do not comply with these policies and the Terms of Use;
  • Working with community functionaries to create and refine a retroactive review process for cases brought by involved parties, excluding those cases which pose legal or other severe risks; and
  • Significantly increasing support for and collaboration with community functionaries primarily enforcing such compliance in a way that prioritizes the personal safety of these functionaries.

 

The Board’s statement formalizes years of longstanding efforts by individual volunteers, Wikimedia affiliates, Foundation staff, and others to stop harassment and promote inclusivity on Wikimedia projects.

Please see the Board’s Community Culture Statement below and on Meta-Wiki.

Statement on Healthy Community Culture, Inclusivity, and Safe Spaces

Harassment, toxic behavior, and incivility in the Wikimedia movement are contrary to our shared values and detrimental to our vision and mission. They negatively impact our ability to collect, share, and disseminate free knowledge, harm the immediate well-being of individual Wikimedians, and threaten the long-term health and success of the Wikimedia projects. The Board does not believe we have made enough progress toward creating welcoming, inclusive, harassment-free spaces in which people can contribute productively and debate constructively.

In recognition of the urgency of these issues, the Board is directing the Wikimedia Foundation to directly improve the situation in collaboration with our communities. This should include developing sustainable practices and tools that eliminate harassment, toxicity, and incivility, promote inclusivity, cultivate respectful discourse, reduce harms to participants, protect the projects from disinformation and bad actors, and promote trust in our projects.

Specifically, the Foundation shall:

  • Develop and introduce a universal code of conduct (UCoC) that will be a binding minimum set of standards across all Wikimedia projects.
    • The first phase, covering policies for in-person and virtual events, technical spaces, and all Wikimedia projects and wikis, and developed in collaboration with the international Wikimedia communities, will be presented to the Board for ratification by August 30, 2020.
    • The second phase, outlining clear enforcement pathways, and refined with broad input from the Wikimedia communities, will be presented to the Board for ratification by the end of 2020;
  • Take actions to ban, sanction, or otherwise limit the access of Wikimedia movement participants who do not comply with these policies and the Terms of Use;
  • Work with community functionaries to create and refine a retroactive review process for cases brought by involved parties, excluding those cases which pose legal or other severe risks; and
  • Significantly increase support for and collaboration with community functionaries primarily enforcing such compliance in a way that prioritizes the personal safety of these functionaries.

 

Until such directives are implemented, the Board instructs the Foundation to adopt and implement policies for reducing harassment and toxicity on our projects and minimizing legal risks for the movement, in collaboration with communities whenever practicable. Until these two phases of the UCoC are complete and operational an interim review process involving community functionaries will be in effect. In this interim period, the Product Committee of the Board of Trustees will also advise the Trust & Safety team.

To that end, the Board further directs the Foundation, in collaboration with the communities, to make additional investments in Trust & Safety capacity, including but not limited to: development of tools needed to assist our volunteers and staff, research to support data-informed decisions, development of clear metrics to measure success, development of training tools and materials (including building communities’ capacities around harassment awareness and conflict resolution), and consultations with international experts on harassment, community health and children’s rights, as well as additional hiring.

The above efforts will be undertaken in coordination and collaboration with appropriate partners from across the movement, seek to increase effective community governance of conduct and behavioral standards, and reduce the long-term need of the Foundation to act. It is the shared goal of the Board and Foundation that these efforts advance a sustainable Wikimedia movement and support, rather than substitute, effective models of community governance.

We urge every member of the Wikimedia communities to collaborate in a way that models the Wikimedia values of openness and inclusivity, step forward to do their part to create a safe and welcoming culture for all, stop hostile and toxic behavior, support people who have been targeted by such behavior, assist good-faith people learning to contribute, and help set clear expectations for all contributors.

The Depicts

17:00, Friday, 22 2020 May UTC

So Structured Data on Commons (SDC) has been going for a while. Time to reap some benefits!

Besides free-text image descriptions, the first, and likely most used, element one can add to a picture via SDC is “depicts”. This can be one or several Wikidata items which are visible (prominently or as background) on the image. Many people have done so, manually or via JavaScript- or Toolforge-based mass editing tools.

This is all well and good, but what to do with that data? It can be searched for, if you know the magic incantation for the search engine, but that’s pretty much it for now. A SPARQL query engine would be insanely useful for more complex queries, especially if it would work seamlessly with the Wikidata one, but no usable, up-to-date one is in sight so far.

Inspired by a tweet by Hay, and with some help from Maarten Dammers, I found a way to use SDC “depicts” information in my File Candidates tool. It suggests files that might be useful to add to specific Wikidata items.

Now, since proper SDC support is … let’s say incomplete at the moment, I had to go a bit off the beaten path. First, I use the “random” sort in the Commons API search for files with a “depicts” statement. That way, I get 50 such files with one query. Then, I use the wikibase API on Commons to get the structured data for these files. The structured data contains the information about which Wikidata item(s) each file depicts.
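
The two API steps can be sketched roughly as follows. This is not the tool's actual code (that is linked further down); it is a minimal illustration that assumes the `haswbstatement:P180` search keyword, the `srsort=random` option, MediaInfo IDs of the form `M` plus page ID, and the response layout handled in `depicted_items`. All function names are made up for this example.

```python
import json
import urllib.parse
import urllib.request

COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def search_params(limit=50):
    """Parameters for a random-order search of files with a depicts (P180) statement."""
    return {
        "action": "query",
        "list": "search",
        "srsearch": "haswbstatement:P180",  # assumed keyword for "has a depicts statement"
        "srnamespace": "6",                 # File: namespace
        "srsort": "random",
        "srlimit": str(limit),
        "format": "json",
    }


def entity_params(page_ids):
    """Parameters for fetching structured data; assumes MediaInfo ID = 'M' + page ID."""
    return {
        "action": "wbgetentities",
        "ids": "|".join(f"M{page_id}" for page_id in page_ids),
        "format": "json",
    }


def fetch_json(params):
    """Perform one GET request against the Commons API (needs network access)."""
    url = COMMONS_API + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "depicts-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def depicted_items(entities_response):
    """Map each MediaInfo ID to the Wikidata item IDs in its P180 statements."""
    result = {}
    for media_id, entity in entities_response.get("entities", {}).items():
        statements = entity.get("statements")
        if not isinstance(statements, dict):  # files without statements may carry an empty list
            statements = {}
        items = []
        for claim in statements.get("P180", []):
            snak = claim.get("mainsnak", {})
            if snak.get("snaktype") == "value":
                items.append(snak["datavalue"]["value"]["id"])
        result[media_id] = items
    return result
```

A full round trip would call `fetch_json(search_params())`, collect the `pageid` of each search hit, and pass the result of `fetch_json(entity_params(page_ids))` to `depicted_items`.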

Armed with these Wikidata item IDs, I use the database replicas on Toolforge to retrieve the subset of items that (a) have no image (P18), (b) have P31 “instance of”, (c) have no P279 “subclass of”, and (d) do not link to any of a number of “unsuitable” items (e.g. templates or given names). For that subset, I get the files the items already use, e.g. as a logo image (so as not to suggest them for the same item), and then I add an entry to the database that says “this item might use this image”, according to the depicts statements in the respective image (Code is here, in case you are interested).
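
The selection rules (a) to (d) and the "already used" check can be expressed as a small in-memory filter. The real tool runs this against the Toolforge database replicas; the sketch below only mirrors the logic on plain Python data, and the `ItemFacts` class, the function names and the shape of the inputs are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ItemFacts:
    """What we know about one Wikidata item (hypothetical structure)."""
    item_id: str
    properties: set = field(default_factory=set)    # property IDs used on the item
    linked_items: set = field(default_factory=set)  # item IDs this item links to


def is_candidate(item, unsuitable):
    """Apply rules (a)-(d): no P18, has P31, no P279, no link to unsuitable items."""
    return (
        "P18" not in item.properties
        and "P31" in item.properties
        and "P279" not in item.properties
        and item.linked_items.isdisjoint(unsuitable)
    )


def propose_candidates(depicts, facts, files_in_use, unsuitable):
    """Return (item_id, file_name) pairs worth suggesting.

    depicts:      file name -> list of depicted item IDs
    facts:        item ID -> ItemFacts
    files_in_use: item ID -> set of file names the item already uses (e.g. a logo)
    unsuitable:   set of item IDs that disqualify any item linking to them
    """
    candidates = []
    for file_name, item_ids in depicts.items():
        for item_id in item_ids:
            item = facts.get(item_id)
            if item is None or not is_candidate(item, unsuitable):
                continue
            if file_name in files_in_use.get(item_id, set()):
                continue  # the item already uses this file in another role
            candidates.append((item_id, file_name))
    return candidates
```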

50 files (a restriction imposed by the Commons API) are not much, especially since many images with depicts statements probably are used as an image on the respective Wikidata item. So I do keep running such random requests in the background and collect them for the File Candidates tool. At the time of writing, over 12k such candidates exist.

Happy image matching, and don’t forget to check out the other candidate image groups in the tool (including potentially useful free images from Flickr!).

"Semantic-MediaWiki.org" got a new look

13:00, Friday, 22 2020 May UTC

April 14, 2018

"Semantic-MediaWiki.org", the home of Semantic MediaWiki, got a new look. The new skin, based on Chameleon, finally frees the website from the standard wiki appearance and aims to give it a more professional look and a better user experience. Thanks go to Stephan Gambke, Iván Hernández and Karsten Hoffmeyer for working on this.

“Guadalupe”

11:26, Friday, 22 2020 May UTC

Here’s a story of how I tried to remove a fake story marginally related to COVID-19 from Wikipedia, and, at least for now, achieved the opposite and contributed to its dissemination and perpetuation.

On a BBC-produced podcast (in Russian) I heard a story about Lupe Hernández, a nurse who allegedly invented hand sanitizer. The story was born in a 2012 Guardian article, which was subsequently quoted by viral Facebook posts and by news sites in Spanish and several other languages, and was even mentioned in an academic nursing book published by Springer. In the last few months hand sanitizer has become more popular than ever, and so the story regained popularity.

When contacted for confirmation, the original Guardian story’s author said that “she couldn’t remember the source, and that her notebooks are in storage facility she currently can’t get to”.

The podcast, as well as a thorough LA Times article, conclude that the whole story is probably an urban legend and that the person probably never existed. No one was even sure whether it’s a woman or a man, even though the original story said “she”.

The podcast did mention that there is a very short Wikipedia article. I proposed it for deletion. The result of the deletion discussion was that the article was kept and renamed to “Lupe Hernández hand sanitizer legend”.

Before it was renamed in the English Wikipedia to be an article about a legend, it was also translated to Spanish and French, as an article about “Guadalupe Hernández”, a female nurse who invented hand sanitizer, even though zero sources say that her name was actually “Guadalupe”. Sure, you can assume that “Lupe” is short for “Guadalupe”, as some imaginative writers did, but why do we do it on Wikimedia sites?

I’m still of the firm opinion that the subject should be completely removed from Wikipedia in all languages, as well as from Wikidata, but there’s only so much I can do about this. If any of you know French or Spanish, can you please make sure the articles in your languages are not too awful, or perhaps consider proposing them for deletion?

And if you think I’m badly wrong about it all, please do tell me, too.

By now dozens of women have stepped into open source via Outreach Program for Women, a paid internship program administered by the GNOME Foundation. I recently asked several of them whether they had been able to transition from intern to volunteer.*

Are you succeeding at continuing to volunteer in your open source project? Or are you running into trouble? I'd love to know how people are doing and whether y'all need help.

When you were an OPW intern, you had a mentor and you had committed to a specific project for three months. Volunteering is freer -- you can change your focus every week if you want -- but the training wheels are gone and you have to steer yourself.

(I bet Google Summer of Code alumni have similar experiences.)

I got several answers, and in them I saw some common problems to which I suggest solutions.

  1. Problem: seems as though there are no more specific tasks to do within your project. Solutions: ask your old mentor what they might like you to do next. If they don't respond within 3 days, repeat your question to the mailing list for your open source project. Or switch to another open source project, maybe one your friends are working on!
  2. Problem: finding the time. Solutions: set aside a weekly appointment, just as you might with a therapist or an exercise class. Pair up with someone else from the OPW alum list and set yourself a task to complete during a one-hour online sprint! Or if you know your time is being eaten up by your new job, set yourself a reminder for 3 months from now to check whether you have more free time in December.
  3. Problem: loneliness. Solutions: talk more in the #opw chat channel on GNOME's IRC (irc.gnome.org). Use http://www.pairprogramwith.me/ and http://lanyrd.com/ and https://lwn.net/Calendar/ to find get-togethers in your area, or launch one using http://hackdaymanifesto.com/ and http://meetup.com/.
  4. Problem: motivation. Solutions: consider the effects you're having in the world. Or focus on the bits of work you enjoy for their own sake, whatever those are. Or teach others the things you know, and see the light spark in their eyes.

These are tips for the graduating interns themselves; it would be good for someone, maybe me, to also write a list of tips for the organizers and mentors to nurture continued participation.


* OPW also provides a list of paid opportunities for alumni.

The Persistence of Poverty, which is today’s episode of NPR’s famous “Indicator” podcast, made me think of how small things that happened long ago in the history of Wikipedia and other Wikimedia wiki sites still affect us, for better or worse.

Here are some examples.

Example one: People didn’t want to have full copies of historical documents on the English Wikipedia, because they are not encyclopedic articles. So they created a whole separate wiki for it, called “Primary Sources Wikipedia”: “ps.wikipedia.org”. It turned out that this would be the URL for the Wikipedia in the Pashto language, which has the ISO 639 code “ps”, so it was renamed to Wikisource, becoming Wikimedia’s first non-Wikipedia wiki. The movement wasn’t even called “Wikimedia” then; the organization was created later. Later, Wiktionary, Wikibooks, Wikimedia Commons, and other projects joined. Wikisource and all of these other projects are awesome, but a side effect is that the Foundation and the community now have to hold some challenging discussions about how non-Wikipedia wikis should be branded in the long term.

Example two: A French Wikipedia editor who was curious about Ancient Egypt wanted to insert Egyptian hieroglyphics into Wikipedia articles, and he happened to know some PHP, so he wrote the Wikihiero extension, which is installed on all the wikis. Because it’s an extension that adds its own wiki syntax, Visual Editor shows a button to insert hieroglyphics on every page, including the page about Astronomy on Wikiversity, which doesn’t have much to do with Ancient Egypt. This is not bad; it is mostly very good. What is bad is that Visual Editor doesn’t have a button to insert infoboxes or “citation needed” tags, even though they are far more common than hieroglyphics, because they are implemented as templates and not in PHP, and Visual Editor handles all templates as one generic type of object. (If you are wondering how this can get fixed, the first necessary step in that direction is described on the page Global templates on mediawiki.org.)

Example three: Some people didn’t like that too many wikis were created in new languages and then stayed inactive, so they wanted a proper way to prove that people plan to be active editors. So they created the “Incubator” wiki, where people would show they are serious by writing the first bunch of articles. For various technical reasons, using it was more difficult than using a usual Wikipedia, but they probably quietly assumed that everybody who wants to create a Wikipedia in a new language is experienced in editing Wikipedia in English or Italian or some other big language, so almost no one ever bothered to improve it. By now, we know that that assumption was tragically wrong: most people who want to create a Wikipedia in a new language are not experienced in editing in other languages, so they are the newest and least experienced editors, but they get the most complicated user interface. (If you are wondering how this can get fixed, see this page on Phabricator.)

Yes, I’m oversimplifying all of these stories for brevity. And I’m not implying any malice or negligence in any of the cases here. These were good people with good intentions, who made assumptions that were reasonable for the time.

It’s just a shame that the problems they created are proving more difficult to fix as time goes by.

Most could tell you the significance of Hiroshima and Nagasaki: the first usage of nuclear weapons in warfare. But many would be surprised to learn that the US continued to drop nuclear bombs on islands of the Pacific, long after World War II was finished. Students in the Japanese Environmental History class taught by Dr. Elyssa Faison at the University of Oklahoma collaborated to enhance Wikipedia’s coverage of one such incident. In 1954, twenty-three Japanese sailors set out to catch tuna. While their ship, the Daigo Fukuryū Maru (the Lucky Dragon No. 5), was near the Marshall Islands, the sky started glowing in the west, and ash fell like snow from the sky. Unbeknownst to the sailors, and despite being outside of the US-declared “danger zone”, they had just been exposed to the radioactive fallout of a nuclear test, one that was more than twice as powerful as intended. The sailors immediately fell ill with radiation poisoning; one would later die from the exposure while the other twenty-two men were hospitalized for over a year. One sailor, Oishi Matashichi, had a stillborn child and later developed liver cancer, both of which he attributed to his radiation poisoning. The ship itself remained highly radioactive at first, with radiation detectable from one hundred feet away.

Daigo Fukuryū Maru, shortly before the 1954 nuclear incident. (Public domain)

Students made substantial revisions to Daigo Fukuryū Maru, adding detail about the health effects on the surviving fishermen and the response of the US government, which initially denied culpability and claimed that the fishermen were actually spies. The US eventually paid Japan more than 15 million dollars in reparations. The fate of the Daigo Fukuryū Maru was also added: initially purchased by the Japanese government, by 1970 the Lucky Dragon No. 5 was sitting in a garbage-filled canal. It was then pulled from the water and put on public display in Tokyo as a symbol of opposition to nuclear weapons. Students even created a brand new biography of survivor Oishi Matashichi, who went on to become an author advocating for nuclear disarmament and who attended a 2015 memorial service on the Marshall Islands for the victims of the nuclear testing at Bikini Atoll.

By writing this information into Wikipedia, the students have shared the story of the Lucky Dragon No. 5 with a global audience, helping thousands to understand the far-reaching ripples of nuclear testing in the Pacific.


Interested in incorporating a Wikipedia writing assignment into a future course? Visit teach.wikiedu.org for all you need to know to get started.