The best documentation automation can buy

14:24, Sunday, 22 March 2020 UTC

HEADER CAPTION: Screenshot from Wikimedia's famous Visual Editor. The typo "documenation" has a red squiggly line under it indicating the spell checker has automatically detected a spelling error by the author.

Tools for validating that JavaScript documentation is current and error-free have advanced significantly over the last several years. It is now possible to detect mismatches between a program's documentation and its source code automatically using a free and open-source, industry-standard type checker. This goes way beyond typos.

JavaScript typing is loose

JavaScript is a dynamically, loosely typed language. Unlike in a statically typed language, a JavaScript program will run regardless of whether the types in it are consistent. Some consider JavaScript's fast-and-loose style a feature, not a bug. Notable proponents of that viewpoint include Douglas Crockford and Paul Graham.

There have been numerous articles written on the subject, but I suspect that most readers already understand the value of clear typing. For any nontrivial program with multiple authors and any longevity, especially those likely to be found among the sprawling wikis, strong typing is much more practical and sustainable than the alternative. With good typing, one can quickly grasp the structure of a program. That is, you can conceptualize and interface with any well-typed API whether you understand how it works internally or not. Refactors are a lot easier too: while not quite fearless, refactoring a typed codebase is far less fraught than refactoring an untyped one. Type checks are also a great way to verify your work, just like in grade school.

Picture of a whiteboard showing the complete mathematical steps for unit conversion from 65 miles per hour to kilometers per hour.
CAPTION: 65 miles per hour is how many kilometers per hour? So long as the fractions are correct, we can validate the conversion by checking that the units cancel each other out. In type checking, our function parameters, function return types, and object properties must align in a similar way but the process is automated.

Many bugs could be caught before arriving in production if every patch had its typing validated, but don't take my word for it. Phan, the PHP type checker, is now a required validation test for any change to MediaWiki Core as well as many extensions. It's like a bunch of built-in unit tests specifically for types. Without automation, these tests can require thousands of lines of hand-written code that are tedious and time-consuming to author, read, and maintain (e.g., see the otherwise excellent Popups extension). In the worst cases, no tests are written at all.

Photograph of the interior of a pocket watch showing intricate gears and fine craftsmanship.
CAPTION: Types must align like clockwork or the machine stops running. Image by ElooKoN / CC BY-SA.

Documentation should be correct

Good typing is just as important in documentation. For JavaScript, documentation is largely written in JSDoc (or its deprecated competitor, JSDuck). Wikipedians seem to agree that documentation is a very good idea. If documentation is a good idea, correct and up-to-date documentation is an even better one. There's a tool for that: it's called TypeScript.

If you haven't heard of TypeScript yet, it may be because it's not very common at Wikimedia except for the uber-amazing work by the WMDE and Wikidata communities (e.g., see wikibase-termbox which is over 80% TypeScript) as well as explorations several years back by Joaquín Oltra Hernández. However, it is now immensely popular globally and has proven itself to be far more than just a fashionable trend from 2012.

So what is TypeScript exactly? TypeScript is JavaScript with types. Whether you choose to use it for functional code like WMDE or not, TypeScript can lint plain JavaScript files for the type correctness of their JSDocs. You don't need Webpack and you don't need to make any functional changes to your code (unless it's incorrect and out of sync with the documentation, i.e., bug fixes). Your JavaScript is the same as it ever was but now, if your documentation and program don't match, TypeScript will report an error.
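
For illustration, here is a minimal, hypothetical example of the kind of mismatch it catches (the function and its types are invented for this sketch, not taken from a real codebase):

/**
 * @param {number} pageId
 * @return {string} The anchor for the given page.
 */
function pageAnchor( pageId ) {
  // tsc reports something like: Type 'number' is not assignable to type 'string'.
  // The documentation promises a string but the code returns a number, so either
  // the JSDoc or the implementation needs a bug fix.
  return pageId;
}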

This isn't just better documentation, it's documentation as accurate as we can write in an automated way. Who doesn't want better documentation?

What changes are needed?

Typing at the seams. In practice, this usually means documenting function inputs and outputs, and user-defined types, using JSDoc syntax. E.g.:

JSDocs
/**
 * Template properties for a portlet.
 * @typedef {Object} PortletContext
 * @prop {string} portal-id Identifier for wrapper.
 * @prop {string} html-tooltip Tooltip message.
 * @prop {string} msg-label-id Aria identifier for label text.
 * @prop {string} [html-userlangattributes] Additional Element attributes.
 * @prop {string} msg-label Aria label text.
 * @prop {string} html-portal-content
 * @prop {string} [html-after-portal] HTML to inject after the portal.
 * @prop {string} [html-hook-vector-after-toolbox] Deprecated and used by the toolbox portal.
 */

/**
 * @param {PortletContext} data The properties to render the portlet template with.
 * @return {HTMLElement} The rendered portlet.
 */
function wrapPortlet( data ) {
  const node = document.createElement( 'div' );
  node.setAttribute( 'id', 'mw-panel' );
  node.innerHTML = mustache.render( portalTemplate, data );
  return node;
}

CAPTION: If this code was undocumented or the types inaccurate, would you always get the data properties right? Maybe you would, but what about everyone else?

Most programmers are already typing their JavaScript to some extent with JSDocs, so often only refinements are needed. In other cases, TypeScript's excellent type inference abilities can be leveraged so that no changes are required.

Type definitions are a useful supplement to JSDocs. Definitions are non-functional documentation that support type annotations inline. For example, the definition of the powerful but fantastically loose jQuery API could find marvelous utility in many Wikimedia codebases for at-your-fingertips documentation needs. Another very relevant example that ships with TypeScript itself is the DOM definition, which will alert you to misalignments such as attempting to access a classList on a Node instead of an Element. Thorough type checking is similar in spirit to, and the perfect complement of, ESLint's checks for ES5-only sources and its broader safety checks.
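
As a rough sketch of the kind of misalignment the DOM definition catches (the function and variable names here are invented):

/**
 * @param {Node} node
 */
function highlight( node ) {
  // tsc reports something like: Property 'classList' does not exist on type 'Node'.
  // Narrowing the JSDoc parameter type to {Element} (or checking with instanceof) fixes it.
  node.classList.add( 'highlighted' );
}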

Type definitions are also a convenient way to describe globals and, more generally, share types. Definitions are either shipped with the NPM package itself or published via DefinitelyTyped (e.g., npm i -D @types/jquery) and are now standard practice for most noteworthy JavaScript libraries. Imagine if this degree of accuracy could be achieved in some of our most well-used codebases. Integrations between skins, extensions, Core, and peripheral libraries would be validated for alignment. It would be harder to break things and a much more welcoming experience for newcomers.
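
For example, a small project-local definition file might describe a global in roughly this shape (the global and its members below are illustrative and deliberately incomplete, not an official typing):

// types.d.ts: non-functional documentation; nothing in this file is emitted or executed.
interface IllustrativeConfig {
  get( key: string ): unknown;
}

declare const exampleGlobal: {
  config: IllustrativeConfig;
};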

npm install jsdoc typescript tsconfig.json tsc Document!

The actual project setup for adding documentation checks to an existing repository is minimal and requires no functional changes:

  1. Add JSDoc and TypeScript as NPM development dependencies. Optionally: add any missing types for third-party libraries used.
  2. Add a tsconfig.json to tell TypeScript which JavaScript documentation to lint and validate (see the minimal sketch after this list).
  3. Add tsc to the project's NPM test script.
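
A minimal tsconfig.json for documentation-only checking might look roughly like this; the include path and strictness flags are assumptions to adapt per project:

{
  "compilerOptions": {
    "allowJs": true,
    "checkJs": true,
    "noEmit": true,
    "strict": true,
    "target": "es5"
  },
  "include": [ "resources/**/*.js" ]
}

With something like this in place, running tsc (step 3) exits non-zero whenever a JSDoc and its implementation disagree, which is exactly what a continuous integration job needs.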

The real work is in fleshing out the missing documentation with JSDocs. However, TypeScript is quite flexible about how one chooses to opt in or out of documentation validation. If code isn't worth documenting, it's probably not worth keeping, but typing can be consciously deferred in a number of ways. The most straightforward is probably with a // @ts-ignore comment. Think of it as progressively enhanced code.
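
As a sketch, deferring a single known mismatch looks like this; the checker skips only the line that follows the comment (legacyConfig here is an invented, untyped global):

// TODO: document legacyConfig properly instead of suppressing the check.
// @ts-ignore
const title = legacyConfig.pageTitle;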

An example project setup for Vector is here, which shows how typing and documentation can be retrofitted nicely even on codebases that predate TypeScript and make heavy use of sophisticated APIs like jQuery.

Editor support

It's unnecessary for setting up a project, but worth mentioning: by ensuring that even a machine can model your documentation, you also ensure that your code editor can understand it. For most editors, this means you'll get accurate, split-second documentation lookup and documentation type checking. Visual Studio Code has a superb out-of-box experience for JavaScript including documentation awareness and code completion, but other editors are supported too.

Screenshot from the Visual Studio Code integrated development environment showing the same red squigglies we saw before, but this time it's detecting a fatal error instead of a typo.
CAPTION: Errors are identified as you write. There's that typo again but this time it could be your next unbreak now or your next type checker error.

You would see similar output from a continuous integration job or the command line:

Animation of the same flawed code presenting an error on the command line when checked.
CAPTION: The command line output is just as informative.

And here are those excellent docs:

Screenshot of the same code editor presenting documentation in a highly integrated and useful manner.
CAPTION: Documentation is a mouse hover away. Coding with documentation at hand is a breeze and the expectation for many modern developers writing their first MediaWiki patches.

Conclusion

65 miles per hour is 104.60736 kilometers per hour. Language changes the way we think, and documentation is the encyclopedia of code. Tooling that improves our abilities to understand, reason, and express ourselves through language improves our ability to engineer.

In my own personal and professional development, I've found accurate documentation to be a great treasure that gives me confidence and efficiency in the code that I read and write. Maybe we should have the same hopes and expectations of our documentation that newcomers do. Maybe with better documentation—documentation that is as accurate as we can automate—some of Wikipedia's many JavaScript errors could be identified and eliminated as easily as changing units from mph to kph. Maybe with better documentation, we could write better software, faster. Software that users love using and developers enjoy writing. Let's get to work!

Photograph of a blank jigsaw puzzle where pieces differ only by shape.
CAPTION: Programs are like jigsaw puzzles where types are the shapes. Check assembly before shipping. Image by Muns and Schlurcher / CC BY-SA.

Thanks to Sam Smith and Joaquín Oltra Hernández for reviewing and providing feedback.

NOTE: Documentation on building better documentation is being written on wiki with the help of editors like you!

weeklyOSM 504

13:58, Sunday, 22 March 2020 UTC

10/03/2020-16/03/2020

lead picture

A new OpenStreetMap Indoor Viewer: indoor= 1 | © François2 | © map data OpenStreetMap contributors

Mapping

  • Joseph Eisenberg noticed that the usage of man_made=goods_conveyor for industrial conveyor belts and systems was proposed but never approved, although the tag has been used over 4,000 times to date. Therefore he suggests adding it to the Wiki pages for ‘Key:man_made‘ and ‘Map Features‘.
  • Jennings Anderson from the University of Colorado, Boulder published a paper about the role of corporate editors in OSM, such as Apple, Microsoft, and Facebook. The paper also investigates the influence they have, and where they are mapping.
  • jguthula, from Amazon, highlighted that ~94% of the gates in OSM do not have access information. He recommends adding the missing information about who can use these gates, and summarises the values for the access=* tag.
  • An article on improveosm.org details how the OpenStreetCam plug-in for JOSM, with its new features, can help with the mapping of missing features recognised from road signs.
  • OSM’s main map style no longer renders barrier=hedge with area=yes as an area following an update in August 2019, but instead renders them with the standard green line. It took a while until someone mentioned this in the German forum, and the recurring discussion (de) (automatic translation) about mapping an object as a polygon or a line started up again.
  • The voting on the place=refugee_site proposal, as suggested by Manonv and Kateregga1, has finished and the proposal was rejected with 28 votes for, 12 votes against, and 1 abstention.
  • Apple analysed (ru) (automatic translation) the territory of Russia with its Atlas tool. The detected errors are uploaded as tasks to the MapRoulette service and are waiting for us to fix them. Join!

Community

  • In April 2018 (we talked about it here), members of the OSM Togo community launched the GirlsMap initiative, which aims to organise OSM training sessions just for ladies, in order to increase their proportion within the local and global OSM community. Since then the initiative has spread to other African countries: Benin, Mali, Ivory Coast, Guinea, Burkina Faso and Madagascar. In Togo, Benin and Madagascar they have already organised their second or third workshops. The DRC community had planned to start their first one on 20 March, but unfortunately they had to postpone it to a later date due to the COVID-19 situation.
  • A discussion about how best to make imagery available in OSM editors seems to have degenerated from ‘what do we do next’ into ‘whose fault is it’ and finger-pointing over the 16 months that it has been open. See also here and here for more of the discussion.
  • Russian user Zkir (Kirill Bondarenko) shared (ru) (automatic translation) his impressions of the SotM BALTICS 2020 conference.
  • In early March the ‘Open Data Day’ conference took place in Moscow (Russia). It was held in the architectural landmark (constructivist architecture) Communal House of the Textile Institute on Ordzhonikidze street. The RU-OSM community shot panoramas (automatic translation) of the surrounding area and the building itself, and mapped it in OSM, including a 3D representation of the building.
  • The German OSM Telegram group is trying (de) (automatic translation) to restart regular organised weekly mapping tasks, an idea that got lost in 2010 in the German community, although attempts have been made to revive the idea. The current task (de) (automatic translation) is about checking the opening hours for doctors and pharmacies.

Imports

  • Lanxana informed us about the plan to import pharmacies in Catalunya, Spain, which consists of verifying and completing data on existing pharmacies by experienced mappers with the help of a Task Manager project. However, the licensing of the data and open issues about the import procedure have raised some questions.

Events

  • The Maker Faire Vienna 2020, originally scheduled for 16 and 17 March, has been rescheduled to 3 and 4 October 2020.
  • Do you map informal transit and take pictures? Post them on Wikimedia Commons to participate in the WikiLovesAfrica contest and the Jungle Bus Paratransit Contest. Learn more here.
  • Thomas Skowron shared his experience of the talks given at the German FOSSGIS conference, which have been held remotely due to the coronavirus.

Humanitarian OSM

  • Russell Deffner, from HOT, explained why the organisation has not yet responded to the Coronavirus threat. Due to missing formal requests from partners, which HOT requires to become active, he suggested organising efforts with local OpenStreetMap Chapters and Communities and to use HOT resources, such as the Tasking Manager, if needed.
  • Ryan Arcadio wrote an article about how maps and mapathons can aid the Philippines amid coronavirus and other disasters.
  • MapSwipe, in which Prof. Alexander Zipf, of the University of Heidelberg, played a major role, was awarded a main prize at the ‘Mobile World Congress’ trade fair in Barcelona, Spain.

Maps

  • [1] ‘indoor=’ is a new renderer for indoor data in OSM. The map, with global coverage and based on Simple Indoor Tagging, has been provided by François de Metz and is available on GitHub. Further details are provided in a blog post.
  • In the last few weeks many OSM users have noticed an increase in grey tiles in the OpenStreetMap carto style. The reason for this was the loss of several cache servers in Russia, Chile and France. The OSM operations team is therefore asking for help in providing servers, even if Twitter user Anonymaps thinks OSMF has better things to do.
  • OSM Paraguay wrote a how-to on setting up a tile server for Paraguay.

Licences

  • Полина Новикова asked for clarification on the difference between a Derivative Database and a Produced Work under the ODbL.

Software

  • OSM’s API now supports JSON responses, which should simplify usage for web apps. JSON support, which was brought about by mmd and others, is documented on the OSM wiki.
  • Facebook’s Michal Migurski announced the company’s Daylight Map Distribution offering; a kind of Facebook-reviewed OSM dataset, which helps users by filtering intentional or unintentional questionable edits.

Did you know …

  • … that the Russian company NextGIS has a YouTube-channel where various training videos on GIS technologies are published?

Other “geo” things

  • Süddeutsche Zeitung carried (automatic translation) an article on Martin Waldseemüller, the inventor of America. Christoph von Eichhorn’s article explains how Waldseemüller’s 1507 map was the first to show a continent called ‘America’ and that two of the six remaining copies are probably fakes.
  • The Geospatial World published some of the best maps tracking Coronavirus updates. There you can find maps like: The Esri Story Maps, WHO Situation Dashboard, Johns Hopkins University, HealthMap and many more.
  • Forbes features an article about the importance of geodata for the fight against the Coronavirus.
  • Mapbox discusses a variety of different maps, in a blogpost, which visualise the spread of the Coronavirus.
  • Yandex has created a map of the coronavirus spread in Russia and the world.

Upcoming Events

Many meetings were cancelled – please check the Calendar on the wiki page

Note: If you would like to see your event here, please put it into the calendar. Only data which is in the calendar will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Elizabete, Polyglot, Rogehm, SK53, Sammyhawkrad, SeverinGeo, Silka123, SunCobalt, TheSwavu, YoViajo, derFred.

Mobile web performance: the importance of the device

21:24, Saturday, 21 March 2020 UTC

This week at our team offsite in Dublin, I looked at our performance data from an angle we haven't explored before: mobile device type. Most mobile devices expose their make and model in the User Agent string, which allows us to look at data for a particular type of device. As per our data retention guidelines, we only keep user agent information for 90 days, but that's already plenty of data to draw conclusions from.

I looked at the top 10 mobile devices accessing our mobile sites, per country, for the past week. One country in particular, India, had an interesting set of top 10 devices that included two models from different hardware generations. The Samsung SM-J200G, commercially known as the Samsung Galaxy J2, was the 5th most common mobile device accessing our mobile sites, and the Samsung SM-G610F, also known as the Samsung Galaxy J7 Prime, was the 2nd most common. The hardware of the more recent handset is considerably more powerful, with 3 times the RAM, a 23% faster CPU clock and twice the number of CPU cores compared to the older model.

Being in the top 10 for that country, both devices get a lot of traffic in India, which means a lot of performance Real User Monitoring data collected from real clients to work with.

With the J7 Prime retail price in India currently being double the J2 retail price, one might wonder if users who use the cheaper phone also use a cheaper, slower, internet provider.

Thanks to the Network Information API, which we recently added to the performance data we collect, we are able to tell.
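
For illustration, the relevant fields can be read in the browser roughly like this (a simplified sketch, not our actual RUM instrumentation):

// Connection quality bucket from the Network Information API (Chrome only).
var connection = navigator.connection;
var effectiveType = connection && connection.effectiveType; // e.g. '4g'

// Paint and navigation timing as reported by the browser, in milliseconds.
var paintEntries = performance.getEntriesByType( 'paint' );
var firstPaint = paintEntries.filter( function ( entry ) {
  return entry.name === 'first-paint';
} )[ 0 ];
var navigationEntry = performance.getEntriesByType( 'navigation' )[ 0 ];

console.log(
  effectiveType,
  firstPaint && firstPaint.startTime,
  navigationEntry && navigationEntry.loadEventEnd
);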

Looking at Chrome Mobile only, for the sake of having a consistent definition of the effectiveType buckets, we get:

effectiveType   J2      J7 Prime
slow-2g         0.5%    1.1%
2g              0.8%    0.7%
3g              27%     28%
4g              71.5%   70.2%

These breakdowns are extremely similar, which strongly suggests that users of these two phone models in India actually experience the same internet connectivity quality. This is very interesting, because it gives us the ability to compare the performance of these two devices from different hardware generations, in the real world, with connectivity quality as a whole that looks almost identical. And similar latency, since they're connecting to our data centers from the same country.

What does firstPaint look like for these users, then?

Device      Sample size   Median (ms)   p90 (ms)   p95 (ms)   p99 (ms)
J2          1226          1842          4769       7704       15957
J7 Prime    1798          1082          2811       5076       12136
difference                -41.3%        -41.1%     -34.2%     -24%

And what about loadEventEnd?

Device      Sample size   Median (ms)   p90 (ms)   p95 (ms)   p99 (ms)
J2          1226          3078          9813       14072      29240
J7 Prime    1798          1821          5635       9847       28949
difference                -40.9%        -42.6%     -30.1%     -1.1%

Across the board, the difference is huge, even for metrics like loadEventEnd, where one might think that download speed would be an equalizer, particularly since we serve some heavy pages when articles are long. OS version might play a part in addition to hardware, but in practice we see that older Android devices tend to stick to the OS version they were shipped with, which means that those two factors are tied together. For example, worldwide for the past week, 100% of J2 phones run the Android version they were shipped with (5.1).

These results show that device generation has a huge impact on the real performance experienced by users. Across the globe, users are upgrading their devices over time. This phenomenon means that the performance metrics we measure directly on sampled users with RUM should improve over time, by virtue of people getting more powerful devices on average. This is an important factor to keep in mind when measuring the effect of our own performance optimizations. And when the median of the RUM metrics stays stable over a long period of time, it might be that our performance is actually worsening, and that the degradation is being masked by device and network improvements across the board.

Given the eye-opening results of this small study, getting a better grasp on the pace of improvement of the environment (device generations, network) looks like a necessity to understand and validate our impact on the evolution of RUM metrics.

WikimediaDebug v2 is here!

21:20, Saturday, 21 March 2020 UTC

WikimediaDebug is a set of tools for debugging and profiling MediaWiki web requests in a production environment. WikimediaDebug can be used through the accompanying browser extension, or from the command-line.

This post highlights changes we made to WikimediaDebug over the past year, and explains more generally how its capabilities work.

  1. What's new?
  2. Features overview: Staging changes, Debug logging, and Performance profiling.
  3. How does it all work?

§ 1. What's new?

Redesigned

I've redesigned the popup using the style and components of the Wikimedia Design Style Guide.

New design Previous design

The images above also show improved labels for the various options. For example, "Log" is now known as "Verbose log". The footer links also have clearer labels now, and visually stand out more.

New footer Previous footer

This release also brings dark mode support! (brighter icon, slightly muted color palette, and darker tones overall). The color scheme is automatically switched based on device settings.

Dark mode

Inline profile

I've added a new "Inline profile" option. This is a quicker and more light-weight alternative to the "XHGui" profile option. It outputs the captured performance profile directly to your browser (as a hidden comment at the end of the HTML or CSS/JS response).

Beta Cluster support

This week, I've set up an XHGui server in the Beta Cluster. With this release, WikimediaDebug has reached feature parity between Beta Cluster and production.

It recognises whether the current tab is for the Beta Cluster or production, and adapts accordingly.

  • The list of hostnames is omitted to avoid confusion (as there is no debug proxy in Beta).
  • The "Find in Logstash" link points to logstash-beta.wmflabs.org.
  • The "Find in XHGui" link points to performance-beta.wmflabs.org/xhgui/.

§ 2. Features overview

Staging changes

The most common use of WikimediaDebug is to verify software changes during deployments (e.g. SWAT). When deploying changes, the Scap deployment tool first syncs to an mw-debug host. The user then toggles on WikimediaDebug and selects the staging host.

WikimediaDebug is now active and routes browser activity for WMF wikis to the staging host. This bypasses the CDN caching layers and load balancers normally involved with such requests.

Debug logging

The MediaWiki software is instrumented with log messages throughout its source code. These indicate how the software behaves, which internal values it observes, and the decisions it makes along the way. In production we dispatch messages that carry the "error" severity to a central store for monitoring purposes.

When investigating a bug report, developers may try to reproduce the bug in their local environment with a verbose log. With WikimediaDebug, this can be done straight in production.

The "Verbose log" option configures MediaWiki to dispatch all its log messages, from any channel or severity level. Below is an example where the Watchlist component is used with the verbose log enabled.

One can then reproduce the bug (on the live site). The verbose log is automatically sent to Logstash, for access via the Kibana viewer at logstash.wikimedia.org (restricted link).

Aggregate graphs (Kibana) Verbose log (Kibana)

Performance profiling

The performance profiler shows where time is spent in a web request. This feature was originally implemented using the XHProf PHP extension (for PHP 5 and HHVM). XHProf is no longer actively developed, or packaged, for PHP 7. As part of the PHP 7 migration this year, we migrated to Tideways which provides similar functionality. (T176370, T206152)

The Tideways profiler intercepts the internals of the PHP engine, and tracks the duration of every subroutine call in the MediaWiki codebase, and its relation to other subroutines. This structure is known as a call tree, or call graph.

The performance profile we capture with Tideways is automatically sent to our XHGui installation at https://performance.wikimedia.org (public). There, the request can be inspected in fine detail. In addition to a full call graph, it also monitors memory usage throughout the web request.

Most expensive functions (XHGui) Call graph (XHGui)

§ 3. How does it all work?

Browser extension

The browser extension is written using the WebExtensions API which Firefox and Chrome implement.

Add to Firefox   Add to Chrome

You can find the source code on github.com/wikimedia/WikimediaDebug. To learn more about how WebExtensions work, refer to MDN docs, or Chrome docs.

HTTP header

When you activate WikimediaDebug, the browser sends an extra HTTP header along with all web requests relating to WMF's wiki domains, both those for production and those belonging to the Beta Cluster. In other words, any web request for *.wikipedia.org, wikidata.org, *.beta.wmflabs.org, etc.

The header is called X-Wikimedia-Debug. In the edge traffic layers of Wikimedia, this header is used as a signal to bypass the CDN cache. The request is then forwarded, past the load balancers, directly to the specified mw-debug server.

Header Format
X-Wikimedia-Debug: backend=<servername> [ ; log ] [ ; profile ] [ ; forceprofile ] [ ; readonly ]
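
For example, the same header can be attached to a request from the command line with curl; the backend hostname below is only illustrative:

curl -H 'X-Wikimedia-Debug: backend=mwdebug1001.eqiad.wmnet; log' \
  'https://en.wikipedia.org/wiki/Special:BlankPage'
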
mediawiki-config

This HTTP header is parsed by our MediaWiki configuration (wmf/profiler.php, and wmf/logging.php).

For example, when profile is set (the XHGui option), profiler.php invokes Tideways to start collecting stack traces with CPU/memory information. It then schedules a shutdown callback in which it gathers this data, connects to the XHGui database, and inserts a new record. The record can then be viewed via performance.wikimedia.org.

See also

Further reading

Add WikimediaDebug to Firefox   Add WikimediaDebug to Chrome

14 January 2020 security incident on Phabricator

21:20, Saturday, 21 March 2020 UTC

On 14 January 2020, staff at the Wikimedia Foundation discovered that a data file exported from the Wikimedia Phabricator installation, our engineering task and ticket tracking system, had been made publicly available. The file was leaked accidentally; there was no intrusion. We have no evidence that it was ever viewed or accessed. The Foundation's Security team immediately began investigating the incident and removing the related files. The data dump included limited non-public information such as private tickets, login access tokens, and the second factor of the two-factor authentication keys for Phabricator accounts. Passwords and full login information for Phabricator were not affected -- that information is stored in another, unaffected system.

The Security team has investigated and assesses that there is no known impact from this incident. However, out of an abundance of caution, we are resetting all Two-Factor Authentication keys for Phabricator and invalidating the exposed login access tokens. Additionally, we continue to encourage people to engage in online security best practices, such as keeping your software updated and resetting your passwords regularly.

The Foundation will continue to investigate this incident and take steps to prevent it from occurring again in the future. In the meantime, Phabricator is online and functioning normally. We regret any inconvenience this may have caused and will provide updates if we learn of any further impact.

Respectfully,

David Sharpe
Senior Information Security Analyst
Wikimedia Foundation

Wikimedia UK and COVID-19: A local and global response

14:25, Wednesday, 18 March 2020 UTC

By Lucy Crompton-Reid, Wikimedia UK Chief Executive

This is not the blog that I thought I was going to be writing this March. Through a combination of Women’s History Month, International Women’s Day and Wikimedia’s own project Art+Feminism, Wikimedia UK would have been involved in a wide range of events with some amazing partners across the country, training new editors and increasing coverage of women and their achievements on Wikipedia and beyond. Instead, our events have been cancelled and as of today, the Wikimedia UK office is closed and staff are all working from home, as we are gripped by a global pandemic on a scale none of us has ever lived through.

These are strange, sad and unsettling times for all of us, which are illustrating both our fragility and our interconnectedness. It’s difficult to find anything positive to focus on at a time when people are dying, museums are closing, businesses are folding and all of us are worried about ourselves and our loved ones. However I did want to share with you some of the ways in which the Wikimedia community both here in the UK and around the world is helping people through this crisis:

  • Wikipedia editors are working around the clock to make sure that there is up-to-date, accurate and accessible information about COVID-19 and the social, economic and educational impact of the pandemic. This Wired article talks about the crucial role played by a London based doctor and Wikipedia contributor in the development of the COVID-19 article on Wikipedia, which is now receiving more than half a million views a day. The doctor in question was trained to edit Wikipedia by our Wikimedian in Residence at the Wellcome Library, demonstrating the importance of these programmes and the offline activities facilitated by Wikimedia UK.
  • As more than 849 million children and young people are having their education disrupted as a result of the pandemic – according to the latest information released by UNESCO – the role of the Wikimedia projects as the world’s biggest open educational resource has never been more crucial. Here in the UK, we are exploring how we can support universities in their transition to online learning, whilst many Wikimedia organisations around the world are working to highlight teaching and learning resources for all age groups.
  • Individual editors and organisations across the global Wikimedia movement are joining advocacy efforts to encourage official bodies and international agencies to release content about the virus and associated issues under open licences. This work is ongoing and has already resulted in accurate, well researched and – crucially at this point – educational material about COVID-19 being released onto the Wikimedia projects, where millions more people are accessing this information.

Conversely, on a video call with colleagues from across the global Wikimedia movement yesterday, it became clear that some governments are using the pandemic to justify heavy censorship – including blocking Wikipedia – and roll back civil liberties. Whilst we are already seeing how a pandemic like this is creating an ideal environment for misinformation and disinformation, we must ensure that it is not used as an excuse to limit freedom of expression and curtail people’s rights to information and knowledge.

Wikimedia UK may be at home, but we’re still online. Yesterday the whole staff team met to discuss some of the implications of the office closure – and the wider COVID-19 situation – for us, our partners, our volunteer community and other contributors. We are already thinking about our programme and partnerships, and considering what events and projects will need to be cancelled but what can be rescheduled, moved online or re-imagined. We are also keen to explore ways in which we can support editors and readers during this period, as well as our members and supporters. If you would like to get in touch, please email us on info@wikimedia.org.uk and we will endeavour to respond as quickly as we can.

I wish you all the very best for the next few weeks and months. Please stay safe and, if you can, #StayAtHome.

Tech News issue #12, 2020 (March 16, 2020)

00:00, Monday, 16 March 2020 UTC

weeklyOSM 503

15:47, Sunday, 15 March 2020 UTC

03/03/2020-09/03/2020

lead picture

Mapathon at Saint Louis University in Baguio City, Philippines 1 | Photo © GOwin

Mapping

  • Stereo and contrapunctus have proposed a simplified approach to mapping public transport routes. This is based on their extensive experience in Luxembourg and Delhi respectively. Feedback was requested on the tagging mailing list.
  • Baloo Uriza has completed a large project of cleaning up southern California’s Interstate 405 freeway, much of which hasn’t been significantly touched since the TIGER imports. A number of issues, largely relating to complications arising from lane-mapping very large roads and editors not highlighting lane information such as placement, lane change restrictions and per-lane access restrictions, are highlighted in his wrap-up on talk-us.
  • Martijn van Exel introduced a new type of MapRoulette Challenge, the Quick Fix. Unlike traditional MapRoulette Challenges, the new Quick Fix Challenges require no experience with OSM editing tools like JOSM or iD. MapRoulette asks you, the mapper, a question that just requires a simple yes/no answer.
  • The voting on Ferdinand0101’s proposal of name:Zsye=*, which would enable mappers to add names written in emoji, has closed. The proposal was unsuccessful.

Community

  • The OSMF Board marked International Women’s Day by reflecting on the limitations to knowing exactly how many women contribute to the OpenStreetMap project and why some people contribute more or less than others. They also note that the reasons that bring people of any gender, origin or age to our project seem to be similar among contributors: it’s fun to make the map data a bit better and it’s rewarding when someone finds your work useful.
  • Miriam Gonzalez, one of the founders of Geochicas, was featured in an article on women in mapping by Anastasia Moloney. The article discusses how women mappers tend to add services often overlooked by men, such as hospitals, childcare services, toilets, domestic violence shelters and women’s health clinics. The OSMF Board’s full responses to Anastasia’s questions can be found here.
  • HOT uses International Women’s Day to recap the previous year’s HOT gender achievements, to summarise what can be done to ensure maps are being made for, by, and about women and to state HOT’s Gender Commitments.
  • Mapillary and Humanitarian OpenStreetMap Team are launching a new mapping campaign, #map2020, to improve maps and navigation in low- and middle-income countries. Jessica Bergmann invited communities to get involved for the chance for one individual to win a fully-funded trip to present at the HOT Summit in Cape Town, South Africa. This year’s campaign calls for communities to capture street-level imagery with Mapillary that contributes to improved navigation within their communities. Expressions of interest must be submitted by 20 March and final projects by 20 April.
  • Marco Minghini announced that the call for abstracts for the Academic Track at State of the Map 2020 has been extended to 26 March.
  • Nathalie Sidibé provided her background, motivation for OSM, and her achievements in an article in her user diary.
  • Andy Mabbett reminded us of a debate five years ago about the value of adding Wikidata IDs to OSM. Andy provides an example of the benefits of the link and asks others for their projects, which make use of both OSM and Wikidata.

Events

  • Jungle Bus is organising a Project of the Month in March, in the French-speaking community, on charging stations for electric vehicles. Numerous tools are available to contribute in the field or from home. Useful information is available on the project wiki page. The hashtag for this project is #balanceTaBorne
  • Michael Reichert announced (automatic translation) that the OSM Saturday planned for 14 March at the FOSSGIS Conference 2020 was cancelled due to current developments.
  • Current developments have also resulted in the Berliner OSM Hackweekend, planned for 28 and 29 March, being cancelled, as reported (automatic translation) by Lars Lingner.

Humanitarian OSM

  • Sean Fleming reported on how Microsoft’s AI for Humanitarian Action program is working with HOT and Bing to bring together satellite mapping, machine learning, and volunteers to create a new generation of detailed maps. The AI-powered tools help make the human volunteers more productive by predicting features, suggesting that a shape may be a building, and speeding up initial identification.
  • The Humanitarian OpenStreetMap Team invites everyone to submit ideas, during the Call for Sessions, for events at the 6th HOT Summit in Cape Town, South Africa, to be held 1 and 2 July 2020.
  • HOT announced its internship programmes around Google Summer of Code and Outreachy, a programme which organises short paid internships for typically under-represented people.
  • The good folk of Pittsburgh, Pennsylvania, love their Lenten fish fry so much that a few clever souls have created a Lenten Fish Fry Map to help residents catch their weekly fried fish fix. Emily Mercurio describes how Hollen Barmer and Code for Pittsburgh used open source software and open geospatial data to build a platform for volunteers to gather, update, and share fish fry information.

Maps

  • The Russian taxi ordering service ‘Taksovichkof‘ uses OSM as a basemap. We want to note that they have specified attribution properly.
  • A team of Russian cartographers created (automatic translation) a detailed map of Southern Ural. According (automatic translation) to one of the project participants Alexey Klyanin, it is based on OSM data.
  • The map at coronavirus.app visualises the number of people infected with the new virus by country on an OSM basemap.

switch2OSM

  • The Russian public movement ‘Antiborshevik’ (borshevik is Russian for heracleum), which aims to fight the harmful and poisonous plant Heracleum sosnowskyi, recently switched to OSM. Now their map (ru) is made using the uMap service, which uses OSM.

Open Data

  • Saint Louis University in Baguio City, Philippines, recently hosted its first mapathon and was pleasantly surprised by the number of participants. GOwin shares their impressions of the event and the results in a nicely illustrated user diary post.
  • OpenTrees.org, a service which currently visualises 12 million municipal street and park trees from 179 different sources, went live recently.

Licences

  • Christian Quest reported on the success of OSM France’s ‘AttributionIsNotOptional’ campaign (which we reported earlier). OSM France’s tile servers delivered a tile requesting that OSM be attributed as the source instead of a map tile, to websites using the tiles without the correct OSM attribution. The source citation is mandatory according to paragraph 4.3 of the ODbL. There are some pros and cons in the discussion about the procedure. Interesting to see who does not speak out 😉

Software

  • The JOSM team wrote about the planned removal of the JavaScript API from JOSM, following the removal of the JavaScript engine Nashorn from Java, and the impact this had and still has on the core of the JOSM editor.

Programming

  • Nick Whitelegg informed us about the start of a new development blog for his OSM-based augmented reality project Hikar and the off-road ‘StreetView’-like application opentrailview.org.
  • Russian user Vascom, who recently began to create weekly maps for the Maps.Me app for all regions of Russia and CIS countries, shared the scripts he uses to do this.
  • According to Roman Shuvalov, the developer of the game ‘Generation Street’, in the new version of his game you can export generated 3D models into .ply and .obj.

Releases

  • DBeaver SQL Client version 7.0.0 for Windows, Mac OS X and Linux is available. This database client allows users to access, edit and view most major SQL databases. Major enhancements were announced for this version:
    • Data viewer and data editor UI major improvements
    • SQL editor major improvements.
  • In the new version (9.6.0) of the famous mobile navigation app Maps.Me, which is based on OSM data, isolines were added (iOS, Android). Now navigation through mountains and hills will become a little easier. The function is also available in offline mode.

Did you know …

  • … in a tweet the JOSM team reminded editors in SK, BR, CZ, FR, DE, PT and RU to enable their country-specific JOSM validator rules.
  • … there is a QGIS tool for automatically identifying asbestos roofing?
  • … that the KFC representative office in Russia uses OSM to display (ru) its restaurants on its website?
  • … about the Russian website ‘Accident Map‘? (automatic translation) Its developers take data from the official traffic police statistics portal and map it. OSM is used as a basemap.

OSM in the media

  • FingerLakes1.com, a website offering local news, weather and sports for the Finger Lakes region in New York, featured OSM in an article titled ‘OpenStreetMap: Find Your Way In A Foreign City Like A Local’.

Other “geo” things

  • GPS World carried a story on ‘RoadTagger’, an artificial intelligence model developed by the Massachusetts Institute of Technology and Qatar Computing Research Institute to determine road features, such as type or lane count, from aerial imagery. Testing of the model with OSM data from 20 USA cities showed that it predicted the number of lanes with 77% accuracy and the road surface type with 93% accuracy.
  • Upendra Oli created a tutorial video on how to animate OSM time series data using QGIS.
  • Open Knowledge Belgium and Noms Peut-Être (automatic translation) have created a map visualising the street names of Brussels by gender. Only 6% of streets named for people in Brussels are named after women and only one street is named after a transgender man. Dries blogged about the creation and launch of EqualStreetNames.Brussels.
  • A user of the app RunKeeper, which tracked his bike ride close to a crime scene, was subjected to a police investigation following a ‘geofence warrant’ that led Google, whose location services the app is using, to share the data of its nearby phone users with the police.

Upcoming Events

Where What When Country
Chemnitz Chemnitzer Linux-Tage (cancelled) 2020-03-14-2020-03-15 germany
Nottingham Nottingham pub meetup 2020-03-17 united kingdom
Lüneburg Lüneburger Mappertreffen 2020-03-17 germany
Cologne Bonn Airport 127. Bonner OSM-Stammtisch 2020-03-17 germany
Hanover OSM-Sprechstunde 2020-03-18 germany
Hanover Stammtisch Hannover 2020-03-19 germany
Munich Münchner Treffen 2020-03-19 germany
San José Civic Hack & Map Night (online) 2020-03-19 united states
Bremen Bremer Mappertreffen 2020-03-23 germany
Reading Reading Missing Maps Mapathon 2020-03-24 united kingdom
Hanover OSM-Sprechstunde 2020-03-25 germany
Prešov Missing Maps Mapathon Prešov #5 2020-03-26 slovakia
Lübeck Lübecker Mappertreffen 2020-03-26 germany
Düsseldorf Düsseldorfer OSM-Stammtisch 2020-03-27 germany
Biella Incontro mensile 2020-03-28 italy
Toulouse Contrib’atelier OpenStreetMap 2020-03-28 france
Berlin Berlin Hackweekend 2020-03-28-2020-03-29 germany
Cologne Kölner Stammtisch 2020-04-01 germany
Stuttgart Stuttgarter Stammtisch 2020-04-01 germany
Toulouse Toulouse Hack Weekend April 2020 2020-04-04-2020-04-05 france
Valcea EuYoutH OSM Meeting 2020-04-27-2020-05-01 romania
Guarda EuYoutH OSM Meeting 2020-06-24-2020-06-28 portugal
Cape Town HOT Summit 2020-07-01-2020-07-02 south africa
Cape Town State of the Map 2020 2020-07-03-2020-07-05 south africa

Note: If you would like to see your event here, please put it into the calendar. Only data which is in the calendar will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by PierZen, Polyglot, Rogehm, SK53, Silka123, SunCobalt, TheSwavu, derFred, geologist.

#SwineFlue management with #wolves

11:43, Sunday, 15 March 2020 UTC
A lot is being said about viruses and pandemics. They do not only exist in humans but also in animals, particularly in kept animals. One knee-jerk reaction to an outbreak of disease is to blame animals in the wild.

Good examples are swine flu and African swine fever. It is traditional to call for the culling, the extermination, of wild boar, and traditionally the result is an increase in the number of boar killed.

A real solution may be an ecological one: wolves that predate on boar prefer a sickly animal over a healthy animal that is better able to fight back. There is documentation of wolves limiting the extent of swine fever outbreaks; areas with wolves do better.

As an apex predator, the wolf has a profound effect on its ecology. There are all kinds of arguments why people oppose the reintroduction of animals that are essential for a functional ecology; animals like wild boar, beaver and wolf are extinct in places. We argue that we need more trees to offset climate change, but this will not work when those trees are not placed in a functioning ecology. In Scotland trees will not grow because they will be eaten by overabundant elk. Scotland has no functioning ecology; it lacks predators like wolves and lynx to keep the elk in check.

When we consider pandemics, viral diseases and our ecology, it is important to consider our own effects. We will do better when we enable ecological functionality and build with nature for more sustainable results.
Thanks,
        GerardM

At the end of February I was honoured to be invited to present the closing keynote at the Wikimedia in Education Summit at the Disruptive Media Learning Lab at Coventry University.  This is the transcript of my talk. 


Introduction

Although I’m originally an archaeologist by background, I’ve worked in the domain of learning technology for over twenty years and for the last ten years I’ve focused primarily on supporting the uptake of open education technology, resources, policy and practice, and it’s through open education that I came to join the Wikimedia community. I think the first Wikimedia event I ever took part in was OER De, a cross-sector open education conference hosted by Wikimedia Deutschland in Berlin in 2014. I remember being really impressed by the wide range of innovative projects and initiatives from across all sectors of education, and it really opened my eyes to the potential of Wikimedia to support the development of digital literacy skills, while enhancing the student experience and enriching our shared knowledge commons. And I think we’ve seen plenty of inspiring examples today of that potential being realised in education institutions around the UK.

So what I want to do this afternoon is to explore the relationship between the open education and Wikimedia domains and the common purpose they share; to widen access to open knowledge, remove barriers to inclusive and equitable education, and work towards knowledge equity for all. I also want to turn our attention to some of the structural barriers and systemic inequalities that prevent equitable participation in and access to this open knowledge landscape. We’ll begin by taking a brief look at some of the recent global policy initiatives in this area, before coming back closer to home to explore how the University of Edinburgh’s support for both open education and Wikimedia in the curriculum forms part of the institution’s strategic commitment to creating and sharing open knowledge.

Open Education

To begin with though, I want to take a step back to look at what we mean when we talk about open education, and if you’ve heard me speak before, I apologise if I’m going over old ground here.

The principles of open education were outlined in the 2008 Cape Town Declaration, one of the first initiatives to lay the foundations of what it referred to as the “emerging open education movement”. The Declaration advocates that everyone should have the freedom to use, customize, and redistribute educational resources without constraint, in order to nourish the kind of participatory culture of learning, sharing and cooperation that rapidly changing knowledge societies need. It sounds a lot like the goals of the Wikimedia community doesn’t it? Which is hardly surprising given that one of the authors of the Cape Town Declaration was Jimmy Wales. In a press release to mark the launch of the Declaration, Wales was quoted as saying

“Open education allows every person on earth to access and contribute to the vast pool of knowledge on the web. Everyone has something to teach and everyone has something to learn.”

The Cape Town Declaration is still an influential document and it was updated on its 10th anniversary as Cape Town +10, and I can highly recommend having a look at this if you want a broad overview of the principles of open education. Unsurprisingly, engaging with Wikipedia is woven through Cape Town +10, as a means to empower the next generation of learners, to encourage the adoption of open pedagogies, and to open up publicly funded resources.

As conceived by the Cape Town Declaration, open education is a broad umbrella term; there is no one hard and fast definition, and indeed, as Catherine Cronin reminds us in her paper “Openness and Praxis”, open education is complex, personal, contextual and continually negotiated.

One conceptualisation of open education that I like is from the European Union’s JRC Science for Policy Report, Opening Up Education: A Support Framework for Higher Education Institutions, which describes the aim of open education as being

“to widen access and participation to everyone by removing barriers and making learning accessible, abundant, and customisable for all. It offers multiple ways of teaching and learning, building and sharing knowledge. It also provides a variety of access routes to formal and non-formal education, and connects the two.”

Another interpretation of open education that I often return to is from the not-for-profit organization OER Commons which states that

“The worldwide OER movement is rooted in the human right to access high-quality education. The Open Education Movement is not just about cost savings and easy access to openly licensed content; it’s about participation and co-creation.”

One of the things I like about both these interpretations is the focus on co-creation and removing barriers to knowledge, which to my mind are the most important aspects of open education and which, of course, are also cornerstones of the Wikimedia movement.

Open Educational Resources (OER)

Owing to its contextual nature, open education encompasses many different things including open pedagogy, open textbooks, open assessment practices, open online courses, and open data, however open educational resources, or OER, are central to any understanding of this domain. And of course Wikipedia is frequently described as the world’s biggest open educational resource.

UNESCO define open educational resources as:

“learning, teaching and research materials in any format and medium that reside in the public domain or are under copyright that have been released under an open license, that permit no-cost access, re-use, re-purpose, adaptation and redistribution by others.”

UNESCO OER Recommendation

Now there is actually some controversy regarding the wording of this definition, but I’m not going to go into that right now. The reason this definition is significant is that in November last year UNESCO made a formal commitment to actively support the global adoption of OER, when it approved its Recommendation on Open Educational Resources. This Recommendation builds on a series of earlier policy instruments including the 2012 Paris OER Declaration, and the 2017 Ljubljana OER Action Plan. To distinguish between these policy instruments: Declarations outline principles that UNESCO member states wish to afford the broadest possible support to, while Recommendations have significantly greater authority and are intended to influence the development of national laws and practices. So the fact that we now have a new UNESCO Recommendation on OER is an important step forward.

Central to the new Recommendation, is the acknowledgement of the role that OER can play in achieving United Nations Sustainable Development Goal 4. The Recommendation recognises that

“in building inclusive Knowledge Societies, Open Educational Resources (OER) can support quality education that is equitable, inclusive, open and participatory as well as enhancing academic freedom and professional autonomy of teachers by widening the scope of materials available for teaching and learning.”

And it outlines five areas of action:

  1. Building capacity of stakeholders to create, access, re-use, adapt and redistribute OER
  2. Developing supportive policy
  3. Encouraging effective, inclusive and equitable access to quality OER
  4. Nurturing the creation of sustainability models for OER
  5. Promoting and reinforcing international cooperation

Equality and diversity is centred throughout the Recommendation with the acknowledgement that

“In all instances, gender equality should be ensured, and particular attention paid to equity and inclusion for learners who are especially disadvantaged due to multiple and intersecting forms of discrimination.”

This echoes UNESCO Assistant Director for Education Qian Tang’s summing up at the end of the 2nd World OER Congress in Ljubljana in 2017 when he said that

“to meet education challenges, we can’t use the traditional way. In remote and developing areas, particularly for girls and women, OER are a crucial, crucial means to reach SDGs. OER are the key.”

How member states choose to action the UNESCO OER Recommendation, and what impact it will have globally, remains to be seen. However a coalition of organizations committed to promoting open education worldwide, including the Commonwealth of Learning, Creative Commons, SPARC and Open Education Global has been established to provide resources and services to support the implementation of the Recommendations.

Wikimedia Movement Strategy

Running in parallel with the development of the UNESCO Recommendation, the Wikimedia Foundation has been undertaking its own Movement Strategy exercise to shape the strategic direction of the movement and outline the processes required to enable Wikimedia to achieve its goal of becoming the essential infrastructure of the ecosystem of free knowledge by 2030. Over the past three years, volunteers, staff, partners and other stakeholders from across the global Wikimedia community have been involved in an ambitious process to identify what the future of the movement should look like, and how we should get there. And although the process and mechanism for scoping the Movement Strategy could hardly be more different from the development and ratification of the formal UNESCO Recommendation, both are underpinned by common principles and seek to achieve broadly similar goals. The Movement Strategy is still under development, but it outlines 13 Recommendations to build a shared future and bring the Wikimedia movement’s vision to life.

I’m not going to go into all these Recommendations (you can find out more about them and how to contribute to the Movement Strategy process here), but it’s clear that they echo many of the principles of the UNESCO OER Recommendation. Indeed Recommendation 10, Prioritize Topics for Impact, specifically acknowledges the need to address global challenges, such as those outlined in the Sustainable Development Goals, and there are many other areas of commonality with the global open education movement among the other Recommendations.

Enshrined in the Wikimedia Movement Strategy are the key concepts of Knowledge as a Service and Knowledge Equity.

Knowledge as a Service is the idea that Wikimedia will become a platform that serves open knowledge to the world across interfaces and communities.

And Knowledge Equity is the commitment to focus on knowledge and communities that have been left out by structures of power and privilege, and to break down the social, political, and technical barriers preventing people from accessing and contributing to free knowledge.
Knowledge Equity and Structural Inequality – giving up space.

Structural Inequality in the Open Knowledge Landscape

And to my mind it is this commitment to knowledge equity that is key to both the open education and Wikimedia movements, because as we are all aware, the open knowledge landscape is not without its hierarchies, norms, gatekeepers and power structures.

Indeed the 2019 Progress update for Sustainable Development Goal 4 notes that while rapid technological changes present both opportunities and challenges, refocused efforts are needed to improve learning outcomes particularly for women, girls and marginalized people in vulnerable settings.

Wikimedia’s problems with gender imbalance, structural inequalities and systemic bias are well known and much discussed. On English language Wikipedia just over 18% of biographical articles are about women, and the proportion of female editors is somewhere between 15 and 20%. Some language Wikipedias, such as the Welsh Wicipedia, fare better; others are much worse. Despite Wikipedia’s gender imbalance being an acknowledged problem that projects such as Wiki Women In Red have sought to address, too often those who attempt to challenge these structural inequalities and rectify the systemic bias are the subject of targeted hostility and harassment. The Movement Strategy acknowledges these issues and highlights the importance of addressing them.

Recommendation 2, on Creating Cultural Change for Inclusive Communities, notes that Wikimedia communities do not reflect the diversity of our global society, and that the alarming gender gap can be attributed to a number of causes, including the lack of a safe environment, as evidenced by numerous cases of harassment. And Recommendation 5, on Ensuring Equity in Decision-Making, notes that Wikimedia’s historical structures and processes reinforce the concentration of power around established participants and entities, adding that inclusive growth and diversification require a cultural change founded on more equitable processes and representative structures.

In a recent article titled “The Dangers of Being Open” Amira Dhalla, who leads Mozilla’s Women and Web Literacy programs, wrote:

“What happens when only certain people are able to contribute to open projects and what happens when only certain people are able to access open resources? This means that the movement is not actually open to everyone and only obtainable by those who can practice and access it.

Open is great. Open can be the future. If, and only when, we prioritize structuring it as a movement where anyone can participate and protecting those who do.”

This lack of equity in the open knowledge landscape is significant because, if knowledge and education are to be truly open, then they must be open to all regardless of race, gender, or ability. Openness is not just about definitions, recommendations and strategies; openness is about creativity, access, equity, and social inclusion, and about enabling learners to become fully engaged radical digital citizens.

Radical Digital Citizenship, as defined by Akwugo Emejulu and Callum McGregor, moves beyond the concept of digital literacy as simply acquiring skills to navigate the digital world, to a re-politicised digital citizenship in which social relations with technology are made visible, and emancipatory technological practices for social justice are developed to advance the common good.

And I think, to some extent, that is what the Wikimedia Movement strategy process and the UNESCO OER Recommendation are trying to achieve.

University of Edinburgh

At the University of Edinburgh we believe that both open education and open knowledge are strongly in keeping with our institutional vision and values; to discover knowledge and make the world a better place, and to ensure our teaching and research is accessible, inclusive, and relevant to society. In line with the UNESCO OER Recommendation, we also believe that OER and open knowledge can contribute to achieving the aims of the United Nations Sustainable Development Goals, which the University is committed to through the SDG Accord. To this end the University supports both a Wikimedian in Residence and a central OER Service.

We’ve already heard about our successful Wikimedian in Residence programme, so I want to turn our attention to our OER Service, which was launched in 2015, at around the same time as our Residency; the two have worked closely together over the last five years.
OER Vision

The University’s vision for OER has three strands, building on our excellent education and research collections, traditions of the Scottish Enlightenment, the university’s civic mission and the history of the Edinburgh Settlement. The three strands of our OER vision are:

For the common good – encompassing every day teaching and learning materials.
Edinburgh at its best – high quality resources produced by a range of projects and initiatives.
Edinburgh’s Treasures – content from our world class cultural heritage collections.
OER Policy

This vision is backed up by an OER Policy, approved by our Learning and Teaching Committee, which encourages staff and students to use, create and publish OERs to enhance the quality of the student experience. The fact that this policy was approved by our Learning and Teaching Committee is significant, as it places open education and OER squarely in the domain of teaching and learning. Both the University’s vision for OER and its support for our Wikimedian in Residence are the brainchild of Melissa Highton, Assistant Principal Online Learning and Director of Learning and Teaching Web Services, who many of you will know and who presented the keynote at the Wikimedia in Education Summit at Middlesex University two years ago. EUSA, the students’ union, was also instrumental in encouraging the University to adopt an OER policy, and we continue to see student engagement and co-creation as fundamental aspects of open education and open knowledge.

OER Service

Of course policy is nothing without support, and this is where the OER Service comes in. The service provides staff and students with advice and guidance on creating and using OER, and on engaging with open education. We provide a one-stop shop giving access to OER produced by staff and students across the university, and we place openness at the centre of strategic technology initiatives by embedding digital skills training and support in institution-wide programmes including lecture recording, academic blogging, MOOCs, and distance learning at scale.

Like our Wikimedian in Residence, the OER Service focuses on digital skills development, and we run a wide range of digital skills workshops for staff and students on copyright literacy, open licensing, OER and playful engagement.

Copyright Debt

We see the development of copyright literacy skills as particularly important, as it helps to mitigate a phenomenon that my colleague Melissa has referred to as copyright debt. If you don’t get the licensing of educational content right first time round, it will cost you to fix it further down the line, and the cost and reputational risk to the university could be significant if copyright is breached. And this is one of the key value propositions for investing in strategic support for OER at the institutional level; we need to ensure that we have the right to use, adapt, and reuse the educational resources we have invested in. It’s very common to think of OER as primarily being of benefit to those outwith the institution; however, open licences help to ensure that we can continue to use and reuse the resources that we ourselves have created. Unless teaching and learning resources carry a clear and unambiguous licence statement, it is difficult to know whether and in what context they can be reused.

Online Learning: MOOCs and MicroMasters

Ensuring continued access to course materials is particularly important for our many online learners, whether they are among the 4,000 matriculated students enrolled on our online masters courses, or the 2.7 million learners who have signed up for the wide variety of MOOCs that we offer. Continued access to MOOC content can be particularly problematic as educational content often gets locked into commercial MOOC platforms, regardless of whether or not it is openly licensed, and some platforms are now time limiting access to content. Clearly this is not helpful for learners and, given how costly high-quality online resources are to produce, it also represents a poor return on investment for the University. In order to address this issue, the OER Service works closely with our MOOC production teams to ensure that all content can be released under open licence through our Open Media Bank channel on our media asset management platform Media Hopper Create. We now have over 500 MOOC videos which are available to re-use, covering topics as diverse as music theory, mental health, clinical psychology, astrobiology and the discovery of the Higgs Boson particle.

We’re also extending our commitment to providing open access to high quality online learning opportunities and widening access to our scholarship, by launching a new programme of MicroMasters in partnership with EdX. These micro credentials are flexible, open to all, and provide a stepping stone from open to formal accreditation. And if you cast your minds back to the EU report on Opening Up Education, you’ll remember that providing access routes between non-formal and formal education is one of the specific benefits of open education that it highlights.

Openness has informed our approach to these innovative new programmes at every step of the way: edX was chosen as a not-for-profit organisation built on an open source platform; the technology and policies that drive our new pedagogical approaches at scale are open and shared; and in line with our OER policy, we’re building openness into the creation of all teaching materials. Our first MicroMasters, in Predictive Analytics for Business Applications, was launched in September, and course materials will be released under open licence shortly.

Co-Creation

As I mentioned earlier, at Edinburgh we believe that student engagement is fundamental to our institutional mission and our vision for OER and open knowledge. And arguably the best way to engage students is through co-creation, which to my mind, is one of the most powerful affordances of open education.

Put simply, co-creation can be described as student led collaborative initiatives, often developed with teachers or other bodies, that lead to the development of shared outputs. A key feature of co-creation is that it should be based on equal partnerships, and relationships that foster respect, reciprocity, and shared responsibility.

And we’ve already seen plenty of examples of the benefits of co-creation in action through the inspiring Wikimedia in the Curriculum initiatives supported by Ewan, but we also have a number of open education and OER creation assignments running throughout the University.

One particularly inspiring example is the School of Geosciences Outreach and Engagement course, which gives students the opportunity to develop their own science communication projects with schools, museums, outdoor centres and community groups, creating a wide range of reusable educational resources for science engagement and community outreach. Each summer the OER Service employs Open Content Creation student interns, who take the materials created by the GeoScience students, make sure everything in those resources can be released under open licence, and then share them on TES Resources, so they can be found and reused by other teachers and learners.

OER creation assignments also form an integral part of the Digital Futures for Learning course, which is part of our MSc in Digital Education. Commenting on this assignment, course leader Jen Ross said:

“Experiencing first-hand what it means to engage in open educational practice gives students an appetite to learn and think more. The creation of OERs provides a platform for students to share their learning, so their assignments can have ongoing, tangible value for the students themselves and for those who encounter their work.”

And these sentiments echo the experiences of many of the students who have participated in our Wikipedia in the Curriculum assignments.

Knowledge Equity

Finally I want to return to the theme of knowledge equity; many of our open education and Wikimedia activities have a strong focus on redressing gender imbalance, centering marginalised voices, diversifying and decolonising the curriculum, and uncovering hidden histories. Some inspiring examples include our regular Wiki Women in Red editathons; Women in STEM editathons for Ada Lovelace Day and International Women’s Day; LGBT+ resources for medical education; open educational resources on LGBT+ Issues for Secondary Schools; UncoverED, a student led collaborative decolonial project uncovering the global history of the university; Diverse Collections, showcasing stories of equality and diversity within our archives; and the award winning Survey of Scottish Witchcraft Wikidata project.

Projects such as these provide our staff and students with opportunities to engage with the creation of open knowledge and to improve knowledge equity. And what is particularly gratifying is that we often find that this inspires our staff and students to further knowledge activism. So, for example, this is Tomas Sanders, an undergraduate History student and one of our former Open Content Curation student interns, who went on to run a successful Wikipedia editathon for Black History Month with the student History Society.

Talking about his experience of running the Black History Month Editathon, in an interview with Ewan, Tomas said:

“The history that people access on Wikipedia is often very different from the history that you would access in a University department; there’s very little social history, very little women’s history, gender history, history of people of colour or queer history, and the only way that’s going to be overcome is if people from those disciplines start actively engaging in Wikipedia and trying to correct those imbalances. I feel the social potential of Wikipedia to inform people’s perspectives on the world really lies in correcting imbalances in the representation of that world. People should try to make Wikipedia accurately represent the diversity of the world around us, the diversity of history, and the diversity of historical scholarship.”

All these projects are examples of knowledge equity in action: the dismantling of obstacles that prevent people from accessing and participating in education and knowledge creation. Ultimately, this is what knowledge equity is about: counteracting structural inequalities and systemic barriers to ensure just representation of knowledge and equitable participation in the creation of a shared public commons. And it’s through the common purpose of knowledge equity that we can harness the transformational potential of open knowledge and open education to make real steps towards achieving the aims of Sustainable Development Goal 4: ensuring inclusive and equitable quality education and promoting lifelong learning opportunities for all, while supporting social inclusion and enabling learners to become fully engaged radical digital citizens.

References

Cook-Sather, A., Bovill, C., & Felten, P. (2014). Engaging students as partners in learning and teaching: A guide for faculty. San Francisco, CA: Jossey-Bass.

Cronin, C. (2017). Openness and Praxis: Exploring the Use of Open Educational Practices in Higher Education. The International Review of Research in Open and Distributed Learning, 18(5). https://doi.org/10.19173/irrodl.v18i5.3096

Cybulska, D., (2019), Funding utopia when you’re already a free knowledge utopia, https://medium.com/a-funding-utopia/funding-utopia-when-youre-already-a-free-knowledge-utopia-8da9d8f12c3c

Dhalla, A., (2018). The Dangers of Being Open, https://medium.com/@amirad/the-dangers-of-being-open-b50b654fe77e

Emejulu, A. and McGregor, C., (2019). Towards a radical digital citizenship in digital education, Critical Studies in Education, 60:1, 131-147, DOI: 10.1080/17508487.2016.1234494

Inamorato Dos Santos, A., Punie, Y., and Castaño Muñoz, J. (2016). Opening up Education: A Support Framework for Higher Education Institutions, European Commission Joint Research Centre, https://doi.org/10.2791/293408

Lubicz-Nawrocka, T. (2018). Students as partners in learning and teaching: The benefits of co-creation of the curriculum. International Journal for Students As Partners, 2(1), 47-63.

Schuwer, R. (2019), UNESCO Recommendation on OER, https://www.robertschuwer.nl/?p=2812

UNESCO General Conference, (2019), Draft Recommendation on Open Educational Resources, https://unesdoc.unesco.org/ark:/48223/pf0000370936

Wikimedia Movement Strategy, 2018 – 2020, https://meta.wikimedia.org/wiki/Strategy/Wikimedia_movement/2018-20

A greatly-overdue Commons app update

14:26, Thursday, 12 2020 March UTC

Wow, has it really been a year since the last post I wrote on this blog? Time has really gotten away from me for a bit!

Much has changed in the Commons app since I last wrote. We have completed our codebase overhaul, and are proud to report a much more modern, stable, organized codebase. Also, from all reports that we have, the persistent upload failures that have plagued a few users have been solved! More details can be found on our Project Grant’s midpoint report.

Some screenshots from our new Nearby filters feature in v2.12:

[Screenshots: Nearby filters in the Commons app]

And a map of p18 edits made with our app (images uploaded via Nearby for places that need them) from all around the world:

[Map: p18 edits made with the Commons app]

2020 plans

We may also have an iOS app coming up for those of you who use iPhones! We are currently proposing its creation here – feel free to post your thoughts on it. 🙂

As usual, thank you to every one of you who has made all of this possible.

Wikipedia as a teaching tool that empowers students

15:19, Tuesday, 10 2020 March UTC

“I’ve improved my student reviews from it.”

Dr. Jennifer Glass’ environmental geochemistry course at Georgia Tech last fall covered “how chemical, biological, and geological processes control the distribution of chemical elements on Earth and the solar system.” Through a semester-long Wikipedia writing assignment, she wanted students to gain experience “in scientific writing on notable topics in environmental geochemistry of high interest to the public.”

This was her third term in a row using our assignment management tools and trainings to guide students through the task. After that first term in Fall 2018, she was thrilled to see how the assignment both affected students and the public resource that is Wikipedia. “Everyone agrees it’s way more interesting than a term paper,” she shared with us back in early 2019. “I think I’ll never go back to term papers – and instead always do something interactive like this.”

Cement wall with leaching occurring, causing the white discoloration.
Image by Daina Krumins, CC BY-SA 4.0 via Wikimedia Commons.

One student from this most recent fall term improved Wikipedia’s coverage of the chemical process called leaching, a naturally occurring process where a solute is extracted using a solvent. Although the page was created in 2008, it had only seen a few updates each year until this student came in and overhauled the article. They effectively rewrote the whole article (check out the Authorship Highlighting tool on the Dashboard to see their contributions), including what leaching processes look like in soil and for fly ash. The page now also has a summary of laboratory tests used to measure each type of leaching process. The page gets about 240 views every day and has received more than 23,000 visits since the student improved it! They also uploaded a freely licensed photo to the page to illustrate leaching occurring in a cement wall due to natural weathering events.

Another student expanded a section of the page about sulfur cycles, the “processes by which sulfur moves between rocks, waterways and living systems.” The page receives about 340 pageviews every day and has been read 35,000 times since the student added this new section of well-referenced work.

Our Dashboard’s Authorship Highlighting tool shows all the live content that students have added to a particular topic on Wikipedia. It’s very helpful for grading!

Dr. Glass recently spoke at Georgia Tech’s Spring 2020 Teaching with Technology Spotlight about the power of the assignment (see the full talk recorded here). She shares how to incorporate the assignment into her class, what learning objectives it achieves, and student reactions.

“I was excited to do it because I first learned how to edit Wikipedia myself at a conference five years ago,” she explained. She had seen warnings on Wikipedia pages before this (like the “reference needed” tag), but wasn’t familiar with how the site’s content was actually curated.

“Before this conference, I thought that kind of warning was for an official editor of Wikipedia to fix, when in fact, no, that’s the beauty of Wikipedia. It’s this big community out there that volunteers their time. And I realized, along with a lot of other professors, that I could use this in my class. And maybe students would like it a lot more than the typical project of a term paper that only I read.”

“I liked the assignment myself in the beginning because it’s for the public good. It’s getting information out. It’s taking the information from these specialized journals that only we, at universities, have easy access to because our institute pays the very high subscription fees. Most people can’t get access to that peer review information, so how do we get this out to the broader community that’s interested?”

Already, freeing up knowledge that’s siloed inside academia is a powerful motivator for many of the instructors we work with. But there’s another piece that is almost more compelling—students really like it.

“Students can take greater ownership of their work,” Dr. Glass continued. “And I’ve learned through my experience that students are much more engaged with the material, take much more ownership, and feel much more empowered when they know that the information is gonna end up in the public domain. It’s going to be seen by the world. In the beginning, it’s kind of scary for them. They’ve heard bad things about Wikipedia in high school, often, and they also are nervous about writing something that is going to end up being seen by the entire world. But when they go through the whole semester-long process, by the end of it, 95% of them really like the experience and feel empowered. I’ve seen multiple people put it on their resumes. It really shows that they feel like they got a real life experience out of this that is going to help them in the real world. And I think it will.”

“If you can do more writing in your classes, or if you want to, I really feel that students get a lot of great writing experience from this. And it teaches them about ethics and plagiarism and bias and how to find sources.”

Plus, she added, “I’ve improved my Course Instructor Opinion Survey (CIOS) scores from it.” That’s not a bad outcome, either!


To incorporate a Wikipedia writing assignment into an upcoming course, visit teach.wikiedu.org for access to free resources and assignment templates.


To read more about Dr. Glass’ past course, see our blog post after her first term teaching. To watch Dr. Glass’ full talk, click here.

A fair number of books have been written on the birds of India. Many colonial-era books have been taken out of the clutches of antique book sellers and wealthy hoarders and made available to researchers at large by the Biodiversity Heritage Library but there are still many extremely rare books that few have read or written about. Here is a small sampling of them which I hope to produce as a series of short entries.

One of these is by M.R.N. Holmer (Mary Rebekah Norris Holmer, 6 June 1875 - 2 September 1957), a professor of physiology at Lady Hardinge Medical College who was also the first woman board member in the Senate of Punjab University, a first for any university in India. Educated at Cambridge and Dublin University, she worked in India from 1915 to 1922 and then returned to England. She wrote several pieces on the methods of teaching nature study, and seems to have been very particular about these ideas. From a small fragment, it would appear that she emphasized the use of local and easily available plants as teaching aids and deplored the use of the word "weed". Her sole book on birds was first published in 1923 as Indian Bird Life and then revised in 1926 as Bird Study in India. The second edition includes very neat black-and-white illustrations by Kay Nixon, a very talented artist who illustrated some Enid Blyton books and apparently designed posters for the Indian Railways.

A rather sparse Wikipedia entry has been created at https://en.wikipedia.org/wiki/M.R.N._Holmer - more information is welcome!

A scanned version of her bird book can now be found on the Internet Archive (https://archive.org/details/Holmer). Holmer came from a Christian Sunday School approach to natural history, which shows up in places in the book. Her book includes many literary references, several especially to R.L.S. (R.L. Stevenson). In another part of the series we will look at more "evangelical" bird books.



 
John Stephenson, the writer of the preface, was a zoologist and a specialist on the oligochaetes. He wrote the Fauna of British India volume on the oligochaetes and was the series editor for the Fauna of British India from 1927 following the death of the editor A.E. Shipley.

This Month in GLAM: February 2020

02:04, Tuesday, 10 2020 March UTC
  • Armenia report: Wiki project on Museums with My Armenia
  • Brazil report: Moreira Salles Institute GLAM initiative in Brazil
  • Finland report: The Helsinki then and now exhibition
  • France report: GLAM related blogposts
  • Indonesia report: Proposing collaboration with museums in Bali; First Wikisource training in the region
  • Netherlands report: Students write articles about Media artists, Public Domain Day 2020, Wiki Goes Caribbean, WikiFridays at Ihlia – Wikimedia Nederland in January & February 2020
  • Norway report: Wikipedia editing workshop with the Norwegian Network for Museums
  • Serbia report: Great dedication of librarians
  • Sweden report: Historic photos; Support for international Wikimedia community; Library training tour; Many GLAMs improved on Wikidata
  • UK report: Kimonos and Khalili
  • Ukraine report: Winning photos Wiki Loves Monuments shown in different cities; Libraries Lead an All-Ukrainian Challenge
  • USA report: Black History Month and Open Access Anniversaries
  • Structured Data on Wikimedia Commons report: Summary of pilot projects, and what’s next
  • Wikidata report: Leap into Wikidata!
  • WMF GLAM report: New Team Leadership, GLAM-Focused Grants Review, OpenGLAM Declaration Research
  • Calendar: March’s GLAM events

Eval, not evil

12:44, Monday, 09 2020 March UTC

My Mix’n’match tool deals with third-party catalogs, and helps match their entries to Wikidata. This involves, as a necessity, importing minimal information about those entries into the Mix’n’match database, but ideally also imports additional metadata, such as (in the case of biographical entries) gender, birth/death dates, VIAF etc., which are invaluable in automatically matching entries to Wikidata items, and thus greatly reduce volunteer workload.

However, virtually none of the (currently) ~2600 catalogs in Mix’n’match offers a standardized format to retrieve either basic or meta-data. Some catalogs are imported by volunteers from tab-delimited files, but most are “scraped”, that is, automatically read and parsed, from the source website.

Some source websites are set up in a way that allows a standardized scraping tool to run there, and I offer a web form to create new scrapers; over 1400 of these scrapers have run successfully, and ~750 of them can automatically run again on a regular basis.

But the autoscraper does not handle metadata, such as birth/death dates, and many catalogs need bespoke import code even for the basic information. Until recently, I had hundreds of scripts, some of them consisting of thousands of lines of code, running data retrieval and parsing:

  • Basic (ID, name, URL) information retrieval from source site
  • Creating or amending entry descriptions from source site
  • Importing auxiliary data (other IDs, such as VIAF, coordinates, etc.) from source site
  • Extraction of birth/death dates from descriptions, taking care not to use estimates, “floruit” dates, etc. (a simplified sketch of this step follows the list)
  • Extraction of auxiliary data from descriptions
  • Linking of two related catalogs (e.g. one for painters, one for paintings) to improve matching (e.g. artworks only from that artist on Wikidata)

and many others.
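
As a flavour of what one of these steps involves, here is a simplified sketch of date extraction. This is illustrative TypeScript only, not the actual Mix’n’match PHP code, and the particular heuristics (which estimate markers to skip, which year patterns to accept) are assumptions:

```typescript
// Illustrative sketch only - not the actual Mix'n'match PHP code.
// Pulls a birth/death year pair out of a free-text description while
// skipping estimates and "floruit" periods; all heuristics are assumptions.

interface LifeDates {
  born: string;
  died: string;
}

function extractLifeDates(description: string): LifeDates | null {
  // Skip uncertain or "active period" dates rather than risk a bad match.
  if (/\b(fl\.|floruit|circa|ca?\.)\s*\d/i.test(description)) {
    return null;
  }
  // Accept simple patterns like "1875-1957" or "(1875–1957)".
  const m = description.match(/\b(\d{4})\s*[-–]\s*(\d{4})\b/);
  return m ? { born: m[1], died: m[2] } : null;
}

// Example: extractLifeDates('Mary Holmer (1875-1957), physiologist')
//          -> { born: '1875', died: '1957' }
// Example: extractLifeDates('fl. 1880, painter') -> null
```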

Over time, all this has become unwieldy, unstructured, and repetitive; I have written bespoke scrapers only to find that I already had one somewhere else, and so on.

So I set out to radically redesign all these processes. My approach is that, since only a small piece of code performs the actual scraping/parsing logic, these code fragments are now stored in the Mix’n’match database, associated with the respective catalog. I imported many code fragments from the “old” scripts into this table. I also wrote function-specific wrapper code that can load and execute a code fragment (via the eval function, which is often considered “evil”, hence the blog post title) on its associated catalog. An example of such code fragments for a catalog can be seen here.
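
To give a rough idea of the pattern (in TypeScript rather than the PHP actually used, and with entirely made-up names and table layout), the wrapper essentially loads a stored fragment and evaluates it against its catalog:

```typescript
// Rough TypeScript sketch of the "code fragment in the database" idea.
// The real Mix'n'match wrapper is PHP and uses eval(); the schema, the
// function names and the sample fragment below are invented for illustration.

interface Entry {
  id: string;
  name: string;
  url: string;
}

interface CodeFragment {
  catalogId: number;
  kind: string;  // e.g. "scrape_basic" (hypothetical)
  body: string;  // source code stored alongside the catalog
}

// Stand-in for a database lookup of the fragment belonging to a catalog.
function loadFragment(catalogId: number, kind: string): CodeFragment {
  return {
    catalogId,
    kind,
    // A stored fragment: receives raw HTML plus a base URL, returns entries.
    body: `
      const entries = [];
      for (const m of html.matchAll(/<a href="(\\/person\\/\\d+)">([^<]+)<\\/a>/g)) {
        entries.push({ id: m[1], name: m[2], url: base + m[1] });
      }
      return entries;
    `,
  };
}

// Generic wrapper: compiles the stored fragment and runs it on the catalog.
// new Function() is a close cousin of eval(), with the same caveat:
// never let untrusted users supply the fragment body.
function runFragment(fragment: CodeFragment, html: string, base: string): Entry[] {
  const fn = new Function('html', 'base', fragment.body) as (
    html: string,
    base: string
  ) => Entry[];
  return fn(html, base);
}

// Usage sketch
const fragment = loadFragment(42, 'scrape_basic');
const entries = runFragment(
  fragment,
  '<a href="/person/1">Ada Lovelace</a>',
  'https://example.org'
);
console.log(entries); // [{ id: '/person/1', name: 'Ada Lovelace', url: 'https://example.org/person/1' }]
```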

I can now use that web interface to retrieve, create, test, and save code, without having to touch the command line at all.

In an ideal world, I would let everyone add and edit code here; however, since the framework executes PHP code, this would open the way for all kinds of malicious attacks. I cannot think of a way to safeguard against (deliberate or accidental) destructive code, though I have put some mitigations in place in case I make a mistake. So, for now, you can look, but you can’t touch. If you want to contribute code (new or patches), please give it to me, and I’ll be happy to add it!

This code migration is just in its infancy; so far, I support four functions, with a total of 591 code fragments. Many more to come, over time.

Selenium Ruby framework deprecated

09:14, Monday, 09 2020 March UTC

This is your friendly but final warning that we are replacing Selenium tests written in Ruby with tests in Node.js. There will be no more reminders. The Ruby stack will no longer be maintained. For more information see T139740 and T173488.

Extensive documentation is available at mediawiki.org. If you need help with the migration, I am available for pairing and code review (zfilipin in Gerrit, zeljkof in #wikimedia-releng).

To see how to write a test, watch the Selenium tests in Node.js tech talk (J78).
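
For a first orientation, a bare-bones Node.js browser test might look roughly like the sketch below. It uses the plain selenium-webdriver npm package with an assumed target page and element id; the framework and helpers documented at mediawiki.org are the authoritative reference, so treat this only as a taste of the Node.js style:

```typescript
// Minimal sketch of a Node.js Selenium test using the selenium-webdriver
// package. The target URL and the "searchInput" element id are assumptions
// for illustration; see mediawiki.org for the supported MediaWiki setup.
import { Builder, By, until } from 'selenium-webdriver';

async function main(): Promise<void> {
  // Start a browser session (assumes a local chromedriver is available).
  const driver = await new Builder().forBrowser('chrome').build();
  try {
    // Open a wiki page and wait until it has loaded.
    await driver.get('https://en.wikipedia.org/wiki/Main_Page');
    await driver.wait(until.titleContains('Wikipedia'), 10000);

    // Type a query into the search box (assumed element id).
    const search = await driver.findElement(By.id('searchInput'));
    await search.sendKeys('Selenium');
  } finally {
    // Always close the browser, even if a step failed.
    await driver.quit();
  }
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```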

Tech News issue #11, 2020 (March 9, 2020)

00:00, Monday, 09 2020 March UTC

weeklyOSM 502

12:32, Sunday, 08 2020 March UTC

25/02/2020-02/03/2020

Lead picture: OSM-FR reminder “Attribution is not an option!” 1 | © Picture by Christian Quest | © map data OpenStreetMap contributors

Mapping

  • Markus Peloso created a proposal for locations used to share surplus food with others, and asks the readers of the Tagging mailing list for their opinions. The proposed tag is analogous to give_box and public_bookcase, but for food.
  • Martin Koppenhoefer has revived the 2014 proposal, originally submitted by Hno, for amenity=student_accommodation and asks for comments.
  • Rory McCann asks whether there is a tag for marking information boards which display OSM-based maps. (Nabble)
  • Crossing Highways (roads) in India is a map of spots in India where highways cross or overlap each other without sharing a node. Users can see details and click to open the same location in the OpenStreetMap iD editor to fix it. The map updates daily. It was created by processing the output of OSMLint. The frontend map uses leaflet-simple-csv. Note: it may contain false positives such as pedestrian overbridges.
  • The European Water Project’s proposal for drinking_water:refill=<yes/no> and drinking_water:refill_scheme=<scheme-name/multiple> was approved.
  • Brian Prangle enjoyed a short 90-minute stroll, on a crisp showery early spring day, around Ward End Park in Birmingham (UK). In a blog post on Mappa Mercia, Brian explains not only how his walk contributed to the OSMUK first quarter project, but also some of the history of Ward End Park and his observations on how the park has changed since it was last mapped in 2009.
  • Manonv and Kateregga1 created a proposal for the tag place=refugee_site to solve the lack of consensus within the OSM community regarding the way to reference refugee camps in OSM.
  • Mateusz Konieczny published statistics on how many OSM elements were edited by each StreetComplete quest.

Community

  • Gerry McGovern will give a series of talks at An Event Apart about the environmental impact of digitalisation and will propose ways to improve it. He is looking for arguments as to why OSM is more environmentally friendly than Google Maps.
  • mavl detailed the first 3000 of more than 7500 issues reported at openstreetmap.org in a diary post. The reader can learn, for example, that most reported issues are about spam, and that when a user is reported it is usually because of vandalism.
  • Ibrahima Sory Diallo, a Masters student in ‘Geography – urban spaces and societies’ at Gaston Berger University (UGB), Saint-Louis, Senegal, shares his experience about the inaugural Africa Geospatial Data and Internet Conference (AGDIC2019) he attended in Accra, Ghana on the YouthMappers blog.

Imports

  • Deane Kensok is planning an import of buildings for Flagstaff, Arizona, USA. He has detailed his plan to import the data, provided by City of Flagstaff GIS Team, in the OSM wiki.
  • Stefan Baebler seeks advice on how to improve municipal and city boundaries for Slovenia in OSM following the release of new official border data. The new boundaries align with cadastral parcels. This means they do not fit to the data already in OSM as well as the older public data did.

Events

  • The local community of OpenStreetMap, OSGeo/FOSS4G, and other map-py enthusiasts in the Philippines announced the upcoming Pista ng Mapa (Festival of Maps) 2020 conference, to be held in Cebu City from 27 to 29 May 2020. For updates, follow them on Twitter.
  • The State of the Map US 2020 will be held from 5 to 7 November in Tucson, Arizona. The organising committee is holding a logo design competition; entries are due by 22 March. If you want to do more than attend, the committee is looking for help with organising.

Humanitarian OSM

  • MapSwipe has won the Global Mobile Award for the Best Mobile Innovation Supporting Emergency or Humanitarian Situations. The GIScience News Blog explains the aims and background of the MapSwipe project.

switch2OSM

  • The city of Düsseldorf, in Germany, in cooperation with surrounding municipalities has updated its official maps and now combines cadastral with OSM data. In an article they mention (de) (automatic translation) the advantages of the combination.

Licences

  • Christian Quest reports on Twitter and talk-fr about the OSM-fr ‘Reminder tile actions’ to address the lack of OSM attribution by some sites using the tile.openstreetmap.fr tile service. Several websites reacted in under 24 hours, and added the appropriate attribution.

Software

  • Binnette wrote an article, in his diary, on how he became a ‘noob contributor’ to uMap. He is looking for translators and contributors (see his article for details).

Programming

  • Frederic Rodrigo’s article, ‘ImpOsm2pgRouting Route planning on OpenStreetMap road network with the benefit of updates’, presents the ImpOsm2pgRouting programming tool that can import OSM data, including the pbf format, into PostgreSQL/PostGIS and transform it to the PgRouting data model. This allows synchronisation of a PgRouting database with OSM data. OSM highway lines are cut at each crossing, and the characteristics of each road network segment are kept to calculate the route with the ‘lowest cost’, respecting various constraints such as road speed, one-way restrictions, etc.

Releases

  • OsmAnd has released version 3.6 of their Android app. The new version includes improved navigation profiles, the use of exit numbers in navigation, a direct-to-point navigation mode for marine users, and an Antarctica map.
  • Florimond Berthoux announced the release of a new version of CyclOSM, on its first anniversary. In his mail he briefly mentions the rendering improvements made in recent versions.

Did you know …

Other “geo” things

  • MobiGyaan ran an article about everything you need to know about NavIC. Navigation with Indian Constellation (NavIC) is India’s autonomous regional satellite navigation system.
  • Google Earth is finally available in browsers other than Chrome. Google beta tested a switch from its NaCl implementation to WebAssembly over the past six months, and it has successfully led to the launch of Google Earth for Firefox, Edge, and Opera. If you’d like to try out Google Earth in a web browser it is available at Google’s site.
  • HERE has released Geodata Models to help the telecommunications industry plan and deploy 5G wireless networks. The physical characteristics of mmWave 5G networks may require operators to install up to 10 times as many cell sites per km2 compared to 4G networks, so planning for the networks will require much more precise location data.
  • Far & Wide has created a gallery of their favourite maps from Terrible Maps.
  • It may surprise you to learn that paper maps still sell. Sales of print maps and road atlases have had a five-year compound annual growth rate of 10% in the USA. Edward Baig’s article in USA Today explains why you’ll never tear some folks away from their paper maps.

Upcoming Events

Where What When Country
Dortmund Mappertreffen 2020-03-06 germany
Riga State of the Map Baltics 2020-03-06 latvia
Amagasaki GPSで絵を描いてみようじゃあ~りませんか 2020-03-07 japan
Rennes Réunion mensuelle 2020-03-09 france
Grenoble Rencontre mensuelle 2020-03-09 france
Taipei OSM x Wikidata #14 2020-03-09 taiwan
Toronto Toronto Mappy Hour 2020-03-09 canada
Hamburg Hamburger Mappertreffen 2020-03-10 germany
Lyon Rencontre mensuelle 2020-03-10 france
Zurich 115. OSM Meetup Zurich 2020-03-11 switzerland
Hanover OSM-Sprechstunde 2020-03-11 germany
Freiburg FOSSGIS-Konferenz 2020-03-11-2020-03-14 germany
Nantes Rencontre mensuelle 2020-03-12 france
Berlin 141. Berlin-Brandenburg Stammtisch 2020-03-12 germany
Ulmer Alb Stammtisch Ulmer Alb 2020-03-12 germany
Chemnitz Chemnitzer Linux-Tage 2020-03-14-2020-03-15 germany
Freiburg OSM-Samstag @ FOSSGIS-Konferenz 2020-03-14 germany
Nottingham Nottingham pub meetup 2020-03-17 united kingdom
Lüneburg Lüneburger Mappertreffen 2020-03-17 germany
Cologne Bonn Airport 127. Bonner OSM-Stammtisch 2020-03-17 germany
Hanover OSM-Sprechstunde 2020-03-18 germany
Hanover Stammtisch Hannover 2020-03-19 germany
Munich Münchner Treffen 2020-03-19 germany
San José Civic Hack & Map Night 2020-03-19 united states
Bremen Bremer Mappertreffen 2020-03-23 germany
Reading Reading Missing Maps Mapathon 2020-03-24 united kingdom
Hanover OSM-Sprechstunde 2020-03-25 germany
Prešov Missing Maps Mapathon Prešov #5 2020-03-26 slovakia
Lübeck Lübecker Mappertreffen 2020-03-26 germany
Valcea EuYoutH OSM Meeting 2020-04-27-2020-05-01 romania
Guarda EuYoutH OSM Meeting 2020-06-24-2020-06-28 spain
Cape Town HOT Summit 2020-07-01-2020-07-02 south africa
Cape Town State of the Map 2020 2020-07-03-2020-07-05 south africa

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Elizabete, NunoMASAzevedo, PierZen, Polyglot, Rogehm, SK53, Sammyhawkrad, SunCobalt, TheSwavu, derFred.

A buggy history

08:13, Saturday, 07 2020 March UTC
—I suppose you are an entomologist?—I said with a note of interrogation.
—Not quite so ambitious as that, sir. I should like to put my eyes on the individual entitled to that name! A society may call itself an Entomological Society, but the man who arrogates such a broad title as that to himself, in the present state of science, is a pretender, sir, a dilettante, an impostor! No man can be truly called an entomologist, sir; the subject is too vast for any single human intelligence to grasp.
The Poet at the Breakfast Table (1872) by Oliver Wendell Holmes, Sr. 
 
A collection of biographies with surprising gaps (e.g. A.D. Imms)
The history of Indian interest in insects has been approached by many writers and there are several bits and pieces available in journals and various insights distributed across books. There are numerous ways of looking at how people viewed insects over time. One of these is a collection of biographies, some of which include uncited verbatim (and not even within quotation marks) accounts from obituaries. This collation is by B.R. Subba Rao, who also provides a little historical supporting material to thread together the biographies. Keeping Indian expectations in view, both Subba Rao and the agricultural entomologist M.A. Husain play to the crowd in their versions. Husain wrote in pre-Independence times when there was a need for Indians to assert themselves in their conflict with colonial masters. They begin with mentions of insects in ancient Indian texts and, as can be expected, there are mentions of honey, shellac, bees, ants, and a few nuisance insects. Husain takes the fact that the term Satpada षट्पद or six-legs existed in the 1st century Amarakosa to make the claim that Indians were far ahead of their time, because Latreille's Hexapoda, the supposed analogy, was proposed only in 1825. Such one-upmanship misses the fact that science is not just about terms but also about structures, and one can only assume that they failed to find the development of such structured ideas in the ancient texts that they examined. The identification of species in old texts also leaves one wondering about the accuracy of translations. For instance K.N. Dave translates a verse from the Atharva-veda and suggests an early date for knowledge of shellac. This interpretation looks dubious and, sure enough, Dave has been critiqued by an entomologist, Mahdihassan. One organism named in the texts as the indragopa (Indra's cowherd) is supposedly something that appears after the rains. Sanskrit scholars have, remarkably enough, identified it with a confidence that no coccidologist ever had as the cochineal insect (the species Dactylopius coccus is South American!), while still others identify it as a lac insect, a firefly(!) and as Trombidium (red velvet mites) - the last matches the blood red colour mentioned in a text attributed to Susrutha. To be fair, ambiguities resulting from translation are not limited to those that deal with Indian writing. Dikairon (Δικαιρον), supposedly a highly-valued and potent poison from India, was mentioned in the work Indika by Ctesias (398-397 BC). One writer said it was the droppings of a bird. Valentine Ball thought it was derived from a scarab beetle. Jeffrey Lockwood claimed that it came from the rove beetles Paederus sp. And finally a Spanish scholar states that all this was a misunderstanding and that Dikairon was not a poison, and, believe it or not, was a masticated mix of betel leaves, arecanut, and lime! One gets a far more reliable idea of ancient knowledge and traditions from practitioners, forest dwellers, the traditional honey-harvesting tribes, and similar people that have been gathering materials such as shellac and beeswax. Unfortunately, many of these traditions and their practitioners are threatened by modern laws, economics, and cultural prejudice. These practitioners are being driven out of the forests where they live, and their knowledge was hardly ever captured in writing. The writers of the ancient Sanskrit texts were probably associated with temple-towns and other semi-urban clusters and it seems like the knowledge of forest dwellers was never considered merit-worthy.

A more meaningful overview of entomology may be gained by reading and synthesizing a large number of historical bits, of which there are a growing number. The 1973 book published by Annual Reviews Inc. should be of some interest. I have appended a selection of sources that I have found useful in adding bits and pieces to form a historic view of entomology in India. It helps, however, to have a broader skeleton on which to attach these bits and minutiae. Here, there are also truly verbose and terminology-filled systems developed by historians of science (for example, see ANT). I prefer an approach that is free of jargon overload and like to look at entomology and its growth along three lines of action - cataloguing, where the main products are collections of artefacts and the assignment of names; communication and vocabulary-building, social actions involving groups of interested people who work together, with the products being scholarly societies and journals; and pattern-finding, where hypotheses are made and predictions tested. I like to think that anyone learning entomology also goes through these activities, often in this sequence. With professionalization there appears to be a need for people to step faster and faster into the pattern-finding way, which also means that less time is spent on the other two streams of activity. The fast stepping is often achieved by having comprehensive texts, keys, identification guides and manuals. The skills involved in the production of those works - ways to prepare specimens, observe, illustrate, or describe - are often not captured by the books themselves.

Cataloguing

The cataloguing phase of knowledge gathering, especially of the (larger and more conspicuous) insect species of India, grew rapidly thanks to the craze for natural history cabinets of the wealthy (made socially meritorious by the idea that appreciating the works of the Creator was as good as attending church) in Britain and Europe, and their ability to tap into networks of collectors working within the colonial enterprise. The cataloguing phase can be divided into the non-scientific cabinet-of-curiosity style, especially followed before Darwin, and the more scientific forms. The idea that insects could be preserved by drying and kept for reference by pinning [see Barnard 2018], the system of binomial names, the idea of designating type specimens that could be inspected by anyone describing new species, and the system of priority in assigning names were some of the innovations and cultural rules created to aid cataloguing. These rules were enforced by scholarly societies, their members (which would later lead to such things as codes of nomenclature suggested by rule makers like Strickland, now dealt with by committees that oversee the ICZN Code) and their journals. It would be wrong to assume that the cataloguing phase is purely historic and no longer needed. It is a phase that is constantly involved in the creation of new knowledge. Labels, catalogues, and referencing, whether in science or librarianship, are essential for all subsequent work to be discovered, and are essential to science based on building on the work of others, climbing the shoulders of giants to see further. Cataloguing was probably what the physicists derided as "stamp-collecting".

Communication and vocabulary building

The other phase involves social activities, the creation of specialist language, groups, and "culture". The methods and tools adopted by specialists also help in producing associations and the identification of boundaries that could spawn new associations. The formation of groups of people based on interests is something that ethnographers and sociologists have examined in the context of science. Textbooks, taxonomic monographs, and major syntheses also help in building community - they make it possible for new entrants to rapidly move on to joining the earlier formed groups of experts. Whereas some of the early learned societies were spawned by people with wealth and leisure, some of the later societies have had other economic forces in their support.

Like species, interest groups too specialize and split to cover more specific niches, such as those that deal with applied areas like agriculture, medicine, veterinary science and forensics. There can also be interest in behaviour and evolution, which, though having applications, often do not find economic support.

Pattern finding
Eleanor Ormerod, an unexpected influence in the rise of economic entomology in India

The pattern-finding phase, when reached, allows a field to become professional - with paid services offered by practitioners. It is the phase in which science flexes its muscle, specialists gain social status, and are able to make livelihoods out of their interest. Lefroy (1904) cites economic entomology as starting with E.C. Cotes [Cotes' career in entomology was short; after marrying the famous Canadian journalist Sara Duncan in 1889 he too moved to writing] in the Indian Museum in 1888. But he surprisingly does not mention any earlier attempts, and one finds that Edward Balfour, that encyclopaedic surgeon of Madras, collated a list of insect pests in 1887 and drew inspiration from Eleanor Ormerod, who hinted at the idea of getting government support, noting that it would cost very little given that she herself worked with no remuneration to provide a service for agriculture in England. Her letters were also forwarded to the Secretary of State for India and it is quite possible that Cotes' appointment was a result.

As can be imagined, economics, society, and the way science is supported - royal patronage, family, state, "free markets", crowd-sourcing, or mixes of these - impact the way an individual or a field progresses. Entomology was among the first fields of zoology that managed to gain economic value with the possibility of paid employment. David Lack, who later became an influential ornithologist, was wisely guided by his father to pursue entomology as it was the only field of zoology where jobs existed. Lack, however, found his apprenticeship (in Germany, 1929!) involving pinning specimens "extremely boring".

Indian reflections on the history of entomology

Kunhikannan died at the rather young age of 47
A rather interesting analysis of Indian science is made by the first native Indian entomologist to work with the official title of "entomologist" in the state of Mysore - K. Kunhikannan. Kunhikannan was deputed to pursue a Ph.D. at Stanford (for some unknown reason many of the pre-Independence Indian entomologists trained at Stanford rather than in England - see postscript) through his superior Leslie Coleman. At Stanford, Kunhikannan gave a talk on Science in India. He noted in his 1923 talk:

In the field of natural sciences the Hindus did not make any progress. The classifications of animals and plants are very crude. It seems to me possible that this singular lack of interest in this branch of knowledge was due to the love of animal life. It is difficult for Westerners to realise how deep it is among Indians. The observant traveller will come across people trailing sugar as they walk along streets so that ants may have a supply, and there are priests in certain sects who veil that face while reading sacred books that they may avoid drawing in with their breath and killing any small unwary insects. [Note: Salim Ali expressed a similar view ]
He then examines science sponsored by state institutions, by universities and then by individuals. About the last he writes:
Though I deal with it last it is the first in importance. Under it has to be included all the work done by individuals who are not in Government employment or who being government servants devote their leisure hours to science. A number of missionaries come under this category. They have done considerable work mainly in the natural sciences. There are also medical men who devote their leisure hours to science. The discovery of the transmission of malaria was made not during the course of Government work. These men have not received much encouragement for research or reward for research, but they deserve the highest praise. European officials in other walks of life have made signal contributions to science. The fascinating volumes of E. H. Aitken and Douglas Dewar are the result of observations made in the field of natural history in the course of official duties. Men like these have formed themselves into an association, and a journal is published by the Bombay Natural History Association[sic], in which valuable observations are recorded from time to time. That publication has been running for over a quarter of a century, and its volumes are a mine of interesting information with regard to the natural history of India.
This then is a brief survey of the work done in India. As you will see it is very little, regard being had to the extent of the country and the size of her population. I have tried to explain why Indians' contribution is as yet so little, how education has been defective and how opportunities have been few. Men do not go after scientific research when reward is so little and facilities so few. But there are those who will say that science must be pursued for its own sake. That view is narrow and does not take into account the origin and course of scientific research. Men began to pursue science for the sake of material progress. The Arab alchemists started chemistry in the hope of discovering a method of making gold. So it has been all along and even now in the 20th century the cry is often heard that scientific research is pursued with too little regard for its immediate usefulness to man. The passion for science for its own sake has developed largely as a result of the enormous growth of each of the sciences beyond the grasp of individual minds so that a division between pure and applied science has become necessary. The charge therefore that Indians have failed to pursue science for its own sake is not justified. Science flourishes where the application of its results makes possible the advancement of the individual and the community as a whole. It requires a leisured class free from anxieties of obtaining livelihood or capable of appreciating the value of scientific work. Such a class does not exist in India. The leisured classes in India are not yet educated sufficiently to honour scientific men.
It is interesting that leisure is noted as important for scientific advance. Edward Balfour, mentioned earlier, made a similar comment that Indians were too close to subsistence to reflect accurately on their environment (apparently in The Vydian and the Hakim, what do they know of medicine? (1875), which unfortunately is not available online).

Kunhikannan may be among the few Indian scientists who dabbled in cultural history and political theorizing. He wrote two rather interesting books, The West (1927) and A Civilization at Bay (1931, published posthumously), which defended Indian cultural norms while also suggesting areas for reform. While reading these works one has to remind oneself that he was working under and with Europeans and may not have been able to have many conversations on such topics with Indians. An anonymous writer who penned the memoir of his life in his posthumous work notes that he was reserved and had only a small number of people to talk to outside of his professional work.
Entomologists meeting at Pusa in 1919
Third row: C.C. Ghosh (assistant entomologist), Ram Saran ("field man"), Gupta, P.V. Isaac, Y. Ramachandra Rao, Afzal Husain, Ojha, A. Haq
Second row: M. Zaharuddin, C.S. Misra, D. Naoroji, Harchand Singh, G.R. Dutt (Personal Assistant to the Imperial Entomologist), E.S. David (Entomological Assistant, United Provinces), K. Kunhi Kannan, Ramrao S. Kasergode (Assistant Professor of Entomology, Poona), J.L.Khare (lecturer in entomology, Nagpur), T.N. Jhaveri (assistant entomologist, Bombay), V.G.Deshpande, R. Madhavan Pillai (Entomological Assistant, Travancore), Patel, Ahmad Mujtaba (head fieldman), P.C. Sen
First row: Capt. Froilano de Mello, W Robertson-Brown (agricultural officer, NWFP), S. Higginbotham, C.M. Inglis, C.F.C. Beeson, Dr Lewis Henry Gough (entomologist in Egypt), Bainbrigge Fletcher, Bentley, Senior-White, T.V. Rama Krishna Ayyar, C.M. Hutchinson, Andrews, H.L.Dutt


Entomologists meeting at Pusa in 1923
Fifth row (standing) Mukerjee, G.D.Ojha, Bashir, Torabaz Khan, D.P. Singh
Fourth row (standing) M.O.T. Iyengar (a malariologist), R.N. Singh, S. Sultan Ahmad, G.D. Misra, Sharma, Ahmad Mujtaba, Mohammad Shaffi
Third row (standing) Rao Sahib Y Rama Chandra Rao, D Naoroji, G.R.Dutt, Rai Bahadur C.S. Misra, SCJ Bennett (bacteriologist, Muktesar), P.V. Isaac, T.M. Timoney, Harchand Singh, S.K.Sen
Second row (seated) Mr M. Afzal Husain, Major RWG Hingston, Dr C F C Beeson, T. Bainbrigge Fletcher, P.B. Richards, J.T. Edwards, Major J.A. Sinton
First row (seated) Rai Sahib PN Das, B B Bose, Ram Saran, R.V. Pillai, M.B. Menon, V.R. Phadke (veterinary college, Bombay)

Note: As usual, these notes are spin-offs from researching and writing Wikipedia entries, in this case on several pioneering Indian entomologists. It is remarkable that even some people in high office, such as P.V. Isaac, the last Imperial Entomologist, and grandfather of noted writer Arundhati Roy, are largely unknown (except as the near-fictional Pappachi in Roy's God of Small Things).


Further reading
An index to entomologists who worked in India or described a significant number of species from India - with links to Wikipedia (where possible - the gaps in coverage of entomologists in general are too many)
(woefully incomplete - feel free to let me know of additional candidates)

Carl Linnaeus - Johan Christian Fabricius - Edward Donovan - John Gerard Koenig - John Obadiah Westwood - Frederick William Hope - George Alexander James Rothney - Thomas de Grey Walsingham - Henry John Elwes - Victor Motschulsky - Charles Swinhoe - John William Yerbury - Edward Yerbury Watson - Peter Cameron - Charles George Nurse - H.C. Tytler - Arthur Henry Eyre Mosse - W.H. Evans - Frederic Moore - John Henry Leech - Charles Augustus de Niceville - Thomas Nelson Annandale - R.C. Wroughton - T.R.D. Bell - Francis Buchanan-Hamilton - James Wood-Mason - Frederic Charles Fraser - R.W. Hingston - Auguste Forel - James Davidson - E.H. Aitken - O.C. Ollenbach - Frank Hannyngton - Martin Ephraim Mosley - Hamilton J. Druce - Thomas Vincent Campbell - Gilbert Edward James Nixon - Malcolm Cameron - G.F. Hampson - Martin Jacoby - W.F. Kirby - W.L. Distant - C.T. Bingham - G.J. Arrow - Claude Morley - Malcolm Burr - Samarendra Maulik - Guy Marshall
 
Edward Percy Stebbing - T.B. Fletcher - Edward Ernest Green - E.C. Cotes - Harold Maxwell Lefroy - Frank Milburn Howlett - S.R. Christophers - Leslie C. Coleman - T.V. Ramakrishna Ayyar - Yelsetti Ramachandra Rao - Magadi Puttarudriah - Hem Singh Pruthi - Shyam Sunder Lal Pradhan - James Molesworth Gardner - Vakittur Prabhakar Rao - D.N. Raychoudhary - C.F.W. Muesebeck  - Mithan Lal Roonwal - Ennapada S. Narayanan - M.S. Mani - T.N. Ananthakrishnan - Muhammad Afzal Husain

Not included by Rao -   F.H. Gravely - P.V. Isaac - M. Afzal Husain - A.D. Imms - C.F.C. Beeson
 - C. Brooke Worth - Kumar Krishna - M.O.T. Iyengar - K. Kunhikannan


PS: Thanks to Prof C.A. Viraktamath, I became aware of a new book - Gunathilagaraj, K.; Chitra, N.; Kuttalam, S.; Ramaraju, K. (2018). Dr. T.V. Ramakrishna Ayyar: The Entomologist. Coimbatore: Tamil Nadu Agricultural University. This suggests that TVRA went to Stanford on the suggestion of Kunhikannan.

    Update: On March 14, 2020, the Wikimedia Foundation announced new measures to support employees and protect general public health amid COVID-19. As part of these operational actions, we have:

    • Closed both Foundation offices in San Francisco and Washington, DC, until at least March 31, 2020. All staff are now working remotely. 
    • Shifted to a reduced work week. Expectations are that staff may work 20 hours a week if necessary, and all will be paid according to their usual work schedules.
    • Waived normal sick time requirements for staff who are ill or caring for others.
    • Guaranteed all contract and hourly workers full compensation for planned hours worked.
    • Cancelled all near-term, in-person gatherings until the World Health Organization declares the pandemic over.

     

    Importantly, we have shifted our priorities to essential work, including keeping Wikipedia online and available for the world as a critical informational resource. Please see here for real-time updates about our COVID-19 response, as well as related resources.


     

    The Wikimedia Foundation is closely monitoring developments with respect to the Coronavirus (COVID-19) and its potential impact on our staff and the communities in which we all live. Today, we’re sharing steps we’re taking to protect our employees and how we plan to do our part to prevent the spread of COVID-19.

    For the remainder of March 2020, our San Francisco office will be closed to staff and visitors. We are putting in place measures to ensure that our San Francisco-based staff have resources and support to continue working remotely. Our Washington D.C. office will remain open for the time being, though we are encouraging everyone to take precautions to protect themselves and their communities and to work remotely where possible.

    We have also temporarily suspended nonessential travel for all staff, and instituted a risk review process for any travel considered essential. In addition, we have been in touch with members of the Wikimedia movement community with respect to upcoming events and are taking necessary steps to cancel or postpone these, based on potential risks.

    To ensure we continue to evaluate and take action to limit the spread of COVID-19, we have established a responding team of staff to monitor new developments and determine the appropriate measures as the situation further develops. This team will also be tasked with ensuring we have clear, actionable protocols and plans to maintain the continuity of technical needs to provide free knowledge for our hundreds of millions of users around the world.

    Approximately 64 percent of Wikimedia Foundation staff are remote, and so we do not anticipate a major disruption in our work. That said, we’re continuing to evaluate and take necessary measures to meet the organization’s goals and priorities.

    We encourage our staff, partners, volunteer communities, and everyone to take care of themselves during this time, and recognize the role each of us can play in not only limiting the spread of the disease for ourselves, but also for the communities we all live in. Stay safe, be well, and we’ll keep you updated as the situation develops.

    Katherine Maher
    Wikimedia Foundation Chief Executive Officer and Executive Director

    What are the implications if the world’s leading source of online information highlights the accomplishments of elite Indigenous athletes? It’s a question that Dr. Vicky Paraschak’s class at the University of Windsor surely discussed as they created brand new Wikipedia biographies for Indigenous athletes from Canada as an assignment last fall.

    Tony Cote, the first elected Chief of the Cote First Nations and creator of the Saskatchewan First Nations Summer and Winter Games, has a brand new page. As a student from Dr. Paraschak’s Spring 2019 course wrote in the Wikipedia page for the World Indigenous Games (WIN), sporting events like the one Tony Cote created “have become a means to project positive images and garner social, political, and/or economic benefits for their communities. Organizers and Indigenous stakeholders wanted to use the WIN Games [in particular] to address challenges faced by Indigenous communities such as: stereotypes, lack of resources and opportunities for Indigenous youth, and vulnerability of Indigenous women.”

    And head on over to Joy SpearChief-Morris’s new biography to learn about her career as a track champion. She received the 2017 Tom Longboat Award, “awarded annually by the Aboriginal Sport Circle to the most outstanding male and female Indigenous athletes in Canada;” “holds the 115th position in the World Ranking of the International Association of Athletics Federations for 100-meter hurdles;” and is training to compete in the 2020 Summer Olympics.

    The world can also now read about taekwondo athlete Sara-Lynne Knockwood, who is not only a Miꞌkmaq hall of famer, but also a North American Indigenous Games multi-gold medalist!

    The 6 students who completed the assignment added more than 1,000 words each and their work has already been viewed 1,400 times! We can all agree that that’s quite a bit more than if they had written a term paper that only their instructor would read…

    Screenshot of Dr. Paraschak’s Dashboard course page shows the impact students made on Wikipedia during the Fall 2019 term.

    As a professor at the University of Windsor, Dr. Paraschak focuses on “efforts to enhance reconciliation with Indigenous peoples in Canada specifically in the area of physical activity, in keeping with the five relevant calls to action found in the Truth and Reconciliation Commission summary report.” [1] The Wikipedia writing assignment has played a role in her work since 2017 and since then, students have created or improved 194 Wikipedia pages within three main categories: First Nations sportspeople, Métis sportspeople and Canadian Inuit sportspeople.

    Screenshot shows the total impact that Dr. Paraschak’s students have made on Wikipedia since the Spring 2017 term.

    Her commitment to the Wikipedia assignment shows the real impact that students can have on expanding Wikipedia’s coverage of Indigenous achievements. Well done!


    Interested in incorporating a Wikipedia writing assignment into a future course? Visit teach.wikiedu.org for all you need to know to get started.

    Teaching students that their words have power

    19:48, Thursday, 05 2020 March UTC

    “College professors tend not to allow you to include a Wikipedia article as a citation in a paper you write. So what are you doing encouraging your students to go to Wikipedia?” Houston Matters host Craig Cohen asked Dr. Melissa Weininger in a radio interview last week.

    Dr. Melissa Weininger (by Jeff Fitlow).

    “Well, I’m not encouraging them to go to Wikipedia but to write well-sourced, well researched Wikipedia entries,” Dr. Weininger clarified. “We all know that people do use Wikipedia. Even if you’re not allowed to use it for your college course, we certainly look to Wikipedia when we need a little bit of information about something. And so it’s really important that the information that’s on Wikipedia is reliable.”

    Dr. Weininger began teaching a Wikipedia writing assignment in her course, Sex and Gender in Modern Jewish Culture, at Rice University last fall. Through it, students discussed issues of inequity in Wikipedia’s content and the importance of access to accurate, verifiable information.

    “I think we’ve all learned in the last years about how important it is for everybody, but students in particular, to be able to differentiate between reliable and unreliable information on the internet,” Dr. Weininger continued. “So one of the things the students learned in doing the project is—it was a way of reverse engineering that process. They learned how Wikipedia entries are built and therefore what their strengths and weaknesses can be.”

    The lack of equal representation of biographies of women on Wikipedia “was one of the reasons for starting this project.” Dr. Weininger was interested in exploring writing as activism with students.

    “Part of that is to teach the students that their words and their actions have power, that the information that is available to us on the internet can also be a source of power. If there’s less information available to us about women, we learn that women aren’t as valuable in our culture and we simply don’t have access to their stories. So the assignments for the class were all structured around this idea of not just improving our access to information about women on Wikipedia, but thinking about how what we do in the classroom, what we learn in the classroom, and how our own writing can really be of benefit to the public sphere in general.”

    Sarah Silberman (by Katharine Shilcutt).

    Dr. Weininger was joined by one of her former students who had completed the assignment, Sarah Silberman, who drastically improved Jewish social historian Paula Hyman’s biography (see the Dashboard’s Authorship Highlighting tool in action here!).

    “At Rice I’m a double history and French major,” Sarah shared. “So in other Jewish history classes I actually read a good number of Paula Hyman articles and I read excerpts from books she had written. I really wanted to write about her because looking at her Wikipedia article, it just looked so sad and sparse in comparison to all the amazing things she’s done in her life and I wanted people to know that.”

    “There’s a potential downside to crowdsourcing on Wikipedia,” the interviewer suggested, “where someone like you, Sarah, will write this extensive article, have well-cited sources, and then somebody else will come along and decide to change it without any particular citation. Did you track your article to see if others were coming along and making tweaks to it?”

    “I have actually gone back a few times to make sure everything with Paula’s article is tip-top shape and so far no one has changed anything,” Sarah responded. “But even so, it tends to be that people on Wikipedia change things for the better. If my language was too subjective, citing too much of an opinion rather than a fact, then it would be changing something like that.”

    “When my students published their entries, I immediately started circulating them to other people,” said Dr. Weininger. “And to be honest, other academics were really excited about it and excited to see that my students had really improved the quality of entries about these women who were extremely important.”

    “Wikipedia itself might not be a reliable source as a place to look for all of your information,” she noted. “But it does actually have pretty stringent citation standards. So one of the things students also learn from this is the way to fact-check Wikipedia is by going back to the original sources because they’re all cited at the end of the article.”

    “A lot of my students have grown up with Wikipedia and Googling things, but haven’t thought very much about where that information comes from. And when you yourself write an entry, all of a sudden all of that is open to you. And you have a much clearer understanding of how these things are put together, maybe who’s putting these things together, what’s involved in it, what the potential pitfalls are, and also the potential advantages.”

    “Most projects in a classroom-setting are done in the classroom,” the interviewer noted. “This is one that is sort of public record. Did you have any qualms about doing a project that is so out there?”

    “Well no,” Dr. Weininger answered, “and that’s mostly a testament to how great Rice students are and the students in this class were. I actually wasn’t worried about that at all. And there are also advantages that come with that risk—the public nature of this project. And that is that students get to see their work published. One of the great things to see was the day that the articles were due and they had to post them online. … They were all saying, ‘I wrote this and it’s on the internet!’ That’s a really exciting thing, and hopefully it gets them excited about doing it more now that they have the skills and the ability to do it. So there’s some risk involved, but a lot of reward. And I knew they were up to it and they really did such a fantastic job.”

    “I actually really did enjoy doing this project,” Sarah added. “I would love to continue doing this as a side hobby or something. … I think this should be more prominent in college classes and in university classes. I think that, like Dr. Weininger was saying, not only was it satisfying from a student perspective to see that my work was published and that it was doing a public good, but also it’s just a way for our work to have greater impact. Because a lot of times I write papers, they’ll go to my professor, my professor will read them, and then that’s the end of it. It doesn’t really go anywhere beyond that. So having a project that actually has some impact on the world is really important. And I think academia should take advantage of more opportunities like that.”


    To incorporate a Wikipedia writing assignment into an upcoming course, visit teach.wikiedu.org for access to free resources and assignment templates.


    To read more about Dr. Weininger’s class, check out this blog post.


    To listen to the original interview (which begins at minute 11:10), visit Houston Matters.


    Thumbnail and inset images by Jeff Fitlow and Katharine Shilcutt respectively (Rights reserved).

    Dr. Lydia Le Page is a postdoc at the University of California, San Francisco, where she images brain metabolism with MRI to understand Alzheimer’s disease. In our recent Wiki Scientists course sponsored by the National Science Policy Network, she was excited to improve Wikipedia pages that will help voters and policy-makers make the best use of research when voting on or developing policies. 

    “You read that on Wikipedia? Oh I wouldn’t trust that.”

    Lydia Le Page.
    Image by Pebbles1.0, CC BY-SA 4.0 via Wikimedia Commons.

    Wikipedia has come a long way since its inception in 2001. Dubious at first, I found myself using it more and more during my Chemistry undergraduate studies at Oxford. When I moved from learning about science to doing science as part of a PhD in diabetic physiology and metabolism using MRI, I found that even the most esoteric topics had pages (Hyperpolarized carbon-13 MRI gets ~11 views/day) – and to me that was invaluable!

    My research has since taken me to a postdoc in San Francisco, where I now study metabolism in the brain in Alzheimer’s disease using MRI. I’m also enthusiastic about science communication and policy. I love finding ways to share my understanding of science with other people. It’s important to me that scientific endeavors improve lives, and one way to do that is to use scientific evidence to inform policy changes. Although I’ve made blog posts, YouTube videos, and given talks about science, I had never edited Wikipedia content before, and until the Wiki Education course I had no idea how to start.

    After a brief application, I was excited to be sponsored by the National Science Policy Network to attend a Wiki Education course. For an hour a week for 12 weeks, I would get on a video call with policy-minded scientists (and their dogs) across America to learn about Wikipedia. The course was led by Wiki Educators Ryan and Elysia, who were infectiously enthusiastic about editing and great teachers. I can now write my own pages on science and policy-related topics – I wrote the page for the National Alzheimer’s Project Act!

    One thing I learnt was the high standard that Wikipedia edits are held to – edits that don’t meet the rules are typically quickly reversed. Wikipedia isn’t for sharing your own view – an encyclopedia is built from statements of fact, and these must have citations. But not any old source! I learnt that some sources are greater than others. If there aren’t several secondary sources about a topic (e.g. the page on the Queen can’t reference anything she wrote about herself), it shouldn’t be on Wikipedia. As a scientist, that meant that I couldn’t link to my latest paper on my research, but instead would need to wait until a review on the work was written by someone else! Although this means that Wikipedia may lag behind some developments, they avoid incorporating early studies that are later found to be unreproducible.

    The media, and especially the internet, has a reputation for being male biased, and Wikipedia is not without some of these issues. Did you know that pages about women regularly mention their marital status? Not so much for the men. Initiatives to address these issues are underway, such as the WikiProject ‘Women in Red’ – highlighting just how many women or works by women that ‘may qualify for an article on the English Wikipedia’ don’t have one. In October 2014, just over 15% of English Wikipedia’s biographies were about women; in January 2020 we’ve just exceeded 18%. We still have a way to go.

    One tool I found very useful when I started adding content was the Citation Hunt tool – a fascinating and easy way to improve Wikipedia. It shows you snippets of random pages that need citations; my favorite addition was to the page on Bela Lugosi – citations were required on the current whereabouts of his famous Dracula cape! (It’s in a museum in LA, by the way.)

    “Imagine a world in which every single person on the planet is given free access to the sum of all human knowledge. That’s what we’re doing.” – Jimmy Wales, Wikipedia co-founder

    The first English encyclopedia, the Encyclopædia Britannica, was published over 250 years ago in 1768. In January, English Wikipedia reached 6 million articles. Collating all the world’s knowledge, freely available to anyone, is truly one of the wonders of the modern world. Twenty years ago very few people even believed such a project was viable. Wikipedia has helped me to such a degree in my career, from young student to budding neuroscientist, that I am happy to be able to give something back.


    Interested in taking an introductory Wikipedia training course? Write Wikipedia biographies for women across disciplines and professions (here). To see all courses with open registration, visit learn.wikiedu.org.

    On Twitter, Janeen Uzzell praised a blog post, the Wikimedia Foundation All Hands: 2020 Sketchbook. It does inform about current thinking, and most of it is great; still, I find it absolutely terrifying.

    There are several great sketches in there. Katherine Maher gave an aspirational talk; I love the idea of Wikimedia being seen as infrastructural and inclusive, and even the notion that what we do does not have to happen in our projects. Important is that she mentions "support systems", because they provide the input for much of our processes.

    Also important is the page on security and risk. All the important concepts are mentioned, among them likelihood, relative impact, and management preparedness, but also "plan for and mitigate risks".

    What truly makes me uneasy is when it is said that we aim to clarify who we are in the world in one brand, Wikipedia. The idea is that when we are all branded as Wikipedia, things are likely to become easier. When you check out the website brandingwikipedia.org there is no argument; Wikipedia is free knowledge. When you check out what it sets out to do:
    • project and improve our reputation
    • support our movement/growth
    • be opt-in
    In the abstract Wikipedia IS wonderful; in reality the concept of what Wikipedia is, is largely determined by the English Wikipedia. It is fiercely independent, it is hardly inclusive, and it has largely determined the maneuvering space the Wikimedia Foundation has. In order to "plan for and mitigate risks", I will mention several reasons why I am anxious about this branding initiative.
    • In the Commons OTRS they use English Wikipedia notions to determine if pictures can stay or are to be removed. Commons provides a service to all Wikimedia projects.
    • The query functionality for Commons is maintained by people from the Foundation. For more than half a year it has put a strain on the growth and usefulness of Wikidata. Tools have become glacially slow and often malfunction because an edit is not available when needed in further processing. It is not known what the position of the WMF director is in this.
    • This is about marketing and we have never done much marketing for any of our projects. What we have done was reactive and has been all about the English Wikipedia. Now consider this:
      • Wikisource: we do not know what is available at what quality; it is all about editing and not about having people read the finished article. Consequently we do not value Wikisource or fulfill its potential.
      • So far Commons has always been English only. With the support of the "depicts" functionality, there is room to enable and market a multilingual search engine. In the spirit of "it is a Wiki", it serves as an open invite to add labels in any and all of our languages and open up what Commons has to offer. It is how to market free content the Wiki way.
      • In Wikidata we know many more concepts than any individual Wikipedia does. We could use our data and inform, as we have done for years in multilingual tools like Reasonator. This is an example in English, Russian, Chinese, and Kannada. NB it takes additional labels to improve results, and consequently this is the inclusive approach.
      • If Wikipedians were willing to reflect on their own performance, we could help them solve their false-friend issues.
    One sketch in the sketchbook is a presentation by Jess Wade. It says that even Academia is biased. As the Wikimedia community we do not need to be subservient to any bias and most certainly not the bias that Wikipedia has brought us.

    Run Selenium tests using Quibble and Docker

    14:35, Wednesday, 04 2020 March UTC

    Dependencies are Git, Python 3, and Docker Community Edition (CE).

    First, the general setup.

    $ git clone https://gerrit.wikimedia.org/r/p/integration/quibble
    ...
           
    $ cd quibble/
    
    $ python3 -m pip install -e .
    ...
    
    $ docker pull docker-registry.wikimedia.org/releng/quibble-stretch:latest
    ...
    (2m 26s)

    The simplest, and slowest, way to run Quibble.

    $ docker run -it --rm \
     docker-registry.wikimedia.org/releng/quibble-stretch:latest
    ...
    (12m 54s)

    Speed things up by using local repositories.

    $ mkdir -p ref/mediawiki/skins
    
    $ git clone --bare https://gerrit.wikimedia.org/r/mediawiki/core ref/mediawiki/core.git
    ...
    (3m 40s)
    
    $ git clone --bare https://gerrit.wikimedia.org/r/mediawiki/vendor ref/mediawiki/vendor.git
    ...
    
    $ git clone --bare https://gerrit.wikimedia.org/r/mediawiki/skins/Vector ref/mediawiki/skins/Vector.git
    ...
    
    $ mkdir cache
    $ chmod 777 cache
    
    $ mkdir -p log
    $ chmod 777 log
    
    $ mkdir -p src
    $ chmod 777 src
    
    $ docker run -it --rm \
      -v "$(pwd)"/cache:/cache \
      -v "$(pwd)"/log:/workspace/log \
      -v "$(pwd)"/ref:/srv/git:ro \
      -v "$(pwd)"/src:/workspace/src \
      docker-registry.wikimedia.org/releng/quibble-stretch:latest
    ...
    (18m 0s)

    The second run of everything, just to see if things get faster.

    $ docker run -it --rm \
      -v "$(pwd)"/cache:/cache \
      -v "$(pwd)"/log:/workspace/log \
      -v "$(pwd)"/ref:/srv/git:ro \
      -v "$(pwd)"/src:/workspace/src \
      docker-registry.wikimedia.org/releng/quibble-stretch:latest
    ...
    (16m 50s)

    If you get this error message

    A LocalSettings.php file has been detected. To upgrade this installation, please run update.php instead

    just remove the file

    $ rm src/LocalSettings.php

    Speed things up by skipping Zuul and not installing dependencies.

    $ docker run -it --rm \
      -v "$(pwd)"/cache:/cache \
      -v "$(pwd)"/log:/workspace/log \
      -v "$(pwd)"/ref:/srv/git:ro \
      -v "$(pwd)"/src:/workspace/src \
      docker-registry.wikimedia.org/releng/quibble-stretch:latest --skip-zuul --skip-deps
    ...
    (6m 17s)

    Speed things up by just running Selenium tests.

    $ docker run -it --rm \
      -v "$(pwd)"/cache:/cache \
      -v "$(pwd)"/log:/workspace/log \
      -v "$(pwd)"/ref:/srv/git:ro \
      -v "$(pwd)"/src:/workspace/src \
      docker-registry.wikimedia.org/releng/quibble-stretch:latest --skip-zuul --skip-deps --run selenium
    ...
    (1m 19s)

    "Building with Nature" .. a case for a beaver solution

    11:37, Tuesday, 03 2020 March UTC
    The Markermeer is a lake with an ecological problem: the water is cloudy, and plants and mussels do not grow. In order to alleviate that problem, the Marker Wadden was developed, and in order to future-proof the Houtribdijk the same "building with nature" concepts are used; the extensive water features will enable the growth of plants, and the intended result is not only that the water will be clear again but also that the dyke will better withstand future storms.

    With ecology part of the solution, it is worth considering ecology as part of the answer to two open issues: geese and willows. So far, geese are kept at bay in some areas with fences, and young willows are being rooted out by volunteers.

    When willows are allowed to grow, they will mature quickly and enable the next ecological succession. The wood and bark provide food and building material for beavers, and this makes for an even more robust defense against storm damage. Some trees will mature anyway, and this provides natural nesting places for white-tailed eagles. Given that the wels catfish is endemic in the Markermeer, it will find its place among the Marker Wadden, and it may even prey on the overabundant geese.

    So, given that Natuurmonumenten, the organisation looking after the Marker Wadden, is happy about beavers on its terrain, maybe it is the "building with nature" engineers who have to consider succession in their deliberations.
    Thanks,
          GerardM

    Perf Matters at Wikipedia 2015

    21:43, Monday, 02 2020 March UTC

    Hello, WANObjectCache

    This year we achieved another milestone in our multi-year effort to prepare Wikipedia for serving traffic from multiple data centres.

    The MediaWiki application that powers Wikipedia relies heavily on object caching. We use Memcached as a horizontally scaled key-value store, and we’d like to keep the cache local to each data centre. This minimises dependencies between data centres, and makes better use of storage capacity (based on local needs).

    Aaron Schulz devised a strategy that makes MediaWiki caching compatible with the requirements of a multi-DC architecture. Previously, when source data changed, MediaWiki would recompute and replace the cache value. Now, MediaWiki broadcasts “purge” events for cache keys. Each data centre receives these and sets a “tombstone”, a marker lasting a few seconds that limits any set-value operations for that key to a minuscule time-to-live. This makes it tolerable for recache-on-miss logic to recompute the cache value using local replica databases, even though they might have several seconds of replication lag. Heartbeats are used to detect the replication lag of the databases involved during any re-computation of a cache value. When that lag is more than a few seconds (a large portion of the tombstone period), the corresponding cache set-value operation automatically uses a low time-to-live. This means that large amounts of replication lag are tolerated.

    This and other aspects of WANObjectCache’s design allow MediaWiki to trust that cached values are not substantially more stale than a local replica database, provided that cross-DC broadcasting of tiny in-memory tombstones is not disrupted.
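
    To make the tombstone mechanism easier to picture, here is a minimal sketch in TypeScript. It only illustrates the purge/tombstone ordering described above; it is not the actual WANObjectCache API (which is PHP), and the in-memory store, key prefix, and TTL constants are assumptions.

    // Illustrative sketch of tombstone-based purging; names and values are made up.
    type Entry = { value: unknown; expiresAt: number };

    class TombstoneCache {
      private store = new Map<string, Entry>();   // stands in for the local Memcached
      private readonly holdOffMs = 11_000;        // how long a tombstone lasts (assumed)
      private readonly interimTtlMs = 1_000;      // tiny TTL used while a tombstone is live

      // Called in every data centre when the source data changes.
      purge(key: string): void {
        this.store.set('tombstone:' + key, { value: true, expiresAt: Date.now() + this.holdOffMs });
        this.store.delete(key);
      }

      // Recache-on-miss: recompute from a possibly lagged local replica.
      async getWithSet(key: string, ttlMs: number, compute: () => Promise<unknown>): Promise<unknown> {
        const hit = this.store.get(key);
        if (hit && hit.expiresAt > Date.now()) {
          return hit.value;
        }
        const value = await compute();
        // While the tombstone is live, keep the recomputed value only briefly,
        // since the replica we read from may still be lagging.
        const ttl = this.isLive('tombstone:' + key) ? this.interimTtlMs : ttlMs;
        this.store.set(key, { value, expiresAt: Date.now() + ttl });
        return value;
      }

      private isLive(key: string): boolean {
        const entry = this.store.get(key);
        return entry !== undefined && entry.expiresAt > Date.now();
      }
    }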


    First paint time now under 900ms

    In July we set out a goal: improve page load performance so our median first paint time would go down from approximately 1.5 seconds to under a second – and stay under it!

    I identified synchronous scripts as the single-biggest task blocking the browser, between the start of a page navigation and the first visual change seen by Wikipedia readers. We had used async scripts before, but converting these last two scripts to be asynchronous was easier said than done.

    There were several blockers to this change, including the use of embedded scripts by interactive features. These were partly migrated to CSS-only solutions. For the other features, we introduced the notion of “delayed inline scripts”. Embedded scripts now wrap their code in a closure and add it to an array. After the module loader arrives, we process the closures from the array and execute the code within.
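
    A rough sketch of that queueing pattern, in TypeScript. The queue name and the feature code are invented for illustration; in practice the queue must live on a global object so that separate inline scripts and the module loader can share it.

    // Sketch of a "delayed inline scripts" queue; names are illustrative.
    type InlineScript = () => void;
    const pendingInlineScripts: InlineScript[] = [];

    // What an embedded script now does: wrap its code in a closure and enqueue it.
    pendingInlineScripts.push(() => {
      document.querySelector('.interactive-widget')?.classList.add('ready');
    });

    // What the asynchronously loaded module loader does once it arrives.
    function flushInlineScripts(): void {
      while (pendingInlineScripts.length > 0) {
        const run = pendingInlineScripts.shift();
        if (run) {
          run();
        }
      }
    }

    flushInlineScripts();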

    Another major blocker was the subset of community-developed gadgets that didn’t yet use the module loader (introduced in 2011). These legacy scripts assumed a global scope for variables, and depended on browser behaviour specific to serially loaded, synchronous, scripts. Between July 2015 and August 2015, I worked with the community to develop a migration guide. And, after a short deprecation period, the legacy loader was removed.

    Line graph that plots the firstPaint metric for August 2015. The line drops from approximately one and a half seconds to 890 milliseconds.

    Hello, WebPageTest

    Previously, we only collected performance metrics for Wikipedia from sampled real-user page loads. This is super and helps detect trends, regressions, and other changes at large. But, to truly understand the characteristics of what made a page load a certain way, we need synthetic testing as well.

    Synthetic testing offers frame-by-frame video captures, waterfall graphs, performance timelines, and above-the-fold visual progression. We can run these automatically (e.g. every hour) for many urls, on many different browsers and devices, and from different geo locations. These tests allow us to understand the performance, and analyse it. We can then compare runs over any period of time, and across different factors. It also gives us snapshots of how pages were built at a certain point in time.

    The results are automatically recorded into a database every hour, and we use Grafana to visualise the data.

    In 2015 Peter built out the synthetic testing infrastructure for Wikimedia, from scratch. We use the open-source WebPageTest software. To read more about its operation, check Wikitech.


    The journey to Thumbor begins

    Gilles evaluated various thumbnailing services for MediaWiki. The open-source Thumbor software came out as the most promising candidate.

    Gilles implemented support for Thumbor in the MediaWiki-Vagrant development environment.

    To read more about our journey to Thumbor, read The Journey to Thumbor (part 1).


    Save timing reduced by 50%

    Save timing is one of the key performance metrics for Wikipedia. It measures the time from when a user presses “Publish changes” while editing until the user’s browser starts to receive a response. During this time, many things happen. MediaWiki parses the wiki-markup into HTML, which can involve page macros, sub-queries, templates, and other parser extensions. These inputs must be saved to a database. There may also be some cascading updates, such as the page’s membership in a category. And last but not least, there is the network latency between the user’s device and our data centres.

    This year saw a 50% reduction in save timing. At the beginning of the year, median save timing was 2.0 seconds (quarterly report). By June, it was down to 1.6 seconds (report), and in September 2015, we reached 1.0 seconds! (report)

    Line graph of the median save timing metric, over 2015. Showing a drop from two seconds to one and a half in May, and another drop in June, gradually going further down to one second.

    The effort to reduce save timing was led by Aaron Schulz. The impact that followed was the result of hundreds of changes to MediaWiki core and to extensions.

    Deferring tasks to post-send

    Many of these changes involved deferring work to happen post-send. That is, after the server sends the HTTP response to the user and closes the main database transaction. Examples of tasks that now happen post-send are: cascading updates, emitting “recent changes” objects to the database and to pub-sub feeds, and doing automatic user rights promotions for the editing user based on their current age and total edit count.

    Aaron also implemented the “async write” feature in the multi-backend object cache interface. MediaWiki uses this for storing the parser cache HTML in both Memcached (tier 1) and MySQL (tier 2). The second write now happens post-send.

    By re-ordering these tasks to occur post-send, the server can send a response back to the user sooner.
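
    As a sketch of that ordering, here is a TypeScript/Node.js illustration. MediaWiki’s actual mechanism is implemented in PHP, and the task names here are invented; the only point is that deferred work runs after the response has been sent.

    import * as http from 'http';

    type DeferredTask = () => Promise<void>;

    http.createServer(async (_req, res) => {
      const postSend: DeferredTask[] = [];

      // 1. Essential work needed to answer the request happens first;
      //    non-essential follow-up work is only queued.
      postSend.push(async () => { /* cascading category updates */ });
      postSend.push(async () => { /* emit "recent changes" events */ });

      // 2. Send the response as early as possible.
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.end('Saved.');

      // 3. Only then run the deferred tasks.
      for (const task of postSend) {
        await task();
      }
    }).listen(8080);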

    Working with the database, instead of against it

    A major category of changes were improvements to database queries. For example, reducing lock contention in SQL, refactoring code in a way that reduces the amount of work done between two write queries in the same transaction, splitting large queries into smaller ones, and avoiding use of database master connections whenever possible.

    These optimisations reduced the chances of queries being stalled, and allowed them to complete more quickly.
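
    For example, a single large write can be split into smaller batches so each statement holds its locks only briefly. A TypeScript sketch with a hypothetical database handle (this is not MediaWiki’s actual database layer, and the table and column names are made up):

    // Hypothetical database handle; only the shape needed for this sketch.
    interface Db {
      run(sql: string, params: unknown[]): Promise<void>;
    }

    // Delete rows in small batches instead of one huge statement, so that
    // each write finishes quickly and lock contention stays low.
    async function deleteLinksInBatches(db: Db, pageIds: number[], batchSize = 100): Promise<void> {
      for (let i = 0; i < pageIds.length; i += batchSize) {
        const batch = pageIds.slice(i, i + batchSize);
        const placeholders = batch.map(() => '?').join(', ');
        await db.run(`DELETE FROM links WHERE page_id IN (${placeholders})`, batch);
      }
    }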

    Avoid synchronous cache re-computations

    The aforementioned work on WANObjectCache also helped a lot. Whenever we converted a feature to use this interface, we reduced the amount of blocking cache computation that happened mid-request. WANObjectCache also performs probabilistic preemptive refreshes of near-expiring values, which can prevent cache stampedes.
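
    The preemptive-refresh idea can be approximated as “probabilistic early expiration”: the closer a value gets to expiry, the more likely a reader is to refresh it ahead of time, so expiry never arrives with a cold cache and a crowd of waiting requests. A TypeScript sketch with invented tuning constants; WANObjectCache’s actual scheme differs in its details.

    interface Cached<T> {
      value: T;
      expiresAt: number;   // epoch milliseconds
      computeMs: number;   // how long the value took to compute
    }

    const EARLY_REFRESH_WINDOW_MS = 10_000;  // start considering refreshes this close to expiry (assumed)

    // Returns true if this reader should refresh the value before it expires.
    function shouldRefreshEarly<T>(entry: Cached<T>, now: number = Date.now()): boolean {
      const remaining = entry.expiresAt - now;
      if (remaining <= 0) return true;                        // already expired
      if (remaining > EARLY_REFRESH_WINDOW_MS) return false;  // nowhere near expiry
      // The chance grows as expiry nears, weighted by how expensive
      // the value is to recompute.
      const nearness = 1 - remaining / EARLY_REFRESH_WINDOW_MS;
      const cost = Math.min(1, entry.computeMs / 100);
      return Math.random() < nearness * cost;
    }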

    Profiling can be expensive

    We disabled the performance profiler of the AbuseFilter extension in production. AbuseFilter allows privileged users to write rules that may prevent edits based on certain heuristics. Its profiler would record how long the rules took to inspect an edit, allowing users to optimise them. The way the profiler worked, though, added a significant slow down to the editing process. Work began later in 2016 to create a new profiler, which has since completed.

    And more

    Lots of small things, including fixing the User object cache, which existed but wasn’t working, and avoiding caching values in Memcached when computing them is faster than the Memcached latency required to fetch them!

    We also improved latency of file operations by switching more LBYL-style coding patterns to EAFP-style code. Rather than checking whether a file exists, is readable, and then checking when it was last modified – do only the latter and handle any errors. This is both faster and more correct (due to LBYL race conditions).
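
    A small sketch of the difference, using Node.js file APIs in TypeScript purely for illustration; the actual changes were in MediaWiki’s PHP file-handling code.

    import { statSync, accessSync, constants } from 'fs';

    // LBYL: look before you leap. Two calls, plus a race window in between:
    // the file can disappear after the access check and before the stat call.
    function mtimeLbyl(path: string): number | null {
      try {
        accessSync(path, constants.R_OK);
      } catch {
        return null;
      }
      return statSync(path).mtimeMs;
    }

    // EAFP: easier to ask forgiveness than permission. One call, handle the error.
    function mtimeEafp(path: string): number | null {
      try {
        return statSync(path).mtimeMs;
      } catch {
        return null;
      }
    }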


    So long, Sajax!

    Sajax was a library for invoking a subroutine on the server, and receiving its return value as JSON from client-side JavaScript. In March 2006, it was adopted in MediaWiki to power the autocomplete feature of the search input field.

    The Sajax library had a utility for creating an XMLHttpRequest object in a cross-browser-compatible way. MediaWiki deprecated Sajax in favour of jQuery.ajax and the MediaWiki API. Yet, years later in 2015, this tiny part of Sajax remained popular in Wikimedia's ecosystem of community-developed gadgets.

    The legacy library was loaded by default on all Wikipedia page views for nearly a decade. During a performance inspection this year, Ori Livneh decided it was high time to finish this migration. Goodbye Sajax!


    Further reading

    This year also saw the switch to encrypt all Wikimedia traffic with TLS by default. More about that on the Wikimedia blog.

    — Timo Tijhof


    Mentioned tasks: T107399, T105391, T109666, T110858, T55120.

    I read “The impact of user interface on young children’s computational thinking” (pdf) by Sullivan, Bers, Pugnali (2017). Wonderful paper! The authors compare a tangible interface for robotics programming with a non-tangible one. The tangible interface is KIBO, a robotic kit programmed with wooden blocks, basically making Scratch physical. (I totally love KIBO, so I’m biased […]

    Tech News issue #10, 2020 (March 2, 2020)

    00:00, Monday, 02 2020 March UTC
    2020, week 10 (Monday 02 March 2020)

    Semantic MediaWiki 3.1.5 released

    17:17, Sunday, 01 2020 March UTC

    February 29, 2020

    Semantic MediaWiki 3.1.5 (SMW 3.1.5) has been released today as a new version of Semantic MediaWiki.

    It is a maintenance release providing bug fixes. Please refer to the help pages on installing or upgrading Semantic MediaWiki to get detailed instructions on how to do this.
