This Month in GLAM: July 2020

05:42, Wednesday, 12 2020 August UTC

WikimediaDebug v2 is here!

23:10, Monday, 10 2020 August UTC

WikimediaDebug is a set of tools for debugging and profiling MediaWiki web requests in a production environment. WikimediaDebug can be used through the accompanying browser extension, or from the command-line.

This post highlights changes we made to WikimediaDebug over the past year, and explains more generally how its capabilities work.

  1. What's new?
  2. Features overview: Staging changes, Debug logging, and Performance profiling.
  3. How does it all work?

§ 1. What's new?

Redesigned

I've redesigned the popup using the style and components of the Wikimedia Design Style Guide.

New design Previous design

The images above also show improved labels for the various options. For example, "Log" is now known as "Verbose log". The footer links also have clearer labels now, and visually stand out more.

New footer Previous footer

This release also brings dark mode support! (brighter icon, slightly muted color palette, and darker tones overall). The color scheme is automatically switched based on device settings.

Dark mode

Inline profile

I've added a new "Inline profile" option. This is a quicker and more lightweight alternative to the "XHGui" profile option. It outputs the captured performance profile directly to your browser (as a hidden comment at the end of the HTML or CSS/JS response).

Beta Cluster support

This week, I've set up an XHGui server in the Beta Cluster. With this release, WikimediaDebug has reached feature parity between Beta Cluster and production.

It recognises whether the current tab is for the Beta Cluster or production, and adapts accordingly.

  • The list of hostnames is omitted to avoid confusion (as there is no debug proxy in Beta).
  • The "Find in Logstash" link points to logstash-beta.wmflabs.org.
  • The "Find in XHGui" link points to performance-beta.wmflabs.org/xhgui/.
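
The adaptation described in the list above can be sketched roughly as follows. This is not the extension's actual code (the extension is a browser WebExtension); it is a minimal Python illustration that assumes Beta Cluster tabs can be recognised by a hostname ending in `.beta.wmflabs.org`. The function names and the `https://` scheme on the Beta links are assumptions; the hostnames are the ones quoted in this post.

```python
from urllib.parse import urlsplit

# Link targets quoted in this post; the lookup table itself is illustrative.
LINKS = {
    "beta": {
        "logstash": "https://logstash-beta.wmflabs.org",
        "xhgui": "https://performance-beta.wmflabs.org/xhgui/",
    },
    "production": {
        "logstash": "https://logstash.wikimedia.org",
        "xhgui": "https://performance.wikimedia.org",
    },
}


def detect_realm(tab_url: str) -> str:
    """Return "beta" for Beta Cluster hostnames, otherwise "production"."""
    host = urlsplit(tab_url).hostname or ""
    return "beta" if host.endswith(".beta.wmflabs.org") else "production"


def footer_links(tab_url: str) -> dict:
    """Pick the Logstash and XHGui links that match the current tab."""
    return LINKS[detect_realm(tab_url)]
```

For example, `footer_links("https://en.wikipedia.beta.wmflabs.org/wiki/Main_Page")` selects the two Beta links, while any other URL falls back to the production ones.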

§ 2. Features overview

Staging changes

The most common use of WikimediaDebug is to verify software changes during deployments (e.g. SWAT). When deploying changes, the Scap deployment tool first syncs to an mw-debug host. The user then toggles on WikimediaDebug and selects the staging host.

WikimediaDebug is now active and routes browser activity for WMF wikis to the staging host. This bypasses the CDN caching layers and load balancers normally involved with such requests.

Debug logging

The MediaWiki software is instrumented with log messages throughout its source code. These indicate how the software behaves, which internal values it observes, and the decisions it makes along the way. In production we dispatch messages that carry the "error" severity to a central store for monitoring purposes.

When investigating a bug report, developers may try to reproduce the bug in their local environment with a verbose log. With WikimediaDebug, this can be done straight in production.

The "Verbose log" option configures MediaWiki to dispatch all its log messages, from any channel or severity level. Below is an example where the Watchlist component is used with the verbose log enabled.

One can then reproduce the bug (on the live site). The verbose log is automatically sent to Logstash, for access via the Kibana viewer at logstash.wikimedia.org (restricted link).

Aggregate graphs (Kibana) Verbose log (Kibana)

Performance profiling

The performance profiler shows where time is spent in a web request. This feature was originally implemented using the XHProf PHP extension (for PHP 5 and HHVM). XHProf is no longer actively developed, or packaged, for PHP 7. As part of the PHP 7 migration this year, we migrated to Tideways which provides similar functionality. (T176370, T206152)

The Tideways profiler intercepts the internals of the PHP engine, and tracks the duration of every subroutine call in the MediaWiki codebase, and its relation to other subroutines. This structure is known as a call tree, or call graph.
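
To make the idea of a call graph more concrete, here is a small sketch of how per-call timings can be rolled up into a "most expensive functions" view. It assumes a profile shaped like XHProf-style output, where each key is either a root function or a `parent==>child` pair and `wt` is wall time in microseconds. The sample data and function names are invented for illustration and are not taken from a real MediaWiki request.

```python
from collections import defaultdict


def summarize(profile):
    """Compute inclusive and self wall time per function from call-pair data."""
    inclusive = defaultdict(int)      # time spent in a function, callees included
    spent_in_callees = defaultdict(int)
    for key, stats in profile.items():
        parent, sep, child = key.partition("==>")
        if not sep:
            # Root entry such as "main()": it has no parent.
            inclusive[parent] += stats["wt"]
        else:
            inclusive[child] += stats["wt"]
            spent_in_callees[parent] += stats["wt"]
    return {
        fn: {"inclusive": wt, "self": wt - spent_in_callees[fn]}
        for fn, wt in inclusive.items()
    }


def most_expensive(profile):
    """Function names ordered by self time, most expensive first."""
    summary = summarize(profile)
    return sorted(summary, key=lambda fn: summary[fn]["self"], reverse=True)


# Hypothetical sample profile (wall time in microseconds).
SAMPLE = {
    "main()": {"ct": 1, "wt": 1000},
    "main()==>load": {"ct": 1, "wt": 300},
    "main()==>render": {"ct": 2, "wt": 600},
    "render==>query": {"ct": 4, "wt": 450},
}
```

With this sample, `render` has an inclusive time of 600 but a self time of only 150, because most of its time is spent inside `query`.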

The performance profile we capture with Tideways is automatically sent to our XHGui installation at https://performance.wikimedia.org (public). There, the request can be inspected in fine detail. In addition to a full call graph, it also monitors memory usage throughout the web request.

Most expensive functions (XHGui) Call graph (XHGui)

§ 3. How does it all work?

Browser extension

The browser extension is written using the WebExtensions API which Firefox and Chrome implement.

Add to Firefox   Add to Chrome

You can find the source code on github.com/wikimedia/WikimediaDebug. To learn more about how WebExtensions work, refer to MDN docs, or Chrome docs.

HTTP header

When you activate WikimediaDebug, the browser is given an extra HTTP header. This header is sent along with all web requests relating to WMF's wiki domains, both those for production and those belonging to the Beta Cluster. In other words, any web request for *.wikipedia.org, wikidata.org, *.beta.wmflabs.org, etc.

The header is called X-Wikimedia-Debug. In the edge traffic layers of Wikimedia, this header is used as a signal to bypass the CDN cache. The request is then forwarded, past the load balancers, directly to the specified mw-debug server.

Header Format
X-Wikimedia-Debug: backend=<servername> [ ; log ] [ ; profile ] [ ; forceprofile ] [ ; readonly ]
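
For command-line use, the header value can be assembled by hand. The helper below is a minimal sketch that follows the format shown above; the function name and the example server name are hypothetical.

```python
def build_debug_header(backend, log=False, profile=False,
                       forceprofile=False, readonly=False):
    """Assemble an X-Wikimedia-Debug value from the documented options."""
    parts = [f"backend={backend}"]
    flags = (
        ("log", log),
        ("profile", profile),
        ("forceprofile", forceprofile),
        ("readonly", readonly),
    )
    # Optional flags are appended in the order the format lists them.
    parts.extend(name for name, enabled in flags if enabled)
    return "; ".join(parts)


# "mwdebug.example" is a placeholder, not a real server name.
EXAMPLE = build_debug_header("mwdebug.example", log=True, profile=True)
```

The resulting string can then be sent as the `X-Wikimedia-Debug` request header by whichever HTTP client you prefer.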

mediawiki-config

This HTTP header is parsed by our MediaWiki configuration (wmf/profiler.php, and wmf/logging.php).

For example, when profile is set (the XHGui option), profiler.php invokes Tideways to start collecting stack traces with CPU/memory information. It then schedules a shutdown callback in which it gathers this data, connects to the XHGui database, and inserts a new record. The record can then be viewed via performance.wikimedia.org.
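
As a rough illustration of the parsing step, the sketch below splits a header value in the format shown earlier into its options. The real logic lives in the PHP configuration files named above; this Python version only mirrors the documented format, and its function name and return shape are assumptions.

```python
KNOWN_FLAGS = ("log", "profile", "forceprofile", "readonly")


def parse_debug_header(value):
    """Split an X-Wikimedia-Debug value into a backend and boolean flags."""
    options = {"backend": None}
    options.update({flag: False for flag in KNOWN_FLAGS})
    for part in value.split(";"):
        name, sep, arg = part.strip().partition("=")
        name = name.strip()
        if name == "backend" and sep:
            options["backend"] = arg.strip()
        elif name in KNOWN_FLAGS:
            options[name] = True
        # Anything else is ignored in this sketch.
    return options
```

A caller would then check, for instance, `options["profile"]` to decide whether to start the profiler for the current request.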


In 2019 Wikimedia UK, Archaeology Scotland and The Society of Antiquaries of Scotland recruited a graduate intern through the Scottish Graduate School for Arts & Humanities Internship programme. These funded placements give PhD researchers the opportunity to spend up to three months with a partner organisation, improving their research skills and giving them an opportunity to work on a project which makes a real difference to an organisation. Roberta Leotta is coming to the end of her internship, and reflects on her project in this blog she wrote for us.

At the beginning of March – which now feels like ages ago – I went to North Berwick to take pictures I wanted to upload to Wikimedia Commons. During my walking tour, while I was taking pictures of monuments and buildings, I came across a monument dedicated to Catherine Watson, a woman who died in 1889 while attempting to rescue children from an angry sea. Searching for Catherine Watson on the Internet, I found an interesting project called Mapping Memorials to Women in Scotland. This project aims to uncover and map memorials in Scotland which commemorate women, either famous or unknown, ‘who have contributed in some way to the life of the country we know today’. I really liked the purpose of this project and I started to think about possible collaborations as part of my internship. I would have liked to do some historical research about women in Scotland and go around taking pictures of other possible memorials, so that I could contribute to the number of open access images in Wikimedia about the history of women in this country. However, when I was about to begin this project the pandemic happened and I needed, for personal reasons, to come back home to Italy.

All this has been very disruptive for everyone and, in my own way, I needed to adjust all my plans under these new circumstances. It was highly remarkable how quickly and effectively it was possible to find a solution to adapt my internship project to the new reality we all have been experiencing.

Indeed, while I was in Sicily locked inside an empty house, I could no longer travel around Scotland in my hunt for memorials. However, the host institution of my internship was very supportive and they helped me to find a different solution for my project. Instead of mapping memorials physically, I started to learn how to navigate the digital space of Wikidata, a relational database. Wikidata was totally unknown to me, but it became a place where I could remotely create, map and add new data about women in Scotland.

At the beginning, I had a hard time understanding this language, the coding system, how things make sense in this field, but my host institution was always there for all my doubts, questions, worries. This online and remote training was indeed very efficient because of the empathy, positiveness and expertise provided by the host institution.

Then, thanks to the fact that we obtained properties approved for the Women of Scotland project, we had access to the identifier numbers through which I was able to create more than fifty items in Wikidata – among memorials and women – linked to the Mapping Memorials website. As an ultimate outcome, my project contributed to an increase in open access knowledge about the history of women who, in some cases, are little known in Scotland.

In this process, I had the chance to acquire several skills and have a very significant experience, both from a personal and a professional perspective. Firstly, I learnt some new digital skills, which are necessary for almost all careers nowadays. Secondly, this project gave me the chance to reflect upon principles and processes of categorisation and classification in Wikidata, a topic which also connects with my PhD interests in cognitive studies. Lastly, what I consider the major lesson from this experience was understanding how to create a collaborative, supportive working team which is able to face any type of situation with creativity and flexibility, including, as we have experienced, very unpredictable ones such as pandemics.

Find out more about how you can support interns like Roberta here.

Tech News issue #33, 2020 (August 10, 2020)

00:00, Monday, 10 2020 August UTC

Tamil Computing Virtual Meetup

12:40, Sunday, 09 2020 August UTC

Today (August 9, 2020), Tamil Virtual Academy organized a virtual meetup on Tamil computing and its roadmap. This full-day event had 18 sessions presented by various people working on Tamil computing. The event was chaired by T. Udhayachandran IAS, Director of TVU. I was also invited to the program. I talked about potential collaboration between the Tamil and Malayalam computing communities to solve common problems. Such collaboration, built on open source, can help accelerate language computing in both languages.

weeklyOSM 524

09:27, Sunday, 09 2020 August UTC

28/07/2020-03/08/2020

Stina Flodström’s video created with FOSS data [1] | © Stina Flodström | map data © OpenStreetMap contributors

Mapping

  • A blog post on The Strava Club reminded users that they can help improve the quality of suggested routes by contributing to OpenStreetMap.
  • The members of OpenStreetMap Italia community have created su.openstreetmap.it, a website where you can add businesses in Italy to OSM without having an account. They have recently added a written and video tutorial with a step by step explanation. The news has been covered (it) (automatic translation) by Wikimedia Italia with a short article. The source code of the website is available on GitHub.
  • Skunk published a blog entry titled ‘Mapping artwork and memorials with Wikimedia integration’.
  • Jake Coppinger reviewed a number of bicycle routing apps and ranked them in rough order of which he would use for commuter cycling in Sydney.

Community

  • User mariotomo writes (es) (automatic translation) about his ‘reflections on the margins of the SotM-2020 virtual conference’, in particular from the perspective of being on a low bandwidth connection.
  • OpenStreetMap US has been awarded a Geospatial Data Analytics Services Grant for the Azavea Summer of Maps programme. Eugene Chong will be using OpenStreetMap to track (and map) progress made on UN Sustainable Development Goals in several American cities.
  • On System Administrator Appreciation Day Dorothea thanked our sysadmins for the awesome work that they are doing.

OpenStreetMap Foundation

  • Mikel Maron wrote an email about ‘coordinated funding to support continued maintenance and development of the iD editor’.
  • Guillaume Rischard announced, on talk, the funding of three infrastructure projects: Nominatim, osm2pgsql and Potlatch 2 by the OSMF Board. Technical issues were discussed among others by Sören Reinecke, Richard Fairhurst and mmd.
  • The OSMF Board is considering splitting the Advisory Board in two and is seeking input on this move.
  • The minutes of the OSMF Board meeting on 30 July have been published.
  • Imagico ponders in his blog post about ‘the how and where of the OSMF starting to hand out money in the OSM community’.

Humanitarian OSM

  • Médecins Sans Frontières held a missing maps mapathon on 1 August with volunteers from all over Southeast Asia. The mapathon is part of the organisation’s plan to map out Nigeria this year.
  • Despite movement restrictions in the country, HOT Philippines has been able to continue training volunteers by identifying their needs and shifting training efforts online.
  • YouthMappers, CommonSensing and other members of the humanitarian mapping world were covered in an article by Gareth Willmer on Devex.

Maps

  • Three students created a website gathering all kinds of data related to cycling. The first version of the tool is online. The mapathon to collect data, and create a complete website to document the entire making process, was organised by Open Knowledge Belgium. They are asking for feedback.
  • Peter Corless discusses how Stadia Maps improved their end-to-end latency by switching from CockroachDB to Scylla.
  • After Taiwan passed the National Language Law, the Hokkien Language, mixed with Austronesian and Japanese terms, known as Taiwanese or Taiwanese Hokkien, gained official language status in Taiwan. Brandon Liu will talk about the Taiwan Taigi Map in COSCUP, one of the most important open source conferences in Taiwan. The Taiwan Taigi Map lets people explore name:nan tags which record local Hokkien names.

switch2OSM

  • For some time the City of Karlsruhe has been using the API of Openrouteservice (ORS), developed by HeiGIT, to provide a routing service for pedestrians, bicycles and cars in their online city map and citizen GIS app (for citizens and visitors to Karlsruhe). You can try it here:

    The client developed by the Liegenschaftsamt (land management department), city of Karlsruhe is based on the ArcGIS for Javascript API and built as a widget for the ArcGIS Web AppBuilder.

Software

  • Can Yang has written a tutorial on map matching using their fast map matching library and OpenStreetMap data.
  • Misiones Provincial Routes (automatic translation) is a tool built with institutional and collaborative contributions, such as OpenStreetMap, Mapillary, Wikipedia, and Openrouteservice. Developed by the Dirección de Modernización de la Gestión y Gobierno Electrónico de la Provincia de Misiones, Argentina, it is an interactive map to virtually travel provincial routes and roads. Carlos Brys, one of the developers, gives (automatic translation) more details in a Primera Edicion article.
  • Ilya Zverev explained (ru) (automatic translation) how a combination of ancient anonymous edits and a new JSON output option for the OSM API caused problems for iD.

Programming

  • GIScience Heidelberg open-sourced the ohsome2label tool, which offers a flexible framework for labelling customised geospatial objects using historical OSM data allowing more effective and efficient deep learning. Based on the ohsome API developed by HeiGIT gGmbH, ohsome2label aims to mitigate the lack of abundant high-quality training samples in geospatial deep learning by automatically extracting customised OSM historical features, and providing intrinsic OSM data quality measurements.
  • Fabian, from HeiGIT, explained the new function in the ohsome (OSM history Analytics) API for checking query parameters. The API now uses a fuzzy string matcher to suggest which parameter you meant to pass when it encounters one it doesn’t recognise in your query.
  • [1] Stina Flodström has produced a pipeline importing OpenStreetMap data into Unity, using Houdini (here is the free version for testing), to generate real time environments based on real cities. You must see the video!
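
The parameter suggestion described for the ohsome API above can be approximated in a few lines. This sketch uses `difflib` from Python's standard library as a stand-in fuzzy matcher; it is not the matcher the ohsome API itself uses, and the parameter list is an illustrative assumption rather than the API's full set.

```python
import difflib

# Illustrative list of accepted query parameters (assumption).
KNOWN_PARAMETERS = ["bboxes", "bcircles", "bpolys", "filter", "format", "time"]


def suggest_parameter(unknown, known=KNOWN_PARAMETERS):
    """Return the closest known parameter name, or None if nothing is similar."""
    matches = difflib.get_close_matches(unknown, known, n=1, cutoff=0.6)
    return matches[0] if matches else None
```

A request containing `bboxs` would be answered with a hint to use `bboxes`, while a parameter that resembles nothing in the list yields no suggestion.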

Releases

  • JOSM stable release 20.07 is out.

Did you know …

  • … that you can draw a bounding box on a map and Norbert Renner’s bbox tool will return its coordinates in a number of different OSM formats?
  • … that there is a wiki page listing websites that are using OSM data without correct attribution? The page also describes the steps to follow if you discover another example of missing attribution.
  • … that OpenHistoricalMap is built by a community of mappers and historians that contribute and maintain data about the history of the world?
  • … that Public Transport Network Analysis (PTNA) provides a daily analysis of public transport lines mapped in OSM?
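
As an aside on bounding box formats: different OSM services expect the same four numbers in different orders. The sketch below shows two widely used conventions, the OSM API's `left,bottom,right,top` and Overpass QL's `(south,west,north,east)`. Whether the bbox tool emits exactly these strings is an assumption, and the class and function names are made up for this example.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    west: float   # minimum longitude
    south: float  # minimum latitude
    east: float   # maximum longitude
    north: float  # maximum latitude


def as_osm_api(box: BBox) -> str:
    """OSM API order: left,bottom,right,top."""
    return f"{box.west},{box.south},{box.east},{box.north}"


def as_overpass(box: BBox) -> str:
    """Overpass QL order: (south,west,north,east)."""
    return f"({box.south},{box.west},{box.north},{box.east})"


# Arbitrary example coordinates.
EXAMPLE_BOX = BBox(west=8.6, south=49.3, east=8.8, north=49.5)
```

Mixing up the two orders is a common source of empty query results, which is one reason a conversion tool is handy.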

Other “geo” things

  • Bored Panda showed us their favourite maps from Terrible Maps.
  • It’s been hot enough recently to prompt Alexander Zipf to remind us that a shady route feature is available for Heidelberg and Dresden in the meinGrün app (automatic translation).
  • Burb (ˈbərb) v. the act of cycling every street of a suburb in a single ride, a variant of the Chinese postman problem. Matt and Andy competed with Jim’s optimiser to find the shortest route to burb Bellfield.
  • Microsoft’s latest release of Flight Simulator is more of a vast, gamified take on Google Earth than it is a simulation of flight. Keith Stuart describes how the game presents a near-photorealistic depiction of the entire planet, featuring cities procedurally generated by AI, based on data from OpenStreetMap.
  • Antony Barja is excited about North Road’s new SLYR ArcMap to QGIS compatibility suite. The new tool automatically converts ArcMap MXD, MXT and PMF documents to QGIS projects, Esri LYR files to their QGIS equivalents, and Esri .style databases.
  • Laura Bliss writes on Bloomberg CityLab about the pandemic-era appeal of getting lost in a labyrinth, and links to the world wide labyrinth locator. On OpenStreetMap, the attraction=maze tag is used about 800 times.
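
For readers curious about the Chinese postman problem mentioned in the burbing item above, a first step is to look at junction degrees. In a connected street network where every junction has an even number of street ends, a closed route can cover each street exactly once; any junction with an odd count means some street must be ridden twice. The sketch below only finds those odd junctions. It is not the optimiser referred to in the article, and the street lists are invented.

```python
from collections import Counter


def odd_degree_junctions(streets):
    """Return junctions that touch an odd number of street segments."""
    degree = Counter()
    for a, b in streets:
        degree[a] += 1
        degree[b] += 1
    return sorted(node for node, count in degree.items() if count % 2 == 1)


# A square block: every junction has degree 2, so no street needs repeating.
SQUARE = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]

# Adding a diagonal lane makes A and C odd, so part of the route repeats.
SQUARE_WITH_LANE = SQUARE + [("A", "C")]
```

A full solution would then pair up the odd junctions so that the total length of repeated streets is as small as possible.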

Upcoming Events

Where | What | When | Country
Michigan | Michigan Online Meetup | 2020-08-08 | USA
Taipei | OSM x Wikidata #19 | 2020-08-10 | Taiwan
Hamburg | Hamburger Mappertreffen | 2020-08-11 | Germany
Munich | Münchner Stammtisch | 2020-08-12 | Germany
Berlin | 146. Berlin-Brandenburg Stammtisch | 2020-08-14 | Germany
Zurich | 120. Mapping-Party/OSM Meetup Zurich | 2020-08-15 | Switzerland
Cologne Bonn Airport | 130. Bonner OSM-Stammtisch (Online) | 2020-08-18 | Germany
Lüneburg | Lüneburger Mappertreffen | 2020-08-18 | Germany
Berlin | 14. OSM-Berlin-Verkehrswendetreffen (Online) | 2020-08-18 | Germany
Cologne | Köln Stammtisch ggf. ONLINE | 2020-08-19 | Germany
Derby | Derby pub meetup | 2020-08-25 | United Kingdom
Düsseldorf | Düsseldorfer OSM-Stammtisch | 2020-08-26 | Germany
Kandy | 2020 State of the Map Asia | 2020-10-31 to 2020-11-01 | Sri Lanka

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it, where appropriate.

This weeklyOSM was produced by AnisKoutsi, LorenzoStucchi, MatthiasMatthias, Nordpfeil, NunoMASAzevedo, PierZen, Polyglot, Rogehm, TheSwavu, derFred, muramototomoya, osmapman, richter_fn.

Keeping it simple for "Abstract Wikipedia"

07:55, Sunday, 09 2020 August UTC
Abstract Wikipedia is confusing to me; it is said to be about "articles in a language independent way". Articles are complicated because the expression in any language has to be consistent with the grammar, the diction, the vocabulary for that language. Wikipedia articles have one additional complication; once you start reading you may end up in a rabbit hole of wonderful stuff that grabs your attention.

Abstract Wikipedia covers all of Wikidata and that is much more than what all Wikipedias combined cover. Currently there are two items for every item with a Wikipedia link. The first objective that seems obvious is to have something to say about each item. It can be as little as **Name** is a **human**. When we know his profession **Name** is a **chemist**. When an award was won, "**Name** is a **chemist**. The **Award** was received in **year**." Patterns like these are similar for every language.
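
A minimal sketch of such patterns, assuming the labels for the name, the profession and the award are already available in the target language. The template table, the function name and the Dutch templates are illustrative assumptions, not part of any Abstract Wikipedia design.

```python
# One sentence pattern per fact type and language (illustrative).
TEMPLATES = {
    "en": {
        "instance": "{name} is a {kind}.",
        "award": "The {award} was received in {year}.",
    },
    "nl": {
        "instance": "{name} is een {kind}.",
        "award": "De {award} werd ontvangen in {year}.",
    },
}


def describe(lang, name, kind, award=None, year=None):
    """Build a minimal description from whichever facts are known."""
    patterns = TEMPLATES[lang]
    sentences = [patterns["instance"].format(name=name, kind=kind)]
    if award is not None and year is not None:
        sentences.append(patterns["award"].format(award=award, year=year))
    return " ".join(sentences)
```

Even this tiny example runs into grammar: English needs "an" before a vowel sound, and the Dutch article depends on the noun, so fixed templates only go so far.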

This minimal approach is the basis for automated descriptions, which are vital when disambiguating. It is an improvement over manual descriptions, which do not get updated when new information becomes available. Automated descriptions are not articles; they have to be descriptive and not describing.

When Wikipedia articles exist, they provide a rich source of information when new texts are to be generated. Given that Abstract Wikipedia is based on Wikidata, a tool like "Concept Cloud" is useful because it shows all the links to other articles and how often they occur in an article (Concept Cloud is part of Reasonator). The challenge will be to model such relations in Wikidata OR allow for these relations to be registered in a new way as part of Abstract Wikipedia.

Once sufficient information is available, an article can be generated. That is what LSJBOT and the Cebuano Wikipedia are famous for. It follows that once the same amount of data is available for a similar subject in Wikidata, an article can be generated in, for instance, Cebuano. When we recreate these templates, we can update them for any language.

The linguists who theorise Abstract Wikipedia to death can apply their magic and find out whether their pet theories hold water in the real world. In Abstract Wikipedia their function is to enable the provision of information in any language. Obviously competing theories may be implemented and as a result the underlying technology may evolve.

Thanks, GerardM

Designing technical workshops for the Indic community

19:00, Friday, 07 2020 August UTC

Background

Small Wiki Toolkits is an initiative for building technical capacity in smaller wiki communities. As part of this initiative, Wikimedia Foundation’s Developer Advocacy team planned an in-person workshop series to take place in June 2020 for a local Wikimedia community that might need it and would benefit from it. After conversations with 4-5 wiki communities, the team decided to co-organize the first workshop series with volunteers in India. Considering that there are emerging technical contributors in the Indic community, and IndicTech-Com exists, an initiative purely run by the volunteer community, the thought was that this workshop series would help augment the ongoing efforts. Also, as the Indic Wikimedia community is home to more than 22 language communities, the success of this workshop series would allow us to model a similar concept in another community quickly. Though the original plan was to coordinate this series in an in-person setting, due to the COVID-19 pandemic, it adopted an online format.

Workshop planning and design

Planning this workshop series seemed challenging initially, especially as we wanted to design an experience that supports running hands-on technical workshops in an online format smoothly. We wanted attendees not just to be passive listeners but active participants in these workshops. So, experimenting with different online meeting tools and finalizing one that would work best for facilitators, organizers, and participants was also crucial. In this post, we share detailed planning that went into designing different components of the workshop and lessons learned along the way so that it might benefit potential organizers or trainers interested in piloting a similar workshop series in their community. 

Concept

In December 2019, Indic community members ran a Community Engagement Survey for planning WikiConference India 2020, though the event itself has been put on hold as well, owing to the pandemic. We used this survey to ask a few questions to help understand the technical needs and challenges of Indic community members and identify new technical skills they wanted to learn—responses from the survey indicated the following top 3 skills: writing user scripts and gadgets, writing Wikidata queries, and using Phabricator. Based on these responses, we finalized the workshop topics. 

Registration and outreach

To announce the workshop series, we first set up a registration process and an event page on Meta-Wiki. On the registration form, participants could express interest in attending multiple workshops in order of preference; we made it clear that spots would be limited (15 per workshop) so that mentors could run the sessions more effectively. We selected participants based on their preferred choices and motivations, and in the invitation emails we asked them to sign the Meta-Wiki page to confirm their participation. We quickly realized that not everyone who registered would show up, which is a common pattern for online events, so we also invited a few people who were on the waitlist for a particular session.

For outreach, we wanted to make sure we reached the people this series was designed for. We spread the word on the Indic community mailing lists, village pumps, and Facebook groups where the Indian communities are most active. We also reached out on their talk pages to interface admins, who we thought could particularly benefit from the session on writing user scripts and gadgets, the skill most in demand according to the survey.

Running virtual workshops

We decided to use a combination of Zoom and Google Meet to conduct the online sessions, platforms that Indic communities seem to be most familiar with; using these, we could record and retrieve the sessions later easily and take advantage of cool moderation features for smooth facilitation. We recorded all the workshops except a discussion-based session on brainstorming the challenges of Indic communities to allow participants to feel more comfortable sharing their stories, thoughts, questions, and concerns. Slides and videos from all workshops are available on Wikimedia Commons and documented here on the workshops page along with the notes. 

We also shared facilitation tips with mentors asking them to:

  • Run a quick introduction round for participants to introduce themselves at the beginning of a workshop.
  • Ask participants to turn their video off when not talking to ensure the call runs smoothly for everyone, especially for low-bandwidth users.
  • Let participants know that the workshop will be recorded so that they can raise any concerns they have.
  • Share a friendly space policy or the Code of Conduct for Wikimedia technical spaces.

Reminders, follow-ups and feedback surveys

We sent at least two reminders to all participants, one a day before and one an hour before each workshop, and a follow-up email afterwards. In this follow-up email, we included workshop materials (slides, notes, and recording), an invitation to reach out for support for conducting similar training in the future, a feedback survey to help understand the usefulness of such workshops, and, most importantly, a few follow-up tasks for attendees. For example, for the user scripts and gadgets workshop, mentor Jayprakash put together a few small tasks. We asked attendees to work on these tasks, share their solutions, and ask related questions to the mentor directly via email.

Participant demographics

In May 2020, a call for participants was sent to Indic Wikimedia communities, and participants were selected based on their preferences, background, and motivations. Across the four sessions there were 26 unique participants and 42 attendances in total (an average of about 10 participants per workshop). In terms of diversity, we had one participant from each language community, except for Bengali, Gujarati, Hindi, Malayalam, and Santali, which had two each, plus five Wikimedia technical contributors.

Bar graph showing the number of participants across wikis
Figure 1: Participants from various Indic wikis by KCVelaga, CC BY-SA 4.0

This series could have done better on gender diversity: only 10% of all participants were women, which reflects the low representation of women and non-binary individuals in Wikimedia technical spaces in general, and especially in India. However, of the five technical contributors who participated, 60% were women; they are connected to Wikimedia not directly through the communities in India but through Google Summer of Code and similar programs.

Key outcomes

While the three workshops “Writing user scripts and gadgets”, “Writing Wikidata queries”, and “Using project management and bug reporting tool Phabricator” focused on capacity development, as is evident from their titles, the fourth, entitled “Understanding the technical challenges of Indic language wikis”, focused on brainstorming the Indic community’s challenges. Participants reported that their skills and awareness increased by 30% on average, and 8% of the participants said their familiarity with a technical topic increased by 60% after the workshop[1]. These percentages suggest the workshop series had a measurable impact on the technical capacity of Indic communities. The improvement indicates that participants gained new skills they can build upon, which was the intent of these workshops.

Bar graph showing the percentage improvement of participant skills
Figure 2: Percentage improvement of skills by KCVelaga, CC BY-SA 4.0

Success stories

  • Upon encouragement from the mentor and other participants in the “Understanding the technical challenges of Indic language wikis” workshop, User:Jayprakash12345 applied for global interface admin rights (which are usually not easy to obtain), and his nomination was successful.
  • Also, during this workshop, Malayalam and Konkani users reported that the Wikipedia:Twinkle gadget had been broken on their Wikipedias since a new update. Also, improper functioning of citoid on Malayalam was reported. As of July 2020, all issues reported from Malayalam have been fixed.

Observations and evaluations

What worked well?

  • On average, the overall quality and usefulness of the series was rated 4.2 out of 5, which is a positive sign. 
  • The participant turnout (50-60% of selected) for the workshops was quite good, considering that they were online and amidst the pandemic. We think that keeping a selection process helped generate some commitment to attend the sessions. 
  • Limiting workshops to 15 participants was helpful for mentors to manage the sessions better. Though we never reached the maximum capacity, it was still a good measure to have limitations, as more numbers may have negatively affected the quality of these sessions.
  • It was a great learning experience for the organizers to understand the dynamics of conducting a series of online workshops to have an impact equivalent to a two-day physical event.
  • It helped us understand the technical landscape of Indic wikis—contributors, their challenges, perspectives, and interests.
  • Participants reported that they would use what they learned in these workshops in several ways: building bots, importing gadgets into their local wikis, developing new gadgets, and creating Wikidata queries to use on their local wiki, among others.
  • Upon asking for additional feedback, we got the following responses, which indicate the value of these workshops and a need for their continuation, more so when designed with a specific local community in mind:

“I am looking forward for the “Understanding the technical challenges of Indic language wikis 2.0” session and such sessions should be arranged at least once in a month where we can discuss and solve issues facing by our Indic community.”

“The session was useful and moreover it was in Hinglish (A mix of Hindi and English) and I feel it was more adaptive for me to understand. The workshop started with the basics, which really helped me. Workshop was so interesting that some of the attendees were eager to learn more even after workshop time limits. I think it should not [be limited] to one session.”

“Overall experience of [the] workshop was excellent and I would say that such [a] kind of workshop should be organised frequently.”

What could be improved?

  • Participation rates of female and non-binary individuals could be improved. This could be done by proactively reaching out to them during the call for participation phase.
  • All the sessions ran past their planned end time, in some cases by 50%. As this was the first time, the time required was not estimated well. Where more time is needed, it is better to split the workshop into two shorter sessions with a break in between.

What’s next?

The Developer Advocacy team will soon be publishing lessons learned from the first year of the Small Wiki Toolkits initiative, including lessons learned from this workshop series. The team is also considering the technical challenges shared in this workshop series when planning new activities and projects to take on as part of the initiative. If a plan specific to the Indic communities is required, the workshop series organizers will take that on. The next steps will be announced on the Wikimedia-l/Wikitech-l mailing lists and via other communication mediums, hopefully by the end of September, so stay tuned!

Credits: Thanks to Jayprakash, Mahir, Birgit, Andre, and Satdeep for their massive help in mentoring and facilitating these workshops and to Alex for copy-editing this post!

About this post

Feature image credit: Participants from one of the Indic workshops, Satdeep Gill, CC BY-SA 4.0

¹ Percentages were based on a post-event survey, to which 24 of the 42 participants responded. As the response rate was 57%, it was deemed substantial enough to represent all the participants, and “participants” was used in place of “respondents”. This is applicable to further sections as well.

Adding biographies of female oceanographers

15:52, Thursday, 06 2020 August UTC

Laura Guertin is a Professor of Earth Science at Penn State Brandywine in Media, Pennsylvania. She recently participated in the 500 Women Scientists Wiki Scholars program and reflects on her experience with the Wikimedia community in this guest blog post.

As a scientist and educator, I have always had Wikipedia on my radar as a resource. As a geoscience education blogger for the American Geophysical Union (AGU) at GeoEd Trek, I have explored different Wikipedia efforts and programs, such as Wikipedia edit-a-thon for women in STEM [Women’s History Month] (2015), Wikipedia turns 15 – but do academics trust this teenager? (2016), Wikipedia Year of Science 2016 (2016), and Why Wikipedia edit-a-thons are needed, and how we can help (2019). But I remained on the outside, quick to read about and comment on Wikipedia, without ever having made an edit on the site.

In the Spring 2020 semester, I was finally ready to dive in with my students through the Wikipedia Student Program and have my class edit Wikipedia pages. However, the mid-semester shift to online learning had me pull back, with too many unknowns about students’ ability to access technology, and my own lack of confidence in my ability to provide a positive Wikipedia experience through remote instruction. Still determined to jump into the Wikipedia waters, I took the time this summer to build my own Wikipedia editing skills, and I’ve been extremely pleased with how many opportunities there have been for my own professional development, how quickly I’ve been able to contribute to Wikipedia biography pages, and how I’ve been able to assist others in making their first edits.

I have to credit my entry into the Wikipedia world with my participation in the 500 Women Scientists Wiki Scholars program, which provided instruction with Wikipedia experts, live mentoring via Zoom, asynchronous support via Slack, and an incredibly supportive community of newcomers to Wikipedia.

The Wiki Education program staff were incredible in walking us through the steps to edit existing biography pages of women in STEM or to create new ones. We were asked to work on two biography pages, but I decided that it was “now or never” to take that deep dive into Wikipedia. I started by editing the page for Diana Josephson, the first female leader of NOAA. Then, I sought to create new Wikipedia pages, and finished three biographies by the end of the course: Rana Fine, Deborah Kelley, and Karen Von Damm. All three are oceanographers and AGU Fellows (as AGU Fellows, they meet the Wikipedia notability guidelines (academics)).

As I was completing the 500 Women Scientists Wiki Scholars program (you can see from the project dashboard that our 20 participants added over 38,000 words!), a one-day opportunity to add to Wikipedia was organized for June 10 on editWikipedia4BlackLives. I added content to the Wikipedia page for RADM Evelyn J. Fields (NOAA Corps), the first woman and first African American to head the NOAA Corps.

Then on June 25, the Smithsonian National Museum of Natural History led a “virtual micro-crowdsourcing event” for Adding Women in Science to Wikipedia. Just like the previous event, there was a Wikipedia page for the event with a suggested list of bio pages to contribute to. I made additions to the page of botanist Velva E. Rudd.

And the opportunities have continued! I recently participated in the SACNAS/500 Women Scientists Edit-A-Thon on July 12, where I contributed to the biography page of physical oceanographer Vanesa Magar Brunner. I also served as a Wikipedia “expert” in one of the breakout rooms to help others with their edits. I am far from an expert, but it was great knowing that I had learned enough in the 500 Women Scientists Wiki Scholars program to be able to help others make contributions to this effort. The dashboard for this project shows there were 140 editors that came together from across the globe, adding over 53,000 words and creating 15 new bio pages!

Before I wrap up this post, I want to jump back to the work I did as part of the 500 Women Scientists Wiki Scholars program. My intent for my first-ever Wikipedia page was to create one to recognize a woman from where I went to graduate school, University of Miami’s Rosenstiel School of Marine and Atmospheric Science. I never had the opportunity to be in a class or interact with Rana Fine, and as a student was only aware of a small part of the vast contributions she has made to the discipline. I also saw that her name was on the list of Wikipedia Women in Red/Geoscience AGU Fellows page (“red” because they do not have a page yet, so the hyperlink is not valid). This is where I saw the names of the other two women I created biography pages for. It has been incredibly humbling and amazing to learn about the numerous accomplishments of these female trailblazers in oceanography, and to do my part to make sure others learn about these individuals as well.

There are reasons and opportunities for editing Wikipedia pages, such as those listed in this EOS article. I feel so fortunate I had the opportunity to be selected for the 500 Women Scientists Wiki Scholars program and to be a part of a community committed to increasing and improving the representation of female scientists. My cohort and I still connect in our Slack channel and share our continued contributions and dates of future Edit-a-Thons. I look forward to discovering additional ways I can improve the biographies of women on Wikipedia and how I can mentor others along the way.

Interested in taking a course like the one Laura participated in? Visit learn.wikiedu.org to see current course offerings.

Header/thumbnail image courtesy of Laura Guertin, CC BY-SA 4.0 via Wikimedia Commons

Recently, the Wikimedia Foundation began a fundraising campaign on Wikipedia in India, inviting anyone who relies on Wikipedia to support its future. Banners are appearing on the English Wikipedia in India, asking readers to consider contributing with a donation.

The banners have generated a lot of conversation online. We wanted to address some of the comments from our Indian users about why we are running the campaign and how donations to Wikipedia are used. Wikipedia is not a commercial website, and we are not driven by profit or advertising incentives. We are a charitable organization supported by the people who read Wikipedia. Our mission is to ensure that everyone, everywhere, can share in and access free knowledge.

India Fundraising Banner

Anyone can edit Wikipedia, and in the nearly 20 years since Wikipedia’s founding, millions of volunteer editors of diverse backgrounds and beliefs from all over the world have contributed to its articles. Wikipedia is powered by more than 250,000 global volunteers every month; it spans more than 50 million articles across nearly 300 languages.

Readers in India visit Wikipedia more than 750 million times each month, the fifth highest number of views from any country. Not only do Indians access Wikipedia in large numbers, Indian Wikipedia editors are integral and valued contributors to the encyclopedia, which is available in 23 of the languages spoken across India. In recent months, Indian volunteer contributors have ensured that neutral, reliable information about COVID-19 is available across Indic languages on Wikipedia. They have collaborated with global health partners so that information and new developments about the pandemic are well-sourced, accurate, and vetted by medical experts.

We ensure Wikipedia remains fast, secure, and reliable wherever you are in the world.

Reader donations are critical to supporting Wikipedia’s global presence. To meet the needs of readers in India and around the world, we operate an international technology infrastructure comparable to the world’s largest commercial websites. This includes hosting costs like keeping our servers running, as well as significant, ongoing engineering work to make sure Wikipedia is reliable, secure, loads quickly, and protects your privacy. As a living, constantly-changing project (with 350 edits per minute and roughly 6,700 pageviews per second), this has become increasingly important as we’ve seen more and more people turn to Wikipedia as a resource during the COVID-19 pandemic.

Donations also allow us to dedicate engineering resources to ensure that you can access Wikipedia in your preferred language, on your preferred device, no matter where you are in the world — from a dial-up modem to a brand new smartphone. Most major websites support an average of 50-100 languages — Wikipedia supports roughly 300 languages, a number that grows every year. We also use donations to reinforce volunteer efforts in ensuring information on Wikipedia is neutral, accurate, and well-sourced, working with volunteer contributors to build tools that protect against vandalism and help identify unsourced information.

Altogether, over 250 people work in engineering and product development at the Wikimedia Foundation. They work directly with our servers; manage traffic and uptime; maintain security and respond to attacks; design, develop, and release new features and experiences; support advanced editing tools and services; protect reader and editor privacy; improve accessibility for people with disabilities; and keep bandwidth costs affordable for readers. That works out to only one employee for every four million monthly readers of Wikipedia! Compared to other top websites with thousands of engineers, our mandate is to do a lot, with a little.

We support community-led projects to set knowledge free

We recognize that some of the best ideas to set knowledge free come from people around the world who use and contribute to Wikipedia and its sister free knowledge projects. At the Foundation, we collaborate with Wikimedia volunteers around the globe to support their ideas and help them bring more free knowledge to the world. In India, the Wikimedia Foundation has provided more than 2 million USD in funding to free knowledge since 2010, through grants to individuals, groups, and organizations, across 10 language communities. Support has helped Indian volunteers preserve and digitize historical sources, introduce them to new skills, and promote knowledge equity.

Last year, the Foundation issued more than 8 million USD in funding, or 7% of the Foundation’s annual budget, through grants to Wikimedia community members, affiliates, and nonprofit organizations around the world. This grantmaking is done in large part through participatory decision making processes, with Wikimedia volunteers serving on grant committees to discuss and provide recommendations on grant proposals. This funding supports the addition of new knowledge to Wikipedia and the other Wikimedia project sites through edit-a-thons, events, and initiatives that work to fill knowledge gaps and increase diversity on our sites.

We advocate for global policies that protect and advance access to information

The Foundation’s policy and legal efforts help ensure that everyone has the right to access, share, and create knowledge, while defending our volunteers from threat of reprisal, and upholding our commitment to free expression and open knowledge. We advocate for free licenses and open source software and work to make sure that copyright laws are built and reformed so that people can share and use knowledge more broadly. We also fight against censorship and protect the right of everyone to speak and learn freely. Support for this work is vital to giving you and users everywhere equal access to Wikimedia projects.

Your loyalty and donations ensure free knowledge can thrive

Wikipedia and Wikimedia projects belong to everyone—they are built for and by you. From readers to editors, we all have a stake in preserving and telling the stories of our history, our culture, and the intriguing and notorious people who have shaped our world. We recognize that not everyone has the ability or means to give. All that we ask is that you continue to seek us out as the world’s largest free knowledge resource.

For those who can support us, your donations will help continue to sustain the systems that make Wikipedia possible, and ensure the free knowledge movement can grow and thrive. We know not everyone can afford to support this work — which is exactly why Wikipedia exists. Our mission is to make sure free knowledge is available to the world, for everyone, everywhere.

We’re able to make this commitment thanks to the tremendous generosity of past and present donors, and the incredible work of the global Wikimedia volunteer communities. But our work is not done. There is so much more knowledge in the world, and so many more people to reach. To fulfill our mission to create a world in which every human being can freely share in the sum of all knowledge, we need to meet the new challenges of our time. If you value this work, please support it. Visit donate.wikimedia.org to make a contribution today.

As the COVID-19 pandemic rages across the United States, residents are seeking information about their local governments’ responses. Local newspapers are often a great source for the most recent news, but it’s hard to get a big picture of the pandemic’s impact on states, cities, and regions from reading a daily newspaper. Wikipedia, however, provides that overview — assuming you live in a region where Wikipedia’s volunteers have expanded the article.

Sadly, however, that’s not every region of the country. That’s why Wiki Education launched a series of courses in our Wiki Scholars program devoted to improving the quality of content on articles related to state and regional responses to COVID-19. To date, we’ve wrapped up two courses, have a third ongoing, and are actively recruiting for more. Course fees for these courses have been paid by a generous sponsor who is supporting the improvement of COVID-19-related articles on Wikipedia.

The two courses that have wrapped up demonstrate the benefits of the Wiki Scholars model. Wiki Education staff recruits experts in public policy, political science, journalism, and other related topic areas to take a 6-week course where we teach the participants how to edit Wikipedia articles related to the pandemic. In our first two courses, 36 subject matter experts have added content to articles that have been viewed millions of times.

Many of the participants improved the “timeline” sections of state articles, adding the daily and weekly updates, and most also improved other sections in articles. One participant wrote a large section in Maine’s article on the impact to higher education in the state. The Arizona article has a section on epidemiology and public health responses written nearly entirely by a participant in one of our courses. The South Carolina article now has a section on the epidemiology and public health response, as well as impacts to K-12 and higher education schools and the economy.

Our courses also offered an opportunity for participants to address Wikipedia’s equity gaps in this content area. One participant added a section on the pandemic’s impact on the Northern Arapaho tribe to the Wyoming article. Another noted the Navajo Nation, which at the time had a higher per capita positive rate than any U.S. state, didn’t have an article about COVID-19, so our participants created one. Another participant added a section about New Mexico’s Navajo Nation to the New Mexico article.

A participant from our course wrote most of the article about North Dakota’s response, but once they finished the course, the article edits stopped as well. These courses have demonstrated that it takes more than just an individual or even a group of individuals to keep these articles up to date. Articles like this desperately need regular edits; that’s why we’re offering more courses. The work our past participants have done has demonstrated the value of these improvements; now, we need to do more.

If you’re interested in learning how to improve Wikipedia’s articles related to COVID-19, we are actively accepting applicants for our next course.

Header/thumbnail image of COVID-19 testing in Arizona by User:Prim8acs, a participant in one of our courses, CC BY-SA 4.0 via Wikimedia Commons

So why did I go back to higher education after all?

00:00, Tuesday, 04 2020 August UTC



The need for a higher education degree is hotly debated within the tech industry: not a day goes by without tweets on the subject showing up in our timelines. For that reason, and also to record my reflections on higher education as a whole, I decided to share the decision-making process that led me to re-enroll in higher education, in an undergraduate program focused on computing.

Where I started

At the beginning of my career in tech, I was a Mechanical Engineering undergraduate with a strong fondness for computing. As I recount in My journey to Outreachy, or How I learned to stop worrying and start contributing, my involvement with technology up to my first internship had never been total, being limited to extension courses or volunteer hours in projects or events. Being selected as an Outreachy intern in the Wikimedia community in 2018, and having such a positive experience that I managed to line up a new opportunity in the field right after my internship ended, confirmed something I had suspected for some time: I have a great aptitude for working in the field, and studying Mechanical Engineering had become a major limitation.

I attended only one semester of the degree before suspending my enrollment, and that allowed me to realign my expectations while progressing in my professional career: I had the opportunity to travel to several cities in Brazil and around the world, I was mentored by the Mozilla community, I joined the Outreachy team… And I decided to take the ENEM (Exame Nacional do Ensino Médio, the National High School Exam and the only admission method that my state's federal university adopts across all modalities) again.

You're doing so well. Why a degree?

Someone's professional path ends up being a big mix of competence, luck, and privilege, in quite different proportions in each case. Earning a higher education degree lets me increase the “competence” factor and, as a bonus, gives me some degree of security should I decide to leave Brazil. Other benefits, such as some involvement with academic research or connections in university circles, are interesting factors but did not influence my decision to re-enroll in higher education.

Still on the subject of privilege, it is important to stress that I am quite privileged to be able to make this choice (and to be able to enter and remain in Brazilian public higher education). It was hard for me to take a high-school-level exam after so many years without contact with the material covered, but my performance, while not brilliant, indisputably proves that the solid foundation I got in high school at a private school persists.

The good

I started my undergraduate degree in Information Systems in 2019, and today I have completed 21% of the program thanks to credit transfer processes. There is a synergy between my professional life and my academic life that lets me understand certain content with greater depth and maturity, while I can offer an interesting perspective to my classmates and professors. In fact, re-entering higher education in your mid-twenties is quite different from entering college at the end of adolescence: I feel I have more serenity to solve problems and face challenges.

Having built a career before entering a related undergraduate program also removed the pressure and uncertainty about how I would enter the job market after graduating. I no longer feel that my success depends on a higher education degree, and that even allows me to be more assertive and vocal at times when I notice some injustice or misconduct.

The in-between

Despite having classes in only one shift (evenings), pursuing a degree still demands a certain level of dedication and responsibility. Taking on roles in several organizations and projects at once is not a good idea: doing so will lead you to burnout. This may reduce your prospects for pay or for involvement with fantastic things in the short term, but I see a higher education degree as a good long-term investment that will eventually pay for itself.

The bad

As I mentioned a few paragraphs ago, you will face some unpleasant situations in higher education, and that will inevitably end up affecting your mental health. To be quite honest, there is not a semester in which I don't think about giving up on the idea of having a degree, despite the incentives I cited a moment ago. The power dynamics in a university end up affecting your day-to-day life as a student, and I often feel that the university “suffocates” my creativity in the name of something I'm not even sure coordinators and program-structuring committees know how to name.

There is also a strong desire to “meet the demands of the market” without that being accompanied by a humanistic education: discussions about the social impact of technology are often seen as a mere detail. Rarely have I witnessed a professor touch on subjects such as working conditions and the role of technology in reinforcing structures that do not make the world better.¹

Should I consider following the same path?

I believe everyone should make decisions about their career in an informed and voluntary way, so I encourage you to question the usefulness of a higher education degree in your career plans. Remember that, as with every decision in your career, pursuing a degree will bring you some benefits and force you to give some things up. Consider where you are in life and see whether this is a choice that makes sense in the grand scheme of your professional path.


  1. One of the rare occasions happened during a first-semester course: a professor published an online form asking why we had decided to enter Information Systems. On receiving several answers related to “making the world better”, he replied: “Technology is not such an acclaimed field because it ‘makes the world better’; it is just an extremely profitable field.”



Promotion

III Workshop ADAs

I was invited to give a short workshop at the III Workshop ADAs, an annual event run and organized by an initiative focused on women in technology called Projeto ADAs. It consisted of teaching viewers how Git and GitHub work, and giving them an idea of what a contribution path looks like. In addition to teaching university students about open source in a more practical way, I recommended Outreachy alongside Google Summer of Code and Google Season of Docs as paid opportunities they should consider applying to, and was interviewed by an organizer about my career in open source. That workshop was streamed live on July 4th, and it’s still available on their channel. It was well-received among students and professors.

Backstage

Longitudinal study

Last month I explored our longitudinal study to gather answers related to a few communities of interest we were evaluating. This month I discussed a few strategies to review and analyze all answers of all communities with Sage. We agreed that:

  • The most telling aspect of an answer is often what a former intern didn’t say about their internship, their mentors or their community.
  • We should also keep in mind the history of Outreachy itself, and read and analyze all answers considering that context.

Reading those answers takes quite a while, and it’s often a very exhausting task. With a bit of automation magic I was able to convert the .csv file to a .txt that structures all answers in a more readable format, but I always double check to make sure no important info is left behind. I had hopes it would make an impact on the way we organize things in our next round, but a more realistic timeline is delivering an in-depth report by September.
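For readers curious about what that kind of conversion can look like, here is a minimal sketch in Python. It is not the script actually used for the study: the function name, the "Response N" layout, and the sample column names are all assumptions made for illustration. It turns each CSV row into a labelled plain-text block and keeps empty answers visible, so nothing disappears silently.

```python
import csv
import io


def csv_to_readable_text(csv_text: str) -> str:
    """Turn CSV survey rows into plain-text blocks, one block per response.

    Hypothetical helper for illustration; the layout is an assumption.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    blocks = []
    for number, row in enumerate(reader, start=1):
        lines = [f"Response {number}"]
        for question, answer in row.items():
            # DictReader collects extra, unnamed columns in a list under None.
            if isinstance(answer, list):
                answer = "; ".join(answer)
            label = question if question is not None else "(unnamed columns)"
            # Keep empty answers visible instead of dropping them.
            shown = (answer or "").strip() or "(no answer)"
            lines.append(f"{label}:")
            lines.append(f"    {shown}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Made-up sample data; the real survey columns are not shown in this post.
    sample = (
        "community,what_went_well\n"
        'Example Community,"Weekly check-ins, and patient mentors"\n'
        "Another Community,\n"
    )
    print(csv_to_readable_text(sample))
```

Using `csv.DictReader` rather than splitting on commas matters here, because free-text answers often contain commas and line breaks inside quoted fields. A manual check of the output against the original file is still worthwhile.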

Impact of the COVID-19 pandemic on Brazilian academic calendars

The December-March round is extremely popular with Brazilian students. With the pandemic suspending most academic calendars in Brazil, Sage and I agreed to preemptively check the dates of all public Brazilian universities that had more than one candidate in the last December-March round, to better understand what challenges we would face when defining the dates of the selection process and the internship itself. A surprising number of universities haven’t defined their policies quite yet, but all of our universities of interest published an official document stating at least the most important dates related to all 2020 terms (start and end dates).

Reading interviews and news, I expect this pandemic to impact all academic calendars in the world for at least two years (or four rounds).

Policy for students from the Federal University of Goiás

I was invited by an Instituto de Informática professor to participate in the talks of creating a hackerspace in the Federal University of Goiás. We discussed the possibility of creating some kind of policy within the hackerspace that would possibly allow Federal University of Goiás students to use their Outreachy hours as credits or even as mandatory internship hours.

The use of Outreachy hours to comply with mandatory internship hours is quite controversial, as Brazilian law has a really strict definition of what truly counts as an internship. In my conversations with FIEG last year, they told me that nationwide promotion of the program should frame it as a “paid mentorship” rather than a “paid internship”. This particular professor, however, has managed to create a few policies (at first with elective classes) to help students use Google Summer of Code hours as academic credits, and he agreed that a hackerspace may be able to provide the structure we need to help students validate their hours.

I believe this particular initiative has more chances to thrive than my failed attempt at creating a federal policy as I have much more leverage as an undergraduate student at my university than I had with an external organization such as FIEG.

Tech News issue #32, 2020 (August 3, 2020)

00:00, Monday, 03 2020 August UTC

weeklyOSM 523

11:14, Sunday, 02 2020 August UTC

21/07/2020-27/07/2020

lead picture

Civil Protection Portugal uses OSM – with attribution 😉 1 | © Civil Protection Portugal | Map data © OpenStreetMap contributors

Breaking news

  • Dorothea reminds us to celebrate the 16th anniversary of OSM on 8 August. She provides some ideas on how that can be done!

About us

  • Happy Birthday OSM-Wochennotiz! It was ten years ago this week that the first issue (de) (automatic translation) of the Wochennotiz was published.

Mapping

  • Matthew Woehlke has proposed the new tag sport=four_square for a sport called Four square.
  • Michael Montani’s proposal to introduce a natural=bare_soil tag for ‘an area covered by soil, without any vegetation’ is currently open for voting until 7 August.
  • Following the British Geospatial Commission’s announcement that unique identifiers for addresses and streets would become available as open data (we reported earlier), proposals have been produced for ref:GB:uprn (unique property reference number) and ref:GB:usrn (unique street reference number). Discussion has taken place on the Talk-GB and Tagging mailing lists.
  • JesseFW provided us with an explanation as to why the coastlines on Carto hadn’t been updated since January 2020. It seems that the coastline update was another victim of the Río de la Plata edit war (which we reported on earlier).
  • User mahdi1234 published a guide for beginners on visualising changes in OSM over time. He shows in detail how to create a time lapse with OSM data.

Community

  • Christoph Hormann objects to the framing of craft mappers in OpenStreetMap as conservatives opposed to change. He feels that this is part of a new narrative being communicated in OSMF politics; that is, the need for change in OpenStreetMap, and craft mappers’ opposition to change.
  • On the Geomob podcast Steven Feldman chats with recent FOSS4GUK keynote speaker María Arias de Reyna, a senior software engineer at Red Hat and former President of the Open Source Geospatial Foundation. The episode deals with María’s current work, and her recent talk at FOSS4GUK, but also imposter syndrome, and science fiction.
  • Łukasz recounts their experience of two recent interactions with CartONG, one involving the development of a tagging schema for refugee camps and the other the import of a UNHCR refugee camp dataset. Léonie Miège, from CartONG, responded in a blog comment.
  • Richard Fairhurst has written a new guide for data owners wishing to contribute to OSM. It is an output of cooperation with two local councils in the UK, which were recently funded by the Open Data Institute to investigate using and contributing to crowdsourced open map data such as OSM.
  • OpenStreetMap US has published its July 2020 newsletter.
  • Igor Eliezer has made a video showing how the 3D modelling of the Museu Paulista (São Paulo, Brazil) was re-worked in OpenStreetMap using the JOSM editor. The 3D preview at the beginning and end of the video is from the F4Map website; the preview shown during modelling is from Kendzi3D within the JOSM editor.

Imports

  • Alex Hennings has reviewed Facebook and ESRI’s proposed ‘not-an-import’ imports (we reported earlier) of ArcGIS datasets through RapiD or the JOSM MapWithAI plugin and found them wanting. Of particular concern was the lack of solicited community review on the imports-us mailing list.

OpenStreetMap Foundation

  • The OSMF board would like to consult the community on its hiring plans for a Senior Site Reliability Engineer. This is the first position based on the hiring framework which osmf-talk discussed a few days ago.

Events

  • Proceedings of the Academic Track at the State of the Map 2020 have been published.

Humanitarian OSM

  • HOT is conducting an online survey of people who have used RapiD to find out what their experience was. The data will be used to understand how RapiD could be made more accessible and usable for a variety of users.

Maps

  • Nuno Caldeira congratulated, through Twitter, the Portuguese National Emergency and Civil Protection Authority for using OpenStreetMap data, and correct attribution, in a tweet (pt) about a large forest fire that occurred last week in Portugal.
  • Taiwan, a nation in the Far East with special political status, has lots of outlying islands. Dadan Island in Kinmen is one of the most remote. MG, an Italian mapper, shared on the OSM Taiwan Telegram channel the Kinmen information website he created, The Kinmen Rising Project, which shows photos from his journey with OSM as the base map. Of course, he also mapped a lot of POIs on the island.

Software

  • A research paper analyses the growing amount of freely available spatio-temporal information (such as aerial imagery) to support and guide mappers in their work. Artificial neural networks identify regions of interest where OSM is likely to require updating.
  • Venkanna Babu Guthula has released Label-Pixels, a tool for semantic segmentation of remote sensing images using fully convolutional networks (FCNs), designed for extracting road networks from remote sensing imagery.

Releases

  • Sarah Hoffmann announced release 1.3.0 of osm2pgsql with the addition of the (still experimental) new flex output. Jochen Topf, the main contributor for this release, explained how this gives more flexibility when exporting data from OSM to PostgreSQL.
  • The iD editor was updated and now has touch support, so it can be used on tablets (smartphone sized screens aren’t fully supported yet). Other highlights are integrated quality checks and multi-selection editing.
  • With the release of the latest version of iD, the ‘locator overlay’, a semi-transparent overlay when zoomed out, has been rebuilt. Via the OpenStreetMap editor-layer-index, the new overlay is now available on OpenStreetMap.org, and soon on the HOT Tasking Manager and other instances of iD.

Did you know …

  • … Finde.cash displays banks and ATMs with the respective ATM networks on a map? It also offers route planning by foot, bicycle or public transport, and four background options including OpenCycleMap. Missing ATMs can be inserted directly. The map is worldwide but the menu is only available in German.
  • … MyOSMatic, the free-of-charge web service that generates city maps from OSM data, available print-ready in PNG, PDF and SVG? Menus are available in 25 languages.
  • … the ‘OSM Quality Ranking’ (Beta) assesses and ranks 51 US cities by OSM data quality, checking geometry and tagging for streets, roads, and relations?

Other “geo” things

  • Brooklyn Historical Society’s map collection includes over 1,500 digitised historical maps spanning the seventeenth century to the present.
  • Nathanael Peterlini examined (de) (automatic translation) difficulties cartographers face when trying to please all of their users’ political views. They look at the cases of Kosovo and Palestine and how they are treated by Apple, Google, and OSM.
  • Garmin has been the victim of a ransomware attack. As a result, many of their online services were interrupted or are still down.
  • An update to Google Maps has allowed docked bike share riders in cities such as Chicago, Montreal and London to see end-to-end walking and cycling directions for their journey integrated with bike and dock availability. Cities Today gave some background to the new service.
  • Tagesspiegel interviewed 21,000 people about what scares them on the street and what Berlin’s bike paths should look like in the future. The results are explained (de) (automatic translation) with a series of graphics.

Upcoming Events

Where | What | When | Country
London | London Missing Maps Mapathon (ONLINE) | 2020-08-04 | uk
Mannheim | Mannheimer Mapathons – Treffen im Luisenpark | 2020-08-04 | germany
Stuttgart | Stuttgarter Stammtisch | 2020-08-05 | germany
San José | Civic Hack & Map Night (online) | 2020-08-06 | united states
Taipei | OSM x Wikidata #19 | 2020-08-10 | taiwan
Hamburg | Hamburger Mappertreffen | 2020-08-11 | germany
Munich | Münchner Stammtisch | 2020-08-12 | germany
Berlin | 146. Berlin-Brandenburg Stammtisch | 2020-08-14 | germany
Zurich | 120. Mapping-Party/OSM Meetup Zurich | 2020-08-15 | switzerland
Cologne Bonn Airport | 130. Bonner OSM-Stammtisch (Online) | 2020-08-18 | germany
Lüneburg | Lüneburger Mappertreffen | 2020-08-18 | germany
Cologne | Köln Stammtisch ggf. ONLINE | 2020-08-19 | germany
Kandy | 2020 State of the Map Asia | 2020-10-31 to 2020-11-01 | sri lanka

Note: If you would like to see your event here, please add it to the calendar. Only events entered there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Anne Ghisla, MatthiasMatthias, Nakaner, Nordpfeil, PierZen, Polyglot, Rogehm, SK53, TheSwavu, derFred, richter_fn.

Commissioners for Tanzanian Regions

12:44, Saturday, 01 2020 August UTC
Aggrey Mwanri is one of the 31 commissioners for a Tanzanian Region. The Tabora Region has a population of 2,291,623 inhabitants. For most of the 31 regions we know at least one commissioner and only for the Arusha Region we know "them all". 

I have been adding information about these Regional Commissioners, and from a quality point of view this is a step in the right direction. Slowly but surely, we are learning the structures and politicians of more African countries.

When you compare African countries with "Western" countries, such structures are comparable. This makes it possible to show the extent to which the data in Wikidata does not represent the African reality.

It is more than likely that there are lists of the data that is currently missing. These lists help us provide the bare bones of what it takes to know about African countries. 

So who are the data wizards who can show where our data is lacking? Where are the lists that enable the people who know tools like OpenRefine to fill in the gaps? Who has the pictures, so that a Wikipedia article for Mr Mwanri can be illustrated?
Thanks,
       GerardM

Monthly Report, May 2020

22:25, Thursday, 30 2020 July UTC

Highlights

  • In May, we launched a Wiki Scientists course in partnership with 500 Women Scientists, helping 20 members of 500 Women Scientists learn how to expand Wikipedia’s biographies of women in STEM. Thanks to high demand from their members, we have continued searching for additional funding to support more women scientists as they join the Wikipedia community.
  • Our first two Wikidata courses of 2020 just wrapped up. The Beginner course had 10 participants who created 67 new Wikidata items, made more than 1,220 edits to Wikidata, and added 74 references to statements, improving the data quality for all of those claims. The 10 participants of the Intermediate course created 205 new items and edited over 5,100 existing ones. One of the participants was working on a project for the Metropolitan Museum of Art and uploaded over 5,000 images to Wikimedia Commons which can be used on Wikidata. Additionally, this course had several participants who are working on the LINCs project. This project aims to connect humanities research in Canada through linked data. The perspectives these individuals brought to this course demonstrated how Wikidata can influence other large scale linked data initiatives and, in turn, how these initiatives can influence Wikidata. To see some of the outstanding work these course participants did, follow this link. This record number of items edited is nothing short of inspiring. We hope this indicates that more institutions are willing to invest in Wikidata or that more driven participants are finding their way to our course. Either way this is a large body of high quality work that will benefit Wikidata and the larger linked data community.

Programs

Wikipedia Student Program

Status of the Wikipedia Student Program for Spring 2020 in numbers, as of May 31:

  • 409 courses were in progress (268, or 65%, were led by returning instructors).
  • 7,496 student editors were enrolled.
  • 55% of students were up-to-date with their assigned training modules.
  • Students edited 6,200 articles, created 556 new entries, and added 5 million words and 54,100 references.

While a handful of spring quarter courses are still working on their Wikipedia assignments, May saw most of our courses wrap up for the term. Spring 2020 was a time of upheaval for our instructors and students as their courses abruptly switched to online platforms in the middle of the term. Despite these challenges, our students contributed 5 million words to Wikipedia, even tackling articles related to COVID-19.

Though a great deal of uncertainty surrounds the Fall 2020 term, Wikipedia Student Program Manager Helaine Blumenthal began to prepare for the following academic year. We are paying close attention to what institutions of higher education are planning as the pandemic unfolds and trying to adapt our materials to better serve our instructors and students in their changed classroom circumstances. Helaine hopes to look more deeply into best practices for online teaching and to assemble a robust list of library resources so our students can continue to access high quality sources despite being unable to physically go to their university libraries. 

Wikipedia Experts Shalor Toncray, Elysia Webb, and Ian Ramjohn were busy closing out courses and identifying the great work our students did this term.

Student work highlights:

On May 5, student work was featured on the main page of Wikipedia. A student improved an article about a moth species called the slender Scotch burnet. The article received nearly 1,300 views the day it was featured. Student work appeared on the main page of Wikipedia again on May 29, where it was viewed an impressive 7,500 times! Climate of Pluto was created by a student in Vincent Chevrier’s Planetary Atmospheres course at the University of Arkansas.

Gay and lesbian bars have long been a part of society. Some have needed to remain relatively secret in order to escape persecution while others have openly advertised their services to the local community. Daniel’s, which opened in late 1975, was one of the first lesbian bars in Spain and one of the first LGBT bars in Barcelona. Opened by María del Carmen Tobar, it originally was a bar and billiards room but expanded to have a dance hall. The bar attracted women from a wide variety of backgrounds including non-lesbian women. In the early years of the Spanish democratic transition the bar was accepted because its owner was well connected in the local government through her band-mate Daniela. Despite this, the police still occasionally raided the bar during its early years. Tobar played an active role in making Daniel’s the center of lesbian life in Barcelona, sponsoring sports teams and a theater group. The bar also sold feminist literature, including the magazine called Red de Amazonas. The bar later closed, but would be remembered in books and exhibits for its importance in the lesbian history of Spain. This article was expanded by a Colby College student in Dean Allbritton’s Queer Spain class, which sought to expand Wikipedia’s knowledge on LGBT history in Spain.

Many have heard of Amelia Earhart, but have you heard of Grace Muriel Earhart Morrissey? Morrissey was Amelia’s younger sister and a high school teacher, author, and activist. Morrissey taught at the high schools in Medford and Belmont, MA, and she remained an active member of the Medford community until her death. She spent decades documenting Amelia’s life and managing her legacy, devoting significant time to coordinating her sister’s posthumous affairs, setting up donations, marshaling information, and dealing with Amelia’s fans. Morrissey also spoke out against the speculations that arose in the wake of Amelia’s death. She denied, for example, that her sister died while on a spy mission, as some theorists have suggested in the past. She wrote two books about Amelia, Courage is the Price and Amelia, My Courageous Sister. This article was created during May by a student in Lisa Gulesserian’s Kindred Spirits class at Harvard University, who allowed Amelia’s sister to shine.

There are many Black men and women who fought against the injustices perpetrated against African-Americans seeking equal and fair treatment. Curlee Brown, Sr. is one such person who chose to challenge the inequality in the education system, as he launched a legal case that resulted in the integration of what would become the West Kentucky Community and Technical College. In 1950 Brown had attempted to enroll in the school, only to be rejected due to a then recently amended 1904 state law that prohibited desegregation in schools. He brought a lawsuit against the school and the U.S. District Court at Paducah ruled that the college must allow Brown and other Black applicants to enroll; however, the school fought against this. Their appeals were ultimately unsuccessful and the college was eventually integrated. For his tireless work with activism and the Paducah NAACP, Brown Sr. received multiple awards and honors, and to honor his legacy the Kentucky NAACP created the Curlee Brown Scholarship. The Paducah branch of the NAACP created the Curlee Brown Award, which they grant to individuals who have made a visible impact in the field of human rights. In 2010 Brown Sr. was inducted into the Kentucky Civil Rights Hall of Fame.

Another notable individual was Cyrus Field Adams, a Republican civil rights activist, author, teacher, newspaper manager and businessman. Adams fought a key battle in civil rights for African Americans. He used the variety of positions he held through his life, whether working for a newspaper, as a teacher, or at the Treasury, to advocate for civil rights. Later in life, after being appointed by Theodore Roosevelt to be the Assistant Register at the US Treasury, he used this platform to write a book titled The National Afro-American Council, Organized 1898: a history etc. In 1912, Adams decided to leave his position at the Treasury and join President Taft’s re-election campaign, as Taft himself had asked him to do. This was an attempt to get Adams out of the Treasury position, as Taft had promised that position to another African-American man who supported Taft. Taft lost this election, and when President Wilson took over he replaced every Republican who had worked for Taft, including Adams. In the years that followed, an investigation into the time Adams spent at the Treasury was launched to try to discredit his career. It’s thanks to University of Kentucky students in Nikki Brown’s African American History, 1865 to the Present class that we now have these articles.

Joanna Mary Boyce (7 December 1831 – 15 July 1861) was a British painter associated with the Pre-Raphaelite Brotherhood. She is also known by her married name as Mrs. H.T. Wells, or as Joanna Mary Wells. She produced multiple works with historical themes, as well as portraits and sketches, and authored art criticism responding to her contemporaries. Boyce first exhibited her artwork publicly in 1855 at the Royal Academy. Though Boyce exhibited two pieces, it was her painting Elgiva that won Boyce the admiration of such critics as John Ruskin and Ford Madox Brown. In it, Boyce depicted model Lizzie Ridley as a tragic heroine from Anglo-Saxon historical legend, possibly following the precedent of Pre-Raphaelite painter John Everett Millais who had depicted Elgiva eight years prior. Following her first exhibition, Boyce continued to pursue artistic excellence through extensive sketching and international art-viewing expeditions. She spent 1857 in Italy, and in December of that year married miniaturist Henry Tanworth Wells (later a Royal Academician) in Rome. Boyce used her time in Italy to work on paintings such as The Boys’ Crusade and La Veneziana, a portrait of a Venetian lady. In addition to her own artistic practice at this time, Boyce also continued a lifelong practice of seeking out and analyzing the artwork of her contemporaries. Boyce published some of this analysis as art criticism in the Saturday Review, wherein she lauded the “sincerity” and principles of the Pre-Raphaelite art movement, and noted the positive influence of John Ruskin on the English art world. At the time of her death, contemporaries remarked on Boyce’s talent as an artist: Dante Gabriel Rossetti described her as “a wonderfully gifted woman”, and another obituarist called her a genius. Later critics have observed that Boyce’s reputation was somewhat constrained by her early death, but her art has been highlighted in exhibitions up until the present day. 
Who do we have to thank for the expansion of this article? None other than a UW Madison student from Anna Simon’s Art Librarianship class!

The Rio Grande sucker is a freshwater fish species native to the American southwest. Like many fish species in the area, populations have declined as a consequence of land use change, habitat loss, environmental degradation and competition from non-native species. As a result of this, the Rio Grande sucker is considered endangered in Colorado and a “species of concern” in Arizona. But before a student in Derek Houston’s Biology 667 class created an article about it, there was no Wikipedia article about the Rio Grande sucker. Given the important role that Wikipedia articles serve as a starting point for research into a topic, the presence of this article might have an impact on regulators who are trying to manage this fish species.

Little things run the world — in particular, the microorganisms that make up most of the living things on Earth. Rhodobacter capsulatus is a type of purple bacteria, which are bacteria that are able to make their own food using photosynthesis, much like plants do, but purple bacteria use a purple molecule to capture light instead of the green pigments that plants use. Rhodobacter capsulatus is also able to make gene transfer agents, small packages of DNA that allow them to transfer genes to other bacteria, without using sex. Before a student in Kelly Bender’s Prokaryotic Diversity class started editing it, Wikipedia’s article about this species of bacteria was just a three-sentence stub which mostly talked about the Latin roots of its scientific name. The student editor was able to expand the article into something very informative, adding sections about its genomics, morphology, ecology and significance, among others. Other students in the class made similar improvements to the Pseudomonas stutzeri and Chlamydia felis articles.

Scholars & Scientists Program

Wikipedia

This month we launched a 6-week intensive course focused on improving Wikipedia’s coverage of COVID-19 pandemic information. Specifically, participants are focusing on state-specific articles. In the United States, many of the actions taken that affect people’s lives most happen at the state level, and out of a commitment to public knowledge on Wikipedia we decided to run a course at no cost to participants in order to shore up this vital content. The scope of the course was more narrow than usual, and the duration a bit shorter, which allowed Scholars & Scientists Program Manager Ryan McGrady to develop a custom curriculum to guide participants to maximize their impact in a relatively short period of time.

We still have a couple weeks left in the COVID-19 Wiki Scholars course, but participants are already doing some incredible work. Highlights include a significantly expanded section of the Maine article that focuses on the impact on education; a near tripling of the size of the Wyoming article, including a major update to the timeline, impact on the economy, impact on colleges, and effects on the Northern Arapaho tribe and Yellowstone; several updates to the Florida timeline; increasing the size of the North Dakota article from about 6,500 to 45,000 bytes; and the addition of a significant section on the impact on voting in the New York article. At the start of the course, Wikipedia already had articles on all 50 states, but one Wiki Scholar ran into a challenge: how should we cover the well-documented impact on the Navajo Nation, which has the highest per capita rate of infection in the country and covers parts of three states? The answer seems obvious in hindsight, but nobody had done it yet: to create a brand new article about the COVID-19 pandemic in the Navajo Nation. Thanks to that Wiki Scholar, the impact on this community is covered on Wikipedia.

We were also excited to launch a course in partnership with 500 Women Scientists, focused on improving Wikipedia’s coverage of women in science. We’re less than half way through the course at the end of the month, and participants are still developing their articles, but we already have several great examples of biographies created or improved:

  • Rana Fine, whose research concerns ocean circulation processes over time through use of chemical tracers and the connection to climate.
  • Abigail Thompson, a mathematician who specializes in knot theory and low-dimensional topology.
  • Rachel Green, a professor of molecular biology and genetics researching ribosomes and their function in translation.
  • Deborah Kelley, a marine biologist studying hydrothermal vents, active submarine volcanoes, and life in those areas of the deep ocean.

The Women in Red Wiki Scholars course we kicked off last month started to hit its stride in May. We still have a few weeks left to go, but here are some of the biographies of women Wiki Scholars have created or improved this month:

  • Mary Carson Breckinridge (1881-1965), American nurse midwife who founded the Frontier Nursing Service.
  • Jane Sharp (c. 1641 – ?), an English midwife who wrote The Midwives Book: or the Whole Art of Midwifery Discovered in 1671.
  • Anne de Graville (c. 1490 – c. 1540), French Renaissance poet, translator, book collector, and lady-in-waiting to Queen Claude of France.
  • Madeleine Brès (1842-1921), the first French woman to obtain a medical degree.
  • Montserrat Calleja Gómez, Spanish physicist who specializes in bionanomechanics.
  • Anne-Marie Lagrange, French astrophysicist whose work focuses on extrasolar planetary systems.
  • Natalie Roe, experimental particle physicist and observational cosmologist who is the Director of the Physics Division at Lawrence Berkeley National Laboratory.

Last month we highlighted some of the articles improved through the course we ran with the American Physical Society. It wrapped up early this month, but not before participants added a few more articles to their list of pages created or improved:

  • Peter F. Green, materials scientist and Deputy Laboratory Director for Science and Technology at the National Renewable Energy Laboratory.
  • Tulika Bose, physicist at the University of Wisconsin-Madison whose research focuses on developing triggers for experimental searches of new phenomena in high energy physics.
  • Henry T. Brown, chemical engineer who was the first African American director of the American Institute of Chemical Engineers in 1983.
  • Rayleigh theorem for eigenvalues, a concept in mathematics concerning the behavior of the solutions of an eigenvalue equation as the number of basis functions employed in its resolution increases.

We also finished our third course focused on family planning topics in partnership with the Society of Family Planning. As with previous courses, participants improved several high-impact articles on abortion, contraception, and related topics. Among the improvements this month were: the addition of a section on teleabortion to the telehealth article; updates to a wide range of state-specific abortion articles, like Abortion in New York and Abortion in Guam; extensive edits to the Title X article, the only federal grant program dedicated solely to providing individuals with comprehensive family planning and related preventative health services; and a variety of improvements to the pregnancy test article.

Wikidata

Our first two Wikidata courses of 2020 just wrapped up. These two courses completed a staggering amount of work in six short weeks.

  • Beginner: This course had 10 participants who created 67 new Wikidata items, made more than 1,220 edits to Wikidata, and added 74 references to statements, improving the data quality for all of those claims. The participants in this course were engaged and excited about the course material. We were lucky to host several individuals from City College, part of CUNY, in New York. Having multiple perspectives from one institution emphasized just how many applications Wikidata has. We also had a participant work on a collection of theater posters. Take a look at this well-modeled item for Yosef Bulof in Gidon. One item, William Marshal, 1st Earl of Pembroke, received more than 20 new references, bolstering the accuracy of its statements. The interests of this group varied greatly. Follow this link to see a complete list of items they edited.
  • Intermediate: Ten editors participated in this course. They created 205 new items and edited over 5,100 existing ones. One of the participants was working on a project for the Metropolitan Museum of Art and uploaded over 5,000 images to Wikimedia Commons which can be used on Wikidata. Additionally, this course had several participants who are working on the LINCs project. This project aims to connect humanities research in Canada through linked data. The perspectives these individuals brought to this course demonstrated how Wikidata can influence other large scale linked data initiatives and, in turn, how these initiatives can influence Wikidata. To see some of the outstanding work these course participants did, follow this link.

This record number of items edited is nothing short of inspiring. We hope this indicates that more institutions are willing to invest in Wikidata or that more driven participants are finding their way to our course. Either way this is a large body of high quality work that will benefit Wikidata and the larger linked data community. 

Advancement

Partnerships

In May, we launched a Wiki Scientists course in partnership with 500 Women Scientists, helping 20 members of 500 Women Scientists learn how to expand Wikipedia’s biographies of women in STEM. Thanks to high demand from their members, we have continued searching for additional funding to support more women scientists as they join the Wikipedia community.

We spent some time in May reworking the Scholars & Scientists end-of-course survey for participants, making sure we continue learning about their experiences in the course, their motivations for participating, and how they assess their learning outcomes. We deployed the new survey to Wiki Scientists who completed the American Physical Society course, and we’re excited to use the information to demonstrate the value of working on Wikipedia to other employers and organizations.

Communications

Attabey Rodríguez Benítez has tips for folks stuck at home: learn how to add photos to Wikimedia Commons like she did in our Wiki Scientist course!

Dr. Lilly Eluvathingal learned how to add content to Wikipedia pages in her area of expertise through one of our Wiki Scientist courses. This month, she shared on our blog what she thought of the experience.

The Wikipedia page that Andrew Oh drastically improved achieved Good Article status after he continued to edit it beyond the end of his course. Read more about what he found so valuable about the experience.

Technology

In May, we turned our attention from our project to improve the user experience for students and instructors — which we had been iterating on through April — to the long-term foundations of the Dashboard. Google Summer of Code interns Amit Joki and Shashwat Kathuri, while not scheduled to officially start the ‘coding period’ until June, have already started making major strides to modernize the Dashboard’s JavaScript infrastructure by replacing deprecated libraries and features with more stable and well-supported alternatives. This work will accelerate into the summer, as Amit focuses on streamlining our JavaScript and reducing the amount of code that browsers need to download, while Shashwat develops a system to better keep track of system errors and data bottlenecks.

Finance & Administration

Total expenditures for the month of April were $174K, ($16K) under the budget of $190K. The Board was under by ($9K) because the Board Meeting moved from in-person to remote. Fundraising was over budget by +$6K due to a personnel change that created a need for consulting work (+$2K) and added employment costs (+$3K), plus +$1K in Indirect Costs. General & Administrative was over by +$11K due to a change in Indirect overhead allocation (+$6K), Professional Fees (+$4K), and Administrative Costs (+$1K). Programs were under by ($24K): Payroll ($5K), Travel ($8K), Professional Fees ($2K), Communications ($2K), and Indirect Costs ($7K).
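
For readers who want to check the arithmetic, the figures quoted above can be reconciled with a few lines of Python. This is only a sketch: the numbers are taken from the paragraph (in thousands of US dollars), while the variable names and the sign convention (negative for under budget) are ours.

```python
# Variances in thousands of USD, as quoted in the report
# (negative = under budget, positive = over budget).
budget = 190
actual = 174

variances = {
    "Board": -9,
    # consulting work + employment costs + indirect costs
    "Fundraising": 2 + 3 + 1,
    # indirect overhead allocation + professional fees + administrative costs
    "General & Administrative": 6 + 4 + 1,
    # payroll + travel + professional fees + communications + indirect costs
    "Programs": -(5 + 8 + 2 + 2 + 7),
}

total_variance = sum(variances.values())

for area, amount in variances.items():
    print(f"{area}: {amount:+d}K")
print(f"Sum of variances: {total_variance:+d}K")
print(f"Actual minus budget: {actual - budget:+d}K")
```

The per-area variances sum to the same figure as actual minus budget, so the breakdown is internally consistent.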

Office of the ED

Current priorities:

  • Finalizing the annual plan & budget for fiscal year 2020–21
  • Dealing with the effects of the COVID-19 pandemic on our organization

In May, Frank continued working on putting the annual plan & budget for next fiscal year together. Relying on a multitude of different data sources, as well as numerous conversations, he tried to come up with a realistic picture of how the COVID-19 pandemic and other destabilizing events in the United States could affect Wiki Education. In particular, his goal was to understand how institutional funders might move forward given that times of crisis always trigger a reaction from philanthropy that might threaten the survival of nonprofits which – like Wiki Education – depend on continuously unlocking new funding opportunities from institutional grantmakers. Frank ran through different scenarios and possible ways to mitigate a situation where grantmakers wouldn’t accept any new grantees for the foreseeable future, and where they would focus on protecting their endowments instead. 

After sending a first draft of the potential annual plan for 2020–21 to the board, Frank started extensive conversations with individual board members. He highlighted the extreme uncertainty that made coming up with the best path forward difficult, and he listened to how the board members assessed the situation. Discussions included projections about the situation in our country in general, about the possible reaction of the philanthropic sector, as well as effects on higher education and knowledge institutions like museums, archives, and libraries. All these conversations lasted through May and the board generously agreed to extend the timeline for the delivery of the final version of the annual plan in order to find the best solution for Wiki Education and the many millions of people being positively impacted by our organization’s work.

All code is built

20:57, Wednesday, 29 2020 July UTC

HEADER CAPTION: The head of the Statue of Liberty on exhibit at the Paris World's Fair, 1878. The statue was built in France ahead of time, shipped overseas in crates, and then assembled in New York. Image by Albert Fernique / public domain.

The process of mapping human-readable source code inputs to optimized, machine-readable outputs is called compiling or more generally, building. It's been a necessary part of software development since computers evolved past machine code. Even to serve the most abstract, high-level languages such as HTML and CSS, this build process is essential.

Just-in-time build steps

We build code all the time at Wikimedia. Every page request benefits from Less compilation, CSS and JavaScript minification, internationalization, URL mapping, and bundling build steps. All of this occurs at runtime through the ResourceLoader pipeline.

ResourceLoader's just-in-time build process is critical when key parameters vary on request. However, it has some notable limitations including:

  • Every just-in-time build step must be extremely performant, so fast that it can run on-the-fly, or our pages will load slowly. Additionally, sequential steps cannot be appended ad infinitum.
  • Effectively, ResourceLoader's just-in-time build steps can only use tools written in PHP. JavaScript execution is not possible.
  • Just-in-time build steps are less secure. They execute on production servers and serve content directly to the user. This eliminates the separation between development and runtime-only dependency trees, which can dramatically increase the attack surface, sometimes by orders of magnitude. Additionally, build outputs are shipped directly to the user without any opportunity for security review. When it comes to security, a just-in-time build step always strives to be as secure as an ahead-of-time build step that produces static outputs.
  • Just-in-time build steps are custom and complex. An ahead-of-time build step can easily be a one-liner that invokes standard tooling but the equivalent just-in-time build step, if one exists, is just as likely to be hundreds of lines of custom code. Historically, these custom steps have suffered from a low bus factor and received little attention beyond basic life support. Few engineers possess the abilities to write code of the caliber needed to add new build steps or change existing ones, which means the rest of Wikipedia and WMDE is blocked on their evolution. For example, we have been unable to keep pace with fundamental features like source map support (a formal request since 2013) or ES6 transpilation. In fact, there are laundry lists of missing features now standard elsewhere. The lack of standard functionality means that developing any code at Wikimedia is a completely different and far slower experience than the rest of the industry.
  • Just-in-time build step outputs have worse caching. The most advanced build step executed at runtime endeavors to have the same caching that comes out-of-the-box with an ahead-of-time build step: a plain file on disk.
CAPTION: No step in the pipeline can be delayed, and the longer the pipeline, the longer it takes to go from a nut to a new car. Image by unknown author / public domain.

Solving problems too big for just-in-time

Some problems are only solvable by just-in-time build steps. However, many solutions cannot meet the constraints of just-in-time build steps, so only a subset of all problems can be solved. This is a more general limitation of just-in-time build steps, not the ResourceLoader implementation. In practice, this means that developers cannot add a build step to the pipeline but are still left with their problem unsolved.

There must be an alternative. Our options include:

  1. Double down on building new features in ResourceLoader. This approach fails to address the fundamental limitations of all just-in-time build steps and may require reimplementing existing open-source solutions.
  2. Ship extra tooling to every user's browser and let them process it. Besides significantly increased bandwidth and computation costs that go against our mission to serve everyone, this isn't very eco-friendly, fails to solve many problems, leads to the laggy browsing experiences users so loathe on JavaScript-heavy pages, and doesn't scale far past polyfills.
  3. Replace ResourceLoader with industry standard tooling that has fewer constraints. This will require exploration, be expensive, and may have the same outcome as #1.
  4. Enhance ResourceLoader by building what we can ahead-of-time.

The first two options don't work. The third option doesn't sound like a good first choice. The fourth is the most conventional and proven solution.

Ahead-of-time build steps

Ahead-of-time build steps are usually what people think of when they refer to "building code." Most build problems that remain to be solved in Wikimedia only fit in the ahead-of-time space. As you might expect, we're using these enhancements all over the place already and can't live without them. Some examples include:

  • OOUI: Portions of this library are built with Grunt and a suite of packages from NPM for minification, uglification, and additional processing. The results are dozens of build products that are file-copied into Core manually.
  • Page Previews: This gem of a codebase is fully compiled from the latest JavaScript with Webpack. It serves about two billion virtual pageviews a month.
  • Wikibase: Wikibase uses ahead-of-time build tools, including Webpack, TypeScript, and a plethora of other standards, to serve the Wikidata communities.
  • MultimediaViewer: Commits to MultimediaViewer use ahead-of-time build steps to replace any human readable source SVGs with optimized, machine-readable outputs.
  • MediaWiki: Core uses a build step on every deployment. The process is called "a full scap." When the process fails, it's called "a full scapadapadoo."
  • MobileFrontend: All JavaScript in MobileFrontend, the heart of the mobile site, is built by Webpack. That's over 50% of all pageviews benefiting from an ahead-of-time build step using industry standard tooling.
  • Wikipedia for KaiOS: This Webpack-powered project uses a build step to serve a highly performant web app.
  • ContentTranslation: The glittering new ContentTranslation app uses the Vue CLI and standard tooling to generate the next-generation interfaces essential to serving contributors around the world. Put plainly, this is the kind of modern experience that would be impossible to build without modern tooling that leverages ahead-of-time build steps.
  • Wikipedia.org: Portals uses a build step to synchronize sister project statistics. I know someone who has a recurring task each week reminding him "it's build time." Although triggering the build step is person-powered, the outputs are what you would expect of an ahead-of-time build step: practical and project specific.
  • VisualEditor: VE is a sophisticated application that requires a build step. I don't know what this does exactly but I would guess it's solving the same kinds of problems everyone else has ahead-of-time.
  • And many more.

These ahead-of-time build steps are everywhere in Gruntfiles, Gulpfiles, Webpack configs, NPM package.json files, and shell scripts. Even if the Foundation mandated it today, we could never get rid of them.

Evolving the ResourceLoader pipeline with a new stage

Ahead-of-time build steps are the only solution for many problems, so it's fortunate they have such a proven track record of success both within and beyond the MediaWiki ecosystem. As everyone who is already using ahead-of-time build steps has discovered, they're the perfect complement to ResourceLoader's just-in-time build steps.

However, this is a problem at scale and it needs to be solved at scale. Informal developer builds work surprisingly well but aren't as efficient for developers as they could be. We need to extend the pipeline to include a pre-ResourceLoader stage. This stage is an ahead-of-time build step.
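The two-stage idea can be illustrated with a toy sketch. This is not ResourceLoader code: every function name, the placeholder syntax, and the sample messages below are hypothetical, chosen only to show an expensive step running once before deployment while a cheap step still runs per request for parameters that vary.

```python
# Toy two-stage pipeline; all names are hypothetical, not ResourceLoader API.

build_calls = 0  # counts how often the expensive step runs


def expensive_build(source: str) -> str:
    """Stand-in for a slow ahead-of-time tool (bundler, transpiler, optimizer)."""
    global build_calls
    build_calls += 1
    # Strip comment lines and surrounding whitespace as a toy "optimization".
    lines = [line.strip() for line in source.splitlines()]
    return "\n".join(line for line in lines if line and not line.startswith("//"))


def build_ahead_of_time(sources: dict[str, str]) -> dict[str, str]:
    """Stage 1: run once before deployment and keep the static outputs."""
    return {name: expensive_build(src) for name, src in sources.items()}


def serve_just_in_time(prebuilt: dict[str, str], name: str, lang: str) -> str:
    """Stage 2: cheap per-request step for parameters that vary on request."""
    messages = {"en": "Hello", "fr": "Bonjour"}  # assumed sample messages
    return prebuilt[name].replace("{{greeting}}", messages[lang])


sources = {"hello.js": "// greets the reader\n  console.log('{{greeting}}');\n"}
prebuilt = build_ahead_of_time(sources)

# Three requests are served, but the expensive step ran only once.
for lang in ("en", "fr", "en"):
    print(serve_just_in_time(prebuilt, "hello.js", lang))
print(build_calls)
```

In this sketch the per-request stage stays small and fast, while anything too slow or too complex for runtime moves into the stage that runs before deployment.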

CAPTION: The International Space Station was built on Earth in modules that were optimized for assembly and constructed in orbit. Similarly, ResourceLoader modules can be built before deployment and finally assembled in the user's browser. Image by NASA/Crew of STS-132 / public domain.

In conclusion:

  • ResourceLoader provides useful just-in-time build steps.
  • Many projects have requirements that cannot be solved at runtime. These real problems are only solvable by traditional ahead-of-time build steps.
  • Just-in-time and ahead-of-time build steps are already in use by and are for everyone, and we can't change that.
  • Ahead-of-time build steps often use standard tools but are highly project specific. These should not be centralized nor should they be constrained by artificial limitations. Per-project solution autonomy must be preserved.
  • Adding a pre-ResourceLoader stage can integrate neatly with the current ResourceLoader system by extending the pipeline to include these existing ahead-of-time workflows.

Above all, a build step means freedom. The freedom to succeed and the freedom to use the tool that's right for the job, not the rare tool that fits into a runtime-only pipeline.

Thanks to Jan Drewniak, Santhosh Thottingal, Daniel Cipoletti, Joe Walsh, Bernd Sitzmann, and Mónica Pinedo Bajo for reviewing and providing detailed feedback.

This post is also available on the Wikimedia Tech Blog.

By Stephen Niedzielski, Senior Software Engineer

About this post

Featured image credit: The head of the Statue of Liberty on exhibit at the Paris World’s Fair, 1878. The statue was built in France ahead of time, shipped overseas in crates, and then assembled in New York. Image by Albert Fernique / public domain.

This post was originally published on July 28, 2020 in the Wikimedia Phame Blog.

Production Excellence #22: June 2020

16:54, Tuesday, 28 2020 July UTC

How’d we do in our pursuit of operational excellence last month? Read on to find out!

📈 Month in review
  • 4 documented incidents in June. [1]
  • 37 new production errors were filed and 27 were closed. [2] [3]
  • 72 recent production errors still open (up from 68).
  • 203 total Wikimedia-prod-error tasks currently open (up from 192). [4]

For more about recent incidents, see Incident documentation on Wikitech, or Preventive measures in Phabricator.


📖 Outstanding errors

Breakdown of new errors reported in June that are still open today:

  1. (Needs owner) / Newsletter extension: Unexpected locking SELECT query. T253926
  2. (Needs owner) / FlaggedRevs extension: Unable to submit review of page due to bad fr_page_id record. T256296
  3. Editing team / MassMessage extension: Delivery fails due to system user conflict. T171003
  4. Parsing team / Parsoid: Pagebundle data unavailable due to a bad UTF-8 string. T236866
  5. Growth team / Recent changes: Update for ActiveUsers data failing due to deadlock. T255059
  6. Growth team / GrowthExperiments: Issue with question display on personal homepage. T255616
  7. Language team / Translate extension: Update jobs fail due to invalid function call. T255669
  8. Language team / ContentTranslation: Save action fails due to duplicate insert query. T256230
  9. Core Platform team / Content handling: Incompatible content type during content merge/stash. T255700
  10. Core Platform team / Monolog: API usage logs and error logs sometimes missing due to socket failure. T255578
  11. Search Platform team / WikibaseCirrus: Elevated error levels from EntitySearchElastic warnings. T255658
  12. Wikidata / API: Generator query fails due to invalid API result format. T254334
  13. Wikidata / API: EntityData query emits warning about bad RDF. T255054
  14. Wikidata / Repo: Entity relation update jobs fail due to deadlock. T255706

📊 Trends
Take a look at the workboard and look for tasks that could use your help.

Summary over recent months:

  • July 2019 (5 of 18 tasks left): Two tasks closed.
  • August (1 of 14 tasks left): Another task closed, only one remaining! 🚀
  • September (5 of 12 tasks left): Two tasks closed.
  • October (6 of 12 tasks left), no change.
  • November (3 of 5 tasks left): Another task closed.
  • December (5 of 9 tasks left), no change.
  • January 2020 (5 of 7 tasks left), no change.
  • February (4 of 7 tasks left), no change.
  • March (2 of 2 tasks left), no change.
  • April (11 of 14 tasks left): Three tasks closed.
  • May (11 tasks left): Three tasks closed.
  • June: 14 new tasks survived the month of June. ⚠️

At the end of May the number of open production errors over recent months was 68. Of those, 10 got closed, but with 14 new tasks from June still open, the total has grown further to 72.

The workboard had 192 open tasks last month, which saw another increase, to now 203 open tasks (this includes tasks from 2019 and earlier).


🎉 Thanks!

Thank you to everyone else who helped by reporting, investigating, or resolving problems in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof


ATC: “Do you want to report a UFO?” Pilot: “Negative. We don't want to report.”
ATC: “Do you wish to file a report of any kind to us?” Pilot: “I wouldn't know what kind of report to file.”
ATC: “Me neither…”

Footnotes:
[1] Incidents. – https://wikitech.wikimedia.org/wiki/Incident_documentation#2020
[2] Tasks created. – https://phabricator.wikimedia.org/maniphest/query/VTpmvaJLYVL1/#R
[3] Tasks closed. – https://phabricator.wikimedia.org/maniphest/query/qn5yeURqyl3D/#R
[4] Open tasks. – https://phabricator.wikimedia.org/maniphest/query/Fw3RdXt1Sdxp/#R

Tracing some ornithological roots

06:59, Tuesday, 28 2020 July UTC
The years 1883-1885 were tumultuous in the history of zoology in India. A group called the Simla Naturalists' Society was formed in the summer of 1885. The founding President of the Simla group was, oddly enough, Courtenay Ilbert - who some might remember for the Ilbert Bill which allowed Indian magistrates to make judgements on British subjects. Another member of this Simla group was Henry Collett who wrote a Flora of the Simla region (Flora Simlensis). This Society vanished without much of a trace. A slightly more stable organization was begun in 1883, the Bombay Natural History Society. The creation of these organizations was probably precipitated by the emergence of a gaping hole. A vacuum was created with the end of an India-wide correspondence network of naturalists that was fostered by a one-man-force - that of A. O. Hume. The ornithological chapter of Hume's life begins and ends in Shimla. Hume's serious ornithology began around 1870 and he gave it all up in 1883, after the loss of years of carefully prepared manuscripts for a magnum opus on Indian ornithology, damage to his specimen collections and a sudden immersion into Theosophy, which also led him to abjure the killing of animals, take to vegetarianism and subsequently take up the cause of Indian nationalism. The founders of the BNHS included Eha (E. H. Aitken was also a Hume/Stray Feathers correspondent), J.C. Anderson (who was a Simla naturalist) and Phipson (who was from a wine merchant family with a strong presence in Simla). One of the two Indian founding members, Dr Atmaram Pandurang, was the father-in-law of Hume's correspondent Harold Littledale, a college principal at Baroda.

Shimla then was where Hume rose in his career (as Secretary of State, before falling), allowing him to work on his hobby project of Indian ornithology by bringing together a large specimen collection and conducting the publication of Stray Feathers. Through readings, I had constructed a fairytale picture of the surroundings that he lived in. Richard Bowdler Sharpe, a curator at the British Museum who came to Shimla in 1885, wrote (his description is well worth reading in full):
... Mr. Hume who lives in a most picturesque situation high up on Jakko, the house being about 7800 feet above the level of the sea. From my bedroom window I had a fine view of the snowy range. ... at last I stood in the celebrated museum and gazed at the dozens upon dozens of tin cases which filled the room ... quite three times as large as our meeting-room at the Zoological Society, and, of course, much more lofty. Throughout this large room went three rows of table-cases with glass tops, in which were arranged a series of the birds of India sufficient for the identification of each species, while underneath these table-cases were enormous cabinets made of tin, with trays inside, containing series of the birds represented in the table-cases above. All the specimens were carefully done up in brown-paper cases, each labelled outside with full particulars of the specimen within. Fancy the labour this represents with 60,000 specimens! The tin cabinets were all of materials of the best quality, specially ordered from England, and put together by the best Calcutta workmen. At each end of the room were racks reaching up to the ceiling, and containing immense tin cases full of birds. As one of these racks had to be taken down during the repairs of the north end of the museum, the entire space between the table-cases was taken up by the tin cases formerly housed in it, so that there was literally no space to walk between the rows. On the western side of the museum was the library, reached by a descent of three steps—a cheerful room, furnished with large tables, and containing, besides the egg-cabinets, a well-chosen set of working volumes. ... In a few minutes an immense series of specimens could be spread out on the tables, while all the books were at hand for immediate reference. ...
we went below into the basement, which consisted of eight great rooms, six of them full, from floor to ceilings of cases of birds, while at the back of the house two large verandahs were piled high with cases full of large birds, such as Pelicans, Cranes, Vultures, &c.
I was certainly not hoping to find Hume's home as described but the situation turned out to be a lot worse. The first thing I did was to contact Professor Sriram Mehrotra, a senior historian who has published on the origins of the Indian National Congress. Prof. Mehrotra explained that Rothney Castle had long been altered with only the front facade retained along with the wood-framed conservatories. He said I could go and ask the caretaker for permission to see the grounds. He was sorry that he could not accompany me as it was physically demanding and he said that "the place moved him to tears." Professor Mehrotra also told me about how he had decided to live in Shimla simply because of his interest in Hume! I left him and walked to Christ Church and took the left branch going up to Jakhoo with some hopes. I met the caretaker of Rothney Castle in the garden where she was walking her dogs on a flat lawn, probably the same garden at the end of which there once had been a star-shaped flower bed, scene of the infamous brooch incident with Madame Blavatsky (see the theosophy section in Hume's biography on Wikipedia). It was a bit of a disappointment however as the caretaker informed me that I could not see the grounds unless the owner who lived in Delhi permitted it. Rothney Castle has changed hands so many times that it probably has nothing to match with what Bowdler-Sharpe saw and the grounds may very soon be entirely unrecognizable but for the name plaque at the entrance. Another patch of land in front of Rothney Castle was being prepared for what might become a multi-storeyed building. A botanist friend had shown me a 19th century painting of Shimla made by Constance Frederica Gordon-Cumming. In her painting, the only building visible on Jakko Hill behind Christ Church is Rothney Castle. The vegetation on Shimla has definitely become denser with trees blocking the views.
 
So there ended my hopes of adding good views (free-licensed images are still misunderstood in India) of Rothney Castle to the Wikipedia article on Hume. I did however get a couple of photographs from the roadside. In 2014, I managed to visit the South London Botanical Institute which was the last of Hume's enterprises. This visit enabled the addition of a few pictures of his herbarium collections as well as an illustration of his bookplate which carries his personal motto.

Clearly Shimla empowered Hume and provided a stimulating environment which included several local collaborators. Who were his local collaborators in Shimla? I have only recently discovered (and notes with references are now added to the Wikipedia entry for R. C. Tytler) that Robert (of Tytler's warbler fame - although named by W E Brooks) and Harriet Tytler (of Mt. Harriet fame) had established a kind of natural history museum at Bonnie Moon in Shimla with Lord Mayo's support. The museum closed down after Robert's death in 1872, and it is said that Harriet offered the bird specimens to the government. It would appear that at least some part of this collection went to Hume. It is said that the collection was packed away in boxes around 1873. The collection later came into possession of Mr B. Bevan-Petman who apparently passed it on to the Lahore Central Museum in 1917.

Hume's idea of mapping rainfall to examine patterns of avian distribution
It was under Lord Mayo that Hume rose in the government hierarchy. Hume was not averse to utilizing his power as Secretary of State to further his interests in birds. He organized the Lakshadweep survey with the assistance of the navy ostensibly to examine sites for a lighthouse. He made use of government machinery in the fisheries department (Francis Day) to help his Sind survey. He used the newly formed meteorological division of his own agricultural department to generate rainfall maps for use in Stray Feathers. He was probably the first to note the connection between rainfall and bird distributions, something that only Sharpe saw any special merit in. Perhaps placing specimens on those large tables described by Sharpe allowed Hume to see geographic trends.

Hume was also able to appreciate geology (in his youth he had studied with Mantell), earth history and avian evolution. Hume had several geologists contributing to ornithology including Stoliczka and Ball. One wonders if he took an interest in paleontology given his proximity to the Shiwalik ranges. Hume invited Richard Lydekker to publish a major note on avian osteology for the benefit of amateur ornithologists. Hume also had enough time to speculate on matters of avian biology. A couple of years ago I came across this bit that Hume wrote in the first of his Nests and Eggs volumes (published post-ornith-humously in 1889):

Nests and Eggs of Indian birds. Vol 1. p. 199
I wrote immediately to Tim Birkhead, the expert on evolutionary aspects of bird reproduction and someone with an excellent view of ornithological history (his Ten Thousand Birds is a must read for anyone interested in the subject) and he agreed that Hume had been an early and insightful observer to have suggested female sperm storage.

Shimla life was clearly a lot of hob-nobbing and people like Lord Mayo were spending huge amounts of time and money just hosting parties. Turns out that Lord Mayo even went to Paris to recruit a chef and brought in an Italian,  Federico Peliti. (His great-grandson has a nice website!) Unlike Hume, Peliti rose in fame after Lord Mayo's death by setting up a cafe which became the heart of Shimla's social life and gossip. Lady Lytton (Lord Lytton was the one who demoted Hume!) recorded that Simla folk "...foregathered four days a week for prayer meetings, and the rest of the time was spent in writing poisonous official notes about each other." Another observer recorded that "in Simla you could not hear your own voice for  the grinding of axes. But in 1884 the grinders were few. In the course of my service I saw much of Simla society,  and I think it would compare most favourably with any other town of English-speaking people of the same size. It was bright and gay. We all lived, so to speak, in glass houses. The little bungalows perched on the mountainside wherever there was a ledge, with their winding paths under the pine trees, leading to our only road, the Mall." (Lawrence, Sir Walter Roper (1928) The India We Served.)

A view from Peliti's (1922).
Peliti's other contribution was in photography and it seems like he worked with Felice Beato who also influenced Harriet Tytler and her photography. I asked a couple of Shimla folks about the historic location of Peliti's cafe and they said it had become the Grand Hotel (now a government guest house). I subsequently found that Peliti did indeed start Peliti's Grand Hotel, which was destroyed in a fire in 1922, but the centre of Shimla's social life, his cafe, was actually next to the Combermere Bridge (it ran over a water storage tank and is today the location of the lift that runs between the Mall and the Cart Road). A photograph taken from "Peliti's" clearly lends support for this location as do descriptions in Thacker's New Guide to Simla (1925). A poem celebrating Peliti's was published in Punch magazine in 1919. Rudyard Kipling was a fan of Peliti's but Hume was no fan of Kipling (Kipling seems to have held a spiteful view of liberals - "Pagett MP" has been identified by some as being based on W.S.Caine, a friend of Hume; Hume for his part had a lifelong disdain for journalists). Kipling's boss, E.K. Robinson, started the British Naturalists' Association while E.K.R.'s brother Philip probably influenced Eha.

While Hume most likely stayed well away from Peliti's, we see that a kind of naturalists' social network existed within the government. About Lord Mayo we read:
Lord Mayo and the Natural History of India - His Excellency Lord Mayo, the Viceroy of India, has been making a very valuable collection of natural historical objects, illustrative of the fauna, ornithology, &c., of the Indian Empire. Some portion of these valuable acquisitions, principally birds and some insects, have been brought to England, and are now at 49 Wigmore Street, London, whence they will shortly be removed. - Perthshire Advertiser, 29 December 1870.
Another news report states:
The Earl of Mayo's collection of Indian birds, &c.

Amidst the cares of empire, the Earl of Mayo, the present ruler of India, has found time to form a valuable collection of objects illustrative of the natural history of the East, and especially of India. Some of these were brought over by the Countess when she visited England a short time since, and entrusted to the hands of Mr Edwin Ward, F.Z.S., for setting and arrangement, under the particular direction of the Countess herself. This portion, which consists chiefly of birds and insects, was to be seen yesterday at 49, Wigmore street, and, with the other objects accumulated in Mr Ward's establishment, presented a very striking picture. There are two library screens formed from the plumage of the grand argus pheasant - the head forward, the wing feathers extended in circular shape, those of the tail rising high above the rest. The peculiarities of the plumage have been extremely well preserved. These, though surrounded by other birds of more brilliant covering, preserved in screen pattern also, are most noticeable, and have been much admired. There are likewise two drawing-room screens of smaller Indian birds (thrush size) and insects. They are contained in glass cases, with frames of imitation bamboo, gilt. These birds are of varied and bright colours, and some of them are very rare. The Countess, who returned to India last month, will no doubt, add to the collection when she next comes back to England, as both the Earl and herself appear to take a great interest in illustrating the fauna and ornithology of India. The most noticeable object, however, in Mr. Ward's establishment is the representation of a fight between two tigers of great size. The gloss, grace, and spirit of the animals are very well preserved. The group is intended as a present to the Prince of Wales. It does not belong to the Mayo Collection. - The Northern Standard, January 7, 1871
And Hume's subsequent superior was Lord Northbrook about whom we read:
University and City Intelligence. - Lord Northbrook has presented to the University a valuable collection of skins of the game birds of India collected for him by Mr. A.O.Hume, C.B., a distinguished Indian ornithologist. Lord Northbrook, in a letter to Dr. Acland, assures him that the collection is very perfect, if not unique. A Decree was passed accepting the offer, and requesting the Vice-Chancellor to convey the thanks of the University to the donor. - Oxford Journal, 10 February 1877
Papilio mayo
Clearly Lord Mayo's influence on naturalists in India is not sufficiently well understood. Perhaps that would explain the beautiful butterfly named after him shortly after his murder. It appears that Hume did not have this kind of hobby association with Lord Lytton; little wonder, perhaps, that he fared so badly!

Despite Hume's sharpness on many matters there were bits that come across as odd. In one article on the flight of birds he observes the soaring of crows and vultures behind his house as he sits in the morning looking towards Mahassu. He points out that these soaring birds would appear early on warm days and late on cold days but he misses the role of thermals and mixes physics with metaphysics, going for a kind of Grand Unification Theory:

And then claims that crows, like saints, sages and yogis are capable of "aethrobacy".
This naturally became a target of ridicule. We have already seen the comments of E.H. Hankin on this. Hankin wrote that if levitation was achieved by "living an absolutely pure life and intense religious concentration" the hill crow must be indulging in "irreligious sentiments when trying to descend to earth without the help of gravity." Hankin, despite his studies, does not give enough credit for the forces of lift produced by thermals and his own observations were critiqued by Gilbert Walker, the brilliant mathematician who applied his mind to large scale weather patterns apart from conducting some amazing research on the dynamics of boomerangs. His boomerang research had begun even in his undergraduate years and had earned him the nickname of Boomerang Walker. On my visit to Shimla, I went for a long walk down the quiet road winding through dense woodland and beside streams to Annandale, the only large flat ground in Shimla where Sir Gilbert Walker conducted his weekend research on boomerangs. Walker's boomerang research mentions a collaboration with Oscar Eckenstein and there are some strange threads connecting Eckenstein, his collaborator Aleister Crowley and Hume's daughter Maria Jane Burnley who would later join the Hermetic Order of the Golden Dawn. But that is just speculation!
1872 Map showing Rothney Castle

The steep road just below Rothney Castle

Excavation for new constructions just below and across the road from Rothney Castle

The embankment collapsing below the guard hut

The lower entrance, concrete constructions replace the old building

The guard hut and home are probably the only heritage structures left


I got back from Annandale and then walked down to Phagli on the southern slope of Shimla to see the place where my paternal grandfather once lived. It is not a coincidence that Shimla and my name are derived from the local deity Shyamaladevi (a version of Kali).


The South London Botanical Institute

After returning to England, Hume took an interest in botany. He made herbarium collections and in 1910 he established the South London Botanical Institute and left money in his will for its upkeep. The SLBI is housed in a quiet residential area. Here are some pictures I took in 2014; most can be found on Wikipedia.


Dr Roy Vickery displaying some of Hume's herbarium specimens

Specially designed cases for storing the herbarium sheets.

The entrance to the South London Botanical Institute

A herbarium sheet from the Hume collection

 
Hume's bookplate with personal motto - Industria et Perseverentia

An ornate clock which apparently adorned Rothney Castle
A special cover released by Shimla postal circle in 2012

Further reading
 Postscript

 An antique book shop had a set of Hume's Nests and Eggs (Second edition) and it bore the signature of "R.W.D. Morgan" - it appears that there was a BNHS member of that name from Calcutta c. 1933. It is unclear if it is the same person as Rhodes Morgan, who was a Hume correspondent and forest officer in Wynaad/Malabar who helped William Ruxton Davison.
Update: Henry Noltie of RBGE pointed out to me privately that this cannot be the forester Rhodes Morgan, who died in 1919! - September, 2016.

Incidentally, the Simla Naturalists' Society must have had its home in Chapslee Estate, which was where Ilbert lived, and I had the privilege of having a look at the interiors of one of the last remaining heritage mansions in Shimla.

Ornithologists in cartoons

06:56, Tuesday, 28 2020 July UTC
From: The Graphic. 25 April 1874.
It is said that the modern version of badminton evolved from a game played in Poona (some sources name the game itself as Poona). When I saw this picture from 1874 about five years ago, I gave little thought to it. Revisiting it now, after some research on one of A.O. Hume's ornithological collaborators, I have a strong hunch that one of the people depicted in the picture is recognizable, although it is not going to be easy to confirm this.

I recently created a Wikipedia entry for a British administrator who worked in the Bombay Presidency, G.W. Vidal, when I came across a genealogy website (whose maintainer unfortunately was uncontactable by email) with notes on his life that included a photograph in profile and a cartoon. The photograph was apparently taken by Vidal himself, a keen amateur photographer apart from being a snake and bird enthusiast. As with other naturalists of that epoch, many of his specimens were shot, skinned or pickled and sent off to museums or specialists. He was an active collaborator of Hume and contributed a long note in Stray Feathers on the birds of Ratnagiri District, where he was a senior ICS official. He continued to contribute notes after the ornithological exit of Hume, to the Journal of the Bombay Natural History Society. This gives further support for an idea I have suggested before that a key stimulus for the formation of the BNHS was the end of Stray Feathers. Vidal's mother has the claim for being the first woman novelist of Australia. Interestingly one of his daughters, Norah, married Major Robert Mitchell Betham (2 May 1864 – 14 March 1940), another keen amateur ornithologist born in Dapoli, who is well-known in Bangalore birding circles for being the first to note Lesser Floricans in the region. Now Vidal was involved in popularizing badminton in India, apparently creating some of the rules that allowed matches to be played. The man at the left in the sketch in the 1874 edition of The Graphic looks quite like Vidal, but who knows! What do you think?

PS: Vidal sent bird specimens to Hume, and at least two subspecies have been named from his specimens after him - Perdicula asiatica vidali and Todiramphus chloris vidali.

For more information on Vidal, do take a look at the Wikipedia entry. More information from readers is welcome as usual.

PS: 26-July-2020: It would appear that an old badminton court near Sholapur was also of some ornithological interest.
I think I can safely say that there is only one place in India where this bird has been shot, and where I have shot it during every month in the year, and this is Sholapur. There was a grass and baubul jungle near the old Fives court on the Bijapur Road which always contained florican. - "Felix" (1906). Recollections of a Bison & Tiger Hunter. London: J.M. Dent & Co. p. 183.

Special Olympics staff improve Wikipedia’s equity

16:32, Monday, 27 2020 July UTC

Jamie Valis is the Director of Health Training at Special Olympics, Lindsay Dubois is the Director, Research and Evaluation at Special Olympics, and Chelsea Fosse is a public health dentist, former Coordinator of the Clinical Director Community of Practice at Special Olympics, and Senior Health Policy Analyst at the American Dental Association. Jamie, Lindsay, and Chelsea recently participated in a WITH Wiki Scientists training course and reflect on their experience related to health disparities for individuals with intellectual disabilities.

Jamie Valis, Lindsay Dubois, and Chelsea Fosse
Jamie Valis, left; Lindsay Dubois, center; and Chelsea Fosse, right

At first glance, a Special Olympics competition looks like any other sporting event. Passion radiates from faces on the field, stretching is happening on the sidelines in preparation for an upcoming match, seas of uniform colors form as teams flock together throughout the facility, and fans cheer loudly. But what you may miss if you don’t look closely is that life-saving health services and screenings are also taking place on-site during these competitions. Special Olympics athletes have intellectual disabilities (ID) and face greater barriers accessing and utilizing our complex health care system and suffer at disproportionate rates from chronic health conditions. They have difficulty finding providers who are trained and willing to serve them, and may struggle with or need the support of others in their day-to-day health needs. Special Olympics offers these critical health screenings to help end the health disparities and health care inequities that exist and are experienced by people with ID.

Special Olympics – with support from the Centers for Disease Control and Prevention (CDC) and the Golisano Foundation – has been striving to correct these inequities and promote the health and wellness of athletes and other individuals with ID. Our Inclusive Health programming includes: training health care providers and health professional students to provide higher quality care for people with ID; performing health screenings; offering health services such as prescription glasses, shoe fittings, and fluoride treatments; and referring athletes for follow-up care when necessary. While we’re incredibly proud of the progress we’ve made in bringing quality health care to our athletes in a positive and encouraging environment, we recognize there are so many more lives we need to touch, and additional work that needs to be done to increase awareness of the health disparities that exist in our communities around the world.

When WITH Foundation and Wiki Education announced the opportunity for those in the disability and health space to come together to expand and enhance their knowledge base on disability and health, we were eager to participate. The call for participation in this program came through American Academy of Developmental Medicine & Dentistry (AADMD) on January 11th, 2020 – the exact day that Chinese media reported the first death due to COVID-19. On the first day of the Wiki Scientists course there were 0 confirmed cases of COVID-19 in the United States. The day the class ended, there were 22,086 confirmed cases in the United States. The correlation between the timeline of COVID-19 and this course is noteworthy given people with developmental disabilities are one of the most vulnerable populations to the virus. The most important risk for people with ID, however, is not the underlying condition, but the lack of access to quality health care and subsequent health inequities.

Three Special Olympics staff members were part of an 18-member cohort of students who were selected to participate in a 12-week WITH Wiki Scientists course to understand how to make scholarly research and Wikipedia contributions accessible to those with disabilities. The educational backgrounds, professional experiences, and personal journeys were integrated to allow each participant to explore their areas of interest while contributing to a broader mission. Participating in this course with such a diverse group allowed us to collaborate with other disability advocates and the broader disability community. We were reminded that many of the battles we face on a daily basis are those that are not just unique to people with intellectual disabilities.

Understanding how to utilize our sandboxes (pages designed for testing edits on Wikipedia), becoming proficient at using the visual editor, and evaluating wiki articles were amongst the many components of the weekly training assignment that we completed. Throughout the course, we were challenged to contribute to two Wikipedia articles. We placed emphasis on articles that had high relevance for understanding the health needs of people with intellectual disabilities and for solutions to improving health equity.

While the course focused on the basics of making contributions to existing articles and creating new Wikipedia entries, there was also emphasis on the overall concept of inclusivity and the need to create content that people of all abilities can read and understand. To address this, Wikipedia launched a simple English version of its platform for people with different needs, children, adults with learning difficulties, and people who are trying to learn English. The caveat is that content contributors need to write and submit this content separately.

Wikipedia editors are like anyone else, and they have their own biases and interests. Wikipedia is a reflection of the people who create it, and not necessarily the experiences and contributions of the broader population. This means that articles may have a more “ableist” point of view if they were written by scholars or contributors who don’t understand the lived experiences and needs of people with intellectual and developmental disabilities; these articles may use terminology that makes assumptions about the abilities of people with disabilities.

What became clear throughout the course was that Wikipedia, for all its vast wisdom and knowledge, is not immune to the shortcomings that we continue to observe in the fight for equity for people with intellectual and developmental disabilities. As a result of this course, the Special Olympics staff involved in this course propose a call to action for all scholars, experts, and other interested individuals to do the following:

1) Set up a Wikipedia account and learn how to create and review content.

2) Work on a simple language summary of your findings, recommendations, and guidelines every time you publish content on Wikipedia about people with intellectual and developmental disabilities.

3) Review existing Wikipedia entries that are related to your areas of expertise and help create simple English versions of this content.

It has been such a rewarding experience to meet new colleagues, collaborate with other disabilities advocates, and broaden our own horizons on issues faced by others in the disabilities community.

Lindsay Dubois, PhD
Chelsea Fosse, DMD, MPH
Jamie Valis, PhD

 

Interested in taking a course like the one Lindsay, Chelsea, and Jamie took? Visit learn.wikiedu.org to see current course offerings.

Header/thumbnail image by Maggie Mengel, AKSM photography, CC BY-SA 4.0 via Wikimedia Commons. Images of authors courtesy Jamie Valis.

Tech News issue #31, 2020 (July 27, 2020)

00:00, Monday, 27 2020 July UTC

weeklyOSM 522

10:33, Sunday, 26 2020 July UTC

14/07/2020-20/07/2020


Victoria Crawford‘s streets in Hackney align with the rising sun 1 | © Victoria Crawford | map data © OpenStreetMap contributors

Mapping

  • Michael Reichert would like to revert a series of surface and tracktype tags added without local knowledge by an armchair mapper. A discussion followed on the feasibility of mapping these tags only from imagery and whether reverting would be justified in all countries affected.
  • Cycling infrastructure has been selected as the UK quarterly project for Q3 2020. The project’s wiki page lists details of the types of things that can be done as part of the project. Of particular note is the availability of large quantities of cycling-related data from Transport for London which needs merging with OSM.
  • Adamant36’s request to add RapiD to the list of editors on the OSM map resulted in a robust discussion on GitHub.
  • Supaplex030 describes (de) (automatic translation) in his blog how measuring vibrations with a smartphone sensor gives excellent results to objectively classify the flatness of surfaces and map them with smoothness=*.
  • The MITFAHR|DE|ZENTRALE has published a guide (de) (automatic translation) on how to find railway stations without bike parking facilities.

Community

  • Stephan Knauss asked what happened to the FOSSGIS affiliate link for Amazon.de. Frederik Ramm replied that the link still exists but the revenue is declining. The statistics have been updated for those who are interested.
  • Andy Allan had a productive day working on the OpenStreetMap website.
  • Deane Kensok blogged about a partnership between Esri and Facebook to provide data from the ArcGIS user community to OSM. These datasets are OSM-tagged, compatibly licensed, and available to use for building maps in RapiD and JOSM (via a plugin).
  • In Geomob Podcast #24 Ed Freyfogle interviews Peter Karich, co-founder of routing software maker Graphhopper. Graphhopper is a unique success story – they have built a thriving business on top of an open-source software project and OpenStreetMap data.
  • In Geomob Podcast #25 Ed chats with geo industry veteran Randy Meech, co-founder of StreetCred. Previously Randy was CEO of Mapzen, and CTO of MapQuest.
  • At the close of the State of the Map 2020 a live quiz with an OpenStreetMap theme was held, in which our team member ‘Nakaner’ took first place. A few days after the conference finished, Ilya Zverev published the quiz as a standalone game. Test your knowledge of OpenStreetMap history and technology, or use the code to create your own quiz!
  • Ed Neerhut wrote in Geospatial World about the importance of maps in a crisis.

OpenStreetMap Foundation

  • openstreetmap.org was offline for a short period of time. Due to a memory-leakage issue with creating planet dumps, the server (ironbelly) was running out of memory, which interfered with its other role as NFS (Network File System) server for the web site. In order to prevent this in future, user images will be moved to Amazon’s S3.
  • The minutes of the Local Chapters and Communities Working Group meeting of 15 June have been published.
  • OpenStreetMap US has applied to become an official OSMF local chapter. The supporting documentation is available on the OSMF wiki.
  • The OpenStreetMap Ops Team tweeted a reminder that any Bitcoin donated to the OSMF would be used to build a stronger project.

Humanitarian OSM

  • Nominations for this year’s Humanitarian OpenStreetMap Team Board of Directors election have closed. The candidates have till 31 July to present their ideas and engage with the membership on how they would contribute to HOT in an elected role. You can find the list of candidates, the details of their nomination, and links to their candidate statements on the wiki page.
  • HOT published in a blogpost what was achieved with the 2019 HOT Microgrants.
  • Mikel Maron wrote a blogpost about what HOT needs to work on until 2025.

Maps

  • The LandlordTech project has created a survey in order to collect and display data on new forms of housing injustice caused by surveillance, tracking, data accumulation, and algorithmic methods intruding into domestic and neighbourhood spaces.
  • John Nelson wrote about firefly cartography and how these glowing maps attract feedback like no other type of map that he has made.
  • Michal Migurski announced that Facebook is releasing an update to Daylight – a 56 gigabyte export of validated OSM data.
  • A YouTube video about South Korea’s map service policy depicts Google’s fight for better map data, and the companies profiting from the situation.

Licences

  • The minutes of the Licensing Working Group meeting on 9 July have been published. The Attribution Guideline, for which the Board of Directors would prefer a stricter version than the one LWG proposed, was given a lot of attention. Simon Poole also announced that he is stepping down from the LWG.

Software

  • New features have been added to the browser extension OSM Smart Menu. Now users can create links using URL Templates [1], rename existing links [2] and access the link list from any website [3].
  • John Vargas-Muñoz et al. have published a paper reviewing the use of machine learning to improve OSM data and machine learning based techniques that use OSM data for applications in other domains.

Programming

  • hauke-stieler has developed a new task management tool, as an alternative to the HOT tasking manager or MapCraft, and called it Simple Task Manager (STM). On the mailing list Hauke explains (de) (automatic translation) in detail what made him do it and how to work with the STM.
  • [1] Victoria Crawford reports on Twitter about her map, which shows which roads in Hackney run toward the rising sun during the year, according to her, inspired by puntofisso, which in turn refers to Cédric Scherer’s (@CedScherer) 30DayMapChallenge. She has published her code on GitHub.
  • Alexey Pechnikov reported (ru) (automatic translation) about his experiences with creating PostgreSQL / PgRouting routing systems using OpenStreetMap data.
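
The street-alignment idea in item [1] above can be sketched in a few lines. This is only an illustration, not Victoria Crawford's published code: the function names `bearing_deg` and `aligns_with`, the 5-degree tolerance, and the assumption that the sunrise azimuth is supplied from elsewhere (for example an almanac) are all hypothetical.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from north (0 <= result < 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def aligns_with(street_bearing, sun_azimuth, tolerance=5.0):
    """True if a street axis points toward the given azimuth.

    A street is treated as an undirected line, so bearings are
    compared modulo 180 degrees. The tolerance is an assumed value.
    """
    diff = abs(street_bearing - sun_azimuth) % 180.0
    return min(diff, 180.0 - diff) <= tolerance
```

For each street segment one would compute the bearing between its two end nodes and test it against the sunrise azimuth for a given date; repeating this over the year gives the dates on which each street lines up.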

Did you know …

  • … that the OpenStreetMap wiki has an A to Z to help you figure out how to tag objects?
  • … that you can view 19th century maps of Europe on Mapire?

Other “geo” things

  • The Group on Earth Observations (GEO) and Google Earth Engine (GEE) have announced that 32 projects, from 22 countries, will be awarded a total of US$3 million towards production licences and US$1 million in technical support from EO Data Science, to tackle some of the world’s greatest challenges using open Earth data.
  • Alexander Zipf reported on work done at Heidelberg University to develop a new routing algorithm for pedestrians. The algorithm minimises the exposure of pedestrians to traffic noise pollution while taking into account the route distance.
  • The history of indigenous peoples and their political movements is an important issue in Taiwan. Researchers in east coast county Hualian have trained members of the Bunun people to use GPS devices, and in an expedition surveyed traces of the abandoned settlements where their ancestors lived. They found (automatic translation) 50 historic remains such as abandoned houses in the remote area.
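
The trade-off between noise exposure and distance described in the pedestrian routing item above can be illustrated with a weighted shortest-path search. This is a minimal sketch and not the Heidelberg algorithm: it uses plain Dijkstra, and the graph layout, the 0-to-1 noise scale, the cost formula and the `noise_weight` parameter are all assumptions made for the example.

```python
import heapq


def quiet_route(graph, start, goal, noise_weight=1.0):
    """Dijkstra search where each edge costs
    length * (1 + noise_weight * noise).

    graph: {node: [(neighbour, length_m, noise), ...]} with noise
    on an assumed 0..1 scale. Returns the node list of the cheapest
    route, or None if the goal cannot be reached.
    """
    best = {start: 0.0}
    prev = {}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, length, noise in graph.get(node, []):
            new_cost = cost + length * (1.0 + noise_weight * noise)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(queue, (new_cost, nxt))
    if goal not in best:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]
```

With `noise_weight=0` the search reduces to ordinary shortest-distance routing; raising it makes the route accept longer detours along quieter streets.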

Upcoming Events

Where What When Country
Ludwigshafen a.Rhein (Stadtbibliothek) Mannheimer Mapathons e.V. 2020-07-23 germany
Düsseldorf Düsseldorfer OSM-Stammtisch 2020-07-29 germany
London London Missing Maps Mapathon (ONLINE) 2020-08-04 uk
Stuttgart Stuttgarter Stammtisch 2020-08-05 germany
Taipei OSM x Wikidata #19 2020-08-10 taiwan
Zurich 120. OSM Meetup Zurich 2020-08-11 switzerland
Munich Münchner Stammtisch 2020-08-12 germany
Kandy 2020 State of the Map Asia 2020-10-31-2020-11-01 sri lanka

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by AnisKoutsi, Joker234, MatthiasMatthias, Nakaner, Nordpfeil, Polyglot, Rogehm, SK53, TheSwavu, derFred, mOlind, richter_fn.

First a definition: "When data is biased, we mean that the sample is not representative of the entire population". This approach successfully underpins the Women in Red project; currently a percentage of 18.51% women in English Wikipedia has been achieved. Compare the coverage of Anglo-American politicians with that of politicians from the whole of Africa: the bias in the data at Wikidata is already obvious, and it will then have numbers attached to it.
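
One simple way to attach a number to this kind of bias is to compare a group's share in the data with its share in the population it is meant to represent. The sketch below is only an illustration: the function name `representation_ratio` is hypothetical, and the population share of 0.5 used in the usage note is an assumption, not a figure from this post.

```python
def representation_ratio(share_in_data, share_in_population):
    """Ratio of a group's share in the data to its share in the
    population. 1.0 means representative; below 1.0 means the
    group is under-represented in the data."""
    if not 0.0 < share_in_population <= 1.0:
        raise ValueError("share_in_population must be in (0, 1]")
    if not 0.0 <= share_in_data <= 1.0:
        raise ValueError("share_in_data must be in [0, 1]")
    return share_in_data / share_in_population
```

For example, `representation_ratio(0.1851, 0.5)` gives about 0.37 for the 18.51% figure quoted above, if one assumes women make up half of the population being described. The same ratio could be computed per country for politicians, given counts from Wikidata and an agreed reference population.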

This is not a problem for Wikidata alone and yes, we can have a project and include a lot of data to get to a growth percentage as we did for the Women in Red. Worthwhile in its own right but in this way we do not forge a closer relation with its "premier brand Wikipedia". It would be mere stamp collecting.

The best argument for having data in Wikidata is that it is used. This is done in self selecting Wikipedias through global info boxes and lists. Interwiki links are used on every Wikipedia. Integrating the necessary functionality is a meta/technical affair and firmly for the Wikimedia Foundation to own. 

The functionality to make this happen implements an existing idea with additional twists.
  • Pictures for the subject are linked to courtesy of Special:MediaSearch
  • Automated descriptions are provided in every language to aid disambiguation. At first the functionality by Magnus is used, and it is to be replaced with improved descriptions provided by Abstract Wikipedia.
  • A Reasonator-like display is provided to inform on the data we have on an item.
  • Suggestions for the inclusion in categories and lists are provided based on Wikidata definitions for categories and lists.
  • To help people find sources and alternate sources, Scholia is included when there are papers about the subject. Once existing citations are available, they are an additional resource.
In essence this is a toolset that you can opt into as an individual and/or it is the standard for a project. Particularly for the smaller projects this will prove to be really valuable; it will prevent false friends, it indicates heavily linked items that do not have an article. It stimulates the addition of labels because it is beneficial in finding illustrations. 

This proposal is relatively low tech and it will bring our many communities together by providing widely the information that is available to us.
Thanks,
     GerardM

What to love in English Wikipedia

19:03, Thursday, 23 2020 July UTC
This list of commissioners of the Arusha Region is great, it provides the basic information that enables me to include this information in Wikidata. It can be assumed that they are all from Tanzania, politicians and human as well. 

What I love in English Wikipedia are lists like this. It is more than likely that for every Tanzanian region there will be a similar list, and as a consequence we can include all these fine politicians in Wikidata and list them in whatever Wikipedia.

As more politicians for Tanzania or any other African country are added, politicians will pop up who have held multiple offices. This will be explicit in Wikidata and in Wikipedia you could use Special:WhatLinksHere.

Technically there is not much stopping us from associating red links with Wikidata items. This is the same guy used in the "WhatLinksHere" and you find him in this list that is a work in progress as well. 

Think this through. With lists like this in any Wikipedia, these people are findable and linkable. It will be possible to state in text what a given commissioner did, and there will be no ambiguity because of the link.

So I love English Wikipedia for the rich resource of information it is. I love its editors who provide us with the information that enables the reuse of data. I will rejoice when it is recognised that we can do much more. When we accept that together, as an ecosystem, we are in a position where we actually share the sum of all knowledge that is available to us.
Thanks,
        GerardM

Manjari - 4th anniversary

04:40, Thursday, 23 2020 July UTC

A rough drawing I did on 20 November 2014 and shared with my friends as a new font idea. I got this concept from my explorations of perfect curves in Malayalam script after I released the Chilanka font. From then on I spent all my free time making it as perfect as I could, until releasing the Manjari typeface on 23 July 2016. I also took two months off from my job in 2016 to complete this work.

Production Excellence #16: October 2019

03:09, Thursday, 23 2020 July UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

📊 Month in numbers
  • 3 documented incidents. [1]
  • 33 new Wikimedia-prod-error reports. [2]
  • 30 Wikimedia-prod-error reports closed. [3]
  • 207 currently open Wikimedia-prod-error reports in total. [4]

There were three recorded incidents last month, which is slightly below our median of the past two years (Explore this data). To read more about these incidents, their investigations, and pending actionables, check Incident documentation § 2019.


📖 To Log or not To Log

MediaWiki uses the PSR-3 compliant Monolog library to send messages to Logstash (via rsyslog and Kafka). These messages are used to automatically detect (by quantity) when the production cluster is in an unstable state. For example, due to an increase in application errors when deploying code, or if a backend system is failing. Two distinct issues hampered the storing of these messages this month, and both affected us simultaneously.

Elasticsearch mapping limit

The Elasticsearch storage behind Logstash optimises responses to Logstash queries with an index. This index has an upper limit to how many distinct fields (or columns) it can have. When reached, messages with fields not yet in the index are discarded. Our Logstash indexes are sharded by date and source (one for “mediawiki”, one for “syslog”, and one for everything else).

This meant that an error message was only stored if all of its fields had already been used by other messages stored that day, or if that day’s index still had room for its new fields. As a result, a seemingly random subset of error messages was rejected for a full day. Each day, a given kind of error got a new chance at reserving its columns, so long as it was triggered early enough.
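The drop behaviour described above can be illustrated with a toy model. This is only a sketch: the ToyIndex class, the limit of 3, and the field names are hypothetical, and it does not reflect how Elasticsearch or Logstash are actually implemented.

```php
<?php
// Toy model of an index with a cap on distinct fields (hypothetical names).
final class ToyIndex {
    /** @var array<string,bool> Field names reserved so far today */
    private $fields = [];
    /** @var int Maximum number of distinct fields */
    private $limit;
    /** @var array[] Messages that were accepted */
    public $stored = [];

    public function __construct( int $limit ) {
        $this->limit = $limit;
    }

    /** Store the message, or return false if it would exceed the field cap. */
    public function add( array $message ): bool {
        // Fields of this message that the index has not seen yet.
        $new = array_diff_key( $message, $this->fields );
        if ( count( $this->fields ) + count( $new ) > $this->limit ) {
            return false; // message is discarded
        }
        foreach ( $new as $name => $_ ) {
            $this->fields[$name] = true;
        }
        $this->stored[] = $message;
        return true;
    }
}

$index = new ToyIndex( 3 );
// A debug message arrives first and reserves all three columns.
$a = $index->add( [ 'message' => 'debug', 'user' => 'x', 'wiki' => 'y' ] );
// An error that needs a fourth column is rejected.
$b = $index->add( [ 'message' => 'fatal', 'exception' => 'RuntimeException' ] );
// An error that only uses known columns is still stored.
$c = $index->add( [ 'message' => 'fatal', 'wiki' => 'y' ] );
```

In this model, whichever messages arrive first decide which columns exist for the rest of the day, which is why the set of rejected errors looked random.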

To unblock deployment automation and monitoring of MediaWiki, an interim solution was devised. The subset of messages from “mediawiki” that deal with application errors now have their own index shard. These error reports follow a consistent structure, and contain no free-form context fields. As such, this index (hopefully) can’t reach its mapping limit or suffer message loss.

The general index mapping limit was also raised from 1000 to 2000. For now that means we’re not dropping any non-critical/debug messages. More information about the incident at T234564. The general issue of accommodating debug messages in Logstash long-term is tracked at T180051. Thanks @matmarex, @hashar, and @herron.

Crash handling

Wikimedia’s PHP configuration has a “crash handler” that kicks in if everything else fails. For example, when the memory limit or execution timeout is reached, or if some crucial part of MediaWiki fails very early on. In that case our crash handler renders a Wikimedia-branded system error page (separate from MediaWiki and its skins). It also increments a counter metric for monitoring purposes, and sends a detailed report to Logstash. In migrating the crash handler from HHVM to PHP7, one part of the puzzle was forgotten. Namely the Logstash configuration that forwards these reports from php-fpm’s syslog channel to the one for mediawiki.
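For readers unfamiliar with the technique, a last-resort handler of this general kind can be sketched in plain PHP with register_shutdown_function() and error_get_last(). This is not Wikimedia's actual crash handler; the buildCrashReport() helper and the 'php-fatal' channel name are assumptions made for illustration.

```php
<?php
// Error types that end the request and never reach a normal exception handler.
const FATAL_TYPES = E_ERROR | E_PARSE | E_CORE_ERROR | E_COMPILE_ERROR;

/** Return a report for a fatal error, or null if $error is absent or not fatal. */
function buildCrashReport( $error ) {
    if ( !is_array( $error ) || !( $error['type'] & FATAL_TYPES ) ) {
        return null;
    }
    return [
        'channel' => 'php-fatal', // hypothetical channel name
        'message' => $error['message'],
        'file' => $error['file'],
        'line' => $error['line'],
    ];
}

// Runs at the end of every request, including after a fatal error.
register_shutdown_function( function () {
    $report = buildCrashReport( error_get_last() );
    if ( $report !== null ) {
        // A real handler would also render an error page and bump a metric.
        error_log( json_encode( $report ) );
    }
} );
```

Keeping the report-building step in a separate function makes it possible to exercise the logic without actually crashing a request.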

As such, our deployment automation and several Logstash dashboards were blind to a subset of potential fatal errors for a few days. Regressions during that week were instead found by manually digging through the raw feed of the php-fpm channel. As a temporary measure, Scap was updated to consider the php-fpm channel as well in its automation that decides whether a deployment is “green”.

We’ve created new Logstash configurations that forward PHP7 crashes in a similar way as we did for HHVM in the past. Bookmarked MW dashboards/queries you have for Logstash now provide a complete picture once again. Thanks @jijiki and @colewhite! – T234283


📉 Outstanding reports

Take a look at the workboard and look for tasks that might need your help. The workboard lists error reports, grouped by the month in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Or help someone that’s already started with their patch:
Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • March: 1 report fixed. (3 of 10 reports left).
  • April: 8 of 14 reports left (unchanged). ⚠️
  • May: (All clear!)
  • June: 9 of 11 reports left (unchanged). ⚠️
  • July: 13 of 18 reports left (unchanged).
  • August: 2 reports were fixed! (6 of 14 reports left).
  • September: 2 reports were fixed! (10 of 12 new reports left).
  • October: 12 new reports survived the month of October.

🎉 Thanks!

Thank you to everyone else who helped by reporting, investigating, or resolving problems in Wikimedia production!

Until next time,

– Timo Tijhof


🌴“Gotta love crab. In time, too. I couldn't take much more of those coconuts. Coconut milk is a natural laxative. That's something Gilligan never told us.”

Footnotes:
[1] Incidents. – wikitech.wikimedia.org/wiki/Special:PrefixIndex?prefix=Incident…
[2] Tasks created. – phabricator.wikimedia.org/maniphest/query…
[3] Tasks closed. – phabricator.wikimedia.org/maniphest/query…
[4] Open tasks. – phabricator.wikimedia.org/maniphest/query…

Production Excellence #17: December 2019

03:09, Thursday, 23 2020 July UTC

How’d we do in our strive for operational excellence in November and December? Read on to find out!

📊 Month in numbers
  • 0 documented incidents in November, 5 incidents in December. [1]
  • 17 new Wikimedia-prod-error reports. [2]
  • 23 Wikimedia-prod-error reports closed. [3]
  • 190 currently open Wikimedia-prod-error reports in total. [4]

November had zero reported incidents. Prior to this, the last month with no documented incidents was December 2017. To read about past incidents and unresolved actionables, check Incident documentation § 2019.

Explore Wikimedia incident graphs (interactive)


📖 Many dots, do not a query make!

@dcausse investigated a flood of exceptions from SpecialSearch, which reported “Cannot consume query at offset 0 (need to go to 7296)”. This exception served as a safeguard in the parser for search queries. The code path was not meant to be reached. The root cause was narrowed down to the following regex:

/\G(?<negated>[-!](?=[\w]))?(?<word>(?:\\\\.|[!-](?!")|[^"!\pZ\pC-])+)/u

This regex looks complex, but for the purpose of reproducing the problem it can be reduced to:

/(?:ab|c)+/

This reduced regex still triggers the problematic behavior in PHP. It fails with a PREG_JIT_STACKLIMIT_ERROR when given a long string. Below is a reduced test case:

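// preg_match() returns false (not 0) when PCRE itself fails.
$ret = preg_match( '/(?:ab|c)+/', str_repeat( 'c', 8192 ) );
if ( $ret === false ) {
    // preg_last_error() returns an integer code, such as PREG_JIT_STACKLIMIT_ERROR.
    print( "failed with: " . preg_last_error() );
}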
  • Fails when given 1365 contiguous c on PHP 7.0.
  • Fails with 2731 characters on PHP 7.2, PHP 7.1, and PHP 7.0.13.
  • Fails with 8192 characters on PHP 7.3. (Might be due to php-src@bb2f1a6).

In the end, the fix we applied was to split the regex into two separate ones, remove the quantified non-capturing group, and loop at the PHP level instead (Gerrit change 546209).
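The idea of looping at the PHP level can be sketched on the reduced pattern. This is not the code from the Gerrit change; the consumeRun() function is a hypothetical illustration that matches one token per preg_match() call, using \G together with the offset parameter so that each match must start exactly where the previous one ended.

```php
<?php
/**
 * Consume a run of "ab" or "c" tokens starting at $offset, one token per
 * preg_match() call, and return the offset just past the run.
 */
function consumeRun( string $subject, int $offset = 0 ): int {
    $len = strlen( $subject );
    while ( $offset < $len ) {
        // No quantifier on the group: each call matches a single token.
        $ret = preg_match( '/\G(?:ab|c)/', $subject, $m, 0, $offset );
        if ( $ret === false ) {
            throw new RuntimeException( 'PCRE error code ' . preg_last_error() );
        }
        if ( $ret === 0 ) {
            break; // the next characters are not part of the run
        }
        $offset += strlen( $m[0] );
    }
    return $offset;
}

// The run length no longer depends on a single regex match.
$end = consumeRun( str_repeat( 'c', 8192 ) . ' tail' );
```

Because the repetition happens in a PHP loop rather than inside the regex engine, the amount of work per match stays small regardless of how long the input is.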

The lesson learned here is that the code did not properly check the return value of preg_match. This is all the more important because the size allowed for the JIT stack changes between PHP versions.
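One way to make this check hard to forget is a small wrapper that turns a false return value into an exception. The pregMatchStrict() helper below is a hypothetical sketch, not an existing MediaWiki or PHP function.

```php
<?php
/**
 * Like preg_match(), but throws when PCRE fails instead of returning false.
 * Returns true on a match and false when the pattern simply did not match.
 */
function pregMatchStrict( string $pattern, string $subject, &$matches = null ): bool {
    $ret = preg_match( $pattern, $subject, $matches );
    if ( $ret === false ) {
        // Covers engine failures such as PREG_JIT_STACKLIMIT_ERROR.
        throw new RuntimeException(
            'preg_match failed, PCRE error code ' . preg_last_error()
        );
    }
    return $ret === 1;
}

$found = pregMatchStrict( '/a+/', 'caab', $m );
```

With a wrapper like this, "no match" and "the regex engine gave up" can no longer be confused by a loose comparison such as `if ( !$ret )`.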

For future reference, @dcausse concluded: The regex could be optimized to support more chars (~3 times more) by using atomic groups, like so /(?>ab|c)+/. — T236419
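As a small illustration of the atomic-group variant, the sketch below checks that both forms of the reduced pattern return the same match on a short sample input. It only demonstrates the syntax and the equivalence for this input; it does not measure the stack usage mentioned above, and the sample string is made up.

```php
<?php
$subject = 'abccabX';

// Non-capturing group: the engine keeps backtracking positions inside the group.
preg_match( '/(?:ab|c)+/', $subject, $plain );

// Atomic group: once an alternative has matched, the engine does not
// backtrack into the group to try another alternative.
preg_match( '/(?>ab|c)+/', $subject, $atomic );

$same = ( $plain[0] === $atomic[0] );
```

The two alternatives start with different characters, so there is never a second alternative to fall back on and discarding those backtracking positions does not change what is matched here.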


📉 Outstanding reports

Take a look at the workboard and look for tasks that might need your help. The workboard lists error reports, grouped by the month in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Or help someone that’s already started with their patch:

→ Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • March: 3 of 10 reports left (unchanged). ⚠️
  • April: Three reports closed, 6 of 14 left.
  • May: (All clear!)
  • June: Three reports closed, 6 of 11 left.
  • July: One report closed, 12 of 18 left.
  • August: Two reports closed, 4 of 14 left.
  • September: One report closed, with 9 of 12 left.
  • October: Four reports closed, 8 of 12 left.
  • November: 5 new reports survived the month of November.
  • December: 9 new reports survived the month of December.

🎉 Thanks!

Thank you to everyone who helped by reporting, investigating, or resolving problems in Wikimedia production.

Until next time,

– Timo Tijhof


Footnotes:
[1] Incidents. – wikitech.wikimedia.org/wiki/Incident_documentation#2019
[2] Tasks created. – phabricator.wikimedia.org/maniphest/query…
[3] Tasks closed. – phabricator.wikimedia.org/maniphest/query…
[4] Open tasks. – phabricator.wikimedia.org/maniphest/query…