Digital Collections Associate Lisa Barrier and Digital Collections Manager Kathryn Gronsbell from Carnegie Hall explain what to expect when taking Wiki Education’s beginner’s Wikidata course and discuss their linked data plans for the future.

Introduction

Lisa Barrier and Kathryn Gronsbell.

We both started Join the open data movement: a beginner Wikidata course with limited Wikidata knowledge. While we understood the basic concept of Wikidata as an information source, we had minimal editing experience and even less familiarity with how Wikidata is created, maintained, and used. Throughout the course, we learned the technical skills needed to make edits, create items, and query items, as well as the underlying concepts and community practices that best explain how Wikidata works and why it continues to expand. We went into the course expecting to receive an introduction to editing (and maybe querying) but came out of it with a new understanding of how to interpret, share, and grow Carnegie Hall’s archival collection and performance data.

Participant Background

Lisa (Digital Collections Associate, Carnegie Hall): I initially chose to take the course to become comfortable editing Wikidata items and to learn more about using and creating Wikidata queries. My main day-to-day activities at Carnegie Hall include: creating authority records in our performance history database for entities and venues related to collections; cataloging and monitoring asset metadata in the digital asset management system (DAMS); and working with members of other internal departments to successfully upload, tag, and find their assets. I hoped to be able to create new Wikidata items for underrepresented Carnegie Hall collection data as I did not know how to approach this seemingly overwhelming task. I did not understand the flexibility and community structure of Wikidata and thought that I would personally have to create perfect, complete items for each collection entity.

Prior to this Wiki Education course, I took an in-person Wikidata training course on creating items with data from the Metropolitan Museum of Art. The course introduced adding statements and references to newly created items, but primarily covered art-related data models and did not explain how to search for other properties and available statements. With the aid of this first course and the Wikidata understanding of my colleague Rob Hudson (Archives Manager at Carnegie Hall), my early experience with Wikidata consisted of adding Carnegie Hall Agent IDs to items and referencing Wikidata Concept URIs in authority records created in the performance history database [which Rob set up to manifest in Carnegie Hall’s Linked Open Data (CH-LOD) as skos:exactMatch].

Kathryn (Digital Collections Manager, Carnegie Hall): I was excited to take this course with Lisa, and for the opportunity to learn more about Wikidata’s structure and communities. My aim was to apply that knowledge to help expand and improve Carnegie Hall’s performance history data as it relates to Digital Collections material. I’m responsible for Carnegie Hall’s Digital Collections – both the material and the digital asset management system (DAMS) where the collections are managed. We recently announced public access to a portion of the historic material at collections.carnegiehall.org and are excited to start modeling collection objects for inclusion in our linked open data.

Before starting this Wiki Education course, I had extremely limited experience with Wikidata. So little that my only contribution was made anonymously years ago – I corrected a statement that inaccurately gave a geographic location as a person’s cause of death (P509). I knew Wikidata had grown significantly in the past few years, and that there was a lot of opportunity to participate. Lisa and our colleague Rob shared their experiences learning about and contributing to Wikidata, and I was excited to join in on the fun.

Course Takeaways

Our most significant takeaways were the exposure to community practices and how to find inspiration within existing data projects and groups. Our instructors, Will Kent and Ian Ramjohn, discussed the culture of page ownership and how there may be “primary” editors who will often engage on the Talk pages and lead decision-making about edits. On a meta-level, we were introduced to WikiProjects to explore where discussions take place and how to participate in those conversations. These projects are community-run, can include goals, chats, and data models, and vary in maturity and comprehensiveness. There are several WikiProjects related to archival collections, performing arts organizations, and other concepts that directly overlap with our daily work – we are excited to jump in and participate, and maybe create our own. Class chats over Zoom touched upon active groups and recent conferences related to Wikidata, including: LD4 Linked Data for Libraries, WikidataCon, WikiCon North America, Wikimania, and the International Semantic Web Conference. We were introduced to showcase items (high-quality, well-developed examples), which enabled us to contribute more confidently to Wikidata. Benchmarks for data quality were a hot topic – we had an enlightening conversation about how bots contribute to Wikidata, understanding what role a bot may have, and identifying and correcting inaccuracies that may arise from automated data creation.

Along with the class discussion, we found the following technical skills fundamental to our use of Wikidata. We learned how to:

  • Edit items by adding statements, references, and identifiers (and best practices for doing so);
  • View an item’s change log and edit history;
  • Set up notifications on watch items;
  • Use suggested property lists and Recoin to create more complete items;
  • Find and build queries with editable examples (and some great sample query lists);
  • Make edits, batch edits, and queries a little bit easier with tools such as Cradle, TABernacle, OpenRefine, and QuickStatements, and potentially use these tools to onboard new contributors.
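
As an illustration of the batch-editing tools in the list above, here is a minimal Python sketch that turns a few local records into QuickStatements V1 commands (tab-separated lines). It is only a sketch under stated assumptions: the record fields, the function name, and the property placeholder P_LOCAL_ID are hypothetical and would need to be replaced with a real external-identifier property. It also assumes every new item is a person, which would not hold for ensembles. Q4115189 is the Wikidata Sandbox item, used here so the example does not point at a real entity.

```python
def quickstatements_rows(records, id_property):
    """Build QuickStatements V1 command lines (tab-separated) for local records.

    Records with a known QID get an identifier statement added to that item;
    records without one get a CREATE block first (assumed to be humans).
    """
    rows = []
    for record in records:
        qid = record.get("qid")
        if qid:
            subject = qid
        else:
            # CREATE makes a new item; LAST refers to the item just created.
            rows.append("CREATE")
            subject = "LAST"
            rows.append(f'LAST\tLen\t"{record["label"]}"')  # English label
            rows.append("LAST\tP31\tQ5")  # instance of: human (assumption)
        # String and external-ID values are wrapped in double quotes.
        rows.append(f'{subject}\t{id_property}\t"{record["local_id"]}"')
    return rows


# Hypothetical sample data; P_LOCAL_ID is a placeholder, not a real property.
records = [
    {"qid": "Q4115189", "label": "Wikidata Sandbox", "local_id": "12345"},
    {"qid": None, "label": "Example Performer", "local_id": "67890"},
]
print("\n".join(quickstatements_rows(records, "P_LOCAL_ID")))
```

The printed lines could then be reviewed by hand before pasting them into QuickStatements, which keeps a human in the loop for each batch.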

We understood that increasing our experience with and exposure to Wikidata would help us plan for upcoming data projects. We can now work on continuing alignment between Wikidata and Carnegie Hall’s performance history data (CH-LOD) and create items for under-described or lesser known entities (including performers, composers, and artists) who may not be described in other datasets. We hope to engage with some of the existing WikiProjects around performing arts concepts and content, and potentially use WikiProjects as a space to model our public collection data.

Upcoming Data Projects at Carnegie Hall

The Carnegie Hall Archives is undertaking an exploratory project to understand how using Wikibase may help manage some of our data. Wikibase is the software that Wikidata runs on. Anyone can set up their own instance of Wikibase to house their data. Thanks to this course, we have a better grasp of what we can query and pull into a local Wikibase from Wikidata, and a better basis for understanding what might be useful to contribute back to Wikidata after a standalone Wikibase is established.

One collection that we think will most benefit from a Wikibase instance (and possible collaborative WikiProject) is our Tenants and Studios Collection. Currently, this collection data lives in a spreadsheet that lists individuals and groups that lived and/or worked in artist studios that no longer exist in Carnegie Hall’s current configuration. While this spreadsheet format is acceptable, we want to model the collection data semantically to increase usability and discover connections in the data that are not possible to find using a spreadsheet. Creating items for names and studios would allow more control over the structure and visibility of the information, and allow us to capture spatial and temporal variances over the years that are not easily described or captured in flat or relational data structures. Now that we understand what data is useful to push to Wikidata and which information should be kept as a local resource, we can better create and edit items for research and reference purposes. We ultimately envision a Wikibase instance for the Tenants and Studios Collection as an opportunity to combine Carnegie Hall’s history with the data and stories of external resources and academics.

Upcoming data projects, like the one described above, will be under our newly established Carnegie Hall Data Lab. The Data Lab is a learning space for Carnegie Hall to expand our understanding of information innovation through experiments with linked open data, semantic technologies, and data-driven strategies that leverage the resources of the Carnegie Hall Archives. Having the experience and exposure we received in this Wiki Education course allowed us to more confidently initiate and participate in Data Lab experiments.

We are grateful to our classmates for their participation and willingness to share, and for the guidance and insight from Will Kent and Ian Ramjohn throughout the course. Thank you, Wiki Education!


Registration for our upcoming Wikidata courses is open! New to linked data? Join the open data movement in our beginner’s course. Have more experience with linked data or Wikidata? Sign up for our intermediate course that focuses on possible applications. Or visit data.wikiedu.org for more information.


Thumbnail image by Lmbarrier, CC BY-SA 4.0, via Wikimedia Commons.

Dr. Bridget Marshall is an Associate Professor at the University of Massachusetts Lowell and recently completed one of our Wiki Scholar courses with faculty at her institution. The Wikipedia training course is part of an initiative at UMass Lowell to build digital literacy teaching capacity and address the gender gap on campus and in Wikipedia.

I joined the Women in Red initiative at UMass Lowell because I had repeatedly experienced searching for biographies of women writers on Wikipedia and found them to be lacking, limited, or in some cases, non-existent. When I started as a Wiki Scholar in Wiki Education’s Women in Red training course sponsored by the University of Massachusetts, Lowell, only 18.04% of Wikipedia’s biographies were about women. And if students – and the wider world – don’t see stories about the lives and work of women, they will assume that women haven’t made substantial contributions to knowledge, or assume they don’t belong in certain industries, and continue to perpetuate systems that exclude women or keep them at the margins.

While I’m a regular user of Wikipedia, I had never considered becoming a contributor of content. But knowing that there were several Wikipedia experts as well as my own colleagues at the University working together really helped me feel comfortable diving into this new realm. The class helped me to feel more confident in jumping into this platform, and the fact that there was a due date and weekly meetings helped me to keep on track. There’s always a lot to do, so it’s easy to say “oh, it would be great to fix that entry…someday” but having the class helped me focus on learning the necessary skills to make – and then complete – the additions and updates that I thought were important.

Two young women in Lowell, Massachusetts circa 1870. Source: Center for Lowell History, University of Massachusetts Lowell Libraries. (Public domain)

I’ve been working for a while now on a project about the “mill girls” of Lowell. There is actually a wonderfully well-developed page about them, and it’s a great resource, but I was disappointed that there weren’t more biographies of the individual women. I was able to improve and update several of the individual biographies of “mill girls” and also add a biography for a mill girl who did not previously have her own page. Very often, these writers are just referred to as “mill girls” as a collective, when in fact there were so many different individuals involved. They published stories, poems, and non-fiction, a lot of which is really interesting, and was well known at the time they were writing. Yet in the time since then, their writing – like a lot of popular writing by women – has all too often been ignored.

Nineteenth-century literature is filled with big names that you probably read in high school: Nathaniel Hawthorne, Herman Melville, Henry David Thoreau, Ralph Waldo Emerson. The ones that are taught over and over again are mostly men (and also white men). When I run workshops for K-12 teachers about the writing by mill girls, they are shocked to find that it existed, or that it’s something they could consider teaching.

Right now, if you go to the Wikipedia page for American literature, which has a quality assessment of “B,” you will see that while there are only four women (Emily Dickinson, Harriet Beecher Stowe, Edith Wharton, and Harper Lee) who make it into the first section (the overview), in that same section there are thirty-seven men included. Four women out of forty-one total authors mentioned: that is less than 10%, a number even worse than the percentage of women’s biographies across Wikipedia. The rest of the article does include more women, but it’s just one of many, many examples of how, despite the fact that large numbers of women were writing and publishing – and popular during their lives! – they simply are not represented in Wikipedia. Teachers and students need to see these people as important in order to want to teach them and read them, and one way of demonstrating that an author is important is to have a robust appearance on Wikipedia.

In addition to my work in nineteenth-century literature, I teach a course on “Disability in Literature,” so I was also particularly interested in the overlap of women writers with disabilities. Very few of the writers I discuss in this class have Wikipedia pages. In some cases, I was very surprised by this, because these were authors with multiple books. I created a new page for one of these writers, and I was so pleased to find that several other people then added to it with more information and citations. Seeing that your article has been improved – by other people you’ll never meet or know! – is a really joyful experience. Knowing that you’ve contributed to something that will grow and be improved by others (and that you can come back to it and improve it yourself) really makes writing for Wikipedia feel worthwhile. This is one reason why I’m planning to include a Wikipedia writing assignment in one of my future courses, so that my students can think about audiences beyond just our classroom.

As of 6 January 2020, 18.19% of English Wikipedia’s biographies are of women. The needle is moving, but slowly, and we can use more Wiki Scholars to increase the number and improve the quality and variety of women represented on this important source.


Interested in taking a course like this? Sign up for our upcoming course and write Wikipedia biographies for women across disciplines and professions. To see all courses with open registration, visit learn.wikiedu.org.


Learn more about our partnership with UMass Lowell and this particular course by reading our blog post. For inquiries about partnering with Wiki Education, contact Director of Partnerships Jami Mathewson at jami@wikiedu.org or visit partner.wikiedu.org.


Hero/thumbnail image in the public domain.

6 million articles on Wikipedia!

19:11, Monday, 27 January 2020 UTC

As of late last week, there are 6 million encyclopedic articles on the English Wikipedia. Milestones like this serve as a reminder of how far this resource, which we all have at our fingertips, has come. This event is also an opportunity to acknowledge and appreciate the continued commitment of the Wikipedian community to share all human knowledge with the world. Volunteers make Wikipedia what it is. And the world benefits from that.

Rosie Stephenson-Goodknight (VGrigas, CC BY-SA 4.0)

While it’s impossible to know which article brought Wikipedia over the threshold of six million, around 15 articles were added to the site in the minute it reached the milestone. Like many decisions made around Wikipedia every day, volunteers discussed which article should be recognized as the symbolic six millionth. Consensus determined that the honor should go to the biography of writer Maria Elise Turner Lauder, written by the prolific User:Rosiestep, known in and outside of the Wikipedia realm as Rosie Stephenson-Goodknight. As a founder of projects like Women in Red, Rosie helps lead the effort to close Wikipedia’s notorious gender gap and has written hundreds of new biographies for women writers since she began adding content to the site in 2007.

The first, second, third, fourth, and fifth millionth Wikipedia articles were about a railway station, television program, Norwegian director, Egyptian city, and rare shrub, respectively. Maria Elise Turner Lauder joins this diverse array of topics and reminds us of the important mission of groups like Women in Red and Wikipedians like Rosie to represent everyone on Wikipedia.

Maria Elise Turner Lauder, 1893.

Rosie’s mission to add 19th century women writers to Wikipedia is a powerful one. But it can also be challenging. Many of these women just weren’t recognized in their time, so collecting reputable sources to summarize for their Wikipedia biography is not always a simple task.

Wiki Education has been happy to help remove this barrier through our Visiting Scholars program, a program in which Wikipedians receive access to academic sources they wouldn’t otherwise be able to use. We connected Rosie with Northeastern University in 2017 and since then, she has created 352 new Wikipedia articles, added 862,000 words, and uploaded more than 5,000 freely licensed images, and her work has been seen 2.33 million times.

By representing the lives and accomplishments of women across history for Wikipedia’s worldwide audience, Rosie honors their names and writes them into public history. While they were often silenced in their lifetimes and in the historical canon, we are not silent about them now.

Here’s to the next million!


Thumbnail includes images by Fuzheado (CC BY-SA 3.0) and VGrigas (WMF) (CC BY-SA 3.0).

weeklyOSM 496

11:22, Sunday, 26 January 2020 UTC

14/01/2020-20/01/2020


OSM and the streets in my city 1 | © Leaflet | © map data OpenStreetMap contributors

Mapping

  • Andy Mabbett noticed that JOSM flags building=disused as outdated, but that no alternative tags can be found in the OSM wiki. Kevin Kenny responded that JOSM uses the lifecycle prefixes disused:building=* and abandoned:building=* instead.
  • The European Water Project is still actively contributing to the mapping of places with drinking water. Not just in terms of mapping but also by improving the tagging for such amenities. The project seeks the opinion of the community about the tagging of “seasonal” in conjunction with the combination of amenity=drinking_water or amenity=fountain and drinking_water=yes.
  • The European Water Project has drafted a proposal for the tagging of free_water=yes/no/customers and the specification free_water:container= and started the Request For Comments period.
  • The mapping of the eastern boundary of the Río de la Plata, i.e. defining where the river ends and the ocean begins, was the subject of an edit war.
  • Mapillary have made their map features available as a data layer in iD Editor. The layer contains point data which was extracted from imagery uploaded to Mapillary.
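
The lifecycle-prefix convention mentioned in the first item of this list can be sketched in a few lines of Python. This is an illustration only, not how JOSM implements its check: the function name is hypothetical, and because a tag like building=disused does not record the original building type, the sketch assumes the generic value yes for the prefixed key.

```python
# Lifecycle states handled by this sketch (the two named in the item above).
LIFECYCLE_STATES = {"disused", "abandoned"}


def to_lifecycle_prefix(tags, key="building"):
    """Return a copy of tags with e.g. building=disused rewritten as
    disused:building=yes. Tags that do not match are returned unchanged."""
    value = tags.get(key)
    if value not in LIFECYCLE_STATES:
        return dict(tags)
    new_tags = {k: v for k, v in tags.items() if k != key}
    # Assumption: the original building type is unknown, so use "yes".
    new_tags[f"{value}:{key}"] = "yes"
    return new_tags


print(to_lifecycle_prefix({"building": "disused", "name": "Old Mill"}))
```

A mapper who knows the original building type would normally record it instead of yes, so any such rewrite is better reviewed case by case than applied automatically.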

Community

  • OpenStreetMap Ireland are pleased with progress of mapping buildings in Kilkenny. Nearly 70% of the task manager squares have been mapped. Other Irish counties are also showing good progress.
  • Tobias Knerr posted a reminder about the upcoming Google Summer of Code 2020 and asks the community to add project ideas to the OSM Wiki page.
  • Allan Mustard, one of the newly elected OSMF board members, has drafted a SWOT analysis and asks the community to add their perspective on the page in the OSM Wiki.
  • n76 blogged about the problems he faced when he tried to produce a map of trekking destinations in Nepal. The name tags in Nepal seem to break with OSM name conventions, specifically by using romanised/transliterated names in the main name= tags rather than local names.
  • OSM-UK have been using Loomio for collaborative work, but the removal of the free tier led to discussion as to what platform to use in the future. Harry Wood pointed to the existing United Kingdom sub-forum. The issue has been resolved temporarily for 2020 by continuing to use Loomio.
  • Samuel Darkwah Manu (Sammyhawkrad), a scholar from Ghana, shares his experience of participating in the State of the Map Africa 2019 and Understanding Risk – West and Central Africa conferences in Ivory Coast in a diary post.

Imports

  • Branko Kokanovic wrote about plans to import local boundaries (admin_level=9) in Serbia and tag these with ref:sr:maticni_broj=, a tag pointing to reference numbers similar to the UK ons_code= or the French ref_insee=. Another interesting news item is the availability of an open data portal provided by the Републички геодетски завод, the geodetic authority in Serbia.
  • CJ Malone announced his intention to update bus stop names on the Isle of Wight. He plans to use an open data set from local bus operator Southern Vectis.
  • An import of Swedish settlement names from Lantmäteriets GSD-Terrängkartan is currently being discussed (sv) (automatic translation) on the local mailing list. The project and its current progress are documented on the comprehensive page in the OSM Wiki.

Events

  • Applications for scholarships for the upcoming State of the Map 2020, to be held in Cape Town in July 2020, can be submitted until 15 February 2020.

Humanitarian OSM

  • Harry Wood reminded us of the hundreds of mappers who spontaneously came together after the Haiti earthquake to rapidly produce a map for aid agencies. In response, Simon Poole observed that today’s HOT is not the same HOT that responded to the Haiti earthquake in 2010. Mikel Maron asks us to think about how we can help those who are still suffering in Haiti. He points out that, in some ways, HOT has benefited more than Haiti from the quake response.

Maps

  • [1] Andrei Kashcha announced a website which allows you to create, customise and export a map with all roads of a city and gave a brief introduction in a short video. He adds on Twitter that the data is licensed under ODbL and the source code is provided under an MIT license.

switch2OSM

  • Thomas Gratier pointed to a publication of the Elysée Palace (seat of the French President). The Elysée Palace advertised (automatic translation) the exhibition “Fabriqué en France” and used an OSM map to show the origin of the products.

Licences

  • MapTiler, a company offering mapping products and services based on OSM data, achieved something that others have been working on for years: Within one day the company fixed the missing attribution.

Software

  • Heidelberg University’s GIScience Research Group reports that NASA used the University’s OpenRouteService navigation service in a study of disaster response times.

Programming

  • Michal Migurski, following up on Andy Allan’s suggestion, had an in-depth look at testing the OSM Website Chef recipes with continuous integration tools. He wrote up his experiences as a diary post.
  • A forum user asked if it is possible, using OSRM, to create bicycle routes which avoid crossing busy roads. It turns out that this is missing functionality in OSRM. The discussion raises other use cases and possible ways to do this type of routing with existing software.

Releases

  • The OSM editor iD has been updated to version 2.17.1. The editor no longer supports Node 8 and requires at least Node 10 if you want to build the editor yourself. Other changes include the new ability to reorder fields with multiple values by drag-and-drop, usability improvements and many more.
  • OSMnx, a Python package to download, model, project, visualise, and analyse street networks with OSM data, has been upgraded to 0.11.3.

Did you know …

  • … MapRoulette is calling on mappers to nominate their favourite challenges so that they appear at the top of the Challenge list?
  • … the app Cartas Militares (pt), an all-terrain navigation app with military cartography? The app was created by the Geospatial Information Army Center (CIGeoE) of Portugal and it is available for Android and iOS devices.
  • … Bike Ottawa’s guide to tagging cycling infrastructure in OSM?

Other “geo” things

  • The United Kingdom has a bewildering complexity of administrative geographies. A new set of briefing papers from the Library of the House of Commons makes comparisons between various geographies, showing how they overlap or have coterminous boundary segments. A deeper dive behind the history of these different geographies is also available.
  • Haikus created using OpenStreetMap data (as we reported earlier) are now also available (es) in Spanish.
  • The newspaper Heilbronner Stimme reports (de) (automatic translation) that a new pond created by a beaver dam near Adelshofen now appears on Google Maps (it also appears on OSM).
  • The Guardian features as one of their long reads an in-depth article about Strava.
  • The Green Car Congress blog reported on a recent paper which shows that newly developed street network patterns are reinforcing a trend towards urban sprawl. The open-access research, published in the journal PNAS, relied on OpenStreetMap and satellite data. The work was carried out by researchers from McGill University (Canada) and the University of California Santa Cruz.
  • Huawei have signed a deal with TomTom for maps and map services on their phones. New maps are needed because Huawei are no longer able to use Google Maps.
  • Business Insider profiled What3Words in their series on startups. With over 100 employees and major customers such as newly signed Mercedes, the firm is “working towards profitability”. It’s not clear precisely what this means as the filed accounts for 2018 showed losses of £11 million on a turnover of about £250,000.
  • This XKCD comic makes fun of the Mercator map projection by replacing every continent and island with South America.

Upcoming Events

Where | What | When | Country
Ivrea | Incontro mensile | 2020-01-25 | Italy
Rome | Incontro mensile Roma | 2020-01-27 | Italy
Prague | Missing Maps Mapathon Praha | 2020-01-28 | Czech Republic
Zurich | Missing Maps Mapathon Zürich | 2020-01-29 | Switzerland
Düsseldorf | Düsseldorfer OSM-Stammtisch | 2020-01-29 | Germany
Hanover | OpenStreetMap Sprechstunde | 2020-01-29 | Germany
Budapest | Budapest gathering | 2020-02-03 | Hungary
Berlin | OSM-Verkehrswende #8 | 2020-02-04 | Germany
London | Missing Maps London | 2020-02-04 | United Kingdom
Stuttgart | Stuttgarter Stammtisch | 2020-02-05 | Germany
Dortmund | Mappertreffen | 2020-02-07 | Germany
Rennes | Réunion mensuelle | 2020-02-10 | France
Grenoble | Rencontre mensuelle | 2020-02-10 | France
Taipei | OSM x Wikidata #13 | 2020-02-10 | Taiwan
Toronto | Toronto Mappy Hour | 2020-02-10 | Canada
Lyon | Rencontre mensuelle pour tous | 2020-02-11 | France
London | Move 2020 (featuring OSMUK) | 2020-02-11 to 2020-02-12 | United Kingdom
Zurich | 114. OSM Meetup Zurich | 2020-02-11 | Switzerland
Munich | Münchner Stammtisch | 2020-02-12 | Germany
Nantes | Rencontre mensuelle | 2020-02-13 | France
Berlin | 140. Berlin-Brandenburg Stammtisch | 2020-02-14 | Germany
Turin | FOSS4G-it/OSMit 2020 | 2020-02-18 to 2020-02-22 | Italy
Riga | State of the Map Baltics | 2020-03-06 | Latvia
Freiburg | FOSSGIS-Konferenz | 2020-03-11 to 2020-03-14 | Germany
Chemnitz | Chemnitzer Linux-Tage | 2020-03-14 to 2020-03-15 | Germany
Valcea | EuYoutH OSM Meeting | 2020-04-27 to 2020-05-01 | Romania
Guarda | EuYoutH OSM Meeting | 2020-06-24 to 2020-06-28 | Spain
Cape Town | State of the Map 2020 | 2020-07-03 to 2020-07-05 | South Africa

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by NunoMASAzevedo, Polyglot, Rogehm, SK53, SunCobalt, TheSwavu, YoViajo, derFred.

Celebrating 2 years of MediaWiki codesearch

10:50, Sunday, 26 January 2020 UTC

It's been a little over 2 years since I announced MediaWiki codesearch, a fully free software tool that lets people make regex searches across all the MediaWiki-related code in Gerrit and much more. While I expected it to be useful to others, I didn't anticipate how popular it would become.

My goal was to replace the usage of the proprietary GitHub search that many MediaWiki developers were using due to lack of a free software alternative, but doing so meant that it needed to be a superior product. One of the biggest complaints about searching via GitHub was that it pulled in a lot of extraneous repositories, making it hard to search just MediaWiki extensions or skins.

codesearch is based on hound, a code search engine written in Go, originally maintained by Etsy. It took me all of 10 minutes to get an initial prototype working using the upstream Docker image, but I ran into an issue pretty quickly: the repository selector didn't scale to our then-500+ Git repositories (now we're at more like 900!). So it wouldn't really be possible to just search extensions easily.

After searching around for other upstream code search engines and not having much luck finding things I liked, I went back to hound and instead tried running multiple instances at once, and it more or less worked. I wrote a small ~50 line Python proxy to wrap around the different hound instances and provide a unified UI. The proxy was sketchy enough that I wrote "Please don't hurt me." in the commit message!
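
To make the idea concrete, here is a rough sketch of the routing step such a proxy needs: take the first path segment as the search profile and forward the rest of the request to the matching hound instance. This is not the actual codesearch proxy; the profile names, port numbers, and function name are all assumptions made up for illustration.

```python
# Hypothetical mapping of search profile -> local port of a hound instance.
BACKENDS = {
    "core": 6080,
    "extensions": 6081,
    "skins": 6082,
}


def backend_url(path, query=""):
    """Map an incoming path such as /extensions/search to the URL of the
    hound instance that serves that profile."""
    profile, _, rest = path.lstrip("/").partition("/")
    if profile not in BACKENDS:
        raise KeyError(f"unknown search profile: {profile!r}")
    url = f"http://localhost:{BACKENDS[profile]}/{rest}"
    if query:
        url += "?" + query
    return url


print(backend_url("/extensions/search", "q=wfDebug"))
```

A real proxy would also have to fetch the backend response and relay it to the client; keeping the routing in one small pure function makes that part easy to test on its own.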

But it seems to have held up over time, surprisingly well. I attribute that to having systemd manage everything and the fact that hound is abandoned/unmaintained/dead upstream, creating a very stable platform, for better or worse. We've worked around most of the upstream bugs so I usually pretend it's a feature. But if it doesn't get adopted sometime this year I expect we'll create our own fork or adopt someone else's.

I recently used the anniversary to work on puppetizing codesearch so there would be even less manual maintenance work in the future. Shoutout to Daniel Zahn (mutante) for all of his help in reviewing, fixing up and merging all the Puppet patches. All of the package installation, systemd units and cron jobs are now declared in Puppet - it's really straightforward.

For those interested, I've documented the architecture of codesearch, and started writing more comprehensive docs on how to add a new search profile and how to add a new instance.

Here's to the next two years of MediaWiki codesearch.

WBStack Infrastructure

13:08, Saturday, 25 January 2020 UTC

WBStack currently runs on a Google Cloud Kubernetes cluster made up of 2 virtual machines, one e2-medium and one e2-standard-2. This adds up to a current total of 4 vCPUs and 12GB of memory. No Google-specific services make up any part of the core platform at this stage, meaning WBStack can run wherever there is a Kubernetes cluster with little to no modification.
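
The totals above can be checked with a few lines of Python. The per-machine figures are taken from Google's published E2 machine-type specifications as best understood here (e2-medium: 2 shared vCPUs, 4GB; e2-standard-2: 2 vCPUs, 8GB) and should be treated as an assumption rather than something stated in this post.

```python
# (vCPUs, memory in GB) per machine type; values assumed from Google Cloud docs.
MACHINE_TYPES = {
    "e2-medium": (2, 4),      # shared-core machine type
    "e2-standard-2": (2, 8),
}

nodes = ["e2-medium", "e2-standard-2"]

total_vcpus = sum(MACHINE_TYPES[node][0] for node in nodes)
total_memory_gb = sum(MACHINE_TYPES[node][1] for node in nodes)

print(f"{total_vcpus} vCPUs, {total_memory_gb}GB memory")
```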

A simplified overview of the internals can be seen in the diagram below, where blue represents the Google-provided services and green represents everything running within the Kubernetes cluster.

Other utility services exist around the core platform, both running on Kubernetes and as Google services. These include:

And some of the platform services are made up of multiple parts, such as:

  • Different MediaWiki services for UI, API and job / backend requests
  • Different Platform APIs for user or services sourced requests
  • Other Platform elements such as a job queue and scheduler.
  • Replication of the storage layer (redis, mysql, blazegraph)

Right now many of the Google priced variables are still within, or mostly provided by, the free tier, meaning the cost per month is essentially the cost of the CPU, RAM and Load Balancer.

Moving forward it would be nice to move this to a location more supportive of the WBStack project to enable continued cost effective growth and development at this early stage.

The post WBStack Infrastructure appeared first on Addshore.

Looking for inspiration for your course this spring? These instructors are raving about how easy it is to adapt the Wikipedia assignment to their own disciplines. Sign up on the Dashboard today and follow their roadmaps!

  • Archaeology: Dr. Kate Grillo published an article last year about the value of a Wikipedia assignment in archaeology courses. The paper, which she published along with Daniel Contreras, ultimately concludes that “Wikipedia’s popularity and reach mean that archaeologists should actively engage with the website by adding and improving archaeological content.” Read more…
  • Architecture: Dr. Anthony Denzer, an Associate Professor of Architectural Engineering, shares how he sells the Wikipedia writing assignment to his students and all the reasons he plans to conduct the assignment again in future terms. “Was it a success? Absolutely.” Read more…
  • Biographies of women in STEM: So many women in STEM don’t have a Wikipedia biography until they’ve been recognized in a huge way. But thanks to Dr. Rebecca Barnes’s student at Colorado College, paleoclimatologist Dr. Andrea Dutton already had one even before she was named a MacArthur Foundation Fellow last year. Read more…
  • Composition: “What better way to teach research and writing skills than by using a platform that virtually ALL students already use, and one that has been universally forbidden to them as a research source throughout high school?” Read more…
  • English and anthropology: Dr. Gardner Campbell shares that the results of his Wikipedia assignment “far exceed” his expectations. “When the nature of an assignment leads to discovery, not simply to compliance, the learning becomes the students’ own.” Read more…
  • Gender studies: In their research analysis of the Wikipedia assignment, Dr. Ariella Rotramel, Rebecca Parmer, and Rose Oliveira of Connecticut College conclude that the assignment works well with feminist curricula, prepares students for careers, and fosters effective collaboration among faculty. Read more…
  • Early American history: The very thing about the assignment that some students found intimidating (writing for the public) was what made Rachel Van fall in love with it. She found that students asked themselves, Were they right in how they represented the history? Was their research fair to the subject? “These were the very things I wanted them to contemplate.” Read more…
  • Local history: Dr. Jason Todd tells the story of his students at Xavier University of Louisiana, who dramatically improved the Wikipedia article of their local town, saving the area from erasure in cultural memory. Read more…
  • Local and religious history: Dr. Heather Sharkey invited students to write Wikipedia pages about topics related to Muslim, Christian, or Jewish built structures. The class chose to create a new page for a local mosque near the Penn campus. “The students loved writing this page because they often pass this mosque on the street. They say that it has given them a sense of connection to this building and its community, which is part of the larger West Philadelphia neighborhood.” Read more…
  • History of science: What happens when students are confronted with a potential audience of 500 million? Dr. Alexandra Edwards can tell you. Read more…
  • Information science: Trudi Jacobson explains how she wove a Wikipedia writing assignment together with the six frameworks of information literacy. Read more…
  • Journalism: Dr. Melony Shemberger provides her framework for teaching Wikipedia writing assignments in the journalism classroom. Her work was even selected as a winning entry in a teaching contest at last year’s Association for the Education of Journalism and Mass Communication annual conference. Read more…
  • Medicine: For the group of med students who made the most significant improvement to their assigned Wikipedia pages, Melissa Kahili-Heede and Dr. Richard Kasuya would throw them a pizza party. The result was a classroom experience that promoted social responsibility, digital literacy, and some friendly competition! Read more…
  • Medicine: Instructional designer Johnathon Neist shares why a Wikipedia writing assignment is a great fit for medical school curricula. Read more…
  • Media studies: What does it mean for students to join the “Wikipedia ecosystem”? Dr. Carolyn Cunningham reflects on a Wikipedia writing assignment from a media studies perspective. Read more…
  • Music: Even small classes, like Dr. Michael Rushing’s piano pedagogy course at Mississippi College, can make a big difference on Wikipedia. Read more…
  • Public health: The National Institute for Occupational Safety and Health recognized the course we supported at the Harvard T.H. Chan School of Public Health as an effective way to make occupational safety information available to the public. Read more…
  • Science communication: How can students build their confidence as science communicators while also navigating the tricky new problems brought up by the internet/digital age? Sarah Mojarad shares why she comes back to the Wikipedia writing assignment with her students year after year. Read more…

Interested in adapting a Wikipedia writing assignment for your own course? Visit teach.wikiedu.org for access to our free assignment templates and tools.

Semantic MediaWiki 3.1.3 released

23:10, Thursday, 23 2020 January UTC

January 24, 2020

Semantic MediaWiki 3.1.3 (SMW 3.1.3) has been released today as a new version of Semantic MediaWiki.

It is a release providing bug fixes and updated system message translations. Please refer to the help pages on installing or upgrading Semantic MediaWiki to get detailed instructions on how to do this.

On becoming a Wiki Woman Scientist

18:03, Thursday, 23 2020 January UTC

Dr. Jyoti Patel is a Research Associate Professor in the Department of Neurosurgery at NYU School of Medicine. She recently completed our Wiki Scientists course with the New York Academy of Sciences.

Jyoti C Patel.
Image by Jyoti C Patel, CC BY-SA 4.0 via Wikimedia Commons.

Growing up in the UK during the 70s, long before the Wikipedia era, I was fortunate to have received a completely free education. I did my O-Level and A-Level exams at an Inner City London School in one of the most disadvantaged regions of the country and was among the very few in my school year who got sufficient grades to be able to make it to University. This was an achievement that amazed not only me but also my family, friends, and even my teachers. Having confidence in myself has never been my strong point.

Part of my success at school may be credited to my older sister who enthusiastically taught her younger siblings everything she had learned at school that day. I remember being told all about atoms at the age of six and being my sister’s lab assistant when she experimented with her home Chemistry Kit. My teachers also played a big role in lending me books and spending extra time tutoring me. Credit also goes to the local library and the secondhand set of Encyclopedia Britannica volumes our neighbor gave us.

Going to University was never in my plan but I thought: “Now that I have the grades I guess I ought to go”. Studying Pharmacology was a challenge. I struggled with the immense amount of information thrown at me. But again, with the support of the excellent teaching staff and many of those around, as well as the library, I managed to get through it. Again, I had made no concrete plans for my future after University but, to cut a very long and serendipitous story short, I ended up working on a Ph.D. in Neuroscience in a scientific research lab headed by a guy that I had once heard as an external speaker at University and who had written a textbook that was crucial in getting me my first degree. During my period as a doctoral student, still as lacking in confidence as ever, I met many inspirational characters who were extremely intelligent and worldly in comparison to me but, importantly, were unstinting in sharing their knowledge (both academic and non-academic).

A few years after completing my Ph.D. I had the amazing opportunity to join a lab in New York City. It wasn’t until I was in the USA that it really struck me just how fortunate I was to have received for free the formal education and informal nurturing that had shaped me into becoming a fully-fledged scientist exploring the wonders of the brain. I had always felt compelled by my personal experience to try to pass on my knowledge, but I mostly did this by training others in my lab or by teaching students in classes. This was satisfying but only impacted relatively few people. I began to take part in the Brain Awareness Fair at my institution several years ago; this annual event is designed to educate the public about the brain and brain health. I find this an important and rewarding activity; again, however, it mostly impacts those in the local community. I’ve always wanted to do more but, as a full-time research scientist, I have limited time and energy to invest in attending organized meetings and events.

My partner (also a scientist) and I have been big enthusiasts for Wikipedia and often talk about its underlying philosophy and what an immense resource it is. A place where people all across the globe may obtain (mostly) accurate information on just about any topic of note regardless of whether they are wealthy or not. The ultimate mentor. When I saw the announcement from NYAS about the opportunity to take part in a Wiki Scientist course, I was delighted. It seemed like a perfect way to discover how to pass on my scientific knowledge to a vast community whenever and however much I wanted.

Despite feeling intimidated about not being up-to-date with the latest internet technology, I knew I had to try it. Just like going to University, doing a Ph.D. and moving to the USA, I knew that enrolling in the Wiki Scientist course would force me to get out of my comfort zone.  I had never used ‘Zoom’ before but loved it when we used it to meet with our instructors Ryan, Ian, Will and Elysia, and fellow classmates once a week to discuss progress and concerns. I came to appreciate using ‘Slack’ as a communication channel during the week to share any issues or information we had between meetings. The Wiki Education Dashboard was like a home where we would complete on-line courses, assign ourselves articles to edit and keep track of our progress. But most of all I enjoyed finding out about the workings of Wikipedia itself; evaluating articles, using talk pages and Wiki Code to let other editors know what you are doing, and using sandboxes to draft material, as well as finding out more about the amazing Wiki Community.

Making minor edits to a real Wikipedia article for the first time was both incredibly scary and exhilarating at the same time. The sense of responsibility for what I added or amended was surprisingly overwhelming. We were told to ‘Be Bold’ but I was convinced I was going to do something drastically wrong. But as Ian (our Wiki Education expert) said “you cannot break Wikipedia”.  Next we were asked to make major changes to either a ‘Start’ article or an article with a low rating and in need of improving. I chose one close to my current research topic on the brain, which also happened to be on a list of Neuroscience articles needing improvement. Despite having written many scientific articles in my professional life, I found this task challenging. The non-biased factual style and way in which statements should be substantiated was different to what I am used to. How to find non-copyrighted images to include in an article was also revealing. After finishing my major edits to the article in the sandbox and then making them ‘live’ I was astonished to discover that before I started working on this article there had not been any changes to it from other editors for months. However, after my edits, there was an immediate flurry of activity from other editors correcting my typos…but thankfully not my science.  What I find most rewarding, however, is that 150 to 200 people view this article every day. Passing on my knowledge to that many people around the world really blows my mind.

With the relatively small number of women that make contributions to Wikipedia, I truly feel that being an active Wiki Woman Scientist is one of the most important things I can do. While exploring the platform for articles to edit I found that there is certainly no shortage of work to be done. I would encourage anyone and everyone to contribute content to this wonderful gift to our world –  and if you are a little underconfident and intimidated, as I was, just remember Ian’s words: “You cannot break Wikipedia”.


Interested in taking a course like this? Improve information about disability healthcare on Wikipedia through our upcoming course sponsored by WITH Foundation (here). Or write Wikipedia biographies for women across disciplines and professions (here). To see all courses with open registration, visit learn.wikiedu.org.


For inquiries about partnering with Wiki Education, contact Director of Partnerships Jami Mathewson at jami@wikiedu.org or visit partner.wikiedu.org.

Monthly Report, November 2019

19:46, Wednesday, 22 2020 January UTC

Highlights

Many Wiki Education staff members attended WikiConference North America in Boston. Wikidata Program Manager Will Kent, Scholars & Scientists Program Manager Ryan McGrady, Wikipedia Student Program Manager Helaine Blumenthal, Senior Wikipedia Expert Ian Ramjohn, Director of Partnerships Jami Mathewson and Chief Technology Officer Sage Ross presented at the conference alongside many Wiki Education program participants and faculty instructors in our programs. Wiki Education won the Education Impact Award during the conference.

Wiki Education won the Education Impact Award during WikiConference North America.
Image by Ruben Rodriguez, CC BY-SA 4.0 via Wikimedia Commons.
  • November was a busy month for partnerships. Jami joined faculty at Western Colorado University to promote the use of Wikipedia as a teaching tool. While in Cambridge for WikiConference North America, Jami had the opportunity to visit Harvard School of Public Health and Boston University and present to instructors. Also, Jami and Customer Success Manager Samantha Weald attended the Women’s Forum Global Meeting to run a 3-day session teaching conference attendees about Wikipedia’s gender gap and how they can help curb it.

Programs

Ryan McGrady presents during Wiki Education’s Programs & Tech Offsite in November 2019.
Sampling cacao juice during Wiki Education Programs & Tech Offsite Chocolate Tour.

The Programs and Technology teams had a joint offsite meeting in Boston, Massachusetts, prior to WikiConference North America. Agenda items included an overview of the current fiscal year for both Programs and Tech, each program manager sharing survey results and what they’ve learned from their programs, learnings from professional development that staff have taken recently, and a revisiting of our process maps. We also engaged in a team activity around our DISC profiles. After we wrapped up the meeting part of the day, we all did a walking tour of Harvard Square, focused on chocolate. We learned about the process of producing chocolate, and then tasted it in a variety of forms, from chocolate bars to cookies to ice cream to balsamic vinegar. The offsite was a great way to share learnings, touch base on our work, and engage in team-building activities.

Wikipedia Student Program

Status of the Wikipedia Student Program for Fall 2019 in numbers, as of November 30:

  • 388 Wiki Education-supported courses were in progress (222, or 57%, were led by returning instructors)
  • 7,451 student editors were enrolled
  • 62% of students were up-to-date with their assigned training modules.
  • Students edited 4,960 articles, created 417 new entries, added 3.61 million words and 37,600 references.

As always, November is one of the busiest months for the Student Program as students begin to move their work into the article main space. While our Wikipedia Experts were hard at work reviewing student work and responding to requests for help, Program Manager Helaine Blumenthal began to plan for the Spring 2020 term in earnest.

Helaine presents alongside faculty instructors in our program.
Image by Victor Grigas, CC BY-SA 4.0 via Wikimedia Commons.

Each term, Wiki Education works with hundreds of instructors, but we rarely get to meet our program participants in person. In November, we had two opportunities to do so. At this year’s WikiConference North America, held in Boston, Helaine got to both present alongside several of our instructors and attend a number of presentations about how instructors are using Wikipedia in their classrooms. We also had a chance to meet up with several of our instructors at this year’s National Women’s Studies Association conference held right here in our home base of San Francisco. Helaine, along with Chief Programs Officer LiAnna Davis, met with a handful of instructors to learn more about what instructors have to say about the Wikipedia assignment and what we here at Wiki Education can do to make it a better experience for students and instructors alike.

Student work highlights:

A coral hairstreak butterfly, uploaded and added to the article butterfly gardening by a student in Sarah Wyatt and Brett Fredericksen’s Scientific Writing course at Ohio University.
Image by Kopph, CC BY-SA 4.0 via Wikimedia Commons.

Literature has always been a great way to impart experiences and knowledge, as well as help people see things from an all-new point of view. This is why Pearl Cleage’s debut novel What Looks Like Crazy on an Ordinary Day is so important: one proponent of the book stated that it showcases the empowerment of women in the face of undeniably difficult life challenges, and that Cleage’s focus on the challenges associated with AIDS, drug addiction, and domestic violence offers an intuitive look into the realities of social issues that traditional societal institutions deal with only at surface level. Shortly after its release at the end of December 1997, this book became part of the Oprah Winfrey Book Club and spent nine weeks on the New York Times Bestseller List. We can thank University of Florida instructor Delia Steverson, who taught her first class with us this fall, for the creation of this article, as one of the students in her Gender and Sexualities in African American Literature class chose to work on this remarkable novel.

Eugenics has a dark and dirty shadow that continues to haunt humanity. Given the atrocities that have been committed in the name of “better humans”, it is important to remember what has come before so we as a society can work towards preventing further harm. Vanderbilt University students in first time instructor Danielle Picard’s Eugenics and Its Shadow course worked on and created several articles, two of which are on the Race Betterment Foundation and the sculpture Average Young American Male. The Race Betterment Foundation was a eugenics and hygiene foundation founded in 1906 at Battle Creek, Michigan by John Harvey Kellogg (yes, the man behind the breakfast cereal) due to his concerns about race degeneracy. The foundation supported conferences, including three National Conferences on Race Betterment, publications (Good Health), and a eugenics registry in cooperation with the ERO (Eugenics Record Office). The foundation also sponsored the Fitter Families Campaign from 1928 to the late 1930s and funded Battle Creek College (not what is now Andrews University). The foundation controlled the Battle Creek Food Company, which in turn served as the major source for Kellogg’s eugenics programs, conferences, and Battle Creek College. The Average Young American Male was a 22-inch plaster statue sculpted in 1921 by Jane Davenport Harris as a composite model for the eugenics movement in the United States. The statue was exhibited at the Second and Third International Congresses of Eugenics in 1921 and 1932, respectively, as a visual representation of that which eugenicists considered to be the degeneration of the white race. While the statue received mixed responses from contemporary critics, it inspired the creation of additional composite statues as propaganda for the eugenics movement throughout the mid-twentieth century.

A cuckoo chick pushing reed warbler eggs out of a reed warbler nest, uploaded by a student in Memorial University of Newfoundland’s Animal Behaviour course.
Image by Anderson MG, CC BY 4.0 via Wikimedia Commons.

Human communication by talking is complex and sophisticated, though you’d be wrong to think we’re the only animals capable of complex communication! This term, David Wilson’s Animal Behaviour course out of Memorial University of Newfoundland explored the myriad ways that animals behave and communicate with each other. Students in the course were busy, editing 74 articles total, including 5 new creations. Cleaner fish, an article about fish that remove dead skin and parasites from other fish’s skin and gills, saw huge growth, with the student now responsible for almost 80% of the article. Another article that saw growth was egg tossing, an article about, as expected, when birds dump eggs from a nest. Though it may seem counterintuitive for birds to throw eggs from their own nests, they may have good reasons, including tossing eggs of competing birds of the same or different species. Some egg tossing is done by nest parasites like cuckoos, who remove eggs from the nest of a host bird to add their own. Cuckoo hatchlings will even knock the host birds’ eggs out of nests so that they get all the attention from parents. The students’ edits brought detail and sturdy referencing to a fascinating array of articles on animal behavior.

Bluegrass music is a genre of American roots music that developed in the 1940s in the United States Appalachian region. Countless musicians have performed in this genre, including Reno and Smiley. They were an American musical duo composed of Don Reno (May 17, 1925 – October 16, 1984) and Red Smiley (February 21, 1925 – January 2, 1972). They were one of the most acclaimed duos in country (now bluegrass) music in the 1950s and early 1960s. It’s thanks to East Tennessee State University students in Lee Bidgood’s Bluegrass History I class that this article has been expanded and improved.

Calypso Rose is an iconic Tobagonian calypsonian with a career spanning almost 65 years. At the age of 78 she was the first calypsonian to perform at Coachella and the oldest person ever to do so. Students in Kavita Singh’s Caribbean Literatures course rewrote most of the article, greatly improving Wikipedia’s coverage of an important artist from the Caribbean. Beryl Gilroy is a Guyanese teacher and novelist who has been called “one of Britain’s most significant post-war Caribbean migrants”. A student editor in the class substantially improved Gilroy’s Wikipedia biography, adding content and context to her career and achievements. Other student editors in the class made major improvements to the biographies of Rita Indiana, a writer and singer-songwriter from the Dominican Republic, and Patrick Chamoiseau, a writer from Martinique. Another student editor added information about the indentureship process to the Chinese Caribbeans article.

You’d probably expect that more complex organisms have more genes than very simple ones, but the reality is that there’s no relationship between complexity and the number of genes: humans have about the same number of genes as a nematode, while the water flea, Daphnia pulex, has over 60% more. This state of affairs is known as the G-value paradox. The article was created by a student editor in Dan Graur’s Advanced Ecology and Evolution class. Another student in the class worked on the broader Genome size article, adding to it a discussion of the concept of genome miniaturization. One student in the class created a new article about the enemy release hypothesis, which suggests that the reason invasive species succeed is that they have escaped from the pests, predators and pathogens that normally keep them in check, while another created the herbivore effects on plant diversity article.

Most ant species produce winged reproductives (alates) which emerge from the nest in large swarms and mate. The females then go and found new colonies, while the males die. But for about 55 species of ants, the reproductives are wingless like the workers, and are called ergatoids. Before a student from this class started working on it, Wikipedia’s ergatoid article was short — four sentences long — and quite hard for the average reader to make sense of. Thanks to a student editor, this article is now lengthy, fairly comprehensive and, maybe most importantly, it’s readable and informative to someone who may never have heard the term before.

Two student-authored articles appeared on Wikipedia’s Main Page in the Did You Know? section, Leptoconops torrens on November 21, and Syritta pipiens on November 22.

Scholars & Scientists Program

WikiConference North America

Will presents about Wikidata during WikiConference North America.

November saw many Wiki Education staff attend WikiConference North America in Boston. Wikidata Program Manager Will Kent and Scholars & Scientists Program Manager Ryan McGrady joined Helaine to talk about how Wiki Education’s programs engage subject-matter experts in different ways. The question of how to encourage people with advanced knowledge to edit Wikipedia is one that comes up regularly at Wikipedia-related conferences, and one that Ryan and other Wiki Education staff have even presented on there in the past. What excited us this time was to present about programs that are actually doing that. Ryan explained the idea behind the Scholars & Scientists program, how it overcomes several of the well-documented obstacles to subject-matter expert engagement, the model of support we use, and some of the successes we have seen in the nearly two years the program has been running. Will addressed ways in which seeking subject specialists differs between Wikidata and Wikipedia. He explained why expertise is helpful in standardizing Wikidata’s ontology, seeking consistency across items, and identifying gaps on Wikidata that non-specialists may miss. Click through the above link to see a description, notes, and a link to the recording of this session.

Will also presented about building a Wikidata curriculum. This session framed Will’s curriculum development process around the new Wikidata courses. Specifically, the session addressed some of the challenges around developing a Wikidata curriculum, including what to include or exclude, how to make it approachable, how to make it relatable, how to measure understanding, and how to construct assignments that make sense for the participants. This session also generated a dynamic question and answer session, allowing participants to share their thoughts on curriculum development. The intention of the session was to encourage others to pursue creating their own version of a Wikidata curriculum. Judging from the question and answer session, many hurdles remain, but there are many manageable ways of training others to use Wikidata. The above link also contains a description, notes, and a link to the session recording.

Director of Partnerships Jami Mathewson joined Judy Davidson, Sara Marks, and June Lemen from the University of Massachusetts Lowell to talk about their Women in Red at UMass Lowell initiative, fostering digital literacy among students and faculty while also improving Wikipedia’s coverage of notable women from the region. Judy, Sara, and June have coordinated and participated in a Wiki Scholars course with us that is ongoing and shared their experience in that effort. See below for highlights from that course.

Wikidata

Our two Wikidata courses continued through the month of November, one intermediate and one beginner. We have a record number of participants taking these courses. These two courses have only a week left and will wrap up at the beginning of December. So far everyone is off to a great start, asking some excellent questions, participating in sessions, and editing Wikidata. Here are some details:

  • Beginner: This course has 18 editors, who have made more than 100 edits. They are also excelling at adding references to statements. This course has a healthy mix of participants, some with linked data experience and some without as much Wikidata experience. Nevertheless, conversations have gravitated toward some of the most specific details of Wikidata: property usage, data consistency, how to learn about using tools, and specific query requests.
  • Intermediate: There are 11 editors who have already made more than 100 edits across 100 items. These editors are also adding references to statements. As an intermediate course, several of these course participants have experience with linked data. They also have some ideas for projects that would involve integrating Wikidata into their workflows at their respective institutions. Conversations in this course revolve around these project ideas, answering questions about Wikidata specifics.

As with other Wikidata courses, participants have a diverse set of needs and expectations of Wikidata. Both courses are already exploring several use cases that speak to library and museum needs. We have had conversations about using Wikidata to connect identifiers, propose properties, interact with the Wikidata community, and pull specific data sets from Wikidata through queries.

One conversation revolved around the tool, genetic tree, which reveals taxonomic relationships between items. This helped the course participant further their understanding of how queries work and how they might model data from their collection.

Wikipedia

We had three Wikipedia Scholars & Scientists courses active this month. First, we were excited to launch a new course, Living Knowledge, focused on improving biology-related articles on Wikipedia. At the end of the month, the course was only a couple weeks in, so they are still learning the basics of editing and selecting topics, but with their diverse specializations, we are looking forward to the improvements this group makes to science content on Wikipedia.

The Women in Red at UMass Lowell Wiki Scholars course hit its stride this month, with participants doing most of their article writing before the holiday break. Here are some of the highlights so far:

  • A new article about Georgina Kleege, an American writer and professor at the University of California, Berkeley, who has written important work in disability studies.
  • The article about Lowell State College, a historic precursor to UMass Lowell, has more than doubled in size.
  • The Betsey Guppy Chamberlain article, about the 19th century textile mill worker and writer, also doubled in length.
  • A significant expansion to the article about Sherley Anne Williams (1944–1999), a poet, novelist, professor, vocalist, jazz poet, playwright, and social critic.

Finally, the course we offered with the National Science Policy Network wrapped up this month. This was an exciting course in which participants focused on science topics about which the public needs high-quality information in order to understand the world around them (not to mention policy-makers themselves, who use Wikipedia, too). Here are some of the highlights from the course:

  • The embryo article is a good example of the sort of big scientific topic that can easily be neglected on Wikipedia. Its scope is broad and it really benefits from the work of someone who comes to it with a broad understanding of the subject and the literature. One of the NSPN Wiki Scientists made major improvements to the page, correcting an overemphasis on human embryos (for which there is a separate article), expanding the article, rewriting the lead, and adding references.
  • Water resource policy concerns one of our most fundamental resources and how it can be collected, prepared, used, and disposed of, taking into account human use and the environment. This incredibly important topic has had maintenance tags at the top of the article for several years, indicating it needed a lot of work. After this Wiki Scientists course, it is vastly improved, with new or rewritten/expanded sections, better organization, and more sources. It stands as a good example of both why these courses are so important (the great work subject-matter experts do) and why we need to keep bringing people in to continue the work.
  • A new article about Jean Dickey (1945–2018), a pioneering geodesist (someone who works to measure and understand Earth’s shape, orientation, and gravitational field) and particle physicist who worked at NASA’s Jet Propulsion Laboratory for 37 years.
  • Another new article on the National Alzheimer’s Project Act, which led to the U.S. National Alzheimer’s Plan to increase spending on research, care, and public engagement regarding the disease.
  • Several improvements to agriculture in California, including adding information about the use of water and environmental effects.
  • The Nuclear Posture Review concerns the role of nuclear weapons in U.S. security strategy. Until this course, it only included coverage of the reviews conducted in 2002 and 2010. But there were two others, in 1994 and 2018. With the addition of these sections, the article helps readers gain a clearer picture of the development of national policy on one of the most serious topics imaginable.
  • Directly related is the Comprehensive Nuclear-Test-Ban Treaty, for which a Wiki Scientist overhauled and expanded the section on monitoring.
  • Pathogen, a broad subject covering anything that can produce disease. We included this article in last month’s monthly report, too, but as a Wiki Scientist continued to improve it, this 1,000-views-a-day article deserves mention again.

Visiting Scholars Program

Life has been found in all of these environments that exist below the surface of the earth.

The biosphere is the sum of all ecosystems on Earth. Most of the life on Earth that we tend to think of exists above the surface, but the deep biosphere, the part of the biosphere below the first few meters of the surface, accounts for 15% of biomass on the planet, and about 90% of archaea and bacteria in particular. It extends at least 5 km below the surface, and 10.5 km below the sea surface. Organisms that live there exist at temperatures that can exceed 100 degrees C. The way organisms eat and breathe at these depths is quite different from the way we do on the surface, with metabolisms up to a million times slower. The fascinating, extensive, and accessible article was promoted to Good Article this month, thanks to the hard work of Andrew Newell, Visiting Scholar with the Deep Carbon Observatory.

Northeastern University Visiting Scholar Rosie Stephenson-Goodknight added two biographies of impressive women writers, both of whom got started early in life. Emma Huntington Nason (1845–1921) was a poet, author, and composer from New England who started writing when she was only 12. Mary Bassett Clarke (1831–1908) was a writer from New York who published using the pen name Ida Fairfield starting from the age of 15.

Advancement

Fundraising

November was a slow month for fundraising. No new grants were awarded and no new proposals or concept notes were invited or submitted to funders. Some donations came in during the month from individual supporters, especially on #GivingTuesday. In particular, we received a $1,000 donation from Diana Strassmann, a former member and chair of Wiki Education’s Board of Directors, and her husband Jeff.

The majority of our fundraising efforts in November were research- and report-related. We worked on drafting interim reports for our grants from the Michelson 20MM Foundation and from the Moore Foundation, which are due in December and January, respectively. We also continued to identify and qualify potential funders, including migrating our research and interaction notes to the software platform Asana. This software allows us to more easily see where each potential and existing funder is on a cultivation continuum.

Partnerships

This month, Director of Partnerships Jami Mathewson joined faculty at Western Colorado University to promote the use of Wikipedia as a teaching tool. Thanks to Director of Library Services Dustin Fife for bringing us out to campus. During the visit, Jami had the chance to speak with faculty interested in open educational resources (OER) about how students can engage with OER, turning it into an open educational practice. We look forward to bringing more Western Colorado faculty into the Student Program in the coming terms!

Jami attended WikiConference North America with several other Wiki Education staff members. She presented alongside faculty from the University of Massachusetts Lowell about the partnership we built this year to embed Wikipedia expertise on the UMass Lowell campus. We’re excited to share this project with other Wikipedians and university faculty, as we hope to run similar Wiki Scholars courses in partnership with other cohorts of campus faculty.

While in Cambridge for WikiCon North America, Jami had the opportunity to join Thais Morata and John Sadowski at the Harvard School of Public Health and Boston University. We presented to instructors about the work John and Thais are undertaking at NIOSH to improve Wikipedia’s coverage of occupational safety and health, including the great work they do as a part of Wiki Education’s Student Program. We enjoyed working with Diana Ceballos to coordinate these presentations and to showcase the excellent work her students have done on Wikipedia.

At the end of the month, Jami and Customer Success Manager Samantha Weald attended the Women’s Forum Global Meeting to run a 3-day session teaching conference attendees about Wikipedia’s gender gap and how they can help close it. We were invited by Wikipedian Jess Wade, who has been an advocate for adding women’s biographies to Wikipedia over the past few years. We were thrilled to welcome Daria Cybulska of Wikimedia UK, Adelaide Calais and Amélie Cabon of Wikimedia France, and Natacha Rault of Les sans pagEs to help raise visibility of the work we can do to bring more women to Wikipedia.

Communications

As 2019 comes to a close, we’re beginning to reflect on all the great progress that program participants have made on Wikipedia this year. Notably, several of the instructors we support won teaching awards for their Wikipedia assignments. Read more about that in our blog from this month.

We featured a guest blog by one of the instructors we support: Melony Shemberger, who has also taken one of our Wikipedia training courses, provides a framework for teaching Wikipedia writing assignments in the journalism classroom. Read more!

We also featured a story from one of our Wikipedia training courses for Society of Family Planning members. The participant shares their experience improving Wikipedia pages about family planning in Uganda, where they are from. Read more!

For those who are curious about how our Wikipedia training courses come about, check out our case studies with two partners: the Colorado Alliance of Research Libraries and the Society of Family Planning. Both organizations sponsored courses for their members, who produced some great work on Wikipedia.

Technology

In November, we polished up and extended the Dashboard updates to the student user experience that will roll out in late December. In addition to students’ assigned articles and peer reviews, the Dashboard will allow students to register completion of assigned exercises such as the ‘evaluate an article’ task that typically occurs near the beginning of a course.

We also improved how the Dashboard handles the automatic posting of templates to students’ userpages and talk pages, so that when such an edit fails we can still ensure that the template gets posted later.

Our newest pair of Outreachy interns were announced this month as well — with projects that will be co-mentored by two of our interns from the summer! Lalitha Reddy, co-mentored by summer intern Khyati Soneji, will be rewriting the Dashboard’s Campaign page system, laying the groundwork for a better user experience and easier maintenance in the long term. Glory Agatevure, co-mentored by summer intern Ujjwal Agrawal, will carry forward the Dashboard Android app that Ujjwal began earlier this year. This set of internships begins in early December and goes through March.

Finance & Administration

Total expenditures for November were $202K, +$31K over the budget of $171K. The biggest contributor to the difference was timing. Fundraising was over by +$2K, all of it travel budgeted for the prior month. General and Administration was over by +$15K: +$2K in indirect expenses no longer being reallocated, +$8K in meetings, and +$6K in travel (both budgeted for the prior month), offset by underspending of ($2K) in location expenses. Programs were over by +$14K: +$13K in Outside Contract Services and +$5K in vacation accruals due to the implementation of an updated PTO policy, offset by underspending in Indirect Costs ($2K) and Communications ($2K).

Wiki Education Monthly Expenses November 2019

Year-to-date expenses are $895K, ($36K) under the budget of $931K. Fundraising is under by ($3K), of which ($2K) is in Indirect costs and ($1K) is in Travel. General and Administration is over by +$45K: +$62K in Indirect costs not reallocated to Programs and Fundraising, +$6K in Travel, and +$2K in Communications, offset by underspending of ($16K) in Professional Services, ($3K) in Location Expenses, and ($6K) in general expenses. Programs is under by ($78K), of which ($60K) relates to Indirect costs, ($10K) to Communications, and ($28K) to Travel, offset by overages of +$9K in Payroll Expenses and +$13K in Outside Services.

Wiki Education YTD Expenses November 2019

Office of the ED

Current priorities:

  • High-level analysis of Wiki Education’s work in the area of institutional fundraising
  • Finalizing the audit for fiscal year 2018–19
  • Supporting the board’s recruitment of new board members

In November, Frank attended WikiConference North America in Boston alongside a number of staff members. The main conference took place at the Massachusetts Institute of Technology’s Stata Center on the MIT campus. Organized by a team of volunteer Wikimedians from all over North America, this year’s conference focused on credibility and reliability in news and media. The four-day event proved to be an excellent opportunity for Wiki Education’s staff to present some results of their own work, to learn about the work of others, to engage in face-to-face conversations with members of the community, and also to meet with local funding prospects and partners.

In October, Frank had initiated a high-level conversation about the future of Wiki Education’s institutional fundraising. This work continued in November with TJ, LiAnna and Frank working on a “fundraising outlook” document. Although we’ve been very successful with creating an additional revenue stream through earned income, institutional fundraising will continue playing a very important role for Wiki Education in the years to come. Given the significance of this part of our revenue model, Frank felt like it was time to perform a high-level analysis that will guide our institutional fundraising work going forward. Over the course of several small group meetings, TJ, LiAnna and Frank discussed Wiki Education’s opportunities and challenges with regard to institutional fundraising, as well as solutions that will improve our fundraising outlook going forward. The results will serve as a basis of discussion with the board in its upcoming in-person meeting in early 2020.

Also in November, Frank talked to Susan Malone from our external auditing firm Hood & Strong and answered all her questions about fiscal year 2018–19 as part of our yearly financial audit process. SFBay Financials, supported by Ozge and Andres, provided extensive documentation as requested by the auditors. We’re currently well ahead of last year’s schedule and expect the audit to be finished before the end of the year.

***

Monthly Report, October 2019

18:10, Wednesday, 22 January 2020 UTC

Highlights

  • This month we launched two new Wikidata courses and we’re extremely pleased to be working with 28 new participants. They represent a host of schools, museums, and organizations such as Carnegie Hall, the Smithsonian Libraries, the Barack Obama Foundation, the Texas Archive of the Moving Image, the Detroit Institute of Arts, the National Museum of American History, the Yale University Art Gallery, and the Center for Research Libraries (CRL). This group of participants has some ambitious plans for learning about Wikidata. We are looking forward to the impact they will have on Wikidata and the potential they will have to share the collections they represent with the world.
  • We attended the Society of Family Planning’s (SFP) annual meeting in Los Angeles to highlight the great work SFP’s 32 Wiki Scholars did over the course of 2 Wiki Scholars courses. We met with 10 of the SFP members who participated in the courses, and we were excited to hear how meaningful these medical practitioners found their experience of learning how to add to Wikipedia.
  • We received grant payments from the WITH Foundation and from the Michelson 20MM Foundation. We look forward to engaging in the work supported by these grants.

Programs

Wikipedia Student Program

Status of the Wikipedia Student Program for Fall 2019 in numbers, as of October 31:

  • 383 Wiki Education-supported courses were in progress (221, or 58%, were led by returning instructors)
  • 6,771 student editors were enrolled
  • 61% of students were up-to-date with their assigned training modules
  • Students edited 2,640 articles, created 149 new entries, and added 1.22 million words and 13,000 references

October is the month when students begin to take a deep dive into their Wikipedia assignment. After some introductory edits, they begin to draft their larger contributions and become immersed in Wikipedia culture. This October was no exception.

Wikipedia Student Program Manager Helaine Blumenthal finished onboarding for the Fall 2019 term and was busy getting the program ready for Spring 2020. This means updating materials and records and generally setting things up for the quick turnaround between Fall and Spring. In our ongoing efforts to get the best feedback from instructors, she revised the Fall 2019 instructor survey to reflect the program’s changing needs and goals.

Student work highlights:

This cyanotype, which was created by coating a light-sensitive paper with specific chemicals and then exposing it to sunlight or UV light to produce this specific emulsion result, is an example of monochrome photography and was uploaded by an Arizona State University student in Kristy Roschke’s Digital Media Literacy class.
Image by hutschinetto, CC BY-SA 2.0 via Wikimedia Commons.
While the upper floors were removed years ago, the exterior of this building was used for the exterior shots of Larry and Balki’s apartment for the first two seasons of Perfect Strangers. This photograph was uploaded by an Arizona State University student in Kristy Roschke’s Digital Media Literacy class.
Image by Kreglas413, CC BY-SA 4.0 via Wikimedia Commons.
This photograph of Reno and Smiley, an American musical duo composed of Don Reno (May 17, 1924 – October 16, 1984) and Red Smiley (February 21, 1926 – January 2, 1972), was uploaded by an East Tennessee State University student in Lee Bidgood’s Bluegrass History I class.
Image by Ann Milovsoroff, CC BY-SA 4.0 via Wikimedia Commons.

Each term students working with Wiki Education create new articles or add content to existing ones. In October, students in Jessica Blackwell’s Gender and Social Justice class at the University of Waterloo chose to work together on creating an article for Elizabeth Smith Shortt, one of the first three women to earn a medical degree in Canada. Despite a hostile backlash from the male staff and students at Queen’s University, Ontario, and an expulsion from the same school along with other women medical students, Shortt persevered and completed her studies at a newly established women’s college. After finishing medical school in 1884, she went on to practice medicine in Hamilton, Ontario. She became an active and long-serving member of the National Council of Women of Canada and spearheaded a number of public health and women’s welfare initiatives. Decades after her death in 1949, two collections of her diaries were published. Her diaries for the years 1872–1884, which include her experiences at medical school, were published under the title A Woman with a Purpose (University of Toronto Press, 1980); her diaries while traveling in Europe with her husband in 1911 were published as Travels and Identities: Elizabeth and Adam Shortt in Europe, 1911 (Wilfrid Laurier University Press, 2017).

If you’re a horror fan, you’re likely familiar with the 1992 film Candyman, a cult classic about a supernatural killer with themes of race and social class in inner-city United States. You may also be aware that up-and-coming director Nia DaCosta will be working with Jordan Peele on a new film set in this universe. But are you aware that this will be the second feature film she has directed? Last year DaCosta premiered Little Woods at the Tribeca Film Festival, where it was met with a positive reception from critics and received a nomination for the festival’s Best Narrative Feature Award. DaCosta was born in Brooklyn, New York, and raised in Harlem. She became enraptured with filmmaking after watching Apocalypse Now and movies by directors such as Martin Scorsese, Sidney Lumet, Steven Spielberg, and Francis Ford Coppola. She went on to enroll at New York University’s Tisch School of the Arts, where she would eventually meet Scorsese while working as a TV production assistant. It’s amazing to think that prior to this term DaCosta lacked an article and might have had to wait longer for one, if not for a student in Laura Horak’s Analyzing Cinema, Gender, and Sexuality class at Carleton University.

This was not the only new article created for the class; another student created an article for the Canadian filmmaker, producer, and photographer Zaynê Akyol. Born in Turkey, Akyol predominantly focuses on documentary film and is known for her feature-length documentary film Gulîstan, Land of Roses, which was supported by the National Film Board (NFB) and MitosFilm in Germany. Another student chose to expand the article on Canadian film and television director Trish Dolman, who is most noted for her 2017 documentary film Canada in a Day, for which she won the Canadian Screen Award for Best Direction in a Documentary Program at the 6th Canadian Screen Awards in 2018. She has also won multiple other awards and is the founder of Screen Siren Pictures, a production studio based in Vancouver, British Columbia.

People love goats. They’re brought into yoga classes and edited into Taylor Swift music videos. They climb walls and obstacles parkour-style, but some goats have an unusual quirk: they faint after being startled. A student in Animal Behaviour at Memorial University of Newfoundland drastically improved Wikipedia’s article on fainting goats. The student added more information on physical characteristics of fainting goats, and also added information about how the goats’ fainting relates to a medical condition in humans called congenital myotonia. With daily page views numbering more than 450, the student’s work will quickly reach thousands of people, helping them learn more about this topic.

Students in Joan Strassmann’s Behavioral Ecology class continue working on articles about Diptera, the insect order that includes flies, fruit flies and mosquitoes. Lutzomyia longipalpis is a sandfly from South and Central America which is a vector for Leishmania infantum, a parasitic protist which causes visceral leishmaniasis, the most severe form of the disease. Although the species (or group of closely related species) is an important vector of this disease, there was no Wikipedia article about it until a student editor in the class created one. The larvae of Mallophora ruficauda, a species of robber fly, parasitize the larvae of certain species of scarab beetles. Because the beetle larvae are agricultural pests, M. ruficauda larvae are important biocontrol agents. The adults, on the other hand, prey on bee species, and are an important honeybee predator. Chironomus annularius is a species of non-biting midge which can form the kinds of large swarms commonly associated with midges. Glossina fuscipes is a species of tsetse fly which spreads African trypanosomiasis. These articles were among those that were created by students in the class during October.

Student work appeared on Wikipedia’s main page in the Did You Know? section twice.

Scholars & Scientists Program

Wikidata

New Courses

This month we launched two new Wikidata courses: one beginner and one intermediate. We’re extremely pleased to be working with these 28 new participants. They represent a host of schools, museums, and organizations such as Carnegie Hall, the Smithsonian Libraries, the Barack Obama Foundation, the Texas Archive of the Moving Image, the Detroit Institute of Arts, the National Museum of American History, the Yale University Art Gallery, and the Center for Research Libraries (CRL). This group of participants has some ambitious plans for learning about Wikidata. We are looking forward to the impact they will have on Wikidata and the potential they will have to share the collections they represent with the world.

WikidataCon 2019

October also saw the convening of 250 Wikidata editors, including our Wikidata Program Manager, Will Kent, at the second Wikidata Conference in Berlin, Germany. This biennial event creates a space for linked data enthusiasts, practitioners, and researchers to gather and talk about Wikidata. Session recordings, slides, and notes are available online. This year there was a special emphasis on languages in Wikidata. The theme covered supporting endangered languages, pulling in content from less-represented languages on Wikidata, and some tools to help users achieve this. Wikidata also turned seven during this conference. There was a celebration, complete with cake, well-wishes, and presents (in the form of new tools, collections, and bug fixes!).

A day-long Wikibase workshop followed the conclusion of the conference. Wikibase is the MediaWiki extension that allows users to create, manage, and populate their own Wikidata-like knowledge base.

WikidataCon 2019: Wikidata & languages.
Image by Jan Apel (WMDE), CC BY-SA 4.0 via Wikimedia Commons.

Wikipedia

This month we kicked off a new course in partnership with the University of Massachusetts Lowell focused on improving Wikipedia’s coverage of notable women. The course is part of the university’s Women in Red at UMass Lowell Initiative to teach digital literacy to both faculty and students while increasing the visibility of notable women from the region. Fifteen UMass Lowell faculty joined Scholars & Scientists Program Manager Ryan McGrady and Wikipedia Expert Elysia Webb for the first online meetings early this month. We are excited to help these enthusiastic scholars contribute their knowledge and research to Wikipedia, and to help guide them as they begin teaching with Wikipedia.

We also wrapped up our Wiki Scientists course offered through the New York Academy of Sciences. Nine editors joined us for an 8-week course to understand Wikipedia’s role and potential for communicating science to the public. Participants improved a number of articles. For example, one Wiki Scientist overhauled the article on the nigrostriatal pathway. One of four major dopamine pathways in the brain, it is critical in our ability to move. This is a great example of a complex topic that greatly benefits from contributions by subject-matter experts. Another participant spotted outdated and missing references in the article for immunoglobulin G, the most common type of antibody found in blood circulation. Now that the participant has gone through the article, its references are more reliable and up to date. Finally, articles on big topics like psychologist often struggle on Wikipedia. In this case, one of the article’s problems was that anyone reading the summary in the lead at the top would have a poor overview of what the article actually contains. A Wiki Scientist expanded the lead considerably to better summarize the subject, for example by outlining different kinds of psychologists.

In our course offered with the National Science Policy Network, participants have been hard at work making great contributions to scientific topics relevant to science policy as well as several biographies of women in science. Here are some stand-out examples of articles they have created or improved so far:

  • Pathogen, a broad subject covering anything that can produce disease. The article gets upwards of 1,000 views every day.
  • Mycobacterium bovis, an aerobic bacterium that causes tuberculosis in cattle.
  • Comprehensive Nuclear-Test-Ban Treaty, a multilateral treaty banning all nuclear explosions. It was adopted by the UN General Assembly in 1996, but has not entered into force because there remain eight nations that have not ratified it.
  • Adaora Adimora, the Sarah Graham Kenan Distinguished Professor of Medicine and professor of epidemiology at the University of North Carolina School of Medicine. She researches the transmission of HIV and other sexually transmitted infections among minority populations.
  • Miaki Ishii, seismologist and Professor of Earth and Planetary Sciences at Harvard University.
  • Vera Simons, inventor, artist, and balloonist known for high altitude gas balloon development and exploration.

Adaora Adimora.
Image by Bradley Allf, CC BY 2.0 via Wikimedia Commons.

Visiting Scholars Program

The Old Spanish Trail half dollar was struck in 1935, designed and distributed by coin dealer L.W. Hoffecker. The design, which included a cow’s head on one side and a yucca tree superimposed on a map of the Gulf Coast states on the other, received mixed reviews and was historically inaccurate. The subject of the coin was Cabeza de Vaca’s travels, but the coin features El Paso prominently, although de Vaca never went there. El Paso, it turns out, was Hoffecker’s hometown. When the coins were minted, Hoffecker bought them all, said they were for a museum, then sold them to collectors for profit. This unfortunate story in the history of numismatics is now the subject of a Featured Article on Wikipedia thanks to Gary Greenbaum, Visiting Scholar at George Mason University.

The Old Spanish Trail half dollar, featuring spurious information about Cabeza de Vaca’s travels. (Public domain)

Advancement

Partnerships

There’s a lot of work to be done to create pages for people who identify as women, African-American, Asian, Asian-American, Latinx, indigenous, and LGBTQ+. On Ada Lovelace Day this year, engineers, thinkers, and creators at X set out to help close these representation gaps together. Wiki Education staff joined employees at X to teach them how to add biographies to Wikipedia. Laura Chrobak, an engineering intern, chose to create a biography of conservationist Christa Anderson. “I truly admire her work,” Laura told Wiki Education. “And through the process of writing the Wikipedia page, I became so impressed with her professional accomplishments and history. I read one of her papers detailing climate change mitigation policy and found her argument and research to be quite compelling. Her work on deforestation and climate policy is incredibly relevant and interesting to me.” We’re excited to see commitment from our Bay Area neighbors to make the internet a more inclusive space.

Employees at X learn how to add to Wikipedia.

Wiki Education attended the Society of Family Planning’s (SFP) annual meeting in Los Angeles to highlight the great work SFP’s 32 Wiki Scholars did over the course of 2 Wiki Scholars courses. We met with 10 of the SFP members who participated in the courses, and we were excited to hear how meaningful these medical practitioners found their experience of learning how to add to Wikipedia. At the conference, we had the opportunity to present to more than 70 attendees alongside Grace Ferguson, Anne Davis, Bhavik Kumar, and Colleen Denny. We shared why Wikipedia is such an important platform for informing the public with rigorous science, and they each gave insights into their learning outcomes along the way.

SFP Wiki Scholars Grace Ferguson, Anne Davis, Bhavik Kumar, and Colleen Denny sit on a panel with Jami Mathewson to share their experiences learning how to add medical content to Wikipedia.

Fundraising

In October, we received grant payments from the WITH Foundation and from the Michelson 20MM Foundation. We look forward to engaging in the work supported by these grants. We also learned that our grant proposal to the Institute of Museum and Library Services was declined for eligibility reasons. We submitted our request for an Annual Planning Grant renewal to the Wikimedia Foundation, and had good conversations with our program officer there. We also had a phone conversation with our program officer at the Moore Foundation about a renewal of our current grant, which ends in November. TJ Bliss, Chief Advancement Officer, traveled to Phoenix at the end of October to attend the annual Open Education Conference. While there, he had face-to-face meetings with our program officer at the Michelson 20MM Foundation and with our program officer at the Hewlett Foundation. He also met with Bob Cummings, head of the Fundraising Committee on our Board of Directors.

Internally, we spent a lot of time organizing our fundraising records using Asana and Salesforce. Asana, in particular, has been useful for visualizing the progress we are making with each of our potential and current funders. We focused heavily on identifying and reaching out to potential funders of a Women in Science on Wikipedia Contest, in collaboration with the Smithsonian Institution Archives. Several funders responded positively to these requests, but none invited proposals in October.

Communications

Dr. Alexandra Edwards wrote a guest blog about what happens when students are confronted with a potential audience of 500 million (instead of one). Instructional designer Johnathon Neist shared in another guest blog why he thinks others should get their students involved in the Wikipedia movement. The Wikipedia writing assignment was also featured on the blog of the Human Anatomy and Physiology Society in a piece about the impact that anatomy and physiology students have had on Wikipedia.

External media:

Anatomy & Physiology students share their knowledge with the world through Wikipedia. Cassidy Villeneuve. The HAPS Blog. (October 9)

 

Technology

In October, the Technology team continued work on the set of Dashboard user experience improvements for students that will be rolled out for the Spring 2020 term. The core features — including a progress tracker for each article a student is assigned to edit or peer review, along with improved completion tracking for assigned exercises and on-wiki activities — are complete, and we’ll be scheduling user tests to identify problems and rough edges before the Spring classes get underway.

We also made several performance improvements to handle the ever-larger number of programs, especially on the global Programs & Events Dashboard. We now show pageview stats based on the average rate of views for each article, as it became impractical to keep up exact pageview numbers for every tracked article on a daily basis. We also added pagination to the list of programs in large campaigns, which avoids performance problems that cropped up when browsing the largest campaigns.
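As a rough illustration of the two changes described above, the sketch below estimates an article's total views from an average daily rate instead of an exact daily tally, and slices a long list of programs into pages. The function names, sample numbers, and page size are assumptions made for illustration; they are not taken from the Dashboard codebase.

```python
from typing import List, Sequence


def estimated_views(average_daily_views: float, days_tracked: int) -> int:
    """Estimate total views from an average daily rate instead of exact counts."""
    return round(average_daily_views * days_tracked)


def paginate(items: Sequence[str], page: int, per_page: int = 25) -> List[str]:
    """Return one page of items; pages are numbered from 1."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    start = (page - 1) * per_page
    return list(items[start:start + per_page])


if __name__ == "__main__":
    # Hypothetical article averaging 120.5 views per day over 30 tracked days.
    print(estimated_views(120.5, 30))
    # Hypothetical campaign with 60 programs, shown 25 per page.
    programs = [f"Program {n}" for n in range(1, 61)]
    print(paginate(programs, page=3))
```

The trade-off in a scheme like this is precision for cost: one stored rate per article replaces a daily count, and each request touches only one page of programs rather than the full list.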

Finance & Administration

Overall expenses in October were $181K, ($23K) less than the budgeted plan of $204K. There was an accounting process change to the allocation of indirect operational costs. Due to this change, we will see overspending in General and Administration (as professional service fees will no longer be re-allocated from General and Administration to indirect administrative costs in Programs and Fundraising) and underspending in Programs and Fundraising (for the same reason). An adjustment was made in October to recognize this change for Q1. For the month of October, Governance was in alignment with the plan. General and Administration was over by $32K. Of this, indirect expense allocation was over by +$40K: +$25K for the Q1 adjustment for the allocation change, +$10K for October professional fees not allocated, and +$6K for actual overages in indirect fees. This was partly offset by ($9K) of underspending in direct professional fees, due to the timing of the audit. Programs were under by ($51K): ($40K) relating to the indirect allocation change described above and ($11K) underspent in Travel. Fundraising was under by ($5K): ($4K) from not attending a cultivation event and ($1K) in indirect cost adjustment.

Wiki Education expenses October 2019

Year-to-date expenses are $694K, ($66K) under the budget of $760K. Fundraising is under by ($5K): ($1K) due to the indirect adjustment, +$1K in Travel, and ($4K) for the cultivation event not attended. The Board is on target. General and Administration is over by +$30K: the indirect allocation adjustment added +$41K, while there was underspending in meetings ($14K), professional fees ($15K, due to timing), and location expenses ($2K). Programs are under by ($91K): ($41K) due to the indirect allocation adjustment and ($17K) in other indirect allocations. Programs were also under by ($26K) in Travel and ($7K) in Communications.

Wiki Education Foundation Expenses YTD October 2019

Office of the ED

  • Current priorities:
    • High-level analysis of the fundraising outlook for the next years
    • Embarking on remaining HR projects for this fiscal year

In October, the board’s Finance and Audit Committees conducted a review of our organization’s financial results for Q1 of fiscal year 2019–20. Wiki Education experienced a particularly strong first quarter in fundraising, while our revenue through earned income was slightly lower than projected. Overall, Q1 results have been higher than projected.

Also in October, the board’s HR committee and Frank started a conversation about updating Wiki Education’s salary structure. From day one, our organization has been extremely frugal and careful about what we spend donor money on. At the same time, increases in the cost of living in the San Francisco Bay Area over the past couple of years have made our staff’s lives extremely difficult at times, given that our salary structure is still based on a now-outdated analysis from six years ago. In order to continue attracting the best talent and to maintain our track record of very low turnover, Frank and Ozge will analyze in depth whether the pay for each individual position is still adequate or whether small adjustments have to be made.

At the beginning of October, we released our new vacation policy. The new policy, which had been in the works for a while, takes the realities of a modern work biography into account and provides Wiki Education’s hard-working staff with sufficient time off to recharge their batteries and stay healthy and productive.

Visitors and guests

Michael Schulte, founder of Klexikon, the German online encyclopedia for children aged six to twelve years

 

***

The value of being a Wiki Scientist

18:17, Tuesday, 21 January 2020 UTC
Dilara Kiran, guest blog contributor.
Image by DK.Sci, CC BY-SA 4.0 via Wikimedia Commons.

Within academic circles, Wikipedia is often looked down upon and is not considered a credible source of information. Yet it is one of the most widely visited websites in the world and is often the first link to pop up when you conduct a typical Google search of a topic. With so much scientific information behind paywalls, filled with jargon, or difficult to search for without prior knowledge, it is hard for the non-expert to find credible scientific information. When offered the chance to take the Wiki Scientist course through the National Science Policy Network this Fall, I jumped on the opportunity to contribute to an open access forum of information and learn more about the inner workings of what I would come to learn is a complex community.

While currently a graduate student spending a large portion of my time within a basic science research laboratory, my goals for who I want to be as a scientist are impact-driven and align more with applied sciences. I aim to have a career where I can ask questions and gather data that can lead to actionable changes and policy initiatives that improve scientific systems at all levels, from graduate student training to the dissemination of scientific information to the public. Within my joint degree program, I attend veterinary school and am working toward a PhD studying human tuberculosis. This program uniquely positions me at the interface of animal and human health, which is where I chose to focus my time for article improvement in Wikipedia. I specifically chose to add to the articles for Mycobacterium bovis, a bacterial species that primarily causes tuberculosis in cattle, but is being recognized more and more as a cause for human tuberculosis. I also edited the article for One Health, which is a concept focused on the integration of human, animal, and environmental health sectors in order to work toward solutions to global health problems.

I came into the course open-minded, but also with certain pre-existing notions about what I thought Wikipedia was, its provided service to society, and its reputability. Through each week of the course, I became more acquainted with the technical aspects of how to physically make edits on a Wikipedia page, how to best break up information and structure contributions, and what sources were considered appropriate. However, it was the less tangible aspects, the nuts and bolts behind the scenes that keep the Wikipedia machine alive and well, that intrigued me most and left a lasting impression.

First and foremost, it is entirely true that ANYONE can edit Wikipedia. Your parents, your children, your neighbors and friends could all log onto their computer or phone and make changes. For me, this contributed most to my perception, prior to taking the course, that Wikipedia is not a reliable place to obtain information. If anyone can edit, how is it policed? Wikipedians (members of the community of Wikipedia editors) are an extremely dedicated and active group. Editors can add pages to a Watchlist, which alerts them any time a significant change is made to one of those pages. Those “watchers” are then able to respond in real time, either reverting edits to their previous form in the case of site vandalism, or engaging in a meaningful discussion about the impetus for the change on what is called a Talk page. A Talk page is a companion page for every article and functions as a forum for discussion of content, article structure, and even the potential for merging articles with other ones. Wikipedians care a LOT about making Wikipedia a safe space for robust conversations surrounding article content and ensuring information is in its most accurate form. Those who edit with reckless abandon, or choose to ignore these avenues for civil discussion, can actually be banned from Wikipedia! The user community is so strong that you will begin to recognize prominent editors, and those who do not have a username and just edit with their IP addresses are often not regarded as highly as regular contributors.

Further, Wikipedia is actually very well sourced, and there are many rules in place about what is allowed to be cited in Wikipedia. Most surprising to me was that, for science or medical topics, primary literature articles are not supposed to be cited. The reason for this is that these articles are only one snapshot of a complex scientific question or issue. Especially for medically relevant topics, where readers could use information from the website to make decisions about their health, it is very important that a scientific consensus is represented. Secondary sources, such as review articles published in peer-reviewed journals or information from government organizations like the Centers for Disease Control and Prevention, are examples of sources that do this. However, there are still a significant number of missing citations throughout a wide range of articles. Since Wikipedia has an automatic citation generator, you can easily plug in an article link or DOI number and a well-formatted citation appears. This is a way to add to the robustness of Wikipedia without much of a time commitment.

Given the ease with which anyone can edit Wikipedia, I thought that it would be a relatively simple task to contribute to an article. And, while it is easy to spot statements lacking citations, incorrect grammar, or plainly false information, it is actually quite time consuming to craft meaningful Wikipedia edits. Some articles cover very broad topics, such as One Health, that one must try to summarize appropriately without being redundant with other more specific, related topics. Other articles are very specific and pointed, such as the article for Mycobacterium bovis. The approach to these two types of articles is very different. I can easily see why training more people how to improve Wikipedia is a full-time job for those at Wiki Education. Additionally, you have to be mindful of introducing bias into article content, and it is actually recommended that individuals with very close ties to a topic refrain from making substantial article edits, or at least reveal their connections on the Talk page.

As a perfectionist, I spent a significant amount of time agonizing over the smallest paragraphs. This information was going to be accessible to individuals across the globe, after all, and people would assume it was correct without necessarily verifying it for themselves. One of Wikipedia’s mantras introduced by the instructors helped significantly with this: Be Bold. Never be afraid of putting content out into the void of Wikipedia, because it is better for readers to have your additional contribution as soon as possible, rather than wait for it to be a perfectly crafted statement. After the information is in the main space, other Wikipedians can assist you, provide comments and guidance, or completely undo your efforts. When I did actually get my edits into the main space of Wikipedia, I found that I was often virtually thanked for those edits. That was an extremely gratifying experience, to know that my contributions were valued by others and would help educate individuals looking at the article for years to come. The mantra of Be Bold seems to hold true for many aspects of science and is the beauty of advancing knowledge through scientific inquiry. We don’t always have the complete story or all the answers, but disseminating our knowledge helps others advance their understanding and overall helps society move forward.

I can understand why it might be daunting to consider incorporating Wikipedia edits as a part of your scientific practice. Or, maybe you still don’t see the utility or benefit of making these contributions. But there are MANY articles that still need improvement. There are quality ratings for each article that are publicly accessible, and pages for significant topics, such as “veterinary medicine”, are still considered to be of the poorest quality. Spending a few hours in an afternoon to craft a paragraph for a page, or even ten minutes finding a citation, has a lasting and important impact. I firmly believe it is our duty as scientists, particularly scientists funded via government agencies and taxpayer dollars, to communicate our science in a way that everyone can understand, to acknowledge the myriad ways that our science impacts our communities, and to ensure our scientific expertise is openly accessible. Your small contributions impact information for ALL. There are many different levels of involvement that I have alluded to: just adding citations where they’re needed, fixing grammar and spelling in articles, contributing section paragraphs, completely revamping an article’s content and structure, or even creating a completely new Wikipedia page from scratch, which someone in our course did! So, while it may take time to make substantive edits, it doesn’t take much time to contribute; you can define what your contributions are.

Taking this course allowed me to learn about my peers and their interests, and to connect with other young scientists passionate about improving scientific communication and contributing to evidence-based science policy. The more you engage with the Wikipedia community, the more connected you become with others around the globe, from all walks of life and with many different perspectives and motivations for their contributions. User pages are almost like social media profiles, with biographies, widgets for specific interests, and places to send other users awards, called barnstars, for their great work. This is just one other way to connect to the broader scientific community, escape the academic ivory tower, and expand your knowledge dissemination outside of paywalls and the confines of an academic journal. Given the ease with which information, and more importantly misinformation, spreads in our digital age, it is even more imperative that open access sources have curated, accurate information presented in ways that everyone can understand. I could envision this type of exercise being incorporated into lab meeting or journal club settings, where you begin a culture of contribution from the top down, with PIs, postdocs, graduate students, and undergrads all working together for an hour each week to improve relevant pages. Scientists at all career levels can benefit from practicing ways to distill information into succinct, jargon-free sentences.

With the end of the semester and holiday season, I took a bit of a hiatus from my editing. However, I am looking forward to starting this New Year with the goal to make contributions, no matter how big or small, each week. I encourage you to explore the opportunities that Wikipedia editing may offer you, or seek out a Wiki Education course yourself. It was an invaluable experience.


Dilara Kiran is in her sixth year of the Combined Degree DVM/PhD program at Colorado State University and recently completed our Wikipedia training course sponsored by the National Science Policy Network. She aspires to use her knowledge of both clinical practice and research to contribute to evidence-based scientific policy and is passionate about science communication. Follow her on Twitter @dvmphd2be.


Interested in taking a course like this? Improve information about disability healthcare on Wikipedia through our upcoming course sponsored by WITH Foundation. Or write Wikipedia biographies for women across disciplines and professions. To see all courses with open registration, visit learn.wikiedu.org.


For inquiries about partnering with Wiki Education, contact Director of Partnerships Jami Mathewson at jami@wikiedu.org or visit partner.wikiedu.org

weeklyOSM 495

14:47, Sunday, 19 January 2020 UTC

07/01/2020-13/01/2020

Ethermap from Chris Lamby 1 | © Leaflet | © map data OpenStreetMap contributors

Mapping

  • Andrew Wiseman, from Apple, announced a new MapRoulette challenge for issues with roads and routing in Haiti, which have been identified by the Atlas data analysis tool.
  • Tesla’s plan to build Gigafactory 4 in Germany (Grünheide, Brandenburg) has generated strong interest in the OSM community. When mapping the changes during construction, users are cautioned not to use outdated or copyrighted plans. Some edits of this kind have already had to be reverted. User Polarbear started a Wiki page (de) (automatic translation) to collect information usable in OSM.
  • Starting last week, many roads and places in Iran are being named in memory of Qasem Soleimani, the popular Iranian general. This will be reflected on OSM shortly. OSM user iriman posted (automatic translation) a diary entry on how he added name:etymology=* and name:etymology:wikidata=* tags to these places.
  • OSM user ‘the_node_less_traveled’ reported on the MapWithAI plugin for JOSM, which allows JOSM users to work with the features detected by Facebook’s recently introduced MapWithAI tool. The plugin is intended for users who have an unstable internet connection or who don’t want to use the iD editor derivative RapiD.

Community

  • Mikko Tamura announced a mapathon for LGBT spaces and HIV facilities, which will take place in Cebu City, Philippines on 14 March 2020, and specified the sources and the tagging schema in a follow-up mail on OSM’s diversity mailing list. The project got a write-up by Gmanetwork.
  • Jorge Sanz reported (es) (automatic translation), in his blog, on the progress made in 2019 by the OSM Spain community’s project #1calle1nombre (automatic translation). This project seeks to fill in the name tag for every street in the country.

OpenStreetMap Foundation

  • Dorothea announced that draft minutes from the OSMF board meeting on 23 December 2019 are now available.

Events

  • The 2020 FOSSGIS conference in Freiburg-im-Breisgau has now published the programme (de) and opened ticket sales.
  • The programme for the FOSDEM Geospatial dev room is also available. The conference takes place in Brussels on 1 and 2 February 2020, with geo sessions on the Sunday.
  • The first Maptime Salzburg this year, to be held on 23 January, will focus on digitising the outlines of buildings using satellite images for use by organisations such as Doctors without Borders.

Humanitarian OSM

  • Joost Schouppe announced the second draft of the procedure for the planned OSMF microgrants and asks for community feedback.
  • Azavea, a company offering geospatial technology services, carried out a global calculation of the UN Rural Access Indicator. The indicator (detailed description) is a proxy for a population’s access to the road network. It was introduced by the World Bank in 2006 and later adopted by the UN. The company calculates the indicator based on data from OSM for the road network, WorldPop for population data, and the Global Urban-Rural Mapping Project for urban/rural classification.
  • VentureBeat reported on the efforts of Intel, the Red Cross and other parties to use AI to identify unmapped bridges and roads from satellite imagery. It also reported on the existing, technology-based mapping efforts of Facebook and Microsoft. It is not clear from the article (or the associated Intel press release) whether data so generated can be used in OpenStreetMap.
  • The website Mail & Guardian reported on the efforts to map Makoko, a floating slum in the Nigerian city of Lagos, to give it a cartographic identity and thereby force the local authority to include the suburb in its development planning rather than deny its existence.
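For readers curious how an indicator like the Rural Access Indicator mentioned above can be computed, here is a minimal sketch. It assumes the World Bank definition (the share of the rural population living within 2 km of an all-season road) and a pre-computed distance for each population cell. The class, field, and function names are hypothetical, the sample figures are invented, and nothing here is taken from Azavea's actual pipeline.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PopulationCell:
    """One grid cell of a population raster (hypothetical structure)."""
    population: float
    is_rural: bool
    km_to_nearest_road: float


def rural_access_indicator(cells: Iterable[PopulationCell],
                           max_km: float = 2.0) -> float:
    """Share of rural population living within max_km of a road."""
    rural = [c for c in cells if c.is_rural]
    total = sum(c.population for c in rural)
    if total == 0:
        raise ValueError("no rural population in input")
    served = sum(c.population for c in rural
                 if c.km_to_nearest_road <= max_km)
    return served / total


if __name__ == "__main__":
    sample = [
        PopulationCell(100, True, 0.5),    # rural, close to a road
        PopulationCell(300, True, 5.0),    # rural, far from a road
        PopulationCell(1000, False, 0.1),  # urban, excluded from the ratio
    ]
    print(rural_access_indicator(sample))
```

In a real calculation the three inputs would come from separate datasets, as the item above describes: the road network, the population grid, and the urban/rural classification.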

Maps

  • Jason Le Vaillant hopes to have produced the most detailed and useful topographic map of New Zealand.
  • Mapillary is hiring for a range of positions including designers and marketers.

Software

  • Robert Whittaker announced an additional mapping QA tool for the UK: construction sites which have not been edited in over a year. In many cases work will have been completed, or the site will at least merit an additional survey.

Releases

  • Heidelberg University’s GIScience Research Group announced the release of version 0.6 of openrouteservice. Although there are some new features such as alternative routes and round trip routing, the majority of the changes were in the backend code.
  • After several weeks “offline”, the OSM Software Watchlist, from wambacher, is available again with all the current releases.

Did you know …

  • [1] … Ethermap from Chris Lamby, who tried OSM once 😉 ? Like Etherpad or Ethercalc, Ethermap is a tool for distributed working, here to create a simple collaborative map. The source code is available on GitLab.
  • … the current tasks on maproulette.org? There you will surely find something in your area.
  • … Bushfire.io, a map which consolidates information on Australian bushfires from multiple sources?
  • … the site latlong.net, which lets you find the geographic coordinates of a position easily?
  • … that Habr now has Russian translations (ru) of weeklyOSM?
  • … that OSM-US now produces a monthly newsletter? The January issue is out now.

Other “geo” things

  • Quoctrung Bui and Emily Badger write, in their New York Times column “The Upshot”, about the expansion of urban areas in the US over the last 10 years. For the article they asked Descartes Labs to identify these changes using machine learning.
  • Marc Prioleau tweets about an article in Ars Technica about some of the extraordinary technology used to provide driving aids before the advent of GPS.
  • There’s now a GTFS feed for the San Francisco Bay Area. An observant reader on Hacker News noticed that the ACE train is missing. The route is missing because they don’t currently have geometries for the route alignment, so a suggestion was made to “Send an intern with a logging GPS”. It’ll never work … Joking apart, Interline, a mobility data and service company that provides the GTFS feed, has a number of OSM extracts available (they’ve continued where Mapzen left off), and in the Bay Area the Valley Transportation Authority has worked closely with OSM in a number of areas, including sidewalk mapping.
  • It appears that relatively few American voters can locate Iran on a map.
  • The German National Geographic site, part of Fox News, published (de) (automatic translation) a selection of vintage maps from the 130 years of the society and its magazine.
  • Geospatial World blogged about the mapping industry, which, according to the author of the blog article, Sarah Hisham, is maturing from mapping and navigation to analytics and intelligence. The development is driven by the increasing importance of location data for businesses. Companies like SAP and IBM are integrating location data and analytics into their business intelligence solutions.
  • NASA reported on a study which says that near-real-time satellite data could cut costs and save time for emergency responders in a disaster. The overall concept also includes the use of openrouteservice, which is based on OSM data.

Upcoming Events

Where | What | When | Country
Dortmund | Mappertreffen | 2020-01-17 | Germany
Cologne Bonn Airport | 125. Bonner OSM-Stammtisch | 2020-01-21 | Germany
Lüneburg | Lüneburger Mappertreffen | 2020-01-21 | Germany
Nottingham | Nottingham pub meetup | 2020-01-22 | United Kingdom
Bratislava | Missing Maps Mapathon Bratislava #8 | 2020-01-23 | Slovakia
Lübeck | Lübecker Mappertreffen | 2020-01-23 | Germany
Salzburg | Maptime Salzburg – Mapathon | 2020-01-23 | Austria
Ivrea | Incontro mensile | 2020-01-25 | Italy
Rome | Incontro mensile Roma | 2020-01-27 | Italy
Zurich | Missing Maps Mapathon Zürich | 2020-01-29 | Switzerland
Budapest | Budapest gathering | 2020-02-03 | Hungary
London | Missing Maps London | 2020-02-04 | United Kingdom
Stuttgart | Stuttgarter Stammtisch | 2020-02-05 | Germany
Dortmund | Mappertreffen | 2020-02-07 | Germany
Turin | FOSS4G-it/OSMit 2020 | 2020-02-18 to 2020-02-22 | Italy
Riga | State of the Map Baltics | 2020-03-06 | Latvia
Freiburg | FOSSGIS-Konferenz | 2020-03-11 to 2020-03-14 | Germany
Chemnitz | Chemnitzer Linux-Tage | 2020-03-14 to 2020-03-15 | Germany
Valcea | EuYoutH OSM Meeting | 2020-04-27 to 2020-05-01 | Romania
A Guarda | EuYoutH OSM Meeting | 2020-06-24 to 2020-06-28 | Spain
Cape Town | State of the Map 2020 | 2020-07-03 to 2020-07-05 | South Africa

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Elizabete, NunoMASAzevedo, Polyglot, Robot8A, Rogehm, SK53, SunCobalt, TheSwavu, YoViajo, derFred.

How to share an oral history collection more widely

22:34, Friday, 17 January 2020 UTC

Jake Kubrin is Metadata Librarian at the Stanford Law School Robert Crown Library and recently completed our intermediate-level Wikidata training course. Here he shares what linked open data makes possible for his work.

Courtesy of Jake Kubrin.

The American Bar Association (ABA) began conducting a series of oral history interviews of leading women lawyers, judges, and legal professionals in 2005, known as the Women Trailblazers in the Law Project. These oral histories provide unique and impactful perspectives on gender inequality in the legal field. In 2016, Robert Crown Law Library at Stanford Law School agreed to digitize, host, and make these oral histories available to the general public (see here: https://abawtp.law.stanford.edu/).

I chose to work with this collection in Wikidata given its size and scale, the previous metadata work completed, and the scope and nature of its content. The collection consists of approximately 100 different interviewees. Given the time and effort needed to manually edit and create records for the 100 interviewees, OpenRefine was used to reconcile identities, load, and remediate records in batch. Having done a few OpenRefine tutorials and gained some experience with the facet functionality, I was excited about the opportunity to learn additional OpenRefine tools and make the most of the tools I learned.

Because I had created the original MARC records for the collection, a workable record structure was already established, and I replicated it for the work done in Wikidata. Although no WikiProject for oral history collections currently exists, other cataloging best practices pointed me towards this approach. Wikidata items were created for each individual interviewee, and they included statements linking to a collection-level record and to the permanent URL where users can access the interviewee’s complete oral history collection.
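The record structure described above can be sketched as one entry per interviewee, each carrying a link to a collection-level record and the permanent URL of the oral history. This is a minimal sketch under stated assumptions, not the workflow actually used: the class, the function, the sample name, the example.org URL, and the plain-language statement labels are all hypothetical, and no real Wikidata property or item IDs are implied.

```python
from dataclasses import dataclass
from typing import List, Tuple

Statement = Tuple[str, str, str]  # (subject, relationship, value)


@dataclass(frozen=True)
class Interviewee:
    """One interviewee in the collection (hypothetical structure)."""
    name: str
    permanent_url: str


def statements_for(person: Interviewee, collection_label: str) -> List[Statement]:
    """Build the two statements described above for a single interviewee."""
    return [
        # Link from the person to the collection-level record.
        (person.name, "linked to collection-level record", collection_label),
        # Permanent URL where the full oral history can be accessed.
        (person.name, "oral history available at", person.permanent_url),
    ]


if __name__ == "__main__":
    # Invented sample data for illustration only.
    sample = Interviewee("Jane Doe", "https://example.org/oral-history/jane-doe")
    for row in statements_for(sample, "Women Trailblazers in the Law Project"):
        print(row)
```

Rows in this shape map naturally onto a spreadsheet with one interviewee per row, which is the kind of tabular input a batch tool such as OpenRefine works with.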

The scope and nature of the collection’s content warranted Wikidata entries as well. Given that many of these women held public office and were high-profile attorneys, a lengthy public record existed for all interviewees, and there was little question about notability. As a matter of fact, several already had lengthy and well-referenced Wikipedia articles. The intent of the ABA in collecting these interviews was to share the challenges faced by women in the legal field, celebrate and recognize their successes, and speak truth to discrimination in the legal field today. Using Wikidata to enhance and build its structured knowledge base allows the collection to be more widely utilized and valued, and to fulfill the project’s goals.

Reviewing the work done for delivering this collection, I took away several helpful new understandings of Wikidata’s role in spearheading new kinds of work and developing linked open data. When creating a data model for new records for the interviewees, much consideration was given to making sure that properties could be readily captured from the oral history collection’s online site and were useful for accessibility. Tools that could parse a transcript for keywords may have been able to supply helpful metadata, but they are still in production. Instead, the keywords used as tags on the collection’s online site fulfilled both criteria. The next step in enhancing the records will be to include a Library of Congress Name Authority File identifier, since this was already captured during the initial work creating the MARC records. As I continue to get further exposure to Wikidata, I aim to develop showcase records and help to support others in their work developing Wikidata oral history collection records.

Finally, working on this project provided a way to support the collective and collaborative effort in building open linked data. Using Wikidata to push this collection to a structured knowledge base allows users to reach the oral history collection more easily. With the goal of sharing this collection as much as possible, taking advantage of the resources and tools to develop Wikidata items will continue to be a highly important part of delivering collections for our Library.


Registration for our upcoming Wikidata courses is open! New to linked data? Join the open data movement in our beginner’s course. Have more experience with linked data or Wikidata? Sign up for our intermediate course that focuses on applications. Or visit data.wikiedu.org for more information.

14 January 2020 security incident on Phabricator

22:20, Thursday, 16 January 2020 UTC

On 14 January 2020, staff at the Wikimedia Foundation discovered that a data file exported from the Wikimedia Phabricator installation, our engineering task and ticket tracking system, had been made publicly available. The file was leaked accidentally; there was no intrusion. We have no evidence that it was ever viewed or accessed. The Foundation's Security team immediately began investigating the incident and removing the related files. The data dump included limited non-public information such as private tickets, login access tokens, and the second factor of the two-factor authentication keys for Phabricator accounts. Passwords and full login information for Phabricator were not affected -- that information is stored in another, unaffected system.

The Security team has investigated and assesses that there is no known impact from this incident. However, out of an abundance of caution, we are resetting all Two-Factor Authentication keys for Phabricator and invalidating the exposed login access tokens. Additionally, we continue to encourage people to engage in online security best practices, such as keeping your software updated and resetting your passwords regularly.

The Foundation will continue to investigate this incident and take steps to prevent it from occurring again in the future. In the meantime, Phabricator is online and functioning normally. We regret any inconvenience this may have caused and will provide updates if we learn of any further impact.

Respectfully,

David Sharpe
Senior Information Security Analyst
Wikimedia Foundation

Today, on Wikipedia’s 19th birthday, the Wikimedia Foundation has received reports that access to Wikipedia in Turkey is actively being restored.* This latest development follows a 26 December 2019 ruling by the Constitutional Court of Turkey that the more than two and a half year block imposed by the Turkish government was unconstitutional. Earlier today, the Turkish Constitutional Court made the full text of that ruling available to the public, and shortly after, we received reports that access was restored to Wikipedia.

We are thrilled that the people of Turkey will once again be able to participate in the largest global conversation about the culture and history of Turkey online and continue to make Wikipedia a vibrant source of information about Turkey and the world.

“We are thrilled to be reunited with the people of Turkey,” said Katherine Maher, Executive Director of the Wikimedia Foundation. “At Wikimedia we are committed to protecting everyone’s fundamental right to access information. We are excited to share this important moment with our Turkish contributor community on behalf of knowledge-seekers everywhere.”

We are actively reviewing the full text of the ruling by the Constitutional Court of Turkey. In the meantime, our case before the European Court of Human Rights is still being considered by the Court. We filed a petition in the European Court of Human Rights in spring of last year, and in July, the Court granted our case priority status. We will continue to advocate for strong protections for free expression online in Turkey and around the world.

Wikipedia is a global free knowledge resource written and edited by people around the world. Because of this open editing model, Wikipedia is also a resource everyone can be a part of actively shaping  — adding knowledge about their culture, country, interests, studies, and more through Wikipedia’s articles. Volunteers work together to write articles about many different topics ranging from history, pop culture, science, sports, and more using reliable sources to verify the facts. It is through this collective process of writing, discussion, and debate that Wikipedia becomes more neutral, more comprehensive, and more representative of the world’s knowledge.

More than 85 percent of the articles on Wikipedia are in languages other than English, which includes the Turkish Wikipedia’s more than 335,000 articles, written by Turkish-speaking volunteers for Turkish-speaking people.

In the time that the block was in effect, we heard from students, teachers, professionals and more in Turkey about how the block had impacted their daily lives. For many students, the block had occurred just days before their final exams. On social media, members of the international volunteer Wikipedia editor community and countless individuals shared messages of support with #WeMissTurkey and their desire to once again collaborate with the people of Turkey on Wikipedia.

With the decision today, our editors in Turkey will once again be able to fully participate in sharing and contributing to free knowledge online.

* We have received reports that several internet service providers in Turkey, depending on the location, have restored access to Wikipedia in Turkey, with some still in the process of restoring access. We will keep this statement updated as further access is restored.

As I am adding a large number of African scientists to Wikidata, I find that I have moved into a green field. A green field as far as Wikipedia and Wikidata are concerned.

To learn about how the information about African science evolves in Wikidata, I created Listeria lists that inform about universities by country, fellows/members of academies of science and members of African young science organisations.

What I produce is a scaffolding; basic information that enables. The information that I use from the Royal Society of South Africa for its fellows includes dates, other awards, employers and even dates of death. Slowly but surely more information is being added for these people and consequently you will also find, for instance for Rhodes University, more employees and additional papers (currently only 1385 papers for its 84 scholars are known).

A scholar like Tebello Nyokong, a Rhodes scholar, has 637 papers to her name. She is a world class scientist and has four Wikipedia articles to her name. All kinds of questions may be queried for her co-authors; the gender distribution, the organisations they represent, the nationality of the co-authors.
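As an illustration of the kind of question mentioned above, here is a minimal sketch of how a gender-distribution query for co-authors could be put together. It assumes the usual Wikidata modelling, where papers link to their authors with P50 (author) and people carry P21 (sex or gender). The function names, the query layout and the sample rows are my own assumptions for the example, and the author's Wikidata item id has to be supplied by the reader.

```python
import re

# A Wikidata item id looks like "Q" followed by digits.
QID_PATTERN = re.compile(r"^Q[1-9][0-9]*$")

# SPARQL template; AUTHOR_QID is a placeholder replaced below.
# P50 = author, P21 = sex or gender.
QUERY_TEMPLATE = """
SELECT ?genderLabel (COUNT(DISTINCT ?coauthor) AS ?count) WHERE {
  ?paper wdt:P50 wd:AUTHOR_QID ;
         wdt:P50 ?coauthor .
  FILTER(?coauthor != wd:AUTHOR_QID)
  OPTIONAL { ?coauthor wdt:P21 ?gender . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
GROUP BY ?genderLabel
"""


def build_coauthor_gender_query(author_qid: str) -> str:
    """Return a SPARQL query counting an author's co-authors per gender."""
    if not QID_PATTERN.match(author_qid):
        raise ValueError(f"not a Wikidata item id: {author_qid!r}")
    return QUERY_TEMPLATE.replace("AUTHOR_QID", author_qid).strip()


def tally_bindings(bindings: list) -> dict:
    """Turn SPARQL JSON result rows into a {gender label: count} dict.

    Rows without a gender label are collected under "unknown".
    """
    totals = {}
    for row in bindings:
        label = row.get("genderLabel", {}).get("value", "unknown")
        totals[label] = totals.get(label, 0) + int(row["count"]["value"])
    return totals
```

The query text can be pasted into the Wikidata Query Service, and the `results.bindings` list of its JSON output can be passed to `tally_bindings`. Swapping P21 for another property, such as employer or country of citizenship, would give the other breakdowns in the same way.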

Obviously, African science is not well represented at this time. This is a reflection of how people perceive and value African science... In essence it reflects a bias of regular Wikimedia editors. The regular Wikimedia editors are in the west, they have no reason to consider African science but this is a bias. It is highly likely that it will be hard to get Wikipedia articles accepted for African scientists because of a lack of sources and probably a lack of this perceived Western relevance.

Adding one scientist at a time does not make much of a difference. When scientists are added as part of a SourceMD process, any and all scientists who have a public ORCiD profile are likely to get included in Wikidata. This is why so many African scientists are already known. When a notable scientist is then recognised as a recipient of an award, we may already know about the papers they authored.

The SourceMD process is no longer available. It coincides with a lack of resources at Wikidata so any and all resources used for science papers are now available to something else. Understandable, but the result is that I am no longer motivated to seek ORCiD identifiers and consequently, the process is increasingly broken.
Thanks,
      GerardM

weeklyOSM 494

16:00, Sunday, 12 2020 January UTC

31/12/2019-06/01/2020

lead picture

The Whisk(e)y map by Aromatiker is back online [1] | © Leaflet | © map data OpenStreetMap contributors

About us

  • We are currently looking for people to help with the French language edition of weeklyOSM. If you are willing to help then please contact us.

Mapping

  • Christine Karch, of Geofabrik, tweets about a new service developed by the company: regional instances of taginfo. The urls follow the same pattern as on Geofabrik’s download service.
  • Pascal Neis has updated “Unmapped Places of OpenStreetMap”.
  • Jiri Vlasak presented the Damn-Project (Divide and Map. Now.) as an alternative to the HOT Tasking Manager. In his blog post he explains issues with HOT Tasking Manager and how he proposes solving them.
  • Markus Peloso announced the drafting of a proposal for amenity=give_box to allow the tagging of free sharing points for various types of goods.
  • Hauke Stieler has submitted a proposal for the tagging of duty-free shops (discussion on the tagging mailing list).
  • Andrew Harvey has updated the Australia-specific tagging guidelines on how to tag fire stations in New South Wales. He’d like to add or update the operator data for fire stations and has created a MapRoulette challenge for it.
  • Brian Prangle blogged about the 4th quarter project of OpenStreetMap UK, which aimed for the reduction of FIXME and fixme tags. However, instead of a drop, the number of such tags increased by 1 percent. Brian writes about his findings and why the tags often represent the “social geology” of OSM, rather than mere issues to be resolved.

Community

  • Darafei’s “hate chart” caused the newly elected OSMF board member Allan Mustard to tweet his thoughts about the state of relations between groups within the community. He made an interesting point about diversity in the OSM context by prioritising the diversity between “craft mappers”/”humanitarians”/”corporations”/”passive users”/”social engineers”/”one-time contributors”/”programmers/operators”/”data dumpers”/”local chapters” before the “classic” dividers such as gender, social class, education or ethnicity.
  • OpenCage Geocoder announced the publication of the first interview in their OSM series for 2020. They interviewed OSM_Pontarlier, a 21 year old tech enthusiast from Pontarlier, France, who provided insights into the challenges of mapping small towns such as the lack of other contributors, the help of “global mappers” and his general thoughts about OSM now and in the future.
  • The French government’s open data portal announced the latest (Jan 2020) release of shapefiles of communes. These are based on OSM data. The simplified geometry release normally follows shortly afterwards.
  • You can help to develop an OSM editor without any programming knowledge! You only need to know one language besides English to help with translating StreetComplete or another OSM editor.
  • StreetComplete’s author is now accepting donations via GitHub Sponsors, Liberapay and Patreon.

Imports

  • Daniel tried to restart the import of Microsoft building outlines, which has received some opposition, and addresses the main points in a comprehensive proposal on the Canadian mailing list.

OpenStreetMap Foundation

  • Komяpa has been suggesting for some time that OpenStreetMap’s infrastructure should support change. He’s now suggested a plan to get there (though that “plan” doesn’t seem to include any discussion about what it should achieve, or any documentation, yet).

Maps

  • Giuseppe Sollazzo has coloured maps of street name elements (“Road”, “Street” etc.) in various urban areas in the United Kingdom (and New York as well). Using OSMnx and OpenStreetMap meant that less than ten lines of program code were needed.
  • GIS LOUNGE provides an overview of how to reach your destination by the most pleasant route instead of always taking the fastest route. Tools for pedestrians and cyclists are presented.
  • “Since moving to Ireland, my mapping interest has been mostly a historical one”, writes b-unicycling on her map of “Historically interesting things on OpenStreetMap”. By choosing different layers, you can look at benchmarks, tower houses, manual pumps and ringforts.
  • Marvin Gülker wrote (de) (automatic translation) about the creation of printable maps with the German OpenStreetMap carto style. A followup post (de) (automatic translation) focuses on the rendering of GPS tracks.
  • Aromatiker has put the Whisk(e)y map back online (de) (automatic translation) with 318 distilleries worldwide.

Licences

  • Nuno Caldeira asks Shipyard Games, Strava and last but not least the New York Times, via Twitter, to attribute the maps provided by Mapbox and based on OpenStreetMap data in accordance with OSM’s licence terms. Allan Mustard, the newly elected member of the OSMF board, seems to support this position. Bryan Housel from Mapbox argues that the “i”, which only opens the OSM attribution after a “mouse over”, is sufficient. Chris Hill opposes this position. Igor Brejc, from the company ScalableMaps, refers to the attribution FAQ of the OSMF and says that Mapbox will certainly try to justify this kind of attribution or the lack thereof. Many others have participated in this discussion, including other members of the board of the OSM Foundation, namely Mikel Maron from Mapbox and Guillaume Rischard, who has no professional ties to OSM.
  • Nuno Caldeira and Rob Nickerson noticed Facebook’s release of new countries for RapiD. Facebook made 84 new countries available in their editor, which provides missing features, detected by artificial intelligence on imagery, to the users of RapiD who can then add the features to OSM.

Other “geo” things

  • OSM mappers who still use older Garmin GPS devices, and connoisseurs of software problems, may like to note that the firmware for certain Garmin eTrex devices had a Y2020 problem, with the date wrapping to sometime in May 2000. The Garmin support site provides instructions for updating the firmware, which worked (even on Windows 10) for at least one member of the OSM Weekly team.
  • The market research aggregator Reports Monitor linked to a newly published report, by QY Research, on the global digital map market between 2014 and 2025. As usual with this type of report, it is very expensive.
  • The Swiss mountain village of Brienz/Brinzauls (Graubünden) slides downhill (de) (automatic translation) at a speed of one metre per year due to a geological “smear mass”. To add to their worries, the village also sits under an unstable cliff. The village may need to be moved from its current position as a catastrophe could befall the village if it remains where it is. An extensive geological monitoring system will be installed (de) (automatic translation) to warn the villagers in time. SelfishSeahorse has taken picturesque Mapillary images of this area.

Upcoming Events

Where | What | When | Country
Berlin | 139. Berlin-Brandenburg Stammtisch | 2020-01-09 | Germany
Bochum | Mappertreffen | 2020-01-09 | Germany
Nantes | Rencontre mensuelle | 2020-01-09 | France
Montrouge | Rencontre mensuelle locale des contributeurs de Montrouge et alentours | 2020-01-09 | France
Dresden | Stammtisch Dresden | 2020-01-09 | Germany
Salvador | Mapeia Bahia | 2020-01-11 | Brazil
Toronto | Toronto Mappy Hour | 2020-01-13 | Canada
Munich | Münchner Stammtisch | 2020-01-14 | Germany
Hamburg | Hamburger Mappertreffen | 2020-01-14 | Germany
Zurich | 113. OSM Meetup Zurich | 2020-01-14 | Switzerland
Cologne | Köln Stammtisch | 2020-01-15 | Germany
Ulmer Alb | Stammtisch Ulmer Alb | 2020-01-16 | Germany
Dortmund | Mappertreffen | 2020-01-17 | Germany
Maranhão | Mapeia Maranhão | 2020-01-18 | Brazil
Lüneburg | Lüneburger Mappertreffen | 2020-01-21 | Germany
Nottingham | Nottingham pub meetup | 2020-01-22 | United Kingdom
Bratislava | Missing Maps Mapathon Bratislava #8 | 2020-01-23 | Slovakia
Lübeck | Lübecker Mappertreffen | 2020-01-23 | Germany
Riga | State of the Map Baltics | 2020-03-06 | Latvia
Freiburg | FOSSGIS-Konferenz | 2020-03-11 to 2020-03-14 | Germany
Valcea | EuYoutH OSM Meeting | 2020-04-27 to 2020-05-01 | Romania
Cape Town | State of the Map 2020 | 2020-07-03 to 2020-07-05 | South Africa

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it, where appropriate.

This weeklyOSM was produced by Mateusz Konieczny, Polyglot, Rogehm, SK53, Sammyhawkrad, SomeoneElse, Guillaume Rischard (Stereo), SunCobalt, TheSwavu, YoViajo, derFred.

Semantic MediaWiki 3.1.2 released

10:19, Sunday, 12 2020 January UTC

January 12, 2020

Semantic MediaWiki 3.1.2 (SMW 3.1.2) has been released today as a new version of Semantic MediaWiki.

It is a release providing a bug fix, a convenience enhancement and updated system message translations. Please refer to the help pages on installing or upgrading Semantic MediaWiki to get detailed instructions on how to do this.

This Month in GLAM: December 2019

16:25, Saturday, 11 2020 January UTC

A buggy history

03:06, Saturday, 11 2020 January UTC
—I suppose you are an entomologist?—I said with a note of interrogation.
—Not quite so ambitious as that, sir. I should like to put my eyes on the individual entitled to that name! A society may call itself an Entomological Society, but the man who arrogates such a broad title as that to himself, in the present state of science, is a pretender, sir, a dilettante, an impostor! No man can be truly called an entomologist, sir; the subject is too vast for any single human intelligence to grasp.
The Poet at the Breakfast Table (1872) by Oliver Wendell Holmes, Sr. 
 
A collection of biographies
with surprising gaps (ex. A.D. Imms)
The history of interest in Indian insects has been approached by many writers; there are several bits and pieces available in journals, and various insights are distributed across books. There are numerous ways of looking at how people historically viewed insects. One attempt is a collection of biographies, some of which are uncited verbatim accounts from obituaries (not even within quotation marks), by B.R. Subba Rao, who also provides something of a historical thread connecting the biographies. Keeping Indian expectations in view, Subba Rao and M.A. Husain play to the crowd. Husain was writing in pre-Independence times, when there was a genuine conflict between Indian intellectuals and their colonial masters. They begin with interpretations of mentions of insects in old Indian writings. As can be expected, there are mentions of honey, shellac, bees, ants, and a few nuisance insects in old texts. Husain takes the fact that the term Satpada षट्पद or six-legs existed in the 1st century Amarakosa to suggest that Indians were far ahead of their time because Latreille's Hexapoda, the supposed analogy, was proposed only in 1825. Such histories gloss over the structures on which science is built, and one can only assume that they failed to find the development of such structures in the ancient texts that they examined. The identification of species mentioned in old texts is often based on ambiguous translations, which should leave one wondering what the value of claiming Indian priority in identifying a few insects is. For instance, K.N. Dave translates a verse from the Atharva-veda and suggests an early date for knowledge of shellac. This interpretation looks dubious and, sure enough, Dave has been critiqued by Mahdihassan. The indragopa (Indra's cowherd) is supposedly something that appears after the rains. Sanskrit scholars have identified it variously as the cochineal insect (the species Dactylopius coccus is South American!), the lac insect, a firefly(!) and as Trombidium (red velvet mite) - the last matches the blood red colour mentioned in a text attributed to Susrutha. To be fair, ambiguities resulting from translation are not limited to those that deal with Indian writing. Dikairon (Δικαιρον), supposedly a highly-valued and potent poison from India, was mentioned in the work Indika by Ctesias (398 - 397 BC). One writer said it was the droppings of a bird. Valentine Ball thought it was derived from a scarab beetle. Jeffrey Lockwood claimed that it came from the rove beetles Paederus sp. And finally a Spanish scholar states that all this was a misunderstanding and that Dikairon was not a poison but, believe it or not, a masticated mix of betel leaves, arecanut, and lime! One gets a far more reliable idea of ancient knowledge and traditions from practitioners, forest dwellers, the traditional honey harvesting tribes, and similar people who have been gathering materials such as shellac and beeswax. Unfortunately, many of these traditions and their practitioners are threatened by modern laws, economics, and culture. These practitioners are being driven out of the forests where they live, and their knowledge was hardly ever captured in writing. The writers of the ancient Sanskrit texts were probably associated with temple-towns and other semi-urban clusters, and it seems that the knowledge of forest dwellers was not considered merit-worthy.

A more meaningful overview of entomology may be gained by reading and synthesizing a large number of historical bits, of which there are a growing number. The 1973 book published by the Annual Reviews Inc. should be of some interest. I have appended a selection of sources that I have found useful in adding bits and pieces to form a historic view of entomology in India. It helps, however, to have a broader skeleton on which to attach these bits and minutiae. Here, there are also truly verbose and terminology-filled systems developed by historians of science (for example, see ANT). I prefer an approach that is free of jargon overload and like to look at entomology and its growth along three lines of action: cataloguing, with the main products being collections of artefacts and the assignment of names; communication and vocabulary-building, which are social actions involving groups of interested people who work together, with the products being scholarly societies and journals; and pattern-finding, where hypotheses are made and predictions tested. I like to think that anyone learning entomology also goes through these activities, often in this sequence. With professionalization there appears to be a need for people to step faster and faster into the pattern-finding way, which also means that less time is spent on the other two streams of activity. The fast stepping is often achieved by having comprehensive texts, keys, identification guides and manuals. The skills involved in the production of those works - ways to prepare specimens, observe, illustrate, or describe - are often not captured by the books themselves.

Cataloguing

The cataloguing phase of knowledge gathering, especially of the (larger and more conspicuous) insect species of India grew rapidly thanks to the craze for natural history cabinets of the wealthy (made socially meritorious by the idea that appreciating the works of the Creator was as good as attending church)  in Britain and Europe and their ability to tap into networks of collectors working within the colonial enterprise. The cataloguing phase can be divided into the non-scientific cabinet-of-curiosity style especially followed before Darwin and the more scientific forms. The idea that insects could be preserved by drying and kept for reference by pinning, [See Barnard 2018] the system of binomial names, the idea of designating type specimens that could be inspected by anyone describing new species, the system of priority in assigning names were some of the innovations and cultural rules created to aid cataloguing. These rules were enforced by scholarly societies, their members (which would later lead to such things as codes of nomenclature suggested by rule makers like Strickland, now dealt with by committees that oversee the  ICZN Code) and their journals. It would be wrong to assume that the cataloguing phase is purely historic and no longer needed. It is a phase that is constantly involved in the creation of new knowledge. Labels, catalogues, and referencing whether in science or librarianship are essential for all subsequent work to be discovered and are essential to science based on building on the work of others, climbing the shoulders of giants to see further. Cataloguing was probably what the physicists derided as "stamp-collecting".

Communication and vocabulary building

The other phase involves social activities, the creation of specialist language, groups, and "culture". The methods and tools adopted by specialists also help in producing associations and the identification of boundaries that could spawn new associations. The formation of groups of people based on interests is something that ethnographers and sociologists have examined in the context of science. Textbooks, taxonomic monographs, and major syntheses also help in building community - they make it possible for new entrants to rapidly move on to joining the earlier formed groups of experts. Whereas some of the early learned societies were spawned by people with wealth and leisure, some of the later societies have had other economic forces in their support.

Like species, interest groups too specialize and split to cover more specific niches, such as those that deal with applied areas such as agriculture, medicine, veterinary science and forensics. There can also be interest in behaviour and evolution which, though having applications, often do not find economic support.

Pattern finding
Eleanor Ormerod, an unexpected influence
in the rise of economic entomology in India

The pattern finding phase when reached allows a field to become professional - with paid services offered by practitioners. It is the phase in which science flexes its muscle, specialists gain social status, and are able to make livelihoods out of their interest. Lefroy (1904) cites economic entomology as starting with E.C. Cotes [Cotes' career in entomology was short, after marrying the famous Canadian journalist Sara Duncan in 1889 he too moved to writing] in the Indian Museum in 1888. But he surprisingly does not mention any earlier attempts, and one finds that Edward Balfour, that encyclopaedic-surgeon of Madras collated a list of insect pests in 1887 and drew inspiration from Eleanor Ormerod who hints at the idea of getting government support, noting that it would cost very little given that she herself worked with no remuneration to provide a service for agriculture in England. Her letters were also forwarded to the Secretary of State for India and it is quite possible that Cotes' appointment was a result.

As can be imagined, economics, society, and the way science is supported - royal patronage, family, state, "free markets", crowd-sourcing, or mixes of these - impact the way an individual or a field progresses. Entomology was among the first fields of zoology that managed to gain economic value with the possibility of paid employment. David Lack, who later became an influential ornithologist, was wisely guided by his father to pursue entomology as it was the only field of zoology where jobs existed. Lack however found his apprenticeship (in Germany, 1929!) involving pinning specimens "extremely boring".

Indian reflections on the history of entomology

Kunhikannan died at the rather young age of 47
A rather interesting analysis of Indian science is made by the first native Indian entomologist to work with the official title of "entomologist" in the state of Mysore - K. Kunhikannan. Kunhikannan was deputed to pursue a Ph.D. at Stanford (for some unknown reason many of the pre-Independence Indian entomologists trained in Stanford rather than England - see postscript) through his superior Leslie Coleman. At Stanford, Kunhikannan gave a talk on Science in India. He noted in his 1923 talk :

In the field of natural sciences the Hindus did not make any progress. The classifications of animals and plants are very crude. It seems to me possible that this singular lack of interest in this branch of knowledge was due to the love of animal life. It is difficult for Westerners to realise how deep it is among Indians. The observant traveller will come across people trailing sugar as they walk along streets so that ants may have a supply, and there are priests in certain sects who veil that face while reading sacred books that they may avoid drawing in with their breath and killing any small unwary insects. [Note: Salim Ali expressed a similar view ]
He then examines science sponsored by state institutions, by universities and then by individuals. About the last he writes:
Though I deal with it last it is the first in importance. Under it has to be included all the work done by individuals who are not in Government employment or who being government servants devote their leisure hours to science. A number of missionaries come under this category. They have done considerable work mainly in the natural sciences. There are also medical men who devote their leisure hours to science. The discovery of the transmission of malaria was made not during the course of Government work. These men have not received much encouragement for research or reward for research, but they deserve the highest praise. European officials in other walks of life have made signal contributions to science. The fascinating volumes of E. H. Aitken and Douglas Dewar are the result of observations made in the field of natural history in the course of official duties. Men like these have formed themselves into an association, and a journal is published by the Bombay Natural History Association[sic], in which valuable observations are recorded from time to time. That publication has been running for over a quarter of a century, and its volumes are a mine of interesting information with regard to the natural history of India.
This then is a brief survey of the work done in India. As you will see it is very little, regard being had to the extent of the country and the size of her population. I have tried to explain why Indians' contribution is as yet so little, how education has been defective and how opportunities have been few. Men do not go after scientific research when reward is so little and facilities so few. But there are those who will say that science must be pursued for its own sake. That view is narrow and does not take into account the origin and course of scientific research. Men began to pursue science for the sake of material progress. The Arab alchemists started chemistry in the hope of discovering a method of making gold. So it has been all along and even now in the 20th century the cry is often heard that scientific research is pursued with too little regard for its immediate usefulness to man. The passion for science for its own sake has developed largely as a result of the enormous growth of each of the sciences beyond the grasp of individual minds so that a division between pure and applied science has become necessary. The charge therefore that Indians have failed to pursue science for its own sake is not justified. Science flourishes where the application of its results makes possible the advancement of the individual and the community as a whole. It requires a leisured class free from anxieties of obtaining livelihood or capable of appreciating the value of scientific work. Such a class does not exist in India. The leisured classes in India are not yet educated sufficiently to honour scientific men.
It is interesting that leisure is noted as important for scientific advance. Edward Balfour, mentioned earlier, made a similar comment: that Indians were too close to subsistence to reflect accurately on their environment! (This is apparently in The Vydian and the Hakim, what do they know of medicine? (1875), which unfortunately is not available online.)

Kunhikannan may be among the few Indian scientists who dabbled in cultural history and political theorizing. He wrote two rather interesting books, The West (1927) and A Civilization at Bay (1931, posthumously published), which defended Indian cultural norms while also suggesting areas for reform. While reading these works one has to remind oneself that he was working under and with Europeans and would not have been able to have many conversations on these topics with Indians. An anonymous writer who penned the memoir of his life in his posthumous work notes that he was reserved and had only a small number of people to talk to outside of his professional work.
Entomologists meeting at Pusa in 1919
Third row: C.C. Ghosh (assistant entomologist), Ram Saran ("field man"), Gupta, P.V. Isaac, Y. Ramachandra Rao, Afzal Husain, Ojha, A. Haq
Second row: M. Zaharuddin, C.S. Misra, D. Naoroji, Harchand Singh, G.R. Dutt (Personal Assistant to the Imperial Entomologist), E.S. David (Entomological Assistant, United Provinces), K. Kunhi Kannan, Ramrao S. Kasergode (Assistant Professor of Entomology, Poona), J.L.Khare (lecturer in entomology, Nagpur), T.N. Jhaveri (assistant entomologist, Bombay), V.G.Deshpande, R. Madhavan Pillai (Entomological Assistant, Travancore), Patel, Ahmad Mujtaba (head fieldman), P.C. Sen
First row: Capt. Froilano de Mello, W Robertson-Brown (agricultural officer, NWFP), S. Higginbotham, C.M. Inglis, C.F.C. Beeson, Dr Lewis Henry Gough (entomologist in Egypt), Bainbrigge Fletcher, Bentley, Senior-White, T.V. Rama Krishna Ayyar, C.M. Hutchinson, Andrews, H.L.Dutt


Entomologists meeting at Pusa in 1923
Fifth row (standing) Mukerjee, G.D.Ojha, Bashir, Torabaz Khan, D.P. Singh
Fourth row (standing) M.O.T. Iyengar (a malariologist), R.N. Singh, S. Sultan Ahmad, G.D. Misra, Sharma, Ahmad Mujtaba, Mohammad Shaffi
Third row (standing) Rao Sahib Y Rama Chandra Rao, D Naoroji, G.R.Dutt, Rai Bahadur C.S. Misra, SCJ Bennett (bacteriologist, Muktesar), P.V. Isaac, T.M. Timoney, Harchand Singh, S.K.Sen
Second row (seated) Mr M. Afzal Husain, Major RWG Hingston, Dr C F C Beeson, T. Bainbrigge Fletcher, P.B. Richards, J.T. Edwards, Major J.A. Sinton
First row (seated) Rai Sahib PN Das, B B Bose, Ram Saran, R.V. Pillai, M.B. Menon, V.R. Phadke (veterinary college, Bombay)

Note: As usual, these notes are spin-offs from researching and writing Wikipedia entries, in this case on several pioneering Indian entomologists. It is remarkable that even some people in high offices, such as P.V. Isaac, the last Imperial Entomologist and grandfather of noted writer Arundhati Roy, are largely unknown (except as the near-fictional Pappachi in Roy's The God of Small Things).


References
An index to entomologists who worked in India or described a significant number of species from India - with links to Wikipedia (where possible - the gaps in coverage of entomologists in general are too many)
(woefully incomplete - feel free to let me know of additional candidates)

Carl Linnaeus - Johan Christian Fabricius - Edward Donovan - John Gerard Koenig - John Obadiah Westwood - Frederick William Hope - George Alexander James Rothney - Thomas de Grey Walsingham - Henry John Elwes - Victor Motschulsky - Charles Swinhoe - John William Yerbury - Edward Yerbury Watson - Peter Cameron - Charles George Nurse - H.C. Tytler - Arthur Henry Eyre Mosse - W.H. Evans - Frederic Moore - John Henry Leech - Charles Augustus de Niceville - Thomas Nelson Annandale - R.C. Wroughton - T.R.D. Bell - Francis Buchanan-Hamilton - James Wood-Mason - Frederic Charles Fraser - R.W. Hingston - Auguste Forel - James Davidson - E.H. Aitken - O.C. Ollenbach - Frank Hannyngton - Martin Ephraim Mosley - Hamilton J. Druce - Thomas Vincent Campbell - Gilbert Edward James Nixon - Malcolm Cameron - G.F. Hampson - Martin Jacoby - W.F. Kirby - W.L. Distant - C.T. Bingham - G.J. Arrow - Claude Morley - Malcolm Burr - Samarendra Maulik - Guy Marshall
 
Edward Percy Stebbing - T.B. Fletcher - Edward Ernest Green - E.C. Cotes - Harold Maxwell Lefroy - Frank Milburn Howlett - S.R. Christophers - Leslie C. Coleman - T.V. Ramakrishna Ayyar - Yelsetti Ramachandra Rao - Magadi Puttarudriah - Hem Singh Pruthi - Shyam Sunder Lal Pradhan - James Molesworth Gardner - Vakittur Prabhakar Rao - D.N. Raychoudhary - C.F.W. Muesebeck  - Mithan Lal Roonwal - Ennapada S. Narayanan - M.S. Mani - T.N. Ananthakrishnan - Muhammad Afzal Husain

Not included by Rao: F.H. Gravely - P.V. Isaac - M. Afzal Husain - A.D. Imms - C.F.C. Beeson - C. Brooke Worth - Kumar Krishna - M.O.T. Iyengar - K. Kunhikannan


PS: Thanks to Prof C.A. Viraktamath, I became aware of a new book: Gunathilagaraj, K.; Chitra, N.; Kuttalam, S.; Ramaraju, K. (2018). Dr. T.V. Ramakrishna Ayyar: The Entomologist. Coimbatore: Tamil Nadu Agricultural University. It suggests that TVRA went to Stanford at the suggestion of Kunhikannan.

    Production Excellence: December 2019

    21:38, Friday, 10 2020 January UTC

    How’d we do in striving for operational excellence in November and December? Read on to find out!

    📊 Month in numbers
    • 0 documented incidents in November, 5 incidents in December. [1]
    • 17 new Wikimedia-prod-error reports. [2]
    • 23 Wikimedia-prod-error reports closed. [3]
    • 190 currently open Wikimedia-prod-error reports in total. [4]

    November had zero reported incidents. Prior to this, the last month with no documented incidents was December 2017. To read about past incidents and unresolved actionables, check Incident documentation § 2019.

    Explore Wikimedia incident graphs (interactive)


    📖 Many dots, do not a query make!

    @dcausse investigated a flood of exceptions from SpecialSearch, which reported “Cannot consume query at offset 0 (need to go to 7296)”. This exception served as a safeguard in the parser for search queries. The code path was not meant to be reached. The root cause was narrowed down to the following regex:

    /\G(?<negated>[-!](?=[\w]))?(?<word>(?:\\\\.|[!-](?!")|[^"!\pZ\pC-])+)/u

    This regex looks complex, but it can actually be simplified to:

    /(?:ab|c)+/

    This regex still triggers the problematic behavior in PHP. It fails with a PREG_JIT_STACKLIMIT_ERROR, when given a long string. Below is a reduced test case:

    $ret = preg_match( '/(?:ab|c)+/', str_repeat( 'c', 8192 ) );
    if ( $ret === false ) {
        print( "failed with: " . preg_last_error() );
    }
    • Fails when given 1365 contiguous c on PHP 7.0.
    • Fails with 2731 characters on PHP 7.2, PHP 7.1, and PHP 7.0.13.
    • Fails with 8192 characters on PHP 7.3. (Might be due to php-src@bb2f1a6).

    In the end, the fix we applied was to split the regex into two separate ones, remove the non-capturing group with a quantifier, and loop at the PHP level instead (Gerrit change 546209).

    The lesson learned here is that the code did not properly check the return value of preg_match. This is even more important because the size allowed for the JIT stack changes between PHP versions.

    For future reference, @dcausse concluded: The regex could be optimized to support more chars (~3 times more) by using atomic groups, like so /(?>ab|c)+/. — T236419


    📉 Outstanding reports

    Take a look at the workboard and look for tasks that might need your help. The workboard lists error reports, grouped by the month in which they were first observed.

    https://phabricator.wikimedia.org/tag/wikimedia-production-error/

    Or help someone that’s already started with their patch:

    → Open prod-error tasks with a Patch-For-Review

    Breakdown of recent months (past two weeks not included):

    • March: 3 of 10 reports left (unchanged). ⚠️
    • April: Three reports closed, 6 of 14 left.
    • May: (All clear!)
    • June: Three reports closed, 6 of 11 left.
    • July: One report closed, 12 of 18 left.
    • August: Two reports closed, 4 of 14 left.
    • September: One report closed, with 9 of 12 left.
    • October: Four reports closed, 8 of 12 left.
    • November: 5 new reports survived the month of November.
    • December: 9 new reports survived the month of December.

    🎉 Thanks!

    Thank you to everyone who helped by reporting, investigating, or resolving problems in Wikimedia production.

    Until next time,

    – Timo Tijhof


    Footnotes:

    [1] Incidents. –
    wikitech.wikimedia.org/wiki/Incident_documentation#2019

    [2] Tasks created. –
    phabricator.wikimedia.org/maniphest/query…

    [3] Tasks closed. –
    phabricator.wikimedia.org/maniphest/query…

    [4] Open tasks. –
    phabricator.wikimedia.org/maniphest/query…

    Production Excellence: October 2019

    20:41, Friday, 10 2020 January UTC

    How’d we do in striving for operational excellence last month? Read on to find out!

    📊 Month in numbers
    • 3 documented incidents. [1]
    • 33 new Wikimedia-prod-error reports. [2]
    • 30 Wikimedia-prod-error reports closed. [3]
    • 207 currently open Wikimedia-prod-error reports in total. [4]

    There were three recorded incidents last month, which is slightly below our median of the past two years (Explore this data). To read more about these incidents, their investigations, and pending actionables; check Incident documentation § 2019.


    📖 To Log or not To Log

    MediaWiki uses the PSR-3 compliant Monolog library to send messages to Logstash (via rsyslog and Kafka). These messages are used to automatically detect (by quantity) when the production cluster is in an unstable state, for example due to an increase in application errors when deploying code, or because a backend system is failing. Two distinct issues hampered the storing of these messages this month, and both affected us simultaneously.

    Elasticsearch mapping limit

    The Elasticsearch storage behind Logstash optimises responses to Logstash queries with an index. This index has an upper limit to how many distinct fields (or columns) it can have. When reached, messages with fields not yet in the index are discarded. Our Logstash indexes are sharded by date and source (one for “mediawiki”, one for “syslog”, and one for everything else).

    This meant that an error message was stored only if all of its fields had already been used by other messages stored that day, or if that day’s columns weren’t yet fully taken. As a result, a seemingly random subset of error messages was rejected for a full day. Each day, a given kind of error got a new chance at reserving its columns, so long as it was triggered early enough.
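
    The drop behaviour described above can be illustrated with a toy model. This is a minimal Python sketch, not Elasticsearch or Logstash code: the DailyIndex class, its store method, the message contents, and the tiny limit of 3 fields are all hypothetical and chosen only for illustration.

```python
class DailyIndex:
    """Toy model of one day's index with a cap on distinct field names."""

    def __init__(self, field_limit):
        self.field_limit = field_limit
        self.fields = set()   # field names ("columns") reserved so far today
        self.stored = []      # messages that were accepted

    def store(self, message):
        """Store a message (a dict); return False if it has to be discarded."""
        new_fields = set(message) - self.fields
        if len(self.fields) + len(new_fields) > self.field_limit:
            # Accepting this message would need more columns than are left.
            return False
        self.fields |= new_fields
        self.stored.append(message)
        return True


index = DailyIndex(field_limit=3)  # hypothetical limit, kept tiny on purpose
# Early in the day, the first kinds of message reserve the available columns.
early_debug = index.store({"msg": "cache miss", "cache_key": "k1"})
early_error = index.store({"msg": "fatal", "exception": "RuntimeException"})
# Later messages are kept only if every one of their fields is already known.
late_known = index.store({"msg": "fatal again", "exception": "LogicException"})
late_unknown = index.store({"msg": "timeout", "backend": "db1"})
print(early_debug, early_error, late_known, late_unknown)
```

    Creating a fresh DailyIndex for the next day mirrors how each kind of message gets a new chance at reserving its columns.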

    To unblock deployment automation and monitoring of MediaWiki, an interim solution was devised. The subset of messages from “mediawiki” that deal with application errors now have their own index shard. These error reports follow a consistent structure, and contain no free-form context fields. As such, this index (hopefully) can’t reach its mapping limit or suffer message loss.

    The general index mapping limit was also raised from 1000 to 2000. For now that means we’re not dropping any non-critical/debug messages. More information about the incident is at T234564. The general issue of accommodating debug messages in Logstash long-term is tracked at T180051. Thanks @matmarex, @hashar, and @herron.

    Crash handling

    Wikimedia’s PHP configuration has a “crash handler” that kicks in if everything else fails, for example when the memory limit or execution timeout is reached, or if some crucial part of MediaWiki fails very early on. In that case our crash handler renders a Wikimedia-branded system error page (separate from MediaWiki and its skins). It also increments a counter metric for monitoring purposes, and sends a detailed report to Logstash. In migrating the crash handler from HHVM to PHP7, one part of the puzzle was forgotten: the Logstash configuration that forwards these reports from php-fpm’s syslog channel to the one for mediawiki.

    As such, our deployment automation and several Logstash dashboards were blind to a subset of potential fatal errors for a few days. Regressions during that week were instead found by manually digging through the raw feed of the php-fpm channel. As a temporary measure, Scap was updated to consider the php-fpm channel as well in its automation that decides whether a deployment is “green”.

    We’ve created new Logstash configurations that forward PHP7 crashes in a similar way as we did for HHVM in the past. Bookmarked MW dashboards/queries you have for Logstash now provide a complete picture once again. Thanks @jijiki and @colewhite! – T234283


    📉 Outstanding reports

    Take a look at the workboard and look for tasks that might need your help. The workboard lists error reports, grouped by the month in which they were first observed.

    https://phabricator.wikimedia.org/tag/wikimedia-production-error/

    Or help someone that’s already started with their patch:
    Open prod-error tasks with a Patch-For-Review

    Breakdown of recent months (past two weeks not included):

    • March: 1 report fixed. (3 of 10 reports left).
    • April: 8 of 14 reports left (unchanged). ⚠️
    • May: (All clear!)
    • June: 9 of 11 reports left (unchanged). ⚠️
    • July: 13 of 18 reports left (unchanged).
    • August: 2 reports were fixed! (6 of 14 reports left).
    • September: 2 reports were fixed! (10 of 12 new reports left).
    • October: 12 new reports survived the month of October.

    🎉 Thanks!

    Thank you to everyone else who helped by reporting, investigating, or resolving problems in Wikimedia production!

    Until next time,

    – Timo Tijhof


    🌴“Gotta love crab. In time, too. I couldn't take much more of those coconuts. Coconut milk is a natural laxative. That's something Gilligan never told us.”

    Footnotes:

    [1] Incidents. –
    wikitech.wikimedia.org/wiki/Special:PrefixIndex?prefix=Incident…

    [2] Tasks created. –
    phabricator.wikimedia.org/maniphest/query…

    [3] Tasks closed. –
    phabricator.wikimedia.org/maniphest/query…

    [4] Open tasks. –
    phabricator.wikimedia.org/maniphest/query…

    Many dots, do not a query make

    00:00, Thursday, 09 2020 January UTC

    How a long sequence of dots allowed a regex to reach its internal stack limit.

    Premise

    Wikipedia’s production error logs were reporting an increase in app crashes from the search results page. The internal Logstash error report looked as follows:

    [RuntimeException]
    Cannot consume query at offset 0 (need to go to 7296)
    
    at mediawiki/ext…/CirrusSearch/…: QueryStringRegexParser->nextToken
    at mediawiki/ext…/CirrusSearch/…: QueryStringRegexParser->parse
    at mediawiki/ext…/CirrusSearch/…: SearchQueryBuilder::newFTSearchQueryBuilder
    

    What caused this?


    Background

    Wikipedia’s search experience is provided by the CirrusSearch plugin for MediaWiki. It is internally backed by an Elasticsearch cluster.

    There are a number of custom operators supported in the search field, such as wildcards, excluded words, and things like incategory: and intitle:. These are parsed by the plugin’s middleware and turned into a structured query sent to the Elastic API.

    While each error report had a different URL and search query, I noticed most of them had something in common: the search query consisted mostly of dots. For example:

    https://de.wikipedia.org/w/index.php?search=.................... (3000 dots)
    

    Such an odd query might not need to yield a useful response, but it is important that it not crash the application. Doing so leaves the user stranded with an unhelpful “Internal server error” page. It can also interfere with on-going deployments as raised error levels usually indicate that a recent software update caused a problem.


    Investigation

    David Causse (Search Platform team) led the investigation.

    This RuntimeException was added as a safeguard in the parser for incoming search queries. The check sits toward the end of the parsing code and should never be reached; hitting it indicates that something went wrong earlier. The problem was narrowed down to a failure executing the following regex:

    /\G(?<negated>[-!](?=[\w]))?(?<word>(?:\\\\.|[!-](?!")|[^"!\pZ\pC-])+)/u
    

    This regex looks complex, but it can actually be simplified to:

    /(?:ab|c)+/
    

    This regex still triggers the problematic behavior in PHP. It fails with a PREG_JIT_STACKLIMIT_ERROR, when given a long string. Below is a reduced test case:

    $ret = preg_match('/(?:ab|c)+/', str_repeat('c', 8192));
    if ($ret === false) {
        print("failed with: " . preg_last_error());
    }
    
    • Fails when given 1365 contiguous c on PHP 7.0.
    • Fails with 2731 characters on PHP 7.2, PHP 7.1, and PHP 7.0.13.
    • Fails with 8192 characters on PHP 7.3. (Might be due to php-src@bb2f1a6).

    In the end, the fix we applied was to split the regex into two separate ones, remove the non-capturing group with a quantifier, and loop at the PHP level instead (Patch 546209).

    The lesson learned here is that the code did not properly check the return value of preg_match. This is even more important because the size allowed for the JIT stack changes between PHP versions.

    For future reference, David concluded: The regex could be optimized to support more chars (~3 times more) by using atomic groups, like so /(?>ab|c)+/.
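
    The shape of that fix can be sketched with the reduced pattern. The following is a minimal Python sketch and not the CirrusSearch patch: consume_word is a hypothetical helper, and Python's re module is not subject to PCRE's JIT stack limit, so this only illustrates moving the repetition out of the regex and into an application-level loop.

```python
import re

# One alternative per match attempt; the "+" repetition lives in the loop below.
UNIT = re.compile(r"ab|c")


def consume_word(text, start=0):
    """Return the end offset of the run of 'ab' or 'c' units beginning at start.

    For this pattern the result is the same end offset that (?:ab|c)+ would
    reach, but each regex call only has to match one short unit.
    Returns None when no unit matches at start, so callers must check for it.
    """
    pos = start
    while True:
        match = UNIT.match(text, pos)
        if match is None:
            break
        pos = match.end()
    return pos if pos > start else None


end = consume_word("c" * 8192)
if end is None:
    print("no word at offset 0")
else:
    print("consumed up to offset", end)
```

    Returning None for "nothing matched" keeps the failure case explicit, in the spirit of the lesson about checking the return value of preg_match.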


    This article was inspired by Task #T236419 (October 2019).

    weeklyOSM 493

    11:01, Sunday, 05 2020 January UTC

    24/12/2019-30/12/2019

    BusBoy, YeneGuzo, Trotro, Trufi… many public transport apps, with one origin: Trufi Association 1 | © Trufi Association | Map data © OpenStreetMap contributors

    The whole weeklyOSM team thanks you for your interest, wishes you a Happy New Year (we reported earlier) and joins the wishes of the new OSMF chairman of the board – happy mapping. Do it here!

    Mapping

    • The Osmose team announced the implementation of updates to their QA database with improved, persistent, identifiers as well as better links, false positive support and an improved update process.
    • The German forum discussed (de) (automatic translation) the ID editor’s recommendation of the tag secondary_entrance for building entrances. It turns out that the history of entrance tag values is complex; some, for instance, are influenced by the mapping of Cambridge University done in 2011.

    Community

    • Darafei Komяpa Praliaskouski created an OSM hate chart, with the most “hated” objects by size of contribution. Nice to see that Potlatch has not been forgotten.
    • Frederik Ramm posted an article titled The Diversity Dilemma in which he discusses the possible segregation of OSM contributors into volunteer enthusiasts from wealthy countries and paid local mappers in the “Global South”. He argues that this could cause problems in developing countries in terms of sustainability of contributions to OSM and having others tell you what to map, which is already sometimes even called “colonialism”.

    OpenStreetMap Foundation

    • The newly elected chair of the OSMF board Allan Mustard asked for ideas on how OSM could attract more mappers in under-represented areas, following a proposal from the Diversity Working Group.
    • The minutes of the first OSMF board meeting after the recent election have been made available.

    Software

    • [1] Duitama (Colombia), with a population of around 115,000 inhabitants, has launched its public transport app, called BusBoy, based on Trufi and OSM – of course.
    • The OsmAnd team published a blog post that presents the 2020 New Year resolutions of the now-extended team of seven software engineers and recaps the achievements of the previous year.

    Releases

    • The JOSM team has released the next stable release of their OSM editor (15628), which improves the performance of the “Combine way” and “Parallel ways” map modes. With the expert option gui.start.animation you can now disable the start animation if you feel offended by snowflakes.
    • The historic.place-map has updated the background layer for zoom 8 to 10 for a more suitable appearance of the map. Following the withdrawal of the developer, the project is being run just enough to keep things “ticking over” until a new maintainer is found. If you are interested, just drop a note in any language at the end of this forum thread.
    • Less than two weeks after the release of version 3.10 of its navigation software for iOS, the OsmAnd team followed up with 3.11, which improves the support for languages and GPX trip management, brings a driving style setting for bicycles and fixes some bugs.

    Other “geo” things

    • A tweet (de) (automatic translation) from Eureka!, a German crossword app, points out that some national borders are very obvious on maps of geocaches.
    • In an article about the importance of citizen science and open data for today’s world the website Geospatial World names “Open Street Maps” (sic) as an example of the shift which science and technology have to make towards the people involved.
    • Africapolis, a combined database of 7600 urban settlements throughout Africa, was set up by the OECD and the Sahel and West Africa Club to provide data for urban planners, researchers, and governments, as it is not just Africa’s big cities that continue to grow at a high rate but also small towns or ‘secondary agglomerations’. The project aims to become an open and freely downloadable, standardised geospatial database of geo-localised data, fed by Geographic Information Systems including satellite and aerial images. Navanwita Sachdev explains the background of the project.
    • Forbes featured Geoff Boeing’s paper (which we reported on in July 2018) analysing street network orientation, connectivity, granularity, and entropy in 100 cities around the world, using OpenStreetMap data.

    Upcoming Events

    Where | What | When | Country
    Grenoble | Rencontre mensuelle | 2020-01-06 | France
    London | Missing Maps London | 2020-01-07 | United Kingdom
    Hanover | OSM-Bearbeitung mit JOSM | 2020-01-08 | Germany
    London | Geomob LDN (featuring OSMUK) | 2020-01-08 | United Kingdom
    Stuttgart | Stuttgarter Stammtisch | 2020-01-08 | Germany
    Berlin | 139. Berlin-Brandenburg Stammtisch | 2020-01-09 | Germany
    Bochum | Mappertreffen | 2020-01-09 | Germany
    Nantes | Rencontre mensuelle | 2020-01-09 | France
    Montrouge | Rencontre mensuelle locale des contributeurs de Montrouge et alentours | 2020-01-09 | France
    Dresden | Stammtisch Dresden | 2020-01-09 | Germany
    Salvador | Mapeia Bahia | 2020-01-11 | Brazil
    Toronto | Toronto Mappy Hour | 2020-01-13 | Canada
    Munich | Münchner Stammtisch | 2020-01-14 | Germany
    Hamburg | Hamburger Mappertreffen | 2020-01-14 | Germany
    Cologne | Köln Stammtisch | 2020-01-15 | Germany
    Ulmer Alb | Stammtisch Ulmer Alb | 2020-01-16 | Germany
    Dortmund | Mappertreffen | 2020-01-17 | Germany
    Maranhão | Mapeia Maranhão | 2020-01-18 | Brazil
    Nottingham | Nottingham pub meetup | 2020-01-22 | United Kingdom
    Bratislava | Missing Maps Mapathon Bratislava #8 | 2020-01-23 | Slovakia
    Riga | State of the Map Baltics | 2020-03-06 | Latvia
    Valcea | EuYoutH OSM Meeting | 2020-04-27 to 2020-05-01 | Romania
    Cape Town | State of the Map 2020 | 2020-07-03 to 2020-07-05 | South Africa

    Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

    This weeklyOSM was produced by Elizabete, Polyglot, Rogehm, SK53, SunCobalt, TheSwavu, derFred, geologist, jinalfoflia.

    WikimediaDebug v2 is here!

    05:57, Saturday, 04 2020 January UTC

    WikimediaDebug is a set of tools for debugging and profiling MediaWiki web requests in a production environment. WikimediaDebug can be used through the accompanying browser extension, or from the command-line.

    This post highlights changes we made to WikimediaDebug over the past year, and explains more generally how its capabilities work.

    1. What's new?
    2. Features overview: Staging changes, Debug logging, and Performance profiling.
    3. How does it all work?

    § 1. What's new?

    Redesigned

    I've redesigned the popup using the style and components of the Wikimedia Design Style Guide.

    New design Previous design

    The images above also show improved labels for the various options. For example, "Log" is now known as "Verbose log". The footer links also have clearer labels now, and visually stand out more.

    New footer Previous footer

    This release also brings dark mode support! (brighter icon, slightly muted color palette, and darker tones overall). The color scheme is automatically switched based on device settings.

    Dark mode

    Inline profile

    I've added a new "Inline profile" option. This is a quicker and more light-weight alternative to the "XHGui" profile option. It outputs the captured performance profile directly to your browser (as a hidden comment at the end of the HTML or CSS/JS response).

    Beta Cluster support

    This week, I've set up an XHGui server in the Beta Cluster. With this release, WikimediaDebug has reached feature parity between Beta Cluster and production.

    It recognises whether the current tab is for the Beta Cluster or production, and adapts accordingly.

    • The list of hostnames is omitted to avoid confusion (as there is no debug proxy in Beta).
    • The "Find in Logstash" link points to logstash-beta.wmflabs.org.
    • The "Find in XHGui" link points to performance-beta.wmflabs.org/xhgui/.

    § 2. Features overview

    Staging changes

    The most common use of WikimediaDebug is to verify software changes during deployments (e.g. SWAT). When deploying changes, the Scap deployment tool first syncs to an mw-debug host. The user then toggles on WikimediaDebug and selects the staging host.

    WikimediaDebug is now active and routes browser activity for WMF wikis to the staging host. This bypasses the CDN caching layers and load balancers normally involved with such requests.

    Debug logging

    The MediaWiki software is instrumented with log messages throughout its source code. These indicate how the software behaves, which internal values it observes, and the decisions it makes along the way. In production we dispatch messages that carry the "error" severity to a central store for monitoring purposes.

    When investigating a bug report, developers may try to reproduce the bug in their local environment with a verbose log. With WikimediaDebug, this can be done straight in production.

    The "Verbose log" option configures MediaWiki to dispatch all its log messages, from any channel or severity level. Below is an example where the Watchlist component is used with the verbose log enabled.

    One can then reproduce the bug (on the live site). The verbose log is automatically sent to Logstash, for access via the Kibana viewer at logstash.wikimedia.org (restricted link).

    Aggregate graphs (Kibana) Verbose log (Kibana)

    Performance profiling

    The performance profiler shows where time is spent in a web request. This feature was originally implemented using the XHProf PHP extension (for PHP 5 and HHVM). XHProf is no longer actively developed, or packaged, for PHP 7. As part of the PHP 7 migration this year, we migrated to Tideways which provides similar functionality. (T176370, T206152)

    The Tideways profiler intercepts the internals of the PHP engine, and tracks the duration of every subroutine call in the MediaWiki codebase, and its relation to other subroutines. This structure is known as a call tree, or call graph.
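
    To make the idea of a call tree concrete, here is a minimal Python sketch. It is not how Tideways works (Tideways hooks into the PHP engine itself); the traced decorator and the parse, render, and handle_request functions are hypothetical stand-ins that record which function called which, and how long each call took.

```python
import time
from functools import wraps

call_stack = []  # nodes for the calls currently in progress
roots = []       # top-level calls, each the root of a call tree


def traced(func):
    """Record each call to func as a node under whichever call is active."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        node = {"name": func.__name__, "seconds": 0.0, "children": []}
        parent_list = call_stack[-1]["children"] if call_stack else roots
        parent_list.append(node)
        call_stack.append(node)
        started = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            node["seconds"] = time.perf_counter() - started
            call_stack.pop()
    return wrapper


@traced
def parse():
    return "parsed"


@traced
def render():
    return parse() + " and rendered"


@traced
def handle_request():
    parse()
    return render()


def outline(node, depth=0):
    """Flatten a call tree into indented lines, one per call."""
    lines = ["  " * depth + node["name"]]
    for child in node["children"]:
        lines.extend(outline(child, depth + 1))
    return lines


handle_request()
print("\n".join(outline(roots[0])))
```

    Each node keeps its own duration and its children, which is the information a call-graph viewer needs to show where time went and through which path.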

    The performance profile we capture with Tideways is automatically sent to our XHGui installation at https://performance.wikimedia.org (public). There, the request can be inspected in fine detail. In addition to a full call graph, it also monitors memory usage throughout the web request.

    Most expensive functions (XHGui) Call graph (XHGui)

    § 3. How does it all work?

    Browser extension

    The browser extension is written using the WebExtensions API which Firefox and Chrome implement.

    Add to Firefox   Add to Chrome

    You can find the source code on github.com/wikimedia/WikimediaDebug. To learn more about how WebExtensions work, refer to MDN docs, or Chrome docs.

    HTTP header

    When you activate WikimediaDebug, the browser is given an extra HTTP header. This header is sent along with all web requests relating to WMF's wiki domains, both those for production and those belonging to the Beta Cluster. In other words, any web request for *.wikipedia.org, wikidata.org, *.beta.wmflabs.org, etc.

    The header is called X-Wikimedia-Debug. In the edge traffic layers of Wikimedia, this header is used as a signal to bypass the CDN cache. The request is then forwarded, past the load balancers, directly to the specified mw-debug server.

    Header Format
    X-Wikimedia-Debug: backend=<servername> [ ; log ] [ ; profile ] [ ; forceprofile ] [ ; readonly ]

    mediawiki-config

    This HTTP header is parsed by our MediaWiki configuration (wmf/profiler.php, and wmf/logging.php).

    For example, when profile is set (the XHGui option), profiler.php invokes Tideways to start collecting stack traces with CPU/memory information. It then schedules a shutdown callback in which it gathers this data, connects to the XHGui database, and inserts a new record. The record can then be viewed via performance.wikimedia.org.
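
    As an illustration of the header format, here is a minimal Python sketch that splits a header value into its parts. It is not the code in wmf/profiler.php or wmf/logging.php; the function name parse_debug_header and the example server name are hypothetical.

```python
def parse_debug_header(value):
    """Split an X-Wikimedia-Debug value into a backend name and a set of flags.

    Follows the documented format: "backend=<servername>" followed by
    optional ";"-separated flags such as log, profile, forceprofile, readonly.
    """
    backend = None
    flags = set()
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        key, sep, val = part.partition("=")
        if sep and key.strip() == "backend":
            backend = val.strip()
        else:
            flags.add(part)
    return backend, flags


# "mwdebug.example" is a made-up server name, for illustration only.
backend, flags = parse_debug_header("backend=mwdebug.example; log; profile")
print(backend, sorted(flags))
```

    Keeping the flags in a set makes later checks such as "profile" in flags straightforward, whatever order the options were sent in.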

    The Top Ten Wikipedia Stories of 2019

    21:16, Friday, 03 2020 January UTC

    This blog post marks the tenth consecutive year this website has contemplated the most important events, trends, and phenomena affecting Wikipedia and the wider Wikimedia community over the prior twelve months. Ten years is a long time—slightly more than half of Wikipedia’s own history up to this point.

    The very first installment of this series arrived in late 2010 as an “easy-to-write, easier-to-read listicle” but within a couple of years had become a multi-chapter mini-essay project delivered with a solemnity not unlike the closing of a particularly bitter RfC. A few themes came and went: Gamergate, Wikipediocracy, and the Knowledge Engine. Some persisted: Wikipedia’s gender gap, paid editing investigations, and tensions between the Wikimedia Foundation (WMF) and its community. Others fell away entirely: the once-declining number of editors eventually stabilized and even ticked upward, and once-hostile educators learned to love Wikipedia.

    Eventually, the decade turned: the “good internet” techno-optimism of the aughts and early 10s gave way to the “fake news” hellscape of the Trump era. Wikipedia, to its credit, continued doing just as it always had. Recently, the progressive website Mother Jones declared Wikipedia a “hero of the 2010s” for being “a true project of the commons at a political moment when the very idea of the mutual good is under assault.”

    Indeed, Wikipedia has much to be proud of over the past ten years. No other major website has succeeded as a nonprofit, and no other nonprofit has leveraged its authority quite so effectively in the digital space. Wikipedia is a focal point for both the technology industry and the open access world. Even its controversies usually involve efforts to misappropriate Wikipedia’s reputation for independence and accountability. Wikipedia is something almost everyone can agree on.

    So, how did these themes play out over the past year and decade that was?

    ♦     ♦     ♦

    10. The media’s undying fascination with Wikipedia

    Almost twenty years into Wikipedia’s existence, you’d think that the news media would have finally grown bored of stories about how things work behind the scenes at Wikipedia. If so, you would be wrong.

    This year brought a cavalcade of deep dives into the Wikipedia community, including: “The Dumbest Wikipedia Edit War of the Dumbest Decade” (Gizmodo); “Wikipedia has a Google Translate problem” (The Verge); “Checking the Web on Hunter Biden? A 36-year-old physicist helps decide what you’ll see” (The Washington Post); “Socked Into the Puppet-Hole on Wikipedia” (Wired); “Election Results Mean All Nighters For Politicians, Pundits—And Wikipedia Editors” (Fortune); “Well It Sure Was a Big Year for the ‘Call-out Culture’ Wikipedia Page” (Jezebel); “How Hong Kong’s keyboard warriors have besieged Wikipedia” (Reuters); “Meet the man behind a third of what’s on Wikipedia” (CBS News); and “A Brief History of NRA Employees Editing Wikipedia for Fun and Possibly Profit” (Splinter, RIP). That is a lot of interest in how Wikipedia works, especially considering there are fewer working journalists than ever. Maybe they’re just interested in something on the internet that seems to be working as promised.

    Not surprisingly, the coverage tended to come from technology-focused sites. But politics and culture outlets, from The Washington Post and Slate to the entire archipelago of former Gawker sites, also published multiple Wikipedia-focused pieces. While The Wikipedian’s coverage has slowed considerably in the last few years, it’s encouraging to see that in-depth explorations of the dynamics behind the world’s most popular reference source continue to flourish.

    9. Narrowing Wikipedia’s gender gap

    Oh yes, it’s still here (first appearance on this list: 2011), and it, too, quite literally still makes news. In 2019 the New York Times, The Guardian and Fast Company were among numerous outlets to publish pieces pointing out that Wikipedia’s editor community skews heavily male (as does the site’s collection of biographical entries).

    Remarkably, the reason everyone knows about the disparity is because Wikipedia has made a point of keeping it in the discussion. The Wikimedia Foundation published its first report on the demographics of Wikipedia users in 2010, and by the end of the decade many groups and initiatives existed for the purpose of bringing more women into the fold. Have they had an impact?

    Given follow-up analysis after the first survey, which found a modest improvement a couple years later, it seems plausible that the answer is yes. [Update: It turns out I have mischaracterized the analysis, which was a re-interpretation of the same data. Nevertheless, my optimism remains unchanged.] With every year that passes, a new cohort grows up with Wikipedia—and receives increasing encouragement to participate. But as the saying goes, more research is needed.

    8. Everything is (getting more) connected

    In 2004, Jimmy Wales described Wikipedia’s mission as providing “free access to the sum of all human knowledge”. These days, this quote applies less to Wikipedia itself—which has all kinds of limitations on what it deems worthy of inclusion—and more to Wikidata—which really does want to describe everything in the known universe. 2019 was a big year for the open data knowledge base, particularly in the acceleration of content being made available to it from various institutions—including the Smithsonian, the Metropolitan Museum of Art, and the Cleveland Museum of Art, among others. The trend is likely to continue in 2020 as integration with Wikidata becomes more widely accepted among archives and museums.

    But Wikipedia is not left out: this year the Internet Archive launched an initiative to enable the display of actual pages of books cited as sources. As of November, approximately 130,000 citations had been connected to 50,000 books in multiple languages, with more on the way. The Internet Archive is much less famous than Wikipedia, but it deserves a lot more credit than it gets for preserving and distributing open knowledge. (Last year’s list celebrated another of its projects, to rescue and restore links to millions of Wikipedia citations that had previously succumbed to link rot.)

    It’s interesting to me how for-profit Google and not-for-profits Wikipedia and Internet Archive all describe their mission as in some way about collecting and organizing the world’s information. It always reminds me of the final pages of Don DeLillo’s 1997 novel Underworld:

    There is no space or time out here, or in here, or wherever [this] is. There are only connections. Everything is connected. All human knowledge gathered and linked, hyperlinked, this site leading to that, this fact referenced to that, a keystroke, a mouse-click, a password—world without end, amen.

    This passage predates Google (founded 1998) and Wikipedia (2001), but not the Internet Archive (1996). It seems a stretch to say that DeLillo was inspired by the Internet Archive, but they are certainly carrying that hyperconnected vision forward.

    7. Wikipedia or Wikimedia?

    Everyone knows what Wikipedia is, but very few know what “Wikimedia” means. The word was coined in 2003 to name the new non-profit overseeing Wikipedia and other wiki-based sites which had begun to spin off from it. Hence the Wikimedia Foundation. The problem is this split branding can be confusing, especially when trying to explain Wikipedia and the Wikimedia movement (see? it’s a mouthful) to new audiences.

    In 2019, the debate ramped up as the WMF hired a major branding firm, Wolff Olins, to help decide whether or not it should retire the m-word and simply become the Wikipedia Foundation. Although the rationale is clear enough, the counter-arguments are compelling, too. Wikipedia has long been the most important project of the WMF, but Wikidata very much seems like the future. Is it too late to make this change?

    In May, the WMF published the results of a multi-part survey asking community members and affiliate groups what they thought of the idea. Some participants objected to the WMF’s methodology, claiming the criteria were selectively interpreted to show more support than actually exists. Some also faulted the fait accompli presumption that the change will inevitably be made unless significant opposition is discovered, in part because it does seem kind of like the WMF is actively trying not to find it.

    Nevertheless, the topic is slated for discussion at two conferences in the first half of 2020. No one knows exactly what will happen, but if the change occurs, look to the Wikimania conference in August for a possible announcement.

    6. Wikipedia meddling for face-saving and profit

    Also in May, the outdoor lifestyle company The North Face and its ad agency Leo Burnett announced, proudly and quite inexplicably, that they had manipulated Wikipedia’s images of scenic hiking destinations to include its own clothing with logos fully visible, in order to dominate Google Images search results for said outdoor locations. The response was swift and fierce, and the images were deleted. Both companies seemed blindsided by the blowback from Wikipedia and the press (see: Adweek, PR Week, Fast Company) even though Burger King had come in for criticism for a similar stunt in 2017. (Also covered in that year’s list.) Each put out terse statements of apology, and the world moved on.

    Less noticed but just as interesting, NBC News hired a PR consultant to influence Wikipedia’s treatment of subjects it cared about by engaging in discussions on their behalf on relevant talk pages. (Necessary disclosure: my company, Beutler Ink, provides similar Wikipedia consulting services.) These subjects included former anchor Matt Lauer and president Noah Oppenheim—accused of sexual misconduct and subsequent cover-up, respectively—which made everyone uneasy. As reported by noted secret account discoverer Ashley Feinberg, the consultant was “verbose” and “relentless” and his suggestions were sometimes debatable, but also “allowed within Wikipedia’s guidelines”. The nuance probably contributed to the limited outrage, although the story popped up again when it was included in Ronan Farrow’s book Catch and Kill.

    Oh, and remember Status Labs, formerly known as Wiki-PR? Yeah, they’re still around, and in December the Wall Street Journal nailed them again for undisclosed paid editing, including on behalf of Theranos, the notoriously fraudulent and now-defunct medical startup. Maybe they’ll start following Wikipedia’s rules now? Hahaha, yeah right.

    5. Wikipedia co-founders keep trying for another big score

    The 2017 and 2018 installments of this list included mentions of famous Wikipedia co-founder Jimmy Wales’ post-Wikipedia attempts to become an internet billionaire, most recently via WikiTribune, a news site he first previewed in his 2013 Wikimania keynote. In October, Wales pivoted to WT.Social, a site intended as an ad-free, user-supported social network to compete with the fake news and clickbait of Twitter and Facebook.

    There are reasons to think it could work: Wales’ fame means that WT.Social has got a fair bit of coverage, including pieces from Business Insider and the BBC, and it had more than 400,000 members when I signed up to check it out around New Year’s. The pivot also sort of resembles the one Wales made from Nupedia toward Wikipedia, and that move seemed to work out. But there are reasons to think that this abrupt turn will not: it’s already struggling under the weight of its not-that-explosive growth, its espoused “news focus” will surely limit its appeal, and maybe we actually, you know, like our social networks clickbait-y.

    Elsewhere, long estranged and non-famous Wikipedia co-founder Larry Sanger spent a couple years with Everipedia, an SEO strategy calling itself an encyclopedia that is somehow also a blockchain startup. (Also covered in our 2017 list.) In the honeymoon phase, Sanger promised that Everipedia would “change the world” far more than Wikipedia, but in October of this year, he departed and announced he would be leading a new project called the Encyclosphere: a distributed network of encyclopedias. If it materializes, this would actually be Sanger’s third try at building an encyclopedia to improve on Wikipedia. (Why not just revive Citizendium?)

    Everipedia never made a lot of sense, and neither does Encyclosphere. Each competitor lobbed criticisms at Wikipedia that ranged from valid to puzzling without making a persuasive case for an alternative. The truth is that the quotidian labors of writing, editing, evaluating, arguing, and consensus-building are the real work of creating an encyclopedia, and this is vastly more difficult to realize than starting a new website with a different philosophy about how to store the ones and zeroes.

    Call me crazy, but Wales and Sanger almost sound like they have compatible visions! Perhaps a team-up is in order.

    4. Staff changes at the Wikimedia Foundation 

    The WMF had a turbulent middle of the decade. In 2014, this list was bookended by items about the hiring of then-executive director Lila Tretikov; the next year it included kind of a blind item about various staff departures; and the year after that, four separate items related to Tretikov’s messy removal and replacement by Katherine Maher, previously the chief communications officer. The three-and-a-half years since have been considerably smoother, but less so in 2019, and we’re probably closer to the end of Maher’s tenure than the beginning.

    Once again, the last year has seen some major departures at different levels, and the surprising announcement that the entire Community Engagement department would be shuttered. The executive formerly in charge, who had been with the WMF for less than a year and whose style was widely viewed as abrasive, transitioned into one of those dignity-preserving “consulting” contracts so popular in Silicon Valley. The remaining Community Engagement staff has been dispersed to other departments.

    In August, Maher hired a chief of staff, Ryan Merkley. The position had been empty since it was briefly filled by a former Army / DIA / Hillary ’16 official who had been viewed by some in Wikimedia circles as an odd fit. Not so Merkley: he arrived at the WMF after serving as CEO at Creative Commons. But this raised eyebrows, too: why would the leader of one open access institution leave to become second fiddle at another, unless he was being groomed as a successor? Also lurking in the background: complaints about how Merkley had handled sexual harassment claims in his previous role. (Merkley says he did so properly.) Will the matter come back to haunt the WMF? It probably depends on how long Maher plans to stay.

    3. Wikipedia, enemy of authoritarian regimes

    In 2015 China blocked access to Wikipedia’s servers within its borders, and in 2017 Turkey followed suit. The reason is simple: Wikipedia provides access to information that these governments do not like. In May, the Wikimedia Foundation filed a petition with the European Court of Human Rights to make Turkey explain itself, and in December the country’s highest court ordered access to be restored as a matter of human rights. As of this writing, however, Wikipedia has not yet been made available in the country. (This year, China also made sure that absolutely no language edition of Wikipedia can be accessed by its users.)

    Russia has also blocked access to Wikipedia intermittently in recent years, choosing to selectively block access to specific Wikipedia pages until the HTTPS transition made this impossible. In November, Vladimir Putin announced a plan to digitize Russia’s national encyclopedia, the Great Russian Encyclopedia, which had previously been published between 2004 and 2017, and which is controlled by a central authority (not that you’d really expect otherwise).

    By the way, Australia is not an authoritarian state, but nor does it have a constitutional right to free speech, and this year Wikipedia was cited by an Australian court for ignoring a gag order about reposting information relating to Cardinal George Pell’s conviction for rape and sexual abuse. For all the United States’ faults, the First Amendment continues to be the best ally Wikipedia can have.

    2. Movement strategy could use some strategery

    Just because you have a non-profit with a clear mission statement does not mean that you don’t have to make adjustments over time. And so for the last three years the Wikimedia Foundation has been working on something it once called “Wikimedia 2030”—because it asked participants to imagine what the Wikimedia project should look like in 2030—but now just calls Movement Strategy. Perhaps to forestall any jokes about how it really means it wouldn’t be finished until that year?

    For those involved, it’s been a struggle, maybe even a boondoggle. Working groups have been convened and disbanded without arriving at a consensus view; endless conferences and conference calls have failed to reconcile the sprawling directions it has taken. To cite one example of disorder: at Wikimania 2019, the working groups presenting couldn’t even agree on a number scheme for their presentations.

    Later in the summer, strategy participants were called to a last-ditch “harmonization” retreat in Tunisia to finally get it right. But this meeting too seems to have raised more questions than answers. In particular, an emerging theme of decentralizing the WMF—shrinking its size, spinning off dedicated groups, and devolving decision-making to chapter affiliates—was met with pushback by senior leadership. Word now is that yet another effort is underway to rewrite / reconcile the strategy for presentation to affiliates at the upcoming Wikimedia Summit in Berlin in April, but no one is quite sure what it is going to say. A new movement strategy could be a good thing—but right now it feels like process for process’ sake.

    1. Framgate

    In June, the Wikimedia Foundation did something highly unusual: it issued a one-year block for a longtime and very active Wikipedia contributor named Fram, who had been accused of behaving in an abusive manner toward other editors. While the WMF had blocked contributors before, these had always been permanent. Not so here. What could be so awful that it merited a ban, but one with an expiration date? And why didn’t they offer an explanation?

    Reaction from the community was explosive, and divided. Fram was a highly productive contributor, but also one with sharp elbows. Wikipedia has faced plenty of criticism from within and without about harassment problems on the website, and here the Trust & Safety team had ostensibly stepped up to do something about it. But the way they did it left a bad taste, and led, somewhat ironically, to a loss of trust between the WMF and its community.

    The next day, another editor unblocked Fram, only for the WMF to swiftly restore the block and remove the administrator rights of the editor who had restored him. A string of administrator resignations ensued, and nearly 50,000 words were devoted to the community’s internal debate about how to respond. [Update: Actually, I missed the archive pages so the true number may be thousands more.] As a result, the controversy drew far more press attention than anyone expected. BuzzFeed published a lengthy piece with an overreaching title, “The Culture War Has Finally Come For Wikipedia”. Both The Signpost and Slate settled for a slightly more circumspect description, calling it a “constitutional crisis”.

    Indeed, the WMF and its community share some powers, which are not always clearly delineated. The 2030 strategy is supposed to clarify things, but obviously that process had not been resolved by the time Framgate came along. In September, ArbCom decided to vacate the block, but not to restore Fram’s administrator privileges. Once again, the WMF said nothing.

    ♦     ♦     ♦

    Folks, this thing is long enough as it is, so I am going to do us both a favor and stop writing after one more sentence. Please send any corrections to thewikipedianblog@gmail.com, and thanks for reading!

    Previous installments: 2010, 2011, 2012a, 2012b, 2013a, 2013b, 2014, 2015, 2016, 2017, 2018

    Image credits, in order of presentation: Slowking4, The North Face, Zachary McCune, Larry Sanger, Kritzolina, Wikimedia Foundation, Sailesh Patnaik. All images CC-BY-SA except The North Face.
