Africa is a continent filled with incredible people, fascinating culture and enthralling stories. The continent is enriched with all these components that ought to be documented on the internet for the sake of the future generations. However, only about 5% of the content on English Wikipedia is about Africa currently. 

This is why Open Foundation West Africa seeks to bridge the content gap on the web by organizing a content drive campaign (a writing contest) specifically to achieve this goal. Over the past two years, the Africa Wiki Challenge has collated over four thousand (4,000) articles about Africa, contributed by participants from countries all over the world, including Ghana, Nigeria, the UK, Canada, Russia, Rwanda, and the USA.

The campaign has been recognised by the African Union and the Africa Knowledge Initiative as a great way of projecting the continent on the web. 

In line with the African Union Day theme, which seeks to document Africa’s stories and increase the visibility of stories around Africa’s free trade, the Africa Knowledge Initiative (AKI) seeks to leverage Wikimedia tools and platforms to foster innovative solutions for bridging the content gap on global digital knowledge networks.

This initiative will be implemented through Open Foundation West Africa’s annual writing contest and is themed ‘The African Continental Free Trade’.

On Thursday, 18th May 2023, the Africa Day Campaign was officially launched globally, and we were happy to host over 50 people from various countries across the world.

The launch event was graced by speakers from the Africa Continental Free Trade Area Secretariat and the Wikimedia Foundation, respectively: Mr. Peter Sewornoo, the Senior Advisor to the Secretary General, and Ceslause Ogbonnaya, the Wikimedian-In-Residence for the Africa Knowledge Initiative project.

How to participate?

We can’t do it alone. We need your help to create and improve content about Africa in relation to the campaign theme on Wikipedia and other Wikimedia platforms. There are two ways to participate in this contest: you can apply either as a participant or as an organizer.

Participant

Join the one-month intensive continent-wide Africa Day Writing contest from 20th May to 20th June 2023 to create, improve or translate existing articles around the topic.

Register by signing up on the Africa Day Writing Contest Campaign Dashboard. Visit the campaign meta page to learn more about how to participate in the contest.

Organizer

Are you participating as an affiliate, user group, organization, institution, or campaign organizer? Here are a few ways you can participate:

Fill the form for organizers HERE

Visit the campaign Meta Page for more information on how to participate in the contest. 

Follow us on social media: on Instagram and Twitter at @ofwafrica, and on Facebook and LinkedIn at Open Foundation West Africa.

Angika is a language spoken in the Bihar and Jharkhand states of India, as well as the Terai region of Nepal. According to the 1997 census, it had around 7 million speakers. Currently, efforts are being made to increase the visibility of this language on Wikimedia projects. 

Screenshot of Angika Wikipedia taken on 29th May 2023

The Angika Wikipedia had been in the incubator stage since August 2010 and officially became a full-fledged Wikipedia on 22nd March 2023. Kundan Amitabh, who laid the foundation of the Angika Wikipedia and is the creator of Angika.com, shared invaluable insights based on his experiences, lessons learned, and hopes for the Angika Wikipedia.

How did you begin to work on the Angika incubator and what were some of the challenges?

I learned about the incubator through a fellow Wikimedia volunteer. It is challenging to discover information about the Wikimedia incubator due to visibility issues. In 2010, I started by translating an article from English Wikipedia into Angika. I also translated the interface into Angika using translatewiki.net. I took care of other prerequisites for the approval of the incubator to ensure the existence of a platform where people can contribute to the language. A language code was necessary, so I applied for it online and received it within a month or two.

How did you engage other contributors?

Wikimedia outreach activities used to occur frequently in Mumbai, my city of residence. Community members from various language backgrounds would participate in these events. During those occasions, I would actively search for other Angika speakers. However, typing in Angika posed a challenge at that time. To address this issue, I would teach typing to Hindi Wikipedians who hailed from the Angika-speaking region so that they could contribute to the Angika incubator. 

Building a community is a significant undertaking. To encourage community contributions, I reached out to Angika writers, utilized social media for outreach, and raised awareness about the language’s usage and the opportunity to contribute. I reached out to several intellectuals, but they showed little interest. Surprisingly, it was the common people who displayed more enthusiasm to contribute. The lack of individual recognition might be a factor, as online volunteerism is not so popular.

What are your hopes for the future with regards to the newly created Wikipedia?

My main expectation is the creation of a larger community and increased involvement from active participants. This would lead to a more vibrant community overall. It is our responsibility to expand the community and convey to potential volunteers that contributing as speakers of the language is their duty. One can utilize content translation from English to Angika. These translations can be used to create translation models in the future. 

The challenges faced by small Wikipedia communities like Angika differ from larger ones, with translation being particularly crucial for smaller communities. There is also a need to integrate it with Google Translate and utilize other technologies like lexemes. However, all of these initiatives require an active community to drive the movement forward. So, the hope is that more people voluntarily come forward to contribute to the Angika Wikipedia. 

There are 1,625 articles in Angika Wikipedia as of 29th May 2023 and it can be accessed at https://anp.wikipedia.org

Monitoring my indoor air quality

Tuesday, 30 May 2023 00:17 UTC
Fri, 19 May 2023 Denver air quality live cam

If there’s one thing that feels like it’s gotten worse in my lifetime, it’s air quality.

Colorado’s air quality last week was dismal, filled with smoke from Canadian wildfires, making Denver’s air quality among the worst of any major city.

This is what happened to air quality four miles from my house:

Fine particles (PM2.5), Union Reservoir, Longmont, CO

And here’s the air quality index (AQI) in my bedroom:

Bedroom air quality index 2023-05-18–2023-05-19

You can see spikes from cooking. And you can see the moment (2023-05-19T22:25 MDT) I swapped out the aging filter on my little LEVOIT air purifier, holding particulate in check, returning indoor air quality to baseline.
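For readers curious how a PM2.5 reading becomes an index value like the one charted above, here is a minimal Python sketch. It assumes the US EPA piecewise-linear formula with the PM2.5 breakpoints that were in effect in 2023. The function name `pm25_to_aqi` is made up for illustration, and this is not necessarily how the dashboard above computes its number.

```python
# PM2.5 breakpoints (µg/m³) mapped to AQI ranges.
# Assumption: US EPA table as it stood in 2023. The official index is
# defined on a 24-hour average; feeding it instantaneous readings, as
# here, is only a rough guide.
BREAKPOINTS = [
    (0.0, 12.0, 0, 50),
    (12.1, 35.4, 51, 100),
    (35.5, 55.4, 101, 150),
    (55.5, 150.4, 151, 200),
    (150.5, 250.4, 201, 300),
    (250.5, 350.4, 301, 400),
    (350.5, 500.4, 401, 500),
]


def pm25_to_aqi(concentration: float) -> int:
    """Linearly interpolate an AQI value from a PM2.5 concentration."""
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    for c_lo, c_hi, i_lo, i_hi in BREAKPOINTS:
        if concentration <= c_hi:
            # Readings that fall in the gap between two rows (e.g. 12.05)
            # are clamped to the start of the next row.
            c = max(concentration, c_lo)
            return round((i_hi - i_lo) * (c - c_lo) / (c_hi - c_lo) + i_lo)
    return 500  # off the top of the scale


if __name__ == "__main__":
    for reading in (5.0, 20.0, 40.0, 160.0):
        print(reading, "->", pm25_to_aqi(reading))
```

Anything above the last breakpoint is simply capped at 500 in this sketch.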

Why care about air quality?

Acute exposure to air pollution makes you acutely dumber.

This was the conclusion of MIT researchers back in 2022 when they looked at the effect of air quality on chess.

They combed through 30,000 chess moves, evaluating them with the Stockfish chess engine, comparing 121 players against themselves under different air quality conditions (which they monitored with foobot sensors).

The researchers concluded that an increase as small as 10 µg/m³ PM2.5 causes a 2.1% increased likelihood of player error.

Measuring air quality

AirGradient DIY alongside my previous kludgy attempts at making something similar.

Reference particle mass counters cost thousands of dollars. And even so-called low-cost air sensors like the ubiquitous PurpleAir will set you back $200.

But the same sensor used inside the PurpleAir, the Plantower PMS5003, can be found for as little as $15 on AliExpress.

The Plantower sensor, however, is far from a reference device. But studies suggest it’s directionally correct. And, with post-factory calibration, it can match readings from more expensive reference meters [1].

The Plantower PMS5003 features in AirGradient’s DIY printed circuit board (PCB), which combines air quality, temperature, and CO2 sensors with a cheap ESP8266 for internet. I ordered ten of these PCBs from PCBWay back in 2021 for about $30.

I’ve since modified AirGradient’s example code to support pushing data to Home Assistant via MQTT. From there, sensor data gets sucked up by Prometheus, so I can monitor it via Grafana.
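As a rough illustration of the MQTT leg of that pipeline, the sketch below builds the two messages a sensor might publish: a Home Assistant MQTT discovery config and a state update. It is a Python sketch rather than the Arduino code running on the ESP8266, and the topic names, the `airgradient_bedroom` identifier, the sample readings, and the helper functions are all hypothetical. Only the payload keys (`state_topic`, `unit_of_measurement`, `device_class`, `value_template`, `unique_id`) and the `homeassistant/sensor/.../config` topic shape follow Home Assistant's MQTT discovery format. Actually publishing the messages would need an MQTT client library, which is left out here.

```python
import json

# Hypothetical identifiers; pick whatever matches your own setup.
NODE_ID = "airgradient_bedroom"
STATE_TOPIC = f"airgradient/{NODE_ID}/state"


def discovery_message(discovery_prefix: str = "homeassistant"):
    """Return (topic, payload) announcing a PM2.5 sensor to Home Assistant."""
    topic = f"{discovery_prefix}/sensor/{NODE_ID}_pm25/config"
    payload = {
        "name": "Bedroom PM2.5",
        "unique_id": f"{NODE_ID}_pm25",
        "state_topic": STATE_TOPIC,
        "device_class": "pm25",
        "unit_of_measurement": "µg/m³",
        # Pull the pm25 field out of the JSON state message.
        "value_template": "{{ value_json.pm25 }}",
    }
    return topic, json.dumps(payload, ensure_ascii=False)


def state_message(pm25: float, co2_ppm: int, temperature_c: float):
    """Return (topic, payload) carrying one set of sensor readings."""
    payload = {"pm25": pm25, "co2": co2_ppm, "temperature": temperature_c}
    return STATE_TOPIC, json.dumps(payload)


if __name__ == "__main__":
    for topic, payload in (discovery_message(), state_message(4.0, 612, 21.5)):
        print(topic, payload)
```

Keeping all readings in one JSON state message means each additional sensor (CO2, temperature) only needs its own discovery config with a different `value_template`.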

AirGradient Grafana dashboard

This system gives me a full view of my indoor air quality. And it’s a needlessly complicated way of reminding me to change out my air filter 😬.


  1. https://doi.org/10.4236/ojap.2021.101001

Outreachy report #44: May 2023

Tuesday, 30 May 2023 00:00 UTC

✨ This month's highlights

  • We accepted 63 interns! 🎉
  • We’re planning eight 1,000 intern celebrations around the world!
  • In a post-mortem about the intern selection phase, we started to discuss possible changes we could test during the next cohort.
  • I submitted two proposals to FOSSY 2023 and started to prepare to travel to Portland in July.

We accepted 63 interns this cohort! They started their internship this week and we’ll host our first intern chat with them (about working remotely) today.

Tech/News/2023/22

Monday, 29 May 2023 21:59 UTC


Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Recent changes

Problems

  • For a few days earlier this month, the “Add interlanguage link” item in the Tools menu did not work properly. This has now been fixed. [3]

Changes later this week

  • The new version of MediaWiki will be on test wikis and MediaWiki.org from 30 May. It will be on non-Wikipedia wikis and some Wikipedias from 31 May. It will be on all wikis from 1 June (calendar).
  • VisualEditor will be switched to a new backend on small and medium wikis this week. Large wikis will follow in the coming weeks. This is part of the effort to move Parsoid into MediaWiki core. The change should have no noticeable effect on users, but if you experience any slow loading or other strangeness when using VisualEditor, please report it on the phabricator ticket linked here. [4]

Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.

Growth team newsletter #26

Monday, 29 May 2023 16:37 UTC

Welcome to the twenty-sixth newsletter from the Growth team

Translations are available on wiki • Help with translations

One million Suggested Edits

We passed the 1 million Suggested edits milestone in late April!

  • The Suggested edits feature (AKA Newcomer tasks) increases newcomer activation by ~12%, which flows on through to increased retention. (source)
  • Suggested edits increase the number of edits newcomers complete in their first two weeks and have a relatively low revert rate. (source)
  • Suggested edits are available on all Wikipedia language editions.
  • Newer Suggested edits, like Add a link and Add an image, aren’t yet deployed to all wikis, but these structured tasks further increase the probability that newcomers will make their first edit. (source)

Positive reinforcement

Positive reinforcement aims to encourage newcomers who have visited our homepage and tried Growth features to keep editing.

  • The new Impact module was released to Growth pilot wikis in December 2022, and we are now scaling the feature to another ten wikis. [1]
  • The Leveling up features are deployed at our pilot wikis.
  • The Personalized praise features were deployed at our pilot wikis on May 24. Mentors at pilot wikis will start to receive notifications weekly when they have “praise-worthy” mentees. Mentors can configure their notification preferences or disable these notifications.

Add an image

Other updates

What’s next for Growth?

  • We shared an overview of Growth annual planning ideas, and have started community discussion about these potential projects. We would love to hear your feedback on these ideas!

Growth team’s newsletter prepared by the Growth team and posted by bot • Give feedback • Subscribe or unsubscribe.

Shallow insights into the Wikimedia Community

Monday, 29 May 2023 12:17 UTC

Although getting a pointless degree in Community Governance Studies would allow me to preface the shit I spout with “as someone with a degree in..”, half the points I (try to) make about The Wikimedia Community™ really are just common sense. They are rarely “deep insights based off my 10+ years volunteering”, nor things I’ve figured out in my year-and-a-bit working for the Foundation in a community-facing-ish role.

That’s not to say there aren’t deeper insights to be figured out by long-term volunteers, staff, or those with specific industry experience in community governance, but these probably ain’t it.

The community is fractured

This is something both volunteers and staff struggle with, and it manifests in many ways — for example, how can the WMF “listen to the community” when it’s unclear what the community even is in this specific scenario?

Conversely, how can volunteers effectively tell the Foundation what needs to be prioritised if there’s no single, unified voice (and everyone has opinions)?

There are no real answers to these questions, nor a solution I think would adequately represent the entire community fairly — but perhaps that’s because we’re looking at this backwards…

Instead of begrudgingly attempting to split what we try so hard to define as a single entity, we should embrace the idea that the Wikimedia community is built out of readers, editors, power users, moderators, developers (etc.) — these groups are distinct (yet sometimes overlap heavily) and should be treated and listened to separately.

A fractured community doesn’t mean one which in-fights over priorities, and each group’s priorities would still need to be sorted at a high level.

The community is not all knowing

It comes as no surprise that the Wikimedia community is built up of a large and varied user base, with a range of experiences and proficiencies. This being said, as a whole we often act as though we are all-knowing arbiters of every aspect of building Wikimedia; from the software, to the management, to everything in between.

We are not, and for our own sakes we would do well to listen more. A fine example of this is the unfortunate deployment of the new default Wikipedia skin, Vector 2022 (perma) — in an effort not to relitigate the more controversial aspects of the skin (and through fear of attracting needless debate..!), I’ll focus on some key aspects (which, admittedly, will need to be taken without due context) that speak to the point I’m trying to make here:

  • A lot of experienced editors did not like the skin
  • Some of these editors disputed the validity of the user experience research the WMF did

These are users who, at the very least statistically speaking, are unlikely to all be experts in user experience/research/accessibility/web design etc., acting in an authoritative manner on a subject they know little about. Subjective opinions on whether a design is “good” or “bad” are one thing (and I dare say, a very useful thing), but to dispute research conducted by data analysts based on a negative opinion of the results is inappropriate.

To assuage any appearance of favoritism in this section, I’ll mention that it’s not just the community who is “guilty” of acting in an all-knowing manner. The Foundation, which is built up of a minority of staff who are integrated members of the community and a majority who have little to no interaction with the movement, often behaves as though there is a deep, cross-sectional integration between teams and the community. This may well be a goal, and certainly should be a goal, but as it stands it is not the case.

The community has power

A common gripe I see in the volunteer world is “no one listens to us”, and that “the WMF acts in its own interests”. Although this can be true in some specific situations, and the reason volunteers can sometimes feel like this should be explored in depth, on the whole it could not be further from the truth. The community has significant power, and with it a significant responsibility to utilise that power carefully, infrequently and in unison (though, this is not always possible).

This of course comes with many caveats, and we would be wise to weigh up the “power imbalance”: the Foundation has the resources to effect change on both the software (through its paid developers) and the movement (through its close affiliates), but often looks to the community for guidance in these areas.

This guidance, likely due to the consequences described in the other sections of this post, is often lacking.

The community might not even exist

Even after saying all of this, and mentioning “the Wikimedia community” more times than I can be bothered to count, what we (both as volunteers, and staff) consider the community to be might not even be a thing we can engage with, or represent, in a meaningful way. Saying “it doesn’t exist” might be hyperbole on my part, but the core concept I’m trying to make evident is that attempting to listen to and work with a nameless, faceless, and constantly evolving entity is impossible.


So then, what should we do?

Well, first of all, you’re asking the wrong person — asking any one person or organisation isn’t the way to move forward. A truly representative plan would need to involve a significant percentage of the fractured community, the WMF staff and leadership, the affiliates and at this stage, a lot of goodwill.

Explaining a good course of action would be difficult for me to do (I’m a fairly shoddy writerer), so I’ll leave you with a case study in how to almost get it right:

PageTriage: The open letter format

In July 2022, a group of English Wikipedia editors (primarily New Pages Patrollers) created this open letter to the Wikimedia Foundation and its Board of Trustees — it was co-signed by 444 editors and resulted in a significant amount of attention (and work) being done on the PageTriage MediaWiki extension.

This effort, while laudable, is an example of the power of the community being used ineffectively. That’s not to say it wasn’t a good idea, nor that it was poorly executed, but it was a narrow application of pressure. It helped solve the immediate problem the group of editors were experiencing, but set a precedent that large, loud groups (the English Wikipedia being the largest, and loudest) get and hold our attention, further fracturing our collective power not based on how we contribute (reader, writer, developer) but on the privilege we hold by being part of the largest and loudest group.


As a group we should remember that we are in this together, and only together can we build a relationship which lasts, fairly represents our needs, promotes mutual respect and (if you’ll excuse the cliché) contributes to the sum of all human knowledge.


The post Shallow insights into the Wikimedia Community appeared first on TheresNoTime.

Tech News issue #22, 2023 (May 29, 2023)

Monday, 29 May 2023 00:00 UTC

Tech News: 2023-22

weeklyOSM 670

Sunday, 28 May 2023 09:55 UTC

16/05/2023-22/05/2023


WikiShootMe: Location of Wikidata items [1] | © wikimedia.org Toolforge | map data © OpenStreetMap contributors

Mapping

  • Wilmer Osario wrote in his diary about re-drawing and improving 90 km of the Caracas-La Guaira Highway, by hand, from scratch, in JOSM. He explained in detail how he did it, adding an impressive number of screenshots to visualise every step. It reads like a user manual for mappers who may be planning to do similar things.

Community

  • adreamy has made a Telegram sticker set, offering a more fun and comfortable way for OSM contributors to communicate.
    He noted that sticker images can be exchanged, so if you have a good image or idea, let him know and he’ll take it into consideration.

OpenStreetMap Foundation

  • Grant Slater, the OpenStreetMap Foundation’s Senior Site Reliability Engineer, reflected on the advancements made during his first year of work. Collaboration by Grant with the Operations Working Group has boosted OSM’s infrastructure documentation, reliability, and security. Also, a new Dev server was installed and the old forums migrated to Discourse.

Events

  • Pete Masters (Head of Community at HOT) blogged that the former HOT unSummit has been renamed HOT OpenSummit.

Education

  • Anne Schauß, from HeiGIT, held a training session in Kuala Lumpur, Malaysia, for the International Federation of Red Cross and Red Crescent Societies (IFRC). She showcased the Sketch Map Tool (with functionality similar to Field Papers), that helps collect geolocated field data by tracing with pen and paper. Either OpenStreetMap (OSM) or an image is printed as the background.

Maps

  • #geoObserver reviewed Shadowmap, an application that simulates the movement of shadows. The app can predict the direction of shadows at any given time and place using astronomical calculations of the sun’s movements combined with building data from OpenStreetMap.

Open Data

  • Arno Dagnelies highlighted the lack of a high-quality source of address data for the whole world and presented his OpenStreetData project. To achieve global coverage with high-quality, open-licensed data, the author extracts addresses from raw OpenStreetMap data, along with postcodes, street names, and administrative districts. Arno describes in detail the process of address processing and the difficulties he encountered.

Software

  • The GeoVisio team, which includes contributors from OSM-fr, has developed an open-source streetview platform.
  • yuiseki improved the performance of the Trident GeoAI, which is based on a large language model. You can enter your requirements, and then see the results in this futuristic interface map.

Programming

  • gislars, from the Traffic Flow and OpenStreetMap group, tooted a new analysis and map of parking spaces in Braunschweig as part of the project ‘Parking Analysis with OpenStreetMap’.

Did you know …

  • … that the WikiShootMe application can display the location of Wikidata items around you in the form of a map?
  • … how to use exiftool in order to get the GPS coordinates from an image’s Exif header, in a format which is recognised by OpenStreetMap’s search?
  • … how to set up OsmAnd quick actions to have your favourite features on the map for fast access?
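On the exiftool item above: Exif stores GPS positions as degrees, minutes, and seconds plus a hemisphere reference, while a plain decimal `latitude, longitude` pair is one form that OpenStreetMap's search accepts. The Python sketch below shows only that conversion, as an illustration. It does not call exiftool, the function names are hypothetical, and the sample coordinates are invented.

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    ref = ref.upper()
    if ref not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere reference: {ref!r}")
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative in decimal notation.
    return -value if ref in ("S", "W") else value


def to_search_string(lat_dms, lon_dms) -> str:
    """Format two (deg, min, sec, ref) tuples as 'lat, lon' with six decimals."""
    lat = dms_to_decimal(*lat_dms)
    lon = dms_to_decimal(*lon_dms)
    return f"{lat:.6f}, {lon:.6f}"


if __name__ == "__main__":
    # Invented sample position, not taken from any real photo.
    print(to_search_string((40, 30, 0, "N"), (105, 15, 0, "W")))
```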

Other “geo” things

  • Rémy Crassard and his colleagues have found several plans of desert kites (hunting or herding traps) carved into rocks in Jordan and Saudi Arabia. Thought to have been made in the Neolithic era, these engravings are among the oldest scale plans found so far.
  • Ursula Petula Barzey explained the various indigenous names of some Caribbean Islands that were used prior to European colonisation.

Upcoming Events

Where What Online When Country
Royal Borough of Greenwich Hasiru Aqua 24-Hour Mapathon 2023-05-26 – 2023-05-27
左營區 2023年5月 OpenStreetMap 街景踏查團 2023-05-27 flag
Tiranë Maproulette Challenge at Open Labs Albania 2023-05-27 flag
Singapore OG Training by HOTOSM AP-Hub OSM ON THE GO: OSM MOBILE APPLICATIONS 2023-05-27
MapRoulette Nights 2023-05-28
OSMF Engineering Working Group meeting 2023-05-31
Düsseldorf Düsseldorfer OpenStreetMap-Treffen 2023-05-31 flag
Bologna Alluvione dell’Emilia-Romagna: editathon e mapping party al Salaborsa Lab 2023-06-03 flag
Mosquera Mapea por Mosquera 2023-06-04 flag
Mosquera Resolución de notas en tiempo real – Real-time note resolution 2023-06-04 flag
MapRoulette Nights 2023-06-04
MapRoulette Monthly Community Meeting 2023-06-06
Missing Maps London Mapathon 2023-06-06
San Jose South Bay Map Night 2023-06-07 flag
OSM-Verkehrswende #48 2023-06-06
Stuttgart Stuttgarter Stammtisch 2023-06-07 flag
Richmond State of the Map United States 2023 2023-06-08 – 2023-06-11 flag
Rīga State of the Map Baltics 2023 2023-06-08 – 2023-06-09 flag
Windsor MapRoulette-A-Thon 2023-06-09 flag
München Münchner OSM-Treffen 2023-06-08 flag
Marseille State of the Map France 2023 2023-06-09 – 2023-06-11 flag
Berlin 180. Berlin-Brandenburg OpenStreetMap Stammtisch – 15 Jahre – mit der Wikipedia-Community 2023-06-09 flag
Zürich OSM-Stammtisch 2023-06-10 flag
Hannover OSM-Stammtisch Hannover 2023-06-10 flag

Note:
If you would like to see your event here, please add it to the OSM calendar. Only events entered there will appear in weeklyOSM.

This weeklyOSM was produced by MatthiasMatthias, Nordpfeil, PierZen, Strubbl, TheSwavu, YoViajo, barefootstache, derFred, muramototomoya, rtnf.
We welcome link suggestions for the next issue via this form and look forward to your contributions.

Editing issues

Friday, 26 May 2023 15:41 UTC

May 25, 14:37 UTC
Resolved - This incident has been resolved. A summary will be posted here later.

May 25, 14:25 UTC
Monitoring - A fix has been implemented and we are monitoring the results.

May 25, 14:25 UTC
Update - We are continuing to investigate this issue.

May 25, 14:16 UTC
Investigating - We are aware that users are having trouble editing Wikipedia and other Wikimedia wikis, and we are investigating.

UNESCO, the United Nations Educational, Scientific and Cultural Organization, has recognized Indonesia for its diverse heritage and cultures. Thousands of museums, monuments, and historical places across Indonesia display heritage items, tell the story of Indonesian heroes’ struggles, and commemorate the country’s great leaders and achievements. This cultural heritage belongs to Indonesia as a whole, but young people play a critical role in maintaining and developing it as a source of their identity.

According to the Directorate General of Culture, Ministry of Education and Culture of Indonesia, one of the main problems in the development of museums and historic places is the lack of innovation in marketing and promotion, which in turn affects the awareness and interest of the community, especially the youth. The number of visitors also decreased due to the COVID-19 pandemic in 2020. For instance, visits to Jakarta’s museums and historical sites decreased drastically to 81.5% (Statistics Indonesia, 2021). Thus, digital innovation is critical now that everyone lives in the digital age.

It is believed that free and open technology can enhance the marketing of historical sites and increase visitor numbers, as the public can access it without barriers. Internet access and communication infrastructure offer an innovative way to engage the community. OpenStreetMap (OSM), as a map-based communication platform, can accurately show the locations of historical sites and allows anyone to add them freely. Because the map is open and freely accessible to everyone, information on historical sites in Indonesia can be maintained and developed collaboratively. OSM is also very flexible, as anyone can add more detailed information to describe a feature. Beyond location coordinates, OSM contributors can add opening hours, wheelchair accessibility, the feature’s website, and so forth.

Perkumpulan OpenStreetMap Indonesia (POI) collaborated with Wikimedia Indonesia on a program called Historical Crowdsources Spatial Data for Sustainable Development and Inclusive Mapping, training participants to contribute Galleries, Libraries, Archives, and Museums (GLAM) data to OSM and Wikidata. During this collaboration, POI and Wikimedia organized activities to enhance the mapping of historical infrastructure data from GLAM objects into OSM. These activities included the GLAM Flash Competition and training sessions on OSM and Wikidata (an edit-a-thon). The events spanned seven days in total, with five days of online competition dedicated to the GLAM Flash Competition (20 – 25 December 2022) and two days for the edit-a-thon in Bandung on February 13-14, 2023. These events resulted in a great number of GLAM objects being mapped on OSM, with the related information also edited on Wikidata. Although only 219 people participated, more than 126,000 buildings were added to OSM, and more than 2,400 items were edited on Wikidata. These numbers indicate that the participants had a significant level of interest in the competition and training session. Moreover, the 461 GLAM objects on POI’s Web GIS page would not have been collected without their efforts.

Following their collaboration, POI and Wikimedia organized a competition called GALACTICO. Before the competition began, both organizations held a socialization event to promote it. As discussed during the socialization, GALACTICO required participants to creatively visualize GLAM data from OSM and Wikidata through infographics or videos uploaded to their Instagram accounts, with the goal of increasing public awareness of and interest in GLAM. In total, 32 teams participated, covering various GLAM themes. POI and Wikimedia then assessed the teams on specific criteria, such as theme ideas, use of OSM and Wikidata data, and the complexity and aesthetics of their output. The evaluation led to the selection of eight winners: a winner, first runner-up, second runner-up, and favorite winner in each of the infographic and Instagram reels categories. The event concluded on May 17, 2023, with a closing ceremony where the winners were announced.

Congratulations to the Wong Ling Lung, The Manuls, Minions, and Lembah Halimun teams for winning the infographic categories. The Instagram reels categories were won by Hulahoop, Gudetamap, Penyelamat Bumi, and Angkara Cemara. POI and Wikimedia also sincerely thank all the teams who participated in GALACTICO. Thanks to their efforts, many people have found their work highly informative and discovered the GLAM objects in their vicinity. Some are even willing to visit these GLAM objects during their leisure time.

It is hoped that the program will increase the number of GLAM visitors and that more young people will actively preserve GLAM as an integral part of their ancestors’ heritage. The program has also shown that open data is very useful in this digital age. This is because the actual value of intangible cultural heritage lies not in its cultural presentation but in transmitting valuable knowledge and skills from one generation to another.

A Collared Aracari from the toucan family in Costa Rica. Image by LG Nyqvist, CC BY-SA 4.0, via Wikimedia Commons.

Discussions about the internet tend to focus on the future: new technologies, emerging threats, untapped opportunities. Twenty years of existence does not make the Wikimedia movement immune to the fascination with technology and its promise of progress. Take the present moment as an example, and its discussions about the risks and potential of generative artificial intelligence tools like ChatGPT. But there is also much to gain from slowing down and asking: What kind of digital future do we want? How do we make sure that human rights are respected by new technologies? RightsCon’23 provides a civil society-led space where stakeholders of all kinds can come together to discuss the present in order to understand how to build that shared future.

This conference, hosted by Access Now, brings together activists, artists, technologists, business leaders, policymakers, journalists, and more to discuss human rights in the digital age. RightsCon’23 will be the first hybrid version of the event, taking place both online and in-person in San José, Costa Rica, from 5–8 June 2023. Wikimedians will host and/or participate in five sessions. Their contributions to RightsCon emphasize the importance of the free knowledge movement in the larger online information ecosystem: Information is power, and access—or lack thereof—is often used as a tool for control by those in power.

The Wikimedian sessions at RightsCon’23 illustrate how vital the work of the movement is not only to developing online technology and spaces that enable access and participation in free knowledge, but also to using these to achieve sociotechnical goals—whether helping a country remember and prevent a violent history, learning about labor issues from the perspectives of Black and Indigenous peoples, or uplifting decentralized technologies as tools to carve out civic spaces online.

Similarly to last year’s overview of Wikimedian sessions at RightsCon’22, below is a summary of the sessions with Wikimedian participation this year. If you’re interested in these sessions and the wider RightsCon program, you can register for free until 2 June 2023. We hope to see you there!

Please note that session dates and times might be subject to change. For up-to-date information on session details, please check the RightsCon program, which opened its Summit Platform on 22 May so that registered participants can create their own personalized schedule.


What We Talk about When We Talk about Free Knowledge: Wikimedia Strategies for Advocacy

Date & Time: 6 June @ 13:45–14:45 UTC

Format: Workshop

Location: Online

Presenters: Eric Luth (Wikimedia Sweden), Patricia Díaz Rubio (Wikimedia Chile), Luisina Ferrante (Wikimedia Argentina), Anna Torres (Wikimedia Argentina), and Douglas Scott (Wikimedia South Africa)

Details: Wikipedia and the Wikimedia platforms provide access to information to billions of people worldwide. Our movement collaborates with UN bodies, government agencies, and international organizations across the world to share vital information with the public. Yet at the same time, new platform regulations and internet governance decisions are designed without consideration of how they might impact Wikimedia platforms, with their collaborative practices and user-generated content. As a result, Wikimedians have had to develop advocacy strategies for getting a seat at the negotiating table and for making sure that lawmakers and other stakeholders listen to the movement’s needs. In this workshop, Wikimedians from various continents and countries will discuss how the Wikimedia platforms and other user-generated online platforms are affected by national and international regulations and policies, as well as how to defend and advance the rights of users on the internet.

Shall We Forget? Open-Source Platforms as Tools for Memory, Truth, and Reconciliation

(In the Tech in Situations of Conflict and Crisis session) 

Date & Time: 6 June @ 17:30–18:30 UTC

Format: Lightning Talk

Location: In-person

Presenter: Valentina Vera-Quiroz (Wikimedia Foundation)

Details: Conflict-related violence in Colombia has taken new forms since the Peace Agreement was signed in 2016 between the government and the largest rebel group. There has been an increase in online takedown requests from parties involved in the conflict to online platforms demanding that their armed conflict-related information be deleted. Today, information about these crimes would not exist if open-source platforms had not documented them, building and keeping records about the crimes that make for one of the most painful episodes of Colombia’s conflict. This talk will discuss how open-source platforms can give a voice to those whose marginalization has denied them the opportunity to tell their stories, and how victims can use these platforms as tools to fight against those who deny the occurrence of past atrocities. Remembering as an activity can increase awareness of human rights abuses and enhance actions to redress wrongdoings, while open-source technologies help us create a record of history, and ensure that what happened in the past can be remembered.

You can learn more about this topic in a Medium blog post that Valentina recently published. 

Latin American Challenges of Accessing, Producing, and Circulating Knowledge of Black and Indigenous Populations Online

Date & Time: 6 June @ 22:30–23:30 UTC

Format: Dialogue

Location: In-person

Presenters: Ivonne Gonzalez (Ennegreciendo Wikipedia), Amalia Toledo (Wikimedia Foundation), Stephanie Lima (InternetLab), and Epsy Campbell Barr (former Vice President of Costa Rica and President of UN Permanent Forum on People of African Descent)

Details: The challenges faced by historically marginalized populations to access, produce, and circulate knowledge online are not sufficiently understood in Latin America. At the same time, the knowledge produced by Black and Indigenous people provides a broader understanding of society as a whole, and not just the specificities of these communities. This session contributes to a deeper understanding of these inequalities and how they have been narrowed with the experience of three focus topics at RightsCon’23: gender justice, labor, and the environment. The session will open with short prompts from the speakers, after which the participants will be separated into groups to reflect. Questions to explore include:

  • How is more information about and from women, especially those with Black and Indigenous backgrounds, important to gender and sexuality topics?
  • How is more information from the perspective of these communities important to understand labor issues?
  • How does this knowledge help us understand and advance climate and environmental justice issues? 

All in the Room? Interrogating the Authority Control Construct in Wikipedia

(In the Internet Shutdowns and Access to Information: Perspectives from Africa session)

Date & Time: 7 June @ 13:45–14:45 UTC 

Format: Lightning Talk

Location: Online

Presenter: Dr. Nkem Osuigwe (African Libraries & Information Associations and Institutions [AfLIA])

Details: How we categorize knowledge in physical spaces like libraries is flawed to the detriment of the visibility, inclusion, and equity of knowledge from Africa. Those flaws are now being carried over into digital spaces. Information classification schemes that librarians use internationally provide a shared system to produce, process, and retrieve information. This classification scheme shows up as the ‘Authority control’ construct in Wikipedia and Wikidata. Yet these categorizations are tilted towards knowledge from the global North and suppress knowledge from the global South. The ontology of the controlled vocabularies and how they inequitably group and categorize knowledge from and about Africa must be interrogated, for they underpin the organization of knowledge and can hinder or promote visibility, inclusion, and equity online. Is this possibly ‘epistemicide’? Finding solutions to this, and possibly charting pathways to equitable and inclusive classifications of knowledge so that our stories, histories, and realities are more visible in online spaces, is the core reason behind this session.

Getting the Laws in Check: Policymakers Push for Defending Digital Rights in Southeast Asia

Date & Time: 7 June @ 22:30–23:30 UTC

Format: Panel

Location: In-person

Presenters: Amalia Toledo (Wikimedia Foundation), Seema Chishti (Access Now), and ASEAN Parliamentarians for Human Rights (APHR)

Details: Across the Asia and Pacific region, elected lawmakers in national parliaments are increasingly required to engage with digital rights issues, as these digital aspects become a core part of all human rights discussions. Their actions set the context in which government executive officials operate, the laws that countries adopt, and the policy consensus their governments forge. However, there is relatively little public discussion of and engagement with the context, tools, and challenges these lawmakers face in major Asia-Pacific democracies. This session provides a cross-country, cross-region forum for lawmakers engaged with pressing digital rights conversations. It is a joint session between the Wikimedia Foundation, Access Now, and ASEAN Parliamentarians for Human Rights (APHR), with speakers from Access Now and the Foundation as well as Members of Parliament (MPs) from Malaysia and Thailand. The session offers a conversation with the MPs on policy in Southeast Asia and how we can learn from other regions (for example, Latin America and the Caribbean) to create a better environment for freedom of expression and free knowledge in the Asia-Pacific region.

Encyclopedia of War: Addressing and Mitigating Risks to Wikimedia Community During the War in Ukraine

Date & Time: 8 June @ 10:45–11:45 UTC

Format: Dialogue

Location: Online

Presenters: Cameran Ashraf (Wikimedia Foundation), Sanda Sandu (Wikimedia Foundation), Max Fischer (Wikimedia Foundation)

Details: The war in Ukraine has many fronts, some more visible than others. On Wikimedia projects, of which Wikipedia is the largest and best known, volunteers are working tirelessly to keep the information neutral, verifiable, and up to date, often at great cost to themselves. Using the war in Ukraine as a case study, the session will explore some of the challenges faced by the Wikimedia community in times of conflict and when contributing from jurisdictions where freedom of expression is restricted and the environment is otherwise unfavorable to free knowledge. The session will be formatted as a community brainstorming and sharing session: participants will learn about the risks that Wikipedia faces, and together we will discuss and identify threats from different perspectives along with methods to respond to and mitigate them.

Co-creating the Online Spaces We Want in a (Fe)Diverse and Decentralized Internet

Date & Time: 8 June @ 15:00–16:00 UTC 

Format: Workshop

Location: In-person

Presenters: Amalia Toledo (Wikimedia Foundation), Javier Pallero (Access Now), and Marlena Wisniak (European Center for Not-For-Profit Law [ECNL])

Details: Elon Musk’s purchase of Twitter coincided with a renewed interest in decentralized social media platforms. This session will explore the challenges and opportunities of the ‘fediverse’ as well as emerging technologies such as Web3. These decentralized technology systems have been celebrated and spotlighted as a model for co-creation, distributed governance and shared ownership, thereby claiming to achieve the collective promise of an open internet while bypassing harms stemming from Big Tech. Yet can Web3 truly function as a decentralized system and advance public interest, in a world built around centralized power, money and influence? Can the Fediverse provide an opening for activists in a world with shrinking civic spaces? How would these alternative forms of social media be regulated? Civil society, especially members of marginalized groups, including those outside the US and Western Europe, has a real opportunity today to redefine how platforms should operate and how content could be moderated, and we need to seize it. Experts from various sectors will share their perspectives, so that participants can collectively engage in an interactive workshop to shape the alternative futures we want.

In this post, Mike Peel, a GSoC & Outreachy mentor with Wikimedia since 2021, shares his experience with the two programs.

Srishti: Can you tell us about your background and involvement in the Wikimedia movement?

Mike: I’ve been editing the English Wikipedia since 2005, when I was reading an article and was irritated by a grammatical error – so I fixed it. After that I just kept coming back and contributing more. The Wikimedia projects are a natural place to share information and photographs with others while you’re exploring the world around you – and discover pointers to new things to learn and discover.

I soon got involved in the technical side of things too, working on templates that could be used across multiple articles. Wikidata then came along, and revolutionised the way that templates can display information – instead of it being manually provided through template parameters, it could be fetched directly from Wikidata. Initial work on automated infobox templates on the English Wikipedia ran into community issues, but having a single, multilingual, Wikidata Infobox on Commons became a big success.

I also got pulled into the organisational side of things quite early on, and ended up co-founding Wikimedia UK in 2008, serving as a Trustee of it for 5 years, and going on to serve on grants committees (initially the global Funds Dissemination Committee, and later the Northern and Western Europe Grants Committee). I was elected to the Wikimedia Foundation Board in 2022 where I am currently serving as a trustee.

Srishti: It is impressive to see your mentoring involvement in technical projects in almost every round Wikimedia organizes! How many projects have you mentored so far? And how did you get involved with mentoring in Wikimedia’s outreach programs?

Mike: I heard about the Wikimedia Foundation’s involvement in Outreachy and GSoC by emails to wikimedia-l, and I submitted my first project proposal in 2021 as a technical volunteer. I was already familiar with supervising students: I had two summer students working on science articles in the English Wikipedia back in 2014, and I’ve also supervised students working on astronomy projects over the years.

I was really impressed by the Outreachy program, and their focus on tackling underrepresentation in open source tech through paid internships. I ended up supervising two students in each of their Summer 2021, Winter 2021/2, and Summer 2022 rounds (co-supervised by Andy Mabbett), as well as co-supervising a student in Winter 2022/3 (main supervisor Éder Porto).

I also really like how Wikimedia volunteers can (and are encouraged to) supervise students through both Outreachy and GSoC – you don’t have to be a staff member to mentor projects!

(My day job is in academia, so I normally use the word ‘student’ rather than ‘intern’ – since every project is a learning experience.)

Srishti: This past summer you mentored 3 interns as part of GSoC / Outreachy. Can you tell us a little about the projects they worked on?

Mike: The two Outreachy students – Roberto and Nirali – worked on “What’s in a name? Automatically identifying first and last author names for Wikicite and Wikidata”, with Andy Mabbett co-supervising. This project focused on understanding author names for Wikidata items about journal articles – which is really complex since name strings can vary in definitions between countries and cultures. Nirali and Roberto worked on Python scripts to use extra information provided by the authors of the journal articles to distinguish between first and last names, and to add this information to the Wikidata items using the “author given names” and “author last names” properties. It also led to more articles being imported to Wikidata, although this is tricky since items about journal articles weigh heavily on Wikidata’s query service back-end.

The GSoC student – Lennard – worked on a very different project, “Rewrite the Wikidata Infobox on Commons in Lua”. The first version of the Wikidata Infobox that I’d developed and deployed on Wikimedia Commons had been a big success, and was used in over 4 million categories. However, it was not written with efficiency in mind, and it was taking a lot of time to load for complex items. Lennard rewrote the Infobox in Lua, speeding it up by an order of magnitude.

Srishti: How was your experience mentoring these interns? How was it mentoring multiple people at the same time?

Mike: It’s always a great experience mentoring students – you get to both share your knowledge, and learn new things as they explore new topics and share their perspectives.

Mentoring several students together can actually be easier than mentoring them individually – since they can help each other out when they encounter problems. The tricky thing is making sure their projects don’t overlap too much, so their work complements each other rather than leading to duplication.

The most time-consuming part of Outreachy is actually during the contribution period, where you can have ~20 candidates all wanting to do your starting tasks. My approach with this was to have a standard mini-course of ~3-6 generic tasks, which could demonstrate the candidate’s interests and capabilities. While the tasks were the same (e.g., code something that does <x>), the subject (e.g., books, science, etc.) was free for the candidate to choose, so they didn’t overlap with editing. Then the most difficult part comes at the end, where you have to pick one of the candidates for the internship! (and being able to select multiple people makes this choice a bit easier!)

Srishti: What steps do you actively take to ensure interns have a good experience? Mainly, how do you foster their engagement with the broader Wikimedia community during and after the program?

Mike: This can be quite difficult. Wikimedia contributors have to be self-reliant and self-motivated, which is somewhat different from what you get through an intern process. As such, the transition afterwards can be tricky – since the financial motivation is no longer there, and they will often become busy with other things (going back to uni etc.). Some choose to stay around, others move on – but you can keep in touch as a mentor in the future as they need.

Importantly, the internship has to be an experience that gives them knowledge and skills that they’ll make use of in the future, for example, understanding how Wikimedia works, and/or learning a programming language. They may not use those skills to contribute back soon, but they may return in the future, and may use the skills for other things (e.g., to get an open source job in the future).

Srishti: What are some of the challenges you face while mentoring interns? How do you overcome them?

Mike: They can be really quiet at the start! It takes time for them to get to know the project, community, and mentors. Having multiple opportunities and invitations to communicate via different means can help bring them out of their shell – but it does take time.

They will also take longer to go through a full project than you might expect – so make sure there are milestones along the way. The outcomes of the project can also be quite different from what you expect at the start, as they develop throughout the project, so mentors need to have some flexibility!

In general, keep the focus on giving the students a good experience and knowledge base for the future, rather than just aiming to get the project completed!

Srishti: What advice would you give to future mentors to make contributors’ experiences meaningful?

Mike: Co-supervision really helps with making sure the students get to experience different viewpoints, and to make it easier when one supervisor isn’t available for parts of the program. It also helps to have community-facing projects, so that the students get to interact with the wider community rather than working on a project in relative isolation.

Srishti: What advice would you give future interns to thrive at their internship?

Mike: Ask questions! The internship is a learning experience, and your mentors are there to help – so don’t hesitate to ask them anything, there’s no bad question, and it’s a safe environment to learn. 

Episode 138: Danielle Batson

Tuesday, 23 May 2023 17:02 UTC

🕑 58 minutes

Danielle Batson is the wiki community manager for the non-profit organization FamilySearch, which is run by the Church of Jesus Christ of Latter-day Saints.


Tech/News/2023/21

Tuesday, 23 May 2023 00:22 UTC

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Recent changes

Changes later this week

  • An improved impact module will be available at Wikipedias. The impact module is a feature available to newcomers at their personal homepage. It will show their number of edits, how many readers their edited pages have, how many thanks they have received and similar things. It can also be reached at Special:Impact.
  • The new version of MediaWiki will be on test wikis and MediaWiki.org from 23 May. It will be on non-Wikipedia wikis and some Wikipedias from 24 May. It will be on all wikis from 25 May.

Tech news prepared by Tech News writers and posted by bot.

On April 23, 2023, “Wikipedia Bungaku 9: Yasujiro Ozu” was held. “Wikipedia Bungaku (literature)”, a derivative of Wikipedia Town, is an event in which participants visit an exhibition at the Kanagawa Museum of Modern Literature in Yokohama, Japan, and write Wikipedia articles using materials from the Kanagawa Prefectural Library.

This event was supported by the Wikimedia Foundation. We report on the event in this article.

Outline of the event and the grant

The outline of “Wikipedia Bungaku” and the application for the Wikimedia Foundation’s grant are described in this article.

We applied for a rapid grant to utilize the remaining funds of 46,213 JPY (358.12 USD at the time of application) from the previous two years’ grants. Following a change in the application process effective April 2022, the application was submitted through the dedicated Fluxx website. The application period is fixed, so the application was prepared in January, in time for Cycle 7 (January 30 – March 20, 2023), so that “Wikipedia Bungaku” could be held in April. Due to a change in the application format, we had to write some new text and present some indicators, which was a bit tricky, but the application was successfully approved in late March.

History of the event

The background of how “Wikipedia Bungaku” was started is described in my contribution to the mail magazine “ACADEMIC RESOURCE GUIDE (ARG)” No. 683.

Rashinban, “I tried ‘Wikipedia Bungaku’”. ACADEMIC RESOURCE GUIDE (ARG), Issue 683. Archived from the original as of October 26, 2020; viewed October 24, 2020.

Available at: https://web.archive.org/web/20201026170908/http://www.arg.ne.jp/node/9258 (Japanese only)

The reports of the first and second meetings are also published in LRG (Library Resource Guide), Volume 25 (Japanese only).

During the first and second events, participants wrote articles using materials from the collection of the Kanagawa Museum of Modern Literature and materials they brought themselves. However, there were some difficulties in conducting a bibliographic survey at the museum. Firstly, the reading room was located in a different building from the meeting room where the event was held. Secondly, materials could not be taken out of the reading room but could only be copied. Lastly, the capacity of the reading room was limited. To improve the experience of the event, we thought it would be necessary to seek the cooperation of public libraries, and approached the librarian of the Kanagawa Prefectural Library for advice.

After some coordination, it was decided to give it a try, and the third event was held under the auspices of the Kanagawa Prefectural Library.

“Kanagawa Prefectural Library Hosts ‘Wikipedia Bungaku: Matsumoto Seicho’ | Current Awareness Portal”. current.ndl.go.jp. Viewed 12 October 2019.
Available from: https://current.ndl.go.jp/car/37841

Maho Yamamoto, “Report on the Wikipedia Editing Event at Kanagawa Prefectural Library” (pdf), Bulletin of Kanagawa Prefectural Library, No. 14, Kanagawa Prefectural Library, February 2020, pp. 19-40, ISSN 0911-3681. https://www.klnet.pref.kanagawa.jp/uploads/2020/12/kiyou014_02.pdf (Japanese only)

It was here that we established a style of viewing special exhibitions at the Museum of Modern Literature in the morning and moving to the Kanagawa Prefectural Library in the afternoon for research and writing.

Although there were some irregularities due to the effects of the COVID-19 pandemic and preparations for the opening of the new building of the prefectural library, we were able to hold events through the 8th. A total of 83 articles were edited during the eight events.

Event Preparation

In principle, “Wikipedia Bungaku” is held in conjunction with special exhibitions held at the Kanagawa Museum of Modern Literature in spring and fall, and preparations for the 9th Wikipedia Bungaku began when it became clear that the theme for the spring 2023 exhibition would be Yasujiro Ozu.

For the event, we read the Wikipedia articles about Yasujiro Ozu to check how rich the content was and where there was room for additions, and to identify people and items that did not yet have articles.

[[:ja:小津安二郎]] (Yasujirō Ozu) had already been selected as a Good Article on Japanese Wikipedia, and most of the films directed by Ozu were already covered. We decided that it would be better to target related items and selected [[:ja:茅ヶ崎館]] (Chigasakikan) and [[:ja:松竹大船撮影所]] (Shochiku Ofuna Studio) as candidates for items related to Kanagawa Prefecture.

After a pre-event preview of the exhibition to confirm the contents of the exhibits, we finalized the themes and compiled a list of reference documents. We asked the librarians to research books, magazine articles, newspaper articles, websites, etc. that might be useful for writing, and if the prefectural library had them, we asked them to prepare them at the venue on the day of the event. For newspaper and magazine articles that the library didn’t have, staff members worked in teams to prepare them.

The schedule for the event was decided in late February, after coordination among the museum, the prefectural library, and the event staff, so that the event could be held on a Saturday or Sunday during the Ozu exhibition period. An application for sponsorship was submitted to the museum and an application for co-sponsorship was submitted to the prefectural library, both of which received approval.

Once the date and venue were set, publicity began: an event page was created on Facebook to solicit participants. Event flyers were only posted on the Facebook event page, not printed, distributed, or posted.

Facebook event page: https://www.facebook.com/events/181243867964890

Regarding the venue, the previous (8th) event was the first held after the new main building of the prefectural library opened on 9/1/2022. At that time, the event was held in rented group-study discussion rooms on the 4th floor (three rooms with a capacity of 8 people each), connected by opening the movable walls between them. However, it was a bit cramped for participants to work in groups on each writing theme and to arrange and spread out the materials they had collected. This time, we decided to use the “Learning ⇄ Communication Area”, which is a larger open space. After a preliminary inspection of the exhibition, a meeting was held at the prefectural library to confirm the layout of the venue.

Day of the event

In the morning session, we gathered at the Kanagawa Museum of Modern Literature to listen to a gallery talk by the curator and view the exhibition. In the afternoon, the group moved to the Kanagawa Prefectural Library and wrote articles using the library’s collection. There were 15 participants.

The morning session began at 10:00. The museum opens at 9:30, so staff and participants could enter only after that time. Because the conference room was used only for the event briefing, the gallery talk, and regrouping after viewing the exhibition, it did not take much time to set up. The room was ready for use once the names and name tags were placed at the reception desk and the PCs were connected to the monitors for slide projection.

When it was time for the event to start, the lecturer gave an overview of the event, followed by greetings from the organizer. The themes to be written about that day were presented in advance so that participants would keep them in mind while viewing the exhibition.

  • Yasujiro Ozu
    • Ozu and Gourmet
  • Chigasakikan (Chigasaki Pavilion)
  • Shochiku Ofuna Studio

The gallery talk began at 10:05, with the curator in charge of this exhibition giving a 30-minute explanation of the exhibition’s highlights. She introduced what could be read from the letters and diaries, shared materials newly discovered during the research, and showed us how to read and understand the storyboards. Although staff members had seen the entire exhibition during the preliminary visit, the explanation helped us understand the points of interest and made us want to take a closer look at it again.

File:Wikipediaブンガク9小津安二郎01.jpg on commons.wikimedia.org / Photo by Mayonaka no osanpo / CC-BY-SA-4.0

We started viewing the exhibition at 10:35. Although time was limited to about one hour, participants were free to look around the exhibition until the meeting time.

At 11:45, we reassembled in the conference room. After explaining the upcoming schedule, staff members led the group to the library. Although it was only one bus ride away, everyone had to eat lunch on the way. Since there was a street performance event in Noge that day and it was expected to be crowded, we split into small groups and ate as needed.

The afternoon session started at 13:30. After being seated, we first introduced ourselves. We asked participants to briefly tell us their names, what they had for lunch, why they attended the event, and their previous experience with the event. Then the lecturer explained what Wikipedia is and how to mark up a page, and the librarian of the prefectural library explained the materials and how to use the library.

After that, the participants were divided into groups and asked to choose which of the three themes presented beforehand they would like to write articles on.

Number of participants for each theme:

  • Yasujiro Ozu: 5
  • Chigasakikan (Chigasaki Pavilion): 3
  • Shochiku Ofuna Studio: 5

Coffee break

Every time we start working, we concentrate so hard that we forget to take a break. Since “only drinks with lids are allowed in the Learning ⇄ Communication Area”, we decided to buy drinks at the coffee shop on the first floor of the library and carry them to the fourth floor. The drinks were paid for by the grant from the Wikimedia Foundation. We had arranged for them to arrive around 15:00, but due to various circumstances they did not arrive in time, so the final review time became the coffee break. We also forgot to bring out the snacks we had prepared, which we only realized while cleaning up, so we had to rush to distribute them to the participants.

At around 16:05, we finished our work. First, the group members discussed what kind of work they had done, what materials they had used, and their impressions. Then, everyone shared the results of the day’s work. Results from remote participants were also reported here. Finally, a group photo was taken with the “materials used today” in their hands. We always use a phrase ending in “i” as a watchword to bring a good smile, but “OZU Yasujiro” ends in “u”, so we chose “unagi” (eel) from today’s keywords.

The following items were written or added during the event hours

Reflection

There was only one participant who had not edited Wikipedia before. There were several experienced Wikipedia editors in the group, so we were able to support him as needed. We were able to have him edit for the first time successfully.

We had enough materials and subject matter, but many items remained that we did not get around to on the day. “‘Won’t end even if we get home’ is the Wikipedia Edit-a-thon” (by Arai, the lecturer). We would like to continue to add to it, little by little.

  • Additions after the event
  • Japanese Wikipedia main page “New article”
    • [[蓬莱屋 (とんかつ店)]] (Houraiya (tonkatsu restaurant)) 4/25
  • Japanese Wikipedia main page “Enhanced Article Award”
    • [[松竹大船撮影所]] (Shochiku Ofuna Studio) 5/8

Event Participation Report

How to edit a Wikipedia article about a film director you are unfamiliar with in a dignified manner: 9th Wikipedia Bungaku Experience

Tech News issue #21, 2023 (May 22, 2023)

Monday, 22 May 2023 00:00 UTC

Tech News: 2023-21

weeklyOSM 669

Sunday, 21 May 2023 12:00 UTC

 

09/05/2023-15/05/2023

lead picture

OSM Pedestrian Density Visualization [1] | © mvexel | map data © OpenStreetMap contributors

Breaking news

Mapping

  • [1] Martijn van Exel analysed the density of pedestrian infrastructure in the Boston and Dallas areas using visualisations of OpenStreetMap data.
  • Steffen Voß explained how to use OSM tags (to model real-world attributes of buildings etc.) to set visual effects in streets.gl, a 3D-based map application.
  • Piet Brömmel has integrated website contribution statistics based on the first year of contribution into his OpenStreetMap Statistics. He noted that in 2022, just under 50% of all edits were made using accounts that submitted their first changeset between 2018 and 2022.
  • Wilmer Osario, from Venezuela, wrote a forum post detailing how to add place nodes to farms, based on aerial photography, some local knowledge, and reason.

Community

  • BudgieInWA wrote about how they and like-minded people mapped Hyde Park, in Northbridge, Perth, Australia, in detail and met new friends at the same time.
  • Julien Minet compared the number of mapped addresses in OpenStreetMap with the number of addresses in the ICAR register for Wallonia, one of the three Belgian regions.
  • Rupert Allan (who some UK OSMers may remember from a project in South Wales) blogged about his invitation to consult on a new OSM project on mapping ancient wetlands in Jordan with a community of different ethnic groups.

OpenStreetMap Foundation

  • Steve Coast has suggested an alternative to the OSMF’s 2023 strategic plan. You can discuss it via the osmf-talk mailing list, on the community forum, or of course on Twitter.
  • At the start of the year, the OSMF Board put together their thoughts on what they saw ahead. Four months later, there is a lot to reflect on and assess – each board member presents their thoughts.

Events

  • The OpenStreetMap Philippines Community will host the MAPAtalks Volume 1 workshop on 27 May 2023.
  • You have less than a week and a half left to submit your application to host #StateoftheMap2024.
  • The State of the Map Europe 2023 organising team is looking for helping hands. At present, help with finding sponsors and establishing the programme committee is most important, but help is welcome for other tasks too. If you haven’t done so already, mark your calendar for 10 to 12 November. Subscribe for updates at stateofthemap.eu.
  • The preliminary agenda for the French SotM 2023 conference is now available.

Maps

Programming

  • Four OpenStreetMap-themed projects have been selected for the 2023 Google Summer of Code.
  • Ash Kyd explained how he used vector map technology to create a map of bike lanes in Brisbane, Australia.
  • Andrii Holovin, a member of the Ukrainian OpenStreetMap community, proposed some exciting ideas for OSM 2.0. His suggestions include utilising Git as a version control system, storing objects in the YAML format, and introducing various additional features to enhance the functionality of the API.

Releases

  • Peter explained the use of the ‘route hint’ and ‘custom model’ features of the GraphHopper application.
  • The May 2023 version of Organic Maps has been released.

OSM in the media

  • The online magazine BASIC thinking tested the bicycle navigation offered by OsmAnd.
  • GadgeteerZA reviewed Magic Earth for car navigation on Android Auto.

Other “geo” things

  • The geoEpi group attended the CGA 2023 ‘From Geospatial Research to Health Solutions’ conference at Harvard in Cambridge, Massachusetts. Along with others, Sven Lautenbach, from HeiGIT, and PhD student Steffen Knoblauch, from GIScience, provided a general introduction to the field of GIS health applications.
  • Ireland’s Environmentalists tweeted about EPAIreland’s Pollution Impact Potential (PIP) maps, which show the coverage of phosphorus pollution in Ireland due to high livestock activity and poorly draining soil. Excessive phosphorus pollution can result in excessive plant growth, algal blooms, and a reduction in the amount of oxygen in the water, which is not good for the aquatic species that live there.

Upcoming Events

| Where | What | When |
|---|---|---|
| Washington | OSM US Mappy Hour | 2023-05-18 |
| | UN Mappers – OSM and humanitarian mapping training – session #3 | 2023-05-18 |
| Toulouse | Réunion du groupe local de Toulouse | 2023-05-20 |
| Bremen | Bremer Mappertreffen (Online) | 2023-05-22 |
| San Jose | South Bay Map Night | 2023-05-24 |
| Bayonne | Cartopartie à Bayonne – 25 mai 2023 | 2023-05-25 |
| | UN Mappers – OSM and humanitarian mapping training – session #4 | 2023-05-25 |
| 左營區 | 2023年5月 OpenStreetMap 街景踏查團 | 2023-05-27 |
| | Singapore OG Training by HOTOSM AP-Hub OSM ON THE GO: OSM MOBILE APPLICATIONS | 2023-05-27 |
| | OSMF Engineering Working Group meeting | 2023-05-31 |
| Düsseldorf | Düsseldorfer OpenStreetMap-Treffen | 2023-05-31 |
| Bologna | Alluvione dell’Emilia-Romagna: editathon e mapping party al Salaborsa Lab | 2023-06-03 |

Note:
If you would like to see your event here, please add it to the OSM calendar. Only data entered there will appear in weeklyOSM.

This weeklyOSM was produced by Elizabete, MatthiasMatthias, PierZen, SomeoneElse, Strubbl, TheSwavu, andygol, darkonus, derFred, rtnf, s8321414, 冰觞沐雨.
We welcome link suggestions for the next issue via this form and look forward to your contributions.

What I've learned about data visualization

Saturday, 20 May 2023 14:22 UTC

For many people the first word that comes to mind when they think about statistical charts is “lie.”

– Edward R. Tufte

William Playfair, 1801, Statistical map showing the extent, population and income of the principal nations of Europe The birth of the pie chart. Playfair’s attempt to show the size of European countries by relating them to planets, saying: “we have a more accurate idea of the sizes of the planets, which are spheres, than of the nations of Europe […] all of which are irregular forms”
The first pie chart ever. William Playfair, 1801, Statistical map of the principal nations of Europe.

When I moved from engineering to management, people expected me to make charts.

After spending some time learning about data visualization, I’ve come to two important conclusions:

  1. Good data visualization is powerful
  2. Powerful data visualization is rare

But creating a compelling chart is an underrated superpower for engineers. Here are some ideas that helped me learn how to do that.

Learning from experts

The best books I’ve read on data visualization are:

Between those two books, you get a balance: the ideal (from Tufte) and the PowerPoint world (from Knaflic).

Tufte

Edward R. Tufte is professor emeritus of statistics, political science, and computer science at Yale.

And he’s a man upset by pie charts.

the only worse design than a pie chart is several of them.

– Edward R. Tufte

Tufte’s rules:

  • 🦑Maximize the data-ink ratio – Most of your chart should be data. Erase the parts that convey nothing.
  • 🍫Avoid chart junk – Avoid needless colors, shading, tickmarks, and gridlines: remove distractions from data.

Knaflic

Cole Nussbaumer Knaflic is the former manager of Google’s People Analytics team. She knows her way around a PowerPoint.[1]

Knaflic’s rules:

  • 📈Choose an appropriate visual display – Know your context, then pick your chart
  • 🏰Tell a story – Charts ought to communicate something; you should know what that something is.

Tufte’s principles: a case study

Take a look at this chart from the Wikimedia Foundation’s 2023–2024 budget projections:

Wikimedia Foundation 2023–2024 Draft Budget

This is a default Google Sheets chart for this data.

What I’m able to glean from this chart:

  • “Building analytics & ML Services”: the biggest, > 25%
  • “Features and functionality” and “Supporting volunteers”: smaller, ~20% each
  • “Fundraising,” “Protecting access,” and “General & Admin”: smaller still, ~10% each
  • The colors seem meaningless

Now consider what we glean from the table used to generate it:

| Program | Budget (millions) | Percent |
|---|---|---|
| Building analytics & ML services | $46.4 | 26.2% |
| Features and functionality | $39.7 | 22.4% |
| Supporting volunteers | $35.1 | 19.8% |
| General & Admin | $21.3 | 12.0% |
| Fundraising | $17.9 | 10.1% |
| Protecting access | $16.6 | 9.4% |
| Total | $177.0 | 100.0% |

This table gives us more information than the chart in a similar amount of space.

From the table, we learn:

  • Exact percentages—no need to guess
  • Exact dollar amounts
  • The total budget as a dollar figure

And we get rid of the meaningless colors.

OK—confession time: there were percentages on the original pie chart. I edited them out.

But I did that to prove a point. The pie chart is doing less work than the numbers. The slices of the pie add almost nothing to the numbers.

Tufte would prefer the table above to the pie chart above because the table:

  • Shows the data
  • Maximizes the data-ink ratio
  • Avoids chartjunk (like the meaningless colors)

And in this case, I agree: I like the table.
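A table like this is also cheap to regenerate from the raw numbers. Here is a minimal sketch in Python, assuming only the six dollar figures from the table above. The names `BUDGET`, `budget_rows`, and `format_table` are made up for illustration and do not come from the Wikimedia Foundation's budget documents.

```python
# Budget figures in millions of USD, copied from the table above.
BUDGET = {
    "Building analytics & ML services": 46.4,
    "Features and functionality": 39.7,
    "Supporting volunteers": 35.1,
    "General & Admin": 21.3,
    "Fundraising": 17.9,
    "Protecting access": 16.6,
}


def budget_rows(budget):
    """Return ((program, millions, percent) rows sorted largest first, total)."""
    total = sum(budget.values())
    ordered = sorted(budget.items(), key=lambda item: item[1], reverse=True)
    rows = [(name, amount, round(100 * amount / total, 1)) for name, amount in ordered]
    return rows, round(total, 1)


def format_table(budget):
    """Render the rows as plain aligned text: all data, no decoration."""
    rows, total = budget_rows(budget)
    width = max(len(name) for name, _, _ in rows)
    lines = [f"{'Program':<{width}}  {'Budget (M)':>10}  {'Percent':>7}"]
    for name, amount, pct in rows:
        money = "$" + format(amount, ".1f")
        lines.append(f"{name:<{width}}  {money:>10}  {pct:>6.1f}%")
    # The whole budget is 100% by definition, so the total row is not summed
    # from the rounded shares.
    total_money = "$" + format(total, ".1f")
    lines.append(f"{'Total':<{width}}  {total_money:>10}  {100.0:>6.1f}%")
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table(BUDGET))
```

Recomputing each share from the dollar amounts reproduces the percentages shown in the table.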

But choosing data visualizations is difficult.

How to choose a chart: visual perception accuracy ranked

If you intend to communicate, then people must be able to read your chart.

In 1985, Cleveland & McGill conducted what is still the most cited experiment on what charts are easiest to read. Their purpose was simple: rank standard charts by the number of errors people make while reading them.

Here’s their ranking of charts—from easiest to hardest to read:

| Name | Example |
|---|---|
| Position along a common scale | scatterplots, bar charts, sparklines |
| Positions along nonaligned scales | stacked bar charts |
| Length, direction, angle | pie charts, donut charts |
| Area | bubble charts, treemaps |
| Volume, curvature | 3d charts |
| Shading, color saturation | heatmaps |

But I love heatmaps! You can use heatmaps. Just understand that only a sophisticated audience can interpret heatmaps correctly.

Know your audience, then use something like the data viz catalog to choose your chart.
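The ranking can also be written down as a small lookup, which is handy when you have a few candidate charts and want the one that is easiest to read. This is only a sketch under my own assumptions: the list mirrors the ranking above, and `RANKING`, `rank_of`, and `easiest_to_read` are hypothetical names, not part of Cleveland & McGill's work or of any library.

```python
# Perceptual tasks ordered from easiest to hardest to read, with the example
# chart types listed for each task in the ranking above.
RANKING = [
    ("position along a common scale", ["scatterplot", "bar chart", "sparkline"]),
    ("positions along nonaligned scales", ["stacked bar chart"]),
    ("length, direction, angle", ["pie chart", "donut chart"]),
    ("area", ["bubble chart", "treemap"]),
    ("volume, curvature", ["3d chart"]),
    ("shading, color saturation", ["heatmap"]),
]


def rank_of(chart):
    """Return the 1-based rank of a chart type (1 = easiest to read)."""
    for rank, (_task, charts) in enumerate(RANKING, start=1):
        if chart in charts:
            return rank
    raise KeyError(f"unknown chart type: {chart!r}")


def easiest_to_read(candidates):
    """Pick the candidate whose perceptual task ranks highest."""
    return min(candidates, key=rank_of)


if __name__ == "__main__":
    print(easiest_to_read(["pie chart", "heatmap", "bar chart"]))
```

A lookup like this only encodes reading accuracy. It says nothing about audience or context, so treat it as a starting point and not as a rule.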

Tell a story

Charts are communication.

At their worst, as Tufte said, they’re little more than “devices for showing the obvious to the ignorant.”

But at their best, they’re a powerful way to intuitively communicate a lot of data in a small space.


  1. Tufte will not help you with PowerPoint. In his essay, “The Cognitive Style of PowerPoint” (which is included in the book “Beautiful Evidence”), he says: “bulleted outlines make us stupid” before going on to blame PowerPoint for the space shuttle Columbia disaster.


For nearly two decades, the Wikimedia Foundation has supported free access to the sum of all knowledge.

This ambitious goal would not be possible without the Wikimedia community—thousands of volunteer editors, admins, and functionaries of the Wikimedia projects who not only contribute content, but monitor for harmful material, stop the spread of misinformation, and create policies that determine what content belongs on the projects.

Since the projects are open, collaborative and driven by volunteer efforts, those volunteer editors are best able to respond to requests to change, update, or delete content from our projects. These requests come from governments and private parties, and sometimes also include attempts to obtain nonpublic user information. The Foundation evaluates all requests with an eye towards protecting privacy and freedom of expression. We support the Wikimedia communities’ prerogative to determine what educational content belongs on the projects.

Twice a year, we publish a transparency report outlining the number of requests we received, their types, countries of origin, and other information. The report also features an FAQ and stories about interesting and unusual cases.

Here are a few highlights from the report:

Content alteration and takedown requests. From July to December of 2022, we received 282 requests to alter or remove project content. 13 of these requests were Right to Erasure-based requests related to user accounts. When we receive such a request, we provide the user information on the community-driven vanishing process.

Copyright takedown requests. The Wikimedia communities work diligently to ensure that copyrighted material is not uploaded to the projects without an appropriate free license or exception, such as fair use. Most Wikimedia project content is therefore freely licensed or in the public domain. When we occasionally receive Digital Millennium Copyright Act (DMCA) notices asking us to remove allegedly infringing material, we conduct thorough investigations to make sure the claims are valid. From July to December of 2022, we received 26 DMCA requests, and granted only one. This low number is due to the hard work of community volunteers who ensure that content on the projects is properly licensed.

Requests for user data. The Wikimedia Foundation only grants requests for user data that comply with our requests for user information procedures and guidelines (which includes a provision for emergency conditions). Moreover, the Foundation collects very little nonpublic user information as part of our commitment to user privacy. Any information we do collect is retained for a short amount of time. Of the 29 user data requests we received, zero resulted in disclosure of nonpublic user information.


The Wikimedia Foundation’s biannual transparency report reaffirms our organization’s commitment to transparency, privacy, and freedom of expression. It also reflects the diligent work of the Wikimedia community members who shape the projects. We invite you to learn more about requests we received in the past six months in our comprehensive transparency report. For information about past reports, please see our previous blog posts.

Ellen Magallanes, Senior Counsel
Wikimedia Foundation

The transparency report would not be possible without the contributions of Raji Gururaj, Julianne Alberto, Jim Buatti, Leighanna Mixter, and Aeryn Palmer.

For Dr @ashadevos there are 14 @Wikipedia articles

Thursday, 18 May 2023 13:04 UTC

 

Hardly a "woman in red", Dr De Vos has many accomplishments chronicled in these Wikipedia articles. She presents herself with her colleagues on Facebook, and the graph of her co-authors should paint a similar picture; initially it did not. At first there were only a few publications to her name; they have since been expanded to 26. This introduced many co-authors, and some 112 co-authors are now missing.

Obviously, there is much more that could be done. Adding more papers and co-authors adds complexity to the Scholia of Dr De Vos. More distinctions could be added, as could talks at conferences and papers that were cited. I typically restrict myself to papers with a DOI and authors with an ORCID identifier, as they have the biggest network effect.

I was reminded by Greenpeace that some people give themselves nothing for their birthday. So I updated this Wikidata item. Who will notice or care? Like Greenpeace, Dr De Vos cares about whales; it is her specialty.
Thanks,
     GerardM

These students made Wikipedia’s information about fertility care and family planning more inclusive of the LGBTQ community. Their work continues to be read, with 1,500 views every day, well beyond the conclusion of their course. What other assignment can say the same?

Dr. Cynthia Gabriel’s course about LGBTQ Reproductive Health invites students at the University of Michigan to explore the biological, social, cultural, and legal experiences of LGBTQ+ family-making. Aurora Rynda and Airy Garcia both found themselves in the course last fall as they pursued degrees in biology with focuses on gender, health, and society.

Aurora and Airy worked in a group with six other students to improve Wikipedia’s article about in vitro fertilization (IVF). The article was already fairly comprehensive, but given the lens of their course, some gaps stood out to the students right away.

Noticing gaps in Wikipedia through a course lens

“The main gaps our group noticed in the IVF article were the overall lack of gender inclusive language and representation of the LGBTQ+ community,” Airy explained.

Airy Garcia

“The way the article only uses infertility as a reason to undergo IVF or other assisted reproductive technologies (ARTs) came as a shock to me,” Aurora added.

“We were not too surprised that these gaps existed given the general lack of awareness about LGBTQ+ reproductive health and the fact that the media tends to overrepresent cis and heteronormative individuals as the recipients of IVF,” said Airy.

“While editing this article we wanted to create a more inclusive page where individuals can acquire up to date and informative knowledge,” noted Aurora.

So the students got to work.

Taking initiative to correct self-identified gaps

After learning how to edit through Wiki Education’s resources, the group went about making changes. First, they added gender inclusive language throughout the article to better represent all parents. They showed the different forms that fertility care may take for same-sex couples and transgender parents. And they wrote of the unique challenges facing transgender expectant parents as they navigate insurance coverage. The students also changed the article’s main image (the one that appears on Google search results) to be more inclusive, having found a scientific image in the public domain to replace it with.

Students replaced the image on the left (BruceBlaus, CC BY 3.0) with the image on the right (public domain).

“Given the fact that we were taking a class on LGBTQ+ reproductive health at the time, we wanted to increase awareness of LGBTQ+ individuals accessing assisted reproductive technologies, especially IVF,” Airy continued. “This included changing gendered language to be more inclusive, adding information from emerging research, and dispelling commonly held misconceptions about LGBTQ+ reproductive health.”

“The main point I wanted readers to take away from the article was the future for assisted reproductive technologies and how ART is used not just for infertility, but it’s widely used for same-sex parents and other LGBTQ partners to create a family,” said Aurora. “It is so important to provide information regarding assisted reproductive technologies because we’re in a society where heterosexual reproduction is the large majority and there is a lack of representation for the LGBTQ community. I find it very frustrating when the LGBTQ community in particular gets marginalized and excluded from the reproductive conversation and the technologies aren’t as accessible. I believe that anyone who wants to have children should be able to, regardless of their identity.”

The Dashboard shows which students added what content to the live Wikipedia article. As the image above shows, Dr. Gabriel’s students are responsible for much of the article’s content about LGBTQ expectant parents.

Because of the public-facing nature of the assignment, students feel a responsibility to get it right, no matter the topic. They also tend to feel empowered by the act of correcting information that will be consulted by so many. The IVF article gets 1,500 views every day. Since December when the students stopped editing, 233K readers have viewed their contributions. That’s an incredible impact.

Depending on how the instructor structures the assignment, students often have autonomy over what topic they’ll improve on Wikipedia. This allows them to bring their interests, career aspirations, and identity to their work, as well as connect it to their other studies.

“As a woman of color, I understand what it feels like to feel under and misrepresented, especially in the health field and in research,” Airy shared. “Since I personally do not identify with the LGBTQ+ community, I wanted to ensure that my contributions to the IVF page were as accurate as possible and worked to uplift instead of exclude or misinform.”

Learning research skills along the way

Information on Wikipedia related to health is subject to additional checks and balances when it comes to sourcing. Aurora, Airy, and their group were especially intentional about the research that they cited, making sure it was peer-reviewed and up-to-date.

“Writing for Wikipedia showed me how important it is to be an expert in the topic you are writing about. You have to not only source all your edits, but you also need to comb through the article and find what parts of the article you find incorrect and need to be changed,” Aurora said. “It also taught me the importance of looking at the sources and not believing everything you read. When you question information you read, you enable yourself to learn more about the topic. I’ve learned to question literature and research on my own before agreeing with the opinions of others. I will take this into my field of study by educating others on the inclusivity of ART and the high demand to discover new technologies.”

Plus, it’s rewarding to write for a resource we all use. “This was my first time making significant changes to a Wikipedia page,” said Airy. “It felt nice to be able to expand representation on such an important topic!”

Interested in incorporating a Wikipedia assignment into your course? Visit teach.wikiedu.org to learn more about the free assignment templates and resources that Wiki Education offers to instructors in the United States and Canada.

Tech News issue #20, 2023 (May 15, 2023)

Monday, 15 May 2023 00:00 UTC

Tech News: 2023-20

A recent Wikipedia Research article aims to prove that the English Wikipedia deletion process is not biased. For some that is a loaded question, because it centers on the question of whether Wikipedia is equitable.

As so often, the article is all about English Wikipedia, and it has its own bias. English Wikipedia does not serve half the public of the Wikimedia Foundation, and much of the other half does not read English. The gender balance in English Wikipedia is, however, improving; the percentage of articles about women is slowly but surely increasing.

At issue in the article is whether the English Wikipedia deletion policies effectively cause harm through gender and race biases. Obviously there are more biases; you may be male and white, but when you are not from an Anglo-American background, chances for Wikipedia recognition are slim. If you care to research this, check out Wikidata; it includes a superset of what Wikipedia includes, and it is biased in this way as well.

When a Wikipedia article about a scientist is deleted, it does not follow that its Wikidata item is deleted, and given enough identifiers, it is likely that its related subset increases over time, tilting the "notability" balance. Even so, many important scientists are "scientists in red". One example is Prof Emily Fairfax, who is prominent for, among other things, explaining and demonstrating that beavers feature prominently in the fight against forest fires.

When English Wikipedia defends its own policies, it follows that they rely on the base assumptions in those policies. When those assumptions are questioned, their arguments are lost. Given that English Wikipedia represents a subset of "the sum of all knowledge" that is included in Wikidata, it follows that much of Wikipedia can be understood from such a perspective.

Wikidata has no "red links"; when a relation exists for a recipient of an award, there must be an item for both the award and the recipient. Wikipedia has one link in black to the "SIRS Lifetime Achievement Award", while Wikidata has a link to all recipients. They are linked to identified publications and other awards, and consequently the Scholia for the award is really informative.

Based on information like this, improved information is available that must wait for a Wikipedia volunteer. English Wikipedia is a victim of its success; it cannot fully maintain its information. The same can be said for Wikidata. It is, however, a superset, and it does not necessarily require a mastery of English.

With new technologies becoming more relevant, there is an avenue to improve the quality of any Wikipedia, inform people based on the data in Wikidata and improve on the quality of the information that we provide. 

Thanks,

     GerardM


weeklyOSM 668

Sunday, 14 May 2023 10:16 UTC

02/05/2023-08/05/2023

lead picture

Streets GL, a real-time 3D map renderer [1] | ⓘ Mapbox © powered by Esri | map data © OpenStreetMap contributors

Mapping

  • Following a workshop held by UN Mappers, a group of students from the Erasmus Mundus Masters program in Geospatial Technologies at the University of Jaume I, in Castellón de la Plana, Spain, analysed campus safety and identified potential vulnerabilities by mapping street lights in OSM.
  • PhysicsArmature has some tips for mapping rivers: staying simple.
  • Valerie Norton has written an article discussing various tags related to hitching posts and resting places for trail riding. She explored the uses and popularity of tags such as amenity=hitching_post, tourism=trail_riding_rest, amenity=animal_hitch, and amenity=horse_parking. The author contemplates the advantages of different tags and their hierarchical nature, highlighting the need for clearer guidelines, and additional tags in order to enhance mapping accuracy.

Mapping campaigns

  • Marjan Van de Kauter, from TomTom’s OSM team, has announced their plans to conduct edits in Luxembourg based on feedback received for their upcoming new TomTom map, ensuring that the edits add value to OpenStreetMap and do not conflict with recent community updates. The team will focus on editing highways, addresses, POIs, land use, buildings, and water initially, expanding to other features in the future. They will use the hashtags #tomtom and #tt_mapfeedback to provide updates on their progress and welcome feedback from the OSM community during this activity.

Community

  • In connection with the survey on communication in the OSM community (we reported earlier), Imre Samu clarifies the cultural differences in communication in different countries. He quotes from his sources and backs up his statements with three links that everyone who enters the international arena should read.
  • mapmeld blogged about the border of Belgium’s Baarle-Hertog and the Dutch Baarle-Nassau, which overlap in one town riddled with border crossings and enclaves. He included pictures of the weirdest places, including borders intersecting buildings.
  • Sango Bishiri Narcisse, based in Chad, is the UN Mapper of the Month for May.

OpenStreetMap Foundation

  • The Board of the OpenStreetMap Foundation has invited its members and the OSM community to participate in revising its Strategic Plan. The plan will be discussed in four phases over the next two months, starting with a focus on ‘Cluster B: Community Development for OSM’. Feedback is requested on missing elements, plan inconsistencies, urgent stratagems, and which ones are important for the success and growth of OpenStreetMap. Comments can be shared on the OSM Community forum or mailing list, and private feedback can be sent to the strategy team or individual board members.
  • The OSMF Board has written draft Fundraising Guidelines and is seeking feedback from the community about them. You can provide any feedback on the OSM Community forum, in the blog comments, or directly to the Board.

Education

  • Denrazir Atara, from HOT Open Mapping Hub Asia-Pacific, is hosting an online training session on Sunday 14 May for beginners and experienced mappers on how to use uMap.

OSM research

  • Ulrich et al. have assessed the potential of OSM to quantify land use and land cover change related carbon fluxes to the atmosphere for a regional case study, in the German federal state of Baden-Württemberg, in their new paper.

Maps

  • [1] Streets GL is an impressive real-time 3D map renderer designed to showcase OpenStreetMap data with stunning visual effects. This TypeScript project utilises WebGL2 and a render graph implementation to generate dynamic geometry, allowing for the visualisation of complex building shapes, roads, trees, and more.
  • karlos on Mastodon pointed out that while weeklyOSM provides a comprehensive list of events, it lacks a corresponding map to visualise their locations. He shared a test map featuring pins with some event information. The Calendar Widget from Jonathan Beliën shows how it could look.

Open Data

  • Jeff Underwood tweeted about recent enhancements to building height data in downtown Phoenix, comparing the information available in OpenStreetMap and Overture. He acknowledged the positive development and noted that Overture’s coverage extends beyond downtown, encompassing the entire region.

Software

  • Bellingcat’s latest geolocation tool, based on OpenStreetMap data, simplifies the process of identifying the location of images for investigative purposes. By selecting key features from an image, such as landmarks or structures, users can narrow down their search within a specified region of interest. The tool provides a list of potential matches on a map, enabling users to pinpoint the exact location for further investigation. With this tool, geolocating images has become more accessible for open-source researchers and investigators.
  • Wille Marcel, in a recent presentation during the OpenStreetMap US Mappy Hour, showcased OSMCha, a powerful tool for OpenStreetMap data analysis and quality control. The tutorial dives into the various functionalities starting at the 5:14 mark, providing a valuable resource for OSM enthusiasts.

Releases

  • Magic Lane has unveiled a new Android version of its Magic Earth navigation app, offering users privacy, offline capabilities, and advanced driver assistance features.

Did you know …

  • … about Open Etymology Map? It is a web application for displaying historical information about people whose names have been used as street or place names. This application uses information from the wikidata:* tag in OSM.
  • … about using Ctrl+F5 instead of F5 to refresh a webpage? Ziltoidium pointed out that after an edit, if you press F5 the map is reloaded, but the images continue to come from the cache. He recommends Ctrl+F5 instead and says that changes to the map are visible after 1 to 2 minutes.

OSM in the media

  • Researchers from Simon Fraser University have developed Canada’s first national open-source dataset of cycling infrastructure, aimed at promoting active transportation options and assisting decision-makers. The dataset, derived from OpenStreetMap, provides consistent information on bicycle infrastructure, allowing for better planning decisions.
  • In an interview with the Lower Saxony Tourism Network, Marie Witte, from Mittelweser-Touristik (MWT), reported on the useful cooperation with OSM mappers to optimise cycling and hiking tours using digital maps. To implement this cooperation, MWT, together with Tof99 and OSM_RogerWilco, has set up a working group in which mappers and tourism experts exchange information. The cooperation has allowed MWT to improve its suggestions and to ensure the data quality of the routes with the help of the OSM mappers.

Upcoming Events

Where What When
Salt Lake City OSM Utah Monthly Map Night 2023-05-11
Chippewa Township OpenStreetMap Michigan Meetup 2023-05-11
“Open- und OpenStreetMap-Daten in Blaulichtorganisationen” (Schweiz) 2023-05-11
Berlin 179. Berlin-Brandenburg OpenStreetMap Stammtisch 2023-05-11
Zürich OSM-Stammtisch 2023-05-11
Zaragoza esLibre 2023 2023-05-12 – 2023-05-13
Briançon Parlons d’OpenStreetMap 2023-05-12
Verona MERGE-it 2023-05-12 – 2023-05-13
Gap Parlons d’OpenStreetMap 2023-05-12
Sülysáp Mapping around Sülysáp before, after and during breaks of qbParty (demoscene) 2023-05-13 – 2023-05-14
Gap Cartopartie Durance à vélo dans le pays Gapençais ! 2023-05-13
City Of Vincent Social Mapping Saturday: Hyde Park 2023-05-13
Nanterre Paris Hack Weekend 2023-05-13 – 2023-05-14
Puerto López Notas OSM: Discutamos hashtags para incluir en las notas de OpenStreetMap 2023-05-13
Singapore OG Training by HOTOSM Ap-Hub CREATING ONLINE MAPS USING UMAP 2023-05-13
København OSMmapperCPH 2023-05-14
Grenoble Contribuez à OpenStreetMap avec votre smartphone 2023-05-15
臺北市 OpenStreetMap x Wikidata 月聚會 #52 2023-05-15
Lyon Réunion du groupe local de Lyon 2023-05-16
Bonn 163. Treffen des OSM-Stammtisches Bonn 2023-05-16
City of Edinburgh OSM Edinburgh Social 2023-05-16
Lüneburg Lüneburger Mappertreffen (online) 2023-05-16
Formação UN Mappers: OpenStreetMap e o mapeamento humanitário – sessão 9 2023-05-17
Zürich Missing Maps Zürich Mapathon 2023-05-17
Karlsruhe Stammtisch Karlsruhe 2023-05-17
Washington OSM US Mappy Hour 2023-05-18
Rīga State of the Map Baltics 2023 2023-05-17 – 2023-05-18
Toulouse Réunion du groupe local de Toulouse 2023-05-20
Singapore OG Training by HOTOSM AP-Hub CREATING ONLINE MAPS USING USHAHIDI 2023-05-20
Bremen Bremer Mappertreffen (Online) 2023-05-22
San Jose South Bay Map Night 2023-05-24
Bayonne Cartopartie à Bayonne – 25 mai 2023 2023-05-25
Singapore OG Training by HOTOSM AP-Hub OSM ON THE GO: OSM MOBILE APPLICATIONS 2023-05-27

Note:
If you would like to see your event here, please put it into the OSM calendar. Only data which is there will appear in weeklyOSM.

This weeklyOSM was produced by MatthiasMatthias, Strubbl, TheSwavu, barefootstache, darkonus, derFred, isoipsa, rtnf.
We welcome link suggestions for the next issue via this form and look forward to your contributions.

By Daria Cybulska, Director of Programmes and Evaluation at Wikimedia UK

Democracies rely on informed citizens to function effectively. Over recent years, new digital technologies have fundamentally altered the creation and consumption of media content, and introduced new challenges to democratic participation. The increased volume of news, the politicisation of social media, misinformation, disinformation, and the distracting of the public through fake news, along with the rise of polarised and radicalised groups whose own ideology is reinforced by ‘filter bubbles’, all combine to create untrustworthiness, bias and misrepresentation. These issues undermine democracy and its reliance on well-informed citizens. 

Information literacy has the power to counter this. At its heart, information literacy empowers citizens to access, create, consume and critically evaluate information. It builds understanding of the ethical and political issues associated with the use of information, including privacy, data protection, freedom of information, open access/open data and intellectual property. 

In my role as the Director of Programmes at Wikimedia UK, I’ve long believed that our workshops and training sessions make a difference in empowering people – by building their information literacy skills, providing an opportunity to collaborate, and capturing their heritage. In 2021, together with Agnes Bruszik, a research colleague, I delivered a project to critically investigate how engaging with Wikimedia projects contributes to the strengthening of civil society and democratic processes in the UK.

Our main inquiry was to understand how improving information literacy skills contributes to Wikimedia UK’s vision of a more tolerant, informed and democratic society. Does our work increase participants’ information literacy, and does this in turn lead to a more engaged civil society? We reviewed the current understanding and frameworks in the intersection of literacies, civic engagement and democratic participation, to see how information literacy has been found to support civic engagement. We then explored how Wikimedia UK’s work contributes to civic disposition skills. 

Our research concludes that Wikimedia’s activities can increase citizen engagement in democratic processes through our work in information literacy by 1) Providing open and free access to accurate information, 2) Improving information literacy skills of individuals, 3) Encouraging volunteering, and 4) Providing accessible collaborative infrastructure. 

“Information literacy is one of the most important skills of the future. Without understanding how, by who and in which ways knowledge and information is created and distributed, one cannot potentially evaluate the value and credibility of that information. The formulation of opinions, values, principles, or academic and historical referencing must be based upon reliable sources and credible interpretation and presentation of facts and data. Without citizens’ awareness of information manipulation, democratic participation is thus flawed. The Wikimedia movement is in a unique position to educate and encourage individuals to become more information literate, while also promoting democratic practices such as participatory decision-making, provisioning open access to platforms and information for even the most marginalised minority groups. These practices, in turn, create the know-how for more civic engagement in general.” – Agnes Bruszik

Crucially, freedom of expression and access to reliable information through Wikimedia projects increase intercultural dialogue and decrease the social isolation of minority groups. Wikipedia serves as a platform that can assist displaced or minoritised communities to express and maintain cultural identity. Our experience shows that groups organised around a shared interest, value or cause, and equipped with digital, information and collaboration skills, are more likely to engage in civic participation in public matters relevant for them. Moreover, learning about the culture of democratic participation and processes of engagement empowers individuals, equipping them with transferable skills.

“The rise of populism has been linked to a decline in interest in public affairs and we thought that, being less politically and socially active, people may be less capable of interpreting political phenomena and understanding the complexity of the management of public affairs.” – Science Direct

We are faced with a global trend towards a shrinking civil society space. There are fewer spaces where citizens can develop and practise key civic skills such as collaboration, self-representation, and working within a context of diversity and difference of opinion. This is much needed in any context, including the UK. Civic skills are broad in character and can be developed in a variety of contexts – including opportunities online. Wikipedia has the benefit of being a well known online space, meaning it has the recognition within a big audience that could then be engaged in civic activities. We can engage with people where they already are rather than needing to bring them to a new, unknown space. 

Many participants in Wikimedia UK activities (e.g. editing events) started out as individual editors, who then decided to bring wiki projects into their communities. In a recent survey of our community leaders, we asked whether participation in Wikimedia UK activities, such as running wiki events, had encouraged them to take part in other non-wiki activities (e.g. community organising, campaigning, other kinds of volunteering). One volunteer reports:

“Yes. In speaking to a volunteer for our charity, I became aware of the [community heritage project centring on a particular 19th century industrial action]. I created the Wikipedia page for […], a leading figure in the strike whose mentions elsewhere assured her notability, and through this spoke to the originators of [community heritage project]. I am now actively involved with the group, including as part of their education and community engagement sub-group. It’s likely that Wikipedia work will feature in this at some stage, as they were overjoyed with the […] page and very much convinced of the usefulness of more (and more accurate) Wikipedia representation.”- Community Leader response in a 2021-22 Wikimedia UK volunteer survey.

Working on Wikimedia UK projects can foster this spirit of working towards a common good (free knowledge for all), cooperation with others, and activism, which in the long run encourages an empowered civil society. We believe this can go a long way towards realising Wikimedia UK’s vision of a more informed, democratic and equitable society.

Explore the report

The post From editing articles to civic power – Wikimedia and Democracy appeared first on WMUK.

Reddit, Pushshift, and deletion

Friday, 12 May 2023 04:00 UTC

On TheoryOfReddit Brian Keegan has posted an open letter regarding Reddit’s tightening of their API access, especially the cutting off of Pushshift’s access.

Pushshift is/was a third-party repository of Reddit data – used by researchers and mods – that had difficulty keeping up with deletion requests, among other things. It was also used by those wanting to find deleted messages.

This issue – that Pushshift violated Reddit users’ privacy expectations by retaining data, requiring an additional opt-out step, and then failing to act quickly – is one of the purported reasons for its removal from the Reddit API.

For the past few years, I’ve been seeking to understand three questions relevant to this issue, especially for advice subreddits:

  1. How many users actually delete their posts?
  2. How long does it take for them to do so?
  3. Do users actually worry about their deleted messages surviving?

I answer all these questions in a draft (under review). For example, Table 1 shows levels of removal and deletion across varied subreddits.

Feedback on the draft is welcome. Of course, without Pushshift I can no longer extend the data itself.


Table 1 shows that removal and deletion are common, especially on the advice subreddits. The popular advice subreddits have significantly more deletions (48%) than other sensitive subreddits (32.4%), which have significantly more deletions than tech-related subreddits (20.2%). Moderation has increased over the years, with r/AmItheAsshole going from 14% to 47% to 78%!

Table 1: Percent of submissions deleted and [removed].
subreddit 2018-Mar+ 2020-Mar+ 2022-Mar+
tech subreddits 20.0% [38.1%]
sensitive subreddits 32.4% [16.2%]
Advice 51.6% [09.7%] 53.0% [12.3%] 47.4% [42.8%]
AmItheAsshole 45.8% [13.9%] 48.9% [47.1%] 43.1% [78.4%]
relationship_advice 55.9% [09.5%] 58.9% [09.8%] 53.7% [48.0%]

The popular technology-related subreddits consisted of: Android, apple, audiophile, buildapc, DataHoarder, electronics, gadgets, hardware, ipad, linux, mac, sysadmin, techsupport, web, windows. The sensitive subreddits were those studied by Gaur et al. (2019): Anxiety, BPD, BipolarReddit, BipolarSOs, StopSelfHarm, SuicideWatch, addiction, aspergers, autism, bipolar, depression, opiates, schizophrenia, selfharm; cripplingalcoholism is not included because it was made private earlier in 2022.

This Month in GLAM: April 2023

Wednesday, 10 May 2023 22:33 UTC
  • Albania report: International Roma Day Edit-a-thon in Albania, 2023
  • Brazil report: “Every Book its Public” Campaign and Strategic Committee on Libraries
  • Czech Republic report: What’s new at GLAM in the Czech Republic
  • Indonesia report: GLAM Mini Grants; Structured Data Marathon VIII; Wikisource Online Workshop
  • Italy report: Bridges between Wikimedia and culture
  • New Zealand report: BHL Whitepaper and outreach for Citizen Science Month and WeDigBio, Auckland Museum suburbs project update, New Zealand Women in Architecture Wikidata Project
  • Poland report: Another meeting of EU GLAM Coordinators; Guided tours for Wikipedians in museums in Krakow; Presentation on Art in Wikipedia; Online training on the basics of copyright law; Polish monuments among the top winners of WLM
  • Sweden report: ISOF workshop; More articles from students; SAAB veterans shared their knowledge during metadata edit-a-thon; ArkDes edit-a-thons
  • UK report: Democratising knowledge and cultural diversity
  • USA report: Into the Wikiverse; Earth Day 2023 Bushwick
  • Content Partnerships Hub report: Wikipedia day pitches by FAO and IEA
  • Wiki Loves Living Heritage report: Activities are starting!
  • WMF GLAM report: Biodiversity Heritage Library whitepaper and the #1Lib1Ref campaign
  • Calendar: May’s GLAM events

By Leah Emary, Wikimedian in Residence at The Mixed Museum and Connected Heritage Project Lead

Introduction and Overview

The Wikimedian in Residence partnership continued at the Mixed Museum from January to March 2023, building on the initial stage of the residency that began in September. This blog posting describes the initiatives focussed on during this time and next steps that both the Mixed Museum and Wikimedia UK might like to take to build on the residency. The End of Residency Report can be found here.

January to March 2023

Sustainable digital volunteer programme for the Mixed Museum

As described in the first report from the residency, editing Wikimedia projects is an ideal basis for a digital volunteering programme for the Museum. However, after the scope and scale of recruiting for and managing a volunteer programme became clear, it seemed desirable to fund a volunteer coordinator for the Museum who could deliver this with expertise and focus, rather than making it part of Chamion’s work. 

I outlined what would be required to create a sustainable virtual volunteering programme for the Mixed Museum, with the hopes that this could be used for a future funding bid. You can read the proposal here: Sustainable digital volunteer programme for the Mixed Museum

We had initially planned to create a bespoke, self-study training programme for volunteers based on a set of Google slides and embedded videos taken from Zoom-based training. Because the editing interface of Wikipedia changed quite dramatically in January 2023, the training videos and screenshots were quickly out of date and less suitable for self study, which altered our plans.

Rather than re-record, we took time to consider the implications of future changes to the interfaces making self study videos obsolete, and the considerable investment they take to remain up to date and useful. As the museum doesn’t have anyone to do that ongoing maintenance work, the bespoke training programme would quickly go out of date. We decided to rely on three existing resources for online training (Training Library [Programs & Events Dashboard], The Wikipedia Adventure, and The Introduction to Wikipedia) for a more sustainable future. For more information on which training we decided on and how to contextualise it, see the Sustainable digital volunteer programme for the Mixed Museum document.

Heritage Dot 2.0 Roundtable

Caption: The Connected Heritage team (Leah Emary and Dr Lucy Hinnie) from Wikimedia UK moderated a panel at the Heritage Dot Conference consisting of Dr Victoria Araj, Dr Jane Secker and Dr Chamion Caballero. The panel was chaired by Hope Williard.

In March 2023, Chamion participated in a roundtable discussion moderated by the Connected Heritage team and two other Connected Heritage partners, Dr Jane Secker and Dr Victoria Araj, as part of the Heritage Dot 2.0 conference hosted by the University of Lincoln. 

The discussion touched on how engagement with Wiki-based projects enabled these three cultural heritage organisations to improve the accessibility of their collections, while simultaneously empowering volunteers and members through embedded digital upskilling. The Mixed Museum’s Wikipedia edits were discussed as an example of ways that open knowledge can place overlooked cultural histories into the dominant narrative. Chamion also described the legacy the Residency will have on the Museum’s future projects. 

We were honoured to hear Josie Fraser from the National Lottery Heritage Fund mention the roundtable as a highlight of the conference during her closing remarks. 

Queen Mary University London Microinterns 2023

Two student interns, Leyi and Shannon, joined the Mixed Museum for four weeks in March 2023. The internship followed a similar model to one we ran last year. More about the micro-internship format is in this blog posting: March is Wiki micro-internship month at The Mixed Museum and Manar al-Athar.

This year, the interns worked from a Trello board of Wiki tasks. Both Leyi and Shannon focussed on the military history of African American soldiers based in the United Kingdom during and after World War 2. While Shannon focussed on adding personal accounts to articles which read like lists of dates, Leyi was interested in adding social history to articles on military bases in England. Leyi experienced some pushback from other editors, which she describes in her blog-based reflection [link here], and was resilient in the face of it. The interns had interesting conversations with Chamion around who feels ownership over different parts of history and how our work editing Wikipedia can have an impact on what is generally accepted as ‘important’.

What Chamion and I discovered, having run this internship twice now, is that, though the platform is the same and the work is similar each year, Wikipedia editing is like a mirror in that each person sees something different reflected back at them in the process. See the Internship Dashboard for 2023.

Museum Ethnographers Group Conference 2023

The work we did at the Residency around copyright and Wikimedia Commons (highlighted here) was the subject of a paper delivered to the Museum Ethnographers Group Conference in Cambridge in April 2023. Slides for Museum Ethnographers Group Conference presentation.

In the presentation, the example of the Brown Babies images sits alongside work done by Martin Poulter at the Khalili Collections and Lucy Moore’s recuperative work to enhance the representation of museums in Oceania on Wikimedia projects. The three projects engage in a conversation about the challenges of editing Wiki in today’s context and how we can work together to best surmount them and create a more equitable digital space for open knowledge.

Looking at visitor numbers to The Mixed Museum 

Chamion has long been interested in what the Museum’s Google Analytics data tells her about the number of visitors clicking through from Wikipedia links to The Mixed Museum. These so-called referral links make up a small but significant portion of the Museum’s audience. In the period before the Connected Heritage partnership, the Museum had very little traffic from Wiki sources. In fact, it was listed seventh of the referral channels, with 25 visitors coming directly from Wiki links. The top referral site then was a National Archives blog post, with 117 visitors. Since March 2022, the National Archives referral rate has stayed relatively steady, while the Wiki channel has seen an increase of 1584%!

What is also interesting is that those who reach the Museum via Wiki sources are spending an average of 3.44 minutes on the site, compared to before the Connected Heritage partnership when they visited for an average of 1.27 minutes. Clearly, through more sustained and targeted editing, we have not only attracted more Wiki users to The Mixed Museum, but increased the engagement of these visitors with the exhibitions.

We can also see all the pages which reference the Mixed Museum by using the MassViews Analysis tool. 

Next steps

Though Leah’s Residency is coming to an end, the relationship between the Museum and Wikimedia UK will go on. Chamion will be contributing her insight and knowledge to a new research project dedicated to understanding the barriers that small and medium sized heritage organisations face when contributing to open knowledge. We hope to run the microinternships again in March 2024 and ideas for pursuing the digital volunteering programme are in the works. 

Interested in hosting a Wikimedian in Residence?

If you are involved with a heritage or cultural organisation in the United Kingdom and you think a Wikimedian in Residence might be good for your organisation, please talk to us about it. You can book a half hour meeting with the Connected Heritage team via Calendly or drop us an email.

The post Celebrating The Mixed Museum residency as it comes to an end appeared first on WMUK.

Learn why we transitioned the MediaWiki platform to serve traffic from multiple data centers, and the challenges we faced along the way.

Wikimedia Foundation provides access to information for people around the globe. When you visit Wikipedia, your browser sends a web request to our servers and receives a response. Our servers are located in multiple geographically separate datacenters. This gives us the ability to quickly respond to you from the closest possible location.

Six Wikimedia Foundation data centers around the world. Two application data centers located in the United States, in Ashburn and Carrollton. Four caching data centers, in Amsterdam, San Francisco, Singapore, and Marseille.
Data centers, Wikitech.

You can find out which data center is handling your requests by using the Network tab in your browser’s developer tools (e.g. right-click -> Inspect element -> Network). Refresh the page and click the top row in the table. In the “x-cache” response header, the first digit corresponds to a data center in the above map.

HTTP headers from en.wikipedia.org, shown in browser DevTools. The "x-cache" header is set to CP4043. The "server" header says MW2393.

In the example above, we can tell from the 4 in “cp4043”, that San Francisco was chosen as my nearest caching data center. The cache did not contain a suitable response, so the 2 in “mw2393” indicates that Dallas was chosen as the application data center. These are the ones where we run the MediaWiki platform on hundreds of bare metal Apache servers. The backend response from there is then proxied via San Francisco back to me.
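The lookup described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and only the mappings for 2 (Dallas/Carrollton) and 4 (San Francisco) come from the example. The other digits are assumptions based on the six sites listed for the map.

```python
import re

# Digit-to-site mapping. Only 2 and 4 are confirmed by the example above;
# the remaining entries are assumptions for illustration.
DATA_CENTERS = {
    "1": "Ashburn",
    "2": "Carrollton (Dallas)",
    "3": "Amsterdam",
    "4": "San Francisco",
    "5": "Singapore",
    "6": "Marseille",
}


def data_center_for(hostname: str) -> str:
    """Return the site for a server name such as 'cp4043' or 'mw2393'."""
    match = re.match(r"[a-z]+(\d)\d*", hostname.strip().lower())
    if match is None:
        return "unknown"
    return DATA_CENTERS.get(match.group(1), "unknown")


print(data_center_for("cp4043"))  # San Francisco
print(data_center_for("MW2393"))  # Carrollton (Dallas)
```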

Why multiple data centers?

Our in-house Content Delivery Network (CDN) is deployed in multiple geographic locations. This lowers response time by reducing the distance that data must travel, through (inter)national cables and other networking infrastructure from your ISP and Internet backbones. Each caching data center that makes up our CDN contains cache servers that remember previous responses to speed up delivery. Requests that have no matching cache entry yet must be forwarded to a backend server in the application data center.

If these backend servers are also deployed in multiple geographies, we lower the latency for requests that are missing from the cache, or that are uncachable. Operating multiple application data centers also reduces organizational risk from catastrophic damage or connectivity loss to a single data center. To achieve this redundancy, each application data center must contain all hardware, databases, and services required to handle the full worldwide volume of our backend traffic.

Multi-region evolution of our CDN

Wikimedia started running its first datacenter in 2004, in St Petersburg, Florida. This contained all our web servers, databases, and cache servers. We designed MediaWiki, the web application that powers Wikipedia, to support cache proxies that can handle our scale of Internet traffic. This involves including Cache-Control headers, sending HTTP PURGE requests when pages are edited, and intentional limitations to ensure content renders the same for different people. We originally deployed Squid as the cache proxy software, and later replaced it with Varnish and Apache Traffic Server.

In 2005, with only minimal code changes, we deployed cache proxies in Amsterdam, Seoul, and Paris. More recently, we’ve added caching clusters in San Francisco, Singapore, and Marseille. Each significantly reduces latency from Europe and Asia.

Adding cache servers increased the overhead of cache invalidation, as the backend would send an explicit PURGE request to each cache server. After ten years of growth both in Wikipedia’s edit rate and the number of servers, we adopted a more scalable solution in 2013 in the form of a one-to-many broadcast. This eventually reaches all caching servers, through a single asynchronous message (based on UDP multicast). This was later replaced with a Kafka-based system in 2020.

Screenshot of the "Bicycle" article on Wikipedia. The menu includes Create account and Log in links, indicating you are not logged-in. The toolbar includes a View source link.
When articles are temporarily restricted, “View source” replaces the familiar “Edit” link for most readers.

The traffic we receive from logged-in users is only a fraction of that of logged-out users, while also being difficult to cache. We forward such requests uncached to the backend application servers. When you browse Wikipedia on your device, the page can vary based on your name, interface preferences, and account permissions. Notice the elements highlighted in the example above. This kind of variation gets in the way of whole-page HTTP caching by URL.

Our highest-traffic endpoints are designed to be cacheable even for logged-in users. This includes our CSS/JavaScript delivery system (ResourceLoader), and our image thumbnails. The performance of these endpoints is essential to the critical path of page views.

Multi-region for application servers

Wikimedia Foundation began operating a secondary data center in 2014, as contingency to facilitate a quick and full recovery within minutes in the event of a disaster. We exercise full switchovers annually, and we use it throughout the year to ease maintenance through partial switchover of individual backend services.

Actively serving traffic from both data centers would add advantages over a cold-standby system:

  • Requests are forwarded to closer servers, which reduces latency. 
  • Traffic load is spread across more hardware, instead of half sitting idle. 
  • No need to “warm up” caches in a standby data center prior to switching traffic from one data center to another.
  • With multiple data centers in active use, there is institutional incentive to make sure each one can correctly serve live traffic. This avoids creation of services that are configured once, but not reproducible elsewhere.

We drafted several ideas into a proposal in 2015, to support multiple application data centers. Many components of the MediaWiki platform assumed operating from one backend data center, for example that a primary database is always reachable for querying, or that deleting a key from “the” Memcache cluster suffices to invalidate a cache. We needed to adopt new paradigms and patterns, deploy new infrastructure, and update existing components to accommodate these. Our seven-year journey ended in 2022, when we finally enabled concurrent use of multiple data centers!

The biggest changes that made this transition possible are outlined below.

HTTP verb traffic routing

MediaWiki was designed from the ground up to make liberal use of relational databases (e.g. MySQL). During most HTTP requests, the backend application makes several dozen round trips to its databases. This is acceptable when those databases are physically close to the web servers (<0.2ms ping time). But, this would accumulate significant delays if they are in different regions (e.g. 35ms ping time).
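To make the difference concrete, here is a small worked calculation. It assumes, purely for illustration, that "several dozen" means 36 sequential round trips, and it uses 0.2ms as the local ping time even though the text gives that as an upper bound.

```python
# Assumption: "several dozen" is taken as 36 round trips for illustration.
ROUND_TRIPS = 36


def database_wait_ms(round_trips: int, ping_ms: float) -> float:
    """Total time spent waiting on sequential database round trips."""
    return round_trips * ping_ms


same_dc = database_wait_ms(ROUND_TRIPS, 0.2)    # databases next to the web server
cross_dc = database_wait_ms(ROUND_TRIPS, 35.0)  # databases in another region

print(f"same data center:  {same_dc:.1f} ms")   # 7.2 ms
print(f"cross data center: {cross_dc:.1f} ms")  # 1260.0 ms
```

Under these assumptions, the same request would spend over a second waiting on the network instead of a few milliseconds.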

MediaWiki is also designed to strictly separate primary (writable) from replica (read-only) databases. This is essential at our scale. We have a CDN and hundreds of web servers behind it. As traffic grows, we can add more web servers and replica database servers as-needed. But, this requires that page views don’t put load on the primary database server — of which there can be only one! Therefore we optimize page views to rely only on queries to replica databases. This generally respects the “method” section of RFC 9110, which states that requests that modify information (such as edits) use HTTP POST requests, whereas read actions (like page views) only involve HTTP GET (or HTTP HEAD) requests.

The above pattern gave rise to the key idea that there could be a “primary” application datacenter for “write” requests, and “secondary” data centers for “read” requests. The primary databases reside in the primary datacenter, while we have MySQL replicas in both data centers. When the CDN has to forward a request to an application server, it chooses the primary datacenter for “write” requests (HTTP POST) and the closest datacenter for “read” requests (e.g. HTTP GET).

We cleaned up and migrated components of MediaWiki to fit this pattern. For pragmatic reasons, we did make a short list of exceptions. We allow certain GET requests to always route to the primary data center. The exceptions require HTTP GET for technical reasons, and change data at the same low frequency as POST requests. The final routing logic is implemented in Lua on our Apache Traffic Server proxies.
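The routing rule can be sketched as follows. The production logic is written in Lua on Apache Traffic Server, so this Python version is only an illustration. The exception path and all names are hypothetical, and sending every method other than GET and HEAD to the primary is a conservative assumption.

```python
PRIMARY = "primary"

# Hypothetical exception list: GET paths that must still reach the primary.
# The article does not name the real exceptions.
PRIMARY_ONLY_GET_PATHS = {"/example/needs-primary"}


def choose_backend(method: str, path: str, closest: str) -> str:
    """Pick the application data center for a request the CDN must forward."""
    if method.upper() in ("GET", "HEAD") and path not in PRIMARY_ONLY_GET_PATHS:
        return closest  # "read" requests go to the nearest data center
    return PRIMARY      # "write" requests and listed exceptions go to the primary


print(choose_backend("GET", "/wiki/Bicycle", closest="secondary"))   # secondary
print(choose_backend("POST", "/wiki/Bicycle", closest="secondary"))  # primary
```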

Media storage

Our first file storage and thumbnailing infrastructure relied on NFS. NetApp hardware provided mirroring to standby data centers.

By 2012, this required increasingly expensive hardware and proved difficult to maintain. We migrated media storage to Swift, a distributed file store.

As MediaWiki assumed direct file access, Aaron Schulz and Tim Starling introduced the FileBackend interface to abstract this. Each application data center has its own Swift cluster. MediaWiki tries writes to both clusters, and the “swiftrepl” background service manages consistency. When our CDN finds thumbnails absent from its cache, it forwards requests to the nearest Swift cluster.
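The write-to-both-then-reconcile idea can be sketched like this. The classes are hypothetical in-memory stand-ins, not the FileBackend interface, the Swift API, or swiftrepl. They only show how a failed write on one cluster can be tolerated and repaired later.

```python
class InMemoryStore:
    """Stand-in for one media storage cluster (hypothetical, not the Swift API)."""

    def __init__(self, name: str, available: bool = True):
        self.name = name
        self.available = available
        self.objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        if not self.available:
            raise ConnectionError(f"{self.name} is unreachable")
        self.objects[key] = data


class MultiWriteBackend:
    """Attempts each write on every cluster and reports which ones failed."""

    def __init__(self, stores: list[InMemoryStore]):
        self.stores = stores

    def store(self, key: str, data: bytes) -> list[str]:
        failed = []
        for store in self.stores:
            try:
                store.put(key, data)
            except ConnectionError:
                failed.append(store.name)
        return failed


def repair(source: InMemoryStore, target: InMemoryStore) -> int:
    """Copy objects missing from target, as a background sync job might."""
    copied = 0
    for key, data in source.objects.items():
        if key not in target.objects:
            target.put(key, data)
            copied += 1
    return copied
```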

Job queue

MediaWiki has featured a job queue system since 2009, for performing background tasks. We took our Redis-based job queue service, and migrated to Kafka in 2017. With Kafka, we support bidirectional and asynchronous replication. This allows MediaWiki to quickly and safely queue jobs locally within the secondary data center. Jobs are then relayed to and executed in the primary data center, near the primary databases.

The bidirectional queue helps support legacy features that discover data updates during a pageview or other HTTP GET request. Changing each of these features was not feasible in a reasonable time span. Instead, we designed the system to ensure queueing operations are equally fast and local to each data center.
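A toy model of "queue locally, execute in the primary" is sketched below. It uses plain in-memory queues rather than Kafka, and the class and function names are hypothetical. The point is only that the enqueue never waits on another data center.

```python
from collections import deque


class DataCenterQueue:
    """In-memory stand-in for the job queue of one data center."""

    def __init__(self, name: str):
        self.name = name
        self.pending = deque()

    def push(self, job: dict) -> None:
        # Local append only: no cross-data-center round trip on the request path.
        self.pending.append(job)


def relay(secondary: DataCenterQueue, primary: DataCenterQueue) -> int:
    """Asynchronously move queued jobs to the primary; returns how many moved."""
    moved = 0
    while secondary.pending:
        primary.pending.append(secondary.pending.popleft())
        moved += 1
    return moved


def run_all(primary: DataCenterQueue, handler) -> list:
    """Execute every pending job in the primary, near the primary databases."""
    results = []
    while primary.pending:
        results.append(handler(primary.pending.popleft()))
    return results
```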

In-memory object cache

MediaWiki uses Memcached as an LRU key-value store to cache frequently accessed objects. Though not as efficient as whole-page HTTP caching, this very granular cache is suitable for dynamic content.

Some MediaWiki extensions assumed that Memcached had strong consistency guarantees, or that a cache could be invalidated by setting new values at relevant keys when the underlying data changes. Although these assumptions were never valid, they worked well enough in a single data center.

We introduced WANObjectCache as a simple yet robust interface in MediaWiki. It takes care of dealing with multiple independent data centers. The system is backed by mcrouter, a Memcached proxy written by Facebook. WANObjectCache provides two basic functions: getWithSet and delete. It uses cache-aside in the local data center, and broadcasts invalidation to all data centers. We’ve migrated virtually all Memcached interactions in MediaWiki to WANObjectCache.
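The two operations can be illustrated with a simplified model. This is not the WANObjectCache implementation: the class name is hypothetical, plain dictionaries stand in for the per-data-center Memcached clusters, and the broadcast that mcrouter performs is reduced to a loop.

```python
class WanCacheSketch:
    """Cache-aside reads in the local data center, deletes sent to all of them."""

    def __init__(self, local_dc: str, clusters: dict):
        self.clusters = clusters          # data center name -> cache dictionary
        self.local = clusters[local_dc]   # the cache this server reads and fills

    def get_with_set(self, key, compute):
        """Return the cached value, or compute it and store it locally."""
        if key in self.local:
            return self.local[key]
        value = compute()
        self.local[key] = value
        return value

    def delete(self, key) -> None:
        """Invalidate the key in every data center, not only the local one."""
        for cache in self.clusters.values():
            cache.pop(key, None)
```

Callers never write a new value on change. They delete the key, and the next read in each data center recomputes it from the database.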

Parser cache

Most of a Wikipedia page is the HTML rendering of the editable content. This HTML is the result of parsing wikitext markup and expanding template macros. MediaWiki stores this in the ParserCache to improve scalability and performance. Originally, Wikipedia used its main Memcached cluster for this. In 2011, we added MySQL as the lower tier key-value store. This improved resiliency from power outages and simplified Memcached maintenance. ParserCache databases use circular replication between data centers.

Ephemeral object stash

The MainStash interface provides MediaWiki extensions on the platform with a general key-value store. Unlike Memcached, this is a persistent store (disk-backed, to survive restarts) and replicates its values between data centers. Until now, in our single data center setup, we used Redis as our MainStash backend.

In 2022 we moved this data to MySQL, and replicate it between data centers using circular replication. Our access layer (SqlBagOStuff) adheres to a Last-Write-Wins consistency model.
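To make the Last-Write-Wins model concrete, here is a minimal sketch of the general technique. It does not show how SqlBagOStuff is implemented: the Versioned record, the lww_apply function, and the use of a plain timestamp are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Versioned:
    value: str
    written_at: float  # time of the original write, in seconds


def lww_apply(store: Dict[str, Versioned], key: str, incoming: Versioned) -> None:
    """Apply a local or replicated write, keeping the most recent one."""
    current = store.get(key)
    # Ties keep the existing value in this sketch. A real system needs a
    # deterministic tie-breaker so that all replicas pick the same winner.
    if current is None or incoming.written_at > current.written_at:
        store[key] = incoming


# Two replicas receive the same two writes in opposite order.
replica_a: Dict[str, Versioned] = {}
replica_b: Dict[str, Versioned] = {}
older = Versioned("draft", written_at=100.0)
newer = Versioned("final", written_at=200.0)

lww_apply(replica_a, "key", older)
lww_apply(replica_a, "key", newer)
lww_apply(replica_b, "key", newer)
lww_apply(replica_b, "key", older)  # arrives late and is ignored

print(replica_a["key"].value, replica_b["key"].value)
```

Because the winner depends only on the write time and not on arrival order, both replicas converge on the same value even when replication delivers writes out of order.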

Login sessions were similarly migrated away from Redis, to a new session store based on Cassandra. It has native support for multi-region clustering and tunable consistency models.

Reaping the rewards

Most multi-DC work took the form of incremental improvements and infrastructure cleanup, spread over several years. While we did see latency reductions from some of the individual changes, we mainly looked out for improvements in availability and reliability.

The final switch to “turn on” concurrent traffic to both application data centers was the HTTP verb routing. We deployed it in two stages. The first stage applied the routing logic to 2% of web traffic, to reduce risk. After monitoring and functional testing, we moved to the second stage: route 100% of traffic.
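A staged rollout of verb-based routing could look roughly like the sketch below. It is an illustration under stated assumptions, not the production routing logic: the choose_data_center function, the set of read-only verbs, and the hash-based bucketing of requests are all hypothetical.

```python
import zlib

# Assumption: these verbs are treated as "read" requests.
READ_METHODS = {"GET", "HEAD", "OPTIONS"}


def choose_data_center(
    method: str,
    request_id: str,
    nearest: str,
    primary: str,
    rollout_percent: int,
) -> str:
    """Pick a data center for a request during a percentage rollout."""
    # Hash the request into a stable bucket from 0 to 99, so the same
    # request ID always lands on the same side of the rollout.
    bucket = zlib.crc32(request_id.encode("utf-8")) % 100
    in_rollout = bucket < rollout_percent

    # Reads inside the rollout go to the nearest data center.
    # Writes, and any request outside the rollout, go to the primary.
    if in_rollout and method.upper() in READ_METHODS:
        return nearest
    return primary


# Stage two: 100% of traffic uses verb routing.
print(choose_data_center("GET", "req-1", "secondary", "primary", 100))
print(choose_data_center("POST", "req-1", "secondary", "primary", 100))
```

Setting rollout_percent to 2 corresponds to the first stage described above, and 100 to the second.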

We reduced latency of “read” requests by ~15ms for users west of our data center in Carrollton (Texas, USA), such as logged-in users in East Asia. Previously, we forwarded their CDN cache-misses to our primary data center in Ashburn (Virginia, USA). Now, we can respond from our closer, secondary data center in Carrollton. This improvement is visible in the 75th percentile TTFB (Time to First Byte) graph below. The time is in seconds. Note the dip after 03:36 UTC, when we deployed the HTTP verb routing logic.

Line graph dated 6 September 2022, plotting Singapore upstream latency. It was previously around 510ms and drops to around 490ms after 3 AM.

About this post

Featured image credit: Wikimedia servers by Victor Grigas, licensed CC BY-SA 3.0.