In the bustling landscape of Wikimedia events and conferences, each gathering is a unique tapestry woven with diverse themes, cultures, and attendee profiles. The meticulous planning and dedication put forth by the Wiki Women’s Camp Core Organizing Team (COT) were directed not only toward shaping an enriching program but also toward curating exceptional experiences beyond the conventional agenda.

With India hosting its first truly global Wikimedia conference in a long time, it was immensely important for us to showcase the culture, diversity, and hospitality of the country while staying true to the core theme of the WWC conference – Map Up, Rise Up!

REDEFINING CONFERENCE PROGRAMMING

Ensuring great experiences for conference participants is paramount: the participant experience goes beyond the core program and directly shapes the overall success and long-term impact of the event. While a well-designed program is crucial for delivering valuable content, the participant experience encompasses many elements that contribute to engagement, satisfaction, connections, and lasting impressions.

Furthermore, a focus on participant experience reflects the organizer’s commitment to attendee well-being, safety, and comfort. Seamless logistics, thoughtful use of technology, and cultural showcases are a few ways to provide a hassle-free experience, allowing participants to concentrate on the content and interactions.

Ultimately, the success of a conference isn’t solely measured by the content presented but by the enduring impact on participants. Prioritizing their experience contributes to building an engaged community, encouraging future attendance, and establishing the conference as a reputable and sought-after event within the movement. 

Here’s how the Wiki Women’s Camp COT focused on the experiential enhancement of the Camp attendees, with Indian hospitality at its center. (This may be of use while planning your own conference or convening.)

EASE OF COMMUNICATION AND FAMILIARITY

Participant Telegram Group: One of the first measures adopted by the COT was to create a participant Telegram group where members of the COT shared key information, reminders to follow the UCoC, follow-ups, and responses to queries. This also helped participants become familiar with one another and begin bonding.

Travel Guide: Closer to the conference, the COT created a robust and comprehensive Travel Guide, which was shared with all the participants ahead of the conference through email, Telegram messages, and sessions. This guide included all the important information participants needed before, during, and after travel.

EESHA Mascot: The WWC COT launched EESHA, the camp’s mascot. EESHA aimed to spread positive energy and create a vibrant atmosphere throughout the event. EESHA was also the carrier of all important updates, connecting with participants on Telegram to share key information about their travel, logistics, and the program schedule.

P.S.: Did you know EESHA was manned (yes, manned) 24/7 during the camp by two male volunteers?

PARTICIPANT COMFORT

Pick-ups and Drop-offs: To ensure our participants didn’t have to worry about the logistics of getting to the venue, the organizers provided airport pick-ups and drop-offs for all the participants, and EESHA got in touch with them with the details.

Accommodation: The accommodation was chosen with participant comfort and ease of accessibility in mind. To make the participants’ experience even richer, the hotel management arranged a cultural welcome and performance, customized key card jackets, a customized welcome platter and amenities, and more. All to ensure our participants felt warm and comfortable at each step of their WWC 2023 journey.

INDIAN CULTURE AT THE HEART

The heart of the experience lay in the cultural showcases. From Punjabi and Rajasthani performances to historic city tours and a memorable closing dinner, every moment was a celebration of India’s rich heritage and hospitality.  

Additionally, the food and beverage options were curated not only with dietary preferences and hydration in mind but also as a showcase of the gastronomic delights a trip to India has to offer; they were invitations to immerse in the warmth and diversity of the host country.

In the pages of the Wiki Women’s Camp, the COT scripted an emotional narrative that went beyond the ordinary, leaving participants not only enriched with knowledge but also embraced by the unforgettable tapestry of India’s culture and hospitality. As the echoes of Punjabi rhythms and the laughter from city tours lingered, it was clear that the COT had not just hosted a conference; they had crafted a cherished chapter in the collective memory of the Wikimedia community.

Photo credits:

https://commons.wikimedia.org/wiki/File:WWC_COT_4.jpg
https://commons.wikimedia.org/wiki/File:WWC2023_Eesha_(2).svg
https://commons.wikimedia.org/wiki/File:Radhika_Mamidi_captures_WWC_2023_22.jpg
https://commons.wikimedia.org/wiki/File:Welcome_delights.jpg
https://commons.wikimedia.org/wiki/File:Radhika_Mamidi_captures_WWC_2023_02.jpg

Advocating for Freedom of Panorama in South Africa

Friday, 26 January 2024 14:46 UTC

A summary of the Wikimedia chapter’s efforts to encourage more inclusive copyright exceptions.

In September 2012, Wikimedia South Africa launched its first event since being legally established only seven months before. Wiki Loves Monuments 2012 in South Africa was intended to be a celebration of all of the country’s heritage through photographs of public monuments, both new and old. Instead, we found that this was impossible. This began the chapter’s long advocacy adventure to change the country’s copyright law – an adventure which is not yet finished.

Background

Soon after launching Wiki Loves Monuments, we discovered that we were not allowed to accept photographs or multimedia of recently built monuments. This is because South African law has no copyright exception for creative works located in public spaces, such as national monuments or the facades of buildings. Although the photographer owns any copyright to the picture they take, the original creator of the work – e.g. a building – still retains a copyright in any reproduction, which in this case is the photograph. This meant that for Wiki Loves Monuments, permission would have to be given by both the photographer and the original artist/architect/owner of the publicly located work.

This was not a problem for older colonial-era monuments, as any copyright that might have existed for them had long expired, placing them in the public domain. Such monuments could therefore be freely photographed, and the pictures shared online and uploaded to Wikimedia Commons. This created a perverse situation where we could not celebrate recently built monuments erected to remember South Africa’s more recent history, such as the struggle against apartheid. This limitation also applies to the thousands of people who might take photographs of these recent works and unknowingly share them on social media. Freedom of Panorama fixes this by granting a copyright exemption for works located in public spaces such as public statues, monuments, and the facades of buildings.

Following this experience, Wikimedia South Africa decided in 2014 to advocate for the adoption of a Freedom of Panorama exception in South African copyright law. A volunteer (the author) was selected to drive the advocacy effort, which helped ensure accountability and allowed for specialization of expertise on the issue. But first we needed to learn how to advocate.

Starting off: advocating for change

The chapter was fortunate as our friends at Creative Commons South Africa introduced us to stakeholders involved in copyright reform advocacy. We then spent the next three years attending workshops, public participation processes, and conferences focused on amending South Africa’s copyright legislation. Through these efforts, the chapter learnt a great deal about the government’s public consultation process, built relationships, shaped our message, and met like-minded groups who were also working to expand copyright exceptions for the public good. We also submitted a number of proposals to parliament making the case for Freedom of Panorama to be included in the Bill updating copyright law.

In 2017, the government announced that the parliamentary committee set up to draft the Copyright Amendment Bill would start public hearings. This was our chance to get Freedom of Panorama adopted; but first we needed a broader and stronger mandate from the South African Wiki community to pursue this with full confidence. A strong consensus was obtained from the local Wiki community, and a public petition calling for Freedom of Panorama was launched using banner ads on Wikipedia in South Africa. The chapter was invited to give testimony to parliament on the necessity of including Freedom of Panorama in the Copyright Amendment Bill.

Throughout 2018, the Bill was discussed during open sessions of the parliamentary committee. Crucially, a representative of Wikimedia South Africa (the author) sat in on most sessions. This turned out to be important, as we noticed that the committee forgot to discuss Freedom of Panorama. Because of our presence in the poorly attended public sessions, we were able to remind committee members not to omit it, which ensured that a Freedom of Panorama clause was included in the final draft of the Bill.

During this period, Wikimedia South Africa, along with groups representing librarians, documentary filmmakers, educationists, and the blind community, co-founded ReCreate, an umbrella organisation that provided a platform for allies to communicate, coordinate, and share resources when advocating for expanded copyright exceptions, like Freedom of Panorama and Fair Use, for the public good. ReCreate would turn out to be an important ally in advocating for the Bill.

An election apportionment diagram of the first vote passing the Bill. The lower house of Parliament passed the Copyright Amendment Bill in 2018 with 197 votes in favour, 4 votes against and 199 absent.

On 5 December 2018, the lower house of parliament passed the Copyright Amendment Bill. Unable to outvote the parties in support of the Bill, opposition parties tried to block it by ensuring that less than 50% of the house was present to vote: they sent only a token presence to express their discontent with the Bill and hoped that not enough pro-Bill MPs would attend. This did not work, as 201 out of 400 sitting members of parliament were present on the day. The Bill then went on to the upper house of parliament, the Council of Provinces, for a second round of voting, which took place three months later.

It was a close call but the Bill was passed by both houses of parliament and sent to the President of South Africa for signing. At the time, we did not know how far away we still were and hoped that the Bill would be gazetted into law within six months.

The long haul: keeping the Bill alive

The President ended up sitting on the Bill for two years before sending it back to parliament to reconsider six issues of concern, even though the South African Constitution states that a decision on a Bill should take no longer than six months. This delay resulted in one of our allies, Blind South Africa, taking the presidency to the Constitutional Court. In addition to Freedom of Panorama, the Bill also included a section ratifying the Marrakesh Treaty, which granted copyright exceptions for transcribing written content into accessible formats for the blind community, something they urgently needed. Three years later, the High Court judged that not ratifying the treaty in a timely manner was a denial of the blind community’s right to access knowledge.

Throughout this time, opponents of the Bill waged a campaign in the media spreading misinformation and disinformation about it. A particular target of the campaign was to create confusion and outrage over the adoption of Fair Use, which would replace the Fair Dealing copyright exemption system of the old act. Critics of the Bill misleadingly stated that people would be able to legally pirate entire books or whole songs under Fair Use, thereby robbing authors and musicians of their royalties. This is totally false and played off the public’s limited understanding of copyright and the complex legal jargon around it.

To combat this misinformation and disinformation, supporters of the Bill had to spend a lot of energy correcting these accusations in the media to mitigate the damage they caused. Having a large network of allies to spread this effort across was very helpful, but with hindsight the impact would have been greater if the Bill’s allies had had the same access to resources as its opponents.

Some of the strongest opponents of the Bill were the royalty collecting societies, who argued strongly against Fair Use. It is worth noting that the Performers Protection Bill, which is intimately and inseparably linked to the Copyright Amendment Bill, increases the penalties for collecting societies that fail to pay royalties to authors. The old copyright act simply imposes a fine that was set in 1978 and never adjusted for inflation, whilst the new Bill would increase the penalty to up to a year of revenue and/or up to five years in prison. Preventing passage of the Copyright Amendment Bill would prevent this change in penalties, which serves the interests of some of the more dishonest collecting societies.

The second vote to pass the Bill took place on 1 September 2022. Parliament passed it with 163 in support and 45 against. MPs coloured in green voted in support of the Bill and those against are coloured in red.

By the beginning of 2022, the South African Parliament had almost completed its review of the Presidency’s concerns with the Bill and held a number of public hearings. An amended version of the Bill was voted on in parliament and passed for a second time with 163 votes in support, 45 against and no abstentions. The following year the amended Bill was sent on a public participation tour of all of South Africa’s provinces after which the Council of Provinces again voted on it; passing it in June 2023 with only 2 out of 9 provinces opposing the Bill.

Currently, the concern is that the Bill will not be passed into law before the next general election later in 2024. The uncertainty over the outcome of the election, and its impact on the passage of the Bill into law, remains an issue to watch. The adventure continues: we are so close to getting Freedom of Panorama in South Africa… and yet still so far.

Lessons learnt

Throughout this process Wikimedia South Africa learnt a number of valuable lessons that others might also benefit from. These are:

  • Be patient and persistent. Five years is very quick, ten years is more likely, but it can take as long as twenty or more years in some cases to amend a law. The longer you work at it, the more likely it is that you will be successful.
  • Be bold. Don’t be afraid to contact legislators and their staff, ask them for face-to-face meetings, or invite them to events to better understand the issues. Don’t be afraid to write to or talk to the media about your issues. Communication is key.
  • Be consistent. Consistently respond to false information about your cause in the media and be consistent in your message about why your issue is important.
  • Cultivate friends and allies. This will allow you to amplify your collective voice and share resources and insights.
  • Know your adversaries. Understand who they are and why they oppose your position. It will help you develop strategies to more effectively communicate your message, work around them and overcome any challenges. In some cases you might be able to convince them to support you and turn them into allies.
  • Get organised. The better organised your community of supporters and your network of allies are, the more likely it is that you will succeed. An organised, focused and motivated group is an effective and hard to ignore group that can get things done.

It is important for us to recognise that we had four significant strokes of good luck on our side. One, we had a compelling narrative that was difficult to challenge in South African politics and were able to illustrate our complex legal issue with pictures; this helped people understand it quickly and easily. Two, the Bill has the support of the majority of Members of Parliament. Three, Wikimedia South Africa was able to quickly learn the process of advocating to Parliament for legal change thanks to the friendly advice of a well-informed partner organisation (Creative Commons South Africa). Four, Wikimedia South Africa had volunteers who were prepared to drive the advocacy process for as long as it takes to get Freedom of Panorama.

We have had bad luck as well, such as a Presidency that has been very slow to review the Bill and a well-resourced network of energetic (and often unscrupulous) adversaries who oppose it.

If you are interested in learning more about Wikimedia South Africa’s advocacy efforts in support of Freedom of Panorama and Fair Use then please check out our advocacy page on Meta-Wiki.

Image credits:

Image collage for the December 2023 issue of ‘Don’t Blink.’ Image by the Wikimedia Foundation, CC BY-SA 4.0, via Wikimedia Commons.

Welcome to “Don’t Blink”! Every month we share developments from around the world that shape people’s ability to participate in the free knowledge movement. In case you blinked last month, here are the most important public policy advocacy topics that have kept the Wikimedia Foundation busy.

The Global Advocacy team works to advocate for laws and government policies that protect the volunteer community-led Wikimedia model, Wikimedia’s people, and the Wikimedia movement’s core values. To learn more about us and the work we do with the rest of the Foundation, visit our Meta-Wiki webpage, follow us on X (formerly Twitter) (@WikimediaPolicy), and sign up for our Wikimedia public policy mailing list or quarterly newsletter.

________

Protecting Wikimedia’s people
(Work related to privacy and countering surveillance)

Signing a letter to support legislative efforts to reform US mass surveillance
[Read our letter urging US Congress to limit the National Security Agency’s surveillance]

As part of the Wikimedia Foundation’s efforts to move United States (US) Congress toward surveillance reform, we signed a letter alongside the Mozilla Foundation and various other organizations and companies urging legislators to support reform proposals like the Government Surveillance Reform Act (GSRA) and the Protecting Liberty and Ending Warrantless Surveillance Act (PLEWSA).

Furthermore, we asked members of Congress to strengthen these proposals by limiting the scope of surveillance targeting so that fewer people are swept up in the National Security Agency’s (NSA) massive surveillance system. We also warned lawmakers that the continuation of widely documented abuses not only impacts the privacy of people online, but also erodes the trust of communities and individuals in the internet, weakening its economic and social potential. You can read our letter here.

Protecting Wikimedia’s values
(Work related to human rights and countering disinformation)

Input on UN Code of Conduct for information integrity on digital platforms
[Read our submission, available on Wikimedia Commons]

The United Nations (UN) is increasing its focus on digital policies and on threats specific to the online information environment. As part of this work, the UN decided to develop a Code of Conduct for information integrity on digital platforms. The result of this public consultation will be a document released by the UN Secretariat outlining best practices and recommendations to improve the quality of information online, particularly on digital platforms. The Wikimedia Foundation participated in the open call for submissions, offering input based on the Wikimedia model, which is uniquely rooted in transparency, user protection, and community-led content development. For more details, read our input, which is available on Wikimedia Commons.

Discussing the role of generative AI in mis- and disinformation at the NYC Tech Salon

Costanza Sciubba Caniglia, our Anti-Disinformation Strategy Lead, spoke at an event organized by the Technology Salon NYC (TSNYC) titled “What Can We Do About GenAI Fueling Disinformation?” Members of international organizations, corporations, civil society, and academia, including WITNESS, Google, UNESCO, and the Associated Press (AP), attended to discuss the increasingly significant role of generative AI in both propagating and combating misinformation and disinformation. Attendees agreed that it is critical to center global and frontline voices when prioritizing needs and solutions around these threats, which extend to other closely related technologies such as deepfakes and synthetic media.

The event also represented an opportunity to speak about the use of machine learning (ML) and AI technologies on Wikipedia and other Wikimedia projects. Costanza spoke about how AI works best as an augmentation for the work that humans do, how Wikipedia can be an antidote to disinformation, and how ML and AI technologies can help. In addition, she highlighted the crucial role of Wikipedia data in training large language models (LLMs), and discussed with attendees how to make LLMs more reliable in general and ensure a healthy online information environment.

Protecting the Wikimedia model
(Work related to access to knowledge and freedom of expression)

Wikimedia Foundation files a brief asking US Supreme Court to strike down Texas and Florida laws
[Read the Foundation’s public statement and our blog post on the lawsuits and the brief]

In early December 2023, the Wikimedia Foundation filed an amicus (“friend-of-the-court”) brief with the US Supreme Court supporting legal challenges to laws in Texas and Florida that limit how internet platforms can engage in content moderation. The laws threaten the right to freedom of expression as guaranteed by the First Amendment of the US Constitution, which also protects the right not to speak or express a particular viewpoint. We filed the brief calling on the Court to strike down both laws as unconstitutional.

An amicus brief is a document filed by individuals or organizations who are not part of a lawsuit but who have an interest in the outcome of the case and want to educate the court about their concerns. In our brief, we explained why and how these laws pose a significant risk to the Foundation’s ability to host Wikipedia and other Wikimedia projects, and to the ability of Wikimedians to continue governing the projects as they have for more than 20 years. First, the laws are vague and broadly written, and could create unnecessary uncertainty around legal risks for volunteer editors. Second, even if they were written more clearly, forcing websites or the communities that manage them to host speech they do not want would violate their constitutional rights, as well as overwhelm the projects with content that is not encyclopedic, neutral, and/or verifiable.

For more information, read the Foundation’s public statement and our blog post on the lawsuits and the brief. 

Discussing the Wikimedia Foundation’s perspective on the EU Digital Services Act
[Read our blog post about the EU’s content moderation law and how it can promote access to knowledge and protect online communities]


In August 2023, the European Union’s (EU) new Digital Services Act (DSA) started to apply to the largest platforms and search engines in the EU, which includes Wikipedia. The law aims to establish common rules that will govern online content moderation and will significantly impact the Wikimedia projects and online spaces worldwide. A subset of DSA rules will apply from mid-February 2024. Worldwide, there is much interest in seeing how the law is implemented and performs in practice, and also how internet platforms will comply and respond.

We believe that the obligations that the DSA establishes can not only improve people’s experience of the internet, but also protect their rights online. However, we caution that governments and regulators should be careful not to harm smaller community-led platforms when implementing the new requirements, and warn legislators outside of the EU that they cannot copy and paste the DSA without providing adequate protections for free expression and user rights.

Read our blog post to learn about our engagement in the DSA legislative process, the new obligations that will apply to Wikimedia projects, the law’s global implications, and how governments can promote and protect communities’ ability to own and govern online spaces together.

Discussing the importance of public policy that promotes digital public goods, technology, and the internet

Amalia Toledo, our Lead Public Policy Specialist for Latin America and the Caribbean, attended the “Seminário Big Techs, Informação e Democracia na América Latina” (in English, “Seminar on Big Tech, Information, & Democracy in Latin America”) held in São Paulo, Brazil. The event aimed to discuss the impacts of technology on Latin American and Caribbean democracies, offer proactive agendas on how to curb the growing power of large digital platforms in the region, and contribute regional perspectives to global debates. Participants at the event included civil society representatives, university students, academic researchers, and journalists from independent and alternative media.

Amalia attended the event in order to share with important policy actors the significance of regulation that can protect and promote digital public goods, technology, and the internet—in other words, how these offer vital frameworks to promote the vision that the Wikimedia Foundation and volunteer communities share of a world where everyone, everywhere, can participate in the sum of human knowledge. In addition, she shared reflections on how the Wikimedia projects can strengthen journalism worldwide. You can read more about the event here (in Portuguese).

Organizing a workshop with the Information Society Project at Yale University

The Wikimedia Foundation partnered with Yale University’s Information Society Project (ISP) and the Center for Democracy and Technology (CDT) to organize a day-long workshop in December. During the event, Foundation staff joined a small group of experts from civil society and academia for in-depth discussions about the past, present, and future of internet regulations. Workshop participants learned more about the Wikimedia model, and those attending brainstormed ideas on how to protect and promote public interest projects like Wikipedia at times when policymakers are seeking to curb the perceived harms of social media platforms and other internet phenomena. 

ICYMI: Explaining why Section 230 is important to Wikimedia projects and the diversity of the online information ecosystem
[Read our three-part blog post series on Section 230: part 1, part 2, and part 3]

Much of the modern internet, including Wikipedia and the Wikimedia projects, could not exist without Section 230. This fundamental US law protects internet platforms from lawsuits over content shared by users online and over the platforms’ content moderation decisions. The statute provides the legal certainty that empowers volunteer editors to create free knowledge on the Wikimedia projects, and that enables the Wikimedia Foundation to host them.

In the US, there is much discussion, by both legislators and courts, about changing or terminating the law. The Foundation is working to explain how doing so could ruin Wikipedia. We have opposed bills like the EARN IT Act in the past because of their unintended consequences.

Since Section 230 is becoming more widely discussed, we published a blog post series to inform US members of Congress, their staff, and the voters who elect them about some key issues: 1) Why and how websites and services depend on the statute; 2) Misunderstandings and assumptions in legislative reforms that can lead to negative impacts on the online information ecosystem; and, 3) Alternatives that can constructively help to refocus internet regulation and empower communities and individuals to transform the internet for the better.

For more, explore the first post, the second post, and/or the third and final post in our series about Section 230. 

________

Follow us on X (formerly Twitter), visit our Meta-Wiki webpage, join our Wikipedia policy mailing list, and sign up for our quarterly newsletter to receive updates. We hope to see you there!

Digital safety during election time

Friday, 26 January 2024 14:38 UTC

In 2024, over four billion people from more than 40 countries – representing over half of the world’s population – are set to participate in elections. As the election season rolls in across the globe, it’s not just the physical polling stations that buzz with activity, but also the digital world. Our screens become arenas of vibrant discussions, debates, and information sharing. But, as we dive into this digital fervor, we also enter a time of increased digital mischief.

As we edit, delete, block, protect, and debate across Wikimedia projects, this eventful election season is a crucial reminder of the importance of reinforcing digital safety practices. With political debates intensifying and online vulnerabilities escalating, prioritizing digital safety is paramount. Engaging in these online activities responsibly and with heightened awareness of digital safety is essential to minimize potential risks.

What to consider and what you can do

Doxing and online harassment 

Online harassment and doxing become particularly insidious tools during election seasons, a time marked by heightened political discourse and intense emotions. In this environment, sharing personal information online can inadvertently turn you into a target. It is crucial to navigate these digital waters with care. Be cautious about what you share on social media and other platforms. Take the time to understand and adjust your privacy settings to safeguard your personal data. Additionally, stay informed about the latest digital safety practices and consider using tools that help protect your online identity. Remember, in the heat of political debates, your digital safety should remain a top priority.

Social engineering, phishing and malicious links

Be quick – do you see anything wrong with this link: https://www.wikipedia.com? No? Check again!

Remaining vigilant is crucial when navigating phishing messages and malicious links, particularly during periods of increased political activity. Cybercriminals often capitalize on these times to dispatch emails and messages that mimic legitimacy, but whose true intent is to harvest personal information or disseminate malware. These deceptive communications can come through various channels, including emails, SMS, WhatsApp messages, Instagram direct messages, or even phone calls. Always approach them with a healthy dose of skepticism and take the time to verify the sender’s identity. A common tactic employed by these fraudsters is to create a sense of urgency in their messages, as we did above. Don’t let urgency cloud your judgment. Instead, pause, reflect, and then respond thoughtfully. This measured approach is key to protecting yourself from falling prey to these sophisticated cyber schemes. Also, check out shira.app [external link] to learn to identify and defeat phishing attempts, or this phishing quiz [external link] by Google.
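As a small illustration of the “check again” advice, here is a minimal Python sketch of the kind of domain check that catches lookalike links. The allowlist is made up for the example; the point is that a lookalike such as wikipedia.com fails the check even though it reads almost like the real wikipedia.org.

    from urllib.parse import urlparse

    # Illustrative allowlist; in practice, verify against the domains
    # you actually know and trust.
    TRUSTED_DOMAINS = {"wikipedia.org", "wikimedia.org"}

    def looks_trusted(url):
        """Return True only if the URL's hostname is a trusted domain or a
        subdomain of one; lookalike domains fail."""
        host = (urlparse(url).hostname or "").lower()
        return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)

    print(looks_trusted("https://en.wikipedia.org/wiki/Main_Page"))  # True
    print(looks_trusted("https://www.wikipedia.com"))                # False: .com, not .org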

Implementing a grant?

Have you considered how upcoming elections might alter the safety and security landscape for your project? What is accepted in one socio-political context could become sensitive or contentious in another, especially during the heightened atmosphere of an election period. Consider how these shifts might necessitate adjustments in your grant implementation. For instance, the location of your project activities might coincide with a political rally, or the topics you are exploring could suddenly become politically charged. How will you navigate these changes? It’s crucial to plan for potential adjustments and think about effective communication strategies to inform your participants and team members. Reflecting on these scenarios and preparing a responsive strategy ensures that your project remains relevant, safe, and successful, irrespective of the evolving political climate. See the Safety for Grant recipients course on learn.wiki to learn about holistic security, how to do a risk analysis, and broadly about the challenges of engaging with the movement in different parts of the world.

When communication is shut down

The #KeepItOn coalition [external link] report notes that internet and communication shutdowns are a notable occurrence around election times. Such disruptions can leave individuals isolated, unable to communicate with loved ones or access crucial news updates – a situation that becomes even more distressing during the critical moments of an election. To prepare for these challenges, we’ve compiled a comprehensive guide with strategies to stay connected. To learn about the relationship between internet shutdowns and democratic elections, see this handbook [external link] by Access Now.

Call to action

If an election is on the horizon in your area, or if you’re simply keen on deepening your understanding of digital safety best practices, we encourage you to get in touch with us. We’re here to help organize a digital safety session for you and your community. We’re excited about the opportunity to collaborate and empower you and your community with essential digital safety skills. 

And in the unfortunate event that you or someone in your community experiences digital attacks, don’t feel alone. The Digital First Aid Kit[external link] is an excellent resource for immediate guidance and self-help to troubleshoot digital emergencies. It’s designed to provide you with quick, effective strategies to address various digital security issues and can also connect you to hands-on help from a civil society support team. 

Our team is always ready to provide support too.  Whether it’s addressing specific threats or offering advice on best practices, we’re here to help you navigate through these challenging situations. Reach out to us anytime at talktohumanrights@wikimedia.org and check out the Digital Safety Resources Page on Meta.

AlphabeticalZürich project

Friday, 26 January 2024 02:39 UTC


The AlphabeticalZürich project looks terrific:

The AlphabeticalZürich project is a pretty ambitious project: I want to take at least one picture in every street of Zürich, in alphabetical order.

I want to capture the interesting, the whimsy, the pretty for every street of Zürich

I do not aim for exhaustivity in every street. I may not even walk the whole street. For instance, Badenerstrasse is spanning 5km: that’s a project in itself!

I have no idea when I’m going to give up 😉 It’s an ambitious project, and life may happen before its end, or I may lose interest. We’ll start, and see where we go from here!

So, here we go, from Aargauerstrasse to Zypressenstrasse!

What is this project?, 2023-08-17

It makes me wonder about the Fremantle streets project that I seem to be engaged in, where I'm currently definitely lacking in guidelines or rules to keep me focussed.

The Freo streets list started just as a way to give more context to the Fremantle Society Photographic Survey, but there's lots more that can be done with it I think. One of the troubles with it is that there's no solid rule about how to handle historical vs current places/buildings/etc. — if a building has been demolished and another built, or a large block subdivided into smaller ones, does that mean the website should have one page that explains the location through time, or separate pages for each incarnation of the place? This is really one of the great strengths of the Wikipedia model, I think: that ambiguities and confusion can be handled really well, because you just write it out in words, and link to everything.

Mahmoud “Mody” Hassan and his cat, Shaki. Image courtesy Mody Hassan, all rights reserved.

Mahmoud “Mody” Hassan is a freshman at Rutgers University Newark. It was his first term of college, and he had just started at Rutgers after being born and raised in Egypt. So there were a lot of changes in his life when he showed up to Dr. Laura Porterfield’s class on “Education and Social Change in the Black Diaspora” and learned he’d be writing a Wikipedia article as a class assignment. Dr. Porterfield was participating in a project run by Wiki Education and funded by the Broadcom Foundation to increase the number of biographies on Wikipedia of diverse people in STEM.

“The concept of making a page on Wikipedia sounded too crazy and made me consider dropping the class,” Mody admits. “But I never did.”

Despite that initial nervousness, Mody dug in on the assignment. He chose to write about Claibourne Smith, an African American chemist who helped advance Delaware State University, a Historically Black College and University (HBCU).

“After reading about him and how much information about him was nowhere to be found, I decided that he should get a page on Wikipedia,” he says.

Mody tackled two different areas of learning: (1) learning about Claibourne Smith and his achievements, and (2) learning how to edit Wikipedia. While he didn’t enjoy the technical difficulties of creating the page, he loved publishing the final article.

“So this assignment took longer than the other assignments, but the most important and different thing is that this article was going to go online for people to read and consume knowledge from, and it was my responsibility that those people weren’t misled by what I wrote,” Mody says. “The feeling of having the assignment done and having it published online for everyone to see was such a flex, and I loved that part.”

He hopes more faculty participate; if they’re going to ask students to write the equivalent of an article anyway, why not publish it on Wikipedia so everyone can benefit? And the experience has taught him useful skills about editing Wikipedia. While he’s busy with schoolwork during the term, he’s already planning to create more articles on the Arabic Wikipedia this summer.

“Making a page about Smith meant a lot to me because I felt like it was a piece of art that I kept looking at,” Mody says.

Call Out for WikiCon Australia 2024 Planning Subcommittee

Wednesday, 24 January 2024 12:00 UTC


WikiCon Brisbane 2023 was a great success but we want to do even better! Image: Ali Smith.


We're reaching out to invite two members of our Wikimedia Australia community to join our WikiCon 2024 Planning Subcommittee.

This subcommittee will be at the heart of organising the next unforgettable WikiCon Australia! Depending on your interests and experience, you could be involved in anything from shaping the event agenda to suggesting logistics and outreach.

At this stage, WikiCon 2024 will be held in November, and your input on plans throughout this time will be invaluable.

If you're keen to be part of sharing ideas for this exciting opportunity, please email Alice Woods by Thursday 29 February 2024. We would love to have your insights and energy on board!

Wiki Education has exciting news to ring in the New Year! We are thrilled to announce a three-year partnership with the Mellon Foundation’s Higher Learning program that will enable 16,000 higher education students studying the humanities to represent more complete and accurate narratives of the human experience on Wikipedia. Beginning this year, this partnership will amplify our Wikipedia Student Program’s Knowledge Equity initiative, which brings to light stories and perspectives that are often missing, misrepresented, or have little written about them. This project will be the biggest social-justice campaign for the humanities in Wikipedia’s history.

“Our world is full of rich human stories and horizon-expanding knowledge that should be accessible to all, but that have been left out, suppressed, or otherwise hidden from public view,” says Maria Sachiko Cecire, Program Officer in Higher Learning at the Mellon Foundation. “We are delighted to partner with Wiki Education in this ambitious social justice campaign to bring more information about the full range of human creation and expression to the largest and most consulted reference work on the planet. We are especially thrilled that the significant research and writing skills of humanities faculty and students at colleges and universities across the country will power this essential work.”

We define knowledge equity content as that which pertains to the narratives of women and other non-male gender identities, those with disabilities, LGBTQ+ people, people of color, and others whose perspectives have been historically marginalized by dominant groups. Indeed, the dominant group of Wikipedia contributors is currently well-educated white men from North America and Europe. Featured Articles on Wikipedia’s landing page are largely authored by this group and often lack a social justice lens.

Wikipedia editors, including new ones, tend to stick with the way things have always been written if not guided to where the gaps exist. Wiki Education’s resources and support will empower students to add content about knowledge equity, while still adhering to Wikipedia’s rigorous rules on sourcing, writing style, and layout. 

Students will specifically focus on re-shaping the landscape of humanities articles pertaining to academic disciplines such as anthropology; archaeology; arts; classics; cultural studies; disability studies; ethics; gender and sexuality studies; history (including history of science); jurisprudence; languages and literature; music; philosophy; racial and ethnic studies; religion; sociology, as well as interdisciplinary topics related to an equitable human experience, like environmental justice.

By the end of 2026, we expect more than 200 million people will have viewed these articles and increased their understanding of communities, cultures, histories, and notable figures that have not received enough media attention elsewhere.

“I’m excited about our partnership with the Mellon Foundation,” says Frank Schulenburg, Executive Director of Wiki Education. “This initiative will significantly impact the students involved and the countless Wikipedia users who will gain free access to representative and trustworthy information. Our Knowledge Equity initiative is a key aspect of our mission, and I’m especially pleased that we’re starting a large campaign to enhance content in the Humanities.”

This project will catalyze our ongoing Knowledge Equity campaign by significantly growing participation among new humanities faculty and supporting Wikipedia use in their courses. Wiki Education will activate its existing network of academic associations and partners and identify new opportunities to collaborate. Our work will be guided by a newly established Humanities and Social Justice Advisory Committee, composed of seven exemplary humanities scholars who have taught with Wikipedia through our Wikipedia Student Program with a knowledge equity lens. Our program team will onboard and support 800 humanities courses that add over 11 million words, powerfully diversifying who and what you see on Wikipedia. 

Contact Kathleen Crowley, Director of Donor Relations, at kathleen@wikiedu.org if you’re interested in growing this impact. 

About The Andrew W. Mellon Foundation
The Andrew W. Mellon Foundation is the nation’s largest supporter of the arts and humanities. Since 1969, the Foundation has been guided by its core belief that the humanities and arts are essential to human understanding. The Foundation believes that the arts and humanities are where we express our complex humanity, and that everyone deserves the beauty, transcendence, and freedom that can be found there. Through our grants, we seek to build just communities enriched by meaning and empowered by critical thinking, where ideas and imagination can thrive. Learn more at mellon.org

About Wiki Education
Wiki Education is a small, high-impact nonprofit organization systematically building and expanding the content on the English Wikipedia. Our goal is to represent the sum of all human knowledge by making Wikipedia more accurate, representative, and complete through our Wikipedia Student Program and Scholars & Scientists Program. These programs have trained students in higher education classrooms across the United States and Canada, as well as subject matter experts from around the world, to add their knowledge to the most referenced online encyclopedia. We bring 19% of all new contributors to Wikipedia, who have written over a hundred thousand articles viewed hundreds of millions of times.

Mellon Foundation logo used courtesy of Mellon Foundation, all rights reserved.

Recently, the Wikimedia Foundation has updated the Country and Territory Protection List Policy, a list of countries and territories about which we limit data publication for privacy reasons. At the Wikimedia Foundation, we are committed to enabling people to participate in the free knowledge movement without having to provide personal information. As such, our projects purposefully collect very little personal information and any personal information that is collected is retained for the shortest possible time. Paramount to our goals is ensuring a safe environment online for our community of readers, editors, and administrators. 

In addition to our commitment to privacy, we also believe that transparency and open access are foundational values of the Wikimedia movement. Nonetheless, there have been very reasonable privacy concerns about releasing certain kinds of data in the past. Any release of aggregated raw data from sensitive countries and territories has the potential to inadvertently reveal someone’s location or activity. The Country and Territory Protection List (CTPL) Policy was originally built in 2019 to help mitigate this risk. It is a blocklist of countries and territories that we don’t publish data for as a privacy mitigation. The first iteration of this blocklist was developed using independent data from panels of subject matter experts: we created a composite score from Freedom House and Reporters Without Borders ratings and excluded any country or territory in the lowest category. Since 2019, this mitigation has allowed the geoeditors public monthly data release to be published and subsequently visualized in tools like Wikistats. As we have continued to publish more and more geolocated data since then, the country protection list has become a de facto standard.

Since 2021, WMF has been using differential privacy on some data releases. This statistical technique allows us to both quantify and strictly bound the privacy risk that a given data release poses to the individuals in the raw dataset. These new capabilities have prompted an update of the Country and Territory Protection List Policy. By using differential privacy, this update to the CTPL allows us to strike a balance between transparency and privacy, simultaneously enhancing WMF’s ability to publish information and preserving everyone’s ability to contribute to our projects online.

What has changed?

The biggest change in this iteration of the Country and Territory Protection List Policy is a move away from binary categorizations of countries and territories. In the past, if a country or territory scored in the lowest category in either the Freedom House or the Reporters Without Borders annual report, WMF put the country on the CTPL and did not publish data about it. Going forward, WMF will look at each country or territory’s scores in these reports and sort them into a risk framework that is not binary.

The result of this risk calculation is four categories: lower risk, medium risk, higher risk, and not published. Statistics about countries and territories with lower risks associated with publishing data can continue to be published as before, without differential privacy. Statistics about countries with medium and higher risks associated with publishing data, which were previously not published, can now be published using differential privacy. The CTPL Policy sets strict and conservative bounds on data releases about these countries and territories to ensure that privacy risks to users are low. Finally, there is a smaller set of countries and territories that are not published for safety reasons, even using differential privacy.
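To make the mechanics concrete, here is a minimal Python sketch of a differentially private count release with a suppression threshold. The Laplace mechanism and all parameter values below are illustrative assumptions for the example, not the actual mechanism or parameters that WMF uses.

    import numpy as np

    def release_editor_count(true_count, epsilon=1.0, threshold=5):
        """Release a noisy editor count, or None if it falls below the
        release threshold. epsilon and threshold are assumed values; in
        practice they would vary with the country's risk category."""
        sensitivity = 1.0  # one editor changes a count by at most 1
        # Laplace mechanism: noise drawn with scale = sensitivity / epsilon
        noisy = true_count + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
        noisy = max(0, round(noisy))  # counts cannot be negative
        return noisy if noisy >= threshold else None

    # A country with 3 editors will usually be suppressed; a country with
    # 35,000 editors will be published with small relative error.
    print(release_editor_count(3))
    print(release_editor_count(35_000))

Under a scheme like this, a stricter (higher-risk) category simply means a smaller epsilon and/or a higher threshold, which is one way to read the policy’s strict and conservative bounds.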

This gradation of risk allows for more nuance in our evaluation of data releases, and should ultimately enable the safe release of more Wikimedia platform data. Over the next few months, we hope to use this policy to enable the release of granular geoeditors data, pageview data, latency data, and more.

In the next section, we’ll dive into some of the findings from the differentially private geoeditors monthly release, the first dataset that will soon be compliant with the new CTPL Policy.

Case study: Russian Wikipedia

Under the previous version of the CTPL Policy, WMF did not release data about several countries where Russian is widely spoken. The new version of the policy allows us to release this data with stringent privacy guarantees. This means that Russian Wikipedia will see a large change in the amount of data visible. 

We’ve visualized some of this data below. For example, here’s a month-by-month GIF of the total number of editors on Russian Wikipedia in every country. Note that the scale on the right-hand side is logarithmic — that means the high end is on the order of ~35,000 editors and the low end is 1 editor.

We’ve also put together a series of line graphs that compare editor activity over time across nine communities of Russian Wikipedia editors. Note that, again, the y-axis scale is logarithmic. We’ve also plotted the release threshold for each country as a dotted gray line — this value represents the count below which we will not publish data, and it varies based on whether the country is lower, medium, or higher risk.

This latest update to the Country and Territory Protection List will allow us to publish more data, aligning with the value of transparency that we hold so central at the Wikimedia Foundation, in a way that minimizes additional safety and privacy risk to our contributors, editors, and admins. Our projects could not operate without the everyday work of our volunteer contributors, and with this policy update, more information will be released to help our volunteers with this global work. We remain committed to enabling people to participate in the free knowledge movement without having to provide personal information and ensuring a safe online environment for our readers, editors and administrators.

Ellen Magallanes is Senior Counsel at the Wikimedia Foundation and Hal Triedman is Senior Privacy Engineer at the Wikimedia Foundation.

I wrote an article for a Japanese web magazine “ENGLISH JOURNAL.” In the article, I introduced the Wikimedia movement in Malaysia.

Article

Content

This article featured the editathons which the Wikimedia Community User Group Malaysia held and the activities of my friend Taufik Rosman, the winner of Wikimedian of the Year 2023.


First, I introduced the editathons in Malaysia. These events are often supported by universities, GLAM institutions, and embassies. For example, the editathon that Malaysian and Japanese Wikimedians held together in December 2023 was supported by the Tokyo University of Foreign Studies, the International Islamic University Malaysia, the Japanese embassy in Malaysia, and ASEAN.

Then, I featured the famous Malaysian Wikimedian Taufik Rosman.

One of Taufik’s activities is preserving indigenous culture through Wikimedia projects, especially Wiktionary. In the web magazine, I explained to Japanese readers what Wiktionary is and introduced the Kent Wiki Club, a student Wikimedia club in Sabah, Malaysia, which aims to preserve the Kadazandusun language with support from Taufik.

In addition, I highlighted the editathon on endangered indigenous languages sponsored by UNESCO Jakarta.

Taufik Rosman (Don Wong for Tiny Big Picture, commissioned by The Wikimedia Foundation, CC BY-SA 4.0)

Furthermore, I presented the international collaborations of Malaysian Wikimedians, such as ESEAP, Wikimania, and the Wikimedia Japan-Malaysia Friendship project.

Ahmad Ali Karim, CC BY-SA 4.0

Web archive

I made a web archive of the article with the Wayback Machine on 23 January 2024.

A year of loving Living Heritage

Tuesday, 23 January 2024 14:26 UTC

Cultural heritage is not only monuments and collections of objects; it is also intertwined with everyday traditions and living expressions that are transmitted over generations. According to UNESCO, living heritage includes oral traditions (including language), performing arts, social practices, rituals and festive events, knowledge and practices concerning nature and the universe, and the knowledge and skills to produce traditional crafts.

Wiki Loves Living Heritage was launched in 2023 to celebrate the twentieth anniversary of the UNESCO Convention for the Safeguarding of the Intangible Cultural Heritage, as an invitation for Wikimedians as well as heritage authorities and organizations around the world to bring living heritage to the Wikimedia projects.

See the elements inscribed in 2023!

Photo contests, editathons, documentation projects

The project brought together the focal points (the authorities and organizations responsible for safeguarding living heritage) and the Wikimedia affiliates to decide what they wished to do together.

Photo campaigns and contests were familiar and fun formats for arranging events in different parts of the world. Wikimania 2023 in Singapore teamed up with the National Youth Achievement Award (NYAA) and the National Heritage Board (NHB) of Singapore to organize the Wiki Loves Living Heritage in Singapore Photo Contest. Dr. Kirk Siang Yeo of the NHB was invited to give a keynote lecture. In Europe, Wikimedia affiliates worked with focal points to run the European photo contest. All these photos are now openly available on Wikimedia Commons, expanding the visual knowledge of the world’s cultures.

Across the world, many formats were explored, including editathons, a heritage camp, a wiki club, a documentation trip, and data imports and editing events. The initiatives can be explored on the main page. The metrics page shows how the data and articles grew over the year, and the Get inspired page is for sharing learnings.

A Fiery Finale. At the end of the Nine Gods Festival which is held around the end of September till around early October in some of the temples in Singapore, the finale of the festival is usually the release and burning of paper boats in the sea. The boats are tugged out and set on fire by the devotees. Rodney Ee CC BY-SA 4.0. The image won the second prize in the Singapore photo contest.

The role of data

Inventories document living heritage practices in each country. For Wiki Loves Living Heritage, key details, including names in local languages, heritage type, locations, and a link to the inventory entry, are added to Wikidata. This information creates connections across Wikimedia projects, such as links to existing Wikipedia articles, images in Wikimedia Commons, books in Wikisource, as well as objects in museums or literature in libraries outside Wikimedia projects.

Wiki Loves Living Heritage pages feature the heritage elements and encourage users to add more information through links to the Wikimedia projects. The article How to import an inventory explains how to gather the data of an inventory and how to arrange it for importing into Wikidata.
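As a rough illustration of what such an arrangement might look like, here is a minimal sketch of one inventory entry prepared for import. P31 (instance of), P1435 (heritage designation), P276 (location), and P973 (described at URL) are general-purpose Wikidata properties; their use here and all values are placeholders, and the project’s actual data model is the one described in How to import an inventory.

    # One living heritage inventory entry arranged as structured data.
    # All Q-values below are hypothetical placeholders.
    element = {
        "labels": {  # names in local languages
            "en": "Example weaving tradition",
            "fi": "Esimerkkikudontaperinne",
        },
        "claims": [
            ("P31", "Q-ELEMENT-TYPE"),     # instance of: the element's type
            ("P1435", "Q-DESIGNATION"),    # heritage designation in the inventory
            ("P276", "Q-LOCATION"),        # where the tradition is practised
            ("P973", "https://example.org/inventory/123"),  # link to the inventory entry
        ],
    }

Entries in this shape can then be turned into a batch for a tool such as QuickStatements, or written with a bot framework, once they match the project’s agreed modelling.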

Keep an eye on the Get inspired page for article additions!

Shibori in the Zeugfärberei Gutau, Upper Austria by Funke, CC BY-SA 4.0.

Ethics of open sharing

Both protecting and opening information must be carefully considered when sharing living heritage online. Data about community-owned heritage or images of identifiable people might need to be shared only under certain conditions, and permission might later need to be revoked.

The Ethics of Open Sharing webinar in collaboration with the Creative Commons Open Culture Platform gave an introduction to these questions. The amazing speakers Melissa Shaginoff, Shailili Zamora Aray, Patricia Adjei and Mehtab Khan set the tone and discourse for the whole year to come. The Ethical Sharing Card deck was published at the same event.

WikiProject Local Contexts was created at GLAMhack 2023 in Geneva. It started exploring whether the content labels created by Local Contexts in support of indigenous sovereignty and ethical data governance could be used with images on Wikimedia Commons.

How can we know whether an image may be shared? With the images uploaded via the Wiki Loves Living Heritage campaign, we tested how permission granted by the people in the image, or by the community whose tradition is being shared, can travel along with the uploaded images.

A Dagomba boy dancing Baamaya in Northern Ghana by Sir Amugi, CC BY-SA 4.0.

What next?

The campaign is over, but the Wiki Loves Living Heritage pages are still maintained and developed.

The Living Heritage pages for individual heritage elements are being developed further into multilingual workshop pages that collect images, articles, and source references about the elements, and that can serve as starting points for image contributions, research, or writing events.

Only a fraction of the world’s living heritage inventories are on Wikimedia projects. The Wiki Loves Living Heritage community continues to support adding inventories and elements to Wikidata to encourage safeguarding, documenting and revitalizing the elements.

Collaborations between Wikimedia affiliates and focal points will continue to be supported by the project team. Some initiatives are only just starting to happen. Let’s imagine how the different campaigns and organizations in the Wikisphere can come together to support living heritage activities in the years to come!

Keep following the Wiki Loves Living Heritage page for updates.

Dokra products include items such as bullock carts, horses, elephants, peacocks, owls, Nandi (the bull), idols of Durga, Saraswati, Ganesh, and Lakshmi Narayan, lamps, candle stands, incense stick holders, ashtrays, soap cases, mobile holders, door knobs, figurines of women, mothers and children, and tribal couples, and wall panels with stories of Krishna Leela. Contact Base (WB), CC BY-SA 4.0. The original image has been edited by Susanna Ånäs with Adobe AI in Photoshop to remove a white area in the picture.

Thank you all, and let’s keep on working!

The different corners of the Open movement made this global collaboration come true. The staff in the affiliates as well as Wikimedia volunteers participated by bringing data online, writing articles, transcribing living heritage literature, contributing to the photo contests, joining the weekly meetups, presenting at events, solving problems on the online channel, maintaining translations or providing technical advice. Wikimedia Foundation Culture & Heritage team as well as the Communications team provided support throughout the year. Two events have been organized with the Let’s Connect Learning Clinics.

Wiki Loves Living Heritage was part of several events over the course of the year. Living Heritage Bazaar events were arranged in Europe in collaboration with the European Heritage Days coordinators, at Wikimania in Singapore, and at the GLAM Wiki 2023 conference in Montevideo, Uruguay. At GLAMhack 2023 in Geneva, Switzerland, Traditional Knowledge content labeling on Wikimedia projects was workshopped with Local Contexts. Data import was discussed at WikidataCon, modelling living heritage at Data Modelling Days, and using Wikimedia Meta as the project platform at GLAM Wiki 2023.

The campaign was supported by the Wikimedia Foundation, Council of Europe / European Heritage Days, Quebec Council for Living Heritage, Dutch Centre for Intangible Cultural Heritage, Federal Office of Culture of Switzerland, Workshop Intangible Heritage Flanders & Flemish Commission for UNESCO in Belgium, and the Finnish Heritage Agency.

Wiki Loves Living Heritage was initiated by the European Network of Focal Points for the 2003 Convention (ENFP) and AvoinGLAM.

Polissian dudka-vykrutka – an element of the intangible cultural heritage of Ukraine by Рівненська обласна державна адміністрація, CC BY-SA 4.0

Tech/News/2024/04

Tuesday, 23 January 2024 01:03 UTC

Other languages: Deutsch, Dusun Bundu-liwan, English, Ghanaian Pidgin, Nederlands, Soomaaliga, Tiếng Việt, español, français, italiano, norsk bokmål, polski, português, português do Brasil, suomi, čeština, русский, українська, עברית, العربية, বাংলা, 中文

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Problems

  • A bug in UploadWizard prevented linking to the userpage of the uploader when uploading. It has now been fixed. [1]

Changes later this week

  • The new version of MediaWiki will be on test wikis and MediaWiki.org from 23 January. It will be on non-Wikipedia wikis and some Wikipedias from 24 January. It will be on all wikis from 25 January (calendar). [2][3]

Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.

The introduction of Vector 2022 skin into en.wiki

Monday, 22 January 2024 19:59 UTC



[This started out as a response for [[Wikipedia:Requests for comment/Evaluation of Vector 2022]] but turned out to be too expansive and broad in scope for there. It is intended for those familiar with the basics of en.wiki and the WMF.]

The introduction of Vector 2022 skin into en.wiki was a disaster by pretty much any metric.

Huge amounts of heat but very little light were generated on public-facing wikis (mainly en.wiki) and tools (mainly phabricator), and, reading between the lines, the process was very unpleasant for both WMF staffers and en.wiki admins. The end result was that a new skin containing modest improvements (mainly maintenance fixes) was adopted at the cost of huge ill-will.

Given that regular UI changes in web-based SaaS systems have been de rigueur for more than a decade, how did we get to the point where this change was so contentious?
  1. It wasn’t about the technical content of the change. The changes were technically boring, competently implemented and worked reliably in the overwhelming proportion of situations for the overwhelming proportion of editors.
  2. It wasn’t about the intention of the WMF staffers directly involved in the process. All the WMF staffers appeared to behave professionally and appropriately.
  3. It wasn’t about the intention of the en.wiki admins. All the en.wiki admins appeared to behave appropriately.
  4. It may have been partly about the huge pool of en.wiki editors who are deeply invested in the project, each with their own point of view, set of priorities and fields of expertise. This, however, is a fundamental strength of the project (both Wikipedia as a whole and en.wiki specifically).

Systematic issues

en.wiki is a volunteer-edited information system running on systems provided by the professionally-staffed WMF. The volunteer side, while not explicitly a social-media forum, certainly shares aspects with social-media fora including, unfortunately, pile-ons. The en.wiki response to Vector 2022 was a classic pile-on: a community responded to a technical event in an emotionally charged manner with many people expressing very similar strongly-held views in such a way that emotive content completely obscured any informative content.

Indeed, the en.wiki WP:!VOTE policy encourages on-wiki pile-ons by explicitly prohibiting votes and vote-like processes unless each voter makes a substantive argument. Those substantive arguments can get very emotive.

Causes

  1. The boundary between the volunteer-run and professionally-staffed portions of en.wiki is brittle: current processes and arrangements ensure that making technical changes to en.wiki is an all-or-nothing, big-bang operation which is very costly to all concerned.
  2. Technical changes to the en.wiki platform are seen by en.wiki editors as coming from “elsewhere” and being done to them, setting up an in-group and an out-group, with the WMF consistently being the out-group.
  3. en.wiki continues to allow pile-ons.

Concrete ideas for WMF

Some of these ideas aim to 'soften' the boundary between the volunteer-run and professionally-staffed portions of en.wiki, increasing the proportion of editors with the skills, knowledge and insight to better understand the underlying infrastructure and technologies. Other ideas aim to increase the availability of relevant academic studies in related areas.
  1. Consider recasting wiki infrastructure updates to make WMF tech teams arbiters of technical quality rather than sources of disruption. This might be by funding (and providing infrastructure for) commercial or academic teams to build, debug, test and evaluate skins (and similar) which are then promoted to wikis by WMF based on quality.
  2. Consider sponsoring academic research and a theme or track at a usability conference or journal on wikipedia usability (reading and editing; across language and culture textual practices; design for avoiding pile-ons; etc).
  3. Consider sponsoring science communication in fields relevant to the wikipedia project: web UI; information systems; multilingual web usability; readability; etc. By promoting awareness of the academic consensus in these fields, there is hope that we can steer discussion along evidence-based lines rather than “I don’t like X, don’t do it”.
  4. Consider sponsoring the creation and maintenance of wikibooks on each of the technologies wikipedia relies on, prioritising those usable by non-privileged wikimedians within the project (javascript, css, SQL, etc). Boosting access to such resources and aligning the versions and examples with the broader project would promote these skills across the project and enable motivated volunteers to engage with these technologies much more easily.
  5. Consider using the volunteers who were actively involved in discussions related to one update as candidates for notification / testing of related updates. My participation in discussions related to Vector 2010 apparently didn’t qualify me for notification about Vector 2022; it should probably have. 12 years may seem like a long time to the WMF, but non-trivial numbers of active en.wiki users have been editing since before WMF was founded, embodying significant institutional knowledge. [Service awards can be used to find veteran editors.]
  6. Consider processes to rollout changes to only portions of a wiki at once for testing purposes.
  7. Consider moving to rolling updates of significant features, as is common in SaaS. For example: a new mainline skin appears every January on all wikis, becomes the default in May, and is marked as deprecated 48 months later. A new alternative skin appearing alongside it, with more innovative features and more radical changes to the visual aesthetic, might be deprecated earlier, with its successful features appearing in a future mainline.
  8. Consider publishing explicit design criteria for future wikimedia skins (and similar) built / commissioned by the WMF. 
  9. Consider ‘introspection into the wikimedia system’ as a design criterion for future wikimedia skins built / commissioned by the WMF. ‘Introspection into the wikimedia system’ in this context means enabling and encouraging users to reflect on the wikimedia install before them, and might include: consistent visual differentiation between UI elements created by wikimedia core functionality, installed gadgets, and /wiki/User:<user>/common.js; links from preference options to the respective tags in phabricator; etc.
  10. Consider publishing formal technical evaluations of skins, to provide evidence and motivate change and progress. If editors can see that one skin fails on 25% of browsers used globally and another fails on 1% of browsers used globally, that is hard evidence that the second fulfills the WMF’s mission better than the first.

Concrete ideas for en.wiki

  1. Consider better ways of handling contentious issues which don’t result in pile-ons and bordering-on-unclosable RFCs.
  2. Consider a policy requiring complaints about specific technical issues in WMF infrastructure (broadly construed, but including skins) to include a link to a relevant phabricator ticket (or a statement of why one can’t be created) if instructions for doing so are already on the page. Driving people who complain about WMF tech stuff to phabricator to create a bug report should be obvious, but apparently it is not.

This fall, students in Melanie Sanwo’s Honors English class at Clovis Community College came together with a common mission: add biographies of diverse people in STEM to Wikipedia. The 12 students split into four groups to add four new biographies to Wikipedia: Steve Ramirez, Joseph Monroe, Juan G. Santiago, and James M. Jay.

For the Clovis students, it was an opportunity to see themselves in the people they were writing about. Designated as a Hispanic-Serving Institution, Clovis is located in Fresno, California.

“By adding a biography of a diverse person, I felt like I was bringing their story out in the world. Not many would know about this individual but because we started his article, it would allow others to learn about him,” says Kaitlyn Chhay, a first-year student at Clovis who is planning to major in biology. “My favorite part about writing for Wikipedia is knowing that in some way I am kind of leaving an impact on this website. Now forever on, I will know that I was one person of a group who started his article and brought his name out into the light.”

Benson Karki, a computer science major, echoed Kaitlyn’s sentiments.

“Adding a biography of diverse individuals in STEM to Wikipedia is significant to me because it serves as an inspiration for individuals from similar backgrounds,” he says. “Highlighting the accomplishments of someone like Juan G. Santiago, whose work was not widely showcased on the internet, is essential in demonstrating the possibilities within STEM fields to a broader audience.”

Bringing diverse people’s accomplishments to light is the goal of the project the students participated in, funded by the Broadcom Foundation. By adding diverse biographies to Wikipedia, the project seeks to help students see themselves in the heroes and heroines of science.

“As a person of color coming from Lebanese immigrant parents, adding a biography of a person also of color in STEM means for me that I can make it,” Danny Aoun says. “Having a role model who has pursued their dreams as a person of color and has a commonality with me in neuroscience motivates me to strive in whatever I am currently doing because my end goal will be to make as big of an impact on this world that Steve Ramirez has.”

Danny says he plans to transfer to a four-year college after finishing his first two years at Clovis. He hopes to major in psychology or psychological and brain sciences, and dreams of completing a doctoral degree. So creating the biography of a neuroscientist was particularly meaningful to him.

“My favorite part about writing for Wikipedia was learning what this underrepresented STEM researcher has done for modern neuroscience,” he says. “Neuroscience is such a fascinating field, so I enjoyed learning about the research he has done and what it means for the future.”

Juliet Herzog, another Clovis student, is also planning to transfer to a four-year school, and is majoring in biology.

“It was interesting to research and write about someone influential in STEM because of my interest in science,” she says. “Writing this article felt rewarding. As my group did research for the microbiologist we were creating an article for, James M. Jay, it seemed like he was influential in his field. From our research, he seemed dedicated to microbiology, and this was reflected in the awards honored to him. For this reason, I am glad he has a Wikipedia article that others can continue to make contributions to and others can read.”

Of course, not only did students learn about the people whose accomplishments they were featuring on Wikipedia — they also learned about Wikipedia itself.

“One crucial aspect I’d like to emphasize is how meticulously Wikipedia articles are crafted. They are credible, unbiased, and extensively backed by citations, challenging the misconception that they are not reliable sources; a significant lesson I learned through this assignment,” Benson says.

And it’s not just learning about Wikipedia; students gained core research and writing skills as well.

“One thing I really appreciate about this project was the skills developed from writing this type of article; for example, research and writing objectively,” Juliet says. “I thought it was an interesting project because of how different it was from past writing assignments. I am mostly familiar with writing argumentative essays or thematic essays, so this writing was very different for me. Furthermore, this project not only helps students, but also Wikipedia as this project raises awareness among students of writing for this website.”

Danny agrees. While he loved sharing Steve Ramirez’s contributions to neuroscience research, he also enjoyed gaining skills along the way.

“My second favorite part about writing about Wikipedia was learning information literacy. I did not only realize how to be literate at gathering information but how important it is in our day and age where seeking accurate information is a necessity,” he says. “I grew passionate for this project through the information I was learning, and this ties back to what Wikipedia is, a worldwide, free encyclopedia. Being able to provide more information to the public on what I may be researching in the future and supporting Wikipedia’s journey to make a more equitable encyclopedia has been a great honor.”

Visit teach.wikiedu.org to learn how to incorporate a Wikipedia assignment into your own course.

Header image of students in the class courtesy Melanie Sanwo, all rights reserved.

Tech News issue #4, 2024 (January 22, 2024)

Monday, 22 January 2024 00:00 UTC
2024, week 04 (Monday 22 January 2024)

Tech News: 2024-04

weeklyOSM 704

Sunday, 21 January 2024 11:14 UTC

09/01/2024-17/01/2024

lead picture

The OSM Iceberg [1] | Copyright © Xvtn

Mapping

  • Requests for comments have been made on these proposals:
    • hitchhiking=true to add a tagging system for hitchhiking spots, aiming to identify common and successful hitchhiking locations.
    • toilets:menstrual_products for adding information about whether a toilet has menstrual products available or not.

Community

  • [1] Xvtn has created an OpenStreetMap-themed iceberg chart meme.
  • Anne-Karoline Distel described how she searched for and mapped village pounds in Ireland and Great Britain.
  • Franjo Lukežić explained how to set up a drawing tablet to improve the FastDraw experience in JOSM. There is also a hint (towards the end) on how to reduce the number of clicks when drawing a neighbouring area.
  • King edgar blogged about his experience at the State of the Map Africa 2023 conference in Yaoundé, Cameroon. The event was focused on ‘Open Mapping as a Support Tool for Local Development in Africa’.
  • Makilagi Ed shared their experience of mapping with OpenStreetMap for the first time, detailing how taking a journey through familiar and new areas, marking waypoints and creating routes, has deepened their understanding of the art of mapping.
  • unsungNovelty published a post about the three places on OpenStreetMap that he has mapped and remembered the most. He noted the beauty of Valparaiso in Tamil Nadu, recounted how he had to correct his mistakes in Pettineo, Italy, and shared how he refined the contours of Dirks Lake in Arkansas, USA.
  • Tom Hughes, maintainer of the OpenStreetMap website, made it to the OpenUK New Year’s Honours list, which recognises the UK’s top open source influencers.
  • The OpenStreetMap France community is debating the future of their social media presence, particularly on Twitter and Mastodon, with discussions focusing on whether to continue using Twitter, fully migrate to Mastodon, or adopt a hybrid approach for broader outreach and engagement.

OpenStreetMap Foundation

Local chapter news

  • Amanda McCann triggered a discussion by reporting that Slack, a proprietary chat service used by OSM US and others, is moving to an ‘ephemeral’ 90-day history of discussions unless the paid service is used. Alternatives to this service are being discussed.
  • The next FOSSGIS OSM Community Meeting will be held in Essen, Germany in May 2024.
  • OpenStreetMap US hosted a virtual conference event titled ‘Mapping USA 2024’, which ran on 19 and 20 January.
  • OpenStreetMap US is selling various OpenStreetMap-themed merchandise.

Events

  • The call for the SotM 2024 travel grant programme is now open. Apply before the end of January if you need financial support to join the State of the Map 2024 in Nairobi.
  • An event to be held at the EPN – Médiathèque Louis Aragon in Martigues on Friday 22 March will offer an introduction to OpenStreetMap and how to contribute, as part of a series of collaborative workshops organised by the city’s public digital space to explore digital tools and practices suitable for different skill levels.
  • KDE España released a recording of the talk titled ‘Geographic Open Data with OpenStreetMap’ via YouTube. You can also watch this video on the PeerTube instance maintained by the KDE community.
  • HeiGIT reported that their ‘Waterproofing Data’ project, a mapping project to improve disaster mitigation process in the flood-prone areas of Brazil, has won the ESRC Celebrating Impact Prize 2023 in the ‘Outstanding Societal Impact’ category.
  • Participants in the OpenStreetMap@Asia channel in Telegram held their second monthly workshop, tentatively called ‘Map-py Wednesday’, with the goal of encouraging regional mappers to get to know each other better, share about local happenings, and learn from each other, through snappy map-py (lightning) talks. Map-py Wednesday is held online on the third Wednesday of the month, at 08:30 UTC.

Education

  • The UN Maps Learning Hub has launched its OSM Advanced courses, starting with a course on validating data in OSM. It offers an initial guide to validation techniques and tools such as the OSM Tasking Manager validation steps, JOSM tools and plugins, Overpass, Osmose, Whodidit, and OSMCha; then a guide to interacting with OSM contributors; and finally a proposed validation workflow for different map features (places, land use, road network, hydrography, bridges, fords and culverts) with a long-term perspective. Feedback is welcome!

OSM research

  • A new building classification model using OpenStreetMap building data aims to estimate the earthquake risks to the built environment, enhancing the accuracy of damage assessment and aiding in disaster management.
  • HeiGIT and Amsterdam University Medical Centres have conducted a study analysing the relationship between the locations of food-related retailers on OpenStreetMap and the health of people living in an area.

Maps

  • Christoph Hormann released some additional layers for Musaicum EU-plus, a 10 m resolution satellite image mosaic of Europe.
  • Paul Norman analysed the load and performance of the minutely map tile updates on OpenStreetMap. With an update capacity of 4800 tiles per second, the OSM server is expected to handle 95% of map tile update requests in less than 60 seconds.
  • Russ Garrett tooted a map of the world’s electricity networks made using data from OpenStreetMap.

OSM in action

  • Geocaching introduced their OpenStreetMap-based Trails map as a replacement for Google Maps for their premium members. They highlight that the maps are made by locals, have more detail, and are available offline.
  • Jake Coppinger released his ‘Australian Cycleway Stats’, a web dashboard that displays statistics on cycleways in Australia based on data from OpenStreetMap.

Open Data

  • Sven Geggus has updated the list of trekking locations in Germany on the OpenStreetMap wiki. Some areas on this list still need more details.

Software

  • Daniel Schep announced that he has made a new version of Overpass Ultra that makes it easier to customise map styles and feature popups.
  • bs2000 has created an OpenStreetMap-based journey planner application. It processes the GPX route of a planned journey and then finds nearby travel-related POIs (such as toilets, lodging, and refuelling stations) along the route.
  • Pieter Vander Vennet blogged about all the achievements of MapComplete in 2023. It is an impressive list!
  • Vespucci tooted that support for OAuth2 is available in version 20, which is planned to enter the first beta stage at the end of March 2024.

Programming

  • Mark Litwintschik has written about how to work with openly published flight tracking data collected by a network of volunteers contributing to a common aircraft tracking feed.
  • Lucas Longour created some userscripts to display Panoramax links on OpenStreetMap and overpass turbo.
  • rtnf has shared Javascript code to integrate OpenStreetMap data with Wikidata.

Releases

  • GeoDesk for Python 0.1.4 has been released. Highlights include large-area spatial queries that are now 4 to 30 times faster (finding all 60+ million mapped buildings in the US in under a second) and significantly more accurate area calculations for features near the poles.
  • Version 2.2.0 of Transportr has been released, with the main change being the switch to MapLibre, making it available on F-Droid again.
  • Tilemaker version 3.0.0 has been released. It can convert OSM data in osm-pbf format into Mapbox Vector Tile format (.mbtiles).

Did you know …

OSM in the media

  • Moritz Poldrack, from Tarnkappe.info, reported on the OpenStreetMap Foundation’s move to enforce the rules regarding OSM’s data attribution requirements.

Other “geo” things

  • Using LiDAR technology, a team of archaeologists have discovered a cluster of ancient city traces in the Amazon rainforest region of Ecuador.
  • Using training data from Google Street View, a team of students from Stanford University have built an AI system to automatically detect where photos were taken. In a ‘Geoguessr’ geolocation competition, this AI system successfully defeated a human geolocation expert in multiple rounds.
  • Mapbox released MapGPT, a location-aware AI assistant.
  • NASA has launched TEMPO (Tropospheric Emissions: Monitoring of POllution), a satellite-based air quality monitoring system. This satellite can provide hourly reports on atmospheric pollutants in the North American region.

Upcoming Events

Where | What | When
Gent | OpenStreetMap meetup & MapComplete workshop at TomTom | 2024-01-18
Washington | Mapping USA | 2024-01-19 – 2024-01-20
Bengaluru | OSM Bengaluru Mapping Party | 2024-01-20
City of Fremantle | Social Mapping Saturday: Fremantle 2024 | 2024-01-20
Hai Buluk | OSM Africa Monthly Mapathon: Map South Sudan | 2024-01-20
Windsor | OSM Windsor-Essex Meetup | 2024-01-23
Bremen | Bremer Mappertreffen | 2024-01-22
San Jose | South Bay Map Night | 2024-01-24
Online | iD Community Chat | 2024-01-24
Online | OSMF Engineering Working Group meeting | 2024-01-24
London Borough of Islington | Geomob London | 2024-01-24
Richmond | MapRVA Happy Hour | 2024-01-25
Online | OpenStreetMap Foundation board of Directors – public videomeeting | 2024-01-25
Lübeck | 138. OSM-Stammtisch für Lübeck und Umgebung | 2024-01-25
Wien | 70. Wiener OSM-Stammtisch | 2024-01-25
Online | MapComplete Community Call 2024-01 | 2024-01-26
Terni | Compleanno OpenStreetMap Italia 2024 | 2024-01-26
Localidad Teusaquillo | Mapeemos los restaurantes de la calle Bonita y la Macarena | 2024-01-27
Saint-Étienne | Rencontre Saint-Étienne et sud Loire | 2024-01-30
Salt Lake City | Salt Lake City Monthly Map Night | 2024-02-01
Düsseldorf | Düsseldorfer OpenStreetMap-Treffen (online) | 2024-01-31
Amsterdam | End of Winter Mapping Party | 2024-02-01

Note:
If you would like to see your event here, please put it into the OSM calendar. Only data which is there will appear in weeklyOSM.

This weeklyOSM was produced by MatthiasMatthias, PierZen, Strubbl, TheSwavu, barefootstache, derFred, mcliquid, rtnf.
We welcome link suggestions for the next issue via this form and look forward to your contributions.

I really enjoyed watching "You Are What You Eat", a four-part Netflix documentary based on research into the differences between a vegan and an omnivorous diet in identical twins. The results of this research can be found in a paper called "Cardiometabolic Effects of Omnivorous vs Vegan Diets in Identical Twins".
The documentary has several story lines: one is about the research itself, another follows the participants in the study, and finally we are informed about the industry that produces our food. The chosen participants are a vehicle for the story; there were chefs, athletes, cheese aficionados, and people from other cultures (seen from a US-American perspective). What people eat has to be produced, so we are informed about the food industry. The picture painted is not pretty, but it is based in facts.

On YouTube there are several "reviews" and now some genuine reviews as well. All of the "reviews" are really disappointing because they express expectations that are not realistic. The program is NOT only about the science, and it does NOT give equal weight to the production of fish or meat. The results of the research are favorable to a vegan diet, and the documentary provides information on what is available when less or no meat is eaten. That is why we learn about the quality of vegan cheese and meat products. Great cheeses and a biltong that is not meat based are explored by participants of the study.

I found the YouTube "reviews" disappointing because they came across as hatchet jobs. Where they consider the documentary biased, that judgment is rooted in the bias of the reviewer and not necessarily in the results of the research. When it is said that these reviews were requested by "so many people", it feels like people in the agro business have shown their hand.

Wikipedia has an article on the documentary and an article on the principal author of the paper. Both maintain an appropriately neutral point of view.

As my Wikidata reaction, I added the paper to Wikidata, along with many of its authors and many of the papers cited as references. To be brutally honest, seen from within Wikidata it looks awful: it is one-dimensional and unusable. However, thanks to tools, the full impact of the available information becomes accessible. Scholia is my preferred tool for science. This is the Scholia page for the paper.
Thanks,
      GerardM

Enhance Your MediaWiki with Bootstrap

Saturday, 20 January 2024 00:00 UTC

Improve your wiki by integrating it with Bootstrap and its components.

Welcome to the dynamic realm of content and knowledge management with MediaWiki, essential for businesses, professional groups, and individual users alike. More than just an aesthetic upgrade, the combination of MediaWiki and Bootstrap pairs practicality with responsive design to enhance the user experience, ensuring efficiency and ease of access for platforms like Wikipedia.

Bootstrap, a powerful tool for building modern websites, shines in these circumstances. Combining Bootstrap with popular MediaWiki skins such as Chameleon and Medik can turn the classic wiki style into something sleek, fast, and more user-friendly.

Why is this blend of Bootstrap and MediaWiki a win for businesses, wiki communities, and individuals? Let us discover how Bootstrap revolutionizes the MediaWiki interface with its responsive design and user-friendly features.

Enriching MediaWiki with Bootstrap Integration

Bootstrap enhances MediaWiki with mobile responsiveness. Its easy integration with skins like Chameleon and Medik ensures a visually appealing, functional experience, adapting seamlessly to various screen sizes in our mobile-centric world.
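As a rough illustration of what that integration involves, the following LocalSettings.php fragment is a hedged sketch of enabling the Chameleon skin, which bundles Bootstrap. Exact installation steps and configuration variables vary by skin and MediaWiki version, and the layout-file setting shown is an assumption based on Chameleon's documented conventions, so treat this as an outline rather than a verbatim guide.

```php
<?php
// LocalSettings.php fragment -- a hedged sketch, not a verbatim install guide.
// Assumes the Chameleon skin (which bundles Bootstrap) is already installed,
// e.g. via Composer; consult the skin's documentation for the current steps.
wfLoadSkin( 'chameleon' );      // register the skin with MediaWiki
$wgDefaultSkin = 'chameleon';   // serve it to all users by default

// Chameleon builds its page structure from an XML layout file; the path
// below is illustrative and depends on where the skin is installed.
$egChameleonLayoutFile = "$IP/skins/chameleon/layouts/standard.xml";
```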

This integration translates to better user engagement and productivity for businesses, professional groups, and individuals using MediaWiki as a knowledge base or collaboration tool. Employees and users can efficiently access information on any device, improving workflow and collaboration.

Bootstrap for MediaWiki as a Resource

The "Bootstrap for MediaWiki" wiki is crucial for those integrating Bootstrap with MediaWiki, offering many practical examples and tutorials. It showcases Bootstrap's versatility within MediaWiki, from adding an image carousel to implementing responsive design elements like grids.

This wiki is essential, particularly because the MediaWiki software restricts certain HTML tags, preventing the direct use of many of the examples in Bootstrap's own documentation. The wiki bridges that gap, offering tailored solutions and adaptations for effective integration.

Ideal for people with varying MediaWiki experience, it simplifies the customization process with easy-to-follow examples and code snippets. This resource benefits enterprises, professionals, and beyond, providing essential components like alert boxes, buttons, cards, and many more for easy integration.

The "Bootstrap for MediaWiki" wiki is more than a guide; it is an evolving platform enriched by community contributions, ensuring it stays current with the latest trends in web design and MediaWiki enhancements.

Screenshots of the Bootstrap for MediaWiki wiki

Conclusion

In conclusion, integrating Bootstrap with MediaWiki goes beyond mere aesthetics. This powerful combination proves highly effective for enterprises, professional settings, and wiki communities, transforming MediaWiki into a dynamic, user-centric, responsive platform.

The "Bootstrap for MediaWiki" wiki exemplifies this potential, offering practical insights and embodying the essence of community-driven innovation in open-source development. Whether you are a seasoned developer or new to MediaWiki customization, these resources and collaborative support simplify the enhancement process.

Incorporating Bootstrap into MediaWiki leads to more efficient, attractive, and user-focused websites. This integration invites you to a refined world of content management.

Elevate Your MediaWiki Now!

Dive into the "Bootstrap for MediaWiki" wiki and start transforming your site today. Join our community, experiment with new ideas, and share your insights. Whether you're a first-timer or a seasoned pro, every step you take enriches our collective journey.

Your journey to a more dynamic and responsive MediaWiki begins!

We are now in an election year, when a vast range of issues that affect Americans, from war to healthcare, comes to the forefront of their decision-making in the voting booth. Alarmingly, a survey by Gallup and the Knight Foundation found that trust in the media covering these issues is now so low that about half of the American public believes national news organizations intend to deceive them. In a trend that is sure to grow, a whopping 58% of those surveyed also say that they get their information online.

While the Internet, social media, and generative artificial intelligence tools make information quick and easy to access, claims are often made without sources to support them, and their credibility can be tainted by biased agendas. ChatGPT can also hallucinate sources and generate content that has no basis in fact. At the same time, a recent NPR/Ipsos poll found that Americans across the political spectrum feel the country is in crisis and at risk of failing. With the rise in civil unrest and discussions of civil war, the need to spread neutrally presented information that can be trusted is all the more urgent.

In this era where information is abundant but accuracy is often compromised, it is crucial to equip the next generation of leaders with the skills to discern reliable sources and contribute to the creation of factual knowledge. This is especially key to maintaining the integrity of Wikipedia, which consistently ranks in the top ten of all online search results and is largely trusted by the public. Indeed, Wikipedia is free from advertising or the influence of private interests that can distort other online and social media sites. 

Wiki Education has partnered with the Bernard and Audre Rapoport Foundation to foster a more informed, digitally literate citizenry who reads and edits Wikipedia and participates in civic society.

“On behalf of the Rapoport Foundation, we are pleased to support Wiki Education with a $25,000 grant for the Wikipedia Student Program, and we look forward to learning from this partnership over the course of 2024,” said Jenny Peel, Rapoport Foundation Program Officer.

Wikipedia’s rigorous rules and guidelines on writing with neutrality and citing reliable sources have been refined over two decades by the volunteer editing community. Our Wikipedia Student Program teaches students how to follow these strict policies to improve their research and digital media literacy skills. The support from the Rapoport Foundation will help us serve 800 students in 40 courses at universities and colleges across the United States as they learn to tackle misinformation on Wikipedia and hone their ability to detect it. Students will edit or create at least 670 articles that we expect to receive more than 7 million views by the general public, including policymakers, journalists, and more.

Getting information right on Wikipedia can have huge ramifications. Recent research has found that articles have the power to influence our democracy and rule of law. When researchers analyzed judges’ decisions from Ireland’s lower courts, they discovered that cases with Wikipedia articles were 21% more likely to be cited as precedents and that lower courts drew on Wikipedia articles in framing these precedents and their meaning to make their decisions. The reliance on Wikipedia—used to shape and dictate the way of life in America and abroad—can have a reverberating impact for generations.

In collaboration with the Rapoport Foundation, Wiki Education recognizes this significance and is building a foundation of trust and accuracy in the information landscape. We are immensely grateful to the Rapoport Foundation for their timely support of this work to improve the desperate state of our democracy.

For more information, please contact:
Kathleen Crowley
Director of Donor Relations
kathleen@wikiedu.org

Rapoport Foundation logo used courtesy Rapoport Foundation, all rights reserved.

Telling the story of an African American chemist

Wednesday, 17 January 2024 16:06 UTC
Harold Evans in a lab
Harold Evans in 1949. Image in public domain, via Wikimedia Commons.

Chemist Harold B. Evans was one of a handful of African American scientists who worked on the Manhattan Project. A photo of him has been included in the Wikipedia article on African-American scientists and technicians on the Manhattan Project for several years — but Wikipedia lacked his biography until student De’Narie Breeland created it this year.

De’Narie wrote the article on Evans as a class assignment for Corry Stevenson’s Principles of Engineering class at Denmark Technical College. Denmark, a Historically Black College and University, is located in Allendale, South Carolina, where De’Narie is from. De’Narie and her classmates were assigned to create a biography of a diverse person in STEM with support from Wiki Education as part of an initiative sponsored by the Broadcom Foundation to support diversity in STEM on Wikipedia.

“At first, I was confused that I was going to be writing an article or biography. I have never written a biography or article for Wikipedia or for a college class,” De’Narie says.

But she didn’t let that stop her. She jumped right in and started researching Evans’s contributions.

“I chose Harold B. Evans because he was an African American male who had an opportunity that many African Americans could not have,” De’Narie says. “It was very meaningful because us African Americans have been overlooked in the past and I love to see African Americans achieve in something so big.”

She learned the ins and outs of Wikipedia editing through Wiki Education’s online trainings, with support from Wiki Education’s Wikipedia Expert Brianda Felix. And, in addition to learning more about Wikipedia, she also learned about the importance of African American scientists. De’Narie isn’t sure what she wants to pursue career-wise yet — cosmetology, midwifery, and real estate are all of interest — but she enjoyed researching Evans and adding citations about his life.

“I learned that there were a few, only a few, African Americans who were able to help on the Manhattan Project,” De’Narie says. “This experience was really great.”

Visit teach.wikiedu.org to learn how to incorporate a Wikipedia assignment into your own course.

Many knowledge institutions use external identifiers to link their catalogues with Wikidata. Not many people are aware of the effort volunteers put into maintaining Wikidata’s data quality. In this interview, user Epìdosis shares some insights from his eleven years of editing and reconciling libraries’ catalogues on Wikidata.   


Hi Camillo/Epìdosis! Thank you for agreeing to share your thoughts and experiences from your years of catalogue work with Wikidata. Can you briefly give an overview of your efforts and contributions to Wikidata, which I believe are all voluntary?

My activity on Wikidata, which has been all voluntary with just one exception in the last decade, has many different focuses: the daily patrolling of my watchlist (more than 100,000 items); matching existing items and creating new items through Mix’n’Match, mainly for library authority files and biographical dictionaries; resolving duplications and conflations of items about humans, starting from the constraint violations of the IDs of library authority files; reconciling Italian controlled vocabularies with Wikidata; periodically revising inconsistencies in reconciliation between the Italian national authority file (SBN) and Wikidata; supervising university projects and datathons on Wikidata; didactic activity about Wikidata, mainly for Italian librarians and university students, but also high school students in a few cases; and writing scientific articles about Wikidata and its relationship with data produced by libraries.

You have made 7.1 million edits so far on Wikidata. Wow! About what percentage of these edits are manual edits, and how much time per day do you usually spend editing Wikidata? 

According to NavelGazer I made 1.9 million of my edits without using any external tool; all of these can be considered manual in a certain sense, although many of these were facilitated by using gadgets. I usually spend 1-3 hours each day on Wikidata.

About how many libraries or institutions have you reached out to over your past 10 years with Wikidata? What are their usual reactions when you first introduce yourself to them? 

I’ve had direct contact with six VIAF members: ICCU, the institution managing the Italian national authority file; NLG, the National Library of Greece; PTBNP, the National Library of Portugal; SUDOC, the network of French academic libraries; SKMASNL, the Slovak National Library; J9U, the National Library of Israel.  I’ve had indirect contact with seven others: BNE for Spain, BNC for Catalonia, DNB for Germany, NKC for the Czech Republic, BAV for the Vatican, PLWABN and NUKAT for Poland. 

I’ve also had contact with a few dozen smaller libraries in Italy and a few in Greece and Cyprus. Librarians are usually happy to see that the data they produce is appreciated by Wikidata users, who reconcile them with Wikidata and use them as references; in most cases, they also appreciate mistake reports. The most frequent issue I encounter is that libraries are often severely understaffed. Only in a very few cases have I succeeded in persuading libraries to introduce editing Wikidata items into their cataloguing routine.

In your years of reconciling libraries’ catalogues with Wikidata, what would you say are the top three issues and challenges faced by editors like yourself and by libraries? 

For libraries, I have already said the most relevant one: too few librarians involved in managing the authority file. 

Another issue for libraries: once librarians have added good data on Wikidata, an integrated library system (ILS) usually doesn’t offer any display option which shows data taken from Wikidata to the readers of the catalogue. Only a few libraries have managed to create their own ways of displaying data from Wikidata. 

On the Wikidata users’ side, surely the most relevant problem is failure in data round-tripping: contacting libraries to report and correct mistakes. Many libraries don’t show any contact option; others are very slow in solving issues because they are understaffed. Inefficient data round-tripping also severely damages Wikidata’s own data curation: constraint violation reports are flooded with unsolvable mistakes in external databases and are thus much less effective at surfacing solvable mistakes in Wikidata itself.

What advice would you give to Wikidata editors and librarians to overcome some of these challenges? 

Having too few librarians dedicated to authority files is often caused by conceiving of authority files as internal materials used only by cataloguers, instead of as public materials potentially useful for readers of the catalogues. Good-quality authority records are crucial to putting the authority files into the network of Linked Open Data. Spreading this new conception of authority files is probably the best way to increase the number of librarians dedicated to their improvement, which would then speed up the workflow concerning mistake reports coming from Wikidata users. 

To convince librarians to work directly on Wikidata items, we have to offer them good ways to display in their catalogues the data they add to Wikidata. These ways are presently very rare and not standardized: we should convince ILS producers to offer to each library the option of displaying selected statements from Wikidata in the public interface of authority records, once an authority record is connected to a Wikidata item (see https://www.wikidata.org/wiki/User:Bargioni/AuthorityBox_SBN.js as an example of a Wikidata-generated AuthorityBox).

Let’s say I’m reconciling a library’s catalogue with Wikidata via Mix’n’Match, and I discover issues from the library’s catalogue. What is the best way to reach out to them? Do I tell them, “Dear National Library X, there is a problem with entity 123 in your catalogue. Please fix it!”? 🙂 

In fact, I usually write messages like this! But of course, writing a message presupposes that the library has made contact information available. (Often this is not the case.) If an answer comes relatively soon, and the librarian seems helpful, I reply explaining that I noticed the mistake(s) during my Wikidata activity and asking if they are interested in establishing some sort of deeper collaboration to reconcile their authority file with Wikidata. This method was effective on many occasions.

Lastly, would you like to say anything to the community of Wikidata editors like yourself, as a motivation to everyone, when the going gets tough? 

Continue insisting and someone will listen to you, sooner or later!


Thank you so much, Epìdosis, for your contributions and inspiring insights.

Wikipedia is here to stay

Tuesday, 16 January 2024 19:43 UTC

This week, Wikipedia celebrates its twenty-third birthday. In that time, the online encyclopedia has become one of the top ten most popular websites in the world, with over 15 billion visits a month. Despite the changes to the internet over the last two decades, from Web 1.0 to our current era of the world wide web, from smartphones to the rise of artificial intelligence, Wikipedia continues to thrive.

Almost four years ago, I joined the Wikimedia Foundation as the Director of Machine Learning. I’ve worked in data and machine learning my entire career. My background and skill set are in data, computation, and algorithms — understanding them and using them.

Over the last four years, the thing that stands out so clearly in my mind is that Wikipedia is built by people and for the people, and it is here to stay. 

The ubiquity of Wikipedia can make it easy to forget that behind every fact, every image, and every word in every article is a person — a person with a life, with family, friends, and pressures on their time. 

People are why Wikipedia continues to persist. 

For over twenty years and multiple times a second as you read this, thousands of people are spending their time building a global commons of freely available, neutral, reliable information for the world. That is why, despite the rapid changes the internet has gone through, the online encyclopedia remains relevant.

The internet is not static. It is an evolving set of technologies, cultures, and norms. And just like it has since its start, Wikipedia will evolve with the internet. For example, Wikipedia volunteers have used machine learning for years to help detect vandalism, translate articles, and provide a better experience for editing.

These efforts will only get more effective and efficient. The Wikimedia Foundation will continue to build new AI tools making Wikipedia better – more reliable and accessible to everyone. 

One thing that will never change is our fundamental belief that Wikipedia comes from its editors. For the foreseeable future, a large group of humans working together — debating, discussing, fact-checking each other, even when using AI to boost their efforts — will remain the best way to create reliable knowledge.

The news is filled with stories on advances in artificial intelligence. There are ways of interacting with information on the web that people would not have dreamed of a few years ago. But as the internet is flooded with AI-generated content (some of it good, much of it bad) the value of an individual human volunteer, spending their evenings after the kids are asleep or after they get off from their job, building and improving the knowledge commons that is Wikipedia, will only become more important.

There will be something after this artificial intelligence boom, and something else after that. Still, the work of Wikipedia and its volunteers to build a reliable resource of information freely for the world will continue. Wikipedia is here to stay.

Happy birthday Wikipedia, and cheers to many more ahead.

Chris Albon is the Director of Machine Learning at the Wikimedia Foundation. You can follow him at @chrisalbon.


Episode 154: Srishti Sethi

Tuesday, 16 January 2024 17:34 UTC

🕑 52 minutes

Srishti Sethi (also known as SrishAkaTux) is a developer advocate in the Language Team at the Wikimedia Foundation. Prior to the mass reorganization at the Wikimedia Foundation in summer 2023, she was part of the Technical Engagement team.

Links for some of the topics discussed:

Web Perf Hero: Máté Szabó

Tuesday, 16 January 2024 05:30 UTC

MediaWiki is the platform that powers Wikipedia and other Wikimedia projects. There is a lot of traffic to these sites. We want to serve our audience in a way that they get the best experience and performance possible. So efficiency of the MediaWiki platform is of great importance to us and our readers.

MediaWiki is a relatively large application with 645,000 lines of PHP code in 4,600 PHP files, and growing! (Reported by cloc.) When you have as much traffic as Wikipedia, working on such a project can create interesting problems. 

MediaWiki uses an “autoloader” to find and import classes from PHP files into memory. In PHP, this happens on every single request, as each request gets its own process. In 2017, we introduced support for loading classes from PSR-4 namespace directories (in MediaWiki 1.31). This mechanism involves checking which directory contains a given class definition.

Problem statement

Kunal (@Legoktm) noticed that after MediaWiki 1.35, wikis became slower due to spending more time in fstat system calls. Syscalls make a program switch to kernel mode, which is expensive.

We learned that our Autoloader was the one doing the fstat calls, to check file existence. The logic powers the PSR-4 namespace feature, and actually existed before MediaWiki 1.35. But it only became noticeable after we introduced the HookRunner system, which loaded over 500 new PHP interfaces via the PSR-4 mechanism.

MediaWiki’s Autoloader has a class map array that maps class names to their file paths on disk. PSR-4 classes do not need to be present in this map. Before introducing HookRunner, very few classes in MediaWiki were loaded via PSR-4. The new hook files leveraged PSR-4, exposing many calls to file_exists() for PSR-4 directory searching on every request. This adds up pretty quickly, thereby degrading MediaWiki performance.

See task T274041 on Phabricator for the collaborative investigation between volunteers and staff.

Solution: Optimized class map

Máté Szabó (@TK-999) took a deep dive and profiled a local MediaWiki install with php-excimer and generated a flame graph. He found that about 16.6% of request time was spent in the Autoloader::find() method, which is responsible for finding which file contains a given class.

Figure 1: Flame graph by Máté Szabó.

Checking for file existence during PSR-4 autoloading seems necessary because one namespace can correspond to multiple directories that promise to define some of its classes. The search logic has to check each directory until it finds a class file. Only when the class is not found anywhere may the program crash with a fatal error.

Máté avoided the directory searching cost by expanding MediaWiki’s Autoloader class map to include all classes, including those registered via PSR-4 namespaces. This solution makes use of a hash-map, where each class maps to one and only one file path on disk, making it a 1-to-1 mapping.

This means the Autoloader::find() method no longer has to search through the PSR-4 directories. It now knows upfront where each class is, by merely accessing the array from memory. This removes the need for file existence checks. The approach is similar to the autoloader optimization flag in Composer.
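To make the difference concrete, here is a simplified sketch, not MediaWiki’s actual Autoloader code, contrasting the two strategies: the PSR-4 path pays a potential file_exists() syscall per candidate directory, while the class-map path is a single in-memory array lookup.

```php
<?php
// Simplified sketch of the two lookup strategies -- not MediaWiki's real code.

// PSR-4 style: probe each registered directory until a file is found.
// Every file_exists() call below is a potential fstat syscall.
function findViaPsr4( string $class, array $psr4Dirs ): ?string {
	foreach ( $psr4Dirs as $prefix => $dirs ) {
		if ( str_starts_with( $class, $prefix ) ) {
			$relative = str_replace( '\\', '/', substr( $class, strlen( $prefix ) ) ) . '.php';
			foreach ( $dirs as $dir ) {
				$path = "$dir/$relative";
				if ( file_exists( $path ) ) { // syscall on every probe
					return $path;
				}
			}
		}
	}
	return null;
}

// Class-map style: one hash lookup in an array already held in memory;
// no filesystem access at all.
function findViaClassMap( string $class, array $classMap ): ?string {
	return $classMap[$class] ?? null;
}
```

The class map trades a larger generated array for the elimination of per-request filesystem probing, which is the same trade Composer’s optimized autoloader makes.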


Impact

Máté’s optimization significantly reduced response time by optimizing the Autoloader::find() method. This is largely due to the elimination of file system calls.

After deploying the change to MediaWiki appservers in production, we saw a major shift in response times toward faster buckets: a ~20% increase in requests completed within 50ms, and a ~10% increase in requests served under 100ms (T274041#8379204).

Máté analyzed the baseline and classmap cases locally, benchmarking 4800 requests, controlled at exactly 40 requests per second. He found latencies reduced on average by ~12%:

Table 1: Difference in latencies between baseline and classmap autoloader.
Latencies | Baseline | Full classmap
p50 (median) | 26.2ms | 22.7ms (~13.3% faster)
p90 | 29.2ms | 25.7ms (~11.8% faster)
p95 | 31.1ms | 27.3ms (~12.3% faster)

We reproduced Máté’s findings locally as well. On the Git commit right before his patch, Autoloader::find() really stands out.

Figure 2: Profile before optimization.
Figure 3: Profile after optimization.

NOTE: We used ApacheBench to load the /wiki/Main_Page URL from a local MediaWiki installation with PHP 8.1 on an Apple M1. We ran it both in a bare-metal environment (PHP built-in webserver, 8 workers, no APCu) and in MediaWiki-Docker. We configured our benchmark to run 1000 requests with 7 concurrent requests. The profiles were captured using Excimer with a 1ms interval. The flame graphs were generated with Speedscope, and the box plots were created with Gnuplot.

In Figures 4 and 5, the “After” box plot has a lower median than the “Before” box plot, which means a reduction in latency. Also, the standard deviation in the “After” scenario shrank, which indicates that responses were more consistently fast (not only on average). This increases the percentage of our users who have an experience very close to the average response time of web requests. Fewer users now experience an extreme case of web response slowness.

Figure 4: Boxplot for requests on bare metal.
Figure 5: Boxplot for requests on Docker.

Web Perf Hero award

The Web Perf Hero award is given to individuals who have gone above and beyond to improve the web performance of Wikimedia projects. The initiative is led by the Performance Team and started mid-2020. It is awarded quarterly and takes the form of a Phabricator badge.

Read about past recipients at Web Perf Hero award on Wikitech.


Further reading

Tech News issue #3, 2024 (January 15, 2024)

Monday, 15 January 2024 00:00 UTC

Tech News: 2024-03

weeklyOSM 703

Sunday, 14 January 2024 11:45 UTC

02/01/2024-08/01/2024

Lead picture: Eclipse Map [1] | Copyright © Time and Date AS | map data © OpenStreetMap contributors

Mapping

  • Using Pascal Neis’ website OSM-notes, Vitor George has created the application ‘Oldest OSM Notes by Country’ to find the oldest notes per country and to help fix them in a meaningful way.
  • Brian Sperlongano, aka ZeLonewolf, presented his Wikidata data quality assurance script, which detects issues with US administrative boundaries by comparing them with Wikidata and US Census Bureau data. The tool checks boundaries with admin_level values 7–9 and CDPs with boundary=census. The results are updated daily and are sorted by state. The author is looking for feedback and corrections for the tool. He pointed out that some of the problems may be Wikidata-related.
  • Requests for comments have been made on these proposals:
  • Voting on emergency=disaster_response, for consistent worldwide tagging of disaster response service stations providing emergency response for civilians during or after a disaster, is open until Thursday 18 January.

Community

  • Alper shared how he is mapping the paths of unknown parts of a forest with high definition GNSS and Manjaro Linux running on his tablet. He used QGIS’s GPS live tracking feature to record data directly from his GNSS receiver.
  • Justo Gómez Vaello, a Geomatics and Surveying Engineer from Spain, is the UN Mapper of the Month for January 2024.
  • The very active OSM community in Fukushima discusses each issue of the Japanese version of weeklyOSM online and then publishes the discussion on YouTube. An approach worth imitating.

OpenStreetMap Foundation

  • The OSMF Operations Working Group announced on 10 January that OpenStreetMap had discontinued the Facebook login feature following some new requirements. The working group was able to fix the access problem a few days later. In the meantime, the approximately five hundred thousand OpenStreetMap user accounts affected by the shutdown could still be accessed by using the password reset feature.
  • The OSMF virtual event Local Chapters and Communities Congress 2024 (#LCCC24) now has a date: Saturday 2 March 13:00 UTC to 16:00 UTC. More information can be found on the OSM Wiki.

Local chapter news

  • Minh Nguyễn has compiled an excellent resource on the recently updated Manual on Uniform Traffic Control Devices for Streets and Highways (MUTCD), the governing guide for traffic signs in the United States.
  • FOSSGIS e.V. has welcomed the new year and provided a review and outlook of the organisation’s work.
  • Sawan Shariar announced the important dates for the OSM Bangladesh Election 2024, with the election happening on Friday 26 January. Additionally, he gave an overview of the election process, how to nominate a candidate, and how to get involved.
  • The 20th anniversary of OpenStreetMap will be celebrated at the State of the Map Europe 2024. OpenStreetMap Poland has the honour of hosting this event, which will be held in Łódź on 18 to 21 July. Volunteers are needed to help organise and run this conference.

Events

  • The FOSSGIS Conference 2024 will take place in Hamburg on 20 to 23 March. Registration is currently open. An ‘unconference’, to be held on OSM Saturday, is being organised via the wiki. Helping hands are needed for the organisation of the conference.
  • Shravan has invited you to the ‘End of Winter Editing Party’, which will be held at TomTom’s HQ on De Ruijterkade 154, in Amsterdam on Thursday 1 February.

OSM research

  • A study has revealed that human infrastructure is increasingly encroaching on sandy coasts worldwide, with significant portions of these areas having less than 100 metres of infrastructure-free space, posing serious risks to ecosystem services, and necessitating the integration of nature protection into spatial planning policies.
  • Pamela Lattanzi et al. have conducted a fishing zone participatory mapping project in small-scale fisheries of the Campania region of Italy. The mapping was done by printing OpenStreetMap map tiles with additional features such as bathymetry contour lines, on A4 paper at 1:100,000 and 1:250,000 scales.
  • This study introduced the MERIT-Plus dataset, enhancing the original MERIT river network by delineating endorheic and exorheic drainage basins, enabling more accurate hydrological modeling and an improved understanding of global and regional water balances, especially in endorheic regions, using vector polygons from OpenStreetMap.
  • J. Rafael Verduzco Torrest and David Philip McArthur analysed the public transport accessibility indicators for urban and regional services in Great Britain. This analysis was conducted using OpenStreetMap road and pedestrian network data.
  • Wanwan Li utilised data from OpenStreetMap to develop a realistic virtual world synthesising technique.

Humanitarian OSM

  • Ann-Jinette Hess reported in the BORGEN Magazine about the Missing Maps Project and how to participate.

Maps

  • Mark Stosberg has created a report about the most dangerous intersections for pedestrians in Bloomington, Indiana, where he relied on OpenStreetMap data.

OSM in action

  • [1] These two OpenStreetMap-based maps show the path of the total solar eclipse happening on Monday 8 April. It will cross North America from Mexico’s Pacific coast to Canada’s Atlantic coast:

    Hopefully there will be clear skies to (cautiously) observe the eclipse.

  • Cycling Weekly discussed the challenges cyclists face with satnav route planning.
  • Mastodon Near Me is a uMap showing many global Mastodon servers by country and region. The list is maintained by Jaz.
  • ECF’s (European Cyclists’ Federation) report on cycling infrastructure used OpenStreetMap as the map background.
  • utzer has used uMap and OpenStreetMap to create their own personalised itinerary map.
  • GPS-tour.info is a website that offers tracks for various outdoor activities. Users can search for tours, upload their own tracks, and share their experiences. The website is based on OpenStreetMap.

Open Data

  • The SEMIC community has published guidelines on how to use DCAT-AP for High-Value Datasets, on GitHub. The DCAT Application profile for data portals in Europe (DCAT-AP) is a specification based on the Data Catalogue vocabulary (DCAT) for describing public sector datasets in Europe.
  • The FOKUS (Fraunhofer Institute for Open Communication Systems) is offering an ‘Open Data Literacy’ course. This free course is online, can be customised, and teaches the basics of managing open data and metadata.

Software

  • Organic Maps distributed through the Google Play store has temporarily made login via OSM accounts unavailable due to Google’s restrictions. The developer mentioned this on GitHub. It is not yet clear whether this issue will affect other mobile applications that can log in via OSM accounts for editing.
  • The GeoServer roadmap for 2024 is focusing on improving performance, supporting new data formats and data sources, and expanding functionality for the analysis and visualisation of geodata.
  • Harald Hetzner has released a JOSM plugin for displaying local terrain elevation, elevation contour lines, and hillshade based on SRTM data.
  • Harald Hetzner has released another JOSM plugin for opening all the GPX tracks in the current map view from a specified directory.
  • In his diary, Jiri Vlasak summarised the functionality and scope of the project, the work done in 2023, and the upcoming work in 2024 for Divide and map. Now. ‘The damn project’ helps mappers divide a big area into smaller squares, so that people can map together.
  • Organic Maps version 2024.01.09-5 has been released. In this version, support for Android Auto has entered the public preview stage.

Programming

  • Dustin Carlino wrote about his December walking-focused project where he’s developing a crossing tool (Severance Snape), which visualises places where a pedestrian naturally wants to cross on some ‘desire line’, but can’t and has to take a detour.
  • Robert Riemann published his project Jekyll MapLibre, which is able to display OpenStreetMap-based maps using MapLibre GL JS in Jekyll.

Releases

  • zyphlar has published a collection of stickers and posters for Organic Maps on GitHub.

Did you know …

  • … the satellite navigator osmin for Android and Linux devices?
  • … the GNOME desktop map application Maps, which is based on OpenStreetMap?
  • … Haptické Mapy.cz from Blind Friendly Maps and Touch Mapper? This service supports the creation of tactile maps for the blind. You can print them out yourself or, if you live in Europe, have them printed for you.
  • … about the various websites that can show you an animation of public transport systems around the world?
    • TRAVIC – Transit Visualisation Client: Uses static timetable data from various sources.
    • Swiss Railway Network: Uses timetable data from Swiss Federal Railways (SBB), complete with animated train movements.
    • Mini Tokyo 3D: Uses data from Public Transportation Open Data Center, Japan (公共交通オープンデータセンター ), complete with 3D train movement animation.
    • Treinposities.nl: Shows train movements around the Netherlands using data from NDOV, iRail, and NS API.
    • Carto GRAOU: Uses data from SNCF Réseau, France.
    • rasp.yandex.ru: Displays train and bus data in Russia.
    • Spacing out with train: Uses data from Korea Transport Database (KTDB), showing 3D train animation like Mini Tokyo 3D above.
  • … the Wikidata SPARQL example queries showing how to link Wikidata with OpenStreetMap by using the OSM Sophox service?

OSM in the media

  • The Drive talked about Curvature. Curvature is a programme that analyses the geometry of OSM roads and generates a map of the most twisty roads, colour-coded by how many curves they have.
  • Eric van Rees wrote, in Geo Week News, about Marc Prioleau’s views on the possible ways Overture and OpenStreetMap could collaborate.

Other “geo” things

  • Google developed the ‘Flood Forecasting Initiative’, an AI-based system to report and predict riverine flood disasters.
  • Jack-bo1220 has compiled a list of papers, datasets, codes, and pre-trained weights related to remote sensing foundation models.
  • The Japan Aerospace Exploration Agency (JAXA) has released several satellite photos showing the emergence of new coastal land after the Noto Peninsula earthquake on 1 January.
  • Pia Volk has written Deutschlands verschwundene Orte (Germany’s lost places), which is about the history of thirty now-invisible historic locations in Germany, ranging from Bronze Age settlements to symbolic buildings removed for political reasons.
  • TomTom and Microsoft are planning to develop an AI-based conversational automotive assistant.
  • TomTom has presented their new ‘Orbis Maps’, a map based on the standard specification of the Overture Maps Foundation. The map is currently available through an early access process.

Upcoming Events

Where | What | When
Lorain County | OpenStreetMap Midwest Meetup | 2024-01-11
Berlin | 187. Berlin-Brandenburg OpenStreetMap Stammtisch | 2024-01-11
Zürich | OSM-Stammtisch Zürich | 2024-01-11
Bochum | OSM-Stammtisch Bochum | 2024-01-11
Ustaritz | Journée OSM Pays Basque Sud Landes | 2024-01-13
Hannover | OSM-Stammtisch Hannover | 2024-01-13
København | OSMmapperCPH | 2024-01-14
Niort | Rencontre OSM Poitou à Niort (79) | 2024-01-14
Hilversum | OSGeo.nl OSM-NL QGIS-NL New Year’s Party | 2024-01-14
Budapest | OSM Hungary Meetup | 2024-01-15
Grenoble | Réunion du groupe local de Grenoble | 2024-01-15
Lyon | Réunion du groupe local de Lyon | 2024-01-16
City of Edinburgh | OSM Edinburgh pub meetup | 2024-01-16
Lüneburg | Lüneburger Mappertreffen | 2024-01-16
Online | Map-py Wednesday | 2024-01-17
Karlsruhe | Stammtisch Karlsruhe | 2024-01-17
Gent | OpenStreetMap meetup & MapComplete workshop at TomTom | 2024-01-18
Washington | Mapping USA | 2024-01-19 – 2024-01-20
Bengaluru | OSM Bengaluru Mapping Party | 2024-01-20
City of Fremantle | Social Mapping Saturday: Fremantle 2024 | 2024-01-20
Hai Buluk | OSM Africa Monthly Mapathon: Map South Sudan | 2024-01-20
Bremen | Bremer Mappertreffen | 2024-01-22
San Jose | South Bay Map Night | 2024-01-24
Online | iD Community Chat | 2024-01-24
Online | OSMF Engineering Working Group meeting | 2024-01-24
London Borough of Islington | Geomob London | 2024-01-24
Lübeck | 138. OSM-Stammtisch für Lübeck und Umgebung | 2024-01-25
Wien | 70. Wiener OSM-Stammtisch | 2024-01-25

Note:
If you would like to see your event here, please add it to the OSM calendar. Only events entered there will appear in weeklyOSM.

This weeklyOSM was produced by LuxuryCoop, PierZen, SeverinGeo, Strubbl, TheSwavu, barefootstache, derFred, mcliquid, renecha, rtnf.
We welcome link suggestions for the next issue via this form and look forward to your contributions.

This Month in GLAM: December 2023

Saturday, 13 January 2024 05:36 UTC

This fall, Howard University professor Msia Clark taught a course on “Black Women and Pop Culture”, which focuses on Black women’s representations. So what could be more perfect than to ask her students to improve the representation of Black women on Wikipedia, the website the world visits when it wants to know more about a topic?

“Women of color are underrepresented throughout Wikipedia,” Dr. Clark explains. “I designed the course with a goal of our students helping to improve the representations of Black women on Wikipedia.”

Mission accomplished, according to Zainab Ahmed, one of the students in the course.

“Writing a biography of a Black woman in STEM was very meaningful to me because it was allowing Black women in a field that is mainly dominated by White men to be acknowledged,” she says. “It also provides more access to Black girls who are interested in STEM to be able to research other people like in that field. It also stops downplaying the role Black women have played in the STEM field.”

Zainab and her classmates worked on biographies as part of an initiative to increase the diversity of Wikipedia’s STEM biographies, funded by the Broadcom Foundation. Wiki Education’s staff provided support for students as they researched and wrote the biographies. Zainab says as she researched the contributions of the woman she chose, she was inspired and excited to learn more about her contributions.

And, of course, she learned about editing Wikipedia. While she’d made a few edits before, she had never dived into writing a full article. She enjoyed the formatting tasks, deciding what information went into subheadings like “Early life” or “Career”.

“In comparison to a traditional term paper I prefer this because it is more research yet less restrictive. I did not have a word limit to meet, I just had to make sure if it was objective and factual,” Zainab says. “I felt like a true editor and writer.”

These learnings are exactly what Dr. Clark wanted her students to get out of the class.

“They definitely learned about the individuals they wrote about. They also learned how underrepresented women of color are on Wikipedia and the implications of that underrepresentation. They saw their work as part of an effort to help improve that representation,” she says. “It allows students to understand how their work contributes to a database that is relied on to provide information to users around the world. It also holds them accountable for their work. It’s not just what your professor thinks, but if your work [is] a valuable contribution to public knowledge.”

Zainab says she felt that accountability. While she was initially “intimidated,” she says, now that she’s learned more about Wikipedia, she wants to keep editing.

“I will definitely continue soon now that I have a more informed understanding of how to do a proper Wikipedia article,” she says.

Visit teach.wikiedu.org to learn how to incorporate a Wikipedia assignment into your own course.

Header image of students in the class courtesy Msia Clark, all rights reserved.

The European Parliament’s resolution offers the opportunity to clarify Wikimedia Europe’s position on the practice of geo-blocking and its impact on Wikipedia and its sister projects.

Introduction

On Tuesday 13 December, Parliament adopted a very important own-initiative report on the implementation of the 2018 Geo-blocking Regulation in the digital single market (376 votes in favor, 111 against and 107 abstentions). This is the legislation that aims to address unjustified geo-blocking and other forms of discrimination based on customers’ nationality, place of residence or place of establishment within the EU. The final goal is indeed to ease online cross-border transactions. This report is important because it paves the way for future changes to the Geo-blocking Regulation, even though it cannot bind the next Parliament, which, in principle, could take a new position on the topic.

In this sense, Article 9 of the Regulation foresees a revision clause for evaluating the implementation of the law in view of the possible extension of its scope to the online supply of copyright-protected content, including audiovisual content¹. Indeed, when the Regulation was adopted, an essential element of the political deal was the possibility to continue to geo-block such content (as provided for in Article 4(1)(b)), while fully excluding audiovisual content from the scope of the regulation – the exclusion is foreseen in Article 1(3), and recital 8 further specifies that “(…) Audiovisual services, including services the principle purpose of which is the provision of access to broadcasts of sports events and which are provided on the basis of exclusive territorial licenses, are excluded from the scope of this Regulation (…)”.

First revision and consumer perception

On one hand, audiovisual content, being heavily copyrighted, is subject to strict geo-blocking in order to preserve the territorial character of the licensing system, which, in turn, it is claimed, guarantees the financial sustainability of audiovisual productions and cultural diversity. On the other hand, consumers have very high expectations of being able to access such content across borders, at least within the EU. Ending the practice of geo-blocking for online audiovisual and copyrighted content is therefore a key open question with regard to achieving a true European digital single market.

This aspect also results from the first short-term review of the Regulation that was published in November 2020, where the Commission highlighted that despite the fact that consumers would like to have cross-border access to audiovisual content, its online accessibility across EU countries is very limited, i.e. only 14% on average. And the latter circumstance, continues the Commission, can be explained not only in light of the peculiar financing model of audiovisual productions, but also because service providers can put in place “commercial practices segmenting the single market along national borders”. In addition, the CJEU explicitly pointed out, in its judgment of the Case C-132/19 Groupe Canal + v Commission, that “according to the case-law of the Court of Justice, an agreement which might tend to restore the partitions between national markets is liable to frustrate the Treaty’s objective of achieving the integration of those markets through the establishment of a single market. Therefore, agreements which are aimed at partitioning national markets according to national borders or make the interpenetration of national markets more difficult may be regarded, in the light of the objectives they pursue and the economic and legal context of which they form part, as agreements whose object is to restrict competition within the meaning of Article 101(1) TFEU.”

In other words, if we want to achieve a truly connected digital Europe where consumers can smoothly participate in culture and access goods and services without consideration of national borders, it is apparent that the Commission should put forward a legislative proposal to extend the scope of the Geo-Blocking Regulation to electronically supplied copyright-protected and audiovisual content.

A missed opportunity & the advantages of a truly digital single market

From our perspective, Parliament missed a unique opportunity to take such a progressive stance reflecting citizens’ expectations. This is despite the fact that the text adopted by the Internal Market and Consumer Protection Committee (IMCO) clearly went in this direction, and that Parliament, already in 2021, held a plenary debate where it formally asked the Commission to adopt a legislative proposal in this sense.

The IMCO committee recalled  “the obligation for the Commission to report on the evaluation of the Geo-blocking Regulation, and recommend[ed] accompanying it with a comprehensive revision of the Geo-blocking Regulation by 2025 the latest, with a particular focus on the inclusion of audiovisual services in the scope of the Regulation”. Further, the IMCO Committee had called on the Commission “to fund a selection of emblematic European films to be made available online in all countries and languages via the Creative Europe MEDIA programme”. At the same time, it expressed its concerns about the fact that “geo-blocking also occurs in the case of audiovisual productions funded or co-funded by the EU MEDIA programme and is of the opinion that whenever EU funds are involved in the financing of audiovisual content, no EU citizen should be deprived of access to it”. 

The final report doesn’t contain all these references; on the contrary, it now includes a specific recital (letter I) stating that “maintaining geo-blocking for copyrighted works and protected subject matter is one of the major tools for guaranteeing cultural diversity” and the corresponding paragraph 24 specifying: “considers that the inclusion of audiovisual services in the scope of the Geo-blocking Regulation would result in a significant loss of revenue, putting investment in new content at risk, while eroding contractual freedom and reducing cultural diversity in content production, distribution, promotion and exhibition; emphasises that such an inclusion would result in fewer distribution channels, ultimately driving up prices for consumers”. One could legitimately ask: cui prodest (who benefits from) the approval of all these amendments?

Certainly not European citizens, who are looking forward to seamlessly accessing cultural works and audiovisual content across borders, thus taking advantage of a true European digital single market. Extending the Regulation would make it easier for them to access information, knowledge and cultural content within the EU, thus substantiating their fundamental right to freedom of expression and information, as enshrined in Article 11 of the Charter of Fundamental Rights.

European citizens cannot exclusively be seen as consumers, especially if one thinks of the concrete negative effects that such an exclusion has on the possibility of achieving a true European public sphere. And everybody knows how desperately Europe needs it, as it will help the EU to reduce the gap between its citizens and Institutions. 

An idea of what this EU public sphere might look like is offered by the pioneering project Wiki Loves Broadcast. Launched by the German Wikipedia community in 2016, the project is based on the idea that content significantly funded with public money should be freely usable by the public under free licences (e.g. CC BY-SA 4.0). In the end, such content is a commons, as the public has already paid for it. Based on this very idea, in 2019 Terra X (ZDF), one of the most prestigious German TV documentary programs, started releasing short videos under a free Creative Commons licence (mostly CC BY 4.0). Since then, 392 videos have been released and uploaded to the free media database Wikimedia Commons. Most of those videos have been featured in Wikipedia articles, with a total of over 98 million views. This means that all EU citizens can freely access and use this high-quality content. Following this example, in 2022 the Tagesschau (ARD) also began releasing short animated clips under a free Creative Commons licence (CC BY-SA 4.0). In total, 90 videos have been released and uploaded to Wikimedia Commons, amassing over 1.6 million views so far.

These projects show concretely how the EU can achieve a true European digital public sphere. Including copyright-protected works as well as audiovisual media services within the scope of the Regulation would definitely ease and encourage the establishment of this kind of partnership with other public, or even private, broadcasters, and unlock the potential of such collaborations and public-private partnerships.

Conclusion

We believe that European citizens should no longer suffer an anachronistic limitation of their ability to access knowledge, information and cultural content. Removing it is the best antidote for bringing the EU closer to its citizens, fighting disinformation, spreading quality content and, ultimately, preserving cultural diversity. Lawmakers cannot abdicate their role here, also because, if they do, the Court of Justice has already shown that it is ready to step in and fill the gap.

1. This appears evident from the Statement of the Commission accompanying the Regulation, where it is specified that “As part of the evaluation, it will also perform a substantive analysis of the feasibility and potential costs and benefits arising from any changes to the scope of the Regulation, (…). The Commission will also carefully analyse whether in other sectors, including those not covered by Directive 2006/123/EC which are also excluded from the scope of the Regulation pursuant to Article 1(3) thereof, such as services in the field of transport and audiovisual services, any remaining unjustified restrictions based on nationality, place of residence or place of establishment should be eliminated. If in the evaluation the Commission comes to the conclusion that the scope of the Regulation needs to be amended, the Commission will accompany it with a legislative proposal to that effect.”

Imagining Future MediaWiki

Wednesday, 10 January 2024 11:29 UTC

 As we roll into 2024, I thought I'd do something a little different on this blog.

A common product vision exercise is to ask someone, imagine it is 20 years from now, what would the product look like? What missing features would it have? What small (or large) annoyances would it no longer have?

I wanted to do that exercise with MediaWiki. Sometimes it feels like MediaWiki is a little static. Most of the core ideas were implemented a long time ago. Sure, there is a constant stream of improvements, some quite important, but the core product has been fixed for quite some time now. People largely interact with MediaWiki the same way they always have. When I think of new fundamental features in MediaWiki, I think of things like Echo, Lua and VisualEditor, which can hardly be considered new at this point (in fairness, DiscussionTools, which is quite recent, should maybe count as a new fundamental feature). Alternatively, I might think of things that are on the edges. Wikidata is a pretty big shift, but it’s a separate thing from the main experience, and also over a decade old at this point.

I thought it would be fun to brainstorm some crazy ideas for new features of MediaWiki, primarily in the context of large sites like Wikipedia. I’d love to hear feedback on whether these ideas are so crazy they might work, or just crazy. Hopefully it inspires others to come up with their own crazy ideas.

What is MediaWiki to me?

Before I start, I suppose I should talk about what I think the goals of the MediaWiki platform are. What value should MediaWiki provide as a product, particularly in the context of Wikimedia-type projects?

Often I hear Wikipedia described as a top 10 document hosting website combined with a medium scale social network. While I think there is some truth to that, I would divide it differently.

I see MediaWiki as aiming to serve 4 separate goals:

  • A document authoring platform
  • A document viewing platform (i.e. Some people just want to read the articles).
  • A community management tool
  • A tool to collect and disseminate knowledge

The first two are pretty obvious. MediaWiki has to support writing Wikipedia articles. MediaWiki has to support people reading Wikipedia articles. While I often think the difference between readers and editors is overstated (or perhaps counter-productive as hiding editing features from readers reduces our recruitment pool), it is true they are different audiences with different needs.

What I think is a bit under-appreciated sometimes, but just as important, is that MediaWiki is not just about creating individual articles; it is about creating a place where a community of people dedicated to writing articles can thrive. This doesn’t just happen: at the scale of tens of thousands of people, all sorts of processes and bureaucracy are needed for such a large group to work together effectively. While not all of that lives in MediaWiki, the bulk of it does.

One of my favourite things about the wiki-world is that it is a socio-technical system. The software does not prescribe specific ways of working, but gives users the tools to create community processes themselves. I think this is one of our biggest strengths, which we must not lose sight of. However, we also shouldn’t totally ignore this sector and assume the community is fine on its own; we should still be on the lookout for better tools to allow the community to make better processes.

Last of all, MediaWiki aims to be a tool to aid in the collection and dissemination of knowledge¹. Wikimedia’s mission statement is: “Imagine a world in which every single human being can freely share in the sum of all knowledge.” No one site can do that alone, not even Wikipedia. We should aim to make it easy to transfer content between sites. If a 10-billion-page treatise on Pokemon is inappropriate for Wikipedia, it should be easy for an interested party to set up their own site to house the knowledge that does not fit in existing ones. We should aim to empower people to do their own thing if Wikimedia is not the right venue. We do not have a monopoly on knowledge, nor should we.

As anyone who has ever tried to copy a template from Wikipedia can tell you, making forks or splits from Wikipedia is easy in theory but hard in practice. In many ways I feel this is the area where we have most failed to meet the potential of MediaWiki.

With that in mind, here are my ideas for new fundamental features in MediaWiki:

As a document authoring/viewing platform

Interactivity

Detractors of Wikipedia have often criticized how text-based it is. While there are certainly plenty of pictures to illustrate, Wikipedia has typically been pretty limited when it comes to more complex multimedia. This is especially true of interactive multimedia. While I don’t have first-hand experience, in the early days it was often negatively compared to Microsoft Encarta on that front.

We do have certain types of interactive content, such as videos, slippy maps and 3D models, but we don't really have any options for truly interactive content. For example, physics concepts might be better illustrated with "interactive" experiments, e.g. where you can push a pendulum with a mouse and watch what happens.

One of my favourite illustrations on the web is this one of an Enigma machine. The Enigma machine, for those not familiar, was a mechanical device used in World War II to encrypt secret messages. The interactive illustration shows how an inputted message goes through various wires and rotates various disks to give the scrambled output. I think this conveys what an Enigma machine fundamentally is better than any static picture or even video ever could.

Right now there are no satisfactory solutions on Wikipedia for making this kind of content. There was a previous effort to do something in the vein of interactive content in the graph extension, which allowed using the Vega domain-specific language to make interactive graphs. I’ve previously written about how I think that was a good effort but ultimately missed the mark. In short, I believe it was too high level, which caused it to lack the flexibility necessary to meet the needs of users, while also being difficult to build simplifying abstractions on top of.

I am a big believer that instead of making complicated projects that prescribe certain ways of doing things, it is better to make simpler, lower-level tools that can be combined together in complex ways, as well as abstracted over so that users can make simple interfaces (essentially the Unix philosophy). On wiki, I think this has been borne out by the success of Lua scripting in templates. Lua is low level (relative to other wiki interfaces), but users were able to use it to accomplish their goals without MediaWiki developers having to think about every possible thing they might want to do. Users were then able to make abstractions that hid the low-level details in everyday use.

To that end, what I'd like to see, is to extend Lua to the client side. Allow special lua interfaces that allow calling other lua functions on the client side (run by JS), in order to make parts of the wiki page scriptable while being viewed instead of just while being generated.

I did make some early proof-of-concepts in this direction, see https://bawolff.net/monstranto/index.php/Main_Page for a Demo of Extension:Monstranto. See also a longer piece I wrote, as well as an essay by Yurik on the subject I found inspiring.

Mobile editing

This is one where I don't really know what the answer is, but if I imagine MW in 20 years, I certainly hope this is better.

It’s not just MediaWiki; I don’t think any website has really figured out authoring long text documents on mobile.

That said, I have seen some interesting ideas around that I think are worth exploring (none of these are my own ideas).

Paragraph or sentence level editing

This idea was originally proposed about 13 years ago by Jan Paul Posma. In fact, he wrote a whole bachelor’s thesis on it.

In essence, Mobile gets more frustrating the longer the text you are editing is. MediaWiki often works on editing at the granularity of a section, but what about editing at the granularity of a paragraph or a sentence instead? Especially if you just want to fix a typo on mobile, I feel it would be much easier if you could just hit the edit button on a sentence instead of the entire section.

Even better, I suspect that Parsoid makes this a lot easier to implement now than it would have been back in the day.

Better text editing UI (e.g. Eloquent)

A while ago I was linked to a very interesting article by Scott Jenson about the problems with text editing on mobile. I think he articulated the reasons it is frustrating very well, and also proposed a better UI which he called Eloquent. I highly recommend reading the article and seeing if it makes sense to you.

In many ways, we can’t really do this, as it is an Android-level UI, not something we control in a web app. Even if we did manage to build it in a web app somehow, it would probably be a hard sell to ordinary users not used to the new UI. Nonetheless, I think it would be incredibly beneficial to experiment with alternate UIs like these and see how far we can get. The world is increasingly going mobile, and Wikipedia is increasingly getting left behind.

Alternative editing interfaces (e.g. voice)

Maybe traditional text editing is not the way of the future. Can we do something with voice control?

It seems like voice-controlled IDEs are increasingly becoming a thing. For example, here is a blog post about someone who programs with Talon, a voice programming tool. It seems like there are a couple of other options out there; I see Serenade mentioned quite a bit.

A project in this space that looks especially interesting is Cursorless. The demo looked really cool, and I could imagine a power user finding it easier to use a system like this to edit large blobs of wikitext than the normal text editing interface on mobile. Anyways, I recommend watching the demo video to see what you think.

All this is to say, I think we should look really hard at the possibilities in this space for editing MediaWiki from a phone. On-screen keyboards are always going to suck; we might as well look at other options.

As a community building platform

Extensibility

I think it would be really cool if we had “Lua” extensions. Instead of normal PHP extensions, a user would be able to register or upload some Lua code that gets subscribed to hooks and does stuff. In this vision, these extension types would not be able to do anything unsafe like raw HTML, but would be able to do all sorts of things that users normally use JavaScript for.

This could be per user or also global. Perhaps could be integrated with a permission system to control what they can and cannot do.

I'd also like to see a super stable API abstraction layer for these (and normal extensions). Right now our extension API is fairly unstable. I would love to see a simple abstraction layer with hard stability guarantees. It wouldn't replace the normal API entirely, but would allow simpler extensions to be written in such a way that they retain stability in the long term.

Workflows

I think we could do more to support user-created workflows. The wiki is full of user-created workflows and processes. Some are quite complex, others simple: for example, nominating an article for deletion or !voting in an RFC.

Sometimes the more complicated ones get turned into JavaScript wizards, but I think that’s the wrong approach. As I said earlier, I am a fan of simpler tools that can be used by ordinary users, not complex tools that do a specific task but can only be edited by developers and exist “outside” the wiki.

There's already an extension in this area (not used by Wikimedia) called PageForms. This is in the vein of what I am imagining, but I think still too heavy. Another option in this space is the PageProperties extension which also doesn't really do what I am thinking of.

What I would really want to see is an extension of the existing InputBox/preload feature.

As it stands right now, when starting a new page or section, you can give a URL parameter to preload some text, as well as parameters to that text to replace $1 markers.

We also have the InputBox extension to provide a text box where you can put in the name of an article to create with specific text pre-loaded.

I'd like to extend this idea, to allow users to add arbitrary widgets² (form elements) to a page, and bind those widgets to specific parameters to be preloaded.

If further processing or complex logic is needed, perhaps there could be an option to allow the preloaded text to be pre-processed by a Lua module. This would allow complex logic in how the page is edited based on the user’s inputs. If there is one theme in this blog post, it is that I wish Lua could be used for more things on wiki.

I still imagine the user would be presented with a diff view and have to press save, in order to prevent shenanigans where users are tricked into doing something they don't intend to.

I believe this is a very light-weight solution that also gives the community a lot of flexibility to create custom workflows in the wiki that are simple for editors to participate in.
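For what it’s worth, MediaWiki core already fires a hook when an edit form is opened with preloaded text, so the Lua pre-processing step could conceivably hang off it. A rough sketch follows: EditFormPreloadText is the real hook, but the Lua plumbing is entirely hypothetical.

```php
<?php
// Sketch only. EditFormPreloadText is a real MediaWiki hook; the idea of
// handing the text to a community-maintained Lua module is the proposal
// above, so runPreloadModule() is invented for illustration.
class PreloadWorkflowHooks {
	/**
	 * @param string &$text Preloaded wikitext, after $1..$n substitution
	 * @param Title $title The page being created
	 */
	public static function onEditFormPreloadText( &$text, $title ) {
		// Hypothetical: let a Lua module massage the preloaded text based
		// on the target title, before the user sees the edit form.
		$text = self::runPreloadModule( 'Module:PreloadWorkflow', $text, $title );
	}

	private static function runPreloadModule( $module, $text, $title ) {
		// ... invented plumbing: invoke a Scribunto function and return its
		// output. Today you would have to build this bridge yourself.
		return $text;
	}
}
```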

Querying, reporting and custom metadata

This is the big controversial one.

I believe that there should be a way for users to attach custom metadata to pages and do complex queries over that metadata (including aggregation). This is important both for organizing articles as well as organizing behind the scenes workflows.

In the broader MediaWiki ecosystem, this is usually provided by either the SemanticMediaWiki or Cargo extensions. Often in third party wikis this is considered MediaWiki's killer feature. People use them to create complex workflows including things like task trackers. In essence it turns MediaWiki into a no-code/low-code user programmable workflow designer.

Unfortunately, these extensions all scale poorly, preventing their use on Wikimedia. Essentially I dream of seeing the features provided by these extensions on Wikipedia.

The existing approaches are as follows:

  • Vanilla MediaWiki: Category pages, and some query pages.
    • This is extremely limited. Category pages allow an alphabetical list. Query pages allow some limited pre-defined maintenance lists like list of double redirects or longest articles. Despite these limitations, Wikipedia makes great use out of categories.
  • Vanilla MediaWiki + bots:
    • This is essentially Wikipedia’s approach to solving these problems: have programs do queries offsite and put the results on a page. I find this a really unsatisfying solution. A Wikipedian once told me that every bot is just a hacky workaround for MediaWiki failing to meet its users’ needs, and I tend to agree. Less ideologically, the main issue is that it’s very brittle: when bots break, often nobody knows who has access to the code or how it can be fixed. Additionally, they often have significant latency for updates (if they run once a week, the latency is up to 7 days), and ordinary users are not really empowered to create their own queries.
  • Wikidata (including the WDQS SPARQL endpoint)
    • Wikidata is adjacent to this problem, but not quite trying to solve it. It is more meant as a central clearinghouse for facts, not a way to do querying inside Wikipedia. That said Wikidata does have very powerful query features in the form of SPARQL. Sometimes these are copied into Wikipedia via bots. SPARQL of course has difficult to quantify performance characteristics that make it unsuitable for direct embedding into Wikipedia articles in the MediaWiki architecture. Perhaps it could be iframed, but that is far from being a full solution.
  • SemanticMediaWiki
    • This allows adding semantic annotations to articles (i.e. subject-verb-object type relations). It then allows querying using a custom semantic query language. The complexity of the query language makes performance hard to reason about, and it often scales poorly.
  • Cargo
    • This is very similar to SemanticMediaWiki, except it uses a relational paradigm instead of a semantic paradigm. Essentially users can define DB tables. Typically the workflow is template based, where a template is attached to a table, and specific parameters to the template are populated into the database. Users can then use (Sanitized) SQL queries to query these tables. The system uses an indexing strategy of adding one index for every attribute in the relation.
  • DPL
    • DPL is an extension to do complex querying and display using MediaWiki's built in metadata like categories. There are many different versions of this extension, but all of them have potential queries that scale linearly with the number of pages in the database, and sometimes even worse.

I believe none of these approaches really works for Wikipedia. They either do not support complex queries, or they allow overly complex queries with unpredictable performance. I think the requirements are as follows:

  • Good read scalability (by “read” I mean scalability when generating pages, i.e. during “parse” in MediaWiki speak). On Wikipedia, pages are read and regenerated a lot more often than they are edited.
    • We want any sort of query to have very low read latency. Having long pauses waiting for I/O during page parsing is bad in the MediaWiki architecture.
    • Queries should scale consistently. They should be at worst roughly O(log n) in the number of pages on the wiki. If using a relational-style database, we would want the number of rows the DBMS has to look at to be bounded by a fixed maximum.
  • Eventual write consistency
    • It is ok if it takes a few minutes for things using the custom metadata to update after it is written. Templates already have a delay for updating.
    • That said, it should still be relatively quick. On the order of minutes ideally. If it takes a day or scales badly in terms of the size of the database, that would also be unacceptable.
    • Write performance does not have to scale quite as well as read performance, but should still scale reasonably well.
  • Predictable performance.
    • Users should not be able to do anything that negatively impacts site performance
    • Users should not have to be an expert (or have any knowledge) in DB performance or SQL optimization.
    • Limits should be predictable. Timeouts suck: they can vary depending on how much load the site is under and other factors. Queries should either work or not work; their validity should not be run-time dependent. It should be obvious to the user whether their query is acceptable before they try to run it. There should be clear rules about what the limits of the system are.
  • Results should be usable for further processing.
    • e.g. You should be able to use the result inside a lua module and format it in arbitrary ways
  • [Ideally] Able to be isolated from the main database, shardable, etc.
  • Be able to query for a specific page, a range of pages, or aggregates of pages (e.g. Count how many pages are in a range, average of some property, etc)
    • Essentially we want just enough complexity to do interesting user defined queries, but not enough that the user is able to take any action that affects performance.
    • There are some other query types that are more obscure but maybe harder. For example geographic related queries. I don't think we need to support that.
    • Intersection queries are an interesting case, as they are often useful on wiki. Ideally we would support that too.

 

Given these constraints I think the CouchDB model might be the best match for on-wiki querying and reporting.

Much of the CouchDB marketing material is aimed at their local-data eventual-consistency replication story, which is cool and all, but not what I’m interested in here. A good starting point for how their data model works is their documentation on views. To be clear, I’m not necessarily suggesting using CouchDB, just that its data model seems like a good match for the requirements.

CouchDB is essentially a document database based around the ideas of map-reduce. You can make views, which are similar to an index on a virtual column in MySQL. You can also make reduce functions, which calculate some function over the view. The interesting part is that the reduce function is indexed in a tree fashion, so you can efficiently get the value of the function applied to any contiguous range of the rows in logarithmic time. This allows computing aggregations of the data very efficiently. Essentially all the read queries are very efficient. Write queries can potentially be less so, but it is easy to build controls around that. Creating or editing reduce functions is expensive because it requires regenerating the index, but that is expected to be a rare operation, and users can be informed that results may be unreliable until it completes.

In short, the way the CouchDB data model works as applied to MediaWiki could be as follows:

  • There is an emit( relationName, key, data ) function added to Lua. In many ways this is very similar to adding a page to a category named relationName with a sortkey specified by key; data is optional extra data associated with this item. For performance reasons, there may be a (high) limit on the maximum number of emit() calls per page to prevent DB size from exploding.
  • Lua gets a function query( relationName, startKey, endKey ). This returns all pages between startKey and endKey and their associated data. If there are more than X (e.g. 200) pages, only the first X are returned.
  • Lua gets a queryReduced( relationName, reducerName, startKey, endKey ) which returns the reduction function over the specified range. (Main limitation here is the reduce function output must be small in size in order to make this efficient)
  • A way is added to associate a Lua module as a reduce function. Adding or modifying these functions is potentially an expensive operation. However, it is probably acceptable to the user that this takes some time.

All the query types here are efficient. It is not as powerful as arbitrary SQL or semantic queries, but it is still quite powerful. It allows computing fairly arbitrary aggregation queries as well as returning results in a user-specified order. The main slow part is when a reduction function is edited or added, which is similar to how a template used on very many pages can take a while to update. Emitting a new item may also be a little slower than reading, since the reducers have to be updated up the tree (with possible contention on the root node); however, that is a much rarer operation, and users would likely see it as similar to current delays in updating templates.
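To make the performance claim concrete, here is a toy, self-contained sketch of the tree-indexed reduce idea. It is not CouchDB or MediaWiki code: items are kept in key order, and an associative (here also commutative) reduce function is pre-computed over a segment tree, so any contiguous key range can be aggregated in O(log n), which is exactly the property queryReduced() would need.

```php
<?php
// Toy illustration of the CouchDB-style reduce index described above.
class ReducedRelation {
	private array $keys;  // sorted emit() keys
	private array $tree;  // 1-indexed segment tree of reduced values
	private $reduce;      // associative, commutative fn( $a, $b )
	private int $n;

	public function __construct( array $pairs, callable $reduce ) {
		ksort( $pairs ); // items live in key order
		$this->keys = array_keys( $pairs );
		$values = array_values( $pairs );
		$this->n = count( $values );
		$this->reduce = $reduce;
		$this->tree = array_fill( 0, 2 * $this->n, null );
		for ( $i = 0; $i < $this->n; $i++ ) {
			$this->tree[$this->n + $i] = $values[$i]; // leaves
		}
		for ( $i = $this->n - 1; $i >= 1; $i-- ) {    // parents, bottom-up
			$this->tree[$i] = $reduce( $this->tree[2 * $i], $this->tree[2 * $i + 1] );
		}
	}

	// First index whose key is >= $key (binary search, O(log n)).
	private function lowerBound( $key ): int {
		$lo = 0;
		$hi = $this->n;
		while ( $lo < $hi ) {
			$mid = intdiv( $lo + $hi, 2 );
			if ( $this->keys[$mid] < $key ) { $lo = $mid + 1; } else { $hi = $mid; }
		}
		return $lo;
	}

	// Reduce over every item with startKey <= key < endKey, touching only
	// O(log n) tree nodes no matter how many items the range contains.
	public function queryReduced( $startKey, $endKey ) {
		$l = $this->lowerBound( $startKey ) + $this->n;
		$r = $this->lowerBound( $endKey ) + $this->n;
		$res = null;
		while ( $l < $r ) {
			if ( $l & 1 ) {
				$res = $res === null ? $this->tree[$l] : ( $this->reduce )( $res, $this->tree[$l] );
				$l++;
			}
			if ( $r & 1 ) {
				$r--;
				$res = $res === null ? $this->tree[$r] : ( $this->reduce )( $res, $this->tree[$r] );
			}
			$l = intdiv( $l, 2 );
			$r = intdiv( $r, 2 );
		}
		return $res;
	}
}

// Example: pages emit a (sortkey => article length) pair; sum a key range.
$rel = new ReducedRelation(
	[ 'Alpha' => 1200, 'Beta' => 300, 'Gamma' => 4500, 'Zulu' => 90 ],
	fn ( $a, $b ) => $a + $b
);
echo $rel->queryReduced( 'A', 'H' ), "\n"; // 6000 (Alpha + Beta + Gamma)
```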

I suspect such a system could also potentially support intersection queries with reasonable efficiency subject to a bunch of limitations.

All performance limitations are pretty easy for the user to understand. There is some max number of items that can be emit() from a page to prevent someone from emit()ing 1000 things per page. There is a max number of results that can be returned from a query to prevent querying the entire database, and a max number of queries allowed to be made from a page. The queries involve reading a limited number of rows, often sequential. The system could probably be sharded pretty easily if a lot of data ends up in the database.

I really do think this sort of query model provides the sweet spot of complex querying with predictable, good performance, and it would be ideal for a MediaWiki site running at scale that wanted SMW-style features.

As a knowledge collection tool

Wikipedia can't do everything. One thing I'd love to see is better integration between different MediaWiki servers to allow people to go to different places if their content doesn't fit in Wikipedia.

Template Modularity/packaging

Anyone who has ever tried to use Wikipedia templates on another wiki knows it is a painful process. Finding all the dependencies is complex, not to mention when a template relies on Wikidata or JsonConfig (the Commons Data: namespace).

The templates on a wiki are not just user content, but complex technical systems. I wish we had a better system for packaging and distributing them.

Even within the Wikimedia movement, there is often a call for global templates. A good idea, certainly, but it would be less critical if templates could be bundled up and shared. Even then, having distinct boundaries around templates would probably make global templates easier than the current mess of dependencies.

I should note that there are already extensions in this vein, for example Extension:Page_import and Extension:Data_transfer. They are nice and all, but I think it would be cooler to have the concept of discrete template/module units on wiki, so that different components are organized together in a way that is easier to follow.

Easy forking

Freedom to fork is the freedom from which all others flow. In addition to giving people who disagree with the status quo a way to do their own thing, easy forking/mirroring is critical when censorship is at play and people want to mirror Wikipedia somewhere we cannot normally reach. However, running a wiki the size of English Wikipedia is quite hard, even if you don’t have any traffic. Simply importing an XML dump into a MySQL DB can be a struggle at the sizes we are talking about.

I think it would be cool if we made ready-to-go SQLite DB dumps, perhaps packaged as a phar archive with MediaWiki, so you could essentially just download a huge 100 GB file, plop it somewhere, and have a mirror/fork.
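On the configuration side, MediaWiki can already run off SQLite, so the fork itself would just be a matter of pointing a stock install at the downloaded file. A minimal sketch for LocalSettings.php; $wgDBtype, $wgDBname and $wgSQLiteDataDir are real settings, while the ready-made enwiki.sqlite dump is the hypothetical part:

```php
<?php
// LocalSettings.php fragment (sketch): run MediaWiki off a downloaded
// SQLite dump. The dump file itself does not exist today; producing it
// is the proposal above.
$wgDBtype        = 'sqlite';
$wgDBname        = 'enwiki';           // looks for enwiki.sqlite in the data dir
$wgSQLiteDataDir = '/srv/mirror/data'; // where you dropped the downloaded file
```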

Even better if it could integrate with EventStreams to automatically keep things up to date.

Conclusion

So those are my crazy ideas for what I think is missing in MediaWiki (with an emphasis on the Wikipedia use case rather than the third-party use case). Agree? Disagree? Hate it? I’d love to know. Maybe you have your own crazy ideas. You should post them; after all, your crazy idea cannot become reality if you keep it to yourself!

Notes:

¹ I left out "Free", because as much as I believe in "Free Culture" I believe the free part is Wikimedia's mission but not MediaWiki's.

² To clarify, by widgets I mean buttons and text boxes. I do not mean widgets in the sense of the MediaWiki extension named “Widgets”.