Recently, the Wikimedia Foundation began a fundraising campaign on Wikipedia in India, inviting anyone who relies on Wikipedia to support its future. Banners are appearing on the English Wikipedia in India, asking readers to consider contributing with a donation.

The banners have generated a lot of conversation online. We wanted to address some of the comments from our Indian users about why we are running the campaign and how donations to Wikipedia are used. Wikipedia is not a commercial website, and we are not driven by profit or advertising incentives. We are a charitable organization supported by the people who read Wikipedia. Our mission is to ensure that everyone, everywhere, can share in and access free knowledge.

India Fundraising Banner

Anyone can edit Wikipedia, and in the nearly 20 years since Wikipedia’s founding, millions of volunteer editors of diverse backgrounds and beliefs from all over the world have contributed to its articles. Wikipedia is powered by more than 250,000 global volunteers every month; it spans more than 50 million articles across nearly 300 languages.

Readers in India visit Wikipedia more than 750 million times each month, the fifth highest number of views from any country. Not only do Indians access Wikipedia in large numbers, Indian Wikipedia editors are integral and valued contributors to the encyclopedia, which is available in 23 of the languages spoken across India. In recent months, Indian volunteer contributors have ensured that neutral, reliable information about COVID-19 is available across Indic languages on Wikipedia. They have collaborated with global health partners so that information and new developments about the pandemic are well-sourced, accurate, and vetted by medical experts.

We ensure Wikipedia remains fast, secure, and reliable wherever you are in the world.

Reader donations are critical to supporting Wikipedia’s global presence. To meet the needs of readers in India and around the world, we operate an international technology infrastructure comparable to the world’s largest commercial websites. This includes hosting costs like keeping our servers running, as well as significant, ongoing engineering work to make sure Wikipedia is reliable and secure, loads quickly, and protects your privacy. Because Wikipedia is a living, constantly changing project (with 350 edits per minute and roughly 6,700 pageviews per second), this work has become increasingly important as more and more people turn to Wikipedia as a resource during the COVID-19 pandemic.

Donations also allow us to dedicate engineering resources to ensure that you can access Wikipedia in your preferred language, on your preferred device, no matter where you are in the world — from a dial-up modem to a brand new smartphone. Most major websites support an average of 50-100 languages — Wikipedia supports roughly 300 languages, a number that grows every year. We also use donations to reinforce volunteer efforts in ensuring information on Wikipedia is neutral, accurate, and well-sourced, working with volunteer contributors to build tools that protect against vandalism and help identify unsourced information.

Altogether, over 250 people work in engineering and product development at the Wikimedia Foundation. They work directly with our servers; manage traffic and uptime; maintain security and respond to attacks; design, develop, and release new features and experiences; support advanced editing tools and services; protect reader and editor privacy; improve accessibility for people with disabilities; and keep bandwidth costs affordable for readers. That works out to only one employee for every four million monthly readers of Wikipedia! Compared to other top websites with thousands of engineers, our mandate is to do a lot, with a little.

We support community-led projects to set knowledge free

We recognize that some of the best ideas to set knowledge free come from people around the world who use and contribute to Wikipedia and its sister free knowledge projects. At the Foundation, we collaborate with Wikimedia volunteers around the globe to support their ideas and help them bring more free knowledge to the world. In India, the Wikimedia Foundation has provided more than 2 million USD in funding to free knowledge since 2010, through grants to individuals, groups, and organizations, across 10 language communities. This support has helped Indian volunteers preserve and digitize historical sources, learn new skills, and promote knowledge equity.

Last year, the Foundation issued more than 8 million USD in funding, or 7% of the Foundation’s annual budget, through grants to Wikimedia community members, affiliates, and nonprofit organizations around the world. This grantmaking is done in large part through participatory decision making processes, with Wikimedia volunteers serving on grant committees to discuss and provide recommendations on grant proposals. This funding supports the addition of new knowledge to Wikipedia and the other Wikimedia project sites through edit-a-thons, events, and initiatives that work to fill knowledge gaps and increase diversity on our sites.

We advocate for global policies that protect and advance access to information

The Foundation’s policy and legal efforts help ensure that everyone has the right to access, share, and create knowledge, while defending our volunteers from threat of reprisal, and upholding our commitment to free expression and open knowledge. We advocate for free licenses and open source software and work to make sure that copyright laws are built and reformed so that people can share and use knowledge more broadly. We also fight against censorship and protect the right of everyone to speak and learn freely. Support for this work is vital to giving you and users everywhere equal access to Wikimedia projects.

Your loyalty and donations ensure free knowledge can thrive

Wikipedia and Wikimedia projects belong to everyone—they are built for and by you. From readers to editors, we all have a stake in preserving and telling the stories of our history, our culture, and the intriguing and notorious people who have shaped our world. We recognize that not everyone has the ability or means to give. All that we ask is that you continue to seek us out as the world’s largest free knowledge resource.

For those who can support us, your donations will help continue to sustain the systems that make Wikipedia possible, and ensure the free knowledge movement can grow and thrive. We know not everyone can afford to support this work — which is exactly why Wikipedia exists. Our mission is to make sure free knowledge is available to the world, for everyone, everywhere.

We’re able to make this commitment thanks to the tremendous generosity of past and present donors, and the incredible work of the global Wikimedia volunteer communities. But our work is not done. There is so much more knowledge in the world, and so many more people to reach. To fulfill our mission to create a world in which every human being can freely share in the sum of all knowledge, we need to meet the new challenges of our time. If you value this work, please support it. Visit donate.wikimedia.org to make a contribution today.

As the COVID-19 pandemic rages across the United States, residents are seeking information about their local governments’ responses. Local newspapers are often a great source for the most recent news, but it’s hard to get a big picture of the pandemic’s impact on states, cities, and regions from reading a daily newspaper. Wikipedia, however, provides that overview — assuming you live in a region where Wikipedia’s volunteers have expanded the article.

Sadly, however, that’s not every region of the country. That’s why Wiki Education launched a series of courses in our Wiki Scholars program devoted to improving the quality of content on articles related to state and regional responses to COVID-19. To date, we’ve wrapped up two courses, have a third ongoing, and are actively recruiting for more. Course fees for these courses have been paid by a generous sponsor who is supporting the improvement of COVID-19-related articles on Wikipedia.

The two courses that have wrapped up demonstrate the benefits of the Wiki Scholars model. Wiki Education staff recruits experts in public policy, political science, journalism, and other related topic areas to take a 6-week course where we teach the participants how to edit Wikipedia articles related to the pandemic. In our first two courses, 36 subject matter experts have added content to articles that have been viewed millions of times.

Many of the participants improved the “timeline” sections of state articles, adding the daily and weekly updates, and most also improved other sections in articles. One participant wrote a large section in Maine’s article on the impact to higher education in the state. The Arizona article has a section on epidemiology and public health responses written nearly entirely by a participant in one of our courses. The South Carolina article now has a section on the epidemiology and public health response, as well as impacts to K-12 and higher education schools and the economy.

Our courses also offered an opportunity for participants to address Wikipedia’s equity gaps in this content area. One participant added a section on the pandemic’s impact on the Northern Arapaho tribe to the Wyoming article. Another noted the Navajo Nation, which at the time had a higher per capita positive rate than any U.S. state, didn’t have an article about COVID-19, so our participants created one. Another participant added a section about New Mexico’s Navajo Nation to the New Mexico article.

A participant from our course wrote most of the article about North Dakota’s response, but once they finished the course, the article edits stopped as well. These courses have demonstrated that it takes more than just an individual or even a group of individuals to keep these articles up to date. Articles like this desperately need regular edits; that’s why we’re offering more courses. The work our past participants have done has demonstrated the value of these improvements; now, we need to do more.

If you’re interested in learning how to improve Wikipedia’s articles related to COVID-19, we are actively accepting applicants for our next course.

Header/thumbnail image of COVID-19 testing in Arizona by User:Prim8acs, a participant in one of our courses, CC BY-SA 4.0 via Wikimedia Commons

So why did I go back to higher education, after all?

00:00, Tuesday, 04 August 2020 UTC



The need for a higher education degree is widely debated within the tech industry; not a day goes by without tweets on the subject showing up in our timelines. For that reason, and also to record my reflections on higher education as a whole, I decided to share the decision-making process that led me to go back to higher education in a computing-focused undergraduate program.

Where I started

Early in my tech career, I was a Mechanical Engineering undergraduate with a strong fondness for computing. As I describe in A minha jornada até o Outreachy, ou Como aprendi a parar de me preocupar e começar a contribuir (My journey to Outreachy, or How I learned to stop worrying and start contributing), my involvement with technology before my first internship had never been full-time; it was limited to extension courses or volunteer hours in projects and events. Being selected as an Outreachy intern with the Wikimedia community in 2018, and having such a positive experience that I was able to line up a new opportunity in the field right after my internship ended, confirmed something I had suspected for some time: I have a strong aptitude for working in this field, and studying Mechanical Engineering had become a major limitation.

I took only one semester of the degree before suspending my enrollment, which let me realign my expectations while advancing my professional career: I had the chance to travel to several cities in Brazil and around the world, I was mentored by the Mozilla community, I joined the Outreachy team... And I decided to take the ENEM (Exame Nacional do Ensino Médio, the National High School Exam, which is the only admission method the federal university in my state uses across all modalities) again.

You’re doing so well. Why a degree?

Anyone’s professional path ends up being a big mix of competence, luck, and privilege, in quite different proportions in each case. Earning a higher education degree lets me increase the “competence” factor and, as a bonus, gives me some degree of security should I decide to leave Brazil. Other benefits, such as some involvement with academic research or connections in university circles, are interesting factors, but they did not influence my decision to go back to higher education.

Still on the subject of privilege, it is important to stress that I am quite privileged to be able to make this choice (and to be able to enter and stay in Brazilian public higher education). It was hard for me to take a high-school-level exam after so many years away from the material it covers, but my performance, while not brilliant, unquestionably shows that the solid foundation I got at a private high school persists.

The good

I started my degree in Information Systems in 2019, and today I have completed 21% of the program thanks to credit transfer processes. There is a synergy between my professional life and my academic life that lets me understand certain subjects with greater depth and maturity, while also offering an interesting perspective to my classmates and professors. In fact, going back to higher education in your mid-twenties is quite different from starting college at the end of adolescence: I feel I have more serenity to solve problems and face challenges.

Having built a career before starting a related degree also removed the pressure and uncertainty about how I would enter the job market after graduating. I no longer feel that my success depends on a higher education degree, and that even allows me to be more assertive and vocal when I notice some injustice or misconduct.

The middle ground

Although I only have classes in one shift (evenings), pursuing a degree still demands a degree of dedication and responsibility. Juggling roles in several organizations and projects is not a good idea; doing so will lead you to burnout. This may reduce your prospects for pay or for involvement with fantastic things in the short term, but I see a higher education degree as a good long-term investment that will eventually pay for itself.

The bad

As I mentioned a few paragraphs ago, you will face some unpleasant situations in higher education, and that will inevitably end up affecting your mental health. To be quite honest, not a semester goes by without my thinking about giving up on the idea of having a degree, despite the incentives I mentioned earlier. The power dynamics at a university end up affecting your day-to-day life as a student, and I often feel that the university “stifles” my creativity in the name of something I am not even sure program coordinators and curriculum committees can name.

There is also a strong desire to “meet market demands” without pairing that with a humanistic education; discussions about the social impact of technology are often treated as a mere detail. I have rarely seen a professor touch on subjects such as working conditions and the role of technology in reinforcing structures that do not make the world better [1].

Should I consider following the same path?

I believe everyone should make career decisions in an informed and voluntary way, so I encourage you to question how useful a higher education degree is in your career plans. Remember that, as with every career decision, pursuing a degree will bring you some benefits and force you to give some things up. Look at where you are in life and see whether this is a choice that makes sense in the grand scheme of your professional path.


  1. One of those rare occasions came up in a first-semester course: a professor posted an online form asking why we had decided to enroll in Information Systems. After receiving several answers related to “making the world better”, he replied: “Technology is not such an acclaimed field because it ‘makes the world better’; it is simply an extremely profitable field.”



Promotion

III Workshop ADAs

I was invited to give a short workshop at the III Workshop ADAs, an annual event run and organized by Projeto ADAs, an initiative focused on women in technology. It consisted of teaching viewers how Git and GitHub work, and giving them an idea of what a contribution path looks like. In addition to teaching university students about open source in a more practical way, I recommended Outreachy alongside Google Summer of Code and Google Season of Docs as paid opportunities they should consider applying to, and was interviewed by an organizer about my career in open source. That workshop was streamed live on July 4th, and it’s still available on their channel. It was well-received among students and professors.

Backstage

Longitudinal study

Last month I explored our longitudinal study to gather answers related to a few communities of interest we were evaluating. This month I discussed a few strategies to review and analyze all answers of all communities with Sage. We agreed that:

  • The most telling aspect of an answer is often what a former intern didn’t say about their internship, their mentors or their community.
  • We should also keep in mind the history of Outreachy itself, and read and analyze all answers considering that context.

Reading those answers takes quite a while, and it’s often a very exhausting task. With a bit of automation magic I was able to convert the .csv file to a .txt that structures all answers in a more readable format, but I always double check to make sure no important info is left behind. I had hopes it would make an impact on the way we organize things in our next round, but a more realistic timeline is delivering an in-depth report by September.
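The conversion script itself isn't shown here, so the following is only a minimal sketch of that kind of .csv-to-.txt step. The column names and sample rows are hypothetical placeholders, not the real survey fields, and the response count at the end stands in for the manual double check.

```python
import csv
import io


def csv_to_readable(csv_text: str) -> str:
    """Render each CSV row as a block of question/answer pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    blocks = []
    for number, row in enumerate(reader, start=1):
        lines = [f"=== Response {number} ==="]
        for question, answer in row.items():
            if question is None:
                # Cells beyond the header arrive as a list under the key None.
                question, answer = "(extra columns)", "; ".join(answer)
            # Keep empty answers visible: what was left unsaid can matter.
            text = (answer or "").strip() or "(no answer)"
            lines.append(f"{question}:")
            lines.append(f"    {text}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


def count_responses(readable: str) -> int:
    """Count rendered blocks, to check that no row was dropped."""
    return sum(
        1 for line in readable.splitlines() if line.startswith("=== Response ")
    )


# Hypothetical sample data; real exports will have different columns.
sample = (
    "community,what_went_well,what_was_hard\n"
    'Example Project,"Weekly calls, friendly reviews",\n'
    "Another Project,Clear docs,Time zones\n"
)
readable = csv_to_readable(sample)
print(readable)
```

Comparing `count_responses(readable)` with the number of rows in the source file is a cheap sanity check, but it does not replace reading the output against the original.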

Impact of the COVID-19 pandemic on Brazilian academic calendars

The December-March round is extremely popular with Brazilian students. With the pandemic suspending most academic calendars in Brazil, Sage and I agreed to preemptively check the dates of all public Brazilian universities that had more than one candidate in the last December-March round, to better understand what challenges we would face when defining the dates of the selection process and the internship itself. A surprising number of universities haven’t defined their policies quite yet, but all of our universities of interest published an official document stating at least the most important dates related to all 2020 terms (start and end dates).

Reading interviews and news, I expect this pandemic to impact all academic calendars in the world for at least two years (or four rounds).

Policy for students from the Federal University of Goiás

I was invited by an Instituto de Informática professor to participate in talks about creating a hackerspace at the Federal University of Goiás. We discussed the possibility of creating a policy within the hackerspace that would allow Federal University of Goiás students to use their Outreachy hours as credits or even as mandatory internship hours.

The use of Outreachy hours to comply with mandatory internship hours is quite controversial, as Brazilian law has a really strict definition of what truly counts as an internship. In my conversations with FIEG last year, they told me that nationwide promotion of the program should frame it as a “paid mentorship” rather than a “paid internship”. This particular professor, however, has managed to create a few policies (at first with elective classes) to help students use Google Summer of Code hours as academic credits, and he agreed that a hackerspace may be able to provide the structure we need to help students validate their hours.

I believe this particular initiative has a better chance of thriving than my failed attempt at creating a federal policy, as I have much more leverage as an undergraduate student at my university than I had with an external organization such as FIEG.

Tech News issue #32, 2020 (August 3, 2020)

00:00, Monday, 03 August 2020 UTC

weeklyOSM 523

11:14, Sunday, 02 August 2020 UTC

21/07/2020-27/07/2020

Lead picture: Civil Protection Portugal uses OSM – with attribution 😉 | © Civil Protection Portugal | Map data © OpenStreetMap contributors

Breaking news

  • Dorothea reminds us to celebrate the 16th anniversary of OSM on 8 August. She provides some ideas on how that can be done!

About us

  • Happy Birthday OSM-Wochennotiz! It was ten years ago this week that the first issue (de) (automatic translation) of the Wochennotiz was published.

Mapping

  • Matthew Woehlke has proposed the new tag sport=four_square for a sport called Four square.
  • Michael Montani’s proposal to introduce a natural=bare_soil tag for ‘an area covered by soil, without any vegetation’ is currently open for voting until 7 August.
  • Following the British Geospatial Commission’s announcement that unique identifiers for addresses and streets would become available as open data (we reported earlier), proposals have been produced for ref:GB:uprn (unique property reference number) and ref:GB:usrn (unique street reference number). Discussion has taken place on the Talk-GB and Tagging mailing lists.
  • JesseFW provided us with an explanation as to why the coastlines on Carto hadn’t been updated since January 2020. It seems that the coastline update was another victim of the Río de la Plata edit war (which we reported on earlier).
  • User mahdi1234 published a guide for beginners on visualising changes in OSM over time. He shows in detail how to create a time lapse with OSM data.

Community

  • Christoph Hormann objects to the framing of craft mappers in OpenStreetMap as conservatives opposed to change. He feels that this is part of a new narrative being communicated in OSMF politics; that is, the need for change in OpenStreetMap, and craft mappers’ opposition to change.
  • On the Geomob podcast Steven Feldman chats with recent FOSS4GUK keynote speaker María Arias de Reyna, a senior software engineer at Red Hat and former President of the Open Source Geospatial Foundation. The episode deals with María’s current work, and her recent talk at FOSS4GUK, but also imposter syndrome, and science fiction.
  • Łukasz recounts their experience of two recent interactions with CartONG, one with developing a tagging schema for refugee camps and the other with importing a UNHCR refugee camp dataset. Léonie Miège, from CartONG, responded in a blog comment.
  • Richard Fairhurst has written a new guide for data owners wishing to contribute to OSM. It is an output of cooperation with two local councils in the UK, which were recently funded by the Open Data Institute to investigate using and contributing to crowdsourced open map data such as OSM.
  • OpenStreetMap US has published its July 2020 newsletter.
  • Igor Eliezer has made a video showing how the 3D modelling of the Museu Paulista (São Paulo, Brazil) was re-worked in OpenStreetMap using the JOSM editor. The 3D preview at the beginning and end of the video is from the F4Map website and, during modelling, in Kendzi3D within the JOSM editor.

Imports

  • Alex Hennings has reviewed Facebook and ESRI’s proposed ‘not-an-import’ imports (we reported earlier) of ArcGIS datasets through RapiD or the JOSM MapWithAI plugin and found them wanting. Of particular concern was the lack of solicited community review on the imports-us mailing list.

OpenStreetMap Foundation

  • The OSMF board would like to consult the community on its hiring plans for a Senior Site Reliability Engineer. This is the first position based on the hiring framework which osmf-talk discussed a few days ago.

Events

  • Proceedings of the Academic Track at the State of the Map 2020 have been published.

Humanitarian OSM

  • HOT is conducting an online survey of people who have used RapiD to find out what their experience was. The data will be used to understand how RapiD could be made more accessible and usable for a variety of users.

Maps

  • Nuno Caldeira congratulated, through Twitter, the Portuguese National Emergency and Civil Protection Authority for using OpenStreetMap data, and correct attribution, in a tweet (pt) about a large forest fire that occurred last week in Portugal.
  • Taiwan, a nation in the Far East with special political status, has lots of outlying islands. Dadan Island in Kinmen is one of the most remote. MG, the Italian mapper, shares The Kinmen Rising Project, an information website he created about Kinmen, on the OSM Taiwan Telegram channel. It shows photos from his journey with OSM as the base map and, of course, he mapped a lot of POIs on the island.

Software

  • A research paper analyses the growing amount of freely available spatio-temporal information (such as aerial imagery) to support and guide mappers in their work. Artificial neural networks identify regions of interest where OSM is likely to require updating.
  • Venkanna Babu Guthula has released Label-Pixels, a tool for semantic segmentation of remote sensing images using fully convolutional networks (FCNs), designed for extracting road networks from remote sensing imagery.

Releases

  • Sarah Hoffmann announced release 1.3.0 of osm2pgsql with the addition of the (still experimental) new flex output. Jochen Topf, the main contributor for this release, explained how this gives more flexibility when exporting data from OSM to PostgreSQL.
  • The iD editor was updated and now has touch support, so it can be used on tablets (smartphone sized screens aren’t fully supported yet). Other highlights are integrated quality checks and multi-selection editing.
  • With the release of the latest version of iD, the ‘locator overlay’, a semi-transparent overlay when zoomed out, has been rebuilt. Via the OpenStreetMap editor-layer-index, the new overlay is now available on OpenStreetMap.org, and soon on the HOT Tasking Manager and other instances of iD.

Did you know …

  • … Finde.cash displays banks and ATMs with the respective ATM networks on a map? It also offers route planning by foot, bicycle or public transport, and four background options including OpenCycleMap. Missing ATMs can be inserted directly. The map is worldwide but the menu is only available in German.
  • … MyOSMatic is a free-of-charge web service that generates city maps from OSM data, available as print-ready PNG, PDF and SVG files? Menus are available in 25 languages.
  • … the ‘OSM Quality Ranking’ (Beta) assesses and ranks 51 US cities by OSM data quality, checking geometry and tagging for streets, roads, and relations?

Other “geo” things

  • Brooklyn Historical Society’s map collection includes over 1,500 digitised historical maps spanning the seventeenth century to the present.
  • Nathanael Peterlini examined (de) (automatic translation) difficulties cartographers face when trying to please all of their users’ political views. They look at the cases of Kosovo and Palestine and how they are treated by Apple, Google, and OSM.
  • Garmin has been the victim of a ransomware attack. As a result, many of their online services were interrupted or are still down.
  • An update to Google Maps has allowed docked bike share riders in cities such as Chicago, Montreal and London to see end-to-end walking and cycling directions for their journey integrated with bike and dock availability. Cities Today gave some background to the new service.
  • Tagesspiegel interviewed 21,000 people about what scares them on the street and what Berlin’s bike paths should look like in the future. The results are explained (de) (automatic translation) with a series of graphics.

Upcoming Events

Where | What | When | Country
London | London Missing Maps Mapathon (ONLINE) | 2020-08-04 | uk
Mannheim | Mannheimer Mapathons – Treffen im Luisenpark | 2020-08-04 | deutschland
Stuttgart | Stuttgarter Stammtisch | 2020-08-05 | germany
San José | Civic Hack & Map Night (online) | 2020-08-06 | united states
Taipei | OSM x Wikidata #19 | 2020-08-10 | taiwan
Hamburg | Hamburger Mappertreffen | 2020-08-11 | germany
Munich | Münchner Stammtisch | 2020-08-12 | germany
Berlin | 146. Berlin-Brandenburg Stammtisch | 2020-08-14 | germany
Zurich | 120. Mapping-Party/OSM Meetup Zurich | 2020-08-15 | switzerland
Cologne Bonn Airport | 130. Bonner OSM-Stammtisch (Online) | 2020-08-18 | germany
Lüneburg | Lüneburger Mappertreffen | 2020-08-18 | germany
Cologne | Köln Stammtisch ggf. ONLINE | 2020-08-19 | germany
Kandy | 2020 State of the Map Asia | 2020-10-31 to 2020-11-01 | sri lanka

Note: If you would like to see your event here, please put it into the calendar. Only data that is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Anne Ghisla, MatthiasMatthias, Nakaner, Nordpfeil, PierZen, Polyglot, Rogehm, SK53, TheSwavu, derFred, richter_fn.

Commissioners for Tanzanian Regions

12:44, Saturday, 01 August 2020 UTC
Aggrey Mwanri is one of the 31 commissioners for a Tanzanian Region. The Tabora Region has a population of 2,291,623. For most of the 31 regions we know at least one commissioner, and only for the Arusha Region do we know "them all".

I have been adding information about these Regional Commissioners, and from a quality point of view this is a step in the right direction. Slowly but surely, we are getting to know the structures and politicians of more African countries.

When you compare African countries with "Western" countries, such structures are comparable. This makes it possible to show the extent to which the data in Wikidata does not represent the African reality.

It is more than likely that there are lists of the data that is currently missing. These lists help us provide the bare bones of what it takes to know about African countries. 

So who are the data wizards who can show where our data is lacking? Where are the lists that enable the people who know tools like OpenRefine to fill in the gaps? Who has the pictures so that a Wikipedia article for Mr Mwanri can be illustrated?
Thanks,
       GerardM

Monthly Report, May 2020

22:25, Thursday, 30 July 2020 UTC

Highlights

  • In May, we launched a Wiki Scientists course in partnership with 500 Women Scientists, in which 20 of the organization’s members learned how to expand Wikipedia’s biographies of women in STEM. Thanks to high demand from their members, we have continued searching for additional funding to support more women scientists as they join the Wikipedia community.
  • Our first two Wikidata courses of 2020 just wrapped up. The Beginner course had 10 participants who created 67 new Wikidata items, made more than 1,220 edits to Wikidata, and added 74 references to statements, improving the data quality for all of those claims. The 10 participants of the Intermediate course created 205 new items and edited over 5,100 existing ones. One of the participants was working on a project for the Metropolitan Museum of Art and uploaded over 5,000 images to Wikimedia Commons which can be used on Wikidata. Additionally, this course had several participants who are working on the LINCs project. This project aims to connect humanities research in Canada through linked data. The perspectives these individuals brought to this course demonstrated how Wikidata can influence other large scale linked data initiatives and, in turn, how these initiatives can influence Wikidata. To see some of the outstanding work these course participants did, follow this link. This record number of items edited is nothing short of inspiring. We hope this indicates that more institutions are willing to invest in Wikidata or that more driven participants are finding their way to our course. Either way this is a large body of high quality work that will benefit Wikidata and the larger linked data community.

Programs

Wikipedia Student Program

Status of the Wikipedia Student Program for Spring 2020 in numbers, as of May 31:

  • 409 courses were in progress (268, or 65%, were led by returning instructors).
  • 7,496 student editors were enrolled.
  • 55% of students were up-to-date with their assigned training modules.
  • Students edited 6,200 articles, created 556 new entries, and added 5 million words and 54,100 references.

While a handful of spring quarter courses are still working on their Wikipedia assignments, May saw most of our courses wrap up for the term. Spring 2020 was a time of upheaval for our instructors and students as their courses abruptly switched to online platforms in the middle of the term. Despite these challenges, our students contributed 5 million words to Wikipedia, even tackling articles related to COVID-19.

Though a great deal of uncertainty surrounds the Fall 2020 term, Wikipedia Student Program Manager Helaine Blumenthal began to prepare for the following academic year. We are paying close attention to what institutions of higher education are planning as the pandemic unfolds and trying to adapt our materials to better serve our instructors and students in their changed classroom circumstances. Helaine hopes to look more deeply into best practices for online teaching and to assemble a robust list of library resources so our students can continue to access high quality sources despite being unable to physically go to their university libraries. 

Wikipedia Experts Shalor Toncray, Elysia Webb, and Ian Ramjohn were busy closing out courses and identifying the great work our students did this term.

Student work highlights:

On May 5, student work was featured on the main page of Wikipedia. A student improved an article about a moth species called the slender Scotch burnet. The article received nearly 1,300 views the day it was featured. Student work appeared on the main page of Wikipedia again on May 29, where it was viewed an impressive 7,500 times! Climate of Pluto was created by a student in Vincent Chevrier’s Planetary Atmospheres course at the University of Arkansas.

Gay and lesbian bars have long been a part of society. Some have needed to remain relatively secret in order to escape persecution while others have openly advertised their services to the local community. Daniel’s, which opened in late 1975, was one of the first lesbian bars in Spain and one of the first LGBT bars in Barcelona. Opened by María del Carmen Tobar, it originally was a bar and billiards room but expanded to have a dance hall. The bar attracted women from a wide variety of backgrounds including non-lesbian women. In the early years of the Spanish democratic transition the bar was accepted because its owner was well connected in the local government through her band-mate Daniela. Despite this, the police still occasionally raided the bar during its early years. Tobar played an active role in making Daniel’s the center of lesbian life in Barcelona, sponsoring sports teams and a theater group. The bar also sold feminist literature, including the magazine called Red de Amazonas. The bar later closed, but would be remembered in books and exhibits for its importance in the lesbian history of Spain. This article was expanded by a Colby College student in Dean Allbritton’s Queer Spain class, which sought to expand Wikipedia’s knowledge on LGBT history in Spain.

Many have heard of Amelia Earhart, but have you heard of Grace Muriel Earhart Morrissey? Morrissey was Amelia’s younger sister and a high school teacher, author, and activist. She taught at the high schools in Medford and Belmont, MA, and remained an active member of the Medford community until her death. She spent decades documenting Amelia’s life and managing her legacy, devoting significant time to coordinating her sister’s posthumous affairs, setting up donations, marshaling information, and dealing with Amelia’s fans. Morrissey also spoke out against the speculations that arose in the wake of Amelia’s death. She denied, for example, that her sister died while on a spy mission, as some theorists have suggested in the past. She wrote two books about Amelia, Courage is the Price and Amelia, My Courageous Sister. This article was created during May by a student in Lisa Gulesserian’s Kindred Spirits class at Harvard University, who allowed Amelia’s sister to shine.

There are many Black men and women who fought against the injustices perpetrated against African-Americans seeking equal and fair treatment. Curlee Brown, Sr. was one such person: he challenged inequality in the education system by launching a legal case that resulted in the integration of what would become the West Kentucky Community and Technical College. In 1950 Brown attempted to enroll in the school, only to be rejected under a then recently amended 1904 state law that prohibited desegregation in schools. He brought a lawsuit against the school and the U.S. District Court at Paducah ruled that the college must allow Brown and other Black applicants to enroll; however, the school fought against this. Their appeals were ultimately unsuccessful and the college was eventually integrated. For his tireless activism and work with the Paducah NAACP, Brown Sr. received multiple awards and honors, and to honor his legacy the Kentucky NAACP created the Curlee Brown Scholarship. The Paducah branch of the NAACP created the Curlee Brown Award, which they grant to individuals who have made a visible impact in the field of human rights. In 2010 Brown Sr. was inducted into the Kentucky Civil Rights Hall of Fame.

Another notable individual was Cyrus Field Adams, a Republican civil rights activist, author, teacher, newspaper manager and businessman. Adams fought a key battle in civil rights for African Americans. He used the variety of positions he held through his life, whether working for a newspaper, as a teacher, or for the Treasury, to advocate for civil rights. Later in life, after Theodore Roosevelt appointed him Assistant Register at the US Treasury, he used this platform to write a book titled The National Afro-American Council, Organized 1898: a history etc. In 1912, Adams left his position at the Treasury to join President Taft’s re-election campaign, as Taft himself had asked him to do. This was an attempt to get Adams out of the Treasury position, as Taft had promised that position to another African-American man who supported him. Taft lost the election, and President Wilson, on taking office, replaced every Republican who had worked for Taft, including Adams. In the years that followed, an investigation into the time Adams spent at the Treasury was launched in an attempt to discredit his career. It’s thanks to University of Kentucky students in Nikki Brown’s African American History, 1865 to the Present class that we now have these articles.

Joanna Mary Boyce (7 December 1831 – 15 July 1861) was a British painter associated with the Pre-Raphaelite Brotherhood. She is also known by her married name as Mrs. H.T. Wells, or as Joanna Mary Wells. She produced multiple works with historical themes, as well as portraits and sketches, and authored art criticism responding to her contemporaries. Boyce first exhibited her artwork publicly in 1855 at the Royal Academy. Though Boyce exhibited two pieces, it was her painting Elgiva that won Boyce the admiration of such critics as John Ruskin and Ford Madox Brown. In it, Boyce depicted model Lizzie Ridley as a tragic heroine from Anglo-Saxon historical legend, possibly following the precedent of Pre-Raphaelite painter John Everett Millais who had depicted Elgiva eight years prior. Following her first exhibition, Boyce continued to pursue artistic excellence through extensive sketching and international art-viewing expeditions. She spent 1857 in Italy, and in December of that year married miniaturist Henry Tanworth Wells (later a Royal Academician) in Rome. Boyce used her time in Italy to work on paintings such as The Boys’ Crusade and La Veneziana, a portrait of a Venetian lady. In addition to her own artistic practice at this time, Boyce also continued a lifelong practice of seeking out and analyzing the artwork of her contemporaries. Boyce published some of this analysis as art criticism in the Saturday Review, wherein she lauded the “sincerity” and principles of the Pre-Raphaelite art movement, and noted the positive influence of John Ruskin on the English art world. At the time of her death, contemporaries remarked on Boyce’s talent as an artist: Dante Gabriel Rossetti described her as “a wonderfully gifted woman”, and another obituarist called her a genius. Later critics have observed that Boyce’s reputation was somewhat constrained by her early death, but her art has been highlighted in exhibitions up until the present day. 
Who do we have to thank for the expansion of this article? None other than a UW Madison student from Anna Simon’s Art Librarianship class!

The Rio Grande sucker is a freshwater fish species native to the American southwest. Like many fish species in the area, populations have declined as a consequence of land use change, habitat loss, environmental degradation and competition from non-native species. As a result of this, the Rio Grande sucker is considered endangered in Colorado and a “species of concern” in Arizona. But before a student in Derek Houston’s Biology 667 class created an article about it, there was no Wikipedia article about the Rio Grande sucker. Given the important role that Wikipedia articles serve as a starting point for research into a topic, the presence of this article might have an impact on regulators who are trying to manage this fish species.

Little things run the world — in particular, the microorganisms that make up most of the living things on Earth. Rhodobacter capsulatus is a type of purple bacteria, which are bacteria that are able to make their own food using photosynthesis, much like plants do, but purple bacteria use a purple molecule to capture light instead of the green pigments that plants use. Rhodobacter capsulatus is also able to make gene transfer agents, small packages of DNA that allow them to transfer genes to other bacteria, without using sex. Before a student in Kelly Bender’s Prokaryotic Diversity class started editing it, Wikipedia’s article about this species of bacteria was just a three-sentence stub which mostly talked about the Latin roots of its scientific name. The student editor was able to expand the article into something very informative, adding sections about its genomics, morphology, ecology and significance, among others. Other students in the class made similar improvements to the Pseudomonas stutzeri and Chlamydia felis articles.

Scholars & Scientists Program

Wikipedia

This month we launched a 6-week intensive course focused on improving Wikipedia’s coverage of COVID-19 pandemic information. Specifically, participants are focusing on state-specific articles. In the United States, many of the actions that most affect people’s lives are taken at the state level, and out of a commitment to public knowledge on Wikipedia we decided to run a course at no cost to participants in order to shore up this vital content. The scope of the course was narrower than usual, and the duration a bit shorter, which allowed Scholars & Scientists Program Manager Ryan McGrady to develop a custom curriculum to guide participants to maximize their impact in a relatively short period of time.

We still have a couple weeks left in the COVID-19 Wiki Scholars course, but participants are already doing some incredible work. Highlights include a significantly expanded section of the Maine article that focuses on the impact on education; a near tripling of the size of the Wyoming article, including a major update to the timeline, impact on the economy, impact on colleges, and effects on the Northern Arapaho tribe and Yellowstone; several updates to the Florida timeline; increasing the size of the North Dakota article from about 6,500 to 45,000 bytes; and the addition of a significant section on the impact on voting in the New York article. At the start of the course, Wikipedia already had articles on all 50 states, but one Wiki Scholar ran into a challenge: how should we cover the well-documented impact on the Navajo Nation, which has the highest per capita rate of infection in the country and covers parts of three states? The answer seems obvious in hindsight, but nobody had done it yet: to create a brand new article about the COVID-19 pandemic in the Navajo Nation. Thanks to that Wiki Scholar, the impact on this community is covered on Wikipedia.

We were also excited to launch a course in partnership with 500 Women Scientists, focused on improving Wikipedia’s coverage of women in science. We were less than halfway through the course at the end of the month, and participants are still developing their articles, but we already have several great examples of biographies created or improved:

  • Rana Fine, whose research concerns ocean circulation processes over time through use of chemical tracers and the connection to climate.
  • Abigail Thompson, a mathematician who specializes in knot theory and low-dimensional topology.
  • Rachel Green, a professor of molecular biology and genetics researching ribosomes and their function in translation.
  • Deborah Kelley, a marine biologist studying hydrothermal vents, active submarine volcanoes, and life in those areas of the deep ocean.

The Women in Red Wiki Scholars course we kicked off last month started to hit its stride in May. We still have a few weeks left to go, but here are some of the biographies of women Wiki Scholars have created or improved this month:

  • Mary Carson Breckinridge (1881–1965), American nurse midwife who founded the Frontier Nursing Service.
  • Jane Sharp (c. 1641 – ?), an English midwife who wrote The Midwives Book: or the Whole Art of Midwifery Discovered in 1671.
  • Anne de Graville (c. 1490 – c. 1540), French Renaissance poet, translator, book collector, and lady-in-waiting to Queen Claude of France.
  • Madeleine Brès (1842–1921), the first French woman to obtain a medical degree.
  • Montserrat Calleja Gómez, Spanish physicist who specializes in bionanomechanics.
  • Anne-Marie Lagrange, French astrophysicist whose work focuses on extrasolar planetary systems.
  • Natalie Roe, experimental particle physicist and observational cosmologist who is the Director of the Physics Division at Lawrence Berkeley National Laboratory.

Last month we highlighted some of the articles improved through the course we ran with the American Physical Society. It wrapped up early this month, but not before participants added a few more articles to their list of pages created or improved:

  • Peter F. Green, materials scientist and Deputy Laboratory Director for Science and Technology at the National Renewable Energy Laboratory.
  • Tulika Bose, physicist at the University of Wisconsin-Madison whose research focuses on developing triggers for experimental searches of new phenomena in high energy physics.
  • Henry T. Brown, chemical engineer who was the first African American director of the American Institute of Chemical Engineers in 1983.
  • Rayleigh theorem for eigenvalues, a concept in mathematics concerning the behavior of the solutions of an eigenvalue equation as the number of basis functions employed in its resolution increases.

We also finished our third course focused on family planning topics in partnership with the Society of Family Planning. As with previous courses, participants improved several high-impact articles on abortion, contraception, and related topics. Among the improvements this month were: the addition of a section on teleabortion to the telehealth article; updates to a wide range of state-specific abortion articles, like Abortion in New York and Abortion in Guam; extensive edits to the Title X article, the only federal grant program dedicated solely to providing individuals with comprehensive family planning and related preventative health services; and a variety of improvements to the pregnancy test article.

Wikidata

Our first two Wikidata courses of 2020 just wrapped up. These two courses were able to complete a staggering amount of work in six short weeks.

  • Beginner: This course had 10 participants who created 67 new Wikidata items, made more than 1,220 edits to Wikidata, and added 74 references to statements, improving the data quality for all of those claims. The participants in this course were engaged and excited about the course material. We were lucky to host several individuals from City College, part of CUNY, in New York. Having multiple perspectives from one institution emphasized just how many applications Wikidata has. We also had a participant work on a collection of theater posters. Take a look at this well-modeled item for Yosef Bulof in Gidon. One item, William Marshal, 1st Earl of Pembroke, received more than 20 new references, bolstering the accuracy of their respective statements. The interests of this group varied greatly. Follow this link to see a complete list of items they edited.
  • Intermediate: Ten editors participated in this course. They created 205 new items and edited over 5,100 existing ones. One of the participants was working on a project for the Metropolitan Museum of Art and uploaded over 5,000 images to Wikimedia Commons which can be used on Wikidata. Additionally, this course had several participants who are working on the LINCs project. This project aims to connect humanities research in Canada through linked data. The perspectives these individuals brought to this course demonstrated how Wikidata can influence other large scale linked data initiatives and, in turn, how these initiatives can influence Wikidata. To see some of the outstanding work these course participants did, follow this link.

This record number of items edited is nothing short of inspiring. We hope this indicates that more institutions are willing to invest in Wikidata or that more driven participants are finding their way to our course. Either way this is a large body of high quality work that will benefit Wikidata and the larger linked data community. 

Advancement

Partnerships

In May, we launched a Wiki Scientists course in partnership with 500 Women Scientists, guiding 20 of its members as they learned how to expand Wikipedia’s biographies of women in STEM. Thanks to high demand from their members, we have continued searching for additional funding to support more women scientists as they join the Wikipedia community.

We spent some time in May reworking the Scholars & Scientists end-of-course survey for participants, making sure we continue learning about their experiences in the course, motivations for participating, and how they assess their learning outcomes. We deployed the new survey to Wiki Scientists who completed the American Physical Society course, and we’re excited to use the information to demonstrate the value of working on Wikipedia to others’ employers and organizations.

Communications

Attabey Rodríguez Benítez has tips for folks stuck at home: learn how to add photos to Wikimedia Commons like she did in our Wiki Scientist course!

Dr. Lilly Eluvathingal learned how to add content to Wikipedia pages in her area of expertise through one of our Wiki Scientist courses. This month, she shared on our blog what she thought of the experience.

The Wikipedia page that Andrew Oh drastically improved achieved Good Article status after he continued editing it once his course had ended. Read more about what he found so valuable about the experience.

Technology

In May, we turned our attention from our project to improve the user experience for students and instructors, which we had been iterating on through April, to the long-term foundations of the Dashboard. Google Summer of Code interns Amit Joki and Shashwat Kathuri, while not scheduled to officially start the ‘coding period’ until June, have already started making major strides to modernize the Dashboard’s JavaScript infrastructure by replacing deprecated libraries and features with more stable and well-supported alternatives. This work will accelerate into the summer, as Amit focuses on streamlining our JavaScript and reducing the amount of code that browsers need to download, while Shashwat develops a system to better keep track of system errors and data bottlenecks.

Finance & Administration

The total expenditures for the month of April were $174K, ($16K) under the budget of $190K. The Board was under budget by ($9K) because the board meeting moved from in-person to remote. Fundraising was over budget by +$6K because a personnel change created a need for consulting work +$2K, employment costs +$3K, and Indirect Costs +$1K. General & Administrative was over budget by +$11K due to a change in indirect overhead allocation +$6K, Professional Fees +$4K, and Administrative Costs +$1K. Programs were under budget by ($24K), made up of Payroll ($5K), Travel ($8K), Professional Fees ($2K), Communications ($2K), and Indirect Costs ($7K).

Office of the ED

Current priorities:

  • Finalizing the annual plan & budget for fiscal year 2020–21
  • Dealing with the effects of the COVID-19 pandemic on our organization

In May, Frank continued working on putting the annual plan & budget for next fiscal year together. Relying on a multitude of different data sources, as well as numerous conversations, he tried to come up with a realistic picture of how the COVID-19 pandemic and other destabilizing events in the United States could affect Wiki Education. In particular, his goal was to understand how institutional funders might move forward given that times of crisis always trigger a reaction from philanthropy that might threaten the survival of nonprofits which – like Wiki Education – depend on continuously unlocking new funding opportunities from institutional grantmakers. Frank ran through different scenarios and possible ways to mitigate a situation where grantmakers wouldn’t accept any new grantees for the foreseeable future, and where they would focus on protecting their endowments instead. 

After sending a first draft of the potential annual plan for 2020–21 to the board, Frank started extensive conversations with individual board members. He highlighted the extreme uncertainty that made coming up with the best path forward difficult, and he listened to how the board members assessed the situation. Discussions included projections about the situation in our country in general, about the possible reaction of the philanthropic sector, as well as effects on higher education and knowledge institutions like museums, archives, and libraries. All these conversations lasted through May and the board generously agreed to extend the timeline for the delivery of the final version of the annual plan in order to find the best solution for Wiki Education and the many millions of people being positively impacted by our organization’s work.

All code is built

20:57, Wednesday, 29 2020 July UTC

HEADER CAPTION: The head of the Statue of Liberty on exhibit at the Paris World's Fair, 1878. The statue was built in France ahead of time, shipped overseas in crates, and then assembled in New York. Image by Albert Fernique / public domain.

The process of mapping human-readable source code inputs to optimized, machine-readable outputs is called compiling or, more generally, building. It's been a necessary part of software development since computers evolved past machine code. Even to serve the most abstract, high-level languages such as HTML and CSS, this build process is essential.

Just-in-time build steps

We build code all the time at Wikimedia. Every page request benefits from Less compilation, CSS and JavaScript minification, internationalization, URL mapping, and bundling build steps. All of this occurs at runtime through the ResourceLoader pipeline.

ResourceLoader's just-in-time build process is critical when key parameters vary on request. However, it has some notable limitations including:

  • Every just-in-time build step must be extremely performant, so fast that it can run on-the-fly, or our pages will load slowly. Additionally, sequential steps cannot be appended ad infinitum.
  • Effectively, ResourceLoader's just-in-time build steps can only use tools written in PHP. JavaScript execution is not possible.
  • Just-in-time build steps are less secure. They execute on production servers and serve content directly to the user. This eliminates the separation between development and runtime-only dependency trees, which can dramatically increase the attack surface, sometimes by orders of magnitude. Additionally, build outputs are shipped directly to the user without any opportunity for security review. When it comes to security, a just-in-time build step always strives to be as secure as an ahead-of-time build step that produces static outputs.
  • Just-in-time build steps are custom and complex. An ahead-of-time build step can easily be a one-liner that invokes standard tooling but the equivalent just-in-time build step, if one exists, is just as likely to be hundreds of lines of custom code. Historically, these custom steps have suffered from a low bus factor and received little attention beyond basic life support. Few engineers possess the abilities to write code of the caliber needed to add new build steps or change existing ones, which means the rest of Wikipedia and WMDE is blocked on their evolution. For example, we have been unable to keep pace with fundamental features like source map support (a formal request since 2013) or ES6 transpilation. In fact, there are laundry lists of missing features now standard elsewhere. The lack of standard functionality means that developing any code at Wikimedia is a completely different and far slower experience than the rest of the industry.
  • Just-in-time build step outputs have worse caching. The most advanced build step executed at runtime endeavors to have the same caching that comes out-of-the-box with an ahead-of-time build step: a plain file on disk.
CAPTION: No step in the pipeline can be delayed, and the longer the pipeline, the longer it takes to go from a nut to a new car. Image by unknown author / public domain.
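
To make the caching point concrete, here is a minimal Python sketch. It is not ResourceLoader code: `minify_css`, `serve_just_in_time`, `build_ahead_of_time`, `serve_prebuilt`, `compare`, and the sample stylesheet are hypothetical names invented for illustration, and the toy minifier merely stands in for whatever real tooling a project would use. The sketch only counts how often the build step runs under each strategy.

```python
import re
import tempfile
from pathlib import Path

BUILD_RUNS = {"count": 0}  # how many times the (toy) build step has executed

# Hypothetical stylesheet used as the build input.
SOURCE = """
/* hypothetical stylesheet */
.banner {
    color: red;
    margin: 0 auto;
}
"""


def minify_css(source: str) -> str:
    """Toy stand-in for a real minifier: drop comments, squeeze whitespace."""
    BUILD_RUNS["count"] += 1
    no_comments = re.sub(r"/\*.*?\*/", "", source, flags=re.S)
    collapsed = re.sub(r"\s+", " ", no_comments).strip()
    return re.sub(r"\s*([{}:;,])\s*", r"\1", collapsed)


def serve_just_in_time(source: str) -> str:
    # The transform runs inside every request.
    return minify_css(source)


def build_ahead_of_time(source: str, out_dir: Path) -> Path:
    # The transform runs once, before deployment; the result is a plain file.
    out = out_dir / "banner.min.css"
    out.write_text(minify_css(source), encoding="utf-8")
    return out


def serve_prebuilt(path: Path) -> str:
    # Serving is a file read; no build tooling is needed at runtime.
    return path.read_text(encoding="utf-8")


def compare(requests: int) -> dict:
    """Return how many times the build step ran under each strategy."""
    BUILD_RUNS["count"] = 0
    for _ in range(requests):
        serve_just_in_time(SOURCE)
    jit_runs = BUILD_RUNS["count"]

    BUILD_RUNS["count"] = 0
    with tempfile.TemporaryDirectory() as tmp:
        built = build_ahead_of_time(SOURCE, Path(tmp))
        for _ in range(requests):
            serve_prebuilt(built)
    aot_runs = BUILD_RUNS["count"]
    return {"just_in_time": jit_runs, "ahead_of_time": aot_runs}


if __name__ == "__main__":
    print(compare(1000))
```

Under these assumptions, a thousand requests trigger a thousand just-in-time builds but only one ahead-of-time build; every later request is a plain file read. A real just-in-time pipeline would add its own cache in front of the transform, which is exactly the extra machinery the list above describes.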

Solving problems too big for just-in-time

Some problems are only solvable by just-in-time build steps. However, many solutions cannot meet the constraints of just-in-time build steps, so only a subset of all problems can be solved. This is a more general limitation of just-in-time build steps, not the ResourceLoader implementation. In practice, this means that developers cannot add a build step to the pipeline and are left with their problem unsolved.

There must be an alternative. Our options include:

  1. Double down on building new features in ResourceLoader. This approach fails to address the fundamental limitations of all just-in-time build steps and may require reimplementing existing open-source solutions.
  2. Ship extra tooling to every user's browser and let them process it. Besides significantly increased bandwidth and computation costs that go against our mission to serve everyone, this isn't very eco-friendly, fails to solve many problems, leads to the laggy browsing experiences users so loathe on JavaScript-heavy pages, and doesn't scale far past polyfills.
  3. Replace ResourceLoader with industry standard tooling that has fewer constraints. This will require exploration, be expensive, and may have the same outcome as #1.
  4. Enhance ResourceLoader by building what we can ahead-of-time.

The first two options don't work. The third option doesn't sound like a good first choice. The fourth is the most conventional and proven solution.

Ahead-of-time build steps

Ahead-of-time build steps are usually what people think of when they refer to "building code." Most build problems that remain to be solved in Wikimedia only fit in the ahead-of-time space. As you might expect, we're using these enhancements all over the place already and can't live without them. Some examples include:

  • OOUI: Portions of this library are built with Grunt and a suite of packages from NPM for minification, uglification, and additional processing. The results are dozens of build products that are file-copied into Core manually.
  • Page Previews: This gem of a codebase is fully compiled from the latest JavaScript with Webpack. It serves about two billion virtual pageviews a month.
  • Wikibase: Wikibase uses ahead-of-time build tools including Webpack, TypeScript, and a plethora of other standards to serve the Wikidata communities.
  • MultimediaViewer: Commits to MultimediaViewer use ahead-of-time build steps to replace any human-readable source SVGs with optimized, machine-readable outputs.
  • MediaWiki: Core uses a build step on every deployment. The process is called "a full scap." When the process fails, it's called "a full scapadapadoo."
  • MobileFrontend: All JavaScript in MobileFrontend, the heart of the mobile site, is built by Webpack. That's over 50% of all pageviews benefiting from an ahead-of-time build step using industry standard tooling.
  • Wikipedia for KaiOS: This Webpack-powered project uses a build step to serve a highly performant web app.
  • ContentTranslation: The glittering new ContentTranslation app uses the Vue CLI and standard tooling to generate the next-generation interfaces essential to serving contributors around the world. Put plainly, this is the kind of modern experience that would be impossible to build without modern tooling that leverages ahead-of-time build steps.
  • Wikipedia.org: Portals uses a build step to synchronize sister project statistics. I know someone who has a recurring task each week reminding him "it's build time." Although triggering the build step is person-powered, the outputs are what you would expect of an ahead-of-time build step: practical and project specific.
  • VisualEditor: VE is a sophisticated application that requires a build step. I don't know what this does exactly but I would guess it's solving the same kinds of problems everyone else has ahead-of-time.
  • And many more.

These ahead-of-time build steps are everywhere in Gruntfiles, Gulpfiles, Webpack configs, NPM package.json files, and shell scripts. Even if the Foundation mandated it today, we could never get rid of them.

Evolving the ResourceLoader pipeline with a new stage

Ahead-of-time build steps are the only solution for many problems, so it's fortunate they have such a proven track record of success both within and beyond the MediaWiki ecosystem. As everyone who is already using ahead-of-time build steps has discovered, they're the perfect complement to ResourceLoader's just-in-time build steps.

However, this is a problem at scale and it needs to be solved at scale. Informal developer builds work surprisingly well but aren't as efficient for developers as they could be. We need to extend the pipeline to include a pre-ResourceLoader stage. This stage is an ahead-of-time build step.

CAPTION: The International Space Station was built on Earth in modules that were optimized for assembly and constructed in orbit. Similarly, ResourceLoader modules can be built before deployment and finally assembled in the user's browser. Image by NASA/Crew of STS-132 / public domain.
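
As a rough illustration of how such a two-stage pipeline could fit together, the Python sketch below runs a hypothetical `build_stage` once before deployment and leaves only request-dependent work (here, picking a language) to a hypothetical `runtime_stage`. The file layout, the `manifest.json` name, the `.tmpl` extension, and the message catalogue are assumptions made for the example; none of them describe ResourceLoader's actual interfaces.

```python
import json
import tempfile
from pathlib import Path
from string import Template

# Hypothetical message catalogue; a real project keeps these in i18n files.
MESSAGES = {
    "en": {"greeting": "Welcome"},
    "de": {"greeting": "Willkommen"},
}


def build_stage(src_dir: Path, dist_dir: Path) -> Path:
    """Ahead-of-time stage: run once before deployment, write static outputs."""
    dist_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for src in sorted(src_dir.glob("*.tmpl")):
        # Toy "compile" step: collapse whitespace. A project would call its
        # own tooling here instead.
        built = " ".join(src.read_text(encoding="utf-8").split())
        out = dist_dir / (src.stem + ".built")
        out.write_text(built, encoding="utf-8")
        manifest[src.stem] = out.name
    manifest_path = dist_dir / "manifest.json"
    manifest_path.write_text(json.dumps(manifest), encoding="utf-8")
    return manifest_path


def runtime_stage(dist_dir: Path, module: str, lang: str) -> str:
    """Just-in-time stage: only the cheap, request-dependent work stays here."""
    manifest = json.loads((dist_dir / "manifest.json").read_text(encoding="utf-8"))
    built = (dist_dir / manifest[module]).read_text(encoding="utf-8")
    return Template(built).substitute(MESSAGES[lang])


def demo() -> dict:
    """Build one hypothetical module, then serve it in every known language."""
    with tempfile.TemporaryDirectory() as tmp:
        src_dir, dist_dir = Path(tmp) / "src", Path(tmp) / "dist"
        src_dir.mkdir()
        (src_dir / "banner.tmpl").write_text(
            "<h1>\n    $greeting\n</h1>\n", encoding="utf-8"
        )
        build_stage(src_dir, dist_dir)  # happens once, before deployment
        return {lang: runtime_stage(dist_dir, "banner", lang) for lang in MESSAGES}


if __name__ == "__main__":
    print(demo())
```

The split mirrors the argument above: work that does not depend on the request moves into the ahead-of-time stage, where each project is free to pick its own tools, while the runtime stage keeps only what must vary per request.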

In conclusion:

  • ResourceLoader provides useful just-in-time build steps.
  • Many projects have requirements that cannot be solved at runtime. These real problems are only solvable by traditional ahead-of-time build steps.
  • Just-in-time and ahead-of-time build steps are both already in use, both serve everyone, and we can't change that.
  • Ahead-of-time build steps often use standard tools but are highly project specific. These should not be centralized nor should they be constrained by artificial limitations. Per-project solution autonomy must be preserved.
  • Adding a pre-ResourceLoader stage can integrate neatly with the current ResourceLoader system by extending the pipeline to include these existing ahead-of-time workflows.

Above all, a build step means freedom. The freedom to succeed and the freedom to use the tool that's right for the job, not the rare tool that fits into a runtime-only pipeline.

Thanks to Jan Drewniak, Santhosh Thottingal, Daniel Cipoletti, Joe Walsh, Bernd Sitzmann, and Mónica Pinedo Bajo for reviewing and providing detailed feedback.

This post is also available on the Wikimedia Tech Blog.

Production Excellence #22: June 2020

16:54, Tuesday, 28 2020 July UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

📈 Month in review
  • 4 documented incidents in June. [1]
  • 37 new production errors were filed and 27 were closed. [2] [3]
  • 72 recent production errors still open (up from 68).
  • 203 total Wikimedia-prod-error tasks currently open (up from 192). [4]

For more about recent incidents, see Incident documentation on Wikitech, or Preventive measures in Phabricator.


📖 Outstanding errors

Breakdown of new errors reported in June that are still open today:

  1. (Needs owner) / Newsletter extension: Unexpected locking SELECT query. T253926
  2. (Needs owner) / FlaggedRevs extension: Unable to submit review of page due to bad fr_page_id record. T256296
  3. Editing team / MassMessage extension: Delivery fails due to system user conflict. T171003
  4. Parsing team / Parsoid: Pagebundle data unavailable due to a bad UTF-8 string. T236866
  5. Growth team / Recent changes: Update for ActiveUsers data failing due to deadlock. T255059
  6. Growth team / GrowthExperiments: Issue with question display on personal homepage. T255616
  7. Language team / Translate extension: Update jobs fail due to invalid function call. T255669
  8. Language team / ContentTranslation: Save action fails due to duplicate insert query. T256230
  9. Core Platform team / Content handling: Incompatible content type during content merge/stash. T255700
  10. Core Platform team / Monolog: API usage logs and error logs sometimes missing due to socket failure. T255578
  11. Search Platform team / WikibaseCirrus: Elevated error levels from EntitySearchElastic warnings. T255658
  12. Wikidata / API: Generator query fails due to invalid API result format. T254334
  13. Wikidata / API: EntityData query emits warning about bad RDF. T255054
  14. Wikidata / Repo: Entity relation update jobs fail due to deadlock. T255706

📊 Trends
Take a look at the workboard and look for tasks that could use your help.

Summary over recent months:

  • July 2019 (5 of 18 tasks left): Two tasks closed.
  • August (1 of 14 tasks left): Another task closed, only one remaining! 🚀
  • September (5 of 12 tasks left): Two tasks closed.
  • October (6 of 12 tasks left), no change.
  • November (3 of 5 tasks left): Another task closed.
  • December (5 of 9 tasks left), no change.
  • January 2020 (5 of 7 tasks left), no change.
  • February (4 of 7 tasks left), no change.
  • March (2 of 2 tasks left), no change.
  • April (11 of 14 tasks left): Three tasks closed.
  • May (11 tasks left): Three tasks closed.
  • June: 14 new tasks survived the month of June. ⚠️

At the end of May the number of open production errors over recent months was 68. Of those, 10 got closed, but with 14 new tasks from June still open, the total has grown further to 72.

The workboard had 192 open tasks last month. That number has increased again, to 203 open tasks (this includes tasks from 2019 and earlier).


🎉 Thanks!

Thank you to everyone else who helped by reporting, investigating, or resolving problems in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof


ATC: “Do you want to report a UFO?” Pilot: “Negative. We don't want to report.”
ATC: “Do you wish to file a report of any kind to us?” Pilot: “I wouldn't know what kind of report to file.”
ATC: “Me neither…”

Footnotes:
[1] Incidents. – https://wikitech.wikimedia.org/wiki/Incident_documentation#2020
[2] Tasks created. – https://phabricator.wikimedia.org/maniphest/query/VTpmvaJLYVL1/#R
[3] Tasks closed. – https://phabricator.wikimedia.org/maniphest/query/qn5yeURqyl3D/#R
[4] Open tasks. – https://phabricator.wikimedia.org/maniphest/query/Fw3RdXt1Sdxp/#R

Ornithologists in cartoons

06:56, Tuesday, 28 2020 July UTC
From: The Graphic. 25 April 1874.
It is said that the modern version of badminton evolved from a game played in Poona (some sources name the game itself as Poona). When I saw this picture from 1874 about five years ago, I gave little thought to it. Revisiting it now, after some research on one of A.O. Hume's ornithological collaborators, I have a strong hunch that one of the people depicted in the picture is recognizable, although it is not going to be easy to confirm this.

I recently created a Wikipedia entry for a British administrator who worked in the Bombay Presidency, G.W. Vidal, when I came across a genealogy website (whose maintainer unfortunately was uncontactable by email) with notes on his life that included a photograph in profile and a cartoon. The photograph was apparently taken by Vidal himself, a keen amateur photographer apart from being a snake and bird enthusiast. As with other naturalists of that epoch, many of his specimens were shot, skinned or pickled and sent off to museums or specialists. He was an active collaborator of Hume and contributed a long note in Stray Feathers on the birds of Ratnagiri District, where he was a senior ICS official. He continued to contribute notes after the ornithological exit of Hume, to the Journal of the Bombay Natural History Society. This gives further support for an idea I have suggested before that a key stimulus for the formation of the BNHS was the end of Stray Feathers. Vidal's mother has a claim to being the first woman novelist of Australia. Interestingly, one of his daughters, Norah, married Major Robert Mitchell Betham (2 May 1864 – 14 March 1940), another keen amateur ornithologist born in Dapoli, who is well-known in Bangalore birding circles for being the first to note Lesser Floricans in the region. Now Vidal was involved in popularizing badminton in India, apparently creating some of the rules that allowed matches to be played. The man at the left in the sketch in the 1874 edition of The Graphic looks quite like Vidal, but who knows! What do you think?

PS: Vidal sent bird specimens to Hume, and at least two subspecies have been named from his specimens after him - Perdicula asiatica vidali and Todiramphus chloris vidali.

For more information on Vidal, do take a look at the Wikipedia entry. More information from readers is welcome as usual.

PS: 26-July-2020: It would appear that an old badminton court near Sholapur was also of some ornithological interest.
I think I can safely say that there is only one place in India where this bird has been shot, and where I have shot it during every month in the year, and this Sholapur. There was a grass and baubul jungle near the old Fives court on the Bijapur Road which always contained florican. - "Felix" (1906). Recollections of a Bison & Tiger Hunter. London:J.M. Dent & Co. p. 183.

Special Olympics staff improve Wikipedia’s equity

16:32, Monday, 27 2020 July UTC

Jamie Valis is the Director of Health Training at Special Olympics, Lindsay Dubois is the Director, Research and Evaluation at Special Olympics, and Chelsea Fosse is a public health dentist, former Coordinator of the Clinical Director Community of Practice at Special Olympics, and Senior Health Policy Analyst at the American Dental Association. Jamie, Lindsay, and Chelsea recently participated in a WITH Wiki Scientists training course and reflect on their experience related to health disparities for individuals with intellectual disabilities.

Jamie Valis, Lindsay Dubois, and Chelsea Fosse
Jamie Valis, left; Lindsay Dubois, center; and Chelsea Fosse, right

At first glance, a Special Olympics competition looks like any other sporting event. Passion radiates from faces on the field, stretching is happening on the sidelines in preparation for an upcoming match, seas of uniform colors form as teams flock together throughout the facility, and fans cheer loudly. But what you may miss if you don’t look closely is that life-saving health services and screenings are also taking place on-site during these competitions. Special Olympics athletes have intellectual disabilities (ID) and face greater barriers accessing and utilizing our complex health care system and suffer at disproportionate rates from chronic health conditions. They have difficulty finding providers who are trained and willing to serve them, and may struggle with or need the support of others in their day-to-day health needs. Special Olympics offers these critical health screenings to help end the health disparities and health care inequities that exist and are experienced by people with ID.

Special Olympics – with support from the Centers for Disease Control and Prevention (CDC) and the Golisano Foundation – has been striving to correct these inequities and promote the health and wellness of athletes and those individuals with ID. Our Inclusive Health programming includes: training health care providers and health professional students to provide higher quality care for people with ID; performing health screenings; offering health services such as prescription glasses, shoe fittings, and fluoride treatments; and referring athletes for follow-up care when necessary. While we’re incredibly proud of the progress we’ve made in bringing quality health care to our athletes in a positive and encouraging environment, we recognize there are so many more lives we need to touch, and additional work that needs to be done to increase awareness of the health disparities that exist in our communities around the world.

When WITH Foundation and Wiki Education announced the opportunity for those in the disability and health space to come together to expand and enhance their knowledge base on disability and health, we were eager to participate. The call for participation in this program came through American Academy of Developmental Medicine & Dentistry (AADMD) on January 11th, 2020 – the exact day that Chinese media reported the first death due to COVID-19. On the first day of the Wiki Scientists course there were 0 confirmed cases of COVID-19 in the United States. The day the class ended, there were 22,086 confirmed cases in the United States. The correlation between the timeline of COVID-19 and this course is noteworthy given people with developmental disabilities are one of the most vulnerable populations to the virus. The most important risk for people with ID, however, is not the underlying condition, but the lack of access to quality health care and subsequent health inequities.

Three Special Olympics staff members were part of an 18-member cohort of students who were selected to participate in a 12-week WITH Wiki Scientists course to understand how to make scholarly research and Wikipedia contributions accessible to those with disabilities. The educational backgrounds, professional experiences, and personal journeys were integrated to allow each participant to explore their areas of interest while contributing to a broader mission. Participating in this course with such a diverse group allowed us to collaborate with other disability advocates and the broader disability community. We were reminded that many of the battles we face on a daily basis are not unique to people with intellectual disabilities.

Understanding how to utilize our sandboxes (pages designed for testing edits on Wikipedia), becoming proficient at using the visual editor, and evaluating wiki articles were amongst the many components of the weekly training assignment that we completed. Throughout the course, we were challenged to contribute to two Wikipedia articles. We placed emphasis on articles that had high relevance for understanding the health needs of people with intellectual disabilities and for solutions to improving health equity.

While the course focused on the basics of making contributions to existing articles and creating new Wikipedia entries, there was also emphasis on the overall concept of inclusivity and the need to create content that people of all abilities can read and understand. To address this, Wikipedia launched a simple English version of its platform for people with different needs, children, adults with learning difficulties, and people who are trying to learn English. The caveat is that content contributors need to write and submit this content separately.

Wikipedia editors are like anyone else, and they have their own biases and interests. Wikipedia is a reflection of the people who create it, and not necessarily the experiences and contributions of the broader population. This means that articles may have a more “ableist” point of view if they were written by scholars or contributors who don’t understand the lived experiences and needs of people with intellectual and developmental disabilities; these articles may use terminology that makes assumptions about the abilities of people with disabilities.

What became clear throughout the course was that Wikipedia, for all its vast wisdom and knowledge, is not immune to the shortcomings that we continue to observe in the fight for equity for people with intellectual and developmental disabilities. As a result, the Special Olympics staff involved in this course propose a call to action for all scholars, experts, and other interested individuals to do the following:

1) Set up a Wikipedia account and learn how to create and review content.

2) Work on a simple language summary of your findings, recommendations, and guidelines every time you publish content on Wikipedia about people with intellectual and developmental disabilities.

3) Review existing Wikipedia entries that are related to your areas of expertise and help create simple English versions of this content.

It has been such a rewarding experience to meet new colleagues, collaborate with other disabilities advocates, and broaden our own horizons on issues faced by others in the disabilities community.

Lindsay Dubois, PhD
Chelsea Fosse, DMD, MPH
Jamie Valis, PhD

 

Interested in taking a course like the one Lindsay, Chelsea, and Jamie took? Visit learn.wikiedu.org to see current course offerings.

Header/thumbnail image by Maggie Mengel, AKSM photography, CC BY-SA 4.0 via Wikimedia Commons. Images of authors courtesy Jamie Valis.

Tech News issue #31, 2020 (July 27, 2020)

00:00, Monday, 27 2020 July UTC

weeklyOSM 522

10:33, Sunday, 26 2020 July UTC

14/07/2020-20/07/2020

Victoria Crawford’s streets in Hackney align with the rising sun [1] | © Victoria Crawford | map data © OpenStreetMap contributors

Mapping

  • Michael Reichert would like to revert a series of surface and tracktype tags added without local knowledge by an armchair mapper. A discussion followed on the feasibility of mapping these tags only from imagery and whether reverting would be justified in all countries affected.
  • Cycling infrastructure has been selected as the UK quarterly project for Q3 2020. The project’s wiki page lists details of the types of things that can be done as part of the project. Of particular note is the availability of large quantities of cycling-related data from Transport for London which needs merging with OSM.
  • Adamant36’s request to add RapiD to the list of editors on the OSM map resulted in a robust discussion on GitHub.
  • Supaplex030 describes (de) (automatic translation) in his blog how measuring vibrations with a smartphone sensor gives excellent results to objectively classify the flatness of surfaces and map them with smoothness=*.
  • The MITFAHR|DE|ZENTRALE has published a guide (de) (automatic translation) on how to find railway stations without bike parking facilities.

Community

  • Stephan Knauss asked what happened to the FOSSGIS affiliate link for Amazon.de. Frederik Ramm replied that the link still exists but the revenue is declining. The statistics have been updated for those who are interested.
  • Andy Allan had a productive day working on the OpenStreetMap website.
  • Deane Kensok blogged about a partnership between Esri and Facebook to provide data from the ArcGIS user community to OSM. These datasets are OSM-tagged, compatibly licensed, and available to use for building maps in RapiD and JOSM (via a plugin).
  • In Geomob Podcast #24 Ed Freyfogle interviews Peter Karich, co-founder of routing software maker Graphhopper. Graphhopper is a unique success story – they have built a thriving business on top of an open-source software project and OpenStreetMap data.
  • In Geomob Podcast #25 Ed chats with geo industry veteran Randy Meech, co-founder of StreetCred. Previously Randy was CEO of Mapzen, and CTO of MapQuest.
  • At the close of the State of the Map 2020 a live quiz with an OpenStreetMap theme was held, in which our team member ‘Nakaner’ took first place. A few days after the conference finished, Ilya Zverev published the quiz as a standalone game. Test your knowledge of OpenStreetMap history and technology, or use the code to create your own quiz!
  • Ed Neerhut wrote in Geospatial World about the importance of maps in a crisis.

OpenStreetMap Foundation

  • openstreetmap.org was offline for a short period of time. Due to a memory-leakage issue with creating planet dumps, the server (ironbelly) was running out of memory, which interfered with its other role as NFS (Network File System) server for the web site. In order to prevent this in future, user images will be moved to Amazon’s S3.
  • The minutes of the Local Chapters and Communities Working Group meeting of 15 June have been published.
  • OpenStreetMap US has applied to become an official OSMF local chapter. The supporting documentation is available on the OSMF wiki.
  • The OpenStreetMap Ops Team tweeted a reminder that any Bitcoin donated to the OSMF would be used to build a stronger project.

Humanitarian OSM

  • Nominations for this year’s Humanitarian OpenStreetMap Team Board of Directors election have closed. The candidates have till 31 July to present their ideas and engage with the membership on how they would contribute to HOT in an elected role. You can find the list of candidates, the details of their nomination, and links to their candidate statements on the wiki page.
  • HOT published in a blogpost what was achieved with the 2019 HOT Microgrants.
  • Mikel Maron wrote a blogpost about what HOT needs to work on until 2025.

Maps

  • The LandlordTech project has created a survey in order to collect and display data on new forms of housing injustice caused by surveillance, tracking, data accumulation, and algorithmic methods intruding into domestic and neighbourhood spaces.
  • John Nelson wrote about firefly cartography and how these glowing maps attract feedback like no other type of map that he has made.
  • Michal Migurski announced that Facebook is releasing an update to Daylight – a 56 gigabyte export of validated OSM data.
  • A YouTube video about South Korea’s map service policy depicts Google’s fight for better map data, and the companies profiting from the situation.

Licences

  • The minutes of the Licensing Working Group meeting on 9 July have been published. The Attribution Guideline, for which the Board of Directors would prefer a stricter version than the one LWG proposed, was given a lot of attention. Simon Poole also announced that he is stepping down from the LWG.

Software

  • New features have been added to the browser extension OSM Smart Menu. Now users can create links using URL Templates [1], rename existing links [2] and access the link list from any website [3].
  • John Vargas-Muñoz et al. have published a paper reviewing the use of machine learning to improve OSM data and machine learning based techniques that use OSM data for applications in other domains.

Programming

  • hauke-stieler has developed a new task management tool, as an alternative to the HOT tasking manager or MapCraft, and called it Simple Task Manager (STM). On the mailing list Hauke explains (de) (automatic translation) in detail what made him do it and how to work with the STM.
  • [1] Victoria Crawford reports on Twitter about her map, which shows which roads in Hackney run toward the rising sun during the year. She says she was inspired by puntofisso, who in turn refers to Cédric Scherer’s (@CedScherer) 30DayMapChallenge. She has published her code on GitHub.
  • Alexey Pechnikov reported (ru) (automatic translation) about his experiences with creating PostgreSQL / PgRouting routing systems using OpenStreetMap data.

Did you know …

  • … that the OpenStreetMap wiki has an A to Z to help you figure out how to tag objects?
  • … that you can view 19th century maps of Europe on Mapire?

Other “geo” things

  • The Group on Earth Observations (GEO) and Google Earth Engine (GEE) have announced that 32 projects, from 22 countries, will be awarded a total of US$3 million towards production licences and US$1 million in technical support from EO Data Science, to tackle some of the world’s greatest challenges using open Earth data.
  • Alexander Zipf reported on work done at Heidelberg University to develop a new routing algorithm for pedestrians. The algorithm minimises the exposure of pedestrians to traffic noise pollution while taking into account the route distance.
  • The history of indigenous peoples and their political movements is an important issue in Taiwan. Researchers in east coast county Hualian have trained members of the Bunun people to use GPS devices, and in an expedition surveyed traces of the abandoned settlements where their ancestors lived. They found (automatic translation) 50 historic remains such as abandoned houses in the remote area.

Upcoming Events

Where | What | When | Country
Ludwigshafen a.Rhein (Stadtbibliothek) | Mannheimer Mapathons e.V. | 2020-07-23 | Germany
Düsseldorf | Düsseldorfer OSM-Stammtisch | 2020-07-29 | Germany
London | London Missing Maps Mapathon (ONLINE) | 2020-08-04 | UK
Stuttgart | Stuttgarter Stammtisch | 2020-08-05 | Germany
Taipei | OSM x Wikidata #19 | 2020-08-10 | Taiwan
Zurich | 120. OSM Meetup Zurich | 2020-08-11 | Switzerland
Munich | Münchner Stammtisch | 2020-08-12 | Germany
Kandy | 2020 State of the Map Asia | 2020-10-31 to 2020-11-01 | Sri Lanka

Note: If you would like to see your event here, please put it into the calendar. Only data that is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by AnisKoutsi, Joker234, MatthiasMatthias, Nakaner, Nordpfeil, Polyglot, Rogehm, SK53, TheSwavu, derFred, mOlind, richter_fn.

First a definition: "When data is biased, we mean that the sample is not representative of the entire population". This approach successfully underpins the Women in Red project; currently a percentage of 18.51% women in English Wikipedia has been achieved. Compare the coverage of Anglo-American politicians with that of politicians from the whole of Africa: the bias in the data at Wikidata is already obvious, and it will then have numbers attached to it.

This is not a problem for Wikidata alone and yes, we can have a project and include a lot of data to get to a growth percentage as we did for the Women in Red. Worthwhile in its own right but in this way we do not forge a closer relation with its "premier brand Wikipedia". It would be mere stamp collecting.

The best argument for having data in Wikidata is that it is used. This is done in self selecting Wikipedias through global info boxes and lists. Interwiki links are used on every Wikipedia. Integrating the necessary functionality is a meta/technical affair and firmly for the Wikimedia Foundation to own. 

The functionality to make this happen implements an existing idea with additional twists.
  • Pictures for the subject are linked to courtesy of Special:MediaSearch
  • Automated descriptions are provided in every language to aid disambiguation. At first the functionality by Magnus is used and it is to be replaced with improved descriptions provided by Abstract Wikipedia
  • A Reasonator like display is provided to inform on the data we have on an item.
  • Suggestions for the inclusion in categories and lists are provided based on Wikidata definitions for categories and lists.
  • To help people find sources, alternate sources, Scholia is included when there are papers about the subject. Once existing citations are available, they are an additional resource
In essence this is a toolset that you can opt into as an individual and/or it is the standard for a project. Particularly for the smaller projects this will prove to be really valuable; it will prevent false friends, it indicates heavily linked items that do not have an article. It stimulates the addition of labels because it is beneficial in finding illustrations. 

This proposal is relatively low tech and it will bring our many communities together by providing widely the information that is available to us.
Thanks,
     GerardM

What to love in English Wikipedia

19:03, Thursday, 23 2020 July UTC
This list of commissioners of the Arusha Region is great, it provides the basic information that enables me to include this information in Wikidata. It can be assumed that they are all from Tanzania, politicians and human as well. 

What I love in English Wikipedia are lists like this. It is more than likely that for every Tanzanian region there will be a similar list, and as a consequence we can include all these fine politicians in Wikidata and list them in whatever Wikipedia.

As more politicians for Tanzania or any other African country are added, politicians will pop up who have held multiple offices. This will be explicit in Wikidata and in Wikipedia you could use Special:WhatLinksHere.

Technically there is not much stopping us from associating red links with Wikidata items. This is the same guy used in the "WhatLinksHere" and you find him in this list that is a work in progress as well. 

Think this through.. With lists like this in any Wikipedia, these people are findable, linkable. It will be possible to state in text what a given commissioner did and, there will be no ambiguity because of the link. 

So I love English Wikipedia for the rich resource of information it is. I love its editors who provide us with the information that enables the reuse of data. I will rejoice when it is recognised that we can do much more. When we accept that together, as an ecosystem, we are in a position where we actually share the sum of all knowledge that is available to us.
Thanks,
        GerardM

Manjari - 4th anniversary

04:40, Thursday, 23 2020 July UTC

A rough drawing I did on 20 November 2014 and shared with my friends as a new font idea. I got this concept from my explorations of perfect curves in Malayalam script after I released the Chilanka font. I spent all my free time from then onwards, until releasing the Manjari typeface on 23rd July 2016, making it as perfect as I could. I also took two months off from my job in 2016 to complete this work.

Production Excellence #16: October 2019

03:09, Thursday, 23 2020 July UTC

How’d we do in our strive for operational excellence last month? Read on to find out!

📊 Month in numbers
  • 3 documented incidents. [1]
  • 33 new Wikimedia-prod-error reports. [2]
  • 30 Wikimedia-prod-error reports closed. [3]
  • 207 currently open Wikimedia-prod-error reports in total. [4]

There were three recorded incidents last month, which is slightly below our median of the past two years (Explore this data). To read more about these incidents, their investigations, and pending actionables; check Incident documentation § 2019.


📖 To Log or not To Log

MediaWiki uses the PSR-3 compliant Monolog library to send messages to Logstash (via rsyslog and Kafka). These messages are used to automatically detect (by quantity) when the production cluster is in an unstable state. For example, due to an increase in application errors when deploying code, or if a backend system is failing. Two distinct issues hampered the storing of these messages this month, and both affected us simultaneously.

Elasticsearch mapping limit

The Elasticsearch storage behind Logstash optimises responses to Logstash queries with an index. This index has an upper limit to how many distinct fields (or columns) it can have. When reached, messages with fields not yet in the index are discarded. Our Logstash indexes are sharded by date and source (one for “mediawiki”, one for “syslog”, and one for everything else).

This meant that error messages were only stored if they only contained fields used before, by other errors stored that day. Which in turn would only succeed if that day’s columns weren’t already fully taken. A seemingly random subset of error messages was then rejected for a full day. Each day it got a new chance at reserving its columns, so long as the specific kind of error is triggered early enough.

To unblock deployment automation and monitoring of MediaWiki, an interim solution was devised. The subset of messages from “mediawiki” that deal with application errors now have their own index shard. These error reports follow a consistent structure, and contain no free-form context fields. As such, this index (hopefully) can’t reach its mapping limit or suffer message loss.

The general index mapping limit was also raised from 1000 to 2000. For now that means we’re not dropping any non-critical/debug messages. More information about the incident at T234564. The general issue with accommodating debug messages in Logstash long-term, is tracked at T180051. Thanks @matmarex, @hashar, and @herron.
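The dropping behaviour described above can be illustrated with a toy model. This is a minimal sketch in Python and not how Elasticsearch or Logstash are implemented: the class name `FieldLimitedIndex`, its methods, and the example field names are all hypothetical, and the limit is set to 3 instead of 1000 to keep the example short.

```python
class FieldLimitedIndex:
    """Toy model of one day's index with a cap on distinct field names."""

    def __init__(self, field_limit):
        self.field_limit = field_limit
        self.fields = set()   # field names already reserved in this index
        self.stored = []      # messages that were accepted
        self.dropped = 0      # messages rejected because of the cap

    def ingest(self, message):
        """Store the message if its fields fit under the cap, else drop it."""
        new_fields = set(message) - self.fields
        if len(self.fields) + len(new_fields) > self.field_limit:
            self.dropped += 1
            return False
        self.fields |= new_fields
        self.stored.append(message)
        return True


# Hypothetical messages; a debug message arrives first and uses up the cap.
index = FieldLimitedIndex(field_limit=3)
index.ingest({"message": "debug A", "user_id": 1, "wiki": "enwiki"})
# This error needs a fourth field, so it is dropped.
index.ingest({"message": "fatal error", "exception_trace": "stack"})
# This message only uses fields that are already known, so it is stored.
index.ingest({"message": "debug B", "wiki": "dewiki"})
```

In this model, the same error would be stored if it arrived before the debug messages had used up the available fields. That matches the "seemingly random" loss described above. A dedicated index for messages with a fixed structure avoids the problem because its set of fields stops growing.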

Crash handling

Wikimedia’s PHP configuration has a “crash handler” that kicks in if everything else fails. For example, when the memory limit or execution timeout is reached, or if some crucial part of MediaWiki fails very early on. In that case our crash handler renders a Wikimedia-branded system error page (separate from MediaWiki and its skins). It also increments a counter metric for monitoring purposes, and sends a detailed report to Logstash. In migrating the crash handler from HHVM to PHP7, one part of the puzzle was forgotten. Namely the Logstash configuration that forwards these reports from php-fpm’s syslog channel to the one for mediawiki.

As such, our deployment automation and several Logstash dashboards were blind to a subset of potential fatal errors for a few days. Regressions during that week were instead found by manually digging through the raw feed of the php-fpm channel. As a temporary measure, Scap was updated to consider the php-fpm channel as well in its automation that decides whether a deployment is “green”.

We’ve created new Logstash configurations that forward PHP7 crashes in a similar way as we did for HHVM in the past. Bookmarked MW dashboards/queries you have for Logstash now provide a complete picture once again. Thanks @jijiki and @colewhite! – T234283


📉 Outstanding reports

Take a look at the workboard and look for tasks that might need your help. The workboard lists error reports, grouped by the month in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Or help someone that’s already started with their patch:
Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • March: 1 report fixed. (3 of 10 reports left).
  • April: 8 of 14 reports left (unchanged). ⚠️
  • May: (All clear!)
  • June: 9 of 11 reports left (unchanged). ⚠️
  • July: 13 of 18 reports left (unchanged).
  • August: 2 reports were fixed! (6 of 14 reports left).
  • September: 2 reports were fixed! (10 of 12 new reports left).
  • October: 12 new reports survived the month of October.

🎉 Thanks!

Thank you, to everyone else who helped by reporting, investigating, or resolving problems in Wikimedia production. Thanks!

Until next time,

– Timo Tijhof


🌴“Gotta love crab. In time, too. I couldn't take much more of those coconuts. Coconut milk is a natural laxative. That's something Gilligan never told us.”

Footnotes:
[1] Incidents. – wikitech.wikimedia.org/wiki/Special:PrefixIndex?prefix=Incident…
[2] Tasks created. – phabricator.wikimedia.org/maniphest/query…
[3] Tasks closed. – phabricator.wikimedia.org/maniphest/query…
[4] Open tasks. – phabricator.wikimedia.org/maniphest/query…

Production Excellence #17: December 2019

03:09, Thursday, 23 2020 July UTC

How’d we do in our strive for operational excellence in November and December? Read on to find out!

📊 Month in numbers
  • 0 documented incidents in November, 5 incidents in December. [1]
  • 17 new Wikimedia-prod-error reports. [2]
  • 23 Wikimedia-prod-error reports closed. [3]
  • 190 currently open Wikimedia-prod-error reports in total. [4]

November had zero reported incidents. Prior to this, the last month with no documented incidents was December 2017. To read about past incidents and unresolved actionables, check Incident documentation § 2019.

Explore Wikimedia incident graphs (interactive)


📖 Many dots, do not a query make!

@dcausse investigated a flood of exceptions from SpecialSearch, which reported “Cannot consume query at offset 0 (need to go to 7296)”. This exception served as a safeguard in the parser for search queries. The code path was not meant to be reached. The root cause was narrowed down to the following regex:

/\G(?<negated>[-!](?=[\w]))?(?<word>(?:\\\\.|[!-](?!")|[^"!\pZ\pC-])+)/u

This regex looks complex, but it can actually be simplified to:

/(?:ab|c)+/

This regex still triggers the problematic behavior in PHP. It fails with a PREG_JIT_STACKLIMIT_ERROR, when given a long string. Below is a reduced test case:

$ret = preg_match( '/(?:ab|c)+/', str_repeat( 'c', 8192 ) );
if ( $ret === false ) {
    print( "failed with: " . preg_last_error() );
}
  • Fails when given 1365 contiguous c on PHP 7.0.
  • Fails with 2731 characters on PHP 7.2, PHP 7.1, and PHP 7.0.13.
  • Fails with 8192 characters on PHP 7.3. (Might be due to php-src@bb2f1a6).

In the end, the fix we applied was to split the regex into two separate ones, remove the quantified non-capturing group, and loop at the PHP level instead (Gerrit change 546209).

The lesson learned here is that the code did not properly check the return value of preg_match. This check is even more important because the size allowed for the JIT stack changes between PHP versions.

For future reference, @dcausse concluded: The regex could be optimized to support more chars (~3 times more) by using atomic groups, like so /(?>ab|c)+/. — T236419
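The restructuring described above (moving the repetition out of the regex and into application code) can be sketched as follows. This is only an illustration in Python, not the code from the Gerrit change: the function name `consume_word` is hypothetical, the pattern is the reduced `ab|c` case rather than the real search-query regex, and Python's `re` module does not use the PCRE JIT, so the sketch shows the shape of the fix rather than the original failure.

```python
import re

# One token per match: either the two-character unit "ab" or the single "c".
# This is the reduced pattern /(?:ab|c)+/ with the "+" moved into a loop.
TOKEN = re.compile(r"ab|c")


def consume_word(text: str, start: int = 0) -> int:
    """Return the offset just past the longest run of tokens beginning at start.

    Each iteration matches a single token, so the regex engine never has to
    track a long repetition itself; the loop does that work instead.
    """
    pos = start
    while True:
        match = TOKEN.match(text, pos)
        if match is None:
            # No further token at this offset: the run ends here.
            return pos
        pos = match.end()
```

With this shape, the length of the input affects only the number of loop iterations, and a failed match is always handled explicitly because the loop has to test for it.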


📉 Outstanding reports

Take a look at the workboard and look for tasks that might need your help. The workboard lists error reports, grouped by the month in which they were first observed.

https://phabricator.wikimedia.org/tag/wikimedia-production-error/

Or help someone that’s already started with their patch:

→ Open prod-error tasks with a Patch-For-Review

Breakdown of recent months (past two weeks not included):

  • March: 3 of 10 reports left. (unchanged). ⚠️
  • April: Three reports closed, 6 of 14 left.
  • May: (All clear!)
  • June: Three reports closed. 6 of 11 left (unchanged). ⚠️
  • July: One report closed, 12 of 18 left.
  • August: Two reports closed, 4 of 14 left.
  • September: One report closed, with 9 of 12 left.
  • October: Four reports closed, 8 of 12 left.
  • November: 5 new reports survived the month of November.
  • December: 9 new reports survived the month of December.

🎉 Thanks!

Thank you to everyone who helped by reporting, investigating, or resolving problems in Wikimedia production.

Until next time,

– Timo Tijhof


Footnotes:
[1] Incidents. – wikitech.wikimedia.org/wiki/Incident_documentation#2019
[2] Tasks created. – phabricator.wikimedia.org/maniphest/query…
[3] Tasks closed. – phabricator.wikimedia.org/maniphest/query…
[4] Open tasks. – phabricator.wikimedia.org/maniphest/query…

Improve Wikipedia’s 2020 election-related articles

15:55, Wednesday, 22 2020 July UTC

The November election is coming up in just over three months, and while the presidential race will dominate headlines, there are a lot of statewide and regional district races and ballot measures that will also appear on ballots.

Informing citizens on the issues and candidates on their ballots in a neutral, fact-based way is critical to a functioning democracy. One such neutral source? Wikipedia, which receives 3 billion page views each month from the United States alone.

Wikipedia needs subject matter experts to ensure the public has access to high-quality information. But Wikipedia’s technical, procedural, and cultural barriers to entry keep most scholars out.

Enter Wiki Education’s virtual courses.

Participants will collaborate for 6 weeks to add neutral, fact-based content to articles related to local ballot initiatives, races, and issues. The course, which will run between August 21 and September 25, involves a one-hour Zoom meeting each Friday, plus an additional three hours of independent work.

Will you add your subject matter expertise to make Wikipedia better ahead of the November election? In doing so, you’ll simultaneously learn to write for Wikipedia while giving back to a resource you use daily.

Participants will receive a shareable, electronic certificate-of-completion issued by Wiki Education upon course completion. Tuition is $800, with an early bird discount that brings the total to $600 if you enroll by July 31. We encourage you to seek professional development funds to cover the cost of tuition. We’re happy to work with you to provide your department or organization more information about the course.

Many past course participants have tried to edit Wikipedia on their own or at an edit-a-thon; Wikipedia’s large cultural and technical barriers kept them from successfully contributing content. In our structured, 6-week courses, we’re able to help participants overcome those barriers to successfully edit Wikipedia.

Learn more about the course and enroll today.

Header/thumbnail image by magerleagues, CC BY-SA 2.0 via Wikimedia Commons.

Dr. Anna Lappala is an Instructor at Harvard University and Dr. Julia Dshemuchadse is an Assistant Professor at Cornell University. They recently took part in the APS Wiki Scientists Course—a partnership between the American Physical Society and Wiki Education. A second edition of this course is now accepting applications.

Anna Lappala and Julia Dshemuchadse
Anna Lappala, left, and Julia Dshemuchadse, right.

Women are still wildly outnumbered in physics and, similarly, physics fares even worse than other sciences in breaking down barriers for underrepresented minorities to enter the field.

We are both physicists, both with interdisciplinary profiles. Julia went from studying physics to doing research in—and teaching—materials science. Anna moved into physics as a PhD student, from an undergraduate degree in chemistry. We both attended the Wikipedia Edit-a-thon at the 2019 APS March Meeting, hosted by Wiki Education and the APS Committees on the Status of Women in Physics, on Minorities in Physics, and on Informing the Public (pictured above). For Anna, the edit-a-thon was not something she had planned to attend during the APS Meeting, but she was curious and decided to find out about what was going on ‘behind the scenes’ of Wikipedia editing, and whether it could be something that she could do and contribute to. Julia came with a little experience: she had started editing in 2016 when—after discussing US politics with her parents in Germany—she realized that “Black Lives Matter” didn’t have a German-language page and went ahead to translate part of the English-language version to get the German page started. We both were immediately drawn in by Jess Wade, who spoke at the APS edit-a-thon about her continuous Wikipedia advocacy and endless enthusiasm.

In the beginning, the skill to edit Wikipedia articles effectively and efficiently can seem impenetrable. Many of us think that writing a Wikipedia article is just like writing an essay, and not many would volunteer to write essays regularly. Interestingly enough, writing for Wikipedia is nothing like writing boring essays: you get to choose the topic or the person of interest and get a chance to do exciting investigative work in an attempt to find out more about something (or someone) that inspires you, finding even more interesting information than you could imagine. The process is not always straightforward and comfortable. After all, the goal is to publish on a topic or a biography that the world has not seen before, which can seem intimidating. Both of Julia’s initial articles about scientists of color—created at the hackathon—were nominated for deletion. Ultimately, one was deleted and the other one persisted, but the opaque discussion among editors and Wikipedia’s intricate notability criteria proved to be an effective deterrent. Anna’s hackathon article about a scientist was vandalized early on and she was relieved to be able to rely on Jess Wade’s support when resolving the issue. While the act of editing a page was not too hard to master, the layers of decision making and organization within Wikipedia proved to be aspects about which we could both learn a lot more.

The APS Wiki Scientist Course “Biographies of Women and Minority Physicists” held online this spring, facilitated by Wiki Education, addressed exactly the challenges that we faced when starting our “careers” as Wikipedians. We both highly valued the structure and support of the course and are deeply grateful to Elysia, who led the course, as well as our resident Wikipedia experts Ian and Ryan. Personally, we also consider ourselves extremely lucky that we had met each other—albeit briefly—prior to the course. Diving into Wikipedia editing together throughout these 12 weeks enabled us to experience the added benefits of competition and accountability. We could help each other by finding resources, exploring missing biographies, and just serving as sounding boards for text and ideas along the way.

Another component that we truly valued was the discussions about structural inequities that lead to the underrepresentation of women and minority scientists among biographical articles on Wikipedia. We found that a particularly striking aspect is how social inequities due to gender, class, racial and ethnic marginalization are aggravated in the representation of minorities among Wikipedia editors, featured scientists in the media, as well as in perceptions prevalent in society of which accomplishments are worth revering. Media coverage predominantly concentrates on straight, cis, white, male, able-bodied, “Western” scientists and simultaneously dismisses topics that are typically covered by researchers that do not fit this description (as do some members of the academic community). This skewed focus on what constitutes valuable and significant research in turn perpetuates an underrepresentation of other voices and reliable sources that report on them. While many outlets have begun to consciously counteract this bias, it is by no means a thing of the past, and it will take continued effort to remedy.

The structure of Wikipedia itself can also lead to erasure or harassment of gender and sexual minorities, creating additional barriers to enter the ranks of Wikipedia editors and therefore the opportunity to shape the content and tone of the encyclopedia. Wikipedians, however, are actively working on making Wikipedia—and its community of editors—a better place. Wiki Education and entities like the American Physical Society are joining in this effort by actively engaging with The Free Encyclopedia and training and empowering a growing set of editors that strive to contribute to a more complete and balanced resource for us all. We both are hoping to keep promoting the opportunities that lie in contributing to Wikipedia within our communities and on an institutional level, as well as—thanks to the opportunities of the Internet and the constraints forced upon us all by the COVID-19 pandemic—internationally through virtual edit-a-thons and online collaborations just like ours. (And hopefully at some point in the future also again in person …)

Interested in taking a course like the one Anna and Julia took? The American Physical Society is now accepting applications from members for the second edition of the course.

Our aim is to share the sum of the knowledge available to us with everyone, everywhere, in every language. That is what we are to achieve.

As we establish what we, as a movement, are to do, it follows that we need to measure how well we do. When a community does not play an active part for a particular goal, that too will show in the numbers.

Commons does not need to work in English only. The "Special:MediaSearch" works in all the languages we support. With this search engine enabled on every Wikipedia, we will learn how well it gets adopted in all our languages. We will know if new Wikidata labels are used in searches on Commons. We will know if more diversity is realised in the pictures used in Wikipedia. We will know how many pictures are downloaded and from what languages.

Only on the Portuguese Wikipedia do we find the governors of Mozambican provinces, and only in text. We can include them in Wikidata and make Listeria lists for them, but how do we disambiguate these politicians? What does it take to make the information about them usable for "abstract Wikipedia"? How do we assemble information about countries like Mozambique and how do we get it to the quality level that some expect? As important, how do we get people from Mozambique interested and involved?

Some Wikipedians opine that the Wikimedia Foundation does not need to raise funding for their project. Arguably this is correct, but we can raise funds for other projects, other languages elsewhere because we have more and other ambitions to realise. As we raise more money outside of the USA, more people will gain a sense of ownership. 

When we are to overcome our bias for English and our bias for Wikipedia, we need to market our other languages, our other projects. We need key performance indicators: for Wikisource, how many books were downloaded; for Commons, how many media files were downloaded and from what language.

Results need to be objective and measurable. As our research proves to have been about English Wikipedia, we have a problem. We seriously need to consider to what extent it is applicable.
Thanks,
      GerardM

NB While the bias is real and the relationship with English Wikipedians is often antagonistic, it is important to recognise  English Wikipedia as the source for much of the information that ends up in other projects. When we collaborate more, our available data will reach more people in an informative way.

Tech News issue #30, 2020 (July 20, 2020)

00:00, Monday, 20 2020 July UTC

weeklyOSM 521

10:56, Sunday, 19 2020 July UTC

07/07/2020-13/07/2020

OSM now with ÖPNV public transport map 1 | © Wikipedia | Map data © OpenStreetMap contributors

Mapping

  • Mateusz Konieczny wants to know whether there is a way of tagging to distinguish between company offices the public can walk into and those where if you tried, it would result in being escorted out by security.
  • Michael Montani has requested comments on their proposal to introduce a natural=bare_soil tag for ‘an area covered by soil, without any vegetation’. (Nabble)
  • Matthew Woehlke wants feedback on his junction=intersection proposal. The new tag would identify portions of a highway which are part of an intersection. (Nabble)
  • Skyler Hawthorne asked the tagging list if there is an accepted way to tag terrace buildings that have names.
  • Mike Thompson has noticed that network has different meanings and possible values depending on the type of route it is added to. They asked the tagging list why the network tag can’t have consistent meaning across all route types.
  • Someone has made a site relation for the Aurelian city walls of Rome. Martin Koppenhoefer asks the readers of the tagging list if this makes sense.
  • Speciality coffee is a term for the highest grade of coffee available. Jake Edmonds is seeking suggestions for how to tag cafes that serve speciality coffee.
  • Martijn van Exel’s series of Tuesday evening JOSM streams continues. He is looking for suggestions of what he could cover in future sessions.
  • Fabian Kowatsch introduced the new filter parameter available in the OpenStreetMap History Data Analytics Platform (ohsome).
  • higa4 analysed (ja) (automatic translation) the change over time of the OSM map of Japan using ohsome.
  • The Belgian Green Party launched a new tool (automatic translation) to help crowdsource information on nature reserves and forests. The tool uses and contributes to OpenStreetMap directly.

Community

  • On the Talk-at mailing list, the contributor plepe presented (de) (automatic translation) his ogd-wikimedia-osm-checker. It compares (automatic translation) the entries of different OGD datasets with Wikidata, Wikipedia, Wikimedia Commons and OpenStreetMap. The source code is also available.
  • The OSM April Fool’s joke from 2017 (adaptations to plate tectonics) was not recognised as such by ScubbX and a discussion about it was started (automatic translation) on the Talk-at mailing list recently.
  • OSMF Board member Rory McCann reported on his activities in June – both within and outside the Board.
  • Harry Wood intends to end the decade-old tradition of weekly ‘featured images’, unless others are willing to step up and take over his role.
  • Dara Carney-Nedelman blogged about their joy on discovering the OSM community. Dara also calls on all students young and old, if their summer plans may not be what they imagined, to learn a new skill: mapping.

Imports

  • Homy is asking for feedback on a proposed import of public bicycle repair stations in Baden-Württemberg, Germany.

OpenStreetMap Foundation

  • The minutes of the non-public OSMF Board meeting on 11 June 2020 have been published. The agenda included the selection of Microgrant applicants, the membership application of a possible ODbL violator, and responses to the RFC on iD governance.
  • The Data Working Group has published its activity report for the second quarter of 2020. Besides the number of tickets, it contains concise descriptions of some outstanding cases.
  • The OSMF Board has amended its Rules of Procedure.
  • The minutes of the Licensing Working Group meeting on 11 June have been published.
  • John Whelan explained why he prefers TransferWise to PayPal when making payment or donating to the OSMF.
  • The minutes of the OSM System Administrators Group meeting of 4 June have been published.

Events

  • Videos of the State of the Map 2020, which took place online, continue to be uploaded to media.ccc.de.

Humanitarian OSM

  • HOT is looking for a Head of Community to manage HOT’s Community Team. Applications close 26 July 2020.
  • HOT is supporting the Greater Accra Resilient and Integrated Development Project to assist in the protection of communities from flooding.

Maps

  • Martin Ždila announced that freemap.sk has been expanded to further (European) countries. The map menu is available in English, Slovak, Czech and Hungarian.
  • [1] The public transport map ÖPNVKarte is now available as a map layer on OpenStreetMap.org. The OpenStreetMap Blog features an article on the new layer’s arrival.
  • Are you planning an action and need an off-line map you can distribute? Using Aktionskarten such a map can be created in five minutes.
  • A Dutch user of OsmAnd would like to display 1m contours, a not unreasonable request for a resident of the relatively flat landscape of the Netherlands.

Licences

  • Reddit user brezherov asked if Google Maps is copying OSM. Their question arose when they noticed that after adding local businesses and buildings to OSM, within a couple of weeks Google Maps had significantly updated those same areas.

Software

  • Sam Crawford explained how Trail Router works. Trail Router is a route planner whose routing algorithm favours greenery and nature, and biases against busy roads.
  • Openbloc has created a new JavaScript library for the creation of 3D maps. A demo can be found here.

Programming

  • SviMik has created a tool to synchronise your Mapillary and OpenStreetCam accounts. There is a discussion thread on the Mapillary forum.
  • User K_Sakanoshita announced (ja) an update to ‘Town Walk Map Maker’ (ja). The update improves the map representation, POI information, and interface.

Did you know …

  • … you have the opportunity to contribute a little bit to OSM every day? Ilya Zverev will give you a small daily task with his telegram bot ‘OSM Streak’.
  • … MapRoulette? It gives you small and easy tasks you can complete in under a minute to improve OpenStreetMap.

Other “geo” things

  • Harald Schernthanner distributed a funny map (automatic translation) via Twitter. It is supposed to show how a Viennese person imagines Austria appears on a map.
  • The wealthier you are the more light pollution you create. Asmi Kumar explains how machine learning can estimate the wealth of an area by comparing daytime and night-time satellite images.
  • The Long Beach Post reported on how they analysed a detailed data log of every person the Long Beach Police Department stopped or detained over the span of 2019. They used OpenRefine for data cleanup and the creation of n-gram fingerprints to reconcile incorrect street names against a canonical list they created using OpenStreetMap data of official street names and intersections in Long Beach.
  • Taiwan’s standardised university entrance exam, the Advanced Subjects Test, was held from 3 to 5 July. The geography (automatic translation) quiz had much more material related to geographic technology than previous years’ tests. One quiz topic, which illustrated the concept of GIS, was the mask map and determining where to buy masks to prevent the spread of COVID-19.

Upcoming Events

Where | What | When | Country
Budapest | Auguszt patisserie test & drinks in La Piazza | 2020-07-16 | Hungary
Cologne Bonn Airport | 129. Bonner OSM-Stammtisch (Online) | 2020-07-21 | Germany
Nottingham | Nottingham pub meetup | 2020-07-21 | United Kingdom
Lüneburg | Lüneburger Mappertreffen | 2020-07-21 | Germany
Berlin | 13. OSM-Berlin-Verkehrswendetreffen (Online) | 2020-07-21 | Germany
Budapest | Cziniel patisserie test & hake on bank Római | 2020-07-21 | Hungary
Ludwigshafen a.Rhein (Stadtbibliothek) | Mannheimer Mapathons e.V. | 2020-07-23 | Germany
Düsseldorf | Düsseldorfer OSM-Stammtisch | 2020-07-29 | Germany
London | London Missing Maps Mapathon (ONLINE) | 2020-08-04 | UK
Stuttgart | Stuttgarter Stammtisch | 2020-08-05 | Germany
Kandy | 2020 State of the Map Asia | 2020-10-31 to 2020-11-01 | Sri Lanka

Note: If you would like to see your event here, please put it into the calendar. Only data that is in the calendar will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Nakaner, Nordpfeil, Polyglot, Rogehm, Supaplex, TheSwavu, YoViajo, derFred, geologist, k_zoar.

The bias for Wikipedia as a project is strong, the bias for English makes it worse. When our aim is to share the sum of all knowledge, we have to acknowledge this and consider the consequences and allow for potential remedies.

"Bias" is a loaded word. When you read the Wikipedia article it is only negative. Dictionaries give more room; an example: "our strong bias in favor of the idea". The Wikimedia Foundation is considering rebranding and it explicitly states that it seeks a closer relation with its premier brand Wikipedia.

This is a published bias. It follows that other projects do not receive the same attention, do not get the same priority. For me it is obvious that as a consequence the WMF could do better when it intends to "share in the sum of all available knowledge" let alone the knowledge that is available to it.

Arguably another more insidious bias is the bias for English, particularly the bias for the English Wikipedia. Given that the proof of the pudding is in the eating, we have a world wide public and the use for our information hardly grows. Research is done on English Wikipedia so in effect we arguably do not even know what we are talking about.

When we are to do better, it means that we need to be free to discuss our biases, present arguments and even use the arguments or publications of others to make a point. The COO of the WMF states in the context of diversity in tech and media that "when the bonus of executives relies on diversity, diversity will happen". It is reasonable to use this same argument. When the bonuses for executives of the WMF rely on the growth in all our projects, it stands to reason that they will make the necessary room for growth. When one of the best Wikipedians says "There are only a limited number of projects that the WMF can take on at any time, and this wouldn't have been my priority", this demonstrates a bias against the other projects. Arguably the WMF has never really, really, really supported other projects, it does not market them, it does not support them, they exist because the MediaWiki software allows for the functionality.

When we are to counter the institutional bias of the WMF, we have to be able to make the case, present arguments and ask for the WMF to accept the premise and consider suggestions for change. This proves to be an issue and makes our biases even more intractable.
Thanks,
       GerardM

Migrating tools.wmflabs.org to HTTPS

17:18, Friday, 17 2020 July UTC

Starting 2019-01-03, GET and HEAD requests to http://tools.wmflabs.org will receive a 301 redirect to https://tools.wmflabs.org. This change should be transparent to most visitors. Some webservices may need to be updated to use explicit https:// or protocol relative URLs for stylesheets, images, JavaScript, and other content that is rendered as part of the pages they serve to their visitors.

Three and a half years ago @yuvipanda created T102367: Migrate tools.wmflabs.org to https only (and set HSTS) about making this change. Fifteen months ago a change was made to the 'admin' tool that serves the landing page for tools.wmflabs.org so that it performs an http to https redirect and sets a Strict-Transport-Security: max-age=86400 header in its response. This header instructs modern web browsers to remember to use https instead of http when talking to tools.wmflabs.org for the next 24 hours. Since that change there have been no known reports of tools breaking.

The new step we are taking now is to make this same redirect and set the same header for all visits to tools.wmflabs.org where it is safe to redirect the visitor. As mentioned in the lead paragraph, there may be some tools that this will break due to the use of hard coded http://... URLs in the pages they serve. Because of the HSTS header covering tools.wmflabs.org, this breakage should be limited to resources that are loaded from external domains.

Fixing tools should be relatively simple. Hardcoded URLs can be updated to be either protocol relative (http://example.org → //example.org) or explicitly use the https protocol (http://example.org → https://example.org). The proxy server also sends an X-Forwarded-Proto: https header to the tool's webservice which can be detected and used to switch to generating https links. Many common web application frameworks have support for this already.
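For a Python tool whose framework does not handle this itself, detecting the header can be sketched as a small WSGI middleware. This is a minimal sketch under assumptions: the names `ForwardedProtoMiddleware` and `demo_app` and the `/example/` path are made up for illustration, and trusting the header is only reasonable when the webservice can be reached solely through the proxy. Werkzeug's `ProxyFix` middleware covers the same ground for Flask applications.

```python
class ForwardedProtoMiddleware:
    """WSGI middleware that copies X-Forwarded-Proto into wsgi.url_scheme."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # WSGI exposes the X-Forwarded-Proto request header under this key.
        proto = environ.get("HTTP_X_FORWARDED_PROTO", "").strip().lower()
        if proto in ("http", "https"):
            # Frameworks build absolute URLs from wsgi.url_scheme.
            environ["wsgi.url_scheme"] = proto
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Hypothetical tool that emits one absolute link to itself."""
    link = environ["wsgi.url_scheme"] + "://tools.wmflabs.org/example/"
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [link.encode("utf-8")]


# Wrap the application once at startup.
application = ForwardedProtoMiddleware(demo_app)
```

Values other than http or https are ignored, so a malformed header leaves the scheme reported by the server untouched.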

If you need some help figuring out how to fix your own tool's output, or to report a tool that needs to be updated, join us in the #wikimedia-cloud IRC channel.

TJ Bliss

TJ Bliss, Chief Academic Officer at the Idaho State Board of Education and former Chief Advancement Officer at Wiki Education, has been appointed to Wiki Education’s new Advisory Board.

TJ is the first member of Wiki Education’s Advisory Board that is tasked with increasing Wiki Education’s reputation and network for revenue generation. He will work closely with me on adding influencers in key areas of the organization’s programmatic focus (equity, communicating science, OER/OEP, linked open data, GLAM, Wikimedia, etc.) as well as prospective donors to the newly formed Advisory Board. 

I’m thrilled to have TJ spearhead the creation of Wiki Education’s Advisory Board with me. His passion for Open Educational Resources and for Wiki Education’s mission will make a big difference. 

TJ has a long track record of supporting Wiki Education. As a Program Officer at the William and Flora Hewlett Foundation, TJ provided a major initial grant to support Wiki Education’s efforts with Open Educational Practice. In 2017, TJ left the Hewlett Foundation to join Wiki Education’s senior staff to lead advancement, fundraising, and business development efforts. TJ’s transition to the Advisory Board will allow him to directly support Wiki Education’s important mission going forward. 

“I am honored to be able to continue my involvement with Wiki Education, which I believe is one of the most important organizations working in the open knowledge space today,” TJ says. “I’m looking forward to helping build Wiki Education’s reputation and influence, to ensure the organization can support faculty, students, and open knowledge projects for many years to come.”

Speeding up Toolforge tools with Redis

11:22, Friday, 17 2020 July UTC

Over the past two weeks I significantly sped up two of my Toolforge tools by using Redis, a key-value database. The two tools, checker and shorturls, were slow for different reasons, but now respond instantaneously. Note that I didn't do any proper benchmarking, it's just noticeably faster.

If you're not familiar with it already, Toolforge is a shared hosting platform for the Wikimedia community built entirely using free software. A key component is providing web hosting services so developers can build all sorts of tools to help Wikimedians with really whatever they want to do.

Toolforge provides a Redis server (see the documentation) for tools to use for key-value caching, pub/sub, etc. One important security note is that this is a shared service for all Toolforge users to use, so it's especially important to prefix your keys to avoid collisions. Depending on what exactly you're storing, you may want to use a cryptographically-random key prefix, see the security documentation for more details.
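As a rough illustration of the prefix advice, the sketch below builds a hard-to-guess key prefix with Python's `secrets` module. The helper names `make_key_prefix` and `prefixed` are hypothetical, and the tool name is only an example. The prefix has to be generated once and stored somewhere only the tool can read; generating a fresh one on every start would orphan everything already cached.

```python
import secrets


def make_key_prefix(tool_name: str, nbytes: int = 32) -> str:
    """Build a prefix from the tool name plus a cryptographically random token."""
    return f"{tool_name}:{secrets.token_urlsafe(nbytes)}:"


def prefixed(prefix: str, key: str) -> str:
    """Namespace a cache key so it cannot collide with another tool's keys."""
    return prefix + key


# Generate once, then persist and reuse the same value on later runs.
PREFIX = make_key_prefix("checker")
```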

Redis on Toolforge is really straightforward to take advantage of for caching, and that's what I want to highlight.

checker

Visit the tool | Source code

checker is a tool that helps Wikisource contributors quickly see the proofread status of pages. The tool was originally written as a Python CGI script and I've since lightly refactored it to use Flask and jinja2 templates.

On each page load, checker would make a database query to get the list of all available wikis, then an additional query to get information about the selected wiki, and an API query to get namespace information. This data is basically static: it only changes when a new wiki is created, which is rare.

<+bd808> I think it would be a lot faster with a tiny bit of redis cache mixed in

I used the Flask-Caching library, which provides convenient decorators to cache the results of Python functions. Using that, adding caching was about 10 lines of code.

To set up the library, you'll need to configure the Cache object to use tools-redis.

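from flask import Flask
from flask_caching import Cache

app = Flask(__name__)
# Point Flask-Caching at the shared Toolforge Redis server.
cache = Cache(
    app,
    config={'CACHE_TYPE': 'redis',
            'CACHE_REDIS_HOST': 'tools-redis',
            # Prefix every key so it can't collide with other tools' keys.
            'CACHE_KEY_PREFIX': 'tool-checker'}
)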

And then use the @cache.memoize() decorator on whatever function needs caching. I set an expiry of a week so that any changes would be picked up within a reasonable time for users.
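
To show what a memoizing decorator with an expiry does, here is a self-contained sketch that uses a plain dict as a stand-in for Redis. It is not Flask-Caching's implementation, and `memoize`, `get_wiki_info`, and the key layout are hypothetical names chosen for illustration.

```python
import functools
import json
import time

WEEK = 7 * 24 * 60 * 60  # one week, in seconds


def memoize(store, prefix, timeout, clock=time.time):
    """Cache a function's results in `store` for `timeout` seconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            # One cache entry per function name and argument list.
            key = f"{prefix}:{func.__name__}:{json.dumps(args)}"
            now = clock()
            hit = store.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args)
            store[key] = (now + timeout, value)
            return value
        return wrapper
    return decorator


store = {}   # stand-in for the Redis server
calls = []   # records how often the "expensive" function really runs


@memoize(store, "tool-checker", WEEK)
def get_wiki_info(dbname):
    calls.append(dbname)  # pretend this is a slow database query
    return {"dbname": dbname}


get_wiki_info("enwikisource")
get_wiki_info("enwikisource")  # served from the cache
```

With Flask-Caching itself, the expiry is set by passing `timeout` (in seconds) to `cache.memoize`, for example `@cache.memoize(timeout=WEEK)`.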

shorturls

Visit the tool | Source code

shorturls is a tool that displays statistics and historical data for the w.wiki URL shortener. It's written in Rust, primarily using the rocket.rs framework. It parses dumps and generates JSON data files with counts of the total number of shortened URLs, overall and by domain.

On each page load, shorturls generates an SVG chart plotting the historical counts from each dump. To generate the chart, it would need to read every single data file, over 60 as of this week. On Toolforge, the filesystem is using NFS, which allows for files to be shared across all the Toolforge servers, but it's sloooow.

<+bd808> but this circles back to "the more you can avoid reading/writing to the NFS $HOME, the better your tool will run"

So to avoid reading 60+ files on each page view, I cached each data file in Redis. There's still one filesystem call to get the list of data files on disk, but so far that seems to be acceptable.

I used the redis-rs crate combined with rocket's connection pooling. The change was about 40 lines of code. It was a bit more involved because redis-rs doesn't have any support for key prefixing or automatic (de)serialization, so I had to manually convert to and from JSON.

The data being cached is immutable, but I still set a 30 day expiry on it. If I ever change the format or cache key, I don't want the old data to sit around forever in the Redis database.
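
The tool itself is written in Rust, but the pattern described above (manual key prefixing, manual JSON conversion, a 30 day expiry) can be sketched in Python. Everything here is an assumption for illustration: `FakeRedis` is an in-memory stand-in that only mimics the `get` and `setex` calls of a Redis client, and the prefix, file name, and data are made up.

```python
import json
import tempfile
from pathlib import Path

THIRTY_DAYS = 30 * 24 * 60 * 60   # expiry in seconds
KEY_PREFIX = "tool-shorturls:"    # hypothetical prefix


class FakeRedis:
    """In-memory stand-in for a Redis client.

    Only get/setex are provided; the expiry is recorded, not enforced.
    """

    def __init__(self):
        self.data = {}
        self.ttl = {}

    def get(self, name):
        return self.data.get(name)

    def setex(self, name, time, value):
        self.data[name] = value
        self.ttl[name] = time


def load_data_file(client, path):
    """Return the parsed JSON for `path`, reading the disk only on a miss."""
    key = KEY_PREFIX + path.name       # manual key prefixing
    cached = client.get(key)
    if cached is not None:
        return json.loads(cached)      # manual deserialization
    parsed = json.loads(path.read_text())
    client.setex(key, THIRTY_DAYS, json.dumps(parsed))
    return parsed


client = FakeRedis()
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "2020-07-13.json"         # made-up file name
    path.write_text(json.dumps({"total": 123}))  # made-up data
    first = load_data_file(client, path)
    path.unlink()  # delete the file: the next call must hit the cache
    second = load_data_file(client, path)
```

Listing the directory still touches the filesystem once per request, but each file's contents come from the cache after the first read.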

Conclusion

Caching mostly static data in Redis is a great way to make your Toolforge tools faster if you are regularly making SQL queries, API requests or filesystem reads whose results don't change often. If you need help or want tips on how to make other Toolforge tools faster, stop by the #wikimedia-cloud IRC channel or ask on the Cloud mailing list. Thanks to Bryan Davis (bd808) for helping me out.

Today, we are writing to share the discovery and squashing of a bug that occurred earlier this year. This particular bug was also one of the rare instances in which we kept a Phabricator ticket private to address a security issue. To help address questions about when and why we make a security-related ticket private, we’re also sharing some insight into what happens when a private ticket about a security issue is closed.

Late last year, User:Suffusion of Yellow spotted a bug that could have allowed an association to be made between logged-in and non-logged-in edits made from the same IP address. Users with dynamic IP addresses could have been affected, even if they personally never made any non-logged-in edits.

Suffusion of Yellow created a Phabricator ticket about it, and immediately worked to get eyes on the issue. The bug was repaired with their help. We’re grateful for their sharp eyes and their diligent work to diagnose and fix the problem. As part of our normal procedure, the Security team investigated once the bug was resolved. They found no evidence of exploit. We are not able to reveal further technical details about the bug, and here is why:

When a Phabricator ticket discussing a security bug is closed, Legal and Security teams at the Wikimedia Foundation evaluate whether or not to make the ticket public. Our default is for all security tickets to become public after they are closed, so that members of the communities can see what issues have been identified and fixed. The majority of tickets end up public. But once in a while, we need to keep a ticket private.

We have a formal policy we use to determine whether a ticket can be publicly viewable, and it calls for consideration of the following factors:

  • Does the ticket contain non-public personal data? For example, in the case of an attempt to compromise an account, the ticket may include IP addresses normally associated with the account, to identify login attempts by an attacker.
  • Does the ticket contain technical information that could be exploited by an attacker? For example, in discussing a bug that was ultimately resolved, a ticket may include information about other potential bugs or vulnerabilities.
  • Does the ticket contain legally sensitive information? For example, a ticket may contain confidential legal advice from Foundation lawyers, or information that could harm the Foundation’s legal strategy.

In this case, we evaluated the ticket and decided that it could not be made public based on the criteria listed above.

Even when we can’t make a ticket public, we can sometimes announce that a bug has been identified and resolved in another venue, such as this blog. In this case, Suffusion of Yellow encouraged us to make the ticket public, and while pandemic-related staff changes have caused a delay, that request reminded us to follow through with this post. We appreciate their diligence. Keeping the projects secure is a true partnership between the communities of users and Foundation technical staff, and we are committed to keeping users informed as much as possible.

Respectfully,

David Sharpe
Senior Information Security Analyst
Wikimedia Foundation

Clara de Pablo is a Fellow in the Office of Communications and Marketing at the Smithsonian’s National Museum of American History. She enrolled in Wiki Education’s introductory Wikidata course to learn more about how to apply linked data practices to her work.

Clara de Pablo

My involvement with Wikidata began, as all great stories do, with a long and meandering intern task. I work in communications at the Smithsonian’s National Museum of American History, and we’d decided to invite members of Congress to a preview of an upcoming exhibition. To narrow down the 535 members of Congress, we asked our high-school intern Sofia to make a list of all the representatives with daughters. We thought this would take her a few days. A full month later, Sofia was still Googling senators and typing the ages of their daughters into a Word document. The information existed, but it wasn’t searchable or organized in a usable way.

Wikidata seemed like a perfect foil. Wikidata steps in essentially as Wikipedia for data — so much data exists online, and in theory, Wikidata makes it easy to search and navigate. A week or so into Sofia’s research project (she was on senators whose last names started with “S”), I signed up for a Wikidata course. 

The course itself was set up in a very friendly, helpful way. The class met every Tuesday afternoon via video call, during which our instructors, Will and Ian, would screen share and show us how to navigate or use one aspect of Wikidata. The weeks were organized into tangible objectives — learning what “item” or “property” meant, learning how to edit entries, learning how to query. Before each meeting, there was a short slideshow tutorial on our class dashboard, which introduced the concepts and often guided us through short exercises to apply them. The video calls were especially useful for troubleshooting places we were getting stuck, or to see what other people in the course were doing with their newfound skills. The instructors often recommended queries to look at for inspiration, and made themselves available to answer questions via email or Slack outside of our meeting times. 

Queries in particular demanded special attention. Wikidata is searchable through querying: using the query language SPARQL to ask a specific question and sort a massive amount of data into the dataset that answers it. As the course progressed, it became very apparent that Wikidata carries a steep learning curve. A few members of the class had figured out how to use queries to make elaborate interactive data trees; I had only succeeded in changing “Instance of: dog” to “Instance of: cat.” For fun, I tried to see if I could make a chart of the US Senators. I was frustrated for nearly an hour before I realized that I needed to add the boundary “Instance of: human.”

This points to a fundamental challenge within Wikidata: linked data is only useful or usable to people who understand querying. Without access to a months-long course, I would never have been able to figure it out. Even with the help of the course, there is still a lot about Wikidata left to learn before I can build my own datasets in a meaningful way. This barrier to usability keeps most people from joining the Wikidata community — and, just like Wikipedia, Wikidata works best when more people contribute their expertise to it. 

The course helped alleviate some of these barriers for me — I am able to create and edit items, and adapt existing queries to answer simple questions. I feel confident that I could create usable datasets using what I learned about items and properties, and link them to existing Wikidata entries. The course helped dismantle something that looked intimidating on the outside — strings of numbers that defined properties, a scary new coding language — and broke them down into a series of simple, logical steps. Ultimately, the greatest benefit of joining was having access to teachers who could help answer my questions when I got stuck. 

Wikidata has great potential in the museum field if it becomes more user-friendly. The Smithsonian (and cultural institutions across the country) have an incredible treasure trove of data and information in our collections, but it’s poorly (if at all) accessible to members of the public. The ability to use linked data to search our digital collections would make our information usable to anyone who wanted it. For example, take classroom education. Using linked data, teachers and students could easily search for historic events from the same year, baseball gloves owned by World Series champions, American presidents with pet pigs. Museum curators might know these things off the top of their heads from countless years spent in the collections, but the information in their heads isn’t searchable by the general public. 

The Smithsonian was founded in 1846 as an “establishment for the increase and diffusion of knowledge among men.” As the possibilities for sharing knowledge have rapidly expanded, the Smithsonian is racing to adapt to a technological world. The Smithsonian’s new Secretary, Lonnie Bunch, has declared one of his priorities to be making the Smithsonian “digital first.” Across the Smithsonian, hundreds of people are working to ensure that our digital databases reflect the full scope of the collections they represent. Linked data would help make these massive online stores useful and usable — all of the Smithsonian’s knowledge would be available to the public we serve. 

Wikidata could be an incredible resource for making data usable to the public, but linked data has a steep learning curve. Learning how to use it requires practice, a lot of patience, and a little bit of help sometimes. 

Interested in taking a course like the one Clara took? Visit learn.wikiedu.org to see current course offerings.