en.planet.wikimedia

January 28, 2017

Wikimedia India

Fifth AGM of Wikimedia India Chapter

An AGM (annual general meeting, or general body meeting) is conducted once a year, where members of the Wikimedia India chapter gather to discuss, audit and report on the chapter’s functioning. The Wikimedia India Chapter, which was approved by the Chapters Committee of the Wikimedia Foundation in 2010, is registered under the Karnataka Societies Registration Act 1960, which requires the chapter to conduct an AGM, along with an audit of its accounts, every year. A notice of an AGM is generally announced 21 days prior to the date of the meeting and sent to all members of the chapter. During the annual general meeting, as per the Karnataka Societies Registration Act 1960, the election of executive committee members, the submission of annual account reports, and the restructuring of rules and policies are generally agreed upon.

The 5th annual general meeting is scheduled to take place on 29 January 2017 at the chapter’s office located at:

Work Adda, No. 98/1, MMR Plaza, 1st Floor, Above DCB Bank, Sarjapur Main Road, Jakkasandra, Koramangala 1st Block, Bangalore

The meeting and discussion during the 5th AGM will reflect the chapter’s work during the 2014-2015 fiscal year, including approval of the report of the activities of the Wikimedia India Chapter, consideration and approval of the audited accounts, and consideration and approval of the results of the executive committee elections conducted in November 2015. The draft budget (2015-2016) and the appointment of auditors for 2015-16 will also be considered, and a discussion on extending membership validity from 1 year to 5 years will take place during the meeting. Future projects and the lessons the chapter has learned from past events will be summarized, and any other subject may be discussed with the permission of the chair.

by Jim Carter at January 28, 2017 05:33 AM

Wiki Education Foundation

Roundup: Food Browning

If the only ingredients in caramel are sugar and water, why does it have a taste and smell different from sugar? Why do bananas get darker as they ripen? How do caramelized onions get so sweet? Why do people have different opinions about the center of a piece of bread as opposed to its crust? Why would someone prefer a browned piece of meat over a similarly cooked, but visibly lighter, piece?

Answering these questions requires an explanation of the different chemical reactions involved in “food browning”. The brown colors of ripe bananas, bread crust, and caramelized onions, for example, are the results of three different processes: the first is enzymatic, and the other two have to do with the rearrangement or breakdown of amino acids or sugars at different temperatures.

Food browning is something that affects what we eat every day, but if you were to look for information about it on Wikipedia back in September, you’d have seen a short article with no citations or references. Then it was improved by a student in Heather Tienson’s Introduction to Biochemistry class at UCLA. Now it’s three times the size, complete with citations to reliable scientific sources.

Another student in the class substantially improved Wikipedia’s biography of African American nutritional chemist and former Howard University dean, Cecile Hoover Edwards. One of the more difficult, more frequently underdeveloped parts of science biographies is the section on the person’s research contributions — yet those contributions are typically the very reason we have a biography about them in the first place. For the article on Edwards, the student focused almost entirely on building out the section on her work, such as her extensive efforts to identify low-cost foods for optimal protein production.

Other stand-out work from this class included expansions of articles on the proteins VDAC1 and SCN8A, the process of protein folding, and biographies of pharmacologist Nancy Zahniser and University of Colorado professor Natalie Ahn.

There’s a lot of great chemistry content on Wikipedia, but also a whole lot of room for improvement. Some important topics have articles which are underdeveloped, outdated, missing references, or missing altogether. Students are of an age when they’ve begun to grasp the material, but they also remember what it’s like not to have the necessary vocabulary. In that way, they are well suited to writing on Wikipedia, where they not only research class topics but also communicate them to a general audience. To learn more about how to get involved, send us an email at contact@wikiedu.org or visit teach.wikiedu.org.

Photo: Barangan banana Indonesia.JPG by Midori, CC BY-SA 3.0, via Wikimedia Commons.

by Ryan McGrady at January 28, 2017 12:50 AM

January 27, 2017

Wikimedia UK

Wikimedia UK Education Summit #WMUKED17

Academics from the Women’s Classical Committee learning to edit Wikipedia – Image by Jwslubbock

Blog post by Josie Fraser, educational technologist and trustee of Wikimedia UK

The Wikimedia UK Education Summit takes place on February 20th at Middlesex University, London, in partnership with the University’s Department of Media.

It follows on from the successful 2016 Wikimedia UK Education Meetup. Wikimedians and educators working in schools, colleges, higher education and adult education met in Leicester to help inform the work of Wikimedia UK in relation to education, and connect to others using (or wanting to use) Wikimedia projects. The day showcased educators supporting learning and actively engaging learners using a range of projects, including Wikipedia, Wikisource and Wikidata.

This event will continue to build connections and share expertise in relation to Wikimedia UK’s work in formal education. Everyone is welcome – whether you are just getting started and want to find out more about how Wikimedia projects can support education, or you are an established open education champion!

If you would like to attend, please sign up on the Eventbrite page.

Why should educators attend?

The day will open with two talks. Melissa Highton (Director of Learning, Teaching and Web Services, University of Edinburgh) will talk about the benefits of appointing a Wikimedian in Residence. If your institution is looking for an effective, affordable and innovative way of actively engaging students and supporting staff development through real-world knowledge projects, this is a not-to-be-missed talk!

Stefan Lutschinger (Associate Lecturer in Digital Publishing, Middlesex University) will talk about incorporating Wikipedia editing into the university curriculum. Stefan will cover the practical experience of using Wikimedia projects with formal learning communities.

There will be a range of workshops throughout the day – ideal for those looking for an introduction to specific projects, or to brush up on their skills. Workshops include Wikidata, Wikipedia in the Classroom (and using the Education Dashboard), and how to maximise the potential of a Wikimedian in Residence in a university setting. There will also be a session looking at identifying and curating Wikimedia project resources for educators, helping to support others across the UK. Alongside all of this will be a facilitated unconference space for attendees to discuss subjects not covered by the planned programme.

Please consider signing up here for a lightning talk (of up to five minutes) to share projects and ideas, or email karla.marte@wikimedia.org.uk.

What can Wikimedia UK offer educators?

Wikimedia UK is the national charity for the global Wikimedia movement and enables people and organisations to contribute to a shared understanding of the world through the creation of open knowledge. We recognise the powerful and important role formal education can and does play in relation to this, but also the challenges sometimes faced by educators in relation to institutional adoption and use of Wikimedia projects, including Wikipedia.

This summit offers educators and Wikimedians in the UK the opportunity to work together to help learners and organisations connect and contribute to real world projects and to the global Wikimedia community.

Wikimedia UK can support educators in a wide range of ways: providing events, training, support, connecting communities to volunteers, and helping identify potential project funding.

Can’t make the summit, but want to be involved?

Become a Wikimedia UK member – membership is only £5 per year and provides a range of practical benefits, directly supporting the work of the organisation to make knowledge open and available to all, and keeping you informed about Wikimedia UK events, activities and opportunities. You can join online here.

by John Lubbock at January 27, 2017 11:35 PM

Wiki Education Foundation

Thanks for the code contributions, GCI students!

The Wiki Ed Dashboard got 20 improvements over the last two months from five young coders participating in Google Code-In, a contest that gets pre-university students involved in open source software development. Their work ranged from new features, to accessibility and performance improvements, to bug fixes, to new automated code tests, to expanded documentation on how to set up the development environment, to “code quality” work that makes the system easier for others to understand and change later. And all of these are live on dashboard.wikiedu.org now!

As a mentor participating in the Code-In alongside others in the Wikimedia tech community, I spent some time in November identifying a few coding tasks that were beginner-friendly. I wasn’t sure what to expect. The Dashboard uses a very different set of technologies from most Wikimedia projects, and in the past, just getting a development environment up and running has been a stumbling block for both newbies and veteran developers. I had recently put some effort into streamlining the setup instructions, but for the Code-In I expected to put a lot of time into simply helping people get set up. But after the first week, I realized that these students were more than capable of getting the system up and running — and that I’d need to find more — and more challenging — tasks for them. I enjoyed seeing young minds exploring Ruby — the programming language I’ve become quite fond of through my work on the Dashboard.

The student contributions didn’t go unnoticed among my Wiki Ed colleagues, either. Jami thanked me the other day for the “LIFE CHANGING” addition of some extra data about not-yet-submitted courses on the Dashboard. I had to tell her she should be thanking two of the Code-In students, who had done that work.

So thanks again to all the students who helped improve the Wiki Ed Dashboard, and thanks as well to the Wikimedia Developer Relations team for facilitating it!

The Wiki Ed Dashboard is free software, which anyone may use, study, modify and share. We develop it in the open, and we welcome anyone with the skills to help us improve it. Our code powers not only Wiki Ed’s dashboard.wikiedu.org, but also the global Wikimedia Programs & Events Dashboard that supports Wikipedia editing projects anywhere, in any language. If you’re looking for an impactful, socially relevant software project to contribute to, give me a ping at sage@wikiedu.org.

Google Code-in and the Google Code-in logo are trademarks of Google Inc.

by Sage Ross at January 27, 2017 09:41 PM

Weekly OSM

weeklyOSM 340

01/17/2017-01/23/2017

Mapping

  • Chethan Gowda from Mapbox writes about this awesome OSMIC JOSM style plugin which enables icons on all POIs when editing OpenStreetMap.

  • A user is asking in the OSM Forum if there is an alternative to Bing for Malaysia as large areas are not covered yet.

  • User Wille from Brasília made available the GPS tracks collected by the Brazilian Environmental Protection Agency. To use this data in ImproveOSM, Martijn van Exel tweaked the algorithm for recognizing missing roads; ImproveOSM now contains many unpaved roads in Brazil. ImproveOSM works with the iD editor, and a JOSM plugin is also available.

  • A tweet from Brundritt informs that Bing Maps has been updating its imagery. According to the tweet, this process should be completed in a few months.

  • On talk-GB, Andrew Hain reports that a mapper added names to landuse=residential and landuse=commercial polygons in south-west London (UK). This mapper did not respond to the changeset comments posted by Hain indicating that the names belong in the description, not on the polygons themselves.

  • Joost Schouppe asks which tagging scheme for dog toilets he should publish as a proposal for voting.

  • On the Tagging mailing list, there is a discussion about OSM tags for public transport card data, as such cards are gradually replacing transport tickets, according to user Warin.

  • On the Tagging mailing list, Martijn van Exel is asking about destination:street tags, which the Telenav mapping team noticed on (mostly) motorway_link off-ramps in Canada. It’s an undocumented sub-tag of the destination tag. Van Exel is asking how it is being used and whether there is a documented consensus anywhere other than the OSM wiki.

  • Joost Schouppe raises the discussion about shop=fuel which was already mentioned here. The issue concerns shops that sell fuel but are not fuel stations. Joost proposes identifying such products.

  • Mapbox updated the basemap imagery in Washington, DC with 2015 aerial imagery at 3 inch (7.5 cm) resolution. Great for mapping (make mapping great again) and counting people as well when you need alternative facts.

Community

  • Escada interviews Steve All from California (USA) for his Mapper of the Month series. The interview is published on the new website of the Belgian OSM community.

  • According to Pascal Neis, while the number of OSM mappers is increasing around the world, it is decreasing in Germany.

  • Are 25,000 mappers enough for Germany? That is enough for the urban areas, but in the rural ones there is still much to do, a weeklyOSM editor says.

Imports

  • Michael Spreng asks on the imports mailing list about the import of addresses in the Canton of Bern (GEBADR list). There has been no feedback concerning it. The import was refined and the bulk import is starting, which is expected to take some time. Building layouts are going to be improved where possible.

Humanitarian OSM

  • Pascal Neis comments on a tweet by Russell Deffner about the validation process of Missing Maps. This process apparently produces a lot of OSM changes.

  • The Logistics Cluster updates its access constraints map of South Sudan every Friday. This should be of special interest to those coordinating humanitarian deliveries in the area.

Maps

  • Paul Norman suggests a simple extension to CartoCSS which would decrease the size of our main page’s style by about one third.

  • Molly Lloyd from Mapbox teamed up with some organizers from the Women’s March to create a map where you can find all the Women’s March events in different cities and countries.

  • Take back the tech! uses technology to end violence against women and encourages activism against gender-based violence. Using an OSM-based map, people can report cases of violence against women from all over the world.

  • Some of the issues faced by people creating symbols for map styles aren’t always appreciated.

  • Stephan Bösch-Plepelits showcases on the Dev mailing list the PTMap, a public transport map rendering based on the Overpass API, which renders route relations according to the ‘new’ PTv2 scheme.

  • Joost Schouppe shows in his diary how road mapping (tag highway=*) evolved in Brussels.

switch2OSM

Open Data

  • The international Open Data Hackathon will take place on March 4, 2017. The map was broken at our editorial deadline.

  • Weather data produced by the German Weather Service (DWD) may become freely available in the future, according to an article (de) in Spiegel Online. (automatic translation)

Licences

  • Due to copyright issues with Mapbox, the project OSM2VectorTiles was discontinued. The authors have created a successor, OpenMapTiles, with their own vector tile scheme and free from legal problems.

Software

Programming

  • For the Google Summer of Code 2017, project proposals are being collected in the Wiki.

  • Geofabrik reports in a blog post how they recently improved referential integrity in their extracts. With a marginal impact on file size while cutting down processing time, this was made possible by switching to the latest osmium version.

Releases

OSM in the media

  • Mapanica, the OSM community in Nicaragua, proposes a new project to help improve the frequency data of public transport, in order to create a system that allows people to better plan their trips in the city of Managua.

  • The German TV show GRIP_RTL2 used Stamen’s great-looking #watercolor OpenStreetMap rendering of Romania in yesterday’s episode (via pascal_n). No attribution was mentioned.

Other “geo” things

  • This is how buildings look when OSM enthusiasts are rebuilding their house.

  • Geospatial World wrote about DigitalGlobe’s AComp: "When a satellite takes an image, the light reflecting from the ground is impacted by the atmosphere and can affect the visual aesthetics of the image. That’s where DigitalGlobe’s Atmospheric Compensation (AComp) steps in."

  • Carlos Felipe Castillo informed: "The new private beta from Blueshift has arrived!" A fun and easy tool to create dynamic maps.

  • Yorokobu from Spain featured the nice maps from Axis Maps.

  • The European Space Agency’s Galileo satellites have been stricken by mysterious clock failures.

  • Eric Gundersen shows a satellite image of Barack Obama’s presidential inauguration in 2009 by GeoEye, now DigitalGlobe.

  • Open Stats from India notes that Uber’s OpenData Platform is not really Open Data. They call it #openwashing.

Upcoming Events

This weeklyOSM was produced by Hakuch, Peda, Polyglot, Rogehm, SeleneYang, Spec80, YoViajo, derFred, jinalfoflia, keithonearth, vsandre, wambacher, widedangel.

by weeklyteam at January 27, 2017 11:58 AM

Shyamal

Moving Plants

All humans move plants, most often by accident and sometimes with intent. Humans, unfortunately, are only rarely moved by plants. 

The history of plant movements has often been difficult to establish. In the past, the only way to tell a plant's homeland was to look at the number of related species in a region for clues on its origin. This idea was firmly established by Nikolai Vavilov before he was sent off to his unfortunate death in a Soviet prison. Today, the genetic relatedness of plants can be examined by comparing chosen DNA sequences among individuals of a species, particularly at those sequence locations that are most variable. Some recent studies on individual plants and their relatedness have provided some very interesting glimpses into human history. A 2015 study establishing the East African geographical origins of baobabs in India, and a 2011 study on coconuts, are hopefully just the beginnings. These demonstrate ancient human movements which have never received much attention in the story-telling of history.

Unfortunately there are a lot of older crank ideas that can be difficult for untrained readers to separate from genuine scholarship. I recently stumbled on a book by Grafton Elliot Smith, a Fullerian professor who succeeded J. B. S. Haldane but descended into crankdom. The book "Elephants and Ethnologists" (1924) can be found online, and it is just one among several similar works by Smith. It appears that Smith used a skewed and misapplied cousin of Dollo's Law. According to him, cultural innovations tended to occur only once and were then carried along with human migrations. Smith was subsequently labelled a "hyperdiffusionist", a disparaging term used by ethnologists. When he saw illustrations of Mayan sculpture he envisioned an elephant where others saw at best a stylized tapir. Not only were they elephants, they were Asian elephants, complete with mahouts and Indian-style goads, and he saw this as definite evidence for an ancient connection between India and the Americas! An idea that would please some modern-day cranks and zealots.

Smith's idea of the elephant as emphasised by him.
The actual Stela in question
 "Fanciful" is the current consensus view on most of Smith's ideas, but let's get back to plants. 

I happened to visit Chikmagalur recently and revisited the beautiful temples of Belur on the way. The "Archaeological Survey of India-approved" guide at the temple did not flinch when he described an object in one of the hands of a carving as being maize. He said maize was a symbol of prosperity. Now, maize is a crop that was imported to India, by most accounts only after the Portuguese reached India by sea in 1498. In the late 1990s, a Swedish researcher identified similar carvings (actually another one, at Somnathpur) from 12th-century temples in Karnataka as being maize cobs. This was subsequently debunked by several Indian researchers from IARI and from the University of Agricultural Sciences, where I was then studying. An alternate view is that the object is a mukthaphala, an imaginary fruit made up of pearls.
Somnathpur carvings. The figures to the left and right hold the purported cobs. (Photo: G41rn8)

The pre-Columbian oceanic trade ideas, however, do not end with these two cases from India. The third story (and historically the first, from 1879) is that of the sitaphal, or custard apple. The founder of the Archaeological Survey of India, Alexander Cunningham, described a fruit in one of the carvings from Bharhut, a fruit that he identified as a custard apple. The custard apple and its relatives are all from the New World. The Bharhut stupa is dated to 200 BC, and the custard apple, as was quickly pointed out by others, could only have been in India post-1492. The Hobson-Jobson has a long entry on the custard apple that covers the situation well. In 2009, a study raised the possibility of custard apples in ancient India. The ancient carbonized evidence is hard to evaluate unless one has examined all the possible plant seeds and what remains of their microstructure. The researchers, however, establish a date of about 2000 BC for the carbonized remains and attempt to demonstrate that they look like the seeds of sitaphal. The jury is still out.
I was quite surprised that there are not many writings on the Internet that synthesize and comment on the history of these ideas, and, somewhat oddly, I found no mention of these three cases in the relevant Wikipedia article on pre-Columbian trans-oceanic contact theories (naturally, fixed now with an entire new section).

There seems to be value for someone to put together a collation of plant introductions to India along with sources, dates and locations of introduction. Some of the old specimens of introduced plants may well be worthy of further study.

Introduction dates
  • Pithecellobium dulce - Portuguese introduction from Mexico to the Philippines, and to India on the way, in the 15th or 16th century. The species was described by William Roxburgh from specimens taken from the Coromandel region (i.e., the type locality is outside the native range).
  • Eucalyptus globulus? - There are some claims that Tipu planted the first of these (see my post on this topic). It appears that the first person to move eucalyptus plants (probably E. globulosum) out of Australia was Jacques Labillardière. Labillardière was surprised by the size of the trees in Tasmania: the lowest branches were 60 m above the ground and the trunks were 9 m in diameter (27 m circumference). He saw flowers through a telescope and had some flowering branches shot down with guns! (original source in French) His ship was seized by the British in Java around 1795, and he was released in 1796. All subsequent movements seem to have been post-1800 (i.e., after Tipu's death). If Tipu Sultan did indeed plant the Eucalyptus here, he must have got it via the French through the Labillardière shipment. The Nilgiris were apparently planted up starting with the work of Captain Frederick Cotton (Madras Engineers) at Gayton Park(?)/Woodcote Estate in 1843.
  • Muntingia calabura - when? - I suspect that flowerpecker populations boomed after this.
  • Delonix regia - when?
  • In 1857, Mr New from Kew was made Superintendent of Lalbagh, and in the following years he introduced several Australian plants from Kew, including Araucaria, Eucalyptus, Grevillea, Dalbergia and Casuarina. Mulberry plant varieties were introduced in 1862 by Signor de Vicchy. The Hebbal Butts plantation was established around 1886 by Cameron along with Mr Rickets, Conservator of Forests, who became Superintendent of Lalbagh after New's death - rain trees, ceara rubber (Manihot glaziovii), and shingle trees(?). Apparently Rickets was also involved in introducing a variety of potato (kidney variety) which came to be named "Ricket". - from Krumbiegel's introduction to "Report on the progress of Agriculture in Mysore" (1939)

Further reading
  • Johannessen, Carl L.; Parker, Anne Z. (1989). "Maize ears sculptured in 12th and 13th century A.D. India as indicators of pre-Columbian diffusion". Economic Botany 43 (2): 164–180.
  • Payak, M. M.; Sachan, J. K. S. (1993). "Maize ears not sculpted in 13th century Somnathpur temple in India". Economic Botany 47 (2): 202–205.
  • Pokharia, Anil Kumar; Sekar, B.; Pal, Jagannath; Srivastava, Alka (2009). "Possible evidence of pre-Columbian transoceanic voyages based on conventional LSC and AMS 14C dating of associated charcoal and a carbonized seed of custard apple (Annona squamosa L.)". Radiocarbon 51 (3): 923–930. - Also see
  • Veena, T.; Sigamani, N. (1991). "Do objects in friezes of Somnathpur temple (1286 AD) in South India represent maize ears?". Current Science 61 (6): 395–397.

by Shyamal L. (noreply@blogger.com) at January 27, 2017 02:47 AM

January 26, 2017

Wikimedia Foundation

A history of the finger, as seen through Wikipedia


The first visually documented use of the finger—look at the man standing in the back row, far left side. Photo by unknown, public domain/CC0.

Last Tuesday, English Wikipedia editor Muboshgu left work and jumped on the highway to get home. Along the way, a driver in the lane next to him decided to merge without checking to see if any other cars—like, say, Muboshgu’s—were occupying the space.

Seeing this, Muboshgu took action. “First he got the horn, then he got the bird,” he told me.

That “bird,” a colloquial term for giving someone the finger, has become an indelible symbol of contempt in Western culture.

Naturally, 24 different-language Wikipedias have an article on it.

Muboshgu played a significant role in expanding the finger’s English Wikipedia article to where it is today. He nominated it for “good article” status, a marker of quality awarded only after a peer review from a fellow editor, and it appeared on Wikipedia’s front page in the “Did you know?” section.

He came across the finger’s article in July 2012 when rewriting the biography on Phil Nevin, a former pro baseball player who gave the finger to a heckling fan in 2002. “In linking to the article, I saw how short it was,” he said. “I knew that such a widely used gesture with the implications and reactions associated with it deserved a longer article.”

Indeed, Muboshgu’s research showed that the finger’s origins go back to Ancient Greece and Rome, where in the latter it was known as digitus impudicus—the “shameless, indecent or offensive finger.” The first modern visual evidence of the finger comes from the United States, where in 1886 a baseball pitcher named Charles “Old Hoss” Radbourn was photographed giving it to his team’s rivals. There is no information available on whether he faced any repercussions, but he did make it into the sport’s Hall of Fame. Ironically, just over a century later, a baseball executive resigned after giving the finger to a fan on Fan Appreciation Night.

Although the use of the finger has been on the rise in recent decades, its use is still controversial. “People have varying opinions about items of cultural phenomena such as this,” Muboshgu says, as “it can produce visceral reactions in people, which can cloud judgment.”

This extends to Wikipedia as well; while writing an article on the finger, Muboshgu anticipated potential backlash from other editors’ dislike of the obscene gesture. While the English Wikipedia has a strong policy against censorship, its exact interpretation can be and has often been debated when controversial material comes up. In this case, however, Muboshgu was faced only with constructive criticism aimed at the article’s content, not its subject.

And what does Muboshgu think of the finger? “I love the First Amendment to the United States Constitution. This gesture is just one of many things that I can do that in another country might result in my being thrown in jail. I don’t take that right for granted.”

Ed Erhart, Editorial Associate
Wikimedia Foundation

by Ed Erhart at January 26, 2017 11:04 PM

Wikimedia Foundation receives $500,000 from the Craig Newmark Foundation and craigslist Charitable Fund to support a healthy and inclusive Wikimedia community

Photo by Daniel McKay, CC BY-SA 2.0.

Photo by Daniel McKay, CC BY-SA 2.0.

Today, the Wikimedia Foundation announced the launch of a community health initiative to address harassment and toxic behavior on Wikipedia, with initial funding of US$500,000 from the Craig Newmark Foundation and craigslist Charitable Fund. The two seed grants, each US$250,000, will support the development of tools for volunteer editors and staff to reduce harassment on Wikipedia and block harassers.

Approximately 40% of internet users, and as many as 70% of younger users, have personally experienced harassment online, with regional studies showing rates as high as 76% for young women. While harassment differs across the internet, on Wikipedia and other Wikimedia projects it has been shown to reduce participation on the sites. More than 50% of people who reported experiencing harassment also reported decreasing their participation in the Wikimedia community.

Volunteer editors on Wikipedia are often the first line of response for finding and addressing harassment on Wikipedia. “Trolling,” “doxxing,” and other menacing behaviors are burdens to Wikipedia’s contributors, impeding their ability to do the writing and editing that makes Wikipedia so comprehensive and useful. This program seeks to respond to requests from editors over the years for better tools and support for responding to harassment and toxic behavior.

“To ensure Wikipedia’s vitality, people of good will need to work together to prevent trolling, harassment and cyber-bullying from interfering with the common good,” said Craig Newmark, founder of craigslist. “To that end, I’m supporting the work of the Wikimedia Foundation towards the prevention of harassment.”

The initiative is part of a commitment to community health at the Wikimedia Foundation, the non-profit organization that supports Wikipedia and the other Wikimedia projects, in collaboration with the global community of volunteer editors. In 2015, the Foundation published its first Harassment Survey about the nature of the issue in order to identify key areas of concern. In November 2016, the Wikimedia Foundation Board of Trustees issued a statement of support calling for a more “proactive” approach to addressing harassment as a barrier to healthy, inclusive communities on Wikipedia.

“If we want everyone to share in the sum of all knowledge, we need to make sure everyone feels welcome,” said Katherine Maher, Executive Director of the Wikimedia Foundation. “This grant supports a healthy culture for the volunteer editors of Wikipedia, so that more people can take part in sharing knowledge with the world.”

The generous funding from the Craig Newmark Foundation and craigslist Charitable Fund will support the initial phase of a program to strengthen existing tools and develop additional tools to more quickly identify potentially harassing behavior, and help volunteer administrators evaluate harassment reports and respond effectively. These improvements will be made in close collaboration with the Wikimedia community to evaluate, test, and give feedback on the tools as they are developed.

This initiative addresses the major forms of harassment reported on the Wikimedia Foundation’s 2015 Harassment Survey, which covers a wide range of different behaviors: content vandalism, stalking, name-calling, trolling, doxxing, discrimination—anything that targets individuals for unfair and harmful attention.

From research and community feedback, four areas have been identified where new tools could be beneficial in addressing and responding to harassment:

  • Detection and prevention – making it easier and faster for editors to identify and flag harassing behavior
  • Reporting – providing victims and respondents of harassment improved ways to report instances that offer a clearer, more streamlined approach
  • Evaluating – supporting tools that help volunteers better evaluate harassing behavior and inform the best way to respond
  • Blocking – making it more difficult for someone who is blocked from the site to return

For more information, please visit the community health initiative’s main page.

A related press release is available on the Wikimedia Foundation’s website.

by Wikimedia Foundation at January 26, 2017 08:30 PM

Semantic MediaWiki

Help:Embedded format

Embedded format
Embed selected articles.
Available languages: de, en, zh-hans
Further Information
Provided by: Semantic MediaWiki
Added: 0.7
Removed: still supported
Requirements: none
Format name: embedded
Enabled by default: yes
Authors: Markus Krötzsch
Categories: misc

The result format embedded is used to embed the contents of the pages in a query result into a page. The embedding uses MediaWiki transclusion (like when inserting a template), so the tags <includeonly> and <noinclude> work for controlling what is displayed.

Parameters

General

Parameter Type Default Description
source text empty Alternative query source
limit whole number 50 The maximum number of results to return
offset whole number 0 The offset of the first result
link text all Show values as links
sort list of texts empty Property to sort the query by
order list of texts empty Order of the query sort
headers text show Display the headers/property names
mainlabel text no The label to give to the main page name
intro text empty The text to display before the query results, if there are any
outro text empty The text to display after the query results, if there are any
searchlabel text ... further results Text for continuing the search
default text empty The text to display if there are no query results

Format specific

Parameter Type Default Description
embedformat text h1 The HTML tag used to define headings
embedonly yes/no no Display no headings

The embedded format introduces the following additional parameters:

  • embedformat: this defines which kind of headings to use when pages are embedded; it may be a heading level, i.e. one of h1, h2, h3, h4, h5, h6, or a list format, i.e. ul or ol
  • embedonly: if this parameter has any value (e.g. yes), then no headings are used for the embedded pages at all.
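For instance, a minimal sketch (assuming a hypothetical category "News" exists on the wiki) that embeds matching pages as a bulleted list rather than under headings:

{{#ask:
 [[Category:News]]
 |format=embedded
 |embedformat=ul
 |limit=5
}}

Each embedded page then appears as one list item; switching embedformat to ol would number the items instead.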

Example

The following creates a list of recent news posted on this site (like in a blog):

{{#ask:
 [[News date::+]]
 [[language code::en]]
 |sort=news date
 |order=descending
 |format=embedded
 |embedformat=h3
 |embedonly=yes
 |searchlabel= <br />[view older news]
 |limit=3
}}

This produces the following output:


English

Semantic MediaWiki 2.4.5 (SMW 2.4.5) has been released today as a new version of Semantic MediaWiki.

This new version is a minor release and provides bugfixes for the current 2.4 branch of Semantic MediaWiki. Please refer to the help page on installing Semantic MediaWiki to get detailed instructions on how to install or upgrade.

English

Semantic MediaWiki 2.4.4 (SMW 2.4.4) has been released today as a new version of Semantic MediaWiki.

This new version is a minor release and provides bugfixes for MySQL 5.7 issues of the current 2.4 branch of Semantic MediaWiki. Please refer to the help page on installing Semantic MediaWiki to get detailed instructions on how to install or upgrade.

English

Semantic MediaWiki 2.4.3 (SMW 2.4.3) has been released today as a new version of Semantic MediaWiki.

This new version is a minor release and provides bugfixes for the current 2.4 branch of Semantic MediaWiki. Please refer to the help page on installing Semantic MediaWiki to get detailed instructions on how to install or upgrade.
[view older news]

Note: The newline (<br />) is used to put the further results link on a separate line.

Remarks

Note that by default this result format also adds all annotations from the pages being embedded to the page they are embedded into. Starting with Semantic MediaWiki 2.4.0, this can be prevented for annotations made with the #set and #subobject parser functions by setting the embedonly parameter to "yes". In-text annotations will continue to be embedded, so they need to be migrated to the #set parser function to prevent this from happening.

Also note that embedding pages may accidentally include category statements if the embedded articles have any categories. Use <noinclude> to prevent this, e.g. by writing

<noinclude>[[Category:News feed]]</noinclude>

Semantic MediaWiki will take care that embedded articles do not import their semantic annotations, so these need not be treated specifically.

Last but not least, note that printout statements have no effect on queries using the embedded format.
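For instance, in the following sketch (again assuming a hypothetical "News" category and a "News date" property), the printout request |?News date would simply be ignored and only the page contents would be embedded:

{{#ask:
 [[Category:News]]
 |?News date
 |format=embedded
}}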

Limitations

You cannot use the embed format to embed a query from another page if that query relies on the magic word {{PAGENAME}}.



This documentation page applies to all SMW versions from 0.7 to the most current version.
      Other languages: de, fr, zh-hans



by Kghbln at January 26, 2017 07:11 PM

Wiki Education Foundation

Teaching Digital History with Wikipedia

As more archives become digitized, historians are turning to new technologies to delve into the past. Drawing from a variety of disciplines, ranging from computational science to digital mapping, the burgeoning field of Digital History is enabling historians to comb through vast amounts of historical data and visualize the past in new ways. From archiving historical restaurant menus to mapping emancipation, historians are embracing new technologies to reimagine the past, and they’re increasingly making this knowledge available to the general public.

It’s not lost on a growing number of historians that digital history projects can provide their students with exciting new ways to understand and engage with the past. At institutions around the country, historians are using social media, online archives, and a host of digital technologies to help their students think critically about history. For some, however, implementing digital history projects is an appealing, but daunting prospect.

Wikipedia-based assignments are an ideal foray into the world of digital history projects. That’s why Educational Partnerships Manager, Jami Mathewson, and I visited the American Historical Association’s 2017 annual meeting in Denver at the beginning of January. When it comes to choosing a digital history assignment, it’s easy to be overwhelmed by the variety of new tools and technologies, and as Jami and I heard time and time again, many instructors are simply unsure where to begin.

That’s where the Wiki Education Foundation can step in. When you incorporate a Wikipedia-based assignment into your course, you don’t need to start from scratch. Wiki Ed has recommendations for all stages of your Wikipedia assignment and has developed the technology and resources to make it all come together. Our brochures, handouts, and interactive training modules ensure that both instructors and students have the basic knowledge to begin contributing to Wikipedia. Our Dashboard tool keeps track of your assignment from assignment design to peer review to the moment when students make live edits to Wikipedia. And of course, we have an entire staff devoted to supporting Wikipedia assignments.

There are numerous digital history projects from which history instructors can choose, but Wikipedia is particularly well-suited for conveying some of the fundamental principles of the practice of history.

  • Distinguishing sources: Primary sources are the bread and butter of the historian’s trade, but because Wikipedia has a policy of “no original research” that includes users’ analysis, evaluation, interpretation, or synthesis of primary sources, they can typically only be used for straightforward statements of fact. Far from incompatible with the goals of a history course, however, students who contribute to Wikipedia in the field of history learn how to distinguish between primary and secondary sources of information. They become adept at examining a document’s authority and accuracy and learn how primary sources of information form the foundation of secondary literature.
  • History vs. historiography: The difference between these two concepts can be difficult for history students to grasp. The former is the narrative that historians piece together to describe the past, and the latter is the study of historical methodology. When students tackle history articles on Wikipedia, they have to engage with both concepts. They have to convey the facts as captured in the most current secondary literature, but they also have to consider how different schools of historical thought approach a given historical period. In doing so, students learn that facts and methodology are inseparable in the field of history.
  • Bias and perspective: Historians are storytellers above all. They organize a set of facts into a coherent narrative, and they ultimately decide which parts of the story to include and which to discard. Who writes a piece of history matters, and students must face this reality head on when contributing to Wikipedia. Though they should strive to be as objective as possible, they must ultimately decide which facts and sources of information will produce the most robust and well-balanced Wikipedia entry. When others come along and improve upon their work, they can see first-hand that history is an unfolding narrative rather than a static set of facts.
  • Underrepresentation and misrepresentation: In recent decades, the field of history has come to encompass people and voices previously left out of the historical canon. Women, minorities, and other historically disenfranchised groups are now the focus of many historical inquiries, but many of their entries on Wikipedia are either underdeveloped or missing altogether. History students have a unique opportunity to fill in these important content gaps and help Wikipedia reflect the true breadth and depth of history.

As the American Historical Association has well documented, only a small percentage of history students go on to become historians. While Wikipedia assignments are particularly adept at teaching students the tools of the historical trade, they also help students develop critical media literacy and technical skills that they can apply to their future academic and professional lives. Students who contribute to Wikipedia learn to navigate an increasingly complicated media landscape. They develop the skills necessary to make critical judgments about sources of information, such as whether a news headline is real or fake, and come to better understand that consuming and producing information are two sides of the same coin.

The goal of digital history is to help historians — and, in turn, the public — understand the past in new ways by drawing on new forms of digital analysis. Similarly, Wikipedia assignments can help students grasp the study of history by learning to use one of the most widely read digital resources today, while at the same time contributing to a historical record millions of people consult every day.

There’s no time like the present to begin documenting the past. If you’re interested in incorporating a Wikipedia assignment into your course, please email us at contact@wikiedu.org or visit teach.wikiedu.org.

by Helaine Blumenthal at January 26, 2017 06:52 PM

Wikimedia Foundation

Wikimedia Research Newsletter, December 2016

Getting more female editors may not increase the ratio of articles about women

Reviewed by Reem Al-Kashif

A bachelor’s degree thesis by Feli Nicolaes[1] finds that, contrary to the general perception, male and female editors do not tend to edit biographical articles on people of their own gender.

Previous research suggested that one solution to the lack of Wikipedia’s biographies of women could be to increase the number of female editors. This was based on the assumption that women would prefer to edit women’s biographies, and men would prefer to edit men’s biographies. Nicolaes refers to this as homophily in her thesis, “Gender bias on Wikipedia: an analysis of the affiliation network”. However, homophily has so far neither been formally investigated nor proved to exist in Wikipedia. Nicolaes analyzes this using datasets from her research group at the University of Amsterdam, of English Wikipedia editors and the pages they edit. She tracks the editing behavior of both self-identified male and female editors on Wikipedia. Contrary to the mainstream assumption, homophily was not found. In other words, female users’ edits are not focused on female biography pages. In fact, Nicolaes finds “inverted homophily” when considering female users who edit a single biographical article more than 200 times: they are more likely to direct this amount of attention to biography articles about men than male editors are.

This brings to mind an initiative to increase content about women—be it biography articles or other content related to women—that has been live since December 2015 in the Arabic Wikipedia. The initiative takes the form of a contest in which male and female editors try to achieve as much as they can from their self-set goals. Over the four rounds of the contest, only one woman reached the top three in two rounds. So, if the goal is to add more content about women, simply recruiting more women may not be sufficient. However, Nicolaes also argues that the study should be replicated on larger datasets to validate the results. It remains to be seen whether the same editor behaviour exists in other language editions. Another limitation of the study is its apparent reliance on the gender information that editors publicly state in their user preferences—a method that is widely used but may be susceptible to biases (discussed in more detail in this review).

Theorizing the foundations of the gender gap

Reviewed by Aaron Shaw

In a forthcoming paper, “‘Anyone can edit’ not everyone does: Wikipedia and the gender gap”[2], Heather Ford and Judy Wajcman use some of the theoretical tools of feminist science and technology studies (STS) to describe underpinnings of the Wikipedia gender gap. The authors argue that three aspects of Wikipedia’s infrastructure define it as a particularly masculine or male-dominated project:

(1) the epistemological foundations of what constitutes valid encyclopedic knowledge,
(2) Wikipedia’s software infrastructure, and
(3) Wikipedia’s policy infrastructure.

The authors argue that each of these arenas represents a space where male activity and masculine norms of truth, scientific fact, legitimacy, and freedom define boundaries of legitimate contribution and action. Accordingly, these boundaries of legitimate contribution and action systematically exclude or devalue perspectives and contributions that could overcome the lack of female participation or perspectives in the Wikipedia projects. The result, according to Ford and Wajcman, is that Wikipedia has created a novel and powerful form of knowledge-production expertise on a foundation that reproduces existing gender hierarchies and inequalities.

How old and new astronomy papers are being cited

Reviewed by Piotr Konieczny

The author analyzes[3] Wikipedia’s citations to academic peer-reviewed articles, finding that “older papers from before 2008 are increasingly less likely to be cited”. The author attempts to use Wikipedia citations as a proxy for public interest in astronomy, though the analysis makes no comparison to other research about public interest in the sciences. The article notes that citations peak for papers published in 2008, with fewer and fewer citations for each year since. The analysis is also limited by its cut-off date (1996), “because Scopus indexing of journals changes in this year”. The author concludes that the observed citation pattern is likely “consistent with a moderate tendency towards obsolescence in public interest in research”: older papers are cited for timeless, uncontroversial facts, and newer ones for recent findings. He also notes that the late 2000s, i.e. the years around 2008, may represent when most of Wikipedia’s astronomy content was created, though this is not backed up by much besides speculation. Overall, it addresses an interesting question but does not offer any surprising insights.

Wikipedia is not a suitable source for election predictions

Reviewed by Piotr Konieczny

The topic of this conference paper, “Election prediction based on Wikipedia pageviews”,[4] is certainly timely. The authors look at which of Wikipedia’s articles related to the US presidential election registered high popularity, and then ask whether elections can be predicted based “on the number of views the spiking pages have and on the correlation between these pages and the presidential nominees or their political program”. They provide an online visualization showing some “Wikipedia topics that have spiked before, during or after [an] election event.”

The authors limit themselves (reasonably) to the English and Spanish Wikipedias. They do a good job of presenting their methods, and outlining problems with gathering data on popularity of articles—something that would be much easier if Wikipedia articles and databases were more friendly when it comes to information about their popularity. Within the limitations described in the paper, the authors conclude that Wikipedia articles about politicians are used mostly after, not before or during debates or other events such as primaries or elections, which suggests that they are not used for fact checking but instead as an information source after the event. “Wikipedia is not, in fact, a reliable polling source”, write the authors, based on (this could be clarified further) the fact that people check Wikipedia after the events, not before them, hence making Wikipedia’s pageviews problematic for prediction.

“Black Lives Matter in Wikipedia: Collaboration and collective memory around online social movements”

Reviewed by Piotr Konieczny
Protesters lying down over rail tracks with a "Black Lives Matter" banner.

Black Lives Matter die-in protesting alleged police brutality in 2015

In this paper,[5] the researchers look at the relation between the Black Lives Matter (BLM) social movement and its coverage in Wikipedia, asking the following research questions:

  1. “How has Wikipedia editing activity and the coverage of BLM movement events changed over time?”
  2. “How have Wikipedians collaborated across articles about events and the BLM movement?” and
  3. “How are events on Wikipedia re-appraised following new events?”

They aim to contribute to academic discourse on social movements and claim to describe “knowledge production and collective memory in a social computing system as the movement and related events are happening.” They conclude that Wikipedia is a neutral platform, but does indirectly support (or hinder) the movement (or its opponents) by virtue of increased visibility, much as media coverage does. The movement’s history and documentation on Wikipedia are judged to be of higher value, accessibility, and quality than snapshots on social media platforms like Twitter. Wikipedia also provides space for interested editors to work on articles indirectly related to BLM, further increasing the visibility of related topics, as interested editors move beyond direct BLM articles to other aspects. Examples include historical articles about events preceding BLM that would probably not be written or expanded on Wikipedia if not for the rise of the BLM movement. The authors conclude that social movement activists can use Wikipedia to document their activities without compromising Wikipedia’s neutrality or other policies: “Without breaking with community norms like NPOV, Wikipedia became a site of collective memory documenting mourning practices as well as tracing how memories were encoded and re-interpreted.” This is a valuable argument that draws interesting connections between Wikipedia and social movements, particularly considering that some (like this reviewer) consider Wikipedia itself to be a social movement.

Briefly

Conferences and events

The third annual Wiki Workshop will take place on April 4 as part of the WWW2017 conference in Perth, Australia. The workshop serves as a platform for Wikimedia researchers to get together on an annual basis and share their research with each other (see also our overview of the papers from the 2016 edition). All Wikimedia researchers are encouraged to submit papers for the workshop and attend it. More details at the call for papers.

See the research events page on Meta-wiki for other upcoming conferences and events, including submission deadlines.

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.

  • “Facilitating the use of Wikidata in Wikimedia projects with a user-centered design approach”[6] From the abstract: “In its current form, [data from Wikidata] is not used to its full potential [on other Wikimedia projects] for a multitude of reasons, as user acceptance is low and the process of data integration is unintuitive and complicated for users. This thesis aims to develop a concept using user-centered design to facilitate the editing of Wikidata data from Wikipedia. With the involvement of the Wikimedia community, a system is designed which integrates with pre-existing work flows.”
  • “A corpus of Wikipedia discussions: over the years, with topic, power and gender labels”[7] From the abstract: “… we present a large corpus of Wikipedia Talk page discussions that are collected from a broad range of topics, containing discussions that happened over a period of 15 years. The dataset contains 166,322 discussion threads, across 1236 articles/topics that span 15 different topic categories or domains. The dataset also captures whether the post is made by an registered user or not, and whether he/she was an administrator at the time of making the post. It also captures the Wikipedia age of editors in terms of number of months spent as an editor, as well as their gender.”
  • “Wikipedia and the politics of openness” Two reviews of the 2014 book with this title[supp 1], in the journal Information, Communication & Society[8] and in Contemporary Sociology: A Journal of Reviews[9], with the latter summarizing the book as follows: “Tkacz’s text has three main empirical chapters. The first sorts out the ‘politics of openness,’ by which he means how collaboration emerges and forms in an open-ended context. The second empirical contribution is about the possibility that the framing of social interaction might, by itself, be enough to create order and encourage productivity in an environment like Wikipedia. … The third empirical contribution is that project exit has an extremely important role in maintaining the stability of Wikipedia. As people develop projects, they create parallel, break-off versions of a project [forks].”
  • “Derivation of ‘is a’ taxonomy from Wikipedia category graph”[10]
  • “‘En Wikipedia no se escribe jugando’: Identidad y motivación en 10 wikipedistas regiomontanos.”[11] From the English abstract: “This study qualitatively analyses the contributions in the talk pages of the Spanish Wikipedia by the ten most-active registered users in Monterrey, Mexico. Using virtual ethnography … this research finds that these self-styled ‘wikipedistas’ assume the site’s collective identity when interacting with anonymous users, and that their main motivations for ongoing participation are not related to the repository of knowledge in itself, but to their group dynamics and inter-personal relationships within the community.”
  • “Schreiben in der Wikipedia” (“Writing in Wikipedia”)[12] From the book (translated): “From the perspective of Wikipedia research, it can be observed that Wikipedia must not be regarded as a community medium [‘gemeinschaftliches Medium’] per se, but that it reflects a conglomerate of individual and community writing processes, which in turn both influence the text genesis, with differing scopes. This chronological development is laid open here for the first time in the case of some exemplary article texts, and subsequently, specific properties of each article topic are related to the creation of the article that is based on it.”
  • “Beyond the Book: linking books to Wikipedia”[13] From the abstract: “The book translation market is a topic of interest in literary studies, but the reasons why a book is selected for translation are not well understood. The Beyond the Book project investigates whether web resources like Wikipedia can be used to establish the level of cultural bias. This work describes the eScience tools used to estimate the cultural appeal of a book: semantic linking is used to identify key words in the text of the book, and afterwards the revision information from corresponding Wikipedia articles is examined to identify countries that generated a more than average amount of contributions to those articles. … We assume a lack of contributions from a country may indicate a gap in the knowledge of readers from that country. We assume that a book dealing with that concept could be more exotic and therefore more appealing for certain readers … An indication of the ‘level of exoticness’ thus could help a reader/publisher to decide to read/translate the book or not. Experimental results are presented for four selected books from a set of 564 books written in Dutch or translated into Dutch, assessing their potential appeal for a Canadian audience.”
  • “A multilingual approach to discover cross-language links in Wikipedia”[14] From the abstract: “… given a Wikipedia article (the source) EurekaCL uses the multilingual and semantic features of BabelNet 2.0 in order to efficiently identify a set of candidate articles in a target language that are likely to cover the same topic as the source. The Wikipedia graph structure is then exploited both to prune and to rank the candidates. Our evaluation carried out on 42,000 pairs of articles in eight language versions of Wikipedia shows that our candidate selection and pruning procedures allow an effective selection of candidates which significantly helps the determination of the correct article in the target language version.”
  • “Analyzing organizational routines in online knowledge collaborations: a case for sequence analysis in CSCW”[15] From the abstract: “Research into socio-technical systems like Wikipedia has overlooked important structural patterns in the coordination of distributed work. This paper argues for a conceptual reorientation towards sequences as a fundamental unit of analysis for understanding work routines in online knowledge collaboration. Using a data set of 37,515 revisions from 16,616 unique editors to 96 Wikipedia articles as a case study, we analyze the prevalence and significance of different sequences of editing patterns.” See also slides and a separate review by Aaron Halfaker (“This is a weird paper. It isn’t actually a study. It’s more like a methods position paper.”)
  • “Wikipedia: medium and model of collaborative public diplomacy”[16] From the abstract: “Taking a case-study approach, the article posits that Wikipedia holds a dual relevance for public diplomacy 2.0: first as a medium; and second, as a model for public diplomacy’s evolving process. Exploring Wikipedia’s folksonomy, crowd-sourced through intense and organic collaboration, provides insights into the potential of collective agency and symbolic advocacy.”
  • “Enabling fine-grained RDF data completeness assessment”[17] From the abstract: “The idea of the paper is to have completeness information over RDF data sources and use it for checking query completeness. In particular, [for Wikidata,] an indexing technique was developed to allow to scale completeness reasoning to Wikidata-scale data sources. The applicability of the framework was verified using Wikidata and COOL-WD, a completeness tool for Wikidata, was developed. The tool is available at http://cool-wd.inf.unibz.it/”
  • “Linked data quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO”[18] From the abstract: “In recent years, several noteworthy large, cross-domain and openly available knowledge graphs (KGs) have been created. These include DBpedia, Freebase, OpenCyc, Wikidata, and YAGO. Although extensively in use, these KGs have not been subject to an in-depth comparison so far. In this survey, we provide data quality criteria according to which KGs can be analyzed and analyze and compare the above mentioned KGs.” From the paper: “… Wikidata covers all relations of the gold standard, even though it contains considerably less relations [than Freebase] (1,874 vs. 70,802). The Wikidata methodology to let users propose new relations, to discuss about their coverage and reach, and finally to approve or disapprove the relations, seems to be appropriate.”

    Mus musculus had all its genes imported into Wikidata

  • “Wikidata as a semantic framework for the Gene Wiki initiative”[19] From the abstract: “… we imported all human and mouse genes, and all human and mouse proteins into Wikidata. In total, 59 721 human genes and 73 355 mouse genes have been imported from NCBI and 27 306 human proteins and 16 728 mouse proteins have been imported from the Swissprot subset of UniProt. … The first use case for these data is to populate Wikipedia Gene Wiki infoboxes directly from Wikidata with the data integrated above. This enables immediate updates of the Gene Wiki infoboxes as soon as the data in Wikidata are modified. … Apart from the Gene Wiki infobox use case, a SPARQL endpoint and exporting functionality to several standard formats (e.g. JSON, XML) enable use of the data by scientists.”
  • “Connecting every bit of knowledge: The structure of Wikipedia’s First Link Network”[20] From the abstract: “By following the first link in each article, we algorithmically construct a directed network of all 4.7 million articles: Wikipedia’s First Link Network. … By traversing every path, we measure the accumulation of first links, path lengths, groups of path-connected articles, and cycles. … we find scale-free distributions describe path length, accumulation, and influence. Far from dispersed, first links disproportionately accumulate at a few articles—flowing from specific to general and culminating around fundamental notions such as Community, State, and Science. Philosophy directs more paths than any other article by two orders of magnitude. We also observe a gravitation towards topical articles such as Health Care and Fossil Fuel.” (See also media coverage: “All Wikipedia Roads Lead to Philosophy, but Some of Them Go Through Southeast Europe First” and Wikipedia:Getting to Philosophy)

References

  1. Nicolaes, Feli (2016-06-24). “Gender Bias on Wikipedia: An analysis of the affiliation network” (PDF). Amsterdam: University of Amsterdam. 
  2. Ford, Heather; Wajcman, Judy. “‘Anyone can edit’ not everyone does: Wikipedia and the gender gap” (PDF). Social Studies of Science. ISSN 0306-3127. 
  3. Thelwall, Mike (2016-11-14). “Does astronomy research become too dated for the public? Wikipedia citations to astronomy and astrophysics journal articles 1996–2014”. El Profesional de la Información 25 (6): 893–900. doi:10.3145/epi.2016.nov.06. ISSN 1699-2407. 
  4. Ciocirdel, Georgiana Diana; Varga, Mihai (2016). Election prediction based on Wikipedia pageviews (PDF). p. 9. 
  5. Twyman, Marlon; Keegan, Brian C.; Shaw, Aaron (2016-11-03). “Black Lives Matter in Wikipedia: Collaboration and collective memory around online social movements”. arXiv:1611.01257 [physics]. doi:10.1145/2998181.2998232. 
  6. Kritschmar, Charlie (2016-03-03). Facilitating the use of Wikidata in Wikimedia projects with a user-centered design approach (PDF) (Thesis).  Bachelor’s thesis written at the HTW Berlin in Internationale Medieninformatik
  7. Prabhakaran, Vinodkumar; Rambow, Owen (2016). “A corpus of Wikipedia discussions: over the years, with topic, power and gender labels”. p. 5. 
  8. Gotkin, Kevin (2016-02-24). “Wikipedia and the politics of openness”. Information, Communication & Society 0 (0): 1–3. doi:10.1080/1369118X.2016.1151911. ISSN 1369-118X.  Closed access
  9. Rojas, Fabio (2016-03-01). “Wikipedia and the Politics of Openness”. Contemporary Sociology: A Journal of Reviews 45 (2): 251–252. doi:10.1177/0094306116629410. ISSN 0094-3061. 
  10. Ben Aouicha, Mohamed; Hadj Taieb, Mohamed Ali; Ezzeddine, Malek (2016-04-01). “Derivation of “is a” taxonomy from Wikipedia category graph”. Engineering Applications of Artificial Intelligence 50: 265–286. doi:10.1016/j.engappai.2016.01.033. ISSN 0952-1976.  Closed access
  11. Corona Reyes, Sergio Antonio; Muñoz Yáñez, Brenda Azucena (2015-12-29). ““En Wikipedia no se escribe jugando”: Identidad y motivación en 10 wikipedistas regiomontanos”. Global Media Journal México 12 (23). 
  12. Kallass, Kerstin (2015). Schreiben in der Wikipedia. Springer Fachmedien Wiesbaden. doi:10.1007/978-3-658-08265-9. ISBN 978-3-658-08265-9.  Closed access (in German)
  13. Martinez-Ortiz, C.; Koolen, M.; Buschenhenke, F.; Dalen-Oskam, K. v (2015-08-01). “Beyond the Book: linking books to Wikipedia”. 2015 IEEE 11th International Conference on e-Science (e-Science). 2015 IEEE 11th International Conference on e-Science (e-Science). pp. 12–21. doi:10.1109/eScience.2015.12.  Closed access
  14. Bennacer, Nacéra; Vioulès, Mia Johnson; López, Maximiliano Ariel; Quercini, Gianluca (2015-11-01). “A multilingual approach to discover cross-language links in Wikipedia”. In Jianyong Wang, Wojciech Cellary, Dingding Wang, Hua Wang, Shu-Ching Chen, Tao Li, Yanchun Zhang (eds.). Web Information Systems Engineering – WISE 2015. Lecture Notes in Computer Science. Springer International Publishing. pp. 539–553. ISBN 9783319261898.  Closed access
  15. Keegan, Brian C.; Lev, Shakked; Arazy, Ofer (2015-08-19). “Analyzing organizational routines in online knowledge collaborations: a case for sequence analysis in CSCW”. arXiv:1508.04819 [physics, stat]. 
  16. Byrne, Caitlin; Johnston, Jane (2015-10-23). “Wikipedia: medium and model of collaborative public diplomacy”. The Hague Journal of Diplomacy 10 (4): 396–419. doi:10.1163/1871191X-12341312. ISSN 1871-191X.  Closed access
  17. Darari, Fariz; Razniewski, Simon; Prasojo, Radityo Eko; Nutt, Werner (2016). “Enabling fine-grained RDF data completeness assessment”. Proceedings of the 16th International Conference on Web Engineering (ICWE ’16). Lugano, Switzerland. 2016. Springer International Publishing. doi:10.1007/978-3-319-38791-8_10.  Closed access (preprint freely available online)
  18. Färber, Michael; Ell, Basil; Menne, Carsten; Rettinger, Achim; Bartscherer, Frederic (2016). Linked data quality of DBpedia, Freebase, OpenCyc, Wikidata, and YAGO. 
  19. Burgstaller-Muehlbacher, Sebastian; Waagmeester, Andra; Mitraka, Elvira; Turner, Julia; Putman, Tim; Leong, Justin; Naik, Chinmay; Pavlidis, Paul; Schriml, Lynn; Good, Benjamin M.; Su, Andrew I. (2016-01-01). “Wikidata as a semantic framework for the Gene Wiki initiative”. Database 2016: 015. doi:10.1093/database/baw015. ISSN 1758-0463. PMID 26989148. 
  20. Ibrahim, Mark; Danforth, Christopher M.; Dodds, Peter Sheridan (2016-05-01). “Connecting every bit of knowledge: The structure of Wikipedia’s First Link Network”. arXiv:1605.00309 [cs]. 
Supplementary references:
  1. Tkacz, Nathaniel (2014-12-19). Wikipedia and the politics of openness. Chicago ; London: University Of Chicago Press. ISBN 9780226192277. 

Wikimedia Research Newsletter
Vol: 6 • Issue: 12 • December 2016
This newsletter is brought to you by the Wikimedia Research Committee and The Signpost


by Tilman Bayer at January 26, 2017 06:12 AM

January 25, 2017

Wikimedia UK

The first week’s highlights from #1lib1ref

We are just over a week into the second annual #1lib1ref campaign, where we “imagine a world where every librarian adds one more reference to Wikipedia.”

Jerwood Library, Trinity Hall, Cambridge. Photo by Andrew Dunn, CC BY-SA 2.0.

Wikipedia is based on real facts, backed up by citations—and librarians are expert at finding supporting research.

This year’s campaign launched on January 15, to celebrate Wikipedia’s sixteenth birthday. As of Monday, participants had made 1,543 contributions to 1,065 articles in 15 different languages.

We know that more librarian meetups, events, editathons, webinars, coffee hours, tweets, photos, sticker-selfies, blog posts and more have happened—share them on social media to help spread the campaign! Here are a few highlights from the week.

IFLA white papers

Following a year-long conversation, the International Federation of Library Associations (IFLA) kicked off #1lib1ref by officially publishing two “Opportunities Papers” emphasizing the potential for collaboration between Wikipedia and academic and public libraries.

Showing the story of a citation

#1lib1ref provides a great opportunity for communities to create resources about how to contribute to Wikimedia projects. Below are great new ones made for the campaign:

Video via Wikimedia Germany and the Simpleshow Foundation, CC BY-SA 4.0.
  1. Wikimedia Deutschland made a great video explainer in both English and German.
  2. NCompass Live hosted a webinar: The Wikimedia Foundation’s Alex Stinson alongside Wiki-Librarians Jessamyn West, Phoebe Ayers, Merrilee Profitt and Kelly Doyle provided an overview of the ways different library communities can improve Wikipedia.
  3. Wikipedian in Residence at the University of Edinburgh, Ewan McAndrew, developed excellent introductory videos for how to contribute to #1lib1ref!

A global story grows bigger

The campaign is already bigger than last year: we’ve surpassed last year’s total contributions, and we’re not even finished yet. To capture the scope and excitement, we created a Storify sharing some of the most interesting of last week’s tweets, which numbered over 1,000.

We still have two more weeks to go! Keep pushing to get your local librarians and libraries involved with the campaign, and help share the gift of a citation with the world.

Alex Stinson, GLAM Strategist
Jake Orlowitz, Head of the Wikipedia Library
Wikimedia Foundation

by Alex Stinson at January 25, 2017 04:01 PM

Gerard Meijssen

#Wikidata - Sultanism anyone?

The definition of "sultanism" is:
In political science, sultanism is a form of authoritarian government characterized by the extreme personal presence of the ruler in all elements of governance. The ruler may or may not be present in economic or social life, and thus there may be pluralism in these areas, but this is never true of political power.
Prominent political scientists use the term, so it must be applicable, and indeed some consider that any sultanate is defined by it. The problem is that the name is strongly linked to Islam, yet the concept applies equally to monarchs like Henry VIII: he founded the Church of England, and the way that church came to be makes sultanism applicable to him.

It does not really matter how the concept of sultanism came to be. The name chosen is extremely prejudicial. The problem we face is that words and facts matter. Both Wikipedia and Wikidata represent a neutral point of view, and therefore a concept like sultanism deserves a place. However, when such a concept is applied, it needs to be applied in a neutral way. That means you cannot point to a country and say "sultanate"; the concept applies to a ruler, and therefore to Henry as much as to an evil genius like Jafar.
Thanks,
     GerardM

by Gerard Meijssen (noreply@blogger.com) at January 25, 2017 07:09 AM

January 24, 2017

Wikimedia Foundation

The first week’s highlights from #1lib1ref

Jerwood Library, Trinity Hall, Cambridge. Photo by Andrew Dunn, CC BY-SA 2.0.

We are just over a week into the second annual #1lib1ref campaign, where we “imagine a world where every librarian adds one more reference to Wikipedia.”

Wikipedia is based on real facts, backed up by citations—and librarians are expert at finding supporting research.

This year’s campaign launched on January 15, to celebrate Wikipedia’s sixteenth birthday. As of Monday, participants had made 1,543 contributions to 1,065 articles in 15 different languages.

We know that more librarian meetups, events, editathons, webinars, coffee hours, tweets, photos, sticker-selfies, blog posts and more have happened—share them on social media to help spread the campaign! Here are a few highlights from the week.

IFLA white papers

Following a year-long conversation, the International Federation of Library Associations (IFLA) kicked off #1lib1ref by officially publishing two “Opportunities Papers” emphasizing the potential for collaboration between Wikipedia and academic and public libraries.

Showing the story of a citation

#1lib1ref provides a great opportunity for communities to create resources about how to contribute to Wikimedia projects. Below are great new ones made for the campaign:

Video via Wikimedia Germany and the Simpleshow Foundation, CC BY-SA 4.0.

  1. Wikimedia Deutschland made a great video explainer in both English and German.
  2. NCompass Live hosted a webinar: The Wikimedia Foundation’s Alex Stinson alongside Wiki-Librarians Jessamyn West, Phoebe Ayers, Merrilee Profitt and Kelly Doyle provided an overview of the ways different library communities can improve Wikipedia.
  3. Wikipedian in Residence at the University of Edinburgh, Ewan McAndrew, developed excellent introductory videos for how to contribute to #1lib1ref!

A global story grows bigger

The campaign is already bigger than last year: we’ve surpassed last year’s total contributions, and we’re not even finished yet. To capture the scope and excitement, we created a Storify sharing some of the most interesting of last week’s tweets, which numbered over 1,000.

We still have two more weeks to go! Keep pushing to get your local librarians and libraries involved with the campaign, and help share the gift of a citation with the world.

Alex Stinson, GLAM Strategist
Jake Orlowitz, Head of the Wikipedia Library
Wikimedia Foundation

Image by Spiritia, public domain/CC0.

by Alex Stinson and Jake Orlowitz at January 24, 2017 08:43 PM

William Beutler

#1Lib1Ref and Adventures in Practical Encyclopedia-Building

The Wikipedian has long been of the opinion, perhaps controversial on Wikipedia, that it is a mistake to think the entire world can be recruited to become Wikipedia editors. Yet this is the premise upon which so many aspects of Wikipedia’s platform are based.

Start with the fact that anyone can edit (almost) any page at any time. This was Wikipedia’s brilliant original insight, and there is no doubt it made Wikipedia what it is today. But along with scholars and other knowledge-loving contributors comes the riff raff. The calculation is that the value of good editors attracted by Wikipedia’s open-editing policy will outweigh the vandals and troublemakers. On one hand, it is an article of faith not rigorously tested. On the other hand, Wikipedia’s mere existence is proof that the bet is generally sound.

All of which is preamble to praise Wikipedia’s #1Lib1Ref project, now in its second year, for taking what is to my mind a more sensible approach to building Wikipedia’s editorship: targeting persons and professions that already have more in common with Wikipedia than they might realize, in this case librarians. Whereas the official Wikimedia vision statement calls for “a world in which every single human being can freely share in the sum of all knowledge”, the #1Lib1Ref tagline suggests “a world where every librarian added one more reference to Wikipedia.”

This is great! As much as The Wikipedian strongly supports the big-picture goal of the vision statement, the fact is that asking “every” person to contribute “all” things is no place to begin. But asking a very specific type of person to make just one contribution turns out to be vastly more effective.

Speaking anecdotally, the greatest hurdle to becoming a Wikipedia contributor is figuring out how to make that very first edit.[1] Encouraging the determination to give it a try, and creating a simple set of steps to help newcomers get there, will do a lot more than the sum of all lofty rhetoric.

#1Lib1Ref runs January 15 to February 3, and you can learn more about it via The Wikipedia Library. If you decide to get involved, you should also consider posting with the obvious hashtag on Twitter or another social platform of your choice. Oh, and if you don’t get to it before February 3, I’m sure they’ll be happy to have you join in after the fact.

P.S. You have no idea how hard it was to write this without making either a Bob Marley or U2 reference. If you now have one song or the other stuck in your head, you are most welcome.

The Wikipedia Library logo by User:Heatherawalls, licensed under Creative Commons.

Notes

1. The second greatest hurdle is getting that person to figure out what to do next, but that is for another day.

by William Beutler at January 24, 2017 04:09 PM

January 23, 2017

Andy Mabbett (pigsonthewing)

Bromptons in Museums and Art Galleries

Every time I visit London, with my Brompton bicycle of course, I try to find time to take in a museum or art gallery. Some are very accommodating and will cheerfully look after a folded Brompton in a cloakroom (e.g. Tate Modern, Science Museum) or, more informally, in an office or behind the security desk (Bank of England Museum, Petrie Museum, Geffrye Museum; thanks folks).


Brompton bicycle folded

When folded, Brompton bikes take up very little space

Others, without a cloakroom, have lockers for bags and coats, but these are too small for a Brompton (e.g. Imperial War Museum, Museum of London) or they simply refuse to accept one (V&A, British Museum).

A Brompton bike is not something you want to chain up in the street, and carrying a hefty bike-lock would defeat the purpose of the bike’s portability.


Jack Wills, New Street (geograph 4944811)

This Brompton bike hire unit, in Birmingham, can store ten folded bikes each side. The design could be repurposed for use at venues like museums or galleries.

I have an idea. Brompton could work with museums — in London, where Brompton bikes are ubiquitous, and elsewhere, though my Brompton and I have never been turned away from a museum outside London — to install lockers which can take a folded Brompton. These could be inside with the bag lockers (preferred) or outside, using the same units as their bike hire scheme (pictured above).

Where has your Brompton had a good, or bad, reception?

Update

Less than two hours after I posted this, Will Butler-Adams, MD of Brompton, replied to me on Twitter:

so now I’m reaching out to museums, in London to start with, to see who’s interested.

by Andy Mabbett at January 23, 2017 08:24 PM

Wikimedia Foundation

“I knew that once I started, I wouldn’t be able to stop writing”: Başak Tosun

Photo by Muzammil, CC BY-SA 4.0.

Başak Tosun has been editing the Turkish Wikipedia for over a decade, and she still remembers the feeling she got when she received an email inviting her to contribute.

“The moment I read about [Wikipedia], the idea of writing encyclopedic articles sounded like fun,” Tosun recalls. “But I know myself very well. I hesitated about visiting that website. I knew that once I started, I wouldn’t be able to stop writing.”

Tosun successfully held out for a few months but eventually decided to take the plunge. Editing Wikipedia started as a simple hobby, writing articles about her favorite anime characters, before it became more when she decided to channel her efforts into filling content gaps on the Turkish Wikipedia.

“Many artists, writers and scientists were missing on Wikipedia, or not being fairly or adequately represented on the internet,” says Tosun. “I felt empowered knowing I could do something about it.”

Massacre of the Mamluks in Cairo Citadel, 1805. Photo by Horace Vernet, public domain.

One of the 1,100 articles created by Tosun was on Ibn Taghribirdi, a historian from fifteenth-century Egypt. He lived among Cairo’s Mamluk elite (the Turkish ruling class of slave origins). Ibn Taghribirdi is known for his analytic style in documenting the Mamluk rulers of Egypt and the history of Egypt during the Middle Ages.

Overall, Tosun has a passion for editing history and biographies and has invested much time in developing articles about the history of Turkey, women in art, musician and dancer profiles, and more. This interest has bled over into her professional life as well: “When I started editing history, I recognized that my general knowledge of it was lacking,” she said. “So I started a four-year degree program in history at an open university that I’m completing this year.”

Tosun doesn’t have a utopian view of the Wikipedia community. Mistakes occur, she notes, but assuming good faith makes them tolerable. Even editing conflicts can result in collaboration on developing a topic. “Sometimes there are conflicts on the ethnic origins of people in the biographies I write,” Tosun explains. “Most of the time, the person in question is of mixed origins, and therefore all sides of the conflict have merit. In such cases, I usually try to add as much detailed information and references as I can to support all views.”

Tosun enjoys sharing her experience with others and showing them how to contribute. According to her, “It’s always easy to start contributing to Wikipedia as long as the new user recognizes the edit button.” Based on that, she suggested that her sister, a psychology professor, assign her students editing tasks on Wikipedia as part of their syllabus. Tosun offered to help train the students on how to edit Wikipedia.

The plan worked and the next semester, another professor joined the efforts with 102 students. “Most students do not continue contributing extensively, but at least they become better readers of Wikipedia,” Tosun explains. Together with the Turkish Wikipedia community, Tosun is now helping with several Wikipedia courses in different universities.

When not on Wikipedia, Tosun works for a web hosting and domain registration company. She studied political science before going for a second degree in history.

One of her wishes for 2017 is to help organize an editathon (editing workshop) at the Poetry Library in the city where she lives, where the participants would focus on editing Turkish poet profiles.

Interview by Syed Muzammiluddin, Wikimedia Community Volunteer
Profile by Samir Elsharbaty, Digital Content Intern, Wikimedia Foundation

by Syed Muzammiluddin and Samir Elsharbaty at January 23, 2017 08:15 PM

Wiki Education Foundation

Students share linguistics with the world

While visiting the Linguistic Society of America Conference in Austin earlier this month, I asked attendees: why do you think the study of linguistics is so relevant today? Their replies were varied: the election, the rise of fake news, the importance of understanding language bias, and knowing how we use rhetoric to persuade others.

In 2016, linguistics was a topic of interest not just in academic scholarship, but also in popular culture and politics. When Arrival hit theaters in the fall, it challenged us to think about the power of language in shaping our understanding of the world — or other worlds. Throughout the year, news outlets asked us to consider the relevance of the President-elect’s rhetorical devices and speech patterns in shaping public opinion.

Here at Wiki Ed, we agree that the public’s understanding of these issues is paramount. That’s why in November 2015, just as we were starting to promote our Year of Science, we announced our partnership with the Linguistic Society of America to support students as they work to systematically improve coverage of linguistics topics on Wikipedia. And in the last year we’ve done just that.

Since the beginning of our partnership in the spring 2016 term, we’ve supported 25 courses with 348 students as they contributed to language and linguistics articles on Wikipedia. Together, the 373 articles they improved, including 13 new entries, have been viewed over 13.5 million times. These numbers further illustrate the relevance of linguistics to public conversations in 2016. That’s partly why I found myself so glad to be returning to LSA as we kick off the new year, and why Wiki Ed is so proud of our partnership. In 2017, we’d love to continue to grow our support of these classes.

Wiki Ed provides technical tools, training materials, and flexible assignment timelines to make integrating Wikipedia into your courses as simple as possible. Instructors and students also receive staff support throughout the semester. One instructor teaching with us for the second time this spring came by my booth at the conference and said “Wiki Ed’s support will save me 50 hours in prep time!” I hope you’ll join us in using Wikipedia to help the world understand the crucial work of linguists.

For more information about teaching with Wikipedia generally, visit teach.wikiedu.org. If you’d like to talk with someone about setting up an assignment in your next course, reach out at contact@wikiedu.org.

by Samantha Weald at January 23, 2017 08:00 PM

Sam Wilson

Wikisource Hangout

I wonder how long it takes after someone first starts editing a Wikimedia project that they figure out that they can read lots of Wikimedia news on https://en.planet.wikimedia.org/ — and when, after that, they realise they can also post to the news there? (At which point they probably give up if they haven’t already got a blog.)

Anyway, I forgot that I can post news, but then I remembered. So:

There’s going to be a Wikisource meeting next weekend (28 January, on Google Hangouts), if you’re interested in joining:
https://meta.wikimedia.org/wiki/Wikisource_Community_User_Group/January_2017_Hangout

by Sam Wilson at January 23, 2017 11:43 AM

Gerard Meijssen

#Wikipedia - #Sources anyone?

Sources are important. They make it obvious what is correct and what is not. For content in Wikidata, Wikipedia is an important source of information. It aims to be neutral and there are loads of sources.

When you bring the information together in a tree like the one to the right, it follows that all the information has to agree with that interpretation. It all starts with "Duqaq Temür Yalığ" but he is called "Toqaq" in the article on Seljuq.

The article on the Seljuk Empire is quite wonderful because it includes the spouses of the Sultans and their lineage. Really relevant to understand the politics of the time.

I do include information where I can find and understand it. Quite often, information is problematic. Sometimes it is obviously wrong as in attributing a person to a modern country. As more data is entered, the information becomes more complicated and coherent. Errors become more glaringly obvious. It becomes more and more a matter of adding individual statements that are the difference and not so much long lists of data.

At some stage the puzzles will be left and sources will need to be sought to make the right statements, not the obvious statements.
Thanks,
       GerardM

by Gerard Meijssen (noreply@blogger.com) at January 23, 2017 10:34 AM

Andre Klapper

Wikimedia in Google Code-in 2016

(Google Code-in and the Google Code-in logo are trademarks of Google Inc.)

Google Code-in 2016 has come to an end. Wikimedia was one of the 17 organizations that took part, offering mentors and small tasks to 14–17-year-old students exploring free and open source software projects.

Congratulations to our 192 students and 46 mentors for fixing 424 tasks together!

As one of the organization admins, I find that deciding on the top five students at the end of the contest always takes time and discussion: many students have provided impressive work, and it hurts to have to put a great contributor in 6th or 7th place.
Google will announce the Grand Prize winners and finalists on January 30th.

Reading the students’ final feedback always reassures us that all the effort mentors and organization admins put into GCI is worth it:

  • In 1.5 month, I learned more than in 1.5 year. — Filip
  • I know these things will be there forever and it’s a big thing for me to have my name on such a project as MediaWiki. — Victor
  • What makes kids like me continue a work is appreciation and what the community did is give them a lot. — Subin
  • I spent my best time of my life during the contest — David

Read blogposts by GCI students about their experience with Wikimedia.

To list some of the students’ achievements:

  • Many improvements to Pywikibot, Kiwix (for Wikipedia offline reading), Huggle, WikiEduDashboard, Wikidata, documentation, …
  • MediaWiki’s Newsletter extension received a huge amount of code changes
  • The Pageview API offers monthly request stats per article title
  • jQuery.suggestions now offers reason suggestions in the block, delete, and protect forms
  • A {{PAGELANGUAGE}} magic word was added
  • Changes to number of observations in the Edit Quality Prediction model
  • A dozen MediaWiki extension pages received screenshots
  • Lots of removal of deprecated code in MediaWiki core and extensions
  • Long CREDIT showcase videos got split into ‘one video per topic’ videos on Wikimedia Commons
  • Proposals for a redesign of the Romanian Wikipedia’s main page
  • Performance improvements to the importDump.php maintenance script
  • Converted Special:RecentChanges to use the OOUI library
  • Allow users to apply change tags as they make logged actions using the MediaWiki web API
  • Added some hooks to Special:Unblock
  • Added a $wgHTTPImportTimeout setting for Special:Import
  • Added ability to configure the web service endpoint and added phpcs checks in MediaWiki’s extension for Ideographic Description Sequences
  • Glossary wiki pages follow the formatting guidelines
  • Research on team communication tools
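
As an aside on the Pageview API item above: monthly per-article request stats are served by Wikimedia's public REST endpoint. The sketch below only builds the request URL (the path shape follows the Pageview API's per-article route; actually fetching and parsing the JSON is left to the caller), so treat it as an illustrative helper rather than an official client.

```python
from urllib.parse import quote

# Base of the Wikimedia Pageview API's per-article metrics route.
PAGEVIEWS_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"

def monthly_pageviews_url(article, start, end, project="en.wikipedia"):
    """Build the URL for monthly pageview counts of `article`.

    `start` and `end` are YYYYMMDD date strings; spaces in the title are
    replaced with underscores and the result is percent-encoded.
    """
    title = quote(article.replace(" ", "_"), safe="")
    return (f"{PAGEVIEWS_BASE}/{project}/all-access/all-agents/"
            f"{title}/monthly/{start}/{end}")

print(monthly_pageviews_url("Google Code-in", "20160101", "20161231"))
```

The returned URL can be fetched with any HTTP client to get one JSON record per month for the article.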

We also received valuable feedback from our mentors on what we can improve for the next round.

Thanks to everybody for your friendliness, patience, and help provided.
Thanks for your contributions to free software and free knowledge.
See you around on IRC, mailing lists, tasks, and patch comments!

by aklapper at January 23, 2017 04:47 AM

January 21, 2017

Andy Mabbett (pigsonthewing)

Four Stars of Open Standards

I’m writing this at UKGovCamp, a wonderful unconference. This post constitutes notes, which I will flesh out and polish later.

I’m in a session on open standards in government, convened by my good friend Terence Eden, who is the Open Standards Lead at Government Digital Service, part of the United Kingdom government’s Cabinet Office.

Inspired by Tim Berners-Lee’s “Five Stars of Open Data”, I’ve drafted “Four Stars of Open Standards”.

These are:

  1. Publish your content consistently
  2. Publish your content using a shared standard
  3. Publish your content using an open standard
  4. Publish your content using the best open standard

Bonus points for:

  • making clear which standard you use
  • publishing your content under an open licence
  • contributing your experience to the development of the standard.

Point one, if you like, is about having your own local standard: if you publish three related data sets, for instance, be consistent between them.

Point two could simply mean agreeing a common standard with other parts of your organisation, neighbouring local authorities, or suchlike.

In points three and four, I’ve taken “open” to be the term used in the “Open Definition”:

Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness).

by Andy Mabbett at January 21, 2017 03:13 PM

Gerard Meijssen

#Wikipedia - Support understanding the #gender gap

#Wikidata needs to mature. #Wikipedia needs to mature. They both have wishes they aim to fulfil that still escape them. The gender gap is one such issue, and it can be used to illustrate how both will mature when they cooperate.

When you want to know how many articles are expected to be written at a given point you need to analyse the red links. They indicate articles that are likely notable and indicate a structural need in Wikipedia. To do that you need data and you need a tool.

When every red link is connected to an item in Wikidata, you have both the data and a tool. This will help Wikipedia with its disambiguation, and it will show what a Wikipedia is missing. It is a tool that may drive people to write articles about the missing links.

All the red links will now link to Wikidata and articles in other Wikipedias. It also allows for people to add statements to Wikidata so that facts about those items are known. For instance that it is about a woman. When statements to awards, professions and events are known, there is added weight to write an article.

In this way two purposes are served: researchers have better tools to help them understand the gender gap, and people who care about the gender gap can work on reducing it.

Technically it is not that complicated to achieve. If there is a problem with this proposal it may be that Wikipedians need to understand that this is not a power grab but a way to improve quality and efficiency of their project.
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at January 21, 2017 08:51 AM

January 20, 2017

Wiki Education Foundation

Monthly Report for December 2016

Highlights

  • At the end of the Wikipedia Year of Science, we tallied what our 287 science courses contributed: 4.93 million words added to 5,640 articles, including 622 new entries, viewed 270 million times just during their respective terms. That means we added the equivalent of 11% of the last print edition of Encyclopedia Britannica to science content on Wikipedia during the Year of Science.
  • Our fall term wrapped up in December, with us supporting more than 6,300 students in 276 courses. In the fall term, student editors added 4.2 million words of content across all disciplines, providing better content for 253 million readers.
  • We announced new Wikipedia Visiting Scholars that will be working with the University of San Francisco’s Department of Rhetoric and Language and San Francisco State University’s Paul K. Longmore Institute on Disability.
  • We released an addition to our series of subject-specific editing brochures: Editing Wikipedia articles on Political Science.

Programs

Educational Partnerships

Samantha Weald attends the American Geophysical Union conference in San Francisco
Samantha Weald attends the American Geophysical Union conference in San Francisco

In December, Outreach Manager Samantha Weald, Classroom Program Manager Helaine Blumenthal, Director of Programs LiAnna Davis, and Educational Partnerships Manager Jami Mathewson attended the final academic conference during the Wikipedia Year of Science. At the American Geophysical Union’s annual meeting in San Francisco, staff members met earth scientists eager to improve Wikipedia’s content. At the conference, we spoke to dozens of scientists who believe Wikipedia is a valuable website for them, their students, and the world. We’re excited to bring more geophysics, geology, and earth science students to Wikipedia in the coming years, helping us amplify the impact of this year’s Wikipedia Year of Science.

As we wrapped up another year of recruitment, we reflected on our aim to increase the Wiki Education Foundation’s visibility to university and college instructors in the United States and Canada. Over the course of the year, we attended 23 conferences to share Wiki Ed’s mission with university instructors. We also made 12 campus visits, where Wiki Ed’s program participants hosted us to encourage their colleagues to join our efforts. Additionally, we hosted four outreach webinars. Through these outreach initiatives, we brought more instructors than ever into the Classroom Program, supporting a record 515 courses and nearly 11,000 students in 2016.

Classroom Program

Status of the Classroom Program for Fall 2016 in numbers, as of December 31:

  • 276 Wiki Ed-supported courses were in progress (130, or 47%, were led by returning instructors).
  • 6,307 student editors were enrolled.
  • 60% of students were up-to-date with the student training.
  • Students edited 5,700 articles, created 722 new entries, and contributed 4.18 million words.

The Fall 2016 term has come to a close, and we’re busily preparing for Spring 2017. Our most successful term to date was defined by growth, productivity, and experimentation. With 276 courses doing Wikipedia assignments, the Classroom Program has grown to nearly triple the size it was in Fall 2014. And of course with this rapid growth, our students are having an even greater impact on Wikipedia. To ensure that all of our instructors and students get the support they need, we implemented several new programs during the Fall 2016 term, including a series of interactive webinars and a more robust help section built into the Dashboard.

While we’re proud of the above numbers, the true success of the Classroom Program is, in some ways, immeasurable. As recent events have demonstrated, fake news poses a serious threat to an informed citizenry. Students who learn how to contribute to Wikipedia are not only making reliable information available to the public at large, they are also developing critical media literacy skills that enable them to discern real from fake sources of information. In learning Wikipedia’s strict policies around sourcing, our students know to question headlines and to dig deeper. These are lifelong skills that not only serve our students, but society more generally.

The close of 2016 also marked the end of the Wikipedia Year of Science. During this year-long initiative, we strove to improve Wikipedia content in STEM and social science fields, while developing critical science communication skills among our students. Our Year of Science campaign consisted of 287 courses and 6,270 students. Together, they contributed 4.93 million words to 5,640 Wikipedia articles, including 622 new entries, and their work was viewed 270 million times in the spring and fall terms alone. A specific goal of the Year of Science was to improve Wikipedia’s coverage of women scientists, and our students either expanded or created well over 100 articles on important but overlooked women in the sciences. While the Year of Science has come to an end, we recognize that our work in this area has, in many ways, only just begun. Science literacy, along with media literacy, is a key component of an accurately informed society, and we will continue to prioritize both going forward.

Angel_food_cake_with_strawberries_(4738859336)
The Wikipedia article on angel food cake was among those improved in Richard Ludescher’s Food Physical Systems class at Rutgers University.
Image: Angel food cake with strawberries by F_A, CC BY 2.0, via Wikimedia Commons.

We saw some great work from several courses:

When we think about food, we think about taste, preparation, and whether it’s healthy or not. We rarely think about things like hydrogen bonding, electrostatic interactions, or Van der Waals interactions. But these are important aspects, and thanks to students in Richard Ludescher’s Food Physical Systems class at Rutgers University, information of this sort is now available through a number of Wikipedia articles. A student in the class expanded the angel food cake article by adding sections about the manufacturing process, the ingredients used in commercial production, and the physical and biochemical roles played by these ingredients in the final product. Another student expanded the croissant article by adding information about their manufacture and the changes in the physical and chemical properties of ingredients during manufacturing, baking, and storage. Other students added information on the physical and chemical properties of a number of other foods, including marshmallows, mayonnaise, and chewing gum. The expansion of the meat analogue article added information about the composition, processing, and physical structure of the product required to mimic the texture and taste of meat.

Students in Glenn Dolphin’s Introductory Geology class continued their work expanding biographies of women geologists. Maria Crawford contributed to lunar petrology, continental collisions (on earth) and the geology of the Pennsylvania Piedmont, but at the beginning of the term her Wikipedia biography was only four sentences long and said nothing of her contributions. A student turned that stub into a substantial article which documented her achievements from a career that spanned four decades. Another student created an article on Virginia Harriett Kline, a stratigrapher who earned a Ph.D. in geology in 1935 and made important contributions to petroleum geology. Other students in the class continued to expand the articles they had worked on earlier in the term.

Early studies of child psychology often focused on conflict and aggression. Lois Barclay Murphy chose instead to focus on normal childhood development; she played an important role in the development of that field. Marvin Zuckerman played an important role in the development of the field of sensation seeking. Mary K. Rothbart is an expert on infant temperament development. Rena R. Wing is an expert on the behavioral treatment of obesity. None of these psychologists had biographies on Wikipedia. Similarly, the field of geriatric psychology had been omitted. These were among the articles created by students in James Council’s History and Systems of Psychology class at North Dakota State University. Other students worked to expand existing articles, like the profile of mood states, which was expanded from a short stub into a substantial article.

One of Wiki Ed’s great successes has been recruiting professors in archaeology and anthropology to expand and improve articles on archaeological sites, artifacts and methods. There are thousands of sites which are notable but not covered in Wikipedia — students in courses like Rice University’s African Prehistory have added to articles like Manyikeni in Mozambique and KM2 and KM3 in Tanzania. They’ve also updated the article on South Africa’s Border Cave, an already substantial article which now covers more modern work on the site and artifacts found within it.

Critical theory is hard work. Explaining it to laypeople is even harder. Doing so on Wikipedia, harder still. The language of critical theory (in nearly any discipline: law, economics, feminism) is often disjoint from or at odds with the main voices in the discipline; otherwise it’s hard to say it’s critical! Students in John Willinsky’s Critical Theory and Pedagogies course outdid themselves in adding to Wikipedia’s coverage of critical mathematics pedagogy and critical pedagogy (a hard phrase to hear for the policy debate veterans in our audience), and in expanding coverage of books like Learning to Labour, a critical educational ethnography. Work on narrow, difficult topics like critical pedagogy of place requires research and preparation, and the students’ work speaks to the effort they’ve put in.

Finally, interim Content Expert Rob Fernandez, who graciously agreed to join our staff temporarily to help out with the rush at the end of the fall term, wrapped up his contract with Wiki Ed in December. Rob’s help to ensure our student editors and instructors got top-notch support was invaluable. Thank you for your contributions, Rob, and best of luck on your new job!

Community Engagement

Blausen_0088_BloodClot
Barbara Page’s article about thrombosis prevention explains treatments to prevent the formation of blood clots inside a blood vessel.
Image: Blausen 0088 BloodClot.png by Blausen.com staff, CC BY 3.0, via Wikimedia Commons.

Community Engagement Manager Ryan McGrady announced two new Visiting Scholars positions at the beginning of this month; the Scholars got their start at the very end of last month and are already using institutional resources to improve Wikipedia. User:Lingzhi partnered with the University of San Francisco to improve rhetoric and language topics, and Jackie Koerner is developing articles on disability, such as disability in the United States, with the Paul K. Longmore Institute on Disability at San Francisco State University.

Existing Scholars continued to produce great work. George Mason University’s Gary Greenbaum had another article achieve the impressive Featured Article designation, Alabama Centennial half dollar. The University of Pittsburgh’s Barbara Page built up her portfolio of impressive medical editing with substantial improvements to Wikipedia’s entry for thrombosis prevention.

The community is getting ready to start the 11th annual WikiCup competition, in which experienced editors are awarded points for producing high-quality content. For the 2016 event that just ended, Wiki Ed sponsored a side competition with prizes for the two users with the most Good Articles and Featured Articles on scientific topics. In first place was the overall winner of the competition, User:Casliber, who developed articles like the violet webcap mushroom and the Lynx constellation. In second place, despite sitting out the final round of the competition, was User:Cwmhiraeth, who improved some very big topics like millipede and habitat.

Program Support

Editing_Wikipedia_articles_on_Political_Science_(Wiki_Ed).pdf
Our newest subject-specific editing brochure will help students working on political science articles.

Communications

LiAnna Davis has been working with San Francisco-based media firm PR & Company to pitch stories to the national press about the impact Wiki Ed’s programs are having, especially the impact of the Year of Science.

We announced the newest in our series of subject-specific editing brochures in December: Editing Wikipedia articles on Political Science. Thanks to the Wikipedia editors and partners at the MPSA who provided review and/or feedback for this.

Blog posts:

External media:

Digital Infrastructure

Continuing with the main technical focuses from last month, Product Manager Sage Ross spent December focused on collaboration and mentorship, as well as working on bug fixes and feature development.

Sejal Khatri started her Outreachy internship this month to improve the Dashboard’s user profile pages. She’s already made considerable progress toward our plans for these profile pages, and she’s made several improvements and bug fixes for course pages as well. Check out Sejal’s latest post on her internship blog to see what she’s been up to. December was also a busy month for the high school students participating in Google Code-In. Sage has been mentoring them on Dashboard tasks, including some performance and accessibility improvements, documentation and testing, bug fixes, and new features to help Wiki Ed staff handle new courses more efficiently. This month saw contributions to Wiki Ed’s codebase from nine developers outside of our staff and contractors — a new record.

Sage developed the initial version of an Article Viewer tool that lets you see a full Wikipedia article — as it looks on Wikipedia — without leaving the Dashboard. The Article Viewer is currently available alongside the Diff Viewer when you zoom in on a particular edited article in a course’s Articles tab.

In anticipation of increased Dashboard usage in 2017, in late December — just before Christmas weekend — we migrated the software to a more powerful server. The ensuing 20 minutes were the only downtime for the Dashboard during the entire 2016 term (although a handful of network disruptions and problems with Wikimedia servers did affect Dashboard users at earlier points in the term).

Research and Academic Engagement

In December, Research Fellow Zach McDowell completed the focus group portion of the research program, comprising 13 sessions. A total of 475 minutes of focus group recordings were sent away for transcription, resulting in more than 250 pages of text for analysis.

Survey research participation continued to grow, with more than 1,200 responses in the pre-assessment as well as more than 850 responses for the post-assessment. Surveys close on January 17, 2017.

Zach spent the remainder of his time beginning a preliminary assessment of the data and a data cleanup plan. Additionally, he has been seeking out a graduate student to engage as a data science intern to expedite analysis of the data.

Finance & Administration / Fundraising

Wiki_Education_Foundation's_San_Francisco_team_holiday_party
Wiki Education Foundation’s San Francisco team holiday party.

Finance & Administration

To celebrate the holiday season, San Francisco-based staff gathered at LiAnna’s house for a holiday party. Executive Director Frank Schulenburg led the group in the creation and consumption of a Feuerzangenbowle, and we enjoyed dinner and games.

For the month of December, expenses were $157,772 versus the approved budget of $206,733. The majority of the $49k variance continues to be due to the timing of outside professional services ($22k), staffing vacancies ($13k), and printing expenses ($11k).

Wiki Ed Expenses 2016-12
Expenses December 2016 Actual vs. Plan

Our year-to-date expenses of $900,208 were also less than our budgeted expenditures of $1,196,085, by $296k. Like the monthly variance, the year-to-date variance was largely driven by staffing vacancies ($100k). In addition, the timing and deferral of professional services ($69k), marketing and cultivation ($18k), volunteer workshops ($13k), and printing ($18k), as well as savings in staffing-related expenses ($16k) and in travel ($61k), contributed to the variance.

Fundraising

  • Wiki Ed Expenses 2016-12 YTD
    Expenses Year to Date December 2016 Actual vs. Plan

    Wiki Ed conducted its first-ever individual donor acquisition mailing, which reached more than 11,000 individuals. Appeals were sent via U.S. Mail in late December.

  • Google renewed their support with a $20,000 gift to Wiki Ed.

Office of the ED

Current priorities:

  • Securing funding
  • Developing a plan for next fiscal year
  • Working with the board on additional funding options

In order to be able to share an early outline of our future programmatic work with the board, Frank traditionally starts brainstorming ideas with senior staff in December. That’s why this month we embarked on thinking about the general direction for the upcoming fiscal year 2016–17 and developed a vision for the time ahead to be shared with the board on January 28–29.

Frank also started conversations with existing and prospective funders on some initiatives that are in our project pipeline for 2017. These conversations – as well as our projections of the expected impact – will inform our roadmap for the upcoming year and beyond.

Also in December, Frank prepared a series of documents for an ad hoc board taskforce that will look into additional funding streams prior to the in-person board meeting at the end of January. The board taskforce meetings will start next month via video conference with the goal of coming up with a recommendation to the board as a whole.

 

Visitors and guests

  • Merrilee Proffitt, OCLC
  • Steve Kaplan, Message LA

by Ryan McGrady at January 20, 2017 11:56 PM

Content Translation Update

January 20 CX Update: More fixes for page loading and template editor

Hello, and welcome to another CX update post, in which I am happy to report about several significant bug fixes.

  • Pages that had full stops (⟨.⟩) in headings couldn’t be loaded after auto-saving and closing the browser tab. This is now fixed. It’s a follow-up to a similar bug, a fix for which was reported last week. If you still have issues with loading saved pages, please report them. (bug report)
  • Adapted infoboxes would often say “Main Page” at the top, no matter which page was being translated or into what language. This could also happen with other kinds of templates. It affected pages with templates that used the {{PAGENAME}} magic word. This is now fixed, and the auto-adapted template now shows the relevant page name. (bug report)
  • An unnecessary horizontal scrollbar was shown on some pages that had wide tables. It was removed. (code change)

 


by aharoni at January 20, 2017 07:50 PM

Wiki Education Foundation

Rediscovering the “higher” in higher education with a Wikipedia writing assignment

Dr. Joel Parker is Associate Professor of Biological Sciences at SUNY Plattsburgh, where he has incorporated Wikipedia into his Cell Biology courses. In November we featured some of the great work his students did in our Cell Service roundup. In this post, he explains how assigning students to contribute to Wikipedia brings them through the process of discovery.

Joel Parker
Joel Parker

Assigning students to write for Wikipedia achieves the highest outcome of higher education by teaching your students the full process of discovery. This lesson is especially important today as higher education is being debased with lower learning outcomes that overemphasize the practical training of our students for the workplace. What makes higher education “higher” is the opportunity for students to work with scholars to learn how to advance both our own knowledge and knowledge within our scholarly disciplines. Writing for Wikipedia can facilitate the transition from passive learner to active discoverer for your senior students. I make this happen for my senior level Cell Biology class with a writing for Wikipedia assignment that requires my students to go through all of the stages of the academic discovery process.

Discovery is the common objective that defines higher education. This discovery process happens at three levels at universities. The first level of discovery is students discovering for themselves previously learned knowledge about the world. This is mastering the material and background knowledge that one expects of a degree holder. The next level is students and academics working together to discover truths about how the universe works. It involves noticing a gap or flaw in our current knowledge, then imagining and proving a solution. Finally, and no less important, the third level is a personal version of the second: effectively contributing the answer to the discipline and the world by communicating the discovery. This personal discovery is the transformation required by our students to gain the confidence to transition from just being beneficiaries of knowledge to becoming propagators and contributors of new knowledge. This third level is especially important to higher education, as the teens and early twenties are perhaps the most formative years, when our adult personalities and sense of self take shape.

In my senior level cell biology class I have my students do each step of the discovery process in a Wikipedia writing assignment. The first step begins when I assign my students to search for and critique an existing cell biology Wikipedia article. This means finding mistakes, missing sections, and places where they can improve the article. The next step is actually doing the fixes and creating new content to fill the voids. This technical side includes writing in the encyclopedic style, communicating science at the correct level, and can even include graphic design when figures are called for. The final step is publicly publishing the article in the correct format and style, then dealing with the judgments, suggestions and edits from the rest of the community. I constantly remind my students throughout the assignment that the overriding goal and assessment criterion is their contribution through improving the articles. It is not sufficient to just write the minimum number of sentences and put in some number of new citations. The changes must improve the article, and the citations have to be ones that others will genuinely find helpful or else they do not count. All academics will instantly recognize this outlined process as exactly what we do in our own intellectual work: identifying a question, answering it, and publishing the solution with peer review. The goal, and the measurable outcome for the students who put in the effort, are articles significantly improved in some way.

With writing for Wikipedia, my students have the opportunity to experience the personal transition from being beneficiaries of their most used and appreciated reference source, to becoming contributors to that source. The objective is for them to become confident enough to see themselves as experts with the ability to contribute and improve the world with what they have learned from their university education. They experience this directly because their work is not just about their grade, but also clearly beneficial to future students like themselves who will be using the edited pages. Thus the assignment forces a maturing change in perspective. Even the most incremental of improvements means the world is different and better thanks to the application of their education to the world’s largest and most used encyclopedia.

Facilitating and advancing discovery is what defines higher education. Wikipedia writing assignments are one of the best ways to teach, and to remind ourselves of, that primary learning outcome.

If you’d like to learn more about how to incorporate Wikipedia into your course, visit teach.wikiedu.org or send us an email at contact@wikiedu.org.

Photo: Dr. Joel Parker.jpg, by Joel Parker, CC BY-SA 4.0, via Wikimedia Commons.

by Guest Contributor at January 20, 2017 06:02 PM

Wikimedia UK

Wikidata: the new hub for cultural heritage

This article is by: Dr Martin Poulter, Wikimedian In Residence at the University of Oxford – This post was originally published on the Oxford University Museums blog.

There is a site that lets users create customised and unusual lists of art works: works of art whose title is an alliteration, self-portraits by female artists, watercolour paintings wider than they are tall, and so on. These queries do not use any gallery or museum’s web site or search interface but draw from many collections around the world. The art works can be presented in various ways, perhaps on a map of locations they depict, or in a timeline of their creation, colour-coded by the collection where they are held. The data are incomplete, but these are the early days of an ongoing and ambitious project to share data about cultural heritage—all of it.

Judith_with_the_head_of_Holofernes
Judith with the head of Holofernes, Self Portrait (1610s) Fede Galizia, John and Mable Ringling Museum of Art

Wikimedia is a family of charitable projects that are together building an archive of human knowledge and culture, freely shareable and reusable by anyone for any purpose. Wikipedia, the free encyclopedia, is only the best-known part of this effort. Wikidata is a free knowledge base, with facts and figures about tens of millions of items. These data are offered as freely as possible, with no restriction at all on their copying and reuse.

Already, large amounts of data about artworks are being shared by formal partnerships. The University of Barcelona have worked with Wikimedians to share data about Art Nouveau works, recognising that it is far better to have all these data in one place than scattered across various online and offline sources. The National Library of Wales has employed a Wikidata Visiting Scholar to share data about its artworks, including the people and places they depict. The Finnish National Gallery, the Rijksmuseum in Amsterdam and the National Galleries of Scotland are among the institutions who have either formally uploaded catalogue data to Wikidata, or made data freely available for import. To see the sizes of these shared catalogues, one just has to ask Wikidata.

Wikidata logo – Image CC BY-SA 3.0

Wikidata queries can be built using SPARQL, a database query language not for the faint-of-geek. However, there is an open community of users sharing and improving queries. The visualisations they create can be shared online or embedded inside other sites or apps. Developers can build applications for the public; easy to use, but offering a distinctive view of Wikidata’s web of knowledge.
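As a taste of what such a query looks like, here is a small example that can be pasted into the Wikidata Query Service at query.wikidata.org. It answers the earlier question of catalogue sizes by counting paintings per collection; P31 (“instance of”), Q3305213 (“painting”) and P195 (“collection”) are Wikidata’s actual identifiers, and the Query Service’s standard prefixes are assumed:

```sparql
# Which collections hold the most paintings?
SELECT ?collection ?collectionLabel (COUNT(?work) AS ?works) WHERE {
  ?work wdt:P31 wd:Q3305213 ;    # instance of: painting
        wdt:P195 ?collection .   # collection: the holding institution
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
GROUP BY ?collection ?collectionLabel
ORDER BY DESC(?works)
LIMIT 10
```

The label service line is what turns opaque Q-identifiers into human-readable names, in whatever language the query asks for.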

One such application is Crotos, a family of tools generating image galleries and maps of art, filtered by format, artist, place depicted and other attributes. Crotos shows images of the art, so it only includes works with a digital image available in Wikimedia Commons. Wikidata itself has no such restriction: it describes art whether or not a freely-shareable scan is available.

So while the Wikidata site itself might not have mass appeal, the service it provides is gradually transforming the online world, providing a single source of data for some of the most popular web sites and apps. Those “infoboxes” summarising key facts and figures at the top of Wikipedia articles are increasingly being driven from Wikidata, so dates, locations and other facts can be entered in one place but appear on hundreds of sites.

The really exciting prospect is that of building visualisations and other interactive educational objects, integrating information from many collections and other data sources. Wikidata would be interesting enough as an art database, but it also shares bibliographic, genealogical, scientific, and other kinds of data, covering modern as well as historical topics. This allows combined queries, such as art by people born in a particular region and time period, or works depicting people described in a particular book.

Wikidata is massively multilingual, using language-independent identifiers and connecting these to names in hundreds of languages as well as to formal identifiers. In a way it is the ultimate authority file; a modern Rosetta Stone connecting identifiers from institutions’ authority files, scholarly databases and other catalogues (Hinojo (2015)).

There are thousands of properties that a Wikidata item can have. Just considering a small selection that are relevant to art and culture, it is clear that the number of possible queries is astronomical.

  • Many features of an art work can be described:
    • instance of: in other words, the type. Wikidata has many types to choose from, from oil sketch and drawing, via architectural sculpture and stained glass, to aquatint and linocut
    • collection
    • material used
    • height, width
    • genre, movement
    • co-ordinates of the point of view
  • People and places can be connected to an artwork: depicts, creator, attributed to, owned by, after a work by, commissioned by.
  • There are relations between people: parent, sibling, influenced by, school of, author and addressee of a letter.
  • People can also be connected to groups or organisations: member of, founder, employer, educated at.

With so many kinds of data, Wikidata draws in volunteer contributors with varying interests. Just as there are people who will sit down for an evening to improve a Wikipedia article or to categorise images on Wikimedia Commons, there are people fixing and improving Wikidata’s entries and queries. As with Wikipedia, Wikidata benefits from the intersection of different interests. Contributors speak different languages and have different background knowledge. Some are interested in a particular institution’s collection, while others are interested in a particular style of art, others in a given location or historic individual. Hence one entry can attract multiple contributors, each motivated by a different interest.

Over time, Wikidata’s role in Wikipedia will expand. Explore English Wikipedia and you find many list articles, such as List of works by Salvador Dalí or List of Hiberno-Saxon illuminated manuscripts. At the moment, these are all manually maintained, but a program—the ListeriaBot—has been created to turn Wikidata queries into lists suitable for Wikipedia: see for example this (draft) list of paintings of art galleries. Catalan Wikipedia, with a much smaller contributor base than the English language version, is already using the bot to write list articles such as Works of Jacob van Ruisdael, saving many hours of human effort. As automated creation of list articles becomes more widespread, cultural institutions that share catalogue data will help ensure the correctness and completeness of these articles.

Un paisatge del riu amb figures, by Jacob Van Ruysdael (1628/1629–1682), Museu de Belles Arts Puixkin

Like Wikipedia, Wikidata depends on Verifiability: any statement of fact is expected to cite or link a credible published source. Hence it has active links to catalogues and other formally vetted sites, which usually supply more scholarly detail and primary research than Wikidata itself. So Wikidata is not a replacement for cultural institutions’ catalogues. The hub metaphor is apt: it is a central point, linking together disparate resources and giving them a useful shape. Its credibility will always depend on the formally vetted sources that it cites, and there will always be users who want to check what they read by following up the citations. In practice, this means that sharing ten thousand records with Wikidata is a way to get ten thousand incoming links to the institution’s own catalogue. What’s more, the free reuse of Wikidata means that other sites will use those links.

Wikidata and its partners have a huge task ahead of them, but the potential reward is vast. We could have data on all artworks, browsable in endless and genuinely new ways, with connections to their official catalogues, their physical locations, and scholarly literature. The sooner the cultural sector as a whole gets involved, the sooner we can bring this about.

References

Note

I am grateful to Wikidata users Jane Darnell (User:Jane023), Magnus Manske (User:Magnus Manske – creator of User:ListeriaBot) and Andy Mabbett (User:Pigsonthewing) for many of the useful links in this article.

 

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International Licence.

by Martin Poulter at January 20, 2017 12:49 PM

User:Geni

The Canon EF 11-24mm f/4 for Wikipedians

It’s a £2,700 lens. At that price I suspect anyone buying it can come to their own conclusions. Still, on a full-frame camera it is an extremely useful lens. The width makes it great for urban architecture, larger items in museums, and interiors in general. The short minimum focus distance makes it great for objects in cases, and the lens’s sharpness makes it viable to crop the resulting images.

Obviously if you want to shoot longer than 24mm then you need another lens but for wide angle work the lens is excellent.

Downsides: it’s a £2,700 lens. You could buy quite a lot of other gear for that. The Sigma 12-24mm f/4 is about £1,000 cheaper and nearly as sharp at the wide end. If you are shooting on a crop sensor, the 10-18mm is under £300, so unless you really, really need the sharpness for some reason I wouldn’t go near this lens for a crop system. On top of that, it’s big and heavy. Not something I have an issue with, but for anyone more weight-conscious (but then why shoot full frame?) it may present a problem. The f/4 speed may be less than ideal for indoor work, but that’s becoming less and less of a problem as camera low-light abilities improve.

Overall a very useful bit of kit but also really rather on the expensive side.


by geniice at January 20, 2017 11:48 AM

Shyamal

The many shades of citizen science

Everyone is a citizen, but not all have the same grounding in the methods of science. Someone with training in science should find it especially easy to separate pomp from substance. The phrase "citizen science" is fairly recent and has been pompously marketed without enough clarity.

In India, the label of "scientist" is a status symbol; indeed, many proceed on career paths just to earn that status. In many of the key professions (for example, medicine and law), authority is gained mainly through guarded membership, initiation rituals, symbolism and hierarchies. At its roots, science differs in being egalitarian, but the profession is at odds with this ideal, and its institutions are replete with tribal ritual and power hierarchies.

Long before the creation of the profession of science, "Victorian scientists" (who of course never called themselves that) pursued the quest for knowledge (i.e. science) and were for the most part quite good as citizens. In the field of taxonomy, specimens came to be the reliable carriers of information, and they became a key aspect of most of zoology and botany. After all, what could you write or talk about if you did not have a name for the subject under study? Specimens became currency. Victorian scientists collaborated in various ways that involved sharing information, sharing or exchanging specimens, debating ideas, and tapping a network of friends and relatives for gathering more "facts". Learned societies and their journals helped the participants meet and share knowledge across time and geographic boundaries. Specimens, the key carriers of unquestionable information, were acquired for a price, and a niche economy was created with wealthy collectors, not-so-wealthy field collectors and various agencies bridging them. That economy also included the publishers of monographs, field guides and catalogues, who grew in power along with organizations such as museums and, later, universities. Along with political changes, there was also a move of power from private wealthy citizens to state-supported organizations. Power brings disparity, and the Victorian brand of science had its share of issues, but has there been progress in the way of doing science?

Looking at the natural world can be completely absorbing. The sights, sounds, textures, smells and maybe tastes can keep one completely occupied. The need to communicate our observations and reactions almost immediately makes one look for existing structure and framework, and that is where organized knowledge, a.k.a. science, comes in. While the pursuit of science might be seen by individuals as value-neutral and objective, the settings of organized and professional science are decidedly not. There are political and social aspects to science, and at least in India the tendency is to view them as undesirable and not to be talked about, so as to appear "professional".

Being silent so as to appear diplomatic probably adds to the problem. Not engaging in conversation or debate with "outsiders" (a.k.a. mere citizens) probably fuels the growing label of "arrogance" applied to scientists. Once the egalitarian ideal of science is tossed out of the window, you can be sure that "citizen science" moves from useful and harmless territory to a region of conflict and potential danger. Many years ago I saw a bit of this tone in a publication boasting the virtues of Cornell's ebird and commented on it. Ebird was not particularly novel to me (it was not the first, either in idea or implementation; many of us tinkered with such ideas, including me, with BirdSpot, which aimed to be federated and peer-to-peer, ideally something like torrent), but Cornell obviously is well-funded. I commented in 2007 that the wording used sounded like "scientists using citizens rather than looking upon citizens as scientists", the latter being in my view the nobler aim to achieve. Over time ebird has gained global coverage, but it has remained "closed", not opening its code or its discussions on software construction, and not engaging with its stakeholders. It has on the other hand upheld traditional political hierarchies and processes that ensure low quality in parts of the world where political and cultural systems are particularly based on hierarchies of users. As someone who has watched and appreciated the growth of systems like Wikipedia, it is hard not to see the philosophical differences, almost as stark as right-wing versus left-wing politics.

Do projects like ebird see the politics in "citizen-science"? Arnstein's ladder is a nice guide to judge the philosophy behind a project.
I write this while noting that criticisms of ebird as it currently works are slowly beginning to come out (despite glowing accounts in the past). There are comments on how it is reviewed by self-appointed police (the problem is not just in the appointment: why couldn't the software designers have allowed anyone to question any record, with methods to suggest alternative identifications and to gather measures of confidence based on community queries and opinions?). There is supposedly a class of user who manages something called "filters" (the problem here is not just the idea of creating user classes but also the idea of manually-defined "filters"; to an outsider like me with some insight into software engineering, poor software construction is symptomatic of poor vision, a weak guiding philosophy, and probably issues in project governance). There are issues with taxonomic changes (I heard someone complain about a user being asked to verify an identification because of a taxonomic split, and a split that allows one to unambiguously relabel older records based on geography; these could have been resolved automatically, but developers tend to avoid fixing problems and prefer to get users to manage them by changing how they use the system; trust me, I have seen how professional software development works). And there are now dangers to the birds themselves. There are also issues and conflicts associated with licensing, intellectual property and so on. Now it is easy to fix all these problems piecemeal, but that does not make the system better; fixing the underlying processes and philosophies is the big thing to aim for. So how do you go from a system designed for gathering data to one where you want the stakeholders to be enlightened? Well, a start could be made by discussing in the open.

I guess many of us who have seen and discussed ebird privately could just say "I told you so", but it is not just a few of us, nor is it new. Many of the problems were and are easily foreseeable. One merely needs to read the history of ornithology to see how conflicts worked out between the center and the periphery (conflicts between museum workers and collectors); the troubles of peer review and openness; the conflicts between the rich and the poor (not just measured by wealth); or perhaps the haves and the have-nots. And then of course there are scientific issues: the conflicts between species concepts, not to mention conservation issues and local versus global thinking. Conflicting aims may not be entirely resolved, but you cannot have an isolated software development team, a bunch of "scientists", and citizens at large expected merely to key in data and be gone. There is perhaps a lot to learn from other open-source projects, and I think the lessons in the culture and politics of Wikipedia are especially interesting for citizen science projects like ebird. I am yet to hear of an organization where the head is forced to resign by the long tail that has traditionally been powerless in decision making; allowing for that is where a brighter future lies. Even better would be where the head and tail cannot be told apart.

Postscript: 

There is an interesting study of field guides and their users in Nature, which essentially shows that everyone is quite equal in making misidentifications: just another reason why ebird developers ought to remove this whole system of creating an uber class involved in rating observations and observers.

23 December 2016 - For a refreshingly honest and deep reflection on analyzing a citizen science project see -  Caroline Gottschalk Druschke & Carrie E. Seltzer (2012) Failures of Engagement: Lessons Learned from a Citizen Science Pilot Study, Applied Environmental Education & Communication, 11:178-188.
20 January 2017 - An excellent and very balanced review (unlike my opinions) can be found here -  Kimura, Aya H.; Abby Kinchy (2016) Citizen Science: Probing the Virtues and Contexts of Participatory Research Engaging Science, Technology, and Society 2:331-361.

by Shyamal L. (noreply@blogger.com) at January 20, 2017 04:49 AM

Wikimedia Foundation

Community digest: Wiki Loves Women, bridging two gaps at a time; news in brief

Photo by Teemages, CC BY-SA 4.0.

Malouma is a Mauritanian singer and songwriter who was forced to put her career on hold after being forced into marriage. Many of her songs advocate for women’s rights, so much so that she was censored for part of the 1990s in Mauritania.

Hannah Kudjoe was a Ghanaian dressmaker who later became a political activist. She became one of the major figures calling for the independence of her country in the 1940s.

Qut el Kouloub was a writer from Egypt who contributed generously to French literature in the first half of the twentieth century. Many critics were unsure whether her works were fiction or nonfictional historical biographies, and like Malouma, Kouloub used her novels to advocate for women’s rights in Egypt.

The works of Malouma, Kudjoe, and Kouloub all deserve a place in history, but women of their background and experience are not well represented on the internet. Two of the three women had no article on the English Wikipedia until Wiki Loves Women participants created them; they also helped develop the third one. In addition, over 1,300 other pages have been created or developed as part of the project.

Wiki Loves Women is a project that addresses two content gaps on Wikipedia at the same time. Its aim is to encourage both gender and geographical diversity on Wikipedia by adding content about African women. The project is now active in Côte d’Ivoire, Cameroon, Nigeria and Ghana.

“I … realised recently that many articles on Wikipedia are not being read [often],” says Olaniyan Olushola, the project manager of Wiki Loves Women in Nigeria. Olushola used the Wikimedia user group Nigeria Facebook page to promote the content created as part of the Wiki Loves Women events that he leads.

Olushola is trying to “find a way to honor Nigerian women by bridging gender inequalities and reducing systemic bias on Wikipedia.” He was introduced to Wikipedia and mentored by a woman: Isla Haddow-Flood, a co-founder of Wiki Loves Women.

Together with Florence Devouard, Haddow-Flood worked on developing the idea of a project that could help increase the presence of African women on Wikipedia. “After working together on Kumusha Takes Wiki and Wiki Loves Africa, it was apparent that the content gap relating to women was a real issue,” Devouard and Haddow-Flood wrote in an email to us. They continued:

With less than 20% of (all) Wikipedia contributors being female, the global community has long acknowledged the gender gap as a problem. But in sub-Saharan Africa, when combined with the contributor gap—only 25% of edits to subjects about the Sub-Saharan region come from within the region—the lack of information about women forms an abyss.

 
Wiki Loves Women kicked off in January 2016 with a writing contest that was held as part of Wikipedia’s fifteenth anniversary. Several partners, including the German cultural association Goethe-Institut, and four teams in different African countries joined the initiative. So far, participants of the project have uploaded over 1,000 photos to Wikimedia Commons, the free media repository, in addition to editing and creating a similar number of articles on Wikipedia.

On International Women’s Day in March, Wiki Loves Women will hold a translate-a-thon, an editing event to translate Wikipedia articles about women into different languages. The organizers emphasize that everyone is welcome to join.

“It is time for the people of Africa to tell their own stories, change their narrative, shake up the global stereotypes, and share information about what they value and find interesting and important in the world,” say Haddow-Flood and Devouard.

In brief

Wikimania updates: Scholarship applications for Wikimania 2017, which is being held in Montréal, Canada on 11–13 August, are now being accepted. The deadline is 20 February 2017 at 23:59 UTC. More information is available on the scholarships page and on the FAQ page of the event. Moreover, the steering committee of Wikimania has decided to explore Cape Town in South Africa as a host for Wikimania 2018. A final decision will be made by spring 2017.

Three billion edits: This week, the total edit count on all Wikimedia projects reached 3,000,000,000. Around the same time, the Wikispecies community celebrated creating its 500,000th page. The entry is about Pseudocalotes drogon and was created by Wikimedian Burmeister.

Wikimedia developer summit 2017: Last week, many Wikimedia technical contributors, third-party developers, and users of MediaWiki and Wikimedia APIs gathered at the Golden Gate Club in San Francisco for the Wikimedia developer summit 2017. The event lasted two days, during which attendees discussed a list of main topics selected by the community.

Donating data to Wikidata: Wikimedia Germany has published a tutorial video about Wikidata, the collaboratively edited knowledge base. The short video explores Wikidata and how contributing to the website works.

2016 on the Arabic Wikipedia: Mohamed Badaren, an editor and administrator on the Arabic Wikipedia, has created a video with a summary of the major events in 2016 and their impact on Wikipedia. The video is an adaptation of earlier English-language versions, Edit 2014 and Edit 2015.

New Signpost published: A new edition of the English Wikipedia’s community-written news journal was published this week. Stories included a “surge” in new administrator promotions on the English Wikipedia; an introspective piece looking at the future of the Signpost; coverage of recent research suggesting that women are not more likely to edit about women; an interview with an active Wikipedian who has been blind since birth; and more.

Kurier: New pieces in the Kurier, the German Wikipedia’s “not necessarily neutral [and] non-encyclopedic” news page, include a three-part look back at the year 2016 and an invitation to a Wiki Loves Music event in Hamburg.

Wiki Project Med Foundation is open for members: Wiki Project Med Foundation is a user group that promotes better coverage of medical content on Wikimedia projects. The group is now open for membership applications.

Samir Elsharbaty, Digital Content Intern
Wikimedia Foundation

by Samir Elsharbaty at January 20, 2017 01:29 AM

January 19, 2017

Wikimedia Foundation

Introducing the Wikimedia Resource Center: A hub that helps volunteers find the resources they need

Photo via the Library of the London School of Economics and Political Science, Flickr Commons.

Wikimedia volunteers embrace a wide spectrum of work when it comes to contributing to Wikimedia projects: from reporting a bug, to developing a tool, to requesting a grant to start a new Wikimedia program, and more. As the movement expands to include more affiliates and more programmatic activities every year, newer Wikimedians often lack experience with the movement and its various channels for requesting support.

Questions about where to turn will lead our new Wikimedian to different pages, from Outreach Wikimedia, to Meta Wikimedia, and MediaWiki.org, as well as connect them to experienced Wikimedians who may be able to help. In recent user experience research, we learned that the majority of program leaders rely heavily on their personal networks and contacts to find the information they need.

In order to expand Wikimedia communities’ efforts, however, we need to guarantee open access to resources that support this very important work. The Wikimedia Resource Center is a hub designed in response to this issue: it is a single point of entry for Wikimedians all over the world to access the resources and staff support they need to develop new initiatives, and also expand existing ones.


Demo of the new Wikimedia Resource Center.

 

How does it work?

In the Wikimedia Resource Center you will find resources grouped in nine different tabs, according to the goal the resources serve. Let’s imagine you wanted to start a new Wikimedia program. Under the Skills Development tab, you will find evaluation tools, program reports and toolkits, and learning patterns, among other resources. Each tab has an introduction page that describes the area, what each resource means, and who can give you direct support on any given topic. Skills Development, together with Grants Support, Programs Support, Product Development, Global Reach, Legal, and Communications, all follow the same logic.

Contact and Questions, and Consultations Calendar, are slightly different. Under Contact and Questions, you will find frequently asked questions that are searchable by topic. This tab also has a new feature: Ask a question. Wikimedians can use this feature to ask Wikimedia Foundation staff about any topic that is not covered in the FAQ, and they can do so publicly through the Wikimedia Resource Center or privately via email. Under Contact and Questions, Wikimedians will also find information about the emergency response system and, in future developments, a network of Wikimedians.

Consultations Calendar is a public schedule of upcoming collaborations between the Wikimedia Foundation and communities. In this tab, you will also find Wikimedia Community News, which transcludes the calendar content from the Meta Wikimedia main page.

If you get lost, you can always find help in the top right corner of every page.

Help us test!

This release constitutes the alpha version of the Wikimedia Resource Center, and at this stage, user feedback is key to improving its functionality. We want to hear from you! If you have comments about the Wikimedia Resource Center, you can submit feedback publicly on the talk page, or privately via a survey hosted by a third party that shouldn’t take more than 4 minutes to complete.

We started small, including only resources developed by the Wikimedia Foundation, in order to launch an initial version of the hub. In this way, we can learn what works and what needs to be developed further to include features that better connect Wikimedians. Check the project’s progress on Meta.

We hope that this hub will better support Wikimedians’ efforts all over the world, and improve findability of the resources that empower them to do their best work.

María Cruz, Communications and Outreach Project Manager, Community Engagement
Wikimedia Foundation

by María Cruz at January 19, 2017 07:26 PM

Weekly OSM

weeklyOSM 339

01/10/2017-01/16/2017

Map with new roads

Data collected by Red Cross volunteers 1 | © OpenStreetMap contributors CC-BY-SA 2.0

Mapping

  • Kartitotp shows in her blog post that the community, together with the Mapbox team in Ayacucho, took a great step forward to make the 150,000-inhabitant city in the Peruvian Andes the best-mapped city in Latin America. 20 bus routes from 22 public transport companies are now available in OSM.
  • Martin Koppenhöfer raises once again the question of why monuments are still not clearly distinguished by the two proposed subkeys.

Community

Events

  • Fatouma Harber and Aboul Hassane Cisse hosted a CartoCamp (mapping party) in Tombouctou from 7 to 9 January 2017, in collaboration with the OSM community of Mali.
  • Ulf Treger, in his lecture on maps at the 33C3, takes a look back at the historical development of maps and map projections and their geopolitical background.
  • Selene Yang published a diary of photos from SotM Latam 2016, which took place in São Paulo, Brazil.
  • The State of the Map Africa Working Group has started a logo contest.

Humanitarian OSM

  • In an OSM diary entry, “everyone_sinks_starco” complained about a HOT mapathon in Indonesia. It turned out to be a very effective rant, because various members of HOT Indonesia posted comments to explain what had happened (if you don’t understand Bahasa Indonesia you’ll need to copy and paste the second half of the comments into an online translation tool). User Iyan, the project manager of the Humanitarian OpenStreetMap Team Indonesia, also clarified and explained the project.
  • That mapathons can be done well, too, is shown by the Red Cross: after training local mappers, those new volunteers mapped 7,000 villages in Liberia, Guinea and Sierra Leone and collected GPS traces of 70,000 km of roads and paths.
  • The blog globalinnovationexchange.org has published a very upbeat post on the topic: Fighting Ebola with Information.

Maps

  • J. Budissin is seeking a volunteer to set up and operate a Sorbian map, as the former admin is no longer willing to do so. Volunteers from Lusatia and the surrounding area are preferred.

switch2OSM

  • The Chilean tax administration uses OSM maps. (Via osmCL)

Open Data

  • Martin Isenburg reports on rapidlasso.com that there is now open and free LiDAR data in Germany. First North Rhine-Westphalia and then Thuringia opened their geoportals for free download of geospatial data at the beginning of 2017. We are full of hope, Martin says, that other federal states will follow their lead. It would simply not make any sense to try to sell this kind of data, as was shown in England recently.

Software

  • Robot8A tries to convince the Telegram developers to use OSM instead of Google Maps. An interesting discussion follows.

Programming

  • Adrien Pavie shows his JS library Pic4Carto, which allows embedding geolocated pictures into a website. Right now it supports Flickr, Mapillary and Wikimedia Commons.
  • Karlos shows the newest changes of OSM go, e.g. built-in 3D models of benches or wind turbines and first impressions from the London tube.

Releases

Software Version Release date Comment
Locus Map Free * 3.21.1 2017-01-10 Bugfix release.
Mapbox GL JS v0.31.0 2017-01-10 One new feature and two bugfixes.
Mapillary iOS * 4.5.12 2017-01-10 Minor fixes.
OSRM 5.5.3 2017-01-11 Two enhancements and three bugfixes.
Naviki iOS;* 3.53 2017-01-12 Supporting Apple Watch.
OSM Contributor 3.0.1 2017-01-12 Bugfix release.
QGIS 2.18.3 2017-01-13 No info.
libosmium 2.11.0 2017-01-14 Many changes, please read release info.
Traccar Client Android 4.0 2017-01-14 No info.
pyosmium 2.11 2017-01-15 Use current libosmium.

Provided by the OSM Software Watchlist.

(*) unfree software. See: freesoftware.

Did you know …

  • … Franz-Benjamin Mocnik’s visualizations on OpenStreetMap changeset and wiki tags?
  • TorFlow? It shows the traffic between the individual nodes of Tor in real time.

OSM in the media

  • The MVV, the local traffic company of Munich, will soon launch (automatic translation) a new service based on OpenStreetMap to show arrivals and delays of local trains. The MVV notes that OpenStreetMap data is not only free but also more current than data from HERE.
  • The Federal Agency for Civic Education published an article on how OpenStreetMap could be used for educational purposes in a public school. (automatic translation)
  • The Herald in Zimbabwe writes about the importance of collaborative mapping initiatives, such as Missing Maps, to help build resilience and better humanitarian response.

Other “geo” things

  • Examples of using OpenStreetMap data and Mapzen tools in news companies.
  • The QuickMapServices (we reported earlier) now contains more than 555 services.
  • Mashable presents jeans from Spinali Design that help you navigate. We hope for the sake of your safety that only OpenStreetMap data is being used.

Upcoming Events

Where What When Country
Tokyo 東京!街歩き!マッピングパーティ:第4回 根津神社 01/21/2017 japan
Manila 【MapAm❤re】OSM Workshop Series 8/8, San Juan 01/23/2017 philippines
Bremen Bremer Mappertreffen 01/23/2017 germany
Graz Stammtisch Graz 01/23/2017 austria
Nottingham Nottingham Pub Meetup 01/24/2017 uk
Dresden Stammtisch 02/02/2017 germany
Lyon Stand OSM Salon Primevère 02/03/2017-02/05/2017 france
Brussels FOSDEM 2017 02/04/2017-02/05/2017 belgium
Genoa OSMit2017 02/08/2017-02/11/2017 italy
Cardiff OpenDataCamp UK 02/25/2017-02/26/2017 wales
Passau FOSSGIS 2017 03/22/2017-03/25/2017 germany
Avignon State of the Map France 2017 06/02/2017-06/04/2017 france
Aizu-wakamatsu Shi State of the Map 2017 08/18/2017-08/20/2017 japan
Buenos Aires FOSS4G+SOTM Argentina 2017 10/23/2017-10/28/2017 argentina

Note: If you would like to see your event here, please put it into the calendar. Only events that are in the calendar will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Peda, Polyglot, Rogehm, SeleneYang, SomeoneElse, TheFive, YoViajo, derFred, jinalfoflia, keithonearth, vsandre.

by weeklyteam at January 19, 2017 03:44 PM

January 18, 2017

Wikimedia Foundation

Why I wrote 100 articles in 100 days about inspiring Jewish women

Ester Rada, an Israeli musician who now has an article on the Spanish Wikipedia. Photo by Oren Rozen, CC BY-SA 3.0.

Seven months ago, I was looking for a new job.

With little else to do after applying for a new one, I browsed Facebook, where I saw Wikimedian friends posting with #100wikidays.

I quickly discovered that the hashtag referred to a challenge undertaken by Wikipedians to write one new article each day for a hundred days. It was the brainchild of my Bulgarian colleague Spiritia, who only a month earlier had been named runner-up Wikipedian of the Year for coming up with it.

To release the stress of the job hunt, I decided to do it—but my way, by writing articles about Jewish women on the Spanish, Portuguese, English, and Ladino Wikipedias.

I started with a woman from Venezuela, the country I was born in: Margot Benacerraf, a movie director of Moroccan-Jewish origin who received the Cannes Prize in 1959. Who could imagine that in the late 50s, a young woman from a country little-known to many would capture the attention of critics at the Cannes Festival? Benacerraf is now considered the mother of the Venezuelan cinema, founder of the National Cinematheque.

Another woman worthy of mentioning is Houda Nonoo, who served as ambassador of Bahrain to the US from 2008 to 2013. She is the third woman to be an ambassador of Bahrain and the first Jew named as an ambassador from any country in the Arab World.

During these 100 days, I spoke much about these women, telling their stories to everyone who asked about them. One day, I came across an article about a semi-legendary queen in Ethiopia who ended the Axum dynasty, crowned herself, and set the churches of Abyssinia on fire. I asked an Ethiopian friend about her, who immediately replied “Esato? She burned Ethiopia, killed the princes, and took all their gold!”

I don’t know if it’s a legend or not, and neither do historians, but my friend sounded very excited to tell me about her. And now we have an article about Gudit!

On every one of my hundred days, I spent time on the internet looking for another notable Jewish woman whose life would catch my attention. Some were so impressive that I needed to create them on the spot.

One was Caterina Tarongí, who was burned alive by the Spanish Inquisition. Her words to her brother on the way to the auto-da-fé have survived through folk songs and expressions. Another was Raquel Líberman, born in Poland, who publicly denounced and broke an Argentine human trafficking network that specialized in Jewish women, declaring that “I can only die once, I won’t withdraw the complaint.” Over a seventy-year period, the organized network exploited more than 30,000 women.

At some point, I started searching for interesting Jewish women in other language Wikipedias, looking to spread awareness of these people across national and linguistic borders. The one that interested me most was Violeta Yakova, a Bulgarian resistance fighter during the Second World War. Along with two fellow Jews, Yakova killed well-known anti-semites and Nazi informers. The only article about her was in Bulgarian, but after I translated it into English, other participants in the challenge translated it into seven more languages! When things like this happen, you get a feeling of accomplishment, of not just contributing to the expansion of free knowledge, but also of engaging other people to do it with you. It’s a win-win situation.

#100wikidays also gave me the opportunity to interact with other Wikipedians, many of whom I had never met before. One of the most remarkable colleagues I befriended in this experience has been Mervat Salman. Mervat lives in Amman, Jordan; I, a religious Jew who became an Israeli citizen at Ben-Gurion Airport, live in Jerusalem.

At first sight, one would only focus on our differences. But there’s more: we both work in the IT industry, we both like Middle Eastern food and music and—the most important thing—we both believe in freedom of knowledge and the need to make it accessible for everyone.

After I finished the challenge, I was exhausted. #100wikidays took up a good deal of my time over those one hundred days, but it was satisfying and completely worth the effort.

But I couldn’t rest for long. Only days later, Mervat started asking me if I wanted to take on the #100WikiCommonsDays challenge—like #100wikidays but with pictures. Since I didn’t start immediately, she asked again, and again … until I started uploading photos to Commons. And here I am, halfway through it!

Inspiration is essential in life. I was inspired by all these 100 women, and I hope others will be too.

Maor Malul, Wikipedian

by Maor Malul at January 18, 2017 11:18 PM

Greg Sabino Mullane

MediaWiki extension.json change in 1.25

I recently released a new version of the MediaWiki "Request Tracker" extension, which provides a nice interface to your RequestTracker instance, allowing you to view the tickets right inside of your wiki. There are two major changes I want to point out. First, the name has changed from "RT" to "RequestTracker". Second, it is using the brand-new way of writing MediaWiki extensions, featuring the extension.json file.

The name change rationale is easy to understand: I wanted it to be more intuitive and easier to find. A search for "RT" on mediawiki.org ends up finding references to the Wikimedia RequestTracker system, while a search for "RequestTracker" finds the new extension right away. Also, the name was too short and failed to indicate to people what it was. The "rt" tag used by the extension stays the same; to produce a table showing all open tickets for user 'alois', you still write:

<rt u='alois'></rt>

The other major change was to modernize it. As of version 1.25 of MediaWiki, extensions are encouraged to use a new system to register themselves with MediaWiki. Previously, an extension would have a PHP file named after the extension that was responsible for doing the registration and setup—usually by mucking with global variables! There was no way for MediaWiki to figure out what the extension was going to do without parsing the entire file, and thereby activating the extension. The new method relies on a standard JSON file called extension.json. Thus, in the RequestTracker extension, the file RequestTracker.php has been replaced with the much smaller and simpler extension.json file.

Before going further, it should be pointed out that this is a big change for extensions, and was not without controversy. However, as of MediaWiki 1.25 it is the new standard for extensions, and I think the project is better for it. The old way will continue to be supported, but extension authors should be using extension.json for new extensions, and converting existing ones over. As an aside, this is another indication that JSON has won the data format war. Sorry, XML, you were too big and bloated. Nice try YAML, but you were a little *too* free-form. JSON isn't perfect, but it is the best solution of its kind. For further evidence, see Postgres, which now has outstanding support for JSON and JSONB. I added support for YAML output to EXPLAIN in Postgres some years back, but nobody (including me!) was excited enough about YAML to do more than that with it. :)

The extension.json file asks you to fill in some standard metadata fields about the extension, which are then used by MediaWiki to register and set up the extension. Another advantage of doing it this way is that you no longer need to add a bunch of ugly require_once() calls to your LocalSettings.php file. Now, you simply pass the name of the extension as an argument to the wfLoadExtension() function. You can even load multiple extensions at once with wfLoadExtensions():

## Old way:
require_once("$IP/extensions/RequestTracker/RequestTracker.php");
$wgRequestTrackerURL = 'https://rt.endpoint.com/Ticket/Display.html?id';

## New way:
wfLoadExtension( 'RequestTracker' );
$wgRequestTrackerURL = 'https://rt.endpoint.com/Ticket/Display.html?id';

## Or even load three extensions at once:
wfLoadExtensions( array( 'RequestTracker', 'Balloons', 'WikiEditor' ) );
$wgRequestTrackerURL = 'https://rt.endpoint.com/Ticket/Display.html?id';

Note that configuration changes specific to the extension still must be defined in the LocalSettings.php file.

So what should go into the extension.json file? The extension development documentation has some suggested fields, and you can also view the canonical extension.json schema. Let's take a quick look at the RequestTracker/extension.json file. Don't worry, it's not too long.

{
    "manifest_version": 1,
    "name": "RequestTracker",
    "type": "parserhook",
    "author": [
        "Greg Sabino Mullane"
    ],
    "version": "2.0",
    "url": "https://www.mediawiki.org/wiki/Extension:RequestTracker",
    "descriptionmsg": "rt-desc",
    "license-name": "PostgreSQL",
    "requires" : {
        "MediaWiki": ">= 1.25.0"
    },
    "AutoloadClasses": {
        "RequestTracker": "RequestTracker_body.php"
    },
    "Hooks": {
        "ParserFirstCallInit" : [
            "RequestTracker::wfRequestTrackerParserInit"
        ]
    },
    "MessagesDirs": {
        "RequestTracker": [
            "i18n"
            ]
    },
    "config": {
        "RequestTracker_URL": "http://rt.example.com/Ticket/Display.html?id",
        "RequestTracker_DBconn": "user=rt dbname=rt",
        "RequestTracker_Formats": [],
        "RequestTracker_Cachepage": 0,
        "RequestTracker_Useballoons": 1,
        "RequestTracker_Active": 1,
        "RequestTracker_Sortable": 1,
        "RequestTracker_TIMEFORMAT_LASTUPDATED": "FMHH:MI AM FMMonth DD, YYYY",
        "RequestTracker_TIMEFORMAT_LASTUPDATED2": "FMMonth DD, YYYY",
        "RequestTracker_TIMEFORMAT_CREATED": "FMHH:MI AM FMMonth DD, YYYY",
        "RequestTracker_TIMEFORMAT_CREATED2": "FMMonth DD, YYYY",
        "RequestTracker_TIMEFORMAT_RESOLVED": "FMHH:MI AM FMMonth DD, YYYY",
        "RequestTracker_TIMEFORMAT_RESOLVED2": "FMMonth DD, YYYY",
        "RequestTracker_TIMEFORMAT_NOW": "FMHH:MI AM FMMonth DD, YYYY"
    }
}

The first field in the file is manifest_version, and simply indicates the extension.json schema version. Right now it is marked as required, and I figure it does no harm to throw it in there. The name field should be self-explanatory, and should match your CamelCase extension name, which will also be the subdirectory where your extension will live under the extensions/ directory. The type field simply tells what kind of extension this is, and is mostly used to determine which section of the Special:Version page an extension will appear under. The author is also self-explanatory, but note that this is a JSON array, allowing for multiple items if needed. The version and url are highly recommended. For the license, I chose the dirt-simple PostgreSQL license, whose only fault is its name. The descriptionmsg is what will appear as the description of the extension on the Special:Version page. As it is a user-facing text, it is subject to internationalization, and thus rt-desc is converted to your current language by looking up the language file inside of the extension's i18n directory.

The requires field only supports a "MediaWiki" subkey at the moment. In this case, I have it set to require at least version 1.25 of MediaWiki - as anything lower will not even be able to read this file! The AutoloadClasses key is the new way of loading code needed by the extension. As before, this should be stored in a php file with the name of the extension, an underscore, and the word "body" (e.g. RequestTracker_body.php). This file contains all of the functions that perform the actual work of the extension.

The Hooks field is one of the big advantages of the new extension.json format. Rather than worrying about modifying global variables, you can simply let MediaWiki know what functions are associated with which hooks. In the case of RequestTracker, we need to do some magic whenever a <rt> tag is encountered. To that end, we need to instruct the parser that we will be handling any <rt> tags it encounters, and also tell it what to do when it finds them. Those details are inside the wfRequestTrackerParserInit function:

function wfRequestTrackerParserInit( Parser $parser ) {
    $parser->setHook( 'rt', 'RequestTracker::wfRequestTrackerRender' );
    return true;
}

The config field provides a list of all user-configurable variables used by the extension, along with their default values.

The MessagesDirs field tells MediaWiki where to find your localization files. This should always be in the standard place, the i18n directory. Inside that directory are localization files, one for each language, as well as a special file named qqq.json, which gives information about each message string as a guide to translators. The language files are of the format "xxx.json", where "xxx" is the language code. For example, RequestTracker/i18n/en.json contains English versions of all the messages used by the extension. The i18n files look like this:

$ cat en.json
{
  "rt-desc"       : "Fancy interface to RequestTracker using <code>&lt;rt&gt;</code> tag",
  "rt-inactive"   : "The RequestTracker extension is not active",
  "rt-badcontent" : "Invalid content args: must be a simple word. You tried: <b>$1</b>",
  "rt-badquery"   : "The RequestTracker extension encountered an error when talking to the RequestTracker database",
  "rt-badlimit"   : "Invalid LIMIT (l) arg: must be a number. You tried: <b>$1</b>",
  "rt-badorderby" : "Invalid ORDER BY (ob) arg: must be a standard field (see documentation). You tried: <b>$1</b>",
  "rt-badstatus"  : "Invalid status (s) arg: must be a standard field (see documentation). You tried: <b>$1</b>",
  "rt-badcfield"  : "Invalid custom field arg: must be a simple word. You tried: <b>$1</b>",
  "rt-badqueue"   : "Invalid queue (q) arg: must be a simple word. You tried: <b>$1</b>",
  "rt-badowner"   : "Invalid owner (o) arg: must be a valid username. You tried: <b>$1</b>",
  "rt-nomatches"  : "No matching RequestTracker tickets were found"
}

$ cat fr.json
{
  "@metadata": {
     "authors": [
         "Josh Tolley"
      ]
  },
  "rt-desc"       : "Interface sophistiquée de RequestTracker avec l'élement <code>&lt;rt&gt;</code>.",
  "rt-inactive"   : "Le module RequestTracker n'est pas actif.",
  "rt-badcontent" : "Paramètre de contenu « $1 » est invalide: cela doit être un mot simple.",
  "rt-badquery"   : "Le module RequestTracker ne peut pas contacter sa base de données.",
  "rt-badlimit"   : "Paramètre à LIMIT (l) « $1 » est invalide: cela doit être un nombre entier.",
  "rt-badorderby" : "Paramètre à ORDER BY (ob) « $1 » est invalide: cela doit être un champs standard. Voir le manuel utilisateur.",
  "rt-badstatus"  : "Paramètre de status (s) « $1 » est invalide: cela doit être un champs standard. Voir le manuel utilisateur.",
  "rt-badcfield"  : "Paramètre de champs personalisé « $1 » est invalide: cela doit être un mot simple.",
  "rt-badqueue"   : "Paramètre de queue (q) « $1 » est invalide: cela doit être un mot simple.",
  "rt-badowner"   : "Paramètre de propriétaire (o) « $1 » est invalide: cela doit être un mot simple.",
  "rt-nomatches"  : "Aucun ticket trouvé"
}
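Since a single stray quote or comma in one of these i18n files will break message loading for the whole extension, it can be worth validating them mechanically before shipping. Here is a minimal sketch; the check_i18n helper name is mine, not part of the extension, and it assumes python3 is available:

```shell
# Minimal sketch: validate i18n JSON files before deployment.
# check_i18n is an illustrative helper, not part of the extension.
check_i18n() {
    status=0
    for f in "$@"; do
        if python3 -m json.tool "$f" > /dev/null 2>&1; then
            echo "ok: $f"
        else
            echo "INVALID: $f"
            status=1
        fi
    done
    return $status
}
```

Run it as `check_i18n i18n/*.json` and any file that fails to parse is flagged.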

One other small change I made to the extension was to allow both ticket numbers and queue names to be used inside of the tag. To view a specific ticket, one was always able to do this:

<rt>6567</rt>

This would produce the text "RT #6567", with information on the ticket available on mouseover, and hyperlinked to the ticket inside of RT. However, I often found myself using this extension to view all the open tickets in a certain queue like this:

<rt q="dyson"></rt>

It seems easier to simply add the queue name inside the tags, so in this new version one can simply do this:

<rt>dyson</rt>

If you are running MediaWiki 1.25 or better, try out the new RequestTracker extension! If you are stuck on an older version, use the RT extension and upgrade as soon as you can. :)

by Greg Sabino Mullane (noreply@blogger.com) at January 18, 2017 03:41 AM

Broken wikis due to PHP and MediaWiki "namespace" conflicts

I was recently tasked with resurrecting an ancient wiki. In this case, a wiki last updated in 2005, running MediaWiki version 1.5.2, and that needed to get transformed to something more modern (in this case, version 1.25.3). The old settings and extensions were not important, but we did want to preserve any content that was made.

The items available to me were a tarball of the mediawiki directory (including the LocalSettings.php file), and a MySQL dump of the wiki database. To import the items to the new wiki (which already had been created and was gathering content), an XML dump needed to be generated. MediaWiki has two simple command-line scripts to export and import your wiki, named dumpBackup.php and importDump.php. So it was simply a matter of getting the wiki up and running enough to run dumpBackup.php.

My first thought was to simply bring the wiki up as it was - all the files were in place, after all, and specifically designed to read the old version of the schema. (Because the database schema changes over time, newer MediaWikis cannot run against older database dumps.) So I unpacked the MediaWiki directory, and prepared to resurrect the database.

Rather than MySQL, the distro I was using defaulted to using the freer and arguably better MariaDB, which installed painlessly.

## Create a quick dummy database:
$ echo 'create database footest' | sudo mysql

## Install the 1.5.2 MediaWiki database into it:
$ cat mysql-acme-wiki.sql | sudo mysql footest

## Sanity test as the output of the above commands is very minimal:
$ echo 'select count(*) from revision' | sudo mysql footest
count(*)
727977

Success! The MariaDB instance was easily able to parse and load the old MySQL file. The next step was to unpack the old 1.5.2 mediawiki directory into Apache's docroot, adjust the LocalSettings.php file to point to the newly created database, and try and access the wiki. Once all that was done, however, both the browser and the command-line scripts spat out the same error:

Parse error: syntax error, unexpected 'Namespace' (T_NAMESPACE), 
  expecting identifier (T_STRING) in 
  /var/www/html/wiki/includes/Namespace.php on line 52

What is this about? Turns out that some years ago, someone added a class to MediaWiki with the terrible name of "Namespace". Years later, PHP finally caved to user demands and added some non-optimal support for namespaces, which means that (surprise), "namespace" is now a reserved word. In short, older versions of MediaWiki cannot run with modern (5.3.0 or greater) versions of PHP. Amusingly, a web search for this error on DuckDuckGo revealed not only many people asking about this error and/or offering solutions, but also many actual wikis that are currently not working! Thus, their wiki was working fine one moment, and then PHP was (probably automatically) upgraded, and now the wiki is dead. But DuckDuckGo is happy to show you the wiki and its now-single page of output, the error above. :)

There are three groups to blame for this sad situation, as well as three obvious solutions to the problem. The first group to share the blame, and the most culpable, is the MediaWiki developers who chose the word "Namespace" as a class name. As PHP has always had poor to non-existent support for packages, namespaces, and scoping, it is vital that all your PHP variables, class names, etc. are as unique as possible. To that end, the name of the class was changed at some point to "MWNamespace" - but the damage had been done. The second group to share the blame is the PHP developers, both for not having namespace support for so long, and for making it into a reserved word knowing full well that one of the poster children for "mature" PHP apps, MediaWiki, was using "Namespace". Still, we cannot blame them too much for picking what is a pretty obvious word choice. The third group to blame is the owners of all those wikis out there that are suffering that syntax error. They ought to be repairing their wikis. The fixes are pretty simple, which leads us to the three solutions to the problem.


MediaWiki's cool install image

The quickest (and arguably worst) solution is to downgrade PHP to something older than 5.3. At that point, the wiki will probably work again. Unless it's a museum (static) wiki, and you do not intend to upgrade anything on the server ever again, this solution will not work long term. The second solution is to upgrade your MediaWiki! The upgrade process is actually very robust and works well even for very old versions of MediaWiki (as we shall see below). The third solution is to make some quick edits to the code to replace all uses of "Namespace" with "MWNamespace". Not a good solution, but ideal when you just need to get the wiki up and running. Thus, it's the solution I tried for the original problem.
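That third fix can be done mechanically. A rough sketch, assuming GNU grep and sed and a backed-up copy of the tree (the rename_namespace helper is illustrative, not an official script):

```shell
# Rough sketch of the third fix: rename the conflicting "Namespace" class
# throughout the old MediaWiki tree. The word-boundary match leaves any
# already-correct "MWNamespace" references untouched. Back up first!
rename_namespace() {
    grep -rl '\bNamespace\b' "$1" | while read -r f; do
        sed -i 's/\bNamespace\b/MWNamespace/g' "$f"
    done
}
```

Invoking it as `rename_namespace /var/www/html/wiki/includes` rewrites every file that mentions the bare class name.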

However, once I solved the Namespace problem by renaming to MWNamespace, some other problems popped up. I will not run through them here - although they were small and quickly solved, it began to feel like a neverending whack-a-mole game, and I decided to cut the Gordian knot with a completely different approach.

As mentioned, MediaWiki has an upgrade process, which means that you can install the software and it will, in theory, transform your database schema and data to the new version. However, version 1.5 of MediaWiki was released in October 2005, almost exactly ten years before the current release (1.25.3 as of this writing). Ten years is a really, really long time on the Internet. Could MediaWiki really convert something that old? (Spoiler: yes!) Only one way to find out. First, I prepared the old database for the upgrade. Note that all of this was done on a private local machine where security was not an issue.

## As before, install mariadb and import into the 'footest' database
$ echo 'create database footest' | sudo mysql
$ cat mysql-acme-wiki.sql | sudo mysql footest
$ echo "set password for 'root'@'localhost' = password('foobar')" | sudo mysql

Next, I grabbed the latest version of MediaWiki, verified it, put it in place, and started up the webserver:

$ wget http://releases.wikimedia.org/mediawiki/1.25/mediawiki-1.25.3.tar.gz
$ wget http://releases.wikimedia.org/mediawiki/1.25/mediawiki-1.25.3.tar.gz.sig

$ gpg --verify mediawiki-1.25.3.tar.gz.sig 
gpg: assuming signed data in `mediawiki-1.25.3.tar.gz'
gpg: Signature made Fri 16 Oct 2015 01:09:35 PM EDT using RSA key ID 23107F8A
gpg: Good signature from "Chad Horohoe "
gpg:                 aka "keybase.io/demon "
gpg:                 aka "Chad Horohoe (Personal e-mail) "
gpg:                 aka "Chad Horohoe (Alias for existing email) "
## Chad's cool. Ignore the below.
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 41B2 ABE8 17AD D3E5 2BDA  946F 72BC 1C5D 2310 7F8A

$ tar xvfz mediawiki-1.25.3.tar.gz
$ mv mediawiki-1.25.3 /var/www/html/
$ cd /var/www/html/mediawiki-1.25.3
## Because "composer" is a really terrible idea:
$ git clone https://gerrit.wikimedia.org/r/p/mediawiki/vendor.git 
$ sudo service httpd start

Now, we can call up the web page to install MediaWiki.

  • Visit http://localhost/mediawiki-1.25.3, see the familiar yellow flower
  • Click "set up the wiki"
  • Click next until you find "Database name", and set to "footest"
  • Set the "Database password:" to "foobar"
  • Aha! Look what shows up: "Upgrade existing installation" and "There are MediaWiki tables in this database. To upgrade them to MediaWiki 1.25.3, click Continue"

It worked! The next messages are: "Upgrade complete. You can now start using your wiki. If you want to regenerate your LocalSettings.php file, click the button below. This is not recommended unless you are having problems with your wiki." That message is a little misleading. You almost certainly *do* want to generate a new LocalSettings.php file when doing an upgrade like this. So say yes, leave the database choices as they are, and name your wiki something easily greppable like "ABCD". Create an admin account, save the generated LocalSettings.php file, and move it to your mediawiki directory.

At this point, we can do what we came here for: generate an XML dump of the wiki content in the database, so we can import it somewhere else. We only wanted the actual content, and did not want to worry about the history of the pages, so the command was:

$ php maintenance/dumpBackup.php --current > acme.wiki.2005.xml

It ran without a hitch. However, close examination showed that it had an amazing amount of unwanted stuff from the "MediaWiki:" namespace. While there are probably some clever solutions that could be devised to cut them out of the XML file (either on export, import, or in between), sometimes quick beats clever, and I simply opened the file in an editor and removed all the "page" sections with a title beginning with "MediaWiki:". Finally, the file was shipped to the production wiki running 1.25.3, and the old content was added in a snap:

$ php maintenance/importDump.php acme.wiki.2005.xml

The script will recommend rebuilding the "Recent changes" page by running rebuildrecentchanges.php (can we get consistentCaps please MW devs?). However, this data is at least 10 years old, and Recent changes only goes back 90 days by default in version 1.25.3 (and even shorter in previous versions). So, one final step:

## 20 years should be sufficient
$ echo '$wgRCMaxAge = 20 * 365 * 24 * 3600;' >> LocalSettings.php
$ php maintenance/rebuildrecentchanges.php

Voila! All of the data from this ancient wiki is now in place on a modern wiki!

by Greg Sabino Mullane (noreply@blogger.com) at January 18, 2017 03:23 AM

January 17, 2017

Erik Zachte

Browse winning Wiki Loves Monuments images offline

wlm_2016_in_aks_the_reflection_taj_mahal

Click to show full size (1136×640), e.g. for iPhone 5

 

The pages on Wikimedia Commons which list the winners of the yearly contests [1] contain a feature ‘Watch as Slideshow!’. Works great.

However, wouldn’t it be nice if you could also show these images offline (outside a browser), annotated and resized for minimal footprint?

Most end-of-year vacations I do a hobby project for Wikipedia. This time I worked on a script [2] [3] to make the above happen. The script does the following:

  • Download all images from Wiki Loves Monuments winners pages [1]
  • Collect image, author and license info for each image on those winners pages
  • or if not available there, collect this metadata from the upload pages on Commons
  • Resize the images so they are exactly the required size
  • Annotate the image unobtrusively in a matching font size:
    contest year, country, title, author, license
wlm-annotations

Font size used for 2560×1600 image

 

  • Prefix the downloaded image filename for super easy filtering on year and/or country


I pre-rendered several sets with common image sizes, ready for download. You can request an extra set for other common screen sizes [4] [5]:

wlm_download_folder


For instance the 1920×1080 set is ideal for HDTV (e.g. for Apple TV screensaver) or large iPhones. On TV the texts are readable by themselves; on a phone some manual zooming is needed (but unobtrusiveness is key).

[1] 2010 2011 2012 2013 2014 2015 2016
[2] The script has been tested on Windows 10.
Prerequisites: curl and ImageMagick's convert (in the same folder).
[3] I am actually already rewriting the script, separating it into two scripts, to make it more modular and more generally applicable. First script will extract information from WLM/WLE (WLA?) winners pages and image upload pages, and generate a csv file. Second script will read this csv, download images, resize and annotate them. I will announce the git url here when done.
[4] 4K is a bit too large for easy upload. I may do that later when the script can also run on WMF servers.
[5] Current sets are optimal for e.g. HDTV and new iPhones (again, others may follow):
1920×1080 HDTV and iPhone 6+/7+
1334×750 iPhone 6/6s/7
1136×640 iPhone 5/5s 
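The resize-and-annotate steps described above can be approximated with a single ImageMagick convert invocation. This is only a hedged sketch: the annotate_wlm helper, the filenames, the geometry, and the caption text are illustrative, not taken from the actual script.

```shell
# Illustrative sketch of the resize-and-annotate step with ImageMagick's
# convert; annotate_wlm, filenames, and caption are made up for this example.
annotate_wlm() {   # usage: annotate_wlm input.jpg output.jpg "caption text"
    convert "$1" \
        -resize 1920x1080^ -gravity center -extent 1920x1080 \
        -gravity southeast -pointsize 28 -fill white -undercolor '#00000080' \
        -annotate +10+10 "$3" \
        "$2"
}
```

The `^` on the resize geometry fills the target box before `-extent` crops it to exactly 1920×1080, and the second `-gravity` places the caption unobtrusively in the lower-right corner.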

by Erik at January 17, 2017 12:46 PM

Gerard Meijssen

#Wikimedia - What is our mission

Many Wikipedians have a problem with Wikidata. It is very much cultural. One argument is that Wikidata does not comply with their policies and therefore cannot be used. A case in point is "notability": Wikidata knows about much more, and how can that all be good?

To be honest, Wikidata is immature and it needs to be a lot better. When a Wikipedia community does not want to incorporate data from Wikidata at this point, fine. Let us find what it takes to do so in the future. Let us work on approaches that are possible now and add value to everyone.

Many of the arguments used show a lack of awareness of Wikipedia's own history. There are no reminders of the times when it was good to be "bold". It is forgotten that content should be allowed to improve over time, and this is still true for all of the Wikimedia content.

The problem is that Wikidata provides a service to every Wikimedia project, and as a consequence there are parts of any project where Wikidata will never comply with its policies. Arguably, the policies of all the projects, including Wikidata, should serve what the Wikimedia Foundation is about: ensuring that "every single person on the planet is given free access to the sum of all human knowledge". When the argument is framed in this way, the question becomes a different one: how can we benefit from each other, and how can we strengthen the quality of each other's offerings?

Wikidata got a flying start when it replaced all the interwiki links. When all the wiki links and red links are associated with Wikidata links, it will allow for new ways to improve the consistency of Wikipedia. The problem with culture is that it is resistant to change. So when the entrenched practice is that they do not want Wikidata, let's give them the benefits of Wikidata. In a "phabricator" thingie I tried to describe it.

The proposal is for both red links and wiki links to be associated with Wikidata items. It will make it easier to use the data tools associated with Wikidata to verify, curate and improve the Wikipedia content. Obviously every link could have an associated statement. As more and more Wikipedia links are associated with statements, Wikidata improves; as part of the process, these links are verified and errors are removed.

The nice thing is that the proposal allows for it to be "opt in". The old school Wikipedians do not have to notice. It will only be for those who understand the premise of using Wikidata to improve content. In the end it will allow Wikidata and even Wikipedia to mature. It will bring another way to look at quality and it will ensure that all the content of the Wikimedia Foundation will get better integrated and be of a higher quality.
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at January 17, 2017 09:25 AM

Wikimedia Foundation

Wikipedia is built on the public domain

Image by the US Department of Agriculture, public domain/CC0.

Wikipedia is built to be shared and remixed. This is possible, in part, thanks to the incredible amount of material that is available in the public domain. The public domain refers to a wide range of creations that are not restricted by copyright, and can be used freely by anyone. These works can be copied, translated, or remixed, so the public domain provides a crucial foundation for new creative works. On Wikipedia, some articles are based on text from older public domain encyclopedias or include images no longer protected by copyright. People regularly use public domain material to bring educational content to life and create interesting new ways to share it further.

There are three basic ways that material commonly enters the public domain.

First, when you think of the public domain, you may think of the very old creations whose copyright has expired. In the United States and many other countries, copyright lasts for the life of the author plus seventy years. Works published before 1923 are in the public domain, but the rest are governed by complex copyright rules. Peter B. Hirtle of Cornell University created a helpful chart to determine when the copyright terms for various types of works will expire in the U.S. Due to the copyright term extension in the 1976 Copyright Act and later amendments, published works from the United States will not start entering the public domain until 2019. In places outside of the U.S., copyright terms expire after shorter terms on January 1, celebrated annually as Public Domain Day.

Second, a valuable contributor to the public domain is the U.S. federal government. Works created by the U.S. government are in the public domain as a matter of law. This means that government websites may provide a rich source of freely usable photographs and other material. A primary purpose of copyright is to promote creation by rewarding people with exclusive rights, but the government does not need this sort of incentive. Government works are already funded directly by taxpayers, and should belong to the public. Putting the government’s creations in the public domain allows everyone to benefit from the government’s work.

Third, some authors choose to dedicate their creations to the public domain. Tools like Creative Commons Zero (CC0) allow people to mark works that the public can freely use without restrictions or conditions. CC0 is used for some highly creative works, like the photographs on Unsplash. Other creators may wish to release their works freely, but still maintain some copyright with minimal conditions attached. These users may adopt a license like Creative Commons Attribution Share-Alike (CC BY-SA) to require other users to provide credit and re-license their works. Most of the photographs on Wikimedia Commons and all the articles on Wikipedia are freely available under CC BY-SA. While these works still have copyright and are not completely in the public domain, they can still be shared and remixed freely alongside public domain material.

In the coming years, legislators in many countries will consider writing new copyright rules to adapt to changes in technology and the economy. One important consideration is how these proposals will protect the public domain to provide room for new creations. The European Parliament has already begun considering a proposed change to the Copyright Directive, including concerning new rights that would make the public domain less accessible to the public. Because copyright terms have been extended over the past few decades, works from the 1960s that would otherwise be free of copyright remain expensive and inaccessible. As we consider changing copyright rules, we should remember that everyone, including countless creators, will benefit from a rich and vibrant public domain.

Stephen LaPorte, Senior Legal Counsel
Wikimedia Foundation

Interested in getting more involved? Learn more about the Wikimedia Foundation’s position on copyright, and join the public policy mailing list to discuss how Wikimedia can continue to protect the public domain.

by Stephen LaPorte at January 17, 2017 12:25 AM

January 16, 2017

Wiki Education Foundation

The Roundup: Serious Business

It can be tricky to find publicly accessible, objective information about business-related subjects. It’s more common for there to be monetary incentives to advocate, promote, omit, or underplay particular aspects, points of view, or examples. The concepts can also be complex, weaving together theory, history, law, and a variety of opinions. Effectively writing about business on Wikipedia thus requires neutrality, but also great care in selecting sources and the ability to summarize the best information about a topic. It’s for these reasons that students can make particularly valuable contributions to business topics on Wikipedia. They arrive at the subject without the burden of the conflicts of interest a professional may have, they have access to high-quality sources, and they have an expert to guide them on their way.

Students in Amy Carleton’s Advanced Writing in the Business Administration Professions course at Northeastern University made several such contributions.

One student contributed to the article on corporate social responsibility, adding information from academic research on the effects of the business model on things like employee turnover and customer relations.

Another student created the article about the investigation of Apple’s transfer pricing arrangements with Ireland, a three-year investigation into the tax benefits Apple, Inc. received. The result was the “biggest tax claim ever”, though the decision is being appealed.

Overtime is something that affects millions of workers, and it has been a common topic of labor disputes. Wikipedia has an article about overtime in general, but it’s largely an overview of relevant laws. What had not been covered, until a student created the article, were the effects of overtime. Similarly, while Wikipedia covers a wide range of immigration topics, it did not yet cover the international entrepreneur rule, a proposed immigration regulation that would admit more foreign entrepreneurs into the United States. As with areas where there are common monetary conflicts of interest, controversial subjects like immigration policy are simultaneously challenging to write about and absolutely crucial to cover.

Some of the other topics covered in the class include philanthropreneurs, the globalization of the football transfer market, peer-to-peer transactions, and risk arbitrage.

Contributing well-written, neutral information about challenging but important topics is a valuable public service. If you’re an instructor who may want to participate, Wiki Ed is here to help. We’re a non-profit organization that provides free tools and staff support for you and your students as they contribute to public knowledge on Wikipedia for a class assignment. To learn more, head to teach.wikipedia.org or email contact@wikiedu.org.

Photo: Dodge Hall Northeastern University.jpg, User:Piotrus (original) / User:Rhododendrites (derivative), CC BY-SA 3.0, via Wikimedia Commons.

by Ryan McGrady at January 16, 2017 05:07 PM

Semantic MediaWiki

Semantic MediaWiki 2.4.5 released/en

January 16, 2017

Semantic MediaWiki 2.4.5 (SMW 2.4.5) has been released today as a new version of Semantic MediaWiki.

This new version is a minor release and provides bugfixes for the current 2.4 branch of Semantic MediaWiki. Please refer to the help page on installing Semantic MediaWiki to get detailed instructions on how to install or upgrade.

by TranslateBot at January 16, 2017 01:27 PM

Semantic MediaWiki 2.4.5 released

January 16, 2017

Semantic MediaWiki 2.4.5 (SMW 2.4.5) has been released today as a new version of Semantic MediaWiki.

This new version is a minor release and provides bugfixes for the current 2.4 branch of Semantic MediaWiki. Please refer to the help page on installing Semantic MediaWiki to get detailed instructions on how to install or upgrade.

by Kghbln at January 16, 2017 01:24 PM

User:Legoktm

MediaWiki - powered by Debian

Barring any bugs, the last set of changes to the MediaWiki Debian package for the stretch release landed earlier this month. There are some documentation changes, as well as updates reflecting changes in other related packages. One of the other changes is the addition of a "powered by Debian" footer icon (drawn by the amazing Isarra), right next to the default "powered by MediaWiki" one.

Powered by Debian

This will only be added by default to new installs of the MediaWiki package. But existing users can just copy the following code snippet into their LocalSettings.php file (adjust paths as necessary):

# Add a "powered by Debian" footer icon
$wgFooterIcons['poweredby']['debian'] = [
    "src" => "/mediawiki/resources/assets/debian/poweredby_debian_1x.png",
    "url" => "https://www.debian.org/",
    "alt" => "Powered by Debian",
    "srcset" =>
        "/mediawiki/resources/assets/debian/poweredby_debian_1_5x.png 1.5x, " .
        "/mediawiki/resources/assets/debian/poweredby_debian_2x.png 2x",
];

The image files are included in the package itself, or you can grab them from the Git repository. The source SVG is available from Wikimedia Commons.

by legoktm at January 16, 2017 09:18 AM

January 15, 2017

Wikimedia Foundation

Librarians offer the gift of a footnote to celebrate Wikipedia’s birthday: Join #1lib1ref 2017

Photo by Diliff, CC BY-SA 4.0.


Wikipedia has just turned 16, at a time when the need for accurate, reliable information is greater than ever. In a world where social media channels are awash with fake news, and unreliable assertions come from every corner, the Wikimedia communities and Wikipedia in particular have offered a space for that free, accessible and reliable information to be aggregated and shared with the broader world.

Making sure that the public, our patrons, reach the best sources of information is at the heart of the Wikipedia community’s ideals. The concept of all the information on Wikipedia being “verifiable”, connected to an editorially controlled source, like a reputable newspaper or academic journal, has helped focus the massive collaborative effort that Wikipedia represents.

This connection of Wikipedia’s information to sourcing, however, is an ideal; Wikipedia grows through the contributions of thousands of people every month, and we cannot practically expect every new editor to understand how Wikipedia relies on footnotes, how to find the right kinds of research material, or how to add those references to Wikipedia. All of these steps require not only a broader understanding of research, but how those skills apply to our context.

Unlike an average Wikipedia reader, librarians understand these skills intimately: not only do librarians have training and practical experience finding and integrating reference materials into written works, but they teach patrons these vital 21st-century information literacy skills every day. In the face of a flood of bad information, the health of Wikipedia relies not only on contributors, but community educators who can help our readers understand how our content is created. Ultimately, the skills and goals of the library community are aligned with the Wikipedia community.

That is why we are asking librarians to “Imagine a world where every librarian added one more reference to Wikipedia” as part of our second annual “1 Librarian, 1 Reference” (#1lib1ref) campaign. There are plenty of opportunities to get involved: there are over 313,000 “citation needed” statements on Wikipedia and 213,000 articles without any citations at all.

Last year, #1lib1ref spread around the world, helping over 500 librarians contribute thousands of citations and sparking a conversation among library communities about what role Wikipedia has in the information ecosystem. Still, Wikipedia has over 40 million articles in hundreds of languages; though those hundreds of librarians made many contributions to the English, Catalan, and a few other language Wikipedias, we need more contributors to significantly change the experience of Wikipedia’s hundreds of millions of readers.

This year, we are calling on librarians the world over to make #1lib1ref a bigger, better contribution to a real-information based future. We are:

  • Supporting more languages for the campaign
  • Providing a kit to help organize gatherings of librarians to contribute and talk about Wikipedia’s role in librarianship.
  • Extending the campaign for another couple of weeks, from January 15 until February 3.

Share the campaign in your networks and go to your library to ask your librarian to join in the campaign in the coming weeks, to contribute a precious Wikipedia birthday gift to the world: one more citation on Wikipedia!

Alex Stinson, GLAM-Wiki Strategist
Wikimedia Foundation

You can learn more about 1lib1ref at its campaign page.

by Alex Stinson at January 15, 2017 08:31 PM

Gerard Meijssen

#Wikipedia - Who is Fiona Hile?

When you look for Fiona Hile on the English Wikipedia, you will find this. It is a puzzle and there are probably two people by that name that do not have an article (yet).

One of them is an Australian poet. When you google for her you find among other things a picture. When you seek her information on VIAF you find two identifiers and in the near future she will have a third: Wikidata.

From a Wikidata point of view it is relevant to have an item for her because she won two awards. It completes these lists and it connects the two awards to the same person.

When you ask yourself whether Mrs Hile is really "notable", you find that the answer depends on your point of view. Wikipedia already mentions her twice, and surely a discussion on the relative merits of notability is not everyone's cup of tea.

Why is Mrs Hile notable enough to blog about? She is a great example of how Wikipedia and Wikidata together can produce more and better information.
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at January 15, 2017 07:40 PM

The Peter Porter Poetry Prize

For me the Peter Porter Poetry Prize is an award like so many others. There is one article, and it lists the names of some of the people who are known to have won the prize. Some are linked and some are not. For one winner I linked to a German article, and for a few others I created an item.

This list is complete, and it has a link to a source so the information can be verified; I am satisfied with the result, up to a point.

What I could do is add more awards and people who have won them. The article for Tracy Ryan, the 2009 winner, has a category for another award that she won. This award does not have a webpage listing all the past winners, so the question is: is Wikipedia good enough as a source? I added the winners to the award, made a mistake, corrected it, and now Wikidata knows about a Nathan Hobby.

Jay Martin is the 2016 winner of the T.A.G. Hungerford Award. It has a source, but it is extremely likely that this will disappear in 2017. The problem I have is that I want to see this information shared, yet all the work done to improve the data on Wikidata is not seen at Wikipedia. When we share our resources and are better in tune with each other's needs as editors, we will be better able to "share in the sum of our available knowledge".
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at January 15, 2017 12:20 PM

Is #Wikipedia the new #Britannica?

At the time, the Britannica was best of breed. It was the encyclopaedia to turn to. Then Wikipedia happened, and obviously it was not good enough; people were not convinced. When you read the discussions about why Wikipedia was not good enough, however, there was no actual discussion. The points of view were clear and had consequences, and it was only when research was done that Wikipedia became respectable. Its quality was equally good, and it was more informative and covered more subjects. The arguments did not go away; the point of view simply became irrelevant. People, and particularly students, use Wikipedia.

Today Wikipedia is said to be best of breed. It is where you find encyclopaedic information and as Google rates Wikipedia content highly it is seen and used a lot by many people.

The need for information is changing. We have recently experienced a lot of misinformation, and the need to know what is factually correct has never been more important. What has become clear is that arguments and information alone are not what sway people. So the question is: where does that leave Wikipedia?

The question we have to ask is: what does it take to convince people, to make them open-minded? What do we do when people expect a neutral point of view but the facts are unambiguous in one direction? What if the language used is not understood? What are the issues of Wikipedia, what are its weaknesses and what are its strengths?

So far, quality has been considered to be found in sources, in the reputation of the writers. When this is not what convinces, how do we show our quality? Or better, how do we get people to reconsider and see the other point of view?
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at January 15, 2017 08:04 AM

January 13, 2017

Weekly OSM

weeklyOSM 338

01/03/2017-01/09/2017

Logo

New routing possibilities for wheelchairs 1 |

Mapping

  • Regio OSM, a completeness checker for addresses, now checks 1702 communities and many cities in Germany, one of the 11 countries where the tool can be used.
  • An interesting combination of OpenData and OSM to improve the OSM data of schools in the UK. One drawback is that a direct link exists only to iD. If iD is open, however, you can open JOSM from there. 😉
  • Pascal Neis describes his tools for QA in OSM in a blog post.
  • Arun Ganesh shows the significance of the wikidata=* tag by an example of the North Indian city of Manali. In his contribution, he also points to possibilities for improving OSM with further information via Wikidata, Wikimedia Commons, WikiVoyage and also points out information about using Wikidata with Mapbox tools.
  • The OSM Operations team announced a new feature on the main map page: Public GPS-Tracks.
  • Tom Pfeifer asks how coworking spaces, a quite modern form of cooperation based on sharing workspaces and equipment, should be tagged.
  • Chris uses AutoHotKey (Windows) and JOSM to optimize his mapping experience. He demonstrates this in a video, while tracing building outlines.
  • User rorym shows why it is useful not to make mechanical edits but “look at the area and look for other mistakes!”

Community

OpenStreetMap Foundation

Events

  • Klaus Torsky reports (de) (automatic translation) on the last FOSS4G in Germany. He links to an interview (en) with Till Adams the brain behind the organisation of FOSS4G in Bonn.
  • Frederik Ramm invites people for the February hack weekend happening in Karlsruhe.
  • A mapping party took place in Tombuctu from the 7th to the 9th of January.

Humanitarian OSM

  • Kizito Makoye reports on the initiative of the Dar es Salaam City Administration, Tanzania, to map the floodplains of poor areas such as Tandale using drones. The Ramani-Huria project supports this by incorporating the acquired data into OSM-based maps. This and other measures will improve the living conditions and the infrastructure in the slum areas.

Maps

switch2OSM

  • Uber uses OpenStreetMap. Grant Slater expects Uber to contribute to OSM data.

Software

  • The Wikimedia help explains how to use the Wikidata ID to display the outline of OSM Objects in Wikimedia maps.
  • User Daniel writes a diary entry on how the latest release of the Open Source Routing Machine (version 5.5) has made it easier to set up your own routing machine, and shares some related documentation.

Releases

The Open Source Routing Machine (OSRM) released version 5.5, which comes with some huge enhancements in guidance, tags, API and infrastructure.

Software Version Release date Comment
OSRM 5.5.0 2016-12-16 Navigation, tag interpretation and the API infrastructure have been improved.
JOSM 11427 2016-12-31 No info.
Mapillary Android * 3.14 2017-01-04 Much faster GPX fix.
Mapbox GL JS v0.30.0 2017-01-05 No info.
Naviki Android * 3.52.4 2017-01-05 Accuracy improved.
Mapillary iOS * 4.5.11 2017-01-06 Improved onboarding.
SQLite 1.16.2 2017-01-06 Four fixes.

Provided by the OSM Software Watchlist.

(*) unfree software. See: freesoftware.

Did you know …

  • … the daily updated extracts by Netzwolf?
  • … your next holiday destination? If yes, then the map of georeferenced images in Wikimedia Commons is ideal for informing yourself in advance.
  • … the GPS navigator uNav for Ubuntu smartphones? This OSM-based Navi-App is now available in version 0.64 for the Ubuntu Mobile Operating System (OTA-14).

OSM in the media

  • Tracy Staedter (Seeker) explained the maps of Geoff Boeing, who calls his visualization tool OSMnx (OSM + NetworkX). The tool can render the physical layout of the streets of any city in a black & white grid, showing impressive historical city developments. Boeing says, "The maps help change opinions by demonstrating to people that the density of a city is not necessarily bad."

Other “geo” things

  • The Open Traffic Partnership (OTP) is an initiative in Manila, Philippines which aims to make use of anonymized GPS data to analyze traffic congestion. The partnership has led to an open source platform – OSM is represented by Mapzen – that enables developing countries to record and analyze traffic patterns. Alyssa Wright, President of the US OpenStreetMap Foundation, said: “The partnership seeks to improve the efficiency and effectiveness of global transport use and supply through open data and capacity expansion.”
  • This is how the Mercator Projection distorts the poles.
  • Treepedia, developed by MIT’s Senseable City Lab and World Economic Forum, provides a visualization of tree cover in 12 major cities including New York, Los Angeles and Paris.

Upcoming Events

Where What When Country
Lyon Mapathon Missing Maps pour Ouahigouya 01/16/2017 france
Brussels Brussels Meetup 01/16/2017 belgium
Essen Stammtisch 01/16/2017 germany
Grenoble Rencontre groupe local 01/16/2017 france
Manila 【MapAm❤re】OSM Workshop Series 7/8, San Juan 01/16/2017 philippines
Augsburg Augsburger Stammtisch 01/17/2017 germany
Cologne/Bonn Bonner Stammtisch 01/17/2017 germany
Scotland Edinburgh 01/17/2017 uk
Lüneburg Mappertreffen Lüneburg 01/17/2017 germany
Viersen OSM Stammtisch Viersen 01/17/2017 germany
Osnabrück Stammtisch / OSM Treffen 01/18/2017 germany
Karlsruhe Stammtisch 01/18/2017 germany
Osaka もくもくマッピング! #02 01/18/2017 japan
Leoben Stammtisch Obersteiermark 01/19/2017 austria
Urspring Stammtisch Ulmer Alb 01/19/2017 germany
Tokyo 東京!街歩き!マッピングパーティ:第4回 根津神社 01/21/2017 japan
Manila 【MapAm❤re】OSM Workshop Series 8/8, San Juan 01/23/2017 philippines
Bremen Bremer Mappertreffen 01/23/2017 germany
Graz Stammtisch Graz 01/23/2017 austria
Brussels FOSDEM 2017 02/04/2017-02/05/2017 belgium
Genoa OSMit2017 02/08/2017-02/11/2017 italy
Passau FOSSGIS 2017 03/22/2017-03/25/2017 germany
Avignon State of the Map France 2017 06/02/2017-06/04/2017 france
Aizu-wakamatsu Shi State of the Map 2017 08/18/2017-08/20/2017 japan
Buenos Aires FOSS4G+SOTM Argentina 2017 10/23/2017-10/28/2017 argentina

Note: If you would like to see your event here, please put it into the calendar. Only data which is there will appear in weeklyOSM. Please check your event in our public calendar preview and correct it where appropriate.

This weeklyOSM was produced by Peda, Polyglot, Rogehm, SeleneYang, SomeoneElse, SrrReal, TheFive, YoViajo, derFred, jinalfoflia, keithonearth, wambacher.

by weeklyteam at January 13, 2017 07:00 PM

Wikimedia Tech Blog

Importing JSON into Hadoop via Kafka

Photo by Eric Kilby, CC BY-SA 2.0.

Photo by Eric Kilby, CC BY-SA 2.0.

JSON is…not binary

JSON is awesome.  It is both machine and human readable.  It is concise (at least compared to XML), and is even more concise when represented as YAML. It is well supported in many programming languages.  JSON is text, and works with standard CLI tools.

JSON sucks.  It is verbose.  Every value has a key in every single record.  It is schema-less and fragile. If a JSON producer changes a field name, all downstream consumer code has to be ready.  It is slow.  Languages have to convert JSON strings to binary representations and back too often.

JSON is ubiquitous.  Because it is so easy for developers to work with, it is one of the most common data serialization formats used on the web [citation needed!].  Almost any web based organization out there likely has to work with JSON in some capacity.

Kafka was originally developed by LinkedIn, and is now an open source Apache project with strong support from Confluent.   Both of these organizations prefer to work with strongly typed and schema-ed data.  Their serialization format of choice is Avro.  Organizations like this have tight control over their data formats, as it rarely escapes outside of their internal networks.  There are very good reasons Confluent is pushing Avro instead of JSON, but for many, like Wikimedia, it is impractical to transport data in a binary format that is unparseable without extra information (schemas) or special tools.

The Wikimedia Foundation lives openly on the web and has a commitment to work with volunteer open source contributors.  Mediawiki is used by people of varying technical skill levels in different operating environments.  Forcing volunteers and Wikimedia engineering teams to work with serialization formats other than JSON is just mean!  Wikimedia wants our software and data to be easy.

For better or worse, we are stuck with JSON.  This makes many things easy, but big data processing in Hadoop is not one of them.  Hadoop runs in the JVM, and it works more smoothly if its data is schema-ed and strongly typed.  Hive tables are schema-ed and strongly typed.  They can be mapped onto JSON HDFS files using a JSON SerDe, but if the underlying data changes because someone renames a field, certain queries on that Hive table will break.  Wikimedia imports the latest JSON data from Kafka into HDFS every 10 minutes, and then does a batch transform and load process on each fully imported hour.

Camus, Gobblin, Connect

LinkedIn created Camus to import Avro data from Kafka into HDFS.   JSON support was added by Wikimedia.  Camus’ shining feature is the ability to write data into HDFS directory hierarchies based on configurable time bucketing.  You specify the granularity of the bucket and which field in your data should be used as the event timestamp.

However, both LinkedIn and Confluent have dropped support for Camus.  It is an end-of-life piece of software.  Posited as replacements, LinkedIn has developed Gobblin, and Kafka ships with Kafka Connect.

Gobblin is a generic HDFS import tool.  It should be used if you want to import data from a variety of sources into HDFS.  It does not support timestamp bucketed JSON data out of the box.  You’ll have to provide your own implementation to do this.

Kafka Connect is a generic Kafka import and export tool, and has an HDFS Connector that helps get data into HDFS.  It has limited JSON support, and requires that your JSON data conform to a Kafka Connect-specific envelope.  If you don’t want to reformat your JSON data to fit this envelope, you’ll have difficulty using Kafka Connect.

That leaves us with Camus.  For years, Wikimedia has successfully been using Camus to import JSON data from Kafka into HDFS.  Unlike the newer solutions, Camus does not do streaming imports, so it must be scheduled in batches. We’d like to catch up with more current solutions and use something like Kafka Connect, but until JSON is better supported we will continue to use Camus.

So, how is it done?  This question appears often enough on Kafka-related mailing lists that we decided to write this blog post.

Camus with JSON

Camus needs to be told how to read messages from Kafka, and in what format they should be written to HDFS.  JSON should be serialized and produced to Kafka as UTF-8 byte strings, one JSON object per Kafka message.  We want this data to be written as is with no transformation directly to HDFS.  We’d also like to compress this data in HDFS, and still have it be useable by MapReduce.  Hadoop’s SequenceFile format will do nicely.  (If we didn’t care about compression, we could use the StringRecordWriterProvider to write the JSON records \n delimited directly to HDFS text files.)
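For comparison, that uncompressed alternative would look something like this in camus.properties (a sketch, assuming StringRecordWriterProvider lives in the same com.linkedin.camus.etl.kafka.common package as the SequenceFile provider shown later):

```properties
# Sketch of the uncompressed alternative: write \n-delimited JSON
# records directly to plain text files in HDFS.
etl.record.writer.provider.class=com.linkedin.camus.etl.kafka.common.StringRecordWriterProvider
etl.output.record.delimiter=\n
```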

We’ll now create a camus.properties file that does what we need.

First, we need to tell Camus where to write our data, and where to keep execution metadata about this Camus job.  Camus uses HDFS to store Kafka offsets so that it can keep track of topic partition offsets from which to start during each run:

# Final top-level HDFS data output directory. A sub-directory
# will be dynamically created for each consumed topic.
etl.destination.path=hdfs:///path/to/output/directory

# HDFS location where you want to keep execution files,
# i.e. offsets, error logs, and count files.
etl.execution.base.path=hdfs:///path/to/camus/metadata

# Where completed Camus job output directories are kept,
# usually a sub-dir in the etl.execution.base.path
etl.execution.history.path=hdfs:///path/to/camus/metadata/history

Next, we’ll specify how Camus should read in messages from Kafka, and how it should look for event timestamps in each message.  We’ll use the JsonStringMessageDecoder, which expects each message to be a UTF-8 byte JSON string.  It will deserialize each message using the Gson JSON parser, and look for a configured timestamp field.

# Use the JsonStringMessageDecoder to deserialize JSON messages from Kafka.
camus.message.decoder.class=com.linkedin.camus.etl.kafka.coders.JsonStringMessageDecoder


camus.message.timestamp.field specifies which field in the JSON object should be used as the event timestamp, and camus.message.timestamp.format specifies the timestamp format of that field.  Timestamp interpolation is handled by Java’s SimpleDateFormat, so you should set camus.message.timestamp.format to something that SimpleDateFormat understands, unless your timestamp is already an integer UNIX epoch timestamp.  If it is, you should use ‘unix_seconds’ or ‘unix_milliseconds’, depending on the granularity of your UNIX epoch timestamp.

Wikimedia maintains a slight fork of JsonStringMessageDecoder that makes the camus.message.timestamp.field setting slightly more flexible.  In our fork, you can specify sub-objects using dotted notation, e.g. camus.message.timestamp.field=sub.object.timestamp. If you don’t need this feature, then don’t bother with our fork.

Here are a couple of examples:

Timestamp field is ‘dt’, format is an ISO-8601 string:

# Specify which field in the JSON object will contain our event timestamp.
camus.message.timestamp.field=dt

# Timestamp values look like 2017-01-01T15:40:17
camus.message.timestamp.format=yyyy-MM-dd'T'HH:mm:ss


Timestamp field is ‘meta.sub.object.ts’, format is a UNIX epoch timestamp integer in milliseconds:

# Specify which field in the JSON object will contain our event timestamp.
# E.g. { "meta": { "sub": { "object": { "ts": 1482871710123 } } } }
# Note that this will only work with Wikimedia’s fork of Camus.
camus.message.timestamp.field=meta.sub.object.ts

# Timestamp values are in milliseconds since UNIX epoch.
camus.message.timestamp.format=unix_milliseconds

If the timestamp cannot be read out of the JSON object, JsonStringMessageDecoder will log a warning and fall back to using System.currentTimeMillis().
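To make the timestamp handling concrete, here is what it boils down to in plain Java rather than Camus code (a sketch: the class and method names are ours, and the timezone is pinned to UTC for determinism, whereas Camus uses etl.default.timezone):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

// Minimal illustration of decoding an ISO-8601 timestamp string into the
// epoch milliseconds Camus buckets by, using the pattern configured above.
class TimestampSketch {
    static long parse(String ts) {
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss");
        // Pin the timezone so the result is deterministic.
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        try {
            return sdf.parse(ts).getTime();
        } catch (ParseException e) {
            // The real decoder logs a warning and falls back to
            // System.currentTimeMillis(); we just fail loudly here.
            throw new IllegalArgumentException("Unparseable timestamp: " + ts, e);
        }
    }

    public static void main(String[] args) {
        // Prints the epoch milliseconds for 2017-01-01T15:40:17 UTC.
        System.out.println(parse("2017-01-01T15:40:17"));
    }
}
```

Running this prints 1483285217000, the millisecond epoch for 2017-01-01T15:40:17 UTC, which is the value the hourly bucketing is computed from.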

Now that we’ve told Camus how to read from Kafka, we need to tell it how to write to HDFS. etl.output.file.time.partition.mins is important. It tells Camus the time bucketing granularity to use.  Setting this to 60 minutes will cause Camus to write files into hourly bucket directories, e.g. 2017/01/01/15. Setting it to 1440 minutes will write daily buckets, etc.

# Store output into hourly buckets.
etl.output.file.time.partition.mins=60

# Use UTC as the default timezone.
etl.default.timezone=UTC

# Delimit records by newline.  This is important for MapReduce to be able to split JSON records.
etl.output.record.delimiter=\n


Use SequenceFileRecordWriterProvider if you want to compress data.  To do so, set mapreduce.output.fileoutputformat.compress.codec=SnappyCodec (or another splittable compression codec) either in your mapred-site.xml or in this camus.properties file.

# SequenceFileRecordWriterProvider writes the records as Hadoop Sequence files
# so that they can be split even if they are compressed.
etl.record.writer.provider.class=com.linkedin.camus.etl.kafka.common.SequenceFileRecordWriterProvider

# Use Snappy to compress output records.
mapreduce.output.fileoutputformat.compress.codec=SnappyCodec


Finally, some basic Camus configs are needed:

# Replace this with your list of Kafka brokers from which to bootstrap.
kafka.brokers=kafka1001:9092,kafka1002:9092,kafka1003:9092

# These are the kafka topics camus brings to HDFS.
# Replace this with the topics you want to pull,
# or alternatively use kafka.blacklist.topics.
kafka.whitelist.topics=topicA,topicB,topicC

# If whitelist has values, only whitelisted topic are pulled.
kafka.blacklist.topics=

There are various other camus properties you can tweak as well.  You can see some of the ones Wikimedia uses here.

Once this camus.properties file is configured, we can launch a Camus Hadoop job to import from Kafka.

hadoop jar camus-etl-kafka.jar com.linkedin.camus.etl.kafka.CamusJob -P /path/to/camus.properties -Dcamus.job.name="my-camus-job"


The first time this job runs, it will import as much data from Kafka as it can, and write its finishing topic-partition offsets to HDFS.  The next time you launch a Camus job with the same camus.properties file, it will read offsets from the configured etl.execution.base.path HDFS directory and start consuming from Kafka at those offsets.  Wikimedia schedules regular Camus jobs using boring ol’ cron, but you could use whatever newfangled job scheduler you like.
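For reference, the cron scheduling mentioned above can be as simple as a single crontab entry (a sketch reusing the placeholder jar, path, and job name from the command above):

```crontab
# Run Camus every 10 minutes; kafka.max.pull.minutes.per.task=9 in
# camus.properties keeps each run from overlapping the next one.
*/10 * * * * hadoop jar camus-etl-kafka.jar com.linkedin.camus.etl.kafka.CamusJob -P /path/to/camus.properties -Dcamus.job.name=my-camus-job
```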

After several Camus runs, you should see time bucketed directories containing Snappy compressed SequenceFiles of JSON data in HDFS stored in etl.destination.path, e.g. hdfs:///path/to/output/directory/topicA/2017/01/01/15/.  You could access this data with custom MapReduce or Spark jobs, or use Hive’s org.apache.hive.hcatalog.data.JsonSerDe and Hadoop’s org.apache.hadoop.mapred.SequenceFileInputFormat.  Wikimedia creates an external Hive table doing just that, and then batch processes this data into a more refined and useful schema stored as Parquet for faster querying.

Here’s the camus.properties file in full:

#
# Camus properties file for consuming Kafka topics into HDFS.
#

# Final top-level HDFS data output directory. A sub-directory
# will be dynamically created for each consumed topic.
etl.destination.path=hdfs:///path/to/output/directory

# HDFS location where you want to keep execution files,
# i.e. offsets, error logs, and count files.
etl.execution.base.path=hdfs:///path/to/camus/metadata

# Where completed Camus job output directories are kept,
# usually a sub-dir in the etl.execution.base.path
etl.execution.history.path=hdfs:///path/to/camus/metadata/history

# Use the JsonStringMessageDecoder to deserialize JSON messages from Kafka.
camus.message.decoder.class=com.linkedin.camus.etl.kafka.coders.JsonStringMessageDecoder

# Specify which field in the JSON object will contain our event timestamp.
camus.message.timestamp.field=dt

# Timestamp values look like 2017-01-01T15:40:17
camus.message.timestamp.format=yyyy-MM-dd'T'HH:mm:ss

# Store output into hourly buckets.
etl.output.file.time.partition.mins=60

# Use UTC as the default timezone.
etl.default.timezone=UTC

# Delimit records by newline.  This is important for MapReduce to be able to split JSON records.
etl.output.record.delimiter=\n

# SequenceFileRecordWriterProvider writes the records as Hadoop Sequence files
# so that they can be split even if they are compressed.
etl.record.writer.provider.class=com.linkedin.camus.etl.kafka.common.SequenceFileRecordWriterProvider

# Use Snappy to compress output records.
mapreduce.output.fileoutputformat.compress.codec=SnappyCodec

# Max hadoop tasks to use, each task can pull multiple topic partitions.
mapred.map.tasks=24

# Connection parameters.
# Replace this with your list of Kafka brokers from which to bootstrap.
kafka.brokers=kafka1001:9092,kafka1002:9092,kafka1003:9092

# These are the kafka topics camus brings to HDFS.
# Replace this with the topics you want to pull,
# or alternatively use kafka.blacklist.topics.
kafka.whitelist.topics=topicA,topicB,topicC

# If whitelist has values, only whitelisted topic are pulled.
kafka.blacklist.topics=

# max historical time that will be pulled from each partition based on event timestamp
#  Note:  max.pull.hrs doesn't quite seem to be respected here.
#  This will take some more sleuthing to figure out why, but in our case
#  here it’s ok, as we hope to never be this far behind in Kafka messages to
#  consume.
kafka.max.pull.hrs=168

# events with a timestamp older than this will be discarded.
kafka.max.historical.days=7

# Max minutes for each mapper to pull messages (-1 means no limit)
# Let each mapper run for no more than 9 minutes.
# Camus creates hourly directories, and we don't want a single
# long-running mapper to keep other Camus jobs from being launched.
# We run Camus every 10 minutes, so limiting each run to 9 should keep
# runs fresh.
kafka.max.pull.minutes.per.task=9

# Name of the client as seen by kafka
kafka.client.name=camus-00

# Fetch Request Parameters
#kafka.fetch.buffer.size=
#kafka.fetch.request.correlationid=
#kafka.fetch.request.max.wait=
#kafka.fetch.request.min.bytes=

kafka.client.buffer.size=20971520
kafka.client.so.timeout=60000

# Controls whether counts are submitted to Kafka.
# The default value is true.
post.tracking.counts.to.kafka=false

# Stops the mapper from getting inundated with Decoder exceptions for the same topic
# Default value is set to 10
max.decoder.exceptions.to.print=5

log4j.configuration=false

##########################
# Everything below this point can be ignored for the time being,
# will provide more documentation down the road. (LinkedIn/Camus never did! :/ )
##########################

etl.run.tracking.post=false
#kafka.monitor.tier=
kafka.monitor.time.granularity=10

etl.hourly=hourly
etl.daily=daily
etl.ignore.schema.errors=false

etl.keep.count.files=false
#etl.counts.path=
etl.execution.history.max.of.quota=.8

Nuria Ruiz, Lead Software Engineer (Manager)
Andrew Otto, Senior Operations Engineer
Wikimedia Foundation

by Andrew Otto at January 13, 2017 06:05 PM

January 12, 2017

Wikimedia Foundation

Inspire Campaign’s final report shows achievements in gender diversity and representation within the Wikimedia movement

Photo by Flixtey, CC BY-SA 4.0.

In March 2015, the Wikimedia Foundation launched its first Inspire Campaign with the goal of improving gender diversity within our movement. The campaign invited ideas on how we as a movement could improve the representation of women within our projects, both in their content and among their contributors.

The response and effort from volunteers have been remarkable. Across the ideas that were funded, there was a diversity of approaches, such as:

  • An edit-a-thon series in Ghana to develop content on notable Ghanaian women
  • A tool to track how the gender gap is changing on Wikipedia projects
  • A pilot on mentorship-driven editing between high school and college students

These and other initiatives have resulted in concrete and surprising outcomes, such as:

  • Creating or improving over 12,000 articles, including 126 new biographies on women,
  • Engaging women as project leaders, volunteers, experienced editors and new editors,
  • Correcting gender-related biases within Wikipedia articles.

As this campaign draws to a close we’d like to celebrate the work of our grant-funded projects: the leaders, volunteers, and participants who contributed (many of whom were women), and the achievements that have moved us forward in addressing this topic.

Protecting user privacy through prevention

The internet can be a hostile place, and in this age of information, we have become more cautious about what we reveal about ourselves to others online. You can imagine then that in a campaign designed to attract women, privacy became a central concern for both program leaders and participants.

Program leaders were sensitive to this challenge, and cultivated spaces where women could contribute without compromising their need for privacy. For instance, we asked program leaders to report the number of women who attended their events. Many program leaders pushed back, citing the need to protect privacy. They raised two good points: that some editors choose not to disclose their gender online as a safety measure, and that by even associating their name or username with a public event designed for women, they could inadvertently compromise their privacy. Consequently, the total number of women who participated in these programs was underreported.

In spite of this conflict, it was clear that women made up the majority of participants across funded projects. In projects hosting multiple events for training or improving project content, such as those hosted by AfroCROWD in New York, the Linguistics Editathon series in Canada and the U.S., and WikiNeedsGirls in Ghana, well over 50% of participants at each event were women. Furthermore, in the mentorship groups formed through the Women of Wikipedia (WOW!) Editing Group, all 34 participants were women. These women showed strong commitment as a result of the program, and in a follow-up survey, many of them wanted to continue contributing with their mentorship group beyond the program.

Who is missing on Wikipedia?

There is an impressive amount of information on Wikipedia today: over 43 million articles across 284 languages. In English Wikipedia alone, there are over 5 million articles today. A fair number of these articles are dedicated to people: biographies about notable individuals amount to over 6.5 million articles, and this number continues to increase every year.

It can be difficult to see what is missing within this sea of information, and biographies are one well-defined area where the question of “Who is missing?” is particularly pertinent. Today, biographies about women amount to just over 1 million articles across all languages. One million out of 6.5 million biographies, or about 16% of biographies in total. One million articles out of 43 million, or about 2% of Wikipedia content in total (whereas 12% of Wikipedia content is biographies of men). This is one way to understand how women are underrepresented on Wikipedia today, and we know even less about the extent of underrepresentation for other non-male gender identities.

One Inspire grant sought to address the visibility of this issue through the development of a tool: Wikipedia Human Gender Indicators (WHGI). WHGI uses Wikidata to track the gender of newly-created biographies by language Wikipedia (and other parameters, such as country or date of birth of the individual), and provides reports in the form of weekly snapshots.

The project has seen solid evidence of usage since its completion. In February 2016 alone the site had approximately 4,000 pageviews and over 1,000 downloads of available datasets. The team also received important feedback from users on the tool: participants in WikiProject Women in Red—a volunteer project that has created more than 44,000 biographies about women—characterized the project as valuable to their work, as it helps them identify notable women to write about.

The first step to addressing a problem is to identify it. WHGI helps us to do that in a concrete, data-driven way.

Why does “Queen-Empress” redirect to “King-Emperor”?

Addressing the gender gap goes beyond filling gaps in content. It includes igniting conversations and addressing bias in content, bias that may be more subtle or even invisible to casual readers.

Just for the record is an ongoing Gender Gap Inspire grant that focuses on these more subtle forms of content bias on English Wikipedia. One of their events analyzed the process of Wikipedia editing to investigate the possibilities and challenges of gender-neutral writing.

They specifically looked at how pages are automatically redirected to others (e.g. “Heroine” automatically redirects to “Hero”) and the direction of those redirects: female to gender-neutral, male to gender-neutral, female to male, male to female. An analysis of almost 200 redirects on English Wikipedia showed that ~100 redirect from male or female terms to gender-neutral terms, and ~100 from female terms to male terms. For example, “Ballerina” redirects to “Ballet dancer” and “Queen-Empress” redirects to “King-Emperor”.

These redirections may seem like minor technical issues, but they result in an encyclopedia that is rife with systemic bias. Raising awareness of these types of bias, starting discussions on and off wiki, and directly editing language were some of the main approaches Inspire grantees took to address bias.

Learn more!

These and other outcomes can be read in more detail in our full report. We encourage you to read on, learn more about what our grantees achieved, and join us in celebrating these project leaders and their participants! You can also learn more about Community Resources’ first experiment with proactive grantmaking and what we learned from this iteration.

Sati Houston, Grants Impact Strategist
Chris Schilling, Community Organizer
Wikimedia Foundation

by Sati Houston and Chris Schilling at January 12, 2017 06:26 PM

Gerard Meijssen

#Wikidata - Clare Hollingworth and #sources

Mrs Hollingworth was a famous journalist. She recently passed away and as I often do, I added content to the Wikidata item of the person involved.

Information like awards are something I often add and it was easy enough to establish that Mrs Hollingworth received a Hannen Swaffer Award in 1962. I found a source for the award and I had my confirmation.

The Wikipedia article has it that "She won the James Cameron Award for Journalism (1994)." There is however no source and I can find a James Cameron lecture and award but Mrs Hollingworth is not noted as receiving this award; it is Ed Vulliamy.

People often say that Wikipedia is not a source. The problem is that for Wikidata it often is. Particularly in the early days of Wikidata, massive amounts of data were lifted off the Wikipedias, and that is why there is so much initial content to build upon.

When you work from sources, you find issues with the Wikipedia content. My source does not know about Mr Paul Foot either. Mrs Lyse Doucet does have a Wikipedia article, but she is not linked in the Wikipedia list.

To truly get to the bottom of issues like these takes research, and I am neither willing nor able to do this for each and every subject that I touch. It is impossible to work on all the issues that arise from everything that I have done; I have over 2.1 million edits on Wikidata. What I do is make a start, and I am happy to be condemned for the work that I did, work that does have issues, but they are all there to be solved someday.
Thanks,
      GerardM

by Gerard Meijssen (noreply@blogger.com) at January 12, 2017 08:54 AM

Resident Mario

Wikimedia Foundation

Writing Ghana into Wikipedia: Felix Nartey

Editor’s note: We’re testing new video encoding that will allow Safari and Edge users to play videos like Felix’s directly on the blog. Please bear with us. In the meantime, you can see the video on Wikimedia Commons.

Video by Victor Grigas, CC BY-SA 3.0.

The chaos of the Second World War touched all corners of the world, and the Gold Coast (now Ghana) was no exception. The colony’s resources were marshaled for the war effort, and the headquarters of Britain’s West Africa Command were located in Accra. Nearly 200,000 soldiers from the area would eventually sign up to serve in various branches of the armed forces.

In support of these efforts, No. 37 General Hospital was established in Accra to provide medical care for injured Allied troops. The hospital’s name was changed to 37 Military Hospital of the Gold Coast shortly before the colony gained independence. It is now open to the public and serves as a teaching hospital for graduate and medical students.

Despite its impact on healthcare in Accra, there was no Wikipedia article about the hospital until February 2014, when Wikipedian Felix Nartey created it.

Photo by Ruby Mizrahi/Wikimedia Foundation, CC BY-SA 3.0.

“I was walking with a friend from an event when we just thought there was no picture of this place on Wikimedia Commons, the free media repository,” says Nartey, who also served as the former community manager for the Wikimedia Ghana user group. “My friend took a picture with my tablet, and so did I, then we headed home.”

Nartey wanted to use his photo in the hospital’s article on Wikipedia, but was shocked to find no article for it: the search results page told him that “no results” matched his search term, but that he could create a page about it.

He created a short “stub” article that night. Only a few weeks later, two other editors expanded it into an informative entry that pleasantly surprised him the next time he visited the article.

Nartey believes that knowledge sharing activities like editing Wikipedia have an effect on people’s lives, but at the present time there are significant content gaps on the site. There are fewer articles about topics on the African continent than there are about Europe or North America, and those that do exist tend to be shorter and less detailed.

Increasing the diversity of contributions on Wikipedia helps achieve higher-quality content and combat systemic bias, and that’s why many people—including Nartey—are trying to figure out the reasons behind these gaps and are putting forth great effort to bridge them.

In Ghana, a current paucity of contributors could “be partly blamed on the current growing unemployment situation, which is certainly an impediment to people’s willingness to do things for free,” says Nartey. But even in better times, while “the internet connection is reasonable in urban areas, it’s expensive, so people tend to go without it or place it last on their list of priorities, which, of course, affects contributions to Wikipedia.”

But when Nartey and other volunteers start editing Wikipedia, the positive energy created works as an incentive for them to maintain their contributions. He explains:

The only way you can have an impact in this world is to always leave something behind from where you came from and give back to society, whatever that means for you… That is the feeling I get whenever I edit Wikipedia. And I feel like it’s the joy of every Wikipedian to really see your impact.

 
In addition to editing, Nartey leads several initiatives in Ghana that promote the importance of editing Wikipedia, including GLAM activities, the Wikipedia Education Program, and the Wikipedia Library. In these activities, Nartey speaks with students, cultural organization officials, and Wikipedians to find the best ways to encourage people from his country to contribute.

“I mostly teach people about the essence of wanting to contribute to Wikipedia,” Nartey explains. “Information itself is useless until it’s shared with the whole world. And the only way you can do that is through a medium like Wikipedia. You need to [highlight] that essence in the minds of people and inspire them to contribute to Wikipedia. It’s easy for you to tap in and just tell someone you need to do this because Wikipedia is already creating that impact.”

Interview by Ruby Mizrahi, Interviewer
Outline by Tomasz Kozlowski, Blog Writer, Wikimedia Foundation
Profile by Samir Elsharbaty, Digital Content Intern, Wikimedia Foundation

by Ruby Mizrahi, Tomasz Kozlowski and Samir Elsharbaty at January 12, 2017 12:16 AM

January 10, 2017

Wikimedia Foundation

Wikimedia Foundation joins EFF and others encouraging the California Court of Appeal to protect online free speech

Photo by Coolcaesar, CC BY-SA 3.0.

On Tuesday, January 10, 2017, the Wikimedia Foundation joined an amicus brief filed by the Electronic Frontier Foundation, Eric Goldman, Rebecca Tushnet, the Organization for Transformative Works, Engine, GitHub, Medium, Snap, and Yelp encouraging the California Court of Appeal to review the ruling of the trial court in Cross v. Facebook. The case involves important principles of freedom of speech and intermediary liability immunity (which shields platforms like Wikimedia, Twitter, and Facebook from liability for content posted by users), both essential to the continued health of the Wikimedia projects.

The case began when users on Facebook created a Facebook page that criticized the plaintiff, a musician, based on his business practices. The plaintiff (along with the label and marketing companies that represented him) brought suit against Facebook with a number of claims, including misuse of publicity rights. The trial court denied Facebook’s anti-SLAPP motion and found that the plaintiff could assert a right of publicity claim against Facebook. Worryingly, under the trial court’s reasoning, such a claim arises for any speech on social media that is: (i) about a real person; and (ii) published on a website that includes advertisements. In other words, a platform that carries advertising can be held liable for the speech of its users merely because that speech relates to a real person. The court’s reasoning is not consistent with well-established rules for limits to online speech.

Facebook filed an appeal against this ruling before the California Court of Appeal where the case is currently pending. In our amicus brief, we encourage the Court of Appeal to review the lower court’s decision by pointing to the legal and policy consequences of the lower court’s ruling.

We and our co-signers argue that the court reached this absurd result through two major errors in its reasoning. First, the court did not follow the well-established First Amendment limits to the right of publicity. Second, the court did not correctly apply the immunity granted in CDA Section 230. Congress enacted Section 230 to encourage the development of the internet and other interactive media by shielding intermediaries not only from liability for actionable content created or posted by users, but also from the cost and uncertainty associated with litigation itself. This framework is essential to the success of the Wikimedia projects and many other major websites across the internet that host user-generated content. If allowed to stand, a social media site such as Facebook, Twitter, or Tumblr can be sued for any post about a real person made by a user, ultimately undermining congressional intent.

We hope that the California Court of Appeal will protect the First Amendment right to comment on and criticize public figures. We also urge the court to uphold the immunity granted under US law to intermediaries, which enables robust free speech and has become a fundamental pillar in the architecture of the internet.

Tarun Krishnakumar, Legal Fellow
Stephen LaPorte, Senior Legal Counsel
Wikimedia Foundation

Special thanks to the Electronic Frontier Foundation for drafting this amicus brief, and to Aeryn Palmer for leading the Wikimedia Foundation’s contribution.

by Tarun Krishnakumar and Stephen LaPorte at January 10, 2017 11:46 PM

This month in GLAM

This Month in GLAM: December 2016

by Admin at January 10, 2017 11:06 AM

Jeroen De Dauw

PHP 7.1 is awesome

PHP 7.1 has been released, bringing some features I was eagerly anticipating and some surprises that had gone under my radar.

New iterable pseudo-type

This is the feature I’m most excited about, perhaps because I had no clue it was in the works. In short, iterable allows type hinting in functions that just loop through their parameter values, without restricting the type to either array or Traversable, or having no type hint at all. This partially solves one of the points I raised in my Missing in PHP 7 series post Collections.
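A minimal sketch of the feature; the function names here are my own illustrations, not taken from the RFC. The same function accepts both a plain array and a generator:

```php
<?php
declare(strict_types=1);

// iterable accepts arrays and any Traversable, including generators.
function sumValues(iterable $numbers): int {
    $total = 0;
    foreach ($numbers as $number) {
        $total += $number;
    }
    return $total;
}

// A generator function may also declare iterable as its return type.
function numberGenerator(): iterable {
    yield 1;
    yield 2;
    yield 3;
}

echo sumValues([1, 2, 3]), "\n";        // 6
echo sumValues(numberGenerator()), "\n"; // 6
```

Before 7.1 this required either two overload-style code paths or dropping the type hint entirely.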

Nullable types

I also already addressed this feature in Missing in PHP 7: Nullable return types. What somehow escaped my attention is that PHP 7.1 comes not just with nullable return types, but also with new syntax for nullable parameters.
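A small illustration, with a hypothetical function name, of the ?Type syntax on both a parameter and a return type:

```php
<?php
declare(strict_types=1);

// ?string means "string or null", on the parameter and the return type alike.
function normalizeName(?string $name): ?string {
    if ($name === null) {
        return null;
    }
    return trim($name);
}

var_dump(normalizeName('  Clare  ')); // string(5) "Clare"
var_dump(normalizeName(null));        // NULL
```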

Intent revealing

Other new features that I’m excited about are the Void Return Type and Class Constant Visibility Modifiers. Both of these help reveal the author’s intent, reduce the need for comments, and make it easier to catch bugs.
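A quick sketch combining both features; the Counter class is invented for illustration:

```php
<?php
declare(strict_types=1);

class Counter {
    // Class constant visibility modifiers are new in PHP 7.1.
    private const START = 0;
    public const STEP = 1;

    private $count = self::START;

    // void makes explicit that this method returns nothing.
    public function increment(): void {
        $this->count += self::STEP;
    }

    public function current(): int {
        return $this->count;
    }
}

$counter = new Counter();
$counter->increment();
echo $counter->current(), "\n"; // 1
```

Marking START private documents that it is an implementation detail, something that previously could only be hinted at in a comment.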

A big thank you to the PHP contributors who made these things possible and keep pushing the language forward.

For a full list of new features, see the PHP 7.1 release announcement.

by Jeroen at January 10, 2017 10:23 AM

January 09, 2017

Wikimedia Tech Blog

Wikimedia Foundation receives $3 million grant from Alfred P. Sloan Foundation to make freely licensed images accessible and reusable across the web

Photo by Ajepbah, CC BY-SA 3.0 DE.

The Wikimedia Foundation, with a US$3,015,000 grant from the Alfred P. Sloan Foundation, is leading an effort to enable structured data on Wikimedia Commons, the world’s largest repository of freely licensed educational media. The project will support contributors’ efforts to integrate Commons’ media more readily into the rest of the web, making it easier for people and institutions to share, access, and reuse high-quality and free educational content.

Wikimedia Commons includes more than 35 million freely licensed media files—photos, audio, and video—ranging from stunning photos of geographic landscapes to donations from institutions with substantial media collections, like the Smithsonian, NASA, and the British Library. Like Wikipedia, Wikimedia Commons has become a “go-to” source on the internet—used by everyone from casual browsers to major media outlets to educational institutions, and easily discoverable through search engines. It continues to rapidly grow every year: Volunteer contributors added roughly six million new files in 2016.

Today, the rich images and media in Wikimedia Commons are described only by casual notation, making it difficult to fully explore and use this remarkable resource. The generous contribution from the Sloan Foundation will enable the Wikimedia Foundation to connect Wikimedia Commons with Wikidata, the central storage for structured data within the Wikimedia projects. Wikidata will empower Wikimedia volunteers to transform Wikimedia Commons into a rich, easily searchable, and machine-readable resource for the world.

Over three years, the Wikimedia Foundation will develop infrastructure, tools, and community support to enable the work of contributors, who have long requested a way to add more precise, multilingual and reusable data to media files. This will support new uses of Commons’ media, from richer and more dynamic illustration of articles on Wikipedia, to helping new users, like museums, remix the media in their own applications. Structured data will also be compatible with and support Wikimedia Commons’ partnership communities, including “GLAM” institutions (galleries, libraries, archives, museums) that have donated thousands of images in recent years. With the introduction of structured data on Commons, GLAM institutions will be able to more easily upload media and integrate existing metadata into Wikimedia Commons and share their collections with the rest of the web.

“At Wikimedia, we believe the world should have access to the sum of all knowledge, from encyclopedia articles to archival images,” said Katherine Maher, Executive Director of the Wikimedia Foundation. She continued:

Wikimedia Commons is a vast library of freely licensed photography, video, and audio that illustrates knowledge and the internet itself. With this project, and in partnership with the Wikimedia community of volunteer contributors, we hope to expand the free and open internet by supporting new applications of the millions of media files on Wikimedia Commons. We are grateful for the generous support of the Sloan Foundation, our longtime funders, in this important work.

 
“We are delighted to continue our near-decade-long support of Wikimedia with this potentially game-changing grant to unlock millions of media files—the most common form of modern communication and popular education, growing exponentially each year—into a universal format that can be read and reused not just by Wikipedia’s hundreds of millions of readers in nearly 300 languages but by educational, cultural and scientific organizations and by anyone doing a Google search or using the web,” said Doron Weber, Vice President and Program Director at the Alfred P. Sloan Foundation.

At a time when the World Wide Web, like the rest of the world, is beset by increasing polarization, commercialization, and narrowing, Wikipedia continues to serve as a shining, global counter-example of open collaborative knowledge sharing and consensus building presented in a reliable context with a neutral point of view, free of fake news and false information, that emphasizes how we can come together to build the sum of all human knowledge. We all need Wikipedia, its sister projects, its technology, and its values, now more than ever.

 
The Wikimedia Foundation is partnering on this project with Wikimedia Germany (Deutschland), the independent nonprofit organization dedicated to supporting the Wikimedia projects in Germany. Wikimedia Germany incubated and oversaw Wikidata’s initial operations, and continues to manage Wikidata’s technical and engineering roadmap. The project will be overseen in consultation with the Wikimedia community of volunteer contributors on collaboration and community support. The US$3,015,000 grant from the Sloan Foundation will be given over a three-year period.

For more information, please visit the structured data page on Wikimedia Commons.

by Wikimedia Foundation at January 09, 2017 07:41 PM

January 08, 2017

Jeroen De Dauw

Rewriting the Wikimedia Deutschland fundraising

Last year we rewrote the Wikimedia Deutschland fundraising software. In this blog post I’ll give you an idea of what this software does, why we rewrote it and the outcome of this rewrite.

The application

Our fundraising software is a homegrown PHP application. Its primary functions are donations and membership applications. It supports multiple payment methods, needs to interact with payment providers, supports submission and listing of comments and exchanges data with another homegrown PHP application that does analysis, reporting and moderation.

fun-app

The codebase was originally written in a procedural style, with most code residing directly in files (i.e., not even inside a function). There was very little design, and completely separate concerns such as presentation and data access were mixed together. As you can probably imagine, this code was highly complex and very hard to understand or change. There was unused code, broken code, features that might no longer be needed, and mysterious parts where even the guru who had maintained the codebase over the last few years did not know what they did. This mess, combined with the complete lack of a specification and unit tests, made development of new features extremely slow and error prone.

derp-code

Why we rewrote

During the last year of the old application’s lifetime, we did refactor some parts and tried adding tests. In doing so, we figured that rewriting from scratch would be easier than trying to make incremental changes. We could start with a fresh design, add only the features we really need, and perhaps borrow some reusable code from the less horrible parts of the old application.

They did it by making the single worst strategic mistake that any software company can make: […] rewrite the code from scratch. —Joel Spolsky

We were aware of the risks involved in a rewrite of this nature and that such rewrites often fail. One big reason we did not decide against rewriting is that we had a period of 9 months during which no new features needed to be developed. This meant we could freeze the old application and avoid parallel development and the feature race it would create. Additionally, we set some constraints: we would only rewrite this application, leaving the analysis and moderation application alone, and we would do a pure rewrite, avoiding the addition of new features until the rewrite was done.

How we got started

Since we had no specification, we tried visualizing the conceptual components of the old application, and then identified the “commands” they received from the outside world.

old-fun-code-diagram

Creating the new software

After some consideration, we decided to try out The Clean Architecture as a high level structure for the new application. For technical details on what we did and the lessons we learned, see Implementing the Clean Architecture.
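To give a rough idea of the shape this architecture gives such code, here is a hypothetical sketch. All names (DonationRepository, AddDonationRequest, AddDonationUseCase) are invented for illustration and are not taken from the actual fundraising codebase. The use case depends only on an interface and a plain request object, never on HTTP or a concrete database:

```php
<?php
declare(strict_types=1);

// Hypothetical sketch of the Clean Architecture shape, not the real code.
// The use case depends on this interface, never on a concrete database class.
interface DonationRepository {
    public function storeDonation(int $amountInCents): void;
}

// A plain request object crossing the boundary into the use case.
class AddDonationRequest {
    private $amountInCents;

    public function __construct(int $amountInCents) {
        $this->amountInCents = $amountInCents;
    }

    public function getAmountInCents(): int {
        return $this->amountInCents;
    }
}

class AddDonationUseCase {
    private $repository;

    public function __construct(DonationRepository $repository) {
        $this->repository = $repository;
    }

    public function addDonation(AddDonationRequest $request): bool {
        // Validation lives in the use case, not in an HTTP controller.
        if ($request->getAmountInCents() <= 0) {
            return false;
        }
        $this->repository->storeDonation($request->getAmountInCents());
        return true;
    }
}
```

A web controller would then translate an HTTP request into an AddDonationRequest and pass it to the use case, while persistence is swapped by providing a different DonationRepository implementation.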

The result

With a team of 3 people, we took about 8 months to finish the rewrite successfully. Our codebase is now clean and much, much easier to understand and work with. It took us over two person-years to do this cleanup, and presumably an even greater amount of time was wasted dealing with the old application in the first place. This goes to show that the cost of not working towards technical excellence is very high.

We’re very happy with the result. For us, the team that wrote it, it’s easy to understand, and the same seems to be true for other people, based on feedback we got from our colleagues in other teams. We have tests for pretty much all functionality, so we can refactor and add new functionality with confidence. So far we’ve encountered very few bugs, with most issues arising from us forgetting to add minor but important features to the new application, or misunderstanding what the behavior should be and then correctly implementing the wrong thing. This of course has more to do with the old codebase than with the new one. We now have a solid platform upon which we can quickly build new functionality or improve what we already have.

The new application is the first at Wikimedia (Deutschland) deployed on, and written in, PHP 7. Even though it was not an explicit goal of the rewrite, the new application has ended up with better performance than the old one, in part due to the PHP 7 usage.

Near the end of the rewrite we had an external review performed by thePHPcc, during which Sebastian Bergmann, whom you might know from PHPUnit fame, looked for code quality issues in the new codebase. The general result was a thumbs up, which we took the creative license to translate into this totally non-Sebastian-approved image:

You can see our new application in action in production. I recommend you try it out by donating 🙂

Technical statistics

These are some statistics, just for fun. They were compiled after we did our rewrite and were not used during development at all. As with most software metrics, they should be taken with a grain of salt.

In this visualization, each dot represents a single file. The size represents the Cyclomatic complexity while the color represents the Maintainability Index. The complexity is scored relative to the highest complexity in the project, which in the old application was 266 and in the new one is 30. This means that the red on the right (the new application) is a lot less problematic than the red on the left. (This visualization was created with PhpMetrics.)

fun-complexity

Global access in various Wikimedia codebases (lower is better). The rightmost is the old version of the fundraising application, and the one next to it is the new one. The new one has no global access whatsoever. LLOC stands for Logical Lines of Code. You can see the numbers in this public spreadsheet.

global-access-stats

Static method calls, often a big source of global state access, were omitted, since the tools used count many false positives (e.g., alternative constructors).

The differences between the projects can be made more apparent by visualizing them in another way. Here you have the number of lines per global access, represented on a logarithmic scale.

lloc-per-global

The following stats have been obtained using phploc, which counts namespace declarations and imports as LLOC. This means that for the new application some of the numbers are very slightly inflated.

  • Average class LLOC: 31 => 21
  • Average method LLOC: 4 => 3
  • Cyclomatic Complexity / LLOC: 0.39 => 0.10
  • Cyclomatic Complexity / Number of Methods: 2.67 => 1.32
  • Global functions: 58 => 0
  • Total LLOC: 5517 => 10187
  • Test LLOC: 979 => 5516
  • Production LLOC: 4538 => 4671
  • Classes: 105 => 366
  • Namespaces: 14 => 105
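The two ratio metrics above are derived from raw per-method counts. A minimal sketch of that derivation, assuming phploc-style data (the three methods here are invented for illustration):

```python
# Derive "Cyclomatic Complexity / LLOC" and "Cyclomatic Complexity /
# Number of Methods" from per-method raw counts, phploc-style.

methods = [
    # (cyclomatic complexity, logical lines of code)
    (1, 3),
    (2, 4),
    (1, 2),
]

total_cc = sum(cc for cc, _ in methods)
total_lloc = sum(lloc for _, lloc in methods)

cc_per_lloc = total_cc / total_lloc      # analogous to "Cyclomatic Complexity / LLOC"
cc_per_method = total_cc / len(methods)  # analogous to "... / Number of Methods"
```

On these invented numbers the ratios come out to about 0.44 and 1.33; the drop from 0.39 to 0.10 in the real project means far fewer branch points per line of logic.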

This is another visualization created with PhpMetrics that shows the dependencies between classes. Dependencies are static calls (including to the constructor), implementation, extension, and type hinting. The application’s top-level factory can be seen at the top right of the visualization.

fun-dependencies

by Jeroen at January 08, 2017 09:02 AM

January 07, 2017

User:Bluerasberry

Year of Science 2016 – a new model for Wikipedia outreach

The Year of Science was a 2016 Wikipedia outreach campaign managed by the Wiki Education Foundation with funding support from the Simons Foundation. The campaign had several goals, including developing science articles on Wikipedia, recruiting scientists as volunteer Wikipedia editors, promoting discussions about the culture and impact of Wikipedia in the scientific community, and integrating more science themes into existing Wikipedia community programs.

It is easy to say that the Year of Science was one of the biggest and highest-impact campaigns the Wikipedia community has produced to date. Previous campaigns rarely lasted more than a month, included multiple events in multiple cities, or recruited so many participants. It is unprecedented for a Wikipedia campaign to bring so many discussions into professional spaces, but the Year of Science included talks and workshops at academic conferences throughout the year. The very brand and idea of a “year of science” was provocative to see in circulation around Wikipedia, and pushed the community’s imagination of what is possible.

The campaign will have its own outcome reports and counts of progress. 2016 just ended, so these are not available yet. When they come out, they will describe how many people attended workshops and edited Wikipedia articles to add citations to academic journals. Because Wikipedia is digitally native, many metrics are available, so that part of the impact can be measured quantitatively. Beyond that, I am confident that the social outreach changed the cultural posture of science toward Wikipedia, a change I think is overdue. Right now, Wikipedia is riding a 10+ year wave of being the world’s most consulted source of science information. Assuming that Wikipedia survives into the future, I think people might look back and wonder when Wikipedia’s influence as a popular publication was recognized, and this Year of Science campaign might be cited as one of the first examples of professional Wikipedia outreach into a population of people who still had serious reservations about acknowledging Wikipedia at all. It was a risk to do the Year of Science in 2016; 2014 or earlier would have been premature considering Wikipedia’s reputation then. Although things are better now, they are changing quickly, and every year outreach like this becomes easier to conduct and more likely to have a high impact with less effort.

I am pleased with the campaign outcomes. From a Wikimedia community perspective of wanting to keep what worked and spend less time repeating the parts which were less effective, the campaign could be criticized, but I do not think that criticism should detract from celebrating everything that everyone accomplished. Most parts of the program were successful, and I expect that other stakeholders will publish descriptions of those parts. For the sake of anyone who might want to do similar projects, I will review the challenges.

Metrics are incomplete
The Wikipedia community values transparency. However, many people in the Wikipedia community stay in digital spaces and underestimate the difficulty of doing outreach away from the keyboard. The Year of Science tracked as much Wikipedia engagement as is routine to track in outreach programs, but from anecdotes I know that much, and perhaps most, data was not captured. There are various reasons for this. One is that Wikipedia’s software is nonprofit and rooted in the late 1990s, whereas commercial websites have all the advantages of being state of the art and intuitive to use. Wikipedia’s clunky interface and infrastructure is a barrier to getting users to agree to the lamest parts of Wikipedia, like volunteering for metrics tracking. On platforms like Facebook, every aspect of people’s lives is tracked routinely with single clicks, but on Wikipedia there are social options to preserve privacy, and then technical limitations even for people who want to share what they do.

The idea behind metrics tracking in a program like this is that if someone volunteers to report to a campaign organizer which Wikipedia article they edited, then we ought to be able to track that; for a campaign like this, we actually need to track hundreds or thousands of participants. In practice, this tracking connection is difficult to make in Wikipedia for reasons which are not present in other organizing platforms. This is simultaneously a problem, an intentional choice with its own rights-preserving benefits, and a social situation on which to reflect. One thing that came out of this is the development of the Programs & Events dashboard. I think the P&E dashboard could prove to be one of the most significant innovations in Wikipedia’s entire history, because it is the first effort to provide a system for collecting metrics that report the impact of Wikipedia.
When stories about Wikipedia communication metrics are told, I think the Year of Science should be remembered as one of the second-wave driving forces in the development of the concept.

Some experiments failed to develop
In typical wiki fashion, the beginning of the campaign was treated as a call for all sorts of sub-projects. Should the campaign include a contest, a newsletter, collaborations with 10+ ongoing Wikipedia initiatives, and formal partnerships with respected science organizations? As it happens, Wikipedia is an improvised project which changes quickly depending on participant interest. When a few people want something, they start to create it, and some kinds of communication which worked well for offline activism, like newsletters, can seem slow in the age of the internet. Wikipedia does have some newsletters, but just as The New York Times publishes online first and only puts yesterday’s news in the latest edition of its paper publication, things like newsletters for digital communities can have low relevance for people who are living the experience.

The Year of Science campaign ambitiously listed a range of projects, but many never materialized, and things that did not seem important at the inception of the idea became important months later. Insiders of a campaign often hesitate to definitively strike an idea which is not progressing, but for this campaign, I think some of the ideas raised in the beginning looked quite dead to both Wikipedians and science professionals who might have checked the campaign page. Wikipedia has trouble managing timed campaigns, because it is difficult to crowdsource the management of projects which must happen on a schedule. Wiki-style editors will not be bold enough to go into a campaign space started by another and say that certain halted projects need to be abandoned, and the leadership of a campaign might not be able to recognize when enough time has passed to declare an initiative dead. By the end of the year, the campaign page had accumulated some distracting cruft.
Anyone replicating the campaign should plan in advance how to introduce new ideas to stay current and how to kill off paused concepts to avoid being overburdened. I would recommend making modest promises in the beginning, introducing supplementary projects without prior announcement as a bonus rather than in fulfillment of a commitment, and not advertising any non-essential feature or service as ongoing and dependable until that feature has already been provided in several iterations over a period of time.

No centralized forum
The idea of a centralized outreach campaign in Wikipedia is a little crazy. Wikipedia was imagined from its founding as a crowdsourced project in which any individual can contribute information, and other people can spontaneously organize to review and manage it according to rules developed by consensus. At no point in Wikipedia’s history has there been much concept of centralized leadership or even support. With the Year of Science, there were outreach events of every possible kind, targeting individuals who might do anything from editing articles, providing review and suggestions, and developing the Wikidata database to joining conversations. Beyond individuals, all sorts of organizations external to Wikipedia participated, including conference teams, universities, social groups, and professional societies.

Although there was a campaign landing page to orient anyone to the Year of Science concept, the Wikipedia community is not accustomed to anticipating the existence of this kind of central campaign, or to using the forums provided by a campaign as a way to connect to sponsored support services. In some ways, Wiki Ed as an organization provided staff support for the outreach by setting up some basic infrastructure to make the campaign possible. Things that any traditional off-wiki outreach campaign would consider essential, like logos, basic text instructions, sign-up sheets, reporting queues, designated talk pages, and some points of contact, are not things the wiki community expects to exist in the wiki campaigns which have been successful to date. There is a cultural mismatch between what a science professional would expect to exist in a social campaign and what the wiki community imagines should exist. The organizers of the Year of Science imagined the campaign landing page as a bridge for this, and it was, but the concept of a traditional community entry point has not developed in the wiki community to a point which permits two-way communication between the Wikipedia community and people communicating in other ways. This is not a problem unique to Wikipedia: people not familiar with communication on YouTube, Facebook, Twitter, or any other digital community platform have trouble moving messaging into and getting comments out of those platforms as well. With Wikipedia, the paths to communication are less developed, and the Year of Science pushed to test what was possible.

For future campaigns, as outreach becomes broader, there could be more notice of what central services are and are not available. The Wikipedia community will tend to anticipate that there is no central service; off-wiki communities will tend to expect that there will be. Both communities will have challenges grasping the reality which is in the middle of these expectations. The centralized support which is available should be ready to promote services to those not expecting them, and preemptively match the support requested by off-wiki communities to what is available.

Takeaways
Let’s do it again! The very precedent of the Year of Science is good for me in my medical outreach, because the credibility it generated gives me more of a foundation to go further. This kind of campaign could be repeated globally in all languages for a year, or anyone could adapt the concept to be local, in one language, and for a shorter time if that suited them. I would like to see more science-themed campaigns. I can imagine other people exploring campaigns with themes in the humanities, for trades and labor, by geographical interest, for content types like datasets, or for engagement types like translation. Now that this has happened and the risk has been taken, the next campaign organizers will be better informed going into the project.

This entire experience also marks one of the first times that content sponsorship has been provided, albeit in the wiki way. It is not at all orthodox right now for anyone to fund wiki development, but not only did the Simons Foundation do this, they even let it happen in the wiki way: with invitations for any person or organization to contribute and to share the information which was important to them, as volunteers, and without any promotional agenda.

by bluerasberry at January 07, 2017 07:58 PM

Gerard Meijssen

#Maps - Where did they live?

This map is in many ways perfect. It tells us a story. It helps visualise what happened in the past. The map is simple: the contours of present-day Europe, more or less, and on it you can see roughly where events happened.

Obviously the map could be improved, but when it is seen in isolation, such improvements typically make little difference to understanding what is shown.

When this map is part of a continuum of maps, it will show movements over time: where the Vandals were at a given moment, where they settled down, and where they fought their battles. Better understanding will emerge, but it may get complicated. The Vandals were not the only people around; it was a time of turmoil, and only when the shapes of former countries and the sites of battles are shown does a better understanding emerge.

For many "former countries", maps are not available, and when they are, they are of similar quality to this map of the Vandals. What I would love is maps as an overlay, with maps and facts added as they become available. Many maps will gain credibility only over time, but that is an improvement over having nothing to see.
Thanks,
       GerardM

by Gerard Meijssen (noreply@blogger.com) at January 07, 2017 07:44 AM

January 06, 2017

Wikimedia Foundation

The end of ownership? Rethinking digital property to favor consumers at a Yale ISP talk

Photo by Nick Allen, CC BY-SA 3.0.


Suppose a consumer named Alice buys a record of a David Bowie music album. Although Alice is not an expert in property law, she probably knows what privileges she enjoys by buying the LP record. For example, Alice can freely lend or rent it to her friend. Alice also possesses similar rights if she were to buy a book. But what happens when Alice buys an e-book or a song on iTunes? Can she enjoy the same rights with the e-book as she could with the book? Can she lend it or rent it to whoever she wants without any restrictions? Probably not. In the online world, users’ rights on digital copies and content are subject to licensing and technological restrictions imposed by copyright holders.

How we approach content licensing is critical for the Wikimedia projects and our mission of spreading free knowledge around the world. To help us better understand current research on this issue, I recently attended a talk on this subject hosted by our friends and collaborators at the Yale Information Society Project. The talk, entitled The End of Ownership: Personal Property in the Digital Economy, was given on November 3, 2016 by Professor Aaron Perzanowski of Case Western Reserve University.

Intellectual property, including copyright, is governed by the principle of exhaustion, also called the first-sale doctrine. This principle, established in Bobbs-Merrill Co. v. Straus in 1908, holds that copyright holders lose their ability to control further sales over their copyrighted works once they transfer the works to new owners. Bobbs-Merrill, the plaintiff and a publisher, drafted a notice in its books forbidding sales under one dollar, warning that violations of this condition would be considered copyright infringement. The defendants resold the books for less than a dollar each. In the end, the United States Supreme Court agreed with the defendants’ position and sent a clear message that copyright holders are not able to control prices or resales after the first sale of the copyrighted work. Even today, the first-sale doctrine is an important defense for consumers. In Alice’s case, copyright holders can control any use of the physical copy of the Bowie LP record until the first sale to Alice. Once Alice owns the record, she can re-sell it, donate it, etc.

But according to Perzanowski, the notion of property has changed: in the past, copies used to be valuable because they were scarce and difficult to produce. In the internet era, the paradigm has shifted. Because everything disseminates quickly and at almost no cost, copies have lost value. This is why buying in the digital world is a different experience. If Alice wants to buy an e-book on Amazon or an album on iTunes, she is not actually buying the “copy” of such e-book or album, but instead is licensing them. According to Perzanowski, these licensing schemes are undermining consumers’ rights that once were protected by ownership. Generally, the license terms of digital products will include the prohibition against transfer and sublicensing, among other restrictions. Thus, the notion of a copy of a work is disappearing, because in these licensing schemes, rights that are obvious for a physical object, like resale, rental, or donation, are neglected. Furthermore, these restrictions are authorized by law, specifically the Digital Millennium Copyright Act (DMCA), which includes provisions on Digital Rights Management (DRM) technologies that limit the consumer’s ability to use the product. If Alice buys an e-book, DRM technologies and legal provisions may limit her ability to print and copy-paste, and may impose time-limited access to her book.

Unfortunately, consumers do not seem to understand these limitations of the digital market. Perzanowski explained that 48% of online consumers think that once they have purchased an e-book by clicking the “buy now” button, they can lend it to someone else, and 86% think they actually own the book and can use it on the device of their choice. Perzanowski believes that consumers should be better informed so that they can have a better sense of autonomy over how they use the digital products they buy. For this reason, he advocates for better education on digital licensing so that consumers can recalibrate their expectations.

The use of Creative Commons (“CC”) licensing for content on the Wikimedia projects helps address Perzanowski’s concerns regarding limited consumer rights with respect to digital works and consumer education on digital licensing. First, CC licensing allows the Wikimedia communities to enjoy broader rights for digital works, such as the ability to share content with a friend or to produce derivatives, that are not available under more typical digital licensing schemes. Second, Creative Commons’ and the Foundation’s approaches to licensing provide certainty to consumers and promote transparency in how they can license content. For example, Creative Commons provides summaries of CC licenses in plain English. Similarly, the Wikimedia Foundation clearly explains to its users and contributors their rights and responsibilities in the use of CC licensed Wikimedia content. The Foundation also publicly consults with the Wikimedia communities on these licensing terms; it recently closed a consultation with the communities on a proposed change from CC BY-SA 3.0 to CC BY-SA 4.0.

In today’s world, physical copies and analog services are more the exception than the rule. This, however, shouldn’t mean that digital copies and online services have to be offered with fewer rights for users compared to rights available under traditional licensing schemes. We support licensing schemes that allow users to retain the same rights that they would otherwise have in the offline world so that users have the power to edit, share, and remix content: the more we empower, the more knowledge expands and creativity grows.

We believe strongly in a world where knowledge can be freely shared. Our visits to Yale ISP allow us to remain engaged in discussions about internet-related laws and affirm the importance of licenses like Creative Commons for the future of digital rights.

Ana Maria Acosta, Legal Fellow
Wikimedia Foundation

by Ana Maria Acosta at January 06, 2017 08:06 PM